Dear users,
I am trying to perform an MD simulation of a large cell (128 Fe atoms,
gamma point) using pw.x and I get a strange scaling behavior. To test the
performance I ran the same MD simulation with an increasing number of nodes
(2, 4, 6, 8, etc.) using 24 cores per node. The simulation is successful
when using 2, 4, and 6 nodes, i.e. 48, 96, and 144 cores respectively
(albeit slow, which is within my expectations for such a small number of
processors).
Going to 8 and more nodes, I run into an out-of-memory error after about
two time steps.
I am a little bit confused as to what the reason could be. Since a smaller
number of cores works, I would expect a higher number of cores to run
without an out-of-memory error as well.
The 8-node run explicitly outputs at the beginning:
" Estimated max dynamical RAM per process > 140.54 MB
Estimated total dynamical RAM > 26.35 GB
"
which is well within the 2.5 GB I have allocated per core.
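For reference, here is a quick sanity check of those numbers (assuming the 8-node run uses 8 x 24 = 192 MPI processes, one per core); the quoted total is consistent with the quoted per-process maximum, and both sit far below the allocation:

```python
# Sanity check of the memory estimates printed by pw.x for the 8-node run.
# Numbers are taken from the quoted output; 192 processes assumes one MPI
# rank per core (8 nodes x 24 cores), which matches my setup.
nodes = 8
cores_per_node = 24
n_procs = nodes * cores_per_node           # 192 MPI processes

total_ram_gb = 26.35                       # "Estimated total dynamical RAM"
max_per_proc_mb = 140.54                   # "Estimated max dynamical RAM per process"
allocated_per_core_mb = 2.5 * 1024         # 2.5 GB allocation per core

avg_per_proc_mb = total_ram_gb * 1024 / n_procs
print(f"average per process: {avg_per_proc_mb:.1f} MB")  # ~140.5 MB
print(f"max per process:     {max_per_proc_mb} MB")
print(f"allocation per core: {allocated_per_core_mb:.0f} MB")

# The estimate is roughly 18x below the allocation, so whatever triggers
# the OOM is not covered by this estimate.
assert max_per_proc_mb < allocated_per_core_mb
```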
I am obviously doing something wrong, could anyone point to what it is?
The input files for a 6 and 8 node run can be found here:
https://drive.google.com/drive/folders/1kro3ooa2OngvddB8RL-6Iyvdc07xADNJ?usp=sharing
I am using QE 6.6.
Kind regards
Lenz
PhD Student (HZDR / CASUS)
_______________________________________________
Quantum ESPRESSO is supported by MaX (www.max-centre.eu)
users mailing list [email protected]
https://lists.quantum-espresso.org/mailman/listinfo/users