Dear users,
I am trying to perform an MD simulation of a large cell (128 Fe atoms,
gamma point) using pw.x and I get a strange scaling behavior. To test the
performance I ran the same MD simulation with an increasing number of nodes
(2, 4, 6, 8, etc.) using 24 cores per node. The simulation is successful
when using 2, 4, and 6 nodes, i.e. 48, 96, and 144 cores respectively
(albeit slow, which is within my expectations for such a small number of
processors).
Going to 8 and more nodes, I run into an out-of-memory error after about
two time steps.
I am a little bit confused as to what the reason could be. Since a smaller
number of cores works, I would expect a higher number of cores to run
without an out-of-memory error as well.
The 8-node run explicitly outputs at the beginning:
" Estimated max dynamical RAM per process > 140.54 MB
Estimated total dynamical RAM > 26.35 GB
"
which is well within the 2.5 GB I have allocated per core.
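For reference, here is a quick sanity check of those numbers (assuming the 8-node run uses 8 x 24 = 192 MPI processes, one per core); the quoted total is consistent with the quoted per-process maximum, and both sit far below the allocation:

```python
# Sanity check of the memory estimates printed by pw.x for the 8-node run.
# Numbers are taken from the quoted output; 192 processes assumes one MPI
# rank per core (8 nodes x 24 cores), which matches my setup.
nodes = 8
cores_per_node = 24
n_procs = nodes * cores_per_node           # 192 MPI processes

total_ram_gb = 26.35                       # "Estimated total dynamical RAM"
max_per_proc_mb = 140.54                   # "Estimated max dynamical RAM per process"
allocated_per_core_mb = 2.5 * 1024         # 2.5 GB allocation per core

avg_per_proc_mb = total_ram_gb * 1024 / n_procs
print(f"average per process: {avg_per_proc_mb:.1f} MB")  # ~140.5 MB
print(f"max per process:     {max_per_proc_mb} MB")
print(f"allocation per core: {allocated_per_core_mb:.0f} MB")

# The estimate is roughly 18x below the allocation, so whatever triggers
# the OOM is not covered by this estimate.
assert max_per_proc_mb < allocated_per_core_mb
```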
I am obviously doing something wrong, could anyone point to what it is?
The input files for a 6 and 8 node run can be found here:
https://drive.google.com/drive/folders/1kro3ooa2OngvddB8RL-6Iyvdc07xADNJ?usp=sharing
I am using QE 6.6.
Kind regards
Lenz
PhD Student (HZDR / CASUS)
_______________________________________________
Quantum ESPRESSO is supported by MaX (www.max-centre.eu)
users mailing list [email protected]
https://lists.quantum-espresso.org/mailman/listinfo/users