Hello all,

I am experiencing a crash when running ph.x across multiple nodes.  Input
and output files are attached.  The first q-point appears to be calculated
correctly, but the code crashes when it starts calculating the second
q-point.  The error says that a file "charge-density" is missing, although
"charge-density.dat" exists when I inspect the files manually.  Because the
error is reported 16 times and each node has 16 cores, I assume the problem
is related to my use of multiple nodes.  A general description of my
computing environment and workflow follows:

I am using SLURM on a cluster.  My job is assigned two nodes, each with a
local scratch drive that is not visible to the other node.  Both nodes can
also access a GPFS networked drive.  To improve performance, I am attempting
to run all calculations on the local scratch drives.  All input files are
copied from the GPFS networked drive to the local drive on each node before
the initial pw.x calculation.  After the pw.x calculation, a small script
copies the output files (the pwscf.save folder and pwscf.xml) from the first
node to the networked drive, and a second script then copies them from the
networked drive to the second node before the phonon code starts.
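
The copy step described above can be sketched roughly as follows.  This is
only an illustrative Python sketch, not the contents of the attached
QESlurm.slurm: the function names and directory arguments are hypothetical,
and it assumes the prefix "pwscf" and that charge-density.dat sits directly
inside pwscf.save.  It needs Python 3.8 or later for dirs_exist_ok.

```python
import shutil
from pathlib import Path


def stage_pw_output(src_dir, dst_dir, prefix="pwscf"):
    """Copy <prefix>.save/ and <prefix>.xml from src_dir into dst_dir.

    Returns the list of destination paths that were copied.
    Sources that do not exist are skipped.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in (f"{prefix}.save", f"{prefix}.xml"):
        src = src_dir / name
        dst = dst_dir / name
        if src.is_dir():
            # Merge into an existing folder instead of failing.
            shutil.copytree(src, dst, dirs_exist_ok=True)
        elif src.is_file():
            shutil.copy2(src, dst)
        else:
            continue
        copied.append(dst)
    return copied


def missing_files(out_dir, prefix="pwscf", required=("charge-density.dat",)):
    """Return the names in `required` that are absent from <prefix>.save.

    The default `required` tuple is an assumption for illustration only.
    """
    save_dir = Path(out_dir) / f"{prefix}.save"
    return [name for name in required if not (save_dir / name).is_file()]
```

Since each scratch drive is visible only to its own node, a check like
missing_files() would have to be run once on every node before ph.x starts
in order to confirm that the copy actually reached that node.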

I am open to any suggestions, as I hacked this workflow together after
performance on the GPFS networked drive proved very poor.

Thanks,
Brad


--------------------------------------------------------
Bradly Baer
Graduate Research Assistant, Walker Lab
Interdisciplinary Materials Science
Vanderbilt University


Attachments: Phonon.out, QESlurm.slurm, Phonon.in

_______________________________________________
Quantum ESPRESSO is supported by MaX (www.max-centre.eu)
users mailing list [email protected]
https://lists.quantum-espresso.org/mailman/listinfo/users
