Dear Prof. Giannozzi,
I am writing in the same thread because my issue may be relevant here.
I am using QE 6.5 on a 3-node Linux cluster. The calculation itself runs 
normally, but something unusual happens when the wavefunctions are saved:
1) The calculation exits without time stamps and without the job-done stamp.
2) This happened because MPI exited on one of the nodes, which could not write 
to outdir.
3) An scf run only starts after the files have been copied to the slave nodes; 
otherwise it terminates, saying that files cannot be read.
4) After pointing outdir to a common path (via NFS), these errors disappeared.
My questions:
1) If recent versions of QE collect all the wavefunctions on the head node, why 
do the slave nodes report an MPI abort when they have no access to the head 
node?
2) Is there any way to restart calculations without copying the files to the 
slave nodes?
Thanks and regards,
Janardhan



    On Thursday, 20 February, 2020, 11:15:53 pm IST, Paolo Giannozzi 
<[email protected]> wrote:  
 
It's a long story. By default, recent versions of QE collect both the 
wavefunctions and the charge density into a single array on a single processor 
that writes them to file. Even if you do not have a parallel file system, your 
data is no longer spread on scratch directories that are not visible to the 
other processors. This means that in principle it is possible to restart, with 
several potential caveats:
- there is no guarantee that a batch queuing system will distribute processes 
across processors in the same way as in the previous run;
- pseudopotential files are in principle read from the data file, so they may 
still be a source of problems;
- if you parallelize on k-points, with Nk pools, one process per pool will 
write wavefunctions, which will thus end up on Nk different processors.
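
To see which of these caveats applies on a given cluster, it can help to check from each node what restart data is actually visible there. The following is only a minimal sketch: the function name check_save_dir is made up for illustration, and it assumes the usual QE 6.x layout of outdir/prefix.save/ containing data-file-schema.xml plus wavefunction files named wfc*.dat or wfc*.hdf5.

```shell
#!/bin/bash
# Sketch: report whether restart data for a given prefix is visible from
# the node where this function runs.
# Assumption: QE 6.x layout <outdir>/<prefix>.save/ with data-file-schema.xml
# and wavefunction files named wfc*.dat or wfc*.hdf5.
check_save_dir() {
    local outdir="$1"
    local prefix="$2"
    local save="$outdir/$prefix.save"
    local nwfc

    if [ ! -f "$save/data-file-schema.xml" ]; then
        echo "missing: $save/data-file-schema.xml"
        return 1
    fi

    # Count wavefunction files; find prints nothing (no error) if none match.
    nwfc=$(find "$save" -maxdepth 1 -type f \
        \( -name 'wfc*.dat' -o -name 'wfc*.hdf5' \) | wc -l)
    nwfc=$((nwfc))   # strip the padding some wc implementations add
    echo "found: $nwfc wavefunction file(s) in $save"
    [ "$nwfc" -gt 0 ]
}
```

Running something like `check_save_dir ./tmp_qe BIS-IMID-PbI4_SR` on every node of the job would show whether the data file and the wavefunctions ended up on one node, on several, or on a shared path.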
Paolo

On Thu, Feb 20, 2020 at 4:54 PM alberto <[email protected]> wrote:

Hi, 
I'm using QE in some single point simulations. 
In particular, I'm running scf/nscf calculations.

My input block contains:

                 calculation = 'nscf' ,
                restart_mode = 'from_scratch' ,
                      outdir = './tmp_qe' ,
                  pseudo_dir = '/home/alberto/QUANTUM_ESPRESSO/BASIS/upf_files/' ,
                      prefix = 'BIS-IMID-PbI4_SR' ,
                   verbosity = 'high' ,
               etot_conv_thr = 1.0D-8 ,
               forc_conv_thr = 1.0D-7 ,
                  wf_collect = .true.
The outdir is located under /home/alberto/, and I notice that writing and 
reading take a very long time.
I would like to use the /tmp directory of a node where the job is running. (My 
cluster has several Xeon nodes with 20 CPUs each.)
This is my PBS script:
## Script for parallel Quantum Espresso job by Alberto
## Run script with 3 arguments:
## $1 = Name of input-file, without extension
## $2 = Numbers of nodes to use (ncpus=nodes*20)
## $3 = Module to run

if [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ]; then
 echo "Usage: $0 <input_file> <np> <module> "
 exit 1   # stop here instead of generating a job with empty arguments
fi

if [ $2 -ge 8 ]; then
 NODES=$(($2/20))
 CPUS=20
else
 NODES=1
 CPUS=$2
fi

cat<<EOF>$1.job
#!/bin/bash
#PBS -l nodes=xeon1:ppn=$CPUS:xeon20+xeon2:ppn=$CPUS:xeon20+xeon3:ppn=$CPUS:xeon20+xeon4:ppn=$CPUS:xeon20+xeon5:ppn=$CPUS:xeon20+xeon6:ppn=$CPUS:xeon20
#PBS -l walltime=9999:00:00
#PBS -N $1
#PBS -e $1.err
#PBS -o $1.sum
#PBS -j oe
job=$1      # Name of input file, no extension
project=\$PBS_O_WORKDIR
cd \$project
cat \$PBS_NODEFILE > \$PBS_O_WORKDIR/nodes.txt 

export OMP_NUM_THREADS=$(($2/40))
time /opt/openmpi-1.4.5/bin/mpirun -machinefile \$PBS_NODEFILE -np $2 /opt/qe-6.4.1/bin/$3 -ntg $(($2/60)) -npool $(($2/60)) < $1.inp > $1.out
EOF

qsub $1.job
How can I use the /tmp directory without the nscf calculation stopping because 
no files are found? The files are actually present, but they are spread across 
different nodes.
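
One possible way to combine a shared directory with node-local /tmp is to stage the scf results onto every node before the nscf run. The sketch below only prints the commands it would run, so they can be inspected first. The function name stage_commands, the example paths, and the assumption of passwordless ssh/rsync between nodes are all illustrative and not taken from the script above.

```shell
#!/bin/bash
# Sketch: print (not run) the commands that would copy <prefix>.save from a
# shared directory onto node-local scratch on every node of a job.
# Assumptions: passwordless ssh/rsync between nodes; paths are illustrative.
stage_commands() {
    local nodefile="$1"   # e.g. $PBS_NODEFILE, one line per allocated core
    local shared="$2"     # directory visible from the node running this script
    local scratch="$3"    # node-local directory, e.g. /tmp/$USER/tmp_qe
    local prefix="$4"     # QE prefix, as in the input file
    local node

    # PBS lists a node once per core, so reduce the list to unique host names.
    sort -u "$nodefile" | while read -r node; do
        [ -n "$node" ] || continue
        echo "ssh $node mkdir -p $scratch"
        echo "rsync -a $shared/$prefix.save $node:$scratch/"
    done
}
```

Inside a job script this could be called as `stage_commands "$PBS_NODEFILE" "$PBS_O_WORKDIR/tmp_qe" /tmp/$USER/tmp_qe BIS-IMID-PbI4_SR`, with outdir in the nscf input then pointing to the node-local path. Results written to /tmp during the run would still have to be copied back to a shared location afterwards.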

regards
Alberto
_______________________________________________
Quantum ESPRESSO is supported by MaX (www.max-centre.eu/quantum-espresso)
users mailing list [email protected]
https://lists.quantum-espresso.org/mailman/listinfo/users


-- 
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222
