Hello,
The size of your system is such that parallel execution should be beneficial
for up to 50 or 60 MPI ranks without any further options. If you are planning to
use something like 200 cores, the simplest and yet effective way to go would be
to use an executable compiled with hybrid MPI + OpenMP parallelism and run it
with 4 OpenMP threads.
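
As a rough illustration of the arithmetic behind that suggestion, here is a minimal sketch. The 200-core total and the 4 threads come from the text above; the input file name pw.in is a placeholder, and the actual launcher on a cluster (mpirun, srun, ...) depends on the site, so the script only prints the command instead of running it.

```shell
#!/bin/sh
# Sketch only: derive the MPI rank count for a hybrid MPI + OpenMP run.
# TOTAL_CORES and the thread count are the figures discussed in this thread.
TOTAL_CORES=200
export OMP_NUM_THREADS=4

# Each MPI rank runs OMP_NUM_THREADS threads, so ranks = cores / threads.
NRANKS=$((TOTAL_CORES / OMP_NUM_THREADS))

# Print the launch line; pw.in is a hypothetical input file name.
echo "mpirun -n $NRANKS pw.x -in pw.in"
```

With 4 threads per rank, 200 cores correspond to 50 MPI ranks, which stays inside the 50 to 60 rank range mentioned above.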

About your test on the 8-core machine: looking at your output, it seems to me
that you are using a multithreaded linear algebra library.
If that is the case, you should make sure that the number of MPI ranks times the
number of threads doesn't exceed the number of cores, which is 8 in your case. To
set the number of OpenMP threads, you need to specify the OMP_NUM_THREADS
environment variable, e.g.
export OMP_NUM_THREADS=2   if you run with 4 MPI ranks
or
export OMP_NUM_THREADS=1   if you run with 8 MPI ranks
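
The same rule can be written as a small script. This is only a sketch: NCORES and NRANKS are values you set yourself (8 and 4 here, as in the test described in this thread), pw.in is a placeholder input name, and the command is printed rather than executed.

```shell
#!/bin/sh
# Sketch only: pick a thread count so that ranks * threads <= cores.
NCORES=8    # cores on the machine (8 in this thread)
NRANKS=4    # MPI ranks you intend to use

# Integer division gives the largest thread count that does not oversubscribe.
NTHREADS=$((NCORES / NRANKS))
if [ "$NTHREADS" -lt 1 ]; then
    NTHREADS=1    # more ranks than cores: fall back to one thread per rank
fi

export OMP_NUM_THREADS=$NTHREADS

# Print the launch line; pw.in is a hypothetical input file name.
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS mpirun -n $NRANKS pw.x -in pw.in"
```

Setting NRANKS=8 instead gives one thread per rank, matching the second case above.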

hope this helps, best regards -- Pietro
________________________________
From: users <[email protected]> on behalf of Robert 
Fleming <[email protected]>
Sent: Wednesday, 13 April 2022 16:36
To: Quantum ESPRESSO users Forum <[email protected]>
Subject: [QE-users] Advice for Parallel Execution

Greetings,

I’m running scf calculations on an amorphous Si surface terminated with 
different functional groups (input file attached for context), and I’m 
experiencing poor scaling behavior in parallel.  I’m running these jobs locally 
on an 8-core CPU to test before scaling up the system size to an hpc cluster.

I’ve noticed that increasing the number of MPI processes from 4 to 8 (mpirun -n 
4 pw.x -in [myscript] vs. mpirun -n 8 pw.x -in [myscript]) results in the job 
taking longer (~1 hr vs. ~1.5 hr).  Looking through the documentation, I see 
that there are several command line switches for different levels of 
parallelization beyond just the number of MPI processes.  While it’s possible 
that my system size is too small to benefit from parallel execution 
(24 atoms, 103 electrons), I think it’s more likely that I’m not 
taking appropriate advantage of these.

Would anyone be willing to share some advice or “rules of thumb” on the best 
way to select these parallelization levels for small-to-medium sized jobs (say, 
an 8 core CPU vs. 150-200 processors on an hpc platform)?

Thank you,
________________________________
Robert “Drew” Fleming, Ph.D.
Assistant Professor of Mechanical Engineering
College of Engineering & Computer Science
Arkansas State University
(870) 972-3743
[email protected]

_______________________________________________
The Quantum ESPRESSO community stands by the Ukrainian
people and expresses its concerns about the devastating
effects that the Russian military offensive has on their
country and on the free and peaceful scientific, cultural,
and economic cooperation amongst peoples
_______________________________________________
Quantum ESPRESSO is supported by MaX (www.max-centre.eu)
users mailing list [email protected]
https://lists.quantum-espresso.org/mailman/listinfo/users
