Dear Brad,
I can only confirm what Paolo and Michal suggested.
Even with InfiniBand, the efficiency of the FFT parallelization decreases
drastically with each added node, WHATEVER THE CODE (not only QE) or the library.
For SLURM jobs, if you request 2 nodes of 16 cores, the first 16 are indexed 1
to 16 and the last 16 are indexed 17 to 32, which is exactly the same
distribution that QE implements for k-point, band, or image parallelization.
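
The block placement described above can be sketched in a few lines of Python.
This is only an illustration, assuming 0-based MPI ranks, ranks that fill one
node before moving to the next, and pools made of consecutive ranks. The helper
names (node_of_rank, pool_of_rank, pools_fit_in_nodes) are hypothetical and are
not part of SLURM or QE; the real placement on a given cluster should be checked.

```python
def node_of_rank(rank, cores_per_node):
    """Node index of a rank, assuming ranks fill one node before the next."""
    return rank // cores_per_node


def pool_of_rank(rank, n_ranks, n_pools):
    """Pool index of a rank, assuming pools are blocks of consecutive ranks."""
    if n_ranks % n_pools != 0:
        raise ValueError("n_ranks must be a multiple of n_pools")
    return rank // (n_ranks // n_pools)


def pools_fit_in_nodes(n_ranks, n_pools, cores_per_node):
    """True if every pool has all of its ranks on a single node."""
    nodes_per_pool = {}
    for rank in range(n_ranks):
        pool = pool_of_rank(rank, n_ranks, n_pools)
        node = node_of_rank(rank, cores_per_node)
        nodes_per_pool.setdefault(pool, set()).add(node)
    return all(len(nodes) == 1 for nodes in nodes_per_pool.values())


if __name__ == "__main__":
    # 2 nodes of 16 cores, 32 ranks, 2 pools (the "-nk 2" case)
    print(pools_fit_in_nodes(32, 2, 16))  # True
    # a single pool spans both nodes, so its FFT crosses the network
    print(pools_fit_in_nodes(32, 1, 16))  # False
```

Under these assumptions, each pool stays inside one node whenever the number
of pools equals the number of nodes and every node contributes the same number
of cores.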
Thanks to this, I never have trouble with the way the MPI processes are 
spread over the cores when the number of pools (or images) equals the number of 
For this reason, except for large supercells at gamma only, I always do 

Antoine Jay
Toulouse France

On Friday, November 06, 2020 01:04 CET, Michal Krompiec via users wrote:
Dear Brad,

Fast communications means here InfiniBand or other RDMA. Make sure your MPI
uses RDMA; I've seen systems where it isn't enabled by default. That said, if
you use k-point parallelization you can get away with gigabit Ethernet, as
Paolo mentioned.

Best wishes,
Michal Krompiec
Merck KGaA

On Thu, Nov 5, 2020 at 11:40 PM Baer, Bradly via users wrote:

Paolo,

I believe the nodes I am using have gigabit connections. There are additional
nodes that have 10 or 25 gigabit connections, but I don't think I would land on
one of them without specifically requesting them. What communication speed
would be appropriate for QE's needs?

I also did consider trying to manually set the parallelization, but I don't
currently know enough about SLURM to identify each node and ensure that all 16
cores assigned to a pool are on the same node. I will keep it in mind though as
a possible future solution.

Thanks,
Brad

--------------------------------------------------------
Bradly Baer
Graduate Research Assistant, Walker Lab
Interdisciplinary Materials Science
Vanderbilt University

From: Paolo Giannozzi
Sent: Thursday, November 5, 2020 3:54 PM
To: Baer, Bradly; Quantum ESPRESSO users Forum
Subject: Re: [QE-users] Running efficiently on multiple nodes

Are there fast communications between the two nodes? If not, the parallel
distributed 3D FFT will be very slow (note the time taken by fft_scatt_yz).
You might find it convenient to exploit k-point parallelization, which requires
much less communication: for instance, "mpirun -n 32 pw.x -nk 2" (2 pools of 16
processors, each pool performing parallel FFT), but you have to figure out a
way to convince the first pool of 16 processors to run on node 1 and the second
on node 2 (or vice versa, as long as FFT parallelization happens inside a node
and k-point parallelization across nodes).

Paolo

On Thu, Nov 5, 2020 at 7:29 PM Baer, Bradly via users wrote:

Paolo,

Thank you for your suggestion. I will add recompiling to move to 6.6 to my
to-do list. For now, I corrected the pseudopotential files as you indicated and
the calculation ran successfully. It has become slightly faster, but is still
much slower than running on a single node (3:30s vs 0:30s). Is there more that
I should be doing to improve performance, or is my test problem too small to
see the benefits of parallelization?

Thanks,
Brad

From: users on behalf of Paolo Giannozzi
Sent: Thursday, November 5, 2020 10:01 AM
To: Quantum ESPRESSO users Forum
Subject: Re: [QE-users] Running efficiently on multiple nodes

On Thu, Nov 5, 2020 at 3:05 PM Baer, Bradly wrote:

Pseudo file Ga.pbe-dn-kjpaw_psl.1.0.0.UPF has been fixed on the fly.
To avoid this message in the future, permanently fix your pseudo files
following these instructions:

This is a possible source of trouble if the output directory is not visible to
all processors. Please try one of the following:
- do what is suggested (or simply: edit Ga.pbe-dn-kjpaw_psl.1.0.0.UPF and
  replace all occurrences of "&" with "&amp;")
- get version 6.6, which reads the pseudopotential file on one processor and
  broadcasts its contents to all other processes
- get the development version, which in addition is not sensitive to the
  presence of nonstandard "&" in the files

Paolo
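
The manual fix suggested above (replacing "&" with "&amp;") can also be
scripted. The following is a minimal Python sketch, not a QE tool: the function
name escape_bare_ampersands and the script name in the usage comment are
hypothetical. As a design choice that goes slightly beyond the advice above, it
assumes that an "&" already starting an entity such as "&amp;" or "&lt;"
should be left untouched, so that running it twice does no harm. It writes to a
separate output file so the original pseudopotential is kept.

```python
import re
import sys

# "&" that does not already start an entity such as &amp; &lt; &#38; &#x26;
_BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(text):
    """Replace every bare "&" with "&amp;", leaving existing entities alone."""
    return _BARE_AMP.sub("&amp;", text)


if __name__ == "__main__" and len(sys.argv) == 3:
    # usage (hypothetical script name): python fix_amp.py original.UPF fixed.UPF
    # surrogateescape keeps any non-UTF-8 bytes unchanged on the round trip
    with open(sys.argv[1], encoding="utf-8", errors="surrogateescape") as src:
        fixed = escape_bare_ampersands(src.read())
    with open(sys.argv[2], "w", encoding="utf-8", errors="surrogateescape") as dst:
        dst.write(fixed)
```

After checking the result, the fixed file can replace the original in the
pseudopotential directory.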
-Brad

From: users on behalf of Paolo Giannozzi
Sent: Thursday, November 5, 2020 2:33 AM
To: Quantum ESPRESSO users Forum
Subject: Re: [QE-users] Running efficiently on multiple nodes

On Wed, Nov 4, 2020 at 11:28 PM Baer, Bradly wrote:

Now that I have two nodes, the script for a single node results in a crash
shortly after reading in the pseudopotentials.

Which version of QE are you using, and which crash do you obtain, with which
executable?

Paolo
--
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222
Quantum ESPRESSO is supported by MaX (
users mailing list
