Thank you all for taking the time to share your experience.  It looks like I 
have some work to do this weekend to learn more about how our cluster handles 
inter-node communication.  I appreciate all the help.

-Brad

--------------------------------------------------------
Bradly Baer
Graduate Research Assistant, Walker Lab
Interdisciplinary Materials Science
Vanderbilt University


________________________________
From: users <[email protected]> on behalf of Antoine Jay 
via users <[email protected]>
Sent: Friday, November 6, 2020 1:14 AM
To: Quantum ESPRESSO users Forum <[email protected]>
Subject: Re: [QE-users] Running efficiently on multiple nodes

Dear Brad,
I can only confirm what Paolo and Michal suggested.
Even with InfiniBand, the efficiency of the FFT parallelization decreases 
drastically with each additional node, WHATEVER THE CODE (not only QE) or the 
library.
For SLURM jobs, if you ask for 2 nodes of 16 cores, the first 16 cores are 
indexed 1 to 16 and the last 16 are indexed 17 to 32, which is exactly the same 
distribution that QE uses for k-point, band, or image parallelization.
Thanks to this, I have never had trouble with the way the MPI processes are 
spread over the cores when the number of pools (or images) equals the number of 
nodes.
For this reason, except for large supercells at gamma only, I always set 
npool=nodes.
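
The layout described above can be checked with a short sketch. It assumes a block distribution (consecutive ranks fill one node before the next) and that consecutive ranks are grouped into pools, as stated in this message; whether a given cluster actually places ranks this way is an assumption to verify locally. The function names are hypothetical, and ranks are 0-based here rather than the 1-based core indices used above.

```python
def node_of_rank(rank, cores_per_node):
    """Node index of an MPI rank under a block (fill-first) distribution."""
    return rank // cores_per_node


def pool_of_rank(rank, n_ranks, n_pools):
    """Pool index of a rank when consecutive ranks are grouped into pools."""
    if n_ranks % n_pools != 0:
        # Assumption of this sketch: all pools have the same size.
        raise ValueError("number of ranks must be divisible by number of pools")
    return rank // (n_ranks // n_pools)


def pools_stay_on_node(n_nodes, cores_per_node, n_pools):
    """Return True if every pool has all of its ranks on a single node."""
    n_ranks = n_nodes * cores_per_node
    nodes_per_pool = {}
    for rank in range(n_ranks):
        pool = pool_of_rank(rank, n_ranks, n_pools)
        node = node_of_rank(rank, cores_per_node)
        nodes_per_pool.setdefault(pool, set()).add(node)
    return all(len(nodes) == 1 for nodes in nodes_per_pool.values())


if __name__ == "__main__":
    print(pools_stay_on_node(2, 16, 2))  # True: npool = nodes
    print(pools_stay_on_node(2, 16, 1))  # False: one pool spans both nodes
```

With 2 nodes of 16 cores, 2 pools keep each pool's FFT communication inside one node, while a single pool of 32 ranks spreads the FFT across both nodes.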

Regards,
Antoine Jay
LAAS CNRS
Toulouse France

On Friday, November 06, 2020 01:04 CET, Michal Krompiec via users 
<[email protected]> wrote:

Dear Brad,
Fast communication here means InfiniBand or another RDMA interconnect. Make sure 
your MPI uses RDMA; I’ve seen systems where it isn’t enabled by default. That 
said, if you use k-point parallelization you can get away with gigabit Ethernet, 
as Paolo mentioned.
Best wishes,
Michal Krompiec
Merck KGaA

On Thu, Nov 5, 2020 at 11:40 PM Baer, Bradly via users 
<[email protected]> wrote:
Paolo,

I believe the nodes I am using have gigabit connections. There are additional 
nodes that have 10 or 25 gigabit connections but I don't think I would land on 
one of them without specifically requesting them.  What communication speed 
would be appropriate for QE's needs?

I also did consider trying to manually set the parallelization but I don't 
currently know enough about SLURM to identify each node and ensure that all 16 
cores assigned from a pool are on the same node.  I will keep it in mind though 
as a possible future solution.

Thanks,
Brad

________________________________
From: Paolo Giannozzi <[email protected]>
Sent: Thursday, November 5, 2020 3:54 PM
To: Baer, Bradly <[email protected]>; Quantum ESPRESSO users Forum 
<[email protected]>
Subject: Re: [QE-users] Running efficiently on multiple nodes

Are there fast communications between the two nodes? If not, the parallel 
distributed 3D FFT will be very slow (note the time taken by fft_scatt_yz). You 
might find it convenient to exploit k-point parallelization, which requires much 
less communication: for instance, "mpirun -n 32 pw.x -nk 2" (2 pools of 16 
processors, each pool performing parallel FFT). However, you have to figure out 
a way to place the first pool of 16 processors on node 1 and the second on 
node 2 (or vice versa, as long as FFT parallelization happens inside a node and 
k-point parallelization across nodes).
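
As a rough illustration, the command quoted above can be generated from the node count and cores per node, using one pool per node. This is only a sketch: the helper name is hypothetical, it uses only the "-n" and "-nk" options quoted in this message, and it does not by itself guarantee that each pool lands on a single node, which depends on how the MPI launcher and scheduler place the ranks.

```python
import shlex


def build_pw_command(n_nodes, cores_per_node, executable="pw.x"):
    """Return an mpirun argument list with one k-point pool per node."""
    if n_nodes < 1 or cores_per_node < 1:
        raise ValueError("n_nodes and cores_per_node must be positive")
    n_ranks = n_nodes * cores_per_node
    # -n: total number of MPI processes; -nk: number of k-point pools.
    return ["mpirun", "-n", str(n_ranks), executable, "-nk", str(n_nodes)]


if __name__ == "__main__":
    # Two nodes of 16 cores each, as in this thread.
    print(shlex.join(build_pw_command(2, 16)))  # mpirun -n 32 pw.x -nk 2
```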

Paolo

On Thu, Nov 5, 2020 at 7:29 PM Baer, Bradly via users 
<[email protected]> wrote:
Paolo,

Thank you for your suggestion.  I will add recompiling to move to 6.6 to my 
to-do list.  For now, I corrected the pseudopotential files as you indicated and 
the calculation ran successfully.  It has become slightly faster, but still 
much slower than running on a single node (3:30s vs 0:30s).  Is there more that 
I should be doing to improve performance or is my test problem too small to see 
the benefits of parallelization?

Thanks,
Brad

________________________________
From: users <[email protected]> on behalf of Paolo Giannozzi 
<[email protected]>
Sent: Thursday, November 5, 2020 10:01 AM
To: Quantum ESPRESSO users Forum <[email protected]>
Subject: Re: [QE-users] Running efficiently on multiple nodes

On Thu, Nov 5, 2020 at 3:05 PM Baer, Bradly <[email protected]> wrote:

Pseudo file Ga.pbe-dn-kjpaw_psl.1.0.0.UPF has been fixed on the fly.
To avoid this message in the future, permanently fix
 your pseudo files following these instructions:
https://gitlab.com/QEF/q-e/blob/master/upftools/how_to_fix_upf.md

This is a possible source of trouble if the output directory is not visible to 
all processors. Please try one of the following:
- do what is suggested (or simply: edit Ga.pbe-dn-kjpaw_psl.1.0.0.UPF, 
replace all occurrences of "&" with "&amp;")
- get version 6.6, which reads the pseudopotential file on one processor and 
broadcasts its contents to all other processes
- get the development version, which in addition is not sensitive to the 
presence of nonstandard "&" in the files.
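
The manual edit in the first option could be scripted roughly as follows. This is a minimal sketch, not the procedure from the linked instructions: the function names are hypothetical, and as a design choice it skips any "&" that already begins a standard XML entity, so that running it twice does not produce "&amp;amp;". It also assumes the file can be read with the platform's default text encoding.

```python
import re
import shutil
from pathlib import Path

# Matches "&" that does not already start a standard XML entity
# (&amp; &lt; &gt; &quot; &apos;) or a numeric character reference.
_BARE_AMP = re.compile(r"&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(text):
    """Replace each bare "&" in text with "&amp;"."""
    return _BARE_AMP.sub("&amp;", text)


def fix_upf_file(path):
    """Escape bare "&" in a file in place, keeping a .bak copy.

    Returns True if the file was modified, False if it was already clean.
    """
    path = Path(path)
    original = path.read_text()
    fixed = escape_bare_ampersands(original)
    if fixed == original:
        return False
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    path.write_text(fixed)
    return True


if __name__ == "__main__":
    print(escape_bare_ampersands("Generated by A & B"))
```

Calling fix_upf_file("Ga.pbe-dn-kjpaw_psl.1.0.0.UPF") in the pseudopotential directory would apply the same replacement to the file named above.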

Paolo


-Brad

________________________________
From: users <[email protected]> on behalf of Paolo Giannozzi 
<[email protected]>
Sent: Thursday, November 5, 2020 2:33 AM
To: Quantum ESPRESSO users Forum <[email protected]>
Subject: Re: [QE-users] Running efficiently on multiple nodes

On Wed, Nov 4, 2020 at 11:28 PM Baer, Bradly <[email protected]> wrote:

Now that I have two nodes, the script for a single node results in a crash 
shortly after reading in the pseudopotentials.

Which version of QE are you using, which crash do you obtain, and with which 
executable?
Paolo
--
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222

_________________
Quantum ESPRESSO is supported by MaX (www.max-centre.eu)
users mailing list [email protected]
https://lists.quantum-espresso.org/mailman/listinfo/users

