On Wed, May 27, 2020 at 4:27 PM Michal Krompiec <[email protected]> wrote:
> Hello,
> How can I minimize inter-node MPI communication in a pw.x run? My
> system doesn't have Infiniband and inter-node MPI can easily become
> the bottleneck.
> Let's say I'm running a calculation with 4 k-points, on 4 nodes, with
> 56 MPI tasks per node. I would then use -npool 4 to create 4 pools for
> the k-point parallelization. However, it seems that the
> diagonalization is by default parallelized imperfectly (or isn't it?):
>
>     Subspace diagonalization in iterative solution of the eigenvalue
>     problem:
>       one sub-group per band group will be used
>       scalapack distributed-memory algorithm (size of sub-group: 7*7 procs)
>
> So far, speedup on 4 nodes vs 1 node is 3.26x. Is it normal or does it
> look like it can be improved?
>
> Best regards,
>
> Michal Krompiec
> Merck KGaA
> Southampton, UK

-- 
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222
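For reference, the setup described in the quoted question can be sketched as a launch command. This is a hedged sketch, not a recommendation: the launcher name and input/output file names are assumptions, and -ndiag 49 simply makes explicit the 7x7 ScaLAPACK sub-group that pw.x reported as its default.

```shell
# Sketch of the run described above (assumed file names and launcher):
# 4 nodes x 56 MPI tasks/node = 224 tasks total.
# -npool 4  : split the 224 tasks into 4 k-point pools (one per k-point),
#             56 tasks per pool, so k-point data stays within a pool.
# -ndiag 49 : use a 7x7 = 49-process ScaLAPACK grid for the subspace
#             diagonalization inside each pool (the default pw.x chose,
#             since 49 is the largest square not exceeding 56).
mpirun -np 224 pw.x -npool 4 -ndiag 49 -in pwscf.in > pwscf.out
```

Mapping each pool onto a single node (e.g. via the scheduler's node/rank placement options) would keep both the k-point and the diagonalization communication intra-node, which is the point of the original question.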
_______________________________________________ Quantum ESPRESSO is supported by MaX (www.max-centre.eu/quantum-espresso) users mailing list [email protected] https://lists.quantum-espresso.org/mailman/listinfo/users
