So you mean it's not normal that bands.x takes more than 7 hours? What's suspicious is that the reported actual CPU time is much less, only 16 minutes. What could be the problem? Here's the output of a bands.x calculation:
     Program BANDS v.5.1.2 starts on  5Dec2015 at  9:15:18

     This program is part of the open-source Quantum ESPRESSO suite
     for quantum simulation of materials; please cite
         "P. Giannozzi et al., J. Phys.:Condens. Matter 21 395502 (2009);
          URL http://www.quantum-espresso.org",
     in publications or presentations arising from this work. More details at
     http://www.quantum-espresso.org/quote

     Parallel version (MPI), running on    64 processors
     R & G space division:  proc/nbgrp/npool/nimage =      64

     Reading data from directory:
     ./tmp/Ni3HTP2.save

     Info: using nr1, nr2, nr3 values from input
     Info: using nr1s, nr2s, nr3s values from input

     IMPORTANT: XC functional enforced from input :
     Exchange-correlation      =  SLA  PW   PBE  PBE ( 1  4  3  4 0 0)
     Any further DFT definition will be discarded
     Please, verify this is what you really want

     file H.pbe-rrkjus.UPF: wavefunction(s)  1S renormalized
     file C.pbe-rrkjus.UPF: wavefunction(s)  2S 2P renormalized
     file N.pbe-rrkjus.UPF: wavefunction(s)  2S renormalized
     file Ni.pbe-nd-rrkjus.UPF: wavefunction(s)  4S renormalized

     Parallelization info
     --------------------
     sticks:   dense  smooth     PW     G-vecs:    dense   smooth      PW
     Min         588     588    151               92668    92668   12083
     Max         590     590    152               92671    92671   12086
     Sum       37643   37643   9677             5930831  5930831  773403

     Check: negative/imaginary core charge=   -0.000004    0.000000
     negative rho (up, down):  2.225E-03 0.000E+00

     high-symmetry point:  0.0000 0.0000 0.4981   x coordinate   0.0000
     high-symmetry point:  0.3332 0.5780 0.4981   x coordinate   0.6672
     high-symmetry point:  0.5000 0.2890 0.4981   x coordinate   1.0009
     high-symmetry point:  0.0000 0.0000 0.4981   x coordinate   1.5784

     Plottable bands written to file bands.out.gnu
     Bands written to file bands.out

     BANDS        :     0h16m CPU        7h38m WALL

   This run was terminated on:  16:53:49   5Dec2015

=------------------------------------------------------------------------------=
   JOB DONE.
=------------------------------------------------------------------------------=

On Saturday, 05
December 2015 21:03 CET, stefano de gironcoli <[email protected]> wrote:

The only parallelization that I see in bands.x is the basic one over R & G. If it is different from the parallelization used previously, you should use wf_collect. The code computes the overlap between the orbitals at k and k+dk in order to decide how to connect them. It's an nbnd^2 operation done band by band; not very efficient, evidently, but it should not take hours. You can set wf_collect=.true. and increase the number of processors.

stefano

On 05/12/2015 12:57, Maxim Skripnik wrote:

Thank you for the information. Yes, at the beginning of the pw.x output it says:

    Parallel version (MPI), running on 64 processors
    R & G space division: proc/nbgrp/npool/nimage = 64

Is bands.x parallelized at all? If so, where can I find information on that? There's nothing mentioned in the documentation:

http://www.quantum-espresso.org/wp-content/uploads/Doc/pp_user_guide.pdf
http://www.quantum-espresso.org/wp-content/uploads/Doc/INPUT_BANDS.html

What could be the reason for bands.x taking many hours to calculate the bands? The preceding pw.x calculation has already determined the energy for each k-point along the path (Gamma -> K -> M -> Gamma). There are 61 k-points and 129 bands. So what is bands.x actually doing besides reformatting that data? The input file job.bands looks like this:

&bands
   prefix = 'st1'
   outdir = './tmp'
/

The calculation is initiated by

    mpirun -np 64 bands.x < job.bands

Maxim Skripnik
Department of Physics
University of Konstanz

On Saturday, 05 December 2015 02:37 CET, stefano de gironcoli <[email protected]> wrote:

On 04/12/2015 22:53, Maxim Skripnik wrote:

> Hello, I'm a bit confused by the parallelization scheme of QE. First of
> all, I run calculations on a cluster with usually 1 to 8 nodes, each of
> which has 16 cores. There is very good scaling of pw.x, e.g. for
> structural relaxation jobs. I do not specify any particular
> parallelization scheme as mentioned in the documentation, i.e.
> I start the calculations with
>
>     mpirun -np 128 pw.x < job.pw
>
> on 8 nodes, 16 cores each. According to the documentation ni=1, nk=1
> and nt=1. So in which respect are the calculations parallelized by
> default? Why do the calculations scale so well without specifying ni,
> nk, nt, nd?

R and G parallelization is performed: the wavefunctions' plane waves, the density's plane waves, and slices of real-space objects are distributed across the 128 processors. A report of how this is done is given at the beginning of the output. Did you have a look at it?

> Second question is whether one can speed up bands.x calculations. Up to
> now I start these this way:
>
>     mpirun -np 64 bands.x < job.bands
>
> on 4 nodes, 16 cores each. Does it make sense to define nb for bands.x?
> If yes, what would be reasonable values?

Expect no gain; band parallelization is not implemented in bands.

stefano

> The systems of interest consist of typically ~50 atoms with periodic
> boundaries.
>
> Maxim Skripnik
> Department of Physics
> University of Konstanz

_______________________________________________
Pw_forum mailing list
[email protected]
http://pwscf.org/mailman/listinfo/pw_forum
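[Editor's note] Stefano's description of what bands.x is doing (forming the overlap <psi_n(k)|psi_m(k+dk)> for every pair of bands at neighbouring k-points, an nbnd^2 operation, to decide how to connect the bands) can be sketched in a few lines. This is a rough NumPy illustration with an invented in-memory array layout, not QE's actual distributed Fortran code:

```python
import numpy as np

def connect_bands(c_k, c_kdk):
    """Match bands at k to bands at k + dk by largest overlap.

    c_k, c_kdk: complex (npw, nbnd) arrays holding the plane-wave
    coefficients of the nbnd orbitals at two neighbouring k-points
    (a hypothetical layout; QE keeps these distributed over MPI tasks).
    Forming the full overlap matrix is the O(nbnd^2) step mentioned
    in the thread.
    """
    overlap = np.abs(c_k.conj().T @ c_kdk)  # |<psi_n(k)|psi_m(k+dk)>|, shape (nbnd, nbnd)
    return overlap.argmax(axis=1)           # for each band n at k: best match at k + dk

# Example: if the bands at k + dk are just a reshuffling of those at k,
# the overlap criterion recovers the shuffle exactly.
rng = np.random.default_rng(0)
c = rng.standard_normal((50, 8)) + 1j * rng.standard_normal((50, 8))
c /= np.linalg.norm(c, axis=0)   # normalized mock "orbitals"
perm = rng.permutation(8)
match = connect_bands(c, c[:, perm])
```

With 129 bands and 61 k-points this amounts to only sixty 129x129 overlap matrices, which supports Stefano's point that the operation itself should not take hours; the wall time must be going elsewhere.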
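[Editor's note] A footnote on the symptom that opened the thread: 0h16m CPU against 7h38m WALL means the job spent almost all its time waiting rather than computing, typically on disk I/O (e.g. each MPI task re-reading wavefunction files from ./tmp) or on synchronisation. The distinction between the two clocks is easy to demonstrate; a generic Python sketch, nothing QE-specific:

```python
import time

start_wall = time.perf_counter()   # wall-clock time
start_cpu = time.process_time()    # CPU time consumed by this process

time.sleep(0.5)                    # waiting (stands in for blocking I/O):
                                   # wall time advances, CPU time does not
total = sum(i * i for i in range(10**6))   # real computation: both clocks advance

wall = time.perf_counter() - start_wall
cpu = time.process_time() - start_cpu
# wall includes the full 0.5 s of sleeping; cpu covers only the summation,
# so cpu is far smaller than wall, the same signature as 0h16m CPU vs 7h38m WALL
```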
