On 21 Jan 2011, at 21:11, Yuyang Zhang wrote:

> Dear All,
> 
> I am currently running a calculation for a system with 2300 electrons in a
> 20x22x25 angstrom^3 supercell.
> 
> I always submit jobs to a Blue Gene-type machine with 512 CPUs (Power 470)
> connected with InfiniBand.
> 
> In VASP, a single SCF step takes about 150 sec (spin-polarized,
> gamma point only, 400 eV energy cutoff).
> 
> In PWSCF, with a 25 Ry energy cutoff, a single SCF step takes 3000 sec
> (spin-polarized, gamma point only, -npool=2).
> 
> I checked the "massive parallel" section of the manual and the archive of this
> mailing list, and tried the -ntg flag when submitting jobs, but saw no
> significant improvement.
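
For reference, the kind of launch line meant here would look roughly like the
sketch below; the MPI launcher (mpirun), the -npool/-ntg values and the
input/output file names are only illustrative and depend on your machine and
job script:

    # illustrative launch: pools split k-points/spin, task groups split the FFTs
    mpirun -np 512 pw.x -npool 2 -ntg 4 -input pw.scf.in > pw.scf.out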

In that section you should also have found this:
Since v.4.1, ScaLAPACK can be used to diagonalize block-distributed matrices,
yielding better speed-up than the default algorithms for large (> 1000)
matrices, when using a large number of processors (> 512). If you want to test
ScaLAPACK, use configure -with-scalapack. This will add -D__SCALAPACK to DFLAGS
in make.sys and set LAPACK_LIBS to something like:

    LAPACK_LIBS = -lscalapack -lblacs -lblacsF77init -lblacs -llapack
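
As a minimal sketch (the launcher, input/output file names and process counts
are only illustrative; -ndiag must be a square number not larger than the
number of processors in each pool), the rebuild and run would look something
like:

    # reconfigure with ScaLAPACK support and rebuild pw.x
    ./configure -with-scalapack
    make pw

    # at run time, request a 16x16 = 256-processor linear-algebra group per pool
    mpirun -np 512 pw.x -npool 2 -ndiag 256 -input pw.scf.in > pw.scf.out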

Are you using parallel diagonalization with ScaLAPACK?

GS

> 
> There is no reason PWSCF should run 20 times slower than VASP.  Does anyone
> have experience improving the parallel efficiency for such large systems?
> 
> Best,
> 
> Yuyang Zhang
> 
> Nanoscale Physics and Devices Laboratory
> Institute of Physics, Chinese Academy of Sciences
> Beijing 100190, P. R. China
> 
> 


Gabriele Sclauzero, EPFL SB ITP CSEA
   PH H2 462, Station 3, CH-1015 Lausanne
