On 1/6/07, Alexander Shaposhnikov <shaposh at isp.nsc.ru> wrote:
>
> The SMP speedup for relatively large pw.x jobs (like an scf calc. of a
> 192-atom ZrO2 supercell with 50 ecut) is ~4.5. As I said, cdiaghg does
> not parallelize and takes ~1/3 of the exec time of the whole 8-CPU job.
> If it could be run efficiently in parallel to get, say, a 2X speedup
> on 8 CPUs, I could achieve ~5.5X total SMP speedup.
> That's the difference.
>
> On the other hand, memory contention is indeed a huge problem for
> cp/cpmd codes, so this Xeon machine is barely faster than a dual
> Opteron 280 2.4GHz (4 cores) for cpmd. For pw.x, however, it is ~2.5
> times faster for large jobs, and could be made even better with some
> working diaghg parallelization algorithm.

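As a rough sanity check of the estimate quoted above, here is a minimal sketch of the Amdahl-style arithmetic. It assumes the ~1/3 share is measured on the 8-CPU run and that everything outside the diagonalization stays unchanged; the function name and its parameters are hypothetical and are not part of pw.x.

```python
def total_speedup(current_speedup, fraction, local_speedup):
    """Overall speedup after accelerating one part of an existing run.

    current_speedup: speedup of the run as measured today (vs. 1 CPU)
    fraction:        share of today's wall time spent in the part to accelerate
    local_speedup:   factor by which that part alone would get faster
    """
    # New wall time relative to today's wall time: the untouched part
    # keeps its share, the accelerated part shrinks by local_speedup.
    new_relative_time = (1.0 - fraction) + fraction / local_speedup
    return current_speedup / new_relative_time


# Numbers from the quoted message: ~4.5x today, ~1/3 of the time in
# cdiaghg, and a hoped-for 2x speedup of that part on 8 CPUs.
estimate = total_speedup(4.5, 1.0 / 3.0, 2.0)
print(f"estimated total speedup: {estimate:.1f}x")
```

With those inputs the run time shrinks to 5/6 of its current value, so the estimate comes out at about 5.4x, in line with the ~5.5X mentioned above.
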
Hmm... which version of pw.x are you referring to? At least in
version 3.2 there should be a parallel diagonalization that is used in
pw.x (subroutine cdiagonalize in Modules/ptoolkit.f90). As far as I
understand, the diagonalization is not easy to parallelize, so there is
an estimator that checks whether the serial or the parallel algorithm
would be more efficient.

Cheers,
axel.

-- 
=======================================================================
Axel Kohlmeyer   akohlmey at cmm.chem.upenn.edu   http://www.cmm.upenn.edu
  Center for Molecular Modeling   --   University of Pennsylvania
Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
tel: 1-215-898-1582,  fax: 1-215-573-6233,  office-tel: 1-215-898-5425
=======================================================================
If you make something idiot-proof, the universe creates a better idiot.
