In conclusion, and if I understand correctly: PWscf runs only with 1 MPI process per core, but it runs 800+ atoms on as few as 32 processes or as many as 512, as I was expecting.
This confirms my opinion that the problem is on the BG side and not on the PWscf side, since there is NOTHING in the Fortran code that depends upon where the MPI processes are running. Of course one can never rule out the possibility that some obscure bug is triggered only in such special cases, but it seems to me highly unlikely.

Implementation of mixed MPI-OpenMP parallelization is under development, but it will take some time. In the meantime, if you can link OpenMP-aware mathematical libraries, you might get some speedup. If you do not need k-points, and if you know how to deal with metallic systems, you might try CP instead of PWscf - it is better tested for large systems - but I don't expect a different behavior, since the routines performing parallel subspace diagonalization are the same ones that perform iterative orthonormalization, so the trouble is likely to move from "cholesky" to "ortho".

You might try to find out what is wrong, since you have two cases that should yield exactly the same results but don't. It may take a lot of time and lead to no result, though. You may also try to raise this issue with the technical staff of the computing center.

Paolo

--
Paolo Giannozzi, Democritos and University of Udine, Italy
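P.S. In case it is useful to see what "mixed MPI-OpenMP parallelization" means in practice, below is a minimal, generic Fortran sketch. It is not taken from the PWscf sources (which, as said above, do not yet implement it); the program name, array, and loop are just placeholders. The idea is that each MPI process gets its share of the data, and the loop local to each process is spread over OpenMP threads; an OpenMP-aware mathematical library does roughly the same inside its own routines.

program hybrid_sketch
  ! Generic illustration of mixed MPI-OpenMP parallelism;
  ! NOT taken from the PWscf sources.
  use mpi
  use omp_lib
  implicit none
  integer, parameter :: n = 1000000
  integer :: ierr, provided, rank, nprocs, i
  real(8) :: local_sum, total_sum
  real(8), allocatable :: v(:)

  ! Ask for FUNNELED support: only the master thread makes MPI calls
  call MPI_Init_thread(MPI_THREAD_FUNNELED, provided, ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

  ! Each process holds its own (placeholder) share of the data
  allocate(v(n))
  v = 1.0d0 / real(nprocs, 8)

  ! Work local to one MPI process is spread over OpenMP threads ...
  local_sum = 0.0d0
  !$omp parallel do reduction(+:local_sum)
  do i = 1, n
     local_sum = local_sum + v(i)
  end do
  !$omp end parallel do

  ! ... while the sum across processes is still a plain MPI reduction
  call MPI_Reduce(local_sum, total_sum, 1, MPI_DOUBLE_PRECISION, MPI_SUM, &
                  0, MPI_COMM_WORLD, ierr)
  if (rank == 0) print '(a,f12.2,a,i0,a,i0,a)', 'sum =', total_sum, &
       ' on ', nprocs, ' MPI processes x ', omp_get_max_threads(), ' OpenMP threads'

  deallocate(v)
  call MPI_Finalize(ierr)
end program hybrid_sketch

Such a program would typically be built with the MPI compiler wrapper plus the compiler's OpenMP flag, and the number of threads per MPI process is set at run time with the OMP_NUM_THREADS environment variable.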
