On Feb 12, 2009, at 15:06, David Farrell wrote:

> I found when I took the 432 atom system I sent you, and ran it on
> 128 cores in smp mode
> (1 MPI process/node - 2 GB per process) it did work (-ntg 32 -ndiag
> 121

32 task groups? That's a lot.

> as well as -ntg 4 -ndiag 121)

4 looks more reasonable in my opinion.

> - the system didn't fit into memory in vn mode (4 mpi processes/
> node -
> 512 MB per process)

That job requires approx. 100 MB of dynamically allocated RAM per
process, plus a few tens of MB of work space. Why it does not fit
into 512 MB is a mystery, unless each process comes with a copy of
all libraries. If this is the case, the maximum you can fit into
512 MB is a code printing "Hello world" in parallel.

By the way: the default number of bands in metallic calculations
can be trimmed by a significant amount (e.g. 500 instead of 576).

> I then tried the system in dual mode (2 mpi processes/node - 1 GB
> per process)
> using -ntg 4 and -ndiag 121. In this case, the cholesky error came up:

The code performs exactly the same operations, independently of how
the MPI processes are distributed. It looks like yet another BlueGene
weirdness, like this:

http://www.democritos.it:8888/O-sesame/chngview?cn=5777
http://www.democritos.it:8888/O-sesame/chngview?cn=5932

That one, however, affected only the internal parallel diagonalization,
not the new ScaLAPACK algorithm. I do not see any evidence that there
is anything wrong with the code itself.

Paolo
---
Paolo Giannozzi, Democritos and University of Udine, Italy
