On Sat, 2014-03-01 at 14:27 -0600, Peng Chen wrote:

> And the system is not that large (32 atoms, 400 nband, 8*8*8 k-points),
> which is run on 128 cores. I think you are probably right that QE is
> trying to allocate a large array somehow.
... and?

> On Fri, Feb 28, 2014 at 10:35 AM, Paolo Giannozzi
> <paolo.giannozzi at uniud.it> wrote:
>
> > On Fri, 2014-02-28 at 09:12 -0600, Peng Chen wrote:
> >
> > > I think it is memory, because the error message is like:
> > > 02/27/2014 14:06:20| main|zeta27|W|job 221982 exceeds job hard
> > > limit "h_vmem" of queue (2871259136.00000 > limit:2147483648.00000)
> > > - sending SIGKILL
> >
> > There are a few hints on how to reduce memory usage to the strict
> > minimum here:
> > http://www.quantum-espresso.org/wp-content/uploads/Doc/pw_user_guide/node19.html#SECTION000600100000000000000
> > If the FFT grid is large, reduce mixing_ndim from its default value
> > (8) to 4 or so. If the number of bands is large, distribute nbnd*nbnd
> > matrices using "-ndiag". If you have many k-points, save to disk with
> > disk_io='medium'. The message you get, "2871259136 >
> > limit:2147483648", makes me think that you crash when trying to
> > allocate an array whose size is at least 2871259136-2147483648 = a
> > lot. It shouldn't be difficult to figure out where such a large
> > array comes from.
> >
> > Paolo
> >
> > > I normally used h_stack=128M; it is working fine.
> > >
> > > On Fri, Feb 28, 2014 at 7:30 AM, Paolo Giannozzi
> > > <paolo.giannozzi at uniud.it> wrote:
> > >
> > > > On Thu, 2014-02-27 at 17:30 -0600, Peng Chen wrote:
> > > >
> > > > > P.S. Most of the jobs failed at the beginning of the scf
> > > > > calculation, and the length of the output scf file is zero.
> > > >
> > > > Are you sure the problem is the size of the RAM and not the
> > > > size of the stack?
> > > >
> > > > P.
> > > >
> > > > > On Thu, Feb 27, 2014 at 5:09 PM, Peng Chen
> > > > > <pchen229 at illinois.edu> wrote:
> > > > >
> > > > > > Dear QE users,
> > > > > >
> > > > > > Recently, our workstation was updated and there is a hard
> > > > > > limit on memory (2 GB per core). Some QE jobs constantly
> > > > > > fail (though not always) because one of the MPI processes
> > > > > > exceeds the RAM limit and is killed. I am wondering if
> > > > > > there is a way to distribute memory usage more evenly
> > > > > > across the cores.

--
Paolo Giannozzi, Dept. Chemistry&Physics&Environment,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222
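[For readers hitting the same limit: the settings Paolo names go in the
pw.x input namelists and on the command line. A minimal sketch follows;
only the memory-relevant lines are shown, and the process counts, file
names, and the -ndiag value of 16 are illustrative, not taken from the
thread.]

  &CONTROL
    disk_io = 'medium'   ! keep wavefunctions on disk, one k-point at a
                         ! time, instead of all of them in RAM
  /
  &ELECTRONS
    mixing_ndim = 4      ! default is 8; fewer stored mixing vectors
                         ! saves memory when the FFT grid is large
  /

invoked, for example, as

  mpirun -np 128 pw.x -ndiag 16 -in scf.in > scf.out

Here -ndiag 16 distributes the nbnd*nbnd subspace matrices over a 4x4
grid of processes rather than replicating them on every core; pw.x uses
a square number of processes for this, so square values no larger than
the total MPI process count are the natural choice.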
