Dear Axel and Paolo,

My situation is that the nscf calculation cannot run at all with the very dense k-point grid. I used a small cluster belonging to our group, and since I am its administrator I am sure the calculation did not overload the computing resources. I tested several situations, such as running the calculation with and without MPI. When the number of k-points is smaller, it runs in both cases. When the number of k-points is very large, pw.x without MPI runs fine, but with MPI it cannot run at all and fails immediately with the errors below, no matter how many processes I use. I used MPICH3. Maybe I should reconfigure it, or use Open MPI instead?

Thanks a lot.

Yours, Dingfu

Dingfu Shao, Ph.D
Institute of Solid State Physics
Chinese Academy of Sciences
P. O. Box 1129
Hefei 230031
Anhui Province
P. R. China
Email: dingfu.shao at gmail.com

From: Axel Kohlmeyer
Date: 2014-04-21 16:02
To: PWSCF Forum
Subject: Re: [Pw_forum] error in the NSCF calculation with a large number of kpoints.

On Mon, Apr 21, 2014 at 12:08 AM, Dingfu Shao <dingfu.shao at gmail.com> wrote:
> Dear QE users,
>
> I want to plot a Fermi surface with a very dense k-point grid, so that I
> can do some calculations such as the Fermi surface nesting function. A
> smaller number of k-points, such as 1200, works fine.
> But when I run the
> nscf calculation with a large number of k-points, such as 2000, the
> following error occurs:
>
> [proxy:0:0 at node6] HYD_pmcd_pmip_control_cmd_cb
> (./pm/pmiserv/pmip_cb.c:939): process reading stdin too slowly; can't
> keep up
> [proxy:0:0 at node6] HYDT_dmxu_poll_wait_for_event
> (./tools/demux/demux_poll.c:77): callback returned error status
> [proxy:0:0 at node6] main (./pm/pmiserv/pmip.c:206): demux engine error
> waiting for event
> [mpiexec at node0] control_cb (./pm/pmiserv/pmiserv_cb.c:202): assert
> (!closed) failed
> [mpiexec at node0] HYDT_dmxu_poll_wait_for_event
> (./tools/demux/demux_poll.c:77): callback returned error status
> [mpiexec at node0] HYD_pmci_wait_for_completion
> (./pm/pmiserv/pmiserv_pmci.c:197): error waiting for event
> [mpiexec at node0] main (./ui/mpich/mpiexec.c:331): process manager error
> waiting for completion
>
> Does anybody know what the problem is?

First of all, those are messages from your MPI library. My guess is that you are overloading your compute node with I/O, probably caused by excessive swapping. Do you use a sufficient number of nodes and k-point parallelization? You most likely should talk to your system administrators, since overloading nodes can cause disruptions in the overall service of the cluster, and then work with them to run your calculation better.

Axel.

> Best regards,
>
> Yours, Dingfu Shao

--
Dr. Axel Kohlmeyer  akohlmey at gmail.com  http://goo.gl/1wk0
College of Science & Technology, Temple University, Philadelphia PA, USA
International Centre for Theoretical Physics, Trieste. Italy.
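For what it's worth, the MPICH message "process reading stdin too slowly" is commonly seen when a large input file is fed to the parallel job through stdin redirection, which forces the Hydra process manager to forward stdin to the ranks. pw.x can instead open the input file itself via the -in option, and k-point pools (-npool) spread a dense grid across process groups, as Axel suggests. A hedged sketch of such an invocation (process and pool counts are placeholders; nscf.in is a hypothetical input file name, and the pool count must divide the number of MPI processes):

```shell
# Avoid piping the input through stdin, i.e. avoid:
#   mpirun -np 32 pw.x < nscf.in
# which makes the MPICH process manager forward stdin to the ranks.
# Let pw.x open the file directly instead:
mpirun -np 32 pw.x -in nscf.in > nscf.out

# Additionally split the dense k-point grid into pools
# (here 8 pools of 4 ranks each), so each pool handles
# only a fraction of the k-points:
mpirun -np 32 pw.x -npool 8 -in nscf.in > nscf.out
```

This is only a sketch under the assumption that the stdin forwarding is the trigger; whether it resolves the failure on this particular cluster would still need to be tested.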
_______________________________________________
Pw_forum mailing list
Pw_forum at pwscf.org
http://pwscf.org/mailman/listinfo/pw_forum
