On Tue, 11 Sep 2012, Schmidt U. wrote:

Hi Mark Dixon,
would the mentioned patch also help with my problem, the virtual memory
overload on the first node in massively parallel jobs?

overhead_vmem = bash_vmem + mpirun_vmem + (nodes - 1)*qrsh_vmem

Udo

Hi Udo,

Apologies for the delay in responding, but I've been away.

Using the memory cgroup controller to measure vmem should reduce the *_vmem values in the above equation (both through more accurate readings and by avoiding double-counting), so it should help. But it will not eliminate the problem, because the qrsh_vmem term is still multiplied by (nodes - 1).

I see that Dave has suggested trying to configure your MPI to use tree-based launching - that sounds like a much better solution if your MPI supports it.
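
To give a rough feel for why the launch method matters more than the measurement method, here is a small Python sketch of the formula quoted above. All the per-process figures are made-up placeholders, not measurements, and the `fanout` parameter is only my assumption about how a tree-based launch behaves (the first node starts at most `fanout` qrsh processes instead of one per remote node); check what your MPI actually does.

```python
def first_node_overhead(nodes, bash_vmem, mpirun_vmem, qrsh_vmem, fanout=None):
    """Vmem overhead on the first node, following the quoted formula.

    fanout=None -> linear launch: one qrsh per remote node (nodes - 1 in total).
    fanout=k    -> ASSUMED tree launch: the first node starts at most k qrsh.
    Units are whatever the *_vmem arguments use (MB in the example below).
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    remote_nodes = nodes - 1
    if fanout is None:
        qrsh_count = remote_nodes
    else:
        qrsh_count = min(remote_nodes, fanout)
    return bash_vmem + mpirun_vmem + qrsh_count * qrsh_vmem


# Hypothetical figures in MB, chosen only for illustration.
linear = first_node_overhead(64, bash_vmem=20, mpirun_vmem=100, qrsh_vmem=50)
tree = first_node_overhead(64, bash_vmem=20, mpirun_vmem=100, qrsh_vmem=50, fanout=2)
print(linear, tree)
```

With these placeholder numbers the linear launch costs 3270 MB on the first node and the assumed tree launch 220 MB. Shrinking qrsh_vmem through better accounting only scales the per-process term, whereas a tree launch removes the dependence on the node count.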

All the best,

Mark
--
-----------------------------------------------------------------
Mark Dixon                       Email    : [email protected]
HPC/Grid Systems Support         Tel (int): 35429
Information Systems Services     Tel (ext): +44(0)113 343 5429
University of Leeds, LS2 9JT, UK
-----------------------------------------------------------------