On Tue, 11 Sep 2012, Schmidt U. wrote:
Hi Mark Dixon,
might the patch you mentioned also help with my problem of virtual memory
overload on the first node in massively parallel jobs?
overhead_vmem = bash_vmem + mpirun_vmem + (nodes - 1) * qrsh_vmem
Udo
Hi Udo,
Apologies for the delay in responding, but I've been away.
Using the memory cgroup controller to measure vmem should reduce the
*_vmem values in the above equation (both by more accurate readings and by
avoiding double-counting), so should have a beneficial effect. But it will
not eliminate the problem.
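
To make that concrete, here is a rough sketch of the equation above in Python. The function name and all the numbers are made up for illustration (they are not measurements from any real cluster); the point is only that the (nodes - 1) term still grows linearly even if every *_vmem value is halved.

```python
def first_node_overhead_vmem(bash_vmem, mpirun_vmem, qrsh_vmem, nodes):
    """Overhead on the first node, following the equation quoted above:
    overhead_vmem = bash_vmem + mpirun_vmem + (nodes - 1) * qrsh_vmem
    All sizes are in the same unit (MiB here, as an assumption)."""
    if nodes < 1:
        raise ValueError("a job needs at least one node")
    return bash_vmem + mpirun_vmem + (nodes - 1) * qrsh_vmem


# Hypothetical per-process readings in MiB, chosen only for illustration.
before = first_node_overhead_vmem(bash_vmem=20, mpirun_vmem=100,
                                  qrsh_vmem=50, nodes=64)

# Suppose more accurate accounting halves every reading (an assumption).
after = first_node_overhead_vmem(bash_vmem=10, mpirun_vmem=50,
                                 qrsh_vmem=25, nodes=64)

print(before, after)
```

With these invented values the overhead drops from 3270 MiB to 1635 MiB, which helps, but it is still dominated by the 63 qrsh processes and keeps growing with the node count.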
I see that Dave has suggested trying to configure your MPI to use
tree-based launching - that sounds like a much better solution if your MPI
supports it.
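
As a rough illustration of why tree-based launching helps, the sketch below compares how many remote launchers the first node has to start itself. It assumes a binomial launch tree, where the first node starts about ceil(log2(nodes)) launchers and those start the rest; your MPI may use a different tree shape or fan-out, and the function names and vmem figures are hypothetical.

```python
def linear_launchers(nodes):
    """Flat launch: the first node starts one qrsh per remote node."""
    return nodes - 1


def binomial_tree_launchers(nodes):
    """Assumed binomial tree: the first node starts ceil(log2(nodes))
    launchers. (nodes - 1).bit_length() equals ceil(log2(nodes)) for
    every nodes >= 1."""
    return (nodes - 1).bit_length()


def overhead_vmem(launchers, bash_vmem=20, mpirun_vmem=100, qrsh_vmem=50):
    """First-node overhead in MiB; the default sizes are illustrative only."""
    return bash_vmem + mpirun_vmem + launchers * qrsh_vmem


nodes = 64
print(linear_launchers(nodes), overhead_vmem(linear_launchers(nodes)))
print(binomial_tree_launchers(nodes),
      overhead_vmem(binomial_tree_launchers(nodes)))
```

Under these assumptions a 64-node job goes from 63 qrsh processes on the first node to 6, so the qrsh term grows with the logarithm of the node count rather than linearly.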
All the best,
Mark
--
-----------------------------------------------------------------
Mark Dixon Email : [email protected]
HPC/Grid Systems Support Tel (int): 35429
Information Systems Services Tel (ext): +44(0)113 343 5429
University of Leeds, LS2 9JT, UK
-----------------------------------------------------------------