[gridengine users] memory consumption while running the jobs in parallel environment

sudha.penmetsa Wed, 03 Jun 2015 23:27:07 -0700

Hi,

When we run a job in a parallel environment on the grid using 4 cores, with a 
total memory consumption of 40G, we submit it like this, for example:


qrsh -V -cwd -q test.q -l mem_free=40G,h_vmem=10G -pe sharedmem 4 sleep 40

However, this assumes that each thread consumes at most 10G of memory, so that 
the total h_vmem consumed on the execution host is 40G.

Our experiments have shown that the job needs the full 40G when it runs on a 
single core, but if we divide the 40G by four (running with "-pe sharedmem 4") 
the job crashes with an out-of-memory error.
One option is to run it like this:
qrsh -V -cwd -q test.q -l mem_free=40G,h_vmem=40G -pe sharedmem 4 sleep 40
However, we then end up consuming 160G of h_vmem on the execution host.
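
To make the accounting explicit, here is a minimal sketch of the arithmetic 
behind the two commands above. It assumes h_vmem is debited per slot, as 
described in this message; total_h_vmem_gb is a hypothetical helper written 
for illustration, not a Grid Engine command.

```shell
# Hypothetical helper (not part of Grid Engine): h_vmem debited from the
# execution host, assuming the request is counted once per slot.
total_h_vmem_gb() {
    per_slot_gb=$1   # value passed to -l h_vmem=..., in gigabytes
    slots=$2         # slot count passed to -pe sharedmem
    echo $(( per_slot_gb * slots ))
}

total_h_vmem_gb 10 4   # h_vmem=10G with 4 slots
total_h_vmem_gb 40 4   # h_vmem=40G with 4 slots
```

Under that assumption the first call prints 40 and the second prints 160, 
which matches the totals quoted above.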

So how can we ensure that each thread consumes memory only if it needs it?

- Only 40G of h_vmem in total should be consumed on the host, while each 
thread can still use up to 40G of memory if needed.

One option, of course, is to leave out the h_vmem definition:
qrsh -V -cwd -q test.q -l mem_free=40G -pe sharedmem 4 sleep 40

However, other users might then eat up the memory on the host, and our run 
crashes again.

Regards,
Sudha


