On 26.02.2013 at 19:45, Mikael Brandström Durling wrote:

> I have recently been trying to run Open MPI jobs spanning several nodes on
> our small cluster. However, it seems to me that sub-jobs launched with qrsh
> -inherit (by Open MPI) get killed at a memory limit of h_vmem, instead of
> h_vmem times the number of slots allocated to the sub-node.

Unfortunately this is correct:

https://arc.liv.ac.uk/trac/SGE/ticket/197

The only way around this: use virtual_free instead and hope that the users
comply with this estimated value.
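
To make the discrepancy concrete, here is a small Python sketch of the
arithmetic. The numbers (4 GiB per slot, 8 slots on the remote node) and the
function names are made up for illustration; nothing here talks to Grid
Engine, it only restates the behaviour described in this thread.

```python
GIB = 1024 ** 3

def expected_limit(h_vmem_per_slot: int, slots_on_node: int) -> int:
    """Limit one would expect for a per-slot request:
    h_vmem times the slots granted on that node."""
    return h_vmem_per_slot * slots_on_node

def observed_limit(h_vmem_per_slot: int, slots_on_node: int) -> int:
    """Limit as reported in this thread for sub-tasks started with -inherit:
    plain h_vmem, independent of the slot count."""
    return h_vmem_per_slot

h_vmem = 4 * GIB   # hypothetical per-slot request
slots = 8          # hypothetical slots granted on the remote node

print(expected_limit(h_vmem, slots) // GIB, "GiB expected")
print(observed_limit(h_vmem, slots) // GIB, "GiB actually enforced")
```

With these assumed values, eight ranks on the remote node would have to share
4 GiB in total instead of 32 GiB.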

-- Reuti


> Is there any way to get the correct allocation on the sub-nodes? I vaguely
> remember having read something about this. As it behaves now, it is
> impossible for us to run large-memory MPI jobs. Would making h_vmem a
> per-job consumable, rather than a per-slot one, change this behaviour?
> 
> We are using OGS GE2011.11.
> 
> Thanks for any hints on this issue,
> 
> Mikael
> 
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users

