On 27.02.2013 at 20:56, Mikael Brandström Durling wrote:

> On 27 Feb 2013 at 20:38, Reuti <[email protected]> wrote:
> 
>> On 27.02.2013 at 16:22, Mikael Brandström Durling wrote:
>> 
>>> Ok, it seems somewhat hard to patch without deep knowledge of the inner 
>>> workings of GE. Interestingly, if I manually start a qrsh -pe openmpi_span 
>>> N, and then qrsh -inherit into the slave, that slave has knowledge of the 
>>> number of slots allocated to it in the environment ($NSLOTS),

I added some comments here:

https://arc.liv.ac.uk/trac/SGE/ticket/1451

-- Reuti


>>> but in execd's do_ck_to_do (execd_ck_to_do.c), the nslots value that the 
>>> h_vmem limit is multiplied by must be another value (1). I'll see which 
>>> workaround to go for, as we don't trust our users to stay within the limit 
>>> they ask for. Many of them have no clue as to what resources would be a 
>>> reasonable request.
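>>> 
>>> For reference, roughly how I tested it (the host name and the numbers 
>>> below are only illustrative):
>>> 
>>>   # interactive master job, 8 slots spread over several hosts
>>>   $ qrsh -pe openmpi_span 8 -l h_vmem=2G
>>>   # from the master node of that job, start a task on a slave host
>>>   $ qrsh -inherit node02 env | grep NSLOTS
>>>   NSLOTS=4          # the slots granted on node02 show up here
>>>   # yet the limit enforced for the task on node02 corresponds to
>>>   # h_vmem x 1, not h_vmem x 4
>>> 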
>>> Thanks for your rapid reply,
>> 
>> You're welcome.
> 
> Thanks…
> 
>> 
>> In case you look deeper into the issue, it's also worth noting that there 
>> is no option to specify the target queue for `qrsh -inherit` when you get 
>> slots from different queues on the slave system:
>> 
>> https://arc.liv.ac.uk/trac/SGE/ticket/813
>> 
> 
> Ok. This could lead to incompatible changes to the -inherit behaviour, if 
> the caller of `qrsh -inherit` has to specify the requested queue. On the 
> other hand, I have seen cases where an OMPI job was allotted slots from two 
> different queues on an exec host, which resulted in Open MPI launching two 
> `qrsh -inherit` calls to the same host.
> 
> 
>> Maybe it's related to $NSLOTS. If you get slots from one and the same 
>> queue it indeed seems to be correct for the slave nodes. But for a local 
>> `qrsh -inherit` on the master node of the parallel job it looks like it is 
>> set to the overall slot count instead.
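>> 
>> For example (a made-up 4+4 slot job across node01 and node02, node01 
>> being the master node):
>> 
>>   $ echo $NSLOTS                            # in the job script: 8
>>   $ qrsh -inherit node02 env | grep NSLOTS  # on the slave: NSLOTS=4
>>   $ qrsh -inherit node01 env | grep NSLOTS  # locally on the master:
>>                                             # NSLOTS=8 instead of 4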
> 
> 
> I noted that too. I will see if I can find some spare time to hunt this 
> down. It seems that an ideal solution would be that $NSLOTS is set to the 
> allotted number of slots for the current job (i.e. correcting the number in 
> the master job), and that `qrsh -inherit` could take an argument of the 
> 'queue@host' type.
> 
> I'll think about this and add it as a comment to the ticket. Is the trac 
> instance at arc.liv.ac.uk the best place, even though we are running OGS? I 
> suppose so?
> 
> Mikael
> 
>> 
>> -- Reuti
>> 
>> 
>>> Mikael
>>> 
>>> 
>>> On 26 Feb 2013 at 21:32, Reuti <[email protected]> wrote:
>>> 
>>>> On 26.02.2013 at 19:45, Mikael Brandström Durling wrote:
>>>> 
>>>>> I have recently been trying to run OpenMPI jobs spanning several nodes 
>>>>> on our small cluster. However, it seems that the sub-jobs launched with 
>>>>> qrsh -inherit (by OpenMPI) get killed at a memory limit of h_vmem, 
>>>>> instead of h_vmem times the number of slots allocated to the slave node.
>>>> 
>>>> Unfortunately this is correct:
>>>> 
>>>> https://arc.liv.ac.uk/trac/SGE/ticket/197
>>>> 
>>>> The only way around it: use virtual_free instead and hope that the users 
>>>> comply with this estimated value.
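>>>> 
>>>> A rough sketch of that workaround (names and values are only examples, 
>>>> adjust them to your setup): make virtual_free a consumable, give each 
>>>> exec host a capacity, and let the jobs request it per slot:
>>>> 
>>>>   # complex definition (qconf -mc), the virtual_free line:
>>>>   virtual_free   vf   MEMORY   <=   YES   YES   0   0
>>>>   # per-host capacity (qconf -me node01, complex_values entry):
>>>>   complex_values   virtual_free=64G
>>>>   # submission, requested per slot:
>>>>   qsub -pe openmpi_span 8 -l virtual_free=2G job.sh
>>>> 
>>>> It's only bookkeeping though, no hard limit is enforced, hence the hope 
>>>> that the users stay within what they requested.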
>>>> 
>>>> -- Reuti
>>>> 
>>>> 
>>>>> Is there any way to get the correct allocation on the slave nodes? I 
>>>>> vaguely remember having read something about this. As it behaves now, 
>>>>> it is impossible for us to run large-memory MPI jobs. Would making 
>>>>> h_vmem a per-job consumable, rather than a per-slot one, give any other 
>>>>> behaviour?
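>>>>> 
>>>>> (By that I mean changing the consumable column for h_vmem from YES to 
>>>>> JOB in the complex configuration, roughly like this, if I read the man 
>>>>> page right:
>>>>> 
>>>>>   # qconf -mc, h_vmem line:
>>>>>   h_vmem   h_vmem   MEMORY   <=   YES   JOB   0   0
>>>>> 
>>>>> so that the request is accounted once per job rather than once per 
>>>>> slot.)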
>>>>> 
>>>>> We are using OGS GE2011.11.
>>>>> 
>>>>> Thanks for any hints on this issue,
>>>>> 
>>>>> Mikael
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>> 
> 
> 


