On 27 Feb 2013, at 20:38, Reuti <[email protected]> wrote:
> On 27.02.2013 at 16:22, Mikael Brandström Durling wrote:
>
>> Ok, this seems somewhat hard to patch without deep knowledge of the inner
>> workings of GE. Interestingly, if I manually start a qrsh -pe openmpi_span N,
>> and then qrsh -inherit into the slave, that slave has knowledge of the
>> number of slots allocated to it in the environment ($NSLOTS), but in execd's
>> do_ck_to_do (execd_ck_to_do.c), the nslots value that the h_vmem limit is
>> multiplied with must be another value (1). I'll see what workaround to go
>> for, as we don't trust our users to stay within the limit they ask for. Many
>> of them have no clue as to what resources might be a reasonable request.
>> Thanks for your rapid reply,
>
> You're welcome. Thanks…
>
> In case you look deeper into the issue, it's also worth noting that there is
> no option to specify the target queue for `qrsh -inherit` in case you get
> slots from different queues on the slave system:
>
> https://arc.liv.ac.uk/trac/SGE/ticket/813

Ok. This could lead to incompatible changes to the -inherit behaviour, if the
caller of `qrsh -inherit` has to specify the requested queue. On the other
hand, I have seen cases where an OMPI job has been allotted slots from two
different queues on an exec host, which resulted in OMPI launching two
`qrsh -inherit` calls to the same host.

> Maybe it's related to the $NSLOTS. If you get slots from one and the same
> queue it seems to be indeed correct for the slave nodes. But for a local
> `qrsh -inherit` on the master node of the serial job it looks like being set
> to the overall slot count instead.

I noticed that too. I will see if I get some spare time to follow this track
(a rough transcript of my manual test is at the bottom of this mail). It seems
that an ideal solution could be that $NSLOTS is set to the number of slots
allotted on the host for the task at hand (i.e. correcting the number seen on
the master node as well), and that `qrsh -inherit` could take an argument of
the 'queue@host' type. I'll think about this and add it as a comment to the
ticket. Is that trac instance at arc.liv.ac.uk the best place, even though we
are running OGS? I suppose so.

Mikael

> -- Reuti
>
>
>> Mikael
>>
>>
>> On 26 Feb 2013, at 21:32, Reuti <[email protected]> wrote:
>>
>>> On 26.02.2013 at 19:45, Mikael Brandström Durling wrote:
>>>
>>>> I have recently been trying to run OpenMPI jobs spanning several nodes on
>>>> our small cluster. However, it seems to me as if sub-jobs launched with
>>>> qrsh -inherit (by OpenMPI) get killed at a memory limit of h_vmem,
>>>> instead of h_vmem times the number of slots allocated to the sub-node.
>>>
>>> Unfortunately this is correct:
>>>
>>> https://arc.liv.ac.uk/trac/SGE/ticket/197
>>>
>>> The only way around it: use virtual_free instead and hope that the users
>>> comply with this estimated value.
>>>
>>> -- Reuti
>>>
>>>
>>>> Is there any way to get the correct allocation on the sub-nodes? I have
>>>> some vague memory that I have read something about this. As it behaves
>>>> now, it is impossible for us to run large-memory MPI jobs. Would making
>>>> h_vmem a per-job consumable, rather than a per-slot one, give any other
>>>> behaviour?
>>>>
>>>> We are using OGS GE2011.11.
>>>>
>>>> Thanks for any hints on this issue,
>>>>
>>>> Mikael

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
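
P.S. For the archives, here is roughly the manual test mentioned above. The
PE name comes from our setup, the slot count and h_vmem value are just
examples, and "node02" stands for whichever slave host shows up in
$PE_HOSTFILE; the PE needs control_slaves set to TRUE (which it has to be for
tight OMPI integration anyway):

    # interactive master task, spanning several hosts
    qrsh -pe openmpi_span 8 -l h_vmem=2G

    # inside that session: which hosts got how many slots?
    cat $PE_HOSTFILE

    # start a task on one of the slave hosts and check its environment
    qrsh -inherit node02 env | grep NSLOTS

The slave task reports its per-host slot count in $NSLOTS, but the limit that
execd actually enforces for it appears to be h_vmem * 1 rather than
h_vmem * $NSLOTS, which matches ticket #197.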

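P.P.S. Regarding my original question further down about a per-job consumable:
as far as I understand it, that would just be a change of the consumable
column of the h_vmem entry in `qconf -mc` (assuming our GE2011.11 already
understands the JOB value; the other columns below are only the stock example
and may differ on a given cluster), e.g. from

    #name    shortcut  type    relop  requestable  consumable  default  urgency
    h_vmem   h_vmem    MEMORY  <=     YES          YES         0        0

to

    h_vmem   h_vmem    MEMORY  <=     YES          JOB         0        0

so that the requested amount is booked once per job instead of once per slot.
Whether that also changes the limit execd sets for the `qrsh -inherit` tasks I
have not tested, so take it as a sketch only.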