On 27.02.2013 at 20:56, Mikael Brandström Durling wrote:

> On 27 Feb 2013, at 20:38, Reuti <[email protected]> wrote:
>
>> On 27.02.2013 at 16:22, Mikael Brandström Durling wrote:
>>
>>> Ok, it seems somewhat hard to patch without deep knowledge of the
>>> inner workings of GE. Interestingly, if I manually start a qrsh -pe
>>> openmpi_span N and then qrsh -inherit into the slave, that slave has
>>> knowledge of the number of slots allocated to it in its environment
>>> ($NSLOTS), but in execd's do_ck_to_do (execd_ck_to_do.c) the nslots
>>> value that the h_vmem limit is multiplied with must be another value
>>> (1). I'll see which workaround to go for, as we don't trust our users
>>> to stay within the limit they ask for. Many of them have no clue as
>>> to what resources might be a reasonable request.
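For the archives: the observation above can be reproduced with something
like the commands below (PE and host names are placeholders here, and
the PE needs control_slaves TRUE, otherwise `qrsh -inherit` is refused):

   # interactive master task with slots spread across several hosts
   qrsh -pe openmpi_span 4 bash
   # inside the job: list the granted hosts and their slot counts
   cat $PE_HOSTFILE
   # start a task on one of the listed slave hosts and check its slots
   qrsh -inherit node02 env | grep NSLOTS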
I added some comments here: https://arc.liv.ac.uk/trac/SGE/ticket/1451

-- Reuti

>>> Thanks for your rapid reply,
>>
>> You're welcome.
>
> Thanks…
>
>> In case you look deeper into the issue, it's also worth noting that
>> there is no option to specify the target queue for `qrsh -inherit` in
>> case you get slots from different queues on the slave system:
>>
>> https://arc.liv.ac.uk/trac/SGE/ticket/813
>
> Ok. This could lead to incompatible changes to the -inherit behaviour,
> if the caller of `qrsh -inherit` has to specify the requested queue. On
> the other hand, I have seen cases where an OMPI job has been allotted
> slots from two different queues on an exec host, which resulted in OMPI
> launching two `qrsh -inherit` calls to the same host.
>
>> Maybe it's related to the $NSLOTS. If you get slots from one and the
>> same queue it seems to be indeed correct for the slave nodes. But for
>> a local `qrsh -inherit` on the master node of the parallel job it
>> looks like it is set to the overall slot count instead.
>
> I noted that too. I will see if I get some spare time to hunt down this
> track. It seems that an ideal solution could be that $NSLOTS is set to
> the allotted number of slots for the current job (i.e. correcting the
> number in the master job), and that `qrsh -inherit` could take an
> argument of the 'queue@host' type.
>
> I'll think about this and add it as a comment to the ticket. Is that
> trac instance at arc.liv.ac.uk the best place, even though we are
> running OGS? I suppose so?
>
> Mikael
>
>> -- Reuti
>>
>>> Mikael
>>>
>>> On 26 Feb 2013, at 21:32, Reuti <[email protected]> wrote:
>>>
>>>> On 26.02.2013 at 19:45, Mikael Brandström Durling wrote:
>>>>
>>>>> I have recently been trying to run OpenMPI jobs spanning several
>>>>> nodes on our small cluster. However, it seems to me that sub-jobs
>>>>> launched with qrsh -inherit (by Open MPI) get killed at a memory
>>>>> limit of h_vmem, instead of h_vmem times the number of slots
>>>>> allocated to the sub-node.
>>>>
>>>> Unfortunately this is correct:
>>>>
>>>> https://arc.liv.ac.uk/trac/SGE/ticket/197
>>>>
>>>> The only way around it: use virtual_free instead and hope that the
>>>> users comply with this estimated value.
>>>>
>>>> -- Reuti
>>>>
>>>>> Is there any way to get the correct allocation on the sub-nodes? I
>>>>> have some vague memory of having read something about this. As it
>>>>> behaves now, it is impossible for us to run large-memory MPI jobs.
>>>>> Would making h_vmem a per-job consumable, rather than slot-wise,
>>>>> give any other behaviour?
>>>>>
>>>>> We are using OGS GE2011.11.
>>>>>
>>>>> Thanks for any hints on this issue,
>>>>>
>>>>> Mikael
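PS: For the archives, the virtual_free workaround quoted above amounts
to something like the following (an untested sketch; the host name and
the values are only examples):

   # in `qconf -mc`, turn virtual_free into a consumable:
   #  name          shortcut  type    relop  requestable consumable default urgency
   #  virtual_free  vf        MEMORY  <=     YES         YES        0       0

   # give each execution host a capacity to consume from:
   qconf -me node01     # add: complex_values virtual_free=64G

   # request the estimated per-slot value at submission time:
   qsub -pe openmpi_span 8 -l virtual_free=2G job.sh

As virtual_free is only bookkeeping and not enforced by a limit, a user
exceeding the request will not be killed - hence the "hope that the
users comply".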

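PPS: Regarding the per-job consumable Mikael asked about in the first
mail: plain GE2011.11 only knows YES/NO in the consumable column as far
as I know, but derivatives which support per-job consumables (e.g.
Univa GE 8.x with JOB) would debit the request once per job instead of
once per slot:

   #  name    shortcut  type    relop  requestable consumable default urgency
   #  h_vmem  h_vmem    MEMORY  <=     YES         JOB        0       0

Note that this would only change the scheduler's bookkeeping - whether
execd then sets a sensible limit for the `qrsh -inherit` tasks is
exactly the question of ticket 197.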