Hi,

On 16.05.2014 at 18:36, Connell, Jesse wrote:

> I'm picking up on our slotwise preemption problem where my colleague Dan
> had left off (thanks for your input on that a few months back, by the way)
> and I'm trying to see if we can use a grid engine version that does
> properly implement it. We installed SGE 6.2u5p3 and set up a few very
> simple queues for testing, but when I run my tests, it still seems to
> assume "one job, one slot" -- jobs using a PE and specifying multiple
> slots don't get counted properly, just like we saw before.
>
> So at this point I'm wondering, do any versions (recent or otherwise)
> actually support what we're attempting? Has anyone here gotten it
> working? I suppose it's possible I need an even older version, or a very
> particular one -- that Univa document did mention 6.2u6 and u7
> specifically -- but I realize I could also just be chasing something that
> won't work at all :) There's still the option of the workaround you
> described; I'm just avoiding setting it up if possible, since it'd add a
> fair bit of complexity to the configuration.

Unfortunately it's the only way AFAICS. I set it up and it was working as
intended. I also don't know whether it has been fixed in the Univa version
of GridEngine in the meantime.

-- Reuti

> Thanks (again) for any thoughts!
>
> Jesse Connell
> Manager of Research Computing
> College of Engineering, Boston University
>
>
>
>> On 30.01.2014 at 15:11, Daniel Kamalic wrote:
>>
>>> Thanks for your quick response, Reuti. Uggh, that's too bad. I'm
>>> running version 2011.11. Do you think this was fixed in a newer
>>> version? (I think based on your last sentence that you're saying you
>>> think it was fixed.)
>>
>> Maybe it was in OGE only:
>>
>> http://www.univa.com/resources/files/Release_Notes_Univa_Grid_Engine_8.0.0.pdf
>>
>> (Second to very last sentence)
>>
>>> If not, do you have any suggested workarounds?
>>
>> Not really one that I'm happy with.
>>
>> - in the prolog of the superordinated queue, check how many slots were
>>   requested; we need (n-1) suspensions in addition
>> - in the prolog, submit (n-1) super-superordinated jobs to a dummy queue
>>   with zero resource requests to trigger more jobs getting suspended
>>   (maybe put the job id of the original job in the job name to select
>>   them easily later)
>> - in the epilog, `qdel` the dummy jobs
>>
>> The super-superordinated queue will need a setting like:
>>
>> subordinate_list slots=4(low.q:1:sr, high.q:2:sr)
>>
>> We need the superordinated high.q here to limit the overall count of
>> active slots (replace 4 with your set value). As we suspend (n-1) slots
>> in addition, the event that any job in high.q gets suspended should
>> never happen. To set this up, it's necessary to upgrade to SoGE, as
>> otherwise we see one unsuspended slot being left over:
>>
>> https://arc.liv.ac.uk/trac/SGE/ticket/775
>>
>> -- Reuti
>>
>>
>>> On 1/30/14, 6:04 AM, Reuti wrote:
>>>> Hi,
>>>>
>>>> On 29.01.2014 at 20:01, Daniel Kamalic wrote:
>>>>
>>>>> Slotwise preemption doesn't seem to be working correctly for single
>>>>> jobs that take up multiple slots on my setup:
>>>>
>>>> Unfortunately that's true.
>>>>
>>>> I can't find any discussion about it in the mailing list though. I
>>>> thought this was an issue which was fixed in the meantime.
>>>>
>>>> -- Reuti
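
For illustration, the prolog/epilog workaround described in the quoted message (submit (n-1) dummy jobs in the prolog, `qdel` them in the epilog) might be sketched roughly as below. This is only a sketch under assumptions, not the setup Reuti actually used: the queue name `dummy.q`, the job-name prefix `preempt_dummy_`, and the sleep duration are all hypothetical, and it assumes that `JOB_ID` and `NSLOTS` are visible in the prolog/epilog environment and that the execution host is allowed to run `qsub`. The functions only build the command lines; nothing is executed.

```python
from typing import List, Mapping


def dummy_job_name(job_id: str) -> str:
    # Hypothetical naming scheme: a non-numeric prefix plus the original
    # job id, so the dummy jobs can be selected by name later.
    return f"preempt_dummy_{job_id}"


def prolog_commands(
    env: Mapping[str, str],
    dummy_queue: str = "dummy.q",      # hypothetical queue name
    sleep_seconds: int = 7 * 86400,    # hypothetical placeholder runtime
) -> List[List[str]]:
    """Build (n-1) qsub command lines for a job that requested n slots.

    Assumes JOB_ID and NSLOTS are set in the prolog's environment.
    The dummy jobs carry no resource requests (no -l option).
    """
    job_id = env["JOB_ID"]
    nslots = int(env.get("NSLOTS", "1"))
    cmd = [
        "qsub",
        "-q", dummy_queue,
        "-N", dummy_job_name(job_id),
        "-b", "y",                     # submit a binary, not a script
        "sleep", str(sleep_seconds),
    ]
    # One dummy job per additional slot; a one-slot job needs none.
    return [list(cmd) for _ in range(max(nslots - 1, 0))]


def epilog_command(env: Mapping[str, str]) -> List[str]:
    """Build the qdel command that removes the dummy jobs by name."""
    return ["qdel", dummy_job_name(env["JOB_ID"])]
```

A real prolog would pass `os.environ` to `prolog_commands` and run each returned list with `subprocess.run(cmd, check=True)`; the epilog would do the same with `epilog_command`. Keeping command construction separate from execution makes it possible to check the logic without a running cluster.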
