Hi,

On 16.05.2014 at 18:36, Connell, Jesse wrote:

> I'm picking up on our slotwise preemption problem where my colleague Dan
> had left off (thanks for your input on that a few months back, by the way)
> and I'm trying to see if we can use a grid engine version that does
> properly implement it.  We installed SGE 6.2u5p3 and set up a few very
> simple queues for testing, but when I run my tests, it still seems to
> assume "one job, one slot" -- jobs using a PE and specifying multiple
> slots don't get counted properly, just like we saw before.
> 
> So at this point I'm wondering, do any versions (recent or otherwise)
> actually support what we're attempting?  Has anyone here gotten it
> working?  I suppose it's possible I need an even older version, or a very
> particular one -- that Univa document did mention 6.2u6 and u7
> specifically -- but I realize I could also just be chasing something that
> won't work at all :)  There's still the option of the workaround you
> described; I'm just avoiding setting it up if possible, since it'd add a
> fair bit of complexity to the configuration.

Unfortunately the workaround is the only way, AFAICS. I set it up and it
worked as intended.

I also don't know whether it has been fixed in the Univa version of Grid
Engine in the meantime.
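
For reference, the prolog/epilog part of the workaround quoted below can be
sketched roughly as follows. This is only an illustration under assumptions,
not the scripts I actually used: the queue name dummy.q, the job name prefix
"preempt_" and the long `sleep` payload are made up for the example, and it
assumes JOB_ID and NSLOTS are set in the prolog/epilog environment. It only
builds and prints the command lines; a real prolog would have to execute them.

```python
import os
import shlex

# Hypothetical name of the super-superordinated dummy queue (assumption).
DUMMY_QUEUE = "dummy.q"


def dummy_job_name(job_id):
    """Name shared by all dummy jobs of one original job (made-up prefix)."""
    return f"preempt_{job_id}"


def prolog_commands(job_id, nslots):
    """Build (n-1) qsub command lines, one per additional suspension needed."""
    extra = max(nslots - 1, 0)
    name = dummy_job_name(job_id)
    # -b y submits a binary; the long sleep is only a placeholder payload
    # that keeps the dummy job alive until the epilog deletes it.
    return [
        ["qsub", "-N", name, "-q", DUMMY_QUEUE, "-b", "y", "sleep", "31536000"]
        for _ in range(extra)
    ]


def epilog_command(job_id):
    """Build the qdel call that removes the dummy jobs by their shared name."""
    return ["qdel", dummy_job_name(job_id)]


if __name__ == "__main__":
    # Fallback values are only for trying the sketch outside of Grid Engine.
    job_id = os.environ.get("JOB_ID", "12345")
    nslots = int(os.environ.get("NSLOTS", "4"))
    for cmd in prolog_commands(job_id, nslots):
        print(shlex.join(cmd))
    print(shlex.join(epilog_command(job_id)))
```

Putting the original job id into the dummy job name is what lets the epilog
pick up exactly those jobs again with a single `qdel`.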

-- Reuti


> Thanks (again) for any thoughts!
> 
> Jesse Connell
> Manager of Research Computing
> College of Engineering, Boston University
> 
> 
> 
>> On 30.01.2014 at 15:11, Daniel Kamalic wrote:
>> 
>> 
>> 
>>> Thanks for your quick response, Reuti. Uggh, that's too bad. I'm
>>> running version 2011.11. Do you think this was fixed in a newer
>>> version? (I think based on your last sentence that you're saying you
>>> think it was fixed.)
>> 
>> Maybe it was in OGE only:
>> 
>> http://www.univa.com/resources/files/Release_Notes_Univa_Grid_Engine_8.0.0.pdf
>> 
>> (second-to-last sentence)
>> 
>> 
>>> If not, do you have any suggested workarounds?
>> 
>> Not really one that I'm happy with.
>> 
>> - in the prolog of the superordinated queue, check how many slots (n)
>> were requested; (n-1) additional suspensions are needed
>> - in the prolog, submit (n-1) super-superordinated jobs to a dummy queue
>> with zero resource requests to trigger more jobs getting suspended
>> (maybe put the job id of the original job in the job name, to select
>> them easily later)
>> - in the epilog, `qdel` the dummy jobs
>> 
>> The super-superordinated queue will need a setting like:
>> 
>> subordinate_list slots=4(low.q:1:sr, high.q:2:sr)
>> 
>> We need the superordinated high.q in this list to limit the overall
>> count of active slots (replace 4 with your own value). As we suspend
>> (n-1) slots in addition, a job in high.q should never actually get
>> suspended. For this setup it's necessary to upgrade to SoGE, as
>> otherwise one unsuspended slot is left over:
>> 
>> https://arc.liv.ac.uk/trac/SGE/ticket/775
>> 
>> -- Reuti
>> 
>> 
>>> On 1/30/14, 6:04 AM, Reuti wrote:
>>>> Hi, 
>>>> 
>>>> On 29.01.2014 at 20:01, Daniel Kamalic wrote:
>>>> 
>>>>> Slotwise preemption doesn't seem to be working correctly for single
>>>>> jobs that take up multiple slots on my setup:
>>>> 
>>>> Unfortunately that's true.
>>>> 
>>>> I can't find a discussion about it on the mailing list, though. I
>>>> thought this was an issue that had been fixed in the meantime.
>>>> 
>>>> -- Reuti

