On 03.08.2012 at 18:50, Joseph Farran wrote:

> On 08/03/2012 09:18 AM, Reuti wrote:
>> On 03.08.2012 at 18:04, Joseph Farran wrote:
>>
>>> I pack jobs onto nodes using the following GE setup:
>>>
>>> # qconf -ssconf | egrep "queue|load"
>>> queue_sort_method                 seqno
>>> job_load_adjustments              NONE
>>> load_adjustment_decay_time        0
>>> load_formula                      slots
>>>
>>> I also set my nodes with the slots complex value:
>>>
>>> # qconf -rattr exechost complex_values "slots=64" compute-2-1
>>
>> Don't limit it here. Just define 64 in both queues for slots.
>
> Yes, I tried that approach as well, but then parallel jobs will not suspend
> an equal number of serial jobs.
>
> So after I set up the above (note my test queues and nodes have 8 cores,
> not 64):
>
> # qconf -sq owner | egrep "slots"
> slots                 8
> subordinate_list      slots=8(free:0:sr)
>
> # qconf -sq free | egrep "slots"
> slots                 8
>
> # qconf -se compute-3-1 | egrep complex
> complex_values        NONE
> # qconf -se compute-3-2 | egrep complex
> complex_values        NONE
>
> When I submit one 8-slot parallel job to owner, only one core in free is
> suspended instead of 8.
>
> Here is the qstat listing:
>
> job-ID  prior    name   user      state  queue               slots
> -------------------------------------------------------------------
>   8531  0.50500  FREE   testfree  r      free@compute-3-1    1
>   8532  0.50500  FREE   testfree  r      free@compute-3-1    1
>   8533  0.50500  FREE   testfree  r      free@compute-3-1    1
>   8534  0.50500  FREE   testfree  r      free@compute-3-1    1
>   8535  0.50500  FREE   testfree  r      free@compute-3-1    1
>   8536  0.50500  FREE   testfree  r      free@compute-3-1    1
>   8537  0.50500  FREE   testfree  r      free@compute-3-1    1
>   8538  0.50500  FREE   testfree  S      free@compute-3-1    1
>   8539  0.50500  FREE   testfree  r      free@compute-3-2    1
>   8540  0.50500  FREE   testfree  r      free@compute-3-2    1
>   8541  0.50500  FREE   testfree  r      free@compute-3-2    1
>   8542  0.50500  FREE   testfree  r      free@compute-3-2    1
>   8543  0.50500  FREE   testfree  r      free@compute-3-2    1
>   8544  0.50500  FREE   testfree  r      free@compute-3-2    1
>   8545  0.50500  FREE   testfree  r      free@compute-3-2    1
>   8546  0.50500  FREE   testfree  r      free@compute-3-2    1
>   8547  0.60500  Owner  me        r      owner@compute-3-1   8
>
> Job 8547 in the owner queue starts just fine, running with 8 cores on
> compute-3-1, *but* only one free-queue job on compute-3-1 is suspended
> instead of 8.
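For reference, here is one way the slot-wise setup quoted above could be
entered with qconf (a minimal sketch assuming the queue names owner/free and
the 8-core test hosts; on the real 64-core nodes the threshold would be 64):

# qconf -mattr queue slots 8 owner
# qconf -mattr queue slots 8 free
# qconf -mattr queue subordinate_list "slots=8(free:0:sr)" owner

The first two commands define the slot count in both queues rather than on the
exechosts; the third requests slot-wise subordination, i.e. once more than 8
slots are in use on a host, the shortest-running (sr) free jobs on that host
are suspended one at a time until the count is back at the limit.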
AFAIR this (only one slot being suspended for a parallel job) is a known bug.

>>> Serial jobs are all packed nicely onto a node until the node is full, and
>>> then it goes onto the next node.
>>>
>>> The issue I am having is that my subordinate queue breaks when I set the
>>> complex value above on my nodes.
>>>
>>> I have two queues: the owner queue and the free queue:
>>>
>>> # qconf -sq owner | egrep "subordinate|shell"
>>> shell                 /bin/bash
>>> shell_start_mode      posix_compliant
>>> subordinate_list      free=1
>>
>> subordinate_list slots=64(free)
>>
>>> # qconf -sq free | egrep "subordinate|shell"
>>> shell                 /bin/bash
>>> shell_start_mode      posix_compliant
>>> subordinate_list      NONE
>>>
>>> When I fill up the free queue with serial jobs and then submit a job to
>>> the owner queue, the owner job will not suspend the free jobs. The qstat
>>> scheduling info says:
>>>
>>> queue instance "[email protected]" dropped because it is full
>>> queue instance "[email protected]" dropped because it is full
>>>
>>> If I remove the "complex_values" from my nodes, then jobs in the free
>>> queue are correctly suspended and the owner job runs just fine.
>>
>> Yes, and what's the problem with this setup?
>
> What is wrong with the above setup is that the 'owner' job cannot run,
> because the free jobs are not suspended.

They are not suspended in advance. The suspension is the result of an
additional job actually being started on the host, not the other way round
(a minimal sequence to observe this is sketched below).

-- Reuti

>>> So how can I accomplish both items above?
>>>
>>> *** By the way, here are some pre-answers to questions I am going to
>>> be asked:
>>>
>>> Why pack jobs? Because in any HPC environment that runs a mixture of
>>> serial and parallel jobs, you really don't want to spread single-core jobs
>>> across multiple nodes, especially 64-core nodes. You want to keep nodes
>>> whole for parallel jobs (this is HPC 101).
>>
>> Depends on the application. E.g. Molcas writes a lot to the local scratch
>> disk, so it's better to spread such jobs across the cluster and use the
>> remaining cores in each exechost for jobs with little or no disk access.
>
> Yes, there will always be exceptions. I should have said in 99% of
> circumstances.
>
>> -- Reuti
>
>>> Suspended jobs will not free up resources: Yep, but the jobs will *not*
>>> be consuming CPU cycles, which is what I want.
>>>
>>> Thanks,
>>> Joseph

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
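For completeness, a minimal sequence to observe that behaviour; the parallel
environment name "openmp" and the script names are only placeholders, not
taken from the setup discussed above:

$ for i in $(seq 1 16); do qsub -q free serial_job.sh; done    # script name is a placeholder
$ qsub -q owner -pe openmp 8 parallel_job.sh                   # PE name is a placeholder
$ qstat -u '*'

The free jobs keep running until the owner job is actually dispatched to a
host; only then are free jobs on that host suspended (state "S" in qstat),
which is the "additional job being started" mentioned above.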
