On 25.04.2014 at 10:39, HUMMEL Michel wrote:

> Thank you for the response,
> My objective is to allow urgent jobs to requeue jobs running in all.q without 
> having jobs suspended (which would still consume resources).
> Without the RQS, if all.q is full and an urgent job is submitted, a job in 
> all.q is suspended and then requeued (thanks to the checkpoint 
> properties). But as the queue instance of all.q then has 1 slot free, a new 
> "normal" job starts in the queue, which is suspended, requeued, and so on.
> 
> To break this loop of "start, suspend, requeue", I use the RQS (which, I 
> think, prevents oversubscription of the node only for all.q?)

Yes.

The:

limit        queues {all.q} hosts {*} to slots=$num_proc

could also be written as:

limit        queues all.q hosts {*} to slots=12
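
Just to be sure that both variants really mean 12 slots per host, you can check 
what num_proc resolves to on the nodes, e.g. (only a sketch, with OGSE1 taken as 
an example host):

$ qhost -h OGSE1

where the NCPU column shows the processor count the num_proc complex is derived 
from.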


> The number of slots declared in all.q is bigger than 12 because I saw in my 
> tests that this influences the limit above which the system begins to fail.
> In my last test I set it to 96, which raised to 60 the limit below 
> which it works fine (I really don't know why).

It means that the RQS is taken into account before a job is resumed, while a 
limit at the queue-instance level will allow a job to start which is then 
suspended instantly due to the slot-wise subordination (which can actually 
happen). Is this your observation?
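
(For comparison, the plain queue-instance level limit would be a slots value of 
12 in all.q itself instead of 84 plus the RQS, e.g. set with:

$ qconf -mattr queue slots 12 all.q

This is only a sketch; with just such a setting in place, a freed slot can be 
refilled right away with a job that is then suspended immediately by the 
slot-wise subordination, as described above.)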


> Finally I tried to increase it again, but this had no further effect.
> I found an intermediate solution, which is to add an RQS rule to limit the slots 
> of the urgent queue to 60. This works, but I really need to allow the urgent.q 
> queue to use all the slots of the cluster.
> 
> Here are the RQS definitions used:
>   limit        queues {all.q} hosts {*} to slots=$num_proc
>   limit        queues {urgent.q} hosts * to slots=60


The last line defines a limit for the cluster as a whole: with an unbraced host 
list the slots are summed over all hosts, so it could also be written without 
the hosts part at all:

limit        queues urgent.q to slots=60
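
So the whole quota set, shortened, would read something like (only a sketch, 
keeping your rule set name and assuming 12 cores per node, i.e. hard-coding 
what $num_proc resolves to):

{
name         limit_DCH
description  NONE
enabled      TRUE
limit        queues all.q hosts {*} to slots=12
limit        queues urgent.q to slots=60
}

and could be put in place with qconf -mrqs limit_DCH.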

Although I have no hint for the original issue, maybe shortening the RQS will 
give a clue in the output of:

$ qquota
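
(By default qquota shows only your own usage; something like

$ qquota -u '*' -l slots

is a sketch to list the slot usage of all users against each rule of the 
resource quota sets.)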

-- Reuti


> -----Original Message-----
> From: Reuti [mailto:[email protected]] 
> Sent: Friday, 25 April 2014 00:31
> To: HUMMEL Michel
> Cc: [email protected]
> Subject: Re: [gridengine users] slotwise preemption and requeue
> 
> On 24.04.2014 at 09:22, HUMMEL Michel wrote:
> 
>> Thank you for the hint (slot-wise subordination configured at 84 slots per 
>> node), but as you can see, preemption seems to work (at least until you 
>> reach 42 urgent jobs).
>> 
>> I tried with:
>> subordinate_list      slots=12(all.q:0:sr)
>> 
>> And it gives me exactly the same result. Do you have any suggestion?
> 
> Okay. Another thing that caught my eye:
> 
> 
>> 
>> -----Original Message-----
>> From: Reuti [mailto:[email protected]]
>> Sent: Wednesday, 23 April 2014 18:42
>> To: HUMMEL Michel
>> Cc: [email protected]
>> Subject: Re: [gridengine users] slotwise preemption and requeue
>> 
>> Hi,
>> 
>> On 23.04.2014 at 15:06, HUMMEL Michel wrote:
>> 
>>> I'm trying to configure my OGS to allow urgent priority jobs to requeue low 
>>> priority jobs.
>>> It seems to work for a limited number of urgent priority jobs, but there is 
>>> a limit above which the system doesn't work as expected.
>>> Here is the configuration I used:
>>> 
>>> I have 7 nodes (named OGSE1-7) of 12 slots each, which means 84 slots in total.
>>> 
>>> I have 2 queues using slot-wise preemption:
>>> all.q and urgent.q (see configurations [1] and [2]).
>>> 
>>> To allow the requeue of jobs I have configured a checkpoint:
>>> $ qconf -sckpt Requeue
>>> ckpt_name          Requeue
>>> interface          APPLICATION-LEVEL
>>> ckpt_command       
>>> /data/module/install/OGS/2011.11p1/install/GE2011.11/kill_tree.sh \
>>>                 $job_pid
>>> migr_command       
>>> /data/module/install/OGS/2011.11p1/install/GE2011.11/kill_tree.sh \
>>>                 $job_pid
>>> restart_command    NONE
>>> clean_command      NONE
>>> ckpt_dir           /tmp
>>> signal             NONE
>>> when               xsr
>>> 
>>> I manage priority levels with the complexes p1, p2, p3 for "normal" jobs
>>> and p0 for urgent jobs:
>>> $ qconf -sc
>>> #name               shortcut   type   relop   requestable   consumable   default   urgency
>>> priority0           p0         BOOL   ==      FORCED        NO            FALSE     40
>>> priority1           p1         BOOL   ==      YES           NO            FALSE     30
>>> priority2           p2         BOOL   ==      YES           NO            FALSE     20
>>> priority3           p3         BOOL   ==      YES           NO            FALSE     10
>>> 
>>> To limit the number of jobs running concurrently in the two queues I used an 
>>> RQS on the all.q queue:
>>> $ qconf -srqs
>>> {
>>> name         limit_DCH
>>> description  NONE
>>> enabled      TRUE
>>> limit        queues {all.q} hosts {*} to slots=$num_proc
>>> }
> 
> As you have 12 slots per queue instance in all.q, this RQS seems not to have 
> any effect, I think.
> 
> 
>>> I submit 110 "normal jobs" and 84 of them are executed, 12 on each node.
>>> 
>>> for i in $(seq 1 110); do qsub -l p1 -ckpt Requeue job.sh; done
>>> qstat | grep 'OGSE' | sort -k 8 | awk '{print $3 " " $8 }' | uniq -c
>>>   12 job.sh all.q@OGSE1
>>>   12 job.sh all.q@OGSE2
>>>   12 job.sh all.q@OGSE3
>>>   12 job.sh all.q@OGSE4
>>>   12 job.sh all.q@OGSE5
>>>   12 job.sh all.q@OGSE6
>>>   12 job.sh all.q@OGSE7
>>> 
>>> Then I submit 40 urgent jobs and it works as expected: the 40 are executed 
>>> in the urgent.q queue and 40 jobs of all.q are requeued (state Rq):
>>> for i in $(seq 1 40); do qsub -l p0 job.sh; done
>>> (I grep for OGSE in the output to only catch jobs which are assigned to a queue)
>>> qstat | grep 'OGSE' | sort -k 8 | awk '{print $3 " " $8 }' | uniq -c
>>>    8 job.sh all.q@OGSE2
>>>   12 job.sh all.q@OGSE3
>>>   12 job.sh all.q@OGSE5
>>>   12 job.sh all.q@OGSE6
>>>   12 job.sh urgent.q@OGSE1
>>>    4 job.sh urgent.q@OGSE2
>>>   12 job.sh urgent.q@OGSE4
>>>   12 job.sh urgent.q@OGSE7
>>> As you can see there are only 12 jobs running on each node.
>>> 
>>> It works until I reach 42 urgent jobs (I submitted the others one by one 
>>> to find the exact limit).
>>> When the 42nd job starts, the requeue system doesn't work anymore and 
>>> OGS begins to suspend other "normal" jobs, then migrates them to another node, 
>>> then suspends another, and so on as long as there are 42 or more urgent 
>>> jobs running or pending.
>>> $ qsub  -l p0  job.sh;
>>> $ qsub  -l p0  job.sh;
>>> qstat | grep 'OGSE' | sort -k 8 | awk '{print $3 " " $8 }' | uniq -c
>>>   12 job.sh all.q@OGSE1
>>>   12 job.sh all.q@OGSE2
>>>   11 job.sh all.q@OGSE3
>>>    2 job.sh all.q@OGSE4
>>>   11 job.sh all.q@OGSE5
>>>   12 job.sh all.q@OGSE6
>>>   12 job.sh urgent.q@OGSE1
>>>    4 job.sh urgent.q@OGSE2
>>>    1 job.sh urgent.q@OGSE3
>>>   12 job.sh urgent.q@OGSE4
>>>    1 job.sh urgent.q@OGSE5
>>>    1 job.sh urgent.q@OGSE6
>>>   12 job.sh urgent.q@OGSE7
>>> 
>>> If I qdel one urgent job, the system works again as expected: only 12 jobs 
>>> run on each node and no jobs are in the suspended state.
>>> 
>>> Does someone have an idea of what's going on?
>>> Any help will be appreciated.
>>> 
>>> Michel Hummel
>>> 
>>> ------------------
>>> [1]
>>> $ qconf -sq all.q
>>> qname                 all.q
>>> hostlist              OGSE1 OGSE2 OGSE3 OGSE4 OGSE5 OGSE6 OGSE7
>>> seq_no                0
>>> load_thresholds       np_load_avg=1.75
>>> suspend_thresholds    NONE
>>> nsuspend              1
>>> suspend_interval      00:05:00
>>> priority              0
>>> min_cpu_interval      INFINITY
>>> processors            UNDEFINED
>>> qtype                 BATCH INTERACTIVE
>>> ckpt_list             Requeue
>>> pe_list               default distribute make
>>> rerun                 TRUE
>>> slots                 84
> 
> This would be the number of slots per queue instance too. Maybe this was the 
> reason you introduced the RQS?
> 
> -- Reuti
> 
> 
>>> tmpdir                /tmp
>>> shell                 /bin/sh
>>> prolog                NONE
>>> epilog                NONE
>>> shell_start_mode      posix_compliant
>>> starter_method        NONE
>>> suspend_method        NONE
>>> resume_method         NONE
>>> terminate_method      NONE
>>> notify                00:00:60
>>> owner_list            NONE
>>> user_lists            arusers
>>> xuser_lists           NONE
>>> subordinate_list      NONE
>>> complex_values        priority1=TRUE,priority2=TRUE,priority3=TRUE
>>> projects              NONE
>>> xprojects             NONE
>>> calendar              NONE
>>> initial_state         default
>>> s_rt                  INFINITY
>>> h_rt                  INFINITY
>>> s_cpu                 INFINITY
>>> h_cpu                 INFINITY
>>> s_fsize               INFINITY
>>> h_fsize               INFINITY
>>> s_data                INFINITY
>>> h_data                INFINITY
>>> s_stack               INFINITY
>>> h_stack               INFINITY
>>> s_core                INFINITY
>>> h_core                INFINITY
>>> s_rss                 INFINITY
>>> h_rss                 INFINITY
>>> s_vmem                INFINITY
>>> h_vmem                INFINITY
>>> 
>>> [2]
>>> $ qconf -sq urgent.q
>>> qname                 urgent.q
>>> hostlist              @allhosts
>>> seq_no                0
>>> load_thresholds       np_load_avg=1.75
>>> suspend_thresholds    NONE
>>> nsuspend              1
>>> suspend_interval      00:05:00
>>> priority              -20
>>> min_cpu_interval      INFINITY
>>> processors            UNDEFINED
>>> qtype                 BATCH INTERACTIVE
>>> ckpt_list             Requeue
>>> pe_list               default make
>>> rerun                 FALSE
>>> slots                 12
>>> tmpdir                /tmp
>>> shell                 /bin/sh
>>> prolog                NONE
>>> epilog                NONE
>>> shell_start_mode      posix_compliant
>>> starter_method        NONE
>>> suspend_method        NONE
>>> resume_method         SIGINT
>>> terminate_method      NONE
>>> notify                00:00:60
>>> owner_list            NONE
>>> user_lists            arusers
>>> xuser_lists           NONE
>>> subordinate_list      slots=84(all.q:0:sr)
>> 
>> First, one thought: this limit is per queue instance, so it should never 
>> trigger any suspension at all unless you exceed 84 slots per node.
>> 
>> -- Reuti
>> 
>> 
>>> complex_values        priority0=True
>>> projects              NONE
>>> xprojects             NONE
>>> calendar              NONE
>>> initial_state         default
>>> s_rt                  INFINITY
>>> h_rt                  INFINITY
>>> s_cpu                 INFINITY
>>> h_cpu                 INFINITY
>>> s_fsize               INFINITY
>>> h_fsize               INFINITY
>>> s_data                INFINITY
>>> h_data                INFINITY
>>> s_stack               INFINITY
>>> h_stack               INFINITY
>>> s_core                INFINITY
>>> h_core                INFINITY
>>> s_rss                 INFINITY
>>> h_rss                 INFINITY
>>> s_vmem                INFINITY
>>> h_vmem                INFINITY
>>> 
>>> 
>>> 
>> 
>> 
> 
> 


_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
