[gridengine users] slotwise preemption and requeue

HUMMEL Michel Wed, 23 Apr 2014 06:19:17 -0700

Hi,

I'm trying to configure OGS so that urgent-priority jobs cause low-priority
jobs to be requeued.
It seems to work for a limited number of urgent jobs, but above a certain
limit the system no longer behaves as expected.
Here is the configuration I use:


I have 7 nodes (named OGSE1-7) with 12 slots each, for a total of 84 slots.

I have two queues using slotwise preemption:
all.q and urgent.q (see configurations [1] and [2]).

To allow jobs to be requeued, I have configured a checkpointing environment:
$ qconf -sckpt Requeue
ckpt_name          Requeue
interface          APPLICATION-LEVEL
ckpt_command       /data/module/install/OGS/2011.11p1/install/GE2011.11/kill_tree.sh \
                   $job_pid
migr_command       /data/module/install/OGS/2011.11p1/install/GE2011.11/kill_tree.sh \
                   $job_pid
restart_command    NONE
clean_command      NONE
ckpt_dir           /tmp
signal             NONE
when               xsr

I manage priority levels with complexes:
p1, p2 and p3 for "normal jobs",
p0 for urgent jobs.
$ qconf -sc
priority0           p0         BOOL        ==      FORCED      NO         FALSE      40
priority1           p1         BOOL        ==      YES         NO         FALSE      30
priority2           p2         BOOL        ==      YES         NO         FALSE      20
priority3           p3         BOOL        ==      YES         NO         FALSE      10

To limit the number of jobs running concurrently in the two queues, I use an
RQS on the all.q queue:
$ qconf -srqs
{
   name         limit_DCH
   description  NONE
   enabled      TRUE
   limit        queues {all.q} hosts {*} to slots=$num_proc
}

I submit 110 "normal jobs" and 84 of them are executed, 12 on each node.

for i in $(seq 1 110); do qsub -l p1 -ckpt Requeue job.sh; done
qstat | grep 'OGSE' | sort -k 8 | awk '{print $3 " " $8 }' | uniq -c
     12 job.sh all.q@OGSE1
     12 job.sh all.q@OGSE2
     12 job.sh all.q@OGSE3
     12 job.sh all.q@OGSE4
     12 job.sh all.q@OGSE5
     12 job.sh all.q@OGSE6
     12 job.sh all.q@OGSE7

Then I submit 40 urgent jobs and it works as expected: all 40 run in the
urgent.q queue and 40 jobs from all.q are requeued (state Rq):
for i in $(seq 1 40); do qsub  -l p0  job.sh; done
(I grep for OGSE in the output to catch only the jobs that are assigned to a queue.)
qstat | grep 'OGSE' | sort -k 8 | awk '{print $3 " " $8 }' | uniq -c
      8 job.sh all.q@OGSE2
     12 job.sh all.q@OGSE3
     12 job.sh all.q@OGSE5
     12 job.sh all.q@OGSE6
     12 job.sh urgent.q@OGSE1
      4 job.sh urgent.q@OGSE2
     12 job.sh urgent.q@OGSE4
     12 job.sh urgent.q@OGSE7
As you can see, there are only 12 jobs running on each node.

It works until I reach 42 urgent jobs (I submitted the additional ones one by
one to find the exact limit).
When the 42nd job starts, the requeue mechanism stops working: OGS begins to
suspend other "normal jobs", then migrates them to another node, then suspends
another one, and so on, for as long as there are 42 or more urgent jobs
running or pending.
$ qsub  -l p0  job.sh;
$ qsub  -l p0  job.sh;
qstat | grep 'OGSE' | sort -k 8 | awk '{print $3 " " $8 }' | uniq -c
     12 job.sh all.q@OGSE1
     12 job.sh all.q@OGSE2
     11 job.sh all.q@OGSE3
      2 job.sh all.q@OGSE4
     11 job.sh all.q@OGSE5
     12 job.sh all.q@OGSE6
     12 job.sh urgent.q@OGSE1
      4 job.sh urgent.q@OGSE2
      1 job.sh urgent.q@OGSE3
     12 job.sh urgent.q@OGSE4
      1 job.sh urgent.q@OGSE5
      1 job.sh urgent.q@OGSE6
     12 job.sh urgent.q@OGSE7
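
The per-host picture in the listing above is easier to read when the counts
are summed across both queues. A minimal sketch, assuming summary lines of the
form "count name queue@host" as printed above; host_totals is a hypothetical
helper name, and the default limit of 12 is the per-node slot count from this
post:

```shell
# Hypothetical helper: sum job counts per host across queues.
# Reads "count name queue@host" lines on stdin (the uniq -c summary format).
# Optional first argument: per-host slot limit (defaults to 12, as in this post).
host_totals() {
    awk -v limit="${1:-12}" '
        {
            split($3, parts, "@")      # parts[1] = queue, parts[2] = host
            total[parts[2]] += $1      # add this queue instance to the host total
        }
        END {
            for (h in total)
                printf "%s %d%s\n", h, total[h], (total[h] > limit ? " OVER" : "")
        }' | sort
}
```

It could be appended to the existing pipeline, for example
`qstat | grep 'OGSE' | sort -k 8 | awk '{print $3 " " $8 }' | uniq -c | host_totals 12`;
hosts marked OVER have more jobs assigned than slots, which here would mean
some of them are suspended.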

If I qdel one urgent job, the system works as expected again: only 12 jobs can
run on each node and no jobs are in the suspended state.
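
One way to check for suspended jobs from the command line is to look at the
state column of qstat. A minimal sketch, assuming the default qstat layout in
which field 5 is the job state (consistent with the pipeline above, which uses
field 3 for the name and field 8 for the queue) and that suspended states
contain one of the letters s, S or T; count_suspended is a hypothetical helper
name:

```shell
# Hypothetical helper: count jobs whose qstat state contains a suspend flag.
# Reads qstat output on stdin; header and separator lines are skipped because
# their first field is not a numeric job ID.
count_suspended() {
    awk '$1 ~ /^[0-9]+$/ && $5 ~ /[sST]/ { n++ } END { print n + 0 }'
}

# Example use on a live cluster (not run here):
#   qstat | count_suspended
```

Running it after each additional urgent submission would show at which job
count the first suspension appears.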
 
Does anyone have an idea of what's going on?
Any help would be appreciated.

Michel Hummel

------------------
[1]
$ qconf -sq all.q
qname                 all.q
hostlist              OGSE1 OGSE2 OGSE3 OGSE4 OGSE5 OGSE6 OGSE7
seq_no                0
load_thresholds       np_load_avg=1.75
suspend_thresholds    NONE
nsuspend              1
suspend_interval      00:05:00
priority              0
min_cpu_interval      INFINITY
processors            UNDEFINED
qtype                 BATCH INTERACTIVE
ckpt_list             Requeue
pe_list               default distribute make
rerun                 TRUE
slots                 84
tmpdir                /tmp
shell                 /bin/sh
prolog                NONE
epilog                NONE
shell_start_mode      posix_compliant
starter_method        NONE
suspend_method        NONE
resume_method         NONE
terminate_method      NONE
notify                00:00:60
owner_list            NONE
user_lists            arusers
xuser_lists           NONE
subordinate_list      NONE
complex_values        priority1=TRUE,priority2=TRUE,priority3=TRUE
projects              NONE
xprojects             NONE
calendar              NONE
initial_state         default
s_rt                  INFINITY
h_rt                  INFINITY
s_cpu                 INFINITY
h_cpu                 INFINITY
s_fsize               INFINITY
h_fsize               INFINITY
s_data                INFINITY
h_data                INFINITY
s_stack               INFINITY
h_stack               INFINITY
s_core                INFINITY
h_core                INFINITY
s_rss                 INFINITY
h_rss                 INFINITY
s_vmem                INFINITY
h_vmem                INFINITY
 
[2]
$ qconf -sq urgent.q
qname                 urgent.q
hostlist              @allhosts
seq_no                0
load_thresholds       np_load_avg=1.75
suspend_thresholds    NONE
nsuspend              1
suspend_interval      00:05:00
priority              -20
min_cpu_interval      INFINITY
processors            UNDEFINED
qtype                 BATCH INTERACTIVE
ckpt_list             Requeue
pe_list               default make
rerun                 FALSE
slots                 12
tmpdir                /tmp
shell                 /bin/sh
prolog                NONE
epilog                NONE
shell_start_mode      posix_compliant
starter_method        NONE
suspend_method        NONE
resume_method         SIGINT
terminate_method      NONE
notify                00:00:60
owner_list            NONE
user_lists            arusers
xuser_lists           NONE
subordinate_list      slots=84(all.q:0:sr)
complex_values        priority0=True
projects              NONE
xprojects             NONE
calendar              NONE
initial_state         default
s_rt                  INFINITY
h_rt                  INFINITY
s_cpu                 INFINITY
h_cpu                 INFINITY
s_fsize               INFINITY
h_fsize               INFINITY
s_data                INFINITY
h_data                INFINITY
s_stack               INFINITY
h_stack               INFINITY
s_core                INFINITY
h_core                INFINITY
s_rss                 INFINITY
h_rss                 INFINITY
s_vmem                INFINITY
h_vmem                INFINITY



_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
