On Tue, 23 Jun 2015 12:33:27 +0000
Erik Soyez <e.so...@science-computing.de> wrote:

> Hello,
> 
> we have a very peculiar array job scheduling problem.
> 
> Cluster:  Very heterogeneous SGE cluster with different types of
> workstations.
> 
> Problem:  One array job can use more job slots than two array jobs
> together.
> 
> Example:  User A submitted an array job, 64 array tasks were running
> at the same time.  Then user B submitted a similar array job and
> after a while each of the jobs was running with approx 28 array
> tasks.  According to the users the problem gets worse with each array
> job.  Terminating one of the jobs immediately leads to the normal
> usage again.
> 
> Question:  Does anybody know if the "job_load_adjustment" of array
> jobs depends only on number of tasks or if the number of jobs is
> taken into account as well?

It should be just array tasks.  Is load the only criterion that
restricts access to a node in your cluster?

Is there any clue from looking at the actual nodes?  Do certain nodes
go into alarm with fewer jobs?



> 
> From my point of view limits, quotas, etc. cannot be the cause of the 
> problem because then the total of job slots being used by two jobs
> could not be less than the job slots being used by one single job.
> The scheduler configuration though was not made for short array jobs
> but for longer running parallel jobs on interactively used
> workstations.


You say the jobs are similar, but that isn't the same as identical.
This might explain the difference.  With identical jobs Grid Engine will
obtain optimal packing of jobs onto nodes simply by fitting as many
jobs onto each node as it can.  For jobs with differing requirements
and nodes with different resources there is no simple strategy for
doing this (indeed it smells like an NP-hard problem to me).

Imagine two nodes with consumables as shown:
nodeA
h_vmem=4G diskspace=30G

nodeB
h_vmem=6G diskspace=20G
 
Array job 1 requests 2G of h_vmem and 15G of diskspace
Array job 2 requests 3G of h_vmem and 10G of diskspace

If a task from array job 1 is scheduled to nodeB there isn't room for
another task of either job on that node (diskspace shortage).  If a task
from array job 2 is scheduled to nodeA there isn't room for another
task of either job on that node (h_vmem shortage).  On the other hand you
can fit two tasks of array job 1 on nodeA and two tasks of array job 2
on nodeB.
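
To make the arithmetic above concrete, here is a minimal Python sketch of
the two-node example.  It is only an illustration of the packing effect,
not how the Grid Engine scheduler is implemented; the names NODES, JOBS,
fits and pack are made up for this example, and the numbers are the
capacities and requests quoted above (in GB).

```python
# Consumable capacities per node and per-task requests, taken from the
# example above.  All names here are hypothetical, for illustration only.
NODES = {
    "nodeA": {"h_vmem": 4, "diskspace": 30},
    "nodeB": {"h_vmem": 6, "diskspace": 20},
}
JOBS = {
    "job1": {"h_vmem": 2, "diskspace": 15},
    "job2": {"h_vmem": 3, "diskspace": 10},
}


def fits(free, request):
    """Return True if every requested consumable is still available."""
    return all(free[name] >= amount for name, amount in request.items())


def pack(attempts):
    """Try each (job, node) placement in order; return those that succeed."""
    free = {node: dict(capacity) for node, capacity in NODES.items()}
    placed = []
    for job, node in attempts:
        if fits(free[node], JOBS[job]):
            for name, amount in JOBS[job].items():
                free[node][name] -= amount
            placed.append((job, node))
    return placed


# Keeping each job on the node that suits it: all four tasks fit.
good = pack([("job1", "nodeA"), ("job1", "nodeA"),
             ("job2", "nodeB"), ("job2", "nodeB")])

# Starting with the "wrong" node for each job: nothing else fits afterwards.
bad = pack([("job1", "nodeB"), ("job2", "nodeA"),
            ("job1", "nodeA"), ("job2", "nodeA"),
            ("job1", "nodeB"), ("job2", "nodeB")])

print(len(good), len(bad))
```

With the first ordering all four tasks are placed; with the second only
the first two are, because each node is left short of one consumable.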


> 
> Next I will try to observe the "artificial" load depending on the
> number of array jobs and tasks - but maybe someone has a good
> explanation.
> 
> Many thanks!
> 
> Erik Soyez.
> 
> 
> Some details:
> 
> [Queue]
> load_thresholds       np_load_avg=0.75
> suspend_thresholds    NONE
> priority              10
> 
> [Scheduler]
> algorithm                         default
> schedule_interval                 0:0:15
> maxujobs                          0
> queue_sort_method                 seqno
> job_load_adjustments              np_load_avg=0.90
> load_adjustment_decay_time        0:2:30
> load_formula                      np_load_avg
> schedd_job_info                   true
> flush_submit_sec                  2
> flush_finish_sec                  2
> params                            none
> reprioritize_interval             0:0:0
> halftime                          168
> usage_weight_list                 cpu=1.000000,mem=0.000000,io=0.000000


_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
