We have a cluster consisting of 24 eight-processor nodes. The load on
the cluster is dominated by parallel jobs that typically occupy between
one and three nodes (8, 16, or 24 processors). However, there are a few
users who run smaller jobs: some single-processor "serial" jobs and a
few parallel jobs that use two or four processors. Because the cluster
is heavily used, this mix of jobs leads to conflicts.

Specifically, when dispatching a job requiring fewer than eight
processors, the SGE scheduler tends to assign it to the "least-loaded"
node with slots available. The result is that such jobs get scattered
around the cluster in a manner that then blocks scheduling of jobs
requiring entire nodes. We would prefer that jobs requiring fewer than
eight processors get dispatched instead to the "most-loaded" node that
has the required number of available slots. This would cause the small
jobs to get "packed" onto a few nodes rather than scattered around the
cluster. While this is somewhat counter to usual scheduling practice, I
believe it makes sense in our environment.
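
To make the difference concrete, here is a toy Python model of the two
placement policies. It is only a sketch, not SGE code: it assumes that
"load" is simply the number of slots in use on a node, that no job
finishes during the run, and that the job list (small_jobs) is a
made-up example. The names place and whole_nodes_free are hypothetical
helpers.

```python
NODES = 24          # from the post: 24 nodes
SLOTS_PER_NODE = 8  # from the post: eight processors per node


def place(jobs, policy):
    """Place each job (a slot count) on one node; return free slots per node."""
    free = [SLOTS_PER_NODE] * NODES
    for need in jobs:
        # Only nodes with enough free slots are candidates.
        candidates = [i for i, f in enumerate(free) if f >= need]
        if not candidates:
            raise RuntimeError(f"no node can hold a {need}-slot job")
        if policy == "least_loaded":
            # Most free slots wins; ties go to the lowest node index.
            chosen = max(candidates, key=lambda i: (free[i], -i))
        elif policy == "most_loaded":
            # Fewest free slots that still fit wins; ties go to the lowest index.
            chosen = min(candidates, key=lambda i: (free[i], i))
        else:
            raise ValueError(f"unknown policy: {policy}")
        free[chosen] -= need
    return free


def whole_nodes_free(free):
    """Count nodes that are still completely empty."""
    return sum(1 for f in free if f == SLOTS_PER_NODE)


small_jobs = [1, 1, 2, 4, 1, 2, 1, 4]  # hypothetical mix of small jobs

for policy in ("least_loaded", "most_loaded"):
    print(policy, whole_nodes_free(place(small_jobs, policy)))
```

With this particular job list, the toy model leaves 16 whole nodes free
under the least-loaded policy and 22 under the most-loaded one, which is
the packing effect we are after.
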

Unfortunately, I have not been able to figure out how to get SGE to do
this. I have tried setting the queue sorting method to "Sort by
sequence number." This helps, inasmuch as a series of small jobs
submitted together will tend to pack onto the lowest sequence-numbered
nodes with available slots. However, in general, a job gets assigned to
the lowest sequence-numbered node rather than packed onto the
"most-loaded" node with available slots.

Today I tried the following experiment. I set the queue sorting
method to "Sort by load", and then I changed the Load Formula from the
default "np_load_avg" to simply "slots". The idea was to create a
perversely backwards load calculation. If a node is empty, the value of
the "slots" resource will be eight and thus the node will appear
heavily loaded. Conversely, a node with seven slots already allocated
will have a "slots" resource value of one and thus appear lightly
loaded. Unfortunately, this experiment has produced no discernible
result. My tests suggest the scheduler is continuing to assign jobs to
the lowest sequentially numbered queue instance with available slots.
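
The ordering I was hoping for can be sketched in a few lines of Python.
This models only the intent of the experiment, not what SGE actually
does with the load formula (as noted, the real scheduler did not behave
this way). The host names, the free-slot counts, and the helper
rank_hosts are hypothetical, and "slots in use" is an assumed stand-in
for a conventional load value such as np_load_avg.

```python
SLOTS_PER_NODE = 8  # from the post: eight slots per node


def rank_hosts(free_slots, load_formula):
    """Order hosts from lowest to highest 'load'; host name breaks ties."""
    return sorted(
        free_slots,
        key=lambda host: (load_formula(free_slots[host]), host),
    )


# Hypothetical snapshot: free slots remaining on each host.
free_slots = {"node01": 8, "node02": 1, "node03": 5}

# Inverted formula from the experiment: load = remaining "slots" value,
# so an empty node (8) looks busiest and a nearly full node (1) looks idle.
inverted = rank_hosts(free_slots, lambda free: free)

# Conventional picture for comparison: load = slots in use.
conventional = rank_hosts(free_slots, lambda free: SLOTS_PER_NODE - free)

print(inverted)      # ['node02', 'node03', 'node01']
print(conventional)  # ['node01', 'node03', 'node02']
```

Under the inverted formula the nearly full node02 sorts first, so a
one-slot job would land there, while a job needing more slots than
node02 has free would have to fall through to the next host in the
list.
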

Should this work? If so, is there some way to debug this? Is there
some way to put the scheduler into a verbose logging mode that will
compel it to reveal exactly why it chose a particular node? Any
suggestions would be greatly appreciated.
BTW, the version of SGE is 6.2u2-1.
James Gladden
_______________________________________________
users mailing list
https://gridengine.org/mailman/listinfo/users