...
...
...
[gladden@stuart ~]$
Please note that queue_sort_method="load" and load_formula="slots",
as per our discussion of how to configure the scheduler to "pack"
jobs on nodes.
And here is an example of what happens when I submit a job. First,
the abbreviated output of "qhost -q", showing the state of the queue
instances on the first nine compute nodes on the cluster:
[gladden@stuart ~]$ qhost -q
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
compute-1-1             lx26-amd64      8  8.60   23.5G   16.3G    7.8G    7.8G
   serial.q             BIP   0/8
   stf.q                BIP   8/8
   all.q                BIP   0/8
compute-1-2             lx26-amd64      8  7.35   23.5G   21.4G    7.8G   38.9M
   serial.q             BIP   0/8
   stf.q                BIP   8/8
   all.q                BIP   0/8
compute-1-3             lx26-amd64      8  0.00   23.5G   89.2M    7.8G   37.2M
   serial.q             BIP   0/8
   stf.q                BIP   0/8
   all.q                BIP   0/8
compute-1-4             lx26-amd64      8  0.00   23.5G   97.2M    7.8G   37.6M
   serial.q             BIP   0/8
   stf.q                BIP   0/8
   all.q                BIP   0/8
compute-1-5             lx26-amd64      8  7.32   23.5G   21.2G    7.8G   33.6M
   serial.q             BIP   0/8
   stf.q                BIP   8/8
   all.q                BIP   0/8
compute-1-6             lx26-amd64      8  7.98   23.5G  701.7M    7.8G   25.1M
   serial.q             BIP   0/8
   stf.q                BIP   8/8
   all.q                BIP   0/8
compute-1-7             lx26-amd64      8  0.00   23.5G  167.1M    7.8G   36.0M
   serial.q             BIP   0/8
   stf.q                BIP   0/8
   all.q                BIP   0/8
compute-1-8             lx26-amd64      8  0.00   23.5G   97.8M    7.8G   35.0M
   serial.q             BIP   0/8
   stf.q                BIP   4/8
   all.q                BIP   0/8
compute-1-9             lx26-amd64      8  7.77   23.5G  528.3M    7.8G   33.8M
   serial.q             BIP   0/8
   stf.q                BIP   8/8
   all.q                BIP   0/8
....
....
[gladden@stuart ~]$
Note that compute-1-3 is the first "empty" node on the list, and
that compute-1-8 is partially subscribed, with 4 of its 8 slots in
use. And here is the output of "qhost -F slots" confirming the state
of the "slots" resource:
[gladden@stuart ~]$ qhost -F slots
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
compute-1-1             lx26-amd64      8  8.68   23.5G   16.0G    7.8G    7.8G
   Host Resource(s):   hc:slots=0.000000
compute-1-2             lx26-amd64      8  7.32   23.5G   21.4G    7.8G   38.9M
   Host Resource(s):   hc:slots=0.000000
compute-1-3             lx26-amd64      8  0.00   23.5G   89.2M    7.8G   37.2M
   Host Resource(s):   hc:slots=8.000000
compute-1-4             lx26-amd64      8  0.00   23.5G   97.2M    7.8G   37.6M
   Host Resource(s):   hc:slots=8.000000
compute-1-5             lx26-amd64      8  7.32   23.5G   21.2G    7.8G   33.6M
   Host Resource(s):   hc:slots=0.000000
compute-1-6             lx26-amd64      8  7.98   23.5G  701.8M    7.8G   25.1M
   Host Resource(s):   hc:slots=0.000000
compute-1-7             lx26-amd64      8  0.00   23.5G  167.2M    7.8G   36.0M
   Host Resource(s):   hc:slots=8.000000
compute-1-8             lx26-amd64      8  0.00   23.5G   97.8M    7.8G   35.0M
   Host Resource(s):   hc:slots=4.000000
compute-1-9             lx26-amd64      8  7.80   23.5G  528.5M    7.8G   33.8M
   Host Resource(s):   hc:slots=0.000000
...
...
[gladden@stuart ~]$
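Just to spell out the order I would expect "packing" to produce,
here is a throwaway one-liner (my own ad-hoc reading of the qhost
output, not anything the scheduler itself runs) that ranks hosts by
their remaining hc:slots value, smallest first, skipping hosts that
are already full:

# rank hosts by free slots, ascending; packing should prefer the top entry
[gladden@stuart ~]$ qhost -F slots | \
      awk '/^compute/ {h=$1} /hc:slots/ {sub(/.*=/,""); if ($0+0 > 0) print $0, h}' | \
      sort -n

For the nine hosts shown above, that puts compute-1-8 (4.000000)
ahead of the three empty hosts (8.000000), which is exactly where I
would expect the next serial job to land.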
However, if I submit a job like this:
[gladden@stuart ~]$ qsub -q stf.q submit_test
Your job 524987 ("test") has been submitted
The result is this:
[gladden@stuart ~]$ qstat -u gladden
job-ID  prior    name  user     state  submit/start at      queue                  slots  ja-task-ID
------------------------------------------------------------------------------------------------------
 524987  0.55500  test  gladden  r      05/31/2011 11:36:57  [email protected]       1
Note that the submitted job ended up on compute-1-3 (the first empty
node) rather than on compute-1-8, the partially consumed node where
the job should have been "packed". The stf.q instances on this
system are sequentially numbered starting with node compute-1-1, so
it appears that the scheduler simply picked the lowest
sequence-numbered queue instance on a node with an available slot.
I see no evidence that it sorted the queue instances by load as the
scheduler configuration specifies.
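For completeness, the sequence numbers themselves live in the queue
configuration and can be checked with something like this (seq_no is
the relevant attribute; the per-host values in the comment are only
a guess for illustration):

# a per-host seq_no list such as "0,[compute-1-1=1],[compute-1-2=2],..."
# would explain the dispatch order above (values here are a guess)
[gladden@stuart ~]$ qconf -sq stf.q | grep seq_no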
I've done this experiment several times and the result appears to be
consistent. Is there some additional configuration issue I need to
address? Or, perhaps, was there a bug in this version (6.2U2-1)
that was later addressed?
James Gladden
On 5/27/2011 1:02 PM, Reuti wrote:
On 27.05.2011 at 21:42, James Gladden wrote:
On 5/18/2011 1:47 PM, Dave Love wrote:
James Gladden<[email protected]> writes:
The scheduler picked stf.q@compute-1-1 which was the unloaded node,
instead of "packing" the job into one of the four available slots on
compute-1-12 as was desired and expected. I should add that
stf.q@compute-1-1 is the lowest sequence number instance in stf.q,
so this looks like the job was assigned by sequence number rather
than by our "-slots" load formula.
Well, what's the queue_sort_method?
However, as I said, there's a bug, but it happens Univa just fixed
it -- see commits from a couple of days ago, I think.
Any suggestions? I have poked around in the archive without finding
the error of my ways.
BTW, why the (-) inversion in the load formula? Don't you want to
favour more loaded nodes?
Yes, I do. "Slots" is a consumable resource, right? In the case of
our systems, the starting value for each execution host is set to 8.
If the load formula is "-slots", then for an unloaded node we have:
load = (-8)
If we then dispatch a single-slot job to that node, the load value
changes to:
load = (-7)
Algebraically,
(-7) > (-8)
so the scheduler will perceive the empty node as "less loaded" and
dispatch to it in preference to the node with one slot already
consumed. I don't see how this formula "favors more loaded nodes."
On the other hand, if we go with a load formula of just (slots),
then we get
7 < 8
so the scheduler should perceive the partially consumed node (load =
7) as "less loaded" and dispatch to it in preference to the empty
node (load = 8). Is there some flaw in this logic?
There is none. It's as outlined here:
http://blogs.oracle.com/sgrell/entry/grid_engine_scheduler_hacks_least
-- Reuti
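The sign argument is also easy to sanity-check outside of SGE; the
one-liner below just compares a made-up empty host with 8 free slots
against a partly used one with 7 free slots under both formulas (the
host names are hypothetical):

[gladden@stuart ~]$ printf 'compute-A 8\ncompute-B 7\n' | awk '{printf "%-10s  slots=%d  -slots=%d\n", $1, $2, -$2}'
compute-A   slots=8  -slots=-8
compute-B   slots=7  -slots=-7

With queue_sort_method=load, a smaller value means "less loaded", so
"slots" ranks compute-B (the partly used host) first, while "-slots"
ranks compute-A (the empty host) first.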
Alas, all of this seems moot as I have not been able to establish
that the scheduler actually pays any attention to the setting of
the load formula or the queue sort method. My system appears to
dispatch jobs based on queue sequence number irrespective of these
settings. For example, changing the load formula from "slots" to
"-slots" appears to make no difference. The problem is
demonstrable with serial jobs, so it is not a bug associated with
PEs. I have even tried restarting qmaster on the theory that perhaps
it would only recognize the change at daemon startup. No luck.
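For reference, a non-interactive way to flip the formula (rather
than going through the qconf -msconf editor) would be roughly the
following; the file name and the sed pattern are only illustrative,
and -Msconf is the "load scheduler configuration from a file"
variant, if I remember the option correctly:

# sketch only: dump the scheduler config, swap the formula, load it back
[gladden@stuart ~]$ qconf -ssconf > /tmp/sched.conf
[gladden@stuart ~]$ sed -i 's/^load_formula .*/load_formula  -slots/' /tmp/sched.conf
[gladden@stuart ~]$ qconf -Msconf /tmp/sched.conf

Going through a file also leaves a record of exactly what was loaded,
which is handy when comparing runs.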
Have any of you actually observed this feature to work? If so,
what version of SGE are you running?
James Gladden
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users