On 5/18/2011 1:47 PM, Dave Love wrote:
> James Gladden<[email protected]> writes:
>
>> The scheduler picked stf.q@compute-1-1 which was the unloaded node, instead of
>> "packing" the job into one of the four available slots on compute-1-12 as was
>> desired and expected. I should add that stf.q@compute-1-1 is the lowest
>> sequence number instance in stf.q, so this looks like the job was assigned by
>> sequence number rather than by our "-slots" load formula.
>
> Well, what's the queue_sort_method?
> However, as I said, there's a bug, but it happens univa just fixed it --
> see commits from a couple of days ago, I think.
>
>> Any suggestions? I have poked around in the archive without finding the error
>> of my ways.
>
> BTW, why the (-) inversion in the load formula?
> Don't you want to favour more loaded nodes?

Yes, I do. "Slots" is a consumable resource, right? In the case of our
systems, the starting value for each execution host is set to 8. If
the load formula is "-slots", then for an unloaded node we have:
load = (-8)
If we then dispatch a single-slot job to that node, the load value would
then change to
load = (-7)
Algebraically,
(-7) > (-8)
so the scheduler will perceive the empty node as "less loaded" and
dispatch to it in preference to the node with one slot already
consumed. I don't see how this formula "favors more loaded nodes." On
the other hand, if we go with a load formula of just (slots) then we get
7 < 8
so the scheduler should perceive the partially consumed node (load = 7)
as "less loaded" and dispatch to it in preference to the empty node
(load = 8). Is there some flaw in this logic?
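
The comparison above can be checked with a small toy model. This is only a
sketch of the arithmetic, not SGE's scheduler code: it assumes the host with
the numerically lowest load value is preferred, and the host names and the
functions load_value and preferred_host are made up for illustration.

```python
# Toy model of the reasoning above; not SGE's actual scheduler code.
# Assumption: the host with the numerically lowest load value is preferred.

def load_value(formula, free_slots):
    """Evaluate one of the two candidate load formulas for a host."""
    if formula == "slots":
        return free_slots
    if formula == "-slots":
        return -free_slots
    raise ValueError(f"unsupported formula: {formula}")


def preferred_host(formula, hosts):
    """Return the name of the host with the lowest load value."""
    return min(hosts, key=lambda name: load_value(formula, hosts[name]))


# Remaining free slots per (hypothetical) host:
# one empty 8-slot node and one node with a single slot already consumed.
hosts = {"empty": 8, "partly_used": 7}

for formula in ("-slots", "slots"):
    print(formula, "->", preferred_host(formula, hosts))
```

Under that assumption, "-slots" selects the empty node and "slots" selects
the partially consumed one, which matches the two inequalities above.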
Alas, all of this seems moot as I have not been able to establish that
the scheduler actually pays any attention to the setting of the load
formula or the queue sort method. My system appears to dispatch jobs
based on queue sequence number irrespective of these settings. For
example, changing the load formula from "slots" to "-slots" appears to
make no difference. The problem is demonstrable with serial jobs, so it
is not a bug associated with PEs. I have even tried restarting qmaster
on the theory that perhaps it would only recognize the change on daemon
start up. No luck.
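
For anyone wanting to check the same two settings, here is a rough sketch
that reads scheduler configuration text in the "name value" layout printed by
qconf -ssconf. The sample text is illustrative only and is not output from
the system discussed here; parse_sched_conf and describe_sort are
hypothetical helper names. The tie-breaking description reflects my reading
of sched_conf(5) and should be checked against the man page for your version.

```python
# Illustrative text only, shaped like `qconf -ssconf` output.
# It is NOT taken from the cluster discussed in this thread.
SAMPLE_SSCONF = """\
queue_sort_method                 seqno
load_formula                      -slots
"""


def parse_sched_conf(text):
    """Turn 'name   value' lines into a dict, skipping malformed lines."""
    conf = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            conf[parts[0]] = parts[1].strip()
    return conf


def describe_sort(conf):
    """Summarise which criterion is applied first (per sched_conf(5))."""
    method = conf.get("queue_sort_method")
    if method == "seqno":
        return "sequence number first; load formula only breaks ties"
    if method == "load":
        return "load formula first; sequence number only breaks ties"
    return "queue_sort_method not found or unrecognised"


conf = parse_sched_conf(SAMPLE_SSCONF)
print("load_formula:", conf.get("load_formula"))
print("ordering:", describe_sort(conf))
```

To use it on a live system, the text could be captured from qconf -ssconf
and passed to parse_sched_conf in place of the sample string.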
Have any of you actually observed this feature to work? If so, what
version of SGE are you running?
James Gladden