On Wed, 1 May 2013, Jake Carroll wrote:
Mark,
Thanks for the response. This is opening up a bunch of cool ideas for us.
We're trying to get our heads around how the scaling factor stuff actually
"works" however.
For example, if a host policy says scale factor for mem = 1.0, but we
could perhaps set it to 0.50, what does that actually *mean*? How does it
change the "scale" factor and what impact does it have on the way the
scheduler works to utilise memory on that node? Trying to get a
better handle on the semantics of this thing.
The scheduler keeps track of the "usage" generated by a particular job. This is
what is injected into the share tree and, presumably, the functional tree.
It is just a number (visible from within qmon). What's important is the
relative usage between your users (for functional) or relative
cumulative usage (for share tree).
Usage is defined as a weighted sum of cpu/mem/io usage, in the units I
mentioned in my last email:
usage = (wcpu * cpu) + (wmem * mem) + (wio * io)
If you have wcpu = 0.5, wmem = 0.5, wio = 0 and you have a job on 4 slots,
lasting for 1 day and consuming 2GB of RAM per slot, it would generate a
usage of:
usage = (0.5 * 4*1*24*60*60) + (0.5 * 2*4*1*24*60*60) = 172800 + 345600 = 518400
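As an aside, those weights live in the scheduler configuration as the
usage_weight_list entry, which you can edit with "qconf -msconf". A sketch
of the relevant line, assuming the 0.5/0.5/0 split used above:
...
usage_weight_list                cpu=0.500000,mem=0.500000,io=0.000000
...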
What the usage actually "means" in practice depends on the weights you
plug into the usage calculation. The weight numbers can be defined very
precisely and so it can be difficult to decide on _exact_ values to put
in, particularly if you have an inquisitive user or manager looking over
your shoulder asking questions.
Personally, I put together a simple spreadsheet to play around and
investigate the usage generated by different types of job and weights. I
then came up with a simple model which gave a precise answer for the
weights. I don't really care about the number of decimal places in the
answer, but it means I can point to the spreadsheet if I'm challenged :)
It also means anyone who wants it changed has to come up with a better
model first :)
For example, we have a "small" node and a "large" node in the same queue,
like so:
...
complex_values virtual_free=92G,h_vmem=92G
...
complex_values virtual_free=373G,h_vmem=373G
...
So - how does the scale factor etc. actually impact the scheduler's use of
the node?
...
usage_scaling allows you to tell the scheduler that not all slots or RAM
in the cluster should be considered equal.
For example, if you think that it is effective _occupancy_ of nodes that
is important, you might want to scale the memory usage value before it
feeds into the main usage calculation, so that you generate the same usage
if you fill up a node, no matter how much memory that node has. In the
case above, your second node has roughly four times as much memory as the
first, so you might want to use a usage_scaling of mem=0.25 for the
second node.
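The usage_scaling attribute lives in the execution host configuration,
alongside the complex_values lines above, so you can set it with
"qconf -me <hostname>". A sketch for the larger node (hostname invented
for illustration, other fields snipped):
...
hostname              bignode01
usage_scaling         cpu=1.000000,mem=0.250000,io=1.000000
complex_values        virtual_free=373G,h_vmem=373G
...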
Alternatively, if you have a number of clusters and don't want to bother
mucking around with working out good values for your usage weightings all
the time, you can use usage_scaling to normalise all your node memory
sizes to the same value and then use the same weightings on all clusters.
Or, if you have a mixture of generally available nodes that share tree
calculations should be done for, and nodes dedicated to specific users
that shouldn't (e.g. they bought them), you can just stop the dedicated
nodes from contributing to a particular user's usage via a usage_scaling
of cpu=0.000000,mem=0.000000,io=0.000000.
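As a sketch, such a dedicated host's entry (as shown by "qconf -se
<hostname>") would then contain:
...
usage_scaling         cpu=0.000000,mem=0.000000,io=0.000000
...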
All the best,
Mark
--
-----------------------------------------------------------------
Mark Dixon Email : [email protected]
HPC/Grid Systems Support Tel (int): 35429
Information Systems Services Tel (ext): +44(0)113 343 5429
University of Leeds, LS2 9JT, UK
-----------------------------------------------------------------