Hi Arnau,

If you declare slots as a consumable attribute and set it on every node, you avoid over-subscription: the host-level value becomes the maximum number of jobs that can run on that node.

Once you have this, the maximum number of running jobs on a given node is the minimum of "queue-slots" and "host-slots". If you set queue-slots to a number greater than or equal to host-slots, it has no effect at all. If you set queue-slots to a smaller value than host-slots, at most queue-slots jobs will run on that node through that queue. That, however, lets you make some "creative" use of the queues.
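The rule above can be written down as a tiny Python sketch. It only illustrates the arithmetic and is not Grid Engine code; the function name `max_running_jobs` is made up for this example, and it assumes every job takes exactly one slot.

```python
def max_running_jobs(queue_slots: int, host_slots: int) -> int:
    """Upper bound on jobs one queue can run on one node.

    Hypothetical helper: assumes single-slot jobs and that both limits
    (queue-slots and the host-level consumable) are enforced.
    """
    return min(queue_slots, host_slots)


# queue-slots >= host-slots: the queue setting has no effect.
print(max_running_jobs(queue_slots=16, host_slots=8))  # 8

# queue-slots < host-slots: the queue setting is the limit.
print(max_running_jobs(queue_slots=6, host_slots=8))   # 6
```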

For instance, say you have a 10-node cluster bought 50%-50% by two departments. You could give each department exclusive access to 5 nodes. Or you could set, say, host-slots to 8 and have 2 queues (one per department) with 6 queue-slots each. This would give you a system where each department can use up to 75% of the cluster (when the other department is not using it), while guaranteeing the other department at least 25% of the resources, available immediately when needed.
(This is obviously a cartoon; lots of other problems will arise.)
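To check the 75% / 25% figures, here is a small Python sketch of the arithmetic. It is a back-of-the-envelope model, not a scheduler simulation: the function name `department_shares` is hypothetical, and it assumes single-slot jobs and identical nodes.

```python
def department_shares(nodes: int, host_slots: int, queue_slots: int):
    """Return (max_share, guaranteed_share) of the cluster for one department.

    Hypothetical model: two departments, one queue each, same queue_slots
    on every node, every job uses exactly one slot.
    """
    total_slots = nodes * host_slots
    # Most one department can take on a node: capped by both limits.
    per_node_max = min(queue_slots, host_slots)
    # What the other department can never take away on a node.
    per_node_guaranteed = max(host_slots - queue_slots, 0)
    max_share = nodes * per_node_max / total_slots
    guaranteed_share = nodes * per_node_guaranteed / total_slots
    return max_share, guaranteed_share


max_share, guaranteed_share = department_shares(nodes=10, host_slots=8, queue_slots=6)
print(f"max: {max_share:.0%}, guaranteed: {guaranteed_share:.0%}")
```

With 10 nodes, host-slots 8 and queue-slots 6, each department can reach 60 of the 80 slots and is always left at least 20.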

Txema




On 09/01/13 16:26, Arnau Bria wrote:
Hi all,

I have a question about how to define (if needed) queue slots when you
already have host slots defined as complex_values.

Following some guides, I had some queues where I defined slots as:

[...]
slots                 0,[@xe=16]
[...]

I'm also configuring each host's slots as a complex_value, to avoid
over-subscription.

So, from my understanding, I'm telling the system twice that the queue
has X slots.

My concern with the above conf is that it requires a hostgroup
(@xe) definition with all the hosts that have the same number of
CPUs. So, if that hostgroup mixes hosts with different numbers of slots,
I should define 2 groups, and so on.


I've been doing some tests removing the slots values in queues, and OGS
behaves as desired: it does not allocate too many jobs on one node, even
if the queue has free slots, when the node is already running enough
jobs. But, as I found no examples of this conf, I'm wondering whether it
is correct to define queue slots as a maximum and then define hosts with
slots complex_values.

** I'm planning to start using queue preemption (subordination), and
this guide https://blogs.oracle.com/templedf/entry/better_preemption
talks about queue slots in some way. As I've not tested preemption, I'm
not sure whether removing queue slots will cause any problem (from what
I understand it won't, because you publish how many slots are available
for preemption... but that's just my understanding).


Many thanks in advance.

Cheers,
Arnau
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
