On 3/5/2019 12:34 PM, David Trimboli wrote:
On 3/5/2019 12:18 PM, Reuti wrote:
On 05.03.2019 at 18:06, David Trimboli <trimb...@cshl.edu> wrote:
I'm looking at SGE limits, and I'm not sure when something applies to all users
or each user individually. I want to find out how to limit each user to a
certain number of slots across the entire cluster (just one queue).
I feel like this isn't it:
{
name limit-user-slots
description Limit each user to 10 slots
enabled true
limit users * queues {all.q} to slots=10
}
limit users {*} queues all.q to slots=10
In principle {all.q} wouldn't hurt as it means "for each entry in the list",
and the only entry is all.q. But to lower the impact I would leave this out.
Ohhhhhhh! I didn't realize that {} meant to apply to each entry in the
list. That gives me everything I need. Thanks to you and Bernd.
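A minimal sketch of the distinction discussed above, assuming the syntax described in sge_resource_quota(5). The rule names and descriptions are hypothetical, chosen only for illustration. The first rule set caps all users combined at 10 slots; the second expands the user list, so each user gets a separate cap of 10 slots.

```
{
   name         example-shared-slots
   description  "hypothetical example: all users together get 10 slots"
   enabled      true
   limit        users * to slots=10
}
{
   name         example-per-user-slots
   description  "hypothetical example: each user gets 10 slots"
   enabled      true
   limit        users {*} to slots=10
}
```

A file like this could presumably be loaded with `qconf -Arqs <file>` and the result checked with `qconf -srqs`. Only one of the two rule sets would normally be enabled at a time, since the shared cap would otherwise dominate.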
Now a follow-up question. I implemented this rule to ensure that no
single user takes more than 90% of our available slots:
{
name limit90percent
description NONE
enabled TRUE
limit users {*} to slots=536
}
(Our cluster has a total of 596 slots; 90% of that is 536.4, rounded
down to 536.) This worked fine until someone
tried to submit a parallel environment job with the -pe option. On 16
out of our 24 nodes, it still worked. But if they sent a job hard-queued
to one of the upper nodes 17–24, it would never run, with this in the
scheduling info:
cannot run because it exceeds limit "trimboli/////" in rule
"limit90percent/1"
cannot run in PE "threads" because it only offers 0 slots
(My username is trimboli.) Now, it's quite possible that the upper nodes
are set up differently than the lower nodes. The upper eight nodes were
installed later than the others and have been treated differently in the
past. I'd like to find what setting in the upper nodes is making this
limit say that there are 0 slots when a PE job is run. Where can I look
to find the culprit?