Hi,
On 27.03.2012 at 15:42, Esztermann, Ansgar wrote:
> Hi everyone,
>
> while in general, all users are equal in our installation, I would like some
> nodes to have a longer maximum runtime for some users. In order to avoid
> oversubscription, we have only one queue per node. So instead of putting both
> a "medium" and a "long" queue on these nodes, I set up an RQS like this:
> {
> name lfn
> description Long Fat Nodes
> enabled TRUE
> limit users {aeszter,b,c} hosts {@lfn} to h_rt=2764800
> limit users {*} hosts {@lfn} to h_rt=604800
> }
>
> This works as expected for serial jobs, but parallel ones (even with one
> slot) refuse to start:
> #qalter -w p 1071136
> Job 1071136 cannot run because it exceeds limit "aeszter////node12-34/" in
> rule "lfn/1"
Yes, it also refuses to run on an empty cluster where you are well below
the limit. Even if you suspect the value is simply multiplied by the slot
count, no larger limit works either.
http://gridengine.org/pipermail/users/2011-April/000612.html
But I never found the root cause. Besides not being enforced (which is fine),
limiting a non-consumable value in an RQS sometimes blocks scheduling.
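
As a sanity check of the numbers only, here is a minimal Python sketch. It is not how the scheduler evaluates an RQS. It assumes that the first matching rule in a set applies and, hypothetically, that h_rt would be multiplied by the slot count as for a consumable. The names `applicable_limit` and `fits` are made up for illustration; the values come from the quoted rule set and job.

```python
# Values from the quoted RQS and the job's hard resource_list (seconds).
SECONDS_PER_DAY = 24 * 60 * 60

# Rules in the order they appear in the rule set; None stands for users {*}.
rules = [
    ({"aeszter", "b", "c"}, 2764800),
    (None, 604800),
]


def applicable_limit(user, rule_list):
    """Return the limit of the first rule matching the user (assumed semantics)."""
    for users, limit in rule_list:
        if users is None or user in users:
            return limit
    return None


def fits(user, h_rt, slots, rule_list):
    """Hypothetical check: treat h_rt like a consumable multiplied by slots."""
    limit = applicable_limit(user, rule_list)
    return limit is None or h_rt * slots <= limit


if __name__ == "__main__":
    print(2764800 // SECONDS_PER_DAY, "days for the listed users")
    print(604800 // SECONDS_PER_DAY, "days for everyone else")
    print("one-slot job with h_rt=86400 fits:", fits("aeszter", 86400, 1, rules))
```

Under these assumptions a one-slot job requesting h_rt=86400 is far below the 2764800 limit, so simple multiplication does not explain the rejection.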
-- Reuti
> #qstat -j 1071136
> ...
> hard resource_list: h_rt=86400
> ...
> hard_queue_list: *@@lfn
> ...
>
> I've tried adding
> limit users {aeszter,b,c} hosts {@lfn} pes * to h_rt=2764800
> limit users {*} hosts {@lfn} pes * to h_rt=604800
> but to no avail.
>
>
> Thanks,
>
> A.
> --
> Ansgar Esztermann
> DV-Systemadministration
> Max-Planck-Institut für biophysikalische Chemie, Abteilung 105
>
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users