Hi,
On 27.03.2012 at 15:42, Esztermann, Ansgar wrote:
> Hi everyone,
>
> while in general, all users are equal in our installation, I would like some
> nodes to have a longer maximum runtime for some users. In order to avoid
> oversubscription, we have only one queue per node. So instead of putting both
> a "medium" and a "long" queue on these nodes, I set up an RQS like this:
> {
> name lfn
> description Long Fat Nodes
> enabled TRUE
> limit users {aeszter,b,c} hosts {@lfn} to h_rt=2764800
> limit users {*} hosts {@lfn} to h_rt=604800
> }
>
> This works as expected for serial jobs, but parallel ones (even with one
> slot) refuse to start:
> #qalter -w p 1071136
> Job 1071136 cannot run because it exceeds limit "aeszter////node12-34/" in
> rule "lfn/1"
> #qstat -j 1071136
> ...
> hard resource_list: h_rt=86400
> ...
> hard_queue_list: *@@lfn
> ...
>
> I've tried adding
> limit users {aeszter,b,c} hosts {@lfn} pes * to h_rt=2764800
> limit users {*} hosts {@lfn} pes * to h_rt=604800
> but to no avail.
Just a follow-up:
Can you please try adding h_rt with an arbitrarily high value to each exechost
(`qconf -me ...`)? Does it work then?
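
A minimal sketch of a non-interactive way to do this, in case editing every
host by hand is tedious. It only prints the commands rather than running them.
Assumptions: the value goes into the host's complex_values, `qconf -aattr` is
the right way to add it there (if h_rt is already listed for a host, `-mattr`
may be needed instead; please check qconf(1) for your version), the value
2764800 is simply the largest limit from your RQS, and the host names are
hypothetical placeholders for the members of @lfn.

```shell
#!/bin/sh
# Sketch only: prints the qconf commands instead of executing them.
# H_RT is an assumption: the largest h_rt limit from the RQS (2764800 s).
H_RT=2764800

# Build the command that would add h_rt to one host's complex_values.
# $1 is the exechost name.
build_cmd() {
    printf 'qconf -aattr exechost complex_values h_rt=%s %s\n' "$H_RT" "$1"
}

# Hypothetical host names; replace with the actual members of @lfn.
for host in node12 node13; do
    build_cmd "$host"
done
```

Once the printed commands look right, they can be piped to `sh` to apply
them, and `qconf -se <host>` should then show h_rt in complex_values.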
-- Reuti
>
> Thanks,
>
> A.
> --
> Ansgar Esztermann
> DV-Systemadministration
> Max-Planck-Institut für biophysikalische Chemie, Abteilung 105
>
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users