Hi,

On 27.03.2012 at 15:42, Esztermann, Ansgar wrote:

> Hi everyone,
> 
> while in general, all users are equal in our installation, I would like some 
> nodes to have a longer maximum runtime for some users. In order to avoid 
> oversubscription, we have only one queue per node. So instead of putting both 
> a "medium" and a "long" queue on these nodes, I set up an RQS like this:
> {
>   name         lfn
>   description  Long Fat Nodes
>   enabled      TRUE
>   limit        users {aeszter,b,c} hosts {@lfn} to h_rt=2764800
>   limit        users {*} hosts {@lfn} to h_rt=604800
> }
> 
> This works as expected for serial jobs, but parallel ones (even with one 
> slot) refuse to start:
> #qalter -w p 1071136
> Job 1071136 cannot run because it exceeds limit "aeszter////node12-34/" in 
> rule "lfn/1"
> #qstat -j 1071136
> ...
> hard resource_list:         h_rt=86400
> ...
> hard_queue_list:            *@@lfn
> ...
> 
> I've tried adding
>   limit        users {aeszter,b,c} hosts {@lfn} pes * to h_rt=2764800
>   limit        users {*} hosts {@lfn} pes * to h_rt=604800
> but to no avail.
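
For reference, the h_rt values in the rule set and in the job request are plain seconds. A small sketch of the conversion (`to_days` is a hypothetical helper name used only for illustration, not a Grid Engine command):

```shell
# Convert an h_rt value given in seconds into whole days.
# to_days is a made-up helper for this example.
to_days() {
  echo $(( $1 / 86400 ))
}

# Values taken from the rule set and the qstat output quoted above.
for secs in 2764800 604800 86400; do
  echo "h_rt=$secs s is $(to_days "$secs") day(s)"
done
# h_rt=2764800 s is 32 day(s)
# h_rt=604800 s is 7 day(s)
# h_rt=86400 s is 1 day(s)
```

So the job's 1-day request lies below both the 32-day and the 7-day limit, which is why the rejection is unexpected.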

Just a follow-up:

Can you please try adding h_rt with an arbitrarily high value to each exechost 
(`qconf -me ...`)? Does it work then?
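
A rough sketch of what the edited exec host entry might then contain. The host name and the value of 31536000 seconds (365 days) are assumptions chosen for illustration. Only the `complex_values` line matters here, the other attributes shown by `qconf -me` are left out, and the `#` lines are annotations for this mail, not part of the entry:

```
# lfn-node01 is a placeholder host name (assumption)
hostname              lfn-node01
# 31536000 s = 365 days, an arbitrary value above every limit in the RQS (assumption)
complex_values        h_rt=31536000
```

If the parallel job starts after this change, the missing h_rt value on the host is the likely trigger of the RQS rejection.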

-- Reuti


> 
> Thanks,
> 
> A.
> -- 
> Ansgar Esztermann
> DV-Systemadministration
> Max-Planck-Institut für biophysikalische Chemie, Abteilung 105
> 
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users


