Hi,

On 12.01.2014 at 19:19, Julien Nicoulaud wrote:

> I would like to set up a queue with the following rules:
>  - Have a fixed s_rt for all jobs running in the queue

The s_rt will send a warning to the job that it should prepare itself to get 
killed soon (like writing a checkpoint file) when it passes this time limit.
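On the job side, reacting to that warning could look roughly like the sketch below. It assumes the job is a Python program that works through a list of items and receives the warning as SIGUSR1; `install_handler`, `run`, `process` and `write_checkpoint` are hypothetical names made up for illustration, not anything SGE provides.

```python
import signal

# Flag flipped by the handler; the main loop checks it between work items.
soft_limit_reached = False


def on_soft_limit(signum, frame):
    """Record that the soft-limit warning arrived."""
    global soft_limit_reached
    soft_limit_reached = True


def install_handler():
    """Register the handler; call this once at job start-up (main thread)."""
    signal.signal(signal.SIGUSR1, on_soft_limit)


def run(work_items, process, write_checkpoint):
    """Process items until done or until the warning has been seen.

    Returns the index of the first unprocessed item, so a restarted job
    knows where to resume.
    """
    for index, item in enumerate(work_items):
        if soft_limit_reached:
            # Save progress and stop cleanly before the job gets killed.
            write_checkpoint(index)
            return index
        process(item)
    return len(work_items)
```

The handler only sets a flag, so the checkpoint is written at a well-defined point between two work items rather than in the middle of one.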


>  - If no jobs are queued waiting, let running jobs exceed their s_rt

All you could do is ignore the SIGUSR1 in the job, but the job will be killed later anyway.


>  - Otherwise, kill jobs that have exceeded their s_rt so that jobs start
> 
> Any idea how to do that ?

The only way to achieve this could be a co-scheduler outside of SGE. You submit 
the jobs as usual, and the "granted minimum running time" can be attached as a 
job context. The co-scheduler could then check whether any job has already 
passed this limit and, in case there are waiting jobs, kill the jobs in 
question. It might not be easy to decide whether killing a job will allow any 
waiting job to start instantly (in case it's a parallel job or needs much 
memory), though. If it doesn't free up enough resources, you are out of luck.
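The decision step of such a co-scheduler might be sketched as follows. This is only an illustration under assumptions: the granted time is attached at submission with something like `qsub -ac min_runtime=3600 job.sh` (the context name `min_runtime` is made up), the running and pending jobs have already been collected from `qstat`, and the returned job IDs would be handed to `qdel`. It does not try to answer the harder question of whether a kill frees enough resources.

```python
from dataclasses import dataclass


@dataclass
class RunningJob:
    job_id: str
    start_epoch: float      # start time in seconds since the epoch
    granted_seconds: float  # granted minimum running time from the job context


def jobs_to_kill(running, pending_count, now):
    """Return IDs of running jobs past their granted time, oldest first.

    While nothing is waiting, no job is selected, so running jobs may
    continue beyond their granted time.
    """
    if pending_count == 0:
        return []
    overdue = [j for j in running if now - j.start_epoch > j.granted_seconds]
    overdue.sort(key=lambda j: j.start_epoch)
    return [j.job_id for j in overdue]
```

Keeping this step free of any calls to SGE makes it easy to try out with made-up job lists before wiring it to `qstat` and `qdel`.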

-- Reuti

PS: You can have a look at my `qstatus` script to see how to convert the 
clear-text start time to plain seconds. Unfortunately, there is no "raw" output 
available in `qstat` to grab them.
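For illustration only (this is not the `qstatus` script), such a conversion could look like the sketch below. It assumes the start time is printed as `MM/DD/YYYY HH:MM:SS`; check this against the output of your own installation. The function name `start_time_to_epoch` is made up.

```python
from datetime import datetime, timezone

# Assumed layout of the clear-text start time, e.g. "01/12/2014 19:19:00".
QSTAT_TIME_FORMAT = "%m/%d/%Y %H:%M:%S"


def start_time_to_epoch(text, tz=None):
    """Convert a clear-text start time to seconds since the epoch.

    With tz=None the time is interpreted in the local time zone of the
    machine running the conversion; pass a tzinfo to override that.
    """
    parsed = datetime.strptime(text.strip(), QSTAT_TIME_FORMAT)
    if tz is not None:
        parsed = parsed.replace(tzinfo=tz)
    return parsed.timestamp()


if __name__ == "__main__":
    print(start_time_to_epoch("01/12/2014 19:19:00", tz=timezone.utc))
```

Subtracting the result from the current time gives the elapsed running time of a job in seconds.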
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
