We have set the MaxTRESMins limit on accounts and users, to make it
impossible to start jobs that we consider outrageously large.

But we have found an unwanted side effect:
When a user asks for a longer time limit, we often grant it, but
after we raise the time limit, the running job sometimes reaches the
MaxTRESMins limit and is killed:
Dec 28 17:20:18 milou-q slurmctld: [2015-12-28T17:20:09.072] Job 6574528 timed out, the job is at or exceeds assoc 10056(b2013086/ansgar/(null)) max tres(cpu) minutes of 600000 with 600001
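
Until this is resolved, an extension could be checked by hand before it
is granted. The following is only a minimal sketch, not anything that
Slurm provides: the function names are hypothetical, the 16-CPU job
size is an invented example (the log above does not state the CPU
count), and it assumes that usage is counted as allocated CPUs times
elapsed minutes and that a job is killed once it is "at or exceeds"
the limit, as the log message suggests.

```python
def extension_is_safe(alloc_cpus: int, new_timelimit_minutes: int,
                      max_cpu_minutes: int) -> bool:
    """Return True if a job running to its new time limit stays below the cap.

    Assumption: usage = allocated CPUs * elapsed minutes, and the job is
    killed when usage is at or above the cap, hence the strict comparison.
    """
    return alloc_cpus * new_timelimit_minutes < max_cpu_minutes


def max_safe_timelimit_minutes(alloc_cpus: int, max_cpu_minutes: int) -> int:
    """Longest time limit in minutes that keeps the job strictly below the cap."""
    if alloc_cpus <= 0:
        raise ValueError("alloc_cpus must be positive")
    return (max_cpu_minutes - 1) // alloc_cpus


if __name__ == "__main__":
    cap = 600000   # max tres(cpu) minutes, taken from the log line above
    cpus = 16      # hypothetical job size, not taken from the log
    print(extension_is_safe(cpus, 37500, cap))
    print(max_safe_timelimit_minutes(cpus, cap))
```

With these assumptions, an administrator would refuse or shorten any
extension for which extension_is_safe() returns False.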

To us, this looks like a bug.

We would prefer that the MaxTRESMins limit not kill jobs that are
already running.

Cheers,
-- Lennart Karlsson
   UPPMAX, Uppsala University, Sweden
   http://www.uppmax.uu.se
