We run Maui (3.2.6p21) on top of SLURM (2.0.2-pre1) with a patch to the
Wiki interface that enables requeueing of jobs.

In some cases, Maui

 a) schedules a job it has requeued in the same iteration, or
 b) requeues a job it has scheduled in the same iteration.

Both cases cause problems.  Situation a) is not critical, but situation b)
can lead to jobs getting killed.

We are trying to find a way to solve this problem, both from the Maui
side and the SLURM side.

For the Maui side, is it possible to get Maui to not requeue a running
job until it has run for a minimum amount of time?  (And conversely, though
less critical, to not start a job right after it has been requeued?)
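
To make the behaviour we are after concrete, here is a rough sketch in
Python.  It is not Maui or SLURM code: the Job fields, the function names
and the two thresholds are all hypothetical and only illustrate the two
guards described above (a minimum run time before a requeue, and a minimum
wait after a requeue before a new start).

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds in seconds; these are NOT Maui or SLURM parameters.
MIN_RUN_BEFORE_REQUEUE = 300
MIN_WAIT_AFTER_REQUEUE = 60


@dataclass
class Job:
    started_at: Optional[float] = None   # time the job last started, if running
    requeued_at: Optional[float] = None  # time the job was last requeued, if any


def may_requeue(job: Job, now: float) -> bool:
    """Allow a requeue only if the job is running and has run long enough."""
    if job.started_at is None:
        return False
    return now - job.started_at >= MIN_RUN_BEFORE_REQUEUE


def may_start(job: Job, now: float) -> bool:
    """Allow a start only if the job was not requeued very recently."""
    if job.requeued_at is None:
        return True
    return now - job.requeued_at >= MIN_WAIT_AFTER_REQUEUE
```

With any non-zero thresholds, a job started in the current iteration
fails may_requeue(), and a job requeued in the current iteration fails
may_start(), which would rule out both situation a) and situation b).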

-- 
Regards,
Bjørn-Helge Mevik, dr. scient,
Research Computing Services, University of Oslo
_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers