Hi,

On 18.04.2013 at 17:01, Riccardo Murri wrote:

> In order to have no jobs running on the cluster on maintenance day, I
> would like to place an advance reservation: I would reserve all slots
> in the cluster for 12 hours on the day we do maintenance and software
> upgrades, so that the scheduler knows that no jobs from other users
> can run on that day.
>
> However, large advance reservations seem to behave erratically. Our
> cluster is comprised of 576 nodes, each with 8 cores, distributed
> among queues as follows:
>
> # qstat -g c
> CLUSTER QUEUE   CQLOAD   USED    RES  AVAIL  TOTAL aoACDPS  cdsuE
> -----------------------------------------------------------------
> long.q            1.88      8      0      0    496     112      0
> med.q             0.96      8      0     16    960       0      0
> short.q           0.81      8      0    264   1536      88     16
> test.q            0.36      0      0     80     80       0      0
> very-short.q      0.36      8      0     40     80       0      0
> wide.q            1.23      8      0    616   1536     288      0
>
> No two queues overlap, except for queue `test.q`, which is reserved for
> sysadmins and thus of no concern.
>
> A reservation for 4608 slots (576 nodes with 8 cores each) for 12
> hours (from 08:00 to 20:00) fails:
>
> # qrsub -a 201305190800 -e 201305192000 -pe parastation 4608
> queue "short.q@r06c01b12n01" is temporarily disabled
> queue "short.q@r06c01b12n02" is temporarily disabled
> advance_reservation: no suitable queues

Is the PE also attached to very-short.q?

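As a quick cross-check of the slot numbers quoted above, here is a minimal sketch. It assumes the TOTAL values are copied by hand from the `qstat -g c` output (test.q is left out because it overlaps the other queues); the variable names are made up for illustration and nothing here talks to the scheduler.

```shell
#!/bin/sh
# TOTAL column per queue, copied by hand from the quoted `qstat -g c` output.
# test.q is omitted: it overlaps the other queues and is sysadmin-only.
queue_totals="long.q 496
med.q 960
short.q 1536
very-short.q 80
wide.q 1536"

# Sum of the TOTAL column over all listed queues.
all_slots=$(printf '%s\n' "$queue_totals" |
    awk '{ sum += $2 } END { print sum }')

# The same sum with very-short.q excluded.
without_very_short=$(printf '%s\n' "$queue_totals" |
    awk '$1 != "very-short.q" { sum += $2 } END { print sum }')

echo "nodes x cores:        $((576 * 8))"
echo "all queues:           $all_slots"
echo "without very-short.q: $without_very_short"
```

The totals add up to 4608, and to 4528 without very-short.q, so a 4608-slot request can only be satisfied if every queue, very-short.q included, offers the PE. The `pe_list` line in the output of `qconf -sq very-short.q` shows whether that is the case.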
> A reservation for 4528 slots (4608 - 80 slots in the very-short.q) for
> 12 hours (from 08:00 to 20:00) apparently succeeds:
>
> # qrsub -a 201305190800 -e 201305192000 -pe parastation 4528
> queue "short.q@r06c01b12n01" is temporarily disabled
> queue "short.q@r06c01b12n02" is temporarily disabled
> Your advance reservation 206 has been granted
>
> However, trying to reserve the same number of slots for the next day
> fails again:
>
> # qrsub -a 201305200800 -e 201305202000 -pe parastation 4528
> queue "short.q@r06c01b12n01" is temporarily disabled
> queue "short.q@r06c01b12n02" is temporarily disabled
> advance_reservation: no suitable queues

Maybe the problem is not very-short.q but the currently running jobs that have a longer h_rt defined. Is there also a calendar defined?

-- Reuti

> Am I doing anything wrong?
>
> Thanks for any hint,
> Riccardo
>
> --
> Riccardo Murri
> http://www.gc3.uzh.ch/people/rm
>
> Grid Computing Competence Centre
> University of Zurich
> Winterthurerstrasse 190, CH-8057 Zürich (Switzerland)
> Tel: +41 44 635 4222
> Fax: +41 44 635 6888
>
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
