Hello,

In order to have no jobs running on the cluster on maintenance day, I
would like to place an advance reservation: I would reserve all slots
in the cluster for 12 hours on the day we do maintenance and software
upgrades, so that the scheduler knows that no jobs from other users
can run on that day.

However, large advance reservations seem to behave erratically. Our
cluster comprises 576 nodes, each with 8 cores, distributed among
queues as follows:

    # qstat -g c
    CLUSTER QUEUE   CQLOAD   USED    RES  AVAIL  TOTAL aoACDPS  cdsuE
    -----------------------------------------------------------------
    long.q            1.88      8      0      0    496     112      0
    med.q             0.96      8      0     16    960       0      0
    short.q           0.81      8      0    264   1536      88     16
    test.q            0.36      0      0     80     80       0      0
    very-short.q      0.36      8      0     40     80       0      0
    wide.q            1.23      8      0    616   1536     288      0

No two queues overlap, except for queue `test.q`, which is reserved for
sysadmins and thus of no concern.
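
As a sanity check on the slot counts used in this message, the TOTAL
column of the listing can be summed with `test.q` left out. This is
only a minimal sketch: the helper name `sum_total_slots` is made up for
illustration, and it assumes the listing has one row per queue with
TOTAL in the sixth column. The sample rows are copied from the output
above.

```shell
#!/bin/sh
# Sum the TOTAL column (field 6) of `qstat -g c`-style rows read from stdin.
# Assumption: one row per queue, queue names end in ".q".
# test.q is skipped because it overlaps with the other queues.
sum_total_slots() {
    awk '$1 ~ /\.q$/ && $1 != "test.q" { total += $6 } END { print total + 0 }'
}

# Rows copied from the listing in this message.
sample='long.q        1.88  8  0    0   496  112   0
med.q         0.96  8  0   16   960    0   0
short.q       0.81  8  0  264  1536   88  16
test.q        0.36  0  0   80    80    0   0
very-short.q  0.36  8  0   40    80    0   0
wide.q        1.23  8  0  616  1536  288   0'

total=$(printf '%s\n' "$sample" | sum_total_slots)

echo "slots excluding test.q: $total"
echo "nodes x cores:          $((576 * 8))"
echo "minus very-short.q:     $((total - 80))"
```

On a live cluster the same function could presumably be fed with
`qstat -g c | sum_total_slots`, provided the output is not wrapped.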

A reservation for 4608 slots (576 nodes with 8 cores each) for 12
hours (from 08:00 to 20:00) fails:

    # qrsub -a 201305190800 -e 201305192000 -pe parastation 4608
    queue "short.q@r06c01b12n01" is temporarily disabled
    queue "short.q@r06c01b12n02" is temporarily disabled
    advance_reservation: no suitable queues

A reservation for 4528 slots (4608 - 80 slots in the very-short.q) for
12 hours (from 08:00 to 20:00) apparently succeeds:

    # qrsub -a 201305190800 -e 201305192000 -pe parastation 4528
    queue "short.q@r06c01b12n01" is temporarily disabled
    queue "short.q@r06c01b12n02" is temporarily disabled
    Your advance reservation 206 has been granted

However, trying to reserve the same number of slots for the next day
fails again:

    # qrsub -a 201305200800 -e 201305202000 -pe parastation 4528
    queue "short.q@r06c01b12n01" is temporarily disabled
    queue "short.q@r06c01b12n02" is temporarily disabled
    advance_reservation: no suitable queues

Am I doing anything wrong?

Thanks for any hint,
Riccardo

--
Riccardo Murri
http://www.gc3.uzh.ch/people/rm

Grid Computing Competence Centre
University of Zurich
Winterthurerstrasse 190, CH-8057 Zürich (Switzerland)
Tel: +41 44 635 4222
Fax: +41 44 635 6888
