Hi,

On 18.04.2013 at 17:01, Riccardo Murri wrote:

> In order to have no jobs running on the cluster on maintenance day, I
> would like to place an advance reservation: I would reserve all slots
> in the cluster for 12 hours on the day we do maintenance and software
> upgrades, so that the scheduler knows that no jobs from other users
> can run on that day.
> 
> However, large advance reservations seem to behave erratically. Our
> cluster comprises 576 nodes, each with 8 cores, distributed among
> queues as follows:
> 
>    # qstat -g c
>    CLUSTER QUEUE   CQLOAD   USED    RES  AVAIL  TOTAL aoACDPS  cdsuE
>    -----------------------------------------------------------------
>    long.q            1.88      8      0      0    496     112      0
>    med.q             0.96      8      0     16    960       0      0
>    short.q           0.81      8      0    264   1536      88     16
>    test.q            0.36      0      0     80     80       0      0
>    very-short.q      0.36      8      0     40     80       0      0
>    wide.q            1.23      8      0    616   1536     288      0
> 
> No two queues overlap, except for queue `test.q`, which is reserved for
> sysadmins and thus of no concern.
> 
> A reservation for 4608 slots (576 nodes with 8 cores each) for 12
> hours (from 08:00 to 20:00) fails:
> 
>    # qrsub -a 201305190800 -e 201305192000 -pe parastation 4608
>    queue "short.q@r06c01b12n01" is temporarily disabled
>    queue "short.q@r06c01b12n02" is temporarily disabled
>    advance_reservation: no suitable queues

Is the PE also attached to very-short.q?
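
The slot arithmetic behind that question can be sketched as follows. This is only a sanity check on the numbers quoted above: the totals are copied from the TOTAL column of the `qstat -g c` output, and leaving out test.q relies on the statement that it is the only queue overlapping the others. The variable names are made up for illustration.

```python
# TOTAL column from the `qstat -g c` output quoted above.
queue_totals = {
    "long.q": 496,
    "med.q": 960,
    "short.q": 1536,
    "test.q": 80,
    "very-short.q": 80,
    "wide.q": 1536,
}

# test.q overlaps the other queues (per the original message), so it
# adds no slots of its own and is left out of the sum.
physical_slots = sum(
    total for queue, total in queue_totals.items() if queue != "test.q"
)

nodes, cores_per_node = 576, 8
assert physical_slots == nodes * cores_per_node

# Slots left if the PE cannot use very-short.q.
without_very_short = physical_slots - queue_totals["very-short.q"]

print(physical_slots, without_very_short)
```

So a request for all 4608 slots can only be granted if every queue except test.q, including very-short.q, can contribute slots to the PE.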


> A reservation for 4528 slots (4608 - 80 slots in the very-short.q) for
> 12 hours (from 08:00 to 20:00) apparently succeeds:
> 
>    # qrsub -a 201305190800 -e 201305192000 -pe parastation 4528
>    queue "short.q@r06c01b12n01" is temporarily disabled
>    queue "short.q@r06c01b12n02" is temporarily disabled
>    Your advance reservation 206 has been granted
> 
> However, trying to reserve the same number of slots for the next day
> fails again:
> 
>    # qrsub -a 201305200800 -e 201305202000 -pe parastation 4528
>    queue "short.q@r06c01b12n01" is temporarily disabled
>    queue "short.q@r06c01b12n02" is temporarily disabled
>    advance_reservation: no suitable queues

Maybe the problem is not very-short.q but the jobs that are currently
running with a longer h_rt defined. Is there also a calendar defined?
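
To illustrate the h_rt point, here is a minimal sketch of the overlap check one could do by hand. It assumes a running job occupies its slots from its start time until start + h_rt; the job names, start times and h_rt values are hypothetical and not taken from the cluster in question, and this is not how the scheduler itself is implemented.

```python
from datetime import datetime, timedelta


def overlaps_reservation(job_start, h_rt, res_start, res_end):
    """True if [job_start, job_start + h_rt) intersects [res_start, res_end)."""
    job_end = job_start + h_rt
    return job_start < res_end and job_end > res_start


# Reservation window from the first qrsub call (2013-05-19, 08:00 to 20:00).
res_start = datetime(2013, 5, 19, 8, 0)
res_end = datetime(2013, 5, 19, 20, 0)

# Hypothetical running jobs: name -> (start time, h_rt).
jobs = {
    "job_a": (datetime(2013, 4, 18, 12, 0), timedelta(days=7)),
    "job_b": (datetime(2013, 4, 18, 12, 0), timedelta(days=40)),
}

blocking = [
    name
    for name, (start, h_rt) in jobs.items()
    if overlaps_reservation(start, h_rt, res_start, res_end)
]

print(blocking)
```

Any job that ends up in `blocking` holds slots inside the reservation window, so those slots would not be available to the advance reservation.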

-- Reuti


> Am I doing anything wrong?
> 
> Thanks for any hint,
> Riccardo
> 
> --
> Riccardo Murri
> http://www.gc3.uzh.ch/people/rm
> 
> Grid Computing Competence Centre
> University of Zurich
> Winterthurerstrasse 190, CH-8057 Zürich (Switzerland)
> Tel: +41 44 635 4222
> Fax: +41 44 635 6888
> 
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
> 

