What is the requested wall time on that job? If no DefaultTime is set for the debug partition, Slurm assumes the job will run for the partition's maximum time limit (here, infinite), so the job's projected end time overlaps the reservation's start time and the job stays pending.
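If that's the cause, either remedy below should let the job start; the specific time values here are just illustrative:

```shell
# Option 1: request a wall time short enough to finish before the
# reservation begins (5 minutes here, purely as an example).
srun --time=00:05:00 hostname

# Option 2: give the partition a finite DefaultTime so jobs that omit
# --time no longer default to the partition's (infinite) MaxTime.
scontrol update PartitionName=debug DefaultTime=00:30:00

# To make the DefaultTime change persistent, set it on the partition
# line in slurm.conf instead, e.g.:
#   PartitionName=debug Nodes=server1 Default=YES DefaultTime=00:30:00 MaxTime=INFINITE State=UP
```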
-----Original Message-----
From: Tim Donahue <[email protected]>
Reply-To: slurm-dev <[email protected]>
Date: Thursday, June 22, 2017 at 4:59 PM
To: slurm-dev <[email protected]>
Subject: [slurm-dev] Node not available due to future reservation?

I have a very simple system: one controller, one server node. The node is up.

> ubuntu@controller:~$ sinfo
> PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
> debug*    up    infinite  1     idle  server1

I create a reservation containing the server node but having a start time many days in advance:

> ubuntu@controller:~$ scontrol show reservations -o
> ReservationName=foo3 StartTime=2017-07-03T00:00:00
> EndTime=2017-07-03T01:00:00 Duration=01:00:00 Nodes=server1 NodeCnt=1
> CoreCnt=1 Features=(null) PartitionName=debug Flags= TRES=cpu=1
> Users=ubuntu Accounts=(null) Licenses=(null) State=INACTIVE
> BurstBuffer=(null) Watts=n/a
> ubuntu@controller:~$

I then try to run a (very simple) job, but the job is queued:

> ubuntu@controller:~$ srun hostname
> srun: Required node not available (down, drained or reserved)
> srun: job 630 queued and waiting for resources

squeue suggests the job is queued because the server node is not available:

> ubuntu@controller:~$ squeue
> JOBID PARTITION NAME     USER   ST TIME NODES NODELIST(REASON)
> 629   debug     hostname ubuntu PD 0:00 1     (ReqNodeNotAvail, May be reserved for other job)

Is this the expected behavior and, if so, why?

Thanks,
Tim Donahue
MIT / BU / MassOpenCloud
