One addendum that I forgot to include:

If I manually power up all the nodes that are marked as "~idle", SLURM starts 
running the jobs queued up in defq.

This makes it seem like SLURM has decided not to power up the nodes for some 
reason.

...or perhaps something like this could have happened:

1. 1000 jobs get queued up.
2. SLURM powers on all the nodes.
3. For some reason, the nodes are slow to boot (e.g., they hit the filesystem's 
max-mount-count check and therefore run a full fsck during boot).
4. In this morning's example (i.e., it happened last night), 4 nodes came up 
before SLURM timed them out, so SLURM started running jobs on them.
5. The remaining 28 nodes are marked as down/failed to boot, and SLURM doesn't 
launch jobs on them.
6. But eventually the 28 nodes *do* boot, and SLURM recognizes them as "up".
7. ...but SLURM doesn't launch any jobs on them for some reason.
8. SLURM eventually sees that these nodes are idle, and powers them down.

That seems like a bit of a stretch, but it could be possible...?  
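If that timeout hypothesis is right, the knobs that would govern it are SLURM's 
power-save parameters in slurm.conf.  Here's a sketch of the relevant settings 
(parameter names are from the SLURM power-save docs; the values and script 
paths are illustrative guesses, not my actual Bright-managed config):

-----
# slurm.conf power-save settings (illustrative values only)
SuspendTime=600          # seconds idle before a node is powered down
SuspendProgram=/usr/local/sbin/node_poweroff   # hypothetical path
ResumeProgram=/usr/local/sbin/node_poweron     # hypothetical path
ResumeTimeout=300        # if a node isn't up this many seconds after a
                         # resume request, SLURM marks it DOWN -- a long
                         # fsck could easily blow past this
ReturnToService=1        # allow a node marked DOWN for being unresponsive
                         # to return to service once it registers cleanly
-----

In particular, a ResumeTimeout shorter than a worst-case fsck would explain 
step 5, and the ReturnToService setting would affect whether the late-booting 
nodes ever become schedulable again (steps 6-7).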


On Mar 26, 2013, at 8:08 AM, Jeff Squyres <[email protected]> wrote:

> I am using the Bright cluster manager version 6.0, which uses SLURM v2.3.4.
> 
> I'm seeing an odd issue: I have many jobs queued up, but SLURM has decided to 
> power down most of my nodes and mark them as "~idle".
> 
> The short version is that I have multiple partitions and multiple different 
> types of servers in my cluster, and I have SLURM's power control enabled to 
> power off my servers when they're not in use.  However, I have seen SLURM 
> mark nodes as "~idle" (i.e., idle and powered off) while there are lots of 
> jobs in the queue.  For example, this morning, I see about 1000 jobs queued 
> up in "defq" (my default partition), but only 4 nodes (out of 32) are powered 
> up and running jobs from that queue.  The remaining 28 are marked as "~idle":
> 
> Here are some of the details:
> 
> -----
> [root@savbu-usnic-a ~]# srun --version
> slurm 2.3.4
> [root@savbu-usnic-a ~]# squeue | head
>  JOBID PARTITION     NAME     USER  ST       TIME  NODES NODELIST(REASON)
>  82646      defq Run imb-  mpiteam  PD       0:00      2 (Priority)
>  82647      defq Run netp  mpiteam  PD       0:00      2 (Priority)
>  82649      defq Run triv  mpiteam  PD       0:00      2 (Priority)
>  82650      defq Run inte  mpiteam  PD       0:00      2 (Priority)
>  82651      defq Run ibm   mpiteam  PD       0:00      2 (Priority)
>  82652      defq Run ones  mpiteam  PD       0:00      2 (Priority)
>  82653      defq Run mpic  mpiteam  PD       0:00      2 (Priority)
>  82654      defq Run mpi-  mpiteam  PD       0:00      2 (Priority)
>  82655      defq Run java  mpiteam  PD       0:00      2 (Priority)
> [root@savbu-usnic-a ~]# squeue | grep Resour
>  82642      defq Run mpic  mpiteam  PD       0:00      2 (Resources)
> [root@savbu-usnic-a ~]# sinfo
> PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
> defq*        up   infinite     28  idle~ node[001-011,014-015,018-032]
> defq*        up   infinite      4  alloc node[012-013,016-017]
> eurompi      up   infinite      0    n/a 
> infiniban    up   infinite     38  idle~ dell[001-016,022-043]
> [root@savbu-usnic-a ~]# 
> -----
> 
> Do you still answer questions about SLURM v2.3.4?  (upgrading is not really 
> an option, since Bright controls my entire SLURM setup)
> 
> Thanks.
> 
> -- 
> Jeff Squyres
> [email protected]
> For corporate legal information go to: 
> http://lists.schedmd.com/cgi-bin/dada/mail.cgi/r/slurmdev/163081008172/
> 


-- 
Jeff Squyres
[email protected]
