One addendum that I forgot to include: If I manually power up all the nodes that are marked as "~idle", SLURM starts running the jobs queued up in defq.
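As background for what follows: manually powering the nodes up sidesteps SLURM's power-save machinery entirely, which is driven by a handful of slurm.conf parameters. Here's a sketch of the relevant knobs in SLURM 2.3 (the script paths and values below are illustrative, not my actual config):

```ini
# slurm.conf power-save settings (illustrative values, not from this cluster)
SuspendProgram=/path/to/poweroff.sh   # hypothetical script invoked to power nodes down
ResumeProgram=/path/to/poweron.sh     # hypothetical script invoked to power nodes up
SuspendTime=600       # seconds a node must be idle before it is powered down
ResumeTimeout=300     # seconds a node may take to boot before slurmctld marks it DOWN
ReturnToService=1     # allow a DOWN node to return to service when it re-registers
```

In particular, a node that takes longer than ResumeTimeout to boot gets marked DOWN, which is relevant to the scenario below.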
This makes it seem like SLURM has decided not to power up the nodes for some reason. ...or perhaps something like this could have happened:

1. 1000 jobs get queued up.
2. SLURM powers on all the nodes.
3. For some reason, the nodes are slow to boot (e.g., they hit the filesystem's maximum-mount-count check, and therefore run fsck during boot).
4. In this morning's example (i.e., it happened last night), 4 nodes came up before SLURM timed them out, so SLURM started running jobs on them.
5. The remaining 28 nodes are marked as down/failed to boot, and SLURM doesn't launch jobs on them.
6. But eventually the 28 nodes *do* boot, and SLURM recognizes them as "up".
7. ...but SLURM doesn't launch any jobs on them for some reason.
8. SLURM eventually sees that these nodes are idle, and powers them down.

That seems like a bit of a stretch, but it could be possible...?

On Mar 26, 2013, at 8:08 AM, Jeff Squyres <[email protected]> wrote:

> I am using the Bright cluster manager version 6.0, which uses SLURM v2.3.4.
>
> I'm seeing an odd issue: I have many jobs queued up, but SLURM has decided to
> power down most of my nodes and mark them as "~idle".
>
> The short version is that I have multiple partitions and multiple different
> types of servers in my cluster, and I have SLURM's power control enabled to
> power off my servers when they're not in use. However, I have seen SLURM
> mark nodes as "~idle" (i.e., idle and powered off) while there are lots of
> jobs in the queue. For example, this morning, I see about 1000 jobs queued
> up in "defq" (my default partition), but only 4 nodes (out of 32) are powered
> up and running jobs from that queue.
> The remaining 28 are marked as "~idle":
>
> Here's some of the details:
>
> -----
> [root@savbu-usnic-a ~]# srun --version
> slurm 2.3.4
> [root@savbu-usnic-a ~]# squeue | head
>   JOBID PARTITION     NAME     USER  ST       TIME  NODES NODELIST(REASON)
>   82646      defq Run imb-  mpiteam  PD       0:00      2 (Priority)
>   82647      defq Run netp  mpiteam  PD       0:00      2 (Priority)
>   82649      defq Run triv  mpiteam  PD       0:00      2 (Priority)
>   82650      defq Run inte  mpiteam  PD       0:00      2 (Priority)
>   82651      defq Run ibm   mpiteam  PD       0:00      2 (Priority)
>   82652      defq Run ones  mpiteam  PD       0:00      2 (Priority)
>   82653      defq Run mpic  mpiteam  PD       0:00      2 (Priority)
>   82654      defq Run mpi-  mpiteam  PD       0:00      2 (Priority)
>   82655      defq Run java  mpiteam  PD       0:00      2 (Priority)
> [root@savbu-usnic-a ~]# squeue | grep Resour
>   82642      defq Run mpic  mpiteam  PD       0:00      2 (Resources)
> [root@savbu-usnic-a ~]# sinfo
> PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
> defq*        up   infinite     28  idle~ node[001-011,014-015,018-032]
> defq*        up   infinite      4  alloc node[012-013,016-017]
> eurompi      up   infinite      0    n/a
> infiniban    up   infinite     38  idle~ dell[001-016,022-043]
> [root@savbu-usnic-a ~]#
> -----
>
> Do you still answer questions about SLURM v2.3.4? (upgrading is not really
> an option, since Bright controls my entire SLURM setup)
>
> Thanks.
>
> --
> Jeff Squyres
> [email protected]
> For corporate legal information go to:
> http://lists.schedmd.com/cgi-bin/dada/mail.cgi/r/slurmdev/163081008172/

--
Jeff Squyres
[email protected]
For corporate legal information go to:
http://lists.schedmd.com/cgi-bin/dada/mail.cgi/r/slurmdev/163081008172/
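For anyone skimming the sinfo output above: the suffix character on the STATE field encodes the power state ("~" means powered down by power save, "#" means powering up, "*" means not responding). A quick sketch of that mapping (the helper function is mine, not a SLURM tool):

```python
# Hypothetical helper (not part of SLURM): split a sinfo STATE string like
# "idle~" into its base state and the meaning of its suffix flag, if any.
SUFFIX_MEANING = {
    "~": "powered down by SLURM power save",
    "#": "powering up",
    "*": "not responding",
}

def decode_state(state):
    """Return (base_state, suffix_meaning_or_None) for a sinfo STATE field."""
    if state and state[-1] in SUFFIX_MEANING:
        return state[:-1], SUFFIX_MEANING[state[-1]]
    return state, None
```

So the "28 idle~" line above means those 28 nodes are idle *and* powered off, which is exactly the problem.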
