[slurm-dev] Trying to restart slurm from scratch

Andy Riebs Tue, 06 Jan 2015 13:55:47 -0800

Hi,

We've got an RHEL 6.5 system running Slurm 14.11.1. After hundreds of thousands of test runs, we're trying to clean it up for real users. We shut down Slurm, removed the SaveState files and the SlurmdSpoolDir files, but when we come back up, and set the partitions to up, jobs are held with "resource not available". With slurmctld -vvvvvvv, this is what we see:

[2015-01-06T21:42:38.293] debug2: found 96 usable nodes from config containing noden[00-95]

[2015-01-06T21:42:38.293] debug3: _pick_best_nodes: job 2 idle_nodes 96 share_nodes 96

[2015-01-06T21:42:38.295] debug3: JobId=2 required nodes not avail
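
For anyone scanning a long slurmctld log for the same symptom, here is a minimal sketch in Python that pulls out the jobs reporting "required nodes not avail". It assumes log lines in exactly the format quoted above. The function name jobs_missing_nodes and the sample list are made up for illustration and are not part of Slurm.

```python
import re

# Matches lines such as:
# [2015-01-06T21:42:38.295] debug3: JobId=2 required nodes not avail
# The line layout is an assumption based on the excerpt quoted in this message.
PATTERN = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+debug\d*:\s+JobId=(?P<job>\d+) required nodes not avail"
)


def jobs_missing_nodes(lines):
    """Return (timestamp, job_id) pairs for each matching log line."""
    hits = []
    for line in lines:
        match = PATTERN.match(line.strip())
        if match:
            hits.append((match.group("ts"), int(match.group("job"))))
    return hits


# The three log lines quoted in this message.
sample = [
    "[2015-01-06T21:42:38.293] debug2: found 96 usable nodes from config containing noden[00-95]",
    "[2015-01-06T21:42:38.293] debug3: _pick_best_nodes: job 2 idle_nodes 96 share_nodes 96",
    "[2015-01-06T21:42:38.295] debug3: JobId=2 required nodes not avail",
]

print(jobs_missing_nodes(sample))
```

To run it against a real log, pass an open file object instead of the sample list; the log path depends on the SlurmctldLogFile setting at your site.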

Even a simple "srun -N1 hostname" hits this. We ARE using slurmdbd with mysql, but assuming that this would only impact accounting results. Any guidance on what we should be doing to reset the world?


Andy

--
Andy Riebs
Hewlett-Packard Company
High Performance Computing
+1 404 648 9024
My opinions are not necessarily those of HP
