Hi,

I don't understand why the node is reported as busy after the job is launched.

slurm 2.6.0-0pre1

grep 86742 /slurmdb/log/Slurmctld.log
[2013-06-03T09:45:46+03:00] _slurm_rpc_submit_batch_job JobId=86742 usec=589
[2013-06-03T09:46:12+03:00] backfill: Started JobId=86742 on c196
[2013-06-03T09:46:14+03:00] _slurm_rpc_job_step_create for job 86742: Requested nodes are busy
[2013-06-03T09:47:14+03:00] _slurm_rpc_job_step_create for job 86742: Requested nodes are busy

c196:
scontrol show node c196
NodeName=c196 Arch=x86_64 CoresPerSocket=8
   CPUAlloc=16 CPUErr=0 CPUTot=16 CPULoad=4.40 Features=(null)
   Gres=(null)
   NodeAddr=c196 NodeHostName=c196
   OS=Linux RealMemory=64000 Sockets=2 Boards=1
   State=ALLOCATED ThreadsPerCore=1 TmpDisk=1800000 Weight=10
   BootTime=2013-05-16T17:45:13 SlurmdStartTime=2013-05-16T17:46:48
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0

job error file:
srun: mem < mem-per-cpu - resizing mem to be equal to mem-per-cpu
srun: Job step creation temporarily disabled, retrying

c196 Slurmd.log
[2013-06-03T09:46:12+03:00] Launching batch job 86742 for UID 18991
[2013-06-03T09:46:12+03:00] Job accounting gather LINUX plugin loaded
[2013-06-03T09:46:12+03:00] switch NONE plugin loaded
[2013-06-03T09:46:12+03:00] Received cpu frequency information for 16 cpus
[2013-06-03T09:46:12+03:00] [86742] task/cgroup: loaded
[2013-06-03T09:46:12+03:00] [86742] Checkpoint plugin loaded: checkpoint/none
[2013-06-03T09:46:12+03:00] [86742] debug level = 2
[2013-06-03T09:46:12+03:00] [86742] task 0 (18044) started 2013-06-03T09:46:12+03:00
[2013-06-03T09:46:12+03:00] [86742] AcctGatherEnergy NONE plugin loaded
[2013-06-03T09:46:42+03:00] Launching batch job 86743 for UID 18991
[2013-06-03T09:46:42+03:00] Job accounting gather LINUX plugin loaded
[2013-06-03T09:46:42+03:00] switch NONE plugin loaded
[2013-06-03T09:46:42+03:00] Received cpu frequency information for 16 cpus
[2013-06-03T09:46:42+03:00] [86743] task/cgroup: loaded
[2013-06-03T09:46:42+03:00] [86743] Checkpoint plugin loaded: checkpoint/none
[2013-06-03T09:46:42+03:00] [86743] debug level = 2
[2013-06-03T09:46:42+03:00] [86743] task 0 (18462) started 2013-06-03T09:46:42+03:00
[2013-06-03T09:46:42+03:00] [86743] AcctGatherEnergy NONE plugin loaded
[2013-06-03T10:02:17+03:00] [86743] auth plugin for Munge (http://code.google.com/p/munge/) loaded
[2013-06-03T10:02:17+03:00] [86742] auth plugin for Munge (http://code.google.com/p/munge/) loaded

18041 ?        Sl     0:00 slurmstepd: [86742]
18044 ?        S      0:00 /bin/bash /slurmdb/tmp/slurmd/job86742/slurm_script
18450 ?        S      0:00 srun blastn -num_alignments 5 -num_threads 6 -query /wrk/user/pb_20506_tmpdir/pb_chunk_00005.fasta -db nt -out /wrk/user/pb_20506_tmpdir/pb_chunk_00005.fasta.result

#SBATCH -t 24:00:00
#SBATCH -n 6
#SBATCH -p parallel
#SBATCH --nodes 1
#SBATCH --ntasks 1
#SBATCH --cpus-per-task=6
#SBATCH --mem 16000

Best Regards,
Tommi
