Hello Markus,

That section in my conf reads like:

# SCHEDULING
#DefMemPerCPU=0
FastSchedule=1
#MaxMemPerCPU=0
#SchedulerRootFilter=1
#SchedulerTimeSlice=30
SchedulerType=sched/backfill
SchedulerPort=7321
SelectType=select/cons_res
SelectTypeParameters=CR_Core

so I don't say anything about memory. Is that bad?

Further, I have Slurm version 14.03.9, as per the Debian package. Is that too old?

---david

On Tue, Dec 6, 2016 at 11:53 AM, Markus Koeberl <markus.koeb...@tugraz.at> wrote:
> On Tuesday 06 December 2016 10:49:33 David van Leeuwen wrote:
>>
>> Hello,
>>
>> I can't understand why jobs---even without asking for GPU
>> resources---don't get scheduled. I must have something fundamentally
>> wrong in the configuration. Maybe someone can help.
>>
>> I have 2 machines (physically different apparatuses---I suppose in
>> SLURM parlance this is a node, but I am not sure about that), with
>> resp. 1 and 2 GPUs, and each with (I believe) 6 hyperthreaded CPUs.
>>
>> I would like to be able to schedule either normal CPU jobs
>> (gres=gpu:0) at a granularity 1 job / CPU (so that I can run 12 jobs
>> in parallel), or GPU jobs (gres=gpu:1) at a granularity 1 job / GPU,
>> requiring additionally 1 CPU, so that I can run 3 GPU jobs in
>> parallel. In that case, there should be still room for 9
>> single-threaded jobs on the cluster (well, maybe not a cluster, but
>> rather a binary system).
>>
>> But in trying to tell SLURM about the gpu's, it has stopped completely
>> scheduling jobs. Even jobs where I don't even want a GPU. Slurm
>> claims that "gres/gpu count too low (0 < 1)"---but I have no clue as
>> to what the 0 and the 1 refer to
>> (claimed/detected/physical/reserved/available/required gpus?).
>>
>> # grep gpu /etc/slurm-llnl/slurm.conf
>>
>> GresTypes=gpu
>>
>> NodeName=deep-novo-1 RealMemory=32145 CPUS=12 Sockets=1
>> CoresPerSocket=6 ThreadsPerCore=2 State=UNKNOWN Gres=gpu:1
>>
>> NodeName=deep-novo-2 RealMemory=129105 CPUS=12 Sockets=1
>> CoresPerSocket=6 ThreadsPerCore=2 State=UNKNOWN Gres=gpu:2
>
> here it is working using slurm 16.05
> do you have these settings defined in your /etc/slurm-llnl/slurm.conf?
>
> DefMemPerCPU=1000
> SchedulerType=sched/backfill
> SelectType=select/cons_res
> SelectTypeParameters=CR_Core_Memory
>
>
> regards
> Markus Köberl
> --
> Markus Koeberl
> Graz University of Technology
> Signal Processing and Speech Communication Laboratory
> E-mail: markus.koeb...@tugraz.at

--
David van Leeuwen