Re: [slurm-users] 19.05 and GPUs vs GRES

2019-09-05 Thread Christopher Samuel
On 9/5/19 3:49 PM, Bill Broadley wrote:
> I have a user with a particularly flexible code that would like to run a
> single MPI job across multiple nodes, some with 8 GPUs each, some with 2 GPUs.

Perhaps they could just specify a number of tasks with CPUs per task, mem per task and GPUs per task?
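Something like the following might work (untested sketch, assuming 19.05's new per-task GPU options; the counts and ./mpi_app are placeholders):

    #!/bin/bash
    #SBATCH --ntasks=40           # total MPI ranks across all nodes
    #SBATCH --cpus-per-task=4     # CPUs bound to each rank
    #SBATCH --mem-per-cpu=4G      # memory scales with the CPU count
    #SBATCH --gpus-per-task=1     # one GPU per rank; option new in 19.05
    srun ./mpi_app                # Slurm places ranks wherever GPUs are free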

Re: [slurm-users] 19.05 and GPUs vs GRES

2019-09-05 Thread Christopher Samuel
On 8/13/19 10:44 PM, Barbara Krašovec wrote:
> We still have the gres configuration, users have their workload scripted
> and some still use sbatch with gres. Both options work.

I missed this before Barbara, sorry - that's really good to know that the options aren't mutually exclusive, thanks!
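For what it's worth, that means both invocation styles can coexist, e.g. (sketch; either line requests two GPUs on one node):

    # traditional GRES syntax
    sbatch --gres=gpu:2 -N1 job.sh
    # new 19.05 GPU options
    sbatch --gpus-per-node=2 -N1 job.sh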

Re: [slurm-users] 19.05 and GPUs vs GRES

2019-09-05 Thread Bill Broadley
Anyone know if the new GPU support allows having a different number of GPUs per node? I found https://www.ch.cam.ac.uk/computing/slurm-usage which mentions "SLURM does not support having varying numbers of GPUs per node in a job yet." I have a user with a particularly flexible code that would like to run a single MPI job across multiple nodes, some with 8 GPUs each, some with 2 GPUs.
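One possible workaround (untested sketch) is the heterogeneous job syntax, where each colon-separated component can request a different GPU count; whether MPI spans the components cleanly will depend on the site's setup:

    # component 1: one 8-GPU node; component 2: one 2-GPU node
    sbatch --nodes=1 --gpus-per-node=8 : --nodes=1 --gpus-per-node=2 hetjob.sh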

Re: [slurm-users] slurm node weights

2019-09-05 Thread Marcus Boden
Hello Doug,

to quote the slurm.conf man page:
> It would be preferable to allocate smaller memory nodes rather than larger
> memory nodes if either will satisfy a job's requirements.

So I guess the idea is that if a smaller node satisfies all the requirements, why 'waste' a bigger one on it? It makes sense if you want to keep the larger nodes free for jobs that actually need them.

Re: [slurm-users] slurm node weights

2019-09-05 Thread Merlin Hartley
I believe this is so that small jobs will naturally go on older, slower nodes first - leaving the bigger, better ones for jobs that actually need them.

Merlin

--
Merlin Hartley
IT Support Engineer
MRC Mitochondrial Biology Unit
University of Cambridge
Cambridge, CB2 0XY
United Kingdom

Re: [slurm-users] slurm node weights

2019-09-05 Thread Brian Andrus
The intention there is to pack jobs onto the smallest node that can handle them. This way a job that only needs 1 CPU doesn't take a core from a 64-core node unless it has to, leaving that node available for a 64-core job. It really boils down to what you want to happen, which will vary with your workload.
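As a sketch of that packing policy (hypothetical node names; nodes with the lowest Weight are allocated first):

    # slurm.conf
    NodeName=small[01-10] CPUs=16 Weight=10    # preferred for small jobs
    NodeName=big[01-04]   CPUs=64 Weight=100   # kept free for wide jobs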

[slurm-users] slurm node weights

2019-09-05 Thread Douglas Duckworth
Hello,

We added some newer EPYC nodes, with NVMe scratch, to our cluster and want jobs to run on these over the others. So we added "Weight=100" to the older nodes and left the new ones blank. Indeed, ceteris paribus, srun shows that the faster nodes accept jobs ahead of the older ones.
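For reference, the setup described amounts to something like this (hypothetical node names; Weight defaults to 1, so the unweighted EPYC nodes are preferred):

    # slurm.conf
    NodeName=epyc[01-10] Weight=1     # implicit default; filled first
    NodeName=old[01-20]  Weight=100   # only used when the EPYCs are busy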