[slurm-dev] prioritize based on walltime request
Hi, How can we configure Slurm so that the job with the shortest requested wall time runs first in the queue? None of the priority settings in slurm.conf appear to relate to wall time. We have one queue with a maximum wall time of 1 hour, and we would like a job that requests 30 minutes to run before a job that requests the full hour. Thanks in advance. Steven.
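[Editor's note] As far as I know, the multifactor priority plugin has no factor based on the requested time limit directly; the usual approaches are the backfill scheduler (which lets short jobs jump ahead when they fit into a scheduling gap, indirectly rewarding short --time requests) and PriorityFavorSmall (which favors small jobs by node/CPU count, not wall time). A minimal slurm.conf sketch, not a drop-in config:

```
# slurm.conf -- sketch only, assumes the multifactor priority plugin.
# Backfill starts lower-priority jobs early if doing so does not delay
# the expected start of higher-priority jobs; short --time requests
# fit into gaps more easily.
SchedulerType=sched/backfill
PriorityType=priority/multifactor
# Favors small jobs by size (nodes/CPUs), NOT by requested walltime.
PriorityFavorSmall=YES
```

With this setup, users who request an accurate, short wall time benefit from backfill even though the priority value itself is not walltime-based.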
[slurm-dev] Re: set maximum CPU usage per user
Is MaxTRESPerUser a better option to use? Steven.

On 10/20/16 10:21 AM, Steven Lo wrote:

Hi Benjamin, We have the following set in slurm.conf as you suggested:

AccountingStorageEnforce=limits,qos
PriorityWeightQOS=1000

And we did:

sacctmgr modify qos normal set Grpcpus=300
sacctmgr show qos format=GrpTRES
GrpTRES
-
cpu=200

I see that when I submit a job requesting over 200 CPUs, the job gets blocked, which is good. However, when I submit a job requesting just a few CPUs, the job gets blocked as well:

[slurm-testing]$ squeue
JOBID PARTITION     NAME     USER ST  TIME NODES NODELIST(REASON)
 2064     debug hello_pa      slo PD  0:00    10 (QOSGrpCpuLimit)
 2065  smallmem c_gth-dz jmcclain PD  0:00     6 (QOSGrpCpuLimit)

Do you know why it thinks the job is over the 200-CPU limit? Is there another setting we need? Thanks, Steven.

On 10/20/16 2:13 AM, Benjamin Redling wrote:

Hi Steven, On 10/20/2016 00:22, Steven Lo wrote: "We have the attribute commented out: #AccountingStorageEnforce=0" I think the best is to (re)visit "Accounting and Resource Limits": http://slurm.schedmd.com/accounting.html Right now I have no setup that needs accounting, but as far as I currently understand you'll need AccountingStorageEnforce=limits,qos to get your examples to work. And just in case you didn't set it already: for QOS (http://slurm.schedmd.com/qos.html) the "PriorityWeightQOS" configuration parameter must be defined in the slurm.conf file and assigned an integer value greater than zero. What I am unsure about -- especially not knowing your config -- is whether there are any other unmet dependencies. It would be nice if somebody with real experience with accounting could confirm or give a pointer. Regards, Benjamin
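[Editor's note] MaxTRESPerUser does fit this use case: GrpTRES is a shared pool counted across all jobs of all users in the QOS (which is why an unrelated user's few-CPU job can hit QOSGrpCpuLimit once the pool is exhausted), whereas MaxTRESPerUser caps each user individually. A hedged sketch of the sacctmgr commands, assuming the QOS is named "normal" and slurmdbd accounting is in place:

```
# Cap each individual user at 200 CPUs within this QOS:
sacctmgr modify qos normal set MaxTRESPerUser=cpu=200

# Optionally clear the group-wide pool if it is no longer wanted
# (setting a TRES limit to -1 removes it):
sacctmgr modify qos normal set GrpTRES=cpu=-1

# Verify the resulting limits:
sacctmgr show qos format=Name,GrpTRES,MaxTRESPerUser
```

Note that Grpcpus is the older per-resource form of the same group limit that GrpTRES=cpu=N expresses, so the two settings shown in the thread above are touching the same counter.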
[slurm-dev] Wrong behaviour of "--tasks-per-node" flag
Hi all, I am having the weirdest error ever. I am pretty sure this is a bug. I have reproduced the error on the latest Slurm commit (slurm 17.02.0-0pre2, commit 406d3fe429ef6b694f30e19f69acf989e65d7509) and on the slurm 16.05.5 branch. It does NOT happen in slurm 15.08.12. My cluster is composed of 8 nodes, each with 2 sockets, each with 8 cores. The relevant slurm.conf content is:

SchedulerType=sched/backfill
SchedulerPort=7321
SelectType=select/linear
#DEDICATED NODES
NodeName=acme[11-14,21-24] CPUs=16 Sockets=2 CoresPerSocket=8 ThreadsPerCore=1 State=UNKNOWN

I am running a simple hello-world parallel code, submitted as "sbatch --ntasks=X --tasks-per-node=Y myScript.sh". The problem is that, depending on the values of X and Y, Slurm performs a wrong operation and returns an error:

sbatch --ntasks=8 --tasks-per-node=2 myScript.sh
srun: Warning: can't honor --ntasks-per-node set to 2 which doesn't match the requested tasks 4 with the number of requested nodes 4. Ignoring --ntasks-per-node.

Note that I requested 8 tasks, not 4, and I did not request any number of nodes. The same happens with:

sbatch --ntasks=16 --tasks-per-node=2 myScript.sh
srun: Warning: can't honor --ntasks-per-node set to 2 which doesn't match the requested tasks 8 with the number of requested nodes 8. Ignoring --ntasks-per-node.

and:

sbatch --ntasks=32 --tasks-per-node=4 myScript.sh
srun: Warning: can't honor --ntasks-per-node set to 4 which doesn't match the requested tasks 8 with the number of requested nodes 8. Ignoring --ntasks-per-node.

All other configurations work correctly and return no error. In particular, I have tried the following (ntasks, tasks-per-node) combinations with no problem: (1,1) (2,1), (2,2) (4,1), (4,2), (4,4) (8,1), (4,4), (8,8) (16,4), (16,8), (16,16) (32,8), (32,16) (64,8), (64,16) (128,16). As said, this does not happen when executing the very same commands and scripts with slurm 15.08.12. So, have you had any similar experiences?
Is this a bug, a desired behaviour, or am I doing something wrong? Thanks for your help. Best regards, Manuel
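[Editor's note] One hedged workaround sketch, assuming the warning comes from srun re-deriving a node count inside the allocation rather than from the sbatch request itself: make the node count explicit at submission so there is nothing left to infer (myScript.sh is the script from the report above):

```
# Make the geometry explicit and internally consistent:
# 8 tasks / 2 tasks-per-node = 4 nodes, stated up front.
sbatch --nodes=4 --ntasks=8 --ntasks-per-node=2 myScript.sh
```

This does not explain the changed behaviour between 15.08 and 16.05/17.02; it only removes the ambiguity that the warning message complains about.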
[slurm-dev] Job dependency across other partitions
Hi Slurm Folks, We have a CPU+GPU cluster with two partitions: a CPU-only partition and a GPU-only partition. The customer wants to run dependent jobs: a GPU-only job followed by a CPU-only job. Do you have a good solution for dependent jobs across the two partitions? If anyone has a good idea or suggestions, please let me know. Regards, Naoki. Naoki SHIBATA
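[Editor's note] Job dependencies in Slurm are expressed per job ID, not per partition, so --dependency works across partitions without any special configuration. A minimal sketch, assuming hypothetical partition names "gpu" and "cpu" and scripts gpu_job.sh / cpu_job.sh:

```
# Submit the GPU job first; --parsable makes sbatch print only the job ID.
gpu_jobid=$(sbatch --parsable --partition=gpu gpu_job.sh)

# Submit the CPU job to the other partition; afterok means it starts
# only once the GPU job has completed successfully (exit code 0).
sbatch --partition=cpu --dependency=afterok:${gpu_jobid} cpu_job.sh
```

Other dependency types (afterany, afternotok, singleton) work the same way if the CPU step should run regardless of the GPU job's outcome.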