A user has an array job made up of very long-running tasks. I want to limit
the number of nodes he can use at a time (or the number of concurrent jobs,
or the number of CPUs).
$ scontrol show config
....
AccountingStorageEnforce = associations,limits,qos
....
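As far as I can tell enforcement itself is turned on: the "limits" flag in
that setting is what tells slurmctld to enforce association and QOS limits.
In slurm.conf it's just:

AccountingStorageEnforce=associations,limits,qos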
$ sacctmgr list assoc account=awhitehegrp user=nreid
(empty columns trimmed)
   Cluster    Account       User  Partition  Share GrpJobs GrpNodes GrpCPUs MaxJobs MaxNodes         QOS
---------- ---------- ---------- ---------- ------ ------- -------- ------- ------- -------- -----------
      farm awhiteheg+      nreid        low      1      48        2      48      48        2 noah,normal
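(For completeness, association limits like those are set through sacctmgr; a
sketch of the command, assuming the usual where/set syntax rather than
quoting my actual history:

$ sacctmgr modify user nreid where account=awhitehegrp partition=low \
    set GrpNodes=2 GrpCPUs=48 GrpJobs=48 MaxNodes=2 MaxJobs=48
)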
When he submits to the low partition, none of those limits are enforced; his
jobs take over everything available.
Then I tried creating a QOS for him and putting per-user limits on it:
$ sacctmgr modify qos noah priority=1 MaxNodesPerUser=2 MaxCpusPerUser=48 \
    MaxJobsPerUser=48
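(Reading the settings back should be possible with something like the
following, assuming the format field names match the option names:

$ sacctmgr show qos noah \
    format=Name,Priority,MaxJobsPerUser,MaxNodesPerUser,MaxCpusPerUser
)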
The script I'm testing with (as the user):
#!/bin/bash -l
#SBATCH -J test
#SBATCH -o test_out_%j.txt
#SBATCH -e test_err_%j.txt
#SBATCH --array=1-1000
#SBATCH --qos=noah
#SBATCH --partition=low
sleep 180
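I submit it with a plain sbatch, since all the options are in the header:

$ sbatch test.sh

which queues 1000 three-minute tasks.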
He still takes over everything available.
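(Easy to see with something like

$ squeue -u nreid -t RUNNING -h | wc -l

which counts his running array tasks.)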
The environment:
$ uname -a
Linux nas-9-0 3.8.0-23-generic #34-Ubuntu SMP Wed May 29 20:22:58 UTC 2013
x86_64 x86_64 x86_64 GNU/Linux
VERSION="13.10, Saucy Salamander"
slurm-2.6.2
Any suggestions?
Thanks,
Terri Knight