Hello Mike,

Markus is absolutely right. If you request 1 core, slurm will give you a cgroup with 1 core. That does not stop the user from running x threads; however, they will all stay confined to that one core. Load is not a good indicator, since it is an indication of the (linux) run-queue utilization, and it does not care whether some cores are "overloaded" while others sit idle.

In top you can press f (fields management) and select "Last Used Cpu" to see on which core a process is running.
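If you want to check this without interactive top, ps and taskset can show the same information; a quick sketch (<PID> is a placeholder for the process you want to inspect):

  ps -L -o pid,tid,psr,pcpu,comm -p <PID>   # PSR = logical cpu each thread last ran on
  taskset -cp <PID>                         # the cpu list the process is allowed to use

If the cgroup confinement is working, taskset should print only the cpus of the job's cpuset, no matter how many threads the process starts.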
The issue you see is that OpenMP ignores the cgroup setting and takes the total core count as its default for OMP_NUM_THREADS. You probably have to set this variable by hand in your slurm script (a minimal sketch follows below the quoted message).

Best,
Paul

On Tue, 2017-04-25 at 02:32 -0700, Markus Koeberl wrote:
> On Monday 24 April 2017 22:04:49 Mike Cammilleri wrote:
> >
> > Thanks for your help on this. I've enabled the cgroups plugin with
> > these same settings:
> >
> > CgroupAutomount=yes
> > CgroupReleaseAgentDir="/etc/cgroup"
> > CgroupMountpoint=/sys/fs/cgroup
> > ConstrainCores=yes
> > ConstrainDevices=yes
> > ConstrainRAMSpace=yes
> > ConstrainSwapSpace=yes
> >
> > and put cgroup.conf in /etc for our installs.
> >
> > I can see in the slurm logging that it's reading in cgroup.conf.
> > I've loaded the new slurm.conf, restarted all slurmd processes,
> > and ran scontrol reconfigure on the submit node.
> >
> > Memory seems to not be swapping anymore; however, I'm still having
> > way too many threads get scheduled. I've tried many combinations of
> > --cpus-per-task, --ntasks, cpu_bind=threads, whatever - and nothing
> > seems to prevent any process from each having 48 threads according
> > to 'top'.
> >
> > The most interesting thing I've found is that even a single R job
> > reports 48 threads in 'top' (by pressing F in interactive mode and
> > selecting the nTH column to display). The only thing that seems to
> > limit thread usage is setting the OMP_NUM_THREADS environment
> > variable - this it will obey. But what we really need is a hard
> > limit, so that a user who thinks they're running a simple R job and
> > requesting --ntasks 6 is not actually getting 6*48 threads going at
> > once and overloading the node. 48 threads is the total number of
> > "cpus" as the machine sees it logically; it's a 24-core machine
> > with 2 threads per core.
> >
> > Any ideas? Could this be a non-slurm issue and something specific
> > to our servers (running Ubuntu 14.04 LTS)? I don't want to resort
> > to turning off hyperthreading.
>
> If it is working, all processes and threads should only be allowed to
> run on the cpus asked for and not on the others.
>
> For example:
>
> # AMD FX-8370, 8 cores, 8 threads (no hyperthreading)
> # all cpus slurm is allowed to use
> cat /sys/fs/cgroup/cpuset/slurm/cpuset.cpus
> 0-7
> # job 666554 of user with uidnumber 1044 (asked for 1 cpu)
> cat /sys/fs/cgroup/cpuset/slurm/uid_1044/job_666554/cpuset.cpus
> 0
> # all processes and threads of job 666554 can only run on cpu 0
>
> # Intel E5-1620 v3, 4 cores, 8 threads (with hyperthreading)
> # all cpus slurm is allowed to use
> cat /sys/fs/cgroup/cpuset/slurm/cpuset.cpus
> 0-7
> # job 758732 of user with uidnumber 1311 (asked for 1 cpu)
> cat /sys/fs/cgroup/cpuset/slurm/uid_1311/job_758732/cpuset.cpus
> 1,5
> # all processes and threads of job 758732 can only run on cpus 1 and 5
> # (core 1 with 2 threads)
>
> You may think of it like this: for the process hierarchy in a cgroup,
> the linux kernel runs a separate scheduler. Therefore, in theory,
> processes in one cgroup will not affect processes in another cgroup.
> Slurm creates a new cgroup for each job and, with ConstrainCores=yes,
> also pins it to cpu cores.
>
> Therefore the wrong number of processes and threads should not cause
> any problem. In your case (asking for 6 cpus with hyperthreading),
> only 12 of the 48 threads can run at the same time.
> Concerning the program:
> The program could use the information in cpuset.cpus of its cgroup, or
> the slurm environment variables, to determine how many threads it may
> run, instead of taking the total number.
>
> regards
> Markus Köberl
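To make the OMP_NUM_THREADS suggestion concrete, here is a minimal sbatch sketch. It assumes the code honors OMP_NUM_THREADS; myscript.R and the resource numbers are placeholders:

  #!/bin/bash
  #SBATCH --ntasks=1
  #SBATCH --cpus-per-task=6

  # slurm sets SLURM_CPUS_PER_TASK from --cpus-per-task; fall back to 1
  export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
  # many R builds link OpenBLAS or MKL, which read their own variables
  export OPENBLAS_NUM_THREADS=$OMP_NUM_THREADS
  export MKL_NUM_THREADS=$OMP_NUM_THREADS

  srun Rscript myscript.R

This is not the hard limit (the cgroup provides that); it just stops the runtime from spawning 48 threads that then fight over 6 cpus.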
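And a rough sketch of Markus's last point: inside the job, derive the thread count from what the cgroup actually granted rather than from the machine's total. The cpuset path follows Markus's examples (cgroup v1, as on Ubuntu 14.04) and may differ on other distributions or slurm versions:

  # the logical cpus granted to this job, e.g. "1,5"
  cat /sys/fs/cgroup/cpuset/slurm/uid_$(id -u)/job_${SLURM_JOB_ID}/cpuset.cpus

  # nproc counts only the cpus the current process may run on, so
  # inside the cgroup it already reports the granted number
  export OMP_NUM_THREADS=$(nproc)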