On Thu, 7 May 2015 04:01:25 AM Igor Kozin wrote:

> My real question is why running
> salloc --mem-per-cpu=1000 --ntasks=1 bash
> does not create cgroups and therefore gets you an unlimited interactive
> session?

My understanding is that salloc will give you a session on the same node
you run it, and you then need to use srun to launch a process on the
assigned compute node (and thus into the relevant control group).

To demonstrate, here is an example from one of our systems (Slurm
14.03.11), first just running hostname in salloc so you can see the
shell is on the same node:

[samuel@merri ~]$ salloc hostname
salloc: Pending job allocation 2096414
salloc: job 2096414 queued and waiting for resources
salloc: job 2096414 has been allocated resources
salloc: Granted job allocation 2096414
merri
salloc: Relinquishing job allocation 2096414
[samuel@merri ~]$

Now running hostname with srun inside salloc to show it appears on the
compute node instead:

[samuel@merri ~]$ salloc srun hostname
salloc: Pending job allocation 2096415
salloc: job 2096415 queued and waiting for resources
salloc: job 2096415 has been allocated resources
salloc: Granted job allocation 2096415
Scratch directory /scratch/merri/jobs/2096415 has been allocated
merri009
salloc: Relinquishing job allocation 2096415

Now to demonstrate that the one on the login node has (as expected) no
cgroup whilst the one run with srun does run inside a cgroup:

[samuel@merri ~]$ salloc cat /proc/self/cpuset
salloc: Pending job allocation 2096416
salloc: job 2096416 queued and waiting for resources
salloc: job 2096416 has been allocated resources
salloc: Granted job allocation 2096416
/
salloc: Relinquishing job allocation 2096416
salloc: Job allocation 2096416 has been revoked.
[samuel@merri ~]$
[samuel@merri ~]$ salloc srun cat /proc/self/cpuset
salloc: Pending job allocation 2096417
salloc: job 2096417 queued and waiting for resources
salloc: job 2096417 has been allocated resources
salloc: Granted job allocation 2096417
Scratch directory /scratch/merri/jobs/2096417 has been allocated
/slurm/uid_500/job_2096417/step_0
salloc: Relinquishing job allocation 2096417
salloc: Job allocation 2096417 has been revoked.
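
If you want a shell to report for itself whether it is confined, something
along these lines might do. It is only a sketch: in_slurm_cgroup is a
made-up helper name, and it assumes that job cpusets contain a /job_<id>
component as in the srun output above, which may differ depending on how
the cgroup plugins are configured at your site.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): succeed if the given cpuset
# path looks like a Slurm job cgroup.
# Assumption: job cpusets contain a "/job_<id>" component, as in the
# /slurm/uid_500/job_2096417/step_0 path shown above.
in_slurm_cgroup() {
    case "$1" in
        */job_[0-9]*) return 0 ;;
        *)            return 1 ;;
    esac
}

# Read the cpuset of the current process tree; fall back to "/" if the
# file is not available on this system.
cpuset=$(cat /proc/self/cpuset 2>/dev/null || echo /)

if in_slurm_cgroup "$cpuset"; then
    echo "confined: $cpuset"
else
    echo "not confined: $cpuset"
fi
```

Run directly under salloc this should take the "not confined" branch, and
under salloc srun the "confined" one, provided the assumption about the
path layout holds.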

Hope that helps!

All the best,
Chris
-- 
 Christopher Samuel        Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: [email protected] Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/      http://twitter.com/vlsci
