Here is another example which is (from my point of view) less confusing:

[root@host1 ~]# salloc -N 1
salloc: Granted job allocation 8
[root@host1 ~]# srun hostname
host9
[root@host1 ~]# hostname
host1
[root@host1 ~]# exit
exit
salloc: Relinquishing job allocation 8
salloc: Job allocation 8 has been revoked.
[root@host1 ~]#

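
If you want to check this from a script rather than by eye, here is a minimal sketch. It only classifies a cpuset path string, and it assumes the /slurm/uid_<uid>/job_<jobid> layout that appears in Chris's output quoted below; other sites or Slurm versions may lay out their cgroups differently. The function name classify_cpuset is my own invention, not anything Slurm provides.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): classify a cpuset path as being
# inside or outside a Slurm job cgroup.
# Assumption: job cgroups look like /slurm/uid_<uid>/job_<jobid>[/step_<n>],
# as in the quoted output; adjust the pattern for your own hierarchy.
classify_cpuset() {
    case "$1" in
        /slurm/uid_*/job_*) echo "job cgroup" ;;
        *)                  echo "no job cgroup" ;;
    esac
}

# The two paths seen in the quoted example:
classify_cpuset "/"                                  # shell started by salloc
classify_cpuset "/slurm/uid_500/job_2096417/step_0"  # process started by srun
```

Inside an allocation you could feed it live values, for example classify_cpuset "$(cat /proc/self/cpuset)" for the salloc shell and classify_cpuset "$(srun cat /proc/self/cpuset)" for a job step.
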
On 07/05/2015 13:28, Chris Samuel wrote:
> On Thu, 7 May 2015 04:01:25 AM Igor Kozin wrote:
>
>> My real question is why running
>> salloc --mem-per-cpu=1000 --ntasks=1 bash
>> does not create cgroups and therefore gets you an unlimited interactive
>> session?
> My understanding is that salloc will give you a session on the same node you
> run it, and you then need to use srun to launch a process on the assigned
> compute node (and thus into the relevant control group).
>
> To demonstrate, here is an example from one of our systems (Slurm 14.03.11),
> first just running hostname in salloc so you can see the shell is on the same
> node:
>
> [samuel@merri ~]$ salloc hostname
> salloc: Pending job allocation 2096414
> salloc: job 2096414 queued and waiting for resources
> salloc: job 2096414 has been allocated resources
> salloc: Granted job allocation 2096414
> merri
> salloc: Relinquishing job allocation 2096414
> [samuel@merri ~]$
>
>
> Now running hostname with srun inside salloc to show it appears on the compute
> node instead:
>
> [samuel@merri ~]$ salloc srun hostname
> salloc: Pending job allocation 2096415
> salloc: job 2096415 queued and waiting for resources
> salloc: job 2096415 has been allocated resources
> salloc: Granted job allocation 2096415
> Scratch directory /scratch/merri/jobs/2096415 has been allocated
> merri009
> salloc: Relinquishing job allocation 2096415
>
>
> Now to demonstrate that the one on the login node has (as expected) no cgroup
> whilst the one run with srun does run inside a cgroup:
>
> [samuel@merri ~]$ salloc cat /proc/self/cpuset
> salloc: Pending job allocation 2096416
> salloc: job 2096416 queued and waiting for resources
> salloc: job 2096416 has been allocated resources
> salloc: Granted job allocation 2096416
> /
> salloc: Relinquishing job allocation 2096416
> salloc: Job allocation 2096416 has been revoked.
> [samuel@merri ~]$
>
> [samuel@merri ~]$ salloc srun cat /proc/self/cpuset
> salloc: Pending job allocation 2096417
> salloc: job 2096417 queued and waiting for resources
> salloc: job 2096417 has been allocated resources
> salloc: Granted job allocation 2096417
> Scratch directory /scratch/merri/jobs/2096417 has been allocated
> /slurm/uid_500/job_2096417/step_0
> salloc: Relinquishing job allocation 2096417
> salloc: Job allocation 2096417 has been revoked.
> [samuel@merri ~]$
>
>
> Hope that helps!
>
> All the best,
> Chris

-- 
---
Mehdi Denou
International HPC support
+336 45 57 66 56
