Re: [slurm-users] wckey specification error
Thanks, Trevor, for pointing out that there is an option for this in slurm.conf. Although I previously grepped for *wc* and found nothing, the correct name is TrackWCKey, which is set to "yes" by default. After setting that to "no", the error disappeared.

About the comments on Rocks and the Slurm roll: in my experience, Rocks 7 is very good, and the unofficial Slurm roll provided by Werner is also very good. They are worth a try. Although I had some experience with a manual Slurm installation on an Ubuntu cluster some years ago, the automatic installation via the roll was very nice indeed! All the commands and configurations can be extracted from the roll, so there is nothing opaque about it. Issues specific to the roll, e.g. installation, go directly to Werner; most other questions, e.g. accounting, are about Slurm itself.

Regards,
Mahmood

On Tue, May 1, 2018 at 9:35 PM, Cooper, Trevor wrote:
>
>> On May 1, 2018, at 2:58 AM, John Hearns wrote:
>>
>> Rocks 7 is now available, which is based on CentOS 7.4.
>> I hate to be uncharitable, but I am not a fan of Rocks. I speak from
>> experience, having installed my share of Rocks clusters.
>> The philosophy just does not fit in with the way I look at the world.
>>
>> Anyway, to install extra software on Rocks you need a 'Roll'. Mahmood, it
>> looks like you are using this Roll:
>> https://sourceforge.net/projects/slurm-roll/
>> It seems pretty modern as it installs Slurm 17.11.3.
>>
>>
>> On 1 May 2018 at 11:40, Chris Samuel wrote:
>> On Tuesday, 1 May 2018 2:45:21 PM AEST Mahmood Naderan wrote:
>>
>> > The wckey explanation in the manual [1] is not meaningful at the
>> > moment. Can someone explain that?
>>
>> I've never used it, but it sounds like you've configured your system to
>> require it (or perhaps Rocks has done that?).
>>
>> https://slurm.schedmd.com/wckey.html
>>
>> Good luck,
>> Chris
>> --
>> Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
>
> The slurm-roll hosted on sourceforge is developed and supported by Werner
> Saar, not by the developers of Rocks and/or other Rocks application rolls
> (e.g. SDSC).
>
> There is ample documentation on sourceforge[1] on how to configure your
> Rocks cluster to properly deploy the slurm-roll components and update your
> Slurm configuration.
>
> There is also an active discussion group for the slurm-roll on
> sourceforge[2] where Werner supports users of the slurm-roll for Rocks.
>
> While we don't use Werner's slurm-roll on our Rocks/Slurm based systems, I
> have installed it on a test system and can say that it works as
> expected/documented.
>
> In the default configuration WCKeys were NOT enabled, so this is something
> that you must have added to your Slurm configuration.
>
> If you don't need the WCKeys capability of Slurm, perhaps you could simply
> disable it in your Slurm configuration.
>
> Hope this helps,
> Trevor
>
> [1] - https://sourceforge.net/projects/slurm-roll/files/release-7.0.0-17.11.05/slurm-roll.pdf
> [2] - https://sourceforge.net/p/slurm-roll/discussion/
>
> --
> Trevor Cooper
> HPC Systems Programmer
> San Diego Supercomputer Center, UCSD
> 9500 Gilman Drive, 0505
> La Jolla, CA 92093-0505
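[Editor's note] For anyone hitting the same error: the fix described above is a single slurm.conf setting. A sketch, with the parameter name as documented in the slurm.conf man page (your surrounding configuration will differ):

```ini
; slurm.conf -- only the WCKey-related line is shown.
; With TrackWCKey=yes, Slurm expects a valid wckey on job submission
; (hence the "wckey specification error"); "no" removes that requirement.
TrackWCKey=no
```

After changing it, run `scontrol reconfigure` (or restart the daemons) so the running cluster picks up the new value.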
Re: [slurm-users] GPU / cgroup challenges
On 02/05/18 10:15, R. Paul Wiegand wrote:

> Yes, I am sure they are all the same. Typically, I just scontrol
> reconfig; however, I have also tried restarting all daemons.

Understood. Any diagnostics in the slurmd logs when trying to start a GPU job on the node?

> We are moving to 7.4 in a few weeks during our downtime. We had a QDR
> -> OFED version constraint -> Lustre client version constraint issue
> that delayed our upgrade.

I feel your pain... BTW RHEL 7.5 is out now, so you'll need that if you need current security fixes.

> Should I just wait and test after the upgrade?

Well, 17.11.6 will be out by then, and that will include a fix for a deadlock that some sites hit occasionally, so that will be worth throwing into the mix too. Do read the RELEASE_NOTES carefully though, especially if you're using slurmdbd!

All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
Re: [slurm-users] GPU / cgroup challenges
Yes, I am sure they are all the same. Typically, I just scontrol reconfig; however, I have also tried restarting all daemons.

We are moving to 7.4 in a few weeks during our downtime. We had a QDR -> OFED version constraint -> Lustre client version constraint issue that delayed our upgrade.

Should I just wait and test after the upgrade?

On Tue, May 1, 2018, 19:56 Christopher Samuel wrote:
> On 02/05/18 09:31, R. Paul Wiegand wrote:
>
> > Slurm 17.11.0 on CentOS 7.1
>
> That's quite old (on both fronts, RHEL 7.1 is from 2015), we started on
> that same Slurm release but didn't do the GPU cgroup stuff until a later
> version (17.11.3 on RHEL 7.4).
>
> I don't see anything in the NEWS file about relevant cgroup changes
> though (there is a cgroup affinity fix but that's unrelated).
>
> You do have identical slurm.conf, cgroup.conf,
> cgroup_allowed_devices_file.conf etc. on all the compute nodes too?
> Slurmd and slurmctld have both been restarted since they were
> configured?
>
> All the best,
> Chris
> --
> Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
Re: [slurm-users] GPU / cgroup challenges
On 02/05/18 09:31, R. Paul Wiegand wrote:

> Slurm 17.11.0 on CentOS 7.1

That's quite old (on both fronts, RHEL 7.1 is from 2015). We started on that same Slurm release but didn't do the GPU cgroup stuff until a later version (17.11.3 on RHEL 7.4).

I don't see anything in the NEWS file about relevant cgroup changes though (there is a cgroup affinity fix, but that's unrelated).

You do have identical slurm.conf, cgroup.conf, cgroup_allowed_devices_file.conf etc. on all the compute nodes too? Slurmd and slurmctld have both been restarted since they were configured?

All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
Re: [slurm-users] GPU / cgroup challenges
Chris,

Thanks for the correction there, that /dev/nvidia* isn't needed in [cgroup_allowed_devices_file.conf] for constraining GPU devices.

-Kevin

From: slurm-users on behalf of "R. Paul Wiegand"
Reply-To: "p...@tesseract.org", Slurm User Community List
Date: Tuesday, May 1, 2018 at 7:34 PM
To: Slurm User Community List
Subject: Re: [slurm-users] GPU / cgroup challenges

Slurm 17.11.0 on CentOS 7.1

On Tue, May 1, 2018, 19:26 Christopher Samuel wrote:

On 02/05/18 09:23, R. Paul Wiegand wrote:
> I thought including the /dev/nvidia* would whitelist those devices
> ... which seems to be the opposite of what I want, no? Or do I
> misunderstand?

No, I think you're right there; we don't have them listed and cgroups constrains it correctly (nvidia-smi says no devices when you don't request any GPUs).

Which version of Slurm are you on?

cheers,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
Re: [slurm-users] GPU / cgroup challenges
Slurm 17.11.0 on CentOS 7.1

On Tue, May 1, 2018, 19:26 Christopher Samuel wrote:
> On 02/05/18 09:23, R. Paul Wiegand wrote:
>
> > I thought including the /dev/nvidia* would whitelist those devices
> > ... which seems to be the opposite of what I want, no? Or do I
> > misunderstand?
>
> No, I think you're right there, we don't have them listed and cgroups
> constrains it correctly (nvidia-smi says no devices when you don't
> request any GPUs).
>
> Which version of Slurm are you on?
>
> cheers,
> Chris
> --
> Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
Re: [slurm-users] GPU / cgroup challenges
Thanks Chris. I do have ConstrainDevices turned on. Are the differences in your cgroup_allowed_devices_file.conf relevant in this case?

On Tue, May 1, 2018, 19:23 Christopher Samuel wrote:
> On 02/05/18 09:00, Kevin Manalo wrote:
>
> > Also, I recall appending this to the bottom of
> >
> > [cgroup_allowed_devices_file.conf]
> > ..
> > Same as yours
> > ...
> > /dev/nvidia*
> >
> > There was a SLURM bug issue that made this clear, not so much in the
> > website docs.
>
> That shouldn't be necessary, all we have for this is...
>
> The relevant line from our cgroup.conf:
>
> [...]
> # Constrain devices via cgroups (to limit access to GPUs etc)
> ConstrainDevices=yes
> [...]
>
> Our entire cgroup_allowed_devices_file.conf:
>
> /dev/null
> /dev/urandom
> /dev/zero
> /dev/sda*
> /dev/cpu/*/*
> /dev/pts/*
> /dev/ram
> /dev/random
> /dev/hfi*
>
> This is on RHEL7.
>
> --
> Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
Re: [slurm-users] GPU / cgroup challenges
Thanks Kevin!

Indeed, nvidia-smi in an interactive job tells me that I can get access to the device when I should not be able to.

I thought including the /dev/nvidia* would whitelist those devices ... which seems to be the opposite of what I want, no? Or do I misunderstand?

Thanks,
Paul

On Tue, May 1, 2018, 19:00 Kevin Manalo wrote:
> Paul,
>
> Having recently set this up, this was my test: when you make a single GPU
> request from inside an interactive run (salloc ... --gres=gpu:1 srun --pty
> bash), you should only see the GPU assigned to you via 'nvidia-smi'.
>
> When gres is unset you should see
>
> nvidia-smi
> No devices were found
>
> Otherwise, if you ask for 1 of 2, you should only see 1 device.
>
> Also, I recall appending this to the bottom of
>
> [cgroup_allowed_devices_file.conf]
> ..
> Same as yours
> ...
> /dev/nvidia*
>
> There was a SLURM bug issue that made this clear, not so much in the
> website docs.
>
> -Kevin
>
>
> On 5/1/18, 5:28 PM, "slurm-users on behalf of R. Paul Wiegand" <
> slurm-users-boun...@lists.schedmd.com on behalf of rpwieg...@gmail.com>
> wrote:
>
> Greetings,
>
> I am setting up our new GPU cluster, and I seem to have a problem
> configuring things so that the devices are properly walled off via
> cgroups. Our nodes each have two GPUs; however, if --gres is unset, or
> set to --gres=gpu:0, I can access both GPUs from inside a job.
> Moreover, if I ask for just 1 GPU and then unset the CUDA_VISIBLE_DEVICES
> environment variable, I can access both GPUs. From my
> understanding, this suggests that it is *not* being protected under
> cgroups.
>
> I've read the documentation, and I've read through a number of threads
> where people have resolved similar issues. I've tried a lot of
> configurations, but to no avail. Below I include some snippets of
> relevant (current) parameters; however, I am also attaching most of
> our full conf files.
>
> [slurm.conf]
> ProctrackType=proctrack/cgroup
> TaskPlugin=task/cgroup
> SelectType=select/cons_res
> SelectTypeParameters=CR_Core_Memory
> JobAcctGatherType=jobacct_gather/linux
> AccountingStorageTRES=gres/gpu
> GresTypes=gpu
>
> NodeName=evc1 CPUs=32 RealMemory=191917 Sockets=2 CoresPerSocket=16
> ThreadsPerCore=1 State=UNKNOWN NodeAddr=ivc1 Weight=1 Gres=gpu:2
>
> [gres.conf]
> NodeName=evc[1-10] Name=gpu File=/dev/nvidia0
> COREs=0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30
> NodeName=evc[1-10] Name=gpu File=/dev/nvidia1
> COREs=1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31
>
> [cgroup.conf]
> ConstrainDevices=yes
>
> [cgroup_allowed_devices_file.conf]
> /dev/null
> /dev/urandom
> /dev/zero
> /dev/sda*
> /dev/cpu/*/*
> /dev/pts/*
>
> Thanks,
> Paul.
Re: [slurm-users] GPU / cgroup challenges
On 02/05/18 09:00, Kevin Manalo wrote:

> Also, I recall appending this to the bottom of
>
> [cgroup_allowed_devices_file.conf]
> ..
> Same as yours
> ...
> /dev/nvidia*
>
> There was a SLURM bug issue that made this clear, not so much in the
> website docs.

That shouldn't be necessary; all we have for this is...

The relevant line from our cgroup.conf:

[...]
# Constrain devices via cgroups (to limit access to GPUs etc)
ConstrainDevices=yes
[...]

Our entire cgroup_allowed_devices_file.conf:

/dev/null
/dev/urandom
/dev/zero
/dev/sda*
/dev/cpu/*/*
/dev/pts/*
/dev/ram
/dev/random
/dev/hfi*

This is on RHEL7.

--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
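[Editor's note] Whether a job can open /dev/nvidia0 is ultimately decided by the cgroup-v1 devices controller whitelist, visible as `devices.list` inside the job's cgroup. The matching logic can be sketched as below; the sample entries and the NVIDIA character-device major number 195 are assumptions based on typical Linux setups, not anything Slurm-specific:

```python
def device_allowed(devices_list: str, dev_type: str, major: int, minor: int) -> bool:
    """Check a (type, major, minor) device against the content of a
    cgroup-v1 devices.list whitelist, e.g. "c 1:3 rwm\nc 195:0 rw".
    Entry type "a" plus "*" wildcards match everything."""
    for entry in devices_list.strip().splitlines():
        etype, numbers, _access = entry.split()
        if etype not in ("a", dev_type):
            continue
        emajor, eminor = numbers.split(":")
        if emajor not in ("*", str(major)):
            continue
        if eminor in ("*", str(minor)):
            return True
    return False

# NVIDIA GPUs are char devices with major 195; the minor is the GPU index.
whitelist = "c 1:3 rwm\nc 1:5 rwm\nc 195:0 rw"   # only GPU 0 granted
print(device_allowed(whitelist, "c", 195, 0))    # True  -> GPU 0 reachable
print(device_allowed(whitelist, "c", 195, 1))    # False -> GPU 1 blocked
```

So if a job that requested no GPUs can still run nvidia-smi successfully, its `devices.list` almost certainly contains either an `a *:* rwm` entry or explicit `c 195:* rwm` lines, which is worth checking before touching the Slurm config.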
Re: [slurm-users] GPU / cgroup challenges
Paul,

Having recently set this up, this was my test: when you make a single GPU request from inside an interactive run (salloc ... --gres=gpu:1 srun --pty bash), you should only see the GPU assigned to you via 'nvidia-smi'.

When gres is unset you should see

nvidia-smi
No devices were found

Otherwise, if you ask for 1 of 2, you should only see 1 device.

Also, I recall appending this to the bottom of

[cgroup_allowed_devices_file.conf]
..
Same as yours
...
/dev/nvidia*

There was a SLURM bug issue that made this clear, not so much in the website docs.

-Kevin


On 5/1/18, 5:28 PM, "slurm-users on behalf of R. Paul Wiegand" wrote:

Greetings,

I am setting up our new GPU cluster, and I seem to have a problem configuring things so that the devices are properly walled off via cgroups. Our nodes each have two GPUs; however, if --gres is unset, or set to --gres=gpu:0, I can access both GPUs from inside a job. Moreover, if I ask for just 1 GPU and then unset the CUDA_VISIBLE_DEVICES environment variable, I can access both GPUs. From my understanding, this suggests that it is *not* being protected under cgroups.

I've read the documentation, and I've read through a number of threads where people have resolved similar issues. I've tried a lot of configurations, but to no avail. Below I include some snippets of relevant (current) parameters; however, I am also attaching most of our full conf files.

[slurm.conf]
ProctrackType=proctrack/cgroup
TaskPlugin=task/cgroup
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
JobAcctGatherType=jobacct_gather/linux
AccountingStorageTRES=gres/gpu
GresTypes=gpu

NodeName=evc1 CPUs=32 RealMemory=191917 Sockets=2 CoresPerSocket=16 ThreadsPerCore=1 State=UNKNOWN NodeAddr=ivc1 Weight=1 Gres=gpu:2

[gres.conf]
NodeName=evc[1-10] Name=gpu File=/dev/nvidia0 COREs=0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30
NodeName=evc[1-10] Name=gpu File=/dev/nvidia1 COREs=1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31

[cgroup.conf]
ConstrainDevices=yes

[cgroup_allowed_devices_file.conf]
/dev/null
/dev/urandom
/dev/zero
/dev/sda*
/dev/cpu/*/*
/dev/pts/*

Thanks,
Paul.
[slurm-users] GPU / cgroup challenges
Greetings,

I am setting up our new GPU cluster, and I seem to have a problem configuring things so that the devices are properly walled off via cgroups. Our nodes each have two GPUs; however, if --gres is unset, or set to --gres=gpu:0, I can access both GPUs from inside a job. Moreover, if I ask for just 1 GPU and then unset the CUDA_VISIBLE_DEVICES environment variable, I can access both GPUs. From my understanding, this suggests that it is *not* being protected under cgroups.

I've read the documentation, and I've read through a number of threads where people have resolved similar issues. I've tried a lot of configurations, but to no avail. Below I include some snippets of relevant (current) parameters; however, I am also attaching most of our full conf files.

[slurm.conf]
ProctrackType=proctrack/cgroup
TaskPlugin=task/cgroup
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
JobAcctGatherType=jobacct_gather/linux
AccountingStorageTRES=gres/gpu
GresTypes=gpu

NodeName=evc1 CPUs=32 RealMemory=191917 Sockets=2 CoresPerSocket=16 ThreadsPerCore=1 State=UNKNOWN NodeAddr=ivc1 Weight=1 Gres=gpu:2

[gres.conf]
NodeName=evc[1-10] Name=gpu File=/dev/nvidia0 COREs=0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30
NodeName=evc[1-10] Name=gpu File=/dev/nvidia1 COREs=1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31

[cgroup.conf]
ConstrainDevices=yes

[cgroup_allowed_devices_file.conf]
/dev/null
/dev/urandom
/dev/zero
/dev/sda*
/dev/cpu/*/*
/dev/pts/*

Thanks,
Paul.

Attachments: cgroup_allowed_devices_file.conf, cgroup.conf, gres.conf, slurm.conf
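[Editor's note] The even/odd COREs lists in the gres.conf above follow the common round-robin core numbering, where even core IDs sit on socket 0 and odd ones on socket 1; that layout is an assumption worth verifying with `lscpu -e` on your own nodes. Generating the strings rather than typing them out avoids transcription slips:

```python
def cores_for_socket(socket: int, total_cores: int = 32, sockets: int = 2) -> str:
    """Core IDs belonging to one socket under round-robin core numbering,
    formatted for a gres.conf COREs= entry."""
    return ",".join(str(c) for c in range(socket, total_cores, sockets))

# The two lists from the gres.conf above: GPU 0 pinned to socket 0's
# cores, GPU 1 to socket 1's.
print(cores_for_socket(0))  # 0,2,4,...,30
print(cores_for_socket(1))  # 1,3,5,...,31
```

If `lscpu -e` instead shows block numbering (0-15 on socket 0, 16-31 on socket 1), the COREs lists should be contiguous ranges rather than interleaved ones.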
Re: [slurm-users] wckey specification error
> On May 1, 2018, at 2:58 AM, John Hearns wrote:
>
> Rocks 7 is now available, which is based on CentOS 7.4.
> I hate to be uncharitable, but I am not a fan of Rocks. I speak from
> experience, having installed my share of Rocks clusters.
> The philosophy just does not fit in with the way I look at the world.
>
> Anyway, to install extra software on Rocks you need a 'Roll'. Mahmood, it
> looks like you are using this Roll:
> https://sourceforge.net/projects/slurm-roll/
> It seems pretty modern as it installs Slurm 17.11.3.
>
>
> On 1 May 2018 at 11:40, Chris Samuel wrote:
> On Tuesday, 1 May 2018 2:45:21 PM AEST Mahmood Naderan wrote:
>
> > The wckey explanation in the manual [1] is not meaningful at the
> > moment. Can someone explain that?
>
> I've never used it, but it sounds like you've configured your system to
> require it (or perhaps Rocks has done that?).
>
> https://slurm.schedmd.com/wckey.html
>
> Good luck,
> Chris
> --
> Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC

The slurm-roll hosted on sourceforge is developed and supported by Werner Saar, not by the developers of Rocks and/or other Rocks application rolls (e.g. SDSC).

There is ample documentation on sourceforge[1] on how to configure your Rocks cluster to properly deploy the slurm-roll components and update your Slurm configuration.

There is also an active discussion group for the slurm-roll on sourceforge[2] where Werner supports users of the slurm-roll for Rocks.

While we don't use Werner's slurm-roll on our Rocks/Slurm based systems, I have installed it on a test system and can say that it works as expected/documented.

In the default configuration WCKeys were NOT enabled, so this is something that you must have added to your Slurm configuration.

If you don't need the WCKeys capability of Slurm, perhaps you could simply disable it in your Slurm configuration.

Hope this helps,
Trevor

[1] - https://sourceforge.net/projects/slurm-roll/files/release-7.0.0-17.11.05/slurm-roll.pdf
[2] - https://sourceforge.net/p/slurm-roll/discussion/

--
Trevor Cooper
HPC Systems Programmer
San Diego Supercomputer Center, UCSD
9500 Gilman Drive, 0505
La Jolla, CA 92093-0505
Re: [slurm-users] Jobs escaping cgroup device controls after some amount of time.
Thanks Andy,

I've been able to confirm that in my case, any jobs that ran for at least 30 minutes (puppet's run interval) would lose their cgroups, and that the time those cgroups disappear corresponds exactly with puppet runs.

I am not sure if this change of cgroup to the root is what causes the oom event that Slurm detects - I looked through src/plugins/task/cgroup/task_cgroup_memory.c and the memory cgroup documentation, and it's not clear to me what would happen if you've created the oom event listener on a specific cgroup and that cgroup disappears. But since I disabled puppet overnight, jobs running longer than 30 minutes are completing, and cgroups are persisting, whereas before that, they were not.

--nate

On Mon, Apr 30, 2018 at 5:47 PM, Andy Georges wrote:
>
>
>> On 30 Apr 2018, at 22:37, Nate Coraor wrote:
>>
>> Hi Shawn,
>>
>> I'm wondering if you're still seeing this. I've recently enabled
>> task/cgroup on 17.11.5 running on CentOS 7 and just discovered that jobs
>> are escaping their cgroups. For me this is resulting in a lot of jobs
>> ending in OUT_OF_MEMORY that shouldn't, because it appears slurmd thinks
>> the oom-killer has triggered when it hasn't. I'm not using GRES or devices,
>> only:
>
> I am not sure that you are making the correct conclusion here.
>
> There is a known cgroups issue, due to
>
> https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
>
> Relevant part:
>
> The memory controller has a long history. A request for comments for the
> memory controller was posted by Balbir Singh [1]. At the time the RFC was
> posted there were several implementations for memory control. The goal of
> the RFC was to build consensus and agreement for the minimal features
> required for memory control. The first RSS controller was posted by Balbir
> Singh [2] in Feb 2007. Pavel Emelianov [3][4][5] has since posted three
> versions of the RSS controller. At OLS, at the resource management BoF,
> everyone suggested that we handle both page cache and RSS together.
> Another request was raised to allow user space handling of OOM. The
> current memory controller is at version 6; it combines both mapped (RSS)
> and unmapped Page Cache Control [11].
>
> Are the jobs killed prematurely? If not, then you ran into the above.
>
> Kind regards,
> — Andy
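[Editor's note] One way to catch the escape Nate describes early is to check whether a job's processes are still inside a Slurm-owned cgroup: /proc/&lt;pid&gt;/cgroup lists one line per hierarchy, and a process that configuration management has moved back to the root shows "/" as its path. A minimal sketch, assuming the usual "/slurm/..." path prefix that Slurm's cgroup plugins create (the exact layout varies by version):

```python
def escaped_cgroups(proc_cgroup_text: str, marker: str = "/slurm") -> list[str]:
    """Given the content of /proc/<pid>/cgroup, return the controllers for
    which this process is NOT under a Slurm-managed cgroup.
    Each line looks like "<hierarchy-id>:<controllers>:<path>"."""
    escaped = []
    for line in proc_cgroup_text.strip().splitlines():
        _hid, controllers, path = line.split(":", 2)
        # cgroup-v2 entries ("0::<path>") have no controller list; skip them.
        if controllers and marker not in path:
            escaped.append(controllers)
    return escaped

healthy = "4:devices:/slurm/uid_1000/job_42/step_0\n2:memory:/slurm/uid_1000/job_42"
broken  = "4:devices:/\n2:memory:/slurm/uid_1000/job_42"
print(escaped_cgroups(healthy))  # []
print(escaped_cgroups(broken))   # ['devices']
```

Run over the PIDs of a long-running job step (e.g. from `scontrol listpids`), a non-empty result right after a puppet run would confirm the correlation without waiting for jobs to fail.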
Re: [slurm-users] wckey specification error
I quickly downloaded that roll and unpacked the RPMs. I cannot quite see how Slurm is configured, so to my shame I gave up (I did say that Rocks was not my thing).

On 1 May 2018 at 11:58, John Hearns wrote:
> Rocks 7 is now available, which is based on CentOS 7.4.
> I hate to be uncharitable, but I am not a fan of Rocks. I speak from
> experience, having installed my share of Rocks clusters.
> The philosophy just does not fit in with the way I look at the world.
>
> Anyway, to install extra software on Rocks you need a 'Roll'. Mahmood,
> it looks like you are using this Roll:
> https://sourceforge.net/projects/slurm-roll/
> It seems pretty modern as it installs Slurm 17.11.3.
>
>
> On 1 May 2018 at 11:40, Chris Samuel wrote:
>
>> On Tuesday, 1 May 2018 2:45:21 PM AEST Mahmood Naderan wrote:
>>
>> > The wckey explanation in the manual [1] is not meaningful at the
>> > moment. Can someone explain that?
>>
>> I've never used it, but it sounds like you've configured your system to
>> require it (or perhaps Rocks has done that?).
>>
>> https://slurm.schedmd.com/wckey.html
>>
>> Good luck,
>> Chris
>> --
>> Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
Re: [slurm-users] wckey specification error
On Tuesday, 1 May 2018 2:45:21 PM AEST Mahmood Naderan wrote:

> The wckey explanation in the manual [1] is not meaningful at the
> moment. Can someone explain that?

I've never used it, but it sounds like you've configured your system to require it (or perhaps Rocks has done that?).

https://slurm.schedmd.com/wckey.html

Good luck,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC