[slurm-users] Re: Why is Slurm 20 the latest RPM in RHEL 8/Fedora repo?

2024-01-31 Thread Bas van der Vlies via slurm-users
needs (IB? SlingShot? Nvidia? etc.). -- -- Bas van der Vlies | High Performance Computing & Visualization | SURF| Science Park 140 | 1098 XG Amsterdam | T +31 (0) 20 800 1300 | bas.vandervl...@surf.nl | www.surf.nl | -- slurm-users mailing list -- slurm-users@lists.schedmd.com To unsubscribe

Re: [slurm-users] Partition not allowing subaccount use

2023-07-25 Thread Bas van der Vlies
ing what constitutes a subaccount? When I run "sacctmgr show assoc tree", I see that mystuff is under stuff. Or am I misreading the documentation that says subaccounts are included in who is allowed to use the partition? Thanks, Rob

[slurm-users] sbatch --uid option and slurmrestd equivalent

2023-07-12 Thread Bas van der Vlies
I have a question: can I submit jobs as the root user to the slurmrestd server and change the userid, like `sbatch --uid`? Or can this be accomplished by: * (root) scontrol token username= * (user) use this token to submit jobs as Regards Bas
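
The token approach mentioned above can be sketched as follows, assuming JWT authentication (AuthAltTypes=auth/jwt) is configured; the user name, host, and REST API version are placeholders, not taken from the thread:

```
# (root) mint a token for the target user; the lifespan value is an example
scontrol token username=someuser lifespan=3600
# prints: SLURM_JWT=eyJhb...

# (acting as that user) submit through slurmrestd with the token
export SLURM_JWT=eyJhb...
curl -s -H "X-SLURM-USER-NAME: someuser" -H "X-SLURM-USER-TOKEN: $SLURM_JWT" \
  -H "Content-Type: application/json" \
  -X POST http://slurmrestd-host:6820/slurm/v0.0.39/job/submit \
  -d '{"job":{"name":"test","current_working_directory":"/tmp","environment":["PATH=/bin"]},"script":"#!/bin/bash\nhostname"}'
```

This sidesteps `sbatch --uid` entirely: the job is submitted under the identity named in the token rather than by root changing uid.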

Re: [slurm-users] speed / efficiency of sacct vs. scontrol

2023-02-27 Thread Bas van der Vlies
we're talking about large numbers of jobs. Ward

Re: [slurm-users] Use cases for "include" in slurm.conf?

2022-09-21 Thread Bas van der Vlies
ocated adjacent to slurm.conf. Any additional config files will need to be shared a different way or added to the parent config. ``` We also used include statements before we switched to configless. Same arguments as Bjorn.
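
For readers who land here: the include mechanism under discussion looks like this in slurm.conf (the file names are illustrative), and it is exactly what configless mode constrains, since only files adjacent to slurm.conf are served to the nodes:

```
# slurm.conf -- split large sections into separate files
Include /etc/slurm/nodes.conf
Include /etc/slurm/partitions.conf
```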

[slurm-users] SLURM 22.05 and NHC in prolog/epilog

2022-08-05 Thread Bas van der Vlies
sites also have this problem? Did I miss an option? Regards

Re: [slurm-users] Sharing a GPU

2022-04-13 Thread Bas van der Vlies
Just released a new version of the plugin. Our cluster has been upgraded to 21.08.6 and the cgroups structure is different. Fixed in latest release: * Tested on 21.08 and 20.11 Regards > On 4 Apr 2022, at 09:20, Bas van der Vlies wrote: > > We have the exact same request for

Re: [slurm-users] Sharing a GPU

2022-04-04 Thread Bas van der Vlies

Re: [slurm-users] addressing NVIDIA MIG + non MIG devices in Slurm

2022-01-31 Thread Bas van der Vlies
the perfect solution - slurm.conf Node Def: MIG + non MIG Gres types -> problem: it doesn't work -> Parsing error at unrecognized key: UniqueId Thanks for reading this far. Am I missing something? How can MIG and non MIG devices be addressed in a cluster? This setup of having MIG and non MIG dev
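
For anyone hitting the same parsing error: on releases where an explicit mixed MIG/non-MIG definition fails like this, letting slurmd autodetect the devices via NVML is the commonly suggested route. A minimal sketch; the node name, GPU model, and MIG profile strings are assumptions, not from the thread:

```
# gres.conf -- autodetect whole GPUs and MIG instances through NVML
AutoDetect=nvml

# slurm.conf -- the node then advertises both gres types
NodeName=gpu01 Gres=gpu:a100:2,gpu:a100_3g.20gb:4
```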

Re: [slurm-users] SlurmDBD 20.02.7

2022-01-06 Thread Bas van der Vlies
rror Does anyone know what that means? I also found out that all *tres* fields in taurus_assoc_table are empty but not NULL. Kind regards, Danny Rotscher

Re: [slurm-users] issue with mpirun when using through slurm / pmix

2021-10-22 Thread Bas van der Vlies
I have no other solution; this was the fix at our site. > On 22 Oct 2021, at 03:19, pankajd wrote: > > thanks, but after setting PMIX_MCA_psec=native, now mpirun hangs and does not > produce any output. > > On October 21, 2021 at 9:21 PM Bas van der Vlies > wrote

Re: [slurm-users] issue with mpirun when using through slurm / pmix

2021-10-21 Thread Bas van der Vlies

Re: [slurm-users] Custom GRES not working in 21.08.2

2021-10-18 Thread Bas van der Vlies
Hi Quirin, maybe you are hitting this gres issue: https://bugs.schedmd.com/show_bug.cgi?id=12642#c27 -- Bas van der Vlies > On 17 Oct 2021, at 16:32, Quirin Lohr wrote: > > Hi, > > I just upgraded from 20.11 to 21.08.2. > > Now it seems the slurmd cannot handle my custom GR

Re: [slurm-users] Prologslurmctld environment variables

2021-09-17 Thread Bas van der Vlies
goal is to pass information from a user's SBATCH job to the Prologslurmctld script to provision the node correctly before running the job.

Re: [slurm-users] SLURM reservations with MAGNETIC flag

2021-08-26 Thread Bas van der Vlies
) for the solution, see: * https://bugs.schedmd.com/show_bug.cgi?id=12350 Regards > On 7 Apr 2021, at 13:57, Bas van der Vlies wrote: > > > > Still have this question. Sometime we have free nodes and users that are > allowed to run in the MAGNETIC reservation are first scheduled on

Re: [slurm-users] What is an easy way to prevent users from running programs on the master/login node?

2021-05-20 Thread Bas van der Vlies
I know, but see the script: we only do this for uid > 1000. On 20/05/2021 17:29, Timo Rothenpieler wrote: You shouldn't need this script and pam_exec. You can set those limits directly in the systemd config to match every user. On 20.05.2021 16:28, Bas van der Vlies wrote: same here we
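
Timo's systemd suggestion can be sketched as a drop-in on the user slice template; the limit values are placeholders, and note that, unlike the pam_exec script above, this applies to every logind session regardless of uid:

```
# /etc/systemd/system/user-.slice.d/99-limits.conf
[Slice]
CPUQuota=100%
MemoryMax=8G
```

Template-unit drop-ins like `user-.slice.d` require a reasonably recent systemd; the pam_exec approach remains the way to restrict the limits to uid > 1000 only.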

Re: [slurm-users] What is an easy way to prevent users from running programs on the master/login node?

2021-05-20 Thread Bas van der Vlies
the login nodes, which so far worked fine. They occasionally compile stuff on the login nodes in preparation of runs, so I don't want to limit them too much.

Re: [slurm-users] job_submit_lua improvement

2021-05-11 Thread Bas van der Vlies
the patch or request the feature? And how would I do it either way? Thanks, Alexander

Re: [slurm-users] SLURM reservations with MAGNETIC flag

2021-04-07 Thread Bas van der Vlies
Still have this question. Sometimes we have free nodes, and users that are allowed to run in the MAGNETIC reservation are scheduled on the free nodes first instead of the reservation nodes. Did I forget an option, or is this the expected behavior? On 25/09/2020 16:47, Bas van der Vlies wrote

Re: [slurm-users] how to print all the key-values of "job_desc" in job_submit.lua?

2021-03-29 Thread Bas van der Vlies
userdata(L, job_desc); lua_setfield(L, -2, "_job_desc"); lua_setmetatable(L, -2); } ```

Re: [slurm-users] [External] Submitting to multiple partitions problem with gres specified

2021-03-09 Thread Bas van der Vlies
For those who are interested: * https://bugs.schedmd.com/show_bug.cgi?id=11044 On 09/03/2021 14:21, Bas van der Vlies wrote: I have found the problem and will submit a patch. If we find a partition where a job can run but all nodes are busy, save this state and return it when all partitions

Re: [slurm-users] [External] Submitting to multiple partitions problem with gres specified

2021-03-09 Thread Bas van der Vlies
On 3/8/21 11:29 AM, Bas van der Vlies wrote: Hi, On this cluster I have version 20.02.6 installed. We have different partitions for CPU types and GPU types. We want to make it easy for the user who does not care where their job runs, while the experienced user can specify the gres type: cpu_type

Re: [slurm-users] [External] Submitting to multiple partitions problem with gres specified

2021-03-09 Thread Bas van der Vlies
:29 AM, Bas van der Vlies wrote: Hi, On this cluster I have version 20.02.6 installed. We have different partitions for CPU types and GPU types. We want to make it easy for the user who does not care where their job runs, while the experienced user can specify the gres type: cpu_type or gpu I

Re: [slurm-users] Submitting to multiple partitions problem with gres specified

2021-03-08 Thread Bas van der Vlies
figuration is not available) [2021-03-08T19:46:09.378] _slurm_rpc_allocate_resources: Requested node configuration is not available ``` On 08/03/2021 17:29, Bas van der Vlies wrote: Hi, On this cluster I have version 20.02.6 installed. We have different partitions for cpu type and gpu types. we want to ma

[slurm-users] Submitting to multiple partitions problem with gres specified

2021-03-08 Thread Bas van der Vlies

Re: [slurm-users] Extremely sluggish squeue -p partition

2020-12-08 Thread Bas van der Vlies
Just a question: do you use AllowGroups for partition access? We had similar problems where squeue and other slurm commands hang while slurm fetches the groups from LDAP for each partition, even ones it had already fetched. So we configured `nscd` to prevent this.
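
The nscd workaround mentioned above amounts to enabling the group cache so repeated lookups are served locally instead of from LDAP; the TTL values here are illustrative:

```
# /etc/nscd.conf -- cache group lookups so slurm commands
# don't hit LDAP once per partition
enable-cache           group  yes
positive-time-to-live  group  3600
negative-time-to-live  group  60
```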

Re: [slurm-users] SLURM reservations with MAGNETIC flag

2020-09-25 Thread Bas van der Vlies
Thanks Troy, That is our intention also, for course/training purposes when there are course days. regards

Re: [slurm-users] SLURM reservations with MAGNETIC flag

2020-09-25 Thread Bas van der Vlies
Are people using the MAGNETIC reservation flag? My question would be how, because to me it would be more useful if the reservation is tried first and then the free nodes. That is what I expected with the MAGNETIC flag.

Re: [slurm-users] Compiling Slurm with nvml support

2020-09-25 Thread Bas van der Vlies
That is why we switched to tarball installations with version directories, as suggested by SchedMD. No deb/rpm installations any more.

[slurm-users] SLURM reservations with MAGNETIC flag

2020-09-24 Thread Bas van der Vlies
has been allocated resources ``` From this I see that the "magnetic" reservation is considered last. regards
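
For context, a magnetic reservation is created like any other reservation, just with the extra flag; jobs from the listed users are then considered for it without passing --reservation. The names and sizes below are placeholders:

```
scontrol create reservation reservationname=course starttime=now \
  duration=120 users=alice,bob nodecnt=4 flags=magnetic
```

The question in this thread is about ordering: whether the scheduler tries the magnetic reservation before or after the free nodes.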

[slurm-users] Question about Partition and AllowAccounts

2020-09-02 Thread Bas van der Vlies
ved in a later version? Or is there another solution for this? Regards,

Re: [slurm-users] Billing issue

2020-08-06 Thread Bas van der Vlies
here is this "2048" and "2" factor coming from in those two calculations? > job is using 8G --> 0.25 * 8192 = 2048 with .25G --> 8192 * 0.25/1024 = 2 Memory is always a factor of 1024: Slurm tracks memory in MiB, and a weight with a G suffix is per GiB > On Thu, 6 Aug 2020 6:46am, Bas van der Vlies wrote: > > > Il

Re: [slurm-users] Billing issue

2020-08-06 Thread Bas van der Vlies
On 06/08/20 10:00, Bas van der Vlies wrote: Tks for the answer. >> We also have nodes with GPUs (different types) and some cost more than others. > The partitions always have the same type of nodes, not mixed, e.g.: > * > TRESBillingWeights=CPU=3801.0,Mem=502246.0T,GRES/g

Re: [slurm-users] Billing issue

2020-08-06 Thread Bas van der Vlies
Hi Diego, Yes this can be tricky; we also use this feature. The billing is on partition level, so you can set different schemas. We have nodes with 16 cores and 96GB of RAM; these are the cheapest nodes in our model and cost 1 SBU (System Billing Unit). For this node we have the following
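
The arithmetic from this thread, as a runnable sketch: with a TRESBillingWeights memory weight expressed per GiB (the "G" suffix) while Slurm tracks memory in MiB, an 8 GiB job with weight Mem=0.25G bills 2, not 2048:

```
# 8 GiB job, weight Mem=0.25G; Slurm stores memory in MiB,
# so the per-GiB weight is effectively divided by 1024
mem_mib=8192
weight_per_gib=0.25
awk -v m="$mem_mib" -v w="$weight_per_gib" 'BEGIN { print m * w / 1024 }'
# prints 2; multiplying the raw MiB count by the weight gives the misleading 2048
```

This is only the memory term of the billing sum; the CPU and GRES terms are added (or the maximum taken, with PriorityFlags=MAX_TRES) the same way.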

Re: [slurm-users] SLURM Install

2020-07-14 Thread Bas van der Vlies
I'd known that two years ago, might've saved me some setting up > (if it was around two years ago). My SLURM configuration is also > CFEngine3 controlled. So I'm quite interested in sharing . > > Having a look at it in a minute... > > Tina > > > On 13/07

[slurm-users] SLURM Install

2020-07-13 Thread Bas van der Vlies
/basvandervlies/cf_surfsara_lib/blob/master/doc/services.md

Re: [slurm-users] Enforcing GPU-CPU ratios

2020-06-23 Thread Bas van der Vlies
t of SLURM configuration can be used to enforce this ratio? Thank you, Durai Arasan Zentrum für Datenverarbeitung Tübingen
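
One common way to pin a CPU:GPU ratio is a per-partition default, so GPU jobs get a fixed CPU allotment unless they explicitly override it. A hedged sketch assuming a 4-CPUs-per-GPU target; the partition and node names are placeholders:

```
# slurm.conf -- jobs requesting GPUs in this partition
# are allocated 4 CPUs per GPU by default
PartitionName=gpu Nodes=gpu[01-08] DefCpuPerGPU=4
```

This sets a default rather than a hard cap; enforcing a strict maximum usually needs QOS or job_submit.lua checks on top.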

Re: [slurm-users] Do not upgrade mysql to 5.7.30!

2020-05-07 Thread Bas van der Vlies
We have a Debian Stretch and a Debian Buster cluster, both using MariaDB; no problems so far. Version 19.05.5, and we are planning to upgrade to 20.02

Re: [slurm-users] Nodelist dependent environment setup ?

2020-04-06 Thread Bas van der Vlies
was tested on my laptop using a set of docker containers using this configuration: https://github.com/SciDAS/slurm-in-docker . -- Sajid Ali | PhD Candidate Applied Physics Northwestern University s-sajid-ali.github.io

Re: [slurm-users] DefMemPerGPU bug?

2020-03-30 Thread Bas van der Vlies
test-s5 wayne.he  R       0:03      1 ucs480

Re: [slurm-users] Upgrade slurm to 19.05.3 from 18.08.7

2019-11-13 Thread Bas van der Vlies
Chris, I also found the above link. I had read the RPC documentation wrong and now have the correct procedure for upgrading

Re: [slurm-users] Upgrade slurm to 19.05.3 from 18.08.7

2019-11-13 Thread Bas van der Vlies
= commands Thanks a lot Ole. This helps a lot. Regards

[slurm-users] Upgrade slurm to 19.05.3 from 18.08.7

2019-11-13 Thread Bas van der Vlies
: slurm_unpack_received_msg: Incompatible versions of client and server code }}} I have read about the RPC protocol: * https://slurm.schedmd.com/rpc.html Can an old `slurmctld` not communicate with a newer `slurmd`? Or is this setup supported and something else goes wrong? Regards
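
The answer implied by the RPC compatibility rules: an older slurmd may talk to a newer slurmctld (within two major releases), but not the reverse, so daemons are upgraded top-down. A sketch of the usual sequence; service names are the stock systemd units, and the install steps are left abstract:

```
# 1. slurmdbd first (after backing up the accounting database)
systemctl stop slurmdbd    # install the 19.05 slurmdbd, then:
systemctl start slurmdbd
# 2. then the controller
systemctl stop slurmctld   # install the 19.05 slurmctld, then:
systemctl start slurmctld
# 3. finally the compute nodes, e.g. rolling restarts
systemctl restart slurmd
```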

Re: [slurm-users] Apparent scontrol reboot bug

2019-01-22 Thread Bas van der Vlies
SURFsara | Science Park 140 | 1098 XG Amsterdam | | T +31 6 20043417  | martijn.krui...@surfsara.nl <mailto:bas.vandervl...@surfsara.nl> | www.surfsara.nl <http://www.surfsara.nl> | -- Sent from Gmail Mobile -- Sent from Gmail Mobile -- -- Bas va

Re: [slurm-users] About x11 support

2018-11-16 Thread Bas van der Vlies
splays. Regards, Mahmood

Re: [slurm-users] slurmstepd crash 18.03 when using pmi2 interface

2018-11-01 Thread Bas van der Vlies
OK, if we change: * TaskPlugin=task/affinity,task/cgroup to: * TaskPlugin=task/affinity the pmi2 interface works. Investigating this further. On 31/10/2018 08:26, Bas van der Vlies wrote: I am busy migrating from Torque/Moab to SLURM. I have installed slurm 18.03 and am trying to run
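
The change described above, as the relevant slurm.conf lines; note this was a diagnosis step isolating the cgroup task plugin, not a recommended permanent setting:

```
# slurm.conf -- pmi2 worked once the cgroup task plugin was removed
#TaskPlugin=task/affinity,task/cgroup
TaskPlugin=task/affinity
```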

[slurm-users] slurmstepd crash 18.03 when using pmi2 interface

2018-10-31 Thread Bas van der Vlies
/x86_64-linux-gnu/slurm//task_cgroup.so #7 0x2b9cea2a5977 in task_p_pre_setuid () from /usr/lib/x86_64-linux-gnu/slurm//task_cgroup.so #8 0x5631c2a04216 in task_g_pre_setuid () #9 0x5631c29e713d in ?? () #10 0x5631c29ec3f4 in job_manager () #11 0x5631c29e9374 in main () }}