[slurm-users] Anyone built PMIX 3.1.1 against Slurm 18.08.4?

2019-01-21 Thread Greg Wickham
Hi All, I’m trying to build PMIx 3.1.1 against Slurm 18.08.4, but the Slurm pmix plugin fails with a fatal error: pmixp_client.c:147:28: error: ‘flag’ undeclared (first use in this function) PMIX_VAL_SET(&kvp->value, flag, 0); Is there something wrong with my build environment?

[slurm-users] Apparent scontrol reboot ASAP bug

2019-01-21 Thread Martijn Kruiten
Hi, We have encountered a strange issue on our system (Slurm 18.08.3), and I'm curious whether any of you recognize this behavior. In the following example we try to reboot 32 nodes, of which 31 nodes are idle: root# scontrol reboot ASAP nextstate=resume reason=image r8n[1-32] root# sinfo -o "%100E

[slurm-users] Specify number of cpus allocated to each partition when nodes are shared

2019-01-21 Thread Bruno Santos
Hi everyone, I am scratching my head to find a way to do this on slurm. We have three nodes configured as such: > # COMPUTE NODES > NodeName=brassica NodeAddr=10.1.10.83 CPUs=64 RealMemory=95 Sockets=4 > CoresPerSocket=8 ThreadsPerCore=2 State=UNKNOWN > NodeName=triticum NodeAddr=10.1.10.170

Re: [slurm-users] SLURM_JOB_GPU not set in salloc

2019-01-21 Thread Henkel, Andreas
Thank you, Chris. This is what I assumed, since setting those variables for complicated allocations may be just useless. Yet I wasn’t sure if it was possible at all. Best, Andreas > On 19.01.2019 at 08:39, Chris Samuel wrote: > >> On 18/1/19 3:18 am, Henkel wrote: >> >> we just found

Re: [slurm-users] New checktopology tool: Check consistency of /etc/slurm/topology.conf with nodelist in /etc/slurm/slurm.conf

2019-01-21 Thread Bjørn-Helge Mevik
Ole Holm Nielsen writes: > New version at > https://github.com/OleHolmNielsen/Slurm_tools/tree/master/nodes Nice. Thanks! -- Cheers, Bjørn-Helge Mevik, dr. scient, Department for Research Computing, University of Oslo

Re: [slurm-users] New checktopology tool: Check consistency of /etc/slurm/topology.conf with nodelist in /etc/slurm/slurm.conf

2019-01-21 Thread Ole Holm Nielsen
On 1/21/19 12:18 PM, Bjørn-Helge Mevik wrote: Ole Holm Nielsen writes: Comments and suggestions are most welcome! Splendid tool! I immediately found that I'd forgotten to take a few nodes out of the topology definition. :) Me too :-) One thing: The script doesn't work if you have

Re: [slurm-users] New checktopology tool: Check consistency of /etc/slurm/topology.conf with nodelist in /etc/slurm/slurm.conf

2019-01-21 Thread Ole Holm Nielsen
Hi Bjørn-Helge, Thanks: On 1/21/19 12:37 PM, Bjørn-Helge Mevik wrote: Two more details/enhancements: 1) Sites which use node names like c[1-20]-[1-36], would benefit from "sort -V" instead of just sort -- otherwise c10-12 will be listed before c2-12, for instance. (For sites that use names

Re: [slurm-users] New checktopology tool: Check consistency of /etc/slurm/topology.conf with nodelist in /etc/slurm/slurm.conf

2019-01-21 Thread Bjørn-Helge Mevik
Two more details/enhancements: 1) Sites which use node names like c[1-20]-[1-36], would benefit from "sort -V" instead of just sort -- otherwise c10-12 will be listed before c2-12, for instance. (For sites that use names like c[01-20]-[01-36] there is no difference.) Not very important, but
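
The ordering difference described in point 1 can be reproduced with a minimal sketch. The three node names below are made up for illustration, following the c[1-20]-[1-36] pattern; `sort -V` is assumed to be the GNU coreutils version sort, and `LC_ALL=C` is set only to make the plain `sort` result independent of the local collation rules.

```shell
# Hypothetical node names in the c[1-20]-[1-36] style (not from a real cluster).
nodes='c10-12
c2-12
c1-3'

# Plain sort compares character by character, so c10-12 comes before c2-12.
lexical=$(printf '%s\n' "$nodes" | LC_ALL=C sort)

# sort -V compares runs of digits as numbers, so c2-12 comes before c10-12.
natural=$(printf '%s\n' "$nodes" | LC_ALL=C sort -V)

printf 'sort:\n%s\n\nsort -V:\n%s\n' "$lexical" "$natural"
```

With zero-padded names such as c[01-20]-[01-36] both commands give the same order, which matches the remark in the message.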

Re: [slurm-users] New checktopology tool: Check consistency of /etc/slurm/topology.conf with nodelist in /etc/slurm/slurm.conf

2019-01-21 Thread Bjørn-Helge Mevik
Ole Holm Nielsen writes: > Comments and suggestions are most welcome! Splendid tool! I immediately found that I'd forgotten to take a few nodes out of the topology definition. :) One thing: The script doesn't work if you have different NodeName and NodeHostName on the nodes in slurm.conf,

[slurm-users] New checktopology tool: Check consistency of /etc/slurm/topology.conf with nodelist in /etc/slurm/slurm.conf

2019-01-21 Thread Ole Holm Nielsen
Hi Slurm users, When removing dead nodes or adding new nodes, I've too often made the mistake of forgetting to correctly update the /etc/slurm/topology.conf file. Therefore I wrote a simple "checktopology" tool to check the consistency of /etc/slurm/topology.conf with the nodelist in