[slurm-dev] Re:

2017-05-09 Thread Lachlan Musicman
Ignore this - I discovered the problem. A couple of bpipe jobs from three weeks ago had become zombies and were eating all the memory. Cheers, L.

[slurm-dev]

2017-05-09 Thread Lachlan Musicman
Running Slurm 16.05 on CentOS 7.3, I'm trying to start an interactive session with:
  srun -w papr-expanded01 --pty --mem 8192 -t 06:00 /bin/bash --partition=expanded
  srun -w papr-expanded01 --pty -t 06:00 /bin/bash --partition=expanded
  srun -w papr-expanded01 --pty --mem 8192 /bin/bash --partition=ex

[slurm-dev] Re: Announce: Infiniband topology tool "slurmibtopology.sh" version 0.1

2017-05-09 Thread Jeffrey Frey
>> The primary problem I've had with ib2slurm is that it segfaults. There's a bug in the ibnetdiscover library -- ib2slurm passes a NULL config pointer to the ibnd_discover_fabric() which is supposed to be okay according to the documentation, but that function actually requires a conf

[slurm-dev] Re: Issue to startup slurm daemon on Compute nodes

2017-05-09 Thread John Hearns
Following on from Maik's response, it would be worth mentioning the compat-glibc package for CentOS https://centos-packages.com/7/package/compat-glibc/ https://www.centos.org/forums/viewtopic.php?t=22250 Big get out of jail card - I have never built any version of Slurm on a CentOS 7 system using

[slurm-dev] Re: Partition default job time limit

2017-05-09 Thread Paul Edmon
Yes. Use the DefaultTime option. *DefaultTime* Run time limit used for jobs that don't specify a value. If not set then MaxTime will be used. Format is the same as for MaxTime. https://slurm.schedmd.com/slurm.conf.html -Paul Edmon- On 05/09/2017 05:35 AM, Georg Hildebrand wrote: Pa
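A minimal sketch of how the DefaultTime answer above might look in practice. The partition name `short`, the node list, and the time values are hypothetical; only the `DefaultTime` and `MaxTime` option names come from the thread. The helper function is an illustration, not part of Slurm: it pulls the `DefaultTime` field out of `scontrol show partition` output so the setting can be checked after reconfiguring.

```shell
# Hypothetical slurm.conf partition line (names and values are assumptions):
#   PartitionName=short Nodes=node[01-04] DefaultTime=01:00:00 MaxTime=24:00:00 State=UP

# Read `scontrol show partition` output on stdin and print the DefaultTime value.
partition_default_time() {
    grep -o 'DefaultTime=[^ ]*' | head -n 1 | cut -d= -f2
}

# Usage on a cluster (partition name is an assumption):
#   scontrol show partition short | partition_default_time
```

Jobs submitted to such a partition without a time limit would get the DefaultTime value, while explicit requests could still go up to MaxTime.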

[slurm-dev] Re: Issue to startup slurm daemon on Compute nodes

2017-05-09 Thread Maik Schmidt
It means you have to build SLURM on the node with the oldest glibc that you might still have in your cluster. It will then also run on the ones with newer glibc versions, just not the other way around. Best, Maik On 09.05.2017 at 15:49, J. Smith wrote: Hi, I have compiled slurm v17.02.2 on
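One way to check the point above before copying binaries around is to compare the newest glibc symbol version a binary references with the glibc installed on each node. The sketch below assumes GNU binutils (`objdump`) and GNU coreutils (`sort -V`) are available; the function name and the slurmd path in the usage comment are hypothetical.

```shell
# Read `objdump -T` style output on stdin and print the highest
# GLIBC_x.y symbol version it mentions (version-aware sort).
max_glibc_required() {
    grep -o 'GLIBC_[0-9][0-9.]*' | sort -u -V | tail -n 1
}

# Usage (binary path is an assumption):
#   objdump -T /usr/sbin/slurmd | max_glibc_required   # what the binary needs
#   ldd --version | head -n 1                          # what the node provides
```

If the version reported for the binary is newer than the glibc on a compute node, the daemon will fail to start there, which matches the GLIBC 2.14 versus glibc-2.12 situation described in this thread.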

[slurm-dev] Issue to startup slurm daemon on Compute nodes

2017-05-09 Thread J. Smith
Hi, I have compiled slurm v17.02.2 on master nodes running CentOS 7. I have no problem starting Slurm on the master nodes, but I am unable to start the daemon on the compute nodes running CentOS 6. It is looking for GLIBC 2.14, which is not available on our compute nodes (they use glibc-2.12). Error:

[slurm-dev] Announce: Node status tool "pestat" for Slurm updated to version 0.40

2017-05-09 Thread Ole Holm Nielsen
I'm announcing an updated version 0.40 of the node status tool "pestat" for Slurm. Download the tool (a short bash script) from https://ftp.fysik.dtu.dk/Slurm/pestat Thanks to Daniel Letai for recommending better script coding styles. If your commands do not live in /usr/bin, please make

[slurm-dev] RE: Announce: Infiniband topology tool "slurmibtopology.sh" updated version 0.21

2017-05-09 Thread Yaron Weitz
Thanks Ole. I just rebuilt my InfiniBand fabric and your script will surely help me. Yaron -----Original Message----- From: Ole Holm Nielsen [mailto:ole.h.niel...@fysik.dtu.dk] Sent: Tuesday, May 9, 2017 12:43 PM To: slurm-dev Subject: [slurm-dev] Announce: Infiniband topology tool "slurmibtop

[slurm-dev] Announce: Infiniband topology tool "slurmibtopology.sh" updated version 0.21

2017-05-09 Thread Ole Holm Nielsen
I'm announcing an updated version 0.21 of an Infiniband topology tool "slurmibtopology.sh" for Slurm. The output may be used as a starting point for writing your own topology.conf file. Download the script from https://ftp.fysik.dtu.dk/Slurm/slurmibtopology.sh Thanks to Felip Moll for testi

[slurm-dev] Partition default job time limit

2017-05-09 Thread Georg Hildebrand
Hi @here, Is it possible to have a default job time limit for a Slurm partition that is lower than the MaxTime? Kind regards, Georg

[slurm-dev] Inconsistent job timings with openMPI's mpirun

2017-05-09 Thread Gunnar Jansen
I am experiencing inconsistent timings running the exact same job file on a small cluster using slurm 14.11.8 and OpenMPI 1.8.8. The test case I am running is the device-device latency test from the osu-micro-benchmarks-5.0 suite. I use the following slurm script file: #!/bin/bash # #SBATCH --jo

[slurm-dev] Re: Announce: Infiniband topology tool "slurmibtopology.sh" version 0.1

2017-05-09 Thread Janne Blomqvist
On 2017-05-09 10:27, Ole Holm Nielsen wrote: On 05/09/2017 09:14 AM, Janne Blomqvist wrote: On 2017-05-07 15:29, Ole Holm Nielsen wrote: I'm announcing an initial version 0.1 of an Infiniband topology tool "slurmibtopology.sh" for Slurm. I have also created one, at https://github.com/jab

[slurm-dev] Re: Creating init script in /etc/init.d while building from source

2017-05-09 Thread Gennaro Oliva
Hi, On Tue, May 09, 2017 at 12:17:11AM -0700, Janne Blomqvist wrote: > for Ubuntu 16.04 you should be using the systemd service files instead of init.d scripts. They are part of the rpm file when building for red hat based systems, don't know about ubuntu; but presumably you can find them s

[slurm-dev] Re: Announce: Infiniband topology tool "slurmibtopology.sh" version 0.1

2017-05-09 Thread Ole Holm Nielsen
On 05/09/2017 09:14 AM, Janne Blomqvist wrote: On 2017-05-07 15:29, Ole Holm Nielsen wrote: I'm announcing an initial version 0.1 of an Infiniband topology tool "slurmibtopology.sh" for Slurm. I have also created one, at https://github.com/jabl/ibtopotool You need the python networkx libr

[slurm-dev] Re: Creating init script in /etc/init.d while building from source

2017-05-09 Thread Janne Blomqvist
On 2017-05-09 09:09, Dhiraj Reddy wrote: Hi, How do I create slurmd and slurmctld init scripts in the directory /etc/init.d while building and installing slurm from source? I think something should be done with the files ./init.d.slurm in /etc directory but I don't know what to do. I am using Ubun

[slurm-dev] Re: Announce: Infiniband topology tool "slurmibtopology.sh" version 0.1

2017-05-09 Thread Janne Blomqvist
On 2017-05-07 15:29, Ole Holm Nielsen wrote: I'm announcing an initial version 0.1 of an Infiniband topology tool "slurmibtopology.sh" for Slurm. I have also created one, at https://github.com/jabl/ibtopotool You need the python networkx library (python-networkx package on centos & Ubuntu,