Re: [slurm-users] PMIX and slurm failure (and fix).

2018-05-17 Thread Artem Polyakov
Thank you, Bill. Can you provide an anonymized slurm.conf (I'm mainly interested in the auth settings), the srun launch error, and the config.log where you saw the libssl mention? As a PMIx plugin developer, I'm not aware of any explicit dependency on libssl in Slurm. The only thing I can think of would be
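A minimal sketch of how one might check for such a dependency, assuming a typical plugin directory (adjust the path to your install prefix):

    # Look for any Slurm plugin that actually links libssl:
    for so in /usr/lib/slurm/*.so; do
        ldd "$so" 2>/dev/null | grep -q libssl && echo "$so links libssl"
    done
    # And locate the mention in the build's config.log:
    grep -n ssl config.log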

Re: [slurm-users] Creating custom partition GRES

2018-05-17 Thread Sébastien VIGNERON
Hello Almon, Did you look at the NodeName/Feature list functionality with sbatch --constraint before choosing GRES? Best regards, Sebastien VIGNERON > On 18 May 2018 at 00:02, Almon Gem Otanes wrote: > > Hi everyone, > Is there a way to add GRES/features/attributes (not
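A minimal sketch of the Feature/constraint approach being suggested, with hypothetical node names:

    # slurm.conf: tag nodes with arbitrary feature labels ...
    NodeName=node[01-04] Feature=sys1
    NodeName=node[05-08] Feature=sys2
    # ... then target them at submit time:
    sbatch --constraint=sys1 job.sh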

Re: [slurm-users] SLURM nodes flap in "Not responding" status when iptables firewall enabled

2018-05-17 Thread Sean Caron
Awesome tip. Thanks so much, Matthieu. I hadn't considered that. I will give that a shot and see what happens. Best, Sean On Thu, May 17, 2018 at 4:49 PM, Matthieu Hautreux <matthieu.hautr...@gmail.com> wrote: > Hi, > > Communications in Slurm are not only performed from controller to slurmd

[slurm-users] Creating custom partition GRES

2018-05-17 Thread Almon Gem Otanes
Hi everyone, Is there a way to add GRES/features/attributes (not sure which is the correct term) to partitions? I'm trying to port from SGE to SLURM. Our current setup has queues (partitions) that refer to physical systems. The binaries we want to execute are just scripts that don't need to run on
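A sketch of how an SGE queue bound to a set of hosts might map onto a Slurm partition (node and partition names here are hypothetical):

    # slurm.conf
    NodeName=sys1-n[01-08] CPUs=16 State=UNKNOWN
    PartitionName=sys1 Nodes=sys1-n[01-08] Default=NO MaxTime=INFINITE State=UP
    # submit to the partition much like an SGE queue:
    sbatch -p sys1 script.sh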

Re: [slurm-users] SLURM nodes flap in "Not responding" status when iptables firewall enabled

2018-05-17 Thread Matthieu Hautreux
Hi, Communications in Slurm are not only performed from controller to slurmd and from slurmd to controller. You need to ensure that your login nodes can reach the controller and the slurmd nodes, and that the slurmd daemons on the various nodes can contact each other. This last requirement is
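A sketch of iptables rules covering the default Slurm ports (the subnet is a placeholder for the cluster's RFC1918 network; verify the port numbers against your slurm.conf):

    iptables -A INPUT -s 10.0.0.0/24 -p tcp --dport 6817 -j ACCEPT  # slurmctld
    iptables -A INPUT -s 10.0.0.0/24 -p tcp --dport 6818 -j ACCEPT  # slurmd
    # srun listens on ephemeral ports unless pinned in slurm.conf, e.g.:
    #   SrunPortRange=60001-63000
    iptables -A INPUT -s 10.0.0.0/24 -p tcp --dport 60001:63000 -j ACCEPT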

Re: [slurm-users] SLURM nodes flap in "Not responding" status when iptables firewall enabled

2018-05-17 Thread Sean Caron
Sorry, how do you mean? The environment is very basic. Compute nodes and the SLURM controller are on an RFC1918 subnet. Gateways are dual-homed, with one leg on a public IP and one leg on the RFC1918 cluster network. It used to be that nodes that only had a leg on the RFC1918 network (compute nodes and

Re: [slurm-users] Job step aborted

2018-05-17 Thread Matthieu Hautreux
On Thu, 17 May 2018 at 11:28, Mahmood Naderan wrote: > Hi, > For an interactive job via srun, I see that after opening the GUI, the > session is terminated automatically, which is weird. > > [mahmood@rocks7 ansys_test]$ srun --x11 -A y8 -p RUBY --ntasks=10 > --mem=8GB --pty

Re: [slurm-users] Job step aborted

2018-05-17 Thread Mahmood Naderan
I have opened a bug ticket (https://bugs.schedmd.com/show_bug.cgi?id=5182). It is annoying... Regards, Mahmood On Thu, May 17, 2018 at 1:54 PM, Mahmood Naderan wrote: > Hi, > For an interactive job via srun, I see that after opening the GUI, the > session is terminated

[slurm-users] PMIX and slurm failure (and fix).

2018-05-17 Thread Bill Broadley
Greetings all, Just wanted to mention I've been building the newest Slurm on Ubuntu 18.04. GCC 7.3 is the default compiler, which means that the various dependencies (munge, libevent, hwloc, netloc, pmix, etc.) are already available and built with GCC 7.3. I carefully built slurm-17.11.6 +
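A sketch of a build against the distro-packaged PMIx (prefix and paths are assumptions; check config.log afterwards to confirm PMIx was detected):

    ./configure --prefix=/opt/slurm-17.11.6 --with-pmix=/usr
    make -j"$(nproc)" && make install
    # at run time, request the plugin explicitly:
    srun --mpi=pmix ./my_mpi_app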

[slurm-users] Job step aborted

2018-05-17 Thread Mahmood Naderan
Hi, For an interactive job via srun, I see that after opening the GUI, the session is terminated automatically, which is weird. [mahmood@rocks7 ansys_test]$ srun --x11 -A y8 -p RUBY --ntasks=10 --mem=8GB --pty bash [mahmood@compute-0-6 ansys_test]$ /state/partition1/scfd/sc -t10 srun: First task
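One hedged way to narrow this down is to verify the forwarded display inside the allocation before launching the application (xdpyinfo is from the x11-utils package):

    srun --x11 -A y8 -p RUBY --ntasks=10 --mem=8GB --pty bash
    echo "$DISPLAY"                         # should name a forwarded display
    xdpyinfo >/dev/null && echo "X11 OK"    # fails fast if forwarding is broken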

Re: [slurm-users] X11 debug

2018-05-17 Thread Marco Ehlert
On Thu, 17 May 2018, Nadav Toledo wrote: Hello everyone, After fighting with X11 forwarding for a couple of weeks, I think I've got a few tips that can help others. I am using Slurm 17.11.6 with built-in X11 forwarding on the Ubuntu server distro; all servers in the cluster share /home via BeeGFS. slurm
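For context, a sketch of the prerequisites usually cited for the 17.11 built-in forwarding (hedged; confirm against the docs for your build):

    # slurm.conf must enable the X11 prolog support:
    #   PrologFlags=X11
    # and each user needs passwordless SSH keys in the shared /home,
    # since the 17.11 implementation connects back over libssh2:
    ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys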

Re: [slurm-users] X11 debug

2018-05-17 Thread Ole Holm Nielsen
On 05/17/2018 08:45 AM, Nadav Toledo wrote: Hello everyone, After fighting with X11 forwarding for a couple of weeks, I think I've got a few tips that can help others. I am using Slurm 17.11.6 with built-in X11 forwarding on the Ubuntu server distro; all servers in the cluster share /home via BeeGFS.