Re: [slurm-users] SLURM nodes flap in "Not responding" status when iptables firewall enabled

2018-05-21 Thread Matthieu Hautreux
Glad to hear that you made it work. Regards, Matthieu 2018-05-21 21:21 GMT+02:00 Sean Caron: > Just wanted to follow up. In addition to passing all traffic to the SLURM > controller, opened port 6818/TCP to all other compute nodes and this seems > to have resolved the issue.
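
For anyone landing on this thread later, the fix described above might be sketched roughly as follows. This is only an illustration, not the poster's actual ruleset: the controller address 10.0.0.1 and the compute subnet 10.0.0.0/24 are made-up RFC1918 values, and the helper name compute_node_rules is hypothetical. Only port 6818/TCP comes from the thread itself. The script just builds and prints the iptables commands; it does not apply them.

```python
# Sketch only: builds the iptables commands for a compute node, does not run them.
# CONTROLLER_IP and COMPUTE_SUBNET are hypothetical placeholder values.
CONTROLLER_IP = "10.0.0.1"      # assumption: address of the SLURM controller
COMPUTE_SUBNET = "10.0.0.0/24"  # assumption: subnet holding the compute nodes
SLURMD_PORT = 6818              # port named in the thread (6818/TCP)


def compute_node_rules(controller_ip, compute_subnet, slurmd_port):
    """Return iptables commands (as argv lists) for a compute node's INPUT chain."""
    return [
        # Pass all traffic from the controller, regardless of port and protocol.
        ["iptables", "-A", "INPUT", "-s", controller_ip, "-j", "ACCEPT"],
        # Let the other compute nodes reach slurmd on 6818/TCP.
        ["iptables", "-A", "INPUT", "-p", "tcp", "-s", compute_subnet,
         "--dport", str(slurmd_port), "-j", "ACCEPT"],
    ]


RULES = compute_node_rules(CONTROLLER_IP, COMPUTE_SUBNET, SLURMD_PORT)

if __name__ == "__main__":
    for rule in RULES:
        print(" ".join(rule))
```

Note that -A appends to the end of the chain, so if the chain already ends in an explicit DROP or REJECT rule, these would need to be inserted ahead of it instead.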

Re: [slurm-users] SLURM nodes flap in "Not responding" status when iptables firewall enabled

2018-05-21 Thread Sean Caron
Just wanted to follow up. In addition to passing all traffic to the SLURM controller, opened port 6818/TCP to all other compute nodes and this seems to have resolved the issue. Thanks again, Matthieu! Best, Sean On Thu, May 17, 2018 at 8:06 PM, Sean Caron wrote: > Awesome

Re: [slurm-users] SLURM nodes flap in "Not responding" status when iptables firewall enabled

2018-05-17 Thread Sean Caron
Awesome tip. Thanks so much, Matthieu. I hadn't considered that. I will give that a shot and see what happens. Best, Sean On Thu, May 17, 2018 at 4:49 PM, Matthieu Hautreux < matthieu.hautr...@gmail.com> wrote: > Hi, > > Communications in Slurm are not only performed from controller to slurmd

Re: [slurm-users] SLURM nodes flap in "Not responding" status when iptables firewall enabled

2018-05-17 Thread Matthieu Hautreux
Hi, Communications in Slurm are not only performed from controller to slurmd and from slurmd to controller. You need to ensure that your login nodes can reach the controller and the slurmd nodes as well as ensure that slurmd on the various nodes can contact each other. This last requirement is

Re: [slurm-users] SLURM nodes flap in "Not responding" status when iptables firewall enabled

2018-05-17 Thread Sean Caron
Sorry, how do you mean? The environment is very basic. Compute nodes and SLURM controller are on an RFC1918 subnet. Gateways are dual homed with one leg on a public IP and one leg on the RFC1918 cluster network. It used to be that nodes that only had a leg on the RFC1918 network (compute nodes and

Re: [slurm-users] SLURM nodes flap in "Not responding" status when iptables firewall enabled

2018-05-16 Thread Sean Caron
I see some chatter on 6818/TCP from the compute node to the SLURM controller, and from the SLURM controller to the compute node. The policy is to permit all packets inbound from SLURM controller regardless of port and protocol, and perform no filtering whatsoever on any output packets to

Re: [slurm-users] SLURM nodes flap in "Not responding" status when iptables firewall enabled

2018-05-16 Thread Alex Chekholko
Add a logging rule to your iptables and look at what traffic is actually being blocked? On Wed, May 16, 2018 at 11:11 AM Sean Caron wrote: > Hi all, > > Does anyone use SLURM in a scenario where there is an iptables firewall on > the compute nodes on the same network it uses
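
A logging rule of the kind suggested here could look something like the sketch below. It assumes the stock iptables LOG target and limit match; the prefix string, the rate limit, and the helper names log_rule and render are all made up for illustration. The script only prints the command. With no position given, the rule is appended to the end of INPUT, which suits a chain that relies on a DROP policy; if the chain ends in an explicit DROP or REJECT rule, pass that rule's position so the LOG rule is inserted just ahead of it.

```python
import shlex

# Sketch only: builds an iptables LOG rule, does not run it.
# The prefix and rate below are hypothetical choices, not values from the thread.


def log_rule(position=None, prefix="SLURM-FW-BLOCKED: ", rate="5/min"):
    """Return an iptables command (argv list) that logs packets reaching it."""
    # Append by default; insert at a given rule number when one is supplied.
    where = ["-A", "INPUT"] if position is None else ["-I", "INPUT", str(position)]
    return ["iptables", *where,
            "-m", "limit", "--limit", rate,   # rate-limit to avoid flooding the log
            "-j", "LOG", "--log-prefix", prefix, "--log-level", "4"]


def render(argv):
    """Join argv into a shell-safe command line."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    print(render(log_rule()))
```

Matching packets then show up in the kernel log with the chosen prefix, which should make it easier to see which source addresses and ports are being dropped.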

[slurm-users] SLURM nodes flap in "Not responding" status when iptables firewall enabled

2018-05-16 Thread Sean Caron
Hi all, Does anyone use SLURM in a scenario where there is an iptables firewall on the compute nodes on the same network it uses to communicate with the SLURM controller and DBD machine? I have the very basic situation where ... 1. There is no iptables firewall enabled at all on the SLURM