That was it! I had iptables disabled on one node but not on the other. Once 
I disabled it again, it worked!

Thanks so much!

-Trevor
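
For anyone who finds this thread later: ssh and ping working while Slurm reports "no route to host" is commonly what a rejecting firewall rule on a single port looks like, since a REJECT rule can answer a new connection with an ICMP "host prohibited" message. Below is a minimal sketch of a per-port check to run from the compute node. The function name probe_tcp is made up for illustration, and 6817 is only Slurm's default SlurmctldPort; the real value is whatever slurm.conf sets.

```python
import errno
import socket


def probe_tcp(host, port, timeout=3.0):
    """Try to open a NEW TCP connection to host:port.

    Returns a short status string instead of raising, so the result
    can be compared across nodes and ports.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # Packets silently dropped (for example by a DROP rule).
        return "timeout"
    except ConnectionRefusedError:
        # Host reachable, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        # EHOSTUNREACH is the "no route to host" error from the log.
        if exc.errno in (errno.EHOSTUNREACH, errno.ENETUNREACH):
            return "unreachable"
        return "error: %s" % exc
```

On the compute node, something like probe_tcp("head0", 6817) (head0 being a placeholder for the head node's IPoIB address) returning "unreachable" while ssh to the same address works would point at a firewall rule rather than at routing.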

> On Jun 25, 2015, at 5:56 PM, Eric Lund <[email protected]> wrote:
> 
> 
> Also check your iptables config on both nodes, there might be a firewall rule 
> hanging you up...  It sounds like you can talk back to the head node on an 
> established socket but you can't establish a socket to the head node from the 
> compute node.
> 
> Eric
> 
> On 6/25/15 4:38 PM, Peter Kjellström wrote:
>> On Thu, 25 Jun 2015 14:15:17 -0700
>> Trevor Gale <[email protected]> wrote:
>> > Hello all,
>> >
>> > I am experiencing an odd issue where my head node can see the compute
>> > node but the compute node cannot see the head node. If I run “sinfo”
>> > on the head node I see both nodes in the state idle, but I can’t run
>> > sinfo on the compute node. If I look at the head node’s logs I see no
>> > issues, and I see things like “node_did_resp compute0”, but if I look
>> > at the compute node’s log I see “slurm connect failed: no route to
>> > host”. I am using the IP addresses that I assigned the nodes in my
>> > IPoIB config, and I know these IPs work normally (I can ssh, scp, and
>> > ping with them), but for some reason the compute node does not see
>> > the head node.
>> >
>> > Does anyone have any idea what the issue might be?
>> 
>> Two ideas:
>> 
>> 1) You have different slurm.conf files with different node definitions
>> across your cluster causing connectivity problems.
>> 
>> 2) You have actual IPoIB connectivity problems, maybe the quite recent
>> RHEL 6/CentOS 6 bug that caused islands of connectivity under certain
>> circumstances? (fixed in -504.16.2).
>> 
>> /Peter
>> 
>> --
>> Sent from my Android device with K-9 Mail. Please excuse my brevity.
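
Peter's first idea, mismatched slurm.conf files, can be ruled out by comparing a digest of the file on each node. The sketch below is one way to do that, not a Slurm tool: the function names are made up for illustration, and it assumes that skipping blank lines and whole-line "#" comments is enough normalization. The path to pass in depends on the installation (/etc/slurm/slurm.conf is only a common location).

```python
import hashlib


def normalized_digest(text):
    """SHA-256 of config text, ignoring blank lines and '#' comment lines."""
    kept = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and whole-line comments
        kept.append(line)
    return hashlib.sha256("\n".join(kept).encode("utf-8")).hexdigest()


def file_digest(path):
    """Digest of a config file on the local node."""
    with open(path, encoding="utf-8") as fh:
        return normalized_digest(fh.read())
```

Running file_digest on the head node and on the compute node and comparing the two strings shows whether the node definitions could differ; identical digests mean the cause is elsewhere, as it was here.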
