[slurm-dev] Re: Failed to contact primary controller : No route to host
On 06/11/13 16:08, Arjun J Rao wrote:

> Yes, it was the firewall rules on my Scientific Linux installation.
> Flushed the iptables using iptables -F and now the slurm daemons talk
> with the slurm controller just fine.

Wonderful! Well done.

All the best,
Chris

--
Christopher Samuel    Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au    Phone: +61 (0)3 903 55545
http://www.vlsci.org.au/    http://twitter.com/vlsci
[slurm-dev] Re: Failed to contact primary controller : No route to host
On 05/11/13 18:52, Arjun J Rao wrote:

> Now I have tried netstat -lnt on qdr1 (130.1.2.205) and it shows this:
>
> Proto Recv-Q Send-Q Local Address    Foreign Address    State
> tcp        0      0 0.0.0.0:6817     0.0.0.0:*          LISTEN
> tcp        0      0 0.0.0.0:6818     0.0.0.0:*          LISTEN
>
> This shows that both slurmctld and slurmd on qdr1 are listening and
> talking to each other.

No, it shows that the processes are listening; it does not show whether they are communicating, or whether they can even connect.

> But doing nc -zv qdr1 6818 from qdr2 gives me the following error:
>
> nc: Connect to qdr1 port 6818(tcp) failed: No route to host

That would tend to imply you've got firewall rules that are preventing the communication between the nodes; you'll need to check that.

Good luck!
Chris

--
Christopher Samuel    Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au    Phone: +61 (0)3 903 55545
http://www.vlsci.org.au/    http://twitter.com/vlsci
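The distinction above (a listening socket says nothing about whether another node can reach it) can be checked from the remote side with a small script, roughly what nc -zv does. This is only a sketch: the function name probe_tcp and the result labels are made up for illustration, and the only hosts and ports it relates to are the ones already mentioned in this thread. A "no route to host" result on a network where the host otherwise responds is commonly produced by a firewall rule rejecting the connection, though it cannot prove that on its own.

```python
import errno
import socket


def probe_tcp(host, port, timeout=3.0):
    """Attempt a TCP connect to host:port and classify the outcome.

    Hypothetical helper for illustration; returns a short label
    instead of raising.
    """
    try:
        # create_connection resolves the name and performs the connect.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer at all within the timeout (e.g. packets silently dropped).
        return "timeout"
    except ConnectionRefusedError:
        # Host reachable, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        if exc.errno == errno.EHOSTUNREACH:
            # Same condition nc reported as "No route to host".
            return "no route to host"
        return f"error: {exc}"
```

Run from qdr2, calling probe_tcp("qdr1", 6817) and probe_tcp("qdr1", 6818) would test the same two ports that netstat showed as listening on qdr1.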
[slurm-dev] Re: Failed to contact primary controller : No route to host
Yes, it was the firewall rules on my Scientific Linux installation. Flushed the iptables using iptables -F and now the slurm daemons talk with the slurm controller just fine.

On Tue, Nov 5, 2013 at 11:21 PM, Christopher Samuel sam...@unimelb.edu.au wrote:

> On 05/11/13 18:52, Arjun J Rao wrote:
>
>> Now I have tried netstat -lnt on qdr1 (130.1.2.205) and it shows this:
>>
>> Proto Recv-Q Send-Q Local Address    Foreign Address    State
>> tcp        0      0 0.0.0.0:6817     0.0.0.0:*          LISTEN
>> tcp        0      0 0.0.0.0:6818     0.0.0.0:*          LISTEN
>>
>> This shows that both slurmctld and slurmd on qdr1 are listening and
>> talking to each other.
>
> No, it shows that the processes are listening; it does not show
> whether they are communicating, or whether they can even connect.
>
>> But doing nc -zv qdr1 6818 from qdr2 gives me the following error:
>>
>> nc: Connect to qdr1 port 6818(tcp) failed: No route to host
>
> That would tend to imply you've got firewall rules that are preventing
> the communication between the nodes; you'll need to check that.
>
> Good luck!
> Chris
>
> --
> Christopher Samuel    Senior Systems Administrator
> VLSCI - Victorian Life Sciences Computation Initiative
> Email: sam...@unimelb.edu.au    Phone: +61 (0)3 903 55545
> http://www.vlsci.org.au/    http://twitter.com/vlsci
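A note on the fix: iptables -F removes every rule in the filter table, and on Red Hat-style systems such as Scientific Linux the saved rules are typically loaded again when the iptables service restarts, so the problem may return. A narrower alternative is to accept only the Slurm ports from the other cluster nodes. The sketch below only prints candidate commands for review and does not run them. The function name is hypothetical, the subnet 130.1.2.0/24 is an assumption extrapolated from qdr1's address (130.1.2.205), and ports 6817 and 6818 are the ones from the netstat output in this thread. Other Slurm traffic (srun's connections, for example) may use additional ports, so these two rules may not be sufficient on their own.

```python
def slurm_allow_rules(subnet, ports=(6817, 6818)):
    """Build iptables commands accepting TCP from `subnet` on each port.

    Hypothetical helper: it returns argument lists and executes nothing.
    """
    return [
        ["iptables", "-I", "INPUT", "-p", "tcp", "-s", subnet,
         "--dport", str(port), "-j", "ACCEPT"]
        for port in ports
    ]


# Assumed cluster subnet; replace it with the real one before use.
for rule in slurm_allow_rules("130.1.2.0/24"):
    print(" ".join(rule))
```

Rules inserted this way are not persistent by themselves; whether and how to save them depends on the distribution's iptables service.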