Hello,

I’m setting up SLURM 15.08.X for the first time. I have a head node and ten compute nodes working fine for serial and parallel jobs, but I’m struggling to configure a pair of login nodes so that they can query status and submit jobs. I’ve searched the documentation and the web, but have found nothing beyond brief mentions of login nodes or submit hosts.
Our pair of login nodes share an ethernet network with the head node. They do not, currently, share a network, ethernet or IB, with the compute nodes. When I execute SLURM commands like sinfo or squeue, using strace I can see them attempting to communicate with port 6817 on the head node. However, the head node doesn’t respond. lsof on the head node seems to indicate slurmctld is only listening on the private network shared with the compute nodes.

Is there a reference for configuring login nodes to be able to query slurmctld and submit jobs? If not, would somebody with a similar configuration be willing to share some guidance? Do the login nodes need to be on the same network as the compute nodes, or at least the same network as slurmctld is listening on?

Apologies if these questions have been answered somewhere else, I just haven’t found that documentation.

Regards,
-liam

-There are uncountably more irrational fears than rational ones. -P. Dolan
Liam Forbes [email protected] ph: 907-450-8618 fax: 907-450-8601
UAF Research Computing Systems Senior HPC Engineer LPIC1, CISSP
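
The symptom described above (commands on the login node trying port 6817 on the head node and getting no answer) can be reproduced without strace. Below is a minimal sketch, assuming Python 3 is available on the login node. The function name `can_connect` is hypothetical, and the port default of 6817 is simply the one seen in the strace output. It only reports whether something accepts TCP connections at that address; it says nothing about whether SLURM itself would accept the request.

```python
import socket


def can_connect(host: str, port: int = 6817, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    host is whatever name or address the login node uses for the head node
    (an assumption; substitute your own). 6817 is the port observed in the
    strace output, not a value read from any configuration file.
    """
    try:
        # create_connection resolves the name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Running `can_connect(...)` from a login node against each of the head node's addresses (the shared ethernet one and the private one) would show which interfaces, if any, are reachable from there. A `False` result does not distinguish between a firewall, a missing route, and a daemon bound to a different address.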
