[slurm-dev] Re: login node configuration?
On Oct 26, 2015, at 5:44 PM, Paul Edmon wrote:

> What we did was that we just opened up port 6817 between the two VLAN's. So
> long as the traffic is routeable and they see the same slurm.conf that should
> work. All the login node needs is slurm.conf, and the slurm, slurm-munge,
> and slurm-plugin rpms. You don't need to run the slurm service to submit as
> all you need is the ability of the login node to talk to the master.

Paul,

That confirms my thinking that I just need the login nodes to be able to communicate with the slurmctld port over the private network. Unfortunately these are two separate physical networks. We’ve got the interfaces available so I suppose running a couple cables isn’t a big deal.

Thank you for the input!

Regards,
-liam

-There are uncountably more irrational fears than rational ones. -P. Dolan
Liam Forbes lofor...@alaska.edu ph: 907-450-8618 fax: 907-450-8601
UAF Research Computing Systems Senior HPC Engineer LPIC1, CISSP
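Whether a login node can reach the slurmctld port discussed above can be checked with a plain TCP connection test before running any Slurm commands. The following is a minimal Python sketch and not part of Slurm; the function name `can_reach` is hypothetical, and 6817 is simply the port mentioned in this thread.

```python
import socket


def can_reach(host: str, port: int = 6817, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only shows that the port is routable and not blocked by a firewall.
    It says nothing about munge authentication or a matching slurm.conf.
    """
    try:
        # create_connection resolves the name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run on a login node, `can_reach("head-node.example.org")` (a placeholder host name) returning False would point at routing, cabling, or a firewall rather than at the Slurm configuration itself.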
[slurm-dev] Re: login node configuration?
For clarity, they should not need to talk to the compute nodes unless you intend to do interactive work. You should only need to talk to the master to submit jobs.

-Paul Edmon-

On 10/26/2015 9:45 PM, Paul Edmon wrote:

> What we did was that we just opened up port 6817 between the two VLAN's. So long as the traffic is routeable and they see the same slurm.conf that should work. All the login node needs is slurm.conf, and the slurm, slurm-munge, and slurm-plugin rpms. You don't need to run the slurm service to submit as all you need is the ability of the login node to talk to the master.
>
> -Paul Edmon-
>
> On 10/26/2015 8:02 PM, Liam Forbes wrote:
>
>> Hello,
>>
>> I’m in the process of setting up SLURM 15.08.X for the first time. I’ve got a head node and ten compute nodes working fine for serial and parallel jobs. I’m struggling with figuring out how to configure a pair of login nodes to be able to get status and submit jobs. I’ve been searching through the documentation and googling, but I’ve not found a reference other than brief mentions of login nodes or submit hosts.
>>
>> Our pair of login nodes share an ethernet network with the head node. They do not, currently, share a network, ethernet or IB, with the compute nodes. When I execute SLURM commands like sinfo or squeue, using strace I can see them attempting to communicate with port 6817 on the head node. However, the head node doesn’t respond. lsof on the head node seems to indicate slurmctld is only listening on the private network shared with the compute nodes.
>>
>> Is there a reference for configuring login nodes to be able to query slurmctld and submit jobs? If not, would somebody with a similar configuration be willing to share some guidance? Do the login nodes need to be on the same network as the compute nodes, or at least the same network as slurmctld is listening on? Apologies if these questions have been answered somewhere else, I just haven’t found that documentation.
>>
>> Regards,
>> -liam
>>
>> -There are uncountably more irrational fears than rational ones. -P. Dolan
>> Liam Forbes lofor...@alaska.edu ph: 907-450-8618 fax: 907-450-8601
>> UAF Research Computing Systems Senior HPC Engineer LPIC1, CISSP
[slurm-dev] Re: login node configuration?
What we did was that we just opened up port 6817 between the two VLAN's. So long as the traffic is routeable and they see the same slurm.conf that should work. All the login node needs is slurm.conf, and the slurm, slurm-munge, and slurm-plugin rpms. You don't need to run the slurm service to submit as all you need is the ability of the login node to talk to the master.

-Paul Edmon-

On 10/26/2015 8:02 PM, Liam Forbes wrote:

> Hello,
>
> I’m in the process of setting up SLURM 15.08.X for the first time. I’ve got a head node and ten compute nodes working fine for serial and parallel jobs. I’m struggling with figuring out how to configure a pair of login nodes to be able to get status and submit jobs. I’ve been searching through the documentation and googling, but I’ve not found a reference other than brief mentions of login nodes or submit hosts.
>
> Our pair of login nodes share an ethernet network with the head node. They do not, currently, share a network, ethernet or IB, with the compute nodes. When I execute SLURM commands like sinfo or squeue, using strace I can see them attempting to communicate with port 6817 on the head node. However, the head node doesn’t respond. lsof on the head node seems to indicate slurmctld is only listening on the private network shared with the compute nodes.
>
> Is there a reference for configuring login nodes to be able to query slurmctld and submit jobs? If not, would somebody with a similar configuration be willing to share some guidance? Do the login nodes need to be on the same network as the compute nodes, or at least the same network as slurmctld is listening on? Apologies if these questions have been answered somewhere else, I just haven’t found that documentation.
>
> Regards,
> -liam
>
> -There are uncountably more irrational fears than rational ones. -P. Dolan
> Liam Forbes lofor...@alaska.edu ph: 907-450-8618 fax: 907-450-8601
> UAF Research Computing Systems Senior HPC Engineer LPIC1, CISSP
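Since the client commands find the controller through slurm.conf, it can help to see which address and port the login node's copy actually points at. The sketch below is a simplified, hypothetical reader and not Slurm's own parser: it assumes the 15.08-era parameter names `ControlMachine`, `ControlAddr`, and `SlurmctldPort`, one top-level setting per line, and the default port 6817. The function name `controller_endpoint` is made up for illustration.

```python
from typing import Optional, Tuple


def controller_endpoint(conf_text: str) -> Tuple[Optional[str], int]:
    """Return (controller host or address, slurmctld port) from slurm.conf text.

    Simplified: keys are matched case-insensitively, '#' starts a comment,
    and only the first '=' on a line separates key from value.
    """
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip().lower()] = value.strip()

    # Prefer the explicit address if given, else fall back to the host name.
    host = settings.get("controladdr") or settings.get("controlmachine")

    # If a port range is configured, use its first port (defensive assumption).
    port_text = settings.get("slurmctldport", "6817").split("-", 1)[0]
    return host, int(port_text)
```

Running it over the slurm.conf on the login node and on the head node and comparing the results is one way to confirm both see the same controller settings, and the returned address shows which network the login node needs a route to.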