Hello,

I'm trying to get SLURM set up on a small cluster consisting of a head node
and 4 compute nodes.  On the head node, I have run

```
sudo systemctl enable slurmctld
```

but after a reboot SLURM is not running and `sudo systemctl status slurmctld` returns:

```
● slurmctld.service - Slurm controller daemon
   Loaded: loaded (/lib/systemd/system/slurmctld.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Tue 2017-09-19 10:38:00 EDT; 9min ago
  Process: 1363 ExecStart=/usr/sbin/slurmctld $SLURMCTLD_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 1395 (code=exited, status=1/FAILURE)

Sep 19 10:38:00 arcesius slurmctld[1395]: Recovered state of 4 nodes
Sep 19 10:38:00 arcesius slurmctld[1395]: Recovered information about 0 jobs
Sep 19 10:38:00 arcesius slurmctld[1395]: Recovered state of 0 reservations
Sep 19 10:38:00 arcesius slurmctld[1395]: read_slurm_conf: backup_controller not specified.
Sep 19 10:38:00 arcesius slurmctld[1395]: Running as primary controller
Sep 19 10:38:00 arcesius slurmctld[1395]: Recovered information about 0 sicp jobs
Sep 19 10:38:00 arcesius slurmctld[1395]: error: Error binding slurm stream socket: Cannot assign requested address
Sep 19 10:38:00 arcesius systemd[1]: slurmctld.service: Main process exited, code=exited, status=1/FAILURE
Sep 19 10:38:00 arcesius systemd[1]: slurmctld.service: Unit entered failed state.
Sep 19 10:38:00 arcesius systemd[1]: slurmctld.service: Failed with result 'exit-code'.
```
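
The line that stands out to me is the socket bind error.  I haven't dug into
the unit ordering yet, but I assume something like the following would show
whether slurmctld is ordered after the network is actually up (these are my
guess at the right checks, not something I have run yet):

```
# Show the ordering/dependency directives in the installed unit file
systemctl cat slurmctld.service | grep -E '^(After|Wants|Requires)='

# Show what slurmctld.service waited on during the last boot
systemd-analyze critical-chain slurmctld.service
```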

If I then run `sudo systemctl start slurmctld`, it starts without any errors
and my compute nodes can communicate with the controller.  Running
`slurmctld -Dvvvvvv` in the foreground also works and doesn't print anything
that looks concerning to me.
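
For completeness, the failure above is just what `systemctl status` shows; I
can pull the full boot-time log for the unit if that would help, presumably
with something like:

```
# Everything slurmctld logged during the current boot
journalctl -b -u slurmctld.service
```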

Why would it work manually, but not automatically on boot?  If you need any
more information, please let me know; I'm not sure what is necessary to
diagnose this problem.
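
In case it is relevant, my working guess is that slurmctld comes up before the
address from slurm.conf is assigned to the interface, which would explain the
"Cannot assign requested address" bind failure at boot but not on a manual
start later.  If that is the diagnosis, I assume an override drop-in along
these lines (untested on my end; the file name is just my choice) would be the
fix:

```
# /etc/systemd/system/slurmctld.service.d/wait-for-network.conf
[Unit]
Wants=network-online.target
After=network-online.target
```

followed by `sudo systemctl daemon-reload`, and making sure whatever network
manager is in use actually implements network-online.target.  Does that sound
like the right direction, or am I barking up the wrong tree?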

Thank you,
KM
