The first thing I would check is whether the system clocks on the two
machines are in sync, or at least reasonably close.
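
A rough way to compare the clocks, sketched in Python. It assumes you can
ssh from the controller to the new node without a password prompt and that
`date +%s` works there. The 60-second tolerance and the function names are
placeholders of mine, not anything Slurm defines.

```python
import subprocess
import time


def remote_epoch(host):
    """Read the remote clock (seconds since the epoch) by running date over ssh."""
    result = subprocess.run(
        ["ssh", host, "date", "+%s"],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())


def clock_skew(local_epoch, remote_epoch_value):
    """Positive result: the remote clock is ahead of the local one."""
    return remote_epoch_value - local_epoch


def in_sync(skew_seconds, tolerance=60):
    """True if the clocks differ by at most `tolerance` seconds (assumed limit)."""
    return abs(skew_seconds) <= tolerance


def measure_skew(host):
    """Estimate the skew, using the midpoint of the ssh round trip as local time."""
    before = time.time()
    remote = remote_epoch(host)
    after = time.time()
    return clock_skew((before + after) / 2, remote)
```

Run on the controller, something like `print(measure_skew("node01"))` with
your node's real name in place of `node01` gives the offset in seconds.
Running `date` by hand on both machines tells you the same thing.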
Andy
On 03/02/2016 05:09 PM, Berryhill, Jerome wrote:
I am running Slurm on a small cluster, with the controller on a machine
running RHEL7.1. Slurm has been working fine. I recently added a node
running SLES12.1, and I am trying to get slurmd working on it.
The slurmd log contains a long series of messages like these:
[2016-03-02T06:01:19.790] error: slurm_receive_msg: Zero Bytes were transmitted or received
[2016-03-02T06:01:19.800] error: Unable to register: Zero Bytes were transmitted or received
[2016-03-02T06:01:19.800] debug: Unable to register with slurm controller, retrying
[2016-03-02T06:01:20.800] debug3: CPUs=1 Boards=1 Sockets=1 Cores=1 Threads=1 Memory=64452 TmpDisk=32226 Uptime=89021 CPUSpecList=(null)
[2016-03-02T06:01:20.811] debug: _slurm_recv_timeout at 0 of 4, recv zero bytes
[2016-03-02T06:01:20.811] error: slurm_receive_msg: Zero Bytes were transmitted or received
[2016-03-02T06:01:20.821] error: Unable to register: Zero Bytes were transmitted or received
[2016-03-02T06:01:20.821] debug: Unable to register with slurm controller, retrying
I am using version 14.11.3 on both
machines.
Any suggestions?