Hi!

We're trying to install Slurm 15.08.7 with MUNGE 0.5.11 and are seeing recurring communication errors on the client side:

[initialization]
slurmd: debug3: in the service_connection
slurmd: debug: _slurm_recv_timeout at 0 of 4, recv zero bytes
slurmd: error: slurm_receive_msg_and_forward: Zero Bytes were transmitted or received
slurmd: error: service_connection: slurm_receive_msg: Zero Bytes were transmitted or received
slurmd: debug2: slurm_send_timeout: Socket no longer there
slurmd: debug3: slurm_msg_sendto: peer has disappeared for msg_type=8001
slurmd: debug3: in the service_connection
slurmd: debug2: got this type of message 1008
slurmd: debug3: in the service_connection
slurmd: debug: _slurm_recv_timeout at 0 of 4, recv zero bytes
slurmd: error: slurm_receive_msg_and_forward: Zero Bytes were transmitted or received
slurmd: error: service_connection: slurm_receive_msg: Zero Bytes were transmitted or received
slurmd: debug2: slurm_send_timeout: Socket no longer there
slurmd: debug3: slurm_msg_sendto: peer has disappeared for msg_type=8001
[repeating]

However, running simple commands like

srun -n 16 hostname

works without a problem. Additionally, a quick test in a VM (a smaller installation with fewer packages) works flawlessly.

What could be the cause of these problems?


Best regards,
Stefan
