Hi Ramiro,

You might check to ensure that all of your clocks are in sync (varying by no more than a minute or two).

Andy

On 10/04/2011 07:56 AM, Ramiro Alba wrote:
Sten,


On Tue, 2011-10-04 at 13:45 +0200, Sten Wolf wrote:
Did you create a munge key?

Yes, I did. See local test:

# munge -n | unmunge

STATUS:           Success (0)
ENCODE_HOST:      jff.cttc-jffeth.org (10.2.254.1)
ENCODE_TIME:      2011-10-04 13:54:19 (1317729259)
DECODE_TIME:      2011-10-04 13:54:19 (1317729259)
TTL:              300
CIPHER:           aes128 (4)
MAC:              sha1 (3)
ZIP:              none (0)
UID:              root (0)
GID:              root (0)
LENGTH:           0
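
A local test like the one above encodes and decodes on the same host, so ENCODE_TIME and DECODE_TIME will agree whatever the clocks on the other nodes say. To check clock agreement between two machines, the credential would have to be decoded on a different node (for example by piping `munge -n` through ssh to `unmunge` on that node) and the two timestamps compared against the TTL. Below is a minimal sketch in Python of that comparison, assuming the output layout shown above. The helper name `munge_skew_seconds` and the `SAMPLE` constant are hypothetical and not part of munge or SLURM.

```python
import re

# Sample copied from the local test output quoted in this thread.
SAMPLE = """\
STATUS:           Success (0)
ENCODE_HOST:      jff.cttc-jffeth.org (10.2.254.1)
ENCODE_TIME:      2011-10-04 13:54:19 (1317729259)
DECODE_TIME:      2011-10-04 13:54:19 (1317729259)
TTL:              300
"""


def munge_skew_seconds(unmunge_output):
    """Return DECODE_TIME minus ENCODE_TIME, in seconds.

    Hypothetical helper: it assumes each timestamp line ends with the
    epoch value in parentheses, as in the output quoted above.
    """
    stamps = {}
    for line in unmunge_output.splitlines():
        match = re.match(r"^(ENCODE_TIME|DECODE_TIME):\s+.*\((\d+)\)\s*$", line)
        if match:
            stamps[match.group(1)] = int(match.group(2))
    if len(stamps) != 2:
        raise ValueError("ENCODE_TIME and DECODE_TIME not both found")
    return stamps["DECODE_TIME"] - stamps["ENCODE_TIME"]


if __name__ == "__main__":
    # For output decoded on another node, a large positive or negative
    # value here would point at clock drift between the two hosts.
    print(munge_skew_seconds(SAMPLE))
```

Feeding it output captured from a remote decode, rather than the local sample, is what would actually show whether two nodes disagree about the time.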





On 04/10/2011 13:28, Ramiro Alba wrote:
Hi all,

I am trying to set up a SLURM controller (2.2.7) on Ubuntu 10.04 on a
cluster server, and even with a simple slurm.conf (see attached file) the
'slurmctld' daemon continuously writes the following to the log file:

debug:  _slurm_recv_timeout at 0 of 4, recv zero bytes
error: slurm_receive_msg: Zero Bytes were transmitted or received
error: slurm_receive_msg: Zero Bytes were transmitted or received

You can see in 'slurm.conf' that the same node acts as both a controller
and a compute node. Jobs can be submitted.


Every other cluster node/server (apparently with the same or similar
hardware and the same operating system) works smoothly, without any
errors, whether acting as a controller or as a backup controller.

Can anyone give me an idea of what to look at to get rid of those
error messages?
I've searched the mailing list for similar messages, but none were of
help.

--
Andy Riebs
Hewlett-Packard Company
High Performance Computing
+1-786-263-9743
My opinions are not necessarily those of HP
