Dear developers,

I see the following in the logs:

slurmctld.log
[2015-08-28T08:25:43.370] Updating acct_gather data for <nodelist>

slurmd.log
[2015-08-28T08:26:08.401] debug3: in the service_connection
[2015-08-28T08:26:08.401] debug2: got this type of message 1017
[2015-08-28T08:26:08.401] debug2: Processing RPC: REQUEST_ACCT_GATHER_UPDATE
[2015-08-28T08:26:08.401] debug2: Processing RPC: REQUEST_ACCT_GATHER_UPDATE

What is the meaning of these messages?

The slurmctld tries to contact the daemon on the node (just for status),
but somehow gets no response. The load on both nodes is low, and ping runs fine.

This happens sporadically, at seemingly random times, and makes the node unreliable.

Thanks a lot,
Ulf


PS: Sometimes I get other lines doubled, although this is the only slurmd process
running:
[2015-08-28T08:27:23.475] debug3: in the service_connection
[2015-08-28T08:27:23.476] debug2: got this type of message 1017
[2015-08-28T08:27:23.476] debug2: got this type of message 1017
[2015-08-28T08:27:23.484] debug2: Processing RPC: REQUEST_ACCT_GATHER_UPDATE
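
In case it helps to reproduce this, here is a minimal Python sketch of how such doubled lines could be spotted in a slurmd.log. It is not part of Slurm; the function name find_doubled_messages and the regular expression are only illustrative assumptions, and the sample input is the excerpt above.

```python
import re

# Assumed log layout: "[timestamp] message", as in the excerpts above.
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<msg>.*)$")


def find_doubled_messages(lines):
    """Return (timestamp, message) for every line whose message
    is identical to the message of the line directly before it."""
    doubled = []
    prev_msg = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            # Not a log line in the assumed layout; reset the comparison.
            prev_msg = None
            continue
        msg = match.group("msg")
        if msg == prev_msg:
            doubled.append((match.group("ts"), msg))
        prev_msg = msg
    return doubled


# Sample input copied from the slurmd.log excerpt above.
sample = [
    "[2015-08-28T08:27:23.475] debug3: in the service_connection",
    "[2015-08-28T08:27:23.476] debug2: got this type of message 1017",
    "[2015-08-28T08:27:23.476] debug2: got this type of message 1017",
    "[2015-08-28T08:27:23.484] debug2: Processing RPC: REQUEST_ACCT_GATHER_UPDATE",
]

for ts, msg in find_doubled_messages(sample):
    print(ts, msg)
```

To scan a real file, the list could be replaced by an open file object, since the function only iterates over lines.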


-- 
___________________________________________________________________
Dr. Ulf Markwardt

Technische Universität Dresden
Center for Information Services and High Performance Computing (ZIH)
01062 Dresden, Germany

Phone: (+49) 351/463-33640      WWW:  http://www.tu-dresden.de/zih
