I am currently upgrading a number of things to collectd 4.10.2 and rrdtool
1.4.5. Unfortunately, one of my major problems is still around: when rrdcached
dies or is stopped and then gets restarted, collectd does not reconnect. This
was discussed on IRC at some point, but I never got back to do more testing.
Even with the latest version of rrdtool on both client and server, collectd
goes into a spin, logging messages like:

Dec 29 23:45:40 appbuild01 collectd: collectd startup succeeded
Dec 29 23:49:00 appbuild01 collectd[10338]: rrdcached plugin: rrdc_update (appbuild01.autc.com/cpu-0/cpu-user.rrd, [1293695340:49451049], 1) failed with status -3.
Dec 29 23:49:00 appbuild01 collectd[10338]: Filter subsystem: Built-in target `write': Dispatching value to all write plugins failed with status -1.
Dec 29 23:49:00 appbuild01 collectd[10338]: rrdcached plugin: rrdc_update (appbuild01.autc.com/cpu-0/cpu-nice.rrd, [1293695340:28472], 1) failed with status -3.
Dec 29 23:49:00 appbuild01 collectd[10338]: Filter subsystem: Built-in target `write': Dispatching value to all write plugins failed with status -1.
Dec 29 23:49:00 appbuild01 collectd[10338]: rrdcached plugin: rrdc_update (appbuild01.autc.com/cpu-0/cpu-system.rrd, [1293695340:10117087], 1) failed with status -3.

At this point I have to restart collectd, after which everything is fine
again. The pain is having to do this on more than 300 machines.
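
As a stopgap, something along these lines could at least flag a host that is
stuck in this state. This is only a rough sketch, not anything collectd ships:
the function name and the regular expression are my own, and it assumes each
syslog entry sits on a single line (not wrapped the way a mail client wraps
it).

```python
import re

# Pattern for the rrdcached plugin failure entries quoted above.
# Assumption: one syslog entry per line; the regex is illustrative only.
FAILURE_RE = re.compile(
    r"rrdcached plugin: rrdc_update\s*\((?P<file>[^,]+),.*?\)"
    r"\s*failed\s+with\s+status\s+(?P<status>-?\d+)"
)


def find_rrdc_failures(lines):
    """Return (rrd_file, status) pairs for every rrdc_update failure line."""
    failures = []
    for line in lines:
        match = FAILURE_RE.search(line)
        if match:
            failures.append(
                (match.group("file").strip(), int(match.group("status")))
            )
    return failures
```

Run from cron against the tail of the syslog file, a non-empty result would be
the cue to restart collectd on that host. That only automates the workaround;
it does nothing about the missing reconnect itself.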

I would like to use this post to track this down further. The client for this
particular test was Red Hat EL4 32-bit running collectd 4.10.2 and rrdtool
1.4.5 (patched to remove graphing). The server is EL5 64-bit with rrdtool
1.4.5 (full code).

Ulf.


