Hi all,

I'm having some intermittent issues with collectd on CentOS 6 that I could really use some help debugging.
The specific issue I'm facing is that the output plugins appear to stop reporting and don't start again until collectd is restarted. To be clear: I can't be certain the read plugins are still collecting, and I could use some help figuring that out too.

I don't know what causes this to happen, so I can't reproduce the issue, which makes it _extremely_ difficult to debug.

Initially I thought this was limited to the write_http plugin and some network issue sending metrics to Librato. But since enabling the csv plugin, it appears that both the write_http and csv plugins stop reporting at the same time.

I have the syslog plugin enabled, but haven't seen any helpful messages. I have also tried 'strace -ttt -s 2048 -p $COLLECTD_PID' on the affected nodes, but all that gives me is a bunch of nanosleep syscalls.

I chucked my default config in a gist (yes, it's configured by puppet):
https://gist.github.com/quiffman/11d8b028675bc57f3e13

And FWIW, I initially ran into this on 4.10.9-1.el6 and have since compiled and switched to 5.4.1, which is running on around 30 nodes.

*Any* help or hints would be greatly appreciated.

Thanks in advance,
Rich

--
Richard Guest - GeoNet Senior Software Engineer
GNS Science - Te Pu Ao
D +64-4-570-4854 :: M +64-27-415-1417 :: T +64-4-570-1444 :: F +64-4-570-4600
1 Fairway Drive, Avalon, PO Box 30-368, Lower Hutt, 5040, New Zealand.
http://www.gns.cri.nz/who/staff/2658.html
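
P.S. In case it helps anyone pin down when the csv output goes quiet on a node, here is a minimal Python sketch of a staleness check. It is not part of collectd: the function name stale_dirs, the 300-second threshold and the /var/lib/collectd/csv path in the usage comment are all assumptions for illustration, and it only looks at file modification times, so it can show when writes stopped but not whether the read plugins are still running. It looks at the newest file in each directory, so that older per-day files are not flagged on their own.

```python
import os
import time


def stale_dirs(data_dir, max_age_seconds, now=None):
    """Return (directory, age_seconds) pairs for directories under data_dir
    whose most recently modified file is older than max_age_seconds.

    Only the newest file in each directory is considered, so files left
    behind from earlier days do not trigger a report by themselves.
    """
    now = time.time() if now is None else now
    stale = []
    for root, _dirs, files in os.walk(data_dir):
        mtimes = []
        for name in files:
            try:
                mtimes.append(os.stat(os.path.join(root, name)).st_mtime)
            except OSError:
                # The file vanished between listing and stat; skip it.
                pass
        if mtimes:
            age = now - max(mtimes)
            if age > max_age_seconds:
                stale.append((root, age))
    # Longest-silent directories first.
    return sorted(stale, key=lambda item: item[1], reverse=True)


# Hypothetical usage; the path is an assumption and should match the
# DataDir configured for the csv plugin:
#
#     for path, age in stale_dirs("/var/lib/collectd/csv", 300):
#         print("%8.0fs  %s" % (age, path))
```

Run from cron every few minutes, something like this would at least give a timestamp to line up against syslog the next time the writes stop.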
