[collectd] Debugging NaN values being recorded

Brandon Hume Mon, 22 Oct 2012 09:46:59 -0700

I've got the following collectd arrangement:

    Solaris Zone 1 collectd --.
    Solaris Zone 2 collectd --+--  Linux collectd -> rrd
    Solaris Zone 3 collectd --+
    Solaris Zone 4 collectd --'

So four Solaris zones, which all exist on the same host server, reporting (via network plugin) to collectd running on Linux. It actually works very well.

The binaries and configurations for all four zones are identical, except for Hostname. Most of the stats are working fine, *except* for "fork_rate" from the processes plugin.


This is where it gets weird.

"fork_rate", because these are zones and not full VMs, is the exact same metric across all four. So it's wasteful for me to be recording it four times, but not terribly so - and it helps avoid needing to flip pages when viewing the stats.

However, two of the zones are reporting "NaN" for that metric, while the other two are happily recording real, useful values. Keep in mind that this is effectively the same number being sent by all four zones... I don't think it'd vary that much as each zone's collectd gets CPU time, and not this consistently.

What are my best means of finding out *why* RRD would reject a value? I've checked to make sure the "heartbeat" of each rrd matches the interval... and I've tried turning up syslogging but there's a lot of traffic and it's hard to pick things out when I don't know what I'm looking for.
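
For reference, the heartbeat check can be scripted rather than eyeballed. The following is only a rough sketch, assuming the usual `key = value` text that `rrdtool info` prints; the helper name `parse_rrd_info` and the file paths given on the command line are hypothetical, not part of collectd or rrdtool. It pulls out the step plus each data source's type, heartbeat, min and max, so that a working zone's file can be compared against a NaN-recording one.

```python
import re
import subprocess
import sys

# Matches lines such as: ds[value].minimal_heartbeat = 20
DS_LINE = re.compile(r'^ds\[(?P<ds>[^\]]+)\]\.(?P<key>\w+) = (?P<val>.*)$')
WANTED = ("type", "minimal_heartbeat", "min", "max")


def parse_rrd_info(text):
    """Extract the step and per-data-source settings from `rrdtool info` text.

    Values are kept as strings (e.g. "NaN", "0.0000000000e+00") so nothing
    is lost in conversion; only the step is turned into an int.
    """
    result = {"step": None, "ds": {}}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("step = "):
            result["step"] = int(line.split(" = ", 1)[1])
            continue
        m = DS_LINE.match(line)
        if m and m.group("key") in WANTED:
            ds = result["ds"].setdefault(m.group("ds"), {})
            ds[m.group("key")] = m.group("val").strip('"')
    return result


if __name__ == "__main__":
    # Usage (paths are placeholders): python rrdcheck.py zone1.rrd zone2.rrd
    for path in sys.argv[1:]:
        out = subprocess.run(
            ["rrdtool", "info", path],
            capture_output=True, text=True, check=True,
        ).stdout
        print(path, parse_rrd_info(out))
```

If the step, heartbeat, min and max come out identical for all four files, the definitions are probably not the difference, and the values or timestamps actually arriving would be the next thing to look at.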


Is there a means of detecting rrd rejections?
