matt massie wrote:
Today, Steven Wagner wrote forth saying...
I'll tell you this, though - I'm seeing gaps in both cluster graphs
like you wouldn't believe.  I don't know what's causing it but I
assume it's related to gmetad assuming its data sources are dead so
quickly.  It can't be an "it's an old version" thing because it
happens on both new and old versions...


if you take a look at data_thread.c (around line 74), you'll see how the C
gmetad is pulling the data.  i'm using 10 second timeouts per 1024 bytes
of data read.  this means that if a data source is unable to deliver 102.4
bytes/sec (about 820 bits/s), it is considered down.  you might play with
the values to see what works best for you.  if the timeout is too long,
then you'll miss RRD heartbeats and have dead spots anyway.  the gaps are
good indications of transient network connectivity problems.
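
For anyone who wants to experiment with the two values mentioned above, here is a minimal sketch of a chunked read with a per-chunk timeout. It is not the code from data_thread.c: the function name read_source, the two macros, and the use of poll() are all assumptions made for illustration. It waits up to the timeout for data to become readable, then reads at most 1024 bytes, and gives up on the source if the wait expires.

```c
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK_BYTES      1024   /* chunk size quoted in the message */
#define CHUNK_TIMEOUT_MS 10000  /* 10 second timeout quoted in the message */

/*
 * Hypothetical helper, not gmetad's actual function.
 * Reads from fd into buf until EOF or until cap bytes are stored,
 * waiting at most timeout_ms before each read of up to CHUNK_BYTES.
 * Returns the number of bytes read, or -1 if the wait timed out or
 * poll()/read() failed (the caller would then mark the source down).
 */
ssize_t read_source(int fd, char *buf, size_t cap, int timeout_ms)
{
    size_t total = 0;

    while (total < cap) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int ready = poll(&pfd, 1, timeout_ms);

        if (ready <= 0)
            return -1;          /* 0 means timeout, negative means error */

        size_t want = cap - total;
        if (want > CHUNK_BYTES)
            want = CHUNK_BYTES;

        ssize_t n = read(fd, buf + total, want);
        if (n < 0)
            return -1;          /* read error */
        if (n == 0)
            break;              /* EOF: the source finished sending */

        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

Calling read_source(fd, buf, sizeof buf, CHUNK_TIMEOUT_MS) reproduces the values quoted above. Note that this sketch treats an interrupted poll() as a failure and restarts the timer on every read, even a short one, so it is more lenient than a strict 1024-bytes-per-10-seconds budget.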

I'll tweak these values after lunch, but so far it's looking like two chunks of code are "at fault," one controlling the dead-source flag for new monitoring sources and one controlling the dead-source flag for pre-2.5.0 sources. My old source has far more dropouts (lasting 60 seconds or more) than the new source.

And the perl gmetad never had trouble polling 'em. So I don't know what the deal is at this point...

