i've just checked in changes to the monitoring core.
a little vocabulary:

  T0 (T-zero)  the time gmond received its last message
  Tn           the number of seconds elapsed since T0
  Tmax         the max number of seconds between multicast messages
so if Tn > Tmax we know that either a multicast message was lost or the
host is down. all versions of ganglia before now have assumed that if
Tn > Tmax*4 then the host is dead.
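
a rough python sketch of the two thresholds described above, just to make the rule concrete. the function name, the state labels and the `now` parameter are made up for illustration and are not part of gmond:

```python
import time

DEAD_MULTIPLIER = 4  # from the rule above: Tn > Tmax*4 means the host is dead


def host_state(t0, tmax, now=None):
    """Classify a host from T0 and Tmax (hypothetical helper, not gmond code)."""
    if now is None:
        now = time.time()
    tn = now - t0  # Tn: seconds elapsed since the last message
    if tn > tmax * DEAD_MULTIPLIER:
        return "dead"
    if tn > tmax:
        # either a multicast message was lost or the host is down
        return "late"
    return "ok"
```

for example, with Tmax = 20 a host last heard from 21 seconds ago would be "late", and one last heard from 81 seconds ago would be "dead".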
i've also added

  Dmax         the number of seconds until this data will be deleted
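
one way Dmax could be applied, as a small python sketch. it assumes the Dmax clock starts at T0 and that data is dropped once Tn is greater than Dmax; both are my assumptions, and the dict layout and function name are hypothetical:

```python
def prune_expired(metrics, now):
    """Keep only entries whose age Tn has not exceeded their Dmax.

    `metrics` maps a name to {"t0": ..., "dmax": ...}; this layout is
    hypothetical and only serves the illustration.
    """
    return {
        name: m
        for name, m in metrics.items()
        if now - m["t0"] <= m["dmax"]  # assumption: delete once Tn > Dmax
    }
```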
lastly, you will see a SLOPE attribute added to the METRIC tag. it has
four possible values: ZERO, POSITIVE, NEGATIVE, or BOTH.
pseudo-code:

  switch (SLOPE) {
    case ZERO:
      /* constant value: no historical information should be saved */
      break;
    case POSITIVE:
    case NEGATIVE:
      /* the historical information should be saved as a "counter"
         in the round-robin databases */
      break;
    case BOTH:
      /* the historical information will be saved as a "gauge" in the
         round-robin databases (which is what it does now) */
      break;
  }
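
the same decision as a runnable python sketch. the function name and the use of None for "don't store any history" are illustrative choices of mine; the "COUNTER" and "GAUGE" strings simply mirror the counter and gauge terms above:

```python
# maps each SLOPE value to how its history would be stored;
# None means the metric is constant and gets no history at all
SLOPE_TO_STORAGE = {
    "ZERO": None,
    "POSITIVE": "COUNTER",
    "NEGATIVE": "COUNTER",
    "BOTH": "GAUGE",
}


def storage_for_slope(slope):
    """Return the storage type for a SLOPE value (hypothetical helper)."""
    try:
        return SLOPE_TO_STORAGE[slope.upper()]
    except KeyError:
        raise ValueError("unknown SLOPE value: %r" % (slope,))
```

a consumer of the XML could then skip creating a round-robin database whenever storage_for_slope() returns None.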
these changes will allow us to make gmetad and the web frontend much more
efficient by removing constant metrics from the round-robin db tree and
also allow us to create custom rrds based on the metric characteristics.
i also changed this release to be 2.5.0. i hope to package it up and
release it this week (but then again i feel i've been saying that for
weeks... we'll see)
time for a game of delta force 3
-matt