Hi Alex,

On Thursday 08 February 2007 18:31, Alex Balk wrote:
> Why would you consider gmetrics to be "second class"? I find those to be
> much more useful than the builtin metrics.

Ah, sorry; a slight misunderstanding here.

I'm considering gmetric metrics as "second class" just because of how the 
metrics are XDR-encoded for their transport to a gmond.  I'm not saying 
they're less useful than the gmond-provided metrics, just that they're 
treated differently when encoded into the UDP multicast packet.

gmetric metrics are always encoded as strings ("string value<>" in 
protocol.x).  For numeric metrics, this can be very inefficient in terms 
of space.  A binary format would be better, and this is what the "core" 
metrics use.
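
To give a rough feel for the size difference, here is a minimal sketch in 
Python.  It only follows the generic XDR rules (a 4-byte big-endian length, 
then the bytes padded to a multiple of four; a double is 8 bytes, an 
unsigned int is 4).  It is not the actual Ganglia packet layout, and the 
helper names and sample value are made up for illustration.

```python
import struct

# Generic XDR primitives only; NOT the real Ganglia message layout.
# Helper names here are hypothetical.

def xdr_string(s: str) -> bytes:
    """XDR string: 4-byte big-endian length, data, zero-padded to 4 bytes."""
    data = s.encode("ascii")
    pad = (-len(data)) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * pad

def xdr_double(x: float) -> bytes:
    """XDR double: 8-byte big-endian IEEE 754."""
    return struct.pack(">d", x)

def xdr_uint(n: int) -> bytes:
    """XDR unsigned int: 4-byte big-endian."""
    return struct.pack(">I", n)

value = 1234567.89                      # made-up sample metric value
as_string = xdr_string(repr(value))     # the value sent as text
as_double = xdr_double(value)           # the same value sent in binary

big = 4294967295                        # largest 32-bit unsigned value
big_as_string = xdr_string(str(big))
big_as_uint = xdr_uint(big)

print(len(as_string), len(as_double))       # string vs. double, in bytes
print(len(big_as_string), len(big_as_uint)) # string vs. uint, in bytes
```

The string form pays for the length word and padding on top of one byte per 
digit, so the gap grows with the number of digits in the value.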

Currently, only "core" metrics can be binary encoded.  One can only extend the 
list of core metrics by forking ganglia and making a version that's pretty much 
incompatible with the ganglia.sf.net version (with the caveat that a 
ganglia.sf.net-gmond can still publish to the new gmond, allowing some level 
of interoperability).  This is the approach taken by the Swiss binary-only 
Windows version of gmond.

Simply allowing gmetric data to be binary encoded would be a great 
improvement; but perhaps a better approach would be to rethink the encoding 
so there's no explicit mention of gmond metrics: none of the metrics would be 
more important than others.

> In fact, the only thing 
> that's nice about builtin metrics (other than the fact that you get them
> out-of-the-box) is that they get reported even when the machine is under
> extremely high load.

Hmmm: interesting.  How are you sending gmetric data?  Are you running gmetric 
from cron?  How are you generating the metrics?  Do you know how much 
overhead is involved sending gmetric data?

I've my own low-overhead gmetric-like solution (monami) that should just keep 
on chuggin', like gmond, even when the machine is heavily loaded.

So, I'm interested in how monami does in difficult high-load situations.  
Perhaps we could talk more off-line about this if it's going a little 
off-topic for the list.

> I view Ganglia as a framework rather than a performance monitoring
> solution. Anything can be encoded in the UDP messages through the use of
> gmetric, and specialized web interfaces can be quite easily built
> through the use of the "custom metrics addon" I'd written a while back (
> http://wtf.ath.cx/screenshots.html ). So you can not only access the XML
> data on machine, cluster & grid levels, you can also generate whatever
> UIs you want for your users... regardless of whether the metrics are
> hardcoded or brought in from gmetric.

Yes, sure.  I'd imagine some people look at Ganglia as a complete solution, 
others as a building block.

It's really just a question of where the distinction between gmond- and 
gmetric-generated metrics is lost.

When gmetad reads XML from a gmond, data from both sources is pretty much 
equivalent (except for the SOURCE attribute).  After that, the data should 
hit rrdtool as equivalent.
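
As a small illustration of that point, the sketch below groups metrics by 
their SOURCE attribute.  The XML fragment is a hand-written, cut-down 
example rather than real gmond output: the host name, metric names and 
values are invented, and real output carries more attributes and enclosing 
elements.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hand-written, simplified sample; NOT captured from a real gmond.
SAMPLE = """
<CLUSTER NAME="example">
  <HOST NAME="node1">
    <METRIC NAME="load_one" VAL="0.42" TYPE="float" SOURCE="gmond"/>
    <METRIC NAME="queue_depth" VAL="17" TYPE="uint32" SOURCE="gmetric"/>
  </HOST>
</CLUSTER>
"""

def metrics_by_source(xml_text: str) -> dict:
    """Group (name, value) pairs by each METRIC element's SOURCE attribute."""
    root = ET.fromstring(xml_text.strip())
    grouped = defaultdict(list)
    for metric in root.iter("METRIC"):
        source = metric.get("SOURCE", "unknown")
        grouped[source].append((metric.get("NAME"), metric.get("VAL")))
    return dict(grouped)

for source, metrics in metrics_by_source(SAMPLE).items():
    print(source, metrics)
```

Apart from that one attribute, both kinds of metric are the same sort of 
element, so anything downstream can treat them alike.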

Thereafter, the only difference is that the default web pages are set up to 
expect core metrics and display them nicely.  This is where your custom 
metrics addon comes in useful.

> The only thing left on my wishlist is support for 64bit metrics, to
> achieve better scalability over aggregated data.

BTW, you might want to add a wish-list/todo bugzilla entry, just so it doesn't 
get lost.

Cheers,

Paul.
