Changed the subject because this part of the discussion is important enough to
have its own thread.

On Dec 20, 2009, at 8:55 AM, Jesse Becker wrote:

On Sun, Dec 20, 2009 at 11:02, Spike Spiegel
<[email protected]> wrote:
...
I think there's a middle ground here that'd be interesting to explore,
although that's a different thread, but for kicks this is the gist: the
common pattern for RRD storage is hour/day/month/year and I've always
found it bogus.  In many cases I've needed higher resolution (down to
the second) for the last 5-20 minutes, then intervals of an hour to a
couple of hours, then a day to three days, then a week to three weeks,
and so on.  That increases your storage requirements, but it is IMHO
not an abuse of RRD and still retains the many advantages of RRD over
having to maintain an RDBMS.

The d/w/m/y split is a good *starting point*.  Ganglia needs to ship
with some sort of sensible default configuration that essentially
works for many/most people.  You (singular or plural) are free to
customize your RRD configuration as policy and storage capacity
require and permit; Ganglia officially supports this via the RRAs
config line in gmetad.conf.  In the ideal world you keep all data, at
the highest resolution, forever, but that usually isn't practical.
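
(For anyone following along, that knob is the RRAs directive in
gmetad.conf.  A retention layout roughly along the lines Spike
describes might look something like the following -- the step and row
counts are purely illustrative and assume a 15-second base step:

  RRAs "RRA:AVERAGE:0.5:1:80" "RRA:AVERAGE:0.5:4:180"
       "RRA:AVERAGE:0.5:40:288" "RRA:AVERAGE:0.5:240:504"
       "RRA:AVERAGE:0.5:5760:400"

In practice that all goes on a single RRAs line; it's wrapped here for
readability.  That is: 15s samples for the last 20 minutes, 1-minute
averages for 3 hours, 10-minute averages for 2 days, hourly averages
for 3 weeks, and daily averages for roughly 400 days.)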


We like to have different resolutions for specific metrics, which Ganglia 
doesn't support directly in any way.  Also, as far as scaling the data "up" 
goes, it would be very handy to retain high-resolution data longer for just 
the summary metrics (they're what you need for planning anyway).

As an illustration: on one of our clusters, I have about 10 metrics (per host) 
where we retain the 15s samples for more than a week, resulting in a nearly 
1 MB rrdtool file per metric.  The rest stay at the more typical ~12 KB size 
we run around here.
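
(Rough back-of-the-envelope on that size, for anyone curious -- rrdtool
stores each consolidated value as an 8-byte double per data source:

  7 days x 86,400 s/day / 15 s  =  40,320 rows  x  8 bytes  ~=  315 KB

for a single week-long 15s AVERAGE RRA.  Keep it somewhat longer than a
week, possibly with MIN/MAX RRAs alongside, add the coarser RRAs and the
file header, and you land in the neighborhood of 1 MB per file.)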

In the same vein, we've tried using the Holt-Winters features in rrdtool, but 
different metrics need very different tunings, so there's no single 'RRA' that 
works for a whole grid.
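
(To make that concrete: the Holt-Winters knobs -- alpha, beta and the
seasonal period -- live right in the RRA definition, per rrdcreate's
HWPREDICT syntax, so two metrics can want definitions as different as,
say (values purely illustrative, assuming a 5-minute step):

  RRA:HWPREDICT:1440:0.1:0.0035:288   - adapts slowly, smooth daily season
  RRA:HWPREDICT:1440:0.5:0.1:288      - tracks recent behaviour much more
                                        aggressively

and one grid-wide choice will over- or under-smooth half of them.)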

At the moment, this is pretty hard to set up -- you have to create the
RRD file outside Ganglia with the specialized policy you want.  When
thinking about how to provide plugins or other extensibility hooks for
gmetad, this is among the first places I'd look: hand the rrd create
helper function all of the metadata about a metric -- is it a cluster
or grid summary?  is it per-host?  what are the hostname and metric
name?  any other annotations (we're on 3.0, so all of our annotations
live in the metric naming convention)?  -- and make the configuration
language flexible enough to express a whole little pattern language
for deciding which RRAs to create.  (A rough sketch of what such a
hook might look like is below.)
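
Purely as a strawman -- nothing like this exists in gmetad today, and
every name below is hypothetical -- the shape of the hook I have in
mind would be roughly:

  /* Hypothetical metadata gmetad would hand to an RRA-policy hook
   * before creating an RRD for a metric (sketch only). */
  typedef struct {
      const char *metric_name;   /* e.g. "cpu_user" */
      const char *hostname;      /* NULL for summary RRDs */
      const char *cluster_name;
      int is_summary;            /* cluster/grid summary vs. per-host */
      int is_grid;               /* grid-level vs. cluster-level summary */
  } rrd_metric_info_t;

  /* Returns the RRA definitions to pass to rrd_create(), or NULL to
   * fall back to the global RRAs from gmetad.conf. */
  typedef const char **(*rra_policy_hook_t)(const rrd_metric_info_t *info,
                                            unsigned int *num_rras);

A plugin (or a richer gmetad.conf syntax) could then match on the
hostname and metric name and hand back one RRA set for summary metrics
and another for per-host ones.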

Of course, with more flexibility, you'll get it wrong more often, so you'd also 
be stuck with some less-than-perfect RRD merges after fixes, but it sure would 
be handy.

The direction of using the "python gmetad" to experiment with this sort of 
extensibility concerns me, since the need grows with installation size: I 
need this more in my biggest grids than in the smaller ones, and those big 
grids are already running right at the threshold of "gapping" with the C 
implementation.


-- ReC


