I guess a lot of the conversation depends on what you want and expect 
Ganglia to be used for. For example, there are a lot of people out there 
who use Ganglia for performance monitoring and Nagios NRPE 
to get user-level stats from the host. To me that is redundant. So if 
you decide to use Ganglia to provide metrics to e.g. Nagios, 
you will have to go the route of parsing the gmond XML. I checked on my 
cluster and each host uses about 15 kBytes (average) of XML to define 
its metrics. This works well in small to mid-size clusters, but as soon 
as you get over a certain threshold it breaks down. Let's say

200 hosts * 15 kB = 3 MB

If I wanted to keep track of one metric on every host, with one check 
per host per minute and each check fetching the whole XML tree, that 
would be 200 * 3 MB = about 600 MBytes of traffic per minute, or 
10 MBytes/sec. Add more metrics that need to be checked, e.g. swap_free, 
and you may be doing quite a bit of network traffic. This is just to 
serve the XML; it doesn't take into account the overhead of processing 
and parsing the data.
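
The arithmetic above can be spelled out in a few lines of Python. This is only a rough sketch: the function name and parameters are made up for illustration, 1 MB is taken as 1000 kB, and the rate of one check per host per minute is an assumption implied by the 600 MBytes/minute figure rather than something measured.

```python
# Back-of-the-envelope traffic estimate using the figures from this mail.
# Assumption: every check fetches the whole XML tree, and each host is
# checked once per minute per metric (implied by the numbers, not measured).

def xml_traffic(hosts, kb_per_host, metrics_per_host=1, checks_per_minute=1):
    """Return (tree_mb, mb_per_minute, mb_per_second), with 1 MB = 1000 kB."""
    tree_mb = hosts * kb_per_host / 1000.0
    fetches_per_minute = hosts * metrics_per_host * checks_per_minute
    mb_per_minute = fetches_per_minute * tree_mb
    return tree_mb, mb_per_minute, mb_per_minute / 60.0


tree_mb, per_minute, per_second = xml_traffic(hosts=200, kb_per_host=15)
print("XML tree: %.1f MB" % tree_mb)
print("one metric per host: %.0f MB/min, %.0f MB/s" % (per_minute, per_second))
```

Under these assumptions the cost grows with the square of the host count, since both the size of the tree and the number of fetches scale with the number of hosts.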

You'll say wait a minute :-) if I was doing such a thing I would cache 
the data etc. I hear some people are doing just that, i.e. storing the XML 
on local storage. I have a couple of ideas myself, but the point is that 
such a setup requires yet another thing to set up, monitor and maintain.
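
As a rough illustration of what such a cache involves, here is a minimal Python sketch that parses the XML once and answers lookups from memory until the copy is older than max_age seconds. Everything named here (index_metrics, MetricCache, fake_fetch, the 15 second max_age) is hypothetical, and the sample XML is a trimmed, made-up dump in the shape gmond produces. In real use the fetch callable would read the dump from gmond's TCP port (8649 in the default configuration) instead of returning a constant.

```python
import time
import xml.etree.ElementTree as ET

# Trimmed, made-up sample in the shape of a gmond XML dump.
SAMPLE_XML = """<GANGLIA_XML VERSION="3.1.2" SOURCE="gmond">
  <CLUSTER NAME="web">
    <HOST NAME="web1">
      <METRIC NAME="load_one" VAL="0.42" TYPE="float" UNITS=""/>
      <METRIC NAME="swap_free" VAL="1048576" TYPE="uint32" UNITS="KB"/>
    </HOST>
    <HOST NAME="web2">
      <METRIC NAME="load_one" VAL="1.10" TYPE="float" UNITS=""/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>"""


def index_metrics(xml_text):
    """Parse a dump once and return {(host, metric): value_string}."""
    root = ET.fromstring(xml_text)
    index = {}
    for host in root.iter("HOST"):
        for metric in host.iter("METRIC"):
            index[(host.get("NAME"), metric.get("NAME"))] = metric.get("VAL")
    return index


class MetricCache:
    """Hypothetical cache: refetch the XML only when the copy is stale."""

    def __init__(self, fetch, max_age=15.0, clock=time.time):
        self.fetch = fetch          # callable returning the XML as a string
        self.max_age = max_age      # seconds before the copy counts as stale
        self.clock = clock
        self.index = {}
        self.fetched_at = None

    def get(self, host, metric):
        now = self.clock()
        if self.fetched_at is None or now - self.fetched_at > self.max_age:
            self.index = index_metrics(self.fetch())
            self.fetched_at = now
        return self.index.get((host, metric))  # None if unknown


fetch_count = 0

def fake_fetch():
    """Stand-in for reading the dump from gmond over the network."""
    global fetch_count
    fetch_count += 1
    return SAMPLE_XML


cache = MetricCache(fake_fetch)
print(cache.get("web1", "load_one"))
print(cache.get("web2", "load_one"))
print("fetches so far:", fetch_count)
```

Even this toy version has a staleness setting to tune and a fetch path that can fail, which is the maintenance cost mentioned above.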

Also, perhaps a REST API is not really the way to go, and a simple HTTP 
interface would suffice.

I hope this makes sense :-).

Vladimir

---- Original message ----


Spike Spiegel
Thu, 17 Sep 2009 19:09:21 -0700

On Fri, Sep 18, 2009 at 8:32 AM, Bernard Li <bern...@vanhpc.org> wrote:
> Forwarding this to ganglia-developers since this is a more -devel
> related discussion.  Also can get spike's opinions in ;-)

remember that you asked for it :P

> On Wed, Sep 16, 2009 at 11:49 AM, Vladimir Vuksan <vli...@veus.hr> wrote:
>> There have been some tweets that someone was working on a REST interface
>> for Ganglia.

I would have loved to see something more than a tweet about that
(which I haven't seen either, but was just told about). do you have any
more info? what kind of REST interface? it can mean a lot of things
and nothing.

> At first I thought it wasn't such a big deal

Care to share why that is? Personally I'd find it a great addition and
a basic requirement to make extensibility and interoperability with
other applications possible (of course it can be argued that given the
user base and scope there is no interest in doing so).

>> but I think that
>> adding a simplistic interface to Ganglia would be a nice addition ie.
>> something like
>>
>>> telnet ganglia 8653
>> METRIC web1 load_one
>>
>> Which would echo out the current value for load_one. That way you can
>> avoid parsing out the XML to get those values. I think for large sites it
>> makes a lot of sense. Granted there are "workarounds" that could be
>> implemented and people have.

as one of those people I wonder what a new interface like that
changes: as you say, the only difference would be making client-side xml
parsing unnecessary, which imho is not the problem here.

What I'd like to see is a way to access *all* the data gmetad knows
about, which means both what's in memory and inside the rrds, and
being able to do so for multiple nodes at the same time (I sent a
patch for multiple nodes request a while ago that maybe I should try
to push for again). The same interface, with obviously only in-memory
values available, should exist for gmond.

Also, I wouldn't make up another port for it, but rather use 8652 and
extend the already supported control parameters. So for example you'd
use the interface like this:
telnet ganglia 8652
/grid/cluster/host1/metric1/time[interval];/grid/cluster/host2/metric1;...?format=text
lastupdated time host1 metric1 value[s]
lastupdated time host2 metric1 value

if you don't specify a time it's assumed you want most recent reading
and it's fetched from memory, otherwise you get it from the rrd. The
?format=text regulates if you get the classic xml output (default if
format isn't specified) and that could be amended to be json.

something like that to me would start to make a lot more sense, but
it's still not a REST api to which you can speak http and use known
methods to do useful things like caching results.
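
To make the request line proposed above more concrete, here is a rough Python sketch of how it might be parsed. Nothing here exists in gmetad: the function name, the treatment of the optional fifth path component as an opaque time spec, the "1h" example value, and the use of "&" to separate options are all assumptions made for illustration.

```python
# Hypothetical parser for the proposed request line, e.g.
#   /grid/cluster/host1/metric1/1h;/grid/cluster/host2/metric1?format=text
# Assumptions: ';' separates targets, an optional fifth path component is an
# opaque time spec, and options after '?' are key=value pairs joined by '&'.

def parse_request(line):
    """Return (format, targets) for one request line."""
    query, _, options = line.strip().partition("?")

    fmt = "xml"  # classic xml output is the default when format is omitted
    for option in options.split("&"):
        key, _, value = option.partition("=")
        if key == "format" and value:
            fmt = value

    targets = []
    for path in query.split(";"):
        parts = [p for p in path.split("/") if p]
        if not parts:
            continue  # tolerate a trailing ';'
        if len(parts) not in (4, 5):
            raise ValueError("expected /grid/cluster/host/metric[/time]: %r" % path)
        grid, cluster, host, metric = parts[:4]
        time_spec = parts[4] if len(parts) == 5 else None
        targets.append({
            "grid": grid,
            "cluster": cluster,
            "host": host,
            "metric": metric,
            "time": time_spec,
            # no time given: most recent reading from memory, else the rrd
            "source": "rrd" if time_spec else "memory",
        })
    return fmt, targets


fmt, targets = parse_request(
    "/grid/cluster/host1/metric1/1h;/grid/cluster/host2/metric1?format=text")
print(fmt)
for target in targets:
    print(target["host"], target["metric"], target["source"])
```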

let's keep this discussion going.

Spike


