You almost certainly don't need to work as hard as you think you do on the
protocol side.
The architecture of Ganglia is very nicely separated between the monitoring
and what I'll call 'local data collection' daemons (gmond), the grid-level
collection daemon (gmetad), and the web front-end.
So what you want to do is write a Java-based equivalent of gmetad, and then
figure out whatever display strategy you want.
Gmetad *polls* the appropriate gmonds (see the documentation for the
'data_source' configuration parameter for gmetad.conf) every so often (15s
by default).
The protocol for 3.0 is as simple as pie: connect with TCP and read until
the gmond closes (shuts down) the socket. I don't think there's a
difference for 3.1, but I haven't coded to that interface.
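
To make the "connect and read until close" step concrete, here is a minimal
Java sketch. The class and method names (GmondReader, readXml) are made up
for illustration, and the default port 8649 is an assumption; check the
tcp_accept_channel section of your gmond.conf. It returns raw bytes so the
XML parser can honor whatever encoding the document declares.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.net.Socket;

public class GmondReader {
    /** Connects to a gmond and reads until the daemon closes the connection. */
    public static byte[] readXml(String host, int port, int timeoutMillis) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            socket.setSoTimeout(timeoutMillis); // fail rather than hang on a stuck gmond
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            InputStream in = socket.getInputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) { // -1 means the peer closed the socket
                buf.write(chunk, 0, n);
            }
            return buf.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8649; // assumed default
        byte[] xml = readXml(host, port, 5000);
        System.out.write(xml, 0, xml.length);
        System.out.flush();
    }
}
```

There is no request to send; the sketch only reads, as described above.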
The payload is an XML document. At least in 3.0, it contained its own DTD.
You should be able to parse these with just about any XML toolkit. It's
nicely hierarchical, with GRIDs (often null) containing CLUSTERs, which
contain HOSTs, which contain METRICs.
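
As a rough illustration of walking that hierarchy with the JDK's built-in
DOM parser: the sample document below is hand-written and heavily trimmed,
not real gmond output, and I'm assuming the per-machine element is named
HOST and that metrics carry NAME and VAL attributes. GmondXmlWalk and
flatten are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class GmondXmlWalk {
    // Hand-written sample; a real document has many more attributes and an inline DTD.
    public static final String SAMPLE =
        "<GANGLIA_XML>"
        + "<CLUSTER NAME=\"web\">"
        + "<HOST NAME=\"node1\">"
        + "<METRIC NAME=\"load_one\" VAL=\"0.42\"/>"
        + "<METRIC NAME=\"cpu_num\" VAL=\"4\"/>"
        + "</HOST>"
        + "</CLUSTER>"
        + "</GANGLIA_XML>";

    /** Flattens the document into "cluster/host/metric=value" strings. */
    public static List<String> flatten(byte[] xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml));
        List<String> out = new ArrayList<>();
        NodeList clusters = doc.getElementsByTagName("CLUSTER");
        for (int c = 0; c < clusters.getLength(); c++) {
            Element cluster = (Element) clusters.item(c);
            NodeList hosts = cluster.getElementsByTagName("HOST");
            for (int h = 0; h < hosts.getLength(); h++) {
                Element host = (Element) hosts.item(h);
                NodeList metrics = host.getElementsByTagName("METRIC");
                for (int m = 0; m < metrics.getLength(); m++) {
                    Element metric = (Element) metrics.item(m);
                    out.add(cluster.getAttribute("NAME") + "/"
                            + host.getAttribute("NAME") + "/"
                            + metric.getAttribute("NAME") + "="
                            + metric.getAttribute("VAL"));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        for (String line : flatten(SAMPLE.getBytes(StandardCharsets.UTF_8))) {
            System.out.println(line);
        }
    }
}
```

For documents of the size discussed further down, a streaming parser (SAX
or StAX) would probably be a better fit than DOM; DOM just keeps the sketch
short.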
The definitions for the fields (especially the time fields) can be a little
odd, but they're still what they were in the original Ganglia paper,
referenced on the wiki. The important ones to be sure you understand are the
time attributes of every metric: TN, TMAX, and DMAX.
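
Here is one way those time attributes might be used on the receiving side.
This is a sketch under my own reading of them, so verify it against the
paper: I'm assuming the attributes appear as TN (seconds since the metric
was last updated), TMAX (the longest expected gap between updates) and DMAX
(seconds after which the metric is dropped, with 0 meaning never). The
"late" multiplier is an arbitrary parameter of this sketch, and
MetricFreshness is a made-up name.

```java
public class MetricFreshness {
    public enum State { FRESH, LATE, EXPIRED }

    /**
     * Classifies a metric from its time attributes (all in seconds).
     * lateFactor is a tolerance chosen by the caller, not a Ganglia constant.
     */
    public static State classify(long tn, long tmax, long dmax, int lateFactor) {
        if (dmax > 0 && tn > dmax) {
            return State.EXPIRED; // older than its deletion deadline
        }
        if (tmax > 0 && tn > tmax * lateFactor) {
            return State.LATE; // well past the expected update interval
        }
        return State.FRESH;
    }

    public static void main(String[] args) {
        System.out.println(classify(5, 20, 0, 4));    // updated 5s ago, expected every 20s
        System.out.println(classify(100, 20, 0, 4));  // 100s ago, expected every 20s
        System.out.println(classify(100, 20, 60, 4)); // same, but DMAX of 60s has passed
    }
}
```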
One caveat: in 3.0, the XML document would include both the definitions of
the metrics, and their values. It's my understanding (perhaps incorrect)
that in 3.1, the definitions (meta-data) aren't sent every time. You'll
want to read the gmond.conf documentation to identify when they are.
So what your architecture calls for doesn't require a modification of any
Ganglia code at all. It's a replacement of gmetad & the ganglia front-end.
You can use gmond completely unmodified, and I'd recommend that since it
sounds like your actual requirements are to change the metric recording and
presentation layer, not the collection layer.
You do have to sample your gmonds at least as often as the most-sampled
metric, since gmonds don't record any history, just the current value.
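
A fixed-rate polling loop along those lines could look like the following.
It is only a sketch: GmondPoller is a hypothetical name, and the fetch step
is passed in as a Callable so that whatever code reads the XML from gmond
can be plugged in (a stand-in is used in main).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class GmondPoller implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    /** Calls fetch every periodMillis and hands each result to sink. */
    public void start(Callable<byte[]> fetch, Consumer<byte[]> sink, long periodMillis) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                sink.accept(fetch.call());
            } catch (Exception e) {
                // Catch everything: an exception escaping the task would
                // silently cancel all later runs, i.e. every later sample.
                System.err.println("poll failed: " + e);
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch threeSamples = new CountDownLatch(3);
        try (GmondPoller poller = new GmondPoller()) {
            // Stand-in fetch; a real one would read the XML from a gmond.
            poller.start(() -> "<GANGLIA_XML/>".getBytes(), xml -> {
                System.out.println("got " + xml.length + " bytes");
                threeSamples.countDown();
            }, 100);
            threeSamples.await(5, TimeUnit.SECONDS);
        }
    }
}
```

With a 15s metric you would pass 15000 (or less) as the period; a sample
that fails is simply lost, since gmond will not replay it.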
Where you will have to work harder than you might think is in recording a
large number of diverse metrics in a usably accurate way at speed. Any one
XML document for even a basic cluster (say 50 machines monitoring a few
application metrics in addition to the standard ones) represents parsing and
recording 5,000-10,000 measurements in whatever statistics store you're
using, every 15s (or faster). If you're also summing things up across other
categories (grids, clusters, applications, etc.), you have to deal with
making those updates consistent. RRDtool, when run on a RAM disk or behind
rrdcached, does a pretty credible job of keeping the statistics working in
the face of missed samples, samples taken a second or two late, etc. Gmetad,
by focusing on one cluster at a time, avoids having a lot of lock contention
(except when trying to summarize grid metrics out of cluster metrics).
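
One way to keep that contention down in a Java replacement, sketched under
my own assumptions rather than taken from how gmetad does it: let each
cluster's poller compute its sums privately and publish them as an
immutable snapshot, so the grid-level summary only ever reads finished
snapshots. GridSummary and its methods are hypothetical names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class GridSummary {
    // cluster name -> (metric name -> sum over hosts); replaced wholesale per poll
    private final Map<String, Map<String, Double>> perCluster = new ConcurrentHashMap<>();

    /** Called by the one thread that owns a cluster, after it has parsed a full document. */
    public void publish(String cluster, Map<String, Double> sums) {
        // Defensive copy so later changes by the caller cannot leak into the snapshot.
        perCluster.put(cluster, Collections.unmodifiableMap(new HashMap<>(sums)));
    }

    /** Sums one metric across the latest snapshot of every cluster. */
    public double gridTotal(String metric) {
        double total = 0.0;
        for (Map<String, Double> sums : perCluster.values()) {
            total += sums.getOrDefault(metric, 0.0);
        }
        return total;
    }

    public static void main(String[] args) {
        GridSummary grid = new GridSummary();
        Map<String, Double> web = new HashMap<>();
        web.put("load_one", 1.5);
        Map<String, Double> db = new HashMap<>();
        db.put("load_one", 2.5);
        grid.publish("web", web);
        grid.publish("db", db);
        System.out.println(grid.gridTotal("load_one"));
    }
}
```

The trade-off is that a grid total can mix snapshots taken a few seconds
apart; whether that is "usably accurate" depends on your sampling rate.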
I hope you can come up with a solution that scales that up more and allows
for faster sampling rates; looking forward to your results.
-- ReC
On Wed, Nov 3, 2010 at 3:50 AM, Afef MDHAFFAR <[email protected]> wrote:
> Hello,
>
> I am trying to use Ganglia for my research works.
> So the idea consists of modifying Ganglia in order to get a new Ganglia
> tool which sends informations (monitored data) via TCP/IP sockets to a Java
> server instead of saving it in a RRD database.
> Can you help me in that way? I need name of files, methods that have to be
> modified to reach this goal.
> Also, I am using Mac OS and I want to import ganglia into Xcode in order to
> compile and build it. Can you please send me a tutorial for that?
>
> Thank you,
>
> Best regards
>
> --
> Afef MDHAFFAR
> http://www.redcad.org/members/mdhaffar/
>
> ------------------------------------------------------------------------------
> Achieve Improved Network Security with IP and DNS Reputation.
> Defend against bad network traffic, including botnets, malware,
> phishing sites, and compromised hosts - saving your company time,
> money, and embarrassment. Learn More!
> http://p.sf.net/sfu/hpdev2dev-nov
> _______________________________________________
> Ganglia-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/ganglia-general
>
>