guys-

i'd like to offer a first preview of ganglia 3.  

CVS: module g3
but the code is changing about every 5 minutes

http://matt-massie.com/g3/
has a stable snapshot.

the code has been tested on linux, solaris, freebsd, macos x and cygwin!  
i'm really surprised how well ganglia 3 is running on cygwin.  i have to
tip my red hat to the cygwin developers.  my biggest obstacle was porting
the xdr routines to cygwin.  if you look in the ./winrpc directory you'll
see what i mean.  i had to use the latest libc code too in order to avoid
a huge security hole in xdr_array().  enough about cygwin though.

the ganglia 3 api is documented using doxygen.  tune your favorite browser 
to ./docs/html/index.html to take a look at what is there.

this code only covers half of what ganglia needs to do but in terms of 
difficulty i think it's the larger half.  

the code now has

  o a central scheduling priority queue which doesn't rely on a fixed list
    of metric functions but rather can grow and shrink to accommodate
    the needs of the monitoring core (see q.c).  these queues have
    an underlying heap for efficiency and speed in prioritizing jobs.
    NO MORE HARDCODED metric array!  i've spent a lot of time testing
    and optimizing this part of the code.  i'm confident that this
    queue will be able to scale well beyond our immediate need.  this
    central priority queue is what each monitoring plugin will plug
    its jobs into (see plugin.c and test-plugin.c).

  o a well-defined XDR representation of hierarchical metric messages
    (see g3xdr.x for the XDR definition.  rpcgen is used to create the
    g3xdr.c and g3xdr.h files from this definition).  i think it is
    critical that the XDR be as well defined as the XML.  you can think
    of this XDR description file as the "DTD" for the XDR.  using
    this definition we are able to group metrics together on the wire
    and NO MORE METRIC COLLISIONS!  there is no longer a unique key
    per metric.  this means ganglia 3 will hum along happy as a clam in
    heterogeneous environments. (see example test-xdr.c)

  o a simple API for creating, modifying, serializing and deserializing 
    g3_metric_msg variables to XDR network messages (see msg.c).

  o an API for creating hierarchical g3_metric_msg variables (see msg.c).
    this part of the code is not finished but as you can see from the 
    test-xdr.c code... it's close.  i just need to write a few routines
    to translate from C to ganglia discriminated unions and back again
    (pretty easy).

  o the error functions now prefix output with a timestamp when a process
    is not daemonized (otherwise it's sending to syslog() which timestamps
    messages).  the debug messages now have timestamps and the debug_level
    really works now.  also, you can set a debug mask (steve's 
    suggestion).
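
to make the first bullet concrete, here is a minimal sketch of a
heap-backed scheduling queue of the kind described there.  this is NOT
the code from q.c: the names (sq_job, sq_queue, sq_push, sq_pop) and the
choice to order jobs by their next run time are assumptions made purely
for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical job record; field names are illustrative, not from q.c. */
typedef struct {
    time_t next_run;            /* when the job is next due */
    void (*run)(void *arg);     /* callback a plugin would register */
    void  *arg;                 /* opaque argument passed to run() */
} sq_job;

/* Binary min-heap of jobs, ordered by next_run (soonest at index 0). */
typedef struct {
    sq_job *jobs;
    size_t  len;
    size_t  cap;
} sq_queue;

static void sq_init(sq_queue *q)
{
    assert(q != NULL);
    q->jobs = NULL;
    q->len = q->cap = 0;
}

static void sq_free(sq_queue *q)
{
    free(q->jobs);
    q->jobs = NULL;
    q->len = q->cap = 0;
}

static void sq_swap(sq_job *a, sq_job *b)
{
    sq_job t = *a; *a = *b; *b = t;
}

/* Insert a job in O(log n); the backing array doubles on demand.
 * Returns 0 on success, -1 if memory could not be allocated. */
static int sq_push(sq_queue *q, sq_job job)
{
    if (q->len == q->cap) {
        size_t ncap = q->cap ? q->cap * 2 : 8;
        sq_job *n = realloc(q->jobs, ncap * sizeof *n);
        if (n == NULL)
            return -1;
        q->jobs = n;
        q->cap = ncap;
    }
    size_t i = q->len++;
    q->jobs[i] = job;
    while (i > 0) {                       /* sift the new job up */
        size_t parent = (i - 1) / 2;
        if (q->jobs[parent].next_run <= q->jobs[i].next_run)
            break;
        sq_swap(&q->jobs[parent], &q->jobs[i]);
        i = parent;
    }
    return 0;
}

/* Remove the job that is due soonest into *out in O(log n).
 * Returns 0 on success, -1 if the queue is empty. */
static int sq_pop(sq_queue *q, sq_job *out)
{
    if (q->len == 0)
        return -1;
    *out = q->jobs[0];
    q->jobs[0] = q->jobs[--q->len];
    size_t i = 0;
    for (;;) {                            /* sift the moved job down */
        size_t l = 2 * i + 1, r = l + 1, m = i;
        if (l < q->len && q->jobs[l].next_run < q->jobs[m].next_run) m = l;
        if (r < q->len && q->jobs[r].next_run < q->jobs[m].next_run) m = r;
        if (m == i)
            break;
        sq_swap(&q->jobs[i], &q->jobs[m]);
        i = m;
    }
    return 0;
}
```

under these assumptions a scheduler loop would sq_pop() the job that is
due soonest, call its run() callback, bump next_run by the job's interval
and sq_push() it back.  the number of queued jobs grows and shrinks
freely; this sketch only ever grows the backing array, so releasing
memory as jobs go away is left out.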

if you want to play with what is there already...

# gunzip < ganglia-3.0.0.tar.gz | tar -xvf -
# cd ganglia-3.0.0
# ./configure
# make
...
# ls .libs
<you should see the g3_test_plugin there.  most operating systems
 call it g3_test_plugin.so but it could be g3_test_plugin.dll>

# ./gmond -l 10 -p .libs/g3_test_plugin.so 

you'll see a lot of verbiage about what gmond is doing.  if you look at 
the gmond.c code you'll see that it is infinitely simplified over ganglia 
2.  in the end, ganglia 2 will likely become a ganglia 3 plugin.

i'm starting work on the second half of ganglia which listens for these
g3_metric_msg xdr messages and puts the data into a hierarchical data
structure.  ganglia 3 will also have a hierarchical delegation model for
information (much like DNS) so i'm keeping that in mind as i write it 
(we're going to need to revamp our XML for sure).

i just want to thank you guys for all the great ideas, code and feedback 
about ganglia.  federico and i were on the phone for hours talking about 
what ganglia 3 should look like.  i've been getting feedback from dozens 
of outside groups about what they want/need ganglia to do.  

if any of you have time to try g3 on some other OSes like HPUX, False64,
AIX et al., let me know if it works.  i don't have access to machines 
running those OSes.

happy new year... (let's hope)
-- matt

