[Ganglia-developers] Ganglia Reference Architecture

daniel . j . marrera Fri, 12 Sep 2014 08:42:41 -0700

Greetings all, and apologies if I have directed this to the wrong list, 
but I thought it might be best to begin with those closest to the source, 
so to speak.


I work for a fairly large international bank, and we are currently 
evaluating options for collecting and visualizing performance-related 
statistics for the entirety of our UNIX/Linux estate (with the possibility 
of including Windows at some point).  Naturally I took to the internet and 
came across Ganglia as one of the (widely used) possible options.  I then 
spent some time looking through reports of issues, etc. and have some 
questions/concerns regarding how best to organize my infrastructure should 
I decide to recommend Ganglia as the solution.

In preparation I thought it best to do some discovery around the size of 
our estate and any details our end users (system administrators, 
performance engineers, etc.) would want included in the metric set.  To 
that end I would say that we have approximately 26K servers today and, 
given rough extrapolation, could easily wind up in the neighborhood of 
4.5M total metrics within the total system.  Our expectation is to extend 
the base set of metrics to include any number of middleware-related 
measurements, which is the primary reason for the significant number of 
metrics.  We will also be using unicast unless, of course, a compelling 
enough case can be made for the alternative (multicast).
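
For a rough sense of scale, the estate figures above can be turned into a 
per-server metric count and an aggregate update rate.  This is only a 
back-of-envelope sketch: the 15-second interval is an assumption that every 
metric is collected at the tightest resolution requested, which overstates 
the load if most metrics are polled less often.

```python
# Back-of-envelope sizing from the figures quoted in this message.
SERVERS = 26_000            # approximate server count today
TOTAL_METRICS = 4_500_000   # extrapolated total metric count

# Assumption (not a measured value): every metric is updated every 15 s.
POLL_INTERVAL_S = 15

# Average number of metrics each server would have to report.
metrics_per_server = TOTAL_METRICS / SERVERS

# Aggregate number of RRD updates per second across the whole estate.
updates_per_second = TOTAL_METRICS / POLL_INTERVAL_S

if __name__ == "__main__":
    print(f"metrics per server: {metrics_per_server:.0f}")
    print(f"updates per second: {updates_per_second:,.0f}")
```

Under that worst-case assumption the estate works out to roughly 173 
metrics per server and about 300,000 updates per second in total, which is 
the load the grids would have to share between them.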

My initial instinct is to subdivide the Ganglia infrastructure by major 
data center, with each one represented by a single grid.  I imagine I would 
need 6-12 clusters (possibly more) per grid and will definitely be looking 
to use rrdcached.  I do not know if that will be enough segregation to 
allow gmetad to perform as required.  Several of my larger (more 
influential) end users have indicated a need for some fairly tight 
resolutions (15s samples retained for 4 hours for a number of high-value 
metrics).
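
The retention requirement above can be expressed as an rrdtool round-robin 
archive definition of the form RRA:CF:xff:steps:rows.  The helper below is 
a hypothetical sketch (the name rra_for and its defaults are illustrative, 
and the 15-second base step is an assumption about how the RRDs would be 
created); it only does the arithmetic and does not talk to gmetad or 
rrdtool.

```python
def rra_for(resolution_s, retention_s, base_step_s=15, cf="AVERAGE", xff=0.5):
    """Build an rrdtool RRA definition string (RRA:CF:xff:steps:rows).

    resolution_s -- seconds covered by one stored row
    retention_s  -- how long rows are kept, in seconds
    base_step_s  -- RRD base step in seconds (assumed to be 15 here)
    """
    if resolution_s % base_step_s != 0:
        raise ValueError("resolution must be a multiple of the base step")
    steps = resolution_s // base_step_s       # base steps consolidated per row
    rows = -(-retention_s // resolution_s)    # ceiling division
    return f"RRA:{cf}:{xff}:{steps}:{rows}"


if __name__ == "__main__":
    # 15 s resolution kept for 4 hours: 4 * 3600 / 15 = 960 rows.
    print(rra_for(15, 4 * 3600))
```

For the 15s-for-4-hours case this yields 960 rows per metric at one base 
step per row.  Coarser archives for longer history can be generated the 
same way by raising the resolution and retention arguments.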

I guess my initial question is this: has anyone done anything like this, 
at this scale, with any success and, if so, would it be possible to get 
some additional information (scrubbed diagram, etc.) regarding how it is 
best done?  I've been searching the net and keep coming back to a single 
image showing a hierarchy of gmetad instances and some fairly interesting 
descriptions of other implementations, but nothing that actually makes the 
recommended layout clear to me.

Daniel J Marrera 
Global Unix Tooling & Support 




