Greetings all, and apologies if I have directed this to the incorrect list,
but I thought it might be best to begin with those closest to the source,
so to speak.
I work for a fairly large international bank and we are currently
evaluating options for collecting and visualizing performance-related
statistics for the entirety of our UNIX/Linux estate (with the possibility
of including Windows at some point). Naturally I took to the internet and
came across Ganglia as one of the (widely used) possible options. I then
spent some time looking through reports of issues, etc. and have some
questions/concerns regarding how best to organize my infrastructure should
I decide to recommend Ganglia as the solution.
In preparation I thought it best to do some discovery around the size of
our estate and any details our end users (system administrators,
performance engineers, etc.) would say needed to be included in the metric
set. To that end I would say that we have approximately 26K servers today
and, given rough extrapolation, could easily wind up in the neighborhood
of 4.5M total metrics across the whole system. Our expectation is to
extend the base set of metrics to include any number of middleware-related
measurements, which is the primary reason for the significant number of
metrics. We will also be using unicast unless, of course, a compelling
enough case can be made for the alternative.
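
For a rough sense of scale, the figures above can be turned into a per-host number. This is only a back-of-envelope sketch: the host and metric counts are the approximate ones quoted above, and the function name is made up for illustration.

```python
# Back-of-envelope sizing from the approximate figures quoted above.
HOSTS = 26_000              # approximate server count (from the text)
TOTAL_METRICS = 4_500_000   # extrapolated total metric count (from the text)


def metrics_per_host(total_metrics: int, hosts: int) -> float:
    """Average number of metrics each host would report (hypothetical helper)."""
    return total_metrics / hosts


if __name__ == "__main__":
    # Works out to roughly 173 metrics per host on average.
    print(f"~{metrics_per_host(TOTAL_METRICS, HOSTS):.0f} metrics per host")
```

That average is well above a stock base metric set, which is consistent with the middleware measurements being the main driver of the total.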
My initial instinct is to subdivide the Ganglia infrastructure by major
data center, with each one represented by a single grid. I imagine I would
need 6-12 clusters (possibly more) per grid and will definitely be looking
to use rrdcached. I do not know if that will be enough segregation to
allow gmetad to perform as required. Several of my larger (more
influential) end users have indicated a need for some fairly tight
resolutions (15s for 4hrs for a number of high value metrics).
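
To make the resolution requirement concrete, here is a minimal sketch of what "15s for 4hrs" implies for an RRD archive and for the write load. It assumes a 15-second RRD step and uses the standard rrdtool `RRA:CF:xff:steps:rows` form; the helper names are hypothetical, and the worst case shown (every one of the 4.5M metrics stored at 15s) is an upper bound rather than what is actually being asked for, since only the high-value metrics need that resolution.

```python
def rra_rows(step_seconds: int, retention_hours: int) -> int:
    """Rows needed to keep one data point per step for the given retention."""
    return (retention_hours * 3600) // step_seconds


def rra_spec(step_seconds: int, retention_hours: int,
             cf: str = "AVERAGE", xff: float = 0.5) -> str:
    """Build an rrdtool-style RRA definition with one step per row."""
    rows = rra_rows(step_seconds, retention_hours)
    return f"RRA:{cf}:{xff}:1:{rows}"


def updates_per_second(metric_count: int, step_seconds: int) -> float:
    """Aggregate RRD updates per second if every metric is written each step."""
    return metric_count / step_seconds


if __name__ == "__main__":
    # 15 s resolution kept for 4 hours needs 960 rows.
    print(rra_spec(15, 4))
    # Upper bound: all 4.5M metrics at a 15 s step, before any batching.
    print(f"{updates_per_second(4_500_000, 15):,.0f} updates/s")
```

Splitting that load across grids and clusters, and batching the writes through rrdcached, is what would bring the per-gmetad figure down; how far depends on the number of data centers, which is not stated here.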
I guess my initial question is this: has anyone done anything like
this, at this scale, with any success and, if so, would it be possible
to get some additional information (scrubbed diagram, etc.) regarding how
it is best done? I've been searching the net and keep coming back to a
single image showing a hierarchy of gmetad and some fairly interesting
descriptions of other implementations but nothing that actually makes it
clear to me.
Daniel J Marrera
Global Unix Tooling & Support
_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers