So, once you've gotten Ganglia to pull in metrics from gazillions of
nodes in umpteen clusters, and got pretty graphs of everything, what
do you use for monitoring the values? I mean, when a machine goes
down, you don't want just a webpage to be updated, you want something
to trigger the klaxons.

I've tried to adapt Nagios (formerly known as Netsaint) for that
purpose, but Nagios doesn't really fit the bill; it's designed to
collect its own monitoring data and is not very happy with just
being fed data from other sources.

-- 
Leif Nixon                                    Systems expert
------------------------------------------------------------
National Supercomputer Centre           Linkoping University
------------------------------------------------------------
