----- Original Message -----
From: "Joe Kaiser" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <[email protected]>
Sent: Monday, November 04, 2002 12:59 PM
Subject: Re: [Ganglia-general] Questions about using ganglia
>
> Hi,
>
> I'm from Fermilab and am the lead system administrator for the CMS
> experiment in the Scientific Computing Support group. We are currently
> using ganglia for monitoring several logical clusters. We do not use
> ganglia for an alarm system because it has not really been designed for
> that. We use it for a trending and analysis tool and for a first blush
> look at a node that may be experiencing problems. We use NGOP for
> monitoring critical system daemons and general machine availability.
> Our alarm system is all through NGOP. NGOP is freely available from the
>

Here is the NGOP directory I ended up navigating to for retrieving the
software:

ftp://ftp.fnal.gov/products/ngop/

and then the v2_0a directory, at the moment.

Is anyone else here using NGOP in combination with ganglia, that is,
using ganglia for the first blush look and then NGOP for the true alarm
monitoring to page cell phones, fire off emails, etc.?

I may install NGOP next week, but I wonder whether gmond and gmetric
could provide a realistic solution on their own, especially with gmetad
pointing to multiple gmonds in case one is taken down momentarily, for a
restart, etc. A process or script could run once a minute to look over a
handful of RRDs and check for any 'holes' or other symptoms of
inactivity, values out of range, etc. To really do this right for the
public, it would need a .php front end for specifying triggers and other
sorts of alarms based on the ganglia data.
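A minimal sketch of the once-a-minute RRD check described above, in
Python. It assumes the `rrdtool` command-line tool is on the PATH and
that its `fetch` output has the usual shape (a header line of data-source
names, then `timestamp: value ...` rows, with unknown samples printed as
`nan`). The function names, the five-minute window, and the low/high
thresholds are all hypothetical, chosen only for illustration.

```python
import math
import subprocess


def parse_fetch_output(text):
    """Parse `rrdtool fetch` text output into (timestamp, [values]) rows."""
    rows = []
    for line in text.splitlines():
        if ":" not in line:
            continue  # header line with data-source names, or a blank line
        stamp, _, rest = line.partition(":")
        try:
            rows.append((int(stamp), [float(v) for v in rest.split()]))
        except ValueError:
            continue  # not a data row; skip rather than guess
    return rows


def find_problems(rows, low, high):
    """Flag holes (unknown samples) and values outside [low, high]."""
    problems = []
    for stamp, values in rows:
        for v in values:
            if math.isnan(v):
                problems.append((stamp, "hole"))
            elif not (low <= v <= high):
                problems.append((stamp, "out of range: %g" % v))
    return problems


def check_rrd(path, low, high, window="-5min"):
    """Fetch recent AVERAGE samples from one RRD file and check them.

    The thresholds and window are assumptions, not ganglia defaults.
    """
    out = subprocess.run(
        ["rrdtool", "fetch", path, "AVERAGE", "--start", window],
        capture_output=True, text=True, check=True,
    ).stdout
    return find_problems(parse_fetch_output(out), low, high)
```

Something like cron could call `check_rrd()` on each RRD of interest once
a minute and mail or page on a non-empty result. The newest row from
`rrdtool fetch` is often still unknown because its interval has not
finished, so a real check would probably ignore that row or require
several consecutive holes before raising an alarm.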

