Re: Email notification of repairs

michoski Tue, 09 Feb 2010 14:10:10 -0800

On 2/9/10 1:09 PM, "Justin Lloyd" <jll...@digitalglobe.com> wrote:
> Has anyone done any investigation into having a monitoring tool like
> Zenoss (which we use), Nagios, or OpenNMS watch for repairs? At the very
> least, centralizing at least some of Cfengine hosts' logs and using a
> log-watching tool like Swatch or Splunk would be a step in the right
> direction.
> 
> Team Cfengine: Is there any kind of roadmap for integration with such
> third-party monitoring tools?


We ended up incorporating sensible checks into our local Nagios instance.
"Sensible" was determined by poking around the working directory on healthy
and sick hosts.  Each host currently watches the state of various files
beneath /var/cfengine and alerts if they grow stale, if cfagent exits
non-zero, etc.  This tells us when cfengine is not running at all, catches a
few edge cases like corrupt bdbs, and lets us escalate/email/page.
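
To make the staleness idea concrete, here is a minimal sketch of such a
check in Python.  It is not our actual plugin: the function name, the
thresholds, and the choice of which files to watch are all assumptions for
illustration.  The only fixed convention is the standard Nagios plugin
return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import os
import time

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_freshness(paths, warn_age, crit_age, now=None):
    """Return (code, message) for the worst status found among paths.

    warn_age and crit_age are ages in seconds; both are hypothetical
    thresholds that would need tuning to the local run schedule.
    """
    now = time.time() if now is None else now
    worst = OK
    problems = []
    for path in paths:
        try:
            age = now - os.stat(path).st_mtime
        except OSError as exc:
            # A missing or unreadable file is treated as critical here.
            worst = max(worst, CRITICAL)
            problems.append(f"{path}: {exc.strerror}")
            continue
        if age >= crit_age:
            worst = max(worst, CRITICAL)
            problems.append(f"{path}: {int(age)}s old")
        elif age >= warn_age:
            worst = max(worst, WARNING)
            problems.append(f"{path}: {int(age)}s old")
    label = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}[worst]
    detail = "; ".join(problems) if problems else f"{len(paths)} file(s) fresh"
    return worst, f"CFENGINE {label} - {detail}"
```

A wrapper script would print the message and exit with the returned code so
Nagios can pick it up.  Which files under /var/cfengine are worth watching
depends on the site; compare healthy and sick hosts as described above.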

In general, we've adopted the mantra "Email is not for monitoring."  We all
get lots of mail already -- it's unreliable, subject to quotas, offers no
trending/reporting, has no way to handle exceptions (schedule downtime), etc.
Ditching email in favor of a real monitoring system will certainly make your
life easier in the long run.  You can still send informational emails as
needed, but it's good to consider anything sent via email as an optional
read and anything routed through monitoring as requiring action.

Unfortunately, we have not yet implemented anything providing the
per-promise granularity you are looking for.  I wish we had it today!  In
the past I have thought about having policies syslog everything over TCP
(requires some effort to do it right; syslog-ng or stunnel could work) to
central Splunk servers (or whatever).  It would certainly be possible to
have a meaningful history of "everything" -- but building filters to extract
the useful data would take time.

_______________________________________________
Help-cfengine mailing list
Help-cfengine@cfengine.org
https://cfengine.org/mailman/listinfo/help-cfengine
