Over time we've been slowly modifying the code a little and adding our own features.
Two we've found really useful: "Ack All", which acks everything in the current view, and a hold feature, which lets us stop alerts going out for up to 180 minutes while still seeing what has failed. The hold records who put Mon into hold and their reason. At the end of the 180 minutes (or a shorter timeframe, if one was specified) Mon automatically comes out of hold and the alerts resume, so nobody can accidentally leave it on hold the way we could when we used to stop the scheduler (which also had the disadvantage that we didn't know what was down).

Stephane Bortzmeyer wrote:
> On Wed, Mar 12, 2008 at 12:07:38PM -0400,
>  Ed Ravin <[EMAIL PROTECTED]> wrote
>  a message of 23 lines which said:
>
>> In most cases, our engineers log into Mon and use the "host disable"
>> or "service disable" to stop monitoring the stuff that's about to go
>> down, and re-enable them when the maintenance is over.
>>
>> Sometimes, we just ACK whatever's broken when Mon starts alarming.
>
> The good thing about "doing nothing when there is a planned
> maintenance" is that it allows you to test that monitoring indeed
> works.
>
> I had several times the bad experience of an undetected failure
> because the monitoring had a hidden problem.

--
Ben Ragg - Internode - Network Operations
150 Grenfell Street, Adelaide, SA, 5000
Phone: 13NODE  Web: http://www.on.net
_______________________________________________
mon mailing list
mon@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/mon
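
For anyone wanting to picture the hold behaviour described in this message, here is a rough Python sketch. It is not the actual modification to Mon (which is not shown in this post); the names `AlertGate`, `Hold`, `start_hold` and `report_failure` are all made up for illustration. It only assumes what the message states: a hold lasts at most 180 minutes, records who set it and why, expires on its own, and suppresses alerts while failures remain visible.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional

MAX_HOLD_MINUTES = 180  # upper limit mentioned in the message


@dataclass
class Hold:
    who: str           # person who put monitoring into hold
    reason: str        # why the hold was set
    expires_at: float  # seconds since the epoch


class AlertGate:
    """Hypothetical sketch: suppress alerts during a hold, keep recording failures."""

    def __init__(self, clock: Callable[[], float] = time.time):
        self._clock = clock                # injectable so expiry can be tested
        self._hold: Optional[Hold] = None
        self.failures: List[str] = []      # always recorded, hold or not
        self.sent_alerts: List[str] = []   # only filled when no hold is active

    def start_hold(self, who: str, reason: str,
                   minutes: int = MAX_HOLD_MINUTES) -> Hold:
        if not who or not reason:
            raise ValueError("a hold needs both a name and a reason")
        if not 0 < minutes <= MAX_HOLD_MINUTES:
            raise ValueError(f"hold must be 1 to {MAX_HOLD_MINUTES} minutes")
        self._hold = Hold(who, reason, self._clock() + minutes * 60)
        return self._hold

    def active_hold(self) -> Optional[Hold]:
        # Expiry is checked lazily: once the deadline passes, the hold is
        # dropped, so it cannot be left on by accident.
        if self._hold is not None and self._clock() >= self._hold.expires_at:
            self._hold = None
        return self._hold

    def report_failure(self, service: str) -> bool:
        """Record a failure; return True if an alert was sent for it."""
        self.failures.append(service)
        if self.active_hold() is not None:
            return False
        self.sent_alerts.append(service)
        return True
```

In this sketch the failure list is updated before the hold is consulted, which is what keeps failed services visible during a hold, unlike stopping the scheduler outright.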