Since this hasn't appeared in a MON release yet, I'd like to note that
the patch included below fixed some serious scheduler failures for me. I
recently brought mon up on a badly designed network, and the scheduler was
dying every 3-4 hours. Since I applied the patch, the scheduler has run
for nearly 2 weeks without a single failure.

Thanks to Theo. Jim, could you please make sure this or something
similar gets into mon 1.0?
 
        From: Theo Van Dinter 
        Subject: Re: Occasional scheduler death 
        Date: Thu, 30 Aug 2001 12:11:30 -0700 

        On Thu, Aug 30, 2001 at 08:11:25AM -0400, Bates, C Thomas wrote:
        >  I think I've finally found what has been causing the MON daemon to die. It
        > finally died with debug running, with this in stderr:
        > /^**** Time Out/: nested *?+ in regexp at /usr/local/mon/prod/mon line 587.

        Line 593 has another bug (for which I submitted a patch to Jim) where alerts
        will be skipped if a monitor fails with summary "alpha bravo", and then
        fails again with a summary "alpha".  The summaries are different, so there
        should be an alert but mon won't alert until the "alertevery" period has
        passed.

        I've attached my patch which I believe will fix both your problem and
        the one in the above paragraph.  :)

        -- 
        Randomly Generated Tagline:
        "You must lash out with every limb, like the octopus who plays drums."
                 - The Sphinx in Mystery Men

        586a587
        >           my($prevsumm) = split("\n", $sref->{"_failure_output"});
        593c594
        <                           ($sref->{"_failure_output"} =~ /^$summary/)
        ---
        >                           (!$pref->{"_observe_detail"} && $prevsumm eq $summary)
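
For anyone wondering why the old test blows up, here is a minimal
standalone sketch. It is not code from mon itself: the sub names
(first_line, same_summary_regex, same_summary_eq) are made up for
illustration, and it leaves out the "_observe_detail" check that the real
patch also applies. The old test interpolates the summary into a regex,
so a summary starting with "****" is compiled as a pattern with stacked
quantifiers and the match dies. It is also only a prefix match, which is
why "alpha" after "alpha bravo" looked like the same failure.

```perl
use strict;
use warnings;

# Hypothetical helper: first line of a monitor's output, as in the
# patch's split("\n", ...) on the stored failure output.
sub first_line {
    my ($output) = @_;
    my ($first) = split /\n/, $output;
    return defined $first ? $first : "";
}

# Old-style test: the summary is interpolated into a regex, so any
# metacharacters in it are treated as pattern syntax.
sub same_summary_regex {
    my ($prev_output, $summary) = @_;
    return $prev_output =~ /^$summary/ ? 1 : 0;
}

# Patched-style test: plain string equality on the first line only.
sub same_summary_eq {
    my ($prev_output, $summary) = @_;
    return first_line($prev_output) eq $summary ? 1 : 0;
}

my $timeout = "**** Time Out";

# The regex version dies while compiling the pattern; eval traps it.
my $survived = eval { same_summary_regex("$timeout\ndetail", $timeout); 1 };
print "regex test died: $@" unless $survived;

# The eq version handles the same summary without trouble.
print "eq test, same summary: ",
    same_summary_eq("$timeout\ndetail", $timeout), "\n";

# Prefix problem: "alpha" is wrongly treated as a repeat of "alpha bravo".
print "regex test, alpha vs alpha bravo: ",
    same_summary_regex("alpha bravo\ndetail", "alpha"), "\n";
print "eq test, alpha vs alpha bravo: ",
    same_summary_eq("alpha bravo\ndetail", "alpha"), "\n";
```

Quoting the summary with \Q...\E would also stop the crash, but it would
still be a prefix match, so the string comparison in the patch covers
both bugs.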

-- 
Joe Rhett                                                      Chief Geek
[EMAIL PROTECTED]                                      ISite Services, Inc.
