On Sun, 27 Jun 2004, David Nolan wrote:

> --On Saturday, June 26, 2004 6:57 PM -0500 Tim Klein 
> <[EMAIL PROTECTED]> wrote:
> 

> > sent after it gets below zero.)  I can't find anything in the
> > code that could ever reset it.  Am I misunderstanding the
> > intended purpose of _trap_timer?

tim, i don't think so.

> Essentially _trap_timer is used entirely as a way to prevent trap timeout 
> alarms from happening on every pass through the code after the timeout is 
> reached.  I.e. the actual check for the timeout is where it compares
> ($tm - $sref->{"_last_trap"}) to $sref->{"traptimeout"}.  And then when a 
> trap timeout actually occurs, _trap_timer is reset so that no more timeout
> alerts will be sent until that much time has passed again.

this isn't the bug that tim reported, but i believe it's related.  i had a
look at the code, and it does seem to me that receiving a trap before
_trap_timer falls to zero or less should reset _trap_timer to
$sref->{"traptimeout"}. the intention of _trap_timer is to send an alert when
the service doesn't receive a trap in some amount of time, indicating that the
thing is dead or otherwise isn't working as it should.

resetting _trap_timer in sub handle_trap should fix this problem.
for example, put this after $sref->{"_last_summary"} = $trap{"sum"}:

    #
    # a trap received resets the trap timeout timer
    #
    if (exists $sref->{"traptimeout"})
    {
        $sref->{"_trap_timer"} = $sref->{"traptimeout"};
    }
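
as a rough standalone sketch of the behaviour i'm after (this is not mon's
actual code: tick and trap_received are made-up names, the 60 second timeout
is arbitrary, and the countdown in tick is only my assumption of how the main
loop treats _trap_timer):

```perl
use strict;
use warnings;

# hypothetical stand-in for a service record; only the two keys from
# this thread are used
my $sref = { "traptimeout" => 60, "_trap_timer" => 60 };

# hypothetical helper: count the timer down by $elapsed seconds and
# return 1 when a timeout alert would be sent, 0 otherwise
sub tick
{
    my ($s, $elapsed) = @_;
    $s->{"_trap_timer"} -= $elapsed;
    return 0 if $s->{"_trap_timer"} > 0;
    # re-arm so the alert is not repeated on every pass
    $s->{"_trap_timer"} = $s->{"traptimeout"};
    return 1;
}

# the proposed reset, as it would sit in handle_trap
sub trap_received
{
    my ($s) = @_;
    if (exists $s->{"traptimeout"})
    {
        $s->{"_trap_timer"} = $s->{"traptimeout"};
    }
}

my @alerts;
push @alerts, tick($sref, 40);   # 20 seconds left, no alert
trap_received($sref);            # trap arrives, timer back to 60
push @alerts, tick($sref, 40);   # 20 seconds left, still no alert
push @alerts, tick($sref, 40);   # 80 seconds without a trap, alert
print "@alerts\n";               # prints "0 0 1"
```

without the reset in trap_received, the second tick would already take the
timer below zero and raise an alert even though a trap had just come in.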

i'll patch this in the mon-1-0-0pre1 branch. dave, what are your thoughts?
it seems clear-cut to me.



_______________________________________________
mon mailing list
[EMAIL PROTECTED]
http://linux.kernel.org/mailman/listinfo/mon