My problem is slightly different: I don't need heartbeat-style monitoring,
but simply to mark the beginning and the end of a task that I know takes a certain
amount of time to complete.
If it takes too long to complete, I want to be notified.

The real case is this: I want to monitor the backup of a server.
Things should happen this way:

1) begin backup: a Perl script sends a trap to indicate that the backup has started
2) the main backup script does its work
3) end backup: at the end of the main backup script, a script is called to indicate
that the backup has finished.
 A trap is sent to the mon server, with the appropriate return code and a summary of
the backup log.
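
A minimal sketch of what step 3 could look like, assuming the group and service
names from my configuration. The helper names backup_trap_args and
send_backup_trap, and the $monport parameter, are made up for illustration. I
also assume that send_trap returns a false value on failure and that the reason
can be read with $c->error.

```perl
use strict;
use warnings;

# Build the send_trap arguments for the end-of-backup trap.
# $exit_code: exit status of the main backup script (0 means success).
# $log_text:  contents of the backup log.
sub backup_trap_args {
    my ($exit_code, $log_text) = @_;

    # Use the last non-blank log line as the one-line summary.
    my @lines = grep { /\S/ } split /\n/, $log_text;
    my $last  = @lines ? $lines[-1] : "no log output";
    my $ok    = $exit_code == 0;

    return (
        group    => "Backup",
        service  => "bkServerA",
        retval   => $ok ? 0 : 1,
        opstatus => $ok ? "ok" : "fail",
        summary  => ($ok ? "Backup finished" : "Backup failed") . ": $last",
        detail   => $log_text,
    );
}

# Send the trap to the mon server. Mon::Client is loaded lazily so that
# backup_trap_args can be used without the module installed.
sub send_backup_trap {
    my ($monport, $exit_code, $log_text) = @_;
    require Mon::Client;

    my $c = Mon::Client->new(
        host     => "monserver",
        port     => $monport,
        username => "montrap",
        password => "montrap",
    );

    # Assumption: a false return value means the trap could not be sent.
    $c->send_trap(backup_trap_args($exit_code, $log_text))
        or die "send_trap failed: " . $c->error . "\n";
}
```

The main backup script would then end with a call such as
send_backup_trap($monport, $exit_code, $log_text).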

The problem is: if the backup takes too long to complete, I want to be
notified.
I tried to do it this way:

--
watch Backup
     service bkServerA
        description Backup serverA
        period
         alert mail.alert [EMAIL PROTECTED]
         upalert mail.alert -u [EMAIL PROTECTED]
         alertafter 3h
         alertevery 24h
--

but the alert is never sent, and neither is the upalert.
I tried all the patches I have seen on the mailing list.

Here is the code I use to signal the start of the backup:

--
use Mon::Client;

# $monport holds the port of the mon server
my $c = Mon::Client->new(
                   host => "monserver",
                   port => $monport,
                   username => "montrap",
                   password => "montrap",
             );

# Mark the service as failed while the backup is running
$c->send_trap (
                   group           => "Backup",
                   service         => "bkServerA",
                   retval          => 2,
                   opstatus        => "fail",
                   summary         => "Backup started",
                   detail          => ""
               );
--


Any idea?

Roberto T.


-----Original Message-----
From: Ed Ravin [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, 19 March 2002 18:25
To: TORRESANI, Roberto
Cc: [EMAIL PROTECTED]
Subject: Re: traptimeout


TORRESANI, Roberto writes:
> 
> I'm trying to do something like this:
> - do something and let me know when you have finished (with a mon trap)
> - if mon doesn't get the trap in a reasonable period of time,
> the service is considered failed.

Check the mon man page for "traptimeout" and "trapduration".
That will let you do what you want.  Here's a snippet from my config:

 watch trapthing
        service whereareyou
                description go red if we don't hear from you
                traptimeout 5m
                trapduration 1s

The above service will go into failure mode if a trap is not
received in 5 minutes (traptimeout).  After the trap is received,
the service will be marked "OK" after 1 second (trapduration).
