traptimeout

2002-05-14 Thread Barbalata

Is there any reason why do_alert in handle_trap_timeout is called with
undef parameters?

   do_alert ($group, $service, undef, undef, $FL_TRAPTIMEOUT);

As a result, the alert is not sent. I replaced the call with this:

   do_alert ($group, $service, "", 0, $FL_TRAPTIMEOUT);

Now trap timeouts trigger alerts.

Any help or explanation is welcome.


Armin




R: traptimeout / alertafter

2002-03-22 Thread TORRESANI, Roberto

My problem is somewhat different: I don't need heartbeat-style monitoring.
I simply need to mark the beginning and the end of a task that I know takes a certain
amount of time to complete.
If it takes too long to complete, I want to be notified.

The real case is this: I want to monitor the backup of a server.
Things should happen this way:

1) begin backup: a perl script sends a trap to indicate that the backup has started
2) the main backup script does its work
3) end backup: at the end of the main backup script, the perl script is called again
to indicate that the backup has finished.
 A trap is sent to the mon server, with the appropriate return code and a summary of the
backup log.

The problem is this: if the backup takes too long to complete, I want to be
notified.
I tried to do it this way:

--
watch Backup
    service bkServerA
        description Backup serverA
        period
            alert mail.alert [EMAIL PROTECTED]
            upalert mail.alert -u [EMAIL PROTECTED]
            alertafter 3h
            alertevery 24h
--

but the alert is never sent, and so neither is the upalert.
I tried all the patches I have seen on the mailing list.

Here is the code I use to signal the beginning of the backup:

--
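use Mon::Client;

# monserver and monport are placeholders for the mon server's host and port
$c = new Mon::Client (
    host     => "monserver",
    port     => "monport",
    username => "montrap",
    password => "montrap",
);

# trap sent at the beginning of the backup
$c->send_trap (
    group    => "Backup",
    service  => "bkServerA",
    retval   => 2,
    opstatus => "fail",
    summary  => "Backup started",
    detail   => "",
);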
--
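
For illustration, the matching end-of-backup trap (step 3 above) might be sketched as follows. This is only a sketch under assumptions: the helper names end_trap_args and send_end_trap are hypothetical, the host and port values are the same placeholders as above, and it assumes that an opstatus of "ok" is what mon expects for a successful result.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the send_trap arguments for the end-of-backup trap.
# $exit_code is the backup's exit status, $log_summary a short text.
sub end_trap_args {
    my ($exit_code, $log_summary) = @_;
    my $ok = ($exit_code == 0);
    return (
        group    => "Backup",
        service  => "bkServerA",
        retval   => $exit_code,
        opstatus => $ok ? "ok" : "fail",   # assumption: "ok" marks success
        summary  => $ok ? "Backup finished" : "Backup failed",
        detail   => $log_summary,
    );
}

# Send the trap. Mon::Client is loaded only here, so the helper above
# can be used (and checked) without the module installed.
sub send_end_trap {
    my ($exit_code, $log_summary) = @_;
    require Mon::Client;
    my $c = Mon::Client->new (
        host     => "monserver",   # placeholder
        port     => "monport",     # placeholder
        username => "montrap",
        password => "montrap",
    );
    return $c->send_trap (end_trap_args ($exit_code, $log_summary));
}
```

The main backup script would call send_end_trap with its own exit status and a few lines taken from the backup log.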


Any idea?

Roberto T.


-----Original Message-----
From: Ed Ravin [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, 19 March 2002 18:25
To: TORRESANI, Roberto
Cc: [EMAIL PROTECTED]
Subject: Re: traptimeout


TORRESANI, Roberto writes:
 
> I'm trying to do a thing like:
> - do something and let me know when you have finished (with a mon trap)
> - if mon doesn't get the trap in a reasonable period of time,
> the service is considered failed.

Check the mon man page for traptimeout and trapduration.
That will let you do what you want.  Here's a snippet from my config:

watch trapthing
    service whereareyou
        description go red if we don't hear from you
        traptimeout 5m
        trapduration 1s

The above service will go into failure mode if a trap is not
received in 5 minutes (traptimeout).  After the trap is received,
the service will be marked OK after 1 second (trapduration).



Re: R: traptimeout / alertafter

2002-03-22 Thread Jim Trocki

On Fri, 22 Mar 2002, TORRESANI, Roberto wrote:

> 1) begin backup: a perl script sends a trap to indicate that the backup is started
> 2) the main backup script does his work
> 3) end backup: at the end of the main backup script this is called to indicate that
> the backup is finished.
>  A trap is sent to the mon server, with appropriate return code and a summary of the
> backup log.

wrap this up into a regular monitor script which returns success if the
backup completes in time, and failure if it doesn't.

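#!/usr/bin/perl

use English;

# run the backup under a 300-second alarm; the handler's die aborts the eval
eval
{
    local $SIG{ALRM} = sub { die "Timeout Alarm" };
    alarm 300;

    # run the backup
    system ("the-backup");

    alarm 0;
};

# the alarm fired before the backup finished
if ($EVAL_ERROR and ($EVAL_ERROR =~ /^Timeout Alarm/))
{
    print "fail\n";
    exit 1;
}

else
{
    print "success\n";
    exit 0;
}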
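
The script above reports success whenever the alarm does not fire, whatever the backup command returned. A possible variant, sketched here under assumptions, wraps the same eval/alarm technique in a function that also looks at the step's return value. The name run_with_timeout and the three result strings are hypothetical, and "the-backup" is still the placeholder command from above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Run $step (a code ref) with a limit of $limit seconds.
# Returns "ok" if the step returned 0, "fail" if it returned anything
# else, and "timeout" if the alarm fired first.
sub run_with_timeout {
    my ($limit, $step) = @_;
    my $rc;
    my $finished = eval {
        local $SIG{ALRM} = sub { die "Timeout Alarm\n" };
        alarm $limit;
        $rc = $step->();
        alarm 0;
        1;
    };
    alarm 0;    # cancel any alarm still pending if the step itself died
    unless ($finished) {
        return "timeout" if $@ =~ /^Timeout Alarm/;
        die $@;    # some other error: pass it on
    }
    return $rc == 0 ? "ok" : "fail";
}

# Hypothetical use in a monitor script:
#   my $status = run_with_timeout (300, sub { system ("the-backup") });
#   print "$status\n";
#   exit ($status eq "ok" ? 0 : 1);
```

As in the original script, a command started with system keeps running after the timeout; only the wait for it is abandoned.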




R: R: traptimeout / alertafter

2002-03-22 Thread TORRESANI, Roberto


> wrap this up into a regular monitor script which returns success if the
> backup completes in time, and failure if it doesn't.


Nice!
It works if the backup is a single monolithic block, but that's not my case.

I have different types of backup taking place:
1) classical tar and similar (e.g., Oracle exports and tar)
2) Windows 2000 backup (I haven't tested it yet, but by redirecting the NT event log to a
Unix syslog I can track the state of the backup)
3) backups taken with tools like Oracle Enterprise Manager

In some cases I have to split the work into 3 steps:
1) send a trap begin backup
2) backup
3) end backup


The tests I ran on mon make me think that alertafter doesn't work with traps. With a
normal monitor it works as expected.
Is there a bug in alertafter+traps, or am I misunderstanding how alertafter+traps
behaves?

Roberto T.





Re: traptimeout

2002-03-19 Thread Ed Ravin

TORRESANI, Roberto writes:
 
> I'm trying to do a thing like:
> - do something and let me know when you have finished (with a mon trap)
> - if mon doesn't get the trap in a reasonable period of time,
> the service is considered failed.

Check the mon man page for traptimeout and trapduration.
That will let you do what you want.  Here's a snippet from my config:

watch trapthing
    service whereareyou
        description go red if we don't hear from you
        traptimeout 5m
        trapduration 1s

The above service will go into failure mode if a trap is not
received in 5 minutes (traptimeout).  After the trap is received,
the service will be marked OK after 1 second (trapduration).
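
To make the timing concrete, here is a small sketch of the traptimeout check described above. It is not mon's own code: the function names are hypothetical, and it assumes that an interval is a number followed by s, m, h, or d, as in the 5m and 1s values in the snippet.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# seconds per unit letter
my %UNIT = (s => 1, m => 60, h => 3600, d => 86400);

# Convert an interval such as "5m" or "1s" to seconds.
sub interval_seconds {
    my ($text) = @_;
    my ($n, $u) = $text =~ /^(\d+(?:\.\d+)?)([smhd])$/
        or die "unrecognized interval: $text\n";
    return $n * $UNIT{$u};
}

# True (1) if more than $traptimeout has passed since the last trap.
# $last_trap and $now are times in seconds.
sub trap_overdue {
    my ($last_trap, $now, $traptimeout) = @_;
    return ($now - $last_trap) > interval_seconds ($traptimeout) ? 1 : 0;
}
```

With traptimeout 5m, a trap last seen 299 seconds ago is not yet overdue, while one last seen 301 seconds ago is, which is the point at which the service would go into failure mode.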