traptimeout
Is there any reason why do_alert in handle_trap_timeout is called with undef parameters?

    do_alert ($group, $service, undef, undef, $FL_TRAPTIMEOUT);

As a result, the alert is never sent. I replaced the call with:

    do_alert ($group, $service, , 0, $FL_TRAPTIMEOUT);

and now trap timeouts get alerted. Any help or explanation is welcome.

Armin
R: traptimeout / alertafter
My problem is a bit different: I don't need heartbeat-style monitoring. I simply want to mark the beginning and the end of a task that I know takes a certain amount of time to complete, and to be notified if it takes too long.

The real case is this: I want to monitor the backup of a server. Things should happen this way:

1) begin backup: a Perl script sends a trap to indicate that the backup has started
2) the main backup script does its work
3) end backup: at the end of the main backup script, the trap script is called to indicate that the backup has finished. A trap is sent to the mon server with the appropriate return code and a summary of the backup log.

The problem is that I want to be notified if the backup takes too long to complete. I tried to do it this way:

--
watch Backup
    service bkServerA
        description Backup serverA
        period
            alert mail.alert [EMAIL PROTECTED]
            upalert mail.alert -u [EMAIL PROTECTED]
            alertafter 3h
            alertevery 24h
--

but the alert is never sent, and so neither is the upalert. I tried all the patches I have seen on the mailing list.

Here is the code I use to signal the beginning of the backup:

--
$c = new Mon::Client (
    host     => "monserver",
    port     => "monport",
    username => "montrap",
    password => "montrap",
);
$c->send_trap (
    group    => "Backup",
    service  => "bkServerA",
    retval   => 2,
    opstatus => "fail",
    summary  => "Backup started",
    detail   => "",
);
--

Any idea?

Roberto T.

-----Original Message-----
From: Ed Ravin [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, 19 March 2002 18:25
To: TORRESANI, Roberto
Cc: [EMAIL PROTECTED]
Subject: Re: traptimeout

TORRESANI, Roberto writes:

> I'm trying to do a thing like:
> - do something and let me know when you have finished (with a mon trap)
> - if mon doesn't get the trap in a reasonable period of time, the service is considered failed.

Check the mon man page for traptimeout and trapduration. That will let you do what you want. Here's a snippet from my config:

watch trapthing
    service whereareyou
        description go red if we don't hear from you
        traptimeout 5m
        trapduration 1s

The above service will go into failure mode if a trap is not received in 5 minutes (traptimeout). After the trap is received, the service will be marked OK after 1 second (trapduration).
Re: R: traptimeout / alertafter
On Fri, 22 Mar 2002, TORRESANI, Roberto wrote:

> 1) begin backup: a Perl script sends a trap to indicate that the backup has started
> 2) the main backup script does its work
> 3) end backup: at the end of the main backup script, the trap script is called to indicate that the backup has finished. A trap is sent to the mon server with the appropriate return code and a summary of the backup log.

Wrap this up into a regular monitor script which returns success if the backup completes in time, and failure if it doesn't.

#!/usr/bin/perl
use English;

eval {
    # abort the eval block if the alarm fires
    local $SIG{ALRM} = sub { die "Timeout Alarm" };
    alarm 300;
    # run the backup
    system ("the-backup");
    alarm 0;
};

if ($EVAL_ERROR and ($EVAL_ERROR =~ /^Timeout Alarm/)) {
    print "fail\n";
    exit 1;
} else {
    print "success\n";
    exit 0;
}
R: R: traptimeout / alertafter
> wrap this up into a regular monitor script which returns success if the backup completes in time, and failure if it doesn't.

Nice! It works if the backup is a single monolithic block, but that's not my case. I have different types of backup taking place:

1) classical tar and similar (e.g., Oracle exports and tar)
2) Windows 2000 backup (I haven't tested it yet, but by redirecting the NT event log to a Unix syslog I can check the state of the backup)
3) backups taken with tools like Oracle Enterprise Manager

In some cases I have to split the work into 3 steps:

1) send a "begin backup" trap
2) backup
3) end backup

The tests I ran on mon make me think that alertafter doesn't work with traps. With a normal monitor it works as expected. Is there a bug in alertafter with traps, or am I misunderstanding how alertafter behaves with traps?

Roberto T.
Re: traptimeout
TORRESANI, Roberto writes:

> I'm trying to do a thing like:
> - do something and let me know when you have finished (with a mon trap)
> - if mon doesn't get the trap in a reasonable period of time, the service is considered failed.

Check the mon man page for traptimeout and trapduration. That will let you do what you want. Here's a snippet from my config:

watch trapthing
    service whereareyou
        description go red if we don't hear from you
        traptimeout 5m
        trapduration 1s

The above service will go into failure mode if a trap is not received in 5 minutes (traptimeout). After the trap is received, the service will be marked OK after 1 second (trapduration).
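
As a rough sketch only, the traptimeout approach might be adapted to the backup service discussed in this thread along the following lines. It assumes a backup that runs once a day and sends a trap when it finishes, so that a missing trap within 25 hours is treated as a failure. The 25h value, the period specification, the hostgroup definition, and the mail address are all assumptions for illustration and do not come from the thread. Note that traptimeout expects a trap at least that often, so this is still heartbeat-style and does not measure the time between a "begin" and an "end" trap.

```text
# Sketch under stated assumptions; not a tested configuration.
# Assumed: the hostgroup, the 25h timeout, the period, and the address.
hostgroup Backup serverA

watch Backup
    service bkServerA
        description Backup serverA
        # fail if no trap arrives within 25 hours (assumed daily backup)
        traptimeout 25h
        period wd {Sun-Sat}
            alert mail.alert admin@example.com
            upalert mail.alert -u admin@example.com
            alertevery 24h
```

Whether the timeout alert is actually delivered may depend on the mon version in use, given the do_alert behaviour reported at the start of this thread.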