Hi,

I haven't used nProbe, but you might also consider tackling this from the other end. On some systems, we've had services that for some reason just stop. Sometimes it is a temporary problem, with some patch or another messing things up. Other times it lasts longer...
I've written a few 'nanny' scripts, scheduled either in cron or the Windows Scheduler, that check whether a specific process is running and try to restart it if it isn't. They also e-mail me when they've had to restart a process or when they've failed to restart one.

Here's one we had for another service that I modified for nprobe. You'll need to change the nprobe arguments for your system, and maybe the path too. Set the e-mail recipients and SMTP server for your system as well.

The Perl 'Proc::ProcessTable' module is not installed by default. On OS X, just run "sudo perl -MCPAN -e 'install Proc::ProcessTable'" to install it before trying the script.

Hope it helps!

Ted

#!/usr/bin/perl
use strict;
use warnings;
use Proc::ProcessTable;
use Net::SMTP;

my $svc='nprobe';
my $dir='/usr/local/bin';
my $pid_dir='/var/run';
my $pid_file="$pid_dir/$svc.pid";
my $args="--daemon-mode --verbose 1 --pid-file $pid_file";
my $start_cmd="$dir/$svc $args";

## Mail notification config
my @recipients=qw( [email protected] );
my $smtp_server='smtp.your-domain.com';

# Number of seconds to wait before checking if
# 'kill' command, if executed, worked.
my $wait_time=3;
# Number of times to try to kill a process before
# giving up.
my $num_kill_tries=5;

########################
## MAIN
########################
my ($res,$whole_msg,$msg,$expected_pid,$actual_pid);
($expected_pid,$msg)=&check_for_pid_file($pid_file);
$whole_msg .= $msg;
($actual_pid,$msg)=&check_for_process();
$whole_msg .= $msg;

## Just quit now if the process is running correctly.
if ($expected_pid && $actual_pid) {
    if ($expected_pid == $actual_pid) {
        $msg = "Expected and actual PIDs match. Nothing to do.\n";
        $whole_msg .= $msg;
        print $whole_msg;
        exit;
    } else {
        $msg = "Problem: Actual PID doesn't match that in $pid_file.\n";
        $whole_msg .= $msg;
    }
}

## Try to restart the service and send a message.
($res,$msg)=&clean_and_stop_svc($actual_pid,$pid_file);
$whole_msg .= $msg;
($res,$msg)=&start_svc($start_cmd);
$whole_msg .= $msg;
&notify($whole_msg);

########################
## FUNCTIONS
########################

# Checks for $pid_file. If found, returns PID it contains
# if not, returns undef. Also returns info text as 2nd value.
sub check_for_pid_file {
    my ($f)=@_;
    my ($fh,$msg);
    my $pid=undef;
    if (-f $f) {
        if (open($fh,'<',$f)) {
            # Only the first line of the file is read.
            while (<$fh>) {
                chomp;
                $pid=$_;
                $msg="PID file \"$f\" exists, read PID $pid.\n";
                last;
            }
            close $fh;
        } else {
            $msg="PID file \"$f\" exists but could not be read.\n";
        }
    } else {
        $msg="PID file \"$f\" does not exist.\n";
    }
    return ($pid,$msg);
}

# Checks for running process. If found, returns process ID.
# If not, returns undef. Also returns info text as 2nd value.
sub check_for_process {
    my ($found_pid,$msg);
    my $pt = Proc::ProcessTable->new;
    my (@fields) = $pt->fields;
    # Find pid of desired service
    foreach my $proc ( @{$pt->table} ) {
        for my $field (@fields) {
            if ($field eq 'fname' && $proc->$field() eq $svc) {
                $found_pid=$proc->pid;
            }
        }
    }
    if (! $found_pid) {
        $msg="Process $svc is not running.\n";
    } else {
        $msg="Process $svc is running with PID $found_pid.\n";
    }
    return ($found_pid,$msg);
}

# Stops running service. Deletes any old .pid files.
# Returns 1 or undef, plus a text message. 1 means success, undef failure.
sub clean_and_stop_svc {
    my ($pid,$pid_file)=@_;
    my $res;
    # Local message buffer, so text already collected in MAIN is not repeated.
    my $msg='';
    my $tries=0;
    while (defined($pid)) {
        $res=kill 9, $pid;
        sleep $wait_time;
        $tries += 1;
        ($pid,$msg)=&check_for_process();
        if ($tries >= $num_kill_tries) {
            last;
        }
    }
    if (-f $pid_file) {
        $res=unlink $pid_file;
        if (! $res) {
            $msg="Failed to delete old .pid file \"$pid_file\".\n";
        } else {
            $msg="Deleted old .pid file \"$pid_file\".\n";
        }
    }
    if (! $pid) {
        if ($tries > 0) {
            $msg .= "Successfully stopped service $svc.\n";
            $pid=1;
        }
    } else {
        $msg .= "Failed to stop service $svc after $num_kill_tries attempts.\n";
        $pid=undef;
    }
    return ($pid,$msg);
}

# Starts service.
# Returns 1 or undef, plus a text message.
# 1 means success, undef means failure.
sub start_svc {
    my ($cmd)=@_;
    my $res=system($cmd);
    my $msg;
    if ($res) {
        $msg="ERROR trying to start $svc with command \"$start_cmd\".\n";
        # $! holds the OS error when the command could not be launched at all.
        $msg .= "Return code: $res, error text: $!.\n";
        $res=undef;
    } else {
        $msg="Successfully started $svc.\n";
        $res=1;
    }
    return ($res,$msg);
}

# Sends notification of what this nanny did and when.
sub notify {
    my ($msg)=@_;
    my $now=localtime(time);
    my $me=`basename $0`;
    my $hostname=`hostname`;
    chomp $me;
    chomp $hostname;
    my $smtp;
    my $subject="Notice: $svc on $hostname has been restarted.";
    if ($msg =~ /error/i || $msg =~ /fail/i) {
        $subject="Warning: $svc on $hostname is not running.";
    }
    $msg .= "\n---------------------------------------------------\n";
    $msg .= "This message was sent from $hostname by $me script on $now.\n";
    $smtp = Net::SMTP->new($smtp_server);
    $smtp->mail($ENV{USER});
    $smtp->recipient(@recipients);
    $smtp->data();
    # The sender's local part was lost in the list archive; $ENV{USER} is
    # assumed here to match the envelope sender set above.
    $smtp->datasend("From: $ENV{USER}\@$hostname\n");
    $smtp->datasend("To: $svc.service.managers.\n");
    $smtp->datasend("Subject: $subject\n");
    $smtp->datasend("\n");
    $smtp->datasend($msg);
    $smtp->dataend();
    $smtp->quit;
}

On Mon, Dec 21, 2009 at 11:54 PM, Damian Halloran <[email protected]> wrote:

> Hello all,
>
> I need assistance with monitoring that I am receiving probe data from a
> remote probe.
>
> The set up I have is an IM server receiving probes from a remote site. The
> probe source is at a client's site with a Mac Mini running OS X 10.6 and
> nProbe.
>
> All is working really well and it is fantastic.
>
> There is the occasional problem with nProbe where it will stop running and
> IM will receive no data from the mini. Currently the only way I see that
> this has happened is if I open the Flows windows and see that there is no
> data.
>
> Is there a way to set up a notifier that will let me know when the flows
> data has stopped being received by the IM server?
>
> Thanks
>
> Damian.
> --
> Damian Halloran B Comp CCNA
> Capital IT Pty Ltd
> http://www.capitalit.net.au
> 03 9024 0631
> 0419 308 036
> [email protected]
>
> ____________________________________________________________________
> List archives:
> http://www.mail-archive.com/intermapper-talk%40list.dartware.com/
> To unsubscribe: send email to: [email protected]
