I've used three different checks so far, and all of them seem to have flaws.


The first one was a simple TCP port check on the ports that MogileFS has open.
This is fine if you want to make sure the daemons are still running, but I
noticed cases where the DB went down and the ports remained open.
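
For illustration, a port check of that kind could look roughly like the
Python sketch below. It is not the plugin I ran; the function name is made
up for the example, and the host and port are whatever your trackers and
storage nodes listen on. The exit codes follow the usual Nagios plugin
convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import socket

# Nagios plugin exit codes (standard convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_tcp_port(host, port, timeout=5.0):
    """Return (status, message) for a plain TCP connect to host:port.

    Hypothetical helper: it only proves that something accepts
    connections on the port, not that the service behind it works.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return OK, f"TCP OK - {host}:{port} accepts connections"
    except OSError as exc:
        # Refused, unreachable, and timed-out connects all land here.
        return CRITICAL, f"TCP CRITICAL - {host}:{port}: {exc}"
```

A wrapper script would print the message and exit with the status. As noted
above, this says nothing about whether the database behind the daemon is up.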

Next I wrote something that used 'mogtool' to test injections and extractions.
However, 'mogtool' does way more than I needed, and it would also tend to keep
retrying when mogile went down, which made the NRPE call to the nagios plugin
time out.
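
One way around the retry problem is to give the external command a hard
deadline shorter than the NRPE timeout, so the plugin reports CRITICAL
itself instead of being cut off. A rough Python sketch, assuming the command
line is passed in by the caller (the exact 'mogtool' arguments are
deliberately left out, and the 8 second default is an arbitrary placeholder):

```python
import subprocess

# Nagios plugin exit codes used here.
OK, CRITICAL = 0, 2


def run_with_deadline(cmd, deadline=8.0):
    """Run cmd (a list of arguments); give up after `deadline` seconds.

    Hypothetical helper. Pick a deadline below the NRPE command timeout
    so that this function, not NRPE, decides what gets reported.
    """
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=deadline
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this.
        return CRITICAL, f"CRITICAL - command exceeded {deadline:g}s"
    except OSError as exc:
        return CRITICAL, f"CRITICAL - could not run command: {exc}"
    if proc.returncode != 0:
        return CRITICAL, f"CRITICAL - command exited {proc.returncode}"
    return OK, "OK - command completed"
```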

The last thing I wrote was a script that uses the MogileFS::Client perl
modules to do an injection and an extraction, and then compares the sizes of
the in/out files as a simple check that we got the same file back. This is
what we've been using so far. However, I have seen an instance where the
database was down, MogileFS::Backend returned a code of '82' or something in
that range, and my nagios check gave me the UNKNOWN status. That was a long
night of moving some development databases, so I wasn't up to debugging it
then and haven't revisited it yet.
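
The shape of that check, and one way to keep backend errors from surfacing
as UNKNOWN, can be sketched in Python as follows. This is not the Perl
script itself: `store` and `fetch` are hypothetical callables standing in
for whatever client library does the injection and extraction, and treating
every exception as CRITICAL is a choice made for this sketch, not something
the real plugin does today.

```python
# Nagios plugin exit codes used here.
OK, CRITICAL = 0, 2


def check_round_trip(store, fetch, payload=b"nagios-mogilefs-probe"):
    """Store a payload, fetch it back, and compare sizes.

    store(data) and fetch() -> bytes are caller-supplied placeholders.
    """
    try:
        store(payload)
        returned = fetch()
    except Exception as exc:
        # Assumption: any backend failure is an outage, so report
        # CRITICAL rather than letting it fall through as UNKNOWN.
        return CRITICAL, f"CRITICAL - round trip failed: {exc!r}"
    if len(returned) != len(payload):
        return CRITICAL, (
            f"CRITICAL - size mismatch: sent {len(payload)}, "
            f"got {len(returned)}"
        )
    return OK, f"OK - stored and fetched {len(payload)} bytes"
```

Comparing sizes only mirrors what the script does; comparing the bytes
themselves would be a stricter variant.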

Because most of the problems I've seen tend to revolve around the database
side, what I'm planning to do is modify my last nagios plugin to run a
'select 1' query on the Mogile DB first and alert if that fails. At least
I'll eliminate that first and then move on to testing whether the trackers
are functioning, etc.
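
As a rough Python sketch of that ordering (database first, everything else
only if it passes): `connect` is a hypothetical factory that returns a
DB-API 2.0 connection from whichever driver matches your Mogile DB, and
`run_checks` is a made-up name for the "stop at the first failure" loop.

```python
# Nagios plugin exit codes used here.
OK, CRITICAL = 0, 2


def check_db(connect):
    """Run 'SELECT 1' on a connection obtained from connect().

    connect is a caller-supplied placeholder returning a DB-API 2.0
    connection; connection details are intentionally not shown.
    """
    try:
        conn = connect()
        try:
            cur = conn.cursor()
            cur.execute("SELECT 1")
            row = cur.fetchone()
        finally:
            conn.close()
    except Exception as exc:
        return CRITICAL, f"DB CRITICAL - {exc!r}"
    if row is None or row[0] != 1:
        return CRITICAL, f"DB CRITICAL - unexpected result {row!r}"
    return OK, "DB OK - SELECT 1 succeeded"


def run_checks(checks):
    """Run (name, check) pairs in order; stop at the first non-OK one."""
    for name, check in checks:
        status, message = check()
        if status != OK:
            return status, f"{name}: {message}"
    return OK, "all checks passed"
```

Putting the database check first in the list means a DB outage is reported
as such, and the tracker and storage checks never get the chance to hang or
return something confusing.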

-- 
Justin Brehm 
Systems Engineer 
iContact.com 

----- Original Message ----- 
From: "Frieder Kundel" <[EMAIL PROTECTED]> 
To: [email protected] 
Sent: Monday, April 28, 2008 10:18:42 AM (GMT-0500) America/New_York 
Subject: Nagios plugin? 

Hi folks, 

how do you monitor your mogile? Has anyone written a nagios plugin? 

Best regards, 

Frieder Kundel 
