Re: [Nagios-users] Scheduled checks falling far behind

2010-10-27 Thread C. Bensend
> OK, well, I hope I'm not embarrassing myself with this. It's a perl > script and uses Ton Voon's nifty Nagios::Plugins module. I run checks > against things I want to know about. Thinking about it, I guess it would > be nice to have the failed hosts/services check alert on percentage of > fai

Re: [Nagios-users] Scheduled checks falling far behind

2010-10-27 Thread Frost, Mark {PBC}
n percentage of failures. Maybe someday. Mark -Original Message- From: C. Bensend [mailto:be...@bennyvision.com] Sent: Saturday, October 23, 2010 8:44 PM To: nagios-users@lists.sourceforge.net Subject: Re: [Nagios-users] Scheduled checks falling far behind > You can also run, if

Re: [Nagios-users] Scheduled checks falling far behind

2010-10-25 Thread Litwin, Matthew
>> >> What happens when you disable performance-data parsing and writing? > > Actually, that was what I am trying to get working properly. My RRD data > files are sparse as a result. Wow, after moving nagiosgraph back to batch mode from immediate mode latency dropped to nothing! Now to see if

Re: [Nagios-users] Scheduled checks falling far behind

2010-10-25 Thread Litwin, Matthew
>>> >>> What happens when you disable performance-data parsing and writing? >> >> Actually, that was what I am trying to get working properly. My RRD >> data files are sparse as a result. > > Even so, try disabling it for a bit and see if the way performance > data is handled is causing problems

Re: [Nagios-users] Scheduled checks falling far behind

2010-10-25 Thread Andreas Ericsson
On 10/25/2010 01:19 AM, Litwin, Matthew wrote: > On Oct 24, 2010, at 3:02 PM, Andreas Ericsson wrote: >> >> Note that you should wipe your status.sav files between restarts to >> not let old latency affect the numbers you're seeing. > > I don't seem to have them on my system. Perhaps you haven't

Re: [Nagios-users] Scheduled checks falling far behind

2010-10-25 Thread Andreas Ericsson
On 10/24/2010 12:31 AM, Litwin, Matthew wrote: > > On Oct 22, 2010, at 6:53 PM, Frost, Mark {PBC} wrote: > >> Matthew, >> >> You don't say, but my guess would be that you have high latencies. >> That is, for one of several reasons, Nagios is not able to run >> checks when it thinks it should. Yo

Re: [Nagios-users] Scheduled checks falling far behind

2010-10-24 Thread Litwin, Matthew
It would appear that running nagiosgraph in immediate mode was the cause of the latency. However, batch mode has some problems that dash out big chunks of data, so that won't work either. It looks like I will need to find another solution, sadly, especially since I have invested so much time to s

Re: [Nagios-users] Scheduled checks falling far behind

2010-10-24 Thread Litwin, Matthew
On Oct 24, 2010, at 3:02 PM, Andreas Ericsson wrote: > On 10/24/2010 10:14 PM, Litwin, Matthew wrote: >> Hi Matthieu (and anyone else who might want to throw their hat into >> the ring): >> > > I'll chip in. Your MUA seems to not wrap lines at all though, which > makes replying inline a bit tri

Re: [Nagios-users] Scheduled checks falling far behind

2010-10-24 Thread Andreas Ericsson
On 10/24/2010 10:14 PM, Litwin, Matthew wrote: > Hi Matthieu (and anyone else who might want to throw their hat into > the ring): > I'll chip in. Your MUA seems to not wrap lines at all though, which makes replying inline a bit tricky. Note that you should wipe your status.sav files between rest

Re: [Nagios-users] Scheduled checks falling far behind

2010-10-24 Thread Andreas Ericsson
On 10/23/2010 07:16 PM, Litwin, Matthew wrote: > For the Total Services, what do the three X / X / X values mean? Is it last > 1/5/15 min? > It's min / max / average. -- Andreas Ericsson andreas.erics...@op5.se OP5 AB www.op5.se Tel: +46 8-230225

Re: [Nagios-users] Scheduled checks falling far behind

2010-10-24 Thread Litwin, Matthew
Hi Matthieu (and anyone else who might want to throw their hat into the ring): So after identifying that I have latency times that are around 500-600 seconds, I have tried the tuning tips from the Nagios docs. However, I have fiddled with it, and while latency drops briefly after the restart, th

Re: [Nagios-users] Scheduled checks falling far behind

2010-10-24 Thread Mathieu Gagné
On 2010-10-24 03:54, Litwin, Matthew wrote: > You hit the nail on the head. Changing MaxBytes to a very large number made > latency totally dwarf execution time. > > So now what do I do? Try disabling environment variables in nagios.cfg: enable_environment_macros = 0 Our latency dropped from 20
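Before restarting Nagios, it can help to confirm what the setting suggested above is actually set to. The following is a minimal sketch, not part of the original reply: it assumes nagios.cfg uses plain `name=value` lines with `#` comments, and it assumes the last occurrence of a directive is the one that counts. The function name `read_directive` and the sample text are hypothetical.

```python
def read_directive(cfg_text, name):
    """Return the last value assigned to `name` in nagios.cfg-style text, or None."""
    value = None
    for raw in cfg_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        key, sep, val = line.partition("=")
        # Tolerate optional whitespace around the "=" sign.
        if sep and key.strip() == name:
            value = val.strip()  # assumption: a later occurrence overrides an earlier one
    return value


# Hypothetical excerpt of a nagios.cfg, for illustration only.
sample = """\
# environment macros are expensive to build for every check
enable_environment_macros=0
"""

print(read_directive(sample, "enable_environment_macros"))
```

In practice you would pass the contents of your own nagios.cfg instead of `sample`; the path to that file differs between installations.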

Re: [Nagios-users] Scheduled checks falling far behind

2010-10-24 Thread Litwin, Matthew
You hit the nail on the head. Changing MaxBytes to a very large number made latency totally dwarf execution time. So now what do I do? On Oct 23, 2010, at 4:07 PM, Mathieu Gagné wrote: > On 2010-10-23 18:31, Litwin, Matthew wrote: >> >> I have set up MRTG to track nagios performance and it is r

Re: [Nagios-users] Scheduled checks falling far behind

2010-10-23 Thread C. Bensend
> You can also run, if memory serves, the "nagiostats" command located in > your Nagios "bin" directory to see this information as well. I actually > use that nagiostats data in a custom check and graph a lot of those > latencies and other Nagios performance related info. Boy, would I *love* to

Re: [Nagios-users] Scheduled checks falling far behind

2010-10-23 Thread Mathieu Gagné
On 2010-10-23 18:31, Litwin, Matthew wrote: > > I have set up MRTG to track nagios performance and it is reporting that > latency for host and service checks is next to nothing and service execution > time is just under 400 ms, however, host checks are coming back at around 4 > seconds. Based on

Re: [Nagios-users] Scheduled checks falling far behind

2010-10-23 Thread Litwin, Matthew
On Oct 22, 2010, at 6:53 PM, Frost, Mark {PBC} wrote: > Matthew, > > You don't say, but my guess would be that you have high latencies. That is, > for one of several reasons, Nagios is not able to run checks when it thinks > it should. You can see this information and other stats by looking a

Re: [Nagios-users] Scheduled checks falling far behind

2010-10-23 Thread Litwin, Matthew
On Oct 22, 2010, at 7:09 PM, Jonathan Angliss wrote: On 10/22/10 19:29, Litwin, Matthew wrote: --Service information-- Last Updated: Sat Oct 23 00:19:02 UTC 2010 --Service State Information-- Current Status: OK (for 7d 16h 14m 46s) Status Information: CPU STATISTICS OK : user=0.12% system=0.

Re: [Nagios-users] Scheduled checks falling far behind

2010-10-23 Thread Litwin, Matthew
oked up that max latency and then >> quickly looked in the status.dat file to find the service that had that same >> matching latency and dug into that. You could, for example, have a few >> checks that aren't really timing out so the check may take 10 minutes or >> m
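The manual lookup described in the quoted advice (find the service in status.dat whose latency matches the maximum) could be scripted roughly as follows. This is a sketch under assumptions, not something posted in the thread: it assumes status.dat consists of `servicestatus { ... }` blocks with `key=value` lines, and that the fields are named `host_name`, `service_description` and `check_latency`. Verify those names against your own file. The helper names and the sample data are hypothetical.

```python
def parse_service_blocks(status_text):
    """Collect each 'servicestatus { ... }' block as a dict of its key=value lines."""
    blocks = []
    current = None
    for raw in status_text.splitlines():
        line = raw.strip()
        if line.endswith("{"):
            # Start of a block; only keep the ones describing services.
            kind = line[:-1].strip()
            current = {} if kind == "servicestatus" else None
        elif line == "}":
            if current is not None:
                blocks.append(current)
            current = None
        elif current is not None and "=" in line:
            key, _, val = line.partition("=")
            current[key] = val
    return blocks


def worst_latency(status_text):
    """Return the service block with the highest check_latency, or None if there is none."""
    services = [b for b in parse_service_blocks(status_text) if "check_latency" in b]
    if not services:
        return None
    return max(services, key=lambda b: float(b["check_latency"]))


# Made-up sample data in the assumed status.dat layout.
sample = """\
hoststatus {
\thost_name=web01
\tcheck_latency=0.250
\t}
servicestatus {
\thost_name=web01
\tservice_description=CPU
\tcheck_latency=1.500
\t}
servicestatus {
\thost_name=db01
\tservice_description=DISK
\tcheck_latency=540.125
\t}
"""

worst = worst_latency(sample)
print(worst["host_name"], worst["service_description"], worst["check_latency"])
```

Once the worst offender is known, its check command can be run by hand to see whether it is slow to execute or simply scheduled late.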

Re: [Nagios-users] Scheduled checks falling far behind

2010-10-23 Thread Litwin, Matthew
which would really screw up your overall latencies. Like the > checks wouldn't have finished before the next time they were supposed to be > run. > > Mark > > ________________ > From: Litwin, Matthew [mlit...@stubhub.com] > Sent: Friday, October 22, 2010 8:2

Re: [Nagios-users] Scheduled checks falling far behind

2010-10-22 Thread Jonathan Angliss
On 10/22/10 19:29, Litwin, Matthew wrote: > I have been chasing my tail trying to figure out why my RRD files were > very sparsely populated, and I am realizing that my checks are falling > behind their scheduled times by up to 3 times their set check interval. > For example, a service that should b

Re: [Nagios-users] Scheduled checks falling far behind

2010-10-22 Thread Frost, Mark {PBC}
finished before the next time they were supposed to be run. Mark From: Litwin, Matthew [mlit...@stubhub.com] Sent: Friday, October 22, 2010 8:29 PM To: nagios-users@lists.sourceforge.net Subject: [Nagios-users] Scheduled checks falling far behind I have been chasing

[Nagios-users] Scheduled checks falling far behind

2010-10-22 Thread Litwin, Matthew
I have been chasing my tail trying to figure out why my RRD files were very sparsely populated, and I am realizing that my checks are falling behind their scheduled times by up to 3 times their set check interval. For example, a service that should be checking every 5 minutes. In the example belo