I am seeing a cascading effect when service checks hit timeout errors.  I have 
401 hosts and 6182 service checks in this particular instance.  When I get more 
than 3 or 4 socket timeouts during service checks, Icinga chokes: all the 
checks slowly start timing out and the latency climbs rapidly (~800-900 seconds 
was the longest we let it get to).  If I restart the service, everything is 
fine again.  I have the box (a physical machine) set up to do no more than 500 
checks/sec, and according to our calculations we could easily do double that 
with the resources on this box.
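
As a rough sanity check on the scheduling load, here is a minimal sketch of the 
arithmetic.  The 5-minute check interval is an assumption for illustration only 
(our real intervals vary per service), and the function name is made up.

```python
def required_checks_per_sec(num_checks: int, interval_seconds: float) -> float:
    """Average check rate needed to run every check once per interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return num_checks / interval_seconds


SERVICE_CHECKS = 6182        # service checks in this instance
ASSUMED_INTERVAL = 300.0     # ASSUMPTION: every check runs every 5 minutes
CONFIGURED_CEILING = 500.0   # checks/sec limit configured on the box

rate = required_checks_per_sec(SERVICE_CHECKS, ASSUMED_INTERVAL)
headroom = CONFIGURED_CEILING / rate
print(f"{rate:.1f} checks/sec needed, {headroom:.1f}x below the ceiling")
```

Under that assumption the steady-state load is far below the configured limit, 
which is why the runaway latency looks like a scheduling or blocking problem 
rather than a raw capacity problem.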

I have seen this issue on virtual machines as well, but it happened much 
faster and the latency got higher much more quickly.  I have been through many 
of the settings and I can't seem to find one that will prevent or help with 
this issue.  Any thoughts on where to look or how to diagnose further?
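
One diagnostic I can run in the meantime is to pull the services with the worst 
latency out of the status file.  The sketch below assumes the usual Icinga 1.x 
status.dat layout (servicestatus { ... } blocks with key=value lines such as 
check_latency and check_execution_time); the function names and the sample data 
are made up for illustration.

```python
from typing import Dict, Iterator, List, Tuple


def iter_service_blocks(text: str) -> Iterator[Dict[str, str]]:
    """Yield one dict of key=value pairs per 'servicestatus { ... }' block."""
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if current is None:
            # Only start collecting when a servicestatus block opens.
            if stripped.replace(" ", "") == "servicestatus{":
                current = {}
        elif stripped == "}":
            yield current
            current = None
        elif "=" in stripped:
            key, _, value = stripped.partition("=")
            current[key] = value


def worst_latency(text: str, top: int = 10) -> List[Tuple[float, float, str, str]]:
    """Return (latency, execution_time, host, service), highest latency first."""
    rows = []
    for block in iter_service_blocks(text):
        try:
            latency = float(block.get("check_latency", "0"))
            exec_time = float(block.get("check_execution_time", "0"))
        except ValueError:
            continue  # skip blocks with unparsable numbers
        rows.append((latency, exec_time,
                     block.get("host_name", "?"),
                     block.get("service_description", "?")))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:top]


# Made-up sample data, not taken from the real instance.
SAMPLE = """
servicestatus {
    host_name=web01
    service_description=HTTP
    check_execution_time=60.004
    check_latency=850.500
    }
servicestatus {
    host_name=db01
    service_description=Load
    check_execution_time=0.120
    check_latency=1.250
    }
"""

for latency, exec_time, host, service in worst_latency(SAMPLE):
    print(f"{host}/{service}: latency={latency:.1f}s exec={exec_time:.1f}s")
```

Against the live system, the text would come from reading the file named by 
status_file in icinga.cfg.  Checks whose execution time sits at the timeout 
value are the ones I would look at first.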

My setup is:
Icinga 1.8.1
NRPE 1.12
OpCFG 1.0
PNP4Nagios 0.6.17
RRDTool 1.4.7
Nagios Plugins 1.4.16
CentOS 5.9

The specs for the machine are 8 core XEON and 16GB of RAM.

Thanks

Matt Jones | Systems Engineer
matt.jo...@monster.com | T : 978-823-2032 | M : 978-760-5645
MONSTER, 5 Clock Tower Place, Suite 500, Maynard, MA 01754


_______________________________________________
icinga-users mailing list
icinga-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/icinga-users
