I am seeing a cascading effect when service checks hit timeout errors. This
instance has 401 hosts and 6182 service checks. Once more than 3 or 4 service
checks hit socket timeouts, Icinga chokes: all the checks slowly start timing
out and the latency climbs rapidly. The longest we let it get to was ~800-900
seconds. If I restart the service, everything is fine again. The box (a
physical machine) is set up to run no more than 500 checks/sec, and according
to our calculations it could easily do double that with its resources.
I have seen this issue on virtual machines as well, but there it happened much
faster and the latency climbed much more quickly. I have been through many of
the settings and can't find one that prevents or helps with this issue. Any
thoughts on where to look or how to diagnose this further?
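
For a rough sense of scale, the load can be sketched as below. Only the 6182
check count comes from the numbers above. The 5-minute check interval, the
60-second timeout, and the 1-second normal run time are assumptions for
illustration, not values taken from this configuration. The sketch uses
Little's law (checks in flight = start rate x mean duration) to show how a
small share of timing-out checks multiplies the number of checks running at
once.

```python
# Back-of-envelope load estimate. Only the check count comes from the post;
# the interval, timeout, and normal run time are assumed for illustration.
SERVICE_CHECKS = 6182       # from the post
CHECK_INTERVAL_S = 300.0    # assumption: 5-minute check interval
NORMAL_RUNTIME_S = 1.0      # assumption: duration of a healthy check
TIMEOUT_S = 60.0            # assumption: per-check timeout


def required_rate(checks: int, interval_s: float) -> float:
    """Checks per second needed to run every check once per interval."""
    return checks / interval_s


def in_flight(rate: float, timeout_fraction: float,
              normal_s: float, timeout_s: float) -> float:
    """Average number of concurrently running checks (rate * mean duration)."""
    mean_duration = ((1.0 - timeout_fraction) * normal_s
                     + timeout_fraction * timeout_s)
    return rate * mean_duration


rate = required_rate(SERVICE_CHECKS, CHECK_INTERVAL_S)
print(f"required rate: {rate:.1f} checks/sec")
for fraction in (0.0, 0.01, 0.05, 0.25):
    running = in_flight(rate, fraction, NORMAL_RUNTIME_S, TIMEOUT_S)
    print(f"{fraction:>5.0%} timing out -> ~{running:.0f} checks in flight")
```

Under these assumed numbers the required start rate is far below the 500
checks/sec limit, which would suggest looking at how long timed-out checks hold
their slots rather than at raw throughput.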
My setup is:
Icinga 1.8.1
NRPE 1.12
OpCFG 1.0
PNP4Nagios 0.6.17
RRDTool 1.4.7
Nagios Plugins 1.4.16
CentOS 5.9
The machine has an 8-core Xeon and 16 GB of RAM.
Thanks
Matt Jones | Systems Engineer
matt.jo...@monster.com | T: 978-823-2032 | M: 978-760-5645
MONSTER, 5 Clock Tower Place, Suite 500, Maynard, MA 01754