PoCo::Client::Ping intermittently fails to emit pings

Stuart Kendrick Wed, 15 Apr 2009 05:00:05 -0700

my favorite pinging application (in-house app, uses POE::Component::Client::Ping) intermittently reports a rash of missed replies. happens roughly the same time each night; the condition lasts for several minutes

i sat up and watched it last night -- put my code into debug mode (so that it logs each hit & missed ping) and ran a sniffer


(1) i can see from debug output the rash of "no response" messages

(2) i can see from the packet trace that the box did not emit any ICMP Echoes across the window during which my code is complaining about "no response"

(3) this box runs a bunch of pinging apps ... i can see ICMP Destination Unreachable responses trickling back to another application (which uses fping) during the relevant window (a handful per second)

have any pointers on how one might debug such a condition? i suspect that the root cause is OS-related, rather than POE-related ... what condition would interfere with Linux's ability to emit pings?



Details:
-  perl-5.10.0, POE 1.0.4, POE::Component::Client::Ping 1.14

-  CentOS 5.3 (2.6.18-128.el5 x86_64 x86_64 x86_64 GNU/Linux)

-  the application is pinging ~120 hosts every 30 seconds

-  the box gets busy during this time, in terms of disk I/O (average latency of
   ~100ms ... fifteen minute rolling load average spikes to ~18:  likely stems
   from several disk-intensive jobs which run during this period).  this
   busyness lasts for ~5 hours

-  the box hosts a number of pinging applications (Nagios, a bunch of in-house
   apps), typically employing fping

-  i've been running the application for a handful of years now, a handful of
   instances, monitoring ~100 - 500 hosts per instance.  this is novel behavior
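
for scale, a rough sketch of how many ICMP Echoes the sniffer should have seen from this one application if the box had been emitting normally. it assumes one echo per host per polling interval with no retries; the sub name expected_echoes and the 180 second window (a stand-in for "several minutes") are illustrative assumptions, not taken from the application.

```perl
use strict;
use warnings;

# Expected number of ICMP Echo requests in a capture window, assuming
# exactly one echo per host per polling interval (no retries).
# expected_echoes is a hypothetical helper, not part of POE.
sub expected_echoes {
    my ($hosts, $interval_s, $window_s) = @_;
    die "interval must be positive\n" unless $interval_s > 0;
    return int($hosts * $window_s / $interval_s);
}

# ~120 hosts and a 30 second interval come from the details above;
# the 180 second window is an assumed value for "several minutes".
my $expected = expected_echoes(120, 30, 180);
printf "expect ~%d echoes in the window (%.1f per second)\n",
    $expected, 120 / 30;
```

under those assumptions the trace should show on the order of 4 echoes per second from this application, so a window with zero echoes is not a sampling fluke.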


--sk

stuart kendrick
fred hutchinson cancer research center
seattle, wa usa
