The majority of our devices are network switches and routers.

I may have to use an access list of some sort to sniff out the ICMP being
sent to the device.

I originally had all the devices being polled by SNMP, then switched back to
using ICMP because the same thing was happening.  With the SNMP poll, we
saw that the device started to return information but did not complete the
request, maybe because SNMP is a low priority process on Cisco routers and
switches.  If the device gets busy, SNMP takes a back seat; this may be
the issue with ICMP also.

Another thing to note is that during the summer, when I evaluated
InterMapper prior to our purchase, we did not see any failed polls, either
ICMP or SNMP.  Of course, during the summer the students were gone and the
network had a minimal load on it.

I am not sure InterMapper is at fault; I am hoping that an adjustment to
the retries or some other setting may help work around this.

It may also be noted that when a failed poll occurs, I can turn around
and issue a manual ping from the same server that InterMapper is running
on, and it responds without any problem.

Steve

-----Original Message-----
From: [email protected]
[mailto:[EMAIL PROTECTED] On Behalf Of Doug Weathers
Sent: Monday, January 24, 2005 1:19 PM
To: [email protected]
Subject: Re: [IM-Talk] DOWN outages because of no response to ICMP Echo
polling - has anyone else had this type o


Hi,

The information you gave seems to indicate that IM is sending out the
pings.  It's just that the responses are not being returned.  This would
indicate that the problem is in your network, or on the devices being
monitored, and not on the IM server.

It's also possible that the pings are being returned, but the IM server
is not receiving or recognizing them.  To discover this, run another
packet sniffer (on a different computer) on the server's connection and
compare what it sees with what is seen on the IM server.  If you see
pings on the wire but not on the server, then you've got some sort of
problem with the server.
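
The comparison described above can be sketched in a few lines of Python. This is only an illustration: it assumes both captures are saved as snoop-style text (the format shown in the original report), and the function names are made up for the example. Other sniffers print ICMP differently and would need a different pattern.

```python
import re

# Matches snoop's echo-reply summary, even if a mail client wrapped the line.
REPLY = re.compile(
    r"ICMP\s+Echo\s+reply\s+\(ID:\s*(\d+)\s+Sequence\s+number:\s*(\d+)\)")

def reply_keys(capture_text):
    """Return the set of (ID, sequence) pairs for every echo reply in a capture."""
    return {(int(ident), int(seq)) for ident, seq in REPLY.findall(capture_text)}

def seen_only_on_wire(wire_capture, server_capture):
    """Replies the external sniffer saw that the server's own capture did not."""
    return sorted(reply_keys(wire_capture) - reply_keys(server_capture))
```

An empty result means every reply on the wire also reached the server's capture, which points away from the server and back toward the network or the device.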

Another suggestion: see if you can use different kinds of probes.  ICMP
is very basic and is generally given a low priority on devices and
networks.  SNMP returns a much richer set of data, and where SNMP is not
available on a host device, often you can make a custom TCP probe that
can give more information more reliably.
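
To illustrate the idea of a TCP-based check (this is not InterMapper's custom probe format, just a standalone sketch), a plain TCP connect test might look like the following. The function name, the default timeout, and the choice of port are assumptions for the example.

```python
import socket
import time

def tcp_probe(host, port, timeout=3.0):
    """Attempt one TCP connection; return (succeeded, seconds_elapsed)."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False, time.monotonic() - start
```

For a switch or router this would be pointed at whatever management port the device actually listens on, for example `tcp_probe("129.21.66.15", 23)` if telnet is enabled there.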

When you say "some device polls fail", is this occurring randomly
across all of your 1319 devices, or only with certain devices being
monitored?

Can you identify any common factors on the devices that are not
responding, like a) they're all on the same subnet b) they're behind a
particular router c) they're all routers and they're too busy to respond
to ICMP reliably?

If you can identify a particular machine that seems to "fail" a lot,
you could try setting up a continuous ping to it from another (closer)
machine to see if the problem is happening at the device or if it's
happening somewhere on the network between the device and your IM
server.
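
A continuous ping with timestamps for the failures could be sketched like this. It assumes the system `ping` accepts `-c 1` (the Linux/BSD/macOS form; Solaris `ping` takes different arguments), and the function names are invented for the example.

```python
import subprocess
import time
from datetime import datetime

def ping_once(host):
    """Send one echo request with the system ping ('-c 1' form is an assumption)."""
    result = subprocess.run(["ping", "-c", "1", host],
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0

def watch(host, count, interval=1.0, probe=ping_once, sleep=time.sleep):
    """Probe `host` `count` times; return the timestamps of the failed probes."""
    failures = []
    for _ in range(count):
        if not probe(host):
            failures.append(datetime.now())
        sleep(interval)
    return failures
```

Running this at the same time from a machine near the device and from the IM server, then comparing the failure timestamps, would show whether the losses happen at the device or along the path.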

Hope this helps,

Doug



--
Doug Weathers, Network Administrator
St. Charles Medical Center

>>> [EMAIL PROTECTED] 01/24/2005 9:39:48 AM >>>
Problem:  Some device polls fail and will then generate an outage.  The
situation is that these devices are not down.  It is just that for that
poll no response came back.

Before I go and try changing anything like timeouts and device down
threshold for lost packets, I am looking to see if anyone else has
experienced this situation, and any suggestions on how to resolve it.

Environment:
        Host Server: Sun Enterprise 250
        Host Version: SunOS 5.9 (Solaris 9)
        IM Server Version: 4.2.4b5
        IM Monitoring: 1319 devices
        Polling Method For All Devices: Ping
        Polling Period: Majority at 2 minutes
                        Some routers and servers at 1 minute

Troubleshooting:
        Ran snoop on the server until we saw some of these missed polls.
        Verified that the ping was sent out (3 pings, 3 seconds apart)
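
As a rough way to reason about the retry count, the arithmetic below estimates how often all pings in a poll would be lost by chance. It assumes each ping is lost independently with the same probability, which a busy device that drops packets in bursts would violate, so the real figure could be higher. It also treats every device as polled every 2 minutes. The 1% loss rate is purely illustrative, not a measurement.

```python
def expected_false_outages_per_day(loss_rate, pings_per_poll, devices, poll_period_s):
    """Expected polls per day in which every ping is lost (independent losses)."""
    polls_per_day = devices * (86400 / poll_period_s)
    return polls_per_day * loss_rate ** pings_per_poll

# 1319 devices, 2-minute period, 3 pings per poll come from the report;
# the 0.01 loss rate is a made-up value for illustration.
print(expected_false_outages_per_day(0.01, 3, 1319, 120))
```

Under these assumptions the result is a little under one false outage per day across all devices, and each extra ping per poll divides that by 100, which suggests why raising the retry count can hide occasional loss.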

I am including the snoop data.  I already removed any lines not relevant
to this device.  I am only including the snoop for this time period,
showing a successful poll both before and after the poll without
replies.

At 9:38:50 the first of three ICMP requests was sent without getting
any reply.

109147 9:35:55.40969    rit010n19 -> x009-066m-15n3.rit.edu ETHER
Type=0800 (IP), size = 90 bytes
109147 9:35:55.40969    rit010n19 -> x009-066m-15n3.rit.edu IP
D=129.21.66.15 S=129.21.6.1 LEN=76, ID=40891, TOS=0x0, TTL=255
109147 9:35:55.40969    rit010n19 -> x009-066m-15n3.rit.edu UDP D=123
S=123 LEN=56
109147 9:35:55.40969    rit010n19 -> x009-066m-15n3.rit.edu NTP
server
(Mon Jan 24 09:35:55 2005)

113986 9:36:50.78327    rit010n19 -> x009-066m-15n3.rit.edu ETHER
Type=0800 (IP), size = 62 bytes
113986 9:36:50.78327    rit010n19 -> x009-066m-15n3.rit.edu IP
D=129.21.66.15 S=129.21.6.1 LEN=48, ID=40892, TOS=0x0, TTL=255
113986 9:36:50.78327    rit010n19 -> x009-066m-15n3.rit.edu ICMP Echo
request (ID: 27552 Sequence number: 43572)

113987 9:36:50.78391 x009-066m-15n3.rit.edu -> rit010n19    ETHER
Type=0800 (IP), size = 62 bytes
113987 9:36:50.78391 x009-066m-15n3.rit.edu -> rit010n19    IP
D=129.21.6.1 S=129.21.66.15 LEN=48, ID=40892, TOS=0x0, TTL=252
113987 9:36:50.78391 x009-066m-15n3.rit.edu -> rit010n19    ICMP Echo
reply (ID: 27552 Sequence number: 43572)

114640 9:36:59.41228 x009-066m-15n3.rit.edu -> rit010n19    ETHER
Type=0800 (IP), size = 90 bytes
114640 9:36:59.41228 x009-066m-15n3.rit.edu -> rit010n19    IP
D=129.21.6.1 S=129.21.66.15 LEN=76, ID=0, TOS=0xc0, TTL=252
114640 9:36:59.41228 x009-066m-15n3.rit.edu -> rit010n19    UDP D=123
S=123 LEN=56
114640 9:36:59.41228 x009-066m-15n3.rit.edu -> rit010n19    NTP
client
(Mon Jan 24 09:36:59 2005)

114641 9:36:59.41289    rit010n19 -> x009-066m-15n3.rit.edu ETHER
Type=0800 (IP), size = 90 bytes
114641 9:36:59.41289    rit010n19 -> x009-066m-15n3.rit.edu IP
D=129.21.66.15 S=129.21.6.1 LEN=76, ID=40893, TOS=0x0, TTL=255
114641 9:36:59.41289    rit010n19 -> x009-066m-15n3.rit.edu UDP D=123
S=123 LEN=56
114641 9:36:59.41289    rit010n19 -> x009-066m-15n3.rit.edu NTP
server
(Mon Jan 24 09:36:59 2005)

124052 9:38:3.41454 x009-066m-15n3.rit.edu -> rit010n19    ETHER
Type=0800 (IP), size = 90 bytes
124052 9:38:3.41454 x009-066m-15n3.rit.edu -> rit010n19    IP
D=129.21.6.1 S=129.21.66.15 LEN=76, ID=0, TOS=0xc0, TTL=252
124052 9:38:3.41454 x009-066m-15n3.rit.edu -> rit010n19    UDP D=123
S=123 LEN=56
124052 9:38:3.41454 x009-066m-15n3.rit.edu -> rit010n19    NTP  client
(Mon Jan 24 09:38:03 2005)

124053 9:38:3.41515    rit010n19 -> x009-066m-15n3.rit.edu ETHER
Type=0800 (IP), size = 90 bytes
124053 9:38:3.41515    rit010n19 -> x009-066m-15n3.rit.edu IP
D=129.21.66.15 S=129.21.6.1 LEN=76, ID=40894, TOS=0x0, TTL=255
124053 9:38:3.41515    rit010n19 -> x009-066m-15n3.rit.edu UDP D=123
S=123 LEN=56
124053 9:38:3.41515    rit010n19 -> x009-066m-15n3.rit.edu NTP  server
(Mon Jan 24 09:38:03 2005)

128960 9:38:50.79269    rit010n19 -> x009-066m-15n3.rit.edu ETHER
Type=0800 (IP), size = 62 bytes
128960 9:38:50.79269    rit010n19 -> x009-066m-15n3.rit.edu IP
D=129.21.66.15 S=129.21.6.1 LEN=48, ID=40895, TOS=0x0, TTL=255
128960 9:38:50.79269    rit010n19 -> x009-066m-15n3.rit.edu ICMP Echo
request (ID: 52560 Sequence number: 45139)

129209 9:38:53.79374    rit010n19 -> x009-066m-15n3.rit.edu ETHER
Type=0800 (IP), size = 62 bytes
129209 9:38:53.79374    rit010n19 -> x009-066m-15n3.rit.edu IP
D=129.21.66.15 S=129.21.6.1 LEN=48, ID=40896, TOS=0x0, TTL=255
129209 9:38:53.79374    rit010n19 -> x009-066m-15n3.rit.edu ICMP Echo
request (ID: 52561 Sequence number: 45172)

129547 9:38:56.79264    rit010n19 -> x009-066m-15n3.rit.edu ETHER
Type=0800 (IP), size = 62 bytes
129547 9:38:56.79264    rit010n19 -> x009-066m-15n3.rit.edu IP
D=129.21.66.15 S=129.21.6.1 LEN=48, ID=40897, TOS=0x0, TTL=255
129547 9:38:56.79264    rit010n19 -> x009-066m-15n3.rit.edu ICMP Echo
request (ID: 52562 Sequence number: 45226)

130559 9:39:7.41604 x009-066m-15n3.rit.edu -> rit010n19    ETHER
Type=0800 (IP), size = 90 bytes
130559 9:39:7.41604 x009-066m-15n3.rit.edu -> rit010n19    IP
D=129.21.6.1 S=129.21.66.15 LEN=76, ID=0, TOS=0xc0, TTL=252
130559 9:39:7.41604 x009-066m-15n3.rit.edu -> rit010n19    UDP D=123
S=123 LEN=56
130559 9:39:7.41604 x009-066m-15n3.rit.edu -> rit010n19    NTP  client
(Mon Jan 24 09:39:07 2005)

130560 9:39:7.41672    rit010n19 -> x009-066m-15n3.rit.edu ETHER
Type=0800 (IP), size = 90 bytes
130560 9:39:7.41672    rit010n19 -> x009-066m-15n3.rit.edu IP
D=129.21.66.15 S=129.21.6.1 LEN=76, ID=40898, TOS=0x0, TTL=255
130560 9:39:7.41672    rit010n19 -> x009-066m-15n3.rit.edu UDP D=123
S=123 LEN=56
130560 9:39:7.41672    rit010n19 -> x009-066m-15n3.rit.edu NTP  server
(Mon Jan 24 09:39:07 2005)

137427 9:40:11.42005 x009-066m-15n3.rit.edu -> rit010n19    ETHER
Type=0800 (IP), size = 90 bytes
137427 9:40:11.42005 x009-066m-15n3.rit.edu -> rit010n19    IP
D=129.21.6.1 S=129.21.66.15 LEN=76, ID=0, TOS=0xc0, TTL=252
137427 9:40:11.42005 x009-066m-15n3.rit.edu -> rit010n19    UDP D=123
S=123 LEN=56
137427 9:40:11.42005 x009-066m-15n3.rit.edu -> rit010n19    NTP
client
(Mon Jan 24 09:40:11 2005)

137429 9:40:11.42087    rit010n19 -> x009-066m-15n3.rit.edu ETHER
Type=0800 (IP), size = 90 bytes
137429 9:40:11.42087    rit010n19 -> x009-066m-15n3.rit.edu IP
D=129.21.66.15 S=129.21.6.1 LEN=76, ID=40899, TOS=0x0, TTL=255
137429 9:40:11.42087    rit010n19 -> x009-066m-15n3.rit.edu UDP D=123
S=123 LEN=56
137429 9:40:11.42087    rit010n19 -> x009-066m-15n3.rit.edu NTP
server
(Mon Jan 24 09:40:11 2005)

140956 9:40:50.79352    rit010n19 -> x009-066m-15n3.rit.edu ETHER
Type=0800 (IP), size = 62 bytes
140956 9:40:50.79352    rit010n19 -> x009-066m-15n3.rit.edu IP
D=129.21.66.15 S=129.21.6.1 LEN=48, ID=40900, TOS=0x0, TTL=255
140956 9:40:50.79352    rit010n19 -> x009-066m-15n3.rit.edu ICMP Echo
request (ID: 12128 Sequence number: 46721)

140957 9:40:50.79410 x009-066m-15n3.rit.edu -> rit010n19    ETHER
Type=0800 (IP), size = 62 bytes
140957 9:40:50.79410 x009-066m-15n3.rit.edu -> rit010n19    IP
D=129.21.6.1 S=129.21.66.15 LEN=48, ID=40900, TOS=0x0, TTL=252
140957 9:40:50.79410 x009-066m-15n3.rit.edu -> rit010n19    ICMP Echo
reply (ID: 12128 Sequence number: 46721)
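
For longer captures, picking out the unanswered requests by eye gets tedious. A minimal sketch of automating it, assuming snoop-style summary text like the trace above (possibly line-wrapped by a mail client) and a made-up function name:

```python
import re

# Tolerates arbitrary whitespace, so a wrapped "ICMP Echo / request" still matches.
ECHO = re.compile(
    r"ICMP\s+Echo\s+(request|reply)\s+\(ID:\s*(\d+)\s+Sequence\s+number:\s*(\d+)\)")

def unanswered_requests(capture_text):
    """Return (ID, sequence) of each echo request with no matching reply, in order."""
    requests, replies = [], set()
    for kind, ident, seq in ECHO.findall(capture_text):
        key = (int(ident), int(seq))
        if kind == "request":
            requests.append(key)
        else:
            replies.add(key)
    return [key for key in requests if key not in replies]
```

Applied to the trace above, this would report the three requests sent at 9:38:50, 9:38:53 and 9:38:56 (IDs 52560 to 52562) and nothing else.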


Any suggestions would be appreciated.

Thanks,

Steve



