Thanks, Joe and David,
I would bump up the timeout, *but* the stats for the objects show that
out of 36K polls the MAX delay is only 891 ms.
Up since: 12/04/02 17:33:20   Missed 328

Type                 # Polls  % Responded  % Missed  # Alerts  Avg delay  Min delay  Max delay
ICMP                   87468       99.63%     0.37%         6        106          0       1516
NTP-Stratum-X-Check    36464       88.62%    11.38%                   54          0        891
SSH-port-scan          36464       99.98%     0.02%                   63          0       1328
What I can't figure out is why this one service seems to miss 11.38% of
the polls while the pings and SSH port connections only miss <0.5%?
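
As a rough back-of-the-envelope check, the 11.38% figure also hints at what requiring several consecutive misses before alarming (Jay's suggestion, quoted below) might buy. This is only a sketch: it assumes misses are independent from poll to poll, which is probably optimistic if they come in bursts, and the function name is made up for illustration.

```python
def false_alarm_probability(miss_rate: float, consecutive: int) -> float:
    """Chance that `consecutive` polls in a row all miss.

    Assumes each poll misses independently with the same probability,
    which is an illustrative simplification, not a measured property.
    """
    return miss_rate ** consecutive


NTP_MISS_RATE = 0.1138  # "% Missed" for NTP-Stratum-X-Check in the stats

if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        p = false_alarm_probability(NTP_MISS_RATE, k)
        print(f"{k} consecutive miss(es) needed: {p:.4%} chance per window")
```

If the real misses cluster together, the improvement from raising the failure count would be smaller than this suggests.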
I have tried running "ntpq -np remote-server-name" a few dozen times on
the command line, but it works every time as fast as I can hit the
enter key, never fails.
I could accept that it was network packet loss, but with a timeout of
only 5 seconds, *if* a TCP or ICMP packet were lost, it would take
more than 5 seconds to recover (retransmit) as well, so I would get
alerts on those services too, wouldn't I?
Why then does only the NTP-Stratum-X-Check fail?
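
One way to narrow this down might be to replay the same one-shot UDP probe outside of WUG and count the misses directly. The sketch below is not the WUG implementation. It assumes the Send string quoted below amounts to a standard 48-byte NTP client request (first byte decimal 27, the rest zeros), that the %NN values are decimal byte values, and that the Expect pattern matches from the first byte of the reply. The function names and the 100-poll default are made up for illustration.

```python
import socket
import sys
import time

NTP_PORT = 123
TIMEOUT_S = 5.0  # same timeout as the custom service definition


def build_request() -> bytes:
    """48-byte NTP client request: first byte 27 (0x1B), then 47 zero bytes."""
    return bytes([27]) + bytes(47)


def reply_matches(reply: bytes) -> bool:
    """Mirror of Expect=~%28(%1|...|%7): byte 0 is 28, byte 1 (stratum) is 1..7."""
    return len(reply) >= 2 and reply[0] == 28 and 1 <= reply[1] <= 7


def probe_once(host: str, timeout: float = TIMEOUT_S):
    """Send one request with no retry; return (ok, elapsed_ms)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(build_request(), (host, NTP_PORT))
            reply, _addr = sock.recvfrom(512)
        except OSError:
            # Timeouts and resolution errors both count as a miss here.
            return False, (time.monotonic() - start) * 1000.0
        return reply_matches(reply), (time.monotonic() - start) * 1000.0


def miss_rate(host: str, polls: int = 100, pause_s: float = 1.0) -> float:
    """Fraction of single-shot probes that failed or did not match."""
    misses = 0
    for _ in range(polls):
        ok, _elapsed_ms = probe_once(host)
        if not ok:
            misses += 1
        time.sleep(pause_s)
    return misses / polls


if __name__ == "__main__" and len(sys.argv) > 1:
    print(f"miss rate for {sys.argv[1]}: {miss_rate(sys.argv[1]):.2%}")
```

Run from the WUG host against one remote and one local server, the two miss rates could then be compared with the WUG numbers to see whether the misses follow the network path or the monitor itself.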
I don't expect anyone to have the answer in their pocket, but do
you see similar problems?
I also have a very similar problem with DNS checking a server in the
UK. The ICMP and SSH-port-scan rarely fail, but the DNS check fails
routinely.
I could accept that it was a problem with my Custom Service Script or
with my WUG server, but it never happens on the 100 servers that are
on local subnets.
-Ben.
On Mon, 2002-12-16 at 14:48, [EMAIL PROTECTED] wrote:
> Ben,
> The problem may lie in the fact that the service is UDP. Although the
> common ports documentation says NTP is both TCP and UDP, the
> implementations I've encountered are all UDP. The problem with UDP is
> that delivery is not guaranteed, and this may be reflected in the
> fact that your local NTP servers are doing okay; it's the remote servers
> (and who knows what chewing-gum-and-baling-wire links are there) that
> are giving you headaches.
>
> Another theory is that the remote servers may be heavily loaded and
> not responding to every query.
>
> You can try increasing the timeouts and the number of failures
> before alarming. That won't help the web page, though: one miss and it
> alarms, which is one of my favorite gripes to Ipswitch.
>
> Jay
>
> ----- Original Message -----
> From: Ben Russo <[EMAIL PROTECTED]>
> Date: Monday, December 16, 2002 11:33 am
> Subject: [WhatsUp Forum] NTP custom monitor
>
> > WUG-List,
> >
> > I have a WUG 7.04 server on Windows 2000 Professional with all the
> > patches.
> >
> > I read the IP-Custom Services white paper that I found on the Ipswitch
> > web site and hacked a custom service monitor for NTP that checks
> > whether a server is a Stratum 1, 2, 3, 4, 5, 6 or 7 NTP server.
> >
> > However, I get false down reports for servers that are using this
> > particular service monitor, but only for servers that are in remote
> > data centers in other cities.
> >
> > I am wondering if there is some Expect script option that I should use
> > that would help reduce the number of false negatives?
> >
> > The basics of the custom service are:
> >
> > Name: NTP-Stratum-X-Check
> > UDP, port 123 Timeout 5 seconds
> > Send=%27%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%
> 0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0
> > Expect=~%28(%1|%2|%3|%4|%5|%6|%7)
> >
> >
> > It works 99.9% of the time for servers that are on the local subnet,
> > and I do get valid Service Down alerts when the services are really down.
> >
> > -Ben.
> >
> >
> > Please visit http://www.ipswitch.com/support/mailing-lists.html
> > to be removed from this list.
> >
> > An Archive of this list is available at:
> > http://www.mail-archive.com/whatsup_forum%40list.ipswitch.com/
> >
>
>