Re: [ntp:questions] Testing Sync Across Several Systems

2009-08-13 Thread Ulrich Windl
T g41...@motorola.com writes:

 Greetings:

 We have about 50 Linux/Solaris/Windows boxes running ntpd at several
 different sites. Some of the systems from time to time go out of sync.
 My question is there a way to test ntpd machines are all in sync with
 the master
 server?

 I was thinking of using ssh to get on to each machine to do a date and
 then go back to the master and do a date and compare, but this seems
 problematic at best. What do people do to check that all machines are
 in sync?

Hi!

Recently I've started a completely different approach: Instead of
looking at phase and frequency offsets, I defined a set of sanity
conditions to check ntp servers for. At the moment I have 3 conditions
for the system status, 11 conditions for the peer status, and 2
conditions for the clock status. I compute a value between 0 and 1 of
those, feeding them into rrdtool.

This works amazingly well: I don't care about a specific offset, I only
want to know whether an ntpd looks as if it should have a reliable time.
That way I found a few servers with mis-configured peers, and I found
that the NTP daemons in HP-UX 11.31 and Solaris 10 both report a freq
value through a mode 6 query that is too large by a factor of 1000
(i.e. larger by that factor than the value written into loopstats). I
only noticed because one server displayed a frequency error of over
4000 PPM...

The scripts to do the stuff are very new, so I don't want to publish
them (in case you would ask), but we can discuss the approach here or in
private e-mail.
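
[Editor's sketch, not Ulrich's scripts, which were not published.] The
idea of reducing a set of sanity conditions to one value between 0 and 1
could look roughly like the following. The three conditions, the
variable names (leap, stratum, frequency) and the helper health_score
are illustrative assumptions; the actual 3 + 11 + 2 conditions are not
given in the thread.

```python
from typing import Callable, Dict, List

# Each check receives a dict of system variables (for example as collected
# from a mode 6 query) and returns True when the value looks sane.
# These three conditions are assumptions chosen for illustration only.
Check = Callable[[Dict[str, float]], bool]

SYSTEM_CHECKS: List[Check] = [
    # Leap indicator 3 means the clock is not synchronised.
    lambda v: v.get("leap", 3) != 3,
    # Stratum 16 means unsynchronised; 1..15 are valid.
    lambda v: 1 <= v.get("stratum", 16) < 16,
    # ntpd does not correct more than 500 PPM, so a larger reported
    # frequency (such as the 4000 PPM case above) is suspicious.
    lambda v: abs(v.get("frequency", float("inf"))) <= 500.0,
]


def health_score(variables: Dict[str, float], checks: List[Check]) -> float:
    """Return the fraction of checks that pass, between 0.0 and 1.0."""
    if not checks:
        return 0.0
    passed = sum(1 for check in checks if check(variables))
    return passed / len(checks)
```

A score like this can be logged per host at each polling interval, which
is the kind of single number that fits a tool such as rrdtool.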

Regards,
Ulrich

___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions


Re: [ntp:questions] Testing Sync Across Several Systems

2009-07-21 Thread T
On Jul 20, 9:21 pm, Steve Kostecke koste...@ntp.org wrote:

 Which may be chosen from http://support.ntp.org/s2


Thanks for all the help. I've learned a lot about ntpd and found a lot
of problems with our ntp setup here.

o I'll change the servers to ones in the list above.

o My Solaris boxes are running V3, I'll see if I can upgrade that.

o Some of the clients are pointing to the default redhat servers.
server 0.rhel.pool.ntp.org
server 1.rhel.pool.ntp.org
server 2.rhel.pool.ntp.org

 Not sure if this is a problem or not, but I'll reset them to our
master here.

Again, thanks for all the help.

  Tom



[ntp:questions] Testing Sync Across Several Systems

2009-07-20 Thread T
Greetings:

We have about 50 Linux/Solaris/Windows boxes running ntpd at several
different sites. Some of the systems from time to time go out of sync.
My question: is there a way to test that the ntpd machines are all in
sync with the master server?

I was thinking of using ssh to get on to each machine to do a date and
then go back to the master and do a date and compare, but this seems
problematic at best. What do people do to check that all machines are
in sync?

Thanks in advance for any help.

   Tom



Re: [ntp:questions] Testing Sync Across Several Systems

2009-07-20 Thread Jan Ceuleers
T wrote:
 We have about 50 Linux/Solaris/Windows boxes running ntpd at several
 different sites. Some of the systems from time to time go out of sync.
 My question is there a way to test ntpd machines are all in sync with
 the master
 server?

The easiest way I can think of is to poll those machines using ntpd from a 
monitoring host.

This monitoring host's ntp.conf contains, for each of your to-be-monitored 
boxes, a line like the following:

   server box1thru50.domain.tld noselect

So that's about 50 lines like the above, in addition to your normal server 
lines (since the monitoring host itself also needs to be synced to the master 
server). (In fact, it might _be_ the master server). The noselect option 
on the server lines tells your monitoring host to only poll that box but never 
to try syncing to it itself.
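
[Editor's sketch.] Writing about 50 such lines by hand is error-prone, so
they could be generated from a host list. The function name
monitor_lines and the example host names are hypothetical; only the
"server ... noselect" syntax comes from the message above.

```python
from typing import Iterable, List


def monitor_lines(hosts: Iterable[str]) -> List[str]:
    """Build one 'server <host> noselect' line per monitored host.

    Blank entries and entries starting with '#' are skipped so the input
    can be read straight from a commented host list.
    """
    lines = []
    for host in hosts:
        host = host.strip()
        if not host or host.startswith("#"):
            continue
        lines.append(f"server {host} noselect")
    return lines


if __name__ == "__main__":
    # Placeholder names; substitute the real DNS names or static addresses.
    example_hosts = ["box1.domain.tld", "box2.domain.tld", "# box3 retired"]
    print("\n".join(monitor_lines(example_hosts)))
```

The generated lines go next to the normal server lines in the monitoring
host's configuration.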

Then you can inspect the state of play using ntpq:

   ntpq -p monitoringhost.domain.tld

This assumes that the to-be-monitored boxes have static IP addresses (or else 
you would need to restart the monitoring host's ntpd periodically). It also 
assumes (I've never tried it) that ntpd will scale to the number of hosts that 
you want to monitor.

If you need anything more elaborate, google for ntp survey. There is a 
periodic project run by Brazilian academics whose toolset you might be able to 
reuse.

Cheers, Jan



Re: [ntp:questions] Testing Sync Across Several Systems

2009-07-20 Thread Richard B. Gilbert
T wrote:
 Greetings:
 
 We have about 50 Linux/Solaris/Windows boxes running ntpd at several
 different sites. Some of the systems from time to time go out of sync.
 My question is there a way to test ntpd machines are all in sync with
 the master
 server?
 
 I was thinking of using ssh to get on to each machine to do a date and
 then go back to the master and do a date and compare, but this seems
 problematic at best. What do people do to check that all machines are
 in sync?
 
 Thanks in Advanced for any help.
 
Tom

ntpq is a good start.  There is at least one other command that can 
reveal interesting/useful information but I can't retrieve it from my 
fading memory at the moment.

Those systems should NOT be going out of sync!  That they are doing 
so says that something is VERY WRONG somewhere.

Does the problem manifest under all three operating systems??  Do all 
the machines involved run 24x7?  Is the network available 24x7?  What 
are you using for a master server?  Where does the master server  
get time



Re: [ntp:questions] Testing Sync Across Several Systems

2009-07-20 Thread paul
On Jul 20, 8:38 pm, T g41...@motorola.com wrote:
 Greetings:

 We have about 50 Linux/Solaris/Windows boxes running ntpd at several
 different sites. Some of the systems from time to time go out of sync.
 My question is there a way to test ntpd machines are all in sync with
 the master
 server?

 I was thinking of using ssh to get on to each machine to do a date and
 then go back to the master and do a date and compare, but this seems
 problematic at best. What do people do to check that all machines are
 in sync?

 Thanks in Advanced for any help.

    Tom

My solution is to run 'ntpq -p hostname' remotely from a monitoring host.
The output of the command can be processed by a script (Perl or Python,
for example) or redirected to another program.
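
[Editor's sketch.] A minimal Python parser for the peer table printed by
'ntpq -p' might look like this. It assumes the usual ten-column layout
with the tally character in the first column; the function names and the
50 ms offset threshold are illustrative assumptions, not part of ntpq.

```python
from typing import Any, Dict, List


def parse_peers(text: str) -> List[Dict[str, Any]]:
    """Parse 'ntpq -p' output into one dict per peer line."""
    peers = []
    for line in text.splitlines():
        stripped = line.lstrip()
        # Skip blank lines, the column header and the '=====' separator.
        if not stripped or stripped.startswith(("remote", "=")):
            continue
        # Column 0 is the tally code (' ', '+', '*', '-', 'x', ...).
        tally, fields = line[0], line[1:].split()
        if len(fields) != 10:
            continue  # not a peer line in the expected layout
        remote, refid, st, _t, _when, _poll, reach, _delay, offset, _jit = fields
        peers.append({
            "tally": tally,
            "remote": remote,
            "refid": refid,
            "stratum": int(st),
            "reach": int(reach, 8),      # reach is printed in octal
            "offset_ms": float(offset),  # offset is printed in milliseconds
        })
    return peers


def suspects(peers: List[Dict[str, Any]], max_offset_ms: float = 50.0) -> List[str]:
    """Names of peers that never answered or exceed the offset threshold."""
    return [
        p["remote"] for p in peers
        if p["reach"] == 0 or abs(p["offset_ms"]) > max_offset_ms
    ]
```

The text to parse could be captured with the standard subprocess module,
for example by running ntpq with the -p option against each host.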



Re: [ntp:questions] Testing Sync Across Several Systems

2009-07-20 Thread T


 Does the problem manifest under all three operating systems??  Do all
 the machines involved run 24x7?  Is the network available 24x7?  What
 are you using for a master server?  Where does the master server 
 get time

Hi:

Disclaimer: I know very little about ntpd. With that said.

These are all 24x7 machines all over the world. I *think* it's the
Solaris boxes that are causing us problems, but I cannot confirm that
at the moment. The Master Server is a Linux Redhat 5 machine. Its
configuration is:

# Specify the key identifier to use with the ntpq utility.
#controlkey 8
restrict 172.20.21.23 mask 255.255.255.255 nomodify notrap noquery
restrict 172.20.21.32 mask 255.255.255.255 nomodify notrap noquery
server time.nist.gov
restrict time.nist.gov mask 255.255.255.255 nomodify notrap noquery
server tick.usno.navy.mil
restrict tick.usno.navy.mil mask 255.255.255.255 nomodify notrap noquery
server navobs1.gatech.edu
restrict navobs1.gatech.edu mask 255.255.255.255 nomodify notrap noquery


I did add Jan's idea from above and can run ntpq -p and get:

# /usr/sbin/ntpq -p
     remote           refid      st t when poll reach    delay   offset  jitter
==============================================================================
 LOCAL(0)        .LOCL.          10 l   45   64  377    0.000    0.000   0.001
+time.nist.gov   .ACTS.           1 u  120  128  377   52.355   -1.866   0.465
+tick.usno.navy. .USNO.           1 u  108  128  377   26.781    2.844   3.366
*navobs1.gatech. .GPS.            1 u   56  128  377   29.986   -2.211   0.119
 tewks-cc1.globa .INIT.          16 u    - 1024    0    0.000    0.000   0.000
 tewks-cc2.globa .INIT.          16 u    - 1024    0    0.000    0.000   0.000
 tewks-cc3.globa .INIT.          16 u    - 1024    0    0.000    0.000   0.000
 tewks-cc4       .INIT.          16 u    - 1024    0    0.000    0.000   0.000
 tewks-cc5.globa .INIT.          16 u    - 1024    0    0.000    0.000   0.000
...

Got a couple of questions here. He had box1thru50.domain.tld. What does
the .tld mean? I dropped that in the configuration file... Is the .INIT.
in the refid field a problem? These are all Solaris boxes...

Thanks
  Tom



Re: [ntp:questions] Testing Sync Across Several Systems

2009-07-20 Thread Jan Ceuleers
T wrote:
 Got a couple of quests here. He had box1thru50.domain.tld What does
 the .tld mean?
 I dropped that in the configuration file... Is the .INIT. in the
 refid field a problem? These are
 all Solaris boxes...
Tom,

Replace the box1thru50.domain.tld with the DNS names of your 50 boxes. If 
they're not in DNS, and you know their static IP addresses you can specify 
those instead.

The .INIT. means that your monitoring host has not received any NTP packets 
from these machines at all yet. Two things to check:

- wait a while (order of magnitude: 15 minutes) and see if it changes;

- make sure that the firewall settings on those boxes allow inbound and 
outbound UDP traffic on port 123.
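
[Editor's sketch.] One way to test whether a box answers NTP on UDP port
123 at all, independently of ntpd on the monitoring host, is to send a
bare client-mode packet and wait for a reply. This is a minimal sketch
under assumptions: the function names are hypothetical, only the first
two header bytes are decoded, and a missing reply does not prove a
firewall problem (the remote daemon may be down or configured to ignore
the sender).

```python
import socket
from typing import Dict, Optional


def build_request(version: int = 4) -> bytes:
    """48-byte NTP request: LI=0, VN=version, Mode=3 (client), rest zero."""
    if not 1 <= version <= 4:
        raise ValueError("NTP version must be between 1 and 4")
    first_byte = (0 << 6) | (version << 3) | 3
    return bytes([first_byte]) + bytes(47)


def decode_header(reply: bytes) -> Dict[str, int]:
    """Extract leap indicator, version, mode and stratum from a reply."""
    if len(reply) < 48:
        raise ValueError("reply too short to be an NTP packet")
    first_byte = reply[0]
    return {
        "leap": first_byte >> 6,
        "version": (first_byte >> 3) & 0x7,
        "mode": first_byte & 0x7,
        "stratum": reply[1],
    }


def probe(host: str, version: int = 4, timeout: float = 2.0) -> Optional[Dict[str, int]]:
    """Return the decoded reply header, or None if nothing usable came back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_request(version), (host, 123))
            reply, _addr = sock.recvfrom(512)
            return decode_header(reply)
        except (OSError, ValueError):
            # Timeouts and name resolution failures are OSError subclasses.
            return None
```

Calling probe with version=3 repeats the test with an NTPv3 request,
which may help with hosts running older daemons.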

Cheers, Jan



Re: [ntp:questions] Testing Sync Across Several Systems

2009-07-20 Thread David J Taylor
T wrote:
[]
 Got a couple of quests here. He had box1thru50.domain.tld What does
 the .tld mean?

Top level domain?

 I dropped that in the configuration file... Is the .INIT. in the
 refid field a problem? These are
 all Solaris boxes...

 Thanks
   Tom

INIT means that NTP can't talk to the box, confirmed with a reach value of 
0 instead of 377.
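
[Editor's sketch.] The reach column is an 8-bit shift register printed in
octal: each poll shifts it left by one and sets the lowest bit if a reply
arrived. So 377 means the last eight polls all succeeded and 0 means none
did. A small decoder (the function name is hypothetical):

```python
from typing import List


def reach_history(reach_octal: str) -> List[bool]:
    """Decode the octal reach value into eight poll results, newest first."""
    value = int(reach_octal, 8)
    if not 0 <= value <= 0xFF:
        raise ValueError("reach must fit in 8 bits")
    # Bit 0 holds the most recent poll, bit 7 the oldest.
    return [bool((value >> bit) & 1) for bit in range(8)]
```

For example, a reach of 376 decodes to a list whose first entry is False:
only the most recent poll went unanswered.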

David 



Re: [ntp:questions] Testing Sync Across Several Systems

2009-07-20 Thread David Woolley
Jan Ceuleers wrote:
 Replace the box1thru50.domain.tld with the DNS names of your 50 boxes. If 
 they're not in DNS, and you know their static IP addresses you can 
 specify those instead.
 
 The .INIT. means that your monitoring host has not received any NTP packets 
 from these machines at all yet. Two things to check:
 
 - wait a while (order of magnitude: 15 minutes) and see if it changes;
 
 - make sure that the firewall settings on those boxes allow inbound and 
 outbound UDP traffic on port 123.
 

Also make sure they are running NTPv4.  Otherwise you will need to 
specify the version.  (NTPv4 servers will accept NTPv3 requests, but not 
vice versa.)

I also noted that there was a poor choice of servers, i.e. a lot of well 
known, overloaded, stratum one servers, when it probably needs lightly 
loaded, local stratum 2 servers.



Re: [ntp:questions] Testing Sync Across Several Systems

2009-07-20 Thread Steve Kostecke
On 2009-07-20, David Woolley da...@djwhome.demon.co.uk wrote:

 I also noted that there was a poor choice of servers, i.e. a lot of well 
 known, overloaded, stratum one servers, when it probably needs lightly 
 loaded, local stratum 2 servers.

Which may be chosen from http://support.ntp.org/s2

-- 
Steve Kostecke koste...@ntp.org
NTP Public Services Project - http://support.ntp.org/



Re: [ntp:questions] Testing Sync Across Several Systems

2009-07-20 Thread Nottorf, Stefan
Hello,
We use Nagios to monitor our system - you can use one of the prepared
checks (check_ntp_time) to monitor the synchronization of your nodes.
You'll need the NRPE-Plugin for Nagios also. No costs involved (except
your time, of course).
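
[Editor's sketch, not the check_ntp_time plugin itself.] Nagios checks
report their result through the exit code: 0 for OK, 1 for WARNING, 2
for CRITICAL and 3 for UNKNOWN. The threshold logic of an offset check
could be sketched as below; the function name and the 0.5 s / 1.0 s
thresholds are assumptions for illustration.

```python
from typing import Optional

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify(offset_s: Optional[float], warn_s: float = 0.5, crit_s: float = 1.0) -> int:
    """Map a measured clock offset in seconds to a Nagios exit code.

    None stands for 'no measurement available' and maps to UNKNOWN.
    """
    if offset_s is None:
        return UNKNOWN
    magnitude = abs(offset_s)
    if magnitude >= crit_s:
        return CRITICAL
    if magnitude >= warn_s:
        return WARNING
    return OK
```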
Regards,
Stefan



Re: [ntp:questions] Testing Sync Across Several Systems

2009-07-20 Thread Todd Glassey
T wrote:
 Greetings:

 We have about 50 Linux/Solaris/Windows boxes running ntpd at several
 different sites. Some of the systems from time to time go out of sync.
 My question is there a way to test ntpd machines are all in sync with
 the master server?
   
Sure: run a peering-based log capture scenario and then have one or more 
of the systems send out alarms.

 I was thinking of using ssh to get on to each machine to do a date and
 then go back to the master and do a date and compare, but this seems
 problematic at best. What do people do to check that all machines are
 in sync?

 Thanks in Advanced for any help.

Tom
