On 24.02.2015 at 15:22, Terje Mathisen wrote:
Markus Schöpflin wrote:
First some background information:

The following behaviour has been observed on a HP ProLiant DL360 G7
running Oracle Linux Server release 6.3 and ntp 4.2.4p8.

You should probably update to 4.2.8, but I assume you are running the official
versions from Oracle?

Yes, this is the official version from Oracle.

I think I could justify an update to a private build if it actually helps, but not otherwise. I will check whether the latest release behaves any differently.


The following ntpd configuration is used; the NTP servers are all local,
dedicated stratum 1 servers.

driftfile /var/lib/ntp/drift
server ntp1 version 4 iburst minpoll 5 maxpoll 5 prefer
server ntp2 version 4 iburst minpoll 5 maxpoll 5
server ntp3 version 4 iburst minpoll 5 maxpoll 5
peer peer1 version 4 iburst minpoll 5 maxpoll 5

Ntpd is running with the following command line:

   ntpd -u ntp:ntp -p /var/run/ntpd.pid -g

We have an application running on the client which needs a synchronised
time.  The application is only started once NTPD reports that the local
clock is synchronised and the offset within 300ms. For safety reasons,
the application is terminated when NTPD reports an offset larger than
300ms.

OK.
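
The 300ms gate described above could be sketched roughly as follows. This is only an illustration, not the application's actual check: it assumes the offset is taken from "ntpq -c rv" style output, where the offset= variable is given in milliseconds, and the function names and the sample line are made up for the example. It covers only the offset part of the condition, not the "synchronised" part.

```python
import re

OFFSET_LIMIT_MS = 300.0  # limit taken from the description above


def parse_offset_ms(rv_text):
    """Return the offset= value (milliseconds) found in the text, or None."""
    match = re.search(r"\boffset=(-?\d+(?:\.\d+)?)", rv_text)
    return float(match.group(1)) if match else None


def offset_ok(rv_text, limit_ms=OFFSET_LIMIT_MS):
    """True only if an offset was found and its magnitude is within the limit."""
    offset = parse_offset_ms(rv_text)
    return offset is not None and abs(offset) <= limit_ms


# Hypothetical sample line, shaped like part of an 'ntpq -c rv' reply.
sample = "stratum=2, offset=-12.345, sys_jitter=0.100"
print(offset_ok(sample))
```

Treating a missing offset as "not OK" is a deliberate choice here: for a safety check it seems better to fail closed than to start the application on incomplete data.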

Now the actual problem description:

To speed up initial synchronisation after a system reboot, we are using
iburst. According to the documentation, using iburst is supposed to send
a burst of eight packets when the server is unreachable.

This does not match what I see: I have only ever observed four packets
being sent. The observed behaviour looks more like it can send up to eight
packets, but stops as soon as it is synchronised to a server.

Is the documentation incorrect here?

I know that the iburst behavior was modified slightly to avoid triggering the
abuse detectors.

It is still documented as sending a burst of 8 packets here: http://doc.ntp.org/4.2.6p5/confopt.html.

I think it is correctly described here: http://www.eecis.udel.edu/~mills/ntp/html/poll.html.

NTPD reports a successful synchronisation about 7 seconds after it
starts, which is pretty good.

But quite frequently the system clock is off by more than 128ms after a
reboot, causing NTPD to perform a clock reset immediately after becoming
synchronised and again report the clock as unsynchronised. This reset
also seems to entail throwing away all available data samples in the
clock filter.

Only about three times the fixed poll interval of 32s later the clock is
again reported as synchronised.

Yeah, this is more or less as designed.
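
The numbers in this exchange can be worked through in a few lines. This is a sketch of the arithmetic only, not of ntpd's internals: minpoll/maxpoll 5 means a poll interval of 2^5 = 32s, the 128ms figure is the threshold mentioned above, and "three polls" is the delay observed above, not a guaranteed value. The function names are made up for the example.

```python
STEP_THRESHOLD_S = 0.128  # 128ms, the threshold mentioned above


def poll_interval_s(exponent):
    """minpoll/maxpoll are exponents of two, so 5 means 2**5 seconds."""
    return 2 ** exponent


def needs_step(offset_s, threshold_s=STEP_THRESHOLD_S):
    """True if the offset magnitude exceeds the step threshold."""
    return abs(offset_s) > threshold_s


def resync_delay_s(poll_exponent, polls=3):
    """Approximate wait after a reset; polls=3 is the observed figure."""
    return polls * poll_interval_s(poll_exponent)


print(poll_interval_s(5), resync_delay_s(5), needs_step(0.2))
```

With the configuration shown earlier this gives a 32s poll interval and roughly 96s until the clock is reported as synchronised again after a reset.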

I have experimented with running ntpdate (or ntpd -g -q) once before
starting ntpd. This does help, as it brings the system time close
enough to the correct time to avoid the clock reset when ntpd is started
as a daemon. However, according to
http://support.ntp.org/bin/view/Dev/DeprecatingNtpdate ntpdate is
deprecated, and running it before starting ntpd should not be necessary
anyway, so this may not be the best solution.

So is there some way to speed up the second waiting period (after the
clock reset) using a suitable configuration for NTPD? Or can NTPD be
configured to also send a burst of packets after the clock has been reset?

It would be an interesting option for iburst to make it take effect after each
clock reset, not just immediately after a restart. You could enter a feature
request for this!

I suspect it may actually be a bug that iburst doesn't take effect after the clock reset, since ntpd seems to consider the server unreachable after performing the reset.

The following excerpt of the log file makes me think so (this is from after the reset of the local clock):

report_event: system event 'event_clock_reset' (0x05) status 'sync_alarm, sync_unspec, 3 events, event_peer/strat_chg' (0xc034)
transmit: at 34 <client> -> <server 1> mode 3
receive: at 34 <server 1> <- <client> mode 4 code 1 auth 0
peer <server 1> event 'event_reach' (0x84) status 'unreach, conf, 2 events, event_reach' (0x8024)
clock_filter: n 1 off 0.000022 del 0.000214 dsp 7.937508 jit 0.000000, age 0

Note the status 'unreach' in the event report for 'event_reach'.
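
To make the 'unreach' reading easier to check, here is a small sketch that splits the peer status word 0x8024 from the log into its fields. The bit layout used here (high byte holds the flags, with 0x80 = configured and 0x10 = reachable, followed by a 4-bit event count and a 4-bit event code) is my understanding of the ntpq peer status word and should be treated as an assumption; the function name is made up.

```python
def decode_peer_status(word):
    """Split a 16-bit peer status word into its fields (assumed layout)."""
    flags = (word >> 8) & 0xFF
    return {
        "configured": bool(flags & 0x80),  # 'conf' in the log text
        "reachable": bool(flags & 0x10),   # clear means 'unreach'
        "event_count": (word >> 4) & 0xF,  # '2 events' in the log text
        "event_code": word & 0xF,          # 4 corresponds to event_reach (0x84)
    }


print(decode_peer_status(0x8024))
```

Under that layout the word decodes to configured, not reachable, two events, last event code 4, which agrees with the text ntpd printed next to it.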

In the meantime, and particularly if you insist on running 5-10 year old
versions, your dual startup of ntpd is probably your best bet:

  ntpd -u ntp:ntp -g -q
  ntpd -u ntp:ntp -p /var/run/ntpd.pid

You should not need the pid file for the first run since that one is stopping
asap, and the -g is unneeded for the final instance since the clock has
already been set. (If it hasn't and the clock is out by more than 1000
seconds, you probably want an alarm instead!)

Thanks for the suggestion, and yes, the system should not start when the second call to ntpd doesn't succeed without -g.

BTW, I don't insist on actually using an old version of NTPD, but we try to avoid using private RPM builds whenever possible.

This way you get the full iburst handling twice, which is exactly what you
want. :-)
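
The two-step startup could be wrapped along these lines so that the daemon is only started when the one-shot run succeeded. This is a minimal sketch, assuming that a non-zero exit status from "ntpd -g -q" signals failure; the function name and the injectable "run" parameter are made up for the example (the latter only exists so the logic can be tried without ntpd installed).

```python
import subprocess

# The two command lines suggested above, as argument lists.
ONE_SHOT = ["ntpd", "-u", "ntp:ntp", "-g", "-q"]
DAEMON = ["ntpd", "-u", "ntp:ntp", "-p", "/var/run/ntpd.pid"]


def start_ntp(run=subprocess.call):
    """Set the clock once, then start the daemon only if that succeeded."""
    status = run(ONE_SHOT)
    if status != 0:
        # Assumption: non-zero means the clock could not be set; do not go on.
        return status
    return run(DAEMON)
```

Calling start_ntp() with no arguments runs the real commands. A non-zero result could then be used to raise the alarm mentioned above instead of starting the application.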

Thank you very much for your comments.

Markus
