[ntp:questions] ntpd losing sync

A C Sat, 04 Feb 2012 00:35:08 -0800

Ok, I thought this was a one-off problem, but I've had ntpd lose sync again about four days after a restart. It never regains sync.

It starts with what seems to be the system clock drifting away from the PPS lock; then the oscillations from the corrections become too great and the whole thing blows up.



Here's the current configuration for version 4.2.7p236:

server          0.us.pool.ntp.org minpoll 9 iburst
server          1.us.pool.ntp.org minpoll 9 iburst
server          0.north-america.pool.ntp.org minpoll 9 iburst
server ntp1.gatech.edu prefer minpoll 9
server rolex.usg.edu minpoll 9
server  127.127.22.0  minpoll 2 maxpoll 4
fudge   127.127.22.0  time1 +0.000 flag2 1 flag3 1 refid PPS
server  127.127.28.0  minpoll 7 noselect
fudge   127.127.28.0  time1 -0.6 refid GPSD


The peer list after waiting about a day from the initial system upset:

 remote         refid           st t when poll reach    delay   offset  jitter
==============================================================================
x127.127.22.0   .PPS.            0 l    -   16   377    0.000 -465.493  55.933
 127.127.28.0   .GPSD.           0 l    -  128   377    0.000  -208986 2833.87
 207.7.148.214  216.218.254.202  2 u    -  512   377  1045.07  -209713 11784.0
 72.14.179.211  127.67.113.92    2 u    -  512   377  1029.80  -201710 6559.37
 173.255.224.22 128.4.1.1        2 u  245  512   377  919.628  -202629 7684.05
 130.207.165.28 130.207.244.240  2 u    -  512   377  994.543  -204125 7778.28
 131.144.4.10   65.212.71.102    2 u   23  512   377  1000.21  -203648 7687.63

Note that the offset for PPS is swinging wildly, which is not visible in this static snapshot.


ntpq associations:
ind assid status  conf reach auth condition  last_event cnt
===========================================================
  1  4560  912a   yes   yes  none falsetick    sys_peer  2
  2  4561  9014   yes   yes  none    reject   reachable  1
  3  4562  9014   yes   yes  none    reject   reachable  1
  4  4563  9034   yes   yes  none    reject   reachable  3
  5  4564  9014   yes   yes  none    reject   reachable  1
  6  4565  904a   yes   yes  none    reject    sys_peer  4
  7  4566  9014   yes   yes  none    reject   reachable  1
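
For anyone reading the status column above by hand, here is a minimal sketch of how the 16-bit peer status word breaks down into the conf, reach, condition, cnt and last_event columns. The field layout follows the ntpq documentation; the function name is hypothetical, and the lookup tables deliberately name only the codes that appear in this listing (anything else is shown as raw hex).

```python
# Only the codes seen in the associations listing above are named here;
# other values fall back to a hex string rather than a guessed name.
SELECT_NAMES = {0x0: "reject", 0x1: "falsetick"}
EVENT_NAMES = {0x4: "reachable", 0xA: "sys_peer"}

def decode_peer_status(hex_word):
    """Split an ntpq peer status word (hex string) into its fields."""
    word = int(hex_word, 16)
    flags = word >> 8              # upper byte: status bits + select code
    select = flags & 0x07          # low 3 bits of the upper byte
    event = word & 0x0F            # low nibble: last event code
    return {
        "conf": bool(flags & 0x80),
        "reach": bool(flags & 0x10),
        "condition": SELECT_NAMES.get(select, hex(select)),
        "cnt": (word >> 4) & 0x0F,  # event counter
        "last_event": EVENT_NAMES.get(event, hex(event)),
    }

for status in ("912a", "9014", "9034", "904a"):
    print(status, decode_peer_status(status))
```

Decoded this way, 912a (the PPS association) is a configured, reachable peer currently marked falsetick, and 904a (ntp1.gatech.edu) is rejected, which agrees with the condition column.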

rv 4560 (first sys_peer):
 associd=4560 status=912a conf, reach, sel_falsetick, 2 events, sys_peer,
 srcadr=PPS(0), srcport=123, dstadr=127.0.0.1, dstport=123, leap=00,
 stratum=0, precision=-20, rootdelay=0.000, rootdisp=0.000, refid=PPS,
 reftime=d2d76400.c9b870fd  Sat, Feb  4 2012  8:00:00.787,
 rec=d2d76401.ffffffff  Sat, Feb  4 2012  8:00:02.000, reach=377,
 unreach=0, hmode=3, pmode=4, hpoll=4, ppoll=4, headway=0, flash=00 ok,
 keyid=0, offset=259.524, delay=0.000, dispersion=4.956, jitter=444.467,

 filtdelay= 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00,
 filtoffset= 259.52 344.53 419.52 474.51 -430.48 -335.49 -265.48 -185.49,
 filtdisp= 4.74 4.98 5.22 5.47 5.70 5.94 6.18 6.42


rv 4565 (second sys_peer)
 associd=4565 status=904a conf, reach, sel_reject, 4 events, sys_peer,
 srcadr=ntp1.gatech.edu, srcport=123, dstadr=10.0.0.21, dstport=123,
 leap=00, stratum=2, precision=-20, rootdelay=0.565, rootdisp=24.597,
 refid=130.207.244.240,
 reftime=d2d7609d.0646422f  Sat, Feb  4 2012  7:45:33.024,
 rec=d2d76271.00c7dd3a  Sat, Feb  4 2012  7:53:21.003, reach=377,
 unreach=0, hmode=3, pmode=4, hpoll=9, ppoll=9, headway=46,
 flash=400 peer_dist, keyid=0, offset=-204125.520, delay=994.543,
 dispersion=16.941, jitter=7778.280,

 filtdelay= 997.29 999.05 994.54 996.13 994.70 994.38 977.68 995.78,
 filtoffset= -209351 -206700 -204125 -201435 -198758 -196080 -193475 -190882,
 filtdisp= 0.08 8.07 15.83 23.94 32.01 40.08 47.91 55.76
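
The filtoffset values for ntp1.gatech.edu change by a nearly constant amount from one sample to the next, so they can be turned into a rough rate. The sketch below does that arithmetic. It assumes the samples are spaced exactly one poll interval apart (hpoll=9, so 512 s), which is only approximately true, so the result should be read as an order-of-magnitude figure and not as a measurement.

```python
# filtoffset values in milliseconds, copied from the rv 4565 output.
filtoffset_ms = [-209351, -206700, -204125, -201435,
                 -198758, -196080, -193475, -190882]

# ASSUMPTION: consecutive samples are one poll interval apart (2**9 s).
poll_s = 2 ** 9

# Change in offset between neighbouring samples in the filter register.
steps_ms = [b - a for a, b in zip(filtoffset_ms, filtoffset_ms[1:])]
mean_step_ms = sum(steps_ms) / len(steps_ms)

# Convert ms per poll interval into parts per million.
rate_ppm = mean_step_ms / 1000.0 / poll_s * 1e6

print("steps (ms):", steps_ms)
print("mean step: %.1f ms per poll" % mean_step_ms)
print("implied rate: about %.0f ppm" % abs(rate_ppm))
```

Under that spacing assumption, the offset against this server is moving by roughly 2.6 s per poll, which works out to something on the order of 5000 ppm.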

I can provide graphs of the offset, dispersion and skew for any of the peers if anyone wants them. The physical GPS itself has been ticking just fine, with no apparent issues with its signal to the machine. As far as I can tell from the peers files, there is simply a sudden shift away from a nominal few microseconds of offset for the reported PPS. The offset then swings wildly (like a PID loop in oscillation) until I restart ntpd and the system clock is stabilized.

The system sits quietly in a corner of the room. It has no duties other than to run ntpd and gpsd. Whatever monitoring I do is run on other systems (ntpd is polled remotely with ntpq on another system, gpsd status is queried remotely by another system and compiled there). The oscillations happen after a few days, but no obvious cron jobs are running at the times that they start. If there's something I can do to instrument ntpd further, I can do that and see if I catch the problem.
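
To pin down the moment of that shift, one option is to scan the peerstats file for the first PPS sample whose offset leaves the normal band. The following is a minimal sketch, assuming the standard peerstats record layout (MJD day, seconds past midnight, address, status, offset, delay, dispersion, jitter, with times in seconds). The function name, the 100 microsecond threshold and the sample lines are all made up for illustration; the sample lines are not taken from the real logs.

```python
def first_excursion(lines, address, threshold_s):
    """Return (mjd, seconds, offset_s) for the first sample from `address`
    whose absolute offset exceeds threshold_s, or None if there is none."""
    for line in lines:
        fields = line.split()
        # Skip short or unrelated records.
        if len(fields) < 5 or fields[2] != address:
            continue
        offset_s = float(fields[4])   # fifth field: offset in seconds
        if abs(offset_s) > threshold_s:
            return int(fields[0]), float(fields[1]), offset_s
    return None

# Made-up sample records for illustration only (NOT real log data).
sample = [
    "55961 100.000 127.127.22.0 971a 0.000002000 0.000000000 0.000238000 0.000004000",
    "55961 116.000 127.127.28.0 9014 -0.600000000 0.000000000 0.001000000 0.002000000",
    "55961 132.000 127.127.22.0 971a 0.000350000 0.000000000 0.000238000 0.000120000",
]
print(first_excursion(sample, "127.127.22.0", 100e-6))
```

On a real system the same function can be fed an open file object for the peerstats file (its location depends on the statsdir setting), and the returned timestamp can then be compared against syslog and cron activity around that time.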

