Hello dear NTP experts,

if you would like to tackle some strange (and, as far as I can tell, undocumented) behaviour, here I go:

I am using ntpd 4.2.6p3 on an embedded system. On this box, the RTC is far more stable than the system clock. Therefore I have written an interface to the SHM driver to make the RTC available to ntpd.

But, strangely, the NTP association seems to be reset every once in a while. The reach drops to 0 immediately, as it would if ntpd were restarted (which I know for sure it is not). It does not happen at regular intervals, but it happens rather often.

I have tried several combinations:

a) Updating the SHM segment every 15 minutes (cron), with minpoll=maxpoll=10
b) Updating the SHM segment every 5 minutes (cron), with minpoll=maxpoll=10
c) Updating the SHM segment every 30 s (daemon), with no change to the polling intervals
In test c) it seemed to lose the connection at least twice per hour, but could re-establish it rather quickly. In scenarios a) and b) it did not happen as often, but the outage lasted much longer (no exact data here).

One thing I noted in test b): it looked like the SHM driver was unreachable, the "when" counter counted well past "poll", and then "reach" dropped to zero immediately. The strange thing: it appeared to happen when the last successful poll (according to "when") encountered very fresh data (i.e. it usually came a few seconds after the cron job that wrote to the SHM segment). Why would this make the NEXT attempt fail?

Further testing showed me that the driver does clear a single "reach" bit if, at the completion of a poll interval, it does not encounter new data, i.e. if the SHM segment has not been written since its last poll. That accounts for one bit per missed poll, not for the whole register going to zero at once, so this is not what is happening here.

There is something very wrong here, either with my understanding or with the driver itself. Could somebody enlighten me?

Thankfully yours,
Jörg

An excerpt (grepped from ntpq output saved every five minutes):

2012-01-12-223501.report:*127.127.28.1 .RTC. 12 l 60 64 3 0.000 -19.200 9.692
2012-01-12-224001.report:*127.127.28.1 .RTC. 12 l 41 64 177 0.000 926.048 574.127
2012-01-12-224501.report:*127.127.28.1 .RTC. 12 l 20 64 377 0.000 849.446 69.440
2012-01-12-225001.report: 127.127.28.1 .RTC. 12 l 18 64 0 0.000 0.000 0.002

You can see that there is no intermediate reach of 340 or something like that, which one would expect; it looks more like a complete reset.
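To make the expectation about intermediate values concrete: "reach" is an 8-bit shift register that ntpq displays in octal, so a source that simply stops delivering data should decay one bit per poll instead of jumping from 377 to 0. The following is only a minimal sketch of that arithmetic, not ntpd's actual code; the function names shift_reach and simulate are made up for illustration.

```python
from typing import List


def shift_reach(reach: int, got_sample: bool) -> int:
    """Advance an 8-bit reachability register by one poll interval.

    The register shifts left by one bit; the new low bit is 1 if the
    poll produced a sample and 0 otherwise.
    """
    return ((reach << 1) | (1 if got_sample else 0)) & 0xFF


def simulate(reach: int, polls: List[bool]) -> List[str]:
    """Return the octal value ntpq would display after each poll."""
    history = []
    for got_sample in polls:
        reach = shift_reach(reach, got_sample)
        history.append(format(reach, "o"))
    return history


if __name__ == "__main__":
    # Fully reachable (377 octal), followed by five polls without new data.
    print(simulate(0o377, [False] * 5))
    # prints ['376', '374', '370', '360', '340']
```

With poll=64 a report taken every five minutes spans four to five polls, so a gradual loss of data would be expected to show up as something like 360 or 340 in the next report, and only eight consecutive missed polls would bring the register all the way down to 0.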
The full ntpq variables output at 22:45 and 22:50 looks like this:

Internal:
associd=0 status=0413 leap_none, sync_uhf_radio, 1 event, spike_detect,
version="ntpd [email protected] Tue Jan 10 13:39:46 UTC 2012 (1)",
processor="armv5tel", system="Linux/2.6.37.6+", leap=00, stratum=13,
precision=-19, rootdelay=0.000, rootdisp=923.530, refid=SHM(1),
reftime=d2b9d270.aa85bff7 Thu, Jan 12 2012 22:43:12.666,
clock=d2b9d284.489d3baf Thu, Jan 12 2012 22:43:32.283, peer=30840, tc=6,
mintc=3, offset=-19.200, frequency=-500.000, sys_jitter=69.440,
clk_jitter=4.651, clk_wander=67.436

Timeserver #1 (30840):
associd=30840 status=9614 conf, reach, sel_sys.peer, 1 event, reachable,
srcadr=127.127.28.1, srcport=123, dstadr=127.0.0.1, dstport=123, leap=00,
stratum=12, precision=0, rootdelay=0.000, rootdisp=0.000, refid=RTC,
reftime=d2b9d256.42d70e55 Thu, Jan 12 2012 22:42:46.261,
rec=d2b9d270.aa85bff7 Thu, Jan 12 2012 22:43:12.666, reach=377,
unreach=0, hmode=3, pmode=4, hpoll=6, ppoll=6, headway=0, flash=00 ok,
keyid=0, offset=849.446, delay=0.000, dispersion=4.359, jitter=69.440,
filtdelay= 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00,
filtoffset= 849.45 858.93 886.08 905.60 915.25 926.05 941.68 951.73,
filtdisp= 3.38 4.37 5.53 6.34 7.10 9.97 9.16 10.68

After the "disconnect":

Internal:
associd=0 status=c414 leap_alarm, sync_uhf_radio, 1 event, freq_mode,
version="ntpd [email protected] Tue Jan 10 13:39:46 UTC 2012 (1)",
processor="armv5tel", system="Linux/2.6.37.6+", leap=11, stratum=16,
precision=-19, rootdelay=0.000, rootdisp=0.000, refid=STEP,
reftime=00000000.00000000 Thu, Feb 7 2036 7:28:16.000,
clock=d2b9d3b2.6c65d5b5 Thu, Jan 12 2012 22:48:34.423, peer=30840, tc=6,
mintc=3, offset=0.000, frequency=-500.000, sys_jitter=0.002,
clk_jitter=0.002, clk_wander=67.436

Timeserver #1 (30840):
associd=30840 status=8014 conf, sel_reject, 1 event, reachable,
srcadr=127.127.28.1, srcport=123, dstadr=127.0.0.1, dstport=123, leap=00,
stratum=12, precision=0, rootdelay=0.000, rootdisp=0.000, refid=RTC,
reftime=d2b9d3a0.51704fd5 Thu, Jan 12 2012 22:48:16.318,
rec=00000000.00000000 Thu, Feb 7 2036 7:28:16.000, reach=000,
unreach=0, hmode=3, pmode=4, hpoll=6, ppoll=6, headway=0,
flash=1000 peer_unreach, keyid=0, offset=0.000, delay=0.000,
dispersion=16000.000, jitter=0.002,
filtdelay= 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00,
filtoffset= 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00,
filtdisp= 16000.0 16000.0 16000.0 16000.0 16000.0 16000.0 16000.0 16000.0
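For anyone comparing the two dumps, the hex status words can be split into their fields by hand. The sketch below assumes the layout described in the NTP status word documentation as I understand it: for the system word, 2 bits leap, 6 bits clock source, 4 bits event count, 4 bits event code; for the peer word, 5 flag bits, 3 bits selection, 4 bits event count, 4 bits event code. The function names are hypothetical, and the field meanings noted in the comments are read off the text labels ntpq printed next to each value.

```python
def decode_system_status(status: int) -> dict:
    """Split a 16-bit system status word into its fields (assumed layout)."""
    return {
        "leap": (status >> 14) & 0x3,       # 0 = leap_none, 3 = leap_alarm
        "source": (status >> 8) & 0x3F,     # 4 is shown as sync_uhf_radio
        "event_count": (status >> 4) & 0xF,
        "event_code": status & 0xF,         # 3 = spike_detect, 4 = freq_mode
    }


def decode_peer_status(status: int) -> dict:
    """Split a 16-bit peer status word into its fields (assumed layout)."""
    return {
        "configured": bool(status & 0x8000),  # shown as "conf"
        "reachable": bool(status & 0x1000),   # shown as "reach"
        "select": (status >> 8) & 0x7,        # 6 = sel_sys.peer, 0 = sel_reject
        "event_count": (status >> 4) & 0xF,
        "event_code": status & 0xF,
    }


if __name__ == "__main__":
    # Values taken from the dumps above.
    for word in (0x0413, 0xC414):
        print(format(word, "04x"), decode_system_status(word))
    for word in (0x9614, 0x8014):
        print(format(word, "04x"), decode_peer_status(word))
```

Decoded this way, the only differences between the two peer words are the reach flag and the selection field, while the system word goes from leap 0 with event code 3 to leap 3 with event code 4, which matches the labels in the dumps.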
