I have some actual data on my server's sudden leaps into instability. The two queries below were run about 2 minutes apart (presumably on either side of a poll).
(For reference, I've put my original question and a follow-up at the end of this post.)

(The "cron/ntpd -q" system was not working any better (huge resets), so I've been running ntpd full-time for the last 5 days. It went off into gaga-land once a couple of days ago, so I had to restart it with a reasonable drift value.)

> ntpq> peers
>      remote           refid       st t when poll reach   delay   offset   jitter
> ==============================================================================
> -rainforest.neor 33.247.251.47    3 u  894 1024  377   17.418   31.986   15.040
> +clock-a.develoo 207.171.30.106   2 u  888 1024  377   57.953   30.674    2.065
> +mtnlion.com     139.78.135.14    2 u  858 1024  377   15.182   23.067    2.919
> +enigma.wiredgoa 209.81.9.7       2 u  859 1024  377   45.946   22.677    1.881
> *time-sj.stsn.ne 192.5.41.40      2 u  851 1024  377   53.168   31.283    7.098
> ntpq> q
> 103 [zorg:/etc]# ntpq -c peers zorg
>      remote           refid       st t when poll reach   delay   offset   jitter
> ==============================================================================
> +rainforest.neor 33.247.251.47    3 u  101 1024  377   19.168  -586.83  618.823
> *clock-a.develoo 207.171.30.106   2 u   96 1024  377   64.535  -587.66  618.341
> +mtnlion.com     139.78.135.14    2 u   62 1024  377   15.002  -615.87  638.938
> +enigma.wiredgoa 209.81.9.7       2 u   65 1024  377   47.882  -614.02  636.702
> +time-sj.stsn.ne 192.5.41.40      2 u   58 1024  377   50.959  -610.66  641.951
> 104 [zorg:/etc]#

Anyone have a notion what might be going on? The big offsets are presumably just because the server's clock was off by about half a second, but the huge jitter values don't make sense to me. Aren't they the jitter in round-trip time? That wouldn't be affected by my clock being off. I'm sure this is my ignorance, being an ntpd tyro, but I'd like to understand...
TIA - David

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
ORIGINAL QUESTION

I have a little file server running in my basement, and since it's the only machine running all the time, I set it up to run ntpd and provide clock settings to my other machines. The machine is running FreeBSD 5.5. I installed it some years ago, and have had no reason to upgrade it.

Initially, I ran ntpd for a day or two to establish a drift value, then killed it and set up a crontab entry to run "ntpd -q" every 6 hours. This worked perfectly for 2 or 3 years. Corrections were always small numbers of msec.

Then, a few days ago, a disk failed. I replaced the disk and restored, and everything was fine -- except that I had lost the drift file. So, I started ntpd, let it run overnight, and looked at the drift file. It had an obviously bogus number. The clock corrections were very large and not getting smaller.

So I put a reasonable number in ntp.drift (based on my vague memory of the old good value -- about 100), restarted ntpd and let it run a few hours. It seemed to be converging, so I stopped it and reinstated the crontab/ntpd -q routine -- this time every 3 hours. 12 out of 19 corrections were around 20-30 msec, but the others were off-the-wall -- hundreds of msec.

So I did some arithmetic (on the reasonable corrections only) and adjusted the drift value. Since then, most of the corrections have been less than 10 msec, but I'm still getting some crazy ones -- like 1.7 seconds! The wild corrections are all in the same direction (-), so I don't think the time derived from the servers is wrong.

It seems as if the clock in the PC must be taking off on wild excursions occasionally. Is this possible? How could replacing a disk have brought this on? What am I missing?
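
For anyone wanting to check the drift arithmetic mentioned above, here is a minimal sketch of one way to do it, not necessarily the exact calculation that was used. It assumes the drift file value is a frequency offset in ppm, that each correction accumulated over one full interval between "ntpd -q" runs, and that anything of 100 msec or more counts as a wild correction to be ignored. The sample corrections, the function name, and the threshold are all hypothetical, and the sign convention for applying the result to ntp.drift is left open.

```python
def drift_correction_ppm(offset_ms, interval_hours):
    """Frequency error in ppm implied by an offset (msec) built up over an interval (hours)."""
    interval_s = interval_hours * 3600.0
    return (offset_ms / 1000.0) / interval_s * 1e6


# Hypothetical corrections (msec) from "ntpd -q" runs 3 hours apart.
corrections_ms = [22.0, 25.0, 28.0, 310.0, 24.0]

# Keep only the "reasonable" corrections; the 100 msec cutoff is an assumption.
reasonable = [c for c in corrections_ms if abs(c) < 100.0]
mean_ms = sum(reasonable) / len(reasonable)

# Size of the drift adjustment suggested by the reasonable corrections.
adjust_ppm = drift_correction_ppm(mean_ms, 3)
print(f"mean correction {mean_ms:.2f} msec -> about {adjust_ppm:.2f} ppm")

# For scale: a 1.7 second correction over 3 hours would imply a far larger error.
print(f"1.7 s in 3 h -> about {drift_correction_ppm(1700.0, 3):.1f} ppm")
```

On these made-up numbers a 20-30 msec correction every 3 hours works out to only a couple of ppm, while a 1.7 second excursion corresponds to well over 100 ppm, which is why the wild corrections look like something other than a slightly wrong drift value.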
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
FOLLOW-UP on why I've been using cron/ntpd -q

I don't have a critical need for accuracy, so I didn't want to add any more load to the time servers than necessary. I thought that a few hits 4 times a day would be a smaller load than running the daemon all the time. Now that my computer's clock seems to be running inconsistently, the load from running ntpd continuously would be even higher, right?

BTW, new data on my little mystery: my computer's clock seems to gain time (like 2 seconds in 3 hours) during times when I'm using my other systems -- never when idling. This happens even though the computer is just providing nfs service and occasional backups. Shouldn't an increase in average load SLOW the clock (by masking more interrupts)?
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

_______________________________________________
questions mailing list
[email protected]
https://lists.ntp.org/mailman/listinfo/questions
