Thanks much to you and the others who have provided useful information. I'll digest all this and try something else Monday.
Couple of notes:

"adjtimex" is not available on our systems. They're all Red Hat/Fedora
or derived from them. I do have "hwclock" but I don't think it will do
what was suggested.

If I set a "bias" value in the drift file, won't NTP change it anyway?
After I zeroed it out, it has been changing over time.

I'd like to stay away from a hierarchy with a single point of failure.

Best Regards,

./Cal

On Fri, 2008-11-21 at 16:06 +0000, Unruh wrote:
> David Woolley <[EMAIL PROTECTED]> writes:
>
> >Cal Webster wrote:
> >> Our NTP servers are slowly losing time. All are in nearly perfect sync
> >> but collectively drift backwards over time. Is there a way to apply a
> >> bias to the drift calculations?
>
> >ntp.drift on the one machine with the local clock configured.
> >>
> >> We had to disconnect from the Internet several months ago. Since then we
> >> have had serious drift problems. Shortly after the disconnect I
> >> discovered that we were predictably losing 10 minutes every 15 days. I
> >> tried several things but not until I zeroed out the
> >> "driftfile" (/var/lib/ntp/drift) 10 days ago [Mon Nov 10 18:10:00 2008]
> >> did this large drift abate.
>
> >Drift > 463ppm (500ppm is ntpd's limit of correctable drift, when no
> >phase noise is present). Something is seriously broken. I suspect that
> >you have a lost timer interrupts problem and ntpd was papering over the
> >cracks. That has to be fixed at source. If the 10/15 minutes a day was
> >consistent from when you started free-running, that is the only thing I
> >can think of. If it ramped up, another problem might be your misuse of
> >local clock drivers.
> >>
> >> Although it is much improved, we are still steadily losing time. Three
> >> days after I zeroed the drift file [Thu Nov 13 15:04:00 EST 2008] we
> >> were 32 seconds behind. Today, 10 days later [Thu Nov 20 09:05:00 2008]
> >> we are 1 min 54 secs behind.
> >> This works out to roughly 12 secs per day -
> >> not bad I guess but still requires regular monitoring.
> >>
>
> >138 ppm is still way too high; temperature only tends to produce
> >variations in the single figures. Whilst you will get some benefit by
> >setting the drift file to 138, with the opposite sign from before, the
> >instability you report indicates that you have a more serious problem to fix.
>
> >Before all the recent clock hacks in Linux, when using just the CTC
> >interrupts, 30 seconds a year was a reasonable target for an air
> >conditioned computer room and a reasonably stable processing load.
>
> That was corrected by ntp. 138PPM is not that far off the norm. Especially
> since we have no idea what his adjtimex corrections are.
>
> Run
> adjtimex -p
> The two key items are frequency and tick.
>
> Note that if you use the tsc clock in Linux, that drift rate will fluctuate
> with each reboot.
>
>
> >> server 127.127.1.0
> >> fudge 127.127.1.0 stratum 5
>
> >If you have a time island, there should be exactly one master server
> >with a relatively low stratum local clock, although stratum 5 is
> >dangerously low. Your target should be that you end up with some
> >clients at stratum 14 or 15.
>
> ??? Why would they be that high? The clients are surely all getting their
> time from that one master, and their stratum should be one higher. Also who
> cares what stratum he declares his master to be. If he really never goes to
> the net, he could make it stratum 1 for all ntp cares.
>
>
> >Any pure clients should not have a local clock. That is universally
> >true, not just for time islands. For the remaining machines, you should
> >either specify a clear hierarchy, with steps of two in the local
> >clock stratum between each one, or, I think orphan mode will work,
> >providing the master server, with the local clock, never goes down for
> >more than a few hours at a time.
> >(There is circumstantial evidence, in
> >a recent thread, that root dispersion will diverge on orphan mode
> >servers until they get rejected for excessive root distance.)
>
> >>
> >>
> >> [EMAIL PROTECTED] /]# cat /etc/adjtime
> >> ------------------------------------
> >> 44.508790 1226358437 0.000000
> >> 1226358437
> >> LOCAL
>
> >You should not use this and ntpd at the same time (actually, if you are
> >careful, you may be able to use it for correcting the time across a
> >period in which the machine is powered down, but doing so requires
> >special considerations
>
> _______________________________________________
> questions mailing list
> [email protected]
> https://lists.ntp.org/mailman/listinfo/questions
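
For anyone who wants to check the ppm figures quoted in this thread, here is a minimal Python sketch of the arithmetic. It assumes "ppm" here simply means seconds lost per million seconds elapsed; the function name drift_ppm is made up for illustration and is not part of ntpd or any tool mentioned above.

```python
SECONDS_PER_DAY = 86400


def drift_ppm(seconds_lost, elapsed_seconds):
    """Return clock drift in parts per million (hypothetical helper)."""
    return seconds_lost / elapsed_seconds * 1e6


# 10 minutes lost every 15 days (the rate before the drift file was zeroed)
before = drift_ppm(10 * 60, 15 * SECONDS_PER_DAY)

# roughly 12 seconds lost per day (the rate after the drift file was zeroed)
after = drift_ppm(12, SECONDS_PER_DAY)

print(f"before: {before:.1f} ppm")
print(f"after:  {after:.1f} ppm")
```

The first figure comes out close to the 463 ppm mentioned above, and the second lands near the quoted 138 ppm; the small difference is because "12 secs per day" is itself a rounded estimate.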
