bow...@gmail.com said:
> The problem is that they start in sync and over the course of a day drift
> that far apart despite having NTP running. We're not sure why NTP isn't
> correcting it along the way. Though at this point, we are looking at a
> firmware bug.

I wouldn't think of it as two systems drifting apart, but rather at least one system with a broken clock.

Is it only one system that is broken? How many systems do you have running the same firmware? OS? Hardware? Are the two systems that drift apart running on the same hardware and OS? Do any other similar systems have troubles?

I wouldn't rule out an OS or ntpd bug.

It's fairly easy to set up a system to monitor the time on several/many other systems. For each system you want to monitor, add a line like this to your ntp.conf:

  server xxxx noselect minpoll x maxpoll y

ntpq -p will quickly show you any boxes that are way out of tune. Anything off by a second will stand out. Or scan rawstats or peerstats.

noselect goes through all the work of polling the target site, including logging, but then discards the data rather than using it to control the local clock. It's great for monitoring other systems.

Normally, if ntpd is off by more than 128 ms, it will step the clock. That puts a line in the log file. So it's more than a bit strange that the clocks get off by many seconds. I'd double check that ntpd really is still running.

Are your drift-apart systems using only your 2 local stratum-2 servers? If so, that may be the problem. If those servers don't agree, which one do you believe?

(There is endless discussion in the NTP community about how many servers you need. 3 lets you out-vote 1 bad guy. 4 lets you out-vote a bad guy if one of them is down. ...)
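
For the "spot the boxes that are way out of tune" step, here is a rough sketch in Python that parses the peer billboard printed by ntpq -p and keeps only the entries whose offset exceeds a limit. The function names and the 1000 ms default are my own choices for illustration, not anything from ntpd itself. It assumes the usual ten-column billboard (remote, refid, st, t, when, poll, reach, delay, offset, jitter) with offset reported in milliseconds.

```python
import subprocess


def parse_offsets(billboard):
    """Return {remote: offset_ms} from the text printed by `ntpq -p`.

    Assumes the usual ten-column layout, with the offset (in ms) as the
    second-to-last column. Header and separator lines are skipped
    because they do not parse as a ten-field row with a numeric offset.
    """
    offsets = {}
    for line in billboard.splitlines():
        if not line.strip():
            continue
        # The first character of each peer row is the tally code
        # (space, '*', '+', '-', 'x', ...), so drop it before splitting.
        fields = line[1:].split()
        if len(fields) < 10:
            continue
        try:
            offsets[fields[0]] = float(fields[-2])
        except ValueError:
            continue  # e.g. the header row, where the field is "offset"
    return offsets


def out_of_tune(offsets, limit_ms=1000.0):
    """Keep only the peers whose absolute offset exceeds limit_ms."""
    return {host: off for host, off in offsets.items() if abs(off) > limit_ms}


def read_billboard():
    """Ask the local ntpd for its peer billboard (needs ntpq on PATH)."""
    result = subprocess.run(
        ["ntpq", "-pn"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On the monitoring box, out_of_tune(parse_offsets(read_billboard())) should give a dictionary of the monitored systems that are off by more than a second, which could be run from cron and mailed whenever it is non-empty.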