On Thu, Mar 24, 2011 at 8:44 PM, Hal Murray <[email protected]> wrote:
> 2) Your local clock is broken.  There are several ways this can happen.
>  a) Power saving can change the CPU speed
>  b) The kernel timekeeping software could be broken.
>  c) Ages ago, you could get this when a lot of disk activity caused lost
> interrupts.  I haven't seen that on PCs in a while, but it might come back
> on embedded systems.

d) You're using a VM with broken timekeeping. This is increasingly common.

Note that just after this list discussed the hourly flash-mob effect on pool servers from cron'd ntpdate or similar, a former pool server operator responded to this thread saying they had an incurably bad clock on their pool server, had pulled it from the pool, and now cron ntpdate every _minute_. I can only hope that's every minute plus or minus a healthy random amount, say 15 s.
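For anyone who does resort to cron'd ntpdate, the jitter can be added with a random sleep before the call. A hypothetical crontab entry (the server name pool.example.org is a placeholder, and perl supplies the random sleep because plain /bin/sh lacks $RANDOM):

```
# Hypothetical crontab entry: step the clock once a minute, but delay
# each run by a random 0-29 s so pool servers don't see a synchronized
# flash mob at the top of every minute.
# (pool.example.org is a placeholder; substitute your own server.)
* * * * * perl -e 'sleep int(rand(30))' && ntpdate -u pool.example.org
```

The -u flag makes ntpdate use an unprivileged source port, which helps when a local firewall blocks outbound packets from port 123.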
The solution is often beyond the reach of the putative operator, who controls only the VM client, not the monitor/dom0/host. It also depends on the particular OS in the VM and the level of integration the underlying VMM offers for that type of client. The best answer is to have no clock in the VM at all, instead passing gettimeofday() and similar calls through to the monitor/dom0/host clock. That doesn't allow an ntpd per VM, but assuming the monitor/dom0/host clock is well-disciplined, it delivers the best possible client VM timing. While I've seen a few reports of well-behaved ntpd in VMs, I wonder how many are on lightly-oversubscribed hardware and subject to serious degradation when the cumulative load of all the VMs increases.

> 3) Your network connection is variable and unsymmetrical.  You can easily get
> problems if you do a big download over a DSL connection.  ntpd assumes your
> network delays are symmetric.  If your system gets calibrated when the
> traffic is low (and symmetric) and you start a big download (or upload)
> without much traffic in the other direction, queuing delays can cause a big
> enough time shift to confuse ntpd.  On my DSL line, I see up to 3 second
> delays.  If this is the problem, tinker huffpuff might help.
>
> Lots of interesting info at http://www.bufferbloat.net/ (but not directly
> ntp related)

While most likely unhelpful to the OP in solving their problem, I recommend everyone take a look at their internet connection's buffering using http://netalyzr.icsi.berkeley.edu/

A few higher-end consumer routers (which are typically also lower-end small-business routers) offer traffic shaping that can prevent upstream buffering in the "modem" by pacing data to stay under the minimum anticipated upstream throughput of the service. Minimizing downstream buffering at the ISP end of the last mile is trickier, but is usually part of the same router QoS feature.
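On Hal's huffpuff suggestion: the huff-n'-puff filter is enabled with a single tinker directive in ntp.conf. A sketch, where the 7200-second window is an illustrative value sized to span a long transfer:

```
# ntp.conf fragment: enable the huff-n'-puff filter.  ntpd remembers
# the minimum round-trip delay seen over the given window (here 7200 s)
# and uses it to correct offsets measured while queueing delay is
# asymmetric, e.g. during a big one-way download or upload.
tinker huffpuff 7200
```

The window should be long enough to outlast your typical bulk transfer; too short a window and the filter never sees an uncongested baseline delay.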
And if you like building your own routers, please do join the [email protected] list, and you too can experience the bliss of low latency and high throughput at the same time.

Briefly, big, unmanaged buffers are increasingly common, as devices typically use all available memory for packet buffering in the ill-advised view that packet loss is always counterproductive. In fact, lacking ECN (which is essentially unsupported by most routers your packets will traverse), packet loss is the only indication to TCP that the path is congested and senders need to slow down to avoid even more packet loss or, worst case, congestion collapse. Buffers sized to let you push several hundred megabits through your gigabit ethernet are extremely oversized when that gig link is part of a path capable of a few megabits. Even in home routers, engineers are pushed to ensure they can achieve maximum throughput, too often without any push to keep latencies reasonable over WAN bottlenecks.

Cheers,
Dave Hart
_______________________________________________
pool mailing list
[email protected]
http://lists.ntp.org/listinfo/pool
