On Feb 16, 2012, at 4:38 AM, Matthew Burgess wrote:
> On Thu, 16 Feb 2012 11:16:12 +0000, Andrew Benton <[email protected]> wrote:
>> On Wed, 15 Feb 2012 18:47:37 -0800
>> Qrux <[email protected]> wrote:
>>
>>> * So, I propose turning -x off.
>>
>> I agree, I run ntpd -g
>> However, I also think the ntpd bootscript will work fine for most
>> people and for those (like me) who think it should be done differently
>> it's trivial to edit the bootscript; your distro, your rules and all
>> that ;)
>
> It probably doesn't affect many LFSers, but Oracle's RAC installation/
> configuration wizard explicitly checks for '-x' in the ntpd options.
>
> It does this because you really don't want your database server's time
> from jumping backwards, and '-x' (or 'tinker step 0' in /etc/ntp.conf)
> is the only way to guarantee that won't happen.
Interesting! Sounds like Oracle...
As for the issue--I still stand by my original position that defaults should be
sensible, and obey "least surprise". Running NTP by default with -x is
surprising. I'll leave the 'why' below the fold.
Q
* * *
Technical details follow, for those who are jumping up and down saying:
"My app cares about time! So, running NTP with -x protects me!"
In case anyone has forgotten, NTP gives slewing by default. The question is
not whether monotonically increasing time is good. You get that with OR
WITHOUT -x. The issue is, -x doesn't guarantee anything. Man page:
"-x, --slew: Slew up to 600 seconds. Normally, the time is
slewed...and stepped if above the threshold. This option sets the threshold to
600 s, which is well within the accuracy window to set the clock manually."
It simply raises the threshold to 600s, from 128ms. And, in cases where your
clock is drifting by more than 10 minutes in the polling interval (and you're
saying your app cares about time?) then it wants YOU TO MANUALLY ADJUST THE
TIME, before running NTP again. I want to see you do that by hand, and keep
things monotonically increasing, especially if you drifted forward. I
know...you'll shut down your production machine until those 600 s have elapsed,
right? And, in that same situation where you've drifted beyond 600s, if you
combine -x with -g, you simply get a big step that doesn't shut down ntpd--but,
the point is, you get a STEP. Lacking the -g, ntpd simply stops itself.
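
To make the thresholds concrete, here is a rough Python model of the slew/step/exit decision. It is only a sketch, not ntpd's actual code: the function name `ntpd_action` and its parameters are made up for illustration. It assumes the documented defaults of a 128 ms step threshold (600 s with -x) and a 1000 s panic threshold past which ntpd exits unless -g is given. Real ntpd only honors -g for the first adjustment, which this model ignores.

```python
STEP_THRESHOLD_S = 0.128     # default step threshold (128 ms)
SLEW_X_THRESHOLD_S = 600.0   # step threshold when -x is given
PANIC_THRESHOLD_S = 1000.0   # default panic threshold (assumed from the ntpd docs)


def ntpd_action(offset_s, x_flag=False, g_flag=False):
    """Return 'slew', 'step', or 'exit' for a measured clock offset in seconds.

    Hypothetical helper: a simplified model of the behavior discussed here.
    """
    magnitude = abs(offset_s)
    step_threshold = SLEW_X_THRESHOLD_S if x_flag else STEP_THRESHOLD_S
    if magnitude > PANIC_THRESHOLD_S and not g_flag:
        return "exit"   # ntpd gives up and leaves the clock to the admin
    if magnitude > step_threshold:
        return "step"   # the clock jumps, forward or back
    return "slew"       # the clock is adjusted gradually


for offset in (0.05, 5.0, 700.0, 2000.0):
    print(offset,
          ntpd_action(offset),
          ntpd_action(offset, x_flag=True),
          ntpd_action(offset, x_flag=True, g_flag=True))
```

Under these assumptions, -x only moves the line between "slew" and "step"; it never removes the "step" or "exit" outcomes.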
I, too, care about time in my apps. So, I've looked into it. And, in the
little I know, -x protects nothing. People spend all kinds of time worrying
about various other minutiae (MTBF of hard drives, vibration in their systems
causing bad feedback on platters, dual-redundant power supplies, etc, etc, etc)
and they want absolutely order-dependent mission-critical applications to
depend on the same technology that powers their Timex from 1982? No. Real
apps that *really* care about time go out of their way to make sure their time
hardware is as good as anything else. They get crystal clocks enclosed inside
a temperature-controlled, vibration-dampened enclosure with electronic
conditioning. And, if they're careful, they use the CO as a *counter*, not as a
*clock*. Monotonicity is about counting ticks on a counter, not getting time
from a clock.
So, -x is not a guarantee. It's a stop-gap, for when your clock (or the
environment around your clock) is failing miserably. If you're in a situation
where you're drifting by more than 600 seconds in a single polling interval,
NTP is going to step you anyway, forward or back. Or, it will simply quit.
And let you do it. At which point...What happens in your situation? You shut
down your high-volume production machine because you lost access to your
timesource?
Plus, this is completely missing the point. It's not about whether or not
slewing is good. It's about choosing between:
* (A) slew beyond 128ms drift
* (B) using a kernel discipline
The issue is, if you care about timekeeping (Oracle default installs don't give
a flying crap), you don't let your clock drift more than (and I'm averaging
here) 43 seconds/day. Why 43? NTP already keeps monotonically increasing
time by slewing single deltas less than 128ms--and that all happens without -x.
43 seconds is simply the aggregate of the total number of 128ms drifts that
NTP can correct BY DEFAULT (i.e., without -x) in a given day. The
arithmetic: if you accept the fact that the "typical Unix slew rate is limited
to 0.5 ms/s", a 128ms drift will take 256 seconds to amortize. So, if you lose
less than 128ms every 256 seconds, that's fine, because THE DEFAULT SLEWING WILL
TAKE CARE OF YOU. And, 128ms every 256 seconds totals about 43 seconds per day.
And, up to that amount of drift, the default slew will take care of it.
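
For anyone who wants to check the arithmetic, a few lines of Python will do it. This is just a worked calculation using the quoted 0.5 ms/s slew rate and the 128 ms default step threshold; the variable names are mine.

```python
SLEW_RATE = 0.5e-3         # seconds corrected per second of real time (0.5 ms/s)
STEP_THRESHOLD_S = 0.128   # default step threshold (128 ms)
SECONDS_PER_DAY = 86400

# Time needed to slew away one threshold-sized offset.
amortize_s = STEP_THRESHOLD_S / SLEW_RATE

# Total drift that slewing at the maximum rate can absorb in a day.
per_day_s = SLEW_RATE * SECONDS_PER_DAY

print(f"{amortize_s:.0f} s to slew out 128 ms")
print(f"{per_day_s:.1f} s/day correctable by slewing")
```

That works out to 256 s per 128 ms offset and 43.2 s per day, i.e., a sustained frequency error of 500 PPM.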
There is an exception, which is where you get single drifts in a polling
interval past 128ms. The default maximum polling interval is 1024 s, which
means your clock would have to be off by more than 1 part in 8000 (128 ms / 1024 s).
Crystal clocks themselves have accuracies specified in PPMs, and the error is
caused mostly by temperature and electronic variances (ambient temperature and
power supply). If you want to see clock skew, chain 2 UPSes, and run your PC
off that.
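
The same kind of back-of-the-envelope check works for the polling interval. The sketch below computes how bad the clock's frequency error has to be before a single poll interval accumulates 128 ms. The helper name `ppm_to_exceed_step` is hypothetical, and the 64 s lower bound is assumed from ntpd's default minimum poll.

```python
STEP_THRESHOLD_S = 0.128   # default step threshold (128 ms)


def ppm_to_exceed_step(poll_interval_s, threshold_s=STEP_THRESHOLD_S):
    """Frequency error in PPM at which one poll interval drifts by the threshold."""
    return threshold_s / poll_interval_s * 1e6


# 64 s and 1024 s are assumed to be the default min and max poll intervals.
for poll in (64, 256, 1024):
    print(f"poll {poll:>4} s: {ppm_to_exceed_step(poll):.0f} PPM")
```

At the 1024 s maximum this gives 125 PPM (1 part in 8000); at shorter intervals the clock has to be far worse before a step is triggered.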
So, again: -x is not a guarantee...
* ...and, it's trading off kernel discipline...
* ...in a situation that probably never needs -x.
The argument that "slew is good, we want it always" is...completely backwards.
If you care about high-precision slewing, you'd want it in the kernel and you
would look into things like the nanokernel patch, etc. Which means, you
definitely want kernel discipline. And if you *need* -x, what you actually
need is a better motherboard and better environmental controls, since
temperature and power have direct effects on the clock's error.
So, getting back to your RAC system...Sure, it can check for it. But let's
hope your database app doesn't stop operating when you can't find a timesource.
True high-volume systems that require absolutely monotonic time don't mess
around with NTP as a dependency for their database--they use NTP only to
condition the system's wallclock. They can use the POSIX methods
clock_gettime(2) with CLOCK_MONOTONIC* clocksources. That's what stuff like
the 1003.1b real-time specs are for. They might pin themselves to a CPU and
reach down and access the hardware clocks (TSC, HPET) to get a monotonic
timestamp which they know will be increasing. Or, they simply create one
actual monotonic timesource, and access its time. Sure, you might lose
accuracy w.r.t walltime if that's a bottleneck, or you might lose some
performance. But, when order matters...it matters.
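
As an illustration of keeping the two clocks separate, here is a minimal Python sketch. `time.monotonic_ns()` is the portable standard-library call; on Linux it is backed by clock_gettime(2) with CLOCK_MONOTONIC, and `time.clock_gettime(time.CLOCK_MONOTONIC)` is the more direct Unix-only wrapper. The `stamp` helper is a made-up name, not anything Oracle or ntpd provides.

```python
import time


def stamp():
    """Return (monotonic_ns, wall_s).

    The monotonic reading is used for ordering; the wall-clock reading is
    only for display and may jump if ntpd steps the clock.
    """
    return time.monotonic_ns(), time.time()


events = [stamp() for _ in range(5)]
mono = [m for m, _ in events]

# Monotonic readings never go backwards within one boot of one machine.
print(all(a <= b for a, b in zip(mono, mono[1:])))
```

Note that a monotonic reading has an arbitrary starting point and only means something on the machine that produced it, which is part of why it doesn't help once you fail over to different hardware.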
Monotonic clocks are wonderful. And, if you cared about monotonicity, you
might look into one of the monotonic timescales.
But, frankly, it's not the 90% use-case of NTP, which is to keep
as-good-as-possible wallclock time. NTP keeps UTC time--which supports stuff
like leap seconds. Even UTC doesn't give a rat's ass about monotonic time.
And, that has nothing to do with highly order-dependent application stacks.
Think about a high-availability database system, which falls over to another
physical machine when the first stops working. Oracle (IDK about their RAC
sub-product) certainly supports physical clustering. How much would you trust
your, let's say mutual fund company, to a system that can migrate not just to
another core, or CPU, but to another physical machine? You want -x slewed
timestamps to protect the ordering of events? That's fairly...trusting.
Back when I was using Oracle, they made plenty of demands about wanting the
tablespaces to be on raw devices. Sure, that was about disks--but, in the
context of absolutely order-dependent time application, I'm sure that wouldn't
stop them from making demands--i.e., setting constraints--on CPUs and
virtualization (e.g., needing to pin processes to CPUs, having access to the
RDTSC family of instructions, etc)...if the consultants you hire knew anything
about time. In actual order-dependent systems, they don't care about
wall-time. Or, really, even, slewing. Just causality. Being able to pin
ticks to timestamps is secondary, because most of where that matters is in
human interpretation. In those situations, order matters first, and the exact
mapping to wallclock is secondary (which means, in most of those situations,
millisecond-level error won't matter--when was the last time you got
millisecond timestamps on your investment statements?).
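
A toy version of "order first, wallclock second" might look like the sketch below. It is purely illustrative, with a hypothetical `EventLog` class: ordering comes from a counter handed out under a lock, and the wall-clock time is recorded only as an annotation for humans. A real clustered system would need a shared or replicated sequence source, which this does not attempt.

```python
import itertools
import threading
import time


class EventLog:
    """Hypothetical in-process log: sequence numbers define order."""

    def __init__(self):
        self._seq = itertools.count(1)   # the counter, not a clock
        self._lock = threading.Lock()
        self.events = []

    def record(self, payload):
        # Assign the sequence number and append under one lock so that
        # log order and sequence order always agree.
        with self._lock:
            entry = (next(self._seq), time.time(), payload)
            self.events.append(entry)
        return entry


log = EventLog()
log.record("buy 100")
log.record("sell 40")

# Sort by sequence number; the wall-clock field plays no part in ordering.
for seq, wall, payload in sorted(log.events, key=lambda e: e[0]):
    print(seq, payload)
```

Even if the wall clock is stepped backwards between the two calls, the sequence numbers still say which event came first.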
To go even further, timestamp ordering has been the source of some legal disputes (about
exactly when certain transactions have occurred--I think the big cases are in
Europe). But, it's safe to say that legislation hasn't caught up. If
you're concerned about the situation where you had to argue that your system
uses a monotonic clock for transactions, but NTP for wallclock, and they may
have disagreed by several microseconds, you have issues that transcend "ntpd
-x".
Q