Re: [blfs-dev] NTP

Qrux Thu, 16 Feb 2012 14:14:37 -0800

On Feb 16, 2012, at 4:38 AM, Matthew Burgess wrote:

> On Thu, 16 Feb 2012 11:16:12 +0000, Andrew Benton wrote:
>> On Wed, 15 Feb 2012 18:47:37 -0800
>> Qrux wrote:
>> 
>>>     * So, I propose turning -x off.
>> 
>> I agree, I run ntpd -g
>> However, I also think the ntpd bootscript will work fine for most
>> people and for those (like me) who think it should be done differently
>> it's trivial to edit the bootscript; your distro, your rules and all
>> that ;)
> 
> It probably doesn't affect many LFSers, but Oracle's RAC installation/
> configuration wizard explicitly checks for '-x' in the ntpd options.
> 
> It does this because you really don't want your database server's time
> from jumping backwards, and '-x' (or 'tinker step 0' in /etc/ntp.conf)
> is the only way to guarantee that won't happen.


Interesting!  Sounds like Oracle...

As for the issue--I still stand by my original position that defaults should be 
sensible, and obey "least surprise".  Running NTP by default with -x is 
surprising.  I'll leave the 'why' below the fold.

        Q


* * *

Technical details follow, for those who are jumping up and down saying:

        "My app cares about time!  So, running NTP with -x protects me!"

In case anyone has forgotten, NTP gives slewing by default.  The question is 
not whether monotonically increasing time is good.  You get that with OR 
WITHOUT -x.  The issue is, -x doesn't guarantee anything.  Man page:

        "-x, --slew: Slew up to 600 seconds.  Normally, the time is 
slewed...and stepped if above the threshold.  This option sets the threshold to 
600 s, which is well within the accuracy window to set the clock manually."

It simply raises the threshold to 600s, from 128ms.  And, in cases where your 
clock is drifting by more than 10 minutes in the polling interval (and you're 
saying your app cares about time?) then it wants YOU TO MANUALLY ADJUST THE 
TIME, before running NTP again.  I want to see you do that by hand, and keep 
things monotonically increasing, especially if you drifted forward.  I 
know...you'll shut down your production machine until those 600 s have elapsed, 
right?  And, in that same situation where you've drifted beyond 600s, if you 
combine -x with -g, you simply get a big step that doesn't shut down ntpd--but, 
the point is, you get a STEP.  Lacking the -g, ntpd simply stops itself.

I, too, care about time in my apps.  So, I've looked into it.  And, in the 
little I know, -x protects nothing.  People spend all kinds of time worrying 
about various other minutiae (MTBF of hard drives, vibration in their systems 
causing bad feedback on platters, dual-redundant power supplies, etc, etc, etc) 
and they want absolutely order-dependent mission-critical applications to 
depend on the same technology that powers their Timex from 1982?  No.  Real 
apps that *really* care about time go out of their way to make sure their time 
hardware is as good as anything else.  They get crystal clocks enclosed inside 
a temperature-controlled, vibration-dampened enclosure with electronic 
conditioning. And, if they're careful, they use the CO as a *counter*, not as a 
*clock*.  Monotonicity is about counting ticks on a counter, not getting time 
from a clock.

So, -x is not a guarantee.  It's a stop-gap, for when your clock (or the 
environment around your clock) is failing miserably.  If you're in a situation 
where you're drifting for more than 600 seconds in a single polling interval, 
NTP is going to step you anyway, forward or back.  Or, it will simply quit.  
And let you do it.  At which point...What happens in your situation?  You shut 
down your high-volume production machine because you lost access to your 
timesource?

Plus, this is completely missing the point.  It's not about whether or not 
slewing is good.  It's about choosing between:

        * (A) slew beyond 128ms drift

        * (B) using a kernel discipline

The issue is, if you care about timekeeping (Oracle default installs don't give 
a flying crap), you don't let your clock drift more than (and I'm averaging 
here) 43 seconds/day.  Why 43?  NTP already keeps monotonically increasing 
time by slewing single deltas less than 128ms--and that all happens without -x.  
43 seconds is simply the aggregate of the total number of 128ms drifts that 
NTP can correct BY DEFAULT (i.e., without -x) in a given day.  The 
arithmetic--if you accept the fact that the "typical Unix slew rate is limited 
to 0.5 ms/s", a 128ms drift will take 256 seconds to amortize.  So, if you lose 
less than 128ms every 256 seconds, that's fine, because THE DEFAULT SLEWING WILL 
TAKE CARE OF YOU.  And, 128ms every 256 seconds totals to 43 seconds per day.  
And, up to that amount of drift, the default slew will take care of it.
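
For anyone who wants to check that arithmetic, here is a rough sketch in Python.  
The only inputs are the two figures quoted above (0.5 ms/s slew rate, 128ms 
step threshold); the function and constant names are mine, purely for 
illustration, and nothing here talks to ntpd.

```python
# Back-of-the-envelope check of the slew budget described above.
# Inputs taken from the text: 0.5 ms/s slew rate, 128 ms step threshold.
MAX_SLEW_RATE = 0.5e-3    # seconds of correction per second of real time
STEP_THRESHOLD = 128e-3   # seconds
SECONDS_PER_DAY = 86_400


def seconds_to_slew(offset_s, rate=MAX_SLEW_RATE):
    """Real time needed to amortize an offset at the given slew rate."""
    return abs(offset_s) / rate


def daily_slew_budget(rate=MAX_SLEW_RATE):
    """Total offset that can be slewed away in one day at the given rate."""
    return rate * SECONDS_PER_DAY


# Time to absorb a single 128 ms offset (about 256 s).
print(seconds_to_slew(STEP_THRESHOLD))
# Aggregate correction available per day (about 43 s).
print(daily_slew_budget())
# Days needed to slew away a full 600 s offset at the same rate.
print(seconds_to_slew(600) / SECONDS_PER_DAY)
```

The last line is the interesting one for -x: at that rate, slewing out a 600 s 
offset takes on the order of two weeks.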

There is an exception, which is where you get single drifts in a polling 
interval past 128ms.  The default maximum polling interval is 1024 s.  Which 
means your clock would have to have a stability worse than 1 part in 8000 (125 PPM).  
Crystal clocks themselves have accuracies specified in PPMs, and the error is 
caused mostly by temperature and electronic variances (ambient temperature and 
power supply).  If you want to see clock skew, chain 2 UPSes, and run your PC 
off that.
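
The "1 part in 8000" figure is just the step threshold divided by the maximum 
polling interval.  A small sketch, with helper names of my own choosing, that 
also converts it to PPM so it can be compared against a crystal's datasheet:

```python
# Drift needed to exceed the step threshold within one polling interval.
# Both inputs come from the text: 128 ms threshold, 1024 s maximum poll.
STEP_THRESHOLD = 128e-3   # seconds
MAX_POLL = 1024           # seconds


def drift_ppm(offset_s, interval_s):
    """Fractional clock error, in parts per million, over an interval."""
    return offset_s / interval_s * 1e6


ratio = STEP_THRESHOLD / MAX_POLL
print(round(1 / ratio))                               # "1 part in N"
print(round(drift_ppm(STEP_THRESHOLD, MAX_POLL), 3))  # same thing in PPM
```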

        So, again: -x is not a guarantee...

        * ...and, it's trading off kernel discipline...

        * ...in a situation that probably never needs -x.

The argument that "slew is good, we want it always" is...completely backwards.  
If you care about high-precision slewing, you'd want it in the kernel and you 
would look into things like the nanokernel patch, etc.  Which means, you 
definitely want kernel discipline.  And if you *need* -x, what you actually 
need is a better motherboard and better environmental controls, since 
temperature and power have direct effects on the clock's error.

So, getting back to your RAC system...Sure, it can check for it.  But let's 
hope your database app doesn't stop operating when you can't find a timesource. 
 True high-volume systems that require absolutely monotonic time don't mess 
around with NTP as a dependency for their database--they use NTP only to 
condition the system's wallclock.  They can use the POSIX methods 
clock_gettime(2) with CLOCK_MONOTONIC* clocksources.  That's what stuff like 
the 1003.1b real-time specs are for.  They might pin themselves to a CPU and 
reach down and access the hardware clocks (TSC, HPET) to get a monotonic 
timestamp which they know will be increasing.  Or, they simply create one 
actual monotonic timesource, and access its time.  Sure, you might lose 
accuracy w.r.t walltime if that's a bottleneck, or you might lose some 
performance.  But, when order matters...it matters.
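
As a minimal sketch of that idea (not how any particular database does it), 
here is ordering by a monotonic clock in Python.  time.monotonic_ns() is the 
standard library's monotonic reading and is documented as unaffected by system 
clock updates; the function name and event labels are made up for 
illustration.

```python
import time


def stamp_events(n):
    """Tag n events with a monotonic reading, in nanoseconds."""
    events = []
    for i in range(n):
        # Monotonic reading: never goes backwards, even if the wall clock
        # is stepped underneath us.
        events.append((time.monotonic_ns(), f"event-{i}"))
    return events


events = stamp_events(5)
stamps = [t for t, _ in events]

# Ordering comes from the monotonic stamps...
in_order = all(a <= b for a, b in zip(stamps, stamps[1:]))
print(in_order)

# ...while wall-clock time is recorded separately, for humans only.
wall_clock_now = time.time()
print(wall_clock_now)
```

Note that the monotonic values are only meaningful relative to each other on 
the same machine; they say nothing about wall time, which is the whole point.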

Monotonic clocks are wonderful.  And, if you cared about monotonicity, you 
might look into one of the monotonic timezones.

But, frankly, it's not the 90% use-case of NTP, which is to keep 
as-good-as-possible wallclock time.  NTP keeps UTC time--which supports stuff 
like leap seconds.  Even UTC doesn't give a rat's ass about monotonic time.  
And, that has nothing to do with highly order-dependent application stacks.  
Think about a high-availability database system, which falls over to another 
physical machine when the first stops working.  Oracle (IDK about their RAC 
sub-product) certainly supports physical clustering.  How much would you trust 
your, let's say mutual fund company, to a system that can migrate not just to 
another core, or CPU, but to another physical machine?  You want -x slewed 
timestamps to protect the ordering of events?  That's fairly...trusting.

Back when I was using Oracle, they made plenty of demands about wanting the 
tablespaces to be on raw devices.  Sure, that was about disks--but, in the 
context of absolutely order-dependent time application, I'm sure that wouldn't 
stop them from making demands--i.e., setting constraints--on CPUs and 
virtualization (e.g., needing to pin processes to CPUs, having access to the 
RDTSC family of instructions, etc)...if the consultants you hire knew anything 
about time.  In actual order-dependent systems, they don't care about 
wall-time.  Or, really, even, slewing.  Just causality.  Being able to pin 
ticks to timestamps is secondary, because most of where that matters is in 
human interpretation.  In those situations, order matters first, and the exact 
mapping to wallclock is secondary (which means, in most of those situations, 
millisecond-level error won't matter--when was the last time you got 
millisecond timestamps on your investment statements?).

To go even further, it has been the source of some legal disputes (about 
exactly when certain transactions have occurred--I think the big cases are in 
Europe).  But, it's safe to say that the legislature hasn't caught up.  If 
you're concerned about the situation where you had to argue that your system 
uses a monotonic clock for transactions, but NTP for wallclock, and they may 
have disagreed by several microseconds, you have issues that transcend "ntpd 
-x".

        Q

-- 
http://linuxfromscratch.org/mailman/listinfo/blfs-dev
FAQ: http://www.linuxfromscratch.org/blfs/faq.html
Unsubscribe: See the above information page
