Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12149 )

Change subject: Take absolute value of ntp maxerror to avoid undefined behavior 
from negative values
......................................................................


Patch Set 1:

> Patch Set 1:
>
> > (1 comment)
>
> We have seen this in practice when using PTP instead of NTP.
>
> We first noticed this behavior in the binary ntptime (/usr/sbin), seemingly 
> when the adjtimex system call returns any negative value for maxerror.
> The output from ntptime results in very large maxerror values due to ntptime 
> casting long maxerror to an unsigned long.
>
> Here is the initial log entry from Kudu crashing:
>
> F0102 21:07:23.614316 18235 hybrid_clock.cc:340] Check failed: _s.ok() unable 
> to get current time with error bound:
> Service unavailable: clock error estimate (18446744073709551608us) too high 
> (clock considered synchronized by the kernel)

Interesting. So, the maxerror here is -8. That's a really low maxerror. I 
guess PTP can give you tight synchronization, but I'm surprised it's that 
tight as a strict max bound.
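
For reference, a minimal sketch of the arithmetic behind that reading, assuming 
the value is a signed 64-bit count of microseconds that gets reinterpreted as 
unsigned before printing. AsUnsigned is a hypothetical helper for illustration, 
not a function from Kudu or ntptime:

```cpp
#include <cstdint>

// Hypothetical helper: reinterpret a signed 64-bit microsecond value as
// unsigned. Signed-to-unsigned conversion is well defined in C++ and wraps
// modulo 2^64.
constexpr uint64_t AsUnsigned(int64_t maxerror_us) {
  return static_cast<uint64_t>(maxerror_us);
}

// 2^64 is 18446744073709551616, so -8 wraps to 2^64 - 8, which is the
// figure printed in the log line quoted above.
static_assert(AsUnsigned(-8) == 18446744073709551608ULL,
              "-8 wraps to 2^64 - 8");
```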

Are you sure that taking the absolute value is the correct thing to do instead 
of clamping to some arbitrary minimum like 50us? The fact that maxerror can go 
negative implies that it might also end up being 0, which is obviously not 
correct, and I imagine some of our consistency algorithms will produce 
incorrect results in that case.
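
To make the difference concrete, here is a rough sketch of the two options. 
The names AbsMaxError, ClampMaxError, and kMinMaxErrorUs are hypothetical, and 
the 50us floor is just the example figure from above, not a value taken from 
the Kudu source:

```cpp
#include <cstdint>

// Hypothetical floor in microseconds (the arbitrary example figure above).
constexpr int64_t kMinMaxErrorUs = 50;

// Option A: absolute value. Turns -8 into 8 but still lets 0 through.
// Note: negating INT64_MIN overflows, so a real version would need a guard.
constexpr int64_t AbsMaxError(int64_t maxerror_us) {
  return maxerror_us < 0 ? -maxerror_us : maxerror_us;
}

// Option B: clamp to a floor. Negative, zero, and implausibly small values
// all become kMinMaxErrorUs; larger values pass through unchanged.
constexpr int64_t ClampMaxError(int64_t maxerror_us) {
  return maxerror_us < kMinMaxErrorUs ? kMinMaxErrorUs : maxerror_us;
}

static_assert(AbsMaxError(-8) == 8, "abs flips the sign");
static_assert(AbsMaxError(0) == 0, "abs does not protect against zero");
static_assert(ClampMaxError(-8) == 50, "clamp raises negatives to the floor");
static_assert(ClampMaxError(0) == 50, "clamp raises zero to the floor");
static_assert(ClampMaxError(1000) == 1000, "large values are unchanged");
```

With the absolute value, a reported maxerror of 0 still yields a zero error 
bound, whereas the clamp never returns anything below the floor.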

Which PTP implementation are you using? Might be good to look at how it's 
calculating maxerror and see if this indeed is a bug in its calculations.


--
To view, visit http://gerrit.cloudera.org:8080/12149
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3fdc5807d227bb3f6f7f039239d1a9755b9a5725
Gerrit-Change-Number: 12149
Gerrit-PatchSet: 1
Gerrit-Owner: Patrick Freeman <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Patrick Freeman <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Comment-Date: Thu, 03 Jan 2019 20:42:20 +0000
Gerrit-HasComments: No