Actually, I think I understand the root cause of this. At some point NTP can switch the clock from a microseconds-based mode to a nanoseconds-based mode, after which Kudu interprets the results of the ntp_gettime system call incorrectly. That produces incorrect error estimates and even time values up to 1000 seconds in the future: we read 1 billion nanoseconds as 1 billion microseconds (= 1000 seconds). I'll work on reproducing this and on a patch that can be backported to previous versions.
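
A minimal sketch of the suspected misreading, for illustration only; this is not Kudu's actual code. The struct ClockReading and the functions NaiveMicros and CorrectMicros are hypothetical names. The sketch assumes the Linux behavior documented for adjtimex, where the sub-second field holds nanoseconds instead of microseconds once the STA_NANO status bit is set; here that bit is modeled as a plain bool so the example stays self-contained.

```cpp
#include <cstdint>

// Hypothetical snapshot of a kernel clock reading (not a real kernel struct).
struct ClockReading {
  std::int64_t sec;     // whole seconds
  std::int64_t subsec;  // sub-second part: microseconds, or nanoseconds in nano mode
  bool nano_mode;       // stands in for the STA_NANO status bit (assumption)
};

// Suspected buggy behavior: always treats the sub-second field as microseconds.
constexpr std::int64_t NaiveMicros(const ClockReading& r) {
  return r.sec * 1000000 + r.subsec;
}

// Mode-aware version: converts nanoseconds to microseconds when nano mode is on.
constexpr std::int64_t CorrectMicros(const ClockReading& r) {
  return r.sec * 1000000 + (r.nano_mode ? r.subsec / 1000 : r.subsec);
}
```

With sec = 10, subsec = 999999999, and nano mode on, the naive conversion lands 999 seconds ahead of the mode-aware one, which matches the "up to 1000 seconds in the future" symptom.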

-Todd

On Wed, Nov 1, 2017 at 5:00 PM, Todd Lipcon <t...@cloudera.com> wrote:

> What's the full log line where you're seeing this crash? Is it coming from
> tablet_bootstrap.cc, raft_consensus.cc, or elsewhere?
>
> -Todd
>
> 2017-11-01 15:45 GMT-07:00 Franco Venturi <fvent...@comcast.net>:
>
>> Our version is kudu 1.5.0-cdh5.13.0.
>>
>> Franco
>
> --
> Todd Lipcon
> Software Engineer, Cloudera

--
Todd Lipcon
Software Engineer, Cloudera