Re: Kudu start error with low ntpdate "maximum error"

Matthew Jacobs Wed, 16 Nov 2016 10:21:43 -0800

I asked on the Kudu Slack channel. They have seen issues where freshly
provisioned EC2 nodes take some time for NTP to quiesce, but they
didn't have a sense of how long that might take. If you only ran
ntptime after the job failed, NTP may have had enough time to settle
by then. We can probably consider bumping up the allowable error.
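
One possible workaround, sketched below in Python, is to poll ntptime
until the clock looks synchronized before starting Kudu. This is only
a rough sketch under some assumptions: the function names, the 300
second timeout, and the 5 second poll interval are made up for
illustration (nobody in this thread knows how long NTP needs), and
treating any non-zero ntp_gettime() return code as "unsynchronized" is
a conservative guess, not a description of what Kudu itself checks.
The 10 second default comes from the Kudu troubleshooting page quoted
further down.

```python
import re
import time

# Default documented for --max_clock_sync_error_usec (10 seconds).
DEFAULT_MAX_ERROR_US = 10_000_000


def parse_ntptime(output):
    """Return (return_code, max_error_us) from the text printed by ntptime."""
    code = re.search(r"ntp_gettime\(\) returns code (\d+)", output)
    err = re.search(r"maximum error (\d+) us", output)
    if code is None or err is None:
        raise ValueError("unrecognized ntptime output")
    return int(code.group(1)), int(err.group(1))


def clock_looks_synchronized(output, max_error_us=DEFAULT_MAX_ERROR_US):
    """True if ntptime reported code 0 and a maximum error under the limit."""
    code, err = parse_ntptime(output)
    # Assumption: any non-zero return code is treated as unsynchronized.
    return code == 0 and err < max_error_us


def wait_for_sync(read_output, timeout_s=300, interval_s=5, sleep=time.sleep):
    """Poll read_output() until the clock looks synchronized or time runs out.

    read_output is any callable returning ntptime's text output.
    timeout_s and interval_s are arbitrary illustrative values.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            if clock_looks_synchronized(read_output()):
                return True
        except ValueError:
            pass  # Output not parseable yet; try again.
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

It could be wired up with something like
wait_for_sync(lambda: subprocess.run(["ntptime"], capture_output=True,
text=True).stdout) ahead of the Kudu startup step. Note that the
ntptime output quoted below would pass this check easily (197431 us is
far below 10 seconds), which is consistent with NTP having settled by
the time it was run.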


On Wed, Nov 16, 2016 at 9:24 AM, Jim Apple <[email protected]> wrote:
> This is the second time I have seen it, but it doesn't happen every
> time. It could very well be a difference on ec2; already I've seen
> some bugs due to my ec2 instances being Etc/UTC timezone while most
> Impala developers work in America/Los_Angeles.
>
> On Wed, Nov 16, 2016 at 9:10 AM, Matthew Jacobs <[email protected]> wrote:
>> No problem. If this happens again we should ask the Kudu developers. I
>> haven't seen this before - I wonder if it could be some weirdness on
>> ec2...
>>
>> Thanks
>>
>> On Wed, Nov 16, 2016 at 9:01 AM, Jim Apple <[email protected]> wrote:
>>> Thank you for your help!
>>>
>>> This was on an AWS machine that has expired, but I can see from the
>>> logs that "IMPALA_KUDU_VERSION=88b023" and
>>> "KUDU_JAVA_VERSION=1.0.0-SNAPSHOT" and "Downloading
>>> kudu-python-0.3.0.tar.gz" and "URL
>>> https://native-toolchain.s3.amazonaws.com/build/264-e9d44349ba/kudu/88b023-gcc-4.9.2/kudu-88b023-gcc-4.9.2-ec2-package-ubuntu-14-04.tar.gz".
>>> I'll add "ps aux | grep kudu" to the logging this machine does on
>>> error, so we'll have it next time, but I did "ps -Afly" on exit and
>>> there were no kudu processes running, it looks like.
>>>
>>> On Wed, Nov 16, 2016 at 8:52 AM, Matthew Jacobs <[email protected]> wrote:
>>>> Can you check which version of the client you're building against
>>>> (KUDU_VERSION env var) vs. what Kudu version is running (ps aux | grep
>>>> kudu)?
>>>>
>>>> On Wed, Nov 16, 2016 at 8:48 AM, Jim Apple <[email protected]> wrote:
>>>>> Yes.
>>>>>
>>>>> On Wed, Nov 16, 2016 at 7:45 AM, Matthew Jacobs <[email protected]> wrote:
>>>>>> Do you have NTP installed?
>>>>>>
>>>>>> On Tue, Nov 15, 2016 at 9:22 PM, Jim Apple <[email protected]> wrote:
>>>>>>> I have a machine where Kudu failed to start:
>>>>>>>
>>>>>>> F1116 05:02:00.173629 71098 tablet_server_main.cc:64] Check failed:
>>>>>>> _s.ok() Bad status: Service unavailable: Cannot initialize clock:
>>>>>>> Error reading clock. Clock considered unsynchronized
>>>>>>>
>>>>>>> https://kudu.apache.org/docs/troubleshooting.html says:
>>>>>>>
>>>>>>> "For the master and tablet server daemons, the server’s clock must be
>>>>>>> synchronized using NTP. In addition, the maximum clock error (not to
>>>>>>> be mistaken with the estimated error) be below a configurable
>>>>>>> threshold. The default value is 10 seconds, but it can be set with the
>>>>>>> flag --max_clock_sync_error_usec."
>>>>>>>
>>>>>>> and
>>>>>>>
>>>>>>> "If NTP is installed the user can monitor the synchronization status
>>>>>>> by running ntptime. The relevant value is what is reported for maximum
>>>>>>> error."
>>>>>>>
>>>>>>> ntptime reports:
>>>>>>>
>>>>>>> ntp_gettime() returns code 0 (OK)
>>>>>>>   time dbd66a6a.59bca948  Wed, Nov 16 2016  5:17:30.350, (.350535824),
>>>>>>>   maximum error 197431 us, estimated error 71015 us, TAI offset 0
>>>>>>> ntp_adjtime() returns code 0 (OK)
>>>>>>>   modes 0x0 (),
>>>>>>>   offset 74989.459 us, frequency 19.950 ppm, interval 1 s,
>>>>>>>   maximum error 197431 us, estimated error 71015 us,
>>>>>>>   status 0x2001 (PLL,NANO),
>>>>>>>   time constant 6, precision 0.001 us, tolerance 500 ppm,
>>>>>>>
>>>>>>> So it looks like this error is anticipated, but the expected
>>>>>>> conditions for it to occur are absent. Any ideas what could be going
>>>>>>> on here? This is with a recent checkout of Impala master.
