Re: Help start kudu error: Bad status: Invalid argument: Tried to update clock beyond the max. error.

Alexey Serbin Tue, 02 May 2017 10:47:14 -0700

Hi,

It seems the clocks on the machines in the cluster are not synchronized as expected. This might be due to NTP configuration issues. Here is some information to start troubleshooting with: http://kudu.apache.org/docs/troubleshooting.html#ntp

That error might appear during tablet bootstrap (so it might happen to both masters and tservers).
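To make the failure mode concrete, here is a simplified model of the kind of check that produces this message. It is a sketch inferred from the error text and the max_clock_sync_error_usec flag name, not Kudu's actual source; the function name and the use of plain microsecond integers are assumptions made for illustration.

```python
def check_clock_update(now_usec, incoming_usec, max_clock_sync_error_usec):
    """Hypothetical model of a hybrid-clock update check (not Kudu's code).

    now_usec: local physical clock reading, in microseconds.
    incoming_usec: physical part of a timestamp read from a replayed
        log entry or received from another server.
    Returns the clock value to continue from.
    """
    # A timestamp at or behind the local clock needs no adjustment.
    if incoming_usec <= now_usec:
        return now_usec
    # A timestamp too far in the future is rejected, since it may come
    # from a server whose clock was out of sync.
    if incoming_usec - now_usec > max_clock_sync_error_usec:
        raise ValueError("Tried to update clock beyond the max. error.")
    # Otherwise the local clock is moved forward to the incoming value.
    return incoming_usec
```

Under this model, replaying a write-ahead log whose entries were stamped by a clock well ahead of the current local clock fails at bootstrap, and raising the flag only helps if the new limit exceeds the actual skew. The value 1500000000 in the quoted log below corresponds to 1500 seconds (25 minutes).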

What is the output of the 'ntptime' command when run on the servers? Also, what is the output of 'ntpq -p localhost'?
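If it helps to compare several servers, the 'ntptime' output can be checked with a small script. The sketch below assumes the usual output layout of 'ntptime' (a "returns code" line and a "maximum error ... us" line); the sample text is hypothetical and not taken from this cluster, and the 10-second default threshold is an assumption that should be checked against the max_clock_sync_error_usec value in use.

```python
import re

# Hypothetical sample shaped like typical `ntptime` output (not from this thread).
SAMPLE_OK = """\
ntp_gettime() returns code 0 (OK)
  maximum error 1109553 us, estimated error 412 us, TAI offset 0
ntp_adjtime() returns code 0 (OK)
  maximum error 1109553 us, estimated error 412 us,
  status 0x2001 (PLL,NANO),
"""

# Hypothetical sample for a host whose kernel clock is not synchronized.
SAMPLE_UNSYNC = """\
ntp_gettime() returns code 5 (ERROR)
  maximum error 16000000 us, estimated error 16000000 us, TAI offset 0
"""


def parse_ntptime(text):
    """Extract the ntp_gettime() return code and the first maximum error (us)."""
    code = re.search(r"ntp_gettime\(\) returns code (\d+) \((\w+)\)", text)
    max_err = re.search(r"maximum error (\d+) us", text)
    if code is None or max_err is None:
        raise ValueError("unrecognized ntptime output")
    return {
        "code": int(code.group(1)),
        "state": code.group(2),
        "max_error_us": int(max_err.group(1)),
    }


def clock_looks_synchronized(text, max_error_us=10_000_000):
    """True if the return code is 0 and the maximum error is within the limit.

    The default limit of 10 seconds is an assumption, not a value from this thread.
    """
    info = parse_ntptime(text)
    return info["code"] == 0 and info["max_error_us"] <= max_error_us
```

In practice the text would come from running 'ntptime' on each master and tserver, for example via subprocess.run(["ntptime"], capture_output=True, text=True).stdout.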



Best regards,

Alexey


On 5/2/17 12:12 AM, ???????? wrote:

Since the machines in the Kudu cluster were powered down, I need to restart kudu-master and kudu-tserver. The cluster has three masters and three tservers. One of the masters and all three tservers fail to start with the error message: Bad status: Invalid argument: Tried to update clock beyond the max. error. I tried setting max_clock_sync_error_usec to a larger value, but I still get the same error.
I do not know how to solve it.
Kudu-master start log:

Log file created at: 2017/05/02 14:50:53
Running on machine: hadoopname01vl
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
I0502 14:50:53.479116 5474 master_main.cc:60] Master server non-default flags:
--fs_data_dirs=/app/kudu/master
--fs_wal_dir=/app/kudu/master
--master_addresses=hadoopname01vl:7051,hadoopdata04vl:7051,hadoopname02vl:7051
--max_clock_sync_error_usec=1500000000
--heap_profile_path=/tmp/kudu-master.5474
--flagfile=/etc/kudu/conf/master.gflagfile
--fromenv=log_dir
--log_dir=/app/kudu/log
Master server version:
kudu 1.2.0-cdh5.10.0
revision 01748528baa06b78e04ce9a799cc60090a821162
build type RELEASE
built by jenkins at 23 Jan 2017 23:49:02 PST on kudu-centos66-17b9.vpc.cloudera.com
build id 2017-01-23_23-14-17
I0502 14:50:53.479230 5474 mem_tracker.cc:140] MemTracker: hard memory limit is 2.988239 GB
I0502 14:50:53.479236 5474 mem_tracker.cc:142] MemTracker: soft memory limit is 1.792943 GB
I0502 14:50:53.480358 5474 master_main.cc:67] Initializing master server...
I0502 14:50:53.480466 5474 hybrid_clock.cc:177] HybridClock initialized. Resolution in nanos?: 1 Wait times tolerance adjustment: 1.0005 Current error: 1109553
I0502 14:50:53.481259 5474 env_posix.cc:1284] Not raising process file limit of 131072; it is already as high as it can go
I0502 14:50:53.481281 5474 file_cache.cc:401] Constructed file cache lbm with capacity 65536
I0502 14:50:53.482020 5474 log_block_manager.cc:1336] Data dir /app/kudu/master/data is on an ext4 filesystem vulnerable to KUDU-1508 with block size 4096
I0502 14:50:53.482035 5474 log_block_manager.cc:1346] Limiting containers on data directory /app/kudu/master/data to 2721 blocks
I0502 14:50:53.484666 5474 fs_manager.cc:251] Opened local filesystem: /app/kudu/master
uuid: "4811dfb33ff444d2b3416d7bbe3c9a38"
format_stamp: "Formatted at 2017-02-20 07:35:54 on hadoopname01vl"
I0502 14:50:53.501610  5474 master_main.cc:70] Starting Master server...
I0502 14:50:53.505748 5474 rpc_server.cc:164] RPC server started. Bound to: 0.0.0.0:7051
I0502 14:50:53.505798 5474 webserver.cc:126] Starting webserver on 0.0.0.0:8051
I0502 14:50:53.505807 5474 webserver.cc:131] Document root: /usr/lib/kudu/www
I0502 14:50:53.505928 5474 webserver.cc:221] Webserver started. Bound to: http://0.0.0.0:8051/
I0502 14:50:53.506609 5543 sys_catalog.cc:119] Verifying existing consensus state
I0502 14:50:53.507067 5543 tablet_bootstrap.cc:381] T 00000000000000000000000000000000 P 4811dfb33ff444d2b3416d7bbe3c9a38: Bootstrap starting.
I0502 14:50:53.507866 5543 tablet_bootstrap.cc:540] T 00000000000000000000000000000000 P 4811dfb33ff444d2b3416d7bbe3c9a38: Time spent opening tablet: real 0.001s user 0.000s sys 0.000s
I0502 14:50:53.507894 5543 tablet_bootstrap.cc:560] T 00000000000000000000000000000000 P 4811dfb33ff444d2b3416d7bbe3c9a38: Previous recovery directory found at /app/kudu/master/wals/00000000000000000000000000000000.recovery: Replaying log files from this location instead of /app/kudu/master/wals/00000000000000000000000000000000
I0502 14:50:53.507917 5543 tablet_bootstrap.cc:567] T 00000000000000000000000000000000 P 4811dfb33ff444d2b3416d7bbe3c9a38: Deleting old log files from previous recovery attempt in /app/kudu/master/wals/00000000000000000000000000000000
I0502 14:50:53.509835 5543 log_util.cc:316] Log segment /app/kudu/master/wals/00000000000000000000000000000000.recovery/wal-000000001 has no footer. This segment was likely being written when the server previously shut down.
I0502 14:50:53.509851 5543 log_reader.cc:160] Log segment /app/kudu/master/wals/00000000000000000000000000000000.recovery/wal-000000001 was likely left in-progress after a previous crash. Will try to rebuild footer by scanning data.
I0502 14:50:53.548249 5543 log_util.cc:570] Scanning /app/kudu/master/wals/00000000000000000000000000000000.recovery/wal-000000001 for valid entry headers following offset 7156830...
I0502 14:50:53.564885  5543 log_util.cc:607] Found no log entry headers
I0502 14:50:53.564929 5543 log_util.cc:219] Ignoring log segment corruption in /app/kudu/master/wals/00000000000000000000000000000000.recovery/wal-000000001 because there are no log entries following the corrupted one. The server probably crashed in the middle of writing an entry to the write-ahead log or downloaded an active log via tablet copy. Error detail: Corruption: CRC mismatch in log entry header: Log file corruption detected. Failed trying to read batch #0 at offset 7156818 for log segment /app/kudu/master/wals/00000000000000000000000000000000.recovery/wal-000000001: Prior entries: [off=7156180 REPLICATE (3.11030)] [off=7156213 COMMIT (3.11030)] [off=7156252 REPLICATE (4.11031)] [off=7156818 REPLICATE (4.11032)]
I0502 14:50:53.564937 5543 log_util.cc:369] Successfully rebuilt footer for segment: /app/kudu/master/wals/00000000000000000000000000000000.recovery/wal-000000001 (valid entries through byte offset 7156818)
I0502 14:50:53.564985 5543 tablet.cc:983] T 00000000000000000000000000000000 Rewinding schema during bootstrap to Schema [
        0:entry_type[int8 NOT NULL],
        1:entry_id[string NOT NULL],
        2:metadata[string NOT NULL]
]
I0502 14:50:53.565114 5543 log.cc:351] Log is configured to *not* fsync() on all Append() calls
F0502 14:50:53.717851 5543 tablet_bootstrap.cc:790] Check failed: _s.ok() Bad status: Invalid argument: Tried to update clock beyond the max. error.
