[
https://issues.apache.org/jira/browse/KUDU-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Todd Lipcon resolved KUDU-1319.
-------------------------------
Resolution: Cannot Reproduce
Fix Version/s: n/a
Going to resolve this as "cannot reproduce". David, feel free to re-open if you
have a way to repro this
> Tablet bootstrap fails with: Unexpected new leader in same term!
> ----------------------------------------------------------------
>
> Key: KUDU-1319
> URL: https://issues.apache.org/jira/browse/KUDU-1319
> Project: Kudu
> Issue Type: Bug
> Components: consensus
> Affects Versions: 0.6.0
> Reporter: David Alves
> Assignee: David Alves
> Priority: Critical
> Fix For: n/a
>
>
> Sometimes when a tablet bootstraps in the same tablet server where it was
> just deleted, after it starts getting messages from the leader it fails with:
> Unexpected new leader in same term!
> Here's a relevant log:
> {code}
> I0202 14:03:29.673732 2548 raft_consensus.cc:831] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706 [term 18
> FOLLOWER]: Refusing update from remote peer
> I0202 14:03:30.939676 2543 tablet_peer.cc:217] Initiating TabletPeer
> shutdown for tablet: f39f916a60494836942b2e830f569e66
> I0202 14:03:30.939692 2543 maintenance_manager.cc:180] Unregistered op
> CompactRowSetsOp(f39f916a60494836942b2e830f569e66)
> I0202 14:03:30.939699 2543 maintenance_manager.cc:180] Unregistered op
> MinorDeltaCompactionOp(f39f916a60494836942b2e830f569e66)
> I0202 14:03:30.939707 2543 maintenance_manager.cc:180] Unregistered op
> MajorDeltaCompactionOp(f39f916a60494836942b2e830f569e66)
> I0202 14:03:30.939713 2543 maintenance_manager.cc:180] Unregistered op
> FlushMRSOp(f39f916a60494836942b2e830f569e66)
> I0202 14:03:30.939719 2543 maintenance_manager.cc:180] Unregistered op
> FlushDeltaMemStoresOp(f39f916a60494836942b2e830f569e66)
> I0202 14:03:30.939724 2543 maintenance_manager.cc:180] Unregistered op
> LogGCOp(f39f916a60494836942b2e830f569e66)
> I0202 14:03:30.939733 2543 raft_consensus.cc:1467] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706 [term 18
> FOLLOWER]: Raft consensus shutting down.
> I0202 14:03:30.939743 2543 raft_consensus.cc:1483] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706 [term 18
> FOLLOWER]: Raft consensus is shut down!
> I0202 14:03:30.940188 2543 ts_tablet_manager.cc:941] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Deleting
> tablet data with delete state TABLET_DATA_TOMBSTONED
> I0202 14:03:30.970777 2543 ts_tablet_manager.cc:951] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Tablet
> deleted. Last logged OpId: 18.2
> I0202 14:03:30.971076 2543 ts_tablet_manager.cc:411] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706:
> Initiating remote bootstrap from Peer ef1f80adfd1849ac933488787009b162
> (node-5.internal:7050)
> I0202 14:03:30.971578 2543 remote_bootstrap_client.cc:157] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Remote
> bootstrap client: Beginning remote bootstrap session from remote peer at
> address node-5.internal:7050
> I0202 14:03:30.981222 2543 ts_tablet_manager.cc:738] Registered tablet
> f39f916a60494836942b2e830f569e66
> I0202 14:03:30.981236 2543 remote_bootstrap_client.cc:379] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Remote
> bootstrap client: Starting download of 0 data blocks...
> I0202 14:03:30.987156 2543 remote_bootstrap_client.cc:344] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Remote
> bootstrap client: Starting download of 1 WAL segments...
> I0202 14:03:30.987165 2543 tablet_bootstrap.cc:363] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706:
> RemoteBootstrap: Downloading WAL segment with seq. number 1 (1/1)
> I0202 14:03:30.990623 2543 remote_bootstrap_client.cc:265] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Remote
> bootstrap client: Remote bootstrap complete. Replacing tablet superblock.
> I0202 14:03:30.990633 2543 tablet_bootstrap.cc:363] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706:
> RemoteBootstrap: Replacing tablet superblock
> I0202 14:03:31.001113 5927 ts_tablet_manager.cc:610] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706:
> Bootstrapping tablet
> I0202 14:03:31.001261 5927 tablet_bootstrap.cc:363] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706:
> Bootstrap starting.
> I0202 14:03:31.001332 5927 tablet_bootstrap.cc:524] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Time
> spent opening tablet: real 0.000s user 0.000s sys 0.000s
> I0202 14:03:31.001406 5927 tablet_bootstrap.cc:580] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Will
> attempt to recover log segment
> /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66/wal-000000001 to
> /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66.recovery/wal-000000001
> I0202 14:03:31.001413 5927 tablet_bootstrap.cc:588] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Moving
> log directory /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66 to
> recovery directory
> /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66.recovery in
> preparation for log replay
> W0202 14:03:31.001554 5927 log_util.cc:166] Could not read footer for
> segment:
> /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66.recovery/wal-000000001:
> Not found: Footer not found. Footer magic doesn't match
> W0202 14:03:31.001605 5927 log_reader.cc:151] Log segment
> /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66.recovery/wal-000000001
> was likely left in-progress after a previous crash. Will try to rebuild
> footer by scanning data.
> I0202 14:03:31.001651 5927 log_util.cc:220] Successfully rebuilt footer for
> segment:
> /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66.recovery/wal-000000001
> (valid entries through byte offset 353)
> I0202 14:03:31.001674 5927 tablet.cc:752] Rewinding schema during bootstrap
> to Schema [
> 0:key1[int64 NOT NULL],
> 1:key2[int64 NOT NULL],
> 2:prev1[int64 NULLABLE],
> 3:prev2[int64 NULLABLE],
> 4:row_id[int64 NOT NULL],
> 5:client[string NOT NULL],
> 6:update_count[int32 NOT NULL]
> ]
> I0202 14:03:31.002318 5927 tablet_bootstrap.cc:363] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706:
> Bootstrap replayed 1/1 log segments. Stats: ops{read=2 overwritten=0
> applied=2} inserts{seen=0 ignored=0} mutations{seen=0 ignored=0}
> orphaned_commits=0. Pending: 0 replicates
> I0202 14:03:31.002331 5927 tablet_bootstrap.cc:1003] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706:
> ReplayState: Previous OpId: 18.2, Committed OpId: 18.2, Pending Replicates:
> 0, Pending Commits: 0
> I0202 14:03:31.002563 5927 tablet_bootstrap.cc:617] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706:
> Preparing to delete log recovery files and directory
> /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66.recovery
> I0202 14:03:31.002573 5927 tablet_bootstrap.cc:620] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Renaming
> log recovery dir from
> /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66.recovery to
> /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66.recovery-1454450611002572
> I0202 14:03:31.002609 5927 tablet_bootstrap.cc:630] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Deleting
> all files from renamed log recovery directory
> /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66.recovery-1454450611002572
> I0202 14:03:31.002704 5927 tablet_bootstrap.cc:633] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706:
> Completed deletion of old log recovery files and directory
> /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66.recovery-1454450611002572
> I0202 14:03:31.002712 5927 tablet_bootstrap.cc:363] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706:
> Bootstrap complete.
> I0202 14:03:31.002768 5927 ts_tablet_manager.cc:615] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Time
> spent bootstrapping tablet: real 0.002s user 0.000s sys 0.001s
> I0202 14:03:31.002984 5927 raft_consensus.cc:268] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706 [term 18
> FOLLOWER]: Replica starting. Triggering 0 pending transactions. Active
> config: opid_index: -1 local: false peers { permanent_uuid:
> "59f9b39fef534f4d8bc2111bbb6bf24c" member_type: VOTER last_known_addr { host:
> "node-3.internal" port: 7050 } } peers { permanent_uuid:
> "6ecbb8fb26534a55a009c77617ab0706" member_type: VOTER last_known_addr { host:
> "node-4.internal" port: 7050 } } peers { permanent_uuid:
> "ef1f80adfd1849ac933488787009b162" member_type: VOTER last_known_addr { host:
> "node-5.internal" port: 7050 } }
> I0202 14:03:31.003002 5927 raft_consensus.cc:477] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706 [term 18
> FOLLOWER]: Becoming Follower/Learner. State: Replica:
> 6ecbb8fb26534a55a009c77617ab0706, State: 1, Role: FOLLOWER
> Watermarks: {Received: term: 18 index: 2 Committed: term: 18 index: 2}
> I0202 14:03:31.003013 5927 consensus_queue.cc:162] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706
> [NON_LEADER]: Queue going to NON_LEADER mode. State: All replicated op: 0.0,
> Majority replicated op: 0.0, Committed index: 0.0, Last appended: 18.2,
> Current term: 0, Majority size: -1, State: 1, Mode: NON_LEADER
> F0202 14:03:34.440295 2538 raft_consensus.cc:922] T
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706 [term 18
> FOLLOWER]: Unexpected new leader in same term! Existing leader UUID:
> ef1f80adfd1849ac933488787009b162, new leader UUID:
> af1f80adfd1849ac933488787009b162
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)