[ 
https://issues.apache.org/jira/browse/KUDU-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280654#comment-15280654
 ] 

Todd Lipcon commented on KUDU-1319:
-----------------------------------

Ping? Is it possible this was caused by KUDU-1436?

> Tablet bootstrap fails with: Unexpected new leader in same term!
> ----------------------------------------------------------------
>
>                 Key: KUDU-1319
>                 URL: https://issues.apache.org/jira/browse/KUDU-1319
>             Project: Kudu
>          Issue Type: Bug
>          Components: consensus
>    Affects Versions: 0.6.0
>            Reporter: David Alves
>            Assignee: David Alves
>            Priority: Critical
>
> Sometimes when a tablet bootstraps in the same tablet server where it was 
> just deleted, after it starts getting messages from the leader it fails with: 
> Unexpected new leader in same term!
> Here's a relevant log:
> {code}
> I0202 14:03:29.673732  2548 raft_consensus.cc:831] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706 [term 18 
> FOLLOWER]: Refusing update from remote peer
> I0202 14:03:30.939676  2543 tablet_peer.cc:217] Initiating TabletPeer 
> shutdown for tablet: f39f916a60494836942b2e830f569e66
> I0202 14:03:30.939692  2543 maintenance_manager.cc:180] Unregistered op 
> CompactRowSetsOp(f39f916a60494836942b2e830f569e66)
> I0202 14:03:30.939699  2543 maintenance_manager.cc:180] Unregistered op 
> MinorDeltaCompactionOp(f39f916a60494836942b2e830f569e66)
> I0202 14:03:30.939707  2543 maintenance_manager.cc:180] Unregistered op 
> MajorDeltaCompactionOp(f39f916a60494836942b2e830f569e66)
> I0202 14:03:30.939713  2543 maintenance_manager.cc:180] Unregistered op 
> FlushMRSOp(f39f916a60494836942b2e830f569e66)
> I0202 14:03:30.939719  2543 maintenance_manager.cc:180] Unregistered op 
> FlushDeltaMemStoresOp(f39f916a60494836942b2e830f569e66)
> I0202 14:03:30.939724  2543 maintenance_manager.cc:180] Unregistered op 
> LogGCOp(f39f916a60494836942b2e830f569e66)
> I0202 14:03:30.939733  2543 raft_consensus.cc:1467] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706 [term 18 
> FOLLOWER]: Raft consensus shutting down.
> I0202 14:03:30.939743  2543 raft_consensus.cc:1483] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706 [term 18 
> FOLLOWER]: Raft consensus is shut down!
> I0202 14:03:30.940188  2543 ts_tablet_manager.cc:941] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Deleting 
> tablet data with delete state TABLET_DATA_TOMBSTONED
> I0202 14:03:30.970777  2543 ts_tablet_manager.cc:951] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Tablet 
> deleted. Last logged OpId: 18.2
> I0202 14:03:30.971076  2543 ts_tablet_manager.cc:411] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: 
> Initiating remote bootstrap from Peer ef1f80adfd1849ac933488787009b162 
> (node-5.internal:7050)
> I0202 14:03:30.971578  2543 remote_bootstrap_client.cc:157] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Remote 
> bootstrap client: Beginning remote bootstrap session from remote peer at 
> address node-5.internal:7050
> I0202 14:03:30.981222  2543 ts_tablet_manager.cc:738] Registered tablet 
> f39f916a60494836942b2e830f569e66
> I0202 14:03:30.981236  2543 remote_bootstrap_client.cc:379] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Remote 
> bootstrap client: Starting download of 0 data blocks...
> I0202 14:03:30.987156  2543 remote_bootstrap_client.cc:344] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Remote 
> bootstrap client: Starting download of 1 WAL segments...
> I0202 14:03:30.987165  2543 tablet_bootstrap.cc:363] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: 
> RemoteBootstrap: Downloading WAL segment with seq. number 1 (1/1)
> I0202 14:03:30.990623  2543 remote_bootstrap_client.cc:265] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Remote 
> bootstrap client: Remote bootstrap complete. Replacing tablet superblock.
> I0202 14:03:30.990633  2543 tablet_bootstrap.cc:363] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: 
> RemoteBootstrap: Replacing tablet superblock
> I0202 14:03:31.001113  5927 ts_tablet_manager.cc:610] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: 
> Bootstrapping tablet
> I0202 14:03:31.001261  5927 tablet_bootstrap.cc:363] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: 
> Bootstrap starting.
> I0202 14:03:31.001332  5927 tablet_bootstrap.cc:524] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Time 
> spent opening tablet: real 0.000s    user 0.000s     sys 0.000s
> I0202 14:03:31.001406  5927 tablet_bootstrap.cc:580] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Will 
> attempt to recover log segment 
> /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66/wal-000000001 to 
> /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66.recovery/wal-000000001
> I0202 14:03:31.001413  5927 tablet_bootstrap.cc:588] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Moving 
> log directory /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66 to 
> recovery directory 
> /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66.recovery in 
> preparation for log replay
> W0202 14:03:31.001554  5927 log_util.cc:166] Could not read footer for 
> segment: 
> /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66.recovery/wal-000000001:
>  Not found: Footer not found. Footer magic doesn't match
> W0202 14:03:31.001605  5927 log_reader.cc:151] Log segment 
> /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66.recovery/wal-000000001
>  was likely left in-progress after a previous crash. Will try to rebuild 
> footer by scanning data.
> I0202 14:03:31.001651  5927 log_util.cc:220] Successfully rebuilt footer for 
> segment: 
> /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66.recovery/wal-000000001
>  (valid entries through byte offset 353)
> I0202 14:03:31.001674  5927 tablet.cc:752] Rewinding schema during bootstrap 
> to Schema [
>       0:key1[int64 NOT NULL],
>       1:key2[int64 NOT NULL],
>       2:prev1[int64 NULLABLE],
>       3:prev2[int64 NULLABLE],
>       4:row_id[int64 NOT NULL],
>       5:client[string NOT NULL],
>       6:update_count[int32 NOT NULL]
> ]
> I0202 14:03:31.002318  5927 tablet_bootstrap.cc:363] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: 
> Bootstrap replayed 1/1 log segments. Stats: ops{read=2 overwritten=0 
> applied=2} inserts{seen=0 ignored=0} mutations{seen=0 ignored=0} 
> orphaned_commits=0. Pending: 0 replicates
> I0202 14:03:31.002331  5927 tablet_bootstrap.cc:1003] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: 
> ReplayState: Previous OpId: 18.2, Committed OpId: 18.2, Pending Replicates: 
> 0, Pending Commits: 0
> I0202 14:03:31.002563  5927 tablet_bootstrap.cc:617] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: 
> Preparing to delete log recovery files and directory 
> /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66.recovery
> I0202 14:03:31.002573  5927 tablet_bootstrap.cc:620] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Renaming 
> log recovery dir from 
> /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66.recovery to 
> /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66.recovery-1454450611002572
> I0202 14:03:31.002609  5927 tablet_bootstrap.cc:630] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Deleting 
> all files from renamed log recovery directory 
> /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66.recovery-1454450611002572
> I0202 14:03:31.002704  5927 tablet_bootstrap.cc:633] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: 
> Completed deletion of old log recovery files and directory 
> /var/lib/kudu/tserver/wals/f39f916a60494836942b2e830f569e66.recovery-1454450611002572
> I0202 14:03:31.002712  5927 tablet_bootstrap.cc:363] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: 
> Bootstrap complete.
> I0202 14:03:31.002768  5927 ts_tablet_manager.cc:615] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706: Time 
> spent bootstrapping tablet: real 0.002s     user 0.000s     sys 0.001s
> I0202 14:03:31.002984  5927 raft_consensus.cc:268] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706 [term 18 
> FOLLOWER]: Replica starting. Triggering 0 pending transactions. Active 
> config: opid_index: -1 local: false peers { permanent_uuid: 
> "59f9b39fef534f4d8bc2111bbb6bf24c" member_type: VOTER last_known_addr { host: 
> "node-3.internal" port: 7050 } } peers { permanent_uuid: 
> "6ecbb8fb26534a55a009c77617ab0706" member_type: VOTER last_known_addr { host: 
> "node-4.internal" port: 7050 } } peers { permanent_uuid: 
> "ef1f80adfd1849ac933488787009b162" member_type: VOTER last_known_addr { host: 
> "node-5.internal" port: 7050 } }
> I0202 14:03:31.003002  5927 raft_consensus.cc:477] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706 [term 18 
> FOLLOWER]: Becoming Follower/Learner. State: Replica: 
> 6ecbb8fb26534a55a009c77617ab0706, State: 1, Role: FOLLOWER
> Watermarks: {Received: term: 18 index: 2 Committed: term: 18 index: 2}
> I0202 14:03:31.003013  5927 consensus_queue.cc:162] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706 
> [NON_LEADER]: Queue going to NON_LEADER mode. State: All replicated op: 0.0, 
> Majority replicated op: 0.0, Committed index: 0.0, Last appended: 18.2, 
> Current term: 0, Majority size: -1, State: 1, Mode: NON_LEADER
> F0202 14:03:34.440295  2538 raft_consensus.cc:922] T 
> f39f916a60494836942b2e830f569e66 P 6ecbb8fb26534a55a009c77617ab0706 [term 18 
> FOLLOWER]: Unexpected new leader in same term! Existing leader UUID: 
> ef1f80adfd1849ac933488787009b162, new leader UUID: 
> af1f80adfd1849ac933488787009b162
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to