[
https://issues.apache.org/jira/browse/KUDU-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15360734#comment-15360734
]
zhangsong commented on KUDU-1483:
---------------------------------
This issue is linked with 1449.
> in some cases, followers cannot promote to leader.
> --------------------------------------------------
>
> Key: KUDU-1483
> URL: https://issues.apache.org/jira/browse/KUDU-1483
> Project: Kudu
> Issue Type: Bug
> Reporter: zhangsong
>
> In my env, the master's web UI shows a tablet that has only two followers
> and no leader, and that situation lasts forever.
> Some logs about the tablet from the two followers:
> follower1:
> I0613 11:16:33.244365 26846 leader_election.cc:223] T
> 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e
> [CANDIDATE]: Term 31717 election: Requesting vote from peer
> 8cf59ddd6d154ae99d3b23da840169e0
> W0613 11:16:33.247150 26016 leader_election.cc:281] T
> 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e
> [CANDIDATE]: Term 31717 election: Tablet error from VoteRequest() call to
> peer 8cf59ddd6d154ae99d3b23da840169e0: Illegal state: Tablet not RUNNING:
> FAILED: Not found: Can't find block: 1363326557009763249
> I0613 11:16:33.247463 26016 leader_election.cc:248] T
> 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e
> [CANDIDATE]: Term 31717 election: Election decided. Result: candidate lost.
> I0613 11:16:33.248205 17534 raft_consensus.cc:1942] T
> 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term
> 31717 FOLLOWER]: Snoozing failure detection for election timeout plus an
> additional 15.536s
> I0613 11:16:33.248245 17534 raft_consensus.cc:1795] T
> 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term
> 31717 FOLLOWER]: Leader election lost for term 31717. Reason: None given
> I0613 11:16:34.288436 26137 raft_consensus.cc:1298] T
> 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term
> 31717 FOLLOWER]: Handling vote request from an unknown peer
> 95bc8f3637ed4a52b53a984052ba6114
> I0613 11:16:34.288633 26137 raft_consensus.cc:1558] T
> 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term
> 31717 FOLLOWER]: Leader election vote request: Denying vote to candidate
> 95bc8f3637ed4a52b53a984052ba6114 for earlier term 31666. Current term is
> 31717.
> I0613 11:16:41.506261 26127 raft_consensus.cc:1298] T
> 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term
> 31717 FOLLOWER]: Handling vote request from an unknown peer
> 95bc8f3637ed4a52b53a984052ba6114
> I0613 11:16:41.506325 26127 raft_consensus.cc:1558] T
> 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term
> 31717 FOLLOWER]: Leader election vote request: Denying vote to candidate
> 95bc8f3637ed4a52b53a984052ba6114 for earlier term 31667. Current term is
> 31717.
> I0613 11:16:45.440551 26135 raft_consensus.cc:1298] T
> 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term
> 31717 FOLLOWER]: Handling vote request from an unknown peer
> 95bc8f3637ed4a52b53a984052ba6114
> I0613 11:16:45.440625 26135 raft_consensus.cc:1558] T
> 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term
> 31717 FOLLOWER]: Leader election vote request: Denying vote to candidate
> 95bc8f3637ed4a52b53a984052ba6114 for earlier term 31668. Current term is
> 31717.
> It seems that there are three followers/voters and one of them has the
> tablet in a "not running" state.
> On the other follower:
> W0613 11:16:45.437863 18782 leader_election.cc:281] T
> 87588b06c65d4898a5b8c29d08b3528d P 95bc8f3637ed4a52b53a984052ba6114
> [CANDIDATE]: Term 31668 election: Tablet error from VoteRequest() call to
> peer 8cf59ddd6d154ae99d3b23da840169e0: Illegal state: Tablet not RUNNING:
> FAILED: Not found: Can't find block: 1363326557009763249
> W0613 11:16:45.438611 18782 leader_election.cc:333] T
> 87588b06c65d4898a5b8c29d08b3528d P 95bc8f3637ed4a52b53a984052ba6114
> [CANDIDATE]: Term 31668 election: Vote denied by peer
> eded59517b14432ab9022cd50d160b8e with higher term. Message: Invalid argument:
> T 87588b06c65d4898a5b8c29d08b3528d P eded59517b14432ab9022cd50d160b8e [term
> 31717 FOLLOWER]: Leader election vote request: Denying vote to candidate
> 95bc8f3637ed4a52b53a984052ba6114 for earlier term 31668. Current term is
> 31717.
> I0613 11:16:45.439034 18782 leader_election.cc:336] T
> 87588b06c65d4898a5b8c29d08b3528d P 95bc8f3637ed4a52b53a984052ba6114
> [CANDIDATE]: Term 31668 election: Cancelling election due to peer responding
> with higher term
> I0613 11:16:45.440032 21807 raft_consensus.cc:1942] T
> 87588b06c65d4898a5b8c29d08b3528d P 95bc8f3637ed4a52b53a984052ba6114 [term
> 31668 FOLLOWER]: Snoozing failure detection for election timeout plus an
> additional 15.493s
> These logs repeat again and again. It seems that the follower with the
> lower term starts a leader election and gets denied by the follower with the
> higher term, and the follower with the higher term doesn't know about the
> first follower for some reason.
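
For readers following the logs, the denial in the last paragraph can be sketched as the generic Raft stale-term check. This is only an illustration under simplifying assumptions: the `Voter` class and `handle_vote_request` method are hypothetical names, not Kudu's raft_consensus.cc code, and the log-recency check and the "unknown peer" handling are left out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Voter:
    """Hypothetical stand-in for a Raft replica; not Kudu's implementation."""
    uuid: str
    current_term: int
    voted_for: Optional[str] = None

    def handle_vote_request(self, candidate_uuid: str,
                            candidate_term: int) -> Tuple[bool, str]:
        # A request carrying an older term than ours is always rejected.
        if candidate_term < self.current_term:
            return False, (
                f"Denying vote to candidate {candidate_uuid} for earlier term "
                f"{candidate_term}. Current term is {self.current_term}."
            )
        # A newer term is adopted first, which also clears any earlier vote.
        if candidate_term > self.current_term:
            self.current_term = candidate_term
            self.voted_for = None
        # At most one vote is granted per term.
        if self.voted_for not in (None, candidate_uuid):
            return False, "Already voted for another candidate in this term."
        self.voted_for = candidate_uuid
        return True, "Vote granted."


# Terms and peer ids taken from the logs above.
follower = Voter(uuid="eded59517b14432ab9022cd50d160b8e", current_term=31717)
granted, reason = follower.handle_vote_request(
    "95bc8f3637ed4a52b53a984052ba6114", 31668)
print(granted, reason)
```

Under this rule a candidate at term 31668 cannot win a vote from a peer at term 31717 until its own term reaches at least 31717, which matches the repeated denials in the logs.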