Shashikant Banerjee created RATIS-1112:
------------------------------------------
Summary: Ensure a node doesn't get reelected as a leader if it
voluntarily steps down
Key: RATIS-1112
URL: https://issues.apache.org/jira/browse/RATIS-1112
Project: Ratis
Issue Type: Bug
Components: server
Reporter: Shashikant Banerjee
Assignee: Tsz-wo Sze
Fix For: 1.1.0
Currently, a leader voluntarily steps down if it does not receive a heartbeat
response from either of its two followers within the leader election timeout
(5s). However, the same node can get re-elected as leader in the very next
term, as the log below shows. The idea is to avoid this.
{code:java}
2020-10-19 05:01:31,639 WARN org.apache.ratis.server.impl.RaftServerImpl:
14f34f1f-5102-4ba8-91ed-3694e3705af6@group-7A1D5C0EE220-LeaderState: Lost
leadership on term: 1. Election timeout: 5200ms. In charge for: 4532364ms.
Conf: 0: [7792870e-e2e1-4b9d-84ee-1855f82ea08c:10.17.234.24:9858:0,
14f34f1f-5102-4ba8-91ed-3694e3705af6:10.17.234.17:9858:0,
459b68fc-2f29-413e-a773-0d9be7fd9511:10.17.234.18:9858:0], old=null. Followers:
[14f34f1f-5102-4ba8-91ed-3694e3705af6@group-7A1D5C0EE220->7792870e-e2e1-4b9d-84ee-1855f82ea08c(c234581,m234581,n234603,
attendVote=true, lastRpcSendTime=0, lastRpcResponseTime=5501),
14f34f1f-5102-4ba8-91ed-3694e3705af6@group-7A1D5C0EE220->459b68fc-2f29-413e-a773-0d9be7fd9511(c234865,m234866,n234938,
attendVote=true, lastRpcSendTime=18, lastRpcResponseTime=5501)]
2020-10-19 05:01:31,640 INFO org.apache.ratis.server.impl.RaftServerImpl:
14f34f1f-5102-4ba8-91ed-3694e3705af6@group-7A1D5C0EE220: changes role from
LEADER to FOLLOWER at term 1 for stepDown. -------------> Stepping down
2020-10-19 05:01:36,900 INFO org.apache.ratis.server.impl.LeaderElection:
14f34f1f-5102-4ba8-91ed-3694e3705af6@group-7A1D5C0EE220-LeaderElection14:
Election PASSED; received 1 response(s)
[14f34f1f-5102-4ba8-91ed-3694e3705af6<-7792870e-e2e1-4b9d-84ee-1855f82ea08c#0:OK-t2]
and 0 exception(s);
14f34f1f-5102-4ba8-91ed-3694e3705af6@group-7A1D5C0EE220:t2, leader=null,
voted=14f34f1f-5102-4ba8-91ed-3694e3705af6,
raftlog=14f34f1f-5102-4ba8-91ed-3694e3705af6@group-7A1D5C0EE220-SegmentedRaftLog:OPENED:c234866,f234937,i234937,
conf=0: [7792870e-e2e1-4b9d-84ee-1855f82ea08c:10.17.234.24:9858:0,
14f34f1f-5102-4ba8-91ed-3694e3705af6:10.17.234.17:9858:0,
459b68fc-2f29-413e-a773-0d9be7fd9511:10.17.234.18:9858:0], old=null
2020-10-19 05:01:36,901 INFO org.apache.ratis.server.impl.RaftServerImpl:
14f34f1f-5102-4ba8-91ed-3694e3705af6@group-7A1D5C0EE220: changes role from
CANDIDATE to LEADER at term 2 for changeToLeader---------------------> become
leader again in the next term
2020-10-19 05:01:36,901 INFO
org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis:
Leader change notification received for group: group-7A1D5C0EE220 with new
leaderId: 14f34f1f-5102-4ba8-91ed-3694e3705af6
2020-10-19 05:01:36,902 INFO org.apache.ratis.server.impl.RaftServerImpl:
14f34f1f-5102-4ba8-91ed-3694e3705af6@group-7A1D5C0EE220: change Leader from
null to 14f34f1f-5102-4ba8-91ed-3694e3705af6 at term 2 for becomeLeader, leader
elected after 50ms
2020-10-19 05:01:37,339 INFO
org.apache.ratis.grpc.client.GrpcClientProtocolService: Failed
RaftClientRequest:client-0F18033B24C7->14f34f1f-5102-4ba8-91ed-3694e3705af6@group-7A1D5C0EE220,
cid=1896, seq=0, Watch-ALL_COMMITTED(234813), Message:<EMPTY>,
reply=RaftClientReply:client-0F18033B24C7->14f34f1f-5102-4ba8-91ed-3694e3705af6@group-7A1D5C0EE220,
cid=1896, FAILED org.apache.ratis.protocol.LeaderNotReadyException:
14f34f1f-5102-4ba8-91ed-3694e3705af6@group-7A1D5C0EE220 is in LEADER state but
not ready yet., logIndex=0,
commits[14f34f1f-5102-4ba8-91ed-3694e3705af6:c234939,
7792870e-e2e1-4b9d-84ee-1855f82ea08c:c234602,
459b68fc-2f29-413e-a773-0d9be7fd9511:c234938]
{code}
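One possible direction, sketched below purely as an illustration: after a
voluntary step-down, the node holds back from starting its own election for a
few election timeouts, so that the other peers get a chance to win first. The
class StepDownBackoff, its methods and the backoffRounds parameter are all
hypothetical names for this sketch and are not part of the Ratis API. The
numbers in main are taken from the log above (5200ms election timeout, new
election started 5260ms after the step-down).

```java
/** Hypothetical helper for illustration only; not part of the Ratis API. */
final class StepDownBackoff {
  private final long electionTimeoutMs;
  private final int backoffRounds;
  // Time of the last voluntary step-down; -1 means it never happened.
  private long lastVoluntaryStepDownMs = -1L;

  StepDownBackoff(long electionTimeoutMs, int backoffRounds) {
    if (electionTimeoutMs <= 0 || backoffRounds < 0) {
      throw new IllegalArgumentException("invalid timeout or rounds");
    }
    this.electionTimeoutMs = electionTimeoutMs;
    this.backoffRounds = backoffRounds;
  }

  /** Record that this node gave up leadership on its own. */
  void onVoluntaryStepDown(long nowMs) {
    lastVoluntaryStepDownMs = nowMs;
  }

  /** True if this node may become a candidate at the given time. */
  boolean mayStartElection(long nowMs) {
    if (lastVoluntaryStepDownMs < 0) {
      return true; // never stepped down voluntarily, no restriction
    }
    long quietPeriodMs = electionTimeoutMs * backoffRounds;
    return nowMs - lastVoluntaryStepDownMs >= quietPeriodMs;
  }

  public static void main(String[] args) {
    // Assumed policy: stay quiet for 2 election timeouts after stepping down.
    StepDownBackoff backoff = new StepDownBackoff(5200L, 2);
    backoff.onVoluntaryStepDown(0L);
    // 5260ms after the step-down, as in the log: candidacy is still suppressed.
    System.out.println(backoff.mayStartElection(5260L));
    // Once the quiet period (10400ms) has passed, candidacy is allowed again.
    System.out.println(backoff.mayStartElection(10400L));
  }
}
```

With such a guard the node would still vote for other candidates during the
quiet period; it only refrains from nominating itself. Whether a time-based
back-off, a lowered priority, or some other mechanism is the right fix is left
open here.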