[ 
https://issues.apache.org/jira/browse/RATIS-2135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

guangbao zhao updated RATIS-2135:
---------------------------------
    Description: 
We found in the leader log of ratis that the follower replied leader nextIndex 
= 645891323, and the leader also executed setNextIndex nextIndex: 
updateUnconditionally 645892117 -> 645891323, but this process was repeated 
continuously, causing the follower to become unavailable.

The following is the log of the leader node:

 

2024/08/06 00:00:00,094 [grpc-default-executor-210] WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: 
node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: Failed appendEntries: 
org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: CANCELLED: 
RST_STREAM closed stream. HTTP/2 error code: CANCEL
2024/08/06 00:00:00,103 [grpc-default-executor-210] WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: 
node2@group-EE11FFDD70A4->node1-GrpcLogAppender: Follower failed (request=null, 
errorCount=206); keep nextIndex (645892117) unchanged and retry.
2024/08/06 00:00:00,400 [grpc-default-executor-160] WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: 
node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: received 
INCONSISTENCY reply with nextIndex 645891323, errorCount=207, 
request=AppendEntriesRequest:cid=51646275,entriesCount=744,entries=(t:626, 
i:645892117)...(t:626, i:645892860)
2024/08/06 00:00:00,400 [grpc-default-executor-160] INFO 
org.apache.ratis.server.leader.FollowerInfo: node2@group-EE11FFDD70A4->node1: 
setNextIndex nextIndex: updateUnconditionally 645892861 -> 645891323
2024/08/06 00:00:02,888 [grpc-default-executor-210] WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: 
node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: received 
INCONSISTENCY reply with nextIndex 645891323, errorCount=209, 
request=AppendEntriesRequest:cid=51646277,entriesCount=669,entries=(t:626, 
i:645892117)...(t:626, i:645892785)
2024/08/06 00:00:02,888 [grpc-default-executor-210] INFO 
org.apache.ratis.server.leader.FollowerInfo: node2@group-EE11FFDD70A4->node1: 
setNextIndex nextIndex: updateUnconditionally 645892786 -> 645891323
2024/08/06 00:00:03,344 [grpc-default-executor-160] WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: 
node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: received 
INCONSISTENCY reply with nextIndex 645891323, errorCount=211, 
request=AppendEntriesRequest:cid=51646279,entriesCount=744,entries=(t:626, 
i:645892117)...(t:626, i:645892860)
2024/08/06 00:00:03,344 [grpc-default-executor-160] INFO 
org.apache.ratis.server.leader.FollowerInfo: node2@group-EE11FFDD70A4->node1: 
setNextIndex nextIndex: updateUnconditionally 645892861 -> 645891323
2024/08/06 00:00:05,103 [timer1] WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: 
node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: Failed appendEntries 
(Repeated 4 times in the last 5.009s): 
org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: CANCELLED: 
RST_STREAM closed stream. HTTP/2 error code: CANCEL
2024/08/06 00:00:05,104 [timer2] WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: 
node2@group-EE11FFDD70A4->node1-GrpcLogAppender: Follower failed (request=null, 
errorCount=212); keep nextIndex (645892117) unchanged and retry. (Repeated 4 
times in the last 5.001s)
2024/08/06 00:00:05,833 [grpc-default-executor-160] WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: 
node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: received 
INCONSISTENCY reply with nextIndex 645891323, errorCount=213, 
request=AppendEntriesRequest:cid=51646281,entriesCount=669,entries=(t:626, 
i:645892117)...(t:626, i:645892785)
2024/08/06 00:00:05,833 [grpc-default-executor-160] INFO 
org.apache.ratis.server.leader.FollowerInfo: node2@group-EE11FFDD70A4->node1: 
setNextIndex nextIndex: updateUnconditionally 645892786 -> 645891323
2024/08/06 00:00:05,996 [grpc-default-executor-160] WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: 
node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: Failed appendEntries: 
org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: CANCELLED: 
RST_STREAM closed stream. HTTP/2 error code: CANCEL
2024/08/06 00:00:05,996 [grpc-default-executor-160] WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: 
node2@group-EE11FFDD70A4->node1-GrpcLogAppender: Follower failed (request=null, 
errorCount=214); keep nextIndex (645892117) unchanged and retry.
2024/08/06 00:00:10,792 [grpc-default-executor-210] WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: 
node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: received 
INCONSISTENCY reply with nextIndex 645891323, errorCount=215, 
request=AppendEntriesRequest:cid=51646283,entriesCount=0
2024/08/06 00:00:10,792 [grpc-default-executor-210] INFO 
org.apache.ratis.server.leader.FollowerInfo: node2@group-EE11FFDD70A4->node1: 
setNextIndex nextIndex: updateUnconditionally 645892117 -> 645891323
2024/08/06 00:00:10,996 [timer5] WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: 
node2@group-EE11FFDD70A4->node1-GrpcLogAppender: Follower failed (request=null, 
errorCount=216); keep nextIndex (645892117) unchanged and retry. (Repeated 2 
times in the last 5.000s)
2024/08/06 00:00:10,996 [timer4] WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: 
node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: Failed appendEntries 
(Repeated 2 times in the last 5.000s): 
org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: CANCELLED: 
RST_STREAM closed stream. HTTP/2 error code: CANCEL
2024/08/06 00:00:15,014 [grpc-default-executor-160] WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: 
node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: received 
INCONSISTENCY reply with nextIndex 645891323, errorCount=217, 
request=AppendEntriesRequest:cid=51646285,entriesCount=0
2024/08/06 00:00:15,014 [grpc-default-executor-160] INFO 
org.apache.ratis.server.leader.FollowerInfo: node2@group-EE11FFDD70A4->node1: 
setNextIndex nextIndex: updateUnconditionally 645892117 -> 645891323

  was:
We found in the leader log of ratis that the follower replied leader nextIndex 
= 645891323, and the leader also executed setNextIndex nextIndex: 
updateUnconditionally 645892117 -> 645891323, but this process was repeated 
continuously, causing the follower to become unavailable.

The following is the log of the leader node:

 

2024/08/05 23:59:05,943 [grpc-default-executor-98] WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: 
node2@group-EE11FFDD70A4->node1-GrpcLogAppender: Follower failed (request=null, 
errorCount=184); keep nextIndex (645892117) unchanged and retry.
2024/08/05 23:59:10,758 [grpc-default-executor-160] WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: 
node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: received 
INCONSISTENCY reply with nextIndex 645891323, errorCount=185, 
request=AppendEntriesRequest:cid=51646253,entriesCount=0
2024/08/05 23:59:10,758 [grpc-default-executor-160] INFO 
org.apache.ratis.server.leader.FollowerInfo: node2@group-EE11FFDD70A4->node1: 
setNextIndex nextIndex: updateUnconditionally 645892117 -> 645891323
2024/08/05 23:59:10,943 [timer5] WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: 
node2@group-EE11FFDD70A4->node1-GrpcLogAppender: Follower failed (request=null, 
errorCount=186); keep nextIndex (645892117) unchanged and retry. (Repeated 2 
times in the last 5.000s)
2024/08/05 23:59:10,943 [timer4] WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: 
node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: Failed appendEntries 
(Repeated 2 times in the last 5.000s): 
org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: CANCELLED: 
RST_STREAM closed stream. HTTP/2 error code: CANCEL
2024/08/05 23:59:17,561 [grpc-default-executor-98] WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: 
node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: received 
INCONSISTENCY reply with nextIndex 645891323, errorCount=187, 
request=AppendEntriesRequest:cid=51646255,entriesCount=0
2024/08/05 23:59:17,561 [grpc-default-executor-98] INFO 
org.apache.ratis.server.leader.FollowerInfo: node2@group-EE11FFDD70A4->node1: 
setNextIndex nextIndex: updateUnconditionally 645892117 -> 645891323
2024/08/05 23:59:17,670 [grpc-default-executor-98] WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: 
node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: Failed appendEntries: 
org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: CANCELLED: 
RST_STREAM closed stream. HTTP/2 error code: CANCEL
2024/08/05 23:59:17,670 [grpc-default-executor-98] WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: 
node2@group-EE11FFDD70A4->node1-GrpcLogAppender: Follower failed (request=null, 
errorCount=188); keep nextIndex (645892117) unchanged and retry.
2024/08/05 23:59:21,426 [grpc-default-executor-160] WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: 
node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: received 
INCONSISTENCY reply with nextIndex 645891323, errorCount=189, 
request=AppendEntriesRequest:cid=51646257,entriesCount=0
2024/08/05 23:59:21,426 [grpc-default-executor-160] INFO 
org.apache.ratis.server.leader.FollowerInfo: node2@group-EE11FFDD70A4->node1: 
setNextIndex nextIndex: updateUnconditionally 645892117 -> 645891323


> The leader keeps sending inconsistent entries repeatedly to followers.
> ----------------------------------------------------------------------
>
>                 Key: RATIS-2135
>                 URL: https://issues.apache.org/jira/browse/RATIS-2135
>             Project: Ratis
>          Issue Type: Bug
>    Affects Versions: 3.0.1
>            Reporter: guangbao zhao
>            Priority: Major
>
> We found in the leader log of ratis that the follower replied leader 
> nextIndex = 645891323, and the leader also executed setNextIndex nextIndex: 
> updateUnconditionally 645892117 -> 645891323, but this process was repeated 
> continuously, causing the follower to become unavailable.
> The following is the log of the leader node:
>  
> 2024/08/06 00:00:00,094 [grpc-default-executor-210] WARN 
> org.apache.ratis.grpc.server.GrpcLogAppender: 
> node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: Failed 
> appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: 
> CANCELLED: RST_STREAM closed stream. HTTP/2 error code: CANCEL
> 2024/08/06 00:00:00,103 [grpc-default-executor-210] WARN 
> org.apache.ratis.grpc.server.GrpcLogAppender: 
> node2@group-EE11FFDD70A4->node1-GrpcLogAppender: Follower failed 
> (request=null, errorCount=206); keep nextIndex (645892117) unchanged and 
> retry.
> 2024/08/06 00:00:00,400 [grpc-default-executor-160] WARN 
> org.apache.ratis.grpc.server.GrpcLogAppender: 
> node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: received 
> INCONSISTENCY reply with nextIndex 645891323, errorCount=207, 
> request=AppendEntriesRequest:cid=51646275,entriesCount=744,entries=(t:626, 
> i:645892117)...(t:626, i:645892860)
> 2024/08/06 00:00:00,400 [grpc-default-executor-160] INFO 
> org.apache.ratis.server.leader.FollowerInfo: node2@group-EE11FFDD70A4->node1: 
> setNextIndex nextIndex: updateUnconditionally 645892861 -> 645891323
> 2024/08/06 00:00:02,888 [grpc-default-executor-210] WARN 
> org.apache.ratis.grpc.server.GrpcLogAppender: 
> node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: received 
> INCONSISTENCY reply with nextIndex 645891323, errorCount=209, 
> request=AppendEntriesRequest:cid=51646277,entriesCount=669,entries=(t:626, 
> i:645892117)...(t:626, i:645892785)
> 2024/08/06 00:00:02,888 [grpc-default-executor-210] INFO 
> org.apache.ratis.server.leader.FollowerInfo: node2@group-EE11FFDD70A4->node1: 
> setNextIndex nextIndex: updateUnconditionally 645892786 -> 645891323
> 2024/08/06 00:00:03,344 [grpc-default-executor-160] WARN 
> org.apache.ratis.grpc.server.GrpcLogAppender: 
> node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: received 
> INCONSISTENCY reply with nextIndex 645891323, errorCount=211, 
> request=AppendEntriesRequest:cid=51646279,entriesCount=744,entries=(t:626, 
> i:645892117)...(t:626, i:645892860)
> 2024/08/06 00:00:03,344 [grpc-default-executor-160] INFO 
> org.apache.ratis.server.leader.FollowerInfo: node2@group-EE11FFDD70A4->node1: 
> setNextIndex nextIndex: updateUnconditionally 645892861 -> 645891323
> 2024/08/06 00:00:05,103 [timer1] WARN 
> org.apache.ratis.grpc.server.GrpcLogAppender: 
> node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: Failed 
> appendEntries (Repeated 4 times in the last 5.009s): 
> org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: CANCELLED: 
> RST_STREAM closed stream. HTTP/2 error code: CANCEL
> 2024/08/06 00:00:05,104 [timer2] WARN 
> org.apache.ratis.grpc.server.GrpcLogAppender: 
> node2@group-EE11FFDD70A4->node1-GrpcLogAppender: Follower failed 
> (request=null, errorCount=212); keep nextIndex (645892117) unchanged and 
> retry. (Repeated 4 times in the last 5.001s)
> 2024/08/06 00:00:05,833 [grpc-default-executor-160] WARN 
> org.apache.ratis.grpc.server.GrpcLogAppender: 
> node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: received 
> INCONSISTENCY reply with nextIndex 645891323, errorCount=213, 
> request=AppendEntriesRequest:cid=51646281,entriesCount=669,entries=(t:626, 
> i:645892117)...(t:626, i:645892785)
> 2024/08/06 00:00:05,833 [grpc-default-executor-160] INFO 
> org.apache.ratis.server.leader.FollowerInfo: node2@group-EE11FFDD70A4->node1: 
> setNextIndex nextIndex: updateUnconditionally 645892786 -> 645891323
> 2024/08/06 00:00:05,996 [grpc-default-executor-160] WARN 
> org.apache.ratis.grpc.server.GrpcLogAppender: 
> node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: Failed 
> appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: 
> CANCELLED: RST_STREAM closed stream. HTTP/2 error code: CANCEL
> 2024/08/06 00:00:05,996 [grpc-default-executor-160] WARN 
> org.apache.ratis.grpc.server.GrpcLogAppender: 
> node2@group-EE11FFDD70A4->node1-GrpcLogAppender: Follower failed 
> (request=null, errorCount=214); keep nextIndex (645892117) unchanged and 
> retry.
> 2024/08/06 00:00:10,792 [grpc-default-executor-210] WARN 
> org.apache.ratis.grpc.server.GrpcLogAppender: 
> node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: received 
> INCONSISTENCY reply with nextIndex 645891323, errorCount=215, 
> request=AppendEntriesRequest:cid=51646283,entriesCount=0
> 2024/08/06 00:00:10,792 [grpc-default-executor-210] INFO 
> org.apache.ratis.server.leader.FollowerInfo: node2@group-EE11FFDD70A4->node1: 
> setNextIndex nextIndex: updateUnconditionally 645892117 -> 645891323
> 2024/08/06 00:00:10,996 [timer5] WARN 
> org.apache.ratis.grpc.server.GrpcLogAppender: 
> node2@group-EE11FFDD70A4->node1-GrpcLogAppender: Follower failed 
> (request=null, errorCount=216); keep nextIndex (645892117) unchanged and 
> retry. (Repeated 2 times in the last 5.000s)
> 2024/08/06 00:00:10,996 [timer4] WARN 
> org.apache.ratis.grpc.server.GrpcLogAppender: 
> node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: Failed 
> appendEntries (Repeated 2 times in the last 5.000s): 
> org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: CANCELLED: 
> RST_STREAM closed stream. HTTP/2 error code: CANCEL
> 2024/08/06 00:00:15,014 [grpc-default-executor-160] WARN 
> org.apache.ratis.grpc.server.GrpcLogAppender: 
> node2@group-EE11FFDD70A4->node1-AppendLogResponseHandler: received 
> INCONSISTENCY reply with nextIndex 645891323, errorCount=217, 
> request=AppendEntriesRequest:cid=51646285,entriesCount=0
> 2024/08/06 00:00:15,014 [grpc-default-executor-160] INFO 
> org.apache.ratis.server.leader.FollowerInfo: node2@group-EE11FFDD70A4->node1: 
> setNextIndex nextIndex: updateUnconditionally 645892117 -> 645891323



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to