Xinyu Tan created RATIS-2203:
--------------------------------

             Summary: Fixed issue where Ratis member changes are stuck
                 Key: RATIS-2203
                 URL: https://issues.apache.org/jira/browse/RATIS-2203
             Project: Ratis
          Issue Type: Improvement
            Reporter: Xinyu Tan
            Assignee: Xinyu Tan


Currently, in IoTDB, we use ratisclient to submit membership change requests, 
and rely on the 
[retrypolicy|https://github.com/apache/iotdb/blob/master/iotdb-core/consensus/src/main/java/org/apache/iotdb/consensus/ratis/RatisClient.java#L236-L249]
 to ensure that membership change requests with longer durations can eventually 
be successfully completed.

In a recent test, we discovered that Ratis might get stuck during a membership 
change. After a series of troubleshooting steps, we have mostly determined that 
this is a bug in the Ratis core.

Now, I will list the logs of the leader and the new replica to see if we can 
analyze the cause of the error together.

leader

{code:java}
2024-11-29 11:50:49,166 [19-server-thread1] INFO  o.a.r.s.i.RaftServerImpl:1296 
- 19@group-000200000061: receive setConfiguration 
SetConfigurationRequest:client-5FD616FE3614->19@group-000200000061, cid=1, 
seq=null, RW, null, SET_UNCONDITIONALLY, servers:[4|172.16.2.3:10750, 
19|172.16.2.18:10750], listeners:[] 
2024-11-29 11:50:49,168 [19-server-thread1] INFO  o.a.r.s.i.LeaderStateImpl:485 
- 19@group-000200000061-LeaderStateImpl: startSetConfiguration 
SetConfigurationRequest:client-5FD616FE3614->19@group-000200000061, cid=1, 
seq=null, RW, null, SET_UNCONDITIONALLY, servers:[4|172.16.2.3:10750, 
19|172.16.2.18:10750], listeners:[] 
2024-11-29 11:50:49,208 
[19@group-000200000061->4-GrpcLogAppender-LogAppenderDaemon] INFO  
o.a.r.s.i.FollowerInfoImpl:154 - Follower 19@group-000200000061->4 acknowledged 
installing snapshot 
2024-11-29 11:50:49,215 
[19@group-000200000061->4-GrpcLogAppender-LogAppenderDaemon] INFO  
o.a.r.g.s.GrpcServerProtocolClient:65 - Build channel for 4|172.16.2.3:10750 
2024-11-29 11:50:59,412 [timer2] WARN  o.a.r.g.s.GrpcLogAppender:441 - 
19@group-000200000061->4-GrpcLogAppender: Timed out appendEntries, 
errorCount=1, 
request=AppendEntriesRequest:cid=1,entriesCount=9198,entries=(t:1, i:0)...(t:1, 
i:9197) 
2024-11-29 11:50:59,732 [timer3] WARN  o.a.r.g.s.GrpcLogAppender:441 - 
19@group-000200000061->4-GrpcLogAppender: Timed out HEARTBEAT appendEntries, 
errorCount=2, request=AppendEntriesRequest:cid=2,entriesCount=0 
2024-11-29 11:51:00,238 [timer4] WARN  o.a.r.g.s.GrpcLogAppender:441 - 
19@group-000200000061->4-GrpcLogAppender: Timed out HEARTBEAT appendEntries, 
errorCount=3, request=AppendEntriesRequest:cid=3,entriesCount=0 
2024-11-29 11:51:00,740 [timer5] WARN  o.a.r.g.s.GrpcLogAppender:441 - 
19@group-000200000061->4-GrpcLogAppender: Timed out HEARTBEAT appendEntries, 
errorCount=4, request=AppendEntriesRequest:cid=4,entriesCount=0 
2024-11-29 11:51:01,906 [timer6] WARN  o.a.r.g.s.GrpcLogAppender:441 - 
19@group-000200000061->4-GrpcLogAppender: Timed out HEARTBEAT appendEntries, 
errorCount=5, request=AppendEntriesRequest:cid=6,entriesCount=0 
2024-11-29 11:51:02,406 [timer7] WARN  o.a.r.g.s.GrpcLogAppender:441 - 
19@group-000200000061->4-GrpcLogAppender: Timed out HEARTBEAT appendEntries, 
errorCount=6, request=AppendEntriesRequest:cid=7,entriesCount=0 
2024-11-29 11:51:02,908 [timer0] WARN  o.a.r.g.s.GrpcLogAppender:441 - 
19@group-000200000061->4-GrpcLogAppender: Timed out HEARTBEAT appendEntries, 
errorCount=7, request=AppendEntriesRequest:cid=8,entriesCount=0 
2024-11-29 11:51:03,086 [19-server-thread1] INFO  o.a.r.s.i.RaftServerImpl:1296 
- 19@group-000200000061: receive setConfiguration 
SetConfigurationRequest:client-5FD616FE3614->19@group-000200000061, cid=1, 
seq=null, RW, null, SET_UNCONDITIONALLY, servers:[4|172.16.2.3:10750, 
19|172.16.2.18:10750], listeners:[] 
2024-11-29 11:51:03,410 [timer1] WARN  o.a.r.g.s.GrpcLogAppender:441 - 
19@group-000200000061->4-GrpcLogAppender: Timed out HEARTBEAT appendEntries, 
errorCount=8, request=AppendEntriesRequest:cid=9,entriesCount=0 
2024-11-29 11:51:03,912 [timer2] WARN  o.a.r.g.s.GrpcLogAppender:441 - 
19@group-000200000061->4-GrpcLogAppender: Timed out HEARTBEAT appendEntries, 
errorCount=9, request=AppendEntriesRequest:cid=10,entriesCount=0 
2024-11-29 11:51:04,414 [timer3] WARN  o.a.r.g.s.GrpcLogAppender:441 - 
19@group-000200000061->4-GrpcLogAppender: Timed out HEARTBEAT appendEntries, 
errorCount=10, request=AppendEntriesRequest:cid=11,entriesCount=0 
2024-11-29 11:51:04,914 [timer4] WARN  o.a.r.g.s.GrpcLogAppender:441 - 
19@group-000200000061->4-GrpcLogAppender: Timed out HEARTBEAT appendEntries, 
errorCount=11, request=AppendEntriesRequest:cid=12,entriesCount=0 
2024-11-29 11:51:05,415 [timer5] WARN  o.a.r.g.s.GrpcLogAppender:441 - 
19@group-000200000061->4-GrpcLogAppender: Timed out HEARTBEAT appendEntries, 
errorCount=12, request=AppendEntriesRequest:cid=13,entriesCount=0 
2024-11-29 11:51:05,916 [timer6] WARN  o.a.r.g.s.GrpcLogAppender:441 - 
19@group-000200000061->4-GrpcLogAppender: Timed out HEARTBEAT appendEntries, 
errorCount=13, request=AppendEntriesRequest:cid=14,entriesCount=0 
2024-11-29 11:51:06,418 [timer7] WARN  o.a.r.g.s.GrpcLogAppender:441 - 
19@group-000200000061->4-GrpcLogAppender: Timed out HEARTBEAT appendEntries, 
errorCount=14, request=AppendEntriesRequest:cid=15,entriesCount=0 
2024-11-29 11:51:06,683 
[19@group-000200000061->4-GrpcLogAppender-LogAppenderDaemon] WARN  
o.a.r.g.s.GrpcLogAppender:293 - 19@group-000200000061->4-GrpcLogAppender: Wait 
interrupted by java.lang.InterruptedException 
2024-11-29 11:51:06,921 [timer0] WARN  o.a.r.g.s.GrpcLogAppender:441 - 
19@group-000200000061->4-GrpcLogAppender: Timed out HEARTBEAT appendEntries, 
errorCount=15, request=AppendEntriesRequest:cid=16,entriesCount=0 
2024-11-29 11:51:07,134 [19-server-thread1] INFO  o.a.r.s.i.RaftServerImpl:1296 
- 19@group-000200000061: receive setConfiguration 
SetConfigurationRequest:client-5FD616FE3614->19@group-000200000061, cid=1, 
seq=null, RW, null, SET_UNCONDITIONALLY, servers:[4|172.16.2.3:10750, 
19|172.16.2.18:10750], listeners:[] 
2024-11-29 11:51:07,135 [19-server-thread1] INFO  o.a.r.s.i.LeaderStateImpl:485 
- 19@group-000200000061-LeaderStateImpl: startSetConfiguration 
SetConfigurationRequest:client-5FD616FE3614->19@group-000200000061, cid=1, 
seq=null, RW, null, SET_UNCONDITIONALLY, servers:[4|172.16.2.3:10750, 
19|172.16.2.18:10750], listeners:[] 
2024-11-29 11:51:07,137 
[19@group-000200000061->4-GrpcLogAppender-LogAppenderDaemon] INFO  
o.a.r.s.i.FollowerInfoImpl:154 - Follower 19@group-000200000061->4 acknowledged 
installing snapshot 
2024-11-29 11:51:07,421 [timer1] WARN  o.a.r.g.s.GrpcLogAppender:441 - 
19@group-000200000061->4-GrpcLogAppender: Timed out HEARTBEAT appendEntries, 
errorCount=16, request=AppendEntriesRequest:cid=17,entriesCount=0 
2024-11-29 11:51:07,924 [timer2] WARN  o.a.r.g.s.GrpcLogAppender:441 - 
19@group-000200000061->4-GrpcLogAppender: Timed out HEARTBEAT appendEntries, 
errorCount=17, request=AppendEntriesRequest:cid=18,entriesCount=0 
2024-11-29 11:51:08,426 [timer3] WARN  o.a.r.g.s.GrpcLogAppender:441 - 
19@group-000200000061->4-GrpcLogAppender: Timed out HEARTBEAT appendEntries, 
errorCount=18, request=AppendEntriesRequest:cid=19,entriesCount=0 
2024-11-29 11:51:08,950 [grpc-default-executor-5] WARN  
o.a.r.g.s.GrpcLogAppender$AppendLogResponseHandler:519 - 
19@group-000200000061->4-AppendLogResponseHandler: received INCONSISTENCY reply 
with nextIndex 1, errorCount=1, 
request=AppendEntriesRequest:cid=1,entriesCount=9198,entries=(t:1, i:0)...(t:1, 
i:9197) 
2024-11-29 11:51:08,951 [grpc-default-executor-5] INFO  
o.a.r.s.i.FollowerInfoImpl:70 - 19@group-000200000061->4: setNextIndex 
nextIndex: updateUnconditionally 9198 -> 1 
2024-11-29 11:51:08,972 [grpc-default-executor-5] INFO  
o.a.r.g.s.GrpcLogAppender$AppendLogResponseHandler:550 - 
19@group-000200000061->4-AppendLogResponseHandler: follower responses 
appendEntries COMPLETED 
2024-11-29 11:51:08,975 [grpc-default-executor-5] INFO  
o.a.r.s.i.FollowerInfoImpl:70 - 19@group-000200000061->4: computeNextIndex 
nextIndex: updateUnconditionally 9198 -> 0 
2024-11-29 11:51:09,015 [grpc-default-executor-5] WARN  
o.a.r.g.s.GrpcLogAppender$AppendLogResponseHandler:519 - 
19@group-000200000061->4-AppendLogResponseHandler: received INCONSISTENCY reply 
with nextIndex 9198, errorCount=1, 
request=AppendEntriesRequest:cid=6,entriesCount=9197,entries=(t:1, i:1)...(t:1, 
i:9197) 
2024-11-29 11:51:09,022 [grpc-default-executor-1] INFO  
o.a.r.g.s.GrpcLogAppender$AppendLogResponseHandler:550 - 
19@group-000200000061->4-AppendLogResponseHandler: follower responses 
appendEntries COMPLETED 
{code}

new adding follower

{code:java}
2024-11-29 11:50:46,528 [4-groupManagement] INFO  o.a.r.s.i.RaftServerImpl:405 
- 4@group-000200000061: Successfully started. 
2024-11-29 11:50:46,915 [4-server-thread1] INFO  o.a.r.s.i.RaftServerImpl:1296 
- 4@group-000200000061: receive setConfiguration 
SetConfigurationRequest:client-5FD616FE3614->4@group-000200000061, cid=1, 
seq=null, RW, null, SET_UNCONDITIONALLY, servers:[4|172.16.2.3:10750, 
19|172.16.2.18:10750], listeners:[] 
2024-11-29 11:50:50,886 [4-server-thread1] INFO  o.a.r.s.i.RaftServerImpl:562 - 
4@group-000200000061: set firstElectionSinceStartup to false for appendEntries 
2024-11-29 11:50:50,886 [4-server-thread1] INFO  
o.a.i.d.c.s.s.SchemaRegionStateMachine:75 - Current node [nodeId: 4] is no 
longer the schema region leader [regionId: SchemaRegion[97]], the new leader is 
[nodeId:19] 
2024-11-29 11:50:50,886 [4-server-thread1] INFO  o.a.r.s.impl.ServerState:272 - 
4@group-000200000061: change Leader from null to 19 at term 1 for 
appendEntries, leader elected after 4575ms 
2024-11-29 11:50:50,887 [4-server-thread2] INFO  o.a.r.s.impl.ServerState:386 - 
4@group-000200000061: set configuration conf: {index: 0, 
cur=peers:[19|172.16.2.18:10750]|listeners:[], old=null} 
2024-11-29 11:50:50,887 [4-server-thread2] INFO  
o.a.r.s.r.s.SegmentedRaftLogWorker:438 - 
4@group-000200000061-SegmentedRaftLogWorker: Starting segment from index:0 
2024-11-29 11:50:50,888 [4@group-000200000061-SegmentedRaftLogWorker] INFO  
o.a.r.s.r.s.BufferedWriteChannel:61 - open log_inprogress_0 at position 0 
2024-11-29 11:50:56,801 [4@group-000200000061-SegmentedRaftLogWorker] INFO  
o.a.r.s.r.s.SegmentedRaftLogWorker$StartLogSegment:645 - 
4@group-000200000061-SegmentedRaftLogWorker: created new log segment 
/data1/iotdb/tablev2_rc5_1128_1fc99a6_1rep/data/datanode/consensus/schema_region/47474747-4747-4747-4747-000200000061/current/log_inprogress_0
 
2024-11-29 11:51:01,063 [4-server-thread21] INFO  o.a.r.s.i.RaftServerImpl:1296 
- 4@group-000200000061: receive setConfiguration 
SetConfigurationRequest:client-5FD616FE3614->4@group-000200000061, cid=1, 
seq=null, RW, null, SET_UNCONDITIONALLY, servers:[4|172.16.2.3:10750, 
19|172.16.2.18:10750], listeners:[] 
2024-11-29 11:51:05,118 [4-server-thread21] INFO  o.a.r.s.i.RaftServerImpl:1296 
- 4@group-000200000061: receive setConfiguration 
SetConfigurationRequest:client-5FD616FE3614->4@group-000200000061, cid=1, 
seq=null, RW, null, SET_UNCONDITIONALLY, servers:[4|172.16.2.3:10750, 
19|172.16.2.18:10750], listeners:[] 
2024-11-29 11:51:08,942 [4-server-thread22] INFO  o.a.r.s.i.RaftServerImpl:1638 
- 4@group-000200000061: Failed appendEntries as the first entry (index 0) 
already exists (snapshotIndex: -1, commitIndex: 0) 
2024-11-29 11:51:08,943 [4-server-thread22] INFO  o.a.r.s.i.RaftServerImpl:1580 
- 4@group-000200000061: appendEntries* reply 
19<-4#1:FAIL-t1,INCONSISTENCY,nextIndex=1,followerCommit=-1,matchIndex=-1 
2024-11-29 11:51:09,003 [4-server-thread2] INFO  o.a.r.s.i.RaftServerImpl:1638 
- 4@group-000200000061: Failed appendEntries as the first entry (index 1) 
already exists (snapshotIndex: -1, commitIndex: 9197) 
2024-11-29 11:51:09,004 [4-server-thread2] INFO  o.a.r.s.i.RaftServerImpl:1580 
- 4@group-000200000061: appendEntries* reply 
19<-4#6:FAIL-t1,INCONSISTENCY,nextIndex=9198,followerCommit=9197,matchIndex=-1 
{code}

>From the logs, we can see that the synchronization link from the leader to the 
>new follower has been closed, so the logs cannot be committed. This causes the 
>membership change thread to be stuck indefinitely while waiting for the 
>follower to catch up.

Based on the logs above, I have a few suspicions:
- Why is the gRPC stream API on the leader side being completed? I haven't seen 
any place on the follower side where this function is called.
- The commitIndex obtained in the checkInconsistentAppendEntries function on 
the follower side seems inconsistent with the one obtained in 
appendEntriesAsync. For example, in the following two lines of logs, the same 
thread gets a commitIndex of -1 outside the lock, and an index of 0 inside the 
lock. This causes the subsequent synchronization to be completely stuck. I'm 
not sure if this is a concurrency issue, but perhaps we should move this line 
inside the 
[lock|ttps://github.com/apache/ratis/blob/master/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L1591]?


{code:java}
2024-11-29 11:51:08,942 [4-server-thread22] INFO  o.a.r.s.i.RaftServerImpl:1638 
- 4@group-000200000061: Failed appendEntries as the first entry (index 0) 
already exists (snapshotIndex: -1, commitIndex: 0) 
2024-11-29 11:51:08,943 [4-server-thread22] INFO  o.a.r.s.i.RaftServerImpl:1580 
- 4@group-000200000061: appendEntries* reply 
19<-4#1:FAIL-t1,INCONSISTENCY,nextIndex=1,followerCommit=-1,matchIndex=-1 
{code}

What's your opinion? [~szetszwo]





--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to