Hi Tsz-Wo,

Thanks so much for your reply. So I was wrong, but I still can't figure out 
why this would happen.

Here are some logs from the partitioned server. This server was notified that 
it had become leader and then tried to write a message through 
RaftServer.submitClientRequestAsync.

At the same time, it lost connection with all followers.

The server keeps calling RaftServer.submitClientRequestAsync as long as the 
call fails and it has not been told to give up leadership through 
StateMachine.notifyLeaderChanged or StateMachine.notifyNotLeader.
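
The retry behaviour described above can be sketched roughly as follows. This 
is a simplified illustration under assumptions, not the actual application 
code: the names RetryUntilNotLeader, submitWithRetry and isLeader are 
hypothetical, and the Supplier stands in for the real call to 
RaftServer.submitClientRequestAsync.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

final class RetryUntilNotLeader {
    // Hypothetical flag: assumed to be cleared by the application's handlers for
    // StateMachine.notifyLeaderChanged / StateMachine.notifyNotLeader.
    static final AtomicBoolean isLeader = new AtomicBoolean(true);

    // Keeps resubmitting while this node still believes it is leader.
    // 'submit' is a stand-in for the asynchronous submit call.
    static <T> T submitWithRetry(Supplier<CompletableFuture<T>> submit,
                                 long timeoutMs, long backoffMs)
            throws InterruptedException {
        while (isLeader.get()) {
            try {
                // Wait for the reply; a partitioned leader never commits,
                // so this is where the TimeoutException would surface.
                return submit.get().get(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (ExecutionException | TimeoutException e) {
                // The call failed or timed out: pause, then try again.
                Thread.sleep(backoffMs);
            }
        }
        return null; // leadership was given up before any attempt succeeded
    }
}
```

With a loop like this, the only ways out are a successful reply or the 
leadership flag being cleared, which is why the missing notification matters.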

Could you give me a hint about what is going on in this log? The Ratis 
version is 2.0.0.




[2021-05-13 03:11:30,048] [INFO] [main] [user-application] - sendAsync Continue 
cause org.apache.ratis.protocol.exceptions.LeaderNotReadyException: 
n2p8848hn2@group-ABB3109A44C1 is in LEADER state but not ready yet.

[2021-05-13 03:11:33,073] [WARN] 
[java.util.concurrent.ThreadPoolExecutor$Worker@4bd7bd5d[State = -1, empty 
queue]] [org.apache.ratis.grpc.server.GrpcLogAppender] - 
n2p8848hn2@group-ABB3109A44C1->n1p8848hn1-GrpcLogAppender:  appendEntries 
Timeout, request=AppendEntriesRequest:cid=90,entriesCount=1,lastEntry=(t:13, 
i:3497)

[2021-05-13 03:11:33,077] [WARN] 
[java.util.concurrent.ThreadPoolExecutor$Worker@4bd7bd5d[State = -1, empty 
queue]] [org.apache.ratis.grpc.server.GrpcLogAppender] - 
n2p8848hn2@group-ABB3109A44C1->n3p8848hn3-GrpcLogAppender:  appendEntries 
Timeout, request=AppendEntriesRequest:cid=90,entriesCount=1,lastEntry=(t:13, 
i:3497)

[2021-05-13 03:11:35,074] [INFO] [main] [user-application] - Failed to submit 
start entry: java.util.concurrent.TimeoutException

[2021-05-13 03:11:36,074] [WARN] 
[java.util.concurrent.ThreadPoolExecutor$Worker@4bd7bd5d[State = -1, empty 
queue]] [org.apache.ratis.grpc.server.GrpcLogAppender] - 
n2p8848hn2@group-ABB3109A44C1->n1p8848hn1-GrpcLogAppender: HEARTBEAT 
appendEntries Timeout, 
request=AppendEntriesRequest:cid=381,entriesCount=0,lastEntry=null

[2021-05-13 03:11:36,075] [INFO] [main] [user-application] - sendAsync again

========== start repeating

[2021-05-13 03:11:36,078] [WARN] 
[java.util.concurrent.ThreadPoolExecutor$Worker@4bd7bd5d[State = -1, empty 
queue]] [org.apache.ratis.grpc.server.GrpcLogAppender] - 
n2p8848hn2@group-ABB3109A44C1->n3p8848hn3-GrpcLogAppender: HEARTBEAT 
appendEntries Timeout, 
request=AppendEntriesRequest:cid=381,entriesCount=0,lastEntry=null

[2021-05-13 03:11:39,075] [WARN] 
[java.util.concurrent.ThreadPoolExecutor$Worker@4bd7bd5d[State = -1, empty 
queue]] [org.apache.ratis.grpc.server.GrpcLogAppender] - 
n2p8848hn2@group-ABB3109A44C1->n1p8848hn1-GrpcLogAppender: HEARTBEAT 
appendEntries Timeout, 
request=AppendEntriesRequest:cid=672,entriesCount=0,lastEntry=null

[2021-05-13 03:11:39,077] [WARN] 
[java.util.concurrent.ThreadPoolExecutor$Worker@4bd7bd5d[State = -1, empty 
queue]] [org.apache.ratis.grpc.server.GrpcLogAppender] - 
n2p8848hn2@group-ABB3109A44C1->n3p8848hn3-GrpcLogAppender:  appendEntries 
Timeout, request=AppendEntriesRequest:cid=673,entriesCount=1,lastEntry=(t:13, 
i:3498)

[2021-05-13 03:11:39,077] [WARN] 
[java.util.concurrent.ThreadPoolExecutor$Worker@4bd7bd5d[State = -1, empty 
queue]] [org.apache.ratis.grpc.server.GrpcLogAppender] - 
n2p8848hn2@group-ABB3109A44C1->n1p8848hn1-GrpcLogAppender:  appendEntries 
Timeout, request=AppendEntriesRequest:cid=673,entriesCount=1,lastEntry=(t:13, 
i:3498)

[2021-05-13 03:11:41,042] [INFO] [main] [user-application] - Failed to submit 
start entry: java.util.concurrent.TimeoutException

[2021-05-13 03:11:42,043] [INFO] [main] [user-application] - sendAsync again

...

[2021-05-13 03:12:03,054] [WARN] 
[java.util.concurrent.ThreadPoolExecutor$Worker@771ef45e[State = -1, empty 
queue]] [org.apache.ratis.grpc.server.GrpcLogAppender] - 
n2p8848hn2@group-ABB3109A44C1->n3p8848hn3-GrpcLogAppender:  appendEntries 
Timeout, request=AppendEntriesRequest:cid=3005,entriesCount=1,lastEntry=(t:13, 
i:3502)

...

[2021-05-13 03:26:14,306] [WARN] 
[java.util.concurrent.ThreadPoolExecutor$Worker@1dce73a8[State = -1, empty 
queue]] [org.apache.ratis.grpc.server.GrpcLogAppender] - 
n2p8848hn2@group-ABB3109A44C1->n1p8848hn1-GrpcLogAppender: HEARTBEAT 
appendEntries Timeout, 
request=AppendEntriesRequest:cid=74168,entriesCount=0,lastEntry=null

[2021-05-13 03:26:14,307] [WARN] 
[java.util.concurrent.ThreadPoolExecutor$Worker@1dce73a8[State = -1, empty 
queue]] [org.apache.ratis.grpc.server.GrpcLogAppender] - 
n2p8848hn2@group-ABB3109A44C1->n3p8848hn3-GrpcLogAppender: HEARTBEAT 
appendEntries Timeout, 
request=AppendEntriesRequest:cid=74180,entriesCount=0,lastEntry=null

[2021-05-13 03:26:16,307] [INFO] [main] [user-application] - Failed to submit 
start entry: java.util.concurrent.TimeoutException

[2021-05-13 03:26:17,307] [WARN] 
[java.util.concurrent.ThreadPoolExecutor$Worker@1dce73a8[State = -1, empty 
queue]] [org.apache.ratis.grpc.server.GrpcLogAppender] - 
n2p8848hn2@group-ABB3109A44C1->n1p8848hn1-GrpcLogAppender: HEARTBEAT 
appendEntries Timeout, 
request=AppendEntriesRequest:cid=74169,entriesCount=0,lastEntry=null

[2021-05-13 03:26:17,307] [INFO] [main] [user-application] - sendAsync again

[2021-05-13 03:26:17,308] [WARN] 
[java.util.concurrent.ThreadPoolExecutor$Worker@1dce73a8[State = -1, empty 
queue]] [org.apache.ratis.grpc.server.GrpcLogAppender] - 
n2p8848hn2@group-ABB3109A44C1->n3p8848hn3-GrpcLogAppender: HEARTBEAT 
appendEntries Timeout, 
request=AppendEntriesRequest:cid=74181,entriesCount=0,lastEntry=null

========== network healed, end repeating

[2021-05-13 03:26:17,552] [INFO] [grpc-default-executor-3] 
[org.apache.ratis.server.RaftServer$Division] - n2p8848hn2@group-ABB3109A44C1: 
change Leader from n2p8848hn2 to null at term 14 for updateCurrentTerm

[2021-05-13 03:26:17,552] [INFO] [grpc-default-executor-3] 
[org.apache.ratis.server.RaftServer$Division] - n2p8848hn2@group-ABB3109A44C1: 
changes role from    LEADER to FOLLOWER at term 14 for appendEntries




By the way, there is no log output from LeaderStateImpl.checkLeadership.




Thanks again.

ly
