Song Ziyang created RATIS-1782:
----------------------------------
Summary: gRPC installSnapshot timeout handler malfunctioning
Key: RATIS-1782
URL: https://issues.apache.org/jira/browse/RATIS-1782
Project: Ratis
Issue Type: Bug
Components: gRPC, snapshot
Affects Versions: 2.4.1
Reporter: Song Ziyang
When the gRPC logAppender fails to install a snapshot on a follower because
of a timeout, the onError callback is invoked and resetClient is called.
However, this resetClient[1] handler does not set
installSnapshotResponseHandler to null (unlike appendLogResponseHandler). As
a result, pending RPCs in the old installSnapshot pipe will time out and call
onError again at some point in the future, disrupting later in-flight
installSnapshot requests.
[1]
https://github.com/apache/ratis/blob/18eacaed31e4965a9c400d86409a88fea21fc18a/ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java#L117-L120
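
The failure mode described above can be modelled in a few lines. This is a
minimal sketch and not Ratis code: StaleHandlerSketch, Appender,
SnapshotHandler, startInstallSnapshot and the "guard" flag are all
hypothetical names, and the identity check shown is only one possible way to
make a stale handler harmless, not necessarily what an actual patch would do.

```java
// Hypothetical model of the stale-callback problem; none of these types are Ratis APIs.
class StaleHandlerSketch {

  /** Stand-in for a per-stream installSnapshot response handler. */
  static final class SnapshotHandler {
    private final Appender appender;
    private final boolean guard;

    SnapshotHandler(Appender appender, boolean guard) {
      this.appender = appender;
      this.guard = guard;
    }

    /** Called when an RPC on this handler's stream times out. */
    void onError() {
      // With the guard enabled, a handler that is no longer the current one
      // ignores late errors instead of resetting the appender again.
      if (guard && appender.current != this) {
        return;
      }
      appender.resetClient();
    }
  }

  /** Stand-in for the log appender that owns the current handler. */
  static final class Appender {
    private final boolean guard;
    SnapshotHandler current;

    Appender(boolean guard) {
      this.guard = guard;
    }

    SnapshotHandler startInstallSnapshot() {
      current = new SnapshotHandler(this, guard);
      return current;
    }

    void resetClient() {
      // Dropping the reference abandons whichever attempt is in progress.
      current = null;
    }
  }

  /** Returns true if the retried installSnapshot is still current after a late error. */
  static boolean retrySurvivesLateError(boolean guard) {
    Appender appender = new Appender(guard);
    SnapshotHandler old = appender.startInstallSnapshot();
    old.onError();                                   // first timeout: legitimate reset
    SnapshotHandler retry = appender.startInstallSnapshot();
    old.onError();                                   // late timeout from the old pipe
    return appender.current == retry;
  }

  public static void main(String[] args) {
    System.out.println("unguarded retry survives: " + retrySurvivesLateError(false));
    System.out.println("guarded retry survives:   " + retrySurvivesLateError(true));
  }
}
```

In this model the unguarded run reports the retry as disrupted by the late
error from the old handler, while the guarded run leaves the retry intact.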