Song Ziyang created RATIS-1782:
----------------------------------

             Summary: gRPC installSnapshot timeout handler malfunctioning 
                 Key: RATIS-1782
                 URL: https://issues.apache.org/jira/browse/RATIS-1782
             Project: Ratis
          Issue Type: Bug
          Components: gRPC, snapshot
    Affects Versions: 2.4.1
            Reporter: Song Ziyang


When the gRPC logAppender fails to install a snapshot on a follower because of
a timeout, the onError callback is invoked and resetClient is called.
However, the resetClient[1] handler does not set installSnapshotResponseHandler
to null (unlike appendLogResponseHandler). As a result, pending RPCs in the old
installSnapshot pipe will time out and call onError again at some later point,
disrupting installSnapshot requests that are in progress by then.

[1] 
https://github.com/apache/ratis/blob/18eacaed31e4965a9c400d86409a88fea21fc18a/ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java#L117-L120
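
The failure mode described above can be reproduced in a small, self-contained
sketch. This is not the Ratis code: the Appender and Handler classes, the
guardStale flag, and the identity check are all hypothetical names and choices
made for illustration, and the guard shown is only one possible way to ignore
late callbacks from an old installSnapshot pipe.

```java
public class StaleHandlerSketch {
  /** Hypothetical stand-in for the log appender; NOT the real GrpcLogAppender. */
  static class Appender {
    private Handler current; // handler of the live installSnapshot stream
    private int resets;      // how many times the client was reset

    Handler startInstallSnapshot() {
      current = new Handler(this);
      return current;
    }

    /** Error callback; with guardStale set, callbacks from old handlers are ignored. */
    void onError(Handler source, boolean guardStale) {
      if (guardStale && source != current) {
        return; // late callback from an old stream: leave the new stream alone
      }
      resets++;
      current = null; // drop the handler reference as part of the reset
    }
  }

  /** Hypothetical response handler that reports a timeout to its appender. */
  static class Handler {
    private final Appender appender;

    Handler(Appender appender) {
      this.appender = appender;
    }

    void timeout(boolean guardStale) {
      appender.onError(this, guardStale);
    }
  }

  /** Returns the number of resets after one real timeout and one late timeout. */
  static int run(boolean guardStale) {
    Appender appender = new Appender();
    Handler old = appender.startInstallSnapshot();
    old.timeout(guardStale);         // first timeout: a legitimate reset
    appender.startInstallSnapshot(); // a new installSnapshot request begins
    old.timeout(guardStale);         // pending RPC in the old pipe times out later
    return appender.resets;
  }

  public static void main(String[] args) {
    System.out.println("resets without guard: " + run(false)); // 2
    System.out.println("resets with guard:    " + run(true));  // 1
  }
}
```

Without the guard, the late timeout from the old handler triggers a second
reset while the new request is in flight; with it, only the first timeout
resets the client.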



--
This message was sent by Atlassian Jira
(v8.20.10#820010)