Tsz-wo Sze created RATIS-1884:
---------------------------------

             Summary: Fix retry cache warning condition 
                 Key: RATIS-1884
                 URL: https://issues.apache.org/jira/browse/RATIS-1884
             Project: Ratis
          Issue Type: Bug
          Components: server
            Reporter: Tsz-wo Sze
            Assignee: Song Ziyang


I made a mistake in the previous PR [#904|https://github.com/apache/ratis/pull/904]. 
The conditions here are a bit tricky.

The cache entry is expected to be *not completed normally* when 
{{replyPendingRequest}} is called, since we complete this cache entry at the very 
end of {{replyPendingRequest}}.

The explanation in the previous PR of why this assertion fails is incorrect. The 
real path leading to the error is:
If a request r arrives and is committed, but then times out because apply is 
blocked (it may be stuck in synchronous snapshotting), the client may choose to 
retry r. However, if the retry gap exceeds the retryCache expiration duration (in 
our case, it does), the very same request r is committed *again*. After the 
snapshotting finishes, applying these two identical requests causes the 
assertion to fail.

Maybe we should recommend that users set a retry cache expiration duration longer 
than the client retry-waiting duration?
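
To make the failure path and the suggested mitigation concrete, here is a minimal sketch in plain Java. It is a toy model, not the Ratis RetryCache: the class {{RetryCacheSketch}}, its methods, the call id and all durations are hypothetical and chosen only for illustration. It assumes an entry simply expires a fixed time after it is created.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a retry cache with time-based expiry; NOT the Ratis implementation. */
class RetryCacheSketch {
  /** Maps a (hypothetical) call id to the time in ms at which its entry was created. */
  private final Map<Long, Long> entries = new HashMap<>();
  private final long expiryMs;

  RetryCacheSketch(long expiryMs) {
    this.expiryMs = expiryMs;
  }

  /** Returns true if the request has no live entry and so would be committed. */
  boolean shouldCommit(long callId, long nowMs) {
    final Long created = entries.get(callId);
    if (created != null && nowMs - created < expiryMs) {
      return false; // live entry: recognized as a retry, not committed again
    }
    entries.put(callId, nowMs); // missing or expired entry: treated as a new request
    return true;
  }

  /** Counts how often request r is committed when the client retries once after retryGapMs. */
  static int commitsForOneRetry(long expiryMs, long retryGapMs) {
    final RetryCacheSketch cache = new RetryCacheSketch(expiryMs);
    int commits = 0;
    if (cache.shouldCommit(1L, 0L)) {
      commits++; // original request r
    }
    if (cache.shouldCommit(1L, retryGapMs)) {
      commits++; // retry of r, sent while apply is still blocked
    }
    return commits;
  }

  public static void main(String[] args) {
    // Expiry (60s) shorter than the retry gap (90s): r is committed twice.
    System.out.println(commitsForOneRetry(60_000L, 90_000L));
    // Expiry (120s) longer than the retry gap (90s): the retry is recognized.
    System.out.println(commitsForOneRetry(120_000L, 90_000L));
  }
}
```

In this model the first call prints 2 (the duplicate commit that later trips the assertion) and the second prints 1, which is the behaviour the recommendation above aims for.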

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
