ArafatKhan2198 opened a new pull request, #5700:
URL: https://github.com/apache/ozone/pull/5700

   ## What changes were proposed in this pull request?
   `TestCommitWatcher.testReleaseBuffersOnException` fails with the following 
exceptions: `AlreadyClosedException`, `ExecutionException`, 
`RaftRetryFailureException`, `NotLeaderException`. 
   
   ```
   
   org.apache.hadoop.hdds.scm.storage.TestCommitWatcher.testReleaseBuffersOnException -- Time elapsed: 28.88 s <<< ERROR!
   java.util.concurrent.ExecutionException: org.apache.ratis.protocol.exceptions.AlreadyClosedException: SlidingWindow$Client:client-2D6A59F17A72->RAFT is closed.
        at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
        at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
        at org.apache.hadoop.hdds.scm.storage.TestCommitWatcher.testReleaseBuffersOnException(TestCommitWatcher.java:301)
   ...
   Caused by: org.apache.ratis.protocol.exceptions.AlreadyClosedException: SlidingWindow$Client:client-2D6A59F17A72->RAFT is closed.
        at org.apache.ratis.util.SlidingWindow$Client.alreadyClosed(SlidingWindow.java:406)
   ...
   Caused by: org.apache.ratis.protocol.exceptions.RaftRetryFailureException: Failed RaftClientRequest:client-2D6A59F17A72->31ca1c78-6c1a-481a-9835-ed9e1bd72d7b@group-4121B0A26A9F, cid=40, seq=1*, Watch(0), null for 3 attempts with RequestTypeDependentRetryPolicy{WRITE->ExceptionDependentRetry(maxAttempts=2147483647; defaultPolicy=MultipleLinearRandomRetry[5x5s, 5x10s, 5x15s, 5x20s, 5x25s, 10x60s]; map={org.apache.ratis.protocol.exceptions.GroupMismatchException->NoRetry, org.apache.ratis.protocol.exceptions.NotReplicatedException->NoRetry, org.apache.ratis.protocol.exceptions.ResourceUnavailableException->org.apache.ratis.retry.ExponentialBackoffRetry@571a7a22, org.apache.ratis.protocol.exceptions.StateMachineException->NoRetry, org.apache.ratis.protocol.exceptions.TimeoutIOException->org.apache.ratis.retry.ExponentialBackoffRetry@571a7a22}), WATCH->ExceptionDependentRetry(maxAttempts=2147483647; defaultPolicy=MultipleLinearRandomRetry[5x5s, 5x10s, 5x15s, 5x20s, 5x25s, 10x60s]; map={org.apache.ratis.protocol.exceptions.GroupMismatchException->NoRetry, org.apache.ratis.protocol.exceptions.NotReplicatedException->NoRetry, org.apache.ratis.protocol.exceptions.ResourceUnavailableException->org.apache.ratis.retry.ExponentialBackoffRetry@571a7a22, org.apache.ratis.protocol.exceptions.StateMachineException->NoRetry, org.apache.ratis.protocol.exceptions.TimeoutIOException->NoRetry})}
        at org.apache.ratis.client.impl.RaftClientImpl.noMoreRetries(RaftClientImpl.java:353)
        ... 20 more
   Caused by: org.apache.ratis.protocol.exceptions.NotLeaderException: Server 31ca1c78-6c1a-481a-9835-ed9e1bd72d7b@group-4121B0A26A9F is not the leader 66f0d3a6-9de9-4155-af63-8575b18d2e1d|10.1.0.11:15121
        at org.apache.ratis.client.impl.ClientProtoUtils.toRaftClientReply(ClientProtoUtils.java:397)
        at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers$1.onNext(GrpcClientProtocolClient.java:310)
        ... 11 more
   ```
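
The trace above nests several exceptions under the `ExecutionException` thrown by `CompletableFuture.get()`. As a rough illustration of how such a chain can be walked to find the innermost cause, here is a minimal JDK-only sketch. The `Stub...` exception classes, `CauseChainDemo`, and its helper methods are hypothetical stand-ins written for this example; they are not the Ratis classes and not code from this patch.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

class CauseChainDemo {

  // Hypothetical stand-ins for the exceptions named in the trace.
  static class StubAlreadyClosedException extends IOException {
    StubAlreadyClosedException(String msg, Throwable cause) { super(msg, cause); }
  }

  static class StubRaftRetryFailureException extends IOException {
    StubRaftRetryFailureException(String msg, Throwable cause) { super(msg, cause); }
  }

  static class StubNotLeaderException extends IOException {
    StubNotLeaderException(String msg) { super(msg); }
  }

  /** Returns every throwable in the cause chain, outermost first. */
  static List<Throwable> causeChain(Throwable t) {
    List<Throwable> chain = new ArrayList<>();
    // Stop on null or on a repeated element so a cyclic chain cannot loop forever.
    while (t != null && !chain.contains(t)) {
      chain.add(t);
      t = t.getCause();
    }
    return chain;
  }

  /** Returns the innermost cause, or null when given null. */
  static Throwable rootCause(Throwable t) {
    List<Throwable> chain = causeChain(t);
    return chain.isEmpty() ? null : chain.get(chain.size() - 1);
  }

  /** Builds a future that fails with a nested chain shaped like the one in the trace. */
  static CompletableFuture<Void> failedFuture() {
    Throwable notLeader = new StubNotLeaderException("server is not the leader");
    Throwable retryFailure = new StubRaftRetryFailureException("no more retries", notLeader);
    Throwable closed = new StubAlreadyClosedException("client is closed", retryFailure);
    CompletableFuture<Void> future = new CompletableFuture<>();
    future.completeExceptionally(closed);
    return future;
  }

  /** Calls get() and returns whatever it throws, or null on success. */
  static Throwable captureFailure(CompletableFuture<?> future) {
    try {
      future.get();
      return null;
    } catch (ExecutionException e) {
      return e; // get() wraps the original failure in an ExecutionException
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return e;
    }
  }

  public static void main(String[] args) {
    Throwable failure = captureFailure(failedFuture());
    for (Throwable t : causeChain(failure)) {
      System.out.println(t.getClass().getSimpleName() + ": " + t.getMessage());
    }
  }
}
```

In this sketch, `rootCause` applied to the captured failure returns the stand-in for `NotLeaderException`, which mirrors the last `Caused by` entry in the trace.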
   
   Proposed fix:
   - Reduce the number of datanodes and pipelines in the test's 
MiniOzoneCluster configuration.
   - With fewer datanodes and pipelines, resource contention and timing 
issues are less likely. Such issues often cause intermittent failures that 
are hard to reproduce and diagnose.
   
   ## What is the link to the Apache JIRA?
   https://issues.apache.org/jira/browse/HDDS-9766
   ## How was this patch tested?
   
   Ran the test 300 times in my fork; all runs passed:
   
   Test Run 1 :- https://github.com/ArafatKhan2198/ozone/actions/runs/7031682772
   Test Run 2 :- https://github.com/ArafatKhan2198/ozone/actions/runs/7031685390
   Test Run 3 :- https://github.com/ArafatKhan2198/ozone/actions/runs/7031688216


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

