[ 
https://issues.apache.org/jira/browse/HDDS-10788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853600#comment-17853600
 ] 

Raju Balpande edited comment on HDDS-10788 at 6/10/24 9:55 AM:
---------------------------------------------------------------

I see a failure ratio of 48/1000 (4.8%) in 
[https://github.com/raju-balpande/apache_ozone/actions/runs/9398762434] with 
the following error:
{noformat}
    [INFO] Running org.apache.hadoop.ozone.client.rpc.TestWatchForCommit   
    Error:  Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
46.541 s <<< FAILURE! - in org.apache.hadoop.ozone.client.rpc.TestWatchForCommit
    Error:  
org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.testWatchForCommitForRetryfailure
  Time elapsed: 46.509 s  <<< FAILURE!
    java.lang.AssertionError: 

    Expecting actual:    
      "org.apache.ratis.protocol.exceptions.RaftRetryFailureException: Failed 
RaftClientRequest:client-54F8BAFFDCC6->caf3bd91-ec62-492c-910c-c9010abb2965@group-6E03D52A6FF6,
 cid=16, seq=null, Watch-MAJORITY_COMMITTED(36), null for 3 attempts with 
RequestTypeDependentRetryPolicy{WRITE->ExceptionDependentRetry(maxAttempts=2147483647;
 defaultPolicy=MultipleLinearRandomRetry[5x5s, 5x10s, 5x15s, 5x20s, 5x25s, 
10x60s]; 
map={org.apache.ratis.protocol.exceptions.GroupMismatchException->NoRetry, 
org.apache.ratis.protocol.exceptions.NotReplicatedException->NoRetry, 
org.apache.ratis.protocol.exceptions.ResourceUnavailableException->org.apache.ratis.retry.ExponentialBackoffRetry@64f32a1f,
 org.apache.ratis.protocol.exceptions.StateMachineException->NoRetry, 
org.apache.ratis.protocol.exceptions.TimeoutIOException->org.apache.ratis.retry.ExponentialBackoffRetry@64f32a1f}),
 WATCH->ExceptionDependentRetry(maxAttempts=2147483647; 
defaultPolicy=MultipleLinearRandomRetry[5x5s, 5x10s, 5x15s, 5x20s, 5x25s, 
10x60s]; 
map={org.apache.ratis.protocol.exceptions.GroupMismatchException->NoRetry, 
org.apache.ratis.protocol.exceptions.NotReplicatedException->NoRetry, 
org.apache.ratis.protocol.exceptions.ResourceUnavailableException->org.apache.ratis.retry.ExponentialBackoffRetry@64f32a1f,
 org.apache.ratis.protocol.exceptions.StateMachineException->NoRetry, 
org.apache.ratis.protocol.exceptions.TimeoutIOException->NoRetry})}"

    not to contain:    
      "Watch-MAJORITY_COMMITTED"

        at 
org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.testWatchForCommitForRetryfailure(TestWatchForCommit.java:296)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at java.util.ArrayList.forEach(ArrayList.java:1259)
        at java.util.ArrayList.forEach(ArrayList.java:1259){noformat}
With this fix, the test passes 100% of the time, as seen in 
[https://github.com/raju-balpande/apache_ozone/actions/runs/9427808268]


was (Author: JIRAUSER296391):
I see a failure ratio of 48/1000 (4.8%) in 
[https://github.com/raju-balpande/apache_ozone/actions/runs/9398762434] with 
the following error:
{noformat}
    [INFO] Running org.apache.hadoop.ozone.client.rpc.TestWatchForCommit
    Error:  Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
46.541 s <<< FAILURE! - in org.apache.hadoop.ozone.client.rpc.TestWatchForCommit
    Error:  
org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.testWatchForCommitForRetryfailure
  Time elapsed: 46.509 s  <<< FAILURE!
    java.lang.AssertionError: 

    Expecting actual:
      "org.apache.ratis.protocol.exceptions.RaftRetryFailureException: Failed 
RaftClientRequest:client-54F8BAFFDCC6->caf3bd91-ec62-492c-910c-c9010abb2965@group-6E03D52A6FF6,
 cid=16, seq=null, Watch-MAJORITY_COMMITTED(36), null for 3 attempts with 
RequestTypeDependentRetryPolicy{WRITE->ExceptionDependentRetry(maxAttempts=2147483647;
 defaultPolicy=MultipleLinearRandomRetry[5x5s, 5x10s, 5x15s, 5x20s, 5x25s, 
10x60s]; 
map={org.apache.ratis.protocol.exceptions.GroupMismatchException->NoRetry, 
org.apache.ratis.protocol.exceptions.NotReplicatedException->NoRetry, 
org.apache.ratis.protocol.exceptions.ResourceUnavailableException->org.apache.ratis.retry.ExponentialBackoffRetry@64f32a1f,
 org.apache.ratis.protocol.exceptions.StateMachineException->NoRetry, 
org.apache.ratis.protocol.exceptions.TimeoutIOException->org.apache.ratis.retry.ExponentialBackoffRetry@64f32a1f}),
 WATCH->ExceptionDependentRetry(maxAttempts=2147483647; 
defaultPolicy=MultipleLinearRandomRetry[5x5s, 5x10s, 5x15s, 5x20s, 5x25s, 
10x60s]; 
map={org.apache.ratis.protocol.exceptions.GroupMismatchException->NoRetry, 
org.apache.ratis.protocol.exceptions.NotReplicatedException->NoRetry, 
org.apache.ratis.protocol.exceptions.ResourceUnavailableException->org.apache.ratis.retry.ExponentialBackoffRetry@64f32a1f,
 org.apache.ratis.protocol.exceptions.StateMachineException->NoRetry, 
org.apache.ratis.protocol.exceptions.TimeoutIOException->NoRetry})}"

    not to contain:
      "Watch-MAJORITY_COMMITTED"

        at 
org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.testWatchForCommitForRetryfailure(TestWatchForCommit.java:296)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at java.util.ArrayList.forEach(ArrayList.java:1259)
        at java.util.ArrayList.forEach(ArrayList.java:1259){noformat}
With this fix, the test passes 100% of the time, as seen in 
https://github.com/raju-balpande/apache_ozone/actions/runs/9427808268

> Intermittent failure in testWatchForCommitForRetryfailure
> ---------------------------------------------------------
>
>                 Key: HDDS-10788
>                 URL: https://issues.apache.org/jira/browse/HDDS-10788
>             Project: Apache Ozone
>          Issue Type: Sub-task
>          Components: test
>            Reporter: Attila Doroszlai
>            Assignee: Raju Balpande
>            Priority: Major
>         Attachments: 
> org.apache.hadoop.ozone.client.rpc.TestWatchForCommit-output.txt, 
> org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.txt
>
>
> {code}
> Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 134.993 s <<< 
> FAILURE! - in org.apache.hadoop.ozone.client.rpc.TestWatchForCommit
> org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.testWatchForCommitForRetryfailure
>   Time elapsed: 37.042 s  <<< FAILURE!
> java.lang.AssertionError: 
> Expecting:
>  <"org.apache.ratis.protocol.exceptions.RaftRetryFailureException: Failed 
> RaftClientRequest:client-35B425B0E6BC->7ca6ef5b-8396-458a-a334-21f1f3211157@group-3A5CD4773BBD,
>  cid=51, seq=null, Watch-MAJORITY_COMMITTED(95), null for 3 attempts with 
> RequestTypeDependentRetryPolicy{...}">
> not to contain:
>  <"Watch-MAJORITY_COMMITTED">
>       at 
> org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.testWatchForCommitForRetryfailure(TestWatchForCommit.java:295)
> {code}
> {code:title=https://github.com/apache/ozone/blob/a658802d628271efa82824dc3c316f8eebfc75d3/hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestWatchForCommit.java#L292-L296}
>       // client should not attempt to watch with
>       // MAJORITY_COMMITTED replication level, except the grpc IO issue
>       if (!logCapturer.getOutput().contains("Connection refused")) {
>         assertThat(e.getMessage()).doesNotContain("Watch-MAJORITY_COMMITTED");
>       }
> {code}
> {code:title=log}
> [main] WARN  scm.XceiverClientRatis 
> (XceiverClientRatis.java:watchForCommit(284)) - 3 way commit failed on 
> pipeline ...
> java.util.concurrent.ExecutionException: 
> org.apache.ratis.protocol.exceptions.NotReplicatedException: Request with 
> call Id 50 and log index 95 is not yet replicated to ALL_COMMITTED
>       at 
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
>       at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
>       at 
> org.apache.hadoop.hdds.scm.XceiverClientRatis.watchForCommit(XceiverClientRatis.java:279)
>       at 
> org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.lambda$testWatchForCommitForRetryfailure$0(TestWatchForCommit.java:286)
>       at org.junit.jupiter.api.AssertThrows.assertThrows(AssertThrows.java:53)
>       at org.junit.jupiter.api.AssertThrows.assertThrows(AssertThrows.java:35)
>       at org.junit.jupiter.api.Assertions.assertThrows(Assertions.java:3115)
>       at 
> org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.testWatchForCommitForRetryfailure(TestWatchForCommit.java:285)
> {code}
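
For readers following the quoted test snippet, the guarded check can be restated as a plain predicate. This is only an illustrative sketch: the class `WatchAssertionSketch`, the method `wouldFail`, and the sample log strings are hypothetical and are not part of Ozone or of the fix. It shows that the assertion fails whenever the exception message mentions "Watch-MAJORITY_COMMITTED" and the captured log does not contain "Connection refused", which matches the failure output above.

```java
public class WatchAssertionSketch {

    // Hypothetical helper mirroring the guarded check in the quoted snippet:
    // the assertion is skipped when the captured log shows a gRPC IO issue
    // ("Connection refused"); otherwise the message must not mention a
    // MAJORITY_COMMITTED watch.
    static boolean wouldFail(String exceptionMessage, String capturedLog) {
        boolean grpcIoIssue = capturedLog.contains("Connection refused");
        return !grpcIoIssue
            && exceptionMessage.contains("Watch-MAJORITY_COMMITTED");
    }

    public static void main(String[] args) {
        // Abbreviated form of the message from the failure output.
        String message = "RaftRetryFailureException: Failed RaftClientRequest"
            + " ... Watch-MAJORITY_COMMITTED(36), null for 3 attempts";

        // Illustrative log text without a connection error: the check fails.
        System.out.println(wouldFail(message, "3 way commit failed on pipeline"));

        // Illustrative log text with a connection error: the check is skipped.
        System.out.println(wouldFail(message, "Connection refused"));
    }
}
```

The sketch only describes when the existing assertion trips; it does not say why the client falls back to a MAJORITY_COMMITTED watch in the failing runs.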



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
