[ 
https://issues.apache.org/jira/browse/HDDS-12103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17921184#comment-17921184
 ] 

Tsz-wo Sze edited comment on HDDS-12103 at 1/26/25 5:25 PM:
------------------------------------------------------------

I did see [a lot of 
failures|https://github.com/szetszwo/ozone/actions/runs/12958213678].  We 
probably have to update Ozone's grpc version to sync with Ratis-thirdparty.  
Started a new build: https://github.com/szetszwo/ozone/actions/runs/12976815936
{code}
@@ -106,7 +106,7 @@
     <hk2.version>2.6.1</hk2.version>
     <httpclient.version>4.5.14</httpclient.version>
     <httpcore.version>4.4.16</httpcore.version>
-    <io.grpc.version>1.58.0</io.grpc.version>
+    <io.grpc.version>1.69.0</io.grpc.version>
     <jackson-jaxr.version>1.9.13</jackson-jaxr.version>
     <jackson1.version>1.9.13</jackson1.version>
     <jackson2-bom.version>2.16.2</jackson2-bom.version>
{code}

{code}
commit bc91558faa10b602effa48f0737fe00aa13f58e0
Author: Potato <[email protected]>
Date:   Tue Dec 17 15:42:07 2024 +0800

    RATIS-2212. Bump grpc to 1.69.0 (#58)
{code}
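
A minimal sketch for checking whether a POM's {{io.grpc.version}} matches the version Ratis-thirdparty moved to. The helper names ({{read_property}}, {{grpc_in_sync}}) and the trimmed sample POM are hypothetical and only for illustration; the property name and both version numbers are taken from the diff and commit message above.

```python
import xml.etree.ElementTree as ET

# Default XML namespace declared by Maven POM files.
POM_NS = "http://maven.apache.org/POM/4.0.0"


def read_property(pom_xml, name):
    """Return the text of <properties><name> in a POM, or None if absent."""
    root = ET.fromstring(pom_xml)
    props = root.find("{%s}properties" % POM_NS)
    if props is None:
        return None
    wanted = "{%s}%s" % (POM_NS, name)
    for child in props:
        if child.tag == wanted:
            return (child.text or "").strip()
    return None


def grpc_in_sync(pom_xml, expected):
    """True when the POM's io.grpc.version equals the expected version."""
    return read_property(pom_xml, "io.grpc.version") == expected


# Hypothetical, trimmed-down POM holding only the property from the diff.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <properties>
    <io.grpc.version>1.58.0</io.grpc.version>
  </properties>
</project>"""

if __name__ == "__main__":
    # 1.69.0 is the version named in the RATIS-2212 commit message.
    print(read_property(SAMPLE_POM, "io.grpc.version"))
    print(grpc_in_sync(SAMPLE_POM, "1.69.0"))
```

In a real checkout the POM text would be read from the project's {{pom.xml}} instead of the inline sample; that path is not given in this thread.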


was (Author: szetszwo):
I did see [a lot of 
failures|https://github.com/szetszwo/ozone/actions/runs/12958213678].  We 
probably have to update Ozone's grpc version to sync with Ratis-thirdparty.  
Started a new build: https://github.com/szetszwo/ozone/actions/runs/12976623709
{code}
@@ -106,7 +106,7 @@
     <hk2.version>2.6.1</hk2.version>
     <httpclient.version>4.5.14</httpclient.version>
     <httpcore.version>4.4.16</httpcore.version>
-    <io.grpc.version>1.58.0</io.grpc.version>
+    <io.grpc.version>1.69.0</io.grpc.version>
     <jackson-jaxr.version>1.9.13</jackson-jaxr.version>
     <jackson1.version>1.9.13</jackson1.version>
     <jackson2-bom.version>2.16.2</jackson2-bom.version>
{code}

{code}
commit bc91558faa10b602effa48f0737fe00aa13f58e0
Author: Potato <[email protected]>
Date:   Tue Dec 17 15:42:07 2024 +0800

    RATIS-2212. Bump grpc to 1.69.0 (#58)
{code}

> PutBlock timeouts in MapReduce test with Ratis 3.1.3
> ----------------------------------------------------
>
>                 Key: HDDS-12103
>                 URL: https://issues.apache.org/jira/browse/HDDS-12103
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Attila Doroszlai
>            Priority: Major
>
> MapReduce tests are much slower with Ratis 3.1.3, frequently hitting test 
> timeout (even after increase from 4 to 6 minutes).  Even successful tests are 
> much slower.  Other tests do not indicate similar slowness.
> MapReduce job log shows some PutBlock request timeouts:
> {code}
> 2025-01-18 12:59:31 ERROR OrderedAsync:215 -  client-6A858158D10F: Failed* RaftClientRequest:client-6A858158D10F->6113f37b-d1d0-4ce0-803d-1921ebe30b67@group-10CFDA178973, cid=12, seq=9, RW, cmdType: PutBlock
> traceID: ""
> containerID: 1
> datanodeUuid: "d05682f8-babd-4570-8aec-e536a6edcb1d"
> putBlock {
>   blockData {
>     blockID {
>       containerID: 1
>       localID: 115816896921600024
>       blockCommitSequenceId: 0
>     }
>     metadata {
>       key: "TYPE"
>       value: "KEY"
>     }
>     chunks {
>       chunkName: "115816896921600024_chunk_1"
>       offset: 0
>       len: 179924
>       checksumData {
>         type: CRC32
>         bytesPerChecksum: 16384
>         checksums: ...
>       }
>     }
>   }
>   eof: true
> }
> version: 3
> , data.size=0
> java.util.concurrent.CompletionException: org.apache.ratis.protocol.exceptions.TimeoutIOException: client-6A858158D10F->6113f37b-d1d0-4ce0-803d-1921ebe30b67 request #12 timeout 60s
>       at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
>       at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
>       at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:647)
>       at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:632)
>       at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
>       at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
>       at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.lambda$timeoutCheck$5(GrpcClientProtocolClient.java:376)
>       at java.util.Optional.ifPresent(Optional.java:159)
>       at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.handleReplyFuture(GrpcClientProtocolClient.java:381)
>       at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.timeoutCheck(GrpcClientProtocolClient.java:376)
>       at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.lambda$onNext$3(GrpcClientProtocolClient.java:369)
>       at org.apache.ratis.util.TimeoutTimer.lambda$onTimeout$2(TimeoutTimer.java:101)
>       at org.apache.ratis.util.LogUtils.runAndLog(LogUtils.java:38)
>       at org.apache.ratis.util.LogUtils$1.run(LogUtils.java:78)
>       at org.apache.ratis.util.TimeoutTimer$Task.run(TimeoutTimer.java:55)
>       at java.util.TimerThread.mainLoop(Timer.java:555)
>       at java.util.TimerThread.run(Timer.java:505)
> Caused by: org.apache.ratis.protocol.exceptions.TimeoutIOException: client-6A858158D10F->6113f37b-d1d0-4ce0-803d-1921ebe30b67 request #12 timeout 60s
>       at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.lambda$timeoutCheck$5(GrpcClientProtocolClient.java:377)
>       ... 10 more
> {code}
> CC [~szetszwo]



