Attila Doroszlai created HDDS-12103:
---------------------------------------
Summary: PutBlock timeouts in MapReduce test with Ratis 3.1.3
Key: HDDS-12103
URL: https://issues.apache.org/jira/browse/HDDS-12103
Project: Apache Ozone
Issue Type: Bug
Reporter: Attila Doroszlai
MapReduce tests are much slower with Ratis 3.1.3 and frequently hit the test
timeout, even after it was increased from 4 to 6 minutes. Tests that do pass
also take much longer. Other tests do not show similar slowness.
The MapReduce job log shows some PutBlock request timeouts:
{code}
2025-01-18 12:59:31 ERROR OrderedAsync:215 - client-6A858158D10F: Failed* RaftClientRequest:client-6A858158D10F->6113f37b-d1d0-4ce0-803d-1921ebe30b67@group-10CFDA178973, cid=12, seq=9, RW, cmdType: PutBlock
traceID: ""
containerID: 1
datanodeUuid: "d05682f8-babd-4570-8aec-e536a6edcb1d"
putBlock {
  blockData {
    blockID {
      containerID: 1
      localID: 115816896921600024
      blockCommitSequenceId: 0
    }
    metadata {
      key: "TYPE"
      value: "KEY"
    }
    chunks {
      chunkName: "115816896921600024_chunk_1"
      offset: 0
      len: 179924
      checksumData {
        type: CRC32
        bytesPerChecksum: 16384
        checksums: ...
      }
    }
  }
  eof: true
}
version: 3
, data.size=0
java.util.concurrent.CompletionException: org.apache.ratis.protocol.exceptions.TimeoutIOException: client-6A858158D10F->6113f37b-d1d0-4ce0-803d-1921ebe30b67 request #12 timeout 60s
    at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
    at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
    at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:647)
    at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:632)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.lambda$timeoutCheck$5(GrpcClientProtocolClient.java:376)
    at java.util.Optional.ifPresent(Optional.java:159)
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.handleReplyFuture(GrpcClientProtocolClient.java:381)
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.timeoutCheck(GrpcClientProtocolClient.java:376)
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.lambda$onNext$3(GrpcClientProtocolClient.java:369)
    at org.apache.ratis.util.TimeoutTimer.lambda$onTimeout$2(TimeoutTimer.java:101)
    at org.apache.ratis.util.LogUtils.runAndLog(LogUtils.java:38)
    at org.apache.ratis.util.LogUtils$1.run(LogUtils.java:78)
    at org.apache.ratis.util.TimeoutTimer$Task.run(TimeoutTimer.java:55)
    at java.util.TimerThread.mainLoop(Timer.java:555)
    at java.util.TimerThread.run(Timer.java:505)
Caused by: org.apache.ratis.protocol.exceptions.TimeoutIOException: client-6A858158D10F->6113f37b-d1d0-4ce0-803d-1921ebe30b67 request #12 timeout 60s
    at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.lambda$timeoutCheck$5(GrpcClientProtocolClient.java:377)
    ... 10 more
{code}
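For readers of the trace: the outer CompletionException is most likely just the JDK wrapper added when a stage that depends on the reply future observes the timeout, so the TimeoutIOException is the actual failure. The sketch below reproduces that wrapping with plain java.util.concurrent only. It is not Ratis code. The class name, SketchTimeoutException, and observeFailure() are hypothetical stand-ins, and the exception is completed directly rather than by a timer thread after 60s.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

public class TimeoutWrappingSketch {
  /** Hypothetical stand-in for org.apache.ratis.protocol.exceptions.TimeoutIOException. */
  static class SketchTimeoutException extends IOException {
    SketchTimeoutException(String message) {
      super(message);
    }
  }

  /** Returns the exception a dependent stage sees after the reply future is failed. */
  static Throwable observeFailure() {
    CompletableFuture<String> replyFuture = new CompletableFuture<>();

    // Dependent stage, comparable to the uniAccept frame in the trace.
    CompletableFuture<Void> dependent = replyFuture.thenAccept(reply -> { });

    // In the trace a timer thread does this when no reply arrived in time.
    replyFuture.completeExceptionally(
        new SketchTimeoutException("request #12 timeout 60s"));

    try {
      dependent.join();
      return null; // not reached: the dependent stage completed exceptionally
    } catch (CompletionException e) {
      return e; // wrapper whose cause is the original timeout exception
    }
  }

  public static void main(String[] args) {
    Throwable failure = observeFailure();
    System.out.println(failure.getClass().getName());
    System.out.println(failure.getCause().getMessage());
  }
}
```

If this reading is right, the wrapper itself carries no extra information, and the investigation should focus on why the PutBlock reply did not arrive within the 60s request timeout.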