[
https://issues.apache.org/jira/browse/HDDS-10497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sammi Chen resolved HDDS-10497.
-------------------------------
Fix Version/s: 1.5.0
Resolution: Fixed
> [hsync] Refresh block token immediately if block token expires
> --------------------------------------------------------------
>
> Key: HDDS-10497
> URL: https://issues.apache.org/jira/browse/HDDS-10497
> Project: Apache Ozone
> Issue Type: Sub-task
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.5.0
>
>
> HDDS-9734 and HDDS-7930 improve error handling when the input stream fails
> to read because of an expired block token. However, the block token is
> refreshed only after every datanode in the pipeline has been retried, which
> not only adds log spew but also increases 99.9th-percentile tail latency.
> The input stream should request a new block token immediately after it
> encounters an expired block token.
> Relevant logs:
> {noformat}
> 2024-03-08 23:03:20,109 WARN
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls: Failed to read
> chunk 113750153625603061_chunk_1 (len=1048576) conID: 4 locID:
> 113750153625603061 bcsId: 129941 from
> 5fa1d092-1f11-4f6e-af4a-cf2785a8cae4(ccycloud-1.weichiu-hbase.root.comops.site/10.140.131.133);
> will try another datanode.
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException:
> BLOCK_TOKEN_VERIFICATION_FAILED for null: Expired token for user: hbase
> (auth:SIMPLE)
> at
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:675)
> at
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.lambda$createValidators$4(ContainerProtocolCalls.java:686)
> at
> org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommandWithRetry(XceiverClientGrpc.java:400)
> at
> org.apache.hadoop.hdds.scm.XceiverClientGrpc.lambda$sendCommandWithTraceIDAndRetry$0(XceiverClientGrpc.java:340)
> at
> org.apache.hadoop.hdds.tracing.TracingUtil.executeInSpan(TracingUtil.java:159)
> at
> org.apache.hadoop.hdds.tracing.TracingUtil.executeInNewSpan(TracingUtil.java:149)
> at
> org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommandWithTraceIDAndRetry(XceiverClientGrpc.java:335)
> at
> org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommand(XceiverClientGrpc.java:316)
> at
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.readChunk(ContainerProtocolCalls.java:358)
> at
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.lambda$readChunk$2(ContainerProtocolCalls.java:345)
> at
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.tryEachDatanode(ContainerProtocolCalls.java:147)
> at
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.readChunk(ContainerProtocolCalls.java:344)
> at
> org.apache.hadoop.hdds.scm.storage.ChunkInputStream.readChunk(ChunkInputStream.java:425)
> at
> org.apache.hadoop.hdds.scm.storage.ChunkInputStream.readChunkDataIntoBuffers(ChunkInputStream.java:402)
> at
> org.apache.hadoop.hdds.scm.storage.ChunkInputStream.readChunkFromContainer(ChunkInputStream.java:387)
> at
> org.apache.hadoop.hdds.scm.storage.ChunkInputStream.prepareRead(ChunkInputStream.java:319)
> at
> org.apache.hadoop.hdds.scm.storage.ChunkInputStream.read(ChunkInputStream.java:173)
> at
> org.apache.hadoop.hdds.scm.storage.ByteArrayReader.readFromBlock(ByteArrayReader.java:54)
> at
> org.apache.hadoop.hdds.scm.storage.BlockInputStream.readWithStrategy(BlockInputStream.java:367)
> ...
> 2024-03-08 23:03:20,112 ERROR org.apache.hadoop.hdds.scm.XceiverClientGrpc:
> Failed to execute command ReadChunk on the pipeline Pipeline[ Id:
> 04646212-c013-4f8c-9ada-80580c189135, Nodes:
> 5fa1d092-1f11-4f6e-af4a-cf2785a8cae4(ccycloud-1.weichiu-hbase.root.comops.site/10.140.131.133)98e5528d-c790-465e-91e0-f47d4cabe3bc(ccycloud-3.weichiu-hbase.root.comops.site/10.140.103.18)0238996a-9361-4b83-aaa8-e99fd9523ad0(ccycloud-2.weichiu-hbase.root.comops.site/10.140.135.20),
> ReplicationConfig: STANDALONE/THREE, State:OPEN,
> leaderId:98e5528d-c790-465e-91e0-f47d4cabe3bc,
> CreationTimestamp2024-03-08T18:59:15.755Z[UTC]].
> 2024-03-08 23:03:20,113 WARN
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls: Failed to read
> chunk 113750153625603061_chunk_1 (len=1048576) conID: 4 locID:
> 113750153625603061 bcsId: 129941 from
> 98e5528d-c790-465e-91e0-f47d4cabe3bc(ccycloud-3.weichiu-hbase.root.comops.site/10.140.103.18);
> will try another datanode.
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException:
> BLOCK_TOKEN_VERIFICATION_FAILED for null: Expired token for user: hbase
> (auth:SIMPLE)
> at
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:675)
> at
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.lambda$createValidators$4(ContainerProtocolCalls.java:686)
> at
> org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommandWithRetry(XceiverClientGrpc.java:400)
> at
> org.apache.hadoop.hdds.scm.XceiverClientGrpc.lambda$sendCommandWithTraceIDAndRetry$0(XceiverClientGrpc.java:340)
> at
> org.apache.hadoop.hdds.tracing.TracingUtil.executeInSpan(TracingUtil.java:159)
> at
> org.apache.hadoop.hdds.tracing.TracingUtil.executeInNewSpan(TracingUtil.java:149)
> at
> org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommandWithTraceIDAndRetry(XceiverClientGrpc.java:335)
> at
> org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommand(XceiverClientGrpc.java:316)
> at
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.readChunk(ContainerProtocolCalls.java:358)
> at
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.lambda$readChunk$2(ContainerProtocolCalls.java:345)
> at
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.tryEachDatanode(ContainerProtocolCalls.java:147)
> at
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.readChunk(ContainerProtocolCalls.java:344)
> ...
> 2024-03-08 23:03:20,116 ERROR org.apache.hadoop.hdds.scm.XceiverClientGrpc:
> Failed to execute command ReadChunk on the pipeline Pipeline[ Id:
> 04646212-c013-4f8c-9ada-80580c189135, Nodes:
> 5fa1d092-1f11-4f6e-af4a-cf2785a8cae4(ccycloud-1.weichiu-hbase.root.comops.site/10.140.131.133)98e5528d-c790-465e-91e0-f47d4cabe3bc(ccycloud-3.weichiu-hbase.root.comops.site/10.140.103.18)0238996a-9361-4b83-aaa8-e99fd9523ad0(ccycloud-2.weichiu-hbase.root.comops.site/10.140.135.20),
> ReplicationConfig: STANDALONE/THREE, State:OPEN,
> leaderId:98e5528d-c790-465e-91e0-f47d4cabe3bc,
> CreationTimestamp2024-03-08T18:59:15.755Z[UTC]].
> 2024-03-08 23:03:20,390 INFO
> org.apache.hadoop.hdds.scm.storage.BlockInputStream: Unable to read
> information for block conID: 3 locID: 113750153625603098 bcsId: 459126 from
> pipeline PipelineID=eb1d2690-75a6-48d7-9eec-60675b907fc0:
> BLOCK_TOKEN_VERIFICATION_FAILED for null: Expired token for user: hbase
> (auth:SIMPLE)
> {noformat}
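
For illustration only, the behaviour requested in the description might be sketched as below. This is a minimal sketch, not the actual Ozone client code or the committed fix: ImmediateTokenRefreshSketch, TokenExpiredException, ChunkReader and readWithImmediateRefresh are hypothetical names, and the token is modelled as a plain String. The assumption is that an expired-token failure is the same on every datanode, so the loop refreshes the token once and retries the same datanode instead of moving on to the next one.

```java
import java.util.List;
import java.util.function.Supplier;

final class ImmediateTokenRefreshSketch {

  /** Hypothetical stand-in for a BLOCK_TOKEN_VERIFICATION_FAILED (expired token) error. */
  static class TokenExpiredException extends Exception {
    TokenExpiredException(String message) {
      super(message);
    }
  }

  /** Hypothetical single read attempt against one datanode with one token. */
  interface ChunkReader {
    byte[] read(String datanode, String token) throws Exception;
  }

  /**
   * Tries each datanode in order. On an expired token, refreshes the token
   * right away and retries the same datanode; other failures move on to the
   * next datanode.
   */
  static byte[] readWithImmediateRefresh(List<String> datanodes,
                                         String initialToken,
                                         Supplier<String> tokenRefresher,
                                         ChunkReader reader) throws Exception {
    String token = initialToken;
    boolean refreshed = false;
    Exception lastFailure = null;
    int i = 0;
    while (i < datanodes.size()) {
      try {
        return reader.read(datanodes.get(i), token);
      } catch (TokenExpiredException e) {
        if (refreshed) {
          // A freshly issued token was rejected as well: give up.
          throw e;
        }
        // Refresh immediately; do not advance to the next datanode.
        token = tokenRefresher.get();
        refreshed = true;
      } catch (Exception e) {
        // Any other failure: remember it and try the next datanode.
        lastFailure = e;
        i++;
      }
    }
    if (lastFailure != null) {
      throw lastFailure;
    }
    throw new IllegalStateException("No datanodes to read from");
  }
}
```

With three datanodes and an expired initial token, this sketch makes two read attempts (both against the first datanode) rather than one failed attempt per datanode followed by a refresh.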