[ https://issues.apache.org/jira/browse/HDDS-10682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stephen O'Donnell updated HDDS-10682:
-------------------------------------
    Fix Version/s: 1.4.1

> EC Reconstruction creates empty chunks at the end of blocks with partial stripes
> --------------------------------------------------------------------------------
>
>                 Key: HDDS-10682
>                 URL: https://issues.apache.org/jira/browse/HDDS-10682
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.5.0, 1.4.1
>
>
> Consider an EC block that is larger than one full stripe, but whose last
> stripe is partial, so that it does not use all of the indexes.
> If one of the replicas that has no data in that final position is
> reconstructed, an empty chunk is written to the end of the block's chunk
> list.
> While this does not cause any immediate problem, it can prevent further
> reconstructions that attempt to use this block, and they will fail with an
> error like:
> {code}
> 2024-04-09 01:06:21,855 ERROR [ec-reconstruct-reader-TID-4]-org.apache.hadoop.hdds.scm.XceiverClientGrpc: Failed to execute command GetBlock on the pipeline Pipeline[ Id: 7f6f1fc9-ed26-4e19-86b6-47435b027f6a, Nodes: 7f6f1fc9-ed26-4e19-86b6-47435b027f6a(ccycloud-4.quasar-jyswng.root.comops.site/10.140.150.0), ReplicationConfig: STANDALONE/ONE, State:CLOSED, leaderId:, CreationTimestamp2024-04-09T01:06:21.724509Z[UTC]].
> 2024-04-09 01:06:21,859 INFO [ContainerReplicationThread-1]-org.apache.hadoop.ozone.client.io.ECBlockReconstructedStripeInputStream: ECBlockReconstructedStripeInputStream{conID: 10007 locID: 113750153625610009}@756a3998: error reading [1], marked as failed
> org.apache.hadoop.ozone.client.io.BadDataLocationException: java.io.IOException: Failed to get chunkInfo[77]: len == 0
>     at org.apache.hadoop.ozone.client.io.ECBlockReconstructedStripeInputStream.readIntoBuffer(ECBlockReconstructedStripeInputStream.java:644)
>     at org.apache.hadoop.ozone.client.io.ECBlockReconstructedStripeInputStream.lambda$loadDataBuffersFromStream$2(ECBlockReconstructedStripeInputStream.java:577)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.io.IOException: Failed to get chunkInfo[77]: len == 0
>     at org.apache.hadoop.hdds.scm.storage.BlockInputStream.validate(BlockInputStream.java:278)
>     at org.apache.hadoop.hdds.scm.storage.BlockInputStream.lambda$static$0(BlockInputStream.java:265)
>     at org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommandWithRetry(XceiverClientGrpc.java:407)
>     at org.apache.hadoop.hdds.scm.XceiverClientGrpc.lambda$sendCommandWithTraceIDAndRetry$0(XceiverClientGrpc.java:347)
>     at org.apache.hadoop.hdds.tracing.TracingUtil.executeInSpan(TracingUtil.java:169)
>     at org.apache.hadoop.hdds.tracing.TracingUtil.executeInNewSpan(TracingUtil.java:149)
>     at org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommandWithTraceIDAndRetry(XceiverClientGrpc.java:342)
>     at org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommand(XceiverClientGrpc.java:323)
>     at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.getBlock(ContainerProtocolCalls.java:208)
>     at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.lambda$getBlock$0(ContainerProtocolCalls.java:186)
>     at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.tryEachDatanode(ContainerProtocolCalls.java:146)
>     at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.getBlock(ContainerProtocolCalls.java:185)
>     at org.apache.hadoop.hdds.scm.storage.BlockInputStream.getChunkInfos(BlockInputStream.java:255)
>     at org.apache.hadoop.hdds.scm.storage.BlockInputStream.initialize(BlockInputStream.java:146)
>     at org.apache.hadoop.hdds.scm.storage.BlockInputStream.readWithStrategy(BlockInputStream.java:308)
>     at org.apache.hadoop.hdds.scm.storage.ExtendedInputStream.read(ExtendedInputStream.java:66)
>     at org.apache.hadoop.ozone.client.io.ECBlockReconstructedStripeInputStream.readFromCurrentLocation(ECBlockReconstructedStripeInputStream.java:655)
>     at org.apache.hadoop.ozone.client.io.ECBlockReconstructedStripeInputStream.readIntoBuffer(ECBlockReconstructedStripeInputStream.java:631)
>     ... 5 more
> {code}
> If there are other spare replicas which can be used, reconstruction will
> continue, otherwise it will not be able to complete.
> At this stage, I am not sure if this can affect reading a block via the
> normal read path.
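To make the partial-stripe case in the description concrete, here is a minimal sketch of how chunk lengths fall out per data replica, and how a zero-length trailing chunk can appear. It is not Ozone code: the class, method, and parameter names are hypothetical, it assumes a fixed chunk size and 1-based data replica indexes, and it ignores parity replicas. Dropping empty entries is shown only as one plausible way to keep them out of a chunk list, not as the change actually made for this issue.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; all names are hypothetical, not Ozone APIs. */
class EcPartialStripeSketch {

  /**
   * Length of the chunk that one data replica holds in each stripe.
   * Positions the final, partial stripe never reaches come out as 0.
   */
  static List<Long> rawChunkLengths(long blockGroupLength, long chunkSize,
      int dataCount, int dataIndex) {
    long stripeSize = chunkSize * dataCount;
    long stripes = (blockGroupLength + stripeSize - 1) / stripeSize;
    List<Long> lengths = new ArrayList<>();
    for (long s = 0; s < stripes; s++) {
      // Offset in the block group where this replica's chunk would start.
      long start = s * stripeSize + (long) (dataIndex - 1) * chunkSize;
      long remaining = blockGroupLength - start;
      lengths.add(Math.max(0L, Math.min(chunkSize, remaining)));
    }
    return lengths;
  }

  /** Returns a copy of the list with zero-length chunks removed. */
  static List<Long> withoutEmptyChunks(List<Long> lengths) {
    List<Long> result = new ArrayList<>();
    for (long len : lengths) {
      if (len > 0) {
        result.add(len);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // 3 data replicas, chunk size 4, 17 bytes: one full stripe plus 5 bytes.
    for (int index = 1; index <= 3; index++) {
      List<Long> raw = rawChunkLengths(17, 4, 3, index);
      System.out.println("index " + index + ": raw=" + raw
          + " filtered=" + withoutEmptyChunks(raw));
    }
  }
}
```

In this toy layout, the third data replica has no bytes in the second stripe, so its raw list ends in a zero-length entry. That matches the shape of the "len == 0" chunk reported in the log above.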