[ https://issues.apache.org/jira/browse/HDFS-17497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17840351#comment-17840351 ]
ASF GitHub Bot commented on HDFS-17497:
---------------------------------------

hadoop-yetus commented on PR #6765:
URL: https://github.com/apache/hadoop/pull/6765#issuecomment-2074475186

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 45s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 50m 31s | | trunk passed |
| +1 :green_heart: | compile | 1m 24s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 1m 14s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 1m 13s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 23s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 8s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 1m 40s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 3m 19s | | trunk passed |
| +1 :green_heart: | shadedclient | 41m 29s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 11s | | the patch passed |
| +1 :green_heart: | compile | 1m 13s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 1m 13s | | the patch passed |
| +1 :green_heart: | compile | 1m 9s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 1m 9s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 1m 3s | | hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 290 unchanged - 1 fixed = 290 total (was 291) |
| +1 :green_heart: | mvnsite | 1m 11s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 56s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 1m 38s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 3m 17s | | the patch passed |
| +1 :green_heart: | shadedclient | 40m 48s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 269m 3s | | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 52s | | The patch does not generate ASF License warnings. |
| | | 427m 31s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6765/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6765 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 3ed817d3780c 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 0aa96155ae7aed9c69d8c0ede601fffd4bc8c17f |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6765/2/testReport/ |
| Max. process+thread count | 2789 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6765/2/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

> Logic for committed blocks is mixed when computing file size
> ------------------------------------------------------------
>
>                 Key: HDFS-17497
>                 URL: https://issues.apache.org/jira/browse/HDFS-17497
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: ZanderXu
>            Priority: Major
>              Labels: pull-request-available
>
> An in-progress HDFS file may contain multiple committed blocks at once. For example, assuming the file contains three blocks:
> || ||Block 1||Block 2||Block 3||
> |Case 1|Complete|Commit|UnderConstruction|
> |Case 2|Complete|Commit|Commit|
> |Case 3|Commit|Commit|Commit|
>
> But the handling of committed blocks is inconsistent when computing the file size: the bytes of the last committed block are ignored, while the bytes of every other committed block are counted.
> {code:java}
> public final long computeFileSize(boolean includesLastUcBlock,
>     boolean usePreferredBlockSize4LastUcBlock) {
>   if (blocks.length == 0) {
>     return 0;
>   }
>   final int last = blocks.length - 1;
>   // check if the last block is BlockInfoUnderConstruction
>   BlockInfo lastBlk = blocks[last];
>   long size = lastBlk.getNumBytes();
>   // A committed last block is not complete, so its bytes may be ignored here.
>   if (!lastBlk.isComplete()) {
>     if (!includesLastUcBlock) {
>       size = 0;
>     } else if (usePreferredBlockSize4LastUcBlock) {
>       size = isStriped()?
>           getPreferredBlockSize() *
>               ((BlockInfoStriped)lastBlk).getDataBlockNum() :
>           getPreferredBlockSize();
>     }
>   }
>   // The bytes of all other committed blocks are counted into the file length.
>   for (int i = 0; i < last; i++) {
>     size += blocks[i].getNumBytes();
>   }
>   return size;
> } {code}
> The length of a committed block can no longer change, so the bytes of the last committed block should be counted into the file length as well.
>
> The handling of committed blocks is likewise inconsistent when computing the file length in DFSInputStream. Normally DFSInputStream does not need to fetch the visible length for a committed block, regardless of whether that committed block is the last block or not.
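As an editor's illustration of the first unification proposed above, here is a minimal sketch of computeFileSize with the last-block special case narrowed to genuinely under-construction blocks. This is not the actual HDFS-17497 patch; it assumes BlockInfo exposes getBlockUCState() and the BlockUCState enum (both present in the HDFS code base) and reuses the fields and helpers of the method quoted above.

{code:java}
public final long computeFileSize(boolean includesLastUcBlock,
    boolean usePreferredBlockSize4LastUcBlock) {
  if (blocks.length == 0) {
    return 0;
  }
  final int last = blocks.length - 1;
  BlockInfo lastBlk = blocks[last];
  long size = lastBlk.getNumBytes();
  // A COMMITTED block's length is final, so count it like a complete block;
  // only a genuinely under-construction last block is special-cased.
  boolean lastIsCommitted =
      lastBlk.getBlockUCState() == BlockUCState.COMMITTED;
  if (!lastBlk.isComplete() && !lastIsCommitted) {
    if (!includesLastUcBlock) {
      size = 0;
    } else if (usePreferredBlockSize4LastUcBlock) {
      size = isStriped()
          ? getPreferredBlockSize()
              * ((BlockInfoStriped) lastBlk).getDataBlockNum()
          : getPreferredBlockSize();
    }
  }
  // All preceding blocks (complete or committed) are counted as before.
  for (int i = 0; i < last; i++) {
    size += blocks[i].getNumBytes();
  }
  return size;
}
{code}

With this shape, Cases 2 and 3 from the table above report the same length whether a committed block sits in the middle of the file or at its end.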
>
> HDFS-10843 hit a bug that was actually caused by a committed block, and it fixed that bug by updating the quota usage when the block is completed. Since the number of bytes of a committed block can no longer change, we should instead update the quota usage as soon as the block is committed, which shrinks the window in which the quota usage carries a stale delta.
>
> So there are some things we need to do:
> * Unify the calculation logic for all committed blocks in {{computeFileSize}} of {{INodeFile}}
> * Unify the calculation logic for all committed blocks in {{getFileLength}} of {{DFSInputStream}}
> * Update the quota usage when committing a block
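The third item is the part HDFS-10843 only worked around. A rough editor's sketch of the idea follows; the class and method names (QuotaSketch, onBlockAllocated, onBlockCommitted) are hypothetical stand-ins, not the namenode's real quota API.

{code:java}
// Hypothetical illustration of "update quota usage at commit time".
// When a block is allocated its final length is unknown, so space quota is
// charged pessimistically at the preferred block size; once the block is
// COMMITTED its length is final, so the charge can be corrected immediately
// instead of waiting for the block to become COMPLETE.
final class QuotaSketch {
  private long spaceConsumed;            // bytes charged against the quota
  private final long preferredBlockSize; // pessimistic per-block charge
  private final short replication;       // each byte is stored this many times

  QuotaSketch(long preferredBlockSize, short replication) {
    this.preferredBlockSize = preferredBlockSize;
    this.replication = replication;
  }

  void onBlockAllocated() {
    spaceConsumed += preferredBlockSize * replication;
  }

  void onBlockCommitted(long committedBytes) {
    // Usually a negative delta: the committed block is rarely full.
    spaceConsumed += (committedBytes - preferredBlockSize) * replication;
  }

  long getSpaceConsumed() {
    return spaceConsumed;
  }
}
{code}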