[ https://issues.apache.org/jira/browse/HDFS-17497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17875714#comment-17875714 ]
ASF GitHub Bot commented on HDFS-17497: --------------------------------------- hfutatzhanghb commented on code in PR #6765: URL: https://github.com/apache/hadoop/pull/6765#discussion_r1726157523 ########## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java: ########## @@ -1105,15 +1106,12 @@ static void unprotectedUpdateCount(INodesInPath inodesInPath, /** * Update the cached quota space for a block that is being completed. * Must only be called once, as the block is being completed. - * @param completeBlk - Completed block for which to update space - * @param inodes - INodes in path to file containing completeBlk; if null - * this will be resolved internally + * @param commitBlock - Committed block for which to update space + * @param iip - INodes in path to file containing committedBlock */ - public void updateSpaceForCompleteBlock(BlockInfo completeBlk, - INodesInPath inodes) throws IOException { + public void updateSpaceForCommittedBlock(Block commitBlock, + INodesInPath iip) throws IOException { assert namesystem.hasWriteLock(); - INodesInPath iip = inodes != null ? inodes : - INodesInPath.fromINode(namesystem.getBlockCollection(completeBlk)); INodeFile fileINode = iip.getLastINode().asFile(); Review Comment: In original logic, it will consider whether iip is null or not. But we miss judgement here. > Logic for committed blocks is mixed when computing file size > ------------------------------------------------------------ > > Key: HDFS-17497 > URL: https://issues.apache.org/jira/browse/HDFS-17497 > Project: Hadoop HDFS > Issue Type: Improvement > Reporter: ZanderXu > Priority: Major > Labels: pull-request-available > > One in-writing HDFS file may contains multiple committed blocks, as follows > (assume one file contains three blocks): > || ||Block 1||Block 2||Block 3|| > |Case 1|Complete|Commit|UnderConstruction| > |Case 2|Complete|Commit|Commit| > |Case 3|Commit|Commit|Commit| > > But the logic for committed blocks is mixed when computing file size, it > ignores the bytes of the last committed block and contains the bytes of other > committed blocks. > {code:java} > public final long computeFileSize(boolean includesLastUcBlock, > boolean usePreferredBlockSize4LastUcBlock) { > if (blocks.length == 0) { > return 0; > } > final int last = blocks.length - 1; > //check if the last block is BlockInfoUnderConstruction > BlockInfo lastBlk = blocks[last]; > long size = lastBlk.getNumBytes(); > // the last committed block is not complete, so it's bytes may be ignored. > if (!lastBlk.isComplete()) { > if (!includesLastUcBlock) { > size = 0; > } else if (usePreferredBlockSize4LastUcBlock) { > size = isStriped()? > getPreferredBlockSize() * > ((BlockInfoStriped)lastBlk).getDataBlockNum() : > getPreferredBlockSize(); > } > } > // The bytes of other committed blocks are calculated into the file length. > for (int i = 0; i < last; i++) { > size += blocks[i].getNumBytes(); > } > return size; > } {code} > The bytes of one committed block will not be changed, so the bytes of the > last committed block should be calculated into the file length too. > > And the logic for committed blocks is mixed too when computing file length in > DFSInputStream. Normally DFSInputStream does not need to get visible length > for committed block regardless of whether the committed block is the last > block or not. > > -HDFS-10843- encountered one bug which actually caused by the committed > block, but -HDFS-10843- fixed that bug by updating quota usage when > completing block. The num of bytes of the committed block will no longer > change, so we should update the quota usage when the block is committed, > which can reduce the delta quota usage in time. > > So there are somethings we need to do: > * Unify the calculation logic for all committed blocks in > {{computeFileSize}} of {{INodeFile}} > * Unify the calculation logic for all committed blocks in {{getFileLength}} > of {{DFSInputStream}} > * Update quota usage when committing block -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org