ZanderXu commented on code in PR #4155:
URL: https://github.com/apache/hadoop/pull/4155#discussion_r874957305
##########
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/FileChecksumHelper.java:
##########
@@ -316,18 +317,22 @@ FileChecksum makeCompositeCrcResult() throws IOException {
"Added blockCrc 0x{} for block index {} of size {}",
Integer.toString(blockCrc, 16), i, block.getBlockSize());
}
-
- // NB: In some cases the located blocks have their block size adjusted
- // explicitly based on the requested length, but not all cases;
- // these numbers may or may not reflect actual sizes on disk.
- long reportedLastBlockSize =
- blockLocations.getLastLocatedBlock().getBlockSize();
- long consumedLastBlockLength = reportedLastBlockSize;
- if (length - sumBlockLengths < reportedLastBlockSize) {
- LOG.warn(
- "Last block length {} is less than reportedLastBlockSize {}",
- length - sumBlockLengths, reportedLastBlockSize);
- consumedLastBlockLength = length - sumBlockLengths;
+ LocatedBlock nextBlock = locatedBlocks.get(i);
+ long consumedLastBlockLength = Math.min(length - sumBlockLengths,
+ nextBlock.getBlockSize());
+ LocatedBlock lastBlock = blockLocations.getLastLocatedBlock();
+ if (nextBlock.equals(lastBlock)) {
Review Comment:
Whether the file is replicated or striped, each block yields a 4-byte
composite CRC. The actual number of bytes covered by that CRC matters,
because line 336 uses it to compute the file-level composite CRC.
Suppose a file has 4 blocks (block1, block2, block3, and block4) with sizes
10MB, 10MB, 10MB, and 7MB, and I call getFilecheck(mockFile, 29MB). The
requested length ends 9MB into block3, so the correct consumedLastBlockLength
at line 336 is 9MB. The current logic instead yields 7MB, which comes from the
size of the file's last block, so we get a wrong composite CRC.
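To make the arithmetic above concrete, here is a minimal sketch in plain Java.
It is not the Hadoop code: the class ConsumedLengthSketch and all of its
helpers are hypothetical names, blocks are modeled as a bare array of sizes
(assumed non-empty), and the only thing it compares is which block size is
used as the cap on the remaining length.

```java
class ConsumedLengthSketch {
    static final long MB = 1024L * 1024L;

    // Hypothetical helper: index of the block in which the requested length ends.
    // Advances while the current block fits entirely within `length`,
    // and never moves past the final block.
    static int indexOfEndingBlock(long[] blockSizes, long length) {
        long sum = 0;
        int i = 0;
        while (i < blockSizes.length - 1 && sum + blockSizes[i] <= length) {
            sum += blockSizes[i];
            i++;
        }
        return i;
    }

    // Hypothetical helper: total size of the blocks before `index`.
    static long sumBefore(long[] blockSizes, int index) {
        long sum = 0;
        for (int i = 0; i < index; i++) {
            sum += blockSizes[i];
        }
        return sum;
    }

    // Models the logic being replaced: cap by the size of the file's last block.
    static long cappedByLastBlock(long[] blockSizes, long length) {
        int i = indexOfEndingBlock(blockSizes, length);
        long remaining = length - sumBefore(blockSizes, i);
        return Math.min(remaining, blockSizes[blockSizes.length - 1]);
    }

    // Models the proposed logic: cap by the size of the block where `length` ends.
    static long cappedByNextBlock(long[] blockSizes, long length) {
        int i = indexOfEndingBlock(blockSizes, length);
        long remaining = length - sumBefore(blockSizes, i);
        return Math.min(remaining, blockSizes[i]);
    }

    public static void main(String[] args) {
        long[] blockSizes = {10 * MB, 10 * MB, 10 * MB, 7 * MB};
        long length = 29 * MB;
        // prints "capped by last block: 7 MB"
        System.out.println("capped by last block: "
            + cappedByLastBlock(blockSizes, length) / MB + " MB");
        // prints "capped by next block: 9 MB"
        System.out.println("capped by next block: "
            + cappedByNextBlock(blockSizes, length) / MB + " MB");
    }
}
```

When the requested length covers the whole file (37MB here), both variants
agree on 7MB; they only diverge when the length ends in an earlier block.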