[ 
https://issues.apache.org/jira/browse/HDFS-16533?focusedWorklogId=771416&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-771416
 ]

ASF GitHub Bot logged work on HDFS-16533:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 17/May/22 15:07
            Start Date: 17/May/22 15:07
    Worklog Time Spent: 10m 
      Work Description: ZanderXu commented on code in PR #4155:
URL: https://github.com/apache/hadoop/pull/4155#discussion_r874941317


##########
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/FileChecksumHelper.java:
##########
@@ -316,18 +317,22 @@ FileChecksum makeCompositeCrcResult() throws IOException {
             "Added blockCrc 0x{} for block index {} of size {}",
             Integer.toString(blockCrc, 16), i, block.getBlockSize());
       }
-
-      // NB: In some cases the located blocks have their block size adjusted
-      // explicitly based on the requested length, but not all cases;
-      // these numbers may or may not reflect actual sizes on disk.
-      long reportedLastBlockSize =
-          blockLocations.getLastLocatedBlock().getBlockSize();
-      long consumedLastBlockLength = reportedLastBlockSize;
-      if (length - sumBlockLengths < reportedLastBlockSize) {
-        LOG.warn(
-            "Last block length {} is less than reportedLastBlockSize {}",
-            length - sumBlockLengths, reportedLastBlockSize);
-        consumedLastBlockLength = length - sumBlockLengths;
+      LocatedBlock nextBlock = locatedBlocks.get(i);
+      long consumedLastBlockLength = Math.min(length - sumBlockLengths,
+          nextBlock.getBlockSize());
+      LocatedBlock lastBlock = blockLocations.getLastLocatedBlock();
+      if (nextBlock.equals(lastBlock)) {

Review Comment:
   Thanks @jojochuang for your comment. 
   First, I will explain the goal of the UT:
   
   1. Use the same data to create a replicated file and a striped file.
   2. Set the conf to use COMPOSITE_CRC (see the sketch after this list).
   3. Expect getFileChecksum to return the same checksum for any length from 
the replicated file and the striped file.
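
   A minimal sketch of step 2 (assumed setup, not the actual test fixture; `dfs.checksum.combine.mode` is the client-side key that selects the combine mode, default `MD5MD5CRC`; the class name and path are hypothetical):
   
   ```java
   import org.apache.hadoop.conf.Configuration;
   import org.apache.hadoop.fs.FileChecksum;
   import org.apache.hadoop.fs.FileSystem;
   import org.apache.hadoop.fs.Path;
   
   public class CompositeCrcSetup {
     public static void main(String[] args) throws Exception {
       // COMPOSITE_CRC makes checksums comparable between replicated
       // and striped layouts of the same data.
       Configuration conf = new Configuration();
       conf.set("dfs.checksum.combine.mode", "COMPOSITE_CRC");
       FileSystem fs = FileSystem.get(conf);
       // Hypothetical path and length, for illustration only.
       FileChecksum checksum = fs.getFileChecksum(new Path("/f"), 1024L);
       System.out.println(checksum);
     }
   }
   ```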
   
   Second, I will explain the root cause:
   
   1. blockLocations at line 104 contains the block list and a lastLocatedBlock.
   2. The last block in the list may not be the same as lastLocatedBlock when 
the input length is less than the file length.
   3. So we cannot always compare against lastLocatedBlock to get the consumed 
length at line 336 (see the sketch below).
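
   A minimal standalone sketch of the patch's capping logic (hypothetical sizes and class name, not the actual FileChecksumHelper code): each block's consumed length is bounded by the bytes remaining in the requested range, so the loop never needs to consult lastLocatedBlock's reported size.
   
   ```java
   public class ConsumedLengthSketch {
     public static void main(String[] args) {
       long[] blockSizes = {128, 128, 64}; // assumed on-disk block sizes
       long length = 200;                  // requested length (< file length)
       long sumBlockLengths = 0;
       for (long blockSize : blockSizes) {
         if (sumBlockLengths >= length) {
           break; // the requested range is fully consumed
         }
         // Cap by the bytes remaining, mirroring
         // Math.min(length - sumBlockLengths, nextBlock.getBlockSize()).
         long consumed = Math.min(length - sumBlockLengths, blockSize);
         System.out.println("consume " + consumed + " of a "
             + blockSize + "-byte block");
         sumBlockLengths += consumed;
       }
       // Output: 128 of the first block, then 72 of the second; the file's
       // actual last block (64 bytes) is never reached.
     }
   }
   ```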





Issue Time Tracking
-------------------

    Worklog Id:     (was: 771416)
    Time Spent: 1h 40m  (was: 1.5h)

> COMPOSITE_CRC failed between replicated file and striped file.
> --------------------------------------------------------------
>
>                 Key: HDFS-16533
>                 URL: https://issues.apache.org/jira/browse/HDFS-16533
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs, hdfs-client
>            Reporter: ZanderXu
>            Assignee: ZanderXu
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HDFS-16533.001.patch
>
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> After testing COMPOSITE_CRC with random lengths between a replicated file 
> and a striped file holding the same data, the checksums do not match. 
> Reproduce as follows:
> {code:java}
> @Test(timeout = 90000)
> public void testStripedAndReplicatedFileChecksum2() throws Exception {
>   int abnormalSize = (dataBlocks * 2 - 2) * blockSize +
>       (int) (blockSize * 0.5);
>   prepareTestFiles(abnormalSize, new String[] {stripedFile1, replicatedFile});
>   int loopNumber = 100;
>   while (loopNumber-- > 0) {
>     int verifyLength = ThreadLocalRandom.current()
>         .nextInt(10, abnormalSize);
>     FileChecksum stripedFileChecksum1 = getFileChecksum(stripedFile1,
>         verifyLength, false);
>     FileChecksum replicatedFileChecksum = getFileChecksum(replicatedFile,
>         verifyLength, false);
>     if (checksumCombineMode.equals(ChecksumCombineMode.COMPOSITE_CRC.name())) {
>       Assert.assertEquals(stripedFileChecksum1, replicatedFileChecksum);
>     } else {
>       Assert.assertNotEquals(stripedFileChecksum1, replicatedFileChecksum);
>     }
>   }
> } {code}
> Tracing the root cause shows that `FileChecksumHelper#makeCompositeCrcResult` 
> may compute an incorrect `consumedLastBlockLength` when updating the checksum 
> for the last block of the requested length, which may not be the last block 
> of the file. A worked example follows.
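> For example (hypothetical numbers, not from the report): with blocks of 128, 
> 128, and 64 bytes and a requested length of 200, iteration ends in the second 
> block with 200 - 128 = 72 bytes remaining; the old code compares 72 against 
> the reported size of the file's last block (64), finds it is not smaller, and 
> consumes 64 bytes instead of 72, yielding an incorrect composite CRC for the 
> requested range.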


