[ 
https://issues.apache.org/jira/browse/HDFS-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292763#comment-14292763
 ] 

Jing Zhao commented on HDFS-7682:
---------------------------------

Thanks for working on this, [~clamb]. 

One question about the current patch. The following code means we only do the 
length check if the last block of the file is complete. Then for a file that was 
snapshotted while still being written, we will still have the issue. How about 
changing the condition to "if the src is a snapshot path"? Then we can use 
"{{blockLocations.getFileLength}} + {{the last block's length if it is incomplete}}" 
as the length limit.
{code}
+    if (blockLocations.isLastBlockComplete()) {
+      remaining = Math.min(length, blockLocations.getFileLength());
+    }
{code}
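A minimal sketch of the suggested length-limit logic (the {{BlockLocations}} stand-in and the {{isSnapshotPath}} flag are hypothetical placeholders for the real {{LocatedBlocks}} accessors and snapshot-path check, not the actual HDFS API):

```java
// Sketch of the suggested length-limit change for getFileChecksum.
// BlockLocations is a minimal stand-in for the fields the patch touches.
public class ChecksumLengthSketch {
    static class BlockLocations {
        final long fileLength;       // total length of the complete blocks
        final long lastBlockLength;  // length of the (possibly incomplete) last block
        final boolean lastBlockComplete;

        BlockLocations(long fileLength, long lastBlockLength, boolean lastBlockComplete) {
            this.fileLength = fileLength;
            this.lastBlockLength = lastBlockLength;
            this.lastBlockComplete = lastBlockComplete;
        }

        long getFileLength() { return fileLength; }
        long getLastBlockLength() { return lastBlockLength; }
        boolean isLastBlockComplete() { return lastBlockComplete; }
    }

    // Suggested condition: clamp whenever src is a snapshot path, and add the
    // last block's length to the limit when that block is still incomplete.
    static long remainingFor(boolean isSnapshotPath, long length, BlockLocations bl) {
        long remaining = length;
        if (isSnapshotPath) {
            long limit = bl.getFileLength();
            if (!bl.isLastBlockComplete()) {
                limit += bl.getLastBlockLength();
            }
            remaining = Math.min(length, limit);
        }
        return remaining;
    }

    public static void main(String[] args) {
        // Snapshot path, incomplete last block: limit = 128 + 10 = 138.
        System.out.println(remainingFor(true, Long.MAX_VALUE,
                new BlockLocations(128, 10, false)));
        // Non-snapshot path: length is used unchanged.
        System.out.println(remainingFor(false, 200,
                new BlockLocations(128, 10, false)));
    }
}
```

This way the checksum stops at the snapshot-time length even when the last block was still under construction when the snapshot was taken.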

> {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes 
> non-snapshotted content
> ------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-7682
>                 URL: https://issues.apache.org/jira/browse/HDFS-7682
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.7.0
>            Reporter: Charles Lamb
>            Assignee: Charles Lamb
>         Attachments: HDFS-7682.000.patch
>
>
> DistributedFileSystem#getFileChecksum of a snapshotted file includes 
> non-snapshotted content.
> The reason why this happens is because DistributedFileSystem#getFileChecksum 
> simply calculates the checksum of all of the CRCs from the blocks in the 
> file. But, in the case of a snapshotted file, we don't want to include data 
> in the checksum that was appended to the last block in the file after the 
> snapshot was taken.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)