[
https://issues.apache.org/jira/browse/HDFS-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292763#comment-14292763
]
Jing Zhao commented on HDFS-7682:
---------------------------------
Thanks for working on this, [~clamb].
One question about the current patch: the following code only applies the
length check when the file's last block is complete, so for a file that was
snapshotted while still being written we will still have the issue. How about
changing the condition to "if the src is a snapshot path"? Then we can use
"{{blockLocations.getFileLength()}} + {{the last block's length, if that block
is incomplete}}" as the length limit.
{code}
+ if (blockLocations.isLastBlockComplete()) {
+ remaining = Math.min(length, blockLocations.getFileLength());
+ }
{code}
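A minimal sketch of the suggested length computation, with plain parameters standing in for {{LocatedBlocks}} state (the method and class names here are illustrative, not the real HDFS API): {{getFileLength()}}-style values exclude an under-construction last block, so that block's current length is added back when it is incomplete.

```java
// Sketch of the proposed limit for checksumming a snapshot path:
// recorded file length, plus the last block's current length when that
// block is still incomplete, capped by the caller's requested length.
public class SnapshotChecksumLength {
  static long lengthLimit(long requestedLength, long fileLength,
                          boolean lastBlockComplete, long lastBlockLength) {
    long limit = fileLength;
    if (!lastBlockComplete) {
      // fileLength excludes the under-construction block; add it back.
      limit += lastBlockLength;
    }
    return Math.min(requestedLength, limit);
  }

  public static void main(String[] args) {
    // Complete last block: the limit is just the recorded file length.
    System.out.println(lengthLimit(Long.MAX_VALUE, 1024, true, 0));    // 1024
    // Incomplete last block: include its current length in the limit.
    System.out.println(lengthLimit(Long.MAX_VALUE, 1024, false, 512)); // 1536
  }
}
```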
> {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes
> non-snapshotted content
> ------------------------------------------------------------------------------------------------
>
> Key: HDFS-7682
> URL: https://issues.apache.org/jira/browse/HDFS-7682
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.7.0
> Reporter: Charles Lamb
> Assignee: Charles Lamb
> Attachments: HDFS-7682.000.patch
>
>
> DistributedFileSystem#getFileChecksum of a snapshotted file includes
> non-snapshotted content.
> This happens because DistributedFileSystem#getFileChecksum simply combines
> the CRCs of all of the blocks in the file. But, in the case of a snapshotted
> file, the checksum should not include data that was appended to the last
> block in the file after the snapshot was taken.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)