[ https://issues.apache.org/jira/browse/HDFS-9613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15083958#comment-15083958 ]
Jing Zhao commented on HDFS-9613:
---------------------------------
Thanks for the improvement, Kai. One question about the patch:
{code}
/**
* Only when checksum opt and block size are preserved while copying, do the
* file checksums comparing, to avoid unnecessary checksum computing for
* better performance.
*/
{code}
I'm not sure if this is correct if the source/target filesystems are not
DistributedFileSystem, or if we use a new file checksum computation algorithm
(e.g., HDFS-8430) which does not require the same block size.
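
To make the concern concrete, here is a rough sketch of the kind of decision I have in mind. It is not taken from the patch and does not use the Hadoop API: the class {{ChecksumGateSketch}}, the enum {{ChecksumKind}} and the method {{shouldCompareChecksums}} are all hypothetical names, and the three checksum categories are my own simplification. The point is only that the "block size preserved" condition matters when the checksum algorithm depends on block size, and should not suppress the comparison otherwise.

```java
// Hypothetical sketch: none of these names come from the HDFS-9613 patch
// or from the Hadoop API. It only illustrates the gating logic in question.
class ChecksumGateSketch {

    /** Assumed categories for how a filesystem's file checksum behaves. */
    enum ChecksumKind {
        BLOCK_SIZE_DEPENDENT,    // checksum value changes with block size
        BLOCK_SIZE_INDEPENDENT,  // checksum value does not depend on block size
        UNAVAILABLE              // filesystem exposes no file checksum
    }

    /**
     * Decide whether comparing source and target file checksums is meaningful.
     */
    static boolean shouldCompareChecksums(ChecksumKind source,
                                          ChecksumKind target,
                                          boolean checksumOptPreserved,
                                          boolean blockSizePreserved) {
        // Nothing to compare if either side cannot produce a checksum.
        if (source == ChecksumKind.UNAVAILABLE
                || target == ChecksumKind.UNAVAILABLE) {
            return false;
        }
        // Different kinds of checksum are not comparable with each other.
        if (source != target) {
            return false;
        }
        // A block-size-independent checksum can be compared even when the
        // block size was not preserved during the copy.
        if (source == ChecksumKind.BLOCK_SIZE_INDEPENDENT) {
            return true;
        }
        // Block-size-dependent checksums only match when both the checksum
        // options and the block size were preserved.
        return checksumOptPreserved && blockSizePreserved;
    }

    public static void main(String[] args) {
        // Block size not preserved, but the checksum does not depend on it.
        System.out.println(shouldCompareChecksums(
                ChecksumKind.BLOCK_SIZE_INDEPENDENT,
                ChecksumKind.BLOCK_SIZE_INDEPENDENT,
                true, false));
        // Block size not preserved and the checksum depends on it.
        System.out.println(shouldCompareChecksums(
                ChecksumKind.BLOCK_SIZE_DEPENDENT,
                ChecksumKind.BLOCK_SIZE_DEPENDENT,
                true, false));
    }
}
```

With a gate keyed only on "checksum opt and block size preserved", the first case in {{main}} would skip the comparison even though it would still be valid, which is the situation I am asking about.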
> Some improvement and clean up in distcp
> ---------------------------------------
>
> Key: HDFS-9613
> URL: https://issues.apache.org/jira/browse/HDFS-9613
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Kai Zheng
> Assignee: Kai Zheng
> Priority: Minor
> Attachments: HDFS-9613-v1.patch, HDFS-9613-v2.patch
>
>
> While working on a related issue, we noticed some places in
> {{distcp}} that could be improved and cleaned up. In particular, after a
> file is copied to the target cluster, distcp checks whether the copied file
> is intact. That check should compare the block size first and then the
> checksum, because the latter is somewhat expensive to compute.