[ 
https://issues.apache.org/jira/browse/HDFS-9613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15083958#comment-15083958
 ] 

Jing Zhao commented on HDFS-9613:
---------------------------------

Thanks for the improvement, Kai. One question about the patch:
{code}
      /**
       * Only when checksum opt and block size are preserved while copying, do the
       * file checksums comparing, to avoid unnecessary checksum computing for
       * better performance.
       */
{code}

I'm not sure this is correct when the source/target filesystems are not 
DistributedFileSystem, or when we use a new file checksum computation algorithm 
(e.g., HDFS-8430) that does not require the same block size.
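
To make the concern concrete, here is a rough sketch of the kind of check I have in mind. It is not the distcp code or the patch: {{FileInfo}}, {{ChecksumCheckSketch}}, {{checksumsComparable}} and the algorithm labels are all hypothetical names made up for illustration, and it assumes we can tell per algorithm whether the checksum depends on block size.

```java
/** Hypothetical stand-in for the attributes a post-copy check might look at; not a Hadoop type. */
final class FileInfo {
    final long blockSize;
    final String checksumAlgorithm;           // null if the filesystem exposes no file checksum
    final boolean checksumDependsOnBlockSize; // assumption: known per algorithm

    FileInfo(long blockSize, String checksumAlgorithm, boolean checksumDependsOnBlockSize) {
        this.blockSize = blockSize;
        this.checksumAlgorithm = checksumAlgorithm;
        this.checksumDependsOnBlockSize = checksumDependsOnBlockSize;
    }
}

class ChecksumCheckSketch {
    /** Returns true when comparing the two file checksums is meaningful. */
    static boolean checksumsComparable(FileInfo src, FileInfo dst) {
        // A filesystem without a file checksum gives us nothing to compare.
        if (src.checksumAlgorithm == null || dst.checksumAlgorithm == null) {
            return false;
        }
        // Checksums from different algorithms can never match.
        if (!src.checksumAlgorithm.equals(dst.checksumAlgorithm)) {
            return false;
        }
        // Block size only matters when the algorithm's result depends on it.
        return !src.checksumDependsOnBlockSize || src.blockSize == dst.blockSize;
    }

    public static void main(String[] args) {
        FileInfo blockBased128 = new FileInfo(128L << 20, "block-based", true);
        FileInfo blockBased64 = new FileInfo(64L << 20, "block-based", true);
        FileInfo independent128 = new FileInfo(128L << 20, "block-independent", false);
        FileInfo independent64 = new FileInfo(64L << 20, "block-independent", false);

        // Block-size-dependent algorithm with different block sizes.
        System.out.println(checksumsComparable(blockBased128, blockBased64));
        // Block-size-independent algorithm with different block sizes.
        System.out.println(checksumsComparable(independent128, independent64));
    }
}
```

The point of the sketch is that gating the comparison on block size alone would skip the second case, even though the checksums there would still be comparable.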

> Some improvement and clean up in distcp
> ---------------------------------------
>
>                 Key: HDFS-9613
>                 URL: https://issues.apache.org/jira/browse/HDFS-9613
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Kai Zheng
>            Assignee: Kai Zheng
>            Priority: Minor
>         Attachments: HDFS-9613-v1.patch, HDFS-9613-v2.patch
>
>
> While working on a related issue, we noticed some places in 
> {{distcp}} that could be improved and cleaned up. In particular, after a 
> file is copied to the target cluster, distcp checks whether the copied file 
> is intact. When checking, it is better to compare the block size first and 
> then the checksum, because the latter is somewhat expensive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
