[ https://issues.apache.org/jira/browse/HADOOP-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17078365#comment-17078365 ]

Steve Loughran commented on HADOOP-8143:
----------------------------------------

bq. it surprises me that block-size preservation isn't turned off when {{-skipCrcCheck && !(-pb)}}

It is, but since -pb is always set, the preservation always takes place. That is the fundamental issue with the regression this patch triggers.

Someone may actually want to preserve blocksize on a copy even without 
checksums; it'd be a regression if that suddenly went away.

Thoughts:

Use hasPathCapability() to probe source and dest for "really" having a block size, rather than a fake one (i.e. hdfs, webhdfs == true; false for the rest). Same for replication.

Block size is only preserved if -pb is set explicitly, or if it is on by default and both source and dest really support it.

I don't know what this means for maprfs; it may be better to go the other way and have a path capability "blocksize.simulated", which we'd set to true for the hadoop cloud storage connectors and google gcs.
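A minimal sketch of that decision logic, in plain Java. The capability probes are modeled as booleans here; in real DistCp code they would come from FileSystem.hasPathCapability() on the source and destination paths, and the FsCaps helper class is hypothetical, not an existing Hadoop type:

```java
public class BlocksizeDecisionSketch {

    // Hypothetical probe results; in DistCp these would come from
    // FileSystem.hasPathCapability() on the source and destination.
    static final class FsCaps {
        final boolean realBlocksize;      // hdfs, webhdfs: true
        final boolean simulatedBlocksize; // cloud connectors: true

        FsCaps(boolean real, boolean simulated) {
            this.realBlocksize = real;
            this.simulatedBlocksize = simulated;
        }
    }

    /**
     * Preserve block size only if -pb was given explicitly, or if it is
     * merely on by default and both ends really support block sizes.
     */
    static boolean preserveBlocksize(boolean pbExplicit, FsCaps src, FsCaps dst) {
        if (pbExplicit) {
            return true; // user asked for it: keep preserving, no regression
        }
        return src.realBlocksize && dst.realBlocksize;
    }

    public static void main(String[] args) {
        FsCaps hdfs = new FsCaps(true, false);
        FsCaps cloud = new FsCaps(false, true);
        System.out.println(preserveBlocksize(false, hdfs, hdfs));  // true
        System.out.println(preserveBlocksize(false, hdfs, cloud)); // false
        System.out.println(preserveBlocksize(true, hdfs, cloud));  // true
    }
}
```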

> Change distcp to have -pb on by default
> ---------------------------------------
>
>                 Key: HADOOP-8143
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8143
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Dave Thompson
>            Assignee: Mithun Radhakrishnan
>            Priority: Minor
>             Fix For: 3.0.0-alpha4
>
>         Attachments: HADOOP-8143.1.patch, HADOOP-8143.2.patch, 
> HADOOP-8143.3.patch
>
>
> We should have preserve blocksize (-pb) on by default in distcp. The
> checksum check, which is on by default, will always fail if the block sizes differ.



