[
https://issues.apache.org/jira/browse/HADOOP-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17078365#comment-17078365
]
Steve Loughran commented on HADOOP-8143:
----------------------------------------
bq. it surprises me that block-size preservation isn't turned off when `-skipCrcCheck && !-pb`
It is, but because -pb is now always set by default, the preservation always
takes place. That is the fundamental issue with the regression this patch is
triggering.
Someone may actually want to preserve blocksize on a copy even without
checksums; it'd be a regression if that suddenly went away.
Thoughts:
hasPathCapability() to probe source and dest for "really" having a block
size, rather than a simulated one (i.e. hdfs and webhdfs == true; false for
the rest). Same for replication too.
blocksize is only updated if -pb is set explicitly, or if it is merely on by
default and both source and dest really support it.
I don't know what this means for maprfs; it may be better to go the other way
and have a path capability "blocksize.simulated" which we'd set to true for
the hadoop cloud store connectors and google gcs.
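The decision rule above could be sketched roughly as follows. This is only an
illustration, not a patch: the capability key "fs.capability.blocksize.real" is
hypothetical, and a minimal CapabilityProbe interface stands in for Hadoop's
FileSystem.hasPathCapability() so the sketch is self-contained.

```java
public class BlockSizePolicy {

    /** Minimal stand-in for org.apache.hadoop.fs.PathCapabilities. */
    interface CapabilityProbe {
        boolean hasPathCapability(String capability);
    }

    /** Hypothetical capability key; not an existing Hadoop constant. */
    static final String REAL_BLOCKSIZE = "fs.capability.blocksize.real";

    /**
     * Preserve block size when -pb was passed explicitly, or when it is
     * merely the default AND both filesystems report a real (non-simulated)
     * block size.
     */
    static boolean shouldPreserveBlockSize(boolean pbExplicit,
                                           boolean pbDefault,
                                           CapabilityProbe source,
                                           CapabilityProbe dest) {
        if (pbExplicit) {
            // The user asked for it; honour it even without checksums.
            return true;
        }
        return pbDefault
            && source.hasPathCapability(REAL_BLOCKSIZE)
            && dest.hasPathCapability(REAL_BLOCKSIZE);
    }

    public static void main(String[] args) {
        // hdfs-like store: reports a real block size; object store: does not.
        CapabilityProbe hdfs = cap -> REAL_BLOCKSIZE.equals(cap);
        CapabilityProbe objectStore = cap -> false;

        System.out.println(shouldPreserveBlockSize(false, true, hdfs, hdfs));        // true
        System.out.println(shouldPreserveBlockSize(false, true, hdfs, objectStore)); // false
        System.out.println(shouldPreserveBlockSize(true,  true, hdfs, objectStore)); // true
    }
}
```

An inverted "blocksize.simulated" capability, as suggested for maprfs and the
cloud connectors, would just flip the probe's polarity in the last conjunct.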
> Change distcp to have -pb on by default
> ---------------------------------------
>
> Key: HADOOP-8143
> URL: https://issues.apache.org/jira/browse/HADOOP-8143
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Dave Thompson
> Assignee: Mithun Radhakrishnan
> Priority: Minor
> Fix For: 3.0.0-alpha4
>
> Attachments: HADOOP-8143.1.patch, HADOOP-8143.2.patch,
> HADOOP-8143.3.patch
>
>
> We should have the preserve blocksize (-pb) on in distcp by default.
> The checksum check, which is on by default, will always fail if the block sizes are not the same.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)