[ https://issues.apache.org/jira/browse/HADOOP-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189345#comment-14189345 ]
Mithun Radhakrishnan commented on HADOOP-8143:
----------------------------------------------
[~aw]
bq. forcing block size will break non-HDFS methods in surprising ways.
Here's the code in DistCp that is affected by preserving block-size:
{code:java}
private static long getBlockSize(
    EnumSet<FileAttribute> fileAttributes,
    FileStatus sourceFile, FileSystem targetFS, Path tmpTargetPath) {
  boolean preserve = fileAttributes.contains(FileAttribute.BLOCKSIZE)
      || fileAttributes.contains(FileAttribute.CHECKSUMTYPE);
  return preserve ? sourceFile.getBlockSize() : targetFS
      .getDefaultBlockSize(tmpTargetPath);
}
{code}
Is the concern that {{FileStatus.getBlockSize()}} might fail if the source
file isn't on HDFS? If so, note that when block size isn't preserved,
{{FileSystem.getDefaultBlockSize()}} is already being called by default, most
likely for non-HDFS file systems as well.
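To make the two branches concrete, here is a minimal standalone sketch of the same selection logic. It is not DistCp code: {{BlockSizeSketch}} and {{chooseBlockSize}} are hypothetical names, the enum is a stand-in with only the two values used above, and the block sizes are made-up numbers passed in as plain longs rather than read from a {{FileStatus}} or {{FileSystem}}.

```java
import java.util.EnumSet;

public class BlockSizeSketch {
    // Stand-in for DistCp's FileAttribute enum; only the two values used here.
    enum FileAttribute { BLOCKSIZE, CHECKSUMTYPE }

    // Hypothetical helper mirroring the selection in getBlockSize() above.
    // The source size and the target default are passed in directly, so no
    // Hadoop classes are needed to run this.
    static long chooseBlockSize(EnumSet<FileAttribute> attrs,
                                long sourceBlockSize,
                                long targetDefaultBlockSize) {
        boolean preserve = attrs.contains(FileAttribute.BLOCKSIZE)
            || attrs.contains(FileAttribute.CHECKSUMTYPE);
        return preserve ? sourceBlockSize : targetDefaultBlockSize;
    }

    public static void main(String[] args) {
        long source = 128L * 1024 * 1024;        // made-up source block size
        long targetDefault = 32L * 1024 * 1024;  // made-up target default

        // Nothing preserved: the target file system's default is used.
        System.out.println(chooseBlockSize(
            EnumSet.noneOf(FileAttribute.class), source, targetDefault));

        // Block size preserved: the source file's value is used.
        System.out.println(chooseBlockSize(
            EnumSet.of(FileAttribute.BLOCKSIZE), source, targetDefault));
    }
}
```

Either way the value comes from a generic file-system call, so turning -pb on by default only changes which side is asked for the block size.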
> Change distcp to have -pb on by default
> ---------------------------------------
>
> Key: HADOOP-8143
> URL: https://issues.apache.org/jira/browse/HADOOP-8143
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Dave Thompson
> Assignee: Mithun Radhakrishnan
> Priority: Minor
> Attachments: HADOOP-8143.1.patch
>
>
> We should have preserve blocksize (-pb) on in distcp by default.
> Checksum, which is on by default, will always fail if the blocksize is not the same.