[ https://issues.apache.org/jira/browse/HADOOP-13145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284328#comment-15284328 ]
Steve Loughran commented on HADOOP-13145:
-----------------------------------------
You know, I think s3a now has enough instrumentation that the number of times
{{getFileStatus}} is called would be measurable.
At the very least, it'd be good to have a DistCp test there, to verify that
inconsistency problems aren't surfacing. The examples in, say,
{{TestDistCpViewFs}} show a start, though I'd expect the new tests to simply
let IOExceptions propagate rather than swallow them and fail, the way that
class does (I have just submitted a patch for that in HADOOP-13148).
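
To make the two suggestions above concrete, here is a minimal, self-contained sketch. It does not use the Hadoop or s3a APIs: {{StatusSource}}, {{InMemorySource}}, {{CountingSource}} and {{CountingSourceDemo}} are hypothetical stand-ins invented for illustration. The idea is only that a counting wrapper makes the number of {{getFileStatus}} calls observable, and that a helper declaring {{throws IOException}} lets failures surface directly.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the one file system call of interest.
// It returns only the file length, not a real FileStatus.
interface StatusSource {
    long getFileStatus(String path) throws IOException;
}

// In-memory implementation; a missing path raises IOException.
class InMemorySource implements StatusSource {
    private final Map<String, Long> lengths = new HashMap<>();

    void put(String path, long length) {
        lengths.put(path, length);
    }

    @Override
    public long getFileStatus(String path) throws IOException {
        Long length = lengths.get(path);
        if (length == null) {
            throw new IOException("No such file: " + path);
        }
        return length;
    }
}

// Decorator that counts every getFileStatus call, including failed ones.
class CountingSource implements StatusSource {
    private final StatusSource delegate;
    private final AtomicLong calls = new AtomicLong();

    CountingSource(StatusSource delegate) {
        this.delegate = delegate;
    }

    @Override
    public long getFileStatus(String path) throws IOException {
        calls.incrementAndGet();
        return delegate.getFileStatus(path);
    }

    long callCount() {
        return calls.get();
    }
}

class CountingSourceDemo {
    // Declares IOException so that a failure reaches the caller unchanged
    // instead of being caught and converted into a generic failure.
    static long statAll(CountingSource source, String... paths) throws IOException {
        long total = 0;
        for (String path : paths) {
            total += source.getFileStatus(path);
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        InMemorySource backing = new InMemorySource();
        backing.put("/dst/a", 10L);
        backing.put("/dst/b", 32L);
        CountingSource counted = new CountingSource(backing);
        System.out.println("total bytes = " + statAll(counted, "/dst/a", "/dst/b"));
        System.out.println("getFileStatus calls = " + counted.callCount());
    }
}
```

A real test would assert on whatever counters the s3a instrumentation actually exposes; the wrapper here only illustrates the shape of such a check.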
> In DistCp, prevent unnecessary getFileStatus call when not preserving
> metadata.
> -------------------------------------------------------------------------------
>
> Key: HADOOP-13145
> URL: https://issues.apache.org/jira/browse/HADOOP-13145
> Project: Hadoop Common
> Issue Type: Improvement
> Components: tools/distcp
> Reporter: Chris Nauroth
> Assignee: Chris Nauroth
> Attachments: HADOOP-13145.001.patch
>
>
> After DistCp copies a file, it calls {{getFileStatus}} to get the
> {{FileStatus}} from the destination so that it can compare to the source and
> update metadata if necessary. If the DistCp command was run without the
> option to preserve metadata attributes, then this additional
> {{getFileStatus}} call is wasteful.
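
The optimization described above can be sketched roughly as follows. This is not the attached patch and does not use DistCp's classes: {{Attr}}, {{TargetStore}} and {{CopySketch}} are hypothetical names, and {{getFileStatus}} here returns only a replication factor. The sketch assumes the set of attributes to preserve is known at copy time, so an empty set means the destination lookup can be skipped.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical attribute names; the real DistCp options may differ.
enum Attr { REPLICATION, PERMISSION, USER }

// Hypothetical in-memory destination that counts status lookups.
class TargetStore {
    private final Map<String, Integer> replication = new HashMap<>();
    private int statusCalls = 0;

    // Stands in for writing the copied bytes; new files get replication 1.
    void write(String path) {
        replication.put(path, 1);
    }

    // Simplified getFileStatus: returns only the replication factor.
    int getFileStatus(String path) {
        statusCalls++;
        return replication.getOrDefault(path, 0);
    }

    void setReplication(String path, int value) {
        replication.put(path, value);
    }

    int statusCalls() {
        return statusCalls;
    }
}

class CopySketch {
    static void copy(TargetStore target, String path,
                     int sourceReplication, EnumSet<Attr> preserve) {
        target.write(path);
        if (preserve.isEmpty()) {
            // Nothing to compare against the source, so skip the lookup.
            return;
        }
        int current = target.getFileStatus(path);
        if (preserve.contains(Attr.REPLICATION) && current != sourceReplication) {
            target.setReplication(path, sourceReplication);
        }
    }

    public static void main(String[] args) {
        TargetStore target = new TargetStore();
        copy(target, "/dst/a", 3, EnumSet.noneOf(Attr.class));
        System.out.println("lookups without preserve: " + target.statusCalls());
        copy(target, "/dst/b", 3, EnumSet.of(Attr.REPLICATION));
        System.out.println("lookups with preserve: " + target.statusCalls());
    }
}
```

On an object store such as s3a, each avoided lookup is a remote request, which is why skipping it matters more there than on HDFS.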