[
https://issues.apache.org/jira/browse/MAPREDUCE-6734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15730406#comment-15730406
]
Daniel Templeton commented on MAPREDUCE-6734:
---------------------------------------------
Comments:
* In {{CopyListing}}, I wouldn't make the call to {{adjustPath()}} an
assignment. You're modifying the {{Text}} in the method, so the assignment
just looks suspicious. Same in {{CopyMapper.map()}}.
* In {{DistCpUtils.adjustPath()}}, the javadoc probably shouldn't talk about
keys and values since the method is used in a broader context, e.g. {{CopyListing}}.
* {{adjustPath()}} doesn't tolerate leaving out the leading slash. Omitting it
seems a logical thing for a user to want to do.
* In your tests, please add assert messages, especially for
{{assertTrue()}}/{{assertFalse()}}.
* In the fail message, if it includes an exception, please use {{toString()}}
instead of {{getMessage()}}.
* There's a lot of overlap between the tests in {{TestCopyMapper}}. Think you
can extract the common logic into a shared method? Also, it would be nice to
add some cleanup for the created files. I don't see any other methods doing
that, but maybe you'll start a trend!
* In the {{TestCopyListing}} test, you don't need the first
{{TestDistCpUtils.delete(fs, "/tmp");}}. I'd also make the
{{InvalidInputException}} expected instead of catching and ignoring it.
* In the {{TestDistCpUtils}} test, I think you can move that repeated logic
into a separate method and call that four times.
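
To illustrate the {{adjustPath()}} points above, here is a minimal sketch of an in-place path helper that returns void and tolerates a missing leading slash. Everything in it is hypothetical: {{StringBuilder}} stands in for Hadoop's mutable {{Text}}, and {{stripPrefix()}} is an invented name whose behavior is assumed for illustration, not taken from the patch.

```java
// Hypothetical stand-in: StringBuilder plays the role of Hadoop's mutable Text,
// and stripPrefix() is NOT the patch's adjustPath(); it only shows the pattern.
final class InPlacePathSketch {

  // Removes `prefix` from the front of `path` in place. Returns void so that
  // callers are not tempted to write `path = stripPrefix(path, ...)`.
  // A prefix given without its leading slash ("data") is treated like "/data".
  static void stripPrefix(StringBuilder path, String prefix) {
    String normalized = prefix.startsWith("/") ? prefix : "/" + prefix;
    // Match only on a whole path component: "/data" must not strip "/database".
    boolean wholeComponent = path.indexOf(normalized) == 0
        && (path.length() == normalized.length()
            || path.charAt(normalized.length()) == '/');
    if (wholeComponent) {
      path.delete(0, normalized.length());
    }
  }

  public static void main(String[] args) {
    StringBuilder path = new StringBuilder("/data/2016/part-0");
    stripPrefix(path, "data");   // no assignment: the argument itself changes
    System.out.println(path);    // prints "/2016/part-0"
  }
}
```

With a void return type, the call site makes it obvious that the argument is mutated, which is the reason the assignment form looks suspicious.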
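
On the {{toString()}} versus {{getMessage()}} point, the difference is easy to see with plain JDK exceptions. The sketch below uses only standard library classes; the helper names and the "copy failed" label are made up for illustration and do not come from the patch or its tests.

```java
import java.io.FileNotFoundException;

final class FailureMessageSketch {

  // Builds the text a test might pass to fail(); `context` is a hypothetical label.
  static String withGetMessage(String context, Throwable t) {
    return context + ": " + t.getMessage();
  }

  static String withToString(String context, Throwable t) {
    return context + ": " + t;  // string concatenation calls t.toString()
  }

  public static void main(String[] args) {
    // An exception created without a message has a null getMessage().
    Throwable noMessage = new IllegalStateException();
    System.out.println(withGetMessage("copy failed", noMessage));
    // prints "copy failed: null"
    System.out.println(withToString("copy failed", noMessage));
    // prints "copy failed: java.lang.IllegalStateException"

    // With only a path as the message, getMessage() hides what went wrong.
    Throwable pathOnly = new FileNotFoundException("/tmp/src");
    System.out.println(withGetMessage("copy failed", pathOnly));
    // prints "copy failed: /tmp/src"
    System.out.println(withToString("copy failed", pathOnly));
    // prints "copy failed: java.io.FileNotFoundException: /tmp/src"
  }
}
```

The {{toString()}} form always includes the exception class, so the failure stays diagnosable even when the message is empty or is only a path.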
> Add option to distcp to preserve file path structure of source files at the
> destination
> ---------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-6734
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6734
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: distcp
> Affects Versions: 3.0.0-alpha2
> Environment: Software platform
> Reporter: Frederick Tucker
> Priority: Critical
> Labels: distcp, newbie, patch
> Fix For: 3.0.0-alpha2
>
> Attachments: MAPREDUCE-6734.3.0.0-alpha2.patch,
> MAPREDUCE-6734.3.0.0-alpha2.patch
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> When copying files using distcp with globbed source files, all the matched
> files in the glob are copied in a single flat directory. This causes
> problems when the file structure at the source is important. It also is an
> issue when there are two files matched in the glob with the same name because
> it causes a duplicate file error at the target. I'd like to have an option
> to preserve the file structure of the source files when globbing inputs.
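
The duplicate-name problem described above can be modeled in a few lines. This is a toy sketch, not distcp code: the paths, the {{/backup}} target root, and both mapping functions are hypothetical, and it assumes that "preserving structure" means keeping the full source path under the target root, which the issue text does not specify.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class GlobTargetSketch {

  // Flat layout: keep only the last path component under the target root.
  static String flatTarget(String targetRoot, String source) {
    return targetRoot + source.substring(source.lastIndexOf('/'));
  }

  // Structure-preserving layout (assumed): keep the whole source path.
  static String preservedTarget(String targetRoot, String source) {
    return targetRoot + source;
  }

  // Returns every target that is produced by more than one source.
  static List<String> collisions(List<String> targets) {
    Set<String> seen = new HashSet<>();
    List<String> dupes = new ArrayList<>();
    for (String target : targets) {
      if (!seen.add(target)) {
        dupes.add(target);
      }
    }
    return dupes;
  }

  public static void main(String[] args) {
    // Two files matched by a hypothetical glob such as /logs/*/part-0.
    List<String> sources = Arrays.asList("/logs/a/part-0", "/logs/b/part-0");
    List<String> flat = new ArrayList<>();
    List<String> preserved = new ArrayList<>();
    for (String source : sources) {
      flat.add(flatTarget("/backup", source));
      preserved.add(preservedTarget("/backup", source));
    }
    System.out.println(collisions(flat));       // prints "[/backup/part-0]"
    System.out.println(collisions(preserved));  // prints "[]"
  }
}
```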