[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15730406#comment-15730406
 ] 

Daniel Templeton commented on MAPREDUCE-6734:
---------------------------------------------

Comments:

* In {{CopyListing}}, I wouldn't make the call to {{adjustPath()}} an 
assignment.  You're modifying the {{Text}} in the method, so the assignment 
just looks suspicious.  Same in {{CopyMapper.map()}}.
* In {{DistCpUtils.adjustPath()}}, the javadoc probably shouldn't talk about 
keys and values since it's used in a broader context, e.g. {{CopyListing}}.
* {{adjustPath()}} doesn't tolerate leaving out the leading slash.  Omitting 
it seems like a logical thing for a user to want to do.
* In your tests, please add assert messages, especially for 
{{assertTrue()}}/{{assertFalse()}}.
* In the fail message, if it includes an exception, please use {{toString()}} 
instead of {{getMessage()}}.
* There's a lot of overlap between the tests in {{TestCopyMapper}}.  Do you 
think you can extract the common logic into a shared method?  Also, it would 
be nice to add some cleanup for the created files.  I don't see any other 
methods doing that, but maybe you'll start a trend!
* In the {{TestCopyListing}} test, you don't need the first 
{{TestDistCpUtils.delete(fs, "/tmp");}}.  I'd also make the 
{{InvalidInputException}} expected instead of catching and ignoring it.
* In the {{TestDistCpUtils}} test, I think you can move that repeated logic 
into a separate method and call that four times.
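
On the {{toString()}} versus {{getMessage()}} point above, here is a small 
plain-JDK sketch of the difference.  The class and helper names 
({{FailMessageDemo}}, {{withGetMessage()}}, {{withToString()}}) are made up 
for illustration and are not from the patch; the strings they build stand in 
for whatever a test would pass to its fail call.

```java
import java.io.IOException;

public class FailMessageDemo {

  // Hypothetical helper: builds a failure message from getMessage().
  // getMessage() drops the exception class and may be null.
  static String withGetMessage(Exception e) {
    return "Unexpected exception: " + e.getMessage();
  }

  // Hypothetical helper: builds a failure message from toString().
  // toString() always includes the exception class name.
  static String withToString(Exception e) {
    return "Unexpected exception: " + e.toString();
  }

  public static void main(String[] args) {
    Exception noMessage = new IllegalStateException();
    Exception withMessage = new IOException("disk full");

    // prints "Unexpected exception: null"
    System.out.println(withGetMessage(noMessage));
    // prints "Unexpected exception: java.lang.IllegalStateException"
    System.out.println(withToString(noMessage));

    // prints "Unexpected exception: disk full"
    System.out.println(withGetMessage(withMessage));
    // prints "Unexpected exception: java.io.IOException: disk full"
    System.out.println(withToString(withMessage));
  }
}
```

With {{getMessage()}}, an exception created without a message shows up in 
the test report as just "null", which is why {{toString()}} tends to be the 
safer choice in a failure message.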

> Add option to distcp to preserve file path structure of source files at the 
> destination
> ---------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6734
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6734
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: distcp
>    Affects Versions: 3.0.0-alpha2
>         Environment: Software platform
>            Reporter: Frederick Tucker
>            Priority: Critical
>              Labels: distcp, newbie, patch
>             Fix For: 3.0.0-alpha2
>
>         Attachments: MAPREDUCE-6734.3.0.0-alpha2.patch, 
> MAPREDUCE-6734.3.0.0-alpha2.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> When copying files using distcp with globbed source files, all the matched 
> files in the glob are copied in a single flat directory.  This causes 
> problems when the file structure at the source is important.  It also is an 
> issue when there are two files matched in the glob with the same name because 
> it causes a duplicate file error at the target.  I'd like to have an option 
> to preserve the file structure of the source files when globbing inputs.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
