[ 
https://issues.apache.org/jira/browse/HDFS-6152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13947091#comment-13947091
 ] 

Yongjun Zhang commented on HDFS-6152:
-------------------------------------

The earlier test failure of

 -1 core tests. The patch failed these unit tests in hadoop-tools/hadoop-distcp:
org.apache.hadoop.tools.mapred.TestUniformSizeInputFormat
Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6495//testReport/

is now passing, with the same version of the patch, in
https://builds.apache.org/job/PreCommit-HDFS-Build/6496//testReport/

Since the earlier failure of TestUniformSizeInputFormat was about too many
open files, could anyone who reviews the patch please check whether the newly
added entry in the file listing (described in "notable changes #2") could
contribute to that? It may just have been a glitch on the node that ran build
6495. Thanks a lot.


> distcp V2 doesn't preserve root dir's attributes when -p is specified
> ---------------------------------------------------------------------
>
>                 Key: HDFS-6152
>                 URL: https://issues.apache.org/jira/browse/HDFS-6152
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.3.0
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-6152.001.patch, HDFS-6152.002.patch, 
> HDFS-6152.002.patch
>
>
> Two issues were observed with distcpV2:
> ISSUE 1. when copying a source dir to target dir with "-pu" option using 
> command 
>   "distcp -pu source-dir target-dir"
>  
> The source dir's owner is not preserved at the target dir. Similarly, other
> attributes of the source dir are not preserved. They are supposed to be
> preserved when neither -update nor -overwrite is specified.
> There are two scenarios with the above command:
> a. When target-dir already exists, issuing the above command results in
> target-dir/source-dir (source-dir here refers to the last component of the
> source-dir path on the command line) on the target file system, with all
> contents of source-dir copied under target-dir/source-dir. The issue in this
> case is that the attributes of source-dir are not preserved.
> b. When target-dir doesn't exist, the command results in target-dir with all
> contents of source-dir copied under target-dir. The issue in this case is
> that the attributes of source-dir are not carried over to target-dir.
> For multiple source cases, e.g., command 
>   "distcp -pu source-dir1 source-dir2 target-dir"
> No matter whether the target-dir exists or not, the multiple sources are 
> copied to under the target dir (target-dir is created if it didn't exist). 
> And their attributes are preserved. 
> ISSUE 2. with the following command:
>   "distcp source-dir target-dir"
> when source-dir is an empty directory and target-dir doesn't exist,
> source-dir is not copied; the command actually behaves like a no-op. However,
> when source-dir is not empty, it is copied, resulting in target-dir on the
> target file system containing a copy of source-dir's children.
> To be consistent, an empty source dir should be copied too. Basically, the
> above distcp command should cause target-dir to be created on the target file
> system, with source-dir's attributes preserved at target-dir when -p is passed.
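
For reviewers, the target layout described in the quoted report (no -update,
no -overwrite) can be summarized with a small model. This is only an
illustrative sketch of the behavior as described above, not DistCp code; the
function name resolve_targets and the example paths are hypothetical.

```python
from typing import Dict, List


def resolve_targets(sources: List[str], target: str,
                    target_exists: bool) -> Dict[str, str]:
    """Map each source dir to the target dir that receives its contents.

    Models the no -update / no -overwrite cases from the issue description.
    Per the report, the mapped target dir is also where the source dir's
    attributes should end up when -p is passed, and it should be created
    even when the source dir is empty (ISSUE 2).
    """
    def last_component(path: str) -> str:
        # Last path component, e.g. "/a/src" -> "src".
        return path.rstrip("/").rsplit("/", 1)[-1]

    if len(sources) == 1 and not target_exists:
        # Scenario b: target-dir is created and receives source-dir's contents.
        return {sources[0]: target}

    # Scenario a (single source, target exists) and the multiple-source case:
    # each source lands in target-dir/<last component of the source path>.
    base = target.rstrip("/")
    return {src: base + "/" + last_component(src) for src in sources}
```

For example, under this model a single source /a/src with an existing target
/b/dst maps to /b/dst/src, while the same source with a non-existent target
maps to /b/dst itself.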



--
This message was sent by Atlassian JIRA
(v6.2#6252)
