[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085543#comment-14085543
 ] 

Andrew Wang commented on MAPREDUCE-6007:
----------------------------------------

Hi Charles, thanks for the patch,

I had to take notes while reviewing this patch, the behavior is kind of 
complicated. We have a variety of flags that can be specified, and the 
destination FS can have different levels of support. It'd be very useful to 
specify this behavior in gory detail in the DistCp documentation.

Check me on this though:

Options:

{noformat}
-px : preserve raw and non-raw xattrs
-pr : no xattrs are preserved
-p  : preserve raw xattrs
-pxr: preserve non-raw xattrs
    : no xattrs are preserved
{noformat}

Behavior with a given src and dst, varying levels of dst support:

* raw src, raw dst: the options apply as specified above
* raw src, not-raw dst, dst supports xattrs but no {{/reserved/.raw}}: we will 
fail to set raw xattrs at runtime.
* raw src, dst doesn't support xattrs: if {{-pX}} is specified, throws an 
exception. Else, silently discards raw xattrs.

Some discussion on the above:
* If the src is {{/reserved/.raw}}, the user is expecting preservation of raw 
xattrs when {{-p}} or {{-pX}} is specified. In this scenario, we should test 
that the dest is {{/.reserved/raw}} and that it's present on the dstFS.
* There might be other weird cases, haven't thought through all of them

Some code review comments:

Misc:
- We have both {{noPreserveRaw}} and {{preserveRaw}} booleans, can we 
standardize on one everywhere? I'd like a negative one, call it {{disableRaw}} 
or {{excludeRaw}} since it better captures the meaning of the flag. {{exclude}} 
feels a bit better IMO, but it looks like {{-pe}} is taken.
- What's the expected behavior when the dest doesn't support xattrs or reserved 
raw, or supports xattrs but not reserved raw?
- CopyListing, this is where we'd also test to see if the destFS has a 
/.reserved/raw directory
- CopyMapper, two periods in the block comment

Documentation:
- I don't want to tie raw preservation just to encryption since we might also 
use it for compression, how about this instead:
{quote}
d: disable preservation of raw namespace extended attributes
...
raw namespace extended attributes are preserved by default if supported. 
Specifying -pd disables preservation of these xattrs.
{quote}
- As noted above, it'd be good to have the expected preservation behavior laid 
out in the distcp documentation.

DistCp:
{code}
    if (!Path.getPathWithoutSchemeAndAuthority(target).toString().
{code}
What if the target is a relative path here?

Test:
- Any reason this isn't part of the existing XAttr test? They seem pretty 
similar, and you also added a PXD test to the existing test.
- Don't need to do makeFilesAndDirs inO the BeforeClass
- Doesn't there need to be a non-raw attribute set so you can test some of 
these combinations?
- Can we test what happens when the dest FS doesn't support xattrs or raw 
xattrs?

> Create a new option for distcp -p which causes raw.* namespace extended 
> attributes to not be preserved
> ------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6007
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6007
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: distcp
>    Affects Versions: fs-encryption
>            Reporter: Charles Lamb
>            Assignee: Charles Lamb
>         Attachments: MAPREDUCE-6007.001.patch
>
>
> As part of the Data at Rest Encryption work (HDFS-6134), we need to create a 
> new option for distcp which causes raw.* namespace extended attributes to not 
> be preserved. See the doc in HDFS-6509 for details. The default for this 
> option will be to preserve raw.* xattrs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to