[ https://issues.apache.org/jira/browse/MAPREDUCE-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085543#comment-14085543 ]
Andrew Wang commented on MAPREDUCE-6007: ---------------------------------------- Hi Charles, thanks for the patch, I had to take notes while reviewing this patch, the behavior is kind of complicated. We have a variety of flags that can be specified, and the destination FS can have different levels of support. It'd be very useful to specify this behavior in gory detail in the DistCp documentation. Check me on this though: Options: {noformat} -px : preserve raw and non-raw xattrs -pr : no xattrs are preserved -p : preserve raw xattrs -pxr: preserve non-raw xattrs : no xattrs are preserved {noformat} Behavior with a given src and dst, varying levels of dst support: * raw src, raw dst: the options apply as specified above * raw src, not-raw dst, dst supports xattrs but no {{/reserved/.raw}}: we will fail to set raw xattrs at runtime. * raw src, dst doesn't support xattrs: if {{-pX}} is specified, throws an exception. Else, silently discards raw xattrs. Some discussion on the above: * If the src is {{/reserved/.raw}}, the user is expecting preservation of raw xattrs when {{-p}} or {{-pX}} is specified. In this scenario, we should test that the dest is {{/.reserved/raw}} and that it's present on the dstFS. * There might be other weird cases, haven't thought through all of them Some code review comments: Misc: - We have both {{noPreserveRaw}} and {{preserveRaw}} booleans, can we standardize on one everywhere? I'd like a negative one, call it {{disableRaw}} or {{excludeRaw}} since it better captures the meaning of the flag. {{exclude}} feels a bit better IMO, but it looks like {{-pe}} is taken. - What's the expected behavior when the dest doesn't support xattrs or reserved raw, or supports xattrs but not reserved raw? - CopyListing, this is where we'd also test to see if the destFS has a /.reserved/raw directory - CopyMapper, two periods in the block comment Documentation: - I don't want to tie raw preservation just to encryption since we might also use it for compression, how about this instead: {quote} d: disable preservation of raw namespace extended attributes ... raw namespace extended attributes are preserved by default if supported. Specifying -pd disables preservation of these xattrs. {quote} - As noted above, it'd be good to have the expected preservation behavior laid out in the distcp documentation. DistCp: {code} if (!Path.getPathWithoutSchemeAndAuthority(target).toString(). {code} What if the target is a relative path here? Test: - Any reason this isn't part of the existing XAttr test? They seem pretty similar, and you also added a PXD test to the existing test. - Don't need to do makeFilesAndDirs inO the BeforeClass - Doesn't there need to be a non-raw attribute set so you can test some of these combinations? - Can we test what happens when the dest FS doesn't support xattrs or raw xattrs? > Create a new option for distcp -p which causes raw.* namespace extended > attributes to not be preserved > ------------------------------------------------------------------------------------------------------ > > Key: MAPREDUCE-6007 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6007 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Components: distcp > Affects Versions: fs-encryption > Reporter: Charles Lamb > Assignee: Charles Lamb > Attachments: MAPREDUCE-6007.001.patch > > > As part of the Data at Rest Encryption work (HDFS-6134), we need to create a > new option for distcp which causes raw.* namespace extended attributes to not > be preserved. See the doc in HDFS-6509 for details. The default for this > option will be to preserve raw.* xattrs. -- This message was sent by Atlassian JIRA (v6.2#6252)