[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated MAPREDUCE-6007:
------------------------------------

    Attachment: MAPREDUCE-6007.002.patch

{code}

-px : preserve raw and non-raw xattrs
-pr : no xattrs are preserved
-p  : preserve raw xattrs
-pxr: preserve non-raw xattrs
    : no xattrs are preserved

{code}

Yes, this is almost all correct. Assuming that -pr (above) is the same as -pd 
in the patch, then the only typo above is the last line which should be "no raw 
xattrs are preserved", but I think you had the "raw" part implied so I throw 
that in just to be crystal clear.

{quote}
    raw src, raw dst: the options apply as specified above
    raw src, not-raw dst, dst supports xattrs but no /reserved/.raw: we will 
fail to set raw xattrs at runtime.
    raw src, dst doesn't support xattrs: if -pX is specified, throws an 
exception. Else, silently discards raw xattrs.
{quote}

bq. If the src is /reserved/.raw, the user is expecting preservation of raw 
xattrs when -p or -pX is specified. In this scenario, we should test that the 
dest is /.reserved/raw and that it's present on the dstFS. There might be other 
weird cases, haven't thought through all of them

Good idea. I've cleaned this up so that anytime there's a /.reserved/raw src 
path present, and either pd or a non- /.reserved/raw target path is specified, 
an exception is thrown (actually a DistCpConstants.INVALID_ARGUMENT exit code 
is returned and a message logged). This will presumably help admins from 
shooting themselves in the foot.

bq. We have both noPreserveRaw and preserveRaw booleans, can we standardize on 
one everywhere? I'd like a negative one, call it disableRaw or excludeRaw since 
it better captures the meaning of the flag. exclude feels a bit better IMO, but 
it looks like -pe is taken.
 
I changed all of the booleans to be "disableRawXattrs" (vs disableRaw). This 
way it matches the "preserveXattrs" siblings in the code. However, in 
SimpleCopyList#doBuildListing, the boolean is still named copyRawXAttrs to 
mirror its copyAcls and copyXAttrs siblings. It felt wrong to change that to a 
disableRawXattrs. Ditto #traverseNonEmptyDirectory. Not to mention that if we 
negated the sense of the boolean from copyRawXattrs to disableRawXattrs, it 
would hair up the calls to toCopyListFileStatus when they are &&'d with 
child.isDirectory(). It feels like it's cleaner to leave all three booleans in 
the positive in these two cases.

I left the name of the enum the same (DISABLERAWXATTRS) since (1) it is an 
available letter ("d"), and (2) I think you were only suggesting changing the 
name of the booleans in the code. I can change it to DISABLERAW if you prefer.

bq.  What's the expected behavior when the dest doesn't support xattrs or 
reserved raw, or supports xattrs but not reserved raw?

If the dest doesn't support xattrs or raw, then (assuming -px was specified) an 
exception is thrown (per current behavior). If the dest supports xattrs but not 
reserved raw, then it depends on whether reserved raw was specified in the 
source or not. If it was, then an INVALID_ARGUMENT (-1) is returned as the exit 
code. If reserved/raw was not specified on any source path, then silence.

bq.    CopyListing, this is where we'd also test to see if the destFS has a 
/.reserved/raw directory

Yes, I've added a test for src/dst reserved/raw mismatch here.

I've updated the documentation per your suggestions.

{quote}

    if (!Path.getPathWithoutSchemeAndAuthority(target).toString().

What if the target is a relative path here?
{quote}

Check me on this. I convinced myself that a relative path could never be 
relative to /.reserved/raw since you can't set your working directory to that.

bq. Any reason this isn't part of the existing XAttr test? They seem pretty 
similar, and you also added a PXD test to the existing test.

I started out with a combined test, but during development, they diverged 
enough in terms of structure, @BeforeClass, files that they operated on, 
initializing files, directories, xattrs, etc. that it felt cleaner to leave 
them split into two tests.


> Create a new option for distcp -p which causes raw.* namespace extended 
> attributes to not be preserved
> ------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6007
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6007
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: distcp
>    Affects Versions: fs-encryption
>            Reporter: Charles Lamb
>            Assignee: Charles Lamb
>         Attachments: MAPREDUCE-6007.001.patch, MAPREDUCE-6007.002.patch
>
>
> As part of the Data at Rest Encryption work (HDFS-6134), we need to create a 
> new option for distcp which causes raw.* namespace extended attributes to not 
> be preserved. See the doc in HDFS-6509 for details. The default for this 
> option will be to preserve raw.* xattrs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to