[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088300#comment-14088300
 ] 

Andrew Wang commented on MAPREDUCE-6007:
----------------------------------------

bq. the only typo above is the last line which should be "no raw xattrs are 
preserved"

If none of these flags are specified, AFAIK neither non-raw or raw xattrs are 
preserved, i.e. no xattrs. Yes?

bq. I convinced myself that a relative path could never be relative to 
/.reserved/raw since you can't set your working directory to that.

AFAIK you can set your wd to whatever you want, and you can have ".." in 
absolute paths too. We need to make sure that this path is fully normalized if 
we're doing a prefix check. Paths from a FileStatus are normalized, but paths 
coming from the user (like the ones coming out of a DistCpOptions) are suspect. 
setTargetPathExists has one of these suspect checks.

Doc
* This is hard to read, could we expand this into a separate section and a new 
table? I'd particularly like to see a fuller explanation of what happens with 
different dst options.

CopyListing
* Let's improve the InvalidInputException message. Paths don't really "specify" 
something, you could say "starts with" or something instead. We should also 
print the target path.
* I don't quite understand this error either, why is a {{/.r/r}} src and 
{{-pd}} not okay? The exception also mentions the target not starting with 
{{/.r/r}}, but that's not part of the if check.
* Line longer than 80chars
* I expected to see a check that was "if (-p || -px) && !-pd && src is /.r/r, 
then also check that the dst supports xattrs and is /.r/r". I wish there was a 
way to test that it's HDFS too, but looking for dest having /.r/r is probably 
good enough.

CopyMapper
* Can we expand the block comment to say that toCopyListingFileStatus is used 
to filter xattrs, and passing copyXAttrs in twice is okay because we already 
did it earlier? The double passing looks weird, though logically correct.

DistCp:
* I really don't like setting the DISABLERAWXATTRS flag in setTargetPathExists, 
since the expectation is that Options flags are set by the user. This method is 
also not named such that doing this there makes sense. We have the target path 
via the DistCpOptions, so let's be explicit and verbose with the checks 
instead. This is quite possibly why the CopyListing check is confusing to me.
* To expand on the above, -px means preserving all xattrs, while -pxd means 
preserving non-raw xattrs. Then we have {{toCopyListingFileStatus}} where the 
{{preserveXAttrs}} parameter actually means "preserve non-raw xattrs". This is 
also definitely confusing...

DistCpOptionSwitch:
* XATTR is not a standard capitalization style, let's lower case it as xattr 
here. XAttr isn't standard either, but that ship has sailed.

Test
* I'd like tests for weird src and dst paths, i.e. relative or containing ".."s
* We could also test the "no preserve flags" behavior, that no xattrs at all 
are preserved.

> Create a new option for distcp -p which causes raw.* namespace extended 
> attributes to not be preserved
> ------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6007
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6007
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: distcp
>    Affects Versions: fs-encryption
>            Reporter: Charles Lamb
>            Assignee: Charles Lamb
>         Attachments: MAPREDUCE-6007.001.patch, MAPREDUCE-6007.002.patch
>
>
> As part of the Data at Rest Encryption work (HDFS-6134), we need to create a 
> new option for distcp which causes raw.* namespace extended attributes to not 
> be preserved. See the doc in HDFS-6509 for details. The default for this 
> option will be to preserve raw.* xattrs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to