[ 
https://issues.apache.org/jira/browse/HDFS-8747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649905#comment-14649905
 ] 

Xiaoyu Yao commented on HDFS-8747:
----------------------------------

bq. This is maybe viable for scratch, but not for trash. There can be many 
users on a cluster accessing a variety of EZs, such that it's unmanageable for 
the super-user to set up all the Trash folders beforehand.

Three solutions are discussed in the "Design->Soft Delete" section of the 
spec. My initial preference is "Option 1: Per User Trash Namespace", mainly 
for compatibility and simplicity. If pre-creating trash folders for many 
users is a concern, "Option 2: Global Trash Namespace", which is similar to 
the idea proposed in HADOOP-7310, can be used instead. However, it will not be 
compatible with the current Trash behavior, where users find their deleted 
files under /user/username/.Trash/Current/.... These solutions can be 
implemented as pluggable trash policies that an admin selects through 
configuration keys when the default one is not appropriate for their 
deployment.
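
For reference, the current per-user layout mentioned above can be sketched as 
follows. This is a minimal Python illustration rather than Hadoop code: the 
function name trash_location and the home_root parameter are hypothetical, and 
the sketch only shows how the original path is preserved under 
.Trash/Current.

```python
import posixpath


def trash_location(user, deleted_path, home_root="/user"):
    """Return where the per-user trash layout would place deleted_path.

    Hypothetical helper for illustration; it only models the path mapping.
    """
    current = posixpath.join(home_root, user, ".Trash", "Current")
    # The original absolute path is preserved beneath Current.
    return posixpath.join(current, deleted_path.lstrip("/"))


print(trash_location("alice", "/data/zone1/report.csv"))
```

Because the resulting location is under the user's homedir, a file deleted 
from inside an EZ would have to be renamed out of the zone, which is the case 
the options above try to address.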

bq. Another question, how would this work if a user's homedir is already an EZ? 
Do you plan to add support for nested encryption zones?

No, we don't plan to support nested encryption zones. If we take "Option 1", 
that case will not be supported. But if we take "Option 2", it will not be a 
problem, because the trash namespace for encryption zones will be separate 
from the user's homedir.


> Provide Better "Scratch Space" and "Soft Delete" Support for HDFS Encryption 
> Zones
> ----------------------------------------------------------------------------------
>
>                 Key: HDFS-8747
>                 URL: https://issues.apache.org/jira/browse/HDFS-8747
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: encryption
>    Affects Versions: 2.6.0
>            Reporter: Xiaoyu Yao
>            Assignee: Xiaoyu Yao
>         Attachments: HDFS-8747-07092015.pdf, HDFS-8747-07152015.pdf, 
> HDFS-8747-07292015.pdf
>
>
> HDFS Transparent Data Encryption At-Rest was introduced in Hadoop 2.6 to 
> allow creating an encryption zone on top of a single HDFS directory. Files 
> under the root directory of the encryption zone are encrypted/decrypted 
> transparently upon HDFS client write or read operations. 
> Generally, it does not support rename (without data copying) across 
> encryption zones or between an encryption zone and a non-encryption zone, 
> because of the different security settings of encryption zones. However, 
> there are certain use cases where efficient rename support is desired. This 
> JIRA proposes better support for two such use cases, “Scratch Space” (a.k.a. 
> staging area) and “Soft Delete” (a.k.a. trash), with HDFS encryption zones.
> “Scratch Space” is widely used in Hadoop jobs and requires efficient 
> rename support. Temporary files from MR jobs are usually stored in a staging 
> area outside the encryption zone, such as the “/tmp” directory, and then 
> renamed to the target directories once the data is ready to be further 
> processed. 
> Below is a summary of supported/unsupported cases in the latest Hadoop:
> * Rename within the encryption zone is supported
> * Rename the entire encryption zone by moving the root directory of the zone 
> is allowed.
> * Rename sub-directory/file from encryption zone to non-encryption zone is 
> not allowed.
> * Rename sub-directory/file from encryption zone A to encryption zone B is 
> not allowed.
> * Rename from non-encryption zone to encryption zone is not allowed.
> “Soft delete” (a.k.a. trash) is a client-side feature that 
> helps prevent accidental deletion of files and directories. If trash is 
> enabled and a file or directory is deleted using the Hadoop shell, the file 
> is moved to the .Trash directory of the user's home directory instead of 
> being deleted. Deleted files are initially moved (renamed) to the Current 
> sub-directory of the .Trash directory with the original path preserved. 
> Files and directories in the trash can be restored simply by moving them to a 
> location outside the .Trash directory.
> Due to the limited rename support, deleting a sub-directory/file within an 
> encryption zone with the trash feature enabled is not allowed. The client 
> has to use the -skipTrash option to work around this. HADOOP-10902 and 
> HDFS-6767 improved the error message but did not provide a complete solution 
> to the problem. 
> We propose to solve the problem by generalizing the mapping between 
> encryption zone and its underlying HDFS directories from 1:1 today to 1:N. 
> The encryption zone should allow non-overlapping directories such as scratch 
> space or soft delete "trash" locations to be added/removed dynamically after 
> creation. This way, rename for "scratch space" and "soft delete" can be 
> better supported without breaking the assumption that rename is only 
> supported "within the zone". 
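
The rename rules summarized in the description, and the proposed 1:N 
generalization, can be sketched with a small model. This is a hedged Python 
illustration, not the NameNode implementation: the names zone_of, 
is_zone_root and rename_allowed, the zone labels and all example paths are 
hypothetical. It assumes a rename between two non-encrypted locations is 
always allowed, and it does not model any destination checks when a whole 
zone root is moved.

```python
import posixpath


def _is_under(path, root):
    """True if path equals root or lies beneath it."""
    path = posixpath.normpath(path)
    root = posixpath.normpath(root)
    return path == root or path.startswith(root.rstrip("/") + "/")


def zone_of(path, zones):
    """Return the name of the zone whose directories contain path, or None."""
    for name, dirs in zones.items():
        if any(_is_under(path, d) for d in dirs):
            return name
    return None


def is_zone_root(path, zones):
    """True if path is exactly one of the directories registered for a zone."""
    path = posixpath.normpath(path)
    return any(path == posixpath.normpath(d)
               for dirs in zones.values() for d in dirs)


def rename_allowed(src, dst, zones):
    """Apply the rules from the summary list to one rename request."""
    if is_zone_root(src, zones):
        # Moving the whole zone by renaming its root directory is allowed.
        return True
    # Otherwise source and destination must belong to the same zone
    # (or both lie outside every zone, which is an assumption here).
    return zone_of(src, zones) == zone_of(dst, zones)


# Today: each zone maps to exactly one directory (1:1).
today = {"A": ["/zoneA"], "B": ["/zoneB"]}
print(rename_allowed("/zoneA/x", "/zoneA/y", today))   # within the zone
print(rename_allowed("/zoneA/x", "/tmp/x", today))     # zone to non-zone
print(rename_allowed("/tmp/x", "/zoneA/x", today))     # non-zone to zone

# Proposed: a zone may own several non-overlapping directories (1:N).
proposed = {"A": ["/zoneA", "/tmp/scratchA", "/user/alice/.Trash"]}
print(rename_allowed("/tmp/scratchA/part-0", "/zoneA/out/part-0", proposed))
```

Under the 1:1 mapping the scratch and trash renames are rejected because the 
two paths resolve to different zones. Once the scratch and trash directories 
are registered with the same zone, the same "within the zone" check accepts 
them without any change to the rule itself.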



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
