[
https://issues.apache.org/jira/browse/HDFS-8747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648086#comment-14648086
]
Andrew Wang commented on HDFS-8747:
-----------------------------------
Hi [~xyao], I like this idea overall, nice work. A few questions:
* Have you thought about simply allowing rename between EZs with the same
settings? This would be a much smaller and easier change with similar
properties. Your proposal I think is still better in terms of ease-of-use and
also ensuring security invariants around key rolling (if/when we implement
that).
* If we keep the APIs superuser-only, how does a normal user add their trash
folder to an EZ? Same for scratch folders, e.g. if the Hive user is not a
superuser.
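
To make the first question concrete, here is a minimal sketch of a "same settings" rename check. It is a toy model, not NameNode code: `ZoneSettings`, `rename_allowed`, and the choice of key name plus cipher suite as the compared fields are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ZoneSettings:
    """Hypothetical summary of an encryption zone's settings."""
    key_name: str
    cipher_suite: str


def rename_allowed(src: Optional[ZoneSettings],
                   dst: Optional[ZoneSettings]) -> bool:
    """Allow a rename only when both ends have identical settings.

    None stands for a path outside any encryption zone, so a rename
    between two unencrypted paths is allowed, while a rename between
    an encrypted and an unencrypted path is not.
    """
    return src == dst


# Example with made-up key names.
hive = ZoneSettings("hive-key", "AES/CTR/NoPadding")
scratch = ZoneSettings("hive-key", "AES/CTR/NoPadding")
other = ZoneSettings("other-key", "AES/CTR/NoPadding")
print(rename_allowed(scratch, hive))   # same settings
print(rename_allowed(other, hive))     # different key
```

This keeps the check local to the two zones involved, which is why it would be a smaller change, but it says nothing about what should happen once the keys of the two zones are rolled independently.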
> Provide Better "Scratch Space" and "Soft Delete" Support for HDFS Encryption
> Zones
> ----------------------------------------------------------------------------------
>
> Key: HDFS-8747
> URL: https://issues.apache.org/jira/browse/HDFS-8747
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: encryption
> Affects Versions: 2.6.0
> Reporter: Xiaoyu Yao
> Assignee: Xiaoyu Yao
> Attachments: HDFS-8747-07092015.pdf, HDFS-8747-07152015.pdf,
> HDFS-8747-07292015.pdf
>
>
> HDFS Transparent Data Encryption At-Rest was introduced in Hadoop 2.6 to
> allow an encryption zone to be created on top of a single HDFS directory.
> Files under the root directory of the encryption zone are encrypted and
> decrypted transparently on HDFS client write and read operations.
> Generally, it does not support rename (without data copying) across
> encryption zones, or between an encryption zone and a non-encryption zone,
> because of the different security settings of encryption zones. However,
> there are certain use cases where efficient rename support is desired. This
> JIRA proposes better support for two such use cases, “Scratch Space” (a.k.a.
> staging area) and “Soft Delete” (a.k.a. trash), with HDFS encryption zones.
> “Scratch Space” is widely used in Hadoop jobs and requires efficient rename
> support. Temporary files from MR jobs are usually stored in a staging area
> outside the encryption zone, such as the “/tmp” directory, and then renamed
> to the specified target directories once the data is ready to be further
> processed.
> Below is a summary of supported/unsupported cases in the latest Hadoop:
> * Rename within an encryption zone is supported.
> * Renaming the entire encryption zone by moving the root directory of the
> zone is allowed.
> * Renaming a sub-directory/file from an encryption zone to a non-encryption
> zone is not allowed.
> * Renaming a sub-directory/file from encryption zone A to encryption zone B
> is not allowed.
> * Rename from a non-encryption zone to an encryption zone is not allowed.
> “Soft Delete” (a.k.a. trash) is a client-side feature that helps prevent
> accidental deletion of files and directories. If trash is enabled and a file
> or directory is deleted using the Hadoop shell, the file is moved to the
> .Trash directory under the user's home directory instead of being deleted.
> Deleted files are initially moved (renamed) to the Current sub-directory of
> the .Trash directory, with the original path preserved. Files and
> directories in the trash can be restored simply by moving them to a location
> outside the .Trash directory.
> Due to the limited rename support, deleting a sub-directory/file within an
> encryption zone with the trash feature enabled is not allowed. The client
> has to use the -skipTrash option to work around this. HADOOP-10902 and
> HDFS-6767 improved the error message but did not provide a complete solution
> to the problem.
> We propose to solve the problem by generalizing the mapping between an
> encryption zone and its underlying HDFS directories from 1:1 today to 1:N.
> The encryption zone should allow non-overlapping directories, such as
> scratch space or soft delete "trash" locations, to be added/removed
> dynamically after creation. This way, rename for "scratch space" and "soft
> delete" can be better supported without breaking the assumption that rename
> is only supported "within the zone".
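
The 1:N mapping proposed in the description can be pictured with a small toy model. This is only a sketch under stated assumptions: `ZoneRegistry` and its methods are hypothetical names, the paths are made up, and the model ignores the existing special case of moving an entire zone by renaming its root.

```python
from typing import Dict, Optional


class ZoneRegistry:
    """Toy model: one zone may own several non-overlapping root directories."""

    def __init__(self) -> None:
        self._roots: Dict[str, str] = {}  # root directory -> zone name

    @staticmethod
    def _is_under(path: str, root: str) -> bool:
        # True when path is the root itself or lies somewhere below it.
        return path == root or path.startswith(root.rstrip("/") + "/")

    def add_root(self, zone: str, root: str) -> None:
        # Reject a root that is nested in, or contains, any existing root.
        for existing in self._roots:
            if self._is_under(root, existing) or self._is_under(existing, root):
                raise ValueError(f"{root} overlaps with {existing}")
        self._roots[root] = zone

    def remove_root(self, root: str) -> None:
        del self._roots[root]

    def zone_of(self, path: str) -> Optional[str]:
        # Roots never overlap, so at most one root can match.
        for root, zone in self._roots.items():
            if self._is_under(path, root):
                return zone
        return None

    def can_rename(self, src: str, dst: str) -> bool:
        # Rename stays "within the zone": both ends map to the same zone
        # (or both lie outside every zone).
        return self.zone_of(src) == self.zone_of(dst)


reg = ZoneRegistry()
reg.add_root("warehouse", "/data/warehouse")
reg.add_root("warehouse", "/tmp/hive-staging")   # scratch space joins the zone
print(reg.can_rename("/tmp/hive-staging/part-0", "/data/warehouse/t/part-0"))
print(reg.can_rename("/tmp/other/part-0", "/data/warehouse/t/part-0"))
```

In this model the rename rule itself does not change; only the lookup from a path to its zone does, which is what lets a scratch or trash directory take part in renames once it has been added to the zone.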