[
https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yi Liu updated HADOOP-10150:
----------------------------
Attachment: cfs.patch
extended information based on INode feature.patch
HADOOP cryptographic file system-V2.docx
This update includes two patches.
Two new properties, “fs.encryption” and “fs.encryption.dirs”, are added to
core-site.xml. If “fs.encryption” is true, the file system is encrypted;
“fs.encryption.dirs” lists the directories that are configured to be
encrypted. The URL (fs.defaultFS) in core-site.xml is not modified, so CFS
stays transparent to upper layer applications.
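For illustration, the two properties could be set in core-site.xml as below; the directory paths are made up for the example, only the property names come from the description above:

```xml
<configuration>
  <!-- Enable the cryptographic file system. -->
  <property>
    <name>fs.encryption</name>
    <value>true</value>
  </property>
  <!-- Comma-separated directories to be encrypted (illustrative paths). -->
  <property>
    <name>fs.encryption.dirs</name>
    <value>/user/alice/secure,/data/confidential</value>
  </property>
</configuration>
```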
Each encrypted file has its own IV, and each configured encryption directory
has a data key. HDFS-2006 was expected to be used to store the IV and data
key, but it is not ready yet, so we implemented extended information based on
the INode feature and use it to store the data key and IV. In our case only
the directories and files configured for encryption need this feature: if
1,000,000 files are encrypted, about 8MB of memory is required. This
information is kept in the NameNode's memory, serialized to the edit log, and
finally written to the FSImage.
For key management we use the key provider API from HADOOP-10141. For key
rotation, the data key is decrypted using the original master key and then
re-encrypted using the new master key.
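The rotation step can be sketched as follows. This is a minimal illustration, not the patch code: a SHA-256-based XOR keystream stands in for the real cipher, and all function names are hypothetical. The point is that only the wrapped (encrypted) data key changes; the raw data key, and therefore the file data, stays the same.

```python
import hashlib
import os

def keystream_cipher(key: bytes, data: bytes) -> bytes:
    """Toy XOR keystream cipher (illustrative stand-in for AES).
    Symmetric: applying it twice with the same key recovers the input."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(b ^ k for b, k in zip(data, out))

def rotate_data_key(wrapped_key: bytes, old_master: bytes,
                    new_master: bytes) -> bytes:
    """Key rotation as described above: decrypt the data key with the
    original master key, then re-encrypt it with the new master key."""
    data_key = keystream_cipher(old_master, wrapped_key)  # unwrap
    return keystream_cipher(new_master, data_key)         # re-wrap

# Usage: wrap a data key, rotate masters, verify unwrapping still works.
data_key = os.urandom(16)
old_master, new_master = os.urandom(32), os.urandom(32)
wrapped = keystream_cipher(old_master, data_key)
rewrapped = rotate_data_key(wrapped, old_master, new_master)
assert keystream_cipher(new_master, rewrapped) == data_key
```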
For more information, please refer to the updated design doc.
The first patch adds the “extended information” INode feature used to save
the IV and data key; the second is the CFS patch itself. I'm splitting these
patches into sub-JIRAs.
> Hadoop cryptographic file system
> --------------------------------
>
> Key: HADOOP-10150
> URL: https://issues.apache.org/jira/browse/HADOOP-10150
> Project: Hadoop Common
> Issue Type: New Feature
> Components: security
> Affects Versions: 3.0.0
> Reporter: Yi Liu
> Assignee: Yi Liu
> Labels: rhino
> Fix For: 3.0.0
>
> Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file
> system-V2.docx, HADOOP cryptographic file system.pdf, cfs.patch, extended
> information based on INode feature.patch
>
>
> There is an increasing need for securing data when Hadoop customers use
> various upper layer applications, such as Map-Reduce, Hive, Pig, HBase and so
> on.
> HADOOP CFS (Hadoop Cryptographic File System) secures data by using
> Hadoop's “FilterFileSystem” to decorate DFS or other file systems, and is
> transparent to upper layer applications. It is configurable, scalable and
> fast.
> High level requirements:
> 1. Transparent to upper layer applications; no modification of them is
> required.
> 2. “Seek” and “PositionedReadable” are supported for the CFS input stream
> if the wrapped file system supports them.
> 3. Very high encryption and decryption performance, so that they do not
> become a bottleneck.
> 4. Can decorate HDFS and all other Hadoop file systems without modifying
> the existing file system structure, such as the namenode and datanode
> structure when the wrapped file system is HDFS.
> 5. Admins can configure encryption policies, such as which directories
> will be encrypted.
> 6. A robust key management framework.
> 7. Pread and append operations are supported if the wrapped file system
> supports them.
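Requirements 2 and 7 (seek and pread on encrypted streams) are achievable when the keystream is addressable by byte position, so any byte range can be decrypted independently. A minimal sketch of the decoration idea, with a hash-based XOR keystream standing in for a real CTR-mode cipher and all names hypothetical (not the actual CFS API):

```python
import hashlib

def keystream(key: bytes, iv: bytes, offset: int, length: int) -> bytes:
    """Keystream bytes for an arbitrary range. Each byte depends only on
    (key, iv, position), so seek/pread need no sequential decryption.
    Toy stand-in for AES-CTR, for illustration only."""
    out = bytearray()
    block = offset // 32      # SHA-256 yields 32-byte blocks
    skip = offset % 32
    while len(out) < skip + length:
        out.extend(hashlib.sha256(key + iv + block.to_bytes(8, "big")).digest())
        block += 1
    return bytes(out[skip:skip + length])

class CryptoFS:
    """Decoration sketch: wrap an inner "file system" (here just a dict of
    path -> ciphertext) and encrypt/decrypt transparently, with a per-file
    IV and one data key, mirroring the design above."""
    def __init__(self, inner: dict, data_key: bytes):
        self.inner, self.data_key, self.ivs = inner, data_key, {}

    def write(self, path: str, data: bytes) -> None:
        iv = hashlib.sha256(path.encode()).digest()[:16]  # toy per-file IV
        self.ivs[path] = iv
        ks = keystream(self.data_key, iv, 0, len(data))
        self.inner[path] = bytes(b ^ k for b, k in zip(data, ks))

    def pread(self, path: str, offset: int, length: int) -> bytes:
        """Positioned read: decrypt only the requested byte range."""
        chunk = self.inner[path][offset:offset + length]
        ks = keystream(self.data_key, self.ivs[path], offset, len(chunk))
        return bytes(b ^ k for b, k in zip(chunk, ks))
```

A quick use: writing `b"hello world"` to a path stores only ciphertext in the inner store, while `pread(path, 6, 5)` returns `b"world"` without decrypting the prefix.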
--
This message was sent by Atlassian JIRA
(v6.2#6252)