[
https://issues.apache.org/jira/browse/HDFS-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yi Liu updated HDFS-5143:
-------------------------
Attachment: CryptographicFileSystem.patch
This patch is an initial version (it still needs refinement) of the
implementation of the cryptographic file system, aligned with the design doc
discussed in this JIRA:
1) Basic functionality of the cryptographic file system, including
transparent read/write of data to HDFS (currently only HDFS has been tested)
through the FileSystem API, transparent use of the cryptographic file system
by upper-layer applications (MapReduce has been tested), support for hdfs
commands (ls, du, etc.), and so on.
2) A different IV is used for each encrypted file to enhance security. The
length of the IV is fixed at 16 bytes, and the IV is stored at the beginning
of the encrypted file (see the sketch right after this list).
3) The patch defines a crypto policy interface, so developers/users can
implement their own crypto policy to decide how and which files/directories
will be encrypted (a hypothetical sketch of such an interface follows at the
end of this comment). By default, a simple crypto policy is provided: the
admin can configure the encrypted directory list and encrypted file list,
each encrypted directory has a different encryption key, and files stored
into such a directory are automatically encrypted.
4) For key management, the patch defines a key management protocol
interface with a default implementation, and users/developers can provide
their own. The patch includes a simple key management server that uses a
Java keystore to store keys; this server is still under development.
5) The patch includes a Maven project, hadoop-crypto, which uses OpenSSL to
implement the Cipher. It is much faster than the Java cipher, especially
when AES-NI is enabled.
6) The patch also includes Encryptor/Decryptor interfaces and other
encryption facilities, such as buffered EncryptorStream and DecryptorStream.
7) fs.default.name is “cfs://hdfs@hostname:9000” when the cryptographic
file system is used on HDFS, and additionally “cfs-site.xml” needs to be
configured.
This is an all-in-one patch; later I will create several sub-JIRAs and
split this patch up for ease of code review. I will stabilize the patch and
extend its functionality in further steps.
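For reviewers who want a feel for the policy plug-in point in 3) before the
sub-JIRAs are filed, the following is a hypothetical sketch of what such a
crypto policy interface could look like; the method names and signatures are
my assumptions for illustration, not the interface actually defined in the
patch:

    // Hypothetical shape of a pluggable crypto policy: given a path, decide
    // whether it should be encrypted and which key name to request from the
    // key management server.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;

    public interface CryptoPolicy {
      /** Load the admin-configured encrypted directory/file lists,
       *  e.g. from cfs-site.xml. */
      void initialize(Configuration conf);

      /** True if data written under this path should be transparently
       *  encrypted. */
      boolean isEncrypted(Path path);

      /** Name of the key to fetch from the key management server for this
       *  path, e.g. one key per encrypted directory. */
      String getKeyName(Path path);
    }

With the default policy described in 3) and 4), the admin lists the
encrypted directories/files, each encrypted directory gets its own key from
the key management server, and clients only need to point fs.default.name at
the cfs:// URI as in 7).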
> Hadoop cryptographic file system
> --------------------------------
>
> Key: HDFS-5143
> URL: https://issues.apache.org/jira/browse/HDFS-5143
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: security
> Affects Versions: 3.0.0
> Reporter: Yi Liu
> Assignee: Owen O'Malley
> Labels: rhino
> Fix For: 3.0.0
>
> Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file
> system.pdf
>
>
> There is an increasing need for securing data when Hadoop customers use
> various upper layer applications, such as Map-Reduce, Hive, Pig, HBase and so
> on.
> HADOOP CFS (HADOOP Cryptographic File System) is used to secure data; it is
> based on HADOOP “FilterFileSystem” decorating DFS or other file systems, and
> is transparent to upper layer applications. It’s configurable, scalable and
> fast.
> High level requirements:
> 1. Transparent to and no modification required for upper layer
> applications.
> 2. “Seek”, “PositionedReadable” are supported for input stream of CFS if
> the wrapped file system supports them.
> 3. Very high performance for encryption and decryption, so that they will
> not become a bottleneck.
> 4. Can decorate HDFS and all other file systems in Hadoop without
> modifying the existing structure of the file system, such as the namenode
> and datanode structure when the wrapped file system is HDFS.
> 5. Admin can configure encryption policies, such as which directory will
> be encrypted.
> 6. A robust key management framework.
> 7. Support Pread and append operations if the wrapped file system supports
> them.