[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

Alejandro Abdelnur (JIRA) Wed, 18 Jun 2014 11:01:02 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14036044#comment-14036044
 ]


Alejandro Abdelnur commented on HDFS-6134:
------------------------------------------

[~yoderme] cornered me and pointed out that, because we are using AES-CTR, we 
must be extremely careful never to repeat an IV under a given encryption key. 
He then explained how the implementation we are currently working on could 
run into that scenario:

* 1. All files in an encryption zone using the same keyVersion material share 
the same encryption key.
* 2. All files in #1 have different IVs.
* 3. In AES-CTR, the 8 lower bytes of the IV are treated as a counter that is 
incremented every AES block (16 bytes). 
* 4. #3 ensures an IV is not repeated within a file (the biggest file, 
Long.MAX_VALUE bytes, spans 2^59 AES blocks, i.e. 1/32 of the 2^64 IV counter 
domain).
* 5. IVs are public, and predictable based on the initial IV and the file 
offset.
* 6. Because of #5, a possible attack would be to scan #1 files for IVs where 
the 8 higher bytes match, then fast-forward them to a common counter point 
(assuming the files are long enough). At that point you have more than one 
ciphertext produced with the same encryption key and the same IV. The chance 
of this for any given pair of files is 1/2^64, but in cryptographic terms 
that is considered a high chance.
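
To make points 3 to 6 concrete, here is a minimal Java sketch, not the HDFS 
implementation. It assumes the low 8 bytes of the IV are a big-endian block 
counter, as described in #3. The class and method names (CtrIvReuseSketch, 
ivAtOffset, demoLeaksPlaintextXor) and the all-zero demo key are made up for 
illustration. Two "files" share a key and the 8 higher IV bytes; file A is 
fast-forwarded to the counter value where file B starts, and the XOR of the 
two ciphertext blocks then equals the XOR of the two plaintext blocks.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class CtrIvReuseSketch {
    static final int AES_BLOCK = 16;

    // IV in effect at a byte offset: the low 8 bytes (big-endian counter)
    // advance by one per 16-byte AES block; the high 8 bytes never change.
    static byte[] ivAtOffset(byte[] initialIv, long offset) {
        ByteBuffer buf = ByteBuffer.wrap(initialIv.clone());
        buf.putLong(8, buf.getLong(8) + offset / AES_BLOCK);
        return buf.array();
    }

    static byte[] encrypt(byte[] key, byte[] iv, byte[] plain) {
        try {
            Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
            c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                   new IvParameterSpec(iv));
            return c.doFinal(plain);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static byte[] xor(byte[] a, int aOff, byte[] b, int bOff, int len) {
        byte[] out = new byte[len];
        for (int i = 0; i < len; i++) {
            out[i] = (byte) (a[aOff + i] ^ b[bOff + i]);
        }
        return out;
    }

    // Returns true when the overlapping blocks leak plainA XOR plainB.
    static boolean demoLeaksPlaintextXor() {
        byte[] key = new byte[16];            // demo key shared by both files
        byte[] ivA = new byte[16];
        ivA[0] = 0x42;                        // high bytes; counter starts at 0
        byte[] ivB = ivA.clone();
        ivB[15] = 4;                          // same high bytes; counter starts at 4

        byte[] plainA = new byte[80];
        Arrays.fill(plainA, (byte) 'a');
        byte[] plainB = new byte[16];
        Arrays.fill(plainB, (byte) 'b');

        byte[] ctA = encrypt(key, ivA, plainA);
        byte[] ctB = encrypt(key, ivB, plainB);

        // Offset 64 in file A is block 4, which uses the IV file B starts with.
        boolean sameIv = Arrays.equals(ivAtOffset(ivA, 64), ivB);
        byte[] ctXor = xor(ctA, 64, ctB, 0, AES_BLOCK);
        byte[] ptXor = xor(plainA, 64, plainB, 0, AES_BLOCK);
        return sameIv && Arrays.equals(ctXor, ptXor);
    }

    public static void main(String[] args) {
        System.out.println("keystream reused: " + demoLeaksPlaintextXor());
    }
}
```

The attacker never needs the key here: the IVs are public, and once two 
blocks share a key and an IV the keystream cancels out of the XOR.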

A known solution to address this is:

* A. Each file should use a unique data encryption key (DEK).
* B. The unique DEK is encrypted with the EZ keyVersion and stored as one of 
the file's extended attributes (xattrs).
* C. The unique DEK is generated by the KeyProvider and encrypted before 
leaving the KeyProvider. The NN never sees the DEK decrypted.
* D. The NN gives the HDFS client the encrypted DEK and the keyVersion ID.
* E. The HDFS client sends the encrypted DEK and the keyVersion ID to the 
KeyProvider and gets (if authorized to use the keyVersion) the decrypted DEK 
for the file.
* F. The HDFS client uses the DEK to encrypt/decrypt the file.
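
Steps A to F can be sketched roughly as follows. This is a toy, in-memory 
illustration and not the Hadoop KeyProvider API: ToyKeyProvider, 
createKeyVersion, the "edek" xattr name and the use of plain byte[] values 
are all assumptions, and the authorization check from step E is omitted. 
Using AES-CTR to encrypt the DEK is likewise just a choice made for this 
sketch.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class PerFileDekSketch {

    /** Hypothetical stand-in for the KeyProvider; EZ key material stays inside. */
    static final class ToyKeyProvider {
        private final Map<String, byte[]> ezKeys = new HashMap<>();
        private final SecureRandom random = new SecureRandom();

        void createKeyVersion(String keyVersionName) {
            byte[] ezKey = new byte[16];
            random.nextBytes(ezKey);
            ezKeys.put(keyVersionName, ezKey);
        }

        // Steps A-C: create a fresh DEK and return it only in encrypted form.
        byte[] generateEncryptedKey(String keyVersionName, byte[] iv) {
            byte[] dek = new byte[16];
            random.nextBytes(dek);
            return aesCtr(Cipher.ENCRYPT_MODE, ezKey(keyVersionName), iv, dek);
        }

        // Step E: decrypt the DEK for a client (authorization check omitted).
        byte[] decryptEncryptedKey(String keyVersionName, byte[] iv,
                                   byte[] encryptedKey) {
            return aesCtr(Cipher.DECRYPT_MODE, ezKey(keyVersionName), iv,
                          encryptedKey);
        }

        private byte[] ezKey(String keyVersionName) {
            byte[] k = ezKeys.get(keyVersionName);
            if (k == null) {
                throw new IllegalArgumentException("unknown key version: "
                    + keyVersionName);
            }
            return k;
        }
    }

    static byte[] aesCtr(int mode, byte[] key, byte[] iv, byte[] data) {
        try {
            Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
            c.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            return c.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Walks one file through steps A-F and checks the data round-trips.
    static boolean roundTrip() {
        ToyKeyProvider kp = new ToyKeyProvider();
        kp.createKeyVersion("ezkey@0");
        SecureRandom random = new SecureRandom();
        byte[] fileIv = new byte[16];
        random.nextBytes(fileIv);
        byte[] wrapIv = new byte[16];   // IV used to encrypt the DEK itself
        random.nextBytes(wrapIv);

        // Step B: the NN stores only the encrypted DEK, as a file xattr.
        Map<String, byte[]> xattrs = new HashMap<>();
        xattrs.put("edek", kp.generateEncryptedKey("ezkey@0", wrapIv));

        // Steps D-E: the client receives the encrypted DEK and has it decrypted.
        byte[] dek = kp.decryptEncryptedKey("ezkey@0", wrapIv, xattrs.get("edek"));

        // Step F: the client encrypts and decrypts file data with the DEK.
        byte[] plain = "hello encryption zone".getBytes(StandardCharsets.UTF_8);
        byte[] cipherText = aesCtr(Cipher.ENCRYPT_MODE, dek, fileIv, plain);
        byte[] decrypted = aesCtr(Cipher.DECRYPT_MODE, dek, fileIv, cipherText);
        return Arrays.equals(plain, decrypted) && !Arrays.equals(plain, cipherText);
    }

    public static void main(String[] args) {
        System.out.println("round trip ok: " + roundTrip());
    }
}
```

Because every call to generateEncryptedKey draws a fresh random DEK, two 
files no longer share a data key even when they share an EZ key version.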

This solution requires the KeyProvider to have 2 new methods:

* {{KeyVersion generateEncryptedKey(String keyVersionName, byte[] iv)}}
* {{KeyVersion decryptEncryptedKey(String keyVersionName, byte[] iv, KeyVersion 
encryptedKey)}}

Since the IV would be the file IV, we don't have to store a new IV just for 
this. The implementation would apply a known transformation to the IV (e.g., 
XOR the original IV with 0xff).
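
A minimal sketch of such a transformation, assuming "xor with 0xff" means 
XORing every byte of the file IV with 0xff (the comment does not spell this 
out). EdekIvSketch and deriveEdekIv are hypothetical names.

```java
final class EdekIvSketch {
    // Derives the IV used to encrypt the DEK by flipping every bit of the
    // file IV. The result always differs from the file IV, and applying
    // the transformation twice returns the original.
    static byte[] deriveEdekIv(byte[] fileIv) {
        byte[] out = new byte[fileIv.length];
        for (int i = 0; i < fileIv.length; i++) {
            out[i] = (byte) (fileIv[i] ^ 0xff);
        }
        return out;
    }
}
```

Since the derived IV can be recomputed from the file IV, only the file IV 
needs to be stored alongside the encrypted DEK.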

The key materials (EZ key materials) used to encrypt the per-file encryption 
keys never leave the KeyProvider, and they are not known to HDFS clients. This 
means that a compromised file encryption key (DEK) compromises only that file, 
not all the files in an EZ using the same key version. A side effect of this 
change is therefore a more secure solution.


> Transparent data at rest encryption
> -----------------------------------
>
>                 Key: HDFS-6134
>                 URL: https://issues.apache.org/jira/browse/HDFS-6134
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: security
>    Affects Versions: 2.3.0
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>         Attachments: HDFSDataAtRestEncryption.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the healthcare industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
