[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093603#comment-14093603
 ] 

Andrew Wang commented on HDFS-6134:
-----------------------------------

Hey Sanjay, thanks for the review.

Regarding HAR, could you lay out the use case you have in mind? When a user 
creates a HAR, they need access to all of the input files (encrypted or 
unencrypted). If they write the HAR within an EZ, it will be encrypted; 
otherwise it will be unencrypted. This behavior seems reasonable to me.
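
To make the HAR rule above concrete, here is a minimal sketch in Python. It is only a model of the behavior described, not HDFS code: the function names and the list of zone roots are hypothetical, and in a real cluster the NameNode resolves encryption zones. The point is that only the destination path matters, not whether the inputs were encrypted.

```python
from posixpath import normpath


def in_encryption_zone(path, zone_roots):
    """Return True if `path` is at or under one of the zone roots.

    `zone_roots` is a hypothetical list of EZ root directories; a real
    cluster tracks these on the NameNode.
    """
    p = normpath(path)
    for root in zone_roots:
        r = normpath(root)
        # Match the root itself or any descendant, but not a sibling
        # that merely shares a name prefix (e.g. /secure vs /secure2).
        if p == r or p.startswith(r.rstrip("/") + "/"):
            return True
    return False


def har_is_encrypted(dest_path, zone_roots):
    """The archive is encrypted iff it is written inside an EZ.

    Whether the input files were encrypted plays no role here; the
    user only needs read access to them.
    """
    return in_encryption_zone(dest_path, zone_roots)


zones = ["/secure"]
print(har_is_encrypted("/secure/logs.har", zones))
print(har_is_encrypted("/public/logs.har", zones))
```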

Regarding webhdfs, it's not a recommended deployment. I'm also going to 
document this in HDFS-6824. It requires giving the DNs (and thus the HDFS 
superuser) access to EZ keys, which is not particularly secure. There is HTTPS 
transport via swebhdfs, but that doesn't fix the key access issue. The 
recommended access method is instead HttpFS, which runs as a non-superuser. So 
yes, distcp will work too. This will definitely be covered during our testing.
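
For illustration, HttpFS serves the same /webhdfs/v1 REST path as webhdfs, so a client mostly just changes the endpoint it talks to. The sketch below builds an OPEN URL. The hostname, file path, and user are made up, 14000 is the usual HttpFS default port, and the user.name parameter only applies to pseudo authentication, so treat all of these as assumptions.

```python
from urllib.parse import quote, urlencode


def rest_open_url(host, port, path, user):
    """Build a WebHDFS-style OPEN URL for the given endpoint.

    The same URL shape works against a NameNode (webhdfs) or an HttpFS
    gateway; only `host` and `port` differ.
    """
    query = urlencode({"op": "OPEN", "user.name": user})
    return "http://{}:{}/webhdfs/v1{}?{}".format(host, port, quote(path), query)


# Hypothetical HttpFS gateway and file inside an encryption zone.
print(rest_open_url("httpfs.example.com", 14000, "/ez/data.csv", "alice"))
```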

Regarding scalability, you can put the KMS behind a load balancer, which should 
make scalability a non-issue. Tucu can comment on this better than I can, since 
he's done some KMS benchmarking, but I think a single instance should be able 
to handle O(1000s) of req/s.
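
As a back-of-the-envelope sketch of the load-balancer sizing, the arithmetic looks like this. Every number here is an assumption: the 1000 req/s per-instance figure is just a stand-in for "O(1000s)", not a benchmark result, and the headroom fraction and function name are hypothetical.

```python
import math


def kms_instances_needed(peak_req_per_s, per_instance_req_per_s, headroom=0.5):
    """Estimate how many KMS instances to place behind a load balancer.

    `headroom` is the fraction of each instance's capacity kept in
    reserve (0.5 means each instance is planned at half load).
    """
    if not 0.0 <= headroom < 1.0:
        raise ValueError("headroom must be in [0, 1)")
    usable = per_instance_req_per_s * (1.0 - headroom)
    # Always provision at least one instance, even for tiny loads.
    return max(1, math.ceil(peak_req_per_s / usable))


# Assumed peak of 5000 req/s against an assumed 1000 req/s per instance.
print(kms_instances_needed(5000, 1000))
```

With these assumed inputs, adding capacity is just a matter of adding instances behind the balancer, which is why scalability should not be a concern in practice.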

> Transparent data at rest encryption
> -----------------------------------
>
>                 Key: HDFS-6134
>                 URL: https://issues.apache.org/jira/browse/HDFS-6134
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: security
>    Affects Versions: 3.0.0, 2.3.0
>            Reporter: Alejandro Abdelnur
>            Assignee: Charles Lamb
>         Attachments: HDFS-6134.001.patch, HDFS-6134.002.patch, 
> HDFS-6134_test_plan.pdf, HDFSDataatRestEncryption.pdf, 
> HDFSDataatRestEncryptionProposal_obsolete.pdf, 
> HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the healthcare industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
