[ https://issues.apache.org/jira/browse/HDFS-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14087313#comment-14087313 ]
Mike Yoder commented on HDFS-6394:
----------------------------------
Very nice.
This is a bit of a pet peeve, but under Use Cases we need more than "the
regulators made us do it". :-) I'd mention that HDFS encryption prevents most
attacks at the OS layer and below, and that strong encryption combined with key
management is an industry best practice. (THEN you can mention the regulators.)
By "OS layer and below" I mean the following. There are several different
layers in the (traditional) stack where encryption can be applied:
* Application - Most difficult, most secure. Encryption can be finely
controlled and policy decisions informed with the most information. But
writing applications to do this is hard, and impossible for the customers of
existing applications.
* Database - Not a bad choice, offered by all database vendors, but has
performance problems; you can't encrypt an index.
* File System - Can be simple to deploy, performant, and transparent to layers
above; the tradeoff is that some of the context of higher layers (like the user
of the application or database) is lost.
* Disk - Easiest and operates at disk speed, but only really protects against
physical theft.
Security improves as you move up the stack, but deployment gets easier as you
move down it. And encryption at any given layer makes the data unreadable at
every layer below it.
Hadoop adds an "HDFS" layer above the traditional file system layer. This is
near-optimal because we have lots of context in which to make policy decisions,
but we can still provide transparent encryption to applications. Due to its
place in the stack, attacks at the OS layer (file system and disk, that is) are
prevented because an attacker at those layers sees only encrypted data.
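To make the "unreadable at every layer below" rule concrete, here is a minimal sketch of the stack as an ordered list. The layer names come from the list above; placing "hdfs" between the database and the local file system is my reading of this comment, and the function names are made up for illustration.

```python
# Layers ordered top to bottom. The position of "hdfs" is an assumption
# based on "above the traditional file system layer".
STACK = ["application", "database", "hdfs", "file system", "disk"]


def layers_seeing_plaintext(encrypt_at):
    """Return the encrypting layer and every layer above it."""
    index = STACK.index(encrypt_at)  # raises ValueError for an unknown layer
    return STACK[:index + 1]


def layers_seeing_ciphertext(encrypt_at):
    """Return every layer strictly below the encrypting layer."""
    index = STACK.index(encrypt_at)
    return STACK[index + 1:]
```

With encryption applied at "hdfs", `layers_seeing_ciphertext("hdfs")` yields the file system and disk layers, which is the "OS layer and below" case. With encryption at "disk" the list is empty, matching the point that disk encryption only helps against physical theft.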
----
Anyway...
Under architecture, I think you may want to first describe encryption zones,
then mention the KMS, then describe the process by which a client gets a DEK.
Then go into the KMS in detail. Going into the KMS and EEKs first is a little
confusing without knowing what they're for.
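A rough sketch of the flow in that suggested order might help readers: a zone key lives in the KMS, each file gets its own DEK, and only the encrypted form (the EEK) is stored with the file. I am reading DEK as "data encryption key" and EEK as its encrypted form. Everything here is hypothetical: `ToyKMS`, its methods, and `toy_cipher` are not Hadoop APIs, and `toy_cipher` is an HMAC-based XOR stand-in for a real cipher that must not be used for actual encryption.

```python
import hashlib
import hmac
import os


def toy_cipher(key, iv, data):
    """XOR data with an HMAC-SHA256 keystream (stand-in only, NOT secure).

    XOR is symmetric, so the same call encrypts and decrypts.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        block = hmac.new(key, iv + counter.to_bytes(8, "big"), hashlib.sha256)
        stream += block.digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class ToyKMS:
    """Hypothetical key server: zone keys never leave this object."""

    def __init__(self):
        self._zone_keys = {}

    def create_zone_key(self, name):
        self._zone_keys[name] = os.urandom(32)

    def generate_eek(self, zone_key_name):
        """Create a fresh DEK and return only its encrypted form (the EEK)."""
        dek = os.urandom(32)
        iv = os.urandom(16)
        wrapped = toy_cipher(self._zone_keys[zone_key_name], iv, dek)
        return {"key_name": zone_key_name, "iv": iv, "eek": wrapped}

    def decrypt_eek(self, eek):
        """Unwrap an EEK into the DEK for an authorized client."""
        zone_key = self._zone_keys[eek["key_name"]]
        return toy_cipher(zone_key, eek["iv"], eek["eek"])


def write_and_read(kms, zone_key_name, plaintext):
    """Walk the flow: get an EEK, unwrap it to a DEK, then use the DEK."""
    eek = kms.generate_eek(zone_key_name)  # would be stored with file metadata
    dek = kms.decrypt_eek(eek)             # client asks the KMS to unwrap it
    file_iv = os.urandom(16)
    ciphertext = toy_cipher(dek, file_iv, plaintext)
    return ciphertext, toy_cipher(dek, file_iv, ciphertext)
```

The point of the indirection is that whatever stores the EEK never holds a usable key, and the zone key never leaves the KMS.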
You probably also want to mention some of the details around the cipher suite
and what we do with IVs.
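As an example of the kind of IV detail worth documenting: this comment does not name the cipher suite, so the sketch below simply assumes a counter-mode cipher with a 16-byte block (as in AES-CTR), where the IV for an arbitrary file offset is the file's initial IV plus the block index. The function names are hypothetical, and the real implementation may differ.

```python
BLOCK_SIZE = 16  # bytes; assumes a 128-bit block cipher in counter mode


def iv_for_offset(initial_iv, offset):
    """Return the counter block to use when starting at a byte offset.

    The initial IV is treated as a 128-bit big-endian integer and the
    block index is added to it, wrapping around on overflow.
    """
    if len(initial_iv) != BLOCK_SIZE:
        raise ValueError("initial IV must be exactly 16 bytes")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    block_index = offset // BLOCK_SIZE
    value = (int.from_bytes(initial_iv, "big") + block_index) % (1 << 128)
    return value.to_bytes(BLOCK_SIZE, "big")


def bytes_to_skip(offset):
    """Return how many keystream bytes to discard inside the first block."""
    return offset % BLOCK_SIZE
```

This property is what lets a reader seek to any position and decrypt from there without processing the bytes that come before it.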
> HDFS encryption documentation
> -----------------------------
>
> Key: HDFS-6394
> URL: https://issues.apache.org/jira/browse/HDFS-6394
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: namenode, security
> Reporter: Alejandro Abdelnur
> Assignee: Andrew Wang
> Fix For: fs-encryption (HADOOP-10150 and HDFS-6134)
>
> Attachments: hdfs-6394.001.patch, hdfs-6394.002.patch,
> hdfs-6394.003.patch, hdfs-6394.004.patch, hdfs-6394.005.patch
>
>
> Documentation for HDFS encryption behavior and configuration