[ 
https://issues.apache.org/jira/browse/HDFS-12907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16282868#comment-16282868
 ] 

Daryn Sharp commented on HDFS-12907:
------------------------------------

bq. My understanding is after HDFS-12355 \[...\] , Is this remotely correct?

No, the goal is that all encryption/decryption is done on the client.  The DN will 
not be given KMS tokens.  Ever.  It will not talk to the KMS.  Ever.  The DN 
will never encrypt/decrypt.

<rage level="furious">
The KMS client completely breaks all the ugi-semantics to enable DNs to do the 
encrypt/decrypt.  How?  Why?  First off, the KMS client morphs based on the 
caller's ugi context.  Clients are expected to always be who they were when 
created.  Imagine if an IPC client was user1, disconnected, and magically 
became user2 just because it was in another context.  That's the KMS client.

It gets worse. If the current ugi is a proxy user, the KMS client will try to 
authenticate as the real user.  That's fine when the ugi real user is the 
login user of a service (ex. oozie).  But if there are _no credentials_, ex. 
proxy ugi from a token, the client willfully decides to use the login user's 
credentials!  And for good measure, let's proxy as the effective ugi!  That's 
super bizarre.

So let's put it together with the DN.  I use webhdfs as "daryn (token)", but 
the DN connects to the KMS as "daryn via dn (kerberos)".  Or I submit a job 
with oozie, so I'm "daryn via oozie (token)" to the DN but it connects to the 
KMS as "daryn via dn (kerberos)".  Wow, that would never work, right?  It would 
if you told users to make all their DNs be proxy users on the KMS!  And since 
most people map their DNs to the hdfs superuser, which is a really bad idea, 
you have now given admins the ability to decrypt any file.
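
The identity mapping described above can be sketched as a toy model.  This is 
only an illustration of the behavior as described in this comment, not the 
actual KMS client code: {{Caller}}, {{KmsIdentityModel}} and 
{{identitySeenByKms}} are hypothetical names, and the "has Kerberos 
credentials" flag is a simplifying assumption.

```java
/** Toy stand-in for a caller identity; this is NOT Hadoop's UserGroupInformation. */
final class Caller {
    final String effectiveUser;      // user the request acts as, e.g. "daryn"
    final String realUser;           // proxying service, e.g. "oozie"; null when not a proxy user
    final boolean hasKerberosCreds;  // assumption: false when the caller was built from a token only

    Caller(String effectiveUser, String realUser, boolean hasKerberosCreds) {
        this.effectiveUser = effectiveUser;
        this.realUser = realUser;
        this.hasKerberosCreds = hasKerberosCreds;
    }
}

/** Hypothetical model of the identity selection described in the comment. */
final class KmsIdentityModel {
    private KmsIdentityModel() {}

    /**
     * Returns the identity the KMS would see under the described behavior.
     * loginUser is the Kerberos login user of the process making the KMS call.
     */
    static String identitySeenByKms(Caller caller, String loginUser) {
        if (caller.hasKerberosCreds) {
            // Credentials are available: authenticate with them directly.
            if (caller.realUser == null) {
                return caller.effectiveUser + " (kerberos)";
            }
            return caller.effectiveUser + " via " + caller.realUser + " (kerberos)";
        }
        // No credentials (token-derived caller): fall back to the process
        // login user's credentials and proxy as the effective user.
        return caller.effectiveUser + " via " + loginUser + " (kerberos)";
    }
}
```

In this model both "daryn (token)" and "daryn via oozie (token)" arriving at a 
process logged in as "dn" collapse to "daryn via dn (kerberos)", which the KMS 
would only accept if "dn" were configured there as a proxy user.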

Both Cloudera and Hortonworks actually documented this security insanity.  
Cloudera's docs appear to be gone now, but used to acknowledge it with a yellow 
box like "this is a bad idea, but if you really want to...".  Hortonworks docs 
still exist with a footnote like "oh yeah, if you are still paying attention 
after clicking all the ui buttons, all your nodes now have access to all your 
keys, might want to consider changing your superuser".  

If you allow every node in your cluster to decrypt everything on your 
cluster, why did you even enable security, let alone EZ?  It's a rotten idea 
that should never have been implemented or passed a review.  It's what 
happens when a feature is rushed.
</rage>
----
Phew.  I value my security and data.  I'm sure as hell not making my DNs be 
proxy users, but we're stuck not breaking all the people that go to sleep with 
a false sense of cluster security.  So in the new design in progress, the DN is 
used as a dumb passthrough of encrypted bytes.  It never encrypts/decrypts or 
even talks to the KMS.  It can be done in a compatible way by the client 
sending a header to the NN saying that it knows how to handle EZ.  A new NN 
gives back the FE info and prefaces the redirect path with /.reserved/raw.  
That works across both old and new nodes and clusters.  Should be beautiful.  
Stay tuned.
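
A minimal sketch of the NN-side decision in that design, under stated 
assumptions: the design is still in progress, the header is not named here, so 
it is modeled as a plain boolean, and {{RawRedirectSketch}}, {{Redirect}} and 
{{plan}} are hypothetical names rather than actual NameNode code.

```java
/** Hypothetical sketch of the redirect decision; not actual NameNode code. */
final class RawRedirectSketch {
    static final String RAW_PREFIX = "/.reserved/raw";

    /** Outcome of the decision: where to redirect and whether FE info is returned. */
    static final class Redirect {
        final String path;
        final boolean includesFeInfo;

        Redirect(String path, boolean includesFeInfo) {
            this.path = path;
            this.includesFeInfo = includesFeInfo;
        }
    }

    private RawRedirectSketch() {}

    /**
     * path is assumed to be an absolute HDFS path such as "/ez/file".
     * clientHandlesEz models the header by which a client declares that it
     * can do the encryption/decryption itself.
     */
    static Redirect plan(String path, boolean fileInEncryptionZone, boolean clientHandlesEz) {
        if (fileInEncryptionZone && clientHandlesEz) {
            // EZ-aware client: hand back the FE info and point the DN at the
            // raw bytes, so the DN only passes encrypted data through.
            return new Redirect(RAW_PREFIX + path, true);
        }
        // Old client or unencrypted file: leave the redirect unchanged.
        return new Redirect(path, false);
    }
}
```

For example, {{plan("/ez/file", true, true)}} yields the path 
/.reserved/raw/ez/file together with the FE info, while a client that did not 
send the header gets the unchanged path, which is what would keep old clients 
working in this sketch.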

> Allow read-only access to reserved raw for non-superusers
> ---------------------------------------------------------
>
>                 Key: HDFS-12907
>                 URL: https://issues.apache.org/jira/browse/HDFS-12907
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.6.0
>            Reporter: Daryn Sharp
>            Assignee: Rushabh S Shah
>         Attachments: HDFS-12907.001.patch, HDFS-12907.patch
>
>
> HDFS-6509 added a special /.reserved/raw path prefix to access the raw file 
> contents of EZ files.  In the simplest sense it doesn't return the FE info in 
> the {{LocatedBlocks}} so the dfs client doesn't try to decrypt the data.  
> This facilitates allowing tools like distcp to copy raw bytes.
> Access to the raw hierarchy is restricted to superusers.  This seems like an 
> overly broad restriction designed to prevent non-admins from munging the EZ 
> related xattrs.  I believe we should relax the restriction to allow 
> non-admins to perform read-only operations.  Allowing non-superusers to 
> easily read the raw bytes will be extremely useful for regular users, esp. 
> for enabling webhdfs client-side encryption.



