[ 
https://issues.apache.org/jira/browse/HADOOP-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795482#action_12795482
 ] 

Philip Zeyliger commented on HADOOP-4487:
-----------------------------------------

I'm surprised I'm the first to comment: is the discussion going on elsewhere?

I read the design document over Christmas.  Great to see a document with so 
much detail, thanks!  I had some questions, and thought a couple of places 
could be clearer; my comments are below.

******

One thing that hasn't been covered (outside of assumptions) is more detail 
about how to operationally secure a Hadoop cluster in Unix-land.  The 
assumptions section lays out some of these ("root" needs to be secure).  Some 
things that I thought about: (1) data nodes need to write their data as a 
Unix user that users don't have access to, and with appropriate permissions (or 
umask).  (Looking at my local system, the DataNode has left blocks 
world-readable.)  (2) We assume that the JT and NN are also run under unix 
accounts which users do not have access to.
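
As a rough illustration of point (1), here is a minimal sketch of how one 
might audit a data directory for files that other Unix users can reach.  The 
function name and the directory passed on the command line are hypothetical; 
nothing here comes from the design document or from Hadoop itself.

```python
import os
import stat
import sys


def world_accessible(root):
    """Yield (path, mode) for each regular file under root that grants
    any read/write/execute permission to 'other' users."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                # Symlink modes are not meaningful; skip them.
                continue
            mode = stat.S_IMODE(os.lstat(path).st_mode)
            if mode & stat.S_IRWXO:
                yield path, mode


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumption: argv[1] is whatever directory the DataNode stores blocks in.
    for path, mode in world_accessible(sys.argv[1]):
        print(f"{mode:04o} {path}")
```

Running the DataNode under a restrictive umask (for example 077) would keep 
newly created block files out of this list, which is the kind of operational 
guidance I'd like the document to spell out.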

Since Data Nodes and the NameNode share a key, it's important to limit cluster 
membership.  (This is critical for task trackers, too, since an evil task 
tracker could do nasty things.)  What's the mechanism to limit cluster 
participation?

Is there a central registry of what users can access HDFS and queues?

Is there an "HDFS" superuser?  In existing Hadoop, it's the username 
corresponding to the uid running the NameNode process.


bq. If the token doesn't exist in memory, which indicates NameNode has restarted

It could also mean that the token has expired, no?  I think this is made 
clearer in the following sentences.

bq. READ, WRITE, COPY, REPLACE

What is the COPY access mode used for?

bq. "only the user will be able to kill their own jobs and tasks"

Somewhere else in the document, there's discussion of jobs having 
owners/groups, not just owners.  Surely a superuser or cluster manager can kill 
jobs with appropriate permissions?

bq. API and environment changes

Will users still be able to use Hadoop in a "non-secure" manner?  How much work 
would be involved in using a different security model?  This is probably 
answered by the patch itself :)


> Security features for Hadoop
> ----------------------------
>
>                 Key: HADOOP-4487
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4487
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: security
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: security-design.pdf
>
>
> This is a top-level tracking JIRA for security work we are doing in Hadoop. 
> Please add reference to this when opening new security related JIRAs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
