[ https://issues.apache.org/jira/browse/HADOOP-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12642048#action_12642048 ]

Doug Cutting commented on HADOOP-4348:
--------------------------------------

Sanjay> The best way to represent that service access is when a service proxy object is created - e.g. when the connection is established.

A proxy is not bound to a single connection. Connections are retrieved from a cache each time a call is made. Different proxies may share the same connection, and a single proxy may use different connections for different calls.

Sanjay> We could share multiple service sessions in a single connection but that complexity is not worth it.

It would be simpler to implement this way, not more complex. In HADOOP-4049 it was considerably simpler to pass extra data by modifying the RPC code than the Client/Server code. That's my primary motivation here: to keep the code simple. So unless there's a reason why we must authorize per connection rather than per request, it would be easier to authorize requests, and that would better compartmentalize the code.

There are some performance implications. Authorizing per request will use fewer connections but perform more authorizations. I don't know whether this is significant. I expect that ACLs will be cached and that authorization will not be too expensive, but that remains to be seen. So performance may provide a motivation to authorize per connection. But let's not optimize prematurely.

Sanjay> I see your argument to be equivalent to arguing against service-level authorization and that method-level authorization is sufficient.

No, but we will probably eventually need method-level authorization too, and it would be nice if whatever support we add now also helps then. If we do this in RPC, then we can examine only the protocol name for now and subsequently add method-level authorization at the same place. So implementing service-level authorization this way better prepares us for method-level authorization.

Sanjay> Would you be happier if we created an intermediate layer, say rpc-session, in between?
I am not seriously suggesting we do that. We have two layers today. We could add this at either layer. It would be cleaner to add it at only one layer, not mixed between the two as in the current patch. It would be simpler to add it to the RPC layer, and I have yet to hear a strong reason why that would be wrong. That's all I'm saying.

> Adding service-level authorization to Hadoop
> --------------------------------------------
>
>                 Key: HADOOP-4348
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4348
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: Kan Zhang
>            Assignee: Arun C Murthy
>             Fix For: 0.20.0
>
>         Attachments: HADOOP-4348_0_20081022.patch
>
>
> Service-level authorization is the initial check done by a Hadoop service to find out whether a connecting client is a pre-defined user of that service. If not, the connection or service request is declined. This feature allows services to limit access to a clearly defined group of users. For example, service-level authorization allows "world-readable" files on an HDFS cluster to be readable only by the pre-defined users of that cluster, not by anyone who can connect to the cluster. It also allows an M/R cluster to define its group of users so that only those users can submit jobs to it.
> Here is an initial list of requirements I came up with.
> 1. Users of a cluster are defined by a flat list of usernames and groups. A client is a user of the cluster if and only if her username is listed in the flat list or one of her groups is explicitly listed in the flat list. Nested groups are not supported.
> 2. The flat list is stored in a conf file and pushed to every cluster node so that services can access it.
> 3. Services will monitor the conf file for modification periodically (5-minute interval by default) and reload the list if needed.
> 4. Checking against the flat list is done as early as possible and before any other authorization checking. Both HDFS and M/R clusters will implement this feature.
> 5. This feature can be switched off and is off by default.
> I'm aware of interest in pulling user data from LDAP. For this JIRA, I suggest we implement it using a conf file. Additional data sources may be supported via new JIRAs.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
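To make the discussion above concrete, here is a minimal sketch of the kind of per-call check Doug describes: the RPC server consults a flat-list ACL keyed by protocol name on every incoming call, and the same hook could later also examine the method name. All class and method names here (ServiceAuthorizer, FlatListAcl, setAcl) are illustrative assumptions for this sketch, not actual Hadoop APIs, and the conf-file reload is stubbed out.

```java
import java.util.*;

// Hypothetical sketch of service-level authorization done per RPC request.
// Names are illustrative, not Hadoop APIs; the conf-file reload is a stub.
public class ServiceAuthorizer {
    // protocol name -> flat list of allowed users and groups
    private volatile Map<String, FlatListAcl> acls = new HashMap<>();
    private long lastLoaded = 0;
    private static final long RELOAD_INTERVAL_MS = 5 * 60 * 1000; // 5 min default

    public static final class FlatListAcl {
        private final Set<String> users;
        private final Set<String> groups;
        public FlatListAcl(Set<String> users, Set<String> groups) {
            this.users = users;
            this.groups = groups;
        }
        // A client is a user of the service iff her username is listed, or one
        // of her groups is explicitly listed (nested groups not supported).
        public boolean permits(String user, Collection<String> userGroups) {
            if (users.contains(user)) return true;
            for (String g : userGroups) {
                if (groups.contains(g)) return true;
            }
            return false;
        }
    }

    // Called by the RPC server for every incoming call, before dispatch.
    // Only the protocol name is examined for now; checking the method name
    // here later would add method-level authorization at the same place.
    public boolean authorize(String protocol, String user, Collection<String> groups) {
        maybeReload();
        FlatListAcl acl = acls.get(protocol);
        if (acl == null) return true; // no ACL configured: feature is off
        return acl.permits(user, groups);
    }

    // Illustrative hook for registering an ACL (in practice, loaded from conf).
    public void setAcl(String protocol, FlatListAcl acl) {
        Map<String, FlatListAcl> copy = new HashMap<>(acls);
        copy.put(protocol, acl);
        acls = copy;
    }

    private synchronized void maybeReload() {
        long now = System.currentTimeMillis();
        if (now - lastLoaded >= RELOAD_INTERVAL_MS) {
            // A real implementation would re-read the pushed conf file here
            // and rebuild the acls map if the file's modification time changed.
            lastLoaded = now;
        }
    }
}
```

Because the check runs once per request rather than per connection, different proxies can keep sharing pooled connections, at the cost of more (cheap, cached) authorization checks.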