[ https://issues.apache.org/jira/browse/HBASE-3615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005530#comment-13005530 ]

Gary Helmling commented on HBASE-3615:
--------------------------------------

bq. ZK seems good for storing a small thing that does not change. Will the key 
be generally available if its in zk?

This will rely on securing ZK with Kerberos auth (Eugene has a patch for 
ZOOKEEPER-938) and setting up ACLs.  We already need ZK secured since we use 
it to broadcast ACL changes to the RSs, so this seems to fit with that too.  
It goes without saying, but from the HBase standpoint, using ZK security will 
be optional config as well.
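
For illustration only, here's a rough sketch of what a SASL-restricted key 
znode could look like using the plain ZooKeeper client; the znode path and the 
"hbase" principal name are just placeholders, not the actual layout:

{code}
import java.util.Collections;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs.Perms;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.ACL;
import org.apache.zookeeper.data.Id;

public class SecureKeyZNode {
  // Illustration only: restrict a key znode so that only the Kerberos-authenticated
  // "hbase" principal (SASL ACL scheme) can read or modify it.
  public static void createKeyZNode(ZooKeeper zk, byte[] keyData) throws Exception {
    ACL hbaseOnly = new ACL(Perms.ALL, new Id("sasl", "hbase"));
    zk.create("/hbase/tokenauth/keys/1", keyData,
        Collections.singletonList(hbaseOnly), CreateMode.PERSISTENT);
  }
}
{code}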

We'll want to periodically roll master keys and communicate updates to RSs.  
Hadoop rolls the "current" key every 24 hrs and keeps the last 7, so ZK again 
seems a good fit to communicate the changes.  I considered just storing the key 
IDs in ZK for change notification and storing key data in HDFS using file 
permissions for security, but that's just another piece that can break when 
we're securing ZK anyway.
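
As a rough sketch of that bookkeeping (illustrative only; class and znode 
names are made up): roll a new current key every 24 hrs, keep the last 7, and 
publish the change through ZK so the RSs see it:

{code}
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public class MasterKeyRoller {
  private static final long ROLL_INTERVAL_MS = 24L * 60 * 60 * 1000;  // roll every 24 hrs
  private static final int KEYS_TO_KEEP = 7;                          // keep the last 7

  private final ConcurrentNavigableMap<Integer, SecretKey> keys = new ConcurrentSkipListMap<>();
  private int currentKeyId = 0;

  // A scheduled task would call this every ROLL_INTERVAL_MS.
  public synchronized void rollCurrentKey() throws Exception {
    SecretKey newKey = KeyGenerator.getInstance("HmacSHA1").generateKey();
    keys.put(++currentKeyId, newKey);
    // drop keys older than the last KEYS_TO_KEEP
    while (keys.size() > KEYS_TO_KEEP) {
      keys.pollFirstEntry();
    }
    publishToZooKeeper(currentKeyId, newKey);
  }

  private void publishToZooKeeper(int keyId, SecretKey key) {
    // stub: would write the serialized key under a watched znode so RSs pick up the change
  }
}
{code}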

bq. When would you need this? [token renewal]

Hadoop again does this.  I think the jobtracker is designated as the token 
"renewer" and then pings the NN to keep the token alive for up to 7 days.  In 
that case, each token has a "max date", but expiration is computed separately 
as current time + some window.  Expiration is a bit fuzzy in that 
implementation though, as the renewer can still resurrect expired tokens if 
the current time < the "max date" in the token.  In theory, this limits the 
window during which token disclosure allows impersonating the user.  If the 
token expires in 24 hours without renewal, _and_ the MR job completes in less 
than that time, then disclosure of the token 25 hrs after issue (when the 
token has expired and the JT has not needed to renew it) will not allow the 
token to be re-used to impersonate the user.
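
Roughly, the Hadoop-style semantics described above look like this (field 
names and numbers are my illustration, not Hadoop's actual API):

{code}
// Rough illustration of the semantics described above (not actual Hadoop code):
// a token stops working once its rolling expiration passes, but the designated
// renewer can push the expiration forward until the hard "max date" is reached.
public final class RenewableToken {
  static final long RENEW_WINDOW_MS = 24L * 60 * 60 * 1000;  // ~24 hr renewal window

  private long expirationMs;     // issue time + window, moved forward on each renewal
  private final long maxDateMs;  // hard limit, e.g. 7 days after issue

  public RenewableToken(long issueMs, long maxDateMs) {
    this.expirationMs = issueMs + RENEW_WINDOW_MS;
    this.maxDateMs = maxDateMs;
  }

  public boolean isValid(long nowMs) {
    return nowMs < expirationMs;
  }

  public void renew(long nowMs) {
    if (nowMs >= maxDateMs) {
      throw new IllegalStateException("past max date, cannot renew");
    }
    // the fuzziness: an already-expired token can still be resurrected here,
    // as long as the current time is under the max date
    expirationMs = Math.min(nowMs + RENEW_WINDOW_MS, maxDateMs);
  }
}
{code}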

However, this doesn't really close the loop if you can somehow trick the JT 
(the designated "renewer") into resurrecting the expired token for you.  Also, 
we can't use the built-in JT renewal, as it only works for Hadoop 
DelegationTokens, so something else would have to handle renewal for the 
duration of a job execution.  And it's not clear to me that it's a meaningful 
security enhancement.  So I've ignored the expiration/max date distinction and 
just made it a single expiration date.
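
So the check boils down to something like this (illustrative only):

{code}
// Illustrative only: a single expiration timestamp in the token, no separate max date.
public static boolean isExpired(long tokenExpirationMs) {
  return System.currentTimeMillis() >= tokenExpirationMs;
}
{code}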

On failover, the master would read in the current master keys from ZK and 
repopulate the valid tokens in memory as they're used, validating that they 
use an existing master key and haven't expired, err.. "maxed out", yet.
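
Something like the following, purely as a sketch of that lazy re-validation 
(names and the HMAC-over-identifier signing scheme are assumptions on my part, 
mirroring how the Hadoop tokens work):

{code}
import java.security.MessageDigest;
import java.util.Map;
import javax.crypto.Mac;
import javax.crypto.SecretKey;

public class TokenValidator {
  // Sketch of lazy re-validation after failover: the new master holds only the
  // master keys read back from ZK, and checks each token as a client presents it.
  public static boolean validate(Map<Integer, SecretKey> masterKeys, int keyId,
      long expirationMs, byte[] identifierBytes, byte[] password) throws Exception {
    SecretKey masterKey = masterKeys.get(keyId);
    if (masterKey == null) {
      return false;  // signed with a master key we don't hold
    }
    if (System.currentTimeMillis() >= expirationMs) {
      return false;  // already expired ("maxed out")
    }
    // recompute the HMAC over the token identifier and compare to the presented password
    Mac mac = Mac.getInstance("HmacSHA1");
    mac.init(masterKey);
    byte[] expected = mac.doFinal(identifierBytes);
    return MessageDigest.isEqual(expected, password);
  }
}
{code}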

bq. Whats this? We need name for cluster instance? I suppose we can't use 
master ip plus port because could change with time. The zk ensemble string plus 
the zk rootdir?

Yeah, this part is a bit tricky.  I hadn't thought of cluster ensemble 
subsets.  I was going to ping JD on whether replication has anything we could 
use for similar purposes -- uniquely identifying clusters to prevent 
replication loops, say.  Talking with Andy, he suggested generating a UUID on 
initial FS setup and adding it to hbase.version.  From there the master could 
publish it in ZK on startup.  Maybe I should open a separate JIRA for 
discussing that bit?
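
Just to sketch the idea (not a worked-out design): generate the ID once at FS 
initialization, persist it with hbase.version, and have the master publish it 
in ZK on startup:

{code}
import java.util.UUID;

public class ClusterId {
  // Generated once at initial FS setup and persisted (e.g. alongside hbase.version);
  // the master would publish it in ZK on startup so tokens can be scoped to this cluster.
  public static String generateClusterId() {
    return UUID.randomUUID().toString();
  }
}
{code}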

Thanks for the comments!  I suppose I should clarify some of these bits on the 
wiki page.


> Implement token based DIGEST-MD5 authentication for MapReduce tasks
> -------------------------------------------------------------------
>
>                 Key: HBASE-3615
>                 URL: https://issues.apache.org/jira/browse/HBASE-3615
>             Project: HBase
>          Issue Type: New Feature
>          Components: ipc, security
>            Reporter: Gary Helmling
>            Assignee: Gary Helmling
>             Fix For: 0.92.0
>
>
> HBase security currently supports Kerberos authentication for clients, but 
> this isn't sufficient for map-reduce interoperability, where tasks execute 
> without Kerberos credentials.  In order to fully interoperate with map-reduce 
> clients, we will need to provide our own token authentication mechanism, 
> mirroring the Hadoop token authentication mechanisms.  This will require 
> obtaining an HBase authentication token for the user when the job is 
> submitted, serializing it to a secure location, and then, at task execution, 
> having the client or task code de-serialize the stored authentication token 
> and use that in the HBase client authentication process.
> A detailed implementation proposal is sketched out on the wiki:
> http://wiki.apache.org/hadoop/Hbase/HBaseTokenAuthentication

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
