[
https://issues.apache.org/jira/browse/HBASE-3615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005530#comment-13005530
]
Gary Helmling commented on HBASE-3615:
--------------------------------------
bq. ZK seems good for storing a small thing that does not change. Will the key
be generally available if it's in ZK?
This will rely on securing ZK with Kerberos auth (Eugene has a patch for
ZOOKEEPER-938) and setting up ACLs. We already need ZK secured as we use it to
broadcast changes in ACLs to the RSs, so this seems to fit with that too. Goes
without saying, but using ZK security will be optional config as well from the
HBase standpoint.
We'll want to periodically roll master keys and communicate updates to RSs.
Hadoop rolls the "current" key every 24 hrs and keeps the last 7, so ZK again
seems a good fit to communicate the changes. I considered just storing the key
IDs in ZK for change notification and storing key data in HDFS using file
permissions for security, but that's just another piece that can break when
we're securing ZK anyway.
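To make the rolling scheme concrete, here is a rough sketch of a key set that
generates a new "current" key and keeps a fixed number of previous ones. The
names (RollingKeys, MasterKey, roll, lookup) are made up for illustration and
are not HBase or Hadoop APIs; publishing the keys to ZK is left out entirely.

```java
import java.security.SecureRandom;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of a rolling master-key set; not HBase or Hadoop code. */
class RollingKeys {
    /** One master key: an ID for lookup plus the secret bytes. */
    static final class MasterKey {
        final int id;
        final byte[] secret;
        MasterKey(int id, byte[] secret) { this.id = id; this.secret = secret; }
    }

    private final Deque<MasterKey> keys = new ArrayDeque<>(); // newest first
    private final int retained;                               // previous keys to keep
    private final SecureRandom random = new SecureRandom();
    private int nextId = 1;

    RollingKeys(int retained) { this.retained = retained; }

    /** Generates a new "current" key and drops keys beyond the retention limit. */
    MasterKey roll() {
        byte[] secret = new byte[32];
        random.nextBytes(secret);
        MasterKey key = new MasterKey(nextId++, secret);
        keys.addFirst(key);
        while (keys.size() > retained + 1) {
            keys.removeLast();
        }
        return key;
    }

    /** Returns the newest key, or null if roll() has never been called. */
    MasterKey current() { return keys.peekFirst(); }

    /** Returns the key with the given ID, or null if it has been rolled out. */
    MasterKey lookup(int id) {
        for (MasterKey k : keys) {
            if (k.id == id) return k;
        }
        return null;
    }
}
```

With a retention of 7, calling roll() once per 24 hrs would keep the current
key plus the last 7, matching the Hadoop behaviour described above.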
bq. When would you need this? [token renewal]
Hadoop again does this. I think the jobtracker is designated as the token
"renewer" and then it pings the NN to keep the token live for up to 7 days. In that
case, each token has a "max date", but expiration is computed separately as
current time + some window. Expiration is a bit fuzzy in that implementation
though, as the renewer can still resurrect expired tokens if the current time <
the "max date" in the token. In theory, it limits the window during which
token disclosure allows impersonating the user. If the token expires in 24
hours without renewal, _and_ the MR job completes in less than that time, then
a disclosure of the token 25 hrs after issue, when the token has expired, and
the JT has not needed to renew it, will not allow the token to be re-used to
impersonate the user.
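As I understand the model, the expiration/max date split looks roughly like
the sketch below. RenewableToken and its fields are hypothetical names for
illustration, not Hadoop's actual classes, and times are plain milliseconds.

```java
/** Hypothetical sketch of the renewal model described above; not Hadoop code. */
class RenewableToken {
    final long maxDate;  // hard limit, fixed at issue time (ms)
    long expiration;     // soft limit, pushed forward on each renewal (ms)

    RenewableToken(long issueTime, long renewWindow, long maxLifetime) {
        this.maxDate = issueTime + maxLifetime;
        this.expiration = Math.min(issueTime + renewWindow, maxDate);
    }

    /** A token is usable only while the soft expiration has not passed. */
    boolean isValid(long now) { return now < expiration; }

    /**
     * Renewal sets expiration to now + window, capped at maxDate. Note that it
     * succeeds even for an already expired token as long as now < maxDate,
     * which is the "resurrection" behaviour.
     */
    boolean renew(long now, long renewWindow) {
        if (now >= maxDate) return false;
        expiration = Math.min(now + renewWindow, maxDate);
        return true;
    }
}
```

For a token issued at time 0 with a 24 hr window and a 7 day max, isValid is
false at hour 25 if nobody renewed, but a renew call at hour 25 still succeeds.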
However, this doesn't really close the loop if you can somehow trick the JT
(the designated "renewer") into resurrecting the expired token for you. Also,
we can't use the built in JT renewal as it only works for Hadoop
DelegationTokens, so something else would have to handle it for the duration of
a job execution. And it's not clear to me that it's a meaningful enhancement
in security. So I've ignored the expiration/max date distinction and just use
a single expiration date.
On failover, the master would read in the current master keys from ZK and
repopulate the valid tokens in memory as they're used, validating that they use
an existing master key and haven't expired, err.. "maxed out", yet.
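A minimal sketch of that validation step, assuming the token password is an
HMAC over the token fields keyed by the master key. The field layout, the
HmacSHA1 choice, and the TokenValidator name are assumptions for illustration,
not the actual token format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Map;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical sketch of stateless token validation; not the HBase implementation. */
class TokenValidator {
    /** Computes an HMAC-SHA1 over (user, keyId, expiration) with the master key. */
    static byte[] password(byte[] masterKey, String user, int keyId, long expiration) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(masterKey, "HmacSHA1"));
            mac.update(user.getBytes(StandardCharsets.UTF_8));
            mac.update(ByteBuffer.allocate(12).putInt(keyId).putLong(expiration).array());
            return mac.doFinal();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Accepts a token only if its master key is still known and it has not expired. */
    static boolean isValid(Map<Integer, byte[]> masterKeys, String user, int keyId,
                           long expiration, byte[] presented, long now) {
        byte[] key = masterKeys.get(keyId);
        if (key == null) return false;        // unknown or rolled-out master key
        if (now >= expiration) return false;  // past the expiration date
        // Constant-time comparison of the recomputed and presented passwords.
        return MessageDigest.isEqual(password(key, user, keyId, expiration), presented);
    }
}
```

Since the check needs only the master keys, a newly active master can validate
tokens it has never seen before, which is what makes reloading the keys from ZK
sufficient.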
bq. What's this? We need a name for the cluster instance? I suppose we can't use
master ip plus port because it could change with time. The zk ensemble string plus
the zk rootdir?
Yeah, this part is a bit tricky. I hadn't thought of cluster ensemble subsets.
I was going to ask JD whether replication has anything for similar
purposes -- uniquely identifying clusters to prevent replication loops, say.
Talking with Andy, he suggested generating a UUID on initial FS setup and
adding it to hbase.version. From there master could pop it up in ZK on
startup? Maybe I should open a separate JIRA for discussing that bit?
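Just to illustrate the generate-once idea, a sketch using a local file as a
stand-in for the filesystem location. ClusterId and loadOrCreate are
hypothetical names, and nothing here touches HDFS, hbase.version, or ZK.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

/** Hypothetical sketch of a generate-once cluster ID; a local file stands in for the FS. */
class ClusterId {
    /** Returns the stored ID, generating and persisting a new UUID on first use. */
    static String loadOrCreate(Path idFile) {
        try {
            if (Files.exists(idFile)) {
                return new String(Files.readAllBytes(idFile), StandardCharsets.UTF_8).trim();
            }
            String id = UUID.randomUUID().toString();
            Files.write(idFile, id.getBytes(StandardCharsets.UTF_8));
            return id;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Every later startup reads back the same ID, so it stays stable even if the
master's address or the ZK ensemble string changes.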
Thanks for the comments! I suppose I should clarify some of these bits on the
wiki page.
> Implement token based DIGEST-MD5 authentication for MapReduce tasks
> -------------------------------------------------------------------
>
> Key: HBASE-3615
> URL: https://issues.apache.org/jira/browse/HBASE-3615
> Project: HBase
> Issue Type: New Feature
> Components: ipc, security
> Reporter: Gary Helmling
> Assignee: Gary Helmling
> Fix For: 0.92.0
>
>
> HBase security currently supports Kerberos authentication for clients, but
> this isn't sufficient for map-reduce interoperability, where tasks execute
> without Kerberos credentials. In order to fully interoperate with map-reduce
> clients, we will need to provide our own token authentication mechanism,
> mirroring the Hadoop token authentication mechanisms. This will require
> obtaining an HBase authentication token for the user when the job is
> submitted, serializing it to a secure location, and then, at task execution,
> having the client or task code de-serialize the stored authentication token
> and use that in the HBase client authentication process.
> A detailed implementation proposal is sketched out on the wiki:
> http://wiki.apache.org/hadoop/Hbase/HBaseTokenAuthentication