Kai Zheng created HADOOP-10959:
----------------------------------

Summary: A Complement and Short Term Solution to TokenAuth Based on Kerberos Pre-Authentication Framework
Key: HADOOP-10959
URL: https://issues.apache.org/jira/browse/HADOOP-10959
Project: Hadoop Common
Issue Type: New Feature
Components: security
Reporter: Kai Zheng
Assignee: Kai Zheng

To implement and integrate pluggable authentication providers, improve single sign-on for end users, and help enforce centralized access control on the platform, the community has discussed the topic widely and concluded that token-based authentication could be the appropriate approach. TokenAuth (HADOOP-9392) was proposed and is under development to add another Authentication Method alongside Simple and Kerberos. Supporting TokenAuth across the entire ecosystem is a large, long-term effort.

Here we propose a short-term solution based on Kerberos that can complement TokenAuth. It involves fewer code changes with limited risk, and the main development work has already been done in our POC. Users can adopt it as a short-term way to support tokens inside Hadoop. The effort and the resulting solution will be described fully in the design document, to be attached soon. Below is a brief introduction.

We propose adding a token-preauth mechanism for Kerberos, similar to PKINIT and OTP and built on the Pre-Authentication framework, which allows users to authenticate to the KDC using a JWT token instead of a password. The KDC validates the JWT token and issues a TGT, because it trusts the token authority/issuer via a PKI mechanism. The proposal was submitted to Kerberos and the IETF Kitten WG, and they are interested. We are currently collaborating with the MIT team on the draft to standardize the mechanism.

We also did a POC that implements the token-preauth mechanism as an MIT Kerberos plugin. The plugin can be packaged separately as a Linux .so module and deployed on top of existing installations. MIT would also like us to contribute the code so that it can be made available in their future releases. Until then, we can make the plugin binary and source code available to the community for experimental use and review.

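To make the PKI trust step concrete, here is a minimal Java sketch of the signature check a verifier performs before accepting a JWT: the token is accepted only if its signature validates against the issuer's public key. This is not the plugin itself (the POC is an MIT Kerberos plugin, not Java code); the class name TokenSignatureSketch, the RS256 algorithm choice, and the sample claims are all assumptions made for illustration, and a real verifier would also check expiry, issuer, and audience.

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.security.SignatureException;
import java.util.Base64;

/** Illustrative only: JWT-style signing and verification with an RSA key pair. */
public class TokenSignatureSketch {

    private static String b64(byte[] data) {
        // JWT uses unpadded base64url encoding for each segment.
        return Base64.getUrlEncoder().withoutPadding().encodeToString(data);
    }

    /** Issuer side: produce header.payload.signature (RS256 assumed). */
    public static String sign(String payloadJson, PrivateKey issuerKey) throws Exception {
        String header = "{\"alg\":\"RS256\",\"typ\":\"JWT\"}";
        String signingInput = b64(header.getBytes(StandardCharsets.UTF_8))
                + "." + b64(payloadJson.getBytes(StandardCharsets.UTF_8));
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(issuerKey);
        signer.update(signingInput.getBytes(StandardCharsets.US_ASCII));
        return signingInput + "." + b64(signer.sign());
    }

    /** Verifier side: accept the token only if the issuer's public key validates it. */
    public static boolean verify(String token, PublicKey issuerKey) throws Exception {
        String[] parts = token.split("\\.");
        if (parts.length != 3) {
            return false; // not in header.payload.signature form
        }
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(issuerKey);
        verifier.update((parts[0] + "." + parts[1]).getBytes(StandardCharsets.US_ASCII));
        try {
            return verifier.verify(Base64.getUrlDecoder().decode(parts[2]));
        } catch (IllegalArgumentException | SignatureException e) {
            return false; // malformed base64 or malformed signature bytes
        }
    }

    public static void main(String[] args) throws Exception {
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        KeyPair issuer = gen.generateKeyPair();
        KeyPair stranger = gen.generateKeyPair();

        // Sample claims are made up for this sketch.
        String token = sign("{\"sub\":\"alice\",\"iss\":\"example-issuer\"}", issuer.getPrivate());
        System.out.println("trusted issuer key accepts token: " + verify(token, issuer.getPublic()));
        System.out.println("untrusted key accepts token: " + verify(token, stranger.getPublic()));
    }
}
```

The point of the sketch is only that the verifier needs nothing but the issuer's public key (or certificate), which is what lets a KDC trust an external token authority without sharing a secret with it.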
So ideally, the token-preauth plugin is deployed to an MIT Kerberos installation, end users authenticate to third-party JWT token authorities to get tokens, and then use those tokens to acquire a Kerberos TGT from the KDC. Based on that, we implemented token authentication for Hadoop with only a few central modifications to the code base, since we do not have to add another Authentication Method and the solution leverages the existing Kerberos support.

We added KrbTokenLoginModule, which extends Krb5LoginModule and adds support for logging in with a token or a token cache. The new module is compatible with Krb5LoginModule in configuration and functionality, so it can be used safely.

We also added KerberosTokenAuthenticationHandler to support the Hadoop web interfaces. It extends KerberosAuthenticationHandler, adds support for token authentication, and performs the SPNEGO negotiation purely on the server side. Again, the new handler is compatible with KerberosAuthenticationHandler and can be used safely.

The token is exchanged for a Kerberos ticket, and the ticket is presented to Hadoop services as usual. In addition, so that token attributes can be used to enforce fine-grained authorization or for other purposes, a derivation of the token is encapsulated in the ticket as authorization data when the KDC issues the ticket. On the service side (the Hadoop services), the token can then be queried and extracted from the service ticket. We made this work in both GSSAPI and SASL contexts, as both are used in Hadoop.

As far as we can see, the main concerns with this solution are that it requires deploying an additional plugin to existing Kerberos installations and that it involves syncing identity accounts from identity management systems to the Kerberos KDC. Most importantly, it requires a Kerberos deployment as a prerequisite.

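Returning to KrbTokenLoginModule: because it is described as configuration-compatible with Krb5LoginModule, wiring it into JAAS would presumably look like an ordinary Krb5LoginModule entry plus a token-related option. The sketch below builds such an entry programmatically. The package name org.example, the option name tokenCache, and the class TokenLoginConfigSketch are hypothetical and not taken from the patch; principal, useTicketCache, and refreshKrb5Config are standard Krb5LoginModule options.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;
import javax.security.auth.login.Configuration;

/** Illustrative only: a JAAS Configuration that names a token-aware Kerberos login module. */
public class TokenLoginConfigSketch extends Configuration {

    // Hypothetical fully qualified name; the issue gives only the simple name KrbTokenLoginModule.
    public static final String MODULE = "org.example.KrbTokenLoginModule";

    private final String principal;
    private final String tokenCachePath;

    public TokenLoginConfigSketch(String principal, String tokenCachePath) {
        this.principal = principal;
        this.tokenCachePath = tokenCachePath;
    }

    @Override
    public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
        Map<String, String> options = new HashMap<>();
        // Standard Krb5LoginModule options, which a compatible module would keep honoring.
        options.put("principal", principal);
        options.put("useTicketCache", "false");
        options.put("refreshKrb5Config", "true");
        // Hypothetical option: where the module would find the user's token.
        options.put("tokenCache", tokenCachePath);
        return new AppConfigurationEntry[] {
            new AppConfigurationEntry(MODULE, LoginModuleControlFlag.REQUIRED, options)
        };
    }

    public static void main(String[] args) {
        // Sample principal and path are made up for this sketch.
        Configuration conf = new TokenLoginConfigSketch("alice@EXAMPLE.COM", "/tmp/alice.token");
        for (AppConfigurationEntry entry : conf.getAppConfigurationEntry("hadoop-token-login")) {
            System.out.println(entry.getLoginModuleName() + " " + entry.getOptions());
        }
    }
}
```

Building the entry does not load the module class, so the sketch runs without the patch applied; an actual LoginContext.login() call would of course need the real module on the classpath and a KDC with the token-preauth plugin deployed.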
We are also discussing with the MIT team how to simplify Kerberos deployment, especially for large Hadoop clusters, and how to reduce the overhead of employing PKINIT/token-preauth mechanisms, such as identity sync.

--
This message was sent by Atlassian JIRA
(v6.2#6252)