Re: [DISCUSS] Hadoop SSO/Token Server Components
One aside: if you come across a bug, please try to fix it upstream and then merge into the feature branch rather than cherry-picking patches or only fixing it on the branch. It becomes very awkward to track. -C

Related to this, when refactoring the code, as is generally required for large feature development, consider first refactoring in trunk and then making the additional changes for the feature in the feature branch. This helps a lot in being able to merge trunk into the feature branch periodically. It also helps keep the change for merging the feature back to trunk small, making for easier reviews.

Regards,
Suresh

--
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
RE: [DISCUSS] Hadoop SSO/Token Server Components
Got it, Suresh. So I guess HADOOP-9797 (and its family) for the UGI change would fit this rule, right? The refactoring is improving and cleaning UGI, and also preparing for the TokenAuth feature. According to this rule, the changes would go into trunk first. Thanks for your guidance.

Regards,
Kai

-Original Message-
From: Suresh Srinivas [mailto:sur...@hortonworks.com]
Sent: Thursday, September 05, 2013 2:42 PM
To: common-dev@hadoop.apache.org
Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
Re: [DISCUSS] Hadoop SSO/Token Server Components
Chris - I am curious whether there are any guidelines for feature branch use. The general goals should be to:

* keep branches as small and as easily reviewable as possible for a given feature
* decouple the pluggable framework from any specific central server implementation
* scope specific content into iterations that can be merged into trunk on their own, and then continue development in new branches for the next iteration

So, I guess the questions that immediately come to mind are:

1. Is there a document that describes the best way to do this?
2. How best do we leverage code being done in one feature branch within another?

Thanks!
--larry

On Tue, Sep 3, 2013 at 10:00 PM, Zheng, Kai kai.zh...@intel.com wrote:

This looks good and reasonable to me. Thanks Chris.

-Original Message-
From: Chris Douglas [mailto:cdoug...@apache.org]
Sent: Wednesday, September 04, 2013 6:45 AM
To: common-dev@hadoop.apache.org
Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components

On Tue, Sep 3, 2013 at 5:20 AM, Larry McCay lmc...@hortonworks.com wrote:

One outstanding question for me - how do we go about getting the branches created?

Once a group has converged on a purpose - ideally with some initial code from JIRA - please go ahead and create the feature branch in svn. There's no ceremony. -C

On Tue, Aug 6, 2013 at 6:22 PM, Chris Nauroth cnaur...@hortonworks.com wrote:

Near the bottom of the bylaws, it states that addition of a New Branch Committer requires Lazy consensus of active PMC members. I think this means that you'll need to get a PMC member to sponsor the vote for you. Regular committer votes happen on the private PMC mailing list, and I assume it would be the same for a branch committer vote. http://hadoop.apache.org/bylaws.html

Chris Nauroth
Hortonworks
http://hortonworks.com/

On Tue, Aug 6, 2013 at 2:48 PM, Larry McCay lmc...@hortonworks.com wrote:

That sounds perfect!
I have been thinking of late that we would maybe need an incubator project or something for this - which would be unfortunate. This would allow us to move much more quickly with a set of patches broken up into consumable/understandable chunks that are made functional more easily within the branch. I assume that we need to start a separate thread for DISCUSS or VOTE to start that process - correct?

On Aug 6, 2013, at 4:15 PM, Alejandro Abdelnur t...@cloudera.com wrote:

yep, that is what I meant. Thanks Chris

On Tue, Aug 6, 2013 at 1:12 PM, Chris Nauroth cnaur...@hortonworks.com wrote:

Perhaps this is also a good opportunity to try out the new branch committers clause in the bylaws, enabling non-committers who are working on this to commit to the feature branch. http://mail-archives.apache.org/mod_mbox/hadoop-general/201308.mbox/%3CCACO5Y4we4d8knB_xU3a=hr2gbeqo5m3vau+inba0li1i9e2...@mail.gmail.com%3E

Chris Nauroth
Hortonworks
http://hortonworks.com/

On Tue, Aug 6, 2013 at 1:04 PM, Alejandro Abdelnur t...@cloudera.com wrote:

Larry, sorry for the delay answering. Thanks for laying things down; yes, it makes sense. Given the large scope of the changes, the number of JIRAs and the number of developers involved, wouldn't it make sense to create a feature branch for all this work, so as not to destabilize (more ;) trunk? Thanks again.

On Tue, Jul 30, 2013 at 9:43 AM, Larry McCay lmc...@hortonworks.com wrote:

The following JIRA was filed to provide a token and basic authority implementation for this effort: https://issues.apache.org/jira/browse/HADOOP-9781

I have attached an initial patch, though I have yet to submit it since it is dependent on the patch for CMF that was posted to https://issues.apache.org/jira/browse/HADOOP-9534, and this patch still has a couple of outstanding issues - javac warnings for com.sun classes for certificate generation and 11 javadoc warnings.

Please feel free to review the patches and raise any questions or concerns related to them.
On Jul 26, 2013, at 8:59 PM, Larry McCay lmc...@hortonworks.com wrote:

Hello All -

In an effort to scope an initial iteration that provides value to the community while focusing on the pluggable authentication aspects, I've written a description for Iteration 1. It identifies the goal of the iteration, the endstate and a set of initial usecases. It also enumerates the components that are required for each usecase. There is a scope section that details specific things that should be kept out of the first iteration. This is certainly up for discussion. There may be some of these things that can be contributed in short order. If we can
Re: [DISCUSS] Hadoop SSO/Token Server Components
interrogates the incoming request again for an authcookie that contains an access token; upon finding one:
a. validates the incoming token
b. returns the AuthenticationToken as per the AuthenticationHandler contract
c. AuthenticationFilter adds the hadoop auth cookie with the expected token
d. serves the requested resource for valid tokens
e. subsequent requests are handled by the AuthenticationFilter's recognition of the hadoop auth cookie

REQUIRED COMPONENTS for UI USECASES:
COMP-12. WebSSOAuthenticationHandler
COMP-13. IdP Web Endpoint within AuthenticationServer for FORM based login
COMP-14. SP Web Endpoint within AuthenticationServer for 3rd party token federation

On Wed, Jul 10, 2013 at 1:59 PM, Brian Swan brian.s...@microsoft.com wrote:

Thanks, Larry. That is what I was trying to say, but you've said it better and in more detail. :-) To extract from what you are saying: If we were to reframe the immediate scope to the lowest common denominator of what is needed for accepting tokens in authentication plugins then we gain... an end-state for the lowest common denominator that enables code patches in the near-term is the best of both worlds.

-Brian

-Original Message-
From: Larry McCay [mailto:lmc...@hortonworks.com]
Sent: Wednesday, July 10, 2013 10:40 AM
To: common-dev@hadoop.apache.org
Cc: da...@yahoo-inc.com; Kai Zheng; Alejandro Abdelnur
Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components

It seems to me that we can have the best of both worlds here... it's all about the scoping. If we were to reframe the immediate scope to the lowest common denominator of what is needed for accepting tokens in authentication plugins then we gain:

1. a very manageable scope to define and agree upon
2. a deliverable that should be useful in and of itself
3. a foundation for community collaboration that we build on for higher level solutions built on this lowest common denominator and experience as a working community

So, to Alejandro's point, perhaps we need to define what would make #2 above true - this could serve as the what we are building instead of the how to build it. Including:

a. project structure within hadoop-common-project/common-security or the like
b. the usecases that would need to be enabled to make it a self-contained and useful contribution - without higher level solutions
c. the JIRA/s for contributing patches
d. what specific patches will be needed to accomplish the usecases in #b

In other words, an end-state for the lowest common denominator that enables code patches in the near-term is the best of both worlds. I think this may be a good way to bootstrap the collaboration process for our emerging security community rather than trying to tackle a huge vision all at once.

@Alejandro - if you have something else in mind that would bootstrap this process - that would be great - please advise.

thoughts?

On Jul 10, 2013, at 1:06 PM, Brian Swan brian.s...@microsoft.com wrote:

Hi Alejandro, all -

There seems to be agreement on the broad stroke description of the components needed to achieve pluggable token authentication (I'm sure I'll be corrected if that isn't the case). However, discussion of the details of those components doesn't seem to be moving forward. I think this is because the details are really best understood through code. I also see *a* (i.e. one of many possible) token format and pluggable authentication mechanisms within the RPC layer as components that can have immediate benefit to Hadoop users AND still allow flexibility in the larger design. So, I think the best way to move the conversation of what we are aiming for forward is to start looking at code for these components.
I am especially interested in moving forward with pluggable authentication mechanisms within the RPC layer and would love to see what others have done in this area (if anything). Thanks.

-Brian

-Original Message-
From: Alejandro Abdelnur [mailto:t...@cloudera.com]
Sent: Wednesday, July 10, 2013 8:15 AM
To: Larry McCay
Cc: common-dev@hadoop.apache.org; da...@yahoo-inc.com; Kai Zheng
Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components

Larry, all,

It is still not clear to me what end state we are aiming for, or that we even agree on that. IMO, instead of trying to agree on what to do, we should first agree on the final state, then see what should be changed to get there, then see how we change things to get there. The different documents out there focus more on the how. We should not try to say how before we know what.

Thx.

On Wed, Jul 10, 2013 at 6:42 AM
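To make Brian's ask concrete - "pluggable authentication mechanisms within the RPC layer" - here is a minimal sketch of the registry-plus-SPI shape such pluggability usually takes. The names (PluggableAuth, TokenAuthenticator) and the toy "user:<name>" token scheme are invented for illustration; they are not Hadoop APIs, and Hadoop's real RPC authentication runs through SASL and UserGroupInformation.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only: shows the pluggable-registry shape, not Hadoop's
// actual RPC authentication classes.
public class PluggableAuth {

    public interface TokenAuthenticator {
        String mechanism();                 // e.g. "jwt"
        String authenticate(byte[] token);  // principal name, or null if invalid
    }

    private final Map<String, TokenAuthenticator> registry = new HashMap<>();

    public void register(TokenAuthenticator a) {
        registry.put(a.mechanism(), a);
    }

    // A (hypothetical) RPC layer would call this for each incoming connection.
    public String authenticate(String mechanism, byte[] token) {
        TokenAuthenticator a = registry.get(mechanism);
        if (a == null) {
            throw new IllegalArgumentException("no handler for " + mechanism);
        }
        return a.authenticate(token);
    }

    // Demo wiring: a toy "user:<name>" token scheme, purely illustrative.
    public static String demo(String token) {
        PluggableAuth auth = new PluggableAuth();
        auth.register(new TokenAuthenticator() {
            public String mechanism() { return "demo"; }
            public String authenticate(byte[] t) {
                String s = new String(t, StandardCharsets.UTF_8);
                return s.startsWith("user:") ? s.substring(5) : null;
            }
        });
        return auth.authenticate("demo", token.getBytes(StandardCharsets.UTF_8));
    }
}
```

The point of the sketch is Brian's: once the dispatch interface is agreed, each token format becomes an independently contributable plugin.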
Re: [DISCUSS] Hadoop SSO/Token Server Components
On Wed, Jul 10, 2013 at 6:42 AM, Larry McCay lmc...@hortonworks.com wrote:

All - After combing through this thread - as well as the summit session summary thread - I think that we have the following two items that we can probably move forward with:

1. TokenAuth method - assuming this means the pluggable authentication mechanisms within the RPC layer (2 votes: Kai and Kyle)
2. An actual Hadoop Token format (2 votes: Brian and myself)

I propose that we attack both of these aspects as one. Let's provide the structure and interfaces of the pluggable framework for use in the RPC layer by leveraging Daryn's pluggability work, and POC it with a particular token format (not necessarily the only format ever supported - we just need one to start). If there has already been work done in this area by anyone, then please speak up and commit to providing a patch, so that we don't duplicate effort.

@Daryn - is there a particular Jira or set of Jiras that we can look at to discern the pluggability mechanism details? Documentation
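Item 2 above - "an actual Hadoop Token format" - is easiest to discuss over a concrete shape. The class below is a deliberately toy round-trippable token with an owner, an expiry and an opaque signature field; the name and the newline-delimited Base64 encoding are invented for this sketch (Hadoop's real Token/TokenIdentifier classes use Writable serialization, not this).

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Toy token format for discussion only - not Hadoop's Token classes.
public class ToyToken {
    final String owner;
    final long expiry;        // epoch millis
    final String signature;   // opaque; issued by whatever authority signs it

    ToyToken(String owner, long expiry, String signature) {
        this.owner = owner;
        this.expiry = expiry;
        this.signature = signature;
    }

    // Serialize to a wire-safe string.
    public String encode() {
        String raw = owner + "\n" + expiry + "\n" + signature;
        return Base64.getEncoder().encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    // Parse the wire form back into fields.
    public static ToyToken decode(String s) {
        String raw = new String(Base64.getDecoder().decode(s), StandardCharsets.UTF_8);
        String[] parts = raw.split("\n", 3);
        return new ToyToken(parts[0], Long.parseLong(parts[1]), parts[2]);
    }
}
```

Whatever format is chosen, the framework only needs the encode/decode boundary to be stable; the field set can grow behind it.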
RE: [DISCUSS] Hadoop SSO/Token Server Components
On Jul 26, 2013, at 8:59 PM, Larry McCay lmc...@hortonworks.com wrote:

... There may be some of these things that can be contributed in short order. If we can add some things in without unnecessary complexity for the identified usecases then we should.
@Alejandro - please review this and see whether it satisfies your point for a definition of what we are building.

In addition to the document, which I will paste here as text and attach as a pdf version, we have a couple of patches for components that are identified in the document - specifically, COMP-7 and COMP-8. I will be posting the COMP-8 patch to the HADOOP-9534 JIRA, which was filed specifically for that functionality. COMP-7 is a small set of classes to introduce JsonWebToken as the token format and a basic JsonWebTokenAuthority that can issue and verify these tokens. Since there is no JIRA for this yet, I will likely file a new JIRA
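The issue/verify contract that a JsonWebTokenAuthority-style class implies can be sketched with a plain HMAC over a payload. To be clear, this is not the COMP-7 code from HADOOP-9781 - the class name and the two-part "payload.signature" layout here are illustrative only, and a real JWT additionally carries a JSON header and follows the JWS serialization rules.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch of an issue/verify authority using HMAC-SHA256. Illustrative only;
// not the HADOOP-9781 JsonWebTokenAuthority implementation.
public class ToyAuthority {
    private final byte[] key;

    public ToyAuthority(byte[] key) { this.key = key; }

    // Issue: base64url(payload) + "." + base64url(hmac(payload)).
    public String issue(String payload) throws Exception {
        return b64(payload.getBytes(StandardCharsets.UTF_8)) + "." + b64(hmac(payload));
    }

    // Verify: recompute the signature over the decoded payload and compare.
    public boolean verify(String token) throws Exception {
        String[] parts = token.split("\\.", 2);
        if (parts.length != 2) return false;
        String payload = new String(Base64.getUrlDecoder().decode(parts[0]),
                                    StandardCharsets.UTF_8);
        return b64(hmac(payload)).equals(parts[1]);
    }

    private byte[] hmac(String payload) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        return mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
    }

    private static String b64(byte[] b) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(b);
    }
}
```

The same issue/verify split is what lets the authority live in a central server while verification happens at each service.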
Re: [DISCUSS] Hadoop SSO/Token Server Components
Re: [DISCUSS] Hadoop SSO/Token Server Components
the configured LDAP server and: a. leverages a servlet filter or other authentication mechanism for the endpoint and authenticates the user with a simple LDAP bind with username and password b. acquires a hadoop id_token and uses it to acquire the required hadoop access token which is added as a cookie c. redirects the browser to the original service UI resource via the provided redirect_url 5. WebSSOAuthenticationHandler for the original UI resource interrogates the incoming request again for an authcookie that contains an access token upon finding one: a. validates the incoming token b. returns the AuthenticationToken as per AuthenticationHandler contract c. AuthenticationFilter adds the hadoop auth cookie with the expected token d. serves requested resource for valid tokens e. subsequent requests are handled by the AuthenticationFilter recognition of the hadoop auth cookie USECASE UI-2 Federation/SAML: For the federation usecase: 1. User’s browser requests access to a UI console page 2. WebSSOAuthenticationHandler intercepts the request and redirects the browser to an SP web endpoint exposed by the AuthenticationServer passing the requested url as the redirect_url. This endpoint: a. is dedicated to redirecting to the external IdP passing the required parameters which may include a redirect_url back to itself as well as encoding the original redirect_url so that it can determine it on the way back to the client 3. the IdP: a. challenges the user for credentials and authenticates the user b. creates appropriate token/cookie and redirects back to the AuthenticationServer endpoint 4. AuthenticationServer endpoint: a. extracts the expected token/cookie from the incoming request and validates it b. creates a hadoop id_token c. acquires a hadoop access token for the id_token d. creates appropriate cookie and redirects back to the original redirect_url - being the requested resource 5. 
WebSSOAuthenticationHandler for the original UI resource interrogates the incoming request again for an authcookie that contains an access token upon finding one: a. validates the incoming token b. returns the AuthenticationToken as per AuthenticationHandler contrac c. AuthenticationFilter adds the hadoop auth cookie with the expected token d. serves requested resource for valid tokens e. subsequent requests are handled by the AuthenticationFilter recognition of the hadoop auth cookie REQUIRED COMPONENTS for UI USECASES: COMP-12. WebSSOAuthenticationHandler COMP-13. IdP Web Endpoint within AuthenticationServer for FORM based login COMP-14. SP Web Endpoint within AuthenticationServer for 3rd party token federation On Wed, Jul 10, 2013 at 1:59 PM, Brian Swan brian.s...@microsoft.com wrote: Thanks, Larry. That is what I was trying to say, but you've said it better and in more detail. :-) To extract from what you are saying: If we were to reframe the immediate scope to the lowest common denominator of what is needed for accepting tokens in authentication plugins then we gain... an end-state for the lowest common denominator that enables code patches in the near-term is the best of both worlds. -Brian -Original Message- From: Larry McCay [mailto:lmc...@hortonworks.com] Sent: Wednesday, July 10, 2013 10:40 AM To: common-dev@hadoop.apache.org Cc: da...@yahoo-inc.com; Kai Zheng; Alejandro Abdelnur Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components It seems to me that we can have the best of both worlds here...it's all about the scoping. If we were to reframe the immediate scope to the lowest common denominator of what is needed for accepting tokens in authentication plugins then we gain: 1. a very manageable scope to define and agree upon 2. a deliverable that should be useful in and of itself 3. 
a foundation for community collaboration that we build on for higher level solutions built on this lowest common denominator and experience as a working community So, to Alejandro's point, perhaps we need to define what would make #2 above true - this could serve as the what we are building instead of the how to build it. Including: a. project structure within hadoop-common-project/common-security or the like b. the usecases that would need to be enabled to make it a self contained and useful contribution - without higher level solutions c. the JIRA/s for contributing patches d. what specific patches will be needed to accomplished the usecases in #b In other words, an end-state for the lowest common denominator that enables code patches in the near-term is the best of both worlds. I think this may be a good way to bootstrap the collaboration process for our emerging security community rather than trying to tackle a huge vision all at once. @Alejandro - if you have something else in mind that would bootstrap this process - that would great - please
for administration and end user interactions. These consoles need to also benefit from the pluggability of authentication mechanisms to be on par with the access control of the cluster REST and RPC APIs. Web consoles are protected with a WebSSOAuthenticationHandler, which will be configured for either authentication or federation.

USECASE UI-1 Authentication/LDAP:

For the authentication usecase:

1. User’s browser requests access to a UI console page
2. WebSSOAuthenticationHandler intercepts the request and redirects the browser to an IdP web endpoint exposed by the AuthenticationServer, passing the requested url as the redirect_url
3. IdP web endpoint presents the user with a FORM over https
   a. user provides username/password and submits the FORM
4. AuthenticationServer authenticates the user with the provided credentials against the configured LDAP server and:
   a. leverages a servlet filter or other authentication mechanism for the endpoint and authenticates the user with a simple LDAP bind with username and password
   b. acquires a hadoop id_token and uses it to acquire the required hadoop access token, which is added as a cookie
   c. redirects the browser to the original service UI resource via the provided redirect_url
5. WebSSOAuthenticationHandler for the original UI resource interrogates the incoming request again for an authcookie that contains an access token; upon finding one it:
   a. validates the incoming token
   b. returns the AuthenticationToken as per the AuthenticationHandler contract
   c. AuthenticationFilter adds the hadoop auth cookie with the expected token
   d. serves the requested resource for valid tokens
   e. subsequent requests are handled by the AuthenticationFilter's recognition of the hadoop auth cookie

USECASE UI-2 Federation/SAML:

For the federation usecase:

1. User’s browser requests access to a UI console page
2. WebSSOAuthenticationHandler intercepts the request and redirects the browser to an SP web endpoint exposed by the AuthenticationServer, passing the requested url as the redirect_url. This endpoint:
   a. is dedicated to redirecting to the external IdP, passing the required parameters, which may include a redirect_url back to itself as well as an encoding of the original redirect_url so that it can determine it on the way back to the client
3. The IdP:
   a. challenges the user for credentials and authenticates the user
   b. creates the appropriate token/cookie and redirects back to the AuthenticationServer endpoint
4. AuthenticationServer endpoint:
   a. extracts the expected token/cookie from the incoming request and validates it
   b. creates a hadoop id_token
   c. acquires a hadoop access token for the id_token
   d. creates the appropriate cookie and redirects back to the original redirect_url - being the requested resource
5. WebSSOAuthenticationHandler for the original UI resource interrogates the incoming request again for an authcookie that contains an access token; upon finding one it:
   a. validates the incoming token
   b. returns the AuthenticationToken as per the AuthenticationHandler contract
   c. AuthenticationFilter adds the hadoop auth cookie with the expected token
   d. serves the requested resource for valid tokens
   e. subsequent requests are handled by the AuthenticationFilter's recognition of the hadoop auth cookie

REQUIRED COMPONENTS for UI USECASES:

COMP-12. WebSSOAuthenticationHandler
COMP-13. IdP Web Endpoint within AuthenticationServer for FORM based login
COMP-14. SP Web Endpoint within AuthenticationServer for 3rd party token federation

On Wed, Jul 10, 2013 at 1:59 PM, Brian Swan brian.s...@microsoft.com wrote:

Thanks, Larry. That is what I was trying to say, but you've said it better and in more detail. :-) To extract from what you are saying: "If we were to reframe the immediate scope to the lowest common denominator of what is needed for accepting tokens in authentication plugins then we gain..." "an end-state for the lowest common denominator that enables code patches in the near-term is the best of both worlds." -Brian

-Original Message-
From: Larry McCay [mailto:lmc...@hortonworks.com]
Sent: Wednesday, July 10, 2013 10:40 AM
To: common-dev@hadoop.apache.org
Cc: da...@yahoo-inc.com; Kai Zheng; Alejandro Abdelnur
Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components

It seems to me that we can have the best of both worlds here...it's all about the scoping. If we were to reframe the immediate scope to the lowest common denominator of what is needed for accepting tokens in authentication plugins then we gain:

1. a very manageable scope to define and agree upon
2. a deliverable that should be useful in and of itself
3. a foundation for community collaboration that we build on for higher level solutions built on this lowest common denominator and experience as a working community
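[Editorial aside: steps 2 and 5 of the WebSSO usecases above could be sketched roughly as follows. All names here (WebSSOSketch, AuthResult, the "hadoop-auth" cookie name) are hypothetical illustrations, not the actual hadoop-auth API: the handler either finds and validates an access token in the auth cookie, or redirects the browser to the IdP with the requested url as the redirect_url.]

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the WebSSO handler flow; names are assumptions.
public class WebSSOSketch {

    /** Outcome of interrogating a request: a principal OR a redirect. */
    public static final class AuthResult {
        public final String principal;   // non-null when a valid token was found
        public final String redirectUrl; // non-null when the browser must visit the IdP
        AuthResult(String principal, String redirectUrl) {
            this.principal = principal;
            this.redirectUrl = redirectUrl;
        }
    }

    private final String idpEndpoint;
    private final Map<String, String> tokenToPrincipal; // stand-in for real validation

    public WebSSOSketch(String idpEndpoint, Map<String, String> tokenToPrincipal) {
        this.idpEndpoint = idpEndpoint;
        this.tokenToPrincipal = tokenToPrincipal;
    }

    /**
     * Steps 2 and 5 of each usecase: look for the auth cookie carrying an
     * access token; validate it if present, otherwise redirect to the IdP
     * endpoint passing the requested url as the redirect_url.
     */
    public AuthResult authenticate(Map<String, String> cookies, String requestedUrl) {
        String token = cookies.get("hadoop-auth");
        String principal = (token == null) ? null : tokenToPrincipal.get(token);
        if (principal != null) {
            return new AuthResult(principal, null); // serve the requested resource
        }
        return new AuthResult(null, idpEndpoint + "?redirect_url=" + requestedUrl);
    }

    public static void main(String[] args) {
        Map<String, String> valid = new HashMap<>();
        valid.put("tok-123", "alice");
        WebSSOSketch handler = new WebSSOSketch("https://authserver.example/idp", valid);

        // First request: no cookie yet, so the browser is redirected to the IdP.
        AuthResult first = handler.authenticate(new HashMap<>(), "https://nn.example/ui");
        System.out.println("redirect: " + first.redirectUrl);

        // After login: the cookie carries a valid access token.
        Map<String, String> cookies = new HashMap<>();
        cookies.put("hadoop-auth", "tok-123");
        AuthResult second = handler.authenticate(cookies, "https://nn.example/ui");
        System.out.println("principal: " + second.principal);
    }
}
```

The same handler shape serves both UI-1 and UI-2; only where the redirect points (IdP FORM endpoint vs. SP federation endpoint) differs in configuration.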
So, to Alejandro's point, perhaps we need to define what would make #2 above true - this could serve as the what we are building instead of the how to build it. Including:

a. project structure within hadoop-common-project/common-security or the like
b. the usecases that would need to be enabled to make it a self contained and useful contribution - without higher level solutions
c. the JIRA/s for contributing patches
d. what specific patches will be needed to accomplish the usecases in #b

In other words, an end-state for the lowest common denominator that enables code patches in the near-term is the best of both worlds. I think this may be a good way to bootstrap the collaboration process for our emerging security community rather than trying to tackle a huge vision all at once. @Alejandro - if you have something else in mind that would bootstrap this process - that would be great - please advise. thoughts?

On Jul 10, 2013, at 1:06 PM, Brian Swan brian.s...@microsoft.com wrote:

Hi Alejandro, all- There seems to be agreement on the broad stroke description of the components needed to achieve pluggable token authentication (I'm sure I'll be corrected if that isn't the case).
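[Editorial aside: as a concrete illustration of what "pluggable token authentication" at the RPC layer could mean, here is a minimal sketch. The names (AuthenticationPlugin, PluginRegistry, TokenAuthPlugin) are hypothetical and do not reflect any interface the thread agreed on: mechanisms register under a name, and the RPC layer dispatches credentials to the selected mechanism.]

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hedged sketch of a pluggable RPC-layer authentication mechanism.
public class PluggableAuthSketch {

    /** Contract each mechanism (Kerberos, token, SAML bridge, ...) would fulfill. */
    public interface AuthenticationPlugin {
        String mechanismName();
        /** Returns the authenticated principal, or null on failure. */
        String authenticate(byte[] credentials);
    }

    /** Toy token mechanism: credentials are an opaque token looked up in a map. */
    public static final class TokenAuthPlugin implements AuthenticationPlugin {
        private final Map<String, String> tokenToPrincipal;
        public TokenAuthPlugin(Map<String, String> tokenToPrincipal) {
            this.tokenToPrincipal = tokenToPrincipal;
        }
        public String mechanismName() { return "TOKEN"; }
        public String authenticate(byte[] credentials) {
            return tokenToPrincipal.get(new String(credentials, StandardCharsets.UTF_8));
        }
    }

    /** The RPC layer selects a registered mechanism by name at negotiation time. */
    public static final class PluginRegistry {
        private final Map<String, AuthenticationPlugin> plugins = new HashMap<>();
        public void register(AuthenticationPlugin plugin) {
            plugins.put(plugin.mechanismName(), plugin);
        }
        public String authenticate(String mechanism, byte[] credentials) {
            AuthenticationPlugin plugin = plugins.get(mechanism);
            return plugin == null ? null : plugin.authenticate(credentials);
        }
    }

    public static void main(String[] args) {
        PluginRegistry registry = new PluginRegistry();
        Map<String, String> tokens = new HashMap<>();
        tokens.put("tok-42", "alice");
        registry.register(new TokenAuthPlugin(tokens));
        // The token mechanism resolves the credentials to a principal.
        System.out.println(registry.authenticate("TOKEN",
                "tok-42".getBytes(StandardCharsets.UTF_8)));
    }
}
```

The point of the sketch is the seam, not the mechanism: a new token format only has to provide another AuthenticationPlugin, leaving the RPC layer untouched.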
However, discussion of the details of those components doesn't seem to be moving forward. I think this is because the details are really best understood through code. I also see *a* (i.e. one of many possible) token format and pluggable authentication mechanisms within the RPC layer as components that can have immediate benefit to Hadoop users AND still allow flexibility in the larger design. So, I think the best way to move the conversation of what we are aiming for forward is to start looking at code for these components. I am especially interested in moving forward with pluggable authentication mechanisms within the RPC layer and would love to see what others have done in this area (if anything). Thanks. -Brian
All - After combing through this thread - as well as the summit session summary thread, I think that we have the following two items that we can probably move forward with:

1. TokenAuth method - assuming this means the pluggable authentication mechanisms within the RPC layer (2 votes: Kai and Kyle)
2. An actual Hadoop Token format (2 votes: Brian and myself)

I propose that we attack both of these aspects as one. Let's provide the structure and interfaces of the pluggable framework for use in the RPC layer through leveraging Daryn's pluggability work and POC it with a particular token format (not necessarily the only format ever supported - we just need one to start). If there has already been work done in this area by anyone then please speak up and commit to providing a patch - so that we don't duplicate effort.

@Daryn - is there a particular Jira or set of Jiras that we can look at to discern the pluggability mechanism details? Documentation of it would be great as well.

@Kai - do you have existing code for the pluggable token authentication mechanism - if not, we can take a stab at representing it with interfaces and/or POC code. I can stand up and say that we have a token format that we have been working with already and can provide a patch that represents it as a contribution to test out the pluggable tokenAuth.

These patches will provide progress toward code being the central discussion vehicle. As a community, we can then incrementally build on that foundation in order to collaboratively deliver the common vision. In the absence of any other home for posting such patches, let's assume that they will be attached to HADOOP-9392 - or a dedicated subtask for this particular aspect/s - I will leave that detail to Kai.

@Alejandro, being the only voice on this thread that isn't represented in the votes above, please feel free to agree or disagree with this direction.
thanks, --larry

On Jul 5, 2013, at 3:24 PM, Larry McCay lmc...@hortonworks.com wrote:

Hi Andy -

"Happy Fourth of July to you and yours."

Same to you and yours. :-) We had some fun in the sun for a change - we've had nothing but rain on the east coast lately.

"My concern here is there may have been a misinterpretation or lack of consensus on what is meant by clean slate"

Apparently so. On the pre-summit call, I stated that I was interested in reconciling the jiras so that we had one to work from. You recommended that we set them aside for the time being - with the understanding that work would continue on your side (and ours as well) - and approach the community discussion from a clean slate. We seemed to do this at the summit session quite well. It was my understanding that this community discussion would live beyond the summit and continue on this list. While closing the summit session we agreed to follow up on common-dev with first a summary then a discussion of the moving parts. I never expected the previous work to be abandoned and fully expected it to inform the discussion that happened here. If you would like to reframe what clean slate was supposed to mean or describe what it means now - that would be welcome - before I waste any more time trying to facilitate a community discussion that is apparently not wanted.

"Nowhere in this picture are self appointed master JIRAs and such, which have been disappointing to see crop up, we should be collaboratively coding not planting flags."

I don't know what you mean by self-appointed master JIRAs. It has certainly not been anyone's intention to disappoint. Any mention of a new JIRA was just to have a clear context to gather the agreed upon points - previous and/or existing JIRAs would easily be linked. Planting flags… I need to go back and read my discussion point about the JIRA and see how this is the impression that was made.

"That is not how I define success. The only flags that count is code. What we are lacking is the roadmap on which to put the code. I read Kai's latest document as something approaching today's consensus (or at least a common point of view?) rather than a historical document. Perhaps he and it can be given equal share of the consideration."

I definitely read it as something that has evolved into something approaching what we have been talking about so far. There has not however been enough discussion anywhere near the level of detail in that document and more details are needed for each component in the design. Why the work in that document should not be fed into the community discussion as anyone else's would be - I fail to understand. My suggestion continues to be that you should take that document and speak to the inventory of moving parts as we agreed. As these are agreed upon, we will ensure that the appropriate subtasks are filed against whatever JIRA is to host them - don't really care much which it is. I don't really want to continue with
Larry, all, it is still not clear to me what end state we are aiming for, or that we even agree on that. IMO, instead of trying to agree on what to do, we should first agree on the final state, then we see what should be changed to get there, then we see how we change things to get there. The different documents out there focus more on how. We should not try to say how before we know what. Thx.
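[Editorial aside: the "particular token format" proposed earlier in the thread as a POC starting point could be as small as the sketch below. Every field name and the naive string wire form are illustrative assumptions for discussion only; a real format would be a compact, signed encoding.]

```java
// Illustrative sketch of one possible hadoop id_token shape (one of
// many); not a format the thread agreed on.
public class HadoopTokenSketch {

    /** Minimal identity token: subject, issuer, and an expiry instant. */
    public static final class IdToken {
        public final String principal;
        public final String issuer;
        public final long expiryMillis;

        public IdToken(String principal, String issuer, long expiryMillis) {
            this.principal = principal;
            this.issuer = issuer;
            this.expiryMillis = expiryMillis;
        }

        public boolean isExpired(long nowMillis) {
            return nowMillis >= expiryMillis;
        }

        /** Toy wire form; assumes '|' never occurs in the fields. */
        public String serialize() {
            return principal + "|" + issuer + "|" + expiryMillis;
        }

        public static IdToken deserialize(String wire) {
            String[] parts = wire.split("\\|");
            return new IdToken(parts[0], parts[1], Long.parseLong(parts[2]));
        }
    }

    public static void main(String[] args) {
        IdToken token = new IdToken("alice", "authserver.example", 1000L);
        // Round-trip through the wire form and check expiry.
        IdToken roundTrip = IdToken.deserialize(token.serialize());
        System.out.println(roundTrip.principal + " expired=" + roundTrip.isExpired(2000L));
    }
}
```

Even a toy like this would give the "code as the central discussion vehicle" goal something concrete to review: which fields belong in the token, and how expiry and signing are represented.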
Sorry for falling out of the loop. I'm catching up the jiras and discussion, and will comment this afternoon. Daryn
@Alejandro, being the only voice on this thread that isn't represented in the votes above, please feel free to agree or disagree with this direction. thanks, --larry On Jul 5, 2013, at 3:24 PM, Larry McCay lmc...@hortonworks.com wrote: Hi Andy - Happy Fourth of July to you and yours. Same to you and yours. :-) We had some fun in the sun for a change - we've had nothing but rain on the east coast lately. My concern here is there may have been a misinterpretation or lack of consensus on what is meant by clean slate Apparently so. On the pre-summit call, I stated that I was interested in reconciling the jiras so that we had one to work from. You recommended that we set them aside for the time being - with the understanding that work would continue on your side (and our's as well) - and approach the community discussion from a clean slate. We seemed to do this at the summit session quite well. It was my understanding that this community discussion would live beyond the summit and continue on this list. While closing the summit session we agreed to follow up on common-dev with first a summary then a discussion of the moving parts. I never expected the previous work to be abandoned and fully expected it to inform the discussion that happened here. If you would like to reframe what clean slate was supposed to mean or describe what it means now - that would be welcome - before I waste anymore time trying to facilitate a community discussion that is apparently not wanted. Nowhere in this picture are self appointed master JIRAs and such, which have been disappointing to see crop up, we should be collaboratively coding not planting flags. I don't know what you mean by self-appointed master JIRAs. It has certainly not been anyone's intention to disappoint. Any mention of a new JIRA was just to have a clear context to gather the agreed upon points - previous and/or existing JIRAs would easily be linked. 
Planting flags… I need to go back and read my discussion point about the JIRA and see how this is the impression that was made. That is not how I define success. The only flags that count is code. What we are lacking is the roadmap on which to put the code. I read Kai's latest document as something approaching today's consensus (or at least a common point of view?) rather than a historical document. Perhaps he and it can be given equal share of the consideration. I definitely read it as something that has evolved into something approaching what we have been talking about so far. There has not however been enough discussion anywhere near the level of detail in that document and more details are needed for each component in the design. Why the work in that document should not be fed into the community discussion as anyone else's would be - I fail to understand. My suggestion continues to be that you should take that document and speak to the inventory of moving parts as
RE: [DISCUSS] Hadoop SSO/Token Server Components
Hi Alejandro, all- There seems to be agreement on the broad stroke description of the components needed to achieve pluggable token authentication (I'm sure I'll be corrected if that isn't the case). However, discussion of the details of those components doesn't seem to be moving forward. I think this is because the details are really best understood through code. I also see *a* (i.e. one of many possible) token format and pluggable authentication mechanisms within the RPC layer as components that can have immediate benefit to Hadoop users AND still allow flexibility in the larger design. So, I think the best way to move the conversation of what we are aiming for forward is to start looking at code for these components. I am especially interested in moving forward with pluggable authentication mechanisms within the RPC layer and would love to see what others have done in this area (if anything). Thanks. -Brian -Original Message- From: Alejandro Abdelnur [mailto:t...@cloudera.com] Sent: Wednesday, July 10, 2013 8:15 AM To: Larry McCay Cc: common-dev@hadoop.apache.org; da...@yahoo-inc.com; Kai Zheng Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components Larry, all, It is still not clear to me what end state we are aiming for, or that we even agree on that. IMO, instead of trying to agree on what to do, we should first agree on the final state, then we see what should be changed to get there, then we see how we change things to get there. The different documents out there focus more on the how. We should not try to say how before we know what. Thx. On Wed, Jul 10, 2013 at 6:42 AM, Larry McCay lmc...@hortonworks.com wrote: All - After combing through this thread - as well as the summit session summary thread, I think that we have the following two items that we can probably move forward with: 1. TokenAuth method - assuming this means the pluggable authentication mechanisms within the RPC layer (2 votes: Kai and Kyle) 2.
An actual Hadoop Token format (2 votes: Brian and myself) I propose that we attack both of these aspects as one. Let's provide the structure and interfaces of the pluggable framework for use in the RPC layer through leveraging Daryn's pluggability work and POC it with a particular token format (not necessarily the only format ever supported - we just need one to start). If there has already been work done in this area by anyone then please speak up and commit to providing a patch - so that we don't duplicate effort. @Daryn - is there a particular Jira or set of Jiras that we can look at to discern the pluggability mechanism details? Documentation of it would be great as well. @Kai - do you have existing code for the pluggable token authentication mechanism - if not, we can take a stab at representing it with interfaces and/or POC code. I can standup and say that we have a token format that we have been working with already and can provide a patch that represents it as a contribution to test out the pluggable tokenAuth. These patches will provide progress toward code being the central discussion vehicle. As a community, we can then incrementally build on that foundation in order to collaboratively deliver the common vision. In the absence of any other home for posting such patches, let's assume that they will be attached to HADOOP-9392 - or a dedicated subtask for this particular aspect/s - I will leave that detail to Kai. @Alejandro, being the only voice on this thread that isn't represented in the votes above, please feel free to agree or disagree with this direction. thanks, --larry On Jul 5, 2013, at 3:24 PM, Larry McCay lmc...@hortonworks.com wrote: Hi Andy - Happy Fourth of July to you and yours. Same to you and yours. :-) We had some fun in the sun for a change - we've had nothing but rain on the east coast lately. My concern here is there may have been a misinterpretation or lack of consensus on what is meant by clean slate Apparently so. 
On the pre-summit call, I stated that I was interested in reconciling the jiras so that we had one to work from. You recommended that we set them aside for the time being - with the understanding that work would continue on your side (and our's as well) - and approach the community discussion from a clean slate. We seemed to do this at the summit session quite well. It was my understanding that this community discussion would live beyond the summit and continue on this list. While closing the summit session we agreed to follow up on common-dev with first a summary then a discussion of the moving parts. I never expected the previous work to be abandoned and fully expected it to inform the discussion that happened here. If you would like to reframe what clean slate was supposed to mean or describe what it means now - that would be welcome - before I waste anymore time trying
Re: [DISCUSS] Hadoop SSO/Token Server Components
It seems to me that we can have the best of both worlds here…it's all about the scoping. If we were to reframe the immediate scope to the lowest common denominator of what is needed for accepting tokens in authentication plugins then we gain: 1. a very manageable scope to define and agree upon 2. a deliverable that should be useful in and of itself 3. a foundation for community collaboration that we build on for higher level solutions built on this lowest common denominator and experience as a working community So, to Alejandro's point, perhaps we need to define what would make #2 above true - this could serve as the what we are building instead of the how to build it. Including: a. project structure within hadoop-common-project/common-security or the like b. the usecases that would need to be enabled to make it a self contained and useful contribution - without higher level solutions c. the JIRA/s for contributing patches d. what specific patches will be needed to accomplish the usecases in #b In other words, an end-state for the lowest common denominator that enables code patches in the near-term is the best of both worlds. I think this may be a good way to bootstrap the collaboration process for our emerging security community rather than trying to tackle a huge vision all at once. @Alejandro - if you have something else in mind that would bootstrap this process - that would be great - please advise. thoughts? On Jul 10, 2013, at 1:06 PM, Brian Swan brian.s...@microsoft.com wrote: Hi Alejandro, all- There seems to be agreement on the broad stroke description of the components needed to achieve pluggable token authentication (I'm sure I'll be corrected if that isn't the case). However, discussion of the details of those components doesn't seem to be moving forward. I think this is because the details are really best understood through code. I also see *a* (i.e.
one of many possible) token format and pluggable authentication mechanisms within the RPC layer as components that can have immediate benefit to Hadoop users AND still allow flexibility in the larger design. So, I think the best way to move the conversation of what we are aiming for forward is to start looking at code for these components. I am especially interested in moving forward with pluggable authentication mechanisms within the RPC layer and would love to see what others have done in this area (if anything). Thanks. -Brian -Original Message- From: Alejandro Abdelnur [mailto:t...@cloudera.com] Sent: Wednesday, July 10, 2013 8:15 AM To: Larry McCay Cc: common-dev@hadoop.apache.org; da...@yahoo-inc.com; Kai Zheng Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components Larry, all, Still is not clear to me what is the end state we are aiming for, or that we even agree on that. IMO, Instead trying to agree what to do, we should first agree on the final state, then we see what should be changed to there there, then we see how we change things to get there. The different documents out there focus more on how. We not try to say how before we know what. Thx. On Wed, Jul 10, 2013 at 6:42 AM, Larry McCay lmc...@hortonworks.com wrote: All - After combing through this thread - as well as the summit session summary thread, I think that we have the following two items that we can probably move forward with: 1. TokenAuth method - assuming this means the pluggable authentication mechanisms within the RPC layer (2 votes: Kai and Kyle) 2. An actual Hadoop Token format (2 votes: Brian and myself) I propose that we attack both of these aspects as one. Let's provide the structure and interfaces of the pluggable framework for use in the RPC layer through leveraging Daryn's pluggability work and POC it with a particular token format (not necessarily the only format ever supported - we just need one to start). 
If there has already been work done in this area by anyone then please speak up and commit to providing a patch - so that we don't duplicate effort. @Daryn - is there a particular Jira or set of Jiras that we can look at to discern the pluggability mechanism details? Documentation of it would be great as well. @Kai - do you have existing code for the pluggable token authentication mechanism - if not, we can take a stab at representing it with interfaces and/or POC code. I can standup and say that we have a token format that we have been working with already and can provide a patch that represents it as a contribution to test out the pluggable tokenAuth. These patches will provide progress toward code being the central discussion vehicle. As a community, we can then incrementally build on that foundation in order to collaboratively deliver the common vision. In the absence of any other home for posting such patches, let's assume that they will be attached to HADOOP-9392 - or a dedicated subtask
RE: [DISCUSS] Hadoop SSO/Token Server Components
Thanks, Larry. That is what I was trying to say, but you've said it better and in more detail. :-) To extract from what you are saying: If we were to reframe the immediate scope to the lowest common denominator of what is needed for accepting tokens in authentication plugins then we gain... an end-state for the lowest common denominator that enables code patches in the near-term is the best of both worlds. -Brian -Original Message- From: Larry McCay [mailto:lmc...@hortonworks.com] Sent: Wednesday, July 10, 2013 10:40 AM To: common-dev@hadoop.apache.org Cc: da...@yahoo-inc.com; Kai Zheng; Alejandro Abdelnur Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components It seems to me that we can have the best of both worlds here...it's all about the scoping. If we were to reframe the immediate scope to the lowest common denominator of what is needed for accepting tokens in authentication plugins then we gain: 1. a very manageable scope to define and agree upon 2. a deliverable that should be useful in and of itself 3. a foundation for community collaboration that we build on for higher level solutions built on this lowest common denominator and experience as a working community So, to Alejandro's point, perhaps we need to define what would make #2 above true - this could serve as the what we are building instead of the how to build it. Including: a. project structure within hadoop-common-project/common-security or the like b. the usecases that would need to be enabled to make it a self contained and useful contribution - without higher level solutions c. the JIRA/s for contributing patches d. what specific patches will be needed to accomplished the usecases in #b In other words, an end-state for the lowest common denominator that enables code patches in the near-term is the best of both worlds. I think this may be a good way to bootstrap the collaboration process for our emerging security community rather than trying to tackle a huge vision all at once. 
@Alejandro - if you have something else in mind that would bootstrap this process - that would great - please advise. thoughts? On Jul 10, 2013, at 1:06 PM, Brian Swan brian.s...@microsoft.com wrote: Hi Alejandro, all- There seems to be agreement on the broad stroke description of the components needed to achieve pluggable token authentication (I'm sure I'll be corrected if that isn't the case). However, discussion of the details of those components doesn't seem to be moving forward. I think this is because the details are really best understood through code. I also see *a* (i.e. one of many possible) token format and pluggable authentication mechanisms within the RPC layer as components that can have immediate benefit to Hadoop users AND still allow flexibility in the larger design. So, I think the best way to move the conversation of what we are aiming for forward is to start looking at code for these components. I am especially interested in moving forward with pluggable authentication mechanisms within the RPC layer and would love to see what others have done in this area (if anything). Thanks. -Brian -Original Message- From: Alejandro Abdelnur [mailto:t...@cloudera.com] Sent: Wednesday, July 10, 2013 8:15 AM To: Larry McCay Cc: common-dev@hadoop.apache.org; da...@yahoo-inc.com; Kai Zheng Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components Larry, all, Still is not clear to me what is the end state we are aiming for, or that we even agree on that. IMO, Instead trying to agree what to do, we should first agree on the final state, then we see what should be changed to there there, then we see how we change things to get there. The different documents out there focus more on how. We not try to say how before we know what. Thx. 
On Wed, Jul 10, 2013 at 6:42 AM, Larry McCay lmc...@hortonworks.com wrote: All - After combing through this thread - as well as the summit session summary thread, I think that we have the following two items that we can probably move forward with: 1. TokenAuth method - assuming this means the pluggable authentication mechanisms within the RPC layer (2 votes: Kai and Kyle) 2. An actual Hadoop Token format (2 votes: Brian and myself) I propose that we attack both of these aspects as one. Let's provide the structure and interfaces of the pluggable framework for use in the RPC layer through leveraging Daryn's pluggability work and POC it with a particular token format (not necessarily the only format ever supported - we just need one to start). If there has already been work done in this area by anyone then please speak up and commit to providing a patch - so that we don't duplicate effort. @Daryn - is there a particular Jira or set of Jiras that we can look at to discern the pluggability mechanism details? Documentation of it would be great as well. @Kai - do you have existing
RE: [DISCUSS] Hadoop SSO/Token Server Components
Hi Alejandro, Thanks for your summary and points. I have no corrections, just some updates from our side for further discussion. we should make sure that UserGroupInformation and RPC security logic work with a pluggable GSS implementation. Right. I'm working on implementing a token authn method in the current Hadoop RPC and SASL framework, and changing the UGI class. Create a common security component ie 'hadoop-security' to be 'the' security lib for all projects to use. Sure, we will put our code for the new AuthN AuthZ frameworks into the 'hadoop-security' component for the ecosystem. I guess this component should be a collection of related projects and it's in line with hadoop-common, right? As we might agree, the key to all of these is to implement the token authentication method for client to service to start with. Hopefully I can finish and provide my working code as a patch for the discussion. Thanks and regards, Kai -Original Message- From: Alejandro Abdelnur [mailto:t...@cloudera.com] Sent: Friday, July 05, 2013 4:09 AM To: common-dev@hadoop.apache.org Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components Leaving JIRAs and design docs aside, my recollection from the f2f lounge discussion could be summarized as: -- 1* Decouple users-services authentication from (intra) services-services authentication. The main motivation for this is to get pluggable authentication and integrated SSO experience for users. (we never discussed if this is needed for external-apps talking with Hadoop) 2* We should leave the Hadoop delegation tokens alone No need to make this pluggable as this is an internal authentication mechanism after the 'real' authentication happened. (this is independent from factoring out all classes we currently have into a common implementation for Hadoop and other projects to use) 3* Being able to replace kerberos with something else for (intra) services-services authentication.
It was suggested that to support deployments where stock Kerberos may not be an option (i.e. cloud) we should make sure that UserGroupInformation and RPC security logic work with a pluggable GSS implementation. 4* Create a common security component ie 'hadoop-security' to be 'the' security lib for all projects to use. Create a component/project that would provide the common security pieces for all projects to use. -- If we agree with this, after any necessary corrections, I think we could distill clear goals from it and start from there. Thanks. Tucu Alejandro On Thu, Jul 4, 2013 at 11:40 AM, Andrew Purtell apurt...@apache.org wrote: Hi Larry (and all), Happy Fourth of July to you and yours. In our shop Kai and Tianyou are already doing the coding, so I'd defer to them on the detailed points. My concern here is there may have been a misinterpretation or lack of consensus on what is meant by clean slate. Hopefully that can be quickly cleared up. Certainly we did not mean to ignore all that came before. The idea was to reset discussions to find common ground and new direction where we are working together, not in conflict, on an agreed upon set of design points and tasks. There's been a lot of good discussion and design preceding that we should figure out how to port over. Nowhere in this picture are self appointed master JIRAs and such, which have been disappointing to see crop up, we should be collaboratively coding not planting flags. I read Kai's latest document as something approaching today's consensus (or at least a common point of view?) rather than a historical document. Perhaps he and it can be given equal share of the consideration. On Wednesday, July 3, 2013, Larry McCay wrote: Hey Andrew - I largely agree with that statement. My intention was to let the differences be worked out within the individual components once they were identified and subtasks created.
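[Editor's note: to make the "pluggable authentication mechanisms within the RPC layer" idea discussed in this thread concrete, here is a minimal sketch. Every name in it (AuthenticationMethod, TokenAuthMethod, the registry) is invented for illustration; this is not the actual Hadoop UserGroupInformation, SASL, or GSS API, only the general shape of a registry of swappable authentication methods.]

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface: each authentication mechanism (kerberos, token,
// plain, ...) would plug in behind a common contract like this.
interface AuthenticationMethod {
    String name();
    boolean authenticate(String principal, String credential);
}

// Toy token-based method: authenticates a principal whose token was issued.
class TokenAuthMethod implements AuthenticationMethod {
    private final Map<String, String> issuedTokens = new HashMap<>();
    public String name() { return "token"; }
    public void issueToken(String principal, String token) {
        issuedTokens.put(principal, token);
    }
    public boolean authenticate(String principal, String credential) {
        return credential != null && credential.equals(issuedTokens.get(principal));
    }
}

// Stand-in for the RPC layer's mechanism lookup by negotiated method name.
public class PluggableAuthDemo {
    private static final Map<String, AuthenticationMethod> REGISTRY = new HashMap<>();

    static void register(AuthenticationMethod m) { REGISTRY.put(m.name(), m); }

    public static void main(String[] args) {
        TokenAuthMethod token = new TokenAuthMethod();
        token.issueToken("alice", "t-123");
        register(token);

        AuthenticationMethod m = REGISTRY.get("token");
        System.out.println(m.authenticate("alice", "t-123")); // valid token
        System.out.println(m.authenticate("alice", "wrong")); // rejected
    }
}
```

The point of the sketch is only that the RPC layer would select a mechanism by name and delegate to it, so new methods can be added without changing the RPC code itself.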
My reference to HSSO was really referring to a SSO *server* based design which was not clearly articulated in the earlier documents. We aren't trying to compare and contrast one design over another anymore. Let's move this collaboration along as we've mapped out and the differences in the details will reveal themselves and be addressed within their components. I've actually been looking forward to you weighing in on the actual discussion points in this thread. Could you do that? At this point, I am most interested in your thoughts on a single jira to represent all of this work and whether we should start discussing the SSO Tokens. If you think there are discussion points missing from that list, feel free to add to it. thanks, --larry On Jul 3, 2013, at 7:35 PM, Andrew Purtell apurt...@apache.org wrote: Hi Larry, Of course I'll let Kai speak for himself. However, let me point out that, while the differences between the competing JIRAs have been reduced
Re: [DISCUSS] Hadoop SSO/Token Server Components
Hi Alejandro - I missed your #4 in my summary and takeaways of the session in another thread on this list. I believe that the points of discussion were along the lines of: * put common security libraries into common much the same way as hadoop-auth is today making each available as separate maven modules to be used across the ecosystem * there was a concern raised that we need to be cognizant of not using common as a dumping ground - I believe this to mean that we need to ensure that the libraries that are added there are truly cross cutting and can be used by the other projects across Hadoop - I think that security related things will largely be of that nature but we need to keep it in mind I'm not sure whether #3 is represented in the other summary or not… There were certainly discussions around the emerging work from Daryn related to pluggable authentication mechanisms within that layer and we will immediately have the options of kerberos, simple and plain. There was also talk of how this can be leveraged to introduce a Hadoop token mechanism as well. At the same time, there was talk of the possibility of simply making kerberos easy and a non-issue for intra-cluster use. Certainly we need both of these approaches. I believe someone used ApacheDS' KDC support as an example - if we could stand up an ApacheDS based KDC and configure it and related keytabs easily then the end-to-end story is more palatable to a broader user base. That story being the choice of authentication mechanisms for user authentication and easy provisioning and management of kerberos for intra-cluster service authentication. If you agree with this extended summary then I can update the other thread with that recollection. Thanks for providing it!
--larry On Jul 4, 2013, at 4:09 PM, Alejandro Abdelnur t...@cloudera.com wrote: Leaving JIRAs and design docs aside, my recollection from the f2f lounge discussion could be summarized as: -- 1* Decouple users-services authentication from (intra) services-services authentication. The main motivation for this is to get pluggable authentication and integrated SSO experience for users. (we never discussed if this is needed for external-apps talking with Hadoop) 2* We should leave the Hadoop delegation tokens alone No need to make this pluggable as this is an internal authentication mechanism after the 'real' authentication happened. (this is independent from factoring out all classes we currently have into a common implementation for Hadoop and other projects to use) 3* Being able to replace kerberos with something else for (intra) services-services authentication. It was suggested that to support deployments where stock Kerberos may not be an option (i.e. cloud) we should make sure that UserGroupInformation and RPC security logic work with a pluggable GSS implementation. 4* Create a common security component ie 'hadoop-security' to be 'the' security lib for all projects to use. Create a component/project that would provide the common security pieces for all projects to use. -- If we agree with this, after any necessary corrections, I think we could distill clear goals from it and start from there. Thanks. Tucu Alejandro On Thu, Jul 4, 2013 at 11:40 AM, Andrew Purtell apurt...@apache.org wrote: Hi Larry (and all), Happy Fourth of July to you and yours. In our shop Kai and Tianyou are already doing the coding, so I'd defer to them on the detailed points. My concern here is there may have been a misinterpretation or lack of consensus on what is meant by clean slate. Hopefully that can be quickly cleared up. Certainly we did not mean ignore all that came before. 
The idea was to reset discussions to find common ground and new direction where we are working together, not in conflict, on an agreed upon set of design points and tasks. There's been a lot of good discussion and design preceeding that we should figure out how to port over. Nowhere in this picture are self appointed master JIRAs and such, which have been disappointing to see crop up, we should be collaboratively coding not planting flags. I read Kai's latest document as something approaching today's consensus (or at least a common point of view?) rather than a historical document. Perhaps he and it can be given equal share of the consideration. On Wednesday, July 3, 2013, Larry McCay wrote: Hey Andrew - I largely agree with that statement. My intention was to let the differences be worked out within the individual components once they were identified and subtasks created. My reference to HSSO was really referring to a SSO *server* based design which was not clearly articulated in the earlier documents. We aren't trying to compare and contrast one design over another anymore. Let's move this collaboration along as we've mapped out and the differences in the details will reveal
Re: [DISCUSS] Hadoop SSO/Token Server Components
there was any discussion of abandoning the current JIRAs which track a lot of good input from others in the community and are important for us to consider as we move forward with the work. Recommend we continue to move forward with the two JIRAs that we have already been respectively working on, as well as other JIRAs that others in the community continue to work on. Your latest design revision actually makes it clear that you are now targeting exactly what was described as HSSO - so comparing and contrasting is not going to add any value. That is not my understanding. As Kai has pointed out in response to your comment on HADOOP-9392, a lot of these updates predate last week's discussion at the summit. Fortunately the discussion at the summit was in line with our thinking on the required revisions from discussing with others in the community prior to the summit. Our updated design doc clearly addresses the authorization and proxy flow which are important for users. HSSO can continue to be layered on top of TAS via federation. Personally, I think that continuing the separation of 9533 and 9392 will do this effort a disservice. There doesn't seem to be enough differences between the two to justify separate jiras anymore. Actually I see many key differences between 9392 and 9533. Andrew and Kai have also pointed out there are key differences when comparing 9392 and 9533. Please review the design doc we have uploaded to understand the differences. I am sure Kai will also add more details about the differences between these JIRAs. The work proposed by us on 9392 addresses additional user needs beyond what 9533 proposes to implement. We should figure out some of the implementation specifics for those JIRAs so both of us can keep moving on the code without colliding. Kai has also recommended the same as his preference in response to your comment on 9392. Let's work that out as a community of peers so we can all agree on an approach to move forward collaboratively.
Thanks, Tianyou -Original Message- From: Larry McCay [mailto:lmc...@hortonworks.com] Sent: Thursday, July 04, 2013 4:10 AM To: Zheng, Kai Cc: common-dev@hadoop.apache.org Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components Hi Kai - I think that I need to clarify something... This is not an update for 9533 but a continuation of the discussions that are focused on a fresh look at a SSO for Hadoop. We've agreed to leave our previous designs behind and therefore we aren't really seeing it as an HSSO layered on top of TAS approach or an HSSO vs TAS discussion. Your latest design revision actually makes it clear that you are now targeting exactly what was described as HSSO - so comparing and contrasting is not going to add any value. What we need you to do at this point, is to look at those high-level components described on this thread and comment on whether we need additional components or any that are listed that don't seem necessary to you and why. In other words, we need to define and agree on the work that has to be done. We also need to determine those components that need to be done before anything else can be started. I happen to agree with Brian that #4 Hadoop SSO Tokens are central to all the other components and should probably be defined and POC'd in short order. Personally, I think that continuing the separation of 9533 and 9392 will do this effort a disservice. There doesn't seem to be enough differences between the two to justify separate jiras anymore. It may be best to file a new one that reflects a single vision without the extra cruft that has built up in either of the existing ones. We would certainly reference the existing ones within the new one. This approach would align with the spirit of the discussions up to this point. I am prepared to start a discussion around the shape of the two Hadoop SSO tokens: identity and access. If this is what others feel the next topic should be. 
If we can identify a jira home for it, we can do it there - otherwise we can create another DISCUSS thread for it. thanks, --larry On Jul 3, 2013, at 2:39 PM, Zheng, Kai kai.zh...@intel.com wrote: Hi Larry, Thanks for the update. Good to see that with this update we are now aligned on most points. I have also updated our TokenAuth design in HADOOP-9392. The new revision incorporates feedback and suggestions in related discussion with the community, particularly from Microsoft and others attending the Security design lounge session at the Hadoop summit. Summary of the changes: 1.Revised the approach to now use two tokens, Identity Token plus Access Token, particularly considering our authorization framework and compatibility with HSSO; 2.Introduced Authorization Server (AS) from our authorization framework into the flow that issues access tokens for clients with identity tokens to access
RE: [DISCUSS] Hadoop SSO/Token Server Components
Hi Larry,

Our design, from its first revision, focuses on and provides comprehensive support for pluggable authentication mechanisms based on a common token, trying to address single sign-on issues across the ecosystem to support access to Hadoop services via RPC, REST, and the web browser SSO flow. The updated design doc adds more text and flow diagrams to explain and illustrate these existing items in detail, as requested by some on the JIRA.

In addition to the identity token we had proposed, we adopted an access token and adapted the approach, not only for the sake of making TokenAuth compatible with HSSO, but also for better support of fine-grained access control and seamless integration with our authorization framework, and even third-party authorization services like an OAuth Authorization Server. We regard these as important because Hadoop is evolving into an enterprise and cloud platform that needs a complete authN and authZ solution, and without this support we would need future rework to complete the solution.

Since you asked about the differences between TokenAuth and HSSO, here are some key ones:

TokenAuth supports TAS federation to allow clients to access multiple clusters without a centralized SSO server, while HSSO provides a centralized SSO server for multiple clusters.

TokenAuth integrates an authorization framework with auditing support in order to provide a complete solution for enterprise data access security. This allows administrators to administer security policies centrally and have the policies be enforced consistently across components in the ecosystem in a pluggable way that supports different authorization models like RBAC, ABAC, and even XACML standards.

TokenAuth targets support for domain-based authN/authZ to allow multi-tenant deployments. Authentication and authorization rules can be configured and enforced per domain, which allows organizations to manage their individual policies separately while sharing a common large pool of resources.
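[Editor's note: the two-token model described above - an identity token produced by authentication, then exchanged for per-service access tokens - can be sketched roughly as below. This is an illustrative sketch only; the signing scheme and every name in it (issue_identity_token, issue_access_token, the payload fields) are assumptions for illustration, not actual TokenAuth code.]

```python
# Illustrative two-token sketch: a signed identity token from the token
# authority (TAS) is exchanged for a service-scoped access token.
# All names and the HMAC signing scheme are hypothetical, not Hadoop code.
import base64
import hashlib
import hmac
import json
import time

SECRET = b"tas-signing-key"  # stand-in for the token authority's signing key

def sign(payload: dict) -> str:
    # Serialize deterministically, then append an HMAC over the body.
    body = base64.urlsafe_b64encode(json.dumps(payload, sort_keys=True).encode())
    mac = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return body.decode() + "." + mac

def verify(token: str) -> dict:
    # Check the HMAC in constant time, then recover the claims.
    body, mac = token.rsplit(".", 1)
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(mac, expected):
        raise ValueError("bad token signature")
    return json.loads(base64.urlsafe_b64decode(body))

def issue_identity_token(user: str) -> str:
    # TAS authenticates the user via some pluggable mechanism, then issues
    # an identity token representing the authentication event.
    return sign({"type": "identity", "sub": user, "iat": int(time.time())})

def issue_access_token(identity_token: str, service: str) -> str:
    # The Authorization Server checks policy for the verified identity and
    # issues an access token scoped to a single service.
    ident = verify(identity_token)
    assert ident["type"] == "identity"
    return sign({"type": "access", "sub": ident["sub"], "aud": service})

# Flow: authenticate once, then obtain per-service access tokens.
idt = issue_identity_token("alice")
act = issue_access_token(idt, "hdfs")
claims = verify(act)
```

The service-scoped `aud` field is what lets a single sign-on event fan out to many clusters or services without re-collecting credentials.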
TokenAuth addresses the proxy/impersonation case with the flow Tianyou mentioned, where a service can proxy a client to access another service in a secured and constrained way.

Regarding token-based authentication plus SSO and the unified authorization framework, let's continue to use HADOOP-9392 and HADOOP-9466 as umbrella JIRAs for these efforts. HSSO targets support for a centralized SSO server for multiple clusters and, as we have pointed out before, is a nice subset of the work proposed on HADOOP-9392. Let's align these two JIRAs and address the question Kevin raised multiple times in the 9392/9533 JIRAs: "How can HSSO and TAS work together? What is the relationship?" The design update I provided was meant to provide the necessary details so we can nail down that relationship and collaborate on the implementation of these JIRAs. As you have also confirmed, this design aligns with related community discussions, so let's continue our collaborative effort to contribute code to these JIRAs.

Regards,

Kai
Re: [DISCUSS] Hadoop SSO/Token Server Components
Leaving JIRAs and design docs aside, my recollection from the f2f lounge discussion could be summarized as:

--
1* Decouple user-to-service authentication from (intra) service-to-service authentication. The main motivation for this is to get pluggable authentication and an integrated SSO experience for users. (We never discussed whether this is needed for external apps talking with Hadoop.)

2* We should leave the Hadoop delegation tokens alone. No need to make these pluggable, as they are an internal authentication mechanism used after the 'real' authentication happened. (This is independent from factoring out all the classes we currently have into a common implementation for Hadoop and other projects to use.)

3* Be able to replace Kerberos with something else for (intra) service-to-service authentication. It was suggested that, to support deployments where stock Kerberos may not be an option (i.e. cloud), we should make sure that UserGroupInformation and the RPC security logic work with a pluggable GSS implementation.

4* Create a common security component, i.e. 'hadoop-security', to be 'the' security lib for all projects to use. Create a component/project that would provide the common security pieces for all projects to use.
--

If we agree with this, after any necessary corrections, I think we could distill clear goals from it and start from there.

Thanks.

Tucu (Alejandro)

On Thu, Jul 4, 2013 at 11:40 AM, Andrew Purtell apurt...@apache.org wrote:

Hi Larry (and all),

Happy Fourth of July to you and yours. In our shop Kai and Tianyou are already doing the coding, so I'd defer to them on the detailed points. My concern here is there may have been a misinterpretation or lack of consensus on what is meant by "clean slate". Hopefully that can be quickly cleared up. Certainly we did not mean "ignore all that came before". The idea was to reset discussions to find common ground and a new direction where we are working together, not in conflict, on an agreed upon set of design points and tasks.
There's been a lot of good discussion and design preceding that we should figure out how to port over. Nowhere in this picture are self-appointed "master JIRAs" and such, which have been disappointing to see crop up; we should be collaboratively coding, not planting flags. I read Kai's latest document as something approaching today's consensus (or at least a common point of view?) rather than a historical document. Perhaps he and it can be given equal share of the consideration.

On Wednesday, July 3, 2013, Larry McCay wrote:

Hey Andrew -

I largely agree with that statement. My intention was to let the differences be worked out within the individual components once they were identified and subtasks created. My reference to HSSO was really referring to an SSO *server* based design which was not clearly articulated in the earlier documents. We aren't trying to compare and contrast one design over another anymore. Let's move this collaboration along as we've mapped out, and the differences in the details will reveal themselves and be addressed within their components.

I've actually been looking forward to you weighing in on the actual discussion points in this thread. Could you do that? At this point, I am most interested in your thoughts on a single jira to represent all of this work and whether we should start discussing the SSO Tokens. If you think there are discussion points missing from that list, feel free to add to it.

thanks,

--larry

On Jul 3, 2013, at 7:35 PM, Andrew Purtell apurt...@apache.org wrote:

Hi Larry,

Of course I'll let Kai speak for himself. However, let me point out that, while the differences between the competing JIRAs have been reduced for sure, there were some key differences that didn't just disappear. Subsequent discussion will make that clear. I also disagree with your characterization that we have simply endorsed all of the design decisions of the so-called HSSO; this is taking a mile from an inch.
We are here to engage in a collaborative process as peers. I've been encouraged by the spirit of the discussions up to this point and hope that can continue beyond one design summit.
RE: [DISCUSS] Hadoop SSO/Token Server Components
Thanks, Larry, for starting this conversation (and thanks for the great Summit meeting summary you sent out a couple of days ago). To weigh in on your specific discussion points (and renumber them :-))...

1. Are there additional components that would be required for a Hadoop SSO service? Not that I can see.

2. Should any of the above described components be considered not actually necessary or poorly described? I think this will be determined as we get into the details of each component. What you've described here is certainly an excellent starting point.

3. Should we create a new umbrella Jira to identify each of these as a subtask?

4. Should we just continue to use 9533 for the SSO server and add additional subtasks? What is described here seems to fit with 9533, though 9533 may contain some details that need further discussion. IMHO, it may be better to file a new umbrella Jira, though I'm not 100% convinced of that. Would be very interested in input from others.

5. What are the natural seams of separation between these components and any dependencies between one and another that affect priority? Is 4 the right place to start? (4. Hadoop SSO Tokens: the exact shape and form of the sso tokens...) It seemed in some 1:1 conversations after the Summit meeting that others may agree with this. Would like to hear if that is the case more broadly.

-Brian

-Original Message-
From: Larry McCay [mailto:lmc...@hortonworks.com]
Sent: Tuesday, July 2, 2013 1:04 PM
To: common-dev@hadoop.apache.org
Subject: [DISCUSS] Hadoop SSO/Token Server Components

All -

As a follow up to the discussions that were had during Hadoop Summit, I would like to introduce the discussion topic around the moving parts of a Hadoop SSO/Token Service. There are a couple of related Jira's that can be referenced and may or may not be updated as a result of this discuss thread.
https://issues.apache.org/jira/browse/HADOOP-9533
https://issues.apache.org/jira/browse/HADOOP-9392

As the first aspect of the discussion, we should probably state the overall goals and scoping for this effort:

* An alternative authentication mechanism to Kerberos for user authentication
* A broader capability for integration into enterprise identity and SSO solutions
* Possibly the advertisement/negotiation of available authentication mechanisms
* Backward compatibility for the existing use of Kerberos
* No (or minimal) changes to existing Hadoop tokens (delegation, job, block access, etc)
* Pluggable authentication mechanisms across: RPC, REST and webui enforcement points
* Continued support for existing authorization policy/ACLs, etc
* Keeping more fine grained authorization policies in mind - like attribute based access control - fine grained access control is a separate but related effort that we must not preclude with this effort
* Cross cluster SSO

In order to tease out the moving parts, here are a couple of high-level and simplified descriptions of SSO interaction flow:

                         +------+
+------+  1 credentials  | SSO  |
|CLIENT|---------------->|SERVER|
+------+     :tokens     +------+
    |
  2 | access token
    | :requested resource
    V
+-------+
|HADOOP |
|SERVICE|
+-------+

The above diagram represents the simplest interaction model for an SSO service in Hadoop.

1. client authenticates to the SSO service and acquires an access token
   a. client presents credentials to an authentication service endpoint exposed by the SSO server (AS) and receives a token representing the authentication event and verified identity
   b. client then presents the identity token from 1.a. to the token endpoint exposed by the SSO server (TGS) to request an access token to a particular Hadoop service and receives an access token
2. client presents the Hadoop access token to the Hadoop service for which the access token has been granted and requests the desired resource or services
   a. access token is presented as appropriate for the service endpoint protocol being used
   b. Hadoop service token validation handler validates the token and verifies its integrity and the identity of the issuer

+-----+
| IdP |
+-----+
    ^
  1 | credentials
    | :idp_token
    |
+------+  2 idp_token    +------+
|CLIENT|---------------->| SSO  |
+------+     :tokens     |SERVER|
    |                    +------+
  3 | access token
    | :requested resource
    V
+-------+
|HADOOP |
|SERVICE|
+-------+

The above diagram represents a slightly more complicated interaction model for an SSO service in Hadoop that removes Hadoop from the credential collection business.

1. client authenticates to a trusted identity provider within the enterprise and acquires an IdP specific token
   a. client presents
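[Editor's note: the simplest flow above - credentials to the AS endpoint (1.a), identity token to the TGS endpoint for a service-scoped access token (1.b), access token validated by the service's token validation handler (2.b) - can be made concrete with a minimal sketch. Every class and method name here is a hypothetical illustration, not a real Hadoop API.]

```python
# Minimal sketch of the SSO interaction flow described in the diagram.
# SSOServer stands in for the AS/TGS endpoints; HadoopService stands in
# for any service that consults the token validation handler.

class SSOServer:
    def __init__(self):
        self._users = {"alice": "secret"}  # credential store stand-in
        self._issued = set()               # tokens this server has issued

    # AS endpoint (step 1.a): credentials -> identity token
    def authenticate(self, user, password):
        if self._users.get(user) != password:
            raise PermissionError("authentication failed")
        token = ("identity", user)
        self._issued.add(token)
        return token

    # TGS endpoint (step 1.b): identity token -> access token for one service
    def grant_access(self, identity_token, service):
        if identity_token not in self._issued:
            raise PermissionError("unknown identity token")
        token = ("access", identity_token[1], service)
        self._issued.add(token)
        return token

    # Token validation handler consulted by the service (step 2.b)
    def validate(self, access_token, service):
        return access_token in self._issued and access_token[2] == service

class HadoopService:
    def __init__(self, name, sso):
        self.name, self.sso = name, sso

    # Step 2: access token presented along with the resource request
    def request(self, access_token, resource):
        if not self.sso.validate(access_token, self.name):
            raise PermissionError("invalid access token")
        return f"{resource} served to {access_token[1]}"

sso = SSOServer()
hdfs = HadoopService("hdfs", sso)
idt = sso.authenticate("alice", "secret")   # 1.a
act = sso.grant_access(idt, "hdfs")         # 1.b
result = hdfs.request(act, "/data/file")    # 2
```

Note that the access token is scoped to one service: the same identity token can be exchanged repeatedly for tokens to other services, which is what makes the second (IdP-fronted) diagram a drop-in extension of this flow.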
RE: [DISCUSS] Hadoop SSO/Token Server Components
Hi Larry,

Thanks for the update. Good to see that with this update we are now aligned on most points. I have also updated our TokenAuth design in HADOOP-9392. The new revision incorporates feedback and suggestions from related discussion with the community, particularly from Microsoft and others attending the Security design lounge session at the Hadoop summit.

Summary of the changes:

1. Revised the approach to now use two tokens, Identity Token plus Access Token, particularly considering our authorization framework and compatibility with HSSO;
2. Introduced the Authorization Server (AS) from our authorization framework into the flow that issues access tokens for clients with identity tokens to access services;
3. Refined the proxy access token and the proxy/impersonation flow;
4. Refined the browser web SSO flow regarding access to Hadoop web services;
5. Added the Hadoop RPC access flow regarding CLI clients accessing Hadoop services via RPC/SASL;
6. Added a client authentication integration flow to illustrate how desktop logins can be integrated into the authentication process to TAS to exchange the identity token;
7. Introduced the fine-grained access control flow from the authorization framework; I have put it in the appendices section for reference;
8. Added a detailed flow to illustrate Hadoop Simple authentication over TokenAuth, in the appendices section;
9. Added the secured task launcher in the appendices as a possible solution for the Windows platform;
10. Moved low-level content, and less relevant parts, from the main body into the appendices section.

As we all think about how to layer HSSO on TAS in the TokenAuth framework, please take some time to look at the doc and then let's discuss the gaps we might have. I would like to discuss these gaps with a focus on implementation details so we are all moving towards getting code done. Let's continue this part of the discussion in HADOOP-9392 to allow for better tracking on the JIRA itself.
For discussions related to the centralized SSO server, I suggest we continue to use HADOOP-9533 to consolidate all discussion related to that JIRA. That way we don't need extra umbrella JIRAs. I agree we should speed up these discussions and agree on some of the implementation specifics so both of us can get moving on the code while not stepping on each other in our work. Look forward to your comments and comments from others in the community. Thanks.

Regards,

Kai
Re: [DISCUSS] Hadoop SSO/Token Server Components
Thanks, Brian! Look at that - the power of collaboration - the numbering is correct already! ;-)

I am inclined to agree that we should start with the Hadoop SSO Tokens, and am leaning toward a new jira that leaves behind the cruft, but I don't feel very strongly about it being new. I do feel, especially given Kai's new document, that we have only one.
Re: [DISCUSS] Hadoop SSO/Token Server Components
not stepping on each other in our work. Look forward to your comments and comments from others in the community. Thanks.

Regards,
Kai

-Original Message-
From: Larry McCay [mailto:lmc...@hortonworks.com]
Sent: Wednesday, July 03, 2013 4:04 AM
To: common-dev@hadoop.apache.org
Subject: [DISCUSS] Hadoop SSO/Token Server Components
RE: [DISCUSS] Hadoop SSO/Token Server Components
Hi Larry,

I participated in the design discussion at Hadoop Summit. I do not remember there being any discussion of abandoning the current JIRAs, which track a lot of good input from others in the community that is important for us to consider as we move forward with the work. I recommend we continue to move forward with the two JIRAs that we have already been respectively working on, as well as other JIRAs that others in the community continue to work on.

> Your latest design revision actually makes it clear that you are now targeting exactly what was described as HSSO - so comparing and contrasting is not going to add any value.

That is not my understanding. As Kai has pointed out in response to your comment on HADOOP-9392, a lot of these updates predate last week's discussion at the summit. Fortunately the discussion at the summit was in line with our thinking on the required revisions from discussing with others in the community prior to the summit. Our updated design doc clearly addresses the authorization and proxy flow, which are important for users. HSSO can continue to be layered on top of TAS via federation.

> Personally, I think that continuing the separation of 9533 and 9392 will do this effort a disservice. There doesn't seem to be enough differences between the two to justify separate jiras anymore.

Actually I see many key differences between 9392 and 9533. Andrew and Kai have also pointed out that there are key differences when comparing 9392 and 9533. Please review the design doc we have uploaded to understand the differences. I am sure Kai will also add more details about the differences between these JIRAs. The work proposed by us on 9392 addresses additional user needs beyond what 9533 proposes to implement. We should figure out some of the implementation specifics for those JIRAs so both of us can keep moving on the code without colliding. Kai has also recommended the same as his preference in response to your comment on 9392.
Let's work that out as a community of peers so we can all agree on an approach to move forward collaboratively.

Thanks,
Tianyou

-Original Message-
From: Larry McCay [mailto:lmc...@hortonworks.com]
Sent: Thursday, July 04, 2013 4:10 AM
To: Zheng, Kai
Cc: common-dev@hadoop.apache.org
Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components

Hi Kai -

I think that I need to clarify something... This is not an update for 9533 but a continuation of the discussions that are focused on a fresh look at a SSO for Hadoop. We've agreed to leave our previous designs behind and therefore we aren't really seeing it as an HSSO layered on top of TAS approach or an HSSO vs TAS discussion.

Your latest design revision actually makes it clear that you are now targeting exactly what was described as HSSO - so comparing and contrasting is not going to add any value.

What we need you to do at this point is to look at those high-level components described on this thread and comment on whether we need additional components or any that are listed that don't seem necessary to you and why. In other words, we need to define and agree on the work that has to be done. We also need to determine those components that need to be done before anything else can be started. I happen to agree with Brian that #4 Hadoop SSO Tokens are central to all the other components and should probably be defined and POC'd in short order.

Personally, I think that continuing the separation of 9533 and 9392 will do this effort a disservice. There doesn't seem to be enough differences between the two to justify separate jiras anymore. It may be best to file a new one that reflects a single vision without the extra cruft that has built up in either of the existing ones. We would certainly reference the existing ones within the new one. This approach would align with the spirit of the discussions up to this point.

I am prepared to start a discussion around the shape of the two Hadoop SSO tokens: identity and access.
If this is what others feel the next topic should be and we can identify a jira home for it, we can do it there - otherwise we can create another DISCUSS thread for it.

thanks,
--larry

On Jul 3, 2013, at 2:39 PM, Zheng, Kai kai.zh...@intel.com wrote:

Hi Larry,

Thanks for the update. Good to see that with this update we are now aligned on most points. I have also updated our TokenAuth design in HADOOP-9392. The new revision incorporates feedback and suggestions in related discussion with the community, particularly from Microsoft and others attending the Security design lounge session at the Hadoop summit. Summary of the changes:

1. Revised the approach to now use two tokens, Identity Token plus Access Token, particularly considering our authorization framework and compatibility with HSSO;
2. Introduced Authorization Server (AS) from our authorization framework into the flow that issues access tokens for clients with identity
[DISCUSS] Hadoop SSO/Token Server Components
All - As a follow up to the discussions that were had during Hadoop Summit, I would like to introduce the discussion topic around the moving parts of a Hadoop SSO/Token Service. There are a couple of related Jira's that can be referenced and may or may not be updated as a result of this discuss thread.

https://issues.apache.org/jira/browse/HADOOP-9533
https://issues.apache.org/jira/browse/HADOOP-9392

As the first aspect of the discussion, we should probably state the overall goals and scoping for this effort:

* An alternative authentication mechanism to Kerberos for user authentication
* A broader capability for integration into enterprise identity and SSO solutions
* Possibly the advertisement/negotiation of available authentication mechanisms
* Backward compatibility for the existing use of Kerberos
* No (or minimal) changes to existing Hadoop tokens (delegation, job, block access, etc)
* Pluggable authentication mechanisms across: RPC, REST and webui enforcement points
* Continued support for existing authorization policy/ACLs, etc
* Keeping more fine grained authorization policies in mind - like attribute based access control - fine grained access control is a separate but related effort that we must not preclude with this effort
* Cross cluster SSO

In order to tease out the moving parts here are a couple high level and simplified descriptions of SSO interaction flow:

    +------+   1 credentials   +------+
    |CLIENT| ----------------> | SSO  |
    +------+      :tokens      |SERVER|
        |                      +------+
      2 | access token
        V :requested resource
    +-------+
    |HADOOP |
    |SERVICE|
    +-------+

The above diagram represents the simplest interaction model for an SSO service in Hadoop.

1. client authenticates to SSO service and acquires an access token
   a. client presents credentials to an authentication service endpoint exposed by the SSO server (AS) and receives a token representing the authentication event and verified identity
   b. client then presents the identity token from 1.a. to the token endpoint exposed by the SSO server (TGS) to request an access token to a particular Hadoop service and receives an access token
2. client presents the Hadoop access token to the Hadoop service for which the access token has been granted and requests the desired resource or services
   a. access token is presented as appropriate for the service endpoint protocol being used
   b. Hadoop service token validation handler validates the token and verifies its integrity and the identity of the issuer

                 +------+
                 | IdP  |
                 +------+
              1 ^ credentials
                | :idp_token
                |
    +------+   2 idp_token   +------+
    |CLIENT| --------------> | SSO  |
    +------+     :tokens     |SERVER|
        |                    +------+
      3 | access token
        V :requested resource
    +-------+
    |HADOOP |
    |SERVICE|
    +-------+

The above diagram represents a slightly more complicated interaction model for an SSO service in Hadoop that removes Hadoop from the credential collection business.

1. client authenticates to a trusted identity provider within the enterprise and acquires an IdP specific token
   a. client presents credentials to an enterprise IdP and receives a token representing the authentication identity
2. client authenticates to SSO service and acquires an access token
   a. client presents idp_token to an authentication service endpoint exposed by the SSO server (AS) and receives a token representing the authentication event and verified identity
   b. client then presents the identity token from 2.a. to the token endpoint exposed by the SSO server (TGS) to request an access token to a particular Hadoop service and receives an access token
3. client presents the Hadoop access token to the Hadoop service for which the access token has been granted and requests the desired resource or services
   a. access token is presented as appropriate for the service endpoint protocol being used
   b. Hadoop service token validation handler validates the token and verifies its integrity and the identity of the issuer

Considering the above set of goals and high level interaction flow description, we can start to discuss the component inventory required to accomplish this vision:

1. SSO Server Instance: this component must be able to expose endpoints for both authentication of users by collecting and validating credentials and federation of identities represented by tokens from trusted IdPs within the enterprise. The endpoints should be composable so as to allow for multifactor authentication mechanisms. They will also need to return tokens that represent the authentication event and verified identity as well as access tokens for specific Hadoop services.
2. Authentication