Oh, nice! I think it might have been better to leverage the existing principal mapping implementation, if possible. Unfortunately, it doesn't seem to have been made available in the abstract bases yet.
I could actually see the regex.lookup as a slightly different beast than
principal mapping, although we could debate whether both are needed. It
seems like:

1. regex.output/lookup are a means of establishing/normalizing the
   authenticated principal
2. principal.mapping would be a means of mapping the authenticated
   principal to the effective principal

Given the current regex implementation, would we be able to take
[email protected] and map that to "guest"? I guess if we started with a
completely different regex.input then we could easily do that.

Another consideration would be to standardize on the regex mechanism for
doing principal mapping - which would mean moving some or all of that into
the abstract base for identity assertion providers.

On Mon, Jan 18, 2016 at 10:18 AM, Kevin Minder <[email protected]> wrote:

> WRT Option 2 below: The regex identity assertion mapper can already do
> what is described. Given the configuration below it will turn
> [email protected] into somebody_USA. The {[2]} takes the value
> from the second matching group and looks it up in the lookup table.
>
> <provider>
>     <role>identity-assertion</role>
>     <name>Regex</name>
>     <enabled>true</enabled>
>     <param>
>         <name>input</name>
>         <value>(.*)@(.*?)\..*</value>
>     </param>
>     <param>
>         <name>output</name>
>         <value>{1}_{[2]}</value>
>     </param>
>     <param>
>         <name>lookup</name>
>         <value>us=USA;ca=CANADA</value>
>     </param>
> </provider>
>
> On 1/16/16, 11:10 AM, "larry mccay" <[email protected]> wrote:
>
> >All -
> >
> >The pac4j provider contribution was committed yesterday and we are on
> >track for our 0.8.0 release. Note that the docs are still being massaged
> >a bit and will end up in the new 0.8.0 users guide book soon.
> >
> >In the meantime, I'd like to start a discussion wrt the requirements for
> >identity assertion functionality in order to have full usecase coverage
> >for our new authentication/federation mechanisms.
> >
> >A bit of background first...
> >
> >Some of the external provider integrations that are enabled by the pac4j
> >provider:
> >
> >1. result in a PrimaryPrincipal that is actually an id rather than a
> >username that could be used directly within the hadoop cluster.
> >2. some also allow you to configure the user profile attribute to be
> >returned as the subject - such as SAML (okta). So, we could at least
> >sometimes have it be an email address.
> >3. others result in an actual username as the PrimaryPrincipal
> >4. It is extremely likely that none of these PrimaryPrincipals will
> >actually line up with an enterprise username that can be used within the
> >cluster.
> >
> >Existing identity assertion providers:
> >
> >1. pseudo/default identity assertion - we have the ability to use
> >principal mapping to map a numeric id/email or whatever to an acceptable
> >username for hadoop. However, all users that would access hadoop through
> >a topology configured for pac4j would need to have their principal
> >mappings defined within the topology. Not a very scalable or manageable
> >approach. The topology itself would likely end up being huge and the
> >topologies would need to be sync'd up across all Knox instances in the
> >deployment.
> >2. regex identity assertion provider - this provider would be able to take
> >something like an email address PrimaryPrincipal and extract a username
> >from that. In some cases, like okta, this may be the proper username for
> >companies that use okta as a hosted SSO solution. There are no additional
> >principal mapping capabilities however.
> >
> >So, questions/options for 0.8.0 release:
> >
> >Option 1. Is static principal mapping within a topology using the
> >pseudo/default identity assertion provider sufficient for the first
> >release that has support for these external providers?
> >
> >Option 2. Do we need to add principal mapping capabilities to the regex
> >provider to allow for the extraction of a username AND subsequently
> >mapping that to another username?
> >
> >Option 3. Should we create a new identity asserter that does a look up in
> >LDAP for mapping an id or email address to the username/CN? A more dynamic
> >assertion provider like this would certainly be better for scalability and
> >management but at the same time would require a change to LDAP schemas for
> >things like twitter id. Email address may not require a schema change but
> >would require the email address from the external provider to match that
> >within the corporate LDAP.
> >
> >Option 4. Should we consider a central mapping storage identity assertion
> >provider that would interrogate some KnoxSSO specific mechanism? We could
> >look at a mapping of PrimaryPrincipal to DN from LDAP, to corporate email
> >address or directly to username. This would require some separate
> >registration or user sync mechanism to populate this central store and
> >likely couple the mappings to a particular user store like LDAP in some
> >way. It will also introduce a new wrinkle or consideration for Knox
> >upgrades having actual user data to migrate, etc. For the central store we
> >could consider:
> > a. file in HDFS
> > b. embedded HBase
> > c. Hive
> > d. RDBMS
> > e. LDAP
> >
> >Personally, I lean toward the following:
> >
> >* Option 1 from above for 0.8.0 release introduces the pac4j provider with
> >static principal mapping using pseudo/default assertion provider and
> >possibly add support for principal mapping to the regex provider (Option 2)
> >for additional flexibility.
> >
> >* Option 3 and/or 4 from above for a follow up release/s when we can
> >determine the exact design for the central store and user
> >sync/registration mechanism that would best meet the community needs and
> >be sure to put the time into the upgrade/migration considerations.
> >
> >Thoughts?
> >
> >thanks,
> >
> >--larry
>
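
P.S. To make the input/output/lookup behavior Kevin describes above a bit more concrete, here is a rough Python sketch of how I read it. This is only an illustration, not the provider's actual code: the function names (parse_lookup, apply_template) are made up, the address somebody@us.example.com is a hypothetical stand-in, and the fallbacks (return the principal unchanged when the input regex does not match, keep the raw group value when the lookup table has no entry) are my assumptions rather than confirmed provider behavior.

```python
import re

# Matches the two token forms used in the output template:
#   {[n]} -> value of group n, passed through the lookup table
#   {n}   -> value of group n, used as-is
_TOKEN = re.compile(r"\{(?:\[(\d+)\]|(\d+))\}")


def parse_lookup(spec):
    """Parse a lookup string such as 'us=USA;ca=CANADA' into a dict."""
    table = {}
    for pair in spec.split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            table[key.strip()] = value.strip()
    return table


def apply_template(principal, input_pattern, output_template, lookup):
    """Rewrite a principal using an input regex, output template and lookup table."""
    match = re.fullmatch(input_pattern, principal)
    if match is None:
        # Assumption: a principal that does not match passes through unchanged.
        return principal

    def expand(token):
        looked_up, plain = token.group(1), token.group(2)
        if plain is not None:
            return match.group(int(plain))
        value = match.group(int(looked_up))
        # Assumption: keep the raw group value when there is no lookup entry.
        return lookup.get(value, value)

    return _TOKEN.sub(expand, output_template)


if __name__ == "__main__":
    lookup = parse_lookup("us=USA;ca=CANADA")
    # Hypothetical address; same input/output/lookup values as the quoted config.
    print(apply_template("somebody@us.example.com",
                         r"(.*)@(.*?)\..*", "{1}_{[2]}", lookup))
```

With the values from the quoted configuration this prints somebody_USA. It also shows the "guest" case I asked about: an output template with no tokens at all (output = guest) collapses every matching principal to guest, which is why a completely different regex.input/output pair could do that mapping without a separate principal.mapping step.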
