On Wed, May 20, 2020 at 1:47 PM Eric Yang <eric...@gmail.com> wrote:
>
>> > Kerberos was developed a decade before web development became popular.
>> > There are some Kerberos limitations that do not work well in Hadoop.  A
>> > few examples of corner cases:
>>
>> Microsoft Active Directory, which is extensively used in many organizations,
>> is based on Kerberos.
>
>
> True, but with the rise of Google and AWS, OIDC seems to be a formidable
> standard that can replace Kerberos for authentication.  I think providing an
> option for the new standard is good for Hadoop.
>

I think you are referring to OAuth2, and adoption varies significantly
across vendors. When one refers to Kerberos, it is mostly about MIT
Kerberos or Microsoft Active Directory. But OAuth2 is a specification;
implementations vary and are quite prone to bugs. I would be very
careful about making a generic claim like "formidable standard".

AWS services, at least in the context of data processing / analytics,
do not support OAuth2; it is more of a GCP thing. AWS uses signed
requests [1].

[1] https://docs.aws.amazon.com/general/latest/gr/signature-version-4.html
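
For reference, SigV4 derives a scoped signing key from the long-lived
secret via an HMAC-SHA256 chain and then signs each request with it. A
minimal sketch of the documented key derivation (helper names are mine,
not an AWS SDK API):

    import javax.crypto.Mac;
    import javax.crypto.spec.SecretKeySpec;
    import java.nio.charset.StandardCharsets;

    public final class SigV4Key {
        static byte[] hmacSha256(byte[] key, String data) throws Exception {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(data.getBytes(StandardCharsets.UTF_8));
        }

        // date is yyyyMMdd; the key is scoped to date/region/service.
        static byte[] signingKey(String secret, String date, String region,
                                 String service) throws Exception {
            byte[] kDate = hmacSha256(("AWS4" + secret).getBytes(StandardCharsets.UTF_8), date);
            byte[] kRegion = hmacSha256(kDate, region);
            byte[] kService = hmacSha256(kRegion, service);
            return hmacSha256(kService, "aws4_request");
        }
    }

The point being: authentication is per-request and the key material is
scoped and short-lived, which is a different model from ticket-based
Kerberos.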

>>
>> > 1. A Kerberos principal doesn't encode a port number, so it is difficult
>> > to know whether a principal is coming from an authorized daemon or from a
>> > rogue container trying to forge a service principal.
>>
>> Clients use ephemeral ports. Not sure what the relevance of this
>> statement is.
>
> Hint: CVE-2020-9492
>

It's a reserved one. You can help the conversation by describing a threat model.

>> > 2. Hadoop Kerberos principals are used as highly privileged principals, a
>> > form of credential to impersonate end users.
>>
>> Principals are identities of the user. You can make identities fully
>> qualified, to include the issuing authority, if you want to. This is not
>> Kerberos specific.
>>
>> Remember, Kerberos is an authentication mechanism. How those assertions
>> are translated to authorization rules is application specific.
>>
>> It is probably worth reconsidering alternatives to auth_to_local rules.
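
For anyone following along, the auth_to_local mapping being discussed
looks roughly like this; a minimal sketch using the hadoop-auth
KerberosName utility (realm and principal names are made up):

    import org.apache.hadoop.security.authentication.util.KerberosName;

    public class AuthToLocalDemo {
        public static void main(String[] args) throws Exception {
            // First rule maps any NameNode service principal in EXAMPLE.COM
            // to the local user "hdfs"; DEFAULT strips the realm otherwise.
            KerberosName.setRules(
                "RULE:[2:$1/$2@$0](nn/.*@EXAMPLE\\.COM)s/.*/hdfs/\n" +
                "DEFAULT");
            System.out.println(
                new KerberosName("nn/namenode.example.com@EXAMPLE.COM")
                    .getShortName()); // prints "hdfs"
        }
    }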
>
>
> Trust must be validated.  Hadoop Kerberos principals for services that can
> perform impersonation are equal to root power.  Transporting root power
> securely without interception is quite difficult when services are running
> as root instead of as daemons.  An alternate solution is to always forward a
> signed end-user token, so there is no need to validate the proxy user's
> credential.  The downside of forwarding signed tokens is that it is
> difficult to forward multiple tokens of incompatible security mechanisms,
> because the renewal mechanism and expiration time may not be decipherable by
> the transport mechanism.  This is the reason that using an SSO token is a
> good way to ensure every library and framework abides by the same security
> practice, eliminating confused-deputy problems.

Trust of what? Service principals should not be used for
authentication in a client context; they are there for server
identification.

OAuth2 (which the OIDC flow is based on) suggests JWTs, which are
signed tokens. Can you elaborate on what you mean by "SSO token"?
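
To make the "signed token" part concrete: at each hop the receiving
service can verify the token offline against the issuer's public key
before trusting any claim. A minimal sketch, using the Nimbus JOSE+JWT
library as an example (the library choice and how the key is
distributed are assumptions, not an existing Hadoop API):

    import com.nimbusds.jose.crypto.RSASSAVerifier;
    import com.nimbusds.jwt.JWTClaimsSet;
    import com.nimbusds.jwt.SignedJWT;
    import java.security.interfaces.RSAPublicKey;
    import java.util.Date;

    public final class JwtCheck {
        // Returns the verified subject, or throws if the token is bad.
        static String verifiedSubject(String token, RSAPublicKey issuerKey)
                throws Exception {
            SignedJWT jwt = SignedJWT.parse(token);
            if (!jwt.verify(new RSASSAVerifier(issuerKey))) {
                throw new SecurityException("signature verification failed");
            }
            JWTClaimsSet claims = jwt.getJWTClaimsSet();
            Date exp = claims.getExpirationTime();
            if (exp == null || exp.before(new Date())) {
                throw new SecurityException("token expired");
            }
            return claims.getSubject();
        }
    }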

To improve security for doAs use cases, add context to the calls. Just
replacing Kerberos with a different authentication mechanism is not
going to solve the problem.

And how to improve proxy-user use cases varies by application. Asserting
an 'on-behalf-of' action when there is an active client on the other end
(e.g. an HDFS proxy) is different from one that is initiated on a
schedule, e.g. Oozie.
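
For context, today's proxy-user flow looks roughly like this; a minimal
sketch (the "alice" user and path are made up):

    import java.security.PrivilegedExceptionAction;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.security.UserGroupInformation;

    public class ProxyUserDemo {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // The service authenticates as itself (e.g. from its keytab) ...
            UserGroupInformation service = UserGroupInformation.getLoginUser();
            // ... then acts on behalf of the end user. The NameNode authorizes
            // this against hadoop.proxyuser.<service>.hosts/groups.
            UserGroupInformation proxy =
                UserGroupInformation.createProxyUser("alice", service);
            proxy.doAs((PrivilegedExceptionAction<Boolean>) () ->
                FileSystem.get(conf).exists(new Path("/user/alice")));
        }
    }

Adding context means attaching to that call evidence of why the service
is acting for alice, not merely that it is allowed to.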


>>
>> > 3. Delegation tokens may allow expired users to continue to run jobs long
>> > after they are gone, without rechecking whether the end user's credentials
>> > are still valid.
>>
>> Delegation tokens are a Hadoop-specific implementation whose lifecycle is
>> outside the scope of Kerberos. Hadoop (NN/RM) can periodically check the
>> respective IdP policy and revoke tokens, or have a central token
>> management service, similar to the KMS.
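
To sketch what "revoke" means in practice today (the renewer principal
and the revocation trigger here are illustrative):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.security.Credentials;
    import org.apache.hadoop.security.token.Token;

    public class TokenLifecycle {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            Credentials creds = new Credentials();
            // Acquire HDFS delegation tokens, naming the RM as renewer.
            Token<?>[] tokens =
                fs.addDelegationTokens("rm/host.example.com@EXAMPLE.COM", creds);
            for (Token<?> t : tokens) {
                // Renewal extends the lifetime, up to the max lifetime.
                System.out.println("new expiry: " + t.renew(conf));
                // Cancellation revokes immediately, e.g. when a periodic
                // IdP policy check says the user is gone.
                t.cancel(conf);
            }
        }
    }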
>>
>> > 4. Passing different forms of tokens does not work well with cloud
>> > provider security mechanisms.  For example, when passing an AWS STS token
>> > for an S3 bucket, there is no renewal mechanism, nor a good way to
>> > identify when the token will expire.
>>
>> This is outside the scope of Kerberos.
>>
>> Assuming you are using YARN, making the RM handle S3 temporary
>> credentials, similar to HDFS delegation tokens, is something to consider.
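
For what it's worth, S3A can already consume STS session credentials,
but nothing renews them, which is point 4 above. A minimal sketch (the
values are placeholders):

    import org.apache.hadoop.conf.Configuration;

    public class S3aSessionCreds {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            conf.set("fs.s3a.aws.credentials.provider",
                "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider");
            conf.set("fs.s3a.access.key", "<sts-access-key>");
            conf.set("fs.s3a.secret.key", "<sts-secret-key>");
            conf.set("fs.s3a.session.token", "<sts-session-token>");
            // S3A will not renew these; the job must finish, or the
            // credentials must be re-issued, before they expire.
        }
    }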
>>
>> > There are companies that work on bridging security mechanisms of
>> > different types, but this is not a primary goal for Hadoop.  Hadoop can
>> > benefit from modernized security using open standards like OpenID
>> > Connect, which proposes to unify web applications using SSO.  This
>> > ensures the client credentials are transported at each stage of
>> > client-server interaction.  This may improve overall security and provide
>> > a more cloud-native form factor.  I wonder if there is any interest in
>> > the community in enabling Hadoop OpenID Connect integration work?
>>
>> End-to-end identity assertion is something Kerberos by itself does not
>> address. But any implementation should not pass "credentials"; it needs a
>> way to pass signed requests that can be verified along the chain.
>
>
> We agree on this, and OIDC seems like a good option to pass signed requests
> and verify the signed token.
>
>>
>> >
>> > regards,
>> > Eric
