Hi Eric,

Thanks for starting this discussion.
> Kerberos was developed decade before web development becomes popular.
> There are some Kerberos limitations which does not work well in Hadoop.

Sure, Kerberos was developed long before the web, but it was selected as the de facto authentication mechanism in Hadoop after the internet boom. And it was selected for a reason: it is one of the strongest symmetric-key-based authentication mechanisms out there, and it does not transmit the password in plain text. Kerberos has been around for decades and has stood the test of time.

> Microsoft Active Directory, which is extensively used in many
> organizations, is based on Kerberos.

+1 to this. And the fact that Microsoft has put Active Directory in Azure too tells me that AD (and therefore Kerberos) is not going away any time soon.

Overall, I agree with Rajive and Craig on this topic. Paving the way for OpenID Connect in Hadoop is a good idea, but seeing it as a replacement for Kerberos needs to be carefully thought out. The problems described in the original mail are not really Kerberos issues. Yes, we do understand that making Kerberos work *in the right way* is always an uphill task (I'm a long-time Kerberos+Hadoop support engineer), but that shouldn't be the reason to replace it. For reference, a minimal keytab-login sketch is at the end of this mail.

> Hint: CVE-2020-9492

Btw, CVE-2020-9492 is not accessible right now in the CVE database; maybe it is not yet public.

On Thu, May 21, 2020 at 9:22 AM Steve Loughran <ste...@cloudera.com.invalid> wrote:

> On Wed, 6 May 2020 at 23:32, Eric Yang <eric...@gmail.com> wrote:
>
> > Hi all,
> >
> > 4. Passing different form of tokens does not work well with cloud provider
> > security mechanism. For example, passing AWS sts token for S3 bucket.
> > There is no renewal mechanism, nor good way to identify when the token
> > would expire.
>
> well, HADOOP-14556 does it fairly well, supporting session and role tokens.
> We even know when they expire because we ask for a duration when we request
> the session/role creds.
> See org.apache.hadoop.fs.s3a.auth.delegation.AbstractS3ATokenIdentifier for
> the core of what we marshall, including encryption secrets.
>
> The main issue there is that Yarn can't refresh those tokens because a new
> triple of session credentials are required; currently token renewal assumes
> the token is unchanged and a request is made to the service to update their
> table of issued tokens. But even if the RM could get back a new token from
> a refresh call, we are left with the problem of "how to get an updated set
> of creds to each process"
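
PS: here is the keytab-login sketch promised above. This is a minimal, hedged example, assuming Hadoop 3.x on the classpath; the principal name and keytab path are placeholders, and a real deployment still needs krb5.conf, SPNEGO, auth_to_local rules, and the rest of the uphill climb:

    // Minimal programmatic Kerberos login through Hadoop's UGI API.
    // The password never travels in plain text: the keytab holds the
    // symmetric key, and authentication is a KDC ticket exchange.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.security.UserGroupInformation;

    public class KerberosLoginSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("hadoop.security.authentication", "kerberos");
        UserGroupInformation.setConfiguration(conf);
        // Placeholder principal and keytab path.
        UserGroupInformation.loginUserFromKeytab(
            "svc/host.example.com@EXAMPLE.COM",
            "/etc/security/keytabs/svc.keytab");
        System.out.println("Logged in as "
            + UserGroupInformation.getLoginUser());
      }
    }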
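
To ground Steve's HADOOP-14556 point with something concrete, here is a rough sketch of how a client switches on the S3A session-token binding and collects a delegation token whose expiry is known up front, because we asked for the duration. The binding class ships in hadoop-aws; the bucket name, renewer, and duration are placeholders, and I'm quoting the config keys from memory, so treat this as a sketch rather than gospel:

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.security.token.Token;

    public class S3ASessionTokenSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Issue STS session-token delegation tokens instead of shipping
        // long-lived account secrets with the job.
        conf.set("fs.s3a.delegation.token.binding",
            "org.apache.hadoop.fs.s3a.auth.delegation.SessionTokenBinding");
        // Requested session lifetime; since we choose it, we also know
        // when the marshalled credentials will expire.
        conf.set("fs.s3a.assumed.role.session.duration", "1h");

        FileSystem fs = FileSystem.get(new URI("s3a://my-bucket/"), conf);
        // The token identifier (an AbstractS3ATokenIdentifier subclass)
        // carries the session credential triple and encryption secrets.
        Token<?> token = fs.getDelegationToken("yarn");
        System.out.println("Issued token of kind " + token.getKind());
      }
    }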
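
And the renewal mismatch Steve describes is visible in the API itself: Token.renew() hands back only a new expiry time, never new credential material, so there is no channel through which a renewer could push a fresh STS triple out to every running process. A tiny illustration of that signature constraint:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.security.token.Token;

    public class RenewalMismatchSketch {
      // Stock renewal path: Token.renew() dispatches to the TokenRenewer
      // registered for the token's kind. Only a long (the new expiry)
      // comes back; the identifier and password bytes already distributed
      // to each container are untouched, which is exactly why a changed
      // session-credential triple cannot ride along on a renew call.
      static long renewInPlace(Token<?> token, Configuration conf)
          throws Exception {
        return token.renew(conf);
      }
    }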