Hi Steve,

Thanks a lot for such an elaborate email (though it raised more questions than answers, but that's because I'm new to YARN in general and Kerberos/tokens/tickets in particular).
Thanks also for liking my notes. I'm very honoured to hear it from you. I value your work with Spark/YARN/Hadoop.

I'm going to spend some time on security stuff and Kerberos is on my list (to learn why YARN could be a better option than Mesos). I'll ping you when I'm ready for review. Thanks.

Regards,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


On Wed, Aug 24, 2016 at 11:28 AM, Steve Loughran <ste...@hortonworks.com> wrote:
>
>> On 23 Aug 2016, at 11:26, Jacek Laskowski <ja...@japila.pl> wrote:
>>
>> Hi Steve,
>>
>> Could you share your opinion on whether the token gets renewed or not?
>> Is the token going to expire after 7 days anyway?
>
>
> There's Hadoop service tokens, and Kerberos tickets. They are similar-ish,
> but not quite the same.
>
> -Kerberos "tickets" expire, you need to re-authenticate with a keytab or
> user+password
> -Hadoop "Tokens" are more anonymous. A kerberos authenticated application has
> to talk to the service to ask for a token (i.e. it uses a kerberos ticket to
> say "I need a token for operation X for Y hours").
> -There are protocols for renewing tokens up to a time limit; can be done over
> IPC mechanisms, or REST APIs using SASL
>
> I get a bit mixed up myself, and use "tickets and tokens" to allow myself to
> get away with mistakes
>
> Things about kerberos you didn't want to know but will end up discovering in
> stack traces anyway
>
> webinar:
> http://hortonworks.com/webinar/hadoop-and-kerberos-the-madness-beyond-the-gate/
>
> and
>
> https://steveloughran.gitbooks.io/kerberos_and_hadoop/content/
>
> YARN apps can run for a couple of days renewing tokens, but eventually the
> time limit on token renewal is reached; they need to use a kerberos ticket to
> request new tokens.
> If something times out after 7 days, I would guess that it's Kerberos ticket
> expiry; a keytab needs to be passed to Spark for it to do the renewal
>
> The current YARN docs on this:
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnApplicationSecurity.md
>
>
>>
>> Why is the change in
>> the recent version for token renewal? See
>> https://github.com/apache/spark/commit/ab648c0004cfb20d53554ab333dd2d198cb94ffa
>>
>
>
> That's designed to make it easy for a kerberos-authenticated client to get
> tokens for more services. Before: hard coded support for HDFS, HBase, Hive.
> After: anything which implements the same interface. This includes multiple
> HBase servers, more than one Hive metastore, etc. It also stops the spark
> client code needing lots of one-off classes, allows people to add their own
> token fetching code for their own services.
>
>> Regards,
>> Jacek Laskowski
>> ----
>> https://medium.com/@jaceklaskowski/
>> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
>> Follow me at https://twitter.com/jaceklaskowski
>
>
> Like your e-book BTW
>
> If you plan to add a specific section of Spark & Kerberos, I'd gladly help
> review it.
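
To make the "renew up to a time limit" point in the quoted explanation concrete, here is a minimal toy model in Python. It is only a sketch and not Hadoop's actual implementation: the class `Token`, the helper `first_failed_renewal`, and the 24-hour renew interval and 7-day maximum lifetime are all assumptions chosen for illustration. It shows a token that can be renewed repeatedly, but never past a fixed maximum date, after which a fresh token has to be requested (which, per the thread, needs a Kerberos ticket).

```python
from dataclasses import dataclass

HOUR = 3600  # seconds


@dataclass
class Token:
    """Hypothetical stand-in for a renewable service token (times in seconds)."""
    issued_at: int
    renew_interval: int   # how far each renewal pushes the expiry (assumed 24h)
    max_lifetime: int     # hard cap on renewals (assumed 7 days)
    expires_at: int = 0

    def __post_init__(self):
        # A fresh token is valid for one renew interval, capped by the max date.
        self.expires_at = min(self.issued_at + self.renew_interval, self.max_date)

    @property
    def max_date(self) -> int:
        return self.issued_at + self.max_lifetime

    def renew(self, now: int) -> bool:
        """Extend the expiry; fails if the token already expired or hit its cap."""
        if now >= self.expires_at or now >= self.max_date:
            return False
        self.expires_at = min(now + self.renew_interval, self.max_date)
        return True


def first_failed_renewal(token: Token, step: int) -> int:
    """Renew every `step` seconds and return the time of the first failure."""
    now = token.issued_at
    while True:
        now += step
        if not token.renew(now):
            return now


if __name__ == "__main__":
    token = Token(issued_at=0, renew_interval=24 * HOUR, max_lifetime=7 * 24 * HOUR)
    failed_at = first_failed_renewal(token, step=12 * HOUR)
    print(f"renewal first fails after {failed_at // HOUR} hours")
```

In this model a process that renews every 12 hours keeps the token alive until the maximum lifetime is reached; from then on renewal is refused and only a newly issued token helps, which is where the keytab mentioned in the quoted text comes in.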