Hi Steve,

Thanks a lot for such a detailed email (it raised more questions
than it answered, but that's because I'm new to YARN in general and
Kerberos/tokens/tickets in particular).

Thanks also for liking my notes. I'm very honoured to hear it from
you. I value your work with Spark/YARN/Hadoop. I'm going to spend some
time on security stuff and Kerberos is on my list (to learn why YARN
could be a better option than Mesos). I'll ping you when I'm ready for
review. Thanks.

Regards,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


On Wed, Aug 24, 2016 at 11:28 AM, Steve Loughran <ste...@hortonworks.com> wrote:
>
>> On 23 Aug 2016, at 11:26, Jacek Laskowski <ja...@japila.pl> wrote:
>>
>> Hi Steve,
>>
>> Could you share your opinion on whether the token gets renewed or not?
>> Is the token going to expire after 7 days anyway?
>
>
> There are Hadoop service tokens and Kerberos tickets. They are similar-ish,
> but not quite the same.
>
> - Kerberos "tickets" expire; you need to re-authenticate with a keytab or
> user+password.
> - Hadoop "tokens" are more anonymous. A Kerberos-authenticated application has
> to talk to the service to ask for a token (i.e. it uses a Kerberos ticket to
> say "I need a token for operation X for Y hours").
> - There are protocols for renewing tokens up to a time limit; this can be done
> over IPC mechanisms, or REST APIs using SASL.
>
> I get a bit mixed up myself, and use "tickets and tokens" to allow myself to
> get away with mistakes.
>
> Things about Kerberos you didn't want to know but will end up discovering in
> stack traces anyway:
>
> webinar: 
> http://hortonworks.com/webinar/hadoop-and-kerberos-the-madness-beyond-the-gate/
>
> and
>
> https://steveloughran.gitbooks.io/kerberos_and_hadoop/content/
>
> YARN apps can run for a couple of days renewing tokens, but eventually the
> time limit on token renewal is reached; they then need to use a Kerberos
> ticket to request new tokens.
> If something times out after 7 days, I would guess that it's Kerberos ticket
> expiry; a keytab needs to be passed to Spark for it to do the renewal.
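
The renewal behaviour described above can be sketched as a toy model. This is only an illustration, not Hadoop code: the TokenService class, its method names, and the 24-hour renew interval and 7-day maximum lifetime are all assumptions chosen for the example, so check your own cluster's configuration for the real limits.

```python
from dataclasses import dataclass

HOUR = 3600  # seconds


class RenewalLimitReached(Exception):
    """Raised when a token can no longer be renewed (hypothetical name)."""


@dataclass
class Token:
    expires_at: int        # current expiry time, in seconds
    max_lifetime_end: int  # hard limit; renewal cannot go past this


class TokenService:
    """Toy stand-in for a Hadoop service that issues tokens (hypothetical)."""

    def __init__(self, renew_interval=24 * HOUR, max_lifetime=7 * 24 * HOUR):
        # Both values are assumptions for illustration only.
        self.renew_interval = renew_interval
        self.max_lifetime = max_lifetime

    def issue(self, now, has_kerberos_ticket):
        # A token is only handed out to a Kerberos-authenticated caller.
        if not has_kerberos_ticket:
            raise PermissionError("a valid Kerberos ticket is needed to request a token")
        return Token(now + self.renew_interval, now + self.max_lifetime)

    def renew(self, token, now):
        # Renewal extends the expiry, but never beyond the maximum lifetime.
        if now >= token.max_lifetime_end or now >= token.expires_at:
            raise RenewalLimitReached("token can no longer be renewed")
        token.expires_at = min(now + self.renew_interval, token.max_lifetime_end)
        return token


def run_until_renewal_fails(service, step=12 * HOUR):
    """Renew every `step` seconds; return (successful renewals, failure time)."""
    token = service.issue(now=0, has_kerberos_ticket=True)
    now, renewals = 0, 0
    while True:
        now += step
        try:
            service.renew(token, now)
        except RenewalLimitReached:
            return renewals, now
        renewals += 1
```

With these assumed values, run_until_renewal_fails(TokenService()) keeps renewing until the 7-day mark, at which point the only way forward is to request a fresh token, which again needs a valid Kerberos ticket (hence the keytab).
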
>
> The current YARN docs on this: 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnApplicationSecurity.md
>
>
>
>
>
>>
>> Why was token renewal changed in
>> the recent version? See
>> https://github.com/apache/spark/commit/ab648c0004cfb20d53554ab333dd2d198cb94ffa
>>
>
>
> That's designed to make it easy for a Kerberos-authenticated client to get
> tokens for more services. Before: hard-coded support for HDFS, HBase, Hive.
> After: anything which implements the same interface. This includes multiple
> HBase servers, more than one Hive metastore, etc. It also stops the Spark
> client code needing lots of one-off classes, and allows people to add their
> own token-fetching code for their own services.
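
As a rough sketch of the "anything which implements the same interface" idea: the names below (TokenProvider, FakeServiceProvider, collect_tokens, and the tokens.<service>.enabled setting) are hypothetical and are not the classes or configuration keys from the Spark commit; they only illustrate how a client can loop over pluggable providers instead of hard-coding each service.

```python
from abc import ABC, abstractmethod


class TokenProvider(ABC):
    """Hypothetical common interface that every token-fetching plugin implements."""

    @abstractmethod
    def service_name(self) -> str:
        """Unique name of the service instance this provider talks to."""

    @abstractmethod
    def obtain_token(self, conf: dict) -> str:
        """Fetch a token for the service (needs Kerberos credentials in real life)."""


class FakeServiceProvider(TokenProvider):
    """Stand-in provider; a real one would contact HDFS, HBase, Hive, etc."""

    def __init__(self, name: str):
        self._name = name

    def service_name(self) -> str:
        return self._name

    def obtain_token(self, conf: dict) -> str:
        # Placeholder value; no real service is contacted in this sketch.
        return f"token-for-{self._name}"


def collect_tokens(providers, conf):
    """Ask every enabled provider for a token; no per-service special cases."""
    tokens = {}
    for provider in providers:
        name = provider.service_name()
        # Hypothetical per-service switch, enabled by default.
        if conf.get(f"tokens.{name}.enabled", True):
            tokens[name] = provider.obtain_token(conf)
    return tokens
```

Because the client only sees the interface, supporting two HBase clusters or a user's own service is a matter of adding another provider to the list, for example collect_tokens([FakeServiceProvider("hbase-a"), FakeServiceProvider("hbase-b")], {}).
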
>
>> Regards,
>> Jacek Laskowski
>> ----
>> https://medium.com/@jaceklaskowski/
>> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
>> Follow me at https://twitter.com/jaceklaskowski
>
>
> Like your e-book BTW
>
> If you plan to add a specific section on Spark & Kerberos, I'd gladly help
> review it.
