Thank you for the clarification, Marcelo, that makes sense. I'm thinking about two questions here, somewhat unrelated to the original problem.
What is the purpose of the delegation token renewal (the one that is done automatically by the Hadoop libraries, after 1 day by default)? It seems to happen every day until the token expires, no matter what. I could probably find an answer to that in a basic Hadoop security description, but I have a feeling that giving the keytab to Spark bypasses the concept behind delegation tokens. As I understand it, the NN basically says "your application can access HDFS with this delegation token, but only for 7 days". After 7 days, the NN should *ideally* ask me something like "this app has been running for a week now, do you want it to continue?"; then I'd need to log in with my keytab and hand the new delegation token to the application. I know that this would be really difficult to handle, but as it stands Spark just "ignores" the whole token expiration mechanism and re-logs in whenever needed. Am I missing something?

2016-11-03 22:42 GMT+01:00 Marcelo Vanzin <van...@cloudera.com>:
> I think you're a little confused about what "renewal" means here, and
> this might be the fault of the documentation (I haven't read it in a
> while).
>
> The existing delegation tokens will always be "renewed", in the sense
> that Spark (actually Hadoop code invisible to Spark) will talk to the
> NN to extend their lifetime. The feature you're talking about is for
> creating *new* delegation tokens after the old ones expire and cannot
> be renewed anymore (i.e. the max-lifetime configuration).
>
> On Thu, Nov 3, 2016 at 2:02 PM, Zsolt Tóth <toth.zsolt....@gmail.com> wrote:
> > Yes, I did change dfs.namenode.delegation.key.update-interval and
> > dfs.namenode.delegation.token.renew-interval to 15 min, and the
> > max-lifetime to 30 min. In this case the application (without Spark
> > having the keytab) did not fail after 15 min, only after 30 min. Is it
> > possible that the resource manager somehow automatically renews the
> > delegation tokens for my application?
> >
> > 2016-11-03 21:34 GMT+01:00 Marcelo Vanzin <van...@cloudera.com>:
> >>
> >> Sounds like your test was set up incorrectly. The default TTL for
> >> tokens is 7 days. Did you change that in the HDFS config?
> >>
> >> The issue definitely exists and people definitely have run into it. So
> >> if you're not hitting it, it's most definitely an issue with your test
> >> configuration.
> >>
> >> On Thu, Nov 3, 2016 at 7:22 AM, Zsolt Tóth <toth.zsolt....@gmail.com> wrote:
> >> > Hi,
> >> >
> >> > I ran some tests regarding Spark's delegation token renewal
> >> > mechanism. As I see it, the concept here is simple: if I give my
> >> > keytab file and client principal to Spark, it starts a token renewal
> >> > thread and renews the namenode delegation tokens after some time.
> >> > This works fine.
> >> >
> >> > Then I tried to run a long application (with an HDFS operation at
> >> > the end) without providing the keytab/principal to Spark, and I
> >> > expected it to fail after the token expires. It turned out that this
> >> > is not the case: the application finishes successfully without a
> >> > delegation token renewal by Spark.
> >> >
> >> > My question is: how is that possible? Shouldn't a saveAsTextFile()
> >> > fail after the namenode delegation token has expired?
> >> >
> >> > Regards,
> >> > Zsolt
> >>
> >> --
> >> Marcelo
> >
>
> --
> Marcelo
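For anyone trying to reproduce Zsolt's test, the shortened lifetimes he describes could be set in hdfs-site.xml roughly like this. These are the property names quoted in the thread; their values are in milliseconds, and defaults may vary between Hadoop versions, so treat this as a sketch and verify against your distribution's hdfs-default.xml:

```
<!-- hdfs-site.xml: shorten delegation token lifetimes for testing.
     All values are in milliseconds. -->
<property>
  <name>dfs.namenode.delegation.key.update-interval</name>
  <value>900000</value>   <!-- 15 min -->
</property>
<property>
  <name>dfs.namenode.delegation.token.renew-interval</name>
  <value>900000</value>   <!-- 15 min: a token must be renewed within this window -->
</property>
<property>
  <name>dfs.namenode.delegation.token.max-lifetime</name>
  <value>1800000</value>  <!-- 30 min: past this, a token cannot be renewed at all -->
</property>
```

With these settings, an application whose token is being renewed (by the RM or by Hadoop code) would survive the 15-minute renew interval but, without a keytab to obtain fresh tokens, should hit the 30-minute max-lifetime wall, matching the behavior Zsolt observed.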