Github user tgravescs commented on the pull request:
https://github.com/apache/spark/pull/4688#issuecomment-97183253
> So I noticed that if the dfs.namenode.delegation.token.renew-interval is
different from the max lifetime of the token, a lot of exceptions get thrown
around with the token being expired etc - and the executors may not be able to
read the new tokens. It looks like the tokens don't get renewed if HDFS is not
accessed before the renew interval - so for an executor which accesses HDFS
rarely enough, it may not be able to read from HDFS.
> So instead of waiting till 80% of max lifetime, I wait till 0.75 *
dfs.namenode.delegation.token.renew-interval to renew. This means that the
hdfs-site.xml file must be in sync with the one on the namenode (my
understanding is this param's value is rarely changed, so this is unlikely to
be an issue at all).
Thanks for the updates and the details on testing.
My guess is that after the initial expiration period the YARN RM isn't
renewing the tokens anymore, since it doesn't get the updated ones (it only
has the one you initially submitted the application with). So for the token
to stay good for longer than 1 day, you either have to renew it or do the
loginFromKeytab like you mention.
So you could change this to renew until the max lifetime and then do the
loginFromKeytab. I don't think doing the loginFromKeytab is going to add much
more overhead than doing the renewal, so I'm ok with leaving this doing the
loginFromKeytab before the renewal period. We could always change it later if
we decide to.
I'd rather not use the dfs.namenode.delegation.token.renew-interval config. As
you say, the value on the gateway might not match what the namenode is
using. You can get the renewal interval by doing a renew on the token once.
Then we can store that and do the loginFromKeytab at X% of that. Note that
addDelegationTokens in obtainTokensForNamenodes will return a list of tokens
that you could renew to get the period.
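To make the arithmetic concrete, here is a rough sketch of deriving the interval from a single renew and scheduling the next loginFromKeytab at a fraction of it. It assumes the renew call returns the token's new expiration time in milliseconds (as Hadoop's Token.renew(conf) does). The class name, method names, and the 0.75 fraction are hypothetical, for illustration only, and not code from this PR.

```java
// Hypothetical helper: names and structure are illustrative, not from the PR.
final class RenewalSchedule {
    private RenewalSchedule() {}

    /**
     * Interval implied by one renew call.
     * renewCallTimeMs: wall-clock time at which the renew was issued.
     * newExpirationMs: value returned by the renew call (assumed to be the
     *                  token's new expiration time in milliseconds).
     */
    static long renewalIntervalMs(long renewCallTimeMs, long newExpirationMs) {
        if (newExpirationMs <= renewCallTimeMs) {
            throw new IllegalArgumentException("expiration must be after the renew call");
        }
        return newExpirationMs - renewCallTimeMs;
    }

    /**
     * Time at which to do the next loginFromKeytab: the given fraction
     * (the "X%" above) of the renewal interval after the renew call.
     */
    static long nextLoginTimeMs(long renewCallTimeMs, long intervalMs, double fraction) {
        if (fraction <= 0.0 || fraction >= 1.0) {
            throw new IllegalArgumentException("fraction must be strictly between 0 and 1");
        }
        return renewCallTimeMs + (long) (intervalMs * fraction);
    }
}
```

The interval would be computed once and stored, so later logins can be scheduled without reading any dfs.* setting on the gateway.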
I'll look through the rest of the code and leave any comments.