[ https://issues.apache.org/jira/browse/SPARK-33440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jungtaek Lim resolved SPARK-33440.
----------------------------------
    Fix Version/s: 3.0.2
                   3.1.0
       Resolution: Fixed

Issue resolved by pull request 30366
[https://github.com/apache/spark/pull/30366]

> Spark schedules on updating delegation token with 0 interval under some token 
> provider implementation
> -----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-33440
>                 URL: https://issues.apache.org/jira/browse/SPARK-33440
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.0.1, 3.1.0
>            Reporter: Jungtaek Lim
>            Assignee: Jungtaek Lim
>            Priority: Major
>             Fix For: 3.1.0, 3.0.2
>
>
> We got a report from a customer that, under a specific circumstance, Spark 
> schedules delegation token updates with a 0 interval, which ends up flooding 
> the log and sending massive numbers of requests to the token handler side.
> After investigation, the problem turned out to be that they have two 
> delegation token identifiers, where one of them (IDBS3ATokenIdentifier) has 
> an "issue date" of 0, whereas the other (DelegationTokenIdentifier) has a 
> correct value.
> Both provide the expire time correctly via Token.renew(), and Spark assumes 
> the issue date is correct, hence it calculates each token's renewal interval 
> as (the result of Token.renew() - "issue date").
> {code}
> 20/10/13 06:34:19 INFO security.HadoopFSDelegationTokenProvider: Renewal 
> interval is 1603175657000 for token S3ADelegationToken/IDBroker
> 20/10/13 06:34:19 INFO security.HadoopFSDelegationTokenProvider: Renewal 
> interval is 86400048 for token HDFS_DELEGATION_TOKEN
> {code}
> It's still safe at this point, because Spark picks the minimal value. The 
> problem is that, to calculate the next renewal timestamp, Spark adds the 
> renewal interval to the issue date for every token and picks the minimum, 
> hence "86400048" is picked as the next renewal timestamp.
> This is earlier than "now", so the delay to schedule becomes negative (as we 
> subtract "now" from it), and Spark applies a safeguard that picks the 
> greater of 0 and the delay, hence 0 is picked and the token update is 
> scheduled immediately, over and over. (Each schedule is one-time, but the 
> calculation always leads to a negative value, so it effectively becomes an 
> immediate re-schedule every time.)
> We should construct a better safeguard, instead of just guarding that the 
> schedule delay doesn't go negative.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
