Thomas Graves created SPARK-22218:
-------------------------------------

             Summary: Spark shuffle service fails to update secret on 
application re-attempts
                 Key: SPARK-22218
                 URL: https://issues.apache.org/jira/browse/SPARK-22218
             Project: Spark
          Issue Type: Bug
          Components: Shuffle, YARN
    Affects Versions: 2.2.0
            Reporter: Thomas Graves
            Priority: Blocker


When running on YARN with the Spark 2.2 shuffle service, any application 
re-attempt fails with javax.security.sasl.SaslException because the external 
shuffle service does not update the application's credentials.

SPARK-21494, fixed in 2.2, changed the ShuffleSecretManager to check 
containsKey 
(https://git.corp.yahoo.com/hadoop/spark/blob/yspark_2_2_0/common/network-shuffle/src/main/java/org/apache/spark/network/sasl/ShuffleSecretManager.java#L50)
before registering a secret. That check is the proper behavior, but the key is 
never removed between application re-attempts. When the second attempt starts, 
the application id is unchanged, so the code finds the key already present and 
keeps the first attempt's secret instead of storing the new one.
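
The failure mode can be illustrated with a minimal sketch. This is a simplified 
model, not the actual ShuffleSecretManager source: the class name 
SecretRegistrySketch and all of its method names are hypothetical, and 
registerOrReplace is only one possible remedy, not necessarily the fix that 
will be adopted.

```java
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical, simplified model of a per-application secret store.
 * It is NOT the real ShuffleSecretManager; names are illustrative only.
 */
public class SecretRegistrySketch {
    // Maps application id -> shuffle secret.
    private final ConcurrentHashMap<String, String> secrets =
        new ConcurrentHashMap<>();

    /**
     * Models the behavior described in this report: if the application id
     * is already present, the new secret is silently ignored.
     */
    public void registerIfAbsent(String appId, String secret) {
        if (!secrets.containsKey(appId)) {
            secrets.put(appId, secret);
        }
    }

    /**
     * One possible remedy (an assumption, not the confirmed fix):
     * always store the most recent secret for the application id.
     */
    public void registerOrReplace(String appId, String secret) {
        secrets.put(appId, secret);
    }

    /** Alternative remedy: drop the key when an attempt ends. */
    public void unregister(String appId) {
        secrets.remove(appId);
    }

    /** Returns the stored secret, or null if the id is unknown. */
    public String getSecret(String appId) {
        return secrets.get(appId);
    }
}
```

In this model, calling registerIfAbsent twice with the same application id 
leaves the first attempt's secret in place, so a second attempt presenting a 
new secret would fail authentication. Either replacing the value on every 
registration or removing the key when an attempt ends avoids the stale entry.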



