Thomas Graves created SPARK-22218:
-------------------------------------
Summary: spark shuffle services fails to update secret on
application re-attempts
Key: SPARK-22218
URL: https://issues.apache.org/jira/browse/SPARK-22218
Project: Spark
Issue Type: Bug
Components: Shuffle, YARN
Affects Versions: 2.2.0
Reporter: Thomas Graves
Priority: Blocker
When running on YARN with the Spark 2.2 external shuffle service, any
application re-attempt fails with javax.security.sasl.SaslException because
the shuffle service does not update the application's secret for the new
attempt.
SPARK-21494, fixed in 2.2, changed the ShuffleSecretManager to check
containsKey
(https://git.corp.yahoo.com/hadoop/spark/blob/yspark_2_2_0/common/network-shuffle/src/main/java/org/apache/spark/network/sasl/ShuffleSecretManager.java#L50)
before registering a secret. That check is the proper behavior in itself; the
problem is that the key is never removed between application re-attempts. When
the second attempt starts, the application id is unchanged, so the code finds
the key already present and does not update the secret.
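
The failure mode described above can be reproduced with a minimal sketch. This
is not the Spark source: SecretRegistrySketch, registerIfAbsent,
registerOrReplace and the secret strings are hypothetical names chosen for
illustration, and overwriting the entry is only one possible remedy
(unregistering the application between attempts would be another).

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an appId -> secret registry; not the real
// ShuffleSecretManager.
class SecretRegistrySketch {
    private final ConcurrentHashMap<String, String> secrets =
        new ConcurrentHashMap<>();

    // Behavior described in the report: an existing key is never overwritten,
    // so a re-attempt with the same application id keeps the old secret.
    public void registerIfAbsent(String appId, String secret) {
        if (!secrets.containsKey(appId)) {
            secrets.put(appId, secret);
        }
    }

    // One possible remedy (an assumption, not the confirmed fix): always
    // store the most recent secret for the application id.
    public void registerOrReplace(String appId, String secret) {
        secrets.put(appId, secret);
    }

    // Alternative remedy: drop the entry when an attempt ends.
    public void unregister(String appId) {
        secrets.remove(appId);
    }

    public String getSecret(String appId) {
        return secrets.get(appId);
    }

    public static void main(String[] args) {
        SecretRegistrySketch registry = new SecretRegistrySketch();
        String appId = "application_0001"; // same id for both attempts

        registry.registerIfAbsent(appId, "attempt1-secret");
        registry.registerIfAbsent(appId, "attempt2-secret");
        // The second attempt's secret was ignored.
        System.out.println(registry.getSecret(appId)); // prints attempt1-secret

        registry.registerOrReplace(appId, "attempt2-secret");
        System.out.println(registry.getSecret(appId)); // prints attempt2-secret
    }
}
```

In this sketch the second attempt would authenticate with "attempt2-secret"
while the registry still holds "attempt1-secret", which is the kind of mismatch
that would surface as a SASL failure.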