Kay Ousterhout created SPARK-9509:
-------------------------------------
Summary: AppClient.stop() may throw an exception
Key: SPARK-9509
URL: https://issues.apache.org/jira/browse/SPARK-9509
Project: Spark
Issue Type: Bug
Reporter: Kay Ousterhout
Assignee: Shixiong Zhu
AppClient.stop() calls RpcEndpointRef.askWithRetry, which throws a
SparkException if it fails. stop() catches only timeout exceptions, so the
SparkException propagates and can cause a failure during shutdown that
prevents Spark from cleaning itself up properly. This behavior was changed in
this commit:
https://github.com/apache/spark/commit/3bee0f1466ddd69f26e95297b5e0d2398b6c6268#diff-a240aa7b4630dc389590147f96cf3431R174
This seems to be the root cause of the recent DistributedSuite test failures
described in SPARK-9497 (the flakiness of DistributedSuite coincides with the
addition of the above commit to master).
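
One possible shape for a fix, as a minimal sketch only: catch any non-fatal
exception from the ask inside stop(), so that shutdown can continue. The
SparkException class, the EndpointRef trait, the StopSketch object, and the
"StopAppClient" message string below are hypothetical stand-ins written for
this illustration. They are not Spark's actual classes, and this is not
necessarily how the issue was or will be resolved.

```scala
import scala.util.control.NonFatal

// Hypothetical stand-in for the exception type thrown when an ask fails.
class SparkException(msg: String) extends Exception(msg)

// Hypothetical stand-in for an RPC endpoint reference.
trait EndpointRef {
  def askWithRetry[T](message: Any): T
}

object StopSketch {
  // Sends a stop message and reports whether it was acknowledged.
  // Any non-fatal failure (not only a timeout) is logged and swallowed,
  // so the caller can carry on with the rest of its shutdown sequence.
  def stop(endpoint: EndpointRef): Boolean = {
    try {
      endpoint.askWithRetry[Boolean]("StopAppClient")
    } catch {
      case NonFatal(e) =>
        println(s"Stop request failed; continuing shutdown: ${e.getMessage}")
        false
    }
  }
}
```

With this shape, an endpoint whose askWithRetry throws a SparkException makes
stop() return false instead of propagating the exception to the caller.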