Kay Ousterhout created SPARK-9509:
-------------------------------------

             Summary: AppClient.stop() may throw an exception
                 Key: SPARK-9509
                 URL: https://issues.apache.org/jira/browse/SPARK-9509
             Project: Spark
          Issue Type: Bug
            Reporter: Kay Ousterhout
            Assignee: Shixiong Zhu


AppClient.stop() calls RpcEndpointRef.askWithRetry, which throws a 
SparkException if it fails.  This exception is not caught (stop() only catches 
timeout exceptions), which can cause a failure during shutdown and prevent 
Spark from cleaning itself up properly.  This behavior was introduced in this 
commit: 
https://github.com/apache/spark/commit/3bee0f1466ddd69f26e95297b5e0d2398b6c6268#diff-a240aa7b4630dc389590147f96cf3431R174
and it seems to be the root cause of the recent DistributedSuite test 
failures described in SPARK-9497 (DistributedSuite became flaky around the 
time the above commit was merged to master).
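
To make the failure mode described above concrete, here is a minimal sketch of 
a stop() that also tolerates non-timeout failures.  It is not the actual Spark 
code or the eventual fix: SketchSparkException, SketchEndpointRef and 
SketchAppClient are hypothetical stand-ins invented for illustration so the 
example compiles without Spark on the classpath, and the "StopAppClient" 
message is likewise just a placeholder.

```scala
import java.util.concurrent.TimeoutException
import scala.util.control.NonFatal

// Hypothetical stand-in for SparkException (not the real Spark class).
class SketchSparkException(msg: String) extends Exception(msg)

// Hypothetical stand-in for an RPC endpoint reference whose askWithRetry
// throws on failure, as the issue describes.
trait SketchEndpointRef {
  def askWithRetry(msg: Any): Boolean
}

// Hypothetical stand-in for the client; only the shutdown path is sketched.
class SketchAppClient(endpoint: SketchEndpointRef) {
  @volatile private var stopped = false

  def isStopped: Boolean = stopped

  def stop(): Unit = {
    if (!stopped) {
      try {
        endpoint.askWithRetry("StopAppClient")
      } catch {
        // The case the issue says is already handled.
        case _: TimeoutException =>
          println("Timed out waiting for the client endpoint to stop")
        // The case the issue says is missing: any other non-fatal failure.
        case NonFatal(e) =>
          println(s"Failed to stop the client endpoint cleanly: ${e.getMessage}")
      } finally {
        // Mark the client as stopped whether or not the ask succeeded,
        // so the caller's shutdown sequence can continue.
        stopped = true
      }
    }
  }
}
```

With an endpoint whose askWithRetry throws SketchSparkException, stop() in this 
sketch logs the failure and returns normally instead of propagating the 
exception into the caller's shutdown path.  Whether swallowing the exception 
is the right policy for Spark itself is a separate design question.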



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
