GitHub user ilganeli opened a pull request:

    https://github.com/apache/spark/pull/5277

    [SPARK-6492][CORE] SparkContext.stop() can deadlock when DAGSchedulerEventProcessLoop dies

    I've added a timeout and retry loop around the SparkContext shutdown code 
that should fix this deadlock. If a SparkContext shutdown is already in progress 
when another thread requests one, that thread waits up to 10 seconds for the 
lock, then falls through so that the outer loop can re-submit the request.
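
    The timeout-and-retry pattern described above can be sketched roughly as 
follows. This is a minimal Java illustration, not the Scala code from the patch: 
the class StoppableContext, the stopLock field, and the shutdownRuns counter are 
hypothetical names, and the timeout is a parameter rather than the fixed 10 
seconds mentioned above.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for the context being shut down; not Spark's SparkContext. */
final class StoppableContext {
    private final ReentrantLock stopLock = new ReentrantLock();
    private volatile boolean stopped = false;
    private int shutdownRuns = 0; // counts how often the shutdown body actually ran

    /** Retries until shutdown has completed, waiting at most timeoutMillis per attempt. */
    void stop(long timeoutMillis) {
        while (!stopped) {
            boolean acquired;
            try {
                // Bounded wait instead of blocking forever on a lock another thread holds.
                acquired = stopLock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and give up
                return;
            }
            if (!acquired) {
                continue; // timed out: fall through so the outer loop re-submits
            }
            try {
                if (!stopped) {     // another caller may have finished in the meantime
                    shutdownRuns++; // placeholder for the real shutdown work
                    stopped = true; // set only after the work above succeeded
                }
            } finally {
                stopLock.unlock();
            }
        }
    }

    boolean isStopped() { return stopped; }

    int shutdownRuns() { return shutdownRuns; }

    public static void main(String[] args) throws InterruptedException {
        StoppableContext ctx = new StoppableContext();
        Thread other = new Thread(() -> ctx.stop(100)); // a second, concurrent caller
        other.start();
        ctx.stop(100);
        other.join();
        System.out.println("stopped=" + ctx.isStopped() + " runs=" + ctx.shutdownRuns());
    }
}
```

    In this sketch a caller that times out never holds the lock, so it cannot 
block the thread that is performing the shutdown; it only re-checks the flag and 
tries again.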
    
    Also, I've moved the `stopped = true` assignment to the end of the 
shutdown sequence. Otherwise the sequence could fail partway through while the 
context was already labelled as stopped.
    
    I would appreciate feedback from folks more familiar with the shutdown code 
to confirm that it's OK to run some of the items within the shutdown sequence 
multiple times in case the entire sequence doesn't complete successfully the 
first time. I think this should be fixed rather than left as it is now: in my 
opinion, a double cleanup that reports an error is better than an incomplete 
cleanup.
    
    I also added a null check for the dagScheduler object, which could 
otherwise trigger a NullPointerException.
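
    To illustrate the two points above (setting the stopped flag only after 
every step has run, and guarding the scheduler with a null check), here is a 
small Java sketch. It is an assumption-laden illustration rather than the patch 
itself: ShutdownSequence and FakeScheduler are hypothetical classes, and the 
simulated failure exists only to show a second run of the same cleanup step.

```java
/** Hypothetical shutdown sequence; the names here are illustrative, not Spark's. */
final class ShutdownSequence {

    /** Stand-in for a component whose stop() may fail the first time it is called. */
    static final class FakeScheduler {
        private boolean failOnce;
        int stopCalls = 0;

        FakeScheduler(boolean failOnce) { this.failOnce = failOnce; }

        void stop() {
            stopCalls++;
            if (failOnce) {
                failOnce = false;
                throw new IllegalStateException("simulated failure during stop");
            }
        }
    }

    private FakeScheduler dagScheduler; // may be null if initialization never finished
    private boolean stopped = false;

    ShutdownSequence(FakeScheduler dagScheduler) { this.dagScheduler = dagScheduler; }

    /** Returns true once every step has completed; safe to call again after a failure. */
    synchronized boolean stop() {
        if (stopped) {
            return true;
        }
        try {
            if (dagScheduler != null) { // the null check avoids a NullPointerException
                dagScheduler.stop();    // may run twice if the first attempt throws
                dagScheduler = null;    // step finished; later calls skip it
            }
            stopped = true;             // last statement: set only after all steps ran
        } catch (RuntimeException e) {
            // Leave stopped == false so that a later call retries the sequence.
        }
        return stopped;
    }

    synchronized boolean isStopped() { return stopped; }

    public static void main(String[] args) {
        FakeScheduler scheduler = new FakeScheduler(true);
        ShutdownSequence seq = new ShutdownSequence(scheduler);
        System.out.println("first attempt completed: " + seq.stop());
        System.out.println("second attempt completed: " + seq.stop());
        System.out.println("scheduler stop calls: " + scheduler.stopCalls);
    }
}
```

    The trade-off shown here is the one raised above: a failed attempt leaves 
the flag unset, so the scheduler's stop() can be invoked a second time, which is 
only acceptable if each step tolerates being repeated.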

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ilganeli/spark SPARK-6492

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/5277.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5277
    
----
commit 343cb941d4ffde4c8a1552ddf693d23a27d8388f
Author: Ilya Ganelin <[email protected]>
Date:   2015-03-30T22:45:22Z

    [SPARK-6492] Added timeout/retry logic to fix a deadlock in SparkContext shutdown

commit df8224ff4621dc6974d56292af09281c31b4381e
Author: Ilya Ganelin <[email protected]>
Date:   2015-03-30T22:46:14Z

    Added comment for added lock

----

