GitHub user CodingCat commented on the pull request:
https://github.com/apache/spark/pull/186#issuecomment-38769355
Hi @kayousterhout @markhamstra @mateiz @pwendell, I just committed the
supervisor-based solution. It should work as expected: the supervised actor
crashes due to an exception and suspends itself; the supervisor is notified
of the failure, clears everything, and kills the child.
The next problem is how to test the fault-tolerance logic. As I understand
Akka TestKit, we have to declare the supervisor as a class in order to test
its internal logic, which involves moving some functions from DAGScheduler
into the new class. My concern is that 0.9.1 is just a maintenance release;
do we really want to change this much here?