Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/5636#issuecomment-97323086
  
    Thanks for the update @ilganeli! My comments are mostly minor. The only 
thing that's bugging me is that the tests don't really show how the stage 
failure gets pushed up to the user code. E.g., do they get a `SparkException` 
with a good message -- or does the DAGScheduler end up in some weird state 
where it stops running any additional jobs? I think it should work, but the 
DAGScheduler code is hairy enough that I'd really prefer a test. But I can't 
come up with a good way to write a unit test (or to test it manually, for that 
matter). Maybe something like this test in `ShuffleSuite`?
    
    
https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/ShuffleSuite.scala#L264
    
    The problem is that you don't have a good way to delete the shuffle files 
between stage attempts ... but maybe we could swap in a different 
`diskBlockManager` that always fails to find the files, or something like 
that. I'll think about it a little more.
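
    As a rough sketch of the shape such an end-to-end check could take (not code from this PR): the snippet below runs a two-stage job in local mode, makes the second stage fail, and then checks the two things asked about above, namely that user code sees a `SparkException` with a message and that the same `SparkContext` can still run another job afterwards. It assumes Spark 1.3 or later on the classpath. The object name `StageFailureSketch` and the "boom" failure are made up for illustration, and a plain task exception stands in for the fetch failure, which this sketch does not reproduce.

```scala
import org.apache.spark.{SparkConf, SparkContext, SparkException}

// Hypothetical sketch: a plain task failure stands in for the fetch
// failure discussed above, which is much harder to trigger on demand.
object StageFailureSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[2]")
      .setAppName("stage-failure-sketch")
    val sc = new SparkContext(conf)
    try {
      // reduceByKey introduces a shuffle, so the job has two stages;
      // the failure is raised in the stage after the shuffle.
      val failure: Option[SparkException] =
        try {
          sc.parallelize(1 to 10, 2)
            .map(x => (x % 2, x))
            .reduceByKey(_ + _)
            .foreach { _ => throw new RuntimeException("boom") }
          None
        } catch {
          case e: SparkException => Some(e)
        }

      // 1. The failure should reach user code as a SparkException
      //    that carries some message.
      assert(failure.isDefined, "expected the job to fail with SparkException")
      assert(failure.get.getMessage.nonEmpty)

      // 2. The scheduler should still accept and finish a new job.
      assert(sc.parallelize(1 to 10, 2).count() == 10L)
    } finally {
      sc.stop()
    }
  }
}
```

    Inside the Spark test suites the same two assertions would more naturally be written with ScalaTest's `intercept[SparkException] { ... }`; the open question from the comment, how to force repeated fetch failures between stage attempts, is untouched by this sketch.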


