Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/7699#issuecomment-134762560
  
    @markhamstra @mateiz  thanks for taking a look -- I think I've addressed your 
concerns.
    
    However, the last round of comments made me realize that there is probably 
still an issue -- after we register the map output for stage 1 and start 
executing stage 2, I think we'll still have a pending task set for stage 1 that 
is non-zombie.  You'll probably see pretty confusing behavior if lots of tasks 
keep completing for stage 1, and you're very likely to run into 
[SPARK-8029](https://issues.apache.org/jira/browse/SPARK-8029).  We can't 
eliminate this completely, since both attempts can be running the same 
partition at the same time (so SPARK-8029 is a possibility no matter what).  
But I feel like we should at least mark the attempt as zombie so it doesn't 
launch even more tasks -- that reduces the possibility, makes the output a 
little more understandable, and avoids wasting resources on tasks that aren't 
needed.
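    The zombie-marking idea can be sketched with a toy model. This is a 
hypothetical, simplified illustration -- `TaskSetManager` and the `isZombie` 
flag mirror names in Spark's scheduler, but the classes and methods here are 
invented for the sketch and are not Spark's actual implementation:

    ```scala
    object ZombieSketch {
      // Toy stand-in for a TaskSetManager: one attempt of one stage.
      class TaskSetManager(val stageId: Int, val attempt: Int) {
        // A zombie task set must not launch any new tasks.
        var isZombie: Boolean = false

        // Simplified resource offer: return a task description, or None
        // if this attempt has been marked as a zombie.
        def resourceOffer(): Option[String] =
          if (isZombie) None
          else Some(s"task for stage $stageId, attempt $attempt")
      }

      // Simulate the situation described above: two live attempts for
      // stage 1 exist when the stage's map output gets registered.
      def run(): Seq[Option[String]] = {
        val attempts = Seq(new TaskSetManager(1, 0), new TaskSetManager(1, 1))

        // Once the map output for stage 1 is registered, mark every
        // remaining attempt for that stage as zombie so no further tasks
        // are launched for work that is no longer needed.
        attempts.filter(_.stageId == 1).foreach(_.isZombie = true)

        // Subsequent offers schedule nothing from the zombie attempts.
        attempts.map(_.resourceOffer())
      }

      def main(args: Array[String]): Unit = println(run())
    }
    ```

    Already-running tasks from the zombie attempts can of course still finish 
(hence SPARK-8029 remains possible), but no new ones get scheduled.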
    
    I think testing that is going to be a little tricky, since it involves 
interaction between `DAGScheduler` and `TaskSetManager` that isn't possible 
with the way we currently have tests set up in `DAGSchedulerSuite`.  So I'd 
like to tackle this in a separate task, since I think this is a strict 
improvement in any case.  I should be able to look at that right away, so I 
won't be putting it off indefinitely.  Thoughts?

