JoshRosen commented on pull request #34265:
URL: https://github.com/apache/spark/pull/34265#issuecomment-941751848


   This is a longstanding issue, and there have been multiple previous attempts to 
fix it:
   
   - #3794
   - #20770
   - #24438
   - #27234
   
   Some early attempts were rejected due to thread-safety issues with their 
approaches or became stale without review.
   
   This PR's approach is very similar to @ajithme's approach in #27234, with a 
few key differences:
   
   - I allowed exceptions to bubble up instead of logging and ignoring them.
   - I used a faster and less-race-condition-prone testing approach (using the 
`SchedulerIntegrationSuite` framework).
   - I used a non-recursive tree-traversal method (based on similar existing 
methods) to avoid stack overflow errors when traversing huge DAGs.
   - I also added the fix to `submitMapStage` and `runApproximateJob`: these 
are much less commonly used code paths but can still potentially benefit from the fix.
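
As a rough illustration of the non-recursive traversal idea in the list above, here is a minimal Python sketch. It is not the code from this PR or from Spark: `Node`, `parents`, `visit_all`, and `build_chain` are hypothetical names chosen for the example. It only shows the general technique of replacing recursion with an explicit stack, so that traversal depth is limited by heap memory and not by the call stack.

```python
class Node:
    """Hypothetical stand-in for a DAG node and its upstream dependencies."""

    def __init__(self, name, parents=()):
        self.name = name
        self.parents = list(parents)


def visit_all(root):
    """Return every node reachable from `root`, each exactly once.

    An explicit stack (`waiting`) takes the place of recursive calls,
    so a very deep graph cannot overflow the call stack.
    """
    visited = set()
    order = []
    waiting = [root]
    while waiting:
        node = waiting.pop()
        if node in visited:
            continue  # shared ancestors are only processed once
        visited.add(node)
        order.append(node)
        waiting.extend(node.parents)
    return order


def build_chain(length):
    """Build a linear chain of `length` nodes and return the last one."""
    node = Node("n0")
    for i in range(1, length):
        node = Node(f"n{i}", [node])
    return node


if __name__ == "__main__":
    # A chain this deep would exceed Python's default recursion limit
    # if it were walked with a naive recursive function.
    print(len(visit_all(build_chain(100_000))))
```

The `visited` set matters for DAGs in which several nodes share an ancestor: without it, the shared portion of the graph would be walked once per path that reaches it.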


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


