warrenzhu25 opened a new pull request, #37924:
URL: https://github.com/apache/spark/pull/37924

   ### What changes were proposed in this pull request?
   Add a config `spark.stage.attempt.ignoreOnDecommissionFetchFailure` to 
control whether stage fetch failures caused by decommissioned executors are 
ignored when counting toward `spark.stage.maxConsecutiveAttempts`.
   
   ### Why are the changes needed?
   When executor decommission is enabled, many stage failures can be caused by 
FetchFailed errors from decommissioned executors, which can eventually fail the 
whole job. It would be better not to count such failures toward 
`spark.stage.maxConsecutiveAttempts`.
   
   ### Does this PR introduce _any_ user-facing change?
   Yes
   
   ### How was this patch tested?
   Added test in `DAGSchedulerSuite`
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
