seayoun commented on issue #26975: [SPARK-30325][CORE] Stage retry and executor crash cause app hung up forever
URL: https://github.com/apache/spark/pull/26975#issuecomment-568834508

@cloud-fan PTAL. After thinking this over more carefully, I removed the kill logic. I also think we don't need to handle this case in `handleSuccessfulTask`: it can simply overwrite the shuffle metadata, because if the executor holding the partition's shuffle data is lost, we can `Resubmit` the partition in the current stage instead of rescheduling it in the next stage via `FetchFailedException`.

cc @HyukjinKwon @jiangxb1987 @Ngone51
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
