holdenk opened a new pull request, #37821: URL: https://github.com/apache/spark/pull/37821
### What changes were proposed in this pull request?

Propagate the decommission executor loss reason in K8s during onDisconnect.

### Why are the changes needed?

Currently, if an executor has been sent a decommission message and then disconnects from the scheduler, we only disable the executor and depend on the K8s status events to drive the rest of the state transitions. However, the K8s status events can become overwhelmed on large clusters, so we should check whether an executor is in a decommissioning state when it disconnects and use that reason instead of waiting on the K8s status events. This gives us more accurate logging information.

### Does this PR introduce _any_ user-facing change?

Logging output will change.

### How was this patch tested?

Existing unit tests.
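The disconnect handling described in this PR can be sketched roughly as follows. This is a simplified, self-contained model and not the code in the patch: `DisconnectHandler`, `Decommissioned`, `AwaitingClusterManager`, and every method name here are hypothetical stand-ins for the scheduler backend's internal state, chosen only to illustrate the decision made at disconnect time.

```scala
import scala.collection.mutable

// Hypothetical loss reasons; Spark's real scheduler types are not used here.
sealed trait LossReason
case class Decommissioned(message: String) extends LossReason
case object AwaitingClusterManager extends LossReason

// Hypothetical model of the state the scheduler backend tracks per executor.
class DisconnectHandler {
  private val addressToExecutorId = mutable.Map.empty[String, String]
  private val decommissioning = mutable.Set.empty[String]
  private val disabled = mutable.Set.empty[String]
  private val removed = mutable.Map.empty[String, LossReason]

  def register(address: String, executorId: String): Unit =
    addressToExecutorId(address) = executorId

  // Called once a decommission message has been sent to the executor.
  def markDecommissioning(executorId: String): Unit =
    decommissioning += executorId

  // Returns the reason recorded for the disconnect, or None for an unknown address.
  def onDisconnected(address: String): Option[LossReason] =
    addressToExecutorId.get(address).map { id =>
      if (decommissioning.remove(id)) {
        // The executor was already decommissioning: record that reason
        // immediately instead of waiting for a cluster-manager status event.
        val reason = Decommissioned(s"Executor $id disconnected while decommissioning")
        removed(id) = reason
        reason
      } else {
        // Otherwise only disable it and let status events decide the rest.
        disabled += id
        AwaitingClusterManager
      }
    }

  def isDisabled(executorId: String): Boolean = disabled.contains(executorId)
  def removedReason(executorId: String): Option[LossReason] = removed.get(executorId)
}
```

In this model, an executor that was marked as decommissioning gets a specific loss reason as soon as it disconnects, while any other executor is only disabled, which mirrors the behavior that existed before the change.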
