anishshri-db commented on PR #42504:
URL: https://github.com/apache/spark/pull/42504#issuecomment-1684519240

   @JoshRosen - I addressed your comments. I think your proposal seems fine 
too. At the time we throw this TaskKilled exception, the context should be 
non-null and should have the reason set, even though interruptThread is passed 
as false. The issue of whether the plugins will block on I/O remains. Perhaps 
we should move that within this `try`, after the check for killIfInterrupted? 
In any case, the issues we have seen have been with threads that actually run 
the task execution and get blocked on some network I/O (remote RPC or 
otherwise) with timeouts that are effectively larger than the reaper timeout, 
causing us to block task slots and eventually force an executor JVM kill. The 
majority of those cases should be handled by your proposed fix, I believe. Let 
me know what you think. Thanks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

