GitHub user srowen commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-161111303
But isn't the scenario here that the user app isn't done, because it
spawned non-daemon threads that are doing something? I agree it's not good
practice, but if apps avoided doing this entirely we'd have no problem to begin
with. The question is what to do if such a user thread does exist and runs
long.
Killing it immediately could cause problems, and that's not merely
theoretical: imagine persisting data to disk or writing to a socket and failing
to write all the bytes. Not killing it, of course, opens the possibility that the
thread never stops at all. Which is worse? The long-running user thread is the
app's "fault" and is pretty easy to debug by looking at a stack dump. On the
other hand, killing threads straight away could cause problems for an otherwise
reasonably behaved app. (Also consider that this non-daemon thread could be in
library code.)
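
To illustrate the "easy to debug from a stack dump" point, here is a rough sketch of listing the live non-daemon threads from inside the JVM. It is plain Java, not anything in Spark; the class name `UserThreadDump`, the method `describeUserThreads`, and the thread name `slow-writer` are made up for the example. In practice `jstack` on the driver process gives the same information.

```java
import java.util.Map;

// Hypothetical helper, for illustration only: not part of Spark.
final class UserThreadDump {

    // Builds a text dump of every live non-daemon thread and its stack frames.
    static String describeUserThreads() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            if (t.isDaemon()) {
                continue; // daemon threads do not keep the JVM alive
            }
            out.append(t.getName())
               .append(" (state ").append(t.getState()).append(")\n");
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A stand-in for a user thread that outlives the app's main method.
        Thread writer = new Thread(() -> {
            try {
                Thread.sleep(2_000);
            } catch (InterruptedException e) {
                // interrupted: just exit
            }
        }, "slow-writer");
        writer.start();

        System.out.print(describeUserThreads());
        writer.interrupt(); // let the example terminate promptly
    }
}
```

The output names whichever thread is holding the process open and shows where it is blocked, which is usually enough to tell an app bug from a library thread.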
Maybe some kind of timeout mostly mitigates the issue, but then I think
it's just trying to save a fairly clearly misbehaving app from itself at a
non-trivial cost.
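
For concreteness, the timeout idea could look roughly like the sketch below: give the remaining user threads a bounded grace period, then decide what to do with whatever is left. This is only an illustration in plain Java, not what the PR implements; `GracePeriod`, `awaitThreads`, the 500 ms value, and the thread names are all assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only: not part of Spark.
final class GracePeriod {

    // Waits up to timeoutMillis in total for the given threads to finish and
    // returns the ones that are still alive afterwards.
    static List<Thread> awaitThreads(List<Thread> threads, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        for (Thread t : threads) {
            long remainingMillis =
                    TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
            if (remainingMillis <= 0) {
                break; // grace period used up; join(0) would wait forever
            }
            try {
                t.join(remainingMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt
                break;
            }
        }
        List<Thread> stillRunning = new ArrayList<>();
        for (Thread t : threads) {
            if (t.isAlive()) {
                stillRunning.add(t);
            }
        }
        return stillRunning;
    }

    public static void main(String[] args) {
        Thread quick = new Thread(() -> { /* finishes at once */ }, "quick-writer");
        Thread stuck = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // interrupted: exit
            }
        }, "stuck-worker");
        quick.start();
        stuck.start();

        for (Thread t : awaitThreads(Arrays.asList(quick, stuck), 500)) {
            System.err.println("Still running after grace period: " + t.getName());
            t.interrupt(); // a cooperative nudge; a real policy might exit instead
        }
    }
}
```

Note that the sketch only moves the question: a thread that ignores the interrupt still needs a forced exit or an admin, and a thread that was mid-write when the grace period ran out is cut off just the same.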
You're always going to have the possibility of a stuck process (deadlock,
infinite loop in a driver, etc.), and an admin needs to be able to kill it
when that happens.