[ 
https://issues.apache.org/jira/browse/SPARK-20843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025036#comment-16025036
 ] 

Michael Allman commented on SPARK-20843:
----------------------------------------

[~rxin] I'd like to bump this to "Critical". This is a disruptive, 
potentially dangerous change for Spark Streaming apps (among others). We could 
not tolerate this behavior in our production environment, and it caught us off 
guard during our prod migration to Spark 2.1.

I think this timeout should be configurable per app (as a driver config param), 
but I couldn't find a way to do that. In our case, we modified our source build 
to set the timeout to `Int.MaxValue`, effectively reverting this change. 
The best PR I could offer at this point would therefore do the same.

I am also concerned that this behavior varies with the version of the 
underlying JDK. Specifically, it will not manifest on Java 7 but will on 
Java 8+. IMO, users who upgrade their Java runtimes should not have to expect 
this kind of change in their Spark apps' behavior.

Thank you.

> Cannot gracefully kill drivers which take longer than 10 seconds to die
> -----------------------------------------------------------------------
>
>                 Key: SPARK-20843
>                 URL: https://issues.apache.org/jira/browse/SPARK-20843
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.1.1
>            Reporter: Michael Allman
>              Labels: regression
>
> Commit 
> https://github.com/apache/spark/commit/1c9a386c6b6812a3931f3fb0004249894a01f657
>  changed the behavior of driver process termination. Whereas before 
> `Process.destroyForcibly` was never called, now it is called (on Java VMs 
> supporting that API) if the driver process does not die within 10 seconds.
> This prevents apps which take longer than 10 seconds to shut down gracefully 
> from shutting down gracefully. For example, streaming apps with a large batch 
> duration (say, 30 seconds+) can take minutes to shut down.
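
The termination behavior described above, with the grace period turned into a 
parameter as the comment suggests, might look roughly like the sketch below. 
This is not Spark's code: the class name `GracefulTerminator`, the method 
`terminate`, and the `graceMs` parameter are all hypothetical, and only the 
standard `java.lang.Process` API (Java 8+) is assumed.

```java
import java.util.OptionalInt;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; not part of Spark.
final class GracefulTerminator {

    /**
     * Asks the process to exit, waits up to graceMs, and only then
     * escalates to a forced kill. Returns the exit code if the process
     * ended, or an empty OptionalInt if it is still alive afterwards.
     */
    static OptionalInt terminate(Process process, long graceMs)
            throws InterruptedException {
        // Polite request first; whether this is a normal or forced
        // termination is implementation dependent per the JDK docs.
        process.destroy();
        if (process.waitFor(graceMs, TimeUnit.MILLISECONDS)) {
            return OptionalInt.of(process.exitValue());
        }

        // Grace period expired: escalate (this call requires Java 8+).
        process.destroyForcibly();
        if (process.waitFor(graceMs, TimeUnit.MILLISECONDS)) {
            return OptionalInt.of(process.exitValue());
        }
        return OptionalInt.empty();
    }
}
```

With a hard-coded `graceMs` of 10 seconds, an app that needs minutes to drain 
would be killed forcibly. Passing a very large value, as the comment describes 
doing with `Int.MaxValue`, effectively disables the forced kill in this sketch.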



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
