Philipp Dallig created ZEPPELIN-5225:
----------------------------------------

             Summary: RemoteInterpreterManagedProcess soft shutdown and abstraction
                 Key: ZEPPELIN-5225
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-5225
             Project: Zeppelin
          Issue Type: Improvement
          Components: Core
    Affects Versions: 0.9.1, 0.10.0
            Reporter: Philipp Dallig
            Assignee: Philipp Dallig


During development I noticed many shutdown errors from remote interpreters.
{code}
2021-01-25T10:43:33.2749004Z  WARN [2021-01-25 10:43:33,274] ({Exec Default Executor} ProcessLauncher.java[onProcessFailed]:134) - Process with cmd [/home/runner/work/zeppelin/zeppelin/zeppelin-zengine/../bin/interpreter.sh, -d, /home/runner/work/zeppelin/zeppelin/zeppelin-zengine/../interpreter_NotebookTest/test, -c, 10.1.0.4, -p, 40207, -r, :, -i, test-isolated-2FYUBYUH2-2021-01-25_10-43-31, -l, /home/runner/work/zeppelin/zeppelin/zeppelin-zengine/../local-repo/test, -g, test] is failed due to
2021-01-25T10:43:33.2755177Z org.apache.commons.exec.ExecuteException: Process exited with an error: 143 (Exit value: 143)
2021-01-25T10:43:33.2757145Z 	at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:404)
2021-01-25T10:43:33.2759258Z 	at org.apache.commons.exec.DefaultExecutor.access$200(DefaultExecutor.java:48)
2021-01-25T10:43:33.2760971Z 	at org.apache.commons.exec.DefaultExecutor$1.run(DefaultExecutor.java:200)
2021-01-25T10:43:33.2762144Z 	at java.lang.Thread.run(Thread.java:748)
{code}
The Zeppelin server does not wait for the remote interpreter to shut down cleanly; it kills the process instead. The relevant code is located in [RemoteInterpreterManagedProcess|https://github.com/apache/zeppelin/blob/d63289a47a9ed26098ad93cb62ae1660bb937182/zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterManagedProcess.java#L138-L157].
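
To illustrate what a soft shutdown could look like, here is a minimal sketch. It is not the code from RemoteInterpreterManagedProcess. The class name {{SoftShutdownSketch}}, the {{shutdownRequest}} callback (standing in for whatever call asks the interpreter to exit) and the grace period are assumptions made for this example. The idea is to ask first, wait a bounded time for the process to exit on its own, and only terminate it if it is still alive. For context, exit value 143 conventionally means 128 + 15, i.e. the process was ended by SIGTERM.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; not part of Zeppelin.
final class SoftShutdownSketch {

    private SoftShutdownSketch() {}

    /**
     * Requests a shutdown, waits up to graceMillis for the process to exit on
     * its own, and only then terminates it.
     *
     * @return true if the process exited by itself within the grace period
     */
    static boolean stop(Process process, Runnable shutdownRequest, long graceMillis)
            throws InterruptedException {
        try {
            // Assumption: this callback asks the interpreter to exit, e.g. via an RPC.
            shutdownRequest.run();
        } catch (RuntimeException e) {
            // The request failed; still give the process the grace period below.
        }
        if (process.waitFor(graceMillis, TimeUnit.MILLISECONDS)) {
            return true; // clean exit, no kill needed
        }
        // Still alive: send a termination request (typically SIGTERM on Unix-like systems).
        process.destroy();
        if (!process.waitFor(graceMillis, TimeUnit.MILLISECONDS)) {
            // Last resort: forcible kill, then wait until the process is gone.
            process.destroyForcibly();
            process.waitFor();
        }
        return false;
    }
}
```

A caller would use it as {{SoftShutdownSketch.stop(process, () -> client.shutdown(), 5000)}}, where {{client.shutdown()}} is a placeholder for the real shutdown request. A {{false}} return value tells the caller that the process had to be terminated, which is the case worth logging as a warning.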

We should also make the RemoteInterpreterManagedProcess class abstract and move the exec code into a new class, because the class currently contains a lot of code that is only needed when the Zeppelin server controls a remote interpreter via exec. Meanwhile, many remote interpreter processes are started through API calls to a cluster manager (e.g. K8s, YARN, Docker) and cannot reuse the code in the RemoteInterpreterManagedProcess class.
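
The proposed split could be sketched roughly as follows. All names here ({{ManagedProcessSketch}}, {{ExecProcessSketch}} and the three abstract methods) are hypothetical and chosen only for this example; the actual class layout is up to the implementation. The shared shutdown logic lives in an abstract base class, and only the exec-specific parts touch {{java.lang.Process}}, so a cluster-manager-based launcher could reuse the base class by implementing the same three methods against its own API.

```java
// Hypothetical names for illustration; not Zeppelin's actual classes.
abstract class ManagedProcessSketch {

    /** Shared soft-shutdown logic, independent of how the process was launched. */
    final boolean stop(long graceMillis) throws InterruptedException {
        requestShutdown();
        long deadline = System.nanoTime() + graceMillis * 1_000_000L;
        // Poll until the process is gone or the grace period is over.
        while (isRunning() && System.nanoTime() < deadline) {
            Thread.sleep(20);
        }
        if (!isRunning()) {
            return true; // exited on its own
        }
        forceStop();
        return false;
    }

    /** Asks the interpreter to exit; launcher-independent in principle. */
    protected abstract void requestShutdown();

    /** Reports whether the launched interpreter is still alive. */
    protected abstract boolean isRunning();

    /** Terminates the interpreter when it ignores the shutdown request. */
    protected abstract void forceStop();
}

/** Exec-specific part: everything that needs a local java.lang.Process. */
final class ExecProcessSketch extends ManagedProcessSketch {
    private final Process process;
    private final Runnable shutdownRequest;

    ExecProcessSketch(Process process, Runnable shutdownRequest) {
        this.process = process;
        this.shutdownRequest = shutdownRequest;
    }

    @Override
    protected void requestShutdown() {
        shutdownRequest.run();
    }

    @Override
    protected boolean isRunning() {
        return process.isAlive();
    }

    @Override
    protected void forceStop() {
        process.destroyForcibly();
    }
}
```

The template method {{stop}} is {{final}} on purpose in this sketch, so that every launcher type gets the same wait-then-terminate behaviour and only supplies the launcher-specific operations.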