Philipp Dallig created ZEPPELIN-5225:
----------------------------------------

             Summary: RemoteInterpreterManagedProcess soft shutdown and abstraction
                 Key: ZEPPELIN-5225
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-5225
             Project: Zeppelin
          Issue Type: Improvement
          Components: Core
    Affects Versions: 0.9.1, 0.10.0
            Reporter: Philipp Dallig
            Assignee: Philipp Dallig


During development I noticed many shutdown errors from remote interpreters.
{code}
2021-01-25T10:43:33.2749004Z  WARN [2021-01-25 10:43:33,274] ({Exec Default Executor} ProcessLauncher.java[onProcessFailed]:134) - Process with cmd [/home/runner/work/zeppelin/zeppelin/zeppelin-zengine/../bin/interpreter.sh, -d, /home/runner/work/zeppelin/zeppelin/zeppelin-zengine/../interpreter_NotebookTest/test, -c, 10.1.0.4, -p, 40207, -r, :, -i, test-isolated-2FYUBYUH2-2021-01-25_10-43-31, -l, /home/runner/work/zeppelin/zeppelin/zeppelin-zengine/../local-repo/test, -g, test] is failed due to
2021-01-25T10:43:33.2755177Z org.apache.commons.exec.ExecuteException: Process exited with an error: 143 (Exit value: 143)
2021-01-25T10:43:33.2757145Z    at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:404)
2021-01-25T10:43:33.2759258Z    at org.apache.commons.exec.DefaultExecutor.access$200(DefaultExecutor.java:48)
2021-01-25T10:43:33.2760971Z    at org.apache.commons.exec.DefaultExecutor$1.run(DefaultExecutor.java:200)
2021-01-25T10:43:33.2762144Z    at java.lang.Thread.run(Thread.java:748)
{code}
The Zeppelin server does not wait for the remote interpreter to shut down 
cleanly; it terminates the process instead. The exit value 143 in the log above 
(128 + 15) is the conventional exit status of a process ended by SIGTERM. The 
relevant code is located in 
[RemoteInterpreterManagedProcess|https://github.com/apache/zeppelin/blob/d63289a47a9ed26098ad93cb62ae1660bb937182/zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterManagedProcess.java#L138-L157].
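
A soft shutdown could look roughly like the sketch below. It is only an illustration of the idea, not the code in the linked class: the class name SoftShutdownSketch, the requestShutdown callback (a stand-in for whatever call asks the interpreter to stop itself) and the graceMillis parameter are all assumptions. The process is asked to stop, given a grace period to exit on its own, and only then terminated.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Zeppelin: staged shutdown of a child process.
final class SoftShutdownSketch {

    /**
     * Returns true if the process exited by itself within the grace period,
     * false if it had to be terminated by the caller.
     */
    static boolean stop(Process process, Runnable requestShutdown, long graceMillis) {
        try {
            // Step 1: ask the process to shut itself down (assumed callback).
            requestShutdown.run();
            if (process.waitFor(graceMillis, TimeUnit.MILLISECONDS)) {
                return true; // clean exit, nothing to kill
            }
            // Step 2: grace period elapsed, request normal termination.
            process.destroy();
            if (!process.waitFor(graceMillis, TimeUnit.MILLISECONDS)) {
                // Step 3: last resort, forcible termination.
                process.destroyForcibly().waitFor();
            }
            return false;
        } catch (InterruptedException e) {
            // Preserve the interrupt flag and make sure no child is left behind.
            Thread.currentThread().interrupt();
            process.destroyForcibly();
            return false;
        }
    }
}
```

With this shape, the caller can log a warning only when the method returns false, so a clean interpreter exit no longer shows up as an error.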

We should also make the RemoteInterpreterManagedProcess class abstract and move 
the exec code to a new class, because RemoteInterpreterManagedProcess currently 
contains a lot of code that is only needed when the Zeppelin server controls 
a remote interpreter via exec.
Meanwhile, many remote interpreter processes are started by API calls to a 
cluster manager (e.g. K8s, YARN, Docker) and cannot use the code in the 
RemoteInterpreterManagedProcess class.
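
One possible shape for this split is sketched below. All class and method names here are hypothetical and chosen for illustration only; the actual refactoring may look different. The base class keeps what every managed interpreter process shares (host and port), while the exec-specific code lives in one subclass and a cluster-managed launcher provides its own.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical base class: state and contract shared by all launch mechanisms.
abstract class ManagedProcessSketch {
    private final String host;
    private final int port;

    ManagedProcessSketch(String host, int port) {
        this.host = host;
        this.port = port;
    }

    String getHost() { return host; }
    int getPort() { return port; }

    abstract void start() throws IOException;
    abstract void stop();
    abstract boolean isRunning();
}

// Exec-specific code: only needed when the server launches a local process.
final class ExecProcessSketch extends ManagedProcessSketch {
    private final List<String> command;
    private Process process;

    ExecProcessSketch(String host, int port, List<String> command) {
        super(host, port);
        this.command = command;
    }

    @Override void start() throws IOException {
        process = new ProcessBuilder(command).inheritIO().start();
    }

    @Override void stop() {
        if (process != null) {
            process.destroy();
        }
    }

    @Override boolean isRunning() {
        return process != null && process.isAlive();
    }
}

// Cluster-managed variant: the flag stands in for API calls to a cluster manager.
final class ClusterProcessSketch extends ManagedProcessSketch {
    private boolean submitted;

    ClusterProcessSketch(String host, int port) {
        super(host, port);
    }

    @Override void start() { submitted = true; }   // placeholder for a submit call
    @Override void stop() { submitted = false; }   // placeholder for a delete call
    @Override boolean isRunning() { return submitted; }
}
```

Code that only needs the host, the port and the lifecycle methods can then depend on the base type and stay unaware of how the interpreter was launched.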


