Github user neil4dong commented on the issue:
https://github.com/apache/zeppelin/pull/1030
@AhyoungRyu It looks like I am comment on a wrong place.
But I am still have some problem about "spark interpreter group", When
multiple notebook use timed-scheduling sharing one spark interpreter instance
, because of the interpreter use REPL to process the notebook ,and REPL only
allow one task in the same time .
So the code looks like this :
```
public InterpreterResult interpret(String[] lines, InterpreterContext
context) {
synchronized (this) {
z.setGui(context.getGui());
sc.setJobGroup(getJobGroup(context), "Zeppelin", false);
InterpreterResult r = interpretInput(lines, context);
sc.clearJobGroup();
return r;
}
}
```
Then the blocked processing will effect each other , eventually leading the
interpreter jvm died in somehow without any suspicious output in the
interpreter log.
Meanwhile the zeppelin server is constantly checking the status of the
interpreter by calling `RemoteInterpreterService#getStatus().` Because the
interpreter is already died , the zeppelin server always found :
```
ERROR [2016-12-26 15:39:00,715] ({pool-1-thread-84}
RemoteScheduler.java[getStatus]:255) - Can't get status information
org.apache.zeppelin.interpreter.InterpreterException:
org.apache.thrift.transport.TTransportException: java.net.ConnectException:
Connection refused
at
org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:53)
at
org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:37)
at
org.apache.commons.pool2.BasePooledObjectFactory.makeObject(BasePooledObjectFactory.java:60)
at
org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:861)
at
org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:435)
at
org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363)
at
org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.getClient(RemoteInterpreterProcess.java:189)
at
org.apache.zeppelin.scheduler.RemoteScheduler$JobStatusPoller.getStatus(RemoteScheduler.java:253)
at
org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:341)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TTransportException:
java.net.ConnectException: Connection refused
at org.apache.thrift.transport.TSocket.open(TSocket.java:187)
at
org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:51)
... 15 more
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at
java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.thrift.transport.TSocket.open(TSocket.java:182)
... 16 more
ERROR [2016-12-26 15:39:00,715] ({pool-1-thread-84}
NotebookServer.java[afterStatusChange]:1145) - Error
org.apache.zeppelin.interpreter.InterpreterException:
org.apache.zeppelin.interpreter.InterpreterException:
org.apache.thrift.transport.TTransportException: java.net.ConnectException:
Connection refused
at
org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:250)
at
org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:94)
at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:279)
at org.apache.zeppelin.scheduler.Job.run(Job.java:176)
at
org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:328)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.zeppelin.interpreter.InterpreterException:
org.apache.thrift.transport.TTransportException: java.net.ConnectException:
Connection refused
at
org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:53)
at
org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:37)
at
org.apache.commons.pool2.BasePooledObjectFactory.makeObject(BasePooledObjectFactory.java:60)
at
org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:861)
at
org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:435)
at
org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363)
at
org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.getClient(RemoteInterpreterProcess.java:189)
at
org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:248)
... 11 more
Caused by: org.apache.thrift.transport.TTransportException:
java.net.ConnectException: Connection refused
at org.apache.thrift.transport.TSocket.open(TSocket.java:187)
at
org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:51)
... 18 more
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at
java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.thrift.transport.TSocket.open(TSocket.java:182)
... 19 more
```
**Should we need to find out what lead the interpreter died?**
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---