GitHub user HeartSaVioR opened a pull request:

    https://github.com/apache/incubator-zeppelin/pull/575

    ZEPPELIN-534 Discard broken thrift Client instance

    ### What is this PR for?
    
    Zeppelin has been reusing broken Thrift client instances.
    Since we can catch TException, we can discard any client instance that throws TException instead of returning it to the client pool.
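
As a rough illustration of the idea (not the code in this patch), the sketch below uses a toy pool and a fake client, both hypothetical. `IOException` stands in for Thrift's `TException`, and `SimplePool.invalidate` plays the role that `invalidateObject` plays in commons-pool2's `GenericObjectPool`. A client that throws is dropped; a healthy one goes back to the pool.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

class DiscardBrokenClientSketch {

    /** Hypothetical stand-in for a Thrift client; IOException plays the role of TException. */
    static class FakeClient {
        private boolean connected = true;

        void kill() { connected = false; }

        String interpret(String code) throws IOException {
            if (!connected) {
                throw new IOException("Connection reset");
            }
            return "ok: " + code;
        }
    }

    /** Toy pool: a borrowed client is either given back for reuse or invalidated. */
    static class SimplePool {
        private final Deque<FakeClient> idle = new ArrayDeque<>();
        private int created = 0;

        FakeClient borrow() {
            FakeClient client = idle.poll();
            if (client == null) {          // nothing idle: open a fresh client
                created++;
                client = new FakeClient();
            }
            return client;
        }

        void giveBack(FakeClient client) { idle.push(client); }

        // The client is closed and never re-enters the idle queue.
        void invalidate(FakeClient client) { client.kill(); }

        int idleCount() { return idle.size(); }
        int createdCount() { return created; }
    }

    /** Runs one call; a client that threw is discarded, a healthy one is reused. */
    static String run(SimplePool pool, String code) throws IOException {
        FakeClient client = pool.borrow();
        boolean broken = false;
        try {
            return client.interpret(code);
        } catch (IOException e) {
            broken = true;                 // treat the client as broken
            throw e;
        } finally {
            if (broken) {
                pool.invalidate(client);
            } else {
                pool.giveBack(client);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        SimplePool pool = new SimplePool();
        System.out.println(run(pool, "1 + 1"));

        // Simulate the remote side dying while the client sits idle in the pool.
        FakeClient pooled = pool.borrow();
        pooled.kill();
        pool.giveBack(pooled);

        try {
            run(pool, "1 + 1");
        } catch (IOException e) {
            System.out.println("failed: " + e.getMessage());
        }
        System.out.println(run(pool, "1 + 1"));
        System.out.println("clients created: " + pool.createdCount());
    }
}
```

In this sketch the second call fails once and the broken client is dropped, so the third call borrows a fresh client. If the failed client were given back instead, every later call would hit the same dead connection.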
    
    ### What type of PR is it?
    Bug Fix | Improvement
    
    ### Todos
    
    ### Is there a relevant Jira issue?
    
    https://issues.apache.org/jira/browse/ZEPPELIN-534
    
    ### How should this be tested?
    
    1. Run a notebook that uses the Spark interpreter.
    2. Kill the Spark interpreter process with `kill -9`.
    3. Run a notebook that uses the killed interpreter.
    4. Run the same notebook again and check that the error log has changed.
    
    Output of step 3:
    ```
    java.net.SocketException: Connection reset
        at java.net.SocketInputStream.read(SocketInputStream.java:196)
        at java.net.SocketInputStream.read(SocketInputStream.java:122)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
        at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
        at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_interpret(RemoteInterpreterService.java:220)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.interpret(RemoteInterpreterService.java:205)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:225)
        at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
        at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:211)
        at org.apache.zeppelin.scheduler.Job.run(Job.java:169)
        at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:322)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
    ```
    
    Output of step 4:
    ```
    java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:579)
        at org.apache.thrift.transport.TSocket.open(TSocket.java:182)
        at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:51)
        at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:37)
        at org.apache.commons.pool2.BasePooledObjectFactory.makeObject(BasePooledObjectFactory.java:60)
        at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:861)
        at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:435)
        at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.getClient(RemoteInterpreterProcess.java:140)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:205)
        at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
        at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:211)
        at org.apache.zeppelin.scheduler.Job.run(Job.java:169)
        at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:322)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
    ```
    
    The result may differ depending on how many client instances the pool creates initially.
    Before this change, the output of step 4 would be `broken pipe`, which means the previous client instance was not discarded.
    
    ### Screenshots (if appropriate)
    
    ### Questions:
    * Do the license files need an update? (No)
    * Are there breaking changes for older versions? (No)
    * Does this need documentation? (No)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HeartSaVioR/incubator-zeppelin ZEPPELIN-534

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-zeppelin/pull/575.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #575
    
----
commit a84d0ebbcf2ca2d355582956f90213a014fe2059
Author: Jungtaek Lim <[email protected]>
Date:   2015-12-28T07:00:49Z

    ZEPPELIN-534 Discard broken thrift Client instance
    
    * We can treat client as broken when TException occurs

----

