Thanks for the reply. Since the two machines are on the same LAN (no firewall in between), does the Drill JDBC driver (or the drill-embedded server) have any timeouts that can be increased?
Interestingly, the client side (JDBC) doesn't notice that the server side (Drill-embedded) has disconnected.

---Paul

-----Original Message-----
From: Nirav Shah [mailto:[email protected]]
Sent: Tuesday, February 09, 2016 11:38 AM
To: [email protected]
Subject: Re: Help with error message...

From the logs it looks like a network drop between nodes. If it fails at an exact time, say 10 min, then check the firewall settings.

On Feb 10, 2016 12:27 AM, "Paul Friedman" <[email protected]> wrote:
> Hello...
>
> I'm executing a long-running Drill (1.4) query (4-10mins) called via JDBC from Talend and sometimes I'm seeing an error stack like this (see below)
>
> The query is a select statement with an order by against a directory of Parquet files which were produced by Spark. Probably half the time it succeeds and returns the expected results, but often it's erroring out as below.
>
> Can you help with any insights?
>
> Thanks in advance.
>
> ---Paul
>
> ...
> 2016-02-08 16:47:47,275 [2946cbe3-e73d-2ed4-da60-76c1bd799372:frag:1:0] INFO o.a.d.e.w.fragment.FragmentExecutor - 2946cbe3-e73d-2ed4-da60-76c1bd799372:1:0: State change requested RUNNING --> FINISHED
> 2016-02-08 16:47:47,276 [2946cbe3-e73d-2ed4-da60-76c1bd799372:frag:1:0] INFO o.a.d.e.w.f.FragmentStatusReporter - 2946cbe3-e73d-2ed4-da60-76c1bd799372:1:0: State to report: FINISHED
> 2016-02-08 16:48:25,496 [UserServer-1] INFO o.a.d.e.w.fragment.FragmentExecutor - 2946cbe3-e73d-2ed4-da60-76c1bd799372:0:0: State change requested RUNNING --> FAILED
> 2016-02-08 16:48:25,778 [2946cbe3-e73d-2ed4-da60-76c1bd799372:frag:0:0] INFO o.a.d.e.w.fragment.FragmentExecutor - 2946cbe3-e73d-2ed4-da60-76c1bd799372:0:0: State change requested FAILED --> FAILED
> 2016-02-08 16:48:25,779 [UserServer-1] INFO o.a.d.e.w.fragment.FragmentExecutor - 2946cbe3-e73d-2ed4-da60-76c1bd799372:0:0: State change requested FAILED --> FAILED
> 2016-02-08 16:48:25,779 [CONTROL-rpc-event-queue] INFO o.a.d.e.w.fragment.FragmentExecutor - 2946cbe3-e73d-2ed4-da60-76c1bd799372:0:0: State change requested FAILED --> CANCELLATION_REQUESTED
> 2016-02-08 16:48:25,779 [CONTROL-rpc-event-queue] WARN o.a.d.e.w.fragment.FragmentExecutor - 2946cbe3-e73d-2ed4-da60-76c1bd799372:0:0: Ignoring unexpected state transition FAILED --> CANCELLATION_REQUESTED
> 2016-02-08 16:48:25,779 [2946cbe3-e73d-2ed4-da60-76c1bd799372:frag:0:0] INFO o.a.d.e.w.fragment.FragmentExecutor - 2946cbe3-e73d-2ed4-da60-76c1bd799372:0:0: State change requested FAILED --> FAILED
> 2016-02-08 16:48:25,780 [2946cbe3-e73d-2ed4-da60-76c1bd799372:frag:0:0] INFO o.a.d.e.w.fragment.FragmentExecutor - 2946cbe3-e73d-2ed4-da60-76c1bd799372:0:0: State change requested FAILED --> FINISHED
> 2016-02-08 16:48:25,781 [UserServer-1] WARN o.a.d.exec.rpc.RpcExceptionHandler - Exception occurred with closed channel. Connection: /172.20.20.154:31010 <--> /172.20.20.157:64101 (user client)
> java.nio.channels.ClosedChannelException: null
> 2016-02-08 16:48:25,783 [2946cbe3-e73d-2ed4-da60-76c1bd799372:frag:0:0] ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: ChannelClosedException: Channel closed /172.20.20.154:31010 <--> /172.20.20.157:64101.
>
> Fragment 0:0
>
> [Error Id: 2f075631-fb49-4feb-b39d-cbe89083a2ee on chai.dev.streetlightdata.com:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: ChannelClosedException: Channel closed /172.20.20.154:31010 <--> /172.20.20.157:64101.
>
> Fragment 0:0
>
> [Error Id: 2f075631-fb49-4feb-b39d-cbe89083a2ee on chai.dev.streetlightdata.com:31010]
>     at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534) ~[drill-common-1.4.0.jar:1.4.0]
>     at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:321) [drill-java-exec-1.4.0.jar:1.4.0]
>     at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:184) [drill-java-exec-1.4.0.jar:1.4.0]
>     at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:290) [drill-java-exec-1.4.0.jar:1.4.0]
>     at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.4.0.jar:1.4.0]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_66]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_66]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66]
> Caused by: org.apache.drill.exec.rpc.ChannelClosedException: Channel closed /172.20.20.154:31010 <--> /172.20.20.157:64101.
>     at org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:175) ~[drill-rpc-1.4.0.jar:1.4.0]
>     at org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:151) ~[drill-rpc-1.4.0.jar:1.4.0]
>     at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680) ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603) ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563) ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406) ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82) ~[netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:943) ~[netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:592) ~[netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:584) ~[netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.closeOnRead(AbstractEpollStreamChannel.java:409) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>     at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:647) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>     at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollRdHupReady(AbstractEpollStreamChannel.java:573) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>     at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:315) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>     at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:250) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>     at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
>     ... 1 common frames omitted
> 2016-02-08 16:48:25,785 [drill-executor-42] WARN o.a.d.exec.rpc.control.WorkEventBus - Fragment 2946cbe3-e73d-2ed4-da60-76c1bd799372:0:0 not found in the work bus.
> 2016-02-08 16:48:25,810 [CONTROL-rpc-event-queue] WARN o.a.drill.exec.work.foreman.Foreman - Dropping request to move to COMPLETED state as query is already at CANCELED state (which is terminal).
> 2016-02-08 16:48:25,811 [UserServer-1] INFO o.a.drill.exec.work.foreman.Foreman - Failure while trying communicate query result to initiating client. This would happen if a client is disconnected before response notice can be sent.
> org.apache.drill.exec.rpc.ChannelClosedException: null
>     at org.apache.drill.exec.rpc.CoordinationQueue$RpcListener.operationComplete(CoordinationQueue.java:89) [drill-rpc-1.4.0.jar:1.4.0]
>     at org.apache.drill.exec.rpc.CoordinationQueue$RpcListener.operationComplete(CoordinationQueue.java:67) [drill-rpc-1.4.0.jar:1.4.0]
>     at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680) [netty-common-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603) [netty-common-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563) [netty-common-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424) [netty-common-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:788) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:689) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1114) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:705) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:32) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:980) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1032) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:965) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357) [netty-common-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:254) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>     at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) [netty-common-4.0.27.Final.jar:4.0.27.Final]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66]
> 2016-02-08 16:48:25,812 [UserServer-1] WARN o.a.drill.exec.work.foreman.Foreman - Dropping request to move to FAILED state as query is already at CANCELED state (which is terminal).
>
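
One way to test the "fails at an exact time" theory from the reply is to compare timestamps across several failing runs. Below is a minimal Python sketch, not anything shipped with Drill: the function names are hypothetical, and it assumes the raw drillbit log file with one record per line rather than the mail-wrapped copy quoted above. It only measures from the first state change it sees to the first transition to FAILED, so it needs the log from the start of the query to say anything about total elapsed time.

```python
import re
from datetime import datetime

# Matches the leading timestamp and the "OLD --> NEW" transition in a record.
RECORD = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}).*?"
    r"State change requested (\w+) --> (\w+)"
)


def state_changes(log_text):
    """Return (timestamp, old_state, new_state) for each state-change record."""
    changes = []
    for line in log_text.splitlines():
        m = RECORD.search(line)
        if m:
            ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S,%f")
            changes.append((ts, m.group(2), m.group(3)))
    return changes


def seconds_until_failure(log_text):
    """Seconds from the first state change to the first change to FAILED, or None."""
    changes = state_changes(log_text)
    failed = [ts for ts, _, new in changes if new == "FAILED"]
    if not changes or not failed:
        return None
    return (failed[0] - changes[0][0]).total_seconds()


# Two records taken from the log quoted above, unwrapped to one line each.
SAMPLE = """\
2016-02-08 16:47:47,275 [2946cbe3-e73d-2ed4-da60-76c1bd799372:frag:1:0] INFO o.a.d.e.w.fragment.FragmentExecutor - 2946cbe3-e73d-2ed4-da60-76c1bd799372:1:0: State change requested RUNNING --> FINISHED
2016-02-08 16:48:25,496 [UserServer-1] INFO o.a.d.e.w.fragment.FragmentExecutor - 2946cbe3-e73d-2ed4-da60-76c1bd799372:0:0: State change requested RUNNING --> FAILED
"""

print(seconds_until_failure(SAMPLE))
```

On the two sample records this reports a gap of roughly 38 seconds between the leaf fragment finishing and the root fragment failing. If the same calculation over complete logs from several failed runs gives nearly the same elapsed time each run, that would point toward a timeout somewhere; widely varying times would point elsewhere.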
