[ 
https://issues.apache.org/jira/browse/DRILL-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994199#comment-14994199
 ] 

Rahul Challapalli commented on DRILL-3480:
------------------------------------------

We see this issue when we run the extended functional suite with a concurrency 
of 10. Now I took the failing query and all other queries (another 20 of them) 
which were supposedly running when this query was executing and ran this new 
set with a concurrency of 5, 10, & 20. None of them reproduced the issue.

To me this suggests the successful queries which ran before the failing query 
are not cleaning up properly. However we have not seen such issues in the 
success path of a query in the past that I can think of. Could it be a 'limit' 
query which is not cleaning up? Any other thoughts?

Also the frequency of this issue has increased with the commit # 
e7db9dcacbc39c4797de1aa29b119a7428451dea

Note  : In our environment all the queries get submitted to the same drillbit

> Some tpcds queries fail with with timeout errors
> ------------------------------------------------
>
>                 Key: DRILL-3480
>                 URL: https://issues.apache.org/jira/browse/DRILL-3480
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>            Reporter: Krystal
>            Assignee: Hanifi Gunes
>            Priority: Critical
>             Fix For: 1.4.0
>
>
> Commit Id 9a85b2c
> Some failed queries contained the following errors:
> {code}
> Failed while running cleanup query. Not returning connection to pool.
> java.lang.InterruptedException: sleep interrupted
>       at java.lang.Thread.sleep(Native Method)
>       at 
> org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:100)
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:744)
> Channel closed /10.10.104.85:59334 <--> /10.10.104.85:31010.
> {code}
> Others failed with error:
> {code}
> org.apache.drill.common.exceptions.UserRemoteException: CONNECTION ERROR: 
> Exceeded timeout (40000) while waiting send intermediate work fragments to 
> remote nodes. Sent 8 and only heard response back from 4 nodes.
> [Error Id: b85205b5-3134-4f90-aca8-7d67af04f3ed]
>       at 
> org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:118)
>       at 
> org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:111)
>       at 
> org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:47)
>       at 
> org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:32)
>       at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:61)
>       at 
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:233)
>       at 
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:205)
>       at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>       at 
> io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>       at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>       at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>       at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>       at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
>       at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>       at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>       at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>       at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>       at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>       at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>       at java.lang.Thread.run(Thread.java:744)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to