[ 
https://issues.apache.org/jira/browse/DRILL-4708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16255574#comment-16255574
 ] 

F Méthot commented on DRILL-4708:
---------------------------------

I'm glad someone is looking into this.
Using 1.10, on 200 nodes hadoop cluster, 3 threads per node, 8GB Heap, 20GB 
direct
We do get this issue daily when querying large dataset (multi TB of parquet). 
When the error happens, we re-submit the query and most of the time the queries 
goes through.
In some case, it is caused by a out of heap memory, in some other case it 
remained unexplained.

When we switched to 1.11, the error started to happens way too often, we had to 
switch back to 1.10. Haven't had time to investigate more.

> connection closed unexpectedly
> ------------------------------
>
>                 Key: DRILL-4708
>                 URL: https://issues.apache.org/jira/browse/DRILL-4708
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - RPC
>    Affects Versions: 1.7.0
>            Reporter: Chun Chang
>            Assignee: Karthikeyan Manivannan
>            Priority: Critical
>         Attachments: data.tgz
>
>
> Running DRILL functional automation, we often see query failed randomly due 
> to the following unexpected connection close error.
> {noformat}
> Execution Failures:
> /root/drillAutomation/framework/framework/resources/Functional/ctas/ctas_flatten/100000rows/filter5.q
> Query: 
> select * from dfs.ctas_flatten.`filter5_100000rows_ctas`
> Failed with exception
> java.sql.SQLException: CONNECTION ERROR: Connection /10.10.100.171:36185 <--> 
> drillats4.qa.lab/10.10.100.174:31010 (user client) closed unexpectedly. 
> Drillbit down?
> [Error Id: 3d5dad8e-80d0-4c7f-9012-013bf01ce2b7 ]
>       at 
> org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:247)
>       at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:321)
>       at 
> oadd.net.hydromatic.avatica.AvaticaResultSet.next(AvaticaResultSet.java:187)
>       at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:172)
>       at 
> org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:210)
>       at 
> org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:99)
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:744)
> Caused by: oadd.org.apache.drill.common.exceptions.UserException: CONNECTION 
> ERROR: Connection /10.10.100.171:36185 <--> 
> drillats4.qa.lab/10.10.100.174:31010 (user client) closed unexpectedly. 
> Drillbit down?
> [Error Id: 3d5dad8e-80d0-4c7f-9012-013bf01ce2b7 ]
>       at 
> oadd.org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543)
>       at 
> oadd.org.apache.drill.exec.rpc.user.QueryResultHandler$ChannelClosedHandler$1.operationComplete(QueryResultHandler.java:373)
>       at 
> oadd.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
>       at 
> oadd.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
>       at 
> oadd.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
>       at 
> oadd.io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
>       at 
> oadd.io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82)
>       at 
> oadd.io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:943)
>       at 
> oadd.io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:592)
>       at 
> oadd.io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:584)
>       at 
> oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.closeOnRead(AbstractNioByteChannel.java:71)
>       at 
> oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.handleReadException(AbstractNioByteChannel.java:89)
>       at 
> oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:162)
>       at 
> oadd.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>       at 
> oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>       at 
> oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>       at oadd.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>       at 
> oadd.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>       ... 1 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to