[ https://issues.apache.org/jira/browse/DRILL-2750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542810#comment-14542810 ]
Chris Westin commented on DRILL-2750:
-------------------------------------
I ran the rest of the unit tests derived from BaseTestQuery with progressively
smaller amounts of memory to see if there were any leaks. Some tests never
failed until the limit reached 8M, which seems to be too low to even start a
drillbit.
Here are the tests I ran, each with the lowest amount of memory it was run
with. This is generally where I stopped, because a lot of the test cases were
failing. Where results varied, I sometimes did multiple runs with different
amounts of memory.
TestAggNullable (16M)
TestAggregateFunctions (12M)
TestAggregateFunctionsQuery (8M)
TestAltSortQueries (16M, 12M)
TestBroadcast (16M, 12M)
TestBugFixes (16M)
TestCastEmptyStrings (12M)
TestCastFunctions (10M)
TestComplexToJson (16M, 12M)
TestComplexTypeReader (12M)
TestComplexTypeWriter (12M, 10M)
TestContextFunctions (12M)
TestConvertFunctions (12M)
TestCTAS (12M)
TestExampleQueries (16M)
TestExtendedTypes (12M)
TestFlatten (12M, 10M)
TestFunctionsQuery (12M)
TestHashAggr (12M)
TestHashJoinAdvanced (16M)
TestInfoSchema (16M, 24M)
TestInList (10M)
TestJoinNullable (16M, 32M, 64M, 128M)
TestJsonReader (10M)
TestLargeInClause (12M)
TestLimitWithExchanges (12M)
TestMergeJoinAdvanced (16M)
TestNestedComplexSchema (16M, 12M)
TestNewDateFunctions (10M)
TestNewSimpleRepeatedFunctions (10M)
TestOutOfMemoryOutcome (12M)
TestParquetComplex (16M)
TestRepeatedReaders (16M)
TestSchemaChange (16M, 12M)
TestSimpleCastFunctions (10M, 12M, 16M, 32M)
TestSort (16M)
TestStarQueries (16M)
TestSystemTable (12M)
TestTextJoin (16M)
TestTpchSingleMode (16M)
TestUnionAll (32M, 16M, 12M)
TestWithClause (16M)
One memory leak was found, in TestQueryOnLargeFile (16M): see DRILL-3063.
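
For reference, a minimal sketch of the kind of check meant by "leak" here: run
a test body under a small memory cap and see whether any bytes are still
outstanding afterwards. TrackingAllocator and LeakCheck are hypothetical names
used only for illustration; they are not Drill's allocator API, and the real
tests are driven through BaseTestQuery.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a capped direct-memory allocator; not Drill's API.
final class TrackingAllocator {
    private final long limit;
    private final AtomicLong allocated = new AtomicLong();

    TrackingAllocator(long limit) {
        this.limit = limit;
    }

    // Reserve bytes, or throw if the cap would be exceeded.
    void allocate(long bytes) {
        long after = allocated.addAndGet(bytes);
        if (after > limit) {
            allocated.addAndGet(-bytes); // undo the failed reservation
            throw new IllegalStateException("out of memory: wanted " + bytes);
        }
    }

    void release(long bytes) {
        allocated.addAndGet(-bytes);
    }

    long getAllocated() {
        return allocated.get();
    }
}

final class LeakCheck {
    // Runs a test body and returns the bytes still outstanding afterwards.
    static long leakedBytes(TrackingAllocator allocator, Runnable testBody) {
        long before = allocator.getAllocated();
        try {
            testBody.run();
        } catch (RuntimeException e) {
            // A failure at a low limit is acceptable here; a leak is not.
        }
        return allocator.getAllocated() - before;
    }
}
```

A test that fails for lack of memory but returns 0 from leakedBytes is fine
for this purpose; a nonzero result is what would be reported as a leak.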
> Running 1 or more queries against Drillbits having insufficient DirectMem
> renders the Drillbits in an unusable state
> --------------------------------------------------------------------------------------------------------------------
>
> Key: DRILL-2750
> URL: https://issues.apache.org/jira/browse/DRILL-2750
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow
> Affects Versions: 0.9.0
> Environment: RHEL 6.4
> Reporter: Kunal Khatua
> Assignee: Chris Westin
> Priority: Critical
> Fix For: 1.0.0
>
> Attachments: DRILL-2750.1.patch.txt
>
>
> When running queries against a Drill cluster with limited DirectMem, if one
> or more queries fail due to insufficient memory, then even queries that
> should easily run within the allocated memory fail.
> The initial failure when queries with large memory requirements fail:
> 2015-04-10 09:57:55 [pip0] ERROR PipSQuawkling fetchRows - [ 1 / 16_par1000 ] Failure while executing query.
> java.sql.SQLException: Failure while executing query.
> at org.apache.drill.jdbc.DrillCursor.next(DrillCursor.java:144)
> at net.hydromatic.avatica.AvaticaResultSet.next(AvaticaResultSet.java:187)
> at org.apache.drill.jdbc.DrillResultSet.next(DrillResultSet.java:85)
> at PipSQuawkling.fetchRows(PipSQuawkling.java:319)
> at PipSQuawkling.executeTest(PipSQuawkling.java:154)
> at PipSQuawkling.run(PipSQuawkling.java:76)
> Caused by: org.apache.drill.exec.rpc.RpcException: RemoteRpcException: Failure while running fragment.[ e8c657a7-93a9-415a-8641-a4fbd4836a65 on ucs-node5.perf.lab:31010 ]
> [ e8c657a7-93a9-415a-8641-a4fbd4836a65 on ucs-node5.perf.lab:31010 ]
> at org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:111)
> at org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:100)
> at org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:52)
> at org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:34)
> at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:57)
> at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:194)
> at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:173)
> at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
> at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
> at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:161)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
> at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
> at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
> at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
> at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
> at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
> at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
> at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
> at java.lang.Thread.run(Thread.java:744)
> After that, subsequent queries that should run fail with the following:
> 2015-04-10 09:59:29 [pip0] ERROR PipSQuawkling executeQuery - [ 2 / rerun_06_par1000 ] exception while executing query: Failure while executing query.
> java.sql.SQLException: exception while executing query: Failure while executing query.
> at net.hydromatic.avatica.Helper.createException(Helper.java:40)
> at net.hydromatic.avatica.AvaticaConnection.executeQueryInternal(AvaticaConnection.java:406)
> at net.hydromatic.avatica.AvaticaStatement.executeQueryInternal(AvaticaStatement.java:351)
> at net.hydromatic.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:78)
> at PipSQuawkling.executeQuery(PipSQuawkling.java:284)
> at PipSQuawkling.executeTest(PipSQuawkling.java:144)
> at PipSQuawkling.run(PipSQuawkling.java:76)
> Caused by: java.sql.SQLException: Failure while executing query.
> at org.apache.drill.jdbc.DrillCursor.next(DrillCursor.java:144)
> at org.apache.drill.jdbc.DrillResultSet.execute(DrillResultSet.java:105)
> at org.apache.drill.jdbc.DrillResultSet.execute(DrillResultSet.java:44)
> at net.hydromatic.avatica.AvaticaConnection.executeQueryInternal(AvaticaConnection.java:404)
> ... 5 more
> Caused by: org.apache.drill.exec.rpc.RpcException: RemoteRpcException: Failure while trying to start remote fragment, You attempted to create a new child allocator with initial reservation 6000000 but only 110395 bytes of memory were available. [ 689006cb-d703-42c3-860d-bfecc0a66312 on ucs-node10.perf.lab:31010 ]
> at org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:111)
> at org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:100)
> at org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:52)
> at org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:34)
> at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:57)
> at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:194)
> at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:173)
> at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
> at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
> at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:161)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
> at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
> at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
> at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
> at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
> at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
> at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
> at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
> at java.lang.Thread.run(Thread.java:744)
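
The second trace above fails while reserving memory for a new child allocator.
One way such a state could arise, offered as an assumption rather than a
confirmed diagnosis, is a failed fragment that never returns its reservation
to the parent. The sketch below models that with hypothetical classes
(ParentAllocator, ChildAllocator, FragmentRunner); none of them are Drill's
actual classes. The 6000000-byte reservation is taken from the trace; any
parent capacity used with it is made up.

```java
// Hypothetical parent/child reservation model; illustrative only, not Drill's classes.
final class ParentAllocator {
    private final long capacity;
    private long reserved;

    ParentAllocator(long capacity) {
        this.capacity = capacity;
    }

    // Reserve memory for a child, or fail if the parent cannot cover it.
    synchronized ChildAllocator newChild(long initialReservation) {
        long available = capacity - reserved;
        if (initialReservation > available) {
            throw new IllegalStateException("initial reservation " + initialReservation
                    + " but only " + available + " bytes of memory were available");
        }
        reserved += initialReservation;
        return new ChildAllocator(this, initialReservation);
    }

    synchronized void giveBack(long bytes) {
        reserved -= bytes;
    }

    synchronized long available() {
        return capacity - reserved;
    }
}

final class ChildAllocator implements AutoCloseable {
    private final ParentAllocator parent;
    private final long reservation;
    private boolean closed;

    ChildAllocator(ParentAllocator parent, long reservation) {
        this.parent = parent;
        this.reservation = reservation;
    }

    // Return the reservation to the parent exactly once.
    @Override
    public void close() {
        if (!closed) {
            closed = true;
            parent.giveBack(reservation);
        }
    }
}

final class FragmentRunner {
    // Leaky: if work throws, close() is skipped and the reservation is lost.
    static void runLeaky(ParentAllocator parent, long reservation, Runnable work) {
        ChildAllocator child = parent.newChild(reservation);
        work.run();
        child.close();
    }

    // Safe: try-with-resources returns the reservation even when work throws.
    static void runSafe(ParentAllocator parent, long reservation, Runnable work) {
        try (ChildAllocator child = parent.newChild(reservation)) {
            work.run();
        }
    }
}
```

With runLeaky, one failed query permanently shrinks what the parent can hand
out, so a later newChild call that should fit is rejected. With runSafe, the
same failure leaves the parent's available memory unchanged.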