[ https://issues.apache.org/jira/browse/DRILL-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157019#comment-14157019 ]
Chun Chang commented on DRILL-1480:
-----------------------------------

We had an issue with a multi-homed cluster environment. After limiting each cluster node to a single interface, the query succeeded. However, after running a batch of TPCH queries (about 20), memory usage quickly went up on all 20 nodes and stayed at ~20G on each node, and the drillbits were no longer able to process new queries. It looks like the memory is not being reused (leaked).
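To check whether the ~20G plateau is held as JVM direct buffers (the stack trace below shows Netty's pooled allocator going through ByteBuffer.allocateDirect), here is a minimal monitoring sketch using the standard java.lang.management BufferPoolMXBean. It assumes it runs inside the drillbit JVM (or is adapted to a remote JMX connection); it is not part of Drill itself:

import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;

public class DirectMemoryWatcher {
    public static void main(String[] args) throws InterruptedException {
        while (true) {
            // The "direct" pool tracks every live java.nio direct buffer in
            // this JVM, including the chunks Netty's pooled allocator grabs
            // via ByteBuffer.allocateDirect (see the trace below).
            for (BufferPoolMXBean pool :
                    ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
                if ("direct".equals(pool.getName())) {
                    System.out.printf("direct buffers: count=%d used=%d capacity=%d%n",
                            pool.getCount(), pool.getMemoryUsed(), pool.getTotalCapacity());
                }
            }
            Thread.sleep(1000L);  // poll once per second during and after the batch
        }
    }
}

If the reported capacity never drops after the batch completes, that is consistent with the leak described above (with the caveat that Netty's pool does cache some chunks even in healthy operation).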
> severe memory leak querying snappy compressed parquet file
> -----------------------------------------------------------
>
>                 Key: DRILL-1480
>                 URL: https://issues.apache.org/jira/browse/DRILL-1480
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Data Types
>    Affects Versions: 0.6.0
>            Reporter: Chun Chang
>
> #Wed Oct 01 00:19:24 EDT 2014
> git.commit.id.abbrev=5c220e3
>
> Running TPCH query #03, the drillbit shows a severe memory leak and quickly runs out of memory:
>
> 2014-10-02 00:51:21,520 [WorkManager-116] ERROR o.apache.drill.exec.work.WorkManager - Failure while running wrapper [FragmentExecutor: 7d345235-0eb4-4189-b34f-f535fa5ad1bb:4:10]
> java.lang.OutOfMemoryError: Direct buffer memory
>         at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.7.0_65]
>         at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) ~[na:1.7.0_65]
>         at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) ~[na:1.7.0_65]
>         at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:434) ~[netty-buffer-4.0.20.Final.jar:4.0.20.Final]
>         at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:179) ~[netty-buffer-4.0.20.Final.jar:4.0.20.Final]
>         at io.netty.buffer.PoolArena.allocate(PoolArena.java:168) ~[netty-buffer-4.0.20.Final.jar:4.0.20.Final]
>         at io.netty.buffer.PoolArena.allocate(PoolArena.java:98) ~[netty-buffer-4.0.20.Final.jar:4.0.20.Final]
>         at io.netty.buffer.PooledByteBufAllocatorL.newDirectBuffer(PooledByteBufAllocatorL.java:46) ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:4.0.20.Final]
>         at io.netty.buffer.PooledByteBufAllocatorL.directBuffer(PooledByteBufAllocatorL.java:66) ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:4.0.20.Final]
>         at org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.buffer(TopLevelAllocator.java:205) ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
>         at org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.buffer(TopLevelAllocator.java:212) ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
>         at org.apache.drill.exec.vector.IntVector.allocateNew(IntVector.java:149) ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
>         at org.apache.drill.exec.test.generated.HashTableGen478.allocMetadataVector(HashTableTemplate.java:728) ~[na:na]
>         at org.apache.drill.exec.test.generated.HashTableGen478.access$200(HashTableTemplate.java:41) ~[na:na]
>         at org.apache.drill.exec.test.generated.HashTableGen478$BatchHolder.<init>(HashTableTemplate.java:132) ~[na:na]
>         at org.apache.drill.exec.test.generated.HashTableGen478$BatchHolder.<init>(HashTableTemplate.java:101) ~[na:na]
>         at org.apache.drill.exec.test.generated.HashTableGen478.addBatchHolder(HashTableTemplate.java:654) ~[na:na]
>         at org.apache.drill.exec.test.generated.HashTableGen478.put(HashTableTemplate.java:494) ~[na:na]
>         at org.apache.drill.exec.physical.impl.join.HashJoinBatch.executeBuildPhase(HashJoinBatch.java:344) ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.join.HashJoinBatch.innerNext(HashJoinBatch.java:193) ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
>
> The query runs fine against an uncompressed parquet file at the same 100G scale factor. Here is the query:
>
> [root@atsqa8c21 testcases]# cat 03.q
> -- tpch3 using 1395599672 as a seed to the RNG
> select
>     l.l_orderkey,
>     sum(l.l_extendedprice * (1 - l.l_discount)) as revenue,
>     o.o_orderdate,
>     o.o_shippriority
> from
>     customer c,
>     orders o,
>     lineitem l
> where
>     c.c_mktsegment = 'HOUSEHOLD'
>     and c.c_custkey = o.o_custkey
>     and l.l_orderkey = o.o_orderkey
>     and o.o_orderdate < date '1995-03-25'
>     and l.l_shipdate > date '1995-03-25'
> group by
>     l.l_orderkey,
>     o.o_orderdate,
>     o.o_shippriority
> order by
>     revenue desc,
>     o.o_orderdate
> limit 10;
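For context on the OutOfMemoryError in the stack trace above: the JVM charges every direct allocation against the budget set by -XX:MaxDirectMemorySize (which Drill's launch scripts typically derive from DRILL_MAX_DIRECT_MEMORY in drill-env.sh), and java.nio.Bits.reserveMemory throws once that budget is exhausted while earlier buffers remain reachable. A standalone sketch (not Drill code) that reproduces just this failure mode:

import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Run with: java -XX:MaxDirectMemorySize=64m DirectBudgetDemo
public class DirectBudgetDemo {
    public static void main(String[] args) {
        // Holding the buffers keeps them reachable, mimicking a leak: the GC
        // cannot free them, so the direct-memory budget never recovers.
        List<ByteBuffer> held = new ArrayList<>();
        try {
            while (true) {
                held.add(ByteBuffer.allocateDirect(16 * 1024 * 1024)); // 16 MB each
            }
        } catch (OutOfMemoryError e) {
            // Thrown from java.nio.Bits.reserveMemory with the message
            // "Direct buffer memory", exactly as in the trace above.
            System.out.println("Failed after " + held.size() + " buffers: " + e);
        }
    }
}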
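And a sketch of the batch repro from the comment above, assuming Drill's JDBC driver is on the classpath; the zookeeper quorum and the query-file paths passed on the command line are placeholders to adjust for the actual cluster:

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class TpchBatchRepro {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:drill:zk=localhost:2181";  // placeholder quorum
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement()) {
            for (String file : args) {  // pass the ~20 .q files on the command line
                String sql = new String(Files.readAllBytes(Paths.get(file)),
                        StandardCharsets.UTF_8).trim();
                if (sql.endsWith(";")) {
                    sql = sql.substring(0, sql.length() - 1);  // JDBC takes no trailing ';'
                }
                try (ResultSet rs = stmt.executeQuery(sql)) {
                    int rows = 0;
                    while (rs.next()) {
                        rows++;  // drain the results so every fragment completes
                    }
                    System.out.println(file + ": " + rows + " rows");
                }
            }
        }
    }
}

After the loop finishes, per-node direct memory should be checked (e.g. with the watcher sketch earlier); on this cluster it stayed pinned at ~20G per node instead of being reused.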