Khurram Faraaz created DRILL-6235:
-------------------------------------

             Summary: Flatten query leads to out of memory in RPC layer.
                 Key: DRILL-6235
                 URL: https://issues.apache.org/jira/browse/DRILL-6235
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Flow
    Affects Versions: 1.12.0
            Reporter: Khurram Faraaz
            Assignee: Padma Penumarthy
         Attachments: 25593391-512d-23ab-7c84-3651006931e2.sys.drill

Flatten query leads to out of memory in RPC layer. Query profile is attached here.

Total number of JSON files = 4095
Each JSON file has nine rows.
Each row in the JSON has an array with 1024 integer values, and there are other string values outside of the array.
Two major fragments and eighty-eight minor fragments were created.

On a 4-node CentOS cluster, number of CPU cores:

[root@qa102-45 ~]# grep -c ^processor /proc/cpuinfo
32

Details of memory

{noformat}
0: jdbc:drill:schema=dfs.tmp> select * from sys.memory;
+------------------+------------+---------------+-------------+-----------------+---------------------+-------------+
| hostname         | user_port  | heap_current  | heap_max    | direct_current  | jvm_direct_current  | direct_max  |
+------------------+------------+---------------+-------------+-----------------+---------------------+-------------+
| qa102-45.qa.lab  | 31010      | 1130364912    | 4294967296  | 0               | 170528              | 8589934592  |
| qa102-47.qa.lab  | 31010      | 171823104     | 4294967296  | 0               | 21912               | 8589934592  |
| qa102-48.qa.lab  | 31010      | 201326576     | 4294967296  | 0               | 21912               | 8589934592  |
| qa102-46.qa.lab  | 31010      | 214780896     | 4294967296  | 0               | 21912               | 8589934592  |
+------------------+------------+---------------+-------------+-----------------+---------------------+-------------+
4 rows selected (0.166 seconds)
{noformat}

Reset all options and set slice_target=1

alter system reset all;
alter system set `planner.slice_target`=1;

{noformat}
SELECT * , FLATTEN(arr) FROM many_json_files
...
Error: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
Failure allocating buffer.
Fragment 1:38
[Error Id: cf4fd273-d8a2-45e8-8d72-15c738e53b0f on qa102-45.qa.lab:31010] (state=,code=0)
{noformat}

Stack trace from drillbit.log for the above failing query.
{noformat}
2018-03-12 11:52:33,849 [25593391-512d-23ab-7c84-3651006931e2:frag:0:0] INFO o.a.d.e.w.fragment.FragmentExecutor - 25593391-512d-23ab-7c84-3651006931e2:0:0: State change requested AWAITING_ALLOCATION --> RUNNING
2018-03-12 11:52:33,849 [25593391-512d-23ab-7c84-3651006931e2:frag:0:0] INFO o.a.d.e.w.f.FragmentStatusReporter - 25593391-512d-23ab-7c84-3651006931e2:0:0: State to report: RUNNING
2018-03-12 11:52:33,854 [25593391-512d-23ab-7c84-3651006931e2:frag:0:0] INFO o.a.d.e.c.ClassCompilerSelector - Java compiler policy: DEFAULT, Debug option: true
2018-03-12 11:52:35,929 [BitServer-4] WARN o.a.d.exec.rpc.ProtobufLengthDecoder - Failure allocating buffer on incoming stream due to memory limits. Current Allocation: 92340224.
2018-03-12 11:52:35,929 [BitServer-3] WARN o.a.d.exec.rpc.ProtobufLengthDecoder - Failure allocating buffer on incoming stream due to memory limits. Current Allocation: 92340224.
2018-03-12 11:52:35,930 [BitServer-3] ERROR o.a.drill.exec.rpc.data.DataServer - Out of memory in RPC layer.
2018-03-12 11:52:35,930 [BitServer-4] ERROR o.a.drill.exec.rpc.data.DataServer - Out of memory in RPC layer.
2018-03-12 11:52:35,930 [BitServer-4] WARN o.a.d.exec.rpc.ProtobufLengthDecoder - Failure allocating buffer on incoming stream due to memory limits. Current Allocation: 83886080.
2018-03-12 11:52:35,930 [BitServer-3] WARN o.a.d.exec.rpc.ProtobufLengthDecoder - Failure allocating buffer on incoming stream due to memory limits. Current Allocation: 83886080.
2018-03-12 11:52:35,930 [BitServer-4] ERROR o.a.drill.exec.rpc.data.DataServer - Out of memory in RPC layer.
2018-03-12 11:52:35,930 [BitServer-3] ERROR o.a.drill.exec.rpc.data.DataServer - Out of memory in RPC layer.
2018-03-12 11:52:35,931 [BitServer-3] WARN o.a.d.exec.rpc.ProtobufLengthDecoder - Failure allocating buffer on incoming stream due to memory limits. Current Allocation: 83886080.
2018-03-12 11:52:35,931 [BitServer-4] WARN o.a.d.exec.rpc.ProtobufLengthDecoder - Failure allocating buffer on incoming stream due to memory limits. Current Allocation: 83886080.
2018-03-12 11:52:35,931 [BitServer-3] ERROR o.a.drill.exec.rpc.data.DataServer - Out of memory in RPC layer.
2018-03-12 11:52:35,931 [BitServer-4] ERROR o.a.drill.exec.rpc.data.DataServer - Out of memory in RPC layer.
...
...
2018-03-12 11:52:35,939 [BitServer-4] WARN o.a.d.exec.rpc.ProtobufLengthDecoder - Failure allocating buffer on incoming stream due to memory limits. Current Allocation: 67174400.
2018-03-12 11:52:35,939 [BitServer-4] ERROR o.a.drill.exec.rpc.data.DataServer - Out of memory in RPC layer.
2018-03-12 11:52:35,939 [BitServer-2] WARN o.a.d.exec.rpc.ProtobufLengthDecoder - Failure allocating buffer on incoming stream due to memory limits. Current Allocation: 84017152.
2018-03-12 11:52:35,939 [BitServer-2] ERROR o.a.drill.exec.rpc.data.DataServer - Out of memory in RPC layer.
2018-03-12 11:52:35,940 [25593391-512d-23ab-7c84-3651006931e2:frag:1:38] INFO o.a.d.e.w.fragment.FragmentExecutor - User Error Occurred: One or more nodes ran out of memory while executing the query. (Failure allocating buffer.)
org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
Failure allocating buffer.
[Error Id: cf4fd273-d8a2-45e8-8d72-15c738e53b0f ]
    at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) ~[drill-common-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
    at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:243) [drill-java-exec-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
    at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_161]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_161]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161]
Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Failure allocating buffer.
    at io.netty.buffer.PooledByteBufAllocatorL.allocate(PooledByteBufAllocatorL.java:67) ~[drill-memory-base-1.13.0-SNAPSHOT.jar:4.0.48.Final]
    at org.apache.drill.exec.memory.AllocationManager.<init>(AllocationManager.java:83) ~[drill-memory-base-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
    at org.apache.drill.exec.memory.BaseAllocator.bufferWithoutReservation(BaseAllocator.java:258) ~[drill-memory-base-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
    at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:241) ~[drill-memory-base-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
    at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:211) ~[drill-memory-base-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
    at org.apache.drill.exec.vector.BigIntVector.allocateBytes(BigIntVector.java:239) ~[vector-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
    at org.apache.drill.exec.vector.BigIntVector.allocateNew(BigIntVector.java:219) ~[vector-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
    at org.apache.drill.exec.vector.RepeatedBigIntVector.allocateNew(RepeatedBigIntVector.java:272) ~[vector-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
    at org.apache.drill.exec.vector.AllocationHelper.allocatePrecomputedChildCount(AllocationHelper.java:41) ~[vector-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
    at org.apache.drill.exec.vector.AllocationHelper.allocate(AllocationHelper.java:66) ~[vector-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
    at org.apache.drill.exec.record.RecordBatchSizer$ColumnSize.allocateVector(RecordBatchSizer.java:403) ~[drill-java-exec-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
    at org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.doAlloc(FlattenRecordBatch.java:276) ~[drill-java-exec-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
    at org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.handleRemainder(FlattenRecordBatch.java:240) ~[drill-java-exec-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
    at org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.innerNext(FlattenRecordBatch.java:165) ~[drill-java-exec-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:164) ~[drill-java-exec-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
    at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:105) ~[drill-java-exec-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
    at org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:93) ~[drill-java-exec-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
    at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:95) ~[drill-java-exec-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
    at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:233) ~[drill-java-exec-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
    at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:226) ~[drill-java-exec-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
    at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_161]
    at javax.security.auth.Subject.doAs(Subject.java:422) ~[na:1.8.0_161]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595) ~[hadoop-common-2.7.0-mapr-1707.jar:na]
    at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:226) [drill-java-exec-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
    ... 4 common frames omitted
Caused by: io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 67117056 byte(s) of direct memory (used: 8548425728, max: 8589934592)
    at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:510) ~[netty-common-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:464) ~[netty-common-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.buffer.UnpooledUnsafeNoCleanerDirectByteBuf.allocateDirect(UnpooledUnsafeNoCleanerDirectByteBuf.java:30) ~[netty-buffer-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.buffer.UnpooledByteBufAllocator$InstrumentedUnpooledUnsafeNoCleanerDirectByteBuf.allocateDirect(UnpooledByteBufAllocator.java:169) ~[netty-buffer-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.buffer.UnpooledUnsafeDirectByteBuf.<init>(UnpooledUnsafeDirectByteBuf.java:67) ~[netty-buffer-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.buffer.UnpooledUnsafeNoCleanerDirectByteBuf.<init>(UnpooledUnsafeNoCleanerDirectByteBuf.java:25) ~[netty-buffer-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.buffer.UnpooledByteBufAllocator$InstrumentedUnpooledUnsafeNoCleanerDirectByteBuf.<init>(UnpooledByteBufAllocator.java:164) ~[netty-buffer-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.buffer.UnpooledByteBufAllocator.newDirectBuffer(UnpooledByteBufAllocator.java:73) ~[netty-buffer-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:179) ~[netty-buffer-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.newDirectBufferL(PooledByteBufAllocatorL.java:157) ~[drill-memory-base-1.13.0-SNAPSHOT.jar:4.0.48.Final]
    at io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.directBuffer(PooledByteBufAllocatorL.java:201) ~[drill-memory-base-1.13.0-SNAPSHOT.jar:4.0.48.Final]
    at io.netty.buffer.PooledByteBufAllocatorL.allocate(PooledByteBufAllocatorL.java:65) ~[drill-memory-base-1.13.0-SNAPSHOT.jar:4.0.48.Final]
    ... 27 common frames omitted
{noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
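
As a rough sanity check on the figures in this report, the sketch below recomputes two numbers: how many rows FLATTEN would emit for the described data set, and how much direct memory was still free when the final OutOfDirectMemoryError was raised. The helper names are hypothetical, and the row count assumes every one of the 4095 files really has nine rows with a 1024-element array, as stated in the description. It says nothing about how Drill sizes its batches or buffers internally.

```python
"""Back-of-the-envelope arithmetic for the numbers quoted in DRILL-6235."""

# Data set shape, copied from the issue description.
JSON_FILES = 4095
ROWS_PER_FILE = 9
ARRAY_LENGTH = 1024

# Values copied from the OutOfDirectMemoryError message in drillbit.log.
REQUESTED_BYTES = 67_117_056
USED_BYTES = 8_548_425_728
MAX_DIRECT_BYTES = 8_589_934_592


def flattened_row_count(files: int, rows_per_file: int, array_length: int) -> int:
    """FLATTEN produces one output row per array element of each input row."""
    return files * rows_per_file * array_length


def direct_memory_headroom(used: int, maximum: int) -> int:
    """Bytes of direct memory still available under the configured maximum."""
    return maximum - used


if __name__ == "__main__":
    rows = flattened_row_count(JSON_FILES, ROWS_PER_FILE, ARRAY_LENGTH)
    headroom = direct_memory_headroom(USED_BYTES, MAX_DIRECT_BYTES)
    print(f"flattened rows (assuming uniform files): {rows}")
    print(f"direct memory headroom at failure: {headroom} bytes")
    print(f"requested allocation: {REQUESTED_BYTES} bytes")
    print(f"shortfall: {REQUESTED_BYTES - headroom} bytes")
```

Under these assumptions, the requested allocation is larger than the remaining headroom, which is consistent with the allocation failing once direct memory usage was already close to direct_max.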