[
https://issues.apache.org/jira/browse/DRILL-5226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893034#comment-15893034
]
Paul Rogers commented on DRILL-5226:
------------------------------------
This case is not a bug: it is a misconfiguration.
{code}
Potential memory overflow! Minumum needed = 74,186,544 bytes, actual available
= 52,428,800 bytes
{code}
"But", you say, "I configured the sort to use 100+ MB!" Actually, you set the
memory-per-query at 100 MB. This query has two sorts, so memory is split
equally:
{code}
{
"options" : [ {
"kind" : "LONG",
"type" : "SESSION",
"name" : "planner.memory.max_query_memory_per_node",
"num_val" : 104857600
}, {
"kind" : "LONG",
"type" : "SESSION",
"name" : "planner.width.max_per_node",
"num_val" : 2
...
{code}
The max width per node is 2. The input has two files. So, the query runs as two
slices. Each gets half the max query memory.
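For concreteness, the split works out as follows (a minimal sketch; the class and
variable names are illustrative, not Drill internals):
{code}
// Illustrative only: shows how the 100 MB per-query budget divides into the
// 52,428,800-byte per-sort limit reported in the error above.
public class SortMemorySplit {
  public static void main(String[] args) {
    long maxQueryMemoryPerNode = 104_857_600L; // planner.memory.max_query_memory_per_node
    int sortInstances = 2;                     // two sort instances share the budget

    long memoryPerSort = maxQueryMemoryPerNode / sortInstances;
    System.out.println(memoryPerSort);         // 52428800
  }
}
{code}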
The reason that 50 MB is not sufficient is the data size:
{code}
Config: memory limit = 52,428,800, spill file size = 268,435,456, spill batch
size = 8,388,608,
merge limit = 2147483647, merge batch size = 16,777,216
Actual batch schema & sizes {
T1¦¦col1(std col. size: 54, actual col. size: 2, total size: 2392064, data
size: 65536, row capacity: 32768, density: 3)
...
col1(std col. size: 54, actual col. size: 2, total size: 327680, data size:
65536, row capacity: 32768, density: 20)
...
Records: 32768, Total size: 30,212,096, Row width:922, Density:13}
{code}
Since the sort has no control over the sizes of its incoming batches, sometimes
they are small and the sort can run within constrained memory. Sometimes, as here,
they are large and the sort needs more memory to make progress.
In short, we need at least two 30 MB batches in memory to sort, but we are only
given 50 MB.
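Roughly speaking, the check that fails has the following shape (an illustrative
sketch, not Drill's actual code; the real minimum also accounts for spill and merge
batch overhead, which is why the logged figure is 74,186,544 rather than exactly
two input batches):
{code}
// Illustrative sketch of the sort's memory feasibility check; not Drill's implementation.
public class SortMemoryCheck {
  public static void main(String[] args) {
    long memoryLimit = 52_428_800L;    // per-sort budget (half of 100 MB)
    long inputBatchSize = 30_212_096L; // "Total size" of the incoming batch above

    // The sort must hold at least two incoming batches in memory to make progress.
    long minimumNeeded = 2 * inputBatchSize;
    if (minimumNeeded > memoryLimit) {
      System.out.println("Potential memory overflow! Minimum needed = "
          + minimumNeeded + ", actual available = " + memoryLimit);
    }
  }
}
{code}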
That said, this case did uncover a direct memory leak which I'll investigate.
> External Sort encountered an error while spilling to disk
> ---------------------------------------------------------
>
> Key: DRILL-5226
> URL: https://issues.apache.org/jira/browse/DRILL-5226
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Relational Operators
> Affects Versions: 1.10.0
> Reporter: Rahul Challapalli
> Assignee: Paul Rogers
> Attachments: 277578d5-8bea-27db-0da1-cec0f53a13df.sys.drill,
> profile_scenario3.sys.drill, scenario3.log
>
>
> Environment :
> {code}
> git.commit.id.abbrev=2af709f
> DRILL_MAX_DIRECT_MEMORY="32G"
> DRILL_MAX_HEAP="4G"
> Nodes in Mapr Cluster : 1
> Data Size : ~ 0.35 GB
> No of Columns : 1
> Width of column : 256 chars
> {code}
> The query below fails before spilling to disk due to incorrect estimates of the
> record batch size.
> {code}
> 0: jdbc:drill:zk=10.10.100.190:5181> alter session set
> `planner.width.max_per_node` = 1;
> +-------+--------------------------------------+
> | ok | summary |
> +-------+--------------------------------------+
> | true | planner.width.max_per_node updated. |
> +-------+--------------------------------------+
> 1 row selected (1.11 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> alter session set
> `planner.memory.max_query_memory_per_node` = 62914560;
> +-------+----------------------------------------------------+
> | ok | summary |
> +-------+----------------------------------------------------+
> | true | planner.memory.max_query_memory_per_node updated. |
> +-------+----------------------------------------------------+
> 1 row selected (0.362 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> alter session set
> `planner.disable_exchanges` = true;
> +-------+-------------------------------------+
> | ok | summary |
> +-------+-------------------------------------+
> | true | planner.disable_exchanges updated. |
> +-------+-------------------------------------+
> 1 row selected (0.277 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> select * from (select * from
> dfs.`/drill/testdata/resource-manager/250wide-small.tbl` order by
> columns[0])d where d.columns[0] = 'ljdfhwuehnoiueyf';
> Error: RESOURCE ERROR: External Sort encountered an error while spilling to
> disk
> Unable to allocate buffer of size 1048576 (rounded from 618889) due to memory
> limit. Current allocation: 62736000
> Fragment 0:0
> [Error Id: 1bb933c8-7dc6-4cbd-8c8e-0e095baac719 on qa-node190.qa.lab:31010]
> (state=,code=0)
> {code}
> Exception from the logs
> {code}
> 2017-01-26 15:33:09,307 [277578d5-8bea-27db-0da1-cec0f53a13df:frag:0:0] INFO
> o.a.d.e.p.i.xsort.ExternalSortBatch - User Error Occurred: External Sort
> encountered an error while spilling to disk (Unable to allocate buffer of
> size 1048576 (rounded from 618889) due to memory limit. Current allocation:
> 62736000)
> org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: External
> Sort encountered an error while spilling to disk
> Unable to allocate buffer of size 1048576 (rounded from 618889) due to memory
> limit. Current allocation: 62736000
> [Error Id: 1bb933c8-7dc6-4cbd-8c8e-0e095baac719 ]
> at
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:544)
> ~[drill-common-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.mergeAndSpill(ExternalSortBatch.java:603)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:411)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:135)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:135)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:81)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:232)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:226)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at java.security.AccessController.doPrivileged(Native Method)
> [na:1.7.0_111]
> at javax.security.auth.Subject.doAs(Subject.java:415) [na:1.7.0_111]
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595)
> [hadoop-common-2.7.0-mapr-1607.jar:na]
> at
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:226)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
> [drill-common-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> [na:1.7.0_111]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> [na:1.7.0_111]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_111]
> Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Unable to
> allocate buffer of size 1048576 (rounded from 618889) due to memory limit.
> Current allocation: 62736000
> at
> org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:216)
> ~[drill-memory-base-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:191)
> ~[drill-memory-base-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.cache.VectorAccessibleSerializable.readFromStream(VectorAccessibleSerializable.java:112)
> ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.physical.impl.xsort.BatchGroup.getBatch(BatchGroup.java:111)
> ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.physical.impl.xsort.BatchGroup.getNextIndex(BatchGroup.java:137)
> ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at
> org.apache.drill.exec.test.generated.PriorityQueueCopierGen7.next(PriorityQueueCopierTemplate.java:76)
> ~[na:na]
> at
> org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.mergeAndSpill(ExternalSortBatch.java:590)
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> ... 45 common frames omitted
> {code}