Nathan Butler created DRILL-5470:
------------------------------------

             Summary: External Sort - Unable to Allocate Buffer error
                 Key: DRILL-5470
                 URL: https://issues.apache.org/jira/browse/DRILL-5470
             Project: Apache Drill
          Issue Type: Bug
          Components:  Server
    Affects Versions: 1.10.0
         Environment: - ubuntu 14.04
- r3.8xl (32 CPU/240GB Mem)
- openjdk version "1.8.0_111"
- drill 1.10.0 with 8656c83b00f8ab09fb6817e4e9943b2211772541 cherry-picked
            Reporter: Nathan Butler


Per the mailing list discussion and Rahul's and Paul's suggestions, I'm filing 
this Jira issue. Drill appears to run out of memory when doing an External 
Sort. Per Zelaine's suggestion, I enabled sort.external.disable_managed in 
drill-override.conf and in the sqlline session. This caused the query to run 
longer, but it still failed with the same message.

Per Paul's suggestion, I enabled debug logging for the 
org.apache.drill.exec.physical.impl.xsort.managed package and re-ran the query.

Here's the initial DEBUG line for ExternalSortBatch for our query:

bq. 2017-05-03 12:02:56,095 [26f600f1-17b3-d649-51be-2ca0c9bf7606:frag:2:15] 
DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Config: memory limit = 10737418240, 
spill file size = 268435456, spill batch size = 8388608, merge limit = 
2147483647, merge batch size = 16777216
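
For readability, the byte counts in the Config line above can be converted to binary units. The following is a minimal sketch that parses the quoted line; the helper names parse_config and human are hypothetical and not part of Drill. The merge limit value equals 2**31 - 1 (Java's Integer.MAX_VALUE), so it is presumably a count that is effectively unlimited rather than a byte size; the log itself does not say.

```python
# The Config line from the DEBUG log above, with the email line wraps removed.
CONFIG_LINE = (
    "Config: memory limit = 10737418240, spill file size = 268435456, "
    "spill batch size = 8388608, merge limit = 2147483647, "
    "merge batch size = 16777216"
)

INT_MAX = 2**31 - 1  # same value as Java's Integer.MAX_VALUE


def parse_config(line):
    """Return {setting name: integer value} for each 'name = value' pair."""
    body = line.split("Config:", 1)[1]
    settings = {}
    for pair in body.split(","):
        name, value = pair.split("=")
        settings[name.strip()] = int(value)
    return settings


def human(n):
    """Format a byte count using binary units (GiB, MiB, KiB)."""
    for unit, size in (("GiB", 2**30), ("MiB", 2**20), ("KiB", 2**10)):
        if n >= size:
            return f"{n / size:.2f} {unit}"
    return f"{n} B"


for name, value in parse_config(CONFIG_LINE).items():
    if value == INT_MAX:
        # Assumption: treated as a count, not a size, so no unit conversion.
        print(f"{name} = {value} (2**31 - 1)")
    else:
        print(f"{name} = {value} ({human(value)})")
```

Under this reading, the sort was given a 10 GiB memory limit, with a 256 MiB spill file size, an 8 MiB spill batch size, and a 16 MiB merge batch size.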

And here's the last DEBUG line before the stack trace:

bq. 2017-05-03 12:37:44,249 [26f600f1-17b3-d649-51be-2ca0c9bf7606:frag:2:4] 
DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Available memory: 10737418240, buffer 
memory = 10719535268, merge memory = 10707140978

And the stack trace:

{quote}
2017-05-03 12:38:02,927 [26f600f1-17b3-d649-51be-2ca0c9bf7606:frag:2:6] INFO  
o.a.d.e.p.i.x.m.ExternalSortBatch - User Error Occurred: External Sort 
encountered an error while spilling to disk (Unable to allocate buffer of size 
268435456 due to memory limit. Current 
allocation: 10579849472)
org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: External Sort 
encountered an error while spilling to disk


[Error Id: 5d53c677-0cd9-4c01-a664-c02089670a1c ]
        at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:544)
 ~[drill-common-1.10.0.jar:1.10.0]
        at 
org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.doMergeAndSpill(ExternalSortBatch.java:1447)
 [drill-java-exec-1.10.0.jar:1.10.0]
        at 
org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.mergeAndSpill(ExternalSortBatch.java:1376)
 [drill-java-exec-1.10.0.jar:1.10.0]
        at 
org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.spillFromMemory(ExternalSortBatch.java:1339)
 [drill-java-exec-1.10.0.jar:1.10.0]
        at 
org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.processBatch(ExternalSortBatch.java:831)
 [drill-java-exec-1.10.0.jar:1.10.0]
        at 
org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.loadBatch(ExternalSortBatch.java:618)
 [drill-java-exec-1.10.0.jar:1.10.0]
        at 
org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.load(ExternalSortBatch.java:660)
 [drill-java-exec-1.10.0.jar:1.10.0]
        at 
org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.innerNext(ExternalSortBatch.java:559)
 [drill-java-exec-1.10.0.jar:1.10.0]
        at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
 [drill-java-exec-1.10.0.jar:1.10.0]
        at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
 [drill-java-exec-1.10.0.jar:1.10.0]
        at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
 [drill-java-exec-1.10.0.jar:1.10.0]
        at 
org.apache.drill.exec.physical.impl.aggregate.StreamingAggBatch.innerNext(StreamingAggBatch.java:137)
 [drill-java-exec-1.10.0.jar:1.10.0]
        at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
 [drill-java-exec-1.10.0.jar:1.10.0]
        at 
org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) 
[drill-java-exec-1.10.0.jar:1.10.0]
        at 
org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.innerNext(PartitionSenderRootExec.java:144)
 [drill-java-exec-1.10.0.jar:1.10.0]
        at 
org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) 
[drill-java-exec-1.10.0.jar:1.10.0]
        at 
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:232)
 [drill-java-exec-1.10.0.jar:1.10.0]
        at 
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:226)
 [drill-java-exec-1.10.0.jar:1.10.0]
        at java.security.AccessController.doPrivileged(Native Method) 
[na:1.8.0_111]
        at javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_111]
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
 [hadoop-common-2.7.1.jar:na]
        at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:226)
 [drill-java-exec-1.10.0.jar:1.10.0]
        at 
org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) 
[drill-common-1.10.0.jar:1.10.0]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[na:1.8.0_111]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_111]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Unable to 
allocate buffer of size 268435456 due to memory limit. Current allocation: 
10579849472
        at 
org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:220) 
~[drill-memory-base-1.10.0.jar:1.10.0]
        at 
org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:195) 
~[drill-memory-base-1.10.0.jar:1.10.0]
        at 
org.apache.drill.exec.vector.VarCharVector.reAlloc(VarCharVector.java:425) 
~[vector-1.10.0.jar:1.10.0]
        at 
org.apache.drill.exec.vector.VarCharVector.copyFromSafe(VarCharVector.java:278) 
~[vector-1.10.0.jar:1.10.0]
        at 
org.apache.drill.exec.vector.NullableVarCharVector.copyFromSafe(NullableVarCharVector.java:379)
 ~[vector-1.10.0.jar:1.10.0]
        at 
org.apache.drill.exec.test.generated.PriorityQueueCopierGen140.doCopy(PriorityQueueCopierTemplate.java:22)
 ~[na:na]
        at 
org.apache.drill.exec.test.generated.PriorityQueueCopierGen140.next(PriorityQueueCopierTemplate.java:76)
 ~[na:na]
        at 
org.apache.drill.exec.physical.impl.xsort.managed.CopierHolder$BatchMerger.next(CopierHolder.java:234)
 ~[drill-java-exec-1.10.0.jar:1.10.0]
        at 
org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.doMergeAndSpill(ExternalSortBatch.java:1408)
 [drill-java-exec-1.10.0.jar:1.10.0]
        ... 24 common frames omitted
{quote}
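
The numbers in the exception are consistent with a simple limit check: the current allocation plus the requested buffer exceeds the 10737418240-byte memory limit reported in the Config line. A minimal sketch of that arithmetic follows. The functions headroom and would_exceed are hypothetical; they mirror the wording of the error message only and are not Drill's actual allocator logic.

```python
MEMORY_LIMIT = 10737418240        # "memory limit" from the Config DEBUG line
CURRENT_ALLOCATION = 10579849472  # "Current allocation" from the exception
REQUESTED = 268435456             # "buffer of size" from the exception


def headroom(limit, allocated):
    """Bytes still available under the limit."""
    return limit - allocated


def would_exceed(limit, allocated, requested):
    """True if granting the request would push the allocation past the limit."""
    return allocated + requested > limit


remaining = headroom(MEMORY_LIMIT, CURRENT_ALLOCATION)
print(remaining)                                                   # 157568768
print(would_exceed(MEMORY_LIMIT, CURRENT_ALLOCATION, REQUESTED))   # True
print(REQUESTED - remaining)                                       # 110866688
```

About 150 MiB remained under the limit when VarCharVector.reAlloc asked for a single 256 MiB buffer, so the request overshot the limit by 110866688 bytes (roughly 106 MiB).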

I'm in communication with Paul and will send him the full log file.

Thanks,
Nathan



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)