Zelaine, thanks for the suggestion. I added this option both to
drill-override.conf and in the session, and this time the query did stay
running for much longer, but it still eventually failed with the same
error, although with much different memory values.

  (org.apache.drill.exec.exception.OutOfMemoryException) Unable to allocate buffer of size 134217728 due to memory limit. Current allocation: 10653214316
    org.apache.drill.exec.memory.BaseAllocator.buffer():220
    org.apache.drill.exec.memory.BaseAllocator.buffer():195
    org.apache.drill.exec.vector.VarCharVector.reAlloc():425
    org.apache.drill.exec.vector.VarCharVector.copyFromSafe():278
    org.apache.drill.exec.vector.NullableVarCharVector.copyFromSafe():379
    org.apache.drill.exec.test.generated.PriorityQueueCopierGen8.doCopy():22
    org.apache.drill.exec.test.generated.PriorityQueueCopierGen8.next():76
    org.apache.drill.exec.physical.impl.xsort.managed.CopierHolder$BatchMerger.next():234
    org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.doMergeAndSpill():1408
    org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.mergeAndSpill():1376
    org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.spillFromMemory():1339
    org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.processBatch():831
    org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.loadBatch():618
    org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.load():660
    org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.innerNext():559
    org.apache.drill.exec.record.AbstractRecordBatch.next():162
    org.apache.drill.exec.record.AbstractRecordBatch.next():119
    org.apache.drill.exec.record.AbstractRecordBatch.next():109
    org.apache.drill.exec.physical.impl.aggregate.StreamingAggBatch.innerNext():137
    org.apache.drill.exec.record.AbstractRecordBatch.next():162
    org.apache.drill.exec.physical.impl.BaseRootExec.next():104
    org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.innerNext():144
    org.apache.drill.exec.physical.impl.BaseRootExec.next():94
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():232
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():226
    java.security.AccessController.doPrivileged():-2
    javax.security.auth.Subject.doAs():422
    org.apache.hadoop.security.UserGroupInformation.doAs():1657
    org.apache.drill.exec.work.fragment.FragmentExecutor.run():226
    org.apache.drill.common.SelfCleaningRunnable.run():38
    java.util.concurrent.ThreadPoolExecutor.runWorker():1142
    java.util.concurrent.ThreadPoolExecutor$Worker.run():617
    java.lang.Thread.run():745 (state=,code=0)

At first I didn't change planner.width.max_per_query; the default on a
32-core machine makes it 23. That query failed after 34 minutes. I then
tried setting planner.width.max_per_query=1, and this query also failed,
but of course took longer, about 2 hours. In both cases,
planner.memory.max_query_memory_per_node was set to 230G.
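For what it's worth, the numbers in the first failure look consistent with the 230G per-node quota being split evenly across the 23 minor fragments. This is a back-of-the-envelope check; the even-split formula is my guess from the observed values, not something I pulled from Drill's source:

```python
GIB = 1 << 30

def sort_budget_bytes(max_query_memory_per_node: int, width: int) -> int:
    # Assumed model: the per-node query memory quota is divided evenly
    # across the sort's minor fragments (width). Not confirmed against
    # Drill internals -- it just happens to match the trace numbers.
    return max_query_memory_per_node // width

# 230G quota, planner.width.max_per_query default of 23 on this box
budget = sort_budget_bytes(230 * GIB, 23)
print(budget)  # 10737418240, i.e. exactly 10 GiB per fragment

# Figures from the stack trace above:
current_allocation = 10_653_214_316  # "Current allocation"
requested = 134_217_728              # "buffer of size" (128 MiB)

# The failed 128 MiB request would push the allocator just past the
# 10 GiB per-fragment budget, consistent with the OutOfMemoryException.
print(current_allocation + requested > budget)  # True
```

So the sort seems to be hitting its per-fragment limit rather than exhausting the machine, which would explain why system memory usage looks low when it dies.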


On Mon, May 1, 2017 at 11:09 AM, Zelaine Fong <[email protected]> wrote:

> Nate,
>
> The Jira you’ve referenced relates to the new external sort, which is not
> enabled by default, as it is still going through some additional testing.
> If you’d like to try it to see if it resolves your problem, you’ll need to
> set “sort.external.disable_managed” as follows in your
> drill-override.conf file:
>
> drill.exec: {
>   cluster-id: "drillbits1",
>   zk.connect: "localhost:2181",
>   sort.external.disable_managed: false
> }
>
> and run the following query:
>
> ALTER SESSION SET `exec.sort.disable_managed` = false;
>
> -- Zelaine
>
> On 5/1/17, 7:44 AM, "Nate Butler" <[email protected]> wrote:
>
>     We keep running into this issue when trying to issue a query with
> hashagg
>     disabled. When I look at system memory usage though, drill doesn't
> seem to
>     be using much of it but still hits this error.
>
>     Our environment:
>
>     - 1 r3.8xl
>     - 1 drillbit version 1.10.0 configured with 4GB of Heap and 230G of
> Direct
>     - Data stored on S3 is compressed CSV
>
>     I've tried increasing planner.memory.max_query_memory_per_node to
> 230G and
>     lowered planner.width.max_per_query to 1 and it still fails.
>
>     We've applied the patch from this bug in the hopes that it would
> resolve
>     the issue but it hasn't:
>
>     https://issues.apache.org/jira/browse/DRILL-5226
>
>     Stack Trace:
>
>       (org.apache.drill.exec.exception.OutOfMemoryException) Unable to allocate buffer of size 16777216 due to memory limit. Current allocation: 8445952
>         org.apache.drill.exec.memory.BaseAllocator.buffer():220
>         org.apache.drill.exec.memory.BaseAllocator.buffer():195
>         org.apache.drill.exec.vector.VarCharVector.reAlloc():425
>         org.apache.drill.exec.vector.VarCharVector.copyFromSafe():278
>         org.apache.drill.exec.vector.NullableVarCharVector.copyFromSafe():379
>         org.apache.drill.exec.test.generated.PriorityQueueCopierGen328.doCopy():22
>         org.apache.drill.exec.test.generated.PriorityQueueCopierGen328.next():75
>         org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.mergeAndSpill():602
>         org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext():428
>         org.apache.drill.exec.record.AbstractRecordBatch.next():162
>         org.apache.drill.exec.record.AbstractRecordBatch.next():119
>         org.apache.drill.exec.record.AbstractRecordBatch.next():109
>         org.apache.drill.exec.physical.impl.aggregate.StreamingAggBatch.innerNext():137
>         org.apache.drill.exec.record.AbstractRecordBatch.next():162
>         org.apache.drill.exec.physical.impl.BaseRootExec.next():104
>         org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.innerNext():144
>         org.apache.drill.exec.physical.impl.BaseRootExec.next():94
>         org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():232
>         org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():226
>         java.security.AccessController.doPrivileged():-2
>         javax.security.auth.Subject.doAs():422
>         org.apache.hadoop.security.UserGroupInformation.doAs():1657
>         org.apache.drill.exec.work.fragment.FragmentExecutor.run():226
>         org.apache.drill.common.SelfCleaningRunnable.run():38
>         java.util.concurrent.ThreadPoolExecutor.runWorker():1142
>         java.util.concurrent.ThreadPoolExecutor$Worker.run():617
>         java.lang.Thread.run():745 (state=,code=0)
>
>     Is there something I'm missing here? Any help/direction would be
>     appreciated.
>
>     Thanks,
>     Nate
>
>
>
