Correction, the formula should also divide by the number of sort operators:

sort limit = MQMPN / (NS * MPN * NC * 0.7)
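For a rough sense of scale, plugging purely illustrative numbers into the corrected formula (the 16 GB max_query_memory_per_node tried later in this thread, a single sort operator, 16 cores per node, and an assumed planner.width.max_per_node of 12, i.e. 75% of 16 cores):

    sort limit = 17179869184 / (1 * 12 * 16 * 0.7)
               ≈ 128 MB per sort operator

These inputs are assumptions for illustration only; the values actually in effect on a cluster can be checked before tuning, for example with:

    SELECT *
    FROM sys.options
    WHERE name IN ('planner.memory.max_query_memory_per_node',
                   'planner.width.max_per_node');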
On Wed, Mar 16, 2016 at 6:30 PM, Abdel Hakim Deneche <[email protected]> wrote:

> The sort memory limit is computed as follows:
>
> MQMPN = planner.memory.max_query_memory_per_node
> MPN = planner.width.max_per_node
> NC = number of cores in each cluster node
> NS = number of sort operators in the query
>
> sort limit = MQMPN / (MPN * NC * 0.7)
>
> In your case I assume the query contains a single sort operator and you
> have 16 cores per node. To increase the sort limit you can increase the
> value of max_query_memory_per_node, and you can also reduce the value of
> planner.width.max_per_node. Please note that reducing the value of the
> latter option may increase the query's execution time.
>
> On Wed, Mar 16, 2016 at 2:47 PM, François Méthot <[email protected]> wrote:
>
>> The default spill directory (/tmp) did not have enough space. We fixed
>> that. (Thanks John.)
>>
>> I altered the session to set:
>> planner.memory.max_query_memory_per_node = 17179869184 (16GB)
>> planner.enable_hashjoin = false;
>> planner.enable_hashagg = false;
>>
>> We ran our aggregation. After 7h44m we got:
>>
>> Error: RESOURCE ERROR: External Sort encountered an error while spilling
>> to disk
>>
>> Fragment 7:35
>>
>> Caused by org.apache.drill.exec.exception.OutOfMemoryException: Unable to
>> allocate buffer of size 65536 (rounded from 37444) due to memory limit.
>> Current allocation: 681080448.
>>   org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:216)
>>   org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:191)
>>   org.apache.drill.exec.cache.VectorAccessibleSerializable.readFromStream(VectorAccessibleSerializable.java:112)
>>   org.apache.drill.exec.physical.impl.xsort.BatchGroup.getBatch(BatchGroup.java:110)
>>   org.apache.drill.exec.physical.impl.xsort.BatchGroup.getNextIndex(BatchGroup.java:136)
>>   org.apache.drill.exec.test.generated.PriorityQueueCopierGen975.next(PriorityQueueCopierTemplate.java:76)
>>   org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.mergeAndSpill(ExternalSortBatch.java:557)
>>
>> I think we were close to having the query complete: in the Fragment
>> Profiles web UI, the two bottom major fragments (out of 5) showed as done.
>> I had the same query working on a (20x) smaller set of data.
>> Should I add more memory to planner.memory.max_query_memory_per_node?
>>
>> Abdel:
>> We did get the memory leak below while doing streaming aggregation, when
>> our /tmp directory was too small. After fixing that, our streaming
>> aggregation got us the error above.
>>
>> Error: SYSTEM ERROR: IllegalStateException: Memory was leaked by query.
>> Memory leaked: (389120)
>> Allocator(op:6:51:2:ExternalSort) 2000000/389120/680576640/715827882
>> (res/actual/peak/limit)
>>
>> Fragment 6:51
>>
>> [Error Id: ..... on node014.prod:31010]
>>
>> (java.lang.IllegalStateException) Memory was leaked by query.
>> Memory leaked: (389120)
>> Allocator(op:6:51:2:ExternalSort) 2000000/389120/680576640/715827882
>> (res/actual/peak/limit)
>>   org.apache.drill.exec.memory.BaseAllocator.close():492
>>   org.apache.drill.exec.ops.OperatorContextImpl.close():124
>>   org.apache.drill.exec.ops.FragmentContext.suppressingClose():416
>>   org.apache.drill.exec.ops.FragmentContext.close():405
>>   org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources():343
>>   org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup():180
>>   org.apache.drill.exec.work.fragment.FragmentExecutor.run():287
>>   org.apache.drill.common.SelfCleaningRunnable.run():38
>>   java.util.concurrent.ThreadPoolExecutor.runWorker():1142
>>
>> Thanks guys for your feedback.
>>
>> On Sat, Mar 12, 2016 at 1:18 AM, Abdel Hakim Deneche <[email protected]> wrote:
>>
>> > Disabling hash aggregation will default to streaming aggregation + sort.
>> > This will allow you to handle larger data and spill to disk if necessary.
>> >
>> > As stated in the documentation, starting from Drill 1.5 the default
>> > memory limit of the sort may not be enough to process large data, but
>> > you can bump it up by increasing planner.memory.max_query_memory_per_node
>> > (defaults to 2GB), and if necessary reducing planner.width.max_per_node
>> > (defaults to 75% of the number of cores).
>> >
>> > You said disabling hash aggregate and hash join causes a memory leak.
>> > Can you give more details about the error? The query may fail with an
>> > out-of-memory error, but it shouldn't leak.
>> >
>> > On Fri, Mar 11, 2016 at 10:53 PM, John Omernik <[email protected]> wrote:
>> >
>> > > I've had some luck disabling multi-phase aggregations on some queries
>> > > where memory was an issue.
>> > >
>> > > https://drill.apache.org/docs/guidelines-for-optimizing-aggregation/
>> > >
>> > > After I try that, then I typically look at the hash aggregation as you
>> > > have done:
>> > >
>> > > https://drill.apache.org/docs/sort-based-and-hash-based-memory-constrained-operators/
>> > >
>> > > I've had limited success with changing max_query_memory_per_node and
>> > > max_width; sometimes it's a weird combination of things that works in
>> > > there.
>> > >
>> > > https://drill.apache.org/docs/troubleshooting/#memory-issues
>> > >
>> > > Back to your spill stuff: if you disable hash aggregation, do you know
>> > > if your spill directories are set up? That may be part of the issue; I
>> > > am not sure what the default spill behavior of Drill is for spill
>> > > directory setup.
>> > >
>> > > On Fri, Mar 11, 2016 at 2:17 PM, François Méthot <[email protected]> wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > Using version 1.5, DirectMemory is currently set at 32GB and heap is
>> > > > at 8GB. We have been trying to perform multiple aggregations in one
>> > > > query (see below) on 40 billion+ rows stored on 13 nodes. We are
>> > > > using the Parquet format.
>> > > >
>> > > > We keep getting "OutOfMemoryException: Failure allocating buffer"
>> > > > on a query that looks like this:
>> > > >
>> > > > create table hdfs.`test1234` as
>> > > > (
>> > > >   select string_field1,
>> > > >          string_field2,
>> > > >          min ( int_field3 ),
>> > > >          max ( int_field4 ),
>> > > >          count(1),
>> > > >          count ( distinct int_field5 ),
>> > > >          count ( distinct int_field6 ),
>> > > >          count ( distinct string_field7 )
>> > > >   from hdfs.`/data/`
>> > > >   group by string_field1, string_field2
>> > > > );
>> > > >
>> > > > The documentation states:
>> > > > "Currently, hash-based operations do not spill to disk as needed."
>> > > >
>> > > > and
>> > > >
>> > > > "If the hash-based operators run out of memory during execution, the
>> > > > query fails. If large hash operations do not fit in memory on your
>> > > > system, you can disable these operations. When disabled, Drill
>> > > > creates alternative plans that allow spilling to disk."
>> > > >
>> > > > My understanding is that it will fall back to streaming aggregation,
>> > > > which requires sorting, but:
>> > > >
>> > > > "As of Drill 1.5, ... the sort operator (in queries that ran
>> > > > successfully in previous releases) may not have enough memory,
>> > > > resulting in a failed query"
>> > > >
>> > > > And indeed, disabling hash agg and hash join resulted in the memory
>> > > > leak error.
>> > > >
>> > > > So it looks like increasing direct memory is our only option.
>> > > >
>> > > > Is there a plan to have hash aggregation spill to disk in the next
>> > > > release?
>> > > >
>> > > > Thanks for your feedback.

--
Abdelhakim Deneche
Software Engineer
MapR <http://www.mapr.com/>
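Pulling together the knobs discussed in this thread, a rough sketch of session settings that push Drill toward the streaming (sort-based) plan while giving the sort more memory might look like the following. The numeric values are assumptions chosen only to illustrate the direction of the change (more memory per query, fewer fragments per node); spill locations for the external sort are configured in drill-override.conf rather than per session.

    ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 17179869184;  -- 16 GB, as tried above
    ALTER SESSION SET `planner.width.max_per_node` = 8;          -- assumed value, below the 75%-of-cores default
    ALTER SESSION SET `planner.enable_hashagg` = false;          -- fall back to streaming aggregation + sort
    ALTER SESSION SET `planner.enable_hashjoin` = false;         -- fall back to merge join
    ALTER SESSION SET `planner.enable_multiphase_agg` = false;   -- the option behind John's multi-phase aggregation suggestion

Whether lowering planner.width.max_per_node helps or simply slows the query down depends on the workload, as Hakim notes above.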
