[
https://issues.apache.org/jira/browse/DRILL-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379721#comment-16379721
]
ASF GitHub Bot commented on DRILL-6032:
---------------------------------------
Github user ilooner commented on a diff in the pull request:
https://github.com/apache/drill/pull/1101#discussion_r171133616
--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -427,8 +427,8 @@ drill.exec.options: {
exec.enable_union_type: false,
exec.errors.verbose: false,
exec.hashagg.mem_limit: 0,
- exec.hashagg.min_batches_per_partition: 2,
- exec.hashagg.num_partitions: 32,
+ exec.hashagg.min_batches_per_partition: 1,
--- End diff --
@Ben-Zvi This setting controls the minimum number of batches kept in memory
per partition. Raising it makes us consume more memory; lowering it makes us
consume less. More generally, the purpose of this PR was to make the memory
calculations more precise and deterministic, and it passes all regression
tests.
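
As a rough illustration of why the memory floor moves with this setting, here
is a minimal Java sketch. It assumes the floor is simply the number of
partitions times the minimum batches per partition times an estimated batch
size. The class name, the method name, and the 1 MiB batch size are
hypothetical and are not taken from Drill's actual HashAgg code.

```java
// Illustrative only: a simplified model, not Drill's real memory calculation.
final class HashAggMemorySketch {

    // Hypothetical helper: lower bound on memory held across all partitions,
    // assuming every partition keeps at least minBatchesPerPartition batches
    // of roughly estimatedBatchBytes each.
    static long minReservedBytes(int numPartitions,
                                 int minBatchesPerPartition,
                                 long estimatedBatchBytes) {
        long batches = Math.multiplyExact((long) numPartitions,
                                          (long) minBatchesPerPartition);
        return Math.multiplyExact(batches, estimatedBatchBytes);
    }

    public static void main(String[] args) {
        long batchBytes = 1L << 20; // assumed 1 MiB per batch, made up for the example

        // Values from the diff: 32 partitions, minimum lowered from 2 to 1.
        System.out.println(minReservedBytes(32, 2, batchBytes)); // 67108864
        System.out.println(minReservedBytes(32, 1, batchBytes)); // 33554432
    }
}
```

Under this simplified model, halving the minimum halves the floor, which is
the direction of the effect described above.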
> Use RecordBatchSizer to estimate size of columns in HashAgg
> -----------------------------------------------------------
>
> Key: DRILL-6032
> URL: https://issues.apache.org/jira/browse/DRILL-6032
> Project: Apache Drill
> Issue Type: Improvement
> Reporter: Timothy Farkas
> Assignee: Timothy Farkas
> Priority: Major
> Fix For: 1.13.0
>
>
> We need to use the RecordBatchSizer to estimate the size of columns in the
> Partition batches created by HashAgg.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)