[ 
https://issues.apache.org/jira/browse/DRILL-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379721#comment-16379721
 ] 

ASF GitHub Bot commented on DRILL-6032:
---------------------------------------

Github user ilooner commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1101#discussion_r171133616
  
    --- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
    @@ -427,8 +427,8 @@ drill.exec.options: {
         exec.enable_union_type: false,
         exec.errors.verbose: false,
         exec.hashagg.mem_limit: 0,
    -    exec.hashagg.min_batches_per_partition: 2,
    -    exec.hashagg.num_partitions: 32,
    +    exec.hashagg.min_batches_per_partition: 1,
    --- End diff --
    
    @Ben-Zvi This setting controls the minimum number of batches kept in memory 
per partition. Raising it makes us consume more memory; lowering it makes us 
consume less. More generally, the purpose of this PR is to make the memory 
calculations more precise and deterministic, and it passes all regression 
tests.
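
To make the relationship above concrete, here is a minimal sketch, assuming the memory held for in-memory partitions scales roughly as partitions x minimum batches per partition x estimated batch size. The class `HashAggMemorySketch`, the method `minReservedBytes`, and the 1 MiB batch size are hypothetical and for illustration only; this is not Drill's actual HashAgg calculation.

```java
// Illustrative only: NOT Drill's real HashAgg memory accounting.
// All names here are hypothetical.
final class HashAggMemorySketch {
    private HashAggMemorySketch() {}

    /**
     * Assumed lower bound on the bytes held by in-memory partitions:
     * every partition keeps at least minBatchesPerPartition batches,
     * each of roughly estimatedBatchBytes bytes.
     */
    static long minReservedBytes(int numPartitions,
                                 int minBatchesPerPartition,
                                 long estimatedBatchBytes) {
        if (numPartitions <= 0 || minBatchesPerPartition <= 0 || estimatedBatchBytes < 0) {
            throw new IllegalArgumentException("arguments must be positive");
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        long batches = Math.multiplyExact((long) numPartitions, (long) minBatchesPerPartition);
        return Math.multiplyExact(batches, estimatedBatchBytes);
    }
}
```

Under these assumptions, with 32 partitions and 1 MiB batches, lowering `exec.hashagg.min_batches_per_partition` from 2 to 1 halves this floor from 64 MiB to 32 MiB.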


> Use RecordBatchSizer to estimate size of columns in HashAgg
> -----------------------------------------------------------
>
>                 Key: DRILL-6032
>                 URL: https://issues.apache.org/jira/browse/DRILL-6032
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: Timothy Farkas
>            Assignee: Timothy Farkas
>            Priority: Major
>             Fix For: 1.13.0
>
>
> We need to use the RecordBatchSizer to estimate the size of columns in the 
> Partition batches created by HashAgg.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
