[
https://issues.apache.org/jira/browse/DRILL-6310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518637#comment-16518637
]
ASF GitHub Bot commented on DRILL-6310:
---------------------------------------
Ben-Zvi commented on a change in pull request #1324: DRILL-6310: limit batch
size for hash aggregate
URL: https://github.com/apache/drill/pull/1324#discussion_r196933394
##########
File path:
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java
##########
@@ -184,7 +184,15 @@
// then later re-read. So, disk I/O is twice this amount.
// For first phase aggr -- this is an estimate of the amount of data
// returned early (analogous to a spill in the 2nd phase).
- SPILL_CYCLE // 0 - no spill, 1 - spill, 2 - SECONDARY, 3 - TERTIARY
+ SPILL_CYCLE, // 0 - no spill, 1 - spill, 2 - SECONDARY, 3 - TERTIARY
+ INPUT_BATCH_COUNT,
Review comment:
The number of incoming batches is already collected by (each and every) operator,
in the __next()__ method of AbstractRecordBatch. Note, though, that this count also
includes the spilled batches that were read back (see the other comment about
__update()__ below).
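A minimal sketch of the pattern being discussed (the class and field names here are
illustrative assumptions, not the actual Drill code): operator metrics are an enum whose
ordinal serves as the metric id, and the framework bumps an incoming-batch counter once
per batch pulled from upstream, which is why spilled batches that are re-read get counted
as well.

// Illustrative Java sketch only; names such as MetricSketch and
// OperatorStatsSketch are hypothetical and do not exist in Drill.
public class MetricSketch {

  // Mirrors the style of the Metric enum in the diff above.
  enum Metric {
    SPILL_CYCLE,        // 0 - no spill, 1 - spill, 2 - SECONDARY, 3 - TERTIARY
    INPUT_BATCH_COUNT;  // the metric proposed in the diff

    int metricId() {
      return ordinal(); // metric id is simply the enum ordinal
    }
  }

  // Hypothetical per-operator stats holder, indexed by metric id.
  static final class OperatorStatsSketch {
    private final long[] counters = new long[Metric.values().length];

    void addLongStat(Metric m, long delta) {
      counters[m.metricId()] += delta;
    }

    long get(Metric m) {
      return counters[m.metricId()];
    }
  }

  public static void main(String[] args) {
    OperatorStatsSketch stats = new OperatorStatsSketch();

    // The generic next() loop would bump the counter once per upstream batch.
    // If spilled batches are re-read through the same path, they are counted
    // here too, which is the caveat raised in the comment above.
    for (int batch = 0; batch < 5; batch++) {
      stats.addLongStat(Metric.INPUT_BATCH_COUNT, 1);
    }

    System.out.println("input batches seen: " + stats.get(Metric.INPUT_BATCH_COUNT));
  }
}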
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
> limit batch size for hash aggregate
> -----------------------------------
>
> Key: DRILL-6310
> URL: https://issues.apache.org/jira/browse/DRILL-6310
> Project: Apache Drill
> Issue Type: Improvement
> Components: Execution - Flow
> Affects Versions: 1.13.0
> Reporter: Padma Penumarthy
> Assignee: Padma Penumarthy
> Priority: Major
> Fix For: 1.14.0
>
>
> limit batch size for hash aggregate based on memory.
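As background on the change itself, the sizing idea is simple arithmetic: estimate the
width of one output row, then cap the number of rows per outgoing batch so the batch
stays within a memory budget. A rough sketch follows; the 16 MB budget, the row-width
figure, and the helper names are assumptions for illustration, not the actual values or
code in the pull request (only the 64K row ceiling per batch is Drill's real limit).

// Illustrative Java sketch of limiting batch row count by a memory budget.
public class BatchSizeSketch {

  static final int MAX_ROWS_PER_BATCH = 65_536;       // hard per-batch row ceiling
  static final long BATCH_MEMORY_BUDGET = 16L << 20;  // assumed 16 MB output budget

  // Rows allowed per batch, given an estimated width (bytes) of one output row.
  static int rowLimitFor(long estimatedRowWidthBytes) {
    if (estimatedRowWidthBytes <= 0) {
      return MAX_ROWS_PER_BATCH;  // no estimate yet: fall back to the ceiling
    }
    long byBudget = BATCH_MEMORY_BUDGET / estimatedRowWidthBytes;
    return (int) Math.max(1, Math.min(MAX_ROWS_PER_BATCH, byBudget));
  }

  public static void main(String[] args) {
    // With ~500-byte rows, a 16 MB budget allows roughly 33K rows per batch.
    System.out.println(rowLimitFor(500));
  }
}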
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)