[GitHub] [arrow-datafusion] kazuyukitanimura commented on a diff in pull request #7400: feat: Support spilling for hash aggregation

via GitHub Thu, 14 Sep 2023 13:39:15 -0700


kazuyukitanimura commented on code in PR #7400:
URL: https://github.com/apache/arrow-datafusion/pull/7400#discussion_r1326491659



##########
datafusion/core/src/physical_plan/aggregates/row_hash.rs:
##########
@@ -120,6 +151,56 @@ use super::AggregateExec;
 /// hash table).
 ///
 /// [`group_values`]: Self::group_values
+///
+/// # Spilling
+///
+/// The sizes of group values and accumulators can become large. Before that causes out of memory,
+/// this hash aggregator outputs those data early for partial aggregation or spills to local disk

Review Comment:
   The way I distinguish between `output` and `emit` is:
   `output`: producing results from the aggregation exec, visible to the downstream operator.
   `emit`: reading groups out of the hash table, which is more internal.
   Since the subject of this sentence is the `hash aggregator`, I chose `output`.
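
To make the distinction concrete, here is a minimal sketch, not the code in this PR. It assumes a toy SUM aggregator whose names (`ToyHashAggregator`, `max_groups`, `outputs`) are all hypothetical, and it uses a group-count limit as a stand-in for a memory budget. It covers only the early-output path for partial aggregation; spilling to disk is left out.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a hash aggregator computing SUM per key.
struct ToyHashAggregator {
    /// Group key -> running sum (the internal hash table).
    groups: HashMap<String, i64>,
    /// Assumed limit on the number of groups, standing in for a memory budget.
    max_groups: usize,
    /// Batches handed to the downstream operator.
    outputs: Vec<Vec<(String, i64)>>,
}

impl ToyHashAggregator {
    fn new(max_groups: usize) -> Self {
        Self {
            groups: HashMap::new(),
            max_groups,
            outputs: Vec::new(),
        }
    }

    /// `emit`: internal step that reads the groups out of the hash table.
    /// The table is left empty; the batch is sorted only to make it deterministic.
    fn emit(&mut self) -> Vec<(String, i64)> {
        let mut batch: Vec<(String, i64)> = self.groups.drain().collect();
        batch.sort();
        batch
    }

    /// `output`: what the operator as a whole produces for its consumer.
    fn output(&mut self) {
        let batch = self.emit();
        if !batch.is_empty() {
            self.outputs.push(batch);
        }
    }

    /// Accumulate one input row; output early once the table reaches the limit.
    fn update(&mut self, key: &str, value: i64) {
        *self.groups.entry(key.to_string()).or_insert(0) += value;
        if self.groups.len() >= self.max_groups {
            self.output();
        }
    }

    /// End of input: output whatever is still in the table.
    fn finish(&mut self) {
        self.output();
    }
}
```

With `max_groups` set to 2, feeding the rows `("a", 1)`, `("b", 2)`, `("a", 3)` and then calling `finish` produces two output batches, and key `a` appears in both. That is acceptable for partial aggregation, because a later final aggregation merges the partial sums.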



