kazuyukitanimura commented on code in PR #7400: URL: https://github.com/apache/arrow-datafusion/pull/7400#discussion_r1326491659
##########
datafusion/core/src/physical_plan/aggregates/row_hash.rs:
##########
@@ -120,6 +151,56 @@ use super::AggregateExec;
 /// hash table).
 ///
 /// [`group_values`]: Self::group_values
+///
+/// # Spilling
+///
+/// The sizes of group values and accumulators can become large. Before that causes out of memory,
+/// this hash aggregator outputs those data early for partial aggregation or spills to local disk

Review Comment:
   I distinguish between `output` and `emit` as follows.
   `output`: output from the aggregation exec.
   `emit`: reading out from the hash table, which is more internal.
   Since the subject of this sentence is the hash aggregator, I chose `output`.
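
A minimal sketch of that distinction, not the PR's actual code. Every name here (`GroupTable`, `HashAggregator`, `Action`, `max_groups`) is hypothetical, and a simple group-count limit stands in for real memory accounting. `emit` is the internal step that reads groups out of the hash table; `Output` and `Spill` are what the aggregator then does with those rows, following the doc comment above (early output for partial aggregation, spill to local disk otherwise).

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the aggregator's internal hash table.
struct GroupTable {
    sums: HashMap<String, i64>,
}

impl GroupTable {
    /// `emit`: internal step that reads all groups out of the hash table,
    /// leaving it empty.
    fn emit(&mut self) -> Vec<(String, i64)> {
        let mut rows: Vec<(String, i64)> = self.sums.drain().collect();
        rows.sort(); // deterministic order, only for this example
        rows
    }
}

/// What the (hypothetical) aggregator does with the emitted rows.
#[derive(Debug, PartialEq)]
enum Action {
    /// `output`: rows leave the aggregation operator and go downstream.
    Output(Vec<(String, i64)>),
    /// Rows are spilled; a real implementation would write them to disk.
    Spill(Vec<(String, i64)>),
}

struct HashAggregator {
    table: GroupTable,
    /// True when running as a partial aggregation.
    partial: bool,
    /// Assumed limit: a group count standing in for a memory budget.
    max_groups: usize,
}

impl HashAggregator {
    /// Adds `value` to the running sum for `key`. Once the table exceeds
    /// the limit, emits its contents and either outputs or spills them.
    fn update(&mut self, key: &str, value: i64) -> Option<Action> {
        *self.table.sums.entry(key.to_string()).or_insert(0) += value;
        if self.table.sums.len() <= self.max_groups {
            return None;
        }
        let rows = self.table.emit();
        Some(if self.partial {
            Action::Output(rows)
        } else {
            Action::Spill(rows)
        })
    }
}
```

In this sketch the caller only ever sees `Output` or `Spill`; `emit` stays private to the aggregator, which is why the doc comment, whose subject is the hash aggregator, uses `output`.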
