alamb commented on code in PR #9818:
URL: https://github.com/apache/arrow-datafusion/pull/9818#discussion_r1565589049
##########
datafusion/physical-plan/src/aggregates/row_hash.rs:
##########
@@ -787,7 +789,8 @@ impl GroupedHashAggregateStream {
let timer = elapsed_compute.timer();
self.exec_state = if self.spill_state.spills.is_empty() {
let batch = self.emit(EmitTo::All, false)?;
- ExecutionState::ProducingOutput(batch)
+ let batches = self.split_batch(batch)?;
Review Comment:
I think the key point of the request is to avoid the call to
`emit(EmitTo::All)`, or perhaps to change that call to return a `Vec<RecordBatch>`.
Taking a single large record batch and slicing it up doesn't change how the
underlying memory is allocated or laid out (i.e., the same large contiguous batch
is still used).
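
To illustrate the difference, here is a minimal std-only sketch. `Column`, `split_by_slicing`, `emit_in_chunks`, and `distinct_allocations` are hypothetical stand-ins invented for this example; they are not DataFusion or Arrow APIs. The toy `slice` only mimics the zero-copy behaviour of slicing a record batch (a new offset/length window over the same shared buffer), and `emit_in_chunks` is just one possible shape for an emit that returns several independently allocated batches.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an Arrow array: a shared buffer plus an
/// offset/length window into it. Not the real Arrow type.
#[derive(Clone)]
struct Column {
    buffer: Arc<Vec<i64>>,
    offset: usize,
    len: usize,
}

impl Column {
    fn new(values: Vec<i64>) -> Self {
        let len = values.len();
        Column { buffer: Arc::new(values), offset: 0, len }
    }

    /// Zero-copy slice: only the window changes, the buffer is shared.
    fn slice(&self, offset: usize, len: usize) -> Self {
        assert!(offset + len <= self.len, "slice out of bounds");
        Column {
            buffer: Arc::clone(&self.buffer),
            offset: self.offset + offset,
            len,
        }
    }

    fn values(&self) -> &[i64] {
        &self.buffer[self.offset..self.offset + self.len]
    }
}

/// Mirrors the approach in the diff: build one big column, then slice it.
fn split_by_slicing(big: &Column, batch_size: usize) -> Vec<Column> {
    assert!(batch_size > 0, "batch_size must be positive");
    let mut out = Vec::new();
    let mut start = 0;
    while start < big.len {
        let n = batch_size.min(big.len - start);
        out.push(big.slice(start, n));
        start += n;
    }
    out
}

/// Sketch of the alternative: each chunk is built into its own allocation,
/// so no single large contiguous buffer is ever created.
fn emit_in_chunks(values: impl Iterator<Item = i64>, batch_size: usize) -> Vec<Column> {
    assert!(batch_size > 0, "batch_size must be positive");
    let mut out = Vec::new();
    let mut current = Vec::with_capacity(batch_size);
    for v in values {
        current.push(v);
        if current.len() == batch_size {
            let full = std::mem::replace(&mut current, Vec::with_capacity(batch_size));
            out.push(Column::new(full));
        }
    }
    if !current.is_empty() {
        out.push(Column::new(current));
    }
    out
}

/// Counts how many distinct backing buffers a set of columns refers to.
fn distinct_allocations(columns: &[Column]) -> usize {
    let mut ptrs: Vec<usize> = columns
        .iter()
        .map(|c| Arc::as_ptr(&c.buffer) as usize)
        .collect();
    ptrs.sort_unstable();
    ptrs.dedup();
    ptrs.len()
}
```

With ten values and a batch size of four, both functions return three columns, but `distinct_allocations` reports one backing buffer for the sliced result and three for the chunked one. In the sliced case the large buffer also stays alive until every slice has been dropped.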