gabotechs commented on code in PR #16816: URL: https://github.com/apache/datafusion/pull/16816#discussion_r2223443238
########## datafusion/functions-aggregate/src/array_agg.rs: ########## @@ -315,11 +313,7 @@ impl Accumulator for ArrayAggAccumulator { }; if !val.is_empty() { - // The ArrayRef might be holding a reference to its original input buffer, so - // storing it here directly copied/compacted avoids over accounting memory - // not used here. - self.values - .push(make_array(copy_array_data(&val.to_data()))); Review Comment: I don't think we will be using more memory. Right now the current code is in an intermediate state where only some of the accumulations in `ArrayAggAccumulator` are being compacted (https://github.com/apache/datafusion/pull/16346, which is merged) and some other don't (https://github.com/apache/datafusion/pull/16519, which was not merged due to performance concerns). So most likely the buffers that bake the accumulated data are still being retained anyways, because https://github.com/apache/datafusion/pull/16519 was not merged. If any, this PR should reduce memory usage, as we are no longer copying data, and improve accounting, as we are not double-counting several times the buffers that bake the accumulated data. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org