jayzhan211 commented on issue #13188: URL: https://github.com/apache/datafusion/issues/13188#issuecomment-2461547990
Apart from https://github.com/apache/arrow-rs/issues/6692: what if we create a filter version of `GroupColumn` that accumulates the filtered arrays into an array builder **for each column** and emits output once the batch size reaches the target (e.g. 8192)? We could then aggregate the small batches inside the filter exec and produce a single large batch for the next step. Does this sound like a possible improvement? I think the downside is, again, that we would need to implement a builder per type, like `GroupColumn`, and the improvement is unclear.
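To make the idea concrete, here is a rough sketch of the accumulate-then-emit behaviour for a single nullable `i64` column. It deliberately avoids the arrow crates: `FilterAccumulator`, `push`, and `finish` are hypothetical names, and a plain `Vec<Option<i64>>` stands in for a typed array builder. A real version would presumably need one such builder per data type, as with `GroupColumn`, and one instance per column sharing the same filter mask.

```rust
/// Hypothetical stand-in for a per-type "filter column" builder.
/// A `Vec<Option<i64>>` plays the role of a typed array builder.
struct FilterAccumulator {
    /// Number of rows to collect before emitting a batch (e.g. 8192).
    target: usize,
    /// Rows that passed the filter but have not been emitted yet.
    buffer: Vec<Option<i64>>,
}

impl FilterAccumulator {
    fn new(target: usize) -> Self {
        assert!(target > 0, "target batch size must be positive");
        Self {
            target,
            buffer: Vec::with_capacity(target),
        }
    }

    /// Append the rows of `input` whose `mask` entry is true.
    /// Returns every batch that reached exactly `target` rows during this call.
    fn push(&mut self, input: &[Option<i64>], mask: &[bool]) -> Vec<Vec<Option<i64>>> {
        assert_eq!(input.len(), mask.len(), "mask must match input length");
        let mut full_batches = Vec::new();
        for (value, &keep) in input.iter().zip(mask) {
            if keep {
                self.buffer.push(*value);
                if self.buffer.len() == self.target {
                    // Hand the full buffer out and start a fresh one.
                    let full = std::mem::replace(
                        &mut self.buffer,
                        Vec::with_capacity(self.target),
                    );
                    full_batches.push(full);
                }
            }
        }
        full_batches
    }

    /// Flush the remaining rows at end of input, if there are any.
    fn finish(&mut self) -> Option<Vec<Option<i64>>> {
        if self.buffer.is_empty() {
            None
        } else {
            Some(std::mem::take(&mut self.buffer))
        }
    }
}
```

With this shape the filter exec would call `push` once per input batch and forward only full batches downstream, then call `finish` when the input is exhausted. Whether this beats filtering each batch and concatenating afterwards is exactly the open question; the sketch only shows the control flow.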