alamb commented on PR #14299: URL: https://github.com/apache/datafusion/pull/14299#issuecomment-2614619310
Here is my entire benchmark run:

```
++ critcmp main improve-performance-for-array-agg-merge-batch
group                                                                            improve-performance-for-array-agg-merge-batch    main
-----                                                                            ---------------------------------------------    ----
array_agg i64 merge_batch 30% nulls, 0% of nulls point to a zero length array    1.00  565.6±1.11µs   ? ?/sec                     1.00     565.4±1.29µs   ? ?/sec
array_agg i64 merge_batch 30% nulls, 100% of nulls point to a zero length array  1.00  7.7±0.01µs     ? ?/sec                     73.20    562.7±1.07µs   ? ?/sec
array_agg i64 merge_batch 30% nulls, 50% of nulls point to a zero length array   1.02  579.6±3.49µs   ? ?/sec                     1.00     568.4±28.98µs  ? ?/sec
array_agg i64 merge_batch 30% nulls, 90% of nulls point to a zero length array   1.00  565.2±0.70µs   ? ?/sec                     1.00     563.3±0.61µs   ? ?/sec
array_agg i64 merge_batch 30% nulls, 99% of nulls point to a zero length array   1.01  566.8±0.52µs   ? ?/sec                     1.00     562.4±1.00µs   ? ?/sec
array_agg i64 merge_batch 70% nulls, 0% of nulls point to a zero length array    1.01  262.1±17.69µs  ? ?/sec                     1.00     259.3±15.89µs  ? ?/sec
array_agg i64 merge_batch 70% nulls, 100% of nulls point to a zero length array  1.00  7.5±0.01µs     ? ?/sec                     34.02    256.2±1.94µs   ? ?/sec
array_agg i64 merge_batch 70% nulls, 50% of nulls point to a zero length array   1.06  271.8±1.50µs   ? ?/sec                     1.00     256.5±0.53µs   ? ?/sec
array_agg i64 merge_batch 70% nulls, 90% of nulls point to a zero length array   1.02  261.5±17.90µs  ? ?/sec                     1.00     256.0±0.54µs   ? ?/sec
array_agg i64 merge_batch 70% nulls, 99% of nulls point to a zero length array   1.00  259.0±0.30µs   ? ?/sec                     1.00     259.6±16.06µs  ? ?/sec
array_agg i64 merge_batch all nulls, 100% of nulls point to a zero length array  1.00  85.5±0.08ns    ? ?/sec                     230.91   19.7±0.01µs    ? ?/sec
array_agg i64 merge_batch all nulls, 90% of nulls point to a zero length array   1.00  85.8±0.27ns    ? ?/sec                     229.79   19.7±0.02µs    ? ?/sec
array_agg i64 merge_batch no nulls                                               1.00  103.4±0.21ns   ? ?/sec                     6215.33  642.8±1.08µs   ? ?/sec
```

TLDR is that this PR appears to have a very significant performance improvement: roughly 6000x faster with no nulls, about 230x faster with all nulls, and 34x to 73x faster when 100% of nulls point to a zero length array. The remaining cases are essentially unchanged (within about 6% of `main`).
🚀 If no one beats me to it I will give it a good look in the upcoming week
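
For anyone reading the ratio columns in the critcmp output: as far as I understand it, each ratio is that branch's time divided by the fastest time in the same row, so `1.00` marks the faster side. A minimal sketch of that arithmetic, assuming this reading is right. The `rows` dictionary and the `relative_factors` helper are made up for illustration, the row labels are shortened, and the times are the rounded means copied from three rows of the run, so the results land close to, but not exactly on, the ratios critcmp printed.

```python
# Rounded mean times in nanoseconds, copied from three rows of the run.
# Each tuple is (PR branch, main). The labels are shortened for readability.
rows = {
    "no nulls": (103.4, 642.8e3),
    "all nulls, 100% zero length": (85.5, 19.7e3),
    "30% nulls, 100% zero length": (7.7e3, 562.7e3),
}


def relative_factors(times):
    """Divide every time by the fastest one, so the fastest maps to 1.0.

    Hypothetical helper: this mirrors how the ratio columns appear to be
    computed, it is not taken from critcmp itself.
    """
    fastest = min(times)
    return [t / fastest for t in times]


for name, times in rows.items():
    pr_factor, main_factor = relative_factors(times)
    print(f"{name}: PR {pr_factor:.2f}, main {main_factor:.2f}")
```

The small differences from the printed ratios (for example 6215.33 for the no nulls case) are expected, because the inputs here are already rounded to one decimal place.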