alamb commented on PR #16208: URL: https://github.com/apache/datafusion/pull/16208#issuecomment-2927076431
I ran q24 locally, did see a small slowdown, and did some profiling.

As expected, filtering is about 30% of the overall execution time. Of the filtering time, about 1/2 goes to creating the output.

The analysis reveals some more places to potentially improve (also aligned with @Dandandan's suggestions).

There is also some evidence of reallocation, as @Dandandan mentions: https://github.com/apache/datafusion/pull/16208#discussion_r2115799584

Next steps:
1. I will also try and update the code to avoid allocations when possible (specifically, recreate builders)
3. Optimize predicate creation some more (create it once / slice rather than twice)

<details><summary>Details</summary>
<p>

On this branch (alamb/test_filter_pushdown):

```shell
./datafusion-cli-alamb_test_filter_pushdown -f q24-many.sql | grep Elapsed
Elapsed 0.258 seconds.
Elapsed 0.237 seconds.
Elapsed 0.220 seconds.
Elapsed 0.213 seconds.
Elapsed 0.223 seconds.
Elapsed 0.225 seconds.
Elapsed 0.223 seconds.
Elapsed 0.219 seconds.
Elapsed 0.223 seconds.
```

On main:

```shell
datafusion-cli -f q24-many.sql | grep Elapsed
Elapsed 0.220 seconds.
Elapsed 0.217 seconds.
Elapsed 0.203 seconds.
Elapsed 0.211 seconds.
Elapsed 0.216 seconds.
Elapsed 0.197 seconds.
Elapsed 0.203 seconds.
Elapsed 0.199 seconds.
Elapsed 0.218 seconds.
```

</p>
</details>
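To put a rough number on the slowdown, the timings listed in the comment can be averaged. This is only a minimal sketch: `mean_elapsed` is a hypothetical helper (not part of `datafusion-cli`), it assumes each timing line has the form `Elapsed N seconds.`, and nine runs per side is a small sample, so the result is indicative at best.

```shell
# Hypothetical helper: average the "Elapsed N seconds." lines read from stdin.
mean_elapsed() {
  awk '/^Elapsed/ { sum += $2; n++ } END { if (n) printf "%.4f\n", sum / n }'
}

# Timings copied from the two runs reported in the comment.
branch="0.258 0.237 0.220 0.213 0.223 0.225 0.223 0.219 0.223"
main="0.220 0.217 0.203 0.211 0.216 0.197 0.203 0.199 0.218"

for t in $branch; do echo "Elapsed $t seconds."; done | mean_elapsed   # prints 0.2268
for t in $main; do echo "Elapsed $t seconds."; done | mean_elapsed     # prints 0.2093
```

By this measure the branch is roughly 8% slower than main on q24 (0.2268 s vs 0.2093 s). The same helper could presumably be appended to the original pipeline, e.g. `datafusion-cli -f q24-many.sql | mean_elapsed`, since it only looks at lines starting with `Elapsed`.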