jayzhan211 commented on issue #11212: URL: https://github.com/apache/datafusion/issues/11212#issuecomment-2205885405
> Thanks for providing some benchmarks / example @acking-you !
>
> I think the missing optimizations seem to be:
>
> * Checking for array.true_count() to be all zero / all true. I think this is an optimization that might be better to implement in the arrow-rs and/or (`and_kleene`, `or_kleene`) kernels as a special case instead of in DataFusion.
> * UDF -> Immutable/non-volatile are executed during optimization in the constant evaluator, so the extra optimization will only apply for _non-volatile UDFs_.

I'm not sure that modifying the `and_kleene` / `or_kleene` logic would help, since those kernels take two boolean arrays as input, which means both sides have already been evaluated by the time they run. In this case we want to short-circuit instead: for an AND clause, if the left-hand side evaluates to a boolean array that is all false, the right-hand side should not need to be evaluated at all. I think we might need to extend the expression simplification logic in `Simplifier` to handle the array case 🤔
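To make the short-circuit idea concrete, here is a minimal sketch in plain Rust. It is not DataFusion or arrow-rs code: `BoolColumn` is a stand-in for arrow's `BooleanArray` (with `None` modelling NULL), and `short_circuit_and` and `and_kleene_slot` are hypothetical helpers invented for illustration. The right-hand side is passed as a closure so that it is only evaluated when the left-hand side does not already decide the result.

```rust
/// Stand-in for an arrow `BooleanArray` (assumption): `None` models a NULL slot.
type BoolColumn = Vec<Option<bool>>;

/// Kleene AND of two slots: `false` dominates, otherwise NULL propagates.
fn and_kleene_slot(l: Option<bool>, r: Option<bool>) -> Option<bool> {
    match (l, r) {
        (Some(false), _) | (_, Some(false)) => Some(false),
        (Some(true), Some(true)) => Some(true),
        _ => None,
    }
}

/// Hypothetical helper: evaluates `lhs AND rhs`, calling `eval_rhs`
/// only when the left-hand side alone does not determine the result.
fn short_circuit_and<F>(lhs: &BoolColumn, eval_rhs: F) -> BoolColumn
where
    F: FnOnce() -> BoolColumn,
{
    // Every slot is a non-null `false`: the result is all false
    // no matter what the right-hand side would produce.
    if lhs.iter().all(|v| *v == Some(false)) {
        return lhs.clone();
    }

    // Otherwise the right-hand side is needed; combine slot by slot.
    let rhs = eval_rhs();
    assert_eq!(lhs.len(), rhs.len(), "columns must have equal length");
    lhs.iter()
        .zip(rhs.iter())
        .map(|(l, r)| and_kleene_slot(*l, *r))
        .collect()
}

fn main() {
    // Case 1: all-false left-hand side, so the closure is never called.
    let all_false: BoolColumn = vec![Some(false), Some(false), Some(false)];
    let mut rhs_evaluated = false;
    let out = short_circuit_and(&all_false, || {
        rhs_evaluated = true;
        vec![Some(true), None, Some(false)]
    });
    assert_eq!(out, all_false);
    assert!(!rhs_evaluated);

    // Case 2: mixed left-hand side, so the right-hand side must be evaluated.
    let mixed: BoolColumn = vec![Some(true), None, Some(false)];
    let out = short_circuit_and(&mixed, || vec![Some(true), Some(true), None]);
    assert_eq!(out, vec![Some(true), None, Some(false)]);
    println!("{:?}", out);
}
```

Note that the fast path requires the left-hand side to contain no NULLs: under Kleene logic `NULL AND true` is `NULL`, so a NULL slot on the left still depends on the right-hand side.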