pepijnve opened a new pull request, #18152: URL: https://github.com/apache/datafusion/pull/18152
## Which issue does this PR close?

- Improvement in the context of https://github.com/apache/datafusion/issues/18075
- Continues on #17898

## Rationale for this change

Case evaluation currently uses the generic `PhysicalExpr::evaluate_selection`. That implementation is correct, but because `evaluate_selection` is not specific to the `case` logic, it misses some optimisation opportunities. The main consequence is that too much work is spent filtering record batches.

This PR inlines the `evaluate_selection` logic for the `NoExpression` evaluation method and adapts it to the specific logic needed in `case_when_no_expr`.

## What changes are included in this PR?

Rewrite the `case_when_no_expr` evaluation loop to avoid as much unnecessary work as possible. In particular, the rows that remain to be evaluated are retained across loop iterations. This lets the record batch that needs to be filtered shrink as the loop progresses, which reduces the number of rows that need to be refiltered. If a `when` predicate does not match any rows at all, filtering is skipped entirely.

## Are these changes tested?

Covered by existing unit tests and SLTs.

## Are there any user-facing changes?

No
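
As an illustration of the loop described under "What changes are included in this PR?", here is a minimal sketch of the idea. It is not the code in this PR: it models a batch as a slice of `i64` values and uses plain vectors instead of Arrow arrays, and the names `Pred`, `Then` and `case_when_sketch` are hypothetical. The function also returns the number of per-row predicate evaluations so that the effect of the shrinking set of remaining rows is visible.

```rust
/// Hypothetical stand-ins for a `WHEN` predicate and a `THEN` expression.
type Pred = fn(i64) -> bool;
type Then = fn(i64) -> i64;

/// Simplified model of `CASE WHEN p1 THEN v1 WHEN p2 THEN v2 ... [ELSE e] END`.
/// Returns the per-row results and the number of per-row predicate evaluations.
fn case_when_sketch(
    input: &[i64],
    branches: &[(Pred, Then)],
    else_value: Option<Then>,
) -> (Vec<Option<i64>>, usize) {
    let mut result = vec![None; input.len()];
    // Indices of rows not yet matched by any branch. This plays the role of
    // the "remaining" record batch that is retained across loop iterations.
    let mut remaining: Vec<usize> = (0..input.len()).collect();
    let mut predicate_evals = 0;

    for (pred, then) in branches {
        if remaining.is_empty() {
            break; // every row already has a value
        }
        // Evaluate the predicate only on the rows that are still remaining.
        predicate_evals += remaining.len();
        let mask: Vec<bool> = remaining.iter().map(|&i| pred(input[i])).collect();

        // If the predicate matched nothing, skip the filtering step entirely.
        if !mask.iter().any(|&m| m) {
            continue;
        }

        // Write results for matching rows and keep only the non-matching
        // rows, so the next iteration works on a smaller set.
        let mut still_remaining = Vec::with_capacity(remaining.len());
        for (&i, &matched) in remaining.iter().zip(&mask) {
            if matched {
                result[i] = Some(then(input[i]));
            } else {
                still_remaining.push(i);
            }
        }
        remaining = still_remaining;
    }

    // Rows that matched no branch get the ELSE value, or stay NULL (`None`).
    if let Some(else_fn) = else_value {
        for &i in &remaining {
            result[i] = Some(else_fn(input[i]));
        }
    }
    (result, predicate_evals)
}
```

For the input `[1, 5, 10, 20, 30]` with branches `x < 6`, `x > 100` and `x < 25`, this sketch evaluates predicates on 5, 3 and 3 rows respectively (11 in total), where evaluating every branch against the full batch would take 15. The second branch matches nothing, so the set of remaining rows is left untouched for that iteration.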
