pepijnve opened a new pull request, #18152:
URL: https://github.com/apache/datafusion/pull/18152

   ## Which issue does this PR close?
   
   - Improvement in the context of https://github.com/apache/datafusion/issues/18075
   - Continues on #17898
   
   ## Rationale for this change
   
   Case evaluation currently uses the generic 
`PhysicalExpr::evaluate_selection`. That implementation is correct, but because 
`evaluate_selection` is not specific to the `case` logic, we're missing some 
optimisation opportunities. The main consequence is that more work than 
necessary is spent filtering record batches. This PR inlines the 
`evaluate_selection` logic for the `NoExpression` evaluation method and adapts 
it to the specific needs of `case_when_no_expr`.
   
   ## What changes are included in this PR?
   
   Rewrite the `case_when_no_expr` evaluation loop to avoid as much unnecessary 
work as possible. In particular, the rows that remain to be evaluated are 
retained across loop iterations. This allows the record batch that needs to be 
filtered to shrink as the loop progresses, which reduces the number of rows that 
need to be refiltered. If a when predicate does not match any rows at all, 
filtering is avoided entirely.
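
The loop described above can be modelled roughly as follows. This is a simplified sketch, not the code in this PR: it uses plain `Vec`s instead of Arrow arrays and record batches, and `Branch`, the free function `case_when_no_expr`, and the `rows_seen` counter are hypothetical names introduced for illustration only.

```rust
/// Hypothetical stand-in for one `WHEN <predicate> THEN <value>` pair.
/// The predicate returns `None` to model a SQL NULL result.
struct Branch {
    when: fn(i64) -> Option<bool>,
    then: fn(i64) -> i64,
}

/// Models `CASE WHEN ... THEN ... [ELSE ...] END` over a single column.
/// Returns the per-row results (`None` models NULL) and, for illustration,
/// the number of rows each evaluated predicate had to look at.
fn case_when_no_expr(
    input: &[i64],
    branches: &[Branch],
    else_expr: Option<fn(i64) -> i64>,
) -> (Vec<Option<i64>>, Vec<usize>) {
    let mut out: Vec<Option<i64>> = vec![None; input.len()];
    // Rows not yet claimed by a branch, kept across iterations as
    // (original row index, value). This plays the role of the shrinking batch.
    let mut remaining: Vec<(usize, i64)> = input.iter().copied().enumerate().collect();
    let mut rows_seen = Vec::with_capacity(branches.len());

    for branch in branches {
        if remaining.is_empty() {
            break; // every row already has a result
        }
        rows_seen.push(remaining.len());

        // Evaluate the predicate on the remaining rows only.
        // NULL (None) and false both count as "no match".
        let mask: Vec<bool> = remaining
            .iter()
            .map(|&(_, v)| (branch.when)(v) == Some(true))
            .collect();

        // No row matched: leave `remaining` untouched and skip filtering.
        if !mask.iter().any(|&hit| hit) {
            continue;
        }

        // Write results for matching rows and keep only the non-matching
        // rows for the next iteration.
        let mut next = Vec::with_capacity(remaining.len());
        for (&(idx, v), &hit) in remaining.iter().zip(&mask) {
            if hit {
                out[idx] = Some((branch.then)(v));
            } else {
                next.push((idx, v));
            }
        }
        remaining = next;
    }

    // Rows that matched no branch get the ELSE value, or stay NULL.
    if let Some(else_fn) = else_expr {
        for &(idx, v) in &remaining {
            out[idx] = Some(else_fn(v));
        }
    }
    (out, rows_seen)
}
```

In this model, a five-row input whose first predicate matches two rows leaves only three rows for every later predicate, and a predicate that matches nothing causes no rebuild of `remaining` at all.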
   
   ## Are these changes tested?
   
   Covered by existing unit tests and sqllogictests (SLTs).
   
   ## Are there any user-facing changes?
   
   No


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

