neilconway commented on PR #22343:
URL: https://github.com/apache/datafusion/pull/22343#issuecomment-4587423560

   @alamb 
   
   > if this PR had a `config.optimizer.reorder_filters` type config knob to 
turn this optimization off, I think it would be a good addition to DataFusion.
   
   I think it would be misleading if `reorder_filters=false` **just** disabled 
this optimization, because we reorder filters today (and that behavior is not 
configurable): 
https://github.com/apache/datafusion/blob/3e006c99c29388d4ba7b78af40e8c8a410151287/datafusion/optimizer/src/simplify_expressions/simplify_predicates.rs#L50-L89
   
   We could go further and have `reorder_filters=false` either disable 
predicate simplification or ensure that any predicate simplification that is 
done respects the query text order.
   
   My concern there is that it seems pretty fragile to guarantee that filter 
evaluation order matches query text order. I could see this very easily being 
violated by other optimizations, either existing or that will be added in the 
future. For example, what about predicate pushdown? If the query has a list of 
predicates and some of them can be pushed down while others cannot, that would 
result in predicate evaluation order becoming inconsistent with the query text 
order.
   
   So we could _also_ disable predicate pushdown when `reorder_filters` is 
false... but then what about CNF/DNF-style normalization? Offhand I'd expect 
that might also change predicate evaluation order, I'd need to check more 
thoroughly. I'd also expect that when we convert `WHERE`-clause join filters 
into `ON` clauses, we probably don't do that in a way that always respects the 
query text for all classes of predicates (e.g., non-equijoin filters, volatile 
expressions, etc).
   
   tldr -- If we truly want to guarantee that evaluation order matches query 
text order, considerably more work is required than just gating this PR in a 
boolean variable. Do we have any signal that this is a real pain point with 
users?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to