neilconway commented on PR #22343: URL: https://github.com/apache/datafusion/pull/22343#issuecomment-4587423560
@alamb > if this PR had a `config.optimizer.reorder_filters` type config knob to turn this optimization off, I think it would be a good addition to DataFusion. I think it would be misleading if `reorder_filters=false` **just** disabled this optimization, because we reorder filters today (and that behavior is not configurable): https://github.com/apache/datafusion/blob/3e006c99c29388d4ba7b78af40e8c8a410151287/datafusion/optimizer/src/simplify_expressions/simplify_predicates.rs#L50-L89 We could go further and have `reorder_filters=false` either disable predicate simplification or ensure that any predicate simplification that is done respects the query text order. My concern there is that it seems pretty fragile to guarantee that filter evaluation order matches query text order. I could see this very easily being violated by other optimizations, either existing or that will be added in the future. For example, what about predicate pushdown? If the query has a list of predicates and some of them can be pushed down while others cannot, that would result in predicate evaluation order becoming inconsistent with the query text order. So we could _also_ disable predicate pushdown when `reorder_filters` is false... but then what about CNF/DNF-style normalization? Offhand I'd expect that might also change predicate evaluation order, I'd need to check more thoroughly. I'd also expect that when we convert `WHERE`-clause join filters into `ON` clauses, we probably don't do that in a way that always respects the query text for all classes of predicates (e.g., non-equijoin filters, volatile expressions, etc). tldr -- If we truly want to guarantee that evaluation order matches query text order, considerably more work is required than just gating this PR in a boolean variable. Do we have any signal that this is a real pain point with users? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
