Jefffrey commented on issue #4091: URL: https://github.com/apache/arrow-datafusion/issues/4091#issuecomment-1311664315
Looks like the particular behaviour for query 12 in TPC-H is caused by the `return Ok(unhandled)` here:

https://github.com/apache/arrow-datafusion/blob/509c82c6d624bb63531f67531195b562a241c854/datafusion/core/src/physical_optimizer/pruning.rs#L787-L795

The `and true`s are generated by the `l_commitdate < l_receiptdate` and `l_shipdate < l_commitdate` conditions.

Would a potential fix be to introduce a step after `build_predicate_expression(...)` is called that folds the resulting expression down to remove those redundant conditions after the fact?

https://github.com/apache/arrow-datafusion/blob/509c82c6d624bb63531f67531195b562a241c854/datafusion/core/src/physical_optimizer/pruning.rs#L129-L139

Or perhaps refactor the `build_predicate_expression(...)` function itself so that it doesn't simply return a boolean TRUE for unhandled cases (which causes `and true` to be appended to expressions), and instead returns something more informative, like an `Option` (instead of the current `Expr`), to indicate whether an expression was generated or not? That would avoid introducing the `and true` in the first place, if possible.
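To make the two options concrete, here is a rough sketch on a toy `Expr` enum. This is not DataFusion's real `Expr` or pruning code: the names `fold_and_true` and `build_pruning_expr` are made up for illustration, and treating every column-vs-column comparison as "unhandled" is an assumption that only mirrors the two conditions mentioned above.

```rust
// Toy stand-in for an expression tree; NOT DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Int(i64),
    Literal(bool),
    And(Box<Expr>, Box<Expr>),
    Lt(Box<Expr>, Box<Expr>),
}

// Option 1 (hypothetical helper): fold the expression after it has been
// built, dropping any `true` operand of an AND.
fn fold_and_true(expr: Expr) -> Expr {
    match expr {
        Expr::And(l, r) => {
            // Fold the children first so nested `and true`s collapse too.
            let l = fold_and_true(*l);
            let r = fold_and_true(*r);
            match (l, r) {
                (Expr::Literal(true), other) | (other, Expr::Literal(true)) => other,
                (l, r) => Expr::And(Box::new(l), Box::new(r)),
            }
        }
        other => other,
    }
}

// Option 2 (hypothetical builder): return `None` for unhandled cases
// instead of a literal TRUE, so no `and true` is ever created.
fn build_pruning_expr(expr: &Expr) -> Option<Expr> {
    match expr {
        Expr::And(l, r) => match (build_pruning_expr(l), build_pruning_expr(r)) {
            (Some(l), Some(r)) => Some(Expr::And(Box::new(l), Box::new(r))),
            // Only one side produced a predicate: keep it as-is.
            (Some(e), None) | (None, Some(e)) => Some(e),
            (None, None) => None,
        },
        // Assumption for this toy: column-vs-column comparisons are unhandled.
        Expr::Lt(l, r) if matches!((&**l, &**r), (Expr::Column(_), Expr::Column(_))) => None,
        // Everything else is passed through unchanged in this sketch.
        other => Some(other.clone()),
    }
}
```

With the second approach the caller would decide what a top-level `None` means (presumably "no pruning possible", i.e. the same as TRUE today), while the first approach leaves the builder untouched and only cleans up its output.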
