nseekhao commented on issue #7611: URL: https://github.com/apache/arrow-datafusion/issues/7611#issuecomment-1731559894
> Even if they are practically the same thing, I believe it will not be very correct if we give a different meaning to something that is None. Does producing as None cause an error?

This is from the original code. There is actually nothing wrong with this part. I just described it incorrectly in the issue description, so I wanted to correct it.

> AFAIK there is no distinction between post and pre-filters in datafusion joins, all non-equi filter predicates are collected in one place.

You're exactly right. datafusion does not have a post-join filter. Its join contains only the `on` and `filter` fields. This is why a Substrait plan produced from datafusion should have `None` in the `post_join_filter` field, since, as you pointed out, **datafusion has no notion of a post-join filter**.

The current producer encodes the datafusion join `filter` as the Substrait `post_join_filter` <-- **this is incorrect**.

The [Join](https://github.com/apache/arrow-datafusion/blob/main/datafusion/expr/src/logical_plan/plan.rs#L2208-L2225) struct has only two fields for conditions and, according to the comments, neither of them is to be applied post-join.

```Rust
/// Equijoin clause expressed as pairs of (left, right) join expressions
pub on: Vec<(Expr, Expr)>,
/// Filters applied during join (non-equi conditions)
pub filter: Option<Expr>,
```
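
To make the intended mapping concrete, here is a minimal sketch using simplified stand-in types. `Expr`, `Join`, `JoinRel`, and `to_join_rel` below are hypothetical illustrations, not the real datafusion or Substrait Rust types, and folding `filter` into the join `expression` with `AND` is an assumption about where the condition should go. The only point taken from the discussion above is that `post_join_filter` stays `None`.

```rust
// Hypothetical, simplified stand-ins. These are NOT the real datafusion or
// Substrait types; they only model the fields discussed above.

/// Tiny expression tree, just enough to combine join conditions.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Eq(Box<Expr>, Box<Expr>),
    Gt(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

/// Mirrors the two condition fields of the datafusion `Join` struct.
struct Join {
    /// Equijoin pairs of (left, right) expressions.
    on: Vec<(Expr, Expr)>,
    /// Non-equi condition applied during the join.
    filter: Option<Expr>,
}

/// Stand-in for the condition fields of a Substrait join relation.
#[derive(Debug, PartialEq)]
struct JoinRel {
    expression: Option<Expr>,
    post_join_filter: Option<Expr>,
}

/// Assumed mapping: AND together the equijoin pairs and the join filter
/// into `expression`, and never populate `post_join_filter`.
fn to_join_rel(join: &Join) -> JoinRel {
    let equi = join
        .on
        .iter()
        .map(|(l, r)| Expr::Eq(Box::new(l.clone()), Box::new(r.clone())));

    let expression = equi
        .chain(join.filter.clone())
        .reduce(|acc, e| Expr::And(Box::new(acc), Box::new(e)));

    JoinRel {
        expression,
        // The source join has no notion of a post-join filter.
        post_join_filter: None,
    }
}
```

With this shape, a join that has one `on` pair and a `filter` yields a single `AND` expression, and a consumer reading `post_join_filter` sees `None` regardless of the input.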
