nseekhao opened a new issue, #7611:
URL: https://github.com/apache/arrow-datafusion/issues/7611

   ### Is your feature request related to a problem or challenge?
   
   As of version 31.0.0, `datafusion` has a join 
[on](https://github.com/apache/arrow-datafusion/blob/main/datafusion/expr/src/logical_plan/plan.rs#L2214C11-L2214C11)
 field which is used for equi-join conditions and a join 
[filter](https://github.com/apache/arrow-datafusion/blob/main/datafusion/expr/src/logical_plan/plan.rs#L2216)
 field for non-equi-join conditions.
   
   Currently, the producer puts the non-equi-join conditions in the Substrait 
[post_join_filter](https://github.com/substrait-io/substrait/blob/main/proto/substrait/algebra.proto#L161).
 However, we can also combine all conditions and put them in the 
[expression](https://github.com/substrait-io/substrait/blob/main/proto/substrait/algebra.proto#L160)
 field of the JoinRel in Substrait.
   
   The motivation behind this change request is that it lets other DB 
systems decide what to do with the entire condition, as opposed to having to 
process the parts separately. Right now, if there is no equi-join condition, the 
producer outputs just `Literal(True)` as the join expression and puts the 
rest of the condition in the `post_join_filter`. The redundant `Literal` 
expression adds the unnecessary overhead of evaluating it. It also 
implies that you’re performing a cartesian product and THEN a filter, as opposed to 
just a non-equi-join, which does not completely align with the original plan's 
intent.
   
   ### Describe the solution you'd like
   
   All valid join conditions should be encoded in the Substrait `JoinRel`'s 
[expression](https://github.com/substrait-io/substrait/blob/main/proto/substrait/algebra.proto#L160)
 field.
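   
   A minimal sketch of the combination step, for illustration only. The `Expr` enum and the `combine_join_condition` function below are hypothetical stand-ins, not DataFusion's or Substrait's real types or APIs. The idea is to AND the equi-join key pairs together with the optional non-equi filter, and to fall back to `Literal(true)` only when the join has no condition at all.

```rust
// Toy expression type: a hypothetical stand-in for a real plan expression,
// used only to illustrate the folding logic.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(bool),
    Eq(Box<Expr>, Box<Expr>),
    Gt(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

/// Fold the equi-join key pairs (`on`) and the optional non-equi `filter`
/// into a single conjunction suitable for one join-expression field.
/// Returns `Literal(true)` only when there are no conditions at all.
fn combine_join_condition(on: &[(Expr, Expr)], filter: Option<&Expr>) -> Expr {
    // Turn each key pair into an equality predicate.
    let equalities = on
        .iter()
        .map(|(l, r)| Expr::Eq(Box::new(l.clone()), Box::new(r.clone())));

    // Append the filter (if any), then AND everything together left to right.
    equalities
        .chain(filter.cloned())
        .reduce(|acc, e| Expr::And(Box::new(acc), Box::new(e)))
        .unwrap_or(Expr::Literal(true))
}
```

   With this shape, a join that has only a non-equi condition such as `l.x > r.y` yields that condition directly as the join expression, with no `Literal(true)` placeholder and nothing left over for `post_join_filter`.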
   
   ### Describe alternatives you've considered
   
   The current approach is semantically correct, but it can make 
downstream query execution inefficient.
   
   ### Additional context
   
   _No response_

