Dandandan commented on a change in pull request #796:
URL: https://github.com/apache/arrow-datafusion/pull/796#discussion_r680475141
##########
File path: datafusion/src/sql/planner.rs
##########
@@ -372,25 +373,91 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
// extract join keys
extract_join_keys(&expr, &mut keys, &mut filter);
+ let mut cols = HashSet::new();
+ exprlist_to_columns(&filter, &mut cols)?;
+
let (left_keys, right_keys): (Vec<Column>, Vec<Column>) =
keys.into_iter().unzip();
- // return the logical plan representing the join
- let join = LogicalPlanBuilder::from(left).join(
- right,
- join_type,
- (left_keys, right_keys),
- )?;
+ // return the logical plan representing the join
if filter.is_empty() {
+ let join = LogicalPlanBuilder::from(left).join(
+ &right,
+ join_type,
+ (left_keys, right_keys),
+ )?;
join.build()
} else if join_type == JoinType::Inner {
+ let join = LogicalPlanBuilder::from(left).join(
+ &right,
+ join_type,
+ (left_keys, right_keys),
+ )?;
join.filter(
filter
.iter()
.skip(1)
.fold(filter[0].clone(), |acc, e|
acc.and(e.clone())),
)?
.build()
+ }
+ // Left join with all non-equijoin expressions from the right
Review comment:
One option for supporting non-equijoin filter expressions in the left/left and
right/right combinations would be to support filter expressions in the join
implementation itself.
For efficiency, expressions that refer only to the left side (or only to the
right side for a right join) can be evaluated before loading those rows into
the hashmap. Rows that fail the condition can be emitted directly as
non-matching left rows with null values for the right side, which saves memory
and time if the additional condition filters out a lot of rows.
Filters that reference both left and right columns (e.g. `left_col >
right_col`) could be evaluated inside the probing loop when producing the
(left, right) tuples, or as a post-processing step that nullifies the right
side of the non-matching rows.
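To make the idea concrete, here is a minimal sketch of a left join that takes both kinds of filter. It is not DataFusion's hash join: `LeftRow`, `RightRow`, and `left_join_with_filters` are hypothetical names invented for illustration, rows are plain structs rather than Arrow batches, and the left side is assumed to be the build side.

```rust
use std::collections::HashMap;

// Hypothetical row types for illustration only; not DataFusion types.
#[derive(Debug, Clone, PartialEq)]
struct LeftRow {
    key: i32,
    val: i32,
}

#[derive(Debug, Clone, PartialEq)]
struct RightRow {
    key: i32,
    val: i32,
}

/// Left join on `key` with two extra filters:
/// `left_only` refers only to the left row, `mixed` refers to both sides.
fn left_join_with_filters(
    left: &[LeftRow],
    right: &[RightRow],
    left_only: impl Fn(&LeftRow) -> bool,
    mixed: impl Fn(&LeftRow, &RightRow) -> bool,
) -> Vec<(LeftRow, Option<RightRow>)> {
    let mut out = Vec::new();
    // key -> indices of left rows that passed the left-only filter
    let mut build: HashMap<i32, Vec<usize>> = HashMap::new();
    let mut in_build = vec![false; left.len()];
    let mut matched = vec![false; left.len()];

    // Build phase: apply the left-only filter before touching the hashmap.
    for (i, l) in left.iter().enumerate() {
        if left_only(l) {
            build.entry(l.key).or_default().push(i);
            in_build[i] = true;
        } else {
            // Can never match, so emit it with a null right side right away.
            out.push((l.clone(), None));
        }
    }

    // Probe phase: evaluate the mixed filter while producing (left, right) tuples.
    for r in right {
        if let Some(indices) = build.get(&r.key) {
            for &i in indices {
                if mixed(&left[i], r) {
                    matched[i] = true;
                    out.push((left[i].clone(), Some(r.clone())));
                }
            }
        }
    }

    // Left rows that were in the hashmap but never matched still appear once.
    for (i, l) in left.iter().enumerate() {
        if in_build[i] && !matched[i] {
            out.push((l.clone(), None));
        }
    }
    out
}
```

In this sketch a left row that fails the left-only condition never occupies space in the hashmap, and a left row whose key matches but whose mixed condition fails for every right row is still emitted once with a null right side, which preserves left join semantics.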