Dandandan commented on a change in pull request #796:
URL: https://github.com/apache/arrow-datafusion/pull/796#discussion_r680475141



##########
File path: datafusion/src/sql/planner.rs
##########
@@ -372,25 +373,91 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
                 // extract join keys
                 extract_join_keys(&expr, &mut keys, &mut filter);
 
+                let mut cols = HashSet::new();
+                exprlist_to_columns(&filter, &mut cols)?;
+
                 let (left_keys, right_keys): (Vec<Column>, Vec<Column>) =
                     keys.into_iter().unzip();
-                // return the logical plan representing the join
-                let join = LogicalPlanBuilder::from(left).join(
-                    right,
-                    join_type,
-                    (left_keys, right_keys),
-                )?;
 
+                // return the logical plan representing the join
                 if filter.is_empty() {
+                    let join = LogicalPlanBuilder::from(left).join(
+                        &right,
+                        join_type,
+                        (left_keys, right_keys),
+                    )?;
                     join.build()
                 } else if join_type == JoinType::Inner {
+                    let join = LogicalPlanBuilder::from(left).join(
+                        &right,
+                        join_type,
+                        (left_keys, right_keys),
+                    )?;
                     join.filter(
                         filter
                             .iter()
                             .skip(1)
                             .fold(filter[0].clone(), |acc, e| acc.and(e.clone())),
                     )?
                     .build()
+                }
+                // Left join with all non-equijoin expressions from the right

Review comment:
One option for supporting non-equijoin filter expressions for the left/left and right/right combinations would be to support filter expressions in the join implementation itself.

For efficiency, expressions that refer only to the left side (or only to the right side, for a right join) can be evaluated before those values are loaded into the hashmap. Rows that fail the condition can then be emitted directly as non-matching left values plus null values for the right. This saves memory and time if the additional condition filters out a lot of rows.

Filters that include both left and right columns (e.g. `left_col > right_col`) could be evaluated inside the probing loop when producing the (left, right) tuples, or as a "post-processing" step that nullifies the non-matching rows on the right side.
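
To make the idea concrete, here is a minimal sketch of a left outer hash join that applies a left-only predicate before the build phase and a two-sided predicate inside the probe loop. It is not DataFusion code: the `Row` type, the `left_join_with_filter` function, and the two closure parameters are hypothetical names chosen for illustration, and it assumes the left side is the build side and that rows are plain `(key, value)` pairs.

```rust
use std::collections::HashMap;

/// Hypothetical row type for illustration: (join key, value).
type Row = (i32, i32);

/// Left outer join on key equality, with the extra ON-clause filter split
/// into a predicate over the left row only and a predicate over both rows.
/// All names here are assumptions for this sketch, not DataFusion APIs.
fn left_join_with_filter(
    left: &[Row],
    right: &[Row],
    left_only: impl Fn(&Row) -> bool,
    both: impl Fn(&Row, &Row) -> bool,
) -> Vec<(Row, Option<Row>)> {
    let mut out = Vec::new();
    // Build side: join key -> indices of left rows with that key.
    let mut build: HashMap<i32, Vec<usize>> = HashMap::new();
    let mut in_table = vec![false; left.len()];
    let mut matched = vec![false; left.len()];

    // Build phase: rows failing the left-only predicate can never match,
    // so emit them with a null right side and keep them out of the map.
    for (i, l) in left.iter().enumerate() {
        if left_only(l) {
            build.entry(l.0).or_default().push(i);
            in_table[i] = true;
        } else {
            out.push((*l, None));
        }
    }

    // Probe phase: evaluate the two-sided predicate while producing
    // the (left, right) tuples.
    for r in right {
        if let Some(indices) = build.get(&r.0) {
            for &i in indices {
                if both(&left[i], r) {
                    matched[i] = true;
                    out.push((left[i], Some(*r)));
                }
            }
        }
    }

    // Left rows that were in the map but never matched still appear once,
    // padded with a null right side.
    for (i, l) in left.iter().enumerate() {
        if in_table[i] && !matched[i] {
            out.push((*l, None));
        }
    }
    out
}
```

The "post-processing" alternative would instead emit every key match and then replace the right side with nulls where the two-sided predicate fails, which also requires collapsing a left row to a single null-padded row when none of its key matches pass. Filtering inside the probe loop avoids materializing those tuples in the first place.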



