mingmwang commented on code in PR #4185:
URL: https://github.com/apache/arrow-datafusion/pull/4185#discussion_r1020918310
##########
datafusion/optimizer/src/eliminate_cross_join.rs:
##########
@@ -44,143 +44,235 @@ impl ReduceCrossJoin {
}
}
+/// Attempt to reorder joins to reduce cross joins to inner joins.
+/// For queries such as:
+/// 'select ... from a, b where a.x = b.y and b.xx = 100;'
+/// 'select ... from a, b where (a.x = b.y and b.xx = 100) or (a.x = b.y and b.xx = 200);'
+/// 'select ... from a, b, c where (a.x = b.y and b.xx = 100 and a.z = c.z)
+/// or (a.x = b.y and b.xx = 200 and a.z=c.z);'
+/// For the above queries, the join predicates are available in the filters and are moved to
+/// the join nodes appropriately.
+/// This fix helps to improve the performance of TPCH Q19. issue#78
+///
 impl OptimizerRule for ReduceCrossJoin {
     fn optimize(
         &self,
         plan: &LogicalPlan,
         _optimizer_config: &mut OptimizerConfig,
     ) -> Result<LogicalPlan> {
-        let mut possible_join_keys: Vec<(Column, Column)> = vec![];
-        let mut all_join_keys = HashSet::new();
+        match plan {
+            LogicalPlan::Filter(filter) => {
+                let mut input = (**filter.input()).clone();
+
+                // optimize children.
+                input = self.optimize(&input, _optimizer_config)?;
+
Review Comment:
Should this call optimize on the children first? I think this rule should be a
top-down (pre-order traversal) optimization process.
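
To make the ordering question concrete, here is a minimal sketch of the two traversal orders. It uses a toy `Plan` enum, not DataFusion's `LogicalPlan`; the names `Plan`, `top_down`, `bottom_up`, and `sample_plan` are hypothetical and exist only for illustration. Optimizing the children first, as the diff does, corresponds to `bottom_up`; the pre-order process suggested here corresponds to `top_down`.

```rust
// Toy stand-in for a logical plan (an assumption for illustration,
// not DataFusion's LogicalPlan type).
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(&'static str),
    CrossJoin(Box<Plan>, Box<Plan>),
    Filter(Box<Plan>),
}

impl Plan {
    // Short label used to record the visit order.
    fn name(&self) -> String {
        match self {
            Plan::Scan(table) => format!("Scan({})", table),
            Plan::CrossJoin(_, _) => "CrossJoin".to_string(),
            Plan::Filter(_) => "Filter".to_string(),
        }
    }

    // Direct inputs of this node, left to right.
    fn children(&self) -> Vec<&Plan> {
        match self {
            Plan::Scan(_) => vec![],
            Plan::CrossJoin(left, right) => vec![left.as_ref(), right.as_ref()],
            Plan::Filter(input) => vec![input.as_ref()],
        }
    }
}

// Top-down (pre-order): handle the current node, then recurse into its inputs.
fn top_down(plan: &Plan, visited: &mut Vec<String>) {
    visited.push(plan.name());
    for child in plan.children() {
        top_down(child, visited);
    }
}

// Bottom-up (post-order): recurse into the inputs first, then handle the node.
fn bottom_up(plan: &Plan, visited: &mut Vec<String>) {
    for child in plan.children() {
        bottom_up(child, visited);
    }
    visited.push(plan.name());
}

// Filter over a cross join of two scans, the shape the rule targets.
fn sample_plan() -> Plan {
    Plan::Filter(Box::new(Plan::CrossJoin(
        Box::new(Plan::Scan("a")),
        Box::new(Plan::Scan("b")),
    )))
}
```

Calling `top_down(&sample_plan(), &mut order)` reaches the `Filter` node before the `CrossJoin` beneath it, whereas `bottom_up` reaches the `Filter` last. In this toy model that is the whole difference: a top-down rule sees the filter predicates and the untouched cross joins together, before anything below has been rewritten.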