Hello,

I am trying to create my own query planner based on Hive's implementation of the Calcite planner (https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java). I have split my optimization procedure into stages, similar to Hive's planner. First, I apply some pre-join-ordering optimizations. Then I use LoptOptimizeJoinRule.INSTANCE for join ordering, and finally I apply some rules that don't need statistics to get my final plan. I am facing two problems:
1) When I have a query like this:

    "select * "
    + "from s.products join s.orders "
    + "on s.orders.productid = s.products.productid "
    + " where units>10 and description < 20 " );

I get this plan after applying LoptOptimizeJoinRule:

    LogicalProject(rowtime=[$5], productid=[$6], description=[$7], rowtime0=[$0], orderid=[$1], productid0=[$2], units=[$3], customerid=[$4])
      LogicalJoin(condition=[=($6, $2)], joinType=[inner])
        LogicalFilter(condition=[>($3, 10)])
          LogicalTableScan(table=[[s, orders]])
        LogicalFilter(condition=[<($2, 20)])
          LogicalTableScan(table=[[s, products]])

The final plan has an extra projection on top of the join. As far as I can tell, this projection serves no purpose, and I want to get rid of it. I tried to write a rule that transforms project(join) -> join when the two have the same output schema, but I couldn't find out how to get the output schema of the join operator. Am I doing something wrong with the order in which I apply the rules, or with the way I apply them? Is there an easy way to get rid of this top project?

2) After I have used LoptOptimizeJoinRule and obtained my optimized join order, I can't use JoinCommuteRule, because the HepPlanner runs forever.

Thank you in advance,
George
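
To make the "same output schema" check from problem 1 concrete, here is a minimal sketch in plain Java. It does not use any Calcite API: the class name TrivialProjectCheck and the method isIdentity are hypothetical, and the projection is modelled simply as the list of input field indexes it references. Under that assumption, a projection is removable only if it forwards $0..$n-1 in order. The top LogicalProject in the plan above references $5, $6, $7, $0, $1, $2, $3, $4, so by this check it is a column permutation rather than a no-op, which may be why it survives.

```java
import java.util.List;

public class TrivialProjectCheck {

    /**
     * Returns true if the projection forwards every input field unchanged
     * and in order, i.e. its references are exactly $0, $1, ..., $(n-1).
     * (Hypothetical helper for illustration; not part of Calcite.)
     */
    public static boolean isIdentity(List<Integer> projectedRefs, int inputFieldCount) {
        if (projectedRefs.size() != inputFieldCount) {
            return false; // the projection drops or duplicates fields
        }
        for (int i = 0; i < projectedRefs.size(); i++) {
            if (projectedRefs.get(i) != i) {
                return false; // field i is reordered
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Input references of the top LogicalProject in the plan above.
        List<Integer> planRefs = List.of(5, 6, 7, 0, 1, 2, 3, 4);
        System.out.println(isIdentity(planRefs, 8)); // false

        // A projection that forwards all eight join fields in order.
        System.out.println(isIdentity(List.of(0, 1, 2, 3, 4, 5, 6, 7), 8)); // true
    }
}
```

In the plan, the join's inputs are orders followed by products, while the query lists products first, so the projection appears to restore the column order of the original query.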