danny0405 opened a new pull request #1157: [CALCITE-2696] Improve design of join-like relational expressions
URL: https://github.com/apache/calcite/pull/1157

### Diff
- Deprecate SemiJoin, EquiJoin, EnumerableSemiJoin, SemiJoinType, EnumerableSemiJoinRule, EnumerableThetaJoin
- Make EnumerableMergeJoin extend Join instead of EquiJoin
- Add SEMI and ANTI join types to JoinRelType, and add method returnsJustFirstInput() to decide whether the join only outputs the left side
- Correlate uses JoinRelType instead of SemiJoinType
- Rename EnumerableCorrelate to EnumerableNestedLoopJoin and make it extend Join instead of Correlate
- Rename EnumerableJoin to EnumerableHashJoin
- EnumerableJoinRule now converts a semi-join to EnumerableNestedLoopJoin (EnumerableSemiJoin's function is merged into this rule)
- Add method isNonCorrelateSemiJoin() in Join.java to determine whether a join is a semi-join (produced by SemiJoinRule) or comes from decorrelation (SubqueryRemoveRule or RelDecorrelator); a return value of true means the join is a semi-join equivalent to SemiJoin before this patch
- Cache the JoinInfo in Join and use it to get leftKeys and rightKeys; merge SemiJoin#computeSelfCost into Join#computeSelfCost
- RelBuilder removes SemiJoinFactory; method #semiJoin now returns a LogicalJoin with JoinRelType#SEMI

### Rules tweak
- JoinAddRedundantSemiJoinRule now creates a LogicalJoin with JoinRelType#SEMI instead of a SemiJoin
- JoinToCorrelateRule removes the SEMI instance and changes the match condition to !join.getJoinType().generatesNullsOnLeft(), which, unlike before this patch, also allows ANTI
- SemiJoinRule matches SEMI joins specifically

### Metadata tweak
- RelMdAllPredicates, RelMdExpressionLineage: add the full rowType to getAllPredicates(Join), because a semi-join only outputs one side
- RelMdColumnUniqueness, RelMdSelectivity, RelMdDistinctRowCount, RelMdSize, RelMdUniqueKeys: merge the semi-join logic into the join logic

### Test cases change
- MaterializationTest#testJoinMaterialization11 now materializes successfully because I allow the logical SemiJoin node to match. The original code matched SemiJoin via SemiJoin.class.isAssignableFrom(), which I think is wrong because it only matches subclasses of SemiJoin, and the only one before this patch was EnumerableSemiJoin.
- SortRemoveRuleTest#removeSortOverEnumerableCorrelate: because of CALCITE-2018, the final EnumerableSort's cost was cached from the previous EnumerableSort with logical children, so I removed the EnumerableSortRule and the best plan is now correct
- sub-query.iq has a better plan for null correlate
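
To illustrate the join-type changes described above, here is a minimal, self-contained sketch. It is not Calcite's actual code: `JoinKind`, `outputFields`, and `JoinKindDemo` are hypothetical names invented for illustration, and only the method names `returnsJustFirstInput()` and `generatesNullsOnLeft()` are taken from this description. It assumes SEMI and ANTI output only the left input, and that RIGHT and FULL are the types that generate nulls on the left.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a join-type enum; NOT Calcite's JoinRelType. */
enum JoinKind {
  INNER(false, false),
  LEFT(false, false),
  RIGHT(true, false),
  FULL(true, false),
  SEMI(false, true),
  ANTI(false, true);

  private final boolean nullsOnLeft;
  private final boolean justFirstInput;

  JoinKind(boolean nullsOnLeft, boolean justFirstInput) {
    this.nullsOnLeft = nullsOnLeft;
    this.justFirstInput = justFirstInput;
  }

  /** Whether unmatched right rows produce null-padded left columns. */
  boolean generatesNullsOnLeft() {
    return nullsOnLeft;
  }

  /** Whether the join outputs only the columns of its left input. */
  boolean returnsJustFirstInput() {
    return justFirstInput;
  }

  /** Output field names: left fields, plus right fields unless SEMI/ANTI. */
  List<String> outputFields(List<String> left, List<String> right) {
    List<String> out = new ArrayList<>(left);
    if (!returnsJustFirstInput()) {
      out.addAll(right);
    }
    return out;
  }
}

class JoinKindDemo {
  public static void main(String[] args) {
    List<String> emp = Arrays.asList("empno", "deptno");
    List<String> dept = Arrays.asList("deptno", "dname");
    for (JoinKind kind : JoinKind.values()) {
      // Mirrors the match condition quoted for JoinToCorrelateRule.
      boolean correlateCandidate = !kind.generatesNullsOnLeft();
      System.out.println(kind + " fields=" + kind.outputFields(emp, dept)
          + " correlateCandidate=" + correlateCandidate);
    }
  }
}
```

Under these assumptions, the condition `!generatesNullsOnLeft()` accepts INNER, LEFT, SEMI, and ANTI and rejects RIGHT and FULL, which matches the note that ANTI is now allowed.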
