Hi, @Haisheng Yuan, @Julian Hyde, @Stamatis Zampetakis, @Walaa Eldin Moustafa
I have did the work for this discussion, and look forward to your suggestions. ### Diff - Deprecate SemiJoin, EquiJoin, EnumerableSemiJoin, SemiJoinType, EnumerableSemiJoinRule, EnumerableThetaJoin - Make EnumerableMergeJoin extends Join instead of EquiJoin - Add SEMI and ANTI join type to JoinRelType, add method returnsJustFirstInput() to decide if the join only outputs left side - Correlate use JoinRelType instead of SemiJoinType - Rename EnumerableCorrelate to EnumerableNestedLoopJoin and make it exptends Join instead of Correlate - Rename EnumerableJoin to EnumerableHashJoin - EnumerableJoinRule will convert semi-join to EnumerableNestedLoopJoin (EnumerableSemiJoin's function is merged into this rule) - Add method isNonCorrelateSemiJoin() in Join.java to make sure if this join is a semi-join (Comes from SemiJoinRule) or comes from decorrelation(SubqueryRemoveRule or RelDecorrelator), the returns value true means the join is a semi-join equivalent to SemiJoin before this patch. - Cache the JoinInfo in Join and use it to get leftKeys and rightKeys, merge the SemiJoin#computeSelfCost to Join#computeSelfCost - RelBuilder removes SemiJoinFactory, method #semiJoin now return a LogicalJoin with JoinRelType#SEMI ### Rules tweak - JoinAddRedundantSemiJoinRule now create LogicalJoin with JoinRelType#SEMI instead of SemiJoin - JoinToCorrelateRule remove SEMI instance and change the matchs condition to !join.getJoinType().generatesNullsOnLeft() which also allowed ANTI compared before this patch. - SemiJoinRule match SEMI join specificlly ### Metadata tweak - RelMdAllPredicates, RelMdExpressionLineage: Add full rowType to getAllPredicates(Join) cause semi-join only outputs one side - RelMdColumnUniqueness, RelMdSelectivity, RelMdDistinctRowCount, RelMdSize, RelMdUniqueKeys: merge semi-join logic to join ### Test cases change - MaterializationTest#testJoinMaterialization11 now can materialize successfully, cause i allow logical SemiJoin node to match, the original matchs SemiJoin as SemiJoin.class.isAssignableFrom(), which i think is wrong cause this will only matches subClasses of SemiJoin which is only EnumerableSemiJoin before this patch. - SortRemoveRuleTest#removeSortOverEnumerableCorrelate, because CALCITE-2018, the final EnumerableSort's cost was cache by the previous EnumerableSort with logical childs, so i remove the EnumerableSortRule and the best plan is correct - sub-query.iq has better plan for null correlate Best, Danny Chan 在 2019年3月21日 +0800 AM3:07,Julian Hyde <[email protected]>,写道: > I just discovered that Correlate, which is neither a Join nor a SemiJoin, > uses SemiJoinType, but SemiJoin does not use SemiJoinType. > > Yuck. The Join/SemiJoin/Correlate type hierarchy needs some thought. > > Julian > >
