Hi, @Haisheng Yuan, @Julian Hyde, @Stamatis Zampetakis, @Walaa Eldin Moustafa

I have done the work for this discussion and look forward to your suggestions.


### Diff
- Deprecate SemiJoin, EquiJoin, EnumerableSemiJoin, SemiJoinType, 
EnumerableSemiJoinRule, EnumerableThetaJoin
- Make EnumerableMergeJoin extend Join instead of EquiJoin
- Add SEMI and ANTI join types to JoinRelType, and add method
returnsJustFirstInput() to decide whether the join only outputs the left side
- Correlate now uses JoinRelType instead of SemiJoinType
- Rename EnumerableCorrelate to EnumerableNestedLoopJoin and make it extend
Join instead of Correlate
- Rename EnumerableJoin to EnumerableHashJoin
- EnumerableJoinRule will convert semi-join to EnumerableNestedLoopJoin 
(EnumerableSemiJoin's function is merged into this rule)
- Add method isNonCorrelateSemiJoin() in Join.java to tell whether a join is
a true semi-join (produced by SemiJoinRule) or comes from
decorrelation (SubqueryRemoveRule or RelDecorrelator); a return value of true
means the join is a semi-join equivalent to SemiJoin before this patch.
- Cache the JoinInfo in Join and use it to get leftKeys and rightKeys; merge
SemiJoin#computeSelfCost into Join#computeSelfCost
- RelBuilder removes SemiJoinFactory; method #semiJoin now returns a LogicalJoin
with JoinRelType#SEMI
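
To illustrate the SEMI/ANTI semantics described above, here is a minimal, self-contained sketch. It does not use Calcite classes: `SketchJoinType`, `SemiAntiSketch`, `outputFieldCount` and `nestedLoop` are hypothetical names made up for this example, and only `returnsJustFirstInput()` mirrors a method named in the list. It assumes that SEMI keeps each left row that has at least one match and ANTI keeps each left row that has none.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical stand-in for JoinRelType; NOT Calcite's actual enum.
enum SketchJoinType {
  INNER, LEFT, RIGHT, FULL, SEMI, ANTI;

  /** Assumed semantics: SEMI and ANTI output only the left input's columns. */
  boolean returnsJustFirstInput() {
    return this == SEMI || this == ANTI;
  }
}

// Hypothetical helper class, for illustration only.
final class SemiAntiSketch {
  private SemiAntiSketch() {}

  /** Output width: left columns only for SEMI/ANTI, otherwise both sides. */
  static int outputFieldCount(SketchJoinType type, int leftCount, int rightCount) {
    return type.returnsJustFirstInput() ? leftCount : leftCount + rightCount;
  }

  /** Naive nested-loop evaluation of a SEMI or ANTI join; emits left rows only. */
  static <L, R> List<L> nestedLoop(SketchJoinType type, List<L> left,
      List<R> right, BiPredicate<L, R> condition) {
    if (!type.returnsJustFirstInput()) {
      throw new IllegalArgumentException("sketch handles only SEMI and ANTI");
    }
    List<L> out = new ArrayList<>();
    for (L l : left) {
      boolean matched = false;
      for (R r : right) {
        if (condition.test(l, r)) {
          matched = true;
          break; // one match is enough; a left row is never duplicated
        }
      }
      // SEMI keeps matched rows, ANTI keeps unmatched rows.
      if (matched == (type == SketchJoinType.SEMI)) {
        out.add(l);
      }
    }
    return out;
  }
}
```

For example, with left rows 1, 2, 3 and right rows 2, 3, 3, 4 joined on equality, the SEMI variant keeps 2 and 3 (each once), and the ANTI variant keeps 1.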

### Rule tweaks
- JoinAddRedundantSemiJoinRule now creates a LogicalJoin with JoinRelType#SEMI
instead of a SemiJoin
- JoinToCorrelateRule removes the SEMI instance and changes the match condition
to !join.getJoinType().generatesNullsOnLeft(), which, unlike before this patch,
also allows ANTI.
- SemiJoinRule matches SEMI joins specifically
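
As a rough illustration of the new JoinToCorrelateRule match condition, the sketch below enumerates which join types pass `!generatesNullsOnLeft()`. `RuleJoinType` and `CorrelateMatchSketch` are hypothetical stand-ins, not Calcite classes, and the sketch assumes that only RIGHT and FULL joins generate nulls on the left side.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-in for JoinRelType; NOT Calcite's actual enum.
enum RuleJoinType {
  INNER, LEFT, RIGHT, FULL, SEMI, ANTI;

  /** Assumption: only RIGHT and FULL can pad the left side with nulls. */
  boolean generatesNullsOnLeft() {
    return this == RIGHT || this == FULL;
  }
}

// Hypothetical helper class, for illustration only.
final class CorrelateMatchSketch {
  private CorrelateMatchSketch() {}

  /** Mirrors the match condition quoted in the list above. */
  static boolean matches(RuleJoinType type) {
    return !type.generatesNullsOnLeft();
  }

  /** Collects every join type that the condition accepts. */
  static Set<RuleJoinType> matchingTypes() {
    EnumSet<RuleJoinType> result = EnumSet.noneOf(RuleJoinType.class);
    for (RuleJoinType type : RuleJoinType.values()) {
      if (matches(type)) {
        result.add(type);
      }
    }
    return result;
  }
}
```

Under that assumption the condition accepts INNER, LEFT, SEMI and ANTI, and rejects RIGHT and FULL.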

### Metadata tweaks
- RelMdAllPredicates, RelMdExpressionLineage: Add the full rowType to
getAllPredicates(Join), because a semi-join only outputs one side
- RelMdColumnUniqueness, RelMdSelectivity, RelMdDistinctRowCount, RelMdSize, 
RelMdUniqueKeys: merge semi-join logic to join


### Test case changes
- MaterializationTest#testJoinMaterialization11 can now materialize
successfully, because I allow the logical SemiJoin node to match. The original
code matched SemiJoin via SemiJoin.class.isAssignableFrom(), which I think is
wrong because it only matches subclasses of SemiJoin, and the only one before
this patch was EnumerableSemiJoin.
- SortRemoveRuleTest#removeSortOverEnumerableCorrelate: because of CALCITE-2018,
the final EnumerableSort's cost was cached by the previous EnumerableSort with
logical children, so I removed the EnumerableSortRule and the best plan is now
correct
- sub-query.iq now has a better plan for null correlate



Best,
Danny Chan
On March 21, 2019, at 3:07 AM +0800, Julian Hyde <[email protected]> wrote:
> I just discovered that Correlate, which is neither a Join nor a SemiJoin, 
> uses SemiJoinType, but SemiJoin does not use SemiJoinType.
>
> Yuck. The Join/SemiJoin/Correlate type hierarchy needs some thought.
>
> Julian
>
>
