Agreed, but not sure what would the best way to do it be without making the code very confusing.
On Sat, Apr 13, 2019 at 2:46 PM Haisheng Yuan <[email protected]> wrote: > > I share the same concern with you. > > > > > > Thanks~ > Haisheng > Yuan------------------------------------------------------------------ > 发件人:Stamatis Zampetakis<[email protected]> > 日 期:2019年04月14日 05:37:29 > 收件人:<[email protected]> > 主 题:Re: Join, SemiJoin, Correlate > > Hi Danny, > > Thanks a lot for taking this on, it is a great start! > > I didn't look thoroughly through the PR but I noticed that there are many > renaming/refactoring of public APIs. I am not sure if we should introduce > so many breaking changes without prior notice. A most conservative approach > would be to keep existing classes/methods, mark them as deprecated, and > then remove them in one of the coming releases. I am not sure if that is > the right way to go so let's see what the others have to say. > > Best, > Stamatis > > On Fri, Apr 12, 2019 at 9:18 AM Yuzhao Chen <[email protected]> wrote: > > > Hi, @Haisheng Yuan, @Julian Hyde, @Stamatis Zampetakis, > > @Walaa Eldin Moustafa > > > > I have did the work for this discussion, and look forward to your > > suggestions. > > > > > > ### Diff > > - Deprecate SemiJoin, EquiJoin, EnumerableSemiJoin, SemiJoinType, > > EnumerableSemiJoinRule, EnumerableThetaJoin > > - Make EnumerableMergeJoin extends Join instead of EquiJoin > > - Add SEMI and ANTI join type to JoinRelType, add method > > returnsJustFirstInput() to decide if the join only outputs left side > > - Correlate use JoinRelType instead of SemiJoinType > > - Rename EnumerableCorrelate to EnumerableNestedLoopJoin and make it > > exptends Join instead of Correlate > > - Rename EnumerableJoin to EnumerableHashJoin > > - EnumerableJoinRule will convert semi-join to EnumerableNestedLoopJoin > > (EnumerableSemiJoin's function is merged into this rule) > > - Add method isNonCorrelateSemiJoin() in Join.java to make sure if this > > join is a semi-join (Comes from SemiJoinRule) or comes from > > decorrelation(SubqueryRemoveRule or RelDecorrelator), the returns value > > true means the join is a semi-join equivalent to SemiJoin before this patch. > > - Cache the JoinInfo in Join and use it to get leftKeys and rightKeys, > > merge the SemiJoin#computeSelfCost to Join#computeSelfCost > > - RelBuilder removes SemiJoinFactory, method #semiJoin now return a > > LogicalJoin with JoinRelType#SEMI > > > > ### Rules tweak > > - JoinAddRedundantSemiJoinRule now create LogicalJoin with > > JoinRelType#SEMI instead of SemiJoin > > - JoinToCorrelateRule remove SEMI instance and change the matchs condition > > to !join.getJoinType().generatesNullsOnLeft() which also allowed ANTI > > compared before this patch. > > - SemiJoinRule match SEMI join specificlly > > > > ### Metadata tweak > > - RelMdAllPredicates, RelMdExpressionLineage: Add full rowType to > > getAllPredicates(Join) cause semi-join only outputs one side > > - RelMdColumnUniqueness, RelMdSelectivity, RelMdDistinctRowCount, > > RelMdSize, RelMdUniqueKeys: merge semi-join logic to join > > > > > > ### Test cases change > > - MaterializationTest#testJoinMaterialization11 now can materialize > > successfully, cause i allow logical SemiJoin node to match, the original > > matchs SemiJoin as SemiJoin.class.isAssignableFrom(), which i think is > > wrong cause this will only matches subClasses of SemiJoin which is only > > EnumerableSemiJoin before this patch. > > - SortRemoveRuleTest#removeSortOverEnumerableCorrelate, because > > CALCITE-2018, the final EnumerableSort's cost was cache by the previous > > EnumerableSort with logical childs, so i remove the EnumerableSortRule and > > the best plan is correct > > - sub-query.iq has better plan for null correlate > > > > > > > > Best, > > Danny Chan > > 在 2019年3月21日 +0800 AM3:07,Julian Hyde <[email protected]>,写道: > > > I just discovered that Correlate, which is neither a Join nor a > > SemiJoin, uses SemiJoinType, but SemiJoin does not use SemiJoinType. > > > > > > Yuck. The Join/SemiJoin/Correlate type hierarchy needs some thought. > > > > > > Julian > > > > > > > > >
