GitHub user zellerh commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1530#discussion_r182911028
  
    --- Diff: core/sql/optimizer/NormRelExpr.cpp ---
    @@ -2767,25 +2767,26 @@ Here t2.a is a unique key of table t2.
     The following transformation is made
              Semi Join {pred : t1.b = t2.a}          Join {pred : t1.b = t2.a} 
             /         \                   ------->  /    \
    -      /             \                         /        \
    -Scan t1     Scan t2                 Scan t1     Scan t2
    +       /           \                           /      \
    + Scan t1        Scan t2                   Scan t1     Scan t2
                                                     
     
                                                
     b) If the right child is not unique in the joining column then 
     we transform the semijoin into an inner join followed by a groupby
     as the join's right child. This transformation is enabled by default
    -only if the right side is an IN list, otherwise a CQD has to be used.
    +only if the right side is an IN list or if the groupby's reduction 
    +ratio is greater than 5.0, otherwise a CQD has to be used.
     
     select t1.a
     from t1
     where t1.b in (1,2,3,4,...,101) ;
     
     
    -  Semi Join {pred : t1.b = t2.a}          Join {pred : t1.b = InList.col} 
    +  Semi Join {pred : t1.b = InList.col}  Join {pred : t1.b = InList.col}
      /         \                   ------->  /    \
     /           \                           /      \
    -Scan t1     Scan t2                 Scan t1     GroupBy {group cols: InList.col}
    +Scan t1   TupleList                 Scan t1   GroupBy {group cols: InList.col}
                                                       |
    --- End diff --
    
    Nice to make the picture consistent, but from the code it looks like we do
    this for things other than TupleList, so maybe "Scan t2" or "Q2" would be a
    better name for the child?
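
    As a rough illustration of why case b) in the comment needs the groupby
    when the right child is not unique, here is a standalone sketch. It is not
    Trafodion code: the names semiJoin, innerJoin and joinOverGroupBy are
    hypothetical, and each table is reduced to a vector of join-column values.

```cpp
#include <set>
#include <vector>

// Hypothetical helpers for illustration only; not part of NormRelExpr.cpp.

// Semi join: keep each left value that has at least one match on the right.
std::vector<int> semiJoin(const std::vector<int>& left,
                          const std::vector<int>& right) {
  std::vector<int> out;
  for (int l : left) {
    for (int r : right) {
      if (l == r) {
        out.push_back(l);  // emit at most once per left row
        break;
      }
    }
  }
  return out;
}

// Plain inner join: duplicates on the right multiply the matching left rows.
std::vector<int> innerJoin(const std::vector<int>& left,
                           const std::vector<int>& right) {
  std::vector<int> out;
  for (int l : left)
    for (int r : right)
      if (l == r) out.push_back(l);
  return out;
}

// Inner join whose right child is first grouped on the join column,
// so every right value appears exactly once.
std::vector<int> joinOverGroupBy(const std::vector<int>& left,
                                 const std::vector<int>& right) {
  std::set<int> groups(right.begin(), right.end());  // stands in for the GroupBy
  return innerJoin(left, std::vector<int>(groups.begin(), groups.end()));
}
```

    With duplicate values on the right side, innerJoin returns extra rows,
    while joinOverGroupBy returns the same rows as semiJoin. The sketch does
    not depend on whether the right child is a TupleList or a scan.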

