Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r161185413
  
    --- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/AggPruleBase.java
 ---
    @@ -82,4 +83,19 @@ protected boolean create2PhasePlan(RelOptRuleCall call, 
DrillAggregateRel aggreg
         }
         return true;
       }
    +
    +  /**
    +   * Returns group-by keys with the remapped arguments for specified 
aggregate.
    +   *
    +   * @param groupSet ImmutableBitSet of aggregate rel node, whose group-by 
keys should be remapped.
    +   * @return {@link ImmutableBitSet} instance with remapped keys.
    +   */
    +  public static ImmutableBitSet remapGroupSet(ImmutableBitSet groupSet) {
    --- End diff --
    
    After the changes in CALCITE-1930 in the class 
`AggregateExpandDistinctAggregatesRule`, this rule started applying more 
correctly, since in the older version there were checks like this:
    ```
    aggCall.getAggregation() instanceof SqlCountAggFunction
    ```
    but they were replaced by checks like this:
    ```
    final SqlKind aggCallKind = aggCall.getAggregation().getKind();
    switch (aggCallKind)
    ```
    So for the cases when instead of Calcite s `SqlCountAggFunction` were used 
`DrillCalciteSqlAggFunctionWrapper`s this rule changed its behaviour and 
instead of returning joins of distinct and non-distinct aggregate rel nodes, it 
returns distinct aggregate which has an input with non-distinct aggregates 
grouped by column, which is distinct in outer aggregate.
    
    These Drill rules were able to work correctly only for the cases when 
aggregate rel node does not contain ` aggCalls` and contains only the single 
field in `rowType` (in this case `groupSet` is always `{0}`, and it still 
correct for outer aggregates which are created in `*AggPrule`s)
    
    With the new version of `AggregateExpandDistinctAggregatesRule` these Drill 
rules received aggregate rel nodes with the non-empty lists of ` aggCalls`, 
therefore its `rowType` contains more than one field. But before my changes, 
the same `groupSet` was passed to the constructor of outer aggregate and row 
type of aggregate differed from the row type of its input. So it was incorrect 
`groupSet`. 
    Aggregate rel nodes always specify the group by columns in the first 
positions of the list, so correct `groupSet` for outer aggregate should be 
`ImmutableBitSet` with the same size as the `groupSet` of nested aggregate, but 
the ordinals of columns should start from 0. 
    
    As for the point of iterating through the group set, 
`ImmutableBitSet.size()`, `ImmutableBitSet.length()` and 
`ImmutableBitSet.cardinality()` does not return desired "size" of the 
`groupSet`. `AggregateExpandDistinctAggregatesRule` also contains similar code 
which iterating through the `groupSet` for similar purposes.


---

Reply via email to