Re: Review Request 69202: HIVE-20804 Further improvements to group by optimization with constraints

Jesús Camacho Rodríguez Thu, 01 Nov 2018 13:33:06 -0700

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69202/#review210258
-----------------------------------------------------------





ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOptUtil.java
Lines 774 (patched)
<https://reviews.apache.org/r/69202/#comment294900>

    Cardinality and NDV may change. This just checks whether column 
origin/lineage can be backtrack to single input. We should use 
RelMdColumnOrigins or RelMdExpressionLineage instead of writing this new class.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOptUtil.java
Lines 809 (patched)
<https://reviews.apache.org/r/69202/#comment294899>

    You can use _shift_ method in ImmutableBitSet for these cases.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOptUtil.java
Lines 823 (patched)
<https://reviews.apache.org/r/69202/#comment294904>

    Not used?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOptUtil.java
Lines 833 (patched)
<https://reviews.apache.org/r/69202/#comment294894>

    colSet.contains(ImmutableBitSet.of(i)) -> colSet.get(i)



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOptUtil.java
Lines 834 (patched)
<https://reviews.apache.org/r/69202/#comment294896>

    This will create a copy in every iteration. You could create a 
ImmutableBitSet.Builder from colset outside of the loop and then clear it here.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
Lines 321 (patched)
<https://reviews.apache.org/r/69202/#comment294901>

    Method name should change, tbh it is not about cardinality, but about 
whether the rows can continue being identified uniquely.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
Lines 346 (patched)
<https://reviews.apache.org/r/69202/#comment294902>

    This should never be empty?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
Lines 359 (patched)
<https://reviews.apache.org/r/69202/#comment294903>

    This logic does not seem correct. What happens when a Project permutes 
input references and the GBy columns would not be in same order as the key? For 
instance consider GBy key is (0,1,2) which maps into columns (2,0,4) in TS. Key 
is (0,4). You would remove column 1 from GBy, but you should remove column 0. 
Using ColumnOrigins or ExpressionLineage providers to get the mapping would 
solve the issue.


- Jesús Camacho Rodríguez


On Oct. 29, 2018, 6:45 p.m., Vineet Garg wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69202/
> -----------------------------------------------------------
> 
> (Updated Oct. 29, 2018, 6:45 p.m.)
> 
> 
> Review request for hive and Jesús Camacho Rodríguez.
> 
> 
> Bugs: HIVE-20804
>     https://issues.apache.org/jira/browse/HIVE-20804
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> See Jira
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOptUtil.java 
> 9aa30129b6 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
>  b7c31bdfca 
>   ql/src/test/queries/clientpositive/constraints_optimization.q 70ab8509c5 
>   ql/src/test/results/clientpositive/llap/constraints_optimization.q.out 
> 96caa4d6dd 
> 
> 
> Diff: https://reviews.apache.org/r/69202/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Vineet Garg
> 
>

Re: Review Request 69202: HIVE-20804 Further improvements to group by optimization with constraints

Reply via email to