Re: Review Request 68868: HIVE-17043: Remove non unique columns from group by keys if not referenced later

2018-10-09 Thread Vineet Garg

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68868/
---

(Updated Oct. 9, 2018, 4:36 p.m.)


Review request for hive and Jesús Camacho Rodríguez.


Changes
---

Rebased patch


Bugs: HIVE-17043
https://issues.apache.org/jira/browse/HIVE-17043


Repository: hive-git


Description
---

This patch implements/test the following optimizations
* Removal of group by on primary keys
* Reduction of group by keys on primary keys
* is NOT NULL filter removal if NOT NULL constraint is defined


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties 97609cfadd 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveDefaultRelMetadataProvider.java
 635d27e723 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java 
42e60de6a8 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateJoinTransposeRule.java
 f43ef01293 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 5857f730a8 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/EstimateUniqueKeys.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdColumnUniqueness.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdRowCount.java
 1ca1937ed9 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdUniqueKeys.java
 3bf62c535c 
  ql/src/test/queries/clientpositive/constraints_optimization.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/constraints_optimization.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/llap/materialized_view_rewrite_1.q.out 
735a4db2ee 


Diff: https://reviews.apache.org/r/68868/diff/8/

Changes: https://reviews.apache.org/r/68868/diff/7-8/


Testing
---


Thanks,

Vineet Garg



Re: Review Request 68868: HIVE-17043: Remove non unique columns from group by keys if not referenced later

2018-10-07 Thread Vineet Garg

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68868/
---

(Updated Oct. 7, 2018, 10:01 p.m.)


Review request for hive and Jesús Camacho Rodríguez.


Changes
---

Rebased the patch


Bugs: HIVE-17043
https://issues.apache.org/jira/browse/HIVE-17043


Repository: hive-git


Description
---

This patch implements/test the following optimizations
* Removal of group by on primary keys
* Reduction of group by keys on primary keys
* is NOT NULL filter removal if NOT NULL constraint is defined


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties 97609cfadd 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveDefaultRelMetadataProvider.java
 635d27e723 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java 
42e60de6a8 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateJoinTransposeRule.java
 f43ef01293 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 5857f730a8 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/EstimateUniqueKeys.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdColumnUniqueness.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdRowCount.java
 1ca1937ed9 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdUniqueKeys.java
 3bf62c535c 
  ql/src/test/queries/clientpositive/constraints_optimization.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/constraints_optimization.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/llap/materialized_view_rewrite_1.q.out 
735a4db2ee 


Diff: https://reviews.apache.org/r/68868/diff/7/

Changes: https://reviews.apache.org/r/68868/diff/6-7/


Testing
---


Thanks,

Vineet Garg



Re: Review Request 68868: HIVE-17043: Remove non unique columns from group by keys if not referenced later

2018-10-06 Thread Vineet Garg

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68868/
---

(Updated Oct. 7, 2018, 4:14 a.m.)


Review request for hive and Jesús Camacho Rodríguez.


Changes
---

Addressed review comments


Bugs: HIVE-17043
https://issues.apache.org/jira/browse/HIVE-17043


Repository: hive-git


Description
---

This patch implements/test the following optimizations
* Removal of group by on primary keys
* Reduction of group by keys on primary keys
* is NOT NULL filter removal if NOT NULL constraint is defined


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties 9984ce5eed 
  
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
 66280b2da1 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveDefaultRelMetadataProvider.java
 635d27e723 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java 
42e60de6a8 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateJoinTransposeRule.java
 f43ef01293 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 5857f730a8 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/EstimateUniqueKeys.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdColumnUniqueness.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdRowCount.java
 1ca1937ed9 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdUniqueKeys.java
 3bf62c535c 
  
ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestVectorizedOrcAcidRowBatchReader.java
 0a499b1a1b 
  ql/src/test/queries/clientpositive/constraints_optimization.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/constraints_optimization.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/llap/materialized_view_rewrite_1.q.out 
735a4db2ee 


Diff: https://reviews.apache.org/r/68868/diff/6/

Changes: https://reviews.apache.org/r/68868/diff/5-6/


Testing
---


Thanks,

Vineet Garg



Re: Review Request 68868: HIVE-17043: Remove non unique columns from group by keys if not referenced later

2018-10-06 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68868/#review209289
---




ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java
Line 102 (original), 100 (patched)


I had not realized this field was final. It makes sense so object is 
ummutable and field cannot be changed after instantiation. Hence the pair 
option for generateKeys makes more sense (Calcite has a pair class, it should 
be quite straightforward).



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/EstimateUniqueKeys.java
Lines 190 (patched)


Can we change this as we mentioned (first pick up best, otherwise pick up 
original?)



ql/src/test/results/clientpositive/llap/constraints_optimization.q.out
Lines 813 (patched)


Can we add a query with filter on the key:

explain select key1 from dest_g21 where key1 > 1 group by key1, value1;

Query should be optimized since column remains unique and not null.


- Jesús Camacho Rodríguez


On Oct. 6, 2018, 10:18 p.m., Vineet Garg wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68868/
> ---
> 
> (Updated Oct. 6, 2018, 10:18 p.m.)
> 
> 
> Review request for hive and Jesús Camacho Rodríguez.
> 
> 
> Bugs: HIVE-17043
> https://issues.apache.org/jira/browse/HIVE-17043
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> This patch implements/test the following optimizations
> * Removal of group by on primary keys
> * Reduction of group by keys on primary keys
> * is NOT NULL filter removal if NOT NULL constraint is defined
> 
> 
> Diffs
> -
> 
>   itests/src/test/resources/testconfiguration.properties 9984ce5eed 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveDefaultRelMetadataProvider.java
>  635d27e723 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java 
> 42e60de6a8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateJoinTransposeRule.java
>  f43ef01293 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
>  5857f730a8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/EstimateUniqueKeys.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdColumnUniqueness.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdRowCount.java
>  1ca1937ed9 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdUniqueKeys.java
>  3bf62c535c 
>   ql/src/test/queries/clientpositive/constraints_optimization.q PRE-CREATION 
>   ql/src/test/results/clientpositive/llap/constraints_optimization.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/llap/materialized_view_rewrite_1.q.out 
> 735a4db2ee 
> 
> 
> Diff: https://reviews.apache.org/r/68868/diff/5/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vineet Garg
> 
>



Re: Review Request 68868: HIVE-17043: Remove non unique columns from group by keys if not referenced later

2018-10-06 Thread Vineet Garg

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68868/
---

(Updated Oct. 6, 2018, 10:18 p.m.)


Review request for hive and Jesús Camacho Rodríguez.


Changes
---

Codestyle violation fixes


Bugs: HIVE-17043
https://issues.apache.org/jira/browse/HIVE-17043


Repository: hive-git


Description
---

This patch implements/test the following optimizations
* Removal of group by on primary keys
* Reduction of group by keys on primary keys
* is NOT NULL filter removal if NOT NULL constraint is defined


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties 9984ce5eed 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveDefaultRelMetadataProvider.java
 635d27e723 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java 
42e60de6a8 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateJoinTransposeRule.java
 f43ef01293 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 5857f730a8 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/EstimateUniqueKeys.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdColumnUniqueness.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdRowCount.java
 1ca1937ed9 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdUniqueKeys.java
 3bf62c535c 
  ql/src/test/queries/clientpositive/constraints_optimization.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/constraints_optimization.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/llap/materialized_view_rewrite_1.q.out 
735a4db2ee 


Diff: https://reviews.apache.org/r/68868/diff/5/

Changes: https://reviews.apache.org/r/68868/diff/4-5/


Testing
---


Thanks,

Vineet Garg



Re: Review Request 68868: HIVE-17043: Remove non unique columns from group by keys if not referenced later

2018-10-05 Thread Vineet Garg

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68868/
---

(Updated Oct. 5, 2018, 5:31 p.m.)


Review request for hive and Jesús Camacho Rodríguez.


Bugs: HIVE-17043
https://issues.apache.org/jira/browse/HIVE-17043


Repository: hive-git


Description
---

This patch implements/test the following optimizations
* Removal of group by on primary keys
* Reduction of group by keys on primary keys
* is NOT NULL filter removal if NOT NULL constraint is defined


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties d444c99d7b 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveDefaultRelMetadataProvider.java
 635d27e723 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java 
42e60de6a8 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateJoinTransposeRule.java
 f43ef01293 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 5857f730a8 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/EstimateUniqueKeys.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdColumnUniqueness.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdRowCount.java
 1ca1937ed9 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdUniqueKeys.java
 3bf62c535c 
  ql/src/test/queries/clientpositive/constraints_optimization.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/constraints_optimization.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/llap/materialized_view_rewrite_1.q.out 
735a4db2ee 


Diff: https://reviews.apache.org/r/68868/diff/4/

Changes: https://reviews.apache.org/r/68868/diff/3-4/


Testing
---


Thanks,

Vineet Garg



Re: Review Request 68868: HIVE-17043: Remove non unique columns from group by keys if not referenced later

2018-10-03 Thread Vineet Garg

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68868/
---

(Updated Oct. 3, 2018, 7:34 p.m.)


Review request for hive and Jesús Camacho Rodríguez.


Bugs: HIVE-17043
https://issues.apache.org/jira/browse/HIVE-17043


Repository: hive-git


Description
---

This patch implements/test the following optimizations
* Removal of group by on primary keys
* Reduction of group by keys on primary keys
* is NOT NULL filter removal if NOT NULL constraint is defined


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties fdd8ecc77c 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveDefaultRelMetadataProvider.java
 635d27e723 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java 
42e60de6a8 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateJoinTransposeRule.java
 f43ef01293 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 5857f730a8 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/EstimateUniqueKeys.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdColumnUniqueness.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdRowCount.java
 1ca1937ed9 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdUniqueKeys.java
 3bf62c535c 
  ql/src/test/queries/clientpositive/constraints_optimization.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/constraints_optimization.q.out 
PRE-CREATION 


Diff: https://reviews.apache.org/r/68868/diff/3/

Changes: https://reviews.apache.org/r/68868/diff/2-3/


Testing
---


Thanks,

Vineet Garg



Re: Review Request 68868: HIVE-17043: Remove non unique columns from group by keys if not referenced later

2018-10-02 Thread Jesús Camacho Rodríguez


> On Oct. 1, 2018, 8:07 p.m., Jesús Camacho Rodríguez wrote:
> >

I will take a look at the q files today, there were several plans that I did 
not have time to check.


> On Oct. 1, 2018, 8:07 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/EstimateUniqueKeys.java
> > Lines 288 (patched)
> > 
> >
> > We may have Volcano node too, we need to include it.
> 
> Vineet Garg wrote:
> How do I include volcano nodes? Can you point me to an example? Tried 
> searching for it but couldn't figure out.

The volcano nodes are RelSubset.


- Jesús


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68868/#review209136
---


On Oct. 3, 2018, 12:44 a.m., Vineet Garg wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68868/
> ---
> 
> (Updated Oct. 3, 2018, 12:44 a.m.)
> 
> 
> Review request for hive and Jesús Camacho Rodríguez.
> 
> 
> Bugs: HIVE-17043
> https://issues.apache.org/jira/browse/HIVE-17043
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> This patch implements/test the following optimizations
> * Removal of group by on primary keys
> * Reduction of group by keys on primary keys
> * is NOT NULL filter removal if NOT NULL constraint is defined
> 
> 
> Diffs
> -
> 
>   itests/src/test/resources/testconfiguration.properties fdd8ecc77c 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveDefaultRelMetadataProvider.java
>  635d27e723 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java 
> 42e60de6a8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateJoinTransposeRule.java
>  f43ef01293 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
>  5857f730a8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/EstimateUniqueKeys.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdColumnUniqueness.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdRowCount.java
>  1ca1937ed9 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdUniqueKeys.java
>  3bf62c535c 
>   ql/src/test/queries/clientpositive/constraints_optimization.q PRE-CREATION 
>   ql/src/test/results/clientpositive/llap/constraints_optimization.q.out 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68868/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vineet Garg
> 
>



Re: Review Request 68868: HIVE-17043: Remove non unique columns from group by keys if not referenced later

2018-10-02 Thread Vineet Garg


> On Oct. 1, 2018, 8:07 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java
> > Lines 223 (patched)
> > 
> >
> > Should it be key.contains(columns) ?

You are correct, good eye.


> On Oct. 1, 2018, 8:07 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java
> > Line 285 (original), 311 (patched)
> > 
> >
> > Can we either return a Pair, or make it void and that it sets both 
> > _keys_ and _nonNullablekeys_? Currently it is a bit weird that one of them 
> > is set via return of the method, and the second one from the method.

Made it void, i thought it that was more simpler.


> On Oct. 1, 2018, 8:07 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
> > Line 363 (original), 414 (patched)
> > 
> >
> > return...

Not sure what do you mean by that?


> On Oct. 1, 2018, 8:07 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/EstimateUniqueKeys.java
> > Lines 53 (patched)
> > 
> >
> > This should be a new metadata provider or a modification of the 
> > existing _RelMdUniqueKeys_ but introducing a new boolean parameter 
> > _acceptEstimatedResults_. However, this would need changes in Calcite side. 
> > Please, leave a TODO comment though.

Left a TODO comment


> On Oct. 1, 2018, 8:07 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/EstimateUniqueKeys.java
> > Lines 288 (patched)
> > 
> >
> > We may have Volcano node too, we need to include it.

How do I include volcano nodes? Can you point me to an example? Tried searching 
for it but couldn't figure out.


> On Oct. 1, 2018, 8:07 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdRowCount.java
> > Line 320 (original), 320 (patched)
> > 
> >
> > We do not use metadata providers anymore for unique keys estimation, 
> > which means some features will be disabled, e.g., caching of metadata 
> > results. I am not sure whether this may cause an increase in compilation 
> > time for several queries, specially those containing many joins, but it 
> > would be worth leaving a comment and maybe monitoring it in future.

Good point, left a note


- Vineet


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68868/#review209136
---


On Oct. 3, 2018, 12:44 a.m., Vineet Garg wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68868/
> ---
> 
> (Updated Oct. 3, 2018, 12:44 a.m.)
> 
> 
> Review request for hive and Jesús Camacho Rodríguez.
> 
> 
> Bugs: HIVE-17043
> https://issues.apache.org/jira/browse/HIVE-17043
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> This patch implements/test the following optimizations
> * Removal of group by on primary keys
> * Reduction of group by keys on primary keys
> * is NOT NULL filter removal if NOT NULL constraint is defined
> 
> 
> Diffs
> -
> 
>   itests/src/test/resources/testconfiguration.properties fdd8ecc77c 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveDefaultRelMetadataProvider.java
>  635d27e723 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java 
> 42e60de6a8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateJoinTransposeRule.java
>  f43ef01293 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
>  5857f730a8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/EstimateUniqueKeys.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdColumnUniqueness.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdRowCount.java
>  1ca1937ed9 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdUniqueKeys.java
>  3bf62c535c 
>   ql/src/test/queries/clientpositive/constraints_optimization.q PRE-CREATION 
>   ql/src/test/results/clientpositive/llap/constraints_optimization.q.out 
> PRE-CREATION 
> 

Re: Review Request 68868: HIVE-17043: Remove non unique columns from group by keys if not referenced later

2018-10-02 Thread Vineet Garg

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68868/
---

(Updated Oct. 3, 2018, 12:44 a.m.)


Review request for hive and Jesús Camacho Rodríguez.


Bugs: HIVE-17043
https://issues.apache.org/jira/browse/HIVE-17043


Repository: hive-git


Description
---

This patch implements/test the following optimizations
* Removal of group by on primary keys
* Reduction of group by keys on primary keys
* is NOT NULL filter removal if NOT NULL constraint is defined


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties fdd8ecc77c 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveDefaultRelMetadataProvider.java
 635d27e723 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java 
42e60de6a8 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateJoinTransposeRule.java
 f43ef01293 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 5857f730a8 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/EstimateUniqueKeys.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdColumnUniqueness.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdRowCount.java
 1ca1937ed9 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdUniqueKeys.java
 3bf62c535c 
  ql/src/test/queries/clientpositive/constraints_optimization.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/constraints_optimization.q.out 
PRE-CREATION 


Diff: https://reviews.apache.org/r/68868/diff/2/

Changes: https://reviews.apache.org/r/68868/diff/1-2/


Testing
---


Thanks,

Vineet Garg



Re: Review Request 68868: HIVE-17043: Remove non unique columns from group by keys if not referenced later

2018-10-01 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68868/#review209136
---




ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java
Lines 223 (patched)


Should it be key.contains(columns) ?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java
Lines 293 (patched)


Can we add a comment here? 'They should all be nullable'



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java
Line 285 (original), 311 (patched)


Can we either return a Pair, or make it void and that it sets both _keys_ 
and _nonNullablekeys_? Currently it is a bit weird that one of them is set via 
return of the method, and the second one from the method.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
Line 363 (original), 414 (patched)


return...



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/EstimateUniqueKeys.java
Lines 53 (patched)


This should be a new metadata provider or a modification of the existing 
_RelMdUniqueKeys_ but introducing a new boolean parameter 
_acceptEstimatedResults_. However, this would need changes in Calcite side. 
Please, leave a TODO comment though.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/EstimateUniqueKeys.java
Lines 59 (patched)


You can make all these methods private and leave a single point of entry to 
the method (the one that has a RelNode parameter).



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/EstimateUniqueKeys.java
Lines 288 (patched)


We may have Volcano node too, we need to include it.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdRowCount.java
Line 320 (original), 320 (patched)


We do not use metadata providers anymore for unique keys estimation, which 
means some features will be disabled, e.g., caching of metadata results. I am 
not sure whether this may cause an increase in compilation time for several 
queries, specially those containing many joins, but it would be worth leaving a 
comment and maybe monitoring it in future.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdRowCount.java
Lines 331 (patched)


Why is this neeeded now? _getUniqueKeys_ did not have a handler for 
HepRelVertex, hence I would expect this would have returned null keys in any 
case. Also note that we may use other planners now, hence we may have other 
special nodes such as the Volcano nodes.


- Jesús Camacho Rodríguez


On Sept. 28, 2018, 1:23 a.m., Vineet Garg wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68868/
> ---
> 
> (Updated Sept. 28, 2018, 1:23 a.m.)
> 
> 
> Review request for hive and Jesús Camacho Rodríguez.
> 
> 
> Bugs: HIVE-17043
> https://issues.apache.org/jira/browse/HIVE-17043
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> This patch implements/test the following optimizations
> * Removal of group by on primary keys
> * Reduction of group by keys on primary keys
> * is NOT NULL filter removal if NOT NULL constraint is defined
> 
> 
> Diffs
> -
> 
>   itests/src/test/resources/testconfiguration.properties def356176b 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveDefaultRelMetadataProvider.java
>  635d27e723 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java 
> 42e60de6a8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateJoinTransposeRule.java
>  f43ef01293 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
>  5857f730a8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/EstimateUniqueKeys.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdColumnUniqueness.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdRowCount.java
>  1ca1937ed9 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdUniqueKeys.java
>  3bf62c535c 
>   ql/src/test/queries/clientpositive/constraints_optimization.q PRE-CREATION 
>   

Review Request 68868: HIVE-17043: Remove non unique columns from group by keys if not referenced later

2018-09-27 Thread Vineet Garg

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68868/
---

Review request for hive and Jesús Camacho Rodríguez.


Bugs: HIVE-17043
https://issues.apache.org/jira/browse/HIVE-17043


Repository: hive-git


Description
---

This patch implements/test the following optimizations
* Removal of group by on primary keys
* Reduction of group by keys on primary keys
* is NOT NULL filter removal if NOT NULL constraint is defined


Diffs
-

  itests/src/test/resources/testconfiguration.properties def356176b 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveDefaultRelMetadataProvider.java
 635d27e723 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java 
42e60de6a8 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateJoinTransposeRule.java
 f43ef01293 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java
 5857f730a8 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/EstimateUniqueKeys.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdColumnUniqueness.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdRowCount.java
 1ca1937ed9 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdUniqueKeys.java
 3bf62c535c 
  ql/src/test/queries/clientpositive/constraints_optimization.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/constraints_optimization.q.out 
PRE-CREATION 


Diff: https://reviews.apache.org/r/68868/diff/1/


Testing
---


Thanks,

Vineet Garg