Re: Review Request 72200: TopN Key efficiency check might disable filter too soon

2020-03-06 Thread Attila Magyar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72200/
---

(Updated March 6, 2020, 12:44 p.m.)


Review request for hive, Gopal V, Jesús Camacho Rodríguez, Krisztian Kasa, and 
Rajesh Balamohan.


Bugs: HIVE-22982
https://issues.apache.org/jira/browse/HIVE-22982


Repository: hive-git


Description
---

The check is triggered after every n batches, but there can be multiple filters, 
one for each partition. Some filters might have less data than the others.
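
To illustrate the problem described above, here is a minimal sketch of an
efficiency check that only considers filters which have seen enough rows. It
is not the code from this patch: the class PerPartitionCheckSketch, its
KeyFilter helper, and the parameters minRowsPerFilter and efficiencyThreshold
are all hypothetical names chosen for the example, and keys are simplified to
ints.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch, not Hive's implementation.
class PerPartitionCheckSketch {

    /** Keeps the n smallest distinct keys of one partition and counts its own traffic. */
    static final class KeyFilter {
        private final int topN;
        private final TreeSet<Integer> keys = new TreeSet<>();
        private long total;
        private long forwarded;

        KeyFilter(int topN) {
            this.topN = topN;
        }

        boolean canForward(int key) {
            total++;
            boolean pass;
            if (keys.contains(key)) {
                pass = true;                 // key is already among the top n
            } else if (keys.size() < topN) {
                keys.add(key);               // still filling up
                pass = true;
            } else if (key < keys.last()) {
                keys.pollLast();             // evict the current largest key
                keys.add(key);
                pass = true;
            } else {
                pass = false;                // cannot be in the top n
            }
            if (pass) {
                forwarded++;
            }
            return pass;
        }
    }

    private final Map<String, KeyFilter> filters = new HashMap<>();
    private final int topN;
    private final long minRowsPerFilter;      // assumed knob: rows a filter must see before it is judged
    private final float efficiencyThreshold;  // assumed knob: minimum fraction of rows that must be dropped
    private boolean disabled;

    PerPartitionCheckSketch(int topN, long minRowsPerFilter, float efficiencyThreshold) {
        this.topN = topN;
        this.minRowsPerFilter = minRowsPerFilter;
        this.efficiencyThreshold = efficiencyThreshold;
    }

    /** Returns true if the row should be forwarded downstream. */
    boolean process(String partition, int key) {
        if (disabled) {
            return true;
        }
        return filters.computeIfAbsent(partition, p -> new KeyFilter(topN)).canForward(key);
    }

    /** Meant to be called periodically, e.g. after every n batches. */
    void checkEfficiency() {
        long total = 0;
        long forwarded = 0;
        for (KeyFilter f : filters.values()) {
            if (f.total >= minRowsPerFilter) {   // skip filters with too little data
                total += f.total;
                forwarded += f.forwarded;
            }
        }
        if (total == 0) {
            return;                              // nothing to judge yet
        }
        float droppedRatio = 1f - (float) forwarded / total;
        if (droppedRatio < efficiencyThreshold) {
            disabled = true;
            filters.clear();
        }
    }

    boolean isDisabled() {
        return disabled;
    }

    public static void main(String[] args) {
        PerPartitionCheckSketch op = new PerPartitionCheckSketch(1, 4, 0.5f);
        op.process("p1", 7);                     // a single row is too little to judge
        op.checkEfficiency();
        System.out.println("disabled after one row: " + op.isDisabled());
    }
}
```

The point of the sketch is that a partition with only a handful of rows does
not influence the decision, so the filter is not switched off just because
some partitions are still sparsely populated when the check fires.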


Diffs (updated)
---

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 12f4822e381 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TopNKeyOperator.java f09867bb4e8 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorTopNKeyOperator.java 0f8eb173c66 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/wrapper/VectorHashKeyWrapperBatch.java b487480b938 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/wrapper/VectorHashKeyWrapperGeneralComparator.java 06ac661028f 
  ql/src/java/org/apache/hadoop/hive/ql/plan/TopNKeyDesc.java ddd657e5552 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestTopNKeyFilter.java a91bc7354a7 


Diff: https://reviews.apache.org/r/72200/diff/2/

Changes: https://reviews.apache.org/r/72200/diff/1-2/


Testing
---

manually


Thanks,

Attila Magyar



Re: Review Request 72200: TopN Key efficiency check might disable filter too soon

2020-03-05 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72200/#review219806
---




common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
Line 2421 (original), 2421 (patched)


I think we should check at least 100K rows before turning this off, so 
checking every 10K batches makes more sense to me. So let's make the default 
10K here.


- Ashutosh Chauhan


On March 5, 2020, 2:22 p.m., Attila Magyar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72200/
> ---
> 
> (Updated March 5, 2020, 2:22 p.m.)
> 
> 
> Review request for hive, Gopal V, Jesús Camacho Rodríguez, Krisztian Kasa, 
> and Rajesh Balamohan.
> 
> 
> Bugs: HIVE-22982
> https://issues.apache.org/jira/browse/HIVE-22982
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The check is triggered after every n batches, but there can be multiple 
> filters, one for each partition. Some filters might have less data than the 
> others.
> 
> 
> Diffs
> ---
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7ea2de9019c 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/TopNKeyOperator.java f09867bb4e8 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorTopNKeyOperator.java 0f8eb173c66 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestTopNKeyFilter.java a91bc7354a7 
> 
> 
> Diff: https://reviews.apache.org/r/72200/diff/1/
> 
> 
> Testing
> ---
> 
> manually
> 
> 
> Thanks,
> 
> Attila Magyar
> 
>



Review Request 72200: TopN Key efficiency check might disable filter too soon

2020-03-05 Thread Attila Magyar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72200/
---

Review request for hive, Gopal V, Jesús Camacho Rodríguez, Krisztian Kasa, and 
Rajesh Balamohan.


Bugs: HIVE-22982
https://issues.apache.org/jira/browse/HIVE-22982


Repository: hive-git


Description
---

The check is triggered after every n batches, but there can be multiple filters, 
one for each partition. Some filters might have less data than the others.


Diffs
---

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7ea2de9019c 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TopNKeyOperator.java f09867bb4e8 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorTopNKeyOperator.java 0f8eb173c66 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestTopNKeyFilter.java a91bc7354a7 


Diff: https://reviews.apache.org/r/72200/diff/1/


Testing
---

manually


Thanks,

Attila Magyar