Re: Review Request 71645: HIVE-22292

2019-11-05 Thread Krisztian Kasa

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71645/
---

(Updated Nov. 5, 2019, 8:41 a.m.)


Review request for hive, Jesús Camacho Rodríguez, Zoltan Haindrich, and Vineet 
Garg.


Bugs: HIVE-22292
https://issues.apache.org/jira/browse/HIVE-22292


Repository: hive-git


Description
---

Implement Hypothetical-Set Aggregate Functions
==
1. rank, dense_rank, percent_rank, cume_dist
2. Allow unlimited column references in `WITHIN GROUP` clause
3. Refactor the implementation of the functions `percentile_cont` and
`percentile_disc`:
 - validate that only one parameter and column reference is passed to these
two functions.
 - since the semantics of the `WITHIN GROUP` clause allow multiple column
references, the parameter order had to be changed, which affects backward
compatibility.
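
For illustration, the four functions in item 1 can be sketched outside Hive as follows. This is a minimal sketch, not the patch's GenericUDAF code: the class `HypotheticalSetSketch` and its method names are hypothetical, it assumes a single ascending integer sort key with no NULLs, and the formulas follow the usual SQL definitions of hypothetical-set aggregates (as documented, for example, by PostgreSQL). The patch may differ in details.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; NOT Hive's GenericUDAF* implementation.
// Assumes ascending order on one integer column and no NULL values.
final class HypotheticalSetSketch {

  // rank(h) WITHIN GROUP (ORDER BY val): 1 + number of rows sorting before h.
  static long rank(List<Integer> vals, int h) {
    return 1 + vals.stream().filter(v -> v < h).count();
  }

  // dense_rank: 1 + number of distinct values sorting before h.
  static long denseRank(List<Integer> vals, int h) {
    return 1 + vals.stream().filter(v -> v < h).distinct().count();
  }

  // percent_rank: (rank - 1) / N, where N is the number of rows in the group.
  static double percentRank(List<Integer> vals, int h) {
    return vals.isEmpty() ? 0.0 : (rank(vals, h) - 1) / (double) vals.size();
  }

  // cume_dist: (rows at or before h, plus the hypothetical row) / (N + 1).
  static double cumeDist(List<Integer> vals, int h) {
    long atOrBefore = vals.stream().filter(v -> v <= h).count();
    return (atOrBefore + 1) / (double) (vals.size() + 1);
  }

  public static void main(String[] args) {
    List<Integer> vals = Arrays.asList(1, 2, 2, 4);
    System.out.println(rank(vals, 3));        // 4
    System.out.println(denseRank(vals, 3));   // 3
    System.out.println(percentRank(vals, 3)); // 0.75
    System.out.println(cumeDist(vals, 3));    // 0.8
  }
}
```

The argument `h` plays the role of the constant passed to the function, as in `rank(1) WITHIN GROUP (ORDER BY val)`: the result is the rank that value would get if it were added to the group.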


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 5e88f30cab 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 059919710e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionDescription.java 
48645dc3f2 
  ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionInfo.java a0b0e48f4c 
  ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 55c6863f67 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 0198c0f724 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCumeDist.java 
d0c155ff2d 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFDenseRank.java 
992f5bfd21 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentRank.java 
64e9c8b7ca 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileCont.java
 ad61410180 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileDisc.java
 c8d3c12c80 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFRank.java 
13e2f537cd 
  ql/src/java/org/apache/hadoop/hive/ql/util/NullOrdering.java PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestFunctionRegistry.java 
dead3ec472 
  ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseWithinGroupClause.java 
9d44ed87e9 
  ql/src/test/queries/clientpositive/hypothetical_set_aggregates.q PRE-CREATION 
  ql/src/test/results/clientpositive/hypothetical_set_aggregates.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/udaf_percentile_cont.q.out f12cb6cd5e 
  ql/src/test/results/clientpositive/udaf_percentile_disc.q.out d10fee577c 


Diff: https://reviews.apache.org/r/71645/diff/4/

Changes: https://reviews.apache.org/r/71645/diff/3-4/


Testing
---

New q test added for testing Hypothetical-Set Aggregate Functions: 
hypothetical_set_aggregates.q
Run q tests: hypothetical_set_aggregates.q, udaf_percentile_cont.q, 
udaf_percentile_disc.q
Run unit test: TestParseWithinGroupClause.java


Thanks,

Krisztian Kasa



Re: Review Request 71645: HIVE-22292

2019-11-04 Thread Jesús Camacho Rodríguez


> On Nov. 1, 2019, 6:06 a.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionInfo.java
> > Line 30 (original), 30 (patched)
> > 
> >
> > Does _orderedAggregate_ duplicate the functionality of _impliesOrder_?
> 
> Krisztian Kasa wrote:
> `impliesOrder` indicates that the function is a window function and that an
> OVER clause is required when it is invoked. For example:
> ```
> SELECT val, rank() OVER (ORDER BY val DESC) FROM t_table;
> ```
> 
> `orderedAggregate` means that the function is an Ordered-Set Aggregate or 
> a Hypothetical-Set Aggregate function and a `WITHIN GROUP` clause is required 
> when invoked:
> ```
> SELECT rank(1) WITHIN GROUP (ORDER BY val) FROM t_table;
> ```
> 
> During semantic analysis, if the WITHIN GROUP keywords are provided, an
> extra check is performed: whether the function allows `WITHIN GROUP`, that is,
> whether it is an Ordered-Set Aggregate or a Hypothetical-Set Aggregate function.
> If neither a WITHIN GROUP nor an ORDER clause is specified but the function
> cannot be used without one, a MISSING_OVER_CLAUSE exception is thrown.

Can we add this as a comment to the code for clarity? The names are a bit 
confusing at this point. Thanks


- Jesús


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71645/#review218480
---


On Nov. 4, 2019, 11:01 a.m., Krisztian Kasa wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71645/
> ---
> 
> (Updated Nov. 4, 2019, 11:01 a.m.)
> 
> 
> Review request for hive, Jesús Camacho Rodríguez, Zoltan Haindrich, and 
> Vineet Garg.
> 
> 
> Bugs: HIVE-22292
> https://issues.apache.org/jira/browse/HIVE-22292
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Implement Hypothetical-Set Aggregate Functions
> ==
> 1. rank, dense_rank, percent_rank, cume_dist
> 2. Allow unlimited column references in `WITHIN GROUP` clause
> 3. Refactor the implementation of the functions `percentile_cont` and
> `percentile_disc`:
>  - validate that only one parameter and column reference is passed to
> these two functions.
>  - since the semantics of the `WITHIN GROUP` clause allow multiple
> column references, the parameter order had to be changed, which affects
> backward compatibility.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 5e88f30cab 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 059919710e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionDescription.java 
> 48645dc3f2 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionInfo.java 
> a0b0e48f4c 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 55c6863f67 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> 0198c0f724 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCumeDist.java 
> d0c155ff2d 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFDenseRank.java 
> 992f5bfd21 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentRank.java 
> 64e9c8b7ca 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileCont.java
>  ad61410180 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileDisc.java
>  c8d3c12c80 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFRank.java 
> 13e2f537cd 
>   ql/src/java/org/apache/hadoop/hive/ql/util/NullOrdering.java PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestFunctionRegistry.java 
> dead3ec472 
>   ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseWithinGroupClause.java 
> 9d44ed87e9 
>   ql/src/test/queries/clientpositive/hypothetical_set_aggregates.q 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/hypothetical_set_aggregates.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/udaf_percentile_cont.q.out f12cb6cd5e 
>   ql/src/test/results/clientpositive/udaf_percentile_disc.q.out d10fee577c 
> 
> 
> Diff: https://reviews.apache.org/r/71645/diff/3/
> 
> 
> Testing
> ---
> 
> New q test added for testing Hypothetical-Set Aggregate Functions: 
> hypothetical_set_aggregates.q
> Run q tests: hypothetical_set_aggregates.q, udaf_percentile_cont.q, 
> udaf_percentile_disc.q
> Run unit test: TestParseWithinGroupClause.java
> 
> 
> Thanks,
> 
> Krisztian Kasa
> 
>



Re: Review Request 71645: HIVE-22292

2019-11-04 Thread Krisztian Kasa

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71645/
---

(Updated Nov. 4, 2019, 11:01 a.m.)


Review request for hive, Jesús Camacho Rodríguez, Zoltan Haindrich, and Vineet 
Garg.


Bugs: HIVE-22292
https://issues.apache.org/jira/browse/HIVE-22292


Repository: hive-git


Description
---

Implement Hypothetical-Set Aggregate Functions
==
1. rank, dense_rank, percent_rank, cume_dist
2. Allow unlimited column references in `WITHIN GROUP` clause
3. Refactor the implementation of the functions `percentile_cont` and
`percentile_disc`:
 - validate that only one parameter and column reference is passed to these
two functions.
 - since the semantics of the `WITHIN GROUP` clause allow multiple column
references, the parameter order had to be changed, which affects backward
compatibility.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 5e88f30cab 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 059919710e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionDescription.java 
48645dc3f2 
  ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionInfo.java a0b0e48f4c 
  ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 55c6863f67 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 0198c0f724 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCumeDist.java 
d0c155ff2d 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFDenseRank.java 
992f5bfd21 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentRank.java 
64e9c8b7ca 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileCont.java
 ad61410180 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileDisc.java
 c8d3c12c80 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFRank.java 
13e2f537cd 
  ql/src/java/org/apache/hadoop/hive/ql/util/NullOrdering.java PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestFunctionRegistry.java 
dead3ec472 
  ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseWithinGroupClause.java 
9d44ed87e9 
  ql/src/test/queries/clientpositive/hypothetical_set_aggregates.q PRE-CREATION 
  ql/src/test/results/clientpositive/hypothetical_set_aggregates.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/udaf_percentile_cont.q.out f12cb6cd5e 
  ql/src/test/results/clientpositive/udaf_percentile_disc.q.out d10fee577c 


Diff: https://reviews.apache.org/r/71645/diff/3/

Changes: https://reviews.apache.org/r/71645/diff/2-3/


Testing
---

New q test added for testing Hypothetical-Set Aggregate Functions: 
hypothetical_set_aggregates.q
Run q tests: hypothetical_set_aggregates.q, udaf_percentile_cont.q, 
udaf_percentile_disc.q
Run unit test: TestParseWithinGroupClause.java


Thanks,

Krisztian Kasa



Re: Review Request 71645: HIVE-22292

2019-11-03 Thread Krisztian Kasa

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71645/
---

(Updated Nov. 4, 2019, 7:12 a.m.)


Review request for hive, Jesús Camacho Rodríguez, Zoltan Haindrich, and Vineet 
Garg.


Bugs: HIVE-22292
https://issues.apache.org/jira/browse/HIVE-22292


Repository: hive-git


Description
---

Implement Hypothetical-Set Aggregate Functions
==
1. rank, dense_rank, percent_rank, cume_dist
2. Allow unlimited column references in `WITHIN GROUP` clause
3. Refactor the implementation of the functions `percentile_cont` and
`percentile_disc`:
 - validate that only one parameter and column reference is passed to these
two functions.
 - since the semantics of the `WITHIN GROUP` clause allow multiple column
references, the parameter order had to be changed, which affects backward
compatibility.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 5e88f30cab 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 059919710e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionDescription.java 
48645dc3f2 
  ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionInfo.java a0b0e48f4c 
  ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 55c6863f67 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 0198c0f724 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCumeDist.java 
d0c155ff2d 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFDenseRank.java 
992f5bfd21 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentRank.java 
64e9c8b7ca 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileCont.java
 ad61410180 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileDisc.java
 c8d3c12c80 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFRank.java 
13e2f537cd 
  ql/src/java/org/apache/hadoop/hive/ql/util/NullOrdering.java PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestFunctionRegistry.java 
dead3ec472 
  ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseWithinGroupClause.java 
9d44ed87e9 
  ql/src/test/queries/clientpositive/hypothetical_set_aggregates.q PRE-CREATION 
  ql/src/test/results/clientpositive/hypothetical_set_aggregates.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/udaf_percentile_cont.q.out f12cb6cd5e 
  ql/src/test/results/clientpositive/udaf_percentile_disc.q.out d10fee577c 


Diff: https://reviews.apache.org/r/71645/diff/2/

Changes: https://reviews.apache.org/r/71645/diff/1-2/


Testing
---

New q test added for testing Hypothetical-Set Aggregate Functions: 
hypothetical_set_aggregates.q
Run q tests: hypothetical_set_aggregates.q, udaf_percentile_cont.q, 
udaf_percentile_disc.q
Run unit test: TestParseWithinGroupClause.java


Thanks,

Krisztian Kasa



Re: Review Request 71645: HIVE-22292

2019-11-03 Thread Krisztian Kasa


> On Nov. 1, 2019, 6:06 a.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionInfo.java
> > Line 30 (original), 30 (patched)
> > 
> >
> > Does _orderedAggregate_ duplicate the functionality of _impliesOrder_?

`impliesOrder` indicates that the function is a window function and that an OVER
clause is required when it is invoked. For example:
```
SELECT val, rank() OVER (ORDER BY val DESC) FROM t_table;
```

`orderedAggregate` means that the function is an Ordered-Set Aggregate or a 
Hypothetical-Set Aggregate function and a `WITHIN GROUP` clause is required 
when invoked:
```
SELECT rank(1) WITHIN GROUP (ORDER BY val) FROM t_table;
```

During semantic analysis, if the WITHIN GROUP keywords are provided, an extra
check is performed: whether the function allows `WITHIN GROUP`, that is, whether
it is an Ordered-Set Aggregate or a Hypothetical-Set Aggregate function.
If neither a WITHIN GROUP nor an ORDER clause is specified but the function
cannot be used without one, a MISSING_OVER_CLAUSE exception is thrown.
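
The two checks described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in SemanticAnalyzer: the `Flags` and `Clause` types and the `WITHIN_GROUP_NOT_ALLOWED` name are hypothetical, and the sketch assumes that a clause is mandatory exactly when `impliesOrder` is set. Only `MISSING_OVER_CLAUSE` and the two flag names come from this thread.

```java
// Illustrative sketch only; NOT the check implemented in the patch.
final class WithinGroupCheckSketch {

  // Which clause, if any, follows the function call in the query.
  enum Clause { NONE, OVER, WITHIN_GROUP }

  // Hypothetical stand-in for the two flags discussed in this review.
  static final class Flags {
    final boolean impliesOrder;     // window function such as rank() OVER (...)
    final boolean orderedAggregate; // usable with WITHIN GROUP (...)

    Flags(boolean impliesOrder, boolean orderedAggregate) {
      this.impliesOrder = impliesOrder;
      this.orderedAggregate = orderedAggregate;
    }
  }

  // Returns "OK" or the name of the error that would be reported.
  static String check(Flags f, Clause clause) {
    if (clause == Clause.WITHIN_GROUP && !f.orderedAggregate) {
      return "WITHIN_GROUP_NOT_ALLOWED"; // hypothetical error name
    }
    if (clause == Clause.NONE && f.impliesOrder) {
      return "MISSING_OVER_CLAUSE"; // assumption: clause required iff impliesOrder
    }
    return "OK";
  }

  public static void main(String[] args) {
    Flags rank = new Flags(true, true);  // rank: window and hypothetical-set
    Flags sum = new Flags(false, false); // a plain aggregate
    System.out.println(check(rank, Clause.WITHIN_GROUP)); // OK
    System.out.println(check(rank, Clause.NONE));         // MISSING_OVER_CLAUSE
    System.out.println(check(sum, Clause.WITHIN_GROUP));  // WITHIN_GROUP_NOT_ALLOWED
  }
}
```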


- Krisztian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71645/#review218480
---


On Oct. 22, 2019, 10:49 a.m., Krisztian Kasa wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71645/
> ---
> 
> (Updated Oct. 22, 2019, 10:49 a.m.)
> 
> 
> Review request for hive, Jesús Camacho Rodríguez, Zoltan Haindrich, and 
> Vineet Garg.
> 
> 
> Bugs: HIVE-22292
> https://issues.apache.org/jira/browse/HIVE-22292
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Implement Hypothetical-Set Aggregate Functions
> ==
> 1. rank, dense_rank, percent_rank, cume_dist
> 2. Allow unlimited column references in `WITHIN GROUP` clause
> 3. Refactor the implementation of the functions `percentile_cont` and
> `percentile_disc`:
>  - validate that only one parameter and column reference is passed to
> these two functions.
>  - since the semantics of the `WITHIN GROUP` clause allow multiple
> column references, the parameter order had to be changed, which affects
> backward compatibility.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 5e88f30cab 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 059919710e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionDescription.java 
> 48645dc3f2 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionInfo.java 
> a0b0e48f4c 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 55c6863f67 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> 30d37914d0 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCumeDist.java 
> d0c155ff2d 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFDenseRank.java 
> 992f5bfd21 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentRank.java 
> 64e9c8b7ca 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileCont.java
>  ad61410180 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileDisc.java
>  c8d3c12c80 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFRank.java 
> 13e2f537cd 
>   ql/src/java/org/apache/hadoop/hive/ql/util/NullOrdering.java PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestFunctionRegistry.java 
> dead3ec472 
>   ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseWithinGroupClause.java 
> 9d44ed87e9 
>   ql/src/test/queries/clientpositive/hypothetical_set_aggregates.q 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/hypothetical_set_aggregates.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/udaf_percentile_cont.q.out f12cb6cd5e 
>   ql/src/test/results/clientpositive/udaf_percentile_disc.q.out d10fee577c 
> 
> 
> Diff: https://reviews.apache.org/r/71645/diff/1/
> 
> 
> Testing
> ---
> 
> New q test added for testing Hypothetical-Set Aggregate Functions: 
> hypothetical_set_aggregates.q
> Run q tests: hypothetical_set_aggregates.q, udaf_percentile_cont.q, 
> udaf_percentile_disc.q
> Run unit test: TestParseWithinGroupClause.java
> 
> 
> Thanks,
> 
> Krisztian Kasa
> 
>



Re: Review Request 71645: HIVE-22292

2019-11-01 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71645/#review218480
---




ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionInfo.java
Line 30 (original), 30 (patched)


Does _orderedAggregate_ duplicate the functionality of _impliesOrder_?



ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileCont.java
Lines 210 (patched)


?


- Jesús Camacho Rodríguez





Review Request 71645: HIVE-22292

2019-10-22 Thread Krisztian Kasa

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71645/
---

Review request for hive, Jesús Camacho Rodríguez and Zoltan Haindrich.


Bugs: HIVE-22292
https://issues.apache.org/jira/browse/HIVE-22292


Repository: hive-git


Description
---

Implement Hypothetical-Set Aggregate Functions
==
1. rank, dense_rank, percent_rank, cume_dist
2. Allow unlimited column references in `WITHIN GROUP` clause
3. Refactor the implementation of the functions `percentile_cont` and
`percentile_disc`:
 - validate that only one parameter and column reference is passed to these
two functions.
 - since the semantics of the `WITHIN GROUP` clause allow multiple column
references, the parameter order had to be changed, which affects backward
compatibility.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 5e88f30cab 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 059919710e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionDescription.java 
48645dc3f2 
  ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionInfo.java a0b0e48f4c 
  ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 55c6863f67 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 30d37914d0 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCumeDist.java 
d0c155ff2d 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFDenseRank.java 
992f5bfd21 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentRank.java 
64e9c8b7ca 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileCont.java
 ad61410180 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileDisc.java
 c8d3c12c80 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFRank.java 
13e2f537cd 
  ql/src/java/org/apache/hadoop/hive/ql/util/NullOrdering.java PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestFunctionRegistry.java 
dead3ec472 
  ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseWithinGroupClause.java 
9d44ed87e9 
  ql/src/test/queries/clientpositive/hypothetical_set_aggregates.q PRE-CREATION 
  ql/src/test/results/clientpositive/hypothetical_set_aggregates.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/udaf_percentile_cont.q.out f12cb6cd5e 
  ql/src/test/results/clientpositive/udaf_percentile_disc.q.out d10fee577c 


Diff: https://reviews.apache.org/r/71645/diff/1/


Testing
---

New q test added for testing Hypothetical-Set Aggregate Functions: 
hypothetical_set_aggregates.q
Run q tests: hypothetical_set_aggregates.q, udaf_percentile_cont.q, 
udaf_percentile_disc.q
Run unit test: TestParseWithinGroupClause.java


Thanks,

Krisztian Kasa