[ https://issues.apache.org/jira/browse/IMPALA-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17922217#comment-17922217 ]
ASF subversion and git services commented on IMPALA-13688:
----------------------------------------------------------
Commit 98b584a45f95380a30cddc1736973d91ba29542f in impala's branch
refs/heads/master from Steve Carlin
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=98b584a45 ]
IMPALA-13481: Add support for various agg and analytic functions
Various functions were added. There were several issues with
these functions:
1) The Calcite parser and/or validator was generating SqlNodes
that weren't compatible with Impala. To fix this, the parsing
had to be removed from the Parser.jj file and the functions were
marked to use the ImpalaOperator rather than the Calcite operator.
These functions include:
trim, extract, regr*, regexp*, localtime, group_concat
2) The ntile, cume_dist, and percent_rank functions undergo a
transformation in AnalyticExpr. To make this cleaner for Calcite,
the transformation now happens in the RewriteRexOverRule.
3) The "negative" operator had to be added to the custom operator table.
The subtract was already added there, and all "-" operators need
to be in the same table.
4) Various functions were added to the function resolver where the Calcite
function name was different from the Impala function name.
Also added the test mentioned in IMPALA-13688 for cume_dist with
duplicates.
Change-Id: I57c69a60c63872b2964688f395b662a85698555e
Reviewed-on: http://gerrit.cloudera.org:8080/21976
Reviewed-by: Joe McDonnell <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Add test for cume_dist with duplicate values
> --------------------------------------------
>
> Key: IMPALA-13688
> URL: https://issues.apache.org/jira/browse/IMPALA-13688
> Project: IMPALA
> Issue Type: Task
> Components: Test
> Affects Versions: Impala 4.5.0
> Reporter: Joe McDonnell
> Priority: Major
>
> A crucial piece of the behavior of cume_dist() is handling duplicates
> properly. Our existing analytic test cases don't adequately test this. We
> should add a test that verifies cume_dist() for duplicates.
> Here is an example:
> {noformat}
> create table cume_dist_test (i int);
> insert into cume_dist_test values (1);
> insert into cume_dist_test values (1);
> insert into cume_dist_test values (3);
> select i, cume_dist() over (order by i) from cume_dist_test order by i;
> # Expected values:
> +---+-----------------------+
> | i | cume_dist() OVER(...) |
> +---+-----------------------+
> | 1 | 0.666666666667        |
> | 1 | 0.666666666667        |
> | 3 | 1.0                   |
> +---+-----------------------+
> Fetched 3 row(s) in 0.12s
> {noformat}
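
For reference, the expected values in the quoted example follow from the standard SQL definition of cume_dist(): the number of rows whose ordering value is less than or equal to the current row's value, divided by the total number of rows in the partition. The Python sketch below reproduces that arithmetic for a single unpartitioned column. The helper name cume_dist and the use of Fraction are illustrative assumptions and are not taken from the Impala or Calcite code.

```python
from bisect import bisect_right
from fractions import Fraction


def cume_dist(values):
    """Return (value, cume_dist) pairs in ascending order of value.

    Hypothetical helper for illustration only. For each row:
        cume_dist = (rows with value <= this row's value) / (total rows)
    Tied rows (peers) therefore all receive the same result.
    """
    ordered = sorted(values)
    total = len(ordered)
    # bisect_right returns the count of elements <= v in the sorted list.
    return [(v, Fraction(bisect_right(ordered, v), total)) for v in ordered]


if __name__ == "__main__":
    # Same data as the cume_dist_test table in the issue description.
    for value, dist in cume_dist([1, 1, 3]):
        print(value, float(dist))
```

Both rows with i = 1 are peers, so each counts two rows out of three; the row with i = 3 counts all three rows.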