Carter Shanklin created HIVE-16255:
--------------------------------------
Summary: Support percentile_cont / percentile_disc
Key: HIVE-16255
URL: https://issues.apache.org/jira/browse/HIVE-16255
Project: Hive
Issue Type: Sub-task
Reporter: Carter Shanklin
Way back in HIVE-259, a percentile function was added that provides a subset of
the standard percentile_cont aggregate function.
The SQL standard provides some additional options and also a percentile_disc
aggregate function with different rules. In the standard you specify an
ordering with an arbitrary value expression, and the results are drawn from
this value expression. These aggregate functions should be usable as analytic
functions as well (i.e. support the over clause). The current percentile
function can already be used with an over clause.
The rough outline of how this works is:
    percentile_cont(number) within group (order by expression) [ over(window spec) ]
    percentile_disc(number) within group (order by expression) [ over(window spec) ]
The value of number must be between 0 and 1, inclusive. The value expression is
evaluated for each row of the group, nulls are discarded, and the remaining
rows are ordered. The result is then determined as follows:
- If PERCENTILE_CONT is specified, by considering the pair of consecutive rows
that are indicated by the argument, treated as a fraction of the total number
of rows in the group, and interpolating the value of the value expression
evaluated for these rows.
- If PERCENTILE_DISC is specified, by treating the group as a window partition
of the CUME_DIST window function, using the specified ordering of the value
expression as the window ordering, and returning the first value expression
whose cumulative distribution value is greater than or equal to the argument.
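
The two rules above can be illustrated with a small Python sketch. This is not
Hive code and not a proposed implementation; the function names and the helper
_ordered_non_null are hypothetical, the ordering is assumed to be ascending, an
empty group is assumed to yield NULL (None), and linear interpolation is assumed
for percentile_cont.

```python
import math
from bisect import bisect_right


def _ordered_non_null(values):
    """Hypothetical helper: discard nulls (None) and order ascending."""
    return sorted(v for v in values if v is not None)


def percentile_cont(p, values):
    """Interpolate between the two consecutive rows indicated by p."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    rows = _ordered_non_null(values)
    if not rows:
        return None  # assumption: an empty group yields NULL
    # Zero-based fractional row position within the ordered group.
    pos = p * (len(rows) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    if lo == hi:
        return float(rows[lo])  # p lands exactly on a row
    # Linear interpolation between the two neighbouring rows.
    return rows[lo] * (hi - pos) + rows[hi] * (pos - lo)


def percentile_disc(p, values):
    """Return the first ordered value whose CUME_DIST is >= p."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    rows = _ordered_non_null(values)
    if not rows:
        return None  # assumption: an empty group yields NULL
    n = len(rows)
    for v in rows:
        # CUME_DIST of v: share of rows ordered at or before v (ties included).
        cume_dist = bisect_right(rows, v) / n
        if cume_dist >= p:
            return v
    return rows[-1]  # not reached: the last row always has CUME_DIST 1.0


data = [10, None, 20, 30, 40]
print(percentile_cont(0.5, data))  # 25.0, interpolated between 20 and 30
print(percentile_disc(0.5, data))  # 20, an actual value from the group
```

The example shows the practical difference: percentile_cont may return a value
that does not occur in the group, while percentile_disc always returns one of
the input values.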