I would say: just build a histogram of <value, count> pairs, sort it by
value at the end, and walk the cumulative counts to return the value at
the desired percentile.

This assumes that the number of unique values is not large, which can
easily be enforced by rounding first, e.g. round(number, digits).
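
To make the idea concrete, here is a minimal sketch in plain Python rather than Hive. The function name, the nearest-rank percentile definition, and the default of 2 digits are all assumptions made for illustration; none of them is part of Hive or the suggestion above.

```python
import math
from collections import Counter


def histogram_percentile(values, fraction, digits=2):
    """Approximate percentile from a <value, count> histogram.

    `fraction` is in (0, 1], e.g. 0.5 for the median. Uses the
    nearest-rank definition, which is an assumption of this sketch.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")

    # Rounding bounds the number of distinct keys in the histogram.
    counts = Counter(round(v, digits) for v in values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no values given")

    # 1-based rank of the element we are looking for.
    rank = math.ceil(fraction * total)

    # Sort the (small) histogram by value and accumulate counts
    # until the running total reaches the target rank.
    running = 0
    for value, count in sorted(counts.items()):
        running += count
        if running >= rank:
            return value


if __name__ == "__main__":
    print(histogram_percentile([1, 2, 3, 4, 5], 0.5))
```

Only the histogram has to be sorted, not the full data set, so the cost of the final step depends on the number of distinct rounded values rather than on the number of rows.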

Zheng

On Thu, Feb 4, 2010 at 9:08 PM, Bryan Talbot <[email protected]> wrote:
> What's the best way to compute median and other percentiles using Hive 0.40?  
> I've run across http://issues.apache.org/jira/browse/HIVE-259 but there 
> doesn't seem to be any planned implementation yet.
>
>
> -Bryan

-- 
Yours,
Zheng
