I would say: just build a histogram of <value, count> pairs, sort it by value at the end, and return the value at the desired percentile.
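To make the idea concrete, here is a rough sketch in Python rather than Hive. The function name, the `digits` default, and the nearest-rank definition of a percentile are my own assumptions for illustration. In Hive the <value, count> pairs would presumably come from a GROUP BY with COUNT, with only the final walk over the sorted pairs done on the client.

```python
import math
from collections import Counter


def percentile_from_histogram(values, pct, digits=2):
    """Nearest-rank percentile computed from a <value, count> histogram.

    Hypothetical helper: the name, the rounding default, and the
    nearest-rank convention are illustrative assumptions.
    """
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")

    # Build the histogram; rounding bounds the number of unique keys.
    hist = Counter(round(v, digits) for v in values)
    total = sum(hist.values())
    if total == 0:
        raise ValueError("no values given")

    # 1-based rank of the element that sits at the requested percentile.
    rank = math.ceil(pct * total / 100)

    # Sort the (few) unique values and walk the cumulative counts.
    seen = 0
    for value, count in sorted(hist.items()):
        seen += count
        if seen >= rank:
            return value


data = [1.004, 1.001, 2.0, 2.0, 3.0]
median = percentile_from_histogram(data, 50)
```

Only the unique values get sorted, not the full data set, so the final step stays cheap as long as the histogram is small.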
The histogram approach assumes that the number of unique values is small, which is easy to enforce by rounding first with round(number, digits).

Zheng

On Thu, Feb 4, 2010 at 9:08 PM, Bryan Talbot <[email protected]> wrote:
> What's the best way to compute median and other percentiles using Hive 0.40?
> I've run across http://issues.apache.org/jira/browse/HIVE-259 but there
> doesn't seem to be any planned implementation yet.
>
>
> -Bryan
>

--
Yours,
Zheng
