I’ve thought for some time that SQL aggregate functions should have an “APPROXIMATE ( … )” clause. Users don’t WANT to call a TD_MEDIAN function; they want a MEDIAN that gives them an answer to their desired accuracy (within X, within Y%, or within a given confidence interval), and TD_MEDIAN may be the way to achieve that.
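To make "an answer to their desired accuracy" a bit more concrete, here is a rough Python sketch of the trade-off. It is not the t-digest approach from the pull request and not how TD_MEDIAN works; it swaps in plain uniform sampling because that is the simplest way to show an approximate median with a tunable error. The helper name `approximate_median` and its parameters are made up for illustration.

```python
import random
import statistics


def approximate_median(values, sample_size, seed=0):
    """Estimate the median from a uniform random sample.

    Hypothetical helper for illustration only: it uses simple sampling,
    not a t-digest. A larger sample_size gives a tighter estimate.
    """
    if sample_size >= len(values):
        # Nothing to approximate: the sample would cover every value.
        return statistics.median(values)
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    sample = rng.sample(values, sample_size)
    return statistics.median(sample)


values = list(range(100_000))
exact = statistics.median(values)
approx = approximate_median(values, sample_size=10_000)

# The values are 0..N-1, so the difference divided by N is the rank error,
# i.e. how far the estimate sits from the true 50th percentile.
rank_error = abs(approx - exact) / len(values)
print(f"exact={exact} approx={approx} rank_error={rank_error:.4f}")
```

The point is only that the caller chooses an accuracy (here indirectly, through the sample size) and the engine picks whatever algorithm meets it, which is what an APPROXIMATE clause would let a user say directly.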
In fact, the user might just run “SET APPROXIMATE = '95%'” in their session, so that the APPROXIMATE clause is implicit on every query they write. Approximate aggregate functions are all the rage right now, but I’m not aware of any effort to standardize them across databases.

Julian

> On Jun 6, 2016, at 5:58 PM, Parth Chandra <par...@apache.org> wrote:
>
> Hey Steven,
> Somehow I missed this one when you posted it.
> Since you asked, I would suggest a different name from median, quartile
> since that might mislead. How about td_median, td_quantile ?
>
> On Wed, Apr 13, 2016 at 11:51 AM, Steven Phillips <ste...@dremio.com> wrote:
>
>> I submitted a pull request a little while ago that introduces (approximate)
>> median and quantile functions using the tdigest library.
>>
>> https://github.com/apache/drill/pull/456
>>
>> It would be great if I could get some feedback on this. Specifically, is it
>> ok to call these functions median and quantile, given that they are not
>> exact.
>>