As Julian mentioned, an optional APPROXIMATE clause combined with a session/system setting looks like the best option to me. Exposing the algorithm in the function name does not make sense, since we might want to replace it with a different one in the future. That said, there are other approaches: Oracle, for example, uses a dedicated function name, APPROX_COUNT_DISTINCT() [1].
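To make the idea concrete, here is a rough sketch in Python. It is not Drill code: the `session` dictionary, the `median` wrapper, and the `approximate` parameter are all hypothetical names, and the approximate path uses plain uniform sampling as a stand-in rather than t-digest. It only illustrates how one function name could stay stable while the session setting or a per-query option decides which algorithm runs.

```python
import random
import statistics

# Hypothetical session settings; None means "exact answers only".
session = {"approximate": None}


def _exact_median(values):
    return statistics.median(values)


def _sampled_median(values, sample_size=1000, seed=0):
    # Stand-in approximate algorithm (uniform sampling), NOT t-digest.
    # It could be swapped for another algorithm without renaming median().
    if len(values) <= sample_size:
        return statistics.median(values)
    rng = random.Random(seed)
    return statistics.median(rng.sample(values, sample_size))


def median(values, approximate=None):
    """MEDIAN with an optional per-query APPROXIMATE override."""
    # A per-query option wins over the session-wide setting.
    if approximate is not None:
        setting = approximate
    else:
        setting = session["approximate"]
    if setting:
        return _sampled_median(values)
    return _exact_median(values)
```

With this shape, a caller who sets `session["approximate"] = "95%"` gets the approximate path on every call, while `median(values, approximate=False)` forces an exact answer for a single query without touching the session. The sketch treats the setting as a simple on/off switch; mapping a value such as "95%" to an actual accuracy target is left open.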
[1] https://docs.oracle.com/database/121/SQLRF/functions013.htm#SQLRF56900

On Tue, Jun 7, 2016 at 6:50 AM, John Omernik <j...@omernik.com> wrote:

> Julian, great point.
>
> With a proper design, users could use session variables or use the select with
> options so that one query wouldn't change the session-wide settings. That
> seems promising as an idea.
>
> John
>
> On Mon, Jun 6, 2016 at 8:12 PM, Julian Hyde <jh...@apache.org> wrote:
>
> > I’ve thought for some time that SQL aggregate functions should have an
> > “APPROXIMATE ( … )” clause. Users don’t WANT to call a TD_MEDIAN function,
> > they want the MEDIAN that gives them an answer to their desired accuracy
> > (within X, within Y%, or within a given confidence interval), and TD_MEDIAN
> > may be the way to achieve that.
> >
> > In fact the user might just set “SET APPROXIMATE = '95%'” in their session
> > and the APPROXIMATE clause is implicit on every query they write.
> >
> > Approximate aggregate functions are all the rage right now but I’m not
> > aware of any effort to standardize them across databases.
> >
> > Julian
> >
> > > On Jun 6, 2016, at 5:58 PM, Parth Chandra <par...@apache.org> wrote:
> > >
> > > Hey Steven,
> > > Somehow I missed this one when you posted it.
> > > Since you asked, I would suggest a different name from median, quartile
> > > since that might mislead. How about td_median, td_quantile?
> > >
> > > On Wed, Apr 13, 2016 at 11:51 AM, Steven Phillips <ste...@dremio.com> wrote:
> > >
> > >> I submitted a pull request a little while ago that introduces (approximate)
> > >> median and quantile functions using the tdigest library.
> > >>
> > >> https://github.com/apache/drill/pull/456
> > >>
> > >> It would be great if I could get some feedback on this. Specifically, is it
> > >> ok to call these functions median and quantile, given that they are not
> > >> exact.