Re: [PATCHES] variance aggregates per SQL:2003
On Tue, 2006-03-07 at 17:54 -0500, Neil Conway wrote:
> This patch implements some new aggregate functions defined by SQL2003:
> stddev_pop(), stddev_samp(), var_pop(), and var_samp().

Applied.

-Neil
Re: [PATCHES] variance aggregates per SQL:2003
Neil Conway <[EMAIL PROTECTED]> writes:
> Well, I realize that stddev(DISTINCT x) != stddev(x) and that most
> people are going to be interested in stddev(x), but I don't think it's
> inconceivable for someone to be interested in stddev(DISTINCT x).
> Explicitly checking for and rejecting it doesn't serve any useful
> purpose that I can see, beyond compliance with the letter of the
> standard -- if the user asks for stddev(DISTINCT x), are we really
> providing useful behavior if we refuse to calculate it?

Agreed, refusing this is not something we should waste code on.

regards, tom lane
Re: [PATCHES] variance aggregates per SQL:2003
On Tue, Mar 07, 2006 at 07:56:06PM -0500, Neil Conway wrote:
> On Tue, 2006-03-07 at 16:36 -0800, David Fetter wrote:
> > The rationale is kinda mathematical. A measure of deviation from
> > central tendency (i.e. variance or stddev) is something where you
> > probably don't want to normalize the weights.
> >
> > For example, the standard deviation of {0,1,1,1,2} is about 0.707,
> > but the standard deviation of {0,1,2} is 1.
>
> Well, I realize that stddev(DISTINCT x) != stddev(x) and that most
> people are going to be interested in stddev(x), but I don't think
> it's inconceivable for someone to be interested in stddev(DISTINCT
> x).

Not inconceivable. Just really hard to justify unless you're trying to
fudge a number ;)

> Explicitly checking for and rejecting it doesn't serve any useful
> purpose that I can see, beyond compliance with the letter of the
> standard -- if the user asks for stddev(DISTINCT x), are we really
> providing useful behavior if we refuse to calculate it?

Nope. I was just coming up with a rationale for why the standard
disallows it :)

Cheers,
D
--
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778

Remember to vote!
Re: [PATCHES] variance aggregates per SQL:2003
On Tue, 2006-03-07 at 16:36 -0800, David Fetter wrote:
> The rationale is kinda mathematical. A measure of deviation from
> central tendency (i.e. variance or stddev) is something where you
> probably don't want to normalize the weights.
>
> For example, the standard deviation of {0,1,1,1,2} is about 0.707, but
> the standard deviation of {0,1,2} is 1.

Well, I realize that stddev(DISTINCT x) != stddev(x) and that most
people are going to be interested in stddev(x), but I don't think it's
inconceivable for someone to be interested in stddev(DISTINCT x).
Explicitly checking for and rejecting it doesn't serve any useful
purpose that I can see, beyond compliance with the letter of the
standard -- if the user asks for stddev(DISTINCT x), are we really
providing useful behavior if we refuse to calculate it?

-Neil
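To make the stddev(DISTINCT x) vs. stddev(x) difference concrete, here is a minimal Python sketch, not PostgreSQL code. The helper name `stddev` and its `distinct` flag are hypothetical stand-ins for the SQL syntax; the sketch assumes the sample standard deviation (divide by n - 1), which is what the figures quoted in the thread correspond to.

```python
import statistics


def stddev(values, distinct=False):
    """Sample standard deviation of `values`.

    distinct=True drops duplicate values first, mimicking what
    stddev(DISTINCT x) would compute. Hypothetical helper, for
    illustration only.
    """
    data = sorted(set(values)) if distinct else list(values)
    return statistics.stdev(data)


xs = [0, 1, 1, 1, 2]

# All rows: the repeated 1s pull the spread down (about 0.707).
print(stddev(xs))

# Distinct rows only, i.e. {0, 1, 2}: every value gets equal weight (1.0).
print(stddev(xs, distinct=True))
```

Dropping duplicates changes the weights and therefore the answer, which is the point both sides of the thread agree on; the disagreement is only about whether the server should refuse to compute it.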
Re: [PATCHES] variance aggregates per SQL:2003
On Tue, Mar 07, 2006 at 05:54:00PM -0500, Neil Conway wrote:
> This patch implements some new aggregate functions defined by SQL2003:
> stddev_pop(), stddev_samp(), var_pop(), and var_samp(). stddev_samp()
> and var_samp() are identical to the existing stddev() and variance()
> aggregates, so I've made the latter aliases for the former.
>
> I noticed that SQL2003 does not allow DISTINCT to be specified for these
> aggregate functions. I can't really see the rationale for this
> restriction, and it would be fairly ugly to implement as far as I can
> tell. Thoughts?
>

The rationale is kinda mathematical. A measure of deviation from
central tendency (i.e. variance or stddev) is something where you
probably don't want to normalize the weights.

For example, the standard deviation of {0,1,1,1,2} is about 0.707, but
the standard deviation of {0,1,2} is 1.

Cheers,
D (still hoping for some way to extend stddev, etc. to intervals)
--
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778

Remember to vote!
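As a rough illustration of how the four aggregates in the patch relate to each other, the sketch below reimplements them in plain Python. It is not the patch's C code. The function names simply mirror the SQL names, and returning None where there are too few rows is an assumption of this sketch standing in for SQL NULL.

```python
import math


def _sum_sq_dev(xs):
    """Sum of squared deviations from the mean."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)


def var_pop(xs):
    """Population variance: divide by n."""
    if len(xs) < 1:
        return None  # assumption: stands in for SQL NULL
    return _sum_sq_dev(xs) / len(xs)


def var_samp(xs):
    """Sample variance: divide by n - 1."""
    if len(xs) < 2:
        return None  # assumption: stands in for SQL NULL
    return _sum_sq_dev(xs) / (len(xs) - 1)


def stddev_pop(xs):
    v = var_pop(xs)
    return None if v is None else math.sqrt(v)


def stddev_samp(xs):
    v = var_samp(xs)
    return None if v is None else math.sqrt(v)


# Per the patch description, the pre-existing names become aliases
# for the sample versions.
variance = var_samp
stddev = stddev_samp

xs = [0, 1, 1, 1, 2]
print(var_pop(xs), var_samp(xs))        # roughly 0.4 and 0.5
print(stddev_pop(xs), stddev_samp(xs))  # roughly 0.632 and 0.707
```

The only difference between the _pop and _samp variants is the divisor (n versus n - 1), so the sample versions are always at least as large as the population versions for the same input.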