Re: [PATCHES] variance aggregates per SQL:2003

2006-03-10 Thread Neil Conway
On Tue, 2006-03-07 at 17:54 -0500, Neil Conway wrote:
> This patch implements some new aggregate functions defined by SQL2003:
> stddev_pop(), stddev_samp(), var_pop(), and var_samp().

Applied.

-Neil



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] variance aggregates per SQL:2003

2006-03-07 Thread Tom Lane
Neil Conway <[EMAIL PROTECTED]> writes:
> Well, I realize that stddev(DISTINCT x) != stddev(x) and that most
> people are going to be interested in stddev(x), but I don't think it's
> inconceivable for someone to be interested in stddev(DISTINCT x).
> Explicitly checking for and rejecting it doesn't serve any useful
> purpose that I can see, beyond compliance with the letter of the
> standard -- if the user asks for stddev(DISTINCT x), are we really
> providing useful behavior if we refuse to calculate it?

Agreed, refusing this is not something we should waste code on.

regards, tom lane



Re: [PATCHES] variance aggregates per SQL:2003

2006-03-07 Thread David Fetter
On Tue, Mar 07, 2006 at 07:56:06PM -0500, Neil Conway wrote:
> On Tue, 2006-03-07 at 16:36 -0800, David Fetter wrote:
> > The rationale is kinda mathematical.  A measure of deviation from
> > central tendency (i.e. variance or stddev) is something where you
> > probably don't want to normalize the weights.
> > 
> > For example, the standard deviation of {0,1,1,1,2} is about 0.707,
> > but the standard deviation of {0,1,2} is 1.
> 
> Well, I realize that stddev(DISTINCT x) != stddev(x) and that most
> people are going to be interested in stddev(x), but I don't think
> it's inconceivable for someone to be interested in stddev(DISTINCT
> x).

Not inconceivable.  Just really hard to justify unless you're trying
to fudge a number ;)

> Explicitly checking for and rejecting it doesn't serve any useful
> purpose that I can see, beyond compliance with the letter of the
> standard -- if the user asks for stddev(DISTINCT x), are we really
> providing useful behavior if we refuse to calculate it?

Nope.  I was just coming up with a rationale for why the standard
disallows it :)

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778

Remember to vote!



Re: [PATCHES] variance aggregates per SQL:2003

2006-03-07 Thread Neil Conway
On Tue, 2006-03-07 at 16:36 -0800, David Fetter wrote:
> The rationale is kinda mathematical.  A measure of deviation from
> central tendency (i.e. variance or stddev) is something where you
> probably don't want to normalize the weights.
> 
> For example, the standard deviation of {0,1,1,1,2} is about 0.707, but
> the standard deviation of {0,1,2} is 1.

Well, I realize that stddev(DISTINCT x) != stddev(x) and that most
people are going to be interested in stddev(x), but I don't think it's
inconceivable for someone to be interested in stddev(DISTINCT x).
Explicitly checking for and rejecting it doesn't serve any useful
purpose that I can see, beyond compliance with the letter of the
standard -- if the user asks for stddev(DISTINCT x), are we really
providing useful behavior if we refuse to calculate it?

-Neil





Re: [PATCHES] variance aggregates per SQL:2003

2006-03-07 Thread David Fetter
On Tue, Mar 07, 2006 at 05:54:00PM -0500, Neil Conway wrote:
> This patch implements some new aggregate functions defined by SQL2003:
> stddev_pop(), stddev_samp(), var_pop(), and var_samp(). stddev_samp()
> and var_samp() are identical to the existing stddev() and variance()
> aggregates, so I've made the latter aliases for the former.
> 
> I noticed that SQL2003 does not allow DISTINCT to be specified for these
> aggregate functions. I can't really see the rationale for this
> restriction, and it would be fairly ugly to implement as far as I can
> tell. Thoughts?
> 

The rationale is kinda mathematical.  A measure of deviation from
central tendency (i.e. variance or stddev) is something where you
probably don't want to normalize the weights.

For example, the standard deviation of {0,1,1,1,2} is about 0.707, but
the standard deviation of {0,1,2} is 1.
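
Those two figures are sample standard deviations (the n-1 form, which is
what stddev() and stddev_samp() compute). A rough sketch to check the
arithmetic, using Python's statistics module as a stand-in rather than
anything from the patch; the variable names are made up for illustration:

```python
import statistics

# The example multiset, and what DISTINCT would hand to the aggregate.
values = [0, 1, 1, 1, 2]
distinct_values = sorted(set(values))  # [0, 1, 2]

# Sample standard deviation (divides by n - 1), analogous to stddev_samp().
samp_all = statistics.stdev(values)
samp_distinct = statistics.stdev(distinct_values)

# Population standard deviation (divides by n), analogous to stddev_pop().
pop_all = statistics.pstdev(values)
pop_distinct = statistics.pstdev(distinct_values)

print(f"sample:     all={samp_all:.3f}  distinct={samp_distinct:.3f}")
print(f"population: all={pop_all:.3f}  distinct={pop_distinct:.3f}")
```

The population forms come out to roughly 0.632 and 0.816 for the same
inputs, so dropping the duplicates changes the answer either way.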

Cheers,
D (still hoping for some way to extend stddev, etc. to intervals)
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778

Remember to vote!
