Re: [HACKERS] Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]

2013-07-16 Thread Noah Misch
On Mon, Jul 15, 2013 at 11:43:04AM -0700, David Fetter wrote:
 On Sun, Jul 14, 2013 at 10:15:12PM -0400, Noah Misch wrote:
  See attached patch revisions.  The first patch edits 
  find_minmax_aggs_walker()
  per my comments just now.  The second is an update of your FILTER patch with
  the changes to which I alluded above; it applies atop the first patch.  
  Would
  you verify that I didn't ruin anything?  Barring problems, will commit.

 Tested your changes.  They pass regression, etc. :)

Committed.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]

2013-07-15 Thread David Fetter
On Sun, Jul 14, 2013 at 10:15:12PM -0400, Noah Misch wrote:
 On Sun, Jul 07, 2013 at 06:37:26PM -0700, David Fetter wrote:
  On Sat, Jul 06, 2013 at 11:49:21AM +0100, Dean Rasheed wrote:
   Overall I think this patch offers useful additional functionality, in
   compliance with the SQL spec, which should be handy to simplify
   complex grouping queries.
 
 As I understand this feature, it is syntactic sugar for the typical case of an
 aggregate with a strict transition function.  For example, min(x) FILTER
 (WHERE y  0) is rigorously equivalent to min(CASE y  0 THEN x END).
 Every SQL aggregate is strict (ISO 9075-2 section 4.15.4), so for standard
 queries it is *only* syntactic sugar.  In PostgreSQL, it lets you do novel
 things with, for example, array_agg().  Is that accurate?
 
   I think this is ready for committer.
 
 The patch was thorough.  I updated applicable comments, revisited some
 cosmetic choices, and made these functional changes:
 
 1. The pg_stat_statements query jumble should incorporate the filter.
 
 2. The patch did not update costing.  I made it add the cost of the filter
 expression the same way we add the cost of the argument expressions.  This
 makes min(x) FILTER (WHERE y  0) match min(case y  0 THEN x end) in
 terms of cost, which is a fair start.  At some point, we could do better by
 reducing the argument cost by the filter selectivity.
 
 
 A few choices/standard interpretations may deserve discussion.  The patch
 makes filter clauses contribute to the subquery level chosen to be the
 aggregation query.  This is observable through the behavior of these two
 standard-conforming queries:
 
 select (select count(outer_c)
 from (values (1)) t0(inner_c))
 from (values (2),(3)) t1(outer_c); -- outer query is aggregation query
 select (select count(outer_c) filter (where inner_c  0)
 from (values (1)) t0(inner_c))
 from (values (2),(3)) t1(outer_c); -- inner query is aggregation query
 
 I believe SQL (ISO 9075-2 section 6.9 SR 3,4,6) does require this.  Since that
 still isn't crystal-clear to me, I mention it in case anyone has a different
 reading.
 
 Distinct from that, albeit in a similar vein, SQL does not permit outer
 references in a filter clause.  This patch permits them; I think that
 qualifies as a reasonable PostgreSQL extension.
 
  --- a/doc/src/sgml/keywords.sgml
  +++ b/doc/src/sgml/keywords.sgml
 
  @@ -3200,7 +3200,7 @@
  /row
  row
   entrytokenOVER/token/entry
  -entryreserved (can be function or type)/entry
  +entrynon-reserved/entry
 
 I committed this one-line correction separately.
 
  --- a/src/backend/optimizer/plan/planagg.c
  +++ b/src/backend/optimizer/plan/planagg.c
  @@ -314,7 +314,7 @@ find_minmax_aggs_walker(Node *node, List **context)
  ListCell   *l;
   
  Assert(aggref-agglevelsup == 0);
  -   if (list_length(aggref-args) != 1 || aggref-aggorder != NIL)
  +   if (list_length(aggref-args) != 1 || aggref-aggorder != NIL 
  || aggref-agg_filter != NULL)
  return true;/* it couldn't be MIN/MAX */
  /* note: we do not care if DISTINCT is mentioned ... */
 
 I twitched upon reading this, because neither ORDER BY nor FILTER preclude the
 aggregate being MIN or MAX.  Perhaps Andrew can explain why he put aggorder
 there back in 2009.  All I can figure is that writing max(c ORDER BY x) is so
 unlikely that we'd too often waste the next syscache lookup.  But the same
 argument would apply to DISTINCT.  With FILTER, the rationale is entirely
 different.  The aggregate could well be MIN/MAX; we just haven't implemented
 the necessary support elsewhere in this file.
 
 
 See attached patch revisions.  The first patch edits find_minmax_aggs_walker()
 per my comments just now.  The second is an update of your FILTER patch with
 the changes to which I alluded above; it applies atop the first patch.  Would
 you verify that I didn't ruin anything?  Barring problems, will commit.
 
 Are you the sole named author of this patch?  That's what the CF page says,
 but that wasn't fully clear to me from the list discussions.
 
 Thanks,
 nm

Tested your changes.  They pass regression, etc. :)

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]

2013-07-15 Thread Noah Misch
On Mon, Jul 15, 2013 at 06:56:17PM +, Andrew Gierth wrote:
 Noah Misch said:
 
  I twitched upon reading this, because neither ORDER BY nor FILTER preclude
  the aggregate being MIN or MAX.  Perhaps Andrew can explain why he put
  aggorder there back in 2009.
 
 The bottom line is that I intentionally avoided assuming that an agg with an
 aggsortop could only be min() or max() and that having an order by clause
 would always be harmless in such cases. I can't think of an actual use case
 where it would matter, but I've seen people define some pretty strange aggs
 recently.
 
 So I mildly object to simply throwing away the ORDER BY clause in such cases.

I can't think of another use for aggsortop as defined today.  However, on
further reflection, min(x ORDER BY y) is not identical to min(x) when the
B-tree operator class of the aggsortop can find non-identical datums to be
equal.  This affects at least min(numeric) and min(interval).  min(x) chooses
an unspecified x among those equal to the smallest x, while min(x ORDER BY y)
can be used to narrow the choice.  I will update the comments along those
lines and not change semantics after all.

Thanks,
nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]

2013-07-14 Thread Noah Misch
On Sun, Jul 07, 2013 at 06:37:26PM -0700, David Fetter wrote:
 On Sat, Jul 06, 2013 at 11:49:21AM +0100, Dean Rasheed wrote:
  Overall I think this patch offers useful additional functionality, in
  compliance with the SQL spec, which should be handy to simplify
  complex grouping queries.

As I understand this feature, it is syntactic sugar for the typical case of an
aggregate with a strict transition function.  For example, min(x) FILTER
(WHERE y  0) is rigorously equivalent to min(CASE y  0 THEN x END).
Every SQL aggregate is strict (ISO 9075-2 section 4.15.4), so for standard
queries it is *only* syntactic sugar.  In PostgreSQL, it lets you do novel
things with, for example, array_agg().  Is that accurate?

  I think this is ready for committer.

The patch was thorough.  I updated applicable comments, revisited some
cosmetic choices, and made these functional changes:

1. The pg_stat_statements query jumble should incorporate the filter.

2. The patch did not update costing.  I made it add the cost of the filter
expression the same way we add the cost of the argument expressions.  This
makes min(x) FILTER (WHERE y  0) match min(case y  0 THEN x end) in
terms of cost, which is a fair start.  At some point, we could do better by
reducing the argument cost by the filter selectivity.


A few choices/standard interpretations may deserve discussion.  The patch
makes filter clauses contribute to the subquery level chosen to be the
aggregation query.  This is observable through the behavior of these two
standard-conforming queries:

select (select count(outer_c)
from (values (1)) t0(inner_c))
from (values (2),(3)) t1(outer_c); -- outer query is aggregation query
select (select count(outer_c) filter (where inner_c  0)
from (values (1)) t0(inner_c))
from (values (2),(3)) t1(outer_c); -- inner query is aggregation query

I believe SQL (ISO 9075-2 section 6.9 SR 3,4,6) does require this.  Since that
still isn't crystal-clear to me, I mention it in case anyone has a different
reading.

Distinct from that, albeit in a similar vein, SQL does not permit outer
references in a filter clause.  This patch permits them; I think that
qualifies as a reasonable PostgreSQL extension.

 --- a/doc/src/sgml/keywords.sgml
 +++ b/doc/src/sgml/keywords.sgml

 @@ -3200,7 +3200,7 @@
 /row
 row
  entrytokenOVER/token/entry
 -entryreserved (can be function or type)/entry
 +entrynon-reserved/entry

I committed this one-line correction separately.

 --- a/src/backend/optimizer/plan/planagg.c
 +++ b/src/backend/optimizer/plan/planagg.c
 @@ -314,7 +314,7 @@ find_minmax_aggs_walker(Node *node, List **context)
   ListCell   *l;
  
   Assert(aggref-agglevelsup == 0);
 - if (list_length(aggref-args) != 1 || aggref-aggorder != NIL)
 + if (list_length(aggref-args) != 1 || aggref-aggorder != NIL 
 || aggref-agg_filter != NULL)
   return true;/* it couldn't be MIN/MAX */
   /* note: we do not care if DISTINCT is mentioned ... */

I twitched upon reading this, because neither ORDER BY nor FILTER preclude the
aggregate being MIN or MAX.  Perhaps Andrew can explain why he put aggorder
there back in 2009.  All I can figure is that writing max(c ORDER BY x) is so
unlikely that we'd too often waste the next syscache lookup.  But the same
argument would apply to DISTINCT.  With FILTER, the rationale is entirely
different.  The aggregate could well be MIN/MAX; we just haven't implemented
the necessary support elsewhere in this file.


See attached patch revisions.  The first patch edits find_minmax_aggs_walker()
per my comments just now.  The second is an update of your FILTER patch with
the changes to which I alluded above; it applies atop the first patch.  Would
you verify that I didn't ruin anything?  Barring problems, will commit.

Are you the sole named author of this patch?  That's what the CF page says,
but that wasn't fully clear to me from the list discussions.

Thanks,
nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com
*** a/src/backend/optimizer/plan/planagg.c
--- b/src/backend/optimizer/plan/planagg.c
***
*** 314,328  find_minmax_aggs_walker(Node *node, List **context)
ListCell   *l;
  
Assert(aggref-agglevelsup == 0);
!   if (list_length(aggref-args) != 1 || aggref-aggorder != NIL)
return true;/* it couldn't be MIN/MAX */
!   /* note: we do not care if DISTINCT is mentioned ... */
!   curTarget = (TargetEntry *) linitial(aggref-args);
  
aggsortop = fetch_agg_sort_op(aggref-aggfnoid);
if (!OidIsValid(aggsortop))
return true;/* not a MIN/MAX aggregate */
  
if (contain_mutable_functions((Node *) curTarget-expr))
return true;/* not potentially 

[HACKERS] Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]

2013-07-14 Thread David Fetter
On Sun, Jul 14, 2013 at 10:15:12PM -0400, Noah Misch wrote:
 On Sun, Jul 07, 2013 at 06:37:26PM -0700, David Fetter wrote:
  On Sat, Jul 06, 2013 at 11:49:21AM +0100, Dean Rasheed wrote:
   Overall I think this patch offers useful additional functionality, in
   compliance with the SQL spec, which should be handy to simplify
   complex grouping queries.
 
 As I understand this feature, it is syntactic sugar for the typical case of an
 aggregate with a strict transition function.  For example, min(x) FILTER
 (WHERE y  0) is rigorously equivalent to min(CASE y  0 THEN x END).
 Every SQL aggregate is strict (ISO 9075-2 section 4.15.4), so for standard
 queries it is *only* syntactic sugar.  In PostgreSQL, it lets you do novel
 things with, for example, array_agg().  Is that accurate?
 
   I think this is ready for committer.
 
 The patch was thorough.  I updated applicable comments, revisited some
 cosmetic choices, and made these functional changes:
 
 1. The pg_stat_statements query jumble should incorporate the filter.
 
 2. The patch did not update costing.  I made it add the cost of the filter
 expression the same way we add the cost of the argument expressions.  This
 makes min(x) FILTER (WHERE y  0) match min(case y  0 THEN x end) in
 terms of cost, which is a fair start.  At some point, we could do better by
 reducing the argument cost by the filter selectivity.

Thanks!

 A few choices/standard interpretations may deserve discussion.  The patch
 makes filter clauses contribute to the subquery level chosen to be the
 aggregation query.  This is observable through the behavior of these two
 standard-conforming queries:
 
 select (select count(outer_c)
 from (values (1)) t0(inner_c))
 from (values (2),(3)) t1(outer_c); -- outer query is aggregation query
 select (select count(outer_c) filter (where inner_c  0)
 from (values (1)) t0(inner_c))
 from (values (2),(3)) t1(outer_c); -- inner query is aggregation query
 
 I believe SQL (ISO 9075-2 section 6.9 SR 3,4,6) does require this.  Since that
 still isn't crystal-clear to me, I mention it in case anyone has a different
 reading.
 
 Distinct from that, albeit in a similar vein, SQL does not permit outer
 references in a filter clause.  This patch permits them; I think that
 qualifies as a reasonable PostgreSQL extension.

Thanks again.

  --- a/doc/src/sgml/keywords.sgml
  +++ b/doc/src/sgml/keywords.sgml
 
  @@ -3200,7 +3200,7 @@
  /row
  row
   entrytokenOVER/token/entry
  -entryreserved (can be function or type)/entry
  +entrynon-reserved/entry
 
 I committed this one-line correction separately.
 
  --- a/src/backend/optimizer/plan/planagg.c
  +++ b/src/backend/optimizer/plan/planagg.c
  @@ -314,7 +314,7 @@ find_minmax_aggs_walker(Node *node, List **context)
  ListCell   *l;
   
  Assert(aggref-agglevelsup == 0);
  -   if (list_length(aggref-args) != 1 || aggref-aggorder != NIL)
  +   if (list_length(aggref-args) != 1 || aggref-aggorder != NIL 
  || aggref-agg_filter != NULL)
  return true;/* it couldn't be MIN/MAX */
  /* note: we do not care if DISTINCT is mentioned ... */
 
 I twitched upon reading this, because neither ORDER BY nor FILTER preclude the
 aggregate being MIN or MAX.  Perhaps Andrew can explain why he put aggorder
 there back in 2009.  All I can figure is that writing max(c ORDER BY x) is so
 unlikely that we'd too often waste the next syscache lookup.  But the same
 argument would apply to DISTINCT.  With FILTER, the rationale is entirely
 different.  The aggregate could well be MIN/MAX; we just haven't implemented
 the necessary support elsewhere in this file.

Excellent reasoning, and good catch.

 See attached patch revisions.  The first patch edits
 find_minmax_aggs_walker() per my comments just now.  The second is
 an update of your FILTER patch with the changes to which I alluded
 above; it applies atop the first patch.  Would you verify that I
 didn't ruin anything?  Barring problems, will commit.
 
 Are you the sole named author of this patch?  That's what the CF
 page says, but that wasn't fully clear to me from the list
 discussions.

While Andrew's help was invaluable and pervasive, I wrote the patch,
and all flaws in it are my fault.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]

2013-06-27 Thread David Fetter
On Thu, Jun 27, 2013 at 08:41:59AM +, Andrew Gierth wrote:
 Tom Lane said:
  Agreed, separating out the function-call-with-trailing-declaration
  syntaxes so they aren't considered in FROM and index_elem seems
  like the best compromise.
 
  If we do that for window function OVER clauses as well, can we
  make OVER less reserved?
 
 Yes.
 
 At least, I tried it with both OVER and FILTER unreserved and there
 were no grammar conflicts (and I didn't have to do anything fancy to
 avoid them), and it passed regression with the exception of the
 changed error message for window functions in the from-clause.
 
 So is this the final decision on how to proceed? It seems good to
 me, and I can work with David to get it done.

If this is really the direction people want to go, I'm in.  Is there
some code I can look at?

I still submit that having our reserved word ducks in a row in advance
is a saner way to go about this, and will work up a patch for that as
I have time.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]

2013-06-24 Thread Greg Stark
On Mon, Jun 24, 2013 at 3:50 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Or maybe they really don't give a damn about breaking
 applications every time they invent a new reserved word?

I think this is the obvious conclusion. In the standard the reserved
words are pretty explicitly reserved and not legal column names, no?

I think their model is that applications work with a certain version
of SQL and they're not expected to work with a new version without
extensive updating.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]

2013-06-24 Thread Tom Lane
Greg Stark st...@mit.edu writes:
 I think their model is that applications work with a certain version
 of SQL and they're not expected to work with a new version without
 extensive updating.

Hm.  We could invent a sql_version parameter and tweak the lexer to
return keywords added in spec versions later than that as IDENT.
However, I fear such a parameter would be a major PITA from the user's
standpoint, just like most of our backwards-compatibility GUCs have
proven to be.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers