Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-29 Thread David Rowley
Tom Lane Wrote:
 Well, this certainly demonstrates that the check I added to
 parseCheckAggregates is wrongly placed, but I'm not sure we really
 need to forbid the case.  David's example query seems to give sane
 answers once the bug in begin_partition is fixed:
 
  parentpart | childpart | quantity | rn
 +---+--+
  KITCHEN| TABLE |1 |  1
  KITCHEN| COOKER|1 |  2
  KITCHEN| FRIDGE|1 |  3
  TABLE  | CHAIR |4 |  1
  CHAIR  | LEG   |4 |  1
 (5 rows)
 

For what it's worth I've been looking into how DB2 and Sybase handle this.

DB2 seems to disallow any functions in the SELECT list of the recursive part
of the query. Error message is a little long winded to show here. It's also
very generic and also covers GROUP Bys and HAVINGs saying that they're also
not allowed.

However, Sybase does allow this query. I did modify the window's ORDER BY as
previously the order was undefined. The results match PostgreSQL.


Also while testing I noticed that this query didn't error out when it should
have: (Of course I only noticed because Sybase did)


WITH RECURSIVE bom(parentpart,childpart,quantity,rn) AS (
  SELECT parentpart,childpart,quantity,ROW_NUMBER() OVER (ORDER BY
parentpart,childpart)
  FROM billofmaterials
  WHERE parentpart = 'KITCHEN'
UNION ALL
  SELECT b.parentpart,b.childpart,b.quantity,ROW_NUMBER() OVER (ORDER BY
parentpart,childpart)
  FROM billofmaterials b,bom
  WHERE b.parentpart = bom.childpart
)
SELECT * FROM bom;


Notice the ORDER BY in the recursive part of the query orders by an
ambiguous column without complaint. If I replace b.quantity with just
quantity it does error there. So seems to just not be picking up the problem
in the window clause.

David.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-29 Thread Tom Lane
David Rowley dgrow...@gmail.com writes:
 Also while testing I noticed that this query didn't error out when it should
 have: (Of course I only noticed because Sybase did)

 WITH RECURSIVE bom(parentpart,childpart,quantity,rn) AS (
   SELECT parentpart,childpart,quantity,ROW_NUMBER() OVER (ORDER BY
 parentpart,childpart)
   FROM billofmaterials
   WHERE parentpart = 'KITCHEN'
 UNION ALL
   SELECT b.parentpart,b.childpart,b.quantity,ROW_NUMBER() OVER (ORDER BY
 parentpart,childpart)
   FROM billofmaterials b,bom
   WHERE b.parentpart = bom.childpart
 )
 SELECT * FROM bom;

 Notice the ORDER BY in the recursive part of the query orders by an
 ambiguous column without complaint.

Actually, it's not ambiguous according to our interpretation of ORDER BY
clauses: the first attempt is to match an output column name, so it's
seizing on the first SELECT column (b.parentpart) as being the intended
sort key for parentpart, and similarly for childpart.  You'd get the
same thing if you did ORDER BY 1,2.

We could disable all those special rules for window cases, but then we'd
have to document how window ORDER BY is different from query ORDER BY,
etc.  I think it'd be more confusing not less so.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-29 Thread David Rowley
Tom Lane Wrote: 
 Actually, it's not ambiguous according to our interpretation of ORDER BY
 clauses: the first attempt is to match an output column name, so it's
 seizing on the first SELECT column (b.parentpart) as being the intended
 sort key for parentpart, and similarly for childpart.  You'd get the
 same thing if you did ORDER BY 1,2.
 

Aha, I see. Well I learned something new tonight. I didn't realise that
Postgres would be that intelligent about it. It makes total sense. I
probably should have known this, but I'll forgive myself as I'm one of these
people that will prefix all column names when joining in case I ever add new
conflicting columns. end of excuse

 We could disable all those special rules for window cases, but then we'd
 have to document how window ORDER BY is different from query ORDER BY,
 etc.  I think it'd be more confusing not less so.
 

Without any further thought, I vote to leave it how it is.

David.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-28 Thread David Rowley
Tom Lane Wrote:
 I've spent quite a bit of time reviewing the window functions patch,
 and I think it is now ready to commit, other than the documentation
 (which I've not looked at yet at all).  Attached is my current patch
 against HEAD, sans documentation.  This incorporates the recently
 discussed aggregate-function API changes and support for tuplestore
 trimming.  There's a number of things that could be improved yet:
   * we really ought to have some support for non-built-in
 window functions
   * I think the planner could be a bit smarter about when to
 sort or not
   * tuplestore_advance and related code really needs to be made
 more efficient; it didn't matter much before but it does now
 but I think these things can be worked on after the core patch is
 committed.
 
   regards, tom lane

I've started running my test queries that I used when reviewing the patch.
The following crashes the backend:

CREATE TABLE billofmaterials (
  parentpart VARCHAR(20) NOT NULL,
  childpart VARCHAR(20) NOT NULL,
  quantity FLOAT NOT NULL,
  CHECK(quantity  0),
  PRIMARY KEY(parentpart, childpart)
);

INSERT INTO billofmaterials VALUES('KITCHEN','TABLE',1);
INSERT INTO billofmaterials VALUES('KITCHEN','COOKER',1);
INSERT INTO billofmaterials VALUES('KITCHEN','FRIDGE',1);
INSERT INTO billofmaterials VALUES('TABLE','CHAIR',4);
INSERT INTO billofmaterials VALUES('CHAIR','LEG',4);


WITH RECURSIVE bom AS (
  SELECT parentpart,childpart,quantity,ROW_NUMBER() OVER (ORDER BY
parentpart DESC) rn
  FROM billofmaterials
  WHERE parentpart = 'KITCHEN'
  UNION ALL
  SELECT b.parentpart,b.childpart,b.quantity,ROW_NUMBER() OVER (ORDER BY
parentpart ASC) rn
  FROM billofmaterials b
  INNER JOIN bom ON b.parentpart = bom.childpart
)
SELECT * from bom;

It seems not to like recursively calling row_number(). It does not crash if
I replace the 2nd row_number() with the constant 1


I compared everything to Oracle again and found no differences in results.
These tests test all window functions in some way or another. I compared all
results to Oracle 10g results apart from the queries that have NTH_VALUE as
this is not implemented by Oracle 10g. Also seems like NTH_VALUE is not
implemented by DB2 9.5 either. Anyone know of any database that does have
NTH_VALUE?

David.






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-28 Thread Hitoshi Harada
2008/12/28 David Rowley dgrow...@gmail.com:
 Tom Lane Wrote:
 I've spent quite a bit of time reviewing the window functions patch,
 and I think it is now ready to commit, other than the documentation
 (which I've not looked at yet at all).  Attached is my current patch
 against HEAD, sans documentation.  This incorporates the recently
 discussed aggregate-function API changes and support for tuplestore
 trimming.  There's a number of things that could be improved yet:
   * we really ought to have some support for non-built-in
 window functions
   * I think the planner could be a bit smarter about when to
 sort or not
   * tuplestore_advance and related code really needs to be made
 more efficient; it didn't matter much before but it does now
 but I think these things can be worked on after the core patch is
 committed.

   regards, tom lane

 I've started running my test queries that I used when reviewing the patch.
 The following crashes the backend:

 CREATE TABLE billofmaterials (
  parentpart VARCHAR(20) NOT NULL,
  childpart VARCHAR(20) NOT NULL,
  quantity FLOAT NOT NULL,
  CHECK(quantity  0),
  PRIMARY KEY(parentpart, childpart)
 );

 INSERT INTO billofmaterials VALUES('KITCHEN','TABLE',1);
 INSERT INTO billofmaterials VALUES('KITCHEN','COOKER',1);
 INSERT INTO billofmaterials VALUES('KITCHEN','FRIDGE',1);
 INSERT INTO billofmaterials VALUES('TABLE','CHAIR',4);
 INSERT INTO billofmaterials VALUES('CHAIR','LEG',4);


 WITH RECURSIVE bom AS (
  SELECT parentpart,childpart,quantity,ROW_NUMBER() OVER (ORDER BY
 parentpart DESC) rn
  FROM billofmaterials
  WHERE parentpart = 'KITCHEN'
  UNION ALL
  SELECT b.parentpart,b.childpart,b.quantity,ROW_NUMBER() OVER (ORDER BY
 parentpart ASC) rn
  FROM billofmaterials b
  INNER JOIN bom ON b.parentpart = bom.childpart
 )
 SELECT * from bom;

 It seems not to like recursively calling row_number(). It does not crash if
 I replace the 2nd row_number() with the constant 1


It seems that parseCheckWindowFuncs() doesn't check CTE case whereas
parseCheckAggregates() checks it, including check of window functions.
So the urgent patch is as following, but is this operation allowed? I
am blind in CTE rules...


*** src/backend/parser/parse_agg.c.orig Sun Dec 28 15:34:21 2008
--- src/backend/parser/parse_agg.c  Mon Dec 29 01:13:16 2008
***
*** 336,347 
 errmsg(aggregate functions not allowed in a 
recursive query's
recursive term),
 parser_errposition(pstate,

locate_agg_of_level((Node *) qry, 0;
-   if (pstate-p_hasWindowFuncs  hasSelfRefRTEs)
-   ereport(ERROR,
-   (errcode(ERRCODE_INVALID_RECURSION),
-errmsg(window functions not allowed in a 
recursive query's
recursive term),
-parser_errposition(pstate,
-   
locate_windowfunc((Node *) qry;
  }

  /*
--- 336,341 
***
*** 357,366 
--- 351,374 
  parseCheckWindowFuncs(ParseState *pstate, Query *qry)
  {
ListCell   *l;
+   boolhasSelfRefRTEs;

/* This should only be called if we found window functions */
Assert(pstate-p_hasWindowFuncs);

+   /*
+* Scan the range table to see if there are JOIN or self-reference CTE
+* entries.  We'll need this info below.
+*/
+   hasSelfRefRTEs = false;
+   foreach(l, pstate-p_rtable)
+   {
+   RangeTblEntry *rte = (RangeTblEntry *) lfirst(l);
+
+   if (rte-rtekind != RTE_JOIN  rte-rtekind == RTE_CTE 
rte-self_reference)
+   hasSelfRefRTEs = true;
+   }
+
if (checkExprHasWindowFuncs(qry-jointree-quals))
ereport(ERROR,
(errcode(ERRCODE_WINDOWING_ERROR),
***
*** 426,431 
--- 434,446 

locate_windowfunc(expr;
}
}
+
+   if (pstate-p_hasWindowFuncs  hasSelfRefRTEs)
+   ereport(ERROR,
+   (errcode(ERRCODE_INVALID_RECURSION),
+errmsg(window functions not allowed in a 
recursive query's
recursive term),
+parser_errposition(pstate,
+   
locate_windowfunc((Node *) qry;
  }

  /*

Regards,

-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-28 Thread Tom Lane
David Rowley dgrow...@gmail.com writes:
 I've started running my test queries that I used when reviewing the patch.
 The following crashes the backend:

Fixed, thanks.  (begin_partition was failing to reset spooled_rows when
falling out early because of empty outerplan; which could only cause an
issue if the outerplan changed from nonempty to empty during a ReScan.
This was true before, not sure why it failed to crash for you before...)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-28 Thread Tom Lane
Hitoshi Harada umi.tan...@gmail.com writes:
 I ran the patch witouht any errors. Though it's trivial, I noticed
 window_gettupleslot has to be fixed a bit.

Yeah, it could uselessly spool the partition before failing.  I think
it had been that way before and I left it alone, but changing it is
good --- diff included in my patch.  I'm still working on the docs but
expect to commit later today.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-28 Thread David Rowley
Hitoshi Harada wrote:
  WITH RECURSIVE bom AS (
   SELECT parentpart,childpart,quantity,ROW_NUMBER() OVER (ORDER BY
  parentpart DESC) rn
   FROM billofmaterials
   WHERE parentpart = 'KITCHEN'
   UNION ALL
   SELECT b.parentpart,b.childpart,b.quantity,ROW_NUMBER() OVER (ORDER BY
  parentpart ASC) rn
   FROM billofmaterials b
   INNER JOIN bom ON b.parentpart = bom.childpart
  )
  SELECT * from bom;
 
  It seems not to like recursively calling row_number(). It does not crash
 if
  I replace the 2nd row_number() with the constant 1
 
 
 It seems that parseCheckWindowFuncs() doesn't check CTE case whereas
 parseCheckAggregates() checks it, including check of window functions.
 So the urgent patch is as following, but is this operation allowed? I
 am blind in CTE rules...
 
 

I should have said that this is the first time I've seen a crash running
this query. 

I only ever ran this query to verify that the backend didn't crash. I didn't
ever pay much attention to the results. Do you have an older version that
you can verify if it worked as it should have or not?

Your final version won't patch with CVS head anymore.

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-28 Thread Tom Lane
Hitoshi Harada umi.tan...@gmail.com writes:
 2008/12/28 David Rowley dgrow...@gmail.com:
 I've started running my test queries that I used when reviewing the patch.
 The following crashes the backend:

 It seems that parseCheckWindowFuncs() doesn't check CTE case whereas
 parseCheckAggregates() checks it, including check of window functions.
 So the urgent patch is as following, but is this operation allowed? I
 am blind in CTE rules...

Well, this certainly demonstrates that the check I added to
parseCheckAggregates is wrongly placed, but I'm not sure we really
need to forbid the case.  David's example query seems to give sane
answers once the bug in begin_partition is fixed:

 parentpart | childpart | quantity | rn 
+---+--+
 KITCHEN| TABLE |1 |  1
 KITCHEN| COOKER|1 |  2
 KITCHEN| FRIDGE|1 |  3
 TABLE  | CHAIR |4 |  1
 CHAIR  | LEG   |4 |  1
(5 rows)

I basically went around and made sure everyplace that threw an
error for aggregates also threw one for window functions, but
it's quite possible that that's overly restrictive in some places.
Window functions don't cause grouping so there's no reason to
forbid them in places where it's the grouping behavior that
is objectionable.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-28 Thread Tom Lane
... and it's committed.  Congratulations!

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-28 Thread Hitoshi Harada
2008/12/29 Tom Lane t...@sss.pgh.pa.us:
 ... and it's committed.  Congratulations!

regards, tom lane


Great! I am really glad I can contribute PostgreSQL project by such a
big improvement.

And I really thank all the hackers, without all the helps by you it
wouldn't be done, obviously. Thank you.


Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-27 Thread Hitoshi Harada
2008/12/28 Tom Lane t...@sss.pgh.pa.us:
 I've spent quite a bit of time reviewing the window functions patch,
 and I think it is now ready to commit, other than the documentation
 (which I've not looked at yet at all).  Attached is my current patch
 against HEAD, sans documentation.  This incorporates the recently
 discussed aggregate-function API changes and support for tuplestore
 trimming.  There's a number of things that could be improved yet:
* we really ought to have some support for non-built-in
  window functions
* I think the planner could be a bit smarter about when to
  sort or not
* tuplestore_advance and related code really needs to be made
  more efficient; it didn't matter much before but it does now
 but I think these things can be worked on after the core patch is
 committed.


I ran the patch witouht any errors. Though it's trivial, I noticed
window_gettupleslot has to be fixed a bit.

diff src/backend/executor/nodeWindowAgg.c.orig
src/backend/executor/nodeWindowAgg.c
1445a1446,1449
   /* pos = -1 means special on spool_tuples(), so check it before */
   if (pos  0)
   return false;

1449c1453
   if (pos = winstate-spooled_rows || pos  0)
---
   if (pos = winstate-spooled_rows)


Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-21 Thread Tom Lane
Hitoshi Harada umi.tan...@gmail.com writes:
 2008/12/20 Tom Lane t...@sss.pgh.pa.us:
 I've been studying the grammar for the windowing patch a bit.  It seems
 to me that the existing window name option for window specification
 got left out.

 I completely missed this issue. If the existing window name is
 allowed in window clause, then does it mean this is possible?

 SELECT row_number() OVER w2 FROM t
 WINDOW w1 AS (PARTITION BY grp), w2(w1)

Yes, I read the spec to allow that.  It's a bit pointless unless w2
modifies w1 somehow, though.

 And what if w1 refers to w2 and w2 refers to w1 cyclicly?

Not allowed --- the scope of a window name doesn't include previous
elements of the WINDOW list, so a forward reference isn't valid.

 And from
 what I read the spec, it seems to me that it effects only frame clause
 which is unlikely implemented for 8.4, because if existing window
 name) is specified then partition clause and order by clause are
 both permitted in the window definition.

No, you're allowed to substitute a new order by clause.  The useful
case seems to be that you have a base definition giving a partitioning
and then several derived windows with different orderings and/or
different frame clauses.  I'm not really sure why they bothered to set
up the syntax like that, but the spec is pretty clear about it...

 I am also fairly strongly inclined to rip out all of the frame_clause
 syntax, since (1) it's unimplemented and unlikely to get implemented
 for 8.4, and (2) it's not particularly satisfactory anyway.

 The reason I added those grammer was 1) there had been possibilities
 to implement frame clause, and 2) to support not-implemented error. 1)
 is unlikey to be done and for 2) we need only rows/range syntax. So if
 it is annoying for 8.4 release, I don't disagree with your suggestion.

I've been hacking on this and I have a grammar that pretty much works,
but there's some bizarreness around UNBOUNDED.  I'll post it later.
There's still a a question whether we should introduce a lot of new
keywords and productions now to support a feature we don't intend to
implement in 8.4.  There's a distributed speed cost for each keyword
and each production; I don't really see the argument why users
should pay even a small cost for zero benefit in this release.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-21 Thread Tom Lane
I wrote:
 I've been hacking on this and I have a grammar that pretty much works,
 but there's some bizarreness around UNBOUNDED.  I'll post it later.

Here is a proof-of-concept grammar patch that allows frame_bound to use
a_expr instead of a hacked-up constant production (which, as I
complained before, didn't allow all the cases demanded by the spec).
The key changes involved are:

* BETWEEN has to become a fully reserved word instead of type_func_name.

* PARTITION RANGE ROWS FOLLOWING PRECEDING have to be assigned the same
explicit precedence as IDENT.  (This doesn't have any bad consequences
AFAIK, except that you can't use a postfix operator in frame_bound exprs
unless you parenthesize it.)

* UNBOUNDED has to be assigned a precedence slightly lower than
PRECEDING and FOLLOWING, else it's ambiguous whether UNBOUNDED
PRECEDING ought to be parsed as a_expr PRECEDING.  I'm a bit nervous
about this solution; we might someday need to make some of these
keywords at least partly reserved instead.  But right now it doesn't
seem to have any negative consequences.

Making BETWEEN fully reserved is a bit annoying, but it would only
break apps using BETWEEN as a type or function name.  The former doesn't
seem like a problem, the latter might be.

Comments?

regards, tom lane

PS: I've removed some uninteresting hunks to shorten the diff, so you might
get some chatter from patch if you try to apply the diff.

*** src/backend/parser/gram.y   Sat Dec 20 11:02:55 2008
--- ./gram.ySun Dec 21 14:56:17 2008
***
*** 158,163 
--- 158,164 
DefElem *defelt;
OptionDefElem   *optdef;
SortBy  *sortby;
+   WindowDef   *windef;
JoinExpr*jexpr;
IndexElem   *ielem;
Alias   *alias;
***
*** 402,407 
--- 403,414 
  %type with  with_clause
  %type list  cte_list
  
+ %type list  window_clause window_definition_list opt_partition_clause
+ %type windefwindow_definition over_clause window_specification
+ %type str   opt_existing_window_name
+ %type node  opt_frame_clause frame_clause frame_extent
+ %type node  frame_bound opt_frame_exclusion
+ 
  
  /*
   * If you make any token changes, update the keyword table in
***
*** 431,440 
DEFERRABLE DEFERRED DEFINER DELETE_P DELIMITER DELIMITERS DESC
DICTIONARY DISABLE_P DISCARD DISTINCT DO DOCUMENT_P DOMAIN_P DOUBLE_P 
DROP
  
!   EACH ELSE ENABLE_P ENCODING ENCRYPTED END_P ENUM_P ESCAPE EXCEPT 
EXCLUDING
!   EXCLUSIVE EXECUTE EXISTS EXPLAIN EXTERNAL EXTRACT
  
!   FALSE_P FAMILY FETCH FIRST_P FLOAT_P FOR FORCE FOREIGN FORWARD
FREEZE FROM FULL FUNCTION
  
GLOBAL GRANT GRANTED GREATEST GROUP_P
--- 438,447 
DEFERRABLE DEFERRED DEFINER DELETE_P DELIMITER DELIMITERS DESC
DICTIONARY DISABLE_P DISCARD DISTINCT DO DOCUMENT_P DOMAIN_P DOUBLE_P 
DROP
  
!   EACH ELSE ENABLE_P ENCODING ENCRYPTED END_P ENUM_P ESCAPE EXCEPT EXCLUDE
!   EXCLUDING EXCLUSIVE EXECUTE EXISTS EXPLAIN EXTERNAL EXTRACT
  
!   FALSE_P FAMILY FETCH FIRST_P FLOAT_P FOLLOWING FOR FORCE FOREIGN FORWARD
FREEZE FROM FULL FUNCTION
  
GLOBAL GRANT GRANTED GREATEST GROUP_P
***
*** 461,475 
NOT NOTHING NOTIFY NOTNULL NOWAIT NULL_P NULLIF NULLS_P NUMERIC
  
OBJECT_P OF OFF OFFSET OIDS OLD ON ONLY OPERATOR OPTION OPTIONS OR
!   ORDER OUT_P OUTER_P OVERLAPS OVERLAY OWNED OWNER
  
!   PARSER PARTIAL PASSWORD PLACING PLANS POSITION
!   PRECISION PRESERVE PREPARE PREPARED PRIMARY
PRIOR PRIVILEGES PROCEDURAL PROCEDURE
  
QUOTE
  
!   READ REAL REASSIGN RECHECK RECURSIVE REFERENCES REINDEX RELATIVE_P 
RELEASE
RENAME REPEATABLE REPLACE REPLICA RESET RESTART RESTRICT RETURNING 
RETURNS
REVOKE RIGHT ROLE ROLLBACK ROW ROWS RULE
  
--- 468,482 
NOT NOTHING NOTIFY NOTNULL NOWAIT NULL_P NULLIF NULLS_P NUMERIC
  
OBJECT_P OF OFF OFFSET OIDS OLD ON ONLY OPERATOR OPTION OPTIONS OR
!   ORDER OTHERS OUT_P OUTER_P OVER OVERLAPS OVERLAY OWNED OWNER
  
!   PARSER PARTIAL PARTITION PASSWORD PLACING PLANS POSITION
!   PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY
PRIOR PRIVILEGES PROCEDURAL PROCEDURE
  
QUOTE
  
!   RANGE READ REAL REASSIGN RECHECK RECURSIVE REFERENCES REINDEX 
RELATIVE_P RELEASE
RENAME REPEATABLE REPLACE REPLICA RESET RESTART RESTRICT RETURNING 
RETURNS
REVOKE RIGHT ROLE ROLLBACK ROW ROWS RULE
  
***
*** 479,495 
STATISTICS STDIN STDOUT STORAGE STRICT_P STRIP_P SUBSTRING SUPERUSER_P
SYMMETRIC SYSID SYSTEM_P
  
!   TABLE TABLESPACE TEMP TEMPLATE TEMPORARY TEXT_P THEN TIME TIMESTAMP
TO TRAILING TRANSACTION TREAT TRIGGER TRIM TRUE_P
TRUNCATE TRUSTED TYPE_P
  
! 

Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-20 Thread Hitoshi Harada
2008/12/20 Tom Lane t...@sss.pgh.pa.us:
 Hitoshi Harada umi.tan...@gmail.com writes:
 Here's the patch, including latest function default args. Regards,

 The comments added to planner.c contain various claims that setrefs.c
 will fix up window function references, but I see no code related to
 that in setrefs.c.  Please explain.


The Window node handles only expressions that contain corresponding
window functions. Since more than one Window nodes may be put in a
Plan, the window functions in the upper node are not evaluated in the
lower node but tlist should not forget it exists in the lower. For
examples,

SQL:
SELECT
row_number() OVER (ORDER BY salary) AS r1, row_number() OVER (ORDER BY
depname) AS r2
FROM empsalary;

The final Plan:
-Window (upper)
  pull r1 (Var)
  eval r2 (WindowFuncExpr)
  -Sort by depname
-Window (lower)
  eval r1 (WindowFunctionExpr)
  r2 as NULL (Const)
  -Sort by salary
-Scan empsalary

And it is possible for window_tlist() to change upper's r1 to Var but
delegating it to set_upper_references() seems right to do.If we simply
delegate all the nodes to set_upper_reference(), r2 would also be
evaluated in lower node, so we put null const in the lower and put
actual evaluation in the upper.

So window_preprocess and window_tlist do things needed for
set_upper_reference to fix up all the expressions correctly, that's
why no code is added to setrefs.c.

This part is hard to describe in english (or even in my japanese) so
if you are still lost, I'll add more explanation.

Regards,



-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-20 Thread Hitoshi Harada
2008/12/20 Tom Lane t...@sss.pgh.pa.us:
 I've been studying the grammar for the windowing patch a bit.  It seems
 to me that the existing window name option for window specification
 got left out.  I think that WindowDef needs to have two name fields,
 one for the name (if any) defined by the window definition, and one
 for the referenced window name if any.  Also the OVER name syntax
 seems like it maps to a referenced window name not a defined window
 name.

I completely missed this issue. If the existing window name is
allowed in window clause, then does it mean this is possible?

SELECT row_number() OVER w2 FROM t
WINDOW w1 AS (PARTITION BY grp), w2(w1)

And what if w1 refers to w2 and w2 refers to w1 cyclicly? And from
what I read the spec, it seems to me that it effects only frame clause
which is unlikely implemented for 8.4, because if existing window
name) is specified then partition clause and order by clause are
both permitted in the window definition.

 I am also fairly strongly inclined to rip out all of the frame_clause
 syntax, since (1) it's unimplemented and unlikely to get implemented
 for 8.4, and (2) it's not particularly satisfactory anyway.  The
 frame_bound_const business has to be rethought, because it's ugly and
 it doesn't even satisfy the limited requirements of the spec (cf
 general value specification which is an allowed alternative for
 unsigned value specification).  It might be that we'll end up having
 to make BETWEEN be a fully reserved word in order to implement that
 syntax.

The reason I added those grammer was 1) there had been possibilities
to implement frame clause, and 2) to support not-implemented error. 1)
is unlikey to be done and for 2) we need only rows/range syntax. So if
it is annoying for 8.4 release, I don't disagree with your suggestion.


 BTW, I notice we also have not implemented the spec's bizarre
 RESPECT NULLS/IGNORE NULLS decoration for lead/lag or FROM FIRST/LAST
 for nth_value.  This is fine with me, but we probably ought to document
 workarounds for that, if possible.  I guess FROM FIRST/LAST could be
 handled by reversing the window sort order; what about the NULLS stuff?

For function-specific syntax seems like more big discussion. I heard
that array_agg() may have ORDER BY clause in its arguments but
currently not included in pgsql, whereas extract() is supported as
special case. I think we might build general rule for those and until
it is completed, RESPECT NULLS/IGNORENULLS and others can be ignored.
Of course it must be documented though.

Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-20 Thread Hitoshi Harada
2008/12/21 Hitoshi Harada umi.tan...@gmail.com:
 2008/12/20 Tom Lane t...@sss.pgh.pa.us:
 I've been studying the grammar for the windowing patch a bit.  It seems
 to me that the existing window name option for window specification
 got left out.  I think that WindowDef needs to have two name fields,
 one for the name (if any) defined by the window definition, and one
 for the referenced window name if any.  Also the OVER name syntax
 seems like it maps to a referenced window name not a defined window
 name.

 I completely missed this issue. If the existing window name is
 allowed in window clause, then does it mean this is possible?

 SELECT row_number() OVER w2 FROM t
 WINDOW w1 AS (PARTITION BY grp), w2(w1)

 And what if w1 refers to w2 and w2 refers to w1 cyclicly? And from
 what I read the spec, it seems to me that it effects only frame clause
 which is unlikely implemented for 8.4, because if existing window
 name) is specified then partition clause and order by clause are
 both permitted in the window definition.

both not permitted in the window definition.

Sorry for my mistake.

-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-19 Thread Tom Lane
Hitoshi Harada umi.tan...@gmail.com writes:
 Here's the patch, including latest function default args. Regards,

The comments added to planner.c contain various claims that setrefs.c
will fix up window function references, but I see no code related to
that in setrefs.c.  Please explain.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-19 Thread Tom Lane
I've been studying the grammar for the windowing patch a bit.  It seems
to me that the existing window name option for window specification
got left out.  I think that WindowDef needs to have two name fields,
one for the name (if any) defined by the window definition, and one
for the referenced window name if any.  Also the OVER name syntax
seems like it maps to a referenced window name not a defined window
name.

I am also fairly strongly inclined to rip out all of the frame_clause
syntax, since (1) it's unimplemented and unlikely to get implemented
for 8.4, and (2) it's not particularly satisfactory anyway.  The
frame_bound_const business has to be rethought, because it's ugly and
it doesn't even satisfy the limited requirements of the spec (cf
general value specification which is an allowed alternative for
unsigned value specification).  It might be that we'll end up having
to make BETWEEN be a fully reserved word in order to implement that
syntax.

BTW, I notice we also have not implemented the spec's bizarre
RESPECT NULLS/IGNORE NULLS decoration for lead/lag or FROM FIRST/LAST
for nth_value.  This is fine with me, but we probably ought to document
workarounds for that, if possible.  I guess FROM FIRST/LAST could be
handled by reversing the window sort order; what about the NULLS stuff?

Comments?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-14 Thread Hitoshi Harada
2008/12/8 Heikki Linnakangas heikki.linnakan...@enterprisedb.com:
 That said, we should try to get this committed ASAP, so I think we can live
 without the trimming for 8.4.

Just let me know. What is the current status... Is there something for
me to do now? Or only wating?

Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-07 Thread Hitoshi Harada
2008/12/6 David Rowley [EMAIL PROTECTED]:

 I've spent last night and tonight trying to break the patch and I've not
 managed it.

 I spent 2 and a half hours on the train last night reading over the patch
 mainly for my own interest. I also went over the documentation and I have a
 few suggestions for improvement:

 +para
 + After literalWHERE/ and literalGROUP BY/ process,
 + rows might be windowed table, using the literalWINDOW/
 + clause.
 +/para

 I think I know what you mean here. My re-write seems to have turned the
 sentence into a paragraph. Please tell me if I've assumed the meaning
 wrongly:


 After the literalWHERE/, literalGROUP BY/ and literalHAVING/
 clauses one or more literalWINDOW/ clauses can be specified. This will
 allow window functions to be specified in the literalSELECT/ clause.
 These window functions can make use of the literalWINDOW/ clauses by
 making reference to the alias name of the window rather than explicitly
 specifying the properties of the window in each literalOVER/ clause.

The WINDOW clause is a clause that starts with WINDOW, containing
some window definitions, syntactically. So I rewrote it as:


After the literalWHERE/, literalGROUP BY/ and
literalHAVING/ clauses one or more window definitions can be
specified by the literalWINDOW/ clause. This will allow window
functions to be specified in the literalSELECT/ clause. These
window functions can make use of the literalWINDOW/ clauses by
making reference to the alias name of the window rather than
explicitly specifying the properties of the window in each
literalOVER/ clause.




 + Window functions are not placed in any of GROUP BY, HAVING and
 + WHERE clauses, which process values before any of the windows. If
 + there is need to qualify rows by the result of window functions,
 + whole of the query must be nested and append WHERE clause outer of
 + the current query.

 I think this one maybe needs an example to back it up. It's quite an
 important thing and I'm sure lots of people will need to do this. I'm not
 100% happy with my new paragraph either but can't see how to word it any
 better.

 Window functions cannot be used in the WHERE, GROUP BY or HAVING clauses
 of the query. If there is a need to filter rows, group results or filter
 rows after aggregation takes place (HAVING) then the query must be nested.
 The query should contain the window functions in the inner query and apply
 the additional clauses that contain the results from the window function in
 the outer query, such as:

 SELECT depname,
   empno,
   salary,
   enroll_date
 FROM (SELECT depname,
 empno,
 salary,
 enroll_date,
 ROW_NUMBER() OVER (PARTITION BY depname ORDER BY salary,empno)
 AS pos
  FROM empsalary
 ) AS e
 WHERE pos = 1;

 In the above query the we're filtering and only showing the results from the
 inner query where the ROW_NUMBER() value is equal to 1.

 But of course the above query would be more simple using DISTINCT ON. Maybe
 there is a better example... My previous marathon getting the person in 2nd
 place might be better but that's introducing another previously unknown
 table to the manual.

I use this query:

SELECT depname,
   empno,
   salary,
   enroll_date
FROM (SELECT depname,
empno,
salary,
enroll_date,
ROW_NUMBER() OVER (PARTITION BY depname ORDER BY
salary,empno) AS pos
 FROM empsalary
) AS e
WHERE pos  3;

This isn't emulated by DISTINCT ON, is it?


For all other issues, thanks, applied to my patch.


Regards,

-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-07 Thread Hitoshi Harada
2008/12/6 David Rowley [EMAIL PROTECTED]:
 Hitoshi Harada Wrote:
 2008/12/3 Hitoshi Harada [EMAIL PROTECTED]:
  I am randomly trying some issues instead of agg common code (which I
  now doubt if it's worth sharing the code), so tell me if you're
  restarting your hack again. I'll send the whole patch.
 

 Attached is the updated patch, including:

 - performance tuning up for large data sets
 - move find_wfunc to optimizer/util/clauses.c
 - rename WFunc to WindowFunc
 - rename WinDef to WindowDef
 - rename wfunc-pure_agg to winagg
 - remove winstate-tail_ptr
 - fix WinFrameGetArg in case that there are more than one peer, add
 relevant test
 - duplicate GetAggInitVal(), not sharing aggregate routines with nodeAgg.c

 I believe the remaining work is only to optimize row_number()/rank()
 cases to trim tuplestore like window aggregates. This will be done by
 providing some kind of way for each window functions to tell Window
 node that it doesn't require backward_scan.

I've spent hours to try this issue, and concluded it doesn't pay.

First, the test is on this query:

explain analyze select id, row_number() OVER (order by id)
from bigtable order by id;

where bigtable has ~400MB, 1000 rows.

This simple row_number() query takes about 50 sec, whereas without
row_number() indexscan query takes about 25 sec. I wondered what makes
the difference 25 sec.

With this test, the tuplestore dumps its tuples since it never trims.
Then I took profile of nodeWindow in some points,

tuplestore_puttupleslot 13.6 sec
spool_tuples37.9 sec
eval_windowfunction  9.3 sec

Note that spool_tuples contains execProc(outerPlan), which means its
37.9 sec contains outer indexscan 25 sec, plus tuplestore_puttuple,
13.9 sec.

Then I modified some code to let tuplestore trim and tested again then
the results were:

tuplestore_puttupleslot 11.2 sec
spool_tuples35.8 sec
eval_windowfunction  9.5 sec

It shows even though tuplestore trims its tuples and stays in memory
rather than dumps them on files, the performance up is only 2 sec in
50 sec. So I concluded the optimization for row_number()/rank() etc
doesn't pay for its more complexity in window function API. The
bottleneck of the Window node origins from something else, like
puttupleslot(), not Window node algorithm. Of course those issues
should be tracked more precisely, for the window functions work I
ignore them.


 It's been a long time since the CommitFest started. This patch has come a
 long way in that time. We've seen various performance improvements and many
 fixes where the patch was not matching the standard. We've also seen quite a
 big change in the window buffering technique which is showing amazing
 performance improvements in certain queries.

 I've spent many hours reading the standard and comparing the results from
 this patch against other RDBMS' that support window functions. I wonder if
 we're the first to implement NTH_VALUE()?.

 The patch, in my opinion is getting very close to being committable. The
 only outstanding issues; Please correct me where I'm wrong or have omitted
 something.

 * Fix still required for rank_up. Code still has a FIXME.
This was whether rank() without ORDER BY clause should be valid or
not. The answer is yes as it is implemented now, so I removed only
comments.

 * ROW_NUMBER() and RANK() performance to be looked into.
As I tested above.

 * Slight changes required in documentation (my previous email).
Applied to the patch.

 * Final code read by someone reviewing the code.
I am looking forward.

 I've also spent many hours trying to break this patch and in previous
 versions I've managed it. Last night I spent a few hours again testing this
 new patch and did not managed to find anything wrong. We're getting close to

 the time where the community can test further by committing this patch.
Agree. I'll send the latest patch and finish my work for now.

Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-07 Thread David Rowley
2008/12/7 Hitoshi Harada [EMAIL PROTECTED]:
 2008/12/7 Hitoshi Harada [EMAIL PROTECTED]:
 2008/12/6 David Rowley [EMAIL PROTECTED]:
 the time where the community can test further by committing this patch.
 Agree. I'll send the latest patch and finish my work for now.


 Here's the patch, including latest function default args. Regards,


I've added a link to the commitfest page and stated that the patch is
ready for a core member to review.

Good work.

David.

 --
 Hitoshi Harada


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-07 Thread Heikki Linnakangas

Hitoshi Harada wrote:

It shows even though tuplestore trims its tuples and stays in memory
rather than dumps them on files, the performance up is only 2 sec in
50 sec. So I concluded the optimization for row_number()/rank() etc
doesn't pay for its more complexity in window function API. The
bottleneck of the Window node origins from something else, like
puttupleslot(), not Window node algorithm. Of course those issues
should be tracked more precisely, for the window functions work I
ignore them.


The negative impact of not trimming the tuplestore comes from having to 
do I/O to write the tuples to temporary file. I would've expected that 
to show up with 400 MB table, but maybe that still fits comfortably in 
OS cache. Anyway, I would expect there to be a big drop in performance 
after the tuplestore no longer fits in cache, and trimming it would 
eliminate that.


That said, we should try to get this committed ASAP, so I think we can 
live without the trimming for 8.4.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-05 Thread David Rowley
Hitoshi Harada wrote:
 I tested the spool performance with David's earlier bigtable:
 
 CREATE TABLE bigtable (
  id SERIAL NOT NULL PRIMARY KEY,
  timestamp TIMESTAMP NOT NULL
 );
 
 -- about 383MB of data
 INSERT INTO bigtable (timestamp)
 SELECT NOW() + (CAST(RANDOM() * 10 AS INT) || ' secs')::INTERVAL
 FROM generate_series(1,1000);
 
 CREATE INDEX bigtable_timestamp_idx ON bigtable (timestamp);
 
 VACUUM ANALYZE bigtable;
 
 sample=# SELECT COUNT(*) FROM bigtable;
  count
 --
  1000
 (1 row)
 
 sample=# SELECT LEAD(timestamp) OVER (ORDER BY id) FROM bigtable LIMIT 1;
lead
 
  2008-12-02 00:15:10.288461
 (1 row)
 
 sample=# EXPLAIN ANALYZE SELECT LEAD(timestamp) OVER (ORDER BY id)
 FROM bigtable LIMIT 1;
 
 QUERY PLAN
 
 --
 
 ---
  Limit  (cost=0.00..0.04 rows=1 width=12) (actual time=0.038..0.039
 rows=1 loops=1)
   -  Window  (cost=0.00..386612.13 rows=1000 width=12) (actual
 time=0.036..0.036 rows=1
 loops=1)
 -  Index Scan using bigtable_pkey on bigtable
 (cost=0.00..286612.13 rows=1000 w
 idth=12) (actual time=0.018..0.021 rows=2 loops=1)
  Total runtime: 0.071 ms
 (4 rows)
 
 
 shows quite good result. Great work.

Amazing improvement!

Old patch:
david=# select timestamp,lag(timestamp,1) over (order by id) from bigtable
order by id limit 1;
 timestamp  | lag
+-
 2008-11-10 21:55:16.498458 |
(1 row)

Time: 105205.055 ms

New patch:
david=# select timestamp,lag(timestamp,1) over (order by id) from bigtable
order by id limit 1;
 timestamp  | lag
+-
 2008-12-04 22:05:22.687975 |
(1 row)

Time: 1.640 ms

 
 The following query works on my build:
 
  SELECT depname,SUM(SUM(salary)) OVER (ORDER BY depname) FROM empsalary
 GROUP
  BY depname;
  ERROR:  variable not found in subplan target list
 

This works fine on my build now too. 

 
 Now, I am thinking about refactoring around aggregate common code, and
 renaming WFunc to WinFunc, which leads pg_proc.proiswfunc be
 pg_proc.proiswinfunc and so on if no objections come.
 

I've spent last night and tonight trying to break the patch and I've not
managed it.

I spent 2 and a half hours on the train last night reading over the patch
mainly for my own interest. I also went over the documentation and I have a
few suggestions for improvement:


My modifications to the lead and lag syntax can be improved. Currently the
optional parameters make it look like DEFAULT can be specified without
OFFSET. This is not the case:

+ typeany, [integer], [any]/type

Should be:

+ typeany [,integer [,any] ]/type

And:

+  lag(replaceable class=parametervalue/replaceable,
[replaceable
+  class=parameteroffset/replaceable], [replaceable 
+  class=parameterdefault/replaceable])


Should be:

+  lag(replaceable class=parametervalue/replaceable [,
replaceable
+  class=parameteroffset/replaceable [,replaceable 
+  class=parameterdefault/replaceable] ])

Same for LEAD()


+para
+ After literalWHERE/ and literalGROUP BY/ process,
+ rows might be windowed table, using the literalWINDOW/
+ clause.
+/para

I think I know what you mean here. My re-write seems to have turned the
sentence into a paragraph. Please tell me if I've assumed the meaning
wrongly:


After the literalWHERE/, literalGROUP BY/ and literalHAVING/
clauses one or more literalWINDOW/ clauses can be specified. This will
allow window functions to be specified in the literalSELECT/ clause.
These window functions can make use of the literalWINDOW/ clauses by
making reference to the alias name of the window rather than explicitly
specifying the properties of the window in each literalOVER/ clause.


+para
+ Another expample shows different capability of window functions
+ from above.


Small typo: example instead of eapample

+para
+ The same window definitions can be named and put togather into one
+ definition using xref linkend=queries-window.

Small typo: together instead of togather.


+para
+ Window functions doesn't accept DISTINCT and ALL syntax, even though
+ the function is an aggregate function.
+/para

Small grammar error: doesn't should be replaced with don't. Or perhaps
change to:

Window functions, unlike normal aggregate functions, do not allow DISTINCT
or ALL to be used within the function argument list.


+ Window functions are not placed in any of GROUP BY, HAVING and
+ WHERE clauses, which process values before any of the windows. If
+ there is need to qualify rows by the result of window functions,
+ whole of the query must be nested and append WHERE clause outer of
+ the current query.

I think this one maybe needs an example to back it up. It's quite an

Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-05 Thread David Rowley
Hitoshi Harada Wrote:
 2008/12/3 Hitoshi Harada [EMAIL PROTECTED]:
  I am randomly trying some issues instead of agg common code (which I
  now doubt if it's worth sharing the code), so tell me if you're
  restarting your hack again. I'll send the whole patch.
 
 
 Attached is the updated patch, including:
 
 - performance tuning up for large data sets
 - move find_wfunc to optimizer/util/clauses.c
 - rename WFunc to WindowFunc
 - rename WinDef to WindowDef
 - rename wfunc-pure_agg to winagg
 - remove winstate-tail_ptr
 - fix WinFrameGetArg in case that there are more than one peer, add
 relevant test
 - duplicate GetAggInitVal(), not sharing aggregate routines with nodeAgg.c
 
 I believe the remaining work is only to optimze row_number()/rank()
 cases to trim tuplestore like window aggregates. This will be done by
 providing some kind of way for each window functions to tell Window
 node that it doesn't require backward_scan.

It's been a long time since the CommitFest started. This patch has come a
long way in that time. We've seen various performance improvements and many
fixes where the patch was not matching the standard. We've also seen quite a
big change in the window buffering technique which is showing amazing
performance improvements in certain queries.

I've spent many hours reading the standard and comparing the results from
this patch against other RDBMS' that support window functions. I wonder if
we're the first to implement NTH_VALUE()?.

The patch, in my opinion is getting very close to being committable. The
only outstanding issues; Please correct me where I'm wrong or have omitted
something.

* Fix still required for rank_up. Code still has a FIXME.

* ROW_NUMBER() and RANK() performance to be looked into.

* Slight changes required in documentation (my previous email).

* Final code read by someone reviewing the code.

I've also spent many hours trying to break this patch and in previous
versions I've managed it. Last night I spent a few hours again testing this
new patch and did not managed to find anything wrong. We're getting close to

the time where the community can test further by committing this patch.

I'll make a reference to this post in the CommitFest list for any
non-followers of this thread.

David.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-02 Thread Hitoshi Harada
2008/12/2 Hitoshi Harada [EMAIL PROTECTED]:
 sample=# EXPLAIN ANALYZE SELECT LEAD(timestamp) OVER (ORDER BY id)
 FROM bigtable LIMIT 1;

 QUERY PLAN

 --
 ---
  Limit  (cost=0.00..0.04 rows=1 width=12) (actual time=0.038..0.039
 rows=1 loops=1)
  -  Window  (cost=0.00..386612.13 rows=1000 width=12) (actual
 time=0.036..0.036 rows=1
 loops=1)
-  Index Scan using bigtable_pkey on bigtable
 (cost=0.00..286612.13 rows=1000 w
 idth=12) (actual time=0.018..0.021 rows=2 loops=1)
  Total runtime: 0.071 ms
 (4 rows)


 shows quite good result. Great work.



After more playing with the new patch, I found worse results.

sample=# explain analyze select id, row_number() OVER (order by id)
from bigtable order by id;

QUERY PLAN

--
---
 Window  (cost=0.00..361612.13 rows=1000 width=4) (actual
time=0.064..105414.522 rows=1000
 loops=1)
   -  Index Scan using bigtable_pkey on bigtable
(cost=0.00..286612.13 rows=1000 width=4
) (actual time=0.056..16836.341 rows=1000 loops=1)
 Total runtime: 114650.074 ms
(3 rows)



sample=# explain analyze select id,LAG(timestamp,1) over (order by id)
from bigtable order by id;

QUERY PLAN

--

 Window  (cost=0.00..411612.13 rows=1000 width=12) (actual
time=0.065..122583.331 rows=100
0 loops=1)
   -  Index Scan using bigtable_pkey on bigtable
(cost=0.00..286612.13 rows=1000 width=1
2) (actual time=0.056..18066.829 rows=1000 loops=1)
 Total runtime: 132770.399 ms
(3 rows)

The earlier patch results are here:
http://archives.postgresql.org/pgsql-hackers/2008-11/msg01121.php

row_number(): 44s/114s
lag(): 79s/132s

I don't understand the new patch totally, and I know the row_number()
optimization is in progress, but even lag() is quite worse. Maybe
tuplestore read pointer's heavy uses cause these.

Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-12-02 Thread Hitoshi Harada
2008/11/26 Heikki Linnakangas [EMAIL PROTECTED]:
 Hitoshi Harada wrote:

 I read more, and your spooling approach seems flexible for both now
 and the furture. Looking at only current release, the frame with ORDER
 BY is done by detecting peers in WinFrameGetArg() and add row number
 of peers to winobj-currentpos. Actually if we have capability to
 spool all rows we need on demand, the frame would be only a boundary
 problem.

 Yeah, we could do that. I'm afraid it would be pretty slow, though, if
 there's a lot of peers. That could probably be alleviated with some sort of
 caching, though.

I added code for this issue. See
http://git.postgresql.org/?p=~davidfetter/window_functions/.git;a=blobdiff;f=src/backend/executor/nodeWindow.c;h=f2144bf73a94829cd7a306c28064fa5454f8d369;hp=50a6d6ca4a26cd4854c445364395ed183b61f831;hb=895f1e615352dfc733643a701d1da3de7f91344b;hpb=843e34f341f0e824fd2cc0f909079ad943e3815b

This process is very similar to your aggregatedupto in window
aggregate, so they might be shared as general the way to detect frame
boundary, aren't they?

I am randomly trying some issues instead of agg common code (which I
now doubt if it's worth sharing the code), so tell me if you're
restarting your hack again. I'll send the whole patch.


Regards,



-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-30 Thread David Rowley
 -Original Message-
 From: Heikki Linnakangas [mailto:[EMAIL PROTECTED]
 Sent: 26 November 2008 09:09
 To: Hitoshi Harada
 Cc: David Rowley; pgsql-hackers@postgresql.org
 Subject: Re: Windowing Function Patch Review - Standard Conformance
 
 Hitoshi Harada wrote:
  2008/11/26 David Rowley [EMAIL PROTECTED]:
  I'm at a bit of a loss to what to do now. Should I wait for your work
  Heikki? Or continue validating this patch?
 
  The best thing I can think to do right now is continue and any problems
 I
  find you can add regression tests for, then if we keep your regression
 tests
  for Heikki's changes then we can validate those changes more quickly.
 
  Any thoughts? Better ideas?
 
  Thanks to your great tests, we now know much more about specification
  and where to fail easily, so continuing makes sense but it may be good
  time to take a rest and wait for Heikki's patch completing.
 
 Here's another updated patch, including all your bug fixes.
 
 There's two known issues:
 - ranking functions still don't treat peer rows correctly.
 
 - I commented out the this function requires ORDER BY clause in the
 window test in rank_up, because a window function shouldn't be poking
 into the WindowState struct like that. I wonder if it's really needed?
 In section 7.11, the SQL2008 spec says if WD has no window ordering
 clause, then the window ordering is implementation-dependent, and *all
 rows are peers*. The regression test now fails because of this, but the
 current behavior actually seems correct to me.
 

Thanks for the updated patch. I've been doing a little testing over the
weekend.

I'm running into a small problem with passing results from normal aggregate
functions into the window function. Hitoshi and I saw this before:

SELECT depname,SUM(SUM(salary)) OVER (ORDER BY depname) FROM empsalary GROUP
BY depname;
ERROR:  variable not found in subplan target list

Hitoshi fixed this in his 2008-11-12 patch, but it seems to have found its
way back in.


I was also reading over the standard tonight. I've discovered that the
OFFSET in LEAD() and LAG() is optional. It should default to 1 if it is not
present. Oracle seems to support this.

SQL2008 says:
 If lead or lag function is specified, then:
 i) Let VE1 be lead or lag extent and let DT be the declared type of VE1.
 ii) Case:
 Scalar expressions 209
 WD 9075-2:200w(E)
 6.10 window function
 If offset is specified, then let OFF be offset. The declared type of
OFF shall be an
 exact numeric type with scale 0 (zero).
 1)
 2) Otherwise, let OFF be 1 (one).

Yet another variant of LEAD() and LAG() but I think well worth it for both
compliance to the standard and compatibility to Oracle.


I've also been thinking about the on-demand buffering for the window
functions that you talked about. In some of my previous performance tests I
noticed that LEAD with a LIMIT clause is not optimal as it will store all
the rows in the tuplestore before applying the LIMIT:

select timestamp,lead(timestamp,1) over (order by id) from bigtable order by
id limit 1;

Is there any way for the on-demand buffering to make the above query faster?
Really only the first 2 rows ordered by id need to be read. 

I had previously thought this would be too difficult to implement as the
OFFSET can be variable, but maybe it's possible on-demand. Yet I'd imagine
it's difficult to know when to allow rows to exit the tuplestore as a
LAG(col, someVariableValue) might need those rows later.

My next task is to read more of the standard just in case there is anything
else we should know.

David.



 --
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-30 Thread David Rowley
I wrote:
 I was also reading over the standard tonight. I've discovered that the
 OFFSET in LEAD() and LAG() is optional. It should default to 1 if it is
 not present. Oracle seems to support this.
 
 SQL2008 says:
  If lead or lag function is specified, then:
  i) Let VE1 be lead or lag extent and let DT be the declared type of
 VE1.
  ii) Case:
  Scalar expressions 209
  WD 9075-2:200w(E)
  6.10 window function
  If offset is specified, then let OFF be offset. The declared type of
 OFF shall be an
  exact numeric type with scale 0 (zero).
  1)
  2) Otherwise, let OFF be 1 (one).
 
 Yet another variant of LEAD() and LAG() but I think well worth it for both
 compliance to the standard and compatibility to Oracle.

I figured this was quite simple so I've created a patch to implement this.
Can probably put this down to the fact that I'm starting to feel bad about
pointing out the mistakes and having someone else fix them. Figured it was
time to make some changes myself.

I've got limited experience with diff so please let me know if there is
something wrong with the patch. Same goes for my changes to the code.

I re-sequenced the OIDs of other window functions so it will require initdb.

Also I made some updates to the documentation. Wasn't 100% sure on the
syntax for the optional arguments there. Hitoshi had: arg1 [,optional1].
I've changed this to arg, [optional1], [optional2].

One thing I didn't do was update the regression test:
SELECT oid, proname FROM pg_proc WHERE proiswfunc;

Hopefully this patch will apply after applying Heikki's latest patch
(version 3).

If you're happy with this Heikki can you merge to your patch?

David




windowfunc_nooffset1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-27 Thread David Rowley
I wrote:
  Hmm, did you apply the latest patch correctly? My build can produce
  right results, so I don't see why it isn't fixed. Make sure the lines
  around 2420-2430 in planner.c like:
 /*
  * must copyObject() to avoid args concatenating
 with each other.
  */
 pulled_exprs = list_concat(pulled_exprs,
 copyObject(wfunc-args));
 
  where copyObject() is added.
 
 I'm sitting here away from home with a funny feeling I forgot to make
 install.
 I think my home adsl has dropped out so can't confirm that. If it
 works for you and not for me last night then I probably did forget.
 
 I'll let you know.

Sorry, false alarm. What can I say, it was late. I did make install, just to
the wrong place, so ended up using the old binary.

Sorry for panic. The bug is fixed. Only I still get regression failures on
window. Do they all pass for you?

I have most of the weekend to test Heikki's work, in fact it's building now.
But I'll have to leave it running...

David.





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-26 Thread Heikki Linnakangas

David Rowley wrote:

I've created a query that uses the table in your regression test.
max_salary1 gives incorrect results. If you remove the max_salary2 column it
gives the correct results.


Thanks. I saw this myself yesterday, while hacking on the patch. I 
thought it was a bug I had introduced, but apparently it was there all 
along. Anyway, fixed in the latest version I will send shortly.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-26 Thread Hitoshi Harada
2008/11/26 Heikki Linnakangas [EMAIL PROTECTED]:
 Hitoshi Harada wrote:

 2008/11/26 David Rowley [EMAIL PROTECTED]:

 I'm at a bit of a loss to what to do now. Should I wait for your work
 Heikki? Or continue validating this patch?

 The best thing I can think to do right now is continue and any problems I
 find you can add regression tests for, then if we keep your regression
 tests
 for Heikki's changes then we can validate those changes more quickly.

 Any thoughts? Better ideas?

 Thanks to your great tests, we now know much more about specification
 and where to fail easily, so continuing makes sense but it may be good
 time to take a rest and wait for Heikki's patch completing.

 Here's another updated patch, including all your bug fixes.

 There's two known issues:
 - ranking functions still don't treat peer rows correctly.

 - I commented out the this function requires ORDER BY clause in the window
 test in rank_up, because a window function shouldn't be poking into the
 WindowState struct like that. I wonder if it's really needed? In section
 7.11, the SQL2008 spec says if WD has no window ordering clause, then the
 window ordering is implementation-dependent, and *all rows are peers*. The
 regression test now fails because of this, but the current behavior actually
 seems correct to me.


Yes, I was wrong. The reason I put the error in rank() without ORDER
BY is nothing but I didn't find it. It is actually a reasonable
specification, isn't it.

This is tiny thing, but negative transition function can be called
inverse transition function? I feel the latter is more readable.


Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-26 Thread Hitoshi Harada
2008/11/26 Heikki Linnakangas [EMAIL PROTECTED]:
 Hitoshi Harada wrote:

 I read more, and your spooling approach seems flexible for both now
 and the furture. Looking at only current release, the frame with ORDER
 BY is done by detecting peers in WinFrameGetArg() and add row number
 of peers to winobj-currentpos. Actually if we have capability to
 spool all rows we need on demand, the frame would be only a boundary
 problem.

 Yeah, we could do that. I'm afraid it would be pretty slow, though, if
 there's a lot of peers. That could probably be alleviated with some sort of
 caching, though.

 It seems to me that eval_windowaggregate() also should use frame APIs.
 Only things we have to care is the shrinking frame, which is not
 supported in this release. So I'd suggest winobj-aggregatedupto to be
 removed. Is there objection?

 Actually, I took a different approach in the latest patch. Window aggregates
 now use a separate read pointer into the tuplestore. The current row is also
 read using a separate read pointer in ExecWindow. The aggregate and current
 row read pointers don't need random access, which has the nice effect that
 if you only use window aggregates, not window functions, the tuplestore
 doesn't need random access, and doesn't need to spill to disk as long as the
 window frame fits in work_mem.

 We should still figure out how to make it possible to trim the tuplestore,
 when window functions that don't need random access are used. Like
 ROW_NUMBER and RANK. Earlier, I thought we should add function to the window
 object API to explicitly tell that tuples before row X are no longer needed.
 But I'm now starting to wonder if we should design the window object API
 more like the tuplestore API, with a read pointer that you can advance
 forward or backward, and rewind. That would probably map more naturally to
 the underlying tuplestore, and it seems like it would be just as easy to use
 in all the existing window functions.


Complete solution, at least for the current release. I now figure out
exactly what the tuplestore_trim does. So currentrow pointer doesn't
need go backward, neither does extending frame's aggregate pointer,
row_number, rank, etc. Cutting off frame's aggregate need random row,
so does lead, lag, etc. Even there were random access, it's very
flexible in triming and saving memory. Little concern is some
operations like WinRowIsPeer() doesn't know if the required row is
trimmed already, which isn't big problem in the existing functions.

Now you might think about sharing aggregate code like
initialize/advance/finalize. If you want I can refactor in nodeAgg.c
to be able sharing with nodeWindow.c, which wouldn't conflict with
your work.


Regards,



-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-26 Thread Heikki Linnakangas

Hitoshi Harada wrote:

I read more, and your spooling approach seems flexible for both now
and the furture. Looking at only current release, the frame with ORDER
BY is done by detecting peers in WinFrameGetArg() and add row number
of peers to winobj-currentpos. Actually if we have capability to
spool all rows we need on demand, the frame would be only a boundary
problem.


Yeah, we could do that. I'm afraid it would be pretty slow, though, if 
there's a lot of peers. That could probably be alleviated with some sort 
of caching, though.



It seems to me that eval_windowaggregate() also should use frame APIs.
Only things we have to care is the shrinking frame, which is not
supported in this release. So I'd suggest winobj-aggregatedupto to be
removed. Is there objection?


Actually, I took a different approach in the latest patch. Window 
aggregates now use a separate read pointer into the tuplestore. The 
current row is also read using a separate read pointer in ExecWindow. 
The aggregate and current row read pointers don't need random access, 
which has the nice effect that if you only use window aggregates, not 
window functions, the tuplestore doesn't need random access, and doesn't 
need to spill to disk as long as the window frame fits in work_mem.


We should still figure out how to make it possible to trim the 
tuplestore, when window functions that don't need random access are 
used. Like ROW_NUMBER and RANK. Earlier, I thought we should add 
function to the window object API to explicitly tell that tuples before 
row X are no longer needed. But I'm now starting to wonder if we should 
design the window object API more like the tuplestore API, with a read 
pointer that you can advance forward or backward, and rewind. That would 
probably map more naturally to the underlying tuplestore, and it seems 
like it would be just as easy to use in all the existing window functions.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-26 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 Here's another updated patch, including all your bug fixes.

I did a very fast pass through this and have a few comments.


* Don't bother manually updating keywords.sgml.  As stated therein, that
table is automatically generated.  All you'll accomplish is to cause merge
problems.  (The effort would be a lot better spent on the non-boilerplate
parts of the docs anyway.)

* I assume there's going to be some way to create user-defined window
functions?

* It seems fairly unlikely that you can get away with not supporting
any qual expression on a Window plan node.  What will you do with HAVING
qualifiers?

* The find_aggref code added to planagg.c (where it doesn't belong anyway)
doesn't seem to be used anywhere.

* In the same vein, I'm unimpressed with moving GetAggInitVal into
execGrouping.c, which it isn't at all related to.  I'd just leave it alone
and duplicate the code in nodeWindow.c --- it's not exactly large.  If you
did insist on sharing this code it would be appropriate to change the
name and comments to reflect the fact that it's being used for more than
just aggregates, anyhow.

* And in the same vein. var.c is hardly the place to put a
search-for-wfuncs routine.

* It seems like a coin was flipped to determine whether struct and field
names would use window, win, or just w (I find WFunc to be
particularly unhelpful to a reader who doesn't already know what it is).
Please try to reduce the surprise factor.  I'd suggest consistently using
window in type names, though win is an OK prefix for field names
within window-related structs.

* This is a bad idea:

  /*
+  * OrderClause -
+  *   representation of ORDER BY in Window
+  */
+ typedef SortGroupClause OrderClause;
+ 
+ 
+ /*
+  * PartitionClause -
+  *   representaition of PATITION BY in Window
+  */
+ typedef SortGroupClause PartitionClause;

If they're just SortGroupClauses, call them that, don't invent an alias.
(Yes, I know SortClause and GroupClause used to be aliases.  That was a
bad idea: it confused matters and required lots of useless duplicated
code, except for the places where we didn't duplicate code because we were
willing to assume struct equivalence.  There's basically just nothing that
wins about that approach.)  In any case, order and partition are
really bad names to be using here given the number of possible other
meanings for those terms in a DBMS context.  If you actually need separate
struct types then names like WindowPartitionClause would be appropriate.

* The API changes chosen for func_get_detail seem pretty bizarre.
Why didn't you just add a new return code FUNCDETAIL_WINDOW?

* The node support needs to be gone over more closely.  I noticed just in
random checking that WFunc is missing parse-location support in
nodeFuncs.c and the Query.hasWindow field got added to copyfuncs, outfuncs,
readfuncs, but not equalfuncs.

* Please heed the comment at the top of parallel_schedule about the max
number of tests per parallel group.

* I don't find the test added to opr_sanity.sql to be particularly sane.
We *will* have the ability to add window functions.  But it might be
helpful to check that proisagg and proiswfunc aren't both set.

* errcodes.h is not the only place that has to be touched to add a new
error code --- see also sgml/errcodes.sgml, plpgsql/src/plerrcodes.h.
And what is your precedent for using 42813?  I don't see that in the
standard.  If it's coming from DB2 it's okay, otherwise we should be
using a private 'P' code.

* Please try to eliminate random whitespace changes from the patch
... *particularly* in files that otherwise wouldn't be touched at all
(there are at least two cases of that in this patch)


That's all I have time for right now ... more to come no doubt.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-26 Thread David Rowley
On 26/11/2008, Hitoshi Harada [EMAIL PROTECTED] wrote:
 2008/11/26 David Rowley [EMAIL PROTECTED]:
  Hitoshi Harada wrote:
  2008/11/20 David Rowley [EMAIL PROTECTED]:
   -- The following query gives incorrect results on the
   -- maxhighbid column
  
   SELECT auctionid,
 category,
 description,
 highestbid,
 reserve,
 MAX(highestbid) OVER (ORDER BY auctionid) AS maxhighbid,
 MAX(reserve) OVER() AS highest_reserve
   FROM auction_items;
  
   If you remove the highest_reserve column you get the correct results.
  
   The bug also affects MIN, AVG, COUNT, STDDEV but not SUM.
  Good report! I fixed this bug, which was by a trival missuse of
  list_concat() in the planner. This was occurred when the first
  aggregate trans func is strict and the second aggregate argument may
  be null. Yep, the argument of the second was implicitly passed to the
  first wrongly. That's why it didn't occur if you delete the second
  MAX().
 
  I added a test case with the existing data emulating yours (named
  strict aggs) but if it is wrong, let me know.
 
 
  It's not quite right yet. I'm also getting regression tests failing on
  window. Let me know if you want the diffs.
 
  I've created a query that uses the table in your regression test.
  max_salary1 gives incorrect results. If you remove the max_salary2 column it
  gives the correct results.
 
  Please excuse the lack of sanity with the query. I had to do it this way to
  get 2 columns with NULLs.
 
 
  SELECT depname,
empno,
salary,
salary1,
salary2,
MAX(salary1) OVER (ORDER BY empno) AS max_salary1,
MAX(salary2) OVER() AS max_salary2
  FROM (SELECT depname,
  empno,
  salary,
  (CASE WHEN salary  5000 THEN NULL ELSE salary END) AS salary1,
  (CASE WHEN salary = 5000 THEN NULL ELSE salary END) AS salary2
   FROM empsalary
  ) empsalary;
 
  Actual results:
 
   depname  | empno | salary | salary1 | salary2 | max_salary1 | max_salary2
  ---+---++-+-+-+-
   sales | 1 |   5000 |5000 | | |4800
   personnel | 2 |   3900 | |3900 | |4800
   sales | 3 |   4800 | |4800 | |4800
   sales | 4 |   4800 | |4800 | |4800
   personnel | 5 |   3500 | |3500 | |4800
   develop   | 7 |   4200 | |4200 | |4800
   develop   | 8 |   6000 |6000 | | |4800
   develop   | 9 |   4500 | |4500 | |4800
   develop   |10 |   5200 |5200 | | |4800
   develop   |11 |   5200 |5200 | | |4800
 
 
  Correct results:
 
   depname  | empno | salary | salary1 | salary2 | max_salary1 | max_salary2
  ---+---++-+-+-+-
   sales | 1 |   5000 |5000 | |5000 |4800
   personnel | 2 |   3900 | |3900 |5000 |4800
   sales | 3 |   4800 | |4800 |5000 |4800
   sales | 4 |   4800 | |4800 |5000 |4800
   personnel | 5 |   3500 | |3500 |5000 |4800
   develop   | 7 |   4200 | |4200 |5000 |4800
   develop   | 8 |   6000 |6000 | |6000 |4800
   develop   | 9 |   4500 | |4500 |6000 |4800
   develop   |10 |   5200 |5200 | |6000 |4800
   develop   |11 |   5200 |5200 | |6000 |4800
 
 
  This might be a good regression test once it's fixed.
 

 Hmm, did you apply the latest patch correctly? My build can produce
 right results, so I don't see why it isn't fixed. Make sure the lines
 around 2420-2430 in planner.c like:
/*
 * must copyObject() to avoid args concatenating with 
 each other.
 */
pulled_exprs = list_concat(pulled_exprs, 
 copyObject(wfunc-args));

 where copyObject() is added.

I'm sitting here away from home with a funny feeling I forgot to make install.
I think my home adsl has dropped out so can't confirm that. If it
works for you and not for me last night then I probably did forget.

I'll let you know.



 I'm not sure if this is related, another bug is found:

 *** a/src/backend/nodes/equalfuncs.c
 --- b/src/backend/nodes/equalfuncs.c
 ***
 *** 2246,2251  equal(void *a, void *b)
 --- 2246,2252 
 break;
 case T_Aggref:
 retval = _equalAggref(a, b);
 +break;
 

Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-26 Thread Hitoshi Harada
2008/11/27 Tom Lane [EMAIL PROTECTED]:
 Heikki Linnakangas [EMAIL PROTECTED] writes:
 Here's another updated patch, including all your bug fixes.

 I did a very fast pass through this and have a few comments.

Thanks for your comments. The executor part is now being refactored by
Heikki, but remaining are still on me. Note that some of those are
because of my earlier poor understanding.


 * Don't bother manually updating keywords.sgml.  As stated therein, that
 table is automatically generated.  All you'll accomplish is to cause merge
 problems.  (The effort would be a lot better spent on the non-boilerplate
 parts of the docs anyway.)

OK, I intend nothing here but didn't know the rule. will remove it.

 * I assume there's going to be some way to create user-defined window
 functions?

Yes, but for 8.4 no. The window functions will need specific window
function APIs rather than existing PG_XXX APIs and the design of them
affects how to design the Window executor node. So we are currently
desgining the APIs. If it completes in the future users can create
their own functions.

 * It seems fairly unlikely that you can get away with not supporting
 any qual expression on a Window plan node.  What will you do with HAVING
 qualifiers?

Window nodes are executed after any of WHERE, GROUP BY, HAVING, and
before ORDER BY. Window nodes don't have qual and HAVING doesn't give
any effect to Window operations.

 * The find_aggref code added to planagg.c (where it doesn't belong anyway)
 doesn't seem to be used anywhere.

It was needed to extract Aggref node in planner once, but not needed
anymore. will remove it.

 * In the same vein, I'm unimpressed with moving GetAggInitVal into
 execGrouping.c, which it isn't at all related to.  I'd just leave it alone
 and duplicate the code in nodeWindow.c --- it's not exactly large.  If you
 did insist on sharing this code it would be appropriate to change the
 name and comments to reflect the fact that it's being used for more than
 just aggregates, anyhow.

It is now in the discussion. Since nodeWindow has much duplicated code
in initialize/advance/finalize so we wonder if those codes should be
shared among the two nodes. If so, GetAggInitVal seems to be shared as
well as other aggregate specific code. If we decide to separate them,
your suggestion that GetAggInitVal should be duplicated will be sane.

 * And in the same vein. var.c is hardly the place to put a
 search-for-wfuncs routine.

Agreed, but where to go? clause.c may be, or under parser/ ?

 * It seems like a coin was flipped to determine whether struct and field
 names would use window, win, or just w (I find WFunc to be
 particularly unhelpful to a reader who doesn't already know what it is).
 Please try to reduce the surprise factor.  I'd suggest consistently using
 window in type names, though win is an OK prefix for field names
 within window-related structs.

I named WFunc as WinFunc once, but sounds too long for such heavily
used node. I liked it like Agg, but Win is not appropriate neither is
Func. And also, its name is consistent with the added pg_proc column
named proiswfunc. I wonder it would be proiswinfunc if we rename WFunc
as WinFunc.

 * This is a bad idea:

  /*
 +  * OrderClause -
 +  *   representation of ORDER BY in Window
 +  */
 + typedef SortGroupClause OrderClause;
 +
 +
 + /*
 +  * PartitionClause -
 +  *   representaition of PATITION BY in Window
 +  */
 + typedef SortGroupClause PartitionClause;

 If they're just SortGroupClauses, call them that, don't invent an alias.
 (Yes, I know SortClause and GroupClause used to be aliases.  That was a
 bad idea: it confused matters and required lots of useless duplicated
 code, except for the places where we didn't duplicate code because we were
 willing to assume struct equivalence.  There's basically just nothing that
 wins about that approach.)  In any case, order and partition are
 really bad names to be using here given the number of possible other
 meanings for those terms in a DBMS context.  If you actually need separate
 struct types then names like WindowPartitionClause would be appropriate.

This is because I didn't know quite well about windowed table
specification earlier (and when I was started the Group and the Sort
was separated as you point). And now I can tell the two nodes can be
named SortGroupClause, nothing special.

 * The API changes chosen for func_get_detail seem pretty bizarre.
 Why didn't you just add a new return code FUNCDETAIL_WINDOW?

An aggregate that is existing currently can be used as a window
function. But we need to treat it as specialized case. A normal
aggregate without OVER clause is GROUP BY aggregate and with OVER
clause it's window aggregate. For func_get_detail to determine which
aggregate windef variable must be passed. Is it better?

And also, block starting with Oops. Time to die comment in
ParseFuncOrColumn can be shared among two types. So I thought the two
are similar and func_get_detail 

Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-25 Thread David Rowley
Hitoshi Harada wrote:
 2008/11/20 David Rowley [EMAIL PROTECTED]:
  -- The following query gives incorrect results on the
  -- maxhighbid column
 
  SELECT auctionid,
category,
description,
highestbid,
reserve,
MAX(highestbid) OVER (ORDER BY auctionid) AS maxhighbid,
MAX(reserve) OVER() AS highest_reserve
  FROM auction_items;
 
  If you remove the highest_reserve column you get the correct results.
 
  The bug also affects MIN, AVG, COUNT, STDDEV but not SUM.
 Good report! I fixed this bug, which was by a trival missuse of
 list_concat() in the planner. This was occurred when the first
 aggregate trans func is strict and the second aggregate argument may
 be null. Yep, the argument of the second was implicitly passed to the
 first wrongly. That's why it didn't occur if you delete the second
 MAX().
 
 I added a test case with the existing data emulating yours (named
 strict aggs) but if it is wrong, let me know.


It's not quite right yet. I'm also getting regression tests failing on
window. Let me know if you want the diffs.

I've created a query that uses the table in your regression test.
max_salary1 gives incorrect results. If you remove the max_salary2 column it
gives the correct results.

Please excuse the lack of sanity with the query. I had to do it this way to
get 2 columns with NULLs.


SELECT depname,
   empno,
   salary,
   salary1,
   salary2,
   MAX(salary1) OVER (ORDER BY empno) AS max_salary1,
   MAX(salary2) OVER() AS max_salary2
FROM (SELECT depname,
 empno,
 salary,
 (CASE WHEN salary  5000 THEN NULL ELSE salary END) AS salary1,
 (CASE WHEN salary = 5000 THEN NULL ELSE salary END) AS salary2
  FROM empsalary
) empsalary;

Actual results:

  depname  | empno | salary | salary1 | salary2 | max_salary1 | max_salary2
---+---++-+-+-+-
 sales | 1 |   5000 |5000 | | |4800
 personnel | 2 |   3900 | |3900 | |4800
 sales | 3 |   4800 | |4800 | |4800
 sales | 4 |   4800 | |4800 | |4800
 personnel | 5 |   3500 | |3500 | |4800
 develop   | 7 |   4200 | |4200 | |4800
 develop   | 8 |   6000 |6000 | | |4800
 develop   | 9 |   4500 | |4500 | |4800
 develop   |10 |   5200 |5200 | | |4800
 develop   |11 |   5200 |5200 | | |4800


Correct results:

  depname  | empno | salary | salary1 | salary2 | max_salary1 | max_salary2
---+---++-+-+-+-
 sales | 1 |   5000 |5000 | |5000 |4800
 personnel | 2 |   3900 | |3900 |5000 |4800
 sales | 3 |   4800 | |4800 |5000 |4800
 sales | 4 |   4800 | |4800 |5000 |4800
 personnel | 5 |   3500 | |3500 |5000 |4800
 develop   | 7 |   4200 | |4200 |5000 |4800
 develop   | 8 |   6000 |6000 | |6000 |4800
 develop   | 9 |   4500 | |4500 |6000 |4800
 develop   |10 |   5200 |5200 | |6000 |4800
 develop   |11 |   5200 |5200 | |6000 |4800


This might be a good regression test once it's fixed.

I'm at a bit of a loss to what to do now. Should I wait for your work
Heikki? Or continue validating this patch? 

The best thing I can think to do right now is continue and any problems I
find you can add regression tests for, then if we keep your regression tests
for Heikki's changes then we can validate those changes more quickly.

Any thoughts? Better ideas?

David.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-25 Thread Hitoshi Harada
2008/11/26 David Rowley [EMAIL PROTECTED]:
 Hitoshi Harada wrote:
 2008/11/20 David Rowley [EMAIL PROTECTED]:
  -- The following query gives incorrect results on the
  -- maxhighbid column
 
  SELECT auctionid,
category,
description,
highestbid,
reserve,
MAX(highestbid) OVER (ORDER BY auctionid) AS maxhighbid,
MAX(reserve) OVER() AS highest_reserve
  FROM auction_items;
 
  If you remove the highest_reserve column you get the correct results.
 
  The bug also affects MIN, AVG, COUNT, STDDEV but not SUM.
 Good report! I fixed this bug, which was by a trival missuse of
 list_concat() in the planner. This was occurred when the first
 aggregate trans func is strict and the second aggregate argument may
 be null. Yep, the argument of the second was implicitly passed to the
 first wrongly. That's why it didn't occur if you delete the second
 MAX().

 I added a test case with the existing data emulating yours (named
 strict aggs) but if it is wrong, let me know.


 It's not quite right yet. I'm also getting regression tests failing on
 window. Let me know if you want the diffs.

 I've created a query that uses the table in your regression test.
 max_salary1 gives incorrect results. If you remove the max_salary2 column it
 gives the correct results.

 Please excuse the lack of sanity with the query. I had to do it this way to
 get 2 columns with NULLs.


 SELECT depname,
   empno,
   salary,
   salary1,
   salary2,
   MAX(salary1) OVER (ORDER BY empno) AS max_salary1,
   MAX(salary2) OVER() AS max_salary2
 FROM (SELECT depname,
 empno,
 salary,
 (CASE WHEN salary  5000 THEN NULL ELSE salary END) AS salary1,
 (CASE WHEN salary = 5000 THEN NULL ELSE salary END) AS salary2
  FROM empsalary
 ) empsalary;

 Actual results:

  depname  | empno | salary | salary1 | salary2 | max_salary1 | max_salary2
 ---+---++-+-+-+-
  sales | 1 |   5000 |5000 | | |4800
  personnel | 2 |   3900 | |3900 | |4800
  sales | 3 |   4800 | |4800 | |4800
  sales | 4 |   4800 | |4800 | |4800
  personnel | 5 |   3500 | |3500 | |4800
  develop   | 7 |   4200 | |4200 | |4800
  develop   | 8 |   6000 |6000 | | |4800
  develop   | 9 |   4500 | |4500 | |4800
  develop   |10 |   5200 |5200 | | |4800
  develop   |11 |   5200 |5200 | | |4800


 Correct results:

  depname  | empno | salary | salary1 | salary2 | max_salary1 | max_salary2
 ---+---++-+-+-+-
  sales | 1 |   5000 |5000 | |5000 |4800
  personnel | 2 |   3900 | |3900 |5000 |4800
  sales | 3 |   4800 | |4800 |5000 |4800
  sales | 4 |   4800 | |4800 |5000 |4800
  personnel | 5 |   3500 | |3500 |5000 |4800
  develop   | 7 |   4200 | |4200 |5000 |4800
  develop   | 8 |   6000 |6000 | |6000 |4800
  develop   | 9 |   4500 | |4500 |6000 |4800
  develop   |10 |   5200 |5200 | |6000 |4800
  develop   |11 |   5200 |5200 | |6000 |4800


 This might be a good regression test once it's fixed.


Hmm, did you apply the latest patch correctly? My build can produce
right results, so I don't see why it isn't fixed. Make sure the lines
around 2420-2430 in planner.c like:
/*
 * must copyObject() to avoid args concatenating with 
each other.
 */
pulled_exprs = list_concat(pulled_exprs, 
copyObject(wfunc-args));

where copyObject() is added.


I'm not sure if this is related, another bug is found:

*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
***
*** 2246,2251  equal(void *a, void *b)
--- 2246,2252 
 break;
 case T_Aggref:
 retval = _equalAggref(a, b);
+break;
 case T_WFunc:
 retval = _equalWFunc(a, b);
 break;


don't laugh at... could you try it and test again?

If whole the new patch is needed, tell me then will send it.

 I'm at a bit of a loss to what to do now. Should I wait for your work
 Heikki? Or continue validating this patch?

 The best thing I can think to do right now is continue 

Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-25 Thread Hitoshi Harada
2008/11/25 Hitoshi Harada [EMAIL PROTECTED]:
 2008/11/25 Heikki Linnakangas [EMAIL PROTECTED]:
 Here's an updated patch, where the rows are fetched on-demand.

 Good! And I like the fetching args by number better. Let me take more
 time to look in detail...

I read more, and your spooling approach seems flexible for both now
and the furture. Looking at only current release, the frame with ORDER
BY is done by detecting peers in WinFrameGetArg() and add row number
of peers to winobj-currentpos. Actually if we have capability to
spool all rows we need on demand, the frame would be only a boundary
problem.

It seems to me that eval_windowaggregate() also should use frame APIs.
Only things we have to care is the shrinking frame, which is not
supported in this release. So I'd suggest winobj-aggregatedupto to be
removed. Is there objection?

Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-24 Thread Heikki Linnakangas

Hitoshi Harada wrote:

2008/11/20 David Rowley [EMAIL PROTECTED]:

I won't be around until Monday evening (Tuesday morning JST). I'll pickup
the review again there. It's really time for me to push on with this review.
The patch is getting closer to getting the thumbs up from me. I'm really
hoping that can happen on Monday. Then it's over to Heikki for more code
feedback.


This time I only fixed some bugs and rebased against the HEAD, but did
not refactored. I can figure out some part of what Heikki claimed but
not all, so I'd like to see what he will send and post another patch
if needed.


Thanks! Here's what I've got this far I merged your latest patch into 
this, as well as latest changes from CVS HEAD.


It's a bit of a mess, but here's the big changes I've done:
- Removed all the iterator stuff. You can just call 
WinGetPart/FrameGetArg repeatedly. It ought to be just as fast if done 
right in nodeWindow.c.
- Removed all the Shrinked/Extended stuff, as it's not needed until we 
have full support for window frames.
- Removed all the WinRow* functions, you can just call WinFrame/PartGet* 
 functions, using WINDOW_SEEK_CURRENT
- Removed WinFrame/PartGetTuple functions. They were only used for 
determining if two tuples are peer with each other, so I added a 
WinRowIsPeer function instead.
- Removed all the buffering strategy stuff. Currently, the whole 
partition is always materialized. That's of course not optimal; I'm 
still thinking we should just read the tuples from the outer node 
lazily, on-demand, instead. To avoid keeping all tuples in the partition 
in tuplestore, when not needed, should add an extra function to trim the 
tuplestore.


There's now a lot less code in nodeWindow.c, and I'm starting to 
understand most of what-s left :-).


TODO
- clean up the comments and other mess.
- Modify WinPart/FrameGetArg so that it's passed the parameter number 
instead of an Expr node.


I'll continue working on the executor, but please let me know what you 
think.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-24 Thread Hitoshi Harada
2008/11/24 Heikki Linnakangas [EMAIL PROTECTED]:
 Hitoshi Harada wrote:

 2008/11/20 David Rowley [EMAIL PROTECTED]:

 I won't be around until Monday evening (Tuesday morning JST). I'll pickup
 the review again there. It's really time for me to push on with this
 review.
 The patch is getting closer to getting the thumbs up from me. I'm really
 hoping that can happen on Monday. Then it's over to Heikki for more code
 feedback.

 This time I only fixed some bugs and rebased against the HEAD, but did
 not refactored. I can figure out some part of what Heikki claimed but
 not all, so I'd like to see what he will send and post another patch
 if needed.

 Thanks! Here's what I've got this far I merged your latest patch into this,
 as well as latest changes from CVS HEAD.

 It's a bit of a mess, but here's the big changes I've done:
 - Removed all the iterator stuff. You can just call WinGetPart/FrameGetArg
 repeatedly. It ought to be just as fast if done right in nodeWindow.c.
 - Removed all the Shrinked/Extended stuff, as it's not needed until we have
 full support for window frames.
 - Removed all the WinRow* functions, you can just call WinFrame/PartGet*
  functions, using WINDOW_SEEK_CURRENT
 - Removed WinFrame/PartGetTuple functions. They were only used for
 determining if two tuples are peer with each other, so I added a
 WinRowIsPeer function instead.
 - Removed all the buffering strategy stuff. Currently, the whole partition
 is always materialized. That's of course not optimal; I'm still thinking we
 should just read the tuples from the outer node lazily, on-demand, instead.
 To avoid keeping all tuples in the partition in tuplestore, when not needed,
 should add an extra function to trim the tuplestore.

 There's now a lot less code in nodeWindow.c, and I'm starting to understand
 most of what-s left :-).

 TODO
 - clean up the comments and other mess.
 - Modify WinPart/FrameGetArg so that it's passed the parameter number
 instead of an Expr node.

 I'll continue working on the executor, but please let me know what you
 think.


It is amazing changes and I like it. Actually I wanted to get it
slimer but hadn't found the way.

two points, frame concept and window_gettupleslot

If you keep this in your mind, please don't be annoyed but the current
frame concept is wrong.

-- yours
sample=# select depname, empno, salary, last_value(empno) over(order
by salary) from empsalary;
  depname  | empno | salary | last_value
---+---++
 personnel | 5 |   3500 |  5
 personnel | 2 |   3900 |  2
 develop   | 7 |   4200 |  7
 develop   | 9 |   4500 |  9
 sales | 4 |   4800 |  4
 sales | 3 |   4800 |  3
 sales |b1 |   5000 |  1
 develop   |11 |   5200 | 11
 develop   |10 |   5200 | 10
 develop   | 8 |   6000 |  8
(10 rows)

-- mine
sample=# select depname, empno, salary, last_value(empno) over(order
by salary) from empsalary;
  depname  | empno | salary | last_value
---+---++
 personnel | 5 |   3500 |  5
 personnel | 2 |   3900 |  2
 develop   | 7 |   4200 |  7
 develop   | 9 |   4500 |  9
 sales | 4 |   4800 |  3
 sales | 3 |   4800 |  3
 sales | 1 |   5000 |  1
 develop   |11 |   5200 | 10
 develop   |10 |   5200 | 10
 develop   | 8 |   6000 |  8
(10 rows)

Note that on empno=4 then last_value=4(yours)/3(mine), which means the
frame is applied to last_value() as well as nth_value() and
first_value(). Not only to aggregates. So these lines in nodeWindow.c,

/*
 * If there is an ORDER BY (we don't support other window frame
 * specifications yet), the frame runs from first row of the partition
 * to the current row. Otherwise the frame is the whole partition.
 */
if (((Window *) winobj-winstate-ss.ps.plan)-ordNumCols == 0)
max_pos = winobj-partition_rows - 1;
else
max_pos = winobj-currentpos;


max_pos is bogus. RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
means max_pos would be currentpos + rows_peers. I looked over the
patch and aggregates care it with winobj-aggregatedupto but what
about other window functions?

Then window_gettupleslot looks like killing performance in some cases.
What if the partition has millions of rows and wfunc1 seeks the head
and wfunc2 seeks the tail?

So as you point it is possible to define frame and partition while
scanning current rows rather than before scanning, I felt it'd cost
more. But I know this is work in progress, so you probably think about
the solutions.  What do you think?


Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-24 Thread Hitoshi Harada
2008/11/25 Heikki Linnakangas [EMAIL PROTECTED]:
 Hitoshi Harada wrote:

 If you keep this in your mind, please don't be annoyed but the current
 frame concept is wrong.

 ...

 Note that on empno=4 then last_value=4(yours)/3(mine), which means the
 frame is applied to last_value() as well as nth_value() and
 first_value(). Not only to aggregates. So these lines in nodeWindow.c,

 Yes, you're right.

 Hmm, I wonder if we should implement first_value, last_value and nth_value
 as regular aggregates. That way, all the window functions would operate on
 the partition, not caring about the frame, and all the aggregates would
 operate on the frame.

No way. Current specification that we don't get other frames than
with/without ORDER BY doesn't have conflict with your proposal.
However, thinking about the future, we decided to add window
aggregates with wfunc='t', with subfunc for shrinking frame
performance up. The direction we should head for is convert aggregates
to subset of window functions, not vice versa.


 Then window_gettupleslot looks like killing performance in some cases.
 What if the partition has millions of rows and wfunc1 seeks the head
 and wfunc2 seeks the tail?

 Yeah, we probably should do something about that. You used several different
 read pointers to the tuplestore in your patch, one for frame head, another
 for frame tail, another for partition head etc., but I think that
 potentially suffers from the same problem. Perhaps we should have one read
 pointer for each window function, plus one reading the current row? It seems
 likely that one window function is not going to hop around wildly, and the
 behavior wouldn't depend so much on what other window functions are being
 used.

Yes. My patch also has performance problem in seeking but is better
than one read pointer. So I like your proposal to attach a read
pointer with each wfuncs. I am not sure what kind of window functions
the user will define in the future, but for current use cases it comes
reasonable. Anyway, only one read pointer will make problems, I guess.

 So as you point it is possible to define frame and partition while
 scanning current rows rather than before scanning, I felt it'd cost
 more. But I know this is work in progress, so you probably think about
 the solutions.  What do you think?

 Here's an updated patch, where the rows are fetched on-demand.

Good! And I like the fetching args by number better. Let me take more
time to look in detail...

Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-19 Thread David Rowley
I wrote:
 All,

 This is my first patch review for PostgreSQL. I did submit a patch last
 commit fest (Boyer-Moore) so I feel I should review one this commit fest.
 I'm quite new to PostgreSQL so please don't rely on me totally. I'll do my
 best. Heikki is also reviewing this patch which makes me feel better.

 My aim is to get the author has much feed back as quickly as possible.
 For this reason I'm going to be breaking down my reviews into the
 following topics.

 1. Does patch apply cleanly?

 2. Any compiler warnings?

 3. Do the results follow the SQL standard?

 4. Performance Comparison, does it perform better than alternate ways of
 doing things. Self joins, sub queries etc.

 5. Performance, no comparison. How does it perform with larger tables?

I regret starting the individual thread for each function now. I was
expecting them to get longer. So I'll use this one for the remainder of the
standard conformance tests.

I covered all of the non-aggregate functions in previous tests. I will do
more final tests on them soon. Tonight I started testing the aggregate
functions with NULLable columns and I've found a small problem. I'll just
post my test script to make it easy for you to see what I mean.

BEGIN WORK;

CREATE TABLE auction_items (
  auctionid INT NOT NULL,
  category VARCHAR(32) NOT NULL,
  description VARCHAR(128) NOT NULL,
  highestbid INT NULL, -- NULL when no bids
  reserve INT NULL, -- NULL when no reserve
  started TIMESTAMP NOT NULL, -- When the action opened
  days INT NOT NULL, -- how many days the auction runs.
  
  CHECK(days  0),
  PRIMARY KEY (auctionid)
);
 

INSERT INTO auction_items
  VALUES(1,'BIKE','Broken Bicycle',NULL,100,NOW(),10);
INSERT INTO auction_items
  VALUES(2,'BIKE','Mountain Bike',120,NULL,NOW(),10);
INSERT INTO auction_items
  VALUES(3,'BIKE','Racer',230,NULL,NOW(),7);

INSERT INTO auction_items
  VALUES(4,'CAR','Bmw M3',5000,6000,NOW(),7);
INSERT INTO auction_items
  VALUES(5,'CAR','Audi S3',NULL,6500,NOW(),7);
INSERT INTO auction_items
  VALUES(6,'CAR','Holden Kingswood',NULL,2000,NOW(),7);

COMMIT; 

-- The following query gives incorrect results on the
-- maxhighbid column

SELECT auctionid,
   category,
   description,
   highestbid,
   reserve,
   MAX(highestbid) OVER (ORDER BY auctionid) AS maxhighbid,
   MAX(reserve) OVER() AS highest_reserve
FROM auction_items;

If you remove the highest_reserve column you get the correct results.

The bug also affects MIN, AVG, COUNT, STDDEV but not SUM.

I've also had a little look at the documentation included in the patch.
At first scan the only thing I can see that is wrong is

+ 
+para
+ Window functions are put in the commandSELECT/command list.
+ It is forbidden anywhere else such as literalGROUP BY/literal,

I think this needs to be rewritten as they are allowed in the ORDER BY
clause, works in patch and spec says the same.

Maybe it should read something more like:

Window functions are only permitted in the commandSELECT/command and
the commandORDER BY/command clause of the query. They are forbidden
anywhere else... etc.

I won't be around until Monday evening (Tuesday morning JST). I'll pickup
the review again there. It's really time for me to push on with this review.
The patch is getting closer to getting the thumbs up from me. I'm really
hoping that can happen on Monday. Then it's over to Heikki for more code
feedback.

Keep up the good work.

David.






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-09 Thread Hitoshi Harada
2008/11/9 David Rowley [EMAIL PROTECTED]:
 Hitoshi Harada wrote:
 I recreate the patch against current HEAD, in the git it's here:

 http://git.postgresql.org/?p=postgresql.git;a=commit;h=f88970d3c6fb9f99543
 d873bb7228f4c057c23e0

 I tested `patch -p1` with the attached and succeeded to make it work
 cleanly. It seems to me that this patch can be applied until another
 pg_proc.h entry would introduced in the HEAD.


 Regards,

 --
 Hitoshi Harada

 Thanks Hitoshi,

 You've done what I planned to do first thing this morning. I was planning to
 write a script that checks for the new lines then I could have removed
 those, patched then re-added them with the extra f.

 It looks like it's not easy to keep it working with CVS head. I counted 3
 patches that touched pg_proc.h just this month. I must have missed another
 as I still couldn't get it to apply. Hence the need for the script.

 Thanks for the new patch I'll get testing now.

 David.


I'm glad to hear that. Actually thanks to git it is quite easy for me
to merge my own repository with the HEAD. It tells me which lines are
new coming and which lines I modified are newer than else in CVS. This
is my first project where I use git (and I am not guru of cvs either
-:) but I love it now.
So feel free to tell me to recreate a new patch against head. Not so
heavy work as making on-the-fly script.

Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-09 Thread David Rowley
Using one of my original test tables I'm testing windowing functions with a
GROUP BY.

The following query works as I would expect.

-- Works
SELECT department,
   SUM(Salary),
   ROW_NUMBER() OVER (ORDER BY department),
   SUM(SUM(salary)) OVER (ORDER BY department)
FROM employees
GROUP BY department;


The following one fails with the message.
ERROR:  variable not found in subplan target list

-- Does not work.
SELECT department,
   SUM(Salary),
   ROW_NUMBER() OVER (ORDER BY department),
   SUM(SUM(salary)) OVER (ORDER BY department DESC)
FROM employees
GROUP BY department;

I just added the DESC to force it into creating 2 separate windows.

I can re-write the non working query to work using the following:


SELECT department,
   salary,
   ROW_NUMBER() OVER (ORDER BY department),
   SUM(salary) OVER (ORDER BY department DESC)
FROM (SELECT department,
 SUM(salary) AS salary
  FROM employees
  GROUP BY department
) t;




Testing with:

create table employees (
  id INT primary key,
  name varchar(30) not null,
  department varchar(30) not null,
  salary int not null,
  check (salary = 0)
);


insert into employees values(1,'Jeff','IT',1);
insert into employees values(2,'Sam','IT',12000);

insert into employees values(3,'Richard','Manager',3);
insert into employees values(4,'Ian','Manager',2);

insert into employees values(5,'John','IT',6);
insert into employees values(6,'Matthew','Director',6);


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-09 Thread David Rowley
Hitoshi Harada wrote:
 I'm glad to hear that. Actually thanks to git it is quite easy for me
 to merge my own repository with the HEAD. It tells me which lines are
 new coming and which lines I modified are newer than else in CVS. This
 is my first project where I use git (and I am not guru of cvs either
 -:) but I love it now.
 So feel free to tell me to recreate a new patch against head. Not so
 heavy work as making on-the-fly script.


CUME_DIST and COUNT(*) seem to be working as expected now with the original
tests.  I'll continue to test more and also test the other windowing
functions that I've not got to yet.

I did receive a small warning from the compiler when compiling nodeWindow.c.
It was complaining about a possible un-initialised f_shrinking. Just to keep
it quiet I changed the break's to return's after the elog() call.



*** nodeWindow.c2008-11-09 10:54:50.0 +
--- nodeWindow.c2008-11-09 10:39:24.0 +
***
*** 818,824 
case FRAME_CURRENT_RANGE:
/* UNSUPPORTED */
elog(ERROR, unknown preceding type %d,
node-preceding_type);
!   break;
case FRAME_VALUE_ROWS:
if (node-preceding_rows  0)
f_shrinking = 0;
--- 818,824 
case FRAME_CURRENT_RANGE:
/* UNSUPPORTED */
elog(ERROR, unknown preceding type %d,
node-preceding_type);
!   return; /* keep compiler quiet */
case FRAME_VALUE_ROWS:
if (node-preceding_rows  0)
f_shrinking = 0;
***
*** 922,928 
case FRAME_CURRENT_RANGE:
/* UNSUPPORTED */
elog(ERROR, unknown preceding type %d,
node-preceding_type);
!   break;
case FRAME_VALUE_ROWS:
if (node-preceding_rows =
winobj-p_currentpos + 1)
f_shrinking = 1;
--- 922,928 
case FRAME_CURRENT_RANGE:
/* UNSUPPORTED */
elog(ERROR, unknown preceding type %d,
node-preceding_type);
!   return; /* keep compiler quiet */
case FRAME_VALUE_ROWS:
if (node-preceding_rows =
winobj-p_currentpos + 1)
f_shrinking = 1;




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-09 Thread David Rowley
Hitoshi Harada wrote:
 I recreate the patch against current HEAD, in the git it's here:
 
 http://git.postgresql.org/?p=postgresql.git;a=commit;h=f88970d3c6fb9f99543
 d873bb7228f4c057c23e0
 
 I tested `patch -p1` with the attached and succeeded to make it work
 cleanly. It seems to me that this patch can be applied until another
 pg_proc.h entry would introduced in the HEAD.
 
 
 Regards,
 
 --
 Hitoshi Harada

Thanks Hitoshi,

You've done what I planned to do first thing this morning. I was planning to
write a script that checks for the new lines then I could have removed
those, patched then re-added them with the extra f.

It looks like it's not easy to keep it working with CVS head. I counted 3
patches that touched pg_proc.h just this month. I must have missed another
as I still couldn't get it to apply. Hence the need for the script.

Thanks for the new patch I'll get testing now.

David.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-08 Thread David Rowley
Hitoshi Harada Wrote:
  although attached is the whole (split) patch.

I'm having some trouble getting these patches to patch cleanly. I think it's
because of this that I can't get postgres to compile after applying the
patch.

It errors out at tuptoaster.c some constants seem to be missing from
fmgroids.h

If I open src/include/utils/fmgroids.h

The only constaint being defined is:

#define F__NULL_ 31

I'd assume it was a problem with my setup, but the CVS head compiles fine.

Let me know if I'm doing something wrong.


Here is the output from patch and the compile errors.

[EMAIL PROTECTED]:~/pg8.4/pgsql$ patch -p1  window_functions.patch.20081107-1
patching file doc/src/sgml/func.sgml
Hunk #1 succeeded at 10061 (offset 11 lines).
patching file doc/src/sgml/keywords.sgml
patching file doc/src/sgml/queries.sgml
patching file doc/src/sgml/query.sgml
patching file doc/src/sgml/ref/select.sgml
patching file doc/src/sgml/syntax.sgml
patching file src/backend/catalog/pg_aggregate.c
patching file src/backend/catalog/pg_proc.c
patching file src/backend/commands/explain.c
patching file src/backend/executor/Makefile
patching file src/backend/executor/execAmi.c
patching file src/backend/executor/execGrouping.c
patching file src/backend/executor/execProcnode.c
patching file src/backend/executor/execQual.c
Hunk #3 succeeded at 4167 (offset 14 lines).
patching file src/backend/executor/nodeAgg.c
patching file src/backend/executor/nodeWindow.c
patching file src/backend/nodes/copyfuncs.c
patching file src/backend/nodes/equalfuncs.c
patching file src/backend/nodes/nodeFuncs.c
patching file src/backend/nodes/outfuncs.c
patching file src/backend/nodes/readfuncs.c
patching file src/backend/optimizer/path/allpaths.c
patching file src/backend/optimizer/path/equivclass.c
patching file src/backend/optimizer/plan/createplan.c
patching file src/backend/optimizer/plan/planagg.c
patching file src/backend/optimizer/plan/planmain.c
patching file src/backend/optimizer/plan/planner.c
patching file src/backend/optimizer/plan/setrefs.c
patching file src/backend/optimizer/plan/subselect.c
patching file src/backend/optimizer/prep/prepjointree.c
patching file src/backend/optimizer/util/clauses.c
patching file src/backend/optimizer/util/tlist.c
patching file src/backend/optimizer/util/var.c
patching file src/backend/parser/analyze.c
patching file src/backend/parser/gram.y
Hunk #6 succeeded at 6433 (offset 8 lines).
Hunk #7 succeeded at 6443 (offset 8 lines).
Hunk #8 succeeded at 8212 (offset 9 lines).
Hunk #9 succeeded at 8220 (offset 9 lines).
Hunk #10 succeeded at 8232 (offset 9 lines).
Hunk #11 succeeded at 8244 (offset 9 lines).
Hunk #12 succeeded at 8256 (offset 9 lines).
Hunk #13 succeeded at 8272 (offset 9 lines).
Hunk #14 succeeded at 8284 (offset 9 lines).
Hunk #15 succeeded at 8306 (offset 9 lines).
Hunk #16 succeeded at 8754 (offset 9 lines).
Hunk #17 succeeded at 9629 (offset 9 lines).
Hunk #18 succeeded at 9637 (offset 9 lines).
Hunk #19 succeeded at 9703 (offset 9 lines).
Hunk #20 succeeded at 9720 (offset 9 lines).
Hunk #21 succeeded at 9772 (offset 9 lines).
Hunk #22 succeeded at 9959 (offset 9 lines).
Hunk #23 succeeded at 9980 (offset 9 lines).
patching file src/backend/parser/keywords.c
patching file src/backend/parser/parse_agg.c
patching file src/backend/parser/parse_clause.c
patching file src/backend/parser/parse_expr.c
patching file src/backend/parser/parse_func.c
patching file src/backend/rewrite/rewriteManip.c
patching file src/backend/utils/adt/Makefile
patching file src/backend/utils/adt/ruleutils.c
patching file src/backend/utils/adt/wfunc.c
patching file src/backend/utils/sort/tuplestore.c
patching file src/include/catalog/pg_attribute.h
patching file src/include/catalog/pg_class.h
[EMAIL PROTECTED]:~/pg8.4/pgsql$ patch -p1  window_functions.patch.20081107-2
(Stripping trailing CRs from patch.)
patching file src/include/catalog/pg_proc.h
Hunk #4 FAILED at 106.
1 out of 4 hunks FAILED -- saving rejects to file
src/include/catalog/pg_proc.h.rej
(Stripping trailing CRs from patch.)
patching file src/include/executor/executor.h
(Stripping trailing CRs from patch.)
patching file src/include/executor/nodeWindow.h
(Stripping trailing CRs from patch.)
patching file src/include/nodes/execnodes.h
Hunk #3 succeeded at 509 (offset 2 lines).
Hunk #4 succeeded at 1586 (offset 2 lines).
(Stripping trailing CRs from patch.)
patching file src/include/nodes/nodes.h
(Stripping trailing CRs from patch.)
patching file src/include/nodes/parsenodes.h
(Stripping trailing CRs from patch.)
patching file src/include/nodes/plannodes.h
(Stripping trailing CRs from patch.)
patching file src/include/nodes/primnodes.h
(Stripping trailing CRs from patch.)
patching file src/include/nodes/relation.h
(Stripping trailing CRs from patch.)
patching file src/include/optimizer/planmain.h
(Stripping trailing CRs from patch.)
patching file src/include/optimizer/var.h
(Stripping trailing CRs from patch.)
patching file src/include/parser/parse_agg.h

Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-08 Thread Hitoshi Harada
2008/11/9 David Rowley [EMAIL PROTECTED]:
 Hitoshi Harada Wrote:
  although attached is the whole (split) patch.

 I'm having some trouble getting these patches to patch cleanly. I think it's
 because of this that I can't get postgres to compile after applying the
 patch.

 It errors out at tuptoaster.c some constants seem to be missing from
 fmgroids.h

 If I open src/include/utils/fmgroids.h

 The only constaint being defined is:

 #define F__NULL_ 31

 I'd assume it was a problem with my setup, but the CVS head compiles fine.

 Let me know if I'm doing something wrong.

 patching file src/include/catalog/pg_proc.h
 Hunk #4 FAILED at 106.
 1 out of 4 hunks FAILED -- saving rejects to file
 src/include/catalog/pg_proc.h.rej

As it says pg_proc.h may have conflicts. The patch is not against HEAD
but based on the same as the previous (v08) patch. I am remote now so
I'm not sure but could you try to find conflicts in pg_proc.h? The
pg_proc catalog has been added 1 column called prociswfunc (bool) in
the patch. All you have to do is add 'f' in the middle of new-coming
lines.
Sorry for annoying, and I'll be back in hours.


Regards,

-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-08 Thread Tom Lane
David Rowley [EMAIL PROTECTED] writes:
 patching file src/include/catalog/pg_proc.h
 Hunk #4 FAILED at 106.
 1 out of 4 hunks FAILED -- saving rejects to file
 src/include/catalog/pg_proc.h.rej

I imagine you'll find that hunk #4 covers the entire DATA() body of
the file :-(.  It can't possibly apply cleanly if anyone's added or
altered pg_proc entries since the patch was made.

What you'd need to do is manually insert proiswfunc = 'f' entries in
all the existing DATA lines (this is usually not too hard with sed or
an emacs macro), then add whatever new functions the patch defines.
Even figuring out the latter from the patch representation can be a
serious PITA, since they'll be a few lines out of a multi-thousand-line
failed diff hunk.

I'm not sure if Hitoshi is in a position to submit the pg_proc changes
as two separate diffs --- one to add the new column and a separate one
to add in the new functions --- but it'd be a lot easier to deal with
merge issues if he could.

(Now I'll sit back and wait for some fanboy to claim that
$his_favorite_scm could solve this automatically ...)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-08 Thread Greg Stark
Instead of a patch it might be easier to submit the new columns as a  
perl script or sed command. We do something like that to make merging  
pg_proc easier.



greg

On 8 Nov 2008, at 01:36 PM, Tom Lane [EMAIL PROTECTED] wrote:


David Rowley [EMAIL PROTECTED] writes:

patching file src/include/catalog/pg_proc.h
Hunk #4 FAILED at 106.
1 out of 4 hunks FAILED -- saving rejects to file
src/include/catalog/pg_proc.h.rej


I imagine you'll find that hunk #4 covers the entire DATA() body of
the file :-(.  It can't possibly apply cleanly if anyone's added or
altered pg_proc entries since the patch was made.

What you'd need to do is manually insert proiswfunc = 'f' entries in
all the existing DATA lines (this is usually not too hard with sed or
an emacs macro), then add whatever new functions the patch defines.
Even figuring out the latter from the patch representation can be a
serious PITA, since they'll be a few lines out of a multi-thousand- 
line

failed diff hunk.

I'm not sure if Hitoshi is in a position to submit the pg_proc changes
as two separate diffs --- one to add the new column and a separate one
to add in the new functions --- but it'd be a lot easier to deal with
merge issues if he could.

(Now I'll sit back and wait for some fanboy to claim that
$his_favorite_scm could solve this automatically ...)

   regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-04 Thread Vladimir Sitnikov
 Quoted from SQL:2008
 If CUME_DIST is specified, then the relative *rank *of a row R is defined
 as
 NP/NR, where NP is defined
 to be the number of rows preceding or peer with R in the window ordering of
 the window partition of R
 and NR is defined to be the number of rows in the window partition of R.

 I guess there is a difference between  row_number and number of rows
preceding or peer with R

number of rows preceding or peer with R == count(*) over (order by salary)

As far as I understand, the following query should calculate cume_dist
properly (and it does so in Oracle):

SELECT name,CAST(r AS FLOAT) / c, cd
FROM (SELECT name,
COUNT(*) OVER(ORDER BY salary) as r,
COUNT(*) OVER() AS c,
CUME_DIST() OVER(ORDER BY salary) AS cd
 FROM employees
) t;

Sincerely yours,
Vladimir Sitnikov


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-04 Thread Hitoshi Harada
2008/11/5 Vladimir Sitnikov [EMAIL PROTECTED]:

 Quoted from SQL:2008
 If CUME_DIST is specified, then the relative rank of a row R is defined
 as
 NP/NR, where NP is defined
 to be the number of rows preceding or peer with R in the window ordering
 of
 the window partition of R
 and NR is defined to be the number of rows in the window partition of R.

 I guess there is a difference between  row_number and number of rows
 preceding or peer with R

 number of rows preceding or peer with R == count(*) over (order by salary)

 As far as I understand, the following query should calculate cume_dist
 properly (and it does so in Oracle):

 SELECT name,CAST(r AS FLOAT) / c, cd
 FROM (SELECT name,
 COUNT(*) OVER(ORDER BY salary) as r,
 COUNT(*) OVER() AS c,
 CUME_DIST() OVER(ORDER BY salary) AS cd
  FROM employees
 ) t;


I'm afraid I misinterpreted it. As you say,

number of rows preceding == row_number()

and

rumber of rows preceding or peers to R != row_number() (neither rank())

peers to R in the window function context means same rows by the
ORDER BY clause, so in the first example, id=5 and id=6 are peers and
in both rows, NP should be 6, as Oracle and Sybase say.

Even though I understand the definition, your suggestion of COUNT(*)
OVER (ORDER BY salary) doesn't make sense. In the patch, it simply
returns the same value as row_number() but is it wrong, too?

Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-04 Thread Vladimir Sitnikov


 Even though I understand the definition, your suggestion of COUNT(*)
 OVER (ORDER BY salary) doesn't make sense.

Why does not that make sense?
I have not read the spec, however Oracle has a default window specification
in case there is only an order by clause. The default window is range
between unbounded preceding and current row.

count(*) over (order by salary range between unbounded preceding and
current row) is perfectly identical to the number of rows preceding or
peers to R by the definition, isn't it? I see here a word-by-word
translation from SQL to the English and vice versa.

If the patch returns row_number it is wrong since there is no way for
row_number to be a number of rows preceding or peer with R, is there?

Regards,
Vladimir Sitnikov


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-04 Thread Hitoshi Harada
2008/11/5 Vladimir Sitnikov [EMAIL PROTECTED]:

 Even though I understand the definition, your suggestion of COUNT(*)
 OVER (ORDER BY salary) doesn't make sense.

 Why does not that make sense?
 I have not read the spec, however Oracle has a default window specification
 in case there is only an order by clause. The default window is range
 between unbounded preceding and current row.

 count(*) over (order by salary range between unbounded preceding and
 current row) is perfectly identical to the number of rows preceding or
 peers to R by the definition, isn't it? I see here a word-by-word
 translation from SQL to the English and vice versa.

 If the patch returns row_number it is wrong since there is no way for
 row_number to be a number of rows preceding or peer with R, is there?


I've got it.
I had thought that implicit window framing specified by ORDER BY
clause (such like above) would mean ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW. But actually reading the spec more closely it says:

Otherwise, WF consists of all rows of the partition of R that precede
R or are peers of R in the
window ordering of the window partition defined by the window ordering clause.

So it means RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW as you
pointed. And the result of count(*) OVER (ORDER BY salary) doesn't
equal to row_number().

Now my assumption is broken. Let me take time to think about how to solve it...


Regards,



-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers