Re: [HACKERS] Window Functions patch v06

2008-10-13 Thread David Rowley
Hitoshi Harada wrote:
I made up my mind to scratch former window functions and redesigned
completely new execution model, based on the discussion with Heikki.
Attached is the v06 against HEAD today.
http://umitanuki.net/pgsql/wfv06/design.html

First off, fantastic work!

In my eyes this and WITH RECURSIVE are a big step for both Postgres and open
source RBDMS'.

Only, one small query with LEAD() and LAG()

Going by http://www.wiscorp.com/sql200n.zip 

The lead and lag functions each take three arguments, a value expression
VE, an exact numeric literal
OFFSET, and a value expression DEFAULT. For each row R within the window
partition P of R defined by
a window structure descriptor, the lag function returns the value of VE
evaluated on a row that is OFFSET
number of rows before R within P, and the lead function returns the value of
VE evaluated on a row that is
OFFSET number of rows after R within P. The value of DEFAULT is returned as
the result if there is no row
corresponding to the OFFSET number of rows before R within P (for the lag
function) or after R within P (for
the lead function). In addition, RESPECT NULLS or IGNORE NULLS can be
specified to indicate whether
the rows within P for which VE evaluates to the null value are preserved or
eliminated

So going by that:
SELECT name,LAG(name,1,'None') OVER (ORDER BY employeeid) FROM employee;

Would use 'None' for rows that would be out of the bounds of the window.
 
The current patch only seems to accept 2 arguments.
ERROR:  function lag(character varying, integer, unknown) does not exist



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Window Functions patch v06

2008-10-13 Thread Hitoshi Harada
2008/10/14 David Rowley [EMAIL PROTECTED]:
 Hitoshi Harada wrote:
I made up my mind to scratch former window functions and redesigned
completely new execution model, based on the discussion with Heikki.
Attached is the v06 against HEAD today.
http://umitanuki.net/pgsql/wfv06/design.html

 First off, fantastic work!

 In my eyes this and WITH RECURSIVE are a big step for both Postgres and open
 source RBDMS'.

 Only, one small query with LEAD() and LAG()

 Going by http://www.wiscorp.com/sql200n.zip

 The lead and lag functions each take three arguments, a value expression
 VE, an exact numeric literal
 OFFSET, and a value expression DEFAULT. For each row R within the window
 partition P of R defined by
 a window structure descriptor, the lag function returns the value of VE
 evaluated on a row that is OFFSET
 number of rows before R within P, and the lead function returns the value of
 VE evaluated on a row that is
 OFFSET number of rows after R within P. The value of DEFAULT is returned as
 the result if there is no row
 corresponding to the OFFSET number of rows before R within P (for the lag
 function) or after R within P (for
 the lead function). In addition, RESPECT NULLS or IGNORE NULLS can be
 specified to indicate whether
 the rows within P for which VE evaluates to the null value are preserved or
 eliminated

 So going by that:
 SELECT name,LAG(name,1,'None') OVER (ORDER BY employeeid) FROM employee;

 Would use 'None' for rows that would be out of the bounds of the window.

 The current patch only seems to accept 2 arguments.
 ERROR:  function lag(character varying, integer, unknown) does not exist




Thanks for your feedback.

I agree I need to work on that. Also from the spec, RESPECT NULLS /
IGNORE NULLS may be specified but not supported yet. This syntax
specification is out of the postgres general function call so I wonder
if those functions are treated specially or not.

Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Window Functions patch v06

2008-10-13 Thread Tom Lane
Hitoshi Harada [EMAIL PROTECTED] writes:
 I agree I need to work on that. Also from the spec, RESPECT NULLS /
 IGNORE NULLS may be specified but not supported yet. This syntax
 specification is out of the postgres general function call so I wonder
 if those functions are treated specially or not.

Egad, the SQL committee has certainly been taken over by creeping
COBOL-itis when it comes to inventing random new syntax ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Window Functions patch v06

2008-10-12 Thread Hitoshi Harada
I confirmed this on Oracle:

select last_value(id) over (order by id) as last_id, id from foo;

LAST_ID ID
--- --
0   0
1   1
2   2
3   3
4   4
5   5
6   6
7   7
8   8
9   9
10  10

So when you specify ORDER BY clause on window definition, the frame
always contains rows preceding from current row, not only on
aggregate.



 Doing a bit of poking around in the spec and the Oracle documentation,
 I think (but I'm not 100% sure) that the results returned were correct
 for the query:

 postgres=# select a, sum(a) over () from generate_series(1,10) a;
 ERROR:  either PARTITION BY or ORDER BY must be specified in window clause

The empty window definition is forbidden temporarily because I didn't
know how to take it although there's no problem to execute it. I
haven't found any description on spec yet, I know at least Oracle
allows it.

I can find how to do it with the new (window execution model) design,
(and the design is suitable to fix it above,) but at first before
going into trivial specs, I would like core hackers to review the
model is better than before or not. Thank you for your cooperation.


Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Window Functions patch v06

2008-10-11 Thread Hitoshi Harada
I'm afraid the patch was too huge, trying to send it again without attachment...

I made up my mind to scratch former window functions and redesigned
completely new execution model, based on the discussion with Heikki.
Attached is the v06 against HEAD today.

http://umitanuki.net/pgsql/wfv06/design.html

The concrete design concept is noted in nodeWindow.c, quoted here:

 * Note that the solid concept of the window functions is they can access
 * arbitrary rows within the frame as they want. As far as we keep the rule
 * of thumb, any kind of optimization is allowed.

And I threw away an idea that window functions are derived from
aggregates, then window functions are only simple single function,
which have capability to arbitrary random row access. By this mean,
aggregates are special subset of window functions so that aggregates
are treated special case in nodeWindow.c, processed in
eval_windowaggregate(), while normal window functions are in
eval_windowfunction().

By giving random row access, not only able to integrate aggregate to
window mechanism but we now have lead()/lag() support. Furthermore, I
can see FRAME clause (i.e. moving frame aggregate/cumulative
computation) can be added for 8.4 with this design though not
implemented yet.

Tom,
I added a function to tuplestore.c,
tuplestore_write_to_read_pointer(), that allows to copy write pointer
position to the one of readptrs. This is required for my case because
when the functions access randomly rows that the executor haven't read
out will be fetched (i.e. last row of the frame). And I hope you see
the nth_value() function require really arbitrary row access.

P.S.
As a volunteer, my spare time to work on this project is getting
limited, as my daily work is getting busy for the end of the year
(this is a Japanese traditional business sense -:). I don't know how
much I can work for the next commit fest but I can't believe it if
this feature would be missed in 8.4 *only* for my own tiny business
clients!

Regards,



-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Window Functions patch v06

2008-10-11 Thread Ian Caulfield
2008/10/11 Hitoshi Harada [EMAIL PROTECTED]

 I'm afraid the patch was too huge, trying to send it again without 
 attachment...

 I made up my mind to scratch former window functions and redesigned
 completely new execution model, based on the discussion with Heikki.
 Attached is the v06 against HEAD today.

Small nit - I get this from the following query:

postgres=# select a, sum(a) over (order by a) from generate_series(1,10) a;
 a  | sum
+-
  1 |  55
  2 |  55
  3 |  55
  4 |  55
  5 |  55
  6 |  55
  7 |  55
  8 |  55
  9 |  55
 10 |  55
(10 rows)

From what I can tell of the spec, the 'sum' column should contain a
running sum (ie 1,3,6 etc). You mention that window frames haven't
been implemented, but it seems like this case should return an error
rather than the wrong results...

Thanks,
Ian

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Window Functions patch v06

2008-10-11 Thread Hitoshi Harada
I am drunk. I forgot cc to -hackers. The talk between me and Ian was like that.

2008/10/12 Hitoshi Harada [EMAIL PROTECTED]:
 2008/10/12 Ian Caulfield [EMAIL PROTECTED]:
 2008/10/11 Hitoshi Harada [EMAIL PROTECTED]:
 2008/10/12 Ian Caulfield [EMAIL PROTECTED]:
 2008/10/11 Hitoshi Harada [EMAIL PROTECTED]

 I'm afraid the patch was too huge, trying to send it again without 
 attachment...

 I made up my mind to scratch former window functions and redesigned
 completely new execution model, based on the discussion with Heikki.
 Attached is the v06 against HEAD today.

 Small nit - I get this from the following query:

 postgres=# select a, sum(a) over (order by a) from generate_series(1,10) a;
  a  | sum
 +-
  1 |  55
  2 |  55
  3 |  55
  4 |  55
  5 |  55
  6 |  55
  7 |  55
  8 |  55
  9 |  55
  10 |  55
 (10 rows)

 From what I can tell of the spec, the 'sum' column should contain a
 running sum (ie 1,3,6 etc). You mention that window frames haven't
 been implemented, but it seems like this case should return an error
 rather than the wrong results...

 Thanks,
 Ian


 Thanks for notice.
 I didn't know that. Ordered aggregate has only rows until current row?
 I guess I need read more spec.

 That's how I read it, the relevant part of the spec seems to be:

 5) WD also defines for each row R of RTE the window frame WF of R,
 consisting of a collection of rows. WF
 is defined as follows.

 Case:
 a) If WD has no window framing clause, then

 Case:
 i) If the window ordering clause of WD is not present, then WF is the
 window partition of R.
 ii) Otherwise, WF consists of all rows of the partition of R that
 precede R or are peers of R in the
 window ordering of the window partition defined by the window ordering 
 clause.

 Ian


 It seems you're right. I will fix it soon probably.
 By this spec, some of the regression tests including nth_value() etc.
 are wrong. Generally we hold only preceding rows in the frame when
 ORDER BY is specified, not only aggregate case.
 Thanks again.


 Regards,


 --
 Hitoshi Harada




-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Window Functions patch v06

2008-10-11 Thread Ian Caulfield
2008/10/11 Hitoshi Harada [EMAIL PROTECTED]:
 I am drunk. I forgot cc to -hackers. The talk between me and Ian was like 
 that.

 2008/10/12 Hitoshi Harada [EMAIL PROTECTED]:
 2008/10/12 Ian Caulfield [EMAIL PROTECTED]:
 2008/10/11 Hitoshi Harada [EMAIL PROTECTED]:
 2008/10/12 Ian Caulfield [EMAIL PROTECTED]:
 2008/10/11 Hitoshi Harada [EMAIL PROTECTED]

 I'm afraid the patch was too huge, trying to send it again without 
 attachment...

 I made up my mind to scratch former window functions and redesigned
 completely new execution model, based on the discussion with Heikki.
 Attached is the v06 against HEAD today.

 Small nit - I get this from the following query:

 postgres=# select a, sum(a) over (order by a) from generate_series(1,10) 
 a;
  a  | sum
 +-
  1 |  55
  2 |  55
  3 |  55
  4 |  55
  5 |  55
  6 |  55
  7 |  55
  8 |  55
  9 |  55
  10 |  55
 (10 rows)

 From what I can tell of the spec, the 'sum' column should contain a
 running sum (ie 1,3,6 etc). You mention that window frames haven't
 been implemented, but it seems like this case should return an error
 rather than the wrong results...

 Thanks,
 Ian


 Thanks for notice.
 I didn't know that. Ordered aggregate has only rows until current row?
 I guess I need read more spec.

 That's how I read it, the relevant part of the spec seems to be:

 5) WD also defines for each row R of RTE the window frame WF of R,
 consisting of a collection of rows. WF
 is defined as follows.

 Case:
 a) If WD has no window framing clause, then

 Case:
 i) If the window ordering clause of WD is not present, then WF is the
 window partition of R.
 ii) Otherwise, WF consists of all rows of the partition of R that
 precede R or are peers of R in the
 window ordering of the window partition defined by the window ordering 
 clause.

 Ian


 It seems you're right. I will fix it soon probably.
 By this spec, some of the regression tests including nth_value() etc.
 are wrong. Generally we hold only preceding rows in the frame when
 ORDER BY is specified, not only aggregate case.
 Thanks again.

Doing a bit of poking around in the spec and the Oracle documentation,
I think (but I'm not 100% sure) that the results returned were correct
for the query:

postgres=# select a, sum(a) over () from generate_series(1,10) a;
ERROR:  either PARTITION BY or ORDER BY must be specified in window clause

Howver, someone who is better at parsing the spec than I am probably
ought to check...

Ian

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers