Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2014-07-21 Thread Jeff Davis
On Thu, 2014-07-10 at 23:43 -0700, Jeff Davis wrote:
 On Mon, 2014-07-07 at 01:21 -0700, Jeff Davis wrote:
  On Sun, 2014-07-06 at 21:11 -0700, Jeff Davis wrote:
   On Wed, 2014-04-16 at 12:50 +0100, Nicholas White wrote:
Thanks for the detailed feedback, I'm sorry it took so long to
incorporate it. I've attached the latest version of the patch, fixing
in particular:
 
 As innocent as this patch seemed at first, it actually opens up a lot of
 questions.
 
 Attached is the (incomplete) edit of the patch so far.

I haven't received much feedback so far on my changes, and I don't think
I will finish hacking on this in the next few days, so I will mark it
returned with feedback.

I don't think it's too far away, but my changes are substantial enough
(and incomplete enough) that some feedback is required.

Regards,
Jeff Davis




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2014-07-11 Thread Jeff Davis
On Mon, 2014-07-07 at 01:21 -0700, Jeff Davis wrote:
 On Sun, 2014-07-06 at 21:11 -0700, Jeff Davis wrote:
  On Wed, 2014-04-16 at 12:50 +0100, Nicholas White wrote:
   Thanks for the detailed feedback, I'm sorry it took so long to
   incorporate it. I've attached the latest version of the patch, fixing
   in particular:

As innocent as this patch seemed at first, it actually opens up a lot of
questions.

Attached is the (incomplete) edit of the patch so far.

Changes from your patch:

* changed test to be locale-insensitive
* lots of refactoring in the execution itself
* fix offset 0 case
* many test improvements
* remove bitmapset and just use an array bitmap
* fix error message typo

Open Issues:

I don't think exposing the frame options is a good idea. That's an
internal concept now, but putting it in windowapi.h will mean that it
needs to live forever.

The struct is private, so there's no easy hack to access the frame
options directly. That means that we need to work with the existing API
functions, which is OK because I think that everything we want to do can
go into WinGetFuncArgInPartition(). If we do the same thing for
WinGetFuncArgInFrame(), then first/last/nth also work.

That leaves the questions:
 * Do we want IGNORE NULLS to work for every window function, or only a
specified subset?
 * If it only works for some window functions, is that hard-coded or
driven by the catalog?
 * If it works for all window functions, could it cause some
pre-existing functions to behave strangely?

Also, I'm re-thinking Dean's comments here:

http://www.postgresql.org/message-id/CAEZATCWT3=P88nv2ThTjvRDLpOsVtAPxaVPe=MaWe-x=guh...@mail.gmail.com

He brings up a few good points. I will look into the frame vs. window
option, though it looks like you've already at least fixed the crash.
His other point about actually eliminating the NULLs from the window
itself is interesting, but I don't think it works. IGNORE NULLS ignores
*other* rows with NULL, but (per spec) does not ignore the current row.
That sounds awkward if you've already removed the NULL rows from the
window, but maybe there's something that could work.

And there are a few other things I'm still looking into, but hopefully
they don't raise new issues.

Regards,
Jeff Davis

*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 13164,13169  SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
--- 13164,13170 
   lag(replaceable class=parametervalue/replaceable typeany/
   [, replaceable class=parameteroffset/replaceable typeinteger/
   [, replaceable class=parameterdefault/replaceable typeany/ ]])
+  [ { RESPECT | IGNORE } NULLS ]
 /function
/entry
entry
***
*** 13178,13184  SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
 replaceable class=parameterdefault/replaceable are evaluated
 with respect to the current row.  If omitted,
 replaceable class=parameteroffset/replaceable defaults to 1 and
!replaceable class=parameterdefault/replaceable to null
/entry
   /row
  
--- 13179,13187 
 replaceable class=parameterdefault/replaceable are evaluated
 with respect to the current row.  If omitted,
 replaceable class=parameteroffset/replaceable defaults to 1 and
!replaceable class=parameterdefault/replaceable to null. If
!literalIGNORE NULLS/ is specified then the function will be evaluated
!as if the rows containing nulls didn't exist.
/entry
   /row
  
***
*** 13191,13196  SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
--- 13194,13200 
   lead(replaceable class=parametervalue/replaceable typeany/
[, replaceable class=parameteroffset/replaceable typeinteger/
[, replaceable class=parameterdefault/replaceable typeany/ ]])
+  [ { RESPECT | IGNORE } NULLS ]
 /function
/entry
entry
***
*** 13205,13211  SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
 replaceable class=parameterdefault/replaceable are evaluated
 with respect to the current row.  If omitted,
 replaceable class=parameteroffset/replaceable defaults to 1 and
!replaceable class=parameterdefault/replaceable to null
/entry
   /row
  
--- 13209,13217 
 replaceable class=parameterdefault/replaceable are evaluated
 with respect to the current row.  If omitted,
 replaceable class=parameteroffset/replaceable defaults to 1 and
!replaceable class=parameterdefault/replaceable to null. If
!literalIGNORE NULLS/ is specified then the function will be evaluated
!as if the rows containing nulls didn't exist.
/entry
   /row
  
***
*** 13299,13309  SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y 

Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2014-07-07 Thread Jeff Davis
On Sun, 2014-07-06 at 21:11 -0700, Jeff Davis wrote:
 On Wed, 2014-04-16 at 12:50 +0100, Nicholas White wrote:
  Thanks for the detailed feedback, I'm sorry it took so long to
  incorporate it. I've attached the latest version of the patch, fixing
  in particular:

Looking a little more:

* No tests exercise non-const offsets

* No tests for default clauses with IGNORE NULLS

* The use of bitmapsets is quite ugly. It would be nice if the API would
grow the BMS within the memory context in which it was allocated, but I
don't even see that the BMS is necessary. Why not just allocate a
fixed-size array of bits, and forget the BMS?

* Is there a reason you're leaving out first_value/last_value/nth_value?
I think they could be supported without a lot of extra work.

Regards,
Jeff Davis




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2014-07-06 Thread Jeff Davis
On Wed, 2014-04-16 at 12:50 +0100, Nicholas White wrote:
 Thanks for the detailed feedback, I'm sorry it took so long to
 incorporate it. I've attached the latest version of the patch, fixing
 in particular:

I took a good look at this today.

* It fails for offset of 0 with IGNORE NULLS. Fixed (trivial).

* The tests are locale-sensitive. Fixed (trivial).

* The leadlag_common function is just way too long. I refactored the
IGNORE NULLS code into it's own function (win_get_arg_ignore_nulls())
with the same API as WinGetFuncArgInPartition. This cleans things up
substantially, and makes it easier to add useful comments.

* We're implementing the former semantics, so we'll need to correct
slightly sounds arbitrary, but it's mandated by the standard. That
should be clarified.

* I did a lot of other refactoring within win_get_arg_ignore_nulls for
the constant case. I'm not done yet, and I'm not 100% sure it's a net
gain, because the code ended up a little longer. But the previous
version was quite hard to follow because of so many special cases around
positive versus negative offsets. For instance, having the negative
'next' value in your code actually means something quite different than
when it's positive, but it took me a while to figure that out, so I made
it into two variables. I hope my code is moving it in a direction that's
easier for others to understand.

Please let me know if you think I am making things worse with my
refactorings. Otherwise I'll keep working on this and hopefully get it
committable soon.

The attached patch is still a WIP; just posting it here in case you see
any red flags.

Regards,
Jeff Davis

*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 13164,13169  SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
--- 13164,13170 
   lag(replaceable class=parametervalue/replaceable typeany/
   [, replaceable class=parameteroffset/replaceable typeinteger/
   [, replaceable class=parameterdefault/replaceable typeany/ ]])
+  [ { RESPECT | IGNORE } NULLS ]
 /function
/entry
entry
***
*** 13178,13184  SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
 replaceable class=parameterdefault/replaceable are evaluated
 with respect to the current row.  If omitted,
 replaceable class=parameteroffset/replaceable defaults to 1 and
!replaceable class=parameterdefault/replaceable to null
/entry
   /row
  
--- 13179,13187 
 replaceable class=parameterdefault/replaceable are evaluated
 with respect to the current row.  If omitted,
 replaceable class=parameteroffset/replaceable defaults to 1 and
!replaceable class=parameterdefault/replaceable to null. If
!literalIGNORE NULLS/ is specified then the function will be evaluated
!as if the rows containing nulls didn't exist.
/entry
   /row
  
***
*** 13191,13196  SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
--- 13194,13200 
   lead(replaceable class=parametervalue/replaceable typeany/
[, replaceable class=parameteroffset/replaceable typeinteger/
[, replaceable class=parameterdefault/replaceable typeany/ ]])
+  [ { RESPECT | IGNORE } NULLS ]
 /function
/entry
entry
***
*** 13205,13211  SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
 replaceable class=parameterdefault/replaceable are evaluated
 with respect to the current row.  If omitted,
 replaceable class=parameteroffset/replaceable defaults to 1 and
!replaceable class=parameterdefault/replaceable to null
/entry
   /row
  
--- 13209,13217 
 replaceable class=parameterdefault/replaceable are evaluated
 with respect to the current row.  If omitted,
 replaceable class=parameteroffset/replaceable defaults to 1 and
!replaceable class=parameterdefault/replaceable to null. If
!literalIGNORE NULLS/ is specified then the function will be evaluated
!as if the rows containing nulls didn't exist.
/entry
   /row
  
***
*** 13299,13309  SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
note
 para
  The SQL standard defines a literalRESPECT NULLS/ or
! literalIGNORE NULLS/ option for functionlead/, functionlag/,
! functionfirst_value/, functionlast_value/, and
! functionnth_value/.  This is not implemented in
! productnamePostgreSQL/productname: the behavior is always the
! same as the standard's default, namely literalRESPECT NULLS/.
  Likewise, the standard's literalFROM FIRST/ or literalFROM LAST/
  option for functionnth_value/ is not implemented: only the
  default literalFROM FIRST/ behavior is supported.  (You can achieve
--- 

Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2014-04-16 Thread Nicholas White
Thanks for the detailed feedback, I'm sorry it took so long to
incorporate it. I've attached the latest version of the patch, fixing
in particular:

 We have this block:
I've re-written this so it only does a single pass through the window
definitions (my patch originally added a second pass), and only does
the clone if required.

 In gram.y there are some spurious whitespaces at end-of-line.
Fixed - I didn't know about diff --check, it's very useful!

 Also, in parsenodes.h, you had the [MANDATORY] and such tags.
I've re-written the comments (without tags) to make it much easier to
understand . I agree they were ugly!

Exactly what case does the in this case phrase refer to?
Clarified in the comments

A style issue.  You have this:
Fixed

 And a final style comment.
Fixed

 Finally, I'm not really sure about the column you added to the regression 
 tests table.
Indeed, it was a bit artificial. I've re-written the tests to use a
separate table as you suggest.

Thanks -

Nick
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 0809a6d..5da852e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -13185,6 +13185,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
  lag(replaceable class=parametervalue/replaceable typeany/
  [, replaceable class=parameteroffset/replaceable typeinteger/
  [, replaceable class=parameterdefault/replaceable typeany/ ]])
+ [ { RESPECT | IGNORE } NULLS ]
/function
   /entry
   entry
@@ -13199,7 +13200,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
replaceable class=parameterdefault/replaceable are evaluated
with respect to the current row.  If omitted,
replaceable class=parameteroffset/replaceable defaults to 1 and
-   replaceable class=parameterdefault/replaceable to null
+   replaceable class=parameterdefault/replaceable to null. If
+   literalIGNORE NULLS/ is specified then the function will be evaluated
+   as if the rows containing nulls didn't exist.
   /entry
  /row
 
@@ -13212,6 +13215,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
  lead(replaceable class=parametervalue/replaceable typeany/
   [, replaceable class=parameteroffset/replaceable typeinteger/
   [, replaceable class=parameterdefault/replaceable typeany/ ]])
+ [ { RESPECT | IGNORE } NULLS ]
/function
   /entry
   entry
@@ -13226,7 +13230,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
replaceable class=parameterdefault/replaceable are evaluated
with respect to the current row.  If omitted,
replaceable class=parameteroffset/replaceable defaults to 1 and
-   replaceable class=parameterdefault/replaceable to null
+   replaceable class=parameterdefault/replaceable to null. If
+   literalIGNORE NULLS/ is specified then the function will be evaluated
+   as if the rows containing nulls didn't exist.
   /entry
  /row
 
@@ -13320,11 +13326,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
   note
para
 The SQL standard defines a literalRESPECT NULLS/ or
-literalIGNORE NULLS/ option for functionlead/, functionlag/,
-functionfirst_value/, functionlast_value/, and
-functionnth_value/.  This is not implemented in
-productnamePostgreSQL/productname: the behavior is always the
-same as the standard's default, namely literalRESPECT NULLS/.
+literalIGNORE NULLS/ option for functionfirst_value/,
+functionlast_value/, and functionnth_value/.  This is not
+implemented in productnamePostgreSQL/productname: the behavior is
+always the same as the standard's default, namely literalRESPECT NULLS/.
 Likewise, the standard's literalFROM FIRST/ or literalFROM LAST/
 option for functionnth_value/ is not implemented: only the
 default literalFROM FIRST/ behavior is supported.  (You can achieve
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 2fcc630..5cea825 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -2431,7 +2431,6 @@ window_gettupleslot(WindowObject winobj, int64 pos, TupleTableSlot *slot)
  * API exposed to window functions
  ***/
 
-
 /*
  * WinGetPartitionLocalMemory
  *		Get working memory that lives till end of partition processing
@@ -2467,6 +2466,17 @@ WinGetCurrentPosition(WindowObject winobj)
 }
 
 /*
+ * WinGetFrameOptions
+ *		Returns the frame option flags
+ */
+int
+WinGetFrameOptions(WindowObject winobj)
+{
+	Assert(WindowObjectIsValid(winobj));
+	return winobj-winstate-frameOptions;
+}
+
+/*
  * WinGetPartitionRowCount
  *		Return total number of rows contained in the current partition.
  *
diff --git a/src/backend/parser/gram.y 

Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-11-29 Thread Bruce Momjian
On Fri, Jul  5, 2013 at 02:36:10PM -0700, Jeff Davis wrote:
 On Fri, 2013-06-28 at 13:26 -0400, Robert Haas wrote:
  I haven't really reviewed the windowing-related code in depth; I
  thought Jeff might jump back in for that part of it.  Jeff, is that
  something you're planning to do?
 
 Yes, getting back into this patch now after a bit of delay.

Jeff, any status on this?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-11-29 Thread Jeff Davis
On Fri, 2013-11-29 at 16:18 -0500, Bruce Momjian wrote:
 On Fri, Jul  5, 2013 at 02:36:10PM -0700, Jeff Davis wrote:
  On Fri, 2013-06-28 at 13:26 -0400, Robert Haas wrote:
   I haven't really reviewed the windowing-related code in depth; I
   thought Jeff might jump back in for that part of it.  Jeff, is that
   something you're planning to do?
  
  Yes, getting back into this patch now after a bit of delay.
 
 Jeff, any status on this?

The last message was a review from Alvaro that hasn't been addressed
yet.

Right now I am looking at the extension templates patch. But this patch
is fairly close, so if Nicholas doesn't get to looking at it, I'll see
what I can do.

Regards,
Jeff Davis




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-11-29 Thread Bruce Momjian
On Fri, Nov 29, 2013 at 01:21:27PM -0800, Jeff Davis wrote:
 On Fri, 2013-11-29 at 16:18 -0500, Bruce Momjian wrote:
  On Fri, Jul  5, 2013 at 02:36:10PM -0700, Jeff Davis wrote:
   On Fri, 2013-06-28 at 13:26 -0400, Robert Haas wrote:
I haven't really reviewed the windowing-related code in depth; I
thought Jeff might jump back in for that part of it.  Jeff, is that
something you're planning to do?
   
   Yes, getting back into this patch now after a bit of delay.
  
  Jeff, any status on this?
 
 The last message was a review from Alvaro that hasn't been addressed
 yet.
 
 Right now I am looking at the extension templates patch. But this patch
 is fairly close, so if Nicholas doesn't get to looking at it, I'll see
 what I can do.

Thank you.  I see it is looking very active on the commit-fest:  :-)

https://commitfest.postgresql.org/action/patch_view?id=1096

Thanks for all the work.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-07-05 Thread Jeff Davis
On Mon, 2013-07-01 at 07:40 -0400, Robert Haas wrote:
 On Sun, Jun 30, 2013 at 10:07 PM, Nicholas White n.j.wh...@gmail.com wrote:
  I've attached another iteration of the patch that fixes the multiple-window
  bug and adds ( uses) a function to create a Bitmapset using a custom
  allocator. I don't think there's any outstanding problems with it now.
 
 I think the right way to do this is to temporarily set the current
 memory context to winobj-winstate-partcontext while creating or
 manipulating the Bitmapset and restore it afterwards.  Maybe someone
 will say that's a modularity violation, but surely this is worse...

I think we should get rid of the bitmapset entirely. For one thing, we
want to be able to support large frames, and the size of the allocation
for the bitmap is dependent on the size of the frame. It would take a
very large frame for that to matter, but conceptually, it doesn't seem
right to me.

Instead of the bitmapset, we can keep track of two offsets, and the
number of rows in between which are non-NULL. That only works with a
constant offset; but I'm not inclined to optimize for the special case
involving large frames, variable offset which always happens to be
large, and IGNORE NULLS.

Regards,
Jeff Davis




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-07-05 Thread Jeff Davis
On Fri, 2013-06-28 at 13:26 -0400, Robert Haas wrote:
 I haven't really reviewed the windowing-related code in depth; I
 thought Jeff might jump back in for that part of it.  Jeff, is that
 something you're planning to do?

Yes, getting back into this patch now after a bit of delay.

Regards,
Jeff Davis




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-07-01 Thread Dean Rasheed
On 1 July 2013 03:07, Nicholas White n.j.wh...@gmail.com wrote:
 Alternatively, it might be trivial to make all aggregate functions work
 with ignore nulls in a window context

 This is a good idea, but I'd like to keep the scope of this patch limited
 for the time being

Agreed.


 - I'll look at doing this (along with the first / last /
 nth value window functions) for a later release.


On the other hand, perhaps this is not worth doing for aggregates,
since in that case IGNORE NULLS is just a special case of FILTER
(WHERE ...). Making IGNORE NULLS work for the other window functions
is probably more useful, as you say, in a future patch.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-07-01 Thread Robert Haas
On Sun, Jun 30, 2013 at 10:07 PM, Nicholas White n.j.wh...@gmail.com wrote:
 I've attached another iteration of the patch that fixes the multiple-window
 bug and adds ( uses) a function to create a Bitmapset using a custom
 allocator. I don't think there's any outstanding problems with it now.

I think the right way to do this is to temporarily set the current
memory context to winobj-winstate-partcontext while creating or
manipulating the Bitmapset and restore it afterwards.  Maybe someone
will say that's a modularity violation, but surely this is worse...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-07-01 Thread Dean Rasheed
On 1 July 2013 03:07, Nicholas White n.j.wh...@gmail.com wrote:
 I've attached another iteration of the patch that fixes the multiple-window
 bug and adds ( uses) a function to create a Bitmapset using a custom
 allocator. I don't think there's any outstanding problems with it now.


I just realised there is another issue (sorry). pg_get_viewdef() needs
to be updated so that dumping and restoring a view that uses
ignore/respect nulls works properly.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-30 Thread Dean Rasheed
On 29 June 2013 17:30, Jeff Davis pg...@j-davis.com wrote:

 On Mon, 2013-06-24 at 18:01 +0100, Nicholas White wrote:
 Good catch - I've attached a patch to address your point 1. It now
 returns the below (i.e. correctly doesn't fill in the saved value if
 the index is out of the window. However, I'm not sure whether (e.g.)
 lead-2-ignore-nulls means count forwards two rows, and if that's null
 use the last one you've seen (the current implementation) or count
 forwards two non-null rows (as you suggest). The behaviour isn't
 specified in a (free) draft of the 2003 standard
 (http://www.wiscorp.com/sql_2003_standard.zip), and I don't have
 access to the (non-free) final version. Could someone who does have
 access to it clarify this? I've also added your example to the
 regression test cases.

 Reading a later version of the draft, it is specified, but is still
 slightly unclear.

 As I see it, the standard describes the behavior in terms of eliminating
 the NULL rows entirely before applying the offset. This matches Troels's
 interpretation. Are you aware of any implementations that do something
 different?

 I didn't include this functionality for the first / last value window
 functions as their implementation is currently a bit different; they
 just call WinGetFuncArgInFrame to pick out a single value. Making
 these functions respect nulls would involve changing the single lookup
 to a walk through the tuples to find the first non-null version, and
 keeping track of this index in a struct in the context. As this change
 is reasonably orthogonal I was going to submit it as a separate patch.

 Sounds good.


I took a quick look at this and I think there are still a few problems:

1). The ignore/respect nulls flag needs to be per-window-function
data, not a window frame option, because the same window may be shared
by multiple window function calls. For example, the following test
causes a crash:

SELECT val,
   lead(val, 2) IGNORE NULLS OVER w,
   lead(val, 2) RESPECT NULLS OVER w
  FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]) AS val
WINDOW w as ();

The connection to the server was lost. Attempting reset: Failed.

2). As Troels Nielsen said up-thread, I think this should throw a
FEATURE_NOT_SUPPORTED error if it is used for window functions that
don't support it, rather than silently ignoring the flag.

3). Similarly, the parser accepts ignore/respect nulls for arbitrary
aggregate functions over a window, so maybe this should also throw a
FEATURE_NOT_SUPPORTED error. Alternatively, it might be trivial to
make all aggregate functions work with ignore nulls in a window
context, simply by using the existing code for strict aggregate
transition functions. That might be quite handy to support things like
array_agg(val) IGNORE NULLS OVER(...).

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-30 Thread Nicholas White
 this should throw a FEATURE_NOT_SUPPORTED error if it is used for window
functions that don't support it
 arbitrary aggregate functions over a window ... should also throw a
FEATURE_NOT_SUPPORTED error.

Fixed (with test cases) in the attached patch.

 because the same window may be shared by multiple window function calls.

Ah, your example gives the stack trace below. As the respect / ignore nulls
frame option is part of the window definition your example should cause two
windows to be created (both based on w, but one with the respect-nulls flag
set), but instead it fails an assert as one window definition can't have
two sets of frame options. It might take me a day or two to solve this -
let me know if this approach (making the parser create two window objects)
seems wrong.

#2  0x000100cdb68b in ExceptionalCondition (conditionName=Could not
find the frame base for ExceptionalCondition.
) at /Users/xxx/postgresql/src/backend/utils/error/assert.c:54
#3  0x0001009a3c03 in transformWindowFuncCall (pstate=0x7f88228362c8,
wfunc=0x7f8822948ec0, windef=0x7f88228353a8) at
/Users/xxx/postgresql/src/backend/parser/parse_agg.c:573

Thanks -

Nick


lead-lag-ignore-nulls.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-30 Thread Nicholas White
I've attached another iteration of the patch that fixes the multiple-window
bug and adds ( uses) a function to create a Bitmapset using a custom
allocator. I don't think there's any outstanding problems with it now.

 Alternatively, it might be trivial to make all aggregate functions work
with ignore nulls in a window context

This is a good idea, but I'd like to keep the scope of this patch limited
for the time being - I'll look at doing this (along with the first / last /
nth value window functions) for a later release.

Thanks -

Nick


lead-lag-ignore-nulls.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-29 Thread Jeff Davis

On Mon, 2013-06-24 at 18:01 +0100, Nicholas White wrote:
 Good catch - I've attached a patch to address your point 1. It now
 returns the below (i.e. correctly doesn't fill in the saved value if
 the index is out of the window. However, I'm not sure whether (e.g.)
 lead-2-ignore-nulls means count forwards two rows, and if that's null
 use the last one you've seen (the current implementation) or count
 forwards two non-null rows (as you suggest). The behaviour isn't
 specified in a (free) draft of the 2003 standard
 (http://www.wiscorp.com/sql_2003_standard.zip), and I don't have
 access to the (non-free) final version. Could someone who does have
 access to it clarify this? I've also added your example to the
 regression test cases.

Reading a later version of the draft, it is specified, but is still
slightly unclear.

As I see it, the standard describes the behavior in terms of eliminating
the NULL rows entirely before applying the offset. This matches Troels's
interpretation. Are you aware of any implementations that do something
different?
 
 I didn't include this functionality for the first / last value window
 functions as their implementation is currently a bit different; they
 just call WinGetFuncArgInFrame to pick out a single value. Making
 these functions respect nulls would involve changing the single lookup
 to a walk through the tuples to find the first non-null version, and
 keeping track of this index in a struct in the context. As this change
 is reasonably orthogonal I was going to submit it as a separate patch.

Sounds good.

Regards,
Jeff Davis






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-28 Thread Robert Haas
On Thu, Jun 27, 2013 at 8:52 PM, Nicholas White n.j.wh...@gmail.com wrote:
 The result of the current patch using lead

 Actually, I think I agree with you (and, FWIW, so does Oracle:
 http://docs.oracle.com/cd/E11882_01/server.112/e25554/analysis.htm#autoId18).
 I've refactored the window function's implementation so that (e.g.) lead(5)
 means the 5th non-null value away in front of the current row (the previous
 implementation was the last non-null value returned if the 5th rows in front
 was null). These semantics are slower, as the require the function to scan
 through the tuples discarding non-null ones. I've made the implementation
 use a bitmap in the partition context to cache whether or not a given tuple
 produces a null. This seems correct (it passes the regression tests) but as
 it stores row offsets (which are int64s) I was careful not to use bitmap
 methods that use ints to refer to set members. I've added more explanation
 in the code's comments. Thanks -

The documentation you've added reads kind of funny to me:

+ [respect nulls]|[ignore nulls]

Wouldn't we normally write that as [ [ RESPECT | IGNORE ] NULLS ] ?

I've committed the changes from Troels to avoid the grammar conflicts,
and I also took the opportunity to make OVER unreserved, as allowed by
his refactoring and per related discussion on other threads.  This
patch will need to be rebased over those changes (sorry), but
hopefully it'll help the review of this patch focus in on the issues
that are specific to RESPECT/IGNORE NULLS.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-28 Thread Alvaro Herrera
Robert Haas escribió:
 On Thu, Jun 27, 2013 at 8:52 PM, Nicholas White n.j.wh...@gmail.com wrote:

 The documentation you've added reads kind of funny to me:
 
 + [respect nulls]|[ignore nulls]
 
 Wouldn't we normally write that as [ [ RESPECT | IGNORE ] NULLS ] ?

I think it should be
[ { RESPECT | IGNORE } NULLS ]
One of the leading keywords must be present.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-28 Thread Robert Haas
On Fri, Jun 28, 2013 at 11:41 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Robert Haas escribió:
 On Thu, Jun 27, 2013 at 8:52 PM, Nicholas White n.j.wh...@gmail.com wrote:

 The documentation you've added reads kind of funny to me:

 + [respect nulls]|[ignore nulls]

 Wouldn't we normally write that as [ [ RESPECT | IGNORE ] NULLS ] ?

 I think it should be
 [ { RESPECT | IGNORE } NULLS ]
 One of the leading keywords must be present.

Oh, yeah.  What he said.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-28 Thread Nicholas White
 This patch will need to be rebased over those changes

See attached -


lead-lag-ignore-nulls.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-28 Thread Robert Haas
On Fri, Jun 28, 2013 at 12:29 PM, Nicholas White n.j.wh...@gmail.com wrote:
 This patch will need to be rebased over those changes

 See attached -

Thanks.  But I see a few issues...

+ [respect nulls]|[ignore nulls]

You fixed one of these but missed the other.

replaceable class=parameterdefault/replaceable are evaluated
with respect to the current row.  If omitted,
replaceable class=parameteroffset/replaceable defaults to 1 and
-   replaceable class=parameterdefault/replaceable to null
+   literalIGNORE NULLS/ is specified then the function will
be evaluated
+   as if the rows containing nulls didn't exist.
   /entry
  /row

This looks like you dropped a line during cut-and-paste.

+   null_values = (Bitmapset *) WinGetPartitionLocalMemory(
+   winobj,
+   BITMAPSET_SIZE(words_needed));
+   Assert(null_values);

This certainly seems ugly - isn't there a way to accomplish this
without having to violate the Bitmapset abstraction?

+* A slight edge case. Consider:
+*
+* A | lag(A, 1)
+* 1 |
+* 2 |  1
+*   |  ?

pgindent will reformat this into oblivion.  Use - markers around
the comment block as done elsewhere in the code where this is an
issue, or don't use ASCII art.

I haven't really reviewed the windowing-related code in depth; I
thought Jeff might jump back in for that part of it.  Jeff, is that
something you're planning to do?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-28 Thread Nicholas White
I've fixed the problems you mentioned (see attached) - sorry, I was a bit
careless with the docs.


 +   null_values = (Bitmapset *) WinGetPartitionLocalMemory(
 +   winobj,
 +   BITMAPSET_SIZE(words_needed));
 +   Assert(null_values);

 This certainly seems ugly - isn't there a way to accomplish this
 without having to violate the Bitmapset abstraction?

Indeed, it's ugly. I've revised it slightly to:

 null_values = (Bitmapset *) WinGetPartitionLocalMemory(
winobj,
BITMAPSET_SIZE(words_needed));
 null_values-nwords = (int) words_needed;

...which gives a proper bitmap. It's hard to break this into a factory
method like bms_make_singleton as I'm getting the (zero'ed) partition local
memory from one call, then forcing a correct bitmap's structure on it.
Maybe bitmapset.h needs an bms_initialise(void *, int num_words) factory
method? You'd still have to use the BITMAPSET_SIZE macro to get the correct
amount of memory for the void*. Maybe the factory method could take a
function pointer that would allocate memory of the given size (e.g.
Bitmapset* initialize(void* (allocator)(Size_t), int num_words) ) - this
means I could still use the partition's local memory.

I don't think the solution would be tidier if I re-instated the
leadlag_context struct with a single Bitmapset member. Other Bitmapset
usage seems to just call bms_make_singleton then bms_add_member over and
over again - which afaict will call palloc every BITS_PER_BITMAPWORD calls,
which is not really what I want.

Thanks -

Nick


lead-lag-ignore-nulls.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-27 Thread Nicholas White
 The result of the current patch using lead

Actually, I think I agree with you (and, FWIW, so does Oracle:
http://docs.oracle.com/cd/E11882_01/server.112/e25554/analysis.htm#autoId18).
I've refactored the window function's implementation so that (e.g.) lead(5)
means the 5th non-null value away in front of the current row (the previous
implementation was the last non-null value returned if the 5th rows in
front was null). These semantics are slower, as the require the function to
scan through the tuples discarding non-null ones. I've made the
implementation use a bitmap in the partition context to cache whether or
not a given tuple produces a null. This seems correct (it passes the
regression tests) but as it stores row offsets (which are int64s) I was
careful not to use bitmap methods that use ints to refer to set members.
I've added more explanation in the code's comments. Thanks -


lead-lag-ignore-nulls.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-24 Thread Nicholas White
Good catch - I've attached a patch to address your point 1. It now returns
the below (i.e. correctly doesn't fill in the saved value if the index is
out of the window. However, I'm not sure whether (e.g.) lead-2-ignore-nulls
means count forwards two rows, and if that's null use the last one you've
seen (the current implementation) or count forwards two non-null rows (as
you suggest). The behaviour isn't specified in a (free) draft of the 2003
standard (http://www.wiscorp.com/sql_2003_standard.zip), and I don't have
access to the (non-free) final version. Could someone who does have access
to it clarify this? I've also added your example to the regression test
cases.


select val, lead(val, 2) ignore nulls over (order by id) from test_table;
 val | lead
-+--
   1 |3
   2 |4
   3 |4
   4 |4
 |4
 |5
 |6
   5 |7
   6 |
   7 |
(10 rows)

If the other reviewers are happy with your grammar changes then I'll merge
them into the patch. Alternatively, if departing from the standard is OK
then we could reorder the keywords so that a window function is like SELECT
lag(x,1) OVER RESPECT NULLS (ORDER BY y) - i.e. putting the respect /
ignore tokens after the OVER reserved keyword. Although non-standard it'd
make the grammar change trivial.


 Also, I think someone mentioned this already, but what about
 first_value() and last_value()? Shouldn't we do those at the same time?

I didn't include this functionality for the first / last value window
functions as their implementation is currently a bit different; they just
call WinGetFuncArgInFrame to pick out a single value. Making these
functions respect nulls would involve changing the single lookup to a walk
through the tuples to find the first non-null version, and keeping track of
this index in a struct in the context. As this change is reasonably
orthogonal I was going to submit it as a separate patch.

Thanks -


lead-lag-ignore-nulls.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-24 Thread Robert Haas
On Fri, Jun 21, 2013 at 6:29 PM, Troels Nielsen bn.tro...@gmail.com wrote:
 The grammar conflict appears to be because of ambiguities in:
 1. table_ref (used exclusively in FROM clauses)
 2. index_elem (used exclusively in INDEX creation statements).

 Now, this doesn't seem to make much sense, as AFAICT window functions
 are explicitly disallowed in these contexts (transformWindowFuncCall
 will yield errors, and I can't really wrap my head around what a
 window function call would mean there).

 I therefore propose a simple rearrangement of the grammar,
 syntactically disallowing window functions in the outer part of those
 contexts (a_expr's inside can't and shouldn't be done much about)
 which will allow both RESPECT and IGNORE to become unreserved
 keywords, without doing any lexer hacking or abusing the grammar.

I reviewed this today and I think this is a very nice approach.
Thanks for working on it!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-24 Thread Nicholas White
OK - I've attached another iteration of the patch with Troels' grammar
changes. I think the only issue remaining is what the standard says about
lead semantics. Thanks -


lead-lag-ignore-nulls.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-21 Thread Robert Haas
On Fri, Jun 21, 2013 at 12:18 AM, Jeff Davis pg...@j-davis.com wrote:
 On Thu, 2013-06-20 at 10:03 -0400, Robert Haas wrote:
 I think the question is whether this feature is really worth adding
 new reserved keywords for.  I have a hard time saying we shouldn't
 support something that's part of the SQL standard, but personally,
 it's not something I've seen come up prior to this thread.

 What's the next step here?

Well, ideally, some other people weigh in on the value of the feature
vs. the pain of reserving the keywords.

 The feature sounds useful to me.

...and there's one person with an opinion now!   :-)

The other question here is - do we actually have the grammar right?
As in, is this actually the syntax we're supposed to be implementing?
It looks different from what's shown here, where the IGNORE NULLS is
inside the function's parentheses, rather than afterwards:

http://rwijk.blogspot.com/2010/06/simulating-laglead-ignore-nulls.html

IBM seems to think it's legal either inside or outside the parentheses:

http://pic.dhe.ibm.com/infocenter/informix/v121/index.jsp?topic=%2Fcom.ibm.sqls.doc%2Fids_sqs_2594.htm

Regardless of what syntax we settle on, we should also make sure that
the conflict is intrinsic to the grammar and can't be factored out, as
Tom suggested upthread.  It's not obvious to me what the actual
ambiguity is here.  If you've seen select lag(num,0) and the
lookahead token is respect, what's the problem?  It sort of looks
like it could be a column label, but not even unreserved keywords can
be column labels, so that's not it.  Probably deserves a bit more
investigation...

 If the grammar is unacceptable, does
 someone have an alternative idea, like using new function names instead
 of grammar? If so, what are reasonable names to use?

We could just add additional, optional Boolean argument to the
existing functions.  It's non-standard, but we avoid adding keywords.

 Also, I think someone mentioned this already, but what about
 first_value() and last_value()? Shouldn't we do those at the same time?

Not sure.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-21 Thread Jeff Davis
On Fri, 2013-06-21 at 09:21 -0400, Robert Haas wrote:
 The other question here is - do we actually have the grammar right?
 As in, is this actually the syntax we're supposed to be implementing?
 It looks different from what's shown here, where the IGNORE NULLS is
 inside the function's parentheses, rather than afterwards:
 
 http://rwijk.blogspot.com/2010/06/simulating-laglead-ignore-nulls.html
 
 IBM seems to think it's legal either inside or outside the parentheses:
 
 http://pic.dhe.ibm.com/infocenter/informix/v121/index.jsp?topic=%2Fcom.ibm.sqls.doc%2Fids_sqs_2594.htm

The spec seems pretty clear that it falls outside of the parentheses,
unless I am missing something.

 Regardless of what syntax we settle on, we should also make sure that
 the conflict is intrinsic to the grammar and can't be factored out, as
 Tom suggested upthread.  It's not obvious to me what the actual
 ambiguity is here.  If you've seen select lag(num,0) and the
 lookahead token is respect, what's the problem?  It sort of looks
 like it could be a column label, but not even unreserved keywords can
 be column labels, so that's not it.  Probably deserves a bit more
 investigation...

I think the problem is when the function is used as a table function;
e.g.:

  SELECT * FROM generate_series(1,10) respect;

 We could just add additional, optional Boolean argument to the
 existing functions.  It's non-standard, but we avoid adding keywords.

I thought about that, but it is awkward because the argument would have
to be constant (or, if not, it would be quite strange).

Regards,
Jeff Davis




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-21 Thread Robert Haas
On Fri, Jun 21, 2013 at 11:33 AM, Jeff Davis pg...@j-davis.com wrote:
 Regardless of what syntax we settle on, we should also make sure that
 the conflict is intrinsic to the grammar and can't be factored out, as
 Tom suggested upthread.  It's not obvious to me what the actual
 ambiguity is here.  If you've seen select lag(num,0) and the
 lookahead token is respect, what's the problem?  It sort of looks
 like it could be a column label, but not even unreserved keywords can
 be column labels, so that's not it.  Probably deserves a bit more
 investigation...

 I think the problem is when the function is used as a table function;
 e.g.:

   SELECT * FROM generate_series(1,10) respect;

Ah ha.  Well, there's probably not much help for that.  Disallowing
keywords as table aliases would be a cure worse than the disease, I
think.  I suppose the good news is that there probably aren't many
people using RESPECT as a column name, but I have a feeling that we're
almost certain to get complaints about reserving IGNORE.  I think that
will have to be quoted as a PL/pgsql variable name as well.  :-(

 We could just add additional, optional Boolean argument to the
 existing functions.  It's non-standard, but we avoid adding keywords.

 I thought about that, but it is awkward because the argument would have
 to be constant (or, if not, it would be quite strange).

True... but e.g. string_agg() has the same issue.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-21 Thread Troels Nielsen
Hello all

I've been examining PostgreSQL to gain a greater understanding
of RDBMS. (Thanks for a nice, very educational system!)

In the process I've been looking into a few problems and the
complications of this patch appeared relatively uninvolved, so I
tried to look for a solution.

I found the following:

The grammar conflict appears to be because of ambiguities in:
1. table_ref (used exclusively in FROM clauses)
2. index_elem (used exclusively in INDEX creation statements).

Now, this doesn't seem to make much sense, as AFAICT window functions
are explicitly disallowed in these contexts (transformWindowFuncCall
will yield errors, and I can't really wrap my head around what a
window function call would mean there).

I therefore propose a simple rearrangement of the grammar,
syntactically disallowing window functions in the outer part of those
contexts (a_expr's inside can't and shouldn't be done much about)
which will allow both RESPECT and IGNORE to become unreserved
keywords, without doing any lexer hacking or abusing the grammar.

I've attached a patch which will add RESPECT NULLS and IGNORE NULLS to
the grammar in the right place. Also the window frame options are set
but nothing more, so this patch needs to be merged with Nicholas White's
original patch.

One problem I see with this approach to the grammar is that the
error messages will change when putting window functions in any of the
forbidden places. The new error messages are I think worse and less
specific than the old ones. I suppose that can be fixed though, or
maybe the problem isn't so severe.

Example of old error message:
window functions are not allowed in functions in FROM

New error message:
syntax error at or near OVER



in addition I think the original patch as it stands has a few
problems that I haven't seen discussed:

1. The result of the current patch using lead

create table test_table (
   id serial,
   val integer);
insert into test_table (val) select * from unnest(ARRAY[1,2,3,4,NULL, NULL,
NULL, 5, 6, 7]);

select val, lead(val, 2) ignore nulls over (order by id) from test_table;
 val | lead
-+--
   1 |3
   2 |4
   3 |4
   4 |4
 |4
 |5
 |6
   5 |7
   6 |7
   7 |7
(10 rows)

I would expect it to output:

select val, lead(val, 2) ignore nulls over (order by id) from test_table;
 val | lead
-+--
   1 |3
   2 |4
   3 |5
   4 |6
 |6
 |6
 |6
   5 |7
   6 |
   7 |
(10 rows)

That is: skip two rows forward not counting null rows.

The lag behavior works well as far as I can see.

2. It would be nice if an error was given when ignore nulls was used
on a function for which it had no effect. Perhaps this should be up to
the different window function themselves to do though.

Apart from those points I think the original patch is nice and provides a
functionality
that's definitely nice to have.

Kind Regards
Troels Nielsen


On Fri, Jun 21, 2013 at 8:11 PM, Robert Haas robertmh...@gmail.com wrote:

 On Fri, Jun 21, 2013 at 11:33 AM, Jeff Davis pg...@j-davis.com wrote:
  Regardless of what syntax we settle on, we should also make sure that
  the conflict is intrinsic to the grammar and can't be factored out, as
  Tom suggested upthread.  It's not obvious to me what the actual
  ambiguity is here.  If you've seen select lag(num,0) and the
  lookahead token is respect, what's the problem?  It sort of looks
  like it could be a column label, but not even unreserved keywords can
  be column labels, so that's not it.  Probably deserves a bit more
  investigation...
 
  I think the problem is when the function is used as a table function;
  e.g.:
 
SELECT * FROM generate_series(1,10) respect;

 Ah ha.  Well, there's probably not much help for that.  Disallowing
 keywords as table aliases would be a cure worse than the disease, I
 think.  I suppose the good news is that there probably aren't many
 people using RESPECT as a column name, but I have a feeling that we're
 almost certain to get complaints about reserving IGNORE.  I think that
 will have to be quoted as a PL/pgsql variable name as well.  :-(

  We could just add additional, optional Boolean argument to the
  existing functions.  It's non-standard, but we avoid adding keywords.
 
  I thought about that, but it is awkward because the argument would have
  to be constant (or, if not, it would be quite strange).

 True... but e.g. string_agg() has the same issue.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



respect_nulls_and_ignore_nulls_grammar.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-20 Thread Robert Haas
On Tue, Jun 18, 2013 at 6:27 PM, Nicholas White n.j.wh...@gmail.com wrote:
 Thanks for the reviews. I've attached a revised patch that has the lexer
 refactoring Alvaro mentions (arbitarily using a goto rather than a bool
 flag) and uses Jeff's idea of just storing the index of the last non-null
 value (if there is one) in the window function's context (and not the Datum
 itself).

 However, Robert's right that SELECT ... ORDER BY respect NULLS LAST will now
 fail. An earlier iteration of the patch had RESPECT and IGNORE as reserved,
 but that would have broken tables with columns called respect (etc.),
 which the current version allows. Is this backwards incompatibility
 acceptable?

I think it's better to add new partially reserved keywords than to
have this kind of random breakage.  When you make something a
partially-reserved keyword, then things break, but in a fairly
well-defined way.  Lexer hacks can break things in ways that are much
subtler, which we may not even realize for a long time, and which in
that case would mean that the word respect needs to be quoted in
some contexts but not others.  That's going to cause a lot of
headaches, because pg_dump etc. know about quoting reserved keywords,
but they don't know anything about quoting unreserved keywords in
contexts where they might happen to be followed by the wrong next
word.

 If not, maybe I should try doing a two-token lookahead in the
 token-aggregating code between the lexer and the parser (i.e. make a
 RESPECT_NULLS token out of a sequence of RESPECT NULLS OVER tokens,
 remembering to replace the OVER token)? Or what about adding an %expect
 statement to the Bison grammar, confirming that the shift / reduce conflicts
 caused by using the RESPECT, IGNORE  NULL_P tokens the in out_clause rule
 are OK?

These lines of inquiry don't seem promising to me.  It's going to be
complicated and unmaintainable and may just move the failure scenarios
to cases that are too obscure for us to reason about.

I think the question is whether this feature is really worth adding
new reserved keywords for.  I have a hard time saying we shouldn't
support something that's part of the SQL standard, but personally,
it's not something I've seen come up prior to this thread.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-20 Thread Jeff Davis
On Thu, 2013-06-20 at 10:03 -0400, Robert Haas wrote:
 I think the question is whether this feature is really worth adding
 new reserved keywords for.  I have a hard time saying we shouldn't
 support something that's part of the SQL standard, but personally,
 it's not something I've seen come up prior to this thread.

What's the next step here?

The feature sounds useful to me. If the grammar is unacceptable, does
someone have an alternative idea, like using new function names instead
of grammar? If so, what are reasonable names to use?

Also, I think someone mentioned this already, but what about
first_value() and last_value()? Shouldn't we do those at the same time?

Regards,
Jeff Davis




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-18 Thread Hitoshi Harada
On Sat, Jun 15, 2013 at 1:30 PM, Jeff Davis pg...@j-davis.com wrote:

 On Sun, 2013-03-24 at 20:15 -0400, Nicholas White wrote:

  I've redone the leadlag function changes to use datumCopy as you
  suggested. However, I've had to remove the NOT_USED #ifdef around
  datumFree so I can use it to avoid building up large numbers of copies
  of Datums in the memory context while a query is executing. I've
  attached the revised patch...
 
 Comments:

 WinGetResultDatumCopy() calls datumCopy, which will just copy in the
 current memory context. I think you want it in the per-partition memory
 context, otherwise the last value in each partition will stick around
 until the query is done (so many partitions could be a problem). That
 should be easy enough to do by switching to the
 winobj-winstate-partcontext memory context before calling datumCopy,
 and then switching back.

 For that matter, why store the datum again at all? You can just store
 the offset of the last non-NULL value in that partition, and then fetch
 it using WinGetFuncArgInPartition(), right? We really want to avoid any
 per-tuple allocations.


I believe WinGetFuncArgInPartition is a bit slow if the offset is far from
the current row.  So it might make sense to store the last-seen value, but
I'm not sure if we need to copy datum every time.  I haven't looked into
the new patch.



Thanks,
-- 
Hitoshi Harada


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-18 Thread Nicholas White
Thanks for the reviews. I've attached a revised patch that has the lexer
refactoring Alvaro mentions (arbitarily using a goto rather than a bool
flag) and uses Jeff's idea of just storing the index of the last non-null
value (if there is one) in the window function's context (and not the Datum
itself).

However, Robert's right that SELECT ... ORDER BY respect NULLS LAST will
now fail. An earlier iteration of the patch had RESPECT and IGNORE as
reserved, but that would have broken tables with columns called respect
(etc.), which the current version allows. Is this backwards incompatibility
acceptable? If not, maybe I should try doing a two-token lookahead in the
token-aggregating code between the lexer and the parser (i.e. make a
RESPECT_NULLS token out of a sequence of RESPECT NULLS OVER tokens,
remembering to replace the OVER token)? Or what about adding an %expect
statement to the Bison grammar, confirming that the shift / reduce
conflicts caused by using the RESPECT, IGNORE  NULL_P tokens the in
out_clause rule are OK?

Thanks -

Nick


lead-lag-ignore-nulls.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-17 Thread Robert Haas
On Sat, Jun 15, 2013 at 9:37 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Nicholas White escribió:

 For the parsing changes, it seems I can either make RESPECT and IGNORE
 reserved keywords, or add a lookahead to construct synthetic RESPECT NULLS
 and IGNORE NULLS keywords. The grammar wouldn't compile if RESPECT and
 IGNORE were just normal unreserved keywords due to ambiguities after a
 function definition (e.g. select abs(1) respect; - which is currently a
 valid statement).

 Well, making them reserved keywords is not that great, so maybe the
 lookahead thingy is better.  However, this patch introduces the third
 and fourth uses of the save the lookahead token pattern in the
 default switch cases.  Can we refactor that bit somehow, to avoid so
 many duplicates?  (For a minute I thought that Andrew Gierth's WITH
 ORDINALITY patch would add another one, but it seems not.)

Making things reserved keywords is painful and I don't like it, but
I've started to become skeptical of shifting the problem to the lexer,
too.  Sometimes special case logic in the lexer about token combining
can have surprising consequences in other parts of the grammar.  For
example, with a lexer hack, what will happen if someone has a column
named RESPECT and does SELECT ... ORDER BY respect NULLS LAST?  I
haven't studied the code in detail so maybe it's fine, but it's
something to think about.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-15 Thread Jeff Davis
On Sun, 2013-03-24 at 20:15 -0400, Nicholas White wrote:

 I've redone the leadlag function changes to use datumCopy as you
 suggested. However, I've had to remove the NOT_USED #ifdef around
 datumFree so I can use it to avoid building up large numbers of copies
 of Datums in the memory context while a query is executing. I've
 attached the revised patch...
 
Comments:

WinGetResultDatumCopy() calls datumCopy, which will just copy in the
current memory context. I think you want it in the per-partition memory
context, otherwise the last value in each partition will stick around
until the query is done (so many partitions could be a problem). That
should be easy enough to do by switching to the
winobj-winstate-partcontext memory context before calling datumCopy,
and then switching back.

For that matter, why store the datum again at all? You can just store
the offset of the last non-NULL value in that partition, and then fetch
it using WinGetFuncArgInPartition(), right? We really want to avoid any
per-tuple allocations.

Alternatively, you might look into setting a mark when you get a
non-NULL value. Then you could just always fetch the oldest one.
Unfortunately, I think that only works with const_offset=true... so the
previous idea might be better.

I'll leave it to someone else to review the grammar changes.

Regards,
Jeff Davis




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-15 Thread Alvaro Herrera
Nicholas White escribió:

 For the parsing changes, it seems I can either make RESPECT and IGNORE
 reserved keywords, or add a lookahead to construct synthetic RESPECT NULLS
 and IGNORE NULLS keywords. The grammar wouldn't compile if RESPECT and
 IGNORE were just normal unreserved keywords due to ambiguities after a
 function definition (e.g. select abs(1) respect; - which is currently a
 valid statement).

Well, making them reserved keywords is not that great, so maybe the
lookahead thingy is better.  However, this patch introduces the third
and fourth uses of the save the lookahead token pattern in the
default switch cases.  Can we refactor that bit somehow, to avoid so
many duplicates?  (For a minute I thought that Andrew Gierth's WITH
ORDINALITY patch would add another one, but it seems not.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-03-24 Thread Hitoshi Harada
On Sat, Mar 23, 2013 at 3:25 PM, Nicholas White n.j.wh...@gmail.com wrote:

 Thanks - I've added it here:
 https://commitfest.postgresql.org/action/patch_view?id=1096 .

 I've also attached a revised version that makes IGNORE and RESPECT
 UNRESERVED keywords (following the pattern of NULLS_FIRST and NULLS_LAST).


Hm, you made another lookahead in base_yylex to make them unreserved --
looks ok, but not sure if there was no way to do it.

You might want to try byref types such as text.  It seems you need to copy
the datum to save the value in appropriate memory context.  Also, try to
create a view on those expressions.  I don't think it correctly preserves
it.

Thanks,
-- 
Hitoshi Harada


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-03-24 Thread Nicholas White
Thanks for the feedback.

For the parsing changes, it seems I can either make RESPECT and IGNORE
reserved keywords, or add a lookahead to construct synthetic RESPECT NULLS
and IGNORE NULLS keywords. The grammar wouldn't compile if RESPECT and
IGNORE were just normal unreserved keywords due to ambiguities after a
function definition (e.g. select abs(1) respect; - which is currently a
valid statement).

I've redone the leadlag function changes to use datumCopy as you suggested.
However, I've had to remove the NOT_USED #ifdef around datumFree so I can
use it to avoid building up large numbers of copies of Datums in the memory
context while a query is executing. I've attached the revised patch...

Thanks -

Nick


On 24 March 2013 03:43, Hitoshi Harada umi.tan...@gmail.com wrote:



 On Sat, Mar 23, 2013 at 3:25 PM, Nicholas White n.j.wh...@gmail.comwrote:

 Thanks - I've added it here:
 https://commitfest.postgresql.org/action/patch_view?id=1096 .

 I've also attached a revised version that makes IGNORE and RESPECT
 UNRESERVED keywords (following the pattern of NULLS_FIRST and NULLS_LAST).


 Hm, you made another lookahead in base_yylex to make them unreserved --
 looks ok, but not sure if there was no way to do it.

 You might want to try byref types such as text.  It seems you need to copy
 the datum to save the value in appropriate memory context.  Also, try to
 create a view on those expressions.  I don't think it correctly preserves
 it.

 Thanks,
 --
 Hitoshi Harada


lead-lag-ignore-nulls.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-03-23 Thread Tom Lane
Nicholas White n.j.wh...@gmail.com writes:
 The SQL standard defines a RESPECT NULLS or IGNORE NULLS option for lead,
 lag, [...]. This is not implemented in PostgreSQL
 (http://www.postgresql.org/docs/devel/static/functions-window.html)
 I've had a go at implementing this, and I've attached the resulting patch.
 It's not finished yet, but I was hoping to find out if my solution is along
 the right lines.

Since we're trying to get 9.3 to closure, this patch probably isn't
going to get much attention until the 9.4 development cycle starts
(in a couple of months, likely).  In the meantime, please add it to
the next commitfest list so we remember to come back to it:
https://commitfest.postgresql.org/action/commitfest_view?id=18

One comment just from a quick eyeball look is that we really hate
adding new keywords that aren't UNRESERVED, because that risks
breaking existing applications.  Please see if you can refactor the
grammar to make those new entries unreserved.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-03-23 Thread Nicholas White
Thanks - I've added it here:
https://commitfest.postgresql.org/action/patch_view?id=1096 .

I've also attached a revised version that makes IGNORE and RESPECT
UNRESERVED keywords (following the pattern of NULLS_FIRST and NULLS_LAST).

Nick


On 23 March 2013 14:34, Tom Lane t...@sss.pgh.pa.us wrote:

 Nicholas White n.j.wh...@gmail.com writes:
  The SQL standard defines a RESPECT NULLS or IGNORE NULLS option for lead,
  lag, [...]. This is not implemented in PostgreSQL
  (http://www.postgresql.org/docs/devel/static/functions-window.html)
  I've had a go at implementing this, and I've attached the resulting
 patch.
  It's not finished yet, but I was hoping to find out if my solution is
 along
  the right lines.

 Since we're trying to get 9.3 to closure, this patch probably isn't
 going to get much attention until the 9.4 development cycle starts
 (in a couple of months, likely).  In the meantime, please add it to
 the next commitfest list so we remember to come back to it:
 https://commitfest.postgresql.org/action/commitfest_view?id=18

 One comment just from a quick eyeball look is that we really hate
 adding new keywords that aren't UNRESERVED, because that risks
 breaking existing applications.  Please see if you can refactor the
 grammar to make those new entries unreserved.

 regards, tom lane



lead-lag-ignore-nulls.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers