Re: [HACKERS] review: More frame options in window functions

2010-02-12 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 I think there's the associativity property of operators that we might
 want to have someday, in order for the planner to know some more about
 joins on A = B then on B = C, or replace with  if you will.

 We already do know about that, at least in the case of =.  The reason it
 doesn't do transitive  deductions is not lack of information but doubt
 that it's worth the cycles to try.

Ok. I just remember about some mails here about the problem of
reordering [LEFT] JOINS when we can, but I can't remember if it's really
tied to associativity or some other thing.

Searching the archives ain't helping me refresh those memories. So it
seems the case for an extended opclass infrastructure, or a new side
one, is between thin an non-existent yet?

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-02-12 Thread Tom Lane
Dimitri Fontaine dfonta...@hi-media.com writes:
 Searching the archives ain't helping me refresh those memories. So it
 seems the case for an extended opclass infrastructure, or a new side
 one, is between thin an non-existent yet?

Yeah, I don't immediately see anything that would justify going to
that level of effort.  Adding +/- as support functions for btree
seems like the thing to do.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-02-12 Thread Alvaro Herrera
Tom Lane escribió:
 Dimitri Fontaine dfonta...@hi-media.com writes:
  Searching the archives ain't helping me refresh those memories. So it
  seems the case for an extended opclass infrastructure, or a new side
  one, is between thin an non-existent yet?
 
 Yeah, I don't immediately see anything that would justify going to
 that level of effort.  Adding +/- as support functions for btree
 seems like the thing to do.

Would it work to use a fake access method instead?  If we add it to
btree, will we be able to backtrack and move that to a separate catalog
if we ever determine that it would have been better?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-02-12 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Tom Lane escribió:
 Yeah, I don't immediately see anything that would justify going to
 that level of effort.  Adding +/- as support functions for btree
 seems like the thing to do.

 Would it work to use a fake access method instead?

Then you'd have to duplicate all the information in a btree opclass;
*and* teach all the stuff that knows about btree to know about fakeam
instead.  Doesn't seem like there's any win there.

 If we add it to
 btree, will we be able to backtrack and move that to a separate catalog
 if we ever determine that it would have been better?

Backwards compatibility with existing user-type definitions is one big
reason to *not* try to pull ORDER BY information out of btree.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-02-12 Thread Tom Lane
Hitoshi Harada umi.tan...@gmail.com writes:
 [ more_frame_options patch ]

Committed after rather extensive revisions.

I removed the RANGE value PRECEDING/FOLLOWING support as per discussion,
which was probably a good thing anyway from an incremental-development
standpoint; it made it easier to see what was going on in nodeWindowAgg.

I'm not terribly happy with the changes you made in WinGetFuncArgInPartition
and WinGetFuncArgInFrame to force the window function mark to not go
past frame start in some modes.  Not only is that pretty ugly, but I
think it can mask bugs in window functions: it's an error for a window
function to fetch a row before what it has set its mark to be, but in
some cases that wouldn't be detected because of this change.  I think
it would be better to revert those changes and find another method of
protecting fetches needed to determine the frame head.  One idea is
to create a separate read pointer that tracks the frame head whenever
actual fetches of the frame head might be needed by update_frameheadpos.
I committed it without changing that, but I think this should be
revisited before trying to add the RANGE value PRECEDING/FOLLOWING
options, because those will substantially expand the number of cases
where that hack affects the behavior.

I found one actual bug, which was indeed exposed by the submitted
regression tests:

SELECT sum(unique1) over (rows between 1 following and 3 following),
unique1, four
FROM tenk1 WHERE unique1  10;
 sum | unique1 | four 
-+-+--
   9 |   4 |0
  16 |   2 |2
  23 |   1 |1
  22 |   6 |2
  16 |   9 |1
  15 |   8 |0
  10 |   5 |1
   7 |   3 |3
   0 |   7 |3
   0 |   0 |0
(10 rows)

The last row's SUM ought to be null because there are no rows left in
the frame.  The cause of the bug was that update_frameheadpos was
forcing frameheadpos to not go past the last partition row, but this
has to be allowed so that the frame can be empty at need.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-02-12 Thread Hitoshi Harada
2010/2/13 Tom Lane t...@sss.pgh.pa.us:
 Hitoshi Harada umi.tan...@gmail.com writes:
 [ more_frame_options patch ]

 Committed after rather extensive revisions.

Thanks a lot.

 I'm not terribly happy with the changes you made in WinGetFuncArgInPartition
 and WinGetFuncArgInFrame to force the window function mark to not go
 past frame start in some modes.  Not only is that pretty ugly, but I
 think it can mask bugs in window functions: it's an error for a window
 function to fetch a row before what it has set its mark to be, but in
 some cases that wouldn't be detected because of this change.  I think
 it would be better to revert those changes and find another method of
 protecting fetches needed to determine the frame head.  One idea is
 to create a separate read pointer that tracks the frame head whenever
 actual fetches of the frame head might be needed by update_frameheadpos.
 I committed it without changing that, but I think this should be
 revisited before trying to add the RANGE value PRECEDING/FOLLOWING
 options, because those will substantially expand the number of cases
 where that hack affects the behavior.

Well, you're right. In addition to this topic, I concern a little
about changing row fetching in aggregate from spool_tuples() to
window_gettupleslot(), for performance reason. It's needed to support
extended frame options, but for original basic frame options it may
get slow. Anyway, I agree to revisit and refactor to executor logic.


Regards,

-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-02-11 Thread Martijn van Oosterhout
On Tue, Feb 09, 2010 at 12:37:32PM +0900, Hitoshi Harada wrote:
 I know +/- part is an issue. But as far as I know there's been no
 infrastructure to handle such situation. My ideal plan is to introduce
 some mechanism to make +/- operation abstract enough such like
 sort opclass/opfamily, although I wasn't sure if that is to be
 introduced for this (ie RANGE frame) purpose only.
 
 Now that specialized hard-coding +/- in source seems unacceptable,
 I would like to hear how to handle this. Is there any better than
 introducing new mechanism such like opclass?

I imagine it would be easiest to add these operators to the existing
opclass infrastructure. Although it didn't start that way, the opclass
structure is being more and more used to declare the semantics of a
type. Btree operator classes are for *ordered* types, which seems to be
the only case you can expect to work here.

Currently the btree operator classes have only one support function, so
it would seem than the easiest approach would be to declare two new
support functions for add() and subtract(). A difficulty may be that
the arguments to these functions are not obvious as in the
(timestamp,interval) case, but if there is only one suitable
possibility I think this can be made to work.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.


signature.asc
Description: Digital signature


Re: [HACKERS] review: More frame options in window functions

2010-02-11 Thread Tom Lane
Martijn van Oosterhout klep...@svana.org writes:
 On Tue, Feb 09, 2010 at 12:37:32PM +0900, Hitoshi Harada wrote:
 Now that specialized hard-coding +/- in source seems unacceptable,
 I would like to hear how to handle this. Is there any better than
 introducing new mechanism such like opclass?

 I imagine it would be easiest to add these operators to the existing
 opclass infrastructure. Although it didn't start that way, the opclass
 structure is being more and more used to declare the semantics of a
 type. Btree operator classes are for *ordered* types, which seems to be
 the only case you can expect to work here.

 Currently the btree operator classes have only one support function, so
 it would seem than the easiest approach would be to declare two new
 support functions for add() and subtract().

Yeah, that was pretty much the same line of thought I had.  The main
disadvantage seems to be two more catalog lookups per index to fill in
support function entries that won't ever be used (at least not by the
index machinery).  However, I think we cache that stuff already inside
relcache.c, so it might be insignificant.

The real question is whether it's time to bite the bullet and stop
overloading the opclass infrastructure for semantics-declaration
purposes.  Are there any other foreseeable cases where we are going
to need to add operator knowledge of this sort?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-02-11 Thread Robert Haas
On Thu, Feb 11, 2010 at 3:26 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Martijn van Oosterhout klep...@svana.org writes:
 On Tue, Feb 09, 2010 at 12:37:32PM +0900, Hitoshi Harada wrote:
 Now that specialized hard-coding +/- in source seems unacceptable,
 I would like to hear how to handle this. Is there any better than
 introducing new mechanism such like opclass?

 I imagine it would be easiest to add these operators to the existing
 opclass infrastructure. Although it didn't start that way, the opclass
 structure is being more and more used to declare the semantics of a
 type. Btree operator classes are for *ordered* types, which seems to be
 the only case you can expect to work here.

 Currently the btree operator classes have only one support function, so
 it would seem than the easiest approach would be to declare two new
 support functions for add() and subtract().

 Yeah, that was pretty much the same line of thought I had.  The main
 disadvantage seems to be two more catalog lookups per index to fill in
 support function entries that won't ever be used (at least not by the
 index machinery).  However, I think we cache that stuff already inside
 relcache.c, so it might be insignificant.

 The real question is whether it's time to bite the bullet and stop
 overloading the opclass infrastructure for semantics-declaration
 purposes.  Are there any other foreseeable cases where we are going
 to need to add operator knowledge of this sort?

knngist wants to jimmy with the opclass semantics too, though the need
there is a little different.  The issue is that an amcanorder AM is
good for optimizing ORDER BY indexed-column-name, but that's not
what GIST indices can do: they can optimize ORDER BY
indexed-column-name op constant, for suitable values of op.
Teodor and Oleg handled this by adding a new flag to the AM
(amcanorderbyop) and adding the operators to the opclass.  But, unlike
the comparison operators, these operators don't necessarily return a
boolean: in fact they probably don't.

It would be nice to come up with a general solution to this problem.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-02-11 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Feb 11, 2010 at 3:26 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The real question is whether it's time to bite the bullet and stop
 overloading the opclass infrastructure for semantics-declaration
 purposes.  Are there any other foreseeable cases where we are going
 to need to add operator knowledge of this sort?

 knngist wants to jimmy with the opclass semantics too, though the need
 there is a little different.  The issue is that an amcanorder AM is
 good for optimizing ORDER BY indexed-column-name, but that's not
 what GIST indices can do: they can optimize ORDER BY
 indexed-column-name op constant, for suitable values of op.
 Teodor and Oleg handled this by adding a new flag to the AM
 (amcanorderbyop) and adding the operators to the opclass.  But, unlike
 the comparison operators, these operators don't necessarily return a
 boolean: in fact they probably don't.

Yeah, but in that case it is reasonable to have the information in
opclasses, because it's directly tied to something you do with indexes,
and it does depend on which opclass the index uses.  I don't like the
specific details of what they did, but associating knngist info with
opclasses is sane in the abstract.  What's bothering me about the window
frame stuff is that it's not at all associated with an index.

However, what it *is* associated with is a sort ordering, and the notion
that btree opclasses are what define orderings is sufficiently deeply
wired into the system that undoing it would be a huge PITA.  So unless
we can see a pretty clear future need for more information in this
category, I'm not really inclined to invent some new structure
altogether.  I'm just wondering if anyone does see that...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-02-11 Thread Oleg Bartunov

On Thu, 11 Feb 2010, Robert Haas wrote:


On Thu, Feb 11, 2010 at 3:26 PM, Tom Lane t...@sss.pgh.pa.us wrote:

Martijn van Oosterhout klep...@svana.org writes:

On Tue, Feb 09, 2010 at 12:37:32PM +0900, Hitoshi Harada wrote:

Now that specialized hard-coding +/- in source seems unacceptable,
I would like to hear how to handle this. Is there any better than
introducing new mechanism such like opclass?



I imagine it would be easiest to add these operators to the existing
opclass infrastructure. Although it didn't start that way, the opclass
structure is being more and more used to declare the semantics of a
type. Btree operator classes are for *ordered* types, which seems to be
the only case you can expect to work here.



Currently the btree operator classes have only one support function, so
it would seem than the easiest approach would be to declare two new
support functions for add() and subtract().


Yeah, that was pretty much the same line of thought I had.  The main
disadvantage seems to be two more catalog lookups per index to fill in
support function entries that won't ever be used (at least not by the
index machinery).  However, I think we cache that stuff already inside
relcache.c, so it might be insignificant.

The real question is whether it's time to bite the bullet and stop
overloading the opclass infrastructure for semantics-declaration
purposes.  Are there any other foreseeable cases where we are going
to need to add operator knowledge of this sort?


knngist wants to jimmy with the opclass semantics too, though the need
there is a little different.  The issue is that an amcanorder AM is
good for optimizing ORDER BY indexed-column-name, but that's not
what GIST indices can do: they can optimize ORDER BY
indexed-column-name op constant, for suitable values of op.
Teodor and Oleg handled this by adding a new flag to the AM
(amcanorderbyop) and adding the operators to the opclass.  But, unlike
the comparison operators, these operators don't necessarily return a
boolean: in fact they probably don't.


see http://archives.postgresql.org/pgsql-hackers/2009-11/msg01809.php
in knngist we use distance semantic,  distance  0 means FALSE, so
we compatible with old GIST.




It would be nice to come up with a general solution to this problem.


yeah

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-02-11 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Feb 8, 2010 at 10:37 PM, Hitoshi Harada umi.tan...@gmail.com wrote:
 2010/2/9 Tom Lane t...@sss.pgh.pa.us:
 Given the lack of time remaining in this CF, I'm tempted to propose
 ripping out the RANGE support and just trying to get ROWS committed.
 That should be substantially less controversial from a semantic
 standpoint, and it still seems like a considerable improvement in
 functionality.
 
 As expected. I don't mind splitting patch to be committable if users
 who expected this feature don't mind.

 Well, they'll likely be happier with a partial feature than no feature
 at all...  I agree with Tom that there's no time time now to resolve
 the issue of how + and - should be handled.

I've done that and am reviewing the rest of the patch, but I had more
trouble than I expected with phrasing the not implemented message.
Usually we try to word these things like SQLCOMMAND is not implemented
but there's no one-word version of what it is that's been left out.
RANGE isn't right since there are variants of RANGE that work.
What I have at the moment is

if (n-frameOptions  (FRAMEOPTION_START_VALUE_PRECEDING |
   FRAMEOPTION_END_VALUE_PRECEDING))
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 errmsg(RANGE value PRECEDING is not implemented yet),
 parser_errposition(@1)));
if (n-frameOptions  (FRAMEOPTION_START_VALUE_FOLLOWING |
   FRAMEOPTION_END_VALUE_FOLLOWING))
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 errmsg(RANGE value FOLLOWING is not implemented yet),
 parser_errposition(@1)));

but I wonder if anyone has a better idea.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-02-11 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 However, what it *is* associated with is a sort ordering, and the notion
 that btree opclasses are what define orderings is sufficiently deeply
 wired into the system that undoing it would be a huge PITA.  So unless
 we can see a pretty clear future need for more information in this
 category, I'm not really inclined to invent some new structure
 altogether.  I'm just wondering if anyone does see that...

I think there's the associativity property of operators that we might
want to have someday, in order for the planner to know some more about
joins on A = B then on B = C, or replace with  if you will.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-02-11 Thread Tom Lane
Dimitri Fontaine dfonta...@hi-media.com writes:
 Tom Lane t...@sss.pgh.pa.us writes:
 However, what it *is* associated with is a sort ordering, and the notion
 that btree opclasses are what define orderings is sufficiently deeply
 wired into the system that undoing it would be a huge PITA.  So unless
 we can see a pretty clear future need for more information in this
 category, I'm not really inclined to invent some new structure
 altogether.  I'm just wondering if anyone does see that...

 I think there's the associativity property of operators that we might
 want to have someday, in order for the planner to know some more about
 joins on A = B then on B = C, or replace with  if you will.

We already do know about that, at least in the case of =.  The reason it
doesn't do transitive  deductions is not lack of information but doubt
that it's worth the cycles to try.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-02-11 Thread Robert Haas
On Thu, Feb 11, 2010 at 4:31 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Feb 8, 2010 at 10:37 PM, Hitoshi Harada umi.tan...@gmail.com wrote:
 2010/2/9 Tom Lane t...@sss.pgh.pa.us:
 Given the lack of time remaining in this CF, I'm tempted to propose
 ripping out the RANGE support and just trying to get ROWS committed.
 That should be substantially less controversial from a semantic
 standpoint, and it still seems like a considerable improvement in
 functionality.

 As expected. I don't mind splitting patch to be committable if users
 who expected this feature don't mind.

 Well, they'll likely be happier with a partial feature than no feature
 at all...  I agree with Tom that there's no time time now to resolve
 the issue of how + and - should be handled.

 I've done that and am reviewing the rest of the patch, but I had more
 trouble than I expected with phrasing the not implemented message.
 Usually we try to word these things like SQLCOMMAND is not implemented
 but there's no one-word version of what it is that's been left out.
 RANGE isn't right since there are variants of RANGE that work.
 What I have at the moment is

        if (n-frameOptions  (FRAMEOPTION_START_VALUE_PRECEDING |
                               FRAMEOPTION_END_VALUE_PRECEDING))
            ereport(ERROR,
                    (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                     errmsg(RANGE value PRECEDING is not implemented yet),
                     parser_errposition(@1)));
        if (n-frameOptions  (FRAMEOPTION_START_VALUE_FOLLOWING |
                               FRAMEOPTION_END_VALUE_FOLLOWING))
            ereport(ERROR,
                    (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                     errmsg(RANGE value FOLLOWING is not implemented yet),
                     parser_errposition(@1)));

 but I wonder if anyone has a better idea.

Maybe something like this?

RANGE PRECEDING is only supported with UNBOUNDED

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-02-11 Thread Hitoshi Harada
2010/2/12 Tom Lane t...@sss.pgh.pa.us:
 The real question is whether it's time to bite the bullet and stop
 overloading the opclass infrastructure for semantics-declaration
 purposes.  Are there any other foreseeable cases where we are going
 to need to add operator knowledge of this sort?

Table partitioning with RANGE option maybe?

Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-02-08 Thread Tom Lane
Hitoshi Harada umi.tan...@gmail.com writes:
 2010/1/23 Robert Haas robertmh...@gmail.com:
 Would it make sense to pull some of the infrastructure bits out of
 this patch and commit those bits separately, so as to reduce the size
 of the main patch?  In particular, the AggGetMemoryContext() stuff
 looks like a good candidate for that treatment.

 Fair enough. Attached contains that part only.

I started looking at this patch.  I believe that we should commit the 
AggGetMemoryContext API function --- *not* the window context
management changes that you included here, but only the API abstraction
--- to be sure that that gets into 9.0 so that third-party aggregate
functions can start relying on it instead of looking directly at the
AggState or WindowAggState node.  The rest of the patch might or might
not make it, but we can at least help people future-proof their code.

I'm fairly desperately unhappy with the RANGE PRECEDING/FOLLOWING parts
of the patch.  We have expended a great deal of sweat over the years
to avoid hard-wiring assumptions about particular operator names into
the code, but this patch is blithely ignoring that history and assuming
that + and - will do the right thing.  Also LookupOperName is
probably not the right thing, since it insists on exact or
binary-compatible match.  To the extent that there is any justification
at all for assuming that +/- are the right operators, it is that the
spec speaks in terms of the range bounds being VSK+V2F etc --- but if
someone were to actually write out such an expression, the parser would
allow for implicit casts to happen, so this code is not implementing
what that expression would produce.  Plus the results are dependent on
the current search path, which for example means it'll fail if the
window sort column is a user-defined type whose operators happen to be
out of path at the moment --- even though we would have found its
default sort opclass despite that.  And lastly, I'm totally unconvinced
that it's a good idea to accept an operator that returns a type other
than the type of the window sort column.  That seems to eliminate
whatever little protection we had against picking up an unsuitable
operator; and it complicates the code as well.

Given the lack of time remaining in this CF, I'm tempted to propose
ripping out the RANGE support and just trying to get ROWS committed.
That should be substantially less controversial from a semantic
standpoint, and it still seems like a considerable improvement in
functionality.

Thoughts?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-02-08 Thread Josh Berkus
On 2/8/10 11:17 AM, Tom Lane wrote:
 Given the lack of time remaining in this CF, I'm tempted to propose
 ripping out the RANGE support and just trying to get ROWS committed.
 That should be substantially less controversial from a semantic
 standpoint, and it still seems like a considerable improvement in
 functionality.

+1

--Josh Berkus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-02-08 Thread Tom Lane
I wrote:
 I started looking at this patch.  I believe that we should commit the 
 AggGetMemoryContext API function --- *not* the window context
 management changes that you included here, but only the API abstraction
 --- to be sure that that gets into 9.0 so that third-party aggregate
 functions can start relying on it instead of looking directly at the
 AggState or WindowAggState node.  The rest of the patch might or might
 not make it, but we can at least help people future-proof their code.

I have committed that little part.  I revised the function API to be

/* AggCheckCallContext can return one of the following codes, or 0: */
#define AGG_CONTEXT_AGGREGATE1/* regular aggregate */
#define AGG_CONTEXT_WINDOW   2/* window function */

extern int  AggCheckCallContext(FunctionCallInfo fcinfo,
MemoryContext *aggcontext);

so that it would be conveniently usable in places that just want to
check aggregate-ness and don't need to fetch a memory context; and
with the thought that maybe someday there would be more than two
possible call contexts.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-02-08 Thread Hitoshi Harada
2010/2/9 Tom Lane t...@sss.pgh.pa.us:
 Hitoshi Harada umi.tan...@gmail.com writes:
 2010/1/23 Robert Haas robertmh...@gmail.com:
 Would it make sense to pull some of the infrastructure bits out of
 this patch and commit those bits separately, so as to reduce the size
 of the main patch?  In particular, the AggGetMemoryContext() stuff
 looks like a good candidate for that treatment.

 Fair enough. Attached contains that part only.

 I started looking at this patch.  I believe that we should commit the
 AggGetMemoryContext API function --- *not* the window context
 management changes that you included here, but only the API abstraction
 --- to be sure that that gets into 9.0 so that third-party aggregate
 functions can start relying on it instead of looking directly at the
 AggState or WindowAggState node.  The rest of the patch might or might
 not make it, but we can at least help people future-proof their code.

 I'm fairly desperately unhappy with the RANGE PRECEDING/FOLLOWING parts
 of the patch.  We have expended a great deal of sweat over the years
 to avoid hard-wiring assumptions about particular operator names into
 the code, but this patch is blithely ignoring that history and assuming
 that + and - will do the right thing.  Also LookupOperName is
 probably not the right thing, since it insists on exact or
 binary-compatible match.  To the extent that there is any justification
 at all for assuming that +/- are the right operators, it is that the
 spec speaks in terms of the range bounds being VSK+V2F etc --- but if
 someone were to actually write out such an expression, the parser would
 allow for implicit casts to happen, so this code is not implementing
 what that expression would produce.  Plus the results are dependent on
 the current search path, which for example means it'll fail if the
 window sort column is a user-defined type whose operators happen to be
 out of path at the moment --- even though we would have found its
 default sort opclass despite that.  And lastly, I'm totally unconvinced
 that it's a good idea to accept an operator that returns a type other
 than the type of the window sort column.  That seems to eliminate
 whatever little protection we had against picking up an unsuitable
 operator; and it complicates the code as well.

I know +/- part is an issue. But as far as I know there's been no
infrastructure to handle such situation. My ideal plan is to introduce
some mechanism to make +/- operation abstract enough such like
sort opclass/opfamily, although I wasn't sure if that is to be
introduced for this (ie RANGE frame) purpose only.

Now that specialized hard-coding +/- in source seems unacceptable,
I would like to hear how to handle this. Is there any better than
introducing new mechanism such like opclass?

 Given the lack of time remaining in this CF, I'm tempted to propose
 ripping out the RANGE support and just trying to get ROWS committed.
 That should be substantially less controversial from a semantic
 standpoint, and it still seems like a considerable improvement in
 functionality.

 Thoughts?

As expected. I don't mind splitting patch to be committable if users
who expected this feature don't mind.

Regards,

-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-02-08 Thread Robert Haas
On Mon, Feb 8, 2010 at 10:37 PM, Hitoshi Harada umi.tan...@gmail.com wrote:
 2010/2/9 Tom Lane t...@sss.pgh.pa.us:
 Hitoshi Harada umi.tan...@gmail.com writes:
 2010/1/23 Robert Haas robertmh...@gmail.com:
 Would it make sense to pull some of the infrastructure bits out of
 this patch and commit those bits separately, so as to reduce the size
 of the main patch?  In particular, the AggGetMemoryContext() stuff
 looks like a good candidate for that treatment.

 Fair enough. Attached contains that part only.

 I started looking at this patch.  I believe that we should commit the
 AggGetMemoryContext API function --- *not* the window context
 management changes that you included here, but only the API abstraction
 --- to be sure that that gets into 9.0 so that third-party aggregate
 functions can start relying on it instead of looking directly at the
 AggState or WindowAggState node.  The rest of the patch might or might
 not make it, but we can at least help people future-proof their code.

 I'm fairly desperately unhappy with the RANGE PRECEDING/FOLLOWING parts
 of the patch.  We have expended a great deal of sweat over the years
 to avoid hard-wiring assumptions about particular operator names into
 the code, but this patch is blithely ignoring that history and assuming
 that + and - will do the right thing.  Also LookupOperName is
 probably not the right thing, since it insists on exact or
 binary-compatible match.  To the extent that there is any justification
 at all for assuming that +/- are the right operators, it is that the
 spec speaks in terms of the range bounds being VSK+V2F etc --- but if
 someone were to actually write out such an expression, the parser would
 allow for implicit casts to happen, so this code is not implementing
 what that expression would produce.  Plus the results are dependent on
 the current search path, which for example means it'll fail if the
 window sort column is a user-defined type whose operators happen to be
 out of path at the moment --- even though we would have found its
 default sort opclass despite that.  And lastly, I'm totally unconvinced
 that it's a good idea to accept an operator that returns a type other
 than the type of the window sort column.  That seems to eliminate
 whatever little protection we had against picking up an unsuitable
 operator; and it complicates the code as well.

 I know +/- part is an issue. But as far as I know there's been no
 infrastructure to handle such situation. My ideal plan is to introduce
 some mechanism to make +/- operation abstract enough such like
 sort opclass/opfamily, although I wasn't sure if that is to be
 introduced for this (ie RANGE frame) purpose only.

 Now that specialized hard-coding +/- in source seems unacceptable,
 I would like to hear how to handle this. Is there any better than
 introducing new mechanism such like opclass?

 Given the lack of time remaining in this CF, I'm tempted to propose
 ripping out the RANGE support and just trying to get ROWS committed.
 That should be substantially less controversial from a semantic
 standpoint, and it still seems like a considerable improvement in
 functionality.

 Thoughts?

 As expected. I don't mind splitting patch to be committable if users
 who expected this feature don't mind.

Well, they'll likely be happier with a partial feature than no feature
at all...  I agree with Tom that there's no time time now to resolve
the issue of how + and - should be handled.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-01-22 Thread Robert Haas
On Tue, Jan 19, 2010 at 3:02 PM, Hitoshi Harada umi.tan...@gmail.com wrote:
 2010/1/19 Hitoshi Harada umi.tan...@gmail.com:
 Yeah, that's my point, too. The planner has to distinguish four from
 sort pathkeys and to teach the executor the simple information which
 column should be used to determine frame. I was bit wrong because some
 of current executor code isn't like it, like using ordNumCols == 0 to
 know whether partition equals to frame, though

 And here's another version to fix this problem (I hope). Now the
 planner distinguish sort column from actual significant pathkeys. I
 tested it on both of 32bit and 64bit Linux.

Would it make sense to pull some of the infrastructure bits out of
this patch and commit those bits separately, so as to reduce the size
of the main patch?  In particular, the AggGetMemoryContext() stuff
looks like a good candidate for that treatment.

Why did you change BETWEEN from a TYPE_FUNC_NAME_KEYWORD to a COL_NAME_KEYWORD?

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-01-22 Thread Hitoshi Harada
2010/1/23 Robert Haas robertmh...@gmail.com:
 On Tue, Jan 19, 2010 at 3:02 PM, Hitoshi Harada umi.tan...@gmail.com wrote:
 2010/1/19 Hitoshi Harada umi.tan...@gmail.com:
 Yeah, that's my point, too. The planner has to distinguish four from
 sort pathkeys and to teach the executor the simple information which
 column should be used to determine frame. I was bit wrong because some
 of current executor code isn't like it, like using ordNumCols == 0 to
 know whether partition equals to frame, though

 And here's another version to fix this problem (I hope). Now the
 planner distinguish sort column from actual significant pathkeys. I
 tested it on both of 32bit and 64bit Linux.

 Would it make sense to pull some of the infrastructure bits out of
 this patch and commit those bits separately, so as to reduce the size
 of the main patch?  In particular, the AggGetMemoryContext() stuff
 looks like a good candidate for that treatment.

Fair enough. Attached contains that part only. I'll search more parts
like what you suggest, but there seems to be few parts because for
example the change of parser affects all the road to the executor.

 Why did you change BETWEEN from a TYPE_FUNC_NAME_KEYWORD to a 
 COL_NAME_KEYWORD?

Introducing new frame option syntax, shift/reduce conflict came happened.
http://archives.postgresql.org/message-id/e08cc0400911240908s7efaea85wc8505d228220b...@mail.gmail.com
http://archives.postgresql.org/message-id/6363.1229890...@sss.pgh.pa.us

Tom suggested it as RESERVED_KEYWORD a year ago, but COL_NAME_KEYWORD
works as well which is slightly more weak (ie more chances that you
can use the word) than RESERVED_KEYWORD. I'm not completely sure which
is better, though.

Regards,

-- 
Hitoshi Harada


agg_memorycontext.20100123.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-01-20 Thread Pavel Stehule
2010/1/19 Hitoshi Harada umi.tan...@gmail.com:
 2010/1/19 Hitoshi Harada umi.tan...@gmail.com:
 Yeah, that's my point, too. The planner has to distinguish four from
 sort pathkeys and to teach the executor the simple information which
 column should be used to determine frame. I was bit wrong because some
 of current executor code isn't like it, like using ordNumCols == 0 to
 know whether partition equals to frame, though

 And here's another version to fix this problem (I hope). Now the
 planner distinguish sort column from actual significant pathkeys. I
 tested it on both of 32bit and 64bit Linux.


I tested it, and reported problems are fixed

Thank you

Pavel

 Regards,


 --
 Hitoshi Harada


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-01-19 Thread Hitoshi Harada
2010/1/19 Hitoshi Harada umi.tan...@gmail.com:
 Yeah, that's my point, too. The planner has to distinguish four from
 sort pathkeys and to teach the executor the simple information which
 column should be used to determine frame. I was bit wrong because some
 of current executor code isn't like it, like using ordNumCols == 0 to
 know whether partition equals to frame, though

And here's another version to fix this problem (I hope). Now the
planner distinguish sort column from actual significant pathkeys. I
tested it on both of 32bit and 64bit Linux.

Regards,


-- 
Hitoshi Harada


more_frame_options.20100120.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-01-18 Thread Tom Lane
Hitoshi Harada umi.tan...@gmail.com writes:
 2010/1/17 Tom Lane t...@sss.pgh.pa.us:
 That's broken, whether it passes regression tests or not.  Not
 canonicalizing will mean that you fail to recognize equality to
 canonicalized pathkeys, and thus for example execute unnecessary
 sorts.

 So why did you leave canonicalize argument in
 make_pathkeys_for_window()? I thought you'd thought it would be needed
 false in the future.

No, it has to be false in calls made before query_planner runs and true
afterwards.  There's no flexibility there.

 In a RANGE offset mode query, for example:

 SELECT sum(ten) over (PARTITION BY four ORDER BY four RANGE BETWEEN 2
 PRECEDING AND 1 PRECEDING) FROM tenk1

 the frame is determined as from the first row which has four value
 - 2 to the last row which has four value - 1 and executor should
 know four value *is* the sort column even if the column is not
 actually significant. But the planner removes that information.

Maybe we're just talking past each other.  My point is that the planner
should record the fact that four is the sort column someplace where the
executor can find it easily.  AFAICS that doesn't mean it can't be the
canonicalized form of the sort key.  If a column is dropped out of the
canonical sort key then it's simply redundant, and hence not relevant to
determining the range.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-01-18 Thread Hitoshi Harada
2010/1/19 Tom Lane t...@sss.pgh.pa.us:
 Hitoshi Harada umi.tan...@gmail.com writes:
 In a RANGE offset mode query, for example:

 SELECT sum(ten) over (PARTITION BY four ORDER BY four RANGE BETWEEN 2
 PRECEDING AND 1 PRECEDING) FROM tenk1

 the frame is determined as from the first row which has four value
 - 2 to the last row which has four value - 1 and executor should
 know four value *is* the sort column even if the column is not
 actually significant. But the planner removes that information.

 Maybe we're just talking past each other.  My point is that the planner
 should record the fact that four is the sort column someplace where the
 executor can find it easily.  AFAICS that doesn't mean it can't be the
 canonicalized form of the sort key.  If a column is dropped out of the
 canonical sort key then it's simply redundant, and hence not relevant to
 determining the range.

Yeah, that's my point, too. The planner has to distinguish four from
sort pathkeys and to teach the executor the simple information which
column should be used to determine frame. I was bit wrong because some
of current executor code isn't like it, like using ordNumCols == 0 to
know whether partition equals to frame, though

Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-01-18 Thread Tom Lane
Hitoshi Harada umi.tan...@gmail.com writes:
 2010/1/19 Tom Lane t...@sss.pgh.pa.us:
 AFAICS that doesn't mean it can't be the
 canonicalized form of the sort key.  If a column is dropped out of the
 canonical sort key then it's simply redundant, and hence not relevant to
 determining the range.

 Yeah, that's my point, too. The planner has to distinguish four from
 sort pathkeys and to teach the executor the simple information which
 column should be used to determine frame. I was bit wrong because some
 of current executor code isn't like it, like using ordNumCols == 0 to
 know whether partition equals to frame, though

BTW, watch out for the possibility that the canonicalized key is empty.
This isn't an error case --- what it means is that the planner has
proven that all the rows have equal sort key values, so there's no
need to compare anything.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-01-16 Thread Hitoshi Harada
2010/1/16 Erik Rijkers e...@xs4all.nl:

 Thanks for the review. I've found another crash today and attached is
 fixed version. The case is:

 SELECT four, sum(ten) over (PARTITION BY four ORDER BY four RANGE 1
 PRECEDING) FROM tenk1 WHERE unique1  10;


 Hi,

 The patch (more_frame_options.20100115.patch.gz) applies cleanly, but the 
 regression test gives:

It doesn't happen here. Could you send more information like configure
options and EXPLAIN output of that query?

Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-01-16 Thread Pavel Stehule
2010/1/16 Hitoshi Harada umi.tan...@gmail.com:
 2010/1/16 Erik Rijkers e...@xs4all.nl:

 Thanks for the review. I've found another crash today and attached is
 fixed version. The case is:

 SELECT four, sum(ten) over (PARTITION BY four ORDER BY four RANGE 1
 PRECEDING) FROM tenk1 WHERE unique1  10;


 Hi,

 The patch (more_frame_options.20100115.patch.gz) applies cleanly, but the 
 regression test gives:

on fedora 32 bit is all ok. I'll check it on 64bit fedora on Monday.

Regards
Pavel


 It doesn't happen here. Could you send more information like configure
 options and EXPLAIN output of that query?

 Regards,


 --
 Hitoshi Harada

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-01-16 Thread Erik Rijkers
On Sat, January 16, 2010 09:29, Hitoshi Harada wrote:
 2010/1/16 Erik Rijkers e...@xs4all.nl:

 Thanks for the review. I've found another crash today and attached is
 fixed version. The case is:

 SELECT four, sum(ten) over (PARTITION BY four ORDER BY four RANGE 1
 PRECEDING) FROM tenk1 WHERE unique1  10;


 Hi,

 The patch (more_frame_options.20100115.patch.gz) applies cleanly, but the 
 regression test gives:

 It doesn't happen here. Could you send more information like configure
 options and EXPLAIN output of that query?


Sorry, I should have done that the first time.  Here is more info:

Linux 2.6.18-164.el5 x86_64 / CentOS release 5.4 (Final)

Source is CVS as of today (which without the patch compiles and runs the 
regression test OK).
With the patch, compiled + installed without 'make check', pg_config gives:

CONFIGURE = 
'--prefix=/var/data1/pg_stuff/pg_installations/pgsql.rows_frame_types'
'--with-pgport=6547'
CC = gcc
CPPFLAGS = -D_GNU_SOURCE
CFLAGS = -O2 -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement
-Wendif-labels -fno-strict-aliasing -fwrapv
CFLAGS_SL = -fpic
LDFLAGS = 
-Wl,-rpath,'/var/data1/pg_stuff/pg_installations/pgsql.rows_frame_types/lib'
LDFLAGS_SL =
LIBS = -lpgport -lz -lreadline -ltermcap -lcrypt -ldl -lm
VERSION = PostgreSQL 8.5devel-rows_frame_types



psql -f $pgsql_regress_sql/create_table.sql;
cat $pgsql_regress_data/tenk.data | psql -c copy tenk1 from stdin csv 
delimiter E'\t';


explain
  SELECT four, ten, sum(ten) over (partition by four order by four range 1 
preceding)
  FROM tenk1 WHERE unique1  10;


 QUERY PLAN

 WindowAgg  (cost=483.17..483.37 rows=10 width=8)
   -  Sort  (cost=483.17..483.19 rows=10 width=8)
 Sort Key: four
 -  Seq Scan on tenk1  (cost=0.00..483.00 rows=10 width=8)
   Filter: (unique1  10)
(5 rows)


explain analyze
  SELECT four, ten, sum(ten) over (partition by four order by four range 1 
preceding)
  FROM tenk1 WHERE unique1  10;

ERROR:  cannot extract system attribute from minimal tuple


hth,

Erik Rijkers




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-01-16 Thread Hitoshi Harada
2010/1/17 Erik Rijkers e...@xs4all.nl:
 On Sat, January 16, 2010 09:29, Hitoshi Harada wrote:
 2010/1/16 Erik Rijkers e...@xs4all.nl:

 Thanks for the review. I've found another crash today and attached is
 fixed version. The case is:

 SELECT four, sum(ten) over (PARTITION BY four ORDER BY four RANGE 1
 PRECEDING) FROM tenk1 WHERE unique1  10;


 Hi,

 The patch (more_frame_options.20100115.patch.gz) applies cleanly, but the 
 regression test gives:

 It doesn't happen here. Could you send more information like configure
 options and EXPLAIN output of that query?


 Sorry, I should have done that the first time.  Here is more info:

 Linux 2.6.18-164.el5 x86_64 / CentOS release 5.4 (Final)

Thanks, I confirmed it on my 64bit environment. My approach to solve
this was completely wrong.

The problem here is that when PARTITION BY clause equals to ORDER BY
clause window_pathkeys is canonicalized in make_pathkeys_for_window()
and so get_column_info_for_window() recognizes number of ORDER BY
columns as 0, in other word, window-ordNumCols = 0 
window-ordColIdx[0] is invalid. This behavior is ok in 8.4 because in
that case all rows are peer to each other and the partition = the
frame in RANGE mode. This assumption is now broken since
FRAMEOPTION_START_OFFSET and FRAMEOPTION_END_OFFSET are introduced,
which means that the current row can be out of the frame. So with
these options we must always evaluate if the current row is inside the
frame or not by *sort key column*. However, we don't have it in the
executor as the planner has removed it during canonicalization of
pathkeys.

So I previously added such code:
*** 2819,2825  get_column_info_for_window(PlannerInfo *root,
WindowClause *wc, List *tlist,
int numPart = list_length(wc-partitionClause);
int numOrder = list_length(wc-orderClause);

!   if (numSortCols == numPart + numOrder)
{
/* easy case */
*partNumCols = numPart;
--- 2836,2847 
int numPart = list_length(wc-partitionClause);
int numOrder = list_length(wc-orderClause);

!   /*
!* in RANGE offset mode, numOrder should be represented as-is.
!*/
!   if (numSortCols == numPart + numOrder ||
!   (wc-frameOptions  FRAMEOPTION_RANGE 
!wc-frameOptions  (FRAMEOPTION_START_VALUE | 
FRAMEOPTION_END_VALUE)))
{
/* easy case */
*partNumCols = numPart;

but it failed now, since the sortColIdx passed in has been
canonicalized already. And I tried to change not to canonicalize the
pathkeys in make_pathkeys_window() in such cases and succeeded then
passed all regression tests.

diff --git a/src/backend/optimizer/plan/planner.c
b/src/backend/optimizer/plan/planner.c
index 6ba051d..fee1302 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -1417,11 +1417,17 @@ grouping_planner(PlannerInfo *root, double
tuple_fraction)
int ordNumCols;
AttrNumber *ordColIdx;
Oid*ordOperators;
+   boolcanonicalize;
+
+   canonicalize = !(wc-frameOptions  
FRAMEOPTION_RANGE 
+   
wc-frameOptions 
+   
(FRAMEOPTION_START_VALUE |
+
FRAMEOPTION_END_VALUE));

window_pathkeys = make_pathkeys_for_window(root,

   wc,

   tlist,
-   
   true);
+   
   canonicalize);

/*
 * This is a bit tricky: we build a sort node 
even if we don't


I am not very sure if this is the correct answer. Thoughts?


Regards,

-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-01-16 Thread Tom Lane
Hitoshi Harada umi.tan...@gmail.com writes:
 ... I tried to change not to canonicalize the
 pathkeys in make_pathkeys_window() in such cases and succeeded then
 passed all regression tests.

That's broken, whether it passes regression tests or not.  Not
canonicalizing will mean that you fail to recognize equality to
canonicalized pathkeys, and thus for example execute unnecessary
sorts.

I haven't looked at the patch, but it sounds a bit like you are trying
to put logic into the executor that needs to be in the planner.  If the
executor is guessing about what the planner did, that's a design
failure.  The planner should figure out what needs to happen and tell
the executor exactly what to do, eg, which columns need to be compared
to determine partition membership.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-01-16 Thread Hitoshi Harada
2010/1/17 Tom Lane t...@sss.pgh.pa.us:
 Hitoshi Harada umi.tan...@gmail.com writes:
 ... I tried to change not to canonicalize the
 pathkeys in make_pathkeys_window() in such cases and succeeded then
 passed all regression tests.

 That's broken, whether it passes regression tests or not.  Not
 canonicalizing will mean that you fail to recognize equality to
 canonicalized pathkeys, and thus for example execute unnecessary
 sorts.

So why did you leave canonicalize argument in
make_pathkeys_for_window()? I thought you'd thought it would be needed
false in the future.

 I haven't looked at the patch, but it sounds a bit like you are trying
 to put logic into the executor that needs to be in the planner.  If the
 executor is guessing about what the planner did, that's a design
 failure.  The planner should figure out what needs to happen and tell
 the executor exactly what to do, eg, which columns need to be compared
 to determine partition membership.

I'd think I'm putting logic into the planner that needs to be in the
executor. Anyways, RANGE offset mode is quite different from existing
framework.

In a RANGE offset mode query, for example:

SELECT sum(ten) over (PARTITION BY four ORDER BY four RANGE BETWEEN 2
PRECEDING AND 1 PRECEDING) FROM tenk1

the frame is determined as from the first row which has four value
- 2 to the last row which has four value - 1 and executor should
know four value *is* the sort column even if the column is not
actually significant. But the planner removes that information. It
seems to me that what you say above is to put logic into the planner
how to determine the frame, but I'd think the planner should leave
that information about sort column for the executor, of course in
another way than non-canonicalizing approach.

I'll try it that way. If I'm missing something still, please point it out.


Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-01-15 Thread Erik Rijkers

 Thanks for the review. I've found another crash today and attached is
 fixed version. The case is:

 SELECT four, sum(ten) over (PARTITION BY four ORDER BY four RANGE 1
 PRECEDING) FROM tenk1 WHERE unique1  10;


Hi,

The patch (more_frame_options.20100115.patch.gz) applies cleanly, but the 
regression test gives:


***
/var/data1/pg_stuff/pg_sandbox/pgsql.rows_frame_types/src/test/regress/expected/window.out
  2010-01-15
22:36:01.0 +0100
---
/var/data1/pg_stuff/pg_sandbox/pgsql.rows_frame_types/src/test/regress/results/window.out
   2010-01-15
22:37:01.0 +0100
***
*** 934,953 

  SELECT four, ten, sum(ten) over (partition by four order by four range 1 
preceding)
  FROM tenk1 WHERE unique1  10;
!  four | ten | sum
! --+-+-
! 0 |   0 |  12
! 0 |   8 |  12
! 0 |   4 |  12
! 1 |   5 |  15
! 1 |   9 |  15
! 1 |   1 |  15
! 2 |   6 |   8
! 2 |   2 |   8
! 3 |   3 |  10
! 3 |   7 |  10
! (10 rows)
!
  CREATE VIEW v_window AS
SELECT i, sum(i) over (order by i rows between 1 preceding and 1 
following) as sum_rows,
sum(i) over (order by i / 3 range between 1 preceding and 1 following) 
as sum_range
--- 934,940 

  SELECT four, ten, sum(ten) over (partition by four order by four range 1 
preceding)
  FROM tenk1 WHERE unique1  10;
! ERROR:  cannot extract system attribute from minimal tuple
  CREATE VIEW v_window AS
SELECT i, sum(i) over (order by i rows between 1 preceding and 1 
following) as sum_rows,
sum(i) over (order by i / 3 range between 1 preceding and 1 following) 
as sum_range

==



regards,


Erik Rijkers



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: More frame options in window functions

2010-01-14 Thread Hitoshi Harada
2010/1/14 Pavel Stehule pavel.steh...@gmail.com:
 Hello

 I looked on Hitoshi's patch - and my result is:

Thanks for the review. I've found another crash today and attached is
fixed version. The case is:

SELECT four, sum(ten) over (PARTITION BY four ORDER BY four RANGE 1
PRECEDING) FROM tenk1 WHERE unique1  10;

The planner recognizes windowagg-ordNumCol as 0 for optimization, but
RANGE offset cases should really have interest on that value as the
syntax says.

Regards,


-- 
Hitoshi Harada


more_frame_options.20100115.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers