Re: [HACKERS] Window function optimisation, allow pushdowns of items matching PARTITION BY clauses

2014-06-28 Thread Tom Lane
David Rowley dgrow...@gmail.com writes:
 [ wfunc_pushdown_partitionby_v0.4.patch ]

I've committed this with the addition of a volatility check and some
other basically-cosmetic adjustments.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Window function optimisation, allow pushdowns of items matching PARTITION BY clauses

2014-06-28 Thread David Rowley
On 28 June 2014 18:12, Tom Lane t...@sss.pgh.pa.us wrote:

 David Rowley dgrow...@gmail.com writes:
  [ wfunc_pushdown_partitionby_v0.4.patch ]

 I've committed this with the addition of a volatility check and some
 other basically-cosmetic adjustments.


Great, thank you for making the required changes.

Vik, thank you for reviewing the patch.

Regards

David Rowley

regards, tom lane



Re: [HACKERS] Window function optimisation, allow pushdowns of items matching PARTITION BY clauses

2014-06-26 Thread Vik Fearing
On 06/21/2014 02:03 PM, David Rowley wrote:
 I'm marking this Waiting on Author pending discussion on pushing down
 entire expressions, but on the whole I think this is pretty much ready.
 
 
 As I said above, I don't think playing around with that code is really
 work for this patch. It can be done later in another patch if people
 think it would be useful.

I tend to agree.

This latest patch is ready for a committer to look at now.  The weird
comments have been changed, superfluous regression tests removed, and
nothing done about expression pushdown per (brief) discussion.
-- 
Vik


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Window function optimisation, allow pushdowns of items matching PARTITION BY clauses

2014-06-26 Thread Tom Lane
Vik Fearing vik.fear...@dalibo.com writes:
 This latest patch is ready for a committer to look at now.  The weird
 comments have been changed, superfluous regression tests removed, and
 nothing done about expression pushdown per (brief) discussion.

I started to look at this patch and realized that there's an issue that
isn't covered, which is not too surprising because the existing code fails
to cover it too.  Remember that the argument for pushing down being safe
at all is that we expect the pushed-down qual to yield the same result at
all rows of a given partition, so that we either include or exclude the
whole partition and thereby don't change window function results.  This
means that not only must the qual expression depend only on partitioning
columns, but *it had better not be volatile*.

In exactly the same way, it isn't safe to push down quals into
subqueries that use DISTINCT unless the quals are non-volatile.  This
consideration is missed by the current code, and I think that's a bug.

(Pushing down volatile quals would also be unsafe in subqueries involving
aggregation, except that we put them into HAVING so that they're executed
only once per subquery output row anyway.)

Given the lack of prior complaints, I'm not excited about back-patching a
change to prevent pushing down volatile quals in the presence of DISTINCT;
but I think we probably ought to fix it in 9.5, and maybe 9.4 too.

Thoughts?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Window function optimisation, allow pushdowns of items matching PARTITION BY clauses

2014-06-26 Thread Vik Fearing
On 06/27/2014 02:49 AM, Tom Lane wrote:
 Vik Fearing vik.fear...@dalibo.com writes:
 This latest patch is ready for a committer to look at now.  The weird
 comments have been changed, superfluous regression tests removed, and
 nothing done about expression pushdown per (brief) discussion.
 
 I started to look at this patch and realized that there's an issue that
 isn't covered, which is not too surprising because the existing code fails
 to cover it too.  Remember that the argument for pushing down being safe
 at all is that we expect the pushed-down qual to yield the same result at
 all rows of a given partition, so that we either include or exclude the
 whole partition and thereby don't change window function results.  This
 means that not only must the qual expression depend only on partitioning
 columns, but *it had better not be volatile*.
 
 In exactly the same way, it isn't safe to push down quals into
 subqueries that use DISTINCT unless the quals are non-volatile.  This
 consideration is missed by the current code, and I think that's a bug.
 
 (Pushing down volatile quals would also be unsafe in subqueries involving
 aggregation, except that we put them into HAVING so that they're executed
 only once per subquery output row anyway.)

Are you going to take care of all this, or should David or I take a
crack at it?  The commitfest app still shows Ready for Committer.

 Given the lack of prior complaints, I'm not excited about back-patching a
 change to prevent pushing down volatile quals in the presence of DISTINCT;
 but I think we probably ought to fix it in 9.5, and maybe 9.4 too.
 
 Thoughts?

I didn't test it myself, I'm just taking your word on it.

If it's a bug, it should obviously be fixed in 9.5.  As for 9.4, I have
always viewed a beta as a time to fix bugs so I vote to backpatch it at
least that far.

Why wouldn't it go back all the way to 9.0?  (assuming 8.4 is dead)
-- 
Vik


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Window function optimisation, allow pushdowns of items matching PARTITION BY clauses

2014-06-26 Thread Tom Lane
Vik Fearing vik.fear...@dalibo.com writes:
 On 06/27/2014 02:49 AM, Tom Lane wrote:
 In exactly the same way, it isn't safe to push down quals into
 subqueries that use DISTINCT unless the quals are non-volatile.  This
 consideration is missed by the current code, and I think that's a bug.

 Given the lack of prior complaints, I'm not excited about back-patching a
 change to prevent pushing down volatile quals in the presence of DISTINCT;
 but I think we probably ought to fix it in 9.5, and maybe 9.4 too.

 Why wouldn't it go back all the way to 9.0?  (assuming 8.4 is dead)

People get unhappy when minor releases de-optimize queries that had
been working for them.  It's not too late to change the behavior of
9.4, but I'm hesitant to do it in already-released branches, especially
in the absence of any complaints from the field.

 Are you going to take care of all this, or should David or I take a
 crack at it?  The commitfest app still shows Ready for Committer.

I can deal with it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Window function optimisation, allow pushdowns of items matching PARTITION BY clauses

2014-06-26 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
  Why wouldn't it go back all the way to 9.0?  (assuming 8.4 is dead)
 
 People get unhappy when minor releases de-optimize queries that had
 been working for them.  It's not too late to change the behavior of
 9.4, but I'm hesitant to do it in already-released branches, especially
 in the absence of any complaints from the field.

Agreed.  Back-patching to released versions would likely end up making
for a nasty surprise in a minor point release for some users.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Window function optimisation, allow pushdowns of items matching PARTITION BY clauses

2014-06-21 Thread David Rowley
On 21 June 2014 01:38, Vik Fearing vik.fear...@dalibo.com wrote:

 I've had a chance to look at this and here is my review.

 On 04/14/2014 01:19 PM, David Rowley wrote:
  I've included the updated patch with some regression tests.

 The first thing I noticed is there is no documentation, but I don't
 think we document such things outside of the release notes, so that's
 probably normal.


This prompted me to have a look to see if there's anything written in the
documents about the reasons why we might not push quals down, but I didn't
manage to find anything to that affect.


  The final test I added tests that an unused window which would, if used,
  cause the pushdown not to take place still stops the pushdownf from
  happening... I know you both talked about this case in the other thread,
  but I've done nothing for it as Tom mentioned that this should likely be
  fixed elsewhere, if it's even worth worrying about at all. I'm not quite
  sure if I needed to bother including this test for it, but I did anyway,
  it can be removed if it is deemed unneeded.

 I don't think this test has any business being in this patch.  Removal
 of unused things should be a separate patch and once that's done your
 patch should work as expected.  This regression test you have here
 almost demands that that other removal patch shouldn't happen.


Agreed, I've removed that test now.


  Per Thomas' comments, I added a couple of tests that ensure that the
  order of the columns in the partition by clause don't change the
  behaviour. Looking at the implementation of the actual code makes this
  test seems a bit unneeded as really but I thought that the test should
  not assume anything about the implementation so from that point of view
  the extra test seems like a good idea.

 Every possible permutation of everything is overkill, but I think having
 a test that makes sure the pushdown happens when the partition in
 question isn't first in the list is a good thing.


In the updated patch I've removed some a few extra tests that were just
testing the same clauses in the PARTITION BY but in a different order.


  For now I can't think of much else to do for the patch, but I'd really
  appreciate it Thomas if you could run it through the paces if you can
  find any time for it, or maybe just give my regression tests a once over
  to see if you can think of any other cases that should be covered.

 I'm not Thomas, but...

 This patch is very simple.  There's nothing wrong with that, of course,
 but I wonder how hard it would be to push more stuff down.  For example,
 you have this case which works perfectly by not pushing down:

 SELECT * FROM (
 SELECT partid,
winagg() OVER (PARTITION BY partid+0)
 FROM t) wa
 WHERE partid = 1;

 But then you have this case which doesn't push down:

 SELECT * FROM (
 SELECT partid,
winagg() OVER (PARTITION BY partid+0)
 FROM t) wa
 WHERE partid+0 = 1;

 I don't know what's involved in fixing that, or if we do that sort of
 thing elsewhere, but it's something to consider.


I had a look at this and at first I thought it was just as simple as
removing the if (tle-resjunk) continue; from check_output_expressions, but
after removing that I see that it's a bit more complex.
In qual_is_pushdown_safe it pulls (using pull_var_clause) the Vars from the
outer WHERE clause and not the whole expression, it then checks, in your
case if partid (not partid+0) is safe to push down, and sees that it's
not, due to this patches code marking it as not because partid is not in
the PARTITION BY clause.

I really don't think it's the business of this patch to make changes around
here. Perhaps it would be worth looking into in more detail in the future.
Although, you can probably get what you want by just writing the query as:

SELECT * FROM (
SELECT partid+0 as partid,
   count(*) OVER (PARTITION BY partid+0)
FROM t) wa
WHERE partid = 1;

Or if you really need the unmodified partid, then you could add another
column to the target list and just not do select * in the outer query.


Multi-column pushdowns work as expected.


 I have an issue with some of the code comments:

 In subquery_is_pushdown_safe you reduced the number of points from
 three to two.  The comments used to say Check point 1 and Check point
 2 but the third was kind of tested throughout the rest of the code so
 didn't have a dedicated comment.  Now that there are only two checks,
 the Check point 1 without a 2 seems a little bit odd.  The attached
 patch provides a suggestion for improvement.

 The same kind of weirdness is found in check_output_expressions but I
 don't really have any suggestions to fix it so I just left it.

 The attached patch also fixes some typos.


That seems like a good change, oh and well spotted on the typo.
I've applied your patch into my local copy of the code here. Thanks


 I'm marking this Waiting on Author pending discussion on pushing down
 entire 

Re: [HACKERS] Window function optimisation, allow pushdowns of items matching PARTITION BY clauses

2014-06-20 Thread Vik Fearing
I've had a chance to look at this and here is my review.

On 04/14/2014 01:19 PM, David Rowley wrote:
 I've included the updated patch with some regression tests.

The first thing I noticed is there is no documentation, but I don't
think we document such things outside of the release notes, so that's
probably normal.

 The final test I added tests that an unused window which would, if used,
 cause the pushdown not to take place still stops the pushdownf from
 happening... I know you both talked about this case in the other thread,
 but I've done nothing for it as Tom mentioned that this should likely be
 fixed elsewhere, if it's even worth worrying about at all. I'm not quite
 sure if I needed to bother including this test for it, but I did anyway,
 it can be removed if it is deemed unneeded.

I don't think this test has any business being in this patch.  Removal
of unused things should be a separate patch and once that's done your
patch should work as expected.  This regression test you have here
almost demands that that other removal patch shouldn't happen.

 Per Thomas' comments, I added a couple of tests that ensure that the
 order of the columns in the partition by clause don't change the
 behaviour. Looking at the implementation of the actual code makes this
 test seems a bit unneeded as really but I thought that the test should
 not assume anything about the implementation so from that point of view
 the extra test seems like a good idea.

Every possible permutation of everything is overkill, but I think having
a test that makes sure the pushdown happens when the partition in
question isn't first in the list is a good thing.

 For now I can't think of much else to do for the patch, but I'd really
 appreciate it Thomas if you could run it through the paces if you can
 find any time for it, or maybe just give my regression tests a once over
 to see if you can think of any other cases that should be covered.

I'm not Thomas, but...

This patch is very simple.  There's nothing wrong with that, of course,
but I wonder how hard it would be to push more stuff down.  For example,
you have this case which works perfectly by not pushing down:

SELECT * FROM (
SELECT partid,
   winagg() OVER (PARTITION BY partid+0)
FROM t) wa
WHERE partid = 1;

But then you have this case which doesn't push down:

SELECT * FROM (
SELECT partid,
   winagg() OVER (PARTITION BY partid+0)
FROM t) wa
WHERE partid+0 = 1;

I don't know what's involved in fixing that, or if we do that sort of
thing elsewhere, but it's something to consider.

Multi-column pushdowns work as expected.


I have an issue with some of the code comments:

In subquery_is_pushdown_safe you reduced the number of points from
three to two.  The comments used to say Check point 1 and Check point
2 but the third was kind of tested throughout the rest of the code so
didn't have a dedicated comment.  Now that there are only two checks,
the Check point 1 without a 2 seems a little bit odd.  The attached
patch provides a suggestion for improvement.

The same kind of weirdness is found in check_output_expressions but I
don't really have any suggestions to fix it so I just left it.

The attached patch also fixes some typos.


I'm marking this Waiting on Author pending discussion on pushing down
entire expressions, but on the whole I think this is pretty much ready.
-- 
Vik
*** a/src/backend/optimizer/path/allpaths.c
--- b/src/backend/optimizer/path/allpaths.c
***
*** 1692,1698  subquery_is_pushdown_safe(Query *subquery, Query *topquery,
  {
  	SetOperationStmt *topop;
  
! 	/* Check point 1 */
  	if (subquery-limitOffset != NULL || subquery-limitCount != NULL)
  		return false;
  
--- 1692,1698 
  {
  	SetOperationStmt *topop;
  
! 	/* Pushdown is unsafe if we have a LIMIT/OFFSET clause */
  	if (subquery-limitOffset != NULL || subquery-limitCount != NULL)
  		return false;
  
***
*** 1794,1802  recurse_pushdown_safe(Node *setOp, Query *topquery,
   *
   * 4. If the subquery has windowing functions we are able to push down any
   * quals that are referenced in all of the subquery's window PARTITION BY
!  * clauses. This is permitted as window partitions are completed isolated
   * from each other and removing records from unneeded partitions early has
!  * no affect on the query results.
   */
  static void
  check_output_expressions(Query *subquery, bool *unsafeColumns)
--- 1794,1802 
   *
   * 4. If the subquery has windowing functions we are able to push down any
   * quals that are referenced in all of the subquery's window PARTITION BY
!  * clauses. This is permitted as window partitions are completely isolated
   * from each other and removing records from unneeded partitions early has
!  * no effect on the query results.
   */
  static void
  check_output_expressions(Query *subquery, bool *unsafeColumns)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes 

Re: [HACKERS] Window function optimisation, allow pushdowns of items matching PARTITION BY clauses

2014-06-17 Thread David Rowley
On Sun, May 25, 2014 at 2:10 PM, Thomas Mayer thomas.ma...@student.kit.edu
wrote:

 Hello David,

 sorry for the late response. I will try out your changes from the view of
 a user in mid-June. However, I can't do a trustworthy code review as I'm
 not an experienced postgre-hacker (yet).


Thanks, that sort of review will be a great start.

I've attached an updated patch as it seems there's been a change that
removes redundant sorts which was breaking one of the regression tests in
v0.2 of the patch.

+EXPLAIN (COSTS OFF)
+SELECT depname,depsalary FROM (SELECT depname,
+  sum(salary) OVER (PARTITION BY depname) depsalary,
+  rank() OVER (ORDER BY enroll_date) emprank
+  FROM empsalary) emp
+WHERE depname = 'sales';

This used to produced a plan which included a sort by enroll_date, but this
is no longer the case. I've updated the expected results to remove the
extra sort from the explain output

Regards

David Rowley


wfunc_pushdown_partitionby_v0.3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Window function optimisation, allow pushdowns of items matching PARTITION BY clauses

2014-05-25 Thread Thomas Mayer

Hello David,

sorry for the late response. I will try out your changes from the view 
of a user in mid-June. However, I can't do a trustworthy code review as 
I'm not an experienced postgre-hacker (yet).


Best Regards
Thomas


Am 14.04.2014 13:19, schrieb David Rowley:

On 14 April 2014 03:31, Tom Lane t...@sss.pgh.pa.us
mailto:t...@sss.pgh.pa.us wrote:

David Rowley dgrow...@gmail.com mailto:dgrow...@gmail.com writes:
  On this thread
 
http://www.postgresql.org/message-id/52c6f712.6040...@student.kit.edu there
  was some discussion around allowing push downs of quals that
happen to be
  in every window clause of the sub query. I've quickly put
together a patch
  which does this (see attached)

I think you should have check_output_expressions deal with this,
instead.
Compare the existing test there for non-DISTINCT output columns.


I've moved the code there and it seems like a much better place for it.
Thanks for the tip.

  Oh and I know that my function
var_exists_in_all_query_partition_by_clauses
  has no business in allpaths.c, I'll move it out as soon as I find
a better
  home for it.

I might be wrong, but I think you could just embed that search loop in
check_output_expressions, and it wouldn't be too ugly.


I've changed the helper function to take the TargetEntry now the same
as targetIsInSortList does for the hasDistinctOn test and I renamed the
helper function to targetExistsInAllQueryPartitionByClauses. It seems
much better, but there's probably a bit of room for improving on the
names some more.

I've included the updated patch with some regression tests.
The final test I added tests that an unused window which would, if used,
cause the pushdown not to take place still stops the pushdownf from
happening... I know you both talked about this case in the other thread,
but I've done nothing for it as Tom mentioned that this should likely be
fixed elsewhere, if it's even worth worrying about at all. I'm not quite
sure if I needed to bother including this test for it, but I did anyway,
it can be removed if it is deemed unneeded.

Per Thomas' comments, I added a couple of tests that ensure that the
order of the columns in the partition by clause don't change the
behaviour. Looking at the implementation of the actual code makes this
test seems a bit unneeded as really but I thought that the test should
not assume anything about the implementation so from that point of view
the extra test seems like a good idea.

For now I can't think of much else to do for the patch, but I'd really
appreciate it Thomas if you could run it through the paces if you can
find any time for it, or maybe just give my regression tests a once over
to see if you can think of any other cases that should be covered.

Regards

David Rowley


--
==
Thomas Mayer
Durlacher Allee 61
D-76131 Karlsruhe
Telefon: +49-721-2081661
Fax: +49-721-72380001
Mobil:   +49-174-2152332
E-Mail:  thomas.ma...@student.kit.edu
===



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Window function optimisation, allow pushdowns of items matching PARTITION BY clauses

2014-04-14 Thread David Rowley
On 14 April 2014 03:31, Tom Lane t...@sss.pgh.pa.us wrote:

 David Rowley dgrow...@gmail.com writes:
  On this thread
  http://www.postgresql.org/message-id/52c6f712.6040...@student.kit.eduthere
  was some discussion around allowing push downs of quals that happen to be
  in every window clause of the sub query. I've quickly put together a
 patch
  which does this (see attached)

 I think you should have check_output_expressions deal with this, instead.
 Compare the existing test there for non-DISTINCT output columns.


I've moved the code there and it seems like a much better place for it.
Thanks for the tip.


  Oh and I know that my function
 var_exists_in_all_query_partition_by_clauses
  has no business in allpaths.c, I'll move it out as soon as I find a
 better
  home for it.

 I might be wrong, but I think you could just embed that search loop in
 check_output_expressions, and it wouldn't be too ugly.


I've changed the helper function to take the TargetEntry now the same
as targetIsInSortList does for the hasDistinctOn test and I renamed the
helper function to targetExistsInAllQueryPartitionByClauses. It seems much
better, but there's probably a bit of room for improving on the names some
more.

I've included the updated patch with some regression tests.
The final test I added tests that an unused window which would, if used,
cause the pushdown not to take place still stops the pushdownf from
happening... I know you both talked about this case in the other thread,
but I've done nothing for it as Tom mentioned that this should likely be
fixed elsewhere, if it's even worth worrying about at all. I'm not quite
sure if I needed to bother including this test for it, but I did anyway, it
can be removed if it is deemed unneeded.

Per Thomas' comments, I added a couple of tests that ensure that the order
of the columns in the partition by clause don't change the behaviour.
Looking at the implementation of the actual code makes this test seems a
bit unneeded as really but I thought that the test should not assume
anything about the implementation so from that point of view the extra test
seems like a good idea.

For now I can't think of much else to do for the patch, but I'd really
appreciate it Thomas if you could run it through the paces if you can find
any time for it, or maybe just give my regression tests a once over to see
if you can think of any other cases that should be covered.

Regards

David Rowley


wfunc_pushdown_partitionby_v0.2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Window function optimisation, allow pushdowns of items matching PARTITION BY clauses

2014-04-13 Thread Tom Lane
David Rowley dgrow...@gmail.com writes:
 On this thread
 http://www.postgresql.org/message-id/52c6f712.6040...@student.kit.edu there
 was some discussion around allowing push downs of quals that happen to be
 in every window clause of the sub query. I've quickly put together a patch
 which does this (see attached)

I think you should have check_output_expressions deal with this, instead.
Compare the existing test there for non-DISTINCT output columns.

 Oh and I know that my function var_exists_in_all_query_partition_by_clauses
 has no business in allpaths.c, I'll move it out as soon as I find a better
 home for it.

I might be wrong, but I think you could just embed that search loop in
check_output_expressions, and it wouldn't be too ugly.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers