Re: [HACKERS] wCTE: why not finish sub-updates at the end, not the beginning?

2011-02-27 Thread Tom Lane
I wrote:
 I'm inclined to think that it would be best to move the responsibility
 for calling AfterTriggerBeginQuery/AfterTriggerEndQuery into the
 executor.  That would get us down to

 CreateQueryDesc(...);
 ExecutorStart(...);   // now includes AfterTriggerBeginQuery
 ExecutorRun(...); // zero or more times
 ExecutorFinish(...);  // ModifyTable cleanup, AfterTriggerEndQuery
 ExecutorEnd(...); // just does what it's always done
 FreeQueryDesc(...);

 where EXPLAIN without ANALYZE would skip ExecutorRun and ExecutorFinish.

 IMO the major disadvantage of a refactoring like this is the possibility
 of sins of omission in third-party code, in particular somebody not
 noticing the added requirement to call ExecutorFinish.  We could help
 them out by adding an Assert in ExecutorEnd to verify that
 ExecutorFinish had been called (unless explain-only mode).  A variant of
 that problem is an auto_explain-like add-on not noticing that they
 probably want to hook into ExecutorFinish if they'd previously been
 hooking ExecutorRun.  I don't see any simple check for that though.
 The other possible failure mode is forgetting to remove calls to the two
 trigger functions, but we could encourage getting that right by renaming
 those two functions.

This is committed.  I desisted from the last change (renaming the
trigger functions) because it seemed unnecessary.  If someone does
forget to remove redundant AfterTriggerBeginQuery/AfterTriggerEndQuery
calls, it won't hurt them much, just waste a few cycles stacking and
unstacking useless trigger contexts.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE: why not finish sub-updates at the end, not the beginning?

2011-02-26 Thread Heikki Linnakangas

On 26.02.2011 07:55, Tom Lane wrote:

So we really need some refactoring here.  I dislike adding another
fundamental step to the ExecutorStart/ExecutorRun/ExecutorEnd sequence,
but there may not be a better way.


Could you keep the sequence unchanged for non-EXPLAIN callers with some 
refactoring? Add an exposed function like ExecutorFinishRun() that 
Explain calls explicitly in the EXPLAIN ANALYZE case, and modify 
ExecutorEnd to call it too, if it hasn't been called yet and the 
explain-only flag isn't set.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE: why not finish sub-updates at the end, not the beginning?

2011-02-26 Thread Greg Stark
On Sat, Feb 26, 2011 at 5:55 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 So we really need some refactoring here.  I dislike adding another
 fundamental step to the ExecutorStart/ExecutorRun/ExecutorEnd sequence,
 but there may not be a better way.  The only way I see to fix this
 without changing that API is to have ExecutorRun do the cleanup
 processing just after the top plan node returns a null tuple, and that
 seems a bit ugly as well.


How would that handle the case of a cursor which isn't read to
completion? Should it still execute the CTEs to completion?

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE: why not finish sub-updates at the end, not the beginning?

2011-02-26 Thread Dean Rasheed
On 26 February 2011 05:55, Tom Lane t...@sss.pgh.pa.us wrote:
 Further experimentation has reminded me of why I didn't want to put such
 processing in ExecutorEnd :-(.  There are some nasty interactions with
 EXPLAIN:

 1. EXPLAIN ANALYZE fails to include the execution cycles associated with
 running the ModifyTable nodes to completion.  In the worst case, such as
 WITH t AS (INSERT ...) SELECT 1, it will claim the INSERT subplan is
 never executed, even though rows certainly got inserted.  This is
 because EXPLAIN extracts all the counts from the execution state tree
 before shutting it down with ExecutorEnd.

 2. But it gets worse.  Try the same query *without* ANALYZE.  You'll
 find the INSERT executes anyway!  That's because EXPLAIN still calls
 ExecutorEnd to clean up the execution state tree, and ExecutorEnd
 doesn't realize it's not supposed to run any of the plan.


There's a third problem: AfterTriggerEndQuery() is called before
ExecutorEnd(), and so if the post-processing is done in ExecutorEnd()
and it attempts to queue up any AFTER triggers, it fails (ERROR:
AfterTriggerSaveEvent() called outside of query).


 So we really need some refactoring here.  I dislike adding another
 fundamental step to the ExecutorStart/ExecutorRun/ExecutorEnd sequence,
 but there may not be a better way.  The only way I see to fix this
 without changing that API is to have ExecutorRun do the cleanup
 processing just after the top plan node returns a null tuple, and that
 seems a bit ugly as well.

 Thoughts?


Could the post-processing not be done at the end of ExecutePlan()?

Regards,
Dean

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE: why not finish sub-updates at the end, not the beginning?

2011-02-26 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 On Sat, Feb 26, 2011 at 5:55 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 So we really need some refactoring here.  I dislike adding another
 fundamental step to the ExecutorStart/ExecutorRun/ExecutorEnd sequence,
 but there may not be a better way.  The only way I see to fix this
 without changing that API is to have ExecutorRun do the cleanup
 processing just after the top plan node returns a null tuple, and that
 seems a bit ugly as well.

 How would that handle the case of a cursor which isn't read to
 completion? Should it still execute the CTEs to completion?

Right at the moment we dodge that issue by disallowing wCTEs in cursors.
If we did allow them, then I would say that the wCTEs have to be run to
completion when the cursor is closed.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE: why not finish sub-updates at the end, not the beginning?

2011-02-26 Thread Greg Stark
On Sat, Feb 26, 2011 at 4:10 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Right at the moment we dodge that issue by disallowing wCTEs in cursors.
 If we did allow them, then I would say that the wCTEs have to be run to
 completion when the cursor is closed.


Does that really dodge anything? Isn't it just the same as running a
query from a client and closing the result without reading to the end?

ExecutorEnd would be called but ExecutorRun would never be called to
the end of the scan.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE: why not finish sub-updates at the end, not the beginning?

2011-02-26 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 26.02.2011 07:55, Tom Lane wrote:
 So we really need some refactoring here.  I dislike adding another
 fundamental step to the ExecutorStart/ExecutorRun/ExecutorEnd sequence,
 but there may not be a better way.

 Could you keep the sequence unchanged for non-EXPLAIN callers with some 
 refactoring? Add an exposed function like ExecutorFinishRun() that 
 Explain calls explicitly in the EXPLAIN ANALYZE case, and modify 
 ExecutorEnd to call it too, if it hasn't been called yet and the 
 explain-only flag isn't set.

I had been toying with the same idea, but it doesn't work, as Dean
Rasheed points out nearby.  The full monty for running the executor
these days is really

CreateQueryDesc(...);
AfterTriggerBeginQuery(...);
ExecutorStart(...);
ExecutorRun(...);   // zero or more times
AfterTriggerEndQuery(...);
ExecutorEnd(...);
FreeQueryDesc(...);

ExecutorEnd can *not* do anything that might fire triggers, because it's
too late: AfterTriggerEndQuery has already been called.

BTW, there are various places that believe they can skip
AfterTriggerBeginQuery/AfterTriggerEndQuery if operation == CMD_SELECT,
an assumption that no longer holds if the query has wCTEs.  So we have
some work to do on the callers no matter what.

I'm inclined to think that it would be best to move the responsibility
for calling AfterTriggerBeginQuery/AfterTriggerEndQuery into the
executor.  That would get us down to

CreateQueryDesc(...);
ExecutorStart(...); // now includes AfterTriggerBeginQuery
ExecutorRun(...);   // zero or more times
ExecutorFinish(...);// ModifyTable cleanup, AfterTriggerEndQuery
ExecutorEnd(...);   // just does what it's always done
FreeQueryDesc(...);

where EXPLAIN without ANALYZE would skip ExecutorRun and ExecutorFinish.

The RI triggers have a requirement for being able to run this sequence
without the AfterTriggerBeginQuery/AfterTriggerEndQuery calls (cf
SPI_execute_snapshot's fire_triggers parameter), but we could support
that by adding an ExecutorStart flag that tells it to suppress those
trigger calls.

IMO the major disadvantage of a refactoring like this is the possibility
of sins of omission in third-party code, in particular somebody not
noticing the added requirement to call ExecutorFinish.  We could help
them out by adding an Assert in ExecutorEnd to verify that
ExecutorFinish had been called (unless explain-only mode).  A variant of
that problem is an auto_explain-like add-on not noticing that they
probably want to hook into ExecutorFinish if they'd previously been
hooking ExecutorRun.  I don't see any simple check for that though.
The other possible failure mode is forgetting to remove calls to the two
trigger functions, but we could encourage getting that right by renaming
those two functions.

Comments?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE: why not finish sub-updates at the end, not the beginning?

2011-02-26 Thread Marko Tiikkaja

On 2011-02-26 7:18 PM, Tom Lane wrote:

IMO the major disadvantage of a refactoring like this is the possibility
of sins of omission in third-party code, in particular somebody not
noticing the added requirement to call ExecutorFinish.  We could help
them out by adding an Assert in ExecutorEnd to verify that
ExecutorFinish had been called (unless explain-only mode).  A variant of
that problem is an auto_explain-like add-on not noticing that they
probably want to hook into ExecutorFinish if they'd previously been
hooking ExecutorRun.  I don't see any simple check for that though.
The other possible failure mode is forgetting to remove calls to the two
trigger functions, but we could encourage getting that right by renaming
those two functions.


While I don't really like the possibility of breaking third party 
modules, I think the idea is good.  Also +1 for adding checks where 
possible.



Regards,
Marko Tiikkaja

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE: why not finish sub-updates at the end, not the beginning?

2011-02-25 Thread Robert Haas
On Fri, Feb 25, 2011 at 9:58 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I had what seems to me a remarkably good idea, though maybe someone else
 can spot a problem with it.  Given that we've decided to run the
 modifying sub-queries all with the same command counter ID, they are
 logically executing in parallel.  The current implementation takes no
 advantage of that fact, though: it's based around the idea of running
 the updates strictly sequentially.  I think we should change it so that
 the updates happen physically, not only logically, concurrently.
 Specifically, I'm imagining getting rid of the patch's additions to
 InitPlan and ExecutePlan that find all the modifying sub-queries and
 force them to be cycled to completion before the main plan runs.
 Just run the main plan and let it pull tuples from the CTEs as needed.
 Then, in ExecutorEnd, cycle any unfinished ModifyTable nodes to
 completion before shutting down the plan.  (In the event of an error,
 we'd never get to ExecutorEnd, but it doesn't matter since whatever
 updates we did apply are nullified anyhow.)

 This has a number of immediate and future implementation benefits:

 1. RETURNING tuples that aren't actually needed by the main plan
 don't need to be buffered anywhere.  (ExecutorEnd would just pull
 directly from the ModifyTable nodes, ignoring their parent CTE
 nodes, in all cases.)

 2. In principle, in many common cases the RETURNING tuples wouldn't have
 to be buffered at all, but could be consumed on-the-fly.  I think that
 right now the CTEScan nodes might still buffer the tuples so they can
 regurgitate them in case of being rescanned, but it's not hard to see
 how that could be improved later if it doesn't work immediately.

 3. The code could be significantly simpler.  Instead of that rather
 complex and fragile logic in InitPlan to try to locate all the
 ModifyTable nodes and their CTEScan parents, we could just have
 ModifyTable nodes add themselves to a list in the EState during
 ExecInitNode.  Then ExecutorEnd just traverses that list.

 However, the real reason for doing it isn't any of those, but rather
 to establish the principle that the executions of the modifying
 sub-queries are interleaved not sequential.  We're never going to be
 able to do any significant optimization of such queries if we have to
 preserve the behavior that the sub-queries execute sequentially.
 And I think it's inevitable that users will manage to build such an
 assumption into their queries if the first release with the feature
 behaves that way.

 Comments?

I completely agree.  Actually, I thought we had already agreed on the
design you just proposed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE: why not finish sub-updates at the end, not the beginning?

2011-02-25 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 Given that we've decided to run the modifying sub-queries all with
 the same command counter ID, they are logically executing in
 parallel.
 
 Just run the main plan and let it pull tuples from the CTEs as
 needed.
 
On the face of it, that sounds like it has another benefit you
didn't mention -- it sounds like it's much more conducive to
allowing parallel processing, if (when?) we eventually move in that
direction.  It might even be a good case for an initial, limited
implementation.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE: why not finish sub-updates at the end, not the beginning?

2011-02-25 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 Given that we've decided to run the modifying sub-queries all with
 the same command counter ID, they are logically executing in
 parallel.
 
 Just run the main plan and let it pull tuples from the CTEs as
 needed.
 
 On the face of it, that sounds like it has another benefit you
 didn't mention -- it sounds like it's much more conducive to
 allowing parallel processing, if (when?) we eventually move in that
 direction.  It might even be a good case for an initial, limited
 implementation.

Yeah.  Most of the executor is in principle parallelizable at the
plan-node level (ignoring the obvious and severe implementation
problems with parallelizing *anything* in the backend).  It's not
good for wCTE to be creating a user-visible assumption that certain
things will happen in a predefined order.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE: why not finish sub-updates at the end, not the beginning?

2011-02-25 Thread Marko Tiikkaja

On 2011-02-25 4:58 PM, Tom Lane wrote:

Specifically, I'm imagining getting rid of the patch's additions to
InitPlan and ExecutePlan that find all the modifying sub-queries and
force them to be cycled to completion before the main plan runs.
Just run the main plan and let it pull tuples from the CTEs as needed.
Then, in ExecutorEnd, cycle any unfinished ModifyTable nodes to
completion before shutting down the plan.  (In the event of an error,
we'd never get to ExecutorEnd, but it doesn't matter since whatever
updates we did apply are nullified anyhow.)


This idea has actually been discussed before when we talked about 
optimizing wCTEs, but IIRC you said that doing this in ExecutorEnd is a 
bit ugly.  But if you can write this idea down in a way that makes you 
happy with the implementation, I think it's a huge benefit and we should 
definitely do it.



This has a number of immediate and future implementation benefits:



3. The code could be significantly simpler.  Instead of that rather
complex and fragile logic in InitPlan to try to locate all the
ModifyTable nodes and their CTEScan parents, we could just have
ModifyTable nodes add themselves to a list in the EState during
ExecInitNode.  Then ExecutorEnd just traverses that list.


Sounds good to me.


However, the real reason for doing it isn't any of those, but rather
to establish the principle that the executions of the modifying
sub-queries are interleaved not sequential.  We're never going to be
able to do any significant optimization of such queries if we have to
preserve the behavior that the sub-queries execute sequentially.
And I think it's inevitable that users will manage to build such an
assumption into their queries if the first release with the feature
behaves that way.


Yeah, you might be right.


Regards,
Marko Tiikkaja

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE: why not finish sub-updates at the end, not the beginning?

2011-02-25 Thread Greg Stark
On Fri, Feb 25, 2011 at 2:58 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 However, the real reason for doing it isn't any of those, but rather
 to establish the principle that the executions of the modifying
 sub-queries are interleaved not sequential.  We're never going to be
 able to do any significant optimization of such queries if we have to
 preserve the behavior that the sub-queries execute sequentially.
 And I think it's inevitable that users will manage to build such an
 assumption into their queries if the first release with the feature
 behaves that way.

Does the interleaved execution have sane semantics?

With a query like:

WITH
  a as update x set x.i=x.i+1 returning x.i,
  b as update x set x.i=x.i+1 returning x.i
select * from a natural join b;

Is there any way to tell what it will return or what state it will
leave the table in?

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE: why not finish sub-updates at the end, not the beginning?

2011-02-25 Thread Merlin Moncure
On Fri, Feb 25, 2011 at 11:31 AM, Greg Stark gsst...@mit.edu wrote:
 On Fri, Feb 25, 2011 at 2:58 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 However, the real reason for doing it isn't any of those, but rather
 to establish the principle that the executions of the modifying
 sub-queries are interleaved not sequential.  We're never going to be
 able to do any significant optimization of such queries if we have to
 preserve the behavior that the sub-queries execute sequentially.
 And I think it's inevitable that users will manage to build such an
 assumption into their queries if the first release with the feature
 behaves that way.

 Does the interleaved execution have sane semantics?

 With a query like:

 WITH
  a as update x set x.i=x.i+1 returning x.i,
  b as update x set x.i=x.i+1 returning x.i
 select * from a natural join b;

 Is there any way to tell what it will return or what state it will
 leave the table in?

WITH
  a as update x set x.i=x.i+1 returning x.i,
  b as update x set x.i=x.i+1 where x.i = 1 returning x.i
 select * from a natural join b;

or the above if x is.i is 1 for all x on query start?

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE: why not finish sub-updates at the end, not the beginning?

2011-02-25 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 On Fri, Feb 25, 2011 at 2:58 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 However, the real reason for doing it isn't any of those, but rather
 to establish the principle that the executions of the modifying
 sub-queries are interleaved not sequential.

 Does the interleaved execution have sane semantics?

Depends on what you call sane.  With the decision to not increment
command counter, it's already the case that people shouldn't have
two subqueries try to modify the same row.

 With a query like:

 WITH
   a as update x set x.i=x.i+1 returning x.i,
   b as update x set x.i=x.i+1 returning x.i
 select * from a natural join b;

 Is there any way to tell what it will return or what state it will
 leave the table in?

My reaction to that is you shouldn't do that, and you definitely
shouldn't complain if it's not predictable whether a or b will
modify a given row.  This is exactly the sort of assumption I don't
want people building into their queries, because we will be locked
into purely sequential execution if we promise that the results will
be consistent.

There is already precedent for that position.  You can easily construct
queries using UPDATE ... FROM wherein the same target row joins to more
than one row in the FROM table, and then it's unpredictable which
joining row will be used to update that target row.  Our position has
always been don't do that, not that we'd lobotomize the planner and
executor to ensure predictability.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE: why not finish sub-updates at the end, not the beginning?

2011-02-25 Thread David Fetter
On Fri, Feb 25, 2011 at 09:58:36AM -0500, Tom Lane wrote:
 I had what seems to me a remarkably good idea, though maybe someone else
 can spot a problem with it.  Given that we've decided to run the
 modifying sub-queries all with the same command counter ID, they are
 logically executing in parallel.  The current implementation takes no
 advantage of that fact, though: it's based around the idea of running
 the updates strictly sequentially.  I think we should change it so that
 the updates happen physically, not only logically, concurrently.
 Specifically, I'm imagining getting rid of the patch's additions to
 InitPlan and ExecutePlan that find all the modifying sub-queries and
 force them to be cycled to completion before the main plan runs.
 Just run the main plan and let it pull tuples from the CTEs as needed.
 Then, in ExecutorEnd, cycle any unfinished ModifyTable nodes to
 completion before shutting down the plan.  (In the event of an error,
 we'd never get to ExecutorEnd, but it doesn't matter since whatever
 updates we did apply are nullified anyhow.)

What's the effect, if any, on CTEs that depend on each other
explicitly?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE: why not finish sub-updates at the end, not the beginning?

2011-02-25 Thread Tom Lane
David Fetter da...@fetter.org writes:
 What's the effect, if any, on CTEs that depend on each other
 explicitly?

An error.  That would require mutual recursion, which we don't
support for the SELECT case let alone data-modifying statements.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE: why not finish sub-updates at the end, not the beginning?

2011-02-25 Thread David Fetter
On Fri, Feb 25, 2011 at 10:12:02PM -0500, Tom Lane wrote:
 David Fetter da...@fetter.org writes:
  What's the effect, if any, on CTEs that depend on each other
  explicitly?
 
 An error.  That would require mutual recursion, which we don't
 support for the SELECT case let alone data-modifying statements.

Sorry that was unclear.  Let's imagine there's a DELETE ... RETURNING
in one WITH, and an UPDATE in another that depends on that one.  Is
that still allowed?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE: why not finish sub-updates at the end, not the beginning?

2011-02-25 Thread Tom Lane
David Fetter da...@fetter.org writes:
 Sorry that was unclear.  Let's imagine there's a DELETE ... RETURNING
 in one WITH, and an UPDATE in another that depends on that one.  Is
 that still allowed?

Yeah it is, although I just noticed that there's a bug in the new
implementation:

with t1 as (insert into x select ... returning *),
 t2 as (insert into y select * from t1 returning *)
select 1;

This should result in the same rows inserted into both x and y, but in
git HEAD it fails to insert anything into y.  The reason is that the
ExecutorEnd scan first processes the ModifyTable node for x, and cycles
it to completion, discarding the results --- but we needed the CteScan
in t2 to see those rows.  There's a related case in the regression
tests, but it works because the outer query does fetch from both WITH
clauses, so there's no need to do anything at ExecutorEnd time.

The first solution that comes to mind is to pay attention to the
interdependencies of the CTEs, and perform the cleanup in an appropriate
order (here, the ModifyTable for y needs to be cycled first).  I'm not
sure if there's a nicer way.  We'll eventually want some interdependency
tracking for CTEs anyway, if we're ever to support mutual recursion,
so it'd not be completely single-purpose code.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE: why not finish sub-updates at the end, not the beginning?

2011-02-25 Thread Tom Lane
I wrote:
 The first solution that comes to mind is to pay attention to the
 interdependencies of the CTEs, and perform the cleanup in an appropriate
 order (here, the ModifyTable for y needs to be cycled first).

Doh ... actually, we already *are* ordering the CTEs in dependency
order, so it's a one-liner fix to do the shutdowns in reverse order.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE: why not finish sub-updates at the end, not the beginning?

2011-02-25 Thread Tom Lane
Marko Tiikkaja marko.tiikk...@cs.helsinki.fi writes:
 On 2011-02-25 4:58 PM, Tom Lane wrote:
 Specifically, I'm imagining getting rid of the patch's additions to
 InitPlan and ExecutePlan that find all the modifying sub-queries and
 force them to be cycled to completion before the main plan runs.
 Just run the main plan and let it pull tuples from the CTEs as needed.
 Then, in ExecutorEnd, cycle any unfinished ModifyTable nodes to
 completion before shutting down the plan.

 This idea has actually been discussed before when we talked about 
 optimizing wCTEs, but IIRC you said that doing this in ExecutorEnd is a 
 bit ugly.

Further experimentation has reminded me of why I didn't want to put such
processing in ExecutorEnd :-(.  There are some nasty interactions with
EXPLAIN:

1. EXPLAIN ANALYZE fails to include the execution cycles associated with
running the ModifyTable nodes to completion.  In the worst case, such as
WITH t AS (INSERT ...) SELECT 1, it will claim the INSERT subplan is
never executed, even though rows certainly got inserted.  This is
because EXPLAIN extracts all the counts from the execution state tree
before shutting it down with ExecutorEnd.

2. But it gets worse.  Try the same query *without* ANALYZE.  You'll
find the INSERT executes anyway!  That's because EXPLAIN still calls
ExecutorEnd to clean up the execution state tree, and ExecutorEnd
doesn't realize it's not supposed to run any of the plan.

So we really need some refactoring here.  I dislike adding another
fundamental step to the ExecutorStart/ExecutorRun/ExecutorEnd sequence,
but there may not be a better way.  The only way I see to fix this
without changing that API is to have ExecutorRun do the cleanup
processing just after the top plan node returns a null tuple, and that
seems a bit ugly as well.

Thoughts?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers