[HACKERS] wCTE: why not finish sub-updates at the end, not the beginning?

Tom Lane Fri, 25 Feb 2011 07:00:27 -0800

I had what seems to me a remarkably good idea, though maybe someone else
can spot a problem with it.  Given that we've decided to run the
modifying sub-queries all with the same command counter ID, they are
logically executing "in parallel".  The current implementation takes no
advantage of that fact, though: it's based around the idea of running
the updates strictly sequentially.  I think we should change it so that
the updates happen physically, not only logically, concurrently.
Specifically, I'm imagining getting rid of the patch's additions to
InitPlan and ExecutePlan that find all the modifying sub-queries and
force them to be cycled to completion before the main plan runs.
Just run the main plan and let it pull tuples from the CTEs as needed.
Then, in ExecutorEnd, cycle any unfinished ModifyTable nodes to
completion before shutting down the plan.  (In the event of an error,
we'd never get to ExecutorEnd, but it doesn't matter since whatever
updates we did apply are nullified anyhow.)


This has a number of immediate and future implementation benefits:

1. RETURNING tuples that aren't actually needed by the main plan
don't need to be buffered anywhere.  (ExecutorEnd would just pull
directly from the ModifyTable nodes, ignoring their parent CTE
nodes, in all cases.)

2. In principle, in many common cases the RETURNING tuples wouldn't have
to be buffered at all, but could be consumed on-the-fly.  I think that
right now the CTEScan nodes might still buffer the tuples so they can
regurgitate them in case of being rescanned, but it's not hard to see
how that could be improved later if it doesn't work immediately.

3. The code could be significantly simpler.  Instead of that rather
complex and fragile logic in InitPlan to try to locate all the
ModifyTable nodes and their CTEScan parents, we could just have
ModifyTable nodes add themselves to a list in the EState during
ExecInitNode.  Then ExecutorEnd just traverses that list.

However, the real reason for doing it isn't any of those, but rather
to establish the principle that the executions of the modifying
sub-queries are interleaved not sequential.  We're never going to be
able to do any significant optimization of such queries if we have to
preserve the behavior that the sub-queries execute sequentially.
And I think it's inevitable that users will manage to build such an
assumption into their queries if the first release with the feature
behaves that way.

Comments?

                        regards, tom lane

-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] wCTE: why not finish sub-updates at the end, not the beginning?

Reply via email to