Hello I did some tests and It looks so it allows only some form of nested loop.
postgres=# explain (analyze, timing off, buffers) update a1 set b = (select b from a2 where a1.a = a2.a); QUERY PLAN --------------------------------------------------------------------------------------------------------------- Update on a1 (cost=0.00..8456925.00 rows=1000000 width=10) (actual rows=0 loops=1) Buffers: shared hit=9017134 read=14376 dirtied=58170 written=1014 -> Seq Scan on a1 (cost=0.00..8456925.00 rows=1000000 width=10) (actual rows=1000000 loops=1) Buffers: shared hit=4005465 read=4424 written=971 SubPlan 1 -> Index Scan using a2_a_idx on a2 (cost=0.42..8.44 rows=1 width=4) (actual rows=1 loops=1000000) Index Cond: (a1.a = a) Buffers: shared hit=4005464 Planning time: 0.212 ms Execution time: 30114.101 ms (10 rows) do you plan some sophisticated mechanism - like MERGE or some similar? Regards Pavel 2014-06-16 17:17 GMT+02:00 Tom Lane <t...@sss.pgh.pa.us>: > Attached is a very-much-WIP patch for supporting > UPDATE foo SET ..., (a,b,c) = (select x,y,z from ...), ... > > It lacks documentation, ruleutils.c support, or any solution for the > rule NEW.* expansion problem I mentioned Saturday. The reason I'm > posting it now is to get feedback about an implementation choice that > feels a bit klugy to me; but I don't see any clearly better idea. > > The post-parse-analysis representation I've chosen here is that the > output columns of the sub-select are represented by PARAM_MULTIEXEC > Params, and the sub-select itself appears in a resjunk entry at the end > of the targetlist; that is, the UPDATE tlist is more or less like > > $1, -- to be assigned to a > $2, -- to be assigned to b > $3, -- to be assigned to c > (select x,y,z from ...), -- resjunk entry, value will be discarded > > If the sub-select is uncorrelated with the outer query, the planner > turns it into an InitPlan, replacing the resjunk tlist item with a > NULL constant, and then everything happens normally at execution. > > But more usually, the sub-select will be correlated with the outer > query. In this case, the subquery turns into a SubPlan tree that > stays in the resjunk item. At the start of execution, the ParamExecData > structs for each of its output Params are marked with execPlan pointers > to the subplan, just as would happen for an InitPlan. This causes the > subplan to get executed when the first of the output Params is evaluated; > it loads the ParamExecData structs for all its output Params, and then > the later Params just take data from the structs. When execution reaches > the MULTIEXEC SubPlan in the resjunk tlist item, no evaluation of the > subplan is needed; but instead we re-mark all the output ParamExecData > structs with non-null execPlan pointers, so that a fresh execution of > the subplan will happen in the next evaluation of the targetlist. > > The klugy aspect of this is that it assumes that the SubPlan item will > appear later in the tlist than all the Params referencing it. This is > true at the moment because resjunk tlist items always appear after > non-resjunk ones. There are a few other places that already depend on > this ordering, but we've mostly tried to avoid introducing new > dependencies on it. > > The alternative that I'd originally had in mind, before put-it-in-a- > resjunk-item occurred to me, was to invent a new "secondary tlist" > field of Query and of ModifyTable plan nodes, as I sketched back in > http://www.postgresql.org/message-id/1783.1399054...@sss.pgh.pa.us > We'd put the MULTIEXEC SubPlans in this secondary tlist and expect > the executor to evaluate it just before evaluating the main tlist. > However, that way isn't terribly pretty either, because it extends > knowledge of this feature to a *whole lot* of places that don't have > to know about it in the attached approach; in particular, just about > every place that manipulates targetlist contents would have to also > manipulate the secondary tlist. > > Another approach is to explicitly identify which of the Params will > be evaluated first and replace it with a node tree that evaluates > the subplan (and sets output Params for the remaining columns). > This is a bit messy because the first-to-be-evaluated is dependent > on the targetlist reordering that the planner does; so we don't want > parse analysis to try to do it. (If we allowed parse analysis to > know that the planner will sort the tlist items into physical column > order, we could do it like that; but then it would break if we ever > get around to allowing ALTER TABLE to change the physical order.) > We could safely have setrefs.c determine the first-to-be-evaluated > Param, though, since it will traverse the tlist in final order. > So if we went with this approach I'd have setrefs.c replace the first > Param with a SubPlan node. That seems a bit of a kluge too though. > > Preferences, comments, better ideas? > > regards, tom lane > > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > >