On Wed, Jan 6, 2021 at 7:39 PM Antonin Houska <[email protected]> wrote:
>
>
> @@ -252,6 +252,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
> PlannerGlobal *glob = root->glob;
> int rtoffset = list_length(glob->finalrtable);
> ListCell *lc;
> + Plan *finalPlan;
>
> /*
> * Add all the query's RTEs to the flattened rangetable. The live
> ones
> @@ -302,7 +303,17 @@ set_plan_references(PlannerInfo *root, Plan *plan)
> }
>
> /* Now fix the Plan tree */
> - return set_plan_refs(root, plan, rtoffset);
> + finalPlan = set_plan_refs(root, plan, rtoffset);
> + if (finalPlan != NULL && IsA(finalPlan, Gather))
> + {
> + Plan *subplan = outerPlan(finalPlan);
> +
> + if (IsA(subplan, ModifyTable) && castNode(ModifyTable,
> subplan)->returningLists != NULL)
> + {
> + finalPlan->targetlist =
> copyObject(subplan->targetlist);
> + }
> + }
> + return finalPlan;
> }
>
> I'm not sure if the problem of missing targetlist should be handled here (BTW,
> NIL is the constant for an empty list, not NULL). Obviously this is a
> consequence of the fact that the ModifyTable node has no regular targetlist.
>
> Actually I don't quite understand why (in the current master branch) the
> targetlist initialized in set_plan_refs()
>
> /*
> * Set up the visible plan targetlist as being the same as
> * the first RETURNING list. This is for the use of
> * EXPLAIN; the executor won't pay any attention to the
> * targetlist. We postpone this step until here so that
> * we don't have to do set_returning_clause_references()
> * twice on identical targetlists.
> */
> splan->plan.targetlist = copyObject(linitial(newRL));
>
> is not used. Instead, ExecInitModifyTable() picks the first returning list
> again:
>
> /*
> * Initialize result tuple slot and assign its rowtype using the first
> * RETURNING list. We assume the rest will look the same.
> */
> mtstate->ps.plan->targetlist = (List *)
> linitial(node->returningLists);
>
> So if you set the targetlist in create_modifytable_plan() (according to
> best_path->returningLists), or even in create_modifytable_path(), and ensure
> that it gets propagated to the Gather node (generate_gather_pahs currently
> uses rel->reltarget), then you should no longer need to tweak
> setrefs.c. Moreover, ExecInitModifyTable() would no longer need to set the
> targetlist. However I don't guarantee that this is the best approach - some
> planner expert should speak up.
>
I've had a bit closer look at this particular issue.
I can see what you mean about the ModifyTable targetlist (that is set
in set_plan_refs()) getting overwritten by ExecInitModifyTable() -
which also contradicts the comment in set_plan_refs() that claims the
targetlist being set is used by EXPLAIN (which it is not). So the
current Postgres master branch does seem to be broken in that respect.
I did try your suggestion (and also remove my tweak), but I found that
in the T_Gather case of set_plan_refs() it does some processing (see
set_upper_references()) of the current Gather targetlist and subplan's
targetlist (and will then overwrite the Gather targetlist after that),
but in doing that processing it produces the error:
ERROR: variable not found in subplan target list
I think that one of the fundamental problems is that, up to now,
ModifyTable has always been the top node in a (non-parallel) plan, but
now for Parallel INSERT we have a Gather node with ModifyTable in its
subplan. So the expected order of processing and node configuration
has changed.
For the moment (until perhaps a planner expert DOES speak up) I've
parked my temporary "fix" at the bottom of set_plan_refs(), simply
pointing the Gather node's targetlist to subplan's ModifyTable
targetlist.
if (nodeTag(plan) == T_Gather)
{
Plan *subplan = plan->lefttree;
if (IsA(subplan, ModifyTable) &&
castNode(ModifyTable, subplan)->returningLists != NIL)
{
plan->targetlist = subplan->targetlist;
}
}
Regards,
Greg Nancarrow
Fujitsu Australia