(2018/07/26 5:27), Robert Haas wrote:
On Tue, Jul 24, 2018 at 7:51 AM, Etsuro Fujita
<fujita.ets...@lab.ntt.co.jp> wrote:
Isn't that assumption fundamental to your whole approach?
I don't think so. What I mean here is: currently the subplan would be a
scan/join node, but in future we might have eg, a Sort node atop the
scan/join node, so it would be better to update the patch to handle such a
case as well.
But how would you do that?
What I had in mind was to insert a Rusult node with
inject_projection_plan and adjust the tlist of the Result, as done for
adding sort columns to a tlist in prepare_sort_from_pathkeys.
I think that's a bad idea. The target list affects lots
of things, like costing. If we don't insert a ConvertRowTypeExpr into
the child's target list, the costing will be wrong to the extent that
ConvertRowTypeExpr has any cost, which it does.
Actually, this is not true at least currently, because
set_append_rel_size
doesn't do anything about the costing:
Why would it? Append can't project, so the cost of any expressions
that appear in its target list is irrelevant. What is affected is the
cost of the scans below the Append -- see e.g. cost_seqscan(), which
uses the data produced by set_pathtarget_cost_width().
By set_rel_size()?
Sorry, I don't understand what you mean by this.
I think the data used by such a costing function is computed by
set_rel_size in set_append_rel_size, not set_pathtarget_cost_width; in
the case of a plain partition, for example, set_rel_size would call
set_plain_rel_size, and then set_plain_rel_size would eventually call
set_rel_width, which sets reltarget->cost, which I think would be used
by e.g., cost_seqscan. cost_qual_eval_node, which is called in
set_rel_width for computing the cost of ConvertRowTypeExpr, ignores that
expression, so currently, we don't charge any cost for it to the
partition's reltarget->cost, and to the cost of a scan below the Append.
I'm not sure that's a good idea, because I think we have a trade-off
relation; the more we make create_plan simple, the more we need to make
earlier states of the planner complicated.
And it looks to me like the partitionwise join code is making earlier (and
later) stages of the planner too complicated, to make create_plan simple.
I think that create_plan is *supposed* to be simple. Its purpose is
to prune away data that's only needed during planning and add things
that can be computed at the last minute which are needed at execution
time. Making it do anything else is, in my opinion, not good.
I agree on that point.
When considering paritionwise joining, it would make things complicated to
have a ConvertRowtypeExpr in a partrel's targetlist, because as discussed
upthread, it deviates from the planner's assumption that a rel's targetlist
would only include Vars and PHVs. So, I propose to include a child
whole-row Var in the targetlist instead, in which case, we need to fix the
Var after the fact, but can avoid making many other parts of the planner
complicated.
Well, I could have the wrong idea here, but I tend to think allowing
for ConvertRowTypeExpr elsewhere won't be that bad.
I still don't like that because in my opinion, changes needed for that
would not be localized, and that would make code complicated more than
necessary.
As I mentioned in a previous email, another idea to avoid that would be
to adjust tlists for children at path creation time, not plan creation
time; we could adjust the tlist for each of subpaths accumulated for an
Append/MergeAppend path in add_paths_to_append_rel called from
set_append_rel_pathlist or generate_partitionwise_join_paths, with
create_projection_path adding ConvertRowTypeExpr. It seems unlikely
that performing create_projection_path to such a subpath would change
its property of being the cheapest, so it would be safe to adjust the
tlists that way. This would not require making create_plan complicated
anymore. I might be missing something, though.
Best regards,
Etsuro Fujita