On Sat, Nov 5, 2016 at 2:16 AM, Tom Lane <t...@sss.pgh.pa.us> wrote: > Ashutosh Bapat <ashutosh.ba...@enterprisedb.com> writes: >> There's code in add_paths_to_joinrel() which computes the set of >> target relations that should overlap parameterization of any proposed >> join path. >> ... >> The calculations that follow are based on joinrel->relids (baserels >> covered by the join) and SpecialJoinInfo list in PlannerInfo. It is >> not based on specific combination of relations being joined or the >> paths being generated. We should probably do this computation once and >> store the result in the joinrel and use it multiple times. That way we >> can avoid computing the same set again and again for every pair of >> joining relations and their order. Any reasons why we don't do this? > > I'm not terribly excited about this. The issue is strictly local to > add_paths_to_joinrel, but putting that set in a global data structure > makes it nonlocal, and makes it that much harder to tweak the algorithm > if we think of a better way. (In particular, I think it's not all that > obvious that the set must be independent of which two subset relations > we are currently joining.)
Right now it appears that for every subset of relations, we have different param_source_rels, which is clearly not. It takes a bit of time to understand that. Adding it to a global data structure will at least make the current implementation clear i.e param_source_rels does not change with subset of relations being joined. > > If you can show a measurable performance improvement from this change, > then maybe those downsides are acceptable. But I do not think we should > commit it without a demonstrated performance benefit from the added > complexity and loss of flexibility. I couldn't find a measurable time difference with or without my patch, so multiple computations of param_source_rels aren't taking noticeable time. I used following queries to measure the planning time through explain analyze. create view pc_view as select c1.oid c1o, c2.oid c2o, c3.oid c3o from pg_class c1, pg_class c2 left join pg_class c3 on (c2.oid = c3.oid) where c1.oid = c2.oid and c1.oid = c3.oid and c1.relname = c3.relname; select v1, v2, v3 from pc_view v1, pc_view v2 left join pc_view v3 on (v2.c3o = v3.c1o), pc_view v4 where v1.c3o = v2.c2o and v1.c2o = v4.c3o limit 0; > >> Also, the way this code has been written, the declaration of variable >> sjinfo masks the earlier declaration with the same name. I am not sure >> if that's intentional, but may be we should use another variable name >> for the inner sjinfo. I have not included that change in the patch. > > Hmm, yeah, that's probably not terribly good coding practice. Attached a patch to fix this. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c index cc7384f..8815789 100644 --- a/src/backend/optimizer/path/joinpath.c +++ b/src/backend/optimizer/path/joinpath.c @@ -124,42 +124,42 @@ add_paths_to_joinrel(PlannerInfo *root, * is a join order restriction that prevents joining one of our input rels * directly to the parameter source rel instead of joining to the other * input rel. (But see allow_star_schema_join().) This restriction * reduces the number of parameterized paths we have to deal with at * higher join levels, without compromising the quality of the resulting * plan. We express the restriction as a Relids set that must overlap the * parameterization of any proposed join path. */ foreach(lc, root->join_info_list) { - SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) lfirst(lc); + SpecialJoinInfo *lc_sjinfo = (SpecialJoinInfo *) lfirst(lc); /* * SJ is relevant to this join if we have some part of its RHS * (possibly not all of it), and haven't yet joined to its LHS. (This * test is pretty simplistic, but should be sufficient considering the * join has already been proven legal.) If the SJ is relevant, it * presents constraints for joining to anything not in its RHS. */ - if (bms_overlap(joinrel->relids, sjinfo->min_righthand) && - !bms_overlap(joinrel->relids, sjinfo->min_lefthand)) + if (bms_overlap(joinrel->relids, lc_sjinfo->min_righthand) && + !bms_overlap(joinrel->relids, lc_sjinfo->min_lefthand)) extra.param_source_rels = bms_join(extra.param_source_rels, bms_difference(root->all_baserels, - sjinfo->min_righthand)); + lc_sjinfo->min_righthand)); /* full joins constrain both sides symmetrically */ - if (sjinfo->jointype == JOIN_FULL && - bms_overlap(joinrel->relids, sjinfo->min_lefthand) && - !bms_overlap(joinrel->relids, sjinfo->min_righthand)) + if (lc_sjinfo->jointype == JOIN_FULL && + bms_overlap(joinrel->relids, lc_sjinfo->min_lefthand) && + !bms_overlap(joinrel->relids, lc_sjinfo->min_righthand)) extra.param_source_rels = bms_join(extra.param_source_rels, bms_difference(root->all_baserels, - sjinfo->min_lefthand)); + lc_sjinfo->min_lefthand)); } /* * However, when a LATERAL subquery is involved, there will simply not be * any paths for the joinrel that aren't parameterized by whatever the * subquery is parameterized by, unless its parameterization is resolved * within the joinrel. So we might as well allow additional dependencies * on whatever residual lateral dependencies the joinrel will have. */ extra.param_source_rels = bms_add_members(extra.param_source_rels,
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers