On Wed, Aug 23, 2023 at 11:07 AM Richard Guo <guofengli...@gmail.com> wrote: > > > On Tue, Aug 22, 2023 at 10:39 PM Tom Lane <t...@sss.pgh.pa.us> wrote: >> >> Richard Guo <guofengli...@gmail.com> writes: >> > I'm wondering if we can instead adjust the 'inner_req_outer' in >> > create_nestloop_path before we perform the check to work around this >> > issue for the back branches, something like >> > + /* >> > + * Adjust the parameterization information, which refers to the topmost >> > + * parent. >> > + */ >> > + inner_req_outer = >> > + adjust_child_relids_multilevel(root, inner_req_outer, >> > + outer_path->parent->relids, >> > + >> > outer_path->parent->top_parent_relids); >> >> Mmm ... at the very least you'd need to not do that when top_parent_relids >> is unset. > > > Hmm, adjust_child_relids_multilevel would just return the given relids > unchanged when top_parent_relids is unset, so I think it should be safe > to call it even for non-other rel. > >> >> ... But I think the risk/reward ratio for messing with this in the >> back branches is unattractive in any case: to fix a corner case that >> apparently nobody uses in the field, we risk breaking any number of >> mainstream parameterized-path cases. I'm content to commit the v5 patch >> (or a successor) into HEAD, but at this point I'm not sure I even want >> to risk it in v16, let alone perform delicate surgery to get it to work >> in older branches. I think we ought to go with the "tablesample scans >> can't be reparameterized" approach in v16 and before. > > > If we go with the "tablesample scans can't be reparameterized" approach > in the back branches, I'm a little concerned that what if we find more > cases in the futrue where we need modify RTEs for reparameterization. > So I spent some time seeking and have managed to find one: there might > be lateral references in a scan path's restriction clauses, and > currently reparameterize_path_by_child fails to adjust them. > > regression=# explain (verbose, costs off) > select count(*) from prt1 t1 left join lateral > (select t1.b as t1b, t2.* from prt2 t2) s > on t1.a = s.b where s.t1b = s.a; > QUERY PLAN > ---------------------------------------------------------------------- > Aggregate > Output: count(*) > -> Append > -> Nested Loop > -> Seq Scan on public.prt1_p1 t1_1 > Output: t1_1.a, t1_1.b > -> Index Scan using iprt2_p1_b on public.prt2_p1 t2_1 > Output: t2_1.b > Index Cond: (t2_1.b = t1_1.a) > Filter: (t1.b = t2_1.a) > -> Nested Loop > -> Seq Scan on public.prt1_p2 t1_2 > Output: t1_2.a, t1_2.b > -> Index Scan using iprt2_p2_b on public.prt2_p2 t2_2 > Output: t2_2.b > Index Cond: (t2_2.b = t1_2.a) > Filter: (t1.b = t2_2.a) > -> Nested Loop > -> Seq Scan on public.prt1_p3 t1_3 > Output: t1_3.a, t1_3.b > -> Index Scan using iprt2_p3_b on public.prt2_p3 t2_3 > Output: t2_3.b > Index Cond: (t2_3.b = t1_3.a) > Filter: (t1.b = t2_3.a) > (24 rows) > > The clauses in 'Filter' are not adjusted correctly. Var 't1.b' still > refers to the top parent. > > For this query it would just give wrong results. > > regression=# set enable_partitionwise_join to off; > SET > regression=# select count(*) from prt1 t1 left join lateral > (select t1.b as t1b, t2.* from prt2 t2) s > on t1.a = s.b where s.t1b = s.a; > count > ------- > 100 > (1 row) > > regression=# set enable_partitionwise_join to on; > SET > regression=# select count(*) from prt1 t1 left join lateral > (select t1.b as t1b, t2.* from prt2 t2) s > on t1.a = s.b where s.t1b = s.a; > count > ------- > 5 > (1 row) > > > For another query it would error out during planning. > > regression=# explain (verbose, costs off) > select count(*) from prt1 t1 left join lateral > (select t1.b as t1b, t2.* from prt2 t2) s > on t1.a = s.b where s.t1b = s.b; > ERROR: variable not found in subplan target list > > To fix it we may need to modify RelOptInfos for Path, BitmapHeapPath, > ForeignPath and CustomPath, and modify IndexOptInfos for IndexPath. It > seems that that is not easily done without postponing reparameterization > of paths until createplan.c.
Maybe I am missing something here but why aren't we copying these restrictinfos in the path somewhere? I have a similar question for tablesample clause as well. -- Best Wishes, Ashutosh Bapat