On Fri, Apr 11, 2025 at 3:51 PM Richard Guo <guofengli...@gmail.com> wrote: > On Fri, Apr 11, 2025 at 4:45 AM Robert Haas <robertmh...@gmail.com> wrote: > > OK. Maybe I shouldn't be worrying about the table_open() / > > table_close() here, because I see that you are right that > > has_subclass() is nearby, which admittedly does not involve opening > > the relation, but it does involve fetching from the syscache a tuple > > that we wouldn't need to fetch if we had a Relation available at that > > point. And I see now that expand_virtual_generated_columns() is also > > in that area and works similar to your proposed function in that it > > just opens and closes all the relations. Perhaps that's just the way > > we do things at this very early stage of the planner? But I feel like > > it might make sense to do some reorganization of this code, though, so > > that it more resembles the phase 1 and phase 2 as you describe them. > > Both expand_virtual_generated_columns() and collect_relation_attrs() > > care about exactly those relations with rte->rtekind == RTE_RELATION, > > and even if we have to open and close all of those relations once to > > do this processing, perhaps we can avoid doing it twice, and maybe > > avoid the has_subclass() call along the way? > > Yeah, I agree on this. This is the optimization I mentioned upthread > in the last paragraph of [1] - retrieving the small portion of catalog > information needed at an early stage in one place instead of three. > Hopefully, this way we only need one table_open/table_close at the > early stage of the planner.
Here is the patchset that implements this optimization. 0001 moves the expansion of virtual generated columns to occur before sublink pull-up. 0002 introduces a new function, preprocess_relation_rtes, which scans the rangetable for relation RTEs and performs inh flag updates and virtual generated column expansion in a single loop, so that only one table_open/table_close call is required for each relation. 0003 collects NOT NULL attribute information for each relation within the same loop, stores it in a relation OID based hash table, and uses this information to reduce NullTest quals during constant folding. I think the code now more closely resembles the phase 1 and phase 2 described earlier: it collects all required early-stage catalog information within a single loop over the rangetable, allowing each relation to be opened and closed only once. It also avoids the has_subclass() call along the way. Thanks Richard
v4-0001-Expand-virtual-generated-columns-before-sublink-p.patch
Description: Binary data
v4-0002-Centralize-collection-of-catalog-info-needed-earl.patch
Description: Binary data
v4-0003-Reduce-Var-IS-NOT-NULL-quals-during-constant-fold.patch
Description: Binary data