On Wed, Mar 25, 2026 at 7:59 PM Lukas Fittl <[email protected]> wrote:
> > 0001 adds a disabled_nodes field to SubPlan, to fix the bug that I
> > identified in the email to which this is a reply.
>
> This looks good, and is consistent with how it would have worked
> before the introduction of disabled nodes (since costs would have just
> been very high and thus discourage a subplan with many disabled
> nodes).
>
> Only nit is that the commit hash reference in the commit message
> doesn't seem right, I think you probably meant
> e22253467942fdb100087787c3e1e3a8620c54b2

Whoops. Obviously got the wrong thing stuck in my cut and paste buffer
when I was writing that. Thanks for checking it. I'm going to go ahead
and commit this: I'm fairly confident that it's correct, the rest of
these patches are not going to fix the buildfarm instability without
it, and I suspect multiple committers are pretty tired of seeing these
test_plan_advice failures already.

> For 0002:
>
> I think that overall looks like a good refactoring, with two minor notes:
>
> I'd word that as "Get or create the pgpa_planner_info for the given
> PlannerInfo and its associated plan_name", since you're not passing a
> plan_name as the argument.

Right, the comment isn't quite correct. I don't think your rewording
is quite right either, though, because there's really no reason to
mention plan_name here at all. I'll adjust it.

> > @@ -2053,19 +2015,34 @@ pgpa_ri_checker_save(pgpa_planner_state *pps, PlannerInfo *root,
> >                      RelOptInfo *rel)
> > {
> > ...
> > +   if (proot->rid_array_size <= rel->relid)
> > +   {
> > +           int                     new_size = Max(proot->rid_array_size, 8);
> > +
> > +           while (new_size < rel->relid)
> > +                   new_size *= 2;
>
> This could use pg_nextpower2_32 on the rel->relid instead of the
> manual while loop.

OK, will change that also, and then also commit this one.

> For 0003:
>
> I wonder if "original_root" wouldn't be more correct here as a name
> (instead of "alternative_root"), since if I follow the implementation
> correctly, you are adding a pointer on each alternative root, back to
> the original root that the alternative was copied from.

To me, that seems less clear. Someone might think that the original
root means the toplevel PlannerInfo, or that whatever PlannerInfo we
had around when we created the current PlannerInfo will be the
original_root. But in fact we are only using this in a much more
narrow situation, namely, when we're creating a new PlannerInfo as a
way to consider an alternative implementation of the same portion of
the query. That is, I think alternative conveys a sibling
relationship, and original doesn't necessarily do so.

> I also wonder if maybe we should be more narrow in what we keep here.
> It seems 0004 mainly needs the original plan name, so maybe it's better
> if we just keep that for targeting purposes, vs a full pointer to the
> PlannerInfo. The planner makes an effort to zap unused subplans at the
> end of set_plan_references, and I think this new field would then be
> the only pointer to those unused subplans. If we decided to add an
> early free there at some point (instead of just making them a NULL
> entry in the list), that'd break pg_plan_advice.

The dangling pointers are a good point; I agree that's bad. However,
I'd be more inclined to fix it by nulling out the alternative_root
pointers at the end of set_plan_references. I think that would just be
the case where root->isAltSubplan[ndx] && !root->isUsedSubplan[ndx].
The reason I'm reluctant to just store the name is that there's not an
easy way to find a PlannerInfo by name. I originally proposed an
"allroots" list in PlannerGlobal, but we went with subplanNames on
Tom's suggestion. I subsequently realized that this kind of stinks for
code that is trying to use this infrastructure for anything, for
exactly this reason, but Tom never responded and I never pressed the
issue. But I think we're boxing ourselves into a corner if we just
keep storing names that can't be looked up everywhere. It doesn't
matter for the issue before us, so maybe doing as you say here is the
right idea just so we can move forward, but I think we're probably
kidding ourselves a little bit.

> From a design perspective, I'm -1 on storing full query text
> strings in shared memory when using the shared collector mode. With large
> query texts and without an aggregate MB size limit that's an
> expressway into OOM land, even if you used a low value like 100
> entries max, because ORMs are just really good at creating large query
> texts unexpectedly. I'm also skeptical whether that's a good idea for
> the local collection mode, but it'd be less problematic there.
> Overall, I think this needs to rely on queryid instead and not store
> query texts. I would not make queryid optional, but instead enable it
> automatically - which fits together with pg_stash_advice taking it as
> input.
>
> I realize not having query texts reduces its effectiveness (since you
> don't see which parameters produced which plan advice), but it still
> helps surface which different advice strings were seen for which
> query IDs, letting you identify if you're getting a mix of bad and
> good plans. And I'm just really worried people will enable this on
> production in shared collection mode and take down their system.

I fully admit that pg_collect_advice is crude, but I don't think
ripping out some portion of the limited functionality that it has is
going to get us where we want to be. If it hadn't collected the query
strings, it would have been useless for the purpose for which I
originally wrote it. We could add a GUC for a length limit, perhaps,
but I think the real feature this needs, to be usable in the way that
you seem to want to use it, is deduplication, and as I said earlier, I
think we should consider adding the advice collection logic to
pg_stat_statements rather than building an alternative version of that
module with overlapping functionality.

If you think this is a sufficiently large foot-gun that we shouldn't
ship it, then I'll just drop this patch for this cycle and we can
revisit what to do for next cycle. It's not critical, and with more
time we can talk through possible approaches and have time to code
something up in a responsible way. There's just no time to make big
design changes right now. If some small adjustment (like adding a GUC
to limit the length of the query string we're willing to collect)
elevates it to acceptability then I'm happy to do that, but otherwise
we should just revisit the topic for next cycle.

> From a design perspective, I'm worried about the fact that we lose the
> stashed advice on a restart. e.g. imagine a DBA using pg_stash_advice
> to pin a query that sometimes picks a bad plan to the good plan, but
> then their cloud provider applies a security update overnight.
> Suddenly the database is slow because the bad plans are being picked
> again.
>
> pg_hint_plan's solution to this (the "hints" table [0]) uses an actual
> table managed by the extension - but I suspect that doesn't fit the
> picture, since it'd be per database, etc. It does have the benefit of
> being restart safe though, and being copied to replicas.
>
> I wonder if we could find a way to dump and restore the advice stash
> information via a file, so it's at least crash and restart safe?

I think if we want to ship this extension in this release, then the
only alternative is to tell users they have to put in place a manual
process for this. That is not great, but the original version of
pg_prewarm had the exact same issue, and I don't think anyone thought
that made it useless. Indeed, pgfincore still has nothing comparable
to autoprewarm, and I'm not saying that as a way of throwing shade on
pgfincore.

What I'm concerned about with this module is completely different: I'm
wondering whether the approach it takes to using DSA is OK, whether
the security model is adequate, and that sort of thing. Expanding the
scope is 100% off the table in my book. Feature freeze is in less than
two weeks.

-- 
Robert Haas
EDB: http://www.enterprisedb.com