On Wed, Mar 18, 2026 at 10:08 AM Robert Haas <[email protected]> wrote:
> But this does raise two points which are perhaps worthy of some
> further consideration:
>
> 1. Maybe we should limit the length of the detail message. In some
> other cases, like reportDependentObjects, we limit the detail message
> to the first 100 problems found, and then say at the end of that
> detail message how many more problems there were afterwards. We could
> do that here, too. I'm not 100% sure it's worth the code, but then
> again it's not much code.

I think we have similar problems elsewhere in Postgres where a large
user input causes an even larger log output - e.g. a case I'm familiar
with are complicated queries with long IN list inputs and their
associated EXPLAIN plans being logged by auto_explain - I recently had
a case where someone reported an OOM due to auto_explain trying to log
a > 100 MB sized query plan.

I think its actually less a problem with plan advice, since presumably
we won't have ORMs generating plan advice, and even if we do it'd be
per-table - so I think its unlikely a genuine use case would use 1000s
of advice tags.

That said, I also don't think super long long messages are actually
helpful. I do wonder if we should have a more coarse GUC that limits
DETAIL lines of any kind to a maximum length (e.g. 100 kB) across the
board instead of special casing every emitter.

> 2. The way the current code works is to transform the advice feedback
> into a Node-tree representation which is stashed in the PlannedStmt's
> extension_state, and then also passes that Node-tree representation to
> pgpa_planner_feedback_warning, which generates the actual warning.
> That Node-tree representation is currently used by EXPLAIN to display
> the advice feedback, which I think for most people will be a nicer
> interface than enabling pg_plan_advice.feedback_warnings, and it can
> also be used by other extensions. For instance, we could extend
> pg_stash_advice so that it looks at the advice feedback and updates
> the shared store, so that users can monitor whether their stashed
> advice is doing what they hoped it would.

Yeah, I think the ability for other extensions to retrieve this is
pretty important - whether in pg_stash_advice, or any other kind of
plan management extension that wants to know the outcome of applied
advice.

>
> However, in a case like this, that Node tree is actually quite large:
> about 16MB. I guess that's because pgpa_planner_append_feedback() has
> to do multiple allocations for every item of feedback: a C string,
> DefElem, an Integer, plus whatever lappend() charges us to add to a
> List. We could save that memory by adding code here to optimize for
> the case where we need to generate warnings but we don't need the Node
> tree for anything else. I'm inclined not to do that, because (A) I
> don't think temporarily using 16MB when the user specifies 10,000
> items of bogus advice is really that bad and (B) complicating the code
> has its own costs, such as maybe introducing more bugs. However, maybe
> someone else sees it differently.
>
> Another idea is to try to find a more economic Node representation.
> For instance, we could jam the flags into the DefElem's location
> field, instead of building a separate Integer node, or we could build
> the advice feedback as a giant binary blob and wrap it in a varlena
> and a Const node and leave consumers to make sense of that as best
> they're able, or we could invent a new Node type that's just to the
> perfect thing to hold a C string and an integer. I'm inclined to think
> that the first two are too hacky and the third is too special-purpose,
> but again someone else might see it otherwise.

Yeah, I feel like we're definitely constrained here by the fact that
advice tags are defined by a contrib module vs in-core - if they were
in-core we could just add a dedicated node type for them. I don't
think inventing a specialized binary format only defined in the
contrib module makes sense.

Two other ideas:

1) What if we return the utilized advice string as a separate DefElem
with a list of strings, and then the feedback just has to reference an
index into that list? (though I suppose that doesn't actually save
memory, now that I think that through -- unless we assume the caller
already has the advice string, but I don't think we can rely on that)

2) We could consider having separate DefElems for the different flag
types (i.e. "feedback_failed", "feedback_match_full", etc), and then a
list of strings attached to each - that'd save the nested DefElem and
the Integer node

Thanks,
Lukas

-- 
Lukas Fittl


Reply via email to