On Wed, Mar 18, 2026 at 10:08 AM Robert Haas <[email protected]> wrote: > But this does raise two points which are perhaps worthy of some > further consideration: > > 1. Maybe we should limit the length of the detail message. In some > other cases, like reportDependentObjects, we limit the detail message > to the first 100 problems found, and then say at the end of that > detail message how many more problems there were afterwards. We could > do that here, too. I'm not 100% sure it's worth the code, but then > again it's not much code.
I think we have similar problems elsewhere in Postgres where a large user input causes an even larger log output - e.g. a case I'm familiar with are complicated queries with long IN list inputs and their associated EXPLAIN plans being logged by auto_explain - I recently had a case where someone reported an OOM due to auto_explain trying to log a > 100 MB sized query plan. I think its actually less a problem with plan advice, since presumably we won't have ORMs generating plan advice, and even if we do it'd be per-table - so I think its unlikely a genuine use case would use 1000s of advice tags. That said, I also don't think super long long messages are actually helpful. I do wonder if we should have a more coarse GUC that limits DETAIL lines of any kind to a maximum length (e.g. 100 kB) across the board instead of special casing every emitter. > 2. The way the current code works is to transform the advice feedback > into a Node-tree representation which is stashed in the PlannedStmt's > extension_state, and then also passes that Node-tree representation to > pgpa_planner_feedback_warning, which generates the actual warning. > That Node-tree representation is currently used by EXPLAIN to display > the advice feedback, which I think for most people will be a nicer > interface than enabling pg_plan_advice.feedback_warnings, and it can > also be used by other extensions. For instance, we could extend > pg_stash_advice so that it looks at the advice feedback and updates > the shared store, so that users can monitor whether their stashed > advice is doing what they hoped it would. Yeah, I think the ability for other extensions to retrieve this is pretty important - whether in pg_stash_advice, or any other kind of plan management extension that wants to know the outcome of applied advice. > > However, in a case like this, that Node tree is actually quite large: > about 16MB. I guess that's because pgpa_planner_append_feedback() has > to do multiple allocations for every item of feedback: a C string, > DefElem, an Integer, plus whatever lappend() charges us to add to a > List. We could save that memory by adding code here to optimize for > the case where we need to generate warnings but we don't need the Node > tree for anything else. I'm inclined not to do that, because (A) I > don't think temporarily using 16MB when the user specifies 10,000 > items of bogus advice is really that bad and (B) complicating the code > has its own costs, such as maybe introducing more bugs. However, maybe > someone else sees it differently. > > Another idea is to try to find a more economic Node representation. > For instance, we could jam the flags into the DefElem's location > field, instead of building a separate Integer node, or we could build > the advice feedback as a giant binary blob and wrap it in a varlena > and a Const node and leave consumers to make sense of that as best > they're able, or we could invent a new Node type that's just to the > perfect thing to hold a C string and an integer. I'm inclined to think > that the first two are too hacky and the third is too special-purpose, > but again someone else might see it otherwise. Yeah, I feel like we're definitely constrained here by the fact that advice tags are defined by a contrib module vs in-core - if they were in-core we could just add a dedicated node type for them. I don't think inventing a specialized binary format only defined in the contrib module makes sense. Two other ideas: 1) What if we return the utilized advice string as a separate DefElem with a list of strings, and then the feedback just has to reference an index into that list? (though I suppose that doesn't actually save memory, now that I think that through -- unless we assume the caller already has the advice string, but I don't think we can rely on that) 2) We could consider having separate DefElems for the different flag types (i.e. "feedback_failed", "feedback_match_full", etc), and then a list of strings attached to each - that'd save the nested DefElem and the Integer node Thanks, Lukas -- Lukas Fittl
