On Tue, May 20, 2025 at 2:45 PM Tomas Vondra <to...@vondra.me> wrote: > Thanks for the overview. I don't have any immediate feedback, but it > sounds like it might be related to the "making planner decisions clear" > session from the unconference ... > > The basic premise of that session was about how to give users better > info about the planner decisions - why paths were selected/rejected, > etc. A simple example would be "why was the index not used", and the > possible answers include "dominated by cost by another path" or "does > not match the index keys" etc. > > I wonder if this work might be useful for something like that.
I've been wondering that, too. There's definitely some indirect ways in which that might be the case. For example, I think this work would lend itself to saying "hey, try planning this query, but for that table over there, use an index scan on this table." Then, it either still doesn't -- meaning the index isn't usable for some reason -- or it does and you can see the resulting plan with presumably higher cost and maybe infer why it didn't happen. That's better than today, where we have only very crude tools that let us do things like disable an entire scan type for the entire query, and I think it would make it a lot easier and less frustrating for a knowledgeable user to figure out why things are happening. But even though I think that would be better than today, I'm not sure it rises to the level of actually being good, because I think it still requires a fairly knowledgeable operator to figure things out, and you probably have to experiment a bunch to understand the situation instead of, say, being able to just look at the EXPLAIN plan and see the answer. I think being able to look at the EXPLAIN plan and see the answer, without needing a bunch of poking around, would be the ideal scenario here. But in some sense this is the same problem as understanding how an AI neural network is reasoning. The answer to "why did the planner pick plan X" is always "X was the cheapest possible plan". Ideas like "we chose a merge join because both tables are large enough that neither would fit into a hash table conveniently" are human explanations of why the math had the effect that it did; they are not how the planner actually reasons. So it's not just a matter of exposing the actual reasoning process to the user, because the computer is not reasoning in a way that a human would. It would have to be a matter of exposing some kind of other information that would allow the human being to comprehend easily what led the machine's algorithm to a certain conclusion; and it is not obvious how to get there. I have a sense - possibly an incorrect one - that the core of the problem here is that the planner considers lots of very similar alternatives. A hypothetical feature that showed the second-cheapest plan would be all but useless, because the second-cheapest plan would just be a very minor variation of the cheapest plan in almost all cases. One idea that crossed my mind was to display information in EXPLAIN about what would have happened if we'd done something really different. For instance, suppose that at a certain level of the plan tree we actually chose a merge join, but we also show the estimated cost of the cheapest hash join (if any) and the cheapest nested loop (if any) that we considered at that level. The user might be able to draw useful conclusions based on whether those numbers were altogether absent (i.e. that join type was not viable at all) or whether the cost was a little higher or a lot higher than that of the path actually chosen. For scans, you could list which indexes were believed to be usable and perhaps what the cost would have been for the cheapest one not actually selected; and what the cost of a sequential scan would have been if you hadn't picked one. I'm not sure how useful this would be, so the whole idea might actually suck, or maybe it's sort of the right idea but needs a bunch of refinement to really be useful. I don't have a better idea right now, though. If there are any notes that were taken during that unconference session, please point me in the right direction; I was in another session at that time but would read any available notes with interest. -- Robert Haas EDB: http://www.enterprisedb.com