On 2026-06-12 7:26 PM, Tomas Vondra wrote:

On 6/12/26 02:20, Arne Roland wrote:
Hi Tomas,

thank you for checking it out!
[...]

I guess git-am gets confused in some way, because cfbot has the same
issue with this:

https://github.com/postgresql-cfbot/postgresql/actions/runs/27379594748/job/80912606962
I see. It's a rebase issue. The patch shouldn't apply cleanly on master, but somehow it does. My next version will be rebased on the current master again, but I want to work more of the feedback into the patch before posting a new version here. At least more detailed documentation and glossary, at best already some more finely grained split up.
[...]
6) Does it always have to be a "foreign key traversal"? Consider an
example like this:

      create table dim (id serial primary key);
      create table f (id serial primary key, did int);
      select * from f left join dim for key (id) <- f(did);

Currently this fails because of no matching foreign key constraint, but
isn't that pretty much the same thing as if the table actually had the
foreign key (at least considering the cardinality of the join - it won't
change it by adding/removing rows).

I realize that'd contradict the "FOR KEY" part of this patch, but it's
also one of the things that might be beneficial for the starjoin
planning (in that we could maybe extend it to more cases). Although,
maybe we could check those constraints during planning, just like we
check the fkey_list.
This is a very different feature avoiding far less bugs than the
original key join. To give just once, consider

SELECT *
FROM customers cu
LEFT JOIN FOR KEY orders (id) <- cu (id)
WHERE customers.id = 122354;

I don't follow. What does this query illustrate?
Most relevantly here it illustrates, that fks instead have additional relevant guarantees compared to unique/primary key constraints for making writing and reading queries easier and safer.
Our proof framework with proof facts could potentially be leveraged to
proof such constructions too. We opted to keep this patch as minimal as
possible, since it's already huge. In the end our conquest is to get to
something, which is eventually commitable. This patch has a lot things
to get wrong. But I think as a second step adding support of other
schemas and handing this information of to the planer, should be a
fairly straight forward thing to do. Although it would add a few lines
on it's own, it's step, that I seems easier than landing a minimal
version of this patch.

OK. I realize this probably goes against my suggestion to make the patch
smaller. I certainly don't insist on supporting it.
I do want that though. It has advantages in using key joins in old codebases with existing SQL without full refactoring. And it might be ultimately helpful to the optimizer too. I just wonder about the best order to get a simple case of the architecture in.
[...]

8) I don't understand this change in functioncmds:

      /* Routine kind cannot change for an existing pg_proc OID. */
      Assert(procForm->prokind != PROKIND_AGGREGATE);

The comment says we can't change OID, but the assert checks it's not an
aggregate functions. Isn't that misleading?
Maybe it would be cleaner to mention additionally, that "aggregates were
already rejected on the pre-lock tuple"? I actually think, that comment
is indeed stating a helpful invariant to explain why we are concurrency
save here.
I don't follow. The comment says we can't change OID of an existing
procedure. OK, sure. But the assert checks the procedure is not an
aggregate function. How is that comment an accurate description of the
purpose of the assert?
it explains the invariant, why we can rely on our earlier pre-lock tuple check. To me that seemed to be a relevant enough assumption about the code to make it explicit in a comment.
[...]
13) I think the main question is whether parse-analyze is the right
place to handle this. I don't know. I assume one of the reasons for
doing that is to get an error when defining a view, or when altering an
object - e.g. like when dropping a function used by a view. Which seems
reasonable, people would not like views silently broken by DDL.

But it seems to be this also leads to a lot of code duplication, because
the parse analysis now has to "reimplement" a lot of the stuff already
done in later stages, after parse analyze. For example, we have the
root->fkey_list thing, but that's not available yet.

There's probably more such information - like innerrel_is_unique,
rel_is_distinct_for, relation_has_unique_index_for etc.

Maybe it's not worth it? What if we did this in the planner instead? How
much simpler would it get? My intuition is it'd get much smaller, but
maybe I'm wrong.

It'd probably mean it's not necessary to expand views during parse,
which was one of the problems mentioned at the beginning of this thread.

I suppose we'd need to keep additional information in the plan to know
which joins to check, etc.
If we want to use this for correctness and for proofs, we can't do it
plan time. We have DDL-writes to store proofs. In it's nature this is a
compile time feature.

Is that actually true? If we check the correctness later in the planner,
we'd still get a failure. Why couldn't that verify all the proofs, just
like the parse-analyze? We may need to retain more information, ofc.

Even if we don't include all of that in the first patch, this places
very severe limitations like the DDL issue, which would be just tooo
limiting to get to as an end place. I do think it's more reasonable how
to cut this up into more digestible pieces.

I'm not sure what "DDL issue" is, sorry :-(

I may be missing something, but the main difference in behavior seems to
be for views - the current patch verifies the proofs when creating the
view / altering objects, and rejects those. And AFAICS after moving it
to the planner, that would not be the case anymore.

But for regular ad hoc queries there's not much difference, no? You'd
get a failure, no matter what.

With DDL-issue I was eluding to a DDL command that attaches a query to some object. Views, sql functions and policies all do that. I don't see us doing that work planning time. Theoretically we could try to use a new planner hook for that, to call the planner with an extra struct element telling it about increased locking arrangements without the intend to ever execute a query and abort after the proof.

However I do think this introduces more future complexity by entangling separate concerns in the same code path, and I am not sure we are doing ourselves a favor with this, even if we would save a hand full of lines. I am not convinced I want to tackle that complexity on top of this already non-trivial patch.

I intend to try to separate the DDL handling case out into it's own patch of the patch series. While in lines of code is not that massive, reasoning about those two individually is indeed simpler. Even just getting to locking for the DDL-case right, isn't trivial.

I hope to get to that by the end of the now starting week.

Even though the planner is messy, I feel more at home there than here in
the parser. I could see a world were we clean up some of those out of
the planner to make them available earlier some time in parsing. We
could push those variables forward to the planner to avoid redoing the
work. Each of these steps would need a careful performance check and I
am not sure how the real stepping stones would look like.

I'm skeptical about moving this stuff to an earlier phase (so that the
parse can see it). You can try, but it probably requires moving a lot of
other stuff too, some of which may be expensive.

I have a high level general response and one tailored to this patch. I'll start with this particular patch: For this patch I am mostly concerned with not doing complex work twice: Once in the parser AND once in the optimizer. And the below general thoughts are not easily actionable.

More generally:

Your skepticism is well placed. Every change there would warrant benchmarking. I am just not completely convinced the planner is super optimal in the way it works. I suspect a lot of code there has not been optimized, because it's hard to wrap your head around it, that it doesn't feel like an easy gain anymore.

While 1:1 pulling stuff out of the planner almost surely gives mostly performance regressions, a clean architecture allowing more simple, isolated performance improvements sounds beneficial, too. And I don't want to open this can of worms right now.

I just recalled the time when Andres made the storage pluggable by introducing a layer of pointers everywhere. The new storage interface ended up being faster than the old. This is not storage and I am not Andres. I just wanted to share precedent, that a clear architecture with optimization disadvantages is not necessarily always slower.

regards

Regards
Arne



Reply via email to