On Fri, Sep 2, 2022 at 5:42 AM Wolfgang Walther <walt...@technowledgy.de> wrote: > > Kaiting Chen: > > I'd like to propose a change to PostgreSQL to allow the creation of a > > foreign > > key constraint referencing a superset of uniquely constrained columns. > > +1 > > Tom Lane: > > TBH, I think this is a fundamentally bad idea and should be rejected > > outright. It fuzzes the semantics of the FK relationship, and I'm > > not convinced that there are legitimate use-cases. Your example > > schema could easily be dismissed as bad design that should be done > > some other way. > > I had to add quite a few unique constraints on a superset of already > uniquely constrained columns in the past, just to be able to support FKs > to those columns. I think those cases most often come up when dealing > with slightly denormalized schemas, e.g. for efficiency. > > One other use-case I had recently, was along the followling lines, in > abstract terms: > > CREATE TABLE classes (class INT PRIMARY KEY, ...); > > CREATE TABLE instances ( > instance INT PRIMARY KEY, > class INT REFERENCES classes, > ... > ); > > Think about classes and instances as in OOP. So the table classes > contains some definitions for different types of object and the table > instances realizes them into concrete objects. > > Now, assume you have some property of a class than is best modeled as a > table like this: > > CREATE TABLE classes_prop ( > property INT PRIMARY KEY, > class INT REFERNECES classes, > ... > ); > > Now, assume you need to store data for each of those classes_prop rows > for each instance. You'd do the following: > > CREATE TABLE instances_prop ( > instance INT REFERENCES instances, > property INT REFERENCES classes_prop, > ... > ); > > However, this does not ensure that the instance and the property you're > referencing in instances_prop are actually from the same class, so you > add a class column: > > CREATE TABLE instances_prop ( > instance INT, > class INT, > property INT, > FOREIGN KEY (instance, class) REFERENCES instances, > FOREIGN KEY (property, class) REFERENCES classes_prop, > ... > ); > > But this won't work, without creating some UNIQUE constraints on those > supersets of the PK column first.
If I'm following properly this sounds like an overengineered EAV schema, and neither of those things inspires me to think "this is a use case I want to support". That being said, I know that sometimes examples that have been abstracted enough to share aren't always the best, so perhaps there's something underlying this that's a more valuable example. > > For one example of where the semantics get fuzzy, it's not > > very clear how the extra-baggage columns ought to participate in > > CASCADE updates. Currently, if we have > > CREATE TABLE foo (a integer PRIMARY KEY, b integer); > > then an update that changes only foo.b doesn't need to update > > referencing tables, and I think we even have optimizations that > > assume that if no unique-key columns are touched then RI checks > > need not be made. But if you did > > CREATE TABLE bar (x integer, y integer, > > FOREIGN KEY (x, y) REFERENCES foo(a, b) ON UPDATE > > CASCADE); > > then perhaps you expect bar.y to be updated ... or maybe you don't? > > In all use-cases I had so far, I would expect bar.y to be updated, too. > > I think it would not even be possible to NOT update bar.y, because the > FK would then not match anymore. foo.a is the PK, so the value in bar.x > already forces bar.y to be the same as foo.b at all times. > > bar.y is a little bit like a generated value in that sense, it should > always match foo.b. I think it would be great, if we could actually go a > step further, too: On an update to bar.x to a new value, if foo.a=bar.x > exists, I would like to set bar.y automatically to the new foo.b. > Otherwise those kind of updates always have to either query foo before, > or add a trigger to do the same. Isn't this actually contradictory to the behavior you currently have with a multi-column foreign key? In the example above then an update to bar.x is going to update the rows in foo that match bar.x = foo.a and bar.y = foo.b *using the old values of bar.x and bar.y* to be the new values. You seem to be suggesting that instead it should look for other rows that already match the *new value* of only one of the columns in the constraint. If I'm understanding the example correctly, that seems like a *very* bad idea. James Coleman