On 03.04.24 07:30, Paul Jungwirth wrote:
But is it *literally* unique? Well two identical keys, e.g. (5, '[Jan24,Mar24)') and (5, '[Jan24,Mar24)'), do have overlapping ranges, so the second is excluded. Normally a temporal unique index is *more* restrictive than a standard one, since it forbids other values too (e.g. (5, '[Jan24,Feb24)')). But sadly there is one exception: the ranges in these keys do not overlap: (5, 'empty'), (5, 'empty'). With ranges/multiranges, `'empty' && x` is false for all x. You can add that key as many times as you like, despite a PK/UQ constraint:

     postgres=# insert into t values
     ('[1,2)', 'empty', 'foo'),
     ('[1,2)', 'empty', 'bar');
     INSERT 0 2
     postgres=# select * from t;
       id   | valid_at | name
     -------+----------+------
      [1,2) | empty    | foo
      [1,2) | empty    | bar
     (2 rows)

Cases like this shouldn't actually happen for temporal tables, since empty is not a meaningful value. An UPDATE/DELETE FOR PORTION OF would never cause an empty. But we should still make sure they don't cause problems.

We should give temporal primary keys an internal CHECK constraint saying `NOT isempty(valid_at)`. The problem is analogous to NULLs in parts of a primary key. NULLs prevent two identical keys from ever comparing as equal. And just as a regular primary key cannot contain NULLs, so a temporal primary key should not contain empties.

The standard effectively prevents this with PERIODs, because a PERIOD adds a constraint saying start < end. But our ranges enforce only start <= end. If you say `int4range(4,4)` you get `empty`. If we constrain primary keys as I'm suggesting, then they are literally unique, and indisunique seems safer.

Should we add the same CHECK constraint to temporal UNIQUE indexes? I'm inclined toward no, just as we don't forbid NULLs in parts of a UNIQUE key. We should try to pick what gives users more options, when possible. Even if it is questionably meaningful, I can see use cases for allowing empty ranges in a temporal table. For example it lets you "disable" a row, preserving its values but marking it as never true.

It looks like we missed some of these fundamental design questions early on, and it might be too late now to fix them for PG17.

For example, the discussion on unique constraints misses that the question of null values in unique constraints itself is controversial and that there is now a way to change the behavior. So I imagine there is also a selection of possible behaviors you might want for empty ranges. Intuitively, I don't think empty ranges are sensible for temporal unique constraints. But anyway, it's a bit late now to be discussing this.

I'm also concerned that if ranges have this fundamental incompatibility with periods, then the plan to eventually evolve this patch set to support standard periods will also have as-yet-unknown problems.

Some of these issues might be design flaws in the underlying mechanisms, like range types and exclusion constraints. Like, if you're supposed to use this for scheduling but you can use empty ranges to bypass exclusion constraints, how is one supposed to use this? Yes, a check constraint using isempty() might be the right answer. But I don't see this documented anywhere.

On the technical side, adding an implicit check constraint as part of a primary key constraint is quite a difficult implementation task, as I think you are discovering. I'm just reminded about how the patch for catalogued not-null constraints struggled with linking these not-null constraints to primary keys correctly. This sounds a bit similar.

I'm afraid that these issues cannot be resolved in good time for this release, so we should revert this patch set for now.



Reply via email to