Re: Deferred constraint trigger semantics

Bryn Llewellyn Mon, 16 May 2022 12:03:33 -0700

> [email protected]:
> 
>> [email protected] wrote:
> 
>> 
>> …I tried this:
>> 
>> create table t(
>>   k serial primary key,
>>   v int not null,
>>   constraint t_v_unq unique(v) initially deferred);


Here's a better test:

-- BLUE session
start transaction isolation level read committed;
insert into t(v) values (1), (2);

-- RED session
start transaction isolation level read committed;
insert into t(v) values (1), (3);

-- BLUE session
set constraints all immediate;

-- RED session (hangs until BLUE commits).
-- Then, when it does, gets ERROR... "Key (v)=(1) already exists"
set constraints all immediate;

-- BLUE session
commit;

-- RED session
-- There are no changes to commit 'cos they were rolled back.
commit;

-- "select k, v from t order by k" (in each session) now shows that both 
sessions meet the constraint.

>> Where, in the PG doc, can I read the account of the proper mental model for 
>> the application programmer?
> 
> [See https://www.postgresql.org/docs/current/index-unique-checks.html.]

Thanks for referring me to the account "62.5. Index Uniqueness Checks". It's in 
the section "Part VII. Internals" (…contains assorted information that might be 
of use to PostgreSQL developers). I wouldn't expect to read this because I 
don't intend to write code that might become part of PG's implementation.

> I'd say that the proper mental model is that you don't need to care… The 
> exact sequence of what happens during COMMIT is interesting, but irrelevant 
> to the programmer. All that counts is "a deferred constraint is checked 
> between the time that COMMIT starts processing and the time that it returns".

Yes, I very much like this stance. It seems that, for built-in constraints 
(like "unique" or "foreign key") it's enough to understand that PG implements 
these at the "read committed" isolation level by using methods (that aren't 
exposed via SQL) to peep below the application programmer's MVCC view of the 
world to check the uncommitted state of other, concurrent, sessions.

This explains why, in the (new) test that I used above, the conflict is 
detected when the second session issues "set constraints all immediate" after 
the first already did this (i.e. long before COMMIT). In this case, the second 
session hangs until the first commits—at which point the second sees the 
uniqueness violation error.

In other words, the automagic implementation of the enforcement of built-in 
constraints allows the safe use of "set constraints all immediate" to provoke a 
possible early error that can, very usefully, be handled in PL/pgSQL code. This 
is the clue to understanding why the check of a built-in constraint, when it's 
performed as an implicit consequence of "commit", doesn't need to be within the 
small part of the operations that "commit" causes that are strictly serialized. 
(You've explained how this helps performance in multi-session scenarios.)

Critically, the special methods that implement the enforcement of built-in 
constraints aren't accessible in PL/pgSQL code and therefore not accessible in 
the "ordinary" implementation of trigger functions. This is the point that I 
failed to grasp. (Though I do see, now, that Laurenz's post says this clearly.)

I was able to demonstrate this by implementing a unique constraint with a 
deferred constraint trigger (and no use of "set constraints all immediate"). I 
simply introduced "pg_sleep(5)" between the trigger function's actual check and 
its final "return null". I copied the code below for completeness.

The "BLUE" session, because it reaches its serialized "commit" actions first, 
sees an outcome that meets the constraint. But the "RED" session has enough 
time to do its check before "BLUE" does its serialized "commit" actions. So its 
test passes too. This leaves the final database in conflict with the intended 
constraint.

I see now that the only robust use of an ordinarily (i.e. not using C) 
implemented constraint trigger (deferred or otherwise) is to enforce a single 
row-constraint. (There's a caveat that maybe, after careful analysis, you can 
work out a cunning lockings scheme to allow the safe implementation of an 
entity-level constraint without using C. But the "exactly one or two admins in 
a department" example shows that this isn't generally possible.) So it's 
reasonable that a constraint trigger must be AFTER EACH ROW. Further, it would  
make no sense to do SQL from its implementation function because the only 
values that you might defensibly use are available simply via "old" and "new".

So all that stuff I was concerned about where the deferred constraint fires 
many times when once is enough falls away because the larger endeavor makes no 
sense. (I suppose that it might matter if you implemented the trigger function 
robustly in C.)

It does strike me that the ordinary application programmer—who reads just the 
sections "CREATE TRIGGER", "Chapter 39. Triggers", "CREATE TABLE", and "SET 
CONSTRAINTS"—will have a hard time to reach a correct understanding of what I 
believe that, with huge help from Laurenz Albe and David Johnston, I hope that 
I now have. This is the key sentence that needs careful interpretation:

"If the constraint is INITIALLY DEFERRED, it is checked only at the end of the 
transaction."

One's understanding is crucially determined by realizing that "at the end of 
the transaction" means "concurrently, when two or more sessions hit COMMIT at 
the exact same moment—and therefore *before* that part of the commit actions 
that is serialized. Oh well, I won't push that point.

--------------------------------------------------------------------------------
-- set-up.sql
drop table if exists t;
create table t(
  k serial primary key,
  v int not null);

-- No need, in this simple demo, to use a latching scheme
-- to execute constraint_trg_fn()'s test only once.
drop function if exists constraint_trg_fn() cascade;
create function constraint_trg_fn()
  returns trigger
  language plpgsql
as $body$
begin
  set constraint_trigger.fired = 'true';
  if
    (
      (select count(*) from t) > (select count(distinct v) from t)
    )
  then
    raise exception using
      errcode = 'raise_exception',
      message = '"t.v" values must be unique',
      hint    =  'try again';
  end if;
  perform pg_sleep(5);
  return null;
end;
$body$;

create constraint trigger constraint_trg
after insert on t
initially deferred
for each row
execute function constraint_trg_fn();
--------------------------------------------------------------------------------
-- blue.sql

set default_transaction_isolation = 'read committed';
do $body$
begin
  insert into t(v) values (1), (2);
end;
$body$;
select k, v from t order by k;

--------------------------------------------------------------------------------
-- red.sql

set default_transaction_isolation = 'read committed';
do $body$
begin
  insert into t(v) values (1), (3);
end;
$body$;
select k, v from t order by k

--------------------------------------------------------------------------------
-- RED session

\i set-up.sql

--------------------------------------------------------------------------------
-- BLUE session

\i blue.sql
--------------------------------------------------------------------------------
-- RED session
-- to be done as soon as your reflexes allow after starting "red.sql"

\i red.sql

--------------------------------------------------------------------------------

Re: Deferred constraint trigger semantics

Reply via email to