Re: Two questions about "pg_constraint"

Bryn Llewellyn Fri, 26 Aug 2022 15:33:25 -0700

> x...@thebuild.com wrote:
> 
>> b...@yugabyte.com wrote:
>> [...]
> 
> I've read this a few times, and I am having trouble understanding what 
> behavior you were expecting out of PostgreSQL, and what behavior you received 
> that you didn't think was correct. If it is "pg_constraint has a column 
> connamespace, and that appears to be a denormalization since a constraint is 
> always in the same schema as the table it is owned by”, I believe Tom 
> explained the reason for that. If that's not what is concerning you, can you 
> summarize it in a sentence [or] two?


Thanks for your interest in this, Christophe. I’ll start again from scratch. 
I’m copying David because he said something relevant to the present discussion 
in an earlier separate thread.

But, first, a caveat. I write, here, as if I'm 100% confident of the 
correctness of everything I say. I've found that this is an effective way to 
provoke correction when I'm wrong. And I look forward to correction from you 
(all) here.

I’ll be as brief as I can. But I’m afraid that I cannot condense the problem 
statement beyond my points #1 and #2 below. And then I need a preamble before 
the problem statement and a discussion after it.

Notice that I don't think that there's any wrong behavior here. Rather, there's 
just that annoying transitive dependency and, worse for me, what appears to be 
some wrong ways to talk about stuff from the user's PoV where only business 
unique keys matter and the surrogate oid values vanish from the picture.

For example, Christophe used the phrase "a constraint is always in the same 
schema as..." (and didn't mention domains). And David said "the trigger owner 
is the context in which the trigger function is executed". Neither a constraint 
nor a trigger is "in" a schema or "has" an owner. In both cases, these are 
derived (a.k.a. transitive) properties.

Here, now, is my problem statement:

«
1. The "pg_constraint" column, "connamespace", is a transitive dependency. 
Therefore, tautologically, it need not be there. (So I’m not convinced by what 
Tom said here.) Given that "pg_constraint" does have its "connamespace" column, 
it may just as well have a "conowner"column as a second transitive dependency. 
On the other hand, "pg_trigger" has neither a "tgnamespace" column nor a 
"tgowner" column. But, if you (all) think that transitive dependencies are a 
nice usability feature, and you (all) can guarantee that the derived values are 
always in step with their source, then I suppose that no harm is done—beyond 
possibly leading the novice to assume a faulty mental. Whatever is decided, the 
documentation should make it clear.

2. The larger point is that I want to draw a distinction between a (primary) 
schema object, like a table, a domain, or a function and a secondary object 
like a constraint or a trigger. The business unique key of a schema object is 
its schema-qualified name together with the catalog table ("pg_class", 
"pg_type", "pg_proc",...) that defines the uniqueness scope for its name. A 
schema object also has an owner—but its name is not part of its business unique 
key. In contrast, the business unique key of a secondary object is its own name 
together with the business unique key of the object off which it hangs. And a 
secondary object cannot exist without hanging off a schema object.
»

I have to use more words now to illustrate, and dramatize, my point. First, run 
my demo. Just copy-and-paste the code at the end and paste it into a ".sql" 
script. Then execute it in psql. You can run it time and again. It finishes by 
producing this output:

 conname | connamespace |  n   
---------+--------------+------
 c       |      2127201 | 2000

 tgname |  n   
--------+------
 t      | 1000

So I've created 2K constraints, all called 'c', and all "in" (as Christophe and 
Adrian would have it) the same schema. And I've created 1K triggers, all called 
't', and all "having" (as David would have it) the same owner.

Look at the definitions of the temporary views whose results sets I showed 
above. Each constraint has a unique value for the tuple "(conname, conrelid, 
contypid)"—as it must have because my "constrains" view selects from just the 
single table "pg_constraint" and this places a uniqueness requirement on this 
tuple. It's only to be expected that I can have any number of constraints, all 
with the same "conname" and "connamespace" because the definition of 
"pg_constraint" places no uniqueness requirement on either of these, neither 
separately nor jointly.

Similarly, each trigger has a unique value for the tuple "(tgname, tgrelid)"—as 
it must have because my "triggers" view selects from just the single table 
"pg_trigger" and this places a uniqueness requirement on this tuple. It's only 
to be expected that I can have any number of constraints, all with the same 
"tgname" because the definition of "pg_trigger" places no uniqueness 
requirement on this.

With a bit more typing, I could materialize both a schema and an owner for each 
constraint and each trigger. (I showed the SQL for doing this for a constraint 
earlier in this thread. And I've done it for myself for triggers.) The 
situation is a little harder for constraints than for triggers because a 
constraint can hang off either a table or a domain while a DML trigger can hang 
off just a table. This leads to the arc-FK (implied, if not declared) from 
"pg_constraint" either to "pg_class" or to "pg_type". The documentation for 
"pg_constraint" on the topic:

«
conrelid (oid references pg_class.oid): The table this constraint is on; 0 if 
not a table constraint
contypid (oid references pg_type.oid): The domain this constraint is on; 0 if 
not a domain constraint
»

together with the requirement that "(conname, conrelid, contypid)" is unique 
(and some empirical testing) makes it clear. Thanks, Tom, for asking me to 
think about this. It implies more elaborate SQL to get the facts for the schema 
object that a constraint hangs off than it does to get those facts for a 
trigger. But it's not (as Tom claimed it would be) daunting. I expect that my 
SQL could be improved. But it wasn't hard to write what I did.

Anyway... the structure of the "pg_constraint" and "pg_trigger" tables, their 
unique indexes, and where their implied oid-FKs point is fully consistent with 
my mental model, thus:

- A schema object is uniquely identified by its qualified name and its 
namespace ('relations', types', 'subprograms',...).

- A constraint (as a secondary object) is uniquely identified by its own name 
and the identity of the schema object off which it hangs

- A trigger, too, (as a secondary object) is also uniquely identified by its 
own name and the identity of the schema object off which it hangs

- Because a schema object lives in a schema and has an owner, you might like to 
think that a constraint, or a trigger, each lives in a schema and has an owner. 
But I think that it's both unnecessary and sometimes unhelpful to elide the 
transitive step here. (This elision led Adrian to say that a constraint name 
must be unique within a schema. And it led David to say that the owner of a 
trigger plays a part in determining the "current_role" that a "security 
invoker" trigger function sees when the trigger fires.)

--------------------------------------------------------------------------------
-- Boring code to guarantee a fresh start and error-free re-runs.

\c postgres postgres
set client_min_messages = warning;
drop database if exists bryn;
create database bryn owner postgres;
revoke all on database bryn from public;

\c bryn postgres
set client_min_messages = warning;

do $body$
begin
  begin
    drop owned by u1 cascade;
  exception
    when undefined_object then null;
  end;
  drop schema public;
  create schema s1 authorization postgres;
  revoke usage on schema s1 from public;
  drop role if exists u1;
  create role u1 login password 'p';
  grant connect on database bryn to u1;
  grant create on database bryn to u1;
  grant temporary on database bryn to u1;
  grant usage on schema s1 to u1;
  grant create on schema s1 to u1;
  alter user u1 set search_path = pg_catalog, pg_temp;
end;
$body$;
--------------------------------------------------------------------------------
\c bryn u1
/*
  The interesting code starts here. It simply creates lots of tables and
  domains, all in the single schema 's1'. Each table has a trigger called 't'.
  And both each table and each domain has a constraint called 'c'.
*/;

create function s1.f()
  returns trigger
  security definer
  language plpgsql
as $body$
begin
  return old;
end;
$body$;

do $body$
declare
  no_of_iterations constant int not null := 1000;

  cr_table constant text not null := '
    create table s1.%I(
      k int primary key,
      v text,
      constraint c check(v = lower(v)))';

  cr_trigger constant text not null := '
    create trigger t
      after delete
      on s1.%I
      for each statement
      execute function s1.f()';

  cr_domain constant text not null := '
    create domain s1.%I as int[]
      constraint c check(cardinality(value) > 1)';
begin
  for j in 1..no_of_iterations loop
    execute format(cr_table,   'a'||j::text);
    execute format(cr_trigger, 'a'||j::text);
    execute format(cr_domain,  'b'||j::text);
  end loop;
end;
$body$;

create temporary view constraints as
select conname, conrelid, contypid, connamespace
from pg_constraint where conname = 'c';

create temporary view triggers as
select tgname, tgrelid
from pg_trigger where tgname = 't';

select conname, connamespace, count(*) as n
from constraints
group by conname, connamespace;

select tgname, count(*) as n
from triggers
group by tgname;

Re: Two questions about "pg_constraint"

Reply via email to