On Fri, Dec 19, 2025 at 1:39 PM Chao Li <[email protected]> wrote: > > > On Dec 18, 2025, at 22:49, Euler Taveira <[email protected]> wrote: > > > > On Wed, Dec 17, 2025, at 6:43 PM, GRANT ZHOU wrote: > >> On Wed, Dec 17, 2025 at 12:50 PM Euler Taveira <[email protected]> wrote: > >>> Each table needs to say what's its row identifier. The user created a > >>> table > >>> without primary key. Well, create a primary key. There are dozens of > >>> thousands > >>> of objects. Use a script. > >> However, I’d like to share a user perspective regarding the "use a > >> script" approach. The main value of `FOR TABLES IN SCHEMA` is > >> *in-database automation*. If users still need to maintain external > >> scripts to monitor and `ALTER` new tables to prevent replication > >> errors, it significantly diminishes the value of that automation. > >> > > > > As I tried to explain in the previous email, the problem with FOR ALL TABLES > > and FOR TABLES IN SCHEMA syntax is that the is no catalog information about > > the > > relations; the list of relations is collected at runtime. > > > > When I suggested "use a script" I was referring to fix the logical > > replication > > setup regarding the lack of primary key. There is no need to have an > > automation > > outside the database, use an event trigger. If your lazy user doesn't create > > the primary key, assign REPLICA IDENTITY FULL. Something like > > > > -- This example is far from being a complete solution for fixing the lack of > > -- primary key in a logical replication scenario. > > -- ALTER TABLE should be supported too > > CREATE OR REPLACE FUNCTION event_trigger_for_replica_identity() > > RETURNS event_trigger LANGUAGE plpgsql AS $$ > > DECLARE > > obj record; > > rec record; > > ricnt integer := 0; > > BEGIN > > FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands() > > LOOP > > IF obj.command_tag = 'CREATE TABLE' THEN > > SELECT COUNT(*) INTO ricnt FROM pg_index WHERE indrelid = > > obj.objid AND indisprimary; > > RAISE NOTICE 'ricnt: %', ricnt; > > IF ricnt = 0 THEN > > EXECUTE 'ALTER TABLE ' || obj.object_identity || ' REPLICA > > IDENTITY FULL'; > > END IF; > > END IF; > > END LOOP; > > END; > > $$; > > > > CREATE EVENT TRIGGER event_trigger_for_replica_identity > > ON ddl_command_end > > EXECUTE FUNCTION event_trigger_for_replica_identity(); > > > > CREATE TABLE event_trigger_test_1 (a int); > > \d+ event_trigger_test_1 > > CREATE TABLE event_trigger_test_2 (a int primary key); > > \d+ event_trigger_test_2 > > CREATE TABLE event_trigger_test_3 (a int, b text not null, primary key(b)); > > \d+ event_trigger_test_3 > > --ALTER TABLE event_trigger_test_3 DROP CONSTRAINT > > event_trigger_test_3_pkey; > > --\d+ event_trigger_test_3 > > > > DROP EVENT TRIGGER event_trigger_for_replica_identity; > > DROP FUNCTION event_trigger_for_replica_identity; > > DROP TABLE event_trigger_test_1, event_trigger_test_2, event_trigger_test_3; > > > > 8<----------------------------------------------------------------------------8< > > > >> Additionally, tables without Primary Keys are valid SQL and extremely > >> common in enterprise environments (e.g., audit logs, data warehousing). > >> In large-scale deployments, enforcing PKs on every single table isn't > >> always practical. > >> > > > > I'm not saying users shouldn't create tables without a primary key. I'm > > arguing > > that this decision should take into account what adjustments need to be > > made to > > use these tables in logical replication. > > > >> > >> I think the goal of this proposal is not to change the underlying table > >> property design, but rather to seek a mechanism (like a Publication > >> option) to ensure this automation functions safely without external > >> intervention. It is simply about allowing the database to handle these > >> valid, common scenarios gracefully when automation is enabled. > >> > > > > You didn't get it. You already have one property to handle it and you are > > proposing to add a second property to handle it. > > > > I think you are pursuing the wrong solution. IMO we need a solution to > > enforce > > that the logical replication contract is valid. If you create or modify a > > table > > that is part of a publication, there is no validation that that table > > complies > > with the publication properties (update and delete properties should > > require an > > appropriate replica identity). We should close the gaps in both publication > > and > > table. > >
If we want, we can ensure that any table added to that specific publication (that has an option replica_identy='full') would automatically override the default to FULL, if PK is not available. This information can be cached to avoid overhead. > > If I summarize Euler’s position in short words: discipline over convenience. > I actually strongly agree with that. In PG we generally prefer explicit over > implicit behavior, and predictability over magic. > You haven't told why we can't consider a custom event trigger as suggested by Euler for customers who are not willing to change the RI default explicitly for each table. I think it is worth considering providing a custom solution outside core-postgres for your customers for this specific case. > Based on the discussion so far, I think we share the following design goals: > > 1) Keep replica identity as a table property. > 2) Avoid silent runtime failures when FOR TABLES IN SCHEMA pulls in tables > without primary keys. > 3) Avoid global or implicit behavior changes. > 4) Preserve explicit opt-in for higher WAL cost. > 5) Keep the logical replication contract explicit and enforceable. > > I’ve been thinking about whether adding a new replica identity could meet > these goals. > > Today we have four replica identities: DEFAULT (PK, fallback to NONE), INDEX, > FULL, and NONE. > > What if we introduce a new replica identity, tentatively called “FORCE”: PK > with fallback to FULL. (Let’s keep our focus on the design, not argue the > name for now.) > > With this approach: > > 1) Replica identity remains a table property. > 2) Publication membership is still evaluated at runtime, so FOR TABLES IN > SCHEMA is not special-cased. > 3) No new GUCs are required. > 4) The user must explicitly opt in by setting the replica identity. Once > FORCE is chosen, adding or dropping a primary key later does not silently > break UPDATE/DELETE replication. > > 5) The logical replication contract remains explicit; the table declares that > it is safe for UPDATE/DELETE replication even without a PK, at the cost of > higher WAL volume. > > This feels like a small, explicit extension of the existing RI semantics. > Notably, REPLICA IDENTITY DEFAULT already has conditional behavior (PK > fallback > to NONE), so conditional RI behavior is not new, this would just make a > different fallback explicit and user-chosen. > > After that, we could consider a database-level default_replica_identity > setting, applied at table creation time, for environments that want this > behavior consistently. But that would only make sense if we first agree on > the table-level mechanism. > I don't much like the database-level option as it expects a new default to be introduced. I think the internal working will almost be same as the option at publication-level. -- With Regards, Amit Kapila.
