On Tue, Apr 14, 2026 at 9:17 AM Chao Li <[email protected]> wrote: > > > > > On Apr 8, 2026, at 18:22, shveta malik <[email protected]> wrote: > > > > On Fri, Dec 19, 2025 at 1:39 PM Chao Li <[email protected]> wrote: > >> > >> > >> > >>> On Dec 18, 2025, at 22:49, Euler Taveira <[email protected]> wrote: > >>> > >>> On Wed, Dec 17, 2025, at 6:43 PM, GRANT ZHOU wrote: > >>>> On Wed, Dec 17, 2025 at 12:50 PM Euler Taveira <[email protected]> wrote: > >>>>> Each table needs to say what's its row identifier. The user created a > >>>>> table > >>>>> without primary key. Well, create a primary key. There are dozens of > >>>>> thousands > >>>>> of objects. Use a script. > >>>> However, I’d like to share a user perspective regarding the "use a > >>>> script" approach. The main value of `FOR TABLES IN SCHEMA` is > >>>> *in-database automation*. If users still need to maintain external > >>>> scripts to monitor and `ALTER` new tables to prevent replication > >>>> errors, it significantly diminishes the value of that automation. > >>>> > >>> > >>> As I tried to explain in the previous email, the problem with FOR ALL > >>> TABLES > >>> and FOR TABLES IN SCHEMA syntax is that the is no catalog information > >>> about the > >>> relations; the list of relations is collected at runtime. > >>> > >>> When I suggested "use a script" I was referring to fix the logical > >>> replication > >>> setup regarding the lack of primary key. There is no need to have an > >>> automation > >>> outside the database, use an event trigger. If your lazy user doesn't > >>> create > >>> the primary key, assign REPLICA IDENTITY FULL. Something like > >>> > >>> -- This example is far from being a complete solution for fixing the lack > >>> of > >>> -- primary key in a logical replication scenario. > >>> -- ALTER TABLE should be supported too > >>> CREATE OR REPLACE FUNCTION event_trigger_for_replica_identity() > >>> RETURNS event_trigger LANGUAGE plpgsql AS $$ > >>> DECLARE > >>> obj record; > >>> rec record; > >>> ricnt integer := 0; > >>> BEGIN > >>> FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands() > >>> LOOP > >>> IF obj.command_tag = 'CREATE TABLE' THEN > >>> SELECT COUNT(*) INTO ricnt FROM pg_index WHERE indrelid = > >>> obj.objid AND indisprimary; > >>> RAISE NOTICE 'ricnt: %', ricnt; > >>> IF ricnt = 0 THEN > >>> EXECUTE 'ALTER TABLE ' || obj.object_identity || ' REPLICA > >>> IDENTITY FULL'; > >>> END IF; > >>> END IF; > >>> END LOOP; > >>> END; > >>> $$; > >>> > >>> CREATE EVENT TRIGGER event_trigger_for_replica_identity > >>> ON ddl_command_end > >>> EXECUTE FUNCTION event_trigger_for_replica_identity(); > >>> > >>> CREATE TABLE event_trigger_test_1 (a int); > >>> \d+ event_trigger_test_1 > >>> CREATE TABLE event_trigger_test_2 (a int primary key); > >>> \d+ event_trigger_test_2 > >>> CREATE TABLE event_trigger_test_3 (a int, b text not null, primary > >>> key(b)); > >>> \d+ event_trigger_test_3 > >>> --ALTER TABLE event_trigger_test_3 DROP CONSTRAINT > >>> event_trigger_test_3_pkey; > >>> --\d+ event_trigger_test_3 > >>> > >>> DROP EVENT TRIGGER event_trigger_for_replica_identity; > >>> DROP FUNCTION event_trigger_for_replica_identity; > >>> DROP TABLE event_trigger_test_1, event_trigger_test_2, > >>> event_trigger_test_3; > >>> > >>> 8<----------------------------------------------------------------------------8< > >>> > >>>> Additionally, tables without Primary Keys are valid SQL and extremely > >>>> common in enterprise environments (e.g., audit logs, data warehousing). > >>>> In large-scale deployments, enforcing PKs on every single table isn't > >>>> always practical. > >>>> > >>> > >>> I'm not saying users shouldn't create tables without a primary key. I'm > >>> arguing > >>> that this decision should take into account what adjustments need to be > >>> made to > >>> use these tables in logical replication. > >>> > >>>> > >>>> I think the goal of this proposal is not to change the underlying table > >>>> property design, but rather to seek a mechanism (like a Publication > >>>> option) to ensure this automation functions safely without external > >>>> intervention. It is simply about allowing the database to handle these > >>>> valid, common scenarios gracefully when automation is enabled. > >>>> > >>> > >>> You didn't get it. You already have one property to handle it and you are > >>> proposing to add a second property to handle it. > >>> > >>> I think you are pursuing the wrong solution. IMO we need a solution to > >>> enforce > >>> that the logical replication contract is valid. If you create or modify a > >>> table > >>> that is part of a publication, there is no validation that that table > >>> complies > >>> with the publication properties (update and delete properties should > >>> require an > >>> appropriate replica identity). We should close the gaps in both > >>> publication and > >>> table. > >>> > >> > >> If I summarize Euler’s position in short words: discipline over > >> convenience. I actually strongly agree with that. In PG we generally > >> prefer explicit over implicit behavior, and predictability over magic. > >> > >> Based on the discussion so far, I think we share the following design > >> goals: > >> > >> 1) Keep replica identity as a table property. > >> 2) Avoid silent runtime failures when FOR TABLES IN SCHEMA pulls in tables > >> without primary keys. > >> 3) Avoid global or implicit behavior changes. > >> 4) Preserve explicit opt-in for higher WAL cost. > >> 5) Keep the logical replication contract explicit and enforceable. > >> > >> I’ve been thinking about whether adding a new replica identity could meet > >> these goals. > >> > >> Today we have four replica identities: DEFAULT (PK, fallback to NONE), > >> INDEX, FULL, and NONE. > >> > >> What if we introduce a new replica identity, tentatively called “FORCE”: > >> PK with fallback to FULL. (Let’s keep our focus on the design, not argue > >> the name for now.) > >> > >> With this approach: > >> > >> 1) Replica identity remains a table property. > >> 2) Publication membership is still evaluated at runtime, so FOR TABLES IN > >> SCHEMA is not special-cased. > >> 3) No new GUCs are required. > >> 4) The user must explicitly opt in by setting the replica identity. Once > >> FORCE is chosen, adding or dropping a primary key later does not silently > >> break UPDATE/DELETE replication. > >> 5) The logical replication contract remains explicit; the table declares > >> that it is safe for UPDATE/DELETE replication even without a PK, at the > >> cost of higher WAL volume. > >> > >> This feels like a small, explicit extension of the existing RI semantics. > >> Notably, REPLICA IDENTITY DEFAULT already has conditional behavior (PK > >> fallback > >> to NONE), so conditional RI behavior is not new, this would just make a > >> different fallback explicit and user-chosen. > > > > > > Hi Shveta, > > Thanks for your feedback. > > > Even though I like the idea of keeping replica identity as a table > > property, this raises a question about the need for a new FORCE > > option. With this approach, the user must explicitly opt REPLICA > > IDENTITY FORCE for tables without a primary key (pt 4 in your design). > > But if the user is already expected to take an explicit action, they > > could simply set REPLICA IDENTITY FULL themselves at table creation > > time or via ALTER TABLE. > > This is the tricky part. The need for this feature comes from some > complicated operational models in practice. Table creation is often handled > by an application team, which focuses on data logic and usually does not care > about replication details. Data replication, on the other hand, is maintained > by a different team, which does not care what data the tables contain, but > only needs to ensure that all data is replicated correctly. In that kind of > setup, we cannot expect REPLICA IDENTITY to be configured properly at table > creation time. > > Such operational models also imply that the database operations team is > usually the central database team, with superuser privileges, while an > application typically has privileges only on the tables it owns. The central > team can set configurations at the database or cluster level. In most cases, > what they want is to replicate all tables by default, except for a small > number of tables that are explicitly excluded. > > Actually, this feature request is coming from the replication team. Today, > they have to manually identify newly created tables, usually by running some > script, and then configure REPLICA IDENTITY for them. That takes effort and > is error-prone. So what they want is a DEFAULT-to-FULL fallback when a table > has no primary key.
Okay, I see. If the goal is specifically to address this use-case, introducing a FORCE table property does not seem to solve the underlying problem. Using 'ALTER TABLE t REPLICA IDENTITY FORCE' is effectively no different from 'ALTER TABLE t REPLICA IDENTITY FULL'. In both cases, a manual action must still be taken on each table, which is precisely what the replication team is trying to avoid. > That’s why my initial proposal was to add a GUC to control the fallback > behavior, which would be convenient for the central database team. > > > From what I understood, the original > > requirement was to ensure that tables without a primary key do not > > silently fail replication, and that this is handled automatically > > without manual intervention. In contrast, the FORCE approach still > > requires users to explicitly configure replica identity for each > > table, which seems to reintroduce the same operational burden. > > > > For example, instead of: > > ALTER TABLE t1 REPLICA IDENTITY FORCE; > > the user could simply do: > > ALTER TABLE t1 REPLICA IDENTITY FULL > > > > By the time, the user is explicitly configuring replica identity, they > > would already need to understand the replication semantics, so it's > > not clear what additional benefit FORCE provides over FULL. Am I > > missing something here? > > > > thanks > > Shveta > > > -- > Chao Li (Evan) > HighGo Software Co., Ltd. > https://www.highgo.com/ > > > >
