Re: Improve logical replication usability when tables lack primary keys

shveta malik Mon, 13 Apr 2026 21:57:39 -0700

On Tue, Apr 14, 2026 at 9:17 AM Chao Li <[email protected]> wrote:
>
>
>
> > On Apr 8, 2026, at 18:22, shveta malik <[email protected]> wrote:
> >
> > On Fri, Dec 19, 2025 at 1:39 PM Chao Li <[email protected]> wrote:
> >>
> >>
> >>
> >>> On Dec 18, 2025, at 22:49, Euler Taveira <[email protected]> wrote:
> >>>
> >>> On Wed, Dec 17, 2025, at 6:43 PM, GRANT ZHOU wrote:
> >>>> On Wed, Dec 17, 2025 at 12:50 PM Euler Taveira <[email protected]> wrote:
> >>>>> Each table needs to say what's its row identifier. The user created a 
> >>>>> table
> >>>>> without primary key. Well, create a primary key. There are dozens of 
> >>>>> thousands
> >>>>> of objects. Use a script.
> >>>> However, I’d like to share a user perspective regarding the "use a
> >>>> script" approach. The main value of `FOR TABLES IN SCHEMA` is
> >>>> *in-database automation*. If users still need to maintain external
> >>>> scripts to monitor and `ALTER` new tables to prevent replication
> >>>> errors, it significantly diminishes the value of that automation.
> >>>>
> >>>
> >>> As I tried to explain in the previous email, the problem with FOR ALL 
> >>> TABLES
> >>> and FOR TABLES IN SCHEMA syntax is that the is no catalog information 
> >>> about the
> >>> relations; the list of relations is collected at runtime.
> >>>
> >>> When I suggested "use a script" I was referring to fix the logical 
> >>> replication
> >>> setup regarding the lack of primary key. There is no need to have an 
> >>> automation
> >>> outside the database, use an event trigger. If your lazy user doesn't 
> >>> create
> >>> the primary key, assign REPLICA IDENTITY FULL. Something like
> >>>
> >>> -- This example is far from being a complete solution for fixing the lack 
> >>> of
> >>> -- primary key in a logical replication scenario.
> >>> -- ALTER TABLE should be supported too
> >>> CREATE OR REPLACE FUNCTION event_trigger_for_replica_identity()
> >>>       RETURNS event_trigger LANGUAGE plpgsql AS $$
> >>> DECLARE
> >>>   obj record;
> >>>   rec record;
> >>>   ricnt integer := 0;
> >>> BEGIN
> >>>   FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands()
> >>>   LOOP
> >>>       IF obj.command_tag = 'CREATE TABLE' THEN
> >>>           SELECT COUNT(*) INTO ricnt FROM pg_index WHERE indrelid = 
> >>> obj.objid AND indisprimary;
> >>>           RAISE NOTICE 'ricnt: %', ricnt;
> >>>           IF ricnt = 0 THEN
> >>>               EXECUTE 'ALTER TABLE ' || obj.object_identity || ' REPLICA 
> >>> IDENTITY FULL';
> >>>           END IF;
> >>>       END IF;
> >>>   END LOOP;
> >>> END;
> >>> $$;
> >>>
> >>> CREATE EVENT TRIGGER event_trigger_for_replica_identity
> >>>   ON ddl_command_end
> >>>   EXECUTE FUNCTION event_trigger_for_replica_identity();
> >>>
> >>> CREATE TABLE event_trigger_test_1 (a int);
> >>> \d+ event_trigger_test_1
> >>> CREATE TABLE event_trigger_test_2 (a int primary key);
> >>> \d+ event_trigger_test_2
> >>> CREATE TABLE event_trigger_test_3 (a int, b text not null, primary 
> >>> key(b));
> >>> \d+ event_trigger_test_3
> >>> --ALTER TABLE event_trigger_test_3 DROP CONSTRAINT 
> >>> event_trigger_test_3_pkey;
> >>> --\d+ event_trigger_test_3
> >>>
> >>> DROP EVENT TRIGGER event_trigger_for_replica_identity;
> >>> DROP FUNCTION event_trigger_for_replica_identity;
> >>> DROP TABLE event_trigger_test_1, event_trigger_test_2, 
> >>> event_trigger_test_3;
> >>>
> >>> 8<----------------------------------------------------------------------------8<
> >>>
> >>>> Additionally, tables without Primary Keys are valid SQL and extremely
> >>>> common in enterprise environments (e.g., audit logs, data warehousing).
> >>>> In large-scale deployments, enforcing PKs on every single table isn't
> >>>> always practical.
> >>>>
> >>>
> >>> I'm not saying users shouldn't create tables without a primary key. I'm 
> >>> arguing
> >>> that this decision should take into account what adjustments need to be 
> >>> made to
> >>> use these tables in logical replication.
> >>>
> >>>>
> >>>> I think the goal of this proposal is not to change the underlying table
> >>>> property design, but rather to seek a mechanism (like a Publication
> >>>> option) to ensure this automation functions safely without external
> >>>> intervention. It is simply about allowing the database to handle these
> >>>> valid, common scenarios gracefully when automation is enabled.
> >>>>
> >>>
> >>> You didn't get it. You already have one property to handle it and you are
> >>> proposing to add a second property to handle it.
> >>>
> >>> I think you are pursuing the wrong solution. IMO we need a solution to 
> >>> enforce
> >>> that the logical replication contract is valid. If you create or modify a 
> >>> table
> >>> that is part of a publication, there is no validation that that table 
> >>> complies
> >>> with the publication properties (update and delete properties should 
> >>> require an
> >>> appropriate replica identity). We should close the gaps in both 
> >>> publication and
> >>> table.
> >>>
> >>
> >> If I summarize Euler’s position in short words: discipline over 
> >> convenience. I actually strongly agree with that. In PG we generally 
> >> prefer explicit over implicit behavior, and predictability over magic.
> >>
> >> Based on the discussion so far, I think we share the following design 
> >> goals:
> >>
> >> 1) Keep replica identity as a table property.
> >> 2) Avoid silent runtime failures when FOR TABLES IN SCHEMA pulls in tables 
> >> without primary keys.
> >> 3) Avoid global or implicit behavior changes.
> >> 4) Preserve explicit opt-in for higher WAL cost.
> >> 5) Keep the logical replication contract explicit and enforceable.
> >>
> >> I’ve been thinking about whether adding a new replica identity could meet 
> >> these goals.
> >>
> >> Today we have four replica identities: DEFAULT (PK, fallback to NONE), 
> >> INDEX, FULL, and NONE.
> >>
> >> What if we introduce a new replica identity, tentatively called “FORCE”: 
> >> PK with fallback to FULL. (Let’s keep our focus on the design, not argue 
> >> the name for now.)
> >>
> >> With this approach:
> >>
> >> 1) Replica identity remains a table property.
> >> 2) Publication membership is still evaluated at runtime, so FOR TABLES IN 
> >> SCHEMA is not special-cased.
> >> 3) No new GUCs are required.
> >> 4) The user must explicitly opt in by setting the replica identity. Once 
> >> FORCE is chosen, adding or dropping a primary key later does not silently 
> >> break UPDATE/DELETE replication.
> >> 5) The logical replication contract remains explicit; the table declares 
> >> that it is safe for UPDATE/DELETE replication even without a PK, at the 
> >> cost of higher WAL volume.
> >>
> >> This feels like a small, explicit extension of the existing RI semantics. 
> >> Notably, REPLICA IDENTITY DEFAULT already has conditional behavior (PK 
> >> fallback
> >> to NONE), so conditional RI behavior is not new, this would just make a 
> >> different fallback explicit and user-chosen.
> >
> >
>
> Hi Shveta,
>
> Thanks for your feedback.
>
> > Even though I like the idea of keeping replica identity as a table
> > property, this raises a question about the need for a new FORCE
> > option. With this approach, the user must explicitly opt REPLICA
> > IDENTITY FORCE for tables without a primary key (pt 4 in your design).
> > But if the user is already expected to take an explicit action, they
> > could simply set REPLICA IDENTITY FULL themselves at table creation
> > time or via ALTER TABLE.
>
> This is the tricky part. The need for this feature comes from some 
> complicated operational models in practice. Table creation is often handled 
> by an application team, which focuses on data logic and usually does not care 
> about replication details. Data replication, on the other hand, is maintained 
> by a different team, which does not care what data the tables contain, but 
> only needs to ensure that all data is replicated correctly. In that kind of 
> setup, we cannot expect REPLICA IDENTITY to be configured properly at table 
> creation time.
>
> Such operational models also imply that the database operations team is 
> usually the central database team, with superuser privileges, while an 
> application typically has privileges only on the tables it owns. The central 
> team can set configurations at the database or cluster level. In most cases, 
> what they want is to replicate all tables by default, except for a small 
> number of tables that are explicitly excluded.
>
> Actually, this feature request is coming from the replication team. Today, 
> they have to manually identify newly created tables, usually by running some 
> script, and then configure REPLICA IDENTITY for them. That takes effort and 
> is error-prone. So what they want is a DEFAULT-to-FULL fallback when a table 
> has no primary key.


Okay, I see. If the goal is specifically to address this use-case,
introducing a FORCE table property does not seem to solve the
underlying problem.  Using 'ALTER TABLE t REPLICA IDENTITY FORCE' is
effectively no different from 'ALTER TABLE t REPLICA IDENTITY FULL'.
In both cases, a manual action must still be taken on each table,
which is precisely what the replication team is trying to avoid.


> That’s why my initial proposal was to add a GUC to control the fallback 
> behavior, which would be convenient for the central database team.
>
> > From what I understood, the original
> > requirement was to ensure that tables without a primary key do not
> > silently fail replication, and that this is handled automatically
> > without manual intervention. In contrast, the FORCE approach still
> > requires users to explicitly configure replica identity for each
> > table, which seems to reintroduce the same operational burden.
> >
> > For example, instead of:
> > ALTER TABLE t1 REPLICA IDENTITY FORCE;
> > the user could simply do:
> > ALTER TABLE t1 REPLICA IDENTITY FULL
> >
> > By the time, the user is explicitly configuring replica identity, they
> > would already need to understand the replication semantics, so it's
> > not clear what additional benefit FORCE provides over FULL. Am I
> > missing something here?
> >
> > thanks
> > Shveta
>
>
> --
> Chao Li (Evan)
> HighGo Software Co., Ltd.
> https://www.highgo.com/
>
>
>
>

Re: Improve logical replication usability when tables lack primary keys

Reply via email to