On Wed, 22 May 2024 at 08:18, Peter Smith <smithpb2...@gmail.com> wrote: > > On Tue, May 21, 2024 at 8:40 PM PG Doc comments form > <nore...@postgresql.org> wrote: > > > > The following documentation comment has been logged on the website: > > > > Page: https://www.postgresql.org/docs/16/logical-replication-col-lists.html > > Description: > > > > The documentation on this page mentions: > > > > "If no column list is specified, any columns added later are automatically > > replicated." > > > > It feels ambiguous what this could mean. Does it mean: > > > > 1/ That if you alter the table on the publisher and add a new column, it > > will be replicated > > > > 2/ If you add a column list later and add a column to it, it will be > > replicated > > > > In both cases, does the subscriber automatically create this column if it > > wasn't there before? > > No, the subscriber will not automatically create the column. That is > already clearly said at the top of the same page you linked "The table > on the subscriber side must have at least all the columns that are > published." > > All that "If no column list..." paragraph was trying to say is: > > CREATE PUBLICATION pub FOR TABLE T; > > is not quite the same as: > > CREATE PUBLICATION pub FOR TABLE T(a,b,c); > > The difference is, in the 1st case if you then ALTER the TABLE T to > have a new column 'd' then that will automatically start replicating > the 'd' data without having to do anything to either the PUBLICATION > or the SUBSCRIPTION. Of course, if TABLE T at the subscriber side does > not have a column 'd' then you'll get an error because your subscriber > table needs to have *at least* all the replicated columns. (I > demonstrate this error below) > > Whereas in the 2nd case, even though you ALTER'ed the TABLE T to have > a new column 'd' then that won't be replicated because 'd' was not > named in the PUBLICATION's column list. > > ~~~~ > > Here's an example where you can see this in action > > Here is an example of the 1st case -- it shows 'd' is automatically > replicated and also shows the subscriber-side error caused by the > missing column: > > test_pub=# CREATE TABLE T(a int,b int, c int); > test_pub=# CREATE PUBLICATION pub FOR TABLE T; > > test_sub=# CREATE TABLE T(a int,b int, c int); > test_sub=# CREATE SUBSCRIPTION sub CONNECTION 'dbname=test_pub' PUBLICATION > pub; > > See the replication happening > test_pub=# INSERT INTO T VALUES (1,2,3); > test_sub=# SELECT * FROM t; > a | b | c > ---+---+--- > 1 | 2 | 3 > (1 row) > > Now alter the publisher table T and insert some new data > test_pub=# ALTER TABLE T ADD COLUMN d int; > test_pub=# INSERT INTO T VALUES (5,6,7,8); > > This will cause subscription errors like: > 2024-05-22 11:53:19.098 AEST [16226] ERROR: logical replication > target relation "public.t" is missing replicated column: "d" > > ~~~~ > > I think the following small change will remove any ambiguity: > > BEFORE > If no column list is specified, any columns added later are > automatically replicated. > > SUGGESTION > If no column list is specified, any columns added to the table later > are automatically replicated. > > ~~ > > I attached a small patch to make the above change. > > Thoughts?
A minor suggestion, the rest looks good: It would enhance clarity to include a line break following "If no column list is specified, any columns added to the table later are": - If no column list is specified, any columns added later are automatically + If no column list is specified, any columns added to the table later are automatically replicated. This means that having a column list which names all columns Regards, Vignesh