Hi all,

I’m sending v2 of the patch. It is a clean rebase onto current master
(commit a27893df45e), with the fix and the TAP test squashed into a single
patch file.
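To make the consumer-side issue from my earlier message (quoted below) concrete, here is a minimal Python sketch of how an external reader might decode an Update message and its old-tuple tag. It is only an illustration, not the patch and not our client's code. It assumes a non-streamed transaction (so there is no xid field after 'U'), and it assumes that columns left out of a key-only old tuple arrive as NULL markers ('n'). The names old_tuple_tag, encode_tuple, read_tuple and parse_update are hypothetical helpers made up for this example.

```python
import struct
from typing import List, Optional, Tuple

Row = List[Optional[bytes]]

REPLICA_IDENTITY_FULL = "f"  # pg_class.relreplident value for REPLICA IDENTITY FULL


def old_tuple_tag(leaf_relreplident: str) -> str:
    """Rule described in the thread: pick the tag from the leaf partition."""
    return "O" if leaf_relreplident == REPLICA_IDENTITY_FULL else "K"


def encode_tuple(row: Row) -> bytes:
    """Hypothetical helper: build a TupleData submessage (text format only)."""
    out = struct.pack("!h", len(row))           # Int16 column count
    for value in row:
        if value is None:
            out += b"n"                         # NULL marker, no payload
        else:
            out += b"t" + struct.pack("!i", len(value)) + value
    return out


def read_tuple(buf: bytes, pos: int) -> Tuple[Row, int]:
    """Decode one TupleData submessage; return (columns, next offset)."""
    (ncols,) = struct.unpack_from("!h", buf, pos)
    pos += 2
    row: Row = []
    for _ in range(ncols):
        kind = buf[pos:pos + 1]
        pos += 1
        if kind in (b"n", b"u"):                # NULL or unchanged TOAST value
            row.append(None)                    # (this sketch does not tell them apart)
        elif kind in (b"t", b"b"):              # text or binary: Int32 length + bytes
            (length,) = struct.unpack_from("!i", buf, pos)
            pos += 4
            row.append(buf[pos:pos + length])
            pos += length
        else:
            raise ValueError(f"unexpected column kind {kind!r}")
    return row, pos


def parse_update(msg: bytes) -> Tuple[int, Optional[str], Optional[Row], Row]:
    """Parse 'U', Int32 relation OID, optional 'K'/'O' tuple, then 'N' tuple."""
    if msg[:1] != b"U":
        raise ValueError("not an Update message")
    (rel_oid,) = struct.unpack_from("!I", msg, 1)
    pos = 5
    old_tag: Optional[str] = None
    old_row: Optional[Row] = None
    tag = msg[pos:pos + 1]
    if tag in (b"K", b"O"):                     # old tuple or key is optional
        old_tag = tag.decode()
        old_row, pos = read_tuple(msg, pos + 1)
        tag = msg[pos:pos + 1]
    if tag != b"N":
        raise ValueError(f"expected new tuple tag 'N', got {tag!r}")
    new_row, pos = read_tuple(msg, pos + 1)
    return rel_oid, old_tag, old_row, new_row
```

A consumer written along these lines would typically branch on the returned tag: with 'O' it treats every None in the old row as a real NULL, with 'K' it looks only at the key columns. That is why a key-only tuple tagged 'O' gets misread on the consumer side.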
I would appreciate your thoughts and comments on the current problem.

Thank you!

--
Best regards,
Mikhail Kharitonov

On Thu, May 29, 2025 at 9:30 AM Mikhail Kharitonov
<mikhail.kharitonov....@gmail.com> wrote:
>
> Hi,
>
> Thank you for the feedback.
>
> I would like to clarify that the current behavior does not break replication
> between PostgreSQL instances. The logical replication stream is still accepted
> by the subscriber, and the data is applied correctly. However, the protocol
> semantics are violated, which may cause issues for external systems that rely
> on interpreting this stream.
>
> When using publish_via_partition_root = true and setting REPLICA IDENTITY FULL
> only on the parent table (but not on all partitions), logical replication
> generates messages with the tag 'O' (old tuple) for updates and deletes even
> for partitions that do not have full identity configured.
>
> In those cases, only key columns are sent, and the rest of the tuple is
> omitted. This contradicts the meaning of tag 'O', which, according to the
> documentation [1], indicates that the full old tuple is included.
>
> This behavior is safe for the standard PostgreSQL subscriber, which does not
> rely on the tag when applying changes. However, third-party tools that consume
> the logical replication stream and follow the protocol strictly can be misled.
> For example, one of our clients uses a custom CDC mechanism that extracts
> changes and sends them to Oracle. Their handler interprets the 'O' tag as a
> signal that the full old row is available. When it is not, the data is
> processed incorrectly.
>
> The attached patch changes the behavior so that the 'O' or 'K' tag is chosen
> based on the REPLICA IDENTITY setting of the actual partition where the row
> ends up, not only that of the parent.
> - If the partition has REPLICA IDENTITY FULL, the full tuple is
>   sent and tagged 'O'.
> - Otherwise, only the key columns are sent, and the tag 'K' is used.
>
> This aligns the behavior with the protocol documentation.
> I have also included a TAP test, 036_partition_replica_identity.pl,
> located in src/test/subscription/t/
>
> It demonstrates two cases:
> - An update/delete on a partition with REPLICA IDENTITY FULL correctly
>   emits an 'O' tag with the full old row.
> - An update/delete on a partition without REPLICA IDENTITY FULL currently
>   also emits an 'O' tag, but only with key fields; this is the problem.
>
> After applying the patch, the second case correctly uses the 'K' tag.
>
> This patch is a minimal change: it does not alter the protocol structure
> or introduce new behavior. It only ensures the implementation matches
> the documentation. In the future, we might consider a broader redesign
> of logical replication for partitioned tables (see [2]), but this is
> a narrow fix that solves a real inconsistency.
>
> Looking forward to your comments.
>
> Best regards,
> Mikhail Kharitonov
>
> [1]
> https://www.postgresql.org/docs/current/protocol-logicalrep-message-formats.html
> [2]
> https://www.postgresql.org/message-id/201902041630.gpadougzab7v@alvherre.pgsql
>
> On Mon, May 12, 2025 at 5:25 PM Maxim Orlov <orlo...@gmail.com> wrote:
> >
> > Hi!
> >
> > This is probably not the most familiar part of Postgres to me, but does it
> > break anything? Or is it just an inconsistency in the replication protocol?
> >
> > A test for the described scenario would be a great addition. And, if it is
> > feasible, provide an example of what would be broken with the way
> > partitioned tables are replicated now.
> >
> > There is a chance that the replication protocol for partitioned tables
> > needs to be rewritten, and I sincerely hope that I am wrong about this. It
> > seems Alvaro Herrera tried this here [0].
> >
> >
> > [0]
> > https://www.postgresql.org/message-id/201902041630.gpadougzab7v@alvherre.pgsql
> >
> >
> > --
> > Best regards,
> > Maxim Orlov.