Hi,

I looked into how to detect and resolve update_deleted, and came up with one idea: maintain the xmin in the logical slot to preserve the dead row, and support a latest_timestamp_wins resolution for update_deleted to maintain data consistency.

Here are the details of the xmin idea and the resolution of update_deleted:

1. How to preserve the dead row so that we can detect the update_deleted conflict correctly.

(In the following explanation, let's assume a multi-master setup with nodes A and B.)

To preserve the dead row on Node A, I think we could maintain the "xmin" in the logical replication slot on Node A to prevent VACUUM from removing the dead row in the user table. The walsender that acquires the slot is responsible for advancing the xmin. (Note that I am exploring the xmin idea because it could be more efficient than using commit_timestamp, and the logic could be simpler, as we already maintain catalog_xmin in logical slots and xmin in physical slots.)

- Strategy for advancing xmin:

The xmin can be advanced if
a) a transaction (xid:1000) has been flushed to the remote node (Node B in this case),
*AND*
b) on Node B, the local transactions that happened before applying the remote transaction (xid:1000) were also sent and flushed to Node A.

- The implementation:

Condition a) can be achieved with existing code: the walsender can advance the xmin in the same way as catalog_xmin.

For condition b), we can add a subscription option (say 'feedback_slot'). The feedback_slot indicates the replication slot that will send changes to the origin (on Node B, the slot should be subBA). The apply worker will check the status (confirmed flush LSN) of the 'feedback_slot' and send feedback to the walsender about the WAL position that has been sent and flushed via the feedback_slot.

For example, on Node B, we specify the replication slot (subBA) that is sending changes to Node A. The apply worker on Node B will regularly send feedback (the WAL position that has been sent to Node A) to Node A. Node A can then use that position to advance the xmin. (This is similar to hot_standby_feedback.)

2. The resolution for update_deleted

The current design doesn't support 'last_timestamp_win'.
But this could be a problem if update_deleted is detected due to some very old dead row. Assume the UPDATE has the latest timestamp: if we skip it because of these very old dead rows, the data would be inconsistent because the latest update is missing.

The ideal resolution should compare the timestamp of the UPDATE with the timestamp of the transaction that produced these dead rows. If the UPDATE is newer, then convert the UPDATE to an INSERT; otherwise, skip the UPDATE.

Best Regards,
Hou zj
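
To make the two rules above a bit more concrete, here is a rough sketch in Python (not PostgreSQL C, and not a proposed patch). Every name in it (SlotFeedback, can_advance_xmin, resolve_update_deleted, and the LSN and timestamp parameters) is hypothetical and exists only for illustration. It also assumes that condition b) can be checked by comparing the position confirmed through the feedback_slot with Node B's WAL position at the time it applied the remote transaction, which is one reading of the description above.

```python
from dataclasses import dataclass


@dataclass
class SlotFeedback:
    """Hypothetical view of what the walsender on Node A knows."""
    # Node A WAL position that Node B has confirmed as flushed.
    remote_flush_lsn: int
    # Node B WAL position confirmed as flushed to Node A via the
    # feedback_slot (subBA), as reported by B's apply worker.
    feedback_flush_lsn: int


def can_advance_xmin(commit_lsn_on_a: int, b_lsn_at_apply: int,
                     fb: SlotFeedback) -> bool:
    """Return True when both advancement conditions hold for one transaction.

    commit_lsn_on_a: Node A WAL position of the transaction (e.g. xid 1000).
    b_lsn_at_apply:  Node B WAL position when it applied that transaction
                     (assumed to be available; illustration only).
    """
    # a) the transaction has been flushed to Node B
    flushed_to_b = fb.remote_flush_lsn >= commit_lsn_on_a
    # b) B's local changes made before applying it were flushed back to A
    b_changes_back_on_a = fb.feedback_flush_lsn >= b_lsn_at_apply
    return flushed_to_b and b_changes_back_on_a


def resolve_update_deleted(update_ts: float, dead_row_ts: float) -> str:
    """Compare the UPDATE's timestamp with that of the transaction that
    produced the dead row: a newer UPDATE becomes an INSERT, anything
    else is skipped."""
    return "convert_to_insert" if update_ts > dead_row_ts else "skip"
```

For instance, can_advance_xmin(100, 50, SlotFeedback(120, 40)) is False because Node B's earlier local changes have not yet been confirmed back on Node A, so the dead row must still be kept. An UPDATE with the same timestamp as the dead row's transaction is skipped here, since it is not strictly newer; how ties should be handled is left open.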