On Tue, Jul 26, 2022 at 7:13 AM Jonathan S. Katz <jk...@postgresql.org> wrote:
>
> On 7/25/22 4:54 AM, vignesh C wrote:
> >
> > Let's take a simple case to understand why copy_data = force is
> > required to replicate between two primaries for table t1 which has
> > data as given below:
>
> > step4 - Node-2:
> > Create Subscription sub2 Connection '<node-1 details>' Publication
> > pub1_2 with (origin = none, copy_data=on);
> > If we had allowed the create subscription to be successful with
> > copy_data = on. After this the data will be something like this:
> > Node-1:
> > 1, 2, 3, 4, 5, 6, 7, 8
> >
> > Node-2:
> > 1, 2, 3, 4, 5, 6, 7, 8, 5, 6, 7, 8
> >
> > So, you can see that data on Node-2 (5, 6, 7, 8) is duplicated. In
> > case, table t1 has a unique key, it will lead to a unique key
> > violation and replication won't proceed.
> >
> > To avoid this we will throw an error:
> > ERROR: could not replicate table "public.t1"
> > DETAIL: CREATE/ALTER SUBSCRIPTION with origin = none and copy_data =
> > on is not allowed when the publisher has subscribed same table.
> > HINT: Use CREATE/ALTER SUBSCRIPTION with copy_data = off/force.
>
> Thanks for the example. I agree that it is fairly simple to reproduce.
>
> I understand that "copy_data = force" is meant to protect a user from
> hurting themself. I'm not convinced that this is the best way to do so.
>
> For example today I can subscribe to multiple publications that write to
> the same table. If I have a primary key on that table, and two of the
> subscriptions try to write an identical ID, we conflict. We don't have
> any special flags or modes to guard against that from happening, though
> we do have documentation on conflicts and managing them.
>
> AFAICT the same issue with "copy_data" also exists in the above scenario
> too, even without the "origin" attribute.
>

That's true, but there is no parameter like origin = NONE that indicates
that constraint violations or duplicate-data problems won't occur due to
replication. In the current case, I think the situation is different
because the user has specifically asked not to replicate any remote data
by specifying origin = NONE, and that should be dealt with differently.
Note that existing users and their setups won't see any difference
unless they specify the new parameter origin as NONE.

-- 
With Regards,
Amit Kapila.
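
For anyone following along, the duplication in the quoted example can be reproduced with a toy model. This is only a sketch and not PostgreSQL code: the starting rows on each node (1-4 on Node-1, 5-8 on Node-2) are an assumption, since the earlier steps are elided in the quote, and initial_copy() is a hypothetical helper that stands in for the initial table copy done when copy_data = on, which copies rows without regard to where they originated.

```python
from collections import Counter

# Assumed starting data; the quoted mail elides the earlier steps.
node1 = [1, 2, 3, 4]
node2 = [5, 6, 7, 8]


def initial_copy(publisher_rows, subscriber_rows):
    """Toy model of the initial table copy (copy_data = on).

    Every row currently on the publisher is appended to the subscriber,
    regardless of which node the row originally came from.
    """
    subscriber_rows.extend(list(publisher_rows))


# Node-1 subscribes to Node-2 with copy_data = on.
initial_copy(node2, node1)

# Node-2 subscribes to Node-1 with origin = none, copy_data = on
# (the combination the proposed error is meant to reject).
initial_copy(node1, node2)

# Values that now appear more than once on Node-2.
duplicates = sorted(v for v, n in Counter(node2).items() if n > 1)

print("Node-1:", sorted(node1))
print("Node-2:", sorted(node2))
print("duplicated on Node-2:", duplicates)
```

Under these assumptions Node-2 ends up with 5, 6, 7, 8 twice, which is the case where a unique key on t1 would stop replication.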