On Thursday, August 18, 2022 11:13 AM Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Wed, Aug 17, 2022 at 12:34 PM Peter Smith <smithpb2...@gmail.com> > wrote: > > > > On Wed, Aug 17, 2022 at 4:33 PM Amit Kapila <amit.kapil...@gmail.com> > wrote: > > > > > > On Wed, Aug 17, 2022 at 8:48 AM houzj.f...@fujitsu.com > > > <houzj.f...@fujitsu.com> wrote: > > > > > > > > On Tuesday, August 2, 2022 8:00 PM Amit Kapila > <amit.kapil...@gmail.com> wrote: > > > > > On Tue, Jul 26, 2022 at 9:07 AM Amit Kapila > <amit.kapil...@gmail.com> wrote: > > > > > > > > Thanks for the summary. > > > > > > > > I think it's fine to make the user use the copy_data option more > > > > carefully to prevent duplicate copies by reporting an ERROR. > > > > > > > > But I also have similar concern with Sawada-san as it's possible > > > > for user to receive an ERROR in some unexpected cases. > > > > > > > > For example I want to build bi-directional setup between two nodes: > > > > > > > > Node A: TABLE test (has actual data) Node B: TABLE test (empty) > > > > > > > > Step 1: > > > > CREATE PUBLICATION on both Node A and B. > > > > > > > > Step 2: > > > > CREATE SUBSCRIPTION on Node A with (copy_data = on) > > > > -- this is fine as there is no data on Node B > > > > > > > > Step 3: > > > > CREATE SUBSCRIPTION on Node B with (copy_data = on) > > > > -- this should be fine as user needs to copy data from Node A to > > > > Node B, > > > > -- but we still report an error for this case. > > > > > > > > It looks a bit strict to report an ERROR in this case and it seems > > > > not easy to avoid this. So, personally, I think it might be better > > > > to document the correct steps to build the bi-directional > > > > replication and probably also docuemnt the steps to recover if > > > > user accidently did duplicate initial copy if not documented yet. > > > > > > > > In addition, we could also LOG some additional information about > > > > the ORIGIN and initial copy which might help user to analyze if needed. > > > > > > > > > > But why LOG instead of WARNING? I feel in this case there is a > > > chance of inconsistent data so a WARNING like "publication "pub1" > > > could have data from multiple origins" can be given when the user > > > has specified > > > options: "copy_data = on, origin = NONE" while creating a > > > subscription. We give a WARNING during subscription creation when > > > the corresponding publication doesn't exist, eg. > > > > > > postgres=# create subscription sub1 connection 'dbname = postgres' > > > publication pub1; > > > WARNING: publication "pub1" does not exist in the publisher > > > > > > Then, we can explain in docs how users can avoid data > > > inconsistencies while setting up replication. > > > > > > > I was wondering if this copy/origin case really should be a NOTICE. > > > > We usually give NOTICE for some sort of additional implicit information, e.g., > when we create a slot during CREATE SUBSCRIPTION > command: "NOTICE: created replication slot "sub1" on publisher". IMO, this is > likely to be a problem of data inconsistency so I think here we can choose > between WARNING and LOG. I prefer WARNING but okay with LOG as well if > others feel so. I think we can change this later as well if required. We do > have an > option to not do anything and just document it but I feel it is better to > give user > some indication of problem here because not everyone reads each update of > documentation. > > Jonathan, Sawada-San, Hou-San, and others, what do you think is the best way > to move forward here?
I think it's fine to throw a WARNING in this case given that there is a chance of inconsistent data. Best regards, Hou zj