On Mon, Nov 29, 2021 6:11 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > On Mon, Nov 29, 2021 at 12:10 PM Greg Nancarrow <gregn4...@gmail.com> > wrote: > > > > On Fri, Nov 26, 2021 at 12:40 AM houzj.f...@fujitsu.com > > <houzj.f...@fujitsu.com> wrote: > > > > > > When researching and writing a top-up patch about this. > > > I found a possible issue which I'd like to confirm first. > > > > > > It's possible the table is published in two publications A and B, > > > publication A only publish "insert" , publication B publish > > > "update". When UPDATE, both row filter in A and B will be executed. Is > > > this > behavior expected? > > > > > > For example: > > > ---- Publication > > > create table tbl1 (a int primary key, b int); create publication A > > > for table tbl1 where (b<2) with(publish='insert'); create > > > publication B for table tbl1 where (a>1) with(publish='update'); > > > > > > ---- Subscription > > > create table tbl1 (a int primary key); CREATE SUBSCRIPTION sub > > > CONNECTION 'dbname=postgres host=localhost port=10000' > PUBLICATION > > > A,B; > > > > > > ---- Publication > > > update tbl1 set a = 2; > > > > > > The publication can be created, and when UPDATE, the rowfilter in A > > > (b<2) will also been executed but the column in it is not part of replica > identity. > > > (I am not against this behavior just confirm) > > > > > > > There seems to be problems related to allowing the row filter to > > include columns that are not part of the replica identity (in the case > > of publish=insert). > > In your example scenario, the tbl1 WHERE clause "(b < 2)" for > > publication A, that publishes inserts only, causes a problem, because > > column "b" is not part of the replica identity. > > To see this, follow the simple example below: > > (and note, for the Subscription, the provided tbl1 definition has an > > error, it should also include the 2nd column "b int", same as in the > > publisher) > > > > ---- Publisher: > > INSERT INTO tbl1 VALUES (1,1); > > UPDATE tbl1 SET a = 2; > > > > Prior to the UPDATE above: > > On pub side, tbl1 contains (1,1). > > On sub side, tbl1 contains (1,1) > > > > After the above UPDATE: > > On pub side, tbl1 contains (2,1). > > On sub side, tbl1 contains (1,1), (2,1) > > > > So the UPDATE on the pub side has resulted in an INSERT of (2,1) on > > the sub side. > > > > This is because when (1,1) is UPDATEd to (2,1), it attempts to use the > > "insert" filter "(b<2)" to determine whether the old value had been > > inserted (published to subscriber), but finds there is no "b" value > > (because it only uses RI cols for UPDATE) and so has to assume the old > > tuple doesn't exist on the subscriber, hence the UPDATE ends up doing > > an INSERT. > > INow if the use of RI cols were enforced for the insert filter case, > > we'd properly know the answer as to whether the old row value had been > > published and it would have correctly performed an UPDATE instead of > > an INSERT in this case. > > > > I don't think it is a good idea to combine the row-filter from the publication > that publishes just 'insert' with the row-filter that publishes 'updates'. We > shouldn't apply the 'insert' filter for 'update' and similarly for publication > operations. We can combine the filters when the published operations are the > same. So, this means that we might need to cache multiple row-filters but I > think that is better than having another restriction that publish operation > 'insert' > should also honor RI columns restriction.
Personally, I agreed that an UPDATE operation should only apply a row filter that is part of a publication that has only UPDATE. Best regards, Hou zj