On Wednesday, March 9, 2022 6:04 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Mon, Mar 7, 2022 at 8:48 PM Tomas Vondra
> <tomas.von...@enterprisedb.com> wrote:
> >
> > On 3/4/22 11:42, Amit Kapila wrote:
> > >
> > > *
> > > Fetching column filter info in tablesync.c is quite expensive. It
> > > seems to be using four round-trips to get the complete info whereas
> > > for row-filter we use just one round trip. I think we should try to
> > > get both row filter and column filter info in just one round trip.
> > >
> >
> > Maybe, but I really don't think this is an issue.
> >
>
> I am not sure but it might matter for small tables. Leaving aside the
> performance issue, I think the current way will get the wrong column list in
> many cases: (a) The ALL TABLES IN SCHEMA case handling won't work for
> partitioned tables when the partitioned table is part of one schema and
> partition table is part of another schema. (b) The handling of partition
> tables in other cases will fetch incorrect lists as it tries to fetch the
> column list of all the partitions in the hierarchy.
>
> One of my colleagues has even tested these cases both for column filters and
> row filters and we find the behavior of row filter is okay whereas for column
> filter it uses the wrong column list. We will share the tests and results
> with you in a later email. We are trying to unify the column filter queries
> with row filter to make their behavior the same and will share the findings
> once it is done. I hope if we are able to achieve this that we will reduce
> the chances of bugs in this area.
>
> Note: I think the first two patches for tests are not required after commit
> ceb57afd3c.

Hi,

Here are some tests and results for the table sync query of the column filter
patch, compared with row filter.

1) Multiple publications: one publishes the schema of the parent table,
another publishes the partition.

----pub
create schema s1;
create table s1.t (a int, b int, c int) partition by range (a);
create table t_1 partition of s1.t for values from (1) to (10);
create publication pub1 for all tables in schema s1;
create publication pub2 for table t_1(b);

----sub
- prepare tables
CREATE SUBSCRIPTION sub CONNECTION 'port=10000 dbname=postgres' PUBLICATION pub1, pub2;

When doing table sync for 't_1', the column list will be (b). I think it
should be no filter, because table t_1 is also published via the ALL TABLES IN
SCHEMA publication. For Row Filter, it will use no filter in this case.

2) One publication publishes both parent and child (only the parent has a
column list).

----pub
create table t (a int, b int, c int) partition by range (a);
create table t_1 partition of t for values from (1) to (10) partition by range (a);
create table t_2 partition of t_1 for values from (1) to (10);
create publication pub2 for table t_1(a), t_2 with (PUBLISH_VIA_PARTITION_ROOT);

----sub
- prepare tables
CREATE SUBSCRIPTION sub CONNECTION 'port=10000 dbname=postgres' PUBLICATION pub2;

When doing table sync for table 't_1', it has no column list. I think the
expected column list is (a). For Row Filter, it will use the row filter of the
topmost published parent table (t_1) in this case.

3) One publication publishes both parent and child (both have column lists).

----pub
create table t (a int, b int, c int) partition by range (a);
create table t_1 partition of t for values from (1) to (10) partition by range (a);
create table t_2 partition of t_1 for values from (1) to (10);
create publication pub2 for table t_1(a), t_2(b) with (PUBLISH_VIA_PARTITION_ROOT);

----sub
- prepare tables
CREATE SUBSCRIPTION sub CONNECTION 'port=10000 dbname=postgres' PUBLICATION pub2;

When doing table sync for table 't_1', the column list would be (a, b). I
think the expected column list is (a). For Row Filter, it will use the row
filter of the topmost published parent table (t_1) in this case.

Best regards,
Hou zj