Hi Peter, I just tried scenario b that Andres suggested:
For scenario (b), I did some testing with row-filter-patch v74 at various
levels of filtering, from 0% to 100% of rows replicated. The times are in
seconds; I did 5 runs each.

Results:

RUN  HEAD        v74 0%      v74 25%     v74 50%     v74 75%     v74 100%
1    17.26178    12.573736   12.869635   13.742167   17.977112   17.75814
2    17.522473   12.919554   12.640879   14.202737   14.515481   16.961836
3    17.124001   12.640879   12.706631   14.220245   15.686613   17.219355
4    17.24122    12.602345   12.674566   14.219423   15.564312   17.432765
5    17.25352    12.610657   12.689842   14.210725   15.613708   17.403821

As can be seen, the performance on HEAD is similar to what the patch
achieves when all rows (100%) are replicated. The performance improves
linearly as more rows are filtered out.

The test scenario used was:

1. On publisher and subscriber:

CREATE TABLE test (key int, value text, data jsonb, PRIMARY KEY(key, value));

2. On the publisher (depending on which scenario is being tested):

CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 0);       -- 100% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 250000);  -- 75% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 500000);  -- 50% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 750000);  -- 25% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 1000000); -- 0% allowed

3. On the subscriber:

CREATE SUBSCRIPTION sync_sub CONNECTION 'host=127.0.0.1 port=5432
dbname=postgres application_name=sync_sub' PUBLICATION pub_1;

4. Modify postgresql.conf on the publisher side:

synchronous_standby_names = 'sync_sub'

and restart.

5.
The test case:

DO
$do$
BEGIN
    FOR i IN 1..1000001 BY 10 LOOP
        INSERT INTO test VALUES (i, 'BAH', row_to_json(row(i)));
        UPDATE test SET value = 'FOO' WHERE key = i;
        IF i % 1000 = 0 THEN
            COMMIT;
        END IF;
    END LOOP;
END
$do$;

regards,
Ajin Cherian
Fujitsu Australia

On Tue, Feb 1, 2022 at 12:07 PM Peter Smith <smithpb2...@gmail.com> wrote:
>
> On Sat, Jan 29, 2022 at 11:31 AM Andres Freund <and...@anarazel.de> wrote:
> >
> > Hi,
> >
> > Are there any recent performance evaluations of the overhead of row
> > filters? I think it'd be good to get some numbers comparing:
> >
> > 1) $workload with master
> > 2) $workload with patch, but no row filters
> > 3) $workload with patch, row filter matching everything
> > 4) $workload with patch, row filter matching few rows
> >
> > For workload I think it'd be worth testing:
> > a) bulk COPY/INSERT into one table
> > b) Many transactions doing small modifications to one table
> > c) Many transactions targetting many different tables
> > d) Interspersed DDL + small changes to a table
>
> I have gathered performance data for the workload case (a):
>
> HEAD            46743.75
> v74 no filters  46929.15
> v74 allow 100%  46926.09
> v74 allow 75%   40617.74
> v74 allow 50%   35744.17
> v74 allow 25%   29468.93
> v74 allow 0%    22540.58
>
> PSA.
>
> This was tested using patch v74 and synchronous pub/sub. There are 1M
> INSERTS for publications using differing amounts of row filtering (or
> none).
>
> Observations:
> - There seems insignificant row-filter overheads (e.g. viz no filter
> and 100% allowed versus HEAD).
> - The elapsed time decreases linearly as there is less data getting
> replicated.
>
> I will post the results for other workload kinds (b, c, d) when I have them.
>
> ------
> Kind Regards,
> Peter Smith.
> Fujitsu Australia.
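PS: In case it is useful to anyone comparing the runs, here is a small
throwaway helper (plain Python, not part of the test harness) that averages
the five per-run timings; the values are copied verbatim from the results
table above:

```python
# Throwaway helper: mean of the 5 timing runs (seconds) per filter level.
# Values are copied verbatim from the results table in this mail.
runs = {
    "HEAD": [17.26178, 17.522473, 17.124001, 17.24122, 17.25352],
    "0%":   [12.573736, 12.919554, 12.640879, 12.602345, 12.610657],
    "25%":  [12.869635, 12.640879, 12.706631, 12.674566, 12.689842],
    "50%":  [13.742167, 14.202737, 14.220245, 14.219423, 14.210725],
    "75%":  [17.977112, 14.515481, 15.686613, 15.564312, 15.613708],
    "100%": [17.75814, 16.961836, 17.219355, 17.432765, 17.403821],
}

for label, times in runs.items():
    # Print the arithmetic mean for each column of the table.
    print(f"{label:>5}: mean = {sum(times) / len(times):.3f} s")
```

The means line up with the observations above: HEAD and v74-100% are within
about 0.1 s of each other, and the mean drops steadily toward the 0% case.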