Hi,

On 2022-02-01 13:31:36 +1100, Peter Smith wrote:
> TEST STEPS - Workload case a
> 
> 1. Run initdb pub and sub and start both postgres instances (use the nosync 
> postgresql.conf)
> 
> 2. Run psql for both instances and create tables
> CREATE TABLE test (key int, value text, data jsonb, PRIMARY KEY(key, value));
> 
> 3. create the PUBLISHER on pub instance (e.g. choose from below depending on 
> filter)
> CREATE PUBLICATION pub_1 FOR TABLE test;                        -- 100% (no filter)
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 0);        -- 100% allowed
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 250000);   -- 75% allowed
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 500000);   -- 50% allowed
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 750000);   -- 25% allowed
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 1000000);  -- 0% allowed
> 
> 4. create the SUBSCRIBER on sub instance
> CREATE SUBSCRIPTION sync_sub CONNECTION 'host=127.0.0.1 port=5432 
> dbname=postgres application_name=sync_sub' PUBLICATION pub_1;
> 
> 5. Modify the postgresql.conf on the publisher side and restart:
> \q to quit psql
> edit synchronous_standby_names = 'sync_sub'
> restart the pub instance
> 
> 6. Run psql (pub side) and perform the test run.
> \timing
> INSERT INTO test SELECT i, i::text, row_to_json(row(i)) FROM 
> generate_series(1,1000001)i;
> select count(*) from test;
> TRUNCATE test;
> select count(*) from test;
> repeat 6 for each test run.

I think using syncrep as the mechanism for benchmarking the decoding side
makes the picture less clear than it could be - you're measuring a lot of
things other than the decoding, e.g. the overhead of actually applying those
changes on the subscriber. I think it'd be more accurate to do something
like:

/* create publications, table, etc */

-- create a slot from before the changes
SELECT pg_create_logical_replication_slot('origin', 'pgoutput');

/* the changes you're going to measure */

-- save end LSN
SELECT pg_current_wal_lsn();

-- create a slot for pg_recvlogical to consume
SELECT * FROM pg_copy_logical_replication_slot('origin', 'consume');

-- benchmark, endpos is from pg_current_wal_lsn() above
time pg_recvlogical -S consume --endpos 0/2413A720 --start -o proto_version=3 
-o publication_names=pub_1 -f /dev/null  -d postgres

-- clean up
SELECT pg_drop_replication_slot('consume');

Then repeat this with the different publications and compare the time taken
for the pg_recvlogical. That way the WAL is exactly the same, there is no
overhead of actually doing anything with the data on the other side, etc.
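To repeat the runs, a small driver script could recreate the publication with
each filter, re-copy the slot, and time pg_recvlogical. The sketch below is
untested and makes a few assumptions beyond the steps above: a "postgres"
database, the 'origin' slot already created, and ENDPOS set to the LSN from
pg_current_wal_lsn(). It defaults to a dry run that only prints the commands;
set DRYRUN= (empty) to actually execute them:

```shell
#!/bin/sh
# Untested sketch: repeat the slot-copy benchmark once per row filter.
# Assumes the table, the 'origin' slot and the WAL to decode already
# exist, and that psql / pg_recvlogical are on PATH.  ENDPOS should be
# the LSN reported by pg_current_wal_lsn() above.
# Defaults to a dry run that only echoes the commands; set DRYRUN=
# (empty) to really run them.
: "${DRYRUN:=1}"

run() {
    if [ -n "$DRYRUN" ]; then echo "$@"; else "$@"; fi
}

for threshold in 0 250000 500000 750000 1000000; do
    # recreate the publication with the next filter
    run psql -d postgres -c "DROP PUBLICATION IF EXISTS pub_1"
    run psql -d postgres -c \
        "CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > $threshold)"

    # fresh copy of the slot, so every run decodes exactly the same WAL
    run psql -d postgres -c \
        "SELECT pg_copy_logical_replication_slot('origin', 'consume')"

    echo "== key > $threshold =="
    # external time(1); only the decoding/consuming side is measured
    run time pg_recvlogical -S consume --endpos "$ENDPOS" --start \
        -o proto_version=3 -o publication_names=pub_1 -f /dev/null \
        -d postgres

    run psql -d postgres -c "SELECT pg_drop_replication_slot('consume')"
done
```

Since the WAL and the slot contents are identical across iterations, any
difference in the timed step should come from the row-filter evaluation in
the output plugin.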

Greetings,

Andres Freund

