Re: Add an option to skip loading missing publication to avoid logical replication failure

vignesh C Tue, 25 Feb 2025 02:03:15 -0800

On Tue, 18 Feb 2025 at 16:53, Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Tue, Feb 18, 2025 at 2:24 PM vignesh C <vignes...@gmail.com> wrote:
> >
> > On Fri, 14 Feb 2025 at 15:36, Amit Kapila <amit.kapil...@gmail.com> wrote:
> > >
> > > Did you try to measure the performance impact of this change? We can
> > > try a few cases where DDL and DMLs are involved, missing publication
> > > (drop publication and recreate after a varying number of records to
> > > check the impact).
> >
> > Since we don't have an exact scenario to compare with the patch
> > (because, in the current HEAD, when the publication is missing, an
> > error is thrown and the walsender/worker restarts), I compared the
> > positive case, where records are successfully replicated to the
> > subscriber, as shown below. For the scenario with the patch, I ran the
> > same test, where the publication is dropped before the insert,
> > allowing the walsender to check whether the publication is present.
> > The test results, which represent the median of 7 runs and the
> > execution run is in milliseconds, are provided below:
> >
> > Brach/records  |  100     |  1000   |  10000     |  100000  |  1000000
> > Head               |   1.214  |  2.548  |  10.823    |  90.3       |  
> > 951.833
> > Patch              |   1.215  |  2.5485 |  10.8545 |  90.94     |   955.134
> > % diff              |   0.082  |   0.020   |   0.291   |   0.704    |    
> > 0.347
> >
> > I noticed that the test run with patches is very negligible. The
> > scripts used for execution are attached.
> >
>
> You have used the synchronous_standby_name to evaluate the performance
> which covers other parts of replication than the logical decoding. It
> would be better to test using pg_recvlogical.


Here are the test runs with pg_recvlogical, the test results, which
represent the median of 10 runs and the execution run is in
milliseconds, are provided below:
Brach/records  |  100     |  1000   |  10000   |  100000  |  1000000
Head               |   9.95   |  15.26  |   62.62    |  536.57  |   8480.83
Patch              |   9.218  |  10.32 |   23.05    |  143.83  |   4852.43
% diff              |   7.356  |  32.38  |   63.19   |    73.193|       42.783

We observe that test execution with the patch performs better between
7.35 percent to 73.19 percent. This is because, in HEAD, after loading
and verifying that the publication is valid, it must continue
processing to output the change. In contrast, with the patch,
outputting the change is skipped since the publication does not exist.

The attached script has the script that was used for testing. Here the
NUM_RECORDS count should be changed accordingly for each of the tests
and while running the test with the patch change uncomment the drop
publication command.

Regards,
Vignesh

#!/bin/bash

#####

SLOT_NAME=test
PLUGIN_NAME=pgoutput
NUM_RECORDS=100
LOOP=10

#####

for i in `seq 1 $LOOP`
do
# Cleanup previous result

pg_ctl stop -D data
rm -rf data logfile

# Initialize an instance

initdb -D data -U postgres -c wal_level=logical

# Start the instance
pg_ctl -D data -l logfile start

# Create a table
psql -U postgres -c "CREATE TABLE foo (id int);"
psql -U postgres -c "CREATE publication pub1 for table foo;"
#psql -U postgres -c "drop publication pub1;"

# Create a replication slot
psql -U postgres -c "SELECT * FROM pg_create_logical_replication_slot('$SLOT_NAME', '$PLUGIN_NAME')"

    # Insert tuples (this transaction will be decoded)
    psql -U postgres -c "INSERT INTO foo VALUES (generate_series(1, $NUM_RECORDS))"

    # Confirm current WAL location
    WAL_POS=$(psql -qtAX -U postgres -c "SELECT * FROM pg_current_wal_lsn()")


t1=$(($(date +%s%N)/1000))
echo $t1 > run_${i}.dat
    # Run pg_recvlogical till the current WAL location
    (time pg_recvlogical -d postgres -U postgres --start -S $SLOT_NAME -E $WAL_POS -f - -o publication_names='pub1' -o proto_version=4 ) &>> run_${i}.dat
t2=$(($(date +%s%N)/1000))
echo $t2 >> run_${i}.dat
t3=$((t2-t1))
echo "execution time=$t3" >> run_${i}.dat
done

Re: Add an option to skip loading missing publication to avoid logical replication failure

Reply via email to