On 2017-02-08 23:25, Petr Jelinek wrote:

0001-Use-asynchronous-connect-API-in-libpqwalreceiver-v2.patch
0002-Always-initialize-stringinfo-buffers-in-walsender-v2.patch
0003-Fix-after-trigger-execution-in-logical-replication-v2.patch
0004-Add-RENAME-support-for-PUBLICATIONs-and-SUBSCRIPTION-v2.patch
0001-Logical-replication-support-for-initial-data-copy-v4.patch

Apart from the one failing make check test ('object_address'), which I reported earlier, I find it is easy to 'confuse' the replication.

I attach a script that intends to test the default COPY DATA. There are two instances, initially without any replication. The script inits pgbench on the master, adds a serial column to pgbench_history, and dump-restores the 4 pgbench-tables to the future replica. It then empties the 4 pgbench-tables on the 'replica'. The idea is that when logrep is initiated, data will be replicated from master, with the end result being that there are 4 identical tables on master and replica.

This often works, but it also fails far too often (in my hands). I test whether the tables are identical by comparing an md5 of an ordered result set from both replica and master. I estimate that 1 in 5 tries fails; 'fail' meaning a somewhat different table on the replica (compared to the master), most often pgbench_accounts (typically 10-30 differing rows). There are no errors or warnings in either logfile. I'm not sure, but I think faster machines do somewhat better ('better' meaning fewer replication errors).
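
For reference, the comparison boils down to something like the sketch below. The names digest and same_digest are made up for this illustration; the attached script does the same thing inline with psql, md5sum and cut. The psql lines are shown as comments only and assume PGDATABASE is set.

```shell
# Sketch only: digest and same_digest are illustrative names,
# not functions from the attached script.

# Print the first 9 hex characters of the md5 of whatever arrives on stdin.
digest() {
  md5sum | cut -b 1-9
}

# Succeed (status 0) when two result-set dumps have the same digest.
same_digest() {
  [ "$(digest < "$1")" = "$(digest < "$2")" ]
}

# Intended use against the two instances (ports as in the script):
#   psql -qtAXp 6972 -c "select * from pgbench_accounts order by aid" > master.out
#   psql -qtAXp 6973 -c "select * from pgbench_accounts order by aid" > replica.out
#   same_digest master.out replica.out && echo ok || echo NOK
```

The 'order by' on the primary key matters: without it, identical tables could be returned in a different row order and produce different digests.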

Another, probably unrelated, problem occurs (but much more rarely) when executing 'DROP SUBSCRIPTION sub1' on the replica (see the beginning of the script). Sometimes that command hangs, and the replica then refuses a normal shutdown. I don't know how to recover from this -- I just have to kill the replica server (the master server still obeys a normal shutdown) and restart the instances.
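
One way to at least keep a test run from blocking on such a hang is to put a time limit on the statement. The sketch below assumes GNU coreutils 'timeout' is available; run_limited is a made-up helper name. It only detects the hang on the client side: the backend on the replica may well remain stuck.

```shell
# Sketch only: run_limited is an illustrative helper,
# not part of the attached script.
# Runs a command under a time limit (in seconds) with GNU coreutils timeout,
# which exits with status 124 when the limit is reached.
run_limited() {
  limit=$1
  shift
  rc=0
  timeout "$limit" "$@" || rc=$?
  if [ "$rc" -eq 124 ]; then
    echo "timed out after ${limit}s: $*" >&2
  fi
  return "$rc"
}

# Intended use on the replica (port as in the script):
#   run_limited 60 psql -Xp 6973 -c "drop subscription if exists sub1"
```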

The script accepts 2 parameters, scale and clients (passed to pgbench as -s and -c, respectively).
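
Both parameters default to 1 when omitted or empty. A compact sketch of that argument handling, using shell default expansion instead of the two if-blocks in the script (parse_args is a made-up name for this illustration):

```shell
# Sketch only: parse_args is an illustrative name; the attached script
# sets the same two variables with if-blocks.
parse_args() {
  scale=${1:-1}     # first argument:  pgbench -s (scale factor), default 1
  clients=${2:-1}   # second argument: pgbench -c (client count), default 1
}

parse_args "$@"
echo "scale=$scale clients=$clients"
```

Called with the arguments '10 4', this sets scale=10 and clients=4; called with no arguments, both stay at 1.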

I don't think I've managed to successfully run the script with more than 1 client yet.

Could you check whether this is reproducible elsewhere?

thanks,

Erik Rijkers






#!/bin/bash

#  assumes both instances are running, on port 6972 and 6973

logfile1=$HOME/pg_stuff/pg_installations/pgsql.logical_replication/logfile.logical_replication
logfile2=$HOME/pg_stuff/pg_installations/pgsql.logical_replication2/logfile.logical_replication2

scale=1
if [[ ! "$1" == "" ]]
then
   scale=$1
fi

clients=1
if [[ ! "$2" == "" ]]
then
   clients=$2
fi

unset PGSERVICEFILE PGSERVICE PGPORT PGDATA PGHOST
export PGDATABASE=testdb

# (this script also uses a custom pgpassfile)

## just for info:
# env | grep PG
# psql -qtAXc "select current_setting('server_version')"

port1=6972
port2=6973

function cb()
{
  #  display the 4 pgbench tables' accumulated content as md5s
  #  a,b,t,h stand for:  pgbench_accounts, -branches, -tellers, -history
  #  start with two different values so that an unset total never compares equal
  md5_total[$port1]='-1'
  md5_total[$port2]='-2'
  for port in $port1 $port2
  do
    md5_a=$(echo "select * from pgbench_accounts order by aid"|psql -qtAXp$port|md5sum|cut -b 1-9)
    md5_b=$(echo "select * from pgbench_branches order by bid"|psql -qtAXp$port|md5sum|cut -b 1-9)
    md5_t=$(echo "select * from pgbench_tellers  order by tid"|psql -qtAXp$port|md5sum|cut -b 1-9)
    md5_h=$(echo "select * from pgbench_history  order by hid"|psql -qtAXp$port|md5sum|cut -b 1-9)
    cnt_a=$(echo "select count(*) from pgbench_accounts"|psql -qtAXp $port)
    cnt_b=$(echo "select count(*) from pgbench_branches"|psql -qtAXp $port)
    cnt_t=$(echo "select count(*) from pgbench_tellers" |psql -qtAXp $port)
    cnt_h=$(echo "select count(*) from pgbench_history" |psql -qtAXp $port)
    md5_total[$port]=$( echo "${md5_a} ${md5_b} ${md5_t} ${md5_h}" | md5sum )
    printf "$port a,b,t,h: %6d %6d %6d %6d" $cnt_a  $cnt_b  $cnt_t  $cnt_h
    echo -n "   $md5_a  $md5_b  $md5_t  $md5_h"
    if   [[ $port -eq $port1 ]]; then echo    "   master"
    elif [[ $port -eq $port2 ]]; then echo -n "   replica"
    else                              echo    "             ERROR  "
    fi
  done
  if [[ "${md5_total[$port1]}" == "${md5_total[$port2]}" ]]
  then
    echo " ok"
  else
    echo " NOK"
  fi
}

bail=0

pub_count=$( echo "select count(*) from pg_publication" | psql -qtAXp 6972 )
if  [[ $pub_count -ne 0 ]]
then
  echo "pub_count -ne 0 - deleting pub1 & bailing out"
  echo "drop publication if exists pub1" | psql -Xp 6972
  bail=1
fi
sub_count=$( echo "select count(*) from pg_subscription" | psql -qtAXp 6973 )
if  [[ $sub_count -ne 0 ]]
then
  echo "sub_count -ne 0 - deleting sub1 & bailing out"
  echo "drop subscription if exists sub1" | psql -Xp 6973
  rc=$?
  echo "(drop subscr.)  rc=$rc"
  bail=1
fi

pub_count=$( echo "select count(*) from pg_publication"  | psql -qtAXp 6972 )
sub_count=$( echo "select count(*) from pg_subscription" | psql -qtAXp 6973 )

#if [[ $bail -eq 1 ]]
#then
#    if  [[ $pub_count -eq 0 ]] && [[ $sub_count -eq 0 ]]
#    then
#	    exit 0
#    else
#        exit 1
#    fi
#fi

if  [[ $pub_count -ne 0 ]] || [[ $sub_count -ne 0 ]]
then
    exit 1
fi


# clear logs
echo > $logfile1
echo > $logfile2

   echo "drop table if exists pgbench_accounts;
         drop table if exists pgbench_branches;
         drop table if exists pgbench_tellers;
         drop table if exists pgbench_history;" | psql -X -p $port1 \
&& echo "drop table if exists pgbench_accounts;
         drop table if exists pgbench_branches;
         drop table if exists pgbench_tellers;
         drop table if exists pgbench_history;" | psql -X -p $port2

pgbench -p $port1 -qis $scale && echo "
        alter table pgbench_history add column hid serial primary key;
     -- alter table pgbench_history replica identity full;
     -- delete from pgbench_accounts where aid > 40;
" | psql -1p $port1 \
  && pg_dump -F c  -p $port1 \
           -t pgbench_accounts \
           -t pgbench_branches \
           -t pgbench_tellers  \
           -t pgbench_history  \
  | pg_restore -p $port2 -d testdb

echo "-- (no diffs expected... )" 
echo  "$(cb)"
#pgbench -p $port2 -qis 1 && echo "
#        alter table pgbench_history add column hid serial primary key;
#     -- alter table pgbench_history replica identity full;" | psql -1p $port2
#

currval=1
# currval=$( echo "
# select nextval('pgbench_history_hid_seq'::regclass);
# select currval('pgbench_history_hid_seq'::regclass);" | psql -qtAXp $port1 | tail -n 1 )
# echo "-- currval [$currval]"

# empty tables at replica:
echo "
delete from pgbench_accounts; 
delete from pgbench_branches; 
delete from pgbench_tellers; 
delete from pgbench_history;
-- select setval('pgbench_history_hid_seq', $currval);
" | psql -q -X -p $port2 

echo "-- (pre-replication,  diffs expected... )" 
echo  "$(cb)"

echo "create publication pub1 for all tables;" | psql -p $port1 -aqtAX
echo "create subscription sub1 connection 'port=${port1}' publication pub1 with (disabled);
alter subscription sub1 enable; 
" | psql -p $port2 -aqtAX
#------------------------------------

if [[ 1 -eq 1 ]]
then
  echo "-- pgbench -p $port1 -c $clients -T 20 -P 5 -n "
           pgbench -p $port1 -c $clients -T 20 -P 5 -n
fi
sleep 5
echo "-- "
echo  "$(cb)"

echo "select * from pgbench_accounts order by aid" | psql -Xp 6972 > acc.6972.md5
echo "select * from pgbench_accounts order by aid" | psql -Xp 6973 > acc.6973.md5
echo "select * from pgbench_history order by hid"  | psql -Xp 6972 > his.6972.md5
echo "select * from pgbench_history order by hid"  | psql -Xp 6973 > his.6973.md5
wc_acc=$( diff acc.6972.md5 acc.6973.md5 | wc -l )
wc_his=$( diff his.6972.md5 his.6973.md5 | wc -l )

echo "-- wc -l accounts  $wc_acc"
echo "-- wc -l history   $wc_his"

echo -n "-- logfile1 "; wc -l $logfile1 | cut '-d ' -f 1 
echo -n "-- logfile2 "; wc -l $logfile2 | cut '-d ' -f 1

if [[ $wc_acc -eq 0 ]]
then
  echo "-- wc lines [$wc_acc] - ok"
  exit 0
fi

echo "-- waiting 30s..."
sleep 30
echo "-- "
echo  "$(cb)"

echo "select * from pgbench_accounts order by aid" | psql -Xp 6972 > accounts.6972.md5
echo "select * from pgbench_accounts order by aid" | psql -Xp 6973 > accounts.6973.md5
diff accounts.6972.md5  accounts.6973.md5 | wc -l

echo -n "-- logfile1 "; wc -l $logfile1 | cut '-d ' -f 1 
echo -n "-- logfile2 "; wc -l $logfile2 | cut '-d ' -f 1


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
