pg 9.2 git master
AMD 8120 (8-core) / 6 GB memory / CentOS 6.2
I have experimented a bit with dropping a table on the master, then querying that table on a sync-rep slave. It is a little worrying that this, the first test I tried, crashes the slave.

There are two instances on one machine, head1 (= master) and head2 (= sync-rep slave).

First, I generated a tab-separated file, a one-off, to be used in the test:

    echo "
      copy (
        select
          repeat('X',20) as c1
        , repeat('X',20) as c2
        , repeat('X',20) as c3
        , repeat('X',20) as c4
        , repeat('X',20) as c5
        from generate_series(1, 200000)
      ) to stdout csv delimiter E'\t';
    " | $HOME/pg_stuff/pg_installations/pgsql.head1/bin/psql -p 6564 -d testdb > dropload_copy.txt

That txt file is gzipped, and the actual test consists of a bash while loop which

  1. drops the table
  2. loads the file into the table
  3. either:
     a. does nothing
     b. does a select count(*) on the table

So, it repeats the following:

    zcat dropload_copy.txt.gz \
    | grep -v '^#' \
    | $HOME/pg_stuff/pg_installations/pgsql.head1/bin/psql -p 6564 -d testdb -c "
        drop table if exists t;
        create table t ( c1 text, c2 text, c3 text, c4 text, c5 text );
        copy t from stdin csv delimiter E'\t';
        analyze t;";

    PAUSE_DURATION=0
    PSQL=$HOME/pg_stuff/pg_installations/pgsql.head1/bin/psql
    if [[ 0 -eq 1 ]];   # ON-OFF switch
    then
      echo "sleep $PAUSE_DURATION"
      sleep $PAUSE_DURATION;
      (
        echo "select current_setting('port') port, count(*) from $schema.$table" | $PSQL -qtXp 6564 -d testdb   # master
        echo "select current_setting('port') port, count(*) from $schema.$table" | $PSQL -qtXp 6565 -d testdb   # wal_receiver_01
       #echo "select current_setting('port') port, count(*) from $schema.$table" | $PSQL -qtXp 6566 -d testdb   # wal_receiver_02
      ) | grep -v '^$'
    fi

This runs fine for hours on end, as long as the ON-OFF switch is disabled.
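
The enclosing loop itself is not shown above. A minimal sketch of how the three steps could be wired together follows, assuming the same ports, database, and data file as in the post. The function names (reload_table, count_rows, run_loop), the variables DATA, TABLE, and QUERY_SWITCH, and the iteration limit are illustrative and not taken from the original script; stopping as soon as psql returns a non-zero status is also an assumption.

```shell
#!/bin/bash
# Sketch of the drop/load/query loop. All function and variable names
# below are hypothetical; ports, database and file name follow the post.

PSQL=${PSQL:-$HOME/pg_stuff/pg_installations/pgsql.head1/bin/psql}
DATA=${DATA:-dropload_copy.txt.gz}
MASTER_PORT=6564
SLAVE_PORT=6565
TABLE=t
QUERY_SWITCH=1   # the ON-OFF switch: 1 = also run select count(*)

LOAD_SQL="
  drop table if exists $TABLE;
  create table $TABLE ( c1 text, c2 text, c3 text, c4 text, c5 text );
  copy $TABLE from stdin csv delimiter E'\t';
  analyze $TABLE;"

# Steps 1 and 2: drop and recreate the table on the master, then load the file.
reload_table() {
  zcat "$DATA" | grep -v '^#' | "$PSQL" -p "$MASTER_PORT" -d testdb -c "$LOAD_SQL"
}

# Step 3b: count the rows on the instance listening on the given port.
count_rows() {
  "$PSQL" -qtX -p "$1" -d testdb \
    -c "select current_setting('port') port, count(*) from $TABLE"
}

# Repeat the steps until one of them fails or the iteration limit is reached.
run_loop() {
  local max_iter=${1:-1000} i
  for (( i = 1; i <= max_iter; i++ )); do
    reload_table || return 1
    if [[ $QUERY_SWITCH -eq 1 ]]; then
      count_rows "$MASTER_PORT" || return 1
      count_rows "$SLAVE_PORT" || {
        echo "slave query failed in iteration $i" >&2
        return 1
      }
    fi
  done
}
```

Calling run_loop with QUERY_SWITCH=0 corresponds to variant 3a, and QUERY_SWITCH=1 to variant 3b; once the slave has gone down, psql can no longer connect to it and the loop stops.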
But when the if-block is switched on, the slave crashes after a while (sometimes almost immediately; it never survives longer than 20 minutes):

    2012-05-26 10:44:22.617 CEST 10274 ERROR:  could not fsync file "base/21268/32807": No such file or directory
    2012-05-26 10:44:28.465 CEST 10274 ERROR:  could not fsync file "base/21268/32867": No such file or directory
    2012-05-26 10:44:28.587 CEST 10270 FATAL:  could not open file "base/21268/32994": No such file or directory
    2012-05-26 10:44:28.588 CEST 10270 CONTEXT:  writing block 2508 of relation base/21268/32994
            xlog redo multi-insert (init): rel 1663/21268/33006; blk 3117; 58 tuples
    TRAP: FailedAssertion("!(PrivateRefCount[i] == 0)", File: "bufmgr.c", Line: 1741)
    2012-05-26 10:44:31.131 CEST 10269 LOG:  startup process (PID 10270) was terminated by signal 6: Aborted
    2012-05-26 10:44:31.131 CEST 10269 LOG:  terminating any other active server processes

Crazy scenario, I'll admit, but surely this shouldn't be able to crash the slave?

I attach the logfiles of the master (= head1) and the slave (= head2). They show how the above ran for an hour without problems (while the ON-OFF switch was disabled), and how the crash came quickly when I switched it on (to add the select count(*) statements).

Erik Rijkers
Attachments:
  logfile.head2
  logfile.head1
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)