My customer reported a strange connection hang problem. He and I couldn't
reproduce it. I haven't been able to understand the cause, but I can think
of one hypothesis. Could you give me your opinions on whether my hypothesis
is correct, and a direction on how to fix the problem? I'm willing to
submit a patch if necessary.
The customer is using synchronous streaming replication with PostgreSQL
9.2.8. The cluster consists of two nodes.
He performed archive recovery test like this:
1. Take a base backup. At that time, some notable settings in
synchronous_standby_names = 'node2'
autovacuum = on
# synchronous_commit is commented out, so it's on by default
2. Some update operations. I don't know what.
3. Shutdown the primary and promote the standby.
4. Shutdown the new primary.
5. Perform archive recovery. That is, restore the base backup, create
recovery.conf, and do pg_ctl start.
6. Immediately after the archive recovery is complete, connect to the
database server and perform some queries to check user data.
The steps 5 and 6 are done in some recovery script.
However, the connection attempt in step 6 got stuck for 12 hours, and the
test was canceled. The stack trace was:
#0 0x0000003f4badf258 in poll () from /lib64/libc.so.6
#1 0x0000000000619b94 in WaitLatchOrSocket ()
#2 0x0000000000640c4c in SyncRepWaitForLSN ()
#3 0x0000000000491c18 in RecordTransactionCommit ()
#4 0x0000000000491d98 in CommitTransaction ()
#5 0x0000000000493135 in CommitTransactionCommand ()
#6 0x000000000074938a in InitPostgres ()
#7 0x000000000066ddd7 in PostgresMain ()
#8 0x0000000000627d81 in PostmasterMain ()
#9 0x00000000005c4803 in main ()
The connection attempt is waiting for a reply from the standby. This is
strange, because we didn't anticipate that the connection establishment (and
subsequent SELECT queries) would update something and write some WAL. The
"When requesting synchronous replication, each commit of a write transaction
will wait until confirmation is received that the commit has been written to
the transaction log on disk of both the primary and standby server.
Read only transactions and transaction rollbacks need not wait for replies
from standby servers. Subtransaction commits do not wait for responses from
standby servers, only top-level commits."
Why does the connection processing emit WAL?
Probably, it did page-at-a-time vacuum during access to pg_database and
pg_authid for client authentication. src/backend/access/heap/README.HOT
"Effectively, space reclamation happens during tuple retrieval when the
page is nearly full (<10% free) and a buffer cleanup lock can be
acquired. This means that UPDATE, DELETE, and SELECT can trigger space
reclamation, but often not during INSERT ... VALUES because it does
not retrieve a row."
But the customer could not reproduce the problem when he performed the same
archive recovery from the same base backup again. Why? I guess the
autovacuum daemon vacuumed the system catalogs before he attempted to
connect to the database.
Is this correct?
[How to fix]
Of course, adding "-o '-c synchronous_commit=local'" or "-o '-c
synchronous_standby_names='" to pg_ctl start in the recovery script would
prevent the problem.
But isn't there anything to fix in PostgreSQL? I think the doc needs
improvement so that users won't misunderstand that only write transactions
would block at commit.
Do you think something else should be done? I guess pg_basebackup,
pg_isready, and PQping() called in pg_ctl -w start/restart would block
likewise, and I'm afraid users don't anticipate it. pg_upgrade appears to
set synchronous_commit to local when starting the database server.
Sent via pgsql-hackers mailing list (email@example.com)
To make changes to your subscription: