My customer reported a strange connection hang problem. He and I couldn't reproduce it. I haven't been able to understand the cause, but I can think of one hypothesis. Could you give me your opinions on whether my hypothesis is correct, and a direction on how to fix the problem? I'm willing to submit a patch if necessary.

The customer is using synchronous streaming replication with PostgreSQL 9.2.8. The cluster consists of two nodes.

He performed archive recovery test like this:

1. Take a base backup. At that time, some notable settings in postgresql.conf are:
synchronous_standby_names = 'node2'
autovacuum = on
# synchronous_commit is commented out, so it's on by default

2. Some update operations.  I don't know what.

3. Shutdown the primary and promote the standby.

4. Shutdown the new primary.

5. Perform archive recovery. That is, restore the base backup, create recovery.conf, and do pg_ctl start.

6. Immediately after the archive recovery is complete, connect to the database server and perform some queries to check user data.

The steps 5 and 6 are done in some recovery script.

However, the connection attempt in step 6 got stuck for 12 hours, and the test was canceled. The stack trace was:

#0  0x0000003f4badf258 in poll () from /lib64/libc.so.6
#1  0x0000000000619b94 in WaitLatchOrSocket ()
#2  0x0000000000640c4c in SyncRepWaitForLSN ()
#3  0x0000000000491c18 in RecordTransactionCommit ()
#4  0x0000000000491d98 in CommitTransaction ()
#5  0x0000000000493135 in CommitTransactionCommand ()
#6  0x000000000074938a in InitPostgres ()
#7  0x000000000066ddd7 in PostgresMain ()
#8  0x0000000000627d81 in PostmasterMain ()
#9  0x00000000005c4803 in main ()

The connection attempt is waiting for a reply from the standby. This is strange, because we didn't anticipate that the connection establishment (and subsequent SELECT queries) would update something and write some WAL. The doc says:


"When requesting synchronous replication, each commit of a write transaction will wait until confirmation is received that the commit has been written to the transaction log on disk of both the primary and standby server.
Read only transactions and transaction rollbacks need not wait for replies from standby servers. Subtransaction commits do not wait for responses from standby servers, only top-level commits."

Why does the connection processing emit WAL?

Probably, it did page-at-a-time vacuum during access to pg_database and pg_authid for client authentication. src/backend/access/heap/README.HOT describes:

"Effectively, space reclamation happens during tuple retrieval when the
page is nearly full (<10% free) and a buffer cleanup lock can be
acquired.  This means that UPDATE, DELETE, and SELECT can trigger space
reclamation, but often not during INSERT ... VALUES because it does
not retrieve a row."

But the customer could not reproduce the problem when he performed the same archive recovery from the same base backup again. Why? I guess the autovacuum daemon vacuumed the system catalogs before he attempted to connect to the database.

Is this correct?

[How to fix]
Of course, adding "-o '-c synchronous_commit=local'" or "-o '-c synchronous_standby_names='" to pg_ctl start in the recovery script would prevent the problem.

But isn't there anything to fix in PostgreSQL? I think the doc needs improvement so that users won't misunderstand that only write transactions would block at commit.

Do you think something else should be done? I guess pg_basebackup, pg_isready, and PQping() called in pg_ctl -w start/restart would block likewise, and I'm afraid users don't anticipate it. pg_upgrade appears to set synchronous_commit to local when starting the database server.


Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to