Fix ordering of operations in SyncRepWakeQueue to avoid assertion failure.

Commit 14e8803f1 removed the locking in SyncRepWaitForLSN, but that introduced a race condition: SyncRepWaitForLSN might see syncRepState already set to SYNC_REP_WAIT_COMPLETE before the process had been removed from the queue. That tripped the assertion that the process should no longer be in the queue.

Reorder the operations in SyncRepWakeQueue to remove the process from the queue first and update syncRepState only after that, and add a memory barrier in between to make sure the operations are made visible to other processes in that order.
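The ordering described above can be illustrated with a minimal, standalone sketch. This is not the code in syncrep.c: the Waiter and WaitQueue types, the WAIT_* constants, and all function names below are hypothetical, and C11 fences stand in for whatever barrier primitives the real code uses. The sketch only shows the pattern: unlink from the queue, issue a barrier, then publish the state, so that a lock-free reader that sees the "complete" state can safely assert that the entry is no longer queued.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the PostgreSQL definitions. */
enum { WAIT_NOT_WAITING, WAIT_WAITING, WAIT_COMPLETE };

typedef struct Waiter {
    struct Waiter *prev;   /* queue links; both NULL once detached */
    struct Waiter *next;
    atomic_int     state;  /* one of the WAIT_* values */
} Waiter;

typedef struct WaitQueue {
    Waiter head;           /* circular list with a sentinel node */
} WaitQueue;

static void queue_init(WaitQueue *q)
{
    q->head.prev = q->head.next = &q->head;
}

/* Append a waiter at the tail and mark it as waiting. */
static void queue_push(WaitQueue *q, Waiter *w)
{
    w->prev = q->head.prev;
    w->next = &q->head;
    q->head.prev->next = w;
    q->head.prev = w;
    atomic_store(&w->state, WAIT_WAITING);
}

static bool waiter_is_detached(const Waiter *w)
{
    return w->prev == NULL && w->next == NULL;
}

/* Waker side: remove from the queue FIRST, then publish the state. */
static void wake_one(Waiter *w)
{
    /* 1. Unlink the waiter from the queue. */
    w->prev->next = w->next;
    w->next->prev = w->prev;
    w->prev = w->next = NULL;

    /* 2. Barrier: the unlink must become visible before the state change. */
    atomic_thread_fence(memory_order_release);

    /* 3. Only now tell the waiter that its wait is complete. */
    atomic_store_explicit(&w->state, WAIT_COMPLETE, memory_order_relaxed);
}

/* Waiter side: lock-free check of its own state. */
static bool waiter_check_done(Waiter *w)
{
    if (atomic_load_explicit(&w->state, memory_order_relaxed) != WAIT_COMPLETE)
        return false;

    /* Pairs with the release fence in wake_one(). */
    atomic_thread_fence(memory_order_acquire);

    /* Holds only because the waker unlinks before publishing the state. */
    assert(waiter_is_detached(w));
    return true;
}
```

If steps 1 and 3 in wake_one() were swapped, a concurrent waiter_check_done() could observe WAIT_COMPLETE while the links were still set, which is the kind of assertion failure described above.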
Fixes bug #14721 reported by Const Zhang. Analysis and fix by Thomas Munro. Backpatch down to 9.5, where the locking was removed.

Discussion: https://www.postgresql.org/message-id/20170629023623.1480.26508%40wrigleys.postgresql.org

Branch
------
REL9_6_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/941188a5ff49c4e042ddd4234c35cf0e1bfd7f19

Modified Files
--------------
src/backend/replication/syncrep.c | 22 ++++++++++++++++------
1 file changed, 16 insertions(+), 6 deletions(-)