Hi all,

This week I ran into an out-of-disk-space problem on an 8TB production cluster.
During the investigation we noticed that pg_replslot was the culprit, growing
by more than 1TB in less than one hour.

We're running PostgreSQL 9.5.6 with pglogical 1.2.2, replicating to a new 9.6
instance, and we plan to upgrade soon.

What did I do? I freed some disk space, just enough to start PostgreSQL and
begin the investigation. During startup recovery the files inside
pg_replslot were simply removed completely, so our out-of-disk-space problem
disappeared. Then the server came up and the physical slaves attached
normally to the master, but the logical slaves did not and stayed stalled in
'catchup'.

At this moment the "pg_replslot" directory started growing fast again and
forced us to drop the logical replication slot and we lost the logical

After some searching I found this thread [1] about a similar issue reported by
Dmitriy Sarafannikov, with replies from Andres and Álvaro.

I ran the test case provided by Dmitriy [1] against branches:
- master

After all the tests the issue remains, including when using the new Logical
Replication feature (CREATE PUBLICATION / CREATE SUBSCRIPTION). Only after a
restart was "pg_replslot" properly cleaned up. The typo in
ReorderBufferIterTXNInit that Dmitriy complained about has been fixed, but the
issue remains.

It seems no one has complained about this issue since, and the thread was lost.

Attached is a reworked version of Dmitriy's patch that seems to solve the
issue. I confess I don't know the replication slot code well enough to really
know whether it's the best solution.
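
To make the change in the condition easier to discuss, here is a minimal
standalone sketch of the old and new checks in ReorderBufferCleanupTXN. It is
only a model: ModelTXN, needs_cleanup_before and needs_cleanup_after are
hypothetical names I made up for illustration, the struct holds just the three
fields the condition reads, and the field comments reflect my reading of
them rather than anything authoritative. It does not show why a subtransaction
can still have files on disk; it only shows which case the two conditions
treat differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical, simplified stand-in for ReorderBufferTXN. Only the fields
 * read by the cleanup condition are modeled; this is NOT the real struct.
 */
typedef struct ModelTXN
{
	uint64_t	nentries;				/* changes recorded for this txn */
	uint64_t	nentries_mem;			/* of those, how many are in memory */
	bool		is_known_as_subxact;	/* txn is known to be a subxact */
} ModelTXN;

/* Condition before the patch: clean up only when the counters differ. */
bool
needs_cleanup_before(const ModelTXN *txn)
{
	/* model assumption: in-memory count never exceeds the total */
	assert(txn->nentries_mem <= txn->nentries);
	return txn->nentries != txn->nentries_mem;
}

/* Condition after the patch: also clean up for any known subtransaction. */
bool
needs_cleanup_after(const ModelTXN *txn)
{
	/* same model assumption as above */
	assert(txn->nentries_mem <= txn->nentries);
	return txn->nentries != txn->nentries_mem || txn->is_known_as_subxact;
}
```

In this model, a transaction with nentries == nentries_mem and
is_known_as_subxact set is skipped by the first check and caught by the
second. For top-level transactions the two checks agree, so the patched
condition is strictly broader and may call ReorderBufferRestoreCleanup for
subtransactions that never spilled anything.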



Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 524946a..a538715 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -1142,7 +1142,7 @@ ReorderBufferCleanupTXN(ReorderBuffer *rb, ReorderBufferTXN *txn)
 	/* remove entries spilled to disk */
-	if (txn->nentries != txn->nentries_mem)
+	if (txn->nentries != txn->nentries_mem || txn->is_known_as_subxact)
 		ReorderBufferRestoreCleanup(rb, txn);
 	/* deallocate */
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)