On 2013-10-21 16:15:58 +0200, Andres Freund wrote:
> > I don't think I understand exactly what you have in mind for (2); can
> > you elaborate? I have always thought that having a
> > WaitForDecodingToCatchUp() primitive was a good way of handling
> > changes that were otherwise too difficult to track our way through. I
> > am not sure you're doing that at all right now, which in some sense I
> > guess is fine, but I haven't really understood your aversion to this
> > solution. There are some locking issues to be worked out here, but
> > the problems don't seem altogether intractable.
> So, what we need to do for rewriting catalog tables would be:
> 1) lock table against writes
> 2) wait for all in-progress xacts to finish, they could have modified
> the table in question (we don't keep locks on system tables)
> 3) acquire xlog insert pointer
> 4) wait for all logical decoding actions to read past that pointer
> 5) upgrade the lock to an access exclusive one
> 6) perform vacuum full as usual
> The lock upgrade hazards in here are the reason I am adverse to the
> solution. And I don't see how we can avoid them, since in order for
> decoding to catchup it has to be able to read from the
> catalog... Otherwise it's easy enough to implement.
So, I thought about this for some more and I think I've a partial
solution to the problem.
The worst thing about deadlocks that occur in the above is that they
could be the VACUUM FULL waiting for the "restart LSN" of a decoding
slot to progress, but the restart LSN cannot progress because the slot
is waiting for a xid/transaction to end which is being blocked by the
lock upgrade from VACUUM FULL. Such conflicts are not visible to the
deadlock detector, which obviously is bad.
I've prototyped this (~25 lines) and this happens pretty frequently. But
it turns out that we can actually fix this by exporting (to shared
memory) the oldest in-progress xid of a decoding slot. Then the waiting
code can do a XactLockTableWait() for that xid...
I wonder if this is isn't maybe sufficient. Yes, it can deadlock, but
that's already the case for VACUUM FULLs of system tables, although less
likely. And it will be detected/handled.
There's one more snag though, we currently allow CLUSTER system_table;
in an existing transaction. I think that'd have to be disallowed.
What do you think?
 The "restart LSN" is the point from where we need to be able read
WAL to replay all changes the receiving side hasn't acked yet.
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Sent via pgsql-hackers mailing list (firstname.lastname@example.org)
To make changes to your subscription: