On 2013-10-28 12:04:01 -0400, Robert Haas wrote:
> On Fri, Oct 25, 2013 at 8:14 AM, Andres Freund <and...@2ndquadrant.com> wrote:
> > I wonder if this isn't maybe sufficient. Yes, it can deadlock, but
> > that's already the case for VACUUM FULLs of system tables, although less
> > likely. And it will be detected/handled.
> > There's one more snag though, we currently allow CLUSTER system_table;
> > in an existing transaction. I think that'd have to be disallowed.
>
> It wouldn't bother me too much to restrict CLUSTER system_table by
> PreventTransactionChain() at wal_level = logical, but obviously it
> would be nicer if we *didn't* have to do that.
>
> In general, I don't think waiting on an XID is sufficient because a
> process can acquire a heavyweight lock without having an XID. Perhaps
> use the VXID instead?

But decoding doesn't care about transactions that haven't "used" an XID
yet (since that means they haven't modified the catalog), so that
shouldn't be problematic.

> One thought I had about waiting for decoding to catch up is that you
> might do it before acquiring the lock. Of course, you then have a
> problem if you get behind again before acquiring the lock. It's
> tempting to adopt the solution we used for RangeVarGetRelidExtended,
> namely: wait for catchup without the lock, acquire the lock, see
> whether we're still caught up if so great else release lock and loop.
> But there's probably too much starvation risk to get away with that.

I think we'd pretty much always starve in that case. It'd be different
if we could detect that there weren't any writes to the table in
between. I can see doing that using a locking hack like autovac uses,
but brr, that'd be ugly.

> On the whole, I'm leaning toward thinking that the other solution
> (recording the old-to-new CTID mappings generated by CLUSTER to the
> extent that they are needed) is probably more elegant.

I personally still think that the "wide cmin/cmax" solution is *much*
more elegant, simpler and actually can be used for other things than
logical decoding.
Since you don't seem to agree, I am going to write a prototype using
such a mapping to see how it will look, though.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
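
For reference, the "wait for catchup without the lock, acquire the lock, recheck, else release and loop" idea quoted above can be sketched roughly as follows. This is only an illustration in Python, not PostgreSQL code: `lock_when_caught_up`, `Sim`, `wait_for_catchup`, `is_caught_up` and `max_attempts` are all hypothetical names, and the simulated writer is an assumption used to show how concurrent writes can force repeated retries (the starvation concern).

```python
import threading


def lock_when_caught_up(lock, is_caught_up, wait_for_catchup, max_attempts=5):
    """Retry loop: wait unlocked, lock, recheck, release and retry if behind.

    Returns the 1-based attempt number on success (the caller then holds
    `lock`), or None if every attempt found us behind again.
    """
    for attempt in range(1, max_attempts + 1):
        wait_for_catchup()      # wait for the consumer without holding the lock
        lock.acquire()
        if is_caught_up():      # still caught up now that we hold the lock?
            return attempt
        lock.release()          # fell behind in the window; loop and retry
    return None


class Sim:
    """Hypothetical stand-in: a writer sneaks in one write after each catchup."""

    def __init__(self, pending_writes):
        self.written = 0
        self.decoded = 0
        self.pending_writes = pending_writes

    def wait_for_catchup(self):
        self.decoded = self.written
        if self.pending_writes > 0:   # write lands before the lock is taken
            self.written += 1
            self.pending_writes -= 1

    def is_caught_up(self):
        return self.decoded == self.written
```

With two interfering writes the loop succeeds on the third attempt; with a writer that never pauses it gives up without the lock, which is the starvation case discussed above.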