On 2016-06-21 16:32:03 -0400, Robert Haas wrote: > On Tue, Jun 21, 2016 at 3:46 PM, Andres Freund <and...@anarazel.de> wrote: > > On 2016-06-21 15:38:25 -0400, Robert Haas wrote: > >> On Tue, Jun 21, 2016 at 1:49 PM, Andres Freund <and...@anarazel.de> wrote: > >> >> I'm also a bit dubious that LockAcquire is safe to call in general > >> >> with interrupts held. > >> > > >> > Looks like we could just acquire the tuple-lock *before* doing the > >> > toast_insert_or_update/RelationGetBufferForTuple, but after releasing > >> > the buffer lock. That'd allow us to do avoid doing the nested locking, > >> > should make the recovery just a goto l2;, ... > >> > >> Why isn't that racey? Somebody else can grab the tuple lock after we > >> release the buffer content lock and before we acquire the tuple lock. > > > > Sure, but by the time the tuple lock is released, they'd have updated > > xmax. So once we acquired that we can just do > > if (xmax_infomask_changed(oldtup.t_data->t_infomask, > > infomask) > > || > > > > !TransactionIdEquals(HeapTupleHeaderGetRawXmax(oldtup.t_data), > > xwait)) > > goto l2; > > which is fine, because we've not yet done the toasting. > > I see. > > > I'm not sure wether this approach is better than deleting potentially > > toasted data though. It's probably faster, but will likely touch more > > places in the code, and eat up a infomask bit (infomask & HEAP_MOVED > > == HEAP_MOVED in my prototype). > > Ugh. That's not very desirable at all.
I'm looking into three approaches right now: 1) Flag approach from above 2) Undo toasting on concurrent activity, retry 3) Use WAL logging for the already_marked = true case. 1) primarily suffers from a significant amount of complexity. I still have a bug in there that sometimes triggers "attempted to update invisible tuple" ERRORs. Otherwise it seems to perform decently performancewise - even on workloads with many backends hitting the same tuple, the retry-rate is low. 2) Seems to work too, but due to the amount of time the tuple is not locked, the retry rate can be really high. As we perform significant amount of work (toast insertion & index manipulation or extending a file) , while the tuple is not locked, it's quite likely that another session tries to modify the tuple inbetween. I think it's possible to essentially livelock. 3) This approach so far seems the best. It's possible to reuse the xl_heap_lock record (in an afaics backwards compatible manner), and in most cases the overhead isn't that large. It's of course annoying to emit more WAL, but it's not that big an overhead compared to extending a file, or to toasting. It's also by far the simplest fix. Comments? -- Sent via pgsql-hackers mailing list (email@example.com) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers