Re: [HACKERS] bug in locking an update tuple chain
> The attached patch fixes the problem.  When locking some old tuple version of
> the chain, if we detect that we already hold that lock
> (test_lockmode_for_conflict returns HeapTupleSelfUpdated), do not try to lock
> it again but instead skip ahead to the next version.  This fixes the synthetic
> case in my isolationtester as well as our customer's production case.

Pushed.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
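To make the behaviour change concrete, here is a minimal toy model in C of walking an update chain and skipping versions the current transaction has already locked. It is only a sketch under stated assumptions, not the code in heapam.c: TupleVersion, LockCheck, check_lock, lock_chain and the skip_self_held flag are all hypothetical names, and a bitmask of toy transaction ids stands in for multixact membership.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the result of test_lockmode_for_conflict. */
typedef enum { LOCK_OK, LOCK_SELF_HELD } LockCheck;

/* Hypothetical, heavily simplified tuple version in an update chain. */
typedef struct TupleVersion {
    unsigned lockers;            /* bitmask of toy xids holding a compatible lock */
    struct TupleVersion *next;   /* next (newer) version, or NULL at chain end */
} TupleVersion;

/* Report whether the caller already holds a lock on this version. */
static LockCheck check_lock(const TupleVersion *tv, unsigned my_bit)
{
    return (tv->lockers & my_bit) ? LOCK_SELF_HELD : LOCK_OK;
}

/*
 * Walk the chain from 'tv' and lock every version for 'my_bit'.
 * With skip_self_held == false (modelling the pre-fix behaviour), finding
 * our own lock is reported as a failure.  With skip_self_held == true
 * (modelling the fix), such a version is skipped and the walk continues.
 */
static bool lock_chain(TupleVersion *tv, unsigned my_bit, bool skip_self_held)
{
    assert(my_bit != 0);
    for (; tv != NULL; tv = tv->next)
    {
        if (check_lock(tv, my_bit) == LOCK_SELF_HELD)
        {
            if (skip_self_held)
                continue;        /* already ours: move on to the next version */
            return false;        /* pre-fix model: caller sees a failure */
        }
        tv->lockers |= my_bit;   /* join the set of compatible lockers */
    }
    return true;
}
```

In this model, a first walk by one transaction locks every version. If another transaction then adds itself as a compatible locker and the first transaction has to walk the chain again, the second walk fails when skip_self_held is false and succeeds when it is true.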
Re: [HACKERS] bug in locking an update tuple chain
Amit Kapila wrote:
> On Sat, Jul 15, 2017 at 2:30 AM, Alvaro Herrera wrote:
> > a transaction wants to lock the
> > updated version of some tuple, and it does so; and some other
> > transaction is also locking the same tuple concurrently in a compatible
> > way.  So both are okay to proceed concurrently.  The problem is that if
> > one of them detects that anything changed in the process of doing this
> > (such as the other session updating the multixact to include itself,
> > both having compatible lock modes), it loops back to ensure
> > xmax/infomask are still sane; but heap_lock_updated_tuple_rec is not
> > prepared to deal with the situation of "the current transaction has the
> > lock already", so it returns a failure and the tuple is returned as "not
> > visible" causing the described problem.
>
> Your fix seems logical to me, though I have not tested it till now.
> However, I wonder why heap_lock_tuple needs to restart from the
> beginning of the update chain in this case?

Well, it's possible that we could change things so that it doesn't need
to re-start from the same spot where it initially began, but I think it
requires changing too much code; I'd rather not touch it in a
back-patchable bug fix.  If we really wanted, we could perhaps change
things to avoid repeated walks of the chain, but I'd see that as a pg11
(or future) change only.

(You would be forgiven for thinking that the interactions between
EvalPlanQualFetch, heap_lock_tuple and heap_lock_updated_tuple are
rather Rube Goldbergian, to use Tom's term.)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] bug in locking an update tuple chain
On Sat, Jul 15, 2017 at 2:30 AM, Alvaro Herrera wrote:
> A customer of ours reported a problem in 9.3.14 while inserting tuples
> in a table with a foreign key, with many concurrent transactions doing
> the same: after a few insertions worked successfully, a later one would
> return failure indicating that the primary key value was not present in
> the referenced table.  It worked fine for them on 9.3.4.
>
> After some research, we determined that the problem disappeared if
> this commit was reverted:
>
> Author: Alvaro Herrera
> Branch: master Release: REL9_6_BR [533e9c6b0] 2016-07-15 14:17:20 -0400
> Branch: REL9_5_STABLE Release: REL9_5_4 [649dd1b58] 2016-07-15 14:17:20 -0400
> Branch: REL9_4_STABLE Release: REL9_4_9 [166873dd0] 2016-07-15 14:17:20 -0400
> Branch: REL9_3_STABLE Release: REL9_3_14 [6c243f90a] 2016-07-15 14:17:20 -0400
>
>     Avoid serializability errors when locking a tuple with a committed update
>
> I spent some time writing an isolationtester spec to reproduce the
> problem.  It turned out that this required six concurrent sessions in
> order for the problem to show up at all, but once I had that, figuring
> out what was going on was simple: a transaction wants to lock the
> updated version of some tuple, and it does so; and some other
> transaction is also locking the same tuple concurrently in a compatible
> way.  So both are okay to proceed concurrently.  The problem is that if
> one of them detects that anything changed in the process of doing this
> (such as the other session updating the multixact to include itself,
> both having compatible lock modes), it loops back to ensure
> xmax/infomask are still sane; but heap_lock_updated_tuple_rec is not
> prepared to deal with the situation of "the current transaction has the
> lock already", so it returns a failure and the tuple is returned as "not
> visible" causing the described problem.
>

Your fix seems logical to me, though I have not tested it till now.
However, I wonder why heap_lock_tuple needs to restart from the
beginning of the update chain in this case?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com