On 2021-Dec-01, Simon Riggs wrote: > On Wed, 1 Dec 2021 at 14:33, Bossart, Nathan <bossa...@amazon.com> wrote: > > > > On 11/12/21, 8:56 AM, "Simon Riggs" <simon.ri...@enterprisedb.com> wrote: > > > The combination of these two statements in a transaction hits an > > > Assert in heapam.c at line 4770 on REL_14_STABLE > > > > I've been unable to reproduce this. Do you have any tips for how to > > do so? Does there need to be some sort of concurrent workload? > > That path is only ever taken when there are multiple sessions, and as > I said, pgbench finds this reliably. I guess I didn't say "use -c 2"
Simon had sent me the pgbench scripts earlier, so I attach them here. I don't actually get a crash with -c2 or -c3, but I do get almost immediate crashes with -c4 and above. If I run it under "rr", it doesn't occur either. I suspect the rr framework kills concurrency in some way that hides the problem. I didn't find a way to reproduce it with isolationtester. (If somebody wants to play with a debugger, I find that it's much easier to reproduce by adding a short sleep after the UpdateXmaxHintBits() call in line 4735; but that sleep occurs in a session *other* than the one that dies. And under rr I still don't see a crash with a sleep there; in fact the sleep doesn't seem to occur at all, which is weird.) The patch does fix the crasher under pgbench, and I think it makes sense that you can get WouldBlock and yet have the tuple marked with XMAX_INVALID: if transaction A is writing the tuple, and transaction B is acquiring the tuple lock, then transaction C also tries to acquire the tuple lock but that returns nay (because of B), then transaction A completes, then transaction B could set the XMAX_INVALID flag in time for C to have a seizure in its way out. So patching the assertion to allow the case is correct. What I don't understand is why hasn't this been reported already: this bug is pretty old. My only explanation is that nobody runs sufficiently- concurrent load with SKIP LOCKED in assert-enabled builds. [1] https://www.postgresql.org/message-id/flat/CADLWmXUvd5Z%2BcFczi6Zj1WcTrXzipgP-wj0pZOWSaRUy%3DF0omQ%40mail.gmail.com -- Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/
simon_setup.sql
Description: application/sql
simon_bench.sql
Description: application/sql