On Mon, Feb 19, 2024 at 11:44 AM Michael Paquier <mich...@paquier.xyz> wrote: > > > Yeah, comments added in v3. > > The contents look rather OK, I may do some word-smithing for both.
Here are some comments on v3: 1. + XLogRecPtr initial_effective_xmin = InvalidXLogRecPtr; + XLogRecPtr initial_catalog_effective_xmin = InvalidXLogRecPtr; + XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr; Prefix 'initial_' makes the variable names a bit longer, I think we can just use effective_xmin, catalog_effective_xmin and restart_lsn, the code updating then only when if (!terminated) tells one that they aren't updated every time. 2. + /* + * We'll release the slot's mutex soon, so it's possible that + * those values change since the process holding the slot has been + * terminated (if any), so record them here to ensure we would + * report the slot as obsolete correctly. + */ This needs a bit more info as to why and how effective_xmin, catalog_effective_xmin and restart_lsn can move ahead after signaling a backend and before the signalled backend reports back. 3. + /* + * Assert that the conflict cause recorded previously before we + * terminate the process did not change now for any reason. + */ + Assert(!(conflict_prev != RS_INVAL_NONE && terminated && + conflict_prev != conflict)); It took a while for me to understand the above condition, can we simplify it like below using De Morgan's laws for better readability? Assert((conflict_prev == RS_INVAL_NONE) || !terminated || (conflict_prev == conflict)); > > Are we going to fix this on master and 16 stable first and then later on > > add a > > test on master with the injection points? > > Still, the other patch is likely going to take a couple of weeks > before getting into the tree, so I have no objection to fix the bug > first and backpatch, then introduce a test. Things have proved that > failures could show up in the buildfarm in this area, so more time > running this code across two branches is not a bad concept, either. While I couldn't agree more on getting this fix in, it's worth pulling in the required injection points patch here and writing the test to reproduce this race issue. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com