Hi,

On 2020-02-07 16:40:46 -0800, Andres Freund wrote:
> I'm currently fighting with a race I'm observing in about 1/4 of the
> runs. [...]
> I think the issue here is that determines whether s1 can finish its
> check_exclusion_or_unique_constraint() check with one retry is whether
> it reaches it does the tuple visibility test before s2's transaction has
> actually marked itself as visible (note that ProcArrayEndTransaction is
> after RecordTransactionCommit logging the COMMIT above).
>
> I think the fix is quite easy: Ensure that there *always* will be the
> second wait iteration on the transaction (in addition to the already
> always existing wait on the speculative token). Which is just adding
> s2_begin s2_commit steps. Simple, but took me a few hours to
> understand :/.
>
> I've attached that portion of my changes. Will interrupt scheduled
> programming for a bit of exercise now.

I've pushed this now. Thanks for the patch, and the review!

I additionally restricted the controller_print_speculative_locks step to
the current database and made a bunch of debug output more precise.

Survived ~150 runs locally. Let's see what the buildfarm says...

Regards,

Andres