On Wed, Apr 18, 2018 at 10:36 PM, Alvaro Herrera <alvhe...@alvh.no-ip.org> wrote: > Pavan Deolasee wrote: >> On Wed, Apr 18, 2018 at 7:37 AM, Wood, Dan <hexp...@amazon.com> wrote: > >> > My analysis is that heap_prepare_freeze_tuple->FreezeMultiXactId() >> > returns FRM_NOOP if the MultiXACT locked rows haven't committed. This >> > results in changed=false and totally_frozen=true(as initialized). When >> > this returns to lazy_scan_heap(), no rows are added to the frozen[] array. >> > Yet, tuple_totally_frozen is true. This means the page is marked frozen in >> > the VM, even though the MultiXACT row wasn't left untouch. >> > >> > A fix to heap_prepare_freeze_tuple() that seems to do the trick is: >> > else >> > { >> > Assert(flags & FRM_NOOP); >> > + totally_frozen = false; >> > } >> > >> >> That's a great find! > > Indeed. > > This family of bugs (introduced by freeze map changes in 9.6) was > initially fixed in 38e9f90a227d but this spot was missed in that fix. > > IMO the cause is the totally_frozen variable, which starts life in a > bogus state (true) and the different code paths can set it to the right > state, or by inaction end up deciding that the initial bogus state was > correct in the first place. Errors of omission are far too easy in that > kind of model, ISTM, so I propose this slightly different patch, which > albeit yet untested seems easier to verify and easier to get right. >
Thank you for the patch! Agreed. The patch looks to fix this issue correctly. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center