On Thu, 10 Jun 2021 at 18:03, Peter Geoghegan <p...@bowt.ie> wrote:
>
> On Thu, Jun 10, 2021 at 8:49 AM Matthias van de Meent
> <boekewurm+postg...@gmail.com> wrote:
> > Could you elaborate on what this "matches what we expect" entails?
> >
> > Apart from this, I'm also quite certain that the goto-branch that
> > created this infinite loop should have been dead code: In a correctly
> > working system, the GlobalVis*Rels should always be at least as strict
> > as the vacrel->OldestXmin, but at the same time only GlobalVis*Rels
> > can be updated (i.e. move their horizon forward) during the vacuum. As
> > such, heap_prune_satisfies_vacuum should never fail to vacuum a tuple
> > that also satisifies the condition of HeapTupleSatisfiesVacuum.
>
> It's true that these two similar functions should be in perfect
> agreement in general (given the same OldestXmin). That in itself
> doesn't mean that they must always agree about a tuple in practice,
> when they're called in turn inside lazy_scan_prune(). In particular,
> nothing stops a transaction that was in progress to
> heap_prune_satisfies_vacuum (when it saw some tuples it inserted)
> concurrently aborting. That will render the same tuples fully DEAD
> inside HeapTupleSatisfiesVacuum(). So we need to restart using the
> goto purely to cover that case. See the commit message of commit
> 8523492d4e3.

I totally overlooked that HeapTupleSatisfiesVacuumHorizon does the
heavyweight XID validation and does return HEAPTUPLE_DEAD in those
recently rolled back cases. Thank you for reminding me.

> By "matches what we expect", I meant "involves a just-aborted
> transaction". We could defensively verify that the inserting
> transaction concurrently aborted at the point of retrying/calling
> heap_page_prune() a second time. If there is no aborted transaction
> involved (as was the case with this bug), then we can be confident
> that something is seriously broken.

I believe there are more cases than only the rolled back case, but
checking for those cases would potentially help, yes.

With regards,

Matthias van de Meent.


Reply via email to