Re: Inval reliability, especially for inplace updates

Mark Dilger Wed, 21 Jan 2026 09:00:12 -0800

On Wed, Oct 23, 2024 at 7:54 PM Noah Misch <[email protected]> wrote:


> With the releases wrapping in 2.5 weeks, I'm ambivalent about pushing this
> before the release or after.  Pushing before means fewer occurrences of
> corruption, but pushing after gives more bake time to discover these
> changes
> were defective.  It's hard to predict which helps users more, on a
> risk-adjusted basis.  I'm leaning toward pushing this week.  Opinions?
>
> On Sun, Oct 20, 2024 at 06:41:37PM +0530, Nitin Motiani wrote:
> > I tested the patch locally and it works. And I have no other question
> > regarding the structure. So this patch looks good to me to commit.
>
> Thanks.  While resolving a back-branch merge conflict, I found
> AtEOXact_Inval() and AtEOSubXact_Inval() were skipping inplaceInvalInfo
> tasks
> if transInvalInfo==NULL.  If one PreInplace_Inval() failed, the session's
> next
> inplace update would get an assertion failure.  Non-assert builds left
> inplaceInvalInfo pointing to freed memory, but I didn't find a reachable
> malfunction.  Separately, the xact.c comment edit wasn't reflecting that v4
> brought back the transactional inval.  v6 fixes those.  Regarding the
> back-branch alternatives to the WAL format change:
>
> On Tue, Jun 18, 2024 at 08:23:49AM -0700, Noah Misch wrote:
> > On Mon, Jun 17, 2024 at 06:57:30PM -0700, Andres Freund wrote:
> > > On 2024-06-17 16:58:54 -0700, Noah Misch wrote:
> > > > On Sat, Jun 15, 2024 at 03:37:18PM -0700, Noah Misch wrote:
> > > > > - heap_xlog_inplace() could set the shared-inval-queue overflow
> signal on
> > > > >   every backend.  This is more wasteful, but inplace updates might
> be rare
> > > > >   enough (~once per VACUUM) to make it tolerable.
> > >
> > > We already set that surprisingly frequently, as
> > > a) The size of the sinval queue is small
> > > b) If a backend is busy, it does not process catchup interrupts
> > >    (i.e. executing queries, waiting for a lock prevents processing)
> > > c) There's no deduplication of invals, we often end up sending the
> same inval
> > >    over and over.
> > >
> > > So I suspect this might not be too bad, compared to the current
> badness.
> >
> > That is good.
>
> I benchmarked that by hacking 027_stream_regress.pl to run "pgbench
> --no-vacuum --client=4 -T 30 -b select-only" on the standby, concurrent
> with
> the primary running the regression tests.  Standby clients acted on sinval
> resetState ~16k times, and pgbench tps decreased 4.5%.  That doesn't
> necessarily mean the real-life cost would be unacceptable, but it was
> enough
> of a decrease that I switched to the next choice:
>
> > We might be able to do the overflow signal once at end of
> > recovery, like RelationCacheInitFileRemove() does for the init file.
> That's
> > mildly harder to reason about, but it would be cheaper.  Hmmm.
>
> I'm attaching the branch-specific patches for that and for the main fix.
> Other notes from back-patching:
>
> - All branches change the ABI of PrepareToInvalidateCacheTuple(), a
> function
>   catcache.c exports for the benefit of inval.c.  No PGXN extension calls
>   that, and I can't think of a use case in extensions.
>

Unfortunately, I can think of four.  I have four Table Access Methods that
I now need to fork to be compatible with 18.0 and 18.1 on the one hand, and
18.2 onward on the other.

I'm sorry I didn't follow this thread before it got pushed.

Is there a reason for doing this change in back branches?  The thread is
pretty long, and I'm struggling to find a security or stability
justification for the ABI break, but perhaps there is one.


>
> - Due to v15 commit 3aafc03, the patch for v14 differs to an unusual degree
>   from the patch for v15+v16.  The difference is mostly mechanical, though.
>
> Thanks,
> nm
>


-- 

*Mark Dilger*

Re: Inval reliability, especially for inplace updates

Reply via email to