On Wed, Jul 1, 2026 at 9:57 PM Antonin Houska <[email protected]> wrote:
>
> Ewan Young <[email protected]> wrote:
>
> > On Mon, Jun 22, 2026 at 7:12 PM Antonin Houska <[email protected]> wrote:
> > >
> > > Ewan Young <[email protected]> wrote:
> > >
> > > > The transient heap built by make_new_heap() is intentionally created
> > > > without the old table's defaults and constraints, so it has no 
> > > > generation
> > > > expressions for its generated columns, even though the tuple descriptor
> > > > still has attgenerated set.
> > > >
> > > > When apply_concurrent_update() replays a non-HOT update, it calls
> > > > ExecInsertIndexTuples() with EIIT_IS_UPDATE. To decide whether to pass
> > > > the "indexUnchanged" hint, that calls index_unchanged_by_update() ->
> > > > ExecGetExtraUpdatedCols() -> ExecInitGenerated(), which looks up the
> > > > generation expression of each generated column via 
> > > > build_column_default()
> > > > and errors out when it finds none on the transient heap.
> > > >
> > > > The apply path does not need to recompute generated columns at all: the
> > > > decoded tuple already carries the correct value, and it is only 
> > > > inserted.
> > > > Note also that ExecGetUpdatedCols() already returns an empty set for 
> > > > this
> > > > ResultRelInfo, because it is not part of any range table -- so the
> > > > indexUnchanged determination here is already approximate.
> > >
> > > I'm sorry for the confusion, but the fact that ExecGetUpdatedCols() 
> > > returns an
> > > empty set is an omission rather than deliberate choice. Assuming we fix 
> > > that,
> > > the result of ExecGetExtraUpdatedCols() does matter. Thus we should copy 
> > > the
> > > related pg_attrdef entries, as I suggest in this patch.
> > >
> > > Another question is how serious problem it is that ExecGetUpdatedCols()
> > > returns empty set. AFAICS, "indexUnchanged" does not affect correctness - 
> > > it's
> > > is only a hint that helps the btree AM decide whent the bottom-up 
> > > deletion and
> > > de-duplication techniques should (not) be used. I'm not sure it's easy to
> > > update the set for individual UPDATEs: the UPDATE commands REPACK replays
> > > originate from different SQL queries and the logical decoding does not
> > > transfer this information.
> > >
> > > Even then, I think it'd be "less bad" to have ExecGetUpdatedCols() return 
> > > a
> > > set containing all the attributes rather than empty set. That is, avoid 
> > > using
> > > the btree optimizations altogether rather than allow them them when not
> > > appropriate. However, per index_unchanged_by_update(), if 
> > > ExecGetUpdatedCols()
> > > tells that all columns are updated, the result of 
> > > ExecGetExtraUpdatedCols()
> > > does not matter.  Nevertheless, I'd still slightly prefer copying the
> > > pg_attrdef entries to hacking the executor.
> >
> > Agreed, thanks for the correction. Relying on the empty ExecGetUpdatedCols()
> > set was the weak point of my version -- it's an omission, not something to
> > build a second approximation on. Copying the pg_attrdef entries fixes the
> > inconsistency at the root, so let's go with your approach.
> >
> > I applied the patch and ran it through an injection-point reproducer
> > (cassert). Without the fix the bug reproduces (ERROR: no generation
> > expression found for column number 3 ...); with it, REPACK CONCURRENTLY
> > succeeds under a concurrent non-HOT UPDATE for a STORED generated column, an
> > index directly on the generated column, and a VIRTUAL column, with correct
> > values afterwards. Your repack.spec change passes.
> >
> > The approach is right and I've confirmed it fixes the bug, so +1 from me in
> > this direction.
>
> Thanks for checking. Here, 0002 tries to fix the problem of empty updatedCols,
> as I proposed above.
>
> 0001 fixes two minor coding issues that I found when writing 0002.

Both look good to me. 0001 is a clear improvement — makeNode() is the right
idiom, and NULL rather than 0 for the pointer argument.

For 0002, I agree with the approach: since indexUnchanged only feeds the btree
bottom-up-deletion / dedup heuristics, treating all columns as updated (so the
hint is always false) is the safe choice — better to forgo those optimizations
than to misuse them. I checked the mechanism against the current code and it
holds: with ri_RangeTableIndex = 1, ExecGetUpdatedCols() returns the full set,
so index_unchanged_by_update() always sees a changed key column.

+1 on both.

>
> --
> Antonin Houska
> Web: https://www.cybertec-postgresql.com
>


-- 
Regards,
Ewan Young


Reply via email to