Alvaro Herrera <[email protected]> wrote:

> On 2026-Feb-25, Antonin Houska wrote:

> > For REPACK, I suggest a variant of toast_flatten_tuple() that writes the
> > output to a file, and a corresponding function that reads it while
> > allocating separate chunks of memory for the individual TOASTed
> > attributes - the restored tuple would reference the chunks using the
> > "external indirect" TOAST pointers, as if it had been processed by
> > ReorderBufferToastReplace(). Does that make sense to you?
> 
> Hmm, so on the apply side when reading the file, we would first reach
> each toast attribute value, which we know to insert directly into the
> toast table (keeping track of each individual toast pointer as we do
> so); then, when we reach the heap tuple itself, we [... somehow ...]
> interpret these external indirect toast pointers and substitute the
> toast pointers that we created.  So we never have to construct the
> entire tuple, or indeed do anything else with the toasted values other
> than insert them into the toast table.

Yes, that's what I mean.

> Actually, can't the decoding worker simply insert the toasted values
> directly into the new toast table?  That could save a lot of writing
> to the file, since we only save the raw heap tuples with no toasted
> contents; but it's not clear to me that this is valid.  (And we might
> create extra bloat if a tuple is inserted and later deleted
> concurrently with the repack; but that would happen with the original
> approach as well, no?)

The problem I see here is that for UPDATE you need the old tuple to determine
whether its TOAST value should be deleted or whether the new tuple should
reuse it - this is how I understand toast_tuple_init(). So the worker would
have to store all the changes somewhere temporarily until it can fully apply
them (i.e. until the initial copy and index build is complete).

Besides that, if the worker had to switch between the past (for the decoding)
and the present (for the TOAST operations), it would have to invalidate system
caches repeatedly. 0004 does that, but 0005 makes it unnecessary. (I don't
know if the repeated cache invalidation would be a serious performance
problem, but from a coding perspective I find it more convenient if the worker
only deals with decoding and does not have to do this time travel and
invalidation at all.)

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com
