On Fri, Jul 18, 2025 at 10:00 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
>
> On Thu, Jul 17, 2025 at 1:39 PM Melanie Plageman
> <melanieplage...@gmail.com> wrote:
>
> > I think there is a fundamental tension here related to whether or not
> > vacuumparallel.c should be table-AM agnostic. All of its code is
> > invoked below heap_vacuum_rel(). I would argue that vacuumparallel.c
> > is index AM-agnostic but heap-specific.
> >
> > I think this is what makes your table AM callbacks feel odd. Other
> > table AM callbacks are invoked from general code -- for example from
> > executor code that is agnostic to the underlying table AMs. But, all
> > of the functions you added and the associated callbacks are invoked
> > below heap_vacuum_rel() (as is all of vacuumparallel.c).
>
> What exactly does it mean to make vacuumparallel.c index AM-agnostic
> but heap-specific? I imagine we would change vacuumparallel.c so that it
> calls heap's functions for parallel vacuuming, such as initializing the
> DSM, the parallel vacuum worker's entry function, etc. But I'm not sure
> how that works with non-heap table AMs that use vacuumparallel.c.

I don't understand what you mean by non-heap table AMs using
vacuumparallel.c. Do you mean that they are using it like a library? I
don't understand why the existing parallel index vacuuming code added
table AM callbacks either. They are only invoked from below
heap_vacuum_rel() -- unless I am missing something? From looking at
tableam.h, the parallel vacuum callbacks seem to be the only table AM
callbacks that are not invoked from table AM-agnostic code. What was
the point of adding them?

I can see how other table AMs implementing vacuum may want to use some
of the vacuumparallel.c functions exposed in the vacuum.h header file
to save time and code for the index vacuuming portion of vacuuming.
But why have table AM callbacks that aren't invoked unless you are
using heap?

More generally, vacuuming is a very table AM-specific operation. You
often wouldn't need any of the same data structures. Think about an
undo-based table AM -- you don't need dead items or anything like
that. Or the phase ordering -- other table AMs may not need to vacuum
the table, then the indexes, then the table again. Something like a
scan just has to yield tuples to the executor, but vacuum seems
inextricably linked to the on-disk layout. Most of the vacuum.c code
is just concerned with parsing options, checking permissions,
determining precedence between GUCs and table options, and doing
catalog operations.

> One possible change would be to have vacuumlazy.c pass the set of
> callbacks to parallel_vacuum_init() instead of having them as table AM
> callbacks. This removes the weirdness of the associated table AM
> callbacks being invoked below heap_vacuum_rel(), but it doesn't address
> your point that "vacuumparallel.c is index AM-agnostic but heap-specific".

I do agree you are in a tough spot if we keep the parallel index
vacuuming callbacks in tableam.h. It is weird to call a heap-specific
function from within a table AM callback. I guess my first question is
which table AMs are using the existing callbacks, and how they are
doing that, since I don't see places where they are called from
non-heap code. If they use the vacuumparallel.c functions exposed in
vacuum.h as a library, they could do that without the callbacks in
tableam.h. But I think I must just be missing something.

> > And even for cases where the information has to be passed from the
> > leader to the workers, there is no reason you can't have the same
> > struct but in shared memory for the parallel case and local memory for
> > the serial case -- for example, the struct members "aggressive",
> > "skipwithvm", and "cutoffs" in LVRelState and ParallelLVShared, or
> > ParallelLVScanDesc->nblocks and LVRelState->rel_pages.
> >
> > Ideally, the entire part of the LVRelState that is unchanging
> > input/reference data would be in a struct which is in local memory for
> > the serial and local parallel case and a single read-only location in
> > the shared parallel case. Or, for the shared case, if there is a reason
> > not to read from the shared memory, we copy the fields over to a local
> > instance of the same struct. Maybe it is called HeapVacInput or
> > similar.
>
> ParallelLVShared is created to pass information to parallel vacuum
> workers while keeping LVRelState able to work locally. Suppose that
> we create HeapVacInput including "aggressive", "cutoffs", "skipwithvm",
> and "rel_pages"; LVRelState would have to have a pointer to a
> HeapVacInput instance that is in either local memory or shared memory.
> Since we also need to pass other information, such as
> initial_chunk_size and eager-scanning-related state, to the
> parallel vacuum workers, we would have to create something like
> ParallelLVShared as the patch does. As a result, we would have two
> structs that need to be placed in shared memory. Is that kind of
> what you meant? Or did you mean that we include the parallel vacuum
> related fields in the HeapVacInput struct too?

Yes, I think generally this is what I am saying.

Honestly, I think the two biggest problems right now are that 1) the
existing LVRelState mixes too many different types of data (e.g.
output stats and input data) and 2) the structs your patch
introduces don't have names that make it clear enough what they are.

What is ParallelLVScanDesc? We don't have the convention of a scan
descriptor for vacuum, so is the point here that it is only used for a
single "scan" of the heap table? That sounds like all of phase I to
me. Or why is the static input data structure just called
ParallelLVShared? Nothing about that indicates read-only or static.

When I was imagining how to make it more clear, I was thinking
something like what you are saying above. There are two types of input
read-only data passed from the leader to the workers. One is
parallel-only and one is needed in both the parallel and serial cases.
I was suggesting having the common input read-only fields in the same
type of struct. That would mean having two types of input data
structures in the parallel-case -- one parallel only and one common.
It's possible that that is more confusing.

I think it also depends on how the workers end up accessing the data.
For input read-only data, you can either have workers access a single
copy in shared memory (with a pointer that points to shared memory
in the parallel case and local memory in the serial case), or you can
copy the data over to local memory so that the parallel and serial
cases use the same data structure. You are doing mostly the
latter, but the setup is spread around enough that it isn't immediately
clear that that is what is happening.

One thing that might be worth trying is moving the setup steps in
heap_vacuum_rel() into some kind of helper function that can be called
both in heap_vacuum_rel() by the leader and also in the worker setup.
Then it might make it clear what only the leader does (e.g. call
vacuum_get_cutoffs()) and what everyone does. I'm just brainstorming
here, though.

I think if you just make a new version of the patch with the eager
scanning separated out and the names of the structs clarified, I could
try to be more specific. I found the spread of members across the
different structs hard to parse, but I'm not 100% sure yet what would
make it easier.

> I wanted to keep ParallelLVShared mostly read-only, aimed at
> sharing information from the leader to the workers, whereas
> ParallelLVScanDesc is pure scan state maintained by each worker
> (and the leader).

I'm not sure what scan state means here. nallocated seems to be a
coordination member. But nblocks seems to be used in places where you
have access to the LVRelState, which has rel_pages, so why wouldn't
you just use that?

> chunk_size is used when allocating the new chunk (see
> parallel_lazy_scan_get_nextpage()). It's initialized by
> vacrel->plvstate->shared->initial_chunk_size since the first chunk
> size could vary depending on eager scanning state, and is updated to
> the fixed size PARALLEL_LV_CHUNK_SIZE from the next chunk.

I see that ParallelLVScanWorkerData->chunk_size is used in that way,
but I don't see where ParallelLVScanDesc->chunk_size is used. Also,
why does ParallelLVScanWorkerData->chunk_size need to be in shared
memory? Isn't the worker just using it locally?

- Melanie

