On Mon, Mar 16, 2026 at 5:34 AM Daniil Davydov <[email protected]> wrote:
>
> Hi,
>
> On Thu, Mar 12, 2026 at 2:05 AM Masahiko Sawada <[email protected]> wrote:
> >
> > BTW this discussion made me think of changing av_max_parallel_workers to
> > control the number of workers per autovacuum worker instead (renaming
> > it to, say, max_parallel_workers_per_autovacuum_worker). Users
> > can compute the maximum number of parallel workers the system requires
> > as (autovacuum_worker_slots *
> > max_parallel_workers_per_autovacuum_worker). We would no longer need
> > the reservation and release logic. I'd like to hear your opinion.
> >
>
> IIUC, one of autovacuum's main goals is to be "inconspicuous" to the
> rest of the system. I mean that it should not try to vacuum all the tables
> as fast as possible; instead it should interfere with other backends
> as little as possible and avoid high resource consumption (assuming
> there is no danger of wraparound).
>
> I propose to reason from a case in which parallel a/v will actually be
> used:
> We have 3 tables, each with 80+ indexes, that require
> parallel a/v. Ideally, each of these tables should be processed with 20
> parallel workers. This is a real example that can be encountered in
> different production systems, where such tables take up about half of
> all the data in the database.
>
> How will parallel a/v handle such a situation?
> 1. Our current implementation
> We can set av_max_parallel_workers to 60 and autovacuum_parallel_workers
> reloption to 20 for each table.
> 2. Proposed idea
> We can set max_parallel_workers_per_av_worker to 20 and
> autovacuum_parallel_workers reloption to 20 for each table.
>
> In both cases we have a guarantee that all tables will be processed with
> the desired number of parallel workers. And both cases allow us to limit
> CPU consumption by reducing the "av_max_parallel_workers" parameter (for
> the current implementation) or by reducing the "autovacuum_parallel_workers"
> reloption for each table (for the proposed idea). So basically I don't see
> that the current approach has any big advantage over the idea you proposed.
>
> I also asked a friend of mine, who has worked for many years with clients
> running big production systems. He said that it is super important to
> process such huge tables with maximum "intensity", i.e. each a/v worker
> should be able to launch as many parallel workers as required. I guess
> this is an argument in favor of your idea.
>
> The only argument against this idea that I could come up with is that some
> users may abuse our parallel a/v feature. For instance, a user could set
> the "autovacuum_parallel_workers" reloption not only for large tables, but
> also for many smaller ones. In this case max_parallel_workers_per_av_worker
> must be pretty large (in order to process the huge tables). Thus, the user
> can face a situation in which all a/v workers are launching additional
> parallel workers => there is high CPU consumption and possibly a
> max_parallel_workers shortage. The only way to deal with it is to go
> through the large number of smaller tables and reduce the
> "autovacuum_parallel_workers" reloption for each of them. IMHO, this is a
> pretty unpleasant experience for the user. On the other hand, the user
> himself is to blame for getting into such a situation.
>
> Let's summarize.
> The proposed idea has several strong advantages over the current
> implementation. The only disadvantage I came up with can be avoided by
> documenting recommendations on how to use this feature. So, if I didn't
> mess anything up and you don't have any doubts, I would rather implement
> the proposed idea.

Thank you for the analysis on the new idea.

While both ideas can achieve the general goal of this feature, the
new idea doesn't require an additional layer of reserve/release logic
on top of the existing bgworker pool, which is good. I've not tried
coding this idea, but I believe the patch can be simplified considerably.
So I agree with moving to this idea.
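To make the sizing arithmetic upthread concrete, here is a hypothetical
postgresql.conf sketch; max_parallel_workers_per_autovacuum_worker is the
GUC proposed in this thread, not a setting in released PostgreSQL, and the
max_parallel_workers value is just one way the user might size it:

```
# Hypothetical sketch; the per-autovacuum-worker limit below is the GUC
# proposed in this thread, not a released PostgreSQL setting.
autovacuum_worker_slots = 3                      # a/v workers that may run concurrently
max_parallel_workers_per_autovacuum_worker = 20  # parallel-worker limit per a/v worker

# Worst-case parallel-worker demand from autovacuum alone:
#   autovacuum_worker_slots * max_parallel_workers_per_autovacuum_worker
#   = 3 * 20 = 60,
# so max_parallel_workers would need headroom for this on top of queries.
max_parallel_workers = 60
```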

>
> > > 2)
> > > I suggest adding a separate log that will be emitted every time we are
> > > unable to start workers due to a shortage of av_max_parallel_workers.
> >
> > For (2), do you mean that the worker writes these logs regardless of
> > the log_autovacuum_min_duration setting? I'm concerned that the server
> > log would be flooded with these messages, especially when multiple
> > autovacuum workers are working very actively and the system is facing
> > a shortage of av_max_parallel_workers.
>
> Oh, I didn't take that into account. But this is not a problem: we can
> accumulate such statistics just as we do now for the "nreserved" ones, and
> then log the value together with all the other stats.
>
> > > Possibly we can introduce a new injection point, or a new log for it.
> > > But I assume that the subject of discussion in patch 0002 is the
> > > "nreserved" logic, and "nlaunched/nplanned" logic does not raise any
> > > questions.
> > >
> > > I suggest splitting the 0002 patch into two parts: 1) basic logic and
> > > 2) additional logic with nreserved or something else. The second part
> > > can be discussed in isolation from the patch set. If we do this, we
> > > may not have to change the tests. What do you think?
> >
> > Assuming the basic logic means nlaunched/nplanned logic, yes, it would
> > be a nice idea. I think user-facing logging stuff can be developed as
> > an improvement independent from the main parallel autovacuum patch.
> > It's ideal if we can implement the main patch (with tests) without
> > relying on the user-facing logging.
>
> OK, actually we can do it.
>
>
>
> Thank you very much for the review!
> Please see the attached patches. The changes are:
> 1) Fixed a segfault caused by accessing an outdated pv_shared_cost_params
> pointer.
> 2) "Logging for autovacuum" is split into two patches: basic logging
> (nplanned/nlaunched) and advanced logging (nreserved).
> 3) Tests are now independent of logging.

Thank you for updating the patches. I'll wait for the new
implementation and will review the patches as soon as they are
updated.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

