On Sun, Jan 5, 2020 at 6:40 AM Tomas Vondra <tomas.von...@2ndquadrant.com> wrote: > > On Sun, Jan 05, 2020 at 08:54:15AM +0900, Masahiko Sawada wrote: > >On Thu, Jan 2, 2020 at 9:09 PM Amit Kapila <amit.kapil...@gmail.com> > >wrote: > >> > >> Hi, > >> > >> I am starting a new thread for some of the decisions for a parallel > >> vacuum in the hope to get feedback from more people. There are > >> mainly two points for which we need some feedback. > >> > >> 1. Tomas Vondra has pointed out on the main thread [1] that by > >> default the parallel vacuum should be enabled similar to what we do > >> for Create Index. As proposed, the patch enables it only when the > >> user specifies it (ex. Vacuum (Parallel 2) <tbl_name>;). One of the > >> arguments in favor of enabling it by default as mentioned by Tomas is > >> "It's pretty much the same thing we did with vacuum throttling - it's > >> disabled for explicit vacuum by default, but you can enable it. If > >> you're worried about VACUUM causing issues, you should set cost > >> delay.". Some of the arguments against enabling it are that it will > >> lead to use of more resources (like CPU, I/O) which users might or > >> might like. > >> > > > >I'm a bit wary of making parallel vacuum enabled by default. Single > >process vacuum does sequential reads/writes on most of indexes but > >parallel vacuum does random access random reads/writes. I've tested > >parallel vacuum on HDD and confirmed the performance is good but I'm > >concerned that it might be cause of more disk I/O than user expected. > > > > I understand the concern, but it's not clear to me why to apply this > defensive approach just to vacuum and not to all commands. Especially > when we do have a way to throttle vacuum (unlike pretty much any other > command) if I/O really is a scarce resource. > > As the vacuum workers are separate processes, each generating requests > with a sequential pattern, so I'd expect readahead to kick in and keep > the efficiency of sequential access pattern. >
Right, I also think so. > >> Now, if we want to enable it by default, we need a way to disable it > >> as well and along with that, we need a way for users to specify a > >> parallel degree. I have mentioned a few reasons why we need a > >> parallel degree for this operation in the email [2] on the main > >> thread. > >> > >> If parallel vacuum is *not* enabled by default, then I think the > >> current way to enable is fine which is as follows: Vacuum (Parallel > >> 2) <tbl_name>; > >> > >> Here, if the user doesn't specify parallel_degree, then we internally > >> decide based on number of indexes that support a parallel vacuum with > >> a maximum of max_parallel_maintenance_workers. > >> > >> If the parallel vacuum is enabled by default, then I could think of > >> the following ways: > >> > >> (a) Vacuum (disable_parallel) <tbl_name>; Vacuum (Parallel > >> <parallel_degree>) <tbl_name>; > >> > >> (b) Vacuum (Parallel <parallel_degree>) <tbl_name>; If user > >> specifies parallel_degree as 0, then disable parallelism. > >> > >> (c) ... Any better ideas? > >> > > > >If parallel vacuum is enabled by default, I would prefer (b) but I > >don't think it's a good idea to accept 0 as parallel degree. If we want > >to disable parallel vacuum we should max_parallel_maintenance_workers > >to 0 instead. > > > > IMO that just makes the interaction between vacuum options and the GUC > even more complicated/confusing. > Yeah, I am also not sure if that will be a good idea. > If we want to have a vacuum option to determine parallel degree, we > should probably have a vacuum option to disable parallelism using just a > vacuum option. I don't think 0 is too bad, and disable_parallel seems a > bit awkward. Maybe we could use NOPARALLEL (in addition to PARALLEL n). > That's what Oracle does, so it's not entirely without a precedent. > We can go either way (using 0 for parallel to indicate disable parallelism or by introducing a new option like NOPARALLEL). I think initially we can avoid introducing more options and just go with 'Parallel 0' and if we find a lot of people find it inconvenient, then we can always introduce a new option later. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com