On Wed, Oct 9, 2019 at 2:40 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Wed, Oct 9, 2019 at 2:00 PM Dilip Kumar <dilipbal...@gmail.com> wrote:
> >
> > On Wed, Oct 9, 2019 at 10:22 AM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> > >
> > > On Tue, Oct 8, 2019 at 2:45 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > > >
> > > > On Tue, Oct 8, 2019 at 1:48 AM Peter Geoghegan <p...@bowt.ie> wrote:
> > > > >
> > > > > ISTM that the use of maintenance_work_mem wasn't given that much
> > > > > thought originally.
> > > >
> > > > One idea to do something better could be to check whether there is a GIN
> > > > index on a table, and if so, use 1/4 (25% or whatever) of
> > > > maintenance_work_mem for GIN indexes and 3/4 (75%) of
> > > > maintenance_work_mem for collecting dead tuples.
> > >
> > > I felt that it would not be easy for users to tune
> > > maintenance_work_mem if it controls more than one thing. If this is
> > > an index-AM (GIN) specific issue, we might rather want to control the
> > > memory limit of pending list cleanup by a separate GUC parameter like
> > > gin_pending_list_limit, say gin_pending_list_work_mem.
>
> Sure, having another work_mem parameter for GIN indexes which
> controls when we need to flush the pending list will make life easier
> for us as programmers. If we have a specific parameter for this
> purpose, then we can even think of using the same for cleanup
> during insert operations as well. However, I am not sure how easy it
> would be for users. Basically, they now need to remember another
> parameter, and there is no easy way to know what its value should
> be. I think one has to check
> gin_metapage_info->n_pending_pages and then, based on that, configure
> the value of this parameter to get the maximum benefit possible.
>
> Can we think of using work_mem for this? Basically, we use work_mem
> during insert operations, so why not use it during vacuum operations
> for this purpose?
>
> Another idea could be to try to divide maintenance_work_mem
> smartly if we know the value of pending_pages for each GIN index, but
> I think for that we need to either read the metapage or maybe use some
> sort of stats that can be used by vacuum. We need to somehow divide
> it based on the amount of memory required for the number of dead tuples
> in the heap and the memory required by the tuples in the pending list.
> I am not sure how feasible this approach is.
>
> About the difficulty for users of tuning one or two parameters for
> vacuum: I think if they can compute what the values for the GUCs would
> be separately, then why can't they add them up and set that as one value?
> Having said that, I am not denying that having a separate parameter
> gives better control, and for this specific case a separate parameter
> can allow us to use it both during vacuum and insert operations.
>
> > > And we can
> > > either set (the memory for GIN pending list cleanup / # of GIN
> > > indexes) for the parallel workers.
> >
> > IMHO if we do that then we will lose the meaning of having
> > maintenance_work_mem, right? Then the user cannot control how much
> > memory the autovacuum worker will use.
>
> I am not sure how different that is from the current situation.
> Basically, now it can use up to 2 * maintenance_work_mem, and if
> we do what Sawada-san is proposing, then it will be
> maintenance_work_mem + gin_*_work_mem. Do you have some other
> alternative idea in mind, or do you think the current situation is
> better than anything else we can do in this area?
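To make the "smart division" idea above a bit more concrete, here is a rough sketch (in Python, purely for illustration; all numbers are hypothetical, and a real implementation would live in the vacuum code and read n_pending_pages from each index's metapage). It divides maintenance_work_mem between the heap dead-tuple array and the pending-list cleanup of each GIN index, proportionally to their estimated needs:

```python
# Sketch of dividing maintenance_work_mem between the heap dead-tuple
# collection and per-GIN-index pending-list cleanup, proportionally to
# the estimated memory each consumer needs.  Hypothetical helper, not
# PostgreSQL code.

BLCKSZ = 8192  # PostgreSQL default block size in bytes


def divide_maintenance_work_mem(maintenance_work_mem_kb,
                                dead_tuple_bytes,
                                pending_pages_per_index):
    """Return (heap_budget_kb, [per-index budgets in kB]).

    dead_tuple_bytes: estimated memory needed for the dead-tuple array.
    pending_pages_per_index: n_pending_pages for each GIN index.
    """
    total_kb = maintenance_work_mem_kb
    # Estimated need of each GIN index: its pending pages, in kB.
    gin_needs_kb = [p * BLCKSZ // 1024 for p in pending_pages_per_index]
    heap_need_kb = dead_tuple_bytes // 1024
    total_need_kb = heap_need_kb + sum(gin_needs_kb)
    if total_need_kb == 0:
        # Nothing to divide; give the heap side the whole budget.
        return total_kb, [0] * len(gin_needs_kb)
    # Scale every consumer by the same factor so the shares never sum
    # to more than the maintenance_work_mem cap.
    heap_kb = total_kb * heap_need_kb // total_need_kb
    gin_kb = [total_kb * need // total_need_kb for need in gin_needs_kb]
    return heap_kb, gin_kb


# Example: 64MB maintenance_work_mem, 32MB worth of dead tuples, and
# two GIN indexes with 2048 and 1024 pending pages (16MB and 8MB).
heap, gins = divide_maintenance_work_mem(64 * 1024,
                                         32 * 1024 * 1024,
                                         [2048, 1024])
# heap is about 37449 kB, gins about [18724, 9362] kB.
```

The property worth noting is that the shares never sum to more than maintenance_work_mem, so it remains the top limit; the open question raised above is still how vacuum would cheaply obtain the pending-list size estimates in the first place.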
I think the current situation is not good, but if we try to cap it at
maintenance_work_mem + gin_*_work_mem, then I don't think that will make
the situation much better either. However, I think the idea you proposed
up-thread[1] is better. At least maintenance_work_mem will then be the
top limit on what the autovacuum worker can use.

[1] https://www.postgresql.org/message-id/CAA4eK1JhY88BXC%3DZK%3D89MALm%2BLyMkMhi6WG6AZfE4%2BKij6mebg%40mail.gmail.com

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com