On Fri, Mar 13, 2020 at 3:19 AM Laurenz Albe <laurenz.a...@cybertec.at> wrote: > > On Fri, 2020-03-13 at 09:10 +1300, David Rowley wrote: > > So you're suggesting we drive the insert-vacuums from existing > > scale_factor and threshold? What about the 1 billion row table > > example above? > > I am still not 100% certain if that is really realistic. > Transactions that insert only a single row are probably the > exception in large insert-only tables. > > But I think that we probably always can find a case where any given > parameter setting is not so great, so in order to get ahead > let's decide on something that is not right out stupid. > Changing the defaults later is always an option. > > So the three options are: > > 1. introduce no new parameters and trigger autovacuum if the number > of inserts exceeds the regular vacuum threshold. > > 2. introduce the new parameters with high base threshold and zero scale > factor.
Both of these look good to me. 1 is approach in my initial patch sketch, 2 is approach taken by Laurenz. Values I think in when considering vacuum is "how many megabytes of table aren't frozen/visible" (since that's what translates into processing time knowing io limits of storage), and "how many pages aren't yet vacuumed". Threshold in Laurenz's patch was good enough for my taste - it's basically "vacuum after every gigabyte", and that's exactly what we implemented when working around this issue manually. There's enough chance that latest gigabyte is in RAM and vacuum will be super fast on it; reading a gigabyte of data is not a showstopper for most contemporary physical and cloud environments I can think of. If reading a gigabyte is a problem already then wraparound is a guaranteed disaster. About index only scan, this threshold seems good enough too. There's a good chance last gig is already in RAM, and previous data was processed with previous vacuum. Anyway - with this patch Index Only Scan starts actually working :) I'd vote for 2 with a note "rip it off all together later and redesign scale factors and thresholds system to something more easily graspable". Whoever needs to cancel the new behavior for some reason will have a knob then, and patch is laid out already. > 3. introduce the new parameters with low base threshold and high scale factor. This looks bad to me. "the bigger the table, the longer we wait" does not look good for me for something designed as a measure preventing issues with big tables. > I think all three are viable. > If nobody else wants to weigh in, throw a coin. > > Yours, > Laurenz Albe > -- Darafei Praliaskouski Support me: http://patreon.com/komzpa