On Fri, 2020-03-13 at 12:05 +0300, Darafei "Komяpa" Praliaskouski wrote:
> > 1. introduce no new parameters and trigger autovacuum if the number
> > of inserts exceeds the regular vacuum threshold.
> >
> > 2. introduce the new parameters with a high base threshold and zero
> > scale factor.
>
> Both of these look good to me. 1 is the approach in my initial patch
> sketch; 2 is the approach taken by Laurenz.
>
> The values I think in when considering vacuum are "how many megabytes
> of the table aren't frozen/visible" (since that's what translates into
> processing time, given the I/O limits of the storage) and "how many
> pages aren't vacuumed yet".
>
> The threshold in Laurenz's patch was good enough for my taste - it's
> basically "vacuum after every gigabyte", and that's exactly what we
> implemented when working around this issue manually. There's a good
> chance that the latest gigabyte is in RAM, so vacuum will be super
> fast on it, and reading a gigabyte of data is not a showstopper for
> most contemporary physical and cloud environments I can think of. If
> reading a gigabyte is already a problem, then wraparound is a
> guaranteed disaster.
>
> For index only scans, this threshold seems good enough too. There's a
> good chance the last gigabyte is already in RAM, and the previous data
> was processed by the previous vacuum. Anyway - with this patch, Index
> Only Scan actually starts working :)
>
> I'd vote for 2, with a note to "rip it all out later and redesign the
> scale factor and threshold system into something more easily
> graspable". Whoever needs to cancel the new behavior for some reason
> will have a knob then, and the patch is already laid out.
>
> > 3. introduce the new parameters with a low base threshold and high
> > scale factor.
>
> This looks bad to me. "The bigger the table, the longer we wait" does
> not seem right for something designed to prevent issues with big
> tables.
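To make the comparison concrete: all three options boil down to the
same trigger test and differ only in where the two numbers come from.
A minimal sketch of the condition - names and values are illustrative,
not the actual autovacuum.c code or the patch's GUC names:

    #include <stdbool.h>

    /*
     * Sketch of the insert-driven trigger condition.
     *
     * Option 1: base_threshold and scale_factor are the existing
     *           autovacuum_vacuum_threshold and
     *           autovacuum_vacuum_scale_factor.
     * Option 2: new GUCs with a high base threshold (on the order of
     *           10 million tuples, i.e. roughly "every gigabyte" for
     *           ~100-byte rows) and a zero scale factor.
     * Option 3: new GUCs with a low base threshold and a high scale
     *           factor.
     */
    static bool
    insert_vacuum_needed(double inserts_since_vacuum,
                         double reltuples,      /* table size in tuples */
                         double base_threshold,
                         double scale_factor)
    {
        return inserts_since_vacuum >
               base_threshold + scale_factor * reltuples;
    }

With a zero scale factor the trigger fires after a fixed number of
inserted tuples no matter how large the table is; under option 3 the
cutoff grows with reltuples, which is exactly the "the bigger the
table, the longer we wait" behavior criticized above.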
Thanks for the feedback. It looks like we have a loose consensus
on #2, i.e. my patch.

Yours,
Laurenz Albe