On Sun, Mar 4, 2018 at 3:18 PM, David Gould <da...@sonic.net> wrote: > On Sun, 4 Mar 2018 07:49:46 -0800 > Jeff Janes <jeff.ja...@gmail.com> wrote: > > > On Wed, Jan 17, 2018 at 4:49 PM, David Gould <da...@sonic.net> wrote: > ... > > > > Maybe a well-timed crash caused n_dead_tup to get reset to zero and that > is > > why autovac is not kicking in? What are the pg_stat_user_table number > and > > the state of the visibility map for your massively bloated table, if you > > still have them? > > ... >
> The main pain points are that when reltuples gets inflated there is no way > to fix it, auto vacuum stops looking at the table and hand run ANALYZE > can't > reset the reltuples. The only cure is VACUUM FULL, but that is not really > practical without unacceptable amounts of downtime. > But why won't an ordinary manual VACUUM (not FULL) fix it? That seems like that is a critical thing to figure out. As for preventing it in the first place, based on your description of your hardware and operations, I was going to say you need to increase the max number of autovac workers, but then I remembered you from "Autovacuum slows down with large numbers of tables. More workers makes it slower" ( https://www.postgresql.org/message-id/20151030133252.3033.4249%40wrigleys.postgresql.org). So you are probably still suffering from that? Your patch from then seemed to be pretty invasive and so controversial. I had a trivial but fairly effective patch at the time, but it now less trivial because of how shared catalogs are dealt with (commit 15739393e4c3b64b9038d75) and I haven't rebased it over that issue. Cheers, Jeff