On Tue, Nov 2, 2021 at 11:50 AM Robert Haas <robertmh...@gmail.com> wrote:
> I almost proposed 1m rather than 10m, but then I thought the better of
> it. I think it's unlikely that an autovacuum that takes 1 minute is
> really the cause of some big problem you're having on your system.
> Typical problem cases I see are hours or days long, so even 10 minutes
> is pretty short.

I'm talking about the autoANALYZE part, not VACUUM. In my case, it was
a few tables ~100GB-1TB in size, each with 1-2 GIN indexes (with
fastupdate and the default pending list size limit, 4MB), and 10 workers
with a quite high bar in terms of throttling. And
default_statistics_target = 1000.

Observed autoANALYZE timing reached dozens of minutes, sometimes ~1 hour
for a single table. The problem is that, it seems, ANALYZE (unlike
VACUUM) holds a snapshot and takes an XID, and all of this leads to
issues on standbys if it runs that long. I'm going to post the findings
in a separate thread, but the point is that an autoANALYZE running for
minutes *may* cause big performance issues.

That's why 1m seems a good threshold to me, even if it leads to having
3 log entries per minute from 3 workers. That is quite low log traffic,
and the data in those entries is really useful for retrospective
analysis.