Hi,

On Wed, Mar 19, 2025 at 09:53:37AM +0100, Christophe Pettus wrote:
> We're tracking down an issue that we've seen in two separate
> installations so far, which is that, at the very end of a vacuum, the
> vacuum operation starts using *very* high levels of CPU and
> (sometimes) I/O, often to the point that the system becomes unable to
> service other requests. We've seen this on versions 15, 16, and 17 so
> far.

Ouch.

> The common data points are:
>
> 1. The table being vacuumed is large (>250 million rows, often in
>    the >10 billion row level).
> 2. The table has a relatively high churn rate.
> 3. The number of updated / deleted rows before that particular vacuum
>    cycle are very high.
>
> Everything seems to point to the vacuum free space map operation,
> since it would have a lot of work to do in that particular situation,
> it happens at just the right place in the vacuum cycle, and its
> resource consumption is not throttled the way the regular vacuum
> operation is.

Independent of throttling, if it turns out free space map vacuum is
indeed the culprit, I think it would make sense to add that one as a
dedicated phase so it can be more easily tracked in
pg_stat_progress_vacuum etc.


Michael