The more I think about this vacuum i/o problem, the more I think we have it wrong. The added i/o from vacuum really ought not to be any worse than a single full table scan. And there is probably the occasional query doing a full table scan already on those systems.
For the folks having this issue: if you run "select count(*) from bigtable", is there as big a hit to transaction performance? On the other hand, does the vacuum performance hit kick in right away, or only after it's been running for a bit?

I think the other factor mentioned is actually the main problem: cache. Vacuum basically kills the kernel buffer cache by reading in every block of every table in the system. The difference between vacuum and a single "select count(*)" is that vacuum does all the tables one after another, eventually overrunning the total cache available.

If it's just a matter of all the read i/o from vacuum, then we're best off sleeping for a few milliseconds every few kilobytes. If it's the cache, then we're probably better off reading a few megabytes and then sleeping for several seconds, to allow the other buffers to get touched and pushed back to the front of the LRU.

Hm, I wonder if the amount of data to read between sleeps should be something like 25% of effective_cache_size, for example.

--
greg
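
To make the two sleep policies concrete, here is a rough Python sketch of the loop I have in mind. It is not vacuum code, just an illustration. The names throttled_scan, io_bound_policy and cache_bound_policy are made up, the particular numbers (32 kB, 5 ms, 5 s) are placeholder guesses for "a few kilobytes", "a few milliseconds" and "several seconds", and the cache size is passed in as bytes purely for simplicity. The 8 kB block size is the default PostgreSQL page size.

```python
import time

BLOCK_SIZE = 8192  # default PostgreSQL page size, in bytes


def throttled_scan(num_blocks, bytes_between_sleeps, sleep_seconds,
                   read_block, sleep=time.sleep):
    """Read blocks 0..num_blocks-1 in order, pausing each time at least
    bytes_between_sleeps bytes have been read since the last pause.

    read_block and sleep are passed in so the loop can be exercised
    without touching a disk or actually sleeping (hypothetical hooks).
    """
    read_since_sleep = 0
    for blkno in range(num_blocks):
        read_block(blkno)
        read_since_sleep += BLOCK_SIZE
        if read_since_sleep >= bytes_between_sleeps:
            sleep(sleep_seconds)
            read_since_sleep = 0


def io_bound_policy():
    """If raw read i/o is the problem: short naps, very often.
    Returns (bytes_between_sleeps, sleep_seconds); values are guesses."""
    return 32 * 1024, 0.005


def cache_bound_policy(effective_cache_bytes, fraction=0.25):
    """If cache eviction is the problem: read a large chunk, then pause
    long enough for other buffers to get touched again.
    The chunk is a fraction of the cache size; values are guesses."""
    return int(effective_cache_bytes * fraction), 5.0
```

With a 1 GB cache, cache_bound_policy gives a 256 MB chunk between pauses, which is far more than "a few megabytes", so the fraction would likely need tuning or a cap. Which policy actually helps depends on the answers to the questions above.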