On Mon, Feb 3, 2014 at 8:55 AM, Robert Haas <robertmh...@gmail.com> wrote:
> I've also had some further thoughts about the right way to drive
> vacuum scheduling.  I think what we need to do is tightly couple the
> rate at which we're willing to do vacuuming to the rate at which we're
> incurring "vacuum debt".  That is, if we're creating 100kB/s of pages
> needing vacuum, we vacuum at 2-3MB/s (with default settings).

If we can tolerate 2-3MB/s without adverse impact on other work, then
we can tolerate it.  Do we gain anything substantial by sand-bagging
it?

> If
> we're creating 10MB/s of pages needing vacuum, we *still* vacuum at
> 2-3MB/s.  Not shockingly, vacuum gets behind, the database bloats, and
> everything goes to heck.

(Your reference to bloat made me think your comments here are about
vacuuming in general, not specific to IOS.  If that isn't the case,
then please ignore.)

If we can only vacuum at 2-3MB/s without adversely impacting other
activity, but we are creating 10MB/s of future vacuum need, then there
are basically two possibilities I can think of.  Either the 10MB/s
represents a spike, and vacuum should tolerate it and hope to catch up
on the debt later.  Or it represents a new permanent condition, in
which case I bought too few hard drives for the workload, and no
scheduling decision that autovacuum can make will save me from my
folly.

Perhaps there is some middle ground between those possibilities, but I
don't see room for much.  I guess there might be entirely different
possibilities not between those two; for example, I don't realize I'm
doing something that is generating 10MB/s of vacuum debt, and would
like to have that activity automatically throttled to the point where
it doesn't interfere with other processes (either directly, or
indirectly through bloat).

> The rate of vacuuming needs to be tied
> somehow to the rate at which we're creating stuff that needs to be
> vacuumed.  Right now we don't even have a way to measure that, let
> alone auto-regulate the aggressiveness of autovacuum on that basis.
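(As an aside, the 2-3MB/s figure can be reproduced from the stock
cost-based delay settings.  A rough sketch, using the default GUC
values of this era and ignoring buffer hits entirely, so treat the
numbers as illustrative rather than exact:)

```python
# Back-of-the-envelope arithmetic for the throttled autovacuum rate
# under the default cost-based delay settings.  Illustrative only: it
# ignores vacuum_cost_page_hit and assumes the worker always spends
# its full budget each round.

PAGE_SIZE = 8192          # bytes per heap page
COST_LIMIT = 200          # vacuum_cost_limit: credits spent before sleeping
COST_DELAY = 0.020        # autovacuum_vacuum_cost_delay: seconds slept per round
COST_PAGE_MISS = 10       # vacuum_cost_page_miss: charge for a page read from disk
COST_PAGE_DIRTY = 20      # vacuum_cost_page_dirty: extra charge for dirtying a page

credits_per_sec = COST_LIMIT / COST_DELAY     # 10,000 credits/s

# Rate if every page must be read from disk:
read_only = credits_per_sec / COST_PAGE_MISS * PAGE_SIZE / 1e6
# Rate if every page is both read and dirtied:
read_dirty = credits_per_sec / (COST_PAGE_MISS + COST_PAGE_DIRTY) * PAGE_SIZE / 1e6

print(f"{read_only:.1f} MB/s read-only, {read_dirty:.1f} MB/s read+dirty")
```

which lands in the quoted 2-3MB/s ballpark once most of the pages
being visited get dirtied.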
There is the formula used to decide when a table gets vacuumed.  Isn't
the time delta in this formula a measure of how fast we are creating
stuff that needs to be vacuumed, at least for bloat reasons?  Is your
objection that it doesn't include other reasons we might want to
vacuum, that it doesn't work very well, or that it is not explicitly
exposed?

> Similarly, for marking of pages as all-visible, we currently make the
> same decision whether the relation is getting index-scanned (in which
> case the failure to mark those pages all-visible may be suppressing
> the use of index scans or making them less effective) or whether it's
> not being accessed at all (in which case vacuuming it won't help
> anything, and might hurt by pushing other pages out of cache).

If it is not getting accessed at all because the database is not very
active right now, that would be the perfect time to vacuum it.
Between "I can accurately project current patterns of (in)activity
into the future" and "people don't build large tables just to ignore
them forever", I think the latter is more likely to be true.  If the
system is busy but this particular table is not, then that would be a
better reason to de-prioritise vacuuming that table.  But can this
degree of reasoning really be implemented in a practical way?  In
core?

> Again,
> if we had better statistics, we could measure this - counting heap
> fetches for actual index-only scans plus heap fetches for index scans
> that might have been planned index-only scans but for the relation
> having too few all-visible pages doesn't sound like an impossible
> metric to gather.

My experience has been that if too few pages are all-visible, the
planner generally switches to a seq scan, not an index scan of a
different index.
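(To illustrate that seq-scan switch: the planner discounts an
index-only scan's heap fetches by the table's all-visible fraction, so
as that fraction drops the estimated cost climbs until a seq scan
wins.  The cost model and all the numbers below are a deliberately
simplified sketch, not PostgreSQL's actual costing:)

```python
# Simplified sketch of why a low all-visible fraction flips the plan
# to a seq scan.  Heap pages are only charged for the fraction of the
# table not covered by the visibility map; everything else here
# (page counts, cost constants) is made up for illustration.

SEQ_PAGE_COST = 1.0       # sequential page read
RANDOM_PAGE_COST = 4.0    # random page read

def index_only_cost(index_pages, heap_pages, allvis_frac):
    # Index pages are always read; heap pages are fetched only for
    # tuples whose visibility can't be proven from the visibility map.
    return (index_pages * RANDOM_PAGE_COST
            + heap_pages * (1.0 - allvis_frac) * RANDOM_PAGE_COST)

seq_scan_cost = 10_000 * SEQ_PAGE_COST    # hypothetical 10,000-page table

for frac in (1.0, 0.9, 0.5, 0.0):
    ios = index_only_cost(index_pages=500, heap_pages=5_000,
                          allvis_frac=frac)
    winner = "index-only" if ios < seq_scan_cost else "seq scan"
    print(f"allvisible={frac:.0%}: index-only cost {ios:.0f} -> {winner}")
```

With these toy numbers the "index-only" plan wins at 100% and 90%
all-visible but loses to the seq scan at 50% and below, which matches
the behavior described above.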
But many things that are semantically possible to run as index-only
scans would never be planned that way even if allvisible were 100%, so
I think it would have to do two planning passes: one with the real
allvisible, and a hypothetical one with allvisible set to 100%.  And
then there is the possibility that, while a high allvisible would be
useful, the table is so active that no amount of vacuuming could ever
keep it high.

Cheers,

Jeff