On Mon, May 9, 2011 at 10:36 PM, Merlin Moncure <mmonc...@gmail.com> wrote: >> 1. The visibility map needs to be crash-safe. The basic idea of >> index-only scans is that, instead of checking the heap to find out >> whether each tuple is visible, we first check the visibility map. If >> the visibility map bit is set, then we know all tuples on the page are >> visible to all transactions, and therefore the tuple of interest is >> visible to our transaction. Assuming that a significant number of >> visibility map bits are set, this should enable us to avoid a fair >> amount of I/O, especially on large tables, because the visibility map >> is roughly 8000 times smaller than the heap, and therefore far more >> practical to keep in cache. > > hm, what are the implications for tuple hint bits, short and long > term? I'm particularly interested if you think any hint bit i/o > mitigation strategies are worth pursuing.
Well, I don't really want to let this thread on my project get hijacked to talk about your project (not that I haven't been guilty of that myself!) but, in brief, I think the main effect of index-only scans is that the performance difference between a vacuumed table and an unvacuumed table is going to increase. It's already the case that sequential scanning a table which has been vacuumed (and, therefore, all the pages are marked all-visible) is noticeably faster than sequential scanning a table which is not vacuumed (even if all the hint bits are set). Index-only scans are going to extend that by making index scans run faster on a table with lots of all-visible tables than on one where no pages are all-visible. So the importance of vacuuming an insert-only table occasionally (which autovacuum won't do, at present, until it's needed to prevent XID wraparound) is already more than zero, and it's going to go up. But the all-visible bits aren't quite the same as hint bits: I don't think there's any impact on hint bits per se. >> 2. Crash safe visibility map vs. pg_upgrade. Even if we make the >> visibility map crash-safe in 9.2, people are going to want to use >> pg_upgrade to migrate from older versions, bringing their >> possibly-not-quite-correct visibility map forks along with them. How >> should we handle that? We could (2A) arrange to have pg_upgrade nuke >> all visibility forks when upgrading from a release where the >> visibility map is not crash-safe to one where it is; > > +1 on 2A. OK. Anybody else? >> 3. Statistics. I believe that in order to accurately estimate the >> cost of an index-only scan, we're going to need to know the fraction >> of tuples that are on pages whose visibility map bits are set. > > It would be helpful to know the performance benefit of index only > scans before knowing how much benefit to attribute here. Maybe a > system wide kludge would for starters anyway, like assuming 60% of > pages can be vis checked from the VM, or a single GUC, Then again, > maybe not. Yeah, maybe I should try to beat the main patch into some kind of shape before working too much on the statistics stuff. Then we could actually benchmark it a bit, which would be good. I don't think that a system-wide kludge or GUC is going to work for prime time, but it's probably fine for initial performance testing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers