On Mon, Oct 24, 2011 at 3:35 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Your point about people trying to create wider indexes to exploit
> index-only scans is an interesting one, but I think it's premature to
> optimize on the basis of hypotheses about what people might do in
> future.
Well, I don't think it's too much of a stretch to guess that people will
try to use covering indexes; that's common practice on other products,
and a not-uncommon heartfelt request from people with large
(non-memory-resident) databases they want to migrate to PostgreSQL.
Exactly to what degree they'll do that, and how well it will work, is
another question. But I have little doubt that it will be tried.

> Not sure about your other idea of returning multiple tuples per
> amgettuple call.  The trouble with that is that it will add complexity
> (and hence cycles) at the nodeIndexscan level, because now nodeIndexscan
> will have to buffer those tuples, keep track of whether it's fetching
> forward or backward, etc etc.  Plus another layer of the same in
> indexam.c (index_getnext etc).  I'm not at all convinced that it's
> likely to be a net win.

I definitely agree that you don't want two layers of caching, but I
don't see why we'd need that. I wasn't thinking of changing
index_getnext() at all, but rather adding a new API that fills a buffer
(or a pair of buffers, maybe) with index tuples and heap TIDs. It should
spit them out in the same order that multiple index_getnext() calls
would have done, and leave the scan position wherever it would have
ended up after a number of index_getnext_tid() calls equal to the number
of TIDs returned. Any user of the API (and it might be just
nodeIndexonlyscan.c) would just need to keep track of the number of
tuples returned and the number consumed to date.

This actually gets into a wider architectural discussion, which is
whether the fact that the whole executor (modulo bitmap scans and a few
other special cases) operates on one tuple at a time is a good
design... but my brain hurts just thinking about that.

> I wonder how trustworthy the measure of the visibilitymap_test call site
> as a consumer of cycles really is.
> I've frequently noticed that
> oprofile blames remarkably large fractions of the runtime on individual
> statements that appear to be quite trivial.  I'm not sure if that
> represents real hardware-level effects such as cache line switching,
> or whether it's just measurement artifacts.  Keep in mind that
> sampling-based measurements are always subject to sampling artifacts.

I'm not sure either. I guess we could try short-circuiting
visibilitymap_test and see what that does to performance (let's leave
correct answers out of it).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (email@example.com)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers