On Wed, 2007-08-15 at 06:51 +0100, Heikki Linnakangas wrote: > Jeff Davis wrote: > > On Tue, 2007-08-14 at 16:27 -0500, Decibel! wrote: > >> Isn't this what Grouped Index Tuples is? > > > > http://community.enterprisedb.com/git/git-readme.txt > > > > It looks like GIT is a little different. > > > > GIT actually stores a lower-bound key of a contiguous* range of keys > > that all point to the same page, and for each of those ranges stores a > > bitmap of page offsets. A search searches first for an exact match in > > the index, and failing that, looks to see if the previous index tuple > > happens to be one of these ranges. > > > > The algorithm Chris is talking about stores a set of tuple ids (which > > include page and offset) for each distinct key. > > Right. > > > Both could be helpful, although I don't think they can work together > > very well. > > What Chris is suggesting is basically a special case of GIT, where all > the heap tuples represented by an index tuple have the same key. I was > actually thinking of adding a flag to index tuples to indicate that > special case in GIT. We could effectively do both.
A few additional thoughts... The approach suggested by Chris is also used by Teradata Non-Unique Secondary Indexes, known as NUSIs (but not named by me!). The following link is a public domain description that is detailed enough: http://teradata.uark.edu/research/wang/indexes.html Replicating this approach directly isn't that useful because it would interact badly with the way we handle updates. Thinking about the basic design pattern however, we can envisage a type of index that changes the 1:1 mapping between index and heap tuple into the concept of a "tuple set index" where we have a 1:Many mapping between index and heap. In general, the tuple set index approach can significantly reduce index size. This won't be of interest to anyone unless all of your data overflows RAM and you need to do I/O to constantly page back in pieces of your data. If your database does fit in RAM, reducing the number of index blocks might just increase contention. This means that the tuple set approach is only useful when you have very large databases, but when you do it is very, very useful. GIT is a tuple set index with two important extensions: 1. GIT allows a range of values to be indexed, not just a single value, so this allows it to be useful for both Unique and Non-Unique cases. 2. GIT restricts the set of tuples to a small range of blocks within the table. Making the range of blocks = 1 means that GIT is designed to work well with HOT, which also stays on the same block. Keeping the range of blocks small means GIT degrades as slowly as possible in the face of "cold" UPDATEs. If the values are inserted in roughly ordered/clustered sequence then this doesn't increase index size at all, so the most common/highest volume use cases are covered. So from my perspective, GIT is very close to the optimal design for a tuple set index that addresses the need for high concurrency and unique/near-uniqueness with PostgreSQL. There are certainly many other options for a tuple set index, and bitmap indexes are simply another version of a tuple set index but with different behaviours tuned to a different use case. There maybe other use cases that require more than two kinds of tuple set index...and we have discussed implementing the tuple set behaviour as part of the other main index types. As an aside, it turns out, after further research that GIT is similar to a clustered index in SQLServer, as described by Louis Davidson, "Pro SQL Server 2005 Database Design and Optimization", p.405. SQLServer creates a clustered index by default on each PK, so it says. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org