I've been experimenting with the idea of a so-called Block B-Tree. The basic idea is that instead of storing an index tuple for each heap tuple, we store an index tuple for each heap block. This dramatically reduces the size of an index, leading to savings on I/O. This idea was briefly discussed in January: http://archives.postgresql.org/pgsql-hackers/2006-01/msg00565.php

To make it actually work, the semantics of the B-Tree has been modified so that every index tuple represents 1 or more heap tuples that fall within some range of values, and are on the same heap page. The range that an index tuple represents is from X, inclusive, to Y, exclusive, where X is the key of the index tuple and Y is the key of the *next* index tuple in the index. If the heap is in index order (as after CLUSTER), we get a very compact index this way, effectively eliminating the leaf level of the B-tree.

To locate the actual matching items on the heap page, we have to scan the heap page because we don't have the item ids, so this is a tradeoff between CPU and I/O. However, we could have a hybrid where we initially store heap tuple pointers like we do now, and when there's more than X consecutive pointers to the same page, we collapse them to just one pointer to the whole page. This would potentially give us the best of both worlds.

This design is more flexible and less invasive than true index-organized-tables, because it doesn't require perfect ordering of the heap or moving heap tuples around. You can also define than one Block B-Tree on a table, though you wouldn't get much benefit from using Block B-Tree instead of normal B-Tree if there's no correlation between the index order and the heap order.

It also fits in nicely with the bitmap scans, since there's already support for "lossy" bitmap pages. Doing normal ordered index scans requires some coding, but can be done.

Thoughts? I'm thinking of getting this in early in the 8.3 cycle. We'll have to take a look at the indexam API to support both bitmap indexes and block B-trees nicely.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Reply via email to