On 12/4/06 8:22 AM, "Heikki Linnakangas" <[EMAIL PROTECTED]> wrote:
> Gavin Sherry wrote: >> Hi all, >> >> Attached is a patch implementing bitmap indexes. It includes major >> enhancements on the patch submitted during feature freeze for 8.2 here[1]. >> >> In particular: much better integration with the existing bitmap scan code >> with the internals of the bitmap streaming pushed down into the AM and >> hidden from the executor code; completely new index creation algorithm >> which reduced creation time by 20-75% depending on the data; modifications >> to the encoding mechanism to suit the integration with bitmap index scans; >> work on memory management; lots of code rewriting; range query support. >> The code is also much cleaner now. > > Thanks! I'll take a look at it. Thanks! > > We need to give the indexam API some further thought. As you know, I've > been working on the Grouped Index Tuples stuff, which also requires > changes to the API to get full benefit. There's a bunch of functionality > I'd like to see: > > * Support for streamed bitmaps, like you have implemented. Yes. It is also interesting to see how this type of stream bitmaps works the one from the bitmap index. > > * Support for candidate matches. This is needed for GIT, as well as > range-encoded bitmap indexes if/when we add them. > We have replace amgetmulti with amgetbitmap. This should be supported by setting PagetableEntry accordingly. > * Support for returning tuples in partial order. This is again needed > for GIT, because grouped tuples don't keep track of the ordering within > the group, so they need to be sorted if ordering necessary. And again > it's also useful to return tuples in order from range-encoded bitmaps. The bitmap index does not guarantee that returning tuples are ordered, but it guarantees that the tuples from the same heap page will be returned consecutively. I don't quite understand why this is useful for range-encoding bitmaps in particular. > > * Support for kill_prior_tuple on bitmap scans. > > * A bulk insert API. When inserting a lot of tuples with similar keys, > we could a considerable amount of CPU with a bulk insert API. A bulk > insert to a B-tree for example would only need to descend the tree once, > find the insert location, lock the target page just once and insert all > the tuples that belong to that page. That would potentially also reduce > WAL traffic. > Yes. Currently during the bitmap index creation, we maintain a buffer to buffer many inserting tuples before writing them to bitmap pages, which improves the creation performance by 30%-200% depending on the cardinalities of the attribute to be indexed. This bulk insert API can take advantage of this as well. Thanks, Jie ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq