Re: [PATCHES] On-disk bitmap index implementation

Jie Zhang Mon, 04 Dec 2006 10:20:08 -0800

On 12/4/06 8:22 AM, "Heikki Linnakangas" <[EMAIL PROTECTED]> wrote:


> Gavin Sherry wrote:
>> Hi all,
>> 
>> Attached is a patch implementing bitmap indexes. It includes major
>> enhancements on the patch submitted during feature freeze for 8.2 here[1].
>> 
>> In particular: much better integration with the existing bitmap scan code
>> with the internals of the bitmap streaming pushed down into the AM and
>> hidden from the executor code; completely new index creation algorithm
>> which reduced creation time by 20-75% depending on the data; modifications
>> to the encoding mechanism to suit the integration with bitmap index scans;
>> work on memory management; lots of code rewriting; range query support.
>> The code is also much cleaner now.
> 
> Thanks! I'll take a look at it.

Thanks!

> 
> We need to give the indexam API some further thought. As you know, I've
> been working on the Grouped Index Tuples stuff, which also requires
> changes to the API to get full benefit. There's a bunch of functionality
> I'd like to see:
> 
> * Support for streamed bitmaps, like you have implemented.

Yes. It is also interesting to see how this type of stream bitmaps works the
one from the bitmap index.

> 
> * Support for candidate matches. This is needed for GIT, as well as
> range-encoded bitmap indexes if/when we add them.
>

We have replace amgetmulti with amgetbitmap. This should be supported by
setting PagetableEntry accordingly.
 
> * Support for returning tuples in partial order. This is again needed
> for GIT, because grouped tuples don't keep track of the ordering within
> the group, so they need to be sorted if ordering necessary. And again
> it's also useful to return tuples in order from range-encoded bitmaps.

The bitmap index does not guarantee that returning tuples are ordered, but
it guarantees that the tuples from the same heap page will be returned
consecutively. I don't quite understand why this is useful for
range-encoding bitmaps in particular.

> 
> * Support for kill_prior_tuple on bitmap scans.
> 
> * A bulk insert API. When inserting a lot of tuples with similar keys,
> we could a considerable amount of CPU with a bulk insert API. A bulk
> insert to a B-tree for example would only need to descend the tree once,
> find the insert location, lock the target page just once and insert all
> the tuples that belong to that page. That would potentially also reduce
> WAL traffic.
> 

Yes. Currently during the bitmap index creation, we maintain a buffer to
buffer many inserting tuples before writing them to bitmap pages, which
improves the creation performance by 30%-200% depending on the cardinalities
of the attribute to be indexed. This bulk insert API can take advantage of
this as well.

Thanks,
Jie



---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq

Re: [PATCHES] On-disk bitmap index implementation

Reply via email to