Re: [HACKERS] GIT patch

Heikki Linnakangas Wed, 01 Aug 2007 12:52:13 -0700

Alvaro Herrera wrote:
> I've started reading the GIT patch to see if I can help with the review.


Thanks.

> First thing I notice is that there are several things that seems left
> over; for example the comments in pg_proc for the new functions are
> incomplete.
> 
> ...
> I'm also finding a certain lack of code commentary that makes the
> reviewing a bit harder.

Sorry about that.

As the patch stands, I tried to keep it as non-invasive as possible,
with minimum changes to existing APIs. That's because in the winter we
were discussing changes to the indexam API to support the bitmap index
am, and also GIT. I wanted to just have a patch to do performance
testing with, without getting into the API changes.

I've been reluctant to spend time to clean up the code and comments,
knowing that that's going to change a lot, depending on what kind of an
API we settle on and what capabilities we're going to have in the
executor. And also because there was no acceptance of even the general
design, so it might just be rejected. Please read the discussions on the
thread "bitmapscan changes":

http://archives.postgresql.org/pgsql-patches/2007-03/msg00163.php

There's basically three slightly alternative designs:

1. A grouped index tuple contains a bitmap of offsetnumbers,
representing a bunch of heap tuples stored on the same heap page, that
all have a key between the key stored on the index tuple and the next
index tuple. We don't keep track of the ordering of the heap tuples
represented by one group index tuple. When doing a normal, non-bitmap,
index scan, they need to be sorted. This is what the patch currently
implements.

2. Same as 1, but instead of storing the offsetnumbers in a bitmap,
they're sorted in a list (variable-sized array, really), which keeps the
ordering between the tuples intact. No sorting needed on index scans,
and we can do binary searches using the list. But takes more space than
a bitmap.

3. Same as 1, but mandate that all the heap tuples that are represented
by the same grouped index tuple must be in index order on the heap page.
If an out-of-order tuple is inserted, we need to split the grouped index
tuple into two groups, to uphold that invariant. No sorting needed on
index scans and we can do binary searches. But takes more space when the
heap is not perfectly in order and makes the index to degrade into a
normal b-tree more quickly when the table is updated.

I'm leaning towards 2 or 3 myself at the moment, to keep it simple. In
any case.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq

Re: [HACKERS] GIT patch

Reply via email to