Re: [HACKERS] Reviewing new index types (was Re: [PATCHES] Updatedbitmap indexpatch)

Simon Riggs Mon, 23 Jul 2007 15:14:46 -0700

On Mon, 2007-07-23 at 17:19 -0400, Tom Lane wrote:
> "Simon Riggs" <[EMAIL PROTECTED]> writes:
> > ... BMI is not useful at all
> > for PKs, whilst GIT is specifically designed to handle them.
> 
> This seems a strange statement, because GIT doesn't look particularly
> efficient for unique indexes AFAICS.  In the worst case you'd have to
> look individually at each tuple on a heap page to check for uniqueness
> conflict (no binary search, because you couldn't assume they are
> ordered).


That is one of a few heuristics about the patch that need some active
discussion, so I'm glad you asked.

The main use case is nearly-unique, so for cases where we have a
Master:Detail relationship, e.g. Order:OrderItem. The Order index is a
PK, with the OrderItem index as a nearly unique key. The index is not
brilliant for the Order index, but is good for the OrderItem index.

Heikki designed the grouping so that there is a state change between
non-grouped and non-grouped (normal) index entries. By default the patch
uses a threshold of non-grouped -> grouped at N=2 index entries and then
no limit on the number of rows/block. Currently you can tune N, but we
might also envisage setting a limit on the width of the range of values
to limit the number of tids stored in a grouped index entry. That could
control the uniqueness overhead.

On an I/O bound workload the space saving on the index outweighs the CPU
loss from uniqueness checking. When I/O is not an issue then
unfortunately there is a CPU overhead.

For GIT it would appear that the summary is that it gives a slight loss
on medium sized PK indexes and an increasing win as index size
increases. We struggled to come up with ways of making it Just Work with
as few parameters as possible.

> > B-TREE INDEXES (Integers) 
> 
> > Rows/value  Best time       Size in blocks
> > 10000000    49s             21899
> > 1000000             49s             21899
> > 100000              49s             21899
> > 10000               47s             21899
> > 1000                43s             21899
> > 100         38s             21899
> > 10          38s             21899
> > 1           33s             21899
> 
> Surely the GIT code failed to kick in at all here?  That's just about
> exactly the index size I'd expect for 10 million integers with the
> existing btree code (at least when MAXALIGN=4).

That was the b-tree test, i.e. the control. The GIT patch has bitrot, so
not able to test just yet.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] Reviewing new index types (was Re: [PATCHES] Updatedbitmap indexpatch)

Reply via email to