Re: [PATCHES] Bitmapscan changes

Heikki Linnakangas Mon, 12 Mar 2007 10:40:35 -0800

Tom Lane wrote:

Heikki Linnakangas <[EMAIL PROTECTED]> writes:
Tom Lane wrote:
In the second place, this seems to
forever kill the idea of indexscans that don't visit the heap --- not
that we have any near-term prospect of doing that, but I know a lot of
people remain interested in the idea.
I'm certainly interested in that. It's not really needed for clustered
indexes, though. A well-clustered index is roughly one level shallower,
and the heap effectively is the leaf level, so the amount of I/O you
need to fetch the index tuple + heap tuple is roughly the same as
fetching just the index tuple from a normal b-tree index.
That argument ignores the fact that the heap entries are likely to be
much wider than the index entries, due to having other columns in them.

True, that's the "roughly" part. It does indeed depend on your schema.
As a data point, here are the index sizes (in pages) of a 140-warehouse
TPC-C database:


 index name      normal  grouped  % of normal size
--------------------------------------------------
 i_customer       31984    29250   91.5%
 i_orders         11519    11386   98.8%
 pk_customer      11519     1346   11.6%
 pk_district          6        2
 pk_item            276       10    3.6%
 pk_new_order      3458       42    1.2%
 pk_order_line   153632     2993    1.9%
 pk_orders        11519      191    1.7%
 pk_stock         38389     2815    7.3%
 pk_warehouse         8        2
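
As a rough illustration of how the last column follows from the two
page counts, here is a minimal Python sketch. It is not part of the
original measurement: the names INDEX_PAGES and grouped_ratio are made
up for this example, and the page counts are simply copied from the
table. The rows left blank in the table are computed the same way, and
the last digit may differ from the figures quoted because of rounding.

```python
# Page counts copied from the table above: name -> (normal, grouped).
# INDEX_PAGES and grouped_ratio are illustrative names, not PostgreSQL APIs.
INDEX_PAGES = {
    "i_customer": (31984, 29250),
    "i_orders": (11519, 11386),
    "pk_customer": (11519, 1346),
    "pk_district": (6, 2),
    "pk_item": (276, 10),
    "pk_new_order": (3458, 42),
    "pk_order_line": (153632, 2993),
    "pk_orders": (11519, 191),
    "pk_stock": (38389, 2815),
    "pk_warehouse": (8, 2),
}


def grouped_ratio(normal_pages, grouped_pages):
    """Return the grouped index size as a percentage of the normal size."""
    return 100.0 * grouped_pages / normal_pages


if __name__ == "__main__":
    # Print one line per index: name, both page counts, and the ratio.
    for name, (normal, grouped) in INDEX_PAGES.items():
        ratio = grouped_ratio(normal, grouped)
        print(f"{name:<14}{normal:>8}{grouped:>9}{ratio:>8.1f}%")
```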

The customer table is an example of a pretty wide table; there are only
~12 tuples per page. pk_customer still benefits a lot. i_customer and
i_orders don't benefit because the tables are not in the index order.
The orders-related indexes see the most benefit; they don't have many
columns.

I think we could still come up with some safe conditions under which we
could enable it by default, though.


At this point I'm feeling unconvinced that we want it at all.  It's
sounding like a large increase in complexity (both implementation-wise
and in terms of API ugliness) for a fairly narrow use-case --- just how
much territory is going to be left for this between HOT and bitmap indexes?

I don't see how HOT overlaps with clustered indexes. On the contrary,
it makes clustered indexes work better, because it reduces the number
of index inserts needed and helps to keep a table clustered.

The use cases for bitmap indexes and clustered indexes do overlap
somewhat. But clustered indexes have an edge because:

- there's no requirement of having only a small number of distinct values
- they support uniqueness checks
- you can efficiently have a mixture of grouped and non-grouped tuples,
  if your table is only partly clustered

In general, clustered indexes are more suited for OLTP work than bitmap
indexes.

I particularly dislike the idea of having the index AM reaching directly
into the heap --- we should be trying to get rid of that, not add more
cases.

I agree. The right way would be to add support for partial ordering and
candidate matches to the indexam API, and move all the sorting and
related ugliness out of the indexam. That's been on my TODO since the
beginning.

If you're still not convinced that we want this at all, how would you
feel about the other approach I described? The one where the
in-heap-page order is stored in the index tuples, so there's no need
for sorting, at the cost of losing part of the I/O benefit.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

