Re: [PATCHES] Bitmapscan changes

Heikki Linnakangas Mon, 12 Mar 2007 09:54:38 -0800

Tom Lane wrote:

> I'm really dubious that this is an intelligent way to go.  In the first
> place, how will you keep the index sorted if you can't determine the
> values of all the keys?  It certainly seems that this would break the
> ability to have a simple indexscan return sorted data, even if the index
> itself doesn't get corrupted.

That's indeed a very fundamental thing with the current design. The
index doesn't retain the complete order within heap pages. That
information is lost, again in favor of a smaller index size. It incurs a
significant CPU overhead, but on an I/O bound system that's a tradeoff
you want to make.

At the moment, I'm storing the offsets within the heap in a bitmap
attached to the index tuple. btgettuple fetches all the heap tuples
represented by the grouped index tuple, checks their visibility, sorts
them into index order, and returns them to the caller one at a time.
That's ugly, API-wise, because it makes the indexam actually go look
at the heap, which it shouldn't have to deal with.
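
To make the bitmap scheme above a bit more concrete, here is a rough
sketch in plain C. It is not the patch code: HeapItem, GroupedTuple,
ScanItem, expand_grouped_tuple and the 32-slot cap are all made up for
illustration, and the visibility check is reduced to a flag on an
in-memory array standing in for the heap page.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not PostgreSQL structures. */
#define MAX_OFFSETS 32              /* assumed cap so one uint32_t holds the bitmap */

typedef struct HeapItem {
    int key;                        /* value of the indexed column */
    int visible;                    /* stand-in for a real visibility check */
} HeapItem;

typedef struct GroupedTuple {
    uint32_t heap_block;            /* heap page this index tuple covers */
    uint32_t offset_bits;           /* bit i set => slot i on that page matches */
} GroupedTuple;

typedef struct ScanItem {
    uint16_t offset;                /* slot on the heap page */
    int key;                        /* key read back from the heap */
} ScanItem;

/* Order by key, then by offset so that equal keys come out deterministically. */
static int cmp_scan_item(const void *a, const void *b)
{
    const ScanItem *x = a;
    const ScanItem *y = b;

    if (x->key != y->key)
        return (x->key > y->key) - (x->key < y->key);
    return (x->offset > y->offset) - (x->offset < y->offset);
}

/*
 * Expand the bitmap, read each referenced heap slot, skip invisible
 * tuples, and sort the survivors into key order.  The caller supplies
 * room for MAX_OFFSETS entries in out; the return value is the number
 * of entries filled in.
 */
static int expand_grouped_tuple(const GroupedTuple *gt,
                                const HeapItem *heap_page,
                                ScanItem *out)
{
    int n = 0;

    assert(gt && heap_page && out);

    for (uint16_t off = 0; off < MAX_OFFSETS; off++) {
        if (!(gt->offset_bits & (UINT32_C(1) << off)))
            continue;               /* slot not covered by this index tuple */
        if (!heap_page[off].visible)
            continue;               /* the index code has to look at the heap here */
        out[n].offset = off;
        out[n].key = heap_page[off].key;
        n++;
    }
    qsort(out, (size_t) n, sizeof(ScanItem), cmp_scan_item);
    return n;
}
```

The caller would then hand out out[0..n-1] one entry at a time. Note
that the key values only become known once the heap page has been read,
which is exactly the layering problem described above.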

Another approach I've been thinking of is to store a list of offsets, in
index order. That would avoid the problem of returning sorted data, and
reduce the CPU overhead incurred by sorting and scanning, at the cost of
a much larger (but still much smaller than what we have now) index.
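
A minimal sketch of that alternative, again in plain C with invented
names (OrderedGroup, GroupCursor, group_next and the 64-entry cap are
assumptions, as is the 2-byte offset width): the offsets are stored
already in key order, so the scan just walks the array, and two small
helpers show the space cost of a list relative to a bitmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout for illustration; not taken from the patch. */
#define MAX_GROUP 64                /* assumed cap on heap tuples per index tuple */

typedef struct OrderedGroup {
    uint32_t heap_block;            /* heap page this index tuple covers */
    uint16_t noffsets;              /* number of valid entries in offsets[] */
    uint16_t offsets[MAX_GROUP];    /* kept in index (key) order, not slot order */
} OrderedGroup;

typedef struct GroupCursor {
    const OrderedGroup *group;
    uint16_t next;                  /* position of the next entry to return */
} GroupCursor;

/*
 * Return the next heap offset in index order, or -1 when the group is
 * exhausted.  No heap access and no sort are needed to get the order.
 */
static int group_next(GroupCursor *cur)
{
    assert(cur && cur->group);

    if (cur->next >= cur->group->noffsets)
        return -1;
    return cur->group->offsets[cur->next++];
}

/* Bytes needed for a bitmap with one bit per heap slot. */
static size_t bitmap_bytes(size_t nslots)
{
    return (nslots + 7) / 8;
}

/* Bytes needed for an ordered list with one 2-byte offset per tuple. */
static size_t list_bytes(size_t ntuples)
{
    return ntuples * sizeof(uint16_t);
}
```

As a purely illustrative figure, for 64 tuples on a page these helpers
give 8 bytes for the bitmap against 128 bytes for the ordered list.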


> In the second place, this seems to
> forever kill the idea of indexscans that don't visit the heap --- not
> that we have any near-term prospect of doing that, but I know a lot of
> people remain interested in the idea.

I'm certainly interested in that. It's not really needed for clustered
indexes, though. A well-clustered index is roughly one level shallower,
and the heap effectively is the leaf level. Therefore the amount of I/O
you need to fetch the index tuple + heap tuple is roughly the same as
for fetching just the index tuple from a normal b-tree index.


On non-clustered indexes, index-only scans would of course still be useful.

> The reason this catches me by surprise is that you've said several times
> that you intended GIT to be something that could just be enabled
> universally.  If it's lossy then there's a much larger argument that not
> everyone would want it.

Yeah, we can't just always enable it by default. While a clustered index
would degrade to a normal b-tree when the heap isn't clustered, you
would still not want to always enable the index clustering because of
the extra CPU overhead. That has become clear in the CPU bound tests
I've run.

I think we could still come up with some safe conditions when we could
enable it by default, though. In particular, I've been thinking that if
you run CLUSTER on a table, you'd definitely want to use a clustered
index as well.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

