Tom Lane wrote:
I'm really dubious that this is an intelligent way to go. In the first
place, how will you keep the index sorted if you can't determine the
values of all the keys? It certainly seems that this would break the
ability to have a simple indexscan return sorted data, even if the index
itself doesn't get corrupted.
That's indeed a very fundamental thing with the current design. The
index doesn't retain the complete order within heap pages. That
information is lost, again in favor of a smaller index size. It incurs a
significant CPU overhead, but on an I/O bound system that's a tradeoff
you want to make.
At the moment, I'm storing the offsets within the heap in a bitmap
attached to the index tuple. btgettuple fetches all the heap tuples
represented by the grouped index tuple, checks their visibility, sorts
them into index order, and returns them to the caller one at a time.
Thats ugly, API-wise, because it makes the indexam to actually go look
at the heap, which it shouldn't have to deal with.
Another approach I've been thinking of is to store a list of offsets, in
index order. That would avoid the problem of returning sorted data, and
reduce the CPU overhead incurred by sorting and scanning, at the cost of
much larger (but still much smaller than what we have now) index.
> In the second place, this seems to
forever kill the idea of indexscans that don't visit the heap --- not
that we have any near-term prospect of doing that, but I know a lot of
people remain interested in the idea.
I'm certainly interested in that. It's not really needed for clustered
indexes, though. A well-clustered index is roughly one level shallower,
and the heap effectively is the leaf-level, therefore the amount of I/O
you need to fetch the index tuple + heap tuple, is roughly the same that
as fetching just the index tuple from a normal b-tree index.
On non-clustered indexes, index-only scans would of course still be useful.
The reason this catches me by surprise is that you've said several times
that you intended GIT to be something that could just be enabled
universally. If it's lossy then there's a much larger argument that not
everyone would want it.
Yeah, we can't just always enable it by default. While a clustered index
would degrade to a normal b-tree when the heap isn't clustered, you
would still not want to always enable the index clustering because of
the extra CPU overhead. That has become clear in the CPU bound tests
I've run.
I think we could still come up with some safe condiitions when we could
enable it by default, though. In particular, I've been thinking that if
you run CLUSTER on a table, you'd definitely want to use a clustered
index as well.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster