On 6/23/17 10:38 AM, Teodor Sigaev wrote:
1. Table AM with a 6-byte TID.
2. Table AM with a custom locator format, which could be TID-like.
3. Table AM with no locators.
Currently TID has its own type in the system catalog. It seems possible
for a storage to declare which TID type it uses. An AM could then declare
its type too, so that the core could use this information to decide
AM-storage compatibility. A storage could also declare that it has no
TID type at all, so it couldn't be used with any access method; use
case: compressed column-oriented storage.
Isn't the fact that TID is an existing type defined in system catalogs
a fairly insignificant detail? I mean, we could just as easily define
a new 64-bit locator data type, and use that instead, for example.
The main issue here is that we assume things about the TID contents,
i.e. that it contains page/offset etc. And Bitmap nodes rely on that, to
some extent - e.g. when prefetching data.
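
To make the "assumptions about TID contents" concrete, here is a minimal
sketch of a page/offset locator. The struct and function names are
hypothetical and the layout is simplified (a 32-bit block number plus a
16-bit offset); it is not the actual PostgreSQL definition. It only shows
why code such as prefetching can pull a page number straight out of the
locator, and why comparing packed values follows physical order.

```c
#include <stdint.h>

/* Simplified stand-in for a 6-byte heap TID (hypothetical names). */
typedef struct SketchTid
{
    uint32_t block;     /* page number within the relation */
    uint16_t offset;    /* item number within that page */
} SketchTid;

/*
 * Pack into 48 bits of a uint64_t, block in the high bits, so that
 * integer comparison matches (block, offset) ordering.
 */
uint64_t
sketch_tid_pack(SketchTid tid)
{
    return ((uint64_t) tid.block << 16) | tid.offset;
}

/* Reverse of sketch_tid_pack(). */
SketchTid
sketch_tid_unpack(uint64_t packed)
{
    SketchTid tid;

    tid.block = (uint32_t) (packed >> 16);
    tid.offset = (uint16_t) (packed & 0xFFFF);
    return tid;
}

/* What a prefetcher needs from a locator: the page to request. */
uint32_t
sketch_tid_prefetch_block(uint64_t packed)
{
    return (uint32_t) (packed >> 16);
}
```

A locator format without an extractable page number would give
sketch_tid_prefetch_block() nothing meaningful to return, which is the
kind of dependency meant above.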
As I remember, only GIN depends on the TID format; other indexes use it
as an opaque type. Except, at least, btree and GiST - they assume that
internal pointers are the same as outer ones (to the heap).
I think you're probably right - GIN does compress the posting lists by
exploiting the TID redundancy (i.e. its page/offset structure), and I
don't think there are other index AMs doing that.
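
As an illustration of what "exploiting the redundancy" means, here is a
minimal sketch of delta plus variable-byte encoding over a sorted list of
packed locators. This is not GIN's code; all names are hypothetical, and
the locators are assumed to be ascending 64-bit values with the page
number in the high bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Store value using 7 bits per byte; high bit set on all but the last. */
size_t
sketch_varbyte_put(uint64_t value, unsigned char *out)
{
    size_t n = 0;

    while (value >= 0x80)
    {
        out[n++] = (unsigned char) (value | 0x80);
        value >>= 7;
    }
    out[n++] = (unsigned char) value;
    return n;
}

/* Read one value written by sketch_varbyte_put(); returns bytes used. */
size_t
sketch_varbyte_get(const unsigned char *in, uint64_t *value)
{
    uint64_t result = 0;
    int shift = 0;
    size_t n = 0;
    unsigned char byte;

    do
    {
        byte = in[n++];
        result |= (uint64_t) (byte & 0x7F) << shift;
        shift += 7;
    } while (byte & 0x80);

    *value = result;
    return n;
}

/*
 * Encode ascending locators as differences from the previous one.
 * "out" needs room for up to 10 bytes per locator.
 */
size_t
sketch_encode_deltas(const uint64_t *locators, size_t count,
                     unsigned char *out)
{
    uint64_t prev = 0;
    size_t used = 0;

    for (size_t i = 0; i < count; i++)
    {
        used += sketch_varbyte_put(locators[i] - prev, out + used);
        prev = locators[i];
    }
    return used;
}

/* Rebuild the locators by summing the decoded differences. */
size_t
sketch_decode_deltas(const unsigned char *in, size_t count,
                     uint64_t *locators)
{
    uint64_t prev = 0;
    size_t used = 0;

    for (size_t i = 0; i < count; i++)
    {
        uint64_t delta;

        used += sketch_varbyte_get(in + used, &delta);
        prev += delta;
        locators[i] = prev;
    }
    return used;
}
```

Locators on the same or nearby pages differ by small amounts, so most
deltas fit in one or two bytes. With a locator format whose values are
effectively random, the deltas would be large and the scheme would gain
little.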
But I'm not sure we can simply rely on that - it's possible people will
try to improve other index types (e.g. by adding similar compression).
Moreover, we now support extensions defining custom index AMs, and we
have no clue what people may do in those.
So this would clearly require some sort of flag for each index AM.
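
A minimal sketch of what such a flag could look like, purely as an
assumption for discussion: the enum, the struct, the field name and the
check below are all hypothetical and do not correspond to any existing
API. The three locator kinds mirror the three table AM cases listed at
the top.

```c
#include <stdbool.h>

/* Hypothetical classification of what a table AM exposes. */
typedef enum SketchLocatorKind
{
    SKETCH_LOCATOR_TID,     /* case 1: classic 6-byte TID */
    SKETCH_LOCATOR_CUSTOM,  /* case 2: custom locator format */
    SKETCH_LOCATOR_NONE     /* case 3: no locators at all */
} SketchLocatorKind;

/* Hypothetical per-index-AM declaration. */
typedef struct SketchIndexAm
{
    const char *name;
    bool needs_tid_layout;  /* true if the AM interprets page/offset */
} SketchIndexAm;

/* Could the core allow this index AM on a table with this locator kind? */
bool
sketch_index_am_compatible(const SketchIndexAm *am, SketchLocatorKind kind)
{
    if (kind == SKETCH_LOCATOR_NONE)
        return false;           /* nothing for an index to point at */
    if (kind == SKETCH_LOCATOR_TID)
        return true;            /* today's behavior */
    return !am->needs_tid_layout;   /* custom format: opaque users only */
}
```

With a check along these lines, an index AM that treats locators as
opaque could be used with a custom locator format, while one that
exploits the page/offset structure would be rejected at index creation
time.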
Another dubious part - BitmapScan.
It would be really great if you could explain why BitmapScans are
dubious, instead of just labeling them as dubious. (I guess you mean
Bitmap Heap Scans, right?)
I see no conceptual issue with bitmap scans on arbitrary locator types,
as long as there's sufficient locality information encoded in the value.
What I mean by that is that for any two locator values A and B:
(1) if (A < B) then (A is stored before B)
(2) if (A is close to B) then (A is stored close to B)
Without these properties it's probably futile to try to do bitmap scans,
because the bitmap would not result in a mostly sequential access pattern
and things like prefetch would not be very efficient, I think.
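
One possible way to state properties (1) and (2) as a check, as a rough
sketch only: the position callback, the "max_stretch" bound and the two
example mappings are assumptions made for illustration, and "close" is
interpreted here as "the physical gap is at most max_stretch times the
locator gap".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: maps a locator to its physical position in storage. */
typedef uint64_t (*SketchPositionFn) (uint64_t locator);

/* Example mapping that keeps both order and distance. */
uint64_t
sketch_position_linear(uint64_t locator)
{
    return locator * 8;
}

/* Example mapping that scatters neighbors (multiplicative hashing). */
uint64_t
sketch_position_scrambled(uint64_t locator)
{
    return (locator * 2654435761ULL) % 1000003;
}

/*
 * Walk locators in ascending order, as a bitmap scan would, and verify
 * that positions never go backwards (1) and never jump too far (2).
 */
bool
sketch_locality_ok(const uint64_t *locators, size_t count,
                   SketchPositionFn position, uint64_t max_stretch)
{
    for (size_t i = 1; i < count; i++)
    {
        uint64_t loc_gap = locators[i] - locators[i - 1];
        uint64_t prev_pos = position(locators[i - 1]);
        uint64_t cur_pos = position(locators[i]);

        if (cur_pos < prev_pos)
            return false;       /* violates (1) */
        if (cur_pos - prev_pos > loc_gap * max_stretch)
            return false;       /* violates (2) */
    }
    return true;
}
```

The linear mapping passes this check, while the scrambled one fails on
ordering almost immediately, which is the case where a bitmap scan would
degrade into random I/O.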
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services