On 6/23/17 10:38 AM, Teodor Sigaev wrote:
1. Table AM with a 6-byte TID.
2. Table AM with a custom locator format, which could be TID-like.
3. Table AM with no locators.

Currently TID has its own type in the system catalog. It seems possible for a storage to declare the TID type it uses. An AM could then declare its type too, so the core could use that information to decide whether an AM and a storage are compatible. A storage could also declare that it has no TID type at all, so it could not be used with any access method; use case: compressed column-oriented storage.

Isn't the fact that TID is an existing type defined in the system catalogs a fairly insignificant detail? I mean, we could just as easily define a new 64-bit locator data type and use that instead, for example.

The main issue here is that we assume things about the TID contents, i.e. that it contains page/offset etc. And Bitmap nodes rely on that, to some extent - e.g. when prefetching data.
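
To make the page/offset assumption concrete, here is a minimal sketch in Python (not PostgreSQL code). It assumes a 6-byte locator made of a 32-bit block number and a 16-bit offset, stored big-endian so that byte order matches (block, offset) order; the real in-memory layout differs in detail. The names pack_tid, unpack_tid and pages_to_prefetch are made up for illustration.

```python
import struct

MAX_BLOCK = 0xFFFFFFFF   # assumed 32-bit block (page) number
MAX_OFFSET = 0xFFFF      # assumed 16-bit item offset within the page


def pack_tid(block, offset):
    """Encode (block, offset) as 6 bytes: 4-byte block, 2-byte offset."""
    if not 0 <= block <= MAX_BLOCK or not 0 <= offset <= MAX_OFFSET:
        raise ValueError("block or offset out of range")
    # Big-endian, so comparing the raw bytes compares (block, offset).
    return struct.pack(">IH", block, offset)


def unpack_tid(raw):
    """Decode a 6-byte locator back into (block, offset)."""
    block, offset = struct.unpack(">IH", raw)
    return block, offset


def pages_to_prefetch(tids):
    """Distinct block numbers touched by the given locators, ascending.

    This only works because the caller knows the locator contains a
    block number, which is exactly the assumption discussed above.
    """
    return sorted({unpack_tid(t)[0] for t in tids})
```

A node that wants to prefetch has to look inside the locator like pages_to_prefetch does; with a truly opaque locator there is nothing to extract.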

As I remember, only GIN depends on the TID format; other indexes use it as an opaque type. Except, at least, btree and GiST - they assume that internal pointers are the same as outer ones (to the heap).

I think you're probably right - GIN does compress the posting lists by exploiting the TID redundancy (its page/offset structure), and I don't think there are other index AMs doing that.
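
As an illustration of the kind of redundancy I mean, here is a rough Python sketch of delta plus variable-byte encoding over sorted locators. It is not GIN's actual code; the 16-bit offset width and all function names are assumptions made for the example.

```python
def tid_to_int(block, offset):
    """Fold (block, offset) into one integer; assumes offset < 2**16."""
    return (block << 16) | offset


def encode_varbyte(n):
    """Encode a non-negative integer, 7 bits per byte, low bits first."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)   # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def compress_posting_list(tids):
    """Store each sorted locator as the delta from the previous one."""
    out = bytearray()
    prev = 0
    for block, offset in sorted(tids):
        cur = tid_to_int(block, offset)
        out += encode_varbyte(cur - prev)
        prev = cur
    return bytes(out)


def decompress_posting_list(data):
    """Inverse of compress_posting_list; returns sorted (block, offset)."""
    tids, prev, cur, shift = [], 0, 0, 0
    for byte in data:
        cur |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            prev += cur
            tids.append((prev >> 16, prev & 0xFFFF))
            cur, shift = 0, 0
    return tids
```

Locators on the same or nearby pages produce small deltas that fit in one or two bytes, which is why this only pays off when the locator has a page/offset-like structure.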

But I'm not sure we can simply rely on GIN being the only such AM - it's possible people will try to improve other index types (e.g. by adding similar compression). Moreover, we now support extensions defining custom index AMs, and we have no clue what people may do in those.

So this would clearly require some sort of flag for each index AM.
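
Roughly, such a flag could work like the following Python sketch. Everything here is hypothetical: the TableAM and IndexAM records, the requires_tid_layout flag and the locator type names are invented for illustration and do not correspond to any existing catalog column or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TableAM:
    name: str
    # Hypothetical: "tid", some custom locator type name, or None
    # when the table AM exposes no locators at all.
    locator_type: Optional[str]


@dataclass(frozen=True)
class IndexAM:
    name: str
    # Hypothetical flag: True if the index AM looks inside the locator
    # and depends on the page/offset structure of a TID.
    requires_tid_layout: bool


def index_am_compatible(index_am, table_am):
    """Decide whether an index of this AM may be built on this table AM."""
    if table_am.locator_type is None:
        return False            # nothing for an index to point at
    if index_am.requires_tid_layout:
        return table_am.locator_type == "tid"
    return True                 # locator is treated as opaque
```

The point of the sketch is only that the check needs information from both sides: what the table AM provides and what the index AM assumes.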

Another dubious part - BitmapScan.

It would be really great if you could explain why BitmapScans are dubious, instead of just labeling them as dubious. (I guess you mean Bitmap Heap Scans, right?)

I see no conceptual issue with bitmap scans on arbitrary locator types, as long as there's sufficient locality information encoded in the value. What I mean by that is that for any two locator values A and B:

   (1) if (A < B) then (A is stored before B)

   (2) if (A is close to B) then (A is stored close to B)

Without these properties it's probably futile to try to do bitmap scans, because the bitmap would not result in a mostly sequential access pattern, and things like prefetch would not be very efficient, I think.
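
A small Python sketch of what I mean, under made-up assumptions: physical_position is a hypothetical function mapping a locator to a byte position in storage, and the two example mappings (clustered, scattered) are invented to contrast a locator that carries locality information with one that does not.

```python
PAGE_SIZE = 8192  # assumed page size in bytes


def clustered(loc):
    """Locator n lives at byte n * 100: order and distance are preserved."""
    return loc * 100


def scattered(loc):
    """Locator n lives on an unrelated page: no locality information."""
    return ((loc * 37) % 101) * PAGE_SIZE


def is_order_preserving(locators, physical_position):
    """Property (1): visiting in locator order never moves backwards."""
    positions = [physical_position(loc) for loc in sorted(locators)]
    return all(a <= b for a, b in zip(positions, positions[1:]))


def count_seeks(locators, physical_position):
    """Count jumps to a page other than the current or the next one
    when the locators are visited in sorted order, as a bitmap would."""
    seeks = 0
    prev_page = None
    for loc in sorted(locators):
        page = physical_position(loc) // PAGE_SIZE
        if prev_page is not None and page not in (prev_page, prev_page + 1):
            seeks += 1
        prev_page = page
    return seeks
```

With the clustered mapping, sorting the locators gives a sequential read; with the scattered one, sorting buys nothing and almost every step is a seek.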


Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)