On Mon, Jun 12, 2017 at 9:50 PM, Haribabu Kommi <kommi.harib...@gmail.com> wrote:
> Open Items:
>
> 1. The BitmapHeapScan and TableSampleScan are tightly coupled with
> HeapTuple and HeapScanDesc, So these scans are directly operating
> on those structures and providing the result.
>
> These scan types may not be applicable to different storage formats.
> So how to handle them?

I think that BitmapHeapScan, at least, is applicable to any table AM that has TIDs. It seems to me that in general we can imagine three kinds of table AMs:

1. Table AMs where a tuple can be efficiently located by a real TID. By a real TID, I mean that the block number part is really a block number and the item ID is really a location within the block. These are necessarily quite similar to our current heap, but they can change the tuple format and page format to some degree, and it seems like in many cases it should be possible to plug them into our existing index AMs without too much heartache. Both index scans and bitmap index scans ought to work.

2. Table AMs where a tuple has some other kind of locator. For example, imagine an index-organized table where the locator is the primary key, which is a bit like what Alvaro had in mind for indirect indexes. If the locator is 6 bytes or less, it could potentially be jammed into a TID, but I don't think that's a great idea. For things like int8 or numeric, it won't work at all. Even for other things, it's going to cause problems because the bit patterns won't be what the code is expecting; e.g. bitmap scans care about the structure of the TID, not just how many bits it is. (Due credit: Somebody, maybe Alvaro, pointed out this problem before, at PGCon.) For these kinds of tables, larger modifications to the index AMs are likely to be necessary, at least if we want a really general solution, or maybe we should have separate index AMs - e.g. btree for traditional TID-based heaps, and generic_btree or indirect_btree or key_btree or whatever for heaps with some other kind of locator. It's not too hard to see how to make index scans work with this sort of structure, but it's very unclear to me whether, or how, bitmap scans can be made to work.

3. Table AMs where a tuple doesn't really have a locator at all. In these cases, we can't support any sort of index AM at all.
When the table is queried, there's really nothing the core system can do except ask the table AM for a full scan, supply the quals, and hope the table AM has some sort of smarts that enable it to optimize somehow. For example, you can imagine converting cstore_fdw into a table AM of this sort - ORC has a sort of inbuilt BRIN-like indexing that allows whole chunks to be proven uninteresting and skipped. (You could use chunk number + offset to turn this into a table AM of the previous type if you wanted to support secondary indexes; not sure if that'd be useful, but it'd certainly be harder.)

I'm more interested in #1 than in #3, and more interested in #3 than #2, but other people may have different priorities.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (firstname.lastname@example.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
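
To make the point in item 2 about bit patterns a little more concrete, here is a rough sketch of why jamming an arbitrary 6-byte locator into a TID tends to upset bitmap scans. Everything named Sketch* or sketch_* is hypothetical and exists only for illustration; none of it is PostgreSQL code. The per-page cap of 291 is an assumption: it is what MaxHeapTuplesPerPage works out to with the default 8kB block size, as far as I recall, and the bitmap code rejects offsets outside that range.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a 6-byte TID: a 32-bit block number plus a
 * 16-bit item offset. Not the real ItemPointerData layout.
 */
typedef struct SketchTid
{
    uint32_t block;   /* block number within the relation */
    uint16_t offset;  /* 1-based item position within the block */
} SketchTid;

/* Assumed cap on items per page (default 8kB blocks); illustrative only. */
#define SKETCH_MAX_OFFSETS_PER_PAGE 291

/*
 * A per-page bitmap keeps one bit per possible offset, so it can only
 * represent TIDs whose offset lies in 1..SKETCH_MAX_OFFSETS_PER_PAGE.
 */
static bool
sketch_tid_fits_bitmap(SketchTid tid)
{
    return tid.offset >= 1 && tid.offset <= SKETCH_MAX_OFFSETS_PER_PAGE;
}

/*
 * Jam an arbitrary 48-bit locator into the TID: the high 32 bits become
 * the "block number" and the low 16 bits become the "offset".
 */
static SketchTid
sketch_tid_from_key(uint64_t key48)
{
    SketchTid tid;

    tid.block = (uint32_t) ((key48 >> 16) & 0xFFFFFFFFu);
    tid.offset = (uint16_t) (key48 & 0xFFFFu);
    return tid;
}
```

A locator that happens to look like a real TID, such as 0xA0005 (block 10, offset 5), passes the check, but one like 0x123456789ABC yields an offset of 0x9ABC, far beyond anything a per-page bitmap can represent. So the locator being 48 bits wide is not sufficient by itself.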