On 14/08/2019 20:32, Ashwin Agrawal wrote:
On Wed, Aug 14, 2019 at 2:51 AM Ashutosh Sharma wrote:
2) Is there a chance that IndexOnlyScan would ever be required for
   zedstore tables considering the design approach taken for it?

We have not given much thought to IndexOnlyScans so far. But I think
IndexOnlyScan definitely would be beneficial for zedstore as
well. Even for normal index scans, fetching as many columns as
possible from the index itself and getting only the rest of the
required columns from the table would be good for zedstore. It would
help to further cut down IO. Ideally, for visibility checking only the
TidTree needs to be scanned and visibility checked against it, so the
cost of checking is much lower than for heap (if the VM can't be
consulted), but it is still a cost. Also, with vacuum, if the UNDO log
gets trimmed, the visibility checks are pretty cheap. Still, given all
that, having a VM-like structure to optimize this further would help.

Hmm, yeah. An index-only scan on a zedstore table could perform the "VM checks" by checking the TID tree in the zedstore. It's not as compact as the 2 bits per heap page in the heapam's visibility map, but it's pretty good.

Further, I tried creating a zedstore table with a btree index on one of
its columns and loaded around 50 lakh (5 million) records into the
table. When the indexed column was scanned (with enable_seqscan set to
off), it went for IndexOnlyScan, and that took around 15-20 times
longer than an IndexOnlyScan on a heap table, just because
IndexOnlyScan on zedstore always goes to the table as the visibility
check fails.

Currently, an index-only scan on zedstore should be pretty much the same speed as a regular index scan. All the visibility checks will fail, and you end up fetching every row from the table, just like a regular index scan. So I think what you're seeing is that index fetches on a zedstore table are much slower than on heap.
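
To illustrate the fallback described above, here is a toy Python model of the per-entry decision in an index-only scan. It is only a sketch, not PostgreSQL or zedstore code: the names index_only_scan, all_visible and fetch_row are made up for illustration, and the "table" is just a dict keyed by TID.

```python
def index_only_scan(index_entries, all_visible, fetch_row):
    """Toy model (hypothetical names): return (results, table_fetches).

    index_entries: list of (key, tid) pairs, as an index would store them.
    all_visible:   callable tid -> bool, standing in for the visibility check.
    fetch_row:     callable tid -> row dict or None, standing in for a table fetch.
    """
    results = []
    table_fetches = 0
    for key, tid in index_entries:
        if all_visible(tid):
            # Visibility is proven cheaply: answer from the index alone.
            results.append(key)
        else:
            # Check failed: fall back to the table, like a regular index scan.
            table_fetches += 1
            row = fetch_row(tid)
            if row is not None:
                results.append(row["key"])
    return results, table_fetches


# Assumed toy data: five rows, indexed on "key".
table = {tid: {"key": tid * 10, "payload": "x"} for tid in range(1, 6)}
index_entries = [(row["key"], tid) for tid, row in table.items()]

# Case where every TID is known to be all-visible: no table fetches.
heap_like = index_only_scan(index_entries, lambda tid: True, table.get)

# Case where the visibility check always fails: one table fetch per entry.
zedstore_like = index_only_scan(index_entries, lambda tid: False, table.get)
```

Both calls return the same keys; only the number of table fetches differs (0 versus 5 here), which is why the scan cannot be faster than a regular index scan when the check never succeeds.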

Ideally, on a column store the index fetches would only fetch the needed columns, but I don't think that's been implemented yet, so all the columns are fetched. That can make a big difference, if you have a wide table with lots of columns, but only actually need a few of them. Was your test case something like that?
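
As a rough illustration of why that matters, here is a toy Python sketch of a column store where each column is stored separately, so a fetch has to touch one structure per requested column. The layout and the fetch function are assumptions for illustration only and say nothing about how zedstore is actually implemented.

```python
# Toy column store (assumed layout): one dict per column, keyed by TID.
columns = {
    name: {tid: f"{name}-{tid}" for tid in range(1, 4)}
    for name in ("a", "b", "c", "d", "e", "f", "g", "h")
}


def fetch(tid, wanted=None):
    """Return (row, columns_read); wanted=None means fetch every column."""
    names = list(columns) if wanted is None else list(wanted)
    row = {name: columns[name][tid] for name in names}
    return row, len(names)


# Fetching all columns touches all 8 column structures.
all_cols_row, all_cols_read = fetch(2)

# Fetching only the needed columns touches just 2 of them.
projected_row, projected_read = fetch(2, wanted=["a", "c"])
```

In this toy, the work per fetched row grows with the number of columns read, so a query that needs two columns out of a wide table pays for all of them unless the fetch is projected.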

We haven't spent much effort on optimizing index fetches yet, so I hope there are many other little tweaks we can make there as well to speed it up.

- Heikki

