Given a freshly created table (not vacuumed), and a query that uses an
index-only scan, for example:

CREATE TABLE foo(a int PRIMARY KEY);
INSERT INTO foo SELECT * FROM generate_series(1,1000000);
ANALYSE foo;

EXPLAIN ANALYSE SELECT count(*) FROM foo WHERE a <= 10000;

                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=322.86..322.87 rows=1 width=0) (actual
time=23.646..23.646 rows=1 loops=1)
   ->  Index Only Scan using foo_pkey on foo  (cost=0.00..300.42
rows=8975 width=0) (actual time=0.027..22.291 rows=10000 loops=1)
         Index Cond: (a <= 10000)
         Heap Fetches: 10000
 Total runtime: 23.673 ms
(5 rows)


the actual performance is much worse than the equivalent index scan,
as used in 9.1 and earlier:

SET enable_indexonlyscan = off;
EXPLAIN ANALYSE SELECT count(*) FROM foo WHERE a <= 10000;

                                                         QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=322.86..322.87 rows=1 width=0) (actual
time=3.219..3.219 rows=1 loops=1)
   ->  Index Scan using foo_pkey on foo  (cost=0.00..300.42 rows=8975
width=0) (actual time=0.014..2.302 rows=10000 loops=1)
         Index Cond: (a <= 10000)
 Total runtime: 3.240 ms
(4 rows)


Obviously this is the worst-case for an index-only scan, since there
is no visibility map, and it has to fetch each tuple from the heap,
but ideally this should perform around the same as an ordinary index
scan, since it's doing pretty much the same work.

Digging around, it looks like the additional cost is coming from
visibilitymap_test(), which is calling smgrexists() for each tuple, to
see if the visibility map file has been created. So it's doing a file
access check for each row, while the visibility map doesn't exist.

I'm not really familiar with this code, but a possible fix seems to be
to send an invalidation message in vm_extend() when it creates or
extends the visibility map, so that vm_readbuf() only has to re-check
the visibility map file if smgr_vm_nblocks has been invalidated. With
this change (attached) the worst-case index-only scan time becomes
around the same as the index scan time.

Regards,
Dean

Attachment: index-only-scan.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to