Hi,

I noticed that during an index-only scan the filter check is executed
after the visibility of a tuple is confirmed. This means a lot of
unnecessary heap fetches, if the filter condition is highly selective.

In the application I'm maintaining, the visibility map is mostly
dirty. Index-only scans deteriorate into normal index-scans, even when
there is no row matching the filter conditions.

Attached is a patch, which changes the order of the checks. I.e. the
visibility of the row is confirmed only after the qual condition is
evaluated to true.

To see the performance benefits, consider the following example:
create table test_table (id integer primary key, a integer, b text, c
text, filler text);
insert into test_table (id, a, b, c, filler) select s, 1,
md5(s::text), md5(s::text), repeat('a', 100) from
generate_series(10000000, 19999999) s;
insert into test_table (id, a, b, c, filler) select s, 2,
md5(s::text), md5(s::text), repeat('a', 100) from
generate_series(20000000, 29999999) s;
insert into test_table (id, a, b, c, filler) select s, 3,
md5(s::text), md5(s::text), repeat('a', 100) from
generate_series(30000000, 39999999) s;
create index test_index on test_table (a, b, c);
explain analyze select a, b, c from test_table where a = 2 and b > c
order by a, b, c limit 1;

Before the changes:
"Limit  (cost=0.69..11.80 rows=1 width=68) (actual
time=8136433.536..8136433.536 rows=0 loops=1)"
"  ->  Index Only Scan using test_index on test_table
(cost=0.69..555456.69 rows=50000 width=68) (actual
time=8136433.533..8136433.533 rows=0 loops=1)"
"        Index Cond: (a = 2)"
"        Filter: (b > c)"
"        Rows Removed by Filter: 10000000"
"        Heap Fetches: 10000000"
"Planning time: 2.054 ms"
"Execution time: 8136433.564 ms"

After the changes:
"Limit  (cost=0.69..11.80 rows=1 width=68) (actual
time=3060.548..3060.548 rows=0 loops=1)"
"  ->  Index Only Scan using test_index on test_table
(cost=0.69..555456.69 rows=50000 width=68) (actual
time=3060.544..3060.544 rows=0 loops=1)"
"        Index Cond: (a = 2)"
"        Filter: (b > c)"
"        Rows Removed by Filter: 10000000"
"        Heap Fetches: 0"
"Planning time: 11.684 ms"
"Execution time: 3060.615 ms"
The example is extreme, but I would say it doesn't differ much from
what I see in my production environment.

About the naming in the solution:
There's already a concept of "index recheck" which seems to be a
perfect name for the functionality i'm introducing. However, after
looking into the code, I don't think it can be reused, because the
current rechecks are evaluated before quals and ExecScanFetch contains
a very specific code.

Please let me know what you think about the change.

--
Mateusz Stefek

Attachment: indexonlyfilter.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to