2015-10-13 23:28 GMT+02:00 David Rowley <david.row...@2ndquadrant.com>:
> On 4 September 2015 at 04:50, Robert Haas <robertmh...@gmail.com> wrote: > >> >> Also: very nice performance results. >> >> > Thanks. > > On following a thread in [General] [1] it occurred to me that this patch > can give a massive improvement on Merge joins where the mark and restore > causes an index scan to have to skip over many filtered rows again and > again. > > I mocked up some tables and some data from the scenario on the [General] > thread: > > create table customers (id bigint, group_id bigint not null); > insert into customers select x.x,x.x%27724+1 from > generate_series(1,473733) x(x); > alter table customers add constraint customer_pkey primary key (id); > create table balances (id bigint, balance int not null, tracking_number > int not null, customer_id bigint not null); > insert into balances select x.x, 100, 12345, x.x % 45 + 1 from > generate_Series(1,16876) x(x); > create index balance_customer_id_index on balances (customer_id); > create index balances_customer_tracking_number_index on balances > (customer_id,tracking_number); > analyze; > > Unpatched I get: > > test=# explain analyze SELECT ac.* FROM balances ac join customers o ON > o.id = ac.customer_id WHERE o.group_id = 45; > > QUERY PLAN > > -------------------------------------------------------------------------------------------------------------------------------------------------------- > Merge Join (cost=164.87..1868.70 rows=1 width=24) (actual > time=6.110..1491.408 rows=375 loops=1) > Merge Cond: (ac.customer_id = o.id) > -> Index Scan using balance_customer_id_index on balances ac > (cost=0.29..881.24 rows=16876 width=24) (actual time=0.009..5.206 > rows=16876 loops=1) > -> Index Scan using customer_pkey on customers o (cost=0.42..16062.75 > rows=17 width=8) (actual time=0.014..1484.382 rows=376 loops=1) > Filter: (group_id = 45) > Rows Removed by Filter: 10396168 > Planning time: 0.207 ms > Execution time: 1491.469 ms > (8 rows) > > Patched: > > test=# explain analyze SELECT ac.* FROM balances ac join customers o ON > o.id = ac.customer_id WHERE o.group_id = 45; > > QUERY PLAN > > -------------------------------------------------------------------------------------------------------------------------------------------------------- > Merge Join (cost=164.87..1868.70 rows=1 width=24) (actual > time=6.037..11.528 rows=375 loops=1) > Merge Cond: (ac.customer_id = o.id) > -> Index Scan using balance_customer_id_index on balances ac > (cost=0.29..881.24 rows=16876 width=24) (actual time=0.009..4.978 > rows=16876 loops=1) > -> Index Scan using customer_pkey on customers o (cost=0.42..16062.75 > rows=17 width=8) (actual time=0.015..5.141 rows=2 loops=1) > Filter: (group_id = 45) > Rows Removed by Filter: 27766 > Planning time: 0.204 ms > Execution time: 11.575 ms > (8 rows) > > Now it could well be that the merge join costs need a bit more work to > avoid a merge join in this case, but as it stands as of today, this is your > performance gain. > > Regards > it is great Pavel > > David Rowley > > [1] > http://www.postgresql.org/message-id/caczyddiaxeam2rkmhbmuhwvcm4txh+5e3hqgggyzfzbn-pn...@mail.gmail.com > > -- > David Rowley http://www.2ndQuadrant.com/ > <http://www.2ndquadrant.com/> > PostgreSQL Development, 24x7 Support, Training & Services >