Hi what is your random_page_cost and seq_page_cost?
Regards Pavel Stehule 2015-03-19 7:23 GMT+01:00 Jake Magner <jakemagne...@gmail.com>: > I am having problems with a join where the planner picks a merge join and > an > index scan on one of the tables. Manually disabling merge joins and running > the query both ways shows the merge join takes over 10 seconds while a hash > join takes less than 100ms. The planner total cost estimate favors the > merge > join, but the cost estimate for the index scan part is greater than the > total cost estimate by a factor of 300x. My understanding of how this can > occur is that it expects it won't actually have to scan all the rows, > because using the histogram distribution stats it can know that all the > relevant rows of the join column will be at the beginning of the scan. But > in practice it appears to actually be index scanning all the rows, showing > massive amounts of page hits. What is also confusing is that the planner > estimate of the number of rows that match the second join condition is > accurate and very low, so I would expect it to index scan on that column's > index instead. Pasted at the bottom is the explain plan for the query and > some other variations I think might be relevant. The table/index names are > obfuscated. I ran ANALYZE on all the tables in the query first. All the > pages are cached in the explain plans but we wouldn't expect that to be > true > in the production system. There are btree indexes on all the columns in > both > the join conditions and the filters. > > Searching, I found this thread > http://postgresql.nabble.com/merge-join-killing-performance-td2076433.html > which sounds kind of similar, but there are no Nulls in this table. > > Thanks for your help. > > > > Postgres version info: PostgreSQL 9.1.13 on x86_64-unknown-linux-gnu, > compiled by gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit > > > ----------------------- > Original Query > > The estimated cost for Index Scan is 898k but the total cost estimate is > 2.6k. The planner has a good estimate of the number of rows, 1335, for the > index scan, but by the number of page hits (8M) it appears it actually > scanned the entire table which has about 8M rows. > ----------------------- > EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM vehicles v LEFT JOIN usagestats ON > v.id = tid AND type = 'vehicle'; > > QUERY PLAN > > -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Merge Right Join (cost=593.28..2634.10 rows=4155 width=619) (actual > time=9.150..11464.949 rows=4155 loops=1) > Merge Cond: (usagestats.tid = s.id) > Buffers: shared hit=8063988 > -> Index Scan using usagestats_tid_idx on usagestats > (cost=0.00..898911.91 rows=1335 width=37) (actual time=0.027..11448.789 > rows=2979 loops=1) > Filter: ((type)::text = 'vehicle'::text) > Buffers: shared hit=8063686 > -> Sort (cost=593.28..603.67 rows=4155 width=582) (actual > time=9.108..10.429 rows=4155 loops=1) > Sort Key: s.id > Sort Method: quicksort Memory: 1657kB > Buffers: shared hit=302 > -> Seq Scan on vehicles v (cost=0.00..343.55 rows=4155 > width=582) > (actual time=0.014..2.917 rows=4155 loops=1) > Buffers: shared hit=302 > Total runtime: 11466.122 ms > (13 rows) > > ------------------------ > Change the type='vehicle' condition to an always true condition > > If we change the filter from "type = 'vehicle'" (True for a small fraction > of the rows) to "freq > -1" (True for all rows) then the plan is the same, > but the actual time and page hits are much less and the query returns is > fast. > ------------------------ > EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM vehicle v LEFT JOIN usagestats ON > (v.id = tid AND freq > -1); > > > QUERY PLAN > > > --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > > Merge Right Join (cost=593.28..2434.79 rows=7733 width=619) (actual > time=5.635..59.852 rows=17096 loops=1) > > Merge Cond: (usagestats.tid = v.id) > > Buffers: shared hit=17653 > > -> Index Scan using usagestats_tid_idx on usagestats > (cost=0.00..898914.00 rows=8006976 width=37) (actual time=0.010..34.075 > rows=17225 loops=1) > > Filter: (freq > (-1)) > > Buffers: shared hit=17351 > > -> Sort (cost=593.28..603.67 rows=4155 width=582) (actual > time=5.617..9.351 rows=17094 loops=1) > > Sort Key: v.id > > Sort Method: quicksort Memory: 1657kB > > Buffers: shared hit=302 > > -> Seq Scan on vehicle v (cost=0.00..343.55 rows=4155 width=582) > (actual time=0.009..1.803 rows=4157 loops=1) > > Buffers: shared hit=302 > > Total runtime: 62.868 ms > > (13 rows) > > > ---------------------- > Original Query with merge joins disabled > > If we manually disable merge joins and run the original query we get a hash > join with what seems like > a more reasonable index scan on the more selective type column. The total > cost estimate is higher than the merge join plan, but lower than the cost > estimate for the index scan in the merge join query. > --------------------- > BEGIN; > SET LOCAL enable_mergejoin = off; > > EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM vehicle v LEFT JOIN usagestats ON > v.id = tid AND type = 'vehicle'; > > QUERY PLAN > > ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Hash Right Join (cost=395.49..5158.10 rows=4155 width=619) (actual > time=8.038..20.886 rows=4155 loops=1) > Hash Cond: (usagestats.tid = v.id) > Buffers: shared hit=3250 > -> Index Scan using usagestats_type_idx on usagestats > (cost=0.00..4752.59 rows=1335 width=37) (actual time=0.100..6.770 rows=2979 > loops=1) > Index Cond: ((type)::text = 'vehicle'::text) > Buffers: shared hit=2948 > -> Hash (cost=343.55..343.55 rows=4155 width=582) (actual > time=7.908..7.908 rows=4155 loops=1) > Buckets: 1024 Batches: 1 Memory Usage: 1088kB > Buffers: shared hit=302 > -> Seq Scan on vehicle v (cost=0.00..343.55 rows=4155 width=582) > (actual time=0.021..3.068 rows=4155 loops=1) > Buffers: shared hit=302 > Total runtime: 21.936 ms > (12 rows) > > ----------------------- > Miscellaneous stats > ----------------------- > > SELECT COUNT(*) FROM vehicle; > count > ------- > 4155 > (1 row) > > SELECT COUNT(*) FROM usagestats; > count > --------- > 8007015 > (1 row) > > The usagestats table has 501 histogram buckets for the tid column. The max > id in the vehicle table is 4155 and the first two buckets of the histogram > cover all the values between 1 and 4500. > > > > > > -- > View this message in context: > http://postgresql.nabble.com/Merge-Join-chooses-very-slow-index-scan-tp5842523.html > Sent from the PostgreSQL - performance mailing list archive at Nabble.com. > > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance >