Re: bad plan using nested loops
Johan Fredriksson writes:
> Bad plan: https://explain.depesz.com/s/avtZ
> Good plan: https://explain.depesz.com/s/SJSt
> Any suggestions on how to make the planner make better decisions for
> this query?

The core of the problem looks to be the misestimation here:

  Index Only Scan using shredder_cgm1 on public.cachedgroupmembers cachedgroupmembers_4  (cost=0.43..2.33 rows=79 width=8) (actual time=0.020..0.903 rows=1492 loops=804)
    Output: cachedgroupmembers_4.memberid, cachedgroupmembers_4.groupid, cachedgroupmembers_4.disabled
    Index Cond: ((cachedgroupmembers_4.memberid = principals_1.id) AND (cachedgroupmembers_4.disabled = 0))
    Heap Fetches: 5018

Probably memberid and disabled are correlated, but the planner doesn't know that, so it thinks the index condition is far more selective than it actually is (79 estimated rows vs 1492 actual). In PG 10, you could very possibly fix that by installing extended statistics on that pair of columns. See

https://www.postgresql.org/docs/current/static/planner-stats.html#PLANNER-STATS-EXTENDED

regards, tom lane
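As a concrete sketch of that suggestion (PG 10+ only; the statistics object name below is made up, while the table and column names are taken from the plan above):

```sql
-- Hypothetical statistics object capturing the memberid/disabled correlation.
CREATE STATISTICS cgm_memberid_disabled_stats (dependencies)
    ON memberid, disabled
    FROM public.cachedgroupmembers;

-- Extended statistics are only computed by a subsequent ANALYZE.
ANALYZE public.cachedgroupmembers;
```

After the ANALYZE, the planner can use the functional-dependency statistics instead of assuming the two index conditions are independent.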
Re: Bad plan
I've had a look at the plan with pgAdmin, and I think the problem is rather here:

  ->  Sort  (cost=4997.11..4997.11 rows=1 width=69) (actual time=27.427..28.896 rows=7359 loops=1)
        Sort Key: amendment.id
        Sort Method: quicksort  Memory: 1227kB
        ->  Nested Loop  (cost=183.44..4997.10 rows=1 width=69) (actual time=1.115..24.616 rows=7359 loops=1)
              ->  Nested Loop  (cost=183.15..4996.59 rows=1 width=49) (actual time=1.107..9.091 rows=7360 loops=1)
                    ->  Index Scan using uk_3b1y5vw9gmh7u3jj8aa2uy0b9 on contact_partner businessprovider  (cost=0.42..8.44 rows=1 width=13) (actual time=0.010..0.010 rows=1 loops=1)
                          Index Cond: ((business_provider_code)::text = 'BRZH'::text)
                    ->  Bitmap Heap Scan on contract_contract_line contractline  (cost=182.73..4907.58 rows=8057 width=52) (actual time=1.086..5.231 rows=7360 loops=1)
                          Recheck Cond: (business_provider_partner = businessprovider.id)
                          Heap Blocks: exact=3586
                          ->  Bitmap Index Scan on contract_contract_line_business_provider_partner_idx  (cost=0.00..180.72 rows=8057 width=0) (actual time=0.655..0.655 rows=7360 loops=1)
                                Index Cond: (business_provider_partner = businessprovider.id)
              ->  Index Scan using contract_amendment_pkey on contract_amendment amendment  (cost=0.29..0.50 rows=1 width=28) (actual time=0.001..0.002 rows=1 loops=7360)
                    Index Cond: (id = contractline.amendment)

The estimate for the bitmap scan on contract_contract_line is good (8057 estimated vs 7360 actual rows), and so is the index scan (1 row), but shouldn't the join with "contact_partner businessprovider" then be estimated at the 8057 rows coming from the bitmap scan, rather than 1?

2018-01-23 16:38 GMT+01:00 Laurent Martelli:
> 2018-01-23 16:18 GMT+01:00 Justin Pryzby:
>> On Tue, Jan 23, 2018 at 01:03:49PM +0100, Laurent Martelli wrote:
>>
>>> Here is the default plan :
>>
>> Can you resend without line breaks or paste a link to explain.depesz?
>
> I hope it's better like that. I've attached it too, just in case.
>
>> The problem appears to be here:
>>
>> ->  Nested Loop Left Join  (cost=32067.09..39197.85 rows=1 width=276) (actual time=342.725..340775.031 rows=7359 loops=1)
>>       Join Filter: (sh.share_holder_partner = partner.id)
>>       Rows Removed by Join Filter: 204915707
>>
>> Justin
Re: Bad plan
2018-01-23 16:18 GMT+01:00 Justin Pryzby:
> On Tue, Jan 23, 2018 at 01:03:49PM +0100, Laurent Martelli wrote:
>
>> Here is the default plan :
>
> Can you resend without line breaks or paste a link to explain.depesz?

I hope it's better like that. I've attached it too, just in case.

> The problem appears to be here:
>
> ->  Nested Loop Left Join  (cost=32067.09..39197.85 rows=1 width=276) (actual time=342.725..340775.031 rows=7359 loops=1)
>       Join Filter: (sh.share_holder_partner = partner.id)
>       Rows Removed by Join Filter: 204915707
>
> Justin

QUERY PLAN
----------
Sort  (cost=39200.76..39200.76 rows=1 width=1066) (actual time=341273.300..341274.244 rows=7359 loops=1)
  Sort Key: ((array_agg(subscribed_power.subscribed_power))[1]) DESC, status.name, contractline.id
  Sort Method: quicksort  Memory: 3930kB
  ->  Nested Loop Left Join  (cost=32069.19..39200.75 rows=1 width=1066) (actual time=342.806..341203.151 rows=7359 loops=1)
    ->  Nested Loop Left Join  (cost=32069.05..39200.50 rows=1 width=508) (actual time=342.784..341102.848 rows=7359 loops=1)
      ->  Nested Loop Left Join  (cost=32068.77..39200.20 rows=1 width=500) (actual time=342.778..341070.310 rows=7359 loops=1)
        ->  Nested Loop Left Join  (cost=32068.64..39200.04 rows=1 width=507) (actual time=342.776..341058.256 rows=7359 loops=1)
          Join Filter: (cca.address = adr_contact.id)
          Rows Removed by Join Filter: 2254
          ->  Nested Loop Left Join  (cost=32068.22..39199.55 rows=1 width=515) (actual time=342.767..340997.058 rows=7359 loops=1)
            ->  Nested Loop Left Join  (cost=32067.79..39198.84 rows=1 width=447) (actual time=342.753..340932.286 rows=7359 loops=1)
              ->  Nested Loop Left Join  (cost=32067.65..39198.67 rows=1 width=421) (actual time=342.748..340896.132 rows=7359 loops=1)
                ->  Nested Loop Left Join  (cost=32067.23..39198.01 rows=1 width=279) (actual time=342.739..340821.987 rows=7359 loops=1)
                  ->  Nested Loop Left Join  (cost=32067.09..39197.85 rows=1 width=276) (actual time=342.725..340775.031 rows=7359 loops=1)
                    Join Filter: (sh.share_holder_partner = partner.id)
                    Rows Removed by Join Filter: 204915707
                    ->  Nested Loop Left Join  (cost=28514.61..34092.46 rows=1 width=244) (actual time=287.323..610.192 rows=7359 loops=1)
                      ->  Nested Loop Left Join  (cost=28514.47..34092.30 rows=1 width=239) (actual time=287.318..573.234 rows=7359 loops=1)
                        ->  Hash Right Join  (cost=28513.48..34090.65 rows=1 width=159) (actual time=287.293..379.564 rows=7359 loops=1)
                          Hash Cond: (ws.contract_line = contractline.id)
                          ->  Seq Scan on shareholder_web_subscription ws  (cost=0.00..5378.84 rows=52884 width=24) (actual time=0.006..12.307 rows=52884 loops=1)
                          ->  Hash  (cost=28513.47..28513.47 rows=1 width=143) (actual time=287.243..287.243 rows=7359 loops=1)
                            Buckets: 8192 (originally 1024)  Batches: 1 (originally 1)  Memory Usage: 1173kB
                            ->  Nested Loop Left Join  (cost=17456.16..28513.47 rows=1 width=143) (actual time=85.005..284.689 rows=7359 loops=1)
                              ->  Nested Loop  (cost=17456.03..28513.31 rows=1 width=148) (actual time=85.000..276.599 rows=7359 loops=1)
                                ->  Nested Loop Left Join  (cost=17455.73..28512.84 rows=1 width=148) (actual time=84.993..261.954 rows=7359 loops=1)
                                  ->  Nested Loop  (cost=17455.60..28512.67 rows=1 width=140) (actual time=84.989..253.715 rows=7359 loops=1)
                                    ->  Nested Loop  (cost=17455.18..28511.93 rows=1 width=93) (actual time=84.981..230.977 rows=7359 loops=1)
                                      ->  Merge Right Join  (cost=17454.89..28511.52 rows=1 width=93) (actual time=84.974..211.200 rows=7359 loops=1)
                                        Merge Cond: (subscribed_power.amendment = amendment.id)
                                        ->  GroupAggregate  (cost=12457.78..22574.03 rows=75229 width=168) (actual time=57.500..175.674 rows=83432 loops=1)
                                          Group Key: subscribed_power.amendment
Re: Bad plan for ltree predicate <@
Hi Tom,

Thanks for your help.

> On Dec 1, 2017, at 22:33, Tom Lane wrote:
>
> The seqscan formulation of the query results in evaluating
> this function afresh at most of the rows

The function is defined as STABLE. I thought that meant there is no need to reevaluate it for every row, since the input parameter is the same for every row and the return value will be the same for the duration of the query. Do I understand incorrectly what STABLE means? Why is the function evaluated more than once?

> , whereas shoving it into an
> uncorrelated sub-select causes it to be evaluated only once.  That, I
> think, and not the seqscan-vs-indexscan aspect, is what makes the bitmap
> formulation go faster.  Certainly you'd not expect that a bitmap scan that
> has to hit most of the rows anyway is going to win over a seqscan.
>
> The fact that the planner goes for a bitmap scan in the second formulation
> is an artifact of the fact that it doesn't try to pre-evaluate sub-selects
> for selectivity estimation purposes, so you end up with a default estimate
> that says that the <@ condition only selects a small fraction of the rows.
> Not sure if we should try to change that or not.
>
> I'd suggest setting the function's cost to 1000 or so and seeing if that
> doesn't improve matters.

If I set the function's cost to 1000 I get a slightly better plan, but it still reads about 3.5x more buffers than the bitmap-scan formulation, which, as you wrote, one would itself expect to be slower than a seq scan.
Here is the plan:

QUERY PLAN
----------
Aggregate  (cost=216438.81..216438.82 rows=1 width=0) (actual time=1262.244..1262.245 rows=1 loops=1)
  Buffers: shared hit=169215
  CTE trees
    ->  Index Scan using document_head__id_path__gist__idx on document_head d  (cost=2.91..212787.85 rows=162265 width=74) (actual time=0.115..727.119 rows=154854 loops=1)
          Index Cond: (id_path <@ get_doc_path('78157c60-45bc-42c1-9aad-c5651995db5c'::character varying))
          Filter: (((id)::text <> '78157c60-45bc-42c1-9aad-c5651995db5c'::text) AND ((state)::text <> 'DELETED'::text))
          Rows Removed by Filter: 23
          Buffers: shared hit=169215
  ->  CTE Scan on trees  (cost=0.00..3245.30 rows=162265 width=0) (actual time=0.119..1118.899 rows=154854 loops=1)
        Buffers: shared hit=169215
Total runtime: 1277.010 ms
(11 rows)

My understanding is that the optimal plan in this case should read less data than the bitmap scan, by the amount of buffers hit by the bitmap index scan; it should read roughly just the buffers of the table itself. Something like the query with the predicate using an ltree literal instead of the function invocation:

explain (analyze, buffers)
with trees AS (
    SELECT d.id, d.snapshot_id, NULL::text[] AS permissions
    FROM document_head AS d
    WHERE (d.id_path <@ '869c0187_51ae_4deb_a36f_0425fdafda6e.78157c60_45bc_42c1_9aad_c5651995db5c'::ltree
           AND d.id != '78157c60-45bc-42c1-9aad-c5651995db5c')
          AND d.state != 'DELETED'
)
SELECT COUNT(*) FROM trees;

QUERY PLAN
----------
Aggregate  (cost=42114.02..42114.03 rows=1 width=0) (actual time=997.427..997.427 rows=1 loops=1)
  Buffers: shared hit=35230
  CTE trees
    ->  Seq Scan on document_head d  (cost=0.00..38463.06 rows=162265 width=74) (actual time=0.013..593.082 rows=154854 loops=1)
          Filter: ((id_path <@ '869c0187_51ae_4deb_a36f_0425fdafda6e.78157c60_45bc_42c1_9aad_c5651995db5c'::ltree) AND ((id)::text <> '78157c60-45bc-42c1-9aad-c5651995db5c'::text) AND ((state)::text <> 'DELETED'::text))
          Rows Removed by Filter: 23357
          Buffers: shared hit=35230
  ->  CTE Scan on trees  (cost=0.00..3245.30 rows=162265 width=0) (actual time=0.017..888.076 rows=154854 loops=1)
        Buffers: shared hit=35230
Total runtime: 1011.565 ms
(10 rows)

The question is whether it is possible to get a plan like that while still using the function, or some other way to get the ltree value for a given document_head.id within the same query. As an alternative I can fetch the ltree value with a separate query, but that would require:

1. a round-trip to postgres, and
2. changing the isolation level to REPEATABLE READ to make sure I get a consistent result,

so I would like to avoid it.

Regards,
Roman Konoval
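For what it's worth, the uncorrelated sub-select formulation Tom describes can be written directly against document_head; this is a sketch, not from the original thread, and assumes document_head.id is unique:

```sql
-- Sketch: fetch the path once via an uncorrelated scalar sub-select instead
-- of calling get_doc_path() per row. Both lookups run in the same snapshot,
-- so no extra round-trip or isolation-level change is needed.
WITH trees AS (
    SELECT d.id, d.snapshot_id, NULL::text[] AS permissions
    FROM document_head AS d
    WHERE d.id_path <@ (SELECT id_path FROM document_head
                        WHERE id = '78157c60-45bc-42c1-9aad-c5651995db5c')
      AND d.id != '78157c60-45bc-42c1-9aad-c5651995db5c'
      AND d.state != 'DELETED'
)
SELECT count(*) FROM trees;
```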
Re: Bad plan for ltree predicate <@
Roman Konoval writes:
> I have a problem on 9.3.14 with a query that accesses table:

I think the root of the problem is your intermediate function:

> CREATE OR REPLACE FUNCTION public.get_doc_path(document_id character varying)
>  RETURNS ltree
>  LANGUAGE plpgsql
>  STABLE
> AS $function$
> DECLARE
>     path ltree;
> BEGIN
>     select id_path into path from document_head where id = document_id;
>     RETURN path;
> END $function$

This is quite expensive, as it involves another table search, but the planner doesn't know that since you've not marked it as having higher than normal cost. The seqscan formulation of the query results in evaluating this function afresh at most of the rows, whereas shoving it into an uncorrelated sub-select causes it to be evaluated only once. That, I think, and not the seqscan-vs-indexscan aspect, is what makes the bitmap formulation go faster. Certainly you'd not expect that a bitmap scan that has to hit most of the rows anyway is going to win over a seqscan.

The fact that the planner goes for a bitmap scan in the second formulation is an artifact of the fact that it doesn't try to pre-evaluate sub-selects for selectivity estimation purposes, so you end up with a default estimate that says that the <@ condition only selects a small fraction of the rows. Not sure if we should try to change that or not.

I'd suggest setting the function's cost to 1000 or so and seeing if that doesn't improve matters.

(BTW, what tipped me off to this was that the "buffers hit" count for the seqscan node was so high, several times more than the actual size of the table. I couldn't account for that until I realized that the function itself would be adding a few buffer hits per execution.)

regards, tom lane
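The cost change suggested here needs no redefinition of the function; a minimal sketch, using the signature from the quoted definition:

```sql
-- Raise the planner's cost estimate for the function (the plpgsql default
-- is 100), so plans that would evaluate it once per row look appropriately
-- expensive compared to evaluating it once in a sub-select.
ALTER FUNCTION public.get_doc_path(character varying) COST 1000;
```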
Re: Bad plan chosen for union all
One more thing. Given this:

> The difference here is that, from the perspective of the outer query,
> the WHERE condition is a restriction clause on the "cim" relation,
> not a join clause.  So it will get pushed down into the subquery
> without creating any join order constraints on the outer query.

I expected the lateral form of the query to properly use the indexes. Sure enough, this correctly uses the index:

explain
select cim.yield
from earnings
JOIN contributions on contributions.id = earnings.note_id
JOIN LATERAL (
    SELECT contribution_id,
           max(CASE metrics.name WHEN 'Yield'::text THEN projected ELSE NULL::double precision END) AS yield
    from contribution_metrics
    JOIN metrics ON metrics.id = metric
    WHERE contributions.id = contribution_id
    group by contribution_id
) cim ON true
WHERE earnings.id = '\x595400456c1f1400116b3843'

However, when I try to wrap that subquery again (e.g. as I would need to if it were a view), the restriction is no longer pushed down:

select cim.yield
from earnings
JOIN contributions on contributions.id = earnings.note_id
JOIN LATERAL (
    select * from (
        SELECT contribution_id,
               max(CASE metrics.name WHEN 'Yield'::text THEN projected ELSE NULL::double precision END) AS yield
        from contribution_metrics
        JOIN metrics ON metrics.id = metric
        group by contribution_id
    ) my_view
    WHERE contribution_id = contributions.id
) cim ON true
WHERE earnings.id = '\x595400456c1f1400116b3843'

Is there a way I can get the restriction to be pushed down into my subquery in this lateral form?

Best,
~Alex
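One possible workaround, not from this thread and sketched under the assumption that the view can be replaced: expose the aggregate as a parameterized SQL function, so the restriction arrives as a function parameter rather than as an outer join clause the planner would have to push down. The function name here is hypothetical.

```sql
-- Hypothetical helper: the contribution id comes in as a parameter, so the
-- index on contribution_metrics(contribution_id, ...) can be used whether or
-- not the planner inlines the function body.
CREATE FUNCTION contribution_yield(cid bytea)
RETURNS double precision
LANGUAGE sql STABLE AS $$
    SELECT max(CASE metrics.name WHEN 'Yield' THEN projected END)
    FROM contribution_metrics
    JOIN metrics ON metrics.id = metric
    WHERE contribution_id = cid
$$;
```

It would then be called as contribution_yield(contributions.id) in the outer query's select list instead of joining against the view.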
Re: Bad plan chosen for union all
Alex Reece writes:
> I managed to reduce my test case: the following query does not take
> advantage of the index on contribution metrics.

Yeah. What you're wishing is that the planner would push a join condition down into a subquery, but it won't do that at present. Doing so would require generating "parameterized paths" for subqueries. While I do not think there's any fundamental technical reason anymore that we couldn't do so, there's considerable risk of wasting a lot of planner cycles chasing unprofitable plan alternatives. Anyway, it was totally impractical before 9.6's upper-planner-pathification changes, and not all of the dust has settled from that rewrite.

> But I expected it to be equivalent to the plan from this query:

The difference here is that, from the perspective of the outer query, the WHERE condition is a restriction clause on the "cim" relation, not a join clause. So it will get pushed down into the subquery without creating any join order constraints on the outer query.

regards, tom lane
Re: Bad plan chosen for union all
I managed to reduce my test case: the following query does not take advantage of the index on contribution_metrics.

explain
select cim.yield
from earnings
JOIN contributions on contributions.id = earnings.note_id
JOIN (
    SELECT contribution_id,
           max(CASE metrics.name WHEN 'Yield'::text THEN projected ELSE NULL::double precision END) AS yield
    from contribution_metrics
    JOIN metrics ON metrics.id = metric
    group by contribution_id
) cim ON cim.contribution_id = contributions.id
WHERE earnings.id = '\x595400456c1f1400116b3843';

I got this:

Hash Join  (cost=125.02..147.03 rows=1 width=8) (actual time=4.781..4.906 rows=1 loops=1)
  Hash Cond: (contribution_metrics.contribution_id = contributions.id)
  ->  HashAggregate  (cost=116.86..126.64 rows=3261 width=21) (actual time=4.157..4.600 rows=3261 loops=1)
        Group Key: contribution_metrics.contribution_id
        ->  Hash Join  (cost=1.11..108.18 rows=5788 width=33) (actual time=0.021..2.425 rows=5788 loops=1)
              Hash Cond: (contribution_metrics.metric = metrics.id)
              ->  Seq Scan on contribution_metrics  (cost=0.00..85.36 rows=5788 width=34) (actual time=0.006..0.695 rows=5788 loops=1)
              ->  Hash  (cost=1.05..1.05 rows=17 width=25) (actual time=0.009..0.009 rows=17 loops=1)
                    ->  Seq Scan on metrics  (cost=0.00..1.05 rows=17 width=25) (actual time=0.002..0.005 rows=17 loops=1)
  ->  Hash  (cost=8.15..8.15 rows=1 width=26) (actual time=0.022..0.022 rows=1 loops=1)
        ->  Nested Loop  (cost=0.14..8.15 rows=1 width=26) (actual time=0.019..0.020 rows=1 loops=1)
              ->  Index Scan using earnings_pkey on earnings  (cost=0.06..4.06 rows=1 width=13) (actual time=0.009..0.009 rows=1 loops=1)
                    Index Cond: (id = '\x595400456c1f1400116b3843'::bytea)
              ->  Index Only Scan using contributions_pkey on contributions  (cost=0.08..4.09 rows=1 width=13) (actual time=0.008..0.009 rows=1 loops=1)
                    Index Cond: (id = earnings.note_id)
Planning time: 0.487 ms
Execution time: 4.975 ms

But I expected it to be equivalent to the plan from this query:

select cim.yield
from (
    select contribution_id,
           max(CASE metrics.name WHEN 'Yield'::text THEN projected ELSE NULL::double precision END) AS yield
    from contribution_metrics
    JOIN metrics ON metrics.id = metric
    group by contribution_id
) cim
where cim.contribution_id = (
    select contributions.id
    from contributions
    join earnings on earnings.note_id = contributions.id
    where earnings.id = '\x595400456c1f1400116b3843');

Which gives me _this_ plan, which correctly uses the index on contribution_metrics:

Subquery Scan on cim  (cost=9.32..14.23 rows=2 width=8) (actual time=0.108..0.108 rows=1 loops=1)
  InitPlan 1 (returns $1)
    ->  Nested Loop  (cost=0.14..8.15 rows=1 width=13) (actual time=0.054..0.055 rows=1 loops=1)
          ->  Index Scan using earnings_pkey on earnings  (cost=0.06..4.06 rows=1 width=13) (actual time=0.025..0.026 rows=1 loops=1)
                Index Cond: (id = '\x595400456c1f1400116b3843'::bytea)
          ->  Index Only Scan using contributions_pkey on contributions  (cost=0.08..4.09 rows=1 width=13) (actual time=0.026..0.026 rows=1 loops=1)
                Index Cond: (id = earnings.note_id)
  ->  GroupAggregate  (cost=1.17..6.07 rows=2 width=21) (actual time=0.108..0.108 rows=1 loops=1)
        Group Key: contribution_metrics.contribution_id
        ->  Hash Join  (cost=1.17..6.07 rows=2 width=33) (actual time=0.100..0.101 rows=2 loops=1)
              Hash Cond: (contribution_metrics.metric = metrics.id)
              ->  Index Scan using contribution_metrics_contribution_id_metric_last_update_on_idx1 on contribution_metrics  (cost=0.06..4.95 rows=2 width=34)
                    Index Cond: (contribution_id = $1)
              ->  Hash  (cost=1.05..1.05 rows=17 width=25) (actual time=0.012..0.012 rows=17 loops=1)
                    ->  Seq Scan on metrics  (cost=0.00..1.05 rows=17 width=25) (actual time=0.004..0.006 rows=17 loops=1)
Planning time: 0.396 ms
Execution time: 0.165 ms

schema here: https://gist.github.com/awreece/aeacbc818277c7c6d99477645e7fcd03

Best,
~Alex

On Tue, Nov 28, 2017 at 2:13 AM Alex Reece wrote:
> I'm on PostgreSQL 9.6.5 and getting an awkwardly bad plan chosen for my
> query.
>
> I want to do:
>
> select investments.id, cim.yield
> FROM contributions
> JOIN investments ON contributions.investment_id = investments.id
> JOIN contribution_investment_metrics_view cim ON cim.investment_id = investments.id
> WHERE contributions.id IN ('\x58c9c0d3ee944c48b32f814d', '\x11')
>
> where contribution_investment_metrics_view is morally
>
> select investment_id, first(val) from (select * from contribution_metrics
> UNION ALL select * from investment_metrics) group by id
>
> Typically, querying this view is very fast since I have indexes in both
> component queries, leading to a