Re: bad plan using nested loops

2018-02-01 Thread Tom Lane
Johan Fredriksson  writes:
> Bad plan: https://explain.depesz.com/s/avtZ
> Good plan: https://explain.depesz.com/s/SJSt
> Any suggestions on how to make the planner make better decisions for
> this query?

Core of the problem looks to be the misestimation here:

Index Only Scan using shredder_cgm1 on public.cachedgroupmembers cachedgroupmembers_4  (cost=0.43..2.33 rows=79 width=8) (actual time=0.020..0.903 rows=1492 loops=804)
  Output: cachedgroupmembers_4.memberid, cachedgroupmembers_4.groupid, cachedgroupmembers_4.disabled
  Index Cond: ((cachedgroupmembers_4.memberid = principals_1.id) AND (cachedgroupmembers_4.disabled = 0))
  Heap Fetches: 5018

Probably, memberid and disabled are correlated but the planner doesn't
know that, so it thinks the index condition is way more selective than it
actually is.  In PG 10, you could very possibly fix that by installing
extended statistics on that pair of columns.  See

https://www.postgresql.org/docs/current/static/planner-stats.html#PLANNER-STATS-EXTENDED
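
For example, a minimal sketch of that fix (the statistics object name here is
made up; the syntax is PG 10's CREATE STATISTICS):

CREATE STATISTICS cgm_memberid_disabled (dependencies)
  ON memberid, disabled FROM cachedgroupmembers;
ANALYZE cachedgroupmembers;  -- repopulate stats so the planner sees the correlation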

regards, tom lane



Re: Bad plan

2018-01-23 Thread Laurent Martelli
I've had a look at the plan with pgadmin, and I think the problem is
rather here:

->  Sort  (cost=4997.11..4997.11 rows=1 width=69) (actual time=27.427..28.896 rows=7359 loops=1)
      Sort Key: amendment.id
      Sort Method: quicksort  Memory: 1227kB
      ->  Nested Loop  (cost=183.44..4997.10 rows=1 width=69) (actual time=1.115..24.616 rows=7359 loops=1)
            ->  Nested Loop  (cost=183.15..4996.59 rows=1 width=49) (actual time=1.107..9.091 rows=7360 loops=1)
                  ->  Index Scan using uk_3b1y5vw9gmh7u3jj8aa2uy0b9 on contact_partner businessprovider  (cost=0.42..8.44 rows=1 width=13) (actual time=0.010..0.010 rows=1 loops=1)
                        Index Cond: ((business_provider_code)::text = 'BRZH'::text)
                  ->  Bitmap Heap Scan on contract_contract_line contractline  (cost=182.73..4907.58 rows=8057 width=52) (actual time=1.086..5.231 rows=7360 loops=1)
                        Recheck Cond: (business_provider_partner = businessprovider.id)
                        Heap Blocks: exact=3586
                        ->  Bitmap Index Scan on contract_contract_line_business_provider_partner_idx  (cost=0.00..180.72 rows=8057 width=0) (actual time=0.655..0.655 rows=7360 loops=1)
                              Index Cond: (business_provider_partner = businessprovider.id)
            ->  Index Scan using contract_amendment_pkey on contract_amendment amendment  (cost=0.29..0.50 rows=1 width=28) (actual time=0.001..0.002 rows=1 loops=7360)
                  Index Cond: (id = contractline.amendment)

The bitmap scan on contract_contract_line is good (8057 estimated vs 7360
actual rows), and so is the index scan (1 row), but the JOIN with
"contact_partner businessProvider" should give the 8057 rows from the bitmap
scan, shouldn't it?


2018-01-23 16:38 GMT+01:00 Laurent Martelli :
> 2018-01-23 16:18 GMT+01:00 Justin Pryzby :
>> On Tue, Jan 23, 2018 at 01:03:49PM +0100, Laurent Martelli wrote:
>>
>>> Here is the default plan :
>>
>> Can you resend without line breaks or paste a link to explain.depesz?
>
> I hope it's better like that. I've attached it too, just in case.
>
>>
>> The problem appears to be here:
>>
>> ->  Nested Loop Left Join  (cost=32067.09..39197.85 rows=1 width=276) (actual time=342.725..340775.031 rows=7359 loops=1)
>> Join Filter: (sh.share_holder_partner = partner.id)
>> Rows Removed by Join Filter: 204915707
>>
>> Justin

Re: Bad plan

2018-01-23 Thread Laurent Martelli
2018-01-23 16:18 GMT+01:00 Justin Pryzby :
> On Tue, Jan 23, 2018 at 01:03:49PM +0100, Laurent Martelli wrote:
>
>> Here is the default plan :
>
> Can you resend without line breaks or paste a link to explain.depesz?

I hope it's better like that. I've attached it too, just in case.

>
> The problem appears to be here:
>
> ->  Nested Loop Left Join  (cost=32067.09..39197.85 rows=1 width=276) (actual time=342.725..340775.031 rows=7359 loops=1)
> Join Filter: (sh.share_holder_partner = partner.id)
> Rows Removed by Join Filter: 204915707
>
> Justin



QUERY PLAN
--
 Sort  (cost=39200.76..39200.76 rows=1 width=1066) (actual time=341273.300..341274.244 rows=7359 loops=1)
   Sort Key: ((array_agg(subscribed_power.subscribed_power))[1]) DESC, status.name, contractline.id
   Sort Method: quicksort  Memory: 3930kB
   ->  Nested Loop Left Join  (cost=32069.19..39200.75 rows=1 width=1066) (actual time=342.806..341203.151 rows=7359 loops=1)
         ->  Nested Loop Left Join  (cost=32069.05..39200.50 rows=1 width=508) (actual time=342.784..341102.848 rows=7359 loops=1)
               ->  Nested Loop Left Join  (cost=32068.77..39200.20 rows=1 width=500) (actual time=342.778..341070.310 rows=7359 loops=1)
                     ->  Nested Loop Left Join  (cost=32068.64..39200.04 rows=1 width=507) (actual time=342.776..341058.256 rows=7359 loops=1)
                           Join Filter: (cca.address = adr_contact.id)
                           Rows Removed by Join Filter: 2254
                           ->  Nested Loop Left Join  (cost=32068.22..39199.55 rows=1 width=515) (actual time=342.767..340997.058 rows=7359 loops=1)
                                 ->  Nested Loop Left Join  (cost=32067.79..39198.84 rows=1 width=447) (actual time=342.753..340932.286 rows=7359 loops=1)
                                       ->  Nested Loop Left Join  (cost=32067.65..39198.67 rows=1 width=421) (actual time=342.748..340896.132 rows=7359 loops=1)
                                             ->  Nested Loop Left Join  (cost=32067.23..39198.01 rows=1 width=279) (actual time=342.739..340821.987 rows=7359 loops=1)
                                                   ->  Nested Loop Left Join  (cost=32067.09..39197.85 rows=1 width=276) (actual time=342.725..340775.031 rows=7359 loops=1)
                                                         Join Filter: (sh.share_holder_partner = partner.id)
                                                         Rows Removed by Join Filter: 204915707
                                                         ->  Nested Loop Left Join  (cost=28514.61..34092.46 rows=1 width=244) (actual time=287.323..610.192 rows=7359 loops=1)
                                                               ->  Nested Loop Left Join  (cost=28514.47..34092.30 rows=1 width=239) (actual time=287.318..573.234 rows=7359 loops=1)
                                                                     ->  Hash Right Join  (cost=28513.48..34090.65 rows=1 width=159) (actual time=287.293..379.564 rows=7359 loops=1)
                                                                           Hash Cond: (ws.contract_line = contractline.id)
                                                                           ->  Seq Scan on shareholder_web_subscription ws  (cost=0.00..5378.84 rows=52884 width=24) (actual time=0.006..12.307 rows=52884 loops=1)
                                                                           ->  Hash  (cost=28513.47..28513.47 rows=1 width=143) (actual time=287.243..287.243 rows=7359 loops=1)
                                                                                 Buckets: 8192 (originally 1024)  Batches: 1 (originally 1)  Memory Usage: 1173kB
                                                                                 ->  Nested Loop Left Join  (cost=17456.16..28513.47 rows=1 width=143) (actual time=85.005..284.689 rows=7359 loops=1)
                                                                                       ->  Nested Loop  (cost=17456.03..28513.31 rows=1 width=148) (actual time=85.000..276.599 rows=7359 loops=1)
                                                                                             ->  Nested Loop Left Join  (cost=17455.73..28512.84 rows=1 width=148) (actual time=84.993..261.954 rows=7359 loops=1)
                                                                                                   ->  Nested Loop  (cost=17455.60..28512.67 rows=1 width=140) (actual time=84.989..253.715 rows=7359 loops=1)
                                                                                                         ->  Nested Loop  (cost=17455.18..28511.93 rows=1 width=93) (actual time=84.981..230.977 rows=7359 loops=1)
                                                                                                               ->  Merge Right Join  (cost=17454.89..28511.52 rows=1 width=93) (actual time=84.974..211.200 rows=7359 loops=1)
                                                                                                                     Merge Cond: (subscribed_power.amendment = amendment.id)
                                                                                                                     ->  GroupAggregate  (cost=12457.78..22574.03 rows=75229 width=168) (actual time=57.500..175.674 rows=83432 loops=1)
                                                                                                                           Group Key: subscribed_power.amendment

Re: Bad plan for ltree predicate <@

2017-12-01 Thread Roman Konoval
Hi Tom,

Thanks for your help.

> On Dec 1, 2017, at 22:33, Tom Lane  wrote:
> 
> 
>  The seqscan formulation of the query results in evaluating
> this function afresh at most of the rows

The function is defined as STABLE. I thought that means there is no need to
re-evaluate it on every row, since the input parameter is the same for every
row and the return value will be the same during the same query execution. Do
I understand incorrectly what STABLE means? Why is the function evaluated more
than once?

> , whereas shoving it into an
> uncorrelated sub-select causes it to be evaluated only once.  That, I
> think, and not the seqscan-vs-indexscan aspect, is what makes the bitmap
> formulation go faster.  Certainly you'd not expect that a bitmap scan that
> has to hit most of the rows anyway is going to win over a seqscan.
> 
> The fact that the planner goes for a bitmap scan in the second formulation
> is an artifact of the fact that it doesn't try to pre-evaluate sub-selects
> for selectivity estimation purposes, so you end up with a default estimate
> that says that the <@ condition only selects a small fraction of the rows.
> Not sure if we should try to change that or not.
> 
> I'd suggest setting the function's cost to 1000 or so and seeing if that
> doesn't improve matters.
> 


If I set the function's cost to 1000 I get a slightly better plan, but still
about 3.5 times more buffers are read compared to the bitmap scan, which, as
you wrote, one would expect to be slower than a seq scan.
Here is the plan:


 QUERY PLAN
-
 Aggregate  (cost=216438.81..216438.82 rows=1 width=0) (actual time=1262.244..1262.245 rows=1 loops=1)
   Buffers: shared hit=169215
   CTE trees
     ->  Index Scan using document_head__id_path__gist__idx on document_head d  (cost=2.91..212787.85 rows=162265 width=74) (actual time=0.115..727.119 rows=154854 loops=1)
           Index Cond: (id_path <@ get_doc_path('78157c60-45bc-42c1-9aad-c5651995db5c'::character varying))
           Filter: (((id)::text <> '78157c60-45bc-42c1-9aad-c5651995db5c'::text) AND ((state)::text <> 'DELETED'::text))
           Rows Removed by Filter: 23
           Buffers: shared hit=169215
   ->  CTE Scan on trees  (cost=0.00..3245.30 rows=162265 width=0) (actual time=0.119..1118.899 rows=154854 loops=1)
         Buffers: shared hit=169215
 Total runtime: 1277.010 ms
(11 rows)

My understanding is that the optimal plan in this case should read less data
than the bitmap scan, by the amount of buffers hit by the bitmap index scan.
It should read roughly all the buffers of the table itself, something like the
query with the predicate using an ltree literal instead of the function
invocation:

explain (analyze, buffers)
with trees AS (
    SELECT d.id, d.snapshot_id, NULL::text[] AS permissions
    FROM document_head AS d
    WHERE (d.id_path <@ '869c0187_51ae_4deb_a36f_0425fdafda6e.78157c60_45bc_42c1_9aad_c5651995db5c'::ltree
           AND d.id != '78157c60-45bc-42c1-9aad-c5651995db5c')
      AND d.state != 'DELETED'
)
SELECT COUNT(*) FROM trees;

 QUERY PLAN
-
 Aggregate  (cost=42114.02..42114.03 rows=1 width=0) (actual time=997.427..997.427 rows=1 loops=1)
   Buffers: shared hit=35230
   CTE trees
     ->  Seq Scan on document_head d  (cost=0.00..38463.06 rows=162265 width=74) (actual time=0.013..593.082 rows=154854 loops=1)
           Filter: ((id_path <@ '869c0187_51ae_4deb_a36f_0425fdafda6e.78157c60_45bc_42c1_9aad_c5651995db5c'::ltree) AND ((id)::text <> '78157c60-45bc-42c1-9aad-c5651995db5c'::text) AND ((state)::text <> 'DELETED'::text))
           Rows Removed by Filter: 23357
           Buffers: shared hit=35230
   ->  CTE Scan on trees  (cost=0.00..3245.30 rows=162265 width=0) (actual time=0.017..888.076 rows=154854 loops=1)
         Buffers: shared hit=35230
 Total runtime: 1011.565 ms
(10 rows)


The question is whether it is possible to get a plan like that using the
function, or some other way to get the ltree value for a given
document_head.id in one query.

As an alternative I can get the ltree value with a separate query, but this
would require
1. a round-trip to postgres, and
2. changing the isolation level to REPEATABLE READ to make sure that I get a
consistent result,
so I would like to avoid that.
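
For illustration, a minimal sketch of that two-step alternative (reusing the
id from the queries above; this is exactly the round-trip I want to avoid):

BEGIN ISOLATION LEVEL REPEATABLE READ;
-- round-trip 1: fetch the ltree value for the document
SELECT id_path FROM document_head WHERE id = '78157c60-45bc-42c1-9aad-c5651995db5c';
-- round-trip 2: re-run the main query with that value spliced in as an ltree literal
COMMIT;

The REPEATABLE READ level is what guarantees both statements see the same
snapshot.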

Regards,
Roman Konoval

Re: Bad plan for ltree predicate <@

2017-12-01 Thread Tom Lane
Roman Konoval  writes:
> I have a problem on 9.3.14 with a query that accesses table:

I think the root of the problem is your intermediate function:

> CREATE OR REPLACE FUNCTION public.get_doc_path(document_id character varying)
>  RETURNS ltree
>  LANGUAGE plpgsql
>  STABLE
> AS $function$
> DECLARE
> path ltree;
> BEGIN
> select id_path into path from document_head where id = document_id;
> RETURN path;
> END $function$

This is quite expensive, as it involves another table search, but the
planner doesn't know that since you've not marked it as having higher than
normal cost.  The seqscan formulation of the query results in evaluating
this function afresh at most of the rows, whereas shoving it into an
uncorrelated sub-select causes it to be evaluated only once.  That, I
think, and not the seqscan-vs-indexscan aspect, is what makes the bitmap
formulation go faster.  Certainly you'd not expect that a bitmap scan that
has to hit most of the rows anyway is going to win over a seqscan.

The fact that the planner goes for a bitmap scan in the second formulation
is an artifact of the fact that it doesn't try to pre-evaluate sub-selects
for selectivity estimation purposes, so you end up with a default estimate
that says that the <@ condition only selects a small fraction of the rows.
Not sure if we should try to change that or not.

I'd suggest setting the function's cost to 1000 or so and seeing if that
doesn't improve matters.
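
Concretely, that is a one-line change (a sketch, assuming the signature quoted
above):

ALTER FUNCTION public.get_doc_path(character varying) COST 1000;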

(BTW, what tipped me off to this was that the "buffers hit" count for
the seqscan node was so high, several times more than the actual size
of the table.  I couldn't account for that until I realized that the
function itself would be adding a few buffer hits per execution.)
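
To make that comparison yourself, a rough sketch, assuming the default 8kB
block size:

-- table size in 8kB pages, to compare against EXPLAIN's "Buffers: shared hit"
SELECT relname, pg_relation_size(oid) / 8192 AS pages
FROM pg_class WHERE relname = 'document_head';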

regards, tom lane



Re: Bad plan chosen for union all

2017-11-28 Thread Alex Reece
One more thing. Given this:


> The difference here is that, from the perspective of the outer query,
> the WHERE condition is a restriction clause on the "cim" relation,
> not a join clause.  So it will get pushed down into the subquery
> without creating any join order constraints on the outer query.


I expected the lateral form of the query to properly use the indexes. Sure
enough, this correctly uses the index:

explain select cim.yield
from earnings
JOIN contributions on contributions.id = earnings.note_id
JOIN LATERAL
(
    SELECT contribution_id,
           max(CASE metrics.name WHEN 'Yield'::text THEN projected ELSE NULL::double precision END) AS yield
    from contribution_metrics
    JOIN metrics ON metrics.id = metric
    WHERE contributions.id = contribution_id
    group by contribution_id
) cim ON true
WHERE earnings.id = '\x595400456c1f1400116b3843'

However, when I try to wrap that subquery again (e.g. as I would need
to if it were a view), the restriction is not pushed down:

select cim.yield
from earnings
JOIN contributions on contributions.id = earnings.note_id
JOIN LATERAL
(
    select * from
    (
        SELECT contribution_id,
               max(CASE metrics.name WHEN 'Yield'::text THEN projected ELSE NULL::double precision END) AS yield
        from contribution_metrics
        JOIN metrics ON metrics.id = metric
        group by contribution_id
    ) my_view WHERE contribution_id = contributions.id
) cim ON true
WHERE earnings.id = '\x595400456c1f1400116b3843'

Is there a way I can get the restriction to be pushed down into my subquery
in this lateral form?

Best,
~Alex


Re: Bad plan chosen for union all

2017-11-28 Thread Tom Lane
Alex Reece  writes:
> I managed to reduce my test case: the following query does not take
> advantage of the index on contribution metrics.

Yeah.  What you're wishing is that the planner would push a join
condition down into a subquery, but it won't do that at present.
Doing so would require generating "parameterized paths" for subqueries.
While I do not think there's any fundamental technical reason anymore
that we couldn't do so, there's considerable risk of wasting a lot of
planner cycles chasing unprofitable plan alternatives.  Anyway it was
totally impractical before 9.6's upper-planner-pathification changes,
and not all of the dust has settled from that rewrite.

> But I expected it to be equivalent to the plan from this query:

The difference here is that, from the perspective of the outer query,
the WHERE condition is a restriction clause on the "cim" relation,
not a join clause.  So it will get pushed down into the subquery
without creating any join order constraints on the outer query.
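
Schematically (hypothetical fragments, just to show the two shapes):

-- join clause: references both cim and contributions, not pushed down
JOIN (SELECT ... GROUP BY contribution_id) cim
  ON cim.contribution_id = contributions.id

-- restriction clause: from the outer query's viewpoint it mentions only cim,
-- so it can be pushed into the subquery
FROM (SELECT ... GROUP BY contribution_id) cim
WHERE cim.contribution_id = (SELECT ... single row ...)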

regards, tom lane



Re: Bad plan chosen for union all

2017-11-28 Thread Alex Reece
I managed to reduce my test case: the following query does not take
advantage of the index on contribution metrics.

explain select cim.yield
from earnings
JOIN contributions on contributions.id = earnings.note_id
JOIN
(
    SELECT contribution_id,
           max(CASE metrics.name WHEN 'Yield'::text THEN projected ELSE NULL::double precision END) AS yield
    from contribution_metrics
    JOIN metrics ON metrics.id = metric
    group by contribution_id
) cim ON cim.contribution_id = contributions.id
WHERE earnings.id = '\x595400456c1f1400116b3843';

I got this:

  Hash Join  (cost=125.02..147.03 rows=1 width=8) (actual time=4.781..4.906 rows=1 loops=1)
    Hash Cond: (contribution_metrics.contribution_id = contributions.id)
    ->  HashAggregate  (cost=116.86..126.64 rows=3261 width=21) (actual time=4.157..4.600 rows=3261 loops=1)
          Group Key: contribution_metrics.contribution_id
          ->  Hash Join  (cost=1.11..108.18 rows=5788 width=33) (actual time=0.021..2.425 rows=5788 loops=1)
                Hash Cond: (contribution_metrics.metric = metrics.id)
                ->  Seq Scan on contribution_metrics  (cost=0.00..85.36 rows=5788 width=34) (actual time=0.006..0.695 rows=5788 loops=1)
                ->  Hash  (cost=1.05..1.05 rows=17 width=25) (actual time=0.009..0.009 rows=17 loops=1)
                      ->  Seq Scan on metrics  (cost=0.00..1.05 rows=17 width=25) (actual time=0.002..0.005 rows=17 loops=1)
    ->  Hash  (cost=8.15..8.15 rows=1 width=26) (actual time=0.022..0.022 rows=1 loops=1)
          ->  Nested Loop  (cost=0.14..8.15 rows=1 width=26) (actual time=0.019..0.020 rows=1 loops=1)
                ->  Index Scan using earnings_pkey on earnings  (cost=0.06..4.06 rows=1 width=13) (actual time=0.009..0.009 rows=1 loops=1)
                      Index Cond: (id = '\x595400456c1f1400116b3843'::bytea)
                ->  Index Only Scan using contributions_pkey on contributions  (cost=0.08..4.09 rows=1 width=13) (actual time=0.008..0.009 rows=1 loops=1)
                      Index Cond: (id = earnings.note_id)
  Planning time: 0.487 ms
  Execution time: 4.975 ms

But I expected it to be equivalent to the plan from this query:

select cim.yield from (
    select contribution_id,
           max(CASE metrics.name WHEN 'Yield'::text THEN projected ELSE NULL::double precision END) AS yield
    from contribution_metrics JOIN metrics ON metrics.id = metric
    group by contribution_id
) cim where cim.contribution_id = (
    select contributions.id from contributions
    join earnings on earnings.note_id = contributions.id
    where earnings.id = '\x595400456c1f1400116b3843')

Which gives me _this_ plan, that correctly uses the index on
contribution_metrics.

  Subquery Scan on cim  (cost=9.32..14.23 rows=2 width=8) (actual time=0.108..0.108 rows=1 loops=1)
    InitPlan 1 (returns $1)
      ->  Nested Loop  (cost=0.14..8.15 rows=1 width=13) (actual time=0.054..0.055 rows=1 loops=1)
            ->  Index Scan using earnings_pkey on earnings  (cost=0.06..4.06 rows=1 width=13) (actual time=0.025..0.026 rows=1 loops=1)
                  Index Cond: (id = '\x595400456c1f1400116b3843'::bytea)
            ->  Index Only Scan using contributions_pkey on contributions  (cost=0.08..4.09 rows=1 width=13) (actual time=0.026..0.026 rows=1 loops=1)
                  Index Cond: (id = earnings.note_id)
    ->  GroupAggregate  (cost=1.17..6.07 rows=2 width=21) (actual time=0.108..0.108 rows=1 loops=1)
          Group Key: contribution_metrics.contribution_id
          ->  Hash Join  (cost=1.17..6.07 rows=2 width=33) (actual time=0.100..0.101 rows=2 loops=1)
                Hash Cond: (contribution_metrics.metric = metrics.id)
                ->  Index Scan using contribution_metrics_contribution_id_metric_last_update_on_idx1 on contribution_metrics  (cost=0.06..4.95 rows=2 width=34) (actual time
                      Index Cond: (contribution_id = $1)
                ->  Hash  (cost=1.05..1.05 rows=17 width=25) (actual time=0.012..0.012 rows=17 loops=1)
                      ->  Seq Scan on metrics  (cost=0.00..1.05 rows=17 width=25) (actual time=0.004..0.006 rows=17 loops=1)
  Planning time: 0.396 ms
  Execution time: 0.165 ms

schema here:
https://gist.github.com/awreece/aeacbc818277c7c6d99477645e7fcd03

Best,
~Alex



On Tue, Nov 28, 2017 at 2:13 AM Alex Reece  wrote:

> I'm on PostgreSQL 9.6.5 and getting an awkwardly bad plan chosen for my
> query.
>
> I want to do:
>
> select investments.id, cim.yield
> FROM contributions
> JOIN investments ON contributions.investment_id = investments.id
> JOIN contribution_investment_metrics_view cim ON cim.investment_id = investments.id
> WHERE contributions.id IN ('\x58c9c0d3ee944c48b32f814d', '\x11')
> Where contribution_investment_metrics_view is morally
>
> select investment_id, first(val) from (select * from contribution_metrics UNION ALL select * from investment_metrics) group by id
>
> Typically, querying this view is very fast since I have indexes in both
> component queries, leading to a