På fredag 05. august 2016 kl. 01:01:06, skrev Tom Lane <t...@sss.pgh.pa.us 
<mailto:t...@sss.pgh.pa.us>>:
I wrote:
 > Looking around, there are a couple of places outside commands/analyze.c
 > that are making the same mistake, so this patch isn't complete, but it
 > illustrates what needs to be done.

 Here's a more complete patch.

 regards, tom lane
 
(Sorry for HTML-email, but we live in 2016, and I think it improves 
readability)
 
Great!
 
I have tested it (with PG-9.6 from HEAD, 
e7caacf733f3ee77a555aa715ab6fbf4434e6b52) and it sure looks like it fixes the 
problem for my query.
 
The query:
 
explain analyze SELECT  log.relation_id as company_id , sum(log.duration) AS 
durationFROM onp_crm_activity_log log  JOIN onp_crm_person logfor ON 
logfor.onp_user_id =log.logged_for AND logfor.is_resource = FALSE WHERE 1 = 1 
-- Filter out already invoiced AND NOT EXISTS( SELECT * FROM 
onp_crm_calendarentry_invoice_membership cemJOIN onp_crm_invoice_line il ON 
cem.invoice_line_id = il.idJOIN onp_crm_invoice inv ON il.invoice_id = 
inv.entity_idWHERE cem.calendar_entry_id = log.id AND NOT EXISTS( SELECT * FROM 
onp_crm_invoice creditnoteWHERE il.invoice_id = creditnote.credit_against ) ) 
GROUP BY log.relation_id ; 
 
Gave the following explain-plan (before the patch), with enable_nestloop=off 
(needed to prevent nest-loop-anti-join which caused this query to execute in 
~26 min.):
 
QUERY PLAN 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=25201.62..25220.06 rows=1475 width=36) (actual time
=381.191..381.281rows=404 loops=1) Group Key: log.relation_id -> Hash Anti Join 
(cost=4782.34..25148.28 rows=10667 width=12) (actual time=103.683..361.409 rows=
148330loops=1) Hash Cond: (log.id = cem.calendar_entry_id) -> Hash Join (cost
=81.46..20312.73rows=10667 width=20) (actual time=0.100..156.432 rows=380617 
loops=1) Hash Cond: (log.logged_for = logfor.onp_user_id) -> Seq Scan on 
onp_crm_activity_loglog (cost=0.00..18696.71 rows=380771 width=24) (actual time
=0.005..49.195rows=380774 loops=1) -> Hash (cost=25.02..25.02 rows=4515 width=8
) (actualtime=0.078..0.078 rows=119 loops=1) Buckets: 8192 Batches: 1 Memory 
Usage: 69kB -> Index Scan using onp_crm_person_onp_id_idx on onp_crm_person 
logfor (cost=0.14..25.02 rows=4515 width=8) (actual time=0.005..0.063 rows=119 
loops=1) Filter: (NOT is_resource) Rows Removed by Filter: 8 -> Hash (cost
=4700.87..4700.87rows=1 width=4) (actual time=103.575..103.575 rows=232412 
loops=1) Buckets: 262144 (originally 1024) Batches: 1 (originally 1) Memory 
Usage: 10219kB -> Hash Join (cost=474.41..4700.87 rows=1 width=4) (actual time
=9.724..76.015rows=232412 loops=1) Hash Cond: (cem.invoice_line_id = il.id) -> 
Seq Scanon onp_crm_calendarentry_invoice_membership cem (cost=0.00..3354.28 rows
=232528 width=8) (actual time=0.004..17.626 rows=232528 loops=1) -> Hash (cost
=474.40..474.40rows=1 width=4) (actual time=9.710..9.710 rows=11535 loops=1) 
Buckets:16384 (originally 1024) Batches: 1 (originally 1) Memory Usage: 534kB 
-> MergeJoin (cost=415.33..474.40 rows=1 width=4) (actual time=4.149..8.467 rows
=11535 loops=1) Merge Cond: (il.invoice_id = inv.entity_id) -> Sort (cost
=415.05..415.06rows=1 width=8) (actual time=4.138..4.701 rows=11535 loops=1) 
SortKey: il.invoice_id Sort Method: quicksort Memory: 925kB -> Hash Anti Join (
cost=77.13..415.04 rows=1 width=8) (actual time=0.257..2.716 rows=11535 loops=1)
HashCond: (il.invoice_id = creditnote.credit_against) -> Seq Scan on 
onp_crm_invoice_line il (cost=0.00..294.30 rows=11630 width=8) (actual time
=0.003..0.995rows=11630 loops=1) -> Hash (cost=50.38..50.38 rows=2140 width=4) 
(actualtime=0.247..0.247 rows=37 loops=1) Buckets: 4096 Batches: 1 Memory Usage:
34kB -> Index Only Scan using origo_invoice_credit_against_idx on 
onp_crm_invoice creditnote (cost=0.28..50.38 rows=2140 width=4) (actual time
=0.013..0.182rows=2140 loops=1) Heap Fetches: 0 -> Index Only Scan using 
origo_invoice_id_status_sent_idxon onp_crm_invoice inv (cost=0.28..53.98 rows=
2140width=8) (actual time=0.008..1.274 rows=11576 loops=1) Heap Fetches: 0 
Planningtime: 0.955 ms Execution time: 381.884 ms (35 rows) 
 
AFAIU, this is the problematic estimate: -> Hash Anti Join (cost=77.13..415.04 
rows=1 width=8) (actual time=0.257..2.716 rows=11535 loops=1)
Right?
 
now (after the patch, and without needing to disable nest_loop) gives the 
following explain-plan:
QUERY PLAN 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=44409.89..44428.47 rows=1486 width=36) (actual time
=366.502..366.594rows=404 loops=1) Group Key: log.relation_id -> Hash Anti Join 
(cost=10157.30..43650.11 rows=151956 width=12) (actual time=99.420..347.309 rows
=148327 loops=1) Hash Cond: (log.id = cem.calendar_entry_id) -> Merge Join (cost
=0.56..26729.34rows=379984 width=20) (actual time=0.011..150.415 rows=380614 
loops=1) Merge Cond: (logfor.onp_user_id = log.logged_for) -> Index Scan using 
onp_crm_person_onp_id_idxon onp_crm_person logfor (cost=0.14..25.02 rows=4515 
width=8) (actual time=0.006..0.030 rows=35 loops=1) Filter: (NOT is_resource) 
RowsRemoved by Filter: 2 -> Index Scan using origo_activity_log_logged_for_idx 
ononp_crm_activity_log log (cost=0.42..21977.16 rows=380771 width=24) (actual 
time=0.003..69.406 rows=380623 loops=1) -> Hash (cost=7300.48..7300.48 rows=
228500width=4) (actual time=99.226..99.226 rows=232412 loops=1) Buckets: 262144 
Batches:1 Memory Usage: 10219kB -> Hash Join (cost=789.13..7300.48 rows=228500 
width=4) (actual time=8.043..67.079 rows=232412 loops=1) Hash Cond: 
(cem.invoice_line_id = il.id) -> Seq Scanon 
onp_crm_calendarentry_invoice_membership cem (cost=0.00..3354.28 rows=232528 
width=8) (actual time=0.007..16.230 rows=232528 loops=1) -> Hash (cost
=646.27..646.27rows=11429 width=4) (actual time=8.003..8.003 rows=11535 loops=1
) Buckets:16384 Batches: 1 Memory Usage: 534kB -> Merge Anti Join (cost
=0.85..646.27rows=11429 width=4) (actual time=0.021..6.393 rows=11535 loops=1) 
Merge Cond: (il.invoice_id = creditnote.credit_against) -> MergeJoin (cost
=0.57..613.99rows=11630 width=8) (actual time=0.010..4.771 rows=11630 loops=1) 
Merge Cond: (inv.entity_id = il.invoice_id) ->Index Only Scan using 
origo_invoice_id_status_sent_idxon onp_crm_invoice inv (cost=0.28..53.98 rows=
2140width=8) (actual time=0.006..0.318 rows=2140 loops=1) Heap Fetches: 0 -> 
IndexScan using onp_crm_invoice_line_invoice_id_lineno_key on 
onp_crm_invoice_line il (cost=0.29..409.28 rows=11630 width=8) (actual time
=0.003..2.323rows=11630 loops=1) -> Index Only Scan using 
onp_crm_invoice_credit_against_keyon onp_crm_invoice creditnote (cost
=0.28..47.98rows=2140 width=4) (actual time=0.011..0.016 rows=38 loops=1) Heap 
Fetches:0 Planning time: 0.931 ms Execution time: 366.710 ms (28 rows) 
 
And, for fun with max_parallel_workers_per_gather = 6
QUERY PLAN 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize GroupAggregate (cost=39788.73..39851.89 rows=1486 width=36) (actual 
time=234.289..234.747 rows=404 loops=1) Group Key: log.relation_id -> Sort (cost
=39788.73..39799.88rows=4458 width=36) (actual time=234.276..234.307 rows=1000 
loops=1) Sort Key: log.relation_id Sort Method: quicksort Memory: 103kB -> 
Gather (cost=39054.15..39518.53 rows=4458 width=36) (actual time
=233.012..234.110rows=1000 loops=1) Workers Planned: 3 Workers Launched: 3 -> 
PartialHashAggregate (cost=38054.15..38072.73 rows=1486 width=36) (actual time
=225.607..225.701rows=250 loops=4) Group Key: log.relation_id -> Hash Join (cost
=10238.19..37294.37rows=151956 width=12) (actual time=163.060..219.652 rows=
37082loops=4) Hash Cond: (log.logged_for = logfor.onp_user_id) -> Hash Anti Join
(cost=10156.73..28460.48 rows=152271 width=16) (actual time=162.618..212.351 
rows=37121 loops=4) Hash Cond: (log.id = cem.calendar_entry_id) -> Parallel Seq 
Scanon onp_crm_activity_log log (cost=0.00..16117.29 rows=122829 width=24) 
(actualtime=0.010..15.532 rows=95193 loops=4) -> Hash (cost=7300.48..7300.48 
rows=228500 width=4) (actual time=161.270..161.270 rows=232412 loops=4) Buckets:
262144Batches: 1 Memory Usage: 10219kB -> Hash Join (cost=789.13..7300.48 rows=
228500width=4) (actual time=18.276..112.034 rows=232412 loops=4) Hash Cond: 
(cem.invoice_line_id = il.id) -> Seq Scanon 
onp_crm_calendarentry_invoice_membership cem (cost=0.00..3354.28 rows=232528 
width=8) (actual time=0.045..26.061 rows=232528 loops=4) -> Hash (cost
=646.27..646.27rows=11429 width=4) (actual time=18.116..18.116 rows=11535 loops=
4) Buckets: 16384 Batches: 1 Memory Usage: 534kB -> Merge Anti Join (cost
=0.85..646.27rows=11429 width=4) (actual time=0.179..14.505 rows=11535 loops=4) 
Merge Cond: (il.invoice_id = creditnote.credit_against) -> MergeJoin (cost
=0.57..613.99rows=11630 width=8) (actual time=0.113..10.871 rows=11630 loops=4) 
Merge Cond: (inv.entity_id = il.invoice_id) ->Index Only Scan using 
origo_invoice_id_status_sent_idxon onp_crm_invoice inv (cost=0.28..53.98 rows=
2140width=8) (actual time=0.063..0.811 rows=2140 loops=4) Heap Fetches: 0 -> 
IndexScan using onp_crm_invoice_line_invoice_id_lineno_key on 
onp_crm_invoice_line il (cost=0.29..409.28 rows=11630 width=8) (actual time
=0.046..5.213rows=11630 loops=4) -> Index Only Scan using 
onp_crm_invoice_credit_against_keyon onp_crm_invoice creditnote (cost
=0.28..47.98rows=2140 width=4) (actual time=0.063..0.071 rows=38 loops=4) Heap 
Fetches:0 -> Hash (cost=25.02..25.02 rows=4515 width=8) (actual time
=0.325..0.325rows=119 loops=4) Buckets: 8192 Batches: 1 Memory Usage: 69kB -> 
IndexScan using onp_crm_person_onp_id_idx on onp_crm_person logfor (cost
=0.14..25.02rows=4515 width=8) (actual time=0.032..0.270 rows=119 loops=4) 
Filter: (NOT is_resource) Rows Removed by Filter: 8 Planning time: 2.443 ms 
Executiontime: 239.979 ms (38 rows) 
 
All in all, thanks for looking into this, and +1 for backpatching to 9.5.
 
-- Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com <mailto:andr...@visena.com>
www.visena.com <https://www.visena.com>
 <https://www.visena.com>


 

Reply via email to