Re: [PERFORM] How to get explain plan to prefer Hash Join
On Wed, Mar 11, 2015 at 5:35 PM, atxcanadian <matthew.bo...@gmail.com> wrote:
> I'm a little perplexed why the autovacuum wasn't keeping up. Any recommendations for those settings to push it to do a bit more analyzing of the tables?

What does pg_stat_user_tables show for that table?
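Something along these lines should show it (a sketch - substitute your actual table name; `dp_hist_gen_actual` is taken from the plans earlier in the thread):

```sql
-- Check when the table was last analyzed, manually or by autovacuum,
-- and how its row counts look to the statistics collector.
SELECT relname,
       last_analyze,       -- last manual ANALYZE
       last_autoanalyze,   -- last analyze run by the autovacuum daemon
       n_live_tup,
       n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'dp_hist_gen_actual';
```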
Re: [PERFORM] How to get explain plan to prefer Hash Join
On Thu, Mar 12, 2015 at 8:59 AM, atxcanadian <matthew.bo...@gmail.com> wrote:
> Here is the output: http://postgresql.nabble.com/file/n5841610/pg_stat_user_table.jpg
> This is after I've manually run analyze.

The last_* columns are only showing times, not full timestamps. Does your reporting tool drop the date part of a timestamp when it is equal to today? Or does it drop the date part altogether, regardless of what it is?

Cheers, Jeff
Re: [PERFORM] How to get explain plan to prefer Hash Join
> Isn't a random_page_cost of 1 a little aggressive?

We are currently set up on Amazon SSD with software RAID 5.

--
View this message in context: http://postgresql.nabble.com/How-to-get-explain-plan-to-prefer-Hash-Join-tp5841450p5841605.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] How to get explain plan to prefer Hash Join
Here is the output: http://postgresql.nabble.com/file/n5841610/pg_stat_user_table.jpg

This is after I've manually run analyze.
Re: [PERFORM] How to get explain plan to prefer Hash Join
Sorry about that, Excel clipped off the dates.

http://postgresql.nabble.com/file/n5841633/pg_stat_user_table.jpg
[PERFORM] How to get explain plan to prefer Hash Join
Currently seeing a massive increase in performance when the optimizer chooses Hash Join over Nested Loops. I achieve this by temporarily setting nested loops off. I'd like to set up some database variables so that the optimizer prefers hash joins. Any suggestions?

*Query in question:*

explain analyze
select dp.market_day, dp.hour_ending, dp.repeated_hour_flag,
       dp.settlement_point, sum(dp.mw) dp_mw
from dp_hist_gen_actual dp
inner join api_settlement_points sp
    on sp.settlement_point = dp.settlement_point
   and sp.settlement_point_rdfid = '#_{09F3A628-3B9D-481A-AC90-72AF8EAB64CA}'
   and sp.start_date <= '2015-01-01'::date
   and sp.end_date > '2015-01-01'::date
   and sp.rt_model = (select case when c.rt_model_loaded = 2 then true else false end
                      from cim_calendar c
                      where c.nodal_load = '2015-01-01'::date
                      order by c.cim desc
                      limit 1)
where dp.market_day BETWEEN '2015-01-01'::date and '2015-01-01'::date
  and dp.expiry_date is null
group by dp.market_day, dp.hour_ending, dp.repeated_hour_flag, dp.settlement_point;

*Nested Loop Explain Analyze Output:*

HashAggregate  (cost=58369.29..58369.30 rows=1 width=24) (actual time=496287.249..496287.257 rows=24 loops=1)
  InitPlan 1 (returns $0)
    ->  Limit  (cost=8.30..8.30 rows=1 width=9) (actual time=0.145..0.145 rows=1 loops=1)
          ->  Sort  (cost=8.30..8.78 rows=193 width=9) (actual time=0.145..0.145 rows=1 loops=1)
                Sort Key: c.cim
                Sort Method: top-N heapsort  Memory: 25kB
                ->  Seq Scan on cim_calendar c  (cost=0.00..7.33 rows=193 width=9) (actual time=0.007..0.075 rows=192 loops=1)
                      Filter: (nodal_load = '2015-01-01'::date)
                      Rows Removed by Filter: 36
  ->  *Nested Loop  (cost=0.99..58360.98 rows=1 width=24) (actual time=883.718..496287.058 rows=24 loops=1)*
        Join Filter: ((dp.settlement_point)::text = (sp.settlement_point)::text)
        Rows Removed by Join Filter: 12312
        ->  Index Scan using dp_hist_gen_actual_idx2 on dp_hist_gen_actual dp  (cost=0.56..2.78 rows=1 width=24) (actual time=0.020..20.012 rows=12336 loops=1)
              Index Cond: ((market_day >= '2015-01-01'::date) AND (market_day <= '2015-01-01'::date) AND (expiry_date IS NULL))
        ->  Index Scan using api_settlement_points_idx on api_settlement_points sp  (cost=0.43..58358.05 rows=12 width=9) (actual time=39.066..40.223 rows=1 loops=12336)
              Index Cond: ((rt_model = $0) AND (start_date <= '2015-01-01'::date) AND (end_date > '2015-01-01'::date))
              Filter: ((settlement_point_rdfid)::text = '#_{09F3A628-3B9D-481A-AC90-72AF8EAB64CA}'::text)
              Rows Removed by Filter: 5298
*Total runtime: 496287.325 ms*

*Hash Join Explain Analyze Output:*

HashAggregate  (cost=58369.21..58369.22 rows=1 width=24) (actual time=50.835..50.843 rows=24 loops=1)
  InitPlan 1 (returns $0)
    ->  Limit  (cost=8.30..8.30 rows=1 width=9) (actual time=0.149..0.149 rows=1 loops=1)
          ->  Sort  (cost=8.30..8.78 rows=193 width=9) (actual time=0.148..0.148 rows=1 loops=1)
                Sort Key: c.cim
                Sort Method: top-N heapsort  Memory: 25kB
                ->  Seq Scan on cim_calendar c  (cost=0.00..7.33 rows=193 width=9) (actual time=0.009..0.082 rows=192 loops=1)
                      Filter: (nodal_load = '2015-01-01'::date)
                      Rows Removed by Filter: 36
  ->  *Hash Join  (cost=3.23..58360.90 rows=1 width=24) (actual time=49.644..50.811 rows=24 loops=1)*
        Hash Cond: ((sp.settlement_point)::text = (dp.settlement_point)::text)
        ->  Index Scan using api_settlement_points_idx on api_settlement_points sp  (cost=0.43..58358.05 rows=12 width=9) (actual time=39.662..40.822 rows=1 loops=1)
              Index Cond: ((rt_model = $0) AND (start_date <= '2015-01-01'::date) AND (end_date > '2015-01-01'::date))
              Filter: ((settlement_point_rdfid)::text = '#_{09F3A628-3B9D-481A-AC90-72AF8EAB64CA}'::text)
              Rows Removed by Filter: 5298
        ->  Hash  (cost=2.78..2.78 rows=1 width=24) (actual time=9.962..9.962 rows=12336 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 684kB
              ->  Index Scan using dp_hist_gen_actual_idx2 on dp_hist_gen_actual dp  (cost=0.56..2.78 rows=1 width=24) (actual time=0.023..5.962 rows=12336 loops=1)
                    Index Cond: ((market_day >= '2015-01-01'::date) AND (market_day <= '2015-01-01'::date) AND (expiry_date IS NULL))
*Total runtime: 50.906 ms*
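For reference, the "setting nested loops off" workaround I'm using is session-local, roughly like this (a sketch; SET LOCAL keeps the change scoped to one transaction):

```sql
-- Discourage nested loops for this transaction only; the planner can
-- still fall back to them if no other join strategy is possible.
BEGIN;
SET LOCAL enable_nestloop = off;
EXPLAIN ANALYZE
SELECT ...;   -- the problem query
COMMIT;
```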
Re: [PERFORM] How to get explain plan to prefer Hash Join
-----Original Message-----
From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of atxcanadian
Sent: Wednesday, March 11, 2015 1:01 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] How to get explain plan to prefer Hash Join

> Currently seeing massive increase in performance when optimizer chooses Hash Join over Nested Loops. I achieve this by temporarily setting nested loops off. I'd like to setup some database variables where the optimizer prefers hash joins. Any suggestions?

Try making small adjustments to either random_page_cost or cpu_tuple_cost. They can influence the planner's choice here. I have solved similar issues in the past by adjusting one or the other. Be aware, though, that those changes can have negative effects in other places, so be sure to test.

Brad.
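For example, at session level first, so nothing is committed server-wide until you've verified the plans (the values here are only illustrative):

```sql
-- Nudge the cost model, then re-check the plan; repeat per adjustment.
SET random_page_cost = 2.0;    -- default is 4.0; lower favors index scans
SET cpu_tuple_cost  = 0.02;    -- default is 0.01; higher penalizes row-heavy plans
EXPLAIN ANALYZE SELECT ...;    -- the problem query

-- Return to the server defaults when done experimenting.
RESET random_page_cost;
RESET cpu_tuple_cost;
```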
Re: [PERFORM] How to get explain plan to prefer Hash Join
On Wed, Mar 11, 2015 at 10:01 AM, atxcanadian <matthew.bo...@gmail.com> wrote:
> Currently seeing massive increase in performance when optimizer chooses Hash Join over Nested Loops. I achieve this by temporarily setting nested loops off. I'd like to setup some database variables where the optimizer prefers hash joins. Any suggestions?

>         ->  Index Scan using dp_hist_gen_actual_idx2 on dp_hist_gen_actual dp  (cost=0.56..2.78 rows=1 width=24) (actual time=0.020..20.012 rows=12336 loops=1)

Here it thinks it will find 1 row, but actually finds 12336. That is not conducive to good plans. Has the table been analyzed recently?

>               Index Cond: ((market_day >= '2015-01-01'::date) AND (market_day <= '2015-01-01'::date) AND (expiry_date IS NULL))

If you query just this one table with just these criteria, what do you get for the row estimates and actual rows, with and without the IS NULL condition?

Cheers, Jeff
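I.e., something like this (a sketch, reusing the dates from your plan):

```sql
-- Compare the planner's row estimate against the actual row count for
-- just this table, first with the full criteria...
EXPLAIN ANALYZE
SELECT * FROM dp_hist_gen_actual
WHERE market_day BETWEEN '2015-01-01'::date AND '2015-01-01'::date
  AND expiry_date IS NULL;

-- ...then without the IS NULL condition, to see which clause throws
-- the estimate off.
EXPLAIN ANALYZE
SELECT * FROM dp_hist_gen_actual
WHERE market_day BETWEEN '2015-01-01'::date AND '2015-01-01'::date;
```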
Re: [PERFORM] How to get explain plan to prefer Hash Join
So I implemented two changes:

- Moved random_page_cost from 1.1 to 2.0
- Manually ran analyze on all the tables

*Here is the new explain analyze:*

QUERY PLAN
HashAggregate  (cost=74122.97..74125.53 rows=256 width=24) (actual time=45.205..45.211 rows=24 loops=1)
  InitPlan 1 (returns $0)
    ->  Limit  (cost=8.30..8.30 rows=1 width=9) (actual time=0.152..0.152 rows=1 loops=1)
          ->  Sort  (cost=8.30..8.78 rows=193 width=9) (actual time=0.150..0.150 rows=1 loops=1)
                Sort Key: c.cim
                Sort Method: top-N heapsort  Memory: 25kB
                ->  Seq Scan on cim_calendar c  (cost=0.00..7.33 rows=193 width=9) (actual time=0.008..0.085 rows=192 loops=1)
                      Filter: (nodal_load = '2015-01-01'::date)
                      Rows Removed by Filter: 36
  ->  Nested Loop  (cost=22623.47..74111.47 rows=256 width=24) (actual time=43.798..45.181 rows=24 loops=1)
        ->  Bitmap Heap Scan on api_settlement_points sp  (cost=22622.91..67425.92 rows=12 width=9) (actual time=43.756..43.823 rows=1 loops=1)
              Recheck Cond: ((rt_model = $0) AND (start_date <= '2015-01-01'::date) AND (end_date > '2015-01-01'::date))
              Filter: ((settlement_point_rdfid)::text = '#_{09F3A628-3B9D-481A-AC90-72AF8EAB64CA}'::text)
              Rows Removed by Filter: 5298
              ->  Bitmap Index Scan on api_settlement_points_idx  (cost=0.00..22622.90 rows=72134 width=0) (actual time=42.998..42.998 rows=5299 loops=1)
                    Index Cond: ((rt_model = $0) AND (start_date <= '2015-01-01'::date) AND (end_date > '2015-01-01'::date))
        ->  Index Scan using dp_hist_gen_actual_idx2 on dp_hist_gen_actual dp  (cost=0.56..556.88 rows=25 width=24) (actual time=0.033..1.333 rows=24 loops=1)
              Index Cond: ((market_day >= '2015-01-01'::date) AND (market_day <= '2015-01-01'::date) AND (expiry_date IS NULL) AND ((settlement_point)::text = (sp.settlement_point)::text))
Total runtime: 45.278 ms

I'm a little perplexed why the autovacuum wasn't keeping up. Any recommendations for those settings to push it to do a bit more analyzing of the tables?
Re: [PERFORM] How to get explain plan to prefer Hash Join
2015-03-12 1:35 GMT+01:00 atxcanadian <matthew.bo...@gmail.com>:

> So I implemented two changes.
> - Moved random_page_cost from 1.1 to 2.0

A random_page_cost near 1 can effectively force nested loops - with it, random index access looks extremely cheap to the planner.

> - Manually ran analyze on all the tables
> *Here is the new explain analyze:* [...]
> I'm a little perplexed why the autovacuum wasn't keeping up. Any recommendations for those settings to push it to do a bit more analyzing of the tables?
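One option is to tune autoanalyze per table rather than lowering the global thresholds (a sketch; the numbers are only illustrative and should be tested against your churn rate):

```sql
-- Re-analyze this table after roughly 2% of its rows change, instead
-- of the global default of 10% (autovacuum_analyze_scale_factor = 0.1).
ALTER TABLE dp_hist_gen_actual
  SET (autovacuum_analyze_scale_factor = 0.02,
       autovacuum_analyze_threshold = 500);
```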