[PERFORM] Query plan changes after pg_dump / pg_restore

2005-06-09 Thread jonanews

Greetings all,
I am continuously encountering an issue with query plans that change after 
a pg_dump / pg_restore operation has been performed.
On the production database, PostgreSQL refuses to use the defined indexes in 
several queries. However, once the database has been dumped and restored, 
either on another server or on the same database server, the query plan 
suddenly changes to use the indexes, cutting the query cost down to 10% of 
the original.
Both databases are running under the same PostgreSQL v7.3.9 on an RH 
Enterprise 3.1 server.


A VACUUM FULL runs once a day and a VACUUM ANALYZE every other hour.

The data in the tables affected by this query doesn't change very often.
Even doing a manual VACUUM FULL, VACUUM ANALYZE or REINDEX before the 
query is run on the production database changes nothing.
I have also tried to drop the indexes completely and re-create them, all 
to no avail.


If the queries are run with SET ENABLE_SEQSCAN TO OFF, the live database 
uses the correct indexes as expected.
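
A minimal sketch of that kind of comparison (the SELECTs are placeholders 
for the actual query):

EXPLAIN ANALYZE SELECT ...;        -- planner's default choice
SET enable_seqscan TO off;         -- discourage sequential scans for this session
EXPLAIN ANALYZE SELECT ...;        -- same query, now preferring the indexes
SET enable_seqscan TO on;          -- restore the default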


Have placed an export of the query, query plan etc. online at: 
http://213.173.234.215:8080/plan.htm in order to ensure it's still 
readable.

For the plans, the key tables are marked with bold.

Any insight into why PostgreSQL behaves this way, as well as a possible 
solution (other than performing a pg_dump / pg_restore on the live 
database), would be very much appreciated.


Cheers
Jona



Re: [PERFORM] Query plan changes after pg_dump / pg_restore

2005-06-09 Thread Dennis Bjorklund
On Thu, 9 Jun 2005 [EMAIL PROTECTED] wrote:

 I am continously encountering an issue with query plans that changes after 
 a pg_dump / pg_restore operation has been performed.
 
 Have placed an export of the query, query plan etc. online at: 
 http://213.173.234.215:8080/plan.htm in order to ensure it's still 
 readable.

There is not a major difference in time, so pg is at least not way off 
(225 ms vs. 280 ms). The estimated cost, however, does not correspond very 
well to the runtime (117 vs. 1389).

What you have not shown is whether the database is properly tuned. The output
of SHOW ALL; could help explain a lot, together with information about how much
memory your machine has.

The first thing that comes to mind is that you probably have not 
tuned shared_buffers and effective_cache_size properly (SHOW ALL would 
tell).

-- 
/Dennis Björklund
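
For reference, a quick way to check the two settings Dennis singles out (on 
7.3 both values are counted in buffer pages of 8 kB, assuming the default 
block size, rather than in bytes):

SHOW shared_buffers;
SHOW effective_cache_size;
SHOW ALL;    -- the full list of runtime settings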




Re: [PERFORM] Query plan changes after pg_dump / pg_restore

2005-06-09 Thread Jona




Thank you for the swift reply. The following is the output of SHOW ALL
for shared_buffers and effective_cache_size:
shared_buffers: 13384
effective_cache_size: 4000
server memory: 2GB

Please note, the databases are on the same server, it's merely 2
instances of the same database in order to figure out why there's a
difference in the query plan before and after a dump / restore.

What worries me is that the plan is different: in the bad plan it makes
a seq scan of a table with 6.5k records in it (fairly silly) and another of
a table with 50k records in it (plain stupid).
In the good plan it uses the available indexes as expected.

The estimated cost is obviously way off in the live database, even
though statistics etc should be up to date. Any insight into this?

Appreciate the help here...

Cheers
Jona


Re: [PERFORM] Query plan changes after pg_dump / pg_restore

2005-06-09 Thread Christopher Kings-Lynne
  Thank you for the swift reply, the following is the output of the SHOW 
ALL for shared_buffers and effective_cache_size.

shared_buffers:  13384
effective_cache_size: 4000
server memory: 2GB


effective_cache_size should be 10-100x larger perhaps...

Chris
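
As a rough illustration only (assuming most of the 2 GB box is acting as OS 
disk cache and the default 8 kB page size), Chris's suggestion could be tried 
per-session before touching postgresql.conf:

SET effective_cache_size = 196608;    -- roughly 1.5 GB expressed in 8 kB pages
SHOW effective_cache_size;            -- confirm the new value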




Re: [PERFORM] Query plan changes after pg_dump / pg_restore

2005-06-09 Thread Jona
Thanks... have notified our sys admin of that so he can make the correct 
changes.


It still doesn't explain the difference in query plans though.

I mean, it's the same database server the two instances of the same 
database are running on.
One instance (the live one) just insists on doing a seq scan of the 50k 
records in Price_Tbl and the 6.5k records in SCT2SubCatType_Tbl.

Seems weird

Cheers
Jona




Re: [PERFORM] Query plan changes after pg_dump / pg_restore

2005-06-09 Thread Christopher Kings-Lynne

Is effective_cache_size set the same on the test and live?



Re: [PERFORM] Query plan changes after pg_dump / pg_restore

2005-06-09 Thread Jona
It's the same (physical) server as well as the same PostgreSQL daemon, 
so yes.

The only difference is the actual database; the test database is made 
from a backup of the live database and restored onto the same PostgreSQL 
server.

So if I list the databases in psql I get:
- test
- live

Makes sense?

/Jona



Re: [PERFORM] Query plan changes after pg_dump / pg_restore

2005-06-09 Thread Dennis Bjorklund
On Thu, 9 Jun 2005, Jona wrote:

 It's the same (physical) server as well as the same PostGreSQL daemon, 
 so yes.

The only thing that can differ then is the statistics collected and the
amount of dead space in tables and indexes (but since you both reindex and
run vacuum full that should not be it).

So comparing the statistics in the system tables is the only thing I can 
think of that might shed some light on the issue. Maybe someone else has 
some ideas.

And as KL said, the effective_cache_size looked like it was way too small. 
With that setting bigger, pg should select index scans more often. It 
doesn't explain why the databases behave the way they do now, but it might 
make pg select the same plan nevertheless.

-- 
/Dennis Björklund
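
A minimal sketch of the kind of comparison Dennis describes, run against both 
the live and the test database and diffed (the table names are taken from the 
plans above):

-- per-column planner statistics
SELECT tablename, attname, null_frac, n_distinct, correlation
FROM pg_stats
WHERE tablename IN ('price_tbl', 'sct2subcattype_tbl')
ORDER BY tablename, attname;

-- row and page counts the planner works from
SELECT relname, relpages, reltuples
FROM pg_class
WHERE relname IN ('price_tbl', 'sct2subcattype_tbl');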




Re: [PERFORM] Query plan changes after pg_dump / pg_restore

2005-06-09 Thread Jona




Thank you for the insight. Any suggestion as to which tables / columns I
should compare between the databases?

Cheers
Jona



Re: [PERFORM] Query plan changes after pg_dump / pg_restore

2005-06-09 Thread Tom Lane
Jona [EMAIL PROTECTED] writes:
 What worries me is that the plan is different,

Given that the estimated costs are close to the same, this is probably
just the result of small differences in the ANALYZE statistics leading
to small differences in cost estimates and thus choice of different
plans.  I'll bet if you re-ANALYZE a few times on the source database
you'll see it flipping between plan choices too.  This is normal because
ANALYZE takes a random sample of rows rather than being exhaustive.

So the interesting question is not "why are the plan choices different",
it is "how do I get the cost estimates closer to reality".  That's the
only way in the long run to ensure the planner makes the right choice.
Increasing the statistics targets or fooling with planner cost
parameters are the basic tools you have available here.

regards, tom lane
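
A minimal sketch of the two knobs Tom mentions (the column name and the 
numbers are purely illustrative, not recommendations):

ALTER TABLE price_tbl ALTER COLUMN affid SET STATISTICS 100;  -- larger sample for that column
ANALYZE price_tbl;                                            -- rebuild the statistics

SET random_page_cost = 2.5;    -- planner cost parameter, can be tried per-session first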



Re: [PERFORM] Query plan changes after pg_dump / pg_restore

2005-06-09 Thread Jona




Hi Tom,
Thank you for the input, you're absolutely right.
Have just executed something like 10 VACUUM ANALYZEs on Price_Tbl in both
databases, and now both queries use the same plan: the bad one. GREAT!
Who said ignorance is bliss?? ;-)

Have just messed around with ALTER TABLE ... ALTER COLUMN ... SET STATISTICS ...
for both tables, to no effect.
Have tried setting both a high number (100 and 200) and a low number (1)
and run several VACUUM ANALYZEs afterwards.
It still insists on the bad plan...

Furthermore I've played around with the RANDOM_PAGE_COST runtime
parameter.
It seems that when I set it to 2.2 it switches to using the aff_price_uq
index on Price_Tbl; however, it needs to be set to 0.7 before it uses
the subcat_uq index on SCT2SubCatType_Tbl.
Whether the statistics target is set to 1 or 100 makes no difference to
this behaviour.
The overall plan remains the same though, and even when it uses both
indexes the total estimated cost is roughly 5.5 times higher than in the
good plan.
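
That per-session experiment presumably looked something like this sketch (the 
values are the ones quoted above, the SELECTs stand in for the actual query):

SET random_page_cost = 2.2;    -- enough to get the aff_price_uq index used
EXPLAIN ANALYZE SELECT ...;
SET random_page_cost = 0.7;    -- needed before the subcat_uq index is chosen as well
EXPLAIN ANALYZE SELECT ...;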

New plan:
Unique (cost=612.29..612.65 rows=3 width=75) (actual time=255.88..255.89 rows=3 loops=1)
  ->  Hash Join (cost=158.26..596.22 rows=288 width=75) (actual time=60.91..99.69 rows=2477 loops=1)
        Hash Cond: ("outer".sctid = "inner".sctid)
        ->  Index Scan using aff_price_uq on price_tbl (cost=0.00..409.24 rows=5025 width=4) (actual time=0.03..17.81 rows=5157 loops=1)
              Index Cond: (affid = 8)
        ->  Hash (cost=157.37..157.37 rows=355 width=71) (actual time=60.77..60.77 rows=0 loops=1)
              ->  Merge Join (cost=10.26..157.37 rows=355 width=71) (actual time=14.42..53.79 rows=2493 loops=1)
                    Merge Cond: ("outer".subcattpid = "inner".id)
                    ->  Index Scan using subcat_uq on sct2subcattype_tbl (cost=0.00..126.28 rows=6536 width=8) (actual time=0.03..23.25 rows=6527 loops=1)
                    ->  Sort (cost=10.26..10.28 rows=9 width=63) (actual time=2.46..5.66 rows=2507 loops=1)

Total runtime: 257.49 msec

Old "good" plan:
Unique (cost=117.18..117.20 rows=1 width=147)
(actual time=224.62..224.63 rows=3 loops=1)

  

  
- Index Scan using subcat_uq on
sct2subcattype_tbl (cost=0.00..100.47 rows=33
width=8) (actual time=0.01..0.20 rows=46 loops=54)


 Index Cond: ("outer".id =
sct2subcattype_tbl.subcattpid)
  
  
  
  


   - Index Scan
using aff_price_uq on price_tbl (cost=0.00..7.11
rows=1 width=4) (actual time=0.01..0.01 rows=1 loops=2493)
  
  


   Index Cond: ((price_tbl.affid = 8)
AND ("outer".sctid = price_tbl.sctid))
  
  

  

Total runtime: 225.14 msec

It seems that the more it knows about the data, the worse a plan it picks...

Could you provide some input on how to make it realise that the plan it
selects is not the optimal one?

Cheers
Jona
