[PERFORM] How to enhance the chance that data is in disk cache

2005-06-13 Thread Jona

Hi there
I have a query (please refer to 
http://213.173.234.215:8080/get_content_plan.htm for the query as well 
as the query plan) that is slow the first time it's run and fast(ish) 
on all successive runs within a reasonable time period.
That is, if the query is not run for roughly 30 minutes, execution time 
returns to the initial time.


This leads me to suspect that when the query is first run, all the data 
it uses has to be fetched from disk, whereas once it has been run all 
the data is available in the OS's disk cache.
Comparing the execution times, we're talking roughly a factor of 35 in 
time difference, so optimization would be handy.
Is there any way to either enhance the chance that the data can be found 
in the disk cache or allow the database to fetch the data faster?
Is this what the CLUSTER command is for, and if so, which tables would I 
need to cluster?
Or is my only option to de-normalize the table structure around this 
query to speed it up?
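For what it's worth, CLUSTER physically reorders a table along one of its 
indexes, which can improve on-disk locality for index-driven queries like 
this one. A minimal sketch using the 7.3 syntax, assuming the index and 
table names taken from the posted query plan are the relevant ones:

```sql
-- PostgreSQL 7.3 syntax: CLUSTER indexname ON tablename.
-- This rewrites the table in index order; the ordering is not
-- maintained afterwards, so it has to be re-run periodically.
CLUSTER subcat_uq ON sct2subcattype_tbl;
CLUSTER aff_price_uq ON price_tbl;
CLUSTER ctp_statcon ON statcon_tbl;

-- CLUSTER replaces the physical table, so re-analyze afterwards:
ANALYZE sct2subcattype_tbl;
ANALYZE price_tbl;
ANALYZE statcon_tbl;
```

Whether this helps depends on whether the slow first run is dominated by 
random I/O on these tables; it is an experiment, not a guaranteed fix.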


Furthermore, it seems the database spends the majority of its time in 
the loop marked in italics in the initial plan; any idea what it spends 
its time on there?


Database is PG 7.3.9 on RH ES 3.0, with Dual XEON 1.9GHz processors and 
2GB of RAM.

effective_cache_size = 100k
shared_buffers = 14k
random_page_cost = 3
default_statistics_target = 50
VACUUM ANALYZE runs every few hours, so statistics should be up to date.
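(For reference, both of those settings are measured in 8 kB disk pages on 
7.3, so, assuming the default block size, they translate to roughly:)

```sql
SHOW effective_cache_size;  -- 100000 pages x 8 kB = ~780 MB
SHOW shared_buffers;        -- 14000 buffers x 8 kB = ~110 MB
```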

Appreciate any input here.

Cheers
Jona

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] How to enhance the chance that data is in disk cache

2005-06-13 Thread Jona




Thank you for the response, Tom. I bet you get a lot of mails with
"trivial" solutions (mine likely being one of them);
I, for one, truly appreciate you taking the time to answer them.


  Run the query more often?
  

The query is dynamically constructed from user input. Although the
total number of different queries that can be run is limited (around
10k different combinations, I suspect), it seems rather pointless to run
all of them (or even the most common ones) more often just to keep the
data in the disk cache.
Is there a way to make the data more accessible on the disk?

  
Also, that pile of INNER JOINs is forcing a probably-bad join order;
you need to think carefully about the order you want things joined in,
or else convert the query to non-JOIN syntax.  See the "Performance
Tips" chapter of the manual.
  

You're probably right here, the join order must be bad, though just
flattening the join and letting the planner decide what would be best
makes the plan change on every execution, with the query cost varying
from 1350 to 4500.
I wager it ends up using GEQO due to the number of possibilities for a
join order that the query has, and thus just decides on a "good" plan
out of those it examined.
In any case, the "right" way to do this is defining a good explicit
join order, no?
Off the top of my head I'm not sure how to re-write it properly; I
suppose trial and error is the only way.
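In case it helps anyone else trying this: on 7.3 explicit JOIN syntax
constrains the planner to the written order, so the experiment is simply
to rewrite the FROM clause. A hypothetical fragment (only three of the
query's tables shown, names from the posted plan):

```sql
-- Hypothetical sketch: the planner joins in the order written,
-- so start with the most selective table/condition.
SELECT p.sctid, p.prccatid
FROM sct2subcattype_tbl s
JOIN price_tbl p   ON p.sctid = s.sctid AND p.affid = 8
JOIN statcon_tbl c ON c.sctid = s.sctid AND c.ctpid = 1
WHERE s.subcattpid = 50;
```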
From the plan it appears that the following part is where the cost
dramatically increases (although the time does not??):

-> Nested Loop (cost=0.00..1207.19 rows=75 width=32) (actual time=0.28..18.47 rows=164 loops=1)
   -> Nested Loop (cost=0.00..868.23 rows=58 width=20) (actual time=0.16..13.91 rows=164 loops=1)
      -> Index Scan using subcat_uq on sct2subcattype_tbl (cost=0.00..479.90 rows=82 width=8) (actual time=0.11..9.47 rows=164 loops=1)
           Index Cond: (subcattpid = 50)
           Filter: (NOT (subplan))
           SubPlan
             -> Seq Scan on aff2sct2subcattype_tbl (cost=0.00..1.92 rows=1 width=4) (actual time=0.05..0.05 rows=0 loops=164)
                  Filter: ((affid = 8) AND ($0 = sctid))
      -> Index Scan using aff_price_uq on price_tbl (cost=0.00..4.72 rows=1 width=12) (actual time=0.02..0.02 rows=1 loops=164)
           Index Cond: ((price_tbl.affid = 8) AND (price_tbl.sctid = "outer".sctid))
   -> Index Scan using ctp_statcon on statcon_tbl (cost=0.00..5.86 rows=1 width=12) (actual time=0.02..0.02 rows=1 loops=164)
        Index Cond: ((statcon_tbl.sctid = "outer".sctid) AND (statcon_tbl.ctpid = 1))
Especially the index scan on subcat_uq seems rather expensive, yet it is
pretty fast.
Can a relation be drawn between estimated cost and execution time?
Any other pointers in the right direction would be very much
appreciated.

For the full query and query plan, please refer to:
http://213.173.234.215:8080/get_content_plan.htm

Cheers
Jona

Tom Lane wrote:

  Jona [EMAIL PROTECTED] writes:
  
  
I have a query (please refer to 
http://213.173.234.215:8080/get_content_plan.htm for the query as well 
as query plan) that is slow when it's run the first time and fast(ish) 
on all successive runs within a reasonable time period.

  
  
  
  
This leads me to suspect that when the query is first run, all used data 
have to be fetched from the disk where as once it has been run all data 
is available in the OS's disk cache.

  
  
Sounds like that to me too.

  
  
Is there anway to either enhance the chance that the data can be found 
in the disk cache or allowing the database to fetch the data faster?

  
  
  




  
			regards, tom lane







Re: [PERFORM] Query plan changes after pg_dump / pg_restore

2005-06-09 Thread Jona




Thank you for the swift reply, the following is the output of the SHOW
ALL for shared_buffers and effective_cache_size.
shared_buffers: 13384
effective_cache_size: 4000
server memory: 2GB

Please note, the databases are on the same server; it's merely 2
instances of the same database, in order to figure out why there's a
difference in the query plan before and after a dump / restore.

What worries me is that the plan is different: in the bad plan it does
a seq scan of a table with 6.5k records in it (fairly silly) and another
of a table with 50k records in it (plain stupid).
In the good plan it uses the available indexes, as expected.

The estimated cost is obviously way off in the live database, even
though statistics etc. should be up to date. Any insight into this?

Appreciate the help here...

Cheers
Jona

Dennis Bjorklund wrote:

  On Thu, 9 Jun 2005 [EMAIL PROTECTED] wrote:

  
  
I am continuously encountering an issue with query plans that change after 
a pg_dump / pg_restore operation has been performed.

Have placed an export of the query, query plan etc. online at: 
http://213.173.234.215:8080/plan.htm in order to ensure it's still 
readable.

  
  
There is not a major difference in time, so pg is at least not way off 
(225ms vs. 280ms). The estimated cost is however not very related to the 
runtime (117 vs 1389).

What you have not shown is whether the database is properly tuned. The
output of SHOW ALL; could help explain a lot, together with info on how
much memory your computer has.

The first thing that comes to mind to me is that you probably have not 
tuned shared_buffers and effective_cache_size properly (SHOW ALL would 
tell).

  






Re: [PERFORM] Query plan changes after pg_dump / pg_restore

2005-06-09 Thread Jona
Thanks... have notified our sys admin of that so he can make the correct 
changes.


It still doesn't explain the difference in query plans though?

I mean, it's the same database server the two instances of the same 
database are running on.
One instance (the live one) just insists on doing a seq scan of the 50k 
records in Price_Tbl and the 6.5k records in SCT2SubCatType_Tbl.

Seems weird...

Cheers
Jona

Christopher Kings-Lynne wrote:

  Thank you for the swift reply, the following is the output of the 
SHOW ALL for shared_buffers and effective_cache_size.

shared_buffers:  13384
effective_cache_size: 4000
server memory: 2GB



effective_cache_size should be 10-100x larger perhaps...

Chris






Re: [PERFORM] Query plan changes after pg_dump / pg_restore

2005-06-09 Thread Jona
It's the same (physical) server as well as the same PostGreSQL daemon, 
so yes.


The only difference is the actual database, the test database is made 
from a backup of the live database and restored onto the same PostGreSQL 
server.

So if I run show databases in psql i get:
- test
- live

Makes sense??

/Jona

Christopher Kings-Lynne wrote:


Is effective_cache_size set the same on the test and live?

Jona wrote:

Thanks... have notified our sys admin of that so he can make the 
correct changes.


It still doesn't explain the difference in query plans though?

I mean, it's the same database server the two instances of the same 
database is running on.
One instance (the live) just insists on doing the seq scan of the 50k 
records in Price_Tbl and the 6.5k records in SCT2SubCatType_Tbl.

Seems weird

Cheers
Jona

Christopher Kings-Lynne wrote:

  Thank you for the swift reply, the following is the output of the 
SHOW ALL for shared_buffers and effective_cache_size.

shared_buffers:  13384
effective_cache_size: 4000
server memory: 2GB





effective_cache_size should be 10-100x larger perhaps...

Chris









Re: [PERFORM] Query plan changes after pg_dump / pg_restore

2005-06-09 Thread Jona




Thank you for the insight; any suggestion as to which tables / columns I
should compare between the databases?
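If it is of any use, the planner-visible statistics can be compared via
the pg_stats view (a readable wrapper around pg_statistic); a possible
starting point, assuming the table names from the posted plans:

```sql
-- Run in both databases and diff the output; large differences in
-- n_distinct or correlation would explain different plan choices.
SELECT tablename, attname, n_distinct, correlation
FROM pg_stats
WHERE tablename IN ('price_tbl', 'sct2subcattype_tbl', 'statcon_tbl')
ORDER BY tablename, attname;
```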

Cheers
Jona

Dennis Bjorklund wrote:

  On Thu, 9 Jun 2005, Jona wrote:

  
  
It's the same (physical) server as well as the same PostGreSQL daemon, 
so yes.

  
  
The only thing that can differ then is the statistics collected and the
amount of dead space in tables and indexes (but since you both reindex and
run vacuum full that should not be it).

So comparing the statistics in the system tables is the only thing I can 
think of that might bring some light on the issue. Maybe someone else have 
some ideas.

And as KL said, the effective_cache_size looked like it was way too small. 
With that setting bigger, pg should select index scans more often. It 
doesn't explain why the databases behave like they do now, but it might 
make pg select the same plan nevertheless.

  






Re: [PERFORM] Query plan changes after pg_dump / pg_restore

2005-06-09 Thread Jona




Hi Tom,
Thank you for the input, you're absolutely right.
Have just executed like 10 VACUUM ANALYZEs on the Price_Tbl in both
databases, and now both queries use the same plan: the bad one. GREAT!
Who said ignorance is bliss?? ;-)

Have just messed around with ALTER TABLE ... ALTER COLUMN ... SET
STATISTICS ... for both tables, to no effect.
Have tried setting both a high number (100 and 200) and a low number (1)
and run several VACUUM ANALYZEs afterwards.
It still insists on the bad plan...

Furthermore I've played around with the RANDOM_PAGE_COST runtime
parameter.
It seems that when I set it to 2.2 it switches to using the aff_price_uq
index on Price_Tbl; however, it needs to be set to 0.7 before it uses
the subcat_uq index on SCT2SubCatType_Tbl.
Whether the statistics target is set to 1 or 100 makes no difference to
this behaviour.
The overall plan remains the same though, and even when it uses both
indexes the total cost is roughly 5.5 times higher than the good plan's.
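For anyone repeating the experiment: the parameter can be changed per
session before touching postgresql.conf, e.g.:

```sql
-- Session-local experiment; does not affect other connections.
SET random_page_cost = 2.2;
-- ... run EXPLAIN ANALYZE on the query under test here ...
RESET random_page_cost;  -- back to the configured default
```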

New plan:
Unique (cost=612.29..612.65 rows=3 width=75) (actual time=255.88..255.89 rows=3 loops=1)
  -> Hash Join (cost=158.26..596.22 rows=288 width=75) (actual time=60.91..99.69 rows=2477 loops=1)
       Hash Cond: ("outer".sctid = "inner".sctid)
       -> Index Scan using aff_price_uq on price_tbl (cost=0.00..409.24 rows=5025 width=4) (actual time=0.03..17.81 rows=5157 loops=1)
            Index Cond: (affid = 8)
       -> Hash (cost=157.37..157.37 rows=355 width=71) (actual time=60.77..60.77 rows=0 loops=1)
            -> Merge Join (cost=10.26..157.37 rows=355 width=71) (actual time=14.42..53.79 rows=2493 loops=1)
                 Merge Cond: ("outer".subcattpid = "inner".id)
                 -> Index Scan using subcat_uq on sct2subcattype_tbl (cost=0.00..126.28 rows=6536 width=8) (actual time=0.03..23.25 rows=6527 loops=1)
                 -> Sort (cost=10.26..10.28 rows=9 width=63) (actual time=2.46..5.66 rows=2507 loops=1)

Total runtime: 257.49 msec

Old "good" plan (excerpt):
Unique (cost=117.18..117.20 rows=1 width=147) (actual time=224.62..224.63 rows=3 loops=1)
[...]
  -> Index Scan using subcat_uq on sct2subcattype_tbl (cost=0.00..100.47 rows=33 width=8) (actual time=0.01..0.20 rows=46 loops=54)
       Index Cond: ("outer".id = sct2subcattype_tbl.subcattpid)
[...]
  -> Index Scan using aff_price_uq on price_tbl (cost=0.00..7.11 rows=1 width=4) (actual time=0.01..0.01 rows=1 loops=2493)
       Index Cond: ((price_tbl.affid = 8) AND ("outer".sctid = price_tbl.sctid))
[...]

Total runtime: 225.14 msec

Could you provide some input on how to make it realise that the plan it
selects is not optimal?

Cheers
Jona

Tom Lane wrote:

  Jona [EMAIL PROTECTED] writes:
  
  
What worries me is that the plan is different,

  
  
Given that the estimated costs are close to the same, this is probably
just the result of small differences in the ANALYZE statistics leading
to small differences in cost estimates and thus choice of different
plans.  I'll bet if you re-ANALYZE a few times on the source database
you'll see it flipping between plan choices too.  This is normal because
ANALYZE takes a random sample of rows rather than being exhaustive.

So the interesting question is not "why are the plan choices different"
it is "how do I get the cost estimates closer to reality".  That's the
only way in the long run to ensure the planner makes the right choice.
Increasing the statistics targets or fooling with planner cost
parameters are the basic tools you have available here.

			regards, tom lane
  






Re: [PERFORM] Bad choice of query plan from PG 7.3.6 to PG 7.3.9

2005-05-08 Thread Jona




Sorry Tom, misread your mail! My bad :-(

I believe the following is the data you need?

Live Server:
  relname       relpages
  ctp_statcon   72
  statcon_pk    135

Test Server:
  relname       relpages
  ctp_statcon   34
  statcon_pk    28


Have executed the following query to obtain that data:
SELECT relname, relpages
FROM pg_class
WHERE relname = 'statcon_pk' OR relname = 'sc2ctp_fk' OR relname =
'sc2mtp_fk' OR relname = 'sc2sc_fk' OR relname = 'ctp_statcon'

The size difference for the index is surprisingly big, I think,
considering that there are only around 1000 rows more in the table on the
live server than on the test server.
Row count for Live Server: 12597
Row count for Test Server: 11494
Any insight into this?
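If the extra pages in statcon_pk on the live server are bloat rather than
data, rebuilding the index should show it (note that REINDEX locks the
table while it runs):

```sql
-- Rebuild the suspect indexes; relpages should shrink if they were bloated.
REINDEX INDEX statcon_pk;
REINDEX INDEX ctp_statcon;

-- Then re-check the sizes:
SELECT relname, relpages
FROM pg_class
WHERE relname IN ('statcon_pk', 'ctp_statcon');
```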

Cheers
Jona

PS: The meta data for the table is:
CREATE TABLE statcon_tbl
(
 id serial NOT NULL,
 data bytea,
 wm bool DEFAULT 'FALSE',
 created timestamp DEFAULT now(),
 modified timestamp DEFAULT now(),
 enabled bool DEFAULT 'TRUE',
 bitsperpixel int4 DEFAULT 0,
 mtpid int4,
 sctid int4,
 ctpid int4,
 CONSTRAINT statcon_pk PRIMARY KEY (id),
 CONSTRAINT sc2ctp_fk FOREIGN KEY (ctpid) REFERENCES contype_tbl (id)
ON UPDATE CASCADE ON DELETE CASCADE,
 CONSTRAINT sc2mtp_fk FOREIGN KEY (mtpid) REFERENCES mimetype_tbl (id)
ON UPDATE CASCADE ON DELETE CASCADE,
 CONSTRAINT sc2sct_fk FOREIGN KEY (sctid) REFERENCES statcontrans_tbl
(id) ON UPDATE CASCADE ON DELETE CASCADE
) 
WITHOUT OIDS;
CREATE INDEX ctp_statcon ON statcon_tbl USING btree (sctid, ctpid);


Tom Lane wrote:

  Jona [EMAIL PROTECTED] writes:
  
  
anyway, here's the info for relpages:
Live Server: 424
Test Server: 338

  
  
I was asking about the indexes associated with the table, not the table
itself.

			regards, tom lane
  






Re: [PERFORM] Bad choice of query plan from PG 7.3.6 to PG 7.3.9

2005-05-07 Thread Jona




Wouldn't the VACUUM have made them equivalent??

anyway, here's the info for relpages:
Live Server: 424
Test Server: 338

Please note though that there are more rows on the live server than on
the test server due to a recent upload.
Total row counts are as follows:
Live Server: 12597
Test Server: 11494

When the problems started the tables were of identical size though.

Cheers
Jona

Tom Lane wrote:

  Jona [EMAIL PROTECTED] writes:
  
  
Test Server:
comm=# VACUUM ANALYZE VERBOSE StatCon_Tbl;
INFO:  --Relation public.statcon_tbl--
INFO:  Pages 338: Changed 0, Empty 0; Tup 11494: Vac 0, Keep 0, UnUsed 0.
Total CPU 0.02s/0.00u sec elapsed 1.98 sec.
INFO:  --Relation pg_toast.pg_toast_179851--
INFO:  Pages 85680: Changed 0, Empty 0; Tup 343321: Vac 0, Keep 0, UnUsed 0.
Total CPU 1.75s/0.23u sec elapsed 30.36 sec.
INFO:  Analyzing public.statcon_tbl
VACUUM

  
  
  
  
Live Server:
comm=# VACUUM ANALYZE VERBOSE StatCon_Tbl;
INFO:  --Relation public.statcon_tbl--
INFO:  Pages 424: Changed 0, Empty 0; Tup 12291: Vac 0, Keep 0, UnUsed 6101.
Total CPU 0.00s/0.01u sec elapsed 0.01 sec.
INFO:  --Relation pg_toast.pg_toast_891830--
INFO:  Pages 89234: Changed 0, Empty 0; Tup 352823: Vac 0, Keep 0, 
UnUsed 5487.
Total CPU 3.21s/0.47u sec elapsed 18.03 sec.
INFO:  Analyzing public.statcon_tbl
VACUUM

  
  
Hm, the physical table sizes aren't very different, which suggests that
the problem must lie with the indexes.  Unfortunately, VACUUM in 7.3
doesn't tell you anything about indexes if it doesn't have any dead rows
to clean up.  Could you look at pg_class.relpages for all the indexes
of this table, and see what that shows?

			regards, tom lane







Re: [PERFORM] Bad choice of query plan from PG 7.3.6 to PG 7.3.9

2005-05-06 Thread Jona




Thank you for the swift reply.
The test server is hardly ever vacuumed as it generally sees very
limited traffic. Vacuum is only necessary if the server sees a lot of
write operations, i.e. UPDATE, DELETE, INSERT, right?

What explains the different choice of query plans then?
As can be seen from the following snippets, the test server decides to
use an index twice in Query 2, whereas the live server decides to do a
full scan of tables with 38.5k and 5.5k records.
In Query 3 it's vice versa.
Seems strange to me...

Query 2:
--- Bad idea, price_tbl holds 38.5k records

Test:
-> Index Scan using aff_price_uq on price_tbl (cost=0.00..6.01 rows=1 width=4) (actual time=0.01..0.01 rows=1 loops=2838)

Live:
-> Seq Scan on price_tbl (cost=0.00..883.48 rows=2434 width=4) (actual time=0.86..67.25 rows=4570 loops=1)
     Filter: (affid = 8)

--- Bad idea, sct2subcattype_tbl holds 5.5k records

Test:
-> Index Scan using subcat_uq on sct2subcattype_tbl (cost=0.00..79.26 rows=26 width=8) (actual time=0.01..0.17 rows=59 loops=48)

Live:
-> Seq Scan on sct2subcattype_tbl (cost=0.00..99.26 rows=5526 width=8) (actual time=0.01..30.16 rows=5526 loops=1)

Query 3:
--- Bad idea, sct2lang_tbl has 8.6k records

Test:
-> Seq Scan on sct2lang_tbl (cost=0.00..150.79 rows=8679 width=8) (actual time=0.03..10.70 rows=8679 loops=1)

Live:
-> Index Scan using sct2lang_uq on sct2lang_tbl (cost=0.00..8.13 rows=2 width=8) (actual time=1.10..2.39 rows=2 loops=69)


Will get a VACUUM VERBOSE of StatCon_Tbl

Cheers
Jona

PS: The query plans are extracted using pgAdmin on Windows; if you can
recommend a better cross-platform PostgreSQL client I'd be happy to try
it out.

Tom Lane wrote:

  Jona [EMAIL PROTECTED] writes:
  
  
I'm currently experiencing problems with long query execution times.
What I believe makes these problems particularly interesting is the 
difference in execution plans between our test server running PostGreSQL 
7.3.6 and our production server running PostGreSQL 7.3.9.
The test server is an upgraded "home machine", a Pentium 4 with 1GB of 
memory and IDE disk.
The production server is a dual CPU XEON Pentium 4 with 2GB memory and 
SCSI disks.
One should expect the production server to be faster, but appearently 
not as the outlined query plans below shows.

  
  
I think the plans are fine; it looks to me like the production server
has serious table-bloat or index-bloat problems, probably because of
inadequate vacuuming.  For instance compare these entries:

-  Index Scan using ctp_statcon on statcon_tbl  (cost=0.00..6.01 rows=1 width=4) (actual time=0.05..0.31 rows=39 loops=4)
  Index Cond: ((statcon_tbl.sctid = "outer".sctid) AND (statcon_tbl.ctpid = 1))

-  Index Scan using ctp_statcon on statcon_tbl  (cost=0.00..20.40 rows=5 width=4) (actual time=27.97..171.84 rows=39 loops=4)
  Index Cond: ((statcon_tbl.sctid = "outer".sctid) AND (statcon_tbl.ctpid = 1))

Appears to be exactly the same task ... but the test server spent
1.24 msec total while the production server spent 687.36 msec total.
That's more than half of your problem right there.  Some of the other
scans seem a lot slower on the production machine too.

  
  
1) How come the query plans between the 2 servers are different?

  
  
The production server's rowcount estimates are pretty good, the test
server's are not.  How long since you vacuumed/analyzed the test server?

It'd be interesting to see the output of "vacuum verbose statcon_tbl"
on both servers ...

			regards, tom lane

PS: if you post any more query plans, please try to use software that
doesn't mangle the formatting so horribly ...






Re: [PERFORM] Bad choice of query plan from PG 7.3.6 to PG 7.3.9

2005-05-06 Thread Jona




Results of VACUUM VERBOSE from both servers

Test server:
comm=# VACUUM VERBOSE StatCon_Tbl;
INFO: --Relation public.statcon_tbl--
INFO: Pages 338: Changed 338, Empty 0; Tup 11494: Vac 0, Keep 0,
UnUsed 0.
 Total CPU 0.02s/0.00u sec elapsed 0.04 sec.
INFO: --Relation pg_toast.pg_toast_179851--
INFO: Pages 85680: Changed 85680, Empty 0; Tup 343321: Vac 0, Keep 0,
UnUsed 0.
 Total CPU 4.03s/0.40u sec elapsed 70.99 sec.
VACUUM

Live Server:
comm=# VACUUM VERBOSE StatCon_Tbl;
INFO: --Relation public.statcon_tbl--
INFO: Pages 424: Changed 0, Empty 0; Tup 12291: Vac 0, Keep 0, UnUsed
6101.
 Total CPU 0.01s/0.00u sec elapsed 0.60 sec.
INFO: --Relation pg_toast.pg_toast_891830--
INFO: Pages 89234: Changed 0, Empty 0; Tup 352823: Vac 0, Keep 0,
UnUsed 5487.
 Total CPU 4.44s/0.34u sec elapsed 35.48 sec.
VACUUM

Cheers
Jona

Tom Lane wrote:

  Jona [EMAIL PROTECTED] writes:
  
  
I'm currently experiencing problems with long query execution times.
What I believe makes these problems particularly interesting is the 
difference in execution plans between our test server running PostGreSQL 
7.3.6 and our production server running PostGreSQL 7.3.9.
The test server is an upgraded "home machine", a Pentium 4 with 1GB of 
memory and IDE disk.
The production server is a dual CPU XEON Pentium 4 with 2GB memory and 
SCSI disks.
One should expect the production server to be faster, but appearently 
not as the outlined query plans below shows.

  
  
I think the plans are fine; it looks to me like the production server
has serious table-bloat or index-bloat problems, probably because of
inadequate vacuuming.  For instance compare these entries:

-  Index Scan using ctp_statcon on statcon_tbl  (cost=0.00..6.01 rows=1 width=4) (actual time=0.05..0.31 rows=39 loops=4)
  Index Cond: ((statcon_tbl.sctid = "outer".sctid) AND (statcon_tbl.ctpid = 1))

-  Index Scan using ctp_statcon on statcon_tbl  (cost=0.00..20.40 rows=5 width=4) (actual time=27.97..171.84 rows=39 loops=4)
  Index Cond: ((statcon_tbl.sctid = "outer".sctid) AND (statcon_tbl.ctpid = 1))

Appears to be exactly the same task ... but the test server spent
1.24 msec total while the production server spent 687.36 msec total.
That's more than half of your problem right there.  Some of the other
scans seem a lot slower on the production machine too.

  
  
1) How come the query plans between the 2 servers are different?

  
  
The production server's rowcount estimates are pretty good, the test
server's are not.  How long since you vacuumed/analyzed the test server?

It'd be interesting to see the output of "vacuum verbose statcon_tbl"
on both servers ...

			regards, tom lane

PS: if you post any more query plans, please try to use software that
doesn't mangle the formatting so horribly ...






Re: [PERFORM] Bad choice of query plan from PG 7.3.6 to PG 7.3.9

2005-05-06 Thread Jona
Now with analyze
Test Server:
comm=# VACUUM ANALYZE VERBOSE StatCon_Tbl;
INFO:  --Relation public.statcon_tbl--
INFO:  Pages 338: Changed 0, Empty 0; Tup 11494: Vac 0, Keep 0, UnUsed 0.
   Total CPU 0.02s/0.00u sec elapsed 1.98 sec.
INFO:  --Relation pg_toast.pg_toast_179851--
INFO:  Pages 85680: Changed 0, Empty 0; Tup 343321: Vac 0, Keep 0, UnUsed 0.
   Total CPU 1.75s/0.23u sec elapsed 30.36 sec.
INFO:  Analyzing public.statcon_tbl
VACUUM
Live Server:
comm=# VACUUM ANALYZE VERBOSE StatCon_Tbl;
INFO:  --Relation public.statcon_tbl--
INFO:  Pages 424: Changed 0, Empty 0; Tup 12291: Vac 0, Keep 0, UnUsed 6101.
   Total CPU 0.00s/0.01u sec elapsed 0.01 sec.
INFO:  --Relation pg_toast.pg_toast_891830--
INFO:  Pages 89234: Changed 0, Empty 0; Tup 352823: Vac 0, Keep 0, 
UnUsed 5487.
   Total CPU 3.21s/0.47u sec elapsed 18.03 sec.
INFO:  Analyzing public.statcon_tbl
VACUUM

Have done some sampling running the same query a few times through the 
past few hours and it appears that the VACUUM has helped.
The following are the results after the Vacuum:

After VACUUM VERBOSE:
Index Scan using ctp_statcon on statcon_tbl  (cost=0.00..21.29 rows=5 width=4) (actual time=0.07..0.37 rows=39 loops=4)
  Index Cond: ((statcon_tbl.sctid = "outer".sctid) AND (statcon_tbl.ctpid = 1))

After VACUUM ANALYZE VERBOSE:
Index Scan using ctp_statcon on statcon_tbl  (cost=0.00..20.03 rows=5 width=4) (actual time=0.09..0.37 rows=39 loops=4)
  Index Cond: ((statcon_tbl.sctid = "outer".sctid) AND (statcon_tbl.ctpid = 1))

The only question remaining is why one server uses its indexes and the
other doesn't, even though VACUUM ANALYZE has now been run on both
servers?
And even more interesting, before the VACUUM ANALYZE it was the server
where no vacuum had taken place that used its index.

Cheers
Jona
Christopher Kings-Lynne wrote:
You didn't do analyze.
Chris
Jona wrote:
  Results of VACUUM VERBOSE from both servers
Test server:
comm=# VACUUM VERBOSE StatCon_Tbl;
INFO:  --Relation public.statcon_tbl--
INFO:  Pages 338: Changed 338, Empty 0; Tup 11494: Vac 0, Keep 0, 
UnUsed 0.
Total CPU 0.02s/0.00u sec elapsed 0.04 sec.
INFO:  --Relation pg_toast.pg_toast_179851--
INFO:  Pages 85680: Changed 85680, Empty 0; Tup 343321: Vac 0, Keep 
0, UnUsed 0.
Total CPU 4.03s/0.40u sec elapsed 70.99 sec.
VACUUM

Live Server:
comm=# VACUUM VERBOSE StatCon_Tbl;
INFO:  --Relation public.statcon_tbl--
INFO:  Pages 424: Changed 0, Empty 0; Tup 12291: Vac 0, Keep 0, 
UnUsed 6101.
Total CPU 0.01s/0.00u sec elapsed 0.60 sec.
INFO:  --Relation pg_toast.pg_toast_891830--
INFO:  Pages 89234: Changed 0, Empty 0; Tup 352823: Vac 0, Keep 0, 
UnUsed 5487.
Total CPU 4.44s/0.34u sec elapsed 35.48 sec.
VACUUM

Cheers
Jona
Tom Lane wrote:
Jona [EMAIL PROTECTED] writes:
 

I'm currently experiencing problems with long query execution times.
What I believe makes these problems particularly interesting is the 
difference in execution plans between our test server running 
PostGreSQL 7.3.6 and our production server running PostGreSQL 7.3.9.
The test server is an upgraded home machine, a Pentium 4 with 1GB 
of memory and IDE disk.
The production server is a dual CPU XEON Pentium 4 with 2GB memory 
and SCSI disks.
One should expect the production server to be faster, but 
appearently not as the outlined query plans below shows.
  
I think the plans are fine; it looks to me like the production server
has serious table-bloat or index-bloat problems, probably because of
inadequate vacuuming.  For instance compare these entries:
-  Index Scan using ctp_statcon on statcon_tbl  (cost=0.00..6.01 
rows=1 width=4) (actual time=0.05..0.31 rows=39 loops=4)
 Index Cond: ((statcon_tbl.sctid = outer.sctid) AND 
(statcon_tbl.ctpid = 1))

-  Index Scan using ctp_statcon on statcon_tbl  (cost=0.00..20.40 
rows=5 width=4) (actual time=27.97..171.84 rows=39 loops=4)
 Index Cond: ((statcon_tbl.sctid = outer.sctid) AND 
(statcon_tbl.ctpid = 1))

Appears to be exactly the same task ... but the test server spent
1.24 msec total while the production server spent 687.36 msec total.
That's more than half of your problem right there.  Some of the other
scans seem a lot slower on the production machine too.
 

1) How come the query plans between the 2 servers are different?
  
The production server's rowcount estimates are pretty good, the test
server's are not.  How long since you vacuumed/analyzed the test 
server?

It'd be interesting to see the output of vacuum verbose statcon_tbl
on both servers ...
regards, tom lane
PS: if you post any more query plans, please try to use software that
doesn't mangle the formatting so horribly ...

[PERFORM] Bad choice of query plan from PG 7.3.6 to PG 7.3.9 part 1

2005-05-05 Thread Jona
Hi
I'm currently experiencing problems with long query execution times.
What I believe makes these problems particularly interesting is the 
difference in execution plans between our test server running PostGreSQL 
7.3.6 and our production server running PostGreSQL 7.3.9.
The test server is an upgraded "home machine", a Pentium 4 with 1GB of 
memory and an IDE disk.
The production server is a dual-CPU Xeon Pentium 4 with 2GB memory and 
SCSI disks.
One should expect the production server to be faster, but apparently 
not, as the outlined query plans below show.

My questions can be summed up as:
1) How come the query plans between the 2 servers are different?
2) How come the production server in general estimates the cost of the 
query plans so horribly wrong? (i.e. it chooses a bad query plan whereas 
the test server chooses a good plan)
3) In Query 2, how come the production server refuses to use its 
indexes (subcat_uq and aff_price_uq, both unique indexes) whereas the 
test server determines that the indexes are the way to go?
4) In Query 3, how come the test server refuses to use its index 
(sct2lang_uq) while the production server uses it? And why is the test 
server still faster even though it makes a sequential scan of a table 
with 8.5k records in it?

Please note, a VACUUM ANALYSE is run on the production server once a day 
(used to be once an hour but it seemed to make no difference), however 
there are generally no writes to the tables used in the queries.

If anyone could shed some light on these issues I would truly appreciate 
it.

Cheers
Jona
PS. Please refer to part 2 for the other queries and query plans
 

Query 1:
EXPLAIN ANALYZE
SELECT DISTINCT StatConTrans_Tbl.id, Code_Tbl.sysnm AS code, 
PriceCat_Tbl.amount AS price, Country_Tbl.currency,
 CreditsCat_Tbl.amount AS credits, Info_Tbl.title, Info_Tbl.description
FROM (SCT2SubCatType_Tbl
INNER JOIN SCT2Lang_Tbl ON SCT2SubCatType_Tbl.sctid = SCT2Lang_Tbl.sctid
INNER JOIN Language_Tbl ON SCT2Lang_Tbl.langid = Language_Tbl.id AND 
Language_Tbl.sysnm = UPPER('us') AND Language_Tbl.enabled = true
INNER JOIN Info_Tbl ON SCT2SubCatType_Tbl.sctid = Info_Tbl.sctid AND 
Language_Tbl.id = Info_Tbl.langid
INNER JOIN SubCatType_Tbl ON SCT2SubCatType_Tbl.subcattpid = 
SubCatType_Tbl.id AND SubCatType_Tbl.enabled = true
INNER JOIN CatType_Tbl ON SubCatType_Tbl.cattpid = CatType_Tbl.id AND 
CatType_Tbl.enabled = true
INNER JOIN SuperCatType_Tbl ON CatType_Tbl.spcattpid = 
SuperCatType_Tbl.id AND SuperCatType_Tbl.enabled = true
INNER JOIN StatConTrans_Tbl ON SCT2SubCatType_Tbl.sctid = 
StatConTrans_Tbl.id AND StatConTrans_Tbl.enabled = true
INNER JOIN Price_Tbl ON StatConTrans_Tbl.id = Price_Tbl.sctid AND 
Price_Tbl.affid = 8
INNER JOIN PriceCat_Tbl ON Price_Tbl.prccatid = PriceCat_Tbl.id AND 
PriceCat_Tbl.enabled = true
INNER JOIN Country_Tbl ON PriceCat_Tbl.cntid = Country_Tbl.id AND 
Country_Tbl.enabled = true
INNER JOIN CreditsCat_Tbl ON Price_Tbl.crdcatid = CreditsCat_Tbl.id AND 
CreditsCat_Tbl.enabled = true
INNER JOIN StatCon_Tbl ON StatConTrans_Tbl.id = StatCon_Tbl.sctid AND 
StatCon_Tbl.ctpid = 1
INNER JOIN Code_Tbl ON SuperCatType_Tbl.id = Code_Tbl.spcattpid AND 
Code_Tbl.affid = 8 AND Code_Tbl.cdtpid = 1)
WHERE SCT2SubCatType_Tbl.subcattpid = 79
ORDER BY StatConTrans_Tbl.id DESC
LIMIT 8 OFFSET 0

Plan on PostgreSQL 7.3.6 on Red Hat Linux 3.2.3-39
Limit  (cost=178.59..178.61 rows=1 width=330) (actual time=22.77..28.51 rows=4 loops=1)
  ->  Unique  (cost=178.59..178.61 rows=1 width=330) (actual time=22.77..28.50 rows=4 loops=1)
        ->  Sort  (cost=178.59..178.60 rows=1 width=330) (actual time=22.76..22.85 rows=156 loops=1)
              Sort Key: statcontrans_tbl.id, code_tbl.sysnm, pricecat_tbl.amount, country_tbl.currency, creditscat_tbl.amount, info_tbl.title, info_tbl.description
              ->  Hash Join  (cost=171.19..178.58 rows=1 width=330) (actual time=3.39..6.55 rows=156 loops=1)
                    Hash Cond: ("outer".cntid = "inner".id)
                    ->  Nested Loop  (cost=170.13..177.51 rows=1 width=312) (actual time=3.27..5.75 rows=156 loops=1)
                          Join Filter: ("inner".sctid = "outer".sctid)
                          ->  Hash Join  (cost=170.13..171.48 rows=1 width=308) (actual time=3.12..3.26 rows=4 loops=1)
                                Hash Cond: ("outer".crdcatid = "inner".id)
                                ->  Hash Join  (cost=169.03..170.38 rows=1 width=300) (actual time=3.00..3.11 rows=4 loops=1)
                                      Hash Cond: ("outer".spcattpid = "inner".spcattpid)
                                      ->  Hash Join  (cost=167.22..168.56 rows=1 width=253) (actual time=2.88..2.97 rows=4 loops=1)
                                            Hash Cond: ("outer".id = "inner".prccatid)
                                            ->  Seq Scan on pricecat_tbl  (cost=0.00..1.29 rows=12 width=12
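
One way to probe question 3 on the production server is to disable
sequential scans for a single session and re-run EXPLAIN ANALYZE: if the
planner then picks the index and the query is faster, the problem is cost
estimation (often random_page_cost or stale statistics) rather than a
missing or unusable index. A sketch, using a simplified probe query
against one of the indexed columns from the query above:

```sql
-- Session-local experiment only; do not set this globally.
SET enable_seqscan = off;
EXPLAIN ANALYZE
SELECT * FROM price_tbl WHERE affid = 8;  -- should now use aff_price_uq
SET enable_seqscan = on;
```

Comparing the estimated cost of the forced index plan against the seqscan
plan shows how far off the planner's cost model is.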

[PERFORM] Testing list access

2005-05-03 Thread Jona
Testing list access
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


[PERFORM] Bad choice of query plan from PG 7.3.6 to PG 7.3.9 part 1b

2005-05-03 Thread Jona
Please refer to part 1a for questions and part 2 for more queries and 
query plans.
Why won't this list accept my questions and sample data in one mail???

/Jona
 

Query 1:
EXPLAIN ANALYZE
SELECT DISTINCT StatConTrans_Tbl.id, Code_Tbl.sysnm AS code, 
PriceCat_Tbl.amount AS price, Country_Tbl.currency,
 CreditsCat_Tbl.amount AS credits, Info_Tbl.title, Info_Tbl.description
FROM (SCT2SubCatType_Tbl
INNER JOIN SCT2Lang_Tbl ON SCT2SubCatType_Tbl.sctid = SCT2Lang_Tbl.sctid
INNER JOIN Language_Tbl ON SCT2Lang_Tbl.langid = Language_Tbl.id AND 
Language_Tbl.sysnm = UPPER('us') AND Language_Tbl.enabled = true
INNER JOIN Info_Tbl ON SCT2SubCatType_Tbl.sctid = Info_Tbl.sctid AND 
Language_Tbl.id = Info_Tbl.langid
INNER JOIN SubCatType_Tbl ON SCT2SubCatType_Tbl.subcattpid = 
SubCatType_Tbl.id AND SubCatType_Tbl.enabled = true
INNER JOIN CatType_Tbl ON SubCatType_Tbl.cattpid = CatType_Tbl.id AND 
CatType_Tbl.enabled = true
INNER JOIN SuperCatType_Tbl ON CatType_Tbl.spcattpid = 
SuperCatType_Tbl.id AND SuperCatType_Tbl.enabled = true
INNER JOIN StatConTrans_Tbl ON SCT2SubCatType_Tbl.sctid = 
StatConTrans_Tbl.id AND StatConTrans_Tbl.enabled = true
INNER JOIN Price_Tbl ON StatConTrans_Tbl.id = Price_Tbl.sctid AND 
Price_Tbl.affid = 8
INNER JOIN PriceCat_Tbl ON Price_Tbl.prccatid = PriceCat_Tbl.id AND 
PriceCat_Tbl.enabled = true
INNER JOIN Country_Tbl ON PriceCat_Tbl.cntid = Country_Tbl.id AND 
Country_Tbl.enabled = true
INNER JOIN CreditsCat_Tbl ON Price_Tbl.crdcatid = CreditsCat_Tbl.id AND 
CreditsCat_Tbl.enabled = true
INNER JOIN StatCon_Tbl ON StatConTrans_Tbl.id = StatCon_Tbl.sctid AND 
StatCon_Tbl.ctpid = 1
INNER JOIN Code_Tbl ON SuperCatType_Tbl.id = Code_Tbl.spcattpid AND 
Code_Tbl.affid = 8 AND Code_Tbl.cdtpid = 1)
WHERE SCT2SubCatType_Tbl.subcattpid = 79
ORDER BY StatConTrans_Tbl.id DESC
LIMIT 8 OFFSET 0

Plan on PostgreSQL 7.3.6 on Red Hat Linux 3.2.3-39
Limit  (cost=178.59..178.61 rows=1 width=330) (actual time=22.77..28.51 rows=4 loops=1)
  ->  Unique  (cost=178.59..178.61 rows=1 width=330) (actual time=22.77..28.50 rows=4 loops=1)
        ->  Sort  (cost=178.59..178.60 rows=1 width=330) (actual time=22.76..22.85 rows=156 loops=1)
              Sort Key: statcontrans_tbl.id, code_tbl.sysnm, pricecat_tbl.amount, country_tbl.currency, creditscat_tbl.amount, info_tbl.title, info_tbl.description
              ->  Hash Join  (cost=171.19..178.58 rows=1 width=330) (actual time=3.39..6.55 rows=156 loops=1)
                    Hash Cond: ("outer".cntid = "inner".id)
                    ->  Nested Loop  (cost=170.13..177.51 rows=1 width=312) (actual time=3.27..5.75 rows=156 loops=1)
                          Join Filter: ("inner".sctid = "outer".sctid)
                          ->  Hash Join  (cost=170.13..171.48 rows=1 width=308) (actual time=3.12..3.26 rows=4 loops=1)
                                Hash Cond: ("outer".crdcatid = "inner".id)
                                ->  Hash Join  (cost=169.03..170.38 rows=1 width=300) (actual time=3.00..3.11 rows=4 loops=1)
                                      Hash Cond: ("outer".spcattpid = "inner".spcattpid)
                                      ->  Hash Join  (cost=167.22..168.56 rows=1 width=253) (actual time=2.88..2.97 rows=4 loops=1)
                                            Hash Cond: ("outer".id = "inner".prccatid)
                                            ->  Seq Scan on pricecat_tbl  (cost=0.00..1.29 rows=12 width=12) (actual time=0.04..0.08 rows=23 loops=1)
                                                  Filter: (enabled = true)
                                            ->  Hash  (cost=167.21..167.21 rows=1 width=241) (actual time=2.80..2.80 rows=0 loops=1)
                                                  ->  Nested Loop  (cost=3.77..167.21 rows=1 width=241) (actual time=1.31..2.79 rows=4 loops=1)
                                                        Join Filter: ("inner".sctid = "outer".sctid)
                                                        ->  Nested Loop  (cost=3.77..161.19 rows=1 width=229) (actual time=1.19..2.60 rows=4 loops=1)
                                                              Join Filter: ("outer".sctid = "inner".sctid)
                                                              ->  Hash Join  (cost=3.77..155.17 rows=1 width=44) (actual time=1.07..2.37 rows=4 loops=1)
                                                                    Hash Cond: ("outer".langid = "inner".id)
                                                                    ->  Nested Loop  (cost=2.69..154.06 rows=7 width=40) (actual time=0.90..2.18 rows=8 loops=1)
                                                                          Join Filter: ("outer".sctid = "inner".sctid)
                                                                          ->  Nested Loop  (cost=2.69..21.30 rows=1 width=32) (actual time=0.78..1.94 rows=4 loops=1