Re: [PERFORM] slow queries after ANALYZE

2005-11-14 Thread DW

DW wrote:

Hello,

I'm perplexed. I'm trying to find out why some queries are taking a long 
time, and have found that after running analyze,  one particular query 
becomes slow.


This query is based on a view that is based on multiple left outer joins 
to merge data from lots of tables.


If I drop the database and reload it from a dump, the query result is 
instaneous (less than one second).


But after I run analyze, it then takes much longer to run -- about 10 
seconds, give or take a few depending on the hardware I'm testing it on.
Earlier today, it was taking almost 30 seconds on the actual production 
server -- I restarted pgsql server and the time got knocked down to 
about 10 seconds -- another thing I don't understand.


I've run the query a number of times before and after running analyze, 
and the problem reproduces everytime. I also ran with "explain", and saw 
that the costs go up dramatically after I run analyze.


I'm fairly new to postgresql and not very experienced as a db admin to 
begin with, but it looks like I'm going to have to get smarter about 
this stuff fast, unless it's something the programmers need to deal with 
when constructing their code and queries or designing the databases.


I've already learned that I've commited the cardinal sin of configuring 
my new database server with RAID 5 instead of something more sensible 
for databases like 0+1, but I've been testing out and replicating this 
problem on different hardware, so I know that this issue is not the 
direct cause of this.


Thanks for any info. I can supply more info (like config files, schemas, 
etc.) if you think it might help. But I though I would just describe the 
problem for starters.


-DW

Well, for whatever it's worth, on my test box, I upgraded from postgreql 
7.4.9 to 8.1, and that seems to make all the difference in the world.


These complex queries are instantaneous, and the query planner when I 
run EXPLAIN ANALYZE both before and after running ANALYZE displays 
results more in line with what is expected (< 60ms).


Whatever changes were introduced  in 8.x seems to make a huge improvment 
in query performance.










---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] slow queries after ANALYZE

2005-11-12 Thread hubert depesz lubaczewski
On 11/11/05, DW <[EMAIL PROTECTED]> wrote:
I'm perplexed. I'm trying to find out why some queries are taking a longtime, and have found that after running analyze,  one particular querybecomes slow.
i have had exactly the same problem very recently.
what helped? increasing statistics on come column.
which ones?
make:
explain analyze ;
and check in which situations you gget the biggest change of "estiamted rows" and "actual rows".
then check what this particular part of your statement is touching, and increase appropriate statistics.

depesz


Re: [PERFORM] slow queries after ANALYZE

2005-11-11 Thread Tom Lane
DW <[EMAIL PROTECTED]> writes:
> In the meantime, again I'm new to this -- I got pg_stats; which rows are 
>   the relevent ones?

The ones for columns that are mentioned in the problem query.
I don't think you need to worry about columns used only in the SELECT
output list, but anything used in WHERE, GROUP BY, etc is interesting.

> Also, I am running postgresql-server-7.4.9 from FreeBSD port (with 
> optimized CFLAGS turned on during compiling)
> OS: FreeBSD 5.4 p8

The hardware environment (particularly disks/filesystems) is relevant
too.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] slow queries after ANALYZE

2005-11-11 Thread DW

Tom Lane wrote:


It would be interesting to see EXPLAIN ANALYZE results in both cases,
plus the contents of the relevant pg_stats rows.  (BTW, you need not
dump and reload to get back to the virgin state --- just delete the
relevant rows from pg_statistic.)  Also we'd want to know exactly what
PG version this is, and on what sort of platform.



Thanks for replying. I've got a message into to my team asking if I need 
to de-identify some of the table names before I go submitting output to 
a public mailing list.


In the meantime, again I'm new to this -- I got pg_stats; which rows are 
 the relevent ones?


Also, I am running postgresql-server-7.4.9 from FreeBSD port (with 
optimized CFLAGS turned on during compiling)


OS: FreeBSD 5.4 p8

Thanks,
DW


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] slow queries after ANALYZE

2005-11-11 Thread Tom Lane
DW <[EMAIL PROTECTED]> writes:
> I'm perplexed. I'm trying to find out why some queries are taking a long 
> time, and have found that after running analyze,  one particular query 
> becomes slow.

This implies that the planner's default choice of plan (without any
statistics) is better than its choice when informed by statistics.
This is undesirable but not unheard of :-(

It would be interesting to see EXPLAIN ANALYZE results in both cases,
plus the contents of the relevant pg_stats rows.  (BTW, you need not
dump and reload to get back to the virgin state --- just delete the
relevant rows from pg_statistic.)  Also we'd want to know exactly what
PG version this is, and on what sort of platform.

You might be able to fix things by increasing the statistics targets or
tweaking planner cost parameters, but it'd be best to investigate before
trying to fix.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PERFORM] slow queries after ANALYZE

2005-11-11 Thread DW

Hello,

I'm perplexed. I'm trying to find out why some queries are taking a long 
time, and have found that after running analyze,  one particular query 
becomes slow.


This query is based on a view that is based on multiple left outer joins 
to merge data from lots of tables.


If I drop the database and reload it from a dump, the query result is 
instaneous (less than one second).


But after I run analyze, it then takes much longer to run -- about 10 
seconds, give or take a few depending on the hardware I'm testing it on.
Earlier today, it was taking almost 30 seconds on the actual production 
server -- I restarted pgsql server and the time got knocked down to 
about 10 seconds -- another thing I don't understand.


I've run the query a number of times before and after running analyze, 
and the problem reproduces everytime. I also ran with "explain", and saw 
that the costs go up dramatically after I run analyze.


I'm fairly new to postgresql and not very experienced as a db admin to 
begin with, but it looks like I'm going to have to get smarter about 
this stuff fast, unless it's something the programmers need to deal with 
when constructing their code and queries or designing the databases.


I've already learned that I've commited the cardinal sin of configuring 
my new database server with RAID 5 instead of something more sensible 
for databases like 0+1, but I've been testing out and replicating this 
problem on different hardware, so I know that this issue is not the 
direct cause of this.


Thanks for any info. I can supply more info (like config files, schemas, 
etc.) if you think it might help. But I though I would just describe the 
problem for starters.


-DW


---(end of broadcast)---
TIP 6: explain analyze is your friend