--- [EMAIL PROTECTED] wrote: > You can roughly estimate time spent for just scaning
the table using
something like this:
select sum(version) from ... where version is not null
and just
select sum(version) from ...
The results would be interesting to compare.
To answer (I hope) everyones questions at once:
1) Oracle and postmaster were not running at the same time 2) The queries were run once, to cache as much as possible then run again to get the timing
3) Distinct vs. no distinct (i.e. sort performance).
select length(version) from vers where version is not null;
Time: 9748.174 ms
select distinct(version) from vers where version is not null;
Time: 67988.972 ms
So about an extra 60 seconds with the distinct on.
Which is basically the sorting time...
Here is the explain analyze output from psql:
# explain analyze select distinct version from vers
where version is not null;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Unique (cost=117865.77..120574.48 rows=142
width=132) (actual time=63623.428..68269.111 rows=536
loops=1)
-> Sort (cost=117865.77..119220.13 rows=541741
width=132) (actual time=63623.417..66127.641
rows=541741 loops=1)
Sort Key: "version"
-> Seq Scan on vers (cost=0.00..21367.41
rows=541741 width=132) (actual time=0.218..7214.903
rows=541741 loops=1)
Filter: ("version" IS NOT NULL)
Total runtime: 68324.215 ms
(6 rows)
Time: 68326.062 ms
Yep - the seq-scan takes 7214.903 ms, there's a huge setup time for the sort (63623.417) and it's not finished until 66127.641ms have elapsed.
And the non-default .conf parameters:
tcpip_socket = true
max_connections = 100
password_encryption = true
shared_buffers = 2000
sort_mem = 16384 vacuum_mem = 8192 effective_cache_size = 4000
syslog = 2
Well, I'd probably up vacuum_mem, and check how much RAM is being used for disk cache - I'd guess it's more than 32MB (4000 * 8kb).
You might want to up the shared_buffers, but that's going to depend on the load.
Try increasing sort_mem temporarily, and see if that makes a difference:
SET sort_mem = 64000;
EXPLAIN ANALYSE ...
The only thing I can think is that you're getting disk activity to get a sort that slow. I'd be expecting a hash-sort if PG thought it could fit the distinct values in memory.
-- Richard Huxton Archonet Ltd
---------------------------(end of broadcast)--------------------------- TIP 6: Have you searched our list archives?
http://archives.postgresql.org