Gary Cowell wrote:
--- [EMAIL PROTECTED] wrote: > You can roughly estimate time
spent for just scaning

the table using
something like this:


        select sum(version) from ... where version is not
null

and just

        select sum(version) from ...

The results would be interesting to compare.


To answer (I hope) everyones questions at once:

1) Oracle and postmaster were not running at the same
time
2) The queries were run once, to cache as much as
possible then run again to get the timing

3) Distinct vs. no distinct (i.e. sort performance).

select length(version) from vers where version is not
null;

Time: 9748.174 ms

select distinct(version) from vers where version is
not null;

Time: 67988.972 ms

So about an extra 60 seconds with the distinct on.

Which is basically the sorting time...

Here is the explain analyze output from psql:

# explain analyze select distinct version from vers
where version is not null;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Unique (cost=117865.77..120574.48 rows=142
width=132) (actual time=63623.428..68269.111 rows=536
loops=1)
-> Sort (cost=117865.77..119220.13 rows=541741
width=132) (actual time=63623.417..66127.641
rows=541741 loops=1)
Sort Key: "version"
-> Seq Scan on vers (cost=0.00..21367.41
rows=541741 width=132) (actual time=0.218..7214.903
rows=541741 loops=1)
Filter: ("version" IS NOT NULL)
Total runtime: 68324.215 ms
(6 rows)


Time: 68326.062 ms

Yep - the seq-scan takes 7214.903 ms, there's a huge setup time for the sort (63623.417) and it's not finished until 66127.641ms have elapsed.



And the non-default .conf parameters:

tcpip_socket = true
max_connections = 100
password_encryption = true
shared_buffers = 2000
sort_mem = 16384 vacuum_mem = 8192 effective_cache_size = 4000
syslog = 2

Well, I'd probably up vacuum_mem, and check how much RAM is being used for disk cache - I'd guess it's more than 32MB (4000 * 8kb).


You might want to up the shared_buffers, but that's going to depend on the load.

Try increasing sort_mem temporarily, and see if that makes a difference:
SET sort_mem = 64000;
EXPLAIN ANALYSE ...
The only thing I can think is that you're getting disk activity to get a sort that slow. I'd be expecting a hash-sort if PG thought it could fit the distinct values in memory.


--
  Richard Huxton
  Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Reply via email to