On 1/30/07, Simon Riggs <[EMAIL PROTECTED]> wrote:
> explain analyze select distinct a, b from tbl
>
> EXPLAIN ANALYZE output is:
>
> Unique (cost=500327.32..525646.88 rows=1848 width=6) (actual time=52719.868..56126.356 rows=5390 loops=1)
>   -> Sort (cost=500327.32..508767.17 rows=3375941 width=6) (actual time=52719.865..54919.989 rows=3378864 loops=1)
>         Sort Key: a, b
>         -> Seq Scan on tbl (cost=0.00..101216.41 rows=3375941 width=6) (actual time=16.643..20652.610 rows=3378864 loops=1)
> Total runtime: 57307.394 ms

All your time is in the sort, not in the SeqScan.

Increase your work_mem.

Sounds like an opportunity to implement a "Sort Unique" (sort of like a hash, I guess): there is no need to push 3M rows through a sort algorithm only to shave them down to 5390 unique rows (the planner estimated 1848). I am assuming this optimization just isn't implemented in PostgreSQL?

--
Chad
http://www.postgresqlforums.com/
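
To make the idea concrete, here is a rough sketch in Python (not PostgreSQL code) contrasting the two strategies: sorting every row and then dropping adjacent duplicates, which is what the Sort -> Unique plan above does, versus a single pass that remembers each distinct (a, b) pair in a hash set. The function names and the sample data are made up for illustration. The hash approach assumes the set of distinct rows fits in memory, and its output is not sorted.

```python
import random
from typing import Hashable, Iterable, List, Tuple

Row = Tuple[Hashable, ...]


def distinct_by_sort(rows: Iterable[Row]) -> List[Row]:
    """Sort all rows, then keep a row only if it differs from the previous one.

    Mirrors the Sort -> Unique plan: O(n log n) comparisons over every input row.
    """
    out: List[Row] = []
    for row in sorted(rows):
        if not out or out[-1] != row:
            out.append(row)
    return out


def distinct_by_hash(rows: Iterable[Row]) -> List[Row]:
    """Single pass over the input, remembering rows already seen in a hash set.

    Memory grows with the number of distinct rows, not the number of input rows.
    Output is in first-seen order, not sorted order.
    """
    seen = set()
    out: List[Row] = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            out.append(row)
    return out


if __name__ == "__main__":
    # Hypothetical data: many rows, few distinct (a, b) combinations.
    rng = random.Random(0)
    rows = [(rng.randrange(70), rng.randrange(77)) for _ in range(100_000)]

    by_hash = distinct_by_hash(rows)
    by_sort = distinct_by_sort(rows)

    # Both strategies find the same set of rows; only the order differs.
    assert sorted(by_hash) == by_sort
    print(len(rows), "input rows ->", len(by_hash), "distinct rows")
```

When the number of distinct rows is tiny compared with the input, as it is here, the hash version never has to order the full input, which is the saving being asked about.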