[PERFORM] sequential scan on select distinct

Ole Langbehn Wed, 06 Oct 2004 02:30:04 -0700

Hi,

I'm using Postgres 7.4.5. Tables are analyzed & vacuumed.


I am wondering why postgresql never uses an index on queries of the type
'select distinct ...' while e.g. mysql uses the index on the same query.
See the following explains:


postgresql:

explain analyze select distinct "land" from "customer_dim";
---------------------------------------------------------------------------------------------------------------------------------------+
                                                              QUERY PLAN               
                                                |
---------------------------------------------------------------------------------------------------------------------------------------+
 Unique  (cost=417261.85..430263.66 rows=18 width=15) (actual 
time=45875.235..67204.694 rows=103 loops=1)                              |
   ->  Sort  (cost=417261.85..423762.75 rows=2600362 width=15) (actual 
time=45875.226..54114.473 rows=2600362 loops=1)                 |
         Sort Key: land                                                                
                                                |
         ->  Seq Scan on customer_dim  (cost=0.00..84699.62 rows=2600362 width=15) 
(actual time=0.048..10733.227 rows=2600362 loops=1) |
 Total runtime: 67246.465 ms                                                           
                                                |
---------------------------------------------------------------------------------------------------------------------------------------+


mysql:

explain select DISTINCT `customer_dim`.`land` from `customer_dim`;
--------------+-------+---------------+---------------+---------+--------+---------+-------------+
    table     | type  | possible_keys |      key      | key_len |  ref   |  rows   |   
 Extra    |
--------------+-------+---------------+---------------+---------+--------+---------+-------------+
 customer_dim | index | [NULL]        | IDX_cstd_land | 81      | [NULL] | 2600362 | 
Using index |
--------------+-------+---------------+---------------+---------+--------+---------+-------------+
1 row in result (first row: 8 msec; total: 9 msec)



The result set contains 103 rows (but i get this behavior with every query of
this kind). My tables consist of at least a million rows.

The indexes on the column 'land' are standard indexes, so in case of
postgresql, it's a btree-index. I've tried to change the index type, but to no
avail.

So, why doesn't postgresql use the index, and (how) could i persuade postgresql
to use an index for this type of query?

TiA

-- 
Ole Langbehn

freiheit.com technologies gmbh
Theodorstr. 42-90 / 22761 Hamburg, Germany
fon � � � +49 (0)40 / 890584-0
fax � � � +49 (0)40 / 890584-20

Freie Software durch B�cherkauf f�rdern | http://bookzilla.de/

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

               http://archives.postgresql.org

[PERFORM] sequential scan on select distinct

Reply via email to