Re: [GENERAL] select distinct, index not used

Thomas Guettler Fri, 17 Apr 2009 01:35:46 -0700

Thank you Tom. The cron job for vacuum+analyze was not installed on the host.
(I had this idea some seconds after posting)


After vacuum+analyze the performance is good. I am happy.

Nevertheless, on a different host with nearly the same data, a index scan is 
used.

foo_hostone_foo=# explain analyze SELECT DISTINCT "foo_abc_abc"."lieferant" 
FROM "foo_abc_abc";
                                                             QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=15241.56..15820.71 rows=15 width=8) (actual 
time=1878.213..2393.550 rows=34 loops=1)
   ->  Sort  (cost=15241.56..15531.13 rows=115830 width=8) (actual 
time=1878.207..2227.478 rows=115830 loops=1)
         Sort Key: lieferant
         ->  Seq Scan on foo_abc_abc  (cost=0.00..3518.30 rows=115830 width=8) 
(actual time=0.042..226.883 rows=115830
loops=1)
 Total runtime: 2394.960 ms
(5 Zeilen)

foo_hostone_foo=# select version();
                                                     version
------------------------------------------------------------------------------------------------------------------
 PostgreSQL 8.2.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 
20061115 (prerelease) (SUSE Linux)
(1 Zeile)




foo_hosttwo_foo=# explain analyze SELECT DISTINCT "foo_abc_abc"."lieferant" 
FROM "foo_abc_abc";
                                                                            
QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=0.00..65641.70 rows=19 width=18) (actual time=0.163..1490.106 
rows=68 loops=1)
   ->  Index Scan using foo_abc_abc_lieferant on foo_abc_abc  
(cost=0.00..64536.38 rows=442127 width=18) (actual
time=0.155..955.844 rows=227600 loops=1)
 Total runtime: 1490.481 ms
(3 Zeilen)

foo_hosttwo_foo=# select version();
                                          version
--------------------------------------------------------------------------------------------
 PostgreSQL 8.2.6 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.2.1 
(SUSE Linux)
(1 Zeile)



Tom Lane schrieb:
> Thomas Guettler <[email protected]> writes:
>> why does the statement take so long? The column 'lieferant' is indexed. But
>> a sequential scan gets done.
> 
> It might have something to do with the fact that the planner's idea of
> the size of the table is off by a factor of more than 100:
> 
>>          ->  Seq Scan on foo_abc_abc  (cost=0.00..468944.11 rows=15404611 
>> width=8) (actual time=0.029..125458.870 rows=115830 loops=1)
> 
> You might need to review your vacuuming policy.
> 
> (However, a full table indexscan isn't going to be particularly fast in
> any case; it's often the case that seqscan-and-sort is the right
> decision.  I'm not sure this choice was wrong.)
> 
>                       regards, tom lane
> 

-- 
Thomas Guettler, http://www.thomas-guettler.de/
E-Mail: guettli (*) thomas-guettler + de

-- 
Sent via pgsql-general mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] select distinct, index not used

Reply via email to