Hi all.

I'm seeing something fishy when trying to self-join two large tables and then 
order by one column.

I have the following schema:

CREATE TABLE origo_person(
id SERIAL PRIMARY KEY,
firstname varchar,
lastname varchar,
created timestamp not null default now(),
created_by integer REFERENCES onp_user(id),
onp_user_id integer references onp_user(id)
);

create index origo_person_created_idx on origo_person(created);
create index origo_person_createdby_idx on origo_person(created_by);
create index origo_person_onp_user_id_idx on origo_person(onp_user_id);
create index origo_person_firstname_idx on origo_person(firstname);

insert into onp_user(id) values (1);

copy origo_person (firstname, lastname, created) 
from '/home/andreak/simpleperson.sql';

update origo_person set created_by = 1;

update origo_person set onp_user_id = 1 where id = 1;

simpleperson.sql has more than 200K entries in COPY-format:
$ tail -5 /home/andreak/simpleperson.sql
INGOLF  KALLEBERG       2007-08-21 22:23:43.571421
SIGRUNN BRUVIK  2007-08-21 22:23:43.571421
ELFRID  FROGNER 2007-08-21 22:23:43.571421
GUNNAR KRISTOFFER       DOVLAND 2007-08-21 22:23:43.571421
JAN ARNE        HAARR   2007-08-21 22:23:43.571421

Now, the two first queries are *fast*, but the 3rd query is slow:
1. Fast:
EXPLAIN ANALYZE SELECT p.firstname, p.lastname FROM origo_person p order by 
p.firstname ASC limit 5;
                                                                        QUERY 
PLAN
-----------------------------------------------------------------------------
 Limit  (cost=0.00..0.55 rows=5 width=17) (actual time=0.031..0.070 rows=5 
loops=1)
   ->  Index Scan using origo_person_firstname_idx on origo_person p  
(cost=0.00..22277.13 rows=200827 width=17) (actual time=0.025..0.046 rows=5 
loops=1)
 Total runtime: 0.128 ms
(3 rows)



2. Fast:
EXPLAIN ANALYZE SELECT p.firstname, p.lastname FROM origo_person p, 
origo_person pcb WHERE pcb.onp_user_id = p.created_by limit 5;
                                                                             
QUERY PLAN
-----------------------------------------------------------------------------
 Limit  (cost=0.00..31526.55 rows=1 width=17) (actual time=0.096..0.170 rows=5 
loops=1)
   ->  Merge Join  (cost=0.00..31526.55 rows=1 width=17) (actual 
time=0.091..0.145 rows=5 loops=1)
         Merge Cond: (p.created_by = pcb.onp_user_id)
         ->  Index Scan using origo_person_createdby_idx on origo_person p  
(cost=0.00..10697.70 rows=200827 width=21) (actual time=0.045..0.057 rows=5 
loops=1)
         ->  Index Scan using origo_person_onp_user_id_idx on origo_person pcb  
(cost=0.00..19824.70 rows=200827 width=4) (actual time=0.032..0.044 rows=5 
loops=1)
 Total runtime: 0.264 ms
(6 rows)


3. Slow:
EXPLAIN ANALYZE SELECT p.firstname, p.lastname FROM origo_person p, 
origo_person pcb WHERE pcb.onp_user_id = p.created_by order by p.firstname 
ASC limit 5;
                                                                                
   
QUERY PLAN
-----------------------------------------------------------------------------
 Limit  (cost=31526.56..31526.56 rows=1 width=17) (actual 
time=2573.993..2574.015 rows=5 loops=1)
   ->  Sort  (cost=31526.56..31526.56 rows=1 width=17) (actual 
time=2573.987..2573.994 rows=5 loops=1)
         Sort Key: p.firstname
         Sort Method:  top-N heapsort  Memory: 17kB
         ->  Merge Join  (cost=0.00..31526.55 rows=1 width=17) (actual 
time=0.098..2047.726 rows=200827 loops=1)
               Merge Cond: (p.created_by = pcb.onp_user_id)
               ->  Index Scan using origo_person_createdby_idx on origo_person 
p  (cost=0.00..10697.70 rows=200827 width=21) (actual time=0.052..428.445 
rows=200827 loops=1)
               ->  Index Scan using origo_person_onp_user_id_idx on 
origo_person pcb  (cost=0.00..19824.70 rows=200827 width=4) (actual 
time=0.031..424.250 rows=200828 loops=1)
 Total runtime: 2574.113 ms
(9 rows)

Can anybody point out to me why PG doesn't perform better on the last query?

-- 
Andreas Joseph Krogh <[EMAIL PROTECTED]>
Senior Software Developer / Manager
------------------------+---------------------------------------------+
OfficeNet AS            | The most difficult thing in the world is to |
Karenslyst Allé 11      | know how to do a thing and to watch         |
PO. Box 529 Skøyen      | somebody else doing it wrong, without       |
0214 Oslo               | comment.                                    |
NORWAY                  |                                             |
Tlf:    +47 24 15 38 90 |                                             |
Fax:    +47 24 15 38 91 |                                             |
Mobile: +47 909  56 963 |                                             |
------------------------+---------------------------------------------+

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Reply via email to