On Montag, 27. August 2007 Paul J Stevens wrote:
> Then why do I keep hearing noises on this list that JOINs are faster
> than sub-SELECTS ??
1) It still has to be done in the correct search order to be fast
2) I've given lots of temporary mem space to PostgreSQL, that should
help with temp tables
3) My numbers are for PostgreSQL 8.1.9 only, on my hardware, with my
database as it is. YMMV, that's why it would be nice to see other
stats. I currently have
count(*) from dbmail_headervalue: 4177750
count(*) from dbmail_physmessage: 116015
4) It seems that only the "sort" that's done with the LEFT JOIN that
kills performance. There would be an index in the correct order
(dbmail_headervalue_physmessage_id_fkey), but I do not know why it's
not used.
5) I've experimented now, and changed one parameter in postgresql.conf:
default_statistics_target = 1000 # was 10 before
Afterwards I made VACUUM ANALYZE, now look at the difference:
EXPLAIN ANALYZE SELECT p.id from dbmail_physmessage p where p.id NOT IN
(select physmessage_id from dbmail_headervalue);
QUERY
PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on dbmail_physmessage p (cost=128546.48..130988.81 rows=58014
width=8) (actual time=29485.098..29703.340 rows=15 loops=1)
Filter: (NOT (hashed subplan))
SubPlan
-> Seq Scan on dbmail_headervalue (cost=0.00..118101.38
rows=4178038 width=8) (actual time=0.051..15904.147 rows=4178057
loops=1)
Total runtime: 29717.718 ms
(5 Zeilen)
explain ANALYZE SELECT p.id FROM dbmail_physmessage p LEFT JOIN
dbmail_headervalue h ON p.id = h.physmessage_id WHERE h.physmessage_id
IS NULL;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Merge Left Join (cost=0.00..262999.04 rows=116027 width=8) (actual
time=4305.679..37126.298 rows=15 loops=1)
Merge Cond: ("outer".id = "inner".physmessage_id)
Filter: ("inner".physmessage_id IS NULL)
-> Index Scan using dbmail_physmessage_pkey on dbmail_physmessage p
(cost=0.00..4053.94 rows=116027 width=8) (actual time=0.013..467.835
rows=116029 loops=1)
-> Index Scan using dbmail_headervalue_1 on dbmail_headervalue h
(cost=0.00..206429.55 rows=4178038 width=8) (actual
time=8.833..24189.424 rows=4178082 loops=1)
Total runtime: 37126.564 ms
Now I'd say there's nothing to be changed with the query, but a higher
default_statistics_target with vacuum analyze help ;-)
mfg zmi
--
// Michael Monnerie, Ing.BSc.
----------------------------------
Sorcerers have their magic wands:
powerful, potentially dangerous tools with a life of their own.
Witches have their familiars:
creatures disguised as household beasts that could,
if they choose, wreak the witches' havoc.
Mystics have their golems:
beings built of wood and tin brought to life to do their
masters' bidding.
I have Linux.
----------------------------------
--
// Michael Monnerie, Ing.BSc ----- http://it-management.at
// Tel: 0676/846 914 666 .network.your.ideas.
// PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: EA39 8918 EDFF 0A68 ACFB 11B7 BA2D 060F 1C6F E6B0
// Keyserver: www.keyserver.net Key-ID: 1C6FE6B0
_______________________________________________
DBmail mailing list
[email protected]
https://mailman.fastxs.nl/mailman/listinfo/dbmail