The chapter on indexes in the manual ( http://www.postgresql.org/docs/8.2/static/indexes.html )should give you a pretty good idea on the why. IN and EXISTS are not the only possibilities, you can also use inner or outer joins. Which solution performs best depends on the data, the database version, the available indexes, ...
>>> Julien <[EMAIL PROTECTED]> 2008-03-13 15:47 >>> Hello, Does anyone has an idea why sometimes: - select ... where ... in (select ...) is faster than : - select ... where ... exists(select ...) and sometimes it's the opposite ? I had such a situation, I've pasted the queries on: http://rafb.net/p/KXNZ6892.html and http://rafb.net/p/jvo5DO38.html It's running PostgreSQL 8.1 with an effective_cache_size of 30000. specimens.id is the primary key and there are indexes on sequences(specimen_id) and specimen_measurements(specimen_id) Is there a general "rule" to know when to use the in() version and when to use the exists() version ? Is it true to say that the exists() version is more scalable (with many rows) than the in() version (from the little tests I made it seems the case) ? Thanks, Julien -- Julien Cigar Belgian Biodiversity Platform http://www.biodiversity.be ( http://www.biodiversity.be/ ) Université Libre de Bruxelles (ULB) Campus de la Plaine CP 257 Bâtiment NO, Bureau 4 N4 115C (Niveau 4) Boulevard du Triomphe, entrée ULB 2 B-1050 Bruxelles Mail: [EMAIL PROTECTED] @biobel: http://biobel.biodiversity.be/person/show/471 Tel : 02 650 57 52 -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql