Re: [PERFORM] Subquery WHERE IN or WHERE EXISTS faster?

Ulrich Mon, 30 Jun 2008 00:29:41 -0700

I think it will be fast, because the "IN set", which is the result of"SELECT processorid FROM users_processors WHERE userid=4040", is limitedto a maximum of ~500 processors which is not very big. IncreasingPostgres' RAM would be difficult for me, because I am only running avery small server with 256MB RAM and the webserver also likes to usesome RAM.

Does Postgre cache the HASH-Table for later use? For example when theuser reloads the website.


Kind regards
Ulrich

Rusty Conover wrote:

This is what I've found with tables ranging in the millions of rows.
Using IN is better when you've got lots of rows to check against theIN set and the IN set may be large and possibly complicated toretrieve (i.e. lots of joins, or expensive functions).
Postgres will normally build a hash table of the IN set and justsearch that hash table. It's especially fast if the entire hash tablethat is built can fit into RAM. The cpu/io cost of building the INset can be quite large because it needs to fetch every tuple to hashit, but this can be faster then searching tuple by tuple throughpossibly many indexes and tables like EXISTS does. I like to increasework_mem a lot (512mb and up) if I know I'm going to be doing a lot ofmatches against a large IN set of rows because I'd prefer for thathash table to never to be written to disk.
EXISTS is better when you're doing fewer matches because it will pullthe rows out one at a time from its query possibly using indexes, itsmain advantage is that it doesn't pull all of the tuples before itstarts processing matches.
So in summary both are good to know how to use, but choosing which oneto use can really depend on your data set and resources.
Cheers,

Rusty
--
Rusty Conover
InfoGears Inc.
http://www.infogears.com



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Subquery WHERE IN or WHERE EXISTS faster?

Reply via email to