Folks,
I'm using PostgreSQL 7.4.1 on Linux, and I'm trying to figure out whether a query I have will become slow once I have more data in my tables. Both tables involved will likely have ~500K rows within a year or so.
Specifically, I can't tell whether the subquery will cause me problems in the future and whether I should rewrite the query to use a join. The reason I went with the subquery is that I don't know whether a row in Assignments will have a corresponding row in Assignment_Settings.
The query is:

SELECT User_ID
FROM Assignments A
WHERE A.User_ID IS NOT NULL
  AND (SELECT Value
       FROM Assignment_Settings
       WHERE Setting = 'Status'
         AND Assignment_ID = A.Assignment_ID) IS NULL
GROUP BY User_ID;
You could always use a LEFT JOIN instead, as you say. Personally I'd be tempted to select the distinct user_ids first and then join, but it depends on how many rows there are of each.
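For what it's worth, a LEFT JOIN version might look like this (untested, column names taken from your query):

```sql
-- Rows from Assignments with no matching 'Status' setting
-- (or a matching row whose Value is NULL -- both cases come
-- out as S.Value IS NULL here, same as your scalar subquery).
SELECT A.User_ID
FROM Assignments A
LEFT JOIN Assignment_Settings S
       ON S.Assignment_ID = A.Assignment_ID
      AND S.Setting = 'Status'
WHERE A.User_ID IS NOT NULL
  AND S.Value IS NULL
GROUP BY A.User_ID;
```

One difference to be aware of: if an assignment somehow had two 'Status' rows, your scalar subquery would raise an error ("more than one row returned"), whereas the join would just produce extra rows that the GROUP BY collapses.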
You're not going to know for sure whether you'll have problems without testing. Generate 500k rows of plausible-looking test data and give it a try.
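Something along these lines would do for test data (hedged: generate_series() is a built-in in newer releases; on 7.4 you'd need to create an equivalent set-returning function or generate the rows from a script, and the 5000-user/50% coverage figures are just made-up assumptions):

```sql
-- ~500k assignments spread over ~5000 users
INSERT INTO Assignments (Assignment_ID, User_ID)
SELECT i, (random() * 5000)::int
FROM generate_series(1, 500000) AS g(i);

-- give roughly half of them a 'Status' setting
INSERT INTO Assignment_Settings (Assignment_ID, Setting, Value)
SELECT i, 'Status', 'Complete'
FROM generate_series(1, 500000) AS g(i)
WHERE random() < 0.5;
```

Then run the query under EXPLAIN ANALYZE and see what the planner does. An index on Assignment_Settings (Assignment_ID, Setting) is probably worth having in either the subquery or the join version.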
--
  Richard Huxton
  Archonet Ltd
---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq