Richard, I tried a left join, which has to be a little weird, because there may or may not be a corresponding row in Assignment_Settings for each Assignment, and they may or may not have Setting='Status', so I came up with:
SELECT User_ID FROM Assignments A NATURAL LEFT JOIN (SELECT * FROM Assignment_Settings WHERE Setting='Status') ASet WHERE A.User_ID IS NOT NULL AND ASet.Assignment_ID IS NULL GROUP BY User_ID; Which explain analyze is saying takes 0.816 ms as compared to 0.163 ms for my other query. So, I'm not sure that I'm writing the best LEFT JOIN that I can. Also, I suspect that these ratios wouldn't hold as the data got bigger and started using indexes, etc. I'll mock up a couple of tables with a bunch of data and see how things go. It would be nice to understand WHY I get the results I get, which I'm not sure I will. I'm not sure what you mean by selecting a distinct User_ID first. Since I'm joining the tables on Assignment_ID, I'm not sure how I'd do a distinct before the join (because I'd lose Assignment_ID). I was also under the impression that group by was likely to be faster than a distinct, tho I can't really recall where I got that idea from. Thanks for your suggestions! Peter Darley -----Original Message----- From: Richard Huxton [mailto:[EMAIL PROTECTED] Sent: Wednesday, January 26, 2005 1:36 AM To: Peter Darley Cc: Pgsql-Performance Subject: Re: [PERFORM] Possibly slow query Peter Darley wrote: > Folks, > > I'm using PostgreSQL 7.4.1 on Linux, and I'm trying to figure out > weather a > query I have is going to be slow when I have more information in my tables. > both tables involved will likely have ~500K rows within a year or so. > > Specifically I can't tell if I'm causing myself future problems with the > subquery, and should maybe re-write the query to use a join. The reason I > went with the subquery is that I don't know weather a row in Assignments > will have a corresponding row in Assignment_Settings > > The query is: > SELECT User_ID > FROM Assignments A > WHERE A.User_ID IS NOT NULL > AND (SELECT Value FROM Assignment_Settings WHERE Setting='Status' AND > Assignment_ID=A.Assignment_ID) IS NULL > GROUP BY User_ID; You could always use a LEFT JOIN instead, like you say. I'd personally be tempted to select distinct user_id's then join, but it depends on how many of each. You're not going to know for sure whether you'll have problems without testing. Generate 500k rows of plausible looking test-data and give it a try. -- Richard Huxton Archonet Ltd ---------------------------(end of broadcast)--------------------------- TIP 7: don't forget to increase your free space map settings