On Thu, Jul 1, 2010 at 11:55 AM, Vincent de Lau <[email protected]> wrote: > Hi Andrew, > > Just jumping in, here is another approach: > > SELECT P.product_id, P.product_name, P.price > FROM > products AS P > INNER JOIN product_tags AS PT > ON P.product_id = PT.product_id > INNER JOIN tags AS T > ON T.tag_id = PT.tag_id AND T.tag IN ( 'foo', 'bar', 'baz', > 'floob', 'widget' ) > GROUP BY P.product_id > HAVING COUNT(DISTINCT T.tag_id) = 5 > > Theoretically, this should result in a smaller temporary table, since moving > the IN check to the join condition will limit the records stored in a > temporary table. Also, assuming that product_id is a primary key, you should > be save in grouping on just P.product_id. >
I don't see any noticeable difference between the two in SQL Server. I'm pretty sure that the optimizer knows that they are the same thing and uses the same plan for both. > Related to this, I recently had a curious issue. While rewriting a subquery > into a join, performance dropped dramatically. My hypothesis was that the > amount of columns MySQL had to store in the temporary table was so large, > that creating a reduced set of columns with a subquery was actually > beneficial. > I suppose it is possible. When faced with something like this, I frequently write a few variations to a query and compare them against approximately real data to see which approach performs best in a given situation. Often the results are what I expected, but many times they are not. That's why so many threads on a DBA board I follow include answers of "it depends." Andrew
