On Thu, Jul 1, 2010 at 11:55 AM, Vincent de Lau <[email protected]> wrote:
> Hi Andrew,
>
> Just jumping in, here is another approach:
>
> SELECT P.product_id, P.product_name, P.price
> FROM
>      products AS P
>        INNER JOIN  product_tags AS PT
>                ON  P.product_id = PT.product_id
>        INNER JOIN  tags AS T
>                ON  T.tag_id = PT.tag_id AND T.tag IN ( 'foo', 'bar', 'baz', 
> 'floob', 'widget' )
> GROUP BY    P.product_id
> HAVING      COUNT(DISTINCT T.tag_id) = 5
>
> Theoretically, this should result in a smaller temporary table, since moving 
> the IN check to the join condition will limit the records stored in a 
> temporary table. Also, assuming that product_id is a primary key, you should 
> be save in grouping on just P.product_id.
>


I don't see any noticeable difference between the two in SQL Server.
I'm pretty sure that the optimizer knows that they are the same thing
and uses the same plan for both.


> Related to this, I recently had a curious issue. While rewriting a subquery 
> into a join, performance dropped dramatically. My hypothesis was that the 
> amount of columns MySQL had to store in the temporary table was so large, 
> that creating a reduced set of columns with a subquery was actually 
> beneficial.
>


I suppose it is possible. When faced with something like this, I
frequently write a few variations to a query and compare them against
approximately real data to see which approach performs best in a given
situation. Often the results are what I expected, but many times they
are not. That's why so many threads on a DBA board I follow include
answers of "it depends."

Andrew

Reply via email to