It looks like your IN clause is forcing the inner SELECT to execute once 
PER ROW of your main table. It is asking the engine to verify that 
_each and every_ id value in main meets the condition in the inner 
select. So, for each and every row in main, it has to re-compute the 
inner query and scan the results for a match.
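You can see this for yourself with EXPLAIN. On the 4.x servers, the optimizer rewrites an IN subquery into a correlated EXISTS, so the subquery's select_type will typically show up as DEPENDENT SUBQUERY - a sketch, using your production table names:

EXPLAIN
SELECT *
FROM main
WHERE id IN (SELECT main_ID
        FROM receipt
        GROUP BY main_ID
        HAVING COUNT(*) > 5);

If you see DEPENDENT SUBQUERY in that output, the inner query is being evaluated once per row of main, which matches the runtimes you reported.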

I would change it to a JOIN against a derived table (an anonymous view) and 
test again -

SELECT m.*
FROM main m
INNER JOIN (SELECT main_ID
        FROM receipt
        GROUP BY main_ID
        HAVING COUNT(1) > 5) AS r
        ON m.id = r.main_ID;

- or break it into two statements, staging the results in a temporary 
table, for some real speed -

CREATE TEMPORARY TABLE tmpR
SELECT main_ID
FROM receipt
GROUP BY main_ID
HAVING COUNT(1) > 5;

ALTER TABLE tmpR ADD KEY (main_ID);

SELECT m.*
FROM main m
INNER JOIN tmpR r
on m.ID = r.main_ID;

DROP TABLE tmpR;

Either method should avoid re-executing the subselect for every row 
in your primary table. Adding the index to the temporary table will make 
the last SELECT really fly.
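If you prefer, the CREATE and the ALTER can be collapsed into one statement 
by declaring the key up front - a sketch, assuming main_ID is an INT (adjust 
the type to match your receipt table):

CREATE TEMPORARY TABLE tmpR (main_ID INT, KEY (main_ID))
SELECT main_ID
FROM receipt
GROUP BY main_ID
HAVING COUNT(1) > 5;

That way the index is populated as the rows land, and you skip the separate 
ALTER TABLE pass.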

Yours,
Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

"Leo Siefert" <[EMAIL PROTECTED]> wrote on 07/19/2004 11:22:39 
AM:

> OS: Win2k Server
> MySQL: 4.1.1 alpha / 4.1.3 beta
>    Table type: INNO DB
> 
> In my production environment, running the query:
> 
> select * from main where id in
> (select main_id from receipt group by main_id having COUNT(*) > 5)
> 
> will hang the server - sometimes for over a day, though it seems it will
> eventually complete working on it if given enough time. Currently main
> contains ~200,000 records and receipt contains ~16,000. Main records
> with any receipts have an average of ~10 receipts, but most have none. 
> 
> Created a smaller test database:
> 
> master
>    id   int(6)  primary  autoincrement
>    name  varchar(25)  (filled with random 10 char strings)
> 
> detail
>    id   int(6)  primary  autoincrement
>    master_id   int(6)  index  (filled with random ints <=
> max(master.id))
>    detail  varchar(25)  (filled with random 10 char strings)
> 
> temp
>    id int(6) index
> 
> Fill master with 1,000 records, detail with 10,000.
> Clone and fill master with 10,000 records, detail with 100,000.
> 
> Query:
> 
> select * from master where master.id in
> (select master_id from detail group by master_id having COUNT(*) > 2)
> 
> (small) returns 76 rows in 13 seconds.
> (large) returns 496 rows in 566 seconds. (COUNT(*) > 15)
> 
> Tried a two part query, sending the intermediate results to a temporary
> table:
> 
> create temporary table t select master_id from detail group by
> master_id having COUNT(*) > 2;
> select * from master where master.id in (select master_id from t);
> drop table t;
> 
> (small) returns 76 rows in 2.8 seconds.
> (large) returns 496 rows in 17 seconds.
> 
> Running the intermediate results into a permanent table:
> 
> truncate table temp;
> insert into temp select master_id from detail group by master_id having
> COUNT(*) > 2;
> select * from master where master.id in (select id from temp);
> 
> (small) returns 76 rows in 0.16 seconds.
> (large) returns 496 rows in 0.17 seconds.
> 
> Have tried playing around with some of the system variables:
> query_cache_size, innodb_buffer_pool_size with no real effect.
> 
> In our production environment (record size is much larger, similar
> number of records to the large test set), both the nested query and the
> two-part query using a temporary table hang for indeterminate (> 6 hrs)
> amounts of time, leaving the use of a permanent table as the only
> option. Of course, the only real way to manage this is to create a
> dedicated scratch table for each user of the system, a somewhat onerous
> workaround.
> 
> Anyone have an idea on a solution to this? Is there something in
> setting up for INNO DB that I am missing, or should I file this as a
> bug?
> 
> Thanks.
> 
>  - Leo
> 
> -- 
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]
> 
