If you have two tables A hasMany B, and B belongsTo A:

$data = $this->A->findAll($where, null,  null, null, null, 1);

Cake will generate one query for the A records, and then a second
query for the B records.

That's fine.

Now, the second query is of the form:

SELECT `B`.`id`, `B`.`field_2` FROM `B` AS `B` WHERE `B`.`a_id` IN
(16, 17, 18, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142,
143, 144, 145, 146, 147, 154, 168, 170, 171, 172, 173, 174, 175, 176,
177, 178, 179, 180, 181, 182, 183, 184, 187, 188, 189, 190, 238, 281,
282, 311, 312, 313, 314, 338, 348)

The IN clause is a list of the A record id's from the first query.

Again, fine.

The problem is, for a large IN clause, even if one creates an index on
the B.a_id field, mySQL will not use the index, and will do a full
table scan comparing each record in B to the list of record id's in
the IN clause.

Very inefficient.

For small numbers of A records in the IN clause, B's index on B.a_id
will be used. But once you pass a certain nuber of id's in the IN
clause, (I had around 60) the index on B.a_id is no longer used, and a
full table scan is performed instead.

Here is a sample EXPLAIN performed on the cake query above:


id      select_type     table   type    possible_keys   key     key_len         
ref     rows
Extra
1       SIMPLE         B        ALL     a_id       NULL  NULL   NULL            
1
Using where

An index exists on B.a_id, yet is not used.

For small numbers of id's in the IN clause, the index is used.

The IN clause is very inefficient and can scarily result in full table
scans.
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Cake 
PHP" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/cake-php?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to