A covering index might also help, depending on what type of field b.field_2 is.
CREATE INDEX cover_field2 ON b (id,field_2); If field_2 is a short varchar then there is a greater chance that mysql will just read the index and grab the data from there, and never touch the table. If you need a bunch more fields then the covering technique loses its effectiveness. Sean On Dec 14, 7:09 am, grigri <[EMAIL PROTECTED]> wrote: > Sorry, I think you're interepreting the results wrong. > > The MySQL optimizer is a complicated piece of code, but it generally > works well. > > Basically, if your range query affects a large portion of the table > entries then it is MORE efficient to do a table scan than an index- > based range check, so it does. In fact, in older versions of MySQL > based this decision solely on the approximate percentage: > > ----------- > quote fromhttp://dev.mysql.com/doc/refman/5.0/en/where-optimizations.html > : > > "Each table index is queried, and the best index is used unless the > optimizer believes that it is more efficient to use a table scan. At > one time, a scan was used based on whether the best index spanned more > than 30% of the table, but a fixed percentage no longer determines the > choice between using an index or a scan. The optimizer now is more > complex and bases its estimate on additional factors such as table > size, number of rows, and I/O block size." > ------------ > > Try running the same query on a table with a larger row count. I ran a > indexed/ranged query against a table with over 4 million rows (there > were 100 entries in the IN() statement), and EXPLAIN SELECT confirmed > that the index was being used, and no full table scan was involved. > > On Dec 14, 11:46 am, keymaster <[EMAIL PROTECTED]> wrote: > > > If you have two tables A hasMany B, and B belongsTo A: > > > $data = $this->A->findAll($where, null, null, null, null, 1); > > > Cake will generate one query for the A records, and then a second > > query for the B records. > > > That's fine. > > > Now, the second query is of the form: > > > SELECT `B`.`id`, `B`.`field_2` FROM `B` AS `B` WHERE `B`.`a_id` IN > > (16, 17, 18, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, > > 143, 144, 145, 146, 147, 154, 168, 170, 171, 172, 173, 174, 175, 176, > > 177, 178, 179, 180, 181, 182, 183, 184, 187, 188, 189, 190, 238, 281, > > 282, 311, 312, 313, 314, 338, 348) > > > The IN clause is a list of the A record id's from the first query. > > > Again, fine. > > > The problem is, for a large IN clause, even if one creates an index on > > the B.a_id field, mySQL will not use the index, and will do a full > > table scan comparing each record in B to the list of record id's in > > the IN clause. > > > Very inefficient. > > > For small numbers of A records in the IN clause, B's index on B.a_id > > will be used. But once you pass a certain nuber of id's in the IN > > clause, (I had around 60) the index on B.a_id is no longer used, and a > > full table scan is performed instead. > > > Here is a sample EXPLAIN performed on the cake query above: > > > id select_type table type possible_keys key key_len > > ref rows > > Extra > > 1 SIMPLE B ALL a_id NULL NULL NULL > > 1 > > Using where > > > An index exists on B.a_id, yet is not used. > > > For small numbers of id's in the IN clause, the index is used. > > > The IN clause is very inefficient and can scarily result in full table > > scans. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Cake PHP" group. To post to this group, send email to [email protected] To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/cake-php?hl=en -~----------~----~----~----~------~----~------~--~---
