the example you gave would work with a range constraint:

  WHERE `bar_id` > 0 AND `bar_id` < 63

but i guess this is not a general solution.

i've done exactly this kind of select using an IN constraint very often.
i've not had any trouble with lists of a few hundred so long as i have the
necessary index. 

however, when the list gets really long i've divided it into multiple
selects in the app. i've nothing concrete to go on but a very long list of
constants makes me nervous. can you break something if query length exceeds
packet size?

one thought i had was to create another table `baz` just to store the list
of `bar_id` values. the app would: 1) truncate `baz`, 2) write the list
`bar_ids` into `baz`, 3) select from `foo` joining `baz`. perhaps not a high
performance solution but you could make it scale all the way up.


On 1/15/10 12:15 PM, "kabel" <li...@voidship.net> wrote:

> I'm facing a refactor of some slow code that engages the DB pretty heavily.
> We're looking for a way to load arbitrary sets of rows from a table using an
> indexed column of decently high cardinality, and I'm not sure if IN() is a
> good way to do it or not.
> 
> Basically, we're looking at this:
> 
> CREATE TABLE `foo` (
>   `foo_id` INT UNSIGNED NOT NULL AUTO_INCREMENT
> , `bar_id` INT UNSIGNED NOT NULL
> , `unimportant_1` VARCHAR(8)
> , `unimportant_2` VARCHAR(8)
> , `unimportant_3` VARCHAR(8)
> , PRIMARY KEY( `foo_id` )
> , KEY `idx_bar_id` ( `bar_id` )
> );
> 
> I need to load an arbitrary list of foos by their bar ID (1:N), and I'm not
> sure the best way to do it.  Joining against the `bar` table is,
> unfortunately, not really helpful since the set loaded is totally arbitrary.
> 
> As far as I can tell, the constant list would be ~100  elements at most, but
> take that with a grain of salt.
> 
> Right now, I'm considering:
> 
> SELECT *
> FROM `foo`
> WHERE `bar_id` IN( 1,2,3,4,5,6,7, ... ,61,62 );
> 
> I've done some testing and the performance doesn't seem to be too bad, but I'm
> worried about scalability.
> 
> The query execution isn't too bad.
> 
> |  1 | SIMPLE      | foo | range | idx_bar_id   | idx_bar_id | 5       | NULL
> |    7 | Using where |
> 
> Does anyone have experience with the performance of IN(), given a list of
> constants, on an indexed column?  Any pointers to more information on this
> would be greatly appreciated.
> 
> Thanks in advance!
> 
> kabel



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql?unsub=arch...@jab.org

Reply via email to