I've just done some benchmarking on MySQL 5 for queries where you need
to select back many individual records, e.g. you could use

SELECT ... WHERE id=? (and iterate for all possible id values)

or

SELECT ... WHERE id=1 OR id=2 OR id=3 OR ...

or

SELECT ... WHERE id IN (1, 2, 3, ...)

and there are several other ways to do it.

What I found was that using OR, at a certain point (when selecting >
about 1000 keys) the performance went exponentially sour - like, to
quote one benchmark, the OR query took 6400 secs vs 30s typical for
other methods.  I've written the results up in more detail at
http://notes.jschutz.net/19/perl/mysql-many-row-select-performance -
have a look at the graph if nothing else.  Using 'IN' is the best
solution for any sort of large key set.

I figure this is relevant to DBIx::Class since it uses SQL::Abstract
which will translate a search spec such as A => [ 1 .. 1000 ] into (A=?
OR A=? ... ) 1000 times, which could be a real gotcha (indeed, some of
my own code, which I shall now be reviewing, can expect arrays of tens
of thousands of keys...).

I was wondering - has anyone else seen similar behaviour?


-- 

Jon Schutz                        My tech notes http://notes.jschutz.net
Chief Technology Officer                        http://www.youramigo.com
YourAmigo         


_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/[email protected]

Reply via email to