As others have already mentioned, hash joins can help in a
situation where there are no appropriate indexes.  They can make
things worse if the inputs aren't large enough though, so there's
still some gray area.

   The biggest thing that other databases have going for them - MSSQL
and Oracle at least - is parallelism.  If you've got 8 or 16 or 32
threads available to you, and plenty of RAM to boot, it's often faster
to ignore the indexes and either hash join or nested loop join subsets
of the affected tables.  Thus situations where there are no indexes
seem better too, and SQLite can look bad in comparison.  'tis the
price paid for being a zero-config embedded database vs. a full-blown
client/server database system, that's all.

   -T

On Sat, May 30, 2009 at 11:11 AM, D. Richard Hipp <d...@hwaci.com> wrote:
> There has been a recent flurry of comments about SQLite at
>
>     http://www.reddit.com/r/programming/comments/8oed5/how_sqlite_is_tested/
>     http://news.ycombinator.com/item?id=633151
>
> One of the criticisms of SQLite is that it is slow to do joins.  That
> is true if SQLite is unable to figure out how to use an index to speed
> the join.  I was under the impression that SQLite actually did a
> fairly reasonable job of making use of indices, if they exist.  But
> without indices, an k-way join takes time proportional to N^k.
>
> Do other SQL database engines not have this same limitation?  Are
> MySQL and PostgreSQL and Firebird and MS-SQL and Oracle creating
> phantom indices on-the-fly to help them do joins faster, for example?
> Or do their optimizers do a better job of finding ways to use indices
> in a join?  Can somebody supply me with specific examples of joins
> that other database engines do efficiently but that SQLite does
> slowly?  Is join efficiency really a frustration to many SQLite users?
>
> Curiously, in some of our own internal tests, SQLite is much, much
> faster than MS-SQL, MySQL, and PostgreSQL for k-way joins where k is
> large - greater than 20 or 30.  (SQLite can handle up to a 64-way
> join.)  This is because SQLite uses a O(k*k) greedy algorithm for
> selecting the ordering of tables in the join whereas the other guys
> all do a much more extensive search.  So the performance loss in the
> other engines is due to the excessive time spent in the query planner,
> not the time actually running the query.  SQLite can plan a 64-way
> join in the blink of an eye, whereas PostgreSQL requires several
> minutes.
>
> But for the tests described in the previous paragraph, there were
> always good indices so that the time to actually run the join was
> approximately linear.  What about situations where you have a 4- or 5-
> way join on tables that are not indexed?  Do other database engines
> handle those more efficiently than SQLite somehow?  Is this something
> we need to look into?
>
> D. Richard Hipp
> d...@hwaci.com
>
>
>
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to