Re: [sqlite] Join performance in SQLite
> Do other SQL database engines not have this same limitation?" Are MySQL > and PostgreSQL and Firebird and MS-SQL and Oracle creating phantom > indices on-the-fly to help them do joins faster, for example?" Or do > their optimizers do a better job of finding ways to use indices in a > join?" Can somebody supply me with specific examples of joins that other > database engines do efficiently but that SQLite does slowly?" Acoording to SQLite wiki other databases do better job without indices: " Test 6: INNER JOIN without an index SELECT t1.a FROM t1 INNER JOIN t2 ON t1.b=t2.b; SQLite 3.3.3 (sync):14.473 SQLite 3.3.3 (nosync): 14.445 SQLite 2.8.17 (sync): 47.776 SQLite 2.8.17 (nosync): 47.750 PostgreSQL 8.1.2: 0.176 MySQL 5.0.18 (sync):3.421 MySQL 5.0.18 (nosync): 3.443 FirebirdSQL 1.5.2: 0.141 " > Is join > efficiency really a frustration to many SQLite users? Generally not, however the behaviour could be more user friendly. The way is I use SQLite is probably not common becase I don't write queries - apllication's users write them. I also deal with quite large data. The biggest problem with the way joins work is with subqueries. If flattening cannot be done the query runs slow. For example I was told by an user that joins on views are really slow (on large data it means that doesn't work at all). The are other minor problems: 1. Creating indices on every (possibly very large) table makes database file much bigger that it would be if SQLite used temporary indices created before query is run. 2. Database users need to know how exaclty how SQLite work. That is not problem if programmer write queries, but can be a problem if database is used by a mathematician who doesn't really care about it and simply wants to do some calculations. -- Chcesz miec nawigacje GPS ? Zamow lub przedluz umowe na neostrade, a nawigacja bedzie Twoja. Kliknij na link po szczegoly! http://link.interia.pl/f219a ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Join performance in SQLite
As others have already mentioned, hash joins can help in a situation where there are no appropriate indexes. They can make things worse if the inputs aren't large enough though, so there's still some gray area. The biggest thing that other databases have going for them - MSSQL and Oracle at least - is parallelism. If you've got 8 or 16 or 32 threads available to you, and plenty of RAM to boot, it's often faster to ignore the indexes and either hash join or nested loop join subsets of the affected tables. Thus situations where there are no indexes seem better too, and SQLite can look bad in comparison. 'tis the price paid for being a zero-config embedded database vs. a full-blown client/server database system, that's all. -T On Sat, May 30, 2009 at 11:11 AM, D. Richard Hippwrote: > There has been a recent flurry of comments about SQLite at > > http://www.reddit.com/r/programming/comments/8oed5/how_sqlite_is_tested/ > http://news.ycombinator.com/item?id=633151 > > One of the criticisms of SQLite is that it is slow to do joins. That > is true if SQLite is unable to figure out how to use an index to speed > the join. I was under the impression that SQLite actually did a > fairly reasonable job of making use of indices, if they exist. But > without indices, an k-way join takes time proportional to N^k. > > Do other SQL database engines not have this same limitation? Are > MySQL and PostgreSQL and Firebird and MS-SQL and Oracle creating > phantom indices on-the-fly to help them do joins faster, for example? > Or do their optimizers do a better job of finding ways to use indices > in a join? Can somebody supply me with specific examples of joins > that other database engines do efficiently but that SQLite does > slowly? Is join efficiency really a frustration to many SQLite users? > > Curiously, in some of our own internal tests, SQLite is much, much > faster than MS-SQL, MySQL, and PostgreSQL for k-way joins where k is > large - greater than 20 or 30. (SQLite can handle up to a 64-way > join.) This is because SQLite uses a O(k*k) greedy algorithm for > selecting the ordering of tables in the join whereas the other guys > all do a much more extensive search. So the performance loss in the > other engines is due to the excessive time spent in the query planner, > not the time actually running the query. SQLite can plan a 64-way > join in the blink of an eye, whereas PostgreSQL requires several > minutes. > > But for the tests described in the previous paragraph, there were > always good indices so that the time to actually run the join was > approximately linear. What about situations where you have a 4- or 5- > way join on tables that are not indexed? Do other database engines > handle those more efficiently than SQLite somehow? Is this something > we need to look into? > > D. Richard Hipp > d...@hwaci.com > > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Join performance in SQLite
* D. Richard Hipp: > One of the criticisms of SQLite is that it is slow to do joins. That > is true if SQLite is unable to figure out how to use an index to speed > the join. I was under the impression that SQLite actually did a > fairly reasonable job of making use of indices, if they exist. But > without indices, an k-way join takes time proportional to N^k. > > Do other SQL database engines not have this same limitation? Are > MySQL and PostgreSQL and Firebird and MS-SQL and Oracle creating > phantom indices on-the-fly to help them do joins faster, for example? PostgreSQL roughly does one of the following (when dealing with a two-way join): * If one side of the join is estimated to be a small set, PostgreSQL performs a sequential scan on it, hashes it, and joins the other table in a hash join. * If both sides are large, each side is sorted, and a merge join is performed. Things go horribly wrong if the estimates are off and the wrong plan is picked. There's also a nested loop join (which would be what SQLite does), but I haven't seen it in recent version. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Join performance in SQLite
On Sat, May 30, 2009 at 07:01:31PM +0100, Simon Slavin wrote: > I'm interested in how sqlite works differently to the SQL systems > which keep a daemon running as a background task. One of the > advantages of having a daemon which persists between runs of an > application is that the daemon can keep its own list of ORDERs, and > JOINs which are asked for frequently, and decide to maintain them even > when no SQL-using application is running. [...] You don't need a daemon to do that. One could use a special table in the database itself (much like the master table) to keep statistics about all sorts of things. Nico -- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Join performance in SQLite
SQLite has surprised me with its quick performance, not the other way around. In fact, I've implemented all kinds of lookup queries that I knew could be optimized by caching results so I didn't have to keep repeating the SQL query, but the performance was so good even repeating the queries that I never bothered with the caching. I'm sure there are queries that SQLite doesn't run as fast as database product X, and I'm sure it goes the other way too. It's a balancing act, and as the primary developer, you have to choose for us what's important to optimize and what isn't. So far, I'm very happy with the choices and trade-offs that have been made in SQLite. :-) Jim On 5/30/09, Simon Slavinwrote: > I'm interested in how sqlite works differently to the SQL systems > which keep a daemon running as a background task. One of the > advantages of having a daemon which persists between runs of an > application is that the daemon can keep its own list of ORDERs, and > JOINs which are asked for frequently, and decide to maintain them even > when no SQL-using application is running. This can give the > impression that something is being done very quickly, when in fact the > majority of the time was taken during a previous run of the > application. It can be particularly hard to figure out what a > performance test means under these circumstances. > > But the problem is that I like the way sqlite works. I like the tiny > library, I like the way that the SQL library is entirely inside my > application, and any CPU load is mine. I like knowing that when my > app quits, nothing is going on. > > Simon. > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > -- Software first. Software lasts! ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Join performance in SQLite
I'm interested in how sqlite works differently to the SQL systems which keep a daemon running as a background task. One of the advantages of having a daemon which persists between runs of an application is that the daemon can keep its own list of ORDERs, and JOINs which are asked for frequently, and decide to maintain them even when no SQL-using application is running. This can give the impression that something is being done very quickly, when in fact the majority of the time was taken during a previous run of the application. It can be particularly hard to figure out what a performance test means under these circumstances. But the problem is that I like the way sqlite works. I like the tiny library, I like the way that the SQL library is entirely inside my application, and any CPU load is mine. I like knowing that when my app quits, nothing is going on. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Join performance in SQLite
D. Richard Hipp wrote: > There has been a recent flurry of comments about SQLite at > > http://www.reddit.com/r/programming/comments/8oed5/how_sqlite_is_tested/ > http://news.ycombinator.com/item?id=633151 > > One of the criticisms of SQLite is that it is slow to do joins. That > is true if SQLite is unable to figure out how to use an index to speed > the join. I was under the impression that SQLite actually did a > fairly reasonable job of making use of indices, if they exist. But > without indices, an k-way join takes time proportional to N^k. > We're finishing a system which produces auto generated queries where k can potentially range from 1 to 16. The only times I have seen Sqlite slowdown were when the indexes needed tweaking. I noticed one of the comments made was that a three inner join query took 10 minutes and that the joins should have filtered the table, however, I did not notice any example, contrived or obfuscated, which would demonstrate the issue. This absence means we must rely upon the author's interpretation of what the query did being accurate. The author makes the assertion that each progressive inner join should make the search table smaller, however, just because that was the intent does not make it the reality. The fact that temp tables with no apparent indexing ran faster makes me question whether or not the initial assumption was true. I've been wrong far too often about the actual vs theoretical of my code in operation to accept anything such as this at face value; however, that caveat would apply from both directions. The author said he reported it as a bug, which implies he presented a repeatable test case. If that is so, that specific test case might merit further examination. FWIW John ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Join performance in SQLite
Assuming memory is sufficiently inexpensive, I would think that it would almost always be useful to build an index for any field in a join rather than doing a full scan. (Or better yet, build a hash table if memory is sufficient.) Indices maintained in the database then become optimizations to avoid starting the query with an index build. Mark ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Join performance in SQLite
> Do other SQL database engines not have this same limitation? Are > MySQL and PostgreSQL and Firebird and MS-SQL and Oracle creating > phantom indices on-the-fly to help them do joins faster, for example? Sort of. There's 2 types of join methods in Oracle for this - Hash joins and Sort merge joins - when server creates in memory (or in temporary storage in general) sorted data for one or both merging data sets and then merges these sets. You can read about it here: http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/optimops.htm#i51523. Not sure though if it's worth to implement such technique in SQLite. Pavel On Sat, May 30, 2009 at 11:11 AM, D. Richard Hippwrote: > There has been a recent flurry of comments about SQLite at > > http://www.reddit.com/r/programming/comments/8oed5/how_sqlite_is_tested/ > http://news.ycombinator.com/item?id=633151 > > One of the criticisms of SQLite is that it is slow to do joins. That > is true if SQLite is unable to figure out how to use an index to speed > the join. I was under the impression that SQLite actually did a > fairly reasonable job of making use of indices, if they exist. But > without indices, an k-way join takes time proportional to N^k. > > Do other SQL database engines not have this same limitation? Are > MySQL and PostgreSQL and Firebird and MS-SQL and Oracle creating > phantom indices on-the-fly to help them do joins faster, for example? > Or do their optimizers do a better job of finding ways to use indices > in a join? Can somebody supply me with specific examples of joins > that other database engines do efficiently but that SQLite does > slowly? Is join efficiency really a frustration to many SQLite users? > > Curiously, in some of our own internal tests, SQLite is much, much > faster than MS-SQL, MySQL, and PostgreSQL for k-way joins where k is > large - greater than 20 or 30. (SQLite can handle up to a 64-way > join.) This is because SQLite uses a O(k*k) greedy algorithm for > selecting the ordering of tables in the join whereas the other guys > all do a much more extensive search. So the performance loss in the > other engines is due to the excessive time spent in the query planner, > not the time actually running the query. SQLite can plan a 64-way > join in the blink of an eye, whereas PostgreSQL requires several > minutes. > > But for the tests described in the previous paragraph, there were > always good indices so that the time to actually run the join was > approximately linear. What about situations where you have a 4- or 5- > way join on tables that are not indexed? Do other database engines > handle those more efficiently than SQLite somehow? Is this something > we need to look into? > > D. Richard Hipp > d...@hwaci.com > > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Join performance in SQLite
There has been a recent flurry of comments about SQLite at http://www.reddit.com/r/programming/comments/8oed5/how_sqlite_is_tested/ http://news.ycombinator.com/item?id=633151 One of the criticisms of SQLite is that it is slow to do joins. That is true if SQLite is unable to figure out how to use an index to speed the join. I was under the impression that SQLite actually did a fairly reasonable job of making use of indices, if they exist. But without indices, an k-way join takes time proportional to N^k. Do other SQL database engines not have this same limitation? Are MySQL and PostgreSQL and Firebird and MS-SQL and Oracle creating phantom indices on-the-fly to help them do joins faster, for example? Or do their optimizers do a better job of finding ways to use indices in a join? Can somebody supply me with specific examples of joins that other database engines do efficiently but that SQLite does slowly? Is join efficiency really a frustration to many SQLite users? Curiously, in some of our own internal tests, SQLite is much, much faster than MS-SQL, MySQL, and PostgreSQL for k-way joins where k is large - greater than 20 or 30. (SQLite can handle up to a 64-way join.) This is because SQLite uses a O(k*k) greedy algorithm for selecting the ordering of tables in the join whereas the other guys all do a much more extensive search. So the performance loss in the other engines is due to the excessive time spent in the query planner, not the time actually running the query. SQLite can plan a 64-way join in the blink of an eye, whereas PostgreSQL requires several minutes. But for the tests described in the previous paragraph, there were always good indices so that the time to actually run the join was approximately linear. What about situations where you have a 4- or 5- way join on tables that are not indexed? Do other database engines handle those more efficiently than SQLite somehow? Is this something we need to look into? D. Richard Hipp d...@hwaci.com ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users