Re: [sqlite] Join performance in SQLite

2009-06-01 Thread BardzoTajneKonto

> Do other SQL database engines not have this same limitation?" Are MySQL 
> and PostgreSQL and Firebird and MS-SQL and Oracle creating phantom 
> indices on-the-fly to help them do joins faster, for example?" Or do 
> their optimizers do a better job of finding ways to use indices in a 
> join?" Can somebody supply me with specific examples of joins that other 
> database engines do efficiently but that SQLite does slowly?"
 
Acoording to SQLite wiki other databases do better job without indices:
"
Test 6: INNER JOIN without an index
SELECT t1.a FROM t1 INNER JOIN t2 ON t1.b=t2.b; 
SQLite 3.3.3 (sync):14.473 
SQLite 3.3.3 (nosync):  14.445 
SQLite 2.8.17 (sync):   47.776 
SQLite 2.8.17 (nosync): 47.750 
PostgreSQL 8.1.2:   0.176 
MySQL 5.0.18 (sync):3.421 
MySQL 5.0.18 (nosync):  3.443 
FirebirdSQL 1.5.2:  0.141 
"

> Is join 
> efficiency really a frustration to many SQLite users?

Generally not, however the behaviour could be more user friendly. The way is I 
use SQLite is probably not common becase I don't write queries - apllication's 
users write them. I also deal with quite large data. The biggest problem with 
the way joins work is with subqueries. If flattening cannot be done the query 
runs slow. For example I was told by an user that joins on views are really 
slow (on large data it means that doesn't work at all).
The are other minor problems:
1. Creating indices on every (possibly very large) table makes database file 
much bigger that it would be if SQLite used temporary indices created before 
query is run.
2. Database users need to know how exaclty how SQLite work. That is not problem 
if programmer write queries, but can be a problem if database is used by a 
mathematician who doesn't really care about it and simply wants to do some 
calculations.

--
Chcesz miec nawigacje GPS ?
Zamow lub przedluz umowe na neostrade, a nawigacja bedzie Twoja.
Kliknij na link po szczegoly! http://link.interia.pl/f219a


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join performance in SQLite

2009-05-31 Thread Thomas Briggs
   As others have already mentioned, hash joins can help in a
situation where there are no appropriate indexes.  They can make
things worse if the inputs aren't large enough though, so there's
still some gray area.

   The biggest thing that other databases have going for them - MSSQL
and Oracle at least - is parallelism.  If you've got 8 or 16 or 32
threads available to you, and plenty of RAM to boot, it's often faster
to ignore the indexes and either hash join or nested loop join subsets
of the affected tables.  Thus situations where there are no indexes
seem better too, and SQLite can look bad in comparison.  'tis the
price paid for being a zero-config embedded database vs. a full-blown
client/server database system, that's all.

   -T

On Sat, May 30, 2009 at 11:11 AM, D. Richard Hipp  wrote:
> There has been a recent flurry of comments about SQLite at
>
>     http://www.reddit.com/r/programming/comments/8oed5/how_sqlite_is_tested/
>     http://news.ycombinator.com/item?id=633151
>
> One of the criticisms of SQLite is that it is slow to do joins.  That
> is true if SQLite is unable to figure out how to use an index to speed
> the join.  I was under the impression that SQLite actually did a
> fairly reasonable job of making use of indices, if they exist.  But
> without indices, an k-way join takes time proportional to N^k.
>
> Do other SQL database engines not have this same limitation?  Are
> MySQL and PostgreSQL and Firebird and MS-SQL and Oracle creating
> phantom indices on-the-fly to help them do joins faster, for example?
> Or do their optimizers do a better job of finding ways to use indices
> in a join?  Can somebody supply me with specific examples of joins
> that other database engines do efficiently but that SQLite does
> slowly?  Is join efficiency really a frustration to many SQLite users?
>
> Curiously, in some of our own internal tests, SQLite is much, much
> faster than MS-SQL, MySQL, and PostgreSQL for k-way joins where k is
> large - greater than 20 or 30.  (SQLite can handle up to a 64-way
> join.)  This is because SQLite uses a O(k*k) greedy algorithm for
> selecting the ordering of tables in the join whereas the other guys
> all do a much more extensive search.  So the performance loss in the
> other engines is due to the excessive time spent in the query planner,
> not the time actually running the query.  SQLite can plan a 64-way
> join in the blink of an eye, whereas PostgreSQL requires several
> minutes.
>
> But for the tests described in the previous paragraph, there were
> always good indices so that the time to actually run the join was
> approximately linear.  What about situations where you have a 4- or 5-
> way join on tables that are not indexed?  Do other database engines
> handle those more efficiently than SQLite somehow?  Is this something
> we need to look into?
>
> D. Richard Hipp
> d...@hwaci.com
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join performance in SQLite

2009-05-31 Thread Florian Weimer
* D. Richard Hipp:

> One of the criticisms of SQLite is that it is slow to do joins.  That  
> is true if SQLite is unable to figure out how to use an index to speed  
> the join.  I was under the impression that SQLite actually did a  
> fairly reasonable job of making use of indices, if they exist.  But  
> without indices, an k-way join takes time proportional to N^k.
>
> Do other SQL database engines not have this same limitation?  Are  
> MySQL and PostgreSQL and Firebird and MS-SQL and Oracle creating  
> phantom indices on-the-fly to help them do joins faster, for example?   

PostgreSQL roughly does one of the following (when dealing with a
two-way join):

  * If one side of the join is estimated to be a small set, PostgreSQL
performs a sequential scan on it, hashes it, and joins the other
table in a hash join.

  * If both sides are large, each side is sorted, and a merge join is
performed.

Things go horribly wrong if the estimates are off and the wrong plan
is picked.

There's also a nested loop join (which would be what SQLite does), but
I haven't seen it in recent version.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join performance in SQLite

2009-05-30 Thread Nicolas Williams
On Sat, May 30, 2009 at 07:01:31PM +0100, Simon Slavin wrote:
> I'm interested in how sqlite works differently to the SQL systems  
> which keep a daemon running as a background task.  One of the  
> advantages of having a daemon which persists between runs of an  
> application is that the daemon can keep its own list of ORDERs, and  
> JOINs which are asked for frequently, and decide to maintain them even  
> when no SQL-using application is running.  [...]

You don't need a daemon to do that.  One could use a special table in
the database itself (much like the master table) to keep statistics
about all sorts of things.

Nico
-- 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join performance in SQLite

2009-05-30 Thread Jim Wilcoxson
SQLite has surprised me with its quick performance, not the other way
around.  In fact, I've implemented all kinds of lookup queries that I
knew could be optimized by caching results so I didn't have to keep
repeating the SQL query, but the performance was so good even
repeating the queries that I never bothered with the caching.

I'm sure there are queries that SQLite doesn't run as fast as database
product X, and I'm sure it goes the other way too.  It's a balancing
act, and as the primary developer, you have to choose for us what's
important to optimize and what isn't.

So far, I'm very happy with the choices and trade-offs that have been
made in SQLite. :-)

Jim

On 5/30/09, Simon Slavin  wrote:
> I'm interested in how sqlite works differently to the SQL systems
> which keep a daemon running as a background task.  One of the
> advantages of having a daemon which persists between runs of an
> application is that the daemon can keep its own list of ORDERs, and
> JOINs which are asked for frequently, and decide to maintain them even
> when no SQL-using application is running.  This can give the
> impression that something is being done very quickly, when in fact the
> majority of the time was taken during a previous run of the
> application.  It can be particularly hard to figure out what a
> performance test means under these circumstances.
>
> But the problem is that I like the way sqlite works.  I like the tiny
> library, I like the way that the SQL library is entirely inside my
> application, and any CPU load is mine.  I like knowing that when my
> app quits, nothing is going on.
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>


-- 
Software first.  Software lasts!
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join performance in SQLite

2009-05-30 Thread Simon Slavin
I'm interested in how sqlite works differently to the SQL systems  
which keep a daemon running as a background task.  One of the  
advantages of having a daemon which persists between runs of an  
application is that the daemon can keep its own list of ORDERs, and  
JOINs which are asked for frequently, and decide to maintain them even  
when no SQL-using application is running.  This can give the  
impression that something is being done very quickly, when in fact the  
majority of the time was taken during a previous run of the  
application.  It can be particularly hard to figure out what a  
performance test means under these circumstances.

But the problem is that I like the way sqlite works.  I like the tiny  
library, I like the way that the SQL library is entirely inside my  
application, and any CPU load is mine.  I like knowing that when my  
app quits, nothing is going on.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join performance in SQLite

2009-05-30 Thread John Elrick
D. Richard Hipp wrote:
> There has been a recent flurry of comments about SQLite at
>
>  http://www.reddit.com/r/programming/comments/8oed5/how_sqlite_is_tested/
>  http://news.ycombinator.com/item?id=633151
>
> One of the criticisms of SQLite is that it is slow to do joins.  That  
> is true if SQLite is unable to figure out how to use an index to speed  
> the join.  I was under the impression that SQLite actually did a  
> fairly reasonable job of making use of indices, if they exist.  But  
> without indices, an k-way join takes time proportional to N^k.
>   

We're finishing a system which produces auto generated queries where k 
can potentially range from 1 to 16.  The only times I have seen Sqlite 
slowdown were when the indexes needed tweaking.

I noticed one of the comments made was that a three inner join query 
took 10 minutes and that the joins should have filtered the table, 
however, I did not notice any example, contrived or obfuscated, which 
would demonstrate the issue.  This absence means we must rely upon the 
author's interpretation of what the query did being accurate.  The 
author makes the assertion that each progressive inner join should make 
the search table smaller, however, just because that was the intent does 
not make it the reality.  The fact that temp tables with no apparent 
indexing ran faster makes me question whether or not the initial 
assumption was true.

I've been wrong far too often about the actual vs theoretical of my code 
in operation to accept anything such as this at face value; however, 
that caveat would apply from both directions.  The author said he 
reported it as a bug, which implies he presented a repeatable test 
case.  If that is so, that specific test case might merit further 
examination.

FWIW


John
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join performance in SQLite

2009-05-30 Thread Mark Hamburg
Assuming memory is sufficiently inexpensive, I would think that it  
would almost always be useful to build an index for any field in a  
join rather than doing a full scan. (Or better yet, build a hash table  
if memory is sufficient.) Indices maintained in the database then  
become optimizations to avoid starting the query with an index build.

Mark

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Join performance in SQLite

2009-05-30 Thread Pavel Ivanov
> Do other SQL database engines not have this same limitation?  Are
> MySQL and PostgreSQL and Firebird and MS-SQL and Oracle creating
> phantom indices on-the-fly to help them do joins faster, for example?

Sort of. There's 2 types of join methods in Oracle for this - Hash
joins and Sort merge joins - when server creates in memory (or in
temporary storage in general) sorted data for one or both merging data
sets and then merges these sets. You can read about it here:
http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/optimops.htm#i51523.
Not sure though if it's worth to implement such technique in SQLite.

Pavel

On Sat, May 30, 2009 at 11:11 AM, D. Richard Hipp  wrote:
> There has been a recent flurry of comments about SQLite at
>
>     http://www.reddit.com/r/programming/comments/8oed5/how_sqlite_is_tested/
>     http://news.ycombinator.com/item?id=633151
>
> One of the criticisms of SQLite is that it is slow to do joins.  That
> is true if SQLite is unable to figure out how to use an index to speed
> the join.  I was under the impression that SQLite actually did a
> fairly reasonable job of making use of indices, if they exist.  But
> without indices, an k-way join takes time proportional to N^k.
>
> Do other SQL database engines not have this same limitation?  Are
> MySQL and PostgreSQL and Firebird and MS-SQL and Oracle creating
> phantom indices on-the-fly to help them do joins faster, for example?
> Or do their optimizers do a better job of finding ways to use indices
> in a join?  Can somebody supply me with specific examples of joins
> that other database engines do efficiently but that SQLite does
> slowly?  Is join efficiency really a frustration to many SQLite users?
>
> Curiously, in some of our own internal tests, SQLite is much, much
> faster than MS-SQL, MySQL, and PostgreSQL for k-way joins where k is
> large - greater than 20 or 30.  (SQLite can handle up to a 64-way
> join.)  This is because SQLite uses a O(k*k) greedy algorithm for
> selecting the ordering of tables in the join whereas the other guys
> all do a much more extensive search.  So the performance loss in the
> other engines is due to the excessive time spent in the query planner,
> not the time actually running the query.  SQLite can plan a 64-way
> join in the blink of an eye, whereas PostgreSQL requires several
> minutes.
>
> But for the tests described in the previous paragraph, there were
> always good indices so that the time to actually run the join was
> approximately linear.  What about situations where you have a 4- or 5-
> way join on tables that are not indexed?  Do other database engines
> handle those more efficiently than SQLite somehow?  Is this something
> we need to look into?
>
> D. Richard Hipp
> d...@hwaci.com
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Join performance in SQLite

2009-05-30 Thread D. Richard Hipp
There has been a recent flurry of comments about SQLite at

 http://www.reddit.com/r/programming/comments/8oed5/how_sqlite_is_tested/
 http://news.ycombinator.com/item?id=633151

One of the criticisms of SQLite is that it is slow to do joins.  That  
is true if SQLite is unable to figure out how to use an index to speed  
the join.  I was under the impression that SQLite actually did a  
fairly reasonable job of making use of indices, if they exist.  But  
without indices, an k-way join takes time proportional to N^k.

Do other SQL database engines not have this same limitation?  Are  
MySQL and PostgreSQL and Firebird and MS-SQL and Oracle creating  
phantom indices on-the-fly to help them do joins faster, for example?   
Or do their optimizers do a better job of finding ways to use indices  
in a join?  Can somebody supply me with specific examples of joins  
that other database engines do efficiently but that SQLite does  
slowly?  Is join efficiency really a frustration to many SQLite users?

Curiously, in some of our own internal tests, SQLite is much, much  
faster than MS-SQL, MySQL, and PostgreSQL for k-way joins where k is  
large - greater than 20 or 30.  (SQLite can handle up to a 64-way  
join.)  This is because SQLite uses a O(k*k) greedy algorithm for  
selecting the ordering of tables in the join whereas the other guys  
all do a much more extensive search.  So the performance loss in the  
other engines is due to the excessive time spent in the query planner,  
not the time actually running the query.  SQLite can plan a 64-way  
join in the blink of an eye, whereas PostgreSQL requires several  
minutes.

But for the tests described in the previous paragraph, there were  
always good indices so that the time to actually run the join was  
approximately linear.  What about situations where you have a 4- or 5- 
way join on tables that are not indexed?  Do other database engines  
handle those more efficiently than SQLite somehow?  Is this something  
we need to look into?

D. Richard Hipp
d...@hwaci.com



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users