Re: [sqlite] select performance with join
Thanks for the explanation. I recall seeing posts suggesting the use of union instead of or, and thought "if it's that easy, why doesn't SQLite do it?" The optimizer documentation says: --- Suppose the OR clause consists of multiple subterms as follows: expr1 OR expr2 OR expr3 If every subterm of an OR clause is separately indexable and the transformation to an IN operator described above does not apply, then the OR clause is coded so that it logically works the same as the following: rowid IN (SELECT rowid FROM table WHERE expr1 UNION SELECT rowid FROM table WHERE expr2 UNION SELECT rowid FROM table WHERE expr3) The implemention of the OR clause does not really use subqueries. A more efficient internal mechanism is employed. The implementation also works even for tables where the "rowid" column name has been overloaded for other uses and no longer refers to the real rowid. But the essence of the implementation is captured by the statement above: Separate indices are used to find rowids that satisfy each subterm of the OR clause and then the union of those rowids is used to find all matching rows in the database. --- It sounds like it might use indexes for an OR after all. Jim On 5/8/09, Igor Tandetnikwrote: > "Jim Wilcoxson" wrote > in message > news:c5830b750905080813p6bf901bn526c81ca8ce7a...@mail.gmail.com >> I don't know if it makes any difference, but is that where clause the >> same as: >> >> WHERE name< 'zUmM' OR (name= 'zUmM' AND S.id< 8122) > > SQLite's optimizer cannot use an index for any condition involving OR. > That's why it's common to write an equivalent but somewhat unnatural > > name <= 'zUmM' AND (name< 'zUmM' OR S.id< 8122) > > This way, at least the first condition has a chance of being satisfied > with an index. > > Igor Tandetnik > > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > -- Software first. Software lasts! ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] select performance with join
wrote in message news:20090508113252.2uqkghcsj6og8...@webmail.korg.it > Citando Igor Tandetnik : > >> Andrea Galeazzi wrote: >>> but when I execute: >>> >>> SELECT S.id,title,artist,bpm,name >>> >>> FROM Song AS S >>> >>> LEFT JOIN Genre AS G ON (S.genre_id = G.id) >>> >>> WHERE name<= 'zUmM' AND (name< 'zUmM' OR S.id< 8122) >>> >>> ORDER BY name DESC, S.id DESC LIMIT 20; >> >> Note that LEFT JOIN is pointless here, since any record with >> G.name=NULL won't make it past the WHERE clause. Replace it with >> plain JOIN, you should see an improvement. >> > I replaced LEFT JOIN with JOIN but it got worse This means that the majority of all records in Song table satisfy the condition of name<= 'zUmM'. Basically, your query leaves SQLite no choice but to scan all records in Song. I don't see how it could be made to work any faster (without significant redesign of the schema - e.g. moving genre name into Song table). > But I think I need to use LEFT JOIN because I have also > to accept the records with S.genre_id = NULL. Well, if you need such records, then you need a different query. The one you show doesn't return these records, despite using LEFT JOIN. Don't take my word for it, test it on a small sample database. > I also tried this query: > "SELECT S.id,title,artist,bpm,name " > "FROM Song AS S, Genre AS G " > "WHERE (S.genre_id = G.id) AND (name<= 'zUmM') AND (name< 'zUmM' OR > S.id< 8122) " > "ORDER BY name DESC, S.id DESC LIMIT 20"; > even if it doesn't work for me because it doesn't match S.genre_id = > NULL, I noticed a little improvement to 6000 ms. Then I delete S.id > DESC and the performance has been abruptly increased to 400 ms. Again, this means that the condition (name<= 'zUmM') matches a large number of records in Song. When not ordering by S.id, SQLite can do the following: scan Genre table starting from 'zUmM' and going down, using an index on Genre(name). For each record in Genre, retrieve corresponding records in Song using an index on Song(genre_id). As soon as it got 20 records, it can stop. If you also order on S.id, SQLite must continue the process described above until it retieves _all_ records, sort them, and then return top 20. Run your queries (in sqlite3 command line utility or your favorite management tool) with EXPLAIN QUERY PLAN prepended. You may find the results illuminating. Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] select performance with join
"Jim Wilcoxson"wrote in message news:c5830b750905080813p6bf901bn526c81ca8ce7a...@mail.gmail.com > I don't know if it makes any difference, but is that where clause the > same as: > > WHERE name< 'zUmM' OR (name= 'zUmM' AND S.id< 8122) SQLite's optimizer cannot use an index for any condition involving OR. That's why it's common to write an equivalent but somewhat unnatural name <= 'zUmM' AND (name< 'zUmM' OR S.id< 8122) This way, at least the first condition has a chance of being satisfied with an index. Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] select performance with join
I don't know if it makes any difference, but is that where clause the same as: WHERE name< 'zUmM' OR (name= 'zUmM' AND S.id< 8122) The original way it was coded, all 3 conditions would have to be evaluated most of the time. The new way might get most rows with 1 condition. Depends on the data distribution of course. Jim On 5/8/09, galea...@korg.itwrote: > Citando Igor Tandetnik : > >> Andrea Galeazzi wrote: >>> but when I execute: >>> >>> SELECT S.id,title,artist,bpm,name >>> >>> FROM Song AS S >>> >>> LEFT JOIN Genre AS G ON (S.genre_id = G.id) >>> >>> WHERE name<= 'zUmM' AND (name< 'zUmM' OR S.id< 8122) >>> >>> ORDER BY name DESC, S.id DESC LIMIT 20; >> >> Note that LEFT JOIN is pointless here, since any record with G.name=NULL >> won't make it past the WHERE clause. Replace it with plain JOIN, you >> should see an improvement. >> >> Igor Tandetnik >> > I replaced LEFT JOIN with JOIN but it got worse, now the the time is > about 8700 ms! But I think I need to use LEFT JOIN because I have also > to accept the records with S.genre_id = NULL. > I also tried this query: > “SELECT S.id,title,artist,bpm,name " > "FROM Song AS S, Genre AS G " > "WHERE (S.genre_id = G.id) AND (name<= 'zUmM') AND (name< 'zUmM' OR > S.id< 8122) " > "ORDER BY name DESC, S.id DESC LIMIT 20"; > even if it doesn't work for me because it doesn't match S.genre_id = > NULL, I noticed a little improvement to 6000 ms. Then I delete S.id > DESC and the performance has been abruptly increased to 400 ms. > Anyway probably the right statement is LEFT JOIN but how can I > optimize this kind of task? > Is it really an hard work or does it depend on my no knowledge about sqlite? >> >> >> ___ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >> > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > -- Software first. Software lasts! ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] select performance with join
Citando Igor Tandetnik: > Andrea Galeazzi wrote: >> but when I execute: >> >> SELECT S.id,title,artist,bpm,name >> >> FROM Song AS S >> >> LEFT JOIN Genre AS G ON (S.genre_id = G.id) >> >> WHERE name<= 'zUmM' AND (name< 'zUmM' OR S.id< 8122) >> >> ORDER BY name DESC, S.id DESC LIMIT 20; > > Note that LEFT JOIN is pointless here, since any record with G.name=NULL > won't make it past the WHERE clause. Replace it with plain JOIN, you > should see an improvement. > > Igor Tandetnik > I replaced LEFT JOIN with JOIN but it got worse, now the the time is about 8700 ms! But I think I need to use LEFT JOIN because I have also to accept the records with S.genre_id = NULL. I also tried this query: SELECT S.id,title,artist,bpm,name " "FROM Song AS S, Genre AS G " "WHERE (S.genre_id = G.id) AND (name<= 'zUmM') AND (name< 'zUmM' OR S.id< 8122) " "ORDER BY name DESC, S.id DESC LIMIT 20"; even if it doesn't work for me because it doesn't match S.genre_id = NULL, I noticed a little improvement to 6000 ms. Then I delete S.id DESC and the performance has been abruptly increased to 400 ms. Anyway probably the right statement is LEFT JOIN but how can I optimize this kind of task? Is it really an hard work or does it depend on my no knowledge about sqlite? > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] select performance with join
Andrea Galeazziwrote: > but when I execute: > > SELECT S.id,title,artist,bpm,name > > FROM Song AS S > > LEFT JOIN Genre AS G ON (S.genre_id = G.id) > > WHERE name<= 'zUmM' AND (name< 'zUmM' OR S.id< 8122) > > ORDER BY name DESC, S.id DESC LIMIT 20; Note that LEFT JOIN is pointless here, since any record with G.name=NULL won't make it past the WHERE clause. Replace it with plain JOIN, you should see an improvement. Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users