Re: [sqlite] select performance with join
Thanks for the explanation. I recall seeing posts suggesting the use of union instead of or, and thought "if it's that easy, why doesn't SQLite do it?" The optimizer documentation says: --- Suppose the OR clause consists of multiple subterms as follows: expr1 OR expr2 OR expr3 If every subterm of an OR clause is separately indexable and the transformation to an IN operator described above does not apply, then the OR clause is coded so that it logically works the same as the following: rowid IN (SELECT rowid FROM table WHERE expr1 UNION SELECT rowid FROM table WHERE expr2 UNION SELECT rowid FROM table WHERE expr3) The implemention of the OR clause does not really use subqueries. A more efficient internal mechanism is employed. The implementation also works even for tables where the "rowid" column name has been overloaded for other uses and no longer refers to the real rowid. But the essence of the implementation is captured by the statement above: Separate indices are used to find rowids that satisfy each subterm of the OR clause and then the union of those rowids is used to find all matching rows in the database. --- It sounds like it might use indexes for an OR after all. Jim On 5/8/09, Igor Tandetnikwrote: > "Jim Wilcoxson" wrote > in message > news:c5830b750905080813p6bf901bn526c81ca8ce7a...@mail.gmail.com >> I don't know if it makes any difference, but is that where clause the >> same as: >> >> WHERE name< 'zUmM' OR (name= 'zUmM' AND S.id< 8122) > > SQLite's optimizer cannot use an index for any condition involving OR. > That's why it's common to write an equivalent but somewhat unnatural > > name <= 'zUmM' AND (name< 'zUmM' OR S.id< 8122) > > This way, at least the first condition has a chance of being satisfied > with an index. > > Igor Tandetnik > > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > -- Software first. Software lasts! ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] select performance with join
wrote in message news:20090508113252.2uqkghcsj6og8...@webmail.korg.it > Citando Igor Tandetnik : > >> Andrea Galeazzi wrote: >>> but when I execute: >>> >>> SELECT S.id,title,artist,bpm,name >>> >>> FROM Song AS S >>> >>> LEFT JOIN Genre AS G ON (S.genre_id = G.id) >>> >>> WHERE name<= 'zUmM' AND (name< 'zUmM' OR S.id< 8122) >>> >>> ORDER BY name DESC, S.id DESC LIMIT 20; >> >> Note that LEFT JOIN is pointless here, since any record with >> G.name=NULL won't make it past the WHERE clause. Replace it with >> plain JOIN, you should see an improvement. >> > I replaced LEFT JOIN with JOIN but it got worse This means that the majority of all records in Song table satisfy the condition of name<= 'zUmM'. Basically, your query leaves SQLite no choice but to scan all records in Song. I don't see how it could be made to work any faster (without significant redesign of the schema - e.g. moving genre name into Song table). > But I think I need to use LEFT JOIN because I have also > to accept the records with S.genre_id = NULL. Well, if you need such records, then you need a different query. The one you show doesn't return these records, despite using LEFT JOIN. Don't take my word for it, test it on a small sample database. > I also tried this query: > "SELECT S.id,title,artist,bpm,name " > "FROM Song AS S, Genre AS G " > "WHERE (S.genre_id = G.id) AND (name<= 'zUmM') AND (name< 'zUmM' OR > S.id< 8122) " > "ORDER BY name DESC, S.id DESC LIMIT 20"; > even if it doesn't work for me because it doesn't match S.genre_id = > NULL, I noticed a little improvement to 6000 ms. Then I delete S.id > DESC and the performance has been abruptly increased to 400 ms. Again, this means that the condition (name<= 'zUmM') matches a large number of records in Song. When not ordering by S.id, SQLite can do the following: scan Genre table starting from 'zUmM' and going down, using an index on Genre(name). For each record in Genre, retrieve corresponding records in Song using an index on Song(genre_id). As soon as it got 20 records, it can stop. If you also order on S.id, SQLite must continue the process described above until it retieves _all_ records, sort them, and then return top 20. Run your queries (in sqlite3 command line utility or your favorite management tool) with EXPLAIN QUERY PLAN prepended. You may find the results illuminating. Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] select performance with join
"Jim Wilcoxson"wrote in message news:c5830b750905080813p6bf901bn526c81ca8ce7a...@mail.gmail.com > I don't know if it makes any difference, but is that where clause the > same as: > > WHERE name< 'zUmM' OR (name= 'zUmM' AND S.id< 8122) SQLite's optimizer cannot use an index for any condition involving OR. That's why it's common to write an equivalent but somewhat unnatural name <= 'zUmM' AND (name< 'zUmM' OR S.id< 8122) This way, at least the first condition has a chance of being satisfied with an index. Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] select performance with join
I don't know if it makes any difference, but is that where clause the same as: WHERE name< 'zUmM' OR (name= 'zUmM' AND S.id< 8122) The original way it was coded, all 3 conditions would have to be evaluated most of the time. The new way might get most rows with 1 condition. Depends on the data distribution of course. Jim On 5/8/09, galea...@korg.itwrote: > Citando Igor Tandetnik : > >> Andrea Galeazzi wrote: >>> but when I execute: >>> >>> SELECT S.id,title,artist,bpm,name >>> >>> FROM Song AS S >>> >>> LEFT JOIN Genre AS G ON (S.genre_id = G.id) >>> >>> WHERE name<= 'zUmM' AND (name< 'zUmM' OR S.id< 8122) >>> >>> ORDER BY name DESC, S.id DESC LIMIT 20; >> >> Note that LEFT JOIN is pointless here, since any record with G.name=NULL >> won't make it past the WHERE clause. Replace it with plain JOIN, you >> should see an improvement. >> >> Igor Tandetnik >> > I replaced LEFT JOIN with JOIN but it got worse, now the the time is > about 8700 ms! But I think I need to use LEFT JOIN because I have also > to accept the records with S.genre_id = NULL. > I also tried this query: > “SELECT S.id,title,artist,bpm,name " > "FROM Song AS S, Genre AS G " > "WHERE (S.genre_id = G.id) AND (name<= 'zUmM') AND (name< 'zUmM' OR > S.id< 8122) " > "ORDER BY name DESC, S.id DESC LIMIT 20"; > even if it doesn't work for me because it doesn't match S.genre_id = > NULL, I noticed a little improvement to 6000 ms. Then I delete S.id > DESC and the performance has been abruptly increased to 400 ms. > Anyway probably the right statement is LEFT JOIN but how can I > optimize this kind of task? > Is it really an hard work or does it depend on my no knowledge about sqlite? >> >> >> ___ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >> > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > -- Software first. Software lasts! ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] select performance with join
Citando Igor Tandetnik: > Andrea Galeazzi wrote: >> but when I execute: >> >> SELECT S.id,title,artist,bpm,name >> >> FROM Song AS S >> >> LEFT JOIN Genre AS G ON (S.genre_id = G.id) >> >> WHERE name<= 'zUmM' AND (name< 'zUmM' OR S.id< 8122) >> >> ORDER BY name DESC, S.id DESC LIMIT 20; > > Note that LEFT JOIN is pointless here, since any record with G.name=NULL > won't make it past the WHERE clause. Replace it with plain JOIN, you > should see an improvement. > > Igor Tandetnik > I replaced LEFT JOIN with JOIN but it got worse, now the the time is about 8700 ms! But I think I need to use LEFT JOIN because I have also to accept the records with S.genre_id = NULL. I also tried this query: SELECT S.id,title,artist,bpm,name " "FROM Song AS S, Genre AS G " "WHERE (S.genre_id = G.id) AND (name<= 'zUmM') AND (name< 'zUmM' OR S.id< 8122) " "ORDER BY name DESC, S.id DESC LIMIT 20"; even if it doesn't work for me because it doesn't match S.genre_id = NULL, I noticed a little improvement to 6000 ms. Then I delete S.id DESC and the performance has been abruptly increased to 400 ms. Anyway probably the right statement is LEFT JOIN but how can I optimize this kind of task? Is it really an hard work or does it depend on my no knowledge about sqlite? > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] select performance with join
Andrea Galeazziwrote: > but when I execute: > > SELECT S.id,title,artist,bpm,name > > FROM Song AS S > > LEFT JOIN Genre AS G ON (S.genre_id = G.id) > > WHERE name<= 'zUmM' AND (name< 'zUmM' OR S.id< 8122) > > ORDER BY name DESC, S.id DESC LIMIT 20; Note that LEFT JOIN is pointless here, since any record with G.name=NULL won't make it past the WHERE clause. Replace it with plain JOIN, you should see an improvement. Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] select performance with join
Hi guys, I've got a big problem about select performance on an left join. I have two tables: CREATE TABLE Song ( id INTEGER NOT NULL UNIQUE, title VARCHAR(40) NULL COLLATE NOCASE, artist VARCHAR(40) NULL COLLATE NOCASE, bpm INT NULL, genre_id INT NULL, PRIMARY KEY (id), CONSTRAINT fk_Genre FOREIGN KEY (genre_id) REFERENCES Song (id) ON DELETE SET NULL ON UPDATE CASCADE); -- Indeces CREATE INDEX Song_title_idx ON Song(title); CREATE INDEX Song_artist_idx ON Song(artist); CREATE INDEX Song_bpm_idx ON Song(bpm); CREATE INDEX Song_genre_idx ON Song(genre_id); CREATE TABLE Genre ( id INTEGER NOT NULL UNIQUE, name VARCHAR(20) NOT NULL COLLATE NOCASE, image_id INT NOT NULL, PRIMARY KEY (id), CONSTRAINT fk_Image2 FOREIGN KEY (image_id) REFERENCES Genre (id) ON DELETE SET NULL ON UPDATE CASCADE); -- Indeces CREATE INDEX Genre_name_idx ON Genre(name); - Now when I execute this query: SELECT S.id,title,artist,bpm,name FROM Song AS S LEFT JOIN Genre AS G ON (S.genre_id = G.id) WHERE title<= 'zzX_5238' AND (title< 'zzX_5238' OR S.id< 5238) ORDER BY title DESC, S.id DESC LIMIT 20; it takes only 200 ms but when I execute: SELECT S.id,title,artist,bpm,name FROM Song AS S LEFT JOIN Genre AS G ON (S.genre_id = G.id) WHERE name<= 'zUmM' AND (name< 'zUmM' OR S.id< 8122) ORDER BY name DESC, S.id DESC LIMIT 20; it takes 8100! It's a huge time for our application! I also noticed that the Genre_name_idx it's useless. Song has 1 records, Genre has 100 records. Does anyone have any ideas about how to improve the previous query? Thanks ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users