Re: [sqlite] select performance with join

2009-05-09 Thread Jim Wilcoxson
Thanks for the explanation.  I recall seeing posts suggesting the use
of union instead of or, and thought "if it's that easy, why doesn't
SQLite do it?"  The optimizer documentation says:

---
Suppose the OR clause consists of multiple subterms as follows:

  expr1 OR expr2 OR expr3

If every subterm of an OR clause is separately indexable and the
transformation to an IN operator described above does not apply, then
the OR clause is coded so that it logically works the same as the
following:


  rowid IN (SELECT rowid FROM table WHERE expr1
UNION SELECT rowid FROM table WHERE expr2
UNION SELECT rowid FROM table WHERE expr3)

The implemention of the OR clause does not really use subqueries. A
more efficient internal mechanism is employed. The implementation also
works even for tables where the "rowid" column name has been
overloaded for other uses and no longer refers to the real rowid. But
the essence of the implementation is captured by the statement above:
Separate indices are used to find rowids that satisfy each subterm of
the OR clause and then the union of those rowids is used to find all
matching rows in the database.
---

It sounds like it might use indexes for an OR after all.

Jim

On 5/8/09, Igor Tandetnik  wrote:
> "Jim Wilcoxson"  wrote
> in message
> news:c5830b750905080813p6bf901bn526c81ca8ce7a...@mail.gmail.com
>> I don't know if it makes any difference, but is that where clause the
>> same as:
>>
>> WHERE name< 'zUmM' OR (name= 'zUmM' AND S.id< 8122)
>
> SQLite's optimizer cannot use an index for any condition involving OR.
> That's why it's common to write an equivalent but somewhat unnatural
>
> name <= 'zUmM' AND (name< 'zUmM' OR S.id< 8122)
>
> This way, at least the first condition has a chance of being satisfied
> with an index.
>
> Igor Tandetnik
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>


-- 
Software first.  Software lasts!
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] select performance with join

2009-05-08 Thread Igor Tandetnik
 wrote in message
news:20090508113252.2uqkghcsj6og8...@webmail.korg.it
> Citando Igor Tandetnik :
>
>> Andrea Galeazzi  wrote:
>>> but when I execute:
>>>
>>> SELECT S.id,title,artist,bpm,name
>>>
>>> FROM Song AS S
>>>
>>> LEFT JOIN Genre AS G ON (S.genre_id = G.id)
>>>
>>> WHERE name<= 'zUmM' AND (name< 'zUmM' OR S.id< 8122)
>>>
>>> ORDER BY name DESC, S.id DESC LIMIT 20;
>>
>> Note that LEFT JOIN is pointless here, since any record with
>> G.name=NULL won't make it past the WHERE clause. Replace it with
>> plain JOIN, you should see an improvement.
>>
> I replaced LEFT JOIN with JOIN but it got worse

This means that the majority of all records in Song table satisfy the 
condition of name<= 'zUmM'. Basically, your query leaves SQLite no 
choice but to scan all records in Song. I don't see how it could be made 
to work any faster (without significant redesign of the schema - e.g. 
moving genre name into Song table).

> But I think I need to use LEFT JOIN because I have also
> to accept the records with S.genre_id = NULL.

Well, if you need such records, then you need a different query. The one 
you show doesn't return these records, despite using LEFT JOIN. Don't 
take my word for it, test it on a small sample database.

> I also tried this query:
> "SELECT S.id,title,artist,bpm,name "
> "FROM Song  AS S, Genre AS G "
> "WHERE (S.genre_id = G.id) AND (name<= 'zUmM') AND (name< 'zUmM' OR
> S.id< 8122) "
> "ORDER BY name DESC, S.id DESC LIMIT 20";
> even if it doesn't work for me because it doesn't  match  S.genre_id =
> NULL, I noticed a little improvement to 6000 ms. Then I delete S.id
> DESC and the performance has been abruptly increased to 400 ms.

Again, this means that the condition (name<= 'zUmM') matches a large 
number of records in Song. When not ordering by S.id, SQLite can do the 
following: scan Genre table starting from 'zUmM' and going down, using 
an index on Genre(name). For each record in Genre, retrieve 
corresponding records in Song using an index on Song(genre_id). As soon 
as it got 20 records, it can stop.

If you also order on S.id, SQLite must continue the process described 
above until it retieves _all_ records, sort them, and then return top 
20.

Run your queries (in sqlite3 command line utility or your favorite 
management tool) with EXPLAIN QUERY PLAN prepended. You may find the 
results illuminating.

Igor Tandetnik 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] select performance with join

2009-05-08 Thread Igor Tandetnik
"Jim Wilcoxson"  wrote
in message
news:c5830b750905080813p6bf901bn526c81ca8ce7a...@mail.gmail.com
> I don't know if it makes any difference, but is that where clause the
> same as:
>
> WHERE name< 'zUmM' OR (name= 'zUmM' AND S.id< 8122)

SQLite's optimizer cannot use an index for any condition involving OR. 
That's why it's common to write an equivalent but somewhat unnatural

name <= 'zUmM' AND (name< 'zUmM' OR S.id< 8122)

This way, at least the first condition has a chance of being satisfied 
with an index.

Igor Tandetnik 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] select performance with join

2009-05-08 Thread Jim Wilcoxson
I don't know if it makes any difference, but is that where clause the same as:

WHERE name< 'zUmM' OR (name= 'zUmM' AND S.id< 8122)

The original way it was coded, all 3 conditions would have to be
evaluated most of the time.  The new way might get most rows with 1
condition.  Depends on the data distribution of course.

Jim

On 5/8/09, galea...@korg.it  wrote:
> Citando Igor Tandetnik :
>
>> Andrea Galeazzi  wrote:
>>> but when I execute:
>>>
>>> SELECT S.id,title,artist,bpm,name
>>>
>>> FROM Song AS S
>>>
>>> LEFT JOIN Genre AS G ON (S.genre_id = G.id)
>>>
>>> WHERE name<= 'zUmM' AND (name< 'zUmM' OR S.id< 8122)
>>>
>>> ORDER BY name DESC, S.id DESC LIMIT 20;
>>
>> Note that LEFT JOIN is pointless here, since any record with G.name=NULL
>> won't make it past the WHERE clause. Replace it with plain JOIN, you
>> should see an improvement.
>>
>> Igor Tandetnik
>>
> I replaced LEFT JOIN with JOIN but it got worse, now the the time is
> about 8700 ms! But I think I need to use LEFT JOIN because I have also
> to accept the records with S.genre_id = NULL.
> I also tried this query:
> “SELECT S.id,title,artist,bpm,name "
> "FROM Song  AS S, Genre AS G "
> "WHERE (S.genre_id = G.id) AND (name<= 'zUmM') AND (name< 'zUmM' OR
> S.id< 8122) "
> "ORDER BY name DESC, S.id DESC LIMIT 20";
> even if it doesn't work for me because it doesn't  match  S.genre_id =
> NULL, I noticed a little improvement to 6000 ms. Then I delete S.id
> DESC and the performance has been abruptly increased to 400 ms.
> Anyway probably the right statement is LEFT JOIN but how can I
> optimize this kind of task?
> Is it really an hard work or does it depend on my no knowledge about sqlite?
>>
>>
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>


-- 
Software first.  Software lasts!
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] select performance with join

2009-05-08 Thread galeazzi
Citando Igor Tandetnik :

> Andrea Galeazzi  wrote:
>> but when I execute:
>>
>> SELECT S.id,title,artist,bpm,name
>>
>> FROM Song AS S
>>
>> LEFT JOIN Genre AS G ON (S.genre_id = G.id)
>>
>> WHERE name<= 'zUmM' AND (name< 'zUmM' OR S.id< 8122)
>>
>> ORDER BY name DESC, S.id DESC LIMIT 20;
>
> Note that LEFT JOIN is pointless here, since any record with G.name=NULL
> won't make it past the WHERE clause. Replace it with plain JOIN, you
> should see an improvement.
>
> Igor Tandetnik
>
I replaced LEFT JOIN with JOIN but it got worse, now the the time is  
about 8700 ms! But I think I need to use LEFT JOIN because I have also  
to accept the records with S.genre_id = NULL.
I also tried this query:
“SELECT S.id,title,artist,bpm,name "
"FROM Song  AS S, Genre AS G "
"WHERE (S.genre_id = G.id) AND (name<= 'zUmM') AND (name< 'zUmM' OR  
S.id< 8122) "
"ORDER BY name DESC, S.id DESC LIMIT 20";
even if it doesn't work for me because it doesn't  match  S.genre_id =  
NULL, I noticed a little improvement to 6000 ms. Then I delete S.id  
DESC and the performance has been abruptly increased to 400 ms.
Anyway probably the right statement is LEFT JOIN but how can I  
optimize this kind of task?
Is it really an hard work or does it depend on my no knowledge about sqlite?
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] select performance with join

2009-05-07 Thread Igor Tandetnik
Andrea Galeazzi  wrote:
> but when I execute:
>
> SELECT S.id,title,artist,bpm,name
>
> FROM Song AS S
>
> LEFT JOIN Genre AS G ON (S.genre_id = G.id)
>
> WHERE name<= 'zUmM' AND (name< 'zUmM' OR S.id< 8122)
>
> ORDER BY name DESC, S.id DESC LIMIT 20;

Note that LEFT JOIN is pointless here, since any record with G.name=NULL 
won't make it past the WHERE clause. Replace it with plain JOIN, you 
should see an improvement.

Igor Tandetnik 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] select performance with join

2009-05-07 Thread Andrea Galeazzi
Hi guys,

I've got a big problem about select performance on an left join. I have 
two tables:

CREATE TABLE Song (

id INTEGER NOT NULL UNIQUE,

title VARCHAR(40) NULL COLLATE NOCASE,

artist VARCHAR(40) NULL COLLATE NOCASE,

bpm INT NULL,

genre_id INT NULL,

PRIMARY KEY (id),

CONSTRAINT fk_Genre FOREIGN KEY (genre_id)

REFERENCES Song (id)

ON DELETE SET NULL

ON UPDATE CASCADE);

-- Indeces

CREATE INDEX Song_title_idx ON Song(title);

CREATE INDEX Song_artist_idx ON Song(artist);

CREATE INDEX Song_bpm_idx ON Song(bpm);

CREATE INDEX Song_genre_idx ON Song(genre_id);


CREATE TABLE Genre (

id INTEGER NOT NULL UNIQUE,

name VARCHAR(20) NOT NULL COLLATE NOCASE,

image_id INT NOT NULL,

PRIMARY KEY (id),

CONSTRAINT fk_Image2 FOREIGN KEY (image_id)

REFERENCES Genre (id)

ON DELETE SET NULL

ON UPDATE CASCADE);


-- Indeces

CREATE INDEX Genre_name_idx ON Genre(name);

-

Now when I execute this query:

SELECT S.id,title,artist,bpm,name

FROM Song AS S

LEFT JOIN Genre AS G ON (S.genre_id = G.id)

WHERE title<= 'zzX_5238' AND (title< 'zzX_5238' OR S.id< 5238)

ORDER BY title DESC, S.id DESC LIMIT 20;

it takes only 200 ms

but when I execute:

SELECT S.id,title,artist,bpm,name

FROM Song AS S

LEFT JOIN Genre AS G ON (S.genre_id = G.id)

WHERE name<= 'zUmM' AND (name< 'zUmM' OR S.id< 8122)

ORDER BY name DESC, S.id DESC LIMIT 20;


it takes 8100! It's a huge time for our application!

I also noticed that the Genre_name_idx it's useless. Song has 1 
records, Genre has 100 records.

Does anyone have any ideas about how to improve the previous query?

Thanks

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users