Александр Прокофьев wrote:
I've tried running versions of query that you suggested and got following
results:


Originally posted version       4.5 sec
Version with only first-part of where clause 16 ms

Yes, but this version isn't finding the same rows. :-)

Rewritten version without joins runs in 6.5 seconds

This is surprising. Are you sure about this timing, it doesn't make sense given the 4.5 second timings below. The only thing I can think of is that reversing the order of the second half of the where clause caused it to be slower. Can you try the following?

   Select ne.*
   From Entity AS ne
   Join Link AS l
   Where l."Source.Id" = ne.Id
       AND l."Target.Id" = '06d15df5-4253-4a65-b91f-cca52da960fe'
   OR l."Target.Id" = ne.Id
       AND l."Source.Id" = '06d15df5-4253-4a65-b91f-cca52da960fe'


Version with unions runs in     15ms

This is as I suspected. Each sub select runs a table scan of the entity table and uses an index to lookup matching rows in the link table. These results are then combined.

As I mentioned in my post to Samuel, it may be faster to swap the order of the tables in the subselects. It will then scan the link table and use the entity id index to lookup matching rows in the entity table.
Creating multi-column index (source.id, target.id) 4.5 sec

This shouldn't have any effect because the index will not be used if you already have the other indexes.

Removing all indexes for source.id and target.id        4.5 sec

OK, this should be the time for a full cartesian product scan, i.e. a scan through each row of the link table combined with each row if the entity table, or a intermediate joined table of N links times M entities.

Running query with first part of where clause and no indexes    2.5 sec

So it seems that indexes are not used at all, and that is pretty strange

I didn't state it in my original post, but this query was automatically
generated by object relational-mapper, that generates hundreds of similar
queries. I see that replacement of OR with UNION and AND with INTERSECT will
probably solve the problem, but yet it seems that using OR and AND in WHERE
clause is a better way to write queries, especially with complex nested
conditions.

Ah. That explains the unnecessary quotes and brackets. I'm not sure why it is including the unnecessary tables thought.

Using OR and AND in the where clause might make sense for a database with a much more sophisticated optimizer which effectively rewrites your queries for you, or one that can use multiple indexes to implement a table scan. SQLite doesn't do many optimizations and is limited to a single index for each table scan. For more info on how sqlite implements queries see http://www.sqlite.org/google-talk-slides/page-001.html around slide 40 or so (if memory serves me).

To get optimal performance from sqlite you may have to hand tune some of your queries, or make your query generator more intelligent.

Some information on db:
About 1500 records in each table
Original query returns about 10 rows


So a full cartesian join results in about 1500 x 1500 or 2,250,000 rows which takes about 4.5 seconds to scan looking for matching entity.id and link.target.id or link.source.id, and then checking the other link id for the specified value.

Given the tables are about the same size it would probably be best to remove the indexes on source id and target id and the reorder the tables in the queries so that they both use the same index on entity id.

     Select e.*
     From Link AS l
     Join Entity AS e on l."Source.Id" = e.Id
     Where l."Target.Id" = '06d15df5-4253-4a65-b91f-cca52da960fe'
 UNION
     Select e.*
     From Link AS l
     Join Entity AS e on l."Target.Id" = e.Id
     Where l."Source.Id" = '06d15df5-4253-4a65-b91f-cca52da960fe'

This will do two scans through the 1500 records in the link table for a total of 3000 rows. For each row it will do in index lookup in the entity table to find the matching entity. In fact, sqlite will probably do the where clause comparison before the index lookup so that the lookup is only done for rows that match the where condition (an optimization). Effectively you are scanning 300 rows and doing 10 indexed lookups.


Any suggestions on why indexes are not used and how to make SQLite use them
for this query?


Indexes are not used where there is no way to know which one to use or where a single index can't provide the required information. You can't join the entity table to the link table using two different columns, source.id and target.id, using an index.

You could try this as well with an index on each of the source id and target id link fields.

   Select e.*
   From Entity AS e
   left Join Link AS sl on sl."Source.Id" = e.Id
   left Join Link as tl on tl."Target.Id" = e.Id
   Where sl."Target.Id" = '06d15df5-4253-4a65-b91f-cca52da960fe'
   OR tl."Source.Id" = '06d15df5-4253-4a65-b91f-cca52da960fe'

The extra join gives sqlite an opportunity to use the source id index for the first join and the target id index for the second join.

HTH
Dennis Cote


-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------

Reply via email to