On Apr 14, 2010, at 1:05 AM, <diego.d...@bentley.com> wrote: > Hello, > > I have a question regarding the query-optimizer. I've tested SQLite > with the following situation: > > 1 table defined as > > CREATE TABLE MyTable (ElementID INTEGER CONSTRAINT ElementID_PK > PRIMARY KEY AUTOINCREMENT, Label VARCHAR(255)) > > After filling that table with lots of records (e.g. 100K), I tested > random look-ups of the "Label" column based on the ElementID column > (which is the primary-key of the table). That is, executing the > following query, which always returns 1 label-value, many times > (after preparing it only once), with random values for the > @ElementID parameter: > > SELECT Label FROM MyTable WHERE ElementID = @ElementID > > I've noticed that the performance of such query is not what I > expected. So, I looked at the query-plan (with the EXPLAIN keyword), > confirming that it is not using the primary-key, but it seems to be > doing table-scans instead (e.g. operation NotExists at addr 5):
It is using the primary key. NotExists does a seek on the IPK column. This query should be fast. > addr opcode p1 p2 p3 p4 p5 comment > 0 Trace 0 0 0 0 > 1 Variable 1 1 1 @ElementID 0 > 2 Goto 0 10 0 0 > 3 OpenRead 0 2 0 2 0 > 4 MustBeInt 1 8 0 0 > 5 NotExists 0 8 1 0 > 6 Column 0 1 3 0 > 7 ResultRow 3 1 0 0 > 8 Close 0 0 0 0 > 9 Halt 0 0 0 0 > 10 Transaction 0 0 0 0 > 11 VerifyCookie 0 2 0 0 > 12 TableLock 0 2 0 MyTable 0 > 13 Goto 0 3 0 0 > > On the other hand, I've tried the same kind of "look-ups" but > involving a few joined-tables, leading to a more complex query like: > > SELECT MyDataTable .MyValue FROM MyDataTable, MyIntermediateTable > WHERE MyIntermediateTable.ID1 = @ID1 AND MyIntermediateTable.ID2 = > @ID2 AND MyIntermediateTable.RecordID = MyDataTable.RecordID; > > That query returns one value for a provided pair of integer IDs > (@ID1 and @ID2) and it involves an inner join between two tables > (MyDataTable.RecordID is a primary key autoincrement column). > > Executing that second query with random IDs many times in a row > leads to much better performance (takes less than half of the time) > than the simple query shown before (the size in number of rows of > all tables is comparable). > > After analyzing the query-plan for the second query, I can tell it > is using the indexes to seek into those tables as shown below (e.g. > operations at addr 13 through 16: SeekGe, IdxGE, IdxRowid and Seek). > > addr opcode p1 p2 p3 p4 p5 comment > 0 Trace 0 0 0 0 > 1 Integer 1 1 0 0 > 2 Variable 1 2 1 @ID1 0 > 3 Variable 2 3 1 @ID2 0 > 4 Goto 0 29 0 0 > 5 OpenRead 1 7 0 3 0 > 6 OpenRead 2 8 0 > keyinfo(2,BINARY,BINARY) 0 > 7 OpenRead 0 6 0 2 0 > 8 SCopy 2 4 0 0 > 9 IsNull 4 25 0 0 > 10 SCopy 3 5 0 0 > 11 IsNull 5 25 0 0 > 12 Affinity 4 2 0 dd 0 > 13 SeekGe 2 25 4 2 0 > 14 IdxGE 2 25 4 2 1 > 15 IdxRowid 2 6 0 0 > 16 Seek 1 6 0 0 > 17 Column 1 2 7 0 > 18 MustBeInt 7 24 0 0 > 19 NotExists 0 24 7 0 > 20 Column 0 1 8 0 0 > 21 RealAffinity 8 0 0 0 > 22 ResultRow 8 1 0 0 > 23 IfZero 1 25 -1 0 > 24 Next 2 14 0 0 > 25 Close 1 0 0 0 > 26 Close 2 0 0 0 > 27 Close 0 0 0 0 > 28 Halt 0 0 0 0 > 29 Transaction 0 0 0 0 > 30 VerifyCookie 0 101 0 0 > 31 TableLock 0 7 0 > MyIntermediateTable 0 > 32 TableLock 0 6 0 MyDataTable 0 > 33 Goto 0 5 0 0 > > Does anyone have any idea why the query-optimizer is not taking > advantage of the index associated with the primary key in the simple > query I showed initially? How can I rewrite such a query so it uses > the primary key index? I've tried "INDEXED BY" as well (after > creating a second index on the ElementID column) with no success. > > Thanks, > > Diego > > _______________________________________________ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users