Hello, I have a question regarding the query-optimizer. I've tested SQLite with the following situation:
1 table defined as CREATE TABLE MyTable (ElementID INTEGER CONSTRAINT ElementID_PK PRIMARY KEY AUTOINCREMENT, Label VARCHAR(255)) After filling that table with lots of records (e.g. 100K), I tested random look-ups of the "Label" column based on the ElementID column (which is the primary-key of the table). That is, executing the following query, which always returns 1 label-value, many times (after preparing it only once), with random values for the @ElementID parameter: SELECT Label FROM MyTable WHERE ElementID = @ElementID I've noticed that the performance of such query is not what I expected. So, I looked at the query-plan (with the EXPLAIN keyword), confirming that it is not using the primary-key, but it seems to be doing table-scans instead (e.g. operation NotExists at addr 5): addr opcode p1 p2 p3 p4 p5 comment 0 Trace 0 0 0 0 1 Variable 1 1 1 @ElementID 0 2 Goto 0 10 0 0 3 OpenRead 0 2 0 2 0 4 MustBeInt 1 8 0 0 5 NotExists 0 8 1 0 6 Column 0 1 3 0 7 ResultRow 3 1 0 0 8 Close 0 0 0 0 9 Halt 0 0 0 0 10 Transaction 0 0 0 0 11 VerifyCookie 0 2 0 0 12 TableLock 0 2 0 MyTable 0 13 Goto 0 3 0 0 On the other hand, I've tried the same kind of "look-ups" but involving a few joined-tables, leading to a more complex query like: SELECT MyDataTable .MyValue FROM MyDataTable, MyIntermediateTable WHERE MyIntermediateTable.ID1 = @ID1 AND MyIntermediateTable.ID2 = @ID2 AND MyIntermediateTable.RecordID = MyDataTable.RecordID; That query returns one value for a provided pair of integer IDs (@ID1 and @ID2) and it involves an inner join between two tables (MyDataTable.RecordID is a primary key autoincrement column). Executing that second query with random IDs many times in a row leads to much better performance (takes less than half of the time) than the simple query shown before (the size in number of rows of all tables is comparable). After analyzing the query-plan for the second query, I can tell it is using the indexes to seek into those tables as shown below (e.g. operations at addr 13 through 16: SeekGe, IdxGE, IdxRowid and Seek). addr opcode p1 p2 p3 p4 p5 comment 0 Trace 0 0 0 0 1 Integer 1 1 0 0 2 Variable 1 2 1 @ID1 0 3 Variable 2 3 1 @ID2 0 4 Goto 0 29 0 0 5 OpenRead 1 7 0 3 0 6 OpenRead 2 8 0 keyinfo(2,BINARY,BINARY) 0 7 OpenRead 0 6 0 2 0 8 SCopy 2 4 0 0 9 IsNull 4 25 0 0 10 SCopy 3 5 0 0 11 IsNull 5 25 0 0 12 Affinity 4 2 0 dd 0 13 SeekGe 2 25 4 2 0 14 IdxGE 2 25 4 2 1 15 IdxRowid 2 6 0 0 16 Seek 1 6 0 0 17 Column 1 2 7 0 18 MustBeInt 7 24 0 0 19 NotExists 0 24 7 0 20 Column 0 1 8 0 0 21 RealAffinity 8 0 0 0 22 ResultRow 8 1 0 0 23 IfZero 1 25 -1 0 24 Next 2 14 0 0 25 Close 1 0 0 0 26 Close 2 0 0 0 27 Close 0 0 0 0 28 Halt 0 0 0 0 29 Transaction 0 0 0 0 30 VerifyCookie 0 101 0 0 31 TableLock 0 7 0 MyIntermediateTable 0 32 TableLock 0 6 0 MyDataTable 0 33 Goto 0 5 0 0 Does anyone have any idea why the query-optimizer is not taking advantage of the index associated with the primary key in the simple query I showed initially? How can I rewrite such a query so it uses the primary key index? I've tried "INDEXED BY" as well (after creating a second index on the ElementID column) with no success. Thanks, Diego _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users