Re: [sqlite] column affinity and the query planner's use of indices

Keith Medcalf Tue, 08 Jul 2014 08:52:31 -0700

There is a way around this:

I am going to use a wholenumber virtual table for z because, for me, this is 
easier than fiddling with CTE's to generate integer sequences.  It makes no 
difference (other than it also executes about a million times faster than the 
CTE).  You will note that the CTE does not generate a column affinity for table 
z.  You can fix this the same way if you need column affinities.


Anyway, when you CREATE TABLE a AS SELECT you need to CAST the columns to the 
type you want for the column affinity for that column, and alias them to the 
column name you want.  While this may cause extra typing, you only do it once, 
and it is not incompatible with any other SQL dialect.

** Note ** this was done with the current head of trunk with all the extensions 
(including WHOLENUMBER) included.  If you don't have the WHOLENUMBER virtual 
table instruction then obviously you will not be able to run this and must use 
the CTE to generate table z.

CREATE VIRTUAL TABLE z USING WHOLENUMBER;
CREATE TABLE a(x INTEGER PRIMARY KEY);
INSERT INTO a SELECT value FROM z WHERE value BETWEEN 0 AND 1e6;
CREATE TABLE b AS SELECT CAST(value AS INTEGER) AS x FROM z WHERE value BETWEEN 
0 AND 1e6;
CREATE UNIQUE INDEX b_idx_1 ON b(x);
ANALYZE;

sqlite> PRAGMA table_info(a);
0|x|INTEGER|0||1

sqlite> PRAGMA table_info(b);
0|x|INT|0||0

sqlite> .schema
CREATE VIRTUAL TABLE z USING WHOLENUMBER;
CREATE TABLE a(x INTEGER PRIMARY KEY);
CREATE TABLE b(x INT);
CREATE UNIQUE INDEX b_idx_1 ON b(x);

sqlite> EXPLAIN QUERY PLAN SELECT * FROM a JOIN b INDEXED BY b_idx_1 ON a.x = 
b.x AND a.x = 123;
0|0|0|SEARCH TABLE a USING INTEGER PRIMARY KEY (rowid=?)
0|1|1|SEARCH TABLE b USING COVERING INDEX b_idx_1 (x=?)

sqlite> .timer on
sqlite> SELECT * FROM a JOIN b ON a.x = b.x AND a.x = 123;
123|123
Run Time: real 0.000 user 0.000000 sys 0.000000
sqlite>

Or using the original code:

sqlite> CREATE TEMP TABLE z AS
   ...> WITH RECURSIVE cnt(x) AS (
   ...>     VALUES(1) UNION ALL SELECT x+1 FROM cnt
   ...>     WHERE x < 1e6
   ...> ) SELECT x FROM cnt;
sqlite>
sqlite> CREATE TABLE a(x INTEGER NOT NULL PRIMARY KEY);
sqlite> INSERT INTO a SELECT x FROM temp.z;
sqlite>
sqlite> CREATE TABLE b AS SELECT cast(x as INTEGER) as x FROM temp.z;
sqlite> CREATE UNIQUE INDEX b_idx_1 ON b(x);
sqlite>
sqlite> ANALYZE;
sqlite>
sqlite> PRAGMA table_info(a);
0|x|INTEGER|1||1
sqlite>
sqlite> PRAGMA table_info(b);
0|x|INT|0||0
sqlite>
sqlite> PRAGMA index_info(b_idx_1);
0|0|x
sqlite>
sqlite> .schema
CREATE TABLE a(x INTEGER NOT NULL PRIMARY KEY);
CREATE TABLE b(x INT);
CREATE UNIQUE INDEX b_idx_1 ON b(x);
CREATE TABLE z(x);
sqlite>
sqlite> EXPLAIN QUERY PLAN
   ...> SELECT * FROM a JOIN b ON a.x = b.x AND a.x = 123;
0|0|0|SEARCH TABLE a USING INTEGER PRIMARY KEY (rowid=?)
0|1|1|SEARCH TABLE b USING COVERING INDEX b_idx_1 (x=?)
sqlite>
sqlite> EXPLAIN QUERY PLAN
   ...> SELECT * FROM a JOIN b INDEXED BY b_idx_1 ON a.x = b.x AND a.x = 123;
0|0|0|SEARCH TABLE a USING INTEGER PRIMARY KEY (rowid=?)
0|1|1|SEARCH TABLE b USING COVERING INDEX b_idx_1 (x=?)
sqlite>
sqlite> EXPLAIN QUERY PLAN
   ...> SELECT * FROM a JOIN b ON b.x IN (a.x) AND a.x = 123;
0|0|0|SEARCH TABLE a USING INTEGER PRIMARY KEY (rowid=?)
0|1|1|SEARCH TABLE b USING COVERING INDEX b_idx_1 (x=?)
sqlite>
sqlite> .timer on
sqlite> SELECT * FROM a JOIN b ON a.x = b.x AND a.x = 123;
123|123
Run Time: real 0.000 user 0.000000 sys 0.000000
sqlite> SELECT * FROM a JOIN b ON b.x IN (a.x) AND a.x = 123;
123|123
Run Time: real 0.016 user 0.000000 sys 0.015625


>-----Original Message-----
>From: [email protected] [mailto:sqlite-users-
>[email protected]] On Behalf Of Richard Hipp
>Sent: Tuesday, 8 July, 2014 09:09
>To: General Discussion of SQLite Database
>Subject: Re: [sqlite] column affinity and the query planner's use of
>indices
>
>On Tue, Jul 8, 2014 at 11:01 AM, Hinrichsen, John <[email protected]>
>wrote:
>
>> Hi,
>>
>> Would you consider changing the column affinity determination rules
>
>
>Probably not.  There are over a half million apps (literally) in
>circulation that use the existing rules.  Changing the rules would break
>some fraction of those half-million apps, which would likely be very
>annoying to the developers.
>
>
>
>
>> so that
>> expressions would, at least in easy-to-deduce cases, automatically
>assign
>> the appropriate column affinity?
>>
>> Making this change would improve the performance of some queries (see
>my
>> original email.)
>>
>> It would be more intuitive: why should aggregate functions like min(),
>> max(), and sum() return column data stripped of the original column
>> affinity?
>>
>>
>> On Fri, May 23, 2014 at 2:21 PM, Hinrichsen, John
><[email protected]>
>> wrote:
>>
>> > At table creation time, when column types are not declared
>explicitly, or
>> > are produced by an expression, column affinity defaults to NONE, with
>the
>> > result that indexes added afterwards often go unused in joins because
>of
>> a
>> > column affinity mismatch.
>> >
>> > Adding casts around the expressions is an effective way to enforce
>column
>> > affinities, at the expense of redundant code.  Column types that
>could be
>> > declared in just one place, or simply inferred, have to be repeatedly
>> > re-declared via casts.  I assume that this is a result of sqlite's
>> manifest
>> > typing.
>> >
>> > Would you consider changing the column affinity determination rules
>such
>> > that expressions could, at least in easy-to-deduce cases,
>automatically
>> > assign the appropriate column affinity?
>> >
>> > My colleague Ivan, who diagnosed the issue, put the following repro
>> > together to illustrate the problem.  He uses the syntax "a JOIN b ON
>b.x
>> > IN (a.x)" and compares its query plan and performance to that of
>using
>> > the standard join syntax.  He also requests the query planner to
>> > specifically use the index within the context of the normal join
>syntax,
>> > which the query planner rejects.
>> >
>> > $ sqlite3
>> > SQLite version 3.8.4.3 2014-04-03 16:53:12
>> > Enter ".help" for usage hints.
>> > Connected to a transient in-memory database.
>> > Use ".open FILENAME" to reopen on a persistent database.
>> > sqlite>
>> > sqlite> CREATE TEMP TABLE z AS
>> >    ...> WITH RECURSIVE cnt(x) AS (
>> >    ...>     VALUES(1) UNION ALL SELECT x+1 FROM cnt
>> >    ...>     WHERE x < 1e6
>> >    ...> ) SELECT x FROM cnt;
>> > sqlite>
>> > sqlite> CREATE TABLE a(x INTEGER NOT NULL PRIMARY KEY);
>> > sqlite> INSERT INTO a SELECT x FROM temp.z;
>> > sqlite>
>> > sqlite> CREATE TABLE b AS SELECT x FROM temp.z;
>> > sqlite> CREATE UNIQUE INDEX b_idx_1 ON b(x);
>> > sqlite>
>> > sqlite> ANALYZE;
>> > sqlite>
>> > sqlite> PRAGMA table_info(a);
>> > cid         name        type        notnull     dflt_value  pk
>> > ----------  ----------  ----------  ----------  ----------  ---------
>-
>> > 0           x           INTEGER     1                       1
>> > sqlite> PRAGMA table_info(b);
>> > cid         name        type        notnull     dflt_value  pk
>> > ----------  ----------  ----------  ----------  ----------  ---------
>-
>> > 0           x                       0                       0
>> > sqlite> PRAGMA index_info(b_idx_1);
>> > seqno       cid         name
>> > ----------  ----------  ----------
>> > 0           0           x
>> > sqlite> EXPLAIN QUERY PLAN
>> >    ...> SELECT * FROM a JOIN b ON a.x = b.x AND a.x = 123;
>> > selectid    order       from        detail
>> > ----------  ----------  ----------
>> > --------------------------------------------------
>> > 0           0           0           SEARCH TABLE a USING INTEGER
>PRIMARY
>> > KEY (rowid=?)
>> > 0           1           1           SCAN TABLE b
>> > sqlite> EXPLAIN QUERY PLAN
>> >    ...> SELECT * FROM a JOIN b INDEXED BY b_idx_1 ON a.x = b.x AND
>a.x =
>> > 123;
>> > Error: no query solution
>> >
>> > sqlite> EXPLAIN QUERY PLAN
>> >    ...> SELECT * FROM a JOIN b ON b.x IN (a.x) AND a.x = 123;
>> > selectid    order       from        detail
>> > ----------  ----------  ----------
>> > --------------------------------------------------
>> > 0           0           0           SEARCH TABLE a USING INTEGER
>PRIMARY
>> > KEY (rowid=?)
>> > 0           1           1           SEARCH TABLE b USING COVERING
>INDEX
>> > b_idx_1 (x=?)
>> > 0           0           0           EXECUTE LIST SUBQUERY 1
>> > sqlite> .timer on
>> > sqlite> SELECT * FROM a JOIN b ON a.x = b.x AND a.x = 123;
>> > x           x
>> > ----------  ----------
>> > 123         123
>> > Run Time: real 0.157 user 0.155976 sys 0.000000
>> >
>> > sqlite> SELECT * FROM a JOIN b ON b.x IN (a.x) AND a.x = 123;
>> > x           x
>> > ----------  ----------
>> > 123         123
>> > Run Time: real 0.000 user 0.000000 sys 0.000000
>> > sqlite>
>> >
>>
>> --
>>
>> This message contains confidential information and is intended only for
>the
>> individual named. If you are not the named addressee, you should not
>> disseminate, distribute, alter or copy this e-mail. Please notify the
>> sender immediately by e-mail if you have received this e-mail by
>mistake
>> and delete this e-mail from your system. E-mail transmissions cannot be
>> guaranteed to be secure or without error as information could be
>> intercepted, corrupted, lost, destroyed, or arrive late or incomplete.
>The
>> sender, therefore, does not accept liability for any errors or
>omissions in
>> the contents of this message which arise during or as a result of e-
>mail
>> transmission. If verification is required, please request a hard-copy
>> version. This message is provided for information purposes and should
>not
>> be construed as a solicitation or offer to buy or sell any securities
>or
>> related financial instruments in any jurisdiction.
>> _______________________________________________
>> sqlite-users mailing list
>> [email protected]
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
>
>
>
>--
>D. Richard Hipp
>[email protected]
>_______________________________________________
>sqlite-users mailing list
>[email protected]
>http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users



_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] column affinity and the query planner's use of indices

Reply via email to