I asked this question on Stackoverflow with not much success, and a suggestion 
to ask it on the list. So here I am. I have two tables, t1(id, t1Id, … other 
cols …, fullText) and a FTS5 virtual table vt1(t1Id, fullText)

```
sqlite> EXPLAIN QUERY PLAN
   ...> SELECT Count(*) as num FROM t1 WHERE deleted = 0;

QUERY 
PLAN
--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=?)

sqlite> SELECT Count(*) as num FROM t1 WHERE deleted = 0;
308498

Run Time
: real 0.043 user 0.023668 sys 0.009005
```


As can be see above, the actual query takes ~43ms

```
sqlite> EXPLAIN QUERY PLAN
   ...> SELECT Count(*) as num FROM vt1 WHERE vt1 MATCH 'foo';

QUERY 
PLAN
--SCAN TABLE vt1 VIRTUAL TABLE INDEX 131073:

sqlite> SELECT Count(*) as num FROM vt1 WHERE vt1 MATCH 'foo';
80789

Run Time
: real 0.047 user 0.008021 sys 0.009640
```

The actual query, in this case, takes ~47ms. So far so good. But the problem 
occurs when I join the two tables

```
sqlite> EXPLAIN QUERY PLAN
   ...> SELECT Count(*) as num
   ...> FROM t1 JOIN vt1 ON t1.t1Id = vt1.t1Id
   ...> WHERE t1.deleted = 0 AND vt1 MATCH 'foo';

QUERY 
PLAN
|--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m
 --SEARCH TABLE t1 USING INDEX sqlite_autoindex_t1_1 (t1Id=?)

sqlite> SELECT Count(*) as num
   ...> FROM t1 JOIN vt1 ON t1.t1Id = vt1.t1Id
   ...> WHERE t1.deleted = 0 AND vt1 MATCH 'foo';
80789

Run Time
: real 26.218 user 1.396376 sys 5.413630
```

The answer is correct but the query takes more than 26 seconds! Of course, I 
would like to speed up this query by several orders of magnitude, but I would 
also like to understand why this join is causing the slowdown.

Now, the reason I have constructed a query like this is because users can add 
further constraints for the table t1. For example, 

```
sqlite> SELECT Count(*) as num
   ...> FROM t1 JOIN vt1 ON t1.t1Id = vt1.t1Id
   ...> WHERE t1.deleted = 0 AND 
   ...> WHERE t1.frob = ‘bar' AND 
   ...> WHERE t1.nob = ‘baz' AND 
   ...> vt1 MATCH 'foo’;
```

Also, in every operation, for every given constraint, two queries are 
performed, one that returns the count and the other that returns the actual 
columns. And, finally, only a subset of the results are returned using LIMIT 
and OFFSET but *after* a sort ORDER has been prescribed. So, in reality, the 
last constraint above would result in the following

```
sqlite> SELECT Count(*) as num
   ...> FROM t1 JOIN vt1 ON t1.t1Id = vt1.t1Id
   ...> WHERE t1.deleted = 0 AND 
   ...> WHERE t1.frob = ‘bar' AND 
   ...> WHERE t1.nob = ‘baz' AND 
   ...> vt1 MATCH 'foo’;
20367

sqlite> SELECT t1.id, t1.t1Id, … other cols …, 
   ...> snippet(vt1, 1, "<b>", "</b>", "", 50) AS context,
   ...> FROM t1 JOIN vt1 ON t1.t1Id = vt1.t1Id
   ...> WHERE t1.deleted = 0 AND 
   ...> WHERE t1.frob = ‘bar' AND 
   ...> WHERE t1.nob = ‘baz' AND 
   ...> vt1 MATCH ‘foo’
   ...> ORDER BY <some t1 col>
   ...> LIMIT 30 OFFSET <some offset calculated by pagination>;
```

When no t1 columns are prescribed in the constraint, the default count (shown 
above) and default cols are returned with the FTS search

```
sqlite> SELECT Count(*) as num
   ...> FROM t1 JOIN vt1 ON t1.t1Id = vt1.t1Id
   ...> WHERE t1.deleted = 0 
   ...> vt1 MATCH 'foo’;
20367

sqlite> SELECT t1.id, t1.t1Id, … other cols …, 
   ...> snippet(vt1, 1, "<b>", "</b>", "", 50) AS context,
   ...> FROM t1 JOIN vt1 ON t1.t1Id = vt1.t1Id
   ...> WHERE t1.deleted = 0 
   ...> vt1 MATCH ‘foo’
   ...> ORDER BY <some t1 col>
   ...> LIMIT 30 OFFSET 0;
```
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to