On 2017/04/19 8:50 PM, Gabriele Lanaro wrote:
Dear SQLite communiy,
I’m trying to increase the performance of a query to its maximum possible
speed.
The schema is roughly constituted of two tables, one is a full text index,
fts5_table, while the other is called datatable.
The query in question is a full text query on the full text index joined
with the other table.
*SELECT* fts5.column, datatable.column2 *FROM* fts5_table, datatable *WHERE*
fts5_table *MATCH* ‘term’ *AND* datatable.id = fts5_column.rowid *ORDER BY*
rank *ASC* *LIMIT* 10;
Another query looks for counts of the same kind:
*SELECT* count(datatable.id) *FROM* fts5_table, datatable *WHERE*
fts5_table *MATCH* ‘term’ *AND* datatable.id = fts5_column.rowid
On a database of about the size of 1 GB of text (evenly distributed amount
100 000 rows). To give an idea of the speeds obtained I achieve a speed of
about 40 ms for the first and 80 ms for the second query.
Do you really mean "ms" as in milli-seconds? getting a result on an
aggregated or sorted query in under a 100 milliseconds from a Gigabyte+
table with 100K+ rows is what we would call: Very Fast.
Things I tried:
1) ANALYZE
Decreased the speed by orders of magnitude.
Do you really mean "Decreased the speed" as in made it much much slower?
I'm going to assume for the sake of brevity that you meant "seconds" and
"Increased" so that there is real opportunity for improvement.
One strategy I would employ, if you always do those two queries
together, is to query to a temp table like this:
CREATE TEMP TABLE IF NOT EXISTS tmp_aq(
id INTEGER PRIMARY KEY,
col1 TEXT,
iRank INT
);
-- In case there's already values:
DELETE * FROM tmp_aq;
INSERT INTO tmp_aq(id, col1, iRank)
SELECT rowid, column, rank
FROM fts5_table
WHERE fts5_table MATCH 'term';
-- The first query then becomes:
SELECT col1, datatable.column2
FROM tmp_aq
JOIN datatable ON datatable.id = tmp_aq.id
ORDER BY tmp_aq.iRank ASC
LIMIT 10;
-- If datatable has a 1-to-1 relationship with the fts table,
-- then this is enough for the second query:
SELECT count(*) FROM tmp_aq;
-- else you will still need the join:
SELECT count(*) FROM tmp_aq JOIN datatable ON datatable.id = tmp_aq.id;
An Index on iRank may or may not improve things - you need to test it.
Cheers,
Ryan
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users