Hello,
I have a question regarding the caching mechanism in SQLite, especially when
using multiple connections.
My understanding was that each connection has its own page cache (if I don't
use SQLITE_OPEN_SHAREDCACHE).
However, I wrote some tests and the results did not meet my expectations...
What I tested:
* I have two FTS4 tables (each table contains about a million entries and has
10 columns; 5 columns are in the FTS index, the other columns are declared as
'notindexed')
* My application executes queries (with the MATCH operator) against both tables
* each query touches only a single table (no joins are used) and the queries to
the two tables are interleaved, so the order is: run query1 against table1,
run query2 against table2, run query3 against table1, run query4 against
table2, and so on
* The tables are, from the application's point of view, read-only - i.e. the
content is not modified during the application's lifetime - so the connections
are opened using
sqlite3_open_v2(db_path, &p_db, SQLITE_OPEN_READONLY, NULL)
(a simplified snippet is shown right after this list)
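
To make the setup more concrete, each query is executed roughly like this.
This is only a simplified sketch: the table name, the bound search term and
the error handling are placeholders, not my real (confidential) schema or
code.

    #include <sqlite3.h>

    /* Simplified sketch: open a read-only connection and run one MATCH
       query. "table1" and the search term are placeholders. */
    static int run_match_query(const char *db_path, const char *sql,
                               const char *term)
    {
        sqlite3 *p_db = NULL;
        sqlite3_stmt *p_stmt = NULL;
        int rc;

        /* read-only connection, default private page cache (no shared cache) */
        rc = sqlite3_open_v2(db_path, &p_db, SQLITE_OPEN_READONLY, NULL);
        if (rc != SQLITE_OK) goto done;

        rc = sqlite3_prepare_v2(p_db, sql, -1, &p_stmt, NULL);
        if (rc != SQLITE_OK) goto done;

        sqlite3_bind_text(p_stmt, 1, term, -1, SQLITE_TRANSIENT);

        while ((rc = sqlite3_step(p_stmt)) == SQLITE_ROW) {
            /* consume the row; the real test reads several columns here */
        }
        if (rc == SQLITE_DONE) rc = SQLITE_OK;

    done:
        sqlite3_finalize(p_stmt);
        sqlite3_close(p_db);
        return rc;
    }

    /* example call:
       run_match_query("test.db",
                       "SELECT docid FROM table1 WHERE table1 MATCH ?",
                       "foo");
    */
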
* I wanted to compare the performance of my queries with different approaches;
the total time to execute all queries (10 different queries) was measured:
1) Run each query on a NEW connection (open connection, execute query,
close connection):
* here I expected the worst performance
2) Run all queries on the same connection (a single connection kept open
during the whole test):
* this was my first implementation and I wanted to improve the performance
compared to this approach
* my assumption was that with this approach the page cache will not be
reused, since the queries are interleaved and each consecutive query hits
a different table
3) Create two connections (to the same DB) and dedicate each connection to
queries against a single table:
* this means that connection1 is used only for queries to table1 and
connection2 only for queries to table2
* I expected the best performance here, because I thought that each
connection would keep its own page cache and therefore the caches would be
reused as much as possible (a condensed sketch of all three variants
follows this list)
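
For clarity, this is roughly how the three variants handle their connections.
It is heavily condensed: exec_query is a hypothetical helper standing in for
the prepare/bind/step/finalize code sketched above, and error handling is
omitted.

    #include <sqlite3.h>

    /* Hypothetical helper standing in for the real (confidential) query
       code: prepares the statement, steps through all rows, finalizes it. */
    void exec_query(sqlite3 *db, const char *sql);

    /* Approach 1: a new connection (and thus an empty page cache) per query. */
    void run_approach1(const char *db_path, const char **queries, int n)
    {
        for (int i = 0; i < n; i++) {
            sqlite3 *db = NULL;
            sqlite3_open_v2(db_path, &db, SQLITE_OPEN_READONLY, NULL);
            exec_query(db, queries[i]);
            sqlite3_close(db);          /* page cache is discarded here */
        }
    }

    /* Approach 2: one connection, i.e. one page cache shared by all queries. */
    void run_approach2(const char *db_path, const char **queries, int n)
    {
        sqlite3 *db = NULL;
        sqlite3_open_v2(db_path, &db, SQLITE_OPEN_READONLY, NULL);
        for (int i = 0; i < n; i++)
            exec_query(db, queries[i]);
        sqlite3_close(db);
    }

    /* Approach 3: two connections to the same file, one per table;
       queries[] alternates between table1 and table2, so even indexes go
       to db1 and odd indexes to db2. */
    void run_approach3(const char *db_path, const char **queries, int n)
    {
        sqlite3 *db1 = NULL, *db2 = NULL;
        sqlite3_open_v2(db_path, &db1, SQLITE_OPEN_READONLY, NULL);
        sqlite3_open_v2(db_path, &db2, SQLITE_OPEN_READONLY, NULL);
        for (int i = 0; i < n; i++)
            exec_query((i % 2 == 0) ? db1 : db2, queries[i]);
        sqlite3_close(db1);
        sqlite3_close(db2);
    }
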
Results:
Approach 1 was indeed the slowest. However, approach 3 was unexpectedly
slightly worse than approach 2.
Moreover, when I ran similar tests with other, artificial queries which all
returned a lot of entries, approach 1 (!) was actually the best - almost 10%
faster than 2 and 3!
I really cannot understand how this is possible. Does SQLite have some other
global cache?
Or is it possible that "overwriting" the cache content (at least in approach 2)
can actually slow down query execution, so that it ends up slower than using a
new connection (= an empty cache) for each query?
Unfortunately I cannot post the exact queries / tables / tests from my project
here (it's confidential :-) ), but I would at least like to check whether my
thinking is correct or whether I have misunderstood something.
Would you expect approach 2 to be faster than approach 3? If so, please
explain why.
Many thanks,
Milan