I have a requirement for which I'm proposing SQLite as a solution, and would 
appreciate some feedback to clarify my thinking. The application is a shared 
service (.NET/WCF) that caches relational data in-memory and serves it up when 
(read-only) requests come in (via operation contracts) from multiple clients. 
Such client requests are high-level, not relational, and could result in 
several (potentially many) individual queries to the cache itself. These 
individual cache queries are dynamically generated and could be arbitrarily 
complex, but are all essentially relational in nature. The service itself will 
periodically load data from an external data source, transform it and update 
the cache. There's no requirement currently for the cache to ever be persisted 
- it can be reloaded from the external source if necessary, but performance 
(especially read performance) is critical. The amount of data/indexes to cache 
potentially could be quite large (of the order of several gigabytes, let's 
say). I've already built an initial implementation that uses an in-memory
SQLite database via System.Data.SQLite (1.0.82.0). The service maintains a
"mode=memory&cache=shared" database, and each SQL query happens on its own
connection, in its own thread.

Some observations of our read performance (when the cache is fully populated
and there are no writers):

[Experiment 1: Memory-backed, single query]
For simple client requests that only result in a single (albeit complex) SQL 
query to the database, performance is excellent, and I'm very happy to get the 
maintenance benefits of using a flexible query language against the cache. 
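
In case it helps to see the arrangement concretely: a minimal sketch of the
shared-cache in-memory setup, written with Python's sqlite3 module rather than
System.Data.SQLite (both speak the same SQLite URI syntax; the database name
and table below are invented for illustration):

```python
import sqlite3

# A named in-memory database with a shared cache: every connection that opens
# this URI sees the same data. (The name "servicecache" and the table are
# made up for illustration; the real service uses its own schema.)
uri = "file:servicecache?mode=memory&cache=shared"

writer = sqlite3.connect(uri, uri=True)
writer.execute("CREATE TABLE cache_rows (id INTEGER PRIMARY KEY, payload TEXT)")
writer.execute("INSERT INTO cache_rows (payload) VALUES ('hello')")
writer.commit()

# A second, independent connection reads what the first one wrote, which is
# what lets each incoming query run on its own connection and thread.
reader = sqlite3.connect(uri, uri=True)
rows = reader.execute("SELECT payload FROM cache_rows").fetchall()
print(rows)  # [('hello',)]
```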

[Experiment 2: Memory-backed, concurrent queries]
For any client request that results in multiple simultaneous SQL queries to the 
database, those queries seem to be serialized rather than concurrent, and the 
whole request actually performs much worse than the old 
home-grown-but-horribly-limited caching/querying mechanism that was in place 
beforehand, and I'm sad.

[Experiment 3: File-backed, concurrent queries, with shared cache]
I switched to a file-backed database (but still "cache=shared"). It appears
that the queries are still being serialized, and the overall request is about
15% slower than Experiment 2.

[Experiment 4: File-backed, concurrent queries, without shared cache]
I switched to a file-backed database without a shared cache, and performance 
improved dramatically (about 70% faster than Experiment 3). It appears that the 
queries are now truly happening concurrently.

So it appears that, since:

1. when shared cache is enabled, all reads are serialized, and
2. there doesn't seem to be any way to have a memory-backed database that can
   be accessed by multiple connections without using a shared cache,

then I guess I MUST use a file-backed database to get concurrent reads, even
though I don't need the persistence and don't want to take the I/O hit.

Am I making any sense? Anything I'm missing?
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
