Also...does increasing cache_size help?
Are you able to use a RAM disk?
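For example, something like this (an untested sketch; "R:" is a
hypothetical RAM disk drive letter, and cache_size is per-connection, so
each of your per-query connections would need it):

    using System.Data.SQLite;

    // Point the database file at a RAM disk and raise the page cache.
    using (var conn = new SQLiteConnection(@"Data Source=R:\cache.db;Version=3;"))
    {
        conn.Open();
        // A negative cache_size is in KiB, so this asks for roughly 512 MB.
        using (var cmd = new SQLiteCommand("PRAGMA cache_size=-524288;", conn))
            cmd.ExecuteNonQuery();
        // ... run the read-only queries on this connection ...
    }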

-----Original Message-----
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Wayne Bradney
Sent: Saturday, January 12, 2013 11:39 AM
To: sqlite-users@sqlite.org
Subject: [sqlite] Concurrent read performance

I have a requirement for which I'm proposing SQLite as a solution, and would
appreciate some feedback to clarify my thinking. The application is a shared
service (.NET/WCF) that caches relational data in-memory and serves it up
when (read-only) requests come in (via operation contracts) from multiple
clients. Such client requests are high-level, not relational, and could
result in several (potentially many) individual queries to the cache itself.
These individual cache queries are dynamically generated and can be
arbitrarily complex, but are all essentially relational in nature. The
service itself will periodically load data from an external data source,
transform it, and update the cache. There's currently no requirement for the
cache to ever be persisted - it can be reloaded from the external source if
necessary - but performance (especially read performance) is critical. The
amount of data/indexes to cache could potentially be quite large (on the
order of several gigabytes, let's say).

I've already worked up an initial implementation that uses an in-memory
SQLite database via System.Data.SQLite (1.0.82.0). The service maintains a
"mode=memory&cache=shared" database, and each SQL query runs on its own
connection, in its own thread.
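Roughly, each request thread does something like this (a simplified sketch
rather than the real code; the FullUri keyword, the database name, and the
query are stand-ins):

    using System.Data.SQLite;

    // Every query gets its own connection to the same named in-memory
    // database, attached via shared cache.
    const string ConnString =
        "FullUri=file:servicecache?mode=memory&cache=shared;Version=3;";

    using (var conn = new SQLiteConnection(ConnString))
    {
        conn.Open();
        using (var cmd = new SQLiteCommand("SELECT * FROM some_table;", conn))
        using (var reader = cmd.ExecuteReader())
        {
            while (reader.Read()) { /* materialize the row */ }
        }
    }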
Some observations of our read performance (when the cache is fully populated
and there are no writers):

[Experiment 1: Memory-backed, single query]
For simple client requests that only result in a single (albeit complex) SQL
query to the database, performance is excellent, and I'm very happy to get
the maintenance benefits of using a flexible query language against the
cache.
[Experiment 2: Memory-backed, concurrent queries]
For any client request that results in multiple simultaneous SQL queries to
the database, those queries seem to be serialized rather than concurrent, and
the whole request actually performs much worse than the old
home-grown-but-horribly-limited caching/querying mechanism that was in place
beforehand, and I'm sad.
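(For what it's worth, this is roughly how I've been measuring it - a minimal
harness with made-up names; if reads serialize, wall-clock time grows more
or less linearly with the thread count:)

    using System;
    using System.Data.SQLite;
    using System.Diagnostics;
    using System.Threading.Tasks;

    static class ReadBenchmark
    {
        // Run the same query from n threads, each on its own connection,
        // and return the total wall-clock time.
        public static TimeSpan TimeConcurrentReads(string connString, string sql, int n)
        {
            var sw = Stopwatch.StartNew();
            Parallel.For(0, n, _ =>
            {
                using (var conn = new SQLiteConnection(connString))
                {
                    conn.Open();
                    using (var cmd = new SQLiteCommand(sql, conn))
                    using (var reader = cmd.ExecuteReader())
                        while (reader.Read()) { } // drain the results
                }
            });
            sw.Stop();
            return sw.Elapsed;
        }
    }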
[Experiment 3: File-backed, concurrent queries, with shared cache]
I switched to a file-backed database (but still "cache=shared"), and the
queries still appear to be serialized; overall this is about 15% slower than
Experiment 2.
[Experiment 4: File-backed, concurrent queries, without shared cache]
I switched to a file-backed database without a shared cache, and performance
improved dramatically (about 70% faster than Experiment 3). The queries now
appear to be truly happening concurrently.

So, since:
1. when shared cache is enabled, all reads are serialized, and
2. there doesn't seem to be any way to have a memory-backed database that
   can be accessed by multiple connections without using a shared cache,
it appears that I MUST use a file-backed database to get concurrent reads,
even though I don't need the persistence and don't want to take the I/O hit.
Am I making any sense? Anything I'm missing?
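(Concretely, the only difference between Experiments 3 and 4 is the
connection string - the paths here are placeholders:)

    // Experiment 3: file-backed, but still going through the shared cache.
    const string SharedCache =
        "FullUri=file:///C:/cache/cache.db?cache=shared;Version=3;";

    // Experiment 4: file-backed with a private page cache per connection;
    // reads proceed concurrently, at the cost of one cache per connection.
    const string PrivateCache = "Data Source=C:\\cache\\cache.db;Version=3;";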

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
