On Wed, Mar 5, 2008 at 8:21 AM, <[EMAIL PROTECTED]> wrote:
> My main concern is performance. Has anyone had any similar application
> experience they could comment on? I want to be able to insert data arriving
> on the SPI bus and then query the data to update a GUI at a very high rate
> (less than 250 ms). This is not real time, so 250 ms is desirable but does
> not have to be guaranteed.
Well, SQLite stores all of its data in BTrees. So suppose you have a
normalized schema like this:
CREATE TABLE Signals3 (
  channel_number INTEGER,
  sample_number INTEGER,
  sample_value INTEGER,
  PRIMARY KEY (channel_number, sample_number)
);
...or perhaps even something simpler like this (if you don't have to
worry about multiple channels):
CREATE TABLE Signals2 (
  sample_number INTEGER PRIMARY KEY,
  sample_value INTEGER
);
The problem with these normalized schemas is that they require a BTree
lookup for every sample you insert or retrieve. BTrees are great for
random access, but they are not optimized for special-purpose
applications such as data acquisition. Data acquisition systems have
the special property that you insert or retrieve several contiguous
values at a time. A better data structure, for example, could be a
hash table with a custom hash function designed to densely pack the
samples (since they are expected to be contiguous and therefore
dense).
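Just to illustrate what I mean (this is not anything SQLite provides;
the names and constants here are made up), a hash that keeps
contiguous samples densely packed might look something like:

/* Hypothetical sketch: consecutive sample numbers map to consecutive
 * slots, so a burst of contiguous samples lands in adjacent memory.
 * SLOTS and the per-channel offset are arbitrary illustration values. */
#define SLOTS 4096

static unsigned sample_slot(unsigned channel, unsigned sample_number)
{
    /* Offset each channel so channels don't immediately collide;
     * consecutive samples still occupy consecutive slots. */
    return (sample_number + channel * (SLOTS / 8u)) % SLOTS;
}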
I have done some performance testing with the above schema (Signals3)
and have been able to get somewhere around 30k INSERTs per second on
an Intel Core2 Duo, but only when:
1. I use an in-memory SQLite database
or
2. All the INSERTs are grouped into a single transaction for a
disk-based SQLite database
If you do not group the INSERTs into a single transaction, the
performance is much worse: something like 300 INSERTs per second.
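In case it helps, here is roughly what I mean by grouping the INSERTs
into one transaction, using the C API with a prepared statement
(error handling omitted; the function and argument names are just
placeholders):

#include <sqlite3.h>

void insert_batch(sqlite3 *db, int channel, int first_sample,
                  const int *values, int count)
{
    sqlite3_stmt *stmt;
    int i;

    sqlite3_exec(db, "BEGIN", 0, 0, 0);
    sqlite3_prepare_v2(db,
        "INSERT INTO Signals3 (channel_number, sample_number, sample_value)"
        " VALUES (?1, ?2, ?3)", -1, &stmt, 0);

    for (i = 0; i < count; i++) {
        sqlite3_bind_int(stmt, 1, channel);
        sqlite3_bind_int(stmt, 2, first_sample + i);
        sqlite3_bind_int(stmt, 3, values[i]);
        sqlite3_step(stmt);   /* perform the INSERT */
        sqlite3_reset(stmt);  /* reuse the prepared statement */
    }

    sqlite3_finalize(stmt);
    sqlite3_exec(db, "COMMIT", 0, 0, 0);  /* one sync for the whole batch */
}

Preparing the statement once also avoids re-parsing the SQL text for
every row.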
So, while excellent for random access applications, the BTree in
SQLite doesn't work very well for high-speed data acquisition
applications.
Here is a workaround. Denormalize the data by storing several samples
into a BLOB:
CREATE TABLE DenormalizedSamples (
  channel_number INTEGER,
  beginning_sample_number INTEGER,
  blob_of_1024_samples BLOB,
  PRIMARY KEY (channel_number, beginning_sample_number)
);
This reduces the BTree accesses to one access per 1024 samples. I
don't have any performance results for this technique that I can
share yet. I also don't like this technique because it is a real pain
to pack and unpack the BLOBs.
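To show what the packing involves, here is a sketch using the C API,
assuming 32-bit samples and native byte order (error handling
omitted; this is just an illustration):

#include <sqlite3.h>
#include <stdint.h>

void insert_chunk(sqlite3 *db, int channel, int first_sample,
                  const int32_t samples[1024])
{
    sqlite3_stmt *stmt;

    sqlite3_prepare_v2(db,
        "INSERT INTO DenormalizedSamples"
        " (channel_number, beginning_sample_number, blob_of_1024_samples)"
        " VALUES (?1, ?2, ?3)", -1, &stmt, 0);

    sqlite3_bind_int(stmt, 1, channel);
    sqlite3_bind_int(stmt, 2, first_sample);
    /* Store the 1024 samples as one 4096-byte BLOB. */
    sqlite3_bind_blob(stmt, 3, samples, 1024 * (int)sizeof(int32_t),
                      SQLITE_TRANSIENT);
    sqlite3_step(stmt);
    sqlite3_finalize(stmt);
}

Reading it back is the reverse: sqlite3_column_blob() returns a
pointer you can copy straight into an int32_t array.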
Incidentally, I have been considering the idea of patching SQLite for
higher performance in data acquisition systems. I guess I should make
a post and see if there is any interest in that, but I wouldn't be
able to get started for about a month, and I don't know how much time
it would require. Furthermore, there are other problems, such as
marshalling data to and from SQL strings; it could perhaps be better
to have a lower-level interface to the database that skips the
parsing step.
-David