I apologize in advance for the length of the following…

I have recently been handed a customer-reported problem against a legacy 
embedded device (an industrial process monitor) that makes use of SQLite over 
flash for storing the monitored info (it uses this internally to respond to 
remote info requests – there is no “user” access to the database).  The device 
runs dedicated real-time software with SQLite embedded.  To make life fun, no 
one left here knows the device’s software, there doesn’t appear to be any 
design documentation, and I am not a database person and know almost nothing 
about SQLite (besides what I’ve learned in the last week).

The problem is basically that as part of a test, the customer wants to power 
fail the device, and then to be able to recover (after the device restarts) 
data that was generated/stored as close to the power failure as possible.  
(There are some valid reasons for this test.)  For the purposes of this test, 
the device can’t have any battery backup, and it doesn’t get any early warning 
of the failure – it just dies.  Unfortunately, when doing this, it appears that 
significant amounts of data that should be in the database are missing and/or 
corrupted, and this doesn’t appear particularly dependent on when the failure 
occurs.  The overall database isn’t corrupted, however.  For control tests – 
with no power failure – running for the same elapsed time, 8 or 10 “batches” of 
data captures are recorded, while for the tests with the power failures there 
are only 2 batches and the (apparent) second one is odd or worse (most of the 
fields are 0 or invalid, and the timestamp column entries (while reasonable)are 
in the past from when the test occurred.

Areas of possible relevance I’ve noted so far include:

Using a very old version of SQLite (3.7.4, from 2010).  The software was 
apparently being updated/supported by the original developer until a year or 2 
ago and I have no idea why it was never updated to a newer version (that is, 
whether it was just inertia or if there was an active reason not to update).
The update batches are wrapped in transaction boundaries, though I don’t know 
if this is done (or configured for) correctly.
SQLite is configured to use “temp files always in memory”, which is suspicious 
for a power fail problem, but I don’t know what the temp files are used for.
The underlying real-time OS appears very simple, and I don’t believe it does 
any buffering or other special IO handling itself (I think it just acts as a 
router to the appropriate driver).  However, I haven’t yet determined if the 
flash drivers, etc. that were written for this device might do so.
While the required database update rate seems low (even for the quite modest 
hardware involved), there are several references to things (related to the 
database) that are done for “performance” reasons, and while I haven’t traced 
them down yet, this could be leaving things in memory longer than desirable.
So – some questions:

While there have been all kinds of bug fixes (and perhaps improvements in 
transaction resiliency and such) in SQLite since that version, is it likely 
that I am running into things that have been fixed relative to power fail 
resilience?
Regardless, I’d like to pick up the current version of SQLite – I am 
uncomfortable using such an old version.  Any idea of the level of risk I would 
run to make that big a leap at once?  Aside from any possible dependence on 
behavior that was “broken”, any incompatibilities (interfaces, intended 
behavior, file formats) that aren’t automatically/transparently handled would 
be a problem, as the updated software load would need to be able to be dropped 
into existing fielded systems with no other user intervention, and there aren’t 
resources (meaning me) to make more than minor code changes for that.  For what 
it’s worth, I believe the current database usage is pretty simple – nothing 
besides creates, inserts/updates, deletes, and simple selects.  The other 
concern might be if the code size increased substantially – the available code 
space is limited.
Might the memory only “temp files” be an issue?  What does (did) SQLite use 
those for?  If the transaction/rollback file or similar, it could obviously be 
an issue, but that doesn’t make sense to me.
Are there any specific configuration settings or usage that I should look at 
(again, taking into account this is a very old version)?
Anything else that comes to mind?  (Obviously, doing the update first if it is 
perceived as pretty safe would at least make any of the remaining issues 
relative to things as they exist now.  But as noted, there are no resources to 
deal with anything beyond minor - and easy to find/fix - incompatibilities.) 
Thanks much for any feedback!

Ted

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to