On Wed, Mar 30, 2011 at 8:46 AM, Mikael <[email protected]> wrote:

> Hi,
>
> (We're looking into using SQLite for some databases that need to be
> reachable all the time by the users, so they must be non-corrupt and
> working
> all the time. Non-corrupt both in the sense that opening doesn't fail
> without manual intervention by the administrator, and in the sense that any
> queries/inserts/updates don't fail by data integrity reasons - so to sum
> up,
> all of the database needs to be kept completely non-corrupt.
>     The consequences in terms of trust and other things toward users if
> their records would be unavailable, would be unacceptable.
>
> We have some different databases, both some that will be just 1-50MB, and
> some that could grow to gigabytes. We will use WAL mode. The server will
> never have a power failure, RAM will always be intact thanks to ECC, and
> the
> harddrives' data will never be corrupt as they are in a mirror raid
> configuration.
>     We presume that our FSYNC_FULL:s really write all data to disk. We see
> no reason in using another FS than EXT3, FS-level journalling wouldn't
> improve anything as power won't fail anyhow and it doesn't help in app/OS
> crashes.
>     We will make continuous rsync backups to external media, though if this
> is done during checkpoints, ultimately backups can't be trusted to have
> integrity.
>
> Now to the technical side: )
>
> When in WAL mode, except for during checkpoints, the main database file is
> not altered. Furthermore, the contents of the write-ahead log file has
> checksums in it, so if the app restarts after an app/OS crash, it will keep
> only valid parts of the write-ahead log file and throw invalid parts
> silently. (Right?)
>

Correct

>
> Based on this, the only time our database file could be corrupted would be
> during checkpoints. (Right?)
>

Correct

>
> Therefore I now wish to check with you, is SQLite implemented to somehow
> make checkpoints not be able to corrupt the main database file ever? (Say a
> checkpoint went halfway and then the OS ran out of memory, killed all
> processes and rebooted.)
>

Yes.  The WAL is not cleared or deleted until after the checkpoint
completes.  If there is an OS crash or power loss in the middle of a
checkpoint, then the checkpoint will automatically start over again the next
time the database file is opened.

This is where correct functioning of fsync() is critical.  We use fsync() as
a write barrier operation.  We require that all I/O operations that occur
before the fsync() (writes to the database file) must complete prior to any
I/O operations that occur after the fsync() (clearing the WAL file).  If
fsync() is working correctly, then everything is safe.  Unfortunately, on
many consumer-grade disk drives nowadays, fsync() does not work correctly
:-(  But you say that fsync is working correctly on your system, so it
shouldn't be a problem for you.



>
> If yes, we have the solution needed already now - app/OS crashes won't
> crash
> the database, and rsync backups will create working backups even if made
> during checkpoints.
>

An rsync risks making a corrupt copy of the database file if the rsync runs
concurrently with a checkpoint.  I think you would be better off using the
sqlite3_backup interface (http://www.sqlite.org/backup.html) to make a
consistent local copy of the database file, then rsync the local copy.  Oh,
wait - you mention that already in the next paragraph....


>
> If no, I suppose we need to put some kind of file versioning system under
> the database file so we can rewind the database file to right before the
> last unfinished checkpoint, or we need to use the online backup API to
> create intact backups that we then rsync to external media.
>     Though, there doesn't seem to exist any user/application-level
> versioning system ready for production, and the online backup API is not
> incremental but only for the whole database making backups expensive and
> thus we'd do them seldomly.
>     Possibly I could do all checkpoints from another OS process than the
> one where database reads/writes are done, zeroing the impact of app
> crashes on database integrity, and leaving OS crashes as the only possible
> reason to corrupt the database.
>

That's an idea - do checkpoints in the same separate process that is doing
the rsyncs.  That will guarantee that the rsync is never running at the same
time as a checkpoint, and thus guarantee that the rsync-ed database file
will be consistent.


>
> What do you say is the best way to do this?
>
> Please give your point of view on how to do this the best way - what
> strategies/utilities/methods/SQLite parameters and calls are needed to do
> this. While I didn't find a clear answer to this in the list archives, I'm
> sure this is a key question for many SQLite users.
>
> Many thanks,
> Mikael
> _______________________________________________
> sqlite-users mailing list
> [email protected]
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
[email protected]
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to