We are using SQLite for data from satellite Earth observations, and it
works very well. Thanks to everybody contributing to SQLite, above all
Dr. Hipp.

The largest database file has now grown to about 180 GB. I need to keep
copies of the files in at least two different places. The databases are
updated regularly as new data from the satellites become available.

Keeping the copies of the files in sync becomes increasingly tedious as
their sizes grow. Ideal would be some kind of incremental backup/sync
facility.

I looked at https://www.sqlite.org/backup.html, but since it performs a
complete copy, it takes a long time for such large databases (the files
are mounted over networks), and other work gets blocked.
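
For reference, this is roughly how I use it (a minimal sketch of the
online backup API from the page above; the file names and the 1000-page
step size are placeholders, and error handling is abbreviated):

/* Copy the whole database zSrc into zDst with the online backup API. */
#include <sqlite3.h>

int backup_db(const char *zSrc, const char *zDst){
  sqlite3 *pSrc = 0, *pDst = 0;
  sqlite3_backup *pBk;
  int rc;

  if( sqlite3_open(zSrc, &pSrc)!=SQLITE_OK ) return 1;
  if( sqlite3_open(zDst, &pDst)!=SQLITE_OK ){ sqlite3_close(pSrc); return 1; }

  pBk = sqlite3_backup_init(pDst, "main", pSrc, "main");
  if( pBk ){
    /* Copy 1000 pages at a time. Other connections may read or write
    ** the source between steps, but a write by another connection
    ** restarts the copy from the beginning, which is why the whole
    ** 180 GB ends up being transferred every time. */
    do{
      rc = sqlite3_backup_step(pBk, 1000);
      if( rc==SQLITE_BUSY || rc==SQLITE_LOCKED ) sqlite3_sleep(250);
    }while( rc==SQLITE_OK || rc==SQLITE_BUSY || rc==SQLITE_LOCKED );
    sqlite3_backup_finish(pBk);
  }
  rc = sqlite3_errcode(pDst);
  sqlite3_close(pSrc);
  sqlite3_close(pDst);
  return rc!=SQLITE_OK;
}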

An alternative might be https://www.sqlite.org/sqldiff.html, but it is
also slow for large databases. Most of the data are binary, and
round-tripping them through the SQL text that sqldiff emits and sqlite3
re-parses is probably not very efficient.

A solution might be to add an extension to the standard VFS
(http://www.sqlite.org/vfs.html). Each page would need a counter that
increments whenever the page is modified (and there seems to be spare
room for such counters). The backup could then test which pages need to
be updated. Is there any chance that such an extension could be added?
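
In the meantime, something in this spirit might already be possible
from the application side with a VFS shim, without touching the file
format: wrap the default VFS and record in a bitmap which pages are
written, then let the sync tool copy only those pages. Below is an
untested sketch with simplifying assumptions: page-aligned, page-sized
writes to the main database file, at most MAX_PAGES pages, no
persistence of the bitmap across runs, and no v2 shm methods (so plain
rollback-journal mode only, not WAL):

/* dirtyvfs.c -- hypothetical "dirty page" tracking VFS shim. */
#include <sqlite3.h>
#include <string.h>

#define MAX_PAGES (1<<22)                  /* track up to 4M pages */
static unsigned char aDirty[MAX_PAGES/8];  /* one bit per page */

typedef struct DirtyFile {
  sqlite3_file base;        /* base class; must come first */
  sqlite3_file *pReal;      /* the real underlying file */
  int isMainDb;             /* true for the main database file */
} DirtyFile;

static sqlite3_vfs *pOrig;    /* the wrapped (default) VFS */
static sqlite3_vfs dirtyVfs;  /* our shim */

#define REAL(f) (((DirtyFile*)(f))->pReal)

/* All methods delegate to the real file; xWrite also flags the page. */
static int dClose(sqlite3_file *f){ return REAL(f)->pMethods->xClose(REAL(f)); }
static int dRead(sqlite3_file *f, void *z, int n, sqlite3_int64 o){ return REAL(f)->pMethods->xRead(REAL(f),z,n,o); }
static int dWrite(sqlite3_file *f, const void *z, int n, sqlite3_int64 o){
  DirtyFile *p = (DirtyFile*)f;
  if( p->isMainDb && n>0 && o%n==0 ){  /* page-sized, page-aligned */
    sqlite3_int64 pg = o/n;
    if( pg<MAX_PAGES ) aDirty[pg/8] |= 1<<(pg%8);
  }
  return REAL(f)->pMethods->xWrite(REAL(f),z,n,o);
}
static int dTruncate(sqlite3_file *f, sqlite3_int64 n){ return REAL(f)->pMethods->xTruncate(REAL(f),n); }
static int dSync(sqlite3_file *f, int g){ return REAL(f)->pMethods->xSync(REAL(f),g); }
static int dFileSize(sqlite3_file *f, sqlite3_int64 *pn){ return REAL(f)->pMethods->xFileSize(REAL(f),pn); }
static int dLock(sqlite3_file *f, int e){ return REAL(f)->pMethods->xLock(REAL(f),e); }
static int dUnlock(sqlite3_file *f, int e){ return REAL(f)->pMethods->xUnlock(REAL(f),e); }
static int dCheckRes(sqlite3_file *f, int *p){ return REAL(f)->pMethods->xCheckReservedLock(REAL(f),p); }
static int dFileCtrl(sqlite3_file *f, int op, void *a){ return REAL(f)->pMethods->xFileControl(REAL(f),op,a); }
static int dSectorSize(sqlite3_file *f){ return REAL(f)->pMethods->xSectorSize(REAL(f)); }
static int dDevChar(sqlite3_file *f){ return REAL(f)->pMethods->xDeviceCharacteristics(REAL(f)); }

static const sqlite3_io_methods dirtyIo = {
  1, dClose, dRead, dWrite, dTruncate, dSync, dFileSize,
  dLock, dUnlock, dCheckRes, dFileCtrl, dSectorSize, dDevChar
};

static int dOpen(sqlite3_vfs *v, const char *zName, sqlite3_file *f,
                 int flags, int *pOutFlags){
  DirtyFile *p = (DirtyFile*)f;
  int rc;
  p->pReal = (sqlite3_file*)&p[1];   /* real file stored just after us */
  p->isMainDb = (flags & SQLITE_OPEN_MAIN_DB)!=0;
  rc = pOrig->xOpen(pOrig, zName, p->pReal, flags, pOutFlags);
  f->pMethods = rc==SQLITE_OK ? &dirtyIo : 0;
  return rc;
}

int dirtyvfs_register(void){
  pOrig = sqlite3_vfs_find(0);       /* default VFS */
  dirtyVfs = *pOrig;                 /* inherit everything else */
  dirtyVfs.zName = "dirtyvfs";
  dirtyVfs.szOsFile = (int)sizeof(DirtyFile) + pOrig->szOsFile;
  dirtyVfs.xOpen = dOpen;
  return sqlite3_vfs_register(&dirtyVfs, 0 /* not the default */);
}

Opening the database with sqlite3_open_v2(zName, &db, flags, "dirtyvfs")
would then route all I/O through the shim.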

A somewhat hackish solution might be to update the primary database
file in WAL mode, copy only the WAL file to the secondary place, and
force a WAL checkpoint there. Would this work?
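
As far as I understand, it can work, but only under strict conditions:
the replica must be byte-for-byte identical to the primary as of the
moment the WAL was started (the WAL frames carry salts and chained
checksums tied to that state), every WAL must be shipped and applied in
order, and the primary must not checkpoint between copies; otherwise
the replica would silently ignore the copied WAL. On the receiving side
it might look roughly like this (untested; the file name is a
placeholder, and "replica.db-wal" must already be in place next to the
database):

#include <sqlite3.h>
#include <stdio.h>

int apply_wal(const char *zReplica){
  sqlite3 *db = 0;
  int rc, nLog = 0, nCkpt = 0;

  rc = sqlite3_open(zReplica, &db);  /* opening recovers the WAL */
  if( rc==SQLITE_OK ){
    /* TRUNCATE: transfer all frames into the db, then reset the WAL */
    rc = sqlite3_wal_checkpoint_v2(db, "main",
             SQLITE_CHECKPOINT_TRUNCATE, &nLog, &nCkpt);
    printf("wal frames=%d checkpointed=%d rc=%d\n", nLog, nCkpt, rc);
  }
  sqlite3_close(db);
  return rc;
}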

/Stephan
