On 5/6/2016 5:32 AM, Stephan Buchert wrote:
> We are using Sqlite for data from satellite Earth observations. It
> works very well. Thanks to everybody contributing to Sqlite, uppermost
> Dr. Hipp.
>
> The largest database file has now grown to about 180 GB.

One feature of SQLite -- the whole database in one file -- is normally 
an advantage but becomes less so when the file is huge.


>   I need to have
> copies of the files at at least two different places. The databases are
> updated regularly as new data from the satellites become available.

Others have suggested keeping track of changes using a logging file. 
That allows all the SELECT statements to remain unchanged, while 
complicating the input side of things.
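
To make that concrete, here is a minimal sketch of the logging idea using triggers, so the log is maintained automatically and the insert code stays unchanged. The table and column names (obs, changelog, payload) are hypothetical stand-ins for whatever the real schema uses:

```python
import sqlite3

# Hypothetical schema: a data table plus a changelog table that
# triggers keep up to date. The changelog rows can be exported,
# shipped to the remote site, and replayed there.
SETUP = """
CREATE TABLE IF NOT EXISTS obs(id INTEGER PRIMARY KEY, payload BLOB);
CREATE TABLE IF NOT EXISTS changelog(
    seq     INTEGER PRIMARY KEY AUTOINCREMENT,
    op      TEXT,       -- 'I' for insert, 'D' for delete
    id      INTEGER,
    payload BLOB);
CREATE TRIGGER IF NOT EXISTS obs_ins AFTER INSERT ON obs BEGIN
    INSERT INTO changelog(op, id, payload)
        VALUES ('I', new.id, new.payload);
END;
CREATE TRIGGER IF NOT EXISTS obs_del AFTER DELETE ON obs BEGIN
    INSERT INTO changelog(op, id, payload)
        VALUES ('D', old.id, NULL);
END;
"""

def replay(con, entries):
    """Apply shipped changelog entries (op, id, payload) to a replica."""
    with con:  # one transaction for the whole batch
        for op, rowid, payload in entries:
            if op == 'I':
                con.execute("INSERT OR REPLACE INTO obs VALUES (?, ?)",
                            (rowid, payload))
            elif op == 'D':
                con.execute("DELETE FROM obs WHERE id = ?", (rowid,))
```

Only the changelog rows since the last sync need to travel over the network, which is the whole point for a 180 GB file.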

Another approach could be to have new observations go into a separate 
file (one day's worth, or whatever makes sense). This much smaller file 
could be shipped to the other site(s) and then merged into each copy of 
the main database. Almost no changes to the input logic, but every 
SELECT would have to use a JOIN. No idea how painful a process that 
would be.
>
> Having the copies of the file synced becomes increasingly tedious
> as their sizes increase. Ideal would be some kind of
> incremental backup/sync facility.
>
> I looked at https://www.sqlite.org/backup.html, but, as it does
> a complete backup, this takes a long time for such large
> databases (files are mounted over networks), and other work gets
> blocked.
>
> An alternative is perhaps https://www.sqlite.org/sqldiff.html, but it
> is also slow for large databases. Most of the data are binary, and the
> SQL input/output of sqldiff/sqlite3 is probably not very efficient.
>
> A solution might be to add an extension to the standard VFS
> http://www.sqlite.org/vfs.html. Each page would need a counter which
> increments when the page is modified (and there seems to
> be spare space for such counters). Then the backup could test which
> pages need to be updated. Is there any chance that such an extension
> could be added?
>
> A kind of hack-ish solution might be to update the primary database
> files in WAL mode, copy only the WAL file to the secondary place,
> and force a WAL checkpoint there. Would this work?
>
> /Stephan
>
Gerry
