Re: [sqlite] Online/Hot backup of WAL journalling mode database
Simon Slavin wrote: > On 12 Dec 2014, at 10:27am, Clemens Ladisch wrote: >> If you write your own backup tool that simply calls >> "sqlite3_backup_step(b, -1)", the entire database is copied in >> a single atomic transaction. > > OP's problem is that he runs several processes which are constantly > (every few seconds) writing to the database he needs to copy. So any > operation which locks the database for a long period would mean that > some data was not captured. This is not a problem because he uses WAL mode. Regards, Clemens ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Online/Hot backup of WAL journalling mode database
On 12 Dec 2014, at 10:27am, Clemens Ladisch wrote: > If you write your own backup tool that simply calls > "sqlite3_backup_step(b, -1)", the entire database is copied in > a single atomic transaction. OP's problem is that he runs several processes which are constantly (every few seconds) writing to the database he needs to copy. So any operation which locks the database for a long period would mean that some data was not captured. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Online/Hot backup of WAL journalling mode database
Nick wrote: > On 11 Dec 2014, at 20:39, David King wrote: >> Why are you trying to hard to avoid using the backup API? It sounds >> like it does exactly what you want > > Backup API works great if you have periods of no writing. > However, if a process writes during the backup then the API would stop > and start over again. So if you have frequent writes then > theoretically the backup API would not complete. The ".backup" command of the sqlite3 command-line shell uses a step size of 100 pages, i.e., gives other processes opportunities to write regularly. If you write your own backup tool that simply calls "sqlite3_backup_step(b, -1)", the entire database is copied in a single atomic transaction. Regards, Clemens ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Online/Hot backup of WAL journalling mode database
On 12/12/2014 03:31 AM, Nick wrote: On 11 Dec 2014, at 10:08, Dan Kennedy wrote: On 12/11/2014 05:49 AM, Nick wrote: On 10 Dec 2014, at 07:35, Dan Kennedy wrote: Strictly speaking the database file may not be well-formed even if there is no ongoing checkpoint. If: a) process A opens a read transaction, b) process B opens and commits a write transaction to the database, c) process C checkpoints the db, then the db file considered without the *-wal file may be corrupt. The problem comes about because process C can only checkpoint frames up until the start of B's transaction. And there is an optimization that will prevent it from copying any earlier frames for which there exists a frame in B's transaction that corresponds to the same database page. So it effectively copis only a subset of the modifications made by earlier transactions into the db file - not necessarily creating a valid db file. Can this corruption be detected by running PRAGMA quick_check / integrity_check? Having the occasional backup db corrupted would be tolerable. In many cases, but not generally. There would exist cases where a part of a committed transaction was lost, or the values in unindexed columns where replaced, that sort of thing. Ok. Presumably a SQLITE_CHECKPOINT_FULL or SQLITE_CHECKPOINT_RESTART checkpoint mode would ensure the db file is valid? That sounds right. A successful FULL or RESTART checkpoint will always copy entire transactions into the db. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Online/Hot backup of WAL journalling mode database
On 11 Dec 2014, at 20:39, David King wrote: > Why are you trying to hard to avoid using the backup API? It sounds like it > does exactly what you want Backup API works great if you have periods of no writing. However, if a process writes during the backup then the API would stop and start over again. So if you have frequent writes then theoretically the backup API would not complete. In an ideal world the backup API would only copy pages altered during the write rather than start over. Regards Nick ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Online/Hot backup of WAL journalling mode database
Why are you trying to hard to avoid using the backup API? It sounds like it does exactly what you want On 11 Dec 2014, at 12:36, Nick wrote: > > On 11 Dec 2014, at 10:43, Simon Slavin wrote: > >> >> I don't know enough about the internals of SQLite to be sure, but various >> parts of me are concerned that this is a bad idea. I don't know what WAL >> mode would be like without checkpointing but there has to be a reason for >> checkpointing and disabling it between backups sounds bad. >> > > I would say the docs (https://www.sqlite.org/wal.html#how_to_checkpoint) do > not imply application initiated checkpoints is a bad idea. > > Regards > Nick > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > signature.asc Description: Message signed with OpenPGP using GPGMail ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Online/Hot backup of WAL journalling mode database
On 11 Dec 2014, at 10:43, Simon Slavin wrote: > > I don't know enough about the internals of SQLite to be sure, but various > parts of me are concerned that this is a bad idea. I don't know what WAL > mode would be like without checkpointing but there has to be a reason for > checkpointing and disabling it between backups sounds bad. > I would say the docs (https://www.sqlite.org/wal.html#how_to_checkpoint) do not imply application initiated checkpoints is a bad idea. Regards Nick ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Online/Hot backup of WAL journalling mode database
On 11 Dec 2014, at 10:08, Dan Kennedy wrote: > On 12/11/2014 05:49 AM, Nick wrote: >> On 10 Dec 2014, at 07:35, Dan Kennedy wrote: >> >>> Strictly speaking the database file may not be well-formed even if there is >>> no ongoing checkpoint. If: >>> >>> a) process A opens a read transaction, >>> b) process B opens and commits a write transaction to the database, >>> c) process C checkpoints the db, >>> >>> then the db file considered without the *-wal file may be corrupt. The >>> problem comes about because process C can only checkpoint frames up until >>> the start of B's transaction. And there is an optimization that will >>> prevent it from copying any earlier frames for which there exists a frame >>> in B's transaction that corresponds to the same database page. So it >>> effectively copis only a subset of the modifications made by earlier >>> transactions into the db file - not necessarily creating a valid db file. >> Can this corruption be detected by running PRAGMA quick_check / >> integrity_check? Having the occasional backup db corrupted would be >> tolerable. > > In many cases, but not generally. There would exist cases where a part of a > committed transaction was lost, or the values in unindexed columns where > replaced, that sort of thing. Ok. Presumably a SQLITE_CHECKPOINT_FULL or SQLITE_CHECKPOINT_RESTART checkpoint mode would ensure the db file is valid? Regards Nick ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Online/Hot backup of WAL journalling mode database
On 10 Dec 2014, at 10:40pm, Nick wrote: > All the processes would have automatic checkpointing disabled. Just the > backup process would perform the checkpoint. I don't know enough about the internals of SQLite to be sure, but various parts of me are concerned that this is a bad idea. I don't know what WAL mode would be like without checkpointing but there has to be a reason for checkpointing and disabling it between backups sounds bad. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Online/Hot backup of WAL journalling mode database
On 12/11/2014 05:49 AM, Nick wrote: On 10 Dec 2014, at 07:35, Dan Kennedy wrote: Strictly speaking the database file may not be well-formed even if there is no ongoing checkpoint. If: a) process A opens a read transaction, b) process B opens and commits a write transaction to the database, c) process C checkpoints the db, then the db file considered without the *-wal file may be corrupt. The problem comes about because process C can only checkpoint frames up until the start of B's transaction. And there is an optimization that will prevent it from copying any earlier frames for which there exists a frame in B's transaction that corresponds to the same database page. So it effectively copis only a subset of the modifications made by earlier transactions into the db file - not necessarily creating a valid db file. Can this corruption be detected by running PRAGMA quick_check / integrity_check? Having the occasional backup db corrupted would be tolerable. In many cases, but not generally. There would exist cases where a part of a committed transaction was lost, or the values in unindexed columns where replaced, that sort of thing. Regards Nick ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Online/Hot backup of WAL journalling mode database
On 10 Dec 2014, at 07:35, Dan Kennedy wrote: > Strictly speaking the database file may not be well-formed even if there is > no ongoing checkpoint. If: > > a) process A opens a read transaction, > b) process B opens and commits a write transaction to the database, > c) process C checkpoints the db, > > then the db file considered without the *-wal file may be corrupt. The > problem comes about because process C can only checkpoint frames up until the > start of B's transaction. And there is an optimization that will prevent it > from copying any earlier frames for which there exists a frame in B's > transaction that corresponds to the same database page. So it effectively > copis only a subset of the modifications made by earlier transactions into > the db file - not necessarily creating a valid db file. Can this corruption be detected by running PRAGMA quick_check / integrity_check? Having the occasional backup db corrupted would be tolerable. Regards Nick ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Online/Hot backup of WAL journalling mode database
On 10 Dec 2014, at 02:36, Simon Slavin wrote: > > On 10 Dec 2014, at 12:30am, Nick wrote: > >> That's interesting Simon I didn't expect the database not to be trustworthy. > > The database will be trustworthy at any instant. Your copy of it will be > corrupt because the file will be changing while you are copying it. > >> In WAL mode I thought the database file is only written to when >> checkpointing. Have I misunderstood this journaling mode? > > How do you intend to prevent your other processes from checkpointing while > you take the backup ? You can disable checkpointing for your own connection > to the database but not for the connections other processes have. All the processes would have automatic checkpointing disabled. Just the backup process would perform the checkpoint. >> Again I may have misunderstood the docs around the Backup API, does it not >> start again from the beginning copying pages if another process writes to >> the database during the process? In practice could it successfully backup a >> 2GB database that is being written to once a second? > > Not if the writing never stopped. But there's no way to take a copy of a > file which is constantly being rewritten. rsync can't do it either. How can > anything copy a file which is constantly being modified ? > > You can BEGIN EXCLUSIVE and then END once your backup is finished. That > should prevent other processes writing to the file. You will have to deal > with what happens if your BEGIN EXCLUSIVE times out, and you will have to put > long timeouts in your other processes so they can handle the file being > locked long enough for the entire copy to be taken. That's the only way I > can think of to do it. And yes, it will prevent writing to the database > while it's being copied. > > On the other hand, there's a different way to clone a database: log the > changes. > > When something issues an INSERT/DELETE/UPDATE command, execute the command > but also append a copy of that command to a text file somewhere. When you > want to bring your backup copy up-to-date, take a copy of the log file, then > execute all the commands in it to your out-of-date copy. > > You need a method of zeroing out the log file, or knowing where you got to on > your last backup. Thanks for the info, Simon. Regards Nick ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Online/Hot backup of WAL journalling mode database
On 12/10/2014 05:06 AM, Simon Slavin wrote: On 9 Dec 2014, at 8:57pm, Nick wrote: Environment is Linux with multiple (c. 4-6) processes accessing a single sqlite database named "test.db". Backup: - New process started using cronjob to initiate application checkpoint until completion. - rsync diff the file "test.db" to another drive/location (specifically ignoring the "-shm" and "-wal" file). - exit process Restore: - rsync the file "test.db" from another drive/location. Will not be trustworthy if the database is being written to during the rsync operations. Strictly speaking the database file may not be well-formed even if there is no ongoing checkpoint. If: a) process A opens a read transaction, b) process B opens and commits a write transaction to the database, c) process C checkpoints the db, then the db file considered without the *-wal file may be corrupt. The problem comes about because process C can only checkpoint frames up until the start of B's transaction. And there is an optimization that will prevent it from copying any earlier frames for which there exists a frame in B's transaction that corresponds to the same database page. So it effectively copis only a subset of the modifications made by earlier transactions into the db file - not necessarily creating a valid db file. Dan. A) Ensure all processes besides the backup process have the database closed while it is being copied. Establish some kind of semaphore so they can tell when it's safe to open the database again. B) Use the SQLite Backup API which was invented to do what you want. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Online/Hot backup of WAL journalling mode database
On 10 Dec 2014, at 12:30am, Nick wrote: > That's interesting Simon I didn't expect the database not to be trustworthy. The database will be trustworthy at any instant. Your copy of it will be corrupt because the file will be changing while you are copying it. > In WAL mode I thought the database file is only written to when > checkpointing. Have I misunderstood this journaling mode? How do you intend to prevent your other processes from checkpointing while you take the backup ? You can disable checkpointing for your own connection to the database but not for the connections other processes have. > Again I may have misunderstood the docs around the Backup API, does it not > start again from the beginning copying pages if another process writes to the > database during the process? In practice could it successfully backup a 2GB > database that is being written to once a second? Not if the writing never stopped. But there's no way to take a copy of a file which is constantly being rewritten. rsync can't do it either. How can anything copy a file which is constantly being modified ? You can BEGIN EXCLUSIVE and then END once your backup is finished. That should prevent other processes writing to the file. You will have to deal with what happens if your BEGIN EXCLUSIVE times out, and you will have to put long timeouts in your other processes so they can handle the file being locked long enough for the entire copy to be taken. That's the only way I can think of to do it. And yes, it will prevent writing to the database while it's being copied. On the other hand, there's a different way to clone a database: log the changes. When something issues an INSERT/DELETE/UPDATE command, execute the command but also append a copy of that command to a text file somewhere. When you want to bring your backup copy up-to-date, take a copy of the log file, then execute all the commands in it to your out-of-date copy. You need a method of zeroing out the log file, or knowing where you got to on your last backup. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Online/Hot backup of WAL journalling mode database
On 9 Dec 2014, at 22:06, Simon Slavin wrote: > > On 9 Dec 2014, at 8:57pm, Nick wrote: > >> Environment is Linux with multiple (c. 4-6) processes accessing a single >> sqlite database named "test.db". >> >> Backup: >> - New process started using cronjob to initiate application checkpoint until >> completion. >> - rsync diff the file "test.db" to another drive/location (specifically >> ignoring the "-shm" and "-wal" file). >> - exit process >> >> Restore: >> - rsync the file "test.db" from another drive/location. > > Will not be trustworthy if the database is being written to during the rsync > operations. Recommend either of the following: > > A) Ensure all processes besides the backup process have the database closed > while it is being copied. Establish some kind of semaphore so they can tell > when it's safe to open the database again. > > B) Use the SQLite Backup API which was invented to do what you want. > > Simon. That's interesting Simon I didn't expect the database not to be trustworthy. In WAL mode I thought the database file is only written to when checkpointing. Have I misunderstood this journaling mode? Again I may have misunderstood the docs around the Backup API, does it not start again from the beginning copying pages if another process writes to the database during the process? In practice could it successfully backup a 2GB database that is being written to once a second? Regards Nick ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Online/Hot backup of WAL journalling mode database
On 9 Dec 2014, at 8:57pm, Nick wrote: > Environment is Linux with multiple (c. 4-6) processes accessing a single > sqlite database named "test.db". > > Backup: > - New process started using cronjob to initiate application checkpoint until > completion. > - rsync diff the file "test.db" to another drive/location (specifically > ignoring the "-shm" and "-wal" file). > - exit process > > Restore: > - rsync the file "test.db" from another drive/location. Will not be trustworthy if the database is being written to during the rsync operations. Recommend either of the following: A) Ensure all processes besides the backup process have the database closed while it is being copied. Establish some kind of semaphore so they can tell when it's safe to open the database again. B) Use the SQLite Backup API which was invented to do what you want. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users