Re: [sqlite] Fastest way to backup a "live" database
On 5 Oct 2016, at 1:07pm, Eric Grange wrote:

> I have added SQLITE_OPEN_EXCLUSIVE to the flags of the backup db.

That's a very good idea. Might add that to some tips I use elsewhere. Thanks.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
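[Editor's note: the create-exclusive guard discussed above can be approximated outside of C. The sketch below uses Python's stdlib sqlite3 module, whose Connection.backup() wraps the same backup API; the function name and paths are illustrative, and the os.path.exists check is only a rough, non-atomic stand-in for the real SQLITE_OPEN_EXCLUSIVE flag.]

```python
import os
import sqlite3

def backup_exclusive(src_path, dst_path):
    """Back up src_path to dst_path, refusing to touch an existing file.

    A rough equivalent of opening the destination with SQLITE_OPEN_EXCLUSIVE:
    the backup aborts if dst_path already exists, so a stale or concurrently
    created file is never silently overwritten by "something else".
    """
    if os.path.exists(dst_path):
        raise FileExistsError(dst_path)
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        # pages=-1 (the default) copies the whole database in one pass.
        src.backup(dst)
    finally:
        dst.close()
        src.close()
```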
Re: [sqlite] Fastest way to backup a "live" database
> But yes, if you have an application which uses only the backup API it
> should be safe.

I have added SQLITE_OPEN_EXCLUSIVE to the flags of the backup db. That should cover accidental cases where the backup db is opened before or during the backup by "something else". It should never happen in my case, but it is probably best to leave it in, in case of copy-pasta later.

> A backup simply copies all pages. It accesses each page once, so all
> cache lookups will fail.

Yes, but a 20% performance hit for a 2000-entry cache seemed a bit steep, given that a backup should be I/O bound. After looking with a profiler, the extra time seems to be spent in ntdll.dll, not in SQLite itself. The source cache_size has a small impact; the main impact comes from the destination cache_size.

Also of note, the backup operation can take about half of a CPU core (a fast E3 core at 3.5 GHz).

Eric

On Wed, Oct 5, 2016 at 12:50 PM, Clemens Ladisch wrote:
> Eric Grange wrote:
> > Ran some tests with variations of Clemens's backup example, and adding
> >
> > sqlite3_exec(src, "PRAGMA cache_size=1", NULL, NULL, NULL);
> > sqlite3_exec(dst, "PRAGMA cache_size=1", NULL, NULL, NULL);
> >
> > seems to provide the best performance [...]
> >
> > While the effect of synchronous=OFF on the destination backup db was
> > something I expected, the cache_size of 1 was not.
> >
> > Using a large cache_size had a detrimental effect, both on source and
> > destination; between cache_size 1 and the SQLite default (-2000) there
> > is a 20% difference in performance during backup.
>
> A backup simply copies all pages. It accesses each page once, so all
> cache lookups will fail.
>
> Regards,
> Clemens
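[Editor's note: the full combination Eric describes — tiny caches on both sides, synchronous=OFF on the destination only — looks like this when sketched with Python's stdlib sqlite3 module. This illustrates the configuration, not the benchmark; the function name is made up.]

```python
import sqlite3

def tuned_backup(src_path, dst_path):
    """One-pass backup with the settings discussed in the thread:
    cache_size=1 on both connections (the backup touches each page exactly
    once, so caching buys nothing) and synchronous=OFF on the destination."""
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        src.execute("PRAGMA cache_size=1")
        dst.execute("PRAGMA cache_size=1")
        # Acceptable here only because the destination is a brand-new backup
        # file nothing else is using; never leave synchronous=OFF on a live db.
        dst.execute("PRAGMA synchronous=OFF")
        src.backup(dst)  # pages=-1 by default: copy everything in one step
    finally:
        dst.close()
        src.close()
```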
Re: [sqlite] Fastest way to backup a "live" database
Eric Grange wrote:
> Ran some tests with variations of Clemens's backup example, and adding
>
> sqlite3_exec(src, "PRAGMA cache_size=1", NULL, NULL, NULL);
> sqlite3_exec(dst, "PRAGMA cache_size=1", NULL, NULL, NULL);
>
> seems to provide the best performance [...]
>
> While the effect of synchronous=OFF on the destination backup db was
> something I expected, the cache_size of 1 was not.
>
> Using a large cache_size had a detrimental effect, both on source and
> destination; between cache_size 1 and the SQLite default (-2000) there
> is a 20% difference in performance during backup.

A backup simply copies all pages. It accesses each page once, so all cache lookups will fail.

Regards,
Clemens
Re: [sqlite] Fastest way to backup a "live" database
On 5 Oct 2016, at 8:46am, Eric Grange wrote:

> sqlite3_exec(dst, "PRAGMA synchronous=OFF", NULL, NULL, NULL);

Don't do that for any operation other than taking the backup. It's a major cause of corruption in any situation where two different connections (on the same computer or on different computers) are accessing the same database. But yes, if you have an application which uses only the backup API, it should be safe.

Simon.
Re: [sqlite] Fastest way to backup a "live" database
Eric Grange wrote:
>> No, this is what makes the backup restart. With a step size of -1, it
>> would never restart.
>
> Hmm, given WAL mode, this brings another question: why doesn't the sqlite
> CLI do that instead of using 100 pages?

Because it's much older than WAL mode.

> Is using a backup_step N parameter only useful if you want to provide some
> progress indicator or pace the backup (i.e. pace I/O bandwidth)?

It's useful in rollback journal mode, where the backup reader would block any writers for a long time.

Regards,
Clemens
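[Editor's note: the distinction Clemens draws — step size matters for progress reporting and pacing, not for correctness under WAL — can be sketched with Python's stdlib sqlite3 module, whose Connection.backup() exposes the same N-page stepping as sqlite3_backup_step(). The pause parameter is an assumed knob for capping I/O bandwidth; names are illustrative.]

```python
import sqlite3
import time

def paced_backup(src_path, dst_path, pages=100, pause=0.0):
    """Back up in N-page steps, as the CLI's .backup does.

    The progress callback runs after every step: a natural place for a
    progress indicator, and sleeping there throttles I/O. In WAL mode the
    steps do not block writers; in rollback-journal mode, small steps keep
    each writer lockout short."""
    done = []

    def progress(status, remaining, total):
        done.append((remaining, total))  # record progress for the caller
        if pause:
            time.sleep(pause)            # crude I/O-bandwidth pacing

    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        src.backup(dst, pages=pages, progress=progress)
    finally:
        dst.close()
        src.close()
    return done
```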
Re: [sqlite] Fastest way to backup a "live" database
Ran some tests with variations of Clemens's backup example, and adding

sqlite3_exec(src, "PRAGMA cache_size=1", NULL, NULL, NULL);
sqlite3_exec(dst, "PRAGMA synchronous=OFF", NULL, NULL, NULL);
sqlite3_exec(dst, "PRAGMA cache_size=1", NULL, NULL, NULL);

seems to provide the best performance, which is about 4 to 5 times faster than using ".backup" in the CLI.

While the effect of synchronous=OFF on the destination backup db was something I expected, the cache_size of 1 was not.

Using a large cache_size had a detrimental effect, both on source and destination; between cache_size 1 and the SQLite default (-2000) there is a 20% difference in performance during backup.

Note that this is on Windows, on SSD and with lots of free RAM, so it seems that the OS is better at handling the cache than SQLite in that particular use case.

Eric

On Wed, Oct 5, 2016 at 8:42 AM, Eric Grange wrote:
> Thanks for the tool Clemens!
>
> > No, this is what makes the backup restart. With a step size of -1, it
> > would never restart.
>
> Hmm, given WAL mode, this brings another question: why doesn't the sqlite
> CLI do that instead of using 100 pages?
> Is using a backup_step N parameter only useful if you want to provide some
> progress indicator or pace the backup (i.e. pace I/O bandwidth)?
>
> Or in other words, given WAL mode, what do we "lose" when using -1 to
> copy everything at once during a backup?
>
> On Tue, Oct 4, 2016 at 5:33 PM, Clemens Ladisch wrote:
>> [...]
>>
>> No, this is what makes the backup restart. With a step size of -1,
>> it would never restart.
>>
>> Use a tool like the one below to do the backup in one step, without
>> restarts.
>>
>> [...]
Re: [sqlite] Fastest way to backup a "live" database
Thanks for the tool Clemens!

> No, this is what makes the backup restart. With a step size of -1, it
> would never restart.

Hmm, given WAL mode, this brings another question: why doesn't the sqlite CLI do that instead of using 100 pages?
Is using a backup_step N parameter only useful if you want to provide some progress indicator or pace the backup (i.e. pace I/O bandwidth)?

Or in other words, given WAL mode, what do we "lose" when using -1 to copy everything at once during a backup?

On Tue, Oct 4, 2016 at 5:33 PM, Clemens Ladisch wrote:
> Eric Grange wrote:
> > If all else fails, I could also suspend DB writes during backups
> > (suspending DB reads would be more problematic).
>
> With WAL, the backup reader does not block writers.
>
> >> Use the backup API, and copy everything in one step.
> >> (The restart-on-write feature should not be necessary with WAL.)
> >
> > That was what I thought initially, but I can only explain the multi-hour
> > backups with it: usually the backup API takes 4-5 minutes. It is just
> > once in a while that a very long backup occurs.
> >
> >> It calls sqlite3_backup_step() with a size of 100 pages.
> >
> > Ok, so I guess the huge cache is overkill with the default CLI!
>
> No, this is what makes the backup restart. With a step size of -1,
> it would never restart.
>
> Use a tool like the one below to do the backup in one step, without
> restarts.
>
> Regards,
> Clemens
> --
>
> [...]
Re: [sqlite] Fastest way to backup a "live" database
Eric Grange wrote:
> If all else fails, I could also suspend DB writes during backups
> (suspending DB reads would be more problematic).

With WAL, the backup reader does not block writers.

>> Use the backup API, and copy everything in one step.
>> (The restart-on-write feature should not be necessary with WAL.)
>
> That was what I thought initially, but I can only explain the multi-hour
> backups with it: usually the backup API takes 4-5 minutes. It is just once
> in a while that a very long backup occurs.
>
>> It calls sqlite3_backup_step() with a size of 100 pages.
>
> Ok, so I guess the huge cache is overkill with the default CLI!

No, this is what makes the backup restart. With a step size of -1, it would never restart.

Use a tool like the one below to do the backup in one step, without restarts.

Regards,
Clemens
--

#include <stdio.h>
#include <sqlite3.h>

int main(int argc, char *argv[])
{
        sqlite3 *src = NULL;
        sqlite3 *dst = NULL;
        sqlite3_backup *backup;
        int rc;
        int ok = 0;

        if (argc != 3) {
                fputs("I want two file names: source, destination\n", stderr);
                goto error;
        }

        rc = sqlite3_open_v2(argv[1], &src, SQLITE_OPEN_READONLY, NULL);
        if (rc != SQLITE_OK) {
                fprintf(stderr, "Cannot open %s: %s\n", argv[1],
                        sqlite3_errmsg(src));
                goto error;
        }

        rc = sqlite3_open_v2(argv[2], &dst, SQLITE_OPEN_READWRITE |
                             SQLITE_OPEN_CREATE, NULL);
        if (rc != SQLITE_OK) {
                fprintf(stderr, "Cannot open %s: %s\n", argv[2],
                        sqlite3_errmsg(dst));
                goto error;
        }

        sqlite3_exec(src, "PRAGMA busy_timeout=1", NULL, NULL, NULL);
        sqlite3_exec(dst, "PRAGMA busy_timeout=1", NULL, NULL, NULL);

        backup = sqlite3_backup_init(dst, "main", src, "main");
        if (backup == NULL) {
                fprintf(stderr, "Cannot initialize backup: %s\n",
                        sqlite3_errmsg(dst));
                goto error;
        }

        do {
                rc = sqlite3_backup_step(backup, -1);
        } while (rc == SQLITE_BUSY || rc == SQLITE_LOCKED);

        rc = sqlite3_backup_finish(backup);
        if (rc == SQLITE_OK) {
                ok = 1;
        } else {
                fprintf(stderr, "Backup failed: %s\n", sqlite3_errmsg(dst));
        }

error:
        sqlite3_close(dst);
        sqlite3_close(src);

        return ok ? 0 : 1;
}
Re: [sqlite] Fastest way to backup a "live" database
On 4 Oct 2016, at 3:58pm, Eric Grange wrote:

> If all else fails, I could also suspend DB writes during backups

If it's easy to do that, then I'd recommend it. Used in conjunction with the SQLite backup API, it should provide the simplest solution least likely to present a problem. It should not be necessary to suspend DB reads during backups.

Simon.
Re: [sqlite] Fastest way to backup a "live" database
> Have you looked at the ext/session extension ?

Yes, but it is a bit more complicated to integrate, and would impose a penalty during execution as far as I understand: there are quite a few intermediate states that would be stored by a changeset, but that do not really need to be preserved at the end of the day. Those intermediate states could be moved to a secondary DB, but would then lose foreign keys and other integrity constraints.

This is however an extension that I have been looking at, should the need for some form of "live" db replication occur.

> I'm sorry. I misunderstood your question and thought you were just
> duplicating the file using OS calls. The SQLite backup API takes care of
> all necessary locking and consistency problems for you. You should be fine.

Ok, thanks!

> The problem of backing up a changing database is one of the difficult
> problems in database management. Please don't expect an easy solution.

Right now I am tackling it with brute force: a backup usually takes about 4-5 minutes (to a secondary SSD; that offline copy is then uploaded to a remote server). It is mostly the odd occurrences when the backup takes several hours that are problematic.

If all else fails, I could also suspend DB writes during backups (suspending DB reads would be more problematic).

On Tue, Oct 4, 2016 at 4:05 PM, Simon Slavin wrote:
>
> On 4 Oct 2016, at 2:53pm, Eric Grange wrote:
>
> > I am going on the assumption that if something fails during backup, the
> > backup itself will be toast anyway, but is that safe otherwise?
>
> No. You have no locking. You might copy the beginning of the file before
> a transaction and the end of the file after it, meaning that pointers at
> one part of the file point to things which no longer exist.
>
> The problem of backing up a changing database is one of the difficult
> problems in database management. Please don't expect an easy solution.
>
> Simon.
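[Editor's note: Eric's fallback idea of suspending DB writes during the backup can be sketched without application-level coordination, by holding an open write transaction on a helper connection so other writers are turned away (SQLITE_BUSY) for the duration. This is an illustration in Python (stdlib sqlite3) with an assumed function name, not a recommendation; the thread's eventual conclusion is that with WAL and a one-step backup it should not be needed.]

```python
import sqlite3

def backup_with_writes_suspended(src_path, dst_path):
    """Back up src_path while holding the database's write lock.

    BEGIN IMMEDIATE on a helper connection takes the write lock without
    modifying anything; other writers get SQLITE_BUSY until we roll back,
    while readers (including the backup itself) proceed normally."""
    guard = sqlite3.connect(src_path, isolation_level=None)  # manual txns
    try:
        guard.execute("BEGIN IMMEDIATE")   # acquire write lock, touch nothing
        try:
            src = sqlite3.connect(src_path)
            dst = sqlite3.connect(dst_path)
            try:
                src.backup(dst)            # one-step copy under the lock
            finally:
                dst.close()
                src.close()
        finally:
            guard.execute("ROLLBACK")      # release the write lock
    finally:
        guard.close()
```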
Re: [sqlite] Fastest way to backup a "live" database
On 4 Oct 2016, at 3:05pm, Simon Slavin wrote:

> No. You have no locking. You might copy the beginning of the file before a
> transaction and the end of the file after it, meaning that pointers at one
> part of the file point to things which no longer exist.

I'm sorry. I misunderstood your question and thought you were just duplicating the file using OS calls. The SQLite backup API takes care of all necessary locking and consistency problems for you. You should be fine.

Simon.
Re: [sqlite] Fastest way to backup a "live" database
Eric Grange wrote:
> Given a fairly large database (dozens of gigabytes), which uses WAL, and is
> being accessed continuously (mostly read transactions, but regular write
> transactions as well), what are the fastest and least disruptive ways to
> back it up?

Use the backup API, and copy everything in one step. (The restart-on-write feature should not be necessary with WAL.)

> A basic ".backup" from the CLI can occasionally take hours, as it is
> thrown off whenever large write transactions occur.

It calls sqlite3_backup_step() with a size of 100 pages. I guess you have to implement your own backup tool (or modify the shell).

> I am going on the assumption that if something fails during backup, the
> backup itself will be toast anyway, but is that safe otherwise?

Yes; the backup API uses a transaction, like any other read access.

Regards,
Clemens
Re: [sqlite] Fastest way to backup a "live" database
On 4 Oct 2016, at 2:53pm, Eric Grange wrote:

> I am going on the assumption that if something fails during backup, the
> backup itself will be toast anyway, but is that safe otherwise?

No. You have no locking. You might copy the beginning of the file before a transaction and the end of the file after it, meaning that pointers at one part of the file point to things which no longer exist.

The problem of backing up a changing database is one of the difficult problems in database management. Please don't expect an easy solution.

Simon.
Re: [sqlite] Fastest way to backup a "live" database
Hello Eric !

Have you looked at the ext/session extension ? It seems that it can be the answer to your problem. The basic idea is that you'll create a function that will receive the changes made to the database, and then you can incrementally apply them on the backup database. This way the memory usage will be low (as long as you do not have millions of changes in a short period of time).

Cheers !

On 04/10/16 10:53, Eric Grange wrote:
> Hi,
>
> Given a fairly large database (dozens of gigabytes), which uses WAL, and
> is being accessed continuously (mostly read transactions, but regular
> write transactions as well), what are the fastest and least disruptive
> ways to back it up?
>
> [...]
[sqlite] Fastest way to backup a "live" database
Hi,

Given a fairly large database (dozens of gigabytes), which uses WAL, and is being accessed continuously (mostly read transactions, but regular write transactions as well), what are the fastest and least disruptive ways to back it up?

A basic ".backup" from the CLI can occasionally take hours, as it is thrown off whenever large write transactions occur.

I have found that the following approaches seem to work, but would like a confirmation:
- using temp_store = MEMORY
- using synchronous = 0
- using a cache_size as large as possible (as high as possible without running out of memory)

I am going on the assumption that if something fails during backup, the backup itself will be toast anyway, but is that safe otherwise?

Also the precompiled CLI for Windows (sqlite3.exe) of 3.14.2 fails at around 2 GB RAM, which in my case is about 500,000 pages; is that already too high, or could it be worth going with a 64-bit CLI?

Thanks!

Eric
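[Editor's note: the 500,000-page / 2 GB estimate above can be checked directly from the database itself, since cache_size is measured in pages and page_count * page_size bounds how much a cache could ever hold. A small sketch using Python's stdlib sqlite3 module; the function name is illustrative.]

```python
import sqlite3

def db_size_info(path):
    """Return (page_size, page_count, total_bytes) for a database file.

    page_size * page_count is the upper bound on what any page cache
    could hold, which helps judge whether a huge cache_size is even
    meaningful for a given database."""
    con = sqlite3.connect(path)
    try:
        page_size = con.execute("PRAGMA page_size").fetchone()[0]
        page_count = con.execute("PRAGMA page_count").fetchone()[0]
    finally:
        con.close()
    return page_size, page_count, page_size * page_count
```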