Re: [sqlite] Fastest way to backup a "live" database

2016-10-05 Thread Simon Slavin

On 5 Oct 2016, at 1:07pm, Eric Grange  wrote:

> I have added SQLITE_OPEN_EXCLUSIVE to the flags of the backup db.

That's a very good idea.  Might add that to some tips I use elsewhere.  Thanks.

Simon.


Re: [sqlite] Fastest way to backup a "live" database

2016-10-05 Thread Eric Grange
> But yes, if you have an application which uses only the backup API it
> should be safe.

I have added SQLITE_OPEN_EXCLUSIVE to the flags of the backup db. That
should cover accidental cases where the backup db is opened before or
during the backup by "something else"? It should never happen in my case,
but it's probably best to leave it in, in case of copy-pasta later.
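
Something like this, for reference (illustrative only: the file name is
made up, and SQLITE_OPEN_EXCLUSIVE is not among the flag combinations
formally documented for sqlite3_open_v2):

    sqlite3 *dst = NULL;
    /* fail instead of silently reusing an existing backup file */
    int rc = sqlite3_open_v2("backup.db", &dst,
                             SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE |
                             SQLITE_OPEN_EXCLUSIVE, NULL);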

> A backup simply copies all pages.  It accesses each page once, so all
> cache lookups will fail.

Yes, but a 20% performance hit for a 2000-entry cache seemed a bit steep,
given that a backup should be I/O bound.

After looking with a profiler, it seems the extra time is spent in
ntdll.dll, not SQLite itself. The source cache_size has an impact, but a
small one; the main impact comes from the destination cache_size.

Also of note, the backup operation can take about half of a CPU core (a
fast E3 core at 3.5 GHz).

Eric



On Wed, Oct 5, 2016 at 12:50 PM, Clemens Ladisch  wrote:

> Eric Grange wrote:
> > Ran some tests with variations of Clemens's backup example, and adding
> >
> > sqlite3_exec(src, "PRAGMA cache_size=1", NULL, NULL, NULL);
> > sqlite3_exec(dst, "PRAGMA cache_size=1", NULL, NULL, NULL);
> >
> > seems to provide the best performance [...]
> >
> > While the effect of synchronous=OFF on the destination backup db was
> > something I expected, the cache_size of 1 was not.
> >
> > Using a large cache_size had a detrimental effect on both source and
> > destination; between cache_size 1 and SQLite's default (-2000) there
> > is a 20% difference in performance during backup.
>
> A backup simply copies all pages.  It accesses each page once, so all
> cache lookups will fail.
>
>
> Regards,
> Clemens


Re: [sqlite] Fastest way to backup a "live" database

2016-10-05 Thread Clemens Ladisch
Eric Grange wrote:
> Ran some tests with variations of Clemens's backup example, and adding
>
> sqlite3_exec(src, "PRAGMA cache_size=1", NULL, NULL, NULL);
> sqlite3_exec(dst, "PRAGMA cache_size=1", NULL, NULL, NULL);
>
> seems to provide the best performance [...]
>
> While the effect of synchronous=OFF on the destination backup db was
> something I expected, the cache_size of 1 was not.
>
> Using a large cache_size had a detrimental effect on both source and
> destination; between cache_size 1 and SQLite's default (-2000) there
> is a 20% difference in performance during backup.

A backup simply copies all pages.  It accesses each page once, so all
cache lookups will fail.


Regards,
Clemens


Re: [sqlite] Fastest way to backup a "live" database

2016-10-05 Thread Simon Slavin

On 5 Oct 2016, at 8:46am, Eric Grange  wrote:

> sqlite3_exec(dst, "PRAGMA synchronous=OFF", NULL, NULL, NULL);

Don't do that for any operation other than taking the backup.  It's a major
cause of corruption in any situation where two different connections (on the
same computer or on different computers) are accessing the same database.

But yes, if you have an application which uses only the backup API it should be 
safe.

Simon.


Re: [sqlite] Fastest way to backup a "live" database

2016-10-05 Thread Clemens Ladisch
Eric Grange wrote:
>> No, this is what makes the backup restart.  With a step size of -1, it
>> would never restart.
>
> Hmm, given WAL mode, this brings another question: why does the sqlite CLI
> not do that, instead of using 100 pages?

Because it's much older than WAL mode.

> Is using a backup_step N parameter only useful if you want to provide some
> progress indicator or pace the backup? (i.e. pace I/O bandwidth?)

It's useful in rollback journal mode, where the backup reader would block
any writers for a long time.
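
For illustration, a paced loop might look like this (a sketch only: the
step size and sleep interval are arbitrary, and "backup" is a handle
obtained from sqlite3_backup_init):

        /* copy 100 pages at a time, report progress, and sleep between
           steps so writers and other I/O get a chance to run */
        do {
                rc = sqlite3_backup_step(backup, 100);
                fprintf(stderr, "%d of %d pages remaining\n",
                        sqlite3_backup_remaining(backup),
                        sqlite3_backup_pagecount(backup));
                if (rc == SQLITE_OK || rc == SQLITE_BUSY || rc == SQLITE_LOCKED)
                        sqlite3_sleep(250);
        } while (rc == SQLITE_OK || rc == SQLITE_BUSY || rc == SQLITE_LOCKED);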


Regards,
Clemens


Re: [sqlite] Fastest way to backup a "live" database

2016-10-05 Thread Eric Grange
Ran some tests with variations of Clemens's backup example, and adding

sqlite3_exec(src, "PRAGMA cache_size=1", NULL, NULL, NULL);

sqlite3_exec(dst, "PRAGMA synchronous=OFF", NULL, NULL, NULL);
sqlite3_exec(dst, "PRAGMA cache_size=1", NULL, NULL, NULL);

seems to provide the best performance, which is about 4 to 5 times faster
than using ".backup" in the CLI.

While the effect of synchronous=OFF on the destination backup db was
something I expected, the cache_size of 1 was not.

Using a large cache_size had a detrimental effect on both source and
destination; between cache_size 1 and SQLite's default (-2000) there
is a 20% difference in performance during backup.

Note that this is on Windows, on SSD and with lots of free RAM, so it seems
that the OS is better at handling the cache than SQLite in that particular
use case.

Eric



On Wed, Oct 5, 2016 at 8:42 AM, Eric Grange  wrote:

> Thanks for the tool Clemens!
>
> > No, this is what makes the backup restart.  With a step size of -1, it
> > would never restart.
>
> Hmm, given WAL mode, this brings another question: why does the sqlite CLI
> not do that, instead of using 100 pages?
> Is using a backup_step N parameter only useful if you want to provide some
> progress indicator or pace the backup? (i.e. pace I/O bandwidth?)
>
> Or in other words, given WAL mode, what do we "lose" when using -1 to
> copy everything at once during a backup?
>
>
> On Tue, Oct 4, 2016 at 5:33 PM, Clemens Ladisch 
> wrote:
>
>> Eric Grange wrote:
>> > If all else fails, I could also suspend DB writes during backups
>> > (suspending DB reads would be more problematic).
>>
>> With WAL, the backup reader does not block writers.
>>
>> >> Use the backup API, and copy everything in one step.
>> >> (The restart-on-write feature should not be necessary with WAL.)
>> >
>> > That was what I thought initially, but it is the only way I can explain
>> > the multi-hour backups: usually the backup API takes 4-5 minutes; it is
>> > just once in a while that a very long backup occurs.
>> >
>> >> It calls sqlite3_backup_step() with a size of 100 pages.
>> >
>> > Ok, so I guess the huge cache is overkill with the default CLI!
>>
>> No, this is what makes the backup restart.  With a step size of -1,
>> it would never restart.
>>
>> Use a tool like the one below to do the backup in one step, without
>> restarts.
>>
>>
>> Regards,
>> Clemens
>> --
>>
>> #include <stdio.h>
>> #include <sqlite3.h>
>>
>> int main(int argc, char *argv[])
>> {
>>         sqlite3 *src = NULL;
>>         sqlite3 *dst = NULL;
>>         sqlite3_backup *backup;
>>         int rc;
>>         int ok = 0;
>>
>>         if (argc != 3) {
>>                 fputs("I want two file names: source, destination\n", stderr);
>>                 goto error;
>>         }
>>
>>         rc = sqlite3_open_v2(argv[1], &src, SQLITE_OPEN_READONLY, NULL);
>>         if (rc != SQLITE_OK) {
>>                 fprintf(stderr, "Cannot open %s: %s\n", argv[1],
>>                         sqlite3_errmsg(src));
>>                 goto error;
>>         }
>>
>>         rc = sqlite3_open_v2(argv[2], &dst,
>>                              SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE, NULL);
>>         if (rc != SQLITE_OK) {
>>                 fprintf(stderr, "Cannot open %s: %s\n", argv[2],
>>                         sqlite3_errmsg(dst));
>>                 goto error;
>>         }
>>
>>         /* busy_timeout is in milliseconds */
>>         sqlite3_exec(src, "PRAGMA busy_timeout=1", NULL, NULL, NULL);
>>         sqlite3_exec(dst, "PRAGMA busy_timeout=1", NULL, NULL, NULL);
>>
>>         backup = sqlite3_backup_init(dst, "main", src, "main");
>>         if (backup == NULL) {
>>                 fprintf(stderr, "Cannot initialize backup: %s\n",
>>                         sqlite3_errmsg(dst));
>>                 goto error;
>>         }
>>
>>         /* copy everything in a single step (-1 = all remaining pages) */
>>         do {
>>                 rc = sqlite3_backup_step(backup, -1);
>>         } while (rc == SQLITE_BUSY || rc == SQLITE_LOCKED);
>>
>>         rc = sqlite3_backup_finish(backup);
>>         if (rc == SQLITE_OK) {
>>                 ok = 1;
>>         } else {
>>                 fprintf(stderr, "Backup failed: %s\n", sqlite3_errmsg(dst));
>>         }
>>
>> error:
>>         sqlite3_close(dst);
>>         sqlite3_close(src);
>>
>>         return ok ? 0 : 1;
>> }


Re: [sqlite] Fastest way to backup a "live" database

2016-10-05 Thread Eric Grange
Thanks for the tool Clemens!

> No, this is what makes the backup restart.  With a step size of -1, it
> would never restart.

Hmm, given WAL mode, this brings another question: why does the sqlite CLI
not do that, instead of using 100 pages?
Is using a backup_step N parameter only useful if you want to provide some
progress indicator or pace the backup? (i.e. pace I/O bandwidth?)

Or in other words, given WAL mode, what do we "lose" when using -1 to
copy everything at once during a backup?


On Tue, Oct 4, 2016 at 5:33 PM, Clemens Ladisch  wrote:

> Eric Grange wrote:
> > If all else fails, I could also suspend DB writes during backups
> > (suspending DB reads would be more problematic).
>
> With WAL, the backup reader does not block writers.
>
> >> Use the backup API, and copy everything in one step.
> >> (The restart-on-write feature should not be necessary with WAL.)
> >
> > That was what I thought initially, but it is the only way I can explain
> > the multi-hour backups: usually the backup API takes 4-5 minutes; it is
> > just once in a while that a very long backup occurs.
> >
> >> It calls sqlite3_backup_step() with a size of 100 pages.
> >
> > Ok, so I guess the huge cache is overkill with the default CLI!
>
> No, this is what makes the backup restart.  With a step size of -1,
> it would never restart.
>
> Use a tool like the one below to do the backup in one step, without
> restarts.
>
>
> Regards,
> Clemens
> --
>
> #include <stdio.h>
> #include <sqlite3.h>
>
> int main(int argc, char *argv[])
> {
>         sqlite3 *src = NULL;
>         sqlite3 *dst = NULL;
>         sqlite3_backup *backup;
>         int rc;
>         int ok = 0;
>
>         if (argc != 3) {
>                 fputs("I want two file names: source, destination\n", stderr);
>                 goto error;
>         }
>
>         rc = sqlite3_open_v2(argv[1], &src, SQLITE_OPEN_READONLY, NULL);
>         if (rc != SQLITE_OK) {
>                 fprintf(stderr, "Cannot open %s: %s\n", argv[1],
>                         sqlite3_errmsg(src));
>                 goto error;
>         }
>
>         rc = sqlite3_open_v2(argv[2], &dst,
>                              SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE, NULL);
>         if (rc != SQLITE_OK) {
>                 fprintf(stderr, "Cannot open %s: %s\n", argv[2],
>                         sqlite3_errmsg(dst));
>                 goto error;
>         }
>
>         /* busy_timeout is in milliseconds */
>         sqlite3_exec(src, "PRAGMA busy_timeout=1", NULL, NULL, NULL);
>         sqlite3_exec(dst, "PRAGMA busy_timeout=1", NULL, NULL, NULL);
>
>         backup = sqlite3_backup_init(dst, "main", src, "main");
>         if (backup == NULL) {
>                 fprintf(stderr, "Cannot initialize backup: %s\n",
>                         sqlite3_errmsg(dst));
>                 goto error;
>         }
>
>         /* copy everything in a single step (-1 = all remaining pages) */
>         do {
>                 rc = sqlite3_backup_step(backup, -1);
>         } while (rc == SQLITE_BUSY || rc == SQLITE_LOCKED);
>
>         rc = sqlite3_backup_finish(backup);
>         if (rc == SQLITE_OK) {
>                 ok = 1;
>         } else {
>                 fprintf(stderr, "Backup failed: %s\n", sqlite3_errmsg(dst));
>         }
>
> error:
>         sqlite3_close(dst);
>         sqlite3_close(src);
>
>         return ok ? 0 : 1;
> }


Re: [sqlite] Fastest way to backup a "live" database

2016-10-04 Thread Clemens Ladisch
Eric Grange wrote:
> If all else fails, I could also suspend DB writes during backups (suspending
> DB reads would be more problematic).

With WAL, the backup reader does not block writers.

>> Use the backup API, and copy everything in one step.
>> (The restart-on-write feature should not be necessary with WAL.)
>
> That was what I thought initially, but it is the only way I can explain the
> multi-hour backups: usually the backup API takes 4-5 minutes; it is just once
> in a while that a very long backup occurs.
>
>> It calls sqlite3_backup_step() with a size of 100 pages.
>
> Ok, so I guess the huge cache is overkill with the default CLI!

No, this is what makes the backup restart.  With a step size of -1,
it would never restart.

Use a tool like the one below to do the backup in one step, without
restarts.


Regards,
Clemens
--

#include <stdio.h>
#include <sqlite3.h>

int main(int argc, char *argv[])
{
        sqlite3 *src = NULL;
        sqlite3 *dst = NULL;
        sqlite3_backup *backup;
        int rc;
        int ok = 0;

        if (argc != 3) {
                fputs("I want two file names: source, destination\n", stderr);
                goto error;
        }

        rc = sqlite3_open_v2(argv[1], &src, SQLITE_OPEN_READONLY, NULL);
        if (rc != SQLITE_OK) {
                fprintf(stderr, "Cannot open %s: %s\n", argv[1],
                        sqlite3_errmsg(src));
                goto error;
        }

        rc = sqlite3_open_v2(argv[2], &dst,
                             SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE, NULL);
        if (rc != SQLITE_OK) {
                fprintf(stderr, "Cannot open %s: %s\n", argv[2],
                        sqlite3_errmsg(dst));
                goto error;
        }

        /* busy_timeout is in milliseconds */
        sqlite3_exec(src, "PRAGMA busy_timeout=1", NULL, NULL, NULL);
        sqlite3_exec(dst, "PRAGMA busy_timeout=1", NULL, NULL, NULL);

        backup = sqlite3_backup_init(dst, "main", src, "main");
        if (backup == NULL) {
                fprintf(stderr, "Cannot initialize backup: %s\n",
                        sqlite3_errmsg(dst));
                goto error;
        }

        /* copy everything in a single step (-1 = all remaining pages) */
        do {
                rc = sqlite3_backup_step(backup, -1);
        } while (rc == SQLITE_BUSY || rc == SQLITE_LOCKED);

        rc = sqlite3_backup_finish(backup);
        if (rc == SQLITE_OK) {
                ok = 1;
        } else {
                fprintf(stderr, "Backup failed: %s\n", sqlite3_errmsg(dst));
        }

error:
        sqlite3_close(dst);
        sqlite3_close(src);

        return ok ? 0 : 1;
}
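
(Assuming the file is saved as backup.c, it should build with something
like "cc backup.c -lsqlite3 -o backup" and run as "backup source.db
destination.db"; the file names here are illustrative.)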


Re: [sqlite] Fastest way to backup a "live" database

2016-10-04 Thread Simon Slavin

On 4 Oct 2016, at 3:58pm, Eric Grange  wrote:

> If all else fails, I could also suspend DB writes during backups

If it's easy to do that, then I'd recommend it.  Used in conjunction with the
SQLite backup API it should provide the simplest solution and the one least
likely to present a problem.  It should not be necessary to suspend DB reads
during backups.

Simon.


Re: [sqlite] Fastest way to backup a "live" database

2016-10-04 Thread Eric Grange
> Have you looked at the ext/session extension?

Yes, but it is a bit more complicated to integrate, and as far as I
understand it would impose a penalty during execution: quite a few
intermediate states would be stored in a changeset, even though they do not
really need to be preserved at the end of the day.

Those intermediate states could be moved to a secondary DB, but that DB
would then lose foreign keys and other integrity constraints.

This is however an extension that I have been looking at, should the need
for some form of "live" db replication occur.

> I'm sorry.  I misunderstood your question and thought you were just
> duplicating the file using OS calls.  The SQLite backup API takes care of
> all necessary locking and consistency problems for you.  You should be
> fine.

Ok, thanks!

> The problem of backing up a changing database is one of the difficult
> problems in database management.  Please don't expect an easy solution.

Right now I am tackling it with brute force: a backup usually takes about
4-5 minutes (to a secondary SSD; that offline copy is then uploaded to a
remote server).
It is mostly the odd occurrences when the backup takes several hours that
are problematic.

If all else fails, I could also suspend DB writes during backups (suspending
DB reads would be more problematic).

> Use the backup API, and copy everything in one step.
> (The restart-on-write feature should not be necessary with WAL.)

That was what I thought initially, but it is the only way I can explain the
multi-hour backups: usually the backup API takes 4-5 minutes; it is just once
in a while that a very long backup occurs.

> It calls sqlite3_backup_step() with a size of 100 pages.

Ok, so I guess the huge cache is overkill with the default CLI!

Thanks


On Tue, Oct 4, 2016 at 4:05 PM, Simon Slavin  wrote:

>
> On 4 Oct 2016, at 2:53pm, Eric Grange  wrote:
>
> > I am going on the assumption that if something fails during backup, the
> > backup itself will be toast anyway, but is that safe otherwise?
>
> No.  You have no locking.  You might copy the beginning of the file before
> a transaction and the end of the file after it, meaning that pointers at
> one part of the file point to things which no longer exist.
>
> The problem of backing up a changing database is one of the difficult
> problems in database management.  Please don't expect an easy solution.
>
> Simon.


Re: [sqlite] Fastest way to backup a "live" database

2016-10-04 Thread Simon Slavin

On 4 Oct 2016, at 3:05pm, Simon Slavin  wrote:

> No.  You have no locking.  You might copy the beginning of the file before a 
> transaction and the end of the file after it, meaning that pointers at one 
> part of the file point to things which no longer exist.

I'm sorry.  I misunderstood your question and thought you were just duplicating 
the file using OS calls.  The SQLite backup API takes care of all necessary 
locking and consistency problems for you. You should be fine.

Simon.


Re: [sqlite] Fastest way to backup a "live" database

2016-10-04 Thread Clemens Ladisch
Eric Grange wrote:
> Given a fairly large database (dozens of gigabytes), which uses WAL, and is
> being accessed continuously (mostly read transactions, but regular write
> transactions as well), what are the fastest and least disruptive ways to
> back it up?

Use the backup API, and copy everything in one step.
(The restart-on-write feature should not be necessary with WAL.)

> A basic ".backup" from the CLI can occasionally take hours, as it is
> thrown off whenever large write transactions occur.

It calls sqlite3_backup_step() with a size of 100 pages.
I guess you have to implement your own backup tool (or modify the shell).

> I am going on the assumption that if something fails during backup, the
> backup itself will be toast anyway, but is that safe otherwise?

Yes; the backup API uses a transaction, like any other read access.


Regards,
Clemens


Re: [sqlite] Fastest way to backup a "live" database

2016-10-04 Thread Simon Slavin

On 4 Oct 2016, at 2:53pm, Eric Grange  wrote:

> I am going on the assumption that if something fails during backup, the
> backup itself will be toast anyway, but is that safe otherwise?

No.  You have no locking.  You might copy the beginning of the file before a 
transaction and the end of the file after it, meaning that pointers at one part 
of the file point to things which no longer exist.

The problem of backing up a changing database is one of the difficult problems 
in database management.  Please don't expect an easy solution.

Simon.


Re: [sqlite] Fastest way to backup a "live" database

2016-10-04 Thread Domingo Alvarez Duarte

Hello Eric!

Have you looked at the ext/session extension?

It seems that it could be the answer to your problem. The basic idea is
you'll create a function that will receive the changes made to the
database, and then you can incrementally apply them to the backup database.


This way the memory usage will be low (as long as you do not have millions
of changes in a short period of time).
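
A rough sketch of the idea in code (untested; it assumes SQLite built with
SQLITE_ENABLE_SESSION and SQLITE_ENABLE_PREUPDATE_HOOK, and liveDb/backupDb
are hypothetical open sqlite3* handles):

    /* hypothetical conflict policy: skip changes that conflict */
    static int conflictHandler(void *ctx, int eConflict,
                               sqlite3_changeset_iter *it)
    {
            return SQLITE_CHANGESET_OMIT;
    }

    /* 1. record changes on the live connection */
    sqlite3_session *session;
    sqlite3session_create(liveDb, "main", &session);
    sqlite3session_attach(session, NULL);   /* NULL = record all tables */

    /* 2. periodically extract the accumulated changes... */
    int n = 0;
    void *changeset = NULL;
    sqlite3session_changeset(session, &n, &changeset);

    /* 3. ...and replay them on the backup database */
    sqlite3changeset_apply(backupDb, n, changeset,
                           NULL /* no table filter */,
                           conflictHandler, NULL);
    sqlite3_free(changeset);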


Cheers!


On 04/10/16 10:53, Eric Grange wrote:

Hi,

Given a fairly large database (dozens of gigabytes), which uses WAL, and is
being accessed continuously (mostly read transactions, but regular write
transactions as well), what are the fastest and least disruptive ways to
back it up?

A basic ".backup" from the CLI can occasionally take hours, as it is
thrown off whenever large write transactions occur.

I have found that the following approaches seem to work, but would like
confirmation:
- using temp_store = MEMORY
- using synchronous = 0
- using a cache_size as large as possible (without running out of memory)

I am going on the assumption that if something fails during backup, the
backup itself will be toast anyway, but is that safe otherwise?

Also, the precompiled CLI for Windows (sqlite3.exe) of 3.14.2 fails at
around 2 GB RAM, which in my case is about 500,000 pages. Is that already
too high, or could it be worth going with a 64-bit CLI?

Thanks!

Eric


[sqlite] Fastest way to backup a "live" database

2016-10-04 Thread Eric Grange
Hi,

Given a fairly large database (dozens of gigabytes), which uses WAL, and is
being accessed continuously (mostly read transactions, but regular write
transactions as well), what are the fastest and least disruptive ways to
back it up?

A basic ".backup" from the CLI can occasionally take hours, as it is
thrown off whenever large write transactions occur.

I have found that the following approaches seem to work, but would like
confirmation (a sketch applying them follows below):
- using temp_store = MEMORY
- using synchronous = 0
- using a cache_size as large as possible (without running out of memory)
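
In code form, this is roughly what I mean, applied to the two connections
of an API backup (handle names and the cache value are illustrative; with
the CLI these would just be PRAGMA statements on the relevant connection):

    sqlite3_exec(src, "PRAGMA temp_store=MEMORY", NULL, NULL, NULL);
    sqlite3_exec(src, "PRAGMA cache_size=-2000000", NULL, NULL, NULL); /* ~2 GB */
    sqlite3_exec(dst, "PRAGMA synchronous=OFF", NULL, NULL, NULL);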

I am going on the assumption that if something fails during backup, the
backup itself will be toast anyway, but is that safe otherwise?

Also, the precompiled CLI for Windows (sqlite3.exe) of 3.14.2 fails at
around 2 GB RAM, which in my case is about 500,000 pages. Is that already
too high, or could it be worth going with a 64-bit CLI?

Thanks!

Eric