[sqlite] Ideas on how to use backup API for remote, incremental backup?

2015-10-09 Thread David Barrett
Thanks Richard, this is exactly what I was thinking.  One question on this:

On Tue, Oct 6, 2015 at 10:22 PM, Richard Hipp  wrote:

> (2) For the source database connection of the backup, use the same
> database connection that is used for writing to the database.  That
> means that when changes are made to the source database, the backup
> mechanism can go back and resend only those pages that actually
> changed.  If the database is modified by any database connection other
> than the one used for backup, then the backup has no way of knowing
> which pages changed, and so it has to start over again at the
> beginning and rewrite every page.
>

Hm, that's interesting.  I was thinking that when the backup started, all
new writes would just accumulate in the WAL file -- and then be
checkpointed after the backup completed.  Thus the backup would be of the
state of the database at the moment the backup was *started*.  However,
based on your description it sounds like the backup API will copy over
changes that have occurred throughout the backup process itself. Our
database is modified about 40 times per second, so my fear is the backup
API would never actually complete because it would be continuously
re-backing up the same pages.  Am I understanding this correctly?  Is there
any way to obtain the behavior I initially assumed?

Thanks!

-david


[sqlite] Ideas on how to use backup API for remote, incremental backup?

2015-10-09 Thread Richard Hipp
On 10/9/15, David Barrett  wrote:
> Thanks Richard, this is exactly what I was thinking.  One question on this:
>
> On Tue, Oct 6, 2015 at 10:22 PM, Richard Hipp  wrote:
>
>> (2) For the source database connection of the backup, use the same
>> database connection that is used for writing to the database.  That
>> means that when changes are made to the source database, the backup
>> mechanism can go back and resend only those pages that actually
>> changed.  If the database is modified by any database connection other
>> than the one used for backup, then the backup has no way of knowing
>> which pages changed, and so it has to start over again at the
>> beginning and rewrite every page.
>>
>
> Hm, that's interesting.  I was thinking that when the backup started, all
> new writes would just accumulate in the WAL file -- and then be
> checkpointed after the backup completed.  Thus the backup would be of the
> state of the database at the moment the backup was *started*.  However,
> based on your description it sounds like the backup API will copy over
> changes that have occurred throughout the backup process itself. Our
> database is modified about 40 times per second, so my fear is the backup
> API would never actually complete because it would be continuously
> re-backing up the same pages.  Am I understanding this correctly?  Is there
> any way to obtain the behavior I initially assumed?
>

Yes - run the backup using a separate "source" database connection and
do the entire backup within a read transaction.  Note that having this
long-running read transaction going in a separate database connection
will prevent checkpoint operations from resetting the WAL file, and so
the WAL file will continue to grow until the backup completes.  So you
may want to take steps to ensure that the backup doesn't take *too*
long.
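
A minimal C sketch of that recipe might look like this (hypothetical file
paths, abbreviated error handling): open a dedicated "source" connection,
pin its snapshot with a read transaction, and run the whole backup before
committing.

#include <sqlite3.h>

int snapshot_backup(const char *src_path, const char *dst_path)
{
    sqlite3 *src = 0, *dst = 0;
    sqlite3_backup *bk;
    int rc = SQLITE_ERROR;

    if (sqlite3_open(src_path, &src) != SQLITE_OK) goto done;
    if (sqlite3_open(dst_path, &dst) != SQLITE_OK) goto done;

    /* BEGIN is deferred, so issue a read to actually take the snapshot */
    if (sqlite3_exec(src, "BEGIN; SELECT count(*) FROM sqlite_master;",
                     0, 0, 0) != SQLITE_OK) goto done;

    bk = sqlite3_backup_init(dst, "main", src, "main");
    if (bk) {
        do {
            rc = sqlite3_backup_step(bk, 64);      /* 64 pages per call */
            if (rc == SQLITE_BUSY || rc == SQLITE_LOCKED)
                sqlite3_sleep(100);                /* back off and retry */
        } while (rc == SQLITE_OK || rc == SQLITE_BUSY || rc == SQLITE_LOCKED);
        sqlite3_backup_finish(bk);
        if (rc == SQLITE_DONE) rc = SQLITE_OK;
    }
    sqlite3_exec(src, "COMMIT;", 0, 0, 0);         /* release the snapshot */

done:
    sqlite3_close(src);
    sqlite3_close(dst);
    return rc;
}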


-- 
D. Richard Hipp
drh at sqlite.org


[sqlite] Ideas on how to use backup API for remote, incremental backup?

2015-10-06 Thread David Barrett
On Tue, Oct 6, 2015 at 8:36 PM, Simon Slavin  wrote:

> Or copy an existing /open/ database file to the new server using the
> SQLite Backup API, [requires other connections to stop modifying the
> database for long enough for the copy to be made]
>

Well, the backup API works with WAL mode [1], so it can still be modified
while being backed up (though I imagine the WAL file will get huge during
the backup, but that's fine).

Regardless, all those solutions require me to wait for the entire backup to
complete before sending the file to the remote host -- my goal is to send
it one page at a time (e.g., send the pages as the backup API processes them)
so as to avoid the large delay and disk space of the backup itself.

-david

[1] http://sqlite.1065341.n5.nabble.com/WAL-mode-and-backup-API-td40107.html


[sqlite] Ideas on how to use backup API for remote, incremental backup?

2015-10-06 Thread David Barrett
On Tue, Oct 6, 2015 at 2:57 PM, Clemens Ladisch  wrote:

> It backs up to any disk that you can access.
> Do you have a network file system?
>

Well yes, but I'd like to handle it at the application layer.  Basically,
we operate a custom replication layer atop sqlite.  It replicates
individual transactions well using two-phase commit, but right now you need
to manually "bootstrap" a new server by copying the database from a
different server.  I'd like to auto-bootstrap a new node by just starting
it: it would connect to a peer and then download the entire database.


> > how to use this API to do an incremental backup
>
> This API is not incremental; it always copies the entire database.
>

Agreed it will copy the entire database, but incrementally -- one bit at a
time.  Each call to sqlite3_backup_step() backs up a bit more to the target
file on disk.  My goal is to instead have sqlite3_backup_step() copy some
pages to some block of RAM, so I can send that data over the wire and write
it to the remote disk.
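
For reference, a minimal sketch of that stepping loop against a local
destination (both connections assumed open; sqlite3_backup_remaining()
and sqlite3_backup_pagecount() expose progress between calls):

#include <stdio.h>
#include <sqlite3.h>

/* Copy src into dst a few pages per call, printing progress between calls. */
int copy_in_steps(sqlite3 *src, sqlite3 *dst)
{
    sqlite3_backup *bk = sqlite3_backup_init(dst, "main", src, "main");
    int rc;
    if (!bk) return sqlite3_errcode(dst);
    do {
        rc = sqlite3_backup_step(bk, 8);           /* up to 8 pages */
        printf("%d of %d pages remaining\n",
               sqlite3_backup_remaining(bk),
               sqlite3_backup_pagecount(bk));
    } while (rc == SQLITE_OK);
    return sqlite3_backup_finish(bk);              /* SQLITE_OK on success */
}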

-david


[sqlite] Ideas on how to use backup API for remote, incremental backup?

2015-10-06 Thread R.Smith
Well, there's the obvious elephant in the room - SQL. You could just
reduce any DB to SQL statements and pass those along at whatever
pace/destination/byte-mode you fancy. The target system will have zero
trouble turning it back into a DB, since the SQLite engine can already
parse SQL.

This can become really efficient later when you only move updated bits - 
plus, if bandwidth is an issue, SQL text compresses well.
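
A hedged sketch of that idea: the fragment below streams only the schema
as SQL text; emitting row data as INSERT statements (what the sqlite3
shell's .dump command does) would additionally require SELECTing each
table and quoting values. send_line() is an assumed application callback.

#include <sqlite3.h>

extern void send_line(const char *text);   /* assumed: ship text to the peer */

int dump_schema(sqlite3 *db)
{
    sqlite3_stmt *st;
    int rc = sqlite3_prepare_v2(db,
        "SELECT sql FROM sqlite_master WHERE sql IS NOT NULL", -1, &st, 0);
    if (rc != SQLITE_OK) return rc;
    while (sqlite3_step(st) == SQLITE_ROW) {
        send_line((const char *)sqlite3_column_text(st, 0));
        send_line(";\n");
    }
    return sqlite3_finalize(st);
}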

Cheers,
Ryan


On 2015-10-06 03:44 PM, David Barrett wrote:
> On Tue, Oct 6, 2015 at 8:36 PM, Simon Slavin  wrote:
>
>> Or copy an existing /open/ database file to the new server using the
>> SQLite Backup API, [requires other connections to stop modifying the
>> database for long enough for the copy to be made]
>>
> Well, the backup API works with WAL mode [1], so it can still be modified
> while being backed up (though I imagine the WAL file will get huge during
> the backup, but that's fine).
>
> Regardless, all those solutions require me to wait for the entire backup to
> complete before sending the file to the remote host -- my goal is to send
> it one page at a time (e.g., send the pages as the backup API processes them)
> so as to avoid the large delay and disk space of the backup itself.
>
> -david
>
> [1] http://sqlite.1065341.n5.nabble.com/WAL-mode-and-backup-API-td40107.html



[sqlite] Ideas on how to use backup API for remote, incremental backup?

2015-10-06 Thread Simon Slavin

On 6 Oct 2015, at 2:44pm, David Barrett  wrote:

> Regardless, all those solutions require me to wait for the entire backup to
> complete before sending the file to the remote host -- my goal is to send
> it one page at a time (e.g., send the pages as the backup API processes them)
> so as to avoid the large delay and disk space of the backup itself.

In that case it's probably best, as Clemens suggested, to use a network
file system that lets one computer access both the source file storage and
the destination file storage. By opening a file on the remote filespace with
SQLite you can use the Backup API to write directly to the destination
filespace.

Simon.


[sqlite] Ideas on how to use backup API for remote, incremental backup?

2015-10-06 Thread Simon Slavin

On 6 Oct 2015, at 1:52pm, David Barrett  wrote:

> Well yes, but I'd like to handle it at the application layer.  Basically,
> we operate a custom replication layer atop sqlite.  It replicates
> individual transactions great with 2-phase commit, but right now you need
> to manually "bootstrap" a new server by copying the database from a
> different server.  I'd like to auto-bootstrap a new node by just starting
> it, it'd connect to a peer, and then download the entire database.

Copy an existing /closed/ database file to the new server using file commands,
[other connections trying to make modifications will, I think, get error 
messages]

Or copy an existing /open/ database file to the new server using the SQLite 
Backup API, [requires other connections to stop modifying the database for long 
enough for the copy to be made]

Or copy an existing /open/ database file to the new server by creating the new
database using sqlite3_open(), then ATTACHing the source database and using
INSERT INTO ... SELECT ...
[locks other connections out of the source database while the copy is made]
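
A hedged C sketch of that last option, with a hypothetical path and table
name (CREATE TABLE ... AS SELECT copies rows but not indexes or
constraints, so a full clone would also walk sqlite_master):

#include <sqlite3.h>

/* dst is an open connection on the new, empty database. */
int clone_table(sqlite3 *dst)
{
    return sqlite3_exec(dst,
        "ATTACH DATABASE 'source.db' AS src;"
        "CREATE TABLE main.t AS SELECT * FROM src.t;"
        "DETACH DATABASE src;",
        0, 0, 0);
}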

Simon.


[sqlite] Ideas on how to use backup API for remote, incremental backup?

2015-10-06 Thread Richard Hipp
On 10/6/15, David Barrett  wrote:
> sqlite has a cool "online backup" API: https://www.sqlite.org/backup.html
>  However, it only backs up to a local disk.  I'm wondering if anybody can
> think of how to use this API to do an incremental backup over a network
> connection to a remote host?  The networking part is easy.  But I can't
> figure out how to programmatically get the binary content of the database
> file into memory such that I can send it over the network (other than
> waiting for the whole thing to back up to disk and then just doing a remote
> disk copy,
> obviously, but my file is so large I don't want to wait that long nor do I
> have enough disk to do that).  I'm not really familiar with the VFS layer
> -- would there be some way to do a custom VFS that the online backup API
> writes to, and then I just copy the blocks over from that?  Thanks for any
> creative ideas!
>

Suppose you have a really massive database that is being continually
updated, and you want to have a live backup to another host over a
network.  The SQLite backup API will work for this, we believe, as
long as the database is only being written from a single database
connection.

The backup API transfers information from a "source" database
connection over to a "destination" database connection.  There are
several considerations here:

(1) For the destination database connection, use a custom VFS that
writes content over the network rather than writing content to disk.
This custom VFS does not need to be general-purpose.  It can be highly
specialized so that it only handles the specific VFS methods that the
backup interface will generate.

(2) For the source database connection of the backup, use the same
database connection that is used for writing to the database.  That
means that when changes are made to the source database, the backup
mechanism can go back and resend only those pages that actually
changed.  If the database is modified by any database connection other
than the one used for backup, then the backup has no way of knowing
which pages changed, and so it has to start over again at the
beginning and rewrite every page.

(3) The thread that is running the backup should monitor how fast new
content is being shipped over the network and should throttle calls to
sqlite3_backup_step() to prevent too much content from being queued up
in cache.
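
A hedged sketch of such a throttled loop, where queue_depth() stands in
for an application-provided count of bytes not yet sent (it is not a
SQLite API):

#include <stddef.h>
#include <sqlite3.h>

extern size_t queue_depth(void);   /* assumed app hook: bytes not yet sent */

int throttled_backup(sqlite3_backup *bk)
{
    int rc;
    do {
        rc = sqlite3_backup_step(bk, 16);          /* copy up to 16 pages */
        while (queue_depth() > (size_t)1 << 20)    /* more than 1 MiB queued */
            sqlite3_sleep(50);                     /* wait 50 ms, then recheck */
    } while (rc == SQLITE_OK || rc == SQLITE_BUSY || rc == SQLITE_LOCKED);
    return rc == SQLITE_DONE ? SQLITE_OK : rc;
}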

-- 
D. Richard Hipp
drh at sqlite.org


[sqlite] Ideas on how to use backup API for remote, incremental backup?

2015-10-06 Thread Clemens Ladisch
David Barrett wrote:
> sqlite has a cool "online backup" API: https://www.sqlite.org/backup.html
>  However, it only backs up to a local disk.

It backs up to any disk that you can access.
Do you have a network file system?

> how to use this API to do an incremental backup

This API is not incremental; it always copies the entire database.


Regards,
Clemens


[sqlite] Ideas on how to use backup API for remote, incremental backup?

2015-10-06 Thread Charles Leifer
You could mount a directory as a ramdisk.

On Tue, Oct 6, 2015 at 7:52 AM, David Barrett wrote:

> On Tue, Oct 6, 2015 at 2:57 PM, Clemens Ladisch wrote:
>
> > It backs up to any disk that you can access.
> > Do you have a network file system?
> >
>
> Well yes, but I'd like to handle it at the application layer.  Basically,
> we operate a custom replication layer atop sqlite.  It replicates
> individual transactions well using two-phase commit, but right now you need
> to manually "bootstrap" a new server by copying the database from a
> different server.  I'd like to auto-bootstrap a new node by just starting
> it: it would connect to a peer and then download the entire database.
>
>
> > > how to use this API to do an incremental backup
> >
> > This API is not incremental; it always copies the entire database.
> >
>
> Agreed it will copy the entire database, but incrementally -- one bit at a
> time.  Each call to sqlite3_backup_step() backs up a bit more to the target
> file on disk.  My goal is to instead have sqlite3_backup_step() copy some
> pages to some block of RAM, so I can send that data over the wire and write
> it to the remote disk.
>
> -david


[sqlite] Ideas on how to use backup API for remote, incremental backup?

2015-10-06 Thread David Barrett
sqlite has a cool "online backup" API: https://www.sqlite.org/backup.html
 However, it only backs up to a local disk.  I'm wondering if anybody can
think of how to use this API to do an incremental backup over a network
connection to a remote host?  The networking part is easy.  But I can't
figure out how to programmatically get the binary content of the database
file into memory such that I can send it over the network (other than waiting
for the whole thing to back up to disk and then just doing a remote disk copy,
obviously, but my file is so large I don't want to wait that long nor do I
have enough disk to do that).  I'm not really familiar with the VFS layer
-- would there be some way to do a custom VFS that the online backup API
writes to, and then I just copy the blocks over from that?  Thanks for any
creative ideas!

-david