Re: [libvirt] Overview of libvirt incremental backup API, part 2 (incremental/differential pull mode)

2018-11-22 Thread Michael Ablassmeier
hi,

after watching John's slides from the KVM Forum (thanks for that) I had
a quick look at the backup-v3 branch. Just to provide some feedback
for you guys, and some questions.

My main question is about the NBD part of the backup. By default, what
you get from reading all the NBD data is a thick-provisioned image of
the domain's disk. One can use the `qemu-img map' function to get
detailed information about the used blocks in the image, in case one
wants to create a thin-provisioned backup.
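To illustrate the kind of information I mean, here is a rough sketch of
what a client with qemu-img available could query (host, port and export
name are placeholders, and the output is shortened/illustrative):

 $ qemu-img map --output=json -f raw nbd://localhost:10809/sda
 [{ "start": 0, "length": 65536, "depth": 0, "zero": false, "data": true},
  { "start": 65536, "length": 983040, "depth": 0, "zero": true, "data": false},
  ...]

A thin-provisioned backup would then only need to fetch the extents
reported with "data": true.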

As a third-party backup vendor you cannot always depend on the qemu
tools, because you might not even install any software on the host you
are taking a backup from. So is there, or will there be, any way to get
output that represents the same information as the map function in the
backup XML description via the libvirt API? Would it make sense to
provide this information in the `backup-dumpxml' output?

From what I know of the Citrix Xen implementation, they provide a way to
read this information via the API, because they do not want the backup
vendor to install any component on the host systems.

Another thing I came across is that libvirt currently seems to forget
about the running backup job if a domain is destroyed and started again
after a backup job was created:

[root@x ~]# virsh backup-begin centos backup-pull.xml 
Backup id 1 started
backup used description from 'backup-pull.xml'
[root@x ~]# virsh destroy centos && virsh start centos
[root@x ~]# virsh backup-end --id 1 centos
error: Requested operation is not valid: No active block job 'tmp-hda'
[root@x ~]# virsh backup-dumpxml --id 1 centos
<domainbackup type='pull'>
  ...
</domainbackup>

thanks for your hard work on this!

bye,
- michael




Re: [libvirt] Overview of libvirt incremental backup API, part 2 (incremental/differential pull mode)

2018-10-09 Thread Eric Blake

On 10/9/18 8:29 AM, Nir Soffer wrote:

On Fri, Oct 5, 2018 at 7:58 AM Eric Blake  wrote:


On 10/4/18 12:05 AM, Eric Blake wrote:

The following (long) email describes a portion of the work-flow of how
my proposed incremental backup APIs will work, along with the backend
QMP commands that each one executes.  I will reply to this thread with
further examples (the first example is long enough to be its own email).
This is an update to a thread last posted here:
https://www.redhat.com/archives/libvir-list/2018-June/msg01066.html




More to come in part 2.



- Second example: a sequence of incremental backups via pull model

In the first example, we did not create a checkpoint at the time of the
full pull. That means we have no way to track a delta of changes since
that point in time.



Why do we want to support backup without creating a checkpoint?


Fleecing. If you want to examine a portion of the disk at a given point 
in time, then kicking off a pull model backup gives you access to the 
state of the disk at that time, and your actions are transient.  Ending 
the job when you are done with the fleece cleans up everything needed to 
perform the fleece operation, and since you did not intend to capture a 
full (well, a complete) incremental backup, but were rather grabbing 
just a subset of the disk, you really don't want that point in time to 
be recorded as a new checkpoint.
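As a rough sketch of that fleecing workflow (the NBD host/port, export
name, and backup id are placeholders here; qemu-img dd is just one
possible NBD client):

$ $virsh backup-begin $dom backup.xml          # pull mode, no checkpoint XML
$ $qemu_img dd -f raw -O raw bs=64k count=16 skip=1024 \
    if=nbd://localhost:10809/sdc of=fleeced-region.img
$ $virsh backup-end $dom --id 1

Everything set up for the fleece (scratch storage, NBD export) is
cleaned up at backup-end, and no checkpoint is left behind.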


Also, incremental backups (which are what require checkpoints) are 
limited to qcow2 disks, but full backups can be performed on any format 
(including raw disks).  If you have a guest that does not use qcow2 
disks, you can perform a full backup, but cannot create a checkpoint.




If we don't have any real use case, I suggest always requiring a
checkpoint.


But we do have real use cases for backup without a checkpoint.





Let's repeat the full backup (reusing the same
backup.xml from before), but this time, we'll add a new parameter, a
second XML file for describing the checkpoint we want to create.

Actually, it was easy enough to get virsh to write the XML for me
(because it was very similar to existing code in virsh that creates XML
for snapshot creation):

$ $virsh checkpoint-create-as --print-xml $dom check1 testing \
 --diskspec sdc --diskspec sdd | tee check1.xml

<domaincheckpoint>
  <name>check1</name>

We should use an id, not a name, even if the name is also unique, like
in most libvirt APIs.

In RHV we will always use a UUID for this.


Nothing prevents you from using a UUID as your name. But this particular 
choice of XML (<name>) matches what already exists in the snapshot XML.






  <description>testing</description>
  <disks>
    <disk name='sdc'/>
    <disk name='sdd'/>
  </disks>
</domaincheckpoint>


I had to supply two --diskspec arguments to virsh to select just the two
qcow2 disks that I am using in my example (rather than every disk in the
domain, which is the default when <disks> is not present).



So is <disks/> a valid configuration, selecting all disks, or does not
having a "disks" element select all disks?


It's about a one-line change to get whichever behavior you find more 
useful. Right now, I'm leaning towards: <disks> omitted == backup all 
disks; <disks> present: you MUST have at least one <disk> subelement 
that explicitly requests a checkpoint (because any omitted <disk> when 
<disks> is present is skipped). A checkpoint only makes sense as long as 
there is at least one disk to create a checkpoint with.


But I could also go with: <disks> omitted == backup all disks; <disks> 
present but <disk> subelements missing: the missing elements default to 
being backed up, and you have to explicitly provide 
<disk ... checkpoint='no'/> to skip a particular disk.
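For that second style, the checkpoint XML might look something like this
(just a sketch, with a made-up checkpoint name):

<domaincheckpoint>
  <name>check-sketch</name>
  <disks>
    <disk name='sdd' checkpoint='no'/>
  </disks>
</domaincheckpoint>

where sdc (not listed) would default to being included in the checkpoint
and sdd is explicitly skipped.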


Or even: <disks> omitted, or <disks> present but <disk> subelements 
missing: the missing elements defer to the hypervisor for their default 
state, and the qemu hypervisor defaults to qcow2 disks being backed 
up/checkpointed and to non-qcow2 disks being omitted.  But this latter 
one feels like more magic, which is harder to document and liable to go 
wrong.


A stricter version would be: <disks> is mandatory, and no <disk> 
subelement can be missing (or else the API fails because you weren't 
explicit in your choice). But that's rather strict, especially since 
the existing snapshot XML handling is not that strict.






I also picked
a name (mandatory) and description (optional) to be associated with the
checkpoint.

The backup.xml file that we plan to reuse still mentions scratch1.img
and scratch2.img as files needed for staging the pull request. However,
any contents in those files could interfere with our second backup
(after all, every cluster written into that file from the first backup
represents a point in time that was frozen at the first backup; but our
second backup will want to read the data as the guest sees it now rather
than what it was at the first backup), so we MUST regenerate the scratch
files. (Perhaps I should have just deleted them at the end of example 1
in my previous email, had I remembered when typing that mail).

$ $qemu_img create -f qcow2 -b $orig1 -F qcow2 scratch1.img
$ $qemu_img create -f qcow2 -b $orig2 -F qcow2 scratch2.img

Now, to

Re: [libvirt] Overview of libvirt incremental backup API, part 2 (incremental/differential pull mode)

2018-10-09 Thread Nir Soffer
On Fri, Oct 5, 2018 at 7:58 AM Eric Blake  wrote:

> On 10/4/18 12:05 AM, Eric Blake wrote:
> > The following (long) email describes a portion of the work-flow of how
> > my proposed incremental backup APIs will work, along with the backend
> > QMP commands that each one executes.  I will reply to this thread with
> > further examples (the first example is long enough to be its own email).
> > This is an update to a thread last posted here:
> > https://www.redhat.com/archives/libvir-list/2018-June/msg01066.html
> >
>
> > More to come in part 2.
> >
>
> - Second example: a sequence of incremental backups via pull model
>
> In the first example, we did not create a checkpoint at the time of the
> full pull. That means we have no way to track a delta of changes since
> that point in time.


Why do we want to support backup without creating a checkpoint?

If we don't have any real use case, I suggest always requiring a
checkpoint.


> Let's repeat the full backup (reusing the same
> backup.xml from before), but this time, we'll add a new parameter, a
> second XML file for describing the checkpoint we want to create.
>
> Actually, it was easy enough to get virsh to write the XML for me
> (because it was very similar to existing code in virsh that creates XML
> for snapshot creation):
>
> $ $virsh checkpoint-create-as --print-xml $dom check1 testing \
> --diskspec sdc --diskspec sdd | tee check1.xml
> <domaincheckpoint>
>   <name>check1</name>
>

We should use an id, not a name, even if the name is also unique, like
in most libvirt APIs.

In RHV we will always use a UUID for this.


>   <description>testing</description>
>   <disks>
>     <disk name='sdc'/>
>     <disk name='sdd'/>
>   </disks>
> </domaincheckpoint>
>
> I had to supply two --diskspec arguments to virsh to select just the two
> qcow2 disks that I am using in my example (rather than every disk in the
> domain, which is the default when <disks> is not present).


So is <disks/> a valid configuration, selecting all disks, or does not
having a "disks" element select all disks?


> I also picked
> a name (mandatory) and description (optional) to be associated with the
> checkpoint.
>
> The backup.xml file that we plan to reuse still mentions scratch1.img
> and scratch2.img as files needed for staging the pull request. However,
> any contents in those files could interfere with our second backup
> (after all, every cluster written into that file from the first backup
> represents a point in time that was frozen at the first backup; but our
> second backup will want to read the data as the guest sees it now rather
> than what it was at the first backup), so we MUST regenerate the scratch
> files. (Perhaps I should have just deleted them at the end of example 1
> in my previous email, had I remembered when typing that mail).
>
> $ $qemu_img create -f qcow2 -b $orig1 -F qcow2 scratch1.img
> $ $qemu_img create -f qcow2 -b $orig2 -F qcow2 scratch2.img
>
> Now, to begin the full backup and create a checkpoint at the same time.
> Also, this time around, it would be nice if the guest had a chance to
> freeze I/O to the disks prior to the point chosen as the checkpoint.
> Assuming the guest is trusted, and running the qemu guest agent (qga),
> we can do that with:
>
> $ $virsh fsfreeze $dom
> $ $virsh backup-begin $dom backup.xml check1.xml
> Backup id 1 started
> backup used description from 'backup.xml'
> checkpoint used description from 'check1.xml'
> $ $virsh fsthaw $dom
>

Great, this answers my (unsent) question about freeze/thaw from part 1 :-)

>
> and eventually, we may decide to add a VIR_DOMAIN_BACKUP_BEGIN_QUIESCE
> flag to combine those three steps into a single API (matching what we've
> done on some other existing API).  In other words, the sequence of QMP
> operations performed during virDomainBackupBegin are quick enough that
> they won't stall a freeze operation (at least Windows is picky if you
> stall a freeze operation longer than 10 seconds).
>

We use fsFreeze/fsThaw directly in RHV since we need to support external
snapshots (e.g. ceph), so we don't need this functionality, but it sounds
like a good idea to make it work like snapshots.


>
> The tweaked $virsh backup-begin now results in a call to:
>   virDomainBackupBegin(dom, "<domainbackup ...>",
>     "<domaincheckpoint ...>", 0)
> and in turn libvirt makes a similar sequence of QMP calls as before,
> with a slight modification in the middle:
> {"execute":"nbd-server-start",...
> {"execute":"blockdev-add",...
>

This does not work yet for network disks like "rbd" and "glusterfs" -
does that mean they will not be supported for backup?


> {"execute":"transaction",
>   "arguments":{"actions":[
>{"type":"blockdev-backup", "data":{
> "device":"$node1", "target":"backup-sdc", "sync":"none",
> "job-id":"backup-sdc" }},
>{"type":"blockdev-backup", "data":{
> "device":"$node2", "target":"backup-sdd", "sync":"none",
> "job-id":"backup-sdd" }}
>{"type":"block-dirty-bitmap-add", "data":{
> "node":"$node1", "name":"check1", "persistent":true}},
>{"type":"block-dirty-bitmap-add", "data":{
> "node":"$node2", "name":"check1", "persistent":true

Re: [libvirt] Overview of libvirt incremental backup API, part 2 (incremental/differential pull mode)

2018-10-04 Thread Eric Blake

On 10/4/18 12:05 AM, Eric Blake wrote:
The following (long) email describes a portion of the work-flow of how 
my proposed incremental backup APIs will work, along with the backend 
QMP commands that each one executes.  I will reply to this thread with 
further examples (the first example is long enough to be its own email). 
This is an update to a thread last posted here:

https://www.redhat.com/archives/libvir-list/2018-June/msg01066.html




More to come in part 2.



- Second example: a sequence of incremental backups via pull model

In the first example, we did not create a checkpoint at the time of the 
full pull. That means we have no way to track a delta of changes since 
that point in time. Let's repeat the full backup (reusing the same 
backup.xml from before), but this time, we'll add a new parameter, a 
second XML file for describing the checkpoint we want to create.


Actually, it was easy enough to get virsh to write the XML for me 
(because it was very similar to existing code in virsh that creates XML 
for snapshot creation):


$ $virsh checkpoint-create-as --print-xml $dom check1 testing \
   --diskspec sdc --diskspec sdd | tee check1.xml

<domaincheckpoint>
  <name>check1</name>
  <description>testing</description>
  <disks>
    <disk name='sdc'/>
    <disk name='sdd'/>
  </disks>
</domaincheckpoint>

I had to supply two --diskspec arguments to virsh to select just the two 
qcow2 disks that I am using in my example (rather than every disk in the 
domain, which is the default when <disks> is not present). I also picked 
a name (mandatory) and description (optional) to be associated with the 
checkpoint.


The backup.xml file that we plan to reuse still mentions scratch1.img 
and scratch2.img as files needed for staging the pull request. However, 
any contents in those files could interfere with our second backup 
(after all, every cluster written into that file from the first backup 
represents a point in time that was frozen at the first backup; but our 
second backup will want to read the data as the guest sees it now rather 
than what it was at the first backup), so we MUST regenerate the scratch 
files. (Perhaps I should have just deleted them at the end of example 1 
in my previous email, had I remembered when typing that mail).


$ $qemu_img create -f qcow2 -b $orig1 -F qcow2 scratch1.img
$ $qemu_img create -f qcow2 -b $orig2 -F qcow2 scratch2.img

Now, to begin the full backup and create a checkpoint at the same time. 
Also, this time around, it would be nice if the guest had a chance to 
freeze I/O to the disks prior to the point chosen as the checkpoint. 
Assuming the guest is trusted, and running the qemu guest agent (qga), 
we can do that with:


$ $virsh fsfreeze $dom
$ $virsh backup-begin $dom backup.xml check1.xml
Backup id 1 started
backup used description from 'backup.xml'
checkpoint used description from 'check1.xml'
$ $virsh fsthaw $dom

and eventually, we may decide to add a VIR_DOMAIN_BACKUP_BEGIN_QUIESCE 
flag to combine those three steps into a single API (matching what we've 
done on some other existing API).  In other words, the sequence of QMP 
operations performed during virDomainBackupBegin are quick enough that 
they won't stall a freeze operation (at least Windows is picky if you 
stall a freeze operation longer than 10 seconds).
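If we do add that flag, the combined call might look something like the
following (purely illustrative - the flag does not exist yet):

 /* hypothetical: one call that quiesces, starts the backup, and thaws */
 virDomainBackupBegin(dom, "<domainbackup ...>",
   "<domaincheckpoint ...>", VIR_DOMAIN_BACKUP_BEGIN_QUIESCE);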


The tweaked $virsh backup-begin now results in a call to:
 virDomainBackupBegin(dom, "<domainbackup ...>",
   "<domaincheckpoint ...>", 0)
and in turn libvirt makes a similar sequence of QMP calls as before, 
with a slight modification in the middle:

{"execute":"nbd-server-start",...
{"execute":"blockdev-add",...
{"execute":"transaction",
 "arguments":{"actions":[
  {"type":"blockdev-backup", "data":{
   "device":"$node1", "target":"backup-sdc", "sync":"none",
   "job-id":"backup-sdc" }},
  {"type":"blockdev-backup", "data":{
   "device":"$node2", "target":"backup-sdd", "sync":"none",
   "job-id":"backup-sdd" }}
  {"type":"block-dirty-bitmap-add", "data":{
   "node":"$node1", "name":"check1", "persistent":true}},
  {"type":"block-dirty-bitmap-add", "data":{
   "node":"$node2", "name":"check1", "persistent":true}}
 ]}}
{"execute":"nbd-server-add",...

The only change was adding more actions to the "transaction" command - 
in addition to kicking off the fleece image in the scratch nodes, it 
ALSO added a persistent bitmap to each of the original images, to track 
all changes made after the point of the transaction.  The bitmaps are 
persistent - at this point (well, it's better if you wait until after 
backup-end), you could shut the guest down and restart it, and libvirt 
will still remember that the checkpoint exists, and qemu will continue 
to track guest writes via the bitmap. However, the backup job itself is 
currently live-only, and shutting down the guest while a backup 
operation is in effect will lose track of the backup job.  What that 
really means is that if the guest shuts down, your current backup job is 
hosed (you cannot ever get back the point-in-time data from your API 
request - as your next API request will be a new point in time) - but 
you have not permanently ruined the guest, and your recov