Re: Squelch 'eof from qemu monitor' error on normal VM shutdown

2021-09-30 Thread Jim Fehlig

On 9/30/21 11:24, Daniel P. Berrangé wrote:

On Thu, Sep 30, 2021 at 11:15:18AM -0600, Jim Fehlig wrote:

On 9/29/21 21:29, Jim Fehlig wrote:

Hi All,

Likely Christian received a bug report that motivated commit aeda1b8c56,
which was later reverted by Michal with commit 72adaf2f10. In the past,
I recall being asked about "internal error: End of file from qemu
monitor" on normal VM shutdown and gave a hand wavy response using some
of Michal's words from the revert commit message.

I recently received a bug report (sorry, but no public link) from a
concerned user about this error and wondered if there is some way to
improve it? I went down some dead ends before circling back to
Christian's patch. When rebased to latest master, I cannot reproduce the
hangs reported by Michal [1]. Perhaps Nikolay's series to resolve
hangs/crashes of libvirtd [2] has now made Christian's patch viable?


Hmm, Nikolay's series improves thread management at daemon shutdown and
doesn't touch VM shutdown logic. But there has been some behavior change
from the time aeda1b8c56 was committed (3.4.0 dev cycle) to current git
master. At least I don't see libvirtd hanging when running Michal's test on
master + rebased aeda1b8c56.


This particular "eof" error message has been a source of never ending
complaints from people who mistakenly think it indicates a significant
problem.


Nod. I've probably been asked about it more times than I'm remembering. "error" 
and "fail" often catch the attention of users and monitor tools.



I've always been wary of hiding it by default as there are potentially
scenarios in which it is interesting to see. I think I'm coming around
to the idea though that we're better off hiding it by default. The
scenarios which care about it will probably already need the user to
contribute full debug level logs in order to diagnose properly.


I've started a variant of Michal's test on git master + rebased aeda1b8c56. The 
test starts 6 VMs, waits 90sec, shuts them down in parallel, waits for shutdowns 
to finish, then repeats. All while continuously calling GetAllDomainStats from 
another client. I'll see how that holds up before re-proposing Christian's patch.


Regards,
Jim




Re: Squelch 'eof from qemu monitor' error on normal VM shutdown

2021-09-30 Thread Jim Fehlig

On 9/29/21 21:29, Jim Fehlig wrote:

Hi All,

Likely Christian received a bug report that motivated commit aeda1b8c56, which 
was later reverted by Michal with commit 72adaf2f10. In the past, I recall being 
asked about "internal error: End of file from qemu monitor" on normal VM 
shutdown and gave a hand wavy response using some of Michal's words from the 
revert commit message.


I recently received a bug report (sorry, but no public link) from a concerned 
user about this error and wondered if there is some way to improve it? I went 
down some dead ends before circling back to Christian's patch. When rebased to 
latest master, I cannot reproduce the hangs reported by Michal [1]. Perhaps 
Nikolay's series to resolve hangs/crashes of libvirtd [2] has now made 
Christian's patch viable?


Hmm, Nikolay's series improves thread management at daemon shutdown and doesn't 
touch VM shutdown logic. But there has been some behavior change from the time 
aeda1b8c56 was committed (3.4.0 dev cycle) to current git master. At least I 
don't see libvirtd hanging when running Michal's test on master + rebased 
aeda1b8c56.


Regards,
Jim


[1] https://bugzilla.redhat.com/show_bug.cgi?id=1536461
[2] https://listman.redhat.com/archives/libvir-list/2020-July/msg01606.html




Re: [PATCH] failover: allow to pause the VM during the migration

2021-09-30 Thread Laine Stump

On 9/30/21 1:09 PM, Laurent Vivier wrote:

If we want to save a snapshot of a VM to a file, we used to follow
these steps:

1- stop the VM:
(qemu) stop

2- migrate the VM to a file:
(qemu) migrate "exec:cat > snapshot"

3- resume the VM:
(qemu) cont

After that we can restore the snapshot with:
   qemu-system-x86_64 ... -incoming "exec:cat snapshot"
   (qemu) cont


This is the basics of what libvirt does for a snapshot, and steps 1+2 
are what it does for a "managedsave" (where it saves the snapshot to 
disk and then terminates the qemu process, for later re-animation).


In those cases, it seems like this new parameter could work for us - 
instead of explicitly pausing the guest prior to migrating it to disk, 
we would set this new parameter to on, then directly migrate-to-disk 
(relying on qemu to do the pause). Care will need to be taken to assure 
that error recovery behaves the same though.


There are a couple of cases when libvirt apparently *doesn't* pause the 
guest during the migrate-to-disk, both having to do with saving a 
coredump of the guest. Since I really have no idea of how 
common/important that is (or even if my assessment of the code is 
correct), I'm Cc'ing this patch to libvir-list to make sure it catches 
the attention of someone who knows the answers and implications.




But when failover is configured, it doesn't work anymore.

As the failover needs to ask the guest OS to unplug the card,
the machine cannot be paused.

This patch introduces a new migration parameter, "pause-vm", that
asks the migration to pause the VM during the migration startup
phase after the card is unplugged.

Once the migration is done, we only need to resume the VM with
"cont" and the card is plugged back:

1- set the parameter:
(qemu) migrate_set_parameter pause-vm on

2- migrate the VM to a file:
(qemu) migrate "exec:cat > snapshot"

The primary failover card (VFIO) is unplugged and the VM is paused.

3- resume the VM:
(qemu) cont

The VM restarts and the primary failover card is plugged back.

The VM state sent in the migration stream is "paused"; this means that
when the snapshot is loaded, or if the stream is sent to a destination
QEMU, the VM needs to be resumed manually.
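For reference, the QMP equivalent of the HMP steps above would presumably look like this (a sketch based on the QAPI additions in this patch; since the parameter is only proposed here, these names are not in any released QEMU):

```json
{ "execute": "migrate-set-parameters",
  "arguments": { "pause-vm": true } }
{ "execute": "migrate",
  "arguments": { "uri": "exec:cat > snapshot" } }
```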

Signed-off-by: Laurent Vivier 
---
  qapi/migration.json| 20 +++---
  include/hw/virtio/virtio-net.h |  1 +
  hw/net/virtio-net.c| 33 ++
  migration/migration.c  | 37 +-
  monitor/hmp-cmds.c |  8 
  5 files changed, 95 insertions(+), 4 deletions(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index 88f07baedd06..86284d96ad2a 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -743,6 +743,10 @@
  #block device name if there is one, and to their node name
  #otherwise. (Since 5.2)
  #
+# @pause-vm:   Pause the virtual machine before doing the migration.
+#  This allows QEMU to unplug a card before doing the
+#  migration as it depends on the guest kernel. (since 6.2)
+#
  # Since: 2.4
  ##
  { 'enum': 'MigrationParameter',
@@ -758,7 +762,7 @@
 'xbzrle-cache-size', 'max-postcopy-bandwidth',
 'max-cpu-throttle', 'multifd-compression',
 'multifd-zlib-level' ,'multifd-zstd-level',
-   'block-bitmap-mapping' ] }
+   'block-bitmap-mapping', 'pause-vm' ] }
  
  ##

  # @MigrateSetParameters:
@@ -903,6 +907,10 @@
  #block device name if there is one, and to their node name
  #otherwise. (Since 5.2)
  #
+# @pause-vm:   Pause the virtual machine before doing the migration.
+#  This allows QEMU to unplug a card before doing the
+#  migration as it depends on the guest kernel. (since 6.2)
+#
  # Since: 2.4
  ##
  # TODO either fuse back into MigrationParameters, or make
@@ -934,7 +942,8 @@
  '*multifd-compression': 'MultiFDCompression',
  '*multifd-zlib-level': 'uint8',
  '*multifd-zstd-level': 'uint8',
-'*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ] } }
+'*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ],
+'*pause-vm': 'bool' } }
  
  ##

  # @migrate-set-parameters:
@@ -1099,6 +1108,10 @@
  #block device name if there is one, and to their node name
  #otherwise. (Since 5.2)
  #
+# @pause-vm:   Pause the virtual machine before doing the migration.
+#  This allows QEMU to unplug a card before doing the
+#  migration as it depends on the guest kernel. (since 6.2)
+#
  # Since: 2.4
  ##
  { 'struct': 'MigrationParameters',
@@ -1128,7 +1141,8 @@
  '*multifd-compression': 'MultiFDCompression',
  '*multifd-zlib-level': 'uint8',

Re: Squelch 'eof from qemu monitor' error on normal VM shutdown

2021-09-30 Thread Daniel P . Berrangé
On Thu, Sep 30, 2021 at 11:15:18AM -0600, Jim Fehlig wrote:
> On 9/29/21 21:29, Jim Fehlig wrote:
> > Hi All,
> > 
> > Likely Christian received a bug report that motivated commit aeda1b8c56,
> > which was later reverted by Michal with commit 72adaf2f10. In the past,
> > I recall being asked about "internal error: End of file from qemu
> > monitor" on normal VM shutdown and gave a hand wavy response using some
> > of Michal's words from the revert commit message.
> > 
> > I recently received a bug report (sorry, but no public link) from a
> > concerned user about this error and wondered if there is some way to
> > improve it? I went down some dead ends before circling back to
> > Christian's patch. When rebased to latest master, I cannot reproduce the
> > hangs reported by Michal [1]. Perhaps Nikolay's series to resolve
> > hangs/crashes of libvirtd [2] has now made Christian's patch viable?
> 
> Hmm, Nikolay's series improves thread management at daemon shutdown and
> doesn't touch VM shutdown logic. But there has been some behavior change
> from the time aeda1b8c56 was committed (3.4.0 dev cycle) to current git
> master. At least I don't see libvirtd hanging when running Michal's test on
> master + rebased aeda1b8c56.

This particular "eof" error message has been a source of never ending
complaints from people who mistakenly think it indicates a significant
problem.

I've always been wary of hiding it by default as there are potentially
scenarios in which it is interesting to see. I think I'm coming around
to the idea though that we're better off hiding it by default. The
scenarios which care about it will probably already need the user to
contribute full debug level logs in order to diagnose properly.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



Re: [Libguestfs] translating CD-ROM device paths from i440fx to Q35 in virt-v2v (was: test-v2v-cdrom: update the CD-ROM's bus to SATA in the converted domain)

2021-09-30 Thread Richard W.M. Jones
On Thu, Sep 30, 2021 at 12:53:51PM +0100, Richard W.M. Jones wrote:
> On Thu, Sep 30, 2021 at 01:12:39PM +0200, Laszlo Ersek wrote:
> > All this requires virt-v2v to parse complete <disk> elements from the
> > original domain XML, and to generate complete <disk> elements in the
> > destination domain XML. Is that feasible?

Just to put a bit of flesh on these bones, links to the source:

> The input is not always (in fact, hardly ever) full libvirt XML.  It's
> input specific to the hypervisor.  For VMware it might be:
> 
>  - the *.vmx file (the real source of truth) (-i vmx)

Here's the virt-v2v code that can parse either a local or remote (over
ssh) *.vmx file into virt-v2v's internal metadata representation:

https://github.com/libguestfs/virt-v2v/blob/master/input/parse_domain_from_vmx.ml

Libvirt also has a vmx driver, but I decided not to use it because it
just adds an extra translation step and probably loses fidelity in the
process.

In all cases, the *.vmx file is the source of truth for VMware.

>  - partial libvirt XML generated by libvirt's vpx driver, but this is
>derived from information from VMware APIs and ultimately that comes
>from the *.vmx file (-i libvirt -ic esx:// or -ic vpx://)

The libvirt driver:

https://gitlab.com/libvirt/libvirt/-/tree/master/src/esx

I've hacked on this one a little bit in the past, and ... it's complicated.

>  - the *.ovf file (-i ova)

OVF is a pseudo-standard invented by VMware, but essentially a plot so
they can say they are "standard" when in fact it's just a data dump of
internal VMware structures with no compatibility across software that
generates or consumes OVF.

Here's our OVF parser (for VMware-OVF):

https://github.com/libguestfs/virt-v2v/blob/master/input/OVF.ml

We can also generate OVF, but we generate the RHV-flavour of OVF which
(see above) is not related except that some XML element names overlap.

>  - nothing at all! (-i disk)

If you give virt-v2v just a disk then it will invent some metadata:

https://github.com/libguestfs/virt-v2v/blob/5adf437bcc00586961d8aa058559f86eb5165149/input/input.ml#L235

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top



Re: [PATCH v3 0/5] Add support for two i386 pm options which control acpi hotplug

2021-09-30 Thread Ani Sinha
On Thu, Sep 30, 2021 at 18:52 Laine Stump  wrote:

> On 9/30/21 2:16 AM, Ani Sinha wrote:
> > On Fri, Sep 24, 2021 at 2:16 AM Laine Stump  wrote:
> >>
> >> On 9/11/21 11:26 PM, Ani Sinha wrote:
> >>> The above two options are only available for qemu driver and that too
> for x86
> >>> guests only. Both of them are global options.
> >>>
> >>> ``acpi-hotplug-bridge`` option enables or disables ACPI hotplug
> support for cold
> >>> plugged bridges. Examples of cold plugged bridges include PCI-PCI
> bridge
> >>> (pci-bridge controller) for pc machines and pcie-root-port controller
> for q35
> >>> machines. The corresponding commandline options to qemu for x86 guests
> are:
> >>
> >> The "cold plugged bridges" term here throws me for a loop - it implies
> >> that hotplugging bridges is something that's supported, and I think it
> >> still isn't. Of course this is just the cover letter, so it won't go
> >> into git anywhere, but I think it should be enough to say "enables ACPI
> >> hotplug into non-root bus PCI bridges/ports".
> >
> > I think emphasizing cold plugged bridges is important as Igor (CC'd)
> > has clarified in the other email on patch #3 of this series.
>
> Okay, so the implication in Igor's email is that a) it is possible to
> hotplug a pcie controller, but b) any controller that is hotplugged will
> not have ACPI enabled. Note though that libvirt doesn't allow
> hotplugging *any* PCI controller, since we were told long ago that no OS
> will actually rescan the PCI topology when this is done, and so the new
> controller wouldn't be usable anyway. (that information may well be
> outdated).


From the i440fx side, all empty ports in the pci root controller are described
as hotplug capable from ACPI. So I do not see why we cannot hotplug a pci
bridge in one of the pci root ports; the OS should be able to detect it
without a reboot. I have not tried it though.


>
> I think if you're going to mention that it is specifically for
> "cold-plugged bridges" then you should also 1) define what
> "cold-plugged" means, i.e. "(PCI controllers that were present in the
> domain definition when the guest was first started"), and 2) note that
> "ACPI is not enabled for bridges that are hot-plugged (but currently
> libvirt doesn't support hotplugging a pci controller anyway)" or
> something like that.
>
>


Re: [PATCH v3 0/5] Add support for two i386 pm options which control acpi hotplug

2021-09-30 Thread Laine Stump

On 9/30/21 2:16 AM, Ani Sinha wrote:

On Fri, Sep 24, 2021 at 2:16 AM Laine Stump  wrote:


On 9/11/21 11:26 PM, Ani Sinha wrote:

The above two options are only available for qemu driver and that too for x86
guests only. Both of them are global options.

``acpi-hotplug-bridge`` option enables or disables ACPI hotplug support for cold
plugged bridges. Examples of cold plugged bridges include PCI-PCI bridge
(pci-bridge controller) for pc machines and pcie-root-port controller for q35
machines. The corresponding commandline options to qemu for x86 guests are:


The "cold plugged bridges" term here throws me for a loop - it implies
that hotplugging bridges is something that's supported, and I think it
still isn't. Of course this is just the cover letter, so it won't go
into git anywhere, but I think it should be enough to say "enables ACPI
hotplug into non-root bus PCI bridges/ports".


I think emphasizing cold plugged bridges is important as Igor (CC'd)
has clarified in the other email on patch #3 of this series.


Okay, so the implication in Igor's email is that a) it is possible to 
hotplug a pcie controller, but b) any controller that is hotplugged will 
not have ACPI enabled. Note though that libvirt doesn't allow 
hotplugging *any* PCI controller, since we were told long ago that no OS 
will actually rescan the PCI topology when this is done, and so the new 
controller wouldn't be usable anyway. (that information may well be 
outdated).


I think if you're going to mention that it is specifically for 
"cold-plugged bridges" then you should also 1) define what 
"cold-plugged" means, i.e. "(PCI controllers that were present in the 
domain definition when the guest was first started"), and 2) note that 
"ACPI is not enabled for bridges that are hot-plugged (but currently 
libvirt doesn't support hotplugging a pci controller anyway)" or 
something like that.




Re: [PATCH 5/5] qemu: Prefer -numa cpu over -numa node,cpus=

2021-09-30 Thread Peter Krempa
On Tue, Sep 21, 2021 at 16:50:31 +0200, Michal Privoznik wrote:
> QEMU is trying to obsolete -numa node,cpus= because that uses
> ambiguous vCPU id to [socket, die, core, thread] mapping. The new
> form is:
> 
>   -numa cpu,node-id=N,socket-id=S,die-id=D,core-id=C,thread-id=T
> 
> which is repeated for every vCPU and places it at [S, D, C, T]
> into guest NUMA node N.
> 
> While in general this is magic mapping, we can deal with it.
> Firstly, with QEMU 2.7 or newer, libvirt ensures that if topology
> is given then maxvcpus must be sockets * dies * cores * threads
> (i.e. there are no 'holes').
> Secondly, if no topology is given then libvirt itself places each
> vCPU into a different socket (basically, it fakes topology of:
> [maxvcpus, 1, 1, 1])
> Thirdly, we can copy whatever QEMU is doing when mapping vCPUs
> onto topology, to make sure vCPUs don't start to move around.

There's a problem with this premise though and unfortunately we don't
seem to have qemuxml2argvtest for it.

On PPC64, in certain situations the CPU can be configured such that
threads are visible only to VMs. This has substantial impact on how CPUs
are configured using the modern parameters (until now used only for
cpu hotplug purposes, and that's the reason vCPU hotplug has such
complicated incantations when starting the VM).

In the above situation a CPU with topology of:
 sockets=1, cores=4, threads=8 (thus 32 cpus)

will only expose 4 CPU "devices".

 core-id: 0,  core-id: 8, core-id: 16 and core-id: 24

yet the guest will correctly see 32 cpus when used as such.

You can see this in:

tests/qemuhotplugtestcpus/ppc64-modern-individual-monitor.json

Also note that the 'props' object does _not_ have any socket-id, and
management apps are supposed to pass in 'props' as is. (There's a bunch
of code to do that on hotplug).

The problem is that you need to query the topology first (unless we want
to duplicate all of qemu code that has to do with topology state and
keep up with changes to it) to know how it's behaving on current
machine.  This historically was not possible. The supposed solution for
this was the pre-config state where we'd be able to query and set it up
via QMP, but I was not keeping up sufficiently with that work, so I
don't know if it's possible.

If preconfig is a viable option we IMO should start using it sooner
rather than later and avoid duplicating qemu's logic here.

> Note, migration from old to new cmd line works and therefore
> doesn't need any special handling.
> 
> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1678085
> Signed-off-by: Michal Privoznik 
> ---
>  src/qemu/qemu_command.c   | 112 +-
>  .../hugepages-nvdimm.x86_64-latest.args   |   4 +-
>  ...memory-default-hugepage.x86_64-latest.args |  10 +-
>  .../memfd-memory-numa.x86_64-latest.args  |  10 +-
>  ...y-hotplug-nvdimm-access.x86_64-latest.args |   4 +-
>  ...ory-hotplug-nvdimm-align.x86_64-5.2.0.args |   4 +-
>  ...ry-hotplug-nvdimm-align.x86_64-latest.args |   4 +-
>  ...ory-hotplug-nvdimm-label.x86_64-5.2.0.args |   4 +-
>  ...ry-hotplug-nvdimm-label.x86_64-latest.args |   4 +-
>  ...mory-hotplug-nvdimm-pmem.x86_64-5.2.0.args |   4 +-
>  ...ory-hotplug-nvdimm-pmem.x86_64-latest.args |   4 +-
>  ...-hotplug-nvdimm-readonly.x86_64-5.2.0.args |   4 +-
>  ...hotplug-nvdimm-readonly.x86_64-latest.args |   4 +-
>  .../memory-hotplug-nvdimm.x86_64-latest.args  |   4 +-
>  ...mory-hotplug-virtio-pmem.x86_64-5.2.0.args |   4 +-
>  ...ory-hotplug-virtio-pmem.x86_64-latest.args |   4 +-
>  .../numatune-hmat.x86_64-latest.args  |  18 ++-
>  ...emnode-restrictive-mode.x86_64-latest.args |  38 +-
>  .../numatune-memnode.x86_64-5.2.0.args|  38 +-
>  .../numatune-memnode.x86_64-latest.args   |  38 +-
>  ...vhost-user-fs-fd-memory.x86_64-latest.args |   4 +-
>  ...vhost-user-fs-hugepages.x86_64-latest.args |   4 +-
>  ...host-user-gpu-secondary.x86_64-latest.args |   3 +-
>  .../vhost-user-vga.x86_64-latest.args |   3 +-
>  24 files changed, 296 insertions(+), 34 deletions(-)
> 
> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
> index f04ae1e311..5192bd7630 100644
> --- a/src/qemu/qemu_command.c
> +++ b/src/qemu/qemu_command.c

[...]

> @@ -7432,6 +7432,94 @@ qemuBuildNumaCPUs(virBuffer *buf,
>  }
>  
>  
> +/**
> + * qemuTranlsatevCPUID:
> + *
> + * For given vCPU @id and vCPU topology (@cpu) compute corresponding
> + * @socket, @die, @core and @thread). This assumes linear topology,
> + * that is every [socket, die, core, thread] combination is valid vCPU
> + * ID and there are no 'holes'. This is ensured by
> + * qemuValidateDomainDef() if QEMU_CAPS_QUERY_HOTPLUGGABLE_CPUS is
> + * set.

As noted above, this assumption does not hold on PPC64. There are indeed
"holes" in certain cases, while filled with cpus you are e.g. unable to
spread them across multiple numa nodes.

In fact allowing to have two sibling threads to be spread across
multiple NUMA nodes 

Re: translating CD-ROM device paths from i440fx to Q35 in virt-v2v (was: test-v2v-cdrom: update the CD-ROM's bus to SATA in the converted domain)

2021-09-30 Thread Daniel P . Berrangé
On Thu, Sep 30, 2021 at 01:12:39PM +0200, Laszlo Ersek wrote:
> On 09/29/21 21:22, Richard W.M. Jones wrote:
> Please correct me if I'm wrong: at the moment, I believe virt-v2v parses
> and manipulates the following elements and attributes in the domain XML:
> 
>   
>   
>   
>   
>   ^^^   ^^^
> 
> My understanding is however that the target/@dev attribute is mostly
> irrelevant:
> 
> https://libvirt.org/formatdomain.html#hard-drives-floppy-disks-cdroms
> 
> The dev attribute indicates the "logical" device name. The actual
> device name specified is not guaranteed to map to the device name in
> the guest OS. Treat it as a device ordering hint. [...]

I won't say it is irrelevant.  Functionally @dev is absolutely still
important, as it influences how the disk is attached to the VM.

Rather I would say that the @dev attribute is misleading to users,
because they mistakenly think it provides a guarantee that the disk
will appear with this name inside the guest.

> What actually matters is the target/@bus attribute, in combination with
> the sibling element <address>. Such as:
> 
>   
>  ^^^
>   
> ^   ^   ^
> 
>   
>  ^^^
>   
> ^   ^   ^
> 
>   
>  ^^^
>   
> ^   ^   ^
> 
>   
>  ^^^
>   
> ^   ^   ^
> 
> So, target/@dev should be mostly ignored; what matters is the following
> tuple:
> 
>   (target/@bus, address/@controller, address/@bus, address/@unit)

Yes, the <address> is what libvirt internally drives all configuration
off, but in practice application developers almost never use the <address>
element directly.  They will just give target/@dev and libvirt will use
that to automatically populate an <address> element, in order to reliably
fixate the guest ABI thereafter.

> 
> Extracting just the tuples:
> 
>   (ide, 0, 0, 0)
>   (ide, 0, 0, 1)
>   (ide, 0, 1, 0)
>   (ide, 0, 1, 1)
> 
> The first two components of each tuple -- i.e., (ide, 0) -- refer to the
> following IDE controller:
> 
> <controller type='ide' index='0'>
>                   ^^^        ^
>   <address type='pci' domain='0x0000' bus='0x00' slot='0x01'
>    function='0x1'/>
> </controller>
> 
> and then the rest of the components, such as (0, 0), (0, 1), (1, 0), (1,
> 1), identify the disk on that IDE controller.

Yes, that's correct for IDE.

> (Side comment: the PCI location of the (first) IDE controller is fixed
> in QEMU; if one tries to change it, libvirt complains: "Primary IDE
> controller must have PCI address 0:0:1.1".)

Yep, its defined by the QEMU machine type and we just have to accept
that for IDE/SATA.

> Inside the guest, /dev/hd* nodes don't even exist, so it's unlikely that
> /etc/fstab would refer to them. /etc/fstab can however refer to symlinks
> under "/dev/disk/by-id" (for example):
> 
>   lrwxrwxrwx. 1 root root  9 Sep 30 11:54 ata-QEMU_DVD-ROM_QM1 -> ../../sr0
>   lrwxrwxrwx. 1 root root  9 Sep 30 11:54 ata-QEMU_DVD-ROM_QM2 -> ../../sr1
>   lrwxrwxrwx. 1 root root  9 Sep 30 11:54 ata-QEMU_DVD-ROM_QM3 -> ../../sr2
>   lrwxrwxrwx. 1 root root  9 Sep 30 11:54 ata-QEMU_DVD-ROM_QM4 -> ../../sr3
> 
> Furthermore, we have pseudo-files (directories) such as:
> 
>   /sys/devices/pci0000:00/0000:00:01.1/ata1/host0/target0:0:0/0:0:0:0/block/sr0
>   /sys/devices/pci0000:00/0000:00:01.1/ata1/host0/target0:0:1/0:0:1:0/block/sr1
>   /sys/devices/pci0000:00/0000:00:01.1/ata2/host1/target1:0:0/1:0:0:0/block/sr2
>   /sys/devices/pci0000:00/0000:00:01.1/ata2/host1/target1:0:1/1:0:1:0/block/sr3
> 
> So in order to map a device path from the original guest's "/etc/fstab",
> such as "/dev/disk/by-id/ata-QEMU_DVD-ROM_QM3", to the original
> domain XML's  element, we have to do the following in the "source"
> appliance:
> 
> NODE=$(realpath /dev/disk/by-id/ata-QEMU_DVD-ROM_QM3)
> # -> /dev/sr2
> NODE=${NODE#/dev/}
> # -> sr2
> DEVPATH=$(ls -d /sys/devices/pci0000:00/0000:00:01.1/ata?/host?/target?:0:?/?:0:?:0/block/$NODE)
> # -> /sys/devices/pci0000:00/0000:00:01.1/ata2/host1/target1:0:0/1:0:0:0/block/sr2
> 
> And then map the "1:0:0:0" pathname component from $DEVPATH to:
> 
>   <address type='drive' controller='0' bus='1' target='0' unit='0'/>
>                                             ^                  ^
>                                         [1]:0:0:0          1:0:[0]:0
> 
> in the original domain XML. This tells us under what device node the
> original guest sees the host-side file (<source> element).

Yes, to map from the guest to the libvirt XML, you need to be working
in terms of the hardware buses/addresses.


> So, assuming we mapped the original (i440fx)
> "/dev/disk/by-id/ata-QEMU_DVD-ROM_QM3" guest device path to some
> <source> element (= host-side file) 

Re: translating CD-ROM device paths from i440fx to Q35 in virt-v2v (was: test-v2v-cdrom: update the CD-ROM's bus to SATA in the converted domain)

2021-09-30 Thread Richard W.M. Jones
On Thu, Sep 30, 2021 at 01:12:39PM +0200, Laszlo Ersek wrote:
> All this requires virt-v2v to parse complete <disk> elements from the
> original domain XML, and to generate complete <disk> elements in the
> destination domain XML. Is that feasible?

The input is not always (in fact, hardly ever) full libvirt XML.  It's
input specific to the hypervisor.  For VMware it might be:

 - the *.vmx file (the real source of truth) (-i vmx)

 - partial libvirt XML generated by libvirt's vpx driver, but this is
   derived from information from VMware APIs and ultimately that comes
   from the *.vmx file (-i libvirt -ic esx:// or -ic vpx://)

 - the *.ovf file (-i ova)

 - nothing at all! (-i disk)

Also we don't currently try to find or rewrite /dev/disk/ paths in
guest configuration files.  The only rewriting that happens is for
/dev/[hs]d* block device filenames and a few others.  The actual code
that does this is convert/convert_linux.ml:remap_block_devices

So I wouldn't over-think this.  It's likely fine to identify such
devices and rewrite them as "/dev/cdrom", assuming that (I didn't
check) udev creates that symlink for any reasonably modern Linux.  And
if there's more than one attached CD to the source, only convert the
first one and warn about but drop the others.

Note that the aim of virt-v2v is to make it boot on the target, make
sure the network works, and get the user to a login prompt.  The MTV
management app around virt-v2v allows site-specific pre- and post-
configuration to happen (site-specific Ansible playbooks).  There also
exists ssh and remote console.  For people using virt-v2v on the
command line, if they get the guest to boot on the target and not
every device fully works, there is an expectation that they can log in
as root and fix small[*] things.

Rich.

[*] Obviously if the boot / system is completely broken, not that.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top



Re: [PATCH 4/5] qemuBuildNumaCommandLine: Separate out building of CPU list

2021-09-30 Thread Peter Krempa
On Tue, Sep 21, 2021 at 16:50:30 +0200, Michal Privoznik wrote:
> Signed-off-by: Michal Privoznik 
> ---
>  src/qemu/qemu_command.c | 43 ++---
>  1 file changed, 27 insertions(+), 16 deletions(-)

Reviewed-by: Peter Krempa 



Re: [PATCH 3/5] qemuBuildNumaCommandLine: Move vars into loops

2021-09-30 Thread Peter Krempa
On Tue, Sep 21, 2021 at 16:50:29 +0200, Michal Privoznik wrote:
> There are two variables that are used only in a single
> loop. Move their definitions into their respective blocks.
> 
> Signed-off-by: Michal Privoznik 
> ---
>  src/qemu/qemu_command.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Peter Krempa 



Re: [PATCH 2/5] virCPUDefParseXML: Prefer virXMLPropUInt over virXPathUInt

2021-09-30 Thread Peter Krempa
On Tue, Sep 21, 2021 at 16:50:28 +0200, Michal Privoznik wrote:
> When parsing CPU topology, which is described in <topology> element
> attributes, we can use virXMLPropUInt() instead of virXPathUInt()
> as the former results in shorter code.
> 
> Signed-off-by: Michal Privoznik 
> ---
>  src/conf/cpu_conf.c | 41 ++---
>  1 file changed, 18 insertions(+), 23 deletions(-)

Reviewed-by: Peter Krempa 



[libvirt PATCH v3 7/9] virChrdevFree: Use VIR_WITH_MUTEX_LOCK

2021-09-30 Thread Tim Wiederhake
Signed-off-by: Tim Wiederhake 
---
 src/conf/virchrdev.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/conf/virchrdev.c b/src/conf/virchrdev.c
index 0184a1ae10..1cf727d46e 100644
--- a/src/conf/virchrdev.c
+++ b/src/conf/virchrdev.c
@@ -291,10 +291,10 @@ void virChrdevFree(virChrdevs *devs)
 if (!devs)
 return;
 
-virMutexLock(&devs->lock);
-virHashForEachSafe(devs->hash, virChrdevFreeClearCallbacks, NULL);
-virHashFree(devs->hash);
-virMutexUnlock(&devs->lock);
+VIR_WITH_MUTEX_LOCK_GUARD(&devs->lock) {
+virHashForEachSafe(devs->hash, virChrdevFreeClearCallbacks, NULL);
+virHashFree(devs->hash);
+}
 virMutexDestroy(&devs->lock);
 
 g_free(devs);
-- 
2.31.1



[libvirt PATCH v3 6/9] virChrdevFDStreamCloseCb: Use virLockGuardNew

2021-09-30 Thread Tim Wiederhake
Signed-off-by: Tim Wiederhake 
---
 src/conf/virchrdev.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/conf/virchrdev.c b/src/conf/virchrdev.c
index 5d6de68427..0184a1ae10 100644
--- a/src/conf/virchrdev.c
+++ b/src/conf/virchrdev.c
@@ -237,12 +237,10 @@ static void virChrdevFDStreamCloseCb(virStreamPtr st G_GNUC_UNUSED,
                                      void *opaque)
 {
 virChrdevStreamInfo *priv = opaque;
-virMutexLock(&priv->devs->lock);
+VIR_LOCK_GUARD lock = virLockGuardNew(&priv->devs->lock);
 
 /* remove entry from hash */
 virHashRemoveEntry(priv->devs->hash, priv->path);
-
-virMutexUnlock(&priv->devs->lock);
 }
 
 /**
-- 
2.31.1



[libvirt PATCH v3 4/9] virobject: Introduce virObjectLockGuard

2021-09-30 Thread Tim Wiederhake
Typical usage:

    void foobar(virObjectLockable *obj)
    {
        VIR_LOCK_GUARD lock = virObjectLockGuard(obj);
        /* `obj` is locked, and released automatically on scope exit */

        ...
    }

Signed-off-by: Tim Wiederhake 
---
 src/libvirt_private.syms |  1 +
 src/util/virobject.c | 16 
 src/util/virobject.h |  4 
 3 files changed, 21 insertions(+)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index f41674781d..733e40384e 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -2947,6 +2947,7 @@ virObjectListFree;
 virObjectListFreeCount;
 virObjectLock;
 virObjectLockableNew;
+virObjectLockGuard;
 virObjectNew;
 virObjectRef;
 virObjectRWLockableNew;
diff --git a/src/util/virobject.c b/src/util/virobject.c
index 3412985b79..4812b79d56 100644
--- a/src/util/virobject.c
+++ b/src/util/virobject.c
@@ -426,6 +426,22 @@ virObjectGetRWLockableObj(void *anyobj)
 }
 
 
+/**
+ * virObjectLockGuard:
+ * @anyobj: any instance of virObjectLockable
+ *
+ * Acquire a lock on @anyobj that will be managed by the virLockGuard object
+ * returned to the caller.
+ */
+virLockGuard *
+virObjectLockGuard(void *anyobj)
+{
+    virObjectLockable *obj = virObjectGetLockableObj(anyobj);
+
+    return virLockGuardNew(&obj->lock);
+}
+
+
 /**
  * virObjectLock:
  * @anyobj: any instance of virObjectLockable
diff --git a/src/util/virobject.h b/src/util/virobject.h
index 1e34c77744..0c9cbea726 100644
--- a/src/util/virobject.h
+++ b/src/util/virobject.h
@@ -118,6 +118,10 @@ void *
 virObjectRWLockableNew(virClass *klass)
     ATTRIBUTE_NONNULL(1);
 
+virLockGuard *
+virObjectLockGuard(void *lockableobj)
+    ATTRIBUTE_NONNULL(1);
+
 void
 virObjectLock(void *lockableobj)
     ATTRIBUTE_NONNULL(1);
-- 
2.31.1



[libvirt PATCH v3 2/9] virthread: Introduce virLockGuard

2021-09-30 Thread Tim Wiederhake
Locks a virMutex on creation and unlocks it in its destructor.

The VIR_LOCK_GUARD macro is used instead of "g_autoptr(virLockGuard)" to
work around a clang issue (see https://bugs.llvm.org/show_bug.cgi?id=3888
and https://bugs.llvm.org/show_bug.cgi?id=43482).

Typical usage:

    void function(virMutex *m)
    {
        VIR_LOCK_GUARD lock = virLockGuardNew(m);
        /* `m` is locked, and released automatically on scope exit */

        ...
        while (expression) {
            VIR_LOCK_GUARD lock2 = virLockGuardNew(...);
            /* similar */
        }
    }

Signed-off-by: Tim Wiederhake 
---
 src/libvirt_private.syms |  3 +++
 src/util/virthread.c | 26 ++
 src/util/virthread.h | 11 +++
 3 files changed, 40 insertions(+)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 6de9d9aef1..f41674781d 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -3373,6 +3373,9 @@ virCondInit;
 virCondSignal;
 virCondWait;
 virCondWaitUntil;
+virLockGuardFree;
+virLockGuardNew;
+virLockGuardUnlock;
 virMutexDestroy;
 virMutexInit;
 virMutexInitRecursive;
diff --git a/src/util/virthread.c b/src/util/virthread.c
index e89c1a09fb..a5a948985f 100644
--- a/src/util/virthread.c
+++ b/src/util/virthread.c
@@ -96,6 +96,32 @@ void virMutexUnlock(virMutex *m)
 {
     pthread_mutex_unlock(&m->lock);
 }
 
+virLockGuard *virLockGuardNew(virMutex *m)
+{
+    virLockGuard *l = g_new0(virLockGuard, 1);
+    l->mutex = m;
+
+    virMutexLock(l->mutex);
+    return l;
+}
+
+void virLockGuardFree(virLockGuard *l)
+{
+    if (!l)
+        return;
+
+    virLockGuardUnlock(l);
+    g_free(l);
+}
+
+void virLockGuardUnlock(virLockGuard *l)
+{
+    if (!l)
+        return;
+
+    virMutexUnlock(g_steal_pointer(&l->mutex));
+}
+
 
 int virRWLockInit(virRWLock *m)
 {
diff --git a/src/util/virthread.h b/src/util/virthread.h
index 55c8263ae6..d896a6ad3a 100644
--- a/src/util/virthread.h
+++ b/src/util/virthread.h
@@ -31,6 +31,11 @@ struct virMutex {
     pthread_mutex_t lock;
 };
 
+typedef struct virLockGuard virLockGuard;
+struct virLockGuard {
+    virMutex *mutex;
+};
+
 typedef struct virRWLock virRWLock;
 struct virRWLock {
     pthread_rwlock_t lock;
@@ -121,6 +126,12 @@ void virMutexLock(virMutex *m);
 void virMutexUnlock(virMutex *m);
 
 
+virLockGuard *virLockGuardNew(virMutex *m);
+void virLockGuardFree(virLockGuard *l);
+void virLockGuardUnlock(virLockGuard *l);
+G_DEFINE_AUTOPTR_CLEANUP_FUNC(virLockGuard, virLockGuardFree);
+#define VIR_LOCK_GUARD g_autoptr(virLockGuard) G_GNUC_UNUSED
+
 int virRWLockInit(virRWLock *m) G_GNUC_WARN_UNUSED_RESULT;
 void virRWLockDestroy(virRWLock *m);
 
-- 
2.31.1



[libvirt PATCH v3 9/9] lxcDomainDetachDeviceHostdevUSBLive: Use VIR_WITH_OBJECT_LOCK_GUARD

2021-09-30 Thread Tim Wiederhake
Signed-off-by: Tim Wiederhake 
---
 src/lxc/lxc_driver.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c
index e2720a6f89..528af5d164 100644
--- a/src/lxc/lxc_driver.c
+++ b/src/lxc/lxc_driver.c
@@ -4079,9 +4079,9 @@ lxcDomainDetachDeviceHostdevUSBLive(virLXCDriver *driver,
         VIR_WARN("cannot deny device %s for domain %s: %s",
                  dst, vm->def->name, virGetLastErrorMessage());
 
-    virObjectLock(hostdev_mgr->activeUSBHostdevs);
-    virUSBDeviceListDel(hostdev_mgr->activeUSBHostdevs, usb);
-    virObjectUnlock(hostdev_mgr->activeUSBHostdevs);
+    VIR_WITH_OBJECT_LOCK_GUARD(hostdev_mgr->activeUSBHostdevs) {
+        virUSBDeviceListDel(hostdev_mgr->activeUSBHostdevs, usb);
+    }
 
     virDomainHostdevRemove(vm->def, idx);
     virDomainHostdevDefFree(def);
-- 
2.31.1



[libvirt PATCH v3 5/9] virobject: Introduce VIR_WITH_OBJECT_LOCK_GUARD

2021-09-30 Thread Tim Wiederhake
Modeled after "WITH_QEMU_LOCK_GUARD" (see qemu's include/qemu/lockable.h).
Uses "__LINE__" instead of "__COUNTER__", as the latter is a GNU extension.

See comment for typical usage.

Signed-off-by: Tim Wiederhake 
---
 src/util/virobject.h | 20 
 1 file changed, 20 insertions(+)

diff --git a/src/util/virobject.h b/src/util/virobject.h
index 0c9cbea726..a95985b69a 100644
--- a/src/util/virobject.h
+++ b/src/util/virobject.h
@@ -148,3 +148,23 @@ virObjectListFree(void *list);
 void
 virObjectListFreeCount(void *list,
                        size_t count);
+
+/**
+ * VIR_WITH_OBJECT_LOCK_GUARD:
+ *
+ * This macro defines a lock scope such that entering the scope takes the lock
+ * and leaving the scope releases the lock. Return statements are allowed
+ * within the scope and release the lock. Break and continue statements leave
+ * the scope early and release the lock.
+ *
+ * virObjectLockable *lockable = ...;
+ *
+ * VIR_WITH_OBJECT_LOCK_GUARD(lockable) {
+ * // `lockable` is locked, and released automatically on scope exit
+ * ...
+ * }
+ */
+#define VIR_WITH_OBJECT_LOCK_GUARD(o) \
+    for (g_autoptr(virLockGuard) CONCAT(var, __LINE__) = virObjectLockGuard(o); \
+         CONCAT(var, __LINE__); \
+         CONCAT(var, __LINE__) = (virLockGuardFree(CONCAT(var, __LINE__)), NULL))
-- 
2.31.1



[libvirt PATCH v3 3/9] virthread: Introduce VIR_WITH_MUTEX_LOCK_GUARD

2021-09-30 Thread Tim Wiederhake
Modeled after "WITH_QEMU_LOCK_GUARD" (see qemu's include/qemu/lockable.h).
Uses "__LINE__" instead of "__COUNTER__", as the latter is a GNU extension.

See comment for typical usage.

Signed-off-by: Tim Wiederhake 
---
 src/util/virthread.h | 20 
 1 file changed, 20 insertions(+)

diff --git a/src/util/virthread.h b/src/util/virthread.h
index d896a6ad3a..78b3684193 100644
--- a/src/util/virthread.h
+++ b/src/util/virthread.h
@@ -210,3 +210,23 @@ int virThreadLocalSet(virThreadLocal *l, void*) G_GNUC_WARN_UNUSED_RESULT;
     return 0; \
 } \
 struct classname ## EatSemicolon
+
+/**
+ * VIR_WITH_MUTEX_LOCK_GUARD:
+ *
+ * This macro defines a lock scope such that entering the scope takes the lock
+ * and leaving the scope releases the lock. Return statements are allowed
+ * within the scope and release the lock. Break and continue statements leave
+ * the scope early and release the lock.
+ *
+ * virMutex *mutex = ...;
+ *
+ * VIR_WITH_MUTEX_LOCK_GUARD(mutex) {
+ * // `mutex` is locked, and released automatically on scope exit
+ * ...
+ * }
+ */
+#define VIR_WITH_MUTEX_LOCK_GUARD(m) \
+    for (g_autoptr(virLockGuard) CONCAT(var, __LINE__) = virLockGuardNew(m); \
+         CONCAT(var, __LINE__); \
+         CONCAT(var, __LINE__) = (virLockGuardFree(CONCAT(var, __LINE__)), NULL))
-- 
2.31.1



[libvirt PATCH v3 0/9] Automatic mutex management

2021-09-30 Thread Tim Wiederhake
V1: https://listman.redhat.com/archives/libvir-list/2021-August/msg00823.html
V2: https://listman.redhat.com/archives/libvir-list/2021-September/msg00249.html

Changes since V2:

* Dropped VIR_XPATH_NODE_AUTORESTORE simplification. Moved to a
  separate series.
* Dropped the attempt to work around g_auto* / clang unused variable
  warnings (See https://bugs.llvm.org/show_bug.cgi?id=3888,
  https://bugs.llvm.org/show_bug.cgi?id=43482, and
  https://gitlab.gnome.org/GNOME/glib/-/merge_requests/2272).
  Instead, introduce VIR_LOCK_GUARD macro that adds G_GNUC_UNUSED.

Regards,
Tim

Tim Wiederhake (9):
  internal: Add CONCAT macro
  virthread: Introduce virLockGuard
  virthread: Introduce VIR_WITH_MUTEX_LOCK_GUARD
  virobject: Introduce virObjectLockGuard
  virobject: Introduce VIR_WITH_OBJECT_LOCK_GUARD
  virChrdevFDStreamCloseCb: Use virLockGuardNew
  virChrdevFree: Use VIR_WITH_MUTEX_LOCK_GUARD
  bhyveAutostartDomain: Use virObjectLockGuard
  lxcDomainDetachDeviceHostdevUSBLive: Use VIR_WITH_OBJECT_LOCK_GUARD

 src/bhyve/bhyve_driver.c |  4 ++--
 src/conf/virchrdev.c | 12 +---
 src/internal.h   |  3 +++
 src/libvirt_private.syms |  4 
 src/lxc/lxc_driver.c |  6 +++---
 src/util/virobject.c | 16 
 src/util/virobject.h | 24 
 src/util/virthread.c | 26 ++
 src/util/virthread.h | 31 +++
 9 files changed, 114 insertions(+), 12 deletions(-)

-- 
2.31.1




[libvirt PATCH v3 1/9] internal: Add CONCAT macro

2021-09-30 Thread Tim Wiederhake
Using the two-step idiom to force resolution of other macros, e.g.:

  #define bar BAR
  CONCAT_(foo, bar) // foobar
  CONCAT(foo, bar)  // fooBAR

Signed-off-by: Tim Wiederhake 
---
 src/internal.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/internal.h b/src/internal.h
index e1250a59fe..48188e6fa3 100644
--- a/src/internal.h
+++ b/src/internal.h
@@ -93,6 +93,9 @@
 
 #define NUL_TERMINATE(buf) do { (buf)[sizeof(buf)-1] = '\0'; } while (0)
 
+#define CONCAT_(a, b) a ## b
+#define CONCAT(a, b) CONCAT_(a, b)
+
 #ifdef WIN32
 # ifndef O_CLOEXEC
 #  define O_CLOEXEC _O_NOINHERIT
-- 
2.31.1



[libvirt PATCH v3 8/9] bhyveAutostartDomain: Use virObjectLockGuard

2021-09-30 Thread Tim Wiederhake
Signed-off-by: Tim Wiederhake 
---
 src/bhyve/bhyve_driver.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/bhyve/bhyve_driver.c b/src/bhyve/bhyve_driver.c
index 516490f6cd..c41de30be9 100644
--- a/src/bhyve/bhyve_driver.c
+++ b/src/bhyve/bhyve_driver.c
@@ -87,7 +87,8 @@ bhyveAutostartDomain(virDomainObj *vm, void *opaque)
 {
     const struct bhyveAutostartData *data = opaque;
     int ret = 0;
-    virObjectLock(vm);
+    VIR_LOCK_GUARD lock = virObjectLockGuard(vm);
+
     if (vm->autostart && !virDomainObjIsActive(vm)) {
         virResetLastError();
         ret = virBhyveProcessStart(data->conn, vm,
@@ -98,7 +99,6 @@ bhyveAutostartDomain(virDomainObj *vm, void *opaque)
                    vm->def->name, virGetLastErrorMessage());
         }
     }
-    virObjectUnlock(vm);
     return ret;
 }
 
-- 
2.31.1



Re: [PATCH 1/5] virCPUDefParseXML: Parse uint using virXPathUInt()

2021-09-30 Thread Peter Krempa
On Tue, Sep 21, 2021 at 16:50:27 +0200, Michal Privoznik wrote:
> There is no need to use virXPathULong() and a temporary UL
> variable if we can use virXPathUInt() directly.
> 
> Signed-off-by: Michal Privoznik 
> ---
>  src/conf/cpu_conf.c | 14 --
>  1 file changed, 4 insertions(+), 10 deletions(-)

Reviewed-by: Peter Krempa 



translating CD-ROM device paths from i440fx to Q35 in virt-v2v (was: test-v2v-cdrom: update the CD-ROM's bus to SATA in the converted domain)

2021-09-30 Thread Laszlo Ersek
(+libvirt-devel)

On 09/29/21 21:22, Richard W.M. Jones wrote:

> We currently partially install the virtio block drivers in the Windows
> guest (just enough to get the guest to boot on the target), and
> Windows itself re-installs the virtio block driver and other drivers
> it needs, and that's enough to get it to see C:
>
> As for other hard disk partitions, Windows does indeed contain a
> mapping to other drives in the Registry but IIRC it's not sensitive to
> the device driver (unlike Linux /dev/vdX vs /dev/sdX).  If you're
> interested in that, see libguestfs.git/daemon/inspect_fs_windows.ml:
> get_drive_mappings.  We never bothered with attempting to handle
> conversion of floppy drives or CD-ROMs for Windows.

OK. So AIUI, that means no work is needed here for Windows.

> On Linux we do better: We iterate over all the configuration files in
> /etc and change device paths.  The significance of this bug is we need
> to change (eg) /dev/hdc to /dev/.  The difficulty is
> working out where the device will appear on the target and not having
> it conflict with any hard disk, something we partly control (see
> virt-v2v.git/convert/target_bus_assignment.ml*)

AIUI the conflict avoidance logic ("no overlapping disks") is already in
place. The question is how to translate device paths in /etc/fstab and
similar.

Please correct me if I'm wrong: at the moment, I believe virt-v2v parses
and manipulates the following elements and attributes in the domain XML:

  <disk type='file' device='cdrom'>
    <target dev='...' bus='...'/>
            ^^^       ^^^
  </disk>

My understanding is however that the target/@dev attribute is mostly
irrelevant:

https://libvirt.org/formatdomain.html#hard-drives-floppy-disks-cdroms

The dev attribute indicates the "logical" device name. The actual
device name specified is not guaranteed to map to the device name in
the guest OS. Treat it as a device ordering hint. [...]

What actually matters is the target/@bus attribute, in combination with
the sibling <address> element. Such as:

  <target dev='...' bus='ide'/>
                    ^^^
  <address type='drive' controller='0' bus='0' unit='0'/>
                                    ^        ^        ^

  <target dev='...' bus='ide'/>
                    ^^^
  <address type='drive' controller='0' bus='0' unit='1'/>
                                    ^        ^        ^

  <target dev='...' bus='ide'/>
                    ^^^
  <address type='drive' controller='0' bus='1' unit='0'/>
                                    ^        ^        ^

  <target dev='...' bus='ide'/>
                    ^^^
  <address type='drive' controller='0' bus='1' unit='1'/>
                                    ^        ^        ^

So, target/@dev should be mostly ignored; what matters is the following
tuple:

  (target/@bus, address/@controller, address/@bus, address/@unit)

Extracting just the tuples:

  (ide, 0, 0, 0)
  (ide, 0, 0, 1)
  (ide, 0, 1, 0)
  (ide, 0, 1, 1)

The first two components of each tuple -- i.e., (ide, 0) -- refer to the
following IDE controller:


^^ ^
  


and then the rest of the components, such as (0, 0), (0, 1), (1, 0), (1,
1), identify the disk on that IDE controller.

(Side comment: the PCI location of the (first) IDE controller is fixed
in QEMU; if one tries to change it, libvirt complains: "Primary IDE
controller must have PCI address 0:0:1.1".)

(Side comment: on the QEMU command line, this maps to

  -device ide-cd,bus=ide.0,unit=0,... \
  -device ide-cd,bus=ide.0,unit=1,... \
  -device ide-cd,bus=ide.1,unit=0,... \
  -device ide-cd,bus=ide.1,unit=1,... \
)

Inside the guest, /dev/hd* nodes don't even exist, so it's unlikely that
/etc/fstab would refer to them. /etc/fstab can however refer to symlinks
under "/dev/disk/by-id" (for example):

  lrwxrwxrwx. 1 root root  9 Sep 30 11:54 ata-QEMU_DVD-ROM_QM00001 -> ../../sr0
  lrwxrwxrwx. 1 root root  9 Sep 30 11:54 ata-QEMU_DVD-ROM_QM00002 -> ../../sr1
  lrwxrwxrwx. 1 root root  9 Sep 30 11:54 ata-QEMU_DVD-ROM_QM00003 -> ../../sr2
  lrwxrwxrwx. 1 root root  9 Sep 30 11:54 ata-QEMU_DVD-ROM_QM00004 -> ../../sr3

Furthermore, we have pseudo-files (directories) such as:

  /sys/devices/pci0000:00/0000:00:01.1/ata1/host0/target0:0:0/0:0:0:0/block/sr0
  /sys/devices/pci0000:00/0000:00:01.1/ata1/host0/target0:0:1/0:0:1:0/block/sr1
  /sys/devices/pci0000:00/0000:00:01.1/ata2/host1/target1:0:0/1:0:0:0/block/sr2
  /sys/devices/pci0000:00/0000:00:01.1/ata2/host1/target1:0:1/1:0:1:0/block/sr3
                                       ^^^^                   ^^^^^^^

So in order to map a device path from the original guest's "/etc/fstab",
such as "/dev/disk/by-id/ata-QEMU_DVD-ROM_QM00003", to the original
domain XML's <disk> element, we have to do the following in the "source"
appliance:

NODE=$(realpath /dev/disk/by-id/ata-QEMU_DVD-ROM_QM00003)
# -> /dev/sr2
NODE=${NODE#/dev/}
# -> sr2
DEVPATH=$(ls -d /sys/devices/pci0000:00/0000:00:01.1/ata?/host?/target?:0:?/?:0:?:0/block/$NODE)
# -> /sys/devices/pci0000:00/0000:00:01.1/ata2/host1/target1:0:0/1:0:0:0/block/sr2

And then map the "1:0:0:0" pathname component from $DEVPATH to:

  <address type='drive' controller='0' bus='1' unit='0'/>
                                       ^^^     ^^^
 

Re: [PATCH 0/1] vmx: Fix mapping

2021-09-30 Thread Richard W.M. Jones
On Thu, Sep 30, 2021 at 09:47:01AM +0100, Daniel P. Berrangé wrote:
> On Thu, Sep 30, 2021 at 08:33:48AM +0100, Richard W.M. Jones wrote:
> > I propose we deprecate the guid parameter in:
> > 
> >   -device vmgenid,guid=8987940a-0951-2cc5-e815-10634ff550b9,id=vmgenid0
> > 
> > Instead it will be replaced by bytes= which will simply write
> > the bytes, in the order they appear, into guest memory with no
> > attempt to interpret or byte-swap.  Something like:
> > 
> >   -device vmgenid,bytes=112233445566778899aabbccddeeff00,id=vmgenid0
> > 
> > (guid although deprecated will need to be kept around for a while,
> > along with its weird byte-swapping behaviour).
> > 
> > We will then have a plain and simple method to emulate the behaviour
> > of other hypervisors.  We will look at exactly what bytes they write
> > to guest memory and copy that behaviour when v2v converting from those
> > hypervisors.
> 
> From the libvirt POV, I'm not expecting anything in QEMU to change
> in this respect. If guid is replaced by a new attribute taking data
> in a different way, then libvirt will have to remap itself, so that
> existing usage in libvirt keeps working the same way as it did with
> guid.  Essentially from libvirt's POV, it is simply a documentation
> issue to specify how the libvirt XML representation translates to
> the guest visible representation, and ensure that all libvirt drivers
> implement it the same way. The QEMU genid support arrived first so
> that set the standard for how libvirt will represent it, that all
> further libvirt hypervisor drivers need to match.

I was going to suggest something like:

  <genid type='guid'>aa-bb-cc..</genid>
or
  <genid type='...'>aabbcc..</genid>

with the type defaulting to guid for backwards compatibility.

Does libvirt XML have any other fields were you're passing
essentially small snippets of binary data?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org



Re: [PATCH 0/1] vmx: Fix mapping

2021-09-30 Thread Daniel P . Berrangé
On Thu, Sep 30, 2021 at 08:33:48AM +0100, Richard W.M. Jones wrote:
> 
> More data: I found a colleague who has a Hyper-V instance with a
> Windows guest and he helped me to understand how Hyper-V represents
> generation IDs.  Hyper-V has had support for generation IDs since long
> before Microsoft proposed the feature for standardization.  Originally
> (I think pre-2013) Hyper-V used an XML description which included:
> 
> <generation_id>4a07df4361fdf4c883f97bc30e524b9d</generation_id>
> 
> Note this is a pure hex string, not a GUID.
> 
> In current Hyper-V, the same representation is used but it's embedded
> in a binary file.
> 
> My colleague ran two Windows VMs on Hyper-V and examined the
> generation_id on the hypervisor and compared it to the output of
> VMGENID.EXE inside the guest.
> 
> The first guest had this in the binary hypervisor metadata:
> 
> 56b0  00 00 0e 67 65 6e 65 72  61 74 69 6f 6e 5f 69 64  |...generation_id|
> 56c0  00 40 00 00 00 38 00 30  00 35 00 61 00 32 00 38  |.@...8.0.5.a.2.8|
> 56d0  00 37 00 61 00 32 00 35  00 30 00 39 00 38 00 39  |.7.a.2.5.0.9.8.9|
> 56e0  00 65 00 34 00 66 00 65  00 36 00 66 00 36 00 39  |.e.4.f.e.6.f.6.9|
> 56f0  00 39 00 32 00 62 00 65  00 33 00 33 00 34 00 61  |.9.2.b.e.3.3.4.a|
> 5700  00 34 00 33 00 00 00 00  00 00 00 00 00 00 00 00  |.4.3|
> 
> and reported the output of VMGENID in this guest as:
> 
> VmCounterValue: fe6f6992be334a43:805a287a250989e4
> 
> The second guest had:
> 
> 5360  c5 06 00 00 00 0e 67 65  6e 65 72 61 74 69 6f 6e  |..generation|
> 5370  5f 69 64 00 40 00 00 00  65 00 62 00 66 00 62 00  |_id.@...e.b.f.b.|
> 5380  62 00 37 00 39 00 37 00  33 00 36 00 35 00 37 00  |b.7.9.7.3.6.5.7.|
> 5390  32 00 38 00 65 00 37 00  30 00 62 00 33 00 66 00  |2.8.e.7.0.b.3.f.|
> 53a0  64 00 33 00 63 00 37 00  31 00 33 00 65 00 63 00  |d.3.c.7.1.3.e.c.|
> 53b0  65 00 38 00 34 00 32 00  00 00 00 00 00 00 00 00  |e.8.4.2.|
> 
> and VMGENID was:
> 
> VmCounterValue: b3fd3c713ece842:ebfbb797365728e7
> 
> Note:
> 
>  - In both cases, the generation ID is clearly not a GUID.
> 
>  - The two 64 bit words are swapped over somewhere, but individual
>bytes are not byte-swapped, and there is again no evidence of
>UUID-aware byte swapping.
> 
> --
> 
> So how do we get from where we are to a more sane vmgenid in qemu?
> 
> I propose we deprecate the guid parameter in:
> 
>   -device vmgenid,guid=8987940a-0951-2cc5-e815-10634ff550b9,id=vmgenid0
> 
> Instead it will be replaced by bytes= which will simply write
> the bytes, in the order they appear, into guest memory with no
> attempt to interpret or byte-swap.  Something like:
> 
>   -device vmgenid,bytes=112233445566778899aabbccddeeff00,id=vmgenid0
> 
> (guid although deprecated will need to be kept around for a while,
> along with its weird byte-swapping behaviour).
> 
> We will then have a plain and simple method to emulate the behaviour
> of other hypervisors.  We will look at exactly what bytes they write
> to guest memory and copy that behaviour when v2v converting from those
> hypervisors.

From the libvirt POV, I'm not expecting anything in QEMU to change
in this respect. If guid is replaced by a new attribute taking data
in a different way, then libvirt will have to remap itself, so that
existing usage in libvirt keeps working the same way as it did with
guid.  Essentially from libvirt's POV, it is simply a documentation
issue to specify how the libvirt XML representation translates to
the guest visible representation, and ensure that all libvirt drivers
implement it the same way. The QEMU genid support arrived first so
that set the standard for how libvirt will represent it, that all
further libvirt hypervisor drivers need to match.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [PATCH 0/1] vmx: Fix mapping

2021-09-30 Thread Laszlo Ersek
On 09/30/21 09:33, Richard W.M. Jones wrote:
> 
> More data: I found a colleague who has a Hyper-V instance with a
> Windows guest and he helped me to understand how Hyper-V represents
> generation IDs.  Hyper-V has had support for generation IDs since long
> before Microsoft proposed the feature for standardization.  Originally
> (I think pre-2013) Hyper-V used an XML description which included:
> 
> <generation_id>4a07df4361fdf4c883f97bc30e524b9d</generation_id>
> 
> Note this is a pure hex string, not a GUID.
> 
> In current Hyper-V, the same representation is used but it's embedded
> in a binary file.
> 
> My colleague ran two Windows VMs on Hyper-V and examined the
> generation_id on the hypervisor and compared it to the output of
> VMGENID.EXE inside the guest.
> 
> The first guest had this in the binary hypervisor metadata:
> 
> 56b0  00 00 0e 67 65 6e 65 72  61 74 69 6f 6e 5f 69 64  |...generation_id|
> 56c0  00 40 00 00 00 38 00 30  00 35 00 61 00 32 00 38  |.@...8.0.5.a.2.8|
> 56d0  00 37 00 61 00 32 00 35  00 30 00 39 00 38 00 39  |.7.a.2.5.0.9.8.9|
> 56e0  00 65 00 34 00 66 00 65  00 36 00 66 00 36 00 39  |.e.4.f.e.6.f.6.9|
> 56f0  00 39 00 32 00 62 00 65  00 33 00 33 00 34 00 61  |.9.2.b.e.3.3.4.a|
> 5700  00 34 00 33 00 00 00 00  00 00 00 00 00 00 00 00  |.4.3|
> 
> and reported the output of VMGENID in this guest as:
> 
> VmCounterValue: fe6f6992be334a43:805a287a250989e4
> 
> The second guest had:
> 
> 5360  c5 06 00 00 00 0e 67 65  6e 65 72 61 74 69 6f 6e  |..generation|
> 5370  5f 69 64 00 40 00 00 00  65 00 62 00 66 00 62 00  |_id.@...e.b.f.b.|
> 5380  62 00 37 00 39 00 37 00  33 00 36 00 35 00 37 00  |b.7.9.7.3.6.5.7.|
> 5390  32 00 38 00 65 00 37 00  30 00 62 00 33 00 66 00  |2.8.e.7.0.b.3.f.|
> 53a0  64 00 33 00 63 00 37 00  31 00 33 00 65 00 63 00  |d.3.c.7.1.3.e.c.|
> 53b0  65 00 38 00 34 00 32 00  00 00 00 00 00 00 00 00  |e.8.4.2.|
> 
> and VMGENID was:
> 
> VmCounterValue: b3fd3c713ece842:ebfbb797365728e7
> 
> Note:
> 
>  - In both cases, the generation ID is clearly not a GUID.
> 
>  - The two 64 bit words are swapped over somewhere, but individual
>bytes are not byte-swapped, and there is again no evidence of
>UUID-aware byte swapping.
> 
> --
> 
> So how do we get from where we are to a more sane vmgenid in qemu?
> 
> I propose we deprecate the guid parameter in:
> 
>   -device vmgenid,guid=8987940a-0951-2cc5-e815-10634ff550b9,id=vmgenid0
> 
> Instead it will be replaced by bytes= which will simply write
> the bytes, in the order they appear, into guest memory with no
> attempt to interpret or byte-swap.  Something like:
> 
>   -device vmgenid,bytes=112233445566778899aabbccddeeff00,id=vmgenid0
> 
> (guid although deprecated will need to be kept around for a while,
> along with its weird byte-swapping behaviour).
> 
> We will then have a plain and simple method to emulate the behaviour
> of other hypervisors.  We will look at exactly what bytes they write
> to guest memory and copy that behaviour when v2v converting from those
> hypervisors.

I don't have anything against this, I just don't know who's going to
implement the QEMU change.

Thanks
Laszlo



Re: [libvirt][PATCH v7 5/5] Add get domaincaps unit test

2021-09-30 Thread Daniel P . Berrangé
On Thu, Sep 30, 2021 at 01:39:20AM +, Huang, Haibin wrote:
> 
> 
> > -Original Message-
> > From: Daniel P. Berrangé 
> > Sent: Tuesday, September 28, 2021 10:15 PM
> > To: Huang, Haibin 
> > Cc: libvir-list@redhat.com; Ding, Jian-feng ; 
> > Yang,
> > Lin A ; Lu, Lianhao ;
> > pbonz...@redhat.com; pkre...@redhat.com; twied...@redhat.com;
> > phrd...@redhat.com; mpriv...@redhat.com
> > Subject: Re: [libvirt][PATCH v7 5/5] Add get domaincaps unit test
> > 
> > On Wed, Sep 08, 2021 at 09:15:58AM +0800, Haibin Huang wrote:
> > > Signed-off-by: Haibin Huang 
> > > ---
> > >  tests/domaincapsdata/bhyve_basic.x86_64.xml   | 1 +
> > >  tests/domaincapsdata/bhyve_fbuf.x86_64.xml| 1 +
> > >  tests/domaincapsdata/bhyve_uefi.x86_64.xml| 1 +
> > >  tests/domaincapsdata/empty.xml| 1 +
> > >  tests/domaincapsdata/libxl-xenfv.xml  | 1 +
> > >  tests/domaincapsdata/libxl-xenpv.xml  | 1 +
> > >  tests/domaincapsdata/qemu_2.11.0-q35.x86_64.xml   | 1 +
> > >  tests/domaincapsdata/qemu_2.11.0-tcg.x86_64.xml   | 1 +
> > >  tests/domaincapsdata/qemu_2.11.0.s390x.xml| 1 +
> > >  tests/domaincapsdata/qemu_2.11.0.x86_64.xml   | 1 +
> > >  tests/domaincapsdata/qemu_2.12.0-q35.x86_64.xml   | 1 +
> > >  tests/domaincapsdata/qemu_2.12.0-tcg.x86_64.xml   | 1 +
> > >  tests/domaincapsdata/qemu_2.12.0-virt.aarch64.xml | 1 +
> > >  tests/domaincapsdata/qemu_2.12.0.aarch64.xml  | 1 +
> > >  tests/domaincapsdata/qemu_2.12.0.ppc64.xml| 1 +
> > >  tests/domaincapsdata/qemu_2.12.0.s390x.xml| 1 +
> > >  tests/domaincapsdata/qemu_2.12.0.x86_64.xml   | 1 +
> > >  tests/domaincapsdata/qemu_2.4.0-q35.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_2.4.0-tcg.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_2.4.0.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_2.5.0-q35.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_2.5.0-tcg.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_2.5.0.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_2.6.0-q35.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_2.6.0-tcg.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_2.6.0-virt.aarch64.xml  | 1 +
> > >  tests/domaincapsdata/qemu_2.6.0.aarch64.xml   | 1 +
> > >  tests/domaincapsdata/qemu_2.6.0.ppc64.xml | 1 +
> > >  tests/domaincapsdata/qemu_2.6.0.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_2.7.0-q35.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_2.7.0-tcg.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_2.7.0.s390x.xml | 1 +
> > >  tests/domaincapsdata/qemu_2.7.0.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_2.8.0-q35.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_2.8.0-tcg.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_2.8.0.s390x.xml | 1 +
> > >  tests/domaincapsdata/qemu_2.8.0.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_2.9.0-q35.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_2.9.0-tcg.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_2.9.0.ppc64.xml | 1 +
> > >  tests/domaincapsdata/qemu_2.9.0.s390x.xml | 1 +
> > >  tests/domaincapsdata/qemu_2.9.0.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_3.0.0-q35.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_3.0.0-tcg.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_3.0.0.ppc64.xml | 1 +
> > >  tests/domaincapsdata/qemu_3.0.0.s390x.xml | 1 +
> > >  tests/domaincapsdata/qemu_3.0.0.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_3.1.0-q35.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_3.1.0-tcg.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_3.1.0.ppc64.xml | 1 +
> > >  tests/domaincapsdata/qemu_3.1.0.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_4.0.0-q35.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_4.0.0-tcg.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_4.0.0-virt.aarch64.xml  | 1 +
> > >  tests/domaincapsdata/qemu_4.0.0.aarch64.xml   | 1 +
> > >  tests/domaincapsdata/qemu_4.0.0.ppc64.xml | 1 +
> > >  tests/domaincapsdata/qemu_4.0.0.s390x.xml | 1 +
> > >  tests/domaincapsdata/qemu_4.0.0.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_4.1.0-q35.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_4.1.0-tcg.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_4.1.0.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_4.2.0-q35.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_4.2.0-tcg.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_4.2.0-virt.aarch64.xml  | 1 +
> > >  tests/domaincapsdata/qemu_4.2.0.aarch64.xml   | 1 +
> > >  tests/domaincapsdata/qemu_4.2.0.ppc64.xml | 1 +
> > >  tests/domaincapsdata/qemu_4.2.0.s390x.xml | 1 +
> > >  tests/domaincapsdata/qemu_4.2.0.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_5.0.0-q35.x86_64.xml| 1 +
> > >  tests/domaincapsdata/qemu_5.0.0-tcg.x86_64.xml| 1 +
> > > 

Re: [PATCH 0/1] vmx: Fix mapping

2021-09-30 Thread Richard W.M. Jones


More data: I found a colleague who has a Hyper-V instance with a
Windows guest and he helped me to understand how Hyper-V represents
generation IDs.  Hyper-V has had support for generation IDs since long
before Microsoft proposed the feature for standardization.  Originally
(I think pre-2013) Hyper-V used an XML description which included:

<generation_id>4a07df4361fdf4c883f97bc30e524b9d</generation_id>

Note this is a pure hex string, not a GUID.

In current Hyper-V, the same representation is used but it's embedded
in a binary file.

My colleague ran two Windows VMs on Hyper-V and examined the
generation_id on the hypervisor and compared it to the output of
VMGENID.EXE inside the guest.

The first guest had this in the binary hypervisor metadata:

56b0  00 00 0e 67 65 6e 65 72  61 74 69 6f 6e 5f 69 64  |...generation_id|
56c0  00 40 00 00 00 38 00 30  00 35 00 61 00 32 00 38  |.@...8.0.5.a.2.8|
56d0  00 37 00 61 00 32 00 35  00 30 00 39 00 38 00 39  |.7.a.2.5.0.9.8.9|
56e0  00 65 00 34 00 66 00 65  00 36 00 66 00 36 00 39  |.e.4.f.e.6.f.6.9|
56f0  00 39 00 32 00 62 00 65  00 33 00 33 00 34 00 61  |.9.2.b.e.3.3.4.a|
5700  00 34 00 33 00 00 00 00  00 00 00 00 00 00 00 00  |.4.3|

and reported the output of VMGENID in this guest as:

VmCounterValue: fe6f6992be334a43:805a287a250989e4

The second guest had:

5360  c5 06 00 00 00 0e 67 65  6e 65 72 61 74 69 6f 6e  |..generation|
5370  5f 69 64 00 40 00 00 00  65 00 62 00 66 00 62 00  |_id.@...e.b.f.b.|
5380  62 00 37 00 39 00 37 00  33 00 36 00 35 00 37 00  |b.7.9.7.3.6.5.7.|
5390  32 00 38 00 65 00 37 00  30 00 62 00 33 00 66 00  |2.8.e.7.0.b.3.f.|
53a0  64 00 33 00 63 00 37 00  31 00 33 00 65 00 63 00  |d.3.c.7.1.3.e.c.|
53b0  65 00 38 00 34 00 32 00  00 00 00 00 00 00 00 00  |e.8.4.2.|

and VMGENID was:

VmCounterValue: b3fd3c713ece842:ebfbb797365728e7

Note:

 - In both cases, the generation ID is clearly not a GUID.

 - The two 64-bit words are swapped over somewhere, but individual
   bytes are not byte-swapped, and there is again no evidence of
   UUID-aware byte swapping.
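The transformation both guests exhibit can be captured in a few lines. This is my own reconstruction from the two dumps above, not anything taken from Hyper-V or QEMU source: split the 32-hex-digit generation_id into two 64-bit halves, swap them, and print each half without leading zeros, as VMGENID.EXE appears to do.

```python
def vmgenid_counter(hex32):
    """Reproduce VMGENID.EXE's output from Hyper-V's generation_id string:
    the two 64-bit halves are swapped, with no per-byte or UUID-style
    swizzling, and leading zeros are dropped (printf-style %llx)."""
    assert len(hex32) == 32
    hi, lo = hex32[:16], hex32[16:]
    return "%x:%x" % (int(lo, 16), int(hi, 16))

# Guest 1 from the dump above:
print(vmgenid_counter("805a287a250989e4fe6f6992be334a43"))
# -> fe6f6992be334a43:805a287a250989e4
```

Guest 2 comes out the same way: "ebfbb797365728e70b3fd3c713ece842" yields "b3fd3c713ece842:ebfbb797365728e7", matching the reported VmCounterValue (note the dropped leading zero).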

--

So how do we get from where we are to a more sane vmgenid in qemu?

I propose we deprecate the guid parameter in:

  -device vmgenid,guid=8987940a-0951-2cc5-e815-10634ff550b9,id=vmgenid0

Instead it will be replaced by a bytes= parameter which will simply
write the bytes, in the order they appear, into guest memory with no
attempt to interpret or byte-swap them.  Something like:

  -device vmgenid,bytes=112233445566778899aabbccddeeff00,id=vmgenid0

(guid, although deprecated, will need to be kept around for a while,
along with its weird byte-swapping behaviour).
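For reference, the swizzling the existing guid= path performs matches (as I understand it) the standard mixed-endian GUID layout, where the first three UUID fields are stored little-endian. Python's stdlib expresses that layout directly; this is a sketch of the encoding itself, not of QEMU's actual code:

```python
import uuid

def guid_wire_bytes(guid_str):
    """Mixed-endian GUID layout: time_low, time_mid and time_hi_and_version
    are stored little-endian; the remaining 8 bytes keep their order.
    This is the per-field swapping that a guid= parameter implies, and
    exactly what a raw bytes= parameter would bypass."""
    return uuid.UUID(guid_str).bytes_le.hex()

print(guid_wire_bytes("12345678-1234-5678-1234-567812345678"))
# -> 78563412341278561234567812345678
```

With bytes=, the hypervisor-visible bytes would simply be the hex string as given, with no such field-level reordering.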

We will then have a plain and simple method to emulate the behaviour
of other hypervisors.  We will look at exactly what bytes they write
to guest memory and copy that behaviour when v2v converting from those
hypervisors.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/



Squelch 'eof from qemu monitor' error on normal VM shutdown

2021-09-30 Thread Jim Fehlig

Hi All,

Likely Christian received a bug report that motivated commit aeda1b8c56, which 
was later reverted by Michal with commit 72adaf2f10. In the past, I recall being 
asked about "internal error: End of file from qemu monitor" on normal VM 
shutdown and giving a hand-wavy response using some of Michal's words from the 
revert commit message.


I recently received a bug report (sorry, but no public link) from a concerned 
user about this error and wondered if there is some way to improve it. I went 
down some dead ends before circling back to Christian's patch. With the patch 
rebased to latest master, I cannot reproduce the hangs reported by Michal [1]. 
Perhaps Nikolay's series to resolve hangs/crashes of libvirtd [2] has now made 
Christian's patch viable?


Regards
Jim

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1536461
[2] https://listman.redhat.com/archives/libvir-list/2020-July/msg01606.html



Re: [PATCH v3 0/5] Add support for two i386 pm options which control acpi hotplug

2021-09-30 Thread Ani Sinha
On Fri, Sep 24, 2021 at 2:16 AM Laine Stump  wrote:
>
> On 9/11/21 11:26 PM, Ani Sinha wrote:
> > Hi all:
> >
> > This patchset introduces libvirt xml support for the following two pm conf
> > options:
> >
> > 
> >
> >
> > 
>
> (before I get into a more radical discussion about different options -
> since we aren't exactly duplicating the QEMU option name anyway, what if
> we made these names more consistent, e.g. "acpi-hotplug-bridge" and
> "acpi-hotplug-root"?)
>
> I've thought quite a bit about whether to put these attributes here, or
> somewhere else, and I'm still undecided.
>
> My initial reaction to this was "PM == Power Management, and power
> management is all about suspend mode support. Hotplug isn't power
> management." But then you look at the name of the QEMU option and PM is
> right there in the name, and I guess it's *kind of related* (effectively
> suspending/resuming a single device), so maybe I'm thinking too narrowly.
>
> So are there alternate places that might fit the purpose of these new
> options better, rather than directly mimicking the QEMU option placement
> (for better or worse)? A couple alternative possibilities:
>
> 1) <acpi> within <features>
>
> One possibility would be to include these new flags within the existing
> <acpi> subelement of <features>, which is already used to control
> whether ACPI is exposed to the guest *at all* (via adding
> "-no-acpi" to the QEMU commandline when <acpi/> is missing - NB: this
> feature flag is currently supported only on x86 and aarch64 QEMU
> platforms, and ignored for all other hypervisors).
>
> Possibly the new flags could be put in something like this:
>
> 
>
>  
>  
>
>...
> 
>
> But:
>
> * currently there are no subelements to <acpi>. So this isn't "extending
> according to an existing pattern".
>
> * the <features> element uses presence of a subelement to
> indicate "enabled" and absence of the subelement to indicate "disabled".
> But in the case of these new acpi bridge options we would need an
> explicit "enabled='yes/no'" attribute rather than just using presence
> of the option to mean "enabled" and absence to mean "disabled", because
> the default for "root-hotplug" up until now has been *enabled*, and the
> default for hotplug-bridge is different depending on machinetype. We
> need to continue working properly (and identically) with old/existing
> XML, but if we didn't have an "enabled" attribute for these new flags,
> there would be no way to tell the difference between "not specified" and
> "disabled", and so no way to disable the feature for a QEMU where the
> default was "enabled". (Why does this matter? Because I don't like the
> inconsistency that would arise from some feature flags using absence to
> mean "disabled" and some using it to mean "use the default".)
>
> * Having something in <features> in the domain XML kind of implies that
> the associated capability flags should be represented in the <features>
> section of the domain capabilities. For example, <acpi/> is listed under
> <features> in the output of virsh domcapabilities, separately from the
> flag indicating presence of the -no-acpi option. I'm not sure if we would
> need to add something there for these options if we moved them into
> <acpi> (seems a bit redundant to me to have it in both places, but
> I'm sure there are $reasons).
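The tristate requirement described above (absence must mean "use the QEMU default", which varies by machine type, so presence/absence alone cannot express "explicitly disabled") can be sketched as follows. This is illustrative Python, not libvirt's actual virTristateBool handling:

```python
from enum import Enum

class Tristate(Enum):
    ABSENT = 0   # attribute omitted in the XML -> defer to QEMU's default
    NO = 1       # enabled='no'  -> explicitly disabled
    YES = 2      # enabled='yes' -> explicitly enabled

def effective_hotplug(setting, machine_default):
    """Resolve a tristate flag against a machine-type-dependent default.
    With a plain boolean (subelement present/absent) there would be no
    way to say "disabled" on a machine type whose default is enabled."""
    if setting is Tristate.ABSENT:
        return machine_default
    return setting is Tristate.YES

# A machine type whose default is "hotplug enabled" can still be turned off:
print(effective_hotplug(Tristate.NO, True))      # -> False
print(effective_hotplug(Tristate.ABSENT, True))  # -> True
```

This is why old/existing XML (no attribute at all) keeps its current behaviour, while new XML can override the default in either direction.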
>
>
> 2) <acpi> within <os>
>
> Alternatively, there is an <acpi> subelement of <os>, which is currently
> used to add a SLIC table (some sort of software license table, which I'd
> never heard of before) using QEMU's -acpitable commandline option. It is
> also used somehow by the Xen driver.
>
> 
>
>  /path/to/slic.dat
>  
>  
>
>...
> 
>
> My problem with adding these new PCI controller acpi options to os/acpi
> is simply that it's in the <os> element, which is claimed elsewhere
> to be intended for OS boot options, and is used for things like
> specifying the path to a kernel / initrd to boot from.
>
> 3) a new top-level element
>
> A third option, suggested somewhere by Ani, would be to make a
> completely new top-level element, called something like 
> that would have separate attributes for the two flags, e.g.:
>
> 
>
> I dislike new top-level options because they just seem so ad hoc, as if
> the XML namespace is a cluttered, disorganized room. That reminds me too
> much of my own workspace, which is just... depressing.
>
> 
>
> Since I always seem to spend *way too much time* worrying about naming,
> only to have it come out wrong in the end anyway, I'm looking for some
> other opinions. Counting the version that is in Ani's patch currently as
> option "0", which option do you all think is the best? Or is it
> completely unimportant?
>
> > The above two options are available only for the qemu driver, and only
> > for x86 guests. Both of them are global options.
> >
> > The ``acpi-hotplug-bridge`` option enables or disables ACPI hotplug
> > support for cold-plugged bridges. Examples of cold-plugged bridges
> > include the PCI-PCI bridge (pci-bridge controller) for pc machines and