[PATCH v2 01/13] virhostmem: Introduce virHostMemGetTHPSize()

2021-02-18 Thread Michal Privoznik
New virHostMemGetTHPSize() is introduced which allows caller to
obtain THP PMD (Page Middle Directory) size, which is equal to
the minimal size that THP can use, taken from kernel doc
(Documentation/admin-guide/mm/transhuge.rst):

  Some userspace (such as a test program, or an optimized memory allocation
  library) may want to know the size (in bytes) of a transparent hugepage::

cat /sys/kernel/mm/transparent_hugepage/hpage_pmd_size

Since this size depends on the host architecture and the kernel
it won't change whilst libvirtd is running. Therefore, we can use
virOnce() and cache the value. Of course, we can be running under
kernel that has THP disabled or has no notion of THP at all. In
that case a negative value is returned to signal error.

Signed-off-by: Michal Privoznik 
---
 src/libvirt_private.syms |  1 +
 src/util/virhostmem.c| 63 
 src/util/virhostmem.h|  3 ++
 tests/domaincapsmock.c   |  9 ++
 4 files changed, 76 insertions(+)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 6b7261b987..0e6ad1222f 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -2374,6 +2374,7 @@ virHostMemGetFreePages;
 virHostMemGetInfo;
 virHostMemGetParameters;
 virHostMemGetStats;
+virHostMemGetTHPSize;
 virHostMemSetParameters;
 
 
diff --git a/src/util/virhostmem.c b/src/util/virhostmem.c
index ae42978ed2..ef7b97806f 100644
--- a/src/util/virhostmem.c
+++ b/src/util/virhostmem.c
@@ -45,11 +45,14 @@
 #include "virstring.h"
 #include "virnuma.h"
 #include "virlog.h"
+#include "virthread.h"
 
 #define VIR_FROM_THIS VIR_FROM_NONE
 
 VIR_LOG_INIT("util.hostmem");
 
+static unsigned long long virHostTHPPMDSize;
+static virOnceControl virHostMemGetTHPSizeOnce = VIR_ONCE_CONTROL_INITIALIZER;
 
 #ifdef __FreeBSD__
 # define BSD_MEMORY_STATS_ALL 4
@@ -920,3 +923,63 @@ virHostMemAllocPages(unsigned int npages,
 
 return ncounts;
 }
+
+#if defined(__linux__)
+# define HPAGE_PMD_SIZE_PATH 
"/sys/kernel/mm/transparent_hugepage/hpage_pmd_size"
+static int
+virHostMemGetTHPSizeSysfs(unsigned long long *size)
+{
+g_autofree char *buf = NULL;
+
+/* 1KiB limit is more than enough. */
+if (virFileReadAll(HPAGE_PMD_SIZE_PATH, 1024, ) < 0)
+return -1;
+
+virStringTrimOptionalNewline(buf);
+if (virStrToLong_ull(buf, NULL, 10, size) < 0) {
+virReportError(VIR_ERR_INTERNAL_ERROR,
+   _("unable to parse THP PMD size: %s"), buf);
+return -1;
+}
+
+/* Size is now in bytes. Convert to KiB. */
+*size >>= 10;
+return 0;
+}
+#endif /* defined(__linux__) */
+
+
+static void
+virHostMemGetTHPSizeOnceInit(void)
+{
+#if defined(__linux__)
+virHostMemGetTHPSizeSysfs();
+#else /* !defined(__linux__) */
+VIR_WARN("Getting THP size not ported yet");
+#endif /* !defined(__linux__) */
+}
+
+
+/**
+ * virHostMemGetTHPSize:
+ * @size: returned size of THP in kibibytes
+ *
+ * Obtain Transparent Huge Page size in kibibytes. The size
+ * depends on host architecture and kernel. Because of virOnce(),
+ * do not rely on errno in case of failure.
+ *
+ * Returns: 0 on success,
+ * -1 on failure.
+ */
+int
+virHostMemGetTHPSize(unsigned long long *size)
+{
+if (virOnce(, virHostMemGetTHPSizeOnceInit) < 0)
+return -1;
+
+if (virHostTHPPMDSize == 0)
+return -1;
+
+*size = virHostTHPPMDSize;
+return 0;
+}
diff --git a/src/util/virhostmem.h b/src/util/virhostmem.h
index 1369829807..bf15c40698 100644
--- a/src/util/virhostmem.h
+++ b/src/util/virhostmem.h
@@ -53,3 +53,6 @@ int virHostMemAllocPages(unsigned int npages,
  int startCell,
  unsigned int cellCount,
  bool add);
+
+int virHostMemGetTHPSize(unsigned long long *size)
+G_GNUC_NO_INLINE;
diff --git a/tests/domaincapsmock.c b/tests/domaincapsmock.c
index d81a898dc0..34a2f9ad81 100644
--- a/tests/domaincapsmock.c
+++ b/tests/domaincapsmock.c
@@ -17,6 +17,7 @@
 #include 
 
 #include "virhostcpu.h"
+#include "virhostmem.h"
 #ifdef WITH_LIBXL
 # include "libxl/libxl_capabilities.h"
 #endif
@@ -40,3 +41,11 @@ virHostCPUGetMicrocodeVersion(virArch hostArch G_GNUC_UNUSED)
 {
 return 0;
 }
+
+int
+virHostMemGetTHPSize(unsigned long long *size)
+{
+/* Pretend Transparent Huge Page size is 2MiB. */
+*size = 2048;
+return 0;
+}
-- 
2.26.2



[PATCH v2 RESEND 00/13] Introduce virtio-mem model

2021-02-18 Thread Michal Privoznik
Resend of:

https://listman.redhat.com/archives/libvir-list/2021-February/msg00534.html

We had couple of changes made (new capabilities added mostly) and the
original patch set no long applies cleanly. This is pure rebase of the
original patch set, no functional change (hence I'm keeping version
number).

Michal Prívozník (13):
  virhostmem: Introduce virHostMemGetTHPSize()
  qemu_process: Deduplicate code in qemuProcessNeedHugepagesPath()
  qemu_process: Drop needless check in
qemuProcessNeedMemoryBackingPath()
  qemu_capabilities: Introduce QEMU_CAPS_DEVICE_VIRTIO_MEM_PCI
  conf: Introduce virtio-mem  model
  qemu: Build command line for virtio-mem
  qemu: Wire up  live update
  qemu: Wire up MEMORY_DEVICE_SIZE_CHANGE event
  Introduce MEMORY_DEVICE_SIZE_CHANGE event
  qemu: Refresh the actual size of virtio-mem on monitor reconnect
  virsh: Introduce update-memory command
  news: document recent virtio memory addition
  kbase: Document virtio-mem

 NEWS.rst  |   7 +
 docs/formatdomain.rst |  58 -
 docs/kbase/index.rst  |   4 +
 docs/kbase/memorydevices.rst  | 160 ++
 docs/kbase/meson.build|   1 +
 docs/manpages/virsh.rst   |  30 +++
 docs/schemas/domaincommon.rng |  16 ++
 examples/c/misc/event-test.c  |  17 ++
 include/libvirt/libvirt-domain.h  |  23 ++
 src/conf/domain_conf.c| 115 +-
 src/conf/domain_conf.h|  15 ++
 src/conf/domain_event.c   |  84 +++
 src/conf/domain_event.h   |  10 +
 src/conf/domain_validate.c|  39 
 src/libvirt_private.syms  |   5 +
 src/qemu/qemu_alias.c |  10 +-
 src/qemu/qemu_capabilities.c  |   2 +
 src/qemu/qemu_capabilities.h  |   1 +
 src/qemu/qemu_command.c   |  13 +-
 src/qemu/qemu_domain.c|  67 +-
 src/qemu/qemu_domain.h|   1 +
 src/qemu/qemu_domain_address.c|  38 +++-
 src/qemu/qemu_driver.c| 209 +-
 src/qemu/qemu_hotplug.c   |  18 ++
 src/qemu/qemu_hotplug.h   |   5 +
 src/qemu/qemu_monitor.c   |  37 
 src/qemu/qemu_monitor.h   |  27 +++
 src/qemu/qemu_monitor_json.c  |  95 ++--
 src/qemu/qemu_monitor_json.h  |   5 +
 src/qemu/qemu_process.c   | 101 +++--
 src/qemu/qemu_validate.c  |  22 +-
 src/remote/remote_daemon_dispatch.c   |  30 +++
 src/remote/remote_driver.c|  32 +++
 src/remote/remote_protocol.x  |  15 +-
 src/remote_protocol-structs   |   7 +
 src/security/security_apparmor.c  |   1 +
 src/security/security_dac.c   |   2 +
 src/security/security_selinux.c   |   2 +
 src/util/virhostmem.c |  63 ++
 src/util/virhostmem.h |   3 +
 tests/domaincapsmock.c|   9 +
 .../caps_5.1.0.x86_64.xml |   1 +
 .../caps_5.2.0.x86_64.xml |   1 +
 .../caps_6.0.0.x86_64.xml |   1 +
 ...mory-hotplug-virtio-mem.x86_64-latest.args |  48 
 .../memory-hotplug-virtio-mem.xml |  64 ++
 tests/qemuxml2argvtest.c  |   1 +
 ...emory-hotplug-virtio-mem.x86_64-latest.xml |   1 +
 tests/qemuxml2xmltest.c   |   1 +
 tools/virsh-domain.c  | 170 ++
 50 files changed, 1619 insertions(+), 68 deletions(-)
 create mode 100644 docs/kbase/memorydevices.rst
 create mode 100644 
tests/qemuxml2argvdata/memory-hotplug-virtio-mem.x86_64-latest.args
 create mode 100644 tests/qemuxml2argvdata/memory-hotplug-virtio-mem.xml
 create mode 12 
tests/qemuxml2xmloutdata/memory-hotplug-virtio-mem.x86_64-latest.xml

-- 
2.26.2



[PATCH v2 02/13] qemu_process: Deduplicate code in qemuProcessNeedHugepagesPath()

2021-02-18 Thread Michal Privoznik
The aim of qemuProcessNeedHugepagesPath() is to return whether
guest needs private path inside HugeTLBFS mounts (deducted from
domain definition @def) or whether the memory device that user is
hotplugging in needs the private path (deducted from the @mem
argument). The actual creation of the path is done in the only
caller qemuProcessBuildDestroyMemoryPaths().

The rule for the first case (@def) and the second case (@mem) is
the same (domain has a DIMM device that has HP requested) and is
written twice. Move the logic into a function to deduplicate the
code.

Signed-off-by: Michal Privoznik 
---
 src/qemu/qemu_process.c | 29 +++--
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 7feb35e609..6a223e18c2 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -3863,6 +3863,27 @@ 
qemuProcessReconnectCheckMemAliasOrderMismatch(virDomainObjPtr vm)
 }
 
 
+static bool
+qemuProcessDomainMemoryDefNeedHugepagesPath(const virDomainMemoryDef *mem,
+const long system_pagesize)
+{
+switch (mem->model) {
+case VIR_DOMAIN_MEMORY_MODEL_DIMM:
+return mem->pagesize &&
+mem->pagesize != system_pagesize;
+
+case VIR_DOMAIN_MEMORY_MODEL_NONE:
+case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
+case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM:
+case VIR_DOMAIN_MEMORY_MODEL_LAST:
+/* None of these can be backed by hugepages. */
+return false;
+}
+
+return false;
+}
+
+
 static bool
 qemuProcessNeedHugepagesPath(virDomainDefPtr def,
  virDomainMemoryDefPtr mem)
@@ -3879,16 +3900,12 @@ qemuProcessNeedHugepagesPath(virDomainDefPtr def,
 }
 
 for (i = 0; i < def->nmems; i++) {
-if (def->mems[i]->model == VIR_DOMAIN_MEMORY_MODEL_DIMM &&
-def->mems[i]->pagesize &&
-def->mems[i]->pagesize != system_pagesize)
+if (qemuProcessDomainMemoryDefNeedHugepagesPath(def->mems[i], 
system_pagesize))
 return true;
 }
 
 if (mem &&
-mem->model == VIR_DOMAIN_MEMORY_MODEL_DIMM &&
-mem->pagesize &&
-mem->pagesize != system_pagesize)
+qemuProcessDomainMemoryDefNeedHugepagesPath(mem, system_pagesize))
 return true;
 
 return false;
-- 
2.26.2



[PATCH v2 03/13] qemu_process: Drop needless check in qemuProcessNeedMemoryBackingPath()

2021-02-18 Thread Michal Privoznik
The aim of this function is to return whether domain definition
and/or memory device that user intents to hotplug needs a private
path inside cfg->memoryBackingDir. The rule for the memory device
that's being hotplug includes checking whether corresponding
guest NUMA node needs memoryBackingDir. Well, while the rationale
behind makes sense it is not necessary to check for that really -
just a few lines above every guest NUMA node was checked exactly
for that.

Signed-off-by: Michal Privoznik 
---
 src/qemu/qemu_process.c | 25 ++---
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 6a223e18c2..066f153703 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -3930,13 +3930,24 @@ qemuProcessNeedMemoryBackingPath(virDomainDefPtr def,
 return true;
 }
 
-if (mem &&
-mem->model == VIR_DOMAIN_MEMORY_MODEL_DIMM &&
-(mem->access != VIR_DOMAIN_MEMORY_ACCESS_DEFAULT ||
- (mem->targetNode >= 0 &&
-  virDomainNumaGetNodeMemoryAccessMode(def->numa, mem->targetNode)
-  != VIR_DOMAIN_MEMORY_ACCESS_DEFAULT)))
-return true;
+if (mem) {
+switch (mem->model) {
+case VIR_DOMAIN_MEMORY_MODEL_DIMM:
+if (mem->access != VIR_DOMAIN_MEMORY_ACCESS_DEFAULT) {
+/* No need to check for access mode on the target node,
+ * it was checked for in the previous loop. */
+return true;
+}
+
+case VIR_DOMAIN_MEMORY_MODEL_NONE:
+case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
+case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM:
+case VIR_DOMAIN_MEMORY_MODEL_LAST:
+/* Backed by user provided path. Not stored in memory
+ * backing dir anyway. */
+break;
+}
+}
 
 return false;
 }
-- 
2.26.2



[PATCH v2 05/13] conf: Introduce virtio-mem model

2021-02-18 Thread Michal Privoznik
The virtio-mem is paravirtualized mechanism of adding/removing
memory to/from a VM. A virtio-mem-pci device is split into blocks
of equal size which are then exposed (all or only a requested
portion of them) to the guest kernel to use as regular memory.
Therefore, the device has two important attributes:

  1) block-size, which defines the size of a block
  2) requested-size, which defines how much memory (in bytes)
 is the device requested to expose to the guest.

The 'block-size' is configured on command line and immutable
throughout device's lifetime. The 'requested-size' can be set on
the command line too, but also is adjustable via monitor. In
fact, that is how management software places its requests to
change the memory allocation. If it wants to give more memory to
the guest it changes 'requested-size' to a bigger value, and if it
wants to shrink guest memory it changes the 'requested-size' to a
smaller value. Note, value of zero means that guest should
release all memory offered by the device. Of course, guest has to
cooperate. Therefore, there is a third attribute 'size' which is
read only and reflects how much memory the guest still has. This
can be different to 'requested-size', obviously. Because of name
clash, I've named it 'actualsize' and it is dealt with in future
commits (it is a runtime information anyway).

In the backend, memory for virtio-mem is backed by usual objects:
memory-backend-{ram,file,memfd} and their size puts the cap on
the amount of memory that a virtio-mem device can offer to a
guest. But we are already able to express this info using 
under .

Therefore, we need only two more elements to cover 'block-size'
and 'requested-size' attributes. This is the XML I've came up
with:

  

  1-3
  2048


  2097152
  0
  2048
  1048576


  

I hope by now it is obvious that:

  1) 'requested-size' must be an integer multiple of
 'block-size', and
  2) virtio-mem-pci device goes onto PCI bus and thus needs PCI
 address.

Then there is a limitation that the minimal 'block-size' is
transparent huge page size (I'll leave this without explanation).

Since now we have (possibly) two or more devices that allow
memory inflation/deflation and accounting for all of them (and
thus keeping  updated) might be hard. Therefore,
I'm deliberately forbidding memballoon. It's okay - virtio-mem is
superior to memballoon anyway. We can always reevaluate later.

Signed-off-by: Michal Privoznik 
---
 docs/formatdomain.rst | 51 +--
 docs/schemas/domaincommon.rng | 11 
 src/conf/domain_conf.c| 53 ++-
 src/conf/domain_conf.h|  3 +
 src/conf/domain_validate.c| 39 +++
 src/qemu/qemu_alias.c |  1 +
 src/qemu/qemu_command.c   |  1 +
 src/qemu/qemu_domain.c| 27 +++-
 src/qemu/qemu_domain_address.c| 38 ---
 src/qemu/qemu_process.c   |  2 +
 src/qemu/qemu_validate.c  | 22 ++-
 src/security/security_apparmor.c  |  1 +
 src/security/security_dac.c   |  2 +
 src/security/security_selinux.c   |  2 +
 .../memory-hotplug-virtio-mem.xml | 64 +++
 ...emory-hotplug-virtio-mem.x86_64-latest.xml |  1 +
 tests/qemuxml2xmltest.c   |  1 +
 17 files changed, 298 insertions(+), 21 deletions(-)
 create mode 100644 tests/qemuxml2argvdata/memory-hotplug-virtio-mem.xml
 create mode 12 
tests/qemuxml2xmloutdata/memory-hotplug-virtio-mem.x86_64-latest.xml

diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst
index 2587106191..05d359a32d 100644
--- a/docs/formatdomain.rst
+++ b/docs/formatdomain.rst
@@ -6924,6 +6924,8 @@ appropriate, so there is no need to explicitly add this 
element in the guest XML
 unless a specific PCI slot needs to be assigned. :since:`Since 0.8.3, Xen, QEMU
 and KVM only` Additionally, :since:`since 0.8.4` , if the memballoon device
 needs to be explicitly disabled, ``model='none'`` may be used.
+Using memballoon among with ``virtio-mem`` memory device is unsupported. In
+that case, setting model to anything else then ``none`` results in an error.
 
 Example: automatically added device with KVM
 
@@ -6935,6 +6937,16 @@ Example: automatically added device with KVM

...
 
+Example: automatically added device with QEMU/KVM and a ``virtio-mem`` device:
+
+::
+
+   ...
+   
+ 
+   
+   ...
+
 Example: manually added device with static PCI slot 2 requested
 
 ::
@@ -7374,6 +7386,18 @@ Example: usage of the memory devices
  524288

  
+ 
+   
+ 1-3
+ 2048
+   
+   
+ 2097152
+ 0
+ 2048
+ 1048576
+   
+ 

...
 
@@ -7381,7 +7405,9 @@ Example: usage of the memory devices
Provide 

[PATCH v2 04/13] qemu_capabilities: Introduce QEMU_CAPS_DEVICE_VIRTIO_MEM_PCI

2021-02-18 Thread Michal Privoznik
This commit introduces a new capability that reflects virtio-mem-pci
device support in QEMU:

  QEMU_CAPS_DEVICE_VIRTIO_MEM_PCI, /* -device virtio-mem-pci */

The virtio-mem-pci device was introduced in QEMU 5.1.

Signed-off-by: Michal Privoznik 
---
 src/qemu/qemu_capabilities.c | 2 ++
 src/qemu/qemu_capabilities.h | 1 +
 tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml | 1 +
 tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml | 1 +
 tests/qemucapabilitiesdata/caps_6.0.0.x86_64.xml | 1 +
 5 files changed, 6 insertions(+)

diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index 3f8593a9e5..3f47a6aaf5 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -617,6 +617,7 @@ VIR_ENUM_IMPL(virQEMUCaps,
   "cpu-max",
   "memory-backend-file.x-use-canonical-path-for-ramblock-id",
   "vnc-opts",
+  "virtio-mem-pci",
 );
 
 
@@ -1338,6 +1339,7 @@ struct virQEMUCapsStringFlags virQEMUCapsObjectTypes[] = {
 { "am53c974", QEMU_CAPS_SCSI_AM53C974 },
 { "virtio-pmem-pci", QEMU_CAPS_DEVICE_VIRTIO_PMEM_PCI },
 { "vhost-user-blk", QEMU_CAPS_DEVICE_VHOST_USER_BLK },
+{ "virtio-mem-pci", QEMU_CAPS_DEVICE_VIRTIO_MEM_PCI },
 };
 
 
diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h
index 38574eef16..9eb69a5fd7 100644
--- a/src/qemu/qemu_capabilities.h
+++ b/src/qemu/qemu_capabilities.h
@@ -597,6 +597,7 @@ typedef enum { /* virQEMUCapsFlags grouping marker for 
syntax-check */
 QEMU_CAPS_CPU_MAX, /* -cpu max */
 QEMU_CAPS_X_USE_CANONICAL_PATH_FOR_RAMBLOCK_ID, /* -object 
memory-backend-file,x-use-canonical-path-for-ramblock-id= */
 QEMU_CAPS_VNC_OPTS, /* -vnc uses QemuOpts parser instead of custom code */
+QEMU_CAPS_DEVICE_VIRTIO_MEM_PCI, /* -device virtio-mem-pci */
 
 QEMU_CAPS_LAST /* this must always be the last item */
 } virQEMUCapsFlags;
diff --git a/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml 
b/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml
index d1096c3cb7..2fb2edb2a5 100644
--- a/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml
+++ b/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml
@@ -254,6 +254,7 @@
   
   
   
+  
   5001000
   0
   43100242
diff --git a/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml 
b/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml
index 853f73fa53..75c5008f8a 100644
--- a/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml
+++ b/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml
@@ -255,6 +255,7 @@
   
   
   
+  
   5002000
   0
   43100243
diff --git a/tests/qemucapabilitiesdata/caps_6.0.0.x86_64.xml 
b/tests/qemucapabilitiesdata/caps_6.0.0.x86_64.xml
index 23fb5b7393..eee225dc92 100644
--- a/tests/qemucapabilitiesdata/caps_6.0.0.x86_64.xml
+++ b/tests/qemucapabilitiesdata/caps_6.0.0.x86_64.xml
@@ -256,6 +256,7 @@
   
   
   
+  
   5002050
   0
   43100242
-- 
2.26.2



[PATCH v2 08/13] qemu: Wire up MEMORY_DEVICE_SIZE_CHANGE event

2021-02-18 Thread Michal Privoznik
As advertised in previous commit, this event is delivered to us
when virtio-mem module changes the allocation inside the guest.
It comes with one attribute - size - which holds the new size of
the virtio-mem (well, allocated size), in bytes.
Mind you, this is not necessarily the same number as 'requested
size'. It almost certainly will be when sizing the memory up, but
it might not be when sizing the memory down - the guest kernel
might be unable to free some blocks.

This actual size is reported in the domain XML as an output
element only.

Signed-off-by: Michal Privoznik 
---
 docs/formatdomain.rst |  7 ++
 docs/schemas/domaincommon.rng |  5 +
 src/conf/domain_conf.c| 26 --
 src/conf/domain_conf.h|  8 +++
 src/libvirt_private.syms  |  1 +
 src/qemu/qemu_domain.c|  3 +++
 src/qemu/qemu_domain.h|  1 +
 src/qemu/qemu_driver.c| 31 ++
 src/qemu/qemu_monitor.c   | 24 
 src/qemu/qemu_monitor.h   | 20 +
 src/qemu/qemu_monitor_json.c  | 24 
 src/qemu/qemu_process.c   | 42 +++
 12 files changed, 190 insertions(+), 2 deletions(-)

diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst
index 05d359a32d..ccce01fd3b 100644
--- a/docs/formatdomain.rst
+++ b/docs/formatdomain.rst
@@ -7396,6 +7396,7 @@ Example: usage of the memory devices
  0
  2048
  1048576
+ 524288

  

@@ -7512,6 +7513,12 @@ Example: usage of the memory devices
  The total size exposed to the guest. Must respect ``block`` granularity
  and be smaller or equal to ``size``.
 
+   ``actual``
+ Active XML for ``virtio-mem`` model may contain ``actual`` element that
+ reflects the actual size of the corresponding virtio memory device. The
+ element is formatted into live XML and never parsed, i.e. it is
+ output-only element.
+
 :anchor:``
 
 IOMMU devices
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index 1f672756bd..fc7569ba8f 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -6151,6 +6151,11 @@
 
   
 
+
+  
+
+  
+
 
   
 
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 5f9210a112..aaeb50f2a2 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -17413,6 +17413,23 @@ virDomainMemoryFindByDeviceInfo(virDomainDefPtr def,
 }
 
 
+virDomainMemoryDefPtr
+virDomainMemoryFindByDeviceAlias(virDomainDefPtr def,
+ const char *alias)
+{
+size_t i;
+
+for (i = 0; i < def->nmems; i++) {
+virDomainMemoryDefPtr tmp = def->mems[i];
+
+if (STREQ_NULLABLE(tmp->info.alias, alias))
+return tmp;
+}
+
+return NULL;
+}
+
+
 /**
  * virDomainMemoryInsert:
  *
@@ -26746,7 +26763,8 @@ virDomainMemorySourceDefFormat(virBufferPtr buf,
 
 static void
 virDomainMemoryTargetDefFormat(virBufferPtr buf,
-   virDomainMemoryDefPtr def)
+   virDomainMemoryDefPtr def,
+   unsigned int flags)
 {
 g_auto(virBuffer) childBuf = VIR_BUFFER_INIT_CHILD(buf);
 
@@ -26768,6 +26786,10 @@ virDomainMemoryTargetDefFormat(virBufferPtr buf,
 
 virBufferAsprintf(, "%llu\n",
   def->requestedsize);
+if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) {
+virBufferAsprintf(, "%llu\n",
+  def->actualsize);
+}
 }
 
 virXMLFormatElement(buf, "target", NULL, );
@@ -26800,7 +26822,7 @@ virDomainMemoryDefFormat(virBufferPtr buf,
 if (virDomainMemorySourceDefFormat(buf, def) < 0)
 return -1;
 
-virDomainMemoryTargetDefFormat(buf, def);
+virDomainMemoryTargetDefFormat(buf, def, flags);
 
 virDomainDeviceInfoFormat(buf, >info, flags);
 
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index af38d8aa81..8a43c9655f 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -2335,6 +2335,9 @@ struct _virDomainMemoryDef {
 unsigned long long labelsize; /* kibibytes; valid only for NVDIMM */
 unsigned long long blocksize; /* kibibytes; valid only for VIRTIO_MEM */
 unsigned long long requestedsize; /* kibibytes; valid only for VIRTIO_MEM 
*/
+unsigned long long actualsize; /* kibibytes, valid for VIRTIO_MEM and
+  active domain only, only to report never
+  parse */
 bool readonly; /* valid only for NVDIMM */
 
 /* required for QEMU NVDIMM ppc64 support */
@@ -3627,6 +3630,11 @@ virDomainMemoryFindByDeviceInfo(virDomainDefPtr dev,
 virDomainDeviceInfoPtr info)
 ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) 

[PATCH v2 07/13] qemu: Wire up live update

2021-02-18 Thread Michal Privoznik
As advertised in one of previous commits, we want to be able to
change 'requested-size' attribute of virtio-mem on the fly. This
commit does exactly that. Changing anything else is checked for
and forbidden.

Once guest has changed the allocation, QEMU emits an event which
we will use to track the allocation. In the next commit.

Signed-off-by: Michal Privoznik 
---
 src/conf/domain_conf.c   |  36 
 src/conf/domain_conf.h   |   4 +
 src/libvirt_private.syms |   1 +
 src/qemu/qemu_driver.c   | 172 ++-
 src/qemu/qemu_hotplug.c  |  18 
 src/qemu/qemu_hotplug.h  |   5 +
 src/qemu/qemu_monitor.c  |  13 +++
 src/qemu/qemu_monitor.h  |   4 +
 src/qemu/qemu_monitor_json.c |  15 +++
 src/qemu/qemu_monitor_json.h |   5 +
 10 files changed, 272 insertions(+), 1 deletion(-)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index f4c08ef43d..5f9210a112 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -17377,6 +17377,42 @@ virDomainMemoryFindInactiveByDef(virDomainDefPtr def,
 }
 
 
+/**
+ * virDomainMemoryFindByDeviceInfo:
+ * @def: domain defintion
+ * @info: device info to match
+ *
+ * For given domain definition @def find  device with
+ * matching address and matching device alias (if set in @info,
+ * otherwise ignored).
+ *
+ * Returns: device if found,
+ *  NULL otherwise.
+ */
+virDomainMemoryDefPtr
+virDomainMemoryFindByDeviceInfo(virDomainDefPtr def,
+virDomainDeviceInfoPtr info)
+{
+size_t i;
+
+for (i = 0; i < def->nmems; i++) {
+virDomainMemoryDefPtr tmp = def->mems[i];
+
+if (!virDomainDeviceInfoAddressIsEqual(>info, info))
+continue;
+
+/* alias, if present */
+if (info->alias &&
+STRNEQ_NULLABLE(tmp->info.alias, info->alias))
+continue;
+
+return tmp;
+}
+
+return NULL;
+}
+
+
 /**
  * virDomainMemoryInsert:
  *
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index 14d238df4b..af38d8aa81 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -3622,6 +3622,10 @@ int virDomainMemoryFindByDef(virDomainDefPtr def, 
virDomainMemoryDefPtr mem)
 int virDomainMemoryFindInactiveByDef(virDomainDefPtr def,
  virDomainMemoryDefPtr mem)
 ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) G_GNUC_WARN_UNUSED_RESULT;
+virDomainMemoryDefPtr
+virDomainMemoryFindByDeviceInfo(virDomainDefPtr dev,
+virDomainDeviceInfoPtr info)
+ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) G_GNUC_WARN_UNUSED_RESULT;
 
 int virDomainShmemDefInsert(virDomainDefPtr def, virDomainShmemDefPtr shmem)
 ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) G_GNUC_WARN_UNUSED_RESULT;
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 0e6ad1222f..ec1af101a9 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -494,6 +494,7 @@ virDomainMemballoonModelTypeFromString;
 virDomainMemballoonModelTypeToString;
 virDomainMemoryDefFree;
 virDomainMemoryFindByDef;
+virDomainMemoryFindByDeviceInfo;
 virDomainMemoryFindInactiveByDef;
 virDomainMemoryInsert;
 virDomainMemoryModelTypeToString;
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index b9bbdf8d48..1a121e9278 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -7093,6 +7093,165 @@ qemuDomainChangeDiskLive(virDomainObjPtr vm,
 return 0;
 }
 
+
+static bool
+qemuDomainChangeMemoryLiveValidateChange(const virDomainMemoryDef *oldDef,
+ const virDomainMemoryDef *newDef)
+{
+/* The only thing that is allowed to change is 'requestedsize' for virtio
+ * model. */
+if (oldDef->model != newDef->model) {
+virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+   _("cannot modify memory model from '%s' to '%s'"),
+   virDomainMemoryModelTypeToString(oldDef->model),
+   virDomainMemoryModelTypeToString(newDef->model));
+return false;
+}
+
+if (oldDef->access != newDef->access) {
+virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+   _("cannot modify memory access from '%s' to '%s'"),
+   virDomainMemoryAccessTypeToString(oldDef->access),
+   virDomainMemoryAccessTypeToString(newDef->access));
+return false;
+}
+
+if (oldDef->discard != newDef->discard) {
+virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+   _("cannot modify memory discard from '%s' to '%s'"),
+   virTristateBoolTypeToString(oldDef->discard),
+   virTristateBoolTypeToString(newDef->discard));
+return false;
+}
+
+if (!virBitmapEqual(oldDef->sourceNodes,
+newDef->sourceNodes)) {
+virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+   _("cannot 

[PATCH v2 06/13] qemu: Build command line for virtio-mem

2021-02-18 Thread Michal Privoznik
Nothing special is happening here. All important changes were
done when for 'virtio-pmem' (adjusting the code to put virtio
memory on PCI bus, generating alias using
qemuDomainDeviceAliasIndex(). The only bit that might look
suspicious is no prealloc for virtio-mem. But if you think about
it, the whole purpose of this device is to change amount of
memory exposed to guest on the fly. There is no point in locking
the whole backend in memory.

Signed-off-by: Michal Privoznik 
---
 src/qemu/qemu_alias.c |  9 +++-
 src/qemu/qemu_command.c   | 12 -
 ...mory-hotplug-virtio-mem.x86_64-latest.args | 48 +++
 tests/qemuxml2argvtest.c  |  1 +
 4 files changed, 67 insertions(+), 3 deletions(-)
 create mode 100644 
tests/qemuxml2argvdata/memory-hotplug-virtio-mem.x86_64-latest.args

diff --git a/src/qemu/qemu_alias.c b/src/qemu/qemu_alias.c
index 1a330829a3..cef02ae916 100644
--- a/src/qemu/qemu_alias.c
+++ b/src/qemu/qemu_alias.c
@@ -475,8 +475,11 @@ qemuDeviceMemoryGetAliasID(virDomainDefPtr def,
 size_t i;
 int maxidx = 0;
 
-/* virtio-pmem goes onto PCI bus and thus DIMM address is not valid */
-if (!oldAlias && mem->model != VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM)
+/* virtio-pmem and virtio-mem go onto PCI bus and thus DIMM address is not
+ * valid */
+if (!oldAlias &&
+mem->model != VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM &&
+mem->model != VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM)
 return mem->info.addr.dimm.slot;
 
 for (i = 0; i < def->nmems; i++) {
@@ -523,6 +526,8 @@ qemuAssignDeviceMemoryAlias(virDomainDefPtr def,
 prefix = "virtiopmem";
 break;
 case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM:
+prefix = "virtiomem";
+break;
 case VIR_DOMAIN_MEMORY_MODEL_NONE:
 case VIR_DOMAIN_MEMORY_MODEL_LAST:
 default:
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index ac9a8c9f6a..d56ba26874 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -3127,7 +3127,9 @@ qemuBuildMemoryBackendProps(virJSONValuePtr *backendProps,
 } else if (useHugepage) {
 if (qemuGetDomainHupageMemPath(priv->driver, def, pagesize, 
) < 0)
 return -1;
-prealloc = true;
+/* For virtio-mem backed by hugepages we don't need prealloc. */
+if (mem->model != VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM)
+prealloc = true;
 } else {
 /* We can have both pagesize and mem source. If that's the case,
  * prefer hugepages as those are more specific. */
@@ -3356,6 +3358,9 @@ qemuBuildMemoryDeviceStr(const virDomainDef *def,
 break;
 
 case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM:
+device = "virtio-mem-pci";
+break;
+
 case VIR_DOMAIN_MEMORY_MODEL_NONE:
 case VIR_DOMAIN_MEMORY_MODEL_LAST:
 default:
@@ -3372,6 +3377,11 @@ qemuBuildMemoryDeviceStr(const virDomainDef *def,
 if (mem->labelsize)
 virBufferAsprintf(, "label-size=%llu,", mem->labelsize * 1024);
 
+if (mem->blocksize) {
+virBufferAsprintf(, "block-size=%llu,", mem->blocksize * 1024);
+virBufferAsprintf(, "requested-size=%llu,", mem->requestedsize * 
1024);
+}
+
 if (mem->uuid) {
 char uuidstr[VIR_UUID_STRING_BUFLEN];
 
diff --git 
a/tests/qemuxml2argvdata/memory-hotplug-virtio-mem.x86_64-latest.args 
b/tests/qemuxml2argvdata/memory-hotplug-virtio-mem.x86_64-latest.args
new file mode 100644
index 00..76d8c41886
--- /dev/null
+++ b/tests/qemuxml2argvdata/memory-hotplug-virtio-mem.x86_64-latest.args
@@ -0,0 +1,48 @@
+LC_ALL=C \
+PATH=/bin \
+HOME=/tmp/lib/domain--1-QEMUGuest1 \
+USER=test \
+LOGNAME=test \
+XDG_DATA_HOME=/tmp/lib/domain--1-QEMUGuest1/.local/share \
+XDG_CACHE_HOME=/tmp/lib/domain--1-QEMUGuest1/.cache \
+XDG_CONFIG_HOME=/tmp/lib/domain--1-QEMUGuest1/.config \
+QEMU_AUDIO_DRV=none \
+/usr/bin/qemu-system-i386 \
+-name guest=QEMUGuest1,debug-threads=on \
+-S \
+-object secret,id=masterKey0,format=raw,\
+file=/tmp/lib/domain--1-QEMUGuest1/master-key.aes \
+-machine pc,accel=kvm,usb=off,dump-guest-core=off \
+-cpu qemu64 \
+-m size=2095104k,slots=16,maxmem=1099511627776k \
+-overcommit mem-lock=off \
+-smp 2,sockets=2,dies=1,cores=1,threads=1 \
+-object memory-backend-ram,id=ram-node0,size=2145386496 \
+-numa node,nodeid=0,cpus=0-1,memdev=ram-node0 \
+-object memory-backend-ram,id=memvirtiomem0,size=1073741824 \
+-device virtio-mem-pci,node=0,block-size=2097152,requested-size=536870912,\
+memdev=memvirtiomem0,id=virtiomem0,bus=pci.0,addr=0x2 \
+-object memory-backend-file,id=memvirtiomem1,\
+mem-path=/dev/hugepages2M/libvirt/qemu/-1-QEMUGuest1,size=2147483648,\
+host-nodes=1-3,policy=bind \
+-device virtio-mem-pci,node=0,block-size=2097152,requested-size=1073741824,\
+memdev=memvirtiomem1,id=virtiomem1,bus=pci.0,addr=0x3 \
+-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \
+-display none \

[PATCH v2 09/13] Introduce MEMORY_DEVICE_SIZE_CHANGE event

2021-02-18 Thread Michal Privoznik
New event is introduced that is emitted whenever guest
acknowledges allocation change request of a virtio-mem.
The aim is to let applications know when that happens,
because changes in allocation are not synchronous with
issuing the request. Under the hood, the event is
emitted whenever QEMU emits MEMORY_DEVICE_SIZE_CHANGE
event.

Signed-off-by: Michal Privoznik 
---
 examples/c/misc/event-test.c| 17 ++
 include/libvirt/libvirt-domain.h| 23 
 src/conf/domain_event.c | 84 +
 src/conf/domain_event.h | 10 
 src/libvirt_private.syms|  2 +
 src/qemu/qemu_driver.c  |  6 +++
 src/remote/remote_daemon_dispatch.c | 30 +++
 src/remote/remote_driver.c  | 32 +++
 src/remote/remote_protocol.x| 15 +-
 src/remote_protocol-structs |  7 +++
 tools/virsh-domain.c| 20 +++
 11 files changed, 245 insertions(+), 1 deletion(-)

diff --git a/examples/c/misc/event-test.c b/examples/c/misc/event-test.c
index f164e825e1..e4b29ef67d 100644
--- a/examples/c/misc/event-test.c
+++ b/examples/c/misc/event-test.c
@@ -978,6 +978,22 @@ myDomainEventMemoryFailureCallback(virConnectPtr conn 
G_GNUC_UNUSED,
 }
 
 
+static int
+myDomainEventMemoryDeviceSizeChangeCallback(virConnectPtr conn G_GNUC_UNUSED,
+virDomainPtr dom,
+const char *alias,
+unsigned long long size,
+void *opaque G_GNUC_UNUSED)
+{
+/* Casts to uint64_t to work around mingw not knowing %lld */
+printf("%s EVENT: Domain %s(%d) memory device size change: "
+   "alias: '%s' new size %" PRIu64 "'\n",
+   __func__, virDomainGetName(dom), virDomainGetID(dom),
+   alias, (uint64_t)size);
+return 0;
+}
+
+
 static int
 myDomainEventMigrationIterationCallback(virConnectPtr conn G_GNUC_UNUSED,
 virDomainPtr dom,
@@ -1109,6 +1125,7 @@ struct domainEventData domainEvents[] = {
 DOMAIN_EVENT(VIR_DOMAIN_EVENT_ID_METADATA_CHANGE, 
myDomainEventMetadataChangeCallback),
 DOMAIN_EVENT(VIR_DOMAIN_EVENT_ID_BLOCK_THRESHOLD, 
myDomainEventBlockThresholdCallback),
 DOMAIN_EVENT(VIR_DOMAIN_EVENT_ID_MEMORY_FAILURE, 
myDomainEventMemoryFailureCallback),
+DOMAIN_EVENT(VIR_DOMAIN_EVENT_ID_MEMORY_DEVICE_SIZE_CHANGE, 
myDomainEventMemoryDeviceSizeChangeCallback),
 };
 
 struct storagePoolEventData {
diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
index 8011cf9fe1..2c449b4f31 100644
--- a/include/libvirt/libvirt-domain.h
+++ b/include/libvirt/libvirt-domain.h
@@ -4651,6 +4651,28 @@ typedef void 
(*virConnectDomainEventMemoryFailureCallback)(virConnectPtr conn,
unsigned int flags,
void *opaque);
 
+/**
+ * virConnectDomainEventMemoryDeviceSizeChangeCallback:
+ * @conn: connection object
+ * @dom: domain on which the event occurred
+ * @alias: memory device alias
+ * @size: new actual size of memory device (in KiB)
+ * @opaque: application specified data
+ *
+ * The callback occurs when the guest acknowledges request to change size of
+ * memory device (so far only virtio-mem model supports this). The @size then
+ * reflects the new amount of guest visible memory (in kibibytes).
+ *
+ * The callback signature to use when registering for an event of type
+ * VIR_DOMAIN_EVENT_ID_MEMORY_DEVICE_SIZE_CHANGE with
+ * virConnectDomainEventRegisterAny().
+ */
+typedef void 
(*virConnectDomainEventMemoryDeviceSizeChangeCallback)(virConnectPtr conn,
+
virDomainPtr dom,
+const char 
*alias,
+unsigned 
long long size,
+void 
*opaque);
+
 
 /**
  * VIR_DOMAIN_EVENT_CALLBACK:
@@ -4695,6 +4717,7 @@ typedef enum {
 VIR_DOMAIN_EVENT_ID_METADATA_CHANGE = 23, /* 
virConnectDomainEventMetadataChangeCallback */
 VIR_DOMAIN_EVENT_ID_BLOCK_THRESHOLD = 24, /* 
virConnectDomainEventBlockThresholdCallback */
 VIR_DOMAIN_EVENT_ID_MEMORY_FAILURE = 25,  /* 
virConnectDomainEventMemoryFailureCallback */
+VIR_DOMAIN_EVENT_ID_MEMORY_DEVICE_SIZE_CHANGE = 26, /* 
virConnectDomainEventMemoryDeviceSizeChangeCallback */
 
 # ifdef VIR_ENUM_SENTINELS
 VIR_DOMAIN_EVENT_ID_LAST
diff --git a/src/conf/domain_event.c b/src/conf/domain_event.c
index 726a792dae..31f9cc34f1 100644
--- a/src/conf/domain_event.c
+++ b/src/conf/domain_event.c
@@ -58,6 +58,7 @@ static virClassPtr virDomainEventDeviceRemovalFailedClass;
 static virClassPtr virDomainEventMetadataChangeClass;
 static virClassPtr 

[PATCH v2 10/13] qemu: Refresh the actual size of virtio-mem on monitor reconnect

2021-02-18 Thread Michal Privoznik
If the QEMU driver restarts it loses the track of the actual size
of virtio-mem (because it's runtime type of information and thus
not stored in XML) and therefore, we have to refresh it when
reconnecting to the domain monitor.

Signed-off-by: Michal Privoznik 
---
 src/qemu/qemu_domain.c   | 37 
 src/qemu/qemu_monitor.h  |  3 ++
 src/qemu/qemu_monitor_json.c | 56 ++--
 src/qemu/qemu_process.c  |  3 ++
 4 files changed, 72 insertions(+), 27 deletions(-)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 7de9f8c050..3275514908 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -8096,9 +8096,21 @@ qemuDomainUpdateMemoryDeviceInfo(virQEMUDriverPtr driver,
 return -1;
 }
 
-/* if qemu doesn't support the info request, just carry on */
-if (rc == -2)
+/* If qemu doesn't support the info request, just carry on, unless we
+ * really need it. */
+if (rc == -2) {
+for (i = 0; i < vm->def->nmems; i++) {
+virDomainMemoryDefPtr mem = vm->def->mems[i];
+
+if (mem->model == VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM) {
+virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+   _("qemu did not return info on vitio-mem 
device"));
+return -1;
+}
+}
+
 return 0;
+}
 
 if (rc < 0)
 return -1;
@@ -8113,9 +8125,24 @@ qemuDomainUpdateMemoryDeviceInfo(virQEMUDriverPtr driver,
 if (!(dimm = virHashLookup(meminfo, mem->info.alias)))
 continue;
 
-mem->info.type = VIR_DOMAIN_DEVICE_ADDRESS_TYPE_DIMM;
-mem->info.addr.dimm.slot = dimm->slot;
-mem->info.addr.dimm.base = dimm->address;
+switch (mem->model) {
+case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM:
+mem->actualsize = VIR_DIV_UP(dimm->size, 1024);
+break;
+
+case VIR_DOMAIN_MEMORY_MODEL_DIMM:
+case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
+mem->info.type = VIR_DOMAIN_DEVICE_ADDRESS_TYPE_DIMM;
+mem->info.addr.dimm.slot = dimm->slot;
+mem->info.addr.dimm.base = dimm->address;
+break;
+
+case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM:
+case VIR_DOMAIN_MEMORY_MODEL_NONE:
+case VIR_DOMAIN_MEMORY_MODEL_LAST:
+/* nada */
+break;
+}
 }
 
 virHashFree(meminfo);
diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h
index 53f2aa9600..c369b8c6ef 100644
--- a/src/qemu/qemu_monitor.h
+++ b/src/qemu/qemu_monitor.h
@@ -1393,10 +1393,13 @@ typedef struct _qemuMonitorMemoryDeviceInfo 
qemuMonitorMemoryDeviceInfo;
 typedef qemuMonitorMemoryDeviceInfo *qemuMonitorMemoryDeviceInfoPtr;
 
 struct _qemuMonitorMemoryDeviceInfo {
+/* For pc-dimm */
 unsigned long long address;
 unsigned int slot;
 bool hotplugged;
 bool hotpluggable;
+/* For virtio-mem */
+unsigned long long size; /* in bytes */
 };
 
 int qemuMonitorGetMemoryDeviceInfo(qemuMonitorPtr mon,
diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c
index 9ce7ba52ba..09e898442d 100644
--- a/src/qemu/qemu_monitor_json.c
+++ b/src/qemu/qemu_monitor_json.c
@@ -8238,7 +8238,6 @@ qemuMonitorJSONGetMemoryDeviceInfo(qemuMonitorPtr mon,
 virJSONValuePtr cmd;
 virJSONValuePtr reply = NULL;
 virJSONValuePtr data = NULL;
-qemuMonitorMemoryDeviceInfoPtr meminfo = NULL;
 size_t i;
 
 if (!(cmd = qemuMonitorJSONMakeCommand("query-memory-devices", NULL)))
@@ -8259,6 +8258,9 @@ qemuMonitorJSONGetMemoryDeviceInfo(qemuMonitorPtr mon,
 
 for (i = 0; i < virJSONValueArraySize(data); i++) {
 virJSONValuePtr elem = virJSONValueArrayGet(data, i);
+g_autofree qemuMonitorMemoryDeviceInfoPtr meminfo = NULL;
+virJSONValuePtr dimminfo;
+const char *devalias;
 const char *type;
 
 if (!(type = virJSONValueObjectGetString(elem, "type"))) {
@@ -8268,26 +8270,17 @@ qemuMonitorJSONGetMemoryDeviceInfo(qemuMonitorPtr mon,
 goto cleanup;
 }
 
+if (!(dimminfo = virJSONValueObjectGetObject(elem, "data"))) {
+virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+   _("query-memory-devices reply data doesn't "
+ "contain enum data"));
+goto cleanup;
+}
+
+meminfo = g_new0(qemuMonitorMemoryDeviceInfo, 1);
+
 /* dimm memory devices */
 if (STREQ(type, "dimm")) {
-virJSONValuePtr dimminfo;
-const char *devalias;
-
-if (!(dimminfo = virJSONValueObjectGetObject(elem, "data"))) {
-virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
-   _("query-memory-devices reply data doesn't "
- "contain enum data"));
-goto cleanup;
-}
-
-if (!(devalias = 

[PATCH v2 13/13] kbase: Document virtio-mem

2021-02-18 Thread Michal Privoznik
This commit adds new memorydevices.rst page which should serve
all models of memory devices. Yet, I'm documenting virtio-mem
quirks only.

Signed-off-by: Michal Privoznik 
---
 docs/kbase/index.rst |   4 +
 docs/kbase/memorydevices.rst | 160 +++
 docs/kbase/meson.build   |   1 +
 3 files changed, 165 insertions(+)
 create mode 100644 docs/kbase/memorydevices.rst

diff --git a/docs/kbase/index.rst b/docs/kbase/index.rst
index 532804fe05..45450bf33b 100644
--- a/docs/kbase/index.rst
+++ b/docs/kbase/index.rst
@@ -46,6 +46,10 @@ Usage
 `PCI topology <../pci-addresses.html>`__
Addressing schemes for PCI devices
 
+`Memory devices `__
+   Memory devices and their use
+
+
 Internals / Debugging
 -
 
diff --git a/docs/kbase/memorydevices.rst b/docs/kbase/memorydevices.rst
new file mode 100644
index 00..23adf54e16
--- /dev/null
+++ b/docs/kbase/memorydevices.rst
@@ -0,0 +1,160 @@
+==
+Memory devices
+==
+
+.. contents::
+
+Basics
+==
+
+Memory devices can be divided into two families: DIMMs and NVDIMMs. The former
+is typical RAM memory: it's volatile and thus its contents doesn't survive
+reboots nor guest shut downs and power ons. The latter retains its contents
+across reboots or power outages.
+
+In Libvirt, there are two models for DIMMs:
+
+* ``dimm`` model:
+
+  ::
+
+
+  
+523264
+0
+  
+  
+
+
+* ``virtio-mem`` model:
+
+  ::
+
+
+  
+1048576
+0
+2048
+524288
+  
+  
+
+
+Then there are two models for NVDIMMs:
+
+* ``nvidmm`` model:
+
+  ::
+
+
+  
+/tmp/nvdimm
+  
+  
+523264
+0
+  
+  
+
+
+* ``virtio-pmem`` model:
+
+  ::
+
+
+  
+/tmp/virtio_pmem
+  
+  
+524288
+  
+  
+
+
+
+Please not that (maybe somewhat surprisingly) virtio models go onto PCI bus
+instead of DIMM slots.
+
+Furthermore, DIMMs can have  element which configures backend for
+devices. For NVDIMMs the element is mandatory and reflects where the contents
+is saved.
+
+See https://libvirt.org/formatdomain.html#elementsMemory
+
+``virtio-mem`` model
+
+
+The ``virtio-mem`` model can be viewed as revised memory balloon. It offers
+memory hotplug and hotunplug solution (without the actual hotplug of the
+device). It solves problems that memory balloon can't solve on its own and thus
+is more flexible than DIMM + balloon solution. ``virtio-mem`` is NUMA aware,
+and thus memory can be inflated/deflated only for a subset of guest NUMA nodes.
+Also, it works with chunks that are either exposed to guest or taken back from
+it.
+
+See https://virtio-mem.gitlab.io/
+
+Under the hood, ``virtio-mem`` device is split into chunks of equal size which
+are then exposed to the guest. Either all of them or only a portion depending
+on user's request. Therefore there are three important sizes for
+``virtio-mem``. All are to be found under  element:
+
+#. The maximum size the device can ever offer, exposed under 
+#. The size a single block, exposed under 
+#. The current size exposed to the guest, exposed under 
+
+For instance, the following example the maximum size is 4GiB, the block size is
+2MiB and only 1GiB should be exposed to the guest:
+
+  ::
+
+
+  
+4194304
+2048
+1048576
+  
+
+
+Please note that  must be an integer multiple of 
+size or zero (memory completely deflated) and has to be less or equal to
+ (memory completely inflated). Furthermore, QEMU recommends the
+ size to be as big as a Transparent Huge Page (usually 2MiB).
+
+To change the size exposed to the guest, users should pass memory device XML
+with nothing but  changed into the
+``virDomainUpdateDeviceFlags()`` API. For user's convenience this can be done
+via virsh too:
+
+ ::
+
+   # virsh update-memory-device $dom --requested size 2GiB
+
+If there are two or more  devices then ``--alias`` shall be used
+to tell virsh which memory device should be updated.
+
+For running guests there is fourth size that can be found under :
+
+  ::
+
+2097152
+
+The  reflects the actual size consumed by the guest. In general it
+can differ from . Reasons include guest kernel missing
+``virtio-mem`` module and thus being unable to take offered memory, or guest
+kernel being unable to free memory and allow deflation.  Since 
+only reports size to users, the element is never parsed. It is formatted only
+into live XML.
+
+Since changing actual allocation requires cooperation with guest kernel,
+requests for change are not instant. Therefore, libvirt emits
+``VIR_DOMAIN_EVENT_ID_MEMORY_DEVICE_SIZE_CHANGE`` event whenever actual
+allocation changed.
+
+Please not that using ``virtio-mem`` with memory balloon is not possible,
+currently. The real reason is that libvirt's memory accounting isn't 

[PATCH v2 11/13] virsh: Introduce update-memory command

2021-02-18 Thread Michal Privoznik
New 'update-memory' command is introduced which aims on making it
user friendly to change  device. So far I just need to
change  so I'm introducing --requested-size only; but
the idea is that this is extensible for other cases too. For
instance, want to change ? A new --my-element
argument can be easily introduced.

Signed-off-by: Michal Privoznik 
---
 docs/manpages/virsh.rst |  30 
 tools/virsh-domain.c| 150 
 2 files changed, 180 insertions(+)

diff --git a/docs/manpages/virsh.rst b/docs/manpages/virsh.rst
index e3afa48f7b..ea26a47fd7 100644
--- a/docs/manpages/virsh.rst
+++ b/docs/manpages/virsh.rst
@@ -4891,6 +4891,36 @@ results as some fields may be autogenerated and thus 
match devices other than
 expected.
 
 
+update-memory-device
+
+
+**Syntax:**
+
+::
+
+   update-memory-device domain [--print-xml] [--alias alias]
+ [[--live] [--config] | [--current]]
+ [--requested-size size]
+
+This command finds  device inside given *domain*, changes
+requested values and passes updated device XML to daemon. If *--print-xml* is
+specified then the device is not changed, but the updated device XML is printed
+to stdout.  If there are more than one  devices in *domain* use
+*--alias* to select the desired one.
+
+If *--live* is specified, affect a running domain.
+If *--config* is specified, affect the next startup of a persistent guest.
+If *--current* is specified, it is equivalent to either *--live* or
+*--config*, depending on the current state of the guest.
+Both *--live* and *--config* flags may be given, but *--current* is
+exclusive. Not specifying any flag is the same as specifying *--current*.
+
+If *--requested-size* is specified then  under memory target is
+changed to requested *size* (as scaled integer, see ``NOTES`` above). It
+defaults to kibibytes if no suffix is provided. The option is valid only for
+``virtio-mem`` memory device model.
+
+
 change-media
 
 
diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c
index 49ddd69d1b..b6e27e0e63 100644
--- a/tools/virsh-domain.c
+++ b/tools/virsh-domain.c
@@ -9129,6 +9129,150 @@ cmdSetmaxmem(vshControl *ctl, const vshCmd *cmd)
 return ret;
 }
 
+
+/*
+ * "update-memory-device" command
+ */
+static const vshCmdInfo info_update_memory_device[] = {
+{.name = "help",
+ .data = N_("update memory device of a domain")
+},
+{.name = "desc",
+ .data = N_("Update values of a memory device of a domain")
+},
+{.name = NULL}
+};
+
+static const vshCmdOptDef opts_update_memory_device[] = {
+VIRSH_COMMON_OPT_DOMAIN_FULL(0),
+VIRSH_COMMON_OPT_DOMAIN_CONFIG,
+VIRSH_COMMON_OPT_DOMAIN_LIVE,
+VIRSH_COMMON_OPT_DOMAIN_CURRENT,
+{.name = "print-xml",
+ .type = VSH_OT_BOOL,
+ .help = N_("print updated memory device XML instead of executing the 
change")
+},
+{.name = "alias",
+ .type = VSH_OT_STRING,
+ .completer = virshDomainDeviceAliasCompleter,
+ .help = N_("memory device alias"),
+},
+{.name = "requested-size",
+ .type = VSH_OT_INT,
+ .help = N_("new value of  size, as scaled integer (default 
KiB)")
+},
+{.name = NULL}
+};
+
+static int
+virshGetUpdatedMemoryXML(char **updatedMemoryXML,
+ vshControl *ctl,
+ const vshCmd *cmd,
+ virDomainPtr dom,
+ unsigned int flags)
+{
+const char *alias = NULL;
+g_autoptr(xmlDoc) doc = NULL;
+g_autoptr(xmlXPathContext) ctxt = NULL;
+g_autofree char *xpath = NULL;
+int nmems;
+g_autofree xmlNodePtr *mems = NULL;
+g_autoptr(xmlBuffer) xmlbuf = NULL;
+unsigned int domainXMLFlags = 0;
+
+if (flags & VIR_DOMAIN_AFFECT_CONFIG)
+domainXMLFlags |= VIR_DOMAIN_XML_INACTIVE;
+
+if (virshDomainGetXMLFromDom(ctl, dom, domainXMLFlags, , ) < 0)
+return -1;
+
+if (vshCommandOptStringReq(ctl, cmd, "alias", ) < 0)
+return -1;
+
+if (alias) {
+xpath = g_strdup_printf("/domain/devices/memory[./alias/@name='%s']", 
alias);
+} else {
+xpath = g_strdup("/domain/devices/memory");
+}
+
+nmems = virXPathNodeSet(xpath, ctxt, );
+if (nmems < 0) {
+vshSaveLibvirtError();
+return -1;
+} else if (nmems == 0) {
+vshError(ctl, _("no memory device found"));
+return -1;
+} else if (nmems > 1) {
+vshError(ctl, _("multiple memory devices found, use --alias to select 
one"));
+return -1;
+}
+
+ctxt->node = mems[0];
+
+if (vshCommandOptBool(cmd, "requested-size")) {
+xmlNodePtr requestedSizeNode;
+g_autofree char *kibibytesStr = NULL;
+unsigned long long bytes = 0;
+unsigned long kibibytes = 0;
+
+if (vshCommandOptScaledInt(ctl, cmd, "requested-size", , 1024, 
ULLONG_MAX) < 0)
+return -1;
+kibibytes = VIR_DIV_UP(bytes, 1024);
+
+

[PATCH v2 12/13] news: document recent virtio memory addition

2021-02-18 Thread Michal Privoznik
Signed-off-by: Michal Privoznik 
---
 NEWS.rst | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/NEWS.rst b/NEWS.rst
index 440dbf2601..abb3f5d867 100644
--- a/NEWS.rst
+++ b/NEWS.rst
@@ -47,6 +47,13 @@ v7.1.0 (unreleased)
 to set the hostdev device's MAC address (which is a necessary
 part of the alternate ).
 
+  * Introduce virtio-mem  model
+
+New virtio-mem model is introduced for  device which is a
+paravirtualized mechanism of adding/removing memory to/from a VM. Use
+``virDomainUpdateDeviceFlags()`` API to adjust amount of memory or ``virsh
+update-memory-device`` for convenience.
+
 * **Improvements**
 
 * **Bug fixes**
-- 
2.26.2



[libvirt PATCH 1/4] security: dac: remove leftover virPCIDeviceFree

2021-02-18 Thread Ján Tomko
The switch to g_auto left this one call behind.

Reported by Coverity.

Fixes: 4ab0d1844a1e60def576086edc8b2c3775e7c10d
Signed-off-by: Ján Tomko 
---
 src/security/security_dac.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/security/security_dac.c b/src/security/security_dac.c
index 00eeae0d27..11f6c5c3da 100644
--- a/src/security/security_dac.c
+++ b/src/security/security_dac.c
@@ -1430,10 +1430,9 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr 
mgr,
 if (pcisrc->backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO) {
 g_autofree char *vfioGroupDev = virPCIDeviceGetIOMMUGroupDev(pci);
 
-if (!vfioGroupDev) {
-virPCIDeviceFree(pci);
+if (!vfioGroupDev)
 return -1;
-}
+
 ret = virSecurityDACRestoreFileLabelInternal(mgr, NULL,
  vfioGroupDev, false);
 } else {
-- 
2.26.2



[libvirt PATCH 4/4] hyperv: check return value of virUUIDGenerate

2021-02-18 Thread Ján Tomko
Fixes: fa66bd8cad2359b7d21676e0fd69bec472b36514
Signed-off-by: Ján Tomko 
---
 src/hyperv/hyperv_driver.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/hyperv/hyperv_driver.c b/src/hyperv/hyperv_driver.c
index c1ed4b5c7c..701456cdb3 100644
--- a/src/hyperv/hyperv_driver.c
+++ b/src/hyperv/hyperv_driver.c
@@ -1080,7 +1080,9 @@ hypervDomainAttachSyntheticEthernetAdapter(virDomainPtr 
domain,
  */
 portResourceType = g_strdup_printf("%d", 
MSVM_RASD_RESOURCETYPE_ETHERNET_ADAPTER);
 
-virUUIDGenerate(vsiGuid);
+if (virUUIDGenerate(vsiGuid) < 0)
+return -1;
+
 virUUIDFormat(vsiGuid, guidString);
 virtualSystemIdentifiers = g_strdup_printf("{%s}", guidString);
 
-- 
2.26.2



[libvirt PATCH 2/4] qemu: saveimage: only steal domXML on success

2021-02-18 Thread Ján Tomko
The comment and the caller assume virQEMUSaveDataNew only steals
domXML on success, but it is copied even on failure.

Also remove the misleading g_steal_pointer call on a local variable.

Reported by coverity.

Signed-off-by: Ján Tomko 
---
 src/qemu/qemu_saveimage.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
index de2d63dd9a..5d542bf977 100644
--- a/src/qemu/qemu_saveimage.c
+++ b/src/qemu/qemu_saveimage.c
@@ -105,8 +105,6 @@ virQEMUSaveDataNew(char *domXML,
 
 data = g_new0(virQEMUSaveData, 1);
 
-data->xml = g_steal_pointer();
-
 if (cookieObj &&
 !(data->cookie = virSaveCookieFormat((virObjectPtr) cookieObj,
  
virDomainXMLOptionGetSaveCookie(xmlopt
@@ -118,6 +116,7 @@ virQEMUSaveDataNew(char *domXML,
 header->was_running = running ? 1 : 0;
 header->compressed = compressed;
 
+data->xml = domXML;
 return data;
 
  error:
-- 
2.26.2



[libvirt PATCH 0/4] Coverity diaries

2021-02-18 Thread Ján Tomko
Ján Tomko (4):
  security: dac: remove leftover virPCIDeviceFree
  qemu: saveimage: only steal domXML on success
  qemu: monitor: clear cpu props properly in CPUInfoClear
  hyperv: check return value of virUUIDGenerate

 src/hyperv/hyperv_driver.c  | 4 +++-
 src/qemu/qemu_monitor.c | 1 +
 src/qemu/qemu_saveimage.c   | 3 +--
 src/security/security_dac.c | 5 ++---
 4 files changed, 7 insertions(+), 6 deletions(-)

-- 
2.26.2



[libvirt PATCH 3/4] qemu: monitor: clear cpu props properly in CPUInfoClear

2021-02-18 Thread Ján Tomko
Stay true to the name of the function and clear the pointer
after freeing it.

This also silences a bogus Coverity report about a double
free in qemuMonitorGetCPUInfo where qemuMonitorCPUInfoClear
is called right after allocating a new qemuMonitorCPUInfo
to fill out the non-zero defaults.

Signed-off-by: Ján Tomko 
---
 src/qemu/qemu_monitor.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
index 0476d606f5..151f69acef 100644
--- a/src/qemu/qemu_monitor.c
+++ b/src/qemu/qemu_monitor.c
@@ -1661,6 +1661,7 @@ qemuMonitorCPUInfoClear(qemuMonitorCPUInfoPtr cpus,
 VIR_FREE(cpus[i].alias);
 VIR_FREE(cpus[i].type);
 virJSONValueFree(cpus[i].props);
+cpus[i].props = NULL;
 }
 }
 
-- 
2.26.2



Re: [PATCH v2 01/13] virhostmem: Introduce virHostMemGetTHPSize()

2021-02-18 Thread Peter Krempa
On Thu, Feb 18, 2021 at 14:30:56 +0100, Michal Privoznik wrote:
> New virHostMemGetTHPSize() is introduced which allows caller to
> obtain THP PMD (Page Middle Directory) size, which is equal to
> the minimal size that THP can use, taken from kernel doc
> (Documentation/admin-guide/mm/transhuge.rst):
> 
>   Some userspace (such as a test program, or an optimized memory allocation
>   library) may want to know the size (in bytes) of a transparent hugepage::
> 
> cat /sys/kernel/mm/transparent_hugepage/hpage_pmd_size
> 
> Since this size depends on the host architecture and the kernel
> it won't change whilst libvirtd is running. Therefore, we can use
> virOnce() and cache the value. Of course, we can be running under
> kernel that has THP disabled or has no notion of THP at all. In
> that case a negative value is returned to signal error.
> 
> Signed-off-by: Michal Privoznik 
> ---
>  src/libvirt_private.syms |  1 +
>  src/util/virhostmem.c| 63 
>  src/util/virhostmem.h|  3 ++
>  tests/domaincapsmock.c   |  9 ++
>  4 files changed, 76 insertions(+)

[...]

> diff --git a/src/util/virhostmem.c b/src/util/virhostmem.c
> index ae42978ed2..ef7b97806f 100644
> --- a/src/util/virhostmem.c
> +++ b/src/util/virhostmem.c
> @@ -45,11 +45,14 @@
>  #include "virstring.h"
>  #include "virnuma.h"
>  #include "virlog.h"
> +#include "virthread.h"
>  
>  #define VIR_FROM_THIS VIR_FROM_NONE
>  
>  VIR_LOG_INIT("util.hostmem");
>  
> +static unsigned long long virHostTHPPMDSize;

Please add a comment that it's in kiB.

> +static virOnceControl virHostMemGetTHPSizeOnce = 
> VIR_ONCE_CONTROL_INITIALIZER;



[PATCH] qemu_driver.c: Coverity fix in qemuNodeDeviceDetachFlags()

2021-02-18 Thread Daniel Henrique Barboza
Commit 76f47889326c4 made qemuNodeDeviceDetachFlags() unusable due to an
'if then else if' chain that will always results in a 'return -1',
regardless of 'driverName' input. This slipped through review process
and Gitlab CI, making it clear now that there is no unit tests
exercising this code ATM.

LUckily John Ferlan caught this up with his Coverity scan and spared us
from an unneeded headache later on.

Fixes: 76f47889326c45d2732711bc6dd5751aaf6e5194
Reported-by: John Ferlan 
Signed-off-by: Daniel Henrique Barboza 
---
 src/qemu/qemu_driver.c | 26 +++---
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index f1035a536e..b9bbdf8d48 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -11998,7 +11998,6 @@ qemuNodeDeviceDetachFlags(virNodeDevicePtr dev,
   unsigned int flags)
 {
 virQEMUDriverPtr driver = dev->conn->privateData;
-bool vfio = qemuHostdevHostSupportsPassthroughVFIO();
 virHostdevManagerPtr hostdev_mgr = driver->hostdevMgr;
 
 virCheckFlags(0, -1);
@@ -12006,22 +12005,27 @@ qemuNodeDeviceDetachFlags(virNodeDevicePtr dev,
 if (!driverName)
 driverName = "vfio";
 
-if (STREQ(driverName, "vfio") && !vfio) {
-virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s",
-   _("VFIO device assignment is currently not "
- "supported on this system"));
- return -1;
-} else if (STREQ(driverName, "kvm")) {
+/* Only the 'vfio' driver is supported and a special error message for
+ * the previously supported 'kvm' driver is provided below. */
+if (STRNEQ(driverName, "vfio") && STRNEQ(driverName, "kvm")) {
+virReportError(VIR_ERR_INVALID_ARG,
+   _("unknown driver name '%s'"), driverName);
+return -1;
+}
+
+if (STREQ(driverName, "kvm")) {
 virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s",
_("KVM device assignment is no longer "
  "supported on this system"));
 return -1;
-} else {
-virReportError(VIR_ERR_INVALID_ARG,
-   _("unknown driver name '%s'"), driverName);
-return -1;
 }
 
+if (!qemuHostdevHostSupportsPassthroughVFIO()) {
+virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s",
+   _("VFIO device assignment is currently not "
+ "supported on this system"));
+ return -1;
+}
 
 /* virNodeDeviceDetachFlagsEnsureACL() is being called by
  * virDomainDriverNodeDeviceDetachFlags() */
-- 
2.29.2



Re: [PATCH] qemu_driver.c: Coverity fix in qemuNodeDeviceDetachFlags()

2021-02-18 Thread Ján Tomko

On a Thursday in 2021, Daniel Henrique Barboza wrote:

Commit 76f47889326c4 made qemuNodeDeviceDetachFlags() unusable due to an
'if then else if' chain that will always results in a 'return -1',
regardless of 'driverName' input. This slipped through review process
and Gitlab CI, making it clear now that there is no unit tests
exercising this code ATM.


There is no need to enumerate all the entities that missed the mistake,
especially not the reviewer! O:-)



LUckily John Ferlan caught this up with his Coverity scan and spared us


s/LU/Lu/


from an unneeded headache later on.

Fixes: 76f47889326c45d2732711bc6dd5751aaf6e5194
Reported-by: John Ferlan 
Signed-off-by: Daniel Henrique Barboza 
---
src/qemu/qemu_driver.c | 26 +++---
1 file changed, 15 insertions(+), 11 deletions(-)



with the typo fixed:
Reviewed-by: Ján Tomko 

Jano


signature.asc
Description: PGP signature


Re: [PATCH] qemu_driver.c: Coverity fix in qemuNodeDeviceDetachFlags()

2021-02-18 Thread John Ferlan



On 2/18/21 7:33 AM, Daniel Henrique Barboza wrote:
> Commit 76f47889326c4 made qemuNodeDeviceDetachFlags() unusable due to an
> 'if then else if' chain that will always results in a 'return -1',
> regardless of 'driverName' input. This slipped through review process
> and Gitlab CI, making it clear now that there is no unit tests
> exercising this code ATM.
> 
> LUckily John Ferlan caught this up with his Coverity scan and spared us
> from an unneeded headache later on.
> 
> Fixes: 76f47889326c45d2732711bc6dd5751aaf6e5194
> Reported-by: John Ferlan 
> Signed-off-by: Daniel Henrique Barboza 
> ---
>  src/qemu/qemu_driver.c | 26 +++---
>  1 file changed, 15 insertions(+), 11 deletions(-)
> 

I can vouch this does make Coverity happy again.  I do agree w/ Jano no
need to be too elaborate on slipping thru cracks And when I do
create patches (rarely) now for Coverity, I'll just attribute it to
"Found by Coverity" and be done with it. You already put the Reported-by
so that's probably sufficient!

Reviewed-by: John Ferlan 

John



[libvirt PATCH] esx: use g_autofree for datastoreRelatedPath

2021-02-18 Thread Ján Tomko
Reported by Coverity.

Fixes: 213662813cd846d045be8857dc7b917d33a40989
Signed-off-by: Ján Tomko 
---
Pushed as trivial.

 src/esx/esx_driver.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/esx/esx_driver.c b/src/esx/esx_driver.c
index e49975de86..96cc8bda0d 100644
--- a/src/esx/esx_driver.c
+++ b/src/esx/esx_driver.c
@@ -2886,7 +2886,7 @@ esxDomainDefineXMLFlags(virConnectPtr conn, const char 
*xml, unsigned int flags)
 g_autofree char *escapedName = NULL;
 g_auto(virBuffer) buffer = VIR_BUFFER_INITIALIZER;
 g_autofree char *url = NULL;
-char *datastoreRelatedPath = NULL;
+g_autofree char *datastoreRelatedPath = NULL;
 esxVI_String *propertyNameList = NULL;
 esxVI_ObjectContent *hostSystem = NULL;
 esxVI_ManagedObjectReference *resourcePool = NULL;
-- 
2.26.2



Re: [PATCH] qemu_driver.c: Coverity fix in qemuNodeDeviceDetachFlags()

2021-02-18 Thread Daniel Henrique Barboza




On 2/18/21 9:52 AM, John Ferlan wrote:



On 2/18/21 7:33 AM, Daniel Henrique Barboza wrote:

Commit 76f47889326c4 made qemuNodeDeviceDetachFlags() unusable due to an
'if then else if' chain that will always results in a 'return -1',
regardless of 'driverName' input. This slipped through review process
and Gitlab CI, making it clear now that there is no unit tests
exercising this code ATM.

LUckily John Ferlan caught this up with his Coverity scan and spared us
from an unneeded headache later on.

Fixes: 76f47889326c45d2732711bc6dd5751aaf6e5194
Reported-by: John Ferlan 
Signed-off-by: Daniel Henrique Barboza 
---
  src/qemu/qemu_driver.c | 26 +++---
  1 file changed, 15 insertions(+), 11 deletions(-)



I can vouch this does make Coverity happy again.  I do agree w/ Jano no
need to be too elaborate on slipping thru cracks And when I do
create patches (rarely) now for Coverity, I'll just attribute it to
"Found by Coverity" and be done with it. You already put the Reported-by
so that's probably sufficient!


It was my attempt to save face by claiming that I failed because there were no
unit tests :)

But you gotta a good point though. I'll remove this except before pushing it.


Thanks,


DHB




Reviewed-by: John Ferlan 

John





Re: [PATCH v2 07/10] qemu_driver.c: validate 'driverName' earlier in qemuNodeDeviceDetachFlags()

2021-02-18 Thread Daniel Henrique Barboza




On 2/18/21 8:30 AM, John Ferlan wrote:



On 2/2/21 8:06 PM, Daniel Henrique Barboza wrote:

The validation of 'driverName' does not depend on any other state and can be
done right on the start of the function. We can fail earlier while avoiding
a cleanup jump.

Reviewed-by: Ján Tomko 
Signed-off-by: Daniel Henrique Barboza 
---
  src/qemu/qemu_driver.c | 41 -
  1 file changed, 20 insertions(+), 21 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c

index 64ae8fafc0..c6ba33e4ad 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -11982,6 +11982,25 @@ qemuNodeDeviceDetachFlags(virNodeDevicePtr dev,
  
  virCheckFlags(0, -1);
  
+if (!driverName)

+driverName = "vfio";
+
+if (STREQ(driverName, "vfio") && !vfio) {
+virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s",
+   _("VFIO device assignment is currently not "
+ "supported on this system"));
+ return -1;
+} else if (STREQ(driverName, "kvm")) {
+virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s",
+   _("KVM device assignment is no longer "
+ "supported on this system"));
+return -1;
+} else {
+virReportError(VIR_ERR_INVALID_ARG,
+   _("unknown driver name '%s'"), driverName);
+return -1;
+}
+


Coverity points out the rest of this is unreachable now (even with the
subsequent patch).



ooo. I'll fix it. Thanks for the heads up John!




  if (!(nodeconn = virGetConnectNodeDev()))
  goto cleanup;
  
@@ -12013,27 +12032,7 @@ qemuNodeDeviceDetachFlags(virNodeDevicePtr dev,

  if (!pci)
  goto cleanup;
  
-if (!driverName)

-driverName = "vfio";
-
-if (STREQ(driverName, "vfio")) {
-if (!vfio) {
-virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s",
-   _("VFIO device assignment is currently not "
- "supported on this system"));
-goto cleanup;
-}
-virPCIDeviceSetStubDriver(pci, VIR_PCI_STUB_DRIVER_VFIO);


This section seems to be more than just code motion...  it used to pass
thru here when vfio = true, but with the movement it would fall into the
catch all else.

John


-} else if (STREQ(driverName, "kvm")) {
-virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s",
-   _("KVM device assignment is no longer "
- "supported on this system"));
-goto cleanup;
-} else {
-virReportError(VIR_ERR_INVALID_ARG,
-   _("unknown driver name '%s'"), driverName);
-goto cleanup;
-}
+virPCIDeviceSetStubDriver(pci, VIR_PCI_STUB_DRIVER_VFIO);
  
  ret = virHostdevPCINodeDeviceDetach(hostdev_mgr, pci);

   cleanup:







Re: [PATCH 18/19] qemu: migration: Migrate block dirty bitmaps corresponding to checkpoints

2021-02-18 Thread Peter Krempa
On Thu, Feb 18, 2021 at 15:54:37 +0100, Jiri Denemark wrote:
> On Thu, Feb 11, 2021 at 16:37:57 +0100, Peter Krempa wrote:
> > Preserve block dirty bitmaps after migration with
> > QEMU_MONITOR_MIGRATE_NON_SHARED_(DISK|INC).
> > 
> > This patch implements functions which offer the bitmaps to the
> > destination, check for eligibility on destination and then configure
> > source for the migration.
> > 
> > Signed-off-by: Peter Krempa 
> > ---
> >  src/qemu/qemu_migration.c | 333 +-
> >  1 file changed, 331 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
> > index 36424f8493..16bfad0390 100644
> > --- a/src/qemu/qemu_migration.c
> > +++ b/src/qemu/qemu_migration.c
> ...
> > @@ -2528,6 +2619,92 @@ qemuMigrationDstPrepare(virDomainObjPtr vm,
> >   migrateFrom, fd, NULL);
> >  }
> > 
> > +
> > +/**
> > + * qemuMigrationDstPrepareAnyBlockDirtyBitmaps:
> > + * @vm: domain object
> > + * @mig: migration cookie
> > + * @migParams: migration parameters
> > + * @flags: migration flags
> > + *
> > + * Checks whether block dirty bitmaps offered by the migration source are
> > + * to be migrated (e.g. they don't exist, the destination is compatible 
> > etc)
> > + * and sets up destination qemu for migrating the bitmaps as well as 
> > updates the
> > + * list of eligible bitmaps in the migration cookie to be sent back to the
> > + * source.
> > + */
> > +static int
> > +qemuMigrationDstPrepareAnyBlockDirtyBitmaps(virDomainObjPtr vm,
> > +qemuMigrationCookiePtr mig,
> > +qemuMigrationParamsPtr 
> > migParams,
> > +unsigned int flags)
> > +{
> > +qemuDomainObjPrivatePtr priv = vm->privateData;
> > +g_autoptr(virJSONValue) mapping = NULL;
> > +g_autoptr(GHashTable) blockNamedNodeData = NULL;
> > +GSList *nextdisk;
> > +
> > +if (!mig->nbd ||
> > +!mig->blockDirtyBitmaps ||
> > +!(flags & (VIR_MIGRATE_NON_SHARED_DISK | 
> > VIR_MIGRATE_NON_SHARED_INC)) ||
> > +!virQEMUCapsGet(priv->qemuCaps, 
> > QEMU_CAPS_MIGRATION_PARAM_BLOCK_BITMAP_MAPPING))
> > +return 0;
> 
> Shouldn't we report an error in case the source sent bitmaps, but local
> QEMU does not support QEMU_CAPS_MIGRATION_PARAM_BLOCK_BITMAP_MAPPING?

See below.

> 
> > +
> > +if (qemuMigrationCookieBlockDirtyBitmapsMatchDisks(vm->def, 
> > mig->blockDirtyBitmaps) < 0)
> > +return -1;
> > +
> > +if (!(blockNamedNodeData = qemuBlockGetNamedNodeData(vm, 
> > QEMU_ASYNC_JOB_MIGRATION_IN)))
> > +return -1;
> > +
> > +for (nextdisk = mig->blockDirtyBitmaps; nextdisk; nextdisk = 
> > nextdisk->next) {
> > +qemuMigrationBlockDirtyBitmapsDiskPtr disk = nextdisk->data;
> > +qemuBlockNamedNodeDataPtr nodedata;
> > +GSList *nextbitmap;
> > +
> > +if (!(nodedata = virHashLookup(blockNamedNodeData, 
> > disk->nodename))) {
> > +virReportError(VIR_ERR_INTERNAL_ERROR,
> > +   _("failed to find data for block node '%s'"),
> > +   disk->nodename);
> > +return -1;
> > +}
> > +
> > +/* don't migrate bitmaps into non-qcow2v3+ images */
> 
> How about "Bitmaps can only be migrated to qcow2 v3+"?
> 
> > +if (disk->disk->src->format != VIR_STORAGE_FILE_QCOW2 ||
> > +nodedata->qcow2v2) {
> > +disk->skip = true;
> 
> Is skipping the disk the right thing to do? Should we report an error
> and abort migration instead? Just checking, maybe we can't do so for
> backward compatibility...
> 
> > +continue;
> > +}
> > +
> > +for (nextbitmap = disk->bitmaps; nextbitmap; nextbitmap = 
> > nextbitmap->next) {
> > +qemuMigrationBlockDirtyBitmapsDiskBitmapPtr bitmap = 
> > nextbitmap->data;
> > +size_t k;
> > +
> > +/* don't migrate into existing bitmaps */
> > +for (k = 0; k < nodedata->nbitmaps; k++) {
> > +if (STREQ(bitmap->bitmapname, nodedata->bitmaps[k]->name)) 
> > {
> > +bitmap->skip = true;
> 
> And similar questions for bitmaps here.

That would require that we have the users explicitly enable this feature
rather than doing it implicitly. The disk format can be changed during
the migration.

If we want to do it explicitly only I'll need to add a migration flag
and all the infra for it.



Re: [PATCH 12/19] qemu: migration_cookie: Add XML handling for setting up bitmap migration

2021-02-18 Thread Jiri Denemark
On Thu, Feb 11, 2021 at 16:37:51 +0100, Peter Krempa wrote:
> In cases where we are copying the storage we need to ensure that also
> bitmaps are copied properly. This patch adds migration cookie XML
> infrastructure which will allow the migration sides reach consensus on
> which bitmaps to migrate.
> 
> Signed-off-by: Peter Krempa 
> ---
>  src/qemu/qemu_migration_cookie.c | 134 +++
>  src/qemu/qemu_migration_cookie.h |  34 
>  2 files changed, 168 insertions(+)
> 
> diff --git a/src/qemu/qemu_migration_cookie.c 
> b/src/qemu/qemu_migration_cookie.c
> index 6f2b1b2f57..94ba9c83d0 100644
> --- a/src/qemu/qemu_migration_cookie.c
> +++ b/src/qemu/qemu_migration_cookie.c
...
> @@ -758,6 +795,48 @@ 
> qemuMigrationCookieNBDXMLFormat(qemuMigrationCookieNBDPtr nbd,
>  }
> 
> 
> +static void
> +qemuMigrationCookieBlockDirtyBitmapsFormat(virBufferPtr buf,
> +   GSList *bitmaps)
> +{
> +g_auto(virBuffer) disksBuf = VIR_BUFFER_INIT_CHILD(buf);
> +GSList *nextdisk;
> +
> +for (nextdisk = bitmaps; nextdisk; nextdisk = nextdisk->next) {
> +qemuMigrationBlockDirtyBitmapsDiskPtr disk = nextdisk->data;
> +g_auto(virBuffer) diskAttrBuf = VIR_BUFFER_INITIALIZER;
> +g_auto(virBuffer) diskChildBuf = VIR_BUFFER_INIT_CHILD();
> +bool hasBitmaps = false;
> +GSList *nextbitmap;
> +
> +if (disk->skip || !disk->bitmaps)
> +continue;
> +
> +for (nextbitmap = disk->bitmaps; nextbitmap; nextbitmap = 
> nextbitmap->next) {
> +qemuMigrationBlockDirtyBitmapsDiskBitmapPtr bitmap = 
> nextbitmap->data;
> +
> +if (bitmap->skip)
> +continue;
> +
> +virBufferAsprintf(,
> +  "\n",
> +  bitmap->bitmapname, bitmap->alias);
> +
> +hasBitmaps = true;
> +}
> +
> +if (!hasBitmaps)
> +continue;

You could drop hasBitmaps and call virBufferUse instead, but not a big
deal.

> +
> +virBufferAsprintf(, " target='%s'", disk->target);
> +virXMLFormatElement(, "disk", , );
> +}
> +
> +
> +virXMLFormatElement(buf, "blockDirtyBitmaps", NULL, );
> +}
> +
> +
>  int
>  qemuMigrationCookieXMLFormat(virQEMUDriverPtr driver,
>   virQEMUCapsPtr qemuCaps,
> @@ -829,6 +908,9 @@ qemuMigrationCookieXMLFormat(virQEMUDriverPtr driver,
>  if (mig->flags & QEMU_MIGRATION_COOKIE_CAPS)
>  qemuMigrationCookieCapsXMLFormat(buf, mig->caps);
> 
> +if (mig->flags & QEMU_MIGRATION_COOKIE_BLOCK_DIRTY_BITMAPS)
> +qemuMigrationCookieBlockDirtyBitmapsFormat(buf, 
> mig->blockDirtyBitmaps);
> +
>  virBufferAdjustIndent(buf, -2);
>  virBufferAddLit(buf, "\n");
>  return 0;
> @@ -1132,6 +1214,53 @@ 
> qemuMigrationCookieXMLParseMandatoryFeatures(xmlXPathContextPtr ctxt,
>  }
> 
> 
> +static int
> +qemuMigrationCookieBlockDirtyBitmapsParse(xmlXPathContextPtr ctxt,
> +  qemuMigrationCookiePtr mig)
> +{
> +g_autoslist(qemuMigrationBlockDirtyBitmapsDisk) disks = NULL;
> +g_autofree xmlNodePtr *disknodes = NULL;
> +int ndisknodes;
> +size_t i;
> +VIR_XPATH_NODE_AUTORESTORE(ctxt)
> +
> +if ((ndisknodes = virXPathNodeSet("./blockDirtyBitmaps/disk", ctxt, 
> )) < 0)
> +return -1;
> +
> +for (i = 0; i < ndisknodes; i++) {
> +GSList *bitmaps = NULL;
> +qemuMigrationBlockDirtyBitmapsDiskPtr disk;
> +g_autofree xmlNodePtr *bitmapnodes = NULL;
> +int nbitmapnodes;
> +size_t j;
> +
> +ctxt->node = disknodes[i];
> +
> +if ((nbitmapnodes = virXPathNodeSet("./bitmap", ctxt, )) 
> < 0)
> +return -1;
> +
> +for (j = 0; j < nbitmapnodes; j++) {
> +qemuMigrationBlockDirtyBitmapsDiskBitmapPtr bitmap;
> +
> +bitmap = g_new0(qemuMigrationBlockDirtyBitmapsDiskBitmap, 1);
> +bitmap->bitmapname = virXMLPropString(bitmapnodes[j], "name");
> +bitmap->alias = virXMLPropString(bitmapnodes[j], "alias");

So what if the attributes do not exist? And virXMLPropString does not
abort on OOM. You should check for the result being non-NULL here.

> +bitmaps = g_slist_prepend(bitmaps, bitmap);
> +}
> +
> +disk = g_new0(qemuMigrationBlockDirtyBitmapsDisk, 1);
> +disk->target = virXMLPropString(disknodes[i], "target");

And here as well.

> +disk->bitmaps = g_slist_reverse(bitmaps);
> +
> +disks = g_slist_prepend(disks, disk);
> +}
> +
> +mig->blockDirtyBitmaps = g_slist_reverse(g_steal_pointer());
> +
> +return 0;
> +}
> +
> +
>  static int
>  qemuMigrationCookieXMLParse(qemuMigrationCookiePtr mig,
>  virQEMUDriverPtr driver,
...

With the checks for virXMLPropString result added

Reviewed-by: Jiri Denemark 



Re: [PATCH 16/19] tests: qemumigrationcookie: Add testing for block dirty bitmap migration

2021-02-18 Thread Jiri Denemark
On Thu, Feb 11, 2021 at 16:37:55 +0100, Peter Krempa wrote:
> Test the XML infrastructure for  migration cookie
> element as well as the conversion to migration parameters for QMP schema
> validation.
> 
> Signed-off-by: Peter Krempa 
> ---
>  tests/meson.build |   2 +-
>  .../nbd-bitmaps-xml2xml-in.xml|  52 ++
>  .../nbd-bitmaps-xml2xml-migparams.json|  25 +++
>  .../nbd-bitmaps-xml2xml-out.xml   |  51 ++
>  tests/qemumigrationcookiexmltest.c| 166 +++---
>  5 files changed, 269 insertions(+), 27 deletions(-)
>  create mode 100644 
> tests/qemumigrationcookiexmldata/nbd-bitmaps-xml2xml-in.xml
>  create mode 100644 
> tests/qemumigrationcookiexmldata/nbd-bitmaps-xml2xml-migparams.json
>  create mode 100644 
> tests/qemumigrationcookiexmldata/nbd-bitmaps-xml2xml-out.xml

Reviewed-by: Jiri Denemark 



Re: [PATCH 13/19] qemu: migration_cookie: Add helpers for transforming the cookie into migration params

2021-02-18 Thread Jiri Denemark
On Thu, Feb 11, 2021 at 16:37:52 +0100, Peter Krempa wrote:
> 'qemuMigrationCookieBlockDirtyBitmapsMatchDisks' maps the bitmaps from
> the migration cookie to actual disk objects definition pointers.
> 
> 'qemuMigrationCookieBlockDirtyBitmapsToParams' converts the bitmap
> definitions from the migration cookie into parameters for the
> 'block-bitmap-mapping' migration parameter.
> 
> Signed-off-by: Peter Krempa 
> ---
>  src/qemu/qemu_migration_cookie.c | 115 +++
>  src/qemu/qemu_migration_cookie.h |   7 ++
>  2 files changed, 122 insertions(+)
...
> diff --git a/src/qemu/qemu_migration_cookie.h 
> b/src/qemu/qemu_migration_cookie.h
> index 8636f955da..f8393e0ce0 100644
> --- a/src/qemu/qemu_migration_cookie.h
> +++ b/src/qemu/qemu_migration_cookie.h
> @@ -226,3 +226,10 @@ qemuMigrationCookieXMLFormat(virQEMUDriverPtr driver,
>   virQEMUCapsPtr qemuCaps,
>   virBufferPtr buf,
>   qemuMigrationCookiePtr mig);
> +
> +int qemuMigrationCookieBlockDirtyBitmapsMatchDisks(virDomainDefPtr def,
> +   GSList *disks);
> +
> +int
> +qemuMigrationCookieBlockDirtyBitmapsToParams(GSList *disks,
> + virJSONValuePtr *mapping);

A little bit of consistency would not hurt :-) Either

int
qemuMigrationCookieBlockDirtyBitmapsMatchDisks(...

or

int qemuMigrationCookieBlockDirtyBitmapsToParams(...


Reviewed-by: Jiri Denemark 



Re: [PATCH 17/19] qemu: migration: Clean up temporary bitmaps when cancelling a migration

2021-02-18 Thread Jiri Denemark
On Thu, Feb 11, 2021 at 16:37:56 +0100, Peter Krempa wrote:
> In case when the block migration job required temporary bitmaps for
> merging the appropriate checkpoints we need to clean them up when
> cancelling the job. On success we don't need to do that though as the
> bitmaps are just temporary thus are not written to disk.
> 
> Signed-off-by: Peter Krempa 
> ---
>  src/qemu/qemu_migration.c | 31 +++
>  1 file changed, 31 insertions(+)
> 
> diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
> index 37f0d43d24..36424f8493 100644
> --- a/src/qemu/qemu_migration.c
> +++ b/src/qemu/qemu_migration.c
> @@ -834,6 +834,32 @@ qemuMigrationSrcNBDCopyCancel(virQEMUDriverPtr driver,
>  }
> 
> 
> +static int
> +qemuMigrationSrcCancelRemoveTempBitmaps(virDomainObjPtr vm,
> +qemuDomainAsyncJob asyncJob)
> +{
> +qemuDomainObjPrivatePtr priv = vm->privateData;
> +virQEMUDriverPtr driver = priv->driver;
> +qemuDomainJobPrivatePtr jobPriv = priv->job.privateData;
> +GSList *next;
> +
> +if (!jobPriv->migTempBitmaps)
> +return 0;

This check is pretty much redundant as the loop will do exactly the
same.

> +
> +for (next = jobPriv->migTempBitmaps; next; next = next->next) {
> +qemuDomainJobPrivateMigrateTempBitmapPtr t = next->data;
> +
> +if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0)
> +return -1;
> +qemuMonitorBitmapRemove(priv->mon, t->nodename, t->bitmapname);
> +if (qemuDomainObjExitMonitor(driver, vm) < 0)
> +return -1;
> +}
> +
> +return 0;
> +}
> +
> +
>  static virStorageSourcePtr
>  qemuMigrationSrcNBDStorageCopyBlockdevPrepareSource(virDomainDiskDefPtr disk,
>  const char *host,
...

Reviewed-by: Jiri Denemark 



Re: [PATCH v2 07/10] qemu_driver.c: validate 'driverName' earlier in qemuNodeDeviceDetachFlags()

2021-02-18 Thread John Ferlan



On 2/2/21 8:06 PM, Daniel Henrique Barboza wrote:
> The validation of 'driverName' does not depend on any other state and can be
> done right on the start of the function. We can fail earlier while avoiding
> a cleanup jump.
> 
> Reviewed-by: Ján Tomko 
> Signed-off-by: Daniel Henrique Barboza 
> ---
>  src/qemu/qemu_driver.c | 41 -
>  1 file changed, 20 insertions(+), 21 deletions(-)
> > diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index 64ae8fafc0..c6ba33e4ad 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -11982,6 +11982,25 @@ qemuNodeDeviceDetachFlags(virNodeDevicePtr dev,
>  
>  virCheckFlags(0, -1);
>  
> +if (!driverName)
> +driverName = "vfio";
> +
> +if (STREQ(driverName, "vfio") && !vfio) {
> +virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s",
> +   _("VFIO device assignment is currently not "
> + "supported on this system"));
> + return -1;
> +} else if (STREQ(driverName, "kvm")) {
> +virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s",
> +   _("KVM device assignment is no longer "
> + "supported on this system"));
> +return -1;
> +} else {
> +virReportError(VIR_ERR_INVALID_ARG,
> +   _("unknown driver name '%s'"), driverName);
> +return -1;
> +}
> +

Coverity points out the rest of this is unreachable now (even with the
subsequent patch).

>  if (!(nodeconn = virGetConnectNodeDev()))
>  goto cleanup;
>  
> @@ -12013,27 +12032,7 @@ qemuNodeDeviceDetachFlags(virNodeDevicePtr dev,
>  if (!pci)
>  goto cleanup;
>  
> -if (!driverName)
> -driverName = "vfio";
> -
> -if (STREQ(driverName, "vfio")) {
> -if (!vfio) {
> -virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s",
> -   _("VFIO device assignment is currently not "
> - "supported on this system"));
> -goto cleanup;
> -}
> -virPCIDeviceSetStubDriver(pci, VIR_PCI_STUB_DRIVER_VFIO);

This section seems to be more than just code motion...  it used to pass
thru here when vfio = true, but with the movement it would fall into the
catch all else.

John

> -} else if (STREQ(driverName, "kvm")) {
> -virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s",
> -   _("KVM device assignment is no longer "
> - "supported on this system"));
> -goto cleanup;
> -} else {
> -virReportError(VIR_ERR_INVALID_ARG,
> -   _("unknown driver name '%s'"), driverName);
> -goto cleanup;
> -}
> +virPCIDeviceSetStubDriver(pci, VIR_PCI_STUB_DRIVER_VFIO);
>  
>  ret = virHostdevPCINodeDeviceDetach(hostdev_mgr, pci);
>   cleanup:
> 



Plans for the next release

2021-02-18 Thread Jiri Denemark
We are getting close to the next release of libvirt. To aim for the
release on Mar 01 I suggest entering the freeze on Monday Feb 22 and
tagging RC2 on Thursday Feb 25.

I hope this works for everyone.

Jirka



Re: [PATCH 15/19] tests: qemustatusxml2xml: Add status XML from migration with bitmaps

2021-02-18 Thread Jiri Denemark
On Thu, Feb 11, 2021 at 16:37:54 +0100, Peter Krempa wrote:
> The XML sample shows the status XML when migrating with bitmaps
> including the  element added in previous commit.
> 
> It will also be used for the migration cookie test.
> 
> Signed-off-by: Peter Krempa 
> ---
>  .../migration-out-nbd-bitmaps-in.xml  | 574 ++
>  .../migration-out-nbd-bitmaps-out.xml |   1 +
>  tests/qemustatusxml2xmltest.c |   1 +
>  3 files changed, 576 insertions(+)
>  create mode 100644 
> tests/qemustatusxml2xmldata/migration-out-nbd-bitmaps-in.xml
>  create mode 12 
> tests/qemustatusxml2xmldata/migration-out-nbd-bitmaps-out.xml

Reviewed-by: Jiri Denemark 



Re: [PATCH 14/19] qemu: domain: Store list of temporary bitmaps for migration in status XML

2021-02-18 Thread Jiri Denemark
On Thu, Feb 11, 2021 at 16:37:53 +0100, Peter Krempa wrote:
> Add status XML infrastructure for storing a list of block dirty bitmaps
> which are temporarily used when migrating a VM with
> VIR_MIGRATE_NON_SHARED_DISK for cleanup after a libvirtd restart during
> migration.
> 
> Signed-off-by: Peter Krempa 
> ---
>  src/qemu/qemu_domain.c | 90 --
>  src/qemu/qemu_domain.h | 15 +++
>  2 files changed, 102 insertions(+), 3 deletions(-)
> 
> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index 53b4fb5f4f..ed37917670 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -86,6 +86,18 @@ qemuJobAllocPrivate(void)
>  }
> 
> 
> +void
> +qemuDomainJobPrivateMigrateTempBitmapFree(qemuDomainJobPrivateMigrateTempBitmapPtr
>  bmp)
> +{
> +if (!bmp)
> +return;
> +
> +g_free(bmp->nodename);
> +g_free(bmp->bitmapname);
> +g_free(bmp);
> +}
> +
> +
>  static void
>  qemuJobFreePrivate(void *opaque)
>  {
> @@ -95,6 +107,9 @@ qemuJobFreePrivate(void *opaque)
>  return;
> 
>  qemuMigrationParamsFree(priv->migParams);
> +if (priv->migTempBitmaps)
> +g_slist_free_full(priv->migTempBitmaps,
> +  (GDestroyNotify) 
> qemuDomainJobPrivateMigrateTempBitmapFree);

I just realized this now although the same pattern is used in previous
patches... Shouldn't g_slist_free_full be able to cope with NULL making
the if () check before it redundant?

>  VIR_FREE(priv);
>  }
> 
> @@ -165,6 +180,28 @@ qemuDomainObjPrivateXMLFormatNBDMigration(virBufferPtr 
> buf,
>  return 0;
>  }
> 
> +
> +static void
> +qemuDomainObjPrivateXMLFormatMigrationBlockDirtyBitmapsTemp(virBufferPtr buf,
> +GSList *bitmaps)

The naming is pretty inconsistent here, how about

qemuDomainObjPrivateXMLFormatMigrateTempBitmap(...)?

> +{
> +g_auto(virBuffer) childBuf = VIR_BUFFER_INIT_CHILD(buf);
> +GSList *next;
> +
> +for (next = bitmaps; next; next = next->next) {
> +qemuDomainJobPrivateMigrateTempBitmapPtr t = next->data;
> +g_auto(virBuffer) bitmapBuf = VIR_BUFFER_INITIALIZER;
> +
> +virBufferAsprintf(, " name='%s'", t->bitmapname);
> +virBufferAsprintf(, " nodename='%s'", t->nodename);
> +
> +virXMLFormatElement(, "bitmap", , NULL);
> +}
> +
> +virXMLFormatElement(buf, "tempBlockDirtyBitmaps", NULL, );
> +}
> +
> +
>  static int
>  qemuDomainFormatJobPrivate(virBufferPtr buf,
> qemuDomainJobObjPtr job,
> @@ -172,9 +209,14 @@ qemuDomainFormatJobPrivate(virBufferPtr buf,
>  {
>  qemuDomainJobPrivatePtr priv = job->privateData;
> 
> -if (job->asyncJob == QEMU_ASYNC_JOB_MIGRATION_OUT &&
> -qemuDomainObjPrivateXMLFormatNBDMigration(buf, vm) < 0)
> -return -1;
> +if (job->asyncJob == QEMU_ASYNC_JOB_MIGRATION_OUT) {
> +if (qemuDomainObjPrivateXMLFormatNBDMigration(buf, vm) < 0)
> +return -1;
> +
> +if (priv->migTempBitmaps)
> +qemuDomainObjPrivateXMLFormatMigrationBlockDirtyBitmapsTemp(buf,
> +
> priv->migTempBitmaps);

You could just directly call the formatting function as it won't format
anything when priv->migTempBitmaps is an empty list.

> +}
> 
>  if (priv->migParams)
>  qemuMigrationParamsFormat(buf, priv->migParams);
> @@ -267,6 +309,45 @@ qemuDomainObjPrivateXMLParseJobNBD(virDomainObjPtr vm,
>  return 0;
>  }
> 
> +
> +static int
> +qemuDomainParseJobPrivateXMLMigrationBlockDirtyBitmapsTemp(qemuDomainJobPrivatePtr
>  jobPriv,
> +   
> xmlXPathContextPtr ctxt)

qemuDomainObjPrivateXMLParseMigrateTempBitmap would make the naming a
bit more consistent with other formatting and parsing functions.

> +{
> +g_autoslist(qemuDomainJobPrivateMigrateTempBitmap) bitmaps = NULL;
> +g_autofree xmlNodePtr *nodes = NULL;
> +size_t i;
> +int n;
> +
> +if ((n = virXPathNodeSet("./tempBlockDirtyBitmaps/bitmap", ctxt, 
> )) < 0)
> +return -1;
> +
> +if (n == 0)
> +return 0;
> +
> +for (i = 0; i < n; i++) {
> +g_autofree char *bitmapname = virXMLPropString(nodes[i], "name");
> +g_autofree char *nodename = virXMLPropString(nodes[i], "nodename");
> +qemuDomainJobPrivateMigrateTempBitmapPtr bmp;
> +
> +if (!bitmapname || !nodename) {
> +virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> +   _("malformed "));
> +return -1;
> +}

Right, something similar is needed in patch 12/19. Although you could
mention extend the error message here and mention the error happened in
a status XML.

And you could even get away without the temp variables by marking bmp
for autoclean and stealing its content when adding to the list.

> +
> +bmp = 

Re: [PATCH v2 09/13] Introduce MEMORY_DEVICE_SIZE_CHANGE event

2021-02-18 Thread Peter Krempa
On Thu, Feb 18, 2021 at 14:31:04 +0100, Michal Privoznik wrote:
> New event is introduced that is emitted whenever guest
> acknowledges allocation change request of a virtio-mem.
> The aim is to let applications know when that happens,
> because changes in allocation are not synchronous with
> issuing the request. Under the hood, the event is
> emitted whenever QEMU emits MEMORY_DEVICE_SIZE_CHANGE
> event.
> 
> Signed-off-by: Michal Privoznik 
> ---
>  examples/c/misc/event-test.c| 17 ++
>  include/libvirt/libvirt-domain.h| 23 
>  src/conf/domain_event.c | 84 +
>  src/conf/domain_event.h | 10 
>  src/libvirt_private.syms|  2 +
>  src/qemu/qemu_driver.c  |  6 +++
>  src/remote/remote_daemon_dispatch.c | 30 +++
>  src/remote/remote_driver.c  | 32 +++
>  src/remote/remote_protocol.x| 15 +-
>  src/remote_protocol-structs |  7 +++
>  tools/virsh-domain.c| 20 +++
>  11 files changed, 245 insertions(+), 1 deletion(-)

[...]

> diff --git a/include/libvirt/libvirt-domain.h 
> b/include/libvirt/libvirt-domain.h
> index 8011cf9fe1..2c449b4f31 100644
> --- a/include/libvirt/libvirt-domain.h
> +++ b/include/libvirt/libvirt-domain.h
> @@ -4651,6 +4651,28 @@ typedef void 
> (*virConnectDomainEventMemoryFailureCallback)(virConnectPtr conn,
> unsigned int 
> flags,
> void *opaque);
>  
> +/**
> + * virConnectDomainEventMemoryDeviceSizeChangeCallback:
> + * @conn: connection object
> + * @dom: domain on which the event occurred
> + * @alias: memory device alias
> + * @size: new actual size of memory device (in KiB)

I'd specifically call this actualsize or something to maybe add one more
layer to avoid confuision with the 'size'

> + * @opaque: application specified data
> + *
> + * The callback occurs when the guest acknowledges request to change size of
> + * memory device (so far only virtio-mem model supports this). The @size then
> + * reflects the new amount of guest visible memory (in kibibytes).
> + *
> + * The callback signature to use when registering for an event of type
> + * VIR_DOMAIN_EVENT_ID_MEMORY_DEVICE_SIZE_CHANGE with
> + * virConnectDomainEventRegisterAny().
> + */
> +typedef void 
> (*virConnectDomainEventMemoryDeviceSizeChangeCallback)(virConnectPtr conn,
> +
> virDomainPtr dom,
> +const 
> char *alias,
> +unsigned 
> long long size,
> +void 
> *opaque);
> +
>  
>  /**
>   * VIR_DOMAIN_EVENT_CALLBACK:

But I don't think it needs to be changed in the internals though.



Re: [PATCH v2 10/13] qemu: Refresh the actual size of virtio-mem on monitor reconnect

2021-02-18 Thread Peter Krempa
On Thu, Feb 18, 2021 at 14:31:05 +0100, Michal Privoznik wrote:
> If the QEMU driver restarts it loses the track of the actual size
> of virtio-mem (because it's runtime type of information and thus
> not stored in XML) and therefore, we have to refresh it when
> reconnecting to the domain monitor.
> 
> Signed-off-by: Michal Privoznik 
> ---
>  src/qemu/qemu_domain.c   | 37 
>  src/qemu/qemu_monitor.h  |  3 ++
>  src/qemu/qemu_monitor_json.c | 56 ++--
>  src/qemu/qemu_process.c  |  3 ++
>  4 files changed, 72 insertions(+), 27 deletions(-)

[...]

>  int qemuMonitorGetMemoryDeviceInfo(qemuMonitorPtr mon,
> diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c
> index 9ce7ba52ba..09e898442d 100644
> --- a/src/qemu/qemu_monitor_json.c
> +++ b/src/qemu/qemu_monitor_json.c
> @@ -8238,7 +8238,6 @@ qemuMonitorJSONGetMemoryDeviceInfo(qemuMonitorPtr mon,
>  virJSONValuePtr cmd;
>  virJSONValuePtr reply = NULL;
>  virJSONValuePtr data = NULL;
> -qemuMonitorMemoryDeviceInfoPtr meminfo = NULL;
>  size_t i;
>  
>  if (!(cmd = qemuMonitorJSONMakeCommand("query-memory-devices", NULL)))
> @@ -8259,6 +8258,9 @@ qemuMonitorJSONGetMemoryDeviceInfo(qemuMonitorPtr mon,
>  
>  for (i = 0; i < virJSONValueArraySize(data); i++) {
>  virJSONValuePtr elem = virJSONValueArrayGet(data, i);
> +g_autofree qemuMonitorMemoryDeviceInfoPtr meminfo = NULL;
> +virJSONValuePtr dimminfo;
> +const char *devalias;
>  const char *type;

../../../libvirt/src/qemu/qemu_monitor_json.c:8335:13: error: ‘devalias’ may be 
used uninitialized in this function [-Werror=maybe-uninitialized]
 8335 | if (virHashAddEntry(info, devalias, meminfo) < 0)
  | ^~~~



Re: [PATCH 19/19] qemu: capabilities: Enable QEMU_CAPS_INCREMENTAL_BACKUP

2021-02-18 Thread Jiri Denemark
On Thu, Feb 11, 2021 at 16:37:58 +0100, Peter Krempa wrote:
> For incremental backup we need QEMU_CAPS_BLOCKDEV,
> QEMU_CAPS_BLOCKDEV_REOPEN, QEMU_CAPS_MIGRATION_PARAM_BLOCK_BITMAP_MAPPING.
> 
> Signed-off-by: Peter Krempa 
> ---
>  src/qemu/qemu_capabilities.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
> index 38555dde98..7cc2532954 100644
> --- a/src/qemu/qemu_capabilities.c
> +++ b/src/qemu/qemu_capabilities.c
> @@ -5164,8 +5164,10 @@ virQEMUCapsInitQMPVersionCaps(virQEMUCapsPtr qemuCaps)
>  void
>  virQEMUCapsInitProcessCapsInterlock(virQEMUCapsPtr qemuCaps)
>  {
> -if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_BLOCKDEV))
> -virQEMUCapsClear(qemuCaps, QEMU_CAPS_INCREMENTAL_BACKUP);
> +if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_BLOCKDEV) &&
> +virQEMUCapsGet(qemuCaps, QEMU_CAPS_BLOCKDEV_REOPEN) &&
> +virQEMUCapsGet(qemuCaps, 
> QEMU_CAPS_MIGRATION_PARAM_BLOCK_BITMAP_MAPPING))
> +virQEMUCapsSet(qemuCaps, QEMU_CAPS_INCREMENTAL_BACKUP);
> 
>  if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_USB_STORAGE) &&
>  !virQEMUCapsGet(qemuCaps, QEMU_CAPS_USB_STORAGE_WERROR)) {

Reviewed-by: Jiri Denemark 



Re: [PATCH v2 11/13] virsh: Introduce update-memory command

2021-02-18 Thread Peter Krempa
On Thu, Feb 18, 2021 at 14:31:06 +0100, Michal Privoznik wrote:

s/update-memory/update-memory-device/ in subject

> New 'update-memory' command is introduced which aims on making it

and here.

> user friendly to change  device. So far I just need to
> change  so I'm introducing --requested-size only; but
> the idea is that this is extensible for other cases too. For
> instance, want to change ? A new --my-element
> argument can be easily introduced.



Re: [PATCH v2 13/13] kbase: Document virtio-mem

2021-02-18 Thread Peter Krempa
On Thu, Feb 18, 2021 at 14:31:08 +0100, Michal Privoznik wrote:
> This commit adds new memorydevices.rst page which should serve
> all models of memory devices. Yet, I'm documenting virtio-mem
> quirks only.
> 
> Signed-off-by: Michal Privoznik 
> ---
>  docs/kbase/index.rst |   4 +
>  docs/kbase/memorydevices.rst | 160 +++
>  docs/kbase/meson.build   |   1 +
>  3 files changed, 165 insertions(+)
>  create mode 100644 docs/kbase/memorydevices.rst
> 
> diff --git a/docs/kbase/index.rst b/docs/kbase/index.rst
> index 532804fe05..45450bf33b 100644
> --- a/docs/kbase/index.rst
> +++ b/docs/kbase/index.rst
> @@ -46,6 +46,10 @@ Usage
>  `PCI topology <../pci-addresses.html>`__
> Addressing schemes for PCI devices
>  
> +`Memory devices `__
> +   Memory devices and their use
> +
> +
>  Internals / Debugging
>  -
>  
> diff --git a/docs/kbase/memorydevices.rst b/docs/kbase/memorydevices.rst
> new file mode 100644
> index 00..23adf54e16
> --- /dev/null
> +++ b/docs/kbase/memorydevices.rst
> @@ -0,0 +1,160 @@
> +==
> +Memory devices
> +==
> +
> +.. contents::
> +
> +Basics
> +==
> +
> +Memory devices can be divided into two families: DIMMs and NVDIMMs. The 
> former
> +is typical RAM memory: it's volatile and thus its contents doesn't survive
> +reboots nor guest shut downs and power ons. The latter retains its contents
> +across reboots or power outages.
> +
> +In Libvirt, there are two models for DIMMs:
> +
> +* ``dimm`` model:
> +
> +  ::
> +
> +
> +  
> +523264
> +0
> +  
> +  
> +
> +
> +* ``virtio-mem`` model:
> +
> +  ::
> +
> +
> +  
> +1048576
> +0
> +2048
> +524288
> +  
> +   function='0x0'/>
> +
> +
> +Then there are two models for NVDIMMs:
> +
> +* ``nvidmm`` model:
> +
> +  ::
> +
> +
> +  
> +/tmp/nvdimm
> +  
> +  
> +523264
> +0
> +  
> +  
> +
> +
> +* ``virtio-pmem`` model:
> +
> +  ::
> +
> +
> +  
> +/tmp/virtio_pmem
> +  
> +  
> +524288
> +  
> +   function='0x0'/>
> +
> +
> +
> +Please not that (maybe somewhat surprisingly) virtio models go onto PCI bus

s/not/note/

> +instead of DIMM slots.
> +
> +Furthermore, DIMMs can have  element which configures backend 
> for
> +devices. For NVDIMMs the element is mandatory and reflects where the contents
> +is saved.
> +
> +See https://libvirt.org/formatdomain.html#elementsMemory

Please use a relative link inside of the libvirt project. You might need
to enclose it.

> +``virtio-mem`` model
> +
> +
> +The ``virtio-mem`` model can be viewed as revised memory balloon. It offers
> +memory hotplug and hotunplug solution (without the actual hotplug of the

I'd say 'adding and removing of memory without actual hotplug or unplug
of devices'.

> +device). It solves problems that memory balloon can't solve on its own and 
> thus
> +is more flexible than DIMM + balloon solution. ``virtio-mem`` is NUMA aware,
> +and thus memory can be inflated/deflated only for a subset of guest NUMA 
> nodes.
> +Also, it works with chunks that are either exposed to guest or taken back 
> from
> +it.
> +
> +See https://virtio-mem.gitlab.io/
> +
> +Under the hood, ``virtio-mem`` device is split into chunks of equal size 
> which
> +are then exposed to the guest. Either all of them or only a portion depending
> +on user's request. Therefore there are three important sizes for
> +``virtio-mem``. All are to be found under  element:
> +
> +#. The maximum size the device can ever offer, exposed under 
> +#. The size a single block, exposed under 

size of a

> +#. The current size exposed to the guest, exposed under 
> +
> +For instance, the following example the maximum size is 4GiB, the block size 
> is
> +2MiB and only 1GiB should be exposed to the guest:
> +
> +  ::
> +
> +
> +  
> +4194304
> +2048
> +1048576
> +  
> +
> +
> +Please note that  must be an integer multiple of 
> +size or zero (memory completely deflated) and has to be less or equal to
> + (memory completely inflated). Furthermore, QEMU recommends the

I'd avoid inflated/deflated since it has exactly the opposite meaning
with memballoon.

> + size to be as big as a Transparent Huge Page (usually 2MiB).
> +
> +To change the size exposed to the guest, users should pass memory device XML
> +with nothing but  changed into the
> +``virDomainUpdateDeviceFlags()`` API. For user's convenience this can be done
> +via virsh too:
> +
> + ::
> +
> +   # virsh update-memory-device $dom --requested size 2GiB
> +
> +If there are two or more  devices then ``--alias`` shall be used
> +to tell virsh which memory device should be updated.
> +
> +For running guests there is fourth size that can be found under 
> :
> +
> +  ::
> +
> +2097152

Re: [libvirt PATCH 0/4] Coverity diaries

2021-02-18 Thread Erik Skultety
On Thu, Feb 18, 2021 at 02:55:25PM +0100, Ján Tomko wrote:
> Ján Tomko (4):
>   security: dac: remove leftover virPCIDeviceFree
>   qemu: saveimage: only steal domXML on success
>   qemu: monitor: clear cpu props properly in CPUInfoClear
>   hyperv: check return value of virUUIDGenerate
> 
>  src/hyperv/hyperv_driver.c  | 4 +++-
>  src/qemu/qemu_monitor.c | 1 +
>  src/qemu/qemu_saveimage.c   | 3 +--
>  src/security/security_dac.c | 5 ++---
>  4 files changed, 7 insertions(+), 6 deletions(-)
> 
Reviewed-by: Erik Skultety 



Re: [PATCH 18/19] qemu: migration: Migrate block dirty bitmaps corresponding to checkpoints

2021-02-18 Thread Jiri Denemark
On Thu, Feb 11, 2021 at 16:37:57 +0100, Peter Krempa wrote:
> Preserve block dirty bitmaps after migration with
> QEMU_MONITOR_MIGRATE_NON_SHARED_(DISK|INC).
> 
> This patch implements functions which offer the bitmaps to the
> destination, check for eligibility on destination and then configure
> source for the migration.
> 
> Signed-off-by: Peter Krempa 
> ---
>  src/qemu/qemu_migration.c | 333 +-
>  1 file changed, 331 insertions(+), 2 deletions(-)
> 
> diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
> index 36424f8493..16bfad0390 100644
> --- a/src/qemu/qemu_migration.c
> +++ b/src/qemu/qemu_migration.c
...
> @@ -2528,6 +2619,92 @@ qemuMigrationDstPrepare(virDomainObjPtr vm,
>   migrateFrom, fd, NULL);
>  }
> 
> +
> +/**
> + * qemuMigrationDstPrepareAnyBlockDirtyBitmaps:
> + * @vm: domain object
> + * @mig: migration cookie
> + * @migParams: migration parameters
> + * @flags: migration flags
> + *
> + * Checks whether block dirty bitmaps offered by the migration source are
> + * to be migrated (e.g. they don't exist, the destination is compatible etc)
> + * and sets up destination qemu for migrating the bitmaps as well as updates 
> the
> + * list of eligible bitmaps in the migration cookie to be sent back to the
> + * source.
> + */
> +static int
> +qemuMigrationDstPrepareAnyBlockDirtyBitmaps(virDomainObjPtr vm,
> +qemuMigrationCookiePtr mig,
> +qemuMigrationParamsPtr migParams,
> +unsigned int flags)
> +{
> +qemuDomainObjPrivatePtr priv = vm->privateData;
> +g_autoptr(virJSONValue) mapping = NULL;
> +g_autoptr(GHashTable) blockNamedNodeData = NULL;
> +GSList *nextdisk;
> +
> +if (!mig->nbd ||
> +!mig->blockDirtyBitmaps ||
> +!(flags & (VIR_MIGRATE_NON_SHARED_DISK | 
> VIR_MIGRATE_NON_SHARED_INC)) ||
> +!virQEMUCapsGet(priv->qemuCaps, 
> QEMU_CAPS_MIGRATION_PARAM_BLOCK_BITMAP_MAPPING))
> +return 0;

Shouldn't we report an error in case the source sent bitmaps, but local
QEMU does not support QEMU_CAPS_MIGRATION_PARAM_BLOCK_BITMAP_MAPPING?

> +
> +if (qemuMigrationCookieBlockDirtyBitmapsMatchDisks(vm->def, 
> mig->blockDirtyBitmaps) < 0)
> +return -1;
> +
> +if (!(blockNamedNodeData = qemuBlockGetNamedNodeData(vm, 
> QEMU_ASYNC_JOB_MIGRATION_IN)))
> +return -1;
> +
> +for (nextdisk = mig->blockDirtyBitmaps; nextdisk; nextdisk = 
> nextdisk->next) {
> +qemuMigrationBlockDirtyBitmapsDiskPtr disk = nextdisk->data;
> +qemuBlockNamedNodeDataPtr nodedata;
> +GSList *nextbitmap;
> +
> +if (!(nodedata = virHashLookup(blockNamedNodeData, disk->nodename))) 
> {
> +virReportError(VIR_ERR_INTERNAL_ERROR,
> +   _("failed to find data for block node '%s'"),
> +   disk->nodename);
> +return -1;
> +}
> +
> +/* don't migrate bitmaps into non-qcow2v3+ images */

How about "Bitmaps can only be migrated to qcow2 v3+"?

> +if (disk->disk->src->format != VIR_STORAGE_FILE_QCOW2 ||
> +nodedata->qcow2v2) {
> +disk->skip = true;

Is skipping the disk the right thing to do? Should we report an error
and abort migration instead? Just checking, maybe we can't do so for
backward compatibility...

> +continue;
> +}
> +
> +for (nextbitmap = disk->bitmaps; nextbitmap; nextbitmap = 
> nextbitmap->next) {
> +qemuMigrationBlockDirtyBitmapsDiskBitmapPtr bitmap = 
> nextbitmap->data;
> +size_t k;
> +
> +/* don't migrate into existing bitmaps */
> +for (k = 0; k < nodedata->nbitmaps; k++) {
> +if (STREQ(bitmap->bitmapname, nodedata->bitmaps[k]->name)) {
> +bitmap->skip = true;

And similar questions for bitmaps here.

> +break;
> +}
> +}
> +
> +if (bitmap->skip)
> +continue;
> +}
> +}
> +
> +if (qemuMigrationCookieBlockDirtyBitmapsToParams(mig->blockDirtyBitmaps,
> + ) < 0)
> +return -1;
> +
> +if (!mapping)
> +return 0;
> +
> +qemuMigrationParamsSetBlockDirtyBitmapMapping(migParams, );
> +mig->flags |= QEMU_MIGRATION_COOKIE_BLOCK_DIRTY_BITMAPS;
> +return 0;
> +}
> +
> +
>  static int
>  qemuMigrationDstPrepareAny(virQEMUDriverPtr driver,
> virConnectPtr dconn,
...
> +static int
> +qemuMigrationSrcRunPrepareBlockDirtyBitmaps(virDomainObjPtr vm,
> +qemuMigrationCookiePtr mig,
> +qemuMigrationParamsPtr migParams,
> +unsigned int flags)
> +
> +{
> +

Re: [PATCH v2 05/13] conf: Introduce virtio-mem model

2021-02-18 Thread David Hildenbrand

On 18.02.21 14:31, Michal Privoznik wrote:

The virtio-mem is paravirtualized mechanism of adding/removing
memory to/from a VM. A virtio-mem-pci device is split into blocks
of equal size which are then exposed (all or only a requested
portion of them) to the guest kernel to use as regular memory.
Therefore, the device has two important attributes:

   1) block-size, which defines the size of a block
   2) requested-size, which defines how much memory (in bytes)
  is the device requested to expose to the guest.

The 'block-size' is configured on command line and immutable
throughout device's lifetime. The 'requested-size' can be set on
the command line too, but also is adjustable via monitor. In
fact, that is how management software places its requests to
change the memory allocation. If it wants to give more memory to
the guest it changes 'requested-size' to a bigger value, and if it
wants to shrink guest memory it changes the 'requested-size' to a
smaller value. Note, value of zero means that guest should
release all memory offered by the device. Of course, guest has to
cooperate. Therefore, there is a third attribute 'size' which is
read only and reflects how much memory the guest still has. This
can be different to 'requested-size', obviously. Because of name
clash, I've named it 'actualsize' and it is dealt with in future
commits (it is a runtime information anyway).

In the backend, memory for virtio-mem is backed by usual objects:
memory-backend-{ram,file,memfd} and their size puts the cap on
the amount of memory that a virtio-mem device can offer to a
guest. But we are already able to express this info using 
under .

Therefore, we need only two more elements to cover 'block-size'
and 'requested-size' attributes. This is the XML I've came up
with:

   
 
   1-3
   2048
 
 
   2097152
   0
   2048
   1048576
 
 
   

I hope by now it is obvious that:

   1) 'requested-size' must be an integer multiple of
  'block-size', and
   2) virtio-mem-pci device goes onto PCI bus and thus needs PCI
  address.

Then there is a limitation that the minimal 'block-size' is
transparent huge page size (I'll leave this without explanation).

Since now we have (possibly) two or more devices that allow
memory inflation/deflation and accounting for all of them (and
thus keeping  updated) might be hard. Therefore,
I'm deliberately forbidding memballoon. It's okay - virtio-mem is
superior to memballoon anyway. We can always reevaluate later.


That's a bad idea. It'll still be used for getting memory stats, free 
page hinting and free page reporting.


Very weird use cases might even want to mix balloon inflation/deflation 
with virtio-mem ...


--
Thanks,

David / dhildenb



Re: [PATCH v2 05/13] conf: Introduce virtio-mem model

2021-02-18 Thread Peter Krempa
On Thu, Feb 18, 2021 at 14:31:00 +0100, Michal Privoznik wrote:
> The virtio-mem is paravirtualized mechanism of adding/removing
> memory to/from a VM. A virtio-mem-pci device is split into blocks
> of equal size which are then exposed (all or only a requested
> portion of them) to the guest kernel to use as regular memory.
> Therefore, the device has two important attributes:
> 
>   1) block-size, which defines the size of a block
>   2) requested-size, which defines how much memory (in bytes)
>  is the device requested to expose to the guest.
> 
> The 'block-size' is configured on command line and immutable
> throughout device's lifetime. The 'requested-size' can be set on
> the command line too, but also is adjustable via monitor. In
> fact, that is how management software places its requests to
> change the memory allocation. If it wants to give more memory to
> the guest it changes 'requested-size' to a bigger value, and if it
> wants to shrink guest memory it changes the 'requested-size' to a
> smaller value. Note, value of zero means that guest should
> release all memory offered by the device. Of course, guest has to
> cooperate. Therefore, there is a third attribute 'size' which is
> read only and reflects how much memory the guest still has. This
> can be different to 'requested-size', obviously. Because of name
> clash, I've named it 'actualsize' and it is dealt with in future
> commits (it is a runtime information anyway).
> 
> In the backend, memory for virtio-mem is backed by usual objects:
> memory-backend-{ram,file,memfd} and their size puts the cap on
> the amount of memory that a virtio-mem device can offer to a
> guest. But we are already able to express this info using 
> under .
> 
> Therefore, we need only two more elements to cover 'block-size'
> and 'requested-size' attributes. This is the XML I've came up
> with:
> 
>   
> 
>   1-3
>   2048
> 
> 
>   2097152
>   0
>   2048
>   1048576
> 
>  function='0x0'/>
>   
> 
> I hope by now it is obvious that:
> 
>   1) 'requested-size' must be an integer multiple of
>  'block-size', and
>   2) virtio-mem-pci device goes onto PCI bus and thus needs PCI
>  address.
> 
> Then there is a limitation that the minimal 'block-size' is
> transparent huge page size (I'll leave this without explanation).
> 
> Since now we have (possibly) two or more devices that allow
> memory inflation/deflation and accounting for all of them (and
> thus keeping  updated) might be hard. Therefore,
> I'm deliberately forbidding memballoon. It's okay - virtio-mem is
> superior to memballoon anyway. We can always reevaluate later.
> 
> Signed-off-by: Michal Privoznik 
> ---
>  docs/formatdomain.rst | 51 +--
>  docs/schemas/domaincommon.rng | 11 
>  src/conf/domain_conf.c| 53 ++-
>  src/conf/domain_conf.h|  3 +
>  src/conf/domain_validate.c| 39 +++
>  src/qemu/qemu_alias.c |  1 +
>  src/qemu/qemu_command.c   |  1 +
>  src/qemu/qemu_domain.c| 27 +++-
>  src/qemu/qemu_domain_address.c| 38 ---
>  src/qemu/qemu_process.c   |  2 +
>  src/qemu/qemu_validate.c  | 22 ++-
>  src/security/security_apparmor.c  |  1 +
>  src/security/security_dac.c   |  2 +
>  src/security/security_selinux.c   |  2 +
>  .../memory-hotplug-virtio-mem.xml | 64 +++
>  ...emory-hotplug-virtio-mem.x86_64-latest.xml |  1 +
>  tests/qemuxml2xmltest.c   |  1 +
>  17 files changed, 298 insertions(+), 21 deletions(-)
>  create mode 100644 tests/qemuxml2argvdata/memory-hotplug-virtio-mem.xml
>  create mode 12 
> tests/qemuxml2xmloutdata/memory-hotplug-virtio-mem.x86_64-latest.xml
> 
> diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst
> index 2587106191..05d359a32d 100644
> --- a/docs/formatdomain.rst
> +++ b/docs/formatdomain.rst

[...]

> @@ -6935,6 +6937,16 @@ Example: automatically added device with KVM
> 
> ...
>  
> +Example: automatically added device with QEMU/KVM and a ``virtio-mem`` 
> device:
> +
> +::
> +
> +   ...
> +   

The example is missing the 'virtio-mem' device which causes the balloon
to be disabled.


> + 
> +   
> +   ...
> +
>  Example: manually added device with static PCI slot 2 requested
>  
>  ::


[...]

> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index 541d592bbe..26b9b5583e 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -3679,10 +3679,21 @@ qemuDomainDefAddDefaultDevices(virDomainDefPtr def,
>  }
>  
>  if (addDefaultMemballoon && !def->memballoon) {
> -virDomainMemballoonDefPtr memballoon;
> -memballoon = g_new0(virDomainMemballoonDef, 1);
> +

Re: [PATCH] virsh: Add virshKeycodeNameCompleter

2021-02-18 Thread Ján Tomko

On a Thursday in 2021, Kristina Hanicova wrote:

Signed-off-by: Kristina Hanicova 
---
tools/virsh-completer-domain.c | 72 ++
tools/virsh-completer-domain.h |  8 +++-
tools/virsh-domain.c   |  1 +
3 files changed, 79 insertions(+), 2 deletions(-)

diff --git a/tools/virsh-completer-domain.h b/tools/virsh-completer-domain.h
index 4c01b0ca1f..04a3705ff9 100644
--- a/tools/virsh-completer-domain.h
+++ b/tools/virsh-completer-domain.h
@@ -112,5 +112,9 @@ char ** virshDomainLifecycleActionCompleter(vshControl *ctl,
unsigned int flags);

char ** virshCodesetNameCompleter(vshControl *ctl,
-  const vshCmd *cmd,
-  unsigned int flags);
+  const vshCmd *cmd,
+  unsigned int flags);
+


This indentation change is not related to the rest of the commit,
so I split it out into a separate commit before pushing.


+char ** virshKeycodeNameCompleter(vshControl *ctl,
+  const vshCmd *cmd,
+  unsigned int flags);
diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c
index d40995f44d..df33467646 100644
--- a/tools/virsh-domain.c
+++ b/tools/virsh-domain.c
@@ -8802,6 +8802,7 @@ static const vshCmdOptDef opts_send_key[] = {
{.name = "keycode",
 .type = VSH_OT_ARGV,
 .flags = VSH_OFLAG_REQ,
+ .completer = virshKeycodeNameCompleter,
 .help = N_("the key code")
},
{.name = NULL}
--
2.29.2



Reviewed-by: Ján Tomko 

Jano


signature.asc
Description: PGP signature


Re: [libvirt PATCH v4 03/25] nodedev: Add ability to filter by active state

2021-02-18 Thread Jonathon Jongsma
On Mon, 15 Feb 2021 18:22:41 +0100
Erik Skultety  wrote:

> On Wed, Feb 03, 2021 at 11:38:47AM -0600, Jonathon Jongsma wrote:
> > Add two flag values for virConnectListAllNodeDevices() so that we
> > can list only node devices that are active or inactive.
> > 
> > Signed-off-by: Jonathon Jongsma 
> > Reviewed-by: Erik Skultety 
> > ---
> >  include/libvirt/libvirt-nodedev.h|  9 +++--
> >  src/conf/node_device_conf.h  |  8 
> >  src/conf/virnodedeviceobj.c  | 56
> >  src/libvirt-nodedev.c|
> >  2 + src/node_device/node_device_driver.c |  2 +-
> >  5 files changed, 50 insertions(+), 27 deletions(-)
> > 
> > diff --git a/include/libvirt/libvirt-nodedev.h
> > b/include/libvirt/libvirt-nodedev.h index eab8abf6ab..d304283871
> > 100644 --- a/include/libvirt/libvirt-nodedev.h
> > +++ b/include/libvirt/libvirt-nodedev.h
> > @@ -61,10 +61,9 @@ int virNodeListDevices
> > (virConnectPtr conn,
> >   * virConnectListAllNodeDevices:
> >   *
> >   * Flags used to filter the returned node devices. Flags in each
> > group
> > - * are exclusive. Currently only one group to filter the devices
> > by cap
> > - * type.
> > - */
> > + * are exclusive.  */
> >  typedef enum {
> > +/* filter the devices by cap type */
> >  VIR_CONNECT_LIST_NODE_DEVICES_CAP_SYSTEM= 1 << 0,  /*
> > System capability */ VIR_CONNECT_LIST_NODE_DEVICES_CAP_PCI_DEV
> >  = 1 << 1,  /* PCI device */
> > VIR_CONNECT_LIST_NODE_DEVICES_CAP_USB_DEV   = 1 << 2,  /* USB
> > device */ @@ -86,6 +85,10 @@ typedef enum {
> > VIR_CONNECT_LIST_NODE_DEVICES_CAP_AP_CARD   = 1 << 18, /* s390
> > AP Card device */ VIR_CONNECT_LIST_NODE_DEVICES_CAP_AP_QUEUE  =
> > 1 << 19, /* s390 AP Queue */
> > VIR_CONNECT_LIST_NODE_DEVICES_CAP_AP_MATRIX = 1 << 20, /* s390
> > AP Matrix */ +
> > +/* filter the devices by active state */
> > +VIR_CONNECT_LIST_NODE_DEVICES_INACTIVE  = 1 << 29, /*
> > Inactive devices */
> > +VIR_CONNECT_LIST_NODE_DEVICES_ACTIVE= 1 << 30, /*
> > Active devices */  
> 
> We don't define sentinels on public flags, so what are we saving the
> last value for? Why couldn't ^this become 1U << 31?
> 
> ...

Because gcc implements the enum as a signed int and 1<<31 overflows the
maximum positive integer value:

In file included from ../include/libvirt/libvirt.h:42,
 from ../src/internal.h:65,
 from ../src/util/vircgroupv1.c:30:
../include/libvirt/libvirt-nodedev.h:91:57: error: result of ‘1 << 31’
requires 33 bits to represent, but ‘int’ only has 32 bits
[-Werror=shift-overflow=] 91 | VIR_CONNECT_LIST_NODE_DEVICES_ACTIVE
   = 1 << 31, /* Active devices */ |
 ^~ cc1: all warnings being treated as errors


Jonathon



[PATCH] virsh: Add virshKeycodeNameCompleter

2021-02-18 Thread Kristina Hanicova
Signed-off-by: Kristina Hanicova 
---
 tools/virsh-completer-domain.c | 72 ++
 tools/virsh-completer-domain.h |  8 +++-
 tools/virsh-domain.c   |  1 +
 3 files changed, 79 insertions(+), 2 deletions(-)

diff --git a/tools/virsh-completer-domain.c b/tools/virsh-completer-domain.c
index fc4d7b2e52..15993064fa 100644
--- a/tools/virsh-completer-domain.c
+++ b/tools/virsh-completer-domain.c
@@ -32,6 +32,9 @@
 #include "virperf.h"
 #include "virbitmap.h"
 #include "virkeycode.h"
+#include "virkeynametable_linux.h"
+#include "virkeynametable_osx.h"
+#include "virkeynametable_win32.h"
 
 char **
 virshDomainNameCompleter(vshControl *ctl,
@@ -800,3 +803,72 @@ virshCodesetNameCompleter(vshControl *ctl G_GNUC_UNUSED,
 
 return g_steal_pointer();
 }
+
+
+char **
+virshKeycodeNameCompleter(vshControl *ctl,
+  const vshCmd *cmd,
+  unsigned int flags)
+{
+g_auto(GStrv) tmp = NULL;
+size_t i = 0;
+size_t j = 0;
+const char *codeset_option;
+int codeset;
+const char **names = NULL;
+size_t len;
+
+virCheckFlags(0, NULL);
+
+if (vshCommandOptStringQuiet(ctl, cmd, "codeset", _option) <= 0)
+codeset_option = "linux";
+
+if (STREQ(codeset_option, "rfb"))
+codeset_option = "qnum";
+
+codeset = virKeycodeSetTypeFromString(codeset_option);
+
+if (codeset < 0)
+return NULL;
+
+switch ((virKeycodeSet) codeset) {
+case VIR_KEYCODE_SET_LINUX:
+names = virKeyNameTable_linux;
+len = virKeyNameTable_linux_len;
+break;
+case VIR_KEYCODE_SET_OSX:
+names = virKeyNameTable_osx;
+len = virKeyNameTable_osx_len;
+break;
+case VIR_KEYCODE_SET_WIN32:
+names = virKeyNameTable_win32;
+len = virKeyNameTable_win32_len;
+break;
+case VIR_KEYCODE_SET_XT:
+case VIR_KEYCODE_SET_ATSET1:
+case VIR_KEYCODE_SET_ATSET2:
+case VIR_KEYCODE_SET_ATSET3:
+case VIR_KEYCODE_SET_XT_KBD:
+case VIR_KEYCODE_SET_USB:
+case VIR_KEYCODE_SET_QNUM:
+case VIR_KEYCODE_SET_LAST:
+break;
+}
+
+if (!names)
+return NULL;
+
+tmp = g_new0(char *, len + 1);
+
+for (i = 0; i < len; i++) {
+if (!names[i])
+continue;
+
+tmp[j] = g_strdup(names[i]);
+j++;
+}
+
+tmp = g_renew(char *, tmp, j + 1);
+
+return g_steal_pointer();
+}
diff --git a/tools/virsh-completer-domain.h b/tools/virsh-completer-domain.h
index 4c01b0ca1f..04a3705ff9 100644
--- a/tools/virsh-completer-domain.h
+++ b/tools/virsh-completer-domain.h
@@ -112,5 +112,9 @@ char ** virshDomainLifecycleActionCompleter(vshControl *ctl,
 unsigned int flags);
 
 char ** virshCodesetNameCompleter(vshControl *ctl,
-  const vshCmd *cmd,
-  unsigned int flags);
+  const vshCmd *cmd,
+  unsigned int flags);
+
+char ** virshKeycodeNameCompleter(vshControl *ctl,
+  const vshCmd *cmd,
+  unsigned int flags);
diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c
index d40995f44d..df33467646 100644
--- a/tools/virsh-domain.c
+++ b/tools/virsh-domain.c
@@ -8802,6 +8802,7 @@ static const vshCmdOptDef opts_send_key[] = {
 {.name = "keycode",
  .type = VSH_OT_ARGV,
  .flags = VSH_OFLAG_REQ,
+ .completer = virshKeycodeNameCompleter,
  .help = N_("the key code")
 },
 {.name = NULL}
-- 
2.29.2



Re: [PATCH 18/19] qemu: migration: Migrate block dirty bitmaps corresponding to checkpoints

2021-02-18 Thread Jiri Denemark
On Thu, Feb 18, 2021 at 16:50:43 +0100, Peter Krempa wrote:
> On Thu, Feb 18, 2021 at 15:54:37 +0100, Jiri Denemark wrote:
> > On Thu, Feb 11, 2021 at 16:37:57 +0100, Peter Krempa wrote:
> > > Preserve block dirty bitmaps after migration with
> > > QEMU_MONITOR_MIGRATE_NON_SHARED_(DISK|INC).
> > > 
> > > This patch implements functions which offer the bitmaps to the
> > > destination, check for eligibility on destination and then configure
> > > source for the migration.
> > > 
> > > Signed-off-by: Peter Krempa 
> > > ---
> > >  src/qemu/qemu_migration.c | 333 +-
> > >  1 file changed, 331 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
> > > index 36424f8493..16bfad0390 100644
> > > --- a/src/qemu/qemu_migration.c
> > > +++ b/src/qemu/qemu_migration.c
> > ...
> > > @@ -2528,6 +2619,92 @@ qemuMigrationDstPrepare(virDomainObjPtr vm,
> > >   migrateFrom, fd, NULL);
> > >  }
> > > 
> > > +
> > > +/**
> > > + * qemuMigrationDstPrepareAnyBlockDirtyBitmaps:
> > > + * @vm: domain object
> > > + * @mig: migration cookie
> > > + * @migParams: migration parameters
> > > + * @flags: migration flags
> > > + *
> > > + * Checks whether block dirty bitmaps offered by the migration source are
> > > + * to be migrated (e.g. they don't exist, the destination is compatible 
> > > etc)
> > > + * and sets up destination qemu for migrating the bitmaps as well as 
> > > updates the
> > > + * list of eligible bitmaps in the migration cookie to be sent back to 
> > > the
> > > + * source.
> > > + */
> > > +static int
> > > +qemuMigrationDstPrepareAnyBlockDirtyBitmaps(virDomainObjPtr vm,
> > > +qemuMigrationCookiePtr mig,
> > > +qemuMigrationParamsPtr 
> > > migParams,
> > > +unsigned int flags)
> > > +{
> > > +qemuDomainObjPrivatePtr priv = vm->privateData;
> > > +g_autoptr(virJSONValue) mapping = NULL;
> > > +g_autoptr(GHashTable) blockNamedNodeData = NULL;
> > > +GSList *nextdisk;
> > > +
> > > +if (!mig->nbd ||
> > > +!mig->blockDirtyBitmaps ||
> > > +!(flags & (VIR_MIGRATE_NON_SHARED_DISK | 
> > > VIR_MIGRATE_NON_SHARED_INC)) ||
> > > +!virQEMUCapsGet(priv->qemuCaps, 
> > > QEMU_CAPS_MIGRATION_PARAM_BLOCK_BITMAP_MAPPING))
> > > +return 0;
> > 
> > Shouldn't we report an error in case the source sent bitmaps, but local
> > QEMU does not support QEMU_CAPS_MIGRATION_PARAM_BLOCK_BITMAP_MAPPING?
> 
> See below.
> 
> > 
> > > +
> > > +if (qemuMigrationCookieBlockDirtyBitmapsMatchDisks(vm->def, 
> > > mig->blockDirtyBitmaps) < 0)
> > > +return -1;
> > > +
> > > +if (!(blockNamedNodeData = qemuBlockGetNamedNodeData(vm, 
> > > QEMU_ASYNC_JOB_MIGRATION_IN)))
> > > +return -1;
> > > +
> > > +for (nextdisk = mig->blockDirtyBitmaps; nextdisk; nextdisk = 
> > > nextdisk->next) {
> > > +qemuMigrationBlockDirtyBitmapsDiskPtr disk = nextdisk->data;
> > > +qemuBlockNamedNodeDataPtr nodedata;
> > > +GSList *nextbitmap;
> > > +
> > > +if (!(nodedata = virHashLookup(blockNamedNodeData, 
> > > disk->nodename))) {
> > > +virReportError(VIR_ERR_INTERNAL_ERROR,
> > > +   _("failed to find data for block node '%s'"),
> > > +   disk->nodename);
> > > +return -1;
> > > +}
> > > +
> > > +/* don't migrate bitmaps into non-qcow2v3+ images */
> > 
> > How about "Bitmaps can only be migrated to qcow2 v3+"?
> > 
> > > +if (disk->disk->src->format != VIR_STORAGE_FILE_QCOW2 ||
> > > +nodedata->qcow2v2) {
> > > +disk->skip = true;
> > 
> > Is skipping the disk the right thing to do? Should we report an error
> > and abort migration instead? Just checking, maybe we can't do so for
> > backward compatibility...
> > 
> > > +continue;
> > > +}
> > > +
> > > +for (nextbitmap = disk->bitmaps; nextbitmap; nextbitmap = 
> > > nextbitmap->next) {
> > > +qemuMigrationBlockDirtyBitmapsDiskBitmapPtr bitmap = 
> > > nextbitmap->data;
> > > +size_t k;
> > > +
> > > +/* don't migrate into existing bitmaps */
> > > +for (k = 0; k < nodedata->nbitmaps; k++) {
> > > +if (STREQ(bitmap->bitmapname, 
> > > nodedata->bitmaps[k]->name)) {
> > > +bitmap->skip = true;
> > 
> > And similar questions for bitmaps here.
> 
> That would require that we have the users explicitly enable this feature
> rather than doing it implicitly. The disk format can be changed during
> the migration.
> 
> If we want to do it explicitly only I'll need to add a migration flag
> and all the infra for it.

I see. I agree copying the bitmaps whenever possible is a good idea.

Reviewed-by: Jiri Denemark 



Re: [libvirt PATCH v4 12/25] nodedev: Refresh mdev devices when changes are detected

2021-02-18 Thread Jonathon Jongsma
On Wed, 17 Feb 2021 14:35:51 +0100
Erik Skultety  wrote:

> > +static void
> > +mdevctlEventHandleCallback(GFileMonitor *monitor G_GNUC_UNUSED,
> > +   GFile *file,
> > +   GFile *other_file G_GNUC_UNUSED,
> > +   GFileMonitorEvent event_type,
> > +   gpointer user_data)
> > +{
> > +udevEventDataPtr priv = user_data;
> > +/* if a new directory appears, monitor that directory for
> > changes */
> > +if (event_type == G_FILE_MONITOR_EVENT_CREATED &&
> > +g_file_query_file_type(file, G_FILE_QUERY_INFO_NONE, NULL)
> > ==
> > +G_FILE_TYPE_DIRECTORY) {
> > +GList *newmonitors = monitorFileRecursively(priv, file);
> > +priv->mdevctlMonitors =
> > g_list_concat(priv->mdevctlMonitors, newmonitors);  
> 
> priv->mdevctlMonitors is shared among threads, but you only acquire
> the mutex just before exec'ing mdevctl in the mdevctlHandler thread,
> so ^this can yield unexpected results in terms of directories
> monitored.

A) It doesn't really affect the directories being monitored, it only
affects our ability to free these monitors later in the destructor.

B) Aside from the very first time this variable is set in
nodeStateInitialize(), the only place that accesses the mdevctlMonitors
variable is this glib signal callback (to add new monitors to the list)
and the destructor (to free the monitors). And I believe those should
always run in the main thread. So I don't think there's really much of
a thread-safety issue here. Perhaps there's a slight race in
nodeStateInitialize() which appears to be called from a non-main
thread. If a directory gets created immediately after connecting to the
file monitor signal but before the function returns and the GList gets
assigned to priv->mdevctlMonitors, I suppose the signal handler could
run and it could get corrupted. Would you like me to protect all
accesses with a Mutex?


> 
> > +}
> > +
> > +/* When mdevctl creates a device, it can result in multiple
> > notify events
> > + * emitted for a single logical change (e.g. several CHANGED
> > events, or a
> > + * CREATED and CHANGED event followed by CHANGES_DONE_HINT).
> > To avoid
> > + * spawning a mdevctl thread multiple times for a single
> > logical
> > + * configuration change, try to coalesce these changes by
> > waiting for the
> > + * CHANGES_DONE_HINT event. As a fallback,  add a timeout to
> > trigger the
> > + * signal if that event never comes */  
> 
> So there's no guaranteed pattern to monitor, is that so? IOW Even the
> CHANGES_DONE_HINT may not actually arrive? That's a very poor design.
> I'd like to ditch the timer as much as possible.

Maybe we can simply rely on the CHANGES_DONE_HINT. I tend to be
conservative and was trying to handle the case where the
CHANGES_DONE_HINT might not be delivered (it is called a "hint" after
all). Part of this was caused by my reading this comment in a test
case within the glib repository:

  /* Sometimes the emission of 'CHANGES_DONE_HINT' may be late because
   * it depends on the ability of file monitor implementation to report
   * 'CHANGES_DONE_HINT' itself. If the file monitor implementation
   * doesn't report 'CHANGES_DONE_HINT' itself, it may be emitted by
   * GLocalFileMonitor after a few seconds, which causes the event to
   * mix with results from different steps. Since 'CHANGES_DONE_HINT'
   * is just a hint, we don't require it to be reliable and we simply
   * ignore unexpected 'CHANGES_DONE_HINT' events here. */

So it is not reliably communicated if your file monitor backend doesn't
implement it. But the above comment seems to suggest that if the file monitor
backend doesn't support it, the event will be generated by glib itself and will
simply be delayed by a couple of seconds rather than missing altogether. I'd
have to read the glib code to tell whether it's possible for it to be omitted
altogether. I guess I don't see the timer as a horrible workaround, but I can
ditch it if you want.

Alternatively I suppose I could simply kick off a new thread for each file
change event and not bother trying to coalesce them at all. This will result in
perhaps a an extra thread or two being launched when mdevctl modifies
the mdev registry behind our backs, but that's unlikely to be a common
occurrence and so shouldn't really be a significant performance issue.

> 
> > +if (event_type != G_FILE_MONITOR_EVENT_CHANGES_DONE_HINT) {
> > +if (priv->mdevctlTimeout > 0)
> > +virEventRemoveTimeout(priv->mdevctlTimeout);
> > +priv->mdevctlTimeout = virEventAddTimeout(100,
> > scheduleMdevctlHandler,
> > +  priv, NULL);
> > +return;
> > +}
> > +
> > +scheduleMdevctlHandler(-1, priv);
> > +}
> > +
> > +
> >  static int
> >  nodeStateInitialize(bool privileged,
> >  const char *root,

Jonathon