Re: [libvirt PATCH 3/3] qemu: Extra check for NBD URI being specified

2020-12-17 Thread Martin Kletzander

On Thu, Dec 17, 2020 at 11:15:11AM +0100, Peter Krempa wrote:

On Wed, Dec 16, 2020 at 12:19:28 +0100, Martin Kletzander wrote:

It must be used when migration URI uses `unix:` transport because otherwise we
cannot just guess where to connect for disk migration.

https://bugzilla.redhat.com/show_bug.cgi?id=1638889

Signed-off-by: Martin Kletzander 
---
 src/qemu/qemu_driver.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 5f0fb0a55fee..9caaa0723720 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -11503,6 +11503,16 @@ qemuDomainMigratePrepare3Params(virConnectPtr dconn,

QEMU_MIGRATION_DESTINATION)))
 return -1;

+if (flags & (VIR_MIGRATE_NON_SHARED_DISK | VIR_MIGRATE_NON_SHARED_INC) ||
+nmigrate_disks > 0) {


Is this to check that NBD is used? I think that's not enough as
VIR_MIGRATE_NON_SHARED_DISK/INC doesn't necessarily guarantee that NBD
is used. It also depends on whether TUNELLED and other possible flags
are in the mix too. We might need a helper function for that.



Thanks for the review.  It would be nicer to have it extracted into a separate
function for that, I agree.  Tunnelled migration would fortunately not work with
`--migrate-uri`, so we should be safe there (although the logic lends itself for
slightly more readable condition).  I went through both QEMU_MIGRATION_FLAGS and
QEMU_MIGRATION_PARAMETERS and I believe these are the only cases that might
cause an issue.  I'll move the condition to a separate function and I'll gladly
accept any suggestions for improvements.



Also 'nmigrate_disks' AFAIK requires that VIR_MIGRATE_NON_SHARED_DISK
is used so it's redundant here.



I only did it based on what virsh does and `virsh migrate --migrate-disks vda
...` without any `--copy-storage-*` option worked for me and it looked like
virsh does not apply additional logic on top of this particular set of options.


+if (uri_in && STRPREFIX(uri_in, "unix:") && !nbdURI) {
+virReportError(VIR_ERR_INVALID_ARG, "%s",
+   _("NBD URI must be supplied when "
+ "migration URI uses UNIX transport method"));


Please don't split error messages it's hard to grep for them if they are
split.



Are we finally enforcing this?  I'd be glad to, I guess I'm stuck in the old 
ways ;)


+return -1;
+}
+}
+
 if (nbdURI && nbdPort) {
 virReportError(VIR_ERR_INVALID_ARG, "%s",
_("Both port and URI requested for disk migration "
@@ -11743,6 +11753,7 @@ qemuDomainMigratePerform3Params(virDomainPtr dom,
 _xml) < 0)
 goto cleanup;

+


Spurious newline addition.


 if (nbdURI && nbdPort) {
 virReportError(VIR_ERR_INVALID_ARG, "%s",
_("Both port and URI requested for disk migration "
@@ -11766,6 +11777,16 @@ qemuDomainMigratePerform3Params(virDomainPtr dom,
 if (nmigrate_disks < 0)
 goto cleanup;

+if (flags & (VIR_MIGRATE_NON_SHARED_DISK | VIR_MIGRATE_NON_SHARED_INC) ||
+nmigrate_disks > 0) {
+if (uri && STRPREFIX(uri, "unix:") && !nbdURI) {
+virReportError(VIR_ERR_INVALID_ARG, "%s",
+   _("NBD URI must be supplied when "
+ "migration URI uses UNIX transport method"));
+return -1;
+}
+}
+
 if (!(migParams = qemuMigrationParamsFromFlags(params, nparams, flags,
QEMU_MIGRATION_SOURCE)))
 goto cleanup;
--
2.29.2



signature.asc
Description: PGP signature


Re: [libvirt][PATCH v1 0/3] introduce 'restrictive' mode in numatune

2020-12-17 Thread Daniel Henrique Barboza




On 12/15/20 10:12 PM, Luyao Zhong wrote:

Before this patch set, numatune only has three memory modes:
static, interleave and prefered. These memory policies are
ultimately set by mbind() system call.

Memory policy could be 'hard coded' into the kernel, but none of
above policies fit our requirment under this case. mbind() support
default memory policy, but it requires a NULL nodemask. So obviously
setting allowed memory nodes is cgroups' mission under this case.
So we introduce a new option for mode in numatune named 'restrictive'.







The config above means we only use cgroups to restrict the allowed
memory nodes and not setting any specific memory policies explicitly.

RFC discussion:
https://www.redhat.com/archives/libvir-list/2020-November/msg01256.html

Regards,
Luyao



Series LGTM


Reviewed-by: Daniel Henrique Barboza 



Luyao Zhong (3):
   docs: add docs for 'restrictive' option for mode in numatune
   schema: add 'restrictive' config option for mode in numatune
   qemu: add parser and formatter for 'restrictive' mode in numatune

  docs/formatdomain.rst |  7 +++-
  docs/schemas/domaincommon.rng |  2 +
  include/libvirt/libvirt-domain.h  |  1 +
  src/conf/numa_conf.c  |  9 +
  src/qemu/qemu_command.c   |  6 ++-
  src/qemu/qemu_process.c   | 27 +
  src/util/virnuma.c|  3 ++
  .../numatune-memnode-invalid-mode.err |  1 +
  .../numatune-memnode-invalid-mode.xml | 33 +++
  ...emnode-restrictive-mode.x86_64-latest.args | 40 +++
  .../numatune-memnode-restrictive-mode.xml | 33 +++
  tests/qemuxml2argvtest.c  |  2 +
  ...memnode-restrictive-mode.x86_64-latest.xml | 40 +++
  tests/qemuxml2xmltest.c   |  1 +
  14 files changed, 202 insertions(+), 3 deletions(-)
  create mode 100644 tests/qemuxml2argvdata/numatune-memnode-invalid-mode.err
  create mode 100644 tests/qemuxml2argvdata/numatune-memnode-invalid-mode.xml
  create mode 100644 
tests/qemuxml2argvdata/numatune-memnode-restrictive-mode.x86_64-latest.args
  create mode 100644 
tests/qemuxml2argvdata/numatune-memnode-restrictive-mode.xml
  create mode 100644 
tests/qemuxml2xmloutdata/numatune-memnode-restrictive-mode.x86_64-latest.xml





Re: [libvirt][PATCH v1 3/3] qemu: add parser and formatter for 'restrictive' mode in numatune

2020-12-17 Thread Daniel Henrique Barboza




On 12/15/20 10:13 PM, Luyao Zhong wrote:

---
  include/libvirt/libvirt-domain.h  |  1 +
  src/conf/numa_conf.c  |  9 +
  src/qemu/qemu_command.c   |  6 ++-
  src/qemu/qemu_process.c   | 27 +
  src/util/virnuma.c|  3 ++
  .../numatune-memnode-invalid-mode.err |  1 +
  .../numatune-memnode-invalid-mode.xml | 33 +++
  ...emnode-restrictive-mode.x86_64-latest.args | 40 +++
  .../numatune-memnode-restrictive-mode.xml | 33 +++
  tests/qemuxml2argvtest.c  |  2 +
  ...memnode-restrictive-mode.x86_64-latest.xml | 40 +++
  tests/qemuxml2xmltest.c   |  1 +
  12 files changed, 195 insertions(+), 1 deletion(-)
  create mode 100644 tests/qemuxml2argvdata/numatune-memnode-invalid-mode.err
  create mode 100644 tests/qemuxml2argvdata/numatune-memnode-invalid-mode.xml
  create mode 100644 
tests/qemuxml2argvdata/numatune-memnode-restrictive-mode.x86_64-latest.args
  create mode 100644 
tests/qemuxml2argvdata/numatune-memnode-restrictive-mode.xml
  create mode 100644 
tests/qemuxml2xmloutdata/numatune-memnode-restrictive-mode.x86_64-latest.xml

diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
index b929877643..309040bf97 100644
--- a/include/libvirt/libvirt-domain.h
+++ b/include/libvirt/libvirt-domain.h
@@ -1527,6 +1527,7 @@ typedef enum {
  VIR_DOMAIN_NUMATUNE_MEM_STRICT  = 0,
  VIR_DOMAIN_NUMATUNE_MEM_PREFERRED   = 1,
  VIR_DOMAIN_NUMATUNE_MEM_INTERLEAVE  = 2,
+VIR_DOMAIN_NUMATUNE_MEM_RESTRICTIVE = 3,
  
  # ifdef VIR_ENUM_SENTINELS

  VIR_DOMAIN_NUMATUNE_MEM_LAST /* This constant is subject to change */
diff --git a/src/conf/numa_conf.c b/src/conf/numa_conf.c
index f8a7a01ac9..df888a8dfb 100644
--- a/src/conf/numa_conf.c
+++ b/src/conf/numa_conf.c
@@ -43,6 +43,7 @@ VIR_ENUM_IMPL(virDomainNumatuneMemMode,
"strict",
"preferred",
"interleave",
+  "restrictive",
  );
  
  VIR_ENUM_IMPL(virDomainNumatunePlacement,

@@ -234,6 +235,14 @@ virDomainNumatuneNodeParseXML(virDomainNumaPtr numa,
 _("Invalid mode attribute in memnode 
element"));
  goto cleanup;
  }
+
+if (numa->memory.mode == VIR_DOMAIN_NUMATUNE_MEM_RESTRICTIVE &&
+mode != VIR_DOMAIN_NUMATUNE_MEM_RESTRICTIVE) {
+virReportError(VIR_ERR_XML_ERROR, "%s",
+   _("'restrictive' mode is required in memnode element 
"
+ "when mode is 'restrictive' in memory 
element"));
+goto cleanup;
+}
  VIR_FREE(tmp);
  mem_node->mode = mode;
  }
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 479bcc0b0c..c0eff6cb12 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -174,6 +174,7 @@ VIR_ENUM_IMPL(qemuNumaPolicy,
"bind",
"preferred",
"interleave",
+  "restricted",
  );
  
  
@@ -3138,7 +3139,10 @@ qemuBuildMemoryBackendProps(virJSONValuePtr *backendProps,

  return -1;
  }
  
-if (nodemask) {

+/* If mode is "restrictive", we should only use cgroups setting allowed 
memory
+ * nodes, and skip passing the host-nodes and policy parameters to QEMU 
command
+ * line which means we will use system default memory policy. */
+if (nodemask && mode != VIR_DOMAIN_NUMATUNE_MEM_RESTRICTIVE) {
  if (!virNumaNodesetIsAvailable(nodemask))
  return -1;
  if (virJSONValueObjectAdd(props,
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 579b3c3713..8c260883d9 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -2709,6 +2709,7 @@ qemuProcessSetupPid(virDomainObjPtr vm,
  g_autoptr(virBitmap) hostcpumap = NULL;
  g_autofree char *mem_mask = NULL;
  int ret = -1;
+size_t i;
  
  if ((period || quota) &&

  !virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) {
@@ -2749,6 +2750,32 @@ qemuProcessSetupPid(virDomainObjPtr vm,
  _mask, -1) < 0)
  goto cleanup;
  
+/* For vCPU threads, mem_mask is different among cells and mem_mask

+ * is used to set cgroups cpuset.mems for vcpu threads. If we specify
+ * 'restrictive' mode, that means we will set system default memory
+ * policy and only use cgroups to restrict allowed memory nodes. */
+if (nameval == VIR_CGROUP_THREAD_VCPU) {
+virDomainNumaPtr numatune = vm->def->numa;
+virBitmapPtr numanode_cpumask = NULL;
+for (i = 0; i < virDomainNumaGetNodeCount(numatune); i++) {
+numanode_cpumask = 

Re: [PATCH 2/5] bhyve: remove redundant code that adds "template" netdev name

2020-12-17 Thread Roman Bogorodskiy
  Laine Stump wrote:

> The FreeBSD version of virNetDevTapCreate() now calls
> virNetDevGenerateName(), and virNetDevGenerateName() understands that
> a blank ifname should be replaced with a generated name based on a
> device-type-specific template - so there is no longer any need for the
> higher level functions to stuff a template name ("vnet%d") into
> ifname.
> 
> Signed-off-by: Laine Stump 

For this and 1/5:

Reviewed-by: Roman Bogorodskiy 

Thanks for this cleanup.

> ---
>  src/bhyve/bhyve_command.c | 7 ---
>  1 file changed, 7 deletions(-)
> 
> diff --git a/src/bhyve/bhyve_command.c b/src/bhyve/bhyve_command.c
> index 4cf98c0eb1..daf313c9c1 100644
> --- a/src/bhyve/bhyve_command.c
> +++ b/src/bhyve/bhyve_command.c
> @@ -79,13 +79,6 @@ bhyveBuildNetArgStr(const virDomainDef *def,
>  goto cleanup;
>  }
>  
> -if (!net->ifname ||
> -STRPREFIX(net->ifname, VIR_NET_GENERATED_VNET_PREFIX) ||
> -strchr(net->ifname, '%')) {
> -VIR_FREE(net->ifname);
> -net->ifname = g_strdup(VIR_NET_GENERATED_VNET_PREFIX "%d");
> -}
> -
>  if (!dryRun) {
>  if (virNetDevTapCreateInBridgePort(brname, >ifname, >mac,
> def->uuid, NULL, NULL, 0,
> -- 
> 2.28.0
> 

Roman Bogorodskiy


signature.asc
Description: PGP signature


Re: RFC PATCH: Issue 90 (Test Clarification)

2020-12-17 Thread Peter Krempa
On Wed, Dec 16, 2020 at 19:37:48 -0600, Ryan Gahagan wrote:
> We addressed the feedback from our previous RFC patch for the most part.
> Under src/util/virstoragefile.c, we left a cast to an integer pointer
> that Peter mentioned because we were unable to provide a better

Casting uid_t * to int * will work in this instance but is not
acceptable. The problem with casting pointers is that if sizeof(uid_t)
!= sizeof(int) casting a pointer could result into accessing invalid
memory, while if you just assign the value of integers of distinct sizes
you get an overflow at worst. That must be changed in the code.

> solution. We've written some tests for our code but our testing
> environment is not working locally (meson doesn't even recognize the
> project as a meson build project) and so we can't regenerate output or
> test our tests.

Could you elaborate? As in, post exact steps and output what you've done
and what's broken.

> It's probably bad practice but the only solution we could think of that
> would allow us to check our tests was just to email you what we've got.
> Sorry if it's not up to standard, but please let us know if there's a
> better way to do it or if you spot any problems in these tests.

It's better to commit what you have and ask directly, ideally including
output you are (not) getting.

> Under tests/qemuxml2xmltest.c:
> DO_TEST_CAPS_LATEST("disk-network-nfs");
> 
> The same line would be in tests/qemuxml2argvtest.c
> 
> We created the file tests/qemuxml2argvdata/disk-network-nfs.xml:
> 
>   QEMUGuest1
>   c7a5fdbd-edaf-9455-926a-d65c16db1809
>   219136
>   219136
>   1
>   
> hvm
> 
>   
>   
>   destroy
>   restart
>   destroy
>   
> /usr/bin/qemu-system-x86_64
> 
>   
>   
> 
> 
>   
>   
>   eb90327c-8302-4725-9e1b-4e85ed4dc251
>function='0x0'/>
> 
> 
> 
> 
> 
> 
>   
> 
> 
> We have under tests/qemublocktest.c:
> TEST_JSON_FORMAT_NET(“\n”
>  “  \n”
>  “  \n”
>  "\n”);
> and
> TEST_IMAGE_CREATE(“network-nfs-qcow2”, NULL);
> 
> And finally under tests/virstoragetest.c:
> TEST_BACKING_PARSE(“json:{\”file\”:{\”driver\”:\”nfs\”,”
>“\”user\”:\”USER\”,”
>“\”group\”:\”GROUP\”,”
>“\”server\”: {  \”host\”:\”example.com\”,”
>   “\”port\”:\”2049\””
>”}”
>   “}”
> “}”,
>“\n”
>“  \n”
>“  \n”
>“\n”);
> 
> Again, sorry if this looks awful. Please let us know if there's a more
> practical way to do this because submitting a commit with these tests
> would guarantee that the tests fail and the commit wouldn't be mergeable
> due to our environment issues, or if you see anything wrong with these
> tests.

Note that patches without tests are not acceptable either. Additionally
I'll not be copying your changes to appropriate places. Please add the
test code to appropriate places.

Also note that the upstream test-suite run in the CI does actually
provide the expected output. Obviously you can't use the ENV variable to
automatically overwrite your files, but you certainly can copy out the
diffs from the CI. Just commit empty files in place for the output files
and the CI will complain that they differ including the full difference.

If you don't post what's wrong I will not be guessing, so if you want
any input from us, be sure to provide as much information as you have
including exact steps.



Re: [PATCH libvirt v1] tests: add capabilities for QEMU 5.2.0 on s390x

2020-12-17 Thread Andrea Bolognani
On Wed, 2020-12-16 at 10:10 +0100, Shalini Chellathurai Saroja wrote:
>  tests/domaincapsdata/qemu_5.2.0.s390x.xml |   231 +
>  .../caps_5.2.0.s390x.replies  | 25458 
>  .../qemucapabilitiesdata/caps_5.2.0.s390x.xml |  3300 ++
>  ...default-video-type-s390x.s390x-latest.args | 9 +-
>  .../disk-error-policy-s390x.s390x-latest.args |16 +-
>  .../fs9p-ccw.s390x-latest.args| 8 +-
>  ...tdev-subsys-mdev-vfio-ap.s390x-latest.args | 4 +-
>  ...ubsys-mdev-vfio-ccw-boot.s390x-latest.args | 4 +-
>  ...othreads-virtio-scsi-ccw.s390x-latest.args | 6 +-
>  ...t-cpu-kvm-ccw-virtio-2.7.s390x-latest.args | 4 +-
>  ...t-cpu-kvm-ccw-virtio-4.2.s390x-latest.args | 9 +-
>  ...t-cpu-tcg-ccw-virtio-2.7.s390x-latest.args | 4 +-
>  ...t-cpu-tcg-ccw-virtio-4.2.s390x-latest.args | 4 +-
>  .../s390x-ccw-graphics.s390x-latest.args  | 8 +-
>  .../s390x-ccw-headless.s390x-latest.args  | 8 +-
>  .../vhost-vsock-ccw-auto.s390x-latest.args| 8 +-
>  .../vhost-vsock-ccw.s390x-latest.args | 8 +-
>  17 files changed, 29054 insertions(+), 35 deletions(-)
>  create mode 100644 tests/domaincapsdata/qemu_5.2.0.s390x.xml
>  create mode 100644 tests/qemucapabilitiesdata/caps_5.2.0.s390x.replies
>  create mode 100644 tests/qemucapabilitiesdata/caps_5.2.0.s390x.xml

The diff looks sane enough, so

  Reviewed-by: Andrea Bolognani 

and pushed. Thanks for helping!

However...

> +++ b/tests/qemucapabilitiesdata/caps_5.2.0.s390x.xml
> @@ -0,0 +1,3300 @@
> +
> +  /usr/bin/qemu-system-s390x
> +  5002000
> +  0
> +  39100243
> +  qemu-5.2.0-20201215.0.ba93e22c.fc32

... the version string seems to indicate you're grabbing the replies
from a packaged version rather than a build made from pristine
upstream sources: this is consistent with what was done for earlier
QEMU capabilities on s390x, but not with how we usually do things for
other architectures - see the other caps_5.2.0.*.replies files.

I don't think this is a blocker, because a Fedora-based package will
be quite close to upstream anyway, but it would be great if you could
generate the replies file again against a QEMU binary that's been
built exclusively from upstream sources. You can then submit the
update as a follow-up patch - I expect such patch to be fairly small.

Thanks again for your help getting updated capabilities in libvirt :)

-- 
Andrea Bolognani / Red Hat / Virtualization



Re: [RFC PATCH 1/6] conf: Add NFS disk protocol

2020-12-17 Thread Peter Krempa
On Wed, Dec 16, 2020 at 19:37:49 -0600, Ryan Gahagan wrote:

Please add a commit message.

> Signed-off-by: Ryan Gahagan 
> ---
>  src/libxl/libxl_conf.c|  1 +
>  src/libxl/xen_xl.c|  1 +
>  src/qemu/qemu_block.c |  3 +++
>  src/qemu/qemu_command.c   |  1 +
>  src/qemu/qemu_domain.c| 10 ++
>  src/qemu/qemu_snapshot.c  |  3 +++
>  src/util/virstoragefile.c |  6 ++
>  src/util/virstoragefile.h |  1 +
>  8 files changed, 26 insertions(+)

[...]

> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index bfb6e23942..692bc925c6 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -4626,6 +4626,14 @@ qemuDomainValidateStorageSource(virStorageSourcePtr 
> src,
>  return -1;
>  }
>  
> +/* NFS protocol may only be used if current QEMU supports blockdev */
> +if (actualType == VIR_STORAGE_TYPE_NETWORK &&
> +src->protocol == VIR_STORAGE_NET_PROTOCL_NFS &&
> +!blockdev) {
> +virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> +   _("'nfs' protocol is not supported with this QEMU 
> binary"));
> +}

This doesn't belong to this patch. This should be part of the patch
implementing the qemu support for it.

In this patch you are adding a new type and we try to minimize and
separate changes.


> +
>  return 0;
>  }

[..]

> @@ -4627,6 +4629,10 @@ 
> virStorageSourceNetworkDefaultPort(virStorageNetProtocol protocol)
>  case VIR_STORAGE_NET_PROTOCOL_VXHS:
>  return ;
>  
> +case VIR_STORAGE_NET_PROTOCOL_NFS:
> +/* return based on exmaple and SUSE support docs */

It's better to add a link rather than a statement that is hard to
verify.

> +return 2049;
> +
>  case VIR_STORAGE_NET_PROTOCOL_LAST:
>  case VIR_STORAGE_NET_PROTOCOL_NONE:
>  return 0;

We'll discuss the lack of tests separately.

The rest of this patch looks good.



Re: [libvirt PATCH 3/3] qemu: Extra check for NBD URI being specified

2020-12-17 Thread Peter Krempa
On Wed, Dec 16, 2020 at 12:19:28 +0100, Martin Kletzander wrote:
> It must be used when migration URI uses `unix:` transport because otherwise we
> cannot just guess where to connect for disk migration.
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1638889
> 
> Signed-off-by: Martin Kletzander 
> ---
>  src/qemu/qemu_driver.c | 21 +
>  1 file changed, 21 insertions(+)
> 
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index 5f0fb0a55fee..9caaa0723720 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -11503,6 +11503,16 @@ qemuDomainMigratePrepare3Params(virConnectPtr dconn,
> 
> QEMU_MIGRATION_DESTINATION)))
>  return -1;
>  
> +if (flags & (VIR_MIGRATE_NON_SHARED_DISK | VIR_MIGRATE_NON_SHARED_INC) ||
> +nmigrate_disks > 0) {

Is this to check that NBD is used? I think that's not enough as
VIR_MIGRATE_NON_SHARED_DISK/INC doesn't necessarily guarantee that NBD
is used. It also depends on whether TUNELLED and other possible flags
are in the mix too. We might need a helper function for that.

Also 'nmigrate_disks' AFAIK requires that VIR_MIGRATE_NON_SHARED_DISK
is used so it's redundant here.

> +if (uri_in && STRPREFIX(uri_in, "unix:") && !nbdURI) {
> +virReportError(VIR_ERR_INVALID_ARG, "%s",
> +   _("NBD URI must be supplied when "
> + "migration URI uses UNIX transport method"));

Please don't split error messages it's hard to grep for them if they are
split.

> +return -1;
> +}
> +}
> +
>  if (nbdURI && nbdPort) {
>  virReportError(VIR_ERR_INVALID_ARG, "%s",
> _("Both port and URI requested for disk migration "
> @@ -11743,6 +11753,7 @@ qemuDomainMigratePerform3Params(virDomainPtr dom,
>  _xml) < 0)
>  goto cleanup;
>  
> +

Spurious newline addition.

>  if (nbdURI && nbdPort) {
>  virReportError(VIR_ERR_INVALID_ARG, "%s",
> _("Both port and URI requested for disk migration "
> @@ -11766,6 +11777,16 @@ qemuDomainMigratePerform3Params(virDomainPtr dom,
>  if (nmigrate_disks < 0)
>  goto cleanup;
>  
> +if (flags & (VIR_MIGRATE_NON_SHARED_DISK | VIR_MIGRATE_NON_SHARED_INC) ||
> +nmigrate_disks > 0) {
> +if (uri && STRPREFIX(uri, "unix:") && !nbdURI) {
> +virReportError(VIR_ERR_INVALID_ARG, "%s",
> +   _("NBD URI must be supplied when "
> + "migration URI uses UNIX transport method"));
> +return -1;
> +}
> +}
> +
>  if (!(migParams = qemuMigrationParamsFromFlags(params, nparams, flags,
> QEMU_MIGRATION_SOURCE)))
>  goto cleanup;
> -- 
> 2.29.2
> 



RE: [PATCH] util: Add phys_port_name support on virPCIGetNetName

2020-12-17 Thread Adrian Chiris
>-Original Message-
>From: Laine Stump 
>Sent: Thursday, December 17, 2020 5:22 AM
>To: libvir-list@redhat.com
>Cc: Dmytro Linkin ; Moshe Levi ;
>Adrian Chiris 
>Subject: Re: [PATCH] util: Add phys_port_name support on
>virPCIGetNetName
>
>External email: Use caution opening links or attachments
>
>
>On 12/10/20 11:51 AM, Adrian Chiris wrote:
>
>> Hi,
>> Would be great to get a pair of eyes on this Patch, Thanks!
>
>
>I've looked at it several times and every time would just end up shaking my
>head wondering why there isn't one definitive symlink in the VF's sysfs for the
>netdev of the physical port. (Part of my lack of response from the last time is
>that I didn't know how to respond since I didn't understand why such
>roundabout logic should be needed to answer such a basic question, decided I
>should look at it again before responding, and then forgot about it :-()
>
>
>Anyway, this time I'm determined to not get up until I've got it figured out 
>(or
>at least understand exactly what I don't have figured out)...
>
>
>
>>> -Original Message-
>>> From: Dmytro Linkin 
>>> Sent: Tuesday, October 27, 2020 10:58 AM
>>> To: libvir-list@redhat.com
>>> Cc: Laine Stump ; Adrian Chiris
>>> ; Moshe Levi 
>>> Subject: Re: [PATCH] util: Add phys_port_name support on
>>> virPCIGetNetName
>>>
>>> On Mon, Sep 28, 2020 at 12:56:12PM +0300, Dmytro Linkin wrote:
 On Tue, Sep 22, 2020 at 08:31:15AM -0400, Laine Stump wrote:
> On 8/28/20 6:53 AM, Dmytro Linkin wrote:
>> Current virPCIGetNetName() logic is to get net device name by
>> checking it's phys_port_id, if caller provide it, or by it's index
>> (eg, by it's position at sysfs net directory). This approach
>> worked fine up until linux kernel version 5.8, where NVIDIA
>> Mellanox driver implemented linking of VFs' representors to PCI
>> device in switchdev mode. This mean that device's sysfs net
>> directory will hold
>>> multiple net devices. Ex.:
>> $ ls '/sys/bus/pci/devices/:82:00.0/net'
>> ens1f0  eth0  eth1
>>
>> Most switch devices support phys_port_name instead of
>> phys_port_id, so
>> virPCIGetNetName() will try to get PF name by it's index - 0. The
>> problem here is that the PF nedev entry may not be the first.
>>
>> To fix that, for switch devices, we introduce a new logic to
>> select the PF uplink netdev according to the content of
>phys_port_name.
>> Extend
>> virPCIGetNetName() with physPortNameRegex variable to get proper
>> device by it's phys_port_name scheme, for ex., "p[0-9]+$" to get
>> PF, "pf[0-9]+vf[0-9]+$" to get VF or "p1$" to get exact net device.
>> So now
>> virPCIGetNetName() logic work in following sequence:
>>   - filter by phys_port_id, if it's provided,
>>   or
>>   - filter by phys_port_name, if it's regex provided,
>>   or
>>   - get net device by it's index (position) in sysfs net directory.
>> Also, make getting content of iface sysfs files more generic.
>>
>> Signed-off-by: Dmytro Linkin 
>> Reviewed-by: Adrian Chiris 
>
> [...]
>
>
>> +/* Represents format of PF's phys_port_name in switchdev mode:
>> + * 'p%u' or 'p%us%u'. New line checked since value is readed from
>> +sysfs
>>> file.
>> + */
>> +# define VIR_PF_PHYS_PORT_NAME_REGEX ((char
>> +*)"(p[0-9]+$)|(p[0-9]+s[0-9]+$)")
>> +
>
> I've come back to look at this patch several times since it was
> posted (sorry for the extreme delay in responding), but just can't
> figure out what it's doing with this regex and why the regex is
> necessary. Not having access to the hardware that it works with is
> a bit of a problem, but perhaps I could get a better idea if you
> gave a full example of sysfs contents? My concern with using a
> regex is that it might work just fine when using one method for net
> device naming, but break if that was changed. Also, it seems
> counterintuitive that it would be necessary to look for a device
> with a name matching a specific pattern; why isn't there simply a
> single symbolic link somewhere in the sysfs tree for the net device
> that just directly points at its physical port? That would be so
> much simpler and more reliable (or at least it would give the
> *perception* of being more reliable).
>
 I'll emphasize that regex is being used for phys_port_name, NOT net
 device name (they are not the same). Anyway, I'll give an example.

 Lets look how virPCIGetNetName() currently work.
 At first it try to read phys_port_id of every net device in net
 folder, like:
 $ cd '/sys/bus/pci/devices/:80:02.0/:82:00.0/'
 $ cat net/ens1f0/phys_port_id

 Imagine we have single pci dual port nic (I hope named it right), eg.
 net folder holds net devices for both ports, so read operation will
 be successfull and function will return name of 

Re: To start multiple KVM guests from one qcow2 image with transient disk option

2020-12-17 Thread Peter Krempa
On Mon, Dec 14, 2020 at 22:49:03 -0500, Masayoshi Mizuma wrote:
> On Sat, Dec 12, 2020 at 11:57:15AM +0100, Peter Krempa wrote:
> > On Fri, Dec 11, 2020 at 20:58:48 -0500, Masayoshi Mizuma wrote:
> > > Hello,
> > > 
> > > I would like to start multiple KVM guests from one qcow2 image, and
> > > discard the changes which the KVM guests done.
> > > 
> > > transient disk option is useful for discarding the changes, however,
> > > we cannot start multiple KVM guest from one qcow2 image because the
> > > image is write-locked by the first guest to be started.
> > > 
> > > I suppose the disk which transient option is enabled don't need to
> > > get the write lock because any changes go to the overlay image, and
> > > the overlay image is removed when the guest shutdown.
> > > 
> > > qemu has 'locking' option and the write lock is disabled when locking=off.
> > > To implement that, I have two ideas. I would appreciate it if you could
> > > give me the ideas which way is better (or another way).
> > 
> > There are two aspects of this.
> > 
> > 1) Controlling the locking property of qemu may be worth in many cases,
> > so by itself this would be a worthwhile patchset to add control of
> > qemu locking for a per-backing store basis.
> > 
> > 2) Disabling locking to achieve this is wrong though. The qemu image
> > locking infrastructure is justified and prevents many problems which
> > users might get into by wrong configuration.
> > 
> > For  disks, we should rather instantiate the top level
> > format node as 'readonly' and then put a read-write overlay on top of
> > it. This still prevents from using the file which became a backing file
> > in read-write mode in another VM.
> 
> Thank you for the idea!
> I just tried to change the original image to read-only (changed the 
> "read-only":false
> to "read-only":true) simply, then created a read-write overlay.
> qemu stopped with following assertion:
> 
>   qemu-system-x86_64: ../block/io.c:1874: bdrv_co_write_req_prepare: 
> Assertion `child->perm & BLK_PERM_WRITE' failed.
> 
> It looks like the qemu hit the assertion because the permission of the 
> overlay image
> is same as the original image.
> Probably I'm missing something... I'll dive into that...

Could you please post the command line and the QMP commands? Maybe
something isn't configured right in libvirt. Alternatively qemu might
need modification.




Re: [PATCH 4/4] lxd_domain: Require that VIR_LXC_DOMAIN_NAMESPACE_SOURCE_NONE is zero

2020-12-17 Thread Michal Privoznik

On 12/17/20 3:53 AM, Laine Stump wrote:

On 12/16/20 4:13 PM, Michal Privoznik wrote:

Our parser code relies on the fact that
VIR_LXC_DOMAIN_NAMESPACE_SOURCE_NONE has value of zero and thus
uses g_new0().  But strictly speaking, this is not mandated by
the enum typedef. Fix that.



Is there really any C compiler that doesn't make the first value a 0 by 
default? (If so, I wonder why?)




Honestly, I don't know.



Reviewed-by: Laine Stump 




Thanks.

Michal



Re: [PATCH 2/2] virnetdevopenvswitch: Try to unescape ovs-vsctl reply in one specific case

2020-12-17 Thread Michal Privoznik

On 12/16/20 8:17 PM, Laine Stump wrote:

On 12/16/20 1:45 PM, Michal Privoznik wrote:

During testing of my patch v6.10.0-rc1~221 it was found that

   'ovs-vsctl get Interface $name name' or
   'ovs-vsctl find Interface options:vhost-server-path=$path'

may return a string in double quotes, e.g. "vhost-user1". Later
investigation of openvswitch code showed, that early versions
(like 1.3.0) have somewhat restrictive set of safe characters
(isalpha() || '_' || '-' || '.'), which is then refined with
increasing version. For instance, version 2.11.4 has: isalnum()
|| '_' || '-' || '.'. If the string that ovs-vsctl wants to
output contains any other character it is escaped. You want to be
looking at ovsdb_atom_to_string() which handles outputting of a
single string and calls string_needs_quotes() and possibly
json_serialize_string() in openvswitch code base.

Since the interfaces are usually named "vhost-userN" we are
facing a problem where with one version we get the name in double
quotes and with another we get plain name without funny business.

Because of json involved I thought, let's make ovs-vsctl output
into JSON format and then use our JSON parser, but guess what -
ovs-vsctl ignores --format=json.



Yuck. Is this a known/reported problem? Is it true with current 
ovs-vsctl, or just with old versions? (I guess in the end it doesn't 
really matter, because we have to support old and new anyway :-/)


Honestly, I did not bother exactly because of the reason you're 
mentioning: we have to support both.






  But with a little help of
g_strdup_printf() it can be turned into JSON.

Fixes: e4c29e2904197472919d050c67acfd59f0144bbc
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1767013
Signed-off-by: Michal Privoznik 
---
  src/libvirt_private.syms |  1 +
  src/util/virnetdevopenvswitch.c  | 47 +
  src/util/virnetdevopenvswitch.h  |  4 +++
  tests/virnetdevopenvswitchtest.c | 52 
  4 files changed, 104 insertions(+)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index c7c37d9689..583fc5800e 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -2676,6 +2676,7 @@ virNetDevOpenvswitchGetVhostuserIfname;
  virNetDevOpenvswitchInterfaceGetMaster;
  virNetDevOpenvswitchInterfaceParseStats;
  virNetDevOpenvswitchInterfaceStats;
+virNetDevOpenvswitchMaybeUnescapeReply;
  virNetDevOpenvswitchRemovePort;
  virNetDevOpenvswitchSetMigrateData;
  virNetDevOpenvswitchSetTimeout;
diff --git a/src/util/virnetdevopenvswitch.c 
b/src/util/virnetdevopenvswitch.c

index 7eabaa763d..14fa294ae1 100644
--- a/src/util/virnetdevopenvswitch.c
+++ b/src/util/virnetdevopenvswitch.c
@@ -460,6 +460,48 @@ virNetDevOpenvswitchInterfaceGetMaster(const char 
*ifname, char **master)

  }
+/**
+ * virNetDevOpenvswitchMaybeUnescapeReply:
+ * @reply: a string to unescape
+ *
+ * Depending on ovs-vsctl version a string might be escaped. For 
instance:
+ *  -version 2.11.4 allows only is_alpha(), an underscore, a dash or 
a dot,
+ *  -version 2.14.0 allows only is_alnum(), an underscore, a dash or 
a dot,

+ * any other character causes the string to be escaped.
+ *
+ * What this function does, is it checks whether @reply string 
consists solely
+ * from safe, not escaped characters (as defined by version 2.14.0) 
and if not
+ * an error is reported. If @reply is a string enclosed in double 
quotes, but

+ * otherwise safe those double quotes are removed.
+ *
+ * Returns: 0 on success,
+ * -1 otherwise (with error reported).
+ */
+int
+virNetDevOpenvswitchMaybeUnescapeReply(char *reply)
+{
+    g_autoptr(virJSONValue) json = NULL;
+    g_autofree char *jsonStr = NULL;
+    const char *tmp = NULL;
+    size_t replyLen = strlen(reply);
+
+    if (*reply != '"')
+    return 0;
+
+    jsonStr = g_strdup_printf("{\"name\": %s}", reply);
+    if (!(json = virJSONValueFromString(jsonStr)))
+    return -1;
+
+    if (!(tmp = virJSONValueObjectGetString(json, "name"))) {
+    virReportError(VIR_ERR_INTERNAL_ERROR,
+   _("Malformed ovs-vsctl output"));
+    return -1;
+    }



Tricky!

There are other ways to unescape a string, but this one gets a vote for 
novelty :-)


Thank you sir!




Reviewed-by: Laine Stump 



Michal



Re: [PATCH 6/5] lxc: skip the netdev autogenerated name counter past existing devices

2020-12-17 Thread Michal Privoznik

On 12/17/20 1:28 AM, Laine Stump wrote:

On 12/16/20 4:27 PM, Michal Privoznik wrote:

On 12/16/20 9:13 PM, Laine Stump wrote:

the lxc driver uses virNetDevGenerateName() for its veth device names
since patch 2dd0fb492, so it should be using virNetDevReserveName()
during daemon restart/reconnect to skip over the device names that are
in use.

Signed-off-by: Laine Stump 
---

I meant to mention this during review of the abovementioned patch, 
but forgot.


(NB: a couple days ago I *removed* similar code from this same spot,
but it was trying to reserve the name of macvlan devices; a macvlan
device is moved into the container's namespace at startup, so it is
not visible to the host anyway. This new case is for the 1/2 of a veth
pair that does remain in the host's namespace
(type='bridge|network|ethernet' use a veth pair)


  src/lxc/lxc_process.c | 24 
  1 file changed, 24 insertions(+)

diff --git a/src/lxc/lxc_process.c b/src/lxc/lxc_process.c
index 0f7c929535..a842ac91c5 100644
--- a/src/lxc/lxc_process.c
+++ b/src/lxc/lxc_process.c
@@ -1640,6 +1640,30 @@ 
virLXCProcessReconnectNotifyNets(virDomainDefPtr def)

  for (i = 0; i < def->nnets; i++) {
  virDomainNetDefPtr net = def->nets[i];
+    /* type='bridge|network|ethernet' interfaces may be using an
+ * autogenerated netdev name, so we should update the counter
+ * for autogenerated names to skip past this one.
+ */
+    switch (virDomainNetGetActualType(net)) {
+    case VIR_DOMAIN_NET_TYPE_BRIDGE:
+    case VIR_DOMAIN_NET_TYPE_NETWORK:
+    case VIR_DOMAIN_NET_TYPE_ETHERNET:
+    virNetDevReserveName(net->ifname);
+    break;
+    case VIR_DOMAIN_NET_TYPE_DIRECT:
+    case VIR_DOMAIN_NET_TYPE_USER:
+    case VIR_DOMAIN_NET_TYPE_VHOSTUSER:
+    case VIR_DOMAIN_NET_TYPE_SERVER:
+    case VIR_DOMAIN_NET_TYPE_CLIENT:
+    case VIR_DOMAIN_NET_TYPE_MCAST:
+    case VIR_DOMAIN_NET_TYPE_INTERNAL:
+    case VIR_DOMAIN_NET_TYPE_HOSTDEV:
+    case VIR_DOMAIN_NET_TYPE_UDP:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
+    case VIR_DOMAIN_NET_TYPE_LAST:
+    break;
+    }
+


I remember Peter being picky about switch()-es and (almost) always he 
wanted me to add the default case with virReportEnumRangeError() 
despite the variable passed to switch() being verified earlier. IIUC 
his reasoning was that if we had a memory being overwritten somewhere 
it's better to error out (I say it's better to crash), but since I 
don't care that much, this could have:


 default:
 virReportEnumRangeError(virDomainNetType, 
virDomainNetGetActualType(net));

 return;


But if we add a default case to the switch, we won't get all the nice 
compile-time errors when we add a new value to the enum and forget to 
add it to every switch, will we?


We will. I worried about the same, but as it turned out we will get 
errors if a new item is added into the enum.


Michal