Re: [PATCH] crypto: Remove unused DER string functions

2024-09-18 Thread Daniel P. Berrangé
On Wed, Sep 18, 2024 at 03:55:28PM +0100, d...@treblig.org wrote:
> From: "Dr. David Alan Gilbert" 
> 
> qcrypto_der_encode_octet_str_begin and _end have been unused
> since they were added in
>   3b34ccad66 ("crypto: Support DER encodings")
> 
> Remove them.
> 
> Signed-off-by: Dr. David Alan Gilbert 
> ---
>  crypto/der.c | 13 -
>  crypto/der.h | 22 --
>  2 files changed, 35 deletions(-)

Reviewed-by: Daniel P. Berrangé 

and queued

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH] hw/sysbus: Remove unused sysbus_mmio_unmap

2024-09-18 Thread Daniel P. Berrangé
On Wed, Sep 18, 2024 at 01:27:53PM +0100, d...@treblig.org wrote:
> From: "Dr. David Alan Gilbert" 
> 
> The last use of sysbus_mmio_unmap was removed by
>   981b1c6266 ("spapr/xive: rework the mapping the KVM memory regions")
> 
> Remove it.
> 
> Signed-off-by: Dr. David Alan Gilbert 
> ---
>  hw/core/sysbus.c| 10 --
>  include/hw/sysbus.h |  1 -
>  2 files changed, 11 deletions(-)

Reviewed-by: Daniel P. Berrangé 

> 
> diff --git a/hw/core/sysbus.c b/hw/core/sysbus.c
> index ad34fb7344..e64d99c8ed 100644
> --- a/hw/core/sysbus.c
> +++ b/hw/core/sysbus.c
> @@ -154,16 +154,6 @@ static void sysbus_mmio_map_common(SysBusDevice *dev, 
> int n, hwaddr addr,
>  }
>  }
>  
> -void sysbus_mmio_unmap(SysBusDevice *dev, int n)
> -{
> -assert(n >= 0 && n < dev->num_mmio);
> -
> -if (dev->mmio[n].addr != (hwaddr)-1) {
> -memory_region_del_subregion(get_system_memory(), 
> dev->mmio[n].memory);
> -dev->mmio[n].addr = (hwaddr)-1;
> -}
> -}
> -
>  void sysbus_mmio_map(SysBusDevice *dev, int n, hwaddr addr)
>  {
>  sysbus_mmio_map_common(dev, n, addr, false, 0);
> diff --git a/include/hw/sysbus.h b/include/hw/sysbus.h
> index 3cb29a480e..c9b1e0e90e 100644
> --- a/include/hw/sysbus.h
> +++ b/include/hw/sysbus.h
> @@ -82,7 +82,6 @@ qemu_irq sysbus_get_connected_irq(SysBusDevice *dev, int n);
>  void sysbus_mmio_map(SysBusDevice *dev, int n, hwaddr addr);
>  void sysbus_mmio_map_overlap(SysBusDevice *dev, int n, hwaddr addr,
>   int priority);
> -void sysbus_mmio_unmap(SysBusDevice *dev, int n);
>  
>  bool sysbus_realize(SysBusDevice *dev, Error **errp);
>  bool sysbus_realize_and_unref(SysBusDevice *dev, Error **errp);
> -- 
> 2.46.0
> 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: flakiness on CI jobs run via k8s

2024-09-18 Thread Daniel P. Berrangé
On Tue, Sep 17, 2024 at 04:48:45PM +0100, Peter Maydell wrote:
> I notice that a lot of the CI job flakiness I'm seeing with main
> CI runs involves jobs that are run via the k8s runners. Notably
> cross-i686-tci and cross-i686-system and cross-i686-user are like this.
> These jobs run with no flakiness that I've noticed when they're run
> by an individual gitlab user (in which case they're not running on
> k8s, I believe). So something seems to be up with the environment
> we're using to run the jobs for the main CI. My impression is that
> the time things take to run can be very variable, especially if the
> CI job believes the reported number of CPUs and actually tries to run
> 8 or 9 test cases in parallel.
> 
> Any ideas what might be causing issues here, or config tweaks
> we might be able to make to ensure that the environment reports
> to the CI job a number of CPUs/etc that accurately reflects
> the amount of resource it really has?

Didn't we change the hosting for our k8s runners recently ? They were
running on Azure, but I vaguely recall hearing that it was being
switched again.

Anyway, perhaps the cloud provider is over-committing the env such
that we have excessive steal time and thus are not getting the full
power of the CPUs we expect.  I know gitlab's own public runners
will suffer from this periodically, due to the very cheap VMs they
host on.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH 3/3] .gitlab-ci.d/cirrus: Add manual testing of macOS 15 (Sequoia)

2024-09-17 Thread Daniel P. Berrangé
On Tue, Sep 17, 2024 at 10:50:58AM +0200, Philippe Mathieu-Daudé wrote:
> Upgrade libvirt-ci so it covers macOS 15. Add a manual entry
> (QEMU_JOB_OPTIONAL: 1) to test on the Sequoia release. Refresh the
> lcitool-generated files.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> Pending libvirt-ci MR 501: 
> https://gitlab.com/libvirt/libvirt-ci/-/merge_requests/501
> 
> CI job: https://gitlab.com/philmd/qemu/-/jobs/7841560032
> ---
>  .gitlab-ci.d/cirrus.yml   | 17 +
>  .gitlab-ci.d/cirrus/macos-15.vars | 16 
>  tests/lcitool/libvirt-ci  |  2 +-
>  tests/lcitool/refresh |  1 +
>  4 files changed, 35 insertions(+), 1 deletion(-)
>  create mode 100644 .gitlab-ci.d/cirrus/macos-15.vars
> 
> diff --git a/.gitlab-ci.d/cirrus.yml b/.gitlab-ci.d/cirrus.yml
> index f061687f1b..b84b42cce5 100644
> --- a/.gitlab-ci.d/cirrus.yml
> +++ b/.gitlab-ci.d/cirrus.yml
> @@ -66,6 +66,22 @@ aarch64-macos-14-base-build:
>  NAME: macos-14
>  CIRRUS_VM_INSTANCE_TYPE: macos_instance
>  CIRRUS_VM_IMAGE_SELECTOR: image
> +CIRRUS_VM_IMAGE_NAME: ghcr.io/cirruslabs/macos-ventura-base:latest

Something isn't right here - the existing 14 release is "sonoma";
"ventura" was 13 IIUC, which you just removed.

> +CIRRUS_VM_CPUS: 12
> +CIRRUS_VM_RAM: 24G
> +UPDATE_COMMAND: brew update
> +INSTALL_COMMAND: brew install
> +PATH_EXTRA: /opt/homebrew/ccache/libexec:/opt/homebrew/gettext/bin
> +PKG_CONFIG_PATH: 
> /opt/homebrew/curl/lib/pkgconfig:/opt/homebrew/ncurses/lib/pkgconfig:/opt/homebrew/readline/lib/pkgconfig
> +CONFIGURE_ARGS: 
> --target-list-exclude=arm-softmmu,i386-softmmu,microblazeel-softmmu,mips64-softmmu,mipsel-softmmu,mips-softmmu,ppc-softmmu,sh4-softmmu,xtensaeb-softmmu
> +TEST_TARGETS: check-unit check-block check-qapi-schema check-softfloat 
> check-qtest-x86_64
> +
> +aarch64-macos-15-base-build:
> +  extends: .cirrus_build_job
> +  variables:
> +NAME: macos-15
> +CIRRUS_VM_INSTANCE_TYPE: macos_instance
> +CIRRUS_VM_IMAGE_SELECTOR: image
>  CIRRUS_VM_IMAGE_NAME: ghcr.io/cirruslabs/macos-sonoma-base:latest

And this should be sequoia


I think the info we've committed to libvirt-ci is probably wrong.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH] tests/unit: Really build pbkdf test on macOS

2024-09-17 Thread Daniel P. Berrangé
On Tue, Sep 17, 2024 at 08:57:36AM +0200, Philippe Mathieu-Daudé wrote:
> Fix a typo to run the pbkdf crypto cipher tests on macOS.
> 
>  $ make check-unit
>...
>87/102 qemu:unit / test-crypto-pbkdf  OK  2.35s   17 subtests 
> passed
> 
> Fixes: ebe0302ac8 ("tests/unit: build pbkdf test on macOS")
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  tests/unit/test-crypto-pbkdf.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tests/unit/test-crypto-pbkdf.c b/tests/unit/test-crypto-pbkdf.c
> index b477cf4e4b..12ee808fbc 100644
> --- a/tests/unit/test-crypto-pbkdf.c
> +++ b/tests/unit/test-crypto-pbkdf.c
> @@ -25,7 +25,7 @@
>  #include 
>  #endif
>  
> -#if defined(_WIN32) || defined(RUSAGE_THREAD) || defined(CONFIG_DARWNI)
> +#if defined(_WIN32) || defined(RUSAGE_THREAD) || defined(CONFIG_DARWIN)

/face-palm

>  #include "crypto/pbkdf.h"
>  
>  typedef struct QCryptoPbkdfTestData QCryptoPbkdfTestData;

Reviewed-by: Daniel P. Berrangé 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v2] .gitlab-ci.d/crossbuilds.yml: Force 'make check' single-threaded for cross-i686-tci

2024-09-13 Thread Daniel P. Berrangé
On Fri, Sep 13, 2024 at 02:31:34PM +0100, Peter Maydell wrote:
> On Fri, 13 Sept 2024 at 13:24, Peter Maydell  wrote:
> >
> > On Thu, 12 Sept 2024 at 16:10, Peter Maydell  
> > wrote:
> > >
> > > The cross-i686-tci CI job is persistently flaky with various tests
> > > hitting timeouts.  One theory for why this is happening is that we're
> > > running too many tests in parallel and so sometimes a test gets
> > > starved of CPU and isn't able to complete within the timeout.
> > >
> > > (The environment this CI job runs in seems to cause us to default
> > > to a parallelism of 9 in the main CI.)
> > >
> > > Signed-off-by: Peter Maydell 
> > > ---
> > > If this works we might be able to wind this up to -j2 or -j3,
> > > and/or consider whether other CI jobs need something similar.
> >
> > I gave this a try, but unfortunately the result seems to be
> > that the whole job times out:
> > https://gitlab.com/qemu-project/qemu/-/jobs/7818441897
> 
> ...but then this simple retry passed with a runtime of 47 mins:
> 
> https://gitlab.com/qemu-project/qemu/-/jobs/7819225200
> 
> I'm tempted to commit this as-is, and see whether it helps.
> If it doesn't I can always back it off to -j2, and if it does
> generate a lot of full-job-timeouts it's only me it's annoying.

Anyone know how many vCPUs our k8s runners have ?

The gitlab runners that contributor forks use will have 2
vCPUs. So our current make -j$(nproc+1)  will be effectively
-j3 already in pipelines for forks. IOW, we intentionally
slightly over-commit CPUs right now. Backing off to just
-j$(nproc)  may be better than hardcoding -j1/-j2, so that
it takes account of different runner sizes ?


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [RFC PATCH 0/2] qtest: Log verbosity changes

2024-09-13 Thread Daniel P. Berrangé
On Fri, Sep 06, 2024 at 10:52:53AM +0100, Peter Maydell wrote:
> On Fri, 6 Sept 2024 at 09:14, Daniel P. Berrangé  wrote:
> >
> > On Fri, Sep 06, 2024 at 08:16:31AM +0200, Thomas Huth wrote:
> > > On 05/09/2024 23.03, Fabiano Rosas wrote:
> > > > Hi,
> > > >
> > > > This series silences QEMU stderr unless the QTEST_LOG variable is set
> > > > and silences -qtest-log unless both QTEST_LOG and gtest's --verbose
> > > > flag is passed.
> > > >
> > > > This was motivated by Peter Maydell's ask to suppress deprecation
> > > > warn_report messages from the migration-tests and by my own
> > > > frustration over noisy output from qtest.
> 
> This isn't what I want, though -- what I want is that a
> qtest run should not print "warning:" messages for things
> that we expect to happen when we run that test. I *do* want
> warnings for things that we do not expect to happen when
> we run the test.

Currently we just allow the child QEMU process stdout/err
to inherit to the qtest program's stdout/err. With that
approach we have to do filtering at source (i.e. in QEMU
itself). I feel that in general it is a bad idea for the
program being tested to alter itself to suit the test
suite, not least because two different parts of the test
suite may have differing views about what messages they
want to ignore vs display.

We could address this by switching to the model used
with IO tests.  Always capture the child QEMU process
stdout/err to a pipe. The test program can apply regex
filters to cull output that is expected & irrelevant,
and then print out whatever is left over on its own
stderr.

That way all the filtering of undesirable messages would
be exclusively in the test suite, not QEMU itself.
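
As a rough sketch of the harness side (hypothetical code using GLib;
the noise pattern is only an example, not a proposal for what we
would actually filter):

#include <glib.h>
#include <stdio.h>

/*
 * Spawn QEMU with its stderr on a pipe, drop lines matching an
 * "expected noise" pattern, and forward everything else to the
 * test program's own stderr.
 */
static void run_qemu_filtered(char **qemu_argv)
{
    gint errfd = -1;
    GPid pid;
    GError *err = NULL;
    GIOChannel *chan;
    GRegex *noise;
    gchar *line;

    if (!g_spawn_async_with_pipes(NULL, qemu_argv, NULL,
                                  G_SPAWN_DO_NOT_REAP_CHILD,
                                  NULL, NULL, &pid,
                                  NULL, NULL, &errfd, &err)) {
        g_error("failed to spawn QEMU: %s", err->message);
    }

    chan = g_io_channel_unix_new(errfd);
    noise = g_regex_new("deprecated|expected warning", 0, 0, NULL);

    while (g_io_channel_read_line(chan, &line, NULL, NULL, NULL) ==
           G_IO_STATUS_NORMAL) {
        if (!g_regex_match(noise, line, 0, NULL)) {
            fputs(line, stderr);      /* only unexpected output survives */
        }
        g_free(line);
    }

    g_regex_unref(noise);
    g_io_channel_unref(chan);
    g_spawn_close_pid(pid);
}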


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: Gitlab CI caching is not working

2024-09-12 Thread Daniel P. Berrangé
On Thu, Sep 12, 2024 at 07:38:37AM +0200, Thomas Huth wrote:
> 
>  Hi!
> 
> While looking at some recent CI jobs, I noticed that the caching of the
> Gitlab-CI jobs is not working at all anymore. In the build jobs, the ccache
> saving is not working and causing a complete cache miss of each compile:
> 
>  https://gitlab.com/qemu-project/qemu/-/jobs/7802183187#L5328
> 
> And, maybe more important, in the avocado/functional jobs we don't cache the
> assets anymore, causing a re-download of multiple gigabytes each time:
> 
>  https://gitlab.com/qemu-project/qemu/-/jobs/7802183251#L29
> 
> (the du -chs in line 35 is not executed, thus the cache is nonexistent)
> 
> The problem is not new, it's been there for some weeks already, e.g. here's
> a run from the last freeze period (when the job was only running avocado
> tests):
> 
>  https://gitlab.com/qemu-project/qemu/-/jobs/7753544153#L86
> 
> There is a suspicious message at the beginning of the logs:
> 
>  "No URL provided, cache will not be downloaded from shared cache server.
> Instead a local version of cache will be extracted."
> 
> ... but since we use throw-away containers for building, I guess there is no
> local version of the cache?
> 
> Anyway, the problem only exists for the k8s runners, in my private clone of
> the repository that uses shared runners from gitlab, the caching is working
> right.

Right, by default caches are local to the runner's execution environment.
If you want a distributed cache, that has to be turned on in the runner
configuration:

  
https://docs.gitlab.com/runner/configuration/autoscale.html#distributed-runners-caching

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




[PATCH v2] hw/i386: define _AS_LATEST() macros for machine types

2024-09-10 Thread Daniel P. Berrangé
Follow the other architecture targets by adding extra macros for
defining a versioned machine type as the latest. This reduces the
size of the changes when introducing new machine types at the start
of each release cycle.

Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Cornelia Huck 
Signed-off-by: Daniel P. Berrangé 
---

In v2:

 - Rebased on top of new 9.2 machine types
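
As an illustration (a hypothetical sketch of a future 9.3 bump, not part
of this patch; hw_compat_9_2/pc_compat_9_2 are assumed to exist by then),
the start-of-cycle change for the i440fx machine would then reduce to
roughly:

static void pc_i440fx_machine_9_3_options(MachineClass *m)
{
    pc_i440fx_machine_options(m);
}

DEFINE_I440FX_MACHINE_AS_LATEST(9, 3);

static void pc_i440fx_machine_9_2_options(MachineClass *m)
{
    pc_i440fx_machine_9_3_options(m);
    compat_props_add(m->compat_props, hw_compat_9_2, hw_compat_9_2_len);
    compat_props_add(m->compat_props, pc_compat_9_2, pc_compat_9_2_len);
}

DEFINE_I440FX_MACHINE(9, 2);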

 hw/i386/pc_piix.c| 11 +--
 hw/i386/pc_q35.c | 11 ++-
 include/hw/i386/pc.h |  4 +++-
 3 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 2bf6865d40..4953676170 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -446,7 +446,10 @@ static void pc_i440fx_init(MachineState *machine)
 }
 
 #define DEFINE_I440FX_MACHINE(major, minor) \
-DEFINE_PC_VER_MACHINE(pc_i440fx, "pc-i440fx", pc_i440fx_init, major, 
minor);
+DEFINE_PC_VER_MACHINE(pc_i440fx, "pc-i440fx", pc_i440fx_init, false, NULL, 
major, minor);
+
+#define DEFINE_I440FX_MACHINE_AS_LATEST(major, minor) \
+DEFINE_PC_VER_MACHINE(pc_i440fx, "pc-i440fx", pc_i440fx_init, true, "pc", 
major, minor);
 
 static void pc_i440fx_machine_options(MachineClass *m)
 {
@@ -477,17 +480,13 @@ static void pc_i440fx_machine_options(MachineClass *m)
 static void pc_i440fx_machine_9_2_options(MachineClass *m)
 {
 pc_i440fx_machine_options(m);
-m->alias = "pc";
-m->is_default = true;
 }
 
-DEFINE_I440FX_MACHINE(9, 2);
+DEFINE_I440FX_MACHINE_AS_LATEST(9, 2);
 
 static void pc_i440fx_machine_9_1_options(MachineClass *m)
 {
 pc_i440fx_machine_9_2_options(m);
-m->alias = NULL;
-m->is_default = false;
 compat_props_add(m->compat_props, hw_compat_9_1, hw_compat_9_1_len);
 compat_props_add(m->compat_props, pc_compat_9_1, pc_compat_9_1_len);
 }
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 8319b6d45e..42bdedbaa4 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -327,10 +327,13 @@ static void pc_q35_init(MachineState *machine)
 }
 
 #define DEFINE_Q35_MACHINE(major, minor) \
-DEFINE_PC_VER_MACHINE(pc_q35, "pc-q35", pc_q35_init, major, minor);
+DEFINE_PC_VER_MACHINE(pc_q35, "pc-q35", pc_q35_init, false, NULL, major, 
minor);
+
+#define DEFINE_Q35_MACHINE_AS_LATEST(major, minor) \
+DEFINE_PC_VER_MACHINE(pc_q35, "pc-q35", pc_q35_init, false, "q35", major, 
minor);
 
 #define DEFINE_Q35_MACHINE_BUGFIX(major, minor, micro) \
-DEFINE_PC_VER_MACHINE(pc_q35, "pc-q35", pc_q35_init, major, minor, micro);
+DEFINE_PC_VER_MACHINE(pc_q35, "pc-q35", pc_q35_init, false, NULL, major, 
minor, micro);
 
 static void pc_q35_machine_options(MachineClass *m)
 {
@@ -359,15 +362,13 @@ static void pc_q35_machine_options(MachineClass *m)
 static void pc_q35_machine_9_2_options(MachineClass *m)
 {
 pc_q35_machine_options(m);
-m->alias = "q35";
 }
 
-DEFINE_Q35_MACHINE(9, 2);
+DEFINE_Q35_MACHINE_AS_LATEST(9, 2);
 
 static void pc_q35_machine_9_1_options(MachineClass *m)
 {
 pc_q35_machine_9_2_options(m);
-m->alias = NULL;
 compat_props_add(m->compat_props, hw_compat_9_1, hw_compat_9_1_len);
 compat_props_add(m->compat_props, pc_compat_9_1, pc_compat_9_1_len);
 }
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 14ee06287d..890427c56e 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -320,7 +320,7 @@ extern const size_t pc_compat_2_3_len;
 } \
 type_init(pc_machine_init_##suffix)
 
-#define DEFINE_PC_VER_MACHINE(namesym, namestr, initfn, ...) \
+#define DEFINE_PC_VER_MACHINE(namesym, namestr, initfn, isdefault, malias, 
...) \
 static void MACHINE_VER_SYM(init, namesym, __VA_ARGS__)( \
 MachineState *machine) \
 { \
@@ -334,6 +334,8 @@ extern const size_t pc_compat_2_3_len;
 MACHINE_VER_SYM(options, namesym, __VA_ARGS__)(mc); \
 mc->init = MACHINE_VER_SYM(init, namesym, __VA_ARGS__); \
 MACHINE_VER_DEPRECATION(__VA_ARGS__); \
+mc->is_default = isdefault; \
+mc->alias = malias; \
 } \
 static const TypeInfo MACHINE_VER_SYM(info, namesym, __VA_ARGS__) = \
 { \
-- 
2.43.0




Re: [PULL 20/24] audio: Add sndio backend

2024-09-10 Thread Daniel P. Berrangé
On Tue, Sep 10, 2024 at 04:16:23PM +0200, Philippe Mathieu-Daudé wrote:
> Hi,
> 
> (This is commit 663df1cc68).
> 
> On 27/9/22 10:19, Gerd Hoffmann wrote:
> > From: Alexandre Ratchov 
> > 
> > sndio is the native API used by OpenBSD, although it has been ported to
> > other *BSD's and Linux (packages for Ubuntu, Debian, Void, Arch, etc.).
> > 
> > Signed-off-by: Brad Smith 
> > Signed-off-by: Alexandre Ratchov 
> > Reviewed-by: Volker Rümelin 
> > Tested-by: Volker Rümelin 
> > Message-Id: 
> > Signed-off-by: Gerd Hoffmann 
> > ---
> >   meson_options.txt |   4 +-
> >   audio/audio_template.h|   2 +
> >   audio/audio.c |   1 +
> >   audio/sndioaudio.c| 565 ++
> >   MAINTAINERS   |   7 +
> >   audio/meson.build |   1 +
> >   meson.build   |   9 +-
> >   qapi/audio.json   |  25 +-
> >   qemu-options.hx   |  16 +
> >   scripts/meson-buildoptions.sh |   7 +-
> >   10 files changed, 632 insertions(+), 5 deletions(-)
> >   create mode 100644 audio/sndioaudio.c
> 
> 
> > diff --git a/audio/sndioaudio.c b/audio/sndioaudio.c
> > new file mode 100644
> > index ..7c45276d36ce
> > --- /dev/null
> > +++ b/audio/sndioaudio.c
> > @@ -0,0 +1,565 @@
> > +/*
> > + * SPDX-License-Identifier: ISC
> 
> This is the single use of the ISC license in the more than 10k
> files in the repository. Just checking IIUC this document:
> https://www.gnu.org/licenses/quick-guide-gplv3.en.html
> 
> ISC -> LGPLv2.1 -> GPLv2 -> GPLv3
> 
> So ISC is compatible with GPLv2-or-later. Is that correct?

ISC is a permissive license that's semantically pretty much equivalent
to either MIT or BSD 2 clause licenses and thus is broadly compatible
with most other licenses, including the various GPL variants/versions.

None the less, since sndioaudio.c was a new file, it should have been
submitted using the GPLv2+, unless there was a reason it needed to
diverge and use ISC.

An example justification for diverging is if the new code is derived
from some non-QEMU source that was already ISC.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH] tests/functional: Fix bad usage of has_cmd

2024-09-10 Thread Daniel P. Berrangé
On Tue, Sep 10, 2024 at 09:58:20AM +0200, Thomas Huth wrote:
> has_cmd returns a tuple, not a boolean value. This fixes a crash when
> e.g. "tesseract" is not available in the test_m68k_nextcube test.
> 
> Reported-by: Richard Henderson 
> Signed-off-by: Thomas Huth 
> ---
>  tests/functional/qemu_test/cmd.py   | 6 +++---
>  tests/functional/qemu_test/tesseract.py | 3 ++-
>  2 files changed, 5 insertions(+), 4 deletions(-)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




[PULL 10/10] crypto: Introduce x509 utils

2024-09-09 Thread Daniel P. Berrangé
From: Dorjoy Chowdhury 

A utility function for getting the fingerprint from an X.509 certificate
has been introduced. The implementation is only provided using gnutls.

Signed-off-by: Dorjoy Chowdhury 
[DB: fixed missing gnutls_x509_crt_deinit in success path]
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Daniel P. Berrangé 
---
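As a usage illustration (not part of this patch; the caller and its
already-loaded PEM buffer are hypothetical), the new helper combines
with the digest length macros added earlier in this series like so:

#include "qemu/osdep.h"
#include "qapi/error.h"
#include "crypto/x509-utils.h"

/* Hypothetical caller: print the SHA-256 fingerprint of a PEM cert. */
static bool print_cert_sha256_fingerprint(uint8_t *pem, size_t pemlen,
                                          Error **errp)
{
    uint8_t digest[QCRYPTO_HASH_DIGEST_LEN_SHA256];
    size_t digestlen = sizeof(digest);
    size_t i;

    if (qcrypto_get_x509_cert_fingerprint(pem, pemlen,
                                          QCRYPTO_HASH_ALG_SHA256,
                                          digest, &digestlen, errp) < 0) {
        return false;
    }

    for (i = 0; i < digestlen; i++) {
        printf("%02x", digest[i]);
    }
    printf("\n");
    return true;
}
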
 crypto/meson.build  |  4 ++
 crypto/x509-utils.c | 76 +
 include/crypto/x509-utils.h | 22 +++
 3 files changed, 102 insertions(+)
 create mode 100644 crypto/x509-utils.c
 create mode 100644 include/crypto/x509-utils.h

diff --git a/crypto/meson.build b/crypto/meson.build
index c46f9c22a7..735635de1f 100644
--- a/crypto/meson.build
+++ b/crypto/meson.build
@@ -24,6 +24,10 @@ crypto_ss.add(files(
   'rsakey.c',
 ))
 
+if gnutls.found()
+  crypto_ss.add(files('x509-utils.c'))
+endif
+
 if nettle.found()
   crypto_ss.add(nettle, files('hash-nettle.c', 'hmac-nettle.c', 
'pbkdf-nettle.c'))
   if hogweed.found()
diff --git a/crypto/x509-utils.c b/crypto/x509-utils.c
new file mode 100644
index 00..6e157af76b
--- /dev/null
+++ b/crypto/x509-utils.c
@@ -0,0 +1,76 @@
+/*
+ * X.509 certificate related helpers
+ *
+ * Copyright (c) 2024 Dorjoy Chowdhury 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * (at your option) any later version.  See the COPYING file in the
+ * top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "crypto/x509-utils.h"
+#include 
+#include 
+#include 
+
+static const int qcrypto_to_gnutls_hash_alg_map[QCRYPTO_HASH_ALG__MAX] = {
+[QCRYPTO_HASH_ALG_MD5] = GNUTLS_DIG_MD5,
+[QCRYPTO_HASH_ALG_SHA1] = GNUTLS_DIG_SHA1,
+[QCRYPTO_HASH_ALG_SHA224] = GNUTLS_DIG_SHA224,
+[QCRYPTO_HASH_ALG_SHA256] = GNUTLS_DIG_SHA256,
+[QCRYPTO_HASH_ALG_SHA384] = GNUTLS_DIG_SHA384,
+[QCRYPTO_HASH_ALG_SHA512] = GNUTLS_DIG_SHA512,
+[QCRYPTO_HASH_ALG_RIPEMD160] = GNUTLS_DIG_RMD160,
+};
+
+int qcrypto_get_x509_cert_fingerprint(uint8_t *cert, size_t size,
+  QCryptoHashAlgorithm alg,
+  uint8_t *result,
+  size_t *resultlen,
+  Error **errp)
+{
+int ret = -1;
+int hlen;
+gnutls_x509_crt_t crt;
+gnutls_datum_t datum = {.data = cert, .size = size};
+
+if (alg >= G_N_ELEMENTS(qcrypto_to_gnutls_hash_alg_map)) {
+error_setg(errp, "Unknown hash algorithm");
+return -1;
+}
+
+if (result == NULL) {
+error_setg(errp, "No valid buffer given");
+return -1;
+}
+
+gnutls_x509_crt_init(&crt);
+
+if (gnutls_x509_crt_import(crt, &datum, GNUTLS_X509_FMT_PEM) != 0) {
+error_setg(errp, "Failed to import certificate");
+goto cleanup;
+}
+
+hlen = gnutls_hash_get_len(qcrypto_to_gnutls_hash_alg_map[alg]);
+if (*resultlen < hlen) {
+error_setg(errp,
+   "Result buffer size %zu is smaller than hash %d",
+   *resultlen, hlen);
+goto cleanup;
+}
+
+if (gnutls_x509_crt_get_fingerprint(crt,
+qcrypto_to_gnutls_hash_alg_map[alg],
+result, resultlen) != 0) {
+error_setg(errp, "Failed to get fingerprint from certificate");
+goto cleanup;
+}
+
+ret = 0;
+
+ cleanup:
+gnutls_x509_crt_deinit(crt);
+return ret;
+}
diff --git a/include/crypto/x509-utils.h b/include/crypto/x509-utils.h
new file mode 100644
index 00..4210dfbcfc
--- /dev/null
+++ b/include/crypto/x509-utils.h
@@ -0,0 +1,22 @@
+/*
+ * X.509 certificate related helpers
+ *
+ * Copyright (c) 2024 Dorjoy Chowdhury 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * (at your option) any later version.  See the COPYING file in the
+ * top-level directory.
+ */
+
+#ifndef QCRYPTO_X509_UTILS_H
+#define QCRYPTO_X509_UTILS_H
+
+#include "crypto/hash.h"
+
+int qcrypto_get_x509_cert_fingerprint(uint8_t *cert, size_t size,
+  QCryptoHashAlgorithm hash,
+  uint8_t *result,
+  size_t *resultlen,
+  Error **errp);
+
+#endif
-- 
2.45.2




[PULL 04/10] tests/unit: always build the pbkdf crypto unit test

2024-09-09 Thread Daniel P. Berrangé
The meson rules were excluding the pbkdf crypto test when gnutls was the
crypto backend. It was then excluded again in #if statements in the test
file.

Rather than update these conditions, remove them all, and use the result
of the qcrypto_pbkdf2_supports() function to determine whether to skip
test registration.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Daniel P. Berrangé 
---
 tests/unit/meson.build |  4 +---
 tests/unit/test-crypto-pbkdf.c | 13 -
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/tests/unit/meson.build b/tests/unit/meson.build
index 490ab8182d..972d792883 100644
--- a/tests/unit/meson.build
+++ b/tests/unit/meson.build
@@ -121,9 +121,7 @@ if have_block
   if config_host_data.get('CONFIG_REPLICATION')
 tests += {'test-replication': [testblock]}
   endif
-  if nettle.found() or gcrypt.found()
-tests += {'test-crypto-pbkdf': [io]}
-  endif
+  tests += {'test-crypto-pbkdf': [io]}
 endif
 
 if have_system
diff --git a/tests/unit/test-crypto-pbkdf.c b/tests/unit/test-crypto-pbkdf.c
index 43c417f6b4..241e1c2cf0 100644
--- a/tests/unit/test-crypto-pbkdf.c
+++ b/tests/unit/test-crypto-pbkdf.c
@@ -25,8 +25,7 @@
 #include 
 #endif
 
-#if ((defined(CONFIG_NETTLE) || defined(CONFIG_GCRYPT)) && \
- (defined(_WIN32) || defined(RUSAGE_THREAD)))
+#if defined(_WIN32) || defined(RUSAGE_THREAD)
 #include "crypto/pbkdf.h"
 
 typedef struct QCryptoPbkdfTestData QCryptoPbkdfTestData;
@@ -394,7 +393,7 @@ static void test_pbkdf(const void *opaque)
 }
 
 
-static void test_pbkdf_timing(void)
+static void test_pbkdf_timing_sha256(void)
 {
 uint8_t key[32];
 uint8_t salt[32];
@@ -422,14 +421,18 @@ int main(int argc, char **argv)
 g_assert(qcrypto_init(NULL) == 0);
 
 for (i = 0; i < G_N_ELEMENTS(test_data); i++) {
+if (!qcrypto_pbkdf2_supports(test_data[i].hash)) {
+continue;
+}
+
 if (!test_data[i].slow ||
 g_test_slow()) {
 g_test_add_data_func(test_data[i].path, &test_data[i], test_pbkdf);
 }
 }
 
-if (g_test_slow()) {
-g_test_add_func("/crypt0/pbkdf/timing", test_pbkdf_timing);
+if (g_test_slow() && qcrypto_pbkdf2_supports(QCRYPTO_HASH_ALG_SHA256)) {
+g_test_add_func("/crypt0/pbkdf/timing/sha256", 
test_pbkdf_timing_sha256);
 }
 
 return g_test_run();
-- 
2.45.2




[PULL 07/10] crypto: use consistent error reporting pattern for unsupported cipher modes

2024-09-09 Thread Daniel P. Berrangé
Not all paths in qcrypto_cipher_ctx_new() were correctly distinguishing
between valid user input for cipher mode (which should report a
user-facing error), vs program logic errors (which should assert).

Reported-by: Peter Maydell 
Signed-off-by: Daniel P. Berrangé 
---
 crypto/cipher-nettle.c.inc | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/crypto/cipher-nettle.c.inc b/crypto/cipher-nettle.c.inc
index 766de036ba..2654b439c1 100644
--- a/crypto/cipher-nettle.c.inc
+++ b/crypto/cipher-nettle.c.inc
@@ -525,8 +525,10 @@ static QCryptoCipher 
*qcrypto_cipher_ctx_new(QCryptoCipherAlgorithm alg,
 case QCRYPTO_CIPHER_MODE_CTR:
 drv = &qcrypto_nettle_des_driver_ctr;
 break;
-default:
+case QCRYPTO_CIPHER_MODE_XTS:
 goto bad_cipher_mode;
+default:
+g_assert_not_reached();
 }
 
 ctx = g_new0(QCryptoNettleDES, 1);
@@ -551,8 +553,10 @@ static QCryptoCipher 
*qcrypto_cipher_ctx_new(QCryptoCipherAlgorithm alg,
 case QCRYPTO_CIPHER_MODE_CTR:
 drv = &qcrypto_nettle_des3_driver_ctr;
 break;
-default:
+case QCRYPTO_CIPHER_MODE_XTS:
 goto bad_cipher_mode;
+default:
+g_assert_not_reached();
 }
 
 ctx = g_new0(QCryptoNettleDES3, 1);
@@ -663,8 +667,10 @@ static QCryptoCipher 
*qcrypto_cipher_ctx_new(QCryptoCipherAlgorithm alg,
 case QCRYPTO_CIPHER_MODE_CTR:
 drv = &qcrypto_nettle_cast128_driver_ctr;
 break;
-default:
+case QCRYPTO_CIPHER_MODE_XTS:
 goto bad_cipher_mode;
+default:
+g_assert_not_reached();
 }
 
 ctx = g_new0(QCryptoNettleCAST128, 1);
@@ -741,8 +747,12 @@ static QCryptoCipher 
*qcrypto_cipher_ctx_new(QCryptoCipherAlgorithm alg,
 case QCRYPTO_CIPHER_MODE_ECB:
 drv = &qcrypto_nettle_sm4_driver_ecb;
 break;
-default:
+case QCRYPTO_CIPHER_MODE_CBC:
+case QCRYPTO_CIPHER_MODE_CTR:
+case QCRYPTO_CIPHER_MODE_XTS:
 goto bad_cipher_mode;
+default:
+g_assert_not_reached();
 }
 
 ctx = g_new0(QCryptoNettleSm4, 1);
-- 
2.45.2




[PULL 02/10] crypto: run qcrypto_pbkdf2_count_iters in a new thread

2024-09-09 Thread Daniel P. Berrangé
From: Tiago Pasqualini 

CPU time accounting in the kernel has been demonstrated to have a
sawtooth pattern[1][2]. This can cause the getrusage system call to
not be as accurate as we are expecting, which can cause this calculation
to stall.

The kernel discussions show that this inaccuracy happens when CPU time
gets big enough, so this patch changes qcrypto_pbkdf2_count_iters to run
in a fresh thread to avoid this inaccuracy. It also adds a sanity check
to fail the process if CPU time is not accounted.

[1] 
https://lore.kernel.org/lkml/159231011694.16989.16351419333851309713.tip-bot2@tip-bot2/
[2] 
https://lore.kernel.org/lkml/20221226031010.4079885-1-maxing@bytedance.com/t/#m1c7f2fdc0ea742776a70fd1aa2a2e414c437f534

Resolves: #2398
Signed-off-by: Tiago Pasqualini 
Signed-off-by: Daniel P. Berrangé 
---
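For context, a hypothetical caller-side sketch (not part of this patch;
the key length and error handling are illustrative) showing how the
measured iteration count gets consumed:

#include "qemu/osdep.h"
#include "qapi/error.h"
#include "crypto/pbkdf.h"

static uint64_t pick_pbkdf2_iterations(const uint8_t *pass, size_t npass,
                                       const uint8_t *salt, size_t nsalt,
                                       Error **errp)
{
    Error *local_err = NULL;
    uint64_t iters;

    /* Benchmarks PBKDF2 on this host (now in a throwaway thread) to
     * pick an iteration count for the real key derivation. */
    iters = qcrypto_pbkdf2_count_iters(QCRYPTO_HASH_ALG_SHA256,
                                       pass, npass, salt, nsalt,
                                       32 /* derived key length */,
                                       &local_err);
    if (local_err) {
        error_propagate(errp, local_err);
        return 0;
    }
    return iters;
}
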
 crypto/pbkdf.c | 53 +++---
 1 file changed, 46 insertions(+), 7 deletions(-)

diff --git a/crypto/pbkdf.c b/crypto/pbkdf.c
index 8d198c152c..d1c06ef3ed 100644
--- a/crypto/pbkdf.c
+++ b/crypto/pbkdf.c
@@ -19,6 +19,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/thread.h"
 #include "qapi/error.h"
 #include "crypto/pbkdf.h"
 #ifndef _WIN32
@@ -85,12 +86,28 @@ static int qcrypto_pbkdf2_get_thread_cpu(unsigned long long 
*val_ms,
 #endif
 }
 
-uint64_t qcrypto_pbkdf2_count_iters(QCryptoHashAlgorithm hash,
-const uint8_t *key, size_t nkey,
-const uint8_t *salt, size_t nsalt,
-size_t nout,
-Error **errp)
+typedef struct CountItersData {
+QCryptoHashAlgorithm hash;
+const uint8_t *key;
+size_t nkey;
+const uint8_t *salt;
+size_t nsalt;
+size_t nout;
+uint64_t iterations;
+Error **errp;
+} CountItersData;
+
+static void *threaded_qcrypto_pbkdf2_count_iters(void *data)
 {
+CountItersData *iters_data = (CountItersData *) data;
+QCryptoHashAlgorithm hash = iters_data->hash;
+const uint8_t *key = iters_data->key;
+size_t nkey = iters_data->nkey;
+const uint8_t *salt = iters_data->salt;
+size_t nsalt = iters_data->nsalt;
+size_t nout = iters_data->nout;
+Error **errp = iters_data->errp;
+
 uint64_t ret = -1;
 g_autofree uint8_t *out = g_new(uint8_t, nout);
 uint64_t iterations = (1 << 15);
@@ -114,7 +131,10 @@ uint64_t qcrypto_pbkdf2_count_iters(QCryptoHashAlgorithm 
hash,
 
 delta_ms = end_ms - start_ms;
 
-if (delta_ms > 500) {
+if (delta_ms == 0) { /* sanity check */
+error_setg(errp, "Unable to get accurate CPU usage");
+goto cleanup;
+} else if (delta_ms > 500) {
 break;
 } else if (delta_ms < 100) {
 iterations = iterations * 10;
@@ -129,5 +149,24 @@ uint64_t qcrypto_pbkdf2_count_iters(QCryptoHashAlgorithm 
hash,
 
  cleanup:
 memset(out, 0, nout);
-return ret;
+iters_data->iterations = ret;
+return NULL;
+}
+
+uint64_t qcrypto_pbkdf2_count_iters(QCryptoHashAlgorithm hash,
+const uint8_t *key, size_t nkey,
+const uint8_t *salt, size_t nsalt,
+size_t nout,
+Error **errp)
+{
+CountItersData data = {
+hash, key, nkey, salt, nsalt, nout, 0, errp
+};
+QemuThread thread;
+
+qemu_thread_create(&thread, "pbkdf2", threaded_qcrypto_pbkdf2_count_iters,
+   &data, QEMU_THREAD_JOINABLE);
+qemu_thread_join(&thread);
+
+return data.iterations;
 }
-- 
2.45.2




[PULL 01/10] iotests: fix expected output from gnutls

2024-09-09 Thread Daniel P. Berrangé
Error reporting from gnutls was improved by:

  commit 57941c9c86357a6a642f9ee3279d881df4043b6d
  Author: Daniel P. Berrangé 
  Date:   Fri Mar 15 14:07:58 2024 +

crypto: push error reporting into TLS session I/O APIs

This has the effect of changing the output from one of the NBD
tests.

Reported-by: Thomas Huth 
Signed-off-by: Daniel P. Berrangé 
---
 tests/qemu-iotests/233.out | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/233.out b/tests/qemu-iotests/233.out
index 1910f7df20..d498d55e0e 100644
--- a/tests/qemu-iotests/233.out
+++ b/tests/qemu-iotests/233.out
@@ -69,8 +69,8 @@ read 1048576/1048576 bytes at offset 1048576
 1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 
 == check TLS with authorization ==
-qemu-img: Could not open 'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': 
Failed to read option reply: Cannot read from TLS channel: Software caused 
connection abort
-qemu-img: Could not open 'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': 
Failed to read option reply: Cannot read from TLS channel: Software caused 
connection abort
+qemu-img: Could not open 'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': 
Failed to read option reply: Cannot read from TLS channel: The TLS connection 
was non-properly terminated.
+qemu-img: Could not open 'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': 
Failed to read option reply: Cannot read from TLS channel: The TLS connection 
was non-properly terminated.
 
 == check TLS fail over UNIX with no hostname ==
 qemu-img: Could not open 
'driver=nbd,path=SOCK_DIR/qemu-nbd.sock,tls-creds=tls0': No hostname for 
certificate validation
@@ -103,14 +103,14 @@ qemu-img: Could not open 
'driver=nbd,path=SOCK_DIR/qemu-nbd.sock,tls-creds=tls0'
 qemu-nbd: TLS handshake failed: The TLS connection was non-properly terminated.
 
 == final server log ==
-qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read 
from TLS channel: Software caused connection abort
-qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read 
from TLS channel: Software caused connection abort
+qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read 
from TLS channel: The TLS connection was non-properly terminated.
+qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read 
from TLS channel: The TLS connection was non-properly terminated.
 qemu-nbd: option negotiation failed: Verify failed: No certificate was found.
 qemu-nbd: option negotiation failed: Verify failed: No certificate was found.
 qemu-nbd: option negotiation failed: TLS x509 authz check for 
DISTINGUISHED-NAME is denied
 qemu-nbd: option negotiation failed: TLS x509 authz check for 
DISTINGUISHED-NAME is denied
-qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read 
from TLS channel: Software caused connection abort
-qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read 
from TLS channel: Software caused connection abort
+qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read 
from TLS channel: The TLS connection was non-properly terminated.
+qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read 
from TLS channel: The TLS connection was non-properly terminated.
 qemu-nbd: option negotiation failed: TLS handshake failed: An illegal 
parameter has been received.
 qemu-nbd: option negotiation failed: TLS handshake failed: An illegal 
parameter has been received.
 *** done
-- 
2.45.2




[PULL 06/10] crypto: avoid leak of ctx when bad cipher mode is given

2024-09-09 Thread Daniel P. Berrangé
Fixes: Coverity CID 1546884
Reviewed-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Daniel P. Berrangé 
---
 crypto/cipher-nettle.c.inc | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/crypto/cipher-nettle.c.inc b/crypto/cipher-nettle.c.inc
index 42b39e18a2..766de036ba 100644
--- a/crypto/cipher-nettle.c.inc
+++ b/crypto/cipher-nettle.c.inc
@@ -734,16 +734,19 @@ static QCryptoCipher 
*qcrypto_cipher_ctx_new(QCryptoCipherAlgorithm alg,
 #ifdef CONFIG_CRYPTO_SM4
 case QCRYPTO_CIPHER_ALG_SM4:
 {
-QCryptoNettleSm4 *ctx = g_new0(QCryptoNettleSm4, 1);
+QCryptoNettleSm4 *ctx;
+const QCryptoCipherDriver *drv;
 
 switch (mode) {
 case QCRYPTO_CIPHER_MODE_ECB:
-ctx->base.driver = &qcrypto_nettle_sm4_driver_ecb;
+drv = &qcrypto_nettle_sm4_driver_ecb;
 break;
 default:
 goto bad_cipher_mode;
 }
 
+ctx = g_new0(QCryptoNettleSm4, 1);
+ctx->base.driver = drv;
 sm4_set_encrypt_key(&ctx->key[0], key);
 sm4_set_decrypt_key(&ctx->key[1], key);
 
-- 
2.45.2




[PULL 08/10] crypto: Define macros for hash algorithm digest lengths

2024-09-09 Thread Daniel P. Berrangé
From: Dorjoy Chowdhury 

Reviewed-by: Daniel P. Berrangé 
Signed-off-by: Dorjoy Chowdhury 
Signed-off-by: Daniel P. Berrangé 
---
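A small illustration (hypothetical snippet, not part of this patch) of
what the macros enable: sizing digest buffers at compile time rather
than hard-coding magic numbers:

#include "qemu/osdep.h"
#include "crypto/hash.h"

static void digest_len_example(void)
{
    /* Fixed-size buffer for a SHA-256 digest, no magic "32" needed. */
    uint8_t digest[QCRYPTO_HASH_DIGEST_LEN_SHA256];

    memset(digest, 0, sizeof(digest));

    /* The runtime lookup still agrees with the compile-time constant. */
    g_assert(qcrypto_hash_digest_len(QCRYPTO_HASH_ALG_SHA256) ==
             sizeof(digest));
}
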
 crypto/hash.c | 14 +++---
 include/crypto/hash.h |  8 
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/crypto/hash.c b/crypto/hash.c
index b0f8228bdc..8087f5dae6 100644
--- a/crypto/hash.c
+++ b/crypto/hash.c
@@ -23,13 +23,13 @@
 #include "hashpriv.h"
 
 static size_t qcrypto_hash_alg_size[QCRYPTO_HASH_ALG__MAX] = {
-[QCRYPTO_HASH_ALG_MD5] = 16,
-[QCRYPTO_HASH_ALG_SHA1] = 20,
-[QCRYPTO_HASH_ALG_SHA224] = 28,
-[QCRYPTO_HASH_ALG_SHA256] = 32,
-[QCRYPTO_HASH_ALG_SHA384] = 48,
-[QCRYPTO_HASH_ALG_SHA512] = 64,
-[QCRYPTO_HASH_ALG_RIPEMD160] = 20,
+[QCRYPTO_HASH_ALG_MD5]   = QCRYPTO_HASH_DIGEST_LEN_MD5,
+[QCRYPTO_HASH_ALG_SHA1]  = QCRYPTO_HASH_DIGEST_LEN_SHA1,
+[QCRYPTO_HASH_ALG_SHA224]= QCRYPTO_HASH_DIGEST_LEN_SHA224,
+[QCRYPTO_HASH_ALG_SHA256]= QCRYPTO_HASH_DIGEST_LEN_SHA256,
+[QCRYPTO_HASH_ALG_SHA384]= QCRYPTO_HASH_DIGEST_LEN_SHA384,
+[QCRYPTO_HASH_ALG_SHA512]= QCRYPTO_HASH_DIGEST_LEN_SHA512,
+[QCRYPTO_HASH_ALG_RIPEMD160] = QCRYPTO_HASH_DIGEST_LEN_RIPEMD160,
 };
 
 size_t qcrypto_hash_digest_len(QCryptoHashAlgorithm alg)
diff --git a/include/crypto/hash.h b/include/crypto/hash.h
index 54d87aa2a1..a113cc3b04 100644
--- a/include/crypto/hash.h
+++ b/include/crypto/hash.h
@@ -23,6 +23,14 @@
 
 #include "qapi/qapi-types-crypto.h"
 
+#define QCRYPTO_HASH_DIGEST_LEN_MD5   16
+#define QCRYPTO_HASH_DIGEST_LEN_SHA1  20
+#define QCRYPTO_HASH_DIGEST_LEN_SHA22428
+#define QCRYPTO_HASH_DIGEST_LEN_SHA25632
+#define QCRYPTO_HASH_DIGEST_LEN_SHA38448
+#define QCRYPTO_HASH_DIGEST_LEN_SHA51264
+#define QCRYPTO_HASH_DIGEST_LEN_RIPEMD160 20
+
 /* See also "QCryptoHashAlgorithm" defined in qapi/crypto.json */
 
 /**
-- 
2.45.2




[PULL 03/10] crypto: check gnutls & gcrypt support the requested pbkdf hash

2024-09-09 Thread Daniel P. Berrangé
Both gnutls and gcrypt can be configured to exclude support for certain
algorithms via a runtime check against system crypto policies. Thus it
is not sufficient to have a compile time test for hash support in their
pbkdf implementations.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Daniel P. Berrangé 
---
 crypto/pbkdf-gcrypt.c | 2 +-
 crypto/pbkdf-gnutls.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/crypto/pbkdf-gcrypt.c b/crypto/pbkdf-gcrypt.c
index a8d8e64f4d..bc0719c831 100644
--- a/crypto/pbkdf-gcrypt.c
+++ b/crypto/pbkdf-gcrypt.c
@@ -33,7 +33,7 @@ bool qcrypto_pbkdf2_supports(QCryptoHashAlgorithm hash)
 case QCRYPTO_HASH_ALG_SHA384:
 case QCRYPTO_HASH_ALG_SHA512:
 case QCRYPTO_HASH_ALG_RIPEMD160:
-return true;
+return qcrypto_hash_supports(hash);
 default:
 return false;
 }
diff --git a/crypto/pbkdf-gnutls.c b/crypto/pbkdf-gnutls.c
index 2dfbbd382c..911b565bea 100644
--- a/crypto/pbkdf-gnutls.c
+++ b/crypto/pbkdf-gnutls.c
@@ -33,7 +33,7 @@ bool qcrypto_pbkdf2_supports(QCryptoHashAlgorithm hash)
 case QCRYPTO_HASH_ALG_SHA384:
 case QCRYPTO_HASH_ALG_SHA512:
 case QCRYPTO_HASH_ALG_RIPEMD160:
-return true;
+return qcrypto_hash_supports(hash);
 default:
 return false;
 }
-- 
2.45.2




[PULL 05/10] tests/unit: build pbkdf test on macOS

2024-09-09 Thread Daniel P. Berrangé
Add CONFIG_DARWIN to the pbkdf test build condition, since we have a way
to measure CPU time on this platform since commit bf98afc75efedf1.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Daniel P. Berrangé 
---
 tests/unit/test-crypto-pbkdf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/unit/test-crypto-pbkdf.c b/tests/unit/test-crypto-pbkdf.c
index 241e1c2cf0..39264cb662 100644
--- a/tests/unit/test-crypto-pbkdf.c
+++ b/tests/unit/test-crypto-pbkdf.c
@@ -25,7 +25,7 @@
 #include 
 #endif
 
-#if defined(_WIN32) || defined(RUSAGE_THREAD)
+#if defined(_WIN32) || defined(RUSAGE_THREAD) || defined(CONFIG_DARWNI)
 #include "crypto/pbkdf.h"
 
 typedef struct QCryptoPbkdfTestData QCryptoPbkdfTestData;
-- 
2.45.2




[PULL 09/10] crypto: Support SHA384 hash when using glib

2024-09-09 Thread Daniel P. Berrangé
From: Dorjoy Chowdhury 

QEMU requires a minimum glib version of 2.66.0 as per the root meson.build
file, and per the glib documentation[1] G_CHECKSUM_SHA384 has been
available since 2.51.

[1] https://docs.gtk.org/glib/enum.ChecksumType.html

Reviewed-by: Daniel P. Berrangé 
Signed-off-by: Dorjoy Chowdhury 
Signed-off-by: Daniel P. Berrangé 
---
 crypto/hash-glib.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/crypto/hash-glib.c b/crypto/hash-glib.c
index 82de9db705..18e64faa9c 100644
--- a/crypto/hash-glib.c
+++ b/crypto/hash-glib.c
@@ -29,7 +29,7 @@ static int qcrypto_hash_alg_map[QCRYPTO_HASH_ALG__MAX] = {
 [QCRYPTO_HASH_ALG_SHA1] = G_CHECKSUM_SHA1,
 [QCRYPTO_HASH_ALG_SHA224] = -1,
 [QCRYPTO_HASH_ALG_SHA256] = G_CHECKSUM_SHA256,
-[QCRYPTO_HASH_ALG_SHA384] = -1,
+[QCRYPTO_HASH_ALG_SHA384] = G_CHECKSUM_SHA384,
 [QCRYPTO_HASH_ALG_SHA512] = G_CHECKSUM_SHA512,
 [QCRYPTO_HASH_ALG_RIPEMD160] = -1,
 };
-- 
2.45.2




[PULL 00/10] Crypto fixes patches

2024-09-09 Thread Daniel P. Berrangé
The following changes since commit f2aee60305a1e40374b2fc1093e4d04404e780ee:

  Merge tag 'pull-request-2024-09-08' of https://gitlab.com/huth/qemu into 
staging (2024-09-09 10:47:24 +0100)

are available in the Git repository at:

  https://gitlab.com/berrange/qemu tags/crypto-fixes-pull-request

for you to fetch changes up to 10a1d34fc0d4dfe0dd6f5ec73f62dc1afa04af6c:

  crypto: Introduce x509 utils (2024-09-09 15:13:38 +0100)


Various crypto fixes

 * Support sha384 with glib crypto backend
 * Improve error reporting for unsupported cipher modes
 * Avoid memory leak when bad cipher mode is given
 * Run pbkdf tests on macOS
 * Runtime check for pbkdf hash impls with gnutls & gcrypt
 * Avoid hangs counting pbkdf iterations on some Linux kernels
   by using a throwaway thread for benchmarking performance
 * Fix iotests expected output from gnutls errors

--------

Daniel P. Berrangé (6):
  iotests: fix expected output from gnutls
  crypto: check gnutls & gcrypt support the requested pbkdf hash
  tests/unit: always build the pbkdf crypto unit test
  tests/unit: build pbkdf test on macOS
  crypto: avoid leak of ctx when bad cipher mode is given
  crypto: use consistent error reporting pattern for unsupported cipher
modes

Dorjoy Chowdhury (3):
  crypto: Define macros for hash algorithm digest lengths
  crypto: Support SHA384 hash when using glib
  crypto: Introduce x509 utils

Tiago Pasqualini (1):
  crypto: run qcrypto_pbkdf2_count_iters in a new thread

 crypto/cipher-nettle.c.inc | 25 ---
 crypto/hash-glib.c |  2 +-
 crypto/hash.c  | 14 +++
 crypto/meson.build |  4 ++
 crypto/pbkdf-gcrypt.c  |  2 +-
 crypto/pbkdf-gnutls.c  |  2 +-
 crypto/pbkdf.c | 53 
 crypto/x509-utils.c| 76 ++
 include/crypto/hash.h  |  8 
 include/crypto/x509-utils.h| 22 ++
 tests/qemu-iotests/233.out | 12 +++---
 tests/unit/meson.build |  4 +-
 tests/unit/test-crypto-pbkdf.c | 13 +++---
 13 files changed, 200 insertions(+), 37 deletions(-)
 create mode 100644 crypto/x509-utils.c
 create mode 100644 include/crypto/x509-utils.h

-- 
2.45.2




[PATCH] gitlab: fix logic for changing docker tag on stable branches

2024-09-06 Thread Daniel P. Berrangé
This fixes:

  commit e28112d00703abd136e2411d23931f4f891c9244
  Author: Daniel P. Berrangé 
  Date:   Thu Jun 8 17:40:16 2023 +0100

gitlab: stable staging branches publish containers in a separate tag

Due to a copy+paste mistake, that commit included "QEMU_JOB_SKIPPED"
in the final rule that was meant to be a 'catch all' for staging
branches.

As a result stable branches are still splattering dockers from the
primary development branch.

Signed-off-by: Daniel P. Berrangé 
---

This should be pulled into all stable branches that have the above
mentioned commit content present.

 .gitlab-ci.d/base.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.gitlab-ci.d/base.yml b/.gitlab-ci.d/base.yml
index bf3d8efab6..25b88aaa06 100644
--- a/.gitlab-ci.d/base.yml
+++ b/.gitlab-ci.d/base.yml
@@ -128,7 +128,7 @@ variables:
   when: manual
 
 # Jobs can run if any jobs they depend on were successful
-- if: '$QEMU_JOB_SKIPPED && $CI_PROJECT_NAMESPACE == $QEMU_CI_UPSTREAM && 
$CI_COMMIT_BRANCH =~ /staging-[[:digit:]]+\.[[:digit:]]/'
+- if: '$CI_PROJECT_NAMESPACE == $QEMU_CI_UPSTREAM && $CI_COMMIT_BRANCH =~ 
/staging-[[:digit:]]+\.[[:digit:]]/'
   when: on_success
   variables:
 QEMU_CI_CONTAINER_TAG: $CI_COMMIT_REF_SLUG
-- 
2.45.2




Re: [PATCH v6 0/8] AWS Nitro Enclave emulation support

2024-09-06 Thread Daniel P. Berrangé
On Fri, Sep 06, 2024 at 01:57:27AM +0600, Dorjoy Chowdhury wrote:
> This is v6 submission for AWS Nitro Enclave emulation in QEMU. From the QEMU 
> side
> the implementation for nitro enclaves is complete. v5 is at:
> https://lists.gnu.org/archive/html/qemu-devel/2024-08/msg03251.html

snip.

Since the first three patches just touch generic crypto subsystem
code, and I'm preparing a crypto pull request, I'm going to queue
those three patches myself, leaving the remainder of this series
for x86 maintainers.


> 
> Dorjoy Chowdhury (8):
>   crypto: Define macros for hash algorithm digest lengths
>   crypto: Support SHA384 hash when using glib
>   crypto: Introduce x509 utils
>   tests/lcitool: Update libvirt-ci and add libcbor dependency
>   device/virtio-nsm: Support for Nitro Secure Module device
>   hw/core: Add Enclave Image Format (EIF) related helpers
>   machine/nitro-enclave: New machine type for AWS Nitro Enclaves
>   docs/nitro-enclave: Documentation for nitro-enclave machine type

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v6 3/8] crypto: Introduce x509 utils

2024-09-06 Thread Daniel P. Berrangé
On Fri, Sep 06, 2024 at 01:57:30AM +0600, Dorjoy Chowdhury wrote:
> An utility function for getting fingerprint from X.509 certificate
> has been introduced. Implementation only provided using gnutls.
> 
> Signed-off-by: Dorjoy Chowdhury 
> ---
>  crypto/meson.build  |  4 ++
>  crypto/x509-utils.c | 75 +
>  include/crypto/x509-utils.h | 22 +++
>  3 files changed, 101 insertions(+)
>  create mode 100644 crypto/x509-utils.c
>  create mode 100644 include/crypto/x509-utils.h


> +int qcrypto_get_x509_cert_fingerprint(uint8_t *cert, size_t size,
> +  QCryptoHashAlgorithm alg,
> +  uint8_t *result,
> +  size_t *resultlen,
> +  Error **errp)
> +{
> +int ret;
> +gnutls_x509_crt_t crt;
> +gnutls_datum_t datum = {.data = cert, .size = size};
> +
> +if (alg >= G_N_ELEMENTS(qcrypto_to_gnutls_hash_alg_map)) {
> +error_setg(errp, "Unknown hash algorithm");
> +return -1;
> +}
> +
> +if (result == NULL) {
> +error_setg(errp, "No valid buffer given");
> +return -1;
> +}
> +
> +gnutls_x509_crt_init(&crt);
> +
> +if (gnutls_x509_crt_import(crt, &datum, GNUTLS_X509_FMT_PEM) != 0) {
> +error_setg(errp, "Failed to import certificate");
> +goto cleanup;
> +}
> +
> +ret = gnutls_hash_get_len(qcrypto_to_gnutls_hash_alg_map[alg]);
> +if (*resultlen < ret) {
> +error_setg(errp,
> +   "Result buffer size %zu is smaller than hash %d",
> +   *resultlen, ret);
> +goto cleanup;
> +}
> +
> +if (gnutls_x509_crt_get_fingerprint(crt,
> +qcrypto_to_gnutls_hash_alg_map[alg],
> +result, resultlen) != 0) {
> +error_setg(errp, "Failed to get fingerprint from certificate");
> +goto cleanup;
> +}
> +
> +return 0;
> +
> + cleanup:
> +gnutls_x509_crt_deinit(crt);
> +return -1;
> +}

This fails to call gnutls_x509_crt_deinit in the success path.

I'm going to squash in the following change:


diff --git a/crypto/x509-utils.c b/crypto/x509-utils.c
index 593eb8968b..6e157af76b 100644
--- a/crypto/x509-utils.c
+++ b/crypto/x509-utils.c
@@ -31,7 +31,8 @@ int qcrypto_get_x509_cert_fingerprint(uint8_t *cert, size_t 
size,
   size_t *resultlen,
   Error **errp)
 {
-int ret;
+int ret = -1;
+int hlen;
 gnutls_x509_crt_t crt;
 gnutls_datum_t datum = {.data = cert, .size = size};
 
@@ -52,11 +53,11 @@ int qcrypto_get_x509_cert_fingerprint(uint8_t *cert, size_t 
size,
 goto cleanup;
 }
 
-ret = gnutls_hash_get_len(qcrypto_to_gnutls_hash_alg_map[alg]);
-if (*resultlen < ret) {
+hlen = gnutls_hash_get_len(qcrypto_to_gnutls_hash_alg_map[alg]);
+if (*resultlen < hlen) {
 error_setg(errp,
"Result buffer size %zu is smaller than hash %d",
-   *resultlen, ret);
+   *resultlen, hlen);
 goto cleanup;
 }
 
@@ -67,9 +68,9 @@ int qcrypto_get_x509_cert_fingerprint(uint8_t *cert, size_t 
size,
 goto cleanup;
 }
 
-return 0;
+ret = 0;
 
  cleanup:
 gnutls_x509_crt_deinit(crt);
-return -1;
+return ret;
 }



With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [RFC PATCH 0/2] qtest: Log verbosity changes

2024-09-06 Thread Daniel P. Berrangé
On Fri, Sep 06, 2024 at 08:16:31AM +0200, Thomas Huth wrote:
> On 05/09/2024 23.03, Fabiano Rosas wrote:
> > Hi,
> > 
> > This series silences QEMU stderr unless the QTEST_LOG variable is set
> > and silences -qtest-log unless both QTEST_LOG and gtest's --verbose
> > flag is passed.
> > 
> > This was motivated by Peter Maydell's ask to suppress deprecation
> > warn_report messages from the migration-tests and by my own
> > frustration over noisy output from qtest.
> 
> Not sure whether we want to ignore stderr by default... we might also miss
> important warnings or error messages that way...?

I would prefer if our tests were quiet by default, and just printed
clear pass/fail notices without extraneous fluff. Having an opt-in
to see full messages from stderr feels good enough for debugging cases
where you need more info from a particular test.

Probably we should set verbose mode in CI though, since it is tedious
to re-run CI on failure to gather more info.

> If you just want to suppress one certain warning, I think it's maybe best to
> fence it with "if (!qtest_enabled()) { ... }" on the QEMU side - at least
> that's what we did in similar cases a couple of times, IIRC.

We've got a surprisingly large number of if(qtest_enabled()) conditions
in the code. I can't help feeling this is a bad idea in the long term,
as it's making us take different codepaths when testing than in production.
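
e.g. an illustrative fragment of the pattern (not a specific instance
I have in mind):

    /* production code silencing itself when running under qtest */
    if (!qtest_enabled()) {
        warn_report("machine type FOO is deprecated, use BAR instead");
    }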

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH for-9.2] hw: add compat machines for 9.2

2024-09-06 Thread Daniel P. Berrangé
On Thu, Sep 05, 2024 at 08:05:14PM +0100, Peter Maydell wrote:
> On Thu, 5 Sept 2024 at 19:22, Daniel P. Berrangé  wrote:
> >
> > On Fri, Aug 16, 2024 at 11:47:16AM +0100, Daniel P. Berrangé wrote:
> > > On Fri, Aug 16, 2024 at 12:37:23PM +0200, Cornelia Huck wrote:
> > > > Add 9.2 machine types for arm/i440fx/m68k/q35/s390x/spapr.
> > > >
> > > > Signed-off-by: Cornelia Huck 
> > > > ---
> > > >  hw/arm/virt.c  |  9 -
> > > >  hw/core/machine.c  |  3 +++
> > > >  hw/i386/pc.c   |  3 +++
> > > >  hw/i386/pc_piix.c  | 15 ---
> > > >  hw/i386/pc_q35.c   | 13 +++--
> > > >  hw/m68k/virt.c |  9 -
> > > >  hw/ppc/spapr.c | 15 +--
> > > >  hw/s390x/s390-virtio-ccw.c | 14 +-
> > > >  include/hw/boards.h|  3 +++
> > > >  include/hw/i386/pc.h   |  3 +++
> > > >  10 files changed, 77 insertions(+), 10 deletions(-)
> > >
> > > Reviewed-by: Daniel P. Berrangé 
> > >
> > >
> > > > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> > > > index d9e69243b4a7..746bfe05d386 100644
> > > > --- a/hw/i386/pc_piix.c
> > > > +++ b/hw/i386/pc_piix.c
> > > > @@ -479,13 +479,24 @@ static void 
> > > > pc_i440fx_machine_options(MachineClass *m)
> > > >   "Use a different south bridge 
> > > > than PIIX3");
> > > >  }
> > > >
> > > > -static void pc_i440fx_machine_9_1_options(MachineClass *m)
> > > > +static void pc_i440fx_machine_9_2_options(MachineClass *m)
> > > >  {
> > > >  pc_i440fx_machine_options(m);
> > > >  m->alias = "pc";
> > > >  m->is_default = true;
> > > >  }
> > > >
> > > > +DEFINE_I440FX_MACHINE(9, 2);
> > > > +
> > > > +static void pc_i440fx_machine_9_1_options(MachineClass *m)
> > > > +{
> > > > +pc_i440fx_machine_9_2_options(m);
> > > > +m->alias = NULL;
> > > > +m->is_default = false;
> > > > +compat_props_add(m->compat_props, hw_compat_9_1, 
> > > > hw_compat_9_1_len);
> > > > +compat_props_add(m->compat_props, pc_compat_9_1, 
> > > > pc_compat_9_1_len);
> > > > +}
> > > > +
> > > >  DEFINE_I440FX_MACHINE(9, 1);
> > > >
> > > >  static void pc_i440fx_machine_9_0_options(MachineClass *m)
> > > > @@ -493,8 +504,6 @@ static void 
> > > > pc_i440fx_machine_9_0_options(MachineClass *m)
> > > >  PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
> > > >
> > > >  pc_i440fx_machine_9_1_options(m);
> > > > -m->alias = NULL;
> > > > -m->is_default = false;
> > > >  m->smbios_memory_device_size = 16 * GiB;
> > >
> > > Feels like we should be adding an "_AS_LATEST" macro
> > > variant for piix/q35 too, so it matches the pattern
> > > in other targets for handling alias & is_default.
> > >
> > > Not a thing your patch needs to do though.
> >
> > I've just written a patch that does that now. If it looks good & you want to
> > include it as a pre-requisite for your patch here, feel free to grab it;
> > otherwise I can rebase it after your patch merges.
> 
> I have this patch in my target-arm pullreq that's currently posted
> and pending merge, by the way.

Ah ok, I'll rebase on top of that then.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH for-9.2] hw: add compat machines for 9.2

2024-09-05 Thread Daniel P. Berrangé
On Fri, Aug 16, 2024 at 11:47:16AM +0100, Daniel P. Berrangé wrote:
> On Fri, Aug 16, 2024 at 12:37:23PM +0200, Cornelia Huck wrote:
> > Add 9.2 machine types for arm/i440fx/m68k/q35/s390x/spapr.
> > 
> > Signed-off-by: Cornelia Huck 
> > ---
> >  hw/arm/virt.c  |  9 -
> >  hw/core/machine.c  |  3 +++
> >  hw/i386/pc.c   |  3 +++
> >  hw/i386/pc_piix.c  | 15 ---
> >  hw/i386/pc_q35.c   | 13 +++--
> >  hw/m68k/virt.c |  9 -
> >  hw/ppc/spapr.c | 15 +--
> >  hw/s390x/s390-virtio-ccw.c | 14 +-
> >  include/hw/boards.h|  3 +++
> >  include/hw/i386/pc.h   |  3 +++
> >  10 files changed, 77 insertions(+), 10 deletions(-)
> 
> Reviewed-by: Daniel P. Berrangé 
> 
> 
> > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> > index d9e69243b4a7..746bfe05d386 100644
> > --- a/hw/i386/pc_piix.c
> > +++ b/hw/i386/pc_piix.c
> > @@ -479,13 +479,24 @@ static void pc_i440fx_machine_options(MachineClass *m)
> >   "Use a different south bridge than 
> > PIIX3");
> >  }
> >  
> > -static void pc_i440fx_machine_9_1_options(MachineClass *m)
> > +static void pc_i440fx_machine_9_2_options(MachineClass *m)
> >  {
> >  pc_i440fx_machine_options(m);
> >  m->alias = "pc";
> >  m->is_default = true;
> >  }
> >  
> > +DEFINE_I440FX_MACHINE(9, 2);
> > +
> > +static void pc_i440fx_machine_9_1_options(MachineClass *m)
> > +{
> > +pc_i440fx_machine_9_2_options(m);
> > +m->alias = NULL;
> > +m->is_default = false;
> > +compat_props_add(m->compat_props, hw_compat_9_1, hw_compat_9_1_len);
> > +compat_props_add(m->compat_props, pc_compat_9_1, pc_compat_9_1_len);
> > +}
> > +
> >  DEFINE_I440FX_MACHINE(9, 1);
> >  
> >  static void pc_i440fx_machine_9_0_options(MachineClass *m)
> > @@ -493,8 +504,6 @@ static void pc_i440fx_machine_9_0_options(MachineClass 
> > *m)
> >  PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
> >  
> >  pc_i440fx_machine_9_1_options(m);
> > -m->alias = NULL;
> > -m->is_default = false;
> >  m->smbios_memory_device_size = 16 * GiB;
> 
> Feels like we should be adding an "_AS_LATEST" macro
> variant for piix/q35 too, so it matches the pattern
> in other targets for handling alias & is_default.
> 
> Not a thing your patch needs to do though.

I've just written a patch that does that now. If it looks good & you want to
include it as a pre-requisite for your patch here feel free to grab it,
otherwise I can rebase it after your patch merges.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




[PATCH] hw/i386: define _AS_LATEST() macros for machine types

2024-09-05 Thread Daniel P . Berrangé
Follow the other architecture targets by adding extra macros for
defining a versioned machine type as the latest. This reduces the
size of the changes when introducing new machine types at the start
of each release cycle.

Signed-off-by: Daniel P. Berrangé 
---
 hw/i386/pc_piix.c| 11 +--
 hw/i386/pc_q35.c | 11 ++-
 include/hw/i386/pc.h |  4 +++-
 3 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 347afa4c37..f04592ad4d 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -446,7 +446,10 @@ static void pc_i440fx_init(MachineState *machine)
 }
 
 #define DEFINE_I440FX_MACHINE(major, minor) \
-DEFINE_PC_VER_MACHINE(pc_i440fx, "pc-i440fx", pc_i440fx_init, major, 
minor);
+DEFINE_PC_VER_MACHINE(pc_i440fx, "pc-i440fx", pc_i440fx_init, false, NULL, 
major, minor);
+
+#define DEFINE_I440FX_MACHINE_AS_LATEST(major, minor) \
+DEFINE_PC_VER_MACHINE(pc_i440fx, "pc-i440fx", pc_i440fx_init, true, "pc", 
major, minor);
 
 static void pc_i440fx_machine_options(MachineClass *m)
 {
@@ -477,19 +480,15 @@ static void pc_i440fx_machine_options(MachineClass *m)
 static void pc_i440fx_machine_9_1_options(MachineClass *m)
 {
 pc_i440fx_machine_options(m);
-m->alias = "pc";
-m->is_default = true;
 }
 
-DEFINE_I440FX_MACHINE(9, 1);
+DEFINE_I440FX_MACHINE_AS_LATEST(9, 1);
 
 static void pc_i440fx_machine_9_0_options(MachineClass *m)
 {
 PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
 
 pc_i440fx_machine_9_1_options(m);
-m->alias = NULL;
-m->is_default = false;
 m->smbios_memory_device_size = 16 * GiB;
 
 compat_props_add(m->compat_props, hw_compat_9_0, hw_compat_9_0_len);
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index f2d8edfa84..ec4f373b07 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -327,10 +327,13 @@ static void pc_q35_init(MachineState *machine)
 }
 
 #define DEFINE_Q35_MACHINE(major, minor) \
-DEFINE_PC_VER_MACHINE(pc_q35, "pc-q35", pc_q35_init, major, minor);
+DEFINE_PC_VER_MACHINE(pc_q35, "pc-q35", pc_q35_init, false, NULL, major, 
minor);
+
+#define DEFINE_Q35_MACHINE_AS_LATEST(major, minor) \
+DEFINE_PC_VER_MACHINE(pc_q35, "pc-q35", pc_q35_init, false, "q35", major, 
minor);
 
 #define DEFINE_Q35_MACHINE_BUGFIX(major, minor, micro) \
-DEFINE_PC_VER_MACHINE(pc_q35, "pc-q35", pc_q35_init, major, minor, micro);
+DEFINE_PC_VER_MACHINE(pc_q35, "pc-q35", pc_q35_init, false, NULL, major, 
minor, micro);
 
 static void pc_q35_machine_options(MachineClass *m)
 {
@@ -359,16 +362,14 @@ static void pc_q35_machine_options(MachineClass *m)
 static void pc_q35_machine_9_1_options(MachineClass *m)
 {
 pc_q35_machine_options(m);
-m->alias = "q35";
 }
 
-DEFINE_Q35_MACHINE(9, 1);
+DEFINE_Q35_MACHINE_AS_LATEST(9, 1);
 
 static void pc_q35_machine_9_0_options(MachineClass *m)
 {
 PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
 pc_q35_machine_9_1_options(m);
-m->alias = NULL;
 m->smbios_memory_device_size = 16 * GiB;
 compat_props_add(m->compat_props, hw_compat_9_0, hw_compat_9_0_len);
 compat_props_add(m->compat_props, pc_compat_9_0, pc_compat_9_0_len);
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 4e55d7ef6e..5775addb90 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -317,7 +317,7 @@ extern const size_t pc_compat_2_3_len;
 } \
 type_init(pc_machine_init_##suffix)
 
-#define DEFINE_PC_VER_MACHINE(namesym, namestr, initfn, ...) \
+#define DEFINE_PC_VER_MACHINE(namesym, namestr, initfn, isdefault, malias, 
...) \
 static void MACHINE_VER_SYM(init, namesym, __VA_ARGS__)( \
 MachineState *machine) \
 { \
@@ -331,6 +331,8 @@ extern const size_t pc_compat_2_3_len;
 MACHINE_VER_SYM(options, namesym, __VA_ARGS__)(mc); \
 mc->init = MACHINE_VER_SYM(init, namesym, __VA_ARGS__); \
 MACHINE_VER_DEPRECATION(__VA_ARGS__); \
+mc->is_default = isdefault; \
+mc->alias = malias; \
 } \
 static const TypeInfo MACHINE_VER_SYM(info, namesym, __VA_ARGS__) = \
 { \
-- 
2.45.2




[PATCH v2 7/7] hw/net: improve tracing of eBPF RSS setup

2024-09-05 Thread Daniel P . Berrangé
This adds more trace events to key eBPF RSS setup operations, and
also distinguishes events from multiple NIC instances.

Signed-off-by: Daniel P. Berrangé 
---
 hw/net/trace-events | 8 +---
 hw/net/virtio-net.c | 9 ++---
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/hw/net/trace-events b/hw/net/trace-events
index 78efa2ec2c..6cad34aba2 100644
--- a/hw/net/trace-events
+++ b/hw/net/trace-events
@@ -399,9 +399,11 @@ virtio_net_announce_notify(void) ""
 virtio_net_announce_timer(int round) "%d"
 virtio_net_handle_announce(int round) "%d"
 virtio_net_post_load_device(void)
-virtio_net_rss_disable(void)
-virtio_net_rss_error(const char *msg, uint32_t value) "%s, value 0x%08x"
-virtio_net_rss_enable(uint32_t p1, uint16_t p2, uint8_t p3) "hashes 0x%x, 
table of %d, key of %d"
+virtio_net_rss_load(void *nic, size_t nfds, void *fds) "nic=%p nfds=%zu fds=%p"
+virtio_net_rss_attach_ebpf(void *nic, int prog_fd) "nic=%p prog-fd=%d"
+virtio_net_rss_disable(void *nic) "nic=%p"
+virtio_net_rss_error(void *nic, const char *msg, uint32_t value) "nic=%p 
msg=%s, value 0x%08x"
+virtio_net_rss_enable(void *nic, uint32_t p1, uint16_t p2, uint8_t p3) "nic=%p 
hashes 0x%x, table of %d, key of %d"
 
 # tulip.c
 tulip_reg_write(uint64_t addr, const char *name, int size, uint64_t val) "addr 
0x%02"PRIx64" (%s) size %d value 0x%08"PRIx64
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index f2690390c1..5d26a8e260 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1241,6 +1241,7 @@ static bool virtio_net_attach_ebpf_to_backend(NICState 
*nic, int prog_fd)
 return false;
 }
 
+trace_virtio_net_rss_attach_ebpf(nic, prog_fd);
 return nc->info->set_steering_ebpf(nc, prog_fd);
 }
 
@@ -1297,12 +1298,13 @@ static void virtio_net_commit_rss_config(VirtIONet *n)
 }
 }
 
-trace_virtio_net_rss_enable(n->rss_data.hash_types,
+trace_virtio_net_rss_enable(n,
+n->rss_data.hash_types,
 n->rss_data.indirections_len,
 sizeof(n->rss_data.key));
 } else {
 virtio_net_detach_ebpf_rss(n);
-trace_virtio_net_rss_disable();
+trace_virtio_net_rss_disable(n);
 }
 }
 
@@ -1353,6 +1355,7 @@ static bool virtio_net_load_ebpf(VirtIONet *n, Error 
**errp)
 bool ret = false;
 
 if (virtio_net_attach_ebpf_to_backend(n->nic, -1)) {
+trace_virtio_net_rss_load(n, n->nr_ebpf_rss_fds, n->ebpf_rss_fds);
 if (n->ebpf_rss_fds) {
 ret = virtio_net_load_ebpf_fds(n, errp);
 } else {
@@ -1484,7 +1487,7 @@ static uint16_t virtio_net_handle_rss(VirtIONet *n,
 virtio_net_commit_rss_config(n);
 return queue_pairs;
 error:
-trace_virtio_net_rss_error(err_msg, err_value);
+trace_virtio_net_rss_error(n, err_msg, err_value);
 virtio_net_disable_rss(n);
 return 0;
 }
-- 
2.45.2




[PATCH v2 3/7] ebpf: improve error trace events

2024-09-05 Thread Daniel P . Berrangé
A design pattern of

   trace_foo_error("descriptive string")

is undesirable because it does not allow for filtering trace events
based on the error scenario. Split eBPF error trace event into three
separate events to address this filtering need.

Signed-off-by: Daniel P. Berrangé 
---
 ebpf/ebpf_rss.c   | 10 +-
 ebpf/trace-events |  4 +++-
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/ebpf/ebpf_rss.c b/ebpf/ebpf_rss.c
index aa7170d997..47186807ad 100644
--- a/ebpf/ebpf_rss.c
+++ b/ebpf/ebpf_rss.c
@@ -53,21 +53,21 @@ static bool ebpf_rss_mmap(struct EBPFRSSContext *ctx)
PROT_READ | PROT_WRITE, MAP_SHARED,
ctx->map_configuration, 0);
 if (ctx->mmap_configuration == MAP_FAILED) {
-trace_ebpf_error("eBPF RSS", "can not mmap eBPF configuration array");
+trace_ebpf_rss_mmap_error(ctx, "configuration");
 return false;
 }
 ctx->mmap_toeplitz_key = mmap(NULL, qemu_real_host_page_size(),
PROT_READ | PROT_WRITE, MAP_SHARED,
ctx->map_toeplitz_key, 0);
 if (ctx->mmap_toeplitz_key == MAP_FAILED) {
-trace_ebpf_error("eBPF RSS", "can not mmap eBPF toeplitz key");
+trace_ebpf_rss_mmap_error(ctx, "toeplitz key");
 goto toeplitz_fail;
 }
 ctx->mmap_indirections_table = mmap(NULL, qemu_real_host_page_size(),
PROT_READ | PROT_WRITE, MAP_SHARED,
ctx->map_indirections_table, 0);
 if (ctx->mmap_indirections_table == MAP_FAILED) {
-trace_ebpf_error("eBPF RSS", "can not mmap eBPF indirection table");
+trace_ebpf_rss_mmap_error(ctx, "indirections table");
 goto indirection_fail;
 }
 
@@ -105,14 +105,14 @@ bool ebpf_rss_load(struct EBPFRSSContext *ctx)
 
 rss_bpf_ctx = rss_bpf__open();
 if (rss_bpf_ctx == NULL) {
-trace_ebpf_error("eBPF RSS", "can not open eBPF RSS object");
+trace_ebpf_rss_open_error(ctx);
 goto error;
 }
 
 bpf_program__set_type(rss_bpf_ctx->progs.tun_rss_steering_prog, 
BPF_PROG_TYPE_SOCKET_FILTER);
 
 if (rss_bpf__load(rss_bpf_ctx)) {
-trace_ebpf_error("eBPF RSS", "can not load RSS program");
+trace_ebpf_rss_load_error(ctx);
 goto error;
 }
 
diff --git a/ebpf/trace-events b/ebpf/trace-events
index b3ad1a35f2..a0f157be37 100644
--- a/ebpf/trace-events
+++ b/ebpf/trace-events
@@ -1,4 +1,6 @@
 # See docs/devel/tracing.rst for syntax documentation.
 
 # ebpf-rss.c
-ebpf_error(const char *s1, const char *s2) "error in %s: %s"
+ebpf_rss_load_error(void *ctx) "ctx=%p"
+ebpf_rss_mmap_error(void *ctx, const char *object) "ctx=%p object=%s"
+ebpf_rss_open_error(void *ctx) "ctx=%p"
-- 
2.45.2




[PATCH v2 1/7] hw/net: fix typo s/epbf/ebpf/ in virtio-net

2024-09-05 Thread Daniel P . Berrangé
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Daniel P. Berrangé 
---
 hw/net/virtio-net.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index ed33a32877..055fce1d78 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1254,7 +1254,7 @@ static void rss_data_to_rss_config(struct 
VirtioNetRssData *data,
 config->default_queue = data->default_queue;
 }
 
-static bool virtio_net_attach_epbf_rss(VirtIONet *n)
+static bool virtio_net_attach_ebpf_rss(VirtIONet *n)
 {
 struct EBPFRSSConfig config = {};
 
@@ -1276,7 +1276,7 @@ static bool virtio_net_attach_epbf_rss(VirtIONet *n)
 return true;
 }
 
-static void virtio_net_detach_epbf_rss(VirtIONet *n)
+static void virtio_net_detach_ebpf_rss(VirtIONet *n)
 {
 virtio_net_attach_ebpf_to_backend(n->nic, -1);
 }
@@ -1286,8 +1286,8 @@ static void virtio_net_commit_rss_config(VirtIONet *n)
 if (n->rss_data.enabled) {
 n->rss_data.enabled_software_rss = n->rss_data.populate_hash;
 if (n->rss_data.populate_hash) {
-virtio_net_detach_epbf_rss(n);
-} else if (!virtio_net_attach_epbf_rss(n)) {
+virtio_net_detach_ebpf_rss(n);
+} else if (!virtio_net_attach_ebpf_rss(n)) {
 if (get_vhost_net(qemu_get_queue(n->nic)->peer)) {
 warn_report("Can't load eBPF RSS for vhost");
 } else {
@@ -1300,7 +1300,7 @@ static void virtio_net_commit_rss_config(VirtIONet *n)
 n->rss_data.indirections_len,
 sizeof(n->rss_data.key));
 } else {
-virtio_net_detach_epbf_rss(n);
+virtio_net_detach_ebpf_rss(n);
 trace_virtio_net_rss_disable();
 }
 }
-- 
2.45.2




[PATCH v2 6/7] ebpf: improve trace event coverage to all key operations

2024-09-05 Thread Daniel P . Berrangé
The existing error trace event is renamed to have a name prefix
matching its source file & to remove the redundant first arg that
adds no useful information.

Signed-off-by: Daniel P. Berrangé 
---
 ebpf/ebpf_rss.c   | 19 +++
 ebpf/trace-events |  4 
 2 files changed, 23 insertions(+)

diff --git a/ebpf/ebpf_rss.c b/ebpf/ebpf_rss.c
index f65a58b0b6..2afff27e78 100644
--- a/ebpf/ebpf_rss.c
+++ b/ebpf/ebpf_rss.c
@@ -74,6 +74,10 @@ static bool ebpf_rss_mmap(struct EBPFRSSContext *ctx, Error 
**errp)
 goto indirection_fail;
 }
 
+trace_ebpf_rss_mmap(ctx,
+ctx->mmap_configuration,
+ctx->mmap_toeplitz_key,
+ctx->mmap_indirections_table);
 return true;
 
 indirection_fail:
@@ -131,6 +135,11 @@ bool ebpf_rss_load(struct EBPFRSSContext *ctx, Error 
**errp)
 ctx->map_toeplitz_key = bpf_map__fd(
 rss_bpf_ctx->maps.tap_rss_map_toeplitz_key);
 
+trace_ebpf_rss_load(ctx,
+ctx->program_fd,
+ctx->map_configuration,
+ctx->map_indirections_table,
+ctx->map_toeplitz_key);
 if (!ebpf_rss_mmap(ctx, errp)) {
 goto error;
 }
@@ -178,6 +187,12 @@ bool ebpf_rss_load_fds(struct EBPFRSSContext *ctx, int 
program_fd,
 ctx->map_toeplitz_key = toeplitz_fd;
 ctx->map_indirections_table = table_fd;
 
+trace_ebpf_rss_load(ctx,
+ctx->program_fd,
+ctx->map_configuration,
+ctx->map_indirections_table,
+ctx->map_toeplitz_key);
+
 if (!ebpf_rss_mmap(ctx, errp)) {
 ctx->program_fd = -1;
 ctx->map_configuration = -1;
@@ -259,6 +274,8 @@ bool ebpf_rss_set_all(struct EBPFRSSContext *ctx, struct 
EBPFRSSConfig *config,
 
 ebpf_rss_set_toepliz_key(ctx, toeplitz_key);
 
+trace_ebpf_rss_set_data(ctx, config, indirections_table, toeplitz_key);
+
 return true;
 }
 
@@ -268,6 +285,8 @@ void ebpf_rss_unload(struct EBPFRSSContext *ctx)
 return;
 }
 
+trace_ebpf_rss_unload(ctx);
+
 ebpf_rss_munmap(ctx);
 
 if (ctx->obj) {
diff --git a/ebpf/trace-events b/ebpf/trace-events
index a0f157be37..bf3d9b6451 100644
--- a/ebpf/trace-events
+++ b/ebpf/trace-events
@@ -1,6 +1,10 @@
 # See docs/devel/tracing.rst for syntax documentation.
 
 # ebpf-rss.c
+ebpf_rss_load(void *ctx, int progfd, int cfgfd, int toepfd, int indirfd) 
"ctx=%p program-fd=%d config-fd=%d toeplitz-fd=%d indirection-fd=%d"
 ebpf_rss_load_error(void *ctx) "ctx=%p"
+ebpf_rss_mmap(void *ctx, void *cfgptr, void *toepptr, void *indirptr) "ctx=%p 
config-ptr=%p toeplitz-ptr=%p indirection-ptr=%p"
 ebpf_rss_mmap_error(void *ctx, const char *object) "ctx=%p object=%s"
 ebpf_rss_open_error(void *ctx) "ctx=%p"
+ebpf_rss_set_data(void *ctx, void *cfgptr, void *toepptr, void *indirptr) 
"ctx=%p config-ptr=%p toeplitz-ptr=%p indirection-ptr=%p"
+ebpf_rss_unload(void *ctx) "rss unload ctx=%p"
-- 
2.45.2




[PATCH v2 5/7] hw/net: report errors from failing to use eBPF RSS FDs

2024-09-05 Thread Daniel P . Berrangé
If the user/mgmt app passed in a set of pre-opened FDs for eBPF RSS,
then it is expecting QEMU to use them. Any failure to do so must be
considered a fatal error and propagated back up the stack, otherwise
deployment mistakes will not be detectable in a prompt manner. When
not using pre-opened FDs, then eBPF RSS is tried on a "best effort"
basis only and thus fallback to software RSS is valid.

Signed-off-by: Daniel P. Berrangé 
---
 hw/net/virtio-net.c | 41 +
 1 file changed, 29 insertions(+), 12 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 558fc62844..f2690390c1 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1316,28 +1316,27 @@ static void virtio_net_disable_rss(VirtIONet *n)
 virtio_net_commit_rss_config(n);
 }
 
-static bool virtio_net_load_ebpf_fds(VirtIONet *n)
+static bool virtio_net_load_ebpf_fds(VirtIONet *n, Error **errp)
 {
 int fds[EBPF_RSS_MAX_FDS] = { [0 ... EBPF_RSS_MAX_FDS - 1] = -1};
 int ret = true;
 int i = 0;
 
 if (n->nr_ebpf_rss_fds != EBPF_RSS_MAX_FDS) {
-warn_report("Expected %d file descriptors but got %d",
-EBPF_RSS_MAX_FDS, n->nr_ebpf_rss_fds);
-   return false;
-   }
+error_setg(errp, "Expected %d file descriptors but got %d",
+   EBPF_RSS_MAX_FDS, n->nr_ebpf_rss_fds);
+return false;
+}
 
 for (i = 0; i < n->nr_ebpf_rss_fds; i++) {
-fds[i] = monitor_fd_param(monitor_cur(), n->ebpf_rss_fds[i],
-  &error_warn);
+fds[i] = monitor_fd_param(monitor_cur(), n->ebpf_rss_fds[i], errp);
 if (fds[i] < 0) {
 ret = false;
 goto exit;
 }
 }
 
-ret = ebpf_rss_load_fds(&n->ebpf_rss, fds[0], fds[1], fds[2], fds[3], 
NULL);
+ret = ebpf_rss_load_fds(&n->ebpf_rss, fds[0], fds[1], fds[2], fds[3], 
errp);
 
 exit:
 if (!ret) {
@@ -1349,13 +1348,15 @@ exit:
 return ret;
 }
 
-static bool virtio_net_load_ebpf(VirtIONet *n)
+static bool virtio_net_load_ebpf(VirtIONet *n, Error **errp)
 {
 bool ret = false;
 
 if (virtio_net_attach_ebpf_to_backend(n->nic, -1)) {
-if (!(n->ebpf_rss_fds && virtio_net_load_ebpf_fds(n))) {
-ret = ebpf_rss_load(&n->ebpf_rss, NULL);
+if (n->ebpf_rss_fds) {
+ret = virtio_net_load_ebpf_fds(n, errp);
+} else {
+ret = ebpf_rss_load(&n->ebpf_rss, errp);
 }
 }
 
@@ -3761,7 +3762,23 @@ static void virtio_net_device_realize(DeviceState *dev, 
Error **errp)
 net_rx_pkt_init(&n->rx_pkt);
 
 if (virtio_has_feature(n->host_features, VIRTIO_NET_F_RSS)) {
-virtio_net_load_ebpf(n);
+Error *err = NULL;
+if (!virtio_net_load_ebpf(n, &err)) {
+/*
+ * If user explicitly gave QEMU RSS FDs to use, then
+ * failing to use them must be considered a fatal
+ * error. If no RSS FDs were provided, QEMU is trying
+ * eBPF on a "best effort" basis only, so report a
+ * warning and allow fallback to software RSS.
+ */
+if (n->ebpf_rss_fds) {
+error_propagate(errp, err);
+} else {
+warn_report("unable to load eBPF RSS: %s",
+error_get_pretty(err));
+error_free(err);
+}
+}
 }
 }
 
-- 
2.45.2




[PATCH v2 2/7] ebpf: drop redundant parameter checks in static methods

2024-09-05 Thread Daniel P . Berrangé
Various static methods have checks on their parameters which were
already validated immediately before the method was invoked. Drop
these redundant checks to simplify the following commit, which adds
formal error reporting.

Signed-off-by: Daniel P. Berrangé 
---
 ebpf/ebpf_rss.c | 32 +---
 1 file changed, 5 insertions(+), 27 deletions(-)

diff --git a/ebpf/ebpf_rss.c b/ebpf/ebpf_rss.c
index 87f0714910..aa7170d997 100644
--- a/ebpf/ebpf_rss.c
+++ b/ebpf/ebpf_rss.c
@@ -49,10 +49,6 @@ bool ebpf_rss_is_loaded(struct EBPFRSSContext *ctx)
 
 static bool ebpf_rss_mmap(struct EBPFRSSContext *ctx)
 {
-if (!ebpf_rss_is_loaded(ctx)) {
-return false;
-}
-
 ctx->mmap_configuration = mmap(NULL, qemu_real_host_page_size(),
PROT_READ | PROT_WRITE, MAP_SHARED,
ctx->map_configuration, 0);
@@ -90,10 +86,6 @@ toeplitz_fail:
 
 static void ebpf_rss_munmap(struct EBPFRSSContext *ctx)
 {
-if (!ebpf_rss_is_loaded(ctx)) {
-return;
-}
-
 munmap(ctx->mmap_indirections_table, qemu_real_host_page_size());
 munmap(ctx->mmap_toeplitz_key, qemu_real_host_page_size());
 munmap(ctx->mmap_configuration, qemu_real_host_page_size());
@@ -177,15 +169,10 @@ bool ebpf_rss_load_fds(struct EBPFRSSContext *ctx, int 
program_fd,
 return true;
 }
 
-static bool ebpf_rss_set_config(struct EBPFRSSContext *ctx,
+static void ebpf_rss_set_config(struct EBPFRSSContext *ctx,
 struct EBPFRSSConfig *config)
 {
-if (!ebpf_rss_is_loaded(ctx)) {
-return false;
-}
-
 memcpy(ctx->mmap_configuration, config, sizeof(*config));
-return true;
 }
 
 static bool ebpf_rss_set_indirections_table(struct EBPFRSSContext *ctx,
@@ -194,8 +181,7 @@ static bool ebpf_rss_set_indirections_table(struct 
EBPFRSSContext *ctx,
 {
 char *cursor = ctx->mmap_indirections_table;
 
-if (!ebpf_rss_is_loaded(ctx) || indirections_table == NULL ||
-   len > VIRTIO_NET_RSS_MAX_TABLE_LEN) {
+if (len > VIRTIO_NET_RSS_MAX_TABLE_LEN) {
 return false;
 }
 
@@ -207,20 +193,16 @@ static bool ebpf_rss_set_indirections_table(struct 
EBPFRSSContext *ctx,
 return true;
 }
 
-static bool ebpf_rss_set_toepliz_key(struct EBPFRSSContext *ctx,
+static void ebpf_rss_set_toepliz_key(struct EBPFRSSContext *ctx,
  uint8_t *toeplitz_key)
 {
 /* prepare toeplitz key */
 uint8_t toe[VIRTIO_NET_RSS_MAX_KEY_SIZE] = {};
 
-if (!ebpf_rss_is_loaded(ctx) || toeplitz_key == NULL) {
-return false;
-}
 memcpy(toe, toeplitz_key, VIRTIO_NET_RSS_MAX_KEY_SIZE);
 *(uint32_t *)toe = ntohl(*(uint32_t *)toe);
 
 memcpy(ctx->mmap_toeplitz_key, toe, VIRTIO_NET_RSS_MAX_KEY_SIZE);
-return true;
 }
 
 bool ebpf_rss_set_all(struct EBPFRSSContext *ctx, struct EBPFRSSConfig *config,
@@ -231,18 +213,14 @@ bool ebpf_rss_set_all(struct EBPFRSSContext *ctx, struct 
EBPFRSSConfig *config,
 return false;
 }
 
-if (!ebpf_rss_set_config(ctx, config)) {
-return false;
-}
+ebpf_rss_set_config(ctx, config);
 
 if (!ebpf_rss_set_indirections_table(ctx, indirections_table,
   config->indirections_len)) {
 return false;
 }
 
-if (!ebpf_rss_set_toepliz_key(ctx, toeplitz_key)) {
-return false;
-}
+ebpf_rss_set_toepliz_key(ctx, toeplitz_key);
 
 return true;
 }
-- 
2.45.2




[PATCH v2 0/7] Report fatal errors from failure with pre-opened eBPF RSS FDs

2024-09-05 Thread Daniel P . Berrangé
The virtio-net code for eBPF RSS is still ignoring errors when
failing to load the eBPF RSS program passed in by the mgmt app
via pre-opened FDs.

This series re-factors the eBPF common code so that it actually
reports using "Error" objects. Then it makes virtio-net treat
a failure to load pre-opened FDs as a fatal problem. When doing
speculative opening of eBPF FDs, QEMU merely prints a warning,
and allows the software fallback to continue.

Trace event coverage is significantly expanded to make this all
much more debuggable too.

Changed in v2:

 - Split 'ebpf_error' probe into multiple probes

Daniel P. Berrangé (7):
  hw/net: fix typo s/epbf/ebpf/ in virtio-net
  ebpf: drop redundant parameter checks in static methods
  ebpf: improve error trace events
  ebpf: add formal error reporting to all APIs
  hw/net: report errors from failing to use eBPF RSS FDs
  ebpf: improve trace event coverage to all key operations
  hw/net: improve tracing of eBPF RSS setup

 ebpf/ebpf_rss.c | 118 
 ebpf/ebpf_rss.h |  10 ++--
 ebpf/trace-events   |   8 ++-
 hw/net/trace-events |   8 +--
 hw/net/virtio-net.c |  63 +++
 5 files changed, 137 insertions(+), 70 deletions(-)

-- 
2.45.2




[PATCH v2 4/7] ebpf: add formal error reporting to all APIs

2024-09-05 Thread Daniel P . Berrangé
The eBPF code is currently reporting error messages through trace
events. Trace events are fine for debugging, but they are not to be
considered the primary error reporting mechanism, as their output
is inaccessible to callers.

This adds an "Error **errp" parameter to all methods which have
important error scenarios to report to the caller.

Signed-off-by: Daniel P. Berrangé 
---
 ebpf/ebpf_rss.c | 59 -
 ebpf/ebpf_rss.h | 10 +---
 hw/net/virtio-net.c |  7 +++---
 3 files changed, 59 insertions(+), 17 deletions(-)

diff --git a/ebpf/ebpf_rss.c b/ebpf/ebpf_rss.c
index 47186807ad..f65a58b0b6 100644
--- a/ebpf/ebpf_rss.c
+++ b/ebpf/ebpf_rss.c
@@ -47,13 +47,14 @@ bool ebpf_rss_is_loaded(struct EBPFRSSContext *ctx)
 return ctx != NULL && (ctx->obj != NULL || ctx->program_fd != -1);
 }
 
-static bool ebpf_rss_mmap(struct EBPFRSSContext *ctx)
+static bool ebpf_rss_mmap(struct EBPFRSSContext *ctx, Error **errp)
 {
 ctx->mmap_configuration = mmap(NULL, qemu_real_host_page_size(),
PROT_READ | PROT_WRITE, MAP_SHARED,
ctx->map_configuration, 0);
 if (ctx->mmap_configuration == MAP_FAILED) {
 trace_ebpf_rss_mmap_error(ctx, "configuration");
+error_setg(errp, "Unable to map eBPF configuration array");
 return false;
 }
 ctx->mmap_toeplitz_key = mmap(NULL, qemu_real_host_page_size(),
@@ -61,6 +62,7 @@ static bool ebpf_rss_mmap(struct EBPFRSSContext *ctx)
ctx->map_toeplitz_key, 0);
 if (ctx->mmap_toeplitz_key == MAP_FAILED) {
 trace_ebpf_rss_mmap_error(ctx, "toeplitz key");
+error_setg(errp, "Unable to map eBPF toeplitz array");
 goto toeplitz_fail;
 }
 ctx->mmap_indirections_table = mmap(NULL, qemu_real_host_page_size(),
@@ -68,6 +70,7 @@ static bool ebpf_rss_mmap(struct EBPFRSSContext *ctx)
ctx->map_indirections_table, 0);
 if (ctx->mmap_indirections_table == MAP_FAILED) {
 trace_ebpf_rss_mmap_error(ctx, "indirections table");
+error_setg(errp, "Unable to map eBPF indirection array");
 goto indirection_fail;
 }
 
@@ -95,7 +98,7 @@ static void ebpf_rss_munmap(struct EBPFRSSContext *ctx)
 ctx->mmap_indirections_table = NULL;
 }
 
-bool ebpf_rss_load(struct EBPFRSSContext *ctx)
+bool ebpf_rss_load(struct EBPFRSSContext *ctx, Error **errp)
 {
 struct rss_bpf *rss_bpf_ctx;
 
@@ -106,6 +109,7 @@ bool ebpf_rss_load(struct EBPFRSSContext *ctx)
 rss_bpf_ctx = rss_bpf__open();
 if (rss_bpf_ctx == NULL) {
 trace_ebpf_rss_open_error(ctx);
+error_setg(errp, "Unable to open eBPF RSS object");
 goto error;
 }
 
@@ -113,6 +117,7 @@ bool ebpf_rss_load(struct EBPFRSSContext *ctx)
 
 if (rss_bpf__load(rss_bpf_ctx)) {
 trace_ebpf_rss_load_error(ctx);
+error_setg(errp, "Unable to load eBPF program");
 goto error;
 }
 
@@ -126,7 +131,7 @@ bool ebpf_rss_load(struct EBPFRSSContext *ctx)
 ctx->map_toeplitz_key = bpf_map__fd(
 rss_bpf_ctx->maps.tap_rss_map_toeplitz_key);
 
-if (!ebpf_rss_mmap(ctx)) {
+if (!ebpf_rss_mmap(ctx, errp)) {
 goto error;
 }
 
@@ -143,13 +148,28 @@ error:
 }
 
 bool ebpf_rss_load_fds(struct EBPFRSSContext *ctx, int program_fd,
-   int config_fd, int toeplitz_fd, int table_fd)
+   int config_fd, int toeplitz_fd, int table_fd,
+   Error **errp)
 {
 if (ebpf_rss_is_loaded(ctx)) {
+error_setg(errp, "eBPF program is already loaded");
 return false;
 }
 
-if (program_fd < 0 || config_fd < 0 || toeplitz_fd < 0 || table_fd < 0) {
+if (program_fd < 0) {
+error_setg(errp, "eBPF program FD is not open");
+return false;
+}
+if (config_fd < 0) {
+error_setg(errp, "eBPF config FD is not open");
+return false;
+}
+if (toeplitz_fd < 0) {
+error_setg(errp, "eBPF toeplitz FD is not open");
+return false;
+}
+if (table_fd < 0) {
+error_setg(errp, "eBPF indirection FD is not open");
 return false;
 }
 
@@ -158,7 +178,7 @@ bool ebpf_rss_load_fds(struct EBPFRSSContext *ctx, int 
program_fd,
 ctx->map_toeplitz_key = toeplitz_fd;
 ctx->map_indirections_table = table_fd;
 
-if (!ebpf_rss_mmap(ctx)) {
+if (!ebpf_rss_mmap(ctx, errp)) {
 ctx->program_fd = -1;
 ctx->map_configuration = -1;
 ctx->map_toeplitz_key = -1;
@@ -177,11 +197,14 @@ static void ebpf_rss_set_config(struct EBPFRSSContext 
*ctx,
 
 static bool ebpf_rss_set_indirections_table(struct EBPFRSSContext *ctx,
  

Re: [PATCH v2] crypto: run qcrypto_pbkdf2_count_iters in a new thread

2024-09-05 Thread Daniel P . Berrangé
On Wed, Sep 04, 2024 at 08:52:30PM -0300, Tiago Pasqualini wrote:
> CPU time accounting in the kernel has been demonstrated to have a
> sawtooth pattern[1][2]. This can cause the getrusage system call to
> not be as accurate as we are expecting, which can cause this calculation
> to stall.
> 
> The kernel discussions shows that this inaccuracy happens when CPU time
> gets big enough, so this patch changes qcrypto_pbkdf2_count_iters to run
> in a fresh thread to avoid this inaccuracy. It also adds a sanity check
> to fail the process if CPU time is not accounted.
> 
> [1] 
> https://lore.kernel.org/lkml/159231011694.16989.16351419333851309713.tip-bot2@tip-bot2/
> [2] 
> https://lore.kernel.org/lkml/20221226031010.4079885-1-maxing@bytedance.com/t/#m1c7f2fdc0ea742776a70fd1aa2a2e414c437f534
> 
> Resolves: #2398
> Signed-off-by: Tiago Pasqualini 
> ---
>  crypto/pbkdf.c | 53 +++---
>  1 file changed, 46 insertions(+), 7 deletions(-)

Reviewed-by: Daniel P. Berrangé 

and queued.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH 2/2] tests/qtest: Bump timeout on ahci-test

2024-09-05 Thread Daniel P . Berrangé
On Thu, Sep 05, 2024 at 05:55:54PM +0100, Peter Maydell wrote:
> On my OpenBSD VM test system, the ahci-test sometimes hits its 60 second
> timeout. It has 75 subtests and allowing at least two seconds per
> subtest seems reasonable. Bump it to 150s.
> 
> Signed-off-by: Peter Maydell 
> ---
>  tests/qtest/meson.build | 1 +
>  1 file changed, 1 insertion(+)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH 1/2] tests/qtest: Add missing qtest_quit() to stm32 tests

2024-09-05 Thread Daniel P . Berrangé
On Thu, Sep 05, 2024 at 05:55:53PM +0100, Peter Maydell wrote:
> In the dm163-test and stm32l4x5_usart-test, a couple of subtests are
> missing the qtest_quit() call.  The effect of this is that on hosts
> other than Linux and FreeBSD the test will timeout after executing
> all the tests:
> 
> 242/845 qemu:qtest+qtest-arm / qtest-arm/dm163-test   
> TIMEOUT 60.04s   3 subtests passed
> 100/845 qemu:qtest+qtest-arm / qtest-arm/stm32l4x5_usart-test 
> TIMEOUT600.02s   5 subtests passed
> 
> This happens because the qemu-system-arm binary which the test
> starts does not exit, and because it shares the stdout with the
> test binary, the overall meson test harness thinks the test is
> still running. On Linux and FreeBSD we have an extra safety net
> set up in qtest_spawn_qemu() which kills off any QEMU binary that
> ends up without a parent. This is intended for the case where
> QEMU crashed and didn't respond to a SIGTERM or polite request
> to quit, but it also sidestepped the problem in this case.
> However, OpenBSD doesn't have a PDEATHSIG equivalent, so we
> see the timeouts when running a 'make vm-build-openbsd' run.
> 
> Add the missing qtest_quit() calls.
> 
> Signed-off-by: Peter Maydell 
> ---
> Many thanks to Dan Berrangé for diagnosing the cause of this hang...
> Dan also suggested on IRC that we should support g_autoptr for
> the QTestState, so you don't need to manually qtest_quit(). But
> for the immediate fix, I just add the missing calls.
> ---
>  tests/qtest/dm163-test.c       | 2 ++
>  tests/qtest/stm32l4x5_usart-test.c | 2 ++
>  2 files changed, 4 insertions(+)

Reviewed-by: Daniel P. Berrangé 
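
As an aside on the g_autoptr idea mentioned above, a rough sketch of
what that could look like (hypothetical, not something libqtest
provides today):

  /* hypothetical addition to libqtest.h */
  G_DEFINE_AUTOPTR_CLEANUP_FUNC(QTestState, qtest_quit)

  /* then in a test */
  static void test_something(void)
  {
      g_autoptr(QTestState) qts = qtest_init("-machine virt");

      /* ... exercise the device ... */

      /* qtest_quit(qts) runs automatically when qts goes out of scope */
  }

That would make it much harder to forget the cleanup in future tests.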


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v2 01/19] qapi: Smarter camel_to_upper() to reduce need for 'prefix'

2024-09-05 Thread Daniel P . Berrangé
On Thu, Sep 05, 2024 at 07:59:13AM +0200, Markus Armbruster wrote:
> Daniel P. Berrangé  writes:
> 
> > On Wed, Sep 04, 2024 at 01:18:18PM +0200, Markus Armbruster wrote:
> >> camel_to_upper() converts its argument from camel case to upper case
> >> with '_' between words.  Used for generated enumeration constant
> >> prefixes.
> >
> >
> >> 
> >> Signed-off-by: Markus Armbruster 
> >> Reviewed-by: Daniel P. Berrang?? 
> >
> > The accent in my name is getting mangled in this series.
> 
> Uh-oh!
> 
> Checking...  Hmm.  It's correct in git, correct in output of
> git-format-patch, correct in the copy I got from git-send-email --bcc
> armbru via localhost MTA, and the copy I got from --to
> qemu-devel@nongnu.org, correct in lore.kernel.org[*], correct in an mbox
> downloaded from patchew.
> 
> Could the culprit be on your side?

I compared my received mail vs the mbox archive on nongnu.org for
qemu-devel.

In both cases the actual mail body seems to be valid UTF-8 and is
identical. The message in the nongnu.org archive, however, has

  Content-Type: text/plain; charset=UTF-8

while the copy I got in my inbox has merely

  Content-Type: text/plain

What I can't determine is whether your original sent message
had "charset=UTF-8" which then got stripped by redhat's incoming
mail server, or whether your original lacked 'charset=UTF8' and
it got added by mailman when saving the message to the mbox archives?


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH V2 00/11] Live update: cpr-exec

2024-09-05 Thread Daniel P . Berrangé
On Wed, Sep 04, 2024 at 06:23:50PM -0400, Peter Xu wrote:
> On Wed, Sep 04, 2024 at 04:58:14PM -0400, Steven Sistare wrote:
> > On 8/21/2024 2:34 PM, Peter Xu wrote:
> > > On Fri, Aug 16, 2024 at 01:09:23PM -0400, Steven Sistare wrote:
> > > > On 8/16/2024 12:17 PM, Peter Xu wrote:
> > > What I read so far from Dan is that cpr-transfer seems to be also 
> > > preferred
> > > from Libvirt POV:
> > > 
> > >https://lore.kernel.org/r/zr9-ivorkgjre...@redhat.com
> > > 
> > > Did I read it right?
> > 
> > I read that as: cpr-transfer is a viable option for libvirt.  I don't hear 
> > him
> > excluding the possibility of cpr-exec.
> 
> I preferred not having two solution because if they work the same problem
> out, then it potentially means one of them might be leftover at some point,
> unless they suite different needs.  But I don't feel strongly, especially
> if cpr-exec is light if cpr-transfer is there.
> 
> > 
> > I agree that "Dan the libvirt expert prefers cpr-transfer" is a good reason 
> > to
> > provide cpr-transfer.  Which I will do.
> > 
> > So does "Steve the OCI expert prefers cpr-exec" carry equal weight, for also
> > providing cpr-exec?
> 
> As an open source project, Libvirt using it means the feature can be
> actively used and tested.  When e.g. there's a new feature replacing CPR we
> know when we can obsolete the old CPR, no matter -exec or -transfer.
> 
> Close sourced projects can also be great itself but naturally are less
> important in open source communities IMHO due to not accessible to anyone
> in the community.  E.g., we never know when an close sourced project
> abandoned a feature, then QEMU can carry over that feature forever without
> knowing who's using it.

In terms of closed source projects, effectively they don't exist from a
QEMU maintainer's POV. Our deprecation & removal policy is designed so
that we don't need to think about who is using stuff.

When QEMU deprecates something, any users (whether open source or closed
source) have 2 releases in which to notice this, and make a request that
we cancel the deprecation, or change their code.

Libvirt is special in the sense that we'll CC the libvirt mailing list on
changes to the deprecated.rst file, and we'll often not propose
deprecations in the first place if we know libvirt is using it, since
we can ask libvirt quite easily & libvirt people pay attention to QEMU.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH V2 00/11] Live update: cpr-exec

2024-09-05 Thread Daniel P . Berrangé
On Wed, Sep 04, 2024 at 04:58:14PM -0400, Steven Sistare wrote:
> On 8/21/2024 2:34 PM, Peter Xu wrote:
> > On Fri, Aug 16, 2024 at 01:09:23PM -0400, Steven Sistare wrote:
> > > 
> > > libvirt starts qemu with the -sandbox spawn=deny option which blocks 
> > > fork, exec,
> > > and change namespace operations.  I have a patch in my workspace to be 
> > > submitted
> > > later called "seccomp: fine-grained control of fork, exec, and namespace" 
> > > that allows
> > > libvirt to block fork and namespace but allow exec.
> > 
> > The question is whether that would be accepted, and it also gives me the
> > feeling that even if it's accepted, it might limit the use cases that cpr
> > can apply to.
> 
> This is more acceptable for libvirt running in a container (such as under 
> kubevirt)
> with a limited set of binaries in /bin that could be exec'd.  In that case 
> allowing
> exec is more reasonable.

Running inside a container does protect the host to a significant
degree. I'd say it is still important, however, to protect the
control plane (libvirt's daemons & kubevirt's agent) from the QEMU
process being managed, and in that case it still looks pretty
compelling to deny exec.

> > What I read so far from Dan is that cpr-transfer seems to be also preferred
> > from Libvirt POV:
> > 
> >https://lore.kernel.org/r/zr9-ivorkgjre...@redhat.com
> > 
> > Did I read it right?
> 
> I read that as: cpr-transfer is a viable option for libvirt.  I don't hear him
> excluding the possibility of cpr-exec.
> 
> I agree that "Dan the libvirt expert prefers cpr-transfer" is a good reason to
> provide cpr-transfer.  Which I will do.

Both approaches have significant challenges for integration, but my general
preference is towards a solution that doesn't require undermining our security
protections.

When starting a VM we have no knowledge of whether a user may want to use
CPR at a later date. We're not going to disable the seccomp sandbox by
default, so that means cpr-exec would not be viable in a default VM
deployment.

Admins could choose to modify /etc/libvirt/qemu.conf to turn off seccomp,
but I'm very much not in favour of introducing a feature that requires
them todo this. It would be a first in libvirt, as everything else we
support is possible to use with seccomp enabled. The seccomp opt-out is
essentially just there as an emergency escape hatch, not as something we
want used in production.

> We are at an impasse on this series.  To make forward progress, I am willing 
> to
> reorder the patches, and re-submit cpr-transfer as the first mode, so we can
> review and pull that.  I will submit cpr-exec as a follow on and we can resume
> our arguments then.

Considering the end result, are there CPR usage scenarios that are possible
with cpr-exec, that can't be achieved with cpr-transfer ?

Supporting two ways of doing the same thing increases the maintenance burden
for QEMU maintainers, as well as for downstream testing engineers who have to
validate this functionality. So unless there's a compelling need to support
both cpr-transfer and cpr-exec, it'd be nice to standardize on just one of
them.

cpr-transfer does look like it's probably more viable, even with its own
challenges wrt resources being opened twice.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH V2 00/11] Live update: cpr-exec

2024-09-05 Thread Daniel P . Berrangé
On Fri, Aug 16, 2024 at 01:09:23PM -0400, Steven Sistare wrote:
> On 8/16/2024 12:17 PM, Peter Xu wrote:
> > On Fri, Aug 16, 2024 at 05:00:32PM +0100, Daniel P. Berrangé wrote:
> > > On Fri, Aug 16, 2024 at 11:34:10AM -0400, Peter Xu wrote:
> > > > On Fri, Aug 16, 2024 at 04:16:50PM +0100, Daniel P. Berrangé wrote:
> > > > > On Fri, Aug 16, 2024 at 11:06:10AM -0400, Peter Xu wrote:
> > > > > > On Thu, Aug 15, 2024 at 04:55:20PM -0400, Steven Sistare wrote:
> > > > > > > On 8/13/2024 3:46 PM, Peter Xu wrote:
> > > > > > > > On Tue, Aug 06, 2024 at 04:56:18PM -0400, Steven Sistare wrote:
> > > > > > > > > > The flipside, however, is that localhost migration via 2 
> > > > > > > > > > separate QEMU
> > > > > > > > > > processes has issues where both QEMUs want to be opening 
> > > > > > > > > > the very same
> > > > > > > > > > file, and only 1 of them can ever have them open.
> > > > > > > > 
> > > > > > > > I thought we used to have similar issue with block devices, but 
> > > > > > > > I assume
> > > > > > > > it's solved for years (and whoever owns it will take proper 
> > > > > > > > file lock,
> > > > > > > > IIRC, and QEMU migration should properly serialize the time 
> > > > > > > > window on who's
> > > > > > > > going to take the file lock).
> > > > > > > > 
> > > > > > > > Maybe this is about something else?
> > > > > > > 
> > > > > > > I don't have an example where this fails.
> > > > > > > 
> > > > > > > I can cause "Failed to get "write" lock" errors if two qemu 
> > > > > > > instances open
> > > > > > > the same block device, but the error is suppressed if you add the 
> > > > > > > -incoming
> > > > > > > argument, due to this code:
> > > > > > > 
> > > > > > >blk_attach_dev()
> > > > > > >  if (runstate_check(RUN_STATE_INMIGRATE))
> > > > > > >blk->disable_perm = true;
> > > > > > 
> > > > > > Yep, this one is pretty much expected.
> > > > > > 
> > > > > > > 
> > > > > > > > > Indeed, and "files" includes unix domain sockets.
> > > > > > > 
> > > > > > > More on this -- the second qemu to bind a unix domain socket for 
> > > > > > > listening
> > > > > > > wins, and the first qemu loses it (because second qemu unlinks 
> > > > > > > and recreates
> > > > > > > the socket path before binding on the assumption that it is 
> > > > > > > stale).
> > > > > > > 
> > > > > > > One must use a different name for the socket for second qemu, and 
> > > > > > > clients
> > > > > > > that wish to connect must be aware of the new port.
> > > > > > > 
> > > > > > > > > Network ports also conflict.
> > > > > > > > > cpr-exec avoids such problems, and is one of the advantages 
> > > > > > > > > of the method that
> > > > > > > > > I forgot to promote.
> > > > > > > > 
> > > > > > > > I was thinking that's fine, as the host ports should be the 
> > > > > > > > backend of the
> > > > > > > > VM ports only anyway so they don't need to be identical on both 
> > > > > > > > sides?
> > > > > > > > 
> > > > > > > > IOW, my understanding is it's the guest IP/ports/... which 
> > > > > > > > should still be
> > > > > > > > stable across migrations, where the host ports can be different 
> > > > > > > > as long as
> > > > > > > > the host ports can forward guest port messages correctly?
> > > > > > > 
> > > > > > > Yes, one must use a different host port number for the second 
> > > > > > > qemu, and clients
> > > > &

Re: [PATCH v2 01/19] qapi: Smarter camel_to_upper() to reduce need for 'prefix'

2024-09-04 Thread Daniel P . Berrangé
On Wed, Sep 04, 2024 at 01:18:18PM +0200, Markus Armbruster wrote:
> camel_to_upper() converts its argument from camel case to upper case
> with '_' between words.  Used for generated enumeration constant
> prefixes.


> 
> Signed-off-by: Markus Armbruster 
> Reviewed-by: Daniel P. Berrang?? 

The accent in my name is getting mangled in this series.

IIRC your mail client (git send-email?) needs to be explicitly
setting a charset, e.g.

  Content-type: text/plain; charset=utf8

so that mail clients & intermediate servers know how to interpret
the 8bit data.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH 2/2] tests/unit: always build the pbkdf crypto unit test

2024-09-04 Thread Daniel P . Berrangé
On Mon, Sep 02, 2024 at 09:45:04PM +0200, Philippe Mathieu-Daudé wrote:
> On 30/8/24 13:05, Daniel P. Berrangé wrote:
> > The meson rules were excluding the pbkdf crypto test when gnutls was the
> > crypto backend. It was then excluded again in #if statements in the test
> > file.
> > 
> > Rather than update these conditions, remove them all, and use the result
> > of the qcrypto_pbkdf_supports() function to determine whether to skip
> > test registration.
> > 
> > Also add CONFIG_DARWIN to the remaining condition, since we have a way
> > to measure CPU time on this platform since commit bf98afc75efedf1.
> > 
> > Signed-off-by: Daniel P. Berrangé 
> > ---
> >   tests/unit/meson.build | 4 +---
> >   tests/unit/test-crypto-pbkdf.c | 9 ++---
> >   2 files changed, 7 insertions(+), 6 deletions(-)
> > 
> > diff --git a/tests/unit/meson.build b/tests/unit/meson.build
> > index 490ab8182d..972d792883 100644
> > --- a/tests/unit/meson.build
> > +++ b/tests/unit/meson.build
> > @@ -121,9 +121,7 @@ if have_block
> > if config_host_data.get('CONFIG_REPLICATION')
> >   tests += {'test-replication': [testblock]}
> > endif
> > -  if nettle.found() or gcrypt.found()
> > -tests += {'test-crypto-pbkdf': [io]}
> > -  endif
> > +  tests += {'test-crypto-pbkdf': [io]}
> >   endif
> >   if have_system
> > diff --git a/tests/unit/test-crypto-pbkdf.c b/tests/unit/test-crypto-pbkdf.c
> > index 43c417f6b4..034bb02422 100644
> > --- a/tests/unit/test-crypto-pbkdf.c
> > +++ b/tests/unit/test-crypto-pbkdf.c
> > @@ -25,8 +25,7 @@
> >   #include 
> >   #endif
> > -#if ((defined(CONFIG_NETTLE) || defined(CONFIG_GCRYPT)) && \
> > - (defined(_WIN32) || defined(RUSAGE_THREAD)))
> > +#if defined(_WIN32) || defined(RUSAGE_THREAD) || defined(CONFIG_DARWIN)
> 
> Add CONFIG_DARWIN in a subsequent commit?

Yes, classic trap. If you have "Also ..." in a commit message then you've
just told yourself it should have been a separate commit :-)

> 
> >   #include "crypto/pbkdf.h"
> >   typedef struct QCryptoPbkdfTestData QCryptoPbkdfTestData;
> > @@ -422,13 +421,17 @@ int main(int argc, char **argv)
> >   g_assert(qcrypto_init(NULL) == 0);
> >   for (i = 0; i < G_N_ELEMENTS(test_data); i++) {
> > +if (!qcrypto_pbkdf2_supports(test_data[i].hash)) {
> > +continue;
> > +}
> > +
> >   if (!test_data[i].slow ||
> >   g_test_slow()) {
> >   g_test_add_data_func(test_data[i].path, &test_data[i], 
> > test_pbkdf);
> >   }
> >   }
> > -if (g_test_slow()) {
> > +if (g_test_slow() && qcrypto_pbkdf2_supports(QCRYPTO_HASH_ALG_SHA256)) 
> > {
> >   g_test_add_func("/crypt0/pbkdf/timing", test_pbkdf_timing);
> 
> While here, rename test_pbkdf_timing -> test_pbkdf_sha256_timing?

Will do

> 
> >   }
> 
> Anyway,
> Reviewed-by: Philippe Mathieu-Daudé 
> 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH] tests/functional/test_vnc: Reduce raciness in find_free_ports()

2024-09-04 Thread Daniel P . Berrangé
On Wed, Sep 04, 2024 at 08:20:12AM +0200, Thomas Huth wrote:
> On 03/09/2024 16.50, Daniel P. Berrangé wrote:
> > On Tue, Sep 03, 2024 at 04:35:53PM +0200, Philippe Mathieu-Daudé wrote:
> > > Pass the port range as argument. In order to reduce races
> > > when looking for free ports, use a per-target per-process
> > > base port (based on the target built-in hash).
> > > 
> > > Signed-off-by: Philippe Mathieu-Daudé 
> > > ---
> > > Based-on: <20240830133841.142644-33-th...@redhat.com>
> > > ---
> > >   tests/functional/test_vnc.py | 12 
> > >   1 file changed, 8 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/tests/functional/test_vnc.py b/tests/functional/test_vnc.py
> > > index b769d3b268..508db0709d 100755
> > > --- a/tests/functional/test_vnc.py
> > > +++ b/tests/functional/test_vnc.py
> > > @@ -10,6 +10,7 @@
> > >   # This work is licensed under the terms of the GNU GPL, version 2 or
> > >   # later.  See the COPYING file in the top-level directory.
> > > +import os
> > >   import socket
> > >   from typing import List
> > > @@ -18,7 +19,6 @@
> > >   VNC_ADDR = '127.0.0.1'
> > >   VNC_PORT_START = 32768
> > > -VNC_PORT_END = VNC_PORT_START + 1024
> > >   def check_bind(port: int) -> bool:
> > > @@ -41,9 +41,10 @@ def check_connect(port: int) -> bool:
> > >   return True
> > > -def find_free_ports(count: int) -> List[int]:
> > > +# warning, racy function
> > > +def find_free_ports(portrange, count: int) -> List[int]:
> > >   result = []
> > > -for port in range(VNC_PORT_START, VNC_PORT_END):
> > > +for port in portrange:
> > >   if check_bind(port):
> > >   result.append(port)
> > >   if len(result) >= count:
> > > @@ -91,7 +92,10 @@ def test_change_password(self):
> > >   password='new_password')
> > >   def test_change_listen(self):
> > > -a, b, c = find_free_ports(3)
> > > +per_arch_port_base = abs((os.getpid() + hash(self.arch)) % (10 
> > > ** 4))
> > > +port_start = VNC_PORT_START + per_arch_port_base
> > > +port_stop = port_start + 100
> > > +a, b, c = find_free_ports(range(port_start, port_stop), 3)
> > >   self.assertFalse(check_connect(a))
> > >   self.assertFalse(check_connect(b))
> > >   self.assertFalse(check_connect(c))
> > 
> > As your comment says, this is still racy, and it's also not too
> > nice to read & understand this logic. How about we just make
> > test_vnc.py be serialized wrt itself?
> 
> We'll likely have more tests that need a free port in the future...
> tests/avocado/migration.py and tests/avocado/reverse_debugging.py use
> find_free_ports(), too, so we should maybe think of a logic that avoids
> clashes between different tests, too.

Create a context manager that holds an fcntl lockfile on disk, while
giving you a free port? Roughly something like this (just a sketch,
assuming the usual find_free_port() helper):

   import fcntl

   class FreePort:
       def __enter__(self):
           self.num = find_free_port()
           # hold an exclusive fcntl lock named after the port
           self._lockfile = open('/tmp/qemu-test-port-%d.lock' % self.num, 'w')
           fcntl.flock(self._lockfile, fcntl.LOCK_EX)
           return self

       def __exit__(self, *exc):
           fcntl.flock(self._lockfile, fcntl.LOCK_UN)
           self._lockfile.close()

Letting tests do

  with FreePort() as port:
      ... do some test that uses port.num


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH 2/2] qemu/osdep: handle sysconf(_SC_OPEN_MAX) return value == -1

2024-09-03 Thread Daniel P . Berrangé
On Tue, Sep 03, 2024 at 05:02:44PM +0200, Philippe Mathieu-Daudé wrote:
> On 3/9/24 15:37, Clément Léger wrote:
> > On 03/09/2024 15:34, Philippe Mathieu-Daudé wrote:
> > > On 3/9/24 09:53, Clément Léger wrote:
> > > > On 02/09/2024 21:38, Philippe Mathieu-Daudé wrote:
> > > > > On 30/8/24 13:57, Clément Léger wrote:
> > > > > > On 30/08/2024 13:31, Michael Tokarev wrote:
> > > > > > > 30.08.2024 14:14, Clément Léger wrote:
> > > > > > > > On some systems (MacOS for instance), sysconf(_SC_OPEN_MAX) can
> > > > > > > > return
> > > > > > > > -1. In that case we should fallback to using the OPEN_MAX 
> > > > > > > > define.
> > > > > > > > According to "man sysconf", the OPEN_MAX define should be 
> > > > > > > > present and
> > > > > > > > provided by either unistd.h and/or limits.h so include them for 
> > > > > > > > that
> > > > > > > > purpose. For other OSes, just assume a maximum of 1024 files
> > > > > > > > descriptors
> > > > > > > > as a fallback.
> > > > > > > > 
> > > > > > > > Fixes: 4ec5ebea078e ("qemu/osdep: Move close_all_open_fds() to 
> > > > > > > > oslib-
> > > > > > > > posix")
> > > > > > > > Reported-by: Daniel P. Berrangé 
> > > > > > > > Signed-off-by: Clément Léger 
> > > > > > > 
> > > > > > > Reviewed-by: Michael Tokarev 
> > > > > > > 
> > > > > > > > @@ -928,6 +933,13 @@ static void
> > > > > > > > qemu_close_all_open_fd_fallback(const
> > > > > > > > int *skip, unsigned int nskip,
> > > > > > > >      void qemu_close_all_open_fd(const int *skip, unsigned int 
> > > > > > > > nskip)
> > > > > > > >      {
> > > > > > > >      int open_max = sysconf(_SC_OPEN_MAX);
> > > > > > > > +    if (open_max == -1) {
> > > > > > > > +#ifdef CONFIG_DARWIN
> > > > > > > > +    open_max = OPEN_MAX;
> > > > > 
> > > > > Missing errno check.
> > > > 
> > > > man sysconf states that:
> > > > 
> > > > "On error, -1 is returned and errno is set to indicate the error (for
> > > > example, EINVAL, indicating that name is invalid)."
> > > > 
> > > > So it seems checking for -1 is enough no ? Or were you thinking about
> > > > something else ?
> > > 
> > > Mine (macOS 14.6) is:
> > > 
> > >   RETURN VALUES
> > >   If the call to sysconf() is not successful, -1 is returned and
> > >   errno is set appropriately.  Otherwise, if the variable is
> > >   associated with functionality that is not supported, -1 is
> > >   returned and errno is not modified.  Otherwise, the current
> > >   variable value is returned.
> > 
> > Which seems to imply the same than mine right ? -1 is always returned in
> > case of error and errno might or not be set. So checking for -1 seems
> > enough to check an error return.
> 
> Yes but we can check for the unsupported case. Something like:
> 
> long qemu_sysconf(int name, long unsupported_default)
> {
> int current_errno = errno;
> long retval;
> 
> retval = sysconf(name);
> if (retval == -1) {
> if (errno == current_errno) {
> return unsupported_default;
> }
> perror("sysconf");
> return -1;
> }
> return retval;
> }

That looks unnecessarily complicated, and IMHO makes it less
portable. e.g. consider macOS behaviour:

 "if the variable is associated with functionality that is
  not supported, -1 is returned and errno is not modified"

vs Linux documented behaviour:

  "If name corresponds to a maximum or minimum limit, and
   that limit is indeterminate, -1 is returned and errno
   is  not  changed."

IMHO we should do what Clément already suggested and set a
default anytime retval == -1, and ignore errno. There is
no user benefit from turning errnos into a fatal error
via perror()
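
i.e. roughly this (sketch only, not the exact patch):

  int open_max = sysconf(_SC_OPEN_MAX);
  if (open_max == -1) {
  #ifdef CONFIG_DARWIN
      open_max = OPEN_MAX;
  #else
      open_max = 1024;   /* arbitrary fallback; errno deliberately ignored */
  #endif
  }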

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH] tests/functional/test_vnc: Reduce raciness in find_free_ports()

2024-09-03 Thread Daniel P . Berrangé
On Tue, Sep 03, 2024 at 04:35:53PM +0200, Philippe Mathieu-Daudé wrote:
> Pass the port range as argument. In order to reduce races
> when looking for free ports, use a per-target per-process
> base port (based on the target built-in hash).
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> Based-on: <20240830133841.142644-33-th...@redhat.com>
> ---
>  tests/functional/test_vnc.py | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/tests/functional/test_vnc.py b/tests/functional/test_vnc.py
> index b769d3b268..508db0709d 100755
> --- a/tests/functional/test_vnc.py
> +++ b/tests/functional/test_vnc.py
> @@ -10,6 +10,7 @@
>  # This work is licensed under the terms of the GNU GPL, version 2 or
>  # later.  See the COPYING file in the top-level directory.
>  
> +import os
>  import socket
>  from typing import List
>  
> @@ -18,7 +19,6 @@
>  
>  VNC_ADDR = '127.0.0.1'
>  VNC_PORT_START = 32768
> -VNC_PORT_END = VNC_PORT_START + 1024
>  
>  
>  def check_bind(port: int) -> bool:
> @@ -41,9 +41,10 @@ def check_connect(port: int) -> bool:
>  return True
>  
>  
> -def find_free_ports(count: int) -> List[int]:
> +# warning, racy function
> +def find_free_ports(portrange, count: int) -> List[int]:
>  result = []
> -for port in range(VNC_PORT_START, VNC_PORT_END):
> +for port in portrange:
>  if check_bind(port):
>  result.append(port)
>  if len(result) >= count:
> @@ -91,7 +92,10 @@ def test_change_password(self):
>  password='new_password')
>  
>  def test_change_listen(self):
> -a, b, c = find_free_ports(3)
> +per_arch_port_base = abs((os.getpid() + hash(self.arch)) % (10 ** 4))
> +port_start = VNC_PORT_START + per_arch_port_base
> +port_stop = port_start + 100
> +a, b, c = find_free_ports(range(port_start, port_stop), 3)
>  self.assertFalse(check_connect(a))
>  self.assertFalse(check_connect(b))
>  self.assertFalse(check_connect(c))

As your comment says, this is still racy, and it's also not too
nice to read & understand this logic. How about we just make
test_vnc.py be serialized wrt itself?


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH 7/8] python/qapi: move scripts/qapi to python/qemu/qapi

2024-09-02 Thread Daniel P . Berrangé
On Fri, Aug 30, 2024 at 02:22:50PM -0400, John Snow wrote:
> Gave Dan a related answer. For you, my explanation is:
> 
> - It's nice to have just one configuration for static analysis in just one
> place
> - It's nice to have that configuration follow python ecosystem norms
> - It's nice to use standard python management tools to configure and test
> the supported versions of static analysis tools, again kept in one place.
> - Just moving the folder costs virtually nothing.
> - Moving it here makes it easier for me to test the eventual integration
> with make check in one place.
> - I like being able to say that anything under `python/` is fiercely
> guarded by high standards (and the gitlab pipelines) and everything else is
> not. I consider this to be organizationally simple and easy to communicate.
> i.e., I find it attractive to say that "python is maintained, scripts are
> YMMV." I am *not* willing to maintain everything under `scripts/` with the
> same level of effort I apply to `python/`. I think it's important to allow
> people to commit low-development-cost scripts ("contrib quality") that they
> can run from time to time and not everything needs to be held to a
> crystalline perfect standard, but some stuff does.

FYI, I was NOT suggesting that you maintain anything under scripts/.

Rather I'm saying that if we want to apply python code standards, we
should (ultimately) apply them to all python code in the tree, and
that *ALL* maintainers and contributors should comply.

Consider our C standards - we don't apply them selectively to a subset
of the tree - we expect all maintainers to comply. I'd like us to have
the same be true of Python.

The only real issue we have with python standards is bringing existing
code up to par, before we can enable the checks.

Currently we have no easy way for other maintainers to enable their
python code be checked, without moving their code under python/ which
is undesirable IMHO.

If we move the python coding standards to meson.build, such that they apply
to all of the source tree, and then exclude everything except python/, we
make it easier for other maintainers to bring stuff up to par. All they need
do at that point is remove the exclusion rule for files incrementally as
they fix them.
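
To make that concrete, here is a rough sketch (hypothetical helper, file
names invented, not part of this series) of the kind of tree-wide driver
meson could invoke, with an exclusion list that shrinks as files get fixed:

  # check-python-tree.py - illustrative only
  import subprocess
  import sys
  from pathlib import Path

  # Files known not to be "clean" yet; entries get removed as they are fixed.
  EXCLUDE = {
      "scripts/some-legacy-script.py",   # hypothetical entry
  }

  files = sorted(str(p) for p in Path(".").rglob("*.py")
                 if str(p) not in EXCLUDE)
  sys.exit(subprocess.run(["flake8", *files]).returncode)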


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v4 15/35] tests/functional: enable pre-emptive caching of assets

2024-08-30 Thread Daniel P . Berrangé
On Fri, Aug 30, 2024 at 01:27:15PM +0200, Thomas Huth wrote:
> On 30/08/2024 09.42, Daniel P. Berrangé wrote:
> > On Fri, Aug 30, 2024 at 09:38:17AM +0200, Thomas Huth wrote:
> > > On 29/08/2024 12.15, Daniel P. Berrangé wrote:
> > > > On Tue, Aug 27, 2024 at 04:24:59PM +0200, Thomas Huth wrote:
> > > > > On 27/08/2024 15.16, Thomas Huth wrote:
> > > > > > On 23/08/2024 09.28, Philippe Mathieu-Daudé wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > On 21/8/24 10:27, Thomas Huth wrote:
> > > > > > > > From: Daniel P. Berrangé 
> > > > > > > > 
> > > > > > > > Many tests need to access assets stored on remote sites. We 
> > > > > > > > don't want
> > > > > > > > to download these during test execution when run by meson, 
> > > > > > > > since this
> > > > > > > > risks hitting test timeouts when data transfers are slow.
> > > > > > > > 
> > > > > > > > Add support for pre-emptive caching of assets by setting the 
> > > > > > > > env var
> > > > > > > > QEMU_TEST_PRECACHE to point to a timestamp file. When this is 
> > > > > > > > set,
> > > > > > > > instead of running the test, the assets will be downloaded and 
> > > > > > > > saved
> > > > > > > > to the cache, then the timestamp file created.
> > > ...
> > > > > > > 
> > > > > > > When using multiple jobs (-jN) I'm observing some hangs,
> > > > > > > apparently multiple threads trying to download the same file.
> > > > > > > The files are eventually downloaded successfully but it takes
> > > > > > > longer. Should we acquire some exclusive lock somewhere?
> > > > > > 
> > > > > > I haven't seen that yet ... what did you exactly run? "make
> > > > > > check-functional -jN" ? Or "make check-functional- -jN" ?
> > > > > 
> > > > > After applying some of your patches, I think I've run now into this 
> > > > > problem,
> > > > > too: It's because test_aarch64_sbsaref.py and test_aarch64_virt.py 
> > > > > try to
> > > > > download the same asset in parallel 
> > > > > (alpine-standard-3.17.2-aarch64.iso).
> > > > > 
> > > > > Daniel, any ideas how to fix this in the Asset code?
> > > > 
> > > > So when downloading we open a file with a ".download" suffix, write to
> > > > that, and then rename it to the final filename.
> > > > 
> > > > If we have concurrent usage, both will open the same file and try to
> > > > write to it. Assuming both are downloading the same content we would
> > > > probably "get lucky" and have a consistent file at the end, but clearly
> > > > it is bad to rely on luck.
> > > > 
> > > > The lame option is to use NamedTemporaryFile for the temporary file.
> > > > This ensures both processes will write to different temp files, and
> > > > the final rename is atomic. This guarantees safety, but still has
> > > > the double download penalty.
> > > > 
> > > > The serious option is to use fcntl.lockf(..., fcntl.LOCK_EX) on the
> > > > temp file. If we can't acquire the lock then just immediately close
> > > > the temp file (don't delete it) and assume another thread is going to
> > > > finish its download.
> > > > 
> > > > On windows  we'll need msvcrt.locking(..., msvcrt.LK_WLCK, ...)
> > > > instead of fcntl.
> > > 
> > > While looking for portable solutions, I noticed that newer versions
> > > of Python have a "x" mode for creating files only if they do not
> > > exist yet. So I think something like this could be a solution:
> > > 
> > > @@ -71,17 +72,26 @@ def fetch(self):
> > >   tmp_cache_file = self.cache_file.with_suffix(".download")
> > >   try:
> > > -resp = urllib.request.urlopen(self.url)
> > > +with tmp_cache_file.open("xb") as dst:
> > > +with urllib.request.urlopen(self.url) as resp:
> > > +copyfileobj(resp, dst)
&

Re: [PATCH 7/8] python/qapi: move scripts/qapi to python/qemu/qapi

2024-08-30 Thread Daniel P . Berrangé
On Fri, Aug 30, 2024 at 01:20:35PM +0200, Markus Armbruster wrote:
> John Snow  writes:
> 
> > This is being done for the sake of unifying the linting and static type
> > analysis configurations between scripts/qapi and python/qemu/*.
> >
> > With this change, the qapi module will now be checked by mypy, flake8,
> > pylint, isort etc under all python versions from 3.8 through 3.13 under
> > a variety of different dependency configurations in the GitLab testing
> > pipelines.
> >
> > The tests can be run locally, as always:
> >
> >> cd qemu.git/python
> >> make check-minreqs
> >> make check-tox
> >> make check-dev
> >
> > "check-minreqs" is the must-pass GitLab test.
> > "check-tox" is the optional allowed-to-fail GitLab test.
> >
> > Signed-off-by: John Snow 
> 
> I don't understand why we have to keep Python code in its own directory
> just to get it checked.  We wouldn't do that say for Rust code, would
> we?  Anyway, if it's the price of checking, I'll pay[*].

The 'check-tox' target is defined in python/Makefile, and is
written to only check code below that location, which is a
somewhat arbitrary choice.

Having this in "make" at all is a bit outdated. Ideally all
the targets in python/Makefile should be converted into meson
targets and/or tests, in a "python" suite.

IOW, we should make 'check-tox' a target in meson.build at the
top level, and have it check any .py file in the source tree,
with an exclude list for files we know are not "clean" yet,
so we don't have to move stuff around as we clean up individual
files.

> 
> [...]
> 
> > diff --git a/scripts/qapi-gen.py b/scripts/qapi-gen.py
> > index f3518d29a54..42912c91716 100644
> > --- a/scripts/qapi-gen.py
> > +++ b/scripts/qapi-gen.py
> > @@ -11,9 +11,11 @@
> >  execution environment.
> >  """
> >  
> > +import os
> >  import sys
> >  
> > -from qapi import main
> > +sys.path.append(os.path.join(os.path.dirname(__file__), '..', 'python'))
> > +from qemu.qapi import main
> >  
> >  if __name__ == '__main__':
> >  sys.exit(main.main())
> 
> Suggest to use the opportunity to rename to just qapi-gen (no .py) and
> chmod +x, possibly in a separate patch.
> 
> [...]
> 
> 
> [*] Grudgingly.
> 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH 2/2] qemu/osdep: handle sysconf(_SC_OPEN_MAX) return value == -1

2024-08-30 Thread Daniel P . Berrangé
On Fri, Aug 30, 2024 at 01:14:50PM +0200, Clément Léger wrote:
> On some systems (MacOS for instance), sysconf(_SC_OPEN_MAX) can return
> -1. In that case we should fallback to using the OPEN_MAX define.
> According to "man sysconf", the OPEN_MAX define should be present and
> provided by either unistd.h and/or limits.h so include them for that
> purpose. For other OSes, just assume a maximum of 1024 files descriptors
> as a fallback.
> 
> Fixes: 4ec5ebea078e ("qemu/osdep: Move close_all_open_fds() to oslib-posix")
> Reported-by: Daniel P. Berrangé 
> Signed-off-by: Clément Léger 
> ---
>  util/oslib-posix.c | 12 ++++
>  1 file changed, 12 insertions(+)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH 1/2] qemu/osdep: fix current process fds path for other OSes

2024-08-30 Thread Daniel P . Berrangé
On Fri, Aug 30, 2024 at 01:14:49PM +0200, Clément Léger wrote:
> While Linux and Solaris uses /proc/self/fd, other OSes (MacOS,
> FreeBSD) uses /dev/fd. In order to support theses OSes, use /dev/fd as
> a fallback.
> 
> Fixes: 4ec5ebea078e ("qemu/osdep: Move close_all_open_fds() to oslib-posix")
> Reported-by: Daniel P. Berrangé 
> Signed-off-by: Clément Léger 
> ---
>  util/oslib-posix.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH] crypto: run qcrypto_pbkdf2_count_iters in a new thread

2024-08-30 Thread Daniel P . Berrangé
On Tue, Aug 13, 2024 at 10:19:28AM -0300, Tiago Pasqualini wrote:
> CPU time accounting in the kernel has been demonstrated to have a
> sawtooth pattern[1][2]. This can cause the getrusage system call to
> not be as accurate as we are expecting, which can cause this calculation
> to stall.
> 
> The kernel discussions shows that this inaccuracy happens when CPU time
> gets big enough, so this patch changes qcrypto_pbkdf2_count_iters to run
> in a fresh thread to avoid this inaccuracy. It also adds a sanity check
> to fail the process if CPU time is not accounted.
> 
> [1] 
> https://lore.kernel.org/lkml/159231011694.16989.16351419333851309713.tip-bot2@tip-bot2/
> [2] 
> https://lore.kernel.org/lkml/20221226031010.4079885-1-maxing@bytedance.com/t/#m1c7f2fdc0ea742776a70fd1aa2a2e414c437f534
> 
> Resolves: #2398
> Signed-off-by: Tiago Pasqualini 
> ---
>  crypto/pbkdf.c | 42 +++---
>  include/crypto/pbkdf.h | 10 ++
>  2 files changed, 45 insertions(+), 7 deletions(-)

Mostly looks good, but one minor issue...

> diff --git a/include/crypto/pbkdf.h b/include/crypto/pbkdf.h
> index 2c31a44a27..b3757003e4 100644
> --- a/include/crypto/pbkdf.h
> +++ b/include/crypto/pbkdf.h
> @@ -153,4 +153,14 @@ uint64_t qcrypto_pbkdf2_count_iters(QCryptoHashAlgorithm 
> hash,
>  size_t nout,
>  Error **errp);
>  
> +typedef struct CountItersData {
> +QCryptoHashAlgorithm hash;
> +const uint8_t *key;
> +size_t nkey;
> +const uint8_t *salt;
> +size_t nsalt;
> +size_t nout;
> +Error **errp;
> +uint64_t iterations;

Super fussy here, but let's make 'Error **errp' the very
last item in the struct.

> +} CountItersData;
>  #endif /* QCRYPTO_PBKDF_H */

...this should remain in the pbkdf.c file, since it is not intended to
be part of the public API.


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




[PATCH 1/2] crypto: check gnutls & gcrypt support the requested pbkdf hash

2024-08-30 Thread Daniel P . Berrangé
Both gnutls and gcrypt can be configured to exclude support for certain
algorithms via a runtime check against system crypto policies. Thus it
is not sufficient to have a compile time test for hash support in their
pbkdf implementations.

Signed-off-by: Daniel P. Berrangé 
---
 crypto/pbkdf-gcrypt.c | 2 +-
 crypto/pbkdf-gnutls.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/crypto/pbkdf-gcrypt.c b/crypto/pbkdf-gcrypt.c
index a8d8e64f4d..bc0719c831 100644
--- a/crypto/pbkdf-gcrypt.c
+++ b/crypto/pbkdf-gcrypt.c
@@ -33,7 +33,7 @@ bool qcrypto_pbkdf2_supports(QCryptoHashAlgorithm hash)
 case QCRYPTO_HASH_ALG_SHA384:
 case QCRYPTO_HASH_ALG_SHA512:
 case QCRYPTO_HASH_ALG_RIPEMD160:
-return true;
+return qcrypto_hash_supports(hash);
 default:
 return false;
 }
diff --git a/crypto/pbkdf-gnutls.c b/crypto/pbkdf-gnutls.c
index 2dfbbd382c..911b565bea 100644
--- a/crypto/pbkdf-gnutls.c
+++ b/crypto/pbkdf-gnutls.c
@@ -33,7 +33,7 @@ bool qcrypto_pbkdf2_supports(QCryptoHashAlgorithm hash)
 case QCRYPTO_HASH_ALG_SHA384:
 case QCRYPTO_HASH_ALG_SHA512:
 case QCRYPTO_HASH_ALG_RIPEMD160:
-return true;
+return qcrypto_hash_supports(hash);
 default:
 return false;
 }
-- 
2.45.2




[PATCH 2/2] tests/unit: always build the pbkdf crypto unit test

2024-08-30 Thread Daniel P . Berrangé
The meson rules were excluding the pbkdf crypto test when gnutls was the
crypto backend. It was then excluded again in #if statements in the test
file.

Rather than update these conditions, remove them all, and use the result
of the qcrypto_pbkdf2_supports() function to determine whether to skip
test registration.

Also add CONFIG_DARWIN to the remaining condition, since we have a way
to measure CPU time on this platform since commit bf98afc75efedf1.

Signed-off-by: Daniel P. Berrangé 
---
 tests/unit/meson.build | 4 +---
 tests/unit/test-crypto-pbkdf.c | 9 ++---
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/tests/unit/meson.build b/tests/unit/meson.build
index 490ab8182d..972d792883 100644
--- a/tests/unit/meson.build
+++ b/tests/unit/meson.build
@@ -121,9 +121,7 @@ if have_block
   if config_host_data.get('CONFIG_REPLICATION')
 tests += {'test-replication': [testblock]}
   endif
-  if nettle.found() or gcrypt.found()
-tests += {'test-crypto-pbkdf': [io]}
-  endif
+  tests += {'test-crypto-pbkdf': [io]}
 endif
 
 if have_system
diff --git a/tests/unit/test-crypto-pbkdf.c b/tests/unit/test-crypto-pbkdf.c
index 43c417f6b4..034bb02422 100644
--- a/tests/unit/test-crypto-pbkdf.c
+++ b/tests/unit/test-crypto-pbkdf.c
@@ -25,8 +25,7 @@
 #include 
 #endif
 
-#if ((defined(CONFIG_NETTLE) || defined(CONFIG_GCRYPT)) && \
- (defined(_WIN32) || defined(RUSAGE_THREAD)))
+#if defined(_WIN32) || defined(RUSAGE_THREAD) || defined(CONFIG_DARWIN)
 #include "crypto/pbkdf.h"
 
 typedef struct QCryptoPbkdfTestData QCryptoPbkdfTestData;
@@ -422,13 +421,17 @@ int main(int argc, char **argv)
 g_assert(qcrypto_init(NULL) == 0);
 
 for (i = 0; i < G_N_ELEMENTS(test_data); i++) {
+if (!qcrypto_pbkdf2_supports(test_data[i].hash)) {
+continue;
+}
+
 if (!test_data[i].slow ||
 g_test_slow()) {
 g_test_add_data_func(test_data[i].path, &test_data[i], test_pbkdf);
 }
 }
 
-if (g_test_slow()) {
+if (g_test_slow() && qcrypto_pbkdf2_supports(QCRYPTO_HASH_ALG_SHA256)) {
 g_test_add_func("/crypt0/pbkdf/timing", test_pbkdf_timing);
 }
 
-- 
2.45.2




[PATCH 0/2] crypto: misc pbkdf fixes for testing & algorithm compat

2024-08-30 Thread Daniel P . Berrangé



Daniel P. Berrangé (2):
  crypto: check gnutls & gcrypt support the requested pbkdf hash
  tests/unit: always build the pbkdf crypto unit test

 crypto/pbkdf-gcrypt.c  | 2 +-
 crypto/pbkdf-gnutls.c  | 2 +-
 tests/unit/meson.build | 4 +---
 tests/unit/test-crypto-pbkdf.c | 9 ++---
 4 files changed, 9 insertions(+), 8 deletions(-)

-- 
2.45.2




Re: [PATCH v4 15/35] tests/functional: enable pre-emptive caching of assets

2024-08-30 Thread Daniel P . Berrangé
On Fri, Aug 30, 2024 at 09:38:17AM +0200, Thomas Huth wrote:
> On 29/08/2024 12.15, Daniel P. Berrangé wrote:
> > On Tue, Aug 27, 2024 at 04:24:59PM +0200, Thomas Huth wrote:
> > > On 27/08/2024 15.16, Thomas Huth wrote:
> > > > On 23/08/2024 09.28, Philippe Mathieu-Daudé wrote:
> > > > > Hi,
> > > > > 
> > > > > On 21/8/24 10:27, Thomas Huth wrote:
> > > > > > From: Daniel P. Berrangé 
> > > > > > 
> > > > > > Many tests need to access assets stored on remote sites. We don't 
> > > > > > want
> > > > > > to download these during test execution when run by meson, since 
> > > > > > this
> > > > > > risks hitting test timeouts when data transfers are slow.
> > > > > > 
> > > > > > Add support for pre-emptive caching of assets by setting the env var
> > > > > > QEMU_TEST_PRECACHE to point to a timestamp file. When this is set,
> > > > > > instead of running the test, the assets will be downloaded and saved
> > > > > > to the cache, then the timestamp file created.
> ...
> > > > > 
> > > > > When using multiple jobs (-jN) I'm observing some hangs,
> > > > > apparently multiple threads trying to download the same file.
> > > > > The files are eventually downloaded successfully but it takes
> > > > > longer. Should we acquire some exclusive lock somewhere?
> > > > 
> > > > I haven't seen that yet ... what did you exactly run? "make
> > > > check-functional -jN" ? Or "make check-functional- -jN" ?
> > > 
> > > After applying some of your patches, I think I've run now into this 
> > > problem,
> > > too: It's because test_aarch64_sbsaref.py and test_aarch64_virt.py try to
> > > download the same asset in parallel (alpine-standard-3.17.2-aarch64.iso).
> > > 
> > > Daniel, any ideas how to fix this in the Asset code?
> > 
> > So when downloading we open a file with a ".download" suffix, write to
> > that, and then rename it to the final filename.
> > 
> > If we have concurrent usage, both will open the same file and try to
> > write to it. Assuming both are downloading the same content we would
> > probably "get lucky" and have a consistent file at the end, but clearly
> > it is bad to rely on luck.
> > 
> > The lame option is to use NamedTemporaryFile for the temporary file.
> > This ensures both processes will write to different temp files, and
> > the final rename is atomic. This guarantees safety, but still has
> > the double download penalty.
> > 
> > The serious option is to use fcntl.lockf(..., fcntl.LOCK_EX) on the
> > temp file. If we can't acquire the lock then just immediately close
> > the temp file (don't delete it) and assume another thread is going to
> > finish its download.
> > 
> > On windows  we'll need msvcrt.locking(..., msvcrt.LK_WLCK, ...)
> > instead of fcntl.
> 
> While looking for portable solutions, I noticed that newer versions
> of Python have a "x" mode for creating files only if they do not
> exist yet. So I think something like this could be a solution:
> 
> @@ -71,17 +72,26 @@ def fetch(self):
>  tmp_cache_file = self.cache_file.with_suffix(".download")
>  try:
> -resp = urllib.request.urlopen(self.url)
> +with tmp_cache_file.open("xb") as dst:
> +with urllib.request.urlopen(self.url) as resp:
> +copyfileobj(resp, dst)
> +except FileExistsError:
> +# Another thread already seems to download this asset,
> +# so wait until it is done
> +self.log.debug("%s already exists, waiting for other thread to 
> finish...",
> +   tmp_cache_file)
> +i = 0
> +while i < 600 and os.path.exists(tmp_cache_file):
> +sleep(1)
> +i += 1
> +if os.path.exists(self.cache_file):
> +return str(self.cache_file)
> +raise
>  except Exception as e:
>  self.log.error("Unable to download %s: %s", self.url, e)
> -raise
> -
> -try:
> -with tmp_cache_file.open("wb+") as dst:
> -copyfileobj(resp, dst)
> -except:
>  tmp_cache_file.unlink()
>  raise
> +
> 

Re: [PATCH] raw-format: Fix error message for invalid offset/size

2024-08-29 Thread Daniel P . Berrangé
On Thu, Aug 29, 2024 at 08:55:27PM +0200, Kevin Wolf wrote:
> s->offset and s->size are only set at the end of the function and still
> contain the old values when formatting the error message. Print the
> parameters with the new values that we actually checked instead.
> 
> Fixes: 500e2434207d ('raw-format: Split raw_read_options()')
> Signed-off-by: Kevin Wolf 
> ---
>  block/raw-format.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Daniel P. Berrangé 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v4 35/35] docs/devel/testing: Add documentation for functional tests

2024-08-29 Thread Daniel P . Berrangé
On Thu, Aug 29, 2024 at 01:35:21PM +0200, Thomas Huth wrote:
> On 29/08/2024 12.34, Daniel P. Berrangé wrote:
> > On Wed, Aug 21, 2024 at 10:27:36AM +0200, Thomas Huth wrote:
> > > Document the new functional testing framework. The text is originally
> > > based on the Avocado documentation, but heavily modified to match the
> > > new framework.
> > > 
> > > Signed-off-by: Thomas Huth 
> > > ---
> > >   docs/devel/testing/functional.rst | 269 ++
> > >   docs/devel/testing/index.rst  |   1 +
> > >   docs/devel/testing/main.rst   |  12 ++
> > >   3 files changed, 282 insertions(+)
> > >   create mode 100644 docs/devel/testing/functional.rst
> > 
> > Reviewed-by: Daniel P. Berrangé 
> > 
> > > +The tests should be written in the style of the Python `unittest`_
> > > +framework, using stdio for the TAP protocol. The folder
> > > +``tests/functional/qemu_test`` provides classes (e.g. the 
> > > ``QemuBaseTest``
> > > +and the ``QemuSystemTest`` classes) and utility functions that help
> > > +to get your test into the right shape.
> > 
> > One gotcha when using TAP protocol is that you can't just spew
> > debug info to stdout/stderr. Each line of debug info needs to
> > be prefixed with '#' so it is interpreted as diagnostic output.
> 
> Actually, that's the great thing about pycotap (in comparison to other
> Python TAP implementations that I've seen), it helps you to get this right:
> By instantiating the TAPTestRunner like this:
> 
> tr = pycotap.TAPTestRunner(message_log = pycotap.LogMode.LogToError,
>test_output_log = pycotap.LogMode.LogToError)
> 
> The stdio output gets redirected to stderr. And the meson test runner is
> fine by collecting the error messages from stderr and showing them to the
> user in the right way in case the test failed (and puts them into the log
> file if the test succeeded).

I'm not sure that works in all scenarios. In the patch that converts the
acpi-bits test, I had to add this chunk:

@@ -264,8 +263,12 @@ def generate_bits_iso(self):

 try:
 if os.getenv('V') or os.getenv('BITS_DEBUG'):
-subprocess.check_call([mkrescue_script, '-o', iso_file,
-   bits_dir], stderr=subprocess.STDOUT)
+proc = subprocess.run([mkrescue_script, '-o', iso_file,
+   bits_dir],
+  stdout=subprocess.PIPE,
+  stderr=subprocess.STDOUT,
+  check=True)
+self.logger.info("grub-mkrescue output %s" % proc.stdout)
 else:
 subprocess.check_call([mkrescue_script, '-o',
   iso_file, bits_dir],

because I saw errors in TAP output parsing when V=1 was set,
due to mkrescue_script printing to the STDOUT file descriptor.

IIUC, I wonder if the pycotap runner is replacing the 'stdout'
python object, but leaving the underlying OS FD 1 open on its
original channel such that it gets inherited by child processes.
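
A quick way to see the distinction (standalone sketch, not related to the
patch itself): replacing the Python-level sys.stdout object does not move
the OS-level descriptor, so anything writing to FD 1 directly (including
child processes) bypasses the redirection:

  import io
  import os
  import subprocess
  import sys

  sys.stdout = io.StringIO()                # what a TAP runner might do in-process
  print("captured")                         # goes into the StringIO, not to the terminal
  os.write(1, b"raw write to fd 1\n")       # still reaches the original stdout
  subprocess.run(["echo", "child output"])  # the child inherits fd 1 and writes to it directly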


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PULL 10/11] crypto: push error reporting into TLS session I/O APIs

2024-08-29 Thread Daniel P . Berrangé
On Wed, Aug 28, 2024 at 10:32:15AM +0200, Thomas Huth wrote:
> On 27/08/2024 09.05, Markus Armbruster wrote:
> > Daniel P. Berrangé  writes:
> > 
> > > On Mon, Aug 12, 2024 at 05:38:41PM +0200, Thomas Huth wrote:
> > > > On 24/07/2024 11.47, Daniel P. Berrangé wrote:
> > > > > The current TLS session I/O APIs just return a synthetic errno
> > > > > value on error, which has been translated from a gnutls error
> > > > > value. This looses a large amount of valuable information that
> > > > > distinguishes different scenarios.
> > > > > 
> > > > > Pushing population of the "Error *errp" object into the TLS
> > > > > session I/O APIs gives more detailed error information.
> > > > > 
> > > > > Reviewed-by: Philippe Mathieu-Daudé 
> > > > > Signed-off-by: Daniel P. Berrangé 
> > > > > ---
> > > > 
> > > >   Hi Daniel!
> > > > 
> > > > iotest 233 is failing for me with -raw now, and bisection
> > > > points to this commit. Output is:
> > > > 
> > > > --- .../qemu/tests/qemu-iotests/233.out
> > > > +++ /tmp/qemu/tests/qemu-iotests/scratch/raw-file-233/233.out.bad
> > > > @@ -69,8 +69,8 @@
> > > >   1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> > > > 
> > > >   == check TLS with authorization ==
> > > > -qemu-img: Could not open 
> > > > 'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': Failed to read 
> > > > option reply: Cannot read from TLS channel: Software caused connection 
> > > > abort
> > > > -qemu-img: Could not open 
> > > > 'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': Failed to read 
> > > > option reply: Cannot read from TLS channel: Software caused connection 
> > > > abort
> > > > +qemu-img: Could not open 
> > > > 'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': Failed to read 
> > > > option reply: Cannot read from TLS channel: The TLS connection was 
> > > > non-properly terminated.
> > > > +qemu-img: Could not open 
> > > > 'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': Failed to read 
> > > > option reply: Cannot read from TLS channel: The TLS connection was 
> > > > non-properly terminated.
> > > 
> > > This is an expected change. Previously squashed the real GNUTLS error
> > > into ECONNABORTED:
> > > 
> > > -case GNUTLS_E_PREMATURE_TERMINATION:
> > > -errno = ECONNABORTED;
> > > -break;
> > > 
> > > 
> > > now we report the original gnutls root cause.
> > > 
> > > IOW, we need to update the expected output files.
> > 
> > Has this been done?
> 
> No, I think the problem still persists.

I've just cc'd you both on a patch that fixes this.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




[PATCH] iotests: fix expected output from gnutls in NBD test

2024-08-29 Thread Daniel P . Berrangé
Error reporting from gnutls was improved by:

  commit 57941c9c86357a6a642f9ee3279d881df4043b6d
  Author: Daniel P. Berrangé 
  Date:   Fri Mar 15 14:07:58 2024 +

crypto: push error reporting into TLS session I/O APIs

This has the effect of changing the output from one of the NBD
tests.

Reported-by: Thomas Huth 
Signed-off-by: Daniel P. Berrangé 
---

NB, wrt qemu-stable this will be for the 9.1 stable branch once
that is created; no earlier releases will need this.

 tests/qemu-iotests/233.out | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/233.out b/tests/qemu-iotests/233.out
index 1910f7df20..d498d55e0e 100644
--- a/tests/qemu-iotests/233.out
+++ b/tests/qemu-iotests/233.out
@@ -69,8 +69,8 @@ read 1048576/1048576 bytes at offset 1048576
 1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 
 == check TLS with authorization ==
-qemu-img: Could not open 'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': 
Failed to read option reply: Cannot read from TLS channel: Software caused 
connection abort
-qemu-img: Could not open 'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': 
Failed to read option reply: Cannot read from TLS channel: Software caused 
connection abort
+qemu-img: Could not open 'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': 
Failed to read option reply: Cannot read from TLS channel: The TLS connection 
was non-properly terminated.
+qemu-img: Could not open 'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': 
Failed to read option reply: Cannot read from TLS channel: The TLS connection 
was non-properly terminated.
 
 == check TLS fail over UNIX with no hostname ==
 qemu-img: Could not open 
'driver=nbd,path=SOCK_DIR/qemu-nbd.sock,tls-creds=tls0': No hostname for 
certificate validation
@@ -103,14 +103,14 @@ qemu-img: Could not open 
'driver=nbd,path=SOCK_DIR/qemu-nbd.sock,tls-creds=tls0'
 qemu-nbd: TLS handshake failed: The TLS connection was non-properly terminated.
 
 == final server log ==
-qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read 
from TLS channel: Software caused connection abort
-qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read 
from TLS channel: Software caused connection abort
+qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read 
from TLS channel: The TLS connection was non-properly terminated.
+qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read 
from TLS channel: The TLS connection was non-properly terminated.
 qemu-nbd: option negotiation failed: Verify failed: No certificate was found.
 qemu-nbd: option negotiation failed: Verify failed: No certificate was found.
 qemu-nbd: option negotiation failed: TLS x509 authz check for 
DISTINGUISHED-NAME is denied
 qemu-nbd: option negotiation failed: TLS x509 authz check for 
DISTINGUISHED-NAME is denied
-qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read 
from TLS channel: Software caused connection abort
-qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read 
from TLS channel: Software caused connection abort
+qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read 
from TLS channel: The TLS connection was non-properly terminated.
+qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read 
from TLS channel: The TLS connection was non-properly terminated.
 qemu-nbd: option negotiation failed: TLS handshake failed: An illegal 
parameter has been received.
 qemu-nbd: option negotiation failed: TLS handshake failed: An illegal 
parameter has been received.
 *** done
-- 
2.45.2




Re: [PATCH v4 35/35] docs/devel/testing: Add documentation for functional tests

2024-08-29 Thread Daniel P . Berrangé
On Wed, Aug 21, 2024 at 10:27:36AM +0200, Thomas Huth wrote:
> Document the new functional testing framework. The text is originally
> based on the Avocado documentation, but heavily modified to match the
> new framework.
> 
> Signed-off-by: Thomas Huth 
> ---
>  docs/devel/testing/functional.rst | 269 ++
>  docs/devel/testing/index.rst  |   1 +
>  docs/devel/testing/main.rst   |  12 ++
>  3 files changed, 282 insertions(+)
>  create mode 100644 docs/devel/testing/functional.rst

Should also mention the use of the "Asset" class and ASSET_blah class
level variables for caching purposes.
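
Something along these lines would do as an illustration (URL and hash below
are placeholders, not a real asset):

  from qemu_test import QemuSystemTest, Asset

  class BootExample(QemuSystemTest):

      # class-level ASSET_* variables are what the precaching logic looks for
      ASSET_KERNEL = Asset(
          'https://download.example.org/some/path/vmlinuz',   # placeholder URL
          '0' * 64)                                            # placeholder sha256

      def test_boot(self):
          kernel_path = self.ASSET_KERNEL.fetch()
          self.vm.add_args('-kernel', kernel_path)
          self.vm.launch()

  if __name__ == '__main__':
      QemuSystemTest.main()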

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v4 35/35] docs/devel/testing: Add documentation for functional tests

2024-08-29 Thread Daniel P . Berrangé
On Wed, Aug 21, 2024 at 10:27:36AM +0200, Thomas Huth wrote:
> Document the new functional testing framework. The text is originally
> based on the Avocado documentation, but heavily modified to match the
> new framework.
> 
> Signed-off-by: Thomas Huth 
> ---
>  docs/devel/testing/functional.rst | 269 ++
>  docs/devel/testing/index.rst  |   1 +
>  docs/devel/testing/main.rst   |  12 ++
>  3 files changed, 282 insertions(+)
>  create mode 100644 docs/devel/testing/functional.rst

Reviewed-by: Daniel P. Berrangé 

> +The tests should be written in the style of the Python `unittest`_
> +framework, using stdio for the TAP protocol. The folder
> +``tests/functional/qemu_test`` provides classes (e.g. the ``QemuBaseTest``
> +and the ``QemuSystemTest`` classes) and utility functions that help
> +to get your test into the right shape.

One gotcha when using TAP protocol is that you can't just spew
debug info to stdout/stderr. Each line of debug info needs to
be prefixed with '#' so it is interpreted as diagnostic output.

We should point this out and recommend that people exclusively
use the 'logging' framework.

Particular care should be taken when spawning sub-processes
to capture both stdout/stderr and then log the result if
needed.
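
For instance, something like this (sketch only, command name invented) keeps
a helper's output off the TAP stream and routes it through logging instead:

  import logging
  import subprocess

  log = logging.getLogger('qemu-test')

  proc = subprocess.run(['some-helper', '--verbose'],    # hypothetical command
                        stdout=subprocess.PIPE,
                        stderr=subprocess.STDOUT,
                        check=True)
  log.info("helper output: %s", proc.stdout)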


> +Overview
> +
> +
> +The ``tests/functional/qemu_test`` directory provides the ``qemu_test``
> +Python module, containing the ``qemu_test.QemuSystemTest`` class.
> +Here is a simple usage example:
> +
> +.. code::
> +
> +  #!/usr/bin/env python3
> +
> +  from qemu_test import QemuSystemTest
> +
> +  class Version(QemuSystemTest):
> +
> +  def test_qmp_human_info_version(self):
> +  self.vm.launch()
> +  res = self.vm.cmd('human-monitor-command',
> +command_line='info version')
> +  self.assertRegex(res, r'^(\d+\.\d+\.\d)')
> +
> +  if __name__ == '__main__':
> +  QemuSystemTest.main()
> +
> +By providing the "hash bang" line at the beginning of the script,
> +and by calling into QemuSystemTest.main() when it is run directly,
> +the test framework makes sure to run all test_*() functions in the
> +right fassion (e.g. with TAP output that is required by the meson test
> +runner).

Perhaps say that the test file should have execute permissions,
given the hash bang?



With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v4 34/35] docs/devel/testing: Rename avocado_qemu.Test class

2024-08-29 Thread Daniel P . Berrangé
On Wed, Aug 21, 2024 at 10:27:35AM +0200, Thomas Huth wrote:
> The avocado_qemu.Test class has been renamed a while back in commit
> 2283b627bc ("tests/avocado: Rename avocado_qemu.Test -> QemuSystemTest"),
> so we should reflect this now in the documentation, too.
> 
> Signed-off-by: Thomas Huth 
> ---
>  docs/devel/testing/avocado.rst | 32 
>  1 file changed, 16 insertions(+), 16 deletions(-)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v4 33/35] docs/devel/testing: Split the Avocado documentation into a separate file

2024-08-29 Thread Daniel P . Berrangé
On Wed, Aug 21, 2024 at 10:27:34AM +0200, Thomas Huth wrote:
> The main testing documentation file got very overloaded already.
> Thus let's split the Avocado information into a separate file.
> 
> Signed-off-by: Thomas Huth 
> ---
>  docs/devel/testing/avocado.rst | 581 +
>  docs/devel/testing/index.rst   |   1 +
>  docs/devel/testing/main.rst| 565 +---
>  3 files changed, 583 insertions(+), 564 deletions(-)
>  create mode 100644 docs/devel/testing/avocado.rst

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v4 15/35] tests/functional: enable pre-emptive caching of assets

2024-08-29 Thread Daniel P . Berrangé
On Tue, Aug 27, 2024 at 04:24:59PM +0200, Thomas Huth wrote:
> On 27/08/2024 15.16, Thomas Huth wrote:
> > On 23/08/2024 09.28, Philippe Mathieu-Daudé wrote:
> > > Hi,
> > > 
> > > On 21/8/24 10:27, Thomas Huth wrote:
> > > > From: Daniel P. Berrangé 
> > > > 
> > > > Many tests need to access assets stored on remote sites. We don't want
> > > > to download these during test execution when run by meson, since this
> > > > risks hitting test timeouts when data transfers are slow.
> > > > 
> > > > Add support for pre-emptive caching of assets by setting the env var
> > > > QEMU_TEST_PRECACHE to point to a timestamp file. When this is set,
> > > > instead of running the test, the assets will be downloaded and saved
> > > > to the cache, then the timestamp file created.
> > > > 
> > > > A meson custom target is created as a dependency of each test suite
> > > > to trigger the pre-emptive caching logic before the test runs.
> > > > 
> > > > When run in caching mode, it will locate assets by looking for class
> > > > level variables with a name prefix "ASSET_", and type "Asset".
> > > > 
> > > > At the ninja level
> > > > 
> > > >     ninja test --suite functional
> > > > 
> > > > will speculatively download any assets that are not already cached,
> > > > so it is advisable to set a timeout multiplier.
> > > > 
> > > >     QEMU_TEST_NO_DOWNLOAD=1 ninja test --suite functional
> > > > 
> > > > will fail the test if a required asset is not already cached
> > > > 
> > > >     ninja precache-functional
> > > > 
> > > > will download and cache all assets required by the functional
> > > > tests
> > > > 
> > > > At the make level, precaching is always done by
> > > > 
> > > >     make check-functional
> > > > 
> > > > Signed-off-by: Daniel P. Berrangé 
> > > > Tested-by: Richard Henderson 
> > > > [thuth: Remove the duplicated "path = os.path.basename(...)" line]
> > > > Signed-off-by: Thomas Huth 
> > > > ---
> > > >   tests/Makefile.include |  3 ++-
> > > >   tests/functional/meson.build   | 33 +++--
> > > >   tests/functional/qemu_test/asset.py    | 34 ++
> > > >   tests/functional/qemu_test/testcase.py |  7 ++
> > > >   4 files changed, 74 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/tests/Makefile.include b/tests/Makefile.include
> > > > index 66c8cc3123..010369bd3a 100644
> > > > --- a/tests/Makefile.include
> > > > +++ b/tests/Makefile.include
> > > > @@ -161,7 +161,8 @@ $(FUNCTIONAL_TARGETS):
> > > >   .PHONY: check-functional
> > > >   check-functional:
> > > > -    @$(MAKE) SPEED=thorough check-func check-func-quick
> > > > +    @$(NINJA) precache-functional
> > > > +    @QEMU_TEST_NO_DOWNLOAD=1 $(MAKE) SPEED=thorough check-func
> > > > check-func-quick
> > > >   # Consolidated targets
> > > > diff --git a/tests/functional/meson.build b/tests/functional/meson.build
> > > > index 8a8fa0ab99..cef74b82a9 100644
> > > > --- a/tests/functional/meson.build
> > > > +++ b/tests/functional/meson.build
> > > > @@ -33,6 +33,7 @@ tests_x86_64_quick = [
> > > >   tests_x86_64_thorough = [
> > > >   ]
> > > > +precache_all = []
> > > >   foreach speed : ['quick', 'thorough']
> > > >     foreach dir : target_dirs
> > > >   if not dir.endswith('-softmmu')
> > > > @@ -63,11 +64,35 @@ foreach speed : ['quick', 'thorough']
> > > >  meson.current_source_dir())
> > > >   foreach test : target_tests
> > > > -  test('func-@0@/@1@'.format(target_base, test),
> > > > +  testname = '@0@-@1@'.format(target_base, test)
> > > > +  testfile = 'test_' + test + '.py'
> > > > +  testpath = meson.current_source_dir() / testfile
> > > > +  teststamp = testname + '.tstamp'
> > > > +  test_precache_env = environment()
> > > > +  test_precache_env.set('QEMU_TEST_PRE

Re: [PATCH v4 14/35] tests/functional: add a module for handling asset download & caching

2024-08-29 Thread Daniel P . Berrangé
On Fri, Aug 23, 2024 at 08:24:45AM +0200, Philippe Mathieu-Daudé wrote:
> Hi,
> 
> On 21/8/24 10:27, Thomas Huth wrote:
> > From: Daniel P. Berrangé 
> > 
> > The 'Asset' class is a simple module that declares a downloadable
> > asset that can be cached locally. Downloads are stored in the user's
> > home dir at ~/.cache/qemu/download, using a sha256 sum of the URL.
> > 
> > Signed-off-by: Daniel P. Berrangé 
> > Reviewed-by: Philippe Mathieu-Daudé 
> > Tested-by: Philippe Mathieu-Daudé 
> > [thuth: Drop sha1 support, use hash on file content for naming instead of 
> > URL,
> >  add the possibility to specify the cache dir via environment 
> > variable]
> > Signed-off-by: Thomas Huth 
> > ---
> >   tests/functional/qemu_test/__init__.py |  1 +
> >   tests/functional/qemu_test/asset.py| 97 ++
> >   2 files changed, 98 insertions(+)
> >   create mode 100644 tests/functional/qemu_test/asset.py
> 
> 
> > +def fetch(self):
> > +if not self.cache_dir.exists():
> > +self.cache_dir.mkdir(parents=True, exist_ok=True)
> > +
> > +if self.valid():
> > +self.log.debug("Using cached asset %s for %s",
> > +   self.cache_file, self.url)
> > +return str(self.cache_file)
> > +
> > +self.log.info("Downloading %s to %s...", self.url, self.cache_file)
> > +tmp_cache_file = self.cache_file.with_suffix(".download")
> > +
> > +try:
> > +resp = urllib.request.urlopen(self.url)
> > +except Exception as e:
> > +self.log.error("Unable to download %s: %s", self.url, e)
> > +raise
> > +
> > +try:
> > +with tmp_cache_file.open("wb+") as dst:
> > +copyfileobj(resp, dst)
> > +except:
> > +tmp_cache_file.unlink()
> > +raise
> > +try:
> > +# Set these just for informational purposes
> > +os.setxattr(str(tmp_cache_file), "user.qemu-asset-url",
> > +self.url.encode('utf8'))
> > +os.setxattr(str(tmp_cache_file), "user.qemu-asset-hash",
> > +self.hash.encode('utf8'))
> > +except Exception as e:
> > +self.log.info("Unable to set xattr on %s: %s", tmp_cache_file, 
> > e)
> 
> This line is annoying on macOS as it is logged for each file downloaded.
> Is it really useful? Can we demote to DEBUG level or log it just once,
> given all tmp_cache_files will always be on the same cache_dir thus
> filesystem?

Yeah, DEBUG would be fine.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v4 14/35] tests/functional: add a module for handling asset download & caching

2024-08-29 Thread Daniel P . Berrangé
On Wed, Aug 21, 2024 at 04:49:42PM +0200, Philippe Mathieu-Daudé wrote:
> On 21/8/24 10:27, Thomas Huth wrote:
> > From: Daniel P. Berrangé 
> > 
> > The 'Asset' class is a simple module that declares a downloadable
> > asset that can be cached locally. Downloads are stored in the user's
> > home dir at ~/.cache/qemu/download, using a sha256 sum of the URL.
> > 
> > Signed-off-by: Daniel P. Berrangé 
> > Reviewed-by: Philippe Mathieu-Daudé 
> > Tested-by: Philippe Mathieu-Daudé 
> > [thuth: Drop sha1 support, use hash on file content for naming instead of 
> > URL,
> >  add the possibility to specify the cache dir via environment 
> > variable]
> > Signed-off-by: Thomas Huth 
> > ---
> >   tests/functional/qemu_test/__init__.py |  1 +
> >   tests/functional/qemu_test/asset.py| 97 ++
> >   2 files changed, 98 insertions(+)
> >   create mode 100644 tests/functional/qemu_test/asset.py
> > 
> > diff --git a/tests/functional/qemu_test/__init__.py 
> > b/tests/functional/qemu_test/__init__.py
> > index 2f1e0bc70d..db05c8f412 100644
> > --- a/tests/functional/qemu_test/__init__.py
> > +++ b/tests/functional/qemu_test/__init__.py
> > @@ -6,6 +6,7 @@
> >   # later.  See the COPYING file in the top-level directory.
> > +from .asset import Asset
> >   from .config import BUILD_DIR
> >   from .cmd import has_cmd, has_cmds, run_cmd, is_readable_executable_file, 
> > \
> >   interrupt_interactive_console_until_pattern, 
> > wait_for_console_pattern, \
> > diff --git a/tests/functional/qemu_test/asset.py 
> > b/tests/functional/qemu_test/asset.py
> > new file mode 100644
> > index 00..cbeb6278af
> > --- /dev/null
> > +++ b/tests/functional/qemu_test/asset.py
> > @@ -0,0 +1,97 @@
> > +# Test utilities for fetching & caching assets
> > +#
> > +# Copyright 2024 Red Hat, Inc.
> > +#
> > +# This work is licensed under the terms of the GNU GPL, version 2 or
> > +# later.  See the COPYING file in the top-level directory.
> > +
> > +import hashlib
> > +import logging
> > +import os
> > +import subprocess
> > +import urllib.request
> > +from pathlib import Path
> > +from shutil import copyfileobj
> > +
> > +
> > +# Instances of this class must be declared as class level variables
> > +# starting with a name "ASSET_". This enables the pre-caching logic
> > +# to easily find all referenced assets and download them prior to
> > +# execution of the tests.
> > +class Asset:
> > +
> > +def __init__(self, url, hashsum):
> > +self.url = url
> > +self.hash = hashsum
> > +cache_dir_env = os.getenv('QEMU_TEST_CACHE_DIR')
> > +if cache_dir_env:
> > +self.cache_dir = Path(cache_dir_env, "download")
> > +else:
> > +self.cache_dir = Path(Path("~").expanduser(),
> > +  ".cache", "qemu", "download")
> > +self.cache_file = Path(self.cache_dir, hashsum)
> > +self.log = logging.getLogger('qemu-test')
> > +
> > +def __repr__(self):
> > +return "Asset: url=%s hash=%s cache=%s" % (
> > +self.url, self.hash, self.cache_file)
> > +
> > +def _check(self, cache_file):
> > +if self.hash is None:
> > +return True
> > +if len(self.hash) == 64:
> > +sum_prog = 'sha256sum'
> > +elif len(self.hash) == 128:
> > +sum_prog = 'sha512sum'
> > +else:
> > +raise Exception("unknown hash type")
> > +
> > +checksum = subprocess.check_output(
> > +[sum_prog, str(cache_file)]).split()[0]
> > +return self.hash == checksum.decode("utf-8")
> > +
> > +def valid(self):
> > +return self.cache_file.exists() and self._check(self.cache_file)
> > +
> > +def fetch(self):
> > +if not self.cache_dir.exists():
> > +self.cache_dir.mkdir(parents=True, exist_ok=True)
> 
> This doesn't work with QEMU_TEST_CACHE_DIR set to someone else:
> 
>   File
> "/home/philippe.mathieu-daude/qemu/tests/functional/qemu_test/asset.py",
> line 60, in fetch
> self.cache_dir.mkdir(parents=True, exist_ok=True)
>   File "/usr/lib/python3.10/pathlib.py", line 

Re: [PATCH v4 11/35] tests/functional: Prepare the meson build system for the functional tests

2024-08-29 Thread Daniel P . Berrangé
On Mon, Aug 26, 2024 at 10:18:44AM +0200, Thomas Huth wrote:
> On 23/08/2024 14.54, Philippe Mathieu-Daudé wrote:
> > On 21/8/24 10:27, Thomas Huth wrote:
> > > Provide a meson.build file for the upcoming python-based functional
> > > tests, and add some wrapper glue targets to the tests/Makefile.include
> > > file. We are going to use two "speed" modes for the functional tests:
> > > The "quick" tests can be run at any time (i.e. also during "make check"),
> > > while the "thorough" tests should only be run when running a
> > > "make check-functional" test run (since these tests might download
> > > additional assets from the internet).
> > > 
> > > The changes to the meson.build files are partly based on an earlier
> > > patch by Ani Sinha.
> > > 
> > > Reviewed-by: Philippe Mathieu-Daudé 
> > > Tested-by: Philippe Mathieu-Daudé 
> > > Signed-off-by: Thomas Huth 
> > > ---
> > >   tests/Makefile.include   | 11 ++
> > >   tests/functional/meson.build | 66 
> > >   tests/meson.build    |  1 +
> > >   3 files changed, 78 insertions(+)
> > >   create mode 100644 tests/functional/meson.build
> > 
> > 
> > > +# Timeouts for individual tests that can be slow e.g. with debugging 
> > > enabled
> > > +test_timeouts = {
> > > +}
> > 
> > 
> > > +    foreach test : target_tests
> > > +  test('func-@0@/@1@'.format(target_base, test),
> > > +   python,
> > > +   depends: [test_deps, test_emulator, emulator_modules],
> > > +   env: test_env,
> > > +   args: [meson.current_source_dir() / 'test_' + test + '.py'],
> > > +   protocol: 'tap',
> > > +   timeout: test_timeouts.get(test, 60),
> > > +   priority: test_timeouts.get(test, 60),
> > 
> > IIUC with Avocado the timeout was for each test_func in a TestClass.
> > Now this is only per TestClass. Hopefully I'm wrong...
> 
> I think you're right ... we might need to adjust the meson timeouts here and
> there in case they are causing problems, but that's just business as usual
> (we had to do the same when enabling the meson timeouts for the qtests for
> example).

Meson timeouts are around the lifetime of whatever child process
we tell it to execute.  So, yes, the timeout is for the .py file
as a whole, not the individual TestClass within it.

This is different from Avocado, but not a big problem. Worst case
we'll adjust some meson level timeouts, or split overly long running
.py test files into multiple test .py files.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v4 07/35] python: Install pycotap in our venv if necessary

2024-08-29 Thread Daniel P . Berrangé
On Wed, Aug 21, 2024 at 10:27:08AM +0200, Thomas Huth wrote:
> The upcoming functional tests will require pycotap for providing
> TAP output from the python-based tests. Since we want to be able
> to run some of the tests offline by default, too, let's install
> it along with meson in our venv if necessary (it's size is only
> 5 kB, so adding the wheel here should not really be a problem).
> 
> The wheel file has been obtained with:
> 
>  pip download --only-binary :all: --dest . --no-cache pycotap
> 
> Tested-by: Philippe Mathieu-Daudé 
> Signed-off-by: Thomas Huth 
> ---
>  python/wheels/pycotap-1.3.1-py3-none-any.whl | Bin 0 -> 5119 bytes
>  pythondeps.toml  |   1 +
>  2 files changed, 1 insertion(+)
>  create mode 100644 python/wheels/pycotap-1.3.1-py3-none-any.whl

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v4 06/35] tests/avocado/boot_linux_console: Remove the s390x subtest

2024-08-29 Thread Daniel P . Berrangé
On Wed, Aug 21, 2024 at 10:27:07AM +0200, Thomas Huth wrote:
> We've got a much more sophisticated, Fedora-based test for s390x
> ("test_s390x_fedora" in another file) already, so the test in
> boot_linux_console.py seems to be rather a waste of precious test
> cycles. Thus move the command line check and delete the s390x
> test in boot_linux_console.py.
> 
> Signed-off-by: Thomas Huth 
> ---
>  tests/avocado/boot_linux_console.py  | 20 
>  tests/avocado/machine_s390_ccw_virtio.py |  2 ++
>  2 files changed, 2 insertions(+), 20 deletions(-)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v4 04/35] Bump avocado to 103.0

2024-08-29 Thread Daniel P . Berrangé
On Wed, Aug 21, 2024 at 12:34:45PM +0200, Philippe Mathieu-Daudé wrote:
> On 21/8/24 10:27, Thomas Huth wrote:
> > From: Cleber Rosa 
> > 
> > This bumps Avocado to latest the LTS release.
> > 
> > An LTS release is one that can receive bugfixes and guarantees
> > stability for a much longer period and has incremental minor releases
> > made.
> > 
> > Even though the 103.0 LTS release is pretty a rewrite of Avocado when
> > compared to 88.1, the behavior of all existing tests under
> > tests/avocado has been extensively tested no regression in behavior
> > was found.
> > 
> > To keep behavior of jobs as close as possible with previous version,
> > this version bump keeps the execution serial (maximum of one task at a
> > time being run).
> > 
> > Reference: 
> > https://avocado-framework.readthedocs.io/en/103.0/releases/lts/103_0.html
> > Signed-off-by: Cleber Rosa 
> > Tested-by: Marcin Juszkiewicz 
> > Message-ID: <20240806173119.582857-2-cr...@redhat.com>
> > Signed-off-by: Thomas Huth 
> > ---
> >   pythondeps.toml| 2 +-
> >   tests/Makefile.include | 2 +-
> >   2 files changed, 2 insertions(+), 2 deletions(-)
> 
> I suppose we now need an ultra wide monitor and set the terminal
> COLUMNS to a value >= 512 to not make sense of the spaghetti console
> now (my laptop terminal is 48x150 full screen, I might need new
> eyes). (Compare with commit 44055caaa5 description). See for example
> some lines are 267 columns wide:
> 
> $ make check-avocado AVOCADO_TAGS='machine:ppce500 machine:mpc8544ds'

snip

> Anyway I don't mind much and don't ask/expect anyone to fix that.

Hopefully this will just be a temporary downside, if we eventually get all
tests converted to the new functional test framework.

> 
> On macOS:
> Tested-by: Philippe Mathieu-Daudé 
> 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v5 7/8] machine/nitro-enclave: New machine type for AWS Nitro Enclaves

2024-08-29 Thread Daniel P . Berrangé
On Wed, Aug 28, 2024 at 09:50:25PM +0600, Dorjoy Chowdhury wrote:
> Hi Daniel,
> 
> On Wed, Aug 28, 2024 at 9:39 PM Daniel P. Berrangé  
> wrote:
> >
> > On Thu, Aug 22, 2024 at 09:08:48PM +0600, Dorjoy Chowdhury wrote:
> > > AWS nitro enclaves[1] is an Amazon EC2[2] feature that allows creating
> > > isolated execution environments, called enclaves, from Amazon EC2
> > > instances which are used for processing highly sensitive data. Enclaves
> > > have no persistent storage and no external networking. The enclave VMs
> > > are based on the Firecracker microvm with a vhost-vsock device for
> > > communication with the parent EC2 instance that spawned it and a Nitro
> > > Secure Module (NSM) device for cryptographic attestation. The parent
> > > instance VM always has CID 3 while the enclave VM gets a dynamic CID.
> > >
> > > An EIF (Enclave Image Format)[3] file is used to boot an AWS nitro enclave
> > > virtual machine. This commit adds support for AWS nitro enclave emulation
> > > using a new machine type option '-M nitro-enclave'. This new machine type
> > > is based on the 'microvm' machine type, similar to how real nitro enclave
> > > VMs are based on Firecracker microvm. For nitro-enclave to boot from an
> > > EIF file, the kernel and ramdisk(s) are extracted into a temporary kernel
> > > and a temporary initrd file which are then hooked into the regular x86
> > > boot mechanism along with the extracted cmdline. The EIF file path should
> > > be provided using the '-kernel' QEMU option.
> > >
> > > In QEMU, the vsock emulation for nitro enclave is added using vhost-user-
> > > vsock as opposed to vhost-vsock. vhost-vsock doesn't support sibling VM
> > > communication which is needed for nitro enclaves. So for the vsock
> > > communication to CID 3 to work, another process that does the vsock
> > > emulation in  userspace must be run, for example, vhost-device-vsock[4]
> > > from rust-vmm, with necessary vsock communication support in another
> > > guest VM with CID 3. Using vhost-user-vsock also enables the possibility
> > > to implement some proxying support in the vhost-user-vsock daemon that
> > > will forward all the packets to the host machine instead of CID 3 so
> > > that users of nitro-enclave can run the necessary applications in their
> > > host machine instead of running another whole VM with CID 3. The following
> > > mandatory nitro-enclave machine option has been added related to the
> > > vhost-user-vsock device.
> > >   - 'vsock': The chardev id from the '-chardev' option for the
> > > vhost-user-vsock device.
> > >
> > > AWS Nitro Enclaves have built-in Nitro Secure Module (NSM) device which
> > > has been added using the virtio-nsm device added in a previous commit.
> > > In Nitro Enclaves, all the PCRs start in a known zero state and the first
> > > 16 PCRs are locked from boot and reserved. The PCR0, PCR1, PCR2 and PCR8
> > > contain the SHA384 hashes related to the EIF file used to boot the VM
> > > for validation. The following optional nitro-enclave machine options
> > > have been added related to the NSM device.
> > >   - 'id': Enclave identifier, reflected in the module-id of the NSM
> > > device. If not provided, a default id will be set.
> > >   - 'parent-role': Parent instance IAM role ARN, reflected in PCR3
> > > of the NSM device.
> > >   - 'parent-id': Parent instance identifier, reflected in PCR4 of the
> > > NSM device.
> > >
> > > [1] https://docs.aws.amazon.com/enclaves/latest/user/nitro-enclave.html
> > > [2] https://aws.amazon.com/ec2/
> > > [3] https://github.com/aws/aws-nitro-enclaves-image-format
> > > [4] https://github.com/rust-vmm/vhost-device/tree/main/vhost-device-vsock
> > >
> > > Signed-off-by: Dorjoy Chowdhury 
> > > ---
> > >  MAINTAINERS  |   9 +
> > >  backends/hostmem-memfd.c |   2 -
> > >  configs/devices/i386-softmmu/default.mak |   1 +
> > >  hw/core/machine.c|  71 ++---
> > >  hw/core/meson.build  |   3 +
> > >  hw/i386/Kconfig  |   6 +
> > >  hw/i386/meson.build  |   3 +
> > >  hw/i386/microvm.c|   6 +-
> > >  hw/i386/nitro_enclave.c  | 355 +++
> > >  include/

Re: [PULL 3/6] qemu/osdep: Split qemu_close_all_open_fd() and add fallback

2024-08-29 Thread Daniel P . Berrangé
On Thu, Aug 29, 2024 at 08:47:27AM +1000, Richard Henderson wrote:
> On 8/28/24 22:48, Daniel P. Berrangé wrote:
> > >   dir = opendir("/proc/self/fd");
> > 
> > IIUC from previous threads this is valid on Linux and on Solaris.
> > 
> > On FreeBSD & macOS, you need /dev/fd though.
> 
> Fair, but importantly, it doesn't do anything *incorrect* on those systems: it
> merely skips this method with ENOENT.
> 
> > > +int open_max = sysconf(_SC_OPEN_MAX), i;
> > > +
> > > +/* Fallback */
> > > +for (i = 0; i < open_max; i++) {
> > > +close(i);
> > > +}
> > 
> > I'm told that sysconf(_SC_OPEN_MAX) returns -1 on some versions of
> > macOS. "Luckily" since we assigned to 'int' rather than 'unsigned int'
> > this will result in us not closing any FDs in this fallback path,
> > rather than trying to close several billion FDs (an effective hang).
> 
> Ouch.
> 
> > 
> > If _SC_OPEN_MAX returns -1, we should fallback to the OPEN_MAX
> > constant on macOS (see commit de448e0f26e710e9d2b7fc91393c40ac24b75847
> > which tackled a similar issue wrt getrlimit), and fallback to perhaps
> > a hardcoded 1024 on non-macOS.
> 
> I wish the timing on this had been better -- 25 minutes earlier and I would 
> have delayed rc4.
> 
> Since macOS simply doesn't close fds, I'm of a mind to release 9.1.0 without
> this, and fix it for 9.1.1.  Thoughts?

The original net/tap.c code for closing FDs has the same bug, so in that
area we do NOT have a regression.

The original async teardown code would fail to close FDs as it is looking
for close_range or /proc/$PID/fd, and has no sysconf fallback, so again
no regression there.

IOW, I think this is acceptable to fix in 9.1 stable.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v5 6/8] hw/core: Add Enclave Image Format (EIF) related helpers

2024-08-28 Thread Daniel P . Berrangé
On Thu, Aug 22, 2024 at 09:08:47PM +0600, Dorjoy Chowdhury wrote:
> An EIF (Enclave Image Format)[1] file is used to boot an AWS nitro
> enclave[2] virtual machine. The EIF file contains the necessary kernel,
> cmdline, ramdisk(s) sections to boot.
> 
> Some helper functions have been introduced for extracting the necessary
> sections from an EIF file and then writing them to temporary files as
> well as computing SHA384 hashes from the section data. These will be
> used in the following commit to add support for nitro-enclave machine
> type in QEMU.
> 
> [1] https://github.com/aws/aws-nitro-enclaves-image-format
> [2] https://docs.aws.amazon.com/enclaves/latest/user/nitro-enclave.html
> 
> Signed-off-by: Dorjoy Chowdhury 
> ---
>  hw/core/eif.c | 719 ++
>  hw/core/eif.h |  22 ++
>  2 files changed, 741 insertions(+)
>  create mode 100644 hw/core/eif.c
>  create mode 100644 hw/core/eif.h
> 
> diff --git a/hw/core/eif.c b/hw/core/eif.c
> new file mode 100644
> index 00..2cfd5c911e
> --- /dev/null
> +++ b/hw/core/eif.c
> +static bool get_SHA384_digest(GList *list, uint8_t *digest, Error **errp)
> +{
> +size_t digest_len = QCRYPTO_HASH_DIGEST_LEN_SHA384;
> +size_t list_len = g_list_length(list);
> +struct iovec *iovec_list = g_malloc(list_len * sizeof(struct iovec));

Even if probably harmless in this case, it is best practice to use

   g_new0(struct iovec, list_len)

because glib then checks for integer overflow when doing the
"count * sizeof()" multiplication on your behalf.

> +bool ret = true;
> +GList *l;
> +int i;
> +
> +for (i = 0, l = list; l != NULL; l = l->next, i++) {
> +iovec_list[i] = *(struct iovec *) l->data;
> +}
> +
> +if (qcrypto_hash_bytesv(QCRYPTO_HASH_ALG_SHA384, iovec_list, list_len,
> +&digest, &digest_len, errp) < 0) {
> +ret = false;
> +}
> +
> +g_free(iovec_list);
> +return ret;
> +}
> +

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v5 7/8] machine/nitro-enclave: New machine type for AWS Nitro Enclaves

2024-08-28 Thread Daniel P . Berrangé
On Thu, Aug 22, 2024 at 09:08:48PM +0600, Dorjoy Chowdhury wrote:
> AWS nitro enclaves[1] is an Amazon EC2[2] feature that allows creating
> isolated execution environments, called enclaves, from Amazon EC2
> instances which are used for processing highly sensitive data. Enclaves
> have no persistent storage and no external networking. The enclave VMs
> are based on the Firecracker microvm with a vhost-vsock device for
> communication with the parent EC2 instance that spawned it and a Nitro
> Secure Module (NSM) device for cryptographic attestation. The parent
> instance VM always has CID 3 while the enclave VM gets a dynamic CID.
> 
> An EIF (Enclave Image Format)[3] file is used to boot an AWS nitro enclave
> virtual machine. This commit adds support for AWS nitro enclave emulation
> using a new machine type option '-M nitro-enclave'. This new machine type
> is based on the 'microvm' machine type, similar to how real nitro enclave
> VMs are based on Firecracker microvm. For nitro-enclave to boot from an
> EIF file, the kernel and ramdisk(s) are extracted into a temporary kernel
> and a temporary initrd file which are then hooked into the regular x86
> boot mechanism along with the extracted cmdline. The EIF file path should
> be provided using the '-kernel' QEMU option.
> 
> In QEMU, the vsock emulation for nitro enclave is added using vhost-user-
> vsock as opposed to vhost-vsock. vhost-vsock doesn't support sibling VM
> communication which is needed for nitro enclaves. So for the vsock
> communication to CID 3 to work, another process that does the vsock
> emulation in  userspace must be run, for example, vhost-device-vsock[4]
> from rust-vmm, with necessary vsock communication support in another
> guest VM with CID 3. Using vhost-user-vsock also enables the possibility
> to implement some proxying support in the vhost-user-vsock daemon that
> will forward all the packets to the host machine instead of CID 3 so
> that users of nitro-enclave can run the necessary applications in their
> host machine instead of running another whole VM with CID 3. The following
> mandatory nitro-enclave machine option has been added related to the
> vhost-user-vsock device.
>   - 'vsock': The chardev id from the '-chardev' option for the
> vhost-user-vsock device.
> 
> AWS Nitro Enclaves have built-in Nitro Secure Module (NSM) device which
> has been added using the virtio-nsm device added in a previous commit.
> In Nitro Enclaves, all the PCRs start in a known zero state and the first
> 16 PCRs are locked from boot and reserved. The PCR0, PCR1, PCR2 and PCR8
> contain the SHA384 hashes related to the EIF file used to boot the VM
> for validation. The following optional nitro-enclave machine options
> have been added related to the NSM device.
>   - 'id': Enclave identifier, reflected in the module-id of the NSM
> device. If not provided, a default id will be set.
>   - 'parent-role': Parent instance IAM role ARN, reflected in PCR3
> of the NSM device.
>   - 'parent-id': Parent instance identifier, reflected in PCR4 of the
> NSM device.
> 
> [1] https://docs.aws.amazon.com/enclaves/latest/user/nitro-enclave.html
> [2] https://aws.amazon.com/ec2/
> [3] https://github.com/aws/aws-nitro-enclaves-image-format
> [4] https://github.com/rust-vmm/vhost-device/tree/main/vhost-device-vsock
> 
> Signed-off-by: Dorjoy Chowdhury 
> ---
>  MAINTAINERS  |   9 +
>  backends/hostmem-memfd.c |   2 -
>  configs/devices/i386-softmmu/default.mak |   1 +
>  hw/core/machine.c|  71 ++---
>  hw/core/meson.build  |   3 +
>  hw/i386/Kconfig  |   6 +
>  hw/i386/meson.build  |   3 +
>  hw/i386/microvm.c|   6 +-
>  hw/i386/nitro_enclave.c  | 355 +++
>  include/hw/boards.h  |   2 +
>  include/hw/i386/microvm.h|   2 +
>  include/hw/i386/nitro_enclave.h  |  62 
>  include/sysemu/hostmem.h |   2 +
>  13 files changed, 488 insertions(+), 36 deletions(-)
>  create mode 100644 hw/i386/nitro_enclave.c
>  create mode 100644 include/hw/i386/nitro_enclave.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index da4f698137..aa7846107e 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1877,6 +1877,15 @@ F: hw/i386/microvm.c
>  F: include/hw/i386/microvm.h
>  F: pc-bios/bios-microvm.bin
>  
> +nitro-enclave
> +M: Alexander Graf 
> +M: Dorjoy Chowdhury 
> +S: Maintained
> +F: hw/core/eif.c
> +F: hw/core/eif.h

The eif.c/h files were added in the previous patch, so the entries up to
this line should be added in the previous patch.

> +F: hw/i386/nitro_enclave.c
> +F: include/hw/i386/nitro_enclave.h

These two lines can remain in this patch

>  Machine core
>  M: Eduardo Habkost 
>  M: Marcel Apfelbaum 


> diff --git a/hw/core/meson.build b/hw/core/meson.build
> index a3d9bab9f4..5437a94490 100644
> --- a/hw/core/meson.build
> 

Re: [PATCH v5 4/8] tests/lcitool: Update libvirt-ci and add libcbor dependency

2024-08-28 Thread Daniel P . Berrangé
On Thu, Aug 22, 2024 at 09:08:45PM +0600, Dorjoy Chowdhury wrote:
> libcbor dependecy is necessary for adding virtio-nsm and nitro-enclave
> machine support in the following commits. libvirt-ci has already been
> updated with the dependency upstream and this commit updates libvirt-ci
> submodule in QEMU to latest upstream. Also the libcbor dependency has
> been added to tests/lcitool/projects/qemu.yml.
> 
> Signed-off-by: Dorjoy Chowdhury 
> ---
>  .gitlab-ci.d/cirrus/macos-13.vars | 2 +-
>  .gitlab-ci.d/cirrus/macos-14.vars | 2 +-
>  scripts/ci/setup/ubuntu/ubuntu-2204-aarch64.yaml  | 1 +
>  scripts/ci/setup/ubuntu/ubuntu-2204-s390x.yaml| 1 +
>  tests/docker/dockerfiles/alpine.docker| 1 +
>  tests/docker/dockerfiles/debian-amd64-cross.docker| 1 +
>  tests/docker/dockerfiles/debian-arm64-cross.docker| 1 +
>  tests/docker/dockerfiles/debian-armel-cross.docker| 1 +
>  tests/docker/dockerfiles/debian-armhf-cross.docker| 1 +
>  tests/docker/dockerfiles/debian-i686-cross.docker | 1 +
>  tests/docker/dockerfiles/debian-mips64el-cross.docker | 1 +
>  tests/docker/dockerfiles/debian-mipsel-cross.docker   | 1 +
>  tests/docker/dockerfiles/debian-ppc64el-cross.docker  | 1 +
>  tests/docker/dockerfiles/debian-s390x-cross.docker| 1 +
>  tests/docker/dockerfiles/debian.docker| 1 +
>  tests/docker/dockerfiles/fedora.docker| 1 +
>  tests/docker/dockerfiles/opensuse-leap.docker | 3 ++-
>  tests/docker/dockerfiles/ubuntu2204.docker| 1 +
>  tests/lcitool/libvirt-ci  | 2 +-
>  tests/lcitool/projects/qemu.yml       | 1 +
>  20 files changed, 21 insertions(+), 4 deletions(-)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v5 1/8] crypto: Define macros for hash algorithm digest lengths

2024-08-28 Thread Daniel P . Berrangé
On Thu, Aug 22, 2024 at 09:08:42PM +0600, Dorjoy Chowdhury wrote:
> Signed-off-by: Dorjoy Chowdhury 
> ---
>  crypto/hash.c | 14 +++---
>  include/crypto/hash.h |  8 
>  2 files changed, 15 insertions(+), 7 deletions(-)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v3 2/2] vl: use qmp_device_add() in qemu_create_cli_devices()

2024-08-28 Thread Daniel P . Berrangé
On Tue, Aug 27, 2024 at 03:27:51PM -0400, Stefan Hajnoczi wrote:
> qemu_create_cli_devices() should use qmp_device_add() to match the
> behavior of the QMP monitor. A comment explained that libvirt changes
> implementing strict CLI syntax were needed.
> 
> Peter Krempa  has confirmed that modern libvirt uses
> the same JSON for -device (CLI) and device_add (QMP). Go ahead and use
> qmp_device_add().
> 
> Cc: Peter Krempa 
> Reviewed-by: Markus Armbruster 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  system/vl.c | 14 --
>  1 file changed, 4 insertions(+), 10 deletions(-)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v3 1/2] qdev-monitor: avoid QemuOpts in QMP device_add

2024-08-28 Thread Daniel P . Berrangé
On Tue, Aug 27, 2024 at 03:27:50PM -0400, Stefan Hajnoczi wrote:
> The QMP device_add monitor command converts the QDict arguments to
> QemuOpts and then back again to QDict. This process only supports scalar
> types. Device properties like virtio-blk-pci's iothread-vq-mapping (an
> array of objects) are silently dropped by qemu_opts_from_qdict() during
> the QemuOpts conversion even though QAPI is capable of validating them.
> As a result, hotplugging virtio-blk-pci devices with the
> iothread-vq-mapping property does not work as expected (the property is
> ignored).
> 
> Get rid of the QemuOpts conversion in qmp_device_add() and call
> qdev_device_add_from_qdict() with from_json=true. Using the QMP
> command's QDict arguments directly allows non-scalar properties.
> 
> The HMP is also adjusted since qmp_device_add()'s now expects properly
> typed JSON arguments and cannot be used from HMP anymore. Move the code
> that was previously in qmp_device_add() (with QemuOpts conversion and
> from_json=false) into hmp_device_add() so that its behavior is
> unchanged.
> 
> This patch changes the behavior of QMP device_add but not HMP
> device_add. QMP clients that sent incorrectly typed device_add QMP
> commands no longer work. This is a breaking change but clients should be
> using the correct types already. See the netdev_add QAPIfication in
> commit db2a380c8457 for similar reasoning and object-add in commit
> 9151e59a8b6e. Unlike those commits, we continue to rely on 'gen': false
> for the time being.
> 
> Markus helped me figure this out and even provided a draft patch. The
> code ended up very close to what he suggested.
> 
> Suggested-by: Markus Armbruster 
> Cc: Daniel P. Berrangé 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  system/qdev-monitor.c | 44 ---
>  1 file changed, 29 insertions(+), 15 deletions(-)


Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PULL 3/6] qemu/osdep: Split qemu_close_all_open_fd() and add fallback

2024-08-28 Thread Daniel P . Berrangé
This is already merged, but I have two comments - one improvement
and one bug which we should probably fix before release.

On Mon, Aug 05, 2024 at 10:31:26AM +1000, Richard Henderson wrote:
> From: Clément Léger 
> 
> In order to make it cleaner, split qemu_close_all_open_fd() logic into
> multiple subfunctions (close with close_range(), with /proc/self/fd and
> fallback).
> 
> Signed-off-by: Clément Léger 
> Reviewed-by: Richard Henderson 
> Message-ID: <20240802145423.3232974-3-cle...@rivosinc.com>
> Signed-off-by: Richard Henderson 
> ---
>  util/oslib-posix.c | 50 ++
>  1 file changed, 37 insertions(+), 13 deletions(-)
> 
> diff --git a/util/oslib-posix.c b/util/oslib-posix.c
> index 1e867efa47..9b79fc7cff 100644
> --- a/util/oslib-posix.c
> +++ b/util/oslib-posix.c
> @@ -808,27 +808,16 @@ int qemu_msync(void *addr, size_t length, int fd)
>  return msync(addr, length, MS_SYNC);
>  }
>  
> -/*
> - * Close all open file descriptors.
> - */
> -void qemu_close_all_open_fd(void)
> +static bool qemu_close_all_open_fd_proc(void)
>  {
>  struct dirent *de;
>  int fd, dfd;
>  DIR *dir;
>  
> -#ifdef CONFIG_CLOSE_RANGE
> -int r = close_range(0, ~0U, 0);
> -if (!r) {
> -/* Success, no need to try other ways. */
> -return;
> -}
> -#endif
> -
>  dir = opendir("/proc/self/fd");

IIUC from previous threads this is valid on Linux and on Solaris.

On FreeBSD & macOS, you need /dev/fd though.
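
e.g. a rough (untested) sketch of what I mean, keeping the rest of the
logic unchanged:

    dir = opendir("/proc/self/fd");
    if (!dir) {
        /* Linux-style /proc not available; FreeBSD & macOS expose /dev/fd */
        dir = opendir("/dev/fd");
    }
    if (!dir) {
        return false;
    }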

>  if (!dir) {
>  /* If /proc is not mounted, there is nothing that can be done. */
> -return;
> +return false;
>  }
>  /* Avoid closing the directory. */
>  dfd = dirfd(dir);
> @@ -840,4 +829,39 @@ void qemu_close_all_open_fd(void)
>  }
>  }
>  closedir(dir);
> +
> +return true;
> +}
> +
> +static bool qemu_close_all_open_fd_close_range(void)
> +{
> +#ifdef CONFIG_CLOSE_RANGE
> +int r = close_range(0, ~0U, 0);
> +if (!r) {
> +/* Success, no need to try other ways. */
> +return true;
> +}
> +#endif
> +return false;
> +}
> +
> +static void qemu_close_all_open_fd_fallback(void)
> +{
> +int open_max = sysconf(_SC_OPEN_MAX), i;
> +
> +/* Fallback */
> +for (i = 0; i < open_max; i++) {
> +close(i);
> +}

I'm told that sysconf(_SC_OPEN_MAX) returns -1 on some versions of
macOS. "Luckily" since we assigned to 'int' rather than 'unsigned int'
this will result in us not closing any FDs in this fallback path,
rather than trying to close several billion FDs (an effective hang).

If _SC_OPEN_MAX returns -1, we should fallback to the OPEN_MAX
constant on macOS (see commit de448e0f26e710e9d2b7fc91393c40ac24b75847
which tackled a similar issue wrt getrlimit), and fallback to perhaps
a hardcoded 1024 on non-macOS.
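
i.e. something along these lines (untested sketch; I'm assuming
CONFIG_DARWIN is the right conditional to key off here):

    static void qemu_close_all_open_fd_fallback(void)
    {
        int open_max = sysconf(_SC_OPEN_MAX), i;

        if (open_max < 0) {
    #ifdef CONFIG_DARWIN
            open_max = OPEN_MAX;   /* from <limits.h> on macOS */
    #else
            open_max = 1024;       /* arbitrary fallback */
    #endif
        }
        for (i = 0; i < open_max; i++) {
            close(i);
        }
    }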


> +}
> +
> +/*
> + * Close all open file descriptors.
> + */
> +void qemu_close_all_open_fd(void)
> +{
> +if (!qemu_close_all_open_fd_close_range() &&
> +!qemu_close_all_open_fd_proc()) {
> +qemu_close_all_open_fd_fallback();
> +}
>  }
> -- 
> 2.43.0
> 
> 

With regards,
Daniel

[1] https://github.com/open-mpi/ompi/issues/10358
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH RESEND v9 3/9] configure, meson: detect Rust toolchain

2024-08-28 Thread Daniel P . Berrangé
On Wed, Aug 28, 2024 at 07:11:44AM +0300, Manos Pitsidianakis wrote:
> From: Paolo Bonzini 
> 
> Include the correct path and arguments to rustc in the native
> and cross files (native compilation is needed for procedural
> macros).
> 
> Signed-off-by: Paolo Bonzini 
> ---
>  configure   | 50 --
>  meson.build |  8 +++-
>  2 files changed, 51 insertions(+), 7 deletions(-)
> 
> diff --git a/configure b/configure
> index 
> 019fcbd0ef7b07e7b0280b358099cae72c73aa98..9ef6005c557fc627c7c6c732b4c92ed1b934f474
>  100755
> --- a/configure
> +++ b/configure
> @@ -207,6 +207,8 @@ for opt do
>;;
>--objcc=*) objcc="$optarg"
>;;
> +  --rustc=*) RUSTC="$optarg"
> +  ;;
>--cpu=*) cpu="$optarg"
>;;
>--extra-cflags=*)
> @@ -252,6 +254,9 @@ python=
>  download="enabled"
>  skip_meson=no
>  use_containers="yes"
> +# do not enable by default because cross compilation requires 
> --rust-target-triple
> +rust="disabled"
> +rust_target_triple=""
>  gdb_bin=$(command -v "gdb-multiarch" || command -v "gdb")
>  gdb_arches=""
>  
> @@ -317,6 +322,8 @@ windmc="${WINDMC-${cross_prefix}windmc}"
>  pkg_config="${PKG_CONFIG-${cross_prefix}pkg-config}"
>  sdl2_config="${SDL2_CONFIG-${cross_prefix}sdl2-config}"
>  
> +rustc="${RUSTC-rustc}"
> +
>  check_define() {
>  cat > $TMPC <  #if !defined($1)
> @@ -636,6 +643,8 @@ for opt do
>;;
>--objcc=*)
>;;
> +  --rustc=*)
> +  ;;
>--make=*)
>;;
>--install=*)
> @@ -755,8 +764,14 @@ for opt do
>;;
>--container-engine=*) container_engine="$optarg"
>;;
> +  --rust-target-triple=*) rust_target_triple="$optarg"
> +  ;;
>--gdb=*) gdb_bin="$optarg"
>;;
> +  --enable-rust) rust=enabled
> +  ;;
> +  --disable-rust) rust=disabled
> +  ;;
># everything else has the same name in configure and meson
>--*) meson_option_parse "$opt" "$optarg"
>;;
> @@ -859,6 +874,7 @@ Advanced options (experts only):
> at build time [$host_cc]
>--cxx=CXXuse C++ compiler CXX [$cxx]
>--objcc=OBJCCuse Objective-C compiler OBJCC [$objcc]
> +  --rustc=RUSTCuse Rust compiler RUSTC [$rustc]
>--extra-cflags=CFLAGSappend extra C compiler flags CFLAGS
>--extra-cxxflags=CXXFLAGS append extra C++ compiler flags CXXFLAGS
>--extra-objcflags=OBJCFLAGS append extra Objective C compiler flags 
> OBJCFLAGS
> @@ -869,8 +885,9 @@ Advanced options (experts only):
>--python=PYTHON  use specified python [$python]
>--ninja=NINJAuse specified ninja [$ninja]
>--static enable static build [$static]
> -  --without-default-features default all --enable-* options to "disabled"
> -  --without-default-devices  do not include any device that is not needed to
> +  --rust-target-triple=TRIPLE  target for Rust cross compilation
> +  --without-default-features   default all --enable-* options to "disabled"
> +  --without-default-devicesdo not include any device that is not needed 
> to
> start the emulator (only use if you are including
> desired devices in configs/devices/)
>--with-devices-ARCH=NAME override default configs/devices
> @@ -1139,6 +1156,21 @@ EOF
>  fi
>  
>  ##
> +# detect rust triple
> +
> +if test "$rust" != disabled && has "$rustc" && $rustc -vV > 
> "${TMPDIR1}/${TMPB}.out"; then
> +  rust_host_triple=$(sed -n 's/^host: //p' "${TMPDIR1}/${TMPB}.out")
> +else
> +  if test "$rust" = enabled; then
> +error_exit "could not execute rustc binary \"$rustc\""
> +  fi
> +  rust=disabled
> +fi
> +if test "$rust" != disabled && test -z "$rust_target_triple"; then
> +  rust_target_triple=$rust_host_triple
> +fi

Defaulting to the $rust_host_triple is incorrect when QEMU has been
told to build for a non-host target.

Either we need todo the right thing and auto-set rust target based
on QEMU target (preferred), or we need to make it a fatal error
when rust target is omitted & QEMU is building a non-host target.


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v5 01/16] meson: Add optional dependency on IGVM library

2024-08-19 Thread Daniel P . Berrangé
On Tue, Aug 13, 2024 at 04:01:03PM +0100, Roy Hopkins wrote:
> The IGVM library allows Independent Guest Virtual Machine files to be
> parsed and processed. IGVM files are used to configure guest memory
> layout, initial processor state and other configuration pertaining to
> secure virtual machines.
> 
> This adds the --enable-igvm configure option, enabled by default, which
> attempts to locate and link against the IGVM library via pkgconfig and
> sets CONFIG_IGVM if found.
> 
> The library is added to the system_ss target in backends/meson.build
> where the IGVM parsing will be performed by the ConfidentialGuestSupport
> object.
> 
> Signed-off-by: Roy Hopkins 
> Acked-by: Michael S. Tsirkin 
> ---
>  backends/meson.build  | 3 +++
>  meson.build   | 8 
>  meson_options.txt | 2 ++
>  scripts/meson-buildoptions.sh | 3 +++
>  4 files changed, 16 insertions(+)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v4 4/6] machine/nitro-enclave: Add built-in Nitro Secure Module device

2024-08-19 Thread Daniel P . Berrangé
On Mon, Aug 19, 2024 at 10:07:02PM +0600, Dorjoy Chowdhury wrote:
> On Mon, Aug 19, 2024 at 9:53 PM Daniel P. Berrangé  
> wrote:
> >
> > On Mon, Aug 19, 2024 at 09:32:55PM +0600, Dorjoy Chowdhury wrote:
> > > On Mon, Aug 19, 2024 at 4:13 PM Alexander Graf  wrote:
> > > >
> > > > Hey Dorjoy,
> > > >
> > > > On 18.08.24 13:42, Dorjoy Chowdhury wrote:
> > > > > AWS Nitro Enclaves have built-in Nitro Secure Module (NSM) device 
> > > > > which
> > > > > is used for stripped down TPM functionality like attestation. This 
> > > > > commit
> > > > > adds the built-in NSM device in the nitro-enclave machine type.
> > > > >
> > > > > In Nitro Enclaves, all the PCRs start in a known zero state and the 
> > > > > first
> > > > > 16 PCRs are locked from boot and reserved. The PCR0, PCR1, PCR2 and 
> > > > > PCR8
> > > > > contain the SHA384 hashes related to the EIF file used to boot the
> > > > > VM for validation.
> > > > >
> > > > > Some optional nitro-enclave machine options have been added:
> > > > >  - 'id': Enclave identifier, reflected in the module-id of the NSM
> > > > > device. If not provided, a default id will be set.
> > > > >  - 'parent-role': Parent instance IAM role ARN, reflected in PCR3
> > > > > of the NSM device.
> > > > >  - 'parent-id': Parent instance identifier, reflected in PCR4 of 
> > > > > the
> > > > > NSM device.
> > > > >
> > > > > Signed-off-by: Dorjoy Chowdhury 
> > > > > ---
> > > > >   crypto/meson.build  |   2 +-
> > > > >   crypto/x509-utils.c |  73 +++
> > > >
> > > >
> > > > Can you please put this new API into its own patch file?
> > > >
> > > >
> > > > >   hw/core/eif.c   | 225 
> > > > > +---
> > > > >   hw/core/eif.h   |   5 +-
> > > >
> > > >
> > > > These changes to eif.c should ideally already be part of the patch that
> > > > introduces eif.c (patch 1), no? In fact, do you think you can make the
> > > > whole eif logic its own patch file?
> > > >
> > > >
> > > > >   hw/core/meson.build |   4 +-
> > > > >   hw/i386/Kconfig |   1 +
> > > > >   hw/i386/nitro_enclave.c | 141 +++-
> > > > >   include/crypto/x509-utils.h |  22 
> > > > >   include/hw/i386/nitro_enclave.h |  26 
> > > > >   9 files changed, 479 insertions(+), 20 deletions(-)
> > > > >   create mode 100644 crypto/x509-utils.c
> > > > >   create mode 100644 include/crypto/x509-utils.h
> > > > >
> > > > > diff --git a/crypto/meson.build b/crypto/meson.build
> > > > > index c46f9c22a7..09633194ed 100644
> > > > > --- a/crypto/meson.build
> > > > > +++ b/crypto/meson.build
> > > > > @@ -62,7 +62,7 @@ endif
> > > > >   if gcrypt.found()
> > > > > util_ss.add(gcrypt, files('random-gcrypt.c'))
> > > > >   elif gnutls.found()
> > > > > -  util_ss.add(gnutls, files('random-gnutls.c'))
> > > > > +  util_ss.add(gnutls, files('random-gnutls.c', 'x509-utils.c'))
> > > >
> > > >
> > > > What if we don't have gnutls. Will everything still compile or do we
> > > > need to add any dependencies?
> > > >
> > > >
> > >
> > > [...]
> > >
> > > > >
> > > > > diff --git a/hw/core/meson.build b/hw/core/meson.build
> > > > > index f32d1ad943..8dc4552e35 100644
> > > > > --- a/hw/core/meson.build
> > > > > +++ b/hw/core/meson.build
> > > > > @@ -12,6 +12,8 @@ hwcore_ss.add(files(
> > > > > 'qdev-clock.c',
> > > > >   ))
> > > > >
> > > > > +libcbor = dependency('libcbor', version: '>=0.7.0')
> > > > > +
> > > > >   common_ss.add(files('cpu-common.c'))
> > > > >   common_ss.add(files('machine-smp.c'))
> > > > >   s

Re: [PATCH v4 4/6] machine/nitro-enclave: Add built-in Nitro Secure Module device

2024-08-19 Thread Daniel P . Berrangé
On Mon, Aug 19, 2024 at 09:32:55PM +0600, Dorjoy Chowdhury wrote:
> On Mon, Aug 19, 2024 at 4:13 PM Alexander Graf  wrote:
> >
> > Hey Dorjoy,
> >
> > On 18.08.24 13:42, Dorjoy Chowdhury wrote:
> > > AWS Nitro Enclaves have built-in Nitro Secure Module (NSM) device which
> > > is used for stripped down TPM functionality like attestation. This commit
> > > adds the built-in NSM device in the nitro-enclave machine type.
> > >
> > > In Nitro Enclaves, all the PCRs start in a known zero state and the first
> > > 16 PCRs are locked from boot and reserved. The PCR0, PCR1, PCR2 and PCR8
> > > contain the SHA384 hashes related to the EIF file used to boot the
> > > VM for validation.
> > >
> > > Some optional nitro-enclave machine options have been added:
> > >  - 'id': Enclave identifier, reflected in the module-id of the NSM
> > > device. If not provided, a default id will be set.
> > >  - 'parent-role': Parent instance IAM role ARN, reflected in PCR3
> > > of the NSM device.
> > >  - 'parent-id': Parent instance identifier, reflected in PCR4 of the
> > > NSM device.
> > >
> > > Signed-off-by: Dorjoy Chowdhury 
> > > ---
> > >   crypto/meson.build  |   2 +-
> > >   crypto/x509-utils.c |  73 +++
> >
> >
> > Can you please put this new API into its own patch file?
> >
> >
> > >   hw/core/eif.c   | 225 +---
> > >   hw/core/eif.h   |   5 +-
> >
> >
> > These changes to eif.c should ideally already be part of the patch that
> > introduces eif.c (patch 1), no? In fact, do you think you can make the
> > whole eif logic its own patch file?
> >
> >
> > >   hw/core/meson.build |   4 +-
> > >   hw/i386/Kconfig |   1 +
> > >   hw/i386/nitro_enclave.c | 141 +++-
> > >   include/crypto/x509-utils.h |  22 
> > >   include/hw/i386/nitro_enclave.h |  26 
> > >   9 files changed, 479 insertions(+), 20 deletions(-)
> > >   create mode 100644 crypto/x509-utils.c
> > >   create mode 100644 include/crypto/x509-utils.h
> > >
> > > diff --git a/crypto/meson.build b/crypto/meson.build
> > > index c46f9c22a7..09633194ed 100644
> > > --- a/crypto/meson.build
> > > +++ b/crypto/meson.build
> > > @@ -62,7 +62,7 @@ endif
> > >   if gcrypt.found()
> > > util_ss.add(gcrypt, files('random-gcrypt.c'))
> > >   elif gnutls.found()
> > > -  util_ss.add(gnutls, files('random-gnutls.c'))
> > > +  util_ss.add(gnutls, files('random-gnutls.c', 'x509-utils.c'))
> >
> >
> > What if we don't have gnutls. Will everything still compile or do we
> > need to add any dependencies?
> >
> >
> 
> [...]
> 
> > >
> > > diff --git a/hw/core/meson.build b/hw/core/meson.build
> > > index f32d1ad943..8dc4552e35 100644
> > > --- a/hw/core/meson.build
> > > +++ b/hw/core/meson.build
> > > @@ -12,6 +12,8 @@ hwcore_ss.add(files(
> > > 'qdev-clock.c',
> > >   ))
> > >
> > > +libcbor = dependency('libcbor', version: '>=0.7.0')
> > > +
> > >   common_ss.add(files('cpu-common.c'))
> > >   common_ss.add(files('machine-smp.c'))
> > >   system_ss.add(when: 'CONFIG_FITLOADER', if_true: files('loader-fit.c'))
> > > @@ -24,7 +26,7 @@ system_ss.add(when: 'CONFIG_REGISTER', if_true: 
> > > files('register.c'))
> > >   system_ss.add(when: 'CONFIG_SPLIT_IRQ', if_true: files('split-irq.c'))
> > >   system_ss.add(when: 'CONFIG_XILINX_AXI', if_true: files('stream.c'))
> > >   system_ss.add(when: 'CONFIG_PLATFORM_BUS', if_true: 
> > > files('sysbus-fdt.c'))
> > > -system_ss.add(when: 'CONFIG_NITRO_ENCLAVE', if_true: [files('eif.c'), 
> > > zlib])
> > > +system_ss.add(when: 'CONFIG_NITRO_ENCLAVE', if_true: [files('eif.c'), 
> > > zlib, libcbor, gnutls])
> >
> >
> > Ah, you add the gnutls dependency here. Great! However, this means we
> > now make gnutls (and libcbor) a mandatory dependency for the default
> > configuration. Does configure know about that? I believe before gnutls
> > was optional, right?
> >
> 
> I see gnutls is not a required dependency in the root meson.build. I
> am not sure what we should do here.
> 
> Hey Daniel, do you have any suggestions about how this dependency
> should be included?

Unconditionally build the crypto/x509-utils.c file, but in that file put
#ifdef CONFIG_GNUTLS around the real code, and in the #else put a stub impl
of the method that just calls error_setg().

That way you can compile everything without any hard dep on gnutls,
but if someone tries to use it they'll get a runtime error when
gnutls is not built.
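
i.e. roughly (sketch only, exact error wording up to you):

    #ifdef CONFIG_GNUTLS
    /* ... the gnutls-based implementation from this patch ... */
    #else
    int qcrypto_get_x509_cert_fingerprint(uint8_t *cert, size_t size,
                                          QCryptoHashAlgorithm alg,
                                          uint8_t **result,
                                          size_t *resultlen,
                                          Error **errp)
    {
        error_setg(errp, "X.509 certificate support requires GNUTLS");
        return -1;
    }
    #endif /* CONFIG_GNUTLS */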


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH] crypto/tlscredspsk: Free username on finalize

2024-08-19 Thread Daniel P . Berrangé
On Mon, Aug 19, 2024 at 03:50:21PM +0100, Peter Maydell wrote:
> When the creds->username property is set we allocate memory
> for it in qcrypto_tls_creds_psk_prop_set_username(), but
> we never free this when the QCryptoTLSCredsPSK is destroyed.
> Free the memory in finalize.
> 
> This fixes a LeakSanitizer complaint in migration-test:
> 
> $ (cd build/asan; ASAN_OPTIONS="fast_unwind_on_malloc=0" 
> QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test --tap -k 
> -p /x86_64/migration/precopy/unix/tls/psk)
> 
> =
> ==3867512==ERROR: LeakSanitizer: detected memory leaks
> 
> Direct leak of 5 byte(s) in 1 object(s) allocated from:
> #0 0x5624e5c99dee in malloc 
> (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x218edee)
>  (BuildId: a9e623fa1009a9435c0142c037cd7b8c1ad04ce3)
> #1 0x7fb199ae9738 in g_malloc debian/build/deb/../../../glib/gmem.c:128:13
> #2 0x7fb199afe583 in g_strdup 
> debian/build/deb/../../../glib/gstrfuncs.c:361:17
> #3 0x5624e82ea919 in qcrypto_tls_creds_psk_prop_set_username 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../crypto/tlscredspsk.c:255:23
> #4 0x5624e812c6b5 in property_set_str 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object.c:2277:5
> #5 0x5624e8125ce5 in object_property_set 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object.c:1463:5
> #6 0x5624e8136e7c in object_set_properties_from_qdict 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object_interfaces.c:55:14
> #7 0x5624e81372d2 in user_creatable_add_type 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object_interfaces.c:112:5
> #8 0x5624e8137964 in user_creatable_add_qapi 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object_interfaces.c:157:11
> #9 0x5624e891ba3c in qmp_object_add 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/qom-qmp-cmds.c:227:5
> #10 0x5624e8af9118 in qmp_marshal_object_add 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qapi/qapi-commands-qom.c:337:5
> #11 0x5624e8bd1d49 in do_qmp_dispatch_bh 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qapi/qmp-dispatch.c:128:5
> #12 0x5624e8cb2531 in aio_bh_call 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/async.c:171:5
> #13 0x5624e8cb340c in aio_bh_poll 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/async.c:218:13
> #14 0x5624e8c0be98 in aio_dispatch 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/aio-posix.c:423:5
> #15 0x5624e8cba3ce in aio_ctx_dispatch 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/async.c:360:5
> #16 0x7fb199ae0d3a in g_main_dispatch 
> debian/build/deb/../../../glib/gmain.c:3419:28
> #17 0x7fb199ae0d3a in g_main_context_dispatch 
> debian/build/deb/../../../glib/gmain.c:4137:7
> #18 0x5624e8cbe1d9 in glib_pollfds_poll 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/main-loop.c:287:9
> #19 0x5624e8cbcb13 in os_host_main_loop_wait 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/main-loop.c:310:5
> #20 0x5624e8cbc6dc in main_loop_wait 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/main-loop.c:589:11
> #21 0x5624e6f3f917 in qemu_main_loop 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../system/runstate.c:801:9
> #22 0x5624e893379c in qemu_default_main 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../system/main.c:37:14
> #23 0x5624e89337e7 in main 
> /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../system/main.c:48:12
> #24 0x7fb197972d8f in __libc_start_call_main 
> csu/../sysdeps/nptl/libc_start_call_main.h:58:16
> #25 0x7fb197972e3f in __libc_start_main csu/../csu/libc-start.c:392:3
> #26 0x5624e5c16fa4 in _start 
> (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x210bfa4)
>  (BuildId: a9e623fa1009a9435c0142c037cd7b8c1ad04ce3)
> 
> SUMMARY: AddressSanitizer: 5 byte(s) leaked in 1 allocation(s).
> 
> Cc: qemu-sta...@nongnu.org
> Signed-off-by: Peter Maydell 
> ---
> Found this playing around with the address-sanitizer and running
> "make check".  I guess this is stable material but maybe not
> important enough to go into 9.1 at this point in the cycle, since the
> bug has been present since the code was first written in 2018.

The memory leak is low impact since credentials either live for the
entire QEMU lifetime, or sometimes are created & deleted on the fly
for infrequent operations.

Re: [PATCH v4 3/6] device/virtio-nsm: Support for Nitro Secure Module device

2024-08-19 Thread Daniel P . Berrangé
On Sun, Aug 18, 2024 at 05:42:54PM +0600, Dorjoy Chowdhury wrote:
> Nitro Secure Module (NSM)[1] device is used in AWS Nitro Enclaves for
> stripped down TPM functionality like cryptographic attestation. The
> requests to and responses from NSM device are CBOR[2] encoded.
> 
> This commit adds support for NSM device in QEMU. Although related to
> AWS Nitro Enclaves, the virito-nsm device is independent and can be
> used in other machine types as well. The libcbor[3] library has been
> used for the CBOR encoding and decoding functionalities.
> 
> [1] https://lists.oasis-open.org/archives/virtio-comment/202310/msg00387.html
> [2] http://cbor.io/
> [3] https://libcbor.readthedocs.io/en/latest/
> 
> Signed-off-by: Dorjoy Chowdhury 
> ---
>  MAINTAINERS  |   10 +
>  hw/virtio/Kconfig|5 +
>  hw/virtio/cbor-helpers.c |  292 ++
>  hw/virtio/meson.build|4 +
>  hw/virtio/virtio-nsm-pci.c   |   73 ++
>  hw/virtio/virtio-nsm.c   | 1648 ++
>  include/hw/virtio/cbor-helpers.h |   43 +
>  include/hw/virtio/virtio-nsm.h   |   59 ++
>  8 files changed, 2134 insertions(+)
>  create mode 100644 hw/virtio/cbor-helpers.c
>  create mode 100644 hw/virtio/virtio-nsm-pci.c
>  create mode 100644 hw/virtio/virtio-nsm.c
>  create mode 100644 include/hw/virtio/cbor-helpers.h
>  create mode 100644 include/hw/virtio/virtio-nsm.h
> 

> diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build
> index 621fc65454..7ccb61cf74 100644
> --- a/hw/virtio/meson.build
> +++ b/hw/virtio/meson.build
> @@ -48,12 +48,15 @@ else
>system_virtio_ss.add(files('vhost-stub.c'))
>  endif
>  
> +libcbor = dependency('libcbor', version: '>=0.7.0')

In the following patch I suggest that we should have this at the top level
meson.build. Now that I see you're referencing this twice in different
places, we definitely want it in the top meson.build.

Also the CI changes I mention in the following patch should be in whatever
patch first introduces the libcbor dependency.



> diff --git a/hw/virtio/virtio-nsm.c b/hw/virtio/virtio-nsm.c
> new file mode 100644
> index 00..e91848a2b0
> --- /dev/null
> +++ b/hw/virtio/virtio-nsm.c

> +static bool add_protected_header_to_cose(cbor_item_t *cose)
> +{
> +cbor_item_t *map = NULL;
> +cbor_item_t *key = NULL;
> +cbor_item_t *value = NULL;
> +cbor_item_t *bs = NULL;
> +size_t len;
> +bool r = false;
> +size_t buf_len = 4096;
> +uint8_t *buf = g_malloc(buf_len);

g_autofree avoids the manual  g_free call.
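
i.e.:

    g_autofree uint8_t *buf = g_malloc(buf_len);

and then the explicit g_free(buf) before returning can be dropped.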

> +
> +map = cbor_new_definite_map(1);
> +if (!map) {
> +goto cleanup;
> +}
> +key = cbor_build_uint8(1);
> +if (!key) {
> +goto cleanup;
> +}
> +value = cbor_new_int8();
> +if (!value) {
> +goto cleanup;
> +}
> +cbor_mark_negint(value);
> +/* we don't actually sign the data, so we use -1 as the 'alg' value */
> +cbor_set_uint8(value, 0);
> +
> +if (!qemu_cbor_map_add(map, key, value)) {
> +goto cleanup;
> +}
> +
> +len = cbor_serialize(map, buf, buf_len);
> +if (len == 0) {
> +goto cleanup_map;
> +}
> +
> +bs = cbor_build_bytestring(buf, len);
> +if (!bs) {
> +goto cleanup_map;
> +}
> +if (!qemu_cbor_array_push(cose, bs)) {
> +cbor_decref(&bs);
> +goto cleanup_map;
> +}
> +r = true;
> +goto cleanup_map;
> +
> + cleanup:
> +if (key) {
> +cbor_decref(&key);
> +}
> +if (value) {
> +cbor_decref(&value);
> +}
> +
> + cleanup_map:
> +if (map) {
> +cbor_decref(&map);
> +}
> +g_free(buf);
> +return r;
> +}
> +



> +static bool handle_Attestation(VirtIONSM *vnsm, struct iovec *request,
> +   struct iovec *response, Error **errp)
> +{
> +cbor_item_t *root = NULL;
> +cbor_item_t *cose = NULL;
> +cbor_item_t *nested_map;
> +size_t len;
> +NSMAttestationReq nsm_req;
> +enum NSMResponseTypes type;
> +bool r = false;
> +size_t buf_len = 16384;
> +uint8_t *buf = g_malloc(buf_len);

Another suitable for g_autofree

> +
> +type = get_nsm_attestation_req(request->iov_base, request->iov_len,
> +   &nsm_req);
> +if (type != NSM_SUCCESS) {
> +if (error_response(response, type, errp)) {
> +r = true;
> +}
> +goto out;
> +}
> +
> +cose = cbor_new_definite_array(4);
> +if (!cose) {
> +goto err;
> +}
> +if (!add_protected_header_to_cose(cose)) {
> +goto err;
> +}
> +if (!add_unprotected_header_to_cose(cose)) {
> +goto err;
> +}
> +
> +if (nsm_req.public_key_len > 0) {
> +memcpy(vnsm->public_key, nsm_req.public_key, nsm_req.public_key_len);
> +vnsm->public_key_len = nsm_req.public_key_len;
> +}
> +if (nsm_req.user_data_len > 0) {
> +memcpy(vnsm->user_data, nsm

Re: [PATCH v4 4/6] machine/nitro-enclave: Add built-in Nitro Secure Module device

2024-08-19 Thread Daniel P . Berrangé
On Sun, Aug 18, 2024 at 05:42:55PM +0600, Dorjoy Chowdhury wrote:
> AWS Nitro Enclaves have built-in Nitro Secure Module (NSM) device which
> is used for stripped down TPM functionality like attestation. This commit
> adds the built-in NSM device in the nitro-enclave machine type.
> 
> In Nitro Enclaves, all the PCRs start in a known zero state and the first
> 16 PCRs are locked from boot and reserved. The PCR0, PCR1, PCR2 and PCR8
> contain the SHA384 hashes related to the EIF file used to boot the
> VM for validation.
> 
> Some optional nitro-enclave machine options have been added:
> - 'id': Enclave identifier, reflected in the module-id of the NSM
> device. If not provided, a default id will be set.
> - 'parent-role': Parent instance IAM role ARN, reflected in PCR3
> of the NSM device.
> - 'parent-id': Parent instance identifier, reflected in PCR4 of the
> NSM device.
> 
> Signed-off-by: Dorjoy Chowdhury 
> ---
>  crypto/meson.build  |   2 +-
>  crypto/x509-utils.c |  73 +++
>  include/crypto/x509-utils.h |  22 

Preferably add these 3 in a standalone commit, since it is good practice
to separate commits adding infra from commits adding usage of infra.

>  hw/core/eif.c   | 225 +---
>  hw/core/eif.h   |   5 +-
>  hw/core/meson.build |   4 +-
>  hw/i386/Kconfig |   1 +
>  hw/i386/nitro_enclave.c | 141 +++-
>  include/hw/i386/nitro_enclave.h |  26 
>  9 files changed, 479 insertions(+), 20 deletions(-)
>  create mode 100644 crypto/x509-utils.c
>  create mode 100644 include/crypto/x509-utils.h
> 
> diff --git a/crypto/meson.build b/crypto/meson.build
> index c46f9c22a7..09633194ed 100644
> --- a/crypto/meson.build
> +++ b/crypto/meson.build
> @@ -62,7 +62,7 @@ endif
>  if gcrypt.found()
>util_ss.add(gcrypt, files('random-gcrypt.c'))
>  elif gnutls.found()
> -  util_ss.add(gnutls, files('random-gnutls.c'))
> +  util_ss.add(gnutls, files('random-gnutls.c', 'x509-utils.c'))
>  elif get_option('rng_none')
>util_ss.add(files('random-none.c'))
>  else

This logic block is handling preferences for the RNG impl.

We want to be compiling x509-utils.c *anytime* gnutls is
found, regardless of what we prioritize for RNG backend.
Also it should be added to crypto_ss, not util_ss.

So put this as its own standalone block

  if gnutls.found()
crypto_ss.add(files('x509-utils.c'))
  endif

> diff --git a/crypto/x509-utils.c b/crypto/x509-utils.c
> new file mode 100644
> index 00..2422eb995c
> --- /dev/null
> +++ b/crypto/x509-utils.c
> @@ -0,0 +1,73 @@
> +/*
> + * X.509 certificate related helpers
> + *
> + * Copyright (c) 2024 Dorjoy Chowdhury 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> + * (at your option) any later version.  See the COPYING file in the
> + * top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "crypto/x509-utils.h"
> +#include 
> +#include 
> +#include 
> +
> +static int qcrypto_to_gnutls_hash_alg_map[QCRYPTO_HASH_ALG__MAX] = {

Can make this 'const' too

> +[QCRYPTO_HASH_ALG_MD5] = GNUTLS_DIG_MD5,
> +[QCRYPTO_HASH_ALG_SHA1] = GNUTLS_DIG_SHA1,
> +[QCRYPTO_HASH_ALG_SHA224] = GNUTLS_DIG_SHA224,
> +[QCRYPTO_HASH_ALG_SHA256] = GNUTLS_DIG_SHA256,
> +[QCRYPTO_HASH_ALG_SHA384] = GNUTLS_DIG_SHA384,
> +[QCRYPTO_HASH_ALG_SHA512] = GNUTLS_DIG_SHA512,
> +[QCRYPTO_HASH_ALG_RIPEMD160] = GNUTLS_DIG_RMD160,
> +};
> +
> +int qcrypto_get_x509_cert_fingerprint(uint8_t *cert, size_t size,
> +  QCryptoHashAlgorithm alg,
> +  uint8_t **result,
> +  size_t *resultlen,
> +  Error **errp)
> +{
> +int ret;
> +gnutls_x509_crt_t crt;
> +gnutls_datum_t datum = {.data = cert, .size = size};

Assign '*result = NULL' and '*resultlen = 0' here at the start, so we
have clear semantics on failure.


> +
> +if (alg >= G_N_ELEMENTS(qcrypto_to_gnutls_hash_alg_map)) {
> +error_setg(errp, "Unknown hash algorithm");
> +return -1;
> +}
> +
> +gnutls_x509_crt_init(&crt);
> +
> +if (gnutls_x509_crt_import(crt, &datum, GNUTLS_X509_FMT_PEM) != 0) {
> +error_setg(errp, "Failed to import certificate");
> +goto cleanup;
> +}
> +
> +ret = gnutls_hash_get_len(qcrypto_to_gnutls_hash_alg_map[alg]);
> +if (*resultlen == 0) {
> +*resultlen = ret;
> +*result = g_new0(uint8_t, *resultlen);
> +} else if (*resultlen < ret) {
> +error_setg(errp,
> +   "Result buffer size %zu is smaller than hash %d",
> +   *resultlen, ret);
> +goto cleanup;
> +}
> +
> +if (gnutls_x509_crt_get_fingerprint(crt,
> +qcrypto_to_gnutls_hash_alg_map[alg],
> + 

Re: [PATCH v4 5/6] crypto: Support SHA384 hash when using glib

2024-08-19 Thread Daniel P . Berrangé
On Sun, Aug 18, 2024 at 05:42:56PM +0600, Dorjoy Chowdhury wrote:
> QEMU requires minimum glib version 2.66.0 as per the root meson.build
> file and per glib documentation[1] G_CHECKSUM_SHA384 is available since
> 2.51.
> 
> [1] https://docs.gtk.org/glib/enum.ChecksumType.html
> 
> Signed-off-by: Dorjoy Chowdhury 
> ---
>  crypto/hash-glib.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Daniel P. Berrangé 

> diff --git a/crypto/hash-glib.c b/crypto/hash-glib.c
> index 82de9db705..18e64faa9c 100644
> --- a/crypto/hash-glib.c
> +++ b/crypto/hash-glib.c
> @@ -29,7 +29,7 @@ static int qcrypto_hash_alg_map[QCRYPTO_HASH_ALG__MAX] = {
>  [QCRYPTO_HASH_ALG_SHA1] = G_CHECKSUM_SHA1,
>  [QCRYPTO_HASH_ALG_SHA224] = -1,
>  [QCRYPTO_HASH_ALG_SHA256] = G_CHECKSUM_SHA256,
> -[QCRYPTO_HASH_ALG_SHA384] = -1,
> +[QCRYPTO_HASH_ALG_SHA384] = G_CHECKSUM_SHA384,
>  [QCRYPTO_HASH_ALG_SHA512] = G_CHECKSUM_SHA512,
>  [QCRYPTO_HASH_ALG_RIPEMD160] = -1,
>  };

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH V2 00/11] Live update: cpr-exec

2024-08-16 Thread Daniel P . Berrangé
On Fri, Aug 16, 2024 at 12:17:30PM -0400, Peter Xu wrote:
> On Fri, Aug 16, 2024 at 05:00:32PM +0100, Daniel P. Berrangé wrote:
> > On Fri, Aug 16, 2024 at 11:34:10AM -0400, Peter Xu wrote:
> > > On Fri, Aug 16, 2024 at 04:16:50PM +0100, Daniel P. Berrangé wrote:
> > > > On Fri, Aug 16, 2024 at 11:06:10AM -0400, Peter Xu wrote:
> > > > > On Thu, Aug 15, 2024 at 04:55:20PM -0400, Steven Sistare wrote:
> > > > > > On 8/13/2024 3:46 PM, Peter Xu wrote:
> > > > > > > On Tue, Aug 06, 2024 at 04:56:18PM -0400, Steven Sistare wrote:
> > > > > > > > > The flipside, however, is that localhost migration via 2 
> > > > > > > > > separate QEMU
> > > > > > > > > processes has issues where both QEMUs want to be opening the 
> > > > > > > > > very same
> > > > > > > > > file, and only 1 of them can ever have them open.
> > > > > > > 
> > > > > > > I thought we used to have similar issue with block devices, but I 
> > > > > > > assume
> > > > > > > it's solved for years (and whoever owns it will take proper file 
> > > > > > > lock,
> > > > > > > IIRC, and QEMU migration should properly serialize the time 
> > > > > > > window on who's
> > > > > > > going to take the file lock).
> > > > > > > 
> > > > > > > Maybe this is about something else?
> > > > > > 
> > > > > > I don't have an example where this fails.
> > > > > > 
> > > > > > I can cause "Failed to get "write" lock" errors if two qemu 
> > > > > > instances open
> > > > > > the same block device, but the error is suppressed if you add the 
> > > > > > -incoming
> > > > > > argument, due to this code:
> > > > > > 
> > > > > >   blk_attach_dev()
> > > > > > if (runstate_check(RUN_STATE_INMIGRATE))
> > > > > >   blk->disable_perm = true;
> > > > > 
> > > > > Yep, this one is pretty much expected.
> > > > > 
> > > > > > 
> > > > > > > > Indeed, and "files" includes unix domain sockets.
> > > > > > 
> > > > > > More on this -- the second qemu to bind a unix domain socket for 
> > > > > > listening
> > > > > > wins, and the first qemu loses it (because second qemu unlinks and 
> > > > > > recreates
> > > > > > the socket path before binding on the assumption that it is stale).
> > > > > > 
> > > > > > One must use a different name for the socket for second qemu, and 
> > > > > > clients
> > > > > > that wish to connect must be aware of the new port.
> > > > > > 
> > > > > > > > Network ports also conflict.
> > > > > > > > cpr-exec avoids such problems, and is one of the advantages of 
> > > > > > > > the method that
> > > > > > > > I forgot to promote.
> > > > > > > 
> > > > > > > I was thinking that's fine, as the host ports should be the 
> > > > > > > backend of the
> > > > > > > VM ports only anyway so they don't need to be identical on both 
> > > > > > > sides?
> > > > > > > 
> > > > > > > IOW, my understanding is it's the guest IP/ports/... which should 
> > > > > > > still be
> > > > > > > stable across migrations, where the host ports can be different 
> > > > > > > as long as
> > > > > > > the host ports can forward guest port messages correctly?
> > > > > > 
> > > > > > Yes, one must use a different host port number for the second qemu, 
> > > > > > and clients
> > > > > > that wish to connect must be aware of the new port.
> > > > > > 
> > > > > > That is my point -- cpr-transfer requires fiddling with such things.
> > > > > > cpr-exec does not.
> > > > > 
> > > > > Right, and my understanding is all these facilities are already 
> > > > > there, so
> > > > > no new code 

Re: [PATCH V2 00/11] Live update: cpr-exec

2024-08-16 Thread Daniel P . Berrangé
On Fri, Aug 16, 2024 at 11:34:10AM -0400, Peter Xu wrote:
> On Fri, Aug 16, 2024 at 04:16:50PM +0100, Daniel P. Berrangé wrote:
> > On Fri, Aug 16, 2024 at 11:06:10AM -0400, Peter Xu wrote:
> > > On Thu, Aug 15, 2024 at 04:55:20PM -0400, Steven Sistare wrote:
> > > > On 8/13/2024 3:46 PM, Peter Xu wrote:
> > > > > On Tue, Aug 06, 2024 at 04:56:18PM -0400, Steven Sistare wrote:
> > > > > > > The flipside, however, is that localhost migration via 2 separate 
> > > > > > > QEMU
> > > > > > > processes has issues where both QEMUs want to be opening the very 
> > > > > > > same
> > > > > > > file, and only 1 of them can ever have them open.
> > > > > 
> > > > > I thought we used to have similar issue with block devices, but I 
> > > > > assume
> > > > > it's solved for years (and whoever owns it will take proper file lock,
> > > > > IIRC, and QEMU migration should properly serialize the time window on 
> > > > > who's
> > > > > going to take the file lock).
> > > > > 
> > > > > Maybe this is about something else?
> > > > 
> > > > I don't have an example where this fails.
> > > > 
> > > > I can cause "Failed to get "write" lock" errors if two qemu instances 
> > > > open
> > > > the same block device, but the error is suppressed if you add the 
> > > > -incoming
> > > > argument, due to this code:
> > > > 
> > > >   blk_attach_dev()
> > > > if (runstate_check(RUN_STATE_INMIGRATE))
> > > >   blk->disable_perm = true;
> > > 
> > > Yep, this one is pretty much expected.
> > > 
> > > > 
> > > > > > Indeed, and "files" includes unix domain sockets.
> > > > 
> > > > More on this -- the second qemu to bind a unix domain socket for 
> > > > listening
> > > > wins, and the first qemu loses it (because second qemu unlinks and 
> > > > recreates
> > > > the socket path before binding on the assumption that it is stale).
> > > > 
> > > > One must use a different name for the socket for second qemu, and 
> > > > clients
> > > > that wish to connect must be aware of the new port.
> > > > 
> > > > > > Network ports also conflict.
> > > > > > cpr-exec avoids such problems, and is one of the advantages of the 
> > > > > > method that
> > > > > > I forgot to promote.
> > > > > 
> > > > > I was thinking that's fine, as the host ports should be the backend 
> > > > > of the
> > > > > VM ports only anyway so they don't need to be identical on both sides?
> > > > > 
> > > > > IOW, my understanding is it's the guest IP/ports/... which should 
> > > > > still be
> > > > > stable across migrations, where the host ports can be different as 
> > > > > long as
> > > > > the host ports can forward guest port messages correctly?
> > > > 
> > > > Yes, one must use a different host port number for the second qemu, and 
> > > > clients
> > > > that wish to connect must be aware of the new port.
> > > > 
> > > > That is my point -- cpr-transfer requires fiddling with such things.
> > > > cpr-exec does not.
> > > 
> > > Right, and my understanding is all these facilities are already there, so
> > > no new code should be needed on reconnect issues if to support 
> > > cpr-transfer
> > > in Libvirt or similar management layers that supports migrations.
> > 
> > Note Libvirt explicitly blocks localhost migration today because
> > solving all these clashing resource problems is a huge can of worms
> > and it can't be made invisible to the user of libvirt in any practical
> > way.
> 
> Ahhh, OK.  I'm pretty surprised by this, as I thought at least kubevirt
> supported local migration somehow on top of libvirt.

Since kubevirt runs inside a container, "localhost" migration
is effectively migrating between 2 completely separate OS installs
(containers), that happen to be on the same physical host. IOW, it
is a cross-host migration from Libvirt & QEMU's POV, and there are
no clashing resources to worry about.

> Does it mean that cpr-transfer is a no-go in this case at least for Libvirt
> to consume it (as cpr-* is only for local host migrations so far)?  Even if
> all the rest issues we're discussing with cpr-exec, is that the only way to
> go for Libvirt, then?

cpr-exec is certainly appealing from the POV of avoiding the clashing
resources problem in libvirt.

It has own issues though, because libvirt runs all QEMU processes with
seccomp filters that block 'execve', as we consider QEMU to be untrustworthy
and thus don't want to allow it to exec anything !

I don't know which is the lesser evil from libvirt's POV.

Personally I see security controls as an overriding requirement for
everything.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [RFC V1 0/6] Live update: cpr-transfer

2024-08-16 Thread Daniel P . Berrangé
On Fri, Aug 16, 2024 at 11:23:01AM -0400, Peter Xu wrote:
> On Fri, Aug 16, 2024 at 11:13:36AM -0400, Steven Sistare wrote:
> > On 8/15/2024 4:28 PM, Peter Xu wrote:
> > > On Sat, Jul 20, 2024 at 04:07:50PM -0400, Steven Sistare wrote:
> > > > > > The new user-visible interfaces are:
> > > > > > * cpr-transfer (MigMode migration parameter)
> > > > > > * cpr-uri (migration parameter)
> > > > > 
> > > > > I wonder whether this parameter can be avoided already, maybe we can 
> > > > > let
> > > > > cpr-transfer depend on unix socket in -incoming, then integrate fd 
> > > > > sharing
> > > > > in the same channel?
> > > > 
> > > > You saw the answer in another thread, but I repeat it here for others 
> > > > benefit:
> > > > 
> > > >"CPR state cannot be sent over the normal migration channel, because 
> > > > devices
> > > > and backends are created prior to reading the channel, so this mode 
> > > > sends
> > > > CPR state over a second migration channel that is not visible to 
> > > > the user.
> > > > New QEMU reads the second channel prior to creating devices or 
> > > > backends."
> > > 
> > > Today when looking again, I wonder about the other way round: can we make
> > > the new parameter called "-incoming-cpr", working exactly the same as
> > > "cpr-uri" qemu cmdline, but then after cpr is loaded it'll be 
> > > automatically
> > > be reused for migration incoming ports?
> > > 
> > > After all, cpr needs to happen already with unix sockets.  Having separate
> > > cmdline options grants user to make the other one to be non-unix, but that
> > > doesn't seem to buy us anything.. then it seems easier to always reuse it,
> > > and restrict cpr-transfer to only work with unix sockets for incoming too?
> > 
> > This idea also occurred to me, but I dislike the loss of flexibility for
> > the incoming socket type.  The exec URI in particular can do anything, and
> > we would be eliminating it.
> 
> Ah, I would be guessing that if Juan is still around then exec URI should
> already been marked deprecated and prone to removal soon.. while I tend to
> agree that exec does introduce some complexity meanwhile iiuc nobody uses
> that in production systems.
> 
> What's the exec use case you're picturing?  Would that mostly for debugging
> purpose, and would that be easily replaceable with another tunnelling like
> "ncat" or so?

Conceptually "exec:" is a nice thing, but from a practical POV it
introduces difficulties for QEMU. QEMU doesn't know if the exec'd
command will provide a unidirectional channel or bidirectional
channel, so has to assume the worst - unidirectional. It also can't
know if it is safe to run the exec multiple times, or is only valid
to run it once - so again has to assume once only.
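Concretely the difference is whether the exec'd command is handed
something full duplex like a socketpair, or just a one-way pipe per
stream. A rough standalone sketch of the socketpair case (not QEMU's
migration code):

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>

int main(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return 1;
    }
    pid_t pid = fork();
    if (pid == 0) {
        /* child: stdin and stdout are the same full-duplex socket */
        dup2(sv[1], STDIN_FILENO);
        dup2(sv[1], STDOUT_FILENO);
        close(sv[0]);
        close(sv[1]);
        execlp("cat", "cat", (char *)NULL);   /* trivial echo helper */
        _exit(127);
    }
    close(sv[1]);
    write(sv[0], "ping\n", 5);
    char buf[16];
    ssize_t n = read(sv[0], buf, sizeof(buf)); /* reply on the same fd */
    if (n > 0) {
        fwrite(buf, 1, n, stdout);
    }
    close(sv[0]);
    waitpid(pid, NULL, 0);
    return 0;
}

With two pipe(2) calls instead, each fd is strictly one-way, which is
the pessimistic assumption QEMU has to make for "exec:" today.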

We could fix those by adding further flags in the migration address
to indicate if it's bi-directional & multi-channel safe.

Technically "exec" is obsolete given "fd", but then that applies to
literally all protocols. Implementing them in QEMU is a more user
friendly thing.

Exec was more compelling when QEMU's other protocols were less
mature, lacking TLS for example, but I still find it interesting
as a facility.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH V2 00/11] Live update: cpr-exec

2024-08-16 Thread Daniel P . Berrangé
On Fri, Aug 16, 2024 at 11:06:10AM -0400, Peter Xu wrote:
> On Thu, Aug 15, 2024 at 04:55:20PM -0400, Steven Sistare wrote:
> > On 8/13/2024 3:46 PM, Peter Xu wrote:
> > > On Tue, Aug 06, 2024 at 04:56:18PM -0400, Steven Sistare wrote:
> > > > > The flipside, however, is that localhost migration via 2 separate QEMU
> > > > > processes has issues where both QEMUs want to be opening the very same
> > > > > file, and only 1 of them can ever have them open.
> > > 
> > > I thought we used to have similar issue with block devices, but I assume
> > > it's solved for years (and whoever owns it will take proper file lock,
> > > IIRC, and QEMU migration should properly serialize the time window on 
> > > who's
> > > going to take the file lock).
> > > 
> > > Maybe this is about something else?
> > 
> > I don't have an example where this fails.
> > 
> > I can cause "Failed to get "write" lock" errors if two qemu instances open
> > the same block device, but the error is suppressed if you add the -incoming
> > argument, due to this code:
> > 
> >   blk_attach_dev()
> > if (runstate_check(RUN_STATE_INMIGRATE))
> >   blk->disable_perm = true;
> 
> Yep, this one is pretty much expected.
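For reference, the message comes from the image locking in the
file-posix driver, which (IIRC) is built on advisory OFD byte-range
locks, so the second opener simply fails to take the lock. A standalone
illustration of that kind of lock (not QEMU's actual locking code):

/* lock.c - try to take an exclusive OFD write lock on an image file;
 * a second copy run against the same file fails instead of sharing it. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static int take_write_lock(const char *path)
{
    int fd = open(path, O_RDWR);
    if (fd < 0) {
        return -1;
    }
    struct flock fl = {
        .l_type = F_WRLCK,          /* exclusive */
        .l_whence = SEEK_SET,
        .l_start = 0,
        .l_len = 1,                 /* a designated byte, not the whole file */
    };
    if (fcntl(fd, F_OFD_SETLK, &fl) < 0) {   /* non-blocking attempt */
        close(fd);
        return -1;
    }
    return fd;                      /* lock lives as long as this fd */
}

int main(int argc, char **argv)
{
    int fd = take_write_lock(argc > 1 ? argv[1] : "disk.img");
    if (fd < 0) {
        fprintf(stderr, "Failed to get write lock: %s\n", strerror(errno));
        return 1;
    }
    pause();    /* hold the lock; start a second copy to see it fail */
    return 0;
}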
> 
> > 
> > > > Indeed, and "files" includes unix domain sockets.
> > 
> > More on this -- the second qemu to bind a unix domain socket for listening
> > wins, and the first qemu loses it (because second qemu unlinks and recreates
> > the socket path before binding on the assumption that it is stale).
> > 
> > One must use a different name for the socket for second qemu, and clients
> > that wish to connect must be aware of the new port.
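That "second bind wins" behaviour is just the usual listener pattern of
unlinking a presumed-stale path before bind; roughly (an illustration,
not QEMU's actual qio channel code):

#include <sys/socket.h>
#include <sys/un.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* A second caller of this against the same path silently steals it:
 * the first listener keeps its fd, but new clients connecting to
 * 'path' from now on reach the second one. */
int listen_unix(const char *path)
{
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }
    struct sockaddr_un addr;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    snprintf(addr.sun_path, sizeof(addr.sun_path), "%s", path);

    unlink(path);   /* assume any existing socket is stale */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 1) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}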
> > 
> > > > Network ports also conflict.
> > > > cpr-exec avoids such problems, and is one of the advantages of the 
> > > > method that
> > > > I forgot to promote.
> > > 
> > > I was thinking that's fine, as the host ports should be the backend of the
> > > VM ports only anyway so they don't need to be identical on both sides?
> > > 
> > > IOW, my understanding is it's the guest IP/ports/... which should still be
> > > stable across migrations, where the host ports can be different as long as
> > > the host ports can forward guest port messages correctly?
> > 
> > Yes, one must use a different host port number for the second qemu, and 
> > clients
> > that wish to connect must be aware of the new port.
> > 
> > That is my point -- cpr-transfer requires fiddling with such things.
> > cpr-exec does not.
> 
> Right, and my understanding is all these facilities are already there, so
> no new code should be needed on reconnect issues if to support cpr-transfer
> in Libvirt or similar management layers that supports migrations.

Note Libvirt explicitly blocks localhost migration today because
solving all these clashing resource problems is a huge can of worms
and it can't be made invisible to the user of libvirt in any practical
way.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v3 4/5] machine/nitro-enclave: Add built-in Nitro Secure Module device

2024-08-16 Thread Daniel P . Berrangé
On Fri, Aug 16, 2024 at 06:50:34PM +0600, Dorjoy Chowdhury wrote:
> Hi Daniel,
> 
> On Mon, Aug 12, 2024 at 8:07 PM Daniel P. Berrangé  
> wrote:
> >
> > On Sat, Aug 10, 2024 at 10:45:01PM +0600, Dorjoy Chowdhury wrote:
> > > AWS Nitro Enclaves have built-in Nitro Secure Module (NSM) device which
> > > is used for stripped down TPM functionality like attestation. This commit
> > > adds the built-in NSM device in the nitro-enclave machine type.
> > >
> > > In Nitro Enclaves, all the PCRs start in a known zero state and the first
> > > 16 PCRs are locked from boot and reserved. The PCR0, PCR1, PCR2 and PCR8
> > > contain the SHA384 hashes related to the EIF file used to boot the
> > > VM for validation.
> > >
> > > A new optional nitro-enclave machine option 'id' has been added which will
> > > be the enclave identifier reflected in the module-id of the NSM device.
> > > Otherwise, the device will have a default id set.
> > >
> > > Signed-off-by: Dorjoy Chowdhury 
> > > ---
> > >  hw/core/eif.c   | 205 +++-
> > >  hw/core/eif.h   |   5 +-
> > >  hw/core/meson.build |   4 +-
> > >  hw/i386/Kconfig |   1 +
> > >  hw/i386/nitro_enclave.c |  85 -
> > >  include/hw/i386/nitro_enclave.h |  19 +++
> > >  6 files changed, 310 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/hw/core/eif.c b/hw/core/eif.c
> > > index 5558879a96..d2c65668ef 100644
> > > --- a/hw/core/eif.c
> > > +++ b/hw/core/eif.c
> > > @@ -12,6 +12,9 @@
> > >  #include "qemu/bswap.h"
> > >  #include "qapi/error.h"
> > >  #include  /* for crc32 */
> > > +#include 
> > > +#include 
> > > +#include 
> > >
> > >  #include "hw/core/eif.h"
> > >
> >
> > > @@ -269,6 +284,125 @@ static bool read_eif_ramdisk(FILE *eif, FILE 
> > > *initrd, uint64_t size,
> > >  return false;
> > >  }
> > >
> > > +static bool get_fingerprint_sha384_from_cert(uint8_t *cert, size_t size,
> > > + uint8_t *sha384, Error 
> > > **errp)
> > > +{
> > > +gnutls_x509_crt_t crt;
> > > +size_t hash_size = 48;
> > > +gnutls_datum_t datum = {.data = cert, .size = size};
> > > +
> > > +gnutls_global_init();
> > > +gnutls_x509_crt_init(&crt);
> > > +
> > > +if (gnutls_x509_crt_import(crt, &datum, GNUTLS_X509_FMT_PEM) != 0) {
> > > +error_setg(errp, "Failed to import certificate");
> > > +goto cleanup;
> > > +}
> > > +
> > > +if (gnutls_x509_crt_get_fingerprint(crt, GNUTLS_DIG_SHA384, sha384,
> > > +&hash_size) != 0) {
> > > +error_setg(errp, "Failed to compute SHA384 fingerprint");
> > > +goto cleanup;
> > > +}
> > > +
> > > +return true;
> > > +
> > > + cleanup:
> > > +gnutls_x509_crt_deinit(crt);
> > > +gnutls_global_deinit();
> > > +return false;
> > > +}
> >
> > I'd suggest this go into  qcrypto/x509-utils.c & 
> > include/qcrypto/x509-utils.h,
> > as:
> >
> > int qcrypto_get_x509_cert_fingerprint(uint8_t *cert,
> >   size_t size,
> >   QCryptoHashAlgorithm hash,
> >   Error **errp);
> >
> > there's no need to be calling gnutls_global_init() / deinit() either.
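For illustration, such a helper could be shaped roughly like this (a
sketch of the idea only - the name, signature and alg mapping here are
assumptions, not the final qcrypto API):

/* would live in crypto/x509-utils.c */
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "crypto/hash.h"
#include <gnutls/gnutls.h>
#include <gnutls/x509.h>

int qcrypto_get_x509_cert_fingerprint(uint8_t *cert, size_t size,
                                      QCryptoHashAlgorithm alg,
                                      uint8_t *result, size_t resultlen,
                                      Error **errp)
{
    int ret = -1;
    gnutls_x509_crt_t crt;
    gnutls_digest_algorithm_t dig;
    gnutls_datum_t datum = { .data = cert, .size = size };

    switch (alg) {
    case QCRYPTO_HASH_ALG_SHA384:
        dig = GNUTLS_DIG_SHA384;
        break;
    default:
        error_setg(errp, "Unsupported hash algorithm");
        return -1;
    }

    gnutls_x509_crt_init(&crt);

    if (gnutls_x509_crt_import(crt, &datum, GNUTLS_X509_FMT_PEM) != 0) {
        error_setg(errp, "Failed to import certificate");
        goto cleanup;
    }

    if (gnutls_x509_crt_get_fingerprint(crt, dig, result, &resultlen) != 0) {
        error_setg(errp, "Failed to compute certificate fingerprint");
        goto cleanup;
    }
    ret = 0;

 cleanup:
    gnutls_x509_crt_deinit(crt);
    return ret;
}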
> >
> >
> > > @@ -299,7 +433,9 @@ static long get_file_size(FILE *f, Error **errp)
> > >   */
> > >  bool read_eif_file(const char *eif_path, const char *machine_initrd,
> > > char **kernel_path, char **initrd_path, char 
> > > **cmdline,
> > > -   Error **errp)
> > > +   uint8_t *image_sha384, uint8_t *bootstrap_sha384,
> > > +   uint8_t *app_sha384, uint8_t *fingerprint_sha384,
> > > +   bool *signature_found, Error **errp)
> > >  {
> > >  FILE *f = NULL;
> > >  FILE *machine_initrd_f = NULL;
> > > @@ -308,9 +444,33 @@ bool read_eif_file(const char *eif_path, const char 
> > > *machine_initrd,

Re: [PATCH for-9.2] hw: add compat machines for 9.2

2024-08-16 Thread Daniel P . Berrangé
On Fri, Aug 16, 2024 at 12:37:23PM +0200, Cornelia Huck wrote:
> Add 9.2 machine types for arm/i440fx/m68k/q35/s390x/spapr.
> 
> Signed-off-by: Cornelia Huck 
> ---
>  hw/arm/virt.c  |  9 -
>  hw/core/machine.c  |  3 +++
>  hw/i386/pc.c   |  3 +++
>  hw/i386/pc_piix.c  | 15 ---
>  hw/i386/pc_q35.c   | 13 +++--
>  hw/m68k/virt.c |  9 -
>  hw/ppc/spapr.c | 15 +--
>  hw/s390x/s390-virtio-ccw.c | 14 +-
>  include/hw/boards.h|  3 +++
>  include/hw/i386/pc.h   |  3 +++
>  10 files changed, 77 insertions(+), 10 deletions(-)

Reviewed-by: Daniel P. Berrangé 


> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index d9e69243b4a7..746bfe05d386 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -479,13 +479,24 @@ static void pc_i440fx_machine_options(MachineClass *m)
>   "Use a different south bridge than 
> PIIX3");
>  }
>  
> -static void pc_i440fx_machine_9_1_options(MachineClass *m)
> +static void pc_i440fx_machine_9_2_options(MachineClass *m)
>  {
>  pc_i440fx_machine_options(m);
>  m->alias = "pc";
>  m->is_default = true;
>  }
>  
> +DEFINE_I440FX_MACHINE(9, 2);
> +
> +static void pc_i440fx_machine_9_1_options(MachineClass *m)
> +{
> +pc_i440fx_machine_9_2_options(m);
> +m->alias = NULL;
> +m->is_default = false;
> +compat_props_add(m->compat_props, hw_compat_9_1, hw_compat_9_1_len);
> +compat_props_add(m->compat_props, pc_compat_9_1, pc_compat_9_1_len);
> +}
> +
>  DEFINE_I440FX_MACHINE(9, 1);
>  
>  static void pc_i440fx_machine_9_0_options(MachineClass *m)
> @@ -493,8 +504,6 @@ static void pc_i440fx_machine_9_0_options(MachineClass *m)
>  PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
>  
>  pc_i440fx_machine_9_1_options(m);
> -m->alias = NULL;
> -m->is_default = false;
>  m->smbios_memory_device_size = 16 * GiB;

Feels like we should be adding an "_AS_LATEST" macro
variant for piix/q35 too, so it matches the pattern
in other targets for handling alias & is_default.
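Roughly this shape, say (macro name invented here just to show the
idea, not in-tree code):

static void pc_i440fx_machine_9_2_options(MachineClass *m)
{
    pc_i440fx_machine_options(m);
    /* no alias / is_default handling needed here any more */
}
/* hypothetical variant that is the one place setting
 * m->alias = "pc" and m->is_default = true */
DEFINE_I440FX_MACHINE_AS_LATEST(9, 2);

static void pc_i440fx_machine_9_1_options(MachineClass *m)
{
    pc_i440fx_machine_9_2_options(m);
    compat_props_add(m->compat_props, hw_compat_9_1, hw_compat_9_1_len);
    compat_props_add(m->compat_props, pc_compat_9_1, pc_compat_9_1_len);
}
DEFINE_I440FX_MACHINE(9, 1);

i.e. older versions never have to clear alias/is_default by hand.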

Not a thing your patch needs to do though.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v2 1/4] meson: hide tsan related warnings

2024-08-16 Thread Daniel P . Berrangé
On Fri, Aug 16, 2024 at 07:44:28AM +0200, Thomas Huth wrote:
> On 15/08/2024 19.54, Peter Maydell wrote:
> > On Thu, 15 Aug 2024 at 12:05, Daniel P. Berrangé  
> > wrote:
> > > 
> > > On Thu, Aug 15, 2024 at 11:12:39AM +0100, Peter Maydell wrote:
> > > > On Wed, 14 Aug 2024 at 23:42, Pierrick Bouvier
> > > >  wrote:
> > > > > 
> > > > > When building with gcc-12 -fsanitize=thread, gcc reports some
> > > > > constructions not supported with tsan.
> > > > > Found on debian stable.
> > > > > 
> > > > > qemu/include/qemu/atomic.h:36:52: error: ‘atomic_thread_fence’ is not 
> > > > > supported with ‘-fsanitize=thread’ [-Werror=tsan]
> > > > > 36 | #define smp_mb() ({ barrier(); 
> > > > > __atomic_thread_fence(__ATOMIC_SEQ_CST); })
> > > > >|
> > > > > ^~~
> > > > > 
> > > > > Signed-off-by: Pierrick Bouvier 
> > > > > ---
> > > > >   meson.build | 10 +-
> > > > >   1 file changed, 9 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/meson.build b/meson.build
> > > > > index 81ecd4bae7c..52e5aa95cc0 100644
> > > > > --- a/meson.build
> > > > > +++ b/meson.build
> > > > > @@ -499,7 +499,15 @@ if get_option('tsan')
> > > > >prefix: '#include 
> > > > > ')
> > > > >   error('Cannot enable TSAN due to missing fiber annotation 
> > > > > interface')
> > > > > endif
> > > > > -  qemu_cflags = ['-fsanitize=thread'] + qemu_cflags
> > > > > +  tsan_warn_suppress = []
> > > > > +  # gcc (>=11) will report constructions not supported by tsan:
> > > > > +  # "error: ‘atomic_thread_fence’ is not supported with 
> > > > > ‘-fsanitize=thread’"
> > > > > +  # https://gcc.gnu.org/gcc-11/changes.html
> > > > > +  # However, clang does not support this warning and this triggers 
> > > > > an error.
> > > > > +  if cc.has_argument('-Wno-tsan')
> > > > > +tsan_warn_suppress = ['-Wno-tsan']
> > > > > +  endif
> > > > 
> > > > That last part sounds like a clang bug -- -Wno-foo is supposed
> > > > to not be an error on compilers that don't implement -Wfoo for
> > > > any value of foo (unless some other warning/error would also
> > > > be emitted).
> > > 
> > > -Wno-foo isn't an error, but it is a warning... which we then
> > > turn into an error due to -Werror, unless we pass 
> > > -Wno-unknown-warning-option
> > > to clang.
> > 
> > Which is irritating if you want to be able to blanket say
> > '-Wno-silly-compiler-warning' and not see any of that
> > warning regardless of compiler version. That's why the
> > gcc behaviour is the way it is (i.e. -Wno-such-thingy
> > is neither a warning nor an error if it would be the only
> > warning/error), and if clang doesn't match it that's a shame.
> 
> I thought that Clang would behave the same way as GCC, but apparently it
> does not (anymore?):

It is nothing new - clang has behaved this way wrt unknown warning flags
for as long as I remember.
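For anyone who wants to see the gcc side of this in isolation, a
one-file reproducer for the diagnostic being suppressed is simply:

/* fence.c - compile with: gcc -c -fsanitize=thread -Werror fence.c
 * gcc >= 11 emits the -Wtsan warning quoted in the commit message,
 * which -Werror then turns into a hard error unless -Wno-tsan is
 * passed. */
void full_barrier(void)
{
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
}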


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [RFC V1 0/6] Live update: cpr-transfer

2024-08-16 Thread Daniel P . Berrangé
On Thu, Aug 15, 2024 at 04:28:59PM -0400, Peter Xu wrote:
> On Sat, Jul 20, 2024 at 04:07:50PM -0400, Steven Sistare wrote:
> > > > The new user-visible interfaces are:
> > > >* cpr-transfer (MigMode migration parameter)
> > > >* cpr-uri (migration parameter)
> > > 
> > > I wonder whether this parameter can be avoided already, maybe we can let
> > > cpr-transfer depend on unix socket in -incoming, then integrate fd sharing
> > > in the same channel?
> > 
> > You saw the answer in another thread, but I repeat it here for others 
> > benefit:
> > 
> >   "CPR state cannot be sent over the normal migration channel, because 
> > devices
> >and backends are created prior to reading the channel, so this mode sends
> >CPR state over a second migration channel that is not visible to the 
> > user.
> >New QEMU reads the second channel prior to creating devices or backends."
> 
> Today when looking again, I wonder about the other way round: can we make
> the new parameter called "-incoming-cpr", working exactly the same as
> "cpr-uri" qemu cmdline, but then after cpr is loaded it'll be automatically
> be reused for migration incoming ports?
> 
> After all, cpr needs to happen already with unix sockets.  Having separate
> cmdline options grants user to make the other one to be non-unix, but that
> doesn't seem to buy us anything.. then it seems easier to always reuse it,
> and restrict cpr-transfer to only work with unix sockets for incoming too?

IMHO we should not be adding any new command line parameter at all,
and in fact we should actually deprecate the existing "-incoming",
except when used with "defer".

An application managing migration should be doing all the configuration
via QMP.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH] meson: add 'qemuutil' dependency for block.c

2024-08-15 Thread Daniel P . Berrangé
On Wed, Aug 14, 2024 at 12:00:52PM +0200, Fiona Ebner wrote:
> The macro block_module_load() used by block.c is a wrapper around
> module_load(), which is implemented in util/module.c.
> 
> Fixes linking for a future binary or downstream binary that does not
> depend on 'qemuutil' directly, but does depend on 'block'.

Such a scenario is surely impossible, even in the future. Every file in
QEMU pulls in osdep.h, and as a result effectively gets a dep on
qemuutil, not to mention the block layer using countless APIs
present in qemuutil.

> Signed-off-by: Fiona Ebner 
> ---
>  meson.build | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/meson.build b/meson.build
> index 81ecd4bae7..efa0ac8d0b 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -3555,7 +3555,7 @@ if have_block
>  'blockjob.c',
>  'job.c',
>  'qemu-io-cmds.c',
> -  ))
> +  ), qemuutil)
>if config_host_data.get('CONFIG_REPLICATION')
>  block_ss.add(files('replication.c'))
>endif
> -- 
> 2.39.2
> 
> 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



