[xen-unstable test] 186105: regressions - FAIL
flight 186105 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/186105/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf  6 xen-build  fail REGR. vs. 186078

Tests which did not succeed, but are not blocking:
 build-armhf-libvirt  1 build-check(1)  blocked  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)  blocked  n/a
 test-armhf-armhf-examine  1 build-check(1)  blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)  blocked  n/a
 test-armhf-armhf-libvirt-vhd  1 build-check(1)  blocked  n/a
 test-armhf-armhf-xl  1 build-check(1)  blocked  n/a
 test-armhf-armhf-xl-arndale  1 build-check(1)  blocked  n/a
 test-armhf-armhf-xl-credit1  1 build-check(1)  blocked  n/a
 test-armhf-armhf-xl-credit2  1 build-check(1)  blocked  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)  blocked  n/a
 test-armhf-armhf-xl-qcow2  1 build-check(1)  blocked  n/a
 test-armhf-armhf-xl-raw  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-qemut-win7-amd64  19 guest-stop  fail  like 186078
 test-amd64-amd64-xl-qemuu-ws16-amd64  19 guest-stop  fail  like 186078
 test-amd64-amd64-xl-qemuu-win7-amd64  19 guest-stop  fail  like 186078
 test-amd64-amd64-xl-qemut-ws16-amd64  19 guest-stop  fail  like 186078
 test-amd64-amd64-qemuu-nested-amd  20 debian-hvm-install/l1/l2  fail  like 186078
 test-amd64-amd64-libvirt-xsm  15 migrate-support-check  fail  never pass
 test-amd64-amd64-libvirt  15 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-thunderx  15 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-thunderx  16 saverestore-support-check  fail  never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-check  fail  never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm  13 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-check  fail  never pass
 test-arm64-arm64-libvirt-xsm  15 migrate-support-check  fail  never pass
 test-arm64-arm64-libvirt-xsm  16 saverestore-support-check  fail  never pass
 test-arm64-arm64-xl  15 migrate-support-check  fail  never pass
 test-arm64-arm64-xl  16 saverestore-support-check  fail  never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-check  fail  never pass
 test-amd64-amd64-libvirt-qcow2  14 migrate-support-check  fail  never pass
 test-amd64-amd64-libvirt-raw  14 migrate-support-check  fail  never pass
 test-amd64-amd64-libvirt-vhd  14 migrate-support-check  fail  never pass
 test-arm64-arm64-libvirt-raw  14 migrate-support-check  fail  never pass
 test-arm64-arm64-libvirt-raw  15 saverestore-support-check  fail  never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-check  fail  never pass

version targeted for testing:
 xen  ced21fbb2842ac4655048bdee56232974ff9ff9c
baseline version:
 xen  ced21fbb2842ac4655048bdee56232974ff9ff9c

Last test of basis  186105  2024-05-23 09:38:07 Z  0 days
Testing same since  (not found)  0 attempts

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm  pass
 build-amd64-xtf  pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  fail
 build-i386  pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  pass
 build-armhf-libvirt  blocked
 build-i386-libvirt  pass
 build-amd64-prev  pass
 build-i386-prev  pass
 build-amd64-pvops  pass
 build-arm64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops
Re: [PATCH v4 3/9] tools/arm: Introduce the "nr_spis" xl config entry
On Fri, 24 May 2024, Julien Grall wrote:
> Hi Henry,
>
> On 23/05/2024 08:40, Henry Wang wrote:
> > Currently, the number of SPIs allocated to the domain is only
> > configurable for Dom0less DomUs. Xen domains are supposed to be
> > platform agnostics and therefore the numbers of SPIs for libxl
> > guests should not be based on the hardware.
> >
> > Introduce a new xl config entry for Arm to provide a method for
> > user to decide the number of SPIs. This would help to avoid
> > bumping the `config->arch.nr_spis` in libxl everytime there is a
> > new platform with increased SPI numbers.
> >
> > Update the doc and the golang bindings accordingly.
> >
> > Signed-off-by: Henry Wang
> > Reviewed-by: Jason Andryuk
> > ---
> > v4:
> > - Add Jason's Reviewed-by tag.
> > v3:
> > - Reword documentation to avoid ambiguity.
> > v2:
> > - New patch to replace the original patch in v1:
> >   "[PATCH 05/15] tools/libs/light: Increase nr_spi to 160"
> > ---
> >  docs/man/xl.cfg.5.pod.in | 14 ++
> >  tools/golang/xenlight/helpers.gen.go | 2 ++
> >  tools/golang/xenlight/types.gen.go | 1 +
> >  tools/libs/light/libxl_arm.c | 4 ++--
> >  tools/libs/light/libxl_types.idl | 1 +
> >  tools/xl/xl_parse.c | 3 +++
> >  6 files changed, 23 insertions(+), 2 deletions(-)
> >
> > diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
> > index 8f2b375ce9..416d582844 100644
> > --- a/docs/man/xl.cfg.5.pod.in
> > +++ b/docs/man/xl.cfg.5.pod.in
> > @@ -3072,6 +3072,20 @@ raised.
> >  =back
> > +=over 4
> > +
> > +=item B
> > +
> > +An optional 32-bit integer parameter specifying the number of SPIs (Shared
>
> We can't support that much SPIs :). The limit would be 991 SPIs.

I changed it

> > +Peripheral Interrupts) to allocate for the domain. If the value specified
> > by
> > +the `nr_spis` parameter is smaller than the number of SPIs calculated by
> > the
> > +toolstack based on the devices allocated for the domain, or the `nr_spis`
> > +parameter is not specified, the value calculated by the toolstack will be
> > used
> > +for the domain. Otherwise, the value specified by the `nr_spis` parameter
> > will
> > +be used.
>
> I think it would be worth mentioning that the number of SPIs should match the
> highest interrupt ID that will be assigned to the domain (rather than the
> number of SPIs planned to be assigned).

I added it

> > +
> > +=back
> > +
> >  =head3 x86
> >  =over 4
> > diff --git a/tools/golang/xenlight/helpers.gen.go
> > b/tools/golang/xenlight/helpers.gen.go
> > index b9cb5b33c7..fe5110474d 100644
> > --- a/tools/golang/xenlight/helpers.gen.go
> > +++ b/tools/golang/xenlight/helpers.gen.go
> > @@ -1154,6 +1154,7 @@ return fmt.Errorf("invalid union key '%v'", x.Type)}
> >  x.ArchArm.GicVersion = GicVersion(xc.arch_arm.gic_version)
> >  x.ArchArm.Vuart = VuartType(xc.arch_arm.vuart)
> >  x.ArchArm.SveVl = SveType(xc.arch_arm.sve_vl)
> > +x.ArchArm.NrSpis = uint32(xc.arch_arm.nr_spis)
> >  if err := x.ArchX86.MsrRelaxed.fromC(&xc.arch_x86.msr_relaxed);err != nil
> > {
> >  return fmt.Errorf("converting field ArchX86.MsrRelaxed: %v", err)
> >  }
> > @@ -1670,6 +1671,7 @@ return fmt.Errorf("invalid union key '%v'", x.Type)}
> >  xc.arch_arm.gic_version = C.libxl_gic_version(x.ArchArm.GicVersion)
> >  xc.arch_arm.vuart = C.libxl_vuart_type(x.ArchArm.Vuart)
> >  xc.arch_arm.sve_vl = C.libxl_sve_type(x.ArchArm.SveVl)
> > +xc.arch_arm.nr_spis = C.uint32_t(x.ArchArm.NrSpis)
> >  if err := x.ArchX86.MsrRelaxed.toC(&xc.arch_x86.msr_relaxed); err != nil {
> >  return fmt.Errorf("converting field ArchX86.MsrRelaxed: %v", err)
> >  }
> > diff --git a/tools/golang/xenlight/types.gen.go
> > b/tools/golang/xenlight/types.gen.go
> > index 5b293755d7..c9e45b306f 100644
> > --- a/tools/golang/xenlight/types.gen.go
> > +++ b/tools/golang/xenlight/types.gen.go
> > @@ -597,6 +597,7 @@ ArchArm struct {
> >  GicVersion GicVersion
> >  Vuart VuartType
> >  SveVl SveType
> > +NrSpis uint32
> >  }
> >  ArchX86 struct {
> >  MsrRelaxed Defbool
> > diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
> > index 1cb89fa584..a4029e3ac8 100644
> > --- a/tools/libs/light/libxl_arm.c
> > +++ b/tools/libs/light/libxl_arm.c
> > @@ -181,8 +181,8 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
> >      LOG(DEBUG, "Configure the domain");
> > -    config->arch.nr_spis = nr_spis;
> > -    LOG(DEBUG, " - Allocate %u SPIs", nr_spis);
> > +    config->arch.nr_spis = max(nr_spis, d_config->b_info.arch_arm.nr_spis);
>
> I am not entirely sure about using max(). To me if the user specifies a lower
> limit, then we should throw an error because this is likely an indication that
> the SPIs they will want to assign will clash with the emulated ones.
>
> So it would be better to warn at domain creation rather than waiting until the
> IRQs are assigned.
>
> I would like Anthony's opinion on this one. Given he is away this month, I
> guess
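
The two policies discussed in this review, taking the larger of the two values versus rejecting a user value that is too low, can be contrasted in a small sketch. This is only an illustration under assumptions: the function names are hypothetical, a user value of 0 is taken to mean "not specified", and none of this is the libxl implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the policy in the quoted hunk, take the larger value. */
uint32_t sketch_nr_spis_max(uint32_t toolstack, uint32_t user)
{
    return user > toolstack ? user : toolstack;
}

/*
 * Hypothetical helper: the stricter policy suggested in the review.
 * Returns 0 and stores the chosen value in *out, or -1 when the user
 * explicitly asks for fewer SPIs than the toolstack computed.
 * Assumption: user == 0 means the option was not given.
 */
int sketch_nr_spis_strict(uint32_t toolstack, uint32_t user, uint32_t *out)
{
    if (out == NULL)
        return -1;
    if (user == 0) {
        *out = toolstack;
        return 0;
    }
    if (user < toolstack)
        return -1;
    *out = user;
    return 0;
}
```

With a toolstack value of 64 and a user value of 32, the first helper silently yields 64, while the second reports an error at configuration time, which is the behaviour the reviewer argues for.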
Re: [PATCH v4 5/9] xen/arm: Add XEN_DOMCTL_dt_overlay and device attachment to domains
On Thu, 23 May 2024, Julien Grall wrote:
> Hi Henry,
>
> On 23/05/2024 08:40, Henry Wang wrote:
> > In order to support the dynamic dtbo device assignment to a running
> > VM, the add/remove of the DT overlay and the attach/detach of the
> > device from the DT overlay should happen separately. Therefore,
> > repurpose the existing XEN_SYSCTL_dt_overlay to only add the DT
> > overlay to Xen device tree
>
> I think it would be worth mentioning in the commit message why changing the
> sysctl behavior is fine. The feature is experimental and therefore breaking
> compatibility is ok.

Added

> > , instead of assigning the device to the
> > hardware domain at the same time. Add the XEN_DOMCTL_dt_overlay with
> > operations XEN_DOMCTL_DT_OVERLAY_ATTACH to do the device assignment
> > to the domain.
> >
> > The hypervisor firstly checks the DT overlay passed from the toolstack
> > is valid. Then the device nodes are retrieved from the overlay tracker
> > based on the DT overlay. The attach of the device is implemented by
> > mapping the IRQ and IOMMU resources.
>
> So, the expectation is the user will always want to attach all the devices in
> the overlay to a single domain. Is that correct?

Yes, also added to the commit message

> >
> > Signed-off-by: Henry Wang
> > Signed-off-by: Vikram Garhwal
> > ---
> > v4:
> > - Split the original patch, only do the device attachment.
> > v3:
> > - Style fixes for arch-selection #ifdefs.
> > - Do not include public/domctl.h, only add a forward declaration of
> >   struct xen_domctl_dt_overlay.
> > - Extract the overlay track entry finding logic to a function, drop
> >   the unused variables.
> > - Use op code 1&2 for XEN_DOMCTL_DT_OVERLAY_{ATTACH,DETACH}.
> > v2:
> > - New patch.
> > ---
> >  xen/arch/arm/domctl.c | 3 +
> >  xen/common/dt-overlay.c | 199 ++-
> >  xen/include/public/domctl.h | 14 +++
> >  xen/include/public/sysctl.h | 11 +-
> >  xen/include/xen/dt-overlay.h | 7 ++
> >  5 files changed, 176 insertions(+), 58 deletions(-)
> >
> > diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c
> > index ad56efb0f5..12a12ee781 100644
> > --- a/xen/arch/arm/domctl.c
> > +++ b/xen/arch/arm/domctl.c
> > @@ -5,6 +5,7 @@
> >   * Copyright (c) 2012, Citrix Systems
> >   */
> > +#include
> >  #include
> >  #include
> >  #include
> > @@ -176,6 +177,8 @@ long arch_do_domctl(struct xen_domctl *domctl, struct
> > domain *d,
> >          return rc;
> >      }
> > +    case XEN_DOMCTL_dt_overlay:
> > +        return dt_overlay_domctl(d, &domctl->u.dt_overlay);
> >      default:
> >          return subarch_do_domctl(domctl, d, u_domctl);
> >      }
> > diff --git a/xen/common/dt-overlay.c b/xen/common/dt-overlay.c
> > index 9cece79067..1087f9b502 100644
> > --- a/xen/common/dt-overlay.c
> > +++ b/xen/common/dt-overlay.c
> > @@ -356,6 +356,42 @@ static int overlay_get_nodes_info(const void *fdto,
> > char **nodes_full_path)
> >      return 0;
> >  }
> > +/* This function should be called with the overlay_lock taken */
> > +static struct overlay_track *
> > +find_track_entry_from_tracker(const void *overlay_fdt,
> > +                              uint32_t overlay_fdt_size)
> > +{
> > +    struct overlay_track *entry, *temp;
> > +    bool found_entry = false;
> > +
> > +    ASSERT(spin_is_locked(&overlay_lock));
> > +
> > +    /*
> > +     * First check if dtbo is correct i.e. it should one of the dtbo which
> > was
> > +     * used when dynamically adding the node.
> > +     * Limitation: Cases with same node names but different property are
> > not
> > +     * supported currently. We are relying on user to provide the same dtbo
> > +     * as it was used when adding the nodes.
> > +     */
> > +    list_for_each_entry_safe( entry, temp, &overlay_tracker, entry )
> > +    {
> > +        if ( memcmp(entry->overlay_fdt, overlay_fdt, overlay_fdt_size) == 0
> > )
> > +        {
> > +            found_entry = true;
> > +            break;
> > +        }
> > +    }
> > +
> > +    if ( !found_entry )
> > +    {
> > +        printk(XENLOG_ERR "Cannot find any matching tracker with input
> > dtbo."
> > +               " Operation is supported only for prior added dtbo.\n");
> > +        return NULL;
> > +    }
> > +
> > +    return entry;
> > +}
> > +
> >  /* Check if node itself can be removed and remove node from IOMMU. */
> >  static int remove_node_resources(struct dt_device_node *device_node)
> >  {
> > @@ -485,8 +521,7 @@ static long handle_remove_overlay_nodes(const void
> > *overlay_fdt,
> >                                          uint32_t overlay_fdt_size)
> >  {
> >      int rc;
> > -    struct overlay_track *entry, *temp, *track;
> > -    bool found_entry = false;
> > +    struct overlay_track *entry;
> >      rc = check_overlay_fdt(overlay_fdt, overlay_fdt_size);
> >      if ( rc )
> > @@ -494,29 +529,10 @@ static long handle_remove_overlay_nodes(const void
> > *overlay_fdt,
> >
Re: [PATCH v4 9/9] docs: Add device tree overlay documentation
On Thu, 23 May 2024, Julien Grall wrote:
> Hi Henry,
>
> On 23/05/2024 08:40, Henry Wang wrote:
> > From: Vikram Garhwal
> >
> > Signed-off-by: Vikram Garhwal
> > Signed-off-by: Stefano Stabellini
> > Signed-off-by: Henry Wang
> > ---
> > v4:
> > - No change.
> > v3:
> > - No change.
> > v2:
> > - Update the content based on the changes in this version.
> > ---
> >  docs/misc/arm/overlay.txt | 99 +++
> >  1 file changed, 99 insertions(+)
> >  create mode 100644 docs/misc/arm/overlay.txt
> >
> > diff --git a/docs/misc/arm/overlay.txt b/docs/misc/arm/overlay.txt
> > new file mode 100644
> > index 00..811a6de369
> > --- /dev/null
> > +++ b/docs/misc/arm/overlay.txt
> > @@ -0,0 +1,99 @@
> > +# Device Tree Overlays support in Xen
> > +
> > +Xen now supports dynamic device assignment to running domains,
>
> This reads as we "support" the feature. I would prefer if we write "Xen
> expirementally supports..." or similar.

Done

> > +i.e. adding/removing nodes (using .dtbo) to/from Xen device tree, and
> > +attaching/detaching them to/from a running domain with given $domid.
> > +
> > +Dynamic node assignment works in two steps:
> > +
> > +## Add/Remove device tree overlay to/from Xen device tree
> > +
> > +1. Xen tools check the dtbo given and parse all other user provided
> > arguments
> > +2. Xen tools pass the dtbo to Xen hypervisor via hypercall.
> > +3. Xen hypervisor applies/removes the dtbo to/from Xen device tree.
> > +
> > +## Attach/Detach device from the DT overlay to/from domain
> > +
> > +1. Xen tools check the dtbo given and parse all other user provided
> > arguments
> > +2. Xen tools pass the dtbo to Xen hypervisor via hypercall.
> > +3. Xen hypervisor attach/detach the device to/from the user-provided $domid
> > by
> > +   mapping/unmapping node resources in the DT overlay.
> > +
> > +# Examples
> > +
> > +Here are a few examples on how to use it.
> > +
> > +## Dom0 device add
> > +
> > +For assigning a device tree overlay to Dom0, user should firstly properly
> > +prepare the DT overlay. More information about device tree overlays can be
> > +found in [1]. Then, in Dom0, enter the following:
> > +
> > +    (dom0) xl dt-overlay add overlay.dtbo
> > +
> > +This will allocate the devices mentioned in overlay.dtbo to Xen device
> > tree.
> > +
> > +To assign the newly added device from the dtbo to Dom0:
> > +
> > +    (dom0) xl dt-overlay attach overlay.dtbo 0
> > +
> > +Next, if the user wants to add the same device tree overlay to dom0
> > +Linux, execute the following:
> > +
> > +    (dom0) mkdir -p /sys/kernel/config/device-tree/overlays/new_overlay
> > +    (dom0) cat overlay.dtbo >
> > /sys/kernel/config/device-tree/overlays/new_overlay/dtbo
> > +
> > +Finally if needed, the relevant Linux kernel drive can be loaded using:
> > +
> > +    (dom0) modprobe module_name.ko
> > +
> > +## Dom0 device remove
> > +
> > +For removing the device from Dom0, first detach the device from Dom0:
> > +
> > +    (dom0) xl dt-overlay detach overlay.dtbo 0
> > +
> > +NOTE: The user is expected to unload any Linux kernel modules which
> > +might be accessing the devices in overlay.dtbo before detach the device.
> > +Detaching devices without unloading the modules might result in a crash.
> > +
> > +Then remove the overlay from Xen device tree:
> > +
> > +    (dom0) xl dt-overlay remove overlay.dtbo
> > +
> > +## DomU device add/remove
> > +
> > +All the nodes in dtbo will be assigned to a domain; the user will need
> > +to prepare the dtb for the domU. For example, the `interrupt-parent`
> > property
> > +of the DomU overlay should be changed to the Xen hardcoded value `0xfde8`.
> > +Below assumes the properly written DomU dtbo is `overlay_domu.dtbo`.
> > +
> > +User will need to create the DomU with below properties properly configured
> > +in the xl config file:
> > +- `iomem`
>
> I don't quite understand how the user can specify the MMIO region if the
> device is attached after the domain is created.

I think this was meant for a domain about to be created (not already
running). I clarified.

>
> > +- `passthrough` (if IOMMU is needed)
> > +
> > +User will also need to modprobe the relevant drivers.
> > +
> > +Example for domU device add:
> > +
> > +    (dom0) xl dt-overlay add overlay.dtbo  # If not executed
> > before
> > +    (dom0) xl dt-overlay attach overlay.dtbo $domid
>
> Can how clarify how the MMIO will be mapped? Is it direct mapped? If so,
> couldn't this result to clash with other part of the address space (e.g.
> RAM?).

Yes, it is reusing the same code as dom0, which makes the code nice but
it doesn't support non-1:1 mappings. I think those should be done via
the xen,reg property. My suggestion would be this:

- if xen,reg is present, use it
- if xen,reg is not present, fall back to 1:1 mapping based on reg

For the next version of the series, I'd just document the current
limitation of the implementation. I added this to patch
Re: [PATCH v4 8/9] tools: Introduce the "xl dt-overlay {attach,detach}" commands
On Fri, 24 May 2024, Julien Grall wrote:
> Hi Henry,
>
> On 23/05/2024 08:40, Henry Wang wrote:
> > With the XEN_DOMCTL_dt_overlay DOMCTL added, users should be able to
> > attach/detach devices from the provided DT overlay to domains.
> > Support this by introducing a new set of "xl dt-overlay" commands and
> > related documentation, i.e. "xl dt-overlay {attach,detach}". Slightly
> > rework the command option parsing logic.
> >
> > Signed-off-by: Henry Wang
> > Reviewed-by: Jason Andryuk

Reviewed-by: Stefano Stabellini

> > ---
> > v4:
> > - Add Jason's Reviewed-by tag.
> > v3:
> > - Introduce new API libxl_dt_overlay_domain() and co., instead of
> >   reusing existing API libxl_dt_overlay().
> > - Add in-code comments for the LIBXL_DT_OVERLAY_* macros.
> > - Use find_domain() to avoid getting domain_id from strtol().
> > v2:
> > - New patch.
> > ---
> >  tools/include/libxl.h | 10 +++
> >  tools/include/xenctrl.h | 3 +++
> >  tools/libs/ctrl/xc_dt_overlay.c | 31 +
> >  tools/libs/light/libxl_dt_overlay.c | 28 +++
> >  tools/xl/xl_cmdtable.c | 4 +--
> >  tools/xl/xl_vmcontrol.c | 42 -
> >  6 files changed, 104 insertions(+), 14 deletions(-)
> >
> > diff --git a/tools/include/libxl.h b/tools/include/libxl.h
> > index 62cb07dea6..6cc6d6bf6a 100644
> > --- a/tools/include/libxl.h
> > +++ b/tools/include/libxl.h
>
> I think you also need to introduce LIBXL_HAVE_...

Added

I have removed LIBXL_DT_OVERLAY_DOMAIN_DETACH and the related mentions.
I kept Jason's ack.
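
For readers unfamiliar with the LIBXL_HAVE_ convention: such a macro lets out-of-tree consumers of libxl test for an API at compile time. A minimal sketch of the consumer side follows. It assumes the macro name LIBXL_HAVE_OVERLAY_DOMAIN used by the v5 patch in this digest; the SKETCH_ names are hypothetical, and the libxl header is deliberately not included so the fragment builds without Xen installed (in which case the fallback branch is taken).

```c
/*
 * A real consumer would #include <libxl.h> here. It is left out so this
 * sketch compiles standalone; the macro is then simply undefined.
 */
#ifdef LIBXL_HAVE_OVERLAY_DOMAIN
#define SKETCH_HAVE_OVERLAY_ATTACH 1   /* libxl_dt_overlay_domain() exists */
#else
#define SKETCH_HAVE_OVERLAY_ATTACH 0   /* older libxl: no attach support */
#endif

/* Hypothetical helper: reports whether attach support was compiled in. */
int sketch_overlay_attach_available(void)
{
    return SKETCH_HAVE_OVERLAY_ATTACH;
}
```

A caller would guard any use of the new API with the same #ifdef, so the code still builds against libxl versions that predate it.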
[PATCH v5 7/7] docs: Add device tree overlay documentation
From: Vikram Garhwal

Signed-off-by: Vikram Garhwal
Signed-off-by: Stefano Stabellini
Signed-off-by: Henry Wang
---
 docs/misc/arm/overlay.txt | 82 +++
 1 file changed, 82 insertions(+)
 create mode 100644 docs/misc/arm/overlay.txt

diff --git a/docs/misc/arm/overlay.txt b/docs/misc/arm/overlay.txt
new file mode 100644
index 00..0a2dee951a
--- /dev/null
+++ b/docs/misc/arm/overlay.txt
@@ -0,0 +1,82 @@
+# Device Tree Overlays support in Xen
+
+Xen experimentally supports dynamic device assignment to running
+domains, i.e. adding/removing nodes (using .dtbo) to/from Xen device
+tree, and attaching them to a running domain with given $domid.
+
+Dynamic node assignment works in two steps:
+
+## Add/Remove device tree overlay to/from Xen device tree
+
+1. Xen tools check the dtbo given and parse all other user provided arguments
+2. Xen tools pass the dtbo to Xen hypervisor via hypercall.
+3. Xen hypervisor applies/removes the dtbo to/from Xen device tree.
+
+## Attach device from the DT overlay to domain
+
+1. Xen tools check the dtbo given and parse all other user provided arguments
+2. Xen tools pass the dtbo to Xen hypervisor via hypercall.
+3. Xen hypervisor attaches the device to the user-provided $domid by
+   mapping node resources in the DT overlay.
+
+# Examples
+
+Here are a few examples on how to use it.
+
+## Dom0 device add
+
+For assigning a device tree overlay to Dom0, user should firstly properly
+prepare the DT overlay. More information about device tree overlays can be
+found in [1]. Then, in Dom0, enter the following:
+
+    (dom0) xl dt-overlay add overlay.dtbo
+
+This will allocate the devices mentioned in overlay.dtbo to Xen device tree.
+
+To assign the newly added device from the dtbo to Dom0:
+
+    (dom0) xl dt-overlay attach overlay.dtbo 0
+
+Next, if the user wants to add the same device tree overlay to dom0
+Linux, execute the following:
+
+    (dom0) mkdir -p /sys/kernel/config/device-tree/overlays/new_overlay
+    (dom0) cat overlay.dtbo > /sys/kernel/config/device-tree/overlays/new_overlay/dtbo
+
+Finally if needed, the relevant Linux kernel driver can be loaded using:
+
+    (dom0) modprobe module_name.ko
+
+## DomU device add/remove
+
+All the nodes in dtbo will be assigned to a domain; the user will need
+to prepare the dtb for the domU. For example, the `interrupt-parent`
+property of the DomU overlay should be changed to the Xen hardcoded
+value `0xfde8`, and the xen,reg property should be added to specify the
+address mappings. If xen,reg is not present, a 1:1 mapping is assumed.
+Below assumes the properly written DomU dtbo is `overlay_domu.dtbo`.
+
+For new domains to be created, the user will need to create the DomU
+with below properties properly configured in the xl config file:
+- `iomem`
+- `passthrough` (if IOMMU is needed)
+
+User will also need to modprobe the relevant drivers. For already
+running domains, the user can use the xl dt-overlay attach command,
+example:
+
+    (dom0) xl dt-overlay add overlay.dtbo          # If not executed before
+    (dom0) xl dt-overlay attach overlay.dtbo $domid
+    (dom0) xl console $domid                       # To access $domid console
+
+Next, if the user needs to modify/prepare the overlay.dtbo suitable for
+the domU:
+
+    (domU) mkdir -p /sys/kernel/config/device-tree/overlays/new_overlay
+    (domU) cat overlay_domu.dtbo > /sys/kernel/config/device-tree/overlays/new_overlay/dtbo
+
+Finally, if needed, the relevant Linux kernel driver can be probed:
+
+    (domU) modprobe module_name.ko
+
+[1] https://www.kernel.org/doc/Documentation/devicetree/overlay-notes.txt
--
2.25.1
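
The address-mapping rule stated in this document (use xen,reg when present, otherwise assume 1:1) can be sketched as a small selection function. This is purely illustrative: the struct and function names are hypothetical, and the commit message of patch 5/7 in this series states that xen,reg handling is not yet implemented, so the sketch shows the intended rule rather than current hypervisor behaviour.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one MMIO region of an overlay node. */
struct sketch_region {
    uint64_t phys_addr;    /* physical address from the "reg" property */
    uint64_t size;         /* region size in bytes */
    bool     has_xen_reg;  /* true if a "xen,reg" entry was provided */
    uint64_t guest_addr;   /* guest address from "xen,reg", if present */
};

/*
 * Pick the guest address for a region: the xen,reg value when given,
 * otherwise fall back to a 1:1 mapping of the physical address.
 */
uint64_t sketch_guest_addr(const struct sketch_region *r)
{
    return r->has_xen_reg ? r->guest_addr : r->phys_addr;
}
```

A region without xen,reg therefore lands at the same guest address as its physical address, which is why it can clash with guest RAM, as raised in the review of v4.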
[PATCH v5 1/7] tools/xl: Correct the help information and exit code of the dt-overlay command
From: Henry Wang

Fix the name mismatch in the xl dt-overlay command, the
command name should be "dt-overlay" instead of "dt_overlay".
Add the missing "," in the cmdtable.

Fix the exit code of the dt-overlay command, use EXIT_FAILURE
instead of ERROR_FAIL.

Fixes: 61765a07e3d8 ("tools/xl: Add new xl command overlay for device tree overlay support")
Suggested-by: Anthony PERARD
Signed-off-by: Henry Wang
Reviewed-by: Jason Andryuk
Reviewed-by: Stefano Stabellini
---
 tools/xl/xl_cmdtable.c  | 2 +-
 tools/xl/xl_vmcontrol.c | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c
index 62bdb2aeaa..1f3c6b5897 100644
--- a/tools/xl/xl_cmdtable.c
+++ b/tools/xl/xl_cmdtable.c
@@ -635,7 +635,7 @@ const struct cmd_spec cmd_table[] = {
     { "dt-overlay",
       &main_dt_overlay, 0, 1,
       "Add/Remove a device tree overlay",
-      "add/remove <.dtbo>"
+      "add/remove <.dtbo>",
       "-h print this help\n"
     },
 #endif
diff --git a/tools/xl/xl_vmcontrol.c b/tools/xl/xl_vmcontrol.c
index 98f6bd2e76..02575d5d36 100644
--- a/tools/xl/xl_vmcontrol.c
+++ b/tools/xl/xl_vmcontrol.c
@@ -1278,7 +1278,7 @@ int main_dt_overlay(int argc, char **argv)
     const int overlay_remove_op = 2;
 
     if (argc < 2) {
-        help("dt_overlay");
+        help("dt-overlay");
         return EXIT_FAILURE;
     }
 
@@ -1302,11 +1302,11 @@ int main_dt_overlay(int argc, char **argv)
             fprintf(stderr, "failed to read the overlay device tree file %s\n",
                     overlay_config_file);
             free(overlay_dtb);
-            return ERROR_FAIL;
+            return EXIT_FAILURE;
         }
     } else {
         fprintf(stderr, "overlay dtbo file not provided\n");
-        return ERROR_FAIL;
+        return EXIT_FAILURE;
     }
 
     rc = libxl_dt_overlay(ctx, overlay_dtb, overlay_dtb_size, op);
--
2.25.1
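
Why the exit-code part of this fix matters can be shown with a short sketch. It assumes that the value returned by a command handler ends up as the process exit status, and that ERROR_FAIL is a negative libxl error code (taken here as -3, an assumption for illustration); the SKETCH_ and sketch_ names are hypothetical.

```c
#include <stdlib.h>

/* Assumption: stands in for libxl's ERROR_FAIL, a negative error code. */
#define SKETCH_ERROR_FAIL (-3)

/*
 * On POSIX systems only the low 8 bits of the value returned from main()
 * are visible to the parent process, so a negative library error code
 * turns into an unrelated-looking status.
 */
int sketch_status_seen_by_parent(int returned)
{
    return returned & 0xff;
}
```

Under these assumptions, returning SKETCH_ERROR_FAIL shows up in the shell as 253, whereas EXIT_FAILURE is passed through unchanged, which is what scripts checking the status of xl expect.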
[PATCH v5 5/7] xen/arm: Add XEN_DOMCTL_dt_overlay and device attachment to domains
From: Henry Wang

In order to support the dynamic dtbo device assignment to a running
VM, the add/remove of the DT overlay and the attach/detach of the
device from the DT overlay should happen separately. Therefore,
repurpose the existing XEN_SYSCTL_dt_overlay to only add the DT
overlay to Xen device tree, instead of assigning the device to the
hardware domain at the same time. Changing the sysctl behavior and
breaking compatibility is OK because this feature is experimental.

Add the XEN_DOMCTL_dt_overlay with the operation
XEN_DOMCTL_DT_OVERLAY_ATTACH to do the device assignment to the domain.

The hypervisor first checks that the DT overlay passed from the
toolstack is valid. Then the device nodes are retrieved from the overlay
tracker based on the DT overlay. The attach of the device is implemented
by mapping the IRQ and IOMMU resources. All devices in the overlay are
assigned to a single domain.

Also take the opportunity to make one coding style fix in sysctl.h.

xen,reg is to be used to handle non-1:1 mappings but it is currently
unsupported.

Signed-off-by: Henry Wang
Signed-off-by: Vikram Garhwal
Signed-off-by: Stefano Stabellini
---
 xen/arch/arm/domctl.c        | 3 +
 xen/common/dt-overlay.c      | 207 ++-
 xen/include/public/domctl.h  | 16 ++-
 xen/include/public/sysctl.h  | 11 +-
 xen/include/xen/dt-overlay.h | 8 ++
 5 files changed, 186 insertions(+), 59 deletions(-)

diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c
index ad56efb0f5..12a12ee781 100644
--- a/xen/arch/arm/domctl.c
+++ b/xen/arch/arm/domctl.c
@@ -5,6 +5,7 @@
  * Copyright (c) 2012, Citrix Systems
  */
 
+#include
 #include
 #include
 #include
@@ -176,6 +177,8 @@ long arch_do_domctl(struct xen_domctl *domctl, struct domain *d,
         return rc;
     }
+    case XEN_DOMCTL_dt_overlay:
+        return dt_overlay_domctl(d, &domctl->u.dt_overlay);
     default:
         return subarch_do_domctl(domctl, d, u_domctl);
     }
diff --git a/xen/common/dt-overlay.c b/xen/common/dt-overlay.c
index 9cece79067..c2b03865a7 100644
--- a/xen/common/dt-overlay.c
+++ b/xen/common/dt-overlay.c
@@ -356,6 +356,42 @@ static int overlay_get_nodes_info(const void *fdto, char **nodes_full_path)
     return 0;
 }
 
+/* This function should be called with the overlay_lock taken */
+static struct overlay_track *
+find_track_entry_from_tracker(const void *overlay_fdt,
+                              uint32_t overlay_fdt_size)
+{
+    struct overlay_track *entry, *temp;
+    bool found_entry = false;
+
+    ASSERT(spin_is_locked(&overlay_lock));
+
+    /*
+     * First check if dtbo is correct i.e. it should one of the dtbo which was
+     * used when dynamically adding the node.
+     * Limitation: Cases with same node names but different property are not
+     * supported currently. We are relying on user to provide the same dtbo
+     * as it was used when adding the nodes.
+     */
+    list_for_each_entry_safe( entry, temp, &overlay_tracker, entry )
+    {
+        if ( memcmp(entry->overlay_fdt, overlay_fdt, overlay_fdt_size) == 0 )
+        {
+            found_entry = true;
+            break;
+        }
+    }
+
+    if ( !found_entry )
+    {
+        printk(XENLOG_ERR "Cannot find any matching tracker with input dtbo."
+               " Operation is supported only for prior added dtbo.\n");
+        return NULL;
+    }
+
+    return entry;
+}
+
 /* Check if node itself can be removed and remove node from IOMMU. */
 static int remove_node_resources(struct dt_device_node *device_node)
 {
@@ -485,8 +521,7 @@ static long handle_remove_overlay_nodes(const void *overlay_fdt,
                                         uint32_t overlay_fdt_size)
 {
     int rc;
-    struct overlay_track *entry, *temp, *track;
-    bool found_entry = false;
+    struct overlay_track *entry;
 
     rc = check_overlay_fdt(overlay_fdt, overlay_fdt_size);
     if ( rc )
@@ -494,29 +529,10 @@ static long handle_remove_overlay_nodes(const void *overlay_fdt,
 
     spin_lock(&overlay_lock);
 
-    /*
-     * First check if dtbo is correct i.e. it should one of the dtbo which was
-     * used when dynamically adding the node.
-     * Limitation: Cases with same node names but different property are not
-     * supported currently. We are relying on user to provide the same dtbo
-     * as it was used when adding the nodes.
-     */
-    list_for_each_entry_safe( entry, temp, &overlay_tracker, entry )
-    {
-        if ( memcmp(entry->overlay_fdt, overlay_fdt, overlay_fdt_size) == 0 )
-        {
-            track = entry;
-            found_entry = true;
-            break;
-        }
-    }
-
-    if ( !found_entry )
+    entry = find_track_entry_from_tracker(overlay_fdt, overlay_fdt_size);
+    if ( entry == NULL )
     {
         rc = -EINVAL;
-
-        printk(XENLOG_ERR "Cannot find any matching tracker with input dtbo."
-               " Removing nodes is supported only for prior added dtbo.\n");
         goto out;
     }
 
@@
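
The lookup that this patch factors out, finding the tracked overlay whose blob is byte-identical to the one passed in, can be sketched outside the hypervisor as follows. This is an illustration under assumptions, not the Xen code: it uses a plain singly linked list instead of Xen's list macros, all names are hypothetical, and it additionally compares the stored size before calling memcmp, which the quoted hunk does not do.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tracker entry: one previously added overlay blob. */
struct sketch_track {
    struct sketch_track *next;
    const void *fdt;       /* copy of the dtbo that was added */
    uint32_t fdt_size;     /* its size in bytes */
};

/*
 * Return the entry whose blob matches the given one byte for byte, or
 * NULL if no previously added overlay matches. The size check is an
 * addition made for this sketch.
 */
const struct sketch_track *
sketch_find_track(const struct sketch_track *head,
                  const void *fdt, uint32_t size)
{
    const struct sketch_track *e;

    for (e = head; e != NULL; e = e->next) {
        if (e->fdt_size == size && memcmp(e->fdt, fdt, size) == 0)
            return e;
    }
    return NULL;
}
```

As in the patch, the caller is expected to hold whatever lock protects the list while searching and while using the returned entry.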
[PATCH v5 4/7] xen/arm/gic: Allow adding interrupt to running VMs
From: Henry Wang

Currently, adding physical interrupts is only allowed at domain
creation time. For use cases such as dynamic device tree overlay
addition, adding physical IRQs to running domains should be allowed.

Drop the above-mentioned domain creation check. Since this introduces
interrupt state unsync issues when the interrupt is active or pending
in the guest, simply reject the operation in these cases. Do it for
both new and old vGIC implementations.

Signed-off-by: Henry Wang
Signed-off-by: Stefano Stabellini
Reviewed-by: Julien Grall
---
 xen/arch/arm/gic-vgic.c  | 9 +++--
 xen/arch/arm/gic.c       | 8 
 xen/arch/arm/vgic/vgic.c | 7 +--
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/xen/arch/arm/gic-vgic.c b/xen/arch/arm/gic-vgic.c
index 56490dbc43..b99e287224 100644
--- a/xen/arch/arm/gic-vgic.c
+++ b/xen/arch/arm/gic-vgic.c
@@ -442,9 +442,14 @@ int vgic_connect_hw_irq(struct domain *d, struct vcpu *v, unsigned int virq,
     if ( connect )
     {
-        /* The VIRQ should not be already enabled by the guest */
+        /*
+         * The VIRQ should not be already enabled by the guest nor
+         * active/pending in the guest.
+         */
         if ( !p->desc &&
-             !test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) )
+             !test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) &&
+             !test_bit(GIC_IRQ_GUEST_VISIBLE, &p->status) &&
+             !test_bit(GIC_IRQ_GUEST_ACTIVE, &p->status) )
             p->desc = desc;
         else
             ret = -EBUSY;
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 44c40e86de..b3467a76ae 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -135,14 +135,6 @@ int gic_route_irq_to_guest(struct domain *d, unsigned int virq,
     ASSERT(virq < vgic_num_irqs(d));
     ASSERT(!is_lpi(virq));
 
-    /*
-     * When routing an IRQ to guest, the virtual state is not synced
-     * back to the physical IRQ. To prevent get unsync, restrict the
-     * routing to when the Domain is been created.
-     */
-    if ( d->creation_finished )
-        return -EBUSY;
-
     ret = vgic_connect_hw_irq(d, NULL, virq, desc, true);
     if ( ret )
         return ret;
diff --git a/xen/arch/arm/vgic/vgic.c b/xen/arch/arm/vgic/vgic.c
index b9463a5f27..6cabd0496d 100644
--- a/xen/arch/arm/vgic/vgic.c
+++ b/xen/arch/arm/vgic/vgic.c
@@ -876,8 +876,11 @@ int vgic_connect_hw_irq(struct domain *d, struct vcpu *vcpu,
     if ( connect )                      /* assign a mapped IRQ */
     {
-        /* The VIRQ should not be already enabled by the guest */
-        if ( !irq->hw && !irq->enabled )
+        /*
+         * The VIRQ should not be already enabled by the guest nor
+         * active/pending in the guest.
+         */
+        if ( !irq->hw && !irq->enabled && !irq->active && !irq->pending_latch )
         {
             irq->hw = true;
             irq->hwintid = desc->irq;
--
2.25.1
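
The acceptance rule this patch introduces can be restated as a small standalone predicate. The sketch below mirrors the condition in the new-vGIC hunk using a simplified, hypothetical state struct; it is not the hypervisor's data structure and leaves out locking and the actual routing.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified view of a virtual IRQ's state. */
struct sketch_virq {
    bool hw;        /* already backed by a physical IRQ */
    bool enabled;   /* enabled by the guest */
    bool active;    /* currently active in the guest */
    bool pending;   /* pending in the guest */
};

/*
 * Connect a physical IRQ only when the virtual IRQ is completely idle;
 * otherwise report -EBUSY, as virtual and physical state could diverge.
 */
int sketch_connect_hw_irq(struct sketch_virq *irq)
{
    if (irq->hw || irq->enabled || irq->active || irq->pending)
        return -EBUSY;

    irq->hw = true;
    return 0;
}
```

With the creation-time restriction gone, this state check is the only thing standing between a running guest and a mismatched interrupt state, which is why both vGIC implementations gain the extra conditions.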
[PATCH v5 6/7] tools: Introduce the "xl dt-overlay attach" command
From: Henry Wang

With the XEN_DOMCTL_dt_overlay DOMCTL added, users should be able to
attach (in the future also detach) devices from the provided DT overlay
to domains. Support this by introducing a new "xl dt-overlay" command
and related documentation, i.e. "xl dt-overlay attach". Slightly rework
the command option parsing logic.

Signed-off-by: Henry Wang
Signed-off-by: Stefano Stabellini
Reviewed-by: Jason Andryuk
Reviewed-by: Stefano Stabellini
---
 tools/include/libxl.h               | 15 +++
 tools/include/xenctrl.h             |  3 +++
 tools/libs/ctrl/xc_dt_overlay.c     | 31 +++
 tools/libs/light/libxl_dt_overlay.c | 28 +
 tools/xl/xl_cmdtable.c              |  4 +--
 tools/xl/xl_vmcontrol.c             | 39 -
 6 files changed, 106 insertions(+), 14 deletions(-)

diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 3b5c18b48b..f2e19ec592 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -643,6 +643,12 @@
  */
 #define LIBXL_HAVE_NR_SPIS 1
 
+/*
+ * LIBXL_HAVE_OVERLAY_DOMAIN indicates the presence of
+ * libxl_dt_overlay_domain.
+ */
+#define LIBXL_HAVE_OVERLAY_DOMAIN 1
+
 /*
  * libxl memory management
  *
@@ -2556,8 +2562,17 @@ libxl_device_pci *libxl_device_pci_list(libxl_ctx *ctx, uint32_t domid,
 void libxl_device_pci_list_free(libxl_device_pci* list, int num);
 
 #if defined(__arm__) || defined(__aarch64__)
+/* Values should keep consistent with the op from XEN_SYSCTL_dt_overlay */
+#define LIBXL_DT_OVERLAY_ADD    1
+#define LIBXL_DT_OVERLAY_REMOVE 2
 int libxl_dt_overlay(libxl_ctx *ctx, void *overlay,
                      uint32_t overlay_size, uint8_t overlay_op);
+
+/* Values should keep consistent with the op from XEN_DOMCTL_dt_overlay */
+#define LIBXL_DT_OVERLAY_DOMAIN_ATTACH 1
+int libxl_dt_overlay_domain(libxl_ctx *ctx, uint32_t domain_id,
+                            void *overlay_dt, uint32_t overlay_dt_size,
+                            uint8_t overlay_op);
 #endif
 
 /*
diff --git a/tools/include/xenctrl.h b/tools/include/xenctrl.h
index 4996855944..9ceca0cffc 100644
--- a/tools/include/xenctrl.h
+++ b/tools/include/xenctrl.h
@@ -2657,6 +2657,9 @@ int xc_domain_cacheflush(xc_interface *xch, uint32_t domid,
 #if defined(__arm__) || defined(__aarch64__)
 int xc_dt_overlay(xc_interface *xch, void *overlay_fdt,
                   uint32_t overlay_fdt_size, uint8_t overlay_op);
+int xc_dt_overlay_domain(xc_interface *xch, void *overlay_fdt,
+                         uint32_t overlay_fdt_size, uint8_t overlay_op,
+                         uint32_t domain_id);
 #endif
 
 /* Compat shims */
diff --git a/tools/libs/ctrl/xc_dt_overlay.c b/tools/libs/ctrl/xc_dt_overlay.c
index c2224c4d15..ea1da522d1 100644
--- a/tools/libs/ctrl/xc_dt_overlay.c
+++ b/tools/libs/ctrl/xc_dt_overlay.c
@@ -48,3 +48,34 @@ err:
 
     return err;
 }
+
+int xc_dt_overlay_domain(xc_interface *xch, void *overlay_fdt,
+                         uint32_t overlay_fdt_size, uint8_t overlay_op,
+                         uint32_t domain_id)
+{
+    int err;
+    struct xen_domctl domctl = {
+        .cmd = XEN_DOMCTL_dt_overlay,
+        .domain = domain_id,
+        .u.dt_overlay = {
+            .overlay_op = overlay_op,
+            .overlay_fdt_size = overlay_fdt_size,
+        }
+    };
+
+    DECLARE_HYPERCALL_BOUNCE(overlay_fdt, overlay_fdt_size,
+                             XC_HYPERCALL_BUFFER_BOUNCE_IN);
+
+    if ( (err = xc_hypercall_bounce_pre(xch, overlay_fdt)) )
+        goto err;
+
+    set_xen_guest_handle(domctl.u.dt_overlay.overlay_fdt, overlay_fdt);
+
+    if ( (err = do_domctl(xch, &domctl)) != 0 )
+        PERROR("%s failed", __func__);
+
+err:
+    xc_hypercall_bounce_post(xch, overlay_fdt);
+
+    return err;
+}
diff --git a/tools/libs/light/libxl_dt_overlay.c b/tools/libs/light/libxl_dt_overlay.c
index a6c709a6dc..00503b76bd 100644
--- a/tools/libs/light/libxl_dt_overlay.c
+++ b/tools/libs/light/libxl_dt_overlay.c
@@ -69,3 +69,31 @@ out:
     return rc;
 }
 
+int libxl_dt_overlay_domain(libxl_ctx *ctx, uint32_t domain_id,
+                            void *overlay_dt, uint32_t overlay_dt_size,
+                            uint8_t overlay_op)
+{
+    int rc;
+    int r;
+    GC_INIT(ctx);
+
+    if (check_overlay_fdt(gc, overlay_dt, overlay_dt_size)) {
+        LOG(ERROR, "Overlay DTB check failed");
+        rc = ERROR_FAIL;
+        goto out;
+    } else {
+        LOG(DEBUG, "Overlay DTB check passed");
+        rc = 0;
+    }
+
+    r = xc_dt_overlay_domain(ctx->xch, overlay_dt, overlay_dt_size, overlay_op,
+                             domain_id);
+    if (r) {
+        LOG(ERROR, "%s: Attaching/Detaching overlay dtb failed.", __func__);
+        rc = ERROR_FAIL;
+    }
+
+out:
+    GC_FREE;
+    return rc;
+}
diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c
index 1f3c6b5897..42751228c1 100644
--- a/tools/xl/xl_cmdtable.c
+++
[PATCH v5 3/7] tools/arm: Introduce the "nr_spis" xl config entry
From: Henry Wang

Currently, the number of SPIs allocated to the domain is only
configurable for Dom0less DomUs. Xen domains are supposed to be
platform agnostic and therefore the number of SPIs for libxl guests
should not be based on the hardware.

Introduce a new xl config entry for Arm to provide a method for the
user to decide the number of SPIs. This would help to avoid bumping
`config->arch.nr_spis` in libxl every time there is a new platform
with increased SPI numbers.

Update the doc and the golang bindings accordingly.

Signed-off-by: Henry Wang
Signed-off-by: Stefano Stabellini
Reviewed-by: Jason Andryuk
---
 docs/man/xl.cfg.5.pod.in             | 16 ++++++++++++++++
 tools/golang/xenlight/helpers.gen.go |  2 ++
 tools/golang/xenlight/types.gen.go   |  1 +
 tools/include/libxl.h                |  7 +++++++
 tools/libs/light/libxl_arm.c         |  4 ++--
 tools/libs/light/libxl_types.idl     |  1 +
 tools/xl/xl_parse.c                  |  3 +++
 7 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index 8f2b375ce9..ac3f88fd57 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -3072,6 +3072,22 @@ raised.
 
 =back
 
+=over 4
+
+=item B
+
+An optional integer parameter specifying the number of SPIs (Shared
+Peripheral Interrupts) to allocate for the domain. Max is 991 SPIs. If
+the value specified by the `nr_spis` parameter is smaller than the
+number of SPIs calculated by the toolstack based on the devices
+allocated for the domain, or the `nr_spis` parameter is not specified,
+the value calculated by the toolstack will be used for the domain.
+Otherwise, the value specified by the `nr_spis` parameter will be used.
+The number of SPIs should match the highest interrupt ID that will be
+assigned to the domain.
+
+=back
+
 =head3 x86
 
 =over 4
diff --git a/tools/golang/xenlight/helpers.gen.go b/tools/golang/xenlight/helpers.gen.go
index b9cb5b33c7..fe5110474d 100644
--- a/tools/golang/xenlight/helpers.gen.go
+++ b/tools/golang/xenlight/helpers.gen.go
@@ -1154,6 +1154,7 @@ return fmt.Errorf("invalid union key '%v'", x.Type)}
 x.ArchArm.GicVersion = GicVersion(xc.arch_arm.gic_version)
 x.ArchArm.Vuart = VuartType(xc.arch_arm.vuart)
 x.ArchArm.SveVl = SveType(xc.arch_arm.sve_vl)
+x.ArchArm.NrSpis = uint32(xc.arch_arm.nr_spis)
 if err := x.ArchX86.MsrRelaxed.fromC(&xc.arch_x86.msr_relaxed);err != nil {
 return fmt.Errorf("converting field ArchX86.MsrRelaxed: %v", err)
 }
@@ -1670,6 +1671,7 @@ return fmt.Errorf("invalid union key '%v'", x.Type)}
 xc.arch_arm.gic_version = C.libxl_gic_version(x.ArchArm.GicVersion)
 xc.arch_arm.vuart = C.libxl_vuart_type(x.ArchArm.Vuart)
 xc.arch_arm.sve_vl = C.libxl_sve_type(x.ArchArm.SveVl)
+xc.arch_arm.nr_spis = C.uint32_t(x.ArchArm.NrSpis)
 if err := x.ArchX86.MsrRelaxed.toC(&xc.arch_x86.msr_relaxed); err != nil {
 return fmt.Errorf("converting field ArchX86.MsrRelaxed: %v", err)
 }
diff --git a/tools/golang/xenlight/types.gen.go b/tools/golang/xenlight/types.gen.go
index 5b293755d7..c9e45b306f 100644
--- a/tools/golang/xenlight/types.gen.go
+++ b/tools/golang/xenlight/types.gen.go
@@ -597,6 +597,7 @@ ArchArm struct {
 GicVersion GicVersion
 Vuart VuartType
 SveVl SveType
+NrSpis uint32
 }
 ArchX86 struct {
 MsrRelaxed Defbool
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 62cb07dea6..3b5c18b48b 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -636,6 +636,13 @@
  */
 #define LIBXL_HAVE_XEN_9PFS 1
 
+/*
+ * LIBXL_HAVE_NR_SPIS indicates the presence of the nr_spis field in
+ * libxl_domain_build_info that specifies the number of SPIs interrupts
+ * for the guest.
+ */
+#define LIBXL_HAVE_NR_SPIS 1
+
 /*
  * libxl memory management
  *
diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index 1cb89fa584..a4029e3ac8 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -181,8 +181,8 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
 
     LOG(DEBUG, "Configure the domain");
 
-    config->arch.nr_spis = nr_spis;
-    LOG(DEBUG, " - Allocate %u SPIs", nr_spis);
+    config->arch.nr_spis = max(nr_spis, d_config->b_info.arch_arm.nr_spis);
+    LOG(DEBUG, " - Allocate %u SPIs", config->arch.nr_spis);
 
     switch (d_config->b_info.arch_arm.gic_version) {
     case LIBXL_GIC_VERSION_DEFAULT:
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index 79e9c656cc..4e65e6fda5 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -722,6 +722,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
     ("arch_arm", Struct(None, [("gic_version", libxl_gic_version),
                                ("vuart", libxl_vuart_type),
                                ("sve_vl", libxl_sve_type),
+                               ("nr_spis", uint32),
                               ])),
     ("arch_x86", Struct(None, [("msr_relaxed", libxl_defbool),
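The selection rule described in the xl.cfg text above (use the larger of the toolstack-computed count and the configured `nr_spis`) can be sketched as a standalone function. This is an illustration only: `effective_nr_spis()` is a hypothetical name, and it assumes that an unspecified `nr_spis` reads as 0, which is what makes a plain maximum sufficient.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the max() in the patch.
 * toolstack_nr_spis: count derived from the devices given to the domain.
 * cfg_nr_spis:       value of the nr_spis config entry, assumed to be 0
 *                    when the user did not set it.
 */
static uint32_t effective_nr_spis(uint32_t toolstack_nr_spis,
                                  uint32_t cfg_nr_spis)
{
    return cfg_nr_spis > toolstack_nr_spis ? cfg_nr_spis
                                           : toolstack_nr_spis;
}
```

A consequence of this choice is that a configured value smaller than what the toolstack computed is silently overridden rather than rejected.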
[PATCH v5 2/7] xen/arm, doc: Add a DT property to specify IOMMU for Dom0less domUs
From: Henry Wang

There are some use cases in which the dom0less domUs need to have
XEN_DOMCTL_CDF_iommu set at domain construction time. For example,
the dynamic dtbo feature allows the domain to be assigned a device
that is behind the IOMMU at runtime. For these use cases, we need a
way to specify that the domain will need the IOMMU mapping at domain
construction time.

Introduce a "passthrough" DT property for Dom0less DomUs, following
the same entry as in xl.cfg. Currently only two options are provided,
i.e. "enabled" and "disabled". Set XEN_DOMCTL_CDF_iommu at domain
construction time based on the property.

Signed-off-by: Henry Wang
Reviewed-by: Julien Grall
---
 docs/misc/arm/device-tree/booting.txt | 16 ++++++++++++++++
 xen/arch/arm/dom0less-build.c         | 11 +++++++++--
 2 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/docs/misc/arm/device-tree/booting.txt b/docs/misc/arm/device-tree/booting.txt
index bbd955e9c2..f1fd069c87 100644
--- a/docs/misc/arm/device-tree/booting.txt
+++ b/docs/misc/arm/device-tree/booting.txt
@@ -260,6 +260,22 @@ with the following properties:
     value specified by Xen command line parameter gnttab_max_maptrack_frames
     (or its default value if unspecified, i.e. 1024) is used.
 
+- passthrough
+
+    A string property specifying whether IOMMU mappings are enabled for the
+    domain and hence whether it will be enabled for passthrough hardware.
+    Possible property values are:
+
+    - "enabled"
+    IOMMU mappings are enabled for the domain. Note that this option is the
+    default if the user provides the device partial passthrough device tree
+    for the domain.
+
+    - "disabled"
+    IOMMU mappings are disabled for the domain and so hardware may not be
+    passed through. This option is the default if this property is missing
+    and the user does not provide the device partial device tree for the domain.
+
 Under the "xen,domain" compatible node, one or more sub-nodes are present
 for the DomU kernel and ramdisk.
 
diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
index 74f053c242..5830a7051d 100644
--- a/xen/arch/arm/dom0less-build.c
+++ b/xen/arch/arm/dom0less-build.c
@@ -848,6 +848,8 @@ static int __init construct_domU(struct domain *d,
 void __init create_domUs(void)
 {
     struct dt_device_node *node;
+    const char *dom0less_iommu;
+    bool iommu = false;
     const struct dt_device_node *cpupool_node,
                                 *chosen = dt_find_node_by_path("/chosen");
 
@@ -895,8 +897,13 @@ void __init create_domUs(void)
             panic("Missing property 'cpus' for domain %s\n",
                   dt_node_name(node));
 
-        if ( dt_find_compatible_node(node, NULL, "multiboot,device-tree") &&
-             iommu_enabled )
+        if ( !dt_property_read_string(node, "passthrough", &dom0less_iommu) &&
+             !strcmp(dom0less_iommu, "enabled") )
+            iommu = true;
+
+        if ( iommu_enabled &&
+             (iommu || dt_find_compatible_node(node, NULL,
+                                               "multiboot,device-tree")) )
             d_cfg.flags |= XEN_DOMCTL_CDF_iommu;
 
         if ( !dt_property_read_u32(node, "nr_spis", &d_cfg.arch.nr_spis) )
-- 
2.25.1
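The decision made in create_domUs() reduces to a small boolean rule, sketched below outside of Xen. `domu_wants_iommu()` and its parameters are hypothetical: `passthrough` stands for the value of the "passthrough" property (NULL when the property is absent) and `has_partial_dt` for the presence of a "multiboot,device-tree" node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: should the IOMMU flag be set for this DomU?
 * iommu_enabled:  the platform IOMMU is available and enabled in Xen.
 * passthrough:    "passthrough" property value, or NULL if missing.
 * has_partial_dt: a partial passthrough device tree was provided.
 */
static bool domu_wants_iommu(bool iommu_enabled, const char *passthrough,
                             bool has_partial_dt)
{
    bool requested = passthrough != NULL &&
                     strcmp(passthrough, "enabled") == 0;

    return iommu_enabled && (requested || has_partial_dt);
}
```

In this reading, "disabled" does not override a provided partial device tree; it only matters as the default when neither input asks for the IOMMU.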
[PATCH v5 0/7] Remaining patches for dynamic node programming using overlay dtbo
Hi all,

This is the remaining series for the fully functional "dynamic node
programming using overlay dtbo" feature. The first part [1] has
already been merged.

Quoting from the original series, the first part has already made Xen
aware of new device tree nodes, which means updating the dt_host with
overlay node information. In this series, the goal is to map IRQ and
IOMMU during runtime, where we will do the actual IOMMU and IRQ
mapping and unmapping to a running domain. Also, documentation of the
"dynamic node programming using overlay dtbo" feature is added.

During the discussion in v3, I was recommended to split the overlay
device attach/detach to/from running domains into separate patches [3].
But I decided to only expose the xl user interfaces to the users once
device attach/detach is fully functional, so I didn't split the
toolstack patch (#8).

Patch 1 fixes an issue in the existing code that was noticed during my
local tests; please see the commit message for details.

Gitlab CI for this series can be found in [2].

[1] https://lore.kernel.org/xen-devel/20230906011631.30310-1-vikram.garh...@amd.com/
[2] https://gitlab.com/xen-project/people/henryw/xen/-/pipelines/1301720278
[3] https://lore.kernel.org/xen-devel/e743d3d2-5884-4e55-8627-85985ba33...@amd.com/

Changes in v5:
- address Julien's comments
- remove patches and mentions of the "detach" operation
- add a check for xen,reg and return error if present

- Stefano
Re: [PATCH v4 1/9] tools/xl: Correct the help information and exit code of the dt-overlay command
On Thu, 23 May 2024, Henry Wang wrote:
> Fix the name mismatch in the xl dt-overlay command, the
> command name should be "dt-overlay" instead of "dt_overlay".
> Add the missing "," in the cmdtable.
>
> Fix the exit code of the dt-overlay command, use EXIT_FAILURE
> instead of ERROR_FAIL.
>
> Fixes: 61765a07e3d8 ("tools/xl: Add new xl command overlay for device tree overlay support")
> Suggested-by: Anthony PERARD
> Signed-off-by: Henry Wang
> Reviewed-by: Jason Andryuk

Reviewed-by: Stefano Stabellini

> ---
> v4:
> - No change.
> v3:
> - Add Jason's Reviewed-by tag.
> v2:
> - New patch
> ---
>  tools/xl/xl_cmdtable.c  | 2 +-
>  tools/xl/xl_vmcontrol.c | 6 +++---
>  2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c
> index 62bdb2aeaa..1f3c6b5897 100644
> --- a/tools/xl/xl_cmdtable.c
> +++ b/tools/xl/xl_cmdtable.c
> @@ -635,7 +635,7 @@ const struct cmd_spec cmd_table[] = {
>      { "dt-overlay",
>        &main_dt_overlay, 0, 1,
>        "Add/Remove a device tree overlay",
> -      "add/remove <.dtbo>"
> +      "add/remove <.dtbo>",
>        "-h print this help\n"
>      },
>  #endif
> diff --git a/tools/xl/xl_vmcontrol.c b/tools/xl/xl_vmcontrol.c
> index 98f6bd2e76..02575d5d36 100644
> --- a/tools/xl/xl_vmcontrol.c
> +++ b/tools/xl/xl_vmcontrol.c
> @@ -1278,7 +1278,7 @@ int main_dt_overlay(int argc, char **argv)
>      const int overlay_remove_op = 2;
>
>      if (argc < 2) {
> -        help("dt_overlay");
> +        help("dt-overlay");
>          return EXIT_FAILURE;
>      }
>
> @@ -1302,11 +1302,11 @@ int main_dt_overlay(int argc, char **argv)
>              fprintf(stderr, "failed to read the overlay device tree file %s\n",
>                      overlay_config_file);
>              free(overlay_dtb);
> -            return ERROR_FAIL;
> +            return EXIT_FAILURE;
>          }
>      } else {
>          fprintf(stderr, "overlay dtbo file not provided\n");
> -        return ERROR_FAIL;
> +        return EXIT_FAILURE;
>      }
>
>      rc = libxl_dt_overlay(ctx, overlay_dtb, overlay_dtb_size, op);
> --
> 2.34.1
[xen-4.17-testing test] 186109: regressions - FAIL
flight 186109 xen-4.17-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/186109/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-arm64-pvops 6 kernel-build fail REGR. vs. 185864 build-amd64 6 xen-build fail in 186087 REGR. vs. 185864 build-amd64-xsm 6 xen-build fail in 186087 REGR. vs. 185864 build-i3866 xen-build fail in 186087 REGR. vs. 185864 build-amd64-prev 6 xen-build fail in 186087 REGR. vs. 185864 build-i386-xsm6 xen-build fail in 186087 REGR. vs. 185864 build-i386-prev 6 xen-build fail in 186087 REGR. vs. 185864 Tests which are failing intermittently (not blocking): test-armhf-armhf-xl-credit1 10 host-ping-check-xenfail pass in 186087 test-armhf-armhf-xl 8 xen-boot fail pass in 186087 Tests which did not succeed, but are not blocking: test-amd64-amd64-qemuu-nested-intel 1 build-check(1)blocked in 186087 n/a test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked in 186087 n/a test-xtf-amd64-amd64-51 build-check(1) blocked in 186087 n/a test-amd64-amd64-xl-credit2 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-migrupgrade 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-xl-qemut-ws16-amd64 1 build-check(1) blocked in 186087 n/a build-amd64-libvirt 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 1 build-check(1) blocked in 186087 n/a test-amd64-coresched-amd64-xl 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-libvirt 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-xl-qemuu-win7-amd64 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-dom0pvh-xl-amd 1 build-check(1)blocked in 186087 n/a build-i386-libvirt1 build-check(1) blocked in 186087 n/a test-xtf-amd64-amd64-11 build-check(1) blocked in 186087 n/a test-amd64-amd64-libvirt-raw 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-libvirt-vhd 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-xl-pvshim1 build-check(1) 
blocked in 186087 n/a test-amd64-amd64-xl-qemut-debianhvm-amd64 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-qemuu-freebsd12-amd64 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-qemuu-nested-amd 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-dom0pvh-xl-intel 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-xl-raw 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-xl-qemut-debianhvm-i386-xsm 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-libvirt-xsm 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-xl-pvhv2-amd 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-qemuu-freebsd11-amd64 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-xl-multivcpu 1 build-check(1) blocked in 186087 n/a test-xtf-amd64-amd64-41 build-check(1) blocked in 186087 n/a test-xtf-amd64-amd64-31 build-check(1) blocked in 186087 n/a test-amd64-amd64-xl-qemuu-ovmf-amd64 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-xl-qemuu-ws16-amd64 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-xl-xsm 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-xl-qemuu-debianhvm-amd64 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-xl-shadow1 build-check(1) blocked in 186087 n/a test-amd64-amd64-xl-credit1 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-xl 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-pygrub 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-libvirt-qcow2 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-xl-qcow2 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-xl-rtds 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-pair 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-libvirt-pair 1 build-check(1) blocked in 186087 n/a test-xtf-amd64-amd64-21 build-check(1) blocked in 186087 n/a 
test-amd64-amd64-xl-qemut-win7-amd64 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-livepatch1 build-check(1) blocked in 186087 n/a test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked in 186087 n/a test-amd64-amd64-xl-pvhv2-intel 1 build-check(1)
Re: [XEN PATCH v2 07/15] x86: guard cpu_has_{svm/vmx} macros with CONFIG_{SVM/VMX}
On Thu, 23 May 2024, Jan Beulich wrote:
> On 23.05.2024 15:07, Sergiy Kibrik wrote:
> > 16.05.24 14:12, Jan Beulich:
> >> On 15.05.2024 11:12, Sergiy Kibrik wrote:
> >>> --- a/xen/arch/x86/include/asm/cpufeature.h
> >>> +++ b/xen/arch/x86/include/asm/cpufeature.h
> >>> @@ -81,7 +81,8 @@ static inline bool boot_cpu_has(unsigned int feat)
> >>>  #define cpu_has_sse3            boot_cpu_has(X86_FEATURE_SSE3)
> >>>  #define cpu_has_pclmulqdq       boot_cpu_has(X86_FEATURE_PCLMULQDQ)
> >>>  #define cpu_has_monitor         boot_cpu_has(X86_FEATURE_MONITOR)
> >>> -#define cpu_has_vmx             boot_cpu_has(X86_FEATURE_VMX)
> >>> +#define cpu_has_vmx             ( IS_ENABLED(CONFIG_VMX) && \
> >>> +                                  boot_cpu_has(X86_FEATURE_VMX))
> >>>  #define cpu_has_eist            boot_cpu_has(X86_FEATURE_EIST)
> >>>  #define cpu_has_ssse3           boot_cpu_has(X86_FEATURE_SSSE3)
> >>>  #define cpu_has_fma             boot_cpu_has(X86_FEATURE_FMA)
> >>> @@ -109,7 +110,8 @@ static inline bool boot_cpu_has(unsigned int feat)
> >>>
> >>>  /* CPUID level 0x80000001.ecx */
> >>>  #define cpu_has_cmp_legacy      boot_cpu_has(X86_FEATURE_CMP_LEGACY)
> >>> -#define cpu_has_svm             boot_cpu_has(X86_FEATURE_SVM)
> >>> +#define cpu_has_svm             ( IS_ENABLED(CONFIG_SVM) && \
> >>> +                                  boot_cpu_has(X86_FEATURE_SVM))
> >>>  #define cpu_has_sse4a           boot_cpu_has(X86_FEATURE_SSE4A)
> >>>  #define cpu_has_xop             boot_cpu_has(X86_FEATURE_XOP)
> >>>  #define cpu_has_skinit          boot_cpu_has(X86_FEATURE_SKINIT)
> >>
> >> Hmm, leaving aside the style issue (stray blanks after opening parentheses,
> >> and as a result one-off indentation on the wrapped lines) I'm not really
> >> certain we can do this. The description goes into detail why we would want
> >> this, but it doesn't cover at all why it is safe for all present (and
> >> ideally also future) uses. I wouldn't be surprised if we had VMX/SVM checks
> >> just to derive further knowledge from that, without them being directly
> >> related to the use of VMX/SVM. Take a look at calculate_hvm_max_policy(),
> >> for example. While it looks to be okay there, it may give you an idea of
> >> what I mean.
> >>
> >> Things might become better separated if instead for such checks we used
> >> host and raw CPU policies instead of cpuinfo_x86.x86_capability[]. But
> >> that's still pretty far out, I'm afraid.
> >
> > I've followed a suggestion you made for patch in previous series:
> >
> > https://lore.kernel.org/xen-devel/8fbd604e-5e5d-410c-880f-2ad257bbe...@suse.com/
>
> See the "If not, ..." that I had put there. Doing the change just mechanically
> isn't enough, you also need to make clear (in the description) that you
> verified it's safe to have this way.

What does it mean to "verified it's safe to have this way"? "Safe" in
what way?

> > yet if this approach can potentially be unsafe (I'm not completely sure
> > it's safe), should we instead fallback to the way it was done in v1
> > series? I.e. guard calls to vmx/svm-specific calls where needed, like in
> > these 3 patches:
> >
> > 1)
> > https://lore.kernel.org/xen-devel/20240416063328.3469386-1-sergiy_kib...@epam.com/
> >
> > 2)
> > https://lore.kernel.org/xen-devel/20240416063740.3469592-1-sergiy_kib...@epam.com/
> >
> > 3)
> > https://lore.kernel.org/xen-devel/20240416063947.3469718-1-sergiy_kib...@epam.com/
>
> I don't like this sprinkling around of IS_ENABLED() very much. Maybe we want
> to have two new helpers (say using_svm() and using_vmx()), to be used in place
> of most but possibly not all cpu_has_{svm,vmx}? Doing such a transformation
> would then kind of implicitly answer the safety question above, as at every
> use site you'd need to judge whether the replacement is correct. If it's
> correct everywhere, the construct(s) as proposed in this version could then be
> considered to be used in this very shape (instead of introducing the two new
> helpers). But of course the transition could also be done gradually then,
> touching only those uses that previously you touched in 1), 2), and 3).
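Jan's message only names the two helpers, so the following is a guess at their shape rather than anything proposed in the thread. It is a self-contained sketch: `CONFIG_VMX`/`CONFIG_SVM` are plain macros standing in for Kconfig options, and `host_has_vmx`/`host_has_svm` are hypothetical variables standing in for the boot CPU feature bits that `cpu_has_vmx`/`cpu_has_svm` report in Xen.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: build-time switches, normally generated by Kconfig. */
#ifndef CONFIG_VMX
#define CONFIG_VMX 1
#endif
#ifndef CONFIG_SVM
#define CONFIG_SVM 0
#endif

/* Hypothetical stand-ins for the raw boot CPU feature bits. */
static bool host_has_vmx = true;
static bool host_has_svm = true;

/*
 * "Is the VMX/SVM code actually in use?" combines two facts: the support
 * was compiled in, and the CPU advertises the feature. The raw feature
 * bits stay available for callers that only want to know what the
 * hardware reports.
 */
static inline bool using_vmx(void)
{
    return CONFIG_VMX && host_has_vmx;
}

static inline bool using_svm(void)
{
    return CONFIG_SVM && host_has_svm;
}
```

Keeping the raw feature test separate from the "in use" test is what would force the per-call-site judgement described above: each existing `cpu_has_{svm,vmx}` use either moves to the new helper or deliberately stays as it is.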
Re: [PATCH v4 8/9] tools: Introduce the "xl dt-overlay {attach,detach}" commands
Hi Henry,

On 23/05/2024 08:40, Henry Wang wrote:
> With the XEN_DOMCTL_dt_overlay DOMCTL added, users should be able to
> attach/detach devices from the provided DT overlay to domains.
> Support this by introducing a new set of "xl dt-overlay" commands and
> related documentation, i.e. "xl dt-overlay {attach,detach}". Slightly
> rework the command option parsing logic.
>
> Signed-off-by: Henry Wang
> Reviewed-by: Jason Andryuk
> ---
> v4:
> - Add Jason's Reviewed-by tag.
> v3:
> - Introduce new API libxl_dt_overlay_domain() and co., instead of
>   reusing existing API libxl_dt_overlay().
> - Add in-code comments for the LIBXL_DT_OVERLAY_* macros.
> - Use find_domain() to avoid getting domain_id from strtol().
> v2:
> - New patch.
> ---
>  tools/include/libxl.h               | 10 +++
>  tools/include/xenctrl.h             |  3 +++
>  tools/libs/ctrl/xc_dt_overlay.c     | 31 +
>  tools/libs/light/libxl_dt_overlay.c | 28 +++
>  tools/xl/xl_cmdtable.c              |  4 +--
>  tools/xl/xl_vmcontrol.c             | 42 -
>  6 files changed, 104 insertions(+), 14 deletions(-)
>
> diff --git a/tools/include/libxl.h b/tools/include/libxl.h
> index 62cb07dea6..6cc6d6bf6a 100644
> --- a/tools/include/libxl.h
> +++ b/tools/include/libxl.h

I think you also need to introduce LIBXL_HAVE_...

Cheers,

-- 
Julien Grall
Re: [PATCH v4 3/9] tools/arm: Introduce the "nr_spis" xl config entry
Hi Henry,

On 23/05/2024 08:40, Henry Wang wrote:
> Currently, the number of SPIs allocated to the domain is only
> configurable for Dom0less DomUs. Xen domains are supposed to be
> platform agnostics and therefore the numbers of SPIs for libxl
> guests should not be based on the hardware.
>
> Introduce a new xl config entry for Arm to provide a method for
> user to decide the number of SPIs. This would help to avoid
> bumping the `config->arch.nr_spis` in libxl everytime there is a
> new platform with increased SPI numbers.
>
> Update the doc and the golang bindings accordingly.
>
> Signed-off-by: Henry Wang
> Reviewed-by: Jason Andryuk
> ---
> v4:
> - Add Jason's Reviewed-by tag.
> v3:
> - Reword documentation to avoid ambiguity.
> v2:
> - New patch to replace the original patch in v1:
>   "[PATCH 05/15] tools/libs/light: Increase nr_spi to 160"
> ---
>  docs/man/xl.cfg.5.pod.in             | 14 ++++++++++++++
>  tools/golang/xenlight/helpers.gen.go |  2 ++
>  tools/golang/xenlight/types.gen.go   |  1 +
>  tools/libs/light/libxl_arm.c         |  4 ++--
>  tools/libs/light/libxl_types.idl     |  1 +
>  tools/xl/xl_parse.c                  |  3 +++
>  6 files changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
> index 8f2b375ce9..416d582844 100644
> --- a/docs/man/xl.cfg.5.pod.in
> +++ b/docs/man/xl.cfg.5.pod.in
> @@ -3072,6 +3072,20 @@ raised.
>
>  =back
>
> +=over 4
> +
> +=item B
> +
> +An optional 32-bit integer parameter specifying the number of SPIs (Shared

We can't support that many SPIs :). The limit would be 991 SPIs.

> +Peripheral Interrupts) to allocate for the domain. If the value specified by
> +the `nr_spis` parameter is smaller than the number of SPIs calculated by the
> +toolstack based on the devices allocated for the domain, or the `nr_spis`
> +parameter is not specified, the value calculated by the toolstack will be used
> +for the domain. Otherwise, the value specified by the `nr_spis` parameter will
> +be used.

I think it would be worth mentioning that the number of SPIs should
match the highest interrupt ID that will be assigned to the domain
(rather than the number of SPIs planned to be assigned).

> +
> +=back
> +
>  =head3 x86
>
>  =over 4
> diff --git a/tools/golang/xenlight/helpers.gen.go b/tools/golang/xenlight/helpers.gen.go
> index b9cb5b33c7..fe5110474d 100644
> --- a/tools/golang/xenlight/helpers.gen.go
> +++ b/tools/golang/xenlight/helpers.gen.go
> @@ -1154,6 +1154,7 @@ return fmt.Errorf("invalid union key '%v'", x.Type)}
>  x.ArchArm.GicVersion = GicVersion(xc.arch_arm.gic_version)
>  x.ArchArm.Vuart = VuartType(xc.arch_arm.vuart)
>  x.ArchArm.SveVl = SveType(xc.arch_arm.sve_vl)
> +x.ArchArm.NrSpis = uint32(xc.arch_arm.nr_spis)
>  if err := x.ArchX86.MsrRelaxed.fromC(&xc.arch_x86.msr_relaxed);err != nil {
>  return fmt.Errorf("converting field ArchX86.MsrRelaxed: %v", err)
>  }
> @@ -1670,6 +1671,7 @@ return fmt.Errorf("invalid union key '%v'", x.Type)}
>  xc.arch_arm.gic_version = C.libxl_gic_version(x.ArchArm.GicVersion)
>  xc.arch_arm.vuart = C.libxl_vuart_type(x.ArchArm.Vuart)
>  xc.arch_arm.sve_vl = C.libxl_sve_type(x.ArchArm.SveVl)
> +xc.arch_arm.nr_spis = C.uint32_t(x.ArchArm.NrSpis)
>  if err := x.ArchX86.MsrRelaxed.toC(&xc.arch_x86.msr_relaxed); err != nil {
>  return fmt.Errorf("converting field ArchX86.MsrRelaxed: %v", err)
>  }
> diff --git a/tools/golang/xenlight/types.gen.go b/tools/golang/xenlight/types.gen.go
> index 5b293755d7..c9e45b306f 100644
> --- a/tools/golang/xenlight/types.gen.go
> +++ b/tools/golang/xenlight/types.gen.go
> @@ -597,6 +597,7 @@ ArchArm struct {
>  GicVersion GicVersion
>  Vuart VuartType
>  SveVl SveType
> +NrSpis uint32
>  }
>  ArchX86 struct {
>  MsrRelaxed Defbool
> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
> index 1cb89fa584..a4029e3ac8 100644
> --- a/tools/libs/light/libxl_arm.c
> +++ b/tools/libs/light/libxl_arm.c
> @@ -181,8 +181,8 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
>
>      LOG(DEBUG, "Configure the domain");
>
> -    config->arch.nr_spis = nr_spis;
> -    LOG(DEBUG, " - Allocate %u SPIs", nr_spis);
> +    config->arch.nr_spis = max(nr_spis, d_config->b_info.arch_arm.nr_spis);

I am not entirely sure about using max(). To me if the user specifies
a lower limit, then we should throw an error because this is likely an
indication that the SPIs they will want to assign will clash with the
emulated ones. So it would be better to warn at domain creation rather
than waiting until the IRQs are assigned.

I would like Anthony's opinion on this one. Given he is away this
month, I guess we could get this patch merged (with other comments
addressed) and have a follow-up if wanted before 4.19.

> +    LOG(DEBUG, " - Allocate %u SPIs", config->arch.nr_spis);
>
>      switch (d_config->b_info.arch_arm.gic_version) {
>      case LIBXL_GIC_VERSION_DEFAULT:
> diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
> index 79e9c656cc..4e65e6fda5 100644
> --- a/tools/libs/light/libxl_types.idl
> +++ b/tools/libs/light/libxl_types.idl
> @@ -722,6 +722,7
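The stricter behaviour Julien describes (reject a configured value that is too low instead of silently raising it) might look roughly like the sketch below. None of this is from the patch: `check_nr_spis()` and `MAX_NR_SPIS` are hypothetical names, 0 is assumed to mean "not specified", and the 991 limit is simply the figure quoted in the review comment.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Upper bound quoted in the review; treated here as an assumption. */
#define MAX_NR_SPIS 991u

/*
 * Hypothetical alternative to max(): validate the configured value.
 * required: SPI count the toolstack computed for the domain's devices.
 * cfg:      user-provided nr_spis, 0 meaning "not specified".
 * out:      receives the count to use on success.
 * Returns 0 on success, -EINVAL if cfg is below 'required' or above the
 * limit, so that the problem surfaces at domain creation.
 */
static int check_nr_spis(uint32_t required, uint32_t cfg, uint32_t *out)
{
    if ( cfg == 0 )
    {
        *out = required;
        return 0;
    }

    if ( cfg < required || cfg > MAX_NR_SPIS )
        return -EINVAL;

    *out = cfg;
    return 0;
}
```

The trade-off against max() is between convenience and early detection: a too-small value becomes a creation-time failure instead of a surprise when IRQs are assigned later.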
[linux-linus test] 186103: regressions - FAIL
flight 186103 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/186103/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-xsm 6 xen-build fail REGR. vs. 186052
 build-amd64 6 xen-build fail REGR. vs. 186052
 build-i386 6 xen-build fail REGR. vs. 186052
 build-i386-xsm 6 xen-build fail REGR. vs. 186052
 build-armhf 6 xen-build fail REGR. vs. 186052

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-vhd 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-shadow 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-rtds 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-raw 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-ws16-amd64 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-win7-amd64 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemut-ws16-amd64 1 build-check(1) blocked n/a
 build-amd64-libvirt 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemut-win7-amd64 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qcow2 1 build-check(1) blocked n/a
 build-armhf-libvirt 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-pvshim 1 build-check(1) blocked n/a
 build-i386-libvirt 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-pvhv2-intel 1 build-check(1) blocked n/a
 test-amd64-amd64-dom0pvh-xl-amd 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-pvhv2-amd 1 build-check(1) blocked n/a
 test-amd64-amd64-dom0pvh-xl-intel 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-multivcpu 1 build-check(1) blocked n/a
 test-amd64-amd64-examine 1 build-check(1) blocked n/a
 test-amd64-amd64-examine-bios 1 build-check(1) blocked n/a
 test-amd64-amd64-examine-uefi 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-credit2 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-pair 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-credit1 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-qcow2 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-xl 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-raw 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-vhd 1 build-check(1) blocked n/a
 test-amd64-amd64-qemuu-nested-intel 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-pair 1 build-check(1) blocked n/a
 test-amd64-amd64-qemuu-nested-amd 1 build-check(1) blocked n/a
 test-amd64-amd64-pygrub 1 build-check(1) blocked n/a
 test-amd64-amd64-qemuu-freebsd11-amd64 1 build-check(1) blocked n/a
 test-amd64-amd64-qemuu-freebsd12-amd64 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-xsm 1 build-check(1) blocked n/a
 test-amd64-coresched-amd64-xl 1 build-check(1) blocked n/a
 test-armhf-armhf-examine 1 build-check(1) blocked n/a
 test-armhf-armhf-libvirt 1 build-check(1) blocked n/a
 test-armhf-armhf-libvirt-vhd 1 build-check(1) blocked n/a
 test-armhf-armhf-xl 1 build-check(1) blocked n/a
 test-armhf-armhf-xl-arndale 1 build-check(1) blocked n/a
 test-armhf-armhf-xl-credit1 1 build-check(1) blocked n/a
 test-armhf-armhf-xl-credit2 1 build-check(1) blocked n/a
 test-armhf-armhf-xl-multivcpu 1 build-check(1) blocked n/a
 test-armhf-armhf-xl-qcow2 1 build-check(1) blocked n/a
 test-armhf-armhf-xl-raw
Re: [PATCH v4 9/9] docs: Add device tree overlay documentation
Hi Henry,

On 23/05/2024 08:40, Henry Wang wrote:
> From: Vikram Garhwal
>
> Signed-off-by: Vikram Garhwal
> Signed-off-by: Stefano Stabellini
> Signed-off-by: Henry Wang
> ---
> v4:
> - No change.
> v3:
> - No change.
> v2:
> - Update the content based on the changes in this version.
> ---
>  docs/misc/arm/overlay.txt | 99 +++
>  1 file changed, 99 insertions(+)
>  create mode 100644 docs/misc/arm/overlay.txt
>
> diff --git a/docs/misc/arm/overlay.txt b/docs/misc/arm/overlay.txt
> new file mode 100644
> index 00..811a6de369
> --- /dev/null
> +++ b/docs/misc/arm/overlay.txt
> @@ -0,0 +1,99 @@
> +# Device Tree Overlays support in Xen
> +
> +Xen now supports dynamic device assignment to running domains,

This reads as we "support" the feature. I would prefer if we write "Xen experimentally supports..." or similar.

> +i.e. adding/removing nodes (using .dtbo) to/from Xen device tree, and
> +attaching/detaching them to/from a running domain with given $domid.
> +
> +Dynamic node assignment works in two steps:
> +
> +## Add/Remove device tree overlay to/from Xen device tree
> +
> +1. Xen tools check the dtbo given and parse all other user provided arguments
> +2. Xen tools pass the dtbo to Xen hypervisor via hypercall.
> +3. Xen hypervisor applies/removes the dtbo to/from Xen device tree.
> +
> +## Attach/Detach device from the DT overlay to/from domain
> +
> +1. Xen tools check the dtbo given and parse all other user provided arguments
> +2. Xen tools pass the dtbo to Xen hypervisor via hypercall.
> +3. Xen hypervisor attach/detach the device to/from the user-provided $domid by
> +   mapping/unmapping node resources in the DT overlay.
> +
> +# Examples
> +
> +Here are a few examples on how to use it.
> +
> +## Dom0 device add
> +
> +For assigning a device tree overlay to Dom0, user should firstly properly
> +prepare the DT overlay. More information about device tree overlays can be
> +found in [1]. Then, in Dom0, enter the following:
> +
> +    (dom0) xl dt-overlay add overlay.dtbo
> +
> +This will allocate the devices mentioned in overlay.dtbo to Xen device tree.
> +
> +To assign the newly added device from the dtbo to Dom0:
> +
> +    (dom0) xl dt-overlay attach overlay.dtbo 0
> +
> +Next, if the user wants to add the same device tree overlay to dom0
> +Linux, execute the following:
> +
> +    (dom0) mkdir -p /sys/kernel/config/device-tree/overlays/new_overlay
> +    (dom0) cat overlay.dtbo > /sys/kernel/config/device-tree/overlays/new_overlay/dtbo
> +
> +Finally if needed, the relevant Linux kernel drive can be loaded using:
> +
> +    (dom0) modprobe module_name.ko
> +
> +## Dom0 device remove
> +
> +For removing the device from Dom0, first detach the device from Dom0:
> +
> +    (dom0) xl dt-overlay detach overlay.dtbo 0
> +
> +NOTE: The user is expected to unload any Linux kernel modules which
> +might be accessing the devices in overlay.dtbo before detach the device.
> +Detaching devices without unloading the modules might result in a crash.
> +
> +Then remove the overlay from Xen device tree:
> +
> +    (dom0) xl dt-overlay remove overlay.dtbo
> +
> +## DomU device add/remove
> +
> +All the nodes in dtbo will be assigned to a domain; the user will need
> +to prepare the dtb for the domU. For example, the `interrupt-parent` property
> +of the DomU overlay should be changed to the Xen hardcoded value `0xfde8`.
> +Below assumes the properly written DomU dtbo is `overlay_domu.dtbo`.
> +
> +User will need to create the DomU with below properties properly configured
> +in the xl config file:
> +- `iomem`

I don't quite understand how the user can specify the MMIO region if the device is attached after the domain is created.

> +- `passthrough` (if IOMMU is needed)
> +
> +User will also need to modprobe the relevant drivers.
> +
> +Example for domU device add:
> +
> +    (dom0) xl dt-overlay add overlay.dtbo # If not executed before
> +    (dom0) xl dt-overlay attach overlay.dtbo $domid

Can you clarify how the MMIO will be mapped? Is it direct mapped? If so, couldn't this result in a clash with other parts of the address space (e.g. RAM)?

> +    (dom0) xl console $domid # To access $domid console
> +
> +Next, if the user needs to modify/prepare the overlay.dtbo suitable for
> +the domU:
> +
> +    (domU) mkdir -p /sys/kernel/config/device-tree/overlays/new_overlay
> +    (domU) cat overlay_domu.dtbo > /sys/kernel/config/device-tree/overlays/new_overlay/dtbo
> +
> +Finally, if needed, the relevant Linux kernel drive can be probed:
> +
> +    (domU) modprobe module_name.ko
> +
> +Example for domU overlay remove:
> +
> +    (dom0) xl dt-overlay detach overlay.dtbo $domid
> +    (dom0) xl dt-overlay remove overlay.dtbo

I assume we have a safety check in place to ensure we can't remove the device if it is already attached. Is that correct?

Cheers,

--
Julien Grall
Re: [PATCH v4 7/9] xen/arm: Support device detachment from domains
Hi Henry,

On 23/05/2024 08:40, Henry Wang wrote:
> Similarly as the device attachment from DT overlay to domain, this commit
> implements the device detachment from domain. The DOMCTL
> XEN_DOMCTL_dt_overlay op is extended to have the operation
> XEN_DOMCTL_DT_OVERLAY_DETACH. The detachment of the device is implemented
> by unmapping the IRQ and IOMMU resources. Note that with these changes,
> the device de-registration from the IOMMU driver should only happen at the
> time when the DT overlay is removed from the Xen device tree.
>
> Signed-off-by: Henry Wang
> Signed-off-by: Vikram Garhwal
> ---
> v4:
> - Split the original patch, only do device detachment from domain.
> ---
>  xen/common/dt-overlay.c     | 243
>  xen/include/public/domctl.h |   3 +-
>  2 files changed, 194 insertions(+), 52 deletions(-)
>
> diff --git a/xen/common/dt-overlay.c b/xen/common/dt-overlay.c
> index 1087f9b502..693b6e4777 100644
> --- a/xen/common/dt-overlay.c
> +++ b/xen/common/dt-overlay.c
> @@ -392,24 +392,100 @@ find_track_entry_from_tracker(const void *overlay_fdt,
>      return entry;
>  }
>
> +static int remove_irq(unsigned long s, unsigned long e, void *data)
> +{
> +    struct domain *d = data;
> +    int rc = 0;
> +
> +    /*
> +     * IRQ should always have access unless there are duplication of
> +     * of irqs in device tree. There are few cases of xen device tree
> +     * where there are duplicate interrupts for the same node.
> +     */
> +    if (!irq_access_permitted(d, s))

Because of this check, it means that ...

> +        return 0;
> +    /*
> +     * TODO: We don't handle shared IRQs for now. So, it is assumed that
> +     * the IRQs was not shared with another domain.
> +     */
> +    rc = irq_deny_access(d, s);
> +    if ( rc )
> +    {
> +        printk(XENLOG_ERR "unable to revoke access for irq %ld\n", s);
> +        return rc;
> +    }
> +
> +    rc = release_guest_irq(d, s);

... if release_guest_irq() fails, on the next retry it will pass. I don't think this is what we want. Instead, we probably want to re-order the call.

> +    if ( rc )
> +    {
> +        printk(XENLOG_ERR "unable to release irq %ld\n", s);
> +        return rc;
> +    }
> +
> +    return rc;
> +}
> +
> +static int remove_all_irqs(struct rangeset *irq_ranges, struct domain *d)
> +{
> +    return rangeset_report_ranges(irq_ranges, 0, ~0UL, remove_irq, d);
> +}
> +
> +static int remove_iomem(unsigned long s, unsigned long e, void *data)
> +{
> +    struct domain *d = data;
> +    int rc = 0;
> +    p2m_type_t t;
> +    mfn_t mfn;
> +
> +    mfn = p2m_lookup(d, _gfn(s), );

What are you trying to address with this check? For instance, the fact that the first MFN is mapped doesn't guarantee the rest is.

> +    if ( mfn_x(mfn) == 0 || mfn_x(mfn) == ~0UL )

I don't understand why we are checking for 0 here. In theory, it is a valid MFN. Also, the second part wants to be INVALID_MFN.

> +        return -EINVAL;
> +
> +    rc = iomem_deny_access(d, s, e);

iomem_deny_access() works on MFN but here you pass an MFN. Are you assuming the GFN == MFN? How would that work for domains that are not direct mapped?

> +    if ( rc )
> +    {
> +        printk(XENLOG_ERR "Unable to remove %pd access to %#lx - %#lx\n",
> +               d, s, e);
> +        return rc;
> +    }
> +
> +    rc = unmap_mmio_regions(d, _gfn(s), e - s, _mfn(s));
> +    if ( rc )
> +        return rc;
> +
> +    return rc;
> +}
> +
> +static int remove_all_iomems(struct rangeset *iomem_ranges, struct domain *d)
> +{
> +    return rangeset_report_ranges(iomem_ranges, 0, ~0UL, remove_iomem, d);
> +}
> +
>  /* Check if node itself can be removed and remove node from IOMMU. */
> -static int remove_node_resources(struct dt_device_node *device_node)
> +static int remove_node_resources(struct dt_device_node *device_node,
> +                                 struct domain *d)
>  {
>      int rc = 0;
>      unsigned int len;
>      domid_t domid;
>
> -    domid = dt_device_used_by(device_node);
> +    if ( !d )

I looked at the code, I am a bit unsure how "d" can be NULL. Do you have any pointer?

> +    {
> +        domid = dt_device_used_by(device_node);
>
> -    dt_dprintk("Checking if node %s is used by any domain\n",
> -               device_node->full_name);
> +        dt_dprintk("Checking if node %s is used by any domain\n",
> +                   device_node->full_name);
>
> -    /* Remove the node if only it's assigned to hardware domain or domain io. */
> -    if ( domid != hardware_domain->domain_id && domid != DOMID_IO )
> -    {
> -        printk(XENLOG_ERR "Device %s is being used by domain %u. Removing nodes failed\n",
> -               device_node->full_name, domid);
> -        return -EINVAL;
> +        /*
> +         * We also check if device is assigned to DOMID_IO as when a domain
> +         * is destroyed device is assigned to DOMID_IO.
> +         */
> +        if ( domid != DOMID_IO )
> +        {
> +            printk(XENLOG_ERR "Device %s is being assigned to %u. Device is assigned to %d\n",
> +                   device_node->full_name, DOMID_IO, domid);
> +            return -EINVAL;
> +        }
>      }
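
The retry concern raised against remove_irq() in this review can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: struct demo_irq_state and the demo_* helpers are hypothetical stand-ins, not Xen's irq_access_permitted(), irq_deny_access() or release_guest_irq(), and "release first, then deny access" is just one possible reading of the suggested re-ordering, which the thread does not spell out.

```c
#include <stdbool.h>

/* Hypothetical model of the per-IRQ state the real helpers act on. */
struct demo_irq_state {
    bool access_permitted;  /* stands in for irq_access_permitted() */
    bool routed_to_guest;   /* stands in for routing undone by release_guest_irq() */
    int release_failures;   /* how many times the release step fails first */
};

/* Stand-in for irq_deny_access(): always succeeds in this model. */
static int demo_irq_deny_access(struct demo_irq_state *st)
{
    st->access_permitted = false;
    return 0;
}

/* Stand-in for release_guest_irq(): fails a configurable number of times. */
static int demo_release_guest_irq(struct demo_irq_state *st)
{
    if ( st->release_failures > 0 )
    {
        st->release_failures--;
        return -1;
    }
    st->routed_to_guest = false;
    return 0;
}

/* Ordering as in the quoted patch: deny access first, then release. */
int demo_remove_irq_patch_order(struct demo_irq_state *st)
{
    int rc;

    if ( !st->access_permitted )
        return 0;           /* a retry lands here after a failed release */

    rc = demo_irq_deny_access(st);
    if ( rc )
        return rc;

    return demo_release_guest_irq(st);
}

/* Assumed re-ordering: release first, revoke access only on success. */
int demo_remove_irq_reordered(struct demo_irq_state *st)
{
    int rc;

    if ( !st->access_permitted )
        return 0;

    rc = demo_release_guest_irq(st);
    if ( rc )
        return rc;          /* access still permitted, so a retry works */

    return demo_irq_deny_access(st);
}
```

In this model, a failed release followed by a retry returns 0 with the patch ordering while the IRQ is still routed to the guest; with the assumed re-ordering the retry performs the release before revoking access.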
Re: [PATCH v4 5/9] xen/arm: Add XEN_DOMCTL_dt_overlay and device attachment to domains
Hi Henry,

On 23/05/2024 08:40, Henry Wang wrote:
> In order to support the dynamic dtbo device assignment to a running
> VM, the add/remove of the DT overlay and the attach/detach of the
> device from the DT overlay should happen separately. Therefore,
> repurpose the existing XEN_SYSCTL_dt_overlay to only add the DT
> overlay to Xen device tree

I think it would be worth mentioning in the commit message why changing the sysctl behavior is fine. The feature is experimental and therefore breaking compatibility is ok.

> , instead of assigning the device to the hardware domain at the same
> time. Add the XEN_DOMCTL_dt_overlay with operations
> XEN_DOMCTL_DT_OVERLAY_ATTACH to do the device assignment to the domain.
>
> The hypervisor firstly checks the DT overlay passed from the toolstack
> is valid. Then the device nodes are retrieved from the overlay tracker
> based on the DT overlay. The attach of the device is implemented by
> mapping the IRQ and IOMMU resources.

So, the expectation is the user will always want to attach all the devices in the overlay to a single domain. Is that correct?

> Signed-off-by: Henry Wang
> Signed-off-by: Vikram Garhwal
> ---
> v4:
> - Split the original patch, only do the device attachment.
> v3:
> - Style fixes for arch-selection #ifdefs.
> - Do not include public/domctl.h, only add a forward declaration of
>   struct xen_domctl_dt_overlay.
> - Extract the overlay track entry finding logic to a function, drop
>   the unused variables.
> - Use op code 1&2 for XEN_DOMCTL_DT_OVERLAY_{ATTACH,DETACH}.
> v2:
> - New patch.
--- xen/arch/arm/domctl.c| 3 + xen/common/dt-overlay.c | 199 ++- xen/include/public/domctl.h | 14 +++ xen/include/public/sysctl.h | 11 +- xen/include/xen/dt-overlay.h | 7 ++ 5 files changed, 176 insertions(+), 58 deletions(-) diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c index ad56efb0f5..12a12ee781 100644 --- a/xen/arch/arm/domctl.c +++ b/xen/arch/arm/domctl.c @@ -5,6 +5,7 @@ * Copyright (c) 2012, Citrix Systems */ +#include #include #include #include @@ -176,6 +177,8 @@ long arch_do_domctl(struct xen_domctl *domctl, struct domain *d, return rc; } +case XEN_DOMCTL_dt_overlay: +return dt_overlay_domctl(d, >u.dt_overlay); default: return subarch_do_domctl(domctl, d, u_domctl); } diff --git a/xen/common/dt-overlay.c b/xen/common/dt-overlay.c index 9cece79067..1087f9b502 100644 --- a/xen/common/dt-overlay.c +++ b/xen/common/dt-overlay.c @@ -356,6 +356,42 @@ static int overlay_get_nodes_info(const void *fdto, char **nodes_full_path) return 0; } +/* This function should be called with the overlay_lock taken */ +static struct overlay_track * +find_track_entry_from_tracker(const void *overlay_fdt, + uint32_t overlay_fdt_size) +{ +struct overlay_track *entry, *temp; +bool found_entry = false; + +ASSERT(spin_is_locked(_lock)); + +/* + * First check if dtbo is correct i.e. it should one of the dtbo which was + * used when dynamically adding the node. + * Limitation: Cases with same node names but different property are not + * supported currently. We are relying on user to provide the same dtbo + * as it was used when adding the nodes. + */ +list_for_each_entry_safe( entry, temp, _tracker, entry ) +{ +if ( memcmp(entry->overlay_fdt, overlay_fdt, overlay_fdt_size) == 0 ) +{ +found_entry = true; +break; +} +} + +if ( !found_entry ) +{ +printk(XENLOG_ERR "Cannot find any matching tracker with input dtbo." 
+ " Operation is supported only for prior added dtbo.\n"); +return NULL; +} + +return entry; +} + /* Check if node itself can be removed and remove node from IOMMU. */ static int remove_node_resources(struct dt_device_node *device_node) { @@ -485,8 +521,7 @@ static long handle_remove_overlay_nodes(const void *overlay_fdt, uint32_t overlay_fdt_size) { int rc; -struct overlay_track *entry, *temp, *track; -bool found_entry = false; +struct overlay_track *entry; rc = check_overlay_fdt(overlay_fdt, overlay_fdt_size); if ( rc ) @@ -494,29 +529,10 @@ static long handle_remove_overlay_nodes(const void *overlay_fdt, spin_lock(_lock); -/* - * First check if dtbo is correct i.e. it should one of the dtbo which was - * used when dynamically adding the node. - * Limitation: Cases with same node names but different property are not - * supported currently. We are relying on user to provide the same dtbo - * as it was used when adding the nodes. - */ -list_for_each_entry_safe( entry, temp, _tracker, entry ) -{ -if ( memcmp(entry->overlay_fdt, overlay_fdt, overlay_fdt_size) == 0 ) -{ -track = entry; -found_entry =
Re: [PATCH v4 0/9] Remaining patches for dynamic node programming using overlay dtbo
On 23/05/2024 08:40, Henry Wang wrote:
> Hi all,

Hi Henry,

> This is the remaining series for the full functional "dynamic node
> programming using overlay dtbo" feature. The first part [1] has
> already been merged.
>
> Quoting from the original series, the first part has already made
> Xen aware of new device tree node which means updating the dt_host
> with overlay node information, and in this series, the goal is to
> map IRQ and IOMMU during runtime, where we will do the actual IOMMU
> and IRQ mapping and unmapping to a running domain. Also, documentation
> of the "dynamic node programming using overlay dtbo" feature is added.
>
> During the discussion in v3, I was recommended to split the overlay
> devices attach/detach to/from running domains to separated patches [3].
> But I decided to only expose the xl user interfaces together to the
> users after device attach/detach is fully functional, so I didn't
> split the toolstack patch (#8).

So I was asking to split so we can get some of the work merged for 4.19. Can you clarify whether the intention is to merge only patches #1-5?

Cheers,

--
Julien Grall
Re: [PATCH v4 4/9] xen/arm/gic: Allow adding interrupt to running VMs
Hi Henry,

On 23/05/2024 08:40, Henry Wang wrote:
> Currently, adding physical interrupts are only allowed at
> the domain creation time. For use cases such as dynamic device
> tree overlay addition, the adding of physical IRQ to
> running domains should be allowed.
>
> Drop the above-mentioned domain creation check. Since this
> will introduce interrupt state unsync issues for cases when the
> interrupt is active or pending in the guest, therefore for these
> cases we simply reject the operation. Do it for both new and old
> vGIC implementations.
>
> Signed-off-by: Henry Wang

With one remark below:

Reviewed-by: Julien Grall

> diff --git a/xen/arch/arm/vgic/vgic.c b/xen/arch/arm/vgic/vgic.c
> index b9463a5f27..048e12c562 100644
> --- a/xen/arch/arm/vgic/vgic.c
> +++ b/xen/arch/arm/vgic/vgic.c
> @@ -876,8 +876,11 @@ int vgic_connect_hw_irq(struct domain *d, struct vcpu *vcpu,
>
>      if ( connect )                      /* assign a mapped IRQ */
>      {
> -        /* The VIRQ should not be already enabled by the guest */
> -        if ( !irq->hw && !irq->enabled )
> +        /*
> +         * The VIRQ should not be already enabled by the guest nor
> +         * active/pending in the guest

Typo: Missing full stop. It can be fixed on commit.

> +         */
> +        if ( !irq->hw && !irq->enabled && !irq->active && !irq->pending_latch )
>          {
>              irq->hw = true;
>              irq->hwintid = desc->irq;

Cheers,

--
Julien Grall
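
The tightened condition in the hunk above can be read as a single predicate over the virtual IRQ state. The following is a minimal sketch, assuming a hypothetical struct demo_virq that carries only the four fields named in the patch; it is not Xen's struct vgic_irq and it leaves out locking and the actual connect logic.

```c
#include <stdbool.h>

/* Hypothetical reduction of the vIRQ state to the four fields in the patch. */
struct demo_virq {
    bool hw;            /* already backed by a physical IRQ */
    bool enabled;       /* enabled by the guest */
    bool active;        /* currently active in the guest */
    bool pending_latch; /* latched as pending in the guest */
};

/*
 * Mirrors the condition from the quoted hunk: a physical IRQ may only be
 * connected while the virtual IRQ is completely idle, so that no guest
 * visible interrupt state gets out of sync.
 */
bool demo_can_connect_hw_irq(const struct demo_virq *irq)
{
    return !irq->hw && !irq->enabled && !irq->active && !irq->pending_latch;
}
```

Any one of the four flags being set is enough for the request to be rejected, which matches the commit message's "we simply reject the operation".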
Re: [PATCH v4 2/9] xen/arm, doc: Add a DT property to specify IOMMU for Dom0less domUs
Hi Henry,

On 23/05/2024 08:40, Henry Wang wrote:
> There are some use cases in which the dom0less domUs need to have
> the XEN_DOMCTL_CDF_iommu set at the domain construction time. For
> example, the dynamic dtbo feature allows the domain to be assigned
> a device that is behind the IOMMU at runtime. For these use cases,
> we need to have a way to specify the domain will need the IOMMU
> mapping at domain construction time.
>
> Introduce a "passthrough" DT property for Dom0less DomUs following
> the same entry as the xl.cfg. Currently only provide two options,
> i.e. "enable" and "disable". Set the XEN_DOMCTL_CDF_iommu at domain
> construction time based on the property.
>
> Signed-off-by: Henry Wang

Reviewed-by: Julien Grall

Cheers,

--
Julien Grall
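
As a rough illustration of the mapping the commit message describes, a string property with the two values "enable" and "disable" can be turned into a creation flag as below. This is a sketch only: DEMO_CDF_IOMMU is a hypothetical stand-in for XEN_DOMCTL_CDF_iommu (its real bit position is not taken from the patch), demo_parse_passthrough() is an invented helper, and the device-tree lookup itself is left out.

```c
#include <string.h>

/* Hypothetical stand-in for XEN_DOMCTL_CDF_iommu; the real value differs. */
#define DEMO_CDF_IOMMU (1U << 0)

/*
 * Translate the value of a "passthrough" property into domain creation
 * flags. Returns 0 on success and -1 for any value other than the two
 * options named in the commit message.
 */
int demo_parse_passthrough(const char *val, unsigned int *flags)
{
    if ( strcmp(val, "enable") == 0 )
    {
        *flags |= DEMO_CDF_IOMMU;
        return 0;
    }

    if ( strcmp(val, "disable") == 0 )
    {
        *flags &= ~DEMO_CDF_IOMMU;
        return 0;
    }

    return -1;
}
```

Rejecting unknown strings, rather than silently treating them as "disable", is a design choice of this sketch and not something the quoted message states.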
[xen-unstable-smoke test] 186117: tolerable all pass - PUSHED
flight 186117 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/186117/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl 15 migrate-support-check fail never pass
 test-armhf-armhf-xl 16 saverestore-support-check fail never pass

version targeted for testing:
 xen 2a40b106e92aaa7ce808c8608dd6473edc67f608
baseline version:
 xen ced21fbb2842ac4655048bdee56232974ff9ff9c

Last test of basis 186064 2024-05-21 15:04:02 Z 2 days
Failing since 186104 2024-05-23 09:00:22 Z 0 days 4 attempts
Testing same since 186117 2024-05-23 17:02:09 Z 0 days 1 attempts

People who touched revisions under test:
 Alejandro Vallejo
 Alessandro Zucchelli
 Andrew Cooper
 Bobby Eshleman
 Christian Lindig
 George Dunlap
 Jan Beulich
 Julien Grall
 Olaf Hering
 Oleksandr Andrushchenko
 Oleksii Kurochko
 Roger Pau Monné
 Stewart Hildebrand
 Tamas K Lengyel
 Volodymyr Babchuk

jobs:
 build-arm64-xsm pass
 build-amd64 pass
 build-armhf pass
 build-amd64-libvirt pass
 test-armhf-armhf-xl pass
 test-arm64-arm64-xl-xsm pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 pass
 test-amd64-amd64-libvirt pass

sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
 http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
 http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
 http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
 http://xenbits.xen.org/gitweb?p=osstest.git;a=summary

Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
 ced21fbb28..2a40b106e9 2a40b106e92aaa7ce808c8608dd6473edc67f608 -> smoke
[libvirt test] 186099: regressions - FAIL
flight 186099 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/186099/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-xsm 6 xen-build fail REGR. vs. 186070
 build-amd64 6 xen-build fail REGR. vs. 186070
 build-i386-xsm 6 xen-build fail REGR. vs. 186070
 build-i386 6 xen-build fail REGR. vs. 186070
 build-armhf 6 xen-build fail REGR. vs. 186070

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt 1 build-check(1) blocked n/a
 build-armhf-libvirt 1 build-check(1) blocked n/a
 build-i386-libvirt 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-pair 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-qcow2 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-raw 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-vhd 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-xsm 1 build-check(1) blocked n/a
 test-armhf-armhf-libvirt 1 build-check(1) blocked n/a
 test-armhf-armhf-libvirt-vhd 1 build-check(1) blocked n/a
 test-arm64-arm64-libvirt 15 migrate-support-check fail never pass
 test-arm64-arm64-libvirt 16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-qcow2 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-qcow2 15 saverestore-support-check fail never pass

version targeted for testing:
 libvirt 66b052263d6ff046c60f4fce263e07c0d9bdd059
baseline version:
 libvirt 7dda4a03ac77bbe14b12b7b8f3a509a0e09f3129

Last test of basis 186070 2024-05-22 04:20:52 Z 1 days
Testing same since 186099 2024-05-23 04:18:41 Z 0 days 1 attempts

People who touched revisions under test:
 Michal Privoznik

jobs:
 build-amd64-xsm fail
 build-arm64-xsm pass
 build-i386-xsm fail
 build-amd64 fail
 build-arm64 pass
 build-armhf fail
 build-i386 fail
 build-amd64-libvirt blocked
 build-arm64-libvirt pass
 build-armhf-libvirt blocked
 build-i386-libvirt blocked
 build-amd64-pvops pass
 build-arm64-pvops pass
 build-armhf-pvops pass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm blocked
 test-amd64-amd64-libvirt-xsm blocked
 test-arm64-arm64-libvirt-xsm pass
 test-amd64-amd64-libvirt blocked
 test-arm64-arm64-libvirt pass
 test-armhf-armhf-libvirt blocked
 test-amd64-amd64-libvirt-pair blocked
 test-amd64-amd64-libvirt-qcow2 blocked
 test-arm64-arm64-libvirt-qcow2 pass
 test-amd64-amd64-libvirt-raw blocked
 test-arm64-arm64-libvirt-raw pass
 test-amd64-amd64-libvirt-vhd blocked
 test-armhf-armhf-libvirt-vhd blocked

sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
 http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
Re: [PATCH v2 3/8] x86/vlapic: Move lapic_load_hidden migration checks to the check hook
On 08/05/2024 1:39 pm, Alejandro Vallejo wrote:
> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
> index 8a24419c..2f06bff1b2cc 100644
> --- a/xen/arch/x86/hvm/vlapic.c
> +++ b/xen/arch/x86/hvm/vlapic.c
> @@ -1573,35 +1573,54 @@ static void lapic_load_fixup(struct vlapic *vlapic)
>             v, vlapic->loaded.id, vlapic->loaded.ldr, good_ldr);
>  }
>
> -static int cf_check lapic_load_hidden(struct domain *d, hvm_domain_context_t *h)
> +static int cf_check lapic_check_hidden(const struct domain *d,
> +                                       hvm_domain_context_t *h)
>  {
>      unsigned int vcpuid = hvm_load_instance(h);
> -    struct vcpu *v;
> -    struct vlapic *s;
> +    struct hvm_hw_lapic s;
>
>      if ( !has_vlapic(d) )
>          return -ENODEV;
>
>      /* Which vlapic to load? */
> -    if ( vcpuid >= d->max_vcpus || (v = d->vcpu[vcpuid]) == NULL )
> +    if ( vcpuid >= d->max_vcpus || d->vcpu[vcpuid] == NULL )

As you're editing this anyway, swap for

    if ( !domain_vcpu(d, vcpuid) )

please.

>      {
>          dprintk(XENLOG_G_ERR, "HVM restore: dom%d has no apic%u\n",
>                  d->domain_id, vcpuid);
>          return -EINVAL;
>      }
> -    s = vcpu_vlapic(v);
>
> -    if ( hvm_load_entry_zeroextend(LAPIC, h, >hw) != 0 )
> +    if ( hvm_load_entry_zeroextend(LAPIC, h, ) )
> +        return -ENODATA;
> +
> +    /* EN=0 with EXTD=1 is illegal */
> +    if ( (s.apic_base_msr & (APIC_BASE_ENABLE | APIC_BASE_EXTD)) ==
> +         APIC_BASE_EXTD )
> +        return -EINVAL;

This is very insufficient auditing for the incoming value, but it turns out that there's no nice logic for this at all.

As it's just a less obfuscated form of the logic from lapic_load_hidden(), it's probably fine to stay as it is for now.

The major changes since this logic was written originally are that the CPU policy is correct (so we can reject EXTD on VMs which can't see x2apic), and that we now prohibit VMs moving the xAPIC MMIO window away from its default location (as this would require per-vCPU P2Ms in order to virtualise properly.)

~Andrew
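
The "EN=0 with EXTD=1 is illegal" check quoted above can be tried in isolation. The sketch below assumes the architectural IA32_APIC_BASE layout (bit 11 is the global enable, bit 10 is the x2APIC enable); the DEMO_* names and demo_apic_base_mode_valid() are illustrative and are not Xen's definitions. It covers only this one check, which the review itself describes as insufficient auditing.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed IA32_APIC_BASE bits: x2APIC enable (EXTD) and global enable (EN). */
#define DEMO_APIC_BASE_EXTD   (UINT64_C(1) << 10)
#define DEMO_APIC_BASE_ENABLE (UINT64_C(1) << 11)

/*
 * Of the four EN/EXTD combinations only EN=0, EXTD=1 is invalid.
 * Masking first ignores the base address and any other bits.
 */
bool demo_apic_base_mode_valid(uint64_t apic_base_msr)
{
    uint64_t mode = apic_base_msr &
                    (DEMO_APIC_BASE_ENABLE | DEMO_APIC_BASE_EXTD);

    return mode != DEMO_APIC_BASE_EXTD;
}
```

A value with both bits clear (APIC disabled), EN only (xAPIC) or both set (x2APIC) passes; EXTD on its own does not, whatever the base address bits contain.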
Re: [PATCH v6 7/8] xen: mapcache: Add support for grant mappings
On Thu, 23 May 2024, Edgar E. Iglesias wrote: > On Thu, May 23, 2024 at 9:47 AM Manos Pitsidianakis > wrote: > On Thu, 16 May 2024 18:48, "Edgar E. Iglesias" > wrote: > >From: "Edgar E. Iglesias" > > > >Add a second mapcache for grant mappings. The mapcache for > >grants needs to work with XC_PAGE_SIZE granularity since > >we can't map larger ranges than what has been granted to us. > > > >Like with foreign mappings (xen_memory), machines using grants > >are expected to initialize the xen_grants MR and map it > >into their address-map accordingly. > > > >Signed-off-by: Edgar E. Iglesias > >Reviewed-by: Stefano Stabellini > >--- > > hw/xen/xen-hvm-common.c | 12 ++- > > hw/xen/xen-mapcache.c | 163 ++-- > > include/hw/xen/xen-hvm-common.h | 3 + > > include/sysemu/xen.h | 7 ++ > > 4 files changed, 152 insertions(+), 33 deletions(-) > > > >diff --git a/hw/xen/xen-hvm-common.c b/hw/xen/xen-hvm-common.c > >index a0a0252da0..b8ace1c368 100644 > >--- a/hw/xen/xen-hvm-common.c > >+++ b/hw/xen/xen-hvm-common.c > >@@ -10,12 +10,18 @@ > > #include "hw/boards.h" > > #include "hw/xen/arch_hvm.h" > > > >-MemoryRegion xen_memory; > >+MemoryRegion xen_memory, xen_grants; > > > >-/* Check for xen memory. */ > >+/* Check for any kind of xen memory, foreign mappings or grants. */ > > bool xen_mr_is_memory(MemoryRegion *mr) > > { > >- return mr == _memory; > >+ return mr == _memory || mr == _grants; > >+} > >+ > >+/* Check specifically for grants. 
*/ > >+bool xen_mr_is_grants(MemoryRegion *mr) > >+{ > >+ return mr == _grants; > > } > > > > void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion > *mr, > >diff --git a/hw/xen/xen-mapcache.c b/hw/xen/xen-mapcache.c > >index a07c47b0b1..1cbc2aeaa9 100644 > >--- a/hw/xen/xen-mapcache.c > >+++ b/hw/xen/xen-mapcache.c > >@@ -14,6 +14,7 @@ > > > > #include > > > >+#include "hw/xen/xen-hvm-common.h" > > #include "hw/xen/xen_native.h" > > #include "qemu/bitmap.h" > > > >@@ -21,6 +22,8 @@ > > #include "sysemu/xen-mapcache.h" > > #include "trace.h" > > > >+#include > >+#include > > > > #if HOST_LONG_BITS == 32 > > # define MCACHE_MAX_SIZE (1UL<<31) /* 2GB Cap */ > >@@ -41,6 +44,7 @@ typedef struct MapCacheEntry { > > unsigned long *valid_mapping; > > uint32_t lock; > > #define XEN_MAPCACHE_ENTRY_DUMMY (1 << 0) > >+#define XEN_MAPCACHE_ENTRY_GRANT (1 << 1) > > Might we get more entry kinds in the future? (for example foreign maps). > Maybe this could be an enum. > > > Perhaps. Foreign mappings are already supported, this flag separates ordinary > foreign mappings from grant foreign mappings. > IMO, since this is not an external interface it's probably better to change > it once we have a concrete use-case at hand. 
> > > > uint8_t flags; > > hwaddr size; > > struct MapCacheEntry *next; > >@@ -71,6 +75,8 @@ typedef struct MapCache { > > } MapCache; > > > > static MapCache *mapcache; > >+static MapCache *mapcache_grants; > >+static xengnttab_handle *xen_region_gnttabdev; > > > > static inline void mapcache_lock(MapCache *mc) > > { > >@@ -131,6 +137,12 @@ void xen_map_cache_init(phys_offset_to_gaddr_t f, > void *opaque) > > unsigned long max_mcache_size; > > unsigned int bucket_shift; > > > >+ xen_region_gnttabdev = xengnttab_open(NULL, 0); > >+ if (xen_region_gnttabdev == NULL) { > >+ error_report("mapcache: Failed to open gnttab device"); > >+ exit(EXIT_FAILURE); > >+ } > >+ > > if (HOST_LONG_BITS == 32) { > > bucket_shift = 16; > > } else { > >@@ -159,6 +171,15 @@ void xen_map_cache_init(phys_offset_to_gaddr_t f, > void *opaque) > > mapcache = xen_map_cache_init_single(f, opaque, > > bucket_shift, > > max_mcache_size); > >+ > >+ /* > >+ * Grant mappings must use XC_PAGE_SIZE granularity since we can't > >+ * map anything beyond the number of pages granted to us. > >+ */ > >+ mapcache_grants = xen_map_cache_init_single(f, opaque, > >+ XC_PAGE_SHIFT, > >+ max_mcache_size); > >+ > > setrlimit(RLIMIT_AS, _as); > > } > > > >@@ -168,17
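
The granularity argument in the quoted commit message ("we can't map larger ranges than what has been granted to us") comes down to simple alignment arithmetic. The sketch below is an illustration only: demo_mapped_bytes() is a hypothetical helper, not a QEMU function, and it assumes that a bucket-based cache maps whole buckets and that XC_PAGE_SHIFT is 12 (4 KiB pages). The bucket shift of 16 is the 32-bit host value visible in the quoted hunk.

```c
#include <stdint.h>

/*
 * Number of bytes that would be mapped to cover [addr, addr + size)
 * if mappings are always made in whole buckets of 1 << bucket_shift
 * bytes. Hypothetical helper for illustration.
 */
uint64_t demo_mapped_bytes(uint64_t addr, uint64_t size,
                           unsigned int bucket_shift)
{
    uint64_t bucket = UINT64_C(1) << bucket_shift;
    uint64_t start = addr & ~(bucket - 1);                     /* round down */
    uint64_t end = (addr + size + bucket - 1) & ~(bucket - 1); /* round up */

    return end - start;
}
```

Under these assumptions, covering a single 4 KiB page at address 0x1000 takes a 64 KiB mapping with a bucket shift of 16, which would reach into pages that were never granted, but exactly 4 KiB with a shift of 12.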
Re: [PATCH for-4.19 v3 2/3] xen: enable altp2m at create domain domctl
On Thu, 23 May 2024, Roger Pau Monné wrote:
> On Fri, May 17, 2024 at 03:33:51PM +0200, Roger Pau Monne wrote:
> > Enabling it using an HVM param is fragile, and complicates the logic when
> > deciding whether options that interact with altp2m can also be enabled.
> >
> > Leave the HVM param value for consumption by the guest, but prevent it from
> > being set. Enabling is now done using an additional altp2m specific field in
> > xen_domctl_createdomain.
> >
> > Note that albeit only currently implemented in x86, altp2m could be implemented
> > in other architectures, hence why the field is added to xen_domctl_createdomain
> > instead of xen_arch_domainconfig.
> >
> > Signed-off-by: Roger Pau Monné
> > ---
> > Changes since v2:
> >  - Introduce a new altp2m field in xen_domctl_createdomain.
> >
> > Changes since v1:
> >  - New in this version.
> > ---
> >  tools/libs/light/libxl_create.c     | 23 ++-
> >  tools/libs/light/libxl_x86.c        | 26 --
> >  tools/ocaml/libs/xc/xenctrl_stubs.c |  2 +-
> >  xen/arch/arm/domain.c               |  6 ++
>
> Could I get an Ack from one of the Arm maintainers for the trivial Arm
> change?

Acked-by: Stefano Stabellini
[PATCH v1 0/1] xen/arm: smmuv3: Mark more init-only functions with __init
From: "Edgar E. Iglesias"

I was scanning for code that we could potentially move from the .text section into .init.text and found a few candidates.

I'm not sure if this makes sense, perhaps we don't want to mark these functions for other reasons, but my scripts found this chain of SMMUv3 init functions as only reachable by .init.text code.

Perhaps it's a little late in the release cycle to consider this...

Best regards,
Edgar

Edgar E. Iglesias (1):
  xen/arm: smmuv3: Mark more init-only functions with __init

 xen/drivers/passthrough/arm/smmu-v3.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

--
2.40.1
[PATCH v1 1/1] xen/arm: smmuv3: Mark more init-only functions with __init
From: "Edgar E. Iglesias"

Move more functions that are only called at init to
the .init.text section.

Signed-off-by: Edgar E. Iglesias
---
 xen/drivers/passthrough/arm/smmu-v3.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/xen/drivers/passthrough/arm/smmu-v3.c b/xen/drivers/passthrough/arm/smmu-v3.c
index 6904962467..cee5724022 100644
--- a/xen/drivers/passthrough/arm/smmu-v3.c
+++ b/xen/drivers/passthrough/arm/smmu-v3.c
@@ -1545,7 +1545,7 @@ static int arm_smmu_dt_xlate(struct device *dev,
 }
 
 /* Probing and initialisation functions */
-static int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
+static int __init arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
 				   struct arm_smmu_queue *q,
 				   void __iomem *page,
 				   unsigned long prod_off,
@@ -1588,7 +1588,7 @@ static int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
 	return 0;
 }
 
-static int arm_smmu_init_queues(struct arm_smmu_device *smmu)
+static int __init arm_smmu_init_queues(struct arm_smmu_device *smmu)
 {
 	int ret;
 
@@ -1724,7 +1724,7 @@ static int arm_smmu_init_strtab(struct arm_smmu_device *smmu)
 	return 0;
 }
 
-static int arm_smmu_init_structures(struct arm_smmu_device *smmu)
+static int __init arm_smmu_init_structures(struct arm_smmu_device *smmu)
 {
 	int ret;
 
@@ -1746,7 +1746,8 @@ static int arm_smmu_write_reg_sync(struct arm_smmu_device *smmu, u32 val,
 }
 
 /* GBPA is "special" */
-static int arm_smmu_update_gbpa(struct arm_smmu_device *smmu, u32 set, u32 clr)
+static int __init arm_smmu_update_gbpa(struct arm_smmu_device *smmu,
+				       u32 set, u32 clr)
 {
 	int ret;
 	u32 reg, __iomem *gbpa = smmu->base + ARM_SMMU_GBPA;
@@ -1842,7 +1843,7 @@ static void arm_smmu_setup_msis(struct arm_smmu_device *smmu)
 static inline void arm_smmu_setup_msis(struct arm_smmu_device *smmu) { }
 #endif /* CONFIG_MSI */
 
-static void arm_smmu_free_irqs(struct arm_smmu_device *smmu)
+static void __init arm_smmu_free_irqs(struct arm_smmu_device *smmu)
 {
 	int irq;
 
@@ -1926,7 +1927,7 @@ err_free_evtq_irq:
 	return ret;
 }
 
-static int arm_smmu_setup_irqs(struct arm_smmu_device *smmu)
+static int __init arm_smmu_setup_irqs(struct arm_smmu_device *smmu)
 {
 	int ret, irq;
 	u32 irqen_flags = IRQ_CTRL_EVTQ_IRQEN | IRQ_CTRL_GERROR_IRQEN;
@@ -1988,7 +1989,7 @@ static int arm_smmu_device_disable(struct arm_smmu_device *smmu)
 	return ret;
 }
 
-static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
+static int __init arm_smmu_device_reset(struct arm_smmu_device *smmu)
 {
 	int ret;
 	u32 reg, enables;
@@ -2405,7 +2406,7 @@ static void arm_smmu_free_structures(struct arm_smmu_device *smmu)
 	xfree(smmu->strtab_cfg.l1_desc);
 }
 
-static int arm_smmu_device_probe(struct platform_device *pdev)
+static int __init arm_smmu_device_probe(struct platform_device *pdev)
 {
 	int irq, ret;
 	paddr_t ioaddr, iosize;
-- 
2.40.1
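
For readers unfamiliar with the annotation, here is a minimal sketch of what a section-placing marker of this kind does. It is not Xen's actual __init definition: the macro name, the section name and both functions are made up for illustration, and it assumes GCC or Clang on an ELF target (the marker expands to nothing elsewhere). In Xen the real benefit comes from the linker script and the boot code reclaiming the init sections once boot has finished, which this sketch does not model.

```c
/*
 * Hypothetical stand-in for an init-only marker. Assumes GCC/Clang on an
 * ELF target; on anything else the marker is a no-op.
 */
#if defined(__GNUC__) && defined(__ELF__)
# define __init_sketch \
    __attribute__((__section__(".init.text.sketch"), __noinline__))
#else
# define __init_sketch
#endif

/* Only ever called during start-up, so it can live in the init section. */
static int __init_sketch setup_once_sketch(int base)
{
    return base + 1;
}

/* Caller that stays in the regular .text section. */
int run_setup_sketch(void)
{
    return setup_once_sketch(41);
}
```

On an ELF build, running objdump -h on the object file should list the extra section next to .text.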
Re: [PATCH v2 6/8] xen/lib: Add topology generator for x86
On Wed, May 08, 2024 at 01:39:25PM +0100, Alejandro Vallejo wrote:
> Add a helper to populate topology leaves in the cpu policy from
> threads/core and cores/package counts.
>
> No functional change, as it's not connected to anything yet.

There is a functional change in test-cpu-policy.c.

Maybe the commit message needs to be updated to reflect the added
testing to test-cpu-policy.c using the newly introduced helper to
generate topologies?

>
> Signed-off-by: Alejandro Vallejo
> ---
> v2:
>   * New patch. Extracted from v1/patch6
> ---
>  tools/tests/cpu-policy/test-cpu-policy.c | 128 +++
>  xen/include/xen/lib/x86/cpu-policy.h     |  16 +++
>  xen/lib/x86/policy.c                     |  86 +++
>  3 files changed, 230 insertions(+)
>
> diff --git a/tools/tests/cpu-policy/test-cpu-policy.c
> b/tools/tests/cpu-policy/test-cpu-policy.c
> index 301df2c00285..0ba8c418b1b3 100644
> --- a/tools/tests/cpu-policy/test-cpu-policy.c
> +++ b/tools/tests/cpu-policy/test-cpu-policy.c
> @@ -650,6 +650,132 @@ static void test_is_compatible_failure(void)
>      }
>  }
>
> +static void test_topo_from_parts(void)
> +{
> +    static const struct test {
> +        unsigned int threads_per_core;
> +        unsigned int cores_per_pkg;
> +        struct cpu_policy policy;
> +    } tests[] = {
> +        {
> +            .threads_per_core = 3, .cores_per_pkg = 1,
> +            .policy = {
> +                .x86_vendor = X86_VENDOR_AMD,
> +                .topo.subleaf = {
> +                    [0] = { .nr_logical = 3, .level = 0, .type = 1,
> .id_shift = 2, },
> +                    [1] = { .nr_logical = 1, .level = 1, .type = 2,
> .id_shift = 2, },
> +                },
> +            },
> +        },
> +        {
> +            .threads_per_core = 1, .cores_per_pkg = 3,
> +            .policy = {
> +                .x86_vendor = X86_VENDOR_AMD,
> +                .topo.subleaf = {
> +                    [0] = { .nr_logical = 1, .level = 0, .type = 1,
> .id_shift = 0, },
> +                    [1] = { .nr_logical = 3, .level = 1, .type = 2,
> .id_shift = 2, },
> +                },
> +            },
> +        },
> +        {
> +            .threads_per_core = 7, .cores_per_pkg = 5,
> +            .policy = {
> +                .x86_vendor = X86_VENDOR_AMD,
> +                .topo.subleaf = {
> +                    [0] = { .nr_logical = 7, .level = 0, .type = 1,
> .id_shift = 3, },
> +                    [1] = { .nr_logical = 5, .level = 1, .type = 2,
> .id_shift = 6, },
> +                },
> +            },
> +        },
> +        {
> +            .threads_per_core = 2, .cores_per_pkg = 128,
> +            .policy = {
> +                .x86_vendor = X86_VENDOR_AMD,
> +                .topo.subleaf = {
> +                    [0] = { .nr_logical = 2, .level = 0, .type = 1,
> .id_shift = 1, },
> +                    [1] = { .nr_logical = 128, .level = 1, .type = 2,
> .id_shift = 8, },
> +                },
> +            },
> +        },
> +        {
> +            .threads_per_core = 3, .cores_per_pkg = 1,
> +            .policy = {
> +                .x86_vendor = X86_VENDOR_INTEL,
> +                .topo.subleaf = {
> +                    [0] = { .nr_logical = 3, .level = 0, .type = 1,
> .id_shift = 2, },
> +                    [1] = { .nr_logical = 3, .level = 1, .type = 2,
> .id_shift = 2, },
> +                },
> +            },
> +        },
> +        {
> +            .threads_per_core = 1, .cores_per_pkg = 3,
> +            .policy = {
> +                .x86_vendor = X86_VENDOR_INTEL,
> +                .topo.subleaf = {
> +                    [0] = { .nr_logical = 1, .level = 0, .type = 1,
> .id_shift = 0, },
> +                    [1] = { .nr_logical = 3, .level = 1, .type = 2,
> .id_shift = 2, },
> +                },
> +            },
> +        },
> +        {
> +            .threads_per_core = 7, .cores_per_pkg = 5,
> +            .policy = {
> +                .x86_vendor = X86_VENDOR_INTEL,
> +                .topo.subleaf = {
> +                    [0] = { .nr_logical = 7, .level = 0, .type = 1,
> .id_shift = 3, },
> +                    [1] = { .nr_logical = 35, .level = 1, .type = 2,
> .id_shift = 6, },
> +                },
> +            },
> +        },
> +        {
> +            .threads_per_core = 2, .cores_per_pkg = 128,
> +            .policy = {
> +                .x86_vendor = X86_VENDOR_INTEL,
> +                .topo.subleaf = {
> +                    [0] = { .nr_logical = 2, .level = 0, .type = 1,
> .id_shift = 1, },
> +                    [1] = { .nr_logical = 256, .level = 1, .type = 2,
> .id_shift = 8, },

You don't need the array index in the initialization:

                .topo.subleaf = {
                    { .nr_logical = 2, .level = 0, .type = 1,
                      .id_shift = 1, },
                    { .nr_logical = 256, .level = 1, .type = 2,
                      .id_shift = 8, },
                }

And lines should be limited to 80
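
The expected values in the quoted test table look consistent with a simple rule: each level's id_shift grows by ceil(log2(count)), and the level-1 nr_logical is cores/package on AMD but threads/package on Intel. The sketch below only reproduces that pattern from the table. It is not the helper added by the patch (whose body is not quoted here), and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one topology subleaf; not Xen's structure. */
struct topo_level_sketch {
    unsigned int nr_logical;
    unsigned int id_shift;
};

/* Smallest shift such that (1u << shift) >= n; 0 for n <= 1. */
static unsigned int ceil_log2_sketch(unsigned int n)
{
    unsigned int shift = 0;

    while ( shift < 31 && (1u << shift) < n )
        shift++;

    return shift;
}

/* Fill level 0 (threads) and level 1 (cores) the way the table suggests. */
static void topo_from_parts_sketch(bool intel,
                                   unsigned int threads_per_core,
                                   unsigned int cores_per_pkg,
                                   struct topo_level_sketch out[2])
{
    assert(threads_per_core && cores_per_pkg);

    out[0].nr_logical = threads_per_core;
    out[0].id_shift = ceil_log2_sketch(threads_per_core);

    /* The Intel rows count threads per package, the AMD rows count cores. */
    out[1].nr_logical = intel ? threads_per_core * cores_per_pkg
                              : cores_per_pkg;
    out[1].id_shift = out[0].id_shift + ceil_log2_sketch(cores_per_pkg);
}
```

Feeding it the 7 threads/core, 5 cores/package row gives shifts of 3 and 6 for both vendors, with a level-1 nr_logical of 5 for AMD and 35 for Intel, matching the table.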
Re: [PATCH v10 02/14] xen: introduce generic non-atomic test_*bit()
On Thu, 2024-05-23 at 15:33 +0100, Julien Grall wrote:
> 
> 
> On 23/05/2024 15:11, Oleksii K. wrote:
> > On Thu, 2024-05-23 at 14:00 +0100, Julien Grall wrote:
> > > Hi Oleksii,
> > Hi Julien,
> > 
> > > 
> > > On 17/05/2024 14:54, Oleksii Kurochko wrote:
> > > > diff --git a/xen/arch/arm/arm64/livepatch.c
> > > > b/xen/arch/arm/arm64/livepatch.c
> > > > index df2cebedde..4bc8ed9be5 100644
> > > > --- a/xen/arch/arm/arm64/livepatch.c
> > > > +++ b/xen/arch/arm/arm64/livepatch.c
> > > > @@ -10,7 +10,6 @@
> > > >   #include
> > > >   #include
> > > >   
> > > > -#include
> > > 
> > > It is a bit unclear how this change is related to the patch. Can
> > > you
> > > explain in the commit message?
> > Probably it doesn't need anymore. I will double check and if this
> > change is not needed, I will just drop it in the next patch
> > version.
> > 
> > > 
> > > >   #include
> > > >   #include
> > > >   #include
> > > > diff --git a/xen/arch/arm/include/asm/bitops.h
> > > > b/xen/arch/arm/include/asm/bitops.h
> > > > index 5104334e48..8e16335e76 100644
> > > > --- a/xen/arch/arm/include/asm/bitops.h
> > > > +++ b/xen/arch/arm/include/asm/bitops.h
> > > > @@ -22,9 +22,6 @@
> > > >   #define __set_bit(n,p) set_bit(n,p)
> > > >   #define __clear_bit(n,p) clear_bit(n,p)
> > > >   
> > > > -#define BITOP_BITS_PER_WORD 32
> > > > -#define BITOP_MASK(nr) (1UL << ((nr) %
> > > > BITOP_BITS_PER_WORD))
> > > > -#define BITOP_WORD(nr) ((nr) / BITOP_BITS_PER_WORD)
> > > >   #define BITS_PER_BYTE 8
> > > 
> > > OOI, any reason BITS_PER_BYTE has not been moved as well? I don't
> > > expect
> > > the value to change across arch.
> > I can move it to generic one header too in the next patch version.
> > 
> > > 
> > > [...]
> > > 
> > > > diff --git a/xen/include/xen/bitops.h
> > > > b/xen/include/xen/bitops.h
> > > > index f14ad0d33a..6eeeff0117 100644
> > > > --- a/xen/include/xen/bitops.h
> > > > +++ b/xen/include/xen/bitops.h
> > > > @@ -65,10 +65,141 @@ static inline int generic_flsl(unsigned
> > > > long
> > > > x)
> > > >    * scope
> > > >    */
> > > >   
> > > > +#define BITOP_BITS_PER_WORD 32
> > > > +typedef uint32_t bitop_uint_t;
> > > > +
> > > > +#define BITOP_MASK(nr) ((bitop_uint_t)1 << ((nr) %
> > > > BITOP_BITS_PER_WORD))
> > > > +
> > > > +#define BITOP_WORD(nr) ((nr) / BITOP_BITS_PER_WORD)
> > > > +
> > > > +extern void __bitop_bad_size(void);
> > > > +
> > > > +#define bitop_bad_size(addr) (sizeof(*(addr)) <
> > > > sizeof(bitop_uint_t))
> > > > +
> > > >   /* - Please tidy above here
> > > > -
> > > > */
> > > >   
> > > >   #include
> > > >   
> > > > +/**
> > > > + * generic__test_and_set_bit - Set a bit and return its old
> > > > value
> > > > + * @nr: Bit to set
> > > > + * @addr: Address to count from
> > > > + *
> > > > + * This operation is non-atomic and can be reordered.
> > > > + * If two examples of this operation race, one can appear to
> > > > succeed
> > > > + * but actually fail. You must protect multiple accesses with
> > > > a
> > > > lock.
> > > > + */
> > > 
> > > Sorry for only mentioning this on v10. I think this comment
> > > should be
> > > duplicated (or moved to) on top of test_bit() because this is
> > > what
> > > everyone will use. This will avoid the developper to follow the
> > > function
> > > calls and only notice the x86 version which says "This function
> > > is
> > > atomic and may not be reordered." and would be wrong for all the
> > > other arch.
> > It makes sense to add this comment on top of test_bit(), but I am
> > curious if it is needed to mention that for x86 arch_test_bit() "is
> > atomic and may not be reordered":
> 
> I would say no because any developper modifying common code can't
> relying it.
> 
> > 
> >  * This operation is non-atomic and can be reordered. ( Exception:
> > for
> >  * x86 arch_test_bit() is atomic and may not be reordered )
> >  * If two examples of this operation race, one can appear to
> > succeed
> >  * but actually fail. You must protect multiple accesses with a
> > lock.
> >  */
> > 
> > > 
> > > > +static always_inline bool
> > > > +generic__test_and_set_bit(int nr, volatile void *addr)
> > > > +{
> > > > +    bitop_uint_t mask = BITOP_MASK(nr);
> > > > +    volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr +
> > > > BITOP_WORD(nr);
> > > > +    bitop_uint_t old = *p;
> > > > +
> > > > +    *p = old | mask;
> > > > +    return (old & mask);
> > > > +}
> > > > +
> > > > +/**
> > > > + * generic__test_and_clear_bit - Clear a bit and return its
> > > > old
> > > > value
> > > > + * @nr: Bit to clear
> > > > + * @addr: Address to count from
> > > > + *
> > > > + * This operation is non-atomic and can be reordered.
> > > > + * If two examples of this operation race, one can appear to
> > > > succeed
> > > > + * but actually fail. You must protect multiple accesses with
> > > > a
> > > > lock.
> > > > + */
>
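
To make the word/mask arithmetic in the quoted hunk concrete, here is a standalone sketch that can be compiled outside Xen. The BITOP_* definitions are copied from the quoted patch; the function names carry a sketch_ prefix to make clear they are illustrative rather than the functions added by the series, and the test-and-clear body is an assumption since the quote is cut off before it.

```c
#include <stdbool.h>
#include <stdint.h>

#define BITOP_BITS_PER_WORD 32
typedef uint32_t bitop_uint_t;

#define BITOP_MASK(nr) ((bitop_uint_t)1 << ((nr) % BITOP_BITS_PER_WORD))
#define BITOP_WORD(nr) ((nr) / BITOP_BITS_PER_WORD)

/* Non-atomic: a plain read followed by a plain write. Needs a lock if shared. */
static inline bool sketch_test_and_set_bit(int nr, volatile void *addr)
{
    bitop_uint_t mask = BITOP_MASK(nr);
    volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr + BITOP_WORD(nr);
    bitop_uint_t old = *p;

    *p = old | mask;

    return old & mask;
}

/* Assumed mirror image of the above: clear the bit, report its old value. */
static inline bool sketch_test_and_clear_bit(int nr, volatile void *addr)
{
    bitop_uint_t mask = BITOP_MASK(nr);
    volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr + BITOP_WORD(nr);
    bitop_uint_t old = *p;

    *p = old & ~mask;

    return old & mask;
}
```

With a two-word bitmap, bit 35 lands in word 1 (35 / 32) at position 3 (35 % 32). Because the read and the write are separate steps, two racing callers can both observe the old value as clear, which is the hazard the quoted comment warns about.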
[xen-unstable-smoke test] 186108: regressions - FAIL
flight 186108 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/186108/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf                   6 xen-build                  fail REGR. vs. 186064

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl           1 build-check(1)             blocked  n/a
 test-amd64-amd64-libvirt     15 migrate-support-check      fail   never pass
 test-arm64-arm64-xl-xsm      15 migrate-support-check      fail   never pass
 test-arm64-arm64-xl-xsm      16 saverestore-support-check  fail   never pass

version targeted for testing:
 xen                  9e58da32cc844b3fb7612fc35ece3a96f8cbf744
baseline version:
 xen                  ced21fbb2842ac4655048bdee56232974ff9ff9c

Last test of basis   186064  2024-05-21 15:04:02 Z    2 days
Failing since        186104  2024-05-23 09:00:22 Z    0 days    3 attempts
Testing same since   186108  2024-05-23 14:00:21 Z    0 days    1 attempts

People who touched revisions under test:
  Alejandro Vallejo
  Alessandro Zucchelli
  Bobby Eshleman
  Jan Beulich
  Julien Grall
  Oleksandr Andrushchenko
  Oleksii Kurochko
  Roger Pau Monné
  Stewart Hildebrand
  Tamas K Lengyel
  Volodymyr Babchuk

jobs:
 build-arm64-xsm                                              pass
 build-amd64                                                  pass
 build-armhf                                                  fail
 build-amd64-libvirt                                          pass
 test-armhf-armhf-xl                                          blocked
 test-arm64-arm64-xl-xsm                                      pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64                    pass
 test-amd64-amd64-libvirt                                     pass

sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
    http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
    http://xenbits.xen.org/gitweb?p=osstest.git;a=summary

Not pushing.

(No revision log; it would be 406 lines long.)
Re: [PATCH 6/7] x86/cpuid: Fix handling of XSAVE dynamic leaves
On 23.05.2024 13:16, Andrew Cooper wrote:
> First, if XSAVE is available in hardware but not visible to the guest, the
> dynamic leaves shouldn't be filled in.
>
> Second, the comment concerning XSS state is wrong. VT-x doesn't manage
> host/guest state automatically, but there is provision for "host only" bits to
> be set, so the implications are still accurate.
>
> Introduce xstate_compressed_size() to mirror the uncompressed one. Cross
> check it at boot.
>
> Signed-off-by: Andrew Cooper

Reviewed-by: Jan Beulich

Irrespective ...

> v3:
>  * Adjust commit message about !XSAVE guests
>  * Rebase over boot time cross check
>  * Use raw policy

... it should probably have occurred to me earlier on to ask: Why raw policy?
Isn't the host one the more appropriate one to use for any kind of internal
decisions?

Jan
Re: [PATCH v2 5/8] tools/hvmloader: Retrieve (x2)APIC IDs from the APs themselves
On Wed, May 08, 2024 at 01:39:24PM +0100, Alejandro Vallejo wrote:
> Make it so the APs expose their own APIC IDs in a LUT. We can use that LUT to
> populate the MADT, decoupling the algorithm that relates CPU IDs and APIC IDs
> from hvmloader.
>
> While at this also remove ap_callin, as writing the APIC ID may serve the same
> purpose.
>
> Signed-off-by: Alejandro Vallejo
> ---
> v2:
>   * New patch. Replaces adding cpu policy to hvmloader in v1.
> ---
>  tools/firmware/hvmloader/config.h    |  6 -
>  tools/firmware/hvmloader/hvmloader.c |  4 +--
>  tools/firmware/hvmloader/smp.c       | 40 +++-
>  tools/firmware/hvmloader/util.h      |  5 
>  xen/arch/x86/include/asm/hvm/hvm.h   |  1 +
>  5 files changed, 47 insertions(+), 9 deletions(-)
>
> diff --git a/tools/firmware/hvmloader/config.h
> b/tools/firmware/hvmloader/config.h
> index c82adf6dc508..edf6fa9c908c 100644
> --- a/tools/firmware/hvmloader/config.h
> +++ b/tools/firmware/hvmloader/config.h
> @@ -4,6 +4,8 @@
>  #include
>  #include
>
> +#include
> +
>  enum virtual_vga { VGA_none, VGA_std, VGA_cirrus, VGA_pt };
>  extern enum virtual_vga virtual_vga;
>
> @@ -49,8 +51,10 @@ extern uint8_t ioapic_version;
>
>  #define IOAPIC_ID 0x01
>
> +extern uint32_t CPU_TO_X2APICID[HVM_MAX_VCPUS];
> +
>  #define LAPIC_BASE_ADDRESS 0xfee0
> -#define LAPIC_ID(vcpu_id) ((vcpu_id) * 2)
> +#define LAPIC_ID(vcpu_id) (CPU_TO_X2APICID[(vcpu_id)])
>
>  #define PCI_ISA_DEVFN 0x08 /* dev 1, fn 0 */
>  #define PCI_ISA_IRQ_MASK 0x0c20U /* ISA IRQs 5,10,11 are PCI connected */
> diff --git a/tools/firmware/hvmloader/hvmloader.c
> b/tools/firmware/hvmloader/hvmloader.c
> index c58841e5b556..1eba92229925 100644
> --- a/tools/firmware/hvmloader/hvmloader.c
> +++ b/tools/firmware/hvmloader/hvmloader.c
> @@ -342,11 +342,11 @@ int main(void)
>
>      printf("CPU speed is %u MHz\n", get_cpu_mhz());
>
> +    smp_initialise();
> +
>      apic_setup();
>      pci_setup();
>
> -    smp_initialise();
> -
>      perform_tests();
>
>      if ( bios->bios_info_setup )
> diff --git a/tools/firmware/hvmloader/smp.c
> b/tools/firmware/hvmloader/smp.c
> index a668f15d7e1f..4d75f239c2f5 100644
> --- a/tools/firmware/hvmloader/smp.c
> +++ b/tools/firmware/hvmloader/smp.c
> @@ -29,7 +29,34 @@
>
>  #include
>
> -static int ap_callin, ap_cpuid;
> +static int ap_cpuid;
> +
> +/**
> + * Lookup table of x2APIC IDs.
> + *
> + * Each entry is populated its respective CPU as they come online. This is
> required
> + * for generating the MADT with minimal assumptions about ID relationships.
> + */
> +uint32_t CPU_TO_X2APICID[HVM_MAX_VCPUS];
> +
> +static uint32_t read_apic_id(void)
> +{
> +    uint32_t apic_id;
> +
> +    cpuid(1, NULL, &apic_id, NULL, NULL);
> +    apic_id >>= 24;
> +
> +    /*
> +     * APIC IDs over 255 are represented by 255 in leaf 1 and are meant to be
> +     * read from topology leaves instead. Xen exposes x2APIC IDs in leaf 0xb,
> +     * but only if the x2APIC feature is present. If there are that many CPUs
> +     * it's guaranteed to be there so we can avoid checking for it
> specifically
> +     */

Maybe I'm missing something, but given the current code won't Xen just
return the low 8 bits from the x2APIC ID? I don't see any code in
guest_cpuid() that adjusts the IDs to be 255 when > 255.

> +    if ( apic_id == 255 )
> +        cpuid(0xb, NULL, NULL, NULL, &apic_id);

Won't the correct logic be to check if x2APIC is set in CPUID, and
then fetch the APIC ID from leaf 0xb, otherwise fall back to fetching
the APIC ID from leaf 1?

> +
> +    return apic_id;
> +}
>
>  static void ap_start(void)
>  {
> @@ -37,12 +64,12 @@ static void ap_start(void)
>      cacheattr_init();
>      printf("done.\n");
>
> +    wmb();
> +    ACCESS_ONCE(CPU_TO_X2APICID[ap_cpuid]) = read_apic_id();

A comment would be helpful here, that CPU_TO_X2APICID[ap_cpuid] is
used as synchronization that the AP has started.

You probably want to assert that read_apic_id() doesn't return 0,
otherwise we are skewed.

> +
>      if ( !ap_cpuid )
>          return;
>
> -    wmb();
> -    ap_callin = 1;
> -
>      while ( 1 )
>          asm volatile ( "hlt" );
>  }
> @@ -86,10 +113,11 @@ static void boot_cpu(unsigned int cpu)
>          BUG();
>
>      /*
> -     * Wait for the secondary processor to complete initialisation.
> +     * Wait for the secondary processor to complete initialisation,
> +     * which is signaled by its x2APIC ID being writted to the LUT.
>       * Do not touch shared resources meanwhile.
>       */
> -    while ( !ap_callin )
> +    while ( !ACCESS_ONCE(CPU_TO_X2APICID[cpu]) )
>          cpu_relax();

As a further improvement, we could launch all APs in parallel, and use
a for loop to wait until all positions of the CPU_TO_X2APICID array
are set.

>
>      /* Take the secondary processor offline. */
> diff --git a/tools/firmware/hvmloader/util.h b/tools/firmware/hvmloader/util.h
> index
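
The alternative suggested in the review (test the x2APIC feature bit first, then choose the leaf) could look roughly like the sketch below. It is only an illustration: the CPUID callback type, the fake CPUs and all sketch names are hypothetical, introduced so the logic can be exercised without real hardware, and hvmloader's own cpuid() helper has a different shape. What it relies on architecturally is that CPUID leaf 1 reports x2APIC in ECX bit 21 and the initial APIC ID in EBX[31:24], and that leaf 0xb returns the x2APIC ID in EDX.

```c
#include <stdint.h>

/* Hypothetical CPUID accessor so the logic can be tested off-target. */
typedef void (*cpuid_fn_sketch)(uint32_t leaf, uint32_t *eax, uint32_t *ebx,
                                uint32_t *ecx, uint32_t *edx);

#define CPUID1_ECX_X2APIC_SKETCH (1u << 21)

static uint32_t read_apic_id_sketch(cpuid_fn_sketch cpuid)
{
    uint32_t eax, ebx, ecx, edx;

    cpuid(1, &eax, &ebx, &ecx, &edx);

    if ( ecx & CPUID1_ECX_X2APIC_SKETCH )
    {
        /* x2APIC present: the full 32-bit ID is in leaf 0xb EDX. */
        cpuid(0xb, &eax, &ebx, &ecx, &edx);
        return edx;
    }

    /* Legacy xAPIC: 8-bit ID in leaf 1 EBX[31:24]. */
    return ebx >> 24;
}

/* Fake CPU without x2APIC, APIC ID 5. */
static void fake_cpuid_xapic(uint32_t leaf, uint32_t *eax, uint32_t *ebx,
                             uint32_t *ecx, uint32_t *edx)
{
    *eax = *ebx = *ecx = *edx = 0;
    if ( leaf == 1 )
        *ebx = 5u << 24;
}

/* Fake CPU with x2APIC ID 300; leaf 1 only carries the low 8 bits (44). */
static void fake_cpuid_x2apic(uint32_t leaf, uint32_t *eax, uint32_t *ebx,
                              uint32_t *ecx, uint32_t *edx)
{
    *eax = *ebx = *ecx = *edx = 0;
    if ( leaf == 1 )
    {
        *ebx = (300u & 0xff) << 24;
        *ecx = CPUID1_ECX_X2APIC_SKETCH;
    }
    else if ( leaf == 0xb )
        *edx = 300;
}
```

The second fake CPU mirrors the concern raised above: if leaf 1 only exposes the low 8 bits of a large ID, comparing against 255 would miss it, whereas keying off the feature bit does not.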
Re: [PATCH 4/7] x86/xstate: Rework xstate_ctxt_size() as xstate_uncompressed_size()
On 23.05.2024 13:16, Andrew Cooper wrote:
> @@ -611,6 +587,40 @@ static bool valid_xcr0(uint64_t xcr0)
>      return true;
>  }
>
> +unsigned int xstate_uncompressed_size(uint64_t xcr0)
> +{
> +    unsigned int size = XSTATE_AREA_MIN_SIZE, i;
> +
> +    ASSERT((xcr0 & ~X86_XCR0_STATES) == 0);

I'm puzzled by the combination of this assertion and ...

> +    if ( xcr0 == xfeature_mask )
> +        return xsave_cntxt_size;

... this conditional return. Yes, right now we don't support/use any XSS
components, but without any comment the assertion looks overly restrictive
to me.

> @@ -818,14 +834,14 @@ void xstate_init(struct cpuinfo_x86 *c)
>           * xsave_cntxt_size is the max size required by enabled features.
>           * We know FP/SSE and YMM about eax, and nothing about edx at
> present.
>           */
> -        xsave_cntxt_size = hw_uncompressed_size(feature_mask);
> +        xsave_cntxt_size = cpuid_count_ebx(0xd, 0);
>          printk("xstate: size: %#x and states: %#"PRIx64"\n",
>                 xsave_cntxt_size, xfeature_mask);
>      }
>      else
>      {
>          BUG_ON(xfeature_mask != feature_mask);
> -        BUG_ON(xsave_cntxt_size != hw_uncompressed_size(feature_mask));
> +        BUG_ON(xsave_cntxt_size != cpuid_count_ebx(0xd, 0));
>      }

Hmm, this may make re-basing of said earlier patch touching this code yet
more interesting. Or maybe it actually simplifies things, will need to see
... The overall comment remains though: Patches pending for so long should
really take priority over creating yet more new ones. But what do I do - I
can't enforce this, unless I was now going to block your work the same way.
Which I don't mean to do.

Jan
Re: [PATCH 7/7] x86/defns: Clean up X86_{XCR0,XSS}_* constants
On 23.05.2024 13:16, Andrew Cooper wrote:
> With the exception of one case in read_bndcfgu() which can use ilog2(),
> the *_POS defines are unused.
>
> X86_XCR0_X87 is the name used by both the SDM and APM, rather than
> X86_XCR0_FP.
>
> No functional change.
>
> Signed-off-by: Andrew Cooper

Acked-by: Jan Beulich
Re: [PATCH 3/7] x86/boot: Collect the Raw CPU Policy earlier on boot
On 23.05.2024 13:16, Andrew Cooper wrote:
> This is a tangle, but it's a small step in the right direction.
>
> xstate_init() is shortly going to want data from the Raw policy.
> calculate_raw_cpu_policy() is sufficiently separate from the other policies to
> be safe to do.
>
> No functional change.
>
> Signed-off-by: Andrew Cooper

Would you mind taking a look at
https://lists.xen.org/archives/html/xen-devel/2021-04/msg01335.html
to make clear (to me at least) in how far we can perhaps find common grounds
on what wants doing when? (Of course the local version I have has been
constantly re-based, so some of the function names would have changed from
what's visible there.)

> --- a/xen/arch/x86/cpu-policy.c
> +++ b/xen/arch/x86/cpu-policy.c
> @@ -845,7 +845,6 @@ static void __init calculate_hvm_def_policy(void)
>
>  void __init init_guest_cpu_policies(void)
>  {
> -    calculate_raw_cpu_policy();
>      calculate_host_policy();
>
>      if ( IS_ENABLED(CONFIG_PV) )
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -1888,7 +1888,9 @@ void asmlinkage __init noreturn __start_xen(unsigned
> long mbi_p)
>
>      tsx_init(); /* Needs microcode. May change HLE/RTM feature bits. */
>
> -    identify_cpu(_cpu_data);
> +    calculate_raw_cpu_policy(); /* Needs microcode. No other dependenices.
> */
> +
> +    identify_cpu(_cpu_data); /* Needs microcode and raw policy. */

You don't introduce any dependency on raw policy here, and there cannot possibly
have been such a dependency before (unless there was a bug somewhere). Therefore
I consider this latter comment misleading at this point.

Jan
[ANNOUNCE] Postpone June Community call
Hi all,

The next community call is on Thursday 6th June 2024, which clashes with
Xen Summit in Lisbon.

I propose we move the call a week later to *Thursday 13th June 2024,
4-5pm (UK time).*

Many thanks,
Kelly Choi

Community Manager
Xen Project
Re: [PATCH 2/7] x86/xstate: Cross-check dynamic XSTATE sizes at boot
On 23.05.2024 13:16, Andrew Cooper wrote:
> Right now, xstate_ctxt_size() performs a cross-check of size with CPUID in for
> every call. This is expensive, being used for domain create/migrate, as well
> as to service certain guest CPUID instructions.
>
> Instead, arrange to check the sizes once at boot. See the code comments for
> details. Right now, it just checks hardware against the algorithm
> expectations. Later patches will add further cross-checking.
>
> Introduce the missing X86_XCR0_* and X86_XSS_* constants, and a couple of
> missing CPUID bits. This is to maximise coverage in the sanity check, even if
> we don't expect to use/virtualise some of these features any time soon. Leave
> HDC and HWP alone for now. We don't have CPUID bits from them stored nicely.

Since you say "the missing", ...

> --- a/xen/arch/x86/include/asm/x86-defns.h
> +++ b/xen/arch/x86/include/asm/x86-defns.h
> @@ -77,7 +77,7 @@
>  #define X86_CR4_PKS 0x0100 /* Protection Key Supervisor */
>
>  /*
> - * XSTATE component flags in XCR0
> + * XSTATE component flags in XCR0 | MSR_XSS
>   */
>  #define X86_XCR0_FP_POS 0
>  #define X86_XCR0_FP (1ULL << X86_XCR0_FP_POS)
> @@ -95,11 +95,34 @@
>  #define X86_XCR0_ZMM (1ULL << X86_XCR0_ZMM_POS)
>  #define X86_XCR0_HI_ZMM_POS 7
>  #define X86_XCR0_HI_ZMM (1ULL << X86_XCR0_HI_ZMM_POS)
> +#define X86_XSS_PROC_TRACE (_AC(1, ULL) << 8)
>  #define X86_XCR0_PKRU_POS 9
>  #define X86_XCR0_PKRU (1ULL << X86_XCR0_PKRU_POS)
> +#define X86_XSS_PASID (_AC(1, ULL) << 10)
> +#define X86_XSS_CET_U (_AC(1, ULL) << 11)
> +#define X86_XSS_CET_S (_AC(1, ULL) << 12)
> +#define X86_XSS_HDC (_AC(1, ULL) << 13)
> +#define X86_XSS_UINTR (_AC(1, ULL) << 14)
> +#define X86_XSS_LBR (_AC(1, ULL) << 15)
> +#define X86_XSS_HWP (_AC(1, ULL) << 16)
> +#define X86_XCR0_TILE_CFG (_AC(1, ULL) << 17)
> +#define X86_XCR0_TILE_DATA (_AC(1, ULL) << 18)

... I'm wondering if you deliberately left out APX (bit 19).

Since you're re-doing some of what I have long had in patches already, I'd also
like to ask whether the last underscores each in the two AMX names really are
useful in your opinion. While rebasing isn't going to be difficult either way,
it would be yet simpler with X86_XCR0_TILECFG and X86_XCR0_TILEDATA, as I've
had it in my patches for over 3 years.

> --- a/xen/arch/x86/xstate.c
> +++ b/xen/arch/x86/xstate.c
> @@ -604,9 +604,156 @@ static bool valid_xcr0(uint64_t xcr0)
>      if ( !(xcr0 & X86_XCR0_BNDREGS) != !(xcr0 & X86_XCR0_BNDCSR) )
>          return false;
>
> +    /* TILE_CFG and TILE_DATA must be the same. */
> +    if ( !(xcr0 & X86_XCR0_TILE_CFG) != !(xcr0 & X86_XCR0_TILE_DATA) )
> +        return false;
> +
>      return true;
>  }
>
> +struct xcheck_state {
> +    uint64_t states;
> +    uint32_t uncomp_size;
> +    uint32_t comp_size;
> +};
> +
> +static void __init check_new_xstate(struct xcheck_state *s, uint64_t new)
> +{
> +    uint32_t hw_size;
> +
> +    BUILD_BUG_ON(X86_XCR0_STATES & X86_XSS_STATES);
> +
> +    BUG_ON(s->states & new); /* States only increase. */
> +    BUG_ON(!valid_xcr0(s->states | new)); /* Xen thinks it's a good value. */
> +    BUG_ON(new & ~(X86_XCR0_STATES | X86_XSS_STATES)); /* Known state. */
> +    BUG_ON((new & X86_XCR0_STATES) &&
> +           (new & X86_XSS_STATES)); /* User or supervisor, not both. */
> +
> +    s->states |= new;
> +    if ( new & X86_XCR0_STATES )
> +    {
> +        if ( !set_xcr0(s->states & X86_XCR0_STATES) )
> +            BUG();
> +    }
> +    else
> +        set_msr_xss(s->states & X86_XSS_STATES);
> +
> +    /*
> +     * Check the uncompressed size. Some XSTATEs are out-of-order and fill
> in
> +     * prior holes in the state area, so we check that the size doesn't
> +     * decrease.
> +     */
> +    hw_size = cpuid_count_ebx(0xd, 0);
> +
> +    if ( hw_size < s->uncomp_size )
> +        panic("XSTATE 0x%016"PRIx64", new bits {%63pbl}, uncompressed hw
> size %#x < prev size %#x\n",
> +              s->states, , hw_size, s->uncomp_size);
> +
> +    s->uncomp_size = hw_size;
> +
> +    /*
> +     * Check the compressed size, if available. All components strictly
> +     * appear in index order. In principle there are no holes, but some
> +     * components have their base address 64-byte aligned for efficiency
> +     * reasons (e.g. AMX-TILE) and there are other components small enough to
> +     * fit in the gap (e.g. PKRU) without increasing the overall length.
> +     */
> +    hw_size = cpuid_count_ebx(0xd, 1);
> +
> +    if ( cpu_has_xsavec )
> +    {
> +        if ( hw_size < s->comp_size )
> +            panic("XSTATE 0x%016"PRIx64", new bits {%63pbl}, compressed hw
> size %#x < prev size %#x\n",
> +                  s->states, , hw_size, s->comp_size);
> +
> +        s->comp_size = hw_size;
> +    }
> +    else
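
The paired-state checks in the quoted valid_xcr0() hunk use the !a != !b idiom to require that two bits are either both set or both clear. Below is a small standalone sketch of that idiom. The SK_* constants and the function names are hypothetical: the TILE_CFG/TILE_DATA positions are taken from the quoted defines (bits 17 and 18), while the BNDREGS/BNDCSR positions (bits 3 and 4) are the architectural XCR0 positions and do not appear in the quote.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants only; see the note above for where they come from. */
#define SK_XCR0_BNDREGS   (1ULL << 3)
#define SK_XCR0_BNDCSR    (1ULL << 4)
#define SK_XCR0_TILE_CFG  (1ULL << 17)
#define SK_XCR0_TILE_DATA (1ULL << 18)

/* True when bits a and b are either both set or both clear in xcr0. */
static bool sk_paired(uint64_t xcr0, uint64_t a, uint64_t b)
{
    /* '!' folds each masked value to 0 or 1, so the comparison is safe. */
    return !(xcr0 & a) == !(xcr0 & b);
}

/* Covers only the two pairing rules, not the rest of valid_xcr0(). */
static bool sk_valid_pairs(uint64_t xcr0)
{
    return sk_paired(xcr0, SK_XCR0_BNDREGS, SK_XCR0_BNDCSR) &&
           sk_paired(xcr0, SK_XCR0_TILE_CFG, SK_XCR0_TILE_DATA);
}
```

Comparing the masked values directly would be wrong, since the two masks select different bit positions; normalising both sides to 0 or 1 with ! is what makes the comparison meaningful.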
Re: [PATCH 1/7] x86/xstate: Fix initialisation of XSS cache
On 23.05.2024 13:16, Andrew Cooper wrote:
> The clobbering of this_cpu(xcr0) and this_cpu(xss) to architecturally invalid
> values is to force the subsequent set_xcr0() and set_msr_xss() to reload the
> hardware register.
>
> While XCR0 is reloaded in xstate_init(), MSR_XSS isn't. This causes
> get_msr_xss() to return the invalid value, and logic of the form:
>
>   old = get_msr_xss();
>   set_msr_xss(new);
>   ...
>   set_msr_xss(old);
>
> to try and restore the architecturally invalid value.
>
> The architecturally invalid value must be purged from the cache, meaning the
> hardware register must be written at least once. This in turn highlights that
> the invalid value must only be used in the case that the hardware register is
> available.
>
> Fixes: f7f4a523927f ("x86/xstate: reset cached register values on resume")
> Signed-off-by: Andrew Cooper

Reviewed-by: Jan Beulich

However, I view it as pretty unfair that now I will need to re-base
https://lists.xen.org/archives/html/xen-devel/2021-04/msg01336.html
over ...

> --- a/xen/arch/x86/xstate.c
> +++ b/xen/arch/x86/xstate.c
> @@ -641,13 +641,6 @@ void xstate_init(struct cpuinfo_x86 *c)
>          return;
>      }
>
> -    /*
> -     * Zap the cached values to make set_xcr0() and set_msr_xss() really
> -     * write it.
> -     */
> -    this_cpu(xcr0) = 0;
> -    this_cpu(xss) = ~0;
> -
>      cpuid_count(XSTATE_CPUID, 0, , , , );
>      feature_mask = (((u64)edx << 32) | eax) & XCNTXT_MASK;
>      BUG_ON(!valid_xcr0(feature_mask));
> @@ -657,8 +650,19 @@ void xstate_init(struct cpuinfo_x86 *c)
>       * Set CR4_OSXSAVE and run "cpuid" to get xsave_cntxt_size.
>       */
>      set_in_cr4(X86_CR4_OSXSAVE);
> +
> +    /*
> +     * Zap the cached values to make set_xcr0() and set_msr_xss() really
> write
> +     * the hardware register.
> +     */
> +    this_cpu(xcr0) = 0;
>      if ( !set_xcr0(feature_mask) )
>          BUG();
> +    if ( cpu_has_xsaves )
> +    {
> +        this_cpu(xss) = ~0;
> +        set_msr_xss(0);
> +    }

... this change, kind of breaking again your nice arrangement. Seeing for how
long that change has been pending, it _really_ should have gone in ahead of
this one, with you then sorting how you'd like things to be arranged in the
combined result, rather than me re-posting and then either again not getting
any feedback for years, or you disliking what I've done. Oh well ...

Jan
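
The caching problem described in the commit message can be modelled in a few lines. The sketch below is a simulation under stated assumptions, not Xen code: hw_xss_sketch stands in for the MSR, cached_xss_sketch for this_cpu(xss), and the setter is assumed to skip the hardware write when the cached value already matches, which is the behaviour the zapping is meant to defeat. All names are hypothetical.

```c
#include <stdint.h>

static uint64_t hw_xss_sketch;          /* stand-in for the hardware MSR */
static uint64_t cached_xss_sketch;      /* stand-in for this_cpu(xss) */
static unsigned int hw_writes_sketch;   /* counts simulated MSR writes */

static uint64_t get_xss_sketch(void)
{
    return cached_xss_sketch;           /* reads never touch hardware */
}

static void set_xss_sketch(uint64_t val)
{
    if ( cached_xss_sketch == val )
        return;                         /* cache hit: skip the write */

    hw_xss_sketch = val;
    hw_writes_sketch++;
    cached_xss_sketch = val;
}

/*
 * Zap, then write once: the invalid sentinel forces a real write, and that
 * same write removes the sentinel from the cache again.
 */
static void init_xss_sketch(void)
{
    cached_xss_sketch = ~0ULL;
    set_xss_sketch(0);
}
```

If the zap were done without the follow-up write, a later old = get; set(new); set(old) sequence would try to restore ~0, which is the failure the patch addresses.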
Re: [PATCH 4.5/8] tools/hvmloader: Further simplify SMP setup
On Thu, May 09, 2024 at 06:50:57PM +0100, Andrew Cooper wrote:
> Now that we're using hypercalls to start APs, we can replace the 'ap_cpuid'
> global with a regular function parameter. This requires telling the compiler
> that we'd like the parameter in a register rather than on the stack.
>
> While adjusting, rename to cpu_setup(). It's always been used on the BSP,
> making the name ap_start() specifically misleading.
>
> Signed-off-by: Andrew Cooper

Reviewed-by: Roger Pau Monné

Thanks, Roger.
Re: [PATCH v2 1/2] x86/hvm/trace: Use a different trace type for AMD processors
On 23/05/2024 3:10 pm, George Dunlap wrote:
> A long-standing usability sub-optimality with xenalyze is the
> necessity to specify `--svm-mode` when analyzing AMD processors. This
> fundamentally comes about because the same trace event ID is used for
> both VMX and SVM, but the contents of the trace must be interpreted
> differently.
>
> Instead, allocate separate trace events for VMX and SVM vmexits in
> Xen; this will allow all readers to properly interpret the meaning of
> the vmexit reason.
>
> In xenalyze, first remove the redundant call to init_hvm_data();
> there's no way to get to hvm_vmexit_process() without it being already
> initialized by the set_vcpu_type call in hvm_process().
>
> Replace this with set_hvm_exit_reson_data(), and move setting of
> hvm->exit_reason_* into that function.
>
> Modify hvm_process and hvm_vmexit_process to handle all four potential
> values appropriately.
>
> If SVM entries are encountered, set opt.svm_mode so that other
> SVM-specific functionality is triggered.
>
> Remove the `--svm-mode` command-line option, since it's now redundant.
>
> Signed-off-by: George Dunlap

Acked-by: Andrew Cooper
Re: [xen-4.17-testing test] 186087: regressions - FAIL
On 23.05.2024 16:40, osstest service owner wrote:
> flight 186087 xen-4.17-testing real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/186087/
>
> Regressions :-(
>
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  build-amd64                   6 xen-build                fail REGR. vs.
> 185864
>  build-amd64-xsm               6 xen-build                fail REGR. vs.
> 185864
>  build-i386-xsm                6 xen-build                fail REGR. vs.
> 185864
>  build-i386                    6 xen-build                fail REGR. vs.
> 185864
>  build-amd64-prev              6 xen-build                fail REGR. vs.
> 185864
>  build-i386-prev               6 xen-build                fail REGR. vs.
> 185864

These look to be recurring, yet at the same time these look to be infrastructure
issues. This not happening for the first time I'm not sure we can simply wait
and hope for the problem to clear itself.

Jan
Re: [PATCH v2 2/8] xen/x86: Simplify header dependencies in x86/hvm
On Thu, May 23, 2024 at 04:40:06PM +0200, Jan Beulich wrote:
> On 23.05.2024 16:37, Roger Pau Monné wrote:
> > On Wed, May 08, 2024 at 01:39:21PM +0100, Alejandro Vallejo wrote:
> >> --- a/xen/arch/x86/include/asm/hvm/hvm.h
> >> +++ b/xen/arch/x86/include/asm/hvm/hvm.h
> >> @@ -798,6 +798,12 @@ static inline void hvm_update_vlapic_mode(struct vcpu
> >> *v)
> >>          alternative_vcall(hvm_funcs.update_vlapic_mode, v);
> >>  }
> >>
> >> +static inline void hvm_vlapic_sync_pir_to_irr(struct vcpu *v)
> >> +{
> >> +    if ( hvm_funcs.sync_pir_to_irr )
> >> +        alternative_vcall(hvm_funcs.sync_pir_to_irr, v);
> >
> > Nit: for consistency the wrappers are usually named hvm_,
> > so in this case it would be hvm_sync_pir_to_irr(), or the hvm_funcs
> > field should be renamed to vlapic_sync_pir_to_irr.
>
> Funny you should mention that: See my earlier comment as well as what
> was committed.

Oh, sorry, didn't realize you already replied, adjusted and committed.

Thanks, Roger.
Re: [PATCH v2 3/8] x86/vlapic: Move lapic_load_hidden migration checks to the check hook
On Wed, May 08, 2024 at 01:39:22PM +0100, Alejandro Vallejo wrote:
> While at it, add a check for the reserved field in the hidden save area.
>
> Signed-off-by: Alejandro Vallejo
> ---
> v2:
>   * New patch. Addresses the missing check for rsvd_zero in v1.

Oh, it would be better if this was done at the time when rsvd_zero is
introduced. I think this should be moved ahead of the series, so that
the patch that introduces rsvd_zero can add the check in
lapic_check_hidden().

> ---
>  xen/arch/x86/hvm/vlapic.c | 41 ---
>  1 file changed, 30 insertions(+), 11 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
> index 8a24419c..2f06bff1b2cc 100644
> --- a/xen/arch/x86/hvm/vlapic.c
> +++ b/xen/arch/x86/hvm/vlapic.c
> @@ -1573,35 +1573,54 @@ static void lapic_load_fixup(struct vlapic *vlapic)
>                 v, vlapic->loaded.id, vlapic->loaded.ldr, good_ldr);
>  }
>
> -static int cf_check lapic_load_hidden(struct domain *d, hvm_domain_context_t
> *h)
> +static int cf_check lapic_check_hidden(const struct domain *d,
> +                                       hvm_domain_context_t *h)
>  {
>      unsigned int vcpuid = hvm_load_instance(h);
> -    struct vcpu *v;
> -    struct vlapic *s;
> +    struct hvm_hw_lapic s;
>
>      if ( !has_vlapic(d) )
>          return -ENODEV;
>
>      /* Which vlapic to load? */
> -    if ( vcpuid >= d->max_vcpus || (v = d->vcpu[vcpuid]) == NULL )
> +    if ( vcpuid >= d->max_vcpus || d->vcpu[vcpuid] == NULL )
>      {
>          dprintk(XENLOG_G_ERR, "HVM restore: dom%d has no apic%u\n",
>                  d->domain_id, vcpuid);
>          return -EINVAL;
>      }
> -    s = vcpu_vlapic(v);
>
> -    if ( hvm_load_entry_zeroextend(LAPIC, h, >hw) != 0 )
> +    if ( hvm_load_entry_zeroextend(LAPIC, h, ) )

Can't you use hvm_get_entry() to perform the sanity checks:

const struct hvm_hw_lapic *s = hvm_get_entry(LAPIC, h);

Thanks, Roger.
Re: [XEN PATCH v2 07/15] x86: guard cpu_has_{svm/vmx} macros with CONFIG_{SVM/VMX}
On 23.05.2024 15:07, Sergiy Kibrik wrote: > 16.05.24 14:12, Jan Beulich: >> On 15.05.2024 11:12, Sergiy Kibrik wrote: >>> --- a/xen/arch/x86/include/asm/cpufeature.h >>> +++ b/xen/arch/x86/include/asm/cpufeature.h >>> @@ -81,7 +81,8 @@ static inline bool boot_cpu_has(unsigned int feat) >>> #define cpu_has_sse3boot_cpu_has(X86_FEATURE_SSE3) >>> #define cpu_has_pclmulqdq boot_cpu_has(X86_FEATURE_PCLMULQDQ) >>> #define cpu_has_monitor boot_cpu_has(X86_FEATURE_MONITOR) >>> -#define cpu_has_vmx boot_cpu_has(X86_FEATURE_VMX) >>> +#define cpu_has_vmx ( IS_ENABLED(CONFIG_VMX) && \ >>> + boot_cpu_has(X86_FEATURE_VMX)) >>> #define cpu_has_eistboot_cpu_has(X86_FEATURE_EIST) >>> #define cpu_has_ssse3 boot_cpu_has(X86_FEATURE_SSSE3) >>> #define cpu_has_fma boot_cpu_has(X86_FEATURE_FMA) >>> @@ -109,7 +110,8 @@ static inline bool boot_cpu_has(unsigned int feat) >>> >>> /* CPUID level 0x8001.ecx */ >>> #define cpu_has_cmp_legacy boot_cpu_has(X86_FEATURE_CMP_LEGACY) >>> -#define cpu_has_svm boot_cpu_has(X86_FEATURE_SVM) >>> +#define cpu_has_svm ( IS_ENABLED(CONFIG_SVM) && \ >>> + boot_cpu_has(X86_FEATURE_SVM)) >>> #define cpu_has_sse4a boot_cpu_has(X86_FEATURE_SSE4A) >>> #define cpu_has_xop boot_cpu_has(X86_FEATURE_XOP) >>> #define cpu_has_skinit boot_cpu_has(X86_FEATURE_SKINIT) >> >> Hmm, leaving aside the style issue (stray blanks after opening parentheses, >> and as a result one-off indentation on the wrapped lines) I'm not really >> certain we can do this. The description goes into detail why we would want >> this, but it doesn't cover at all why it is safe for all present (and >> ideally also future) uses. I wouldn't be surprised if we had VMX/SVM checks >> just to derive further knowledge from that, without them being directly >> related to the use of VMX/SVM. Take a look at calculate_hvm_max_policy(), >> for example. While it looks to be okay there, it may give you an idea of >> what I mean. 
>> >> Things might become better separated if instead for such checks we used >> host and raw CPU policies instead of cpuinfo_x86.x86_capability[]. But >> that's still pretty far out, I'm afraid. > > I've followed a suggestion you made for patch in previous series: > > https://lore.kernel.org/xen-devel/8fbd604e-5e5d-410c-880f-2ad257bbe...@suse.com/ See the "If not, ..." that I had put there. Doing the change just mechanically isn't enough, you also need to make clear (in the description) that you verified it's safe to have this way. > yet if this approach can potentially be unsafe (I'm not completely sure > it's safe), should we instead fallback to the way it was done in v1 > series? I.e. guard calls to vmx/svm-specific calls where needed, like in > these 3 patches: > > 1) > https://lore.kernel.org/xen-devel/20240416063328.3469386-1-sergiy_kib...@epam.com/ > > 2) > https://lore.kernel.org/xen-devel/20240416063740.3469592-1-sergiy_kib...@epam.com/ > > 3) > https://lore.kernel.org/xen-devel/20240416063947.3469718-1-sergiy_kib...@epam.com/ I don't like this sprinkling around of IS_ENABLED() very much. Maybe we want to have two new helpers (say using_svm() and using_vmx()), to be used in place of most but possibly not all cpu_has_{svm,vmx}? Doing such a transformation would then kind of implicitly answer the safety question above, as at every use site you'd need to judge whether the replacement is correct. If it's correct everywhere, the construct(s) as proposed in this version could then be considered to be used in this very shape (instead of introducing the two new helpers). But of course the transition could also be done gradually then, touching only those uses that previously you touched in 1), 2), and 3). Jan
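A rough standalone sketch of the two helpers floated above, for illustration only. The names using_vmx() and using_svm() come from the mail; everything else is an assumption: the IS_ENABLED() stand-in imitates the usual Kconfig-style macro trick, CONFIG_VMX is set by hand, and feat_vmx / feat_svm replace boot_cpu_has(X86_FEATURE_*).

```c
#include <stdbool.h>

/* Build-time switches, set by hand here; CONFIG_SVM is deliberately undefined. */
#define CONFIG_VMX 1

/* Stand-in for IS_ENABLED(): 1 if the option is defined to 1, else 0. */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND(ignored, val, ...) val
#define IS_ENABLED_3(arg1_or_junk) TAKE_SECOND(arg1_or_junk 1, 0, 0)
#define IS_ENABLED_2(val) IS_ENABLED_3(ARG_PLACEHOLDER_##val)
#define IS_ENABLED(option) IS_ENABLED_2(option)

/* Stand-ins for the raw CPU feature bits. */
static bool feat_vmx, feat_svm;

/* The raw "hardware has the feature" checks stay untouched. */
#define cpu_has_vmx feat_vmx
#define cpu_has_svm feat_svm

/* "Are we going to use it": the feature is present and support is built in. */
static inline bool using_vmx(void)
{
    return IS_ENABLED(CONFIG_VMX) && cpu_has_vmx;
}

static inline bool using_svm(void)
{
    return IS_ENABLED(CONFIG_SVM) && cpu_has_svm;
}
```

With this split, a caller that only derives further knowledge from the CPUID bit keeps using cpu_has_*, while a caller about to invoke VMX/SVM-specific code switches to using_*(); each use site has to be judged individually, which is the point made above.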
Re: [XEN PATCH v2 06/15] x86/p2m: guard altp2m code with CONFIG_ALTP2M option
On 23.05.2024 12:44, Sergiy Kibrik wrote: > 16.05.24 14:01, Jan Beulich: >> On 15.05.2024 11:10, Sergiy Kibrik wrote: >>> --- a/xen/arch/x86/include/asm/hvm/hvm.h >>> +++ b/xen/arch/x86/include/asm/hvm/hvm.h >>> @@ -670,7 +670,7 @@ static inline bool hvm_hap_supported(void) >>> /* returns true if hardware supports alternate p2m's */ >>> static inline bool hvm_altp2m_supported(void) >>> { >>> -return hvm_funcs.caps.altp2m; >>> +return IS_ENABLED(CONFIG_ALTP2M) && hvm_funcs.caps.altp2m; >> >> Which in turn raises the question whether the altp2m struct field shouldn't >> become conditional upon CONFIG_ALTP2M too (or rather: instead, as the change >> here then would need to be done differently). Yet maybe that would entail >> further changes elsewhere, so may well better be left for later. > > but hvm_funcs.caps.altp2m is only a capability bit -- is it worth to > become conditional? Well, the comment was more based on the overall principle than the actual space savings that might result. Plus as said - likely that would not work anyway without further changes elsewhere. So perhaps okay to leave as you have it. >>> --- a/xen/arch/x86/mm/Makefile >>> +++ b/xen/arch/x86/mm/Makefile >>> @@ -1,7 +1,7 @@ >>> obj-y += shadow/ >>> obj-$(CONFIG_HVM) += hap/ >>> >>> -obj-$(CONFIG_HVM) += altp2m.o >>> +obj-$(CONFIG_ALTP2M) += altp2m.o >> >> This change I think wants to move to patch 5. >> > > If this moves to patch 5 then HVM=y && ALTP2M=n configuration > combination will break the build in between patch 5 and 6, so I've > decided to put it together with fixes of these build failures in patch 6. Hmm, yes, I think I see what you mean. > Maybe I can merge patch 5 & 6 together then ? Perhaps more consistent that way, yes. Jan
[xen-4.17-testing test] 186087: regressions - FAIL
flight 186087 xen-4.17-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/186087/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64 6 xen-build fail REGR. vs. 185864
 build-amd64-xsm 6 xen-build fail REGR. vs. 185864
 build-i386-xsm 6 xen-build fail REGR. vs. 185864
 build-i386 6 xen-build fail REGR. vs. 185864
 build-amd64-prev 6 xen-build fail REGR. vs. 185864
 build-i386-prev 6 xen-build fail REGR. vs. 185864

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-rtds 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-raw 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-ws16-amd64 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-win7-amd64 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemut-ws16-amd64 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemut-win7-amd64 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 build-amd64-libvirt 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qcow2 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-pvshim 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-pvhv2-intel 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-pvhv2-amd 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-multivcpu 1 build-check(1) blocked n/a
 build-i386-libvirt 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-credit2 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-credit1 1 build-check(1) blocked n/a
 test-amd64-amd64-dom0pvh-xl-amd 1 build-check(1) blocked n/a
 test-amd64-amd64-dom0pvh-xl-intel 1 build-check(1) blocked n/a
 test-amd64-amd64-xl 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-pair 1 build-check(1) blocked n/a
 test-amd64-amd64-qemuu-nested-intel 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-qcow2 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-qemuu-nested-amd 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-raw 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-vhd 1 build-check(1) blocked n/a
 test-amd64-amd64-qemuu-freebsd12-amd64 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-livepatch 1 build-check(1) blocked n/a
 test-amd64-amd64-qemuu-freebsd11-amd64 1 build-check(1) blocked n/a
 test-amd64-amd64-migrupgrade 1 build-check(1) blocked n/a
 test-amd64-amd64-pair 1 build-check(1) blocked n/a
 test-amd64-amd64-pygrub 1 build-check(1) blocked n/a
 test-xtf-amd64-amd64-5 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-shadow 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-vhd 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-xsm 1 build-check(1) blocked n/a
 test-amd64-coresched-amd64-xl 1 build-check(1) blocked n/a
 test-xtf-amd64-amd64-1 1 build-check(1) blocked n/a
 test-xtf-amd64-amd64-2 1 build-check(1) blocked n/a
 test-xtf-amd64-amd64-3 1 build-check(1) blocked n/a
 test-xtf-amd64-amd64-4 1 build-check(1) blocked n/a
 test-armhf-armhf-libvirt 16 saverestore-support-check fail like 185864
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit1 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit1 16 saverestore-support-check fail never pass
Re: [PATCH v2 2/8] xen/x86: Simplify header dependencies in x86/hvm
On 23.05.2024 16:37, Roger Pau Monné wrote:
> On Wed, May 08, 2024 at 01:39:21PM +0100, Alejandro Vallejo wrote:
>> --- a/xen/arch/x86/include/asm/hvm/hvm.h
>> +++ b/xen/arch/x86/include/asm/hvm/hvm.h
>> @@ -798,6 +798,12 @@ static inline void hvm_update_vlapic_mode(struct vcpu
>> *v)
>>          alternative_vcall(hvm_funcs.update_vlapic_mode, v);
>>  }
>>
>> +static inline void hvm_vlapic_sync_pir_to_irr(struct vcpu *v)
>> +{
>> +    if ( hvm_funcs.sync_pir_to_irr )
>> +        alternative_vcall(hvm_funcs.sync_pir_to_irr, v);
>
> Nit: for consistency the wrappers are usually named hvm_,
> so in this case it would be hvm_sync_pir_to_irr(), or the hvm_funcs
> field should be renamed to vlapic_sync_pir_to_irr.

Funny you should mention that: See my earlier comment as well as what
was committed.

Jan
Re: [PATCH v2 2/8] xen/x86: Simplify header dependencies in x86/hvm
On Wed, May 08, 2024 at 01:39:21PM +0100, Alejandro Vallejo wrote: > Otherwise it's not possible to call functions described in hvm/vlapic.h from > the > inline functions of hvm/hvm.h. > > This is because a static inline in vlapic.h depends on hvm.h, and pulls it > transitively through vpt.h. The ultimate cause is having hvm.h included in any > of the "v*.h" headers, so break the cycle moving the guilty inline into hvm.h. > > No functional change. > > Signed-off-by: Alejandro Vallejo Acked-by: Roger Pau Monné One cosmetic comment below. > --- > v2: > * New patch. Prereq to moving vlapic_cpu_policy_changed() onto hvm.h > --- > xen/arch/x86/hvm/irq.c| 6 +++--- > xen/arch/x86/hvm/vlapic.c | 4 ++-- > xen/arch/x86/include/asm/hvm/hvm.h| 6 ++ > xen/arch/x86/include/asm/hvm/vlapic.h | 6 -- > xen/arch/x86/include/asm/hvm/vpt.h| 1 - > 5 files changed, 11 insertions(+), 12 deletions(-) > > diff --git a/xen/arch/x86/hvm/irq.c b/xen/arch/x86/hvm/irq.c > index 4a9fe82cbd8d..4f5479b12c98 100644 > --- a/xen/arch/x86/hvm/irq.c > +++ b/xen/arch/x86/hvm/irq.c > @@ -512,13 +512,13 @@ struct hvm_intack hvm_vcpu_has_pending_irq(struct vcpu > *v) > int vector; > > /* > - * Always call vlapic_sync_pir_to_irr so that PIR is synced into IRR when > - * using posted interrupts. Note this is also done by > + * Always call hvm_vlapic_sync_pir_to_irr so that PIR is synced into IRR > + * when using posted interrupts. Note this is also done by > * vlapic_has_pending_irq but depending on which interrupts are pending > * hvm_vcpu_has_pending_irq will return early without calling > * vlapic_has_pending_irq. 
> */ > -vlapic_sync_pir_to_irr(v); > +hvm_vlapic_sync_pir_to_irr(v); > > if ( unlikely(v->arch.nmi_pending) ) > return hvm_intack_nmi; > diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c > index 61a96474006b..8a24419c 100644 > --- a/xen/arch/x86/hvm/vlapic.c > +++ b/xen/arch/x86/hvm/vlapic.c > @@ -98,7 +98,7 @@ static void vlapic_clear_irr(int vector, struct vlapic > *vlapic) > > static int vlapic_find_highest_irr(struct vlapic *vlapic) > { > -vlapic_sync_pir_to_irr(vlapic_vcpu(vlapic)); > +hvm_vlapic_sync_pir_to_irr(vlapic_vcpu(vlapic)); > > return vlapic_find_highest_vector(>regs->data[APIC_IRR]); > } > @@ -1516,7 +1516,7 @@ static int cf_check lapic_save_regs(struct vcpu *v, > hvm_domain_context_t *h) > if ( !has_vlapic(v->domain) ) > return 0; > > -vlapic_sync_pir_to_irr(v); > +hvm_vlapic_sync_pir_to_irr(v); > > return hvm_save_entry(LAPIC_REGS, v->vcpu_id, h, vcpu_vlapic(v)->regs); > } > diff --git a/xen/arch/x86/include/asm/hvm/hvm.h > b/xen/arch/x86/include/asm/hvm/hvm.h > index e1f0585d75a9..84911f3ebcb4 100644 > --- a/xen/arch/x86/include/asm/hvm/hvm.h > +++ b/xen/arch/x86/include/asm/hvm/hvm.h > @@ -798,6 +798,12 @@ static inline void hvm_update_vlapic_mode(struct vcpu *v) > alternative_vcall(hvm_funcs.update_vlapic_mode, v); > } > > +static inline void hvm_vlapic_sync_pir_to_irr(struct vcpu *v) > +{ > +if ( hvm_funcs.sync_pir_to_irr ) > +alternative_vcall(hvm_funcs.sync_pir_to_irr, v); Nit: for consistency the wrappers are usually named hvm_, so in this case it would be hvm_sync_pir_to_irr(), or the hvm_funcs field should be renamed to vlapic_sync_pir_to_irr. Thanks, Roger.
Re: [xen-unstable-smoke test] 186107: regressions - FAIL
On 23.05.2024 15:45, osstest service owner wrote:
> flight 186107 xen-unstable-smoke real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/186107/
>
> Regressions :-(
>
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  build-armhf 6 xen-build fail REGR. vs. 186064

Found ninja-1.11.1 at /usr/bin/ninja

ERROR: Clock skew detected. File /usr/bin/bash has a time stamp 1682259478.4465s in the future.

A full log can be found at /home/osstest/build.186107.build-armhf/xen/tools/qemu-xen-build/meson-logs/meson-log.txt

ERROR: meson setup failed

make: Entering directory '/home/osstest/build.186107.build-armhf/xen/tools/qemu-xen-build'
config-host.mak is out-of-date, running configure
GIT ui/keycodemapdb meson tests/fp/berkeley-testfloat-3 tests/fp/berkeley-softfloat-3 dtc
bash: line 4: ./config.status: No such file or directory
make: *** No rule to make target 'config-host.mak', needed by 'Makefile.prereqs'. Stop.
make: *** Waiting for unfinished jobs
make: Leaving directory '/home/osstest/build.186107.build-armhf/xen/tools/qemu-xen-build'
make[2]: *** [Makefile:212: subdir-all-qemu-xen-dir] Error 2
make[2]: Leaving directory '/home/osstest/build.186107.build-armhf/xen/tools'
make[1]: *** [/home/osstest/build.186107.build-armhf/xen/tools/../tools/Rules.mk:199: subdirs-all] Error 2
make[1]: Leaving directory '/home/osstest/build.186107.build-armhf/xen/tools'
make: *** [Makefile:63: build-tools] Error 2

Suggests to me that there's some issue with the build host.

Jan
Re: [PATCH v10 02/14] xen: introduce generic non-atomic test_*bit()
On 23/05/2024 15:11, Oleksii K. wrote: On Thu, 2024-05-23 at 14:00 +0100, Julien Grall wrote: Hi Oleksii, Hi Julien, On 17/05/2024 14:54, Oleksii Kurochko wrote: diff --git a/xen/arch/arm/arm64/livepatch.c b/xen/arch/arm/arm64/livepatch.c index df2cebedde..4bc8ed9be5 100644 --- a/xen/arch/arm/arm64/livepatch.c +++ b/xen/arch/arm/arm64/livepatch.c @@ -10,7 +10,6 @@ #include #include -#include It is a bit unclear how this change is related to the patch. Can you explain in the commit message? Probably it doesn't need anymore. I will double check and if this change is not needed, I will just drop it in the next patch version. #include #include #include diff --git a/xen/arch/arm/include/asm/bitops.h b/xen/arch/arm/include/asm/bitops.h index 5104334e48..8e16335e76 100644 --- a/xen/arch/arm/include/asm/bitops.h +++ b/xen/arch/arm/include/asm/bitops.h @@ -22,9 +22,6 @@ #define __set_bit(n,p) set_bit(n,p) #define __clear_bit(n,p) clear_bit(n,p) -#define BITOP_BITS_PER_WORD 32 -#define BITOP_MASK(nr) (1UL << ((nr) % BITOP_BITS_PER_WORD)) -#define BITOP_WORD(nr) ((nr) / BITOP_BITS_PER_WORD) #define BITS_PER_BYTE 8 OOI, any reason BITS_PER_BYTE has not been moved as well? I don't expect the value to change across arch. I can move it to generic one header too in the next patch version. [...] 
diff --git a/xen/include/xen/bitops.h b/xen/include/xen/bitops.h index f14ad0d33a..6eeeff0117 100644 --- a/xen/include/xen/bitops.h +++ b/xen/include/xen/bitops.h @@ -65,10 +65,141 @@ static inline int generic_flsl(unsigned long x) * scope */ +#define BITOP_BITS_PER_WORD 32 +typedef uint32_t bitop_uint_t; + +#define BITOP_MASK(nr) ((bitop_uint_t)1 << ((nr) % BITOP_BITS_PER_WORD)) + +#define BITOP_WORD(nr) ((nr) / BITOP_BITS_PER_WORD) + +extern void __bitop_bad_size(void); + +#define bitop_bad_size(addr) (sizeof(*(addr)) < sizeof(bitop_uint_t)) + /* - Please tidy above here - */ #include +/** + * generic__test_and_set_bit - Set a bit and return its old value + * @nr: Bit to set + * @addr: Address to count from + * + * This operation is non-atomic and can be reordered. + * If two examples of this operation race, one can appear to succeed + * but actually fail. You must protect multiple accesses with a lock. + */ Sorry for only mentioning this on v10. I think this comment should be duplicated (or moved to) on top of test_bit() because this is what everyone will use. This will avoid the developper to follow the function calls and only notice the x86 version which says "This function is atomic and may not be reordered." and would be wrong for all the other arch. It makes sense to add this comment on top of test_bit(), but I am curious if it is needed to mention that for x86 arch_test_bit() "is atomic and may not be reordered": I would say no because any developper modifying common code can't relying it. * This operation is non-atomic and can be reordered. ( Exception: for * x86 arch_test_bit() is atomic and may not be reordered ) * If two examples of this operation race, one can appear to succeed * but actually fail. You must protect multiple accesses with a lock. 
*/ +static always_inline bool +generic__test_and_set_bit(int nr, volatile void *addr) +{ + bitop_uint_t mask = BITOP_MASK(nr); + volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr + BITOP_WORD(nr); + bitop_uint_t old = *p; + + *p = old | mask; + return (old & mask); +} + +/** + * generic__test_and_clear_bit - Clear a bit and return its old value + * @nr: Bit to clear + * @addr: Address to count from + * + * This operation is non-atomic and can be reordered. + * If two examples of this operation race, one can appear to succeed + * but actually fail. You must protect multiple accesses with a lock. + */ Same applies here and ... +static always_inline bool +generic__test_and_clear_bit(int nr, volatile void *addr) +{ + bitop_uint_t mask = BITOP_MASK(nr); + volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr + BITOP_WORD(nr); + bitop_uint_t old = *p; + + *p = old & ~mask; + return (old & mask); +} + +/* WARNING: non atomic and it can be reordered! */ ... here. +static always_inline bool +generic__test_and_change_bit(int nr, volatile void *addr) +{ + bitop_uint_t mask = BITOP_MASK(nr); + volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr + BITOP_WORD(nr); + bitop_uint_t old = *p; + + *p = old ^ mask; + return (old & mask); +} +/** + * generic_test_bit - Determine whether a bit is set + * @nr: bit number to test + * @addr: Address to start counting from + */ +static always_inline bool generic_test_bit(int nr, const volatile void *addr) +{ + bitop_uint_t mask = BITOP_MASK(nr); + const volatile bitop_uint_t *p = + (const volatile bitop_uint_t *)addr + BITOP_WORD(nr); + +
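For readers following the thread, a small standalone sketch of the generic, non-atomic helpers under discussion. The BITOP_* macros and bitop_uint_t mirror the quoted patch; the sketch_ prefix on the function names is only there to mark them as an illustration rather than the Xen API.

```c
#include <stdbool.h>
#include <stdint.h>

#define BITOP_BITS_PER_WORD 32
typedef uint32_t bitop_uint_t;

#define BITOP_MASK(nr) ((bitop_uint_t)1 << ((nr) % BITOP_BITS_PER_WORD))
#define BITOP_WORD(nr) ((nr) / BITOP_BITS_PER_WORD)

/* Non-atomic: a plain read-modify-write, so callers need their own locking. */
static inline bool sketch_test_and_set_bit(int nr, volatile void *addr)
{
    bitop_uint_t mask = BITOP_MASK(nr);
    volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr + BITOP_WORD(nr);
    bitop_uint_t old = *p;

    *p = old | mask;
    return old & mask;
}

/* Non-atomic counterpart that clears the bit and reports its old value. */
static inline bool sketch_test_and_clear_bit(int nr, volatile void *addr)
{
    bitop_uint_t mask = BITOP_MASK(nr);
    volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr + BITOP_WORD(nr);
    bitop_uint_t old = *p;

    *p = old & ~mask;
    return old & mask;
}

/* Plain read of one bit. */
static inline bool sketch_test_bit(int nr, const volatile void *addr)
{
    const volatile bitop_uint_t *p =
        (const volatile bitop_uint_t *)addr + BITOP_WORD(nr);

    return *p & BITOP_MASK(nr);
}
```

With 32-bit words, bit 33 lands in word 1 as bit 1. Two racing callers of sketch_test_and_set_bit() can both see the old value as clear, which is why the comment about locking belongs on the function people actually call.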
Re: [PATCH v2 1/8] xen/x86: Add initial x2APIC ID to the per-vLAPIC save area
On Wed, May 08, 2024 at 01:39:20PM +0100, Alejandro Vallejo wrote: > This allows the initial x2APIC ID to be sent on the migration stream. The > hardcoded mapping x2apic_id=2*vcpu_id is maintained for the time being. > Given the vlapic data is zero-extended on restore, fix up migrations from > hosts without the field by setting it to the old convention if zero. > > x2APIC IDs are calculated from the CPU policy where the guest topology is > defined. For the time being, the function simply returns the old > relationship, but will eventually return results consistent with the > topology. > > Signed-off-by: Alejandro Vallejo > --- > v2: > * Removed usage of SET_xAPIC_ID(). > * Restored previous logic when exposing leaf 0xb, and gate it for HVM only. > * Rewrote comment in lapic_load_fixup, including the implicit assumption. > * Moved vlapic_cpu_policy_changed() into hvm_cpuid_policy_changed()) > * const-ified policy in vlapic_cpu_policy_changed() > --- > xen/arch/x86/cpuid.c | 15 - > xen/arch/x86/hvm/vlapic.c | 30 -- > xen/arch/x86/include/asm/hvm/hvm.h | 1 + > xen/arch/x86/include/asm/hvm/vlapic.h | 2 ++ > xen/include/public/arch-x86/hvm/save.h | 2 ++ > xen/include/xen/lib/x86/cpu-policy.h | 9 > xen/lib/x86/policy.c | 11 ++ > 7 files changed, 57 insertions(+), 13 deletions(-) > > diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c > index 7a38e032146a..242c21ec5bb6 100644 > --- a/xen/arch/x86/cpuid.c > +++ b/xen/arch/x86/cpuid.c > @@ -139,10 +139,9 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf, > const struct cpu_user_regs *regs; > > case 0x1: > -/* TODO: Rework topology logic. */ > res->b &= 0x00ffu; > if ( is_hvm_domain(d) ) > -res->b |= (v->vcpu_id * 2) << 24; > +res->b |= vlapic_x2apic_id(vcpu_vlapic(v)) << 24; > > /* TODO: Rework vPMU control in terms of toolstack choices. 
*/ > if ( vpmu_available(v) && > @@ -311,19 +310,13 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf, > break; > > case 0xb: > -/* > - * In principle, this leaf is Intel-only. In practice, it is tightly > - * coupled with x2apic, and we offer an x2apic-capable APIC emulation > - * to guests on AMD hardware as well. > - * > - * TODO: Rework topology logic. > - */ > -if ( p->basic.x2apic ) > +/* Don't expose topology information to PV guests */ Not sure whether we want to keep part of the comment about exposing x2APIC to guests even when x2APIC is not present in the host. I think this code has changed and the comment is kind of stale now. > +if ( is_hvm_domain(d) && p->basic.x2apic ) > { > *(uint8_t *)>c = subleaf; > > /* Fix the x2APIC identifier. */ > -res->d = v->vcpu_id * 2; > +res->d = vlapic_x2apic_id(vcpu_vlapic(v)); > } > break; > > diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c > index 05072a21bf38..61a96474006b 100644 > --- a/xen/arch/x86/hvm/vlapic.c > +++ b/xen/arch/x86/hvm/vlapic.c > @@ -1069,7 +1069,7 @@ static uint32_t x2apic_ldr_from_id(uint32_t id) > static void set_x2apic_id(struct vlapic *vlapic) > { > const struct vcpu *v = vlapic_vcpu(vlapic); > -uint32_t apic_id = v->vcpu_id * 2; > +uint32_t apic_id = vlapic->hw.x2apic_id; > uint32_t apic_ldr = x2apic_ldr_from_id(apic_id); > > /* > @@ -1083,6 +1083,22 @@ static void set_x2apic_id(struct vlapic *vlapic) > vlapic_set_reg(vlapic, APIC_LDR, apic_ldr); > } > > +void vlapic_cpu_policy_changed(struct vcpu *v) > +{ > +struct vlapic *vlapic = vcpu_vlapic(v); > +const struct cpu_policy *cp = v->domain->arch.cpu_policy; > + > +/* > + * Don't override the initial x2APIC ID if we have migrated it or > + * if the domain doesn't have vLAPIC at all. 
> + */ > +if ( !has_vlapic(v->domain) || vlapic->loaded.hw ) > +return; > + > +vlapic->hw.x2apic_id = x86_x2apic_id_from_vcpu_id(cp, v->vcpu_id); > +vlapic_set_reg(vlapic, APIC_ID, SET_xAPIC_ID(vlapic->hw.x2apic_id)); Nit: in case we decide to start APICs in x2APIC mode, might be good to take this into account here and use vlapic_x2apic_mode(vlapic) to select whether SET_xAPIC_ID() needs to be used or not: vlapic_set_reg(vlapic, APIC_ID, vlapic_x2apic_mode(vlapic) ? vlapic->hw.x2apic_id : SET_xAPIC_ID(vlapic->hw.x2apic_id)); Or similar. > +} > + > int guest_wrmsr_apic_base(struct vcpu *v, uint64_t val) > { > const struct cpu_policy *cp = v->domain->arch.cpu_policy; > @@ -1449,7 +1465,7 @@ void vlapic_reset(struct vlapic *vlapic) > if ( v->vcpu_id == 0 ) > vlapic->hw.apic_base_msr |= APIC_BASE_BSP; > > -vlapic_set_reg(vlapic, APIC_ID,
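A standalone sketch of the APIC_ID selection suggested above, under stated assumptions: the helper names are made up for the illustration, xapic_id_reg() is only modelled on what SET_xAPIC_ID() is used for (an 8-bit ID in bits 31:24), and the vcpu_id * 2 mapping is the legacy convention the patch keeps for now.

```c
#include <stdbool.h>
#include <stdint.h>

/* xAPIC mode: the APIC_ID register holds an 8-bit ID in bits 31:24. */
static inline uint32_t xapic_id_reg(uint32_t id)
{
    return (id & 0xffu) << 24;
}

/* Legacy convention retained by the patch for the time being. */
static inline uint32_t x2apic_id_from_vcpu_id(uint32_t vcpu_id)
{
    return vcpu_id * 2;
}

/* x2APIC mode exposes the full 32-bit ID; xAPIC mode uses the shifted form. */
static inline uint32_t apic_id_reg_value(uint32_t x2apic_id, bool x2apic_mode)
{
    return x2apic_mode ? x2apic_id : xapic_id_reg(x2apic_id);
}
```

For vCPU 3 this gives an x2APIC ID of 6, stored as 0x06000000 in xAPIC mode and as 6 in x2APIC mode.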
Re: [PATCH v4 0/2] Add API for making parts of a MMIO page R/O and use it in XHCI console
On 23.05.2024 16:22, Marek Marczykowski-Górecki wrote: > On Wed, May 22, 2024 at 05:39:02PM +0200, Marek Marczykowski-Górecki wrote: >> On older systems, XHCI xcap had a layout that no other (interesting) >> registers >> were placed on the same page as the debug capability, so Linux was fine with >> making the whole page R/O. But at least on Tiger Lake and Alder Lake, Linux >> needs to write to some other registers on the same page too. >> >> Add a generic API for making just parts of an MMIO page R/O and use it to fix >> USB3 console with share=yes or share=hwdom options. More details in commit >> messages. >> >> Marek Marczykowski-Górecki (2): >> x86/mm: add API for marking only part of a MMIO page read only >> drivers/char: Use sub-page ro API to make just xhci dbc cap RO > > Does any other x86 maintainer feel comfortable ack-ing this series? Jan > already reviewed 2/2 here (but not 1/2 in this version), Which, btw, isn't to mean I'm not going to look at it. But 2/2 was the lower hanging fruit ... Jan > but also said > he is not comfortable with letting this in without a second maintainer > approval: > https://lore.kernel.org/xen-devel/7655e401-b927-4250-ae63-05361a5ee...@suse.com/
[PATCH v2 1/2] x86/hvm/trace: Use a different trace type for AMD processors
A long-standing usability sub-optimality with xenalyze is the necessity to specify `--svm-mode` when analyzing AMD processors. This fundamentally comes about because the same trace event ID is used for both VMX and SVM, but the contents of the trace must be interpreted differently.

Instead, allocate separate trace events for VMX and SVM vmexits in Xen; this will allow all readers to properly interpret the meaning of the vmexit reason.

In xenalyze, first remove the redundant call to init_hvm_data(); there's no way to get to hvm_vmexit_process() without it being already initialized by the set_vcpu_type call in hvm_process().

Replace this with set_hvm_exit_reason_data(), and move setting of hvm->exit_reason_* into that function.

Modify hvm_process and hvm_vmexit_process to handle all four potential values appropriately.

If SVM entries are encountered, set opt.svm_mode so that other SVM-specific functionality is triggered.

Remove the `--svm-mode` command-line option, since it's now redundant.
Signed-off-by: George Dunlap --- v2: - Rebase to tip of staging - Rebase over xentrace_format removal - Fix typo in commit message - Remove --svm-mode command-line flag CC: Andrew Cooper CC: Jan Beulich CC: Roger Pau Monne CC: Anthony Perard CC: Stefano Stabellini CC: Julien Grall CC: Olaf Hering --- tools/xentrace/xenalyze.c | 37 +++-- xen/arch/x86/hvm/svm/svm.c | 4 ++-- xen/arch/x86/hvm/vmx/vmx.c | 4 ++-- xen/include/public/trace.h | 6 -- 4 files changed, 27 insertions(+), 24 deletions(-) diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c index ce6a85d50b..9c4463b0e8 100644 --- a/tools/xentrace/xenalyze.c +++ b/tools/xentrace/xenalyze.c @@ -1437,14 +1437,6 @@ void init_hvm_data(struct hvm_data *h, struct vcpu_data *v) { h->init = 1; -if(opt.svm_mode) { -h->exit_reason_max = HVM_SVM_EXIT_REASON_MAX; -h->exit_reason_name = hvm_svm_exit_reason_name; -} else { -h->exit_reason_max = HVM_VMX_EXIT_REASON_MAX; -h->exit_reason_name = hvm_vmx_exit_reason_name; -} - if(opt.histogram_interrupt_eip) { int count = ((1ULLexit_reason_max = HVM_SVM_EXIT_REASON_MAX; +h->exit_reason_name = hvm_svm_exit_reason_name; +} else { +h->exit_reason_max = HVM_VMX_EXIT_REASON_MAX; +h->exit_reason_name = hvm_vmx_exit_reason_name; +} +} + /* PV data */ enum { PV_HYPERCALL=1, @@ -5088,13 +5092,13 @@ void hvm_vmexit_process(struct record_info *ri, struct hvm_data *h, r = (typeof(r))ri->d; -if(!h->init) -init_hvm_data(h, v); +if(!h->exit_reason_name) +set_hvm_exit_reason_data(h, ri->event); h->vmexit_valid=1; bzero(>inflight, sizeof(h->inflight)); -if(ri->event == TRC_HVM_VMEXIT64) { +if(ri->event & TRC_64_FLAG) { if(v->guest_paging_levels != 4) { if ( verbosity >= 6 ) @@ -5316,8 +5320,10 @@ void hvm_process(struct pcpu_info *p) break; default: switch(ri->event) { -case TRC_HVM_VMEXIT: -case TRC_HVM_VMEXIT64: +case TRC_HVM_VMX_EXIT: +case TRC_HVM_VMX_EXIT64: +case TRC_HVM_SVM_EXIT: +case TRC_HVM_SVM_EXIT64: UPDATE_VOLUME(p, hvm[HVM_VOL_VMEXIT], ri->size); hvm_vmexit_process(ri, h, 
v); break; @@ -10884,11 +10890,6 @@ const struct argp_option cmd_opts[] = { .arg = "HZ", .doc = "Cpu speed of the tracing host, used to convert tsc into seconds.", }, -{ .name = "svm-mode", - .key = OPT_SVM_MODE, - .group = OPT_GROUP_HARDWARE, - .doc = "Assume AMD SVM-style vmexit error codes. (Default is Intel VMX.)", }, - { .name = "progress", .key = OPT_PROGRESS, .doc = "Progress dialog. Requires the zenity (GTK+) executable.", }, diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c index db530d55f2..988250dbc1 100644 --- a/xen/arch/x86/hvm/svm/svm.c +++ b/xen/arch/x86/hvm/svm/svm.c @@ -2571,10 +2571,10 @@ void asmlinkage svm_vmexit_handler(void) exit_reason = vmcb->exitcode; if ( hvm_long_mode_active(v) ) -TRACE_TIME(TRC_HVM_VMEXIT64 | (vcpu_guestmode ? TRC_HVM_NESTEDFLAG : 0), +TRACE_TIME(TRC_HVM_SVM_EXIT64 | (vcpu_guestmode ? TRC_HVM_NESTEDFLAG : 0), exit_reason, regs->rip, regs->rip >> 32); else -TRACE_TIME(TRC_HVM_VMEXIT | (vcpu_guestmode ? TRC_HVM_NESTEDFLAG : 0), +TRACE_TIME(TRC_HVM_SVM_EXIT | (vcpu_guestmode ? TRC_HVM_NESTEDFLAG : 0), exit_reason, regs->eip); if ( vcpu_guestmode ) diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c index 8ba996546f..f16faa6a61 100644 ---
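The idea of the patch, letting the event ID itself say whether a vmexit record is VMX or SVM, can be sketched in isolation as follows. The EV_* values are placeholders chosen for the example, not the IDs from xen/include/public/trace.h, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder event IDs; the real ones live in xen/include/public/trace.h. */
#define EV_64_FLAG  0x100u
#define EV_VMX_EXIT 0x01u
#define EV_SVM_EXIT 0x02u

enum exit_flavour { EXIT_UNKNOWN, EXIT_VMX, EXIT_SVM };

/* Choose the exit-reason interpretation from the record, not from a CLI flag. */
static enum exit_flavour exit_flavour_of(uint32_t event)
{
    switch ( event & ~EV_64_FLAG )
    {
    case EV_VMX_EXIT:
        return EXIT_VMX;
    case EV_SVM_EXIT:
        return EXIT_SVM;
    default:
        return EXIT_UNKNOWN;
    }
}

/* The 64-bit variants differ only by a flag bit, so one test covers both. */
static bool is_64bit_exit(uint32_t event)
{
    return (event & EV_64_FLAG) != 0;
}
```

A reader built this way needs no equivalent of `--svm-mode`: the first vmexit record it sees is enough to select the matching exit-reason table.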
[PATCH v2 2/2] tools/xenalyze: Ignore HVM_EMUL events harder
To unify certain common sanity checks, checks are done very early in processing based only on the top-level type. Unfortunately, when TRC_HVM_EMUL was introduced, it broke some of the assumptions about how the top-level types worked. Namely, traces of this type will show up outside of HVM contexts: in idle domains and in PV domains. Make an explicit exception for TRC_HVM_EMUL types in a number of places: - Pass the record info pointer to toplevel_assert_check, so that it can exclude TRC_HVM_EMUL records from idle and vcpu data_mode checks - Don't attempt to set the vcpu data_type in hvm_process for TRC_HVM_EMUL records. Signed-off-by: George Dunlap Acked-by: Andrew Cooper --- CC: Andrew Cooper CC: Anthony Perard CC: Olaf Hering --- tools/xentrace/xenalyze.c | 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c index 9c4463b0e8..d95e52695f 100644 --- a/tools/xentrace/xenalyze.c +++ b/tools/xentrace/xenalyze.c @@ -21,6 +21,7 @@ #define _XOPEN_SOURCE 600 #include #include +#include #include #include #include @@ -5305,8 +5306,11 @@ void hvm_process(struct pcpu_info *p) assert(p->current); -if(vcpu_set_data_type(p->current, VCPU_DATA_HVM)) -return; +/* HVM_EMUL types show up in all contexts */ +if(ri->evt.sub != 0x4) { +if(vcpu_set_data_type(p->current, VCPU_DATA_HVM)) +return; +} switch ( ri->evt.sub ) { case 2: /* HVM_HANDLER */ @@ -9447,9 +9451,10 @@ static struct tl_assert_mask tl_assert_checks[TOPLEVEL_MAX] = { /* There are a lot of common assumptions for the various processing * routines. Check them all in one place, doing something else if * they don't pass. 
*/ -int toplevel_assert_check(int toplevel, struct pcpu_info *p) +int toplevel_assert_check(int toplevel, struct record_info *ri, struct pcpu_info *p) { struct tl_assert_mask mask; +bool is_hvm_emul = (toplevel == TOPLEVEL_HVM) && (ri->evt.sub == 0x4); mask = tl_assert_checks[toplevel]; @@ -9459,7 +9464,7 @@ int toplevel_assert_check(int toplevel, struct pcpu_info *p) goto fail; } -if( mask.not_idle_domain ) +if( mask.not_idle_domain && !is_hvm_emul) { /* Can't do this check w/o first doing above check */ assert(mask.p_current); @@ -9478,7 +9483,8 @@ int toplevel_assert_check(int toplevel, struct pcpu_info *p) v = p->current; if ( ! (v->data_type == VCPU_DATA_NONE -|| v->data_type == mask.vcpu_data_mode) ) +|| v->data_type == mask.vcpu_data_mode +|| is_hvm_emul) ) { /* This may happen for track_dirty_vram, which causes a SHADOW_WRMAP_BF trace f/ dom0 */ fprintf(warn, "WARNING: Unexpected vcpu data type for d%dv%d on proc %d! Expected %d got %d. Not processing\n", @@ -9525,7 +9531,7 @@ void process_record(struct pcpu_info *p) { return; /* Unify toplevel assertions */ -if ( toplevel_assert_check(toplevel, p) ) +if ( toplevel_assert_check(toplevel, ri, p) ) { switch(toplevel) { case TRC_GEN_MAIN: -- 2.25.1
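The exemption added to toplevel_assert_check() boils down to one extra term in the data-type test. A reduced sketch, assuming a simplified enum and a hypothetical data_type_ok() helper; only the 0x4 sub-class value is taken from the patch.

```c
#include <stdbool.h>

/* Simplified stand-in for xenalyze's per-vcpu data types. */
enum vcpu_data_type { VCPU_DATA_NONE, VCPU_DATA_HVM, VCPU_DATA_PV };

/* From the patch: HVM_EMUL records are those with ri->evt.sub == 0x4. */
#define HVM_EMUL_SUBCLASS 0x4u

/*
 * A record is acceptable if the vcpu has no data type yet, if the type
 * matches what the toplevel class expects, or if it is an HVM_EMUL record,
 * which can show up in any context (idle and PV domains included).
 */
static bool data_type_ok(enum vcpu_data_type cur,
                         enum vcpu_data_type expected,
                         bool toplevel_is_hvm, unsigned int sub)
{
    bool is_hvm_emul = toplevel_is_hvm && sub == HVM_EMUL_SUBCLASS;

    return cur == VCPU_DATA_NONE || cur == expected || is_hvm_emul;
}
```

So an HVM_EMUL record seen on a PV vcpu is processed, while an HVM_HANDLER record (sub-class 2) in the same situation is still rejected with the existing warning.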
Re: [PATCH v4 0/2] Add API for making parts of a MMIO page R/O and use it in XHCI console
On Wed, May 22, 2024 at 05:39:02PM +0200, Marek Marczykowski-Górecki wrote:
> On older systems, XHCI xcap had a layout that no other (interesting) registers
> were placed on the same page as the debug capability, so Linux was fine with
> making the whole page R/O. But at least on Tiger Lake and Alder Lake, Linux
> needs to write to some other registers on the same page too.
>
> Add a generic API for making just parts of an MMIO page R/O and use it to fix
> USB3 console with share=yes or share=hwdom options. More details in commit
> messages.
>
> Marek Marczykowski-Górecki (2):
>   x86/mm: add API for marking only part of a MMIO page read only
>   drivers/char: Use sub-page ro API to make just xhci dbc cap RO

Does any other x86 maintainer feel comfortable ack-ing this series? Jan
already reviewed 2/2 here (but not 1/2 in this version), but also said he
is not comfortable with letting this in without a second maintainer
approval:
https://lore.kernel.org/xen-devel/7655e401-b927-4250-ae63-05361a5ee...@suse.com/

>
>  xen/arch/x86/hvm/emulate.c      | 2 +-
>  xen/arch/x86/hvm/hvm.c          | 4 +-
>  xen/arch/x86/include/asm/mm.h   | 25 +++-
>  xen/arch/x86/mm.c               | 273 +-
>  xen/arch/x86/pv/ro-page-fault.c | 6 +-
>  xen/drivers/char/xhci-dbc.c     | 36 ++--
>  6 files changed, 327 insertions(+), 19 deletions(-)
>
> base-commit: b0082b908391b29b7c4dd5e6c389ebd6481926f8
> --
> git-series 0.9.1

--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
Re: [PATCH v10 02/14] xen: introduce generic non-atomic test_*bit()
On Thu, 2024-05-23 at 14:00 +0100, Julien Grall wrote: > Hi Oleksii, Hi Julien, > > On 17/05/2024 14:54, Oleksii Kurochko wrote: > > diff --git a/xen/arch/arm/arm64/livepatch.c > > b/xen/arch/arm/arm64/livepatch.c > > index df2cebedde..4bc8ed9be5 100644 > > --- a/xen/arch/arm/arm64/livepatch.c > > +++ b/xen/arch/arm/arm64/livepatch.c > > @@ -10,7 +10,6 @@ > > #include > > #include > > > > -#include > > It is a bit unclear how this change is related to the patch. Can you > explain in the commit message? Probably it doesn't need anymore. I will double check and if this change is not needed, I will just drop it in the next patch version. > > > #include > > #include > > #include > > diff --git a/xen/arch/arm/include/asm/bitops.h > > b/xen/arch/arm/include/asm/bitops.h > > index 5104334e48..8e16335e76 100644 > > --- a/xen/arch/arm/include/asm/bitops.h > > +++ b/xen/arch/arm/include/asm/bitops.h > > @@ -22,9 +22,6 @@ > > #define __set_bit(n,p) set_bit(n,p) > > #define __clear_bit(n,p) clear_bit(n,p) > > > > -#define BITOP_BITS_PER_WORD 32 > > -#define BITOP_MASK(nr) (1UL << ((nr) % > > BITOP_BITS_PER_WORD)) > > -#define BITOP_WORD(nr) ((nr) / BITOP_BITS_PER_WORD) > > #define BITS_PER_BYTE 8 > > OOI, any reason BITS_PER_BYTE has not been moved as well? I don't > expect > the value to change across arch. I can move it to generic one header too in the next patch version. > > [...] 
> > > diff --git a/xen/include/xen/bitops.h b/xen/include/xen/bitops.h > > index f14ad0d33a..6eeeff0117 100644 > > --- a/xen/include/xen/bitops.h > > +++ b/xen/include/xen/bitops.h > > @@ -65,10 +65,141 @@ static inline int generic_flsl(unsigned long > > x) > > * scope > > */ > > > > +#define BITOP_BITS_PER_WORD 32 > > +typedef uint32_t bitop_uint_t; > > + > > +#define BITOP_MASK(nr) ((bitop_uint_t)1 << ((nr) % > > BITOP_BITS_PER_WORD)) > > + > > +#define BITOP_WORD(nr) ((nr) / BITOP_BITS_PER_WORD) > > + > > +extern void __bitop_bad_size(void); > > + > > +#define bitop_bad_size(addr) (sizeof(*(addr)) < > > sizeof(bitop_uint_t)) > > + > > /* - Please tidy above here - > > */ > > > > #include > > > > +/** > > + * generic__test_and_set_bit - Set a bit and return its old value > > + * @nr: Bit to set > > + * @addr: Address to count from > > + * > > + * This operation is non-atomic and can be reordered. > > + * If two examples of this operation race, one can appear to > > succeed > > + * but actually fail. You must protect multiple accesses with a > > lock. > > + */ > > Sorry for only mentioning this on v10. I think this comment should be > duplicated (or moved to) on top of test_bit() because this is what > everyone will use. This will avoid the developper to follow the > function > calls and only notice the x86 version which says "This function is > atomic and may not be reordered." and would be wrong for all the > other arch. It makes sense to add this comment on top of test_bit(), but I am curious if it is needed to mention that for x86 arch_test_bit() "is atomic and may not be reordered": * This operation is non-atomic and can be reordered. ( Exception: for * x86 arch_test_bit() is atomic and may not be reordered ) * If two examples of this operation race, one can appear to succeed * but actually fail. You must protect multiple accesses with a lock. 
*/ > > > +static always_inline bool > > +generic__test_and_set_bit(int nr, volatile void *addr) > > +{ > > + bitop_uint_t mask = BITOP_MASK(nr); > > + volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr + > > BITOP_WORD(nr); > > + bitop_uint_t old = *p; > > + > > + *p = old | mask; > > + return (old & mask); > > +} > > + > > +/** > > + * generic__test_and_clear_bit - Clear a bit and return its old > > value > > + * @nr: Bit to clear > > + * @addr: Address to count from > > + * > > + * This operation is non-atomic and can be reordered. > > + * If two examples of this operation race, one can appear to > > succeed > > + * but actually fail. You must protect multiple accesses with a > > lock. > > + */ > > Same applies here and ... > > > +static always_inline bool > > +generic__test_and_clear_bit(int nr, volatile void *addr) > > +{ > > + bitop_uint_t mask = BITOP_MASK(nr); > > + volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr + > > BITOP_WORD(nr); > > + bitop_uint_t old = *p; > > + > > + *p = old & ~mask; > > + return (old & mask); > > +} > > + > > +/* WARNING: non atomic and it can be reordered! */ > > ... here. > > > +static always_inline bool > > +generic__test_and_change_bit(int nr, volatile void *addr) > > +{ > > + bitop_uint_t mask = BITOP_MASK(nr); > > + volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr + > > BITOP_WORD(nr); > > + bitop_uint_t old = *p; > > + > > + *p = old ^ mask; > > + return (old & mask); > > +} > > +/** > > + * generic_test_bit -
Re: [PATCH 4/5] x86/kernel: Move page table macros to new header
On Thu, May 23, 2024 at 03:59:43PM +0200, Thomas Gleixner wrote:
> On Wed, Apr 10 2024 at 15:48, Jason Andryuk wrote:
> > ---
> >  arch/x86/kernel/head_64.S            | 22 ++
> >  arch/x86/kernel/pgtable_64_helpers.h | 28
>
> That's the wrong place as you want to include it from arch/x86/platform.
>
> arch/x86/include/asm/

... and there already is a header waiting:

arch/x86/include/asm/pgtable_64.h

so no need for a new one.

Thx.

--
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH 4/5] x86/kernel: Move page table macros to new header
On Wed, Apr 10 2024 at 15:48, Jason Andryuk wrote:
> ---
>  arch/x86/kernel/head_64.S            | 22 ++
>  arch/x86/kernel/pgtable_64_helpers.h | 28

That's the wrong place as you want to include it from arch/x86/platform.

arch/x86/include/asm/

Thanks,

        tglx
[xen-unstable-smoke test] 186107: regressions - FAIL
flight 186107 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/186107/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-armhf 6 xen-buildfail REGR. vs. 186064 Tests which did not succeed, but are not blocking: test-armhf-armhf-xl 1 build-check(1) blocked n/a test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass version targeted for testing: xen d6a7fd83039af36c28bd0ae2174f12c3888ce993 baseline version: xen ced21fbb2842ac4655048bdee56232974ff9ff9c Last test of basis 186064 2024-05-21 15:04:02 Z1 days Testing same since 186104 2024-05-23 09:00:22 Z0 days2 attempts People who touched revisions under test: Alejandro Vallejo Bobby Eshleman Jan Beulich Julien Grall Oleksandr Andrushchenko Oleksii Kurochko Roger Pau Monné Stewart Hildebrand Volodymyr Babchuk jobs: build-arm64-xsm pass build-amd64 pass build-armhf fail build-amd64-libvirt pass test-armhf-armhf-xl blocked test-arm64-arm64-xl-xsm pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Not pushing. (No revision log; it would be 387 lines long.)
Re: [PATCH v10 03/14] xen/bitops: implement fls{l}() in common logic
Hi, On 22/05/2024 09:15, Jan Beulich wrote: On 22.05.2024 09:37, Oleksii K. wrote: On Tue, 2024-05-21 at 13:18 +0200, Jan Beulich wrote: On 17.05.2024 15:54, Oleksii Kurochko wrote: To avoid the compilation error below, it is necessary to update the places in common/page_alloc.c where flsl() is used, as flsl() now returns unsigned int: ./include/xen/kernel.h:18:21: error: comparison of distinct pointer types lacks a cast [-Werror] 18 | (void) (&_x == &_y); \ | ^~ common/page_alloc.c:1843:34: note: in expansion of macro 'min' 1843 | unsigned int inc_order = min(MAX_ORDER, flsl(e - s) - 1); generic_fls{l} was used instead of __builtin_clz{l}(x) because, if x is 0, the result is undefined. The prototype of the per-architecture fls{l}() functions was changed to return 'unsigned int' to align with the generic implementation of these functions and avoid introducing signed/unsigned mismatches. Signed-off-by: Oleksii Kurochko --- The patch is almost independent from Andrew's patch series ( https://lore.kernel.org/xen-devel/20240313172716.2325427-1-andrew.coop...@citrix.com/T/#t ) except test_fls() function which IMO can be merged as a separate patch after Andrew's patch will be fully ready. If there wasn't this dependency (I don't think it's "almost independent"), I'd be offering R-b with again one nit below. Aren't all changes, except those in xen/common/bitops.c, independent? I could move these changes in xen/common/bitops.c to a separate commit. I think it is safe to commit them ( an introduction of common logic for fls{l}() and tests ) separately since the CI tests have passed. Technically they might be, but contextually there are further conflicts. Just try "patch --dry-run" on top of a plain staging tree. You really need to settle, perhaps consulting Andrew, whether you want to go on top of his change, or ahead of it. I'm not willing to approve a patch that's presented one way but then is (kind of) expected to go in the other way. I agree with what Jan wrote. 
I don't have any strong opinion on which order they should be merged. But, if your series is intended to be merged before Andrew's one then please rebase to vanilla staging. I looked at the rest of the patch and it LGTM. Cheers, -- Julien Grall
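As a side note on the fls{l}() discussion in this thread, the two properties the commit message relies on (a defined result for 0, and an 'unsigned int' return type so that min() sees matching types) can be sketched in plain C. This is not Xen's generic_flsl(); sk_flsl(), sk_inc_order() and SK_MAX_ORDER are hypothetical names and values chosen for the illustration.

```c
/*
 * Find last set: 1-based index of the most significant set bit, 0 for x == 0.
 * A plain loop is used so that the x == 0 case is well defined, unlike
 * __builtin_clzl(0), whose result is undefined.
 */
static unsigned int sk_flsl(unsigned long x)
{
    unsigned int r = 0;

    while (x) {
        r++;
        x >>= 1;
    }
    return r;
}

#define SK_MAX_ORDER 20U /* hypothetical limit, not Xen's MAX_ORDER */

/*
 * Mirrors the shape of min(MAX_ORDER, flsl(e - s) - 1) from the error
 * message: both operands are unsigned int, so no cast is needed.
 * The caller is assumed to guarantee e > s.
 */
static unsigned int sk_inc_order(unsigned long s, unsigned long e)
{
    unsigned int order = sk_flsl(e - s) - 1;

    return order < SK_MAX_ORDER ? order : SK_MAX_ORDER;
}
```

The point of returning 'unsigned int' everywhere is visible in sk_inc_order(): with an 'int' returning flsl() the two sides of the comparison would have different types, which is exactly what the type-checking min() macro in the quoted error rejects.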
Re: [XEN PATCH v2 07/15] x86: guard cpu_has_{svm/vmx} macros with CONFIG_{SVM/VMX}
16.05.24 14:12, Jan Beulich: On 15.05.2024 11:12, Sergiy Kibrik wrote: --- a/xen/arch/x86/include/asm/cpufeature.h +++ b/xen/arch/x86/include/asm/cpufeature.h @@ -81,7 +81,8 @@ static inline bool boot_cpu_has(unsigned int feat) #define cpu_has_sse3boot_cpu_has(X86_FEATURE_SSE3) #define cpu_has_pclmulqdq boot_cpu_has(X86_FEATURE_PCLMULQDQ) #define cpu_has_monitor boot_cpu_has(X86_FEATURE_MONITOR) -#define cpu_has_vmx boot_cpu_has(X86_FEATURE_VMX) +#define cpu_has_vmx ( IS_ENABLED(CONFIG_VMX) && \ + boot_cpu_has(X86_FEATURE_VMX)) #define cpu_has_eistboot_cpu_has(X86_FEATURE_EIST) #define cpu_has_ssse3 boot_cpu_has(X86_FEATURE_SSSE3) #define cpu_has_fma boot_cpu_has(X86_FEATURE_FMA) @@ -109,7 +110,8 @@ static inline bool boot_cpu_has(unsigned int feat) /* CPUID level 0x8001.ecx */ #define cpu_has_cmp_legacy boot_cpu_has(X86_FEATURE_CMP_LEGACY) -#define cpu_has_svm boot_cpu_has(X86_FEATURE_SVM) +#define cpu_has_svm ( IS_ENABLED(CONFIG_SVM) && \ + boot_cpu_has(X86_FEATURE_SVM)) #define cpu_has_sse4a boot_cpu_has(X86_FEATURE_SSE4A) #define cpu_has_xop boot_cpu_has(X86_FEATURE_XOP) #define cpu_has_skinit boot_cpu_has(X86_FEATURE_SKINIT) Hmm, leaving aside the style issue (stray blanks after opening parentheses, and as a result one-off indentation on the wrapped lines) I'm not really certain we can do this. The description goes into detail why we would want this, but it doesn't cover at all why it is safe for all present (and ideally also future) uses. I wouldn't be surprised if we had VMX/SVM checks just to derive further knowledge from that, without them being directly related to the use of VMX/SVM. Take a look at calculate_hvm_max_policy(), for example. While it looks to be okay there, it may give you an idea of what I mean. Things might become better separated if instead for such checks we used host and raw CPU policies instead of cpuinfo_x86.x86_capability[]. But that's still pretty far out, I'm afraid. 
I've followed a suggestion you made for a patch in the previous series: https://lore.kernel.org/xen-devel/8fbd604e-5e5d-410c-880f-2ad257bbe...@suse.com/ yet if this approach can potentially be unsafe (I'm not completely sure it's safe), should we instead fall back to the way it was done in the v1 series? I.e. guard vmx/svm-specific calls where needed, like in these 3 patches: 1) https://lore.kernel.org/xen-devel/20240416063328.3469386-1-sergiy_kib...@epam.com/ 2) https://lore.kernel.org/xen-devel/20240416063740.3469592-1-sergiy_kib...@epam.com/ 3) https://lore.kernel.org/xen-devel/20240416063947.3469718-1-sergiy_kib...@epam.com/ -Sergiy
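The concern raised in this exchange can be illustrated with a small standalone sketch. Everything here is an assumption made for the example: SK_CONFIG_VMX stands in for IS_ENABLED(CONFIG_VMX), sk_boot_cpu_has_vmx() stands in for boot_cpu_has(X86_FEATURE_VMX), and the two callers are invented. It is not how Xen's macros are actually implemented.

```c
#include <stdbool.h>

/* Hypothetical build-time switch: 0 means the VMX code is compiled out. */
#define SK_CONFIG_VMX 0

/* Pretend the hardware advertises VMX regardless of the build config. */
static bool sk_boot_cpu_has_vmx(void)
{
    return true;
}

/* The guarded form: "hardware has VMX AND this build can use it". */
#define sk_cpu_has_vmx (SK_CONFIG_VMX && sk_boot_cpu_has_vmx())

/* A caller that really means "can we run VMX guests": the guard is right. */
static bool sk_can_start_vmx_guest(void)
{
    return sk_cpu_has_vmx;
}

/*
 * A caller that only wants to derive other knowledge from the hardware
 * bit: with the guard it now silently sees "false" on VMX hardware.
 */
static bool sk_hw_implies_other_feature(void)
{
    return sk_cpu_has_vmx;
}
```

The second caller is the kind of use that would need auditing before changing the macro globally: the hardware bit is set, but the guarded macro reports it as absent whenever the support is compiled out.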
Re: [PATCH v10 02/14] xen: introduce generic non-atomic test_*bit()
Hi Oleksii, On 17/05/2024 14:54, Oleksii Kurochko wrote: diff --git a/xen/arch/arm/arm64/livepatch.c b/xen/arch/arm/arm64/livepatch.c index df2cebedde..4bc8ed9be5 100644 --- a/xen/arch/arm/arm64/livepatch.c +++ b/xen/arch/arm/arm64/livepatch.c @@ -10,7 +10,6 @@ #include #include -#include It is a bit unclear how this change is related to the patch. Can you explain in the commit message? #include #include #include diff --git a/xen/arch/arm/include/asm/bitops.h b/xen/arch/arm/include/asm/bitops.h index 5104334e48..8e16335e76 100644 --- a/xen/arch/arm/include/asm/bitops.h +++ b/xen/arch/arm/include/asm/bitops.h @@ -22,9 +22,6 @@ #define __set_bit(n,p)set_bit(n,p) #define __clear_bit(n,p) clear_bit(n,p) -#define BITOP_BITS_PER_WORD 32 -#define BITOP_MASK(nr) (1UL << ((nr) % BITOP_BITS_PER_WORD)) -#define BITOP_WORD(nr) ((nr) / BITOP_BITS_PER_WORD) #define BITS_PER_BYTE 8 OOI, any reason BITS_PER_BYTE has not been moved as well? I don't expect the value to change across arch. [...] diff --git a/xen/include/xen/bitops.h b/xen/include/xen/bitops.h index f14ad0d33a..6eeeff0117 100644 --- a/xen/include/xen/bitops.h +++ b/xen/include/xen/bitops.h @@ -65,10 +65,141 @@ static inline int generic_flsl(unsigned long x) * scope */ +#define BITOP_BITS_PER_WORD 32 +typedef uint32_t bitop_uint_t; + +#define BITOP_MASK(nr) ((bitop_uint_t)1 << ((nr) % BITOP_BITS_PER_WORD)) + +#define BITOP_WORD(nr) ((nr) / BITOP_BITS_PER_WORD) + +extern void __bitop_bad_size(void); + +#define bitop_bad_size(addr) (sizeof(*(addr)) < sizeof(bitop_uint_t)) + /* - Please tidy above here - */ #include +/** + * generic__test_and_set_bit - Set a bit and return its old value + * @nr: Bit to set + * @addr: Address to count from + * + * This operation is non-atomic and can be reordered. + * If two examples of this operation race, one can appear to succeed + * but actually fail. You must protect multiple accesses with a lock. + */ Sorry for only mentioning this on v10. 
I think this comment should be duplicated (or moved to) on top of test_bit() because this is what everyone will use. This will avoid the developper to follow the function calls and only notice the x86 version which says "This function is atomic and may not be reordered." and would be wrong for all the other arch. +static always_inline bool +generic__test_and_set_bit(int nr, volatile void *addr) +{ +bitop_uint_t mask = BITOP_MASK(nr); +volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr + BITOP_WORD(nr); +bitop_uint_t old = *p; + +*p = old | mask; +return (old & mask); +} + +/** + * generic__test_and_clear_bit - Clear a bit and return its old value + * @nr: Bit to clear + * @addr: Address to count from + * + * This operation is non-atomic and can be reordered. + * If two examples of this operation race, one can appear to succeed + * but actually fail. You must protect multiple accesses with a lock. + */ Same applies here and ... +static always_inline bool +generic__test_and_clear_bit(int nr, volatile void *addr) +{ +bitop_uint_t mask = BITOP_MASK(nr); +volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr + BITOP_WORD(nr); +bitop_uint_t old = *p; + +*p = old & ~mask; +return (old & mask); +} + +/* WARNING: non atomic and it can be reordered! */ ... here. 
+static always_inline bool +generic__test_and_change_bit(int nr, volatile void *addr) +{ +bitop_uint_t mask = BITOP_MASK(nr); +volatile bitop_uint_t *p = (volatile bitop_uint_t *)addr + BITOP_WORD(nr); +bitop_uint_t old = *p; + +*p = old ^ mask; +return (old & mask); +} +/** + * generic_test_bit - Determine whether a bit is set + * @nr: bit number to test + * @addr: Address to start counting from + */ +static always_inline bool generic_test_bit(int nr, const volatile void *addr) +{ +bitop_uint_t mask = BITOP_MASK(nr); +const volatile bitop_uint_t *p = +(const volatile bitop_uint_t *)addr + BITOP_WORD(nr); + +return (*p & mask); +} + +static always_inline bool +__test_and_set_bit(int nr, volatile void *addr) +{ +#ifndef arch__test_and_set_bit +#define arch__test_and_set_bit generic__test_and_set_bit +#endif + +return arch__test_and_set_bit(nr, addr); +} NIT: It is a bit too late to change this one. But I have to admit, I don't understand the purpose of the static inline when you could have simply call... +#define __test_and_set_bit(nr, addr) ({ \ +if ( bitop_bad_size(addr) ) __bitop_bad_size(); \ +__test_and_set_bit(nr, addr); \ ... __arch__test_and_set_bit here. The only two reasons I am not providing an ack is the: * Explanation for the removal of asm/bitops.h in livepatch.c * The placement of the comments There are not too important for me. Cheers, -- Julien Grall
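To make the "non-atomic and can be reordered" wording discussed above concrete, here is a minimal standalone sketch modelled on the generic helpers in the quoted patch. The sk_ prefixed names are local to this example and are not the identifiers Xen uses; the 32-bit word size follows the BITOP_BITS_PER_WORD definition shown in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_BITS_PER_WORD 32
typedef uint32_t sk_uint_t;

#define SK_MASK(nr) ((sk_uint_t)1 << ((nr) % SK_BITS_PER_WORD))
#define SK_WORD(nr) ((nr) / SK_BITS_PER_WORD)

/*
 * Set bit nr and return its previous value. The load, the OR and the
 * store are three separate steps, so two racing callers can both read
 * the old word and one update can be lost: callers need a lock.
 */
static bool sk_test_and_set_bit(int nr, volatile void *addr)
{
    sk_uint_t mask = SK_MASK(nr);
    volatile sk_uint_t *p = (volatile sk_uint_t *)addr + SK_WORD(nr);
    sk_uint_t old = *p;

    *p = old | mask;
    return (old & mask) != 0;
}

/* Report whether bit nr is currently set. */
static bool sk_test_bit(int nr, const volatile void *addr)
{
    const volatile sk_uint_t *p =
        (const volatile sk_uint_t *)addr + SK_WORD(nr);

    return (*p & SK_MASK(nr)) != 0;
}
```

Used on a two-word bitmap, setting bit 33 touches bit 1 of the second word; the first call returns false (the bit was clear) and a second call returns true.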
Re: [for-4.19] Re: [XEN PATCH v3] arm/mem_access: add conditional build of mem_access.c
Hi Oleksii, On 23/05/2024 09:04, Oleksii K. wrote: On Wed, 2024-05-22 at 21:50 +0100, Julien Grall wrote: Hi, Adding Oleksii as the release manager. On 22/05/2024 19:27, Tamas K Lengyel wrote: On Fri, May 10, 2024 at 8:32 AM Alessandro Zucchelli wrote: In order to comply with MISRA C:2012 Rule 8.4 for ARM the following changes are done: revert preprocessor conditional changes to xen/mem_access.h which had it build unconditionally, add conditional build for xen/mem_access.c as well and provide stubs in asm/mem_access.h for the users of this header. Signed-off-by: Alessandro Zucchelli Acked-by: Tamas K Lengyel Oleksii, would you be happy if this patch is committed for 4.19? Sure: Release-acked-by: Oleksii Kurochko Thanks. It is now committed. BTW, do you want to release-ack every bug fix until the hard code freeze? Or would you be fine to leave the decision to the maintainers? I would prefer to leave the decision to the maintainers. Ok. I will keep it in mind for the bug fixes until the hard code freeze. Cheers, -- Julien Grall
Re: [PATCH 5/5] x86/pvh: Add 64bit relocation page tables
On 10.04.24 21:48, Jason Andryuk wrote: The PVH entry point is 32bit. For a 64bit kernel, the entry point must switch to 64bit mode, which requires a set of page tables. In the past, PVH used init_top_pgt. This works fine when the kernel is loaded at LOAD_PHYSICAL_ADDR, as the page tables are prebuilt for this address. If the kernel is loaded at a different address, they need to be adjusted. __startup_64() adjusts the prebuilt page tables for the physical load address, but it is 64bit code. The 32bit PVH entry code can't call it to adjust the page tables, so it can't readily be re-used. 64bit PVH entry needs page tables set up for identity map, the kernel high map and the direct map. pvh_start_xen() enters identity mapped. Inside xen_prepare_pvh(), it jumps through a pv_ops function pointer into the highmap. The direct map is used for __va() on the initramfs and other guest physical addresses. Add a dedicated set of prebuild page tables for PVH entry. They are adjusted in assembly before loading. Add XEN_ELFNOTE_PHYS32_RELOC to indicate support for relocation along with the kernel's loading constraints. The maximum load address, KERNEL_IMAGE_SIZE - 1, is determined by a single pvh_level2_ident_pgt page. It could be larger with more pages. Signed-off-by: Jason Andryuk --- Instead of adding 5 pages of prebuilt page tables, they could be contructed dynamically in the .bss area. They are then only used for PVH entry and until transitioning to init_top_pgt. The .bss is later cleared. It's safer to add the dedicated pages, so that is done here. 
--- arch/x86/platform/pvh/head.S | 105 ++- 1 file changed, 104 insertions(+), 1 deletion(-) diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S index c08d08d8cc92..4af3cfbcf2f8 100644 --- a/arch/x86/platform/pvh/head.S +++ b/arch/x86/platform/pvh/head.S @@ -21,6 +21,8 @@ #include #include +#include "../kernel/pgtable_64_helpers.h" + __HEAD /* @@ -102,8 +104,47 @@ SYM_CODE_START_LOCAL(pvh_start_xen) btsl $_EFER_LME, %eax wrmsr + mov %ebp, %ebx + subl $LOAD_PHYSICAL_ADDR, %ebx /* offset */ + jz .Lpagetable_done + + /* Fixup page-tables for relocation. */ + leal rva(pvh_init_top_pgt)(%ebp), %edi + movl $512, %ecx Please use PTRS_PER_PGD instead of the literal 512. Similar issue below. +2: + testl $_PAGE_PRESENT, 0x00(%edi) + jz 1f + addl %ebx, 0x00(%edi) +1: + addl $8, %edi + decl %ecx + jnz 2b + + /* L3 ident has a single entry. */ + leal rva(pvh_level3_ident_pgt)(%ebp), %edi + addl %ebx, 0x00(%edi) + + leal rva(pvh_level3_kernel_pgt)(%ebp), %edi + addl %ebx, (4096 - 16)(%edi) + addl %ebx, (4096 - 8)(%edi) PAGE_SIZE instead of 4096, please. + + /* pvh_level2_ident_pgt is fine - large pages */ + + /* pvh_level2_kernel_pgt needs adjustment - large pages */ + leal rva(pvh_level2_kernel_pgt)(%ebp), %edi + movl $512, %ecx +2: + testl $_PAGE_PRESENT, 0x00(%edi) + jz 1f + addl %ebx, 0x00(%edi) +1: + addl $8, %edi + decl %ecx + jnz 2b + +.Lpagetable_done: /* Enable pre-constructed page tables. */ - leal rva(init_top_pgt)(%ebp), %eax + leal rva(pvh_init_top_pgt)(%ebp), %eax mov %eax, %cr3 mov $(X86_CR0_PG | X86_CR0_PE), %eax mov %eax, %cr0 @@ -197,5 +238,67 @@ SYM_DATA_START_LOCAL(early_stack) .fill BOOT_STACK_SIZE, 1, 0 SYM_DATA_END_LABEL(early_stack, SYM_L_LOCAL, early_stack_end) +#ifdef CONFIG_X86_64 +/* + * Xen PVH needs a set of identity mapped and kernel high mapping + * page tables. pvh_start_xen starts running on the identity mapped + * page tables, but xen_prepare_pvh calls into the high mapping. 
+ * These page tables need to be relocatable and are only used until + * startup_64 transitions to init_top_pgt. + */ +SYM_DATA_START_PAGE_ALIGNED(pvh_init_top_pgt) + .quad pvh_level3_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE_NOENC + .orgpvh_init_top_pgt + L4_PAGE_OFFSET*8, 0 Please add a space before and after the '*'. + .quad pvh_level3_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE_NOENC + .orgpvh_init_top_pgt + L4_START_KERNEL*8, 0 + /* (2^48-(2*1024*1024*1024))/(2^39) = 511 */ + .quad pvh_level3_kernel_pgt - __START_KERNEL_map + _PAGE_TABLE_NOENC +SYM_DATA_END(pvh_init_top_pgt) + +SYM_DATA_START_PAGE_ALIGNED(pvh_level3_ident_pgt) + .quad pvh_level2_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE_NOENC + .fill 511, 8, 0 +SYM_DATA_END(pvh_level3_ident_pgt) +SYM_DATA_START_PAGE_ALIGNED(pvh_level2_ident_pgt) + /* +* Since I easily can, map the first 1G. +* Don't set NX because code runs from these pages. +* +* Note: This sets _PAGE_GLOBAL despite whether +* the CPU supports it or it is enabled. But, +* the
Re: [XEN PATCH] x86/iommu: Conditionally compile platform-specific union entries
On 23/05/2024 at 11:52, Roger Pau Monné wrote:
> The #ifdef and #endif preprocessor directives shouldn't be indented.
>
> Would you mind adding /* CONFIG_{AMD,INTEL}_IOMMU */ comments in the
> #endif directives?
>

Sure, will change it for v2.

> I wonder if we could move the definitions of those structures to the
> vendor specific headers, but that's more convoluted, and would require
> including the iommu headers in pci.h

Do you mean moving the vtd/amd union entries to separate structures (e.g.
vtd_arch_iommu) and putting them into another file? (I don't see any
vendor-specific headers for this; perhaps create new ones?)

>
> Thanks, Roger.

Teddy

Teddy Astie | Vates XCP-ng Intern
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
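The two style requests in this review (unindented directives, and a comment on each #endif naming its condition) might look roughly like the sketch below. The struct, its field names and the SK_CONFIG_* switches are all hypothetical placeholders for the illustration; they are not the definitions from the patch under review.

```c
/* Hypothetical stand-ins for the Kconfig symbols; only Intel is enabled. */
#define SK_CONFIG_INTEL_IOMMU 1
/* #define SK_CONFIG_AMD_IOMMU 1 */

struct sk_arch_iommu {
    int common_state;                 /* shared by every vendor */

    union {
/* Directives start in column 0 even though the members are nested. */
#ifdef SK_CONFIG_INTEL_IOMMU
        struct {
            unsigned long pgd_maddr;  /* illustrative field name */
        } vtd;
#endif /* SK_CONFIG_INTEL_IOMMU */
#ifdef SK_CONFIG_AMD_IOMMU
        struct {
            void *root_table;         /* illustrative field name */
        } amd;
#endif /* SK_CONFIG_AMD_IOMMU */
    } u;
};

#ifdef SK_CONFIG_INTEL_IOMMU
/* Accessor that only exists when the Intel entry is compiled in. */
static unsigned long sk_vtd_root(const struct sk_arch_iommu *cfg)
{
    return cfg->u.vtd.pgd_maddr;
}
#endif /* SK_CONFIG_INTEL_IOMMU */
```

One design consequence worth noting: code that touches a vendor entry has to sit under the same condition as the entry itself, which is why the accessor above is guarded too.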
Re: [PATCH 4/5] x86/kernel: Move page table macros to new header
On 10.04.24 21:48, Jason Andryuk wrote: The PVH entry point will need an additional set of prebuild page tables. Move the macros and defines to a new header so they can be re-used. Signed-off-by: Jason Andryuk With the one nit below addressed: Reviewed-by: Juergen Gross ... diff --git a/arch/x86/kernel/pgtable_64_helpers.h b/arch/x86/kernel/pgtable_64_helpers.h new file mode 100644 index ..0ae87d768ce2 --- /dev/null +++ b/arch/x86/kernel/pgtable_64_helpers.h @@ -0,0 +1,28 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __PGTABLES_64_H__ +#define __PGTABLES_64_H__ + +#ifdef __ASSEMBLY__ + +#define l4_index(x)(((x) >> 39) & 511) +#define pud_index(x) (((x) >> PUD_SHIFT) & (PTRS_PER_PUD-1)) Please fix the minor style issue in this line by s/-/ - / Juergen
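For reference, the two index macros quoted above can be evaluated in ordinary C to see which page table slots the kernel high mapping lands in. This is a sketch, assuming 4-level paging with a 39-bit top-level shift, a 30-bit PUD shift and 512 entries per table; the sk_ names are local to the example, and SK_START_KERNEL_MAP is the usual x86-64 value of __START_KERNEL_map written out by hand.

```c
#include <stdint.h>

#define SK_PGDIR_SHIFT  39   /* bits covered below one top-level entry */
#define SK_PUD_SHIFT    30   /* bits covered below one PUD entry (1 GiB) */
#define SK_PTRS_PER_PUD 512

/* Same arithmetic as the macros in the patch, with the requested spacing. */
#define sk_l4_index(x)  (((x) >> SK_PGDIR_SHIFT) & 511)
#define sk_pud_index(x) (((x) >> SK_PUD_SHIFT) & (SK_PTRS_PER_PUD - 1))

/* Start of the kernel high mapping: the top 2 GiB of the address space. */
#define SK_START_KERNEL_MAP UINT64_C(0xffffffff80000000)

/* Top-level slot used by the kernel high mapping. */
static unsigned int sk_kernel_l4_slot(void)
{
    return (unsigned int)sk_l4_index(SK_START_KERNEL_MAP);
}

/* PUD slot used by the kernel high mapping. */
static unsigned int sk_kernel_pud_slot(void)
{
    return (unsigned int)sk_pud_index(SK_START_KERNEL_MAP);
}
```

The high mapping therefore uses the last top-level slot (511) and the second-to-last PUD slot (510), which is why the prebuilt tables in this series only need entries near the end of those pages.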
Re: [PATCH for-4.19 v3 2/3] xen: enable altp2m at create domain domctl
On Fri, May 17, 2024 at 03:33:51PM +0200, Roger Pau Monne wrote: > Enabling it using an HVM param is fragile, and complicates the logic when > deciding whether options that interact with altp2m can also be enabled. > > Leave the HVM param value for consumption by the guest, but prevent it from > being set. Enabling is now done using and additional altp2m specific field in > xen_domctl_createdomain. > > Note that albeit only currently implemented in x86, altp2m could be > implemented > in other architectures, hence why the field is added to > xen_domctl_createdomain > instead of xen_arch_domainconfig. > > Signed-off-by: Roger Pau Monné > --- > Changes since v2: > - Introduce a new altp2m field in xen_domctl_createdomain. > > Changes since v1: > - New in this version. > --- > tools/libs/light/libxl_create.c | 23 ++- > tools/libs/light/libxl_x86.c| 26 -- > tools/ocaml/libs/xc/xenctrl_stubs.c | 2 +- > xen/arch/arm/domain.c | 6 ++ Could I get an Ack from one of the Arm maintainers for the trivial Arm change? Thanks, Roger.
[PATCH 7/7] x86/defns: Clean up X86_{XCR0,XSS}_* constants
With the exception of one case in read_bndcfgu() which can use ilog2(), the *_POS defines are unused. X86_XCR0_X87 is the name used by both the SDM and APM, rather than X86_XCR0_FP. No functional change. Signed-off-by: Andrew Cooper --- CC: Jan Beulich CC: Roger Pau Monné v3: * New --- xen/arch/x86/i387.c | 2 +- xen/arch/x86/include/asm/x86-defns.h | 32 ++-- xen/arch/x86/include/asm/xstate.h| 4 ++-- xen/arch/x86/xstate.c| 18 4 files changed, 23 insertions(+), 33 deletions(-) diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c index 7a4297cc921e..fcdee10a6e69 100644 --- a/xen/arch/x86/i387.c +++ b/xen/arch/x86/i387.c @@ -369,7 +369,7 @@ void vcpu_setup_fpu(struct vcpu *v, struct xsave_struct *xsave_area, { v->arch.xsave_area->xsave_hdr.xstate_bv &= ~XSTATE_FP_SSE; if ( fcw_default != FCW_DEFAULT ) -v->arch.xsave_area->xsave_hdr.xstate_bv |= X86_XCR0_FP; +v->arch.xsave_area->xsave_hdr.xstate_bv |= X86_XCR0_X87; } } diff --git a/xen/arch/x86/include/asm/x86-defns.h b/xen/arch/x86/include/asm/x86-defns.h index d7602ab225c4..3bcdbaccd3aa 100644 --- a/xen/arch/x86/include/asm/x86-defns.h +++ b/xen/arch/x86/include/asm/x86-defns.h @@ -79,25 +79,16 @@ /* * XSTATE component flags in XCR0 | MSR_XSS */ -#define X86_XCR0_FP_POS 0 -#define X86_XCR0_FP (1ULL << X86_XCR0_FP_POS) -#define X86_XCR0_SSE_POS 1 -#define X86_XCR0_SSE (1ULL << X86_XCR0_SSE_POS) -#define X86_XCR0_YMM_POS 2 -#define X86_XCR0_YMM (1ULL << X86_XCR0_YMM_POS) -#define X86_XCR0_BNDREGS_POS 3 -#define X86_XCR0_BNDREGS (1ULL << X86_XCR0_BNDREGS_POS) -#define X86_XCR0_BNDCSR_POS 4 -#define X86_XCR0_BNDCSR (1ULL << X86_XCR0_BNDCSR_POS) -#define X86_XCR0_OPMASK_POS 5 -#define X86_XCR0_OPMASK (1ULL << X86_XCR0_OPMASK_POS) -#define X86_XCR0_ZMM_POS 6 -#define X86_XCR0_ZMM (1ULL << X86_XCR0_ZMM_POS) -#define X86_XCR0_HI_ZMM_POS 7 -#define X86_XCR0_HI_ZMM (1ULL << X86_XCR0_HI_ZMM_POS) +#define X86_XCR0_X87 (_AC(1, ULL) << 0) +#define X86_XCR0_SSE (_AC(1, ULL) << 1) +#define X86_XCR0_YMM (_AC(1, ULL) << 2) +#define 
X86_XCR0_BNDREGS (_AC(1, ULL) << 3) +#define X86_XCR0_BNDCSR (_AC(1, ULL) << 4) +#define X86_XCR0_OPMASK (_AC(1, ULL) << 5) +#define X86_XCR0_ZMM (_AC(1, ULL) << 6) +#define X86_XCR0_HI_ZMM (_AC(1, ULL) << 7) #define X86_XSS_PROC_TRACE(_AC(1, ULL) << 8) -#define X86_XCR0_PKRU_POS 9 -#define X86_XCR0_PKRU (1ULL << X86_XCR0_PKRU_POS) +#define X86_XCR0_PKRU (_AC(1, ULL) << 9) #define X86_XSS_PASID (_AC(1, ULL) << 10) #define X86_XSS_CET_U (_AC(1, ULL) << 11) #define X86_XSS_CET_S (_AC(1, ULL) << 12) @@ -107,11 +98,10 @@ #define X86_XSS_HWP (_AC(1, ULL) << 16) #define X86_XCR0_TILE_CFG (_AC(1, ULL) << 17) #define X86_XCR0_TILE_DATA(_AC(1, ULL) << 18) -#define X86_XCR0_LWP_POS 62 -#define X86_XCR0_LWP (1ULL << X86_XCR0_LWP_POS) +#define X86_XCR0_LWP (_AC(1, ULL) << 62) #define X86_XCR0_STATES \ -(X86_XCR0_FP | X86_XCR0_SSE | X86_XCR0_YMM | X86_XCR0_BNDREGS | \ +(X86_XCR0_X87 | X86_XCR0_SSE | X86_XCR0_YMM | X86_XCR0_BNDREGS |\ X86_XCR0_BNDCSR | X86_XCR0_OPMASK | X86_XCR0_ZMM | \ X86_XCR0_HI_ZMM | X86_XCR0_PKRU | X86_XCR0_TILE_CFG | \ X86_XCR0_TILE_DATA | \ diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h index da1d89d2f416..f4a8e5f814a0 100644 --- a/xen/arch/x86/include/asm/xstate.h +++ b/xen/arch/x86/include/asm/xstate.h @@ -29,8 +29,8 @@ extern uint32_t mxcsr_mask; #define XSAVE_HDR_OFFSET FXSAVE_SIZE #define XSTATE_AREA_MIN_SIZE (FXSAVE_SIZE + XSAVE_HDR_SIZE) -#define XSTATE_FP_SSE (X86_XCR0_FP | X86_XCR0_SSE) -#define XCNTXT_MASK(X86_XCR0_FP | X86_XCR0_SSE | X86_XCR0_YMM | \ +#define XSTATE_FP_SSE (X86_XCR0_X87 | X86_XCR0_SSE) +#define XCNTXT_MASK(X86_XCR0_X87 | X86_XCR0_SSE | X86_XCR0_YMM | \ X86_XCR0_OPMASK | X86_XCR0_ZMM | X86_XCR0_HI_ZMM | \ XSTATE_NONLAZY) diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c index 7b7f2dcaf651..0ed2541665b3 100644 --- a/xen/arch/x86/xstate.c +++ b/xen/arch/x86/xstate.c @@ -313,7 +313,7 @@ void xsave(struct vcpu *v, uint64_t mask) "=m" (*ptr), \ "a" (lmask), "d" (hmask), "D" (ptr)) -if ( 
fip_width == 8 || !(mask & X86_XCR0_FP) ) +if ( fip_width == 8 || !(mask & X86_XCR0_X87) ) {
[PATCH 3/7] x86/boot: Collect the Raw CPU Policy earlier on boot
This is a tangle, but it's a small step in the right direction. xstate_init() is shortly going to want data from the Raw policy. calculate_raw_cpu_policy() is sufficiently separate from the other policies to be safe to do. No functional change. Signed-off-by: Andrew Cooper --- CC: Jan Beulich CC: Roger Pau Monné This is necessary for the forthcoming xstate_{un,}compressed_size() to perform boot-time sanity checks on state components which aren't fully enabled yet. I decided that doing this was better than extending the xstate_{offsets,sizes}[] logic that we're intending to retire in due course. v3: * New. --- xen/arch/x86/cpu-policy.c | 1 - xen/arch/x86/setup.c | 4 +++- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c index b96f4ee55cc4..5b66f002df05 100644 --- a/xen/arch/x86/cpu-policy.c +++ b/xen/arch/x86/cpu-policy.c @@ -845,7 +845,6 @@ static void __init calculate_hvm_def_policy(void) void __init init_guest_cpu_policies(void) { -calculate_raw_cpu_policy(); calculate_host_policy(); if ( IS_ENABLED(CONFIG_PV) ) diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c index b50c9c84af6d..8850e5637a98 100644 --- a/xen/arch/x86/setup.c +++ b/xen/arch/x86/setup.c @@ -1888,7 +1888,9 @@ void asmlinkage __init noreturn __start_xen(unsigned long mbi_p) tsx_init(); /* Needs microcode. May change HLE/RTM feature bits. */ -identify_cpu(_cpu_data); +calculate_raw_cpu_policy(); /* Needs microcode. No other dependenices. */ + +identify_cpu(_cpu_data); /* Needs microcode and raw policy. */ set_in_cr4(X86_CR4_OSFXSR | X86_CR4_OSXMMEXCPT); -- 2.30.2
[PATCH 5/7] x86/cpu-policy: Simplify recalculate_xstate()
Make use of xstate_uncompressed_size() helper rather than maintaining the running calculation while accumulating feature components. The rest of the CPUID data can come direct from the raw cpu policy. All per-component data form an ABI through the behaviour of the X{SAVE,RSTOR}* instructions. Use for_each_set_bit() rather than opencoding a slightly awkward version of it. Mask the attributes in ecx down based on the visible features. This isn't actually necessary for any components or attributes defined at the time of writing (up to AMX), but is added out of an abundance of caution. Signed-off-by: Andrew Cooper Reviewed-by: Jan Beulich --- CC: Jan Beulich CC: Roger Pau Monné v2: * Tie ALIGN64 to xsavec rather than xsaves. v3: * Tweak commit message. --- xen/arch/x86/cpu-policy.c | 55 +++ xen/arch/x86/include/asm/xstate.h | 1 + 2 files changed, 21 insertions(+), 35 deletions(-) diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c index 5b66f002df05..304dc20cfab8 100644 --- a/xen/arch/x86/cpu-policy.c +++ b/xen/arch/x86/cpu-policy.c @@ -193,8 +193,7 @@ static void sanitise_featureset(uint32_t *fs) static void recalculate_xstate(struct cpu_policy *p) { uint64_t xstates = XSTATE_FP_SSE; -uint32_t xstate_size = XSTATE_AREA_MIN_SIZE; -unsigned int i, Da1 = p->xstate.Da1; +unsigned int i, ecx_mask = 0, Da1 = p->xstate.Da1; /* * The Da1 leaf is the only piece of information preserved in the common @@ -206,61 +205,47 @@ static void recalculate_xstate(struct cpu_policy *p) return; if ( p->basic.avx ) -{ xstates |= X86_XCR0_YMM; -xstate_size = max(xstate_size, - xstate_offsets[X86_XCR0_YMM_POS] + - xstate_sizes[X86_XCR0_YMM_POS]); -} if ( p->feat.mpx ) -{ xstates |= X86_XCR0_BNDREGS | X86_XCR0_BNDCSR; -xstate_size = max(xstate_size, - xstate_offsets[X86_XCR0_BNDCSR_POS] + - xstate_sizes[X86_XCR0_BNDCSR_POS]); -} if ( p->feat.avx512f ) -{ xstates |= X86_XCR0_OPMASK | X86_XCR0_ZMM | X86_XCR0_HI_ZMM; -xstate_size = max(xstate_size, - 
xstate_offsets[X86_XCR0_HI_ZMM_POS] + - xstate_sizes[X86_XCR0_HI_ZMM_POS]); -} if ( p->feat.pku ) -{ xstates |= X86_XCR0_PKRU; -xstate_size = max(xstate_size, - xstate_offsets[X86_XCR0_PKRU_POS] + - xstate_sizes[X86_XCR0_PKRU_POS]); -} -p->xstate.max_size = xstate_size; +/* Subleaf 0 */ +p->xstate.max_size = +xstate_uncompressed_size(xstates & ~XSTATE_XSAVES_ONLY); p->xstate.xcr0_low = xstates & ~XSTATE_XSAVES_ONLY; p->xstate.xcr0_high = (xstates & ~XSTATE_XSAVES_ONLY) >> 32; +/* Subleaf 1 */ p->xstate.Da1 = Da1; +if ( p->xstate.xsavec ) +ecx_mask |= XSTATE_ALIGN64; + if ( p->xstate.xsaves ) { +ecx_mask |= XSTATE_XSS; p->xstate.xss_low = xstates & XSTATE_XSAVES_ONLY; p->xstate.xss_high = (xstates & XSTATE_XSAVES_ONLY) >> 32; } -else -xstates &= ~XSTATE_XSAVES_ONLY; -for ( i = 2; i < min(63UL, ARRAY_SIZE(p->xstate.comp)); ++i ) +/* Subleafs 2+ */ +xstates &= ~XSTATE_FP_SSE; +BUILD_BUG_ON(ARRAY_SIZE(p->xstate.comp) < 63); +for_each_set_bit ( i, , 63 ) { -uint64_t curr_xstate = 1UL << i; - -if ( !(xstates & curr_xstate) ) -continue; - -p->xstate.comp[i].size = xstate_sizes[i]; -p->xstate.comp[i].offset = xstate_offsets[i]; -p->xstate.comp[i].xss= curr_xstate & XSTATE_XSAVES_ONLY; -p->xstate.comp[i].align = curr_xstate & xstate_align; +/* + * Pass through size (eax) and offset (ebx) directly. Visbility of + * attributes in ecx limited by visible features in Da1. + */ +p->xstate.raw[i].a = raw_cpu_policy.xstate.raw[i].a; +p->xstate.raw[i].b = raw_cpu_policy.xstate.raw[i].b; +p->xstate.raw[i].c = raw_cpu_policy.xstate.raw[i].c & ecx_mask; } } diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h index f5115199d4f9..bfb66dd766b6 100644 --- a/xen/arch/x86/include/asm/xstate.h +++ b/xen/arch/x86/include/asm/xstate.h @@ -40,6 +40,7 @@ extern uint32_t mxcsr_mask; #define XSTATE_XSAVES_ONLY 0 #define XSTATE_COMPACTION_ENABLED (1ULL << 63) +#define XSTATE_XSS (1U << 0) #define XSTATE_ALIGN64 (1U << 1) extern u64 xfeature_mask; -- 2.30.2
[PATCH 1/7] x86/xstate: Fix initialisation of XSS cache
The clobbering of this_cpu(xcr0) and this_cpu(xss) to architecturally invalid
values is to force the subsequent set_xcr0() and set_msr_xss() to reload the
hardware register.

While XCR0 is reloaded in xstate_init(), MSR_XSS isn't.  This causes
get_msr_xss() to return the invalid value, and logic of the form:

    old = get_msr_xss();
    set_msr_xss(new);
    ...
    set_msr_xss(old);

to try and restore the architecturally invalid value.

The architecturally invalid value must be purged from the cache, meaning the
hardware register must be written at least once.  This in turn highlights that
the invalid value must only be used in the case that the hardware register is
available.

Fixes: f7f4a523927f ("x86/xstate: reset cached register values on resume")
Signed-off-by: Andrew Cooper
---
CC: Jan Beulich
CC: Roger Pau Monné

v3:
 * Split out of later patch
---
 xen/arch/x86/xstate.c | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index 99cedb4f5e24..75788147966a 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -641,13 +641,6 @@ void xstate_init(struct cpuinfo_x86 *c)
         return;
     }

-    /*
-     * Zap the cached values to make set_xcr0() and set_msr_xss() really
-     * write it.
-     */
-    this_cpu(xcr0) = 0;
-    this_cpu(xss) = ~0;
-
     cpuid_count(XSTATE_CPUID, 0, &eax, &ebx, &ecx, &edx);
     feature_mask = (((u64)edx << 32) | eax) & XCNTXT_MASK;
     BUG_ON(!valid_xcr0(feature_mask));
@@ -657,8 +650,19 @@ void xstate_init(struct cpuinfo_x86 *c)
      * Set CR4_OSXSAVE and run "cpuid" to get xsave_cntxt_size.
      */
     set_in_cr4(X86_CR4_OSXSAVE);
+
+    /*
+     * Zap the cached values to make set_xcr0() and set_msr_xss() really write
+     * the hardware register.
+     */
+    this_cpu(xcr0) = 0;
     if ( !set_xcr0(feature_mask) )
         BUG();
+    if ( cpu_has_xsaves )
+    {
+        this_cpu(xss) = ~0;
+        set_msr_xss(0);
+    }

     if ( bsp )
     {
--
2.30.2
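Editorial illustration (not part of the patch): a minimal standalone sketch of the caching problem the commit message describes. The hardware register is modelled as a plain variable, and every name here (hw_xss, cached_xss, hw_writes, model_set_xss(), model_get_xss(), model_init_xss()) is a hypothetical stand-in rather than Xen's actual this_cpu(xss) / set_msr_xss() code. It only shows why poisoning the cache must be followed by at least one real write, and only when the register exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: the "hardware" MSR and a software cache of it. */
static uint64_t hw_xss;
static uint64_t cached_xss;
static unsigned int hw_writes;

/* Write the register only when the requested value differs from the cache. */
static void model_set_xss(uint64_t val)
{
    /* All-ones is the poison value; it is never a legitimate request. */
    assert(val != ~0ULL);

    if ( val == cached_xss )
        return;

    hw_xss = val;
    hw_writes++;
    cached_xss = val;
}

/* Readers are served from the cache, so a poisoned cache leaks out here. */
static uint64_t model_get_xss(void)
{
    return cached_xss;
}

/*
 * Poison the cache so the next write cannot be skipped, then write once so
 * the poison value is purged.  Skip both steps when the register is absent.
 */
static void model_init_xss(bool has_xsaves)
{
    if ( !has_xsaves )
        return;

    cached_xss = ~0ULL;
    model_set_xss(0);
}
```

With this ordering, a later "old = get; set(new); set(old)" sequence only ever restores a value that was really written to the register.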
[PATCH 6/7] x86/cpuid: Fix handling of XSAVE dynamic leaves
First, if XSAVE is available in hardware but not visible to the guest, the dynamic leaves shouldn't be filled in. Second, the comment concerning XSS state is wrong. VT-x doesn't manage host/guest state automatically, but there is provision for "host only" bits to be set, so the implications are still accurate. Introduce xstate_compressed_size() to mirror the uncompressed one. Cross check it at boot. Signed-off-by: Andrew Cooper --- CC: Jan Beulich CC: Roger Pau Monné CC: Wei Liu v3: * Adjust commit message about !XSAVE guests * Rebase over boot time cross check * Use raw policy --- xen/arch/x86/cpuid.c | 24 -- xen/arch/x86/include/asm/xstate.h | 1 + xen/arch/x86/xstate.c | 34 +++ 3 files changed, 43 insertions(+), 16 deletions(-) diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c index 7a38e032146a..a822e80c7ea7 100644 --- a/xen/arch/x86/cpuid.c +++ b/xen/arch/x86/cpuid.c @@ -330,23 +330,15 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf, case XSTATE_CPUID: switch ( subleaf ) { -case 1: -if ( !p->xstate.xsavec && !p->xstate.xsaves ) -break; - -/* - * TODO: Figure out what to do for XSS state. VT-x manages host - * vs guest MSR_XSS automatically, so as soon as we start - * supporting any XSS states, the wrong XSS will be in context. - */ -BUILD_BUG_ON(XSTATE_XSAVES_ONLY != 0); -fallthrough; case 0: -/* - * Read CPUID[0xD,0/1].EBX from hardware. They vary with enabled - * XSTATE, and appropriate XCR0|XSS are in context. 
- */ -res->b = cpuid_count_ebx(leaf, subleaf); +if ( p->basic.xsave ) +res->b = xstate_uncompressed_size(v->arch.xcr0); +break; + +case 1: +if ( p->xstate.xsavec ) +res->b = xstate_compressed_size(v->arch.xcr0 | +v->arch.msrs->xss.raw); break; } break; diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h index bfb66dd766b6..da1d89d2f416 100644 --- a/xen/arch/x86/include/asm/xstate.h +++ b/xen/arch/x86/include/asm/xstate.h @@ -109,6 +109,7 @@ void xstate_free_save_area(struct vcpu *v); int xstate_alloc_save_area(struct vcpu *v); void xstate_init(struct cpuinfo_x86 *c); unsigned int xstate_uncompressed_size(uint64_t xcr0); +unsigned int xstate_compressed_size(uint64_t xstates); static inline uint64_t xgetbv(unsigned int index) { diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c index 1b3153600d9c..7b7f2dcaf651 100644 --- a/xen/arch/x86/xstate.c +++ b/xen/arch/x86/xstate.c @@ -621,6 +621,34 @@ unsigned int xstate_uncompressed_size(uint64_t xcr0) return size; } +unsigned int xstate_compressed_size(uint64_t xstates) +{ +unsigned int i, size = XSTATE_AREA_MIN_SIZE; + +if ( xstates == 0 ) /* TODO: clean up paths passing 0 in here. */ +return 0; + +if ( xstates <= (X86_XCR0_SSE | X86_XCR0_FP) ) +return size; + +/* + * For the compressed size, every component matters. Some componenets are + * rounded up to 64 first. 
+ */ +xstates &= ~(X86_XCR0_SSE | X86_XCR0_FP); +for_each_set_bit ( i, , 63 ) +{ +const struct xstate_component *c = _cpu_policy.xstate.comp[i]; + +if ( c->align ) +size = ROUNDUP(size, 64); + +size += c->size; +} + +return size; +} + struct xcheck_state { uint64_t states; uint32_t uncomp_size; @@ -683,6 +711,12 @@ static void __init check_new_xstate(struct xcheck_state *s, uint64_t new) s->states, , hw_size, s->comp_size); s->comp_size = hw_size; + +xen_size = xstate_compressed_size(s->states); + +if ( xen_size != hw_size ) +panic("XSTATE 0x%016"PRIx64", compressed hw size %#x != xen size %#x\n", + s->states, hw_size, xen_size); } else BUG_ON(hw_size); /* Compressed size reported, but no XSAVEC ? */ -- 2.30.2
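Editorial illustration (not part of the patch): a standalone sketch of the rule the new helper relies on, namely that in the compressed format every enabled component contributes, and some are first rounded up to a 64-byte boundary. The struct, the function name and the table passed in are hypothetical; the real code reads sizes and alignment flags from the raw CPU policy. The 512 + 64 base is the legacy FXSAVE region plus the XSAVE header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Legacy FXSAVE region (512 bytes) plus XSAVE header (64 bytes). */
#define SKETCH_AREA_MIN_SIZE (512u + 64u)

/* Hypothetical per-component description, indexed by state bit. */
struct sketch_component {
    uint32_t size;    /* bytes occupied by the component */
    int align64;      /* non-zero: start on a 64-byte boundary */
};

static unsigned int sketch_compressed_size(
    uint64_t states, const struct sketch_component comp[63])
{
    unsigned int i, size = SKETCH_AREA_MIN_SIZE;

    assert(comp != NULL);

    if ( states == 0 )
        return 0;

    /* x87 and SSE (bits 0 and 1) live in the legacy region already. */
    states &= ~3ULL;

    for ( i = 2; i < 63; i++ )
    {
        if ( !(states & (1ULL << i)) )
            continue;

        if ( comp[i].align64 )
            size = (size + 63) & ~63u;   /* round up to 64 */

        size += comp[i].size;
    }

    return size;
}
```

Unlike the uncompressed case, the result depends on every set bit, because components are packed back to back in index order.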
[PATCH 4/7] x86/xstate: Rework xstate_ctxt_size() as xstate_uncompressed_size()
We're soon going to need a compressed helper of the same form. The size of the uncompressed image depends on the single element with the largest offset + size. Sadly this isn't always the element with the largest index. Name the per-xstate-component cpu_policy struture, for legibility of the logic in xstate_uncompressed_size(). Cross-check with hardware during boot, and remove hw_uncompressed_size(). This means that the migration paths don't need to mess with XCR0 just to sanity check the buffer size. The users of hw_uncompressed_size() in xstate_init() can (and indeed need) to be replaced with CPUID instructions. They run with feature_mask in XCR0, and prior to setup_xstate_features() on the BSP. No practical change. Signed-off-by: Andrew Cooper --- CC: Jan Beulich CC: Roger Pau Monné v2: * Scan all features. LWP/APX_F are out-of-order. v3: * Rebase over boot time check. * Use the raw CPU policy. --- xen/arch/x86/domctl.c| 2 +- xen/arch/x86/hvm/hvm.c | 2 +- xen/arch/x86/include/asm/xstate.h| 2 +- xen/arch/x86/xstate.c| 78 +--- xen/include/xen/lib/x86/cpu-policy.h | 2 +- 5 files changed, 51 insertions(+), 35 deletions(-) diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c index 9a72d57333e9..c2f2016ed45a 100644 --- a/xen/arch/x86/domctl.c +++ b/xen/arch/x86/domctl.c @@ -833,7 +833,7 @@ long arch_do_domctl( uint32_t offset = 0; #define PV_XSAVE_HDR_SIZE (2 * sizeof(uint64_t)) -#define PV_XSAVE_SIZE(xcr0) (PV_XSAVE_HDR_SIZE + xstate_ctxt_size(xcr0)) +#define PV_XSAVE_SIZE(xcr0) (PV_XSAVE_HDR_SIZE + xstate_uncompressed_size(xcr0)) ret = -ESRCH; if ( (evc->vcpu >= d->max_vcpus) || diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index 2c66fe0f7a16..b84f4d2387d1 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -1190,7 +1190,7 @@ HVM_REGISTER_SAVE_RESTORE(CPU, hvm_save_cpu_ctxt, NULL, hvm_load_cpu_ctxt, 1, #define HVM_CPU_XSAVE_SIZE(xcr0) (offsetof(struct hvm_hw_cpu_xsave, \ save_area) + \ - xstate_ctxt_size(xcr0)) + 
xstate_uncompressed_size(xcr0)) static int cf_check hvm_save_cpu_xsave_states( struct vcpu *v, hvm_domain_context_t *h) diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h index c08c267884f0..f5115199d4f9 100644 --- a/xen/arch/x86/include/asm/xstate.h +++ b/xen/arch/x86/include/asm/xstate.h @@ -107,7 +107,7 @@ void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size); void xstate_free_save_area(struct vcpu *v); int xstate_alloc_save_area(struct vcpu *v); void xstate_init(struct cpuinfo_x86 *c); -unsigned int xstate_ctxt_size(u64 xcr0); +unsigned int xstate_uncompressed_size(uint64_t xcr0); static inline uint64_t xgetbv(unsigned int index) { diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c index 33a5a89719ef..1b3153600d9c 100644 --- a/xen/arch/x86/xstate.c +++ b/xen/arch/x86/xstate.c @@ -8,6 +8,8 @@ #include #include #include + +#include #include #include #include @@ -183,7 +185,7 @@ void expand_xsave_states(const struct vcpu *v, void *dest, unsigned int size) /* Check there is state to serialise (i.e. at least an XSAVE_HDR) */ BUG_ON(!v->arch.xcr0_accum); /* Check there is the correct room to decompress into. 
*/ -BUG_ON(size != xstate_ctxt_size(v->arch.xcr0_accum)); +BUG_ON(size != xstate_uncompressed_size(v->arch.xcr0_accum)); if ( !(xstate->xsave_hdr.xcomp_bv & XSTATE_COMPACTION_ENABLED) ) { @@ -245,7 +247,7 @@ void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size) u64 xstate_bv, valid; BUG_ON(!v->arch.xcr0_accum); -BUG_ON(size != xstate_ctxt_size(v->arch.xcr0_accum)); +BUG_ON(size != xstate_uncompressed_size(v->arch.xcr0_accum)); ASSERT(!xsave_area_compressed(src)); xstate_bv = ((const struct xsave_struct *)src)->xsave_hdr.xstate_bv; @@ -553,32 +555,6 @@ void xstate_free_save_area(struct vcpu *v) v->arch.xsave_area = NULL; } -static unsigned int hw_uncompressed_size(uint64_t xcr0) -{ -u64 act_xcr0 = get_xcr0(); -unsigned int size; -bool ok = set_xcr0(xcr0); - -ASSERT(ok); -size = cpuid_count_ebx(XSTATE_CPUID, 0); -ok = set_xcr0(act_xcr0); -ASSERT(ok); - -return size; -} - -/* Fastpath for common xstate size requests, avoiding reloads of xcr0. */ -unsigned int xstate_ctxt_size(u64 xcr0) -{ -if ( xcr0 == xfeature_mask ) -return xsave_cntxt_size; - -if ( xcr0 == 0 ) /* TODO: clean up paths passing 0 in here. */ -return 0; - -return hw_uncompressed_size(xcr0); -} - static bool valid_xcr0(uint64_t xcr0) { /* FP must be unconditionally set. */ @@ -611,6 +587,40 @@ static bool valid_xcr0(uint64_t xcr0) return true; } +unsigned int
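Editorial illustration (not part of the patch): a standalone sketch of the statement in the commit message that the uncompressed size is set by the single component with the largest offset + size, which is not always the one with the largest index. The struct, function name and table are hypothetical; the real helper takes offsets and sizes from the raw CPU policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-component layout, indexed by state bit. */
struct sketch_layout {
    uint32_t offset;  /* fixed offset in the uncompressed image */
    uint32_t size;    /* bytes occupied by the component */
};

static unsigned int sketch_uncompressed_size(
    uint64_t xcr0, const struct sketch_layout comp[63])
{
    /* Legacy FXSAVE region (512 bytes) plus XSAVE header (64 bytes). */
    unsigned int i, size = 512u + 64u;

    assert(comp != NULL);

    if ( xcr0 == 0 )
        return 0;

    /* Scan every enabled component; keep the furthest end seen. */
    for ( i = 2; i < 63; i++ )
    {
        unsigned int end;

        if ( !(xcr0 & (1ULL << i)) )
            continue;

        end = comp[i].offset + comp[i].size;
        if ( end > size )
            size = end;
    }

    return size;
}
```

Scanning all set bits, rather than stopping at the highest one, is what copes with out-of-order components such as the LWP/APX_F case mentioned in the v2 changelog.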
[PATCH 2/7] x86/xstate: Cross-check dynamic XSTATE sizes at boot
Right now, xstate_ctxt_size() performs a cross-check of size with CPUID in for every call. This is expensive, being used for domain create/migrate, as well as to service certain guest CPUID instructions. Instead, arrange to check the sizes once at boot. See the code comments for details. Right now, it just checks hardware against the algorithm expectations. Later patches will add further cross-checking. Introduce the missing X86_XCR0_* and X86_XSS_* constants, and a couple of missing CPUID bits. This is to maximise coverage in the sanity check, even if we don't expect to use/virtualise some of these features any time soon. Leave HDC and HWP alone for now. We don't have CPUID bits from them stored nicely. Only perform the cross-checks in debug builds. It's only developers or new hardware liable to trip these checks, and Xen at least tracks "maximum value ever seen in xcr0" for the lifetime of the VM, which we don't want to be tickling in the general case. Signed-off-by: Andrew Cooper --- CC: Jan Beulich CC: Roger Pau Monné v3: * New On Sapphire Rapids with the whole series inc diagnostics, we get this pattern: (XEN) *** check_new_xstate(, 0x0003) (XEN) *** check_new_xstate(, 0x0004) (XEN) *** check_new_xstate(, 0x00e0) (XEN) *** check_new_xstate(, 0x0200) (XEN) *** check_new_xstate(, 0x0006) (XEN) *** check_new_xstate(, 0x0100) (XEN) *** check_new_xstate(, 0x0400) (XEN) *** check_new_xstate(, 0x0800) (XEN) *** check_new_xstate(, 0x1000) (XEN) *** check_new_xstate(, 0x4000) (XEN) *** check_new_xstate(, 0x8000) and on Genoa, this pattern: (XEN) *** check_new_xstate(, 0x0003) (XEN) *** check_new_xstate(, 0x0004) (XEN) *** check_new_xstate(, 0x00e0) (XEN) *** check_new_xstate(, 0x0200) (XEN) *** check_new_xstate(, 0x0800) (XEN) *** check_new_xstate(, 0x1000) --- xen/arch/x86/include/asm/x86-defns.h| 25 +++- xen/arch/x86/xstate.c | 150 xen/include/public/arch-x86/cpufeatureset.h | 3 + 3 files changed, 177 insertions(+), 1 deletion(-) diff --git 
a/xen/arch/x86/include/asm/x86-defns.h b/xen/arch/x86/include/asm/x86-defns.h index 48d7a3b7af45..d7602ab225c4 100644 --- a/xen/arch/x86/include/asm/x86-defns.h +++ b/xen/arch/x86/include/asm/x86-defns.h @@ -77,7 +77,7 @@ #define X86_CR4_PKS0x0100 /* Protection Key Supervisor */ /* - * XSTATE component flags in XCR0 + * XSTATE component flags in XCR0 | MSR_XSS */ #define X86_XCR0_FP_POS 0 #define X86_XCR0_FP (1ULL << X86_XCR0_FP_POS) @@ -95,11 +95,34 @@ #define X86_XCR0_ZMM (1ULL << X86_XCR0_ZMM_POS) #define X86_XCR0_HI_ZMM_POS 7 #define X86_XCR0_HI_ZMM (1ULL << X86_XCR0_HI_ZMM_POS) +#define X86_XSS_PROC_TRACE(_AC(1, ULL) << 8) #define X86_XCR0_PKRU_POS 9 #define X86_XCR0_PKRU (1ULL << X86_XCR0_PKRU_POS) +#define X86_XSS_PASID (_AC(1, ULL) << 10) +#define X86_XSS_CET_U (_AC(1, ULL) << 11) +#define X86_XSS_CET_S (_AC(1, ULL) << 12) +#define X86_XSS_HDC (_AC(1, ULL) << 13) +#define X86_XSS_UINTR (_AC(1, ULL) << 14) +#define X86_XSS_LBR (_AC(1, ULL) << 15) +#define X86_XSS_HWP (_AC(1, ULL) << 16) +#define X86_XCR0_TILE_CFG (_AC(1, ULL) << 17) +#define X86_XCR0_TILE_DATA(_AC(1, ULL) << 18) #define X86_XCR0_LWP_POS 62 #define X86_XCR0_LWP (1ULL << X86_XCR0_LWP_POS) +#define X86_XCR0_STATES \ +(X86_XCR0_FP | X86_XCR0_SSE | X86_XCR0_YMM | X86_XCR0_BNDREGS | \ + X86_XCR0_BNDCSR | X86_XCR0_OPMASK | X86_XCR0_ZMM | \ + X86_XCR0_HI_ZMM | X86_XCR0_PKRU | X86_XCR0_TILE_CFG | \ + X86_XCR0_TILE_DATA | \ + X86_XCR0_LWP) + +#define X86_XSS_STATES \ +(X86_XSS_PROC_TRACE | X86_XSS_PASID | X86_XSS_CET_U | \ + X86_XSS_CET_S | X86_XSS_HDC | X86_XSS_UINTR | X86_XSS_LBR |\ + X86_XSS_HWP | \ + 0) + /* * Debug status flags in DR6. * diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c index 75788147966a..33a5a89719ef 100644 --- a/xen/arch/x86/xstate.c +++ b/xen/arch/x86/xstate.c @@ -604,9 +604,156 @@ static bool valid_xcr0(uint64_t xcr0) if ( !(xcr0 & X86_XCR0_BNDREGS) != !(xcr0 & X86_XCR0_BNDCSR) ) return false; +/* TILE_CFG and TILE_DATA must be the same. 
*/ +if ( !(xcr0 & X86_XCR0_TILE_CFG) != !(xcr0 & X86_XCR0_TILE_DATA) ) +return false; + return true; } +struct xcheck_state { +uint64_t states; +uint32_t uncomp_size; +uint32_t comp_size; +}; + +static void __init check_new_xstate(struct xcheck_state *s, uint64_t new) +{ +uint32_t
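Editorial illustration (not part of the patch): the "must be the same" checks added to valid_xcr0() use the !(a) != !(b) idiom, which is true exactly when one bit of a pair is set and the other is clear. A standalone sketch follows; the SKETCH_* names and the function are hypothetical, and the bit positions are the ones given in the x86-defns.h hunk above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as listed in the patch; names are local to this sketch. */
#define SKETCH_BNDREGS   (1ULL << 3)
#define SKETCH_BNDCSR    (1ULL << 4)
#define SKETCH_TILE_CFG  (1ULL << 17)
#define SKETCH_TILE_DATA (1ULL << 18)

/* Each pair must be enabled together or not at all. */
static bool sketch_pairs_consistent(uint64_t xcr0)
{
    /* !x collapses any non-zero mask result to 0 and zero to 1. */
    if ( !(xcr0 & SKETCH_BNDREGS) != !(xcr0 & SKETCH_BNDCSR) )
        return false;

    if ( !(xcr0 & SKETCH_TILE_CFG) != !(xcr0 & SKETCH_TILE_DATA) )
        return false;

    return true;
}

/* Usage example: one consistent and one inconsistent combination. */
static void sketch_pairs_example(void)
{
    assert(sketch_pairs_consistent(SKETCH_TILE_CFG | SKETCH_TILE_DATA));
    assert(!sketch_pairs_consistent(SKETCH_TILE_CFG));
}
```

Comparing the masked values directly would not work, because the two bits sit at different positions; the double negation normalises both sides to 0 or 1 first.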
[PATCH for-4.19 v3 0/7] x86/xstate: Fixes to size calculations
This has grown somewhat from v2, but is better for it IMO.

The headline change is patch 2 performing all the cross-checking at boot time.
This turned into needing to prepare the Raw CPU policy earlier on boot (to
avoid adding further to a scheme we're already looking to retire).

The end result has been tested across the entire XenServer hardware lab.  This
found several false assumptions about how the dynamic sizes behave.

Patches 1 and 6 are the main bugfixes from this series.  There's still lots
more work to do in order to get AMX and/or CET working, but this is at least a
4-yo series finally off my plate.

Andrew Cooper (7):
  x86/xstate: Fix initialisation of XSS cache
  x86/xstate: Cross-check dynamic XSTATE sizes at boot
  x86/boot: Collect the Raw CPU Policy earlier on boot
  x86/xstate: Rework xstate_ctxt_size() as xstate_uncompressed_size()
  x86/cpu-policy: Simplify recalculate_xstate()
  x86/cpuid: Fix handling of XSAVE dynamic leaves
  x86/defns: Clean up X86_{XCR0,XSS}_* constants

 xen/arch/x86/cpu-policy.c                   |  56 ++--
 xen/arch/x86/cpuid.c                        |  24 +-
 xen/arch/x86/domctl.c                       |   2 +-
 xen/arch/x86/hvm/hvm.c                      |   2 +-
 xen/arch/x86/i387.c                         |   2 +-
 xen/arch/x86/include/asm/x86-defns.h        |  55 ++--
 xen/arch/x86/include/asm/xstate.h           |   8 +-
 xen/arch/x86/setup.c                        |   4 +-
 xen/arch/x86/xstate.c                       | 286 +---
 xen/include/public/arch-x86/cpufeatureset.h |   3 +
 xen/include/xen/lib/x86/cpu-policy.h        |   2 +-
 11 files changed, 322 insertions(+), 122 deletions(-)

--
2.30.2
Re: [PATCH 3/5] x86/pvh: Set phys_base when calling xen_prepare_pvh()
On 10.04.24 21:48, Jason Andryuk wrote:
> phys_base needs to be set for __pa() to work in xen_pvh_init() when
> finding the hypercall page.  Set it before calling into
> xen_prepare_pvh(), which calls xen_pvh_init().  Clear it afterward to
> avoid __startup_64() adding to it and creating an incorrect value.
>
> Signed-off-by: Jason Andryuk
> ---
> Instead of setting and clearing phys_base, a dedicated variable could be
> used just for the hypercall page.  Having phys_base set properly may
> avoid further issues if the use of phys_base or __pa() grows.
> ---
>  arch/x86/platform/pvh/head.S | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
> index bb1e582e32b1..c08d08d8cc92 100644
> --- a/arch/x86/platform/pvh/head.S
> +++ b/arch/x86/platform/pvh/head.S
> @@ -125,7 +125,17 @@ SYM_CODE_START_LOCAL(pvh_start_xen)
>      xor %edx, %edx
>      wrmsr
>
> +    /* Calculate load offset from LOAD_PHYSICAL_ADDR and store in
> +     * phys_base.  __pa() needs phys_base set to calculate the
> +     * hypercall page in xen_pvh_init(). */

Please use the correct style for multi-line comments:

    /*
     * comment lines
     * comment lines
     */

> +    movq %rbp, %rbx
> +    subq $LOAD_PHYSICAL_ADDR, %rbx
> +    movq %rbx, phys_base(%rip)
>      call xen_prepare_pvh
> +    /* Clear phys_base.  __startup_64 will *add* to its value,
> +     * so reset to 0. */

Comment style again.

> +    xor %rbx, %rbx
> +    movq %rbx, phys_base(%rip)
>
>      /* startup_64 expects boot_params in %rsi. */
>      lea rva(pvh_bootparams)(%ebp), %rsi

With above fixed:

Reviewed-by: Juergen Gross

Juergen
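Editorial illustration (not part of the thread): a C sketch of the arithmetic the patch performs in assembly, namely that phys_base is the difference between where the kernel was actually loaded and LOAD_PHYSICAL_ADDR, and that __pa() on a kernel-image address adds it. All SKETCH_* and sketch_* names are hypothetical; the 0x1000000 link-time base is an assumed value for the example, and only the kernel-image mapping branch of __pa() is modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Start of the x86-64 kernel text mapping. */
#define SKETCH_START_KERNEL_MAP   0xffffffff80000000ULL
/* Assumed link-time physical base; configuration dependent in reality. */
#define SKETCH_LOAD_PHYSICAL_ADDR 0x1000000ULL

static uint64_t sketch_phys_base;

/* Mirrors "movq %rbp, %rbx; subq $LOAD_PHYSICAL_ADDR, %rbx". */
static void sketch_set_phys_base(uint64_t load_addr)
{
    sketch_phys_base = load_addr - SKETCH_LOAD_PHYSICAL_ADDR;
}

/* Physical address of a kernel-image virtual address. */
static uint64_t sketch_pa(uint64_t vaddr)
{
    assert(vaddr >= SKETCH_START_KERNEL_MAP);
    return vaddr - SKETCH_START_KERNEL_MAP + sketch_phys_base;
}
```

With sketch_phys_base left at 0, a kernel loaded anywhere other than the link-time base gets the wrong physical address for its symbols, which is the failure the patch avoids for the hypercall page.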
[xen-unstable-smoke test] 186104: regressions - FAIL
flight 186104 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/186104/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf                   6 xen-build                fail REGR. vs. 186064

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl           1 build-check(1)           blocked  n/a
 test-amd64-amd64-libvirt     15 migrate-support-check    fail   never pass
 test-arm64-arm64-xl-xsm      15 migrate-support-check    fail   never pass
 test-arm64-arm64-xl-xsm      16 saverestore-support-check fail  never pass

version targeted for testing:
 xen                  d6a7fd83039af36c28bd0ae2174f12c3888ce993
baseline version:
 xen                  ced21fbb2842ac4655048bdee56232974ff9ff9c

Last test of basis   186064  2024-05-21 15:04:02 Z    1 days
Testing same since   186104  2024-05-23 09:00:22 Z    0 days    1 attempts

People who touched revisions under test:
  Alejandro Vallejo
  Bobby Eshleman
  Jan Beulich
  Julien Grall
  Oleksandr Andrushchenko
  Oleksii Kurochko
  Roger Pau Monné
  Stewart Hildebrand
  Volodymyr Babchuk

jobs:
 build-arm64-xsm                                              pass
 build-amd64                                                  pass
 build-armhf                                                  fail
 build-amd64-libvirt                                          pass
 test-armhf-armhf-xl                                          blocked
 test-arm64-arm64-xl-xsm                                      pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64                    pass
 test-amd64-amd64-libvirt                                     pass

sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
    http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
    http://xenbits.xen.org/gitweb?p=osstest.git;a=summary

Not pushing.

(No revision log; it would be 387 lines long.)
Re: [PATCH] xen-hvm: Avoid livelock while handling buffered ioreqs
On Tue, Apr 9, 2024 at 3:19 PM Ross Lagerwall wrote:
>
> On Tue, Apr 9, 2024 at 11:20 AM Anthony PERARD wrote:
> >
> > On Thu, Apr 04, 2024 at 03:08:33PM +0100, Ross Lagerwall wrote:
> > > diff --git a/hw/xen/xen-hvm-common.c b/hw/xen/xen-hvm-common.c
> > > index 1627da739822..1116b3978938 100644
> > > --- a/hw/xen/xen-hvm-common.c
> > > +++ b/hw/xen/xen-hvm-common.c
> > > @@ -521,22 +521,30 @@ static bool handle_buffered_iopage(XenIOState *state)
> > [...]
> > >
> > >  static void handle_buffered_io(void *opaque)
> > >  {
> > > +    unsigned int handled;
> > >      XenIOState *state = opaque;
> > >
> > > -    if (handle_buffered_iopage(state)) {
> > > +    handled = handle_buffered_iopage(state);
> > > +    if (handled >= IOREQ_BUFFER_SLOT_NUM) {
> > > +        /* We handled a full page of ioreqs. Schedule a timer to continue
> > > +         * processing while giving other stuff a chance to run.
> > > +         */
> >
> > ./scripts/checkpatch.pl report a style issue here:
> >     WARNING: Block comments use a leading /* on a separate line
> >
> > I can try to remember to fix that on commit.
>
> I copied the comment style from a few lines above but I guess it was
> wrong.
>
> >
> > >          timer_mod(state->buffered_io_timer,
> > > -                BUFFER_IO_MAX_DELAY + qemu_clock_get_ms(QEMU_CLOCK_REALTIME));
> > > -    } else {
> > > +                qemu_clock_get_ms(QEMU_CLOCK_REALTIME));
> > > +    } else if (handled == 0) {
> >
> > Just curious, why did you check for `handled == 0` here instead of
> > `handled != 0`? That would have avoided to invert the last 2 cases, and
> > the patch would just have introduce a new case without changing the
> > order of the existing ones. But not that important I guess.
> >
>
> In general I try to use conditionals with the least amount of negation
> since I think it is easier to read. I can change it if you would prefer?

It looks like this hasn't been committed anywhere. Were you expecting
an updated version with the style issue fixed or has it fallen through
the cracks?

Thanks,
Ross
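Editorial illustration (not part of the thread): a standalone sketch of the idea behind the patch under discussion, which is to bound the work done per invocation and let the count of handled requests decide what happens next, so a guest that keeps the ring full cannot monopolise the handler. All names are hypothetical and the ring is reduced to two indices. Only the "full batch, continue soon" outcome is visible in the quoted hunk; how the other two outcomes map to timer actions is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical batch limit, standing in for the number of ring slots. */
#define SKETCH_SLOT_NUM 4u

struct sketch_ring {
    unsigned int read_idx;   /* next request to consume */
    unsigned int write_idx;  /* next slot the producer will fill */
};

enum sketch_next {
    SKETCH_CONTINUE_SOON,    /* full batch handled: reschedule immediately */
    SKETCH_IDLE,             /* nothing handled: stop polling (assumed) */
    SKETCH_POLL_LATER        /* partial batch: poll again after a delay (assumed) */
};

/* Consume at most SKETCH_SLOT_NUM requests and report how many were handled. */
static unsigned int sketch_drain(struct sketch_ring *r)
{
    unsigned int handled = 0;

    assert(r != NULL);

    while ( r->read_idx != r->write_idx && handled < SKETCH_SLOT_NUM )
    {
        r->read_idx++;       /* stand-in for processing one request */
        handled++;
    }

    return handled;
}

/* Decide the follow-up action from the count, as in the three-way branch. */
static enum sketch_next sketch_handle(struct sketch_ring *r)
{
    unsigned int handled = sketch_drain(r);

    if ( handled >= SKETCH_SLOT_NUM )
        return SKETCH_CONTINUE_SOON;

    if ( handled == 0 )
        return SKETCH_IDLE;

    return SKETCH_POLL_LATER;
}
```

Returning to the caller after each bounded batch is what gives other work a chance to run, even when the producer refills the ring as fast as it is drained.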
Re: [PATCH v3 2/2] tools/xg: Clean up xend-style overrides for CPU policies
On Thu, May 23, 2024 at 10:41:30AM +0100, Alejandro Vallejo wrote: > Factor out policy getters/setters from both (CPUID and MSR) policy override > functions. Additionally, use host policy rather than featureset when > preparing the cur policy, saving one hypercall and several lines of > boilerplate. > > No functional change intended. > > Signed-off-by: Alejandro Vallejo > --- > v3: > * Restored overscoped loop indices > * Split long line in conditional > --- > tools/libs/guest/xg_cpuid_x86.c | 438 ++-- > 1 file changed, 131 insertions(+), 307 deletions(-) > > diff --git a/tools/libs/guest/xg_cpuid_x86.c b/tools/libs/guest/xg_cpuid_x86.c > index 4f4b86b59470..1e631fd46d2f 100644 > --- a/tools/libs/guest/xg_cpuid_x86.c > +++ b/tools/libs/guest/xg_cpuid_x86.c > @@ -36,6 +36,34 @@ enum { > #define bitmaskof(idx) (1u << ((idx) & 31)) > #define featureword_of(idx) ((idx) >> 5) > > +static int deserialize_policy(xc_interface *xch, xc_cpu_policy_t *policy) > +{ > +uint32_t err_leaf = -1, err_subleaf = -1, err_msr = -1; > +int rc; > + > +rc = x86_cpuid_copy_from_buffer(>policy, policy->leaves, > +policy->nr_leaves, _leaf, > _subleaf); > +if ( rc ) > +{ > +if ( err_leaf != -1 ) > +ERROR("Failed to deserialise CPUID (err leaf %#x, subleaf %#x) > (%d = %s)", > + err_leaf, err_subleaf, -rc, strerror(-rc)); > +return rc; > +} > + > +rc = x86_msr_copy_from_buffer(>policy, policy->msrs, > + policy->nr_msrs, _msr); > +if ( rc ) > +{ > +if ( err_msr != -1 ) > +ERROR("Failed to deserialise MSR (err MSR %#x) (%d = %s)", > + err_msr, -rc, strerror(-rc)); > +return rc; > +} > + > +return 0; > +} > + > int xc_get_cpu_levelling_caps(xc_interface *xch, uint32_t *caps) > { > struct xen_sysctl sysctl = {}; > @@ -260,102 +288,37 @@ static int compare_leaves(const void *l, const void *r) > return 0; > } > > -static xen_cpuid_leaf_t *find_leaf( > -xen_cpuid_leaf_t *leaves, unsigned int nr_leaves, > -const struct xc_xend_cpuid *xend) > +static xen_cpuid_leaf_t *find_leaf(xc_cpu_policy_t *p, > + 
const struct xc_xend_cpuid *xend) > { > const xen_cpuid_leaf_t key = { xend->leaf, xend->subleaf }; > > -return bsearch(, leaves, nr_leaves, sizeof(*leaves), compare_leaves); > +return bsearch(, p->leaves, ARRAY_SIZE(p->leaves), Don't you need to use p->nr_leaves here, as otherwise we could check against possibly uninitialized leaves (or leaves with stale data)? > + sizeof(*p->leaves), compare_leaves); > } > > -static int xc_cpuid_xend_policy( > -xc_interface *xch, uint32_t domid, const struct xc_xend_cpuid *xend) > +static int xc_cpuid_xend_policy(xc_interface *xch, uint32_t domid, > +const struct xc_xend_cpuid *xend, > +xc_cpu_policy_t *host, > +xc_cpu_policy_t *def, > +xc_cpu_policy_t *cur) > { > -int rc; > -bool hvm; > -xc_domaininfo_t di; > -unsigned int nr_leaves, nr_msrs; > -uint32_t err_leaf = -1, err_subleaf = -1, err_msr = -1; > -/* > - * Three full policies. The host, default for the domain type, > - * and domain current. > - */ > -xen_cpuid_leaf_t *host = NULL, *def = NULL, *cur = NULL; > -unsigned int nr_host, nr_def, nr_cur; > - > -if ( (rc = xc_domain_getinfo_single(xch, domid, )) < 0 ) > -{ > -PERROR("Failed to obtain d%d info", domid); > -rc = -errno; > -goto fail; > -} > -hvm = di.flags & XEN_DOMINF_hvm_guest; > - > -rc = xc_cpu_policy_get_size(xch, _leaves, _msrs); > -if ( rc ) > -{ > -PERROR("Failed to obtain policy info size"); > -rc = -errno; > -goto fail; > -} > - > -rc = -ENOMEM; > -if ( (host = calloc(nr_leaves, sizeof(*host))) == NULL || > - (def = calloc(nr_leaves, sizeof(*def))) == NULL || > - (cur = calloc(nr_leaves, sizeof(*cur))) == NULL ) > -{ > -ERROR("Unable to allocate memory for %u CPUID leaves", nr_leaves); > -goto fail; > -} > - > -/* Get the domain's current policy. 
*/ > -nr_msrs = 0; > -nr_cur = nr_leaves; > -rc = get_domain_cpu_policy(xch, domid, _cur, cur, _msrs, NULL); > -if ( rc ) > -{ > -PERROR("Failed to obtain d%d current policy", domid); > -rc = -errno; > -goto fail; > -} > +if ( !xend ) > +return 0; > > -/* Get the domain type's default policy. */ > -nr_msrs = 0; > -nr_def = nr_leaves; > -rc = get_system_cpu_policy(xch, hvm ? XEN_SYSCTL_cpu_policy_hvm_default > -: XEN_SYSCTL_cpu_policy_pv_default, > -
Re: [XEN PATCH v2 06/15] x86/p2m: guard altp2m code with CONFIG_ALTP2M option
16.05.24 14:01, Jan Beulich:
> On 15.05.2024 11:10, Sergiy Kibrik wrote:
> > @@ -38,7 +38,10 @@ static inline bool altp2m_active(const struct domain *d)
> >  }
> >
> >  /* Only declaration is needed. DCE will optimise it out when linking. */
> > +void altp2m_vcpu_initialise(struct vcpu *v);
> > +void altp2m_vcpu_destroy(struct vcpu *v);
> >  uint16_t altp2m_vcpu_idx(const struct vcpu *v);
> > +int altp2m_vcpu_enable_ve(struct vcpu *v, gfn_t gfn);
> >  void altp2m_vcpu_disable_ve(struct vcpu *v);
>
> These additions look unrelated, as long as the description says nothing in
> this regard.

Agreed, I'll update the description to explain why these declarations are
added.

> > --- a/xen/arch/x86/include/asm/hvm/hvm.h
> > +++ b/xen/arch/x86/include/asm/hvm/hvm.h
> > @@ -670,7 +670,7 @@ static inline bool hvm_hap_supported(void)
> >  /* returns true if hardware supports alternate p2m's */
> >  static inline bool hvm_altp2m_supported(void)
> >  {
> > -    return hvm_funcs.caps.altp2m;
> > +    return IS_ENABLED(CONFIG_ALTP2M) && hvm_funcs.caps.altp2m;
>
> Which in turn raises the question whether the altp2m struct field shouldn't
> become conditional upon CONFIG_ALTP2M too (or rather: instead, as the change
> here then would need to be done differently). Yet maybe that would entail
> further changes elsewhere, so may well better be left for later.

But hvm_funcs.caps.altp2m is only a capability bit -- is it worth making it
conditional?

> > --- a/xen/arch/x86/mm/Makefile
> > +++ b/xen/arch/x86/mm/Makefile
> > @@ -1,7 +1,7 @@
> >  obj-y += shadow/
> >  obj-$(CONFIG_HVM) += hap/
> > -obj-$(CONFIG_HVM) += altp2m.o
> > +obj-$(CONFIG_ALTP2M) += altp2m.o
>
> This change I think wants to move to patch 5.

If this moves to patch 5, then the HVM=y && ALTP2M=n configuration will break
the build between patches 5 and 6, so I decided to put it together with the
fixes for these build failures in patch 6. Maybe I can merge patches 5 and 6
together then?

 -Sergiy
Re: [PATCH v6 7/8] xen: mapcache: Add support for grant mappings
On Thu, May 23, 2024 at 9:47 AM Manos Pitsidianakis < manos.pitsidiana...@linaro.org> wrote: > On Thu, 16 May 2024 18:48, "Edgar E. Iglesias" > wrote: > >From: "Edgar E. Iglesias" > > > >Add a second mapcache for grant mappings. The mapcache for > >grants needs to work with XC_PAGE_SIZE granularity since > >we can't map larger ranges than what has been granted to us. > > > >Like with foreign mappings (xen_memory), machines using grants > >are expected to initialize the xen_grants MR and map it > >into their address-map accordingly. > > > >Signed-off-by: Edgar E. Iglesias > >Reviewed-by: Stefano Stabellini > >--- > > hw/xen/xen-hvm-common.c | 12 ++- > > hw/xen/xen-mapcache.c | 163 ++-- > > include/hw/xen/xen-hvm-common.h | 3 + > > include/sysemu/xen.h| 7 ++ > > 4 files changed, 152 insertions(+), 33 deletions(-) > > > >diff --git a/hw/xen/xen-hvm-common.c b/hw/xen/xen-hvm-common.c > >index a0a0252da0..b8ace1c368 100644 > >--- a/hw/xen/xen-hvm-common.c > >+++ b/hw/xen/xen-hvm-common.c > >@@ -10,12 +10,18 @@ > > #include "hw/boards.h" > > #include "hw/xen/arch_hvm.h" > > > >-MemoryRegion xen_memory; > >+MemoryRegion xen_memory, xen_grants; > > > >-/* Check for xen memory. */ > >+/* Check for any kind of xen memory, foreign mappings or grants. */ > > bool xen_mr_is_memory(MemoryRegion *mr) > > { > >-return mr == _memory; > >+return mr == _memory || mr == _grants; > >+} > >+ > >+/* Check specifically for grants. 
*/ > >+bool xen_mr_is_grants(MemoryRegion *mr) > >+{ > >+return mr == _grants; > > } > > > > void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion > *mr, > >diff --git a/hw/xen/xen-mapcache.c b/hw/xen/xen-mapcache.c > >index a07c47b0b1..1cbc2aeaa9 100644 > >--- a/hw/xen/xen-mapcache.c > >+++ b/hw/xen/xen-mapcache.c > >@@ -14,6 +14,7 @@ > > > > #include > > > >+#include "hw/xen/xen-hvm-common.h" > > #include "hw/xen/xen_native.h" > > #include "qemu/bitmap.h" > > > >@@ -21,6 +22,8 @@ > > #include "sysemu/xen-mapcache.h" > > #include "trace.h" > > > >+#include > >+#include > > > > #if HOST_LONG_BITS == 32 > > # define MCACHE_MAX_SIZE (1UL<<31) /* 2GB Cap */ > >@@ -41,6 +44,7 @@ typedef struct MapCacheEntry { > > unsigned long *valid_mapping; > > uint32_t lock; > > #define XEN_MAPCACHE_ENTRY_DUMMY (1 << 0) > >+#define XEN_MAPCACHE_ENTRY_GRANT (1 << 1) > > Might we get more entry kinds in the future? (for example foreign maps). > Maybe this could be an enum. > > Perhaps. Foreign mappings are already supported, this flag separates ordinary foreign mappings from grant foreign mappings. IMO, since this is not an external interface it's probably better to change it once we have a concrete use-case at hand. 
> > uint8_t flags; > > hwaddr size; > > struct MapCacheEntry *next; > >@@ -71,6 +75,8 @@ typedef struct MapCache { > > } MapCache; > > > > static MapCache *mapcache; > >+static MapCache *mapcache_grants; > >+static xengnttab_handle *xen_region_gnttabdev; > > > > static inline void mapcache_lock(MapCache *mc) > > { > >@@ -131,6 +137,12 @@ void xen_map_cache_init(phys_offset_to_gaddr_t f, > void *opaque) > > unsigned long max_mcache_size; > > unsigned int bucket_shift; > > > >+xen_region_gnttabdev = xengnttab_open(NULL, 0); > >+if (xen_region_gnttabdev == NULL) { > >+error_report("mapcache: Failed to open gnttab device"); > >+exit(EXIT_FAILURE); > >+} > >+ > > if (HOST_LONG_BITS == 32) { > > bucket_shift = 16; > > } else { > >@@ -159,6 +171,15 @@ void xen_map_cache_init(phys_offset_to_gaddr_t f, > void *opaque) > > mapcache = xen_map_cache_init_single(f, opaque, > > bucket_shift, > > max_mcache_size); > >+ > >+/* > >+ * Grant mappings must use XC_PAGE_SIZE granularity since we can't > >+ * map anything beyond the number of pages granted to us. > >+ */ > >+mapcache_grants = xen_map_cache_init_single(f, opaque, > >+XC_PAGE_SHIFT, > >+max_mcache_size); > >+ > > setrlimit(RLIMIT_AS, _as); > > } > > > >@@ -168,17 +189,24 @@ static void xen_remap_bucket(MapCache *mc, > > hwaddr size, > > hwaddr address_index, > > bool dummy, > >+ bool grant, > >+ bool is_write, > > ram_addr_t ram_offset) > > { > > uint8_t *vaddr_base; > >-xen_pfn_t *pfns; > >+uint32_t *refs = NULL; > >+xen_pfn_t *pfns = NULL; > > int *err; > > You should use g_autofree to perform automatic cleanup on function exit > instead of manually freeing, since the allocations should only live > within the function call. > > Sounds good, I'll do that in the next version. > > unsigned
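As a rough illustration of the g_autofree suggestion: on GCC and Clang, GLib builds g_autofree on the compiler's `cleanup` variable attribute. The sketch below uses that attribute directly so it compiles without GLib; `AUTOFREE`, `free_ptr()`, `frees` and `sum_squares()` are hypothetical names, not QEMU or GLib code.

```c
#include <stdlib.h>

static int frees; /* counts cleanup calls, for demonstration only */

/* Called with a pointer to the annotated variable when it leaves scope. */
static void free_ptr(void *p)
{
    void **pp = p;

    free(*pp);
    frees++;
}

#define AUTOFREE __attribute__((cleanup(free_ptr)))

static int sum_squares(unsigned int n)
{
    AUTOFREE int *buf = calloc(n, sizeof(*buf));
    int total = 0;

    if (!buf)
        return -1; /* cleanup still runs; free(NULL) is a no-op */

    for (unsigned int i = 0; i < n; i++) {
        buf[i] = (int)(i * i);
        total += buf[i];
    }

    return total; /* no explicit free() on any exit path */
}
```

The design point is that every return path releases the buffer, which is why the attribute suits allocations that should live only for the duration of one call.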
Re: [PATCH v3 1/2] tools/xg: Streamline cpu policy serialise/deserialise calls
On Thu, May 23, 2024 at 10:41:29AM +0100, Alejandro Vallejo wrote: > The idea is to use xc_cpu_policy_t as a single object containing both the > serialised and deserialised forms of the policy. Note that we need lengths > for the arrays, as the serialised policies may be shorter than the array > capacities. > > * Add the serialised lengths to the struct so we can distinguish > between length and capacity of the serialisation buffers. > * Remove explicit buffer+lengths in serialise/deserialise calls > and use the internal buffer inside xc_cpu_policy_t instead. > * Refactor everything to use the new serialisation functions. > * Remove redundant serialization calls and avoid allocating dynamic > memory aside from the policy objects in xen-cpuid. Also minor cleanup > in the policy print call sites. > > No functional change intended. > > Signed-off-by: Alejandro Vallejo Acked-by: Roger Pau Monné Just two comments. > --- > v3: > * Better context scoping in xg_sr_common_x86. > * Can't be const because write_record() takes non-const. > * Adjusted line length of xen-cpuid's print_policy. > * Adjusted error messages in xen-cpuid's print_policy. > * Reverted removal of overscoped loop indices. > --- > tools/include/xenguest.h| 8 ++- > tools/libs/guest/xg_cpuid_x86.c | 98 - > tools/libs/guest/xg_private.h | 2 + > tools/libs/guest/xg_sr_common_x86.c | 56 ++--- > tools/misc/xen-cpuid.c | 41 > 5 files changed, 106 insertions(+), 99 deletions(-) > > diff --git a/tools/include/xenguest.h b/tools/include/xenguest.h > index e01f494b772a..563811cd8dde 100644 > --- a/tools/include/xenguest.h > +++ b/tools/include/xenguest.h > @@ -799,14 +799,16 @@ int xc_cpu_policy_set_domain(xc_interface *xch, > uint32_t domid, > xc_cpu_policy_t *policy); > > /* Manipulate a policy via architectural representations. 
*/ > -int xc_cpu_policy_serialise(xc_interface *xch, const xc_cpu_policy_t *policy, > -xen_cpuid_leaf_t *leaves, uint32_t *nr_leaves, > -xen_msr_entry_t *msrs, uint32_t *nr_msrs); > +int xc_cpu_policy_serialise(xc_interface *xch, xc_cpu_policy_t *policy); > int xc_cpu_policy_update_cpuid(xc_interface *xch, xc_cpu_policy_t *policy, > const xen_cpuid_leaf_t *leaves, > uint32_t nr); > int xc_cpu_policy_update_msrs(xc_interface *xch, xc_cpu_policy_t *policy, >const xen_msr_entry_t *msrs, uint32_t nr); > +int xc_cpu_policy_get_leaves(xc_interface *xch, const xc_cpu_policy_t > *policy, > + const xen_cpuid_leaf_t **leaves, uint32_t *nr); > +int xc_cpu_policy_get_msrs(xc_interface *xch, const xc_cpu_policy_t *policy, > + const xen_msr_entry_t **msrs, uint32_t *nr); Maybe it would be helpful to have a comment clarifying that the return of xc_cpu_policy_get_{leaves,msrs}() is a reference to the content of the policy, not a copy of it (and hence is tied to the lifetime of policy, and doesn't require explicit freeing). > > /* Compatibility calculations. 
*/ > bool xc_cpu_policy_is_compatible(xc_interface *xch, xc_cpu_policy_t *host, > diff --git a/tools/libs/guest/xg_cpuid_x86.c b/tools/libs/guest/xg_cpuid_x86.c > index 4453178100ad..4f4b86b59470 100644 > --- a/tools/libs/guest/xg_cpuid_x86.c > +++ b/tools/libs/guest/xg_cpuid_x86.c > @@ -834,14 +834,13 @@ void xc_cpu_policy_destroy(xc_cpu_policy_t *policy) > } > } > > -static int deserialize_policy(xc_interface *xch, xc_cpu_policy_t *policy, > - unsigned int nr_leaves, unsigned int > nr_entries) > +static int deserialize_policy(xc_interface *xch, xc_cpu_policy_t *policy) > { > uint32_t err_leaf = -1, err_subleaf = -1, err_msr = -1; > int rc; > > rc = x86_cpuid_copy_from_buffer(>policy, policy->leaves, > -nr_leaves, _leaf, _subleaf); > +policy->nr_leaves, _leaf, > _subleaf); > if ( rc ) > { > if ( err_leaf != -1 ) > @@ -851,7 +850,7 @@ static int deserialize_policy(xc_interface *xch, > xc_cpu_policy_t *policy, > } > > rc = x86_msr_copy_from_buffer(>policy, policy->msrs, > - nr_entries, _msr); > + policy->nr_msrs, _msr); > if ( rc ) > { > if ( err_msr != -1 ) > @@ -878,7 +877,10 @@ int xc_cpu_policy_get_system(xc_interface *xch, unsigned > int policy_idx, > return rc; > } > > -rc = deserialize_policy(xch, policy, nr_leaves, nr_msrs); > +policy->nr_leaves = nr_leaves; > +policy->nr_msrs = nr_msrs; > + > +rc = deserialize_policy(xch, policy); > if ( rc ) > { > errno = -rc; > @@ -903,7 +905,10 @@ int xc_cpu_policy_get_domain(xc_interface
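To make the point about borrowed references concrete, here is a small sketch of an accessor that hands out a pointer into the policy object rather than a copy. The types and names (`leaf_t`, `struct policy`, `policy_get_leaves()`, `MAX_LEAVES`) are simplified assumptions and do not reproduce the libxenguest definitions.

```c
#include <stdint.h>

#define MAX_LEAVES 8 /* hypothetical buffer capacity */

typedef struct {
    uint32_t leaf, subleaf, a, b, c, d;
} leaf_t;

struct policy {
    leaf_t leaves[MAX_LEAVES]; /* serialisation buffer (capacity) */
    uint32_t nr_leaves;        /* entries currently valid (length) */
};

/*
 * Returns a reference into *p, not a copy: the caller must not free it,
 * and it is only valid for as long as the policy object itself.
 */
static int policy_get_leaves(const struct policy *p,
                             const leaf_t **leaves, uint32_t *nr)
{
    if (!p || !leaves || !nr)
        return -1;

    *leaves = p->leaves;
    *nr = p->nr_leaves;

    return 0;
}
```

A comment like the one above the function is roughly what the review asks for: it states the lifetime and ownership rules at the declaration.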
Re: [XEN PATCH] x86/iommu: Conditionally compile platform-specific union entries
On Thu, May 23, 2024 at 09:19:53AM +, Teddy Astie wrote:
> If some platform driver isn't compiled in, remove its related union
> entries as they are not used.
>
> Signed-off-by Teddy Astie
> ---
>  xen/arch/x86/include/asm/iommu.h | 4
>  xen/arch/x86/include/asm/pci.h   | 4
>  2 files changed, 8 insertions(+)
>
> diff --git a/xen/arch/x86/include/asm/iommu.h b/xen/arch/x86/include/asm/iommu.h
> index 8dc464fbd3..99180940c4 100644
> --- a/xen/arch/x86/include/asm/iommu.h
> +++ b/xen/arch/x86/include/asm/iommu.h
> @@ -42,17 +42,21 @@ struct arch_iommu
>      struct list_head identity_maps;
>
>      union {
> +        #ifdef CONFIG_INTEL_IOMMU
>          /* Intel VT-d */
>          struct {
>              uint64_t pgd_maddr; /* io page directory machine address */
>              unsigned int agaw; /* adjusted guest address width, 0 is level 2 30-bit */
>              unsigned long *iommu_bitmap; /* bitmap of iommu(s) that the domain uses */
>          } vtd;
> +        #endif
> +        #ifdef CONFIG_AMD_IOMMU
>          /* AMD IOMMU */
>          struct {
>              unsigned int paging_mode;
>              struct page_info *root_table;
>          } amd;
> +        #endif
>      };
>  };
>
> diff --git a/xen/arch/x86/include/asm/pci.h b/xen/arch/x86/include/asm/pci.h
> index fd5480d67d..842710f0dc 100644
> --- a/xen/arch/x86/include/asm/pci.h
> +++ b/xen/arch/x86/include/asm/pci.h
> @@ -22,12 +22,16 @@ struct arch_pci_dev {
>       */
>      union {
>          /* Subset of struct arch_iommu's fields, to be used in dom_io. */
> +        #ifdef CONFIG_INTEL_IOMMU
>          struct {
>              uint64_t pgd_maddr;
>          } vtd;
> +        #endif
> +        #ifdef CONFIG_AMD_IOMMU
>          struct {
>              struct page_info *root_table;
>          } amd;
> +        #endif
>      };

The #ifdef and #endif preprocessor directives shouldn't be indented.

Would you mind adding /* CONFIG_{AMD,INTEL}_IOMMU */ comments to the #endif directives?

I wonder if we could move the definitions of those structures to the vendor-specific headers, but that's more convoluted, and would require including the iommu headers in pci.h.

Thanks, Roger.
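The style requested in the review could look roughly like the sketch below: directives start in column 0 and each #endif names the option it closes. `struct arch_iommu_sketch` is a cut-down, hypothetical version of the real structure, and the `#define` at the top only simulates a build with the AMD driver enabled.

```c
#include <stdint.h>

#define CONFIG_AMD_IOMMU 1 /* simulate a build with only the AMD driver */

struct page_info; /* opaque for the purposes of this sketch */

struct arch_iommu_sketch {
    union {
#ifdef CONFIG_INTEL_IOMMU
        /* Intel VT-d */
        struct {
            uint64_t pgd_maddr;
        } vtd;
#endif /* CONFIG_INTEL_IOMMU */
#ifdef CONFIG_AMD_IOMMU
        /* AMD IOMMU */
        struct {
            unsigned int paging_mode;
            struct page_info *root_table;
        } amd;
#endif /* CONFIG_AMD_IOMMU */
    };
};
```

One caveat worth keeping in mind with this approach: if neither option were defined, the union would have no members, which ISO C does not permit (GCC accepts it as an extension).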
[PATCH v3 2/2] tools/xg: Clean up xend-style overrides for CPU policies
Factor out policy getters/setters from both (CPUID and MSR) policy override functions. Additionally, use host policy rather than featureset when preparing the cur policy, saving one hypercall and several lines of boilerplate. No functional change intended. Signed-off-by: Alejandro Vallejo --- v3: * Restored overscoped loop indices * Split long line in conditional --- tools/libs/guest/xg_cpuid_x86.c | 438 ++-- 1 file changed, 131 insertions(+), 307 deletions(-) diff --git a/tools/libs/guest/xg_cpuid_x86.c b/tools/libs/guest/xg_cpuid_x86.c index 4f4b86b59470..1e631fd46d2f 100644 --- a/tools/libs/guest/xg_cpuid_x86.c +++ b/tools/libs/guest/xg_cpuid_x86.c @@ -36,6 +36,34 @@ enum { #define bitmaskof(idx) (1u << ((idx) & 31)) #define featureword_of(idx) ((idx) >> 5) +static int deserialize_policy(xc_interface *xch, xc_cpu_policy_t *policy) +{ +uint32_t err_leaf = -1, err_subleaf = -1, err_msr = -1; +int rc; + +rc = x86_cpuid_copy_from_buffer(>policy, policy->leaves, +policy->nr_leaves, _leaf, _subleaf); +if ( rc ) +{ +if ( err_leaf != -1 ) +ERROR("Failed to deserialise CPUID (err leaf %#x, subleaf %#x) (%d = %s)", + err_leaf, err_subleaf, -rc, strerror(-rc)); +return rc; +} + +rc = x86_msr_copy_from_buffer(>policy, policy->msrs, + policy->nr_msrs, _msr); +if ( rc ) +{ +if ( err_msr != -1 ) +ERROR("Failed to deserialise MSR (err MSR %#x) (%d = %s)", + err_msr, -rc, strerror(-rc)); +return rc; +} + +return 0; +} + int xc_get_cpu_levelling_caps(xc_interface *xch, uint32_t *caps) { struct xen_sysctl sysctl = {}; @@ -260,102 +288,37 @@ static int compare_leaves(const void *l, const void *r) return 0; } -static xen_cpuid_leaf_t *find_leaf( -xen_cpuid_leaf_t *leaves, unsigned int nr_leaves, -const struct xc_xend_cpuid *xend) +static xen_cpuid_leaf_t *find_leaf(xc_cpu_policy_t *p, + const struct xc_xend_cpuid *xend) { const xen_cpuid_leaf_t key = { xend->leaf, xend->subleaf }; -return bsearch(, leaves, nr_leaves, sizeof(*leaves), compare_leaves); +return bsearch(, p->leaves, 
ARRAY_SIZE(p->leaves), + sizeof(*p->leaves), compare_leaves); } -static int xc_cpuid_xend_policy( -xc_interface *xch, uint32_t domid, const struct xc_xend_cpuid *xend) +static int xc_cpuid_xend_policy(xc_interface *xch, uint32_t domid, +const struct xc_xend_cpuid *xend, +xc_cpu_policy_t *host, +xc_cpu_policy_t *def, +xc_cpu_policy_t *cur) { -int rc; -bool hvm; -xc_domaininfo_t di; -unsigned int nr_leaves, nr_msrs; -uint32_t err_leaf = -1, err_subleaf = -1, err_msr = -1; -/* - * Three full policies. The host, default for the domain type, - * and domain current. - */ -xen_cpuid_leaf_t *host = NULL, *def = NULL, *cur = NULL; -unsigned int nr_host, nr_def, nr_cur; - -if ( (rc = xc_domain_getinfo_single(xch, domid, )) < 0 ) -{ -PERROR("Failed to obtain d%d info", domid); -rc = -errno; -goto fail; -} -hvm = di.flags & XEN_DOMINF_hvm_guest; - -rc = xc_cpu_policy_get_size(xch, _leaves, _msrs); -if ( rc ) -{ -PERROR("Failed to obtain policy info size"); -rc = -errno; -goto fail; -} - -rc = -ENOMEM; -if ( (host = calloc(nr_leaves, sizeof(*host))) == NULL || - (def = calloc(nr_leaves, sizeof(*def))) == NULL || - (cur = calloc(nr_leaves, sizeof(*cur))) == NULL ) -{ -ERROR("Unable to allocate memory for %u CPUID leaves", nr_leaves); -goto fail; -} - -/* Get the domain's current policy. */ -nr_msrs = 0; -nr_cur = nr_leaves; -rc = get_domain_cpu_policy(xch, domid, _cur, cur, _msrs, NULL); -if ( rc ) -{ -PERROR("Failed to obtain d%d current policy", domid); -rc = -errno; -goto fail; -} +if ( !xend ) +return 0; -/* Get the domain type's default policy. */ -nr_msrs = 0; -nr_def = nr_leaves; -rc = get_system_cpu_policy(xch, hvm ? XEN_SYSCTL_cpu_policy_hvm_default -: XEN_SYSCTL_cpu_policy_pv_default, - _def, def, _msrs, NULL); -if ( rc ) -{ -PERROR("Failed to obtain %s def policy", hvm ? "hvm" : "pv"); -rc = -errno; -goto fail; -} +if ( !host || !def || !cur ) +return -EINVAL; -/* Get the host policy. 
*/ -nr_msrs = 0; -nr_host = nr_leaves; -rc = get_system_cpu_policy(xch, XEN_SYSCTL_cpu_policy_host, - _host, host, _msrs, NULL); -if ( rc ) -{ -PERROR("Failed to
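The lookup that find_leaf() performs can be sketched in isolation as follows. This is an illustrative reconstruction under assumptions: `leaf_t` is a simplified type, and the helper takes the number of populated entries explicitly, since bsearch() requires the searched range to be sorted.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint32_t leaf, subleaf, a, b, c, d;
} leaf_t;

/* Order by leaf first, then by subleaf. */
static int compare_leaves(const void *l, const void *r)
{
    const leaf_t *lhs = l, *rhs = r;

    if (lhs->leaf != rhs->leaf)
        return lhs->leaf < rhs->leaf ? -1 : 1;

    if (lhs->subleaf != rhs->subleaf)
        return lhs->subleaf < rhs->subleaf ? -1 : 1;

    return 0;
}

/* 'nr' is the count of valid, sorted entries, not the array capacity. */
static leaf_t *find_leaf(leaf_t *leaves, size_t nr,
                         uint32_t leaf, uint32_t subleaf)
{
    const leaf_t key = { leaf, subleaf };

    return bsearch(&key, leaves, nr, sizeof(*leaves), compare_leaves);
}
```

A NULL return means the (leaf, subleaf) pair is not present among the first `nr` entries.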
[PATCH v3 1/2] tools/xg: Streamline cpu policy serialise/deserialise calls
The idea is to use xc_cpu_policy_t as a single object containing both the serialised and deserialised forms of the policy. Note that we need lengths for the arrays, as the serialised policies may be shorter than the array capacities. * Add the serialised lengths to the struct so we can distinguish between length and capacity of the serialisation buffers. * Remove explicit buffer+lengths in serialise/deserialise calls and use the internal buffer inside xc_cpu_policy_t instead. * Refactor everything to use the new serialisation functions. * Remove redundant serialization calls and avoid allocating dynamic memory aside from the policy objects in xen-cpuid. Also minor cleanup in the policy print call sites. No functional change intended. Signed-off-by: Alejandro Vallejo --- v3: * Better context scoping in xg_sr_common_x86. * Can't be const because write_record() takes non-const. * Adjusted line length of xen-cpuid's print_policy. * Adjusted error messages in xen-cpuid's print_policy. * Reverted removal of overscoped loop indices. --- tools/include/xenguest.h| 8 ++- tools/libs/guest/xg_cpuid_x86.c | 98 - tools/libs/guest/xg_private.h | 2 + tools/libs/guest/xg_sr_common_x86.c | 56 ++--- tools/misc/xen-cpuid.c | 41 5 files changed, 106 insertions(+), 99 deletions(-) diff --git a/tools/include/xenguest.h b/tools/include/xenguest.h index e01f494b772a..563811cd8dde 100644 --- a/tools/include/xenguest.h +++ b/tools/include/xenguest.h @@ -799,14 +799,16 @@ int xc_cpu_policy_set_domain(xc_interface *xch, uint32_t domid, xc_cpu_policy_t *policy); /* Manipulate a policy via architectural representations. 
*/ -int xc_cpu_policy_serialise(xc_interface *xch, const xc_cpu_policy_t *policy, -xen_cpuid_leaf_t *leaves, uint32_t *nr_leaves, -xen_msr_entry_t *msrs, uint32_t *nr_msrs); +int xc_cpu_policy_serialise(xc_interface *xch, xc_cpu_policy_t *policy); int xc_cpu_policy_update_cpuid(xc_interface *xch, xc_cpu_policy_t *policy, const xen_cpuid_leaf_t *leaves, uint32_t nr); int xc_cpu_policy_update_msrs(xc_interface *xch, xc_cpu_policy_t *policy, const xen_msr_entry_t *msrs, uint32_t nr); +int xc_cpu_policy_get_leaves(xc_interface *xch, const xc_cpu_policy_t *policy, + const xen_cpuid_leaf_t **leaves, uint32_t *nr); +int xc_cpu_policy_get_msrs(xc_interface *xch, const xc_cpu_policy_t *policy, + const xen_msr_entry_t **msrs, uint32_t *nr); /* Compatibility calculations. */ bool xc_cpu_policy_is_compatible(xc_interface *xch, xc_cpu_policy_t *host, diff --git a/tools/libs/guest/xg_cpuid_x86.c b/tools/libs/guest/xg_cpuid_x86.c index 4453178100ad..4f4b86b59470 100644 --- a/tools/libs/guest/xg_cpuid_x86.c +++ b/tools/libs/guest/xg_cpuid_x86.c @@ -834,14 +834,13 @@ void xc_cpu_policy_destroy(xc_cpu_policy_t *policy) } } -static int deserialize_policy(xc_interface *xch, xc_cpu_policy_t *policy, - unsigned int nr_leaves, unsigned int nr_entries) +static int deserialize_policy(xc_interface *xch, xc_cpu_policy_t *policy) { uint32_t err_leaf = -1, err_subleaf = -1, err_msr = -1; int rc; rc = x86_cpuid_copy_from_buffer(>policy, policy->leaves, -nr_leaves, _leaf, _subleaf); +policy->nr_leaves, _leaf, _subleaf); if ( rc ) { if ( err_leaf != -1 ) @@ -851,7 +850,7 @@ static int deserialize_policy(xc_interface *xch, xc_cpu_policy_t *policy, } rc = x86_msr_copy_from_buffer(>policy, policy->msrs, - nr_entries, _msr); + policy->nr_msrs, _msr); if ( rc ) { if ( err_msr != -1 ) @@ -878,7 +877,10 @@ int xc_cpu_policy_get_system(xc_interface *xch, unsigned int policy_idx, return rc; } -rc = deserialize_policy(xch, policy, nr_leaves, nr_msrs); +policy->nr_leaves = nr_leaves; +policy->nr_msrs = 
nr_msrs; + +rc = deserialize_policy(xch, policy); if ( rc ) { errno = -rc; @@ -903,7 +905,10 @@ int xc_cpu_policy_get_domain(xc_interface *xch, uint32_t domid, return rc; } -rc = deserialize_policy(xch, policy, nr_leaves, nr_msrs); +policy->nr_leaves = nr_leaves; +policy->nr_msrs = nr_msrs; + +rc = deserialize_policy(xch, policy); if ( rc ) { errno = -rc; @@ -917,17 +922,14 @@ int xc_cpu_policy_set_domain(xc_interface *xch, uint32_t domid, xc_cpu_policy_t *policy) { uint32_t err_leaf = -1, err_subleaf = -1, err_msr = -1; -unsigned int nr_leaves = ARRAY_SIZE(policy->leaves); -unsigned int nr_msrs =
[PATCH v3 0/2] Clean the policy manipulation path in domain creation
v2 -> v3:
  * Style adjustments
  * Revert of loop index scope refactors

v1 -> v2:
  * Removed xc_cpu_policy from xenguest.h (dropped v1/patch1)
  * Added accessors for xc_cpu_policy so the serialised form can be extracted.
  * Modified xen-cpuid to use accessors.

Original cover letter

In the context of creating a domain, we currently issue a lot of hypercalls redundantly while populating its CPU policy; likely a side effect of organic growth more than anything else.

However, the worst part is not the overhead (this is a glacially cold path), but the insane amounts of boilerplate that make it really hard to pick apart what's going on. One major contributor to this situation is the fact that what's effectively "setup" and "teardown" phases in policy manipulation are not factored out from the functions that perform said manipulations, leading to the same getters and setters being invoked many times, when once each would do.

Another big contributor is the code being unaware of when a policy is serialised and when it's not.

This series attempts to alleviate this situation, yielding over 200 LoC reduction.

Patch 1: Mechanical change. Makes xc_cpu_policy_t public so it's usable from clients of libxc/libxg.
Patch 2: Changes the (de)serialization wrappers in xenguest so they always serialise to/from the internal buffers of xc_cpu_policy_t. The struct is suitably expanded to hold extra information required.
Patch 3: Performs the refactor of the policy manipulation code so that it follows a strict: PULL_POLICIES, MUTATE_POLICY (n times), PUSH_POLICY.
Alejandro Vallejo (2):
  tools/xg: Streamline cpu policy serialise/deserialise calls
  tools/xg: Clean up xend-style overrides for CPU policies

 tools/include/xenguest.h            |   8 +-
 tools/libs/guest/xg_cpuid_x86.c     | 530 ++--
 tools/libs/guest/xg_private.h       |   2 +
 tools/libs/guest/xg_sr_common_x86.c |  56 ++-
 tools/misc/xen-cpuid.c              |  41 +--
 5 files changed, 234 insertions(+), 403 deletions(-)

--
2.34.1
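The pull, mutate, push discipline described in the cover letter can be sketched with a toy model. Everything here is hypothetical: the "policy" is a single 32-bit word, and `pull_policy()` / `push_policy()` merely count calls to stand in for hypercalls.

```c
#include <stdint.h>

static unsigned int hypercalls;   /* number of simulated hypercalls */
static uint32_t hv_policy = 0xff; /* pretend hypervisor-side policy */

static uint32_t pull_policy(void)
{
    hypercalls++;
    return hv_policy;
}

static void push_policy(uint32_t policy)
{
    hypercalls++;
    hv_policy = policy;
}

/* Pure in-memory mutation: no hypercall involved. */
static uint32_t clear_bit(uint32_t policy, unsigned int bit)
{
    return policy & ~(UINT32_C(1) << bit);
}

static void apply_overrides(const unsigned int *bits, unsigned int n)
{
    uint32_t policy = pull_policy();     /* PULL once */

    for (unsigned int i = 0; i < n; i++) /* MUTATE n times */
        policy = clear_bit(policy, bits[i]);

    push_policy(policy);                 /* PUSH once */
}
```

However many overrides are applied, the number of simulated hypercalls stays at two, which is the property the refactor is after.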
[xen-unstable test] 186078: tolerable FAIL
flight 186078 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/186078/ Failures :-/ but no regressions. Tests which are failing intermittently (not blocking): test-armhf-armhf-xl-qcow2 8 xen-boot fail in 186066 pass in 186078 test-armhf-armhf-xl-multivcpu 8 xen-boot fail pass in 186066 Tests which did not succeed, but are not blocking: test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail in 186066 never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail in 186066 never pass test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 186066 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 186066 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 186066 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 186066 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 186066 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 186066 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass 
test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qcow2 14 migrate-support-checkfail never pass test-amd64-amd64-libvirt-raw 14 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 15 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail never pass test-arm64-arm64-xl-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-raw 14 migrate-support-checkfail never pass test-armhf-armhf-xl-raw 15 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-libvirt-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-qcow214 migrate-support-checkfail never pass test-armhf-armhf-xl-qcow215 saverestore-support-checkfail never pass version targeted for testing: xen ced21fbb2842ac4655048bdee56232974ff9ff9c baseline version: xen ced21fbb2842ac4655048bdee56232974ff9ff9c Last test of basis 186078 2024-05-22 12:53:24 Z0 days Testing same since (not found) 0 attempts jobs: build-amd64-xsm pass build-arm64-xsm pass build-i386-xsm pass build-amd64-xtf pass build-amd64 pass build-arm64 pass 
build-armhf pass
[XEN PATCH] x86/iommu: Conditionally compile platform-specific union entries
If some platform driver isn't compiled in, remove its related union entries as they are not used.

Signed-off-by Teddy Astie
---
 xen/arch/x86/include/asm/iommu.h | 4
 xen/arch/x86/include/asm/pci.h   | 4
 2 files changed, 8 insertions(+)

diff --git a/xen/arch/x86/include/asm/iommu.h b/xen/arch/x86/include/asm/iommu.h
index 8dc464fbd3..99180940c4 100644
--- a/xen/arch/x86/include/asm/iommu.h
+++ b/xen/arch/x86/include/asm/iommu.h
@@ -42,17 +42,21 @@ struct arch_iommu
     struct list_head identity_maps;

     union {
+        #ifdef CONFIG_INTEL_IOMMU
         /* Intel VT-d */
         struct {
             uint64_t pgd_maddr; /* io page directory machine address */
             unsigned int agaw; /* adjusted guest address width, 0 is level 2 30-bit */
             unsigned long *iommu_bitmap; /* bitmap of iommu(s) that the domain uses */
         } vtd;
+        #endif
+        #ifdef CONFIG_AMD_IOMMU
         /* AMD IOMMU */
         struct {
             unsigned int paging_mode;
             struct page_info *root_table;
         } amd;
+        #endif
     };
 };

diff --git a/xen/arch/x86/include/asm/pci.h b/xen/arch/x86/include/asm/pci.h
index fd5480d67d..842710f0dc 100644
--- a/xen/arch/x86/include/asm/pci.h
+++ b/xen/arch/x86/include/asm/pci.h
@@ -22,12 +22,16 @@ struct arch_pci_dev {
      */
     union {
         /* Subset of struct arch_iommu's fields, to be used in dom_io. */
+        #ifdef CONFIG_INTEL_IOMMU
         struct {
             uint64_t pgd_maddr;
         } vtd;
+        #endif
+        #ifdef CONFIG_AMD_IOMMU
         struct {
             struct page_info *root_table;
         } amd;
+        #endif
     };
     domid_t pseudo_domid;
     mfn_t leaf_mfn;
--
2.45.1

Teddy Astie | Vates XCP-ng Intern
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
Re: [PATCH v2 4/4] tools: Drop libsystemd as a dependency
On 16.05.24 20:58, Andrew Cooper wrote:
> There are no more users, and we want to dissuade people from introducing new
> users just for sd_notify() and friends. Drop the dependency.
>
> We still want the overall --with{,out}-systemd to gate the generation of the
> service/unit/mount/etc files.
>
> Rerun autogen.sh, and mark the dependency as removed in the build containers.
>
> Signed-off-by: Andrew Cooper

Reviewed-by: Juergen Gross

Juergen
Re: [PATCH v2 4/4] tools: Drop libsystemd as a dependency
On 23/05/2024 9:27 am, Jürgen Groß wrote:
> On 16.05.24 20:58, Andrew Cooper wrote:
>> diff --git a/automation/build/archlinux/current.dockerfile b/automation/build/archlinux/current.dockerfile
>> index 3e37ab5c40c1..d29f1358c2bd 100644
>> --- a/automation/build/archlinux/current.dockerfile
>> +++ b/automation/build/archlinux/current.dockerfile
>> @@ -37,6 +37,7 @@ RUN pacman -S --refresh --sysupgrade --noconfirm --noprogressbar --needed \
>>          sdl2 \
>>          spice \
>>          spice-protocol \
>> +        # systemd for Xen < 4.19
>
> Does this work as intended? A comment between the parameters and no "\" at the
> end of the line?

Sadly, yes. Comments are stripped out on a line-granular basis, prior to Docker interpreting the remainder.

This is the approved way to do comments in dockerfiles, and we already have other examples of this in our dockerfiles.

See e.g. a0e29b316363d9 for what I'll be doing with these comments in ~3y time.

~Andrew
Re: [PATCH v2 4/4] tools: Drop libsystemd as a dependency
On 16.05.24 20:58, Andrew Cooper wrote:
> There are no more users, and we want to dissuade people from introducing new
> users just for sd_notify() and friends. Drop the dependency.
>
> We still want the overall --with{,out}-systemd to gate the generation of the
> service/unit/mount/etc files.
>
> Rerun autogen.sh, and mark the dependency as removed in the build containers.
>
> Signed-off-by: Andrew Cooper
> ---
> CC: Anthony PERARD
> CC: Juergen Gross
> CC: George Dunlap
> CC: Jan Beulich
> CC: Stefano Stabellini
> CC: Julien Grall
> CC: Christian Lindig
> CC: Edwin Török
>
> v2:
>  * Only strip out the library check.
> ---
>  automation/build/archlinux/current.dockerfile | 1 +
>  .../build/suse/opensuse-leap.dockerfile | 1 +
>  .../build/suse/opensuse-tumbleweed.dockerfile | 1 +
>  automation/build/ubuntu/focal.dockerfile | 1 +
>  config/Tools.mk.in | 2 -
>  m4/systemd.m4 | 9 -
>  tools/configure | 256 --
>  7 files changed, 4 insertions(+), 267 deletions(-)
>
> diff --git a/automation/build/archlinux/current.dockerfile b/automation/build/archlinux/current.dockerfile
> index 3e37ab5c40c1..d29f1358c2bd 100644
> --- a/automation/build/archlinux/current.dockerfile
> +++ b/automation/build/archlinux/current.dockerfile
> @@ -37,6 +37,7 @@ RUN pacman -S --refresh --sysupgrade --noconfirm --noprogressbar --needed \
>          sdl2 \
>          spice \
>          spice-protocol \
> +        # systemd for Xen < 4.19

Does this work as intended? A comment between the parameters and no "\" at the end of the line?

Juergen
Re: [PATCH v2 3/4] tools/{c,o}xenstored: Don't link against libsystemd
On 16.05.24 20:58, Andrew Cooper wrote: Use the local freestanding wrapper instead. Signed-off-by: Andrew Cooper Reviewed-by: Juergen Gross # tools/xenstored Juergen --- CC: Anthony PERARD CC: Juergen Gross CC: George Dunlap CC: Jan Beulich CC: Stefano Stabellini CC: Julien Grall CC: Christian Lindig CC: Edwin Török v2: * Redo almost from scratch, using the freestanding wrapper instead. --- tools/ocaml/xenstored/Makefile| 2 -- tools/ocaml/xenstored/systemd_stubs.c | 2 +- tools/xenstored/Makefile | 5 - tools/xenstored/posix.c | 4 ++-- 4 files changed, 3 insertions(+), 10 deletions(-) diff --git a/tools/ocaml/xenstored/Makefile b/tools/ocaml/xenstored/Makefile index e8aaecf2e630..a8b8bb64698e 100644 --- a/tools/ocaml/xenstored/Makefile +++ b/tools/ocaml/xenstored/Makefile @@ -4,8 +4,6 @@ include $(OCAML_TOPLEVEL)/common.make # Include configure output (config.h) CFLAGS += -include $(XEN_ROOT)/tools/config.h -CFLAGS-$(CONFIG_SYSTEMD) += $(SYSTEMD_CFLAGS) -LDFLAGS-$(CONFIG_SYSTEMD) += $(SYSTEMD_LIBS) CFLAGS += $(CFLAGS-y) CFLAGS += $(APPEND_CFLAGS) diff --git a/tools/ocaml/xenstored/systemd_stubs.c b/tools/ocaml/xenstored/systemd_stubs.c index f4c875075abe..7dbbdd35bf30 100644 --- a/tools/ocaml/xenstored/systemd_stubs.c +++ b/tools/ocaml/xenstored/systemd_stubs.c @@ -25,7 +25,7 @@ #if defined(HAVE_SYSTEMD) -#include +#include CAMLprim value ocaml_sd_notify_ready(value ignore) { diff --git a/tools/xenstored/Makefile b/tools/xenstored/Makefile index e0897ed1ba30..09adfe1d5064 100644 --- a/tools/xenstored/Makefile +++ b/tools/xenstored/Makefile @@ -9,11 +9,6 @@ xenstored: LDLIBS += $(LDLIBS_libxenctrl) xenstored: LDLIBS += -lrt xenstored: LDLIBS += $(SOCKET_LIBS) -ifeq ($(CONFIG_SYSTEMD),y) -$(XENSTORED_OBJS-y): CFLAGS += $(SYSTEMD_CFLAGS) -xenstored: LDLIBS += $(SYSTEMD_LIBS) -endif - TARGETS := xenstored .PHONY: all diff --git a/tools/xenstored/posix.c b/tools/xenstored/posix.c index d88c82d972d7..6037d739d013 100644 --- a/tools/xenstored/posix.c +++ 
b/tools/xenstored/posix.c @@ -27,7 +27,7 @@ #include #include #if defined(HAVE_SYSTEMD) -#include +#include #endif #include @@ -393,7 +393,7 @@ void late_init(bool live_update) #if defined(HAVE_SYSTEMD) if (!live_update) { sd_notify(1, "READY=1"); - fprintf(stderr, SD_NOTICE "xenstored is ready\n"); + fprintf(stderr, "xenstored is ready\n"); } #endif }
Re: [PATCH v4 2/2] drivers/char: Use sub-page ro API to make just xhci dbc cap RO
On 22.05.2024 17:39, Marek Marczykowski-Górecki wrote:
> Not the whole page, which may contain other registers too. The XHCI
> specification describes DbC as designed to be controlled by a different
> driver, but does not mandate placing registers on a separate page. In fact
> on Tiger Lake and newer (at least), this page does contain other registers
> that Linux tries to use. And with share=yes, a domU would use them too.
> Without this patch, PV dom0 would fail to initialize the controller,
> while HVM would be killed on EPT violation.
>
> With `share=yes`, this patch gives domU more access to the emulator
> (although a HVM with any emulated device already has plenty of it). This
> configuration is already documented as unsafe with untrusted guests and
> not security supported.
>
> Signed-off-by: Marek Marczykowski-Górecki

Reviewed-by: Jan Beulich
Re: [PATCH v2 2/4] tools: Import standalone sd_notify() implementation from systemd
On 16.05.24 20:58, Andrew Cooper wrote:
... in order to avoid linking against the whole of libsystemd.

Only minimal changes to the upstream copy, to function as a drop-in
replacement for sd_notify() and as a header-only library.

Signed-off-by: Andrew Cooper

With s/cleanup(sd_closep)/cleanup(xen_sd_closep)/

Reviewed-by: Juergen Gross

Juergen

---
CC: Anthony PERARD
CC: Juergen Gross
CC: George Dunlap
CC: Jan Beulich
CC: Stefano Stabellini
CC: Julien Grall
CC: Christian Lindig
CC: Edwin Török

v2:
 * New
---
 tools/include/xen-sd-notify.h | 98 +++
 1 file changed, 98 insertions(+)
 create mode 100644 tools/include/xen-sd-notify.h

diff --git a/tools/include/xen-sd-notify.h b/tools/include/xen-sd-notify.h
new file mode 100644
index ..eda9d8b22d9e
--- /dev/null
+++ b/tools/include/xen-sd-notify.h
@@ -0,0 +1,98 @@
+/* SPDX-License-Identifier: MIT-0 */
+
+/*
+ * Implement the systemd notify protocol without external dependencies.
+ * Supports both readiness notification on startup and on reloading,
+ * according to the protocol defined at:
+ * https://www.freedesktop.org/software/systemd/man/latest/sd_notify.html
+ * This protocol is guaranteed to be stable as per:
+ * https://systemd.io/PORTABILITY_AND_STABILITY/
+ *
+ * Differences from the upstream copy:
+ * - Rename/rework as a drop-in replacement for systemd/sd-daemon.h
+ * - Only take the subset Xen cares about
+ * - Respect -Wdeclaration-after-statement
+ */
+
+#ifndef XEN_SD_NOTIFY
+#define XEN_SD_NOTIFY
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+static inline void xen_sd_closep(int *fd) {
+  if (!fd || *fd < 0)
+    return;
+
+  close(*fd);
+  *fd = -1;
+}
+
+static inline int xen_sd_notify(const char *message) {
+  union sockaddr_union {
+    struct sockaddr sa;
+    struct sockaddr_un sun;
+  } socket_addr = {
+    .sun.sun_family = AF_UNIX,
+  };
+  size_t path_length, message_length;
+  ssize_t written;
+  const char *socket_path;
+  int __attribute__((cleanup(sd_closep))) fd = -1;
+
+  /* Verify the argument first */
+  if (!message)
+    return -EINVAL;
+
+  message_length = strlen(message);
+  if (message_length == 0)
+    return -EINVAL;
+
+  /* If the variable is not set, the protocol is a noop */
+  socket_path = getenv("NOTIFY_SOCKET");
+  if (!socket_path)
+    return 0; /* Not set? Nothing to do */
+
+  /* Only AF_UNIX is supported, with path or abstract sockets */
+  if (socket_path[0] != '/' && socket_path[0] != '@')
+    return -EAFNOSUPPORT;
+
+  path_length = strlen(socket_path);
+  /* Ensure there is room for NUL byte */
+  if (path_length >= sizeof(socket_addr.sun.sun_path))
+    return -E2BIG;
+
+  memcpy(socket_addr.sun.sun_path, socket_path, path_length);
+
+  /* Support for abstract socket */
+  if (socket_addr.sun.sun_path[0] == '@')
+    socket_addr.sun.sun_path[0] = 0;
+
+  fd = socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0);
+  if (fd < 0)
+    return -errno;
+
+  if (connect(fd, &socket_addr.sa, offsetof(struct sockaddr_un, sun_path) + path_length) != 0)
+    return -errno;
+
+  written = write(fd, message, message_length);
+  if (written != (ssize_t) message_length)
+    return written < 0 ? -errno : -EPROTO;
+
+  return 1; /* Notified! */
+}
+
+static inline int sd_notify(int unset_environment, const char *message) {
+    int r = xen_sd_notify(message);
+
+    if (unset_environment)
+        unsetenv("NOTIFY_SOCKET");
+
+    return r;
+}
+
+#endif /* XEN_SD_NOTIFY */
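
For readers following the address handling in xen_sd_notify(), here is a minimal standalone sketch of just that step: turning a NOTIFY_SOCKET-style string into a sockaddr_un plus the length handed to connect(). The helper name notify_addr_from_path() and its signature are hypothetical and not part of the patch; the checks mirror the ones in the quoted diff, and the memset() is an addition of this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical helper (not part of the patch): fill in a sockaddr_un from a
 * NOTIFY_SOCKET-style path and report the address length for connect().
 * Returns 0 on success or a negative errno value.
 */
static int notify_addr_from_path(const char *path, struct sockaddr_un *addr,
                                 socklen_t *len)
{
    size_t path_length;

    if (!path || !addr || !len)
        return -EINVAL;

    /* Only filesystem paths ('/') and abstract sockets ('@') are accepted. */
    if (path[0] != '/' && path[0] != '@')
        return -EAFNOSUPPORT;

    path_length = strlen(path);
    /* Leave room for a terminating NUL byte in sun_path. */
    if (path_length >= sizeof(addr->sun_path))
        return -E2BIG;

    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    memcpy(addr->sun_path, path, path_length);

    /* An abstract socket is marked by a leading NUL byte instead of '@'. */
    if (addr->sun_path[0] == '@')
        addr->sun_path[0] = 0;

    /* Header plus the bytes of the path; the trailing NUL is not counted. */
    *len = offsetof(struct sockaddr_un, sun_path) + path_length;

    return 0;
}
```

The length deliberately excludes the trailing NUL. For abstract sockets every byte within the given length is part of the name, so passing sizeof(struct sockaddr_un) instead would name a different socket.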
Re: [PATCH] x86/shadow: don't leave trace record field uninitialized
On Wed, 2024-05-22 at 12:17 +0200, Jan Beulich wrote:
> The emulation_count field is set only conditionally right now. Convert
> all field setting to an initializer, thus guaranteeing that field to be
> set to 0 (default initialized) when GUEST_PAGING_LEVELS != 3.
>
> While there also drop the "event" local variable, thus eliminating an
> instance of the being phased out u32 type.
>
> Coverity ID: 1598430
> Fixes: 9a86ac1aa3d2 ("xentrace 5/7: Additional tracing for the shadow code")
> Signed-off-by: Jan Beulich
Release-acked-by: Oleksii Kurochko

~ Oleksii

>
> --- a/xen/arch/x86/mm/shadow/multi.c
> +++ b/xen/arch/x86/mm/shadow/multi.c
> @@ -2093,20 +2093,18 @@ static inline void trace_shadow_emulate(
>              guest_l1e_t gl1e, write_val;
>              guest_va_t va;
>              uint32_t flags:29, emulation_count:3;
> -        } d;
> -        u32 event;
> -
> -        event = TRC_SHADOW_EMULATE | ((GUEST_PAGING_LEVELS-2)<<8);
> -
> -        d.gl1e = gl1e;
> -        d.write_val.l1 = this_cpu(trace_emulate_write_val);
> -        d.va = va;
> +        } d = {
> +            .gl1e = gl1e,
> +            .write_val.l1 = this_cpu(trace_emulate_write_val),
> +            .va = va,
>  #if GUEST_PAGING_LEVELS == 3
> -        d.emulation_count = this_cpu(trace_extra_emulation_count);
> +            .emulation_count = this_cpu(trace_extra_emulation_count),
>  #endif
> -        d.flags = this_cpu(trace_shadow_path_flags);
> +            .flags = this_cpu(trace_shadow_path_flags),
> +        };
>
> -        trace(event, sizeof(d), &d);
> +        trace(TRC_SHADOW_EMULATE | ((GUEST_PAGING_LEVELS - 2) << 8),
> +              sizeof(d), &d);
>      }
>  }
>  #endif /* CONFIG_HVM */
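
The guarantee the commit message relies on can be seen in isolation with a small sketch. C initializes to zero any member that an initializer list omits, including bit-fields, so leaving a designator out under a preprocessor condition still produces a fully defined record. The struct demo_record, the DEMO_HAS_COUNT switch and make_record() below are hypothetical stand-ins, not Xen code.

```c
#include <stdint.h>

/* Hypothetical stand-in for the trace record; not the real Xen layout. */
struct demo_record {
    uint64_t va;
    uint32_t flags:29, emulation_count:3;
};

/* Set to 1 to mimic the GUEST_PAGING_LEVELS == 3 case. */
#ifndef DEMO_HAS_COUNT
#define DEMO_HAS_COUNT 0
#endif

static struct demo_record make_record(uint64_t va, uint32_t flags,
                                      uint32_t count)
{
    /*
     * Members without a designator are zero-initialized, so
     * emulation_count is 0 whenever the #if branch is compiled out.
     */
    struct demo_record d = {
        .va = va,
#if DEMO_HAS_COUNT
        .emulation_count = count,
#endif
        .flags = flags,
    };

#if !DEMO_HAS_COUNT
    (void)count; /* unused in this configuration */
#endif

    return d;
}
```

With per-member assignment after an uninitialized declaration, the same #if would leave emulation_count holding whatever was on the stack, which is the kind of issue the Coverity ID in the commit message points at.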