Re: [PATCH 1/3] automation: Add arm32 dom0less testing

2023-02-13 Thread Michal Orzel
Hi Stefano,

On 14/02/2023 01:37, Stefano Stabellini wrote:
> 
> 
> On Mon, 13 Feb 2023, Michal Orzel wrote:
>> At the moment, we only perform a single arm32 test in our CI, checking
>> whether dom0 boots successfully or not. This is mostly because we do not
>> have any arm32 runners and we only execute a hypervisor only build.
>>
>> In order not to leave the arm32 testing in such a poor state, add a
>> script qemu-smoke-dom0less-arm32.sh to start testing true dom0less
>> configuration, in which case we do not need a dom0 with a toolstack.
>>
>> The script is mostly based on the one used for dom0 arm32 testing as well
>> as the one used for dom0less arm64 testing. We obtain Debian Bullseye
>> kernel and Alpine Linux busybox-based rootfs. Depending on the test
>> variant, we prepare a test case contained within domU_check variable,
>> that will be executed as part of /init script.
>>
>> Signed-off-by: Michal Orzel 
>> ---
>>  automation/gitlab-ci/test.yaml| 16 
>>  .../scripts/qemu-smoke-dom0less-arm32.sh  | 89 +++
>>  2 files changed, 105 insertions(+)
>>  create mode 100755 automation/scripts/qemu-smoke-dom0less-arm32.sh
>>
>> diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml
>> index ce543ef5c0ef..84ab1fee50a4 100644
>> --- a/automation/gitlab-ci/test.yaml
>> +++ b/automation/gitlab-ci/test.yaml
>> @@ -210,6 +210,22 @@ qemu-smoke-dom0-arm32-gcc-debug:
>>  - *arm32-test-needs
>>  - debian-unstable-gcc-arm32-debug
>>
>> +qemu-smoke-dom0less-arm32-gcc:
>> +  extends: .qemu-arm32
>> +  script:
>> +- ./automation/scripts/qemu-smoke-dom0less-arm32.sh 2>&1 | tee 
>> ${LOGFILE}
>> +  needs:
>> +- *arm32-test-needs
>> +- debian-unstable-gcc-arm32
>> +
>> +qemu-smoke-dom0less-arm32-gcc-debug:
>> +  extends: .qemu-arm32
>> +  script:
>> +- ./automation/scripts/qemu-smoke-dom0less-arm32.sh 2>&1 | tee 
>> ${LOGFILE}
>> +  needs:
>> +- *arm32-test-needs
>> +- debian-unstable-gcc-arm32-debug
>> +
>>  qemu-alpine-x86_64-gcc:
>>extends: .qemu-x86-64
>>script:
>> diff --git a/automation/scripts/qemu-smoke-dom0less-arm32.sh 
>> b/automation/scripts/qemu-smoke-dom0less-arm32.sh
>> new file mode 100755
>> index ..c81529cbbfd0
>> --- /dev/null
>> +++ b/automation/scripts/qemu-smoke-dom0less-arm32.sh
>> @@ -0,0 +1,89 @@
>> +#!/bin/bash
>> +
>> +set -ex
>> +
>> +test_variant=$1
>> +
>> +cd binaries
>> +# Use the kernel from Debian
>> +curl --fail --silent --show-error --location --output vmlinuz 
>> https://deb.debian.org/debian/dists/bullseye/main/installer-armhf/current/images/netboot/vmlinuz
>> +# Use a tiny initrd based on busybox from Alpine Linux
>> +curl --fail --silent --show-error --location --output initrd.tar.gz 
>> https://dl-cdn.alpinelinux.org/alpine/v3.15/releases/armhf/alpine-minirootfs-3.15.1-armhf.tar.gz
>> +
>> +if [ -z "${test_variant}" ]; then
>> +passed="generic test passed"
>> +domU_check="
>> +echo \"${passed}\"
>> +"
>> +fi
>> +
>> +# domU rootfs
>> +mkdir rootfs
>> +cd rootfs
>> +tar xvzf ../initrd.tar.gz
>> +echo "#!/bin/sh
>> +
>> +mount -t proc proc /proc
>> +mount -t sysfs sysfs /sys
>> +mount -t devtmpfs devtmpfs /dev
>> +${domU_check}
>> +/bin/sh" > init
>> +chmod +x init
>> +find . | cpio -H newc -o | gzip > ../initrd.gz
>> +cd ..
>> +
>> +# XXX QEMU looks for "efi-virtio.rom" even if it is unneeded
>> +curl -fsSLO https://github.com/qemu/qemu/raw/v5.2.0/pc-bios/efi-virtio.rom
>> +./qemu-system-arm \
>> +-machine virt \
>> +-machine virtualization=true \
>> +-smp 4 \
>> +-m 1024 \
>> +-serial stdio \
>> +-monitor none \
>> +-display none \
>> +-machine dumpdtb=virt.dtb
>> +
>> +# ImageBuilder
>> +echo 'MEMORY_START="0x4000"
>> +MEMORY_END="0x8000"
>> +
>> +DEVICE_TREE="virt.dtb"
>> +XEN="xen"
>> +XEN_CMD="console=dtuart bootscrub=0"
>> +
>> +DOMU_KERNEL[0]="vmlinuz"
>> +DOMU_RAMDISK[0]="initrd.gz"
>> +DOMU_MEM[0]="512"
>> +NUM_DOMUS=1
> 
> Since we are at it I would prefer to create both dom0 and a domU to make
> the test more interesting. The following works on gitlab-ci on top of
> this patch. Would you be up for adding it to this patch?
This is exactly what I had in my initial approach. However, because this would 
not differ
much from what we have in dom0 arm32 testing, I decided to go for true dom0less 
configuration.
I'm ok to always boot dom0less domU with dom0. In this case I might add an 
additional test to
run a true dom0less as well.

~Michal



Re: [PATCH] automation: Add container and build jobs to run cppcheck analysis

2023-02-13 Thread Michal Orzel
Hi Stefano,

On 14/02/2023 00:56, Stefano Stabellini wrote:
> 
> 
> On Mon, 13 Feb 2023, Michal Orzel wrote:
>> Add a debian container with cppcheck installation routine inside,
>> capable of performing cppcheck analysis on Xen-only build including
>> cross-builds for arm32 and arm64.
>>
>> Populate build jobs making use of that container to run cppcheck
>> analysis to produce a text report (xen-cppcheck.txt) containing the list
>> of all the findings.
>>
>> This patch does not aim at performing any sort of bisection. Cppcheck is
>> imperfect and for now, our goal is to at least be aware of its reports,
>> so that we can compare them with the ones produced by better tools and
>> to be able to see how these reports change as a result of further
>> infrastructure improvements (e.g. exception list, rules exclusion).
>>
>> Signed-off-by: Michal Orzel 
> 
> Thanks for the patch, very nice!
> 
> 
>> ---
>> For those interested in, here is a sample pipeline:
>> https://gitlab.com/xen-project/people/morzel/xen-orzelmichal/-/pipelines/775769167
>> ---
>>  .../build/debian/unstable-cppcheck.dockerfile | 37 +
>>  automation/gitlab-ci/build.yaml   | 40 +++
>>  automation/scripts/build  | 11 -
>>  3 files changed, 87 insertions(+), 1 deletion(-)
>>  create mode 100644 automation/build/debian/unstable-cppcheck.dockerfile
>>
>> diff --git a/automation/build/debian/unstable-cppcheck.dockerfile 
>> b/automation/build/debian/unstable-cppcheck.dockerfile
>> new file mode 100644
>> index ..39bcc50673c8
>> --- /dev/null
>> +++ b/automation/build/debian/unstable-cppcheck.dockerfile
>> @@ -0,0 +1,37 @@
>> +FROM debian:unstable
>> +LABEL maintainer.name="The Xen Project" \
>> +  maintainer.email="xen-devel@lists.xenproject.org"
>> +
>> +ENV DEBIAN_FRONTEND=noninteractive
>> +ENV CPPCHECK_VERSION=2.7
>> +ENV USER root
>> +
>> +RUN mkdir /build
>> +WORKDIR /build
>> +
>> +# dependencies for cppcheck and Xen-only build/cross-build
>> +RUN apt-get update && \
>> +apt-get --quiet --yes install \
>> +build-essential \
>> +curl \
>> +python-is-python3 \
>> +libpcre3-dev \
>> +flex \
>> +bison \
>> +gcc-arm-linux-gnueabihf \
>> +gcc-aarch64-linux-gnu
>> +
>> +# cppcheck release build (see cppcheck readme.md)
>> +RUN curl -fsSLO 
>> https://github.com/danmar/cppcheck/archive/"$CPPCHECK_VERSION".tar.gz && \
>> +tar xvzf "$CPPCHECK_VERSION".tar.gz && \
>> +cd cppcheck-"$CPPCHECK_VERSION" && \
>> +make install -j$(nproc) \
>> +MATCHCOMPILER=yes \
>> +FILESDIR=/usr/share/cppcheck \
>> +HAVE_RULES=yes CXXFLAGS="-O2 -DNDEBUG -Wall -Wno-sign-compare 
>> -Wno-unused-function"
>> +
>> +# clean
>> +RUN apt-get autoremove -y && \
>> +apt-get clean && \
>> +rm -rf /var/lib/apt/lists* /tmp/* /var/tmp/* && \
>> +rm -rf cppcheck-"$CPPCHECK_VERSION"* "$CPPCHECK_VERSION".tar.gz
>> diff --git a/automation/gitlab-ci/build.yaml 
>> b/automation/gitlab-ci/build.yaml
>> index a053c5c7325d..c8831ccbec7a 100644
>> --- a/automation/gitlab-ci/build.yaml
>> +++ b/automation/gitlab-ci/build.yaml
>> @@ -7,6 +7,7 @@
>>  paths:
>>- binaries/
>>- xen-config
>> +  - xen-cppcheck.txt
>>- '*.log'
>>- '*/*.log'
>>  when: always
>> @@ -145,6 +146,23 @@
>>variables:
>>  <<: *gcc
>>
>> +.arm64-cross-build-tmpl:
>> +  <<: *build
>> +  variables:
>> +XEN_TARGET_ARCH: arm64
>> +  tags:
>> +- x86_64
>> +
>> +.arm64-cross-build:
>> +  extends: .arm64-cross-build-tmpl
>> +  variables:
>> +debug: n
>> +
>> +.gcc-arm64-cross-build:
>> +  extends: .arm64-cross-build
>> +  variables:
>> +<<: *gcc
>> +
>>  .arm64-build-tmpl:
>><<: *build
>>variables:
>> @@ -679,6 +697,28 @@ archlinux-current-gcc-riscv64-debug-randconfig:
>>  EXTRA_FIXED_RANDCONFIG:
>>CONFIG_COVERAGE=n
>>
>> +# Cppcheck analysis jobs
>> +
>> +debian-unstable-gcc-cppcheck:
>> +  extends: .gcc-x86-64-build
>> +  variables:
>> +CONTAINER: debian:unstable-cppcheck
>> +CPPCHECK: y
>> +
>> +debian-unstable-gcc-arm32-cppcheck:
>> +  extends: .gcc-arm32-cross-build
>> +  variables:
>> +CONTAINER: debian:unstable-cppcheck
>> +CROSS_COMPILE: /usr/bin/arm-linux-gnueabihf-
>> +CPPCHECK: y
>> +
>> +debian-unstable-gcc-arm64-cppcheck:
>> +  extends: .gcc-arm64-cross-build
>> +  variables:
>> +CONTAINER: debian:unstable-cppcheck
>> +CROSS_COMPILE: /usr/bin/aarch64-linux-gnu-
>> +CPPCHECK: y
>> +
>>  ## Test artifacts common
>>
>>  .test-jobs-artifact-common:
>> diff --git a/automation/scripts/build b/automation/scripts/build
>> index f2f5e55bc04f..c219752d553e 100755
>> --- a/automation/scripts/build
>> +++ b/automation/scripts/build
>> @@ -38,7 +38,16 @@ cp xen/.config xen-config
>>  # Directory for the artefacts to be dumped into
>>  mkdir binaries
>>
>> -if [[ "${HYPERVISOR_ONLY}" == "y" ]]; then
>> +if [[ 

Re: [PATCH] xen: speed up grant-table reclaim

2023-02-13 Thread Juergen Gross

On 13.02.23 22:01, Demi Marie Obenour wrote:

On Mon, Feb 13, 2023 at 10:26:11AM +0100, Juergen Gross wrote:

On 07.02.23 03:10, Demi Marie Obenour wrote:

When a grant entry is still in use by the remote domain, Linux must put
it on a deferred list.  Normally, this list is very short, because
the PV network and block protocols expect the backend to unmap the grant
first.  However, Qubes OS's GUI protocol is subject to the constraints
of the X Window System, and as such winds up with the frontend unmapping
the window first.  As a result, the list can grow very large, resulting
in a massive memory leak and eventual VM freeze.

Fix this problem by bumping the number of entries that the VM will
attempt to free at each iteration to 1.  This is an ugly hack that
may well make a denial of service easier, but for Qubes OS that is less
bad than the problem Qubes OS users are facing today.



There really
needs to be a way for a frontend to be notified when the backend has
unmapped the grants.


Please remove this sentence from the commit message, or move it below the
"---" marker.


Will fix in v2.


There are still some flag bits unallocated in struct grant_entry_v1 or
struct grant_entry_header. You could suggest some patches for Xen to use
one of the bits as a marker to get an event from the hypervisor if a
grant with such a bit set has been unmapped.


That is indeed a good idea.  There are other problems with the grant
interface as well, but those can be dealt with later.


I have no idea, whether such an interface would be accepted by the
maintainers, though.


Additionally, a module parameter is provided to
allow tuning the reclaim speed.

The code previously used printk(KERN_DEBUG) whenever it had to defer
reclaiming a page because the grant was still mapped.  This resulted in
a large volume of log messages that bothered users.  Use pr_debug
instead, which suppresses the messages by default.  Developers can
enable them using the dynamic debug mechanism.

Fixes: QubesOS/qubes-issues#7410 (memory leak)
Fixes: QubesOS/qubes-issues#7359 (excessive logging)
Fixes: 569ca5b3f94c ("xen/gnttab: add deferred freeing logic")
Cc: sta...@vger.kernel.org
Signed-off-by: Demi Marie Obenour 
---
Anyone have suggestions for improving the grant mechanism?  Argo isn't
a good option, as in the GUI protocol there are substantial performance
wins to be had by using true shared memory.  Resending as I forgot the
Signed-off-by on the first submission.  Sorry about that.

   drivers/xen/grant-table.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 5c83d41..2c2faa7 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -355,14 +355,20 @@
   static void gnttab_handle_deferred(struct timer_list *);
   static DEFINE_TIMER(deferred_timer, gnttab_handle_deferred);
+static atomic64_t deferred_count;
+static atomic64_t leaked_count;
+static unsigned int free_per_iteration = 1;


As you are adding a kernel parameter to change this value, please set the
default to a value not potentially causing any DoS problems. Qubes OS can
still use a higher value then.


Do you have any suggestions?  I don’t know if this is actually a DoS
concern anymore.  Shrinking the interval between iterations would be.


Why don't you use today's value of 10 for the default?




+
   static void gnttab_handle_deferred(struct timer_list *unused)
   {
-   unsigned int nr = 10;
+   unsigned int nr = READ_ONCE(free_per_iteration);


I don't see why you are needing READ_ONCE() here.


free_per_iteration can be concurrently modified via sysfs.


My remark was based on the wrong assumption that ignore_limit could be
dropped.




+   const bool ignore_limit = nr == 0;


I don't think you need this extra variable, if you would ...


struct deferred_entry *first = NULL;
unsigned long flags;
+   size_t freed = 0;
spin_lock_irqsave(_list_lock, flags);
-   while (nr--) {
+   while ((ignore_limit || nr--) && !list_empty(_list)) {


... change this to "while ((!nr || nr--) ...".


nr-- evaluates to the old value of nr, so "while ((!nr | nr--)) ..." is
an infinite loop.


Ah, right.


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [RFC 01/10] x86: introduce AMD-V and Intel VT-x Kconfig options

2023-02-13 Thread Jan Beulich
On 13.02.2023 21:53, Boris Ostrovsky wrote:
> 
> On 2/13/23 11:41 AM, Jan Beulich wrote:
>> On 13.02.2023 17:30, Xenia Ragiadakou wrote:
>>> On 2/13/23 17:11, Jan Beulich wrote:
 On 13.02.2023 15:57, Xenia Ragiadakou wrote:
> --- a/xen/arch/x86/cpu/Makefile
> +++ b/xen/arch/x86/cpu/Makefile
> @@ -10,4 +10,6 @@ obj-y += intel.o
>obj-y += intel_cacheinfo.o
>obj-y += mwait-idle.o
>obj-y += shanghai.o
> -obj-y += vpmu.o vpmu_amd.o vpmu_intel.o
> +obj-y += vpmu.o
> +obj-$(CONFIG_AMD_SVM) += vpmu_amd.o
> +obj-$(CONFIG_INTEL_VMX) += vpmu_intel.o
 This code was moved from hvm/ to cpu/ for the specific reason of also
 being used by PV guests. (Sadly the comments at the top weren't
 corrected at that time.)
>>> Hmm, the init functions are prefixed with svm/vmx.
>>> Does vpmu depend on svm/vmx support?
>> There are interactions, but I don't think there's a dependency. You
>> may want to ask Boris (whom I have now added to Cc).
> 
> MSR intercept bits need to be manipulated, which is SVM/VMX-specific.

But that's "interaction", not "dependency" aiui: The intercept bits aren't
relevant for PV guests, are they?

Jan



Re: [PATCH] x86/platform: make XENPF_get_dom0_console actually usable

2023-02-13 Thread Jan Beulich
On 13.02.2023 15:51, Jan Beulich wrote:
> struct dom0_vga_console_info has been extended in the past, and it may
> be extended again. The use in PV Dom0's start info already covers for
> that by supplying the size of the provided data. For the recently
> introduced platform-op size needs providing similarly. Go the easiest
> available route and simply supply size via the hypercall return value.
> 
> While there also add a build-time check that possibly future growth of
> the struct won't affect xen_platform_op_t's size.
> 
> Fixes: 4dd160583c79 ("x86/platform: introduce hypercall to get initial video 
> console settings")
> Signed-off-by: Jan Beulich 
> ---
> Noticed while trying to actually use the new hypercall in Linux.
> 
> --- a/xen/arch/x86/platform_hypercall.c
> +++ b/xen/arch/x86/platform_hypercall.c
> @@ -841,6 +841,8 @@ ret_t do_platform_op(
>  
>  #ifdef CONFIG_VIDEO
>  case XENPF_get_dom0_console:
> +BUILD_BUG_ON(sizeof(op->u.dom0_console) > sizeof(op->u.pad));
> +ret = sizeof(op->u.dom0_console);
>  if ( !fill_console_start_info(>u.dom0_console) )
>  {
>  ret = -ENODEV;
> 

Thinking about it, I've added the hunk below.

Jan

--- a/xen/include/public/platform.h
+++ b/xen/include/public/platform.h
@@ -605,7 +605,11 @@ struct xenpf_symdata {
 typedef struct xenpf_symdata xenpf_symdata_t;
 DEFINE_XEN_GUEST_HANDLE(xenpf_symdata_t);
 
-/* Fetch the video console information and mode setup by Xen. */
+/*
+ * Fetch the video console information and mode setup by Xen.  A non-
+ * negative return value indicates the size of the (part of the) structure
+ * which was filled.
+ */
 #define XENPF_get_dom0_console 64
 typedef struct dom0_vga_console_info xenpf_dom0_console_t;
 DEFINE_XEN_GUEST_HANDLE(xenpf_dom0_console_t);




[linux-linus test] 177222: tolerable trouble: fail/pass/starved - PUSHED

2023-02-13 Thread osstest service owner
flight 177222 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/177222/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 177177
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 177177
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 177177
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 177177
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 177177
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-examine  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-credit1   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   starved  n/a
 build-armhf-libvirt   1 build-check(1)   starved  n/a
 build-armhf   2 hosts-allocate   starved  n/a

version targeted for testing:
 linuxf6feea56f66d34259c4222fa02e8171c4f2673d1
baseline version:
 linuxceaa837f96adb69c0df0397937cd74991d5d821a

Last test of basis   177177  2023-02-13 13:41:39 Z0 days
Testing same since   177222  2023-02-14 00:11:11 Z0 days1 attempts


People who touched revisions under test:
  Alexander Mikhalitsyn 
  Andrew Morton 
  Arnd Bergmann 
  Catalin Marinas 
  Christophe Leroy 
  David E. Box 
  David Hildenbrand 
  Gayatri Kammela 
  Hans de Goede 
  Isaac J. Manjarres 
  Jeff Moyer 
  Jeff Xie 
  Kefeng Wang 
  Kuan-Ying Lee 
  Li Lingfeng 
  Linus Torvalds 
  Michal Hocko 
  Mike Rapoport (IBM) 
  Qi Zheng 
  Seth Jenkins 
  Shiyang Ruan 
  Tejun Heo 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  starved 
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  pass
 build-armhf-libvirt  starved 
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-arm64-pvopspass
 build-armhf-pvops 

[PATCH 2/2] x86/mem_sharing: Add extra variable to track fork progress

2023-02-13 Thread Tamas K Lengyel
When VM forking is initiated a VM is not supposed to try to perform mem_sharing
yet as the fork process hasn't completed all required steps. However, the vCPU
bring-up paths trigger guest memory accesses that can lead to such premature
sharing ops. However, the gating check to see whether a VM is a fork is set
already (ie. the domain has a parent).

In this patch we introduce a separate variable to store the information that
forking has been initiated so that the non-restartable part of the forking op
is not repeated and we avoid the premature sharing ops from triggering.

Signed-off-by: Tamas K Lengyel 
---
 xen/arch/x86/include/asm/hvm/domain.h |  8 
 xen/arch/x86/mm/mem_sharing.c | 10 +++---
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/include/asm/hvm/domain.h 
b/xen/arch/x86/include/asm/hvm/domain.h
index 698455444e..76a08b55f9 100644
--- a/xen/arch/x86/include/asm/hvm/domain.h
+++ b/xen/arch/x86/include/asm/hvm/domain.h
@@ -33,6 +33,14 @@ struct mem_sharing_domain
 {
 bool enabled, block_interrupts;
 
+/*
+ * We need to avoid trying to nominate pages for forking until the
+ * fork operation is completely finished. As parts of fork creation
+ * is restartable we mark here if the process started to skip the
+ * non-restartable portion.
+ */
+bool fork_started;
+
 /*
  * When releasing shared gfn's in a preemptible manner, recall where
  * to resume the search.
diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index 649d93dc54..b55c5bccdd 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -1888,11 +1888,12 @@ static int copy_settings(struct domain *cd, struct 
domain *d)
 static int fork(struct domain *cd, struct domain *d)
 {
 int rc = -EBUSY;
+struct mem_sharing_domain *msd = >arch.hvm.mem_sharing;
 
 if ( !cd->controller_pause_count )
 return rc;
 
-if ( !cd->parent )
+if ( !msd->fork_started )
 {
 if ( !get_domain(d) )
 {
@@ -1905,7 +1906,7 @@ static int fork(struct domain *cd, struct domain *d)
 *cd->arch.cpuid = *d->arch.cpuid;
 *cd->arch.msr = *d->arch.msr;
 cd->vmtrace_size = d->vmtrace_size;
-cd->parent = d;
+msd->fork_started = 1;
 }
 
 /* This is preemptible so it's the first to get done */
@@ -1918,8 +1919,11 @@ static int fork(struct domain *cd, struct domain *d)
 rc = copy_settings(cd, d);
 
  done:
-if ( rc && rc != -ERESTART )
+if ( !rc )
+cd->parent = d;
+else if ( rc != -ERESTART )
 {
+msd->fork_started = 0;
 cd->parent = NULL;
 domain_unpause(d);
 put_domain(d);
-- 
2.34.1




[PATCH 1/2] x86: Perform mem_sharing teardown before paging teardown

2023-02-13 Thread Tamas K Lengyel
An assert failure has been observed at p2m-basic.c:173 when performing vm
forking and then destroying the forked VM. The assert checks whether the
domain's shared pages counter is 0. According to the patch that originally
added the assert (7bedbbb5c31) the p2m_teardown should only happen after
mem_sharing already relinquished all shared pages.

In this patch we flip the order in which relinquish ops are called to avoid
tripping the assert.

Signed-off-by: Tamas K Lengyel 
---
Note: it is unclear why this assert hasn't tripped in the past. It hasn't
been observed with 4.17-rc1 but it is in RELEASE-4.17.0
---
 xen/arch/x86/domain.c | 52 +--
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index db3ebf062d..453ec52b6a 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -2310,6 +2310,32 @@ int domain_relinquish_resources(struct domain *d)
 if ( ret )
 return ret;
 
+#ifdef CONFIG_MEM_SHARING
+PROGRESS(shared):
+
+if ( is_hvm_domain(d) )
+{
+/* If the domain has shared pages, relinquish them allowing
+ * for preemption. */
+ret = relinquish_shared_pages(d);
+if ( ret )
+return ret;
+
+/*
+ * If the domain is forked, decrement the parent's pause count
+ * and release the domain.
+ */
+if ( mem_sharing_is_fork(d) )
+{
+struct domain *parent = d->parent;
+
+d->parent = NULL;
+domain_unpause(parent);
+put_domain(parent);
+}
+}
+#endif
+
 PROGRESS(paging):
 
 /* Tear down paging-assistance stuff. */
@@ -2350,32 +2376,6 @@ int domain_relinquish_resources(struct domain *d)
 d->arch.auto_unmask = 0;
 }
 
-#ifdef CONFIG_MEM_SHARING
-PROGRESS(shared):
-
-if ( is_hvm_domain(d) )
-{
-/* If the domain has shared pages, relinquish them allowing
- * for preemption. */
-ret = relinquish_shared_pages(d);
-if ( ret )
-return ret;
-
-/*
- * If the domain is forked, decrement the parent's pause count
- * and release the domain.
- */
-if ( mem_sharing_is_fork(d) )
-{
-struct domain *parent = d->parent;
-
-d->parent = NULL;
-domain_unpause(parent);
-put_domain(parent);
-}
-}
-#endif
-
 spin_lock(>page_alloc_lock);
 page_list_splice(>arch.relmem_list, >page_list);
 INIT_PAGE_LIST_HEAD(>arch.relmem_list);
-- 
2.34.1




[ovmf test] 177223: all pass - PUSHED

2023-02-13 Thread osstest service owner
flight 177223 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/177223/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf b3f321f2d7871868951cf73edb8fa4d5a88854a5
baseline version:
 ovmf 93a21b465bda44cecdd6347ad481ca6f55286547

Last test of basis   176815  2023-02-10 04:16:45 Z3 days
Testing same since   177223  2023-02-14 00:11:46 Z0 days1 attempts


People who touched revisions under test:
  Michael Kubacki 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   93a21b465b..b3f321f2d7  b3f321f2d7871868951cf73edb8fa4d5a88854a5 -> 
xen-tested-master



linux-next: duplicate patches in the xen-tip tree

2023-02-13 Thread Stephen Rothwell
Hi all,

The following commits are also in the tip tree as different commits
(but the same patches):

  415dab3c1796 ("drivers/xen/hypervisor: Expose Xen SIF flags to userspace")
  336f560a8917 ("x86/xen: don't let xen_pv_play_dead() return")
  f697cb00afa9 ("x86/xen: mark xen_pv_play_dead() as __noreturn")

These are commits

  859761e770f8 ("drivers/xen/hypervisor: Expose Xen SIF flags to userspace")
  076cbf5d2163 ("x86/xen: don't let xen_pv_play_dead() return")
  1aff0d2658e5 ("x86/xen: mark xen_pv_play_dead() as __noreturn")

in the tip tree.

-- 
Cheers,
Stephen Rothwell


pgpVcQQU8z1kS.pgp
Description: OpenPGP digital signature


[xen-unstable test] 177171: tolerable trouble: fail/pass/starved - PUSHED

2023-02-13 Thread osstest service owner
flight 177171 xen-unstable real [real]
flight 177219 xen-unstable real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/177171/
http://logs.test-lab.xenproject.org/osstest/logs/177219/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qemuu-win7-amd64 12 windows-install fail pass in 
177219-retest

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop  fail in 177219 like 177125
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 177125
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 177125
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 177125
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 177125
 test-amd64-amd64-xl-qcow221 guest-start/debian.repeatfail  like 177125
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 177125
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 177125
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 177125
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 177125
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 build-armhf-libvirt   1 build-check(1)   starved  n/a
 test-armhf-armhf-examine  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-credit1   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   starved  n/a
 build-armhf   2 hosts-allocate   starved  n/a

version targeted for testing:
 xen  f4f498d08d50259b9f25c274fd7b1e8ccf5693af
baseline version:
 xen  01e7477d1b081cff4288ff9f51ec59ee94c03ee0

Last test of basis   177125  2023-02-13 01:53:31 Z0 days
Testing same since   177171  2023-02-13 12:37:06 Z0 days1 attempts


People who touched revisions under test:
  Alistair Francis 
  Andrew Cooper 
  Jan Beulich 
  Juergen Gross 
  Oleksii Kurochko 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64-xtf  

Re: [PATCH 1/3] automation: Add arm32 dom0less testing

2023-02-13 Thread Stefano Stabellini
On Mon, 13 Feb 2023, Michal Orzel wrote:
> At the moment, we only perform a single arm32 test in our CI, checking
> whether dom0 boots successfully or not. This is mostly because we do not
> have any arm32 runners and we only execute a hypervisor only build.
> 
> In order not to leave the arm32 testing in such a poor state, add a
> script qemu-smoke-dom0less-arm32.sh to start testing true dom0less
> configuration, in which case we do not need a dom0 with a toolstack.
> 
> The script is mostly based on the one used for dom0 arm32 testing as well
> as the one used for dom0less arm64 testing. We obtain Debian Bullseye
> kernel and Alpine Linux busybox-based rootfs. Depending on the test
> variant, we prepare a test case contained within domU_check variable,
> that will be executed as part of /init script.
> 
> Signed-off-by: Michal Orzel 
> ---
>  automation/gitlab-ci/test.yaml| 16 
>  .../scripts/qemu-smoke-dom0less-arm32.sh  | 89 +++
>  2 files changed, 105 insertions(+)
>  create mode 100755 automation/scripts/qemu-smoke-dom0less-arm32.sh
> 
> diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml
> index ce543ef5c0ef..84ab1fee50a4 100644
> --- a/automation/gitlab-ci/test.yaml
> +++ b/automation/gitlab-ci/test.yaml
> @@ -210,6 +210,22 @@ qemu-smoke-dom0-arm32-gcc-debug:
>  - *arm32-test-needs
>  - debian-unstable-gcc-arm32-debug
>  
> +qemu-smoke-dom0less-arm32-gcc:
> +  extends: .qemu-arm32
> +  script:
> +- ./automation/scripts/qemu-smoke-dom0less-arm32.sh 2>&1 | tee ${LOGFILE}
> +  needs:
> +- *arm32-test-needs
> +- debian-unstable-gcc-arm32
> +
> +qemu-smoke-dom0less-arm32-gcc-debug:
> +  extends: .qemu-arm32
> +  script:
> +- ./automation/scripts/qemu-smoke-dom0less-arm32.sh 2>&1 | tee ${LOGFILE}
> +  needs:
> +- *arm32-test-needs
> +- debian-unstable-gcc-arm32-debug
> +
>  qemu-alpine-x86_64-gcc:
>extends: .qemu-x86-64
>script:
> diff --git a/automation/scripts/qemu-smoke-dom0less-arm32.sh 
> b/automation/scripts/qemu-smoke-dom0less-arm32.sh
> new file mode 100755
> index ..c81529cbbfd0
> --- /dev/null
> +++ b/automation/scripts/qemu-smoke-dom0less-arm32.sh
> @@ -0,0 +1,89 @@
> +#!/bin/bash
> +
> +set -ex
> +
> +test_variant=$1
> +
> +cd binaries
> +# Use the kernel from Debian
> +curl --fail --silent --show-error --location --output vmlinuz 
> https://deb.debian.org/debian/dists/bullseye/main/installer-armhf/current/images/netboot/vmlinuz
> +# Use a tiny initrd based on busybox from Alpine Linux
> +curl --fail --silent --show-error --location --output initrd.tar.gz 
> https://dl-cdn.alpinelinux.org/alpine/v3.15/releases/armhf/alpine-minirootfs-3.15.1-armhf.tar.gz
> +
> +if [ -z "${test_variant}" ]; then
> +passed="generic test passed"
> +domU_check="
> +echo \"${passed}\"
> +"
> +fi
> +
> +# domU rootfs
> +mkdir rootfs
> +cd rootfs
> +tar xvzf ../initrd.tar.gz
> +echo "#!/bin/sh
> +
> +mount -t proc proc /proc
> +mount -t sysfs sysfs /sys
> +mount -t devtmpfs devtmpfs /dev
> +${domU_check}
> +/bin/sh" > init
> +chmod +x init
> +find . | cpio -H newc -o | gzip > ../initrd.gz
> +cd ..
> +
> +# XXX QEMU looks for "efi-virtio.rom" even if it is unneeded
> +curl -fsSLO https://github.com/qemu/qemu/raw/v5.2.0/pc-bios/efi-virtio.rom
> +./qemu-system-arm \
> +-machine virt \
> +-machine virtualization=true \
> +-smp 4 \
> +-m 1024 \
> +-serial stdio \
> +-monitor none \
> +-display none \
> +-machine dumpdtb=virt.dtb
> +
> +# ImageBuilder
> +echo 'MEMORY_START="0x4000"
> +MEMORY_END="0x8000"
> +
> +DEVICE_TREE="virt.dtb"
> +XEN="xen"
> +XEN_CMD="console=dtuart bootscrub=0"
> +
> +DOMU_KERNEL[0]="vmlinuz"
> +DOMU_RAMDISK[0]="initrd.gz"
> +DOMU_MEM[0]="512"
> +NUM_DOMUS=1

Since we are at it I would prefer to create both dom0 and a domU to make
the test more interesting. The following works on gitlab-ci on top of
this patch. Would you be up for adding it to this patch?


diff --git a/automation/scripts/qemu-smoke-dom0less-arm32.sh 
b/automation/scripts/qemu-smoke-dom0less-arm32.sh
index b420ee444f..6e85bca21c 100755
--- a/automation/scripts/qemu-smoke-dom0less-arm32.sh
+++ b/automation/scripts/qemu-smoke-dom0less-arm32.sh
@@ -60,7 +60,7 @@ curl -fsSLO 
https://github.com/qemu/qemu/raw/v5.2.0/pc-bios/efi-virtio.rom
 -machine virt \
 -machine virtualization=true \
 -smp 4 \
--m 1024 \
+-m 2048 \
 -serial stdio \
 -monitor none \
 -display none \
@@ -68,11 +68,14 @@ curl -fsSLO 
https://github.com/qemu/qemu/raw/v5.2.0/pc-bios/efi-virtio.rom
 
 # ImageBuilder
 echo 'MEMORY_START="0x4000"
-MEMORY_END="0x8000"
+MEMORY_END="0xC000"
 
 DEVICE_TREE="virt.dtb"
 XEN="xen"
-XEN_CMD="console=dtuart bootscrub=0"
+DOM0_KERNEL="vmlinuz"
+DOM0_RAMDISK="initrd.gz"
+DOM0_CMD="console=hvc0 earlyprintk clk_ignore_unused root=/dev/ram0 
rdinit=/bin/sh"
+XEN_CMD="console=dtuart dom0_mem=512M bootscrub=0"
 
 

Re: [PATCH 3/3] automation: Add a gzip compressed kernel image test on arm32

2023-02-13 Thread Stefano Stabellini
On Mon, 13 Feb 2023, Michal Orzel wrote:
> Xen supports booting gzip compressed kernels, therefore add a new test
> job qemu-smoke-dom0less-arm32-gcc-gzip in debug and non-debug variant
> that will execute qemu-smoke-dom0less-arm32.sh script to test booting
> a domU with a gzip compressed kernel image (in our case zImage).
> 
> By passing "gzip" as a test variant, the test will call gzip command
> to compress kernel image and use it in ImageBuilder configuration.
> No need for a specific check to be executed from domU. Being able to
> see a test message from a boot log is sufficient to mark a test as
> passed.
> 
> Signed-off-by: Michal Orzel 

Reviewed-by: Stefano Stabellini 


> ---
>  automation/gitlab-ci/test.yaml  | 16 
>  automation/scripts/qemu-smoke-dom0less-arm32.sh | 13 +
>  2 files changed, 29 insertions(+)
> 
> diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml
> index c2bcc5d4d3e5..be7a55d89708 100644
> --- a/automation/gitlab-ci/test.yaml
> +++ b/automation/gitlab-ci/test.yaml
> @@ -242,6 +242,22 @@ qemu-smoke-dom0less-arm32-gcc-debug-staticmem:
>  - *arm32-test-needs
>  - debian-unstable-gcc-arm32-debug-staticmem
>  
> +qemu-smoke-dom0less-arm32-gcc-gzip:
> +  extends: .qemu-arm32
> +  script:
> +- ./automation/scripts/qemu-smoke-dom0less-arm32.sh gzip 2>&1 | tee 
> ${LOGFILE}
> +  needs:
> +- *arm32-test-needs
> +- debian-unstable-gcc-arm32
> +
> +qemu-smoke-dom0less-arm32-gcc-debug-gzip:
> +  extends: .qemu-arm32
> +  script:
> +- ./automation/scripts/qemu-smoke-dom0less-arm32.sh gzip 2>&1 | tee 
> ${LOGFILE}
> +  needs:
> +- *arm32-test-needs
> +- debian-unstable-gcc-arm32-debug
> +
>  qemu-alpine-x86_64-gcc:
>extends: .qemu-x86-64
>script:
> diff --git a/automation/scripts/qemu-smoke-dom0less-arm32.sh 
> b/automation/scripts/qemu-smoke-dom0less-arm32.sh
> index f89ee8b6464a..b420ee444f2a 100755
> --- a/automation/scripts/qemu-smoke-dom0less-arm32.sh
> +++ b/automation/scripts/qemu-smoke-dom0less-arm32.sh
> @@ -30,6 +30,15 @@ fi
>  "
>  fi
>  
> +if [[ "${test_variant}" == "gzip" ]]; then
> +# Compress kernel image with gzip
> +gzip vmlinuz
> +passed="${test_variant} test passed"
> +domU_check="
> +echo \"${passed}\"
> +"
> +fi
> +
>  # domU rootfs
>  mkdir rootfs
>  cd rootfs
> @@ -79,6 +88,10 @@ if [[ "${test_variant}" == "static-mem" ]]; then
>  echo -e "\nDOMU_STATIC_MEM[0]=\"${domu_base} ${domu_size}\"" >> config
>  fi
>  
> +if [[ "${test_variant}" == "gzip" ]]; then
> +sed -i "s/vmlinuz/vmlinuz.gz/g" config
> +fi
> +
>  rm -rf imagebuilder
>  git clone https://gitlab.com/ViryaOS/imagebuilder
>  bash imagebuilder/scripts/uboot-script-gen -t tftp -d . -c config
> -- 
> 2.25.1
> 



Re: [PATCH 2/3] automation: Add a static memory allocation test on arm32

2023-02-13 Thread Stefano Stabellini
On Mon, 13 Feb 2023, Michal Orzel wrote:
> Add a new test job qemu-smoke-dom0less-arm32-gcc-staticmem in debug
> and non-debug variant that will execute qemu-smoke-dom0less-arm32.sh
> script to test static memory allocation feature. The test case itself
> is directly taken from dom0less arm64 testing.
> 
> Populate build jobs to compile Xen with config options necessary to
> enable static memory feature. Populate test jobs passing "static-mem"
> as a test variant. The test configures domU with a static memory region
> (direct-mapped) and adds a check using /proc/iomem to determine the
> memory region marked as "System RAM".
> 
> Signed-off-by: Michal Orzel 

Reviewed-by: Stefano Stabellini 


> ---
>  automation/gitlab-ci/build.yaml   | 20 +++
>  automation/gitlab-ci/test.yaml| 16 +++
>  .../scripts/qemu-smoke-dom0less-arm32.sh  | 17 
>  3 files changed, 53 insertions(+)
> 
> diff --git a/automation/gitlab-ci/build.yaml b/automation/gitlab-ci/build.yaml
> index a053c5c7325d..166182bc595f 100644
> --- a/automation/gitlab-ci/build.yaml
> +++ b/automation/gitlab-ci/build.yaml
> @@ -565,6 +565,26 @@ debian-unstable-gcc-arm32-debug-randconfig:
>  HYPERVISOR_ONLY: y
>  RANDCONFIG: y
>  
> +debian-unstable-gcc-arm32-staticmem:
> +  extends: .gcc-arm32-cross-build
> +  variables:
> +CONTAINER: debian:unstable-arm32-gcc
> +HYPERVISOR_ONLY: y
> +EXTRA_XEN_CONFIG: |
> +  CONFIG_EXPERT=y
> +  CONFIG_UNSUPPORTED=y
> +  CONFIG_STATIC_MEMORY=y
> +
> +debian-unstable-gcc-arm32-debug-staticmem:
> +  extends: .gcc-arm32-cross-build-debug
> +  variables:
> +CONTAINER: debian:unstable-arm32-gcc
> +HYPERVISOR_ONLY: y
> +EXTRA_XEN_CONFIG: |
> +  CONFIG_EXPERT=y
> +  CONFIG_UNSUPPORTED=y
> +  CONFIG_STATIC_MEMORY=y
> +
>  # Arm builds
>  
>  debian-unstable-gcc-arm64:
> diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml
> index 84ab1fee50a4..c2bcc5d4d3e5 100644
> --- a/automation/gitlab-ci/test.yaml
> +++ b/automation/gitlab-ci/test.yaml
> @@ -226,6 +226,22 @@ qemu-smoke-dom0less-arm32-gcc-debug:
>  - *arm32-test-needs
>  - debian-unstable-gcc-arm32-debug
>  
> +qemu-smoke-dom0less-arm32-gcc-staticmem:
> +  extends: .qemu-arm32
> +  script:
> +- ./automation/scripts/qemu-smoke-dom0less-arm32.sh static-mem 2>&1 | 
> tee ${LOGFILE}
> +  needs:
> +- *arm32-test-needs
> +- debian-unstable-gcc-arm32-staticmem
> +
> +qemu-smoke-dom0less-arm32-gcc-debug-staticmem:
> +  extends: .qemu-arm32
> +  script:
> +- ./automation/scripts/qemu-smoke-dom0less-arm32.sh static-mem 2>&1 | 
> tee ${LOGFILE}
> +  needs:
> +- *arm32-test-needs
> +- debian-unstable-gcc-arm32-debug-staticmem
> +
>  qemu-alpine-x86_64-gcc:
>extends: .qemu-x86-64
>script:
> diff --git a/automation/scripts/qemu-smoke-dom0less-arm32.sh 
> b/automation/scripts/qemu-smoke-dom0less-arm32.sh
> index c81529cbbfd0..f89ee8b6464a 100755
> --- a/automation/scripts/qemu-smoke-dom0less-arm32.sh
> +++ b/automation/scripts/qemu-smoke-dom0less-arm32.sh
> @@ -17,6 +17,19 @@ echo \"${passed}\"
>  "
>  fi
>  
> +if [[ "${test_variant}" == "static-mem" ]]; then
> +# Memory range that is statically allocated to domU1
> +domu_base="0x5000"
> +domu_size="0x2000"
> +passed="${test_variant} test passed"
> +domU_check="
> +mem_range=$(printf \"%08x-%08x\" ${domu_base} $(( ${domu_base} + 
> ${domu_size} - 1 )))
> +if grep -q -x \"\${mem_range} : System RAM\" /proc/iomem; then
> +echo \"${passed}\"
> +fi
> +"
> +fi
> +
>  # domU rootfs
>  mkdir rootfs
>  cd rootfs
> @@ -62,6 +75,10 @@ BOOT_CMD="bootm"
>  UBOOT_SOURCE="boot.source"
>  UBOOT_SCRIPT="boot.scr"' > config
>  
> +if [[ "${test_variant}" == "static-mem" ]]; then
> +echo -e "\nDOMU_STATIC_MEM[0]=\"${domu_base} ${domu_size}\"" >> config
> +fi
> +
>  rm -rf imagebuilder
>  git clone https://gitlab.com/ViryaOS/imagebuilder
>  bash imagebuilder/scripts/uboot-script-gen -t tftp -d . -c config
> -- 
> 2.25.1
> 



Re: [PATCH] automation: Add container and build jobs to run cppcheck analysis

2023-02-13 Thread Stefano Stabellini
On Mon, 13 Feb 2023, Michal Orzel wrote:
> Add a debian container with cppcheck installation routine inside,
> capable of performing cppcheck analysis on Xen-only build including
> cross-builds for arm32 and arm64.
> 
> Populate build jobs making use of that container to run cppcheck
> analysis to produce a text report (xen-cppcheck.txt) containing the list
> of all the findings.
> 
> This patch does not aim at performing any sort of bisection. Cppcheck is
> imperfect and for now, our goal is to at least be aware of its reports,
> so that we can compare them with the ones produced by better tools and
> to be able to see how these reports change as a result of further
> infrastructure improvements (e.g. exception list, rules exclusion).
> 
> Signed-off-by: Michal Orzel 

Thanks for the patch, very nice!


> ---
> For those interested in, here is a sample pipeline:
> https://gitlab.com/xen-project/people/morzel/xen-orzelmichal/-/pipelines/775769167
> ---
>  .../build/debian/unstable-cppcheck.dockerfile | 37 +
>  automation/gitlab-ci/build.yaml   | 40 +++
>  automation/scripts/build  | 11 -
>  3 files changed, 87 insertions(+), 1 deletion(-)
>  create mode 100644 automation/build/debian/unstable-cppcheck.dockerfile
> 
> diff --git a/automation/build/debian/unstable-cppcheck.dockerfile 
> b/automation/build/debian/unstable-cppcheck.dockerfile
> new file mode 100644
> index ..39bcc50673c8
> --- /dev/null
> +++ b/automation/build/debian/unstable-cppcheck.dockerfile
> @@ -0,0 +1,37 @@
> +FROM debian:unstable
> +LABEL maintainer.name="The Xen Project" \
> +  maintainer.email="xen-devel@lists.xenproject.org"
> +
> +ENV DEBIAN_FRONTEND=noninteractive
> +ENV CPPCHECK_VERSION=2.7
> +ENV USER root
> +
> +RUN mkdir /build
> +WORKDIR /build
> +
> +# dependencies for cppcheck and Xen-only build/cross-build
> +RUN apt-get update && \
> +apt-get --quiet --yes install \
> +build-essential \
> +curl \
> +python-is-python3 \
> +libpcre3-dev \
> +flex \
> +bison \
> +gcc-arm-linux-gnueabihf \
> +gcc-aarch64-linux-gnu
> +
> +# cppcheck release build (see cppcheck readme.md)
> +RUN curl -fsSLO 
> https://github.com/danmar/cppcheck/archive/"$CPPCHECK_VERSION".tar.gz && \
> +tar xvzf "$CPPCHECK_VERSION".tar.gz && \
> +cd cppcheck-"$CPPCHECK_VERSION" && \
> +make install -j$(nproc) \
> +MATCHCOMPILER=yes \
> +FILESDIR=/usr/share/cppcheck \
> +HAVE_RULES=yes CXXFLAGS="-O2 -DNDEBUG -Wall -Wno-sign-compare 
> -Wno-unused-function"
> +
> +# clean
> +RUN apt-get autoremove -y && \
> +apt-get clean && \
> +rm -rf /var/lib/apt/lists* /tmp/* /var/tmp/* && \
> +rm -rf cppcheck-"$CPPCHECK_VERSION"* "$CPPCHECK_VERSION".tar.gz
> diff --git a/automation/gitlab-ci/build.yaml b/automation/gitlab-ci/build.yaml
> index a053c5c7325d..c8831ccbec7a 100644
> --- a/automation/gitlab-ci/build.yaml
> +++ b/automation/gitlab-ci/build.yaml
> @@ -7,6 +7,7 @@
>  paths:
>- binaries/
>- xen-config
> +  - xen-cppcheck.txt
>- '*.log'
>- '*/*.log'
>  when: always
> @@ -145,6 +146,23 @@
>variables:
>  <<: *gcc
>  
> +.arm64-cross-build-tmpl:
> +  <<: *build
> +  variables:
> +XEN_TARGET_ARCH: arm64
> +  tags:
> +- x86_64
> +
> +.arm64-cross-build:
> +  extends: .arm64-cross-build-tmpl
> +  variables:
> +debug: n
> +
> +.gcc-arm64-cross-build:
> +  extends: .arm64-cross-build
> +  variables:
> +<<: *gcc
> +
>  .arm64-build-tmpl:
><<: *build
>variables:
> @@ -679,6 +697,28 @@ archlinux-current-gcc-riscv64-debug-randconfig:
>  EXTRA_FIXED_RANDCONFIG:
>CONFIG_COVERAGE=n
>  
> +# Cppcheck analysis jobs
> +
> +debian-unstable-gcc-cppcheck:
> +  extends: .gcc-x86-64-build
> +  variables:
> +CONTAINER: debian:unstable-cppcheck
> +CPPCHECK: y
> +
> +debian-unstable-gcc-arm32-cppcheck:
> +  extends: .gcc-arm32-cross-build
> +  variables:
> +CONTAINER: debian:unstable-cppcheck
> +CROSS_COMPILE: /usr/bin/arm-linux-gnueabihf-
> +CPPCHECK: y
> +
> +debian-unstable-gcc-arm64-cppcheck:
> +  extends: .gcc-arm64-cross-build
> +  variables:
> +CONTAINER: debian:unstable-cppcheck
> +CROSS_COMPILE: /usr/bin/aarch64-linux-gnu-
> +CPPCHECK: y
> +
>  ## Test artifacts common
>  
>  .test-jobs-artifact-common:
> diff --git a/automation/scripts/build b/automation/scripts/build
> index f2f5e55bc04f..c219752d553e 100755
> --- a/automation/scripts/build
> +++ b/automation/scripts/build
> @@ -38,7 +38,16 @@ cp xen/.config xen-config
>  # Directory for the artefacts to be dumped into
>  mkdir binaries
>  
> -if [[ "${HYPERVISOR_ONLY}" == "y" ]]; then
> +if [[ "${CPPCHECK}" == "y" ]]; then
> +# Cppcheck analysis invokes Xen-only build.

Given that when $CPPCHECK == y we are doing a hypervisor-only build,
what do you think of also specifying $HYPERVISOR_ONLY == y in these
cases?


[linux-linus test] 177177: tolerable trouble: fail/pass/starved - PUSHED

2023-02-13 Thread osstest service owner
flight 177177 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/177177/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 177104
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 177104
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 177104
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 177104
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 177104
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-examine  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-credit1   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   starved  n/a
 build-armhf-libvirt   1 build-check(1)   starved  n/a
 build-armhf   2 hosts-allocate   starved  n/a

version targeted for testing:
 linuxceaa837f96adb69c0df0397937cd74991d5d821a
baseline version:
 linux711e9a4d52bf4e477e51c7135e1e6188c42018d0

Last test of basis   177104  2023-02-12 20:42:38 Z1 days
Testing same since   177133  2023-02-13 03:40:57 Z0 days2 attempts


People who touched revisions under test:
  Geert Uytterhoeven 
  John Paul Adrian Glaubitz 
  Linus Torvalds 
  Steven Rostedt (Google) 
  Yafang Shao 
  Yoshinori Sato 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  starved 
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  pass
 build-armhf-libvirt  starved 
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-arm64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl  pass
 test-amd64-coresched-amd64-xl   

Re: [PATCH v3 0/7] Mini-OS: add minimal 9pfs support

2023-02-13 Thread Samuel Thibault
Thanks!!

Juergen Gross, le lun. 13 févr. 2023 09:44:05 +0100, a ecrit:
> This series is adding minimal support to use 9pfs in Mini-OS. It is
> adding a PV 9pfs frontend and the ability to open, close, read and
> write files.
> 
> The series has been tested with qemu as 9pfs backend in a PVH Mini-OS
> guest (I've used a slightly modified Xenstore-stubdom for that purpose
> in order to reuse the build runes).
> 
> This series is meant to setup the stage for adding file based logging
> support to Xenstore-stubdom and later to add live update support (being
> able to save the LU data stream in a dom0 file makes this a _lot_
> easier).
> 
> In order to keep Mini-OS's license I have only used the protocol docs
> available on the internet [1] and then verified those with the qemu 9pfs
> backend implementation (especially for supporting the 9P2000.u variant,
> as qemu doesn't support the basic 9P2000 protocol).
> 
> The needed fixed values of the protocol have been taken from [2].
> 
> [1]: http://ericvh.github.io/9p-rfc/rfc9p2000.html
> [2]: https://github.com/0intro/libixp
> 
> Changes in V2:
> - addressed comments by Samuel Thibault
> 
> Changes in V3:
> - addressed comments by Samuel Thibault and Andrew Cooper
> 
> Juergen Gross (7):
>   Mini-OS: xenbus: add support for reading node from directory
>   Mini-OS: add concept of mount points
>   Mini-OS: add support for runtime mounts
>   Mini-OS: add 9pfs frontend
>   Mini-OS: add 9pfs transport layer
>   Mini-OS: add open and close handling to the 9pfs frontend
>   Mini-OS: add read and write support to 9pfsfront
> 
>  9pfront.c | 1300 +
>  Config.mk |1 +
>  Makefile  |1 +
>  arch/x86/testbuild/all-no |1 +
>  arch/x86/testbuild/all-yes|1 +
>  arch/x86/testbuild/newxen-yes |1 +
>  include/9pfront.h |7 +
>  include/lib.h |   14 +
>  include/xenbus.h  |6 +
>  lib/sys.c |  123 +++-
>  xenbus.c  |   40 +-
>  11 files changed, 1473 insertions(+), 22 deletions(-)
>  create mode 100644 9pfront.c
>  create mode 100644 include/9pfront.h
> 
> -- 
> 2.35.3
> 

-- 
Samuel
---
Pour une évaluation indépendante, transparente et rigoureuse !
Je soutiens la Commission d'Évaluation de l'Inria.



Re: [PATCH v3 7/7] Mini-OS: add read and write support to 9pfsfront

2023-02-13 Thread Samuel Thibault
Juergen Gross, le lun. 13 févr. 2023 09:44:12 +0100, a ecrit:
> Add support to read from and write to a file handled by 9pfsfront.
> 
> Signed-off-by: Juergen Gross 

Reviewed-by: Samuel Thibault 

> ---
> V2:
> - add check for max message size
> - return EAGAIN in case no free request got (Samuel Thibault)
> - loop until all data read/written (Samuel Thibault)
> V3:
> - use an exact limit for read/write (Samuel Thibault)
> - log read(write errors (Samuel Thibault)
> ---
>  9pfront.c | 216 ++
>  1 file changed, 216 insertions(+)
> 
> diff --git a/9pfront.c b/9pfront.c
> index fb2e5669..5da8a365 100644
> --- a/9pfront.c
> +++ b/9pfront.c
> @@ -72,7 +72,10 @@ struct file_9pfs {
>  #define P9_CMD_WALK   110
>  #define P9_CMD_OPEN   112
>  #define P9_CMD_CREATE 114
> +#define P9_CMD_READ   116
> +#define P9_CMD_WRITE  118
>  #define P9_CMD_CLUNK  120
> +#define P9_CMD_STAT   124
>  
>  /* P9 protocol open flags. */
>  #define P9_OREAD0   /* read */
> @@ -88,11 +91,39 @@ struct p9_header {
>  uint16_t tag;
>  } __attribute__((packed));
>  
> +struct p9_stat {
> +uint16_t size;
> +uint16_t type;
> +uint32_t dev;
> +uint8_t qid[P9_QID_SIZE];
> +uint32_t mode;
> +uint32_t atime;
> +uint32_t mtime;
> +uint64_t length;
> +char *name;
> +char *uid;
> +char *gid;
> +char *muid;
> +char *extension;
> +uint32_t n_uid;
> +uint32_t n_gid;
> +uint32_t n_muid;
> +};
> +
>  #define P9_VERSION"9P2000.u"
>  #define P9_ROOT_FID   0
>  
>  static unsigned int ftype_9pfs;
>  
> +static void free_stat(struct p9_stat *stat)
> +{
> +free(stat->name);
> +free(stat->uid);
> +free(stat->gid);
> +free(stat->muid);
> +free(stat->extension);
> +}
> +
>  static unsigned int get_fid(struct dev_9pfs *dev)
>  {
>  unsigned int fid;
> @@ -181,9 +212,12 @@ static void copy_from_ring(struct dev_9pfs *dev, void 
> *data, unsigned int len)
>   *Only valid for sending.
>   * u: 2 byte unsigned integer (uint16_t)
>   * U: 4 byte unsigned integer (uint32_t)
> + * L: 8 byte unsigned integer (uint64_t)
>   * S: String (2 byte length +  characters)
>   *in the rcv_9p() case the data for string is allocated (length omitted,
>   *string terminated by a NUL character)
> + * D: Binary data (4 byte length +  bytes of data), requires a length
> + *and a buffer pointer parameter.
>   * Q: A 13 byte "qid", consisting of 1 byte file type, 4 byte file version
>   *and 8 bytes unique file id. Only valid for receiving.
>   */
> @@ -192,10 +226,12 @@ static void send_9p(struct dev_9pfs *dev, struct req 
> *req, const char *fmt, ...)
>  struct p9_header hdr;
>  va_list ap, aq;
>  const char *f;
> +uint64_t longval;
>  uint32_t intval;
>  uint16_t shortval;
>  uint16_t len;
>  uint8_t byte;
> +uint8_t *data;
>  char *strval;
>  
>  hdr.size = sizeof(hdr);
> @@ -221,11 +257,21 @@ static void send_9p(struct dev_9pfs *dev, struct req 
> *req, const char *fmt, ...)
>  hdr.size += 4;
>  intval = va_arg(aq, unsigned int);
>  break;
> +case 'L':
> +hdr.size += 8;
> +longval = va_arg(aq, uint64_t);
> +break;
>  case 'S':
>  hdr.size += 2;
>  strval = va_arg(aq, char *);
>  hdr.size += strlen(strval);
>  break;
> +case 'D':
> +hdr.size += 4;
> +intval = va_arg(aq, unsigned int);
> +hdr.size += intval;
> +data = va_arg(aq, uint8_t *);
> +break;
>  default:
>  printk("send_9p: unknown format character %c\n", *f);
>  break;
> @@ -258,12 +304,22 @@ static void send_9p(struct dev_9pfs *dev, struct req 
> *req, const char *fmt, ...)
>  intval = va_arg(ap, unsigned int);
>  copy_to_ring(dev, , sizeof(intval));
>  break;
> +case 'L':
> +longval = va_arg(ap, uint64_t);
> +copy_to_ring(dev, , sizeof(longval));
> +break;
>  case 'S':
>  strval = va_arg(ap, char *);
>  len = strlen(strval);
>  copy_to_ring(dev, , sizeof(len));
>  copy_to_ring(dev, strval, len);
>  break;
> +case 'D':
> +intval = va_arg(ap, unsigned int);
> +copy_to_ring(dev, , sizeof(intval));
> +data = va_arg(ap, uint8_t *);
> +copy_to_ring(dev, data, intval);
> +break;
>  }
>  }
>  
> @@ -348,6 +404,8 @@ static void rcv_9p_copy(struct dev_9pfs *dev, struct req 
> *req,
>  uint32_t err;
>  uint16_t *shortval;
>  uint32_t *val;
> +uint64_t *longval;
> +uint8_t *data;
>  char **strval;
>  uint8_t *qval;
>  
> @@ -412,6 +470,10 @@ static void rcv_9p_copy(struct dev_9pfs *dev, struct 

Re: [PATCH v3 6/7] Mini-OS: add open and close handling to the 9pfs frontend

2023-02-13 Thread Samuel Thibault
Juergen Gross, le lun. 13 févr. 2023 09:44:11 +0100, a ecrit:
> Add the open() and close() support to the 9pfs frontend. This requires
> to split the path name and to walk to the desired directory level.
> 
> Signed-off-by: Juergen Gross 

Reviewed-by: Samuel Thibault 

> ---
> V2:
> - check path to be canonical
> - avoid short read when walking the path
> - fix get_fid() (Samuel Thibault)
> - return EAGAIN if no free request got (Samuel Thibault)
> V3:
> - add check for "//" in path_canonical() (Samuel Thibault)
> ---
>  9pfront.c | 324 +-
>  1 file changed, 322 insertions(+), 2 deletions(-)
> 
> diff --git a/9pfront.c b/9pfront.c
> index 0b8d5461..fb2e5669 100644
> --- a/9pfront.c
> +++ b/9pfront.c
> @@ -6,6 +6,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -52,13 +53,32 @@ struct dev_9pfs {
>  struct wait_queue_head waitq;
>  struct semaphore ring_out_sem;
>  struct semaphore ring_in_sem;
> +
> +unsigned long long fid_mask;  /* Bit mask for free fids. */
> +};
> +
> +struct file_9pfs {
> +uint32_t fid;
> +struct dev_9pfs *dev;
> +bool append;
>  };
>  
>  #define DEFAULT_9PFS_RING_ORDER  4
>  
> +/* P9 protocol commands (response is either cmd+1 or P9_CMD_ERROR). */
>  #define P9_CMD_VERSION100
>  #define P9_CMD_ATTACH 104
>  #define P9_CMD_ERROR  107
> +#define P9_CMD_WALK   110
> +#define P9_CMD_OPEN   112
> +#define P9_CMD_CREATE 114
> +#define P9_CMD_CLUNK  120
> +
> +/* P9 protocol open flags. */
> +#define P9_OREAD0   /* read */
> +#define P9_OWRITE   1   /* write */
> +#define P9_ORDWR2   /* read and write */
> +#define P9_OTRUNC  16   /* or'ed in, truncate file first */
>  
>  #define P9_QID_SIZE13
>  
> @@ -69,10 +89,27 @@ struct p9_header {
>  } __attribute__((packed));
>  
>  #define P9_VERSION"9P2000.u"
> -#define P9_ROOT_FID   ~0
> +#define P9_ROOT_FID   0
>  
>  static unsigned int ftype_9pfs;
>  
> +static unsigned int get_fid(struct dev_9pfs *dev)
> +{
> +unsigned int fid;
> +
> +fid = ffs(dev->fid_mask);
> +if ( fid )
> +dev->fid_mask &= ~(1ULL << (fid - 1));
> +
> + return fid;
> +}
> +
> +static void put_fid(struct dev_9pfs *dev, unsigned int fid)
> +{
> +if ( fid )
> +dev->fid_mask |= 1ULL << (fid - 1);
> +}
> +
>  static struct req *get_free_req(struct dev_9pfs *dev)
>  {
>  struct req *req;
> @@ -140,6 +177,9 @@ static void copy_from_ring(struct dev_9pfs *dev, void 
> *data, unsigned int len)
>   * send_9p() and rcv_9p() are using a special format string for specifying
>   * the kind of data sent/expected. Each data item is represented by a single
>   * character:
> + * b: 1 byte unsigned integer (uint8_t)
> + *Only valid for sending.
> + * u: 2 byte unsigned integer (uint16_t)
>   * U: 4 byte unsigned integer (uint32_t)
>   * S: String (2 byte length +  characters)
>   *in the rcv_9p() case the data for string is allocated (length omitted,
> @@ -153,7 +193,9 @@ static void send_9p(struct dev_9pfs *dev, struct req 
> *req, const char *fmt, ...)
>  va_list ap, aq;
>  const char *f;
>  uint32_t intval;
> +uint16_t shortval;
>  uint16_t len;
> +uint8_t byte;
>  char *strval;
>  
>  hdr.size = sizeof(hdr);
> @@ -167,6 +209,14 @@ static void send_9p(struct dev_9pfs *dev, struct req 
> *req, const char *fmt, ...)
>  {
>  switch ( *f )
>  {
> +case 'b':
> +hdr.size += 1;
> +byte = va_arg(aq, unsigned int);
> +break;
> +case 'u':
> +hdr.size += 2;
> +shortval = va_arg(aq, unsigned int);
> +break;
>  case 'U':
>  hdr.size += 4;
>  intval = va_arg(aq, unsigned int);
> @@ -196,6 +246,14 @@ static void send_9p(struct dev_9pfs *dev, struct req 
> *req, const char *fmt, ...)
>  {
>  switch ( *f )
>  {
> +case 'b':
> +byte = va_arg(ap, unsigned int);
> +copy_to_ring(dev, , sizeof(byte));
> +break;
> +case 'u':
> +shortval = va_arg(ap, unsigned int);
> +copy_to_ring(dev, , sizeof(shortval));
> +break;
>  case 'U':
>  intval = va_arg(ap, unsigned int);
>  copy_to_ring(dev, , sizeof(intval));
> @@ -288,6 +346,7 @@ static void rcv_9p_copy(struct dev_9pfs *dev, struct req 
> *req,
>  char *str;
>  uint16_t len;
>  uint32_t err;
> +uint16_t *shortval;
>  uint32_t *val;
>  char **strval;
>  uint8_t *qval;
> @@ -345,6 +404,10 @@ static void rcv_9p_copy(struct dev_9pfs *dev, struct req 
> *req,
>  {
>  switch ( *f )
>  {
> +case 'u':
> +shortval = va_arg(ap, uint16_t *);
> +copy_bufs(, , , , shortval, 
> sizeof(*shortval));
> +

Re: [PATCH] xen: speed up grant-table reclaim

2023-02-13 Thread Demi Marie Obenour
On Mon, Feb 13, 2023 at 10:26:11AM +0100, Juergen Gross wrote:
> On 07.02.23 03:10, Demi Marie Obenour wrote:
> > When a grant entry is still in use by the remote domain, Linux must put
> > it on a deferred list.  Normally, this list is very short, because
> > the PV network and block protocols expect the backend to unmap the grant
> > first.  However, Qubes OS's GUI protocol is subject to the constraints
> > of the X Window System, and as such winds up with the frontend unmapping
> > the window first.  As a result, the list can grow very large, resulting
> > in a massive memory leak and eventual VM freeze.
> > 
> > Fix this problem by bumping the number of entries that the VM will
> > attempt to free at each iteration to 1.  This is an ugly hack that
> > may well make a denial of service easier, but for Qubes OS that is less
> > bad than the problem Qubes OS users are facing today.
> 
> > There really
> > needs to be a way for a frontend to be notified when the backend has
> > unmapped the grants.
> 
> Please remove this sentence from the commit message, or move it below the
> "---" marker.

Will fix in v2.

> There are still some flag bits unallocated in struct grant_entry_v1 or
> struct grant_entry_header. You could suggest some patches for Xen to use
> one of the bits as a marker to get an event from the hypervisor if a
> grant with such a bit set has been unmapped.

That is indeed a good idea.  There are other problems with the grant
interface as well, but those can be dealt with later.

> I have no idea, whether such an interface would be accepted by the
> maintainers, though.
> 
> > Additionally, a module parameter is provided to
> > allow tuning the reclaim speed.
> > 
> > The code previously used printk(KERN_DEBUG) whenever it had to defer
> > reclaiming a page because the grant was still mapped.  This resulted in
> > a large volume of log messages that bothered users.  Use pr_debug
> > instead, which suppresses the messages by default.  Developers can
> > enable them using the dynamic debug mechanism.
> > 
> > Fixes: QubesOS/qubes-issues#7410 (memory leak)
> > Fixes: QubesOS/qubes-issues#7359 (excessive logging)
> > Fixes: 569ca5b3f94c ("xen/gnttab: add deferred freeing logic")
> > Cc: sta...@vger.kernel.org
> > Signed-off-by: Demi Marie Obenour 
> > ---
> > Anyone have suggestions for improving the grant mechanism?  Argo isn't
> > a good option, as in the GUI protocol there are substantial performance
> > wins to be had by using true shared memory.  Resending as I forgot the
> > Signed-off-by on the first submission.  Sorry about that.
> > 
> >   drivers/xen/grant-table.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
> > index 5c83d41..2c2faa7 100644
> > --- a/drivers/xen/grant-table.c
> > +++ b/drivers/xen/grant-table.c
> > @@ -355,14 +355,20 @@
> >   static void gnttab_handle_deferred(struct timer_list *);
> >   static DEFINE_TIMER(deferred_timer, gnttab_handle_deferred);
> > +static atomic64_t deferred_count;
> > +static atomic64_t leaked_count;
> > +static unsigned int free_per_iteration = 1;
> 
> As you are adding a kernel parameter to change this value, please set the
> default to a value not potentially causing any DoS problems. Qubes OS can
> still use a higher value then.

Do you have any suggestions?  I don’t know if this is actually a DoS
concern anymore.  Shrinking the interval between iterations would be.

> > +
> >   static void gnttab_handle_deferred(struct timer_list *unused)
> >   {
> > -   unsigned int nr = 10;
> > +   unsigned int nr = READ_ONCE(free_per_iteration);
> 
> I don't see why you are needing READ_ONCE() here.

free_per_iteration can be concurrently modified via sysfs.

> > +   const bool ignore_limit = nr == 0;
> 
> I don't think you need this extra variable, if you would ...
> 
> > struct deferred_entry *first = NULL;
> > unsigned long flags;
> > +   size_t freed = 0;
> > spin_lock_irqsave(_list_lock, flags);
> > -   while (nr--) {
> > +   while ((ignore_limit || nr--) && !list_empty(_list)) {
> 
> ... change this to "while ((!nr || nr--) ...".

nr-- evaluates to the old value of nr, so "while ((!nr | nr--)) ..." is
an infinite loop.

> > struct deferred_entry *entry
> > = list_first_entry(_list,
> >struct deferred_entry, list);
> > @@ -372,10 +378,13 @@
> > list_del(>list);
> > spin_unlock_irqrestore(_list_lock, flags);
> > if (_gnttab_end_foreign_access_ref(entry->ref)) {
> > +   uint64_t ret = atomic64_sub_return(1, _count);
> 
> Use atomic64_dec_return()?

Will fix in v2.

> > put_free_entry(entry->ref);
> > -   pr_debug("freeing g.e. %#x (pfn %#lx)\n",
> > -entry->ref, page_to_pfn(entry->page));
> > +   pr_debug("freeing g.e. %#x (pfn %#lx), %llu 
> > 

Re: [RFC 01/10] x86: introduce AMD-V and Intel VT-x Kconfig options

2023-02-13 Thread Boris Ostrovsky



On 2/13/23 11:41 AM, Jan Beulich wrote:

On 13.02.2023 17:30, Xenia Ragiadakou wrote:

On 2/13/23 17:11, Jan Beulich wrote:

On 13.02.2023 15:57, Xenia Ragiadakou wrote:

--- a/xen/arch/x86/cpu/Makefile
+++ b/xen/arch/x86/cpu/Makefile
@@ -10,4 +10,6 @@ obj-y += intel.o
   obj-y += intel_cacheinfo.o
   obj-y += mwait-idle.o
   obj-y += shanghai.o
-obj-y += vpmu.o vpmu_amd.o vpmu_intel.o
+obj-y += vpmu.o
+obj-$(CONFIG_AMD_SVM) += vpmu_amd.o
+obj-$(CONFIG_INTEL_VMX) += vpmu_intel.o

This code was moved from hvm/ to cpu/ for the specific reason of also
being used by PV guests. (Sadly the comments at the top weren't
corrected at that time.)

Hmm, the init functions are prefixed with svm/vmx.
Does vpmu depend on svm/vmx support?

There are interactions, but I don't think there's a dependency. You
may want to ask Boris (whom I have now added to Cc).



MSR intercept bits need to be manipulated, which is SVM/VMX-specific.


-boris




Re: [PATCH v2 0/8] x86/mtrr: fix handling with PAT but without MTRR

2023-02-13 Thread Edgecombe, Rick P
On Mon, 2023-02-13 at 07:12 +0100, Juergen Gross wrote:
> 
> Thanks for the report.
> 
> I'll have a look. Probably I'll need to re-add the check for WB in
> patch 7.

Sure, let me know if you need any more details about by setup.


[xen-unstable-smoke test] 177187: tolerable trouble: pass/starved - PUSHED

2023-02-13 Thread osstest service owner
flight 177187 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/177187/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl   1 build-check(1)   starved  n/a
 build-armhf   2 hosts-allocate   starved  n/a

version targeted for testing:
 xen  9b70bc6d9678142a40e6c1c6934a32c7a0966e38
baseline version:
 xen  f4f498d08d50259b9f25c274fd7b1e8ccf5693af

Last test of basis   177160  2023-02-13 10:03:22 Z0 days
Testing same since   177187  2023-02-13 16:01:58 Z0 days1 attempts


People who touched revisions under test:
  Jan Beulich 
  Xenia Ragiadakou 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  starved 
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  starved 
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   f4f498d08d..9b70bc6d96  9b70bc6d9678142a40e6c1c6934a32c7a0966e38 -> smoke



Re: [PATCH] create-diff-object: Handle missing secsym for debug sections

2023-02-13 Thread Ross Lagerwall
> From: Jan Beulich 
> Sent: Thursday, February 9, 2023 8:46 AM
> To: Ross Lagerwall 
> Cc: Konrad Rzeszutek Wilk ; Andrew Cooper 
> ; Sergey Dyasli ; 
> xen-de...@lists.xen.org 
> Subject: Re: [PATCH] create-diff-object: Handle missing secsym for debug 
> sections 
>  
> On 08.02.2023 19:04, Ross Lagerwall wrote:
> > Certain debug sections like ".debug_aranges" when built with GCC 11.2.1
> > are missing section symbols (presumably because they're not needed).
> 
> Is it really gcc (not gas) which controls whether section symbols are emitted
> for a particular section?

You're right. I tested it and it is down to the gas version (2.36.1 in this
case). I'll fix that before pushing it.

Ross


Re: [RFC 02/10] x86/hvm: separate AMD-V and Intel VT-x hvm_function_table initializers

2023-02-13 Thread Xenia Ragiadakou



On 2/13/23 18:47, Jan Beulich wrote:

On 13.02.2023 15:57, Xenia Ragiadakou wrote:

--- a/xen/arch/x86/include/asm/hvm/hvm.h
+++ b/xen/arch/x86/include/asm/hvm/hvm.h
@@ -261,8 +261,16 @@ extern struct hvm_function_table hvm_funcs;
  extern bool_t hvm_enabled;
  extern s8 hvm_port80_allowed;
  
+#ifdef CONFIG_AMD_SVM

  extern const struct hvm_function_table *start_svm(void);
+#else
+static inline const struct hvm_function_table *start_svm(void) { return NULL; }
+#endif
+#ifdef CONFIG_INTEL_VMX
  extern const struct hvm_function_table *start_vmx(void);
+#else
+static inline const struct hvm_function_table *start_vmx(void) { return NULL; }
+#endif
  
  int hvm_domain_initialise(struct domain *d,

const struct xen_domctl_createdomain *config);


Instead of this (which I consider harder to read), may I suggest

 if ( IS_ENABLED(CONFIG_VMX) && cpu_has_vmx )
 fns = start_vmx();
 else if ( IS_ENABLED(CONFIG_SVM) && cpu_has_svm )
 fns = start_svm();

in hvm_enable() instead (with DCE taking care of removing the dead
calls)?


Sure, it looks much better this way.



Jan


--
Xenia



Re: [RFC 03/10] x86/p2m: guard vmx specific ept functions with INTEL_VMX

2023-02-13 Thread Jan Beulich
On 13.02.2023 15:57, Xenia Ragiadakou wrote:
> --- a/xen/arch/x86/mm/p2m.h
> +++ b/xen/arch/x86/mm/p2m.h
> @@ -35,9 +35,25 @@ void p2m_nestedp2m_init(struct p2m_domain *p2m);
>  int p2m_init_nestedp2m(struct domain *d);
>  void p2m_teardown_nestedp2m(struct domain *d);
>  
> +#ifdef CONFIG_INTEL_VMX
>  int ept_p2m_init(struct p2m_domain *p2m);
>  void ept_p2m_uninit(struct p2m_domain *p2m);

For the calls from p2m_initialise() and p2m_free_one() see my reply
to patch 2. For the calls from altp2m functions, including ...

>  void p2m_init_altp2m_ept(struct domain *d, unsigned int i);

... to this one, I think you want to be more rigorous and hide much
(most?) of altp2m altogether when !VMX. (It is perhaps high time
for more of that code to move from p2m.c to altp2m.c, and altp2m.c
then, if it doesn't already, to become dependent on VMX somewhere
in the Makefile in your series.)

Jan



[PATCH v2] build: Make FILE symbol paths consistent

2023-02-13 Thread Ross Lagerwall
The FILE symbols in out-of-tree builds may be either a relative path to
the object dir or an absolute path depending on how the build is
invoked. Fix the paths for C files so that they are consistent with
in-tree builds - the path is relative to the "xen" directory (e.g.
common/irq.c).

This fixes livepatch builds when the original Xen build was out-of-tree
since livepatch-build always does in-tree builds. Note that this doesn't
fix the behaviour for Clang < 6 which always embeds full paths.

Fixes: 7115fa562fe7 ("build: adding out-of-tree support to the xen build")
Signed-off-by: Ross Lagerwall 
---

In v2:
* Adjust commit description.
* Rename rel_path.

 xen/Rules.mk | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/Rules.mk b/xen/Rules.mk
index 70b7489ea8..d6b7cec0a8 100644
--- a/xen/Rules.mk
+++ b/xen/Rules.mk
@@ -228,8 +228,9 @@ quiet_cmd_cc_o_c = CC  $@
 ifeq ($(CONFIG_ENFORCE_UNIQUE_SYMBOLS),y)
 cmd_cc_o_c = $(CC) $(c_flags) -c $< -o $(dot-target).tmp -MQ $@
 ifneq ($(CONFIG_CC_IS_CLANG)$(call clang-ifversion,-lt,600,y),yy)
+rel-path = $(patsubst $(abs_srctree)/%,%,$(call realpath,$(1)))
 cmd_objcopy_fix_sym = \
-   $(OBJCOPY) --redefine-sym $(

Re: [RFC 02/10] x86/hvm: separate AMD-V and Intel VT-x hvm_function_table initializers

2023-02-13 Thread Jan Beulich
On 13.02.2023 15:57, Xenia Ragiadakou wrote:
> --- a/xen/arch/x86/include/asm/hvm/hvm.h
> +++ b/xen/arch/x86/include/asm/hvm/hvm.h
> @@ -261,8 +261,16 @@ extern struct hvm_function_table hvm_funcs;
>  extern bool_t hvm_enabled;
>  extern s8 hvm_port80_allowed;
>  
> +#ifdef CONFIG_AMD_SVM
>  extern const struct hvm_function_table *start_svm(void);
> +#else
> +static inline const struct hvm_function_table *start_svm(void) { return 
> NULL; }
> +#endif
> +#ifdef CONFIG_INTEL_VMX
>  extern const struct hvm_function_table *start_vmx(void);
> +#else
> +static inline const struct hvm_function_table *start_vmx(void) { return 
> NULL; }
> +#endif
>  
>  int hvm_domain_initialise(struct domain *d,
>const struct xen_domctl_createdomain *config);

Instead of this (which I consider harder to read), may I suggest

if ( IS_ENABLED(CONFIG_VMX) && cpu_has_vmx )
fns = start_vmx();
else if ( IS_ENABLED(CONFIG_SVM) && cpu_has_svm )
fns = start_svm();

in hvm_enable() instead (with DCE taking care of removing the dead
calls)?

Jan



Re: [RFC 01/10] x86: introduce AMD-V and Intel VT-x Kconfig options

2023-02-13 Thread Jan Beulich
On 13.02.2023 17:30, Xenia Ragiadakou wrote:
> On 2/13/23 17:11, Jan Beulich wrote:
>> On 13.02.2023 15:57, Xenia Ragiadakou wrote:
>>> --- a/xen/arch/x86/cpu/Makefile
>>> +++ b/xen/arch/x86/cpu/Makefile
>>> @@ -10,4 +10,6 @@ obj-y += intel.o
>>>   obj-y += intel_cacheinfo.o
>>>   obj-y += mwait-idle.o
>>>   obj-y += shanghai.o
>>> -obj-y += vpmu.o vpmu_amd.o vpmu_intel.o
>>> +obj-y += vpmu.o
>>> +obj-$(CONFIG_AMD_SVM) += vpmu_amd.o
>>> +obj-$(CONFIG_INTEL_VMX) += vpmu_intel.o
>>
>> This code was moved from hvm/ to cpu/ for the specific reason of also
>> being used by PV guests. (Sadly the comments at the top weren't
>> corrected at that time.)
> 
> Hmm, the init functions are prefixed with svm/vmx.
> Does vpmu depend on svm/vmx support?

There are interactions, but I don't think there's a dependency. You
may want to ask Boris (whom I have now added to Cc).

Jan



Re: [RFC 01/10] x86: introduce AMD-V and Intel VT-x Kconfig options

2023-02-13 Thread Xenia Ragiadakou



On 2/13/23 17:11, Jan Beulich wrote:

On 13.02.2023 15:57, Xenia Ragiadakou wrote:

--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -117,6 +117,12 @@ config HVM
  
  	  If unsure, say Y.
  
+config AMD_SVM

+   def_bool y if HVM
+
+config INTEL_VMX
+   def_bool y if HVM


I'm not convinced we want vendor names here - I'd prefer to go with
just SVM and VMX.


Ok, personally I have not strong preferences.




--- a/xen/arch/x86/cpu/Makefile
+++ b/xen/arch/x86/cpu/Makefile
@@ -10,4 +10,6 @@ obj-y += intel.o
  obj-y += intel_cacheinfo.o
  obj-y += mwait-idle.o
  obj-y += shanghai.o
-obj-y += vpmu.o vpmu_amd.o vpmu_intel.o
+obj-y += vpmu.o
+obj-$(CONFIG_AMD_SVM) += vpmu_amd.o
+obj-$(CONFIG_INTEL_VMX) += vpmu_intel.o


This code was moved from hvm/ to cpu/ for the specific reason of also
being used by PV guests. (Sadly the comments at the top weren't
corrected at that time.)


Hmm, the init functions are prefixed with svm/vmx.
Does vpmu depend on svm/vmx support?




--- a/xen/drivers/passthrough/Kconfig
+++ b/xen/drivers/passthrough/Kconfig
@@ -51,7 +51,7 @@ config AMD_IOMMU
  
  config INTEL_IOMMU

bool "Intel VT-d" if EXPERT
-   depends on X86
+   depends on X86 && INTEL_VMX
default y
help
  Enables I/O virtualization on platforms that implement the


This is odd in three ways: For one PV domains (i.e. incl Dom0) also use
the IOMMU. And then earlier on there was a change of yours specifically
relaxing the connection between I/O and CPU virtualization. Plus
finally you make no similar change for AMD.


I am planning to relax it as it is relevant only to posted interrupts 
(that's why there is no such a dependency for AMD).




Jan


--
Xenia



Re: [PATCH v1 1/4] xen: introduce CONFIG_GENERIC_BUG_FRAME

2023-02-13 Thread Jan Beulich
On 13.02.2023 16:07, Julien Grall wrote:
> On 13/02/2023 15:02, Jan Beulich wrote:
>> On 13.02.2023 14:56, Julien Grall wrote:
>>> On 13/02/2023 13:30, Jan Beulich wrote:
 On 13.02.2023 14:19, Julien Grall wrote:
> On 13/02/2023 12:24, Jan Beulich wrote:
>> On 03.02.2023 18:05, Oleksii Kurochko wrote:
>>> --- /dev/null
>>> +++ b/xen/include/xen/bug.h
>>> @@ -0,0 +1,127 @@
>>> +#ifndef __XEN_BUG_H__
>>> +#define __XEN_BUG_H__
>>> +
>>> +#define BUG_DISP_WIDTH24
>>> +#define BUG_LINE_LO_WIDTH (31 - BUG_DISP_WIDTH)
>>> +#define BUG_LINE_HI_WIDTH (31 - BUG_DISP_WIDTH)
>>> +
>>> +#define BUGFRAME_run_fn 0
>>> +#define BUGFRAME_warn   1
>>> +#define BUGFRAME_bug2
>>> +#define BUGFRAME_assert 3
>>> +
>>> +#define BUGFRAME_NR 4
>>> +
>>> +#ifndef __ASSEMBLY__
>>> +
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>
>> Again - please sort headers.
>>
>>> +#include 
>>> +
>>> +#ifndef BUG_FRAME_STUFF
>>> +struct bug_frame {
>>
>> Please can we have a blank line between the above two ones and then 
>> similarly
>> ahead of the #endif?
>>
>>> +signed int loc_disp;/* Relative address to the bug address */
>>> +signed int file_disp;   /* Relative address to the filename */
>>> +signed int msg_disp;/* Relative address to the predicate (for 
>>> ASSERT) */
>>> +uint16_t line;  /* Line number */
>>> +uint32_t pad0:16;   /* Padding for 8-bytes align */
>>
>> Already the original comment in Arm code is wrong: The padding doesn't
>> guarantee 8-byte alignment; it merely guarantees a multiple of 8 bytes
>> size.
>> Aiui there's also no need for 8-byte alignment anywhere here (in
>> fact there's ".p2align 2" in the generator macros).
>
> I would rather keep the pad0 here.

 May I ask why that is?
>>>
>>> It makes clear of the padding (which would have been hidden) when using
>>> .p2align 2.
>>
>> But you realize that I didn't ask to drop the member?
> 
> I misunderstood your first reply. But the second reply...
> 
>   Besides (later in
>> the reply to Oleksii) suggesting to make "line" uint32_t, you zapped the
>> relevant further part of my reply here:
> 
>>
>> "I also wonder why this needs to be a named bitfield: Either make it
>>   plain uint16_t, or if you use a bitfield, then omit the name."
>>
>> I thought that's clear enough as a request to change representation,
> 
> ... "May I ask why that is?" added to the confusion because it implied 
> that you disagree on keep the pad0.

Hmm, yes, I can see how that may have been ambiguous: I understood that
reply of yours as requesting to retain the _name_ (i.e. objecting to my
thought of using an unnamed bitfield instead).

Jan



Re: [RFC 01/10] x86: introduce AMD-V and Intel VT-x Kconfig options

2023-02-13 Thread Jan Beulich
On 13.02.2023 15:57, Xenia Ragiadakou wrote:
> --- a/xen/arch/x86/Kconfig
> +++ b/xen/arch/x86/Kconfig
> @@ -117,6 +117,12 @@ config HVM
>  
> If unsure, say Y.
>  
> +config AMD_SVM
> + def_bool y if HVM
> +
> +config INTEL_VMX
> + def_bool y if HVM

I'm not convinced we want vendor names here - I'd prefer to go with
just SVM and VMX.

> --- a/xen/arch/x86/cpu/Makefile
> +++ b/xen/arch/x86/cpu/Makefile
> @@ -10,4 +10,6 @@ obj-y += intel.o
>  obj-y += intel_cacheinfo.o
>  obj-y += mwait-idle.o
>  obj-y += shanghai.o
> -obj-y += vpmu.o vpmu_amd.o vpmu_intel.o
> +obj-y += vpmu.o
> +obj-$(CONFIG_AMD_SVM) += vpmu_amd.o
> +obj-$(CONFIG_INTEL_VMX) += vpmu_intel.o

This code was moved from hvm/ to cpu/ for the specific reason of also
being used by PV guests. (Sadly the comments at the top weren't
corrected at that time.)

> --- a/xen/drivers/passthrough/Kconfig
> +++ b/xen/drivers/passthrough/Kconfig
> @@ -51,7 +51,7 @@ config AMD_IOMMU
>  
>  config INTEL_IOMMU
>   bool "Intel VT-d" if EXPERT
> - depends on X86
> + depends on X86 && INTEL_VMX
>   default y
>   help
> Enables I/O virtualization on platforms that implement the

This is odd in three ways: For one PV domains (i.e. incl Dom0) also use
the IOMMU. And then earlier on there was a change of yours specifically
relaxing the connection between I/O and CPU virtualization. Plus
finally you make no similar change for AMD.

Jan



Re: [PATCH v1 1/4] xen: introduce CONFIG_GENERIC_BUG_FRAME

2023-02-13 Thread Julien Grall

Hi,

On 13/02/2023 15:02, Jan Beulich wrote:

On 13.02.2023 14:56, Julien Grall wrote:

On 13/02/2023 13:30, Jan Beulich wrote:

On 13.02.2023 14:19, Julien Grall wrote:

On 13/02/2023 12:24, Jan Beulich wrote:

On 03.02.2023 18:05, Oleksii Kurochko wrote:

--- /dev/null
+++ b/xen/include/xen/bug.h
@@ -0,0 +1,127 @@
+#ifndef __XEN_BUG_H__
+#define __XEN_BUG_H__
+
+#define BUG_DISP_WIDTH24
+#define BUG_LINE_LO_WIDTH (31 - BUG_DISP_WIDTH)
+#define BUG_LINE_HI_WIDTH (31 - BUG_DISP_WIDTH)
+
+#define BUGFRAME_run_fn 0
+#define BUGFRAME_warn   1
+#define BUGFRAME_bug2
+#define BUGFRAME_assert 3
+
+#define BUGFRAME_NR 4
+
+#ifndef __ASSEMBLY__
+
+#include 
+#include 
+#include 
+#include 


Again - please sort headers.


+#include 
+
+#ifndef BUG_FRAME_STUFF
+struct bug_frame {


Please can we have a blank line between the above two ones and then similarly
ahead of the #endif?


+signed int loc_disp;/* Relative address to the bug address */
+signed int file_disp;   /* Relative address to the filename */
+signed int msg_disp;/* Relative address to the predicate (for ASSERT) 
*/
+uint16_t line;  /* Line number */
+uint32_t pad0:16;   /* Padding for 8-bytes align */


Already the original comment in Arm code is wrong: The padding doesn't
guarantee 8-byte alignment; it merely guarantees a multiple of 8 bytes
size.
Aiui there's also no need for 8-byte alignment anywhere here (in
fact there's ".p2align 2" in the generator macros).


I would rather keep the pad0 here.


May I ask why that is?


It makes clear of the padding (which would have been hidden) when using
.p2align 2.


But you realize that I didn't ask to drop the member?


I misunderstood your first reply. But the second reply...

 Besides (later in

the reply to Oleksii) suggesting to make "line" uint32_t, you zapped the
relevant further part of my reply here:




"I also wonder why this needs to be a named bitfield: Either make it
  plain uint16_t, or if you use a bitfield, then omit the name."

I thought that's clear enough as a request to change representation,


... "May I ask why that is?" added to the confusion because it implied 
that you disagree on keep the pad0.



but not asking for plain removal. The part of my reply that you commented
on is merely asking to not move a bogus comment (whether it gets corrected
up front or while being moved is secondary to me).


I am fine with both suggestions.

Cheers,

--
Julien Grall



Re: [PATCH v1 1/4] xen: introduce CONFIG_GENERIC_BUG_FRAME

2023-02-13 Thread Jan Beulich
On 13.02.2023 14:56, Julien Grall wrote:
> On 13/02/2023 13:30, Jan Beulich wrote:
>> On 13.02.2023 14:19, Julien Grall wrote:
>>> On 13/02/2023 12:24, Jan Beulich wrote:
 On 03.02.2023 18:05, Oleksii Kurochko wrote:
> --- /dev/null
> +++ b/xen/include/xen/bug.h
> @@ -0,0 +1,127 @@
> +#ifndef __XEN_BUG_H__
> +#define __XEN_BUG_H__
> +
> +#define BUG_DISP_WIDTH24
> +#define BUG_LINE_LO_WIDTH (31 - BUG_DISP_WIDTH)
> +#define BUG_LINE_HI_WIDTH (31 - BUG_DISP_WIDTH)
> +
> +#define BUGFRAME_run_fn 0
> +#define BUGFRAME_warn   1
> +#define BUGFRAME_bug2
> +#define BUGFRAME_assert 3
> +
> +#define BUGFRAME_NR 4
> +
> +#ifndef __ASSEMBLY__
> +
> +#include 
> +#include 
> +#include 
> +#include 

 Again - please sort headers.

> +#include 
> +
> +#ifndef BUG_FRAME_STUFF
> +struct bug_frame {

 Please can we have a blank line between the above two ones and then 
 similarly
 ahead of the #endif?

> +signed int loc_disp;/* Relative address to the bug address */
> +signed int file_disp;   /* Relative address to the filename */
> +signed int msg_disp;/* Relative address to the predicate (for 
> ASSERT) */
> +uint16_t line;  /* Line number */
> +uint32_t pad0:16;   /* Padding for 8-bytes align */

 Already the original comment in Arm code is wrong: The padding doesn't
 guarantee 8-byte alignment; it merely guarantees a multiple of 8 bytes
 size.
 Aiui there's also no need for 8-byte alignment anywhere here (in
 fact there's ".p2align 2" in the generator macros).
>>>
>>> I would rather keep the pad0 here.
>>
>> May I ask why that is?
> 
> It makes clear of the padding (which would have been hidden) when using 
> .p2align 2.

But you realize that I didn't ask to drop the member? Besides (later in
the reply to Oleksii) suggesting to make "line" uint32_t, you zapped the
relevant further part of my reply here:

"I also wonder why this needs to be a named bitfield: Either make it
 plain uint16_t, or if you use a bitfield, then omit the name."

I thought that's clear enough as a request to change representation,
but not asking for plain removal. The part of my reply that you commented
on is merely asking to not move a bogus comment (whether it gets corrected
up front or while being moved is secondary to me).

Jan



[RFC 10/10] x86/hvm: make AMD-V and Intel VT-x support configurable

2023-02-13 Thread Xenia Ragiadakou
Provide the user with configuration control over the cpu virtualization support
in Xen by making AMD_SVM and INTEL_VMX options user selectable.

To preserve the current default behavior, both options depend on HVM and
default to Y.

To prevent users from unknowingly disabling virtualization support, make the
controls user selectable only if EXPERT is enabled.

No functional change intended.

Signed-off-by: Xenia Ragiadakou 
---
 xen/arch/x86/Kconfig | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index 2a72111c23..fce40f08b1 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -118,10 +118,24 @@ config HVM
  If unsure, say Y.
 
 config AMD_SVM
-   def_bool y if HVM
+   bool "AMD-V" if EXPERT
+   depends on HVM
+   default y
+   help
+ Enables virtual machine extensions on platforms that implement the
+ AMD Virtualization Technology (AMD-V).
+ If your system includes a processor with AMD-V support, say Y.
+ If in doubt, say Y.
 
 config INTEL_VMX
-   def_bool y if HVM
+   bool "Intel VT-x" if EXPERT
+   depends on HVM
+   default y
+   help
+ Enables virtual machine extensions on platforms that implement the
+ Intel Virtualization Technology (Intel VT-x).
+ If your system includes a processor with Intel VT-x support, say Y.
+ If in doubt, say Y.
 
 config XEN_SHSTK
bool "Supervisor Shadow Stacks"
-- 
2.37.2




[RFC 09/10] x86/ioreq: guard VIO_realmode_completion with INTEL_VMX

2023-02-13 Thread Xenia Ragiadakou
VIO_realmode_completion is specific to vmx realmode, so guard the completion
handling code with INTEL_VMX.

No functional change intended.

Signed-off-by: Xenia Ragiadakou 
---
 xen/arch/x86/hvm/ioreq.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 0bdcca1e1a..1b0c0f1b12 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -44,6 +44,7 @@ bool arch_vcpu_ioreq_completion(enum vio_completion 
completion)
 {
 switch ( completion )
 {
+#ifdef CONFIG_INTEL_VMX
 case VIO_realmode_completion:
 {
 struct hvm_emulate_ctxt ctxt;
@@ -54,6 +55,7 @@ bool arch_vcpu_ioreq_completion(enum vio_completion 
completion)
 
 break;
 }
+#endif
 
 default:
 ASSERT_UNREACHABLE();
-- 
2.37.2




[RFC 06/10] x86/domain: guard svm specific functions with AMD_SVM

2023-02-13 Thread Xenia Ragiadakou
The functions svm_load_segs() and svm_load_segs_prefetch() are AMD-V specific
so guard their calls in common code with AMD_SVM.

Since AMD_SVM depends on HVM, it can be used alone.

No functional change intended.

Signed-off-by: Xenia Ragiadakou 
---
 xen/arch/x86/domain.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index db3ebf062d..576a410f4f 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1628,7 +1628,7 @@ static void load_segments(struct vcpu *n)
 if ( !(n->arch.flags & TF_kernel_mode) )
 SWAP(gsb, gss);
 
-#ifdef CONFIG_HVM
+#ifdef CONFIG_AMD_SVM
 if ( cpu_has_svm && (uregs->fs | uregs->gs) <= 3 )
 fs_gs_done = svm_load_segs(n->arch.pv.ldt_ents, LDT_VIRT_START(n),
n->arch.pv.fs_base, gsb, gss);
@@ -1951,7 +1951,7 @@ static void __context_switch(void)
 
 write_ptbase(n);
 
-#if defined(CONFIG_PV) && defined(CONFIG_HVM)
+#if defined(CONFIG_PV) && defined(CONFIG_AMD_SVM)
 /* Prefetch the VMCB if we expect to use it later in the context switch */
 if ( cpu_has_svm && is_pv_64bit_domain(nd) && !is_idle_domain(nd) )
 svm_load_segs_prefetch();
-- 
2.37.2




[RFC 05/10] x86/traps: guard vmx specific functions with INTEL_VMX

2023-02-13 Thread Xenia Ragiadakou
The functions vmx_vmcs_enter() and vmx_vmcs_exit() are VT-x specific.
Guard their calls with INTEL_VMX.

No functional change intended.

Signed-off-by: Xenia Ragiadakou 
---
 xen/arch/x86/traps.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index cade9e12f8..77b4f772f2 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -694,7 +694,7 @@ void vcpu_show_execution_state(struct vcpu *v)
 
 vcpu_pause(v); /* acceptably dangerous */
 
-#ifdef CONFIG_HVM
+#ifdef CONFIG_INTEL_VMX
 /*
  * For VMX special care is needed: Reading some of the register state will
  * require VMCS accesses. Engaging foreign VMCSes involves acquiring of a
@@ -732,7 +732,7 @@ void vcpu_show_execution_state(struct vcpu *v)
 console_unlock_recursive_irqrestore(flags);
 }
 
-#ifdef CONFIG_HVM
+#ifdef CONFIG_INTEL_VMX
 if ( cpu_has_vmx && is_hvm_vcpu(v) )
 vmx_vmcs_exit(v);
 #endif
-- 
2.37.2




[RFC 08/10] x86: wire cpu_has_svm/vmx_* to false when svm/vmx not enabled

2023-02-13 Thread Xenia Ragiadakou
To be able to use cpu_has_svm/vmx_* macros in common code without enclosing
them inside #ifdef guards when the respective virtualization technology is
not enabled, define them as false when not applicable.

No functional change intended.

Signed-off-by: Xenia Ragiadakou 
---
 xen/arch/x86/include/asm/hvm/svm/svm.h  | 17 
 xen/arch/x86/include/asm/hvm/vmx/vmcs.h | 35 +
 xen/arch/x86/include/asm/hvm/vmx/vmx.h  | 18 +
 3 files changed, 70 insertions(+)

diff --git a/xen/arch/x86/include/asm/hvm/svm/svm.h 
b/xen/arch/x86/include/asm/hvm/svm/svm.h
index 65e35a4f59..556584851b 100644
--- a/xen/arch/x86/include/asm/hvm/svm/svm.h
+++ b/xen/arch/x86/include/asm/hvm/svm/svm.h
@@ -80,6 +80,7 @@ extern u32 svm_feature_flags;
 #define SVM_FEATURE_SSS   19 /* NPT Supervisor Shadow Stacks */
 #define SVM_FEATURE_SPEC_CTRL 20 /* MSR_SPEC_CTRL virtualisation */
 
+#ifdef CONFIG_AMD_SVM
 #define cpu_has_svm_feature(f) (svm_feature_flags & (1u << (f)))
 #define cpu_has_svm_npt   cpu_has_svm_feature(SVM_FEATURE_NPT)
 #define cpu_has_svm_lbrv  cpu_has_svm_feature(SVM_FEATURE_LBRV)
@@ -95,6 +96,22 @@ extern u32 svm_feature_flags;
 #define cpu_has_svm_vloadsave cpu_has_svm_feature(SVM_FEATURE_VLOADSAVE)
 #define cpu_has_svm_sss   cpu_has_svm_feature(SVM_FEATURE_SSS)
 #define cpu_has_svm_spec_ctrl cpu_has_svm_feature(SVM_FEATURE_SPEC_CTRL)
+#else
+#define cpu_has_svm_npt   false
+#define cpu_has_svm_lbrv  false
+#define cpu_has_svm_svml  false
+#define cpu_has_svm_nrips false
+#define cpu_has_svm_cleanbits false
+#define cpu_has_svm_flushbyasid false
+#define cpu_has_svm_decodefalse
+#define cpu_has_svm_vgif  false
+#define cpu_has_pause_filter  false
+#define cpu_has_pause_thresh  false
+#define cpu_has_tsc_ratio false
+#define cpu_has_svm_vloadsave false
+#define cpu_has_svm_sss   false
+#define cpu_has_svm_spec_ctrl false
+#endif /* CONFIG_AMD_SVM */
 
 #define SVM_PAUSEFILTER_INIT4000
 #define SVM_PAUSETHRESH_INIT1000
diff --git a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h 
b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
index 0a84e74478..03d1f7480a 100644
--- a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
+++ b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
@@ -294,6 +294,7 @@ extern u64 vmx_ept_vpid_cap;
 
 #define VMX_TSC_MULTIPLIER_MAX  0xULL
 
+#ifdef CONFIG_INTEL_VMX
 #define cpu_has_wbinvd_exiting \
 (vmx_secondary_exec_control & SECONDARY_EXEC_WBINVD_EXITING)
 #define cpu_has_vmx_virtualize_apic_accesses \
@@ -352,6 +353,36 @@ extern u64 vmx_ept_vpid_cap;
 (vmx_secondary_exec_control & SECONDARY_EXEC_BUS_LOCK_DETECTION)
 #define cpu_has_vmx_notify_vm_exiting \
 (vmx_secondary_exec_control & SECONDARY_EXEC_NOTIFY_VM_EXITING)
+#else
+#define cpu_has_wbinvd_exiting false
+#define cpu_has_vmx_virtualize_apic_accesses false
+#define cpu_has_vmx_tpr_shadow false
+#define cpu_has_vmx_vnmi false
+#define cpu_has_vmx_msr_bitmap false
+#define cpu_has_vmx_secondary_exec_control false
+#define cpu_has_vmx_ept false
+#define cpu_has_vmx_dt_exiting false
+#define cpu_has_vmx_vpid false
+#define cpu_has_monitor_trap_flag false
+#define cpu_has_vmx_pat false
+#define cpu_has_vmx_efer false
+#define cpu_has_vmx_unrestricted_guest false
+#define vmx_unrestricted_guest(v) false
+#define cpu_has_vmx_ple false
+#define cpu_has_vmx_apic_reg_virt false
+#define cpu_has_vmx_virtual_intr_delivery false
+#define cpu_has_vmx_virtualize_x2apic_mode false
+#define cpu_has_vmx_posted_intr_processing false
+#define cpu_has_vmx_vmcs_shadowing false
+#define cpu_has_vmx_vmfunc false
+#define cpu_has_vmx_virt_exceptions false
+#define cpu_has_vmx_pml false
+#define cpu_has_vmx_mpx false
+#define cpu_has_vmx_xsaves false
+#define cpu_has_vmx_tsc_scaling false
+#define cpu_has_vmx_bus_lock_detection false
+#define cpu_has_vmx_notify_vm_exiting false
+#endif /* CONFIG_INTEL_VMX */
 
 #define VMCS_RID_TYPE_MASK  0x8000
 
@@ -374,8 +405,12 @@ extern u64 vmx_ept_vpid_cap;
 #define VMX_BASIC_DEFAULT1_ZERO(1ULL << 55)
 
 extern u64 vmx_basic_msr;
+#ifdef CONFIG_INTEL_VMX
 #define cpu_has_vmx_ins_outs_instr_info \
 (!!(vmx_basic_msr & VMX_BASIC_INS_OUT_INFO))
+#else
+#define cpu_has_vmx_ins_outs_instr_info false
+#endif /* CONFIG_INTEL_VMX */
 
 /* Guest interrupt status */
 #define VMX_GUEST_INTR_STATUS_SUBFIELD_BITMASK  0x0FF
diff --git a/xen/arch/x86/include/asm/hvm/vmx/vmx.h 
b/xen/arch/x86/include/asm/hvm/vmx/vmx.h
index 97d6b810ec..fe9a5796f7 100644
--- a/xen/arch/x86/include/asm/hvm/vmx/vmx.h
+++ b/xen/arch/x86/include/asm/hvm/vmx/vmx.h
@@ -289,6 +289,7 @@ typedef union cr_access_qual {
 
 extern uint8_t posted_intr_vector;
 
+#ifdef CONFIG_INTEL_VMX
 #define cpu_has_vmx_ept_exec_only_supported\
 (vmx_ept_vpid_cap & VMX_EPT_EXEC_ONLY_SUPPORTED)
 
@@ -301,6 +302,17 @@ extern uint8_t posted_intr_vector;
 #define cpu_has_vmx_ept_ad(vmx_ept_vpid_cap & VMX_EPT_AD_BIT)
 

[RFC 07/10] x86/oprofile: guard svm specific symbols with AMD_SVM

2023-02-13 Thread Xenia Ragiadakou
The symbol svm_stgi_label is AMD-V specific so guard its usage in common code
with AMD_SVM.

Since AMD_SVM depends on HVM, it can be used alone.
Also, use #ifdef instead of #if.

No functional change intended.

Signed-off-by: Xenia Ragiadakou 
---
 xen/arch/x86/oprofile/op_model_athlon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/oprofile/op_model_athlon.c 
b/xen/arch/x86/oprofile/op_model_athlon.c
index 69fd3fcc86..782fa606ee 100644
--- a/xen/arch/x86/oprofile/op_model_athlon.c
+++ b/xen/arch/x86/oprofile/op_model_athlon.c
@@ -320,7 +320,7 @@ static int cf_check athlon_check_ctrs(
struct vcpu *v = current;
unsigned int const nr_ctrs = model->num_counters;
 
-#if CONFIG_HVM
+#ifdef CONFIG_AMD_SVM
struct cpu_user_regs *guest_regs = guest_cpu_user_regs();
 
if (!guest_mode(regs) &&
-- 
2.37.2




[RFC 04/10] x86/vpmu: separate AMD-V and Intel VT-x init arch_vpmu_ops initializers

2023-02-13 Thread Xenia Ragiadakou
The function core2_vpmu_init() is VT-x specific while the functions
amd_vpmu_init() and hygon_vpmu_init() are AMD-V specific, thus need to be
guarded with INTEL_VMX and AMD_SVM, respectively.
Instead of adding #ifdef guards around the function calls in common vpu code,
implement them as static inline null-returning functions when the respective
technology is not enabled.

No functional change intended.

Signed-off-by: Xenia Ragiadakou 
---
 xen/arch/x86/include/asm/vpmu.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/xen/arch/x86/include/asm/vpmu.h b/xen/arch/x86/include/asm/vpmu.h
index 05e1fbfccf..1e08afb7af 100644
--- a/xen/arch/x86/include/asm/vpmu.h
+++ b/xen/arch/x86/include/asm/vpmu.h
@@ -53,9 +53,18 @@ struct arch_vpmu_ops {
 #endif
 };
 
+#ifdef CONFIG_INTEL_VMX
 const struct arch_vpmu_ops *core2_vpmu_init(void);
+#else
+static inline const struct arch_vpmu_ops *core2_vpmu_init(void) { return NULL; 
}
+#endif /* CONFIG_INTEL_VMX */
+#ifdef CONFIG_AMD_SVM
 const struct arch_vpmu_ops *amd_vpmu_init(void);
 const struct arch_vpmu_ops *hygon_vpmu_init(void);
+#else
+static inline const struct arch_vpmu_ops *amd_vpmu_init(void) { return NULL; }
+static inline const struct arch_vpmu_ops *hygon_vpmu_init(void) { return NULL; 
}
+#endif /* CONFIG_AMD_SVM */
 
 struct vpmu_struct {
 u32 flags;
-- 
2.37.2




[RFC 03/10] x86/p2m: guard vmx specific ept functions with INTEL_VMX

2023-02-13 Thread Xenia Ragiadakou
The functions ept_p2m_init(), ept_p2m_uninit() and p2m_init_altp2m_ept() are
VT-x specific.
Implement them as unreachable static inline functions when !INTEL_VMX.

No functional change intended.

Signed-off-by: Xenia Ragiadakou 
---
 xen/arch/x86/mm/p2m.h | 16 
 1 file changed, 16 insertions(+)

diff --git a/xen/arch/x86/mm/p2m.h b/xen/arch/x86/mm/p2m.h
index cc0f6766e4..ac80414688 100644
--- a/xen/arch/x86/mm/p2m.h
+++ b/xen/arch/x86/mm/p2m.h
@@ -35,9 +35,25 @@ void p2m_nestedp2m_init(struct p2m_domain *p2m);
 int p2m_init_nestedp2m(struct domain *d);
 void p2m_teardown_nestedp2m(struct domain *d);
 
+#ifdef CONFIG_INTEL_VMX
 int ept_p2m_init(struct p2m_domain *p2m);
 void ept_p2m_uninit(struct p2m_domain *p2m);
 void p2m_init_altp2m_ept(struct domain *d, unsigned int i);
+#else
+static inline int ept_p2m_init(struct p2m_domain *p2m)
+{
+ASSERT_UNREACHABLE();
+return -EINVAL;
+}
+static inline void ept_p2m_uninit(struct p2m_domain *p2m)
+{
+ASSERT_UNREACHABLE();
+}
+static inline void p2m_init_altp2m_ept(struct domain *d, unsigned int i)
+{
+ASSERT_UNREACHABLE();
+}
+#endif
 
 /*
  * Local variables:
-- 
2.37.2




[RFC 02/10] x86/hvm: separate AMD-V and Intel VT-x hvm_function_table initializers

2023-02-13 Thread Xenia Ragiadakou
Since start_svm() is AMD-V specific while start_vmx() is Intel VT-x specific,
need to be guarded with AMD_SVM and INTEL_VMX, respectively.
Instead of adding #ifdef guards around the function calls, implement them as
static inline null-returning functions when the respective technology is not
enabled.

No functional change intended.

Signed-off-by: Xenia Ragiadakou 
---
 xen/arch/x86/include/asm/hvm/hvm.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/xen/arch/x86/include/asm/hvm/hvm.h 
b/xen/arch/x86/include/asm/hvm/hvm.h
index 43d3fc2498..353e48f4e3 100644
--- a/xen/arch/x86/include/asm/hvm/hvm.h
+++ b/xen/arch/x86/include/asm/hvm/hvm.h
@@ -261,8 +261,16 @@ extern struct hvm_function_table hvm_funcs;
 extern bool_t hvm_enabled;
 extern s8 hvm_port80_allowed;
 
+#ifdef CONFIG_AMD_SVM
 extern const struct hvm_function_table *start_svm(void);
+#else
+static inline const struct hvm_function_table *start_svm(void) { return NULL; }
+#endif
+#ifdef CONFIG_INTEL_VMX
 extern const struct hvm_function_table *start_vmx(void);
+#else
+static inline const struct hvm_function_table *start_vmx(void) { return NULL; }
+#endif
 
 int hvm_domain_initialise(struct domain *d,
   const struct xen_domctl_createdomain *config);
-- 
2.37.2




[RFC 01/10] x86: introduce AMD-V and Intel VT-x Kconfig options

2023-02-13 Thread Xenia Ragiadakou
Introduce two new Kconfig options, AMD_SVM and INTEL_VMX, to allow code
specific to each virtualization technology to be separated and, when not
required, stripped.
AMD_SVM will be used to enable virtual machine extensions on platforms that
implement the AMD Virtualization Technology (AMD-V).
INTEL_VMX will be used to enable virtual machine extensions on platforms that
implement the Intel Virtualization Technology (Intel VT-x).

Both features depend on HVM support.

Since, at this point, disabling any of them would cause Xen to not compile,
the options are enabled by default if HVM and are not selectable by the user.

No functional change intended.

Signed-off-by: Xenia Ragiadakou 
---
 xen/arch/x86/Kconfig| 6 ++
 xen/arch/x86/cpu/Makefile   | 4 +++-
 xen/arch/x86/hvm/Makefile   | 4 ++--
 xen/arch/x86/mm/Makefile| 3 ++-
 xen/arch/x86/mm/hap/Makefile| 2 +-
 xen/drivers/passthrough/Kconfig | 2 +-
 6 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index 6a7825f4ba..2a72111c23 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -117,6 +117,12 @@ config HVM
 
  If unsure, say Y.
 
+config AMD_SVM
+   def_bool y if HVM
+
+config INTEL_VMX
+   def_bool y if HVM
+
 config XEN_SHSTK
bool "Supervisor Shadow Stacks"
depends on HAS_AS_CET_SS
diff --git a/xen/arch/x86/cpu/Makefile b/xen/arch/x86/cpu/Makefile
index 35561fe51d..08bdf4abb8 100644
--- a/xen/arch/x86/cpu/Makefile
+++ b/xen/arch/x86/cpu/Makefile
@@ -10,4 +10,6 @@ obj-y += intel.o
 obj-y += intel_cacheinfo.o
 obj-y += mwait-idle.o
 obj-y += shanghai.o
-obj-y += vpmu.o vpmu_amd.o vpmu_intel.o
+obj-y += vpmu.o
+obj-$(CONFIG_AMD_SVM) += vpmu_amd.o
+obj-$(CONFIG_INTEL_VMX) += vpmu_intel.o
diff --git a/xen/arch/x86/hvm/Makefile b/xen/arch/x86/hvm/Makefile
index 3464191544..4c1fa5c6c2 100644
--- a/xen/arch/x86/hvm/Makefile
+++ b/xen/arch/x86/hvm/Makefile
@@ -1,5 +1,5 @@
-obj-y += svm/
-obj-y += vmx/
+obj-$(CONFIG_AMD_SVM) += svm/
+obj-$(CONFIG_INTEL_VMX) += vmx/
 obj-y += viridian/
 
 obj-y += asid.o
diff --git a/xen/arch/x86/mm/Makefile b/xen/arch/x86/mm/Makefile
index 0803ac9297..c53c94308a 100644
--- a/xen/arch/x86/mm/Makefile
+++ b/xen/arch/x86/mm/Makefile
@@ -10,6 +10,7 @@ obj-$(CONFIG_MEM_SHARING) += mem_sharing.o
 obj-$(CONFIG_HVM) += nested.o
 obj-$(CONFIG_HVM) += p2m.o
 obj-y += p2m-basic.o
-obj-$(CONFIG_HVM) += p2m-ept.o p2m-pod.o p2m-pt.o
+obj-$(CONFIG_HVM) += p2m-pod.o p2m-pt.o
+obj-$(CONFIG_INTEL_VMX) += p2m-ept.o
 obj-y += paging.o
 obj-y += physmap.o
diff --git a/xen/arch/x86/mm/hap/Makefile b/xen/arch/x86/mm/hap/Makefile
index 8ef54b1faa..67c29b2162 100644
--- a/xen/arch/x86/mm/hap/Makefile
+++ b/xen/arch/x86/mm/hap/Makefile
@@ -3,4 +3,4 @@ obj-y += guest_walk_2.o
 obj-y += guest_walk_3.o
 obj-y += guest_walk_4.o
 obj-y += nested_hap.o
-obj-y += nested_ept.o
+obj-$(CONFIG_INTEL_VMX) += nested_ept.o
diff --git a/xen/drivers/passthrough/Kconfig b/xen/drivers/passthrough/Kconfig
index 864fcf3b0c..f95e7e5902 100644
--- a/xen/drivers/passthrough/Kconfig
+++ b/xen/drivers/passthrough/Kconfig
@@ -51,7 +51,7 @@ config AMD_IOMMU
 
 config INTEL_IOMMU
bool "Intel VT-d" if EXPERT
-   depends on X86
+   depends on X86 && INTEL_VMX
default y
help
  Enables I/O virtualization on platforms that implement the
-- 
2.37.2




[RFC 00/10] x86: Make cpu virtualization support configurable

2023-02-13 Thread Xenia Ragiadakou
This series aims to provide a means to render the cpu virtualization
technology support in Xen configurable.
Currently, irrespectively of the target platform, both AMD-V and Intel VT-x
drivers are built.
The series adds two new Kconfig controls, AMD_SVM and INTEL_VMX, that can be
used to switch to a finer-grained configuration for a given platform, and
reduce dead code.

The code separation is done using the new config guards.
The series is sent as an RFC to gather feedback in case the changes are in
the wrong direction (I m mostly concerned about hap (ept/npt))

It depends on patch
"x86/vlapic: call vmx_vlapic_msr_changed through an hvm_function callback"
(otherwise it won't compile with Intel VT-x support disabled)
and on the series "Make x86 IOMMU driver support configurable" (AFAIU, the
series has been reviewed but not merged yet) because Intel IOMMU depends on
Intel VT-x for posted interrupts. 

Xenia Ragiadakou (10):
  x86: introduce AMD-V and Intel VT-x Kconfig options
  x86/hvm: separate AMD-V and Intel VT-x hvm_function_table initializers
  x86/p2m: guard vmx specific ept functions with INTEL_VMX
  x86/vpmu: separate AMD-V and Intel VT-x init arch_vpmu_ops
initializers
  x86/traps: guard vmx specific functions with INTEL_VMX
  x86/domain: guard svm specific functions with AMD_SVM
  x86/oprofile: guard svm specific symbols with AMD_SVM
  x86: wire cpu_has_svm/vmx_* to false when svm/vmx not enabled
  x86/ioreq: guard VIO_realmode_completion with INTEL_VMX
  x86/hvm: make AMD-V and Intel VT-x support configurable

 xen/arch/x86/Kconfig| 20 ++
 xen/arch/x86/cpu/Makefile   |  4 ++-
 xen/arch/x86/domain.c   |  4 +--
 xen/arch/x86/hvm/Makefile   |  4 +--
 xen/arch/x86/hvm/ioreq.c|  2 ++
 xen/arch/x86/include/asm/hvm/hvm.h  |  8 ++
 xen/arch/x86/include/asm/hvm/svm/svm.h  | 17 
 xen/arch/x86/include/asm/hvm/vmx/vmcs.h | 35 +
 xen/arch/x86/include/asm/hvm/vmx/vmx.h  | 18 +
 xen/arch/x86/include/asm/vpmu.h |  9 +++
 xen/arch/x86/mm/Makefile|  3 ++-
 xen/arch/x86/mm/hap/Makefile|  2 +-
 xen/arch/x86/mm/p2m.h   | 16 +++
 xen/arch/x86/oprofile/op_model_athlon.c |  2 +-
 xen/arch/x86/traps.c|  4 +--
 xen/drivers/passthrough/Kconfig |  2 +-
 16 files changed, 139 insertions(+), 11 deletions(-)

-- 
2.37.2




[PATCH] x86/platform: make XENPF_get_dom0_console actually usable

2023-02-13 Thread Jan Beulich
struct dom0_vga_console_info has been extended in the past, and it may
be extended again. The use in PV Dom0's start info already covers for
that by supplying the size of the provided data. For the recently
introduced platform-op size needs providing similarly. Go the easiest
available route and simply supply size via the hypercall return value.

While there also add a build-time check that possibly future growth of
the struct won't affect xen_platform_op_t's size.

Fixes: 4dd160583c79 ("x86/platform: introduce hypercall to get initial video 
console settings")
Signed-off-by: Jan Beulich 
---
Noticed while trying to actually use the new hypercall in Linux.

--- a/xen/arch/x86/platform_hypercall.c
+++ b/xen/arch/x86/platform_hypercall.c
@@ -841,6 +841,8 @@ ret_t do_platform_op(
 
 #ifdef CONFIG_VIDEO
 case XENPF_get_dom0_console:
+BUILD_BUG_ON(sizeof(op->u.dom0_console) > sizeof(op->u.pad));
+ret = sizeof(op->u.dom0_console);
 if ( !fill_console_start_info(>u.dom0_console) )
 {
 ret = -ENODEV;



[PATCH] automation: Add container and build jobs to run cppcheck analysis

2023-02-13 Thread Michal Orzel
Add a debian container with cppcheck installation routine inside,
capable of performing cppcheck analysis on Xen-only build including
cross-builds for arm32 and arm64.

Populate build jobs making use of that container to run cppcheck
analysis to produce a text report (xen-cppcheck.txt) containing the list
of all the findings.

This patch does not aim at performing any sort of bisection. Cppcheck is
imperfect and for now, our goal is to at least be aware of its reports,
so that we can compare them with the ones produced by better tools and
to be able to see how these reports change as a result of further
infrastructure improvements (e.g. exception list, rules exclusion).

Signed-off-by: Michal Orzel 
---
For those interested in, here is a sample pipeline:
https://gitlab.com/xen-project/people/morzel/xen-orzelmichal/-/pipelines/775769167
---
 .../build/debian/unstable-cppcheck.dockerfile | 37 +
 automation/gitlab-ci/build.yaml   | 40 +++
 automation/scripts/build  | 11 -
 3 files changed, 87 insertions(+), 1 deletion(-)
 create mode 100644 automation/build/debian/unstable-cppcheck.dockerfile

diff --git a/automation/build/debian/unstable-cppcheck.dockerfile 
b/automation/build/debian/unstable-cppcheck.dockerfile
new file mode 100644
index ..39bcc50673c8
--- /dev/null
+++ b/automation/build/debian/unstable-cppcheck.dockerfile
@@ -0,0 +1,37 @@
+FROM debian:unstable
+LABEL maintainer.name="The Xen Project" \
+  maintainer.email="xen-devel@lists.xenproject.org"
+
+ENV DEBIAN_FRONTEND=noninteractive
+ENV CPPCHECK_VERSION=2.7
+ENV USER root
+
+RUN mkdir /build
+WORKDIR /build
+
+# dependencies for cppcheck and Xen-only build/cross-build
+RUN apt-get update && \
+apt-get --quiet --yes install \
+build-essential \
+curl \
+python-is-python3 \
+libpcre3-dev \
+flex \
+bison \
+gcc-arm-linux-gnueabihf \
+gcc-aarch64-linux-gnu
+
+# cppcheck release build (see cppcheck readme.md)
+RUN curl -fsSLO 
https://github.com/danmar/cppcheck/archive/"$CPPCHECK_VERSION".tar.gz && \
+tar xvzf "$CPPCHECK_VERSION".tar.gz && \
+cd cppcheck-"$CPPCHECK_VERSION" && \
+make install -j$(nproc) \
+MATCHCOMPILER=yes \
+FILESDIR=/usr/share/cppcheck \
+HAVE_RULES=yes CXXFLAGS="-O2 -DNDEBUG -Wall -Wno-sign-compare 
-Wno-unused-function"
+
+# clean
+RUN apt-get autoremove -y && \
+apt-get clean && \
+rm -rf /var/lib/apt/lists* /tmp/* /var/tmp/* && \
+rm -rf cppcheck-"$CPPCHECK_VERSION"* "$CPPCHECK_VERSION".tar.gz
diff --git a/automation/gitlab-ci/build.yaml b/automation/gitlab-ci/build.yaml
index a053c5c7325d..c8831ccbec7a 100644
--- a/automation/gitlab-ci/build.yaml
+++ b/automation/gitlab-ci/build.yaml
@@ -7,6 +7,7 @@
 paths:
   - binaries/
   - xen-config
+  - xen-cppcheck.txt
   - '*.log'
   - '*/*.log'
 when: always
@@ -145,6 +146,23 @@
   variables:
 <<: *gcc
 
+.arm64-cross-build-tmpl:
+  <<: *build
+  variables:
+XEN_TARGET_ARCH: arm64
+  tags:
+- x86_64
+
+.arm64-cross-build:
+  extends: .arm64-cross-build-tmpl
+  variables:
+debug: n
+
+.gcc-arm64-cross-build:
+  extends: .arm64-cross-build
+  variables:
+<<: *gcc
+
 .arm64-build-tmpl:
   <<: *build
   variables:
@@ -679,6 +697,28 @@ archlinux-current-gcc-riscv64-debug-randconfig:
 EXTRA_FIXED_RANDCONFIG:
   CONFIG_COVERAGE=n
 
+# Cppcheck analysis jobs
+
+debian-unstable-gcc-cppcheck:
+  extends: .gcc-x86-64-build
+  variables:
+CONTAINER: debian:unstable-cppcheck
+CPPCHECK: y
+
+debian-unstable-gcc-arm32-cppcheck:
+  extends: .gcc-arm32-cross-build
+  variables:
+CONTAINER: debian:unstable-cppcheck
+CROSS_COMPILE: /usr/bin/arm-linux-gnueabihf-
+CPPCHECK: y
+
+debian-unstable-gcc-arm64-cppcheck:
+  extends: .gcc-arm64-cross-build
+  variables:
+CONTAINER: debian:unstable-cppcheck
+CROSS_COMPILE: /usr/bin/aarch64-linux-gnu-
+CPPCHECK: y
+
 ## Test artifacts common
 
 .test-jobs-artifact-common:
diff --git a/automation/scripts/build b/automation/scripts/build
index f2f5e55bc04f..c219752d553e 100755
--- a/automation/scripts/build
+++ b/automation/scripts/build
@@ -38,7 +38,16 @@ cp xen/.config xen-config
 # Directory for the artefacts to be dumped into
 mkdir binaries
 
-if [[ "${HYPERVISOR_ONLY}" == "y" ]]; then
+if [[ "${CPPCHECK}" == "y" ]]; then
+# Cppcheck analysis invokes Xen-only build.
+# Known limitation: cppcheck generates inconsistent reports when running
+# in parallel mode, therefore do not specify -j.
+xen/scripts/xen-analysis.py --run-cppcheck --cppcheck-misra
+
+# Preserve artefacts
+cp xen/xen binaries/xen
+cp xen/cppcheck-report/xen-cppcheck.txt xen-cppcheck.txt
+elif [[ "${HYPERVISOR_ONLY}" == "y" ]]; then
 # Xen-only build
 make -j$(nproc) xen
 
-- 
2.25.1




Re: xen (multiboot) binary gets corrupted ELF notes

2023-02-13 Thread Jan Beulich
On 13.02.2023 15:12, Marek Marczykowski-Górecki wrote:
> On Mon, Feb 13, 2023 at 01:53:21PM +0100, Jan Beulich wrote:
>> On 13.02.2023 12:14, Marek Marczykowski-Górecki wrote:
>>> Hi,
>>>
>>> I'm getting some ELF note issues on multiboot binary
>>> specifically:
>>> xen/xen: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), 
>>> statically linked, BuildID[sha1]=f7d2c37a4ad599b268f2f2d94bff3776d12649b3, 
>>> bad note description size 0xc0010001, stripped
>>>
>>> readelf additionally says:
>>>
>>> Displaying notes found in: .note
>>>   OwnerData sizeDescription
>>>   GNU  0x0014   NT_GNU_BUILD_ID (unique build 
>>> ID bitstring)
>>> Build ID: c5825a0d08edc4d11b1138fedca6b14ce8ba7302
>>>   (NONE)   0x0004   Unknown note type: (0x0020)
>>>description data: 05 00 00 00 
>>> readelf: xen/xen: Warning: note with invalid namesz and/or descsz found 
>>> at offset 0x34
>>> readelf: xen/xen: Warning:  type: 0x4, namesize: 0x00554e47, descsize: 
>>> 0xc0010001, alignment: 4
>>>
>>> Grub doesn't care, but launching such xen with kexec doesn't work.
>>>
>>> Initially found when booting Xen via Heads: 
>>> https://openqa.qubes-os.org/tests/60151#step/install_startup/11
>>>
>>> Andy says:
 yeah, I've seen the same on XTF
 binutil's recent elf notes for CET compatibility use an unsigned long
 so they're not compatible when we build as 64bit and then re-package as 32
 I think we need to strip all elf notes in mkelf32
>>
>> Instead of complicating mkelf32 (we want to retain at least the build-id
>> note, after all, and for PVH_GUEST builds also .note.Xen) why don't we
>> discard the unwanted/unneeded notes then from the linker script, just
>> like we already do for xen.efi?
> 
> Ok, this seems to help:
> 
> ---8<
> diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S
> index 8930e14fc40e..f0831bd677e7 100644
> --- a/xen/arch/x86/xen.lds.S
> +++ b/xen/arch/x86/xen.lds.S
> @@ -192,13 +192,6 @@ SECTIONS
>  #endif
>  #endif
>  
> -#ifndef EFI
> -  /* Retain these just for the purpose of possible analysis tools. */
> -  DECL_SECTION(.note) {
> -   *(.note.*)
> -  } PHDR(note) PHDR(text)
> -#endif
> -
>_erodata = .;
>  
>. = ALIGN(SECTION_ALIGN);
> ---8<---
> 
> The comment suggests some notes could be useful, but given they are
> broken anyway and nobody complained so far, maybe not really?

Well, much depends on those unspecified analysis tools.

Jan



[PATCH 3/3] automation: Add a gzip compressed kernel image test on arm32

2023-02-13 Thread Michal Orzel
Xen supports booting gzip compressed kernels, therefore add a new test
job qemu-smoke-dom0less-arm32-gcc-gzip in debug and non-debug variant
that will execute qemu-smoke-dom0less-arm32.sh script to test booting
a domU with a gzip compressed kernel image (in our case zImage).

By passing "gzip" as a test variant, the test will call gzip command
to compress kernel image and use it in ImageBuilder configuration.
No need for a specific check to be executed from domU. Being able to
see a test message from a boot log is sufficient to mark a test as
passed.

Signed-off-by: Michal Orzel 
---
 automation/gitlab-ci/test.yaml  | 16 
 automation/scripts/qemu-smoke-dom0less-arm32.sh | 13 +
 2 files changed, 29 insertions(+)

diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml
index c2bcc5d4d3e5..be7a55d89708 100644
--- a/automation/gitlab-ci/test.yaml
+++ b/automation/gitlab-ci/test.yaml
@@ -242,6 +242,22 @@ qemu-smoke-dom0less-arm32-gcc-debug-staticmem:
 - *arm32-test-needs
 - debian-unstable-gcc-arm32-debug-staticmem
 
+qemu-smoke-dom0less-arm32-gcc-gzip:
+  extends: .qemu-arm32
+  script:
+- ./automation/scripts/qemu-smoke-dom0less-arm32.sh gzip 2>&1 | tee 
${LOGFILE}
+  needs:
+- *arm32-test-needs
+- debian-unstable-gcc-arm32
+
+qemu-smoke-dom0less-arm32-gcc-debug-gzip:
+  extends: .qemu-arm32
+  script:
+- ./automation/scripts/qemu-smoke-dom0less-arm32.sh gzip 2>&1 | tee 
${LOGFILE}
+  needs:
+- *arm32-test-needs
+- debian-unstable-gcc-arm32-debug
+
 qemu-alpine-x86_64-gcc:
   extends: .qemu-x86-64
   script:
diff --git a/automation/scripts/qemu-smoke-dom0less-arm32.sh 
b/automation/scripts/qemu-smoke-dom0less-arm32.sh
index f89ee8b6464a..b420ee444f2a 100755
--- a/automation/scripts/qemu-smoke-dom0less-arm32.sh
+++ b/automation/scripts/qemu-smoke-dom0less-arm32.sh
@@ -30,6 +30,15 @@ fi
 "
 fi
 
+if [[ "${test_variant}" == "gzip" ]]; then
+# Compress kernel image with gzip
+gzip vmlinuz
+passed="${test_variant} test passed"
+domU_check="
+echo \"${passed}\"
+"
+fi
+
 # domU rootfs
 mkdir rootfs
 cd rootfs
@@ -79,6 +88,10 @@ if [[ "${test_variant}" == "static-mem" ]]; then
 echo -e "\nDOMU_STATIC_MEM[0]=\"${domu_base} ${domu_size}\"" >> config
 fi
 
+if [[ "${test_variant}" == "gzip" ]]; then
+sed -i "s/vmlinuz/vmlinuz.gz/g" config
+fi
+
 rm -rf imagebuilder
 git clone https://gitlab.com/ViryaOS/imagebuilder
 bash imagebuilder/scripts/uboot-script-gen -t tftp -d . -c config
-- 
2.25.1




[PATCH 2/3] automation: Add a static memory allocation test on arm32

2023-02-13 Thread Michal Orzel
Add a new test job qemu-smoke-dom0less-arm32-gcc-staticmem in debug
and non-debug variant that will execute qemu-smoke-dom0less-arm32.sh
script to test static memory allocation feature. The test case itself
is directly taken from dom0less arm64 testing.

Populate build jobs to compile Xen with config options necessary to
enable static memory feature. Populate test jobs passing "static-mem"
as a test variant. The test configures domU with a static memory region
(direct-mapped) and adds a check using /proc/iomem to determine the
memory region marked as "System RAM".

Signed-off-by: Michal Orzel 
---
 automation/gitlab-ci/build.yaml   | 20 +++
 automation/gitlab-ci/test.yaml| 16 +++
 .../scripts/qemu-smoke-dom0less-arm32.sh  | 17 
 3 files changed, 53 insertions(+)

diff --git a/automation/gitlab-ci/build.yaml b/automation/gitlab-ci/build.yaml
index a053c5c7325d..166182bc595f 100644
--- a/automation/gitlab-ci/build.yaml
+++ b/automation/gitlab-ci/build.yaml
@@ -565,6 +565,26 @@ debian-unstable-gcc-arm32-debug-randconfig:
 HYPERVISOR_ONLY: y
 RANDCONFIG: y
 
+debian-unstable-gcc-arm32-staticmem:
+  extends: .gcc-arm32-cross-build
+  variables:
+CONTAINER: debian:unstable-arm32-gcc
+HYPERVISOR_ONLY: y
+EXTRA_XEN_CONFIG: |
+  CONFIG_EXPERT=y
+  CONFIG_UNSUPPORTED=y
+  CONFIG_STATIC_MEMORY=y
+
+debian-unstable-gcc-arm32-debug-staticmem:
+  extends: .gcc-arm32-cross-build-debug
+  variables:
+CONTAINER: debian:unstable-arm32-gcc
+HYPERVISOR_ONLY: y
+EXTRA_XEN_CONFIG: |
+  CONFIG_EXPERT=y
+  CONFIG_UNSUPPORTED=y
+  CONFIG_STATIC_MEMORY=y
+
 # Arm builds
 
 debian-unstable-gcc-arm64:
diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml
index 84ab1fee50a4..c2bcc5d4d3e5 100644
--- a/automation/gitlab-ci/test.yaml
+++ b/automation/gitlab-ci/test.yaml
@@ -226,6 +226,22 @@ qemu-smoke-dom0less-arm32-gcc-debug:
 - *arm32-test-needs
 - debian-unstable-gcc-arm32-debug
 
+qemu-smoke-dom0less-arm32-gcc-staticmem:
+  extends: .qemu-arm32
+  script:
+- ./automation/scripts/qemu-smoke-dom0less-arm32.sh static-mem 2>&1 | tee 
${LOGFILE}
+  needs:
+- *arm32-test-needs
+- debian-unstable-gcc-arm32-staticmem
+
+qemu-smoke-dom0less-arm32-gcc-debug-staticmem:
+  extends: .qemu-arm32
+  script:
+- ./automation/scripts/qemu-smoke-dom0less-arm32.sh static-mem 2>&1 | tee 
${LOGFILE}
+  needs:
+- *arm32-test-needs
+- debian-unstable-gcc-arm32-debug-staticmem
+
 qemu-alpine-x86_64-gcc:
   extends: .qemu-x86-64
   script:
diff --git a/automation/scripts/qemu-smoke-dom0less-arm32.sh 
b/automation/scripts/qemu-smoke-dom0less-arm32.sh
index c81529cbbfd0..f89ee8b6464a 100755
--- a/automation/scripts/qemu-smoke-dom0less-arm32.sh
+++ b/automation/scripts/qemu-smoke-dom0less-arm32.sh
@@ -17,6 +17,19 @@ echo \"${passed}\"
 "
 fi
 
+if [[ "${test_variant}" == "static-mem" ]]; then
+# Memory range that is statically allocated to domU1
+domu_base="0x5000"
+domu_size="0x2000"
+passed="${test_variant} test passed"
+domU_check="
+mem_range=$(printf \"%08x-%08x\" ${domu_base} $(( ${domu_base} + ${domu_size} 
- 1 )))
+if grep -q -x \"\${mem_range} : System RAM\" /proc/iomem; then
+echo \"${passed}\"
+fi
+"
+fi
+
 # domU rootfs
 mkdir rootfs
 cd rootfs
@@ -62,6 +75,10 @@ BOOT_CMD="bootm"
 UBOOT_SOURCE="boot.source"
 UBOOT_SCRIPT="boot.scr"' > config
 
+if [[ "${test_variant}" == "static-mem" ]]; then
+echo -e "\nDOMU_STATIC_MEM[0]=\"${domu_base} ${domu_size}\"" >> config
+fi
+
 rm -rf imagebuilder
 git clone https://gitlab.com/ViryaOS/imagebuilder
 bash imagebuilder/scripts/uboot-script-gen -t tftp -d . -c config
-- 
2.25.1




[PATCH 0/3] automation: dom0less arm32 testing

2023-02-13 Thread Michal Orzel
This patch series aims at improving the arm32 CI testing by introducing
the true dom0less-based tests that do not require dom0 with a toolstack.
It creates a foundation for further test expansion.

Michal Orzel (3):
  automation: Add arm32 dom0less testing
  automation: Add a static memory allocation test on arm32
  automation: Add a gzip compressed kernel image test on arm32

 automation/gitlab-ci/build.yaml   |  20 +++
 automation/gitlab-ci/test.yaml|  48 +++
 .../scripts/qemu-smoke-dom0less-arm32.sh  | 119 ++
 3 files changed, 187 insertions(+)
 create mode 100755 automation/scripts/qemu-smoke-dom0less-arm32.sh

-- 
2.25.1




[PATCH 1/3] automation: Add arm32 dom0less testing

2023-02-13 Thread Michal Orzel
At the moment, we only perform a single arm32 test in our CI, checking
whether dom0 boots successfully or not. This is mostly because we do not
have any arm32 runners and we only execute a hypervisor only build.

In order not to leave the arm32 testing in such a poor state, add a
script qemu-smoke-dom0less-arm32.sh to start testing true dom0less
configuration, in which case we do not need a dom0 with a toolstack.

The script is mostly based on the one used for dom0 arm32 testing as well
as the one used for dom0less arm64 testing. We obtain Debian Bullseye
kernel and Alpine Linux busybox-based rootfs. Depending on the test
variant, we prepare a test case contained within domU_check variable,
that will be executed as part of /init script.

Signed-off-by: Michal Orzel 
---
 automation/gitlab-ci/test.yaml| 16 
 .../scripts/qemu-smoke-dom0less-arm32.sh  | 89 +++
 2 files changed, 105 insertions(+)
 create mode 100755 automation/scripts/qemu-smoke-dom0less-arm32.sh

diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml
index ce543ef5c0ef..84ab1fee50a4 100644
--- a/automation/gitlab-ci/test.yaml
+++ b/automation/gitlab-ci/test.yaml
@@ -210,6 +210,22 @@ qemu-smoke-dom0-arm32-gcc-debug:
 - *arm32-test-needs
 - debian-unstable-gcc-arm32-debug
 
+qemu-smoke-dom0less-arm32-gcc:
+  extends: .qemu-arm32
+  script:
+- ./automation/scripts/qemu-smoke-dom0less-arm32.sh 2>&1 | tee ${LOGFILE}
+  needs:
+- *arm32-test-needs
+- debian-unstable-gcc-arm32
+
+qemu-smoke-dom0less-arm32-gcc-debug:
+  extends: .qemu-arm32
+  script:
+- ./automation/scripts/qemu-smoke-dom0less-arm32.sh 2>&1 | tee ${LOGFILE}
+  needs:
+- *arm32-test-needs
+- debian-unstable-gcc-arm32-debug
+
 qemu-alpine-x86_64-gcc:
   extends: .qemu-x86-64
   script:
diff --git a/automation/scripts/qemu-smoke-dom0less-arm32.sh 
b/automation/scripts/qemu-smoke-dom0less-arm32.sh
new file mode 100755
index ..c81529cbbfd0
--- /dev/null
+++ b/automation/scripts/qemu-smoke-dom0less-arm32.sh
@@ -0,0 +1,89 @@
+#!/bin/bash
+
+set -ex
+
+test_variant=$1
+
+cd binaries
+# Use the kernel from Debian
+curl --fail --silent --show-error --location --output vmlinuz 
https://deb.debian.org/debian/dists/bullseye/main/installer-armhf/current/images/netboot/vmlinuz
+# Use a tiny initrd based on busybox from Alpine Linux
+curl --fail --silent --show-error --location --output initrd.tar.gz 
https://dl-cdn.alpinelinux.org/alpine/v3.15/releases/armhf/alpine-minirootfs-3.15.1-armhf.tar.gz
+
+if [ -z "${test_variant}" ]; then
+passed="generic test passed"
+domU_check="
+echo \"${passed}\"
+"
+fi
+
+# domU rootfs
+mkdir rootfs
+cd rootfs
+tar xvzf ../initrd.tar.gz
+echo "#!/bin/sh
+
+mount -t proc proc /proc
+mount -t sysfs sysfs /sys
+mount -t devtmpfs devtmpfs /dev
+${domU_check}
+/bin/sh" > init
+chmod +x init
+find . | cpio -H newc -o | gzip > ../initrd.gz
+cd ..
+
+# XXX QEMU looks for "efi-virtio.rom" even if it is unneeded
+curl -fsSLO https://github.com/qemu/qemu/raw/v5.2.0/pc-bios/efi-virtio.rom
+./qemu-system-arm \
+-machine virt \
+-machine virtualization=true \
+-smp 4 \
+-m 1024 \
+-serial stdio \
+-monitor none \
+-display none \
+-machine dumpdtb=virt.dtb
+
+# ImageBuilder
+echo 'MEMORY_START="0x4000"
+MEMORY_END="0x8000"
+
+DEVICE_TREE="virt.dtb"
+XEN="xen"
+XEN_CMD="console=dtuart bootscrub=0"
+
+DOMU_KERNEL[0]="vmlinuz"
+DOMU_RAMDISK[0]="initrd.gz"
+DOMU_MEM[0]="512"
+NUM_DOMUS=1
+
+LOAD_CMD="tftpb"
+BOOT_CMD="bootm"
+UBOOT_SOURCE="boot.source"
+UBOOT_SCRIPT="boot.scr"' > config
+
+rm -rf imagebuilder
+git clone https://gitlab.com/ViryaOS/imagebuilder
+bash imagebuilder/scripts/uboot-script-gen -t tftp -d . -c config
+
+# Run the test
+rm -f smoke.serial
+set +e
+echo "  virtio scan; dhcp; tftpb 0x4000 boot.scr; source 0x4000"| \
+timeout -k 1 240 \
+./qemu-system-arm \
+-machine virt \
+-machine virtualization=true \
+-smp 4 \
+-m 1024 \
+-serial stdio \
+-monitor none \
+-display none \
+-no-reboot \
+-device virtio-net-pci,netdev=n0 \
+-netdev user,id=n0,tftp=./ \
+-bios /usr/lib/u-boot/qemu_arm/u-boot.bin |& tee smoke.serial
+
+set -e
+(grep -q "${passed}" smoke.serial) || exit 1
+exit 0
-- 
2.25.1




Re: xen (multiboot) binary gets corrupted ELF notes

2023-02-13 Thread Marek Marczykowski-Górecki
On Mon, Feb 13, 2023 at 01:53:21PM +0100, Jan Beulich wrote:
> On 13.02.2023 12:14, Marek Marczykowski-Górecki wrote:
> > Hi,
> > 
> > I'm getting some ELF note issues on multiboot binary
> > specifically:
> > xen/xen: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), 
> > statically linked, BuildID[sha1]=f7d2c37a4ad599b268f2f2d94bff3776d12649b3, 
> > bad note description size 0xc0010001, stripped
> > 
> > readelf additionally says:
> > 
> > Displaying notes found in: .note
> >   OwnerData sizeDescription
> >   GNU  0x0014   NT_GNU_BUILD_ID (unique build 
> > ID bitstring)
> > Build ID: c5825a0d08edc4d11b1138fedca6b14ce8ba7302
> >   (NONE)   0x0004   Unknown note type: (0x0020)
> >description data: 05 00 00 00 
> > readelf: xen/xen: Warning: note with invalid namesz and/or descsz found 
> > at offset 0x34
> > readelf: xen/xen: Warning:  type: 0x4, namesize: 0x00554e47, descsize: 
> > 0xc0010001, alignment: 4
> > 
> > Grub doesn't care, but launching such xen with kexec doesn't work.
> > 
> > Initially found when booting Xen via Heads: 
> > https://openqa.qubes-os.org/tests/60151#step/install_startup/11
> > 
> > Andy says:
> >> yeah, I've seen the same on XTF
> >> binutil's recent elf notes for CET compatibility use an unsigned long
> >> so they're not compatible when we build as 64bit and then re-package as 32
> >> I think we need to strip all elf notes in mkelf32
> 
> Instead of complicating mkelf32 (we want to retain at least the build-id
> note, after all, and for PVH_GUEST builds also .note.Xen) why don't we
> discard the unwanted/unneeded notes then from the linker script, just
> like we already do for xen.efi?

Ok, this seems to help:

---8<
diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S
index 8930e14fc40e..f0831bd677e7 100644
--- a/xen/arch/x86/xen.lds.S
+++ b/xen/arch/x86/xen.lds.S
@@ -192,13 +192,6 @@ SECTIONS
 #endif
 #endif
 
-#ifndef EFI
-  /* Retain these just for the purpose of possible analysis tools. */
-  DECL_SECTION(.note) {
-   *(.note.*)
-  } PHDR(note) PHDR(text)
-#endif
-
   _erodata = .;
 
   . = ALIGN(SECTION_ALIGN);
---8<---

The comment suggests some notes could be useful, but given they are
broken anyway and nobody complained so far, maybe not really?

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab


signature.asc
Description: PGP signature


Re: [PATCH v1 1/4] xen: introduce CONFIG_GENERIC_BUG_FRAME

2023-02-13 Thread Julien Grall




On 13/02/2023 13:30, Jan Beulich wrote:

On 13.02.2023 14:19, Julien Grall wrote:

On 13/02/2023 12:24, Jan Beulich wrote:

On 03.02.2023 18:05, Oleksii Kurochko wrote:

--- /dev/null
+++ b/xen/common/bug.c
@@ -0,0 +1,88 @@
+#include 
+#include 
+#include 
+#include 


Please order headers alphabetically unless there's anything preventing
that from being done.


+#include 
+#include 
+
+#include 
+
+int do_bug_frame(const struct cpu_user_regs *regs, vaddr_t pc)


I know Arm is using vaddr_t and RISC-V now also has it, but in UNIX-like
environments that's redundant with "unsigned long", and it's also
redundant with C99's uintptr_t. Hence when seeing the type I always
wonder whether it's really a host virtual address which is meant (and
not perhaps a guest one, which in principle could differ from the host
one for certain guest types). In any event the existence of this type
should imo not be a prereq to using this generic piece of infrastructure


C spec aside, the use "unsigned long" is quite overloaded within Xen.
Although, this has improved since we introduced mfn_t/gfn_t.

In the future, I could imagine us to also introduce typesafe for
vaddr_t, reducing further the risk to mix different meaning of unsigned
long.

Therefore, I think the introduction of vaddr_t should be a prereq for
using the generic piece of infrastructure.


Would be nice if other maintainers could share their thoughts here. I,
for one, would view a typesafe gaddr_t as reasonable, and perhaps also
a typesafe gvaddr_t, but hypervisor addresses should be fine as void *
or unsigned long.


See my answer to Andrew.




--- /dev/null
+++ b/xen/include/xen/bug.h
@@ -0,0 +1,127 @@
+#ifndef __XEN_BUG_H__
+#define __XEN_BUG_H__
+
+#define BUG_DISP_WIDTH24
+#define BUG_LINE_LO_WIDTH (31 - BUG_DISP_WIDTH)
+#define BUG_LINE_HI_WIDTH (31 - BUG_DISP_WIDTH)
+
+#define BUGFRAME_run_fn 0
+#define BUGFRAME_warn   1
+#define BUGFRAME_bug2
+#define BUGFRAME_assert 3
+
+#define BUGFRAME_NR 4
+
+#ifndef __ASSEMBLY__
+
+#include 
+#include 
+#include 
+#include 


Again - please sort headers.


+#include 
+
+#ifndef BUG_FRAME_STUFF
+struct bug_frame {


Please can we have a blank line between the above two ones and then similarly
ahead of the #endif?


+signed int loc_disp;/* Relative address to the bug address */
+signed int file_disp;   /* Relative address to the filename */
+signed int msg_disp;/* Relative address to the predicate (for ASSERT) 
*/
+uint16_t line;  /* Line number */
+uint32_t pad0:16;   /* Padding for 8-bytes align */


Already the original comment in Arm code is wrong: The padding doesn't
guarantee 8-byte alignment; it merely guarantees a multiple of 8 bytes
size.
Aiui there's also no need for 8-byte alignment anywhere here (in
fact there's ".p2align 2" in the generator macros).


I would rather keep the pad0 here.


May I ask why that is?


It makes clear of the padding (which would have been hidden) when using 
.p2align 2.


Cheers,

--
Julien Grall



Re: [PATCH v1 1/4] xen: introduce CONFIG_GENERIC_BUG_FRAME

2023-02-13 Thread Julien Grall

Hi Andrew,

On 13/02/2023 13:33, Andrew Cooper wrote:

On 13/02/2023 1:19 pm, Julien Grall wrote:



On 13/02/2023 12:24, Jan Beulich wrote:

On 03.02.2023 18:05, Oleksii Kurochko wrote:

--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -92,6 +92,12 @@ config STATIC_MEMORY
       If unsure, say N.
   +config GENERIC_DO_BUG_FRAME
+    bool
+    help
+  Generic do_bug_frame() function is needed to handle the type
of bug
+  frame and print an information about it.


Generally help text for prompt-less functions is not very useful.
Assuming
it is put here to inform people actually reading the source file, I'm
okay
for it to be left here, but please at least drop the stray "an". I also
think this may want moving up in the file, e.g. ahead of all the HAS_*
controls (which, as you will notice, all have no help text either). Plus
finally how about shorter and more applicable GENERIC_BUG_FRAME - after
all what becomes generic is more than just do_bug_frame()?


--- /dev/null
+++ b/xen/common/bug.c
@@ -0,0 +1,88 @@
+#include 
+#include 
+#include 
+#include 


Please order headers alphabetically unless there's anything preventing
that from being done.


+#include 
+#include 
+
+#include 
+
+int do_bug_frame(const struct cpu_user_regs *regs, vaddr_t pc)


I know Arm is using vaddr_t and RISC-V now also has it, but in UNIX-like
environments that's redundant with "unsigned long", and it's also
redundant with C99's uintptr_t. Hence when seeing the type I always
wonder whether it's really a host virtual address which is meant (and
not perhaps a guest one, which in principle could differ from the host
one for certain guest types). In any event the existence of this type
should imo not be a prereq to using this generic piece of infrastructure


C spec aside, the use "unsigned long" is quite overloaded within Xen.
Although, this has improved since we introduced mfn_t/gfn_t.

In the future, I could imagine us to also introduce typesafe for
vaddr_t, reducing further the risk to mix different meaning of
unsigned long.

Therefore, I think the introduction of vaddr_t should be a prereq for
using the generic piece of infrastructure.


I'm with Jan on this.  vaddr_t is pure obfuscation, and you can do what
you like with it in ARM, but you will find very firm objection to it
finding its way into common or x86 code.


Talking about obfuscation...

42sh> ack "unsigned long addr" | wc -l
143

2/3 of this is on x86. Without looking at the code, it can be quite 
difficult to figure out if this is meant to a virtual/physical 
host/guest address.




virtual addresses are spelt 'unsigned long'.


That's ok so long there are no way to mistakenly mix some parameters. 
For instance in the past, the type of map_pages_to_xen() was:


int map_pages_to_xen(unsigned long virt,
 unsigned long mfn,
 unsigned long nr_mfns,
 unsigned int flags)

Since then 'mfn' is now thankfully mfn_t... But before, it was easy to mix.

For sure, there are other way to do it like properly naming the 
variable... But using a type as the advantage that it could be checked a 
compilation.


I am open to other suggestions if you have a way to guarantee that 
mistake can be avoided.


Cheers,

--
Julien Grall



Re: [PATCH v1 1/4] xen: introduce CONFIG_GENERIC_BUG_FRAME

2023-02-13 Thread Andrew Cooper
On 13/02/2023 1:19 pm, Julien Grall wrote:
>
>
> On 13/02/2023 12:24, Jan Beulich wrote:
>> On 03.02.2023 18:05, Oleksii Kurochko wrote:
>>> --- a/xen/common/Kconfig
>>> +++ b/xen/common/Kconfig
>>> @@ -92,6 +92,12 @@ config STATIC_MEMORY
>>>       If unsure, say N.
>>>   +config GENERIC_DO_BUG_FRAME
>>> +    bool
>>> +    help
>>> +  Generic do_bug_frame() function is needed to handle the type
>>> of bug
>>> +  frame and print an information about it.
>>
>> Generally help text for prompt-less functions is not very useful.
>> Assuming
>> it is put here to inform people actually reading the source file, I'm
>> okay
>> for it to be left here, but please at least drop the stray "an". I also
>> think this may want moving up in the file, e.g. ahead of all the HAS_*
>> controls (which, as you will notice, all have no help text either). Plus
>> finally how about shorter and more applicable GENERIC_BUG_FRAME - after
>> all what becomes generic is more than just do_bug_frame()?
>>
>>> --- /dev/null
>>> +++ b/xen/common/bug.c
>>> @@ -0,0 +1,88 @@
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>
>> Please order headers alphabetically unless there's anything preventing
>> that from being done.
>>
>>> +#include 
>>> +#include 
>>> +
>>> +#include 
>>> +
>>> +int do_bug_frame(const struct cpu_user_regs *regs, vaddr_t pc)
>>
>> I know Arm is using vaddr_t and RISC-V now also has it, but in UNIX-like
>> environments that's redundant with "unsigned long", and it's also
>> redundant with C99's uintptr_t. Hence when seeing the type I always
>> wonder whether it's really a host virtual address which is meant (and
>> not perhaps a guest one, which in principle could differ from the host
>> one for certain guest types). In any event the existence of this type
>> should imo not be a prereq to using this generic piece of infrastructure
>
> C spec aside, the use "unsigned long" is quite overloaded within Xen.
> Although, this has improved since we introduced mfn_t/gfn_t.
>
> In the future, I could imagine us to also introduce typesafe for
> vaddr_t, reducing further the risk to mix different meaning of
> unsigned long.
>
> Therefore, I think the introduction of vaddr_t should be a prereq for
> using the generic piece of infrastructure.

I'm with Jan on this.  vaddr_t is pure obfuscation, and you can do what
you like with it in ARM, but you will find very firm objection to it
finding its way into common or x86 code.

virtual addresses are spelt 'unsigned long'.

~Andrew



Re: [PATCH v1 1/4] xen: introduce CONFIG_GENERIC_BUG_FRAME

2023-02-13 Thread Jan Beulich
On 13.02.2023 14:19, Julien Grall wrote:
> On 13/02/2023 12:24, Jan Beulich wrote:
>> On 03.02.2023 18:05, Oleksii Kurochko wrote:
>>> --- /dev/null
>>> +++ b/xen/common/bug.c
>>> @@ -0,0 +1,88 @@
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>
>> Please order headers alphabetically unless there's anything preventing
>> that from being done.
>>
>>> +#include 
>>> +#include 
>>> +
>>> +#include 
>>> +
>>> +int do_bug_frame(const struct cpu_user_regs *regs, vaddr_t pc)
>>
>> I know Arm is using vaddr_t and RISC-V now also has it, but in UNIX-like
>> environments that's redundant with "unsigned long", and it's also
>> redundant with C99's uintptr_t. Hence when seeing the type I always
>> wonder whether it's really a host virtual address which is meant (and
>> not perhaps a guest one, which in principle could differ from the host
>> one for certain guest types). In any event the existence of this type
>> should imo not be a prereq to using this generic piece of infrastructure
> 
> C spec aside, the use "unsigned long" is quite overloaded within Xen. 
> Although, this has improved since we introduced mfn_t/gfn_t.
> 
> In the future, I could imagine us to also introduce typesafe for 
> vaddr_t, reducing further the risk to mix different meaning of unsigned 
> long.
> 
> Therefore, I think the introduction of vaddr_t should be a prereq for 
> using the generic piece of infrastructure.

Would be nice if other maintainers could share their thoughts here. I,
for one, would view a typesafe gaddr_t as reasonable, and perhaps also
a typesafe gvaddr_t, but hypervisor addresses should be fine as void *
or unsigned long.

>>> --- /dev/null
>>> +++ b/xen/include/xen/bug.h
>>> @@ -0,0 +1,127 @@
>>> +#ifndef __XEN_BUG_H__
>>> +#define __XEN_BUG_H__
>>> +
>>> +#define BUG_DISP_WIDTH24
>>> +#define BUG_LINE_LO_WIDTH (31 - BUG_DISP_WIDTH)
>>> +#define BUG_LINE_HI_WIDTH (31 - BUG_DISP_WIDTH)
>>> +
>>> +#define BUGFRAME_run_fn 0
>>> +#define BUGFRAME_warn   1
>>> +#define BUGFRAME_bug2
>>> +#define BUGFRAME_assert 3
>>> +
>>> +#define BUGFRAME_NR 4
>>> +
>>> +#ifndef __ASSEMBLY__
>>> +
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>
>> Again - please sort headers.
>>
>>> +#include 
>>> +
>>> +#ifndef BUG_FRAME_STUFF
>>> +struct bug_frame {
>>
>> Please can we have a blank line between the above two ones and then similarly
>> ahead of the #endif?
>>
>>> +signed int loc_disp;/* Relative address to the bug address */
>>> +signed int file_disp;   /* Relative address to the filename */
>>> +signed int msg_disp;/* Relative address to the predicate (for 
>>> ASSERT) */
>>> +uint16_t line;  /* Line number */
>>> +uint32_t pad0:16;   /* Padding for 8-bytes align */
>>
>> Already the original comment in Arm code is wrong: The padding doesn't
>> guarantee 8-byte alignment; it merely guarantees a multiple of 8 bytes
>> size.
>> Aiui there's also no need for 8-byte alignment anywhere here (in
>> fact there's ".p2align 2" in the generator macros).
> 
> I would rather keep the pad0 here.

May I ask why that is?

>> I also wonder why this needs to be a named bitfield: Either make it
>> plain uint16_t, or if you use a bitfield, then omit the name.
> 
> Everything you seem to suggest are clean ups. So I think it would be 
> better if they are first applied to Arm and then we move the code to 
> common afterwards.
> 
> This will make easier to confirm what changed and also tracking the 
> history (think git blame).
> 
> That said, I don't view the clean-ups as necessary in order to move the 
> code in common... They could be done afterwards by Oleksii or someone else.

I disagree: The wider the exposure / use of code, the more important it is
to be reasonably tidy. Cleaning up first and then moving is certainly fine
with me.

>>> + ".pushsection .rodata.str, \"aMS\", %progbits, 1\n"   
>>>  \
>>> + "2:\t.asciz " __stringify(file) "\n"  
>>>  \
>>
>> file is always a string literal; really it always is __FILE__ in this
>> non-x86 implementation. So first preference ought to be to drop the
>> macro parameter and use __FILE__ here (same then for line vs __LINE__).
>> Yet even if not done like that, the __stringify() is largely unneeded
>> (unless we expect file names to have e.g. backslashes in their names)
>> and looks somewhat odd here. So how about
>>
>>   "2:\t.asciz \"" __FILE__ "\"\n"
>>
>> ? But wait - peeking ahead to the x86 patch I notice that __FILE__ and
>> __LINE__ need to remain arguments. But then the second preference would
>> still be
>>
>>   "2:\t.asciz \"" file "\"\n"
>>
>> imo. Yet maybe others disagree ...
> 
> I would prefer to keep the __stringify() version because it avoids 
> relying on file to always be a string literal.

... hiding possible mistakes (not passing a string literal is very
likely unintentional here after all).

Jan



[linux-linus test] 177133: regressions - trouble: broken/fail/pass/starved

2023-02-13 Thread osstest service owner
flight 177133 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/177133/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt-pair broken
 test-amd64-amd64-libvirt-pair 7 host-install/dst_host(7) broken REGR. vs. 
177104
 test-amd64-amd64-freebsd11-amd64 21 guest-start/freebsd.repeat fail REGR. vs. 
177104

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 177104
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 177104
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 177104
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 177104
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 177104
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-examine  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-credit1   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   starved  n/a
 build-armhf-libvirt   1 build-check(1)   starved  n/a
 build-armhf   2 hosts-allocate   starved  n/a

version targeted for testing:
 linuxceaa837f96adb69c0df0397937cd74991d5d821a
baseline version:
 linux711e9a4d52bf4e477e51c7135e1e6188c42018d0

Last test of basis   177104  2023-02-12 20:42:38 Z0 days
Testing same since   177133  2023-02-13 03:40:57 Z0 days1 attempts


People who touched revisions under test:
  Geert Uytterhoeven 
  John Paul Adrian Glaubitz 
  Linus Torvalds 
  Steven Rostedt (Google) 
  Yafang Shao 
  Yoshinori Sato 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  starved 
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  pass
 build-armhf-libvirt  starved 
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-arm64-pvops 

Re: [PATCH v1 1/4] xen: introduce CONFIG_GENERIC_BUG_FRAME

2023-02-13 Thread Julien Grall




On 13/02/2023 12:24, Jan Beulich wrote:

On 03.02.2023 18:05, Oleksii Kurochko wrote:

--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -92,6 +92,12 @@ config STATIC_MEMORY
  
  	  If unsure, say N.
  
+config GENERIC_DO_BUG_FRAME

+   bool
+   help
+ Generic do_bug_frame() function is needed to handle the type of bug
+ frame and print an information about it.


Generally help text for prompt-less functions is not very useful. Assuming
it is put here to inform people actually reading the source file, I'm okay
for it to be left here, but please at least drop the stray "an". I also
think this may want moving up in the file, e.g. ahead of all the HAS_*
controls (which, as you will notice, all have no help text either). Plus
finally how about shorter and more applicable GENERIC_BUG_FRAME - after
all what becomes generic is more than just do_bug_frame()?


--- /dev/null
+++ b/xen/common/bug.c
@@ -0,0 +1,88 @@
+#include 
+#include 
+#include 
+#include 


Please order headers alphabetically unless there's anything preventing
that from being done.


+#include 
+#include 
+
+#include 
+
+int do_bug_frame(const struct cpu_user_regs *regs, vaddr_t pc)


I know Arm is using vaddr_t and RISC-V now also has it, but in UNIX-like
environments that's redundant with "unsigned long", and it's also
redundant with C99's uintptr_t. Hence when seeing the type I always
wonder whether it's really a host virtual address which is meant (and
not perhaps a guest one, which in principle could differ from the host
one for certain guest types). In any event the existence of this type
should imo not be a prereq to using this generic piece of infrastructure


C spec aside, the use "unsigned long" is quite overloaded within Xen. 
Although, this has improved since we introduced mfn_t/gfn_t.


In the future, I could imagine us to also introduce typesafe for 
vaddr_t, reducing further the risk to mix different meaning of unsigned 
long.


Therefore, I think the introduction of vaddr_t should be a prereq for 
using the generic piece of infrastructure.





+{
+const struct bug_frame *bug = NULL;
+const char *prefix = "", *filename, *predicate;
+unsigned long fixup;
+int id = -1, lineno;


For both of these I question them needing to be of a signed type.


+const struct virtual_region *region;
+
+region = find_text_region(pc);
+if ( region )
+{
+for ( id = 0; id < BUGFRAME_NR; id++ )
+{
+const struct bug_frame *b;
+unsigned int i;
+
+for ( i = 0, b = region->frame[id].bugs;
+  i < region->frame[id].n_bugs; b++, i++ )
+{
+if ( ((vaddr_t)bug_loc(b)) == pc )


Afaics this is the sole use of bug_loc(). If so, better change the macro
to avoid the need for a cast here:

#define bug_loc(b) ((unsigned long)(b) + (b)->loc_disp)


--- /dev/null
+++ b/xen/include/xen/bug.h
@@ -0,0 +1,127 @@
+#ifndef __XEN_BUG_H__
+#define __XEN_BUG_H__
+
+#define BUG_DISP_WIDTH24
+#define BUG_LINE_LO_WIDTH (31 - BUG_DISP_WIDTH)
+#define BUG_LINE_HI_WIDTH (31 - BUG_DISP_WIDTH)
+
+#define BUGFRAME_run_fn 0
+#define BUGFRAME_warn   1
+#define BUGFRAME_bug2
+#define BUGFRAME_assert 3
+
+#define BUGFRAME_NR 4
+
+#ifndef __ASSEMBLY__
+
+#include 
+#include 
+#include 
+#include 


Again - please sort headers.


+#include 
+
+#ifndef BUG_FRAME_STUFF
+struct bug_frame {


Please can we have a blank line between the above two ones and then similarly
ahead of the #endif?


+signed int loc_disp;/* Relative address to the bug address */
+signed int file_disp;   /* Relative address to the filename */
+signed int msg_disp;/* Relative address to the predicate (for ASSERT) 
*/
+uint16_t line;  /* Line number */
+uint32_t pad0:16;   /* Padding for 8-bytes align */


Already the original comment in Arm code is wrong: The padding doesn't
guarantee 8-byte alignment; it merely guarantees a multiple of 8 bytes
size.
Aiui there's also no need for 8-byte alignment anywhere here (in
fact there's ".p2align 2" in the generator macros).


I would rather keep the pad0 here.



I also wonder why this needs to be a named bitfield: Either make it
plain uint16_t, or if you use a bitfield, then omit the name.


Everything you seem to suggest are clean ups. So I think it would be 
better if they are first applied to Arm and then we move the code to 
common afterwards.


This will make easier to confirm what changed and also tracking the 
history (think git blame).


That said, I don't view the clean-ups as necessary in order to move the 
code in common... They could be done afterwards by Oleksii or someone else.





+};
+
+#define bug_loc(b) ((const void *)(b) + (b)->loc_disp)
+#define bug_file(b) ((const void *)(b) + (b)->file_disp);
+#define bug_line(b) ((b)->line)
+#define bug_msg(b) ((const char *)(b) + (b)->msg_disp)
+#endif /* BUG_FRAME_STUFF */
+
+#ifndef BUG_FRAME
+/* Many versions of GCC 

Re: [PATCH v3] x86/vlapic: call vmx_vlapic_msr_changed through an hvm_function callback

2023-02-13 Thread Jan Beulich
On 13.02.2023 12:50, Xenia Ragiadakou wrote:
> APIC virtualization support is currently implemented only for Intel VT-x.
> To aid future work on separating AMD-V from Intel VT-x code, instead of
> calling directly vmx_vlapic_msr_changed() from common hvm code, add a stub
> to the hvm_function_table, named update_vlapic_mode, and create a wrapper
> function, called hvm_vlapic_mode(), to be used by common hvm code.
> 
> After the change above, do not include header asm/hvm/vmx/vmx.h as it is
> not required anymore and resolve subsequent build errors for implicit
> declaration of functions ‘TRACE_2_LONG_3D’ and ‘TRC_PAR_LONG’ by including
> missing asm/hvm/trace.h header.
> 
> No functional change intended.
> 
> Signed-off-by: Xenia Ragiadakou 

Reviewed-by: Jan Beulich 





Re: [PATCH v1 4/4] xen: change to

2023-02-13 Thread Jan Beulich
On 03.02.2023 18:05, Oleksii Kurochko wrote:
> Since the generic version of bug.h stuff was introduced it
> is necessary to rename all uses of  to 
> 
> Signed-off-by: Oleksii Kurochko 

Doesn't this change need to come ahead of at least what currently is patch 3?
Or else why do you say "necessary" (I take this to mean the build otherwise
is broken)? If the build works after patch 3, the change here may want to be
drop the unnecessary asm/bug.h includes instead.

Jan



Re: [PATCH v1 3/4] xen/x86: switch x86 to use generic implemetation of bug.h

2023-02-13 Thread Jan Beulich
On 03.02.2023 18:05, Oleksii Kurochko wrote:
> Signed-off-by: Oleksii Kurochko 

Is there anything keeping x86 from also using the generic do_bug_frame()?
If not, switching over would then likely mean no need for the new Kconfig
control.

Jan



Re: [PATCH v1 2/4] xen/arm: switch ARM to use generic implementation of bug.h

2023-02-13 Thread Jan Beulich
On 03.02.2023 18:05, Oleksii Kurochko wrote:
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -18,6 +18,7 @@ config ARM
>   select HAS_PDX
>   select HAS_PMAP
>   select IOMMU_FORCE_PT_SHARE
> + select GENERIC_DO_BUG_FRAME

Please maintain (largely?) alphabetic ordering here.

Jan




Re: [PATCH v2] x86/vlapic: call vmx_vlapic_msr_changed through an hvm_function callback

2023-02-13 Thread Jan Beulich
On 13.02.2023 12:05, Andrew Cooper wrote:
> On 13/02/2023 10:54 am, Xenia Ragiadakou wrote:
>>
>> On 2/13/23 12:09, Jan Beulich wrote:
>>> I keep forgetting why putting it on just the declaration isn't
>>> sufficient; I
>>> guess a short comment to that effect next to cf_check's definition in
>>> compiler.h would help).
>>
>> Yes, that would help. I don't know why it is not sufficient placing it
>> just in the declaration.
> 
> Because it's part of the function's type.

What does that mean? Is that any different from e.g. noreturn, which
also is part of the function's type, and the compiler records this when
seeing the declaration. It only ever accumulates (never removes) such
attributes when seeing a 2nd declaration (unless multiple declarations
are otherwise rejected) or, finally, the definition.

Jan



Re: xen (multiboot) binary gets corrupted ELF notes

2023-02-13 Thread Jan Beulich
On 13.02.2023 12:14, Marek Marczykowski-Górecki wrote:
> Hi,
> 
> I'm getting some ELF note issues on multiboot binary
> specifically:
> xen/xen: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically 
> linked, BuildID[sha1]=f7d2c37a4ad599b268f2f2d94bff3776d12649b3, bad note 
> description size 0xc0010001, stripped
> 
> readelf additionally says:
> 
> Displaying notes found in: .note
>   OwnerData size  Description
>   GNU  0x0014 NT_GNU_BUILD_ID (unique build ID 
> bitstring)
> Build ID: c5825a0d08edc4d11b1138fedca6b14ce8ba7302
>   (NONE)   0x0004 Unknown note type: (0x0020)
>description data: 05 00 00 00 
> readelf: xen/xen: Warning: note with invalid namesz and/or descsz found 
> at offset 0x34
> readelf: xen/xen: Warning:  type: 0x4, namesize: 0x00554e47, descsize: 
> 0xc0010001, alignment: 4
> 
> Grub doesn't care, but launching such xen with kexec doesn't work.
> 
> Initially found when booting Xen via Heads: 
> https://openqa.qubes-os.org/tests/60151#step/install_startup/11
> 
> Andy says:
>> yeah, I've seen the same on XTF
>> binutil's recent elf notes for CET compatibility use an unsigned long
>> so they're not compatible when we build as 64bit and then re-package as 32
>> I think we need to strip all elf notes in mkelf32

Instead of complicating mkelf32 (we want to retain at least the build-id
note, after all, and for PVH_GUEST builds also .note.Xen) why don't we
discard the unwanted/unneeded notes then from the linker script, just
like we already do for xen.efi?

Jan



Re: [XEN v3 4/9] xen/arm: Introduce a wrapper for dt_device_get_address() to handle paddr_t

2023-02-13 Thread Ayan Kumar Halder



On 11/02/2023 09:10, Julien Grall wrote:

Hi,

Hi Julien/Stefano,


On 11/02/2023 00:20, Stefano Stabellini wrote:

On Wed, 8 Feb 2023, Ayan Kumar Halder wrote:

dt_device_get_address() can accept u64 only for address and size.
However, the address/size denotes physical addresses. Thus, they should
be represented by 'paddr_t'.
Consequently, we introduce a wrapper for dt_device_get_address() ie
dt_device_get_paddr() which accepts address/size as paddr_t and inturn
invokes dt_device_get_address() after converting address/size to u64.

The reason for introducing doing this is that in future 'paddr_t' may
be defined as u32. Thus, we need an explicit wrapper to do the type
conversion and return an error in case of truncation.

With this, callers now invoke dt_device_get_paddr().
dt_device_get_address() is invoked by dt_device_get_paddr() only.

Signed-off-by: Ayan Kumar Halder 
---
Changes from -

v1 - 1. New patch.

v2 - 1. Extracted part of "[XEN v2 05/11] xen/arm: Use paddr_t 
instead of u64 for address/size"

into this patch.

2. dt_device_get_address() callers now invoke dt_device_get_paddr() 
instead.


3. Logged error in case of truncation.

  xen/arch/arm/domain_build.c    | 10 +++---
  xen/arch/arm/gic-v2.c  | 10 +++---
  xen/arch/arm/gic-v3-its.c  |  4 +--
  xen/arch/arm/gic-v3.c  | 10 +++---
  xen/arch/arm/pci/pci-host-common.c |  6 ++--
  xen/arch/arm/platforms/brcm-raspberry-pi.c |  2 +-
  xen/arch/arm/platforms/brcm.c  |  4 +--
  xen/arch/arm/platforms/exynos5.c   | 32 +--
  xen/arch/arm/platforms/sunxi.c |  2 +-
  xen/arch/arm/platforms/xgene-storm.c   |  2 +-
  xen/common/device_tree.c   | 36 
--

  xen/drivers/char/cadence-uart.c    |  4 +--
  xen/drivers/char/exynos4210-uart.c |  4 +--
  xen/drivers/char/imx-lpuart.c  |  4 +--
  xen/drivers/char/meson-uart.c  |  4 +--
  xen/drivers/char/mvebu-uart.c  |  4 +--
  xen/drivers/char/ns16550.c |  2 +-
  xen/drivers/char/omap-uart.c   |  4 +--
  xen/drivers/char/pl011.c   |  6 ++--
  xen/drivers/char/scif-uart.c   |  4 +--
  xen/drivers/passthrough/arm/ipmmu-vmsa.c   |  8 ++---
  xen/drivers/passthrough/arm/smmu-v3.c  |  2 +-
  xen/drivers/passthrough/arm/smmu.c |  8 ++---
  xen/include/xen/device_tree.h  |  6 ++--
  24 files changed, 105 insertions(+), 73 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 4d7e67560f..c7e88f5011 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1692,13 +1692,13 @@ static int __init find_memory_holes(const 
struct kernel_info *kinfo,

  dt_for_each_device_node( dt_host, np )
  {
  unsigned int naddr;
-    u64 addr, size;
+    paddr_t addr, size;
    naddr = dt_number_of_address(np);
    for ( i = 0; i < naddr; i++ )
  {
-    res = dt_device_get_address(np, i, , );
+    res = dt_device_get_paddr(np, i, , );


One thing to be careful here is that "start" and "end" in
find_memory_holes are defined now as paddr_t and passed to
rangeset_add_range and rangeset_remove_range.


I am a bit puzzled why you are saying "now". Without Ayan's patch addr 
was 64-bit so...




rangeset_add_range and rangeset_remove_range take an unsigned long as
parameter, so in an arm32 configuration with paddr_t set to 64-bit it
would result in truncation


... the problem you are talking about would already exist.

I would consider to store a frame number rather than the full address 
because we can only deal with page-aligned region.


Stefano (or Ayan) can you look at it?


I have sent out a patch to address this "[XEN v6 2/2] xen/arm: 
domain_build: Use pfn start and end address for rangeset_{xxx}_range()".


- Ayan



Cheers,





[XEN v6 2/2] xen/arm: domain_build: Use pfn start and end address for rangeset_{xxx}_range()

2023-02-13 Thread Ayan Kumar Halder
rangeset_{xxx}_range() functions are invoked with 'start' and 'size' as
arguments which are either 'uint64_t' or 'paddr_t'. However, the function
accepts 'unsigned long' for 'start' and 'size'. 'unsigned long' is 32 bits for
ARM_32. Thus, there is an implicit downcasting from 'uint64_t'/'paddr_t' to
'unsigned long' when invoking rangeset_{xxx}_range().

However, it may seem there is a possibility of lose of data due to truncation.

In reality, 'start' and 'size' are always page aligned. And ARM_32 currently
supports 40 bits as the width of physical address.
So if the addresses are page aligned, the last 12 bits contain zeroes.
Thus, we could instead pass page frame number which will contain 28 bits (40-12
on Arm_32) and this can be represented using 'unsigned long'.

On Arm_64, this change will not induce any adverse side effect as the width of
physical address is 48 bits. Thus, the width of 'pfn' (ie 48 - 12 = 36) can be
represented using 'unsigned long' (which is 64 bits wide).

Signed-off-by: Ayan Kumar Halder 
---

Changes from -

v1 - v5 - NA (New patch introduced in v6).

 xen/arch/arm/domain_build.c | 22 +-
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index a798e0b256..6a8c7206ae 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1566,7 +1566,8 @@ static int __init find_unallocated_memory(const struct 
kernel_info *kinfo,
 {
 start = bootinfo.mem.bank[i].start;
 end = bootinfo.mem.bank[i].start + bootinfo.mem.bank[i].size;
-res = rangeset_add_range(unalloc_mem, start, end - 1);
+res = rangeset_add_range(unalloc_mem, PFN_DOWN(start),
+ PFN_DOWN(end - 1));
 if ( res )
 {
 printk(XENLOG_ERR "Failed to add: %#"PRIpaddr"->%#"PRIpaddr"\n",
@@ -1580,7 +1581,8 @@ static int __init find_unallocated_memory(const struct 
kernel_info *kinfo,
 {
 start = assign_mem->bank[i].start;
 end = assign_mem->bank[i].start + assign_mem->bank[i].size;
-res = rangeset_remove_range(unalloc_mem, start, end - 1);
+res = rangeset_remove_range(unalloc_mem, PFN_DOWN(start),
+PFN_DOWN(end - 1));
 if ( res )
 {
 printk(XENLOG_ERR "Failed to remove: %#"PRIpaddr"->%#"PRIpaddr"\n",
@@ -1595,7 +1597,8 @@ static int __init find_unallocated_memory(const struct 
kernel_info *kinfo,
 start = bootinfo.reserved_mem.bank[i].start;
 end = bootinfo.reserved_mem.bank[i].start +
 bootinfo.reserved_mem.bank[i].size;
-res = rangeset_remove_range(unalloc_mem, start, end - 1);
+res = rangeset_remove_range(unalloc_mem, PFN_DOWN(start),
+PFN_DOWN(end - 1));
 if ( res )
 {
 printk(XENLOG_ERR "Failed to remove: %#"PRIpaddr"->%#"PRIpaddr"\n",
@@ -1607,7 +1610,7 @@ static int __init find_unallocated_memory(const struct 
kernel_info *kinfo,
 /* Remove grant table region */
 start = kinfo->gnttab_start;
 end = kinfo->gnttab_start + kinfo->gnttab_size;
-res = rangeset_remove_range(unalloc_mem, start, end - 1);
+res = rangeset_remove_range(unalloc_mem, PFN_DOWN(start), PFN_DOWN(end - 
1));
 if ( res )
 {
 printk(XENLOG_ERR "Failed to remove: %#"PRIpaddr"->%#"PRIpaddr"\n",
@@ -1617,7 +1620,7 @@ static int __init find_unallocated_memory(const struct 
kernel_info *kinfo,
 
 start = 0;
 end = (1ULL << p2m_ipa_bits) - 1;
-res = rangeset_report_ranges(unalloc_mem, start, end,
+res = rangeset_report_ranges(unalloc_mem, PFN_DOWN(start), PFN_DOWN(end),
  add_ext_regions, ext_regions);
 if ( res )
 ext_regions->nr_banks = 0;
@@ -1639,7 +1642,7 @@ static int __init handle_pci_range(const struct 
dt_device_node *dev,
 
 start = addr & PAGE_MASK;
 end = PAGE_ALIGN(addr + len);
-res = rangeset_remove_range(mem_holes, start, end - 1);
+res = rangeset_remove_range(mem_holes, PFN_DOWN(start),PFN_DOWN(end - 1));
 if ( res )
 {
 printk(XENLOG_ERR "Failed to remove: %#"PRIpaddr"->%#"PRIpaddr"\n",
@@ -1677,7 +1680,7 @@ static int __init find_memory_holes(const struct 
kernel_info *kinfo,
 /* Start with maximum possible addressable physical memory range */
 start = 0;
 end = (1ULL << p2m_ipa_bits) - 1;
-res = rangeset_add_range(mem_holes, start, end);
+res = rangeset_add_range(mem_holes, PFN_DOWN(start), PFN_DOWN(end));
 if ( res )
 {
 printk(XENLOG_ERR "Failed to add: %#"PRIpaddr"->%#"PRIpaddr"\n",
@@ -1708,7 +1711,8 @@ static int __init find_memory_holes(const struct 
kernel_info *kinfo,
 
 start = addr & PAGE_MASK;
 end = PAGE_ALIGN(addr + size);
-res = rangeset_remove_range(mem_holes, start, end - 1);
+res = rangeset_remove_range(mem_holes, 

[XEN v6 1/2] xen/arm: Use the correct format specifier

2023-02-13 Thread Ayan Kumar Halder
1. One should use 'PRIpaddr' to display 'paddr_t' variables. However,
while creating nodes in fdt, the address (if present in the node name)
should be represented using 'PRIx64'. This is to be in conformance
with the following rule present in https://elinux.org/Device_Tree_Linux

. node names
"unit-address does not have leading zeros"

As 'PRIpaddr' introduces leading zeros, we cannot use it.

So, we have introduced a wrapper ie domain_fdt_begin_node() which will
represent physical address using 'PRIx64'.

2. One should use 'PRIx64' to display 'u64' in hex format. The current
use of 'PRIpaddr' for printing PTE is buggy as this is not a physical
address.

Signed-off-by: Ayan Kumar Halder 
Reviewed-by: Stefano Stabellini 
Acked-by: Julien Grall 
---

Changes from -

v1 - 1. Moved the patch earlier.
2. Moved a part of change from "[XEN v1 8/9] xen/arm: Other adaptations 
required to support 32bit paddr"
into this patch.

v2 - 1. Use PRIx64 for appending addresses to fdt node names. This fixes the CI 
failure.

v3 - 1. Added a comment on top of domain_fdt_begin_node().
2. Check for the return of snprintf() in domain_fdt_begin_node().

v4 - 1. Grammatical error fixes.

v5 - 1. Added R-b and A-b.

 xen/arch/arm/domain_build.c | 64 +++--
 xen/arch/arm/gic-v2.c   |  6 ++--
 xen/arch/arm/mm.c   |  2 +-
 3 files changed, 44 insertions(+), 28 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index c2b97fa21e..a798e0b256 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1288,6 +1288,39 @@ static int __init fdt_property_interrupts(const struct 
kernel_info *kinfo,
 return res;
 }
 
+/*
+ * Wrapper to convert physical address from paddr_t to uint64_t and
+ * invoke fdt_begin_node(). This is required as the physical address
+ * provided as part of node name should not contain any leading
+ * zeroes. Thus, one should use PRIx64 (instead of PRIpaddr) to append
+ * unit (which contains the physical address) with name to generate a
+ * node name.
+ */
+static int __init domain_fdt_begin_node(void *fdt, const char *name,
+uint64_t unit)
+{
+/*
+ * The size of the buffer to hold the longest possible string (i.e.
+ * interrupt-controller@ + a 64-bit number + \0).
+ */
+char buf[38];
+int ret;
+
+/* ePAPR 3.4 */
+ret = snprintf(buf, sizeof(buf), "%s@%"PRIx64, name, unit);
+
+if ( ret >= sizeof(buf) )
+{
+printk(XENLOG_ERR
+   "Insufficient buffer. Minimum size required is %d\n",
+   (ret + 1));
+
+return -FDT_ERR_TRUNCATED;
+}
+
+return fdt_begin_node(fdt, buf);
+}
+
 static int __init make_memory_node(const struct domain *d,
void *fdt,
int addrcells, int sizecells,
@@ -1296,8 +1329,6 @@ static int __init make_memory_node(const struct domain *d,
 unsigned int i;
 int res, reg_size = addrcells + sizecells;
 int nr_cells = 0;
-/* Placeholder for memory@ + a 64-bit number + \0 */
-char buf[24];
 __be32 reg[NR_MEM_BANKS * 4 /* Worst case addrcells + sizecells */];
 __be32 *cells;
 
@@ -1314,9 +1345,7 @@ static int __init make_memory_node(const struct domain *d,
 
 dt_dprintk("Create memory node\n");
 
-/* ePAPR 3.4 */
-snprintf(buf, sizeof(buf), "memory@%"PRIx64, mem->bank[i].start);
-res = fdt_begin_node(fdt, buf);
+res = domain_fdt_begin_node(fdt, "memory", mem->bank[i].start);
 if ( res )
 return res;
 
@@ -1375,16 +1404,13 @@ static int __init make_shm_memory_node(const struct 
domain *d,
 {
 uint64_t start = mem->bank[i].start;
 uint64_t size = mem->bank[i].size;
-/* Placeholder for xen-shmem@ + a 64-bit number + \0 */
-char buf[27];
 const char compat[] = "xen,shared-memory-v1";
 /* Worst case addrcells + sizecells */
 __be32 reg[GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS];
 __be32 *cells;
 unsigned int len = (addrcells + sizecells) * sizeof(__be32);
 
-snprintf(buf, sizeof(buf), "xen-shmem@%"PRIx64, mem->bank[i].start);
-res = fdt_begin_node(fdt, buf);
+res = domain_fdt_begin_node(fdt, "xen-shmem", mem->bank[i].start);
 if ( res )
 return res;
 
@@ -2716,12 +2742,9 @@ static int __init make_gicv2_domU_node(struct 
kernel_info *kinfo)
 __be32 reg[(GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS) * 2];
 __be32 *cells;
 const struct domain *d = kinfo->d;
-/* Placeholder for interrupt-controller@ + a 64-bit number + \0 */
-char buf[38];
 
-snprintf(buf, sizeof(buf), "interrupt-controller@%"PRIx64,
- vgic_dist_base(>arch.vgic));
-res = fdt_begin_node(fdt, buf);
+res = domain_fdt_begin_node(fdt, "interrupt-controller",
+vgic_dist_base(>arch.vgic));
 if ( res )
   

[XEN v6 0/2] Pre-requisite fixes for supporting 32 bit physical address

2023-02-13 Thread Ayan Kumar Halder


Hi All,

These series include some patches and fixes identified during the review of
"[XEN v2 00/11] Add support for 32 bit physical address" and
"[XEN v3 0/9] Add support for 32 bit physical address".

Patch 1/2 : This has been reviewed and ack-ed. It needs to be committed before
any of the "[XEN v3 0/9]" patches are applied.

Patch 2/2 : This addresses an issue identified during the review of
"[XEN v3 4/9] xen/arm: Introduce a wrapper for dt_device_get_address() to 
handle paddr_t".

Ayan Kumar Halder (2):
  xen/arm: Use the correct format specifier
  xen/arm: domain_build: Use pfn start and end address for
rangeset_{xxx}_range()

 xen/arch/arm/domain_build.c | 86 +++--
 xen/arch/arm/gic-v2.c   |  6 +--
 xen/arch/arm/mm.c   |  2 +-
 3 files changed, 57 insertions(+), 37 deletions(-)

-- 
2.17.1




[xen-unstable-smoke test] 177160: tolerable trouble: pass/starved - PUSHED

2023-02-13 Thread osstest service owner
flight 177160 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/177160/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl   1 build-check(1)   starved  n/a
 build-armhf   2 hosts-allocate   starved  n/a

version targeted for testing:
 xen  f4f498d08d50259b9f25c274fd7b1e8ccf5693af
baseline version:
 xen  01e7477d1b081cff4288ff9f51ec59ee94c03ee0

Last test of basis   176840  2023-02-10 09:02:03 Z3 days
Testing same since   177160  2023-02-13 10:03:22 Z0 days1 attempts


People who touched revisions under test:
  Alistair Francis 
  Andrew Cooper 
  Jan Beulich 
  Juergen Gross 
  Oleksii Kurochko 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  starved 
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  starved 
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   01e7477d1b..f4f498d08d  f4f498d08d50259b9f25c274fd7b1e8ccf5693af -> smoke



Re: [PATCH v1 1/4] xen: introduce CONFIG_GENERIC_BUG_FRAME

2023-02-13 Thread Jan Beulich
On 03.02.2023 18:05, Oleksii Kurochko wrote:
> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -92,6 +92,12 @@ config STATIC_MEMORY
>  
> If unsure, say N.
>  
> +config GENERIC_DO_BUG_FRAME
> + bool
> + help
> +   Generic do_bug_frame() function is needed to handle the type of bug
> +   frame and print an information about it.

Generally help text for prompt-less functions is not very useful. Assuming
it is put here to inform people actually reading the source file, I'm okay
for it to be left here, but please at least drop the stray "an". I also
think this may want moving up in the file, e.g. ahead of all the HAS_*
controls (which, as you will notice, all have no help text either). Plus
finally how about shorter and more applicable GENERIC_BUG_FRAME - after
all what becomes generic is more than just do_bug_frame()?

> --- /dev/null
> +++ b/xen/common/bug.c
> @@ -0,0 +1,88 @@
> +#include 
> +#include 
> +#include 
> +#include 

Please order headers alphabetically unless there's anything preventing
that from being done.

> +#include 
> +#include 
> +
> +#include 
> +
> +int do_bug_frame(const struct cpu_user_regs *regs, vaddr_t pc)

I know Arm is using vaddr_t and RISC-V now also has it, but in UNIX-like
environments that's redundant with "unsigned long", and it's also
redundant with C99's uintptr_t. Hence when seeing the type I always
wonder whether it's really a host virtual address which is meant (and
not perhaps a guest one, which in principle could differ from the host
one for certain guest types). In any event the existence of this type
should imo not be a prereq to using this generic piece of infrastructure.

> +{
> +const struct bug_frame *bug = NULL;
> +const char *prefix = "", *filename, *predicate;
> +unsigned long fixup;
> +int id = -1, lineno;

For both of these I question them needing to be of a signed type.

> +const struct virtual_region *region;
> +
> +region = find_text_region(pc);
> +if ( region )
> +{
> +for ( id = 0; id < BUGFRAME_NR; id++ )
> +{
> +const struct bug_frame *b;
> +unsigned int i;
> +
> +for ( i = 0, b = region->frame[id].bugs;
> +  i < region->frame[id].n_bugs; b++, i++ )
> +{
> +if ( ((vaddr_t)bug_loc(b)) == pc )

Afaics this is the sole use of bug_loc(). If so, better change the macro
to avoid the need for a cast here:

#define bug_loc(b) ((unsigned long)(b) + (b)->loc_disp)

> --- /dev/null
> +++ b/xen/include/xen/bug.h
> @@ -0,0 +1,127 @@
> +#ifndef __XEN_BUG_H__
> +#define __XEN_BUG_H__
> +
> +#define BUG_DISP_WIDTH24
> +#define BUG_LINE_LO_WIDTH (31 - BUG_DISP_WIDTH)
> +#define BUG_LINE_HI_WIDTH (31 - BUG_DISP_WIDTH)
> +
> +#define BUGFRAME_run_fn 0
> +#define BUGFRAME_warn   1
> +#define BUGFRAME_bug2
> +#define BUGFRAME_assert 3
> +
> +#define BUGFRAME_NR 4
> +
> +#ifndef __ASSEMBLY__
> +
> +#include 
> +#include 
> +#include 
> +#include 

Again - please sort headers.

> +#include 
> +
> +#ifndef BUG_FRAME_STUFF
> +struct bug_frame {

Please can we have a blank line between the above two ones and then similarly
ahead of the #endif?

> +signed int loc_disp;/* Relative address to the bug address */
> +signed int file_disp;   /* Relative address to the filename */
> +signed int msg_disp;/* Relative address to the predicate (for 
> ASSERT) */
> +uint16_t line;  /* Line number */
> +uint32_t pad0:16;   /* Padding for 8-bytes align */

Already the original comment in Arm code is wrong: The padding doesn't
guarantee 8-byte alignment; it merely guarantees a multiple of 8 bytes
size. Aiui there's also no need for 8-byte alignment anywhere here (in
fact there's ".p2align 2" in the generator macros).

I also wonder why this needs to be a named bitfield: Either make it
plain uint16_t, or if you use a bitfield, then omit the name.

> +};
> +
> +#define bug_loc(b) ((const void *)(b) + (b)->loc_disp)
> +#define bug_file(b) ((const void *)(b) + (b)->file_disp);
> +#define bug_line(b) ((b)->line)
> +#define bug_msg(b) ((const char *)(b) + (b)->msg_disp)
> +#endif /* BUG_FRAME_STUFF */
> +
> +#ifndef BUG_FRAME
> +/* Many versions of GCC doesn't support the asm %c parameter which would
> + * be preferable to this unpleasantness. We use mergeable string
> + * sections to avoid multiple copies of the string appearing in the
> + * Xen image. BUGFRAME_run_fn needs to be handled separately.
> + */

When generalizing the logic, I wonder in how far the comment doesn't
want re-wording some. For example, while Arm prefixes constant insn
operands with # (and x86 uses $), there's no such prefix in RISC-V. At
which point there's no need to use %c in the first place. (Which in
turn means x86'es more compact representation may also be usable there.
And yet in turn the question arises whether RISC-V wouldn't better have
its own derivation of the machinery, rather than trying to 

[PATCH v3] x86/vlapic: call vmx_vlapic_msr_changed through an hvm_function callback

2023-02-13 Thread Xenia Ragiadakou
APIC virtualization support is currently implemented only for Intel VT-x.
To aid future work on separating AMD-V from Intel VT-x code, instead of
calling directly vmx_vlapic_msr_changed() from common hvm code, add a stub
to the hvm_function_table, named update_vlapic_mode, and create a wrapper
function, called hvm_vlapic_mode(), to be used by common hvm code.

After the change above, do not include header asm/hvm/vmx/vmx.h as it is
not required anymore and resolve subsequent build errors for implicit
declaration of functions ‘TRACE_2_LONG_3D’ and ‘TRC_PAR_LONG’ by including
missing asm/hvm/trace.h header.

No functional change intended.

Signed-off-by: Xenia Ragiadakou 
---

Changes in v3:
  - add cf_check attribute in both the function declaration and definition of
vmx_vlapic_msr_changed since it is used as callback, bug reported by Jan
  - initialize update_vlapic_mode in the vmx_function_table initializer to
keep unconditionally initialized cbs in a single place, reported by Jan

 xen/arch/x86/hvm/vlapic.c  | 6 +++---
 xen/arch/x86/hvm/vmx/vmx.c | 3 ++-
 xen/arch/x86/include/asm/hvm/hvm.h | 7 +++
 xen/arch/x86/include/asm/hvm/vmx/vmx.h | 2 +-
 4 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index eb32f12e2d..dc93b5e930 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -37,8 +37,8 @@
 #include 
 #include 
 #include 
-#include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1165,7 +1165,7 @@ int guest_wrmsr_apic_base(struct vcpu *v, uint64_t value)
 if ( vlapic_x2apic_mode(vlapic) )
 set_x2apic_id(vlapic);
 
-vmx_vlapic_msr_changed(vlapic_vcpu(vlapic));
+hvm_update_vlapic_mode(vlapic_vcpu(vlapic));
 
 HVM_DBG_LOG(DBG_LEVEL_VLAPIC,
 "apic base msr is 0x%016"PRIx64, vlapic->hw.apic_base_msr);
@@ -1561,7 +1561,7 @@ static int cf_check lapic_load_hidden(struct domain *d, 
hvm_domain_context_t *h)
  unlikely(vlapic_x2apic_mode(s)) )
 return -EINVAL;
 
-vmx_vlapic_msr_changed(v);
+hvm_update_vlapic_mode(v);
 
 return 0;
 }
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 270bc98195..0ec33bcc18 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2763,6 +2763,7 @@ static struct hvm_function_table __initdata_cf_clobber 
vmx_function_table = {
 .nhvm_vcpu_vmexit_event = nvmx_vmexit_event,
 .nhvm_intr_blocked= nvmx_intr_blocked,
 .nhvm_domain_relinquish_resources = nvmx_domain_relinquish_resources,
+.update_vlapic_mode = vmx_vlapic_msr_changed,
 .nhvm_hap_walk_L1_p2m = nvmx_hap_walk_L1_p2m,
 .enable_msr_interception = vmx_enable_msr_interception,
 .altp2m_vcpu_update_p2m = vmx_vcpu_update_eptp,
@@ -3472,7 +3473,7 @@ static void vmx_install_vlapic_mapping(struct vcpu *v)
 vmx_vmcs_exit(v);
 }
 
-void vmx_vlapic_msr_changed(struct vcpu *v)
+void cf_check vmx_vlapic_msr_changed(struct vcpu *v)
 {
 int virtualize_x2apic_mode;
 struct vlapic *vlapic = vcpu_vlapic(v);
diff --git a/xen/arch/x86/include/asm/hvm/hvm.h 
b/xen/arch/x86/include/asm/hvm/hvm.h
index 80e4565bd2..43d3fc2498 100644
--- a/xen/arch/x86/include/asm/hvm/hvm.h
+++ b/xen/arch/x86/include/asm/hvm/hvm.h
@@ -217,6 +217,7 @@ struct hvm_function_table {
 void (*handle_eoi)(uint8_t vector, int isr);
 int (*pi_update_irte)(const struct vcpu *v, const struct pirq *pirq,
   uint8_t gvec);
+void (*update_vlapic_mode)(struct vcpu *v);
 
 /*Walk nested p2m  */
 int (*nhvm_hap_walk_L1_p2m)(struct vcpu *v, paddr_t L2_gpa,
@@ -786,6 +787,12 @@ static inline int hvm_pi_update_irte(const struct vcpu *v,
 return alternative_call(hvm_funcs.pi_update_irte, v, pirq, gvec);
 }
 
+static inline void hvm_update_vlapic_mode(struct vcpu *v)
+{
+if ( hvm_funcs.update_vlapic_mode )
+alternative_vcall(hvm_funcs.update_vlapic_mode, v);
+}
+
 #else  /* CONFIG_HVM */
 
 #define hvm_enabled false
diff --git a/xen/arch/x86/include/asm/hvm/vmx/vmx.h 
b/xen/arch/x86/include/asm/hvm/vmx/vmx.h
index 234da4a7f4..97d6b810ec 100644
--- a/xen/arch/x86/include/asm/hvm/vmx/vmx.h
+++ b/xen/arch/x86/include/asm/hvm/vmx/vmx.h
@@ -85,7 +85,7 @@ typedef enum {
 void vmx_asm_vmexit_handler(struct cpu_user_regs);
 void vmx_intr_assist(void);
 void noreturn cf_check vmx_do_resume(void);
-void vmx_vlapic_msr_changed(struct vcpu *v);
+void cf_check vmx_vlapic_msr_changed(struct vcpu *v);
 struct hvm_emulate_ctxt;
 void vmx_realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt);
 void vmx_realmode(struct cpu_user_regs *regs);
-- 
2.37.2




xen (multiboot) binary gets corrupted ELF notes

2023-02-13 Thread Marek Marczykowski-Górecki
Hi,

I'm getting some ELF note issues on multiboot binary
specifically:
xen/xen: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically 
linked, BuildID[sha1]=f7d2c37a4ad599b268f2f2d94bff3776d12649b3, bad note 
description size 0xc0010001, stripped

readelf additionally says:

Displaying notes found in: .note
  OwnerData sizeDescription
  GNU  0x0014   NT_GNU_BUILD_ID (unique build ID 
bitstring)
Build ID: c5825a0d08edc4d11b1138fedca6b14ce8ba7302
  (NONE)   0x0004   Unknown note type: (0x0020)
   description data: 05 00 00 00 
readelf: xen/xen: Warning: note with invalid namesz and/or descsz found at 
offset 0x34
readelf: xen/xen: Warning:  type: 0x4, namesize: 0x00554e47, descsize: 
0xc0010001, alignment: 4

Grub doesn't care, but launching such xen with kexec doesn't work.

Initially found when booting Xen via Heads: 
https://openqa.qubes-os.org/tests/60151#step/install_startup/11

Andy says:
> yeah, I've seen the same on XTF
> binutil's recent elf notes for CET compatibility use an unsigned long
> so they're not compatible when we build as 64bit and then re-package as 32
> I think we need to strip all elf notes in mkelf32

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab


signature.asc
Description: PGP signature


Re: [PATCH v2] x86/vlapic: call vmx_vlapic_msr_changed through an hvm_function callback

2023-02-13 Thread Andrew Cooper
On 13/02/2023 10:54 am, Xenia Ragiadakou wrote:
>
> On 2/13/23 12:09, Jan Beulich wrote:
>> I keep forgetting why putting it on just the declaration isn't
>> sufficient; I
>> guess a short comment to that effect next to cf_check's definition in
>> compiler.h would help).
>
> Yes, that would help. I don't know why it is not sufficient placing it
> just in the declaration.

Because it's part of the function's type.

~Andrew



Re: Ping: [PATCH] Argo: don't obtain excess page references

2023-02-13 Thread Jan Beulich
On 30.01.2023 09:03, Jan Beulich wrote:
> On 30.01.2023 05:35, Christopher Clark wrote:
>> On Mon, Nov 21, 2022 at 4:41 AM Jan Beulich  wrote:
>>
>>> On 11.10.2022 11:28, Jan Beulich wrote:
 find_ring_mfn() already holds a page reference when trying to obtain a
 writable type reference. We shouldn't make assumptions on the general
 reference count limit being effectively "infinity". Obtain merely a type
 ref, re-using the general ref by only dropping the previously acquired
 one in the case of an error.

 Signed-off-by: Jan Beulich 
>>>
>>> Ping?
>>
>> Sorry it has taken me so long to review this patch and thank-you for
>> posting it. The points raised are helpful.
>>
>> Wrt to the patch - I can't ack because:
>> the general ref that is already held is from the page owner, and it may
>> actually be foreign; so the second ref acquire is currently ensuring that
>> it is a match for the owner of the ring. That needs addressing.
> 
> I'm afraid I may not understand your reply: Are you saying there's something
> wrong with the change? Or are you saying there's something wrong that merely
> becomes apparent due to the change? Or yet something else?

And to extend on the questions: What notion of "foreign" are you referring
to here? The earlier check_get_page_from_gfn() is passed the very same
domain. And anyway, get_page() wouldn't allow to acquire a ref with a
domain other than the page owner. Plus the sole caller of find_ring_mfns()
passes currd.

Jan



Re: [PATCH v2] x86/vlapic: call vmx_vlapic_msr_changed through an hvm_function callback

2023-02-13 Thread Xenia Ragiadakou



On 2/13/23 12:09, Jan Beulich wrote:

On 07.02.2023 10:43, Xenia Ragiadakou wrote:

APIC virtualization support is currently implemented only for Intel VT-x.
To aid future work on separating AMD-V from Intel VT-x code, instead of
calling directly vmx_vlapic_msr_changed() from common hvm code, add a stub
to the hvm_function_table, named update_vlapic_mode, and create a wrapper
function, called hvm_update_vlapic_mode(), to be used by common hvm code.

After the change above, do not include header asm/hvm/vmx/vmx.h as it is
not required anymore and resolve subsequent build errors for implicit
declaration of functions ‘TRACE_2_LONG_3D’ and ‘TRC_PAR_LONG’ by including
missing asm/hvm/trace.h header.

No functional change intended.

Signed-off-by: Xenia Ragiadakou 
---

Changes in v2:
   - rename set_virtual_apic_mode to update_vlapic_mode, suggested by Andrew
   - in hvm_update_vlapic_mode(), call the stub only if available, otherwise
 a BUG() will be triggered every time an svm guest writes the APIC_BASE MSR,
 bug reported by Andrew
   - initialize the stub for vmx unconditionally to maintain current behavior
 since no functional change is intended, bug reported by Andrew (here, I
 decided to place the initialization in start_vmx to be closer to the
 initializations of the other stubs that are relevant to apic 
virtualization)


I disagree with this - unconditional hooks are better put in place right
in vmx_function_table's initializer.


Ok. I will fix it.



Also now that you use the function as a callback, vmx_vlapic_msr_changed()
will need to have cf_check added (on both declaration and definition, 


Ok will do.

albeit

I keep forgetting why putting it on just the declaration isn't sufficient; I
guess a short comment to that effect next to cf_check's definition in
compiler.h would help).


Yes, that would help. I don't know why it is not sufficient placing it 
just in the declaration.




Jan


--
Xenia



Ping²: [PATCH v2] xen: credit2: respect credit2_runqueue=all when arranging runqueues

2023-02-13 Thread Jan Beulich
On 07.12.2022 12:41, Jan Beulich wrote:
> On 03.10.2022 18:21, Marek Marczykowski-Górecki wrote:
>> Documentation for credit2_runqueue=all says it should create one queue
>> for all pCPUs on the host. But since introduction
>> sched_credit2_max_cpus_runqueue, it actually created separate runqueue
>> per socket, even if the CPUs count is below
>> sched_credit2_max_cpus_runqueue.
>>
>> Adjust the condition to skip syblink check in case of
>> credit2_runqueue=all.
>>
>> Fixes: 8e2aa76dc167 ("xen: credit2: limit the max number of CPUs in a 
>> runqueue")
>> Signed-off-by: Marek Marczykowski-Górecki 
>> Reviewed-by: Juergen Gross 
>> ---
>> Changes in v2:
>>  - fix indentation
>>  - adjust doc
>>
>> The whole thing is under cpu_runqueue_match() already, so maybe
>> cpu_runqueue_siblings_match() isn't needed at all?
>> ---
>>  docs/misc/xen-command-line.pandoc | 5 +
>>  xen/common/sched/credit2.c| 9 +++--
>>  2 files changed, 12 insertions(+), 2 deletions(-)
> 
> George, Dario - any chance of an ack (or reasons not to)?

Another two months later I'm inclined to commit this with just Jürgen's R-b
(assuming it still applies cleanly). I'll give it a day or two more ...

Jan



Re: [PATCH v3] x86/ucode/AMD: apply the patch early on every logical thread

2023-02-13 Thread Jan Beulich
On 30.01.2023 12:55, Sergey Dyasli wrote:
> The original issue has been reported on AMD Bulldozer-based CPUs where
> ucode loading loses the LWP feature bit in order to gain the IBPB bit.
> LWP disabling is per-SMT/CMT core modification and needs to happen on
> each sibling thread despite the shared microcode engine. Otherwise,
> logical CPUs will end up with different cpuid capabilities.
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216211
> 
> Guests running under Xen happen to be not affected because of levelling
> logic for the feature masking/override MSRs which causes the LWP bit to
> fall out and hides the issue. The latest recommendation from AMD, after
> discussing this bug, is to load ucode on every logical CPU.
> 
> In Linux kernel this issue has been addressed by e7ad18d1169c
> ("x86/microcode/AMD: Apply the patch early on every logical thread").
> Follow the same approach in Xen.
> 
> Introduce SAME_UCODE match result and use it for early AMD ucode
> loading. Take this opportunity and move opt_ucode_allow_same out of
> compare_revisions() to the relevant callers and also modify the warning
> message based on it. Intel's side of things is modified for consistency
> but provides no functional change.
> 
> Late loading is still performed only on the first of SMT/CMT
> siblings and only if a newer ucode revision has been provided (unless
> allow_same option is specified).

Another sentence on the "why" would be helpful, or else it ends up
looking as if there was an issue left in place. (I guess latently it's
going to be left in place anyway, until such time where we re-evaluate
features after ucode application.)

> --- a/xen/arch/x86/cpu/microcode/core.c
> +++ b/xen/arch/x86/cpu/microcode/core.c
> @@ -612,13 +612,21 @@ static long cf_check microcode_update_helper(void *data)
>   * that ucode revision.
>   */
>  spin_lock(_mutex);
> -if ( microcode_cache &&
> - alternative_call(ucode_ops.compare_patch,
> -  patch, microcode_cache) != NEW_UCODE )
> +if ( microcode_cache )
>  {
> +enum microcode_match_result result;
> +
> +result = alternative_call(ucode_ops.compare_patch, patch,
> +   microcode_cache);

Nit: Indentation (3rd arg wants to align with 1st one, not 2nd).

>  spin_unlock(_mutex);
> -printk(XENLOG_WARNING "microcode: couldn't find any newer revision "
> -  "in the provided blob!\n");
> +
> +if ( result == NEW_UCODE ||
> + (opt_ucode_allow_same && result == SAME_UCODE) )
> +goto apply;
> +
> +printk(XENLOG_WARNING "microcode: couldn't find any newer%s revision 
> "
> +  "in the provided blob!\n", 
> opt_ucode_allow_same ?
> + " (or the same)" : 
> "");

Since you touch this entire printk() anyway, may I ask that you re-flow
it to meet our guidelines:

printk(XENLOG_WARNING
   "microcode: couldn't find any newer%s revision in the provided 
blob!\n",
   opt_ucode_allow_same ? " (or the same)" : "");

?

> @@ -626,6 +634,7 @@ static long cf_check microcode_update_helper(void *data)
>  }
>  spin_unlock(_mutex);
>  
> +apply:

Nit: Style (labels indented by at least one blank; see ./CODING_STYLE).
Personally I'd prefer if you got away without another "goto" here, as
our rule of thumb suggests that "goto" generally is to be used only for
error handling or other situations where the alternative would be
significantly worse (excess indentation, code duplication, ...).

Jan



Re: [PATCH v2] x86/vlapic: call vmx_vlapic_msr_changed through an hvm_function callback

2023-02-13 Thread Jan Beulich
On 07.02.2023 10:43, Xenia Ragiadakou wrote:
> APIC virtualization support is currently implemented only for Intel VT-x.
> To aid future work on separating AMD-V from Intel VT-x code, instead of
> calling directly vmx_vlapic_msr_changed() from common hvm code, add a stub
> to the hvm_function_table, named update_vlapic_mode, and create a wrapper
> function, called hvm_update_vlapic_mode(), to be used by common hvm code.
> 
> After the change above, do not include header asm/hvm/vmx/vmx.h as it is
> not required anymore and resolve subsequent build errors for implicit
> declaration of functions ‘TRACE_2_LONG_3D’ and ‘TRC_PAR_LONG’ by including
> missing asm/hvm/trace.h header.
> 
> No functional change intended.
> 
> Signed-off-by: Xenia Ragiadakou 
> ---
> 
> Changes in v2:
>   - rename set_virtual_apic_mode to update_vlapic_mode, suggested by Andrew
>   - in hvm_update_vlapic_mode(), call the stub only if available, otherwise
> a BUG() will be triggered every time an svm guest writes the APIC_BASE 
> MSR,
> bug reported by Andrew
>   - initialize the stub for vmx unconditionally to maintain current behavior
> since no functional change is intended, bug reported by Andrew (here, I
> decided to place the initialization in start_vmx to be closer to the
> initializations of the other stubs that are relevant to apic 
> virtualization)

I disagree with this - unconditional hooks are better put in place right
in vmx_function_table's initializer.

Also now that you use the function as a callback, vmx_vlapic_msr_changed()
will need to have cf_check added (on both declaration and definition, albeit
I keep forgetting why putting it on just the declaration isn't sufficient; I
guess a short comment to that effect next to cf_check's definition in
compiler.h would help).

Jan



[xen-unstable test] 177125: tolerable trouble: fail/pass/starved

2023-02-13 Thread osstest service owner
flight 177125 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/177125/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qcow221 guest-start/debian.repeatfail  like 177001
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 177049
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 177049
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 177049
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 177049
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 177049
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 177049
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 177049
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 177049
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 177049
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 build-armhf-libvirt   1 build-check(1)   starved  n/a
 test-armhf-armhf-examine  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-credit1   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   starved  n/a
 build-armhf   2 hosts-allocate   starved  n/a

version targeted for testing:
 xen  01e7477d1b081cff4288ff9f51ec59ee94c03ee0
baseline version:
 xen  01e7477d1b081cff4288ff9f51ec59ee94c03ee0

Last test of basis   177125  2023-02-13 01:53:31 Z0 days
Testing same since  (not found) 0 attempts

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64-xtf  pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  starved 
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt

Re: [PATCH 0/6] xen/x86: Use SPDX (take 1)

2023-02-13 Thread Jan Beulich
On 10.02.2023 00:00, Julien Grall wrote:
> This is a first attempt to replace all the full license text with
> SPX tag in xen/arch/x86/. For now, this is only covering files with GPL 2.0
> and 3 different patterns.

For clarification - the three patterns mentioned here are the three slightly
differently formatted/spelled versions of effectively the same text, not ...

> I have used the script below to remove the full license and add
> an SPDX tag. The script is based on the work from Anthony [1]
> 
> 42sh> cat replace_license.py
> #! /usr/bin/python3
> ## We are opening/writing to files using the binary mode to avoid
> ## python interpreting the content (reading ./drivers/video/font_8x14.c
> ## will throw some encoding error otherwise).
> 
> import sys
> 
> if len(sys.argv) < 4:
> print("./replace_license[debug]")
> exit(1)
> 
> licence_file = sys.argv[1]
> spdx = str.encode(sys.argv[2])
> file = sys.argv[3]
> # HACK: enable debug if there is a 4th argument
> debug = len(sys.argv) == 5
> 
> with open(licence_file, 'rb') as f:
> licence = f.read()
> 
> licence_spdx = b"/* SPDX-License-Identifier: " + spdx + b" */\n"
> 
> print(f"reading {file}")
> with open(file, 'rb') as f:
> whole_file = f.read()
> 
> try:
> licence_loc = whole_file.index(licence)
> except ValueError:
> print("licence not found. Ignoring")
> exit(0)
> 
> # Replace one the below pattern with nothing
> ## Pattern 1
> # * 
> # *
> whole_file = whole_file.replace(licence + b' *\n', b'')
> 
> ## Pattern 2
> # *
> # * 
> whole_file = whole_file.replace(b' *\n' + licence, b'')
> 
> ## Pattern 3
> # /*
> #  * 
> #  */
> whole_file = whole_file.replace(b'/*\n' + licence + b' */\n', b'')
> 
> ## Pattern 4
> # *
> # * 
> whole_file = whole_file.replace(b' * \n' + licence, b'')

... referring to the (really four) patterns here?

> Julien Grall (6):
>   xen/x86: Replace GPL v2.0 copyright with an SPDX tag in *.c
>   xen/x86: Replace GPL v2.0 copyright with an SPDX tag in *.c (part 2)
>   xen/x86: Replace GPL v2.0 copyright with an SPDX tag in *.h
>   xen/x86: Replace GPL v2.0 copyright with an SPDX tag in *.h (part 2)
>   xen/x86: Replace GPL v2.0 copyright with an SPDX tag in *.c (part 3)
>   xen/x86: Replace GPL v2.0 copyright with an SPDX tag in *.h (part 3)

With the one further adjustment you did spot yourself:
Acked-by: Jan Beulich 

Jan



Re: [PATCH] xen: speed up grant-table reclaim

2023-02-13 Thread Juergen Gross

On 07.02.23 03:10, Demi Marie Obenour wrote:

When a grant entry is still in use by the remote domain, Linux must put
it on a deferred list.  Normally, this list is very short, because
the PV network and block protocols expect the backend to unmap the grant
first.  However, Qubes OS's GUI protocol is subject to the constraints
of the X Window System, and as such winds up with the frontend unmapping
the window first.  As a result, the list can grow very large, resulting
in a massive memory leak and eventual VM freeze.

Fix this problem by bumping the number of entries that the VM will
attempt to free at each iteration to 1.  This is an ugly hack that
may well make a denial of service easier, but for Qubes OS that is less
bad than the problem Qubes OS users are facing today.



There really
needs to be a way for a frontend to be notified when the backend has
unmapped the grants.


Please remove this sentence from the commit message, or move it below the
"---" marker.

There are still some flag bits unallocated in struct grant_entry_v1 or
struct grant_entry_header. You could suggest some patches for Xen to use
one of the bits as a marker to get an event from the hypervisor if a
grant with such a bit set has been unmapped.

I have no idea, whether such an interface would be accepted by the
maintainers, though.


Additionally, a module parameter is provided to
allow tuning the reclaim speed.

The code previously used printk(KERN_DEBUG) whenever it had to defer
reclaiming a page because the grant was still mapped.  This resulted in
a large volume of log messages that bothered users.  Use pr_debug
instead, which suppresses the messages by default.  Developers can
enable them using the dynamic debug mechanism.

Fixes: QubesOS/qubes-issues#7410 (memory leak)
Fixes: QubesOS/qubes-issues#7359 (excessive logging)
Fixes: 569ca5b3f94c ("xen/gnttab: add deferred freeing logic")
Cc: sta...@vger.kernel.org
Signed-off-by: Demi Marie Obenour 
---
Anyone have suggestions for improving the grant mechanism?  Argo isn't
a good option, as in the GUI protocol there are substantial performance
wins to be had by using true shared memory.  Resending as I forgot the
Signed-off-by on the first submission.  Sorry about that.

  drivers/xen/grant-table.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 5c83d41..2c2faa7 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -355,14 +355,20 @@
  static void gnttab_handle_deferred(struct timer_list *);
  static DEFINE_TIMER(deferred_timer, gnttab_handle_deferred);
  
+static atomic64_t deferred_count;

+static atomic64_t leaked_count;
+static unsigned int free_per_iteration = 1;


As you are adding a kernel parameter to change this value, please set the
default to a value not potentially causing any DoS problems. Qubes OS can
still use a higher value then.


+
  static void gnttab_handle_deferred(struct timer_list *unused)
  {
-   unsigned int nr = 10;
+   unsigned int nr = READ_ONCE(free_per_iteration);


I don't see why you are needing READ_ONCE() here.


+   const bool ignore_limit = nr == 0;


I don't think you need this extra variable, if you would ...


struct deferred_entry *first = NULL;
unsigned long flags;
+   size_t freed = 0;
  
  	spin_lock_irqsave(_list_lock, flags);

-   while (nr--) {
+   while ((ignore_limit || nr--) && !list_empty(_list)) {


... change this to "while ((!nr || nr--) ...".


struct deferred_entry *entry
= list_first_entry(_list,
   struct deferred_entry, list);
@@ -372,10 +378,13 @@
list_del(>list);
spin_unlock_irqrestore(_list_lock, flags);
if (_gnttab_end_foreign_access_ref(entry->ref)) {
+   uint64_t ret = atomic64_sub_return(1, _count);


Use atomic64_dec_return()?


put_free_entry(entry->ref);
-   pr_debug("freeing g.e. %#x (pfn %#lx)\n",
-entry->ref, page_to_pfn(entry->page));
+   pr_debug("freeing g.e. %#x (pfn %#lx), %llu 
remaining\n",
+entry->ref, page_to_pfn(entry->page),
+(unsigned long long)ret);


Is each single instance of this message really needed?

If not, I'd suggest to use pr_debug_ratelimited() instead.


put_page(entry->page);
+   freed++;
kfree(entry);
entry = NULL;
} else {
@@ -387,14 +396,15 @@
spin_lock_irqsave(_list_lock, flags);
if (entry)
list_add_tail(>list, _list);
-   else if (list_empty(_list))
-   break;
}
-   if (!list_empty(_list) && !timer_pending(_timer)) {
+   if 

[PATCH v3 7/7] Mini-OS: add read and write support to 9pfsfront

2023-02-13 Thread Juergen Gross
Add support to read from and write to a file handled by 9pfsfront.

Signed-off-by: Juergen Gross 
---
V2:
- add check for max message size
- return EAGAIN in case no free request got (Samuel Thibault)
- loop until all data read/written (Samuel Thibault)
V3:
- use an exact limit for read/write (Samuel Thibault)
- log read(write errors (Samuel Thibault)
---
 9pfront.c | 216 ++
 1 file changed, 216 insertions(+)

diff --git a/9pfront.c b/9pfront.c
index fb2e5669..5da8a365 100644
--- a/9pfront.c
+++ b/9pfront.c
@@ -72,7 +72,10 @@ struct file_9pfs {
 #define P9_CMD_WALK   110
 #define P9_CMD_OPEN   112
 #define P9_CMD_CREATE 114
+#define P9_CMD_READ   116
+#define P9_CMD_WRITE  118
 #define P9_CMD_CLUNK  120
+#define P9_CMD_STAT   124
 
 /* P9 protocol open flags. */
 #define P9_OREAD0   /* read */
@@ -88,11 +91,39 @@ struct p9_header {
 uint16_t tag;
 } __attribute__((packed));
 
+struct p9_stat {
+uint16_t size;
+uint16_t type;
+uint32_t dev;
+uint8_t qid[P9_QID_SIZE];
+uint32_t mode;
+uint32_t atime;
+uint32_t mtime;
+uint64_t length;
+char *name;
+char *uid;
+char *gid;
+char *muid;
+char *extension;
+uint32_t n_uid;
+uint32_t n_gid;
+uint32_t n_muid;
+};
+
 #define P9_VERSION"9P2000.u"
 #define P9_ROOT_FID   0
 
 static unsigned int ftype_9pfs;
 
+static void free_stat(struct p9_stat *stat)
+{
+free(stat->name);
+free(stat->uid);
+free(stat->gid);
+free(stat->muid);
+free(stat->extension);
+}
+
 static unsigned int get_fid(struct dev_9pfs *dev)
 {
 unsigned int fid;
@@ -181,9 +212,12 @@ static void copy_from_ring(struct dev_9pfs *dev, void 
*data, unsigned int len)
  *Only valid for sending.
  * u: 2 byte unsigned integer (uint16_t)
  * U: 4 byte unsigned integer (uint32_t)
+ * L: 8 byte unsigned integer (uint64_t)
  * S: String (2 byte length +  characters)
  *in the rcv_9p() case the data for string is allocated (length omitted,
  *string terminated by a NUL character)
+ * D: Binary data (4 byte length +  bytes of data), requires a length
+ *and a buffer pointer parameter.
  * Q: A 13 byte "qid", consisting of 1 byte file type, 4 byte file version
  *and 8 bytes unique file id. Only valid for receiving.
  */
@@ -192,10 +226,12 @@ static void send_9p(struct dev_9pfs *dev, struct req 
*req, const char *fmt, ...)
 struct p9_header hdr;
 va_list ap, aq;
 const char *f;
+uint64_t longval;
 uint32_t intval;
 uint16_t shortval;
 uint16_t len;
 uint8_t byte;
+uint8_t *data;
 char *strval;
 
 hdr.size = sizeof(hdr);
@@ -221,11 +257,21 @@ static void send_9p(struct dev_9pfs *dev, struct req 
*req, const char *fmt, ...)
 hdr.size += 4;
 intval = va_arg(aq, unsigned int);
 break;
+case 'L':
+hdr.size += 8;
+longval = va_arg(aq, uint64_t);
+break;
 case 'S':
 hdr.size += 2;
 strval = va_arg(aq, char *);
 hdr.size += strlen(strval);
 break;
+case 'D':
+hdr.size += 4;
+intval = va_arg(aq, unsigned int);
+hdr.size += intval;
+data = va_arg(aq, uint8_t *);
+break;
 default:
 printk("send_9p: unknown format character %c\n", *f);
 break;
@@ -258,12 +304,22 @@ static void send_9p(struct dev_9pfs *dev, struct req 
*req, const char *fmt, ...)
 intval = va_arg(ap, unsigned int);
 copy_to_ring(dev, , sizeof(intval));
 break;
+case 'L':
+longval = va_arg(ap, uint64_t);
+copy_to_ring(dev, , sizeof(longval));
+break;
 case 'S':
 strval = va_arg(ap, char *);
 len = strlen(strval);
 copy_to_ring(dev, , sizeof(len));
 copy_to_ring(dev, strval, len);
 break;
+case 'D':
+intval = va_arg(ap, unsigned int);
+copy_to_ring(dev, , sizeof(intval));
+data = va_arg(ap, uint8_t *);
+copy_to_ring(dev, data, intval);
+break;
 }
 }
 
@@ -348,6 +404,8 @@ static void rcv_9p_copy(struct dev_9pfs *dev, struct req 
*req,
 uint32_t err;
 uint16_t *shortval;
 uint32_t *val;
+uint64_t *longval;
+uint8_t *data;
 char **strval;
 uint8_t *qval;
 
@@ -412,6 +470,10 @@ static void rcv_9p_copy(struct dev_9pfs *dev, struct req 
*req,
 val = va_arg(ap, uint32_t *);
 copy_bufs(, , , , val, sizeof(*val));
 break;
+case 'L':
+longval = va_arg(ap, uint64_t *);
+copy_bufs(, , , , longval, sizeof(*longval));
+break;
 case 'S':
 strval = va_arg(ap, char **);
 copy_bufs(, , , , , sizeof(len));
@@ -419,6 +481,12 @@ static 

[PATCH v3 5/7] Mini-OS: add 9pfs transport layer

2023-02-13 Thread Juergen Gross
Add the transport layer of 9pfs. This is basically the infrastructure
to send requests to the backend and to receive the related answers via
the rings.

As a first example add the version and attach requests of the 9pfs
protocol when mounting a new 9pfs device. For the version use the
"9P2000.u" variant, as it is the smallest subset supported by the qemu
based backend.

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
V2:
- add more comments (Samuel Thibault)
- log short copy (Samuel Thibault)
- send event after consuming response (Samuel Thibault)
- reaturn EAGAIN in case no free request could be got (Samuel Thibault)
---
 9pfront.c | 478 ++
 1 file changed, 478 insertions(+)

diff --git a/9pfront.c b/9pfront.c
index 89ecb3a1..0b8d5461 100644
--- a/9pfront.c
+++ b/9pfront.c
@@ -7,6 +7,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -14,6 +16,9 @@
 #include 
 
 #ifdef HAVE_LIBC
+
+#define N_REQS   64
+
 struct dev_9pfs {
 int id;
 char nodename[20];
@@ -22,6 +27,7 @@ struct dev_9pfs {
 
 char *tag;
 const char *mnt;
+unsigned int msize_max;
 
 struct xen_9pfs_data_intf *intf;
 struct xen_9pfs_data data;
@@ -32,14 +38,470 @@ struct dev_9pfs {
 evtchn_port_t evtchn;
 unsigned int ring_order;
 xenbus_event_queue events;
+
+unsigned int free_reqs;
+struct req {
+unsigned int id;
+unsigned int next_free; /* N_REQS == end of list. */
+unsigned int cmd;
+int result;
+bool inflight;
+unsigned char *data;/* Returned data. */
+} req[N_REQS];
+
+struct wait_queue_head waitq;
+struct semaphore ring_out_sem;
+struct semaphore ring_in_sem;
 };
 
 #define DEFAULT_9PFS_RING_ORDER  4
 
+#define P9_CMD_VERSION100
+#define P9_CMD_ATTACH 104
+#define P9_CMD_ERROR  107
+
+#define P9_QID_SIZE13
+
+struct p9_header {
+uint32_t size;
+uint8_t cmd;
+uint16_t tag;
+} __attribute__((packed));
+
+#define P9_VERSION"9P2000.u"
+#define P9_ROOT_FID   ~0
+
 static unsigned int ftype_9pfs;
 
+static struct req *get_free_req(struct dev_9pfs *dev)
+{
+struct req *req;
+
+if ( dev->free_reqs == N_REQS )
+return NULL;
+
+req = dev->req + dev->free_reqs;
+dev->free_reqs = req->next_free;
+
+return req;
+}
+
+static void put_free_req(struct dev_9pfs *dev, struct req *req)
+{
+req->next_free = dev->free_reqs;
+req->inflight = false;
+req->data = NULL;
+dev->free_reqs = req->id;
+}
+
+static unsigned int ring_out_free(struct dev_9pfs *dev)
+{
+RING_IDX ring_size = XEN_FLEX_RING_SIZE(dev->ring_order);
+unsigned int queued;
+
+queued = xen_9pfs_queued(dev->prod_pvt_out, dev->intf->out_cons, 
ring_size);
+rmb();
+
+return ring_size - queued;
+}
+
+static unsigned int ring_in_data(struct dev_9pfs *dev)
+{
+RING_IDX ring_size = XEN_FLEX_RING_SIZE(dev->ring_order);
+unsigned int queued;
+
+queued = xen_9pfs_queued(dev->intf->in_prod, dev->cons_pvt_in, ring_size);
+rmb();
+
+return queued;
+}
+
+static void copy_to_ring(struct dev_9pfs *dev, void *data, unsigned int len)
+{
+RING_IDX ring_size = XEN_FLEX_RING_SIZE(dev->ring_order);
+RING_IDX prod = xen_9pfs_mask(dev->prod_pvt_out, ring_size);
+RING_IDX cons = xen_9pfs_mask(dev->intf->out_cons, ring_size);
+
+xen_9pfs_write_packet(dev->data.out, data, len, , cons, ring_size);
+dev->prod_pvt_out += len;
+}
+
+static void copy_from_ring(struct dev_9pfs *dev, void *data, unsigned int len)
+{
+RING_IDX ring_size = XEN_FLEX_RING_SIZE(dev->ring_order);
+RING_IDX prod = xen_9pfs_mask(dev->intf->in_prod, ring_size);
+RING_IDX cons = xen_9pfs_mask(dev->cons_pvt_in, ring_size);
+
+xen_9pfs_read_packet(data, dev->data.in, len, prod, , ring_size);
+dev->cons_pvt_in += len;
+}
+
+/*
+ * send_9p() and rcv_9p() are using a special format string for specifying
+ * the kind of data sent/expected. Each data item is represented by a single
+ * character:
+ * U: 4 byte unsigned integer (uint32_t)
+ * S: String (2 byte length +  characters)
+ *in the rcv_9p() case the data for string is allocated (length omitted,
+ *string terminated by a NUL character)
+ * Q: A 13 byte "qid", consisting of 1 byte file type, 4 byte file version
+ *and 8 bytes unique file id. Only valid for receiving.
+ */
+static void send_9p(struct dev_9pfs *dev, struct req *req, const char *fmt, 
...)
+{
+struct p9_header hdr;
+va_list ap, aq;
+const char *f;
+uint32_t intval;
+uint16_t len;
+char *strval;
+
+hdr.size = sizeof(hdr);
+hdr.cmd = req->cmd;
+hdr.tag = req->id;
+
+va_start(ap, fmt);
+
+va_copy(aq, ap);
+for ( f = fmt; *f; f++ )
+{
+switch ( *f )
+{
+case 'U':
+hdr.size += 4;
+intval = va_arg(aq, unsigned int);
+break;
+

[PATCH v3 6/7] Mini-OS: add open and close handling to the 9pfs frontend

2023-02-13 Thread Juergen Gross
Add the open() and close() support to the 9pfs frontend. This requires
to split the path name and to walk to the desired directory level.

Signed-off-by: Juergen Gross 
---
V2:
- check path to be canonical
- avoid short read when walking the path
- fix get_fid() (Samuel Thibault)
- return EAGAIN if no free request got (Samuel Thibault)
V3:
- add check for "//" in path_canonical() (Samuel Thibault)
---
 9pfront.c | 324 +-
 1 file changed, 322 insertions(+), 2 deletions(-)

diff --git a/9pfront.c b/9pfront.c
index 0b8d5461..fb2e5669 100644
--- a/9pfront.c
+++ b/9pfront.c
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -52,13 +53,32 @@ struct dev_9pfs {
 struct wait_queue_head waitq;
 struct semaphore ring_out_sem;
 struct semaphore ring_in_sem;
+
+unsigned long long fid_mask;  /* Bit mask for free fids. */
+};
+
+struct file_9pfs {
+uint32_t fid;
+struct dev_9pfs *dev;
+bool append;
 };
 
 #define DEFAULT_9PFS_RING_ORDER  4
 
+/* P9 protocol commands (response is either cmd+1 or P9_CMD_ERROR). */
 #define P9_CMD_VERSION100
 #define P9_CMD_ATTACH 104
 #define P9_CMD_ERROR  107
+#define P9_CMD_WALK   110
+#define P9_CMD_OPEN   112
+#define P9_CMD_CREATE 114
+#define P9_CMD_CLUNK  120
+
+/* P9 protocol open flags. */
+#define P9_OREAD0   /* read */
+#define P9_OWRITE   1   /* write */
+#define P9_ORDWR2   /* read and write */
+#define P9_OTRUNC  16   /* or'ed in, truncate file first */
 
 #define P9_QID_SIZE13
 
@@ -69,10 +89,27 @@ struct p9_header {
 } __attribute__((packed));
 
 #define P9_VERSION"9P2000.u"
-#define P9_ROOT_FID   ~0
+#define P9_ROOT_FID   0
 
 static unsigned int ftype_9pfs;
 
+static unsigned int get_fid(struct dev_9pfs *dev)
+{
+unsigned int fid;
+
+fid = ffs(dev->fid_mask);
+if ( fid )
+dev->fid_mask &= ~(1ULL << (fid - 1));
+
+ return fid;
+}
+
+static void put_fid(struct dev_9pfs *dev, unsigned int fid)
+{
+if ( fid )
+dev->fid_mask |= 1ULL << (fid - 1);
+}
+
 static struct req *get_free_req(struct dev_9pfs *dev)
 {
 struct req *req;
@@ -140,6 +177,9 @@ static void copy_from_ring(struct dev_9pfs *dev, void 
*data, unsigned int len)
  * send_9p() and rcv_9p() are using a special format string for specifying
  * the kind of data sent/expected. Each data item is represented by a single
  * character:
+ * b: 1 byte unsigned integer (uint8_t)
+ *Only valid for sending.
+ * u: 2 byte unsigned integer (uint16_t)
  * U: 4 byte unsigned integer (uint32_t)
  * S: String (2 byte length +  characters)
  *in the rcv_9p() case the data for string is allocated (length omitted,
@@ -153,7 +193,9 @@ static void send_9p(struct dev_9pfs *dev, struct req *req, 
const char *fmt, ...)
 va_list ap, aq;
 const char *f;
 uint32_t intval;
+uint16_t shortval;
 uint16_t len;
+uint8_t byte;
 char *strval;
 
 hdr.size = sizeof(hdr);
@@ -167,6 +209,14 @@ static void send_9p(struct dev_9pfs *dev, struct req *req, 
const char *fmt, ...)
 {
 switch ( *f )
 {
+case 'b':
+hdr.size += 1;
+byte = va_arg(aq, unsigned int);
+break;
+case 'u':
+hdr.size += 2;
+shortval = va_arg(aq, unsigned int);
+break;
 case 'U':
 hdr.size += 4;
 intval = va_arg(aq, unsigned int);
@@ -196,6 +246,14 @@ static void send_9p(struct dev_9pfs *dev, struct req *req, 
const char *fmt, ...)
 {
 switch ( *f )
 {
+case 'b':
+byte = va_arg(ap, unsigned int);
+copy_to_ring(dev, , sizeof(byte));
+break;
+case 'u':
+shortval = va_arg(ap, unsigned int);
+copy_to_ring(dev, , sizeof(shortval));
+break;
 case 'U':
 intval = va_arg(ap, unsigned int);
 copy_to_ring(dev, , sizeof(intval));
@@ -288,6 +346,7 @@ static void rcv_9p_copy(struct dev_9pfs *dev, struct req 
*req,
 char *str;
 uint16_t len;
 uint32_t err;
+uint16_t *shortval;
 uint32_t *val;
 char **strval;
 uint8_t *qval;
@@ -345,6 +404,10 @@ static void rcv_9p_copy(struct dev_9pfs *dev, struct req 
*req,
 {
 switch ( *f )
 {
+case 'u':
+shortval = va_arg(ap, uint16_t *);
+copy_bufs(, , , , shortval, sizeof(*shortval));
+break;
 case 'U':
 val = va_arg(ap, uint32_t *);
 copy_bufs(, , , , val, sizeof(*val));
@@ -486,6 +549,170 @@ static int p9_attach(struct dev_9pfs *dev)
 return ret;
 }
 
+static int p9_clunk(struct dev_9pfs *dev, uint32_t fid)
+{
+struct req *req = get_free_req(dev);
+int ret;
+
+if ( !req )
+return EAGAIN;
+
+req->cmd = P9_CMD_CLUNK;
+send_9p(dev, req, "U", 

[PATCH v3 3/7] Mini-OS: add support for runtime mounts

2023-02-13 Thread Juergen Gross
Add the support to mount a device at runtime. The number of dynamic
mounts is limited by a #define.

For devices supporting multiple files struct file is modified to hold
a pointer to file specific data in contrast to device specific data.

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
V3:
- set number of possible mountpoints to 16 (Andrew Cooper)
---
 include/lib.h |  5 +
 lib/sys.c | 45 -
 2 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/include/lib.h b/include/lib.h
index 36d94ec4..fd8c36de 100644
--- a/include/lib.h
+++ b/include/lib.h
@@ -172,6 +172,7 @@ struct file {
 union {
 int fd; /* Any fd from an upper layer. */
 void *dev;
+void *filedata;
 };
 };
 
@@ -194,6 +195,10 @@ struct mount_point {
 void *dev;
 };
 
+int mount(const char *path, void *dev,
+  int (*open)(struct mount_point *, const char *, int, mode_t));
+void umount(const char *path);
+
 unsigned int alloc_file_type(const struct file_ops *ops);
 
 off_t lseek_default(struct file *file, off_t offset, int whence);
diff --git a/lib/sys.c b/lib/sys.c
index 2f33c937..118fc441 100644
--- a/lib/sys.c
+++ b/lib/sys.c
@@ -85,6 +85,8 @@
 }
 
 #define NOFILE 32
+#define N_MOUNTS  16
+
 extern void minios_evtchn_close_fd(int fd);
 extern void minios_gnttab_close_fd(int fd);
 
@@ -339,7 +341,7 @@ static int open_mem(struct mount_point *mnt, const char 
*pathname, int flags,
 return fd;
 }
 
-static struct mount_point mount_points[] = {
+static struct mount_point mount_points[N_MOUNTS] = {
 { .path = "/var/log", .open = open_log,  .dev = NULL },
 { .path = "/dev/mem", .open = open_mem,  .dev = NULL },
 #ifdef CONFIG_CONSFRONT
@@ -365,6 +367,8 @@ int open(const char *pathname, int flags, ...)
 for ( m = 0; m < ARRAY_SIZE(mount_points); m++ )
 {
 mnt = mount_points + m;
+if ( !mnt->path )
+continue;
 mlen = strlen(mnt->path);
 if ( !strncmp(pathname, mnt->path, mlen) &&
  (pathname[mlen] == '/' || pathname[mlen] == 0) )
@@ -375,6 +379,45 @@ int open(const char *pathname, int flags, ...)
 return -1;
 }
 
+int mount(const char *path, void *dev,
+  int (*open)(struct mount_point *, const char *, int, mode_t))
+{
+unsigned int m;
+struct mount_point *mnt;
+
+for ( m = 0; m < ARRAY_SIZE(mount_points); m++ )
+{
+mnt = mount_points + m;
+if ( !mnt->path )
+{
+mnt->path = strdup(path);
+mnt->open = open;
+mnt->dev = dev;
+return 0;
+}
+}
+
+errno = ENOSPC;
+return -1;
+}
+
+void umount(const char *path)
+{
+unsigned int m;
+struct mount_point *mnt;
+
+for ( m = 0; m < ARRAY_SIZE(mount_points); m++ )
+{
+mnt = mount_points + m;
+if ( mnt->path && !strcmp(mnt->path, path) )
+{
+free((char *)mnt->path);
+mnt->path = NULL;
+return;
+}
+}
+}
+
 int isatty(int fd)
 {
 return files[fd].type == FTYPE_CONSOLE;
-- 
2.35.3




[PATCH v3 2/7] Mini-OS: add concept of mount points

2023-02-13 Thread Juergen Gross
Add the concept of mount points to Mini-OS. A mount point is a path
associated with a device pointer and an open() callback. A mount point
can be either a file (e.g. "/dev/mem") or a directory ("/var/log").

This allows to replace the special casing in the generic open()
handling with a generic mount point handling.

Prepare the open() callbacks to support creating new files by adding a
mode parameter.

Additionally add a close() prototype to include/lib.h, as it is missing
today.

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
V2:
- pass path below mount point to open callbacks (Samuel Thibault)
---
 include/lib.h |  9 ++
 lib/sys.c | 80 +++
 2 files changed, 70 insertions(+), 19 deletions(-)

diff --git a/include/lib.h b/include/lib.h
index bec99646..36d94ec4 100644
--- a/include/lib.h
+++ b/include/lib.h
@@ -187,6 +187,13 @@ struct file_ops {
 bool (*select_wr)(struct file *file);
 };
 
+struct mount_point {
+const char *path;
+int (*open)(struct mount_point *mnt, const char *pathname, int flags,
+mode_t mode);
+void *dev;
+};
+
 unsigned int alloc_file_type(const struct file_ops *ops);
 
 off_t lseek_default(struct file *file, off_t offset, int whence);
@@ -198,6 +205,8 @@ int alloc_fd(unsigned int type);
 void close_all_files(void);
 extern struct thread *main_thread;
 void sparse(unsigned long data, size_t size);
+
+int close(int fd);
 #endif
 
 #endif /* _LIB_H_ */
diff --git a/lib/sys.c b/lib/sys.c
index 8f8a3de2..2f33c937 100644
--- a/lib/sys.c
+++ b/lib/sys.c
@@ -263,11 +263,6 @@ char *getcwd(char *buf, size_t size)
 return buf;
 }
 
-#define LOG_PATH "/var/log/"
-#define SAVE_PATH "/var/lib/xen"
-#define SAVE_CONSOLE 1
-#define RESTORE_CONSOLE 2
-
 int mkdir(const char *pathname, mode_t mode)
 {
 errno = EIO;
@@ -286,18 +281,30 @@ int posix_openpt(int flags)
 return fd;
 }
 
+static int open_pt(struct mount_point *mnt, const char *pathname, int flags,
+   mode_t mode)
+{
+return posix_openpt(flags);
+}
+
 int open_savefile(const char *path, int save)
 {
 int fd;
 char nodename[64];
 
-snprintf(nodename, sizeof(nodename), "device/console/%d", save ? 
SAVE_CONSOLE : RESTORE_CONSOLE);
+snprintf(nodename, sizeof(nodename), "device/console/%d", save ? 1 : 2);
 
 fd = open_consfront(nodename);
 printk("fd(%d) = open_savefile\n", fd);
 
 return fd;
 }
+
+static int open_save(struct mount_point *mnt, const char *pathname, int flags,
+ mode_t mode)
+{
+return open_savefile(pathname, flags & O_WRONLY);
+}
 #else
 int posix_openpt(int flags)
 {
@@ -311,24 +318,59 @@ int open_savefile(const char *path, int save)
 }
 #endif
 
-int open(const char *pathname, int flags, ...)
+static int open_log(struct mount_point *mnt, const char *pathname, int flags,
+mode_t mode)
 {
 int fd;
+
 /* Ugly, but fine.  */
-if (!strncmp(pathname,LOG_PATH,strlen(LOG_PATH))) {
-   fd = alloc_fd(FTYPE_CONSOLE);
-printk("open(%s) -> %d\n", pathname, fd);
-return fd;
+fd = alloc_fd(FTYPE_CONSOLE);
+printk("open(%s%s) -> %d\n", mnt->path, pathname, fd);
+return fd;
+}
+
+static int open_mem(struct mount_point *mnt, const char *pathname, int flags,
+mode_t mode)
+{
+int fd;
+
+fd = alloc_fd(FTYPE_MEM);
+printk("open(%s%s) -> %d\n", mnt->path, pathname, fd);
+return fd;
+}
+
+static struct mount_point mount_points[] = {
+{ .path = "/var/log", .open = open_log,  .dev = NULL },
+{ .path = "/dev/mem", .open = open_mem,  .dev = NULL },
+#ifdef CONFIG_CONSFRONT
+{ .path = "/dev/ptmx",.open = open_pt,   .dev = NULL },
+{ .path = "/var/lib/xen", .open = open_save, .dev = NULL },
+#endif
+};
+
+int open(const char *pathname, int flags, ...)
+{
+unsigned int m, mlen;
+struct mount_point *mnt;
+mode_t mode = 0;
+va_list ap;
+
+if ( flags & O_CREAT )
+{
+va_start(ap, flags);
+mode = va_arg(ap, mode_t);
+va_end(ap);
 }
-if (!strncmp(pathname, "/dev/mem", strlen("/dev/mem"))) {
-fd = alloc_fd(FTYPE_MEM);
-printk("open(/dev/mem) -> %d\n", fd);
-return fd;
+
+for ( m = 0; m < ARRAY_SIZE(mount_points); m++ )
+{
+mnt = mount_points + m;
+mlen = strlen(mnt->path);
+if ( !strncmp(pathname, mnt->path, mlen) &&
+ (pathname[mlen] == '/' || pathname[mlen] == 0) )
+return mnt->open(mnt, pathname + mlen, flags, mode);
 }
-if (!strncmp(pathname, "/dev/ptmx", strlen("/dev/ptmx")))
-return posix_openpt(flags);
-if (!strncmp(pathname,SAVE_PATH,strlen(SAVE_PATH)))
-return open_savefile(pathname, flags & O_WRONLY);
+
 errno = EIO;
 return -1;
 }
-- 
2.35.3




[PATCH v3 4/7] Mini-OS: add 9pfs frontend

2023-02-13 Thread Juergen Gross
Add a frontend for the 9pfs PV device. For now add only the code needed
to connect to the backend and the related disconnect functionality. The
9pfs protocol support will be added later.

Due to its nature (ability to access files) the whole code is guarded
by "#ifdef HAVE_LIBC".

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
V2:
- add better error handling to version parsing (Samuel Thibault)
---
 9pfront.c | 286 ++
 Config.mk |   1 +
 Makefile  |   1 +
 arch/x86/testbuild/all-no |   1 +
 arch/x86/testbuild/all-yes|   1 +
 arch/x86/testbuild/newxen-yes |   1 +
 include/9pfront.h |   7 +
 7 files changed, 298 insertions(+)
 create mode 100644 9pfront.c
 create mode 100644 include/9pfront.h

diff --git a/9pfront.c b/9pfront.c
new file mode 100644
index ..89ecb3a1
--- /dev/null
+++ b/9pfront.c
@@ -0,0 +1,286 @@
+/*
+ * Minimal 9pfs PV frontend for Mini-OS.
+ * Copyright (c) 2023 Juergen Gross, SUSE Software Solution GmbH
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#ifdef HAVE_LIBC
+struct dev_9pfs {
+int id;
+char nodename[20];
+unsigned int dom;
+char *backend;
+
+char *tag;
+const char *mnt;
+
+struct xen_9pfs_data_intf *intf;
+struct xen_9pfs_data data;
+RING_IDX prod_pvt_out;
+RING_IDX cons_pvt_in;
+
+grant_ref_t ring_ref;
+evtchn_port_t evtchn;
+unsigned int ring_order;
+xenbus_event_queue events;
+};
+
+#define DEFAULT_9PFS_RING_ORDER  4
+
+static unsigned int ftype_9pfs;
+
+static void intr_9pfs(evtchn_port_t port, struct pt_regs *regs, void *data)
+{
+}
+
+static int open_9pfs(struct mount_point *mnt, const char *pathname, int flags,
+ mode_t mode)
+{
+errno = ENOSYS;
+
+return -1;
+}
+
+static void free_9pfront(struct dev_9pfs *dev)
+{
+unsigned int i;
+
+if ( dev->data.in && dev->intf )
+{
+for ( i = 0; i < (1 << dev->ring_order); i++ )
+gnttab_end_access(dev->intf->ref[i]);
+free_pages(dev->data.in, dev->ring_order);
+}
+unbind_evtchn(dev->evtchn);
+gnttab_end_access(dev->ring_ref);
+free_page(dev->intf);
+free(dev->backend);
+free(dev->tag);
+free(dev);
+}
+
+void *init_9pfront(unsigned int id, const char *mnt)
+{
+struct dev_9pfs *dev;
+char *msg;
+char *reason = "";
+xenbus_transaction_t xbt;
+int retry = 1;
+char bepath[64] = { 0 };
+XenbusState state;
+unsigned int i;
+void *addr;
+char *version;
+char *v;
+
+printk("9pfsfront add %u, for mount at %s\n", id, mnt);
+dev = malloc(sizeof(*dev));
+memset(dev, 0, sizeof(*dev));
+snprintf(dev->nodename, sizeof(dev->nodename), "device/9pfs/%u", id);
+dev->id = id;
+
+msg = xenbus_read_unsigned(XBT_NIL, dev->nodename, "backend-id", 
>dom);
+if ( msg )
+goto err;
+msg = xenbus_read_string(XBT_NIL, dev->nodename, "backend", >backend);
+if ( msg )
+goto err;
+msg = xenbus_read_string(XBT_NIL, dev->nodename, "tag", >tag);
+if ( msg )
+goto err;
+
+snprintf(bepath, sizeof(bepath), "%s/state", dev->backend);
+free(xenbus_watch_path_token(XBT_NIL, bepath, bepath, >events));
+state = xenbus_read_integer(bepath);
+while ( msg == NULL && state < XenbusStateInitWait )
+msg = xenbus_wait_for_state_change(bepath, , >events);
+if ( msg || state != XenbusStateInitWait )
+{
+reason = "illegal backend state";
+goto err;
+}
+
+msg = xenbus_read_unsigned(XBT_NIL, dev->backend, "max-ring-page-order",
+   >ring_order);
+if ( msg )
+goto err;
+if ( dev->ring_order > DEFAULT_9PFS_RING_ORDER )
+dev->ring_order = DEFAULT_9PFS_RING_ORDER;
+
+msg = xenbus_read_string(XBT_NIL, dev->backend, "versions", );
+if ( msg )
+goto err;
+for ( v = version; *v; v++ )
+{
+if ( strtoul(v, , 10) == 1 && (*v == ',' || *v == 0) )
+{
+v = NULL;
+break;
+}
+if ( *v != ',' && *v != 0 )
+{
+reason = "backend published illegal version string";
+free(version);
+goto err;
+}
+}
+free(version);
+if ( v )
+{
+reason = "backend doesn't support version 1";
+goto err;
+}
+
+dev->ring_ref = gnttab_alloc_and_grant((void **)>intf);
+memset(dev->intf, 0, PAGE_SIZE);
+if ( evtchn_alloc_unbound(dev->dom, intr_9pfs, dev, >evtchn) )
+{
+reason = "no event channel";
+goto err;
+}
+dev->intf->ring_order = dev->ring_order;
+dev->data.in = (void *)alloc_pages(dev->ring_order);
+dev->data.out = dev->data.in + XEN_FLEX_RING_SIZE(dev->ring_order);
+for ( i = 0; i < (1 << dev->ring_order); i++ )
+{
+addr = dev->data.in + i * 

[PATCH v3 1/7] Mini-OS: xenbus: add support for reading node from directory

2023-02-13 Thread Juergen Gross
Especially PV device drivers often need to read multiple Xenstore
nodes from a common directory. Add support for reading a string or an
unsigned value by specifying the directory and the node.

Signed-off-by: Juergen Gross 
Reviewed-by: Samuel Thibault 
---
V2:
- check sscanf() return value (Samuel Thibault)
---
 include/xenbus.h |  6 ++
 xenbus.c | 40 +---
 2 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/include/xenbus.h b/include/xenbus.h
index 3871f358..c0fc0ac5 100644
--- a/include/xenbus.h
+++ b/include/xenbus.h
@@ -108,6 +108,12 @@ int xenbus_read_integer(const char *path);
  * read and parsing were successful, 0 if not */
 int xenbus_read_uuid(const char* path, unsigned char uuid[16]);
 
+/* Support functions for reading values from directory/node tuple. */
+char *xenbus_read_string(xenbus_transaction_t xbt, const char *dir,
+ const char *node, char **value);
+char *xenbus_read_unsigned(xenbus_transaction_t xbt, const char *dir,
+   const char *node, unsigned int *value);
+
 /* Contraction of snprintf and xenbus_write(path/node). */
 char* xenbus_printf(xenbus_transaction_t xbt,
   const char* node, const char* path,
diff --git a/xenbus.c b/xenbus.c
index 81e9b65d..923e8181 100644
--- a/xenbus.c
+++ b/xenbus.c
@@ -936,16 +936,21 @@ int xenbus_read_uuid(const char *path, unsigned char 
uuid[16])
 return 1;
 }
 
+#define BUFFER_SIZE 256
+static void xenbus_build_path(const char *dir, const char *node, char *res)
+{
+BUG_ON(strlen(dir) + strlen(node) + 1 >= BUFFER_SIZE);
+sprintf(res,"%s/%s", dir, node);
+}
+
 char *xenbus_printf(xenbus_transaction_t xbt, const char* node,
 const char* path, const char* fmt, ...)
 {
-#define BUFFER_SIZE 256
 char fullpath[BUFFER_SIZE];
 char val[BUFFER_SIZE];
 va_list args;
 
-BUG_ON(strlen(node) + strlen(path) + 1 >= BUFFER_SIZE);
-sprintf(fullpath,"%s/%s", node, path);
+xenbus_build_path(node, path, fullpath);
 va_start(args, fmt);
 vsprintf(val, fmt, args);
 va_end(args);
@@ -964,6 +969,35 @@ domid_t xenbus_get_self_id(void)
 return ret;
 }
 
+char *xenbus_read_string(xenbus_transaction_t xbt, const char *dir,
+ const char *node, char **value)
+{
+char path[BUFFER_SIZE];
+
+xenbus_build_path(dir, node, path);
+
+return xenbus_read(xbt, path, value);
+}
+
+char *xenbus_read_unsigned(xenbus_transaction_t xbt, const char *dir,
+   const char *node, unsigned int *value)
+{
+char path[BUFFER_SIZE];
+char *msg;
+char *str;
+
+xenbus_build_path(dir, node, path);
+msg = xenbus_read(xbt, path, );
+if ( msg )
+return msg;
+
+if ( sscanf(str, "%u", value) != 1 )
+msg = strdup("EINVAL");
+free(str);
+
+return msg;
+}
+
 /*
  * Local variables:
  * mode: C
-- 
2.35.3




[PATCH v3 0/7] Mini-OS: add minimal 9pfs support

2023-02-13 Thread Juergen Gross
This series is adding minimal support to use 9pfs in Mini-OS. It is
adding a PV 9pfs frontend and the ability to open, close, read and
write files.

The series has been tested with qemu as 9pfs backend in a PVH Mini-OS
guest (I've used a slightly modified Xenstore-stubdom for that purpose
in order to reuse the build runes).

This series is meant to setup the stage for adding file based logging
support to Xenstore-stubdom and later to add live update support (being
able to save the LU data stream in a dom0 file makes this a _lot_
easier).

In order to keep Mini-OS's license I have only used the protocol docs
available on the internet [1] and then verified those with the qemu 9pfs
backend implementation (especially for supporting the 9P2000.u variant,
as qemu doesn't support the basic 9P2000 protocol).

The needed fixed values of the protocol have been taken from [2].

[1]: http://ericvh.github.io/9p-rfc/rfc9p2000.html
[2]: https://github.com/0intro/libixp

Changes in V2:
- addressed comments by Samuel Thibault

Changes in V3:
- addressed comments by Samuel Thibault and Andrew Cooper

Juergen Gross (7):
  Mini-OS: xenbus: add support for reading node from directory
  Mini-OS: add concept of mount points
  Mini-OS: add support for runtime mounts
  Mini-OS: add 9pfs frontend
  Mini-OS: add 9pfs transport layer
  Mini-OS: add open and close handling to the 9pfs frontend
  Mini-OS: add read and write support to 9pfsfront

 9pfront.c | 1300 +
 Config.mk |1 +
 Makefile  |1 +
 arch/x86/testbuild/all-no |1 +
 arch/x86/testbuild/all-yes|1 +
 arch/x86/testbuild/newxen-yes |1 +
 include/9pfront.h |7 +
 include/lib.h |   14 +
 include/xenbus.h  |6 +
 lib/sys.c |  123 +++-
 xenbus.c  |   40 +-
 11 files changed, 1473 insertions(+), 22 deletions(-)
 create mode 100644 9pfront.c
 create mode 100644 include/9pfront.h

-- 
2.35.3




Re: [PATCH 07/10] xen/physinfo: add arm SVE arch capability and vector length

2023-02-13 Thread Jan Beulich
On 10.02.2023 16:54, Luca Fancellu wrote:
>> On 2 Feb 2023, at 12:05, Jan Beulich  wrote:
>> On 02.02.2023 12:08, Luca Fancellu wrote:
>>> (is hw_cap only for x86?)
>>
>> I suppose it is, but I also expect it would better go away than be moved.
>> It doesn't hold a complete set of information, and it has been superseded.
>>
>> Question is (and I think I did raise this before, perhaps in different
>> context) whether Arm wouldn't want to follow x86 in how hardware as well
>> as hypervisor derived / used ones are exposed to the tool stack
>> (XEN_SYSCTL_get_cpu_featureset). I guess there's nothing really precluding
>> that data to consist of more than just boolean fields.
> 
> Yes I guess that infrastructure could work, however I don’t have the 
> bandwidth to
> put it in place at the moment, so I would like the Arm maintainers to give me 
> a
> suggestion on how I can expose the vector length to XL if putting its value 
> here
> needs to be avoided

Since you've got a reply from Andrew boiling down to the same suggestion
(or should I even say request), I guess it wants seriously considering
to introduce abstract base infrastructure first. As Andrew says, time not
invested now will very likely mean yet more time to be invested later.

>>> --- a/xen/include/public/sysctl.h
>>> +++ b/xen/include/public/sysctl.h
>>> @@ -18,7 +18,7 @@
>>> #include "domctl.h"
>>> #include "physdev.h"
>>>
>>> -#define XEN_SYSCTL_INTERFACE_VERSION 0x0015
>>> +#define XEN_SYSCTL_INTERFACE_VERSION 0x0016
>>
>> Why? You ...
>>
>>> @@ -104,7 +110,8 @@ struct xen_sysctl_physinfo {
>>> uint32_t cpu_khz;
>>> uint32_t capabilities;/* XEN_SYSCTL_PHYSCAP_??? */
>>> uint32_t arch_capabilities;/* XEN_SYSCTL_PHYSCAP_{X86,ARM,...}_??? */
>>> -uint32_t pad;
>>> +uint16_t arm_sve_vl_bits;
>>> +uint16_t pad;
>>> uint64_aligned_t total_pages;
>>> uint64_aligned_t free_pages;
>>> uint64_aligned_t scrub_pages;
>>
>> ... add no new fields, and the only producer of the data zero-fills the
>> struct first thing.
> 
> Yes that is right, I will wait to understand how I can proceed here:
> 
> 1) rename arch_capabilities to arm_sve_vl_bits and put vector length there.
> 2) leave arch_capabilities untouched, no flag creation/setting, create 
> uint32_t arm_sve_vl_bits field removing “pad”,
> Use its value to determine if SVE is present or not.
> 3) ??

3) Introduce suitable #define(s) to use part of arch_capabilities for your
purpose, without renaming the field. (But that's of course only if Arm
maintainers agree with you on _not_ going the proper feature handling route
right away.)

Jan



Re: [PATCH][next] xen: Replace one-element array with flexible-array member

2023-02-13 Thread Juergen Gross

On 03.02.23 02:28, Gustavo A. R. Silva wrote:

One-element arrays are deprecated, and we are replacing them with flexible
array members instead. So, replace one-element array with flexible-array
member in struct xen_page_directory.

This helps with the ongoing efforts to tighten the FORTIFY_SOURCE
routines on memcpy() and help us make progress towards globally
enabling -fstrict-flex-arrays=3 [1].

This results in no differences in binary output.

Link: https://github.com/KSPP/linux/issues/79
Link: https://github.com/KSPP/linux/issues/255
Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html [1]
Signed-off-by: Gustavo A. R. Silva 


Pushed-to: xen/tip.git for-linus-6.3


Juergen



OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH] xen/grant-dma-iommu: Implement a dummy probe_device() callback

2023-02-13 Thread Juergen Gross

On 08.02.23 16:36, Oleksandr Tyshchenko wrote:

From: Oleksandr Tyshchenko 

Update stub IOMMU driver (which main purpose is to reuse generic
IOMMU device-tree bindings by Xen grant DMA-mapping layer on Arm)
according to the recent changes done in the following
commit 57365a04c921 ("iommu: Move bus setup to IOMMU device registration").

With probe_device() callback being called during IOMMU device registration,
the uninitialized callback just leads to the "kernel NULL pointer
dereference" issue during boot. Fix that by adding a dummy callback.

Looks like the release_device() callback is not mandatory to be
implemented as IOMMU framework makes sure that callback is initialized
before dereferencing.

Reported-by: Viresh Kumar 
Signed-off-by: Oleksandr Tyshchenko 


Pushed-to: xen/tip.git for-linus-6.3


Juergen



OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH] xen/pvcalls-back: fix permanently masked event channel

2023-02-13 Thread Juergen Gross

On 19.01.23 22:11, Volodymyr Babchuk wrote:

There is a sequence of events that can lead to a permanently masked
event channel, because xen_irq_lateeoi() is newer called. This happens
when a backend receives spurious write event from a frontend. In this
case pvcalls_conn_back_write() returns early and it does not clears the
map->write counter. As map->write > 0, pvcalls_back_ioworker() returns
without calling xen_irq_lateeoi(). This leaves the event channel in
masked state, a backend does not receive any new events from a
frontend and the whole communication stops.

Move atomic_set(>write, 0) to the very beginning of
pvcalls_conn_back_write() to fix this issue.

Signed-off-by: Volodymyr Babchuk 
Reported-by: Oleksii Moisieiev 


Pushed to: xen/tip.git for-linus-6.3


Juergen



OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH] xen: Allow platform PCI interrupt to be shared

2023-02-13 Thread Juergen Gross

On 18.01.23 13:22, David Woodhouse wrote:

From: David Woodhouse 

When we don't use the per-CPU vector callback, we ask Xen to deliver event
channel interrupts as INTx on the PCI platform device. As such, it can be
shared with INTx on other PCI devices.

Set IRQF_SHARED, and make it return IRQ_HANDLED or IRQ_NONE according to
whether the evtchn_upcall_pending flag was actually set. Now I can share
the interrupt:

  11: 82  0   IO-APIC  11-fasteoi   xen-platform-pci, ens4

Drop the IRQF_TRIGGER_RISING. It has no effect when the IRQ is shared,
and besides, the only effect it was having even beforehand was to trigger
a debug message in both I/OAPIC and legacy PIC cases:

[0.915441] genirq: No set_type function for IRQ 11 (IO-APIC)
[0.951939] genirq: No set_type function for IRQ 11 (XT-PIC)

Signed-off-by: David Woodhouse 


Pushed to: xen/tip.git for-linus-6.3


Juergen



OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH linux-next v3] x86/xen/time: prefer tsc as clocksource when it is invariant

2023-02-13 Thread Juergen Gross

On 16.12.22 17:21, Krister Johansen wrote:

Kvm elects to use tsc instead of kvm-clock when it can detect that the
TSC is invariant.

(As of commit 7539b174aef4 ("x86: kvmguest: use TSC clocksource if
invariant TSC is exposed")).

Notable cloud vendors[1] and performance engineers[2] recommend that Xen
users preferentially select tsc over xen-clocksource due the performance
penalty incurred by the latter.  These articles are persuasive and
tailored to specific use cases.  In order to understand the tradeoffs
around this choice more fully, this author had to reference the
documented[3] complexities around the Xen configuration, as well as the
kernel's clocksource selection algorithm.  Many users may not attempt
this to correctly configure the right clock source in their guest.

The approach taken in the kvm-clock module spares users this confusion,
where possible.

Both the Intel SDM[4] and the Xen tsc documentation explain that marking
a tsc as invariant means that it should be considered stable by the OS
and is elibile to be used as a wall clock source.

In order to obtain better out-of-the-box performance, and reduce the
need for user tuning, follow kvm's approach and decrease the xen clock
rating so that tsc is preferable, if it is invariant, stable, and the
tsc will never be emulated.

[1] 
https://aws.amazon.com/premiumsupport/knowledge-center/manage-ec2-linux-clock-source/
[2] https://www.brendangregg.com/blog/2021-09-26/the-speed-of-time.html
[3] https://xenbits.xen.org/docs/unstable/man/xen-tscmode.7.html
[4] Intel 64 and IA-32 Architectures Sofware Developer's Manual Volume
 3b: System Programming Guide, Part 2, Section 17.17.1, Invariant TSC

Signed-off-by: Krister Johansen 
Code-reviewed-by: David Reaver 


Pushed to: xen/tip.git for-linus-6.3


Juergen



OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH v3] drivers/xen/hypervisor: Expose Xen SIF flags to userspace

2023-02-13 Thread Juergen Gross

On 03.01.23 14:02, Per Bilse wrote:

/proc/xen is a legacy pseudo filesystem which predates Xen support
getting merged into Linux.  It has largely been replaced with more
normal locations for data (/sys/hypervisor/ for info, /dev/xen/ for
user devices).  We want to compile xenfs support out of the dom0 kernel.

There is one item which only exists in /proc/xen, namely
/proc/xen/capabilities with "control_d" being the signal of "you're in
the control domain".  This ultimately comes from the SIF flags provided
at VM start.

This patch exposes all SIF flags in /sys/hypervisor/start_flags/ as
boolean files, one for each bit, returning '1' if set, '0' otherwise.
Two known flags, 'privileged' and 'initdomain', are explicitly named,
and all remaining flags can be accessed via generically named files,
as suggested by Andrew Cooper.

Signed-off-by: Per Bilse 


Pushed to xen/tip.got for-linus-6.3


Juergen



OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH][next] xen: Replace one-element array with flexible-array member

2023-02-13 Thread Juergen Gross

On 03.02.23 02:28, Gustavo A. R. Silva wrote:

One-element arrays are deprecated, and we are replacing them with flexible
array members instead. So, replace one-element array with flexible-array
member in struct xen_page_directory.

This helps with the ongoing efforts to tighten the FORTIFY_SOURCE
routines on memcpy() and help us make progress towards globally
enabling -fstrict-flex-arrays=3 [1].

This results in no differences in binary output.

Link: https://github.com/KSPP/linux/issues/79
Link: https://github.com/KSPP/linux/issues/255
Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html [1]
Signed-off-by: Gustavo A. R. Silva 


Reviewed-by: Juergen Gross 


Juergen



OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature