Re: [PATCH] x86: arrange for ENDBR zapping from _ctxt_switch_masking()

2024-02-01 Thread Jan Beulich
On 01.02.2024 22:12, Andrew Cooper wrote:
> On 16/01/2024 4:53 pm, Jan Beulich wrote:
>> While altcall is already used for them, the functions want announcing in
>> .init.rodata.cf_clobber, even if the resulting static variables aren't
>> otherwise used.
>>
>> While doing this also move ctxt_switch_masking to .data.ro_after_init.
>>
>> Signed-off-by: Jan Beulich 
> 
> Reviewed-by: Andrew Cooper 

Thanks.

>> --- a/xen/arch/x86/cpu/amd.c
>> +++ b/xen/arch/x86/cpu/amd.c
>> @@ -258,6 +258,11 @@ static void cf_check amd_ctxt_switch_mas
>>  #undef LAZY
>>  }
>>  
>> +#ifdef CONFIG_XEN_IBT /* Announce the function to ENDBR clobbering logic. */
>> +static const typeof(ctxt_switch_masking) __initconst_cf_clobber __used csm =
>> +amd_ctxt_switch_masking;
>> +#endif
> 
> If we gain more of these, I suspect we'll want a wrapper for it.
> 
> Irritatingly you can't pass parameters into global asm, because the nice
> way to do this would be an _ASM_PTR in a pushsection.

While I'm not convinced resorting to asm() here would indeed be a good thing,
for very many years I've been carrying a gcc change to permit exactly this.
I don't even recall anymore why it wasn't liked upstream.

Jan



Ping: [PATCH] x86/guest: finish conversion to altcall

2024-02-01 Thread Jan Beulich
On 17.01.2024 10:31, Jan Beulich wrote:
> While .setup() and .e820_fixup() don't need fiddling with for being run
> only very early, both .ap_setup() and .resume() want converting too:
> This way both pre-filled struct hypervisor_ops instances can become
> __initconst_cf_clobber, thus allowing to eliminate up to 5 more ENDBR
> (configuration dependent) during the 2nd phase of alternatives patching.
> 
> While fiddling with section annotations here, also move "ops" itself to
> .data.ro_after_init.
> 
> Signed-off-by: Jan Beulich 

May I ask for an ack (or otherwise)?

Thanks, Jan

> --- a/xen/arch/x86/guest/hyperv/hyperv.c
> +++ b/xen/arch/x86/guest/hyperv/hyperv.c
> @@ -207,7 +207,7 @@ static int cf_check flush_tlb(
>  return hyperv_flush_tlb(mask, va, flags);
>  }
>  
> -static const struct hypervisor_ops __initconstrel ops = {
> +static const struct hypervisor_ops __initconst_cf_clobber ops = {
>  .name = "Hyper-V",
>  .setup = setup,
>  .ap_setup = ap_setup,
> --- a/xen/arch/x86/guest/hypervisor.c
> +++ b/xen/arch/x86/guest/hypervisor.c
> @@ -13,7 +13,7 @@
>  #include 
>  #include 
>  
> -static struct hypervisor_ops __read_mostly ops;
> +static struct hypervisor_ops __ro_after_init ops;
>  
>  const char *__init hypervisor_probe(void)
>  {
> @@ -49,7 +49,7 @@ void __init hypervisor_setup(void)
>  int hypervisor_ap_setup(void)
>  {
>  if ( ops.ap_setup )
> -return ops.ap_setup();
> +return alternative_call(ops.ap_setup);
>  
>  return 0;
>  }
> @@ -57,7 +57,7 @@ int hypervisor_ap_setup(void)
>  void hypervisor_resume(void)
>  {
>  if ( ops.resume )
> -ops.resume();
> +alternative_vcall(ops.resume);
>  }
>  
>  void __init hypervisor_e820_fixup(void)
> --- a/xen/arch/x86/guest/xen/xen.c
> +++ b/xen/arch/x86/guest/xen/xen.c
> @@ -318,7 +318,7 @@ static int cf_check flush_tlb(
>  return xen_hypercall_hvm_op(HVMOP_flush_tlbs, NULL);
>  }
>  
> -static const struct hypervisor_ops __initconstrel ops = {
> +static const struct hypervisor_ops __initconst_cf_clobber ops = {
>  .name = "Xen",
>  .setup = setup,
>  .ap_setup = ap_setup,




Re: [PATCH] xen: Swap order of actions in the FREE*() macros

2024-02-01 Thread Jan Beulich
On 02.02.2024 01:39, Andrew Cooper wrote:
> Wherever possible, it is a good idea to NULL out the visible reference to an
> object prior to freeing it.  The FREE*() macros already collect together both
> parts, making it easy to adjust.
> 
> This has a marginal code generation improvement, as some of the calls to the
> free() function can be tailcall optimised.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper 

Acked-by: Jan Beulich 

However, ...

> --- a/xen/include/xen/mm.h
> +++ b/xen/include/xen/mm.h
> @@ -92,8 +92,9 @@ bool scrub_free_pages(void);
>  
>  /* Free an allocation, and zero the pointer to it. */
>  #define FREE_XENHEAP_PAGES(p, o) do { \
> -free_xenheap_pages(p, o); \
> +void *_ptr_ = (p);\

... why a trailing _and_ a leading underscore? Sooner or later we'll
need to get rid of the leading ones anyway aiui (for Misra), here
and elsewhere. With it omitted right away (also below):
Reviewed-by: Jan Beulich 

Jan



[linux-6.1 baseline test] 184549: tolerable FAIL

2024-02-01 Thread osstest service owner
"Old" tested version had not actually been tested; therefore in this
flight we test it, rather than a new candidate.  The baseline, if
any, is the most recent actually tested revision.

flight 184549 linux-6.1 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/184549/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl   8 xen-bootfail baseline untested
 test-armhf-armhf-xl-arndale   8 xen-bootfail baseline untested
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop  fail baseline untested
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop  fail baseline untested
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail baseline 
untested
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail baseline 
untested
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop  fail baseline untested
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop  fail baseline untested
 test-armhf-armhf-libvirt   16 saverestore-support-check fail baseline untested
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail baseline 
untested
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-amd64-amd64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass

version targeted for testing:
 linux8fd7f44624538675abadc73f5a44e95016964d22
baseline version:
 linux8fd7f44624538675abadc73f5a44e95016964d22

Last test of basis   184549  2024-02-01 11:13:03 Z0 days
Testing same since  (not found) 0 attempts

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt 

[ovmf test] 184561: all pass - PUSHED

2024-02-01 Thread osstest service owner
flight 184561 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/184561/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 927ea1364d047d54cb67e0f231644f295c032944
baseline version:
 ovmf 3656352675bf66c06b65bf85632996d3471073ae

Last test of basis   184558  2024-02-01 22:41:22 Z0 days
Testing same since   184561  2024-02-02 02:55:40 Z0 days1 attempts


People who touched revisions under test:
  Dongyan Qian 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   3656352675..927ea1364d  927ea1364d047d54cb67e0f231644f295c032944 -> 
xen-tested-master



Re: [PATCH v12 11/15] vpci: add initial support for virtual PCI bus topology

2024-02-01 Thread Stewart Hildebrand
On 1/25/24 11:00, Jan Beulich wrote:
> On 09.01.2024 22:51, Stewart Hildebrand wrote:
>> --- a/xen/drivers/Kconfig
>> +++ b/xen/drivers/Kconfig
>> @@ -15,4 +15,8 @@ source "drivers/video/Kconfig"
>>  config HAS_VPCI
>>  bool
>>  
>> +config HAS_VPCI_GUEST_SUPPORT
>> +bool
>> +depends on HAS_VPCI
> 
> Wouldn't this better be "select", or even just "imply"?

I prefer "select"

> 
>> --- a/xen/drivers/vpci/vpci.c
>> +++ b/xen/drivers/vpci/vpci.c
>> @@ -40,6 +40,49 @@ extern vpci_register_init_t *const __start_vpci_array[];
>>  extern vpci_register_init_t *const __end_vpci_array[];
>>  #define NUM_VPCI_INIT (__end_vpci_array - __start_vpci_array)
>>  
>> +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
>> +static int add_virtual_device(struct pci_dev *pdev)
>> +{
>> +struct domain *d = pdev->domain;
>> +unsigned int new_dev_number;
>> +
>> +if ( is_hardware_domain(d) )
>> +return 0;
>> +
>> +ASSERT(rw_is_write_locked(>domain->pci_lock));
>> +
>> +/*
>> + * Each PCI bus supports 32 devices/slots at max or up to 256 when
>> + * there are multi-function ones which are not yet supported.
>> + */
>> +if ( pdev->info.is_extfn && !pdev->info.is_virtfn )
>> +{
>> +gdprintk(XENLOG_ERR, "%pp: only function 0 passthrough supported\n",
>> + >sbdf);
> 
> The message suggests you ought to check pdev->devfn to have the low
> three bits clear. Yet what you check are two booleans.

I'll check pdev->sbdf.fn

> 
> Further doesn't this require the multi-function bit to be emulated
> clear?

I consider this to be future work. The header type register, where the
bit resides, is not yet emulated for domUs. I have a series in the works
for emulating additional registers (including PCI_HEADER_TYPE), but I'm
planning to wait to submit it until after the current series is finished
so as to not delay the current series any further.

> And finally don't you then also need to disallow assignment of
> devices with phantom functions?

No, I don't think so. My understanding is that there is no configuration
space associated with the phantom SBDFs. There's no special handling
required in vPCI per se, because the phantom function RIDs get mapped
in the IOMMU when the device gets assigned. Future work would include
exposing the PCI Express Capability, including device control register
with the phantom function enable bit. I say this having only done
limited testing of phantom functions on ARM, and by faking it using the
pci-phantom= Xen arg because I don't have a real device with phantom
function capability.

> 
>> --- a/xen/include/xen/sched.h
>> +++ b/xen/include/xen/sched.h
>> @@ -484,6 +484,14 @@ struct domain
>>   * 2. pdev->vpci->lock
>>   */
>>  rwlock_t pci_lock;
>> +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
>> +/*
>> + * The bitmap which shows which device numbers are already used by the
>> + * virtual PCI bus topology and is used to assign a unique SBDF to the
>> + * next passed through virtual PCI device.
>> + */
>> +DECLARE_BITMAP(vpci_dev_assigned_map, VPCI_MAX_VIRT_DEV);
>> +#endif
>>  #endif
> 
> With this the 2nd #endif would likely better gain a comment.

I will add it. Actually, I see no harm in adding a comment for both of
these #endifs.

> 
>> --- a/xen/include/xen/vpci.h
>> +++ b/xen/include/xen/vpci.h
>> @@ -21,6 +21,13 @@ typedef int vpci_register_init_t(struct pci_dev *dev);
>>  
>>  #define VPCI_ECAM_BDF(addr) (((addr) & 0x0000) >> 12)
>>  
>> +/*
>> + * Maximum number of devices supported by the virtual bus topology:
>> + * each PCI bus supports 32 devices/slots at max or up to 256 when
>> + * there are multi-function ones which are not yet supported.
>> + */
>> +#define VPCI_MAX_VIRT_DEV   (PCI_SLOT(~0) + 1)
> 
> The limit being this means only bus 0 / seg 0 is supported, which I
> think the comment would better also say. (In add_virtual_device(),
> which has a similar comment, there's then at least a 2nd one saying
> so.)

OK, I'll adjust the comment.



[ovmf test] 184558: all pass - PUSHED

2024-02-01 Thread osstest service owner
flight 184558 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/184558/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 3656352675bf66c06b65bf85632996d3471073ae
baseline version:
 ovmf 97c3f5b8d27230acfc20f479adea64c348750612

Last test of basis   184542  2024-01-31 13:12:38 Z1 days
Testing same since   184558  2024-02-01 22:41:22 Z0 days1 attempts


People who touched revisions under test:
  MarsX Lin 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   97c3f5b8d2..3656352675  3656352675bf66c06b65bf85632996d3471073ae -> 
xen-tested-master



[xen-unstable-smoke test] 184560: tolerable all pass - PUSHED

2024-02-01 Thread osstest service owner
flight 184560 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/184560/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  3f819af8a796c0e2f798dd301ec8c3f8cccbc9fc
baseline version:
 xen  4c2d50d586717533f6bee4c6340be0432d0a76d0

Last test of basis   184555  2024-02-01 19:00:28 Z0 days
Testing same since   184560  2024-02-01 23:00:27 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Anthony PERARD 
  Christian Lindig 
  Edwin Török 
  Jan Beulich 
  Roger Pau Monné 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   4c2d50d586..3f819af8a7  3f819af8a796c0e2f798dd301ec8c3f8cccbc9fc -> smoke



[PATCH] xen: Swap order of actions in the FREE*() macros

2024-02-01 Thread Andrew Cooper
Wherever possible, it is a good idea to NULL out the visible reference to an
object prior to freeing it.  The FREE*() macros already collect together both
parts, making it easy to adjust.

This has a marginal code generation improvement, as some of the calls to the
free() function can be tailcall optimised.

No functional change.

Signed-off-by: Andrew Cooper 
---
CC: George Dunlap 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Wei Liu 
CC: Julien Grall 
---
 xen/include/xen/mm.h  | 3 ++-
 xen/include/xen/xmalloc.h | 7 ---
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index 3d9b2d05a5c8..044f3f3b19c8 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -92,8 +92,9 @@ bool scrub_free_pages(void);
 
 /* Free an allocation, and zero the pointer to it. */
 #define FREE_XENHEAP_PAGES(p, o) do { \
-free_xenheap_pages(p, o); \
+void *_ptr_ = (p);\
 (p) = NULL;   \
+free_xenheap_pages(_ptr_, o); \
 } while ( false )
 #define FREE_XENHEAP_PAGE(p) FREE_XENHEAP_PAGES(p, 0)
 
diff --git a/xen/include/xen/xmalloc.h b/xen/include/xen/xmalloc.h
index 9ecddbff5e00..1b88a83be879 100644
--- a/xen/include/xen/xmalloc.h
+++ b/xen/include/xen/xmalloc.h
@@ -66,9 +66,10 @@
 extern void xfree(void *p);
 
 /* Free an allocation, and zero the pointer to it. */
-#define XFREE(p) do { \
-xfree(p); \
-(p) = NULL;   \
+#define XFREE(p) do {   \
+void *_ptr_ = (p);  \
+(p) = NULL; \
+xfree(_ptr_);   \
 } while ( false )
 
 /* Underlying functions */

base-commit: 3f819af8a796c0e2f798dd301ec8c3f8cccbc9fc
-- 
2.30.2




[xen-unstable-smoke test] 184555: tolerable all pass - PUSHED

2024-02-01 Thread osstest service owner
flight 184555 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/184555/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  4c2d50d586717533f6bee4c6340be0432d0a76d0
baseline version:
 xen  bc45f20c01f1711bc56a4bb0955c49c182a5a03a

Last test of basis   184552  2024-02-01 16:00:25 Z0 days
Testing same since   184555  2024-02-01 19:00:28 Z0 days1 attempts


People who touched revisions under test:
  Julien Grall 
  Oleksandr Tyshchenko 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   bc45f20c01..4c2d50d586  4c2d50d586717533f6bee4c6340be0432d0a76d0 -> smoke



[xtf test] 184556: all pass - PUSHED

2024-02-01 Thread osstest service owner
flight 184556 xtf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/184556/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 xtf  d1de27cf787e9c86610b11f6fe13d90f5ca91e27
baseline version:
 xtf  5a513bb7a7f08a0ab9aa23312ab9a34240a24d2c

Last test of basis   184341  2024-01-13 14:42:52 Z   19 days
Testing same since   184556  2024-02-01 19:13:08 Z0 days1 attempts


People who touched revisions under test:
  Frediano Ziglio 

jobs:
 build-amd64-xtf  pass
 build-amd64  pass
 build-amd64-pvopspass
 test-xtf-amd64-amd64-1   pass
 test-xtf-amd64-amd64-2   pass
 test-xtf-amd64-amd64-3   pass
 test-xtf-amd64-amd64-4   pass
 test-xtf-amd64-amd64-5   pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xtf.git
   5a513bb..d1de27c  d1de27cf787e9c86610b11f6fe13d90f5ca91e27 -> 
xen-tested-master



[xen-unstable test] 184547: tolerable FAIL

2024-02-01 Thread osstest service owner
flight 184547 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/184547/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 184535
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 184535
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 184535
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 184535
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 184535
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 184535
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 184535
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 184535
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 184535
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 184535
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 184535
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 184535
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  cc6ba68edf6dcd18c3865e7d7c0f1ed822796426
baseline version:
 xen  cc6ba68edf6dcd18c3865e7d7c0f1ed822796426

Last test of basis   184547  2024-02-01 03:35:56 Z0 days
Testing same since  (not found) 0 attempts

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 

[PATCH v2 2/3] libxl: Allow Phy backend for CDROM devices

2024-02-01 Thread Jason Andryuk
A Linux HVM domain ignores PV block devices with type cdrom.  The
Windows PV drivers also ignore device-type != "disk".  Therefore QEMU's
emulated CD-ROM support is used.  This allows ejection and other CD-ROM
features to work.

With a stubdom, QEMU is running in the stubdom.  A PV disk is still
connected into the stubdom, and then QEMU can emulate the CD-ROM into
the guest.  Phy support has been enhanced to provide a placeholder file
forempty disks, so it is usable as a CDROM backend as well.  Allow Phy
to pass the check as well.

(Bypassing just for a linux-based stubdom doesn't work because
libxl__device_disk_setdefault() gets called early in domain creation
before xenstore is populated with relevant information for the stubdom
type.  The build information isn't readily available and won't exist in
some call trees, so it isn't usable either.)

Let disk_try_backend() allow format empty for Phy cdrom drives.

Signed-off-by: Jason Andryuk 
---
v2:
Different approach to pass QDISK requirement check.
---
 tools/libs/light/libxl_device.c | 12 
 tools/libs/light/libxl_disk.c   | 11 +++
 2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/tools/libs/light/libxl_device.c b/tools/libs/light/libxl_device.c
index 09d85928d7..f73c6705d4 100644
--- a/tools/libs/light/libxl_device.c
+++ b/tools/libs/light/libxl_device.c
@@ -301,13 +301,17 @@ static int disk_try_backend(disk_try_backend_args *a,
 
 switch (backend) {
 case LIBXL_DISK_BACKEND_PHY:
-if (a->disk->format != LIBXL_DISK_FORMAT_RAW) {
-goto bad_format;
-}
-
 if (libxl_defbool_val(a->disk->colo_enable))
 goto bad_colo;
 
+if (a->disk->is_cdrom && a->disk->format == LIBXL_DISK_FORMAT_EMPTY) {
+LOG(DEBUG, "Disk vdev=%s is an empty cdrom", a->disk->vdev);
+return backend;
+}
+
+if (a->disk->format != LIBXL_DISK_FORMAT_RAW)
+goto bad_format;
+
 if (a->disk->backend_domid != LIBXL_TOOLSTACK_DOMID) {
 LOG(DEBUG, "Disk vdev=%s, is using a storage driver domain, "
"skipping physical device check", a->disk->vdev);
diff --git a/tools/libs/light/libxl_disk.c b/tools/libs/light/libxl_disk.c
index c48e1de659..09082ffb58 100644
--- a/tools/libs/light/libxl_disk.c
+++ b/tools/libs/light/libxl_disk.c
@@ -188,15 +188,18 @@ static int libxl__device_disk_setdefault(libxl__gc *gc, 
uint32_t domid,
 return ERROR_FAIL;
 }
 
-/* Force Qdisk backend for CDROM devices of guests with a device model. */
+/* Only allow Qdisk or Phy for CDROM devices. */
 if (disk->is_cdrom != 0 &&
 libxl__domain_type(gc, domid) == LIBXL_DOMAIN_TYPE_HVM) {
+if (disk->backend == LIBXL_DISK_BACKEND_UNKNOWN)
+disk->backend = LIBXL_DISK_BACKEND_QDISK;
+
 if (!(disk->backend == LIBXL_DISK_BACKEND_QDISK ||
-  disk->backend == LIBXL_DISK_BACKEND_UNKNOWN)) {
-LOGD(ERROR, domid, "Backend for CD devices on HVM guests must be 
Qdisk");
+  disk->backend == LIBXL_DISK_BACKEND_PHY)) {
+LOGD(ERROR, domid,
+ "Backend for CD devices on HVM guests must be Qdisk or Phy");
 return ERROR_FAIL;
 }
-disk->backend = LIBXL_DISK_BACKEND_QDISK;
 }
 
 if (disk->is_cdrom &&
-- 
2.43.0




[PATCH v2 0/3] libxl: Stubdom cd-rom changing support

2024-02-01 Thread Jason Andryuk
These patches enable cd-rom media changing for an HVM with a linux
stubdom.

v1 didn't support these empty drives.  The code came out of OpenXT which
has a hack - null.iso.  null.iso is an ISO with no contents.  OpenXT
doesn't actually eject the cd-rom, it just inserts null.iso.  This patch
set has QEMU present empty media.

The first patch creates an empty file, /run/xen/empty-cdrom.%u, for Phy
drives that are "empty".  The place holder simplifies things because the
block scripts don't work for an empty params.  Even if the scripts were
modified for that, a stubdom will timeout on startup when the empty
disk/blkback never connects.  The empty file works around these issues.

The second patch allows use of Phy backend drives for a cd-rom.  This
works for non-stubdom HVMs.  Actually special casing stubdoms didn't
work.

The third patch expands the cd-rom changing code to support the stubdom
case.

To change the cd-rom medium, libxl will:
 - QMP eject the medium from QEMU
 - block-detach the old PV disk
 - block-attach the new PV disk
 - QMP change the medium to the new PV disk by fdset-id

xl cd-eject follows the above through connecting the new PV disk,
empty-cdrom.%u.  It skips the QMP media change.  This keeps the xenstore
entries which are needed to identify that a cd-rom drive is present.  If
the xenstore entries were removed on eject, libxl wouldn't find the
device (hdc) for a subsequent cd-insert.

The QMP change insert uses fdset-id STUBDOM_FDSET_CD + $disk - 'a'.
That is, hda -> 'a', so
STUBDOM_FDSET_CD + 'a' - 'a' = STUBDOM_FDSET_CD.
For hdc:
STUBDOM_FDSET_CD + 'c' - 'a' = STUBDOM_FDSET_CD + 2.

The stubdom must internally handle adding /dev/xvdc to the appropriate
fdset inside QEMU.

A script like this:
https://github.com/OpenXT/xenclient-oe/blob/master/recipes-core/initrdscripts/initramfs-stubdomain/qemu-xvdc-add-fd.sh

Can be called by busybox mdev configured like this:
https://github.com/OpenXT/xenclient-oe/blob/master/recipes-core/busybox/files/mdev.conf

(OpenXT mdev as the hotplug helper works, but with a ~Qubes stubdom, I
had to run mdev as a daemon, mdev -d.)

Linux locks the cd-rom by default?  That means the QMP eject commands
fail, but then Linux unlocks.  Re-running a second time works.  Windows
doesn't do that.

There are spurious messages sometimes like:
libxl: error: libxl_qmp.c:1837:qmp_ev_parse_error_messages: Domain 5:Could not 
dup FD for /dev/fdset/8002 flags 0: No such file or directory

libxl doesn't know when the stubdom has setup the fdset.  Since it gets
those errors, it'll retry adding to the fdset.

Jason Andryuk (3):
  libxl: Create empty cdrom file for stubdom
  libxl: Allow Phy backend for CDROM devices
  libxl: Enable stubdom cdrom changing

 docs/misc/stubdom.txt |  16 ++
 tools/libs/light/libxl_device.c   |  17 +-
 tools/libs/light/libxl_disk.c | 345 +++---
 tools/libs/light/libxl_domain.c   |   4 +
 tools/libs/light/libxl_internal.h |   1 +
 5 files changed, 344 insertions(+), 39 deletions(-)

-- 
2.43.0




[PATCH v2 3/3] libxl: Enable stubdom cdrom changing

2024-02-01 Thread Jason Andryuk
To change the cd-rom medium, libxl will:
 - QMP eject the medium from QEMU
 - block-detach the old PV disk
 - block-attach the new PV disk
 - QMP change the medium to the new PV disk by fdset-id

The QMP code is reused, and remove and attach are implemented here.

The stubdom must internally handle adding /dev/xvdc to the appropriate
fdset.  libxl in dom0 doesn't see the result of adding to the fdset as
that is internal to the stubdom, so a delay and retries are added to
around calling cdrom_insert_addfd_cb().

For cd-eject, we still need to attach the empty vbd.  This is necessary
since xenstore is used to determine that hdc exists.  Otherwise after
eject, hdc would be gone and the cd-insert would fail to find the drive
to insert new media.

The ERROR_JSON_CONFIG_EMPTY check in cdrom_insert_inserted() is because
a stubdom don't have a json config.

Signed-off-by: Jason Andryuk 
---
v2:
Only allow for Linux stubdoms (Marek)
Fix cd-eject
Fix errant hard tabs
Move the debug print and into the special case.
---
 docs/misc/stubdom.txt |  16 ++
 tools/libs/light/libxl_disk.c | 298 +++---
 2 files changed, 288 insertions(+), 26 deletions(-)

diff --git a/docs/misc/stubdom.txt b/docs/misc/stubdom.txt
index c717a95d17..1b2380ae8f 100644
--- a/docs/misc/stubdom.txt
+++ b/docs/misc/stubdom.txt
@@ -127,6 +127,22 @@ Limitations:
  - at most 26 emulated disks are supported (more are still available as PV 
disks)
  - graphics output (VNC/SDL/Spice) not supported
 
+CD-ROM changing:
+
+To change the CD-ROM medium, libxl will:
+ - QMP eject the medium from QEMU
+ - block-detach the old PV disk
+ - block-attach the new PV disk
+ - QMP change the medium to the new PV disk by fdset-id
+
+The QMP change insert uses fdset-id STUBDOM_FDSET_CD + $disk - 'a'.
+That is, hda -> 'a', so
+STUBDOM_FDSET_CD + 'a' - 'a' = STUBDOM_FDSET_CD.
+For hdc:
+STUBDOM_FDSET_CD + 'c' - 'a' = STUBDOM_FDSET_CD + 2.
+
+The stubdom must internally handle adding /dev/xvdc to the appropriate
+fdset.
 
PV-GRUB
===
diff --git a/tools/libs/light/libxl_disk.c b/tools/libs/light/libxl_disk.c
index 09082ffb58..6354982c05 100644
--- a/tools/libs/light/libxl_disk.c
+++ b/tools/libs/light/libxl_disk.c
@@ -808,25 +808,46 @@ int libxl_device_disk_getinfo(libxl_ctx *ctx, uint32_t 
domid,
 
 typedef struct {
 libxl__ao *ao;
+libxl__ao_device aodev;
+libxl__ao_device aodev_del;
 libxl_domid domid;
+libxl_domid disk_domid;
 libxl_device_disk *disk;
 libxl_device_disk disk_saved;
 libxl__ev_slowlock qmp_lock;
 int dm_ver;
 libxl__ev_time time;
+libxl__ev_time timeout_retry;
 libxl__ev_qmp qmp;
+int retries;
+int stubdom_fdset;
 } libxl__cdrom_insert_state;
 
 static void cdrom_insert_lock_acquired(libxl__egc *, libxl__ev_slowlock *,
int rc);
 static void cdrom_insert_qmp_connected(libxl__egc *, libxl__ev_qmp *,
const libxl__json_object *, int rc);
+static void cdrom_insert_stubdom_removefd(libxl__egc *egc, libxl__ev_qmp *qmp,
+  const libxl__json_object *response,
+  int rc);
+static void cdrom_insert_stubdom_ejected(libxl__egc *egc, libxl__ev_qmp *,
+ const libxl__json_object *, int rc);
+static void cdrom_insert_stubdom_disk_ejected_aocomplete(libxl__egc *egc,
+ libxl__ao_device 
*aodev);
+static void cdrom_insert_stubdom_disk_ejected(libxl__egc *egc, libxl__ev_qmp *,
+  const libxl__json_object *,
+  int rc);
+static void cdrom_insert_ejected_aodevcb(libxl__egc *egc,
+ libxl__ao_device *aodev);
 static void cdrom_insert_ejected(libxl__egc *egc, libxl__ev_qmp *,
  const libxl__json_object *, int rc);
 static void cdrom_insert_addfd_cb(libxl__egc *egc, libxl__ev_qmp *,
   const libxl__json_object *, int rc);
 static void cdrom_insert_inserted(libxl__egc *egc, libxl__ev_qmp *,
   const libxl__json_object *, int rc);
+static void cdrom_insert_addfd_retry(libxl__egc *egc, libxl__ev_time *ev,
+ const struct timeval *requested_abs,
+ int rc);
 static void cdrom_insert_timout(libxl__egc *egc, libxl__ev_time *ev,
 const struct timeval *requested_abs,
 int rc);
@@ -842,6 +863,7 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, 
libxl_device_disk *disk,
 libxl_device_disk *disks = NULL;
 int rc;
 libxl__cdrom_insert_state *cis;
+libxl_domid stubdomid;
 
 GCNEW(cis);
 cis->ao 

[PATCH v2 1/3] libxl: Create empty file for Phy cdrom

2024-02-01 Thread Jason Andryuk
With a device model stubdom, dom0 exports a PV disk to the stubdom.
Inside the stubdom, QEMU emulates a cdrom to the guest with a
host_device pointing at the PV frontend (/dev/xvdc)

An empty cdrom drive causes problems booting the stubdom.  The PV disk
protocol isn't designed to support no media.  That can be partially
hacked around, but the stubdom kernel waits for all block devices to
transition to Connected.  Since the backend never connects empty media,
stubdom launch times out and it is destroyed.

Empty media and the PV disks not connecting is fine at runtime since the
stubdom keeps running irrespective of the disk state.

Empty media can be worked around my providing an empty file to the
stubdom for the PV disk source.  This works as the disk is exposed as a
zero-size disk.  Dynamically create the empty file as needed and remove
in the stubdom cleanup.

libxl__device_disk_set_backend() needs to allow through these "empty"
disks with a pdev_path.

Fixup the params writing since scripts have trouble with an empty params
field.

This works for non-stubdom HVMs as well.

Signed-off-by: Jason Andryuk 
---
v2:
New to support "empty" cdroms
---
 tools/libs/light/libxl_device.c   |  5 -
 tools/libs/light/libxl_disk.c | 36 +++
 tools/libs/light/libxl_domain.c   |  4 
 tools/libs/light/libxl_internal.h |  1 +
 4 files changed, 41 insertions(+), 5 deletions(-)

diff --git a/tools/libs/light/libxl_device.c b/tools/libs/light/libxl_device.c
index 13da6e0573..09d85928d7 100644
--- a/tools/libs/light/libxl_device.c
+++ b/tools/libs/light/libxl_device.c
@@ -421,7 +421,10 @@ int libxl__device_disk_set_backend(libxl__gc *gc, 
libxl_device_disk *disk) {
 LOG(ERROR, "Disk vdev=%s is empty but not cdrom", disk->vdev);
 return ERROR_INVAL;
 }
-if (disk->pdev_path != NULL && strcmp(disk->pdev_path, "")) {
+if (disk->pdev_path != NULL &&
+(strcmp(disk->pdev_path, "") &&
+ strncmp(disk->pdev_path, LIBXL_STUBDOM_EMPTY_CDROM,
+ strlen(LIBXL_STUBDOM_EMPTY_CDROM {
 LOG(ERROR,
 "Disk vdev=%s is empty but an image has been provided: %s",
 disk->vdev, disk->pdev_path);
diff --git a/tools/libs/light/libxl_disk.c b/tools/libs/light/libxl_disk.c
index ea3623dd6f..c48e1de659 100644
--- a/tools/libs/light/libxl_disk.c
+++ b/tools/libs/light/libxl_disk.c
@@ -199,6 +199,32 @@ static int libxl__device_disk_setdefault(libxl__gc *gc, 
uint32_t domid,
 disk->backend = LIBXL_DISK_BACKEND_QDISK;
 }
 
+if (disk->is_cdrom &&
+disk->format == LIBXL_DISK_FORMAT_EMPTY &&
+disk->backend == LIBXL_DISK_BACKEND_PHY &&
+disk->backend_domid == LIBXL_TOOLSTACK_DOMID) {
+uint32_t target_domid;
+int fd;
+
+if (libxl_is_stubdom(CTX, domid, _domid)) {
+LOGED(DEBUG, domid, "Using target_domid %u", target_domid);
+} else {
+target_domid = domid;
+}
+free(disk->pdev_path);
+disk->pdev_path =
+libxl__sprintf(NOGC, LIBXL_STUBDOM_EMPTY_CDROM ".%u",
+   target_domid);
+fd = creat(disk->pdev_path, 0400);
+if (fd < 0) {
+LOGED(ERROR, domid, "Failed to create empty cdrom \"%s\"",
+  disk->pdev_path);
+return ERROR_FAIL;
+}
+
+close(fd);
+}
+
 rc = libxl__device_disk_set_backend(gc, disk);
 return rc;
 }
@@ -988,7 +1014,7 @@ static void cdrom_insert_ejected(libxl__egc *egc,
 empty = flexarray_make(gc, 4, 1);
 flexarray_append_pair(empty, "type",
   libxl__device_disk_string_of_backend(disk->backend));
-flexarray_append_pair(empty, "params", "");
+flexarray_append_pair(empty, "params", disk->pdev_path ?: "");
 
 for (;;) {
 rc = libxl__xs_transaction_start(gc, );
@@ -1164,13 +1190,15 @@ static void cdrom_insert_inserted(libxl__egc *egc,
 insert = flexarray_make(gc, 4, 1);
 flexarray_append_pair(insert, "type",
   libxl__device_disk_string_of_backend(disk->backend));
-if (disk->format != LIBXL_DISK_FORMAT_EMPTY)
+if (disk->backend == LIBXL_DISK_BACKEND_QDISK &&
+disk->format != LIBXL_DISK_FORMAT_EMPTY) {
 flexarray_append_pair(insert, "params",
 GCSPRINTF("%s:%s",
 libxl__device_disk_string_of_format(disk->format),
 disk->pdev_path));
-else
-flexarray_append_pair(insert, "params", "");
+} else {
+flexarray_append_pair(insert, "params", disk->pdev_path ?: "");
+}
 
 for (;;) {
 rc = libxl__xs_transaction_start(gc, );
diff --git a/tools/libs/light/libxl_domain.c b/tools/libs/light/libxl_domain.c
index 5ee1544d9c..6751fc785f 100644
--- a/tools/libs/light/libxl_domain.c
+++ b/tools/libs/light/libxl_domain.c
@@ -1525,6 +1525,10 @@ static void 

Re: [PATCH] x86: arrange for ENDBR zapping from _ctxt_switch_masking()

2024-02-01 Thread Andrew Cooper
On 16/01/2024 4:53 pm, Jan Beulich wrote:
> While altcall is already used for them, the functions want announcing in
> .init.rodata.cf_clobber, even if the resulting static variables aren't
> otherwise used.
>
> While doing this also move ctxt_switch_masking to .data.ro_after_init.
>
> Signed-off-by: Jan Beulich 

Reviewed-by: Andrew Cooper 

>
> --- a/xen/arch/x86/cpu/amd.c
> +++ b/xen/arch/x86/cpu/amd.c
> @@ -258,6 +258,11 @@ static void cf_check amd_ctxt_switch_mas
>  #undef LAZY
>  }
>  
> +#ifdef CONFIG_XEN_IBT /* Announce the function to ENDBR clobbering logic. */
> +static const typeof(ctxt_switch_masking) __initconst_cf_clobber __used csm =
> +amd_ctxt_switch_masking;
> +#endif

If we gain more of these, I suspect we'll want a wrapper for it.

Irritatingly you can't pass parameters into global asm, because the nice
way to do this would be an _ASM_PTR in a pushsection.

~Andrew



Re: [PATCH] xen/bitmap: Drop unused headers

2024-02-01 Thread Andrew Cooper
On 01/02/2024 10:35 am, Andrew Cooper wrote:
> Nothing in bitmap.h uses lib.h, and there's no point including types.h when we
> need to include bitops.h anyway.
>
> Signed-off-by: Andrew Cooper 
> ---
> CC: George Dunlap 
> CC: Jan Beulich 
> CC: Stefano Stabellini 
> CC: Wei Liu 
> CC: Julien Grall 
> ---
>  xen/include/xen/bitmap.h | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/xen/include/xen/bitmap.h b/xen/include/xen/bitmap.h
> index b9f980e91930..9f749e3913d8 100644
> --- a/xen/include/xen/bitmap.h
> +++ b/xen/include/xen/bitmap.h
> @@ -3,8 +3,6 @@
>  
>  #ifndef __ASSEMBLY__
>  
> -#include 
> -#include 
>  #include 

Turns out this went too far, and breaks PPC.  Other arches look ok.

https://gitlab.com/xen-project/people/andyhhp/xen/-/jobs/6076263594

bitmap.h uses mem{set,cpy}() so needs string.h.  That's a bug in this
patch specifically.

However, cpumask.h transitively picks up IS_ALIGNED() (so needs
macros.h) and ASSERT() which sadly is still in lib.h

I guess moving the mess from bitmap.h to cpumask.h is a (minor) improvement.

But this comes back to the header tangle which prevented moving BUG() in
the first place.  Sadly there's been no reply to my question in the
debugger.h removal, but I'm going to get that committed and then we can
re-evaluate.

~Andrew



Re: [PATCH] Fix some typos in comments

2024-02-01 Thread Andrew Cooper
On 01/02/2024 3:07 pm, Andrew Cooper wrote:
> On 01/02/2024 2:59 pm, Frediano Ziglio wrote:
>> Signed-off-by: Frediano Ziglio 
>> ---
>>  arch/x86/include/arch/processor.h | 2 +-
>>  include/xen/hvm/params.h  | 2 +-
>>  include/xtf/console.h | 2 +-
>>  include/xtf/extable.h | 4 ++--
>>  4 files changed, 5 insertions(+), 5 deletions(-)
> This looks like an XTF patch?

Given that it applies cleanly, that will be a yes.

Reviewed-by: Andrew Cooper  and committed.

Thanks.



Re: [PATCH 5/4] libxl: blktap/tapback support

2024-02-01 Thread Andrew Cooper
On 01/02/2024 6:30 pm, Jason Andryuk wrote:
> Jason Andryuk (4):
>   block-common: Fix same_vm for no targets
>   libxl: Add support for blktap vbd3
>   hotplug: Update block-tap
>   libxl: Support blktap with HVM device model

If libxl is learning how to drive tapdisk, then it's time for:

https://github.com/xenserver/xen.pg/blob/XS-8/patches/xentop-vbd3.patch

to make it's way upstream.  Otherwise, there's an absence of disk info
in xentop.

I previously kept it back because vbd3 was specific to XenServer.

~Andrew



Re: Nullifying Recently Introduced Xen Headers Check

2024-02-01 Thread John L. Poole


On Thursday, February 1st, 2024 at 9:18 AM, David Woodhouse 
 wrote:

> (Thanks Andy for the explicit cc)
> 
> On Thu, 2024-02-01 at 00:05 +, John L. Poole wrote:
> 
> > cause Gentoo's build to error out. See line 24790:
> > 
> > 5 | #error In Xen native files, include xen_native.h before other Xen 
> > headers
> > 
> > at
> > https://salemdata.us/xen/xen_tools_20240128_Sun_174740.script.html.
> > 
> > What I have done is create a patch for a draft Gentoo ebuild which
> > nullifies lines 4-6 by wrapping them in a comment:
> 
> 
> That isn't what the #error told you to do, though.
> 
> 24788 In file included from ../qemu-xen/hw/xen/xen-operations.c:16:
> 24789 
> /var/tmp/portage/app-emulation/xen-tools-4.18.0/work/xen-4.18.0/tools/qemu-xen/include/hw/xen/xen_native.h:5:2:
>  error: #error In Xen native files, include xen_native.h before other Xen 
> headers
> 24790 5 | #error In Xen native files, include xen_native.h before other Xen 
> headers
> 24791 | ^
> 
> So it's hw/xen/xen-operations.h which is failing. As far as I can tell
> (visually and empirically because it does actually build elsewhere), it
> is doing what the #error said — it is including xen_native.h before
> any other Xen headers.
> 
> The first four non-comment lines of xen-operations.c should look
> something like this...
> 
> #include "qemu/osdep.h"
> #include "qemu/uuid.h"
> #include "qapi/error.h"
> 
> #include "hw/xen/xen_native.h"
> 
> So... did you patch it so it doesn't start like that any more? Or does
> one of those first three files (perhaps qemu/osdep.h?) end up bringing
> in the Xen interface headers in a way that I didn't anticipate and
> which doesn't seem to happen elsewhere?
> 
> I didn't cite the full gcc command line from line 24787 of your log
> because it's huge. Can you run a variant of that command to just give
> me the preprocessed output (-E -dD -o xen-operations.i).

Hi David,

To answer your questions: the patch consists only of this
modification of 3 lines in xen_native.h:

   https://921932.bugs.gentoo.org/attachment.cgi?id=883883

Otherwise, the 4.18 code tree is as released.

I'm without the know-how to run a variant of the emerge command
as you have requested, alas.  I just simply removed 3 lines that
were introduced into the 4.18 tree to achieve a successful build.
I posted to this list to learn what the negative impacts might be. 

While the Xen Project "make" works, the Gentoo emerge
of app-emulation/xen-tools does not unless the three lines are
removed to simulate prior 4.17.3 and earlier code.  

I suspect the Gentoo approach 
of building tools first contributes to the problem.

Thank you,

John



[PATCH 3/4] hotplug: Update block-tap

2024-02-01 Thread Jason Andryuk
Implement a sharing check like the regular block script.

Checking tapback inside block-tap is too late since it needs to be
running to transition the backend to InitWait before block-tap is run.

tap-ctl check will be removed when the requirement for the blktap kernel
driver is removed.  Remove it now as it is of limited use.

find_device() needs to be non-fatal allow a sharing check.

Only write physical-device-path because that is all that tapback needs.
Also write_dev doesn't handled files and would incorrectly store
physical-device as 0:0 which would confuse the minor inside tapback
---
 tools/hotplug/Linux/block-tap | 162 --
 1 file changed, 153 insertions(+), 9 deletions(-)

diff --git a/tools/hotplug/Linux/block-tap b/tools/hotplug/Linux/block-tap
index 89247921b9..5eca09f0f6 100755
--- a/tools/hotplug/Linux/block-tap
+++ b/tools/hotplug/Linux/block-tap
@@ -37,10 +37,6 @@ check_tools()
 if ! command -v tap-ctl > /dev/null 2>&1; then
 fatal "Unable to find tap-ctl tool"
 fi
-modprobe blktap
-if ! tap-ctl check >& /dev/null ; then
-   fatal "Blocktap kernel module not available"
-fi
 }
 
 # Sets the following global variables based on the params field passed in as
@@ -81,7 +77,105 @@ find_device()
 done
 
 if [ -z "$pid" ] || [ -z "$minor" ]; then
-fatal "cannot find required parameters"
+return 1
+fi
+
+return 0
+}
+
+count_using()
+{
+  local file="$1"
+  local f
+
+  local i=0
+  local base_path="$XENBUS_BASE_PATH/$XENBUS_TYPE"
+  for dom in $(xenstore-list "$base_path")
+  do
+for dev in $(xenstore-list "$base_path/$dom")
+do
+  f=$(xenstore_read_default "$base_path/$dom/$dev/params" "")
+  f=$(echo "$f" | cut -d ":" -f 2)
+
+  if [ -n "$f" ] && [ "$file" = $f ] ; then
+  i=$(( i + 1 ))
+  fi
+done
+  done
+
+  echo "$i"
+}
+
+# tap_shared is used to determine if a shared tap can be closed
+# Since a stubdom and a guest both use the same tap, it can only
+# be freed when there is a single one left.
+tap_shared() {
+[ $( count_using "$file" ) -gt 1 ]
+}
+
+check_tap_sharing()
+{
+  local file="$1"
+  local mode="$2"
+  local dev
+
+  local base_path="$XENBUS_BASE_PATH/$XENBUS_TYPE"
+  for dom in $(xenstore-list "$base_path") ; do
+for dev in $(xenstore-list "$base_path/$dom") ; do
+  f=$(xenstore_read_default "$base_path/$dom/$dev/params" "")
+  f=$(echo "$f" | cut -d ":" -f 2)
+
+  if [ -n "$f" ] && [ "$file" = "$f" ] ; then
+if [ "$mode" = 'w' ] ; then
+  if ! same_vm $dom ; then
+echo "guest $f"
+return
+  fi
+else
+  local m=$(xenstore_read_default "$base_path/$dom/$dev/mode" "")
+  m=$(canonicalise_mode "$m")
+
+  if [ "$m" = 'w' ] ; then
+if ! same_vm $dom ; then
+  echo "guest $f"
+  return
+fi
+  fi
+fi
+  fi
+done
+  done
+
+  echo 'ok'
+}
+
+tap_create()
+{
+if ! minor=$( tap-ctl allocate ) ; then
+fatal "Could not allocate minor"
+fi
+
+# Handle with or without kernel blktap
+minor=${minor#/run/blktap-control/tapdisk/tapdisk-}
+minor=${minor#/dev/xen/blktap-2/tapdev}
+
+# tap-ctl is spawning tapdisk which would hold the _lockfd open.
+# Ensure it is closed before running tap-ctl spawn, which needs to be
+# done in a subshell to continue holding the lock in the parent.
+if ! pid=$( ( eval "exec $_lockfd>&-" ; tap-ctl spawn ) ) ; then
+tap-ctl free -m "$minor"
+fatal "Could not spawn tapdisk for $minor"
+fi
+
+if ! tap-ctl attach -p "$pid" -m "$minor" ; then
+tap-ctl free -m "$minor"
+fatal "Could not attach $pid and $minor"
+fi
+
+if ! tap-ctl open -p "$pid" -m "$minor" -a "$target" ; then
+tap-ctl detach -p "$pid" -m "$minor"
+tap-ctl free -m "$minor"
+fatal "Could not open \"$target\""
 fi
 }
 
@@ -89,15 +183,57 @@ find_device()
 # the device
 add()
 {
-dev=$(tap-ctl create -a $target)
-write_dev $dev
+local minor
+local pid
+local res
+
+claim_lock "block"
+
+if find_device; then
+result=$( check_tap_sharing "$file" "$mode" )
+if [ "$result" != "ok" ] ; then
+do_ebusy "tap $type file $file in use " "$mode" "${result%% *}"
+fi
+else
+tap_create
+fi
+
+# Create nbd unix path
+dev=$( printf "/run/blktap-control/nbd%ld.%d" "$pid" "$minor" )
+
+xenstore_write "$XENBUS_PATH/pid" "$pid"
+xenstore_write "$XENBUS_PATH/minor" "$minor"
+# dev, as a unix socket, would end up with major:minor 0:0 in
+# physical-device if write_dev were used.  tapback would be thrown off by
+# that incorrect minor, so just skip writing physical-device.
+xenstore_write "$XENBUS_PATH/physical-device-path" "$dev"
+
+success
+
+release_lock "block"
 }
 
 # Disconnects the device
 

[PATCH 1/4] block-common: Fix same_vm for no targets

2024-02-01 Thread Jason Andryuk
same_vm is broken when the two main domains do not have targets.  otvm
and targetvm are both missing, which means they get set to -1 and then
converted to empty strings:

++10697+ local targetvm=-1
++10697+ local otvm=-1
++10697+ otvm=
++10697+ othervm=/vm/cc97bc2f-3a91-43f7-8fbc-4cb92f90b4e4
++10697+ targetvm=
++10697+ local frontend_uuid=/vm/844dea4e-44f8-4e3e-8145-325132a31ca5

The final comparison returns true since the two empty strings match:

++10697+ '[' /vm/844dea4e-44f8-4e3e-8145-325132a31ca5 = 
/vm/cc97bc2f-3a91-43f7-8fbc-4cb92f90b4e4 -o '' = 
/vm/cc97bc2f-3a91-43f7-8fbc-4cb92f90b4e4 -o 
/vm/844dea4e-44f8-4e3e-8145-325132a31ca5 = '' -o '' = '' ']'

Replace -1 with distinct strings indicating the lack of a value and
remove the collescing to empty stings.  The strings themselves will no
longer match, and that is correct.

++12364+ '[' /vm/844dea4e-44f8-4e3e-8145-325132a31ca5 = 
/vm/cc97bc2f-3a91-43f7-8fbc-4cb92f90b4e4 -o 'No target' = 
/vm/cc97bc2f-3a91-43f7-8fbc-4cb92f90b4e4 -o 
/vm/844dea4e-44f8-4e3e-8145-325132a31ca5 = 'No other target' -o 'No target' = 
'No other target' ']'

Signed-off-by: Jason Andryuk 
---
 tools/hotplug/Linux/block-common.sh | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/tools/hotplug/Linux/block-common.sh 
b/tools/hotplug/Linux/block-common.sh
index f86a88c4eb..5c80237d99 100644
--- a/tools/hotplug/Linux/block-common.sh
+++ b/tools/hotplug/Linux/block-common.sh
@@ -112,14 +112,12 @@ same_vm()
   "$FRONTEND_UUID")
   local target=$(xenstore_read_default  "/local/domain/$FRONTEND_ID/target"   \
  "-1")
-  local targetvm=$(xenstore_read_default "/local/domain/$target/vm" "-1")
+  local targetvm=$(xenstore_read_default "/local/domain/$target/vm" "No 
Target")
   local otarget=$(xenstore_read_default  "/local/domain/$otherdom/target"   \
  "-1")
   local otvm=$(xenstore_read_default  "/local/domain/$otarget/vm"   \
- "-1")
-  otvm=${otvm%-1}
-  othervm=${othervm%-1}
-  targetvm=${targetvm%-1}
+ "No Other Target")
+
   local frontend_uuid=${FRONTEND_UUID%-1}
   
   [ "$frontend_uuid" = "$othervm" -o "$targetvm" = "$othervm" -o \
-- 
2.43.0




[PATCH 4/4] libxl: Support blktap with HVM device model

2024-02-01 Thread Jason Andryuk
blktap exposes disks over UNIX socket Network Block Device (NBD).
Modify libxl__device_disk_find_local_path() to provide back the
QEMU-formatted NBD path.  This allows tapdisk to be used for booting an
HVM.

Use the nbd+unix:/// format specified by the protocol at
https://github.com/NetworkBlockDevice/nbd/blob/master/doc/uri.md

Signed-off-by: Jason Andryuk 
---
 tools/libs/light/libxl_disk.c | 17 +
 tools/libs/light/libxl_dm.c   |  1 -
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/tools/libs/light/libxl_disk.c b/tools/libs/light/libxl_disk.c
index 59ff996837..b65cad33cc 100644
--- a/tools/libs/light/libxl_disk.c
+++ b/tools/libs/light/libxl_disk.c
@@ -1317,7 +1317,8 @@ char *libxl__device_disk_find_local_path(libxl__gc *gc,
  * If the format isn't raw and / or we're using a script, then see
  * if the script has written a path to the "cooked" node
  */
-if (disk->script && guest_domid != INVALID_DOMID) {
+if ((disk->script && guest_domid != INVALID_DOMID) ||
+disk->backend == LIBXL_DISK_BACKEND_TAP) {
 libxl__device device;
 char *be_path, *pdpath;
 int rc;
@@ -1337,10 +1338,18 @@ char *libxl__device_disk_find_local_path(libxl__gc *gc,
 LOGD(DEBUG, guest_domid, "Attempting to read node %s", pdpath);
 path = libxl__xs_read(gc, XBT_NULL, pdpath);
 
-if (path)
+if (path) {
 LOGD(DEBUG, guest_domid, "Accessing cooked block device %s", path);
-else
-LOGD(DEBUG, guest_domid, "No physical-device-path, can't access 
locally.");
+
+/* tapdisk exposes disks locally over UNIX socket NBD. */
+if (disk->backend == LIBXL_DISK_BACKEND_TAP) {
+path = libxl__sprintf(gc, "nbd+unix:///?socket=%s", path);
+LOGD(DEBUG, guest_domid,
+ "Directly accessing local TAP target %s", path);
+}
+} else
+LOGD(DEBUG, guest_domid,
+"No physical-device-path, can't access locally.");
 
 goto out;
 }
diff --git a/tools/libs/light/libxl_dm.c b/tools/libs/light/libxl_dm.c
index 14b593110f..f7c796011d 100644
--- a/tools/libs/light/libxl_dm.c
+++ b/tools/libs/light/libxl_dm.c
@@ -1915,7 +1915,6 @@ static int libxl__build_device_model_args_new(libxl__gc 
*gc,
 continue;
 }
 
-assert(disks[i].backend != LIBXL_DISK_BACKEND_TAP);
 target_path = libxl__device_disk_find_local_path(gc,
 guest_domid, [i], true);
 
-- 
2.43.0




[PATCH 2/4] libxl: Add support for blktap vbd3

2024-02-01 Thread Jason Andryuk
This patch re-introduces blktap support to libxl.  Unlike earlier
versions, it does not link against any blktap library.  libxl changes
are needed to write to the vbd3 backend XenStore nodes.

blktap has three components.  tapdisk is a daemon implementing the disk
IO, NBD (Network Block Device), and Xen PV interfaces.  tap-ctl is a
tool to control tapdisks - creating, starting, stopping and freeing.
tapback manages the XenStore operations and instructs tapdisk to
connect.

It is notable that tapdisk performs the grant and event channel ops, but
doesn't interact with XenStore.  tapback performs XenStore operations
and notifies tapdisks of values and changes.

The flow is: libxl writes to the "vbd3" XenStore nodes and runs the
block-tap script.  The block-tap script runs tap-ctl to create a tapdisk
instance as the physical device.  tapback then sees the tapdisk and
instructs the tapdisk to connect up the PV blkif interface.

This is expected to work without the kernel blktap driver, so the
block-tap script is modified accordingly to write the UNIX NBD path.

backendtype=tap was not fully removed previously, but it would never
succeed since it would hit the hardcoded error in disk_try_backend().
It is reused now.

An example command to attach a vhd:
xl block-attach vm 'vdev=xvdf,backendtype=tap,format=vhd,target=/srv/target.vhd'

Format raw also works to run an "aio:" tapdisk.

Signed-off-by: Jason Andryuk 
---
VHD support is important for OpenXT since there are lots of existing
VHDs which still need supporting.  tapdisk also supports encrypting VHDs
which is not available in QEMU.
---
 docs/man/xl-disk-configuration.5.pod.in   |  4 +++-
 tools/libs/light/libxl_device.c   | 14 ++
 tools/libs/light/libxl_disk.c | 20 +++-
 tools/libs/light/libxl_linux.c|  1 +
 tools/libs/light/libxl_types_internal.idl |  1 +
 tools/libs/light/libxl_utils.c|  2 ++
 6 files changed, 32 insertions(+), 10 deletions(-)

diff --git a/docs/man/xl-disk-configuration.5.pod.in 
b/docs/man/xl-disk-configuration.5.pod.in
index bc945cc517..cb442bd5b4 100644
--- a/docs/man/xl-disk-configuration.5.pod.in
+++ b/docs/man/xl-disk-configuration.5.pod.in
@@ -232,7 +232,7 @@ Specifies the backend implementation to use
 
 =item Supported values
 
-phy, qdisk, standalone
+phy, qdisk, standalone, tap
 
 =item Mandatory
 
@@ -254,6 +254,8 @@ and "standalone" does not support specifications other than 
"virtio".
 Normally this option should not be specified, in which case libxl will
 automatically determine the most suitable backend.
 
+"tap" needs blktap's tapback to be running.
+
 
 =item 

[PATCH 0/4] libxl: blktap/tapback support

2024-02-01 Thread Jason Andryuk
This patch re-introduces blktap support to libxl.  Unlike earlier
versions, it does not link against any blktap library.  libxl changes
are needed to write to the vbd3 backend XenStore nodes.

blktap has three components.  tapdisk is a daemon implementing the disk
IO, NBD (Network Block Device), and Xen PV interfaces.  tap-ctl is a
tool to control tapdisks - creating, starting, stopping and freeing.
tapback manages the XenStore operations and instructs tapdisk to
connect.

It is notable that tapdisk performs the grant and event channel ops, but
doesn't interact with XenStore.  tapback performs XenStore operations
and notifies tapdisks of values and changes.

The flow is: libxl writes to the "vbd3" XenStore nodes and runs the
block-tap script.  The block-tap script runs tap-ctl to create a tapdisk
instance as the physical device.  tapback then sees the tapdisk and
instructs the tapdisk to connect up the PV blkif interface.

This is expected to work without the kernel blktap driver, so the
block-tap script is modified accordingly to write the UNIX NBD path.
(It works with the kernel blktap driver as well - upstream blktap hasn't
removed the blktap driver requirement yet -
https://github.com/xapi-project/blktap/pull/364)

An example command to attach a vhd:
xl block-attach vm 'vdev=xvdf,backendtype=tap,format=vhd,target=/srv/target.vhd'

VHD support is important for OpenXT since there are lots of existing
VHDs which still need supporting.  tapdisk also supports encrypted VHDs
which is not available in QEMU.

blktap's tapback needs minimal changes to work with libxl:
https://github.com/xapi-project/blktap/pull/394

Jason Andryuk (4):
  block-common: Fix same_vm for no targets
  libxl: Add support for blktap vbd3
  hotplug: Update block-tap
  libxl: Support blktap with HVM device model

 docs/man/xl-disk-configuration.5.pod.in   |   4 +-
 tools/hotplug/Linux/block-common.sh   |   8 +-
 tools/hotplug/Linux/block-tap | 162 --
 tools/libs/light/libxl_device.c   |  14 +-
 tools/libs/light/libxl_disk.c |  37 +++--
 tools/libs/light/libxl_dm.c   |   1 -
 tools/libs/light/libxl_linux.c|   1 +
 tools/libs/light/libxl_types_internal.idl |   1 +
 tools/libs/light/libxl_utils.c|   2 +
 9 files changed, 201 insertions(+), 29 deletions(-)

-- 
2.43.0




[xen-unstable-smoke test] 184552: tolerable all pass - PUSHED

2024-02-01 Thread osstest service owner
flight 184552 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/184552/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  bc45f20c01f1711bc56a4bb0955c49c182a5a03a
baseline version:
 xen  cc6ba68edf6dcd18c3865e7d7c0f1ed822796426

Last test of basis   184528  2024-01-30 14:03:48 Z2 days
Testing same since   184552  2024-02-01 16:00:25 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Carlo Nonato 
  Jan Beulich 
  Jason Andryuk 
  Roger Pau Monné 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   cc6ba68edf..bc45f20c01  bc45f20c01f1711bc56a4bb0955c49c182a5a03a -> smoke



Re: [PATCH] xen/arm: Properly clean update to init_ttbr and smp_up_cpu

2024-02-01 Thread Julien Grall

Hi,

On 31/01/2024 07:57, Bertrand Marquis wrote:

On 30 Jan 2024, at 18:29, Julien Grall  wrote:

From: Julien Grall 

Recent rework to the secondary boot code modified how init_ttbr and
smp_up_cpu are accessed. Rather than directly accessing them, we
are using a pointer to them.

The helper clean_dcache() is expected to take the variable in parameter
and then clean its content. As we now pass a pointer to the variable,
we will clean the area storing the address rather than the content itself.

Switch to use clean_dcache_va_range() to avoid casting the pointer.

Fixes: a5ed59e62c6f ("arm/mmu: Move init_ttbr to a new section .data.idmap")
Fixes: 9a5114074b04 ("arm/smpboot: Move smp_up_cpu to a new section .data.idmap)

Reported-by: Oleksandr Tyshchenko 
Signed-off-by: Julien Grall 


Reviewed-by: Bertrand Marquis 


Committed. Thanks.

Cheers,

--
Julien Grall



Re: Nullifying Recently Introduced Xen Headers Check

2024-02-01 Thread David Woodhouse
(Thanks Andy for the explicit cc)

On Thu, 2024-02-01 at 00:05 +, John L. Poole wrote:
> 
> cause Gentoo's build to error out.  See line 24790:
> 
>     5 | #error In Xen native files, include xen_native.h before other Xen 
> headers
> 
> at
> https://salemdata.us/xen/xen_tools_20240128_Sun_174740.script.html.
> 
> What I have done is create a patch for a draft Gentoo ebuild which
> nullifies lines 4-6 by wrapping them in a comment:

That isn't what the #error told you to do, though.

 24788  In file included from ../qemu-xen/hw/xen/xen-operations.c:16:
 24789  
/var/tmp/portage/app-emulation/xen-tools-4.18.0/work/xen-4.18.0/tools/qemu-xen/include/hw/xen/xen_native.h:5:2:
 error: #error In Xen native files, include xen_native.h before other Xen 
headers
 24790  5 | #error In Xen native files, include xen_native.h before other 
Xen headers
 24791|  ^

So it's hw/xen/xen-operations.h which is failing. As far as I can tell
(visually and empirically because it does actually build elsewhere), it
*is* doing what the #error said — it *is* including xen_native.h before
any other Xen headers. 

The first four non-comment lines of xen-operations.c should look
something like this...

  #include "qemu/osdep.h"
  #include "qemu/uuid.h"
  #include "qapi/error.h"

  #include "hw/xen/xen_native.h"

So... did you patch it so it doesn't start like that any more? Or does
one of those first three files (perhaps qemu/osdep.h?) end up bringing
in the Xen interface headers in a way that I didn't anticipate and
which doesn't seem to happen elsewhere?

I didn't cite the full gcc command line from line 24787 of your log
because it's huge. Can you run a variant of that command to just give
me the *preprocessed* output (-E -dD -o xen-operations.i).


smime.p7s
Description: S/MIME cryptographic signature


Re: [PATCH v4 3/8] VMX: tertiary execution control infrastructure

2024-02-01 Thread Roger Pau Monné
On Thu, Feb 01, 2024 at 01:09:11PM +0100, Jan Beulich wrote:
> On 01.02.2024 12:50, Roger Pau Monné wrote:
> > On Thu, Jan 11, 2024 at 10:00:10AM +0100, Jan Beulich wrote:
> >> @@ -503,6 +538,9 @@ static int vmx_init_vmcs_config(bool bsp
> >>  "Secondary Exec Control",
> >>  vmx_secondary_exec_control, _vmx_secondary_exec_control);
> >>  mismatch |= cap_check(
> >> +"Tertiary Exec Control",
> >> +vmx_tertiary_exec_control, _vmx_tertiary_exec_control);
> > 
> > I know it's done to match the surrounding style, but couldn't you move
> > the name parameter one line up, and then limit the call to two lines?
> > 
> > (I don't think it will compromise readability).
> 
> You mean like this:
> 
> mismatch |= cap_check("Tertiary Exec Control",
> vmx_tertiary_exec_control, _vmx_tertiary_exec_control);
> 
> ? No, I view this as a mix of two possible styles. If the string literal
> was moved up, the other legitimate style would only be
> 
> mismatch |= cap_check("Tertiary Exec Control",
>   vmx_tertiary_exec_control,
>   _vmx_tertiary_exec_control);
> 
> aiui (again extending over 3 lines). Yet none of this is written down
> anywhere.
> 
> But anyway - consistency with surrounding code trumps here, I think.

I was hoping it could still fit on 2 lines, but if you need 3 never
mind then.

> >> @@ -2068,10 +2111,12 @@ void vmcs_dump_vcpu(struct vcpu *v)
> >> vmr(HOST_PERF_GLOBAL_CTRL));
> >>  
> >>  printk("*** Control State ***\n");
> >> -printk("PinBased=%08x CPUBased=%08x SecondaryExec=%08x\n",
> >> +printk("PinBased=%08x CPUBased=%08x\n",
> >> vmr32(PIN_BASED_VM_EXEC_CONTROL),
> >> -   vmr32(CPU_BASED_VM_EXEC_CONTROL),
> >> -   vmr32(SECONDARY_VM_EXEC_CONTROL));
> >> +   vmr32(CPU_BASED_VM_EXEC_CONTROL));
> >> +printk("SecondaryExec=%08x TertiaryExec=%08lx\n",
> > 
> > For consistency, shouldn't TertiaryExec use 016 instead of 08 (as it's
> > a 64bit filed).
> 
> Perhaps, assuming we'll gets bits 32 and populated sooner or later.
> However, I view 16-digit literal numbers as hard to read, so I'd be
> inclined to insert a separator (e.g. an underscore) between the low
> and high halves. Thoughts?

Works for me.

Thanks, Roger.



[PATCH 4/4] iommu/x86: make unity range checking more strict

2024-02-01 Thread Roger Pau Monne
Currently when a unity range overlaps with memory being used as RAM by the
hypervisor the result would be that the IOMMU gets disabled.  However that's
not enough, as even with the IOMMU disabled the device will still access the
affected RAM areas.

Note that IVMD or RMRR ranges being placed over RAM is a firmware bug.

Doing so also allows to simplify the code and use a switch over the reported
memory type(s).

Signed-off-by: Roger Pau Monné 
---
 xen/drivers/passthrough/x86/iommu.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/xen/drivers/passthrough/x86/iommu.c 
b/xen/drivers/passthrough/x86/iommu.c
index 63d4cb898218..9b977f84582f 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -806,10 +806,14 @@ bool __init iommu_unity_region_ok(mfn_t start, mfn_t end)
 
 for ( addr = start; !mfn_eq(addr, end); mfn_add(addr, 1) )
 {
-unsigned int type = page_get_ram_type(addr);
-
-if ( type == RAM_TYPE_UNKNOWN )
+/*
+ * Any page that's at least partially of type RESERVED, UNUSABLE or
+ * ACPI will be considered by Xen of being all of that type, and hence
+ * the problematic pages are those that are fully holes or RAM.
+ */
+switch ( page_get_ram_type(addr) )
 {
+case RAM_TYPE_UNKNOWN:
 if ( e820_add_range(mfn_to_maddr(addr),
 mfn_to_maddr(addr) + PAGE_SIZE, E820_RESERVED) 
)
 continue;
@@ -817,7 +821,10 @@ bool __init iommu_unity_region_ok(mfn_t start, mfn_t end)
"IOMMU: page at %#" PRI_mfn " couldn't be reserved\n",
mfn_x(addr));
 return false;
-}
+
+case RAM_TYPE_CONVENTIONAL:
+panic("IOMMU: page at %#" PRI_mfn " overlaps RAM range\n",
+  mfn_x(addr));
 
 /*
  * Types which aren't RAM are considered good enough.
@@ -825,14 +832,7 @@ bool __init iommu_unity_region_ok(mfn_t start, mfn_t end)
  * force Xen into assuming the whole page as having that type in
  * practice.
  */
-if ( type & (RAM_TYPE_RESERVED | RAM_TYPE_ACPI |
- RAM_TYPE_UNUSABLE) )
-continue;
-
-printk(XENLOG_WARNING
-   "IOMMU: page at %#" PRI_mfn " can't be converted\n",
-   mfn_x(addr));
-return false;
+}
 }
 
 return true;
-- 
2.43.0




[PATCH 2/4] iommu/x86: introduce a generic IVMD/RMRR range validity helper

2024-02-01 Thread Roger Pau Monne
IVMD and RMRR ranges are functionally equivalent, and as so could use the same
validity checker.

Move the IVMD to x86 common IOMMU code and adjust the function to take a pair
of [start, end) mfn parameters.

So far only the AMD-Vi side is adjusted to use the newly introduced helper, the
VT-d side will be adjusted in a further change.

No functional change intended.

Signed-off-by: Roger Pau Monné 
---
 xen/arch/x86/include/asm/iommu.h |  3 ++
 xen/drivers/passthrough/amd/iommu_acpi.c | 38 ++--
 xen/drivers/passthrough/x86/iommu.c  | 46 
 3 files changed, 52 insertions(+), 35 deletions(-)

diff --git a/xen/arch/x86/include/asm/iommu.h b/xen/arch/x86/include/asm/iommu.h
index 15a848ddc329..5c7e37331aad 100644
--- a/xen/arch/x86/include/asm/iommu.h
+++ b/xen/arch/x86/include/asm/iommu.h
@@ -135,6 +135,9 @@ struct page_info *__must_check iommu_alloc_pgtable(struct 
domain_iommu *hd,
uint64_t contig_mask);
 void iommu_queue_free_pgtable(struct domain_iommu *hd, struct page_info *pg);
 
+/* Check [start, end) unity map range for correctness. */
+bool iommu_unity_region_ok(mfn_t start, mfn_t end);
+
 #endif /* !__ARCH_X86_IOMMU_H__ */
 /*
  * Local variables:
diff --git a/xen/drivers/passthrough/amd/iommu_acpi.c 
b/xen/drivers/passthrough/amd/iommu_acpi.c
index ca70f4f3ae2c..40468dbbccf3 100644
--- a/xen/drivers/passthrough/amd/iommu_acpi.c
+++ b/xen/drivers/passthrough/amd/iommu_acpi.c
@@ -405,41 +405,9 @@ static int __init parse_ivmd_block(const struct 
acpi_ivrs_memory *ivmd_block)
 return 0;
 }
 
-if ( !e820_all_mapped(base, limit + PAGE_SIZE, E820_RESERVED) )
-{
-paddr_t addr;
-
-AMD_IOMMU_WARN("IVMD: [%lx,%lx) is not (entirely) in reserved 
memory\n",
-   base, limit + PAGE_SIZE);
-
-for ( addr = base; addr <= limit; addr += PAGE_SIZE )
-{
-unsigned int type = page_get_ram_type(maddr_to_mfn(addr));
-
-if ( type == RAM_TYPE_UNKNOWN )
-{
-if ( e820_add_range(addr, addr + PAGE_SIZE,
-E820_RESERVED) )
-continue;
-AMD_IOMMU_ERROR("IVMD: page at %lx couldn't be reserved\n",
-addr);
-return -EIO;
-}
-
-/*
- * Types which aren't RAM are considered good enough.
- * Note that a page being partially RESERVED, ACPI or UNUSABLE will
- * force Xen into assuming the whole page as having that type in
- * practice.
- */
-if ( type & (RAM_TYPE_RESERVED | RAM_TYPE_ACPI |
- RAM_TYPE_UNUSABLE) )
-continue;
-
-AMD_IOMMU_ERROR("IVMD: page at %lx can't be converted\n", addr);
-return -EIO;
-}
-}
+if ( !iommu_unity_region_ok(maddr_to_mfn(base),
+maddr_to_mfn(limit + PAGE_SIZE)) )
+return -EIO;
 
 if ( ivmd_block->header.flags & ACPI_IVMD_EXCLUSION_RANGE )
 exclusion = true;
diff --git a/xen/drivers/passthrough/x86/iommu.c 
b/xen/drivers/passthrough/x86/iommu.c
index c90755ff58fa..63d4cb898218 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -792,6 +792,52 @@ static int __init cf_check adjust_irq_affinities(void)
 }
 __initcall(adjust_irq_affinities);
 
+bool __init iommu_unity_region_ok(mfn_t start, mfn_t end)
+{
+mfn_t addr;
+
+if ( e820_all_mapped(mfn_to_maddr(start), mfn_to_maddr(end),
+ E820_RESERVED) )
+return true;
+
+printk(XENLOG_WARNING "IOMMU: [%#" PRI_mfn " ,%#" PRI_mfn
+   ") is not (entirely) in reserved memory\n",
+   mfn_x(start), mfn_x(end));
+
+for ( addr = start; !mfn_eq(addr, end); mfn_add(addr, 1) )
+{
+unsigned int type = page_get_ram_type(addr);
+
+if ( type == RAM_TYPE_UNKNOWN )
+{
+if ( e820_add_range(mfn_to_maddr(addr),
+mfn_to_maddr(addr) + PAGE_SIZE, E820_RESERVED) 
)
+continue;
+printk(XENLOG_WARNING
+   "IOMMU: page at %#" PRI_mfn " couldn't be reserved\n",
+   mfn_x(addr));
+return false;
+}
+
+/*
+ * Types which aren't RAM are considered good enough.
+ * Note that a page being partially RESERVED, ACPI or UNUSABLE will
+ * force Xen into assuming the whole page as having that type in
+ * practice.
+ */
+if ( type & (RAM_TYPE_RESERVED | RAM_TYPE_ACPI |
+ RAM_TYPE_UNUSABLE) )
+continue;
+
+printk(XENLOG_WARNING
+   "IOMMU: page at %#" PRI_mfn " can't be converted\n",
+   mfn_x(addr));
+return false;
+}
+
+return true;
+}
+
 /*
  * Local variables:
 

[PATCH 3/4] iommu/vt-d: switch to common RMRR checker

2024-02-01 Thread Roger Pau Monne
Use the newly introduced generic unity map checker.

Also drop the message recommending the usage of iommu_inclusive_mapping: the
ranges would end up being mapped anyway even if some of the checks above
failed, regardless of whether iommu_inclusive_mapping is set.

Signed-off-by: Roger Pau Monné 
---
 xen/drivers/passthrough/vtd/dmar.c | 14 +++---
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/dmar.c 
b/xen/drivers/passthrough/vtd/dmar.c
index 07772f178fe6..005b42706a34 100644
--- a/xen/drivers/passthrough/vtd/dmar.c
+++ b/xen/drivers/passthrough/vtd/dmar.c
@@ -642,17 +642,9 @@ acpi_parse_one_rmrr(struct acpi_dmar_header *header)
return -EEXIST;
}
 
-/* This check is here simply to detect when RMRR values are
- * not properly represented in the system memory map and
- * inform the user
- */
-if ( !e820_all_mapped(base_addr, end_addr + 1, E820_RESERVED) &&
- !e820_all_mapped(base_addr, end_addr + 1, E820_NVS) &&
- !e820_all_mapped(base_addr, end_addr + 1, E820_ACPI) )
-printk(XENLOG_WARNING VTDPREFIX
-   " RMRR [%"PRIx64",%"PRIx64"] not in reserved memory;"
-   " need \"iommu_inclusive_mapping=1\"?\n",
-base_addr, end_addr);
+if ( !iommu_unity_region_ok(maddr_to_mfn(base_addr),
+maddr_to_mfn(end_addr + PAGE_SIZE)) )
+return -EIO;
 
 rmrru = xzalloc(struct acpi_rmrr_unit);
 if ( !rmrru )
-- 
2.43.0




[PATCH 0/4] iommu/x86: fixes/improvements for unity range checks

2024-02-01 Thread Roger Pau Monne
Hello,

Patch 1 is a bugfix for AMD-Vi IVMD range type checks.

Further patches unify the IVMD/RMRR checks into a common function, and
the last patch tightens up the condition to panic when attempting to
boot on a system that has RMRR or IVMD regions over memory that Xen uses
as RAM.

Thanks, Roger.

Roger Pau Monne (4):
  amd-vi: fix IVMD memory type checks
  iommu/x86: introduce a generic IVMD/RMRR range validity helper
  iommu/vt-d: switch to common RMRR checker
  iommu/x86: make unity range checking more strict

 xen/arch/x86/include/asm/iommu.h |  3 ++
 xen/drivers/passthrough/amd/iommu_acpi.c | 33 ++---
 xen/drivers/passthrough/vtd/dmar.c   | 14 ++--
 xen/drivers/passthrough/x86/iommu.c  | 46 
 4 files changed, 55 insertions(+), 41 deletions(-)

-- 
2.43.0




[PATCH 1/4] amd-vi: fix IVMD memory type checks

2024-02-01 Thread Roger Pau Monne
The current code that parses the IVMD blocks is relaxed with regard to the
restriction that such unity regions should always fall into memory ranges
marked as reserved in the memory map.

However the type checks for the IVMD addresses are inverted, and as a result
IVMD ranges falling into RAM areas are accepted.  Note that having such ranges
in the first place is a firmware bug, as IVMD should always fall into reserved
ranges.

Fixes: ed6c77ebf0c1 ('AMD/IOMMU: check / convert IVMD ranges for being / to be 
reserved')
Signed-off-by: Roger Pau Monné 
---
Cc: o...@proton.me
---
 xen/drivers/passthrough/amd/iommu_acpi.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/xen/drivers/passthrough/amd/iommu_acpi.c 
b/xen/drivers/passthrough/amd/iommu_acpi.c
index 2e3b83014beb..ca70f4f3ae2c 100644
--- a/xen/drivers/passthrough/amd/iommu_acpi.c
+++ b/xen/drivers/passthrough/amd/iommu_acpi.c
@@ -426,9 +426,14 @@ static int __init parse_ivmd_block(const struct 
acpi_ivrs_memory *ivmd_block)
 return -EIO;
 }
 
-/* Types which won't be handed out are considered good enough. */
-if ( !(type & (RAM_TYPE_RESERVED | RAM_TYPE_ACPI |
-   RAM_TYPE_UNUSABLE)) )
+/*
+ * Types which aren't RAM are considered good enough.
+ * Note that a page being partially RESERVED, ACPI or UNUSABLE will
+ * force Xen into assuming the whole page as having that type in
+ * practice.
+ */
+if ( type & (RAM_TYPE_RESERVED | RAM_TYPE_ACPI |
+ RAM_TYPE_UNUSABLE) )
 continue;
 
 AMD_IOMMU_ERROR("IVMD: page at %lx can't be converted\n", addr);
-- 
2.43.0




Re: Next steps for Rust guest agent

2024-02-01 Thread George Dunlap
On Thu, Feb 1, 2024 at 4:57 PM Julien Grall  wrote:
>
> Hi George,
>
> On 01/02/2024 16:55, George Dunlap wrote:
> > On Thu, Jan 11, 2024 at 12:27 PM Yann Dirson  wrote:
> >>> - what should be the criteria to advertise it as official Xenproject
> >>> guest agent ?
> >>
> >> What do people think here?
> >
> > As we discussed at the community call, I think that we should
> > basically set a date at which we consider this the official Xen
> > Project guest agent.  Anyone who wants to have input can give it
> > before then.  Then once you guys think it's ready, we can start to
> > "market" it to the distros.
>
> +1
>
> >
> > Shall we say 29 February, 8 weeks from now?
>
> This is 4 weeks away. I am fine with that, but checking this is the date
> you intended to set.

...

I have no idea how that 8 got there... yes, 4 weeks is what I meant to write.

 -George



Re: Next steps for Rust guest agent

2024-02-01 Thread Julien Grall

Hi George,

On 01/02/2024 16:55, George Dunlap wrote:

On Thu, Jan 11, 2024 at 12:27 PM Yann Dirson  wrote:

- what should be the criteria to advertise it as official Xenproject
guest agent ?


What do people think here?


As we discussed at the community call, I think that we should
basically set a date at which we consider this the official Xen
Project guest agent.  Anyone who wants to have input can give it
before then.  Then once you guys think it's ready, we can start to
"market" it to the distros.


+1



Shall we say 29 February, 8 weeks from now?


This is 4 weeks away. I am fine with that, but checking this is the date 
you intended to set.


Cheers,

--
Julien Grall



Re: Next steps for Rust guest agent

2024-02-01 Thread George Dunlap
On Thu, Jan 11, 2024 at 12:27 PM Yann Dirson  wrote:
> > - what should be the criteria to advertise it as official Xenproject
> > guest agent ?
>
> What do people think here?

As we discussed at the community call, I think that we should
basically set a date at which we consider this the official Xen
Project guest agent.  Anyone who wants to have input can give it
before then.  Then once you guys think it's ready, we can start to
"market" it to the distros.

Shall we say 29 February, 8 weeks from now?

 -George



Re: [BUG]i2c_hid_acpi broken with 4.17.2 on Framework Laptop 13 AMD

2024-02-01 Thread Sébastien Chaumat
Le jeu. 1 févr. 2024 à 13:30, Sébastien Chaumat  a
écrit :

> I spotted the following warning for IRQ7 (along with IRQ6 and 10)
>
> [0.686073] fedora kernel: __irq_set_trigger: genirq: No set_type
> function for IRQ 7 (IR-IO-APIC)
>
> This comes from kernel/irq/manage.c
>
>
> int __irq_set_trigger(struct irq_desc *desc, unsigned long flags)
> {
> struct irq_chip *chip = desc->irq_data.chip;
> int ret, unmask = 0;
>
> if (!chip || !chip->irq_set_type) {
> /*
> * IRQF_TRIGGER_* but the PIC does not support multiple
> * flow-types?
> */
> pr_debug("No set_type function for IRQ %d (%s)\n",
> irq_desc_get_irq(desc),
> chip ? (chip->name ? : "unknown") : "unknown");
> return 0;
> }
> Could this have a role in the IRQ misconfiguration by xen ?
>

 Things are getting even weirder :

xen 4.18.1-pre (vanilla, no patching of pci_xen_initial_domain()), kernel
6.8.0-rc2) :

checking /sys/kernel/irq/7 :

actions: pinctrl_amd
chip_name: xen-pirq
hwirq:
name: ioapic-edge
per_cpu_count: 0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
type: edge
wakeup: disabled

now checking  xen with xl debug-key i

(XEN)IRQ:   7 vec:51 IO-APIC-level   status=030 aff:{13}/{13}
in-flight=0 d0:  7(---)
(XEN) IRQ  7 Vec 81:
(XEN)   Apic 0x00, Pin  7: vec=51 delivery=Fixed dest=P status=1
polarity=1 irr=1 trig=L mask=0 dest_id:510d

So even after dom0 is set up, the kernel considers  IRQ7 is of type edge
while xen has it registered as IO-APIC-level.


Re: [PATCH net] xen-netback: properly sync TX responses

2024-02-01 Thread patchwork-bot+netdevbpf
Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski :

On Mon, 29 Jan 2024 14:03:08 +0100 you wrote:
> Invoking the make_tx_response() / push_tx_responses() pair with no lock
> held would be acceptable only if all such invocations happened from the
> same context (NAPI instance or dealloc thread). Since this isn't the
> case, and since the interface "spec" also doesn't demand that multicast
> operations may only be performed with no in-flight transmits,
> MCAST_{ADD,DEL} processing also needs to acquire the response lock
> around the invocations.
> 
> [...]

Here is the summary with links:
  - [net] xen-netback: properly sync TX responses
https://git.kernel.org/netdev/net/c/7b55984c96ff

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html





[linux-linus test] 184545: tolerable FAIL - PUSHED

2024-02-01 Thread osstest service owner
flight 184545 linux-linus real [real]
flight 184550 linux-linus real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/184545/
http://logs.test-lab.xenproject.org/osstest/logs/184550/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-arm64-arm64-xl-xsm   8 xen-bootfail pass in 184550-retest

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail in 184550 never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail in 184550 never pass
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 184525
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 184525
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 184525
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 184525
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 184525
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 184525
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 184525
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 184525
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass

version targeted for testing:
 linux6764c317b6bb91bd806ef79adf6d9c0e428b191e
baseline version:
 linux861c0981648f5b64c86fd028ee622096eb7af05a

Last test of basis   184525  2024-01-30 04:10:43 Z2 days
Failing since184534  2024-01-30 23:45:05 Z1 days2 attempts
Testing same since   184545  2024-01-31 22:37:40 Z0 days1 attempts


People who touched revisions under test:
  Alexander Gordeev 
  Anders Roxell 
  Arthur Grillo 
  Chunhai Guo 
  Colin Ian King 
  Dan Carpenter 
  David Gow 
  Gao Xiang 
  Jingbo Xu 
  Joe Lawrence 
  Kees Cook 
  Li RongQing 
  Linus Torvalds 
  Marco Pagani 
  Marcos Paulo de Souza 
  Mark Brown 
  Martin K. Petersen 
  Mathieu Desnoyers 
  Michael Kelley 
  Michael S. Tsirkin 
  Ming Lei 
  Miroslav Benes 
  Rae Moar 
  Richard 

Re: [PATCH v4 22/47] hw/arm/aspeed: use qemu_configure_nic_device()

2024-02-01 Thread Cédric Le Goater

On 1/26/24 18:24, David Woodhouse wrote:

From: David Woodhouse 

Signed-off-by: David Woodhouse 
Acked-by: Cédric Le Goater 


and

Reviewed-by: Cédric Le Goater 

Thanks,

C.



---
  hw/arm/aspeed.c | 9 -
  1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
index cc59176563..bed5e4f40b 100644
--- a/hw/arm/aspeed.c
+++ b/hw/arm/aspeed.c
@@ -356,7 +356,6 @@ static void aspeed_machine_init(MachineState *machine)
  AspeedMachineClass *amc = ASPEED_MACHINE_GET_CLASS(machine);
  AspeedSoCClass *sc;
  int i;
-NICInfo *nd = _table[0];
  
  bmc->soc = ASPEED_SOC(object_new(amc->soc_name));

  object_property_add_child(OBJECT(machine), "soc", OBJECT(bmc->soc));
@@ -371,10 +370,10 @@ static void aspeed_machine_init(MachineState *machine)
   _fatal);
  
  for (i = 0; i < sc->macs_num; i++) {

-if ((amc->macs_mask & (1 << i)) && nd->used) {
-qemu_check_nic_model(nd, TYPE_FTGMAC100);
-qdev_set_nic_properties(DEVICE(>soc->ftgmac100[i]), nd);
-nd++;
+if ((amc->macs_mask & (1 << i)) &&
+!qemu_configure_nic_device(DEVICE(>soc->ftgmac100[i]),
+   true, NULL)) {
+break; /* No configs left; stop asking */
  }
  }
  





Re: [PATCH 1/2] xen/arm: Add imx8q{m,x} platform glue

2024-02-01 Thread John Ernberg
On 2/1/24 13:20, Julien Grall wrote:
> 
> 
> On 31/01/2024 15:32, John Ernberg wrote:
>> Hi Julien,
> 
> Hi John,
> 
>> On 1/31/24 13:22, Julien Grall wrote:
>>> Hi,
>>>
>>> On 31/01/2024 11:50, John Ernberg wrote:
 When using Linux for dom0 there are a bunch of drivers that need to do
 SMC
 SIP calls into the PSCI provider to enable certain hardware bits 
 like the
 watchdog.
>>>
>>> Do you know which protocol this is under the hood. Is this SCMI?
>>
>> I think I confused myself here when I wrote the commit log.
>>
>> The EL3 code in our case is ATF, and it does not appear to be SCMI, nor
>> PSCI. The register usage of these SMC SIP calls are as follows:
>> a0 - service
>> a1 - function
>> a2-a7 - args
>>
>> In ATF the handler is declared as a runtime service.
>>
>> Would the appropriate commmit message here be something along the lines
>> of below?
>> """
>> When using Linux for dom0 there are a bunch of drivers that need to 
>> do   SMC
>> SIP calls into the firmware to enable certain hardware bits like the
>> watchdog.
>> """
> 
> It reads better thanks.
> 
> [...]
> 
>>> But even if we restrict to dom0, have you checked that none of the SMCs
>>> use buffers?
>> I haven't found any such instances in the Linux kernel where a buffer is
>> used. Adding a call filtering like suggested below additions of such
>> functions can be discovered and adapted for if they would show up later.
>>>
>>> Rather than providing a blanket forward, to me it sounds more like you
>>> want to provide an allowlist of the SMCs. This is more futureproof and
>>> avoid the risk to expose unsafe SMCs to any domain.
>>>
>>> For an example, you can have a look at the EEMI mediator for Xilinx.
>>
>> Ack. Do you prefer to see only on SMCCC service level or also on
>> function level? (a1 register, per description earlier)
> 
> I am not sure. It will depend on whether it is correct to expose *all* 
> the functions within a service level and they have the same format.
> 
> If you can't guarantee that, then you will most likely need to allowlist 
> at the function level.
> 
> Also, do you have a spec in hand that would help to understand which 
> service/function is implemented via those SMCs?

I don't have the spec unfortunately, but I will add a filter on both 
service and function for V2 and we'll take it from there.
> 
> Cheers,
> 

Thanks! // John Ernberg

Re: [PATCH 1/2] xen/arm: Add imx8q{m,x} platform glue

2024-02-01 Thread John Ernberg
Hi Peng,

On 2/1/24 05:10, Peng Fan wrote:
>> Cc: Jonas Blixt ; xen-devel@lists.xenproject.org
>> Subject: Re: [PATCH 1/2] xen/arm: Add imx8q{m,x} platform glue
>>
>> Hi Julien,
>>
>> On 1/31/24 13:22, Julien Grall wrote:
>>> Hi,
>>>
>>> On 31/01/2024 11:50, John Ernberg wrote:
 When using Linux for dom0 there are a bunch of drivers that need to
 do SMC SIP calls into the PSCI provider to enable certain hardware
 bits like the watchdog.
>>>
>>> Do you know which protocol this is under the hood. Is this SCMI?
>>
>> I think I confused myself here when I wrote the commit log.
>>
>> The EL3 code in our case is ATF, and it does not appear to be SCMI, nor PSCI.
>> The register usage of these SMC SIP calls are as follows:
>> a0 - service
>> a1 - function
>> a2-a7 - args
>>
>> In ATF the handler is declared as a runtime service.
>>
>> Would the appropriate commmit message here be something along the lines
>> of below?
>> """
>> When using Linux for dom0 there are a bunch of drivers that need to do
>> SMC
>> SIP calls into the firmware to enable certain hardware bits like the 
>> watchdog.
>> """
>>>

 Provide a basic platform glue that implements the needed SMC forwarding.

 Signed-off-by: John Ernberg 
 ---
 NOTE: This is based on code found in NXP Xen tree located here:
 https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit
 hub.com%2Fnxp-imx%2Fimx-xen%2Fblob%2Flf-
>> 5.10.y_4.13%2Fxen%2Farch%2Far

>> m%2Fplatforms%2Fimx8qm.c=05%7C02%7Cpeng.fan%40nxp.com%7C
>> 573b599a

>> 4b4143ceca1d08dc2271e5be%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C
>> 0%7C0%7

>> C638423119777601548%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjA
>> wMDAiLCJQI

>> joiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C=ZO
>> 0TXjL6
 g0W7TIZo8x8lTNBXEZW%2BDNcLPndWlEf5D2A%3D=0
>>>
>>> Anything after --- will be removed while applied to the three. I think
>>> this NOTE should be written down in the commit message.
>>
>> Ack.
>>>
>>> You also possibly want a signed-off-by from Peng as this is his code.
>>
>> @Peng: May I add a sign-off from you?
> 
> Yeah. You could add my sign off.

Great, thanks!
> 
>>>

    xen/arch/arm/platforms/Makefile |  1 +
    xen/arch/arm/platforms/imx8qm.c | 65
 +
    2 files changed, 66 insertions(+)
    create mode 100644 xen/arch/arm/platforms/imx8qm.c

 diff --git a/xen/arch/arm/platforms/Makefile
 b/xen/arch/arm/platforms/Makefile index 8632f4115f..bec6e55d1f
>> 100644
 --- a/xen/arch/arm/platforms/Makefile
 +++ b/xen/arch/arm/platforms/Makefile
 @@ -9,5 +9,6 @@ obj-$(CONFIG_ALL_PLAT)   += sunxi.o
    obj-$(CONFIG_ALL64_PLAT) += thunderx.o
    obj-$(CONFIG_ALL64_PLAT) += xgene-storm.o
    obj-$(CONFIG_ALL64_PLAT) += brcm-raspberry-pi.o
 +obj-$(CONFIG_ALL64_PLAT) += imx8qm.o
    obj-$(CONFIG_MPSOC_PLATFORM)  += xilinx-zynqmp.o
    obj-$(CONFIG_MPSOC_PLATFORM)  += xilinx-zynqmp-eemi.o diff --git
 a/xen/arch/arm/platforms/imx8qm.c
>> b/xen/arch/arm/platforms/imx8qm.c
 new file mode 100644 index 00..a9cd9c3615
 --- /dev/null
 +++ b/xen/arch/arm/platforms/imx8qm.c
 @@ -0,0 +1,65 @@
 +/* SPDX-License-Identifier: GPL-2.0-or-later */
 +/*
 + * xen/arch/arm/platforms/imx8qm.c
 + *
 + * i.MX 8QM setup
 + *
 + * Copyright (c) 2016 Freescale Inc.
 + * Copyright 2018-2019 NXP
 + *
 + *
 + * Peng Fan 
 + */
 +
 +#include 
 +#include 
 +
 +static const char * const imx8qm_dt_compat[] __initconst = {
 +    "fsl,imx8qm",
 +    "fsl,imx8qxp",
 +    NULL
 +};
 +
 +static bool imx8qm_smc(struct cpu_user_regs *regs) {
>>>
>>> Your implementation below will not only forward SMC for dom0 but also
>>> for any non-trusted domains. Have you investigated that all the SIP
>>> calls are safe to be called by anyone?
>>
>> We use pure virtualized domUs, so we do not expect any calls to this SMC
>> interface from the guest. I'll limit it to dom0.
> 
> Would you mind to share what features are supported in your DomU?
> 
> Pure virtualized, you using xen pv or virtio?

Not at all. We're forwarding block and networking (and additionally 
usbip over networking for USB sharing), via the Xen PV drivers.
> 
> Thanks,
> Peng.
> 

Thanks! // John Ernberg

Re: [PATCH net] xen-netback: properly sync TX responses

2024-02-01 Thread Paul Durrant

On 29/01/2024 13:03, Jan Beulich wrote:

Invoking the make_tx_response() / push_tx_responses() pair with no lock
held would be acceptable only if all such invocations happened from the
same context (NAPI instance or dealloc thread). Since this isn't the
case, and since the interface "spec" also doesn't demand that multicast
operations may only be performed with no in-flight transmits,
MCAST_{ADD,DEL} processing also needs to acquire the response lock
around the invocations.

To prevent similar mistakes going forward, "downgrade" the present
functions to private helpers of just the two remaining ones using them
directly, with no forward declarations anymore. This involves renaming
what so far was make_tx_response(), for the new function of that name
to serve the new (wrapper) purpose.

While there,
- constify the txp parameters,
- correct xenvif_idx_release()'s status parameter's type,
- rename {,_}make_tx_response()'s status parameters for consistency with
   xenvif_idx_release()'s.

Fixes: 210c34dcd8d9 ("xen-netback: add support for multicast control")
Cc: sta...@vger.kernel.org
Signed-off-by: Jan Beulich 
---
Of course this could be split into two or even more separate changes,
but I think these transformations are best done all in one go.

It remains questionable whether push_tx_responses() really needs
invoking after every single _make_tx_response().

MCAST_{ADD,DEL} are odd also from another perspective: They're supposed
to come with "dummy requests", with the comment in the public header
leaving open what that means.


IIRC the only reference I had at the time was Solaris code... I don't 
really there being any documentation of the feature. I think that 
https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/xen/io/xnb.c 
is probably similar to the code I looked at to determine what the 
Solaris frontend expected. So I think 'dummy' means 'ignored'.



Netback doesn't check dummy-ness (e.g.
size being zero). Furthermore the description in the public header
doesn't really make clear that there's a restriction of one such "extra"
per dummy request. Yet the way xenvif_get_extras() works precludes
multiple ADDs or multiple DELs in a single dummy request (only the last
one would be honored afaict). While the way xenvif_tx_build_gops() works
precludes an ADD and a DEL coming together in a single dummy request
(the DEL would be ignored).


It appears the Solaris backend never coped with multiple extra_info so 
what the 'correct' semantic would be is unclear.


Anyway...

Reviewed-by: Paul Durrant 



--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -104,13 +104,12 @@ bool provides_xdp_headroom = true;
  module_param(provides_xdp_headroom, bool, 0644);
  
  static void xenvif_idx_release(struct xenvif_queue *queue, u16 pending_idx,

-  u8 status);
+  s8 status);
  
  static void make_tx_response(struct xenvif_queue *queue,

-struct xen_netif_tx_request *txp,
+const struct xen_netif_tx_request *txp,
 unsigned int extra_count,
-s8   st);
-static void push_tx_responses(struct xenvif_queue *queue);
+s8 status);
  
  static void xenvif_idx_unmap(struct xenvif_queue *queue, u16 pending_idx);
  
@@ -208,13 +207,9 @@ static void xenvif_tx_err(struct xenvif_

  unsigned int extra_count, RING_IDX end)
  {
RING_IDX cons = queue->tx.req_cons;
-   unsigned long flags;
  
  	do {

-   spin_lock_irqsave(>response_lock, flags);
make_tx_response(queue, txp, extra_count, XEN_NETIF_RSP_ERROR);
-   push_tx_responses(queue);
-   spin_unlock_irqrestore(>response_lock, flags);
if (cons == end)
break;
RING_COPY_REQUEST(>tx, cons++, txp);
@@ -465,12 +460,7 @@ static void xenvif_get_requests(struct x
for (shinfo->nr_frags = 0; nr_slots > 0 && shinfo->nr_frags < 
MAX_SKB_FRAGS;
 nr_slots--) {
if (unlikely(!txp->size)) {
-   unsigned long flags;
-
-   spin_lock_irqsave(>response_lock, flags);
make_tx_response(queue, txp, 0, XEN_NETIF_RSP_OKAY);
-   push_tx_responses(queue);
-   spin_unlock_irqrestore(>response_lock, flags);
++txp;
continue;
}
@@ -496,14 +486,8 @@ static void xenvif_get_requests(struct x
  
  		for (shinfo->nr_frags = 0; shinfo->nr_frags < nr_slots; ++txp) {

if (unlikely(!txp->size)) {
-   unsigned long flags;
-
-   spin_lock_irqsave(>response_lock, flags);
make_tx_response(queue, txp, 0,
 

Re: [PATCH v6 10/15] xen: add cache coloring allocator for domains

2024-02-01 Thread Jan Beulich
On 29.01.2024 18:18, Carlo Nonato wrote:
> @@ -157,7 +158,11 @@
>  #define PGC_static 0
>  #endif
>  
> -#define PGC_preserved (PGC_extra | PGC_static)
> +#ifndef PGC_colored
> +#define PGC_colored 0
> +#endif
> +
> +#define PGC_preserved (PGC_extra | PGC_static | PGC_colored)
>  
>  #ifndef PGT_TYPE_INFO_INITIALIZER
>  #define PGT_TYPE_INFO_INITIALIZER 0
> @@ -1504,6 +1509,7 @@ static void free_heap_pages(
>  if ( !mfn_valid(page_to_mfn(predecessor)) ||
>   !page_state_is(predecessor, free) ||
>   (predecessor->count_info & PGC_static) ||
> + (predecessor->count_info & PGC_colored) ||

How about a 2nd #define (e.g. PGC_no_buddy_merge) to use here and ...

>   (PFN_ORDER(predecessor) != order) ||
>   (page_to_nid(predecessor) != node) )
>  break;
> @@ -1528,6 +1534,7 @@ static void free_heap_pages(
>  if ( !mfn_valid(page_to_mfn(successor)) ||
>   !page_state_is(successor, free) ||
>   (successor->count_info & PGC_static) ||
> + (successor->count_info & PGC_colored) ||

... here? That'll then also avoid me commenting that I don't see why
these two PGC_* checks aren't folded.

> @@ -1943,6 +1950,161 @@ static unsigned long avail_heap_pages(
>  return free_pages;
>  }
>  
> +/*
> + * COLORED SIDE-ALLOCATOR
> + *
> + * Pages are grouped by LLC color in lists which are globally referred to as 
> the
> + * color heap. Lists are populated in end_boot_allocator().
> + * After initialization there will be N lists where N is the number of
> + * available colors on the platform.
> + */
> +static struct page_list_head *__ro_after_init _color_heap;
> +static unsigned long *__ro_after_init free_colored_pages;
> +
> +/* Memory required for buddy allocator to work with colored one */
> +#ifdef CONFIG_LLC_COLORING
> +static unsigned long __initdata buddy_alloc_size =
> +MB(CONFIG_BUDDY_ALLOCATOR_SIZE);
> +size_param("buddy-alloc-size", buddy_alloc_size);
> +
> +#define domain_num_llc_colors(d) (d)->num_llc_colors
> +#define domain_llc_color(d, i)   (d)->llc_colors[(i)]

Nit: Unnecessary parentheses inside the square brackets.

> +#else
> +static unsigned long __initdata buddy_alloc_size;
> +
> +#define domain_num_llc_colors(d) 0
> +#define domain_llc_color(d, i)   0
> +#endif
> +
> +#define color_heap(color) (&_color_heap[color])

Wouldn't this better live next to _color_heap()'s definition?

> +static void free_color_heap_page(struct page_info *pg, bool need_scrub)
> +{
> +unsigned int color = page_to_llc_color(pg);
> +struct page_list_head *head = color_heap(color);
> +
> +spin_lock(_lock);
> +
> +mark_page_free(pg, page_to_mfn(pg));
> +
> +if ( need_scrub )
> +{
> +pg->count_info |= PGC_need_scrub;
> +poison_one_page(pg);
> +}
> +
> +pg->count_info |= PGC_colored;

The page transiently losing PGC_colored is not an issue then (presumably
because of holding the heap lock)? Did you consider having mark_page_free()
also use PGC_preserved?

> +free_colored_pages[color]++;
> +page_list_add(pg, head);
> +
> +spin_unlock(_lock);
> +}
> +
> +static struct page_info *alloc_color_heap_page(unsigned int memflags,
> +   const struct domain *d)
> +{
> +struct page_info *pg = NULL;
> +unsigned int i, color = 0;
> +unsigned long max = 0;
> +bool need_tlbflush = false;
> +uint32_t tlbflush_timestamp = 0;
> +bool scrub = !(memflags & MEMF_no_scrub);
> +
> +spin_lock(_lock);
> +
> +for ( i = 0; i < domain_num_llc_colors(d); i++ )
> +{
> +unsigned long free = free_colored_pages[domain_llc_color(d, i)];
> +
> +if ( free > max )
> +{
> +color = domain_llc_color(d, i);
> +pg = page_list_first(color_heap(color));
> +max = free;
> +}
> +}

The apporach is likely fine at least initially, but: By going from where
the most free pages are, you're not necessarily evenly distributing a
domain's pages over the colors it may use, unless the domain uses its
set of colors exclusively.

> +if ( !pg )
> +{
> +spin_unlock(_lock);
> +return NULL;
> +}
> +
> +pg->count_info = PGC_state_inuse | PGC_colored |
> + (pg->count_info & PGC_need_scrub);

Isn't PGC_colored already set on the page? Together with ...

> +free_colored_pages[color]--;
> +page_list_del(pg, color_heap(color));
> +
> +if ( !(memflags & MEMF_no_tlbflush) )
> +accumulate_tlbflush(_tlbflush, pg, _timestamp);
> +
> +init_free_page_fields(pg);
> +
> +spin_unlock(_lock);
> +
> +if ( test_and_clear_bit(_PGC_need_scrub, >count_info) && scrub )

... this, can't the above be simplified?

> +scrub_one_page(pg);
> +else if ( scrub )
> +check_one_page(pg);
> +
> +if ( need_tlbflush )
> +  

[libvirt test] 184548: regressions - FAIL

2024-02-01 Thread osstest service owner
flight 184548 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/184548/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-libvirt   6 libvirt-buildfail REGR. vs. 184537

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 184537
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 184537
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 184537
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass

version targeted for testing:
 libvirt  f85a382a0e709afea3a4021de593b1679553a35a
baseline version:
 libvirt  2757e91c2b28b704d9a0b586fb60012450110b1a

Last test of basis   184537  2024-01-31 04:24:45 Z1 days
Testing same since   184548  2024-02-01 06:57:31 Z0 days1 attempts


People who touched revisions under test:
  Andrea Bolognani 
  Göran Uddeborg 
  Peter Krempa 
  Stefano Brivio 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  fail
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-arm64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-libvirt-xsm pass
 test-arm64-arm64-libvirt-xsm blocked 
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-libvirt pass
 test-arm64-arm64-libvirt blocked 
 test-armhf-armhf-libvirt pass
 test-amd64-i386-libvirt  pass
 test-amd64-amd64-libvirt-pairpass
 test-amd64-i386-libvirt-pair pass
 test-arm64-arm64-libvirt-qcow2   blocked 
 test-armhf-armhf-libvirt-qcow2   pass
 test-arm64-arm64-libvirt-raw blocked 
 test-armhf-armhf-libvirt-raw pass
 test-amd64-i386-libvirt-raw  pass
 test-amd64-amd64-libvirt-vhd pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master

Re: [PATCH net] xen-netback: properly sync TX responses

2024-02-01 Thread Paul Durrant

On 01/02/2024 00:23, Jakub Kicinski wrote:

On Mon, 29 Jan 2024 14:03:08 +0100 Jan Beulich wrote:

Invoking the make_tx_response() / push_tx_responses() pair with no lock
held would be acceptable only if all such invocations happened from the
same context (NAPI instance or dealloc thread). Since this isn't the
case, and since the interface "spec" also doesn't demand that multicast
operations may only be performed with no in-flight transmits,
MCAST_{ADD,DEL} processing also needs to acquire the response lock
around the invocations.

To prevent similar mistakes going forward, "downgrade" the present
functions to private helpers of just the two remaining ones using them
directly, with no forward declarations anymore. This involves renaming
what so far was make_tx_response(), for the new function of that name
to serve the new (wrapper) purpose.

While there,
- constify the txp parameters,
- correct xenvif_idx_release()'s status parameter's type,
- rename {,_}make_tx_response()'s status parameters for consistency with
   xenvif_idx_release()'s.


Hi Paul, is this one on your TODO list to review or should
we do our best? :)


Sorry for the delay. I'll take a look at this now.

  Paul



Re: [PATCH v6 09/15] xen/page_alloc: introduce preserved page flags macro

2024-02-01 Thread Jan Beulich
On 01.02.2024 15:49, Carlo Nonato wrote:
> On Thu, Feb 1, 2024 at 3:24 PM Jan Beulich  wrote:
>> On 29.01.2024 18:18, Carlo Nonato wrote:
>>> PGC_static and PGC_extra are flags that needs to be preserved when assigning
>>> a page. Define a new macro that groups those flags and use it instead of
>>> or'ing every time.
>>
>> While here you say where the "preserving" applies, ...
>>
>>> --- a/xen/common/page_alloc.c
>>> +++ b/xen/common/page_alloc.c
>>> @@ -157,6 +157,8 @@
>>>  #define PGC_static 0
>>>  #endif
>>>
>>> +#define PGC_preserved (PGC_extra | PGC_static)
>>
>> ... nothing is said here. From the earlier version I also seem to recall
>> that the constant was then used outside of assign_pages(). That would
>> then mean amending whatever comment would be added here.
> 
> Yes, but it was used in places where the name didn't fit (to stop merging in
> free_heap_pages()) and so I thought it would've been better to use the
> constant only for one of the two concepts: only for preserved flags in
> assign_pages().
> 
> Are you suggesting adding a comment to this #define to clarify its usage?

Yes. Albeit if I understand the earlier paragraph right, you don't use
it further (anymore). In which case the question would be: Is the patch
still needed, and if so is it rightfully part of this series?

Jan



Re: [PATCH v6 12/15] xen/arm: add Xen cache colors command line parameter

2024-02-01 Thread Jan Beulich
On 29.01.2024 18:18, Carlo Nonato wrote:
> --- a/xen/common/llc-coloring.c
> +++ b/xen/common/llc-coloring.c
> @@ -9,6 +9,9 @@
>  #include 
>  #include 
>  
> +#define XEN_DEFAULT_COLOR   0
> +#define XEN_DEFAULT_NUM_COLORS  1
> +
>  bool __ro_after_init llc_coloring_enabled;
>  boolean_param("llc-coloring", llc_coloring_enabled);
>  
> @@ -21,6 +24,9 @@ static unsigned int __ro_after_init max_nr_colors = 
> CONFIG_NR_LLC_COLORS;
>  static unsigned int __initdata dom0_colors[CONFIG_NR_LLC_COLORS];
>  static unsigned int __initdata dom0_num_colors;
>  
> +static unsigned int __ro_after_init xen_colors[CONFIG_NR_LLC_COLORS];

So unlike for Dom0 here you use the static buffer at runtime.

> +static unsigned int __ro_after_init xen_num_colors;

Taken together, I don't see the value in having XEN_DEFAULT_COLOR:
One can't simply change it and XEN_DEFAULT_NUM_COLORS to have Xen have,
say, 4 colors by default. I think you want to have xen_colors[] have
an initializer, with XEN_DEFAULT_COLOR dropped and XEN_DEFAULT_NUM_COLORS
moved. Or you actually allocate the runtime buffer if a command line
option is found, and just use ARRAY_SIZE() as the default count.

Jan



Re: Issue iommu unrecognized on amd computer

2024-02-01 Thread oxjo
I put it on the linux command line indeed, here is the log with iommu=debug.

I can install Xen with a debug build or test a fix if needed.
I just need some guides because all I did for know is apt or dnf install Xen.


Le jeudi 1 février 2024 à 11:57, Jan Beulich  a écrit :

> On 01.02.2024 09:16, o...@proton.me wrote:
> 
> > Following our interaction on matrix, I send you the boot log `xl dmesg` 
> > output of my computer, booting with `iommu=debug` options on linux 
> > 6.6.13-200.fc39.
> 
> 
> There's no "iommu=debug" anywhere in sight:
> 
> (XEN) Command line: placeholder no-real-mode edd=off
> 
> Did you mistakenly put it on the Linux command line?
> 
> Nevertheless there are relevant messages already without the extra
> verbosity (and without you using a debug build of Xen, which would
> be preferable):
> 
> (XEN) AMD-Vi: Warning: IVMD: [c9f1c000,c9f42000) is not (entirely) in 
> reserved memory
> (XEN) AMD-Vi: Error: IVMD: page at c9f1c000 can't be converted
> (XEN) AMD-Vi: Error initialization
> 
> Quoting the respective entry from the memory map:
> 
> (XEN) [c8f58000, c9f57fff] (ACPI NVS)
> 
> Both Roger and I look to agree that this condition we have
> 
> if ( !(type & (RAM_TYPE_RESERVED | RAM_TYPE_ACPI |
> RAM_TYPE_UNUSABLE)) )
> 
> is inverted in some way. We merely need to agree in which way it
> wants adjusting.
> 
> Jan Xen 4.18.0
(XEN) Xen version 4.18.0 (mockbuild@) (gcc (GCC) 13.2.1 20231205 (Red Hat 
13.2.1-6)) debug=n Wed Dec 13 17:50:42 UTC 2023
(XEN) Latest ChangeSet: 
(XEN) build-id: ea5d1960a8d06f31b000355f4662169fe12d09a6
(XEN) Bootloader: GRUB 2.06
(XEN) Command line: placeholder no-real-mode edd=off iommu=debug
(XEN) Xen image load base address: 0xc1a0
(XEN) Video information:
(XEN)  VGA is graphics mode 1920x1080, 32 bpp
(XEN)  VBE/DDC methods: none; EDID transfer time: 0 seconds
(XEN) Disc information:
(XEN)  Found 0 MBR signatures
(XEN)  Found 1 EDD information structures
(XEN) CPU Vendor: AMD, Family 25 (0x19), Model 80 (0x50), Stepping 0 (raw 
00a50f00)
(XEN) Enabling Supervisor Shadow Stacks
(XEN)   - Disabling PV32 due to CET
(XEN) EFI RAM map:
(XEN)  [, 0009efff] (usable)
(XEN)  [0009f000, 0009] (reserved)
(XEN)  [0010, 09bf] (usable)
(XEN)  [09c0, 09db0fff] (reserved)
(XEN)  [09db1000, 09ef] (usable)
(XEN)  [09f0, 09f0efff] (ACPI NVS)
(XEN)  [09f0f000, c2d57fff] (usable)
(XEN)  [c2d58000, c8f57fff] (reserved)
(XEN)  [c8f58000, c9f57fff] (ACPI NVS)
(XEN)  [c9f58000, c9fd7fff] (ACPI data)
(XEN)  [c9fd8000, c9fd9fff] (usable)
(XEN)  [c9fda000, c9ff] (reserved)
(XEN)  [ca00, cbff] (usable)
(XEN)  [cc00, cdff] (reserved)
(XEN)  [cf00, cfff] (reserved)
(XEN)  [f800, fbff] (reserved)
(XEN)  [fdc0, fdcf] (reserved)
(XEN)  [fed8, fed80fff] (reserved)
(XEN)  [0001, 000fee2f] (usable)
(XEN)  [000fee30, 00102fff] (reserved)
(XEN) ACPI: RSDP C9FD7014, 0024 (r2 LENOVO)
(XEN) ACPI: XSDT C9FD5188, 0104 (r1 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: FACP C3964000, 0114 (r6 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: DSDT C394C000, 12791 (r1 LENOVO TP-R25   1130 INTL 20180313)
(XEN) ACPI: FACS C9DF8000, 0040
(XEN) ACPI: SSDT C5E74000, 00A2 (r1 LENOVO PID0Ssdt1 INTL 20180313)
(XEN) ACPI: SSDT C5E73000, 0752 (r1 LENOVO UsbCTabl1 INTL 20180313)
(XEN) ACPI: SSDT C5E66000, 7345 (r2 LENOVO TP-R25  2 MSFT  202)
(XEN) ACPI: MSDM C5B7B000, 0055 (r3 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: BATB C5B66000, 004A (r2 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: HPET C3963000, 0038 (r1 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: APIC C3962000, 0138 (r2 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: MCFG C3961000, 003C (r1 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: SBST C396, 0030 (r1 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: WSMT C395F000, 0028 (r1 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: VFCT C393E000, D884 (r1 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: SSDT C3938000, 5354 (r2 LENOVO TP-R25  1 AMD 1)
(XEN) ACPI: CRAT C3937000, 0EC0 (r1 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: CDIT C3936000, 0029 (r1 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: FPDT C5B67000, 0034 (r1 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: SSDT C3935000, 0149 (r1 LENOVO TP-R25  1 INTL 20180313)
(XEN) ACPI: SSDT C3933000, 14C3 (r1 LENOVO TP-R25  1 INTL 20180313)
(XEN) ACPI: SSDT C3931000, 15A8 (r1 LENOVO TP-R25  1 INTL 20180313)
(XEN) ACPI: SSDT C392D000, 3B0E (r1 LENOVO TP-R25  1 INTL 20180313)
(XEN) ACPI: BGRT C392C000, 

Re: [PATCH] Fix some typos in comments

2024-02-01 Thread Andrew Cooper
On 01/02/2024 2:59 pm, Frediano Ziglio wrote:
> Signed-off-by: Frediano Ziglio 
> ---
>  arch/x86/include/arch/processor.h | 2 +-
>  include/xen/hvm/params.h  | 2 +-
>  include/xtf/console.h | 2 +-
>  include/xtf/extable.h | 4 ++--
>  4 files changed, 5 insertions(+), 5 deletions(-)

This looks like an XTF patch?

Considering that I do try to remember to spell check before committing,
I'm somewhat embarrassed by this.

~Andrew



[PATCH] Fix some typos in comments

2024-02-01 Thread Frediano Ziglio
Signed-off-by: Frediano Ziglio 
---
 arch/x86/include/arch/processor.h | 2 +-
 include/xen/hvm/params.h  | 2 +-
 include/xtf/console.h | 2 +-
 include/xtf/extable.h | 4 ++--
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/arch/processor.h 
b/arch/x86/include/arch/processor.h
index 0c33545..1c7e655 100644
--- a/arch/x86/include/arch/processor.h
+++ b/arch/x86/include/arch/processor.h
@@ -101,7 +101,7 @@
 #define X86_EXC_TS10 /* Invalid TSS. */
 #define X86_EXC_NP11 /* Segment Not Present. */
 #define X86_EXC_SS12 /* Stack-Segment Fault. */
-#define X86_EXC_GP13 /* General Porection Fault. */
+#define X86_EXC_GP13 /* General Protection Fault. */
 #define X86_EXC_PF14 /* Page Fault. */
 #define X86_EXC_SPV   15 /* PIC Spurious Interrupt Vector. */
 #define X86_EXC_MF16 /* Maths fault (x87 FPU). */
diff --git a/include/xen/hvm/params.h b/include/xen/hvm/params.h
index 886b986..0b7c05b 100644
--- a/include/xen/hvm/params.h
+++ b/include/xen/hvm/params.h
@@ -1,5 +1,5 @@
 /*
- * Xen public hvm paramter index
+ * Xen public hvm parameter index
  */
 
 #ifndef XEN_PUBLIC_HVM_PARAMS_H
diff --git a/include/xtf/console.h b/include/xtf/console.h
index caec790..16a6a23 100644
--- a/include/xtf/console.h
+++ b/include/xtf/console.h
@@ -11,7 +11,7 @@
 typedef void (*cons_output_cb)(const char *buf, size_t len);
 
 /*
- * Register a console callback.  Several callbacks can be registered for usful
+ * Register a console callback.  Several callbacks can be registered for useful
  * destinations of console text.
  */
 void register_console_callback(cons_output_cb cb);
diff --git a/include/xtf/extable.h b/include/xtf/extable.h
index e93331e..0668b7d 100644
--- a/include/xtf/extable.h
+++ b/include/xtf/extable.h
@@ -4,8 +4,8 @@
  * Exception table support.
  *
  * Allows code to tag an instruction which might fault, and where to jump to
- * in order to recover.  For more complicated recovery, a cusom handler
- * handler can be registerd.
+ * in order to recover.  For more complicated recovery, a custom handler
+ * can be registered.
  */
 #ifndef XTF_EXTABLE_H
 #define XTF_EXTABLE_H
-- 
2.43.0




Re: [PATCH v6 09/15] xen/page_alloc: introduce preserved page flags macro

2024-02-01 Thread Carlo Nonato
Hi Jan,

On Thu, Feb 1, 2024 at 3:24 PM Jan Beulich  wrote:
>
> On 29.01.2024 18:18, Carlo Nonato wrote:
> > PGC_static and PGC_extra are flags that needs to be preserved when assigning
> > a page. Define a new macro that groups those flags and use it instead of
> > or'ing every time.
>
> While here you say where the "preserving" applies, ...
>
> > --- a/xen/common/page_alloc.c
> > +++ b/xen/common/page_alloc.c
> > @@ -157,6 +157,8 @@
> >  #define PGC_static 0
> >  #endif
> >
> > +#define PGC_preserved (PGC_extra | PGC_static)
>
> ... nothing is said here. From the earlier version I also seem to recall
> that the constant was then used outside of assign_pages(). That would
> then mean amending whatever comment would be added here.

Yes, but it was used in places where the name didn't fit (to stop merging in
free_heap_pages()) and so I thought it would've been better to use the
constant only for one of the two concepts: only for preserved flags in
assign_pages().

Are you suggesting adding a comment to this #define to clarify its usage?

Thanks.

> Jan



Re: [PATCH v6 09/15] xen/page_alloc: introduce preserved page flags macro

2024-02-01 Thread Jan Beulich
On 29.01.2024 18:18, Carlo Nonato wrote:
> PGC_static and PGC_extra are flags that needs to be preserved when assigning
> a page. Define a new macro that groups those flags and use it instead of
> or'ing every time.

While here you say where the "preserving" applies, ...

> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -157,6 +157,8 @@
>  #define PGC_static 0
>  #endif
>  
> +#define PGC_preserved (PGC_extra | PGC_static)

... nothing is said here. From the earlier version I also seem to recall
that the constant was then used outside of assign_pages(). That would
then mean amending whatever comment would be added here.

Jan



Re: [PATCH v6 08/15] xen/page_alloc: introduce init_free_page_fields() helper

2024-02-01 Thread Jan Beulich
On 29.01.2024 18:18, Carlo Nonato wrote:
> Introduce a new helper to initialize fields that have different uses for
> free pages.
> 
> Signed-off-by: Carlo Nonato 

Acked-by: Jan Beulich 





Re: [PATCH v6 07/15] xen/arm: add support for cache coloring configuration via device-tree

2024-02-01 Thread Jan Beulich
On 29.01.2024 18:18, Carlo Nonato wrote:
> @@ -950,6 +951,11 @@ void __init create_domUs(void)
>  #endif
>  }
>  
> +dt_property_read_string(node, "llc-colors", _colors_str);
> +if ( !llc_coloring_enabled && llc_colors_str)
> +printk(XENLOG_WARNING
> +   "'llc-colors' found, but LLC coloring is disabled\n");

Why's this just a warning, when ...

> @@ -960,6 +966,11 @@ void __init create_domUs(void)
>  panic("Error creating domain %s (rc = %ld)\n",
>dt_node_name(node), PTR_ERR(d));
>  
> +if ( llc_coloring_enabled &&
> + (rc = domain_set_llc_colors_from_str(d, llc_colors_str)) )
> +panic("Error initializing LLC coloring for domain %s (rc = 
> %d)\n",
> +  dt_node_name(node), rc);

... this results in panic()?

> --- a/xen/common/llc-coloring.c
> +++ b/xen/common/llc-coloring.c
> @@ -254,6 +254,29 @@ int domain_set_llc_colors_domctl(struct domain *d,
>  return domain_check_colors(d);
>  }
>  
> +int domain_set_llc_colors_from_str(struct domain *d, const char *str)

__init ?

> +{
> +int err;
> +unsigned int *colors;
> +
> +if ( !str )
> +return domain_set_default_colors(d);
> +
> +colors = alloc_colors(max_nr_colors);
> +if ( !colors )
> +return -ENOMEM;
> +
> +err = parse_color_config(str, colors, max_nr_colors, >num_llc_colors);
> +if ( err )
> +{
> +printk(XENLOG_ERR "Error parsing LLC color configuration.");

Nit: No full stop at the end of log messages please.

> +return err;
> +}
> +d->llc_colors = colors;
> +
> +return domain_check_colors(d);

Same ordering issue as in the earlier patch, I think.

Jan



Re: [PATCH v2 2/2] x86/p2m: aid the compiler in folding p2m_is_...()

2024-02-01 Thread Jan Beulich
On 01.02.2024 14:32, George Dunlap wrote:
> On Thu, Jun 23, 2022 at 12:54 PM Jan Beulich  wrote:
> 
>> By using | instead of || or (in the negated form) && chances increase
>> for the compiler to recognize that both predicates can actually be
>> folded into an expression requiring just a single branch (via OR-ing
>> together the respective P2M_*_TYPES constants).
>>
>> Signed-off-by: Jan Beulich 
>>
> 
> Sorry for the delay.  Git complains that this patch is malformed:
> 
> error: `git apply --index`: error: corrupt patch at line 28
> 
> Similar complaint from patchew when it was posted:
> 
> https://patchew.org/Xen/5d6c927e-7d7c-5754-e7eb-65d1e70f6...@suse.com/

Not sure what to say. The patch surely is well-formed. It applies fine
using patch (when not taken from email). When taken from email, patch
mentions that it strips CRs (I'm running my email client on Windows),
but the saved email still applies fine. "git am" indeed is unhappy
when taking the plain file as saved from email, albeit here with an
error different from yours. If I edit the saved email to retain just
the From: and Subject: tags, all is fine.

I can't tell what git doesn't like. The error messages (the one you
see and the one I got) tell me nothing. I'm also not aware of there
being a requirement that patches I send via email need to be
"git am"-able (unlike in xsa.git, where I edit patches enough to be
suitable for that), nor am I aware how I would convince my email
client and/or server to omit whatever git doesn't like or to add
whatever git is missing.

Bottom line - your response would be actionable by me only in so far
as I could switch to using "git send-email". Which I'm afraid I'm not
going to do unless left with no other choice. The way I've been
sending patches has worked well for over 20 years, and for different
projects. (I'm aware Andrew has some special "Jan" command to apply
patches I send, but I don't know any specifics.)

Jan



Re: [XEN PATCH v3] xen/arm: ffa: reclaim shared memory on guest destroy

2024-02-01 Thread Bertrand Marquis
Hi Jens,

> On 17 Jan 2024, at 12:06, Jens Wiklander  wrote:
> 
> When an FF-A enabled guest is destroyed it may leave behind memory
> shared with SPs. This memory must be reclaimed before it's reused or an
> SP may make changes to memory used by a new unrelated guest. So when the
> domain is teared down add FF-A requests to reclaim all remaining shared
> memory.
> 
> SPs in the secure world are notified using VM_DESTROYED that a guest has
> been destroyed. An SP is supposed to relinquish all shared memory to allow
> reclaiming the memory. The relinquish operation may need to be delayed if
> the shared memory is for instance part of a DMA operation.
> 
> The domain reference counter is increased when the first FF-A shared
> memory is registered and the counter is decreased again when the last
> shared memory is reclaimed. If FF-A shared memory registrations remain
> at the end of of ffa_domain_teardown() a timer is set to try to reclaim
> the shared memory every second until the memory is reclaimed.
> 
> A few minor style fixes with a removed empty line here and an added new
> line there.
> 
> Signed-off-by: Jens Wiklander 
> ---
> 
> v3:
> - Mentioning in the commit message that there are some style fixes
> - Addressing review comments
> - Refactor the ffa_domain_teardown() path to let
>  ffa_domain_teardown_continue() do most of the work.
> 
> v2:
> - Update commit message to match the new implementation
> - Using a per domain bitfield to keep track of which SPs has been notified
>  with VM_DESTROYED
> - Holding a domain reference counter to keep the domain as a zombie domain
>  while there still is shared memory registrations remaining to be reclaimed
> - Using a timer to retry reclaiming remaining shared memory registrations
> ---
> xen/arch/arm/tee/ffa.c | 253 +
> 1 file changed, 204 insertions(+), 49 deletions(-)
> 
> diff --git a/xen/arch/arm/tee/ffa.c b/xen/arch/arm/tee/ffa.c
> index 0793c1c7585d..80ebbf4f01c6 100644
> --- a/xen/arch/arm/tee/ffa.c
> +++ b/xen/arch/arm/tee/ffa.c
> @@ -54,6 +54,7 @@
> #include 
> #include 
> #include 
> +#include 
> #include 
> 
> #include 
> @@ -144,6 +145,12 @@
>  */
> #define FFA_MAX_SHM_COUNT   32
> 
> +/*
> + * The time we wait until trying to tear down a domain again if it was
> + * blocked initially.
> + */
> +#define FFA_CTX_TEARDOWN_DELAY  SECONDS(1)
> +
> /* FF-A-1.1-REL0 section 10.9.2 Memory region handle, page 167 */
> #define FFA_HANDLE_HYP_FLAG BIT(63, ULL)
> #define FFA_HANDLE_INVALID  0xULL
> @@ -384,11 +391,6 @@ struct ffa_ctx {
> unsigned int page_count;
> /* FF-A version used by the guest */
> uint32_t guest_vers;
> -/*
> - * Number of SPs that we have sent a VM created signal to, used in
> - * ffa_domain_teardown() to know which SPs need to be signalled.
> - */
> -uint16_t create_signal_count;
> bool rx_is_free;
> /* Used shared memory objects, struct ffa_shm_mem */
> struct list_head shm_list;
> @@ -402,6 +404,15 @@ struct ffa_ctx {
> spinlock_t tx_lock;
> spinlock_t rx_lock;
> spinlock_t lock;
> +/* Used if domain can't be torn down immediately */
> +struct domain *teardown_d;
> +struct list_head teardown_list;
> +s_time_t teardown_expire;
> +/*
> + * Used for ffa_domain_teardown() to keep track of which SPs should be
> + * notified that this guest is being destroyed.
> + */
> +unsigned long vm_destroy_bitmap[];
> };
> 
> struct ffa_shm_mem {
> @@ -436,6 +447,12 @@ static void *ffa_tx __read_mostly;
> static DEFINE_SPINLOCK(ffa_rx_buffer_lock);
> static DEFINE_SPINLOCK(ffa_tx_buffer_lock);
> 
> +
> +/* Used to track domains that could not be torn down immediately. */
> +static struct timer ffa_teardown_timer;
> +static struct list_head ffa_teardown_head;
> +static DEFINE_SPINLOCK(ffa_teardown_lock);
> +
> static bool ffa_get_version(uint32_t *vers)
> {
> const struct arm_smccc_1_2_regs arg = {
> @@ -853,7 +870,6 @@ static int32_t handle_partition_info_get(uint32_t w1, 
> uint32_t w2, uint32_t w3,
> goto out_rx_release;
> }
> 
> -
> memcpy(ctx->rx, ffa_rx, sz);
> }
> ctx->rx_is_free = false;
> @@ -992,53 +1008,75 @@ static void put_shm_pages(struct ffa_shm_mem *shm)
> }
> }
> 
> -static bool inc_ctx_shm_count(struct ffa_ctx *ctx)
> +static bool inc_ctx_shm_count(struct domain *d, struct ffa_ctx *ctx)
> {
> bool ret = true;
> 
> spin_lock(>lock);
> +
> +/*
> + * If this is the first shm added, increase the domain reference
> + * counter as we need to keep domain around a bit longer to reclaim the
> + * shared memory in the teardown path.
> + */
> +if ( !ctx->shm_count )
> +get_knownalive_domain(d);
> +
> if (ctx->shm_count >= FFA_MAX_SHM_COUNT)
> ret = false;
> else
> ctx->shm_count++;
> +
> spin_unlock(>lock);
> 
> return ret;
> }
> 
> 

Re: [PATCH v6 04/15] xen/arm: add Dom0 cache coloring support

2024-02-01 Thread Julien Grall

Hi Jan,

On 01/02/2024 13:39, Jan Beulich wrote:

On 01.02.2024 14:35, Julien Grall wrote:

Hi Jan,

On 01/02/2024 13:30, Jan Beulich wrote:

On 29.01.2024 18:18, Carlo Nonato wrote:

Add a command line parameter to allow the user to set the coloring
configuration for Dom0.
A common configuration syntax for cache colors is introduced and
documented.
Take the opportunity to also add:
   - default configuration notion.
   - function to check well-formed configurations.

Direct mapping Dom0 isn't possible when coloring is enabled, so
CDF_directmap flag is removed when creating it.


What implications does this have?


--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -963,6 +963,15 @@ Controls for the dom0 IOMMU setup.
   
   Specify a list of IO ports to be excluded from dom0 access.
   
+### dom0-llc-colors

+> `= List of [  | - ]`
+
+> Default: `All available LLC colors`
+
+Specify dom0 LLC color configuration. This options is available only when
+`CONFIG_LLC_COLORING` is enabled. If the parameter is not set, all available
+colors are used.


Even Arm already has a "dom0=" option. Is there a particular reason why
this doesn't become a new sub-option there?

As to meaning: With just a single , that's still a color value
then (and not a count of colors)? Wouldn't it make sense to have a
simpler variant available where you just say how many, and a suitable
set/range is then picked?

Finally a nit: "This option is ...".


@@ -2188,10 +2190,16 @@ void __init create_dom0(void)
   panic("SVE vector length error\n");
   }
   
-dom0 = domain_create(0, _cfg, CDF_privileged | CDF_directmap);

+if ( !llc_coloring_enabled )
+flags |= CDF_directmap;
+
+dom0 = domain_create(0, _cfg, flags);
   if ( IS_ERR(dom0) )
   panic("Error creating domain 0 (rc = %ld)\n", PTR_ERR(dom0));
   
+if ( llc_coloring_enabled && (rc = dom0_set_llc_colors(dom0)) )

+panic("Error initializing LLC coloring for domain 0 (rc = %d)", rc);


As for the earlier patch, I find panic()ing here dubious. You can continue
quite fine, with a warning and perhaps again tainting the system.

There are arguments for both approach.


In which case - perhaps allow for both? With a Kconfig-established
default and a command line option to override?


Perhaps. But this is a separate discussion from this series. What Carlo 
has been doing match the surrounding code on Arm.





I agree that you can continue but
technically this is not the configuration you asked. Someone may not
notice the tainting until it is too late (read they have done
investigation).

Bear in mind that the user for cache coloring will be in very
specialized environment.


s/will/may/ I suppose. People may enable the option without being in
any specialized environment.


Sure. But again, why would you want to boot with a half broken 
configuration?


In a lot of cases, you are not making a favor to the admin to continue 
to boot. It is easy to say there is a warning in the logs, but this can 
often be overlooked and difficult to diagnostic afterwards. For 
instance, if you think about cache coloring the issue would be latency.


I don't think a lambda users will be able to easily figure out that 
their configuration was wrong.


Futhermore, when you operate at scale, I feel it is better to have an 
early boot crash rather than allowing the system to boot (parsing the 
logs is feasible but IMO risky as they are not stable).





So if you can't enable cache coloring in
production, then something really wrong has happened and continue to
boot is probably not right.

This matches the approach for Arm we have been using since the
beginning. And I will strongly argue to continue this way.


I'm okay with this, and here (for Arm-specific code) it may even be okay
to do so without further justification. But in the earlier patch where
common code is affected, I'll insist on at least justifying this behavior.


See above for a justification. If someone asks for cache coloring, then 
you most likely don't want to continue without cache coloring.


If you dislike the panic() in common code, then we can simply modify the 
function to return an error and move the panic() in the Arm code. This 
is not my preference, but I am under the impression that we both have 
very diverging view how to handle boot error and it will be hard to 
reconcile them (at least in this series, this can be done afterwards if 
somone fancy to write a series matching what you proposed above). So 
this would be the second best option for me.


Cheers,

--
Julien Grall



Re: [PATCH v6 05/15] xen: extend domctl interface for cache coloring

2024-02-01 Thread Jan Beulich
On 29.01.2024 18:18, Carlo Nonato wrote:
> @@ -858,6 +859,16 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) 
> u_domctl)
>  __HYPERVISOR_domctl, "h", u_domctl);
>  break;
>  
> +case XEN_DOMCTL_set_llc_colors:
> +if ( !llc_coloring_enabled )
> +break;

With "ret" still being 0, this amounts to "successfully ignored". Ought
to be -EOPNOTSUPP, I guess.

> +ret = domain_set_llc_colors_domctl(d, >u.set_llc_colors);
> +if ( ret == -EEXIST )
> +printk(XENLOG_ERR
> +   "Can't set LLC colors on an already created domain\n");

If at all a dprintk(). But personally I think even that's too much - we
don't do so elsewhere, I don't think.

> --- a/xen/common/llc-coloring.c
> +++ b/xen/common/llc-coloring.c
> @@ -4,6 +4,7 @@
>   *
>   * Copyright (C) 2022 Xilinx Inc.
>   */
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -229,6 +230,30 @@ int __init dom0_set_llc_colors(struct domain *d)
>  return domain_check_colors(d);
>  }
>  
> +int domain_set_llc_colors_domctl(struct domain *d,
> + const struct xen_domctl_set_llc_colors 
> *config)

What purpose has the "domctl" in the function name?

> +{
> +unsigned int *colors;
> +
> +if ( d->num_llc_colors )
> +return -EEXIST;
> +
> +if ( !config->num_llc_colors )
> +return domain_set_default_colors(d);
> +
> +colors = alloc_colors(config->num_llc_colors);
> +if ( !colors )
> +return -ENOMEM;

Hmm, I see here you call the function without first having bounds checked
the input. But in case of too big a value, -ENOMEM is inappropriate, so
such a check wants adding up front anyway.

> +if ( copy_from_guest(colors, config->llc_colors, config->num_llc_colors) 
> )
> +return -EFAULT;

There again wants to be a check that the pointed to values are the same,
or at least of the same size. Else you'd need to do element-wise copy.

> +d->llc_colors = colors;
> +d->num_llc_colors = config->num_llc_colors;
> +
> +return domain_check_colors(d);

And if this fails, you leave the domain with the bad settings? Shouldn't
you check and only then store pointer and count?

> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -1190,6 +1190,13 @@ struct xen_domctl_vmtrace_op {
>  typedef struct xen_domctl_vmtrace_op xen_domctl_vmtrace_op_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_vmtrace_op_t);
>  
> +struct xen_domctl_set_llc_colors {
> +/* IN LLC coloring parameters */
> +uint32_t num_llc_colors;
> +uint32_t padding;

I see you've added padding, but: You don't check it to be zero. Plus
the overwhelming majority of padding fields is named "pad".

Jan



Re: [PATCH v2 1/3] x86/PoD: simplify / improve p2m_pod_cache_add()

2024-02-01 Thread George Dunlap
On Tue, Jan 4, 2022 at 9:48 AM Jan Beulich  wrote:

> Avoid recurring MFN -> page or page -> MFN translations. Drop the pretty
> pointless local variable "p". Make sure the MFN logged in a debugging
> error message is actually the offending one. Return negative errno
> values rather than -1 (presently no caller really cares, but imo this
> should change). Adjust style.
>
> Signed-off-by: Jan Beulich 
>

Sorry, not sure how I managed to miss this:

Reviewed-by: George Dunlap 


Re: [PATCH v6 04/15] xen/arm: add Dom0 cache coloring support

2024-02-01 Thread Jan Beulich
On 01.02.2024 14:35, Julien Grall wrote:
> Hi Jan,
> 
> On 01/02/2024 13:30, Jan Beulich wrote:
>> On 29.01.2024 18:18, Carlo Nonato wrote:
>>> Add a command line parameter to allow the user to set the coloring
>>> configuration for Dom0.
>>> A common configuration syntax for cache colors is introduced and
>>> documented.
>>> Take the opportunity to also add:
>>>   - default configuration notion.
>>>   - function to check well-formed configurations.
>>>
>>> Direct mapping Dom0 isn't possible when coloring is enabled, so
>>> CDF_directmap flag is removed when creating it.
>>
>> What implications does this have?
>>
>>> --- a/docs/misc/xen-command-line.pandoc
>>> +++ b/docs/misc/xen-command-line.pandoc
>>> @@ -963,6 +963,15 @@ Controls for the dom0 IOMMU setup.
>>>   
>>>   Specify a list of IO ports to be excluded from dom0 access.
>>>   
>>> +### dom0-llc-colors
>>> +> `= List of [  | - ]`
>>> +
>>> +> Default: `All available LLC colors`
>>> +
>>> +Specify dom0 LLC color configuration. This options is available only when
>>> +`CONFIG_LLC_COLORING` is enabled. If the parameter is not set, all 
>>> available
>>> +colors are used.
>>
>> Even Arm already has a "dom0=" option. Is there a particular reason why
>> this doesn't become a new sub-option there?
>>
>> As to meaning: With just a single , that's still a color value
>> then (and not a count of colors)? Wouldn't it make sense to have a
>> simpler variant available where you just say how many, and a suitable
>> set/range is then picked?
>>
>> Finally a nit: "This option is ...".
>>
>>> @@ -2188,10 +2190,16 @@ void __init create_dom0(void)
>>>   panic("SVE vector length error\n");
>>>   }
>>>   
>>> -dom0 = domain_create(0, _cfg, CDF_privileged | CDF_directmap);
>>> +if ( !llc_coloring_enabled )
>>> +flags |= CDF_directmap;
>>> +
>>> +dom0 = domain_create(0, _cfg, flags);
>>>   if ( IS_ERR(dom0) )
>>>   panic("Error creating domain 0 (rc = %ld)\n", PTR_ERR(dom0));
>>>   
>>> +if ( llc_coloring_enabled && (rc = dom0_set_llc_colors(dom0)) )
>>> +panic("Error initializing LLC coloring for domain 0 (rc = %d)", 
>>> rc);
>>
>> As for the earlier patch, I find panic()ing here dubious. You can continue
>> quite fine, with a warning and perhaps again tainting the system.
> There are arguments for both approach.

In which case - perhaps allow for both? With a Kconfig-established
default and a command line option to override?

> I agree that you can continue but 
> technically this is not the configuration you asked. Someone may not 
> notice the tainting until it is too late (read they have done 
> investigation).
> 
> Bear in mind that the user for cache coloring will be in very 
> specialized environment.

s/will/may/ I suppose. People may enable the option without being in
any specialized environment.

> So if you can't enable cache coloring in 
> production, then something really wrong has happened and continue to 
> boot is probably not right.
> 
> This matches the approach for Arm we have been using since the 
> beginning. And I will strongly argue to continue this way.

I'm okay with this, and here (for Arm-specific code) it may even be okay
to do so without further justification. But in the earlier patch where
common code is affected, I'll insist on at least justifying this behavior.

Jan



Re: [PATCH v6 04/15] xen/arm: add Dom0 cache coloring support

2024-02-01 Thread Julien Grall

Hi Jan,

On 01/02/2024 13:30, Jan Beulich wrote:

On 29.01.2024 18:18, Carlo Nonato wrote:

Add a command line parameter to allow the user to set the coloring
configuration for Dom0.
A common configuration syntax for cache colors is introduced and
documented.
Take the opportunity to also add:
  - default configuration notion.
  - function to check well-formed configurations.

Direct mapping Dom0 isn't possible when coloring is enabled, so
CDF_directmap flag is removed when creating it.


What implications does this have?


--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -963,6 +963,15 @@ Controls for the dom0 IOMMU setup.
  
  Specify a list of IO ports to be excluded from dom0 access.
  
+### dom0-llc-colors

+> `= List of [  | - ]`
+
+> Default: `All available LLC colors`
+
+Specify dom0 LLC color configuration. This options is available only when
+`CONFIG_LLC_COLORING` is enabled. If the parameter is not set, all available
+colors are used.


Even Arm already has a "dom0=" option. Is there a particular reason why
this doesn't become a new sub-option there?

As to meaning: With just a single , that's still a color value
then (and not a count of colors)? Wouldn't it make sense to have a
simpler variant available where you just say how many, and a suitable
set/range is then picked?

Finally a nit: "This option is ...".


@@ -2188,10 +2190,16 @@ void __init create_dom0(void)
  panic("SVE vector length error\n");
  }
  
-dom0 = domain_create(0, _cfg, CDF_privileged | CDF_directmap);

+if ( !llc_coloring_enabled )
+flags |= CDF_directmap;
+
+dom0 = domain_create(0, _cfg, flags);
  if ( IS_ERR(dom0) )
  panic("Error creating domain 0 (rc = %ld)\n", PTR_ERR(dom0));
  
+if ( llc_coloring_enabled && (rc = dom0_set_llc_colors(dom0)) )

+panic("Error initializing LLC coloring for domain 0 (rc = %d)", rc);


As for the earlier patch, I find panic()ing here dubious. You can continue
quite fine, with a warning and perhaps again tainting the system.
There are arguments for both approach. I agree that you can continue but 
technically this is not the configuration you asked. Someone may not 
notice the tainting until it is too late (read they have done 
investigation).


Bear in mind that the user for cache coloring will be in very 
specialized environment. So if you can't enable cache coloring in 
production, then something really wrong has happened and continue to 
boot is probably not right.


This matches the approach for Arm we have been using since the 
beginning. And I will strongly argue to continue this way.


Cheers,

--
Julien Grall



Re: [PATCH v2 2/2] x86/p2m: aid the compiler in folding p2m_is_...()

2024-02-01 Thread George Dunlap
On Thu, Jun 23, 2022 at 12:54 PM Jan Beulich  wrote:

> By using | instead of || or (in the negated form) && chances increase
> for the compiler to recognize that both predicates can actually be
> folded into an expression requiring just a single branch (via OR-ing
> together the respective P2M_*_TYPES constants).
>
> Signed-off-by: Jan Beulich 
>

Sorry for the delay.  Git complains that this patch is malformed:

error: `git apply --index`: error: corrupt patch at line 28

Similar complaint from patchew when it was posted:

https://patchew.org/Xen/5d6c927e-7d7c-5754-e7eb-65d1e70f6...@suse.com/

 -George


Re: [PATCH v6 01/15] xen/common: add cache coloring common code

2024-02-01 Thread Jan Beulich
On 29.01.2024 18:17, Carlo Nonato wrote:
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -626,6 +626,11 @@ struct domain
>  
>  /* Holding CDF_* constant. Internal flags for domain creation. */
>  unsigned int cdf;
> +
> +#ifdef CONFIG_LLC_COLORING
> +unsigned const int *llc_colors;
> +unsigned int num_llc_colors;
> +#endif
>  };

Btw, at this point flipping the order of the two fields will be more
efficient for 64-bit architectures (consuming a padding hole rather
than adding yet another one).

Jan



Re: [PATCH v6 04/15] xen/arm: add Dom0 cache coloring support

2024-02-01 Thread Jan Beulich
On 29.01.2024 18:18, Carlo Nonato wrote:
> Add a command line parameter to allow the user to set the coloring
> configuration for Dom0.
> A common configuration syntax for cache colors is introduced and
> documented.
> Take the opportunity to also add:
>  - default configuration notion.
>  - function to check well-formed configurations.
> 
> Direct mapping Dom0 isn't possible when coloring is enabled, so
> CDF_directmap flag is removed when creating it.

What implications does this have?

> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -963,6 +963,15 @@ Controls for the dom0 IOMMU setup.
>  
>  Specify a list of IO ports to be excluded from dom0 access.
>  
> +### dom0-llc-colors
> +> `= List of [  | - ]`
> +
> +> Default: `All available LLC colors`
> +
> +Specify dom0 LLC color configuration. This options is available only when
> +`CONFIG_LLC_COLORING` is enabled. If the parameter is not set, all available
> +colors are used.

Even Arm already has a "dom0=" option. Is there a particular reason why
this doesn't become a new sub-option there?

As to meaning: With just a single , that's still a color value
then (and not a count of colors)? Wouldn't it make sense to have a
simpler variant available where you just say how many, and a suitable
set/range is then picked?

Finally a nit: "This option is ...".

> @@ -2188,10 +2190,16 @@ void __init create_dom0(void)
>  panic("SVE vector length error\n");
>  }
>  
> -dom0 = domain_create(0, _cfg, CDF_privileged | CDF_directmap);
> +if ( !llc_coloring_enabled )
> +flags |= CDF_directmap;
> +
> +dom0 = domain_create(0, _cfg, flags);
>  if ( IS_ERR(dom0) )
>  panic("Error creating domain 0 (rc = %ld)\n", PTR_ERR(dom0));
>  
> +if ( llc_coloring_enabled && (rc = dom0_set_llc_colors(dom0)) )
> +panic("Error initializing LLC coloring for domain 0 (rc = %d)", rc);

As for the earlier patch, I find panic()ing here dubious. You can continue
quite fine, with a warning and perhaps again tainting the system.

> --- a/xen/common/llc-coloring.c
> +++ b/xen/common/llc-coloring.c
> @@ -17,6 +17,63 @@ size_param("llc-way-size", llc_way_size);
>  /* Number of colors available in the LLC */
>  static unsigned int __ro_after_init max_nr_colors = CONFIG_NR_LLC_COLORS;
>  
> +static unsigned int __initdata dom0_colors[CONFIG_NR_LLC_COLORS];
> +static unsigned int __initdata dom0_num_colors;
> +
> +/*
> + * Parse the coloring configuration given in the buf string, following the
> + * syntax below.
> + *
> + * COLOR_CONFIGURATION ::= COLOR | RANGE,...,COLOR | RANGE
> + * RANGE   ::= COLOR-COLOR
> + *
> + * Example: "0,2-6,15-16" represents the set of colors: 0,2,3,4,5,6,15,16.
> + */
> +static int parse_color_config(const char *buf, unsigned int *colors,
> +  unsigned int num_colors, unsigned int 
> *num_parsed)

Is this function going to be re-used? If not, it wants to be __init.
If so, I wonder where the input string is going to come from ...

Also "num_colors" looks to be misnamed - doesn't this specify an
upper bound only?

> +{
> +const char *s = buf;
> +
> +if ( !colors || !num_colors )
> +return -EINVAL;

Why do you check colors but not ...

> +*num_parsed = 0;

... num_parsed? I think internal functions don't need such NULL checks.

> +while ( *s != '\0' )
> +{
> +if ( *s != ',' )

Hmm, this way you also accept leading/trailing commas as well as multiple
consecutive ones. Elsewhere we're more strict.

> @@ -70,12 +150,85 @@ void __init llc_coloring_init(void)
>  arch_llc_coloring_init();
>  }
>  
> +void domain_llc_coloring_free(struct domain *d)
> +{
> +xfree(__va(__pa(d->llc_colors)));

This __va(__pa()) trick deserves a comment, I think.

> +}
> +
>  void domain_dump_llc_colors(const struct domain *d)
>  {
>  printk("Domain %pd has %u LLC colors: ", d, d->num_llc_colors);
>  print_colors(d->llc_colors, d->num_llc_colors);
>  }
>  
> +static unsigned int *alloc_colors(unsigned int num_colors)
> +{
> +unsigned int *colors;
> +
> +if ( num_colors > max_nr_colors )
> +return NULL;

Shouldn't check_colors() have made sure of this? If so, convert to
ASSERT()?

> +colors = xmalloc_array(unsigned int, num_colors);
> +if ( !colors )
> +return NULL;

These last two lines are redundant with ...

> +return colors;

... this one. Question then is whether this is useful at all as a
separate helper function.

> +}
> +
> +static int domain_check_colors(const struct domain *d)
> +{
> +if ( !d->num_llc_colors )
> +{
> +printk(XENLOG_ERR "No LLC color config found for %pd\n", d);
> +return -ENODATA;
> +}
> +else if ( !check_colors(d->llc_colors, d->num_llc_colors) )

I generally recommend against use of "else" in cases like this one.

> +{
> +printk(XENLOG_ERR "Bad LLC color config for %pd\n", d);
> +return -EINVAL;
> +  

Re: [PATCH v6 01/15] xen/common: add cache coloring common code

2024-02-01 Thread Jan Beulich
On 29.01.2024 18:17, Carlo Nonato wrote:
> --- a/xen/arch/Kconfig
> +++ b/xen/arch/Kconfig
> @@ -31,3 +31,20 @@ config NR_NUMA_NODES
> associated with multiple-nodes management. It is the upper bound of
> the number of NUMA nodes that the scheduler, memory allocation and
> other NUMA-aware components can handle.
> +
> +config LLC_COLORING
> + bool "Last Level Cache (LLC) coloring" if EXPERT
> + depends on HAS_LLC_COLORING
> +
> +config NR_LLC_COLORS
> + int "Maximum number of LLC colors"
> + range 2 1024

What's the reasoning behind this upper bound? IOW - can something to this
effect be said in the description, please?

> + default 128
> + depends on LLC_COLORING
> + help
> +   Controls the build-time size of various arrays associated with LLC
> +   coloring. Refer to cache coloring documentation for how to compute the
> +   number of colors supported by the platform. This is only an upper
> +   bound. The runtime value is autocomputed or manually set via cmdline.
> +   The default value corresponds to an 8 MiB 16-ways LLC, which should be
> +   more than what needed in the general case.

Aiui while not outright wrong, non-power-of-2 values are meaningless to
specify. Perhaps that is worth mentioning (if not making this a value
that's used as exponent of 2 in the first place)?

As to the default and its description: As said for the documentation,
doesn't what this corresponds to also depend on cache line size? Even
if this was still Arm-specific rather than common code, I'd question
whether now and forever Arm chips may only use one pre-determined cache
line size.

> --- /dev/null
> +++ b/xen/common/llc-coloring.c
> @@ -0,0 +1,87 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Last Level Cache (LLC) coloring common code
> + *
> + * Copyright (C) 2022 Xilinx Inc.
> + */
> +#include 
> +#include 
> +#include 
> +
> +bool __ro_after_init llc_coloring_enabled;
> +boolean_param("llc-coloring", llc_coloring_enabled);

The variable has no use right now afaics, so it's unclear whether (a) it
is legitimately non-static and (b) placed in an appropriate section.

> +/* Size of an LLC way */
> +static unsigned int __ro_after_init llc_way_size;
> +size_param("llc-way-size", llc_way_size);
> +/* Number of colors available in the LLC */
> +static unsigned int __ro_after_init max_nr_colors = CONFIG_NR_LLC_COLORS;
> +
> +static void print_colors(const unsigned int *colors, unsigned int num_colors)
> +{
> +unsigned int i;
> +
> +printk("{ ");
> +for ( i = 0; i < num_colors; i++ ) {

Nit (style): Brace placement.

> +unsigned int start = colors[i], end = colors[i];
> +
> +printk("%u", start);
> +
> +for ( ;
> +  i < num_colors - 1 && colors[i] + 1 == colors[i + 1];

To reduce the number of array accesses, may I suggest to use "end + 1"
here instead of "colors[i] + 1"? (The initializer of "end" could also
be "start", but I guess the compiler will recognize this anyway.) This
would then (imo) also better justify the desire for having "end" in
the first place.

> +  i++, end++ );

Imo for clarity the semicolon want to live on its own line.

> +static void dump_coloring_info(unsigned char key)

This being common code now, I think it would be good practice to have
cf_check here right away, even if for now (for whatever reason) the
feature is meant to be limited to Arm. (Albeit see below for whether
this is to remain that way.)

> +void __init llc_coloring_init(void)
> +{
> +if ( !llc_way_size && !(llc_way_size = get_llc_way_size()) )
> +panic("Probed LLC coloring way size is 0 and no custom value 
> found\n");
> +
> +/*
> + * The maximum number of colors must be a power of 2 in order to 
> correctly
> + * map them to bits of an address, so also the LLC way size must be so.
> + */
> +if ( llc_way_size & (llc_way_size - 1) )
> +panic("LLC coloring way size (%u) isn't a power of 2\n", 
> llc_way_size);
> +
> +max_nr_colors = llc_way_size >> PAGE_SHIFT;

With this unconditionally initialized here, what's the purpose of the
variable's initializer?

> +if ( max_nr_colors < 2 || max_nr_colors > CONFIG_NR_LLC_COLORS )
> +panic("Number of LLC colors (%u) not in range [2, %u]\n",
> +  max_nr_colors, CONFIG_NR_LLC_COLORS);

I'm not convinced of panic()ing here (including the earlier two
instances). You could warn, taint, disable, and continue. If you want
to stick to panic(), please justify doing so in the description.

Plus, if you panic(), shouldn't that be limited to llc_coloring_enabled
being true? Or - not visible here, due to the lack of a caller of the
function - is that meant to be taken care of by the caller (to not call
here when the flag is off)? I think it would be cleaner if the check
lived here; quite possibly that would then further permit the flag
variable to become static.

> +register_keyhandler('K', 

Re: [BUG]i2c_hid_acpi broken with 4.17.2 on Framework Laptop 13 AMD

2024-02-01 Thread Sébastien Chaumat
I spotted the following warning for IRQ7 (along with IRQ6 and 10)

[0.686073] fedora kernel: __irq_set_trigger: genirq: No set_type
function for IRQ 7 (IR-IO-APIC)

This comes from kernel/irq/manage.c


int __irq_set_trigger(struct irq_desc *desc, unsigned long flags)
{
struct irq_chip *chip = desc->irq_data.chip;
int ret, unmask = 0;

if (!chip || !chip->irq_set_type) {
/*
* IRQF_TRIGGER_* but the PIC does not support multiple
* flow-types?
*/
pr_debug("No set_type function for IRQ %d (%s)\n",
irq_desc_get_irq(desc),
chip ? (chip->name ? : "unknown") : "unknown");
return 0;
}

Could this have a role in the IRQ misconfiguration by xen ?




Le lun. 22 janv. 2024 à 21:59, Mario Limonciello 
a écrit :

> On 1/22/2024 11:06, Sébastien Chaumat wrote:
> >
> >
> > Le mer. 17 janv. 2024 à 03:20, Mario Limonciello
> > mailto:mario.limoncie...@amd.com>> a écrit :
> >
> > On 1/16/2024 10:18, Jan Beulich wrote:
> >  > On 16.01.2024 16:52, Sébastien Chaumat wrote:
> >  >> Le mar. 2 janv. 2024 à 21:23, Sébastien Chaumat
> > mailto:euidz...@gmail.com>> a
> >  >> écrit :
> >  >>
> >  >>>
> >  >>>   output of gpioinfo
> >  
> >   kernel alone :
> >  
> >    line   5: unnamed input active-low
> > consumer=interrupt
> >    line  84: unnamed input active-low
> > consumer=interrupt
> >  
> >   xen:
> >  
> >    line   5: unnamed input active-low
> >    line  84: unnamed input active-low
> >  
> >   xen with skipping IRQ7 double init :
> >  
> >    line   5: unnamed input active-low
> > consumer=interrupt
> >    line  84: unnamed input active-low
> >  
> >  
> >   So definitely progressing.
> >  
> >  >>>
> >  >>> Checking /sys/kernel/irq/7
> >  >>>
> >  >>> kernel alone :
> >  >>>   actions: pinctrl_amd
> >  >>>   chip_name: IR-IO-APIC
> >  >>>   hwirq: 7
> >  >>>   name: fasteoi
> >  >>>   per_cpu_count: 0,0,0,0,0,20,0,0,0,0,0,0,0,0,0,0
> >  >>>   type: level
> >  >>>   wakeup: enabled
> >  >>>
> >  >>> xen skipping IRQ7 double init :
> >  >>>
> >  >>> actions: pinctrl_amd
> >  >>>   chip_name: xen-pirq
> >  >>>   hwirq:
> >  >>>   name: ioapic-level
> >  >>>   per_cpu_count: 0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
> >  >>>   type: edge
> >  >>>   wakeup: disabled
> >  >>>
> >  >>> So the skip of IRQ7 in pci_xen_initial_domain() sets the
> > correct handler
> >  >>>   (IIUC xen uses the ioapic-level and handles the eoi
> > separately), but not
> >  >>> the correct type (still edge).
> >  >>> I guess this may explains the results above.
> >  >>>
> >  >>>
> >  >>   Mario (in CC) patched the pinctrl_amd to flush pending
> > interrupt before
> >  >> starting the driver for the GPIO.
> >  >>
> >  >> This helped in  the sense of there's no more pending interrupt
> > on IRQ7
> >  >> (whatever the handler is, level or edge) but then the touchpad
> > is not
> >  >> detected by i2c-hid.
> >  >>
> >  >> Is there any work in progress related to the incorrect IRQ
> > configuration ?
> >  >
> >  > I'm not aware of any. As per my recollection it's still not
> entirely
> >  > clear where in the kernel things go astray. And to be honest I
> don't
> >  > feel comfortable trying to half-blindly address this, e.g. by
> trying
> >  > to circumvent / defer the early setting up of the low 16 IRQs.
> >  >
> >  > Jan
> >
> > Shot in the dark - but could this be a problem where PCAT_COMPAT from
> > the MADT is being ignored causing PIC not to be setup properly in the
> > Xen case?
> >
> > See https://lore.kernel.org/all/875y2u5s8g.ffs@tglx/
> >  for some context.
> >
> > At least we know that no MADT override is found by xen for INT7 as no
> > INT_SRC_OVR message is printed.
> >
> > Do we expect one @Mario Limonciello 
> ?
>
> No; the INT_SRV_OVR you'll see on Framework 13 AMD is on IRQ 2 and IRQ 9.
>
>


Re: [PATCH 1/2] xen/arm: Add imx8q{m,x} platform glue

2024-02-01 Thread Julien Grall




On 31/01/2024 15:32, John Ernberg wrote:

Hi Julien,


Hi John,


On 1/31/24 13:22, Julien Grall wrote:

Hi,

On 31/01/2024 11:50, John Ernberg wrote:

When using Linux for dom0 there are a bunch of drivers that need to do
SMC
SIP calls into the PSCI provider to enable certain hardware bits like the
watchdog.


Do you know which protocol this is under the hood. Is this SCMI?


I think I confused myself here when I wrote the commit log.

The EL3 code in our case is ATF, and it does not appear to be SCMI, nor
PSCI. The register usage of these SMC SIP calls are as follows:
a0 - service
a1 - function
a2-a7 - args

In ATF the handler is declared as a runtime service.

Would the appropriate commmit message here be something along the lines
of below?
"""
When using Linux for dom0 there are a bunch of drivers that need to do   SMC
SIP calls into the firmware to enable certain hardware bits like the
watchdog.
"""


It reads better thanks.

[...]


But even if we restrict to dom0, have you checked that none of the SMCs
use buffers?

I haven't found any such instances in the Linux kernel where a buffer is
used. Adding a call filtering like suggested below additions of such
functions can be discovered and adapted for if they would show up later.


Rather than providing a blanket forward, to me it sounds more like you
want to provide an allowlist of the SMCs. This is more futureproof and
avoid the risk to expose unsafe SMCs to any domain.

For an example, you can have a look at the EEMI mediator for Xilinx.


Ack. Do you prefer to see only on SMCCC service level or also on
function level? (a1 register, per description earlier)


I am not sure. It will depend on whether it is correct to expose *all* 
the functions within a service level and they have the same format.


If you can't guarantee that, then you will most likely need to allowlist 
at the function level.


Also, do you have a spec in hand that would help to understand which 
service/function is implemented via those SMCs?


Cheers,

--
Julien Grall



Re: [PATCH v6 01/15] xen/common: add cache coloring common code

2024-02-01 Thread Jan Beulich
On 29.01.2024 18:17, Carlo Nonato wrote:
> --- /dev/null
> +++ b/docs/misc/cache-coloring.rst
> @@ -0,0 +1,87 @@
> +Xen cache coloring user guide
> +=
> +
> +The cache coloring support in Xen allows to reserve Last Level Cache (LLC)
> +partitions for Dom0, DomUs and Xen itself. Currently only ARM64 is supported.
> +
> +To compile LLC coloring support set ``CONFIG_LLC_COLORING=y``.
> +
> +If needed, change the maximum number of colors with
> +``CONFIG_NR_LLC_COLORS=``.
> +
> +Compile Xen and the toolstack and then configure it via
> +`Command line parameters`_.
> +
> +Background
> +**
> +
> +Cache hierarchy of a modern multi-core CPU typically has first levels 
> dedicated
> +to each core (hence using multiple cache units), while the last level is 
> shared
> +among all of them. Such configuration implies that memory operations on one
> +core (e.g. running a DomU) are able to generate interference on another core
> +(e.g .hosting another DomU). Cache coloring allows eliminating this
> +mutual interference, and thus guaranteeing higher and more predictable
> +performances for memory accesses.

Since you say "eliminating" - what about shared mid-level caches? What about
shared TLBs?

Jan



Re: [PATCH v6 01/15] xen/common: add cache coloring common code

2024-02-01 Thread Jan Beulich
On 31.01.2024 16:57, Jan Beulich wrote:
> On 29.01.2024 18:17, Carlo Nonato wrote:
>> +Command line parameters
>> +***
>> +
>> +More specific documentation is available at 
>> `docs/misc/xen-command-line.pandoc`.
>> +
>> ++--+---+
>> +| **Parameter**| **Description**   |
>> ++--+---+
>> +| ``llc-coloring`` | enable coloring at runtime|
>> ++--+---+
>> +| ``llc-way-size`` | set the LLC way size  |
>> ++--+---+
> 
> As a result of the above, I also find it confusing to specify "way size"
> as a command line option. Cache size, number of ways, and cache line size
> would seem more natural to me.

Or, alternatively, have the number of colors be specifiable directly.

Jan



Re: [PATCH v1 0/2] tools/ocaml: support OCaml 5.x, drop support for <=4.05

2024-02-01 Thread Edwin Torok



> On 31 Jan 2024, at 10:55, Andrew Cooper  wrote:
> 
> On 31/01/2024 10:44 am, Christian Lindig wrote:
>>> On 31 Jan 2024, at 10:42, Edwin Török  wrote:
>>> 
>>> Fix building oxenstored with OCaml 5.x.
>>> OCaml 5.x has removed some functions that have been deprecated for many 
>>> years,
>>> in order to support OCaml 5.x we need to drop support for OCaml 4.02.
>>> 
>>> Tested in gitlab CI (together with my other series):
>>> https://gitlab.com/xen-project/people/edwintorok/xen/-/pipelines/1158302827
>>> 
>>> Edwin Török (2):
>>> oxenstored: fix build on OCaml 5.x
>>> tools/ocaml: bump minimum version to OCaml 4.05
>>> 
>>> tools/configure   | 2 +-
>>> tools/configure.ac| 2 +-
>>> tools/ocaml/xenstored/disk.ml | 2 +-
>>> 3 files changed, 3 insertions(+), 3 deletions(-)
>>> 
>>> -- 
>>> 2.43.0
>>> 
>> Acked-by: Christian Lindig 
> 
> It occurs to me that this is the kind of thing which should get a
> CHANGELOG.md entry these days.  Something like:
> 
> diff --git a/CHANGELOG.md b/CHANGELOG.md
> index 1f55c9c72d10..fd7c8f5c6b82 100644
> --- a/CHANGELOG.md
> +++ b/CHANGELOG.md
> @@ -9,6 +9,7 @@ The format is based on [Keep a
> Changelog](https://keepachangelog.com/en/1.0.0/)
>  ### Changed
>   - Changed flexible array definitions in public I/O interface headers
> to not
> use "1" as the number of array elements.
> + - The minimum supported Ocaml toolchain version is now 4.05
>   - On x86:
> - HVM PIRQs are disabled by default.
> - Reduce IOMMU setup time for hardware domain.

Sounds good.

Should this be mentioned in 
https://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches#Sending_a_Patch_Series?

Best regards,
—Edwin

> 
> 
> ought to do.
> 
> Have we checked to see whether this drops Ocaml from any of the build
> containers ?


I can look into this later, haven’t tried rebuilding the containers (the gitlab 
CI passed though)

Best regards,
—Edwin

> 
> ~Andrew




Re: [PATCH v4 3/8] VMX: tertiary execution control infrastructure

2024-02-01 Thread Jan Beulich
On 01.02.2024 12:50, Roger Pau Monné wrote:
> On Thu, Jan 11, 2024 at 10:00:10AM +0100, Jan Beulich wrote:
>> @@ -503,6 +538,9 @@ static int vmx_init_vmcs_config(bool bsp
>>  "Secondary Exec Control",
>>  vmx_secondary_exec_control, _vmx_secondary_exec_control);
>>  mismatch |= cap_check(
>> +"Tertiary Exec Control",
>> +vmx_tertiary_exec_control, _vmx_tertiary_exec_control);
> 
> I know it's done to match the surrounding style, but couldn't you move
> the name parameter one line up, and then limit the call to two lines?
> 
> (I don't think it will compromise readability).

You mean like this:

mismatch |= cap_check("Tertiary Exec Control",
vmx_tertiary_exec_control, _vmx_tertiary_exec_control);

? No, I view this as a mix of two possible styles. If the string literal
was moved up, the other legitimate style would only be

mismatch |= cap_check("Tertiary Exec Control",
  vmx_tertiary_exec_control,
  _vmx_tertiary_exec_control);

aiui (again extending over 3 lines). Yet none of this is written down
anywhere.

But anyway - consistency with surrounding code trumps here, I think.

>> @@ -2068,10 +2111,12 @@ void vmcs_dump_vcpu(struct vcpu *v)
>> vmr(HOST_PERF_GLOBAL_CTRL));
>>  
>>  printk("*** Control State ***\n");
>> -printk("PinBased=%08x CPUBased=%08x SecondaryExec=%08x\n",
>> +printk("PinBased=%08x CPUBased=%08x\n",
>> vmr32(PIN_BASED_VM_EXEC_CONTROL),
>> -   vmr32(CPU_BASED_VM_EXEC_CONTROL),
>> -   vmr32(SECONDARY_VM_EXEC_CONTROL));
>> +   vmr32(CPU_BASED_VM_EXEC_CONTROL));
>> +printk("SecondaryExec=%08x TertiaryExec=%08lx\n",
> 
> For consistency, shouldn't TertiaryExec use 016 instead of 08 (as it's
> a 64bit filed).

Perhaps, assuming we'll gets bits 32 and populated sooner or later.
However, I view 16-digit literal numbers as hard to read, so I'd be
inclined to insert a separator (e.g. an underscore) between the low
and high halves. Thoughts?

>> @@ -260,6 +262,13 @@ extern u32 vmx_vmentry_control;
>>  #define SECONDARY_EXEC_NOTIFY_VM_EXITING0x8000U
>>  extern u32 vmx_secondary_exec_control;
>>  
>> +#define TERTIARY_EXEC_LOADIWKEY_EXITING BIT(0, UL)
>> +#define TERTIARY_EXEC_ENABLE_HLAT   BIT(1, UL)
>> +#define TERTIARY_EXEC_EPT_PAGING_WRITE  BIT(2, UL)
>> +#define TERTIARY_EXEC_GUEST_PAGING_VERIFY   BIT(3, UL)
>> +#define TERTIARY_EXEC_IPI_VIRT  BIT(4, UL)
> 
> While at it, my copy of the SDM also has:
> 
> #define TERTIARY_EXEC_VIRT_SPEC_CTRL   BIT(7, UL)

Ah yes, this must have appeared in the over 9 months that have
passed since I originally wrote this patch.

>> --- a/xen/arch/x86/include/asm/msr-index.h
>> +++ b/xen/arch/x86/include/asm/msr-index.h
>> @@ -347,6 +347,7 @@
>>  #define MSR_IA32_VMX_TRUE_EXIT_CTLS 0x48f
>>  #define MSR_IA32_VMX_TRUE_ENTRY_CTLS0x490
>>  #define MSR_IA32_VMX_VMFUNC 0x491
>> +#define MSR_IA32_VMX_PROCBASED_CTLS30x492
> 
> Shouldn't this be added above the "Legacy MSR constants in need of
> cleanup.  No new MSRs below this comment." line?

Now this is a question I'd like to forward to Andrew. Imo grouping the
new MSR with the other VMX ones is more important than respecting that
comment. But yes, I could of course add yet another patch to move the
entire block up first ...

>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> @@ -760,6 +760,12 @@ void vmx_update_secondary_exec_control(s
>>v->arch.hvm.vmx.secondary_exec_control);
>>  }
>>  
>> +void vmx_update_tertiary_exec_control(struct vcpu *v)
> 
> const vcpu?

Hmm, yes - overly blind copy-and-paste.

Jan



Re: [PATCH v4 3/8] VMX: tertiary execution control infrastructure

2024-02-01 Thread Roger Pau Monné
On Thu, Jan 11, 2024 at 10:00:10AM +0100, Jan Beulich wrote:
> This is a prereq to enabling the MSRLIST feature.
> 
> Note that the PROCBASED_CTLS3 MSR is different from other VMX feature
> reporting MSRs, in that all 64 bits report allowed 1-settings.
> 
> vVMX code is left alone, though, for the time being.
> 
> Signed-off-by: Jan Beulich 
> ---
> v2: New.
> 
> --- a/xen/arch/x86/hvm/vmx/vmcs.c
> +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> @@ -164,6 +164,7 @@ static int cf_check parse_ept_param_runt
>  u32 vmx_pin_based_exec_control __read_mostly;
>  u32 vmx_cpu_based_exec_control __read_mostly;
>  u32 vmx_secondary_exec_control __read_mostly;
> +uint64_t vmx_tertiary_exec_control __read_mostly;
>  u32 vmx_vmexit_control __read_mostly;
>  u32 vmx_vmentry_control __read_mostly;
>  u64 vmx_ept_vpid_cap __read_mostly;
> @@ -228,10 +229,32 @@ static u32 adjust_vmx_controls(
>  return ctl;
>  }
>  
> -static bool cap_check(const char *name, u32 expected, u32 saw)
> +static uint64_t adjust_vmx_controls2(
> +const char *name, uint64_t ctl_min, uint64_t ctl_opt, unsigned int msr,
> +bool *mismatch)
> +{
> +uint64_t vmx_msr, ctl = ctl_min | ctl_opt;
> +
> +rdmsrl(msr, vmx_msr);
> +
> +ctl &= vmx_msr; /* bit == 0 ==> must be zero */
> +
> +/* Ensure minimum (required) set of control bits are supported. */
> +if ( ctl_min & ~ctl )
> +{
> +*mismatch = true;
> +printk("VMX: CPU%u has insufficient %s (%#lx; requires %#lx)\n",
> +   smp_processor_id(), name, ctl, ctl_min);
> +}
> +
> +return ctl;
> +}
> +
> +static bool cap_check(
> +const char *name, unsigned long expected, unsigned long saw)
>  {
>  if ( saw != expected )
> -printk("VMX %s: saw %#x expected %#x\n", name, saw, expected);
> +printk("VMX %s: saw %#lx expected %#lx\n", name, saw, expected);
>  return saw != expected;
>  }
>  
> @@ -241,6 +264,7 @@ static int vmx_init_vmcs_config(bool bsp
>  u32 _vmx_pin_based_exec_control;
>  u32 _vmx_cpu_based_exec_control;
>  u32 _vmx_secondary_exec_control = 0;
> +uint64_t _vmx_tertiary_exec_control = 0;
>  u64 _vmx_ept_vpid_cap = 0;
>  u64 _vmx_misc_cap = 0;
>  u32 _vmx_vmexit_control;
> @@ -274,7 +298,8 @@ static int vmx_init_vmcs_config(bool bsp
>  opt = (CPU_BASED_ACTIVATE_MSR_BITMAP |
> CPU_BASED_TPR_SHADOW |
> CPU_BASED_MONITOR_TRAP_FLAG |
> -   CPU_BASED_ACTIVATE_SECONDARY_CONTROLS);
> +   CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |
> +   CPU_BASED_ACTIVATE_TERTIARY_CONTROLS);
>  _vmx_cpu_based_exec_control = adjust_vmx_controls(
>  "CPU-Based Exec Control", min, opt,
>  MSR_IA32_VMX_PROCBASED_CTLS, );
> @@ -338,6 +363,15 @@ static int vmx_init_vmcs_config(bool bsp
>  MSR_IA32_VMX_PROCBASED_CTLS2, );
>  }
>  
> +if ( _vmx_cpu_based_exec_control & CPU_BASED_ACTIVATE_TERTIARY_CONTROLS )
> +{
> +uint64_t opt = 0;
> +
> +_vmx_tertiary_exec_control = adjust_vmx_controls2(
> +"Tertiary Exec Control", 0, opt,
> +MSR_IA32_VMX_PROCBASED_CTLS3, );
> +}
> +
>  /* The IA32_VMX_EPT_VPID_CAP MSR exists only when EPT or VPID available 
> */
>  if ( _vmx_secondary_exec_control & (SECONDARY_EXEC_ENABLE_EPT |
>  SECONDARY_EXEC_ENABLE_VPID) )
> @@ -468,6 +502,7 @@ static int vmx_init_vmcs_config(bool bsp
>  vmx_pin_based_exec_control = _vmx_pin_based_exec_control;
>  vmx_cpu_based_exec_control = _vmx_cpu_based_exec_control;
>  vmx_secondary_exec_control = _vmx_secondary_exec_control;
> +vmx_tertiary_exec_control  = _vmx_tertiary_exec_control;
>  vmx_ept_vpid_cap   = _vmx_ept_vpid_cap;
>  vmx_vmexit_control = _vmx_vmexit_control;
>  vmx_vmentry_control= _vmx_vmentry_control;
> @@ -503,6 +538,9 @@ static int vmx_init_vmcs_config(bool bsp
>  "Secondary Exec Control",
>  vmx_secondary_exec_control, _vmx_secondary_exec_control);
>  mismatch |= cap_check(
> +"Tertiary Exec Control",
> +vmx_tertiary_exec_control, _vmx_tertiary_exec_control);

I know it's done to match the surrounding style, but couldn't you move
the name parameter one line up, and then limit the call to two lines?

(I don't think it will compromise readability).

> +mismatch |= cap_check(
>  "VMExit Control",
>  vmx_vmexit_control, _vmx_vmexit_control);
>  mismatch |= cap_check(
> @@ -1080,6 +1118,7 @@ static int construct_vmcs(struct vcpu *v
>  v->arch.hvm.vmx.exec_control |= CPU_BASED_RDTSC_EXITING;
>  
>  v->arch.hvm.vmx.secondary_exec_control = vmx_secondary_exec_control;
> +v->arch.hvm.vmx.tertiary_exec_control  = vmx_tertiary_exec_control;
>  
>  /*
>   * Disable features which we don't want active by default:
> @@ -1134,6 +1173,10 @@ static int construct_vmcs(struct 

Re: Issue iommu unrecognized on amd computer

2024-02-01 Thread Jan Beulich
On 01.02.2024 09:16, o...@proton.me wrote:
> Following our interaction on matrix, I send you the boot log `xl dmesg` 
> output of my computer, booting with `iommu=debug` options on linux 
> 6.6.13-200.fc39.

There's no "iommu=debug" anywhere in sight:

(XEN) Command line: placeholder no-real-mode edd=off

Did you mistakenly put it on the Linux command line?

Nevertheless there are relevant messages already without the extra
verbosity (and without you using a debug build of Xen, which would
be preferable):

(XEN) AMD-Vi: Warning: IVMD: [c9f1c000,c9f42000) is not (entirely) in reserved 
memory
(XEN) AMD-Vi: Error: IVMD: page at c9f1c000 can't be converted
(XEN) AMD-Vi: Error initialization

Quoting the respective entry from the memory map:

(XEN)  [c8f58000, c9f57fff] (ACPI NVS)

Both Roger and I look to agree that this condition we have

if ( !(type & (RAM_TYPE_RESERVED | RAM_TYPE_ACPI |
   RAM_TYPE_UNUSABLE)) )

is inverted in some way. We merely need to agree in which way it
wants adjusting.

Jan



Re: [PATCH] xen/bitmap: Consistently use unsigned bits values

2024-02-01 Thread Andrew Cooper
On 01/02/2024 10:45 am, Jan Beulich wrote:
> On 01.02.2024 11:33, Andrew Cooper wrote:
>> Right now, most of the static inline helpers take an unsigned nbits quantity,
>> and most of the library functions take a signed quanity.  Because
>> BITMAP_LAST_WORD_MASK() is expressed as a divide, the compiler is forced to
>> emit two different paths to get the correct semantics for signed division.
>>
>> Swap all signed bit-counts to being unsigned bit-counts for the simple cases.
>> This includes the return value of bitmap_weight().
>>
>> Bloat-o-meter for a random x86 build reports:
>>   add/remove: 0/0 grow/shrink: 8/19 up/down: 167/-413 (-246)
>>
>> which all comes from compiler not emitting "dead" logic paths for negative 
>> bit
>> counts.
>>
>> No functional change.
>>
>> Signed-off-by: Andrew Cooper 
> Reviewed-by: Jan Beulich 

Thanks.

> albeit with a question at the bottom.
>
>> There is much more wanting cleaning up here, but we have to start somewhere.
>> Some observations:
>>
>>  * Various of the boolean-like return values have -1 for zero-length bitmaps.
>>I can't spot any callers which care, so this seems like a waste.
>>  * bitmap_zero() and similar clear predate us switching to use
>>__builtin_memset(), because there's no need for bitmap_switch().
>>  * Should we consolidate 'bits' vs 'nbits'?
> This looks desirable to me.
>
>>  * The internals of these helpers want converting too.  Other helpers need
>>more than just a parameter conversion.
> This may or may not relate to my question; it's not exactly clear what you
> mean here.
>
>> --- a/xen/include/xen/bitmap.h
>> +++ b/xen/include/xen/bitmap.h
>> @@ -66,25 +66,25 @@
>>   * lib/bitmap.c provides these functions:
>>   */
>>  
>> -extern int __bitmap_empty(const unsigned long *bitmap, int bits);
>> -extern int __bitmap_full(const unsigned long *bitmap, int bits);
>> -extern int __bitmap_equal(const unsigned long *bitmap1,
>> -const unsigned long *bitmap2, int bits);
>> -extern void __bitmap_complement(unsigned long *dst, const unsigned long 
>> *src,
>> -int bits);
>> -extern void __bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
>> -const unsigned long *bitmap2, int bits);
>> -extern void __bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
>> -const unsigned long *bitmap2, int bits);
>> -extern void __bitmap_xor(unsigned long *dst, const unsigned long *bitmap1,
>> -const unsigned long *bitmap2, int bits);
>> -extern void __bitmap_andnot(unsigned long *dst, const unsigned long 
>> *bitmap1,
>> -const unsigned long *bitmap2, int bits);
>> -extern int __bitmap_intersects(const unsigned long *bitmap1,
>> -const unsigned long *bitmap2, int bits);
>> -extern int __bitmap_subset(const unsigned long *bitmap1,
>> -const unsigned long *bitmap2, int bits);
>> -extern int __bitmap_weight(const unsigned long *bitmap, int bits);
>> +int __bitmap_empty(const unsigned long *bitmap, unsigned int bits);
>> +int __bitmap_full(const unsigned long *bitmap, unsigned int bits);
>> +int __bitmap_equal(const unsigned long *bitmap1,
>> +   const unsigned long *bitmap2, unsigned int bits);
>> +void __bitmap_complement(unsigned long *dst, const unsigned long *src,
>> + unsigned int bits);
>> +void __bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
>> +  const unsigned long *bitmap2, unsigned int bits);
>> +void __bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
>> + const unsigned long *bitmap2, unsigned int bits);
>> +void __bitmap_xor(unsigned long *dst, const unsigned long *bitmap1,
>> +  const unsigned long *bitmap2, unsigned int bits);
>> +void __bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
>> + const unsigned long *bitmap2, unsigned int bits);
>> +int __bitmap_intersects(const unsigned long *bitmap1,
>> +const unsigned long *bitmap2, unsigned int bits);
>> +int __bitmap_subset(const unsigned long *bitmap1,
>> +const unsigned long *bitmap2, unsigned int bits);
>> +unsigned int __bitmap_weight(const unsigned long *bitmap, unsigned int 
>> bits);
>>  extern void __bitmap_set(unsigned long *map, unsigned int start, int len);
>>  extern void __bitmap_clear(unsigned long *map, unsigned int start, int len);
> What about these two, and the subsequent (in the .c file at least)
> bitmap_*_region()? That last remark above may mean you deliberately
> left them out for now (which is okay - as you say, this is merely a
> 1st step).

They want map->bitmap and int len -> unsigned int bits for consistency,
which is more than just a typechange.

I also wasn't totally certain of the correctness of the internal logic. 
I don't have time to investigate further in the immediate future, so
left it 

Upcoming Social Event - Location!

2024-02-01 Thread Kelly Choi
Hi all,

I'm pleased to announce the location of our next social event!
Join and chat with the Xen community, there will also be free food and
drinks.

*Date:* Wednesday 21st February 2024

*Time:* 5:30pm - 9pm (you are welcome to stay later if you wish)

*Location:*
The Portland Arms
129 Chesterton Rd Cambridge
www.theportlandarms.co.uk
T: 01223 357268

*Getting to the venue (more information here
):*
There are pay and display spaces (free after 5pm) on Milton Road to the
East of us or opposite on Chesterton Road. The nearest bus stops are:
Chesterton Road, opposite the pub outside Hing Hung restaurant, and on
Milton Road, near Westbrook centre, Citi 1, 9, X9 and The Busway. Cambridge
North rail station is 2.3 miles away and Cambridge Central station is 2.4
miles.

*To anticipate the number of attendees, please email me indicating you wish
to attend. *
*If you have any food allergies, let me know ahead of time. *

Many thanks,
Kelly Choi

Community Manager
Xen Project


-- Forwarded message -
From: Kelly Choi 
Date: Thu, Nov 23, 2023 at 1:01 PM
Subject: The Xen Project’s 20th Anniversary - Upcoming Social Event!
To: , ,



*The Xen Project’s 20th Anniversary*

*Let's get together for an informal social, likely to be pizza/drinks and
getting involved with the community. I hope to run these in future
locations to give everyone a chance to attend.*

*Date placeholder: Wednesday 21st February 2024*
*Location: Cambridge*
*Details TBC - **If you're interested, please reply to me directly and I
will add you to the list.*

*Celebrating Two Decades of Innovation*

It’s hard to believe that two decades have passed since the inception of
the Xen Project, a trailblazing force in the world of open-source
virtualization. As we raise our glasses to commemorate this momentous
occasion, it’s not just a celebration of time but a reflection on the
incredible journey that has defined the Xen Project’s legacy.

*A Legacy of Innovation*

In the year 2003, the Xen Project emerged as a pioneering open-source
hypervisor, laying the groundwork for some of the most influential cloud
infrastructures that shape our digital landscape today. Over the past 20
years, the Xen Project has not only endured but has thrived, continuously
evolving to meet the dynamic demands of the ever-changing tech landscape.

*Driving Technological Frontiers*

>From data center and server virtualization to cloud computing, desktop
virtualization, and fortifying desktop security and hardware appliances,
the Xen Project has been at the forefront of driving technological
innovation. With 20 years of relentless development, it has become
synonymous with reliability, scalability, and adaptability.

*Venturing into New Horizons*

As we celebrate this milestone, we also look forward to the exciting new
territories that the Xen Project is venturing into. From embedded
virtualization to even making strides in the automotive industry, the Xen
Project continues to push boundaries and redefine what’s possible in the
world of open-source virtualization.

*The Annual Event: Xen Project Developer and Design Summit*

At the heart of this remarkable journey is the Xen Project Developer and
Design Summit, an annual gathering of the community’s brilliant minds and
power users. More than just a conference, it’s a celebration of idea
exchange, a showcase of the latest advancements, a platform for sharing
invaluable experiences, and a forum for strategic planning and
collaborative efforts. Be sure to look out for our upcoming event in 2024.

*A Vibrant Community Defining the Future*

Beyond the code and technological achievements, the Xen Project’s strength
lies in its vibrant community. It’s a community that has come together to
celebrate successes, overcome challenges, and collectively shape the future
of open-source virtualization technology. Even to this day, community
contributions and reviews are still going!

*Looking Ahead*

As we commemorate 20 years of innovation, we also eagerly anticipate the
next chapter in the Xen Project’s journey. With gratitude for the past and
excitement for the future, we extend our deepest thanks to everyone who has
contributed to this incredible legacy.

Here’s to 20 years of pushing boundaries, fostering collaboration, and
shaping the digital landscape.

Happy anniversary, Xen Project! The best is yet to come and I can’t wait to
see what we all achieve.

Many thanks,
Kelly Choi

Open Source Community Manager
XenServer, Cloud Software Group


Re: [PATCH] xen/bitmap: Drop unused headers

2024-02-01 Thread Jan Beulich
On 01.02.2024 11:35, Andrew Cooper wrote:
> Nothing in bitmap.h uses lib.h, and there's no point including types.h when we
> need to include bitops.h anyway.
> 
> Signed-off-by: Andrew Cooper 

Acked-by: Jan Beulich 





Re: [PATCH] xen/bitmap: Consistently use unsigned bits values

2024-02-01 Thread Jan Beulich
On 01.02.2024 11:33, Andrew Cooper wrote:
> Right now, most of the static inline helpers take an unsigned nbits quantity,
> and most of the library functions take a signed quanity.  Because
> BITMAP_LAST_WORD_MASK() is expressed as a divide, the compiler is forced to
> emit two different paths to get the correct semantics for signed division.
> 
> Swap all signed bit-counts to being unsigned bit-counts for the simple cases.
> This includes the return value of bitmap_weight().
> 
> Bloat-o-meter for a random x86 build reports:
>   add/remove: 0/0 grow/shrink: 8/19 up/down: 167/-413 (-246)
> 
> which all comes from compiler not emitting "dead" logic paths for negative bit
> counts.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 
albeit with a question at the bottom.

> There is much more wanting cleaning up here, but we have to start somewhere.
> Some observations:
> 
>  * Various of the boolean-like return values have -1 for zero-length bitmaps.
>I can't spot any callers which care, so this seems like a waste.
>  * bitmap_zero() and similar clear predate us switching to use
>__builtin_memset(), because there's no need for bitmap_switch().
>  * Should we consolidate 'bits' vs 'nbits'?

This looks desirable to me.

>  * The internals of these helpers want converting too.  Other helpers need
>more than just a parameter conversion.

This may or may not relate to my question; it's not exactly clear what you
mean here.

> --- a/xen/include/xen/bitmap.h
> +++ b/xen/include/xen/bitmap.h
> @@ -66,25 +66,25 @@
>   * lib/bitmap.c provides these functions:
>   */
>  
> -extern int __bitmap_empty(const unsigned long *bitmap, int bits);
> -extern int __bitmap_full(const unsigned long *bitmap, int bits);
> -extern int __bitmap_equal(const unsigned long *bitmap1,
> - const unsigned long *bitmap2, int bits);
> -extern void __bitmap_complement(unsigned long *dst, const unsigned long *src,
> - int bits);
> -extern void __bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
> - const unsigned long *bitmap2, int bits);
> -extern void __bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
> - const unsigned long *bitmap2, int bits);
> -extern void __bitmap_xor(unsigned long *dst, const unsigned long *bitmap1,
> - const unsigned long *bitmap2, int bits);
> -extern void __bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
> - const unsigned long *bitmap2, int bits);
> -extern int __bitmap_intersects(const unsigned long *bitmap1,
> - const unsigned long *bitmap2, int bits);
> -extern int __bitmap_subset(const unsigned long *bitmap1,
> - const unsigned long *bitmap2, int bits);
> -extern int __bitmap_weight(const unsigned long *bitmap, int bits);
> +int __bitmap_empty(const unsigned long *bitmap, unsigned int bits);
> +int __bitmap_full(const unsigned long *bitmap, unsigned int bits);
> +int __bitmap_equal(const unsigned long *bitmap1,
> +   const unsigned long *bitmap2, unsigned int bits);
> +void __bitmap_complement(unsigned long *dst, const unsigned long *src,
> + unsigned int bits);
> +void __bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
> +  const unsigned long *bitmap2, unsigned int bits);
> +void __bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
> + const unsigned long *bitmap2, unsigned int bits);
> +void __bitmap_xor(unsigned long *dst, const unsigned long *bitmap1,
> +  const unsigned long *bitmap2, unsigned int bits);
> +void __bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
> + const unsigned long *bitmap2, unsigned int bits);
> +int __bitmap_intersects(const unsigned long *bitmap1,
> +const unsigned long *bitmap2, unsigned int bits);
> +int __bitmap_subset(const unsigned long *bitmap1,
> +const unsigned long *bitmap2, unsigned int bits);
> +unsigned int __bitmap_weight(const unsigned long *bitmap, unsigned int bits);
>  extern void __bitmap_set(unsigned long *map, unsigned int start, int len);
>  extern void __bitmap_clear(unsigned long *map, unsigned int start, int len);

What about these two, and the subsequent (in the .c file at least)
bitmap_*_region()? That last remark above may mean you deliberately
left them out for now (which is okay - as you say, this is merely a
1st step).

Jan



[PATCH] xen/bitmap: Drop unused headers

2024-02-01 Thread Andrew Cooper
Nothing in bitmap.h uses lib.h, and there's no point including types.h when we
need to include bitops.h anyway.

Signed-off-by: Andrew Cooper 
---
CC: George Dunlap 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Wei Liu 
CC: Julien Grall 
---
 xen/include/xen/bitmap.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/xen/include/xen/bitmap.h b/xen/include/xen/bitmap.h
index b9f980e91930..9f749e3913d8 100644
--- a/xen/include/xen/bitmap.h
+++ b/xen/include/xen/bitmap.h
@@ -3,8 +3,6 @@
 
 #ifndef __ASSEMBLY__
 
-#include 
-#include 
 #include 
 
 /*
-- 
2.30.2




[PATCH] xen/bitmap: Consistently use unsigned bits values

2024-02-01 Thread Andrew Cooper
Right now, most of the static inline helpers take an unsigned nbits quantity,
and most of the library functions take a signed quanity.  Because
BITMAP_LAST_WORD_MASK() is expressed as a divide, the compiler is forced to
emit two different paths to get the correct semantics for signed division.

Swap all signed bit-counts to being unsigned bit-counts for the simple cases.
This includes the return value of bitmap_weight().

Bloat-o-meter for a random x86 build reports:
  add/remove: 0/0 grow/shrink: 8/19 up/down: 167/-413 (-246)

which all comes from compiler not emitting "dead" logic paths for negative bit
counts.

No functional change.

Signed-off-by: Andrew Cooper 
---
CC: George Dunlap 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Wei Liu 
CC: Julien Grall 

Found when investigating negative-shift UBSAN, because each of the library
functions ended up being instrumented.

There is much more wanting cleaning up here, but we have to start somewhere.
Some observations:

 * Various of the boolean-like return values have -1 for zero-length bitmaps.
   I can't spot any callers which care, so this seems like a waste.
 * bitmap_zero() and similar clear predate us switching to use
   __builtin_memset(), because there's no need for bitmap_switch().
 * Should we consolidate 'bits' vs 'nbits'?
 * The internals of these helpers want converting too.  Other helpers need
   more than just a parameter conversion.
---
 xen/common/bitmap.c  | 24 +++---
 xen/include/xen/bitmap.h | 43 
 2 files changed, 34 insertions(+), 33 deletions(-)

diff --git a/xen/common/bitmap.c b/xen/common/bitmap.c
index 7d4551f78283..c57b35f0042c 100644
--- a/xen/common/bitmap.c
+++ b/xen/common/bitmap.c
@@ -55,7 +55,7 @@ static void clamp_last_byte(uint8_t *bp, unsigned int nbits)
bp[nbits/8] &= (1U << remainder) - 1;
 }
 
-int __bitmap_empty(const unsigned long *bitmap, int bits)
+int __bitmap_empty(const unsigned long *bitmap, unsigned int bits)
 {
int k, lim = bits/BITS_PER_LONG;
for (k = 0; k < lim; ++k)
@@ -70,7 +70,7 @@ int __bitmap_empty(const unsigned long *bitmap, int bits)
 }
 EXPORT_SYMBOL(__bitmap_empty);
 
-int __bitmap_full(const unsigned long *bitmap, int bits)
+int __bitmap_full(const unsigned long *bitmap, unsigned int bits)
 {
int k, lim = bits/BITS_PER_LONG;
for (k = 0; k < lim; ++k)
@@ -86,7 +86,7 @@ int __bitmap_full(const unsigned long *bitmap, int bits)
 EXPORT_SYMBOL(__bitmap_full);
 
 int __bitmap_equal(const unsigned long *bitmap1,
-   const unsigned long *bitmap2, int bits)
+   const unsigned long *bitmap2, unsigned int bits)
 {
int k, lim = bits/BITS_PER_LONG;
for (k = 0; k < lim; ++k)
@@ -101,7 +101,7 @@ int __bitmap_equal(const unsigned long *bitmap1,
 }
 EXPORT_SYMBOL(__bitmap_equal);
 
-void __bitmap_complement(unsigned long *dst, const unsigned long *src, int 
bits)
+void __bitmap_complement(unsigned long *dst, const unsigned long *src, 
unsigned int bits)
 {
int k, lim = bits/BITS_PER_LONG;
for (k = 0; k < lim; ++k)
@@ -113,7 +113,7 @@ void __bitmap_complement(unsigned long *dst, const unsigned 
long *src, int bits)
 EXPORT_SYMBOL(__bitmap_complement);
 
 void __bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
-   const unsigned long *bitmap2, int bits)
+  const unsigned long *bitmap2, unsigned int bits)
 {
int k;
int nr = BITS_TO_LONGS(bits);
@@ -124,7 +124,7 @@ void __bitmap_and(unsigned long *dst, const unsigned long 
*bitmap1,
 EXPORT_SYMBOL(__bitmap_and);
 
 void __bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
-   const unsigned long *bitmap2, int bits)
+ const unsigned long *bitmap2, unsigned int bits)
 {
int k;
int nr = BITS_TO_LONGS(bits);
@@ -135,7 +135,7 @@ void __bitmap_or(unsigned long *dst, const unsigned long 
*bitmap1,
 EXPORT_SYMBOL(__bitmap_or);
 
 void __bitmap_xor(unsigned long *dst, const unsigned long *bitmap1,
-   const unsigned long *bitmap2, int bits)
+  const unsigned long *bitmap2, unsigned int bits)
 {
int k;
int nr = BITS_TO_LONGS(bits);
@@ -146,7 +146,7 @@ void __bitmap_xor(unsigned long *dst, const unsigned long 
*bitmap1,
 EXPORT_SYMBOL(__bitmap_xor);
 
 void __bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
-   const unsigned long *bitmap2, int bits)
+ const unsigned long *bitmap2, unsigned int bits)
 {
int k;
int nr = BITS_TO_LONGS(bits);
@@ -157,7 +157,7 @@ void __bitmap_andnot(unsigned long *dst, const unsigned 
long *bitmap1,
 EXPORT_SYMBOL(__bitmap_andnot);
 
 int __bitmap_intersects(const unsigned long *bitmap1,
-   const unsigned long *bitmap2, int bits)
+const unsigned long 

Re: [PATCH v4 37/47] hw/net/lasi_i82596: Re-enable build

2024-02-01 Thread Thomas Huth

On 26/01/2024 18.25, David Woodhouse wrote:

From: David Woodhouse 

When converting to the shiny build-system-du-jour, a typo prevented the
last_i82596 driver from being built. Correct the config option name to
re-enable the build. And include "sysemu/sysemu.h" so it actually builds.

Fixes: b1419fa66558 ("meson: convert hw/net")
Signed-off-by: David Woodhouse 
---
  hw/net/lasi_i82596.c | 1 +
  hw/net/meson.build   | 2 +-
  2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/net/lasi_i82596.c b/hw/net/lasi_i82596.c
index 6a3147fe2d..09e830ba5f 100644
--- a/hw/net/lasi_i82596.c
+++ b/hw/net/lasi_i82596.c
@@ -14,6 +14,7 @@
  #include "qapi/error.h"
  #include "qemu/timer.h"
  #include "hw/sysbus.h"
+#include "sysemu/sysemu.h"
  #include "net/eth.h"
  #include "hw/net/lasi_82596.h"
  #include "hw/net/i82596.h"
diff --git a/hw/net/meson.build b/hw/net/meson.build
index 9afceb0619..2b426d3d5a 100644
--- a/hw/net/meson.build
+++ b/hw/net/meson.build
@@ -33,7 +33,7 @@ system_ss.add(when: 'CONFIG_MARVELL_88W8618', if_true: 
files('mv88w8618_eth.c'))
  system_ss.add(when: 'CONFIG_CADENCE', if_true: files('cadence_gem.c'))
  system_ss.add(when: 'CONFIG_STELLARIS_ENET', if_true: 
files('stellaris_enet.c'))
  system_ss.add(when: 'CONFIG_LANCE', if_true: files('lance.c'))
-system_ss.add(when: 'CONFIG_LASI_I82596', if_true: files('lasi_i82596.c'))
+system_ss.add(when: 'CONFIG_LASI_82596', if_true: files('lasi_i82596.c'))
  system_ss.add(when: 'CONFIG_I82596_COMMON', if_true: files('i82596.c'))
  system_ss.add(when: 'CONFIG_SUNHME', if_true: files('sunhme.c'))
  system_ss.add(when: 'CONFIG_FTGMAC100', if_true: files('ftgmac100.c'))


Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2144




Re: [RFC KERNEL PATCH v4 3/3] PCI/sysfs: Add gsi sysfs for pci_dev

2024-02-01 Thread Roger Pau Monné
On Wed, Jan 31, 2024 at 01:00:14PM -0600, Bjorn Helgaas wrote:
> On Wed, Jan 31, 2024 at 09:58:19AM +0100, Roger Pau Monné wrote:
> > On Tue, Jan 30, 2024 at 02:44:03PM -0600, Bjorn Helgaas wrote:
> > > On Tue, Jan 30, 2024 at 10:07:36AM +0100, Roger Pau Monné wrote:
> > > > On Mon, Jan 29, 2024 at 04:01:13PM -0600, Bjorn Helgaas wrote:
> > > > > On Thu, Jan 25, 2024 at 07:17:24AM +, Chen, Jiqian wrote:
> > > > > > On 2024/1/24 00:02, Bjorn Helgaas wrote:
> > > > > > > On Tue, Jan 23, 2024 at 10:13:52AM +, Chen, Jiqian wrote:
> > > > > > >> On 2024/1/23 07:37, Bjorn Helgaas wrote:
> > > > > > >>> On Fri, Jan 05, 2024 at 02:22:17PM +0800, Jiqian Chen wrote:
> > > > > >  There is a need for some scenarios to use gsi sysfs.
> > > > > >  For example, when xen passthrough a device to dumU, it will
> > > > > >  use gsi to map pirq, but currently userspace can't get gsi
> > > > > >  number.
> > > > > >  So, add gsi sysfs for that and for other potential scenarios.
> > > > > > >> ...
> > > > > > > 
> > > > > > >>> I don't know enough about Xen to know why it needs the GSI in
> > > > > > >>> userspace.  Is this passthrough brand new functionality that 
> > > > > > >>> can't be
> > > > > > >>> done today because we don't expose the GSI yet?
> > > > > 
> > > > > I assume this must be new functionality, i.e., this kind of
> > > > > passthrough does not work today, right?
> > > > > 
> > > > > > >> has ACPI support and is responsible for detecting and controlling
> > > > > > >> the hardware, also it performs privileged operations such as the
> > > > > > >> creation of normal (unprivileged) domains DomUs. When we give to 
> > > > > > >> a
> > > > > > >> DomU direct access to a device, we need also to route the 
> > > > > > >> physical
> > > > > > >> interrupts to the DomU. In order to do so Xen needs to setup and 
> > > > > > >> map
> > > > > > >> the interrupts appropriately.
> > > > > > > 
> > > > > > > What kernel interfaces are used for this setup and mapping?
> > > > > >
> > > > > > For passthrough devices, the setup and mapping of routing physical
> > > > > > interrupts to DomU are done on Xen hypervisor side, hypervisor only
> > > > > > need userspace to provide the GSI info, see Xen code:
> > > > > > xc_physdev_map_pirq require GSI and then will call hypercall to pass
> > > > > > GSI into hypervisor and then hypervisor will do the mapping and
> > > > > > routing, kernel doesn't do the setup and mapping.
> > > > > 
> > > > > So we have to expose the GSI to userspace not because userspace itself
> > > > > uses it, but so userspace can turn around and pass it back into the
> > > > > kernel?
> > > > 
> > > > No, the point is to pass it back to Xen, which doesn't know the
> > > > mapping between GSIs and PCI devices because it can't execute the ACPI
> > > > AML resource methods that provide such information.
> > > > 
> > > > The (Linux) kernel is just a proxy that forwards the hypercalls from
> > > > user-space tools into Xen.
> > > 
> > > But I guess Xen knows how to interpret a GSI even though it doesn't
> > > have access to AML?
> > 
> > On x86 Xen does know how to map a GSI into an IO-APIC pin, in order
> > configure the RTE as requested.
> 
> IIUC, mapping a GSI to an IO-APIC pin requires information from the
> MADT.  So I guess Xen does use the static ACPI tables, but not the AML
> _PRT methods that would connect a GSI with a PCI device?

Yes, Xen can parse the static tables, and knows the base GSI of
IO-APICs from the MADT.

> I guess this means Xen would not be able to deal with _MAT methods,
> which also contains MADT entries?  I don't know the implications of
> this -- maybe it means Xen might not be able to use with hot-added
> devices?

It's my understanding _MAT will only be present on some very specific
devices (IO-APIC or CPU objects).  Xen doesn't support hotplug of
IO-APICs, but hotplug of CPUs should in principle be supported with
cooperation from the control domain OS (albeit it's not something that
we tests on our CI).  I don't expect however that a CPU object _MAT
method will return IO APIC entries.

> The tables (including DSDT and SSDTS that contain the AML) are exposed
> to userspace via /sys/firmware/acpi/tables/, but of course that
> doesn't mean Xen knows how to interpret the AML, and even if it did,
> Xen probably wouldn't be able to *evaluate* it since that could
> conflict with the host kernel's use of AML.

Indeed, there can only be a single OSPM, and that's the dom0 OS (Linux
in our context).

Getting back to our context though, what would be a suitable place for
exposing the GSI assigned to each device?

Thanks, Roger.



Issue iommu unrecognized on amd computer

2024-02-01 Thread oxjo
Hi,

Following our interaction on matrix, I send you the boot log `xl dmesg` output 
of my computer, booting with `iommu=debug` options on linux 6.6.13-200.fc39.

My issue is everything works on qubes on my thinkpad L15 gen 4 with AMD Ryzen 7 
Pro 7730U, except iommu/amd-vi.
I can boot and use a vm but no pci passthrough.

I tried qubes 4.2, bare xen 17.2, xen 18 and I always have the same messages 
you can see in the logs with IVMD.

I'd be grateful if you can tell me within a week if it's an issue you think can 
be dealt with or no so I can decide what I do with this computer.

Best regards,
Ox Xen 4.18.0
(XEN) Xen version 4.18.0 (mockbuild@) (gcc (GCC) 13.2.1 20231205 (Red Hat 
13.2.1-6)) debug=n Wed Dec 13 17:50:42 UTC 2023
(XEN) Latest ChangeSet: 
(XEN) build-id: ea5d1960a8d06f31b000355f4662169fe12d09a6
(XEN) Bootloader: GRUB 2.06
(XEN) Command line: placeholder no-real-mode edd=off
(XEN) Xen image load base address: 0xc1a0
(XEN) Video information:
(XEN)  VGA is graphics mode 1920x1080, 32 bpp
(XEN)  VBE/DDC methods: none; EDID transfer time: 0 seconds
(XEN) Disc information:
(XEN)  Found 0 MBR signatures
(XEN)  Found 1 EDD information structures
(XEN) CPU Vendor: AMD, Family 25 (0x19), Model 80 (0x50), Stepping 0 (raw 
00a50f00)
(XEN) Enabling Supervisor Shadow Stacks
(XEN)   - Disabling PV32 due to CET
(XEN) EFI RAM map:
(XEN)  [, 0009efff] (usable)
(XEN)  [0009f000, 0009] (reserved)
(XEN)  [0010, 09bf] (usable)
(XEN)  [09c0, 09db0fff] (reserved)
(XEN)  [09db1000, 09ef] (usable)
(XEN)  [09f0, 09f0efff] (ACPI NVS)
(XEN)  [09f0f000, c2d57fff] (usable)
(XEN)  [c2d58000, c8f57fff] (reserved)
(XEN)  [c8f58000, c9f57fff] (ACPI NVS)
(XEN)  [c9f58000, c9fd7fff] (ACPI data)
(XEN)  [c9fd8000, c9fd9fff] (usable)
(XEN)  [c9fda000, c9ff] (reserved)
(XEN)  [ca00, cbff] (usable)
(XEN)  [cc00, cdff] (reserved)
(XEN)  [cf00, cfff] (reserved)
(XEN)  [f800, fbff] (reserved)
(XEN)  [fdc0, fdcf] (reserved)
(XEN)  [fed8, fed80fff] (reserved)
(XEN)  [0001, 000fee2f] (usable)
(XEN)  [000fee30, 00102fff] (reserved)
(XEN) ACPI: RSDP C9FD7014, 0024 (r2 LENOVO)
(XEN) ACPI: XSDT C9FD5188, 0104 (r1 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: FACP C3964000, 0114 (r6 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: DSDT C394C000, 12791 (r1 LENOVO TP-R25   1130 INTL 20180313)
(XEN) ACPI: FACS C9DF8000, 0040
(XEN) ACPI: SSDT C5E74000, 00A2 (r1 LENOVO PID0Ssdt1 INTL 20180313)
(XEN) ACPI: SSDT C5E73000, 0752 (r1 LENOVO UsbCTabl1 INTL 20180313)
(XEN) ACPI: SSDT C5E66000, 7345 (r2 LENOVO TP-R25  2 MSFT  202)
(XEN) ACPI: MSDM C5B7B000, 0055 (r3 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: BATB C5B66000, 004A (r2 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: HPET C3963000, 0038 (r1 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: APIC C3962000, 0138 (r2 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: MCFG C3961000, 003C (r1 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: SBST C396, 0030 (r1 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: WSMT C395F000, 0028 (r1 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: VFCT C393E000, D884 (r1 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: SSDT C3938000, 5354 (r2 LENOVO TP-R25  1 AMD 1)
(XEN) ACPI: CRAT C3937000, 0EC0 (r1 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: CDIT C3936000, 0029 (r1 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: FPDT C5B67000, 0034 (r1 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: SSDT C3935000, 0149 (r1 LENOVO TP-R25  1 INTL 20180313)
(XEN) ACPI: SSDT C3933000, 14C3 (r1 LENOVO TP-R25  1 INTL 20180313)
(XEN) ACPI: SSDT C3931000, 15A8 (r1 LENOVO TP-R25  1 INTL 20180313)
(XEN) ACPI: SSDT C392D000, 3B0E (r1 LENOVO TP-R25  1 INTL 20180313)
(XEN) ACPI: BGRT C392C000, 0038 (r1 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: SSDT C392B000, 024D (r1 LENOVO TP-R25  1 INTL 20180313)
(XEN) ACPI: SSDT C3929000, 14C4 (r1 LENOVO TP-R25  1 INTL 20180313)
(XEN) ACPI: SSDT C3928000, 0AB7 (r1 LENOVO TP-R25  1 INTL 20180313)
(XEN) ACPI: UEFI C9DF7000, 00C6 (r1 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: IVRS C3927000, 01E4 (r2 LENOVO TP-R25   1130 PTEC2)
(XEN) ACPI: SSDT C5E72000, 0090 (r1 LENOVO TP-R25  1 INTL 20180313)
(XEN) ACPI: SSDT C5E71000, 098D (r1 LENOVO TP-R25  1 INTL 20180313)
(XEN) System RAM: 64302MB (65845476kB)
(XEN) No NUMA configuration found
(XEN) Faking a node at -000fee30
(XEN) Domain heap initialised
(XEN) vesafb: framebuffer at 

Re: [PATCH] shim: avoid building of vendor IOMMU code

2024-02-01 Thread Roger Pau Monné
On Thu, Feb 01, 2024 at 08:39:31AM +0100, Jan Beulich wrote:
> There's no use for IOMMU code in the shim. Disable at least the vendor-
> specific code, until eventually IOMMU code can be disabled altogether.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Roger Pau Monné 

That was easier than I expected :).

Thanks, Roger.



Re: [PATCH v12 10/15] vpci/header: emulate PCI_COMMAND register for guests

2024-02-01 Thread Jan Beulich
On 01.02.2024 05:50, Stewart Hildebrand wrote:
> On 1/25/24 10:43, Jan Beulich wrote:
>> On 09.01.2024 22:51, Stewart Hildebrand wrote:
>>> --- a/xen/drivers/vpci/header.c
>>> +++ b/xen/drivers/vpci/header.c
>>> @@ -168,6 +168,9 @@ static void modify_decoding(const struct pci_dev *pdev, 
>>> uint16_t cmd,
>>>  if ( !rom_only )
>>>  {
>>>  pci_conf_write16(pdev->sbdf, PCI_COMMAND, cmd);
>>> +/* Show DomU that we updated P2M */
>>> +header->guest_cmd &= ~PCI_COMMAND_MEMORY;
>>> +header->guest_cmd |= cmd & PCI_COMMAND_MEMORY;
>>>  header->bars_mapped = map;
>>>  }
>>
>> I don't follow what the comment means to say. The bit in question has no
>> real connection to the P2M, and the guest also may have no notion of the
>> underlying hypervisor's internals. Likely connected to ...
> 
> Indeed. If the comment survives to v13, I'll update it to:
> 
> /* Now that we updated P2M, show DomU change to PCI_COMMAND_MEMORY */
> 
>>
>>> @@ -524,9 +527,26 @@ static void cf_check cmd_write(
>>>  {
>>>  struct vpci_header *header = data;
>>>  
>>> +if ( !is_hardware_domain(pdev->domain) )
>>> +{
>>> +const struct vpci *vpci = pdev->vpci;
>>> +
>>> +if ( (vpci->msi && vpci->msi->enabled) ||
>>> + (vpci->msix && vpci->msix->enabled) )
>>> +cmd |= PCI_COMMAND_INTX_DISABLE;
>>> +
>>> +/*
>>> + * Do not show change to PCI_COMMAND_MEMORY bit until we finish
>>> + * modifying P2M mappings.
>>> + */
>>> +header->guest_cmd = (cmd & ~PCI_COMMAND_MEMORY) |
>>> +(header->guest_cmd & PCI_COMMAND_MEMORY);
>>> +}
>>
>> ... the comment here, but then shouldn't it be that the guest can't even
>> issue a 2nd cfg space access until the present write has been carried out?
>> Otherwise I'd be inclined to claim that such a partial update is unlikely
>> to be spec-conformant.
> 
> Due to the raise_softirq() call added in
> 
>   3e568fa9e19c ("vpci: fix deferral of long operations")
> 
> my current understanding is: when the guest toggles memory decoding, the 
> guest vcpu doesn't resume execution until vpci_process_pending() and 
> modify_decoding() have finished. So I think the guest should see a consistent 
> state of the register, unless it was trying to read from a different vcpu 
> than the one doing the writing.
> 
> Regardless, if the guest did have an opportunity to successfully read the 
> partially updated state of the register, I'm not really spotting what part of 
> the spec that would be a violation of. PCIe 6.1 has this description 
> regarding the bit: "When this bit is Set" and "When this bit is Clear" the 
> device will decode (or not) memory accesses. The spec doesn't seem to 
> distinguish whether the host or the device itself is the one to set/clear the 
> bit. One might even try to argue the opposite: allowing the bit to be toggled 
> before the device reflects the change would be a violation of spec. Since the 
> spec is ambiguous in this regard, I don't think either argument is 
> particularly strong.
> 
> Chesterton's fence: the logic for deferring the update of PCI_COMMAND_MEMORY 
> in guest_cmd was added between v10 and v11 of this series. I went back to 
> look at the review comments on v10 [1], but the rationale is still not 
> entirely clear to me.

Indeed. The only sentence possibly hinting in such a direction would imo
have been "I'm kind of unsure whether we want to fake the guest view by
returning what the guest writes." It's unclear to me whether it really
was meant that way.

> At the end of the day, with the information I have at hand, I suspect it 
> would be fine either way (whether updating guest_cmd is deferred or not). If 
> no other info comes to light, I'm leaning toward not deferring because it 
> would be simpler to update the bit right away in cmd_write().

I'm not sure it would be fine either way. Config space writes aren't
posted writes, so they complete synchronously. IOW whatever internal
state updates are needed in the device, they ought to have finished by
the time the write completes.

> [1] https://lore.kernel.org/xen-devel/ZVy73iJ3E8nJHvgf@macbook.local/
> 
>>[...]
>>> --- a/xen/drivers/vpci/msix.c
>>> +++ b/xen/drivers/vpci/msix.c
>>> @@ -135,6 +135,13 @@ static void cf_check control_write(
>>>  }
>>>  }
>>>  
>>> +/* Make sure domU doesn't enable INTx while enabling MSI-X. */
>>> +if ( new_enabled && !msix->enabled && 
>>> !is_hardware_domain(pdev->domain) )
>>> +{
>>> +pci_intx(pdev, false);
>>> +pdev->vpci->header.guest_cmd |= PCI_COMMAND_INTX_DISABLE;
>>> +}
>>
>> ... the similar code here has it.
>>
>> In both cases, is it really appropriate to set the bit in guest view?
> 
> I added this based on Roger's comment at [2]. Roger, what do you think? I 
> don't believe QEMU updates the guest view in this manner.
> 
> [2] 
>