Re: Trying add smt=off disabled cores to cpupool crash xen

2023-12-05 Thread Jan Beulich
On 05.12.2023 18:24, René Winther Højgaard wrote:
> You are right about sched-gran=core being the issue.
> 
> I don't know if this is relevant, but my CPU shouldn't be able to use 
> sched-gran=core it's asymmetric.
> 
> smt=on with sched-gran=core gives me a warning that it's falling back to 
> sched-gran=cpu, I tested smt=off with sched-gran=cpu and it works.
> 
> This warning is missing with sched-gran=core and smt=off 
> 
> (XEN) ***
> (XEN) Asymmetric cpu configuration.
> (XEN) Falling back to sched-gran=cpu.
> (XEN) ***

And (presumably) rightly so, because at the core level (presumably) there is
no asymmetry.

Jan



RE: [PATCH v13 26/35] x86/fred: FRED entry/exit and dispatch code

2023-12-05 Thread Li, Xin3
> > diff --git a/arch/x86/entry/entry_fred.c b/arch/x86/entry/entry_fred.c
> > new file mode 100644 index ..215883e90f94
> > --- /dev/null
> > +++ b/arch/x86/entry/entry_fred.c
> > @@ -0,0 +1,230 @@
> > ...
> > +static noinstr void fred_intx(struct pt_regs *regs) {
> > +   switch (regs->fred_ss.vector) {
> > +   /* INT0 */
> 
> INTO (for overflow), not INT-zero.  However...

My bad again...

> > +   case X86_TRAP_OF:
> > +   exc_overflow(regs);
> > +   return;
> > +
> > +   /* INT3 */
> > +   case X86_TRAP_BP:
> > +   exc_int3(regs);
> > +   return;
> 
> ... neither OF nor BP will ever enter fred_intx() because they're type SWEXC 
> not
> SWINT.

Per FRED spec 5.0, section 7.3 Software Interrupts and Related Instructions:
INT n (opcode CD followed by an immediate byte): There are 256 such
software interrupt instructions, one for each value n of the immediate
byte (0–255).

And appendix B Event Stack Levels:
If the event is an execution of INT n (opcode CD n for 8-bit value n),
the event stack level is 0. The event type is 4 (software interrupt)
and the vector is n.

So int $0x4 and int $0x3 (use asm(".byte 0xCD, 0x03")) get here.

But into (0xCE) and int3 (0xCC) do use event type SWEXC. 

BTW, into is NOT allowed in 64-bit mode but "int $0x4" is allowed.

> 
> SWINT is strictly the INT $imm8 instruction.
> 
> > ...
> > +static noinstr void fred_extint(struct pt_regs *regs) {
> > +   unsigned int vector = regs->fred_ss.vector;
> > +
> > +   if (WARN_ON_ONCE(vector < FIRST_EXTERNAL_VECTOR))
> > +   return;
> > +
> > +   if (likely(vector >= FIRST_SYSTEM_VECTOR)) {
> > +   irqentry_state_t state = irqentry_enter(regs);
> > +
> > +   instrumentation_begin();
> > +   sysvec_table[vector - FIRST_SYSTEM_VECTOR](regs);
> 
> array_index_mask_nospec()
> 
> This is easy for an attacker to abuse, to install non-function-pointer 
> targets into
> the indirect predictor.

HPA did use array_index_nospec() at the beginning, but I forgot it later.

> 
> > +   instrumentation_end();
> > +   irqentry_exit(regs, state);
> > +   } else {
> > +   common_interrupt(regs, vector);
> > +   }
> > +}
> > +
> > +static noinstr void fred_exception(struct pt_regs *regs, unsigned
> > +long error_code) {
> > +   /* Optimize for #PF. That's the only exception which matters performance
> wise */
> > +   if (likely(regs->fred_ss.vector == X86_TRAP_PF)) {
> > +   exc_page_fault(regs, error_code);
> > +   return;
> > +   }
> > +
> > +   switch (regs->fred_ss.vector) {
> > +   case X86_TRAP_DE: return exc_divide_error(regs);
> > +   case X86_TRAP_DB: return fred_exc_debug(regs);
> > +   case X86_TRAP_BP: return exc_int3(regs);
> > +   case X86_TRAP_OF: return exc_overflow(regs);
> 
> Depending on what you want to do with BP/OF vs fred_intx(), this may need
> adjusting.
> 
> If you are cross-checking type and vector, then these should be rejected for 
> not
> being of type HWEXC.

You're right, the event type needs to be SWEXC for into and int3.

However, would it be overkilling?  Assuming hardware and VMM are sane.

> 
> > +   case X86_TRAP_BR: return exc_bounds(regs);
> > +   case X86_TRAP_UD: return exc_invalid_op(regs);
> > +   case X86_TRAP_NM: return exc_device_not_available(regs);
> > +   case X86_TRAP_DF: return exc_double_fault(regs, error_code);
> > +   case X86_TRAP_TS: return exc_invalid_tss(regs, error_code);
> > +   case X86_TRAP_NP: return exc_segment_not_present(regs, error_code);
> > +   case X86_TRAP_SS: return exc_stack_segment(regs, error_code);
> > +   case X86_TRAP_GP: return exc_general_protection(regs, error_code);
> > +   case X86_TRAP_MF: return exc_coprocessor_error(regs);
> > +   case X86_TRAP_AC: return exc_alignment_check(regs, error_code);
> > +   case X86_TRAP_XF: return exc_simd_coprocessor_error(regs);
> > +
> > +#ifdef CONFIG_X86_MCE
> > +   case X86_TRAP_MC: return fred_exc_machine_check(regs); #endif #ifdef
> > +CONFIG_INTEL_TDX_GUEST
> > +   case X86_TRAP_VE: return exc_virtualization_exception(regs);
> > +#endif
> > +#ifdef CONFIG_X86_KERNEL_IBT
> 
> CONFIG_X86_CET
> 
> Userspace can use CET even if the kernel isn't compiled with IBT, so this
> exception needs handling.

Absolutely correct!

> 
> > +   case X86_TRAP_CP: return exc_control_protection(regs, error_code);
> > +#endif
> > +   default: return fred_bad_type(regs, error_code);
> > +   }
> > +}
> > +
> > +__visible noinstr void fred_entry_from_user(struct pt_regs *regs) {
> > +   unsigned long error_code = regs->orig_ax;
> > +
> > +   /* Invalidate orig_ax so that syscall_get_nr() works correctly */
> > +   regs->orig_ax = -1;
> > +
> > +   switch (regs->fred_ss.type) {
> > +   case EVENT_TYPE_EXTINT:
> > +   return fred_extint(regs);
> > +   case EVENT_TYPE_NMI:
> > +   return fred_exc_nmi(regs);
> > +   case EVENT_TYPE_SWINT:
> > +   return fred_intx(regs);
> > +   case EVENT_TYPE_HWEXC:
> > +   case 

Re: [PATCH v3 2/6] x86/HVM: split restore state checking from state loading

2023-12-05 Thread Jan Beulich
On 05.12.2023 16:55, Roger Pau Monné wrote:
> On Tue, Dec 05, 2023 at 03:59:13PM +0100, Jan Beulich wrote:
>> On 05.12.2023 15:29, Roger Pau Monné wrote:
>>> On Tue, Dec 05, 2023 at 09:52:31AM +0100, Jan Beulich wrote:
 On 04.12.2023 18:27, Roger Pau Monné wrote:
> On Tue, Nov 28, 2023 at 11:34:04AM +0100, Jan Beulich wrote:
>> ..., at least as reasonably feasible without making a check hook
>> mandatory (in particular strict vs relaxed/zero-extend length checking
>> can't be done early this way).
>>
>> Note that only one of the two uses of hvm_load() is accompanied with
>> hvm_check(). The other directly consumes hvm_save() output, which ought
>> to be well-formed. This means that while input data related checks don't
>> need repeating in the "load" function when already done by the "check"
>> one (albeit assertions to this effect may be desirable), domain state
>> related checks (e.g. has_xyz(d)) will be required in both places.
>>
>> Suggested-by: Roger Pau Monné 
>> Signed-off-by: Jan Beulich 
>> ---
>> Do we really need all the copying involved in use of _hvm_read_entry()
>> (backing hvm_load_entry()? Zero-extending loads are likely easier to
>> handle that way, but for strict loads all we gain is a reduced risk of
>> unaligned accesses (compared to simply pointing into h->data[]).
>
> See below, but I wonder whether the checks could be performed as part
> of hvm_load() without having to introduce a separate handler and loop
> over the context entries.

 Specifically not. State loading (in the longer run) would better not fail
 once started. (Imo it should have been this way from the beginning.) Only
 then will the vCPU still be in a predictable state even after a possible
 error.
>>>
>>> Looking at the callers, does such predictable state after failure
>>> matter?
>>>
>>> One caller is an hypercall used by the toolstack at domain create,
>>> failing can just lead to the domain being destroyed.  The other caller
>>> is vm fork, which will also lead to the fork being destroyed if
>>> context loading fails.
>>>
>>> Maybe I'm overlooking something.
>>
>> You don't (I think), but existing callers necessarily have to behave the
>> way you describe. From an abstract perspective, though, failed state
>> loading would better allow a retry. And really I thought that when you
>> suggested to split checking from loading, you had exactly that in mind.
> 
> Not really TBH, because I didn't think that much on a possible
> implementation when proposing it.

But what else did you think of then in terms of separating checking from
loading?

> Maybe a suitable compromise would be to reset the state to the initial
> (at domain build) one on failure?

That's an option, sure.

> I do dislike the duplicated loops, as it seems like a lot of duplicate
> boilerplate code, and I have fears of it going out of sync.

There's a certain risk, yes, but that exists similarly with the save and
load sides imo.

Andrew - before I go and undo the v2 changes, what is your view?

Jan



[ovmf test] 184004: all pass - PUSHED

2023-12-05 Thread osstest service owner
flight 184004 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/184004/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 7e18c9a788e543ab71cdc0485989cf5d00cdccc2
baseline version:
 ovmf ef3fde64aa78598a4c21556629936fb228390e8c

Last test of basis   183999  2023-12-05 20:44:36 Z0 days
Testing same since   184004  2023-12-06 05:41:03 Z0 days1 attempts


People who touched revisions under test:
  Zhiguang Liu 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   ef3fde64aa..7e18c9a788  7e18c9a788e543ab71cdc0485989cf5d00cdccc2 -> 
xen-tested-master



Re: [PATCH v3 5/6] x86/vPIC: vpic_elcr_mask() master bit 2 control

2023-12-05 Thread Jan Beulich
On 05.12.2023 18:29, Roger Pau Monné wrote:
> On Tue, Nov 28, 2023 at 11:35:46AM +0100, Jan Beulich wrote:
>> @@ -387,7 +387,7 @@ static int cf_check vpic_intercept_elcr_
>>  if ( dir == IOREQ_WRITE )
>>  {
>>  /* Some IRs are always edge trig. Slave IR is always level 
>> trig. */
>> -data = (*val >> shift) & vpic_elcr_mask(vpic);
>> +data = (*val >> shift) & vpic_elcr_mask(vpic, 1);
> 
> Not that it matters much, but I think you could use
> vpic_elcr_mask(vpic, 0) to strictly keep the same behavior as
> before?

Indeed, as also said in the description. Personally I view it as (slightly)
more logical to not mask off ...

>>  if ( vpic->is_master )
>>  data |= 1 << 2;
> 
> Since the bit is forcefully set here anyway.

... and then set the bit, hence why I chose to go with 1.

> Regardless:
> 
> Reviewed-by: Roger Pau Monné 

Thanks.

Jan



Re: [PATCH -next RFC 02/14] xen/blkback: use bdev api in xen_update_blkif_status()

2023-12-05 Thread Christoph Hellwig
On Wed, Dec 06, 2023 at 02:56:05PM +0800, Yu Kuai wrote:
> > > - invalidate_inode_pages2(
> > > - blkif->vbd.bdev_handle->bdev->bd_inode->i_mapping);
> > > + invalidate_bdev(blkif->vbd.bdev_handle->bdev);
> > 
> > blkbak is a bdev exported.   I don't think it should ever call
> > invalidate_inode_pages2, through a wrapper or not.
> 
> I'm not sure about this. I'm not familiar with xen/blkback, but I saw
> that xen-blkback will open a bdev from xen_vbd_create(), hence this
> looks like a dm/md for me, hence it sounds reasonable to sync +
> invalidate the opened bdev while initialization. Please kindly correct
> me if I'm wrong.

I guess we have enough precedence for this, so the switchover here
isn't wrong.  But all this invalidating of the bdev cache seems to
be asking for trouble.




Re: [PATCH -next RFC 01/14] block: add some bdev apis

2023-12-05 Thread Christoph Hellwig
On Wed, Dec 06, 2023 at 02:50:56PM +0800, Yu Kuai wrote:
> I'm a litter confused, so there are 3 use cases:
> 1) use GFP_USER, default gfp from bdev_alloc.
> 2) use GFP_KERNEL
> 3) use GFP_NOFS
> 
> I understand that you're suggesting memalloc_nofs_save() to distinguish
> 2 and 3, but how can I distinguish 1?

You shouldn't.  Diverging from the default flags except for clearing
the FS or IO flags is simply a bug.  Note that things like block2mtd
should probably also ensure a noio allocation if they aren't doing that
yet.

> >   - use memalloc_nofs_save in extet instead of using
> > mapping_gfp_constraint to clear it from the mapping flags
> >   - remove __ext4_sb_bread_gfp and just have buffer.c helper that does
> > the right thing (either by changing the calling conventions of an
> > existing one, or adding a new one).
> 
> Thanks for the suggestions, but I'm not sure how to do this yet, I must
> read more ext4 code.

the nofs save part should be trivial.  You can just skip the rest for
now as it's not needed for this patch series.




Re: [BUG]i2c_hid_acpi broken with 4.17.2 on Framework Laptop 13 AMD

2023-12-05 Thread Jan Beulich
On 05.12.2023 21:31, Sébastien Chaumat wrote:
>> [2.464598] amd_gpio AMDI0030:00: failed to enable wake-up interrupt
> 
> Is it expected that IRQ7 goes from fasteoi (kernel 6.6.4 ) to
> ioapic-edge and IRQ9 to ioapic-level ?
> 
> IR-IO-APIC7-fasteoi   pinctrl_amd
> IR-IO-APIC9-fasteoi   acpi
> 
> to (xen 4.18.0)
> 
> xen-pirq -ioapic-edge  pinctrl_amd
> xen-pirq -ioapic-level  acpi
> 
> ?

Aren't you comparing different things here? IR-* pretty clearly is on a
native kernel (no Xen). It being "edge" looks suspicious, though.

But again - I'm not primarily a kernel person, and hence you continuing
to send replies To: me feels at least odd.

Jan



[PATCH v2 2/3] xen: make include/xen/unaligned.h usable on all architectures

2023-12-05 Thread Juergen Gross
Instead of defining get_unaligned() and put_unaligned() in a way that
is only supporting architectures allowing unaligned accesses, use the
same approach as the Linux kernel and let the compiler do the
decision how to generate the code for probably unaligned data accesses.

Update include/xen/unaligned.h from include/asm-generic/unaligned.h of
the Linux kernel.

The generated code has been checked to be the same on x86.

Modify the Linux variant to not use underscore prefixed identifiers,
avoid unneeded parentheses and drop the 24-bit accessors.

Signed-off-by: Arnd Bergmann 
Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
803f4e1eab7a
Signed-off-by: Juergen Gross 
---
V2:
- drop 24-bit accessors (Jan Beulich)
- avoid underscore prefixed identifiers (Jan Beulich)
- drop unneeded parentheses (Jan Beulich)
---
 xen/include/xen/unaligned.h | 77 -
 1 file changed, 42 insertions(+), 35 deletions(-)

diff --git a/xen/include/xen/unaligned.h b/xen/include/xen/unaligned.h
index 0a2b16d05d..0ceb06a2bb 100644
--- a/xen/include/xen/unaligned.h
+++ b/xen/include/xen/unaligned.h
@@ -1,12 +1,4 @@
-/*
- * This header can be used by architectures where unaligned accesses work
- * without faulting, and at least reasonably efficiently.  Other architectures
- * will need to have a custom asm/unaligned.h.
- */
-#ifndef __ASM_UNALIGNED_H__
-#error "xen/unaligned.h should not be included directly - include 
asm/unaligned.h instead"
-#endif
-
+/* SPDX-License-Identifier: GPL-2.0 */
 #ifndef __XEN_UNALIGNED_H__
 #define __XEN_UNALIGNED_H__
 
@@ -15,67 +7,82 @@
 #include 
 #endif
 
-#define get_unaligned(p) (*(p))
-#define put_unaligned(val, p) (*(p) = (val))
+/*
+ * This is the most generic implementation of unaligned accesses
+ * and should work almost anywhere.
+ */
+
+#define get_unaligned_t_(type, ptr) ({ \
+   const struct { type x; } __packed *ptr_ = (typeof(ptr_))(ptr);  \
+   ptr_->x;\
+})
+
+#define put_unaligned_t_(type, val, ptr) do {  \
+   struct { type x; } __packed *ptr_ = (typeof(ptr_))(ptr);\
+   ptr_->x = val;  \
+} while (0)
+
+#define get_unaligned(ptr) get_unaligned_t_(typeof(*(ptr)), ptr)
+#define put_unaligned(val, ptr) put_unaligned_t_(typeof(*(ptr)), val, ptr)
 
-static inline uint16_t get_unaligned_be16(const void *p)
+static inline u16 get_unaligned_le16(const void *p)
 {
-   return be16_to_cpup(p);
+   return le16_to_cpu(get_unaligned_t_(__le16, p));
 }
 
-static inline void put_unaligned_be16(uint16_t val, void *p)
+static inline u32 get_unaligned_le32(const void *p)
 {
-   *(__force __be16*)p = cpu_to_be16(val);
+   return le32_to_cpu(get_unaligned_t_(__le32, p));
 }
 
-static inline uint32_t get_unaligned_be32(const void *p)
+static inline u64 get_unaligned_le64(const void *p)
 {
-   return be32_to_cpup(p);
+   return le64_to_cpu(get_unaligned_t_(__le64, p));
 }
 
-static inline void put_unaligned_be32(uint32_t val, void *p)
+static inline void put_unaligned_le16(u16 val, void *p)
 {
-   *(__force __be32*)p = cpu_to_be32(val);
+   put_unaligned_t_(__le16, cpu_to_le16(val), p);
 }
 
-static inline uint64_t get_unaligned_be64(const void *p)
+static inline void put_unaligned_le32(u32 val, void *p)
 {
-   return be64_to_cpup(p);
+   put_unaligned_t_(__le32, cpu_to_le32(val), p);
 }
 
-static inline void put_unaligned_be64(uint64_t val, void *p)
+static inline void put_unaligned_le64(u64 val, void *p)
 {
-   *(__force __be64*)p = cpu_to_be64(val);
+   put_unaligned_t_(__le64, cpu_to_le64(val), p);
 }
 
-static inline uint16_t get_unaligned_le16(const void *p)
+static inline u16 get_unaligned_be16(const void *p)
 {
-   return le16_to_cpup(p);
+   return be16_to_cpu(get_unaligned_t_(__be16, p));
 }
 
-static inline void put_unaligned_le16(uint16_t val, void *p)
+static inline u32 get_unaligned_be32(const void *p)
 {
-   *(__force __le16*)p = cpu_to_le16(val);
+   return be32_to_cpu(get_unaligned_t_(__be32, p));
 }
 
-static inline uint32_t get_unaligned_le32(const void *p)
+static inline u64 get_unaligned_be64(const void *p)
 {
-   return le32_to_cpup(p);
+   return be64_to_cpu(get_unaligned_t_(__be64, p));
 }
 
-static inline void put_unaligned_le32(uint32_t val, void *p)
+static inline void put_unaligned_be16(u16 val, void *p)
 {
-   *(__force __le32*)p = cpu_to_le32(val);
+   put_unaligned_t_(__be16, cpu_to_be16(val), p);
 }
 
-static inline uint64_t get_unaligned_le64(const void *p)
+static inline void put_unaligned_be32(u32 val, void *p)
 {
-   return le64_to_cpup(p);
+   put_unaligned_t_(__be32, cpu_to_be32(val), p);
 }
 
-static inline void put_unaligned_le64(uint64_t val, void *p)
+static inline void put_unaligned_be64(u64 val, void *p)
 {
-   *(__force 

[PATCH v2 0/3] xen: have a more generic unaligned.h header

2023-12-05 Thread Juergen Gross
Update Xen's unaligned.h header to support all architectures, allowing
to remove the architecture specific variants (x86 only until now).

Changes in V2:
- new patch 1 (Julien Grall)
- adjusted patch 2 (Jan Beulich)

Juergen Gross (3):
  xen/arm: set -mno-unaligned-access compiler option for Arm32
  xen: make include/xen/unaligned.h usable on all architectures
  xen: remove asm/unaligned.h

 xen/arch/arm/arch.mk |  1 +
 xen/arch/x86/include/asm/unaligned.h |  6 ---
 xen/common/lz4/defs.h|  2 +-
 xen/common/lzo.c |  2 +-
 xen/common/unlzo.c   |  2 +-
 xen/common/xz/private.h  |  2 +-
 xen/common/zstd/mem.h|  2 +-
 xen/include/xen/unaligned.h  | 77 +++-
 xen/lib/xxhash32.c   |  2 +-
 xen/lib/xxhash64.c   |  2 +-
 10 files changed, 50 insertions(+), 48 deletions(-)
 delete mode 100644 xen/arch/x86/include/asm/unaligned.h

-- 
2.35.3




[PATCH v2 3/3] xen: remove asm/unaligned.h

2023-12-05 Thread Juergen Gross
With include/xen/unaligned.h now dealing properly with unaligned
accesses for all architectures, asm/unaligned.h can be removed and
users can be switched to include xen/unaligned.h instead.

Signed-off-by: Juergen Gross 
Reviewed-by: Jan Beulich 
---
 xen/arch/x86/include/asm/unaligned.h | 6 --
 xen/common/lz4/defs.h| 2 +-
 xen/common/lzo.c | 2 +-
 xen/common/unlzo.c   | 2 +-
 xen/common/xz/private.h  | 2 +-
 xen/common/zstd/mem.h| 2 +-
 xen/lib/xxhash32.c   | 2 +-
 xen/lib/xxhash64.c   | 2 +-
 8 files changed, 7 insertions(+), 13 deletions(-)
 delete mode 100644 xen/arch/x86/include/asm/unaligned.h

diff --git a/xen/arch/x86/include/asm/unaligned.h 
b/xen/arch/x86/include/asm/unaligned.h
deleted file mode 100644
index 6070801d4a..00
--- a/xen/arch/x86/include/asm/unaligned.h
+++ /dev/null
@@ -1,6 +0,0 @@
-#ifndef __ASM_UNALIGNED_H__
-#define __ASM_UNALIGNED_H__
-
-#include 
-
-#endif /* __ASM_UNALIGNED_H__ */
diff --git a/xen/common/lz4/defs.h b/xen/common/lz4/defs.h
index 10609f5a53..6d81113266 100644
--- a/xen/common/lz4/defs.h
+++ b/xen/common/lz4/defs.h
@@ -10,7 +10,7 @@
 
 #ifdef __XEN__
 #include 
-#include 
+#include 
 #else
 
 static inline u16 get_unaligned_le16(const void *p)
diff --git a/xen/common/lzo.c b/xen/common/lzo.c
index a87c76dded..cc03f0f554 100644
--- a/xen/common/lzo.c
+++ b/xen/common/lzo.c
@@ -97,7 +97,7 @@
 #ifdef __XEN__
 #include 
 #include 
-#include 
+#include 
 #else
 #define get_unaligned_le16(_p) (*(u16 *)(_p))
 #endif
diff --git a/xen/common/unlzo.c b/xen/common/unlzo.c
index 74056778eb..bdcefa95b3 100644
--- a/xen/common/unlzo.c
+++ b/xen/common/unlzo.c
@@ -34,7 +34,7 @@
 
 #ifdef __XEN__
 #include 
-#include 
+#include 
 #else
 
 static inline u16 get_unaligned_be16(const void *p)
diff --git a/xen/common/xz/private.h b/xen/common/xz/private.h
index e6814250e8..2299705378 100644
--- a/xen/common/xz/private.h
+++ b/xen/common/xz/private.h
@@ -13,7 +13,7 @@
 #ifdef __XEN__
 #include 
 #include 
-#include 
+#include 
 #else
 
 static inline u32 get_unaligned_le32(const void *p)
diff --git a/xen/common/zstd/mem.h b/xen/common/zstd/mem.h
index 2acae6a8ed..ae1e305126 100644
--- a/xen/common/zstd/mem.h
+++ b/xen/common/zstd/mem.h
@@ -23,7 +23,7 @@
 #ifdef __XEN__
 #include  /* memcpy */
 #include   /* size_t, ptrdiff_t */
-#include 
+#include 
 #endif
 
 /*-
diff --git a/xen/lib/xxhash32.c b/xen/lib/xxhash32.c
index e8d403e5ce..32efa651c5 100644
--- a/xen/lib/xxhash32.c
+++ b/xen/lib/xxhash32.c
@@ -42,7 +42,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 /*-*
  * Macros
diff --git a/xen/lib/xxhash64.c b/xen/lib/xxhash64.c
index 481e76fbcf..1858e236fe 100644
--- a/xen/lib/xxhash64.c
+++ b/xen/lib/xxhash64.c
@@ -43,7 +43,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #endif
 
 /*-*
-- 
2.35.3




[PATCH v2 1/3] xen/arm: set -mno-unaligned-access compiler option for Arm32

2023-12-05 Thread Juergen Gross
As the hypervisor is disabling unaligned accesses for Arm32, set the
-mno-unaligned-access compiler option for building. This will prohibit
unaligned accesses when e.g. accessing 2- or 4-byte data items in
packed data structures.

Signed-off-by: Juergen Gross 
---
V2:
- new patch
---
 xen/arch/arm/arch.mk | 1 +
 1 file changed, 1 insertion(+)

diff --git a/xen/arch/arm/arch.mk b/xen/arch/arm/arch.mk
index 58db76c4e1..022dcda192 100644
--- a/xen/arch/arm/arch.mk
+++ b/xen/arch/arm/arch.mk
@@ -7,6 +7,7 @@ $(call cc-option-add,CFLAGS,CC,-Wnested-externs)
 # Prevent floating-point variables from creeping into Xen.
 CFLAGS-$(CONFIG_ARM_32) += -msoft-float
 CFLAGS-$(CONFIG_ARM_32) += -mcpu=cortex-a15
+CFLAGS-$(CONFIG_ARM_32) += -mno-unaligned-access
 
 CFLAGS-$(CONFIG_ARM_64) += -mcpu=generic
 CFLAGS-$(CONFIG_ARM_64) += -mgeneral-regs-only # No fp registers etc
-- 
2.35.3




[xen-unstable-smoke test] 184002: tolerable all pass - PUSHED

2023-12-05 Thread osstest service owner
flight 184002 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/184002/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  4c6142a1ab004be132f386da3cabce07b44fac2d
baseline version:
 xen  01da0aeecd41435cea8bd2fe0f547e0a474f6e45

Last test of basis   184000  2023-12-05 23:00:25 Z0 days
Testing same since   184002  2023-12-06 02:00:25 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   01da0aeecd..4c6142a1ab  4c6142a1ab004be132f386da3cabce07b44fac2d -> smoke



Re: [PATCH -next RFC 02/14] xen/blkback: use bdev api in xen_update_blkif_status()

2023-12-05 Thread Yu Kuai

Hi,

在 2023/12/06 13:55, Christoph Hellwig 写道:

On Tue, Dec 05, 2023 at 08:37:16PM +0800, Yu Kuai wrote:

diff --git a/drivers/block/xen-blkback/xenbus.c 
b/drivers/block/xen-blkback/xenbus.c
index e34219ea2b05..e645afa4af57 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -104,8 +104,7 @@ static void xen_update_blkif_status(struct xen_blkif *blkif)
xenbus_dev_error(blkif->be->dev, err, "block flush");
return;
}
-   invalidate_inode_pages2(
-   blkif->vbd.bdev_handle->bdev->bd_inode->i_mapping);
+   invalidate_bdev(blkif->vbd.bdev_handle->bdev);


blkbak is a bdev exported.   I don't think it should ever call
invalidate_inode_pages2, through a wrapper or not.


I'm not sure about this. I'm not familiar with xen/blkback, but I saw
that xen-blkback will open a bdev from xen_vbd_create(), hence this
looks like a dm/md for me, hence it sounds reasonable to sync +
invalidate the opened bdev while initialization. Please kindly correct
me if I'm wrong.

Thanks,
Kuai



.






Re: [PATCH -next RFC 01/14] block: add some bdev apis

2023-12-05 Thread Yu Kuai

Hi,

在 2023/12/06 14:14, Christoph Hellwig 写道:

+void invalidate_bdev_range(struct block_device *bdev, pgoff_t start,
+  pgoff_t end)
+{
+   invalidate_mapping_pages(bdev->bd_inode->i_mapping, start, end);
+}
+EXPORT_SYMBOL_GPL(invalidate_bdev_range);


All these could probably use kerneldoc comments.


Ok, and thanks for reviewing the patchset!


For this one I really don't like it existing at all, but we'll have to
discuss that in the btrfs patch.


+loff_t bdev_size(struct block_device *bdev)
+{
+   loff_t size;
+
+   spin_lock(>bd_size_lock);
+   size = i_size_read(bdev->bd_inode);
+   spin_unlock(>bd_size_lock);
+
+   return size;
+}
+EXPORT_SYMBOL_GPL(bdev_size);


No need for this one.  The callers can simply use bdev_nr_bytes.


Ok, I'll replace it with bdev_nr_bytes.



+struct folio *bdev_read_folio(struct block_device *bdev, pgoff_t index)
+{
+   return read_mapping_folio(bdev->bd_inode->i_mapping, index, NULL);
+}
+EXPORT_SYMBOL_GPL(bdev_read_folio);
+
+struct folio *bdev_read_folio_gfp(struct block_device *bdev, pgoff_t index,
+ gfp_t gfp)
+{
+   return mapping_read_folio_gfp(bdev->bd_inode->i_mapping, index, gfp);
+}
+EXPORT_SYMBOL_GPL(bdev_read_folio_gfp);


I think we can just drop bdev_read_folio_gfp. Half of the callers simply
pass GPK_KERNEL, and the other half passes GFP_NOFS and could just use
memalloc_nofs_save().


I'm a litter confused, so there are 3 use cases:
1) use GFP_USER, default gfp from bdev_alloc.
2) use GFP_KERNEL
3) use GFP_NOFS

I understand that you're suggesting memalloc_nofs_save() to distinguish
2 and 3, but how can I distinguish 1?



+void bdev_balance_dirty_pages_ratelimited(struct block_device *bdev)
+{
+   return balance_dirty_pages_ratelimited(bdev->bd_inode->i_mapping);
+}
+EXPORT_SYMBOL_GPL(bdev_balance_dirty_pages_ratelimited);


Hmm, this is just used for block2mtd, and feels a little too low-level
to me, as block2mtd really should be using the normal fileread/write
APIs.  I guess we'll have to live with it for now if we want to expedite
killing off bd_inode.


+void bdev_correlate_mapping(struct block_device *bdev,
+   struct address_space *mapping)
+{
+   mapping->host = bdev->bd_inode;
+}
+EXPORT_SYMBOL_GPL(bdev_correlate_mapping);


Maybe associated insted of correlate?  Either way this basically
fully exposes the bdev inode again :(


+gfp_t bdev_gfp_constraint(struct block_device *bdev, gfp_t gfp)
+{
+   return mapping_gfp_constraint(bdev->bd_inode->i_mapping, gfp);
+}
+EXPORT_SYMBOL_GPL(bdev_gfp_constraint);


The right fix here is to:

  - use memalloc_nofs_save in extet instead of using
mapping_gfp_constraint to clear it from the mapping flags
  - remove __ext4_sb_bread_gfp and just have buffer.c helper that does
the right thing (either by changing the calling conventions of an
existing one, or adding a new one).


Thanks for the suggestions, but I'm not sure how to do this yet, I must
read more ext4 code.



+/*
+ * The del_gendisk() function uninitializes the disk-specific data
+ * structures, including the bdi structure, without telling anyone
+ * else.  Once this happens, any attempt to call mark_buffer_dirty()
+ * (for example, by ext4_commit_super), will cause a kernel OOPS.
+ * This is a kludge to prevent these oops until we can put in a proper
+ * hook in del_gendisk() to inform the VFS and file system layers.
+ */
+int bdev_ejected(struct block_device *bdev)
+{
+   struct backing_dev_info *bdi = inode_to_bdi(bdev->bd_inode);
+
+   return bdi->dev == NULL;
+}
+EXPORT_SYMBOL_GPL(bdev_ejected);


And this code in ext4 should just go away entirely.  The bdi should
always be valid for a live bdev for years.

Sounds good, I was confused about this code as well.




--- a/block/bio.c
+++ b/block/bio.c
@@ -1119,6 +1119,7 @@ void bio_add_folio_nofail(struct bio *bio, struct folio 
*folio, size_t len,
WARN_ON_ONCE(off > UINT_MAX);
__bio_add_page(bio, >page, len, off);
  }
+EXPORT_SYMBOL_GPL(bio_add_folio_nofail);


How is this realted?  The export is fine, but really should be a
separate, well-documented commit.


This is used to replace __bio_add_page() in btrfs while converting page
to folio, please let me know if I should keep this, if so, I'll split
this into a new commit.


  
+static inline u8 block_bits(struct block_device *bdev)

+{
+   return bdev->bd_inode->i_blkbits;
+}


Not sure we should need this.  i_blkbits comes from the blocksize
the fs set, so it should have other ways to get at it.


Yes, this is now only used for erofs, and erofs do call
sb_set_blocksize() while initializing, hence it's right there is other
way to get blkbits and this helper is not needed.

Thanks,
Kuai


.






Re: [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function

2023-12-05 Thread Chen, Jiqian
Hi Thomas Gleixner,

On 2023/12/6 01:02, Thomas Gleixner wrote:
> On Mon, Dec 04 2023 at 13:31, Stefano Stabellini wrote:
>> On Mon, 3 Dec 2023, Chen, Jiqian wrote:
> vpci device state when device is reset on dom0 side.
>
> And call that function in pcistub_init_device. Because when
> we use "pci-assignable-add" to assign a passthrough device in
> Xen, it will reset passthrough device and the vpci state will
> out of date, and then device will fail to restore bar state.
>
> Signed-off-by: Jiqian Chen 
> Signed-off-by: Huang Rui 

 This Signed-off-by chain is incorrect.

 Documentation/process/submitting-patches.rst has a full chapter about
 S-O-B and the correct usage.
>>> I am the author of this series of patches, and Huang Rui transported the v1 
>>> to upstream. And now I transport v2. I am not aware that the SOB chain is 
>>> incorrect.
>>> Do you have any suggestions?
>>
>> I think he means that your Signed-off-by should be the second one of the
>> two as you are the one submitting the patch to the LKML
> 
> No.
> 
>Mailfrom: Jiqian Chen 
>
> 
>Changelog-text
> 
>Signed-off-by: Huang Rui 
>Signed-off-by: Jiqian Chen 
> 
> is equally wrong because that would end up with Chen as author and Huang
> as first S-O-B which is required to be the author's S-O-B
> 
> To make the above correct this would require:
> 
>Mailfrom: Jiqian Chen 
>
> 
>From: Huang Rui 
> 
>Changelog-text
> 
>Signed-off-by: Huang Rui 
>Signed-off-by: Jiqian Chen 
> 
>which tells that Huang is the author and Chen is the 'transporter',
>which unfortunately does not reflect reality.
> 
> Or:
> 
>Mailfrom: Jiqian Chen 
>
> 
>Changelog-text
> 
>Co-developed-by: Huang Rui 
>Signed-off-by: Huang Rui 
>Signed-off-by: Jiqian Chen 
> 
>which tells that Checn is the author and Huang co-developed the
>patch, which might be true or not.
> 
> 
> V1 which was sent by Huang has the ordering is correct:
> 
>Mailfrom: Huang Rui 
>
> 
>From: Jiqian Chen 
> 
>Changelog-text
> 
>Signed-off-by: Jiqian Chen 
>Signed-off-by: Huang Rui 
> 
>i.e. Chen authored and Huang transported
> 
> Now this V2 has not really much to do with V1 and is a new
> implementation to solve the problem, which was authored by Chen, so
> Huang is not involved at all if I understand correctly.
Not exactly, V2 is modified based on the V1. I am the author of this series. 
Huang Rui upstream the V1, and he also helped me to improve the first patch of 
this V2 series.
Maybe in next version, below is more suitable:

Co-developed-by: Huang Rui 
Signed-off-by: Jiqian Chen 

Which can reflect that Huang Rui is co-developer, and I am the author and the 
last people to send patches.

> 
> So what does his S-O-B mean here? Nothing...
> 
> It's very well documented how the whole S-O-B business works and it's
> not really rocket science to get it straight.
> 
> It has a meaning and is not just for decoration purposes.
> 
> Thanks,
> 
> tglx

-- 
Best regards,
Jiqian Chen.


Re: [PATCH -next RFC 01/14] block: add some bdev apis

2023-12-05 Thread Christoph Hellwig
> +void invalidate_bdev_range(struct block_device *bdev, pgoff_t start,
> +pgoff_t end)
> +{
> + invalidate_mapping_pages(bdev->bd_inode->i_mapping, start, end);
> +}
> +EXPORT_SYMBOL_GPL(invalidate_bdev_range);

All these could probably use kerneldoc comments.

For this one I really don't like it existing at all, but we'll have to
discuss that in the btrfs patch.

> +loff_t bdev_size(struct block_device *bdev)
> +{
> + loff_t size;
> +
> + spin_lock(>bd_size_lock);
> + size = i_size_read(bdev->bd_inode);
> + spin_unlock(>bd_size_lock);
> +
> + return size;
> +}
> +EXPORT_SYMBOL_GPL(bdev_size);

No need for this one.  The callers can simply use bdev_nr_bytes.

> +struct folio *bdev_read_folio(struct block_device *bdev, pgoff_t index)
> +{
> + return read_mapping_folio(bdev->bd_inode->i_mapping, index, NULL);
> +}
> +EXPORT_SYMBOL_GPL(bdev_read_folio);
> +
> +struct folio *bdev_read_folio_gfp(struct block_device *bdev, pgoff_t index,
> +   gfp_t gfp)
> +{
> + return mapping_read_folio_gfp(bdev->bd_inode->i_mapping, index, gfp);
> +}
> +EXPORT_SYMBOL_GPL(bdev_read_folio_gfp);

I think we can just drop bdev_read_folio_gfp. Half of the callers simply
pass GPK_KERNEL, and the other half passes GFP_NOFS and could just use
memalloc_nofs_save().

> +void bdev_balance_dirty_pages_ratelimited(struct block_device *bdev)
> +{
> + return balance_dirty_pages_ratelimited(bdev->bd_inode->i_mapping);
> +}
> +EXPORT_SYMBOL_GPL(bdev_balance_dirty_pages_ratelimited);

Hmm, this is just used for block2mtd, and feels a little too low-level
to me, as block2mtd really should be using the normal fileread/write
APIs.  I guess we'll have to live with it for now if we want to expedite
killing off bd_inode.

> +void bdev_correlate_mapping(struct block_device *bdev,
> + struct address_space *mapping)
> +{
> + mapping->host = bdev->bd_inode;
> +}
> +EXPORT_SYMBOL_GPL(bdev_correlate_mapping);

Maybe associated insted of correlate?  Either way this basically
fully exposes the bdev inode again :(

> +gfp_t bdev_gfp_constraint(struct block_device *bdev, gfp_t gfp)
> +{
> + return mapping_gfp_constraint(bdev->bd_inode->i_mapping, gfp);
> +}
> +EXPORT_SYMBOL_GPL(bdev_gfp_constraint);

The right fix here is to:

 - use memalloc_nofs_save in extet instead of using
   mapping_gfp_constraint to clear it from the mapping flags
 - remove __ext4_sb_bread_gfp and just have buffer.c helper that does
   the right thing (either by changing the calling conventions of an
   existing one, or adding a new one).

> +/*
> + * The del_gendisk() function uninitializes the disk-specific data
> + * structures, including the bdi structure, without telling anyone
> + * else.  Once this happens, any attempt to call mark_buffer_dirty()
> + * (for example, by ext4_commit_super), will cause a kernel OOPS.
> + * This is a kludge to prevent these oops until we can put in a proper
> + * hook in del_gendisk() to inform the VFS and file system layers.
> + */
> +int bdev_ejected(struct block_device *bdev)
> +{
> + struct backing_dev_info *bdi = inode_to_bdi(bdev->bd_inode);
> +
> + return bdi->dev == NULL;
> +}
> +EXPORT_SYMBOL_GPL(bdev_ejected);

And this code in ext4 should just go away entirely.  The bdi should
always be valid for a live bdev for years.

> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -1119,6 +1119,7 @@ void bio_add_folio_nofail(struct bio *bio, struct folio 
> *folio, size_t len,
>   WARN_ON_ONCE(off > UINT_MAX);
>   __bio_add_page(bio, >page, len, off);
>  }
> +EXPORT_SYMBOL_GPL(bio_add_folio_nofail);

How is this realted?  The export is fine, but really should be a
separate, well-documented commit.

>  
> +static inline u8 block_bits(struct block_device *bdev)
> +{
> + return bdev->bd_inode->i_blkbits;
> +}

Not sure we should need this.  i_blkbits comes from the blocksize
the fs set, so it should have other ways to get at it.



[xen-unstable test] 183996: tolerable FAIL - PUSHED

2023-12-05 Thread osstest service owner
flight 183996 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/183996/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 183990
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 183990
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 183990
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 183990
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 183990
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 183990
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 183990
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 183990
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 183990
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 183990
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 183990
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 183990
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  ff1178062094837d55ef342070e58316c43a54c9
baseline version:
 xen  525c7c094b258e8a46b494488eef96f5670eb352

Last test of basis   183990  2023-12-05 01:55:39 Z1 days
Testing same since   183996  2023-12-05 16:10:29 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Federico Serafini 
  Henry Wang  # 

Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0

2023-12-05 Thread Chen, Jiqian
On 2023/12/5 18:32, Jan Beulich wrote:
> On 05.12.2023 10:19, Roger Pau Monné wrote:
>> On Mon, Dec 04, 2023 at 02:19:33PM -0800, Stefano Stabellini wrote:
>>> On Mon, 4 Dec 2023, Roger Pau Monné wrote:
 On Fri, Dec 01, 2023 at 07:37:55PM -0800, Stefano Stabellini wrote:
> On Fri, 1 Dec 2023, Roger Pau Monné wrote:
>> On Thu, Nov 30, 2023 at 07:15:17PM -0800, Stefano Stabellini wrote:
>>> On Thu, 30 Nov 2023, Roger Pau Monné wrote:
 On Wed, Nov 29, 2023 at 07:53:59PM -0800, Stefano Stabellini wrote:
> On Fri, 24 Nov 2023, Jiqian Chen wrote:
>> This patch is to solve two problems we encountered when we try to
>> passthrough a device to hvm domU base on Xen PVH dom0.
>>
>> First, hvm guest will alloc a pirq and irq for a passthrough device
>> by using gsi, before that, the gsi must first has a mapping in dom0,
>> see Xen code pci_add_dm_done->xc_domain_irq_permission, it will call
>> into Xen and check whether dom0 has the mapping. See
>> XEN_DOMCTL_irq_permission->pirq_access_permitted, "current" is PVH
>> dom0 and it return irq is 0, and then return -EPERM.
>> This is because the passthrough device doesn't do PHYSDEVOP_map_pirq
>> when thay are enabled.
>>
>> Second, in PVH dom0, the gsi of a passthrough device doesn't get
>> registered, but gsi must be configured for it to be able to be
>> mapped into a domU.
>>
>> After searching codes, we can find map_pirq and register_gsi will be
>> done in function vioapic_write_redirent->vioapic_hwdom_map_gsi when
>> the gsi(aka ioapic's pin) is unmasked in PVH dom0. So the problems
>> can be conclude to that the gsi of a passthrough device doesn't be
>> unmasked.
>>
>> To solve the unmaske problem, this patch call the unmask_irq when we
>> assign a device to be passthrough. So that the gsi can get registered
>> and mapped in PVH dom0.
>
>
> Roger, this seems to be more of a Xen issue than a Linux issue. Why do
> we need the unmask check in Xen? Couldn't we just do:
>
>
> diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
> index 4e40d3609a..df262a4a18 100644
> --- a/xen/arch/x86/hvm/vioapic.c
> +++ b/xen/arch/x86/hvm/vioapic.c
> @@ -287,7 +287,7 @@ static void vioapic_write_redirent(
>  hvm_dpci_eoi(d, gsi);
>  }
>  
> -if ( is_hardware_domain(d) && unmasked )
> +if ( is_hardware_domain(d) )
>  {
>  /*
>   * NB: don't call vioapic_hwdom_map_gsi while holding 
> hvm.irq_lock

 There are some issues with this approach.

 mp_register_gsi() will only setup the trigger and polarity of the
 IO-APIC pin once, so we do so once the guest unmask the pin in order
 to assert that the configuration is the intended one.  A guest is
 allowed to write all kind of nonsense stuff to the IO-APIC RTE, but
 that doesn't take effect unless the pin is unmasked.

 Overall the question would be whether we have any guarantees that
 the hardware domain has properly configured the pin, even if it's not
 using it itself (as it hasn't been unmasked).

 IIRC PCI legacy interrupts are level triggered and low polarity, so we
 could configure any pins that are not setup at bind time?
>>>
>>> That could work.
>>>
>>> Another idea is to move only the call to allocate_and_map_gsi_pirq at
>>> bind time? That might be enough to pass a pirq_access_permitted check.
>>
>> Maybe, albeit that would change the behavior of XEN_DOMCTL_bind_pt_irq
>> just for PT_IRQ_TYPE_PCI and only when called from a PVH dom0 (as the
>> parameter would be a GSI instead of a previously mapped IRQ).  Such
>> difference just for PT_IRQ_TYPE_PCI is slightly weird - if we go that
>> route I would recommend that we instead introduce a new dmop that has
>> this syntax regardless of the domain type it's called from.
>
> Looking at the code it is certainly a bit confusing. My point was that
> we don't need to wait until polarity and trigger are set appropriately
> to allow Dom0 to pass successfully a pirq_access_permitted() check. Xen
> should be able to figure out that Dom0 is permitted pirq access.

 The logic is certainly not straightforward, and it could benefit from
 some comments.

 The irq permissions are a bit special, in that they get setup when the
 IRQ is mapped.

 The problem however is not so much with IRQ permissions, that we can
 indeed sort out internally in Xen.  Such check in dom0 has the side
 effect of preventing the IRQ from being assigned to a 

Re: [PATCH -next RFC 00/14] block: don't access bd_inode directly from other modules

2023-12-05 Thread Yu Kuai

Hi,

在 2023/12/06 13:54, Christoph Hellwig 写道:

On Tue, Dec 05, 2023 at 08:37:14PM +0800, Yu Kuai wrote:

From: Yu Kuai 

Patch 1 add some bdev apis, then follow up patches will use these apis
to avoid access bd_inode directly, and hopefully the field bd_inode can
be removed eventually(after figure out a way for fs/buffer.c).


What tree is this against?  It fails to apply to either Jens'
for-6.8/block or Linus tree in the very first patch.


It was against linux-next branch, for the tag next-20231201, because I'm
not sure yet if this patchset should be applied to Jans' tree. Please
let me know if I should swith wo Jens' tree for v2.

Thanks,
Kuai


.






Re: [PATCH -next RFC 02/14] xen/blkback: use bdev api in xen_update_blkif_status()

2023-12-05 Thread Christoph Hellwig
On Tue, Dec 05, 2023 at 08:37:16PM +0800, Yu Kuai wrote:
> diff --git a/drivers/block/xen-blkback/xenbus.c 
> b/drivers/block/xen-blkback/xenbus.c
> index e34219ea2b05..e645afa4af57 100644
> --- a/drivers/block/xen-blkback/xenbus.c
> +++ b/drivers/block/xen-blkback/xenbus.c
> @@ -104,8 +104,7 @@ static void xen_update_blkif_status(struct xen_blkif 
> *blkif)
>   xenbus_dev_error(blkif->be->dev, err, "block flush");
>   return;
>   }
> - invalidate_inode_pages2(
> - blkif->vbd.bdev_handle->bdev->bd_inode->i_mapping);
> + invalidate_bdev(blkif->vbd.bdev_handle->bdev);

blkbak is a bdev exported.   I don't think it should ever call
invalidate_inode_pages2, through a wrapper or not.




Re: [PATCH -next RFC 00/14] block: don't access bd_inode directly from other modules

2023-12-05 Thread Christoph Hellwig
On Tue, Dec 05, 2023 at 08:37:14PM +0800, Yu Kuai wrote:
> From: Yu Kuai 
> 
> Patch 1 add some bdev apis, then follow up patches will use these apis
> to avoid access bd_inode directly, and hopefully the field bd_inode can
> be removed eventually(after figure out a way for fs/buffer.c).

What tree is this against?  It fails to apply to either Jens'
for-6.8/block or Linus tree in the very first patch.




xen | Failed pipeline for staging | 3e5672d6

2023-12-05 Thread GitLab


Pipeline #1096802796 has failed!

Project: xen ( https://gitlab.com/xen-project/xen )
Branch: staging ( https://gitlab.com/xen-project/xen/-/commits/staging )

Commit: 3e5672d6 ( 
https://gitlab.com/xen-project/xen/-/commit/3e5672d69fe09e240195fa6744686b59db6d7d69
 )
Commit Message: automation/eclair: tag function calls to addres...
Commit Author: Simone Ballarin
Committed by: Stefano Stabellini


Pipeline #1096802796 ( 
https://gitlab.com/xen-project/xen/-/pipelines/1096802796 ) triggered by Ganis 
( https://gitlab.com/ganis )
had 2 failed jobs.

Job #5689442552 ( https://gitlab.com/xen-project/xen/-/jobs/5689442552/raw )

Stage: test
Name: xilinx-smoke-dom0less-arm64-gcc
Job #5689442554 ( https://gitlab.com/xen-project/xen/-/jobs/5689442554/raw )

Stage: test
Name: xilinx-smoke-dom0less-arm64-gcc-gem-passthrough

-- 
You're receiving this email because of your account on gitlab.com.





Re: [XEN PATCH 6/6] xen/pci: address violations of MISRA C:2012 Rule 8.2

2023-12-05 Thread Stefano Stabellini
On Tue, 5 Dec 2023, Federico Serafini wrote:
> Add missing parameter names to address violations of MISRA C:2012
> Rule 8.2. Furthermore, use C standard types to comply with XEN coding style.
> No functional change.
> 
> Signed-off-by: Federico Serafini 

Reviewed-by: Stefano Stabellini 



Re: [XEN PATCH 5/6] x86/mce: address violations of MISRA C:2012 Rule 8.2

2023-12-05 Thread Stefano Stabellini
On Tue, 5 Dec 2023, Federico Serafini wrote:
> Add missing parameter names. No functional change.
> 
> Signed-off-by: Federico Serafini 

Reviewed-by: Stefano Stabellini 




Re: [XEN PATCH 4/6] x86/page: address violations of MISRA C:2012 Rule 8.2

2023-12-05 Thread Stefano Stabellini
On Tue, 5 Dec 2023, Federico Serafini wrote:
> Add missing parameter names. No functional change.
> 
> Signed-off-by: Federico Serafini 

Reviewed-by: Stefano Stabellini 



Re: [XEN PATCH 3/6] AMD/IOMMU: address violations of MISRA C:2012 Rule 8.2

2023-12-05 Thread Stefano Stabellini
On Tue, 5 Dec 2023, Federico Serafini wrote:
> Add missing parameter names to address violations of MISRA C:2012
> Rule 8.2. Furthermore, remove trailing spaces and use C standard types to
> comply with XEN coding style.
> 
> No functional change.
> 
> Signed-off-by: Federico Serafini 
> ---
>  xen/drivers/passthrough/amd/iommu.h  | 17 ++---
>  xen/drivers/passthrough/amd/iommu_init.c | 24 ++--
>  2 files changed, 24 insertions(+), 17 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/amd/iommu.h 
> b/xen/drivers/passthrough/amd/iommu.h
> index d4416ebc43..1b62c083ba 100644
> --- a/xen/drivers/passthrough/amd/iommu.h
> +++ b/xen/drivers/passthrough/amd/iommu.h
> @@ -138,10 +138,12 @@ struct ivrs_mappings {
>  extern unsigned int ivrs_bdf_entries;
>  extern u8 ivhd_type;
>  
> -struct ivrs_mappings *get_ivrs_mappings(u16 seg);
> -int iterate_ivrs_mappings(int (*)(u16 seg, struct ivrs_mappings *));
> -int iterate_ivrs_entries(int (*)(const struct amd_iommu *,
> - struct ivrs_mappings *, uint16_t));
> +struct ivrs_mappings *get_ivrs_mappings(uint16_t seg);
> +int iterate_ivrs_mappings(int (*handler)(uint16_t seg,
> + struct ivrs_mappings *map));
> +int iterate_ivrs_entries(int (*handler)(const struct amd_iommu *iommu,
> +struct ivrs_mappings *map,
> +uint16_t bdf));
>  
>  /* iommu tables in guest space */
>  struct mmio_reg {
> @@ -226,7 +228,7 @@ struct acpi_ivrs_hardware;
>  /* amd-iommu-detect functions */
>  int amd_iommu_get_ivrs_dev_entries(void);
>  int amd_iommu_get_supported_ivhd_type(void);
> -int amd_iommu_detect_one_acpi(const struct acpi_ivrs_hardware *);
> +int amd_iommu_detect_one_acpi(const struct acpi_ivrs_hardware *ivhd_block);
>  int amd_iommu_detect_acpi(void);
>  void get_iommu_features(struct amd_iommu *iommu);
>  
> @@ -295,9 +297,10 @@ struct amd_iommu *find_iommu_for_device(int seg, int 
> bdf);
>  bool cf_check iov_supports_xt(void);
>  int amd_iommu_setup_ioapic_remapping(void);
>  void *amd_iommu_alloc_intremap_table(
> -const struct amd_iommu *, unsigned long **, unsigned int nr);
> +const struct amd_iommu *iommu, unsigned long **inuse_map, unsigned int 
> nr);
>  int cf_check amd_iommu_free_intremap_table(
> -const struct amd_iommu *, struct ivrs_mappings *, uint16_t);
> +const struct amd_iommu *iommu, struct ivrs_mappings *ivrs_mapping,
> +uint16_t bdf);
>  unsigned int amd_iommu_intremap_table_order(
>  const void *irt, const struct amd_iommu *iommu);
>  void cf_check amd_iommu_ioapic_update_ire(
> diff --git a/xen/drivers/passthrough/amd/iommu_init.c 
> b/xen/drivers/passthrough/amd/iommu_init.c
> index 5515cb70fd..62f9bfdfc8 100644
> --- a/xen/drivers/passthrough/amd/iommu_init.c
> +++ b/xen/drivers/passthrough/amd/iommu_init.c
> @@ -300,12 +300,13 @@ static void cf_check set_iommu_ppr_log_control(
>  static int iommu_read_log(struct amd_iommu *iommu,
>struct ring_buffer *log,
>unsigned int entry_size,
> -  void (*parse_func)(struct amd_iommu *, u32 *))
> +  void (*parse_func)(struct amd_iommu *iommu,
> + uint32_t *entry))
>  {
>  unsigned int tail, tail_offest, head_offset;
>  
>  BUG_ON(!iommu || ((log != >event_log) && (log != 
> >ppr_log)));
> -
> +
>  spin_lock(>lock);
>  
>  /* make sure there's an entry in the log */
> @@ -361,14 +362,15 @@ static int iommu_read_log(struct amd_iommu *iommu,
>  
>   out:
>  spin_unlock(>lock);
> -   
> +
>  return 0;
>  }
>  
>  /* reset event log or ppr log when overflow */
>  static void iommu_reset_log(struct amd_iommu *iommu,
>  struct ring_buffer *log,
> -void (*ctrl_func)(struct amd_iommu *iommu, bool))
> +void (*ctrl_func)(struct amd_iommu *iommu,
> +  bool iommu_control))

instead of iommu_control it should be iommu_enable ?


>  {
>  unsigned int entry, run_bit, loop_count = 1000;
>  bool log_run;
> @@ -1158,14 +1160,15 @@ static void __init amd_iommu_init_cleanup(void)
>  iommuv2_enabled = 0;
>  }
>  
> -struct ivrs_mappings *get_ivrs_mappings(u16 seg)
> +struct ivrs_mappings *get_ivrs_mappings(uint16_t seg)
>  {
>  return radix_tree_lookup(_maps, seg);
>  }
>  
> -int iterate_ivrs_mappings(int (*handler)(u16 seg, struct ivrs_mappings *))
> +int iterate_ivrs_mappings(int (*handler)(uint16_t seg,
> + struct ivrs_mappings *map))

Instead of map it should be ivrs_mappings ? Actually it reads better as
map and I know it is not a MISRA requirement to have function pointer
args match. I'll leave this one to Jan.




>  {
> -u16 seg = 0;
> +uint16_t seg = 0;

Re: [XEN PATCH 2/6] x86/mm: address violations of MISRA C:2012 Rule 8.2

2023-12-05 Thread Stefano Stabellini
On Tue, 5 Dec 2023, Federico Serafini wrote:
> Add missing parameter names. No functional change.
> 
> Signed-off-by: Federico Serafini 
> ---
>  xen/arch/x86/include/asm/mm.h | 20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/xen/arch/x86/include/asm/mm.h b/xen/arch/x86/include/asm/mm.h
> index 05dfe35502..a270f8ddd6 100644
> --- a/xen/arch/x86/include/asm/mm.h
> +++ b/xen/arch/x86/include/asm/mm.h
> @@ -406,7 +406,7 @@ void put_page_type(struct page_info *page);
>  int  get_page_type(struct page_info *page, unsigned long type);
>  int  put_page_type_preemptible(struct page_info *page);
>  int  get_page_type_preemptible(struct page_info *page, unsigned long type);
> -int  put_old_guest_table(struct vcpu *);
> +int  put_old_guest_table(struct vcpu *v);
>  int  get_page_from_l1e(
>  l1_pgentry_t l1e, struct domain *l1e_owner, struct domain *pg_owner);
>  void put_page_from_l1e(l1_pgentry_t l1e, struct domain *l1e_owner);
> @@ -557,7 +557,7 @@ void audit_domains(void);
>  
>  void make_cr3(struct vcpu *v, mfn_t mfn);
>  pagetable_t update_cr3(struct vcpu *v);
> -int vcpu_destroy_pagetables(struct vcpu *);
> +int vcpu_destroy_pagetables(struct vcpu *v);
>  void *do_page_walk(struct vcpu *v, unsigned long addr);
>  
>  /* Allocator functions for Xen pagetables. */
> @@ -572,20 +572,20 @@ int __sync_local_execstate(void);
>  /* Arch-specific portion of memory_op hypercall. */
>  long arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg);
>  long subarch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg);
> -int compat_arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void));
> -int compat_subarch_memory_op(int op, XEN_GUEST_HANDLE_PARAM(void));
> +int compat_arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) 
> arg);
> +int compat_subarch_memory_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg);

Also for compat_subarch_memory_op I cannot find the definition

everything else looks fine




Re: [XEN PATCH 1/6] xen/acpi: address remaining violations of MISRA C:2012 Rule 8.2

2023-12-05 Thread Stefano Stabellini
On Tue, 5 Dec 2023, Federico Serafini wrote:
> Add missing parameter names to address violations of MISRA C:2012
> Rule 8.2. Furthermore, use C standard types to comply with XEN coding style.
> 
> No functional change.
> 
> Signed-off-by: Federico Serafini 
> ---
>  xen/arch/x86/include/asm/acpi.h   |  2 +-
>  xen/include/acpi/apei.h   |  5 +++--
>  xen/include/acpi/cpufreq/cpufreq.h|  2 +-
>  xen/include/acpi/cpufreq/processor_perf.h | 16 
>  4 files changed, 13 insertions(+), 12 deletions(-)
> 
> diff --git a/xen/arch/x86/include/asm/acpi.h b/xen/arch/x86/include/asm/acpi.h
> index 0df92d3714..d54c105f61 100644
> --- a/xen/arch/x86/include/asm/acpi.h
> +++ b/xen/arch/x86/include/asm/acpi.h
> @@ -81,7 +81,7 @@ extern bool acpi_lapic, acpi_ioapic, acpi_noirq;
>  extern bool acpi_force, acpi_ht, acpi_disabled;
>  extern u32 acpi_smi_cmd;
>  extern u8 acpi_enable_value, acpi_disable_value;
> -void acpi_pic_sci_set_trigger(unsigned int, u16);
> +void acpi_pic_sci_set_trigger(unsigned int irq, uint16_t trigger);

There is something strange about this one, I cannot find the definition
anywhere. Am I missing anything?

Everything else looks good




Re: [BUG] Nested Virtualization Bug on x86-64 AMD CPU

2023-12-05 Thread Reima ISHII
Thank you for your prompt response.

On Tue, Dec 5, 2023 at 11:43 PM Andrew Cooper  wrote:
> Who is still in 64-bit mode ?
>
> It is legal for a 64-bit L1 to VMRUN into a 32-bit L2 with PG=0.
>
> But I'm guessing that you mean L2 is also 64-bit, and we're clearing PG,
> thus creating an illegal state (LMA=1 && PG=0) in VMCB12.
>
> And yes, in that case (virtual) VMRUN at L1 ought to fail with
> VMEXIT_INVALID.

Yes, you are correct in your understanding. This issue is triggered by
VMRUN execution to 64-bit L2 guests, when CR0.PG is cleared in VMCB12.
Contrary to the expected behavior where a VMRUN at L1 should fail with
VMEXIT_INVALID, the VMRUN does not fail but instead, the system
encounters a BUG().

> As an incidental observation, that function is particularly absurd and
> the two switches should be merged.
>
> VMExit reason 0x402 is AVIC_NOACCEL and Xen has no support for AVIC in
> the slightest right now.  i.e. Xen shouldn't have AVIC active in the
> VMCB, and should never any AVIC related VMExits.
>
> It is possible that we've got memory corruption, and have accidentally
> activated AVIC in the VMCB.

The idea of potential memory corruption activating AVIC in the VMCB is
certainly an interesting perspective. While I'm not sure how exactly
such memory corruption could occur, the suggestion does provide a
compelling explanation for the VMExit reason 0x402 (AVIC_NOACCEL),
particularly considering Xen's current lack of AVIC support.

> But, is this by any chance all running nested under KVM in your fuzzer?

No, KVM was not used. The issue was observed on a Xen hypervisor's
domU HVM running directly on the hardware. Within the guest HVM, a
simple custom hypervisor was utilized.

-- 
Graduate School of Information Science and Technology, The University of Tokyo
Reima Ishii
ish...@g.ecc.u-tokyo.ac.jp



[PATCH] docs/misra/rules.rst: add more rules

2023-12-05 Thread Stefano Stabellini
Add the rules accepted in the last three MISRA C working group meetings.

Signed-off-by: Stefano Stabellini 

diff --git a/docs/misra/rules.rst b/docs/misra/rules.rst
index 75921b9a34..ab89116a43 100644
--- a/docs/misra/rules.rst
+++ b/docs/misra/rules.rst
@@ -462,11 +462,23 @@ maintainers if you want to suggest a change.
 
while(0) and while(1) and alike are allowed.
 
+   * - `Rule 16.3 
`_
+ - Required
+ - An unconditional break statement shall terminate every
+   switch-clause
+ - In addition to break, also other flow control statements such as
+   continue, return, goto are allowed.
+
* - `Rule 16.7 
`_
  - Required
  - A switch-expression shall not have essentially Boolean type
  -
 
+   * - `Rule 17.1 
`_
+ - Required
+ - The features of  shall not be used
+ -
+
* - `Rule 17.3 
`_
  - Mandatory
  - A function shall not be declared implicitly
@@ -478,12 +490,24 @@ maintainers if you want to suggest a change.
have an explicit return statement with an expression
  -
 
+   * - `Rule 17.5 
`_
+ - Advisory
+ - The function argument corresponding to a parameter declared to
+   have an array type shall have an appropriate number of elements
+ -
+
* - `Rule 17.6 
`_
  - Mandatory
  - The declaration of an array parameter shall not contain the
static keyword between the [ ]
  -
 
+   * - `Rule 17.7 
`_
+ - Required
+ - The value returned by a function having non-void return type
+   shall be used
+ -
+
* - `Rule 18.3 
`_
  - Required
  - The relational operators > >= < and <= shall not be applied to objects 
of pointer type except where they point into the same object
@@ -498,6 +522,11 @@ maintainers if you want to suggest a change.
instances where Eclair is unable to verify that the code is valid
in regard to Rule 19.1. Caution reports are not violations.
 
+   * - `Rule 20.4 
`_
+ - Required
+ - A macro shall not be defined with the same name as a keyword
+ -
+
* - `Rule 20.7 
`_
  - Required
  - Expressions resulting from the expansion of macro parameters
@@ -506,6 +535,13 @@ maintainers if you want to suggest a change.
as function arguments, as macro arguments, array indices, lhs in
assignments
 
+   * - `Rule 20.9 
`_
+ - Required
+ - All identifiers used in the controlling expression of #if or
+   #elif preprocessing directives shall be #define'd before
+   evaluation
+ -
+
* - `Rule 20.13 
`_
  - Required
  - A line whose first token is # shall be a valid preprocessing



Re: SAF-* comment at the end of the line

2023-12-05 Thread Stefano Stabellini
On Tue, 5 Dec 2023, Luca Fancellu wrote:
> Hi all,
> 
> I’m writing this mail to collect thoughts about the need to improve the SAF-* 
> comments.
> 
> I think we reached a point where we need to use deviations for some violation 
> that we want
> to keep in the code with a proper justification and an issue was raised when 
> the comment
> cannot be put on a line on its own.
> 
> e.g.
> 
> If ( condition-1 &&
>  condition-2 &&
>   [...] )
> {
> ...
> }
> 
> For example in the code above, if the violation is in the second condition, 
> breaking the conditions
> to have an empty line between them for the SAF-* comment is not ideal, so we 
> could maybe
> improve the in-code comment to be used at the end of the line:
> 
> e.g.
> 
> If ( condition-1 &&
>  condition-2 && /* SAF-*-safe [...] */
>   [...] )
> {
> ...
> }
> 
> This might require also a deviation on the coding style to allow the comment 
> to overcome the line length.
> 
> Bertrand, from his experience with safety certifications, feels that adding 
> this feature could be enough
> to cover the majority of the cases where we need to deviate a violation in 
> the code.
> 
> Using it consistently in the code base as the only way to deviate a violation 
> can also help the adoption
> of the project to people who might want to fix them instead of deviating 
> them, the only thing they would need
> to do is to grep for SAF-* to have a rough idea of how many justified 
> violation are in the code.
> 
> Please let me know your thoughts before I start to implement the feature.

I think we need this feature and in fact we have already been adding it
in an ad-hoc way with /* octal-ok */

It would like to remove octal-ok and use a generic way (SAF) to do the
same.

Re: [PATCH v6 4/5] [FUTURE] xen/arm: enable vPCI for domUs

2023-12-05 Thread Stefano Stabellini
On Tue, 5 Dec 2023, Roger Pau Monné wrote:
> > > > I don't think we should enable IOREQ servers to handle PCI passthrough
> > > > for PVH guests and/or guests with vPCI. If the domain has vPCI, PCI
> > > > Passthrough can be handled by vPCI just fine. I think this should be a
> > > > good anti-feature to have (a goal to explicitly not add this feature) to
> > > > reduce complexity. Unless you see a specific usecase to add support for
> > > > it?
> > > 
> > > There are passthrough devices (GPUs) that might require some extra
> > > mediation on dom0 (like the Intel GVT-g thing) that would force the
> > > usage of ioreqs to passthrough.
> > 
> > From an architectural perspective, I think it would be cleaner, simpler
> > to maintain, and simpler to understand if Xen was the sole owner of the
> > PCI Root Complex and PCI config space mediation (implemented with vPCI).
> > IOREQ can be used for emulation and it works very well for that. At
> > least in my mind, that makes things much simpler.
> 
> But IOREQ already has all the code to mediate accesses to the PCI
> config space, and the interface to register separate servers for
> different PCI devices.
> 
> We would then need to duplicate this internally for vPCI, so that vPCI
> could forward accesses to IOREQ just for IOREQ to forward to yet a
> different component?  Seems like a lot of duplication for no benefit.

[...] 
 
> Also, you seem to confabulate IOREQ with QEMU, while the latter is
> indeed an user of IOREQ, I do see IOREQ as a simple resource mediator
> inside of Xen, that has the ability to forward such accesses to
> external emulators using an hypercall interface.

We have been using different terminologies until now. IOREQ could mean
anything from the ABI interface, the emulator side (QEMU) or the
hypervisor side (Xen). I am going to align with your wording and say:

IOREQ: only the IOREQ implementation in Xen (xen/common/ioreq.c)
IOREQ server: QEMU or alternative

I think it is OK if we use IOREQ internally within Xen to hook vPCI with
PCI config space accesses and emulation. I don't think it is a good idea
to attempt to enable IOREQ servers (e.g. QEMU) to implement PCI
Passthrough when vPCI is also enabled for the domain, at least
initially.


> > I understand there are non-trivial cases, like virtual GPUs with
> > hardware access, but I don't classify those as passthrough. That's
> > because there isn't one device that gets fully assigned to the guest.
> > Instead, there is an emulated device (hence IOREQ) with certain MMIO
> > regions and interrupts that end up being directly mapped from real
> > hardware.
> > 
> > So I think it is natural in those cases to use IOREQ and it is also
> > natural to have QEMU remap MMIO/IRQs at runtime. From a vPCI
> > perspective, I hope it will mostly look as if the device is assigned to
> > Dom0. Even if it ends up being more complex than that, Rome wasn't
> > built in one day, and I don't think we should try to solve this problem
> > on day1 (as long as the interfaces are not stable interfaces).
> 
> I don't see IOREQ as dealing explicitly with emulation.  Yes, it does
> allow for emulators to be implemented in user-space, but at the end
> it's just an interface that allows forwarding accesses to certain
> resources (for the case we are speaking about here, PCI config space)
> to entities that registered as handlers.
> 
> vPCI OTOH just deals with a very specific resource (PCI config space)
> and only allows internal handlers to be registered on a byte
> granularity.
> 
> So your proposal would be to implement a hierarchy like the one on the
> diagram below:
> 
> ┌┐ ┌──┐ ┌──┐
> │ Memory │ │ IO Ports │ │ PCI config space │
> └───┬┘ └┬─┘ └───┬──┘
> │   │   │
> │   │   ┌───┴──┐
> │   │   │ vPCI │
> │   │   └─┬──┬─┘
>  ┌──┴───┴┐│  │
>  │ IOREQ ├┘  │
>  └┬──┘   │
>   │  │
>  ┌┴──┐  ┌┴──┐
>  │ IOREQ servers │  │ vPCI handlers │
>  └───┘  └───┘

Yes


> While what I'm proposing would look like:
> 
> ┌┐ ┌──┐ ┌──┐
> │ Memory │ │ IO Ports │ │ PCI config space │
> └┬───┘ └┬─┘ └┬─┘
>  │  ││
>  └─┬┴┬───┘
>│  IOREQ  │
>└─┬─┬─┘
>  │ │
>  ┌───┤ └┬──┐
>  │ IOREQ servers │  │ vPCI │
>  └───┘  └───┬──┘
> │
> ┌───┴───┐
> │ vPCI handlers │
> └───┘

I don't have a major problem with this, but I find it less clear than
the first one.

Let's 

Re: Clang-format configuration discussion - pt 2

2023-12-05 Thread George Dunlap
On Tue, Dec 5, 2023 at 2:07 PM Jan Beulich  wrote:
>
> On 05.12.2023 14:46, Luca Fancellu wrote:
> > In my opinion, I don’t know of any tool that can address all the 
> > flexibility the Xen codestyle allows, yet the use of automatic
> > checkers would improve the review time, allow more new contributors to 
> > approach the community without being put down by
> > the amount of code-style comments,
>
> Since this argument is being repeated: I find it odd. No-one needs to even
> fear any amount of style comments if they simply follow the written down
> policy plus a tiny bit of common sense. According to my observation, (some)
> newcomers don't even care to look at what is being said about our style.
> It's not like you and I haven't been through this. When I started working
> with GNU toolchain, I had to adopt to their style. When I later started to
> work with Linux, I had to also adopt there. And then for Xen. And all of
> that already past closed source projects I had been working on before.

Most modern languages, including golang (and I think rust) have
built-in style correctors (`go fmt` is go's official one).  If you
haven't worked with an automatic style checker / fixer, you don't know
how much time, hassle, and emotional energy you're saving.  I don't
think I know anyone who, after using one, wants to go back to not
using one any more.

In general, I'm in favor of making changes to our style such that we
can make clang's style checker official.  The only reason I would vote
against it is if one of the style requirements was really intolerable;
but I find that pretty unlikely.

And as I've said before, the main reservation I have going forward
with this discussion is that I can't see clearly what it is that I'm
agreeing to.

 -George



Re: [PATCH] CODING_STYLE: Add a section of the naming convention

2023-12-05 Thread George Dunlap
On Tue, Dec 5, 2023 at 6:12 PM Julien Grall  wrote:
>
> From: Julien Grall 
>
> Several maintainers have expressed a stronger preference
> to use '-' when in filename and option that contains multiple
> words.
>
> So document it in CODING_STYLE.
>
> Signed-off-by: Julien Grall 
>
> ---
> Changes in v2:
> - New wording
> - Update the section title
> - Add Jan's acked-by
> ---
>  CODING_STYLE | 9 +
>  1 file changed, 9 insertions(+)
>
> diff --git a/CODING_STYLE b/CODING_STYLE
> index ced3ade5a6fb..ed13ee2b664b 100644
> --- a/CODING_STYLE
> +++ b/CODING_STYLE
> @@ -144,6 +144,15 @@ separate lines and each line should begin with a leading 
> '*'.
>   * Note beginning and end markers on separate lines and leading '*'.
>   */
>
> +Naming convention for files and command line options
> +
> +
> +'-' should be used to separate words in commandline options and filenames.
> +E.g. timer-works.
> +
> +Note that some of the options and filenames are using '_'. This is now
> +deprecated.

Sorry for not catching this last time; "are using X" isn't really
idiomatic English; more idiomatic would be something like the
following:

"Note that some existing options and file names use '_'.  This is now
deprecated."

Since we're changing things, I *think* most style guides would advise
against starting the sentence with a punctuation; so perhaps:

"Command-line options and file names should use '-' to separate words;
e.g., timer-works."

And what about adding to the last paragraph:

"When touching code around command-line parameters still using '_', it
is recommended to modify the documentation to say only '-', but modify
the code to accept both '-' and '_' (for backwards compatibility)."

I was going to say I'm happy to change on check-in, but I think with
three changes it's probably worth waiting for a fuller discussion.

 -George



[linux-linus test] 183993: regressions - FAIL

2023-12-05 Thread osstest service owner
flight 183993 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/183993/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-pvops  6 kernel-build fail REGR. vs. 183973

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 183973
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 183973
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 183973
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 183973
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 183973
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 183973
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 183973
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 183973
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass

version targeted for testing:
 linuxbee0e7762ad2c6025b9f5245c040fcc36ef2bde8
baseline version:
 linux815fb87b753055df2d9e50f6cd80eb10235fe3e9

Last test of basis   183973  2023-12-02 07:48:26 Z3 days
Failing since183977  2023-12-03 00:12:06 Z3 days7 attempts
Testing same since   183993  2023-12-05 09:00:54 Z0 days1 attempts


People who touched revisions under test:
  Alex Williamson 
  Brett Creeley 
  Dan Carpenter 
  David Howells 
  Dmitry Antipov 
  Eugenio Pérez 
  Jason Gunthorpe 
  Jason Wang 
  Jens Axboe 
  Joao Martins 
  JP Kobryn 
  Juergen Gross 
  Linus Torvalds 
  Masami Hiramatsu (Google) 
  Michael Ellerman 
  Michael S. Tsirkin 
  Namjae Jeon 
  Nicholas Piggin 
  Paulo Alcantara (SUSE) 
  Paulo Alcantara 
  Robin Murphy 
  Sean Christopherson 
  Shannon Nelson 
  Steve French 
  Steve Sistare 
  Takashi Sakamoto 
  

[xen-unstable-smoke test] 184000: tolerable all pass - PUSHED

2023-12-05 Thread osstest service owner
flight 184000 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/184000/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  01da0aeecd41435cea8bd2fe0f547e0a474f6e45
baseline version:
 xen  948e03303138482bf12f67740d8ebd8272824903

Last test of basis   183998  2023-12-05 20:02:06 Z0 days
Testing same since   184000  2023-12-05 23:00:25 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Nicola Vetrini 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   948e033031..01da0aeecd  01da0aeecd41435cea8bd2fe0f547e0a474f6e45 -> smoke



xen | Successful pipeline for staging | 4c6142a1

2023-12-05 Thread GitLab


Pipeline #1096652606 has passed!

Project: xen ( https://gitlab.com/xen-project/xen )
Branch: staging ( https://gitlab.com/xen-project/xen/-/commits/staging )

Commit: 4c6142a1 ( 
https://gitlab.com/xen-project/xen/-/commit/4c6142a1ab004be132f386da3cabce07b44fac2d
 )
Commit Message: CI: Fix fallout from adding elfutils-dev to the...
Commit Author: Andrew Cooper ( https://gitlab.com/andyhhp )



Pipeline #1096652606 ( 
https://gitlab.com/xen-project/xen/-/pipelines/1096652606 ) triggered by Ganis 
( https://gitlab.com/ganis )
successfully completed 129 jobs in 3 stages.

-- 
You're receiving this email because of your account on gitlab.com.





[ovmf test] 183999: all pass - PUSHED

2023-12-05 Thread osstest service owner
flight 183999 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/183999/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf ef3fde64aa78598a4c21556629936fb228390e8c
baseline version:
 ovmf 120aa6064465496c962b6664a9365a2573e54d1f

Last test of basis   183991  2023-12-05 03:42:59 Z0 days
Testing same since   183999  2023-12-05 20:44:36 Z0 days1 attempts


People who touched revisions under test:
  Tina Chen 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   120aa60644..ef3fde64aa  ef3fde64aa78598a4c21556629936fb228390e8c -> 
xen-tested-master



Re: [PATCH v6 4/5] [FUTURE] xen/arm: enable vPCI for domUs

2023-12-05 Thread Stefano Stabellini
On Tue, 5 Dec 2023, Stewart Hildebrand wrote:
> On 12/5/23 12:09, Roger Pau Monné wrote:
> > On Tue, Dec 05, 2023 at 11:27:03AM -0500, Stewart Hildebrand wrote:
> >> On 12/5/23 06:08, Roger Pau Monné wrote:
> >>> On Mon, Dec 04, 2023 at 02:07:51PM -0800, Stefano Stabellini wrote:
>  On Mon, 4 Dec 2023, Roger Pau Monné wrote:
> > On Fri, Dec 01, 2023 at 06:56:32PM -0800, Stefano Stabellini wrote:
> >> On Fri, 1 Dec 2023, Roger Pau Monné wrote:
> >>> On Mon, Nov 13, 2023 at 05:21:13PM -0500, Stewart Hildebrand wrote:
>  @@ -1618,6 +1630,14 @@ int iommu_do_pci_domctl(
>   bus = PCI_BUS(machine_sbdf);
>   devfn = PCI_DEVFN(machine_sbdf);
>   
>  +if ( needs_vpci(d) && !has_vpci(d) )
>  +{
>  +printk(XENLOG_G_WARNING "Cannot assign %pp to %pd: vPCI 
>  support not enabled\n",
>  +   _SBDF(seg, bus, devfn), d);
>  +ret = -EPERM;
>  +break;
> >>>
> >>> I think this is likely too restrictive going forward.  The current
> >>> approach is indeed to enable vPCI on a per-domain basis because that's
> >>> how PVH dom0 uses it, due to being unable to use ioreq servers.
> >>>
> >>> If we start to expose vPCI suport to guests the interface should be on
> >>> a per-device basis, so that vPCI could be enabled for some devices,
> >>> while others could still be handled by ioreq servers.
> >>>
> >>> We might want to add a new flag to xen_domctl_assign_device (used by
> >>> XEN_DOMCTL_assign_device) in order to signal whether the device will
> >>> use vPCI.
> >>
> >> Actually I don't think this is a good idea. I am all for flexibility 
> >> but
> >> supporting multiple different configurations comes at an extra cost for
> >> both maintainers and contributors. I think we should try to reduce the
> >> amount of configurations we support rather than increasing them
> >> (especially on x86 where we have PV, PVH, HVM).
> >
> > I think it's perfectly fine to initially require a domain to have all
> > its devices either passed through using vPCI or ireqs, but the
> > interface IMO should allow for such differentiation in the future.
> > That's why I think introducing a domain wide vPCI flag might not be
> > the best option going forward.
> >
> > It would be perfectly fine for XEN_DOMCTL_assign_device to set a
> > domain wide vPCI flag, iow:
> >
> > if ( HYPERCALL_VPCI_FLAG_SET && !has_vpci(d) )
> > {
> > if ( has_arch_pdevs(d) )
> > {
> > printk("All passthrough devices must use the same backend\n");
> > return -EINVAL;
> > }
> >
> > /* Set vPCI domain flag */
> > }
> 
>  That would be fine by me, but maybe we can avoid this change too. I was
>  imagining that vPCI would be enabled at domain creation, not at runtime.
>  And that vPCI would be enabled by default for all PVH guests (once we
>  are past the initial experimental phase.)
> >>>
> >>> Then we don't even need a new CDF flag, and just enable vPCI when
> >>> IOMMU is enabled?  IOW: we can key the enabling of vPCI to
> >>> XEN_DOMCTL_CDF_iommu for specific domain types?
> >>
> >> There are many Arm based platforms that need to use iommu but don't have 
> >> (or don't use) PCI, so we'd still like to have a separate vPCI flag.
> > 
> > OK, read below though - if we switch to vPCI being a descendant of
> > IOREQ (so that the PCI config space decoding is done by IOREQ) we
> > could hotplug vPCI managed devices at runtime without requiring any
> > prior initialization at domain create, since the traps to the PCI
> > config space would be setup by IOREQ.
> > 
> > We might need a PCI flag in order to signal whether the domain is
> > intended to use PCI devices or not, and so whether IOREQ needs to
> > setup PCI config space traps (either fully emulated or passthrough
> > devices).  But that would be arch-specific AFAICT, as on x86 we
> > always trap accesses to the PCI IO ports.
> 
> On Arm, the toolstack (or dom0less creation code) needs to construct a 
> {v,ioreq}PCI root complex node in the device tree before guest boot. 
> Attempting to hot plug such a device tree node at runtime sounds like a goal 
> for the (far) future. On Arm, we don't trap the {v,ioreq}PCI config space by 
> default, so, yes, we for sure would need such a {v,ioreq}PCI flag for setting 
> up the {v,ioreq}PCI mmio handlers if we go this route.

Yes and also dynamic configuration and hotplug are actually detrimental
in safety configurations where you need as much as possible, ideally
everything, to be specified at build time.

Re: [RFC PATCH] xen/arm: Add emulation of Debug Data Transfer Registers

2023-12-05 Thread Stefano Stabellini
On Tue, 5 Dec 2023, Julien Grall wrote:
> Hi Ayan,
> 
> On 05/12/2023 12:50, Ayan Kumar Halder wrote:
> > Hi Julien/All,
> > 
> > On 05/12/2023 11:02, Michal Orzel wrote:
> > > 
> > > On 05/12/2023 11:42, Julien Grall wrote:
> > > > 
> > > > On 05/12/2023 10:30, Michal Orzel wrote:
> > > > > 
> > > > > On 05/12/2023 11:01, Julien Grall wrote:
> > > > > > 
> > > > > > On 05/12/2023 09:28, Michal Orzel wrote:
> > > > > > > Hi Julien,
> > > > > > > 
> > > > > > > On 04/12/2023 20:55, Julien Grall wrote:
> > > > > > > > 
> > > > > > > > On 04/12/2023 13:02, Ayan Kumar Halder wrote:
> > > > > > > > > On 04/12/2023 10:31, Julien Grall wrote:
> > > > > > > > > > Hi Ayan,
> > > > > > > > > Hi Julien,
> > > > > > > > > > On 01/12/2023 18:50, Ayan Kumar Halder wrote:
> > > > > > > > > > > Currently if user enables HVC_DCC config option in Linux,
> > > > > > > > > > > it invokes
> > > > > > > > > > > access to debug data transfer registers (ie DBGDTRTX_EL0
> > > > > > > > > > > on arm64,
> > > > > > > > > > > DBGDTRTXINT on arm32). As these registers are not
> > > > > > > > > > > emulated, Xen injects
> > > > > > > > > > > an undefined exception to the guest. And Linux crashes.
> > > > > > > > > > I am missing some data points here to be able to say whether
> > > > > > > > > > I would
> > > > > > > > > > be ok with emulating the registers. So some questions:
> > > > > > > > > >  * As you wrote below, HVC_DCC will return -ENODEV after
> > > > > > > > > > this
> > > > > > > > > > emulation. So may I ask what's the use case to enable it?
> > > > > > > > > > (e.g. is
> > > > > > > > > > there a distro turning this on?)
> > > > > > > > > I am not aware of any distro using (or not using) this
> > > > > > > > > feature. This
> > > > > > > > > issue came to light during our internal testing, when HVC_DCC
> > > > > > > > > was
> > > > > > > > > enabled to use the debug console. When Linux runs without Xen,
> > > > > > > > > then we
> > > > > > > > > could observe the logs on the debug console. When Linux was
> > > > > > > > > running as a
> > > > > > > > > VM, it crashed.
> > > > > > > > > 
> > > > > > > > > My intention here was to do the bare minimum emulation so that
> > > > > > > > > the crash
> > > > > > > > > could be avoided.
> > > > > > > > This reminds me a bit the discussion around "xen/arm64: Decode
> > > > > > > > ldr/str
> > > > > > > > post increment operations". I don't want Xen to contain
> > > > > > > > half-backed
> > > > > > > > emulation just to please an OS in certain configuration that
> > > > > > > > doesn't
> > > > > > > > seem to be often used.
> > > > > > > > 
> > > > > > > > Also, AFAICT, KVM is in the same situation...
> > > > > > > Well, KVM is not in the same situation. It emulates all DCC regs
> > > > > > > as RAZ/WI, so there
> > > > > > > will be no fault on an attempt to access the DCC.
> > > > > > Does this mean a guest will think the JTAG is availabe?
> > > > > Yes, it will believe the DCC is available which is not a totally bad
> > > > > idea. Yes, it will not have
> > > > > any effect but at least covers the polling loop. The solution proposed
> > > > > here sounds better but does not take
> > > > > into account the busy while loop when sending the char. Linux DCC
> > > > > earlycon does not make an initial check that runtime
> > > > > driver does and will keep waiting in the loop if TXfull is set.
> > > > > Emulating everything as RAZ/WI solves that.
> > > > > As you can see, each solution has its flaws and depends on the OS
> > > > > implementation.
> > > > Right, which prove my earlier point. You are providing an emulation just
> > > > to please a specific driver in Linux (not even the whole Linux). This is
> > > > what I was the most concern of.
> > I have sent out a patch ("[PATCH] tty: hvc: dcc: Check for TXfull condition
> > while setting up early console") to fix this.
> > > > 
> > > > So ...
> > > > 
> > > > > > > In general, I think that if a register is not optional and does
> > > > > > > not depend on other register
> > > > > > > to be checked first (e.g. a feature/control register we emulate),
> > > > > > > which implies that it is fully ok for a guest to
> > > > > > > access it directly - we should at least try to do something not to
> > > > > > > crash a guest.
> > > > > > This is where we have opposing opinion. I view crashing a guest
> > > > > > better
> > > > > > than providing a wrong emulation because it gives a clear signal
> > > > > > that
> > > > > > the register they are trying to access will not function properly.
> > > > > > 
> > > > > > We had this exact same discussion a few years ago when Linux started
> > > > > > to
> > > > > > access GIC*_ACTIVER registers. I know that Stefano was for emulating
> > > > > > them as RAZ but this had consequences on the domain side (Linux
> > > > > > sometimes need to read them). We settled on printing a warning which
> > > > > > is
> > > > > > not great but better than claiming we properly emulate the register.
> > > > > 

Re: [PATCH] CI: Fix fallout from adding elfutils-dev to the build container

2023-12-05 Thread Stefano Stabellini
On Tue, 5 Dec 2023, Andrew Cooper wrote:
> Commit 948e03303138 ("automation/alpine: add elfutils-dev") had an unintended
> consequence of causing Qemu to gain a runtime dependency on libdw.so
> 
> The {adl,zen3p}-pci-hvm-x86-64-gcc-debug tests, which are the only two tests
> that run the built Qemu, started failing with:
> 
>   Error loading shared library libdw.so.1: No such file or directory (needed 
> by /usr/local/lib/xen/bin/qemu-system-i386)
>   Error relocating /usr/local/lib/xen/bin/qemu-system-i386: dwfl_begin: 
> symbol not found
> 
> Update the test container with libelf to cope.
> 
> While editing the runtime dependency list, fix up two other problems causing
> bloat.  texinfo isn't a runtime dependency, and we should be using xz itself,
> not it's development libraries.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Stefano Stabellini 


> ---
> CC: Anthony PERARD 
> CC: Stefano Stabellini 
> CC: Michal Orzel 
> CC: Doug Goldstein 
> CC: Roger Pau Monné 
> 
> I've already pushed the x86 container as part of confirming the fix.
> ---
>  automation/tests-artifacts/alpine/3.18-arm64v8.dockerfile | 3 +--
>  automation/tests-artifacts/alpine/3.18.dockerfile | 4 ++--
>  2 files changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/automation/tests-artifacts/alpine/3.18-arm64v8.dockerfile 
> b/automation/tests-artifacts/alpine/3.18-arm64v8.dockerfile
> index 94f69621f40e..0e5ae7f2b4d8 100644
> --- a/automation/tests-artifacts/alpine/3.18-arm64v8.dockerfile
> +++ b/automation/tests-artifacts/alpine/3.18-arm64v8.dockerfile
> @@ -21,10 +21,9 @@ RUN \
>apk add python3 && \
>apk add zlib && \
>apk add ncurses && \
> -  apk add texinfo && \
>apk add yajl && \
>apk add libaio && \
> -  apk add xz-dev && \
> +  apk add xz && \
>apk add util-linux && \
>apk add argp-standalone && \
>apk add libfdt && \
> diff --git a/automation/tests-artifacts/alpine/3.18.dockerfile 
> b/automation/tests-artifacts/alpine/3.18.dockerfile
> index f1b4a8b7a191..9cde6c9ad4da 100644
> --- a/automation/tests-artifacts/alpine/3.18.dockerfile
> +++ b/automation/tests-artifacts/alpine/3.18.dockerfile
> @@ -22,10 +22,9 @@ RUN \
>apk add python3 && \
>apk add zlib && \
>apk add ncurses && \
> -  apk add texinfo && \
>apk add yajl && \
>apk add libaio && \
> -  apk add xz-dev && \
> +  apk add xz && \
>apk add util-linux && \
>apk add argp-standalone && \
>apk add libfdt && \
> @@ -34,6 +33,7 @@ RUN \
>apk add curl && \
>apk add udev && \
>apk add pciutils && \
> +  apk add libelf && \
>\
># Xen
>cd / && \
> 
> base-commit: ff1178062094837d55ef342070e58316c43a54c9
> prerequisite-patch-id: 477e3af5692ee0daa13d795fdf78384be604fd66
> prerequisite-patch-id: 60d13b1c04d8a808a42d20b3432270cfd87a47fc
> prerequisite-patch-id: 457b56a295e75d2d9f837b44cd483812ca66cd85
> -- 
> 2.30.2
> 

xen | Successful pipeline for staging | 01da0aee

2023-12-05 Thread GitLab


Pipeline #1096532827 has passed!

Project: xen ( https://gitlab.com/xen-project/xen )
Branch: staging ( https://gitlab.com/xen-project/xen/-/commits/staging )

Commit: 01da0aee ( 
https://gitlab.com/xen-project/xen/-/commit/01da0aeecd41435cea8bd2fe0f547e0a474f6e45
 )
Commit Message: ns16550: remove partial explicit initializer

T...
Commit Author: Nicola Vetrini
Committed by: Andrew Cooper ( https://gitlab.com/andyhhp )



Pipeline #1096532827 ( 
https://gitlab.com/xen-project/xen/-/pipelines/1096532827 ) triggered by Ganis 
( https://gitlab.com/ganis )
successfully completed 132 jobs in 3 stages.

-- 
You're receiving this email because of your account on gitlab.com.





[PATCH] CI: Fix fallout from adding elfutils-dev to the build container

2023-12-05 Thread Andrew Cooper
Commit 948e03303138 ("automation/alpine: add elfutils-dev") had an unintended
consequence of causing Qemu to gain a runtime dependency on libdw.so

The {adl,zen3p}-pci-hvm-x86-64-gcc-debug tests, which are the only two tests
that run the built Qemu, started failing with:

  Error loading shared library libdw.so.1: No such file or directory (needed by 
/usr/local/lib/xen/bin/qemu-system-i386)
  Error relocating /usr/local/lib/xen/bin/qemu-system-i386: dwfl_begin: symbol 
not found

Update the test container with libelf to cope.

While editing the runtime dependency list, fix up two other problems causing
bloat.  texinfo isn't a runtime dependency, and we should be using xz itself,
not it's development libraries.

Signed-off-by: Andrew Cooper 
---
CC: Anthony PERARD 
CC: Stefano Stabellini 
CC: Michal Orzel 
CC: Doug Goldstein 
CC: Roger Pau Monné 

I've already pushed the x86 container as part of confirming the fix.
---
 automation/tests-artifacts/alpine/3.18-arm64v8.dockerfile | 3 +--
 automation/tests-artifacts/alpine/3.18.dockerfile | 4 ++--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/automation/tests-artifacts/alpine/3.18-arm64v8.dockerfile 
b/automation/tests-artifacts/alpine/3.18-arm64v8.dockerfile
index 94f69621f40e..0e5ae7f2b4d8 100644
--- a/automation/tests-artifacts/alpine/3.18-arm64v8.dockerfile
+++ b/automation/tests-artifacts/alpine/3.18-arm64v8.dockerfile
@@ -21,10 +21,9 @@ RUN \
   apk add python3 && \
   apk add zlib && \
   apk add ncurses && \
-  apk add texinfo && \
   apk add yajl && \
   apk add libaio && \
-  apk add xz-dev && \
+  apk add xz && \
   apk add util-linux && \
   apk add argp-standalone && \
   apk add libfdt && \
diff --git a/automation/tests-artifacts/alpine/3.18.dockerfile 
b/automation/tests-artifacts/alpine/3.18.dockerfile
index f1b4a8b7a191..9cde6c9ad4da 100644
--- a/automation/tests-artifacts/alpine/3.18.dockerfile
+++ b/automation/tests-artifacts/alpine/3.18.dockerfile
@@ -22,10 +22,9 @@ RUN \
   apk add python3 && \
   apk add zlib && \
   apk add ncurses && \
-  apk add texinfo && \
   apk add yajl && \
   apk add libaio && \
-  apk add xz-dev && \
+  apk add xz && \
   apk add util-linux && \
   apk add argp-standalone && \
   apk add libfdt && \
@@ -34,6 +33,7 @@ RUN \
   apk add curl && \
   apk add udev && \
   apk add pciutils && \
+  apk add libelf && \
   \
   # Xen
   cd / && \

base-commit: ff1178062094837d55ef342070e58316c43a54c9
prerequisite-patch-id: 477e3af5692ee0daa13d795fdf78384be604fd66
prerequisite-patch-id: 60d13b1c04d8a808a42d20b3432270cfd87a47fc
prerequisite-patch-id: 457b56a295e75d2d9f837b44cd483812ca66cd85
-- 
2.30.2




xen | Failed pipeline for staging | 01da0aee

2023-12-05 Thread GitLab


Pipeline #1096532827 has failed!

Project: xen ( https://gitlab.com/xen-project/xen )
Branch: staging ( https://gitlab.com/xen-project/xen/-/commits/staging )

Commit: 01da0aee ( 
https://gitlab.com/xen-project/xen/-/commit/01da0aeecd41435cea8bd2fe0f547e0a474f6e45
 )
Commit Message: ns16550: remove partial explicit initializer

T...
Commit Author: Nicola Vetrini
Committed by: Andrew Cooper ( https://gitlab.com/andyhhp )


Pipeline #1096532827 ( 
https://gitlab.com/xen-project/xen/-/pipelines/1096532827 ) triggered by Ganis 
( https://gitlab.com/ganis )
had 2 failed jobs.

Job #5687727963 ( https://gitlab.com/xen-project/xen/-/jobs/5687727963/raw )

Stage: test
Name: zen3p-pci-hvm-x86-64-gcc-debug
Job #5687727945 ( https://gitlab.com/xen-project/xen/-/jobs/5687727945/raw )

Stage: test
Name: adl-pci-hvm-x86-64-gcc-debug

-- 
You're receiving this email because of your account on gitlab.com.





Re: [PATCH v2] x86/DMI: adjustments to comply with Misra C:2012 Rule 9.3

2023-12-05 Thread Stefano Stabellini
On Tue, 5 Dec 2023, Jan Beulich wrote:
> The rule demands that all array elements be initialized (or dedicated
> initializers be used). Introduce a small set of macros to allow doing so
> without unduly affecting use sites (in particular in terms of how many
> elements .matches[] actually has; right now there's no use of
> DMI_MATCH4(), so we could even consider reducing the array size to 3).
> 
> Signed-off-by: Jan Beulich 

Reviewed-by: Stefano Stabellini 




xen | Failed pipeline for staging | 01da0aee

2023-12-05 Thread GitLab


Pipeline #1096532827 has failed!

Project: xen ( https://gitlab.com/xen-project/xen )
Branch: staging ( https://gitlab.com/xen-project/xen/-/commits/staging )

Commit: 01da0aee ( 
https://gitlab.com/xen-project/xen/-/commit/01da0aeecd41435cea8bd2fe0f547e0a474f6e45
 )
Commit Message: ns16550: remove partial explicit initializer

T...
Commit Author: Nicola Vetrini
Committed by: Andrew Cooper ( https://gitlab.com/andyhhp )


Pipeline #1096532827 ( 
https://gitlab.com/xen-project/xen/-/pipelines/1096532827 ) triggered by Ganis 
( https://gitlab.com/ganis )
had 2 failed jobs.

Job #5687727963 ( https://gitlab.com/xen-project/xen/-/jobs/5687727963/raw )

Stage: test
Name: zen3p-pci-hvm-x86-64-gcc-debug
Job #5687727945 ( https://gitlab.com/xen-project/xen/-/jobs/5687727945/raw )

Stage: test
Name: adl-pci-hvm-x86-64-gcc-debug

-- 
You're receiving this email because of your account on gitlab.com.





xen | Successful pipeline for staging | 948e0330

2023-12-05 Thread GitLab


Pipeline #1096471389 has passed!

Project: xen ( https://gitlab.com/xen-project/xen )
Branch: staging ( https://gitlab.com/xen-project/xen/-/commits/staging )

Commit: 948e0330 ( 
https://gitlab.com/xen-project/xen/-/commit/948e03303138482bf12f67740d8ebd8272824903
 )
Commit Message: automation/alpine: add elfutils-dev

In prepara...
Commit Author: Roger Pau Monne
Committed by: Andrew Cooper ( https://gitlab.com/andyhhp )



Pipeline #1096471389 ( 
https://gitlab.com/xen-project/xen/-/pipelines/1096471389 ) triggered by Ganis 
( https://gitlab.com/ganis )
successfully completed 135 jobs in 3 stages.

-- 
You're receiving this email because of your account on gitlab.com.





Re: [PATCH] xen: update PV-device interface headers

2023-12-05 Thread Stefano Stabellini
On Tue, 4 Dec 2023, Juergen Gross wrote:
> Update the Xen PV-device interface headers in order to avoid undefined
> behavior with flexible arrays being defined with one array element.
> 
> Reported-by: Pry Mar 
> Signed-off-by: Juergen Gross 

Acked-by: Stefano Stabellini 

> ---
>  include/xen/interface/io/displif.h | 2 +-
>  include/xen/interface/io/ring.h| 2 +-
>  include/xen/interface/io/sndif.h   | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/include/xen/interface/io/displif.h 
> b/include/xen/interface/io/displif.h
> index 18417b017869..60e42d3b760e 100644
> --- a/include/xen/interface/io/displif.h
> +++ b/include/xen/interface/io/displif.h
> @@ -537,7 +537,7 @@ struct xendispl_dbuf_create_req {
>  
>  struct xendispl_page_directory {
>   grant_ref_t gref_dir_next_page;
> - grant_ref_t gref[1]; /* Variable length */
> + grant_ref_t gref[];
>  };
>  
>  /*
> diff --git a/include/xen/interface/io/ring.h b/include/xen/interface/io/ring.h
> index ba4c4274b714..4fef1efcdcab 100644
> --- a/include/xen/interface/io/ring.h
> +++ b/include/xen/interface/io/ring.h
> @@ -95,7 +95,7 @@ struct __name##_sring { 
> \
>  RING_IDX req_prod, req_event;   \
>  RING_IDX rsp_prod, rsp_event;   \
>  uint8_t __pad[48];  \
> -union __name##_sring_entry ring[1]; /* variable-length */   \
> +union __name##_sring_entry ring[];  \
>  };  \
>  \
>  /* "Front" end's private variables */   \
> diff --git a/include/xen/interface/io/sndif.h 
> b/include/xen/interface/io/sndif.h
> index 445657cdb1de..b818517588b5 100644
> --- a/include/xen/interface/io/sndif.h
> +++ b/include/xen/interface/io/sndif.h
> @@ -659,7 +659,7 @@ struct xensnd_open_req {
>  
>  struct xensnd_page_directory {
>   grant_ref_t gref_dir_next_page;
> - grant_ref_t gref[1]; /* Variable length */
> + grant_ref_t gref[];
>  };
>  
>  /*
> -- 
> 2.35.3
> 



[xen-unstable-smoke test] 183998: tolerable all pass - PUSHED

2023-12-05 Thread osstest service owner
flight 183998 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/183998/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  948e03303138482bf12f67740d8ebd8272824903
baseline version:
 xen  ff1178062094837d55ef342070e58316c43a54c9

Last test of basis   183994  2023-12-05 10:00:30 Z0 days
Testing same since   183998  2023-12-05 20:02:06 Z0 days1 attempts


People who touched revisions under test:
  Michal Orzel 
  Roger Pau Monne 
  Roger Pau Monné 
  Stefano Stabellini 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   ff11780620..948e033031  948e03303138482bf12f67740d8ebd8272824903 -> smoke



xen | Failed pipeline for staging | 948e0330

2023-12-05 Thread GitLab


Pipeline #1096471389 has failed!

Project: xen ( https://gitlab.com/xen-project/xen )
Branch: staging ( https://gitlab.com/xen-project/xen/-/commits/staging )

Commit: 948e0330 ( 
https://gitlab.com/xen-project/xen/-/commit/948e03303138482bf12f67740d8ebd8272824903
 )
Commit Message: automation/alpine: add elfutils-dev

In prepara...
Commit Author: Roger Pau Monne
Committed by: Andrew Cooper ( https://gitlab.com/andyhhp )


Pipeline #1096471389 ( 
https://gitlab.com/xen-project/xen/-/pipelines/1096471389 ) triggered by Ganis 
( https://gitlab.com/ganis )
had 1 failed job.

Job #5688196651 ( https://gitlab.com/xen-project/xen/-/jobs/5688196651/raw )

Stage: test
Name: zen3p-pci-hvm-x86-64-gcc-debug

-- 
You're receiving this email because of your account on gitlab.com.





xen | Failed pipeline for staging | 948e0330

2023-12-05 Thread GitLab


Pipeline #1096471389 has failed!

Project: xen ( https://gitlab.com/xen-project/xen )
Branch: staging ( https://gitlab.com/xen-project/xen/-/commits/staging )

Commit: 948e0330 ( 
https://gitlab.com/xen-project/xen/-/commit/948e03303138482bf12f67740d8ebd8272824903
 )
Commit Message: automation/alpine: add elfutils-dev

In prepara...
Commit Author: Roger Pau Monne
Committed by: Andrew Cooper ( https://gitlab.com/andyhhp )


Pipeline #1096471389 ( 
https://gitlab.com/xen-project/xen/-/pipelines/1096471389 ) triggered by Ganis 
( https://gitlab.com/ganis )
had 2 failed jobs.

Job #5688196651 ( https://gitlab.com/xen-project/xen/-/jobs/5688196651/raw )

Stage: test
Name: zen3p-pci-hvm-x86-64-gcc-debug
Job #5688242125 ( https://gitlab.com/xen-project/xen/-/jobs/5688242125/raw )

Stage: test
Name: adl-pci-hvm-x86-64-gcc-debug

-- 
You're receiving this email because of your account on gitlab.com.





xen | Failed pipeline for staging | 948e0330

2023-12-05 Thread GitLab


Pipeline #1096471389 has failed!

Project: xen ( https://gitlab.com/xen-project/xen )
Branch: staging ( https://gitlab.com/xen-project/xen/-/commits/staging )

Commit: 948e0330 ( 
https://gitlab.com/xen-project/xen/-/commit/948e03303138482bf12f67740d8ebd8272824903
 )
Commit Message: automation/alpine: add elfutils-dev

In prepara...
Commit Author: Roger Pau Monne
Committed by: Andrew Cooper ( https://gitlab.com/andyhhp )


Pipeline #1096471389 ( 
https://gitlab.com/xen-project/xen/-/pipelines/1096471389 ) triggered by Ganis 
( https://gitlab.com/ganis )
had 2 failed jobs.

Job #5688196651 ( https://gitlab.com/xen-project/xen/-/jobs/5688196651/raw )

Stage: test
Name: zen3p-pci-hvm-x86-64-gcc-debug
Job #5688242125 ( https://gitlab.com/xen-project/xen/-/jobs/5688242125/raw )

Stage: test
Name: adl-pci-hvm-x86-64-gcc-debug

-- 
You're receiving this email because of your account on gitlab.com.





xen | Failed pipeline for staging | 948e0330

2023-12-05 Thread GitLab


Pipeline #1096471389 has failed!

Project: xen ( https://gitlab.com/xen-project/xen )
Branch: staging ( https://gitlab.com/xen-project/xen/-/commits/staging )

Commit: 948e0330 ( 
https://gitlab.com/xen-project/xen/-/commit/948e03303138482bf12f67740d8ebd8272824903
 )
Commit Message: automation/alpine: add elfutils-dev

In prepara...
Commit Author: Roger Pau Monne
Committed by: Andrew Cooper ( https://gitlab.com/andyhhp )


Pipeline #1096471389 ( 
https://gitlab.com/xen-project/xen/-/pipelines/1096471389 ) triggered by Ganis 
( https://gitlab.com/ganis )
had 2 failed jobs.

Job #5688197051 ( https://gitlab.com/xen-project/xen/-/jobs/5688197051/raw )

Stage: test
Name: adl-pci-hvm-x86-64-gcc-debug
Job #5688196651 ( https://gitlab.com/xen-project/xen/-/jobs/5688196651/raw )

Stage: test
Name: zen3p-pci-hvm-x86-64-gcc-debug

-- 
You're receiving this email because of your account on gitlab.com.





xen | Failed pipeline for staging | 948e0330

2023-12-05 Thread GitLab


Pipeline #1096471389 has failed!

Project: xen ( https://gitlab.com/xen-project/xen )
Branch: staging ( https://gitlab.com/xen-project/xen/-/commits/staging )

Commit: 948e0330 ( 
https://gitlab.com/xen-project/xen/-/commit/948e03303138482bf12f67740d8ebd8272824903
 )
Commit Message: automation/alpine: add elfutils-dev

In prepara...
Commit Author: Roger Pau Monne
Committed by: Andrew Cooper ( https://gitlab.com/andyhhp )


Pipeline #1096471389 ( 
https://gitlab.com/xen-project/xen/-/pipelines/1096471389 ) triggered by Ganis 
( https://gitlab.com/ganis )
had 2 failed jobs.

Job #5687342759 ( https://gitlab.com/xen-project/xen/-/jobs/5687342759/raw )

Stage: test
Name: zen3p-pci-hvm-x86-64-gcc-debug
Job #5687342742 ( https://gitlab.com/xen-project/xen/-/jobs/5687342742/raw )

Stage: test
Name: adl-pci-hvm-x86-64-gcc-debug

-- 
You're receiving this email because of your account on gitlab.com.





Re: [PATCH] Only compile the hypervisor with -Wdeclaration-after-statement

2023-12-05 Thread Julien Grall

Hi Andrew,

On 05/12/2023 19:49, Andrew Cooper wrote:

On 05/12/2023 6:32 pm, Julien Grall wrote:

diff --git a/xen/Makefile b/xen/Makefile
index ca571103c868..360fb6dcae57 100644
--- a/xen/Makefile
+++ b/xen/Makefile
@@ -393,6 +393,7 @@ CFLAGS-$(CONFIG_CC_SPLIT_SECTIONS) += -ffunction-sections 
-fdata-sections
  
  CFLAGS += -nostdinc -fno-builtin -fno-common

  CFLAGS += -Werror -Wredundant-decls -Wwrite-strings -Wno-pointer-arith
+$(call cc-option-add,CFLAGS,CC,-Wdeclaration-after-statement)


I agree with the direction of the patch, but this should be a plain
CFLAGS +=

All compilers we support, even on x86, know
-Wdeclaration-after-statement so it doesn't need a toolchain-check to
activate.


That's good to know, I wasn't sure and..



https://godbolt.org/z/PM7bb1d55


.. I keep forgetting godbolt.org can help to check old compilers.

I will update the patch in my tree.



With that fixed, Acked-by: Andrew Cooper 


Thanks!

Cheers,

--
Julien Grall



Re: [BUG]i2c_hid_acpi broken with 4.17.2 on Framework Laptop 13 AMD

2023-12-05 Thread Sébastien Chaumat
> [2.464598] amd_gpio AMDI0030:00: failed to enable wake-up interrupt

Is it expected that IRQ7 goes from fasteoi (kernel 6.6.4 ) to
ioapic-edge and IRQ9 to ioapic-level ?

IR-IO-APIC7-fasteoi   pinctrl_amd
IR-IO-APIC9-fasteoi   acpi

to (xen 4.18.0)

xen-pirq -ioapic-edge  pinctrl_amd
xen-pirq -ioapic-level  acpi

?

THX



Re: [XEN PATCH] ns16550: remove partial explicit initializer

2023-12-05 Thread Andrew Cooper
On 05/12/2023 4:31 pm, Nicola Vetrini wrote:
> The initializer of 'ns16550_com' violates MISRA C Rule 9.3
> because it explicitly initializes only the first element of the array,
> but the semantics is the same if the explicit initialization is
> omitted.
>
> No functional change.
>
> Signed-off-by: Nicola Vetrini 

Acked-by: Andrew Cooper 



[libvirt test] 183992: tolerable FAIL - PUSHED

2023-12-05 Thread osstest service owner
flight 183992 libvirt real [real]
flight 183997 libvirt real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/183992/
http://logs.test-lab.xenproject.org/osstest/logs/183997/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-libvirt-qcow2 13 guest-start   fail pass in 183997-retest

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail in 183997 
like 183972
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-check fail in 183997 never 
pass
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 183972
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 183972
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-arm64-arm64-libvirt-qcow2 15 saverestore-support-checkfail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass

version targeted for testing:
 libvirt  f1a950379d856282c661d21005b49edc8e97a152
baseline version:
 libvirt  d8de4f2770fbe5195f92c0e142a1921c2a1d8ed9

Last test of basis   183972  2023-12-02 04:22:45 Z3 days
Testing same since   183992  2023-12-05 04:18:54 Z0 days1 attempts


People who touched revisions under test:
  Göran Uddeborg 
  Michal Privoznik 
  Thanos Makatos 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-arm64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-libvirt-xsm pass
 test-arm64-arm64-libvirt-xsm pass
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-libvirt pass
 test-arm64-arm64-libvirt pass
 test-armhf-armhf-libvirt pass
 test-amd64-i386-libvirt  pass
 test-amd64-amd64-libvirt-pairpass
 test-amd64-i386-libvirt-pair pass
 test-arm64-arm64-libvirt-qcow2   pass
 test-armhf-armhf-libvirt-qcow2   fail
 test-arm64-arm64-libvirt-raw pass
 test-armhf-armhf-libvirt-raw pass
 test-amd64-i386-libvirt-raw  pass
 test-amd64-amd64-libvirt-vhd pass



Re: [PATCH] Only compile the hypervisor with -Wdeclaration-after-statement

2023-12-05 Thread Andrew Cooper
On 05/12/2023 6:32 pm, Julien Grall wrote:
> diff --git a/xen/Makefile b/xen/Makefile
> index ca571103c868..360fb6dcae57 100644
> --- a/xen/Makefile
> +++ b/xen/Makefile
> @@ -393,6 +393,7 @@ CFLAGS-$(CONFIG_CC_SPLIT_SECTIONS) += -ffunction-sections 
> -fdata-sections
>  
>  CFLAGS += -nostdinc -fno-builtin -fno-common
>  CFLAGS += -Werror -Wredundant-decls -Wwrite-strings -Wno-pointer-arith
> +$(call cc-option-add,CFLAGS,CC,-Wdeclaration-after-statement)

I agree with the direction of the patch, but this should be a plain
CFLAGS +=

All compilers we support, even on x86, know
-Wdeclaration-after-statement so it doesn't need a toolchain-check to
activate.

https://godbolt.org/z/PM7bb1d55

With that fixed, Acked-by: Andrew Cooper 



RE: [PATCH v13 26/35] x86/fred: FRED entry/exit and dispatch code

2023-12-05 Thread Li, Xin3
> > +static noinstr void fred_intx(struct pt_regs *regs) {
> > +   switch (regs->fred_ss.vector) {
> > +   /* INT0 */
> 
> INTO (for overflow), not INT-zero.  However...
> 
> > +   case X86_TRAP_OF:
> > +   exc_overflow(regs);
> > +   return;
> > +
> > +   /* INT3 */
> > +   case X86_TRAP_BP:
> > +   exc_int3(regs);
> > +   return;
> 
> ... neither OF nor BP will ever enter fred_intx() because they're type SWEXC 
> not
> SWINT.
> 
> SWINT is strictly the INT $imm8 instruction.
> 
> > ...
> > +static noinstr void fred_extint(struct pt_regs *regs) {
> > +   unsigned int vector = regs->fred_ss.vector;
> > +
> > +   if (WARN_ON_ONCE(vector < FIRST_EXTERNAL_VECTOR))
> > +   return;
> > +
> > +   if (likely(vector >= FIRST_SYSTEM_VECTOR)) {
> > +   irqentry_state_t state = irqentry_enter(regs);
> > +
> > +   instrumentation_begin();
> > +   sysvec_table[vector - FIRST_SYSTEM_VECTOR](regs);
> 
> array_index_mask_nospec()
> 
> This is easy for an attacker to abuse, to install non-function-pointer 
> targets into
> the indirect predictor.
> 
> > +   instrumentation_end();
> > +   irqentry_exit(regs, state);
> > +   } else {
> > +   common_interrupt(regs, vector);
> > +   }
> > +}
> > +
> > +static noinstr void fred_exception(struct pt_regs *regs, unsigned
> > +long error_code) {
> > +   /* Optimize for #PF. That's the only exception which matters performance
> wise */
> > +   if (likely(regs->fred_ss.vector == X86_TRAP_PF)) {
> > +   exc_page_fault(regs, error_code);
> > +   return;
> > +   }
> > +
> > +   switch (regs->fred_ss.vector) {
> > +   case X86_TRAP_DE: return exc_divide_error(regs);
> > +   case X86_TRAP_DB: return fred_exc_debug(regs);
> > +   case X86_TRAP_BP: return exc_int3(regs);
> > +   case X86_TRAP_OF: return exc_overflow(regs);
> 
> Depending on what you want to do with BP/OF vs fred_intx(), this may need
> adjusting.
> 
> If you are cross-checking type and vector, then these should be rejected for 
> not
> being of type HWEXC.
> 
> > +   case X86_TRAP_BR: return exc_bounds(regs);
> > +   case X86_TRAP_UD: return exc_invalid_op(regs);
> > +   case X86_TRAP_NM: return exc_device_not_available(regs);
> > +   case X86_TRAP_DF: return exc_double_fault(regs, error_code);
> > +   case X86_TRAP_TS: return exc_invalid_tss(regs, error_code);
> > +   case X86_TRAP_NP: return exc_segment_not_present(regs, error_code);
> > +   case X86_TRAP_SS: return exc_stack_segment(regs, error_code);
> > +   case X86_TRAP_GP: return exc_general_protection(regs, error_code);
> > +   case X86_TRAP_MF: return exc_coprocessor_error(regs);
> > +   case X86_TRAP_AC: return exc_alignment_check(regs, error_code);
> > +   case X86_TRAP_XF: return exc_simd_coprocessor_error(regs);
> > +
> > +#ifdef CONFIG_X86_MCE
> > +   case X86_TRAP_MC: return fred_exc_machine_check(regs); #endif #ifdef
> > +CONFIG_INTEL_TDX_GUEST
> > +   case X86_TRAP_VE: return exc_virtualization_exception(regs);
> > +#endif
> > +#ifdef CONFIG_X86_KERNEL_IBT
> 
> CONFIG_X86_CET
> 
> Userspace can use CET even if the kernel isn't compiled with IBT, so this
> exception needs handling.
> 
> > +   case X86_TRAP_CP: return exc_control_protection(regs, error_code);
> > +#endif
> > +   default: return fred_bad_type(regs, error_code);
> > +   }
> > +}
> > +
> > +__visible noinstr void fred_entry_from_user(struct pt_regs *regs) {
> > +   unsigned long error_code = regs->orig_ax;
> > +
> > +   /* Invalidate orig_ax so that syscall_get_nr() works correctly */
> > +   regs->orig_ax = -1;
> > +
> > +   switch (regs->fred_ss.type) {
> > +   case EVENT_TYPE_EXTINT:
> > +   return fred_extint(regs);
> > +   case EVENT_TYPE_NMI:
> > +   return fred_exc_nmi(regs);
> > +   case EVENT_TYPE_SWINT:
> > +   return fred_intx(regs);
> > +   case EVENT_TYPE_HWEXC:
> > +   case EVENT_TYPE_SWEXC:
> > +   case EVENT_TYPE_PRIV_SWEXC:
> > +   return fred_exception(regs, error_code);
> 
> PRIV_SWEXC should have it's own function and not fall into fred_exception().
> 
> It is strictly only the ICEBP (INT1) instruction at the moment, so should 
> fall into
> bad_type() for any vector other than X86_TRAP_DB.
> 
> > +   case EVENT_TYPE_OTHER:
> > +   return fred_other(regs);
> > +   default:
> > +   return fred_bad_type(regs, error_code);
> > +   }
> > +}
> 
> ~Andrew


Thanks a lot for your quick review, will address soon.
Xin


Re: [PATCH v6 4/5] [FUTURE] xen/arm: enable vPCI for domUs

2023-12-05 Thread Stewart Hildebrand
On 12/5/23 06:08, Roger Pau Monné wrote:
> On Mon, Dec 04, 2023 at 02:07:51PM -0800, Stefano Stabellini wrote:
>> On Mon, 4 Dec 2023, Roger Pau Monné wrote:
>>> On Fri, Dec 01, 2023 at 06:56:32PM -0800, Stefano Stabellini wrote:
 On Fri, 1 Dec 2023, Roger Pau Monné wrote:
> On Mon, Nov 13, 2023 at 05:21:13PM -0500, Stewart Hildebrand wrote:
>> @@ -1618,6 +1630,14 @@ int iommu_do_pci_domctl(
>>  bus = PCI_BUS(machine_sbdf);
>>  devfn = PCI_DEVFN(machine_sbdf);
>>  
>> +if ( needs_vpci(d) && !has_vpci(d) )
>> +{
>> +printk(XENLOG_G_WARNING "Cannot assign %pp to %pd: vPCI 
>> support not enabled\n",
>> +   _SBDF(seg, bus, devfn), d);
>> +ret = -EPERM;
>> +break;
>
> I think this is likely too restrictive going forward.  The current
> approach is indeed to enable vPCI on a per-domain basis because that's
> how PVH dom0 uses it, due to being unable to use ioreq servers.
>
> If we start to expose vPCI suport to guests the interface should be on
> a per-device basis, so that vPCI could be enabled for some devices,
> while others could still be handled by ioreq servers.
>
> We might want to add a new flag to xen_domctl_assign_device (used by
> XEN_DOMCTL_assign_device) in order to signal whether the device will
> use vPCI.

 Actually I don't think this is a good idea. I am all for flexibility but
 supporting multiple different configurations comes at an extra cost for
 both maintainers and contributors. I think we should try to reduce the
 amount of configurations we support rather than increasing them
 (especially on x86 where we have PV, PVH, HVM).
>>>
>>> I think it's perfectly fine to initially require a domain to have all
>>> its devices either passed through using vPCI or ireqs, but the
>>> interface IMO should allow for such differentiation in the future.
>>> That's why I think introducing a domain wide vPCI flag might not be
>>> the best option going forward.
>>>
>>> It would be perfectly fine for XEN_DOMCTL_assign_device to set a
>>> domain wide vPCI flag, iow:
>>>
>>> if ( HYPERCALL_VPCI_FLAG_SET && !has_vpci(d) )
>>> {
>>> if ( has_arch_pdevs(d) )
>>> {
>>> printk("All passthrough devices must use the same backend\n");
>>> return -EINVAL;
>>> }
>>>
>>> /* Set vPCI domain flag */
>>> }
>>
>> That would be fine by me, but maybe we can avoid this change too. I was
>> imagining that vPCI would be enabled at domain creation, not at runtime.
>> And that vPCI would be enabled by default for all PVH guests (once we
>> are past the initial experimental phase.)
> 
> Then we don't even need a new CDF flag, and just enable vPCI when
> IOMMU is enabled?  IOW: we can key the enabling of vPCI to
> XEN_DOMCTL_CDF_iommu for specific domain types?
> 
> Maybe that's not so trivial on x86, as there's no x86 PVH domain type
> from the hypervisor PoV.
> 
>>
>>> We have already agreed that we want to aim for a setup where ioreqs
>>> and vPCI could be used for the same domain, but I guess you assumed
>>> that ioreqs would be used for emulated devices exclusively and vPCI
>>> for passthrough devices?
>>
>> Yes, that's right
>>
>>
>>> Overall if we agree that ioreqs and vPCI should co-exist for a domain,
>>> I'm not sure there's much reason to limit ioreqs to only handle
>>> emulated devices, seems like an arbitrary limitation.
>>
>> Reply below
>>
>>
 I don't think we should enable IOREQ servers to handle PCI passthrough
 for PVH guests and/or guests with vPCI. If the domain has vPCI, PCI
 Passthrough can be handled by vPCI just fine. I think this should be a
 good anti-feature to have (a goal to explicitly not add this feature) to
 reduce complexity. Unless you see a specific usecase to add support for
 it?
>>>
>>> There are passthrough devices (GPUs) that might require some extra
>>> mediation on dom0 (like the Intel GVT-g thing) that would force the
>>> usage of ioreqs to passthrough.
>>
>> From an architectural perspective, I think it would be cleaner, simpler
>> to maintain, and simpler to understand if Xen was the sole owner of the
>> PCI Root Complex and PCI config space mediation (implemented with vPCI).
>> IOREQ can be used for emulation and it works very well for that. At
>> least in my mind, that makes things much simpler.
> 
> But IOREQ already has all the code to mediate accesses to the PCI
> config space, and the interface to register separate servers for
> different PCI devices.
> 
> We would then need to duplicate this internally for vPCI, so that vPCI
> could forward accesses to IOREQ just for IOREQ to forward to yet a
> different component?  Seems like a lot of duplication for no benefit.
> 
>> I understand there are non-trivial cases, like virtual GPUs with
>> hardware access, but I don't classify those as passthrough. That's
>> because there isn't 

Re: [PATCH] Config.mk: drop -Wdeclaration-after-statement

2023-12-05 Thread Julien Grall

Hi Alexander.

On 04/12/2023 09:28, Alexander Kanavin wrote:

On 12/1/23 20:14, Julien Grall wrote:


So I agree that if we were to remove -Wdeclaration-after-statement 
then we should also update the CODING_STYLE. However, I am not 
entirely sure I would want to mix code and declaration in the hypervisor.


Anyway, I think this is a separate discussion from resolving the 
immediate problem (i.e. building the python bindings).


So for now, I think it would make sense to push the 
-Wdeclaration-after-statement to the tools.


@Alexander, are you going to send a new version? If not, I would be 
happy to do it.


Please do it, as in the meantime, my attention has focused entirely 
elsewhere, so I'd have to switch context and find time to study the xen 
source. I don't have specific interest in xen, the reason I looked into 
it is that we're updating python to 3.12 in yocto and this one was one 
of the many issues that came up all over the userspace stack.


Thanks, I have sent a patch [1]. I decided to add a Reported-by tag 
rather than Signed-off-by on my version. I hope this is fine.


Cheers,

[1] https://lore.kernel.org/xen-devel/20231205183226.26636-1-jul...@xen.org/






--
Julien Grall



[PATCH v2 14/14] block: remove outdated AioContext locking comments

2023-12-05 Thread Stefan Hajnoczi
The AioContext lock no longer exists.

There is one noteworthy change:

  - * More specifically, these functions use BDRV_POLL_WHILE(bs), which
  - * requires the caller to be either in the main thread and hold
  - * the BlockdriverState (bs) AioContext lock, or directly in the
  - * home thread that runs the bs AioContext. Calling them from
  - * another thread in another AioContext would cause deadlocks.
  + * More specifically, these functions use BDRV_POLL_WHILE(bs), which requires
  + * the caller to be either in the main thread or directly in the home thread
  + * that runs the bs AioContext. Calling them from another thread in another
  + * AioContext would cause deadlocks.

I am not sure whether deadlocks are still possible. Maybe they have just
moved to the fine-grained locks that have replaced the AioContext. Since
I am not sure if the deadlocks are gone, I have kept the substance
unchanged and just removed mention of the AioContext.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
---
 include/block/block-common.h |  3 --
 include/block/block-io.h |  9 ++--
 include/block/block_int-common.h |  2 -
 block.c  | 73 ++--
 block/block-backend.c|  8 ---
 block/export/vhost-user-blk-server.c |  4 --
 tests/qemu-iotests/202   |  2 +-
 tests/qemu-iotests/203   |  3 +-
 8 files changed, 22 insertions(+), 82 deletions(-)

diff --git a/include/block/block-common.h b/include/block/block-common.h
index d7599564db..a846023a09 100644
--- a/include/block/block-common.h
+++ b/include/block/block-common.h
@@ -70,9 +70,6 @@
  * automatically takes the graph rdlock when calling the wrapped function. In
  * the same way, no_co_wrapper_bdrv_wrlock functions automatically take the
  * graph wrlock.
- *
- * If the first parameter of the function is a BlockDriverState, BdrvChild or
- * BlockBackend pointer, the AioContext lock for it is taken in the wrapper.
  */
 #define no_co_wrapper
 #define no_co_wrapper_bdrv_rdlock
diff --git a/include/block/block-io.h b/include/block/block-io.h
index 8eb39a858b..b49e0537dd 100644
--- a/include/block/block-io.h
+++ b/include/block/block-io.h
@@ -332,11 +332,10 @@ bdrv_co_copy_range(BdrvChild *src, int64_t src_offset,
  * "I/O or GS" API functions. These functions can run without
  * the BQL, but only in one specific iothread/main loop.
  *
- * More specifically, these functions use BDRV_POLL_WHILE(bs), which
- * requires the caller to be either in the main thread and hold
- * the BlockdriverState (bs) AioContext lock, or directly in the
- * home thread that runs the bs AioContext. Calling them from
- * another thread in another AioContext would cause deadlocks.
+ * More specifically, these functions use BDRV_POLL_WHILE(bs), which requires
+ * the caller to be either in the main thread or directly in the home thread
+ * that runs the bs AioContext. Calling them from another thread in another
+ * AioContext would cause deadlocks.
  *
  * Therefore, these functions are not proper I/O, because they
  * can't run in *any* iothreads, but only in a specific one.
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 4e31d161c5..151279d481 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -1192,8 +1192,6 @@ struct BlockDriverState {
 /* The error object in use for blocking operations on backing_hd */
 Error *backing_blocker;
 
-/* Protected by AioContext lock */
-
 /*
  * If we are reading a disk image, give its size in sectors.
  * Generally read-only; it is written to by load_snapshot and
diff --git a/block.c b/block.c
index 434b7f4d72..a097772238 100644
--- a/block.c
+++ b/block.c
@@ -1616,11 +1616,6 @@ out:
 g_free(gen_node_name);
 }
 
-/*
- * The caller must always hold @bs AioContext lock, because this function calls
- * bdrv_refresh_total_sectors() which polls when called from non-coroutine
- * context.
- */
 static int no_coroutine_fn GRAPH_UNLOCKED
 bdrv_open_driver(BlockDriverState *bs, BlockDriver *drv, const char *node_name,
  QDict *options, int open_flags, Error **errp)
@@ -2901,7 +2896,7 @@ uint64_t bdrv_qapi_perm_to_blk_perm(BlockPermission 
qapi_perm)
  * Replaces the node that a BdrvChild points to without updating permissions.
  *
  * If @new_bs is non-NULL, the parent of @child must already be drained through
- * @child and the caller must hold the AioContext lock for @new_bs.
+ * @child.
  */
 static void GRAPH_WRLOCK
 bdrv_replace_child_noperm(BdrvChild *child, BlockDriverState *new_bs)
@@ -3041,9 +3036,8 @@ static TransactionActionDrv bdrv_attach_child_common_drv 
= {
  *
  * Returns new created child.
  *
- * The caller must hold the AioContext lock for @child_bs. Both @parent_bs and
- * @child_bs can move to a different AioContext in this function. Callers must
- * make sure that their AioContext locking is still correct after 

[PATCH] Only compile the hypervisor with -Wdeclaration-after-statement

2023-12-05 Thread Julien Grall
From: Julien Grall 

Right now, all tools and hypervisor will be complied with the option
-Wdeclaration-after-statement. While most of the code in the hypervisor
is controlled by us, for tools we may import external libraries.

The build will fail if one of them are using the construct we are
trying to prevent. This is the case when building against Python 3.12
and Yocto:

| In file included from 
/srv/storage/alex/yocto/build-virt/tmp/work/core2-64-poky-linux/xen-tools/4.17+stable/recipe-sysroot/usr/include/python3.12/Python.h:44,
|  from xen/lowlevel/xc/xc.c:8:
| 
/srv/storage/alex/yocto/build-virt/tmp/work/core2-64-poky-linux/xen-tools/4.17+stable/recipe-sysroot/usr/include/python3.12/object.h:
 In function 'Py_SIZE':
| 
/srv/storage/alex/yocto/build-virt/tmp/work/core2-64-poky-linux/xen-tools/4.17+stable/recipe-sysroot/usr/include/python3.12/object.h:233:5:
 error: ISO C90 forbids mixed declarations and code 
[-Werror=declaration-after-statement]
|   233 | PyVarObject *var_ob = _PyVarObject_CAST(ob);
|   | ^~~
| In file included from 
/srv/storage/alex/yocto/build-virt/tmp/work/core2-64-poky-linux/xen-tools/4.17+stable/recipe-sysroot/usr/include/python3.12/Python.h:53:
| 
/srv/storage/alex/yocto/build-virt/tmp/work/core2-64-poky-linux/xen-tools/4.17+stable/recipe-sysroot/usr/include/python3.12/cpython/longintrepr.h:
 In function '_PyLong_CompactValue':
| 
/srv/storage/alex/yocto/build-virt/tmp/work/core2-64-poky-linux/xen-tools/4.17+stable/recipe-sysroot/usr/include/python3.12/cpython/longintrepr.h:121:5:
 error: ISO C90 forbids mixed declarations and code 
[-Werror=declaration-after-statement]
|   121 | Py_ssize_t sign = 1 - (op->long_value.lv_tag & _PyLong_SIGN_MASK);
|   | ^~
| cc1: all warnings being treated as errors

Looking at the tools directory, a fair few directory already add
-Wno-declaration-after-statement to inhibit the default behavior.

We have always build the hypervisor with the flag, so for now remove
only the flag for anything but the hypervisor. We can decide at later
time whether we want to relax.

Also remove the -Wno-declaration-after-statement in some subdirectory
as the flag is now unnecessary.

Part of the commit message was take from Alexander's first proposal:

Link: 
https://lore.kernel.org/xen-devel/20231128174729.3880113-1-a...@linutronix.de/
Reported-by: Alexander Kanavin 
Signed-off-by: Julien Grall 

---

I think the decision to remove the flag for the hypervisor is a separate
discussion. Personally I am on the fence.

We could also re-enable the flags to some of the tools directory
if wanted. I chose the most convenience approach for now.
---
 Config.mk   | 2 --
 stubdom/Makefile| 2 +-
 stubdom/vtpmmgr/Makefile| 2 +-
 tools/libs/light/Makefile   | 3 +--
 tools/libs/util/Makefile| 3 +--
 tools/tests/depriv/Makefile | 2 --
 tools/xl/Makefile   | 3 +--
 xen/Makefile| 1 +
 8 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/Config.mk b/Config.mk
index 2c43702958eb..7e67b91de293 100644
--- a/Config.mk
+++ b/Config.mk
@@ -177,8 +177,6 @@ CFLAGS += -std=gnu99
 
 CFLAGS += -Wall -Wstrict-prototypes
 
-$(call cc-option-add,HOSTCFLAGS,HOSTCC,-Wdeclaration-after-statement)
-$(call cc-option-add,CFLAGS,CC,-Wdeclaration-after-statement)
 $(call cc-option-add,CFLAGS,CC,-Wno-unused-but-set-variable)
 $(call cc-option-add,CFLAGS,CC,-Wno-unused-local-typedefs)
 
diff --git a/stubdom/Makefile b/stubdom/Makefile
index 71c9b2200e68..8c503c2bf8de 100644
--- a/stubdom/Makefile
+++ b/stubdom/Makefile
@@ -245,7 +245,7 @@ tpm_emulator-$(XEN_TARGET_ARCH): 
tpm_emulator-$(TPMEMU_VERSION).tar.gz
patch -d $@ -p1 < vtpm-command-duration.patch
patch -d $@ -p1 < vtpm-tpm_bn_t-addr.patch
mkdir $@/build
-   cd $@/build; CC=${CC} $(CMAKE) .. -DCMAKE_C_FLAGS:STRING="-std=c99 
-DTPM_NO_EXTERN $(TARGET_CPPFLAGS) $(TARGET_CFLAGS) 
-Wno-declaration-after-statement"
+   cd $@/build; CC=${CC} $(CMAKE) .. -DCMAKE_C_FLAGS:STRING="-std=c99 
-DTPM_NO_EXTERN $(TARGET_CPPFLAGS) $(TARGET_CFLAGS)"
touch $@
 
 TPMEMU_STAMPFILE=$(CROSS_ROOT)/$(GNU_TARGET_ARCH)-xen-elf/lib/libtpm.a
diff --git a/stubdom/vtpmmgr/Makefile b/stubdom/vtpmmgr/Makefile
index 6dae034a0778..c29bb498381c 100644
--- a/stubdom/vtpmmgr/Makefile
+++ b/stubdom/vtpmmgr/Makefile
@@ -17,7 +17,7 @@ OBJS += vtpm_disk.o disk_tpm.o disk_io.o disk_crypto.o 
disk_read.o disk_write.o
 OBJS += mgmt_authority.o
 
 CFLAGS+=-Werror -Iutil -Icrypto -Itcs
-CFLAGS+=-Wno-declaration-after-statement -Wno-unused-label
+CFLAGS+=-Wno-unused-label
 
 build: $(TARGET)
 $(TARGET): $(OBJS)
diff --git a/tools/libs/light/Makefile b/tools/libs/light/Makefile
index ba4c1b79336f..37e4d1670986 100644
--- a/tools/libs/light/Makefile
+++ b/tools/libs/light/Makefile
@@ -38,8 +38,7 @@ vpath static_tables.c $(ACPI_PATH)/
 
 OBJS-$(CONFIG_X86) += $(ACPI_OBJS)
 
-CFLAGS += -Wno-format-zero-length -Wmissing-declarations \
- 

[PATCH v2 13/14] job: remove outdated AioContext locking comments

2023-12-05 Thread Stefan Hajnoczi
The AioContext lock no longer exists.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
---
 include/qemu/job.h | 20 
 1 file changed, 20 deletions(-)

diff --git a/include/qemu/job.h b/include/qemu/job.h
index e502787dd8..9ea98b5927 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -67,8 +67,6 @@ typedef struct Job {
 
 /**
  * The completion function that will be called when the job completes.
- * Called with AioContext lock held, since many callback implementations
- * use bdrv_* functions that require to hold the lock.
  */
 BlockCompletionFunc *cb;
 
@@ -264,9 +262,6 @@ struct JobDriver {
  *
  * This callback will not be invoked if the job has already failed.
  * If it fails, abort and then clean will be called.
- *
- * Called with AioContext lock held, since many callbacs implementations
- * use bdrv_* functions that require to hold the lock.
  */
 int (*prepare)(Job *job);
 
@@ -277,9 +272,6 @@ struct JobDriver {
  *
  * All jobs will complete with a call to either .commit() or .abort() but
  * never both.
- *
- * Called with AioContext lock held, since many callback implementations
- * use bdrv_* functions that require to hold the lock.
  */
 void (*commit)(Job *job);
 
@@ -290,9 +282,6 @@ struct JobDriver {
  *
  * All jobs will complete with a call to either .commit() or .abort() but
  * never both.
- *
- * Called with AioContext lock held, since many callback implementations
- * use bdrv_* functions that require to hold the lock.
  */
 void (*abort)(Job *job);
 
@@ -301,9 +290,6 @@ struct JobDriver {
  * .commit() or .abort(). Regardless of which callback is invoked after
  * completion, .clean() will always be called, even if the job does not
  * belong to a transaction group.
- *
- * Called with AioContext lock held, since many callbacs implementations
- * use bdrv_* functions that require to hold the lock.
  */
 void (*clean)(Job *job);
 
@@ -318,17 +304,12 @@ struct JobDriver {
  * READY).
  * (If the callback is NULL, the job is assumed to terminate
  * without I/O.)
- *
- * Called with AioContext lock held, since many callback implementations
- * use bdrv_* functions that require to hold the lock.
  */
 bool (*cancel)(Job *job, bool force);
 
 
 /**
  * Called when the job is freed.
- * Called with AioContext lock held, since many callback implementations
- * use bdrv_* functions that require to hold the lock.
  */
 void (*free)(Job *job);
 };
@@ -424,7 +405,6 @@ void job_ref_locked(Job *job);
  * Release a reference that was previously acquired with job_ref_locked() or
  * job_create(). If it's the last reference to the object, it will be freed.
  *
- * Takes AioContext lock internally to invoke a job->driver callback.
  * Called with job lock held.
  */
 void job_unref_locked(Job *job);
-- 
2.43.0




[PATCH v2 11/14] docs: remove AioContext lock from IOThread docs

2023-12-05 Thread Stefan Hajnoczi
Encourage the use of locking primitives and stop mentioning the
AioContext lock since it is being removed.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
---
 docs/devel/multiple-iothreads.txt | 45 +++
 1 file changed, 15 insertions(+), 30 deletions(-)

diff --git a/docs/devel/multiple-iothreads.txt 
b/docs/devel/multiple-iothreads.txt
index a3e949f6b3..4865196bde 100644
--- a/docs/devel/multiple-iothreads.txt
+++ b/docs/devel/multiple-iothreads.txt
@@ -88,27 +88,18 @@ loop, depending on which AioContext instance the caller 
passes in.
 
 How to synchronize with an IOThread
 ---
-AioContext is not thread-safe so some rules must be followed when using file
-descriptors, event notifiers, timers, or BHs across threads:
+Variables that can be accessed by multiple threads require some form of
+synchronization such as qemu_mutex_lock(), rcu_read_lock(), etc.
 
-1. AioContext functions can always be called safely.  They handle their
-own locking internally.
-
-2. Other threads wishing to access the AioContext must use
-aio_context_acquire()/aio_context_release() for mutual exclusion.  Once the
-context is acquired no other thread can access it or run event loop iterations
-in this AioContext.
-
-Legacy code sometimes nests aio_context_acquire()/aio_context_release() calls.
-Do not use nesting anymore, it is incompatible with the BDRV_POLL_WHILE() macro
-used in the block layer and can lead to hangs.
-
-There is currently no lock ordering rule if a thread needs to acquire multiple
-AioContexts simultaneously.  Therefore, it is only safe for code holding the
-QEMU global mutex to acquire other AioContexts.
+AioContext functions like aio_set_fd_handler(), aio_set_event_notifier(),
+aio_bh_new(), and aio_timer_new() are thread-safe. They can be used to trigger
+activity in an IOThread.
 
 Side note: the best way to schedule a function call across threads is to call
-aio_bh_schedule_oneshot().  No acquire/release or locking is needed.
+aio_bh_schedule_oneshot().
+
+The main loop thread can wait synchronously for a condition using
+AIO_WAIT_WHILE().
 
 AioContext and the block layer
 --
@@ -124,22 +115,16 @@ Block layer code must therefore expect to run in an 
IOThread and avoid using
 old APIs that implicitly use the main loop.  See the "How to program for
 IOThreads" above for information on how to do that.
 
-If main loop code such as a QMP function wishes to access a BlockDriverState
-it must first call aio_context_acquire(bdrv_get_aio_context(bs)) to ensure
-that callbacks in the IOThread do not run in parallel.
-
 Code running in the monitor typically needs to ensure that past
 requests from the guest are completed.  When a block device is running
 in an IOThread, the IOThread can also process requests from the guest
 (via ioeventfd).  To achieve both objects, wrap the code between
 bdrv_drained_begin() and bdrv_drained_end(), thus creating a "drained
-section".  The functions must be called between aio_context_acquire()
-and aio_context_release().  You can freely release and re-acquire the
-AioContext within a drained section.
+section".
 
-Long-running jobs (usually in the form of coroutines) are best scheduled in
-the BlockDriverState's AioContext to avoid the need to acquire/release around
-each bdrv_*() call.  The functions bdrv_add/remove_aio_context_notifier,
-or alternatively blk_add/remove_aio_context_notifier if you use BlockBackends,
-can be used to get a notification whenever bdrv_try_change_aio_context() moves 
a
+Long-running jobs (usually in the form of coroutines) are often scheduled in
+the BlockDriverState's AioContext.  The functions
+bdrv_add/remove_aio_context_notifier, or alternatively
+blk_add/remove_aio_context_notifier if you use BlockBackends, can be used to
+get a notification whenever bdrv_try_change_aio_context() moves a
 BlockDriverState to a different AioContext.
-- 
2.43.0




[PATCH v2 10/14] aio: remove aio_context_acquire()/aio_context_release() API

2023-12-05 Thread Stefan Hajnoczi
Delete these functions because nothing calls these functions anymore.

I introduced these APIs in commit 98563fc3ec44 ("aio: add
aio_context_acquire() and aio_context_release()") in 2014. It's with a
sigh of relief that I delete these APIs almost 10 years later.

Thanks to Paolo Bonzini's vision for multi-queue QEMU, we got an
understanding of where the code needed to go in order to remove the
limitations that the original dataplane and the IOThread/AioContext
approach that followed it.

Emanuele Giuseppe Esposito had the splendid determination to convert
large parts of the codebase so that they no longer needed the AioContext
lock. This was a painstaking process, both in the actual code changes
required and the iterations of code review that Emanuele eked out of
Kevin and me over many months.

Kevin Wolf tackled multitudes of graph locking conversions to protect
in-flight I/O from run-time changes to the block graph as well as the
clang Thread Safety Analysis annotations that allow the compiler to
check whether the graph lock is being used correctly.

And me, well, I'm just here to add some pizzazz to the QEMU multi-queue
block layer :). Thank you to everyone who helped with this effort,
including Eric Blake, code reviewer extraordinaire, and others who I've
forgotten to mention.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
---
 include/block/aio.h | 17 -
 util/async.c| 10 --
 2 files changed, 27 deletions(-)

diff --git a/include/block/aio.h b/include/block/aio.h
index f08b358077..af05512a7d 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -278,23 +278,6 @@ void aio_context_ref(AioContext *ctx);
  */
 void aio_context_unref(AioContext *ctx);
 
-/* Take ownership of the AioContext.  If the AioContext will be shared between
- * threads, and a thread does not want to be interrupted, it will have to
- * take ownership around calls to aio_poll().  Otherwise, aio_poll()
- * automatically takes care of calling aio_context_acquire and
- * aio_context_release.
- *
- * Note that this is separate from bdrv_drained_begin/bdrv_drained_end.  A
- * thread still has to call those to avoid being interrupted by the guest.
- *
- * Bottom halves, timers and callbacks can be created or removed without
- * acquiring the AioContext.
- */
-void aio_context_acquire(AioContext *ctx);
-
-/* Relinquish ownership of the AioContext. */
-void aio_context_release(AioContext *ctx);
-
 /**
  * aio_bh_schedule_oneshot_full: Allocate a new bottom half structure that will
  * run only once and as soon as possible.
diff --git a/util/async.c b/util/async.c
index dfd44ef612..460529057c 100644
--- a/util/async.c
+++ b/util/async.c
@@ -719,16 +719,6 @@ void aio_context_unref(AioContext *ctx)
 g_source_unref(>source);
 }
 
-void aio_context_acquire(AioContext *ctx)
-{
-/* TODO remove this function */
-}
-
-void aio_context_release(AioContext *ctx)
-{
-/* TODO remove this function */
-}
-
 QEMU_DEFINE_STATIC_CO_TLS(AioContext *, my_aiocontext)
 
 AioContext *qemu_get_current_aio_context(void)
-- 
2.43.0




[PATCH v2 07/14] block: remove bdrv_co_lock()

2023-12-05 Thread Stefan Hajnoczi
The bdrv_co_lock() and bdrv_co_unlock() functions are already no-ops.
Remove them.

Signed-off-by: Stefan Hajnoczi 
---
 include/block/block-global-state.h | 14 --
 block.c| 10 --
 blockdev.c |  4 
 3 files changed, 28 deletions(-)

diff --git a/include/block/block-global-state.h 
b/include/block/block-global-state.h
index 0327f1c605..4ec0b217f0 100644
--- a/include/block/block-global-state.h
+++ b/include/block/block-global-state.h
@@ -267,20 +267,6 @@ int bdrv_debug_remove_breakpoint(BlockDriverState *bs, 
const char *tag);
 int bdrv_debug_resume(BlockDriverState *bs, const char *tag);
 bool bdrv_debug_is_suspended(BlockDriverState *bs, const char *tag);
 
-/**
- * Locks the AioContext of @bs if it's not the current AioContext. This avoids
- * double locking which could lead to deadlocks: This is a coroutine_fn, so we
- * know we already own the lock of the current AioContext.
- *
- * May only be called in the main thread.
- */
-void coroutine_fn bdrv_co_lock(BlockDriverState *bs);
-
-/**
- * Unlocks the AioContext of @bs if it's not the current AioContext.
- */
-void coroutine_fn bdrv_co_unlock(BlockDriverState *bs);
-
 bool bdrv_child_change_aio_context(BdrvChild *c, AioContext *ctx,
GHashTable *visited, Transaction *tran,
Error **errp);
diff --git a/block.c b/block.c
index 91ace5d2d5..434b7f4d72 100644
--- a/block.c
+++ b/block.c
@@ -7431,16 +7431,6 @@ void coroutine_fn bdrv_co_leave(BlockDriverState *bs, 
AioContext *old_ctx)
 bdrv_dec_in_flight(bs);
 }
 
-void coroutine_fn bdrv_co_lock(BlockDriverState *bs)
-{
-/* TODO removed in next patch */
-}
-
-void coroutine_fn bdrv_co_unlock(BlockDriverState *bs)
-{
-/* TODO removed in next patch */
-}
-
 static void bdrv_do_remove_aio_context_notifier(BdrvAioNotifier *ban)
 {
 GLOBAL_STATE_CODE();
diff --git a/blockdev.c b/blockdev.c
index 8a1b28f830..3a5e7222ec 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2264,18 +2264,14 @@ void coroutine_fn qmp_block_resize(const char *device, 
const char *node_name,
 return;
 }
 
-bdrv_co_lock(bs);
 bdrv_drained_begin(bs);
-bdrv_co_unlock(bs);
 
 old_ctx = bdrv_co_enter(bs);
 blk_co_truncate(blk, size, false, PREALLOC_MODE_OFF, 0, errp);
 bdrv_co_leave(bs, old_ctx);
 
-bdrv_co_lock(bs);
 bdrv_drained_end(bs);
 blk_co_unref(blk);
-bdrv_co_unlock(bs);
 }
 
 void qmp_block_stream(const char *job_id, const char *device,
-- 
2.43.0




[PATCH v2 12/14] scsi: remove outdated AioContext lock comment

2023-12-05 Thread Stefan Hajnoczi
The SCSI subsystem no longer uses the AioContext lock. Request
processing runs exclusively in the BlockBackend's AioContext since
"scsi: only access SCSIDevice->requests from one thread" and hence the
lock is unnecessary.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
---
 hw/scsi/scsi-disk.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 61be3d395a..2e7e1e9a1c 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -355,7 +355,6 @@ done:
 scsi_req_unref(>req);
 }
 
-/* Called with AioContext lock held */
 static void scsi_dma_complete(void *opaque, int ret)
 {
 SCSIDiskReq *r = (SCSIDiskReq *)opaque;
-- 
2.43.0




[PATCH v2 05/14] graph-lock: remove AioContext locking

2023-12-05 Thread Stefan Hajnoczi
Stop acquiring/releasing the AioContext lock in
bdrv_graph_wrlock()/bdrv_graph_unlock() since the lock no longer has any
effect.

The distinction between bdrv_graph_wrunlock() and
bdrv_graph_wrunlock_ctx() becomes meaningless and they can be collapsed
into one function.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
Reviewed-by: Kevin Wolf 
---
 include/block/graph-lock.h | 21 ++---
 block.c| 50 +++---
 block/backup.c |  4 +--
 block/blklogwrites.c   |  8 ++---
 block/blkverify.c  |  4 +--
 block/block-backend.c  | 11 +++
 block/commit.c | 16 +-
 block/graph-lock.c | 44 ++
 block/mirror.c | 22 ++---
 block/qcow2.c  |  4 +--
 block/quorum.c |  8 ++---
 block/replication.c| 14 -
 block/snapshot.c   |  4 +--
 block/stream.c | 12 +++
 block/vmdk.c   | 20 ++--
 blockdev.c |  8 ++---
 blockjob.c | 12 +++
 tests/unit/test-bdrv-drain.c   | 40 
 tests/unit/test-bdrv-graph-mod.c   | 20 ++--
 scripts/block-coroutine-wrapper.py |  4 +--
 20 files changed, 133 insertions(+), 193 deletions(-)

diff --git a/include/block/graph-lock.h b/include/block/graph-lock.h
index 22b5db1ed9..d7545e82d0 100644
--- a/include/block/graph-lock.h
+++ b/include/block/graph-lock.h
@@ -110,34 +110,17 @@ void unregister_aiocontext(AioContext *ctx);
  *
  * The wrlock can only be taken from the main loop, with BQL held, as only the
  * main loop is allowed to modify the graph.
- *
- * If @bs is non-NULL, its AioContext is temporarily released.
- *
- * This function polls. Callers must not hold the lock of any AioContext other
- * than the current one and the one of @bs.
  */
 void no_coroutine_fn TSA_ACQUIRE(graph_lock) TSA_NO_TSA
-bdrv_graph_wrlock(BlockDriverState *bs);
+bdrv_graph_wrlock(void);
 
 /*
  * bdrv_graph_wrunlock:
  * Write finished, reset global has_writer to 0 and restart
  * all readers that are waiting.
- *
- * If @bs is non-NULL, its AioContext is temporarily released.
  */
 void no_coroutine_fn TSA_RELEASE(graph_lock) TSA_NO_TSA
-bdrv_graph_wrunlock(BlockDriverState *bs);
-
-/*
- * bdrv_graph_wrunlock_ctx:
- * Write finished, reset global has_writer to 0 and restart
- * all readers that are waiting.
- *
- * If @ctx is non-NULL, its lock is temporarily released.
- */
-void no_coroutine_fn TSA_RELEASE(graph_lock) TSA_NO_TSA
-bdrv_graph_wrunlock_ctx(AioContext *ctx);
+bdrv_graph_wrunlock(void);
 
 /*
  * bdrv_graph_co_rdlock:
diff --git a/block.c b/block.c
index bfb0861ec6..25e1ebc606 100644
--- a/block.c
+++ b/block.c
@@ -1708,12 +1708,12 @@ bdrv_open_driver(BlockDriverState *bs, BlockDriver 
*drv, const char *node_name,
 open_failed:
 bs->drv = NULL;
 
-bdrv_graph_wrlock(NULL);
+bdrv_graph_wrlock();
 if (bs->file != NULL) {
 bdrv_unref_child(bs, bs->file);
 assert(!bs->file);
 }
-bdrv_graph_wrunlock(NULL);
+bdrv_graph_wrunlock();
 
 g_free(bs->opaque);
 bs->opaque = NULL;
@@ -3575,9 +3575,9 @@ int bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd,
 
 bdrv_ref(drain_bs);
 bdrv_drained_begin(drain_bs);
-bdrv_graph_wrlock(backing_hd);
+bdrv_graph_wrlock();
 ret = bdrv_set_backing_hd_drained(bs, backing_hd, errp);
-bdrv_graph_wrunlock(backing_hd);
+bdrv_graph_wrunlock();
 bdrv_drained_end(drain_bs);
 bdrv_unref(drain_bs);
 
@@ -3790,13 +3790,13 @@ BdrvChild *bdrv_open_child(const char *filename,
 return NULL;
 }
 
-bdrv_graph_wrlock(NULL);
+bdrv_graph_wrlock();
 ctx = bdrv_get_aio_context(bs);
 aio_context_acquire(ctx);
 child = bdrv_attach_child(parent, bs, bdref_key, child_class, child_role,
   errp);
 aio_context_release(ctx);
-bdrv_graph_wrunlock(NULL);
+bdrv_graph_wrunlock();
 
 return child;
 }
@@ -4650,9 +4650,9 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, 
Error **errp)
 aio_context_release(ctx);
 }
 
-bdrv_graph_wrlock(NULL);
+bdrv_graph_wrlock();
 tran_commit(tran);
-bdrv_graph_wrunlock(NULL);
+bdrv_graph_wrunlock();
 
 QTAILQ_FOREACH_REVERSE(bs_entry, bs_queue, entry) {
 BlockDriverState *bs = bs_entry->state.bs;
@@ -4669,9 +4669,9 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, 
Error **errp)
 goto cleanup;
 
 abort:
-bdrv_graph_wrlock(NULL);
+bdrv_graph_wrlock();
 tran_abort(tran);
-bdrv_graph_wrunlock(NULL);
+bdrv_graph_wrunlock();
 
 QTAILQ_FOREACH_SAFE(bs_entry, bs_queue, entry, next) {
 if (bs_entry->prepared) {
@@ -4852,12 +4852,12 @@ 

[PATCH v2 09/14] aio-wait: draw equivalence between AIO_WAIT_WHILE() and AIO_WAIT_WHILE_UNLOCKED()

2023-12-05 Thread Stefan Hajnoczi
Now that the AioContext lock no longer exists, AIO_WAIT_WHILE() and
AIO_WAIT_WHILE_UNLOCKED() are equivalent.

A future patch will get rid of AIO_WAIT_WHILE_UNLOCKED().

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
---
 include/block/aio-wait.h | 16 
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/include/block/aio-wait.h b/include/block/aio-wait.h
index 5449b6d742..157f105916 100644
--- a/include/block/aio-wait.h
+++ b/include/block/aio-wait.h
@@ -63,9 +63,6 @@ extern AioWait global_aio_wait;
  * @ctx: the aio context, or NULL if multiple aio contexts (for which the
  *   caller does not hold a lock) are involved in the polling condition.
  * @cond: wait while this conditional expression is true
- * @unlock: whether to unlock and then lock again @ctx. This applies
- * only when waiting for another AioContext from the main loop.
- * Otherwise it's ignored.
  *
  * Wait while a condition is true.  Use this to implement synchronous
  * operations that require event loop activity.
@@ -78,7 +75,7 @@ extern AioWait global_aio_wait;
  * wait on conditions between two IOThreads since that could lead to deadlock,
  * go via the main loop instead.
  */
-#define AIO_WAIT_WHILE_INTERNAL(ctx, cond, unlock) ({  \
+#define AIO_WAIT_WHILE_INTERNAL(ctx, cond) ({  \
 bool waited_ = false;  \
 AioWait *wait_ = _aio_wait; \
 AioContext *ctx_ = (ctx);  \
@@ -95,13 +92,7 @@ extern AioWait global_aio_wait;
 assert(qemu_get_current_aio_context() ==   \
qemu_get_aio_context());\
 while ((cond)) {   \
-if (unlock && ctx_) {  \
-aio_context_release(ctx_); \
-}  \
 aio_poll(qemu_get_aio_context(), true);\
-if (unlock && ctx_) {  \
-aio_context_acquire(ctx_); \
-}  \
 waited_ = true;\
 }  \
 }  \
@@ -109,10 +100,11 @@ extern AioWait global_aio_wait;
 waited_; })
 
 #define AIO_WAIT_WHILE(ctx, cond)  \
-AIO_WAIT_WHILE_INTERNAL(ctx, cond, true)
+AIO_WAIT_WHILE_INTERNAL(ctx, cond)
 
+/* TODO replace this with AIO_WAIT_WHILE() in a future patch */
 #define AIO_WAIT_WHILE_UNLOCKED(ctx, cond) \
-AIO_WAIT_WHILE_INTERNAL(ctx, cond, false)
+AIO_WAIT_WHILE_INTERNAL(ctx, cond)
 
 /**
  * aio_wait_kick:
-- 
2.43.0




[PATCH v2 03/14] tests: remove aio_context_acquire() tests

2023-12-05 Thread Stefan Hajnoczi
The aio_context_acquire() API is being removed. Drop the test case that
calls the API.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
Reviewed-by: Kevin Wolf 
---
 tests/unit/test-aio.c | 67 +--
 1 file changed, 1 insertion(+), 66 deletions(-)

diff --git a/tests/unit/test-aio.c b/tests/unit/test-aio.c
index 337b6e4ea7..e77d86be87 100644
--- a/tests/unit/test-aio.c
+++ b/tests/unit/test-aio.c
@@ -100,76 +100,12 @@ static void event_ready_cb(EventNotifier *e)
 
 /* Tests using aio_*.  */
 
-typedef struct {
-QemuMutex start_lock;
-EventNotifier notifier;
-bool thread_acquired;
-} AcquireTestData;
-
-static void *test_acquire_thread(void *opaque)
-{
-AcquireTestData *data = opaque;
-
-/* Wait for other thread to let us start */
-qemu_mutex_lock(>start_lock);
-qemu_mutex_unlock(>start_lock);
-
-/* event_notifier_set might be called either before or after
- * the main thread's call to poll().  The test case's outcome
- * should be the same in either case.
- */
-event_notifier_set(>notifier);
-aio_context_acquire(ctx);
-aio_context_release(ctx);
-
-data->thread_acquired = true; /* success, we got here */
-
-return NULL;
-}
-
 static void set_event_notifier(AioContext *nctx, EventNotifier *notifier,
EventNotifierHandler *handler)
 {
 aio_set_event_notifier(nctx, notifier, handler, NULL, NULL);
 }
 
-static void dummy_notifier_read(EventNotifier *n)
-{
-event_notifier_test_and_clear(n);
-}
-
-static void test_acquire(void)
-{
-QemuThread thread;
-AcquireTestData data;
-
-/* Dummy event notifier ensures aio_poll() will block */
-event_notifier_init(, false);
-set_event_notifier(ctx, , dummy_notifier_read);
-g_assert(!aio_poll(ctx, false)); /* consume aio_notify() */
-
-qemu_mutex_init(_lock);
-qemu_mutex_lock(_lock);
-data.thread_acquired = false;
-
-qemu_thread_create(, "test_acquire_thread",
-   test_acquire_thread,
-   , QEMU_THREAD_JOINABLE);
-
-/* Block in aio_poll(), let other thread kick us and acquire context */
-aio_context_acquire(ctx);
-qemu_mutex_unlock(_lock); /* let the thread run */
-g_assert(aio_poll(ctx, true));
-g_assert(!data.thread_acquired);
-aio_context_release(ctx);
-
-qemu_thread_join();
-set_event_notifier(ctx, , NULL);
-event_notifier_cleanup();
-
-g_assert(data.thread_acquired);
-}
-
 static void test_bh_schedule(void)
 {
 BHTestData data = { .n = 0 };
@@ -879,7 +815,7 @@ static void test_worker_thread_co_enter(void)
 qemu_thread_get_self(_thread);
 co = qemu_coroutine_create(co_check_current_thread, _thread);
 
-qemu_thread_create(_thread, "test_acquire_thread",
+qemu_thread_create(_thread, "test_aio_co_enter",
test_aio_co_enter,
co, QEMU_THREAD_JOINABLE);
 
@@ -899,7 +835,6 @@ int main(int argc, char **argv)
 while (g_main_context_iteration(NULL, false));
 
 g_test_init(, , NULL);
-g_test_add_func("/aio/acquire", test_acquire);
 g_test_add_func("/aio/bh/schedule", test_bh_schedule);
 g_test_add_func("/aio/bh/schedule10",   test_bh_schedule10);
 g_test_add_func("/aio/bh/cancel",   test_bh_cancel);
-- 
2.43.0




[PATCH v2 08/14] scsi: remove AioContext locking

2023-12-05 Thread Stefan Hajnoczi
The AioContext lock no longer has any effect. Remove it.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
---
 include/hw/virtio/virtio-scsi.h | 14 --
 hw/scsi/scsi-bus.c  |  2 --
 hw/scsi/scsi-disk.c | 31 +--
 hw/scsi/virtio-scsi.c   | 18 --
 4 files changed, 5 insertions(+), 60 deletions(-)

diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h
index da8cb928d9..7f0573b1bf 100644
--- a/include/hw/virtio/virtio-scsi.h
+++ b/include/hw/virtio/virtio-scsi.h
@@ -101,20 +101,6 @@ struct VirtIOSCSI {
 uint32_t host_features;
 };
 
-static inline void virtio_scsi_acquire(VirtIOSCSI *s)
-{
-if (s->ctx) {
-aio_context_acquire(s->ctx);
-}
-}
-
-static inline void virtio_scsi_release(VirtIOSCSI *s)
-{
-if (s->ctx) {
-aio_context_release(s->ctx);
-}
-}
-
 void virtio_scsi_common_realize(DeviceState *dev,
 VirtIOHandleOutput ctrl,
 VirtIOHandleOutput evt,
diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index f3ec11f892..df68a44b6a 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -1731,9 +1731,7 @@ void scsi_device_purge_requests(SCSIDevice *sdev, 
SCSISense sense)
 {
 scsi_device_for_each_req_async(sdev, scsi_device_purge_one_req, NULL);
 
-aio_context_acquire(blk_get_aio_context(sdev->conf.blk));
 blk_drain(sdev->conf.blk);
-aio_context_release(blk_get_aio_context(sdev->conf.blk));
 scsi_device_set_ua(sdev, sense);
 }
 
diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index a5048e0aaf..61be3d395a 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -2339,14 +2339,10 @@ static void scsi_disk_reset(DeviceState *dev)
 {
 SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev.qdev, dev);
 uint64_t nb_sectors;
-AioContext *ctx;
 
 scsi_device_purge_requests(>qdev, SENSE_CODE(RESET));
 
-ctx = blk_get_aio_context(s->qdev.conf.blk);
-aio_context_acquire(ctx);
 blk_get_geometry(s->qdev.conf.blk, _sectors);
-aio_context_release(ctx);
 
 nb_sectors /= s->qdev.blocksize / BDRV_SECTOR_SIZE;
 if (nb_sectors) {
@@ -2545,15 +2541,13 @@ static void scsi_unrealize(SCSIDevice *dev)
 static void scsi_hd_realize(SCSIDevice *dev, Error **errp)
 {
 SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, dev);
-AioContext *ctx = NULL;
+
 /* can happen for devices without drive. The error message for missing
  * backend will be issued in scsi_realize
  */
 if (s->qdev.conf.blk) {
-ctx = blk_get_aio_context(s->qdev.conf.blk);
-aio_context_acquire(ctx);
 if (!blkconf_blocksizes(>qdev.conf, errp)) {
-goto out;
+return;
 }
 }
 s->qdev.blocksize = s->qdev.conf.logical_block_size;
@@ -2562,16 +2556,11 @@ static void scsi_hd_realize(SCSIDevice *dev, Error 
**errp)
 s->product = g_strdup("QEMU HARDDISK");
 }
 scsi_realize(>qdev, errp);
-out:
-if (ctx) {
-aio_context_release(ctx);
-}
 }
 
 static void scsi_cd_realize(SCSIDevice *dev, Error **errp)
 {
 SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, dev);
-AioContext *ctx;
 int ret;
 uint32_t blocksize = 2048;
 
@@ -2587,8 +2576,6 @@ static void scsi_cd_realize(SCSIDevice *dev, Error **errp)
 blocksize = dev->conf.physical_block_size;
 }
 
-ctx = blk_get_aio_context(dev->conf.blk);
-aio_context_acquire(ctx);
 s->qdev.blocksize = blocksize;
 s->qdev.type = TYPE_ROM;
 s->features |= 1 << SCSI_DISK_F_REMOVABLE;
@@ -2596,7 +2583,6 @@ static void scsi_cd_realize(SCSIDevice *dev, Error **errp)
 s->product = g_strdup("QEMU CD-ROM");
 }
 scsi_realize(>qdev, errp);
-aio_context_release(ctx);
 }
 
 
@@ -2727,7 +2713,6 @@ static int get_device_type(SCSIDiskState *s)
 static void scsi_block_realize(SCSIDevice *dev, Error **errp)
 {
 SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, dev);
-AioContext *ctx;
 int sg_version;
 int rc;
 
@@ -2742,9 +2727,6 @@ static void scsi_block_realize(SCSIDevice *dev, Error 
**errp)
   "be removed in a future version");
 }
 
-ctx = blk_get_aio_context(s->qdev.conf.blk);
-aio_context_acquire(ctx);
-
 /* check we are using a driver managing SG_IO (version 3 and after) */
 rc = blk_ioctl(s->qdev.conf.blk, SG_GET_VERSION_NUM, _version);
 if (rc < 0) {
@@ -2752,18 +2734,18 @@ static void scsi_block_realize(SCSIDevice *dev, Error 
**errp)
 if (rc != -EPERM) {
 error_append_hint(errp, "Is this a SCSI device?\n");
 }
-goto out;
+return;
 }
 if (sg_version < 3) {
 error_setg(errp, "scsi generic interface too old");
-goto out;
+return;
 }
 
 /* get device type from INQUIRY data */
 rc = get_device_type(s);
 if (rc < 0) {
 error_setg(errp, 

[PATCH v2 02/14] scsi: assert that callbacks run in the correct AioContext

2023-12-05 Thread Stefan Hajnoczi
Since the removal of AioContext locking, the correctness of the code
relies on running requests from a single AioContext at any given time.

Add assertions that verify that callbacks are invoked in the correct
AioContext.

Signed-off-by: Stefan Hajnoczi 
---
 hw/scsi/scsi-disk.c  | 14 ++
 system/dma-helpers.c |  3 +++
 2 files changed, 17 insertions(+)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 2c1bbb3530..a5048e0aaf 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -273,6 +273,10 @@ static void scsi_aio_complete(void *opaque, int ret)
 SCSIDiskReq *r = (SCSIDiskReq *)opaque;
 SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
 
+/* The request must only run in the BlockBackend's AioContext */
+assert(blk_get_aio_context(s->qdev.conf.blk) ==
+   qemu_get_current_aio_context());
+
 assert(r->req.aiocb != NULL);
 r->req.aiocb = NULL;
 
@@ -370,8 +374,13 @@ static void scsi_dma_complete(void *opaque, int ret)
 
 static void scsi_read_complete_noio(SCSIDiskReq *r, int ret)
 {
+SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
 uint32_t n;
 
+/* The request must only run in the BlockBackend's AioContext */
+assert(blk_get_aio_context(s->qdev.conf.blk) ==
+   qemu_get_current_aio_context());
+
 assert(r->req.aiocb == NULL);
 if (scsi_disk_req_check_error(r, ret, false)) {
 goto done;
@@ -496,8 +505,13 @@ static void scsi_read_data(SCSIRequest *req)
 
 static void scsi_write_complete_noio(SCSIDiskReq *r, int ret)
 {
+SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
 uint32_t n;
 
+/* The request must only run in the BlockBackend's AioContext */
+assert(blk_get_aio_context(s->qdev.conf.blk) ==
+   qemu_get_current_aio_context());
+
 assert (r->req.aiocb == NULL);
 if (scsi_disk_req_check_error(r, ret, false)) {
 goto done;
diff --git a/system/dma-helpers.c b/system/dma-helpers.c
index 528117f256..9b221cf94e 100644
--- a/system/dma-helpers.c
+++ b/system/dma-helpers.c
@@ -119,6 +119,9 @@ static void dma_blk_cb(void *opaque, int ret)
 
 trace_dma_blk_cb(dbs, ret);
 
+/* DMAAIOCB is not thread-safe and must be accessed only from dbs->ctx */
+assert(ctx == qemu_get_current_aio_context());
+
 dbs->acb = NULL;
 dbs->offset += dbs->iov.size;
 
-- 
2.43.0




[PATCH v2 04/14] aio: make aio_context_acquire()/aio_context_release() a no-op

2023-12-05 Thread Stefan Hajnoczi
aio_context_acquire()/aio_context_release() has been replaced by
fine-grained locking to protect state shared by multiple threads. The
AioContext lock still plays the role of balancing locking in
AIO_WAIT_WHILE() and many functions in QEMU either require that the
AioContext lock is held or not held for this reason. In other words, the
AioContext lock is purely there for consistency with itself and serves
no real purpose anymore.

Stop actually acquiring/releasing the lock in
aio_context_acquire()/aio_context_release() so that subsequent patches
can remove callers across the codebase incrementally.

I have performed "make check" and qemu-iotests stress tests across
x86-64, ppc64le, and aarch64 to confirm that there are no failures as a
result of eliminating the lock.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
Acked-by: Kevin Wolf 
---
 util/async.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/util/async.c b/util/async.c
index 8f90ddc304..04ee83d220 100644
--- a/util/async.c
+++ b/util/async.c
@@ -725,12 +725,12 @@ void aio_context_unref(AioContext *ctx)
 
 void aio_context_acquire(AioContext *ctx)
 {
-qemu_rec_mutex_lock(>lock);
+/* TODO remove this function */
 }
 
 void aio_context_release(AioContext *ctx)
 {
-qemu_rec_mutex_unlock(>lock);
+/* TODO remove this function */
 }
 
 QEMU_DEFINE_STATIC_CO_TLS(AioContext *, my_aiocontext)
-- 
2.43.0




[PATCH v2 01/14] virtio-scsi: replace AioContext lock with tmf_bh_lock

2023-12-05 Thread Stefan Hajnoczi
Protect the Task Management Function BH state with a lock. The TMF BH
runs in the main loop thread. An IOThread might process a TMF at the
same time as the TMF BH is running. Therefore tmf_bh_list and tmf_bh
must be protected by a lock.

Run TMF request completion in the IOThread using aio_wait_bh_oneshot().
This avoids more locking to protect the virtqueue and SCSI layer state.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
Reviewed-by: Kevin Wolf 
---
 include/hw/virtio/virtio-scsi.h |  3 +-
 hw/scsi/virtio-scsi.c   | 62 ++---
 2 files changed, 43 insertions(+), 22 deletions(-)

diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h
index 779568ab5d..da8cb928d9 100644
--- a/include/hw/virtio/virtio-scsi.h
+++ b/include/hw/virtio/virtio-scsi.h
@@ -85,8 +85,9 @@ struct VirtIOSCSI {
 
 /*
  * TMFs deferred to main loop BH. These fields are protected by
- * virtio_scsi_acquire().
+ * tmf_bh_lock.
  */
+QemuMutex tmf_bh_lock;
 QEMUBH *tmf_bh;
 QTAILQ_HEAD(, VirtIOSCSIReq) tmf_bh_list;
 
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 9c751bf296..4f8d35facc 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -123,6 +123,30 @@ static void virtio_scsi_complete_req(VirtIOSCSIReq *req)
 virtio_scsi_free_req(req);
 }
 
+static void virtio_scsi_complete_req_bh(void *opaque)
+{
+VirtIOSCSIReq *req = opaque;
+
+virtio_scsi_complete_req(req);
+}
+
+/*
+ * Called from virtio_scsi_do_one_tmf_bh() in main loop thread. The main loop
+ * thread cannot touch the virtqueue since that could race with an IOThread.
+ */
+static void virtio_scsi_complete_req_from_main_loop(VirtIOSCSIReq *req)
+{
+VirtIOSCSI *s = req->dev;
+
+if (!s->ctx || s->ctx == qemu_get_aio_context()) {
+/* No need to schedule a BH when there is no IOThread */
+virtio_scsi_complete_req(req);
+} else {
+/* Run request completion in the IOThread */
+aio_wait_bh_oneshot(s->ctx, virtio_scsi_complete_req_bh, req);
+}
+}
+
 static void virtio_scsi_bad_req(VirtIOSCSIReq *req)
 {
 virtio_error(VIRTIO_DEVICE(req->dev), "wrong size for virtio-scsi 
headers");
@@ -338,10 +362,7 @@ static void virtio_scsi_do_one_tmf_bh(VirtIOSCSIReq *req)
 
 out:
 object_unref(OBJECT(d));
-
-virtio_scsi_acquire(s);
-virtio_scsi_complete_req(req);
-virtio_scsi_release(s);
+virtio_scsi_complete_req_from_main_loop(req);
 }
 
 /* Some TMFs must be processed from the main loop thread */
@@ -354,18 +375,16 @@ static void virtio_scsi_do_tmf_bh(void *opaque)
 
 GLOBAL_STATE_CODE();
 
-virtio_scsi_acquire(s);
+WITH_QEMU_LOCK_GUARD(>tmf_bh_lock) {
+QTAILQ_FOREACH_SAFE(req, >tmf_bh_list, next, tmp) {
+QTAILQ_REMOVE(>tmf_bh_list, req, next);
+QTAILQ_INSERT_TAIL(, req, next);
+}
 
-QTAILQ_FOREACH_SAFE(req, >tmf_bh_list, next, tmp) {
-QTAILQ_REMOVE(>tmf_bh_list, req, next);
-QTAILQ_INSERT_TAIL(, req, next);
+qemu_bh_delete(s->tmf_bh);
+s->tmf_bh = NULL;
 }
 
-qemu_bh_delete(s->tmf_bh);
-s->tmf_bh = NULL;
-
-virtio_scsi_release(s);
-
 QTAILQ_FOREACH_SAFE(req, , next, tmp) {
 QTAILQ_REMOVE(, req, next);
 virtio_scsi_do_one_tmf_bh(req);
@@ -379,8 +398,7 @@ static void virtio_scsi_reset_tmf_bh(VirtIOSCSI *s)
 
 GLOBAL_STATE_CODE();
 
-virtio_scsi_acquire(s);
-
+/* Called after ioeventfd has been stopped, so tmf_bh_lock is not needed */
 if (s->tmf_bh) {
 qemu_bh_delete(s->tmf_bh);
 s->tmf_bh = NULL;
@@ -393,19 +411,19 @@ static void virtio_scsi_reset_tmf_bh(VirtIOSCSI *s)
 req->resp.tmf.response = VIRTIO_SCSI_S_TARGET_FAILURE;
 virtio_scsi_complete_req(req);
 }
-
-virtio_scsi_release(s);
 }
 
 static void virtio_scsi_defer_tmf_to_bh(VirtIOSCSIReq *req)
 {
 VirtIOSCSI *s = req->dev;
 
-QTAILQ_INSERT_TAIL(>tmf_bh_list, req, next);
+WITH_QEMU_LOCK_GUARD(>tmf_bh_lock) {
+QTAILQ_INSERT_TAIL(>tmf_bh_list, req, next);
 
-if (!s->tmf_bh) {
-s->tmf_bh = qemu_bh_new(virtio_scsi_do_tmf_bh, s);
-qemu_bh_schedule(s->tmf_bh);
+if (!s->tmf_bh) {
+s->tmf_bh = qemu_bh_new(virtio_scsi_do_tmf_bh, s);
+qemu_bh_schedule(s->tmf_bh);
+}
 }
 }
 
@@ -1235,6 +1253,7 @@ static void virtio_scsi_device_realize(DeviceState *dev, 
Error **errp)
 Error *err = NULL;
 
 QTAILQ_INIT(>tmf_bh_list);
+qemu_mutex_init(>tmf_bh_lock);
 
 virtio_scsi_common_realize(dev,
virtio_scsi_handle_ctrl,
@@ -1277,6 +1296,7 @@ static void virtio_scsi_device_unrealize(DeviceState *dev)
 
 qbus_set_hotplug_handler(BUS(>bus), NULL);
 virtio_scsi_common_unrealize(dev);
+qemu_mutex_destroy(>tmf_bh_lock);
 }
 
 static Property virtio_scsi_properties[] = {
-- 
2.43.0




[PATCH v2 00/14] aio: remove AioContext lock

2023-12-05 Thread Stefan Hajnoczi
v2:
- Add Patch 2 "scsi: assert that callbacks run in the correct AioContext" 
[Kevin]
- Add Patch 7 "block: remove bdrv_co_lock()" [Eric and Kevin]
- Remove stray goto label in Patch 8 [Kevin]
- Fix "eeked" -> "eked" typo in Patch 10 [Eric]

This series removes the AioContext locking APIs from QEMU.
aio_context_acquire() and aio_context_release() are currently only needed to
support the locking discipline required by AIO_POLL_WHILE() (except for a stray
user that I converted in Patch 1). AIO_POLL_WHILE() doesn't really need the
AioContext lock anymore, so it's possible to remove the API. This is a nice
simplification because the AioContext locking rules were sometimes tricky or
underspecified, leading to many bugs of the years.

This patch series removes these APIs across the codebase and cleans up the
documentation/comments that refers to them.

Patch 1 is a AioContext lock user I forgot to convert in my earlier SCSI
conversion series.

Patch 2 adds an assertion to the SCSI code to ensure that callbacks are invoked
in the correct AioContext.

Patch 3 removes tests for the AioContext lock because they will no longer be
needed when the lock is gone.

Patches 4-10 remove the AioContext lock. These can be reviewed by categorizing
the call sites into 1. places that take the lock because they call an API that
requires the lock (ultimately AIO_POLL_WHILE()) and 2. places that take the
lock to protect state. There should be no instances of case 2 left. If you see
one, you've found a bug in this patch series!

Patches 11-14 remove comments.

Based-on: 20231204164259.1515217-1-stefa...@redhat.com ("[PATCH v2 0/4] scsi: 
eliminate AioContext lock")
Since SCSI needs to stop relying on the AioContext lock before we can remove
the lock.

Stefan Hajnoczi (14):
  virtio-scsi: replace AioContext lock with tmf_bh_lock
  scsi: assert that callbacks run in the correct AioContext
  tests: remove aio_context_acquire() tests
  aio: make aio_context_acquire()/aio_context_release() a no-op
  graph-lock: remove AioContext locking
  block: remove AioContext locking
  block: remove bdrv_co_lock()
  scsi: remove AioContext locking
  aio-wait: draw equivalence between AIO_WAIT_WHILE() and
AIO_WAIT_WHILE_UNLOCKED()
  aio: remove aio_context_acquire()/aio_context_release() API
  docs: remove AioContext lock from IOThread docs
  scsi: remove outdated AioContext lock comment
  job: remove outdated AioContext locking comments
  block: remove outdated AioContext locking comments

 docs/devel/multiple-iothreads.txt|  45 ++--
 include/block/aio-wait.h |  16 +-
 include/block/aio.h  |  17 --
 include/block/block-common.h |   3 -
 include/block/block-global-state.h   |  23 +-
 include/block/block-io.h |  12 +-
 include/block/block_int-common.h |   2 -
 include/block/graph-lock.h   |  21 +-
 include/block/snapshot.h |   2 -
 include/hw/virtio/virtio-scsi.h  |  17 +-
 include/qemu/job.h   |  20 --
 block.c  | 363 ---
 block/backup.c   |   4 +-
 block/blklogwrites.c |   8 +-
 block/blkverify.c|   4 +-
 block/block-backend.c|  33 +--
 block/commit.c   |  16 +-
 block/copy-before-write.c|  22 +-
 block/export/export.c|  22 +-
 block/export/vhost-user-blk-server.c |   4 -
 block/graph-lock.c   |  44 +---
 block/io.c   |  45 +---
 block/mirror.c   |  41 +--
 block/monitor/bitmap-qmp-cmds.c  |  20 +-
 block/monitor/block-hmp-cmds.c   |  29 ---
 block/qapi-sysemu.c  |  27 +-
 block/qapi.c |  18 +-
 block/qcow2.c|   4 +-
 block/quorum.c   |   8 +-
 block/raw-format.c   |   5 -
 block/replication.c  |  72 +-
 block/snapshot.c |  26 +-
 block/stream.c   |  12 +-
 block/vmdk.c |  20 +-
 block/write-threshold.c  |   6 -
 blockdev.c   | 319 +--
 blockjob.c   |  30 +--
 hw/block/dataplane/virtio-blk.c  |  10 -
 hw/block/dataplane/xen-block.c   |  17 +-
 hw/block/virtio-blk.c|  45 +---
 hw/core/qdev-properties-system.c |   9 -
 hw/scsi/scsi-bus.c   |   2 -
 hw/scsi/scsi-disk.c  |  46 ++--
 hw/scsi/virtio-scsi.c|  80 +++---
 job.c|  16 --
 migration/block.c|  33 +--
 migration/migration-hmp-cmds.c   |   3 -
 migration/savevm.c   |  22 --
 net/colo-compare.c   |   2 -
 qemu-img.c   |   4 -
 qemu-io.c|  10 +-
 qemu-nbd.c   |   2 -
 

[PATCH] CODING_STYLE: Add a section of the naming convention

2023-12-05 Thread Julien Grall
From: Julien Grall 

Several maintainers have expressed a stronger preference
to use '-' when in filename and option that contains multiple
words.

So document it in CODING_STYLE.

Signed-off-by: Julien Grall 

---
Changes in v2:
- New wording
- Update the section title
- Add Jan's acked-by
---
 CODING_STYLE | 9 +
 1 file changed, 9 insertions(+)

diff --git a/CODING_STYLE b/CODING_STYLE
index ced3ade5a6fb..ed13ee2b664b 100644
--- a/CODING_STYLE
+++ b/CODING_STYLE
@@ -144,6 +144,15 @@ separate lines and each line should begin with a leading 
'*'.
  * Note beginning and end markers on separate lines and leading '*'.
  */
 
+Naming convention for files and command line options
+
+
+'-' should be used to separate words in commandline options and filenames.
+E.g. timer-works.
+
+Note that some of the options and filenames are using '_'. This is now
+deprecated.
+
 Emacs local variables
 -
 
-- 
2.40.1




Re: [PATCH v3 6/6] x86/vPIC: check values loaded from state save record

2023-12-05 Thread Roger Pau Monné
On Tue, Nov 28, 2023 at 11:36:40AM +0100, Jan Beulich wrote:
> Loading is_master from the state save record can lead to out-of-bounds
> accesses via at least the two container_of() uses by vpic_domain() and
> __vpic_lock(). Make sure the value is consistent with the instance being
> loaded.
> 
> For ->int_output (which for whatever reason isn't a 1-bit bitfield),
> besides bounds checking also take ->init_state into account.
> 
> For ELCR follow what vpic_intercept_elcr_io()'s write path and
> vpic_reset() do, i.e. don't insist on the internal view of the value to
> be saved.
> 
> Move the instance range check as well, leaving just an assertion in the
> load handler.
> 
> Signed-off-by: Jan Beulich 

Reviewed-by: Roger Pau Monné 

Thanks, Roger.



Re: [PATCH v6 4/5] [FUTURE] xen/arm: enable vPCI for domUs

2023-12-05 Thread Stewart Hildebrand
On 12/5/23 12:09, Roger Pau Monné wrote:
> On Tue, Dec 05, 2023 at 11:27:03AM -0500, Stewart Hildebrand wrote:
>> On 12/5/23 06:08, Roger Pau Monné wrote:
>>> On Mon, Dec 04, 2023 at 02:07:51PM -0800, Stefano Stabellini wrote:
 On Mon, 4 Dec 2023, Roger Pau Monné wrote:
> On Fri, Dec 01, 2023 at 06:56:32PM -0800, Stefano Stabellini wrote:
>> On Fri, 1 Dec 2023, Roger Pau Monné wrote:
>>> On Mon, Nov 13, 2023 at 05:21:13PM -0500, Stewart Hildebrand wrote:
 @@ -1618,6 +1630,14 @@ int iommu_do_pci_domctl(
  bus = PCI_BUS(machine_sbdf);
  devfn = PCI_DEVFN(machine_sbdf);
  
 +if ( needs_vpci(d) && !has_vpci(d) )
 +{
 +printk(XENLOG_G_WARNING "Cannot assign %pp to %pd: vPCI 
 support not enabled\n",
 +   _SBDF(seg, bus, devfn), d);
 +ret = -EPERM;
 +break;
>>>
>>> I think this is likely too restrictive going forward.  The current
>>> approach is indeed to enable vPCI on a per-domain basis because that's
>>> how PVH dom0 uses it, due to being unable to use ioreq servers.
>>>
>>> If we start to expose vPCI suport to guests the interface should be on
>>> a per-device basis, so that vPCI could be enabled for some devices,
>>> while others could still be handled by ioreq servers.
>>>
>>> We might want to add a new flag to xen_domctl_assign_device (used by
>>> XEN_DOMCTL_assign_device) in order to signal whether the device will
>>> use vPCI.
>>
>> Actually I don't think this is a good idea. I am all for flexibility but
>> supporting multiple different configurations comes at an extra cost for
>> both maintainers and contributors. I think we should try to reduce the
>> amount of configurations we support rather than increasing them
>> (especially on x86 where we have PV, PVH, HVM).
>
> I think it's perfectly fine to initially require a domain to have all
> its devices either passed through using vPCI or ireqs, but the
> interface IMO should allow for such differentiation in the future.
> That's why I think introducing a domain wide vPCI flag might not be
> the best option going forward.
>
> It would be perfectly fine for XEN_DOMCTL_assign_device to set a
> domain wide vPCI flag, iow:
>
> if ( HYPERCALL_VPCI_FLAG_SET && !has_vpci(d) )
> {
> if ( has_arch_pdevs(d) )
> {
> printk("All passthrough devices must use the same backend\n");
> return -EINVAL;
> }
>
> /* Set vPCI domain flag */
> }

 That would be fine by me, but maybe we can avoid this change too. I was
 imagining that vPCI would be enabled at domain creation, not at runtime.
 And that vPCI would be enabled by default for all PVH guests (once we
 are past the initial experimental phase.)
>>>
>>> Then we don't even need a new CDF flag, and just enable vPCI when
>>> IOMMU is enabled?  IOW: we can key the enabling of vPCI to
>>> XEN_DOMCTL_CDF_iommu for specific domain types?
>>
>> There are many Arm based platforms that need to use iommu but don't have (or 
>> don't use) PCI, so we'd still like to have a separate vPCI flag.
> 
> OK, read below though - if we switch to vPCI being a descendant of
> IOREQ (so that the PCI config space decoding is done by IOREQ) we
> could hotplug vPCI managed devices at runtime without requiring any
> prior initialization at domain create, since the traps to the PCI
> config space would be setup by IOREQ.
> 
> We might need a PCI flag in order to signal whether the domain is
> intended to use PCI devices or not, and so whether IOREQ needs to
> setup PCI config space traps (either fully emulated or passthrough
> devices).  But that would be arch-specific AFAICT, as on x86 we
> always trap accesses to the PCI IO ports.

On Arm, the toolstack (or dom0less creation code) needs to construct a 
{v,ioreq}PCI root complex node in the device tree before guest boot. Attempting 
to hot plug such a device tree node at runtime sounds like a goal for the (far) 
future. On Arm, we don't trap the {v,ioreq}PCI config space by default, so, 
yes, we for sure would need such a {v,ioreq}PCI flag for setting up the 
{v,ioreq}PCI mmio handlers if we go this route.



Re: [PATCH v3 5/6] x86/vPIC: vpic_elcr_mask() master bit 2 control

2023-12-05 Thread Roger Pau Monné
On Tue, Nov 28, 2023 at 11:35:46AM +0100, Jan Beulich wrote:
> Master bit 2 is treated specially: We force it set, but we don't expose
> the bit being set to the guest. While right now the read and write
> handling can easily use the fixed mask, the restore input checking that
> is about to be put in place wants to use the inverted mask to prove that
> no bits are unduly set. That will require master bit 2 to be set. Otoh
> the read path requires the bit to be clear (the bit can have either
> value for the use on the write path). Hence allow use sites control over
> that bit.
> 
> Signed-off-by: Jan Beulich 
> ---
> v3: New, split from larger patch.
> ---
> I'm certainly open to naming suggestions for the new macro parameter.
> "mb2" can certainly be misleading as to Multiboot 2. Yet "master_bit_2"
> it too long for my taste, not the least because of the macro then
> needing to be split across lines.

Let's leave it as mb2, I think given the context it is difficult to
mislead this code as having anything to do with multiboot.

> 
> --- a/xen/arch/x86/hvm/vpic.c
> +++ b/xen/arch/x86/hvm/vpic.c
> @@ -41,7 +41,7 @@
>  #define vpic_lock(v)   spin_lock(__vpic_lock(v))
>  #define vpic_unlock(v) spin_unlock(__vpic_lock(v))
>  #define vpic_is_locked(v) spin_is_locked(__vpic_lock(v))
> -#define vpic_elcr_mask(v) ((v)->is_master ? 0xf8 : 0xde)
> +#define vpic_elcr_mask(v, mb2) ((v)->is_master ? 0xf8 | ((mb2) << 2) : 0xde)
>  
>  /* Return the highest priority found in mask. Return 8 if none. */
>  #define VPIC_PRIO_NONE 8
> @@ -387,7 +387,7 @@ static int cf_check vpic_intercept_elcr_
>  if ( dir == IOREQ_WRITE )
>  {
>  /* Some IRs are always edge trig. Slave IR is always level trig. 
> */
> -data = (*val >> shift) & vpic_elcr_mask(vpic);
> +data = (*val >> shift) & vpic_elcr_mask(vpic, 1);

Not that it matters much, but I think you could use
vpic_elcr_mask(vpic, 0) to strictly keep the same behavior as
before?

>  if ( vpic->is_master )
>  data |= 1 << 2;

Since the bit is forcefully set here anyway.

Regardless:

Reviewed-by: Roger Pau Monné 

Thanks, Roger.



Re: Trying add smt=off disabled cores to cpupool crash xen

2023-12-05 Thread René Winther Højgaard
You are right about sched-gran=core being the issue.

I don't know if this is relevant, but my CPU shouldn't be able to use 
sched-gran=core it's asymmetric.

smt=on with sched-gran=core gives me a warning that it's falling back to 
sched-gran=cpu, I tested smt=off with sched-gran=cpu and it works.

This warning is missing with sched-gran=core and smt=off 

(XEN) ***
(XEN) Asymmetric cpu configuration.
(XEN) Falling back to sched-gran=cpu.
(XEN) ***


/rene


On Tuesday, December 5th, 2023 at 07:21, Juergen Gross  wrote:


> On 04.12.23 18:40, René Winther Højgaard wrote:
> 

> > Hi Juergen,
> > 

> > Sorry for the late reply.
> > 

> > Here are the commands I execute, it is 'xl cpupool-cpu-add pcores 4-15' 
> > that crash the system.
> > 

> > xl cpupool-cpu-remove Pool-0 4-31
> > xl cpupool-create name=\"ecores\" sched=\"credit\"
> > xl cpupool-cpu-add ecores 16-31
> > 

> > xl cpupool-create name=\"pcores\" sched=\"credit\"
> > xl cpupool-cpu-add pcores 4-15
> > 

> > Here is the other information you asked for.
> > 

> > xl cpupool-list:
> > Name CPUs Sched Active Domain count
> > Pool-0 24 credit y 5
> > 

> > xl cpupool-list -c:
> > Name CPU list
> > Pool-0 0,2,4,6,8,10,12,14,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
> > 

> > xl info:
> > host : dom0
> > release : 6.1.62-1.qubes.fc37.x86_64
> > version : #1 SMP PREEMPT_DYNAMIC Tue Nov 14 06:16:38 GMT 2023
> > machine : x86_64
> > nr_cpus : 24
> > max_cpu_id : 31
> > nr_nodes : 1
> > cores_per_socket : 24
> > threads_per_core : 1
> > cpu_mhz : 2995.196
> > hw_caps : 
> > bfebfbff:77faf3ff:2c100800:0121:000f:239c27eb:1840078c:0100
> > virt_caps : pv hvm hvm_directio pv_directio hap iommu_hap_pt_share vmtrace 
> > gnttab-v1
> > total_memory : 65373
> > free_memory : 56505
> > sharing_freed_memory : 0
> > sharing_used_memory : 0
> > outstanding_claims : 0
> > free_cpus : 0
> > xen_major : 4
> > xen_minor : 17
> > xen_extra : .2
> > xen_version : 4.17.2
> > xen_caps : xen-3.0-x86_64 hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
> > 

> > xen_scheduler : credit
> > xen_pagesize : 4096
> > platform_params : virt_start=0x8000
> > xen_changeset :
> > 

> > xen_commandline : placeholder dom0_mem=min:2048M dom0_mem=max:4096M 
> > ucode=scan gnttab_max_frames=2048 gnttab_max_maptrack_frames=4096 smt=off 
> > dom0_max_vcpus=4 dom0_vcpus_pin sched-gran=core sched=credit no-real-mode 
> > edd=off
> 

> 

> Please drop the "sched-gran=core" from the Xen boot parameters. It doesn't 
> make
> any sense with smt=off and is adding additional complexity.
> 

> It shouldn't crash, but core scheduling is still "Experimental". I'll look 
> into
> the issue later.
> 

> 

> Juergen

publickey - renewin@proton.me - 0x43C32E54.asc
Description: application/pgp-keys


signature.asc
Description: OpenPGP digital signature


Re: [PATCH v6 4/5] [FUTURE] xen/arm: enable vPCI for domUs

2023-12-05 Thread Roger Pau Monné
On Tue, Dec 05, 2023 at 11:27:03AM -0500, Stewart Hildebrand wrote:
> On 12/5/23 06:08, Roger Pau Monné wrote:
> > On Mon, Dec 04, 2023 at 02:07:51PM -0800, Stefano Stabellini wrote:
> >> On Mon, 4 Dec 2023, Roger Pau Monné wrote:
> >>> On Fri, Dec 01, 2023 at 06:56:32PM -0800, Stefano Stabellini wrote:
>  On Fri, 1 Dec 2023, Roger Pau Monné wrote:
> > On Mon, Nov 13, 2023 at 05:21:13PM -0500, Stewart Hildebrand wrote:
> >> @@ -1618,6 +1630,14 @@ int iommu_do_pci_domctl(
> >>  bus = PCI_BUS(machine_sbdf);
> >>  devfn = PCI_DEVFN(machine_sbdf);
> >>  
> >> +if ( needs_vpci(d) && !has_vpci(d) )
> >> +{
> >> +printk(XENLOG_G_WARNING "Cannot assign %pp to %pd: vPCI 
> >> support not enabled\n",
> >> +   _SBDF(seg, bus, devfn), d);
> >> +ret = -EPERM;
> >> +break;
> >
> > I think this is likely too restrictive going forward.  The current
> > approach is indeed to enable vPCI on a per-domain basis because that's
> > how PVH dom0 uses it, due to being unable to use ioreq servers.
> >
> > If we start to expose vPCI suport to guests the interface should be on
> > a per-device basis, so that vPCI could be enabled for some devices,
> > while others could still be handled by ioreq servers.
> >
> > We might want to add a new flag to xen_domctl_assign_device (used by
> > XEN_DOMCTL_assign_device) in order to signal whether the device will
> > use vPCI.
> 
>  Actually I don't think this is a good idea. I am all for flexibility but
>  supporting multiple different configurations comes at an extra cost for
>  both maintainers and contributors. I think we should try to reduce the
>  amount of configurations we support rather than increasing them
>  (especially on x86 where we have PV, PVH, HVM).
> >>>
> >>> I think it's perfectly fine to initially require a domain to have all
> >>> its devices either passed through using vPCI or ireqs, but the
> >>> interface IMO should allow for such differentiation in the future.
> >>> That's why I think introducing a domain wide vPCI flag might not be
> >>> the best option going forward.
> >>>
> >>> It would be perfectly fine for XEN_DOMCTL_assign_device to set a
> >>> domain wide vPCI flag, iow:
> >>>
> >>> if ( HYPERCALL_VPCI_FLAG_SET && !has_vpci(d) )
> >>> {
> >>> if ( has_arch_pdevs(d) )
> >>> {
> >>> printk("All passthrough devices must use the same backend\n");
> >>> return -EINVAL;
> >>> }
> >>>
> >>> /* Set vPCI domain flag */
> >>> }
> >>
> >> That would be fine by me, but maybe we can avoid this change too. I was
> >> imagining that vPCI would be enabled at domain creation, not at runtime.
> >> And that vPCI would be enabled by default for all PVH guests (once we
> >> are past the initial experimental phase.)
> > 
> > Then we don't even need a new CDF flag, and just enable vPCI when
> > IOMMU is enabled?  IOW: we can key the enabling of vPCI to
> > XEN_DOMCTL_CDF_iommu for specific domain types?
> 
> There are many Arm based platforms that need to use iommu but don't have (or 
> don't use) PCI, so we'd still like to have a separate vPCI flag.

OK, read below though - if we switch to vPCI being a descendant of
IOREQ (so that the PCI config space decoding is done by IOREQ) we
could hotplug vPCI managed devices at runtime without requiring any
prior initialization at domain create, since the traps to the PCI
config space would be setup by IOREQ.

We might need a PCI flag in order to signal whether the domain is
intended to use PCI devices or not, and so whether IOREQ needs to
setup PCI config space traps (either fully emulated or passthrough
devices).  But that would be arch-specific AFAICT, as on x86 we
always trap accesses to the PCI IO ports.

Thanks, Roger.



Re: [PATCH -next RFC 01/14] block: add some bdev apis

2023-12-05 Thread Bart Van Assche

On 12/5/23 04:37, Yu Kuai wrote:

+static inline u8 block_bits(struct block_device *bdev)
+{
+   return bdev->bd_inode->i_blkbits;
+}


This function needs a name that's more descriptive.

Thanks,

Bart.



Re: [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function

2023-12-05 Thread Thomas Gleixner
On Mon, Dec 04 2023 at 13:31, Stefano Stabellini wrote:
> On Mon, 3 Dec 2023, Chen, Jiqian wrote:
>> >> vpci device state when device is reset on dom0 side.
>> >>
>> >> And call that function in pcistub_init_device. Because when
>> >> we use "pci-assignable-add" to assign a passthrough device in
>> >> Xen, it will reset passthrough device and the vpci state will
>> >> out of date, and then device will fail to restore bar state.
>> >>
>> >> Signed-off-by: Jiqian Chen 
>> >> Signed-off-by: Huang Rui 
>> > 
>> > This Signed-off-by chain is incorrect.
>> > 
>> > Documentation/process/submitting-patches.rst has a full chapter about
>> > S-O-B and the correct usage.
>> I am the author of this series of patches, and Huang Rui transported the v1 
>> to upstream. And now I transport v2. I am not aware that the SOB chain is 
>> incorrect.
>> Do you have any suggestions?
>
> I think he means that your Signed-off-by should be the second one of the
> two as you are the one submitting the patch to the LKML

No.

   Mailfrom: Jiqian Chen 
   

   Changelog-text

   Signed-off-by: Huang Rui 
   Signed-off-by: Jiqian Chen 

is equally wrong because that would end up with Chen as author and Huang
as first S-O-B which is required to be the author's S-O-B

To make the above correct this would require:

   Mailfrom: Jiqian Chen 
   

   From: Huang Rui 

   Changelog-text

   Signed-off-by: Huang Rui 
   Signed-off-by: Jiqian Chen 

   which tells that Huang is the author and Chen is the 'transporter',
   which unfortunately does not reflect reality.

Or:

   Mailfrom: Jiqian Chen 
   

   Changelog-text

   Co-developed-by: Huang Rui 
   Signed-off-by: Huang Rui 
   Signed-off-by: Jiqian Chen 

   which tells that Checn is the author and Huang co-developed the
   patch, which might be true or not.


V1 which was sent by Huang has the ordering is correct:

   Mailfrom: Huang Rui 
   

   From: Jiqian Chen 

   Changelog-text

   Signed-off-by: Jiqian Chen 
   Signed-off-by: Huang Rui 

   i.e. Chen authored and Huang transported

Now this V2 has not really much to do with V1 and is a new
implementation to solve the problem, which was authored by Chen, so
Huang is not involved at all if I understand correctly.

So what does his S-O-B mean here? Nothing...

It's very well documented how the whole S-O-B business works and it's
not really rocket science to get it straight.

It has a meaning and is not just for decoration purposes.

Thanks,

tglx



Re: [PATCH 1/2] xen: make include/xen/unaligned.h usable on all architectures

2023-12-05 Thread Juergen Gross

On 05.12.23 17:29, Julien Grall wrote:

Hi Arnd,

On 05/12/2023 14:37, Arnd Bergmann wrote:

On Tue, Dec 5, 2023, at 15:19, Julien Grall wrote:

On 05/12/2023 14:10, Arnd Bergmann wrote:

On Tue, Dec 5, 2023, at 15:01, Julien Grall wrote:

On 05/12/2023 13:59, Jan Beulich wrote:

On 05.12.2023 14:46, Julien Grall wrote:

This would repeat the mistake that we had in Linux in the
past (and still do in some drivers): Simply dereferencing
a misaligned pointer is always a bug, even on architectures
that have efficient unaligned load/store instructions,
because C makes this undefined behavior and gcc has
optimizations that assume e.g. 'int *' to never have
the lower two bits set [1].


Just to clarify, I haven't suggested to use 'int *'. My point was more
that I don't think that the helpers would work as-is on arm32 because
even if the ISA allows a direct access, we are setting the bit in SCTLR
to disable unaligned access.

As Juergen is proposing a common header, then I could ask him to do the
work to confirm that the helpers properly work on arm32. But I think
this is unfair.


When I introduced the helpers in Linux, I showed that these
produce the best output on all modern compilers (at least gcc-5,
probably earlier) for both architectures that allow unaligned
access and for those that don't. We used to have architecture
specific helpers depending on what each architecture could
do, but all the other variants we had were either wrong or
less efficient.

If for some reason an Arm system is configured to trap
all unaligned access, then you must already pass
-mno-unaligned-access to the compiler to prevent certain
optimizations, and then the helpers will still behave
correctly (the same way they do on armv5, which never has
unaligned access). On armv7 with -munaligned-access, the
same functions only prevent the use of stm/ldm and strd/ldrd
but still use ldr/str.


Unfortunately we don't explicitely do. This would explain why I saw some issues 
with certain compiler [1].


So I agree that adding -mno-unaligned-access for arm32 makes sense.

@Juergen, do you want me to send a patch?


Yes, will do.


Juergen



OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature.asc
Description: OpenPGP digital signature


[XEN PATCH] ns16550: remove partial explicit initializer

2023-12-05 Thread Nicola Vetrini
The initializer of 'ns16550_com' violates MISRA C Rule 9.3
because it explicitly initializes only the first element of the array,
but the semantics is the same if the explicit initialization is
omitted.

No functional change.

Signed-off-by: Nicola Vetrini 
---
In the context of the rule ("Arrays shall not be partially initialized"),
the initialization shall either be fully explicit or implicit.
---
 xen/drivers/char/ns16550.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c
index ddf2a48be6e7..a21c1d8c3402 100644
--- a/xen/drivers/char/ns16550.c
+++ b/xen/drivers/char/ns16550.c
@@ -73,7 +73,7 @@ static struct ns16550 {
 bool msi;
 const struct ns16550_config_param *param; /* Points into .init.*! */
 #endif
-} ns16550_com[2] = { { 0 } };
+} ns16550_com[2] = { };
 
 #ifdef NS16550_PCI
 struct ns16550_config {
-- 
2.34.1



Re: [PATCH 1/2] xen: make include/xen/unaligned.h usable on all architectures

2023-12-05 Thread Julien Grall

Hi Arnd,

On 05/12/2023 14:37, Arnd Bergmann wrote:

On Tue, Dec 5, 2023, at 15:19, Julien Grall wrote:

On 05/12/2023 14:10, Arnd Bergmann wrote:

On Tue, Dec 5, 2023, at 15:01, Julien Grall wrote:

On 05/12/2023 13:59, Jan Beulich wrote:

On 05.12.2023 14:46, Julien Grall wrote:

This would repeat the mistake that we had in Linux in the
past (and still do in some drivers): Simply dereferencing
a misaligned pointer is always a bug, even on architectures
that have efficient unaligned load/store instructions,
because C makes this undefined behavior and gcc has
optimizations that assume e.g. 'int *' to never have
the lower two bits set [1].


Just to clarify, I haven't suggested to use 'int *'. My point was more
that I don't think that the helpers would work as-is on arm32 because
even if the ISA allows a direct access, we are setting the bit in SCTLR
to disable unaligned access.

As Juergen is proposing a common header, then I could ask him to do the
work to confirm that the helpers properly work on arm32. But I think
this is unfair.


When I introduced the helpers in Linux, I showed that these
produce the best output on all modern compilers (at least gcc-5,
probably earlier) for both architectures that allow unaligned
access and for those that don't. We used to have architecture
specific helpers depending on what each architecture could
do, but all the other variants we had were either wrong or
less efficient.

If for some reason an Arm system is configured to trap
all unaligned access, then you must already pass
-mno-unaligned-access to the compiler to prevent certain
optimizations, and then the helpers will still behave
correctly (the same way they do on armv5, which never has
unaligned access). On armv7 with -munaligned-access, the
same functions only prevent the use of stm/ldm and strd/ldrd
but still use ldr/str.


Unfortunately we don't explicitely do. This would explain why I saw some 
issues with certain compiler [1].


So I agree that adding -mno-unaligned-access for arm32 makes sense.

@Juergen, do you want me to send a patch?

Cheers,

[1] c71163f6-2646-6fae-cb22-600eb0486...@xen.org

--
Julien Grall



Re: [PATCH v6 4/5] [FUTURE] xen/arm: enable vPCI for domUs

2023-12-05 Thread Stewart Hildebrand
On 12/5/23 06:08, Roger Pau Monné wrote:
> On Mon, Dec 04, 2023 at 02:07:51PM -0800, Stefano Stabellini wrote:
>> On Mon, 4 Dec 2023, Roger Pau Monné wrote:
>>> On Fri, Dec 01, 2023 at 06:56:32PM -0800, Stefano Stabellini wrote:
 On Fri, 1 Dec 2023, Roger Pau Monné wrote:
> On Mon, Nov 13, 2023 at 05:21:13PM -0500, Stewart Hildebrand wrote:
>> @@ -1618,6 +1630,14 @@ int iommu_do_pci_domctl(
>>  bus = PCI_BUS(machine_sbdf);
>>  devfn = PCI_DEVFN(machine_sbdf);
>>  
>> +if ( needs_vpci(d) && !has_vpci(d) )
>> +{
>> +printk(XENLOG_G_WARNING "Cannot assign %pp to %pd: vPCI 
>> support not enabled\n",
>> +   _SBDF(seg, bus, devfn), d);
>> +ret = -EPERM;
>> +break;
>
> I think this is likely too restrictive going forward.  The current
> approach is indeed to enable vPCI on a per-domain basis because that's
> how PVH dom0 uses it, due to being unable to use ioreq servers.
>
> If we start to expose vPCI suport to guests the interface should be on
> a per-device basis, so that vPCI could be enabled for some devices,
> while others could still be handled by ioreq servers.
>
> We might want to add a new flag to xen_domctl_assign_device (used by
> XEN_DOMCTL_assign_device) in order to signal whether the device will
> use vPCI.

 Actually I don't think this is a good idea. I am all for flexibility but
 supporting multiple different configurations comes at an extra cost for
 both maintainers and contributors. I think we should try to reduce the
 amount of configurations we support rather than increasing them
 (especially on x86 where we have PV, PVH, HVM).
>>>
>>> I think it's perfectly fine to initially require a domain to have all
>>> its devices either passed through using vPCI or ireqs, but the
>>> interface IMO should allow for such differentiation in the future.
>>> That's why I think introducing a domain wide vPCI flag might not be
>>> the best option going forward.
>>>
>>> It would be perfectly fine for XEN_DOMCTL_assign_device to set a
>>> domain wide vPCI flag, iow:
>>>
>>> if ( HYPERCALL_VPCI_FLAG_SET && !has_vpci(d) )
>>> {
>>> if ( has_arch_pdevs(d) )
>>> {
>>> printk("All passthrough devices must use the same backend\n");
>>> return -EINVAL;
>>> }
>>>
>>> /* Set vPCI domain flag */
>>> }
>>
>> That would be fine by me, but maybe we can avoid this change too. I was
>> imagining that vPCI would be enabled at domain creation, not at runtime.
>> And that vPCI would be enabled by default for all PVH guests (once we
>> are past the initial experimental phase.)
> 
> Then we don't even need a new CDF flag, and just enable vPCI when
> IOMMU is enabled?  IOW: we can key the enabling of vPCI to
> XEN_DOMCTL_CDF_iommu for specific domain types?

There are many Arm based platforms that need to use iommu but don't have (or 
don't use) PCI, so we'd still like to have a separate vPCI flag.

> 
> Maybe that's not so trivial on x86, as there's no x86 PVH domain type
> from the hypervisor PoV.
> 
>>
>>> We have already agreed that we want to aim for a setup where ioreqs
>>> and vPCI could be used for the same domain, but I guess you assumed
>>> that ioreqs would be used for emulated devices exclusively and vPCI
>>> for passthrough devices?
>>
>> Yes, that's right
>>
>>
>>> Overall if we agree that ioreqs and vPCI should co-exist for a domain,
>>> I'm not sure there's much reason to limit ioreqs to only handle
>>> emulated devices, seems like an arbitrary limitation.
>>
>> Reply below
>>
>>
 I don't think we should enable IOREQ servers to handle PCI passthrough
 for PVH guests and/or guests with vPCI. If the domain has vPCI, PCI
 Passthrough can be handled by vPCI just fine. I think this should be a
 good anti-feature to have (a goal to explicitly not add this feature) to
 reduce complexity. Unless you see a specific usecase to add support for
 it?
>>>
>>> There are passthrough devices (GPUs) that might require some extra
>>> mediation on dom0 (like the Intel GVT-g thing) that would force the
>>> usage of ioreqs to passthrough.
>>
>> From an architectural perspective, I think it would be cleaner, simpler
>> to maintain, and simpler to understand if Xen was the sole owner of the
>> PCI Root Complex and PCI config space mediation (implemented with vPCI).
>> IOREQ can be used for emulation and it works very well for that. At
>> least in my mind, that makes things much simpler.
> 
> But IOREQ already has all the code to mediate accesses to the PCI
> config space, and the interface to register separate servers for
> different PCI devices.
> 
> We would then need to duplicate this internally for vPCI, so that vPCI
> could forward accesses to IOREQ just for IOREQ to forward to yet a
> different component?  Seems like a lot of duplication for no benefit.
> 
>> I understand 

Re: [PATCH v2] x86/DMI: adjustments to comply with Misra C:2012 Rule 9.3

2023-12-05 Thread Nicola Vetrini

Hi Jan,

On 2023-12-05 14:35, Jan Beulich wrote:

The rule demands that all array elements be initialized (or dedicated
initializers be used). Introduce a small set of macros to allow doing 
so

without unduly affecting use sites (in particular in terms of how many
elements .matches[] actually has; right now there's no use of
DMI_MATCH4(), so we could even consider reducing the array size to 3).

Signed-off-by: Jan Beulich 
---
Of course a question is how many of these DMI table entries are in fact
no longer applicable (e.g. because of naming 32-bit-only systems).
Subsequently the table in dmi_scan.c itself may want cleaning up as
well, yet I guess the question of stale entries is even more relevant
there.
---
v2: Make things also build with older gcc.



Analyzed with ECLAIR for Rule 9.3: resolves all the violations related 
to DMI_MATCH.


--
Nicola Vetrini, BSc
Software Engineer, BUGSENG srl (https://bugseng.com)



Re: [BUG]i2c_hid_acpi broken with 4.17.2 on Framework Laptop 13 AMD

2023-12-05 Thread Sébastien Chaumat
Le mar. 5 déc. 2023 à 15:18, Jan Beulich  a écrit :
>
> On 05.12.2023 15:14, Sébastien Chaumat wrote:
> >  booting kernel with "dyndbg=file drivers/gpio/* +p"
>
> I'm afraid this doesn't tell me anything. I'm simply not familiar with
> Linux'es GPIO handling.
>
Thanks for your help so far.
I moved  the issue to linux-gpio devs  :

https://marc.info/?l=linux-gpio=17019023453=1

Sébastien



Re: [PATCH v2 5/6] x86/iommu: switch hwdom IOMMU to use a rangeset

2023-12-05 Thread Roger Pau Monné
On Tue, Dec 05, 2023 at 04:47:15PM +0100, Jan Beulich wrote:
> On 05.12.2023 16:43, Roger Pau Monné wrote:
> > On Tue, Dec 05, 2023 at 04:27:05PM +0100, Jan Beulich wrote:
> >> On 04.12.2023 10:43, Roger Pau Monne wrote:
> >>> @@ -476,58 +406,55 @@ void __hwdom_init arch_iommu_hwdom_init(struct 
> >>> domain *d)
> >>>  if ( !map )
> >>>  panic("IOMMU init: unable to allocate rangeset\n");
> >>>  
> >>> -max_pfn = (GB(4) >> PAGE_SHIFT) - 1;
> >>> -top = max(max_pdx, pfn_to_pdx(max_pfn) + 1);
> >>> +if ( iommu_hwdom_inclusive )
> >>> +{
> >>> +/* Add the whole range below 4GB, UNUSABLE regions will be 
> >>> removed. */
> >>> +rc = rangeset_add_range(map, 0, max_pfn);
> >>> +if ( rc )
> >>> +panic("IOMMU inclusive mappings can't be added: %d\n",
> >>> +  rc);
> >>> +}
> >>>  
> >>> -for ( i = 0, start = 0, count = 0; i < top; )
> >>> +for ( i = 0; i < e820.nr_map; i++ )
> >>>  {
> >>> -unsigned long pfn = pdx_to_pfn(i);
> >>> -unsigned int perms = hwdom_iommu_map(d, pfn, max_pfn);
> >>> +struct e820entry entry = e820.map[i];
> >>>  
> >>> -if ( !perms )
> >>> -/* nothing */;
> >>> -else if ( paging_mode_translate(d) )
> >>> +switch ( entry.type )
> >>>  {
> >>> -int rc;
> >>> +case E820_UNUSABLE:
> >>> +if ( !iommu_hwdom_inclusive || PFN_DOWN(entry.addr) > 
> >>> max_pfn )
> >>> +continue;
> >>
> >> The !iommu_hwdom_inclusive part isn't really needed here, is it? The ...
> > 
> > Nor the PFN_DOWN(entry.addr) > max_pfn.
> 
> Hmm, I couldn't convince myself that could also be dropped.

We never map unusable regions, so it's always fine to remove them from
the rangeset.  The condition was just a way to exit early and avoid
the rangeset_remove_range() call.

> >>> -rc = p2m_add_identity_entry(d, pfn,
> >>> -perms & IOMMUF_writable ? 
> >>> p2m_access_rw
> >>> -: 
> >>> p2m_access_r,
> >>> -0);
> >>> +rc = rangeset_remove_range(map, PFN_DOWN(entry.addr),
> >>> +   PFN_DOWN(entry.addr + entry.size 
> >>> - 1));
> >>
> >> ... call here would then simply be a no-op, as it looks. And things would
> >> overall look more safe if the removal was skipped for fewer reasons.
> > 
> > OK, the removal can be done unconditionally if so desired.
> > 
> >>> @@ -605,7 +532,7 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain 
> >>> *d)
> >>>  rangeset_destroy(map);
> >>>  
> >>>  /* Use if to avoid compiler warning */
> >>> -if ( iommu_iotlb_flush_all(d, flush_flags) )
> >>> +if ( iommu_iotlb_flush_all(d, map_data.flush_flags) )
> >>>  return;
> >>>  }
> >>
> >> Ah yes, here is said change. But I think for correctness this wants
> >> moving to the earlier patch.
> > 
> > OK, so something like:
> > 
> > map_data.flush_flags |= flush_flags;
> 
> Or simply drop flush_flags here right away (read: replace by map.flush_flags).

Right, OK, that will lead to some small changes to existing code which
I wanted to avoid in the context of just adding new code to deal with
a rangeset, but anyway, if that's preferred I will adjust.

Thanks, Roger.



Re: [PATCH v2 07/39] xen/riscv: introduce arch-riscv/hvm/save.h

2023-12-05 Thread Jan Beulich
On 24.11.2023 11:30, Oleksii Kurochko wrote:
> --- /dev/null
> +++ b/xen/include/public/arch-riscv/hvm/save.h
> @@ -0,0 +1,20 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Structure definitions for HVM state that is held by Xen and must
> + * be saved along with the domain's memory and device-model state.
> + */
> +
> +#ifndef __XEN_PUBLIC_HVM_SAVE_RISCV_H__
> +#define __XEN_PUBLIC_HVM_SAVE_RISCV_H__
> +
> +#endif
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */

Seeing that Arm's is as empty, I wonder why we have it. Julien, Stefano?

> --- a/xen/include/public/hvm/save.h
> +++ b/xen/include/public/hvm/save.h
> @@ -91,6 +91,8 @@ DECLARE_HVM_SAVE_TYPE(END, 0, struct hvm_save_end);
>  #include "../arch-arm/hvm/save.h"
>  #elif defined(__powerpc64__)
>  #include "../arch-ppc.h"
> +#elif defined(__riscv)
> +#include "../arch-riscv/hvm/save.h"
>  #else
>  #error "unsupported architecture"
>  #endif

The PPC part here looks bogus altogether. Shawn?

Jan



Re: [PATCH v11 06/17] vpci/header: rework exit path in init_bars

2023-12-05 Thread Stewart Hildebrand
On 12/1/23 20:27, Volodymyr Babchuk wrote:
> Introduce "fail" label in init_bars() function to have the centralized

The name was correct at the time of submission, but since then, init_bars() was 
renamed to init_header() in staging

> error return path. This is the pre-requirement for the future changes
> in this function.
> 
> This patch does not introduce functional changes.
> 
> Signed-off-by: Volodymyr Babchuk 
> Suggested-by: Roger Pau Monné 
> Acked-by: Roger Pau Monné 
> --

NIT: The scissors line should be tree dashes, not two



Re: [PATCH v2 06/39] xen/riscv: introduce fence.h

2023-12-05 Thread Jan Beulich
On 24.11.2023 11:30, Oleksii Kurochko wrote:
> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/fence.h
> @@ -0,0 +1,12 @@
> +#ifndef _ASM_RISCV_FENCE_H
> +#define _ASM_RISCV_FENCE_H
> +
> +#ifdef CONFIG_SMP
> +#define RISCV_ACQUIRE_BARRIER"\tfence r , rw\n"
> +#define RISCV_RELEASE_BARRIER"\tfence rw,  w\n"
> +#else
> +#define RISCV_ACQUIRE_BARRIER
> +#define RISCV_RELEASE_BARRIER
> +#endif
> +
> +#endif   /* _ASM_RISCV_FENCE_H */

Imo such a header would be better to introduce once a use for the
constructs appears. Otherwise at the very least it wants explaining
in the description what this is going to be needed for. I can't
find items of these names in other architectures so far, so this
must be something RISC-V-specific.

Jan



Re: [PATCH v3 2/6] x86/HVM: split restore state checking from state loading

2023-12-05 Thread Roger Pau Monné
On Tue, Dec 05, 2023 at 03:59:13PM +0100, Jan Beulich wrote:
> On 05.12.2023 15:29, Roger Pau Monné wrote:
> > On Tue, Dec 05, 2023 at 09:52:31AM +0100, Jan Beulich wrote:
> >> On 04.12.2023 18:27, Roger Pau Monné wrote:
> >>> On Tue, Nov 28, 2023 at 11:34:04AM +0100, Jan Beulich wrote:
>  ..., at least as reasonably feasible without making a check hook
>  mandatory (in particular strict vs relaxed/zero-extend length checking
>  can't be done early this way).
> 
>  Note that only one of the two uses of hvm_load() is accompanied with
>  hvm_check(). The other directly consumes hvm_save() output, which ought
>  to be well-formed. This means that while input data related checks don't
>  need repeating in the "load" function when already done by the "check"
>  one (albeit assertions to this effect may be desirable), domain state
>  related checks (e.g. has_xyz(d)) will be required in both places.
> 
>  Suggested-by: Roger Pau Monné 
>  Signed-off-by: Jan Beulich 
>  ---
>  Do we really need all the copying involved in use of _hvm_read_entry()
>  (backing hvm_load_entry()? Zero-extending loads are likely easier to
>  handle that way, but for strict loads all we gain is a reduced risk of
>  unaligned accesses (compared to simply pointing into h->data[]).
> >>>
> >>> See below, but I wonder whether the checks could be performed as part
> >>> of hvm_load() without having to introduce a separate handler and loop
> >>> over the context entries.
> >>
> >> Specifically not. State loading (in the longer run) would better not fail
> >> once started. (Imo it should have been this way from the beginning.) Only
> >> then will the vCPU still be in a predictable state even after a possible
> >> error.
> > 
> > Looking at the callers, does such predictable state after failure
> > matter?
> > 
> > One caller is an hypercall used by the toolstack at domain create,
> > failing can just lead to the domain being destroyed.  The other caller
> > is vm fork, which will also lead to the fork being destroyed if
> > context loading fails.
> > 
> > Maybe I'm overlooking something.
> 
> You don't (I think), but existing callers necessarily have to behave the
> way you describe. From an abstract perspective, though, failed state
> loading would better allow a retry. And really I thought that when you
> suggested to split checking from loading, you had exactly that in mind.

Not really TBH, because I didn't think that much on a possible
implementation when proposing it.

Maybe a suitable compromise would be to reset the state to the initial
(at domain build) one on failure?

I do dislike the duplicated loops, as it seems like a lot of duplicate
boilerplate code, and I have fears of it going out of sync.

>  Would the hvm_sr_handlers[] better use array_access_nospec()?
> >>>
> >>> Maybe?  Given this is a domctl I do wonder whether a domain already
> >>> having access to such interface won't have easier ways to leak data
> >>> from Xen.  Maybe for a disaggregated setup.
> >>
> >> Hmm, now we're in the middle - Andrew effectively said "no need to".
> > 
> > I'm certainly not an expert on whether array_access_nospec() should be
> > used, so if Andrew says no need, that's likely better advice.
> > 
> > Maybe the xsm check used in such desegregated setups would already
> > stop speculation?
> 
> There's no XSM check anywhere near, and even if there was I don't see
> how it would stop mis-speculation on those array accesses.

This being a slow path anyway, I don't think the extra
array_access_nospec() would make much of an impact, but again I have
to admit it's unclear to me when those are actually required, so I
might suggest adding them out of precaution.

>  @@ -275,6 +281,78 @@ int hvm_save(struct domain *d, hvm_domai
>   return 0;
>   }
>   
>  +int hvm_check(const struct domain *d, hvm_domain_context_t *h)
>  +{
>  +const struct hvm_save_header *hdr;
>  +int rc;
>  +
>  +if ( d->is_dying )
>  +return -EINVAL;
>  +
>  +/* Get at the save header, which must be first. */
>  +hdr = hvm_get_entry(HEADER, h);
>  +if ( !hdr )
>  +return -ENODATA;
>  +
>  +rc = arch_hvm_check(d, hdr);
>  +if ( rc )
>  +return rc;
>  +
>  +for ( ; ; )
>  +{
>  +const struct hvm_save_descriptor *desc;
>  +hvm_check_handler handler;
>  +
>  +if ( h->size - h->cur < sizeof(*desc) )
>  +{
>  +/* Run out of data */
>  +printk(XENLOG_G_ERR
>  +   "HVM restore %pd: save did not end with a null 
>  entry\n",
>  +   d);
>  +return -ENODATA;
>  +}
>  +
>  +/* Read the typecode of the next entry and check for the 
>  end-marker. */
>  +desc = (const void 

Re: [PATCH v2 05/39] xen/riscv: introduce spinlock.h

2023-12-05 Thread Jan Beulich
On 24.11.2023 11:30, Oleksii Kurochko wrote:
> Signed-off-by: Oleksii Kurochko 

Acked-by: Jan Beulich 





[xen-unstable test] 183990: tolerable FAIL

2023-12-05 Thread osstest service owner
flight 183990 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/183990/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 183983
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 183983
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 183983
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 183983
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 183983
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 183983
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 183983
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 183983
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 183983
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 183983
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 183983
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 183983
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  525c7c094b258e8a46b494488eef96f5670eb352
baseline version:
 xen  525c7c094b258e8a46b494488eef96f5670eb352

Last test of basis   183990  2023-12-05 01:55:39 Z0 days
Testing same since  (not found) 0 attempts

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 

Re: [PATCH v2 6/6] x86/iommu: cleanup unused functions

2023-12-05 Thread Jan Beulich
On 05.12.2023 16:48, Roger Pau Monné wrote:
> On Tue, Dec 05, 2023 at 04:29:18PM +0100, Jan Beulich wrote:
>> On 04.12.2023 10:43, Roger Pau Monne wrote:
>>> Remove xen_in_range() and vpci_is_mmcfg_address() now that hey are unused.
>>
>> Of the latter you remove only the declaration. Stale patch maybe?
> 
> Fixed.
> 
>> For the former, am I misremembering that Andrew had asked for the function
>> to stay?
> 
> https://lore.kernel.org/xen-devel/81534803-9da4-49b7-894e-f3fb5e8fb...@citrix.com
> 
> I did read Andrew's response as it being fine to switch the current
> function to use a rangeset, but not as a request to duplicate it.

Hmm, maybe I inferred too much from "And I'd hoped to make this common".

Jan




  1   2   3   >