Re: [PATCH] tcg/arm: Increase stack alignment for function generation
On 9/1/21 8:41 PM, Peter Maydell wrote: On Wed, 1 Sept 2021 at 19:30, Richard W.M. Jones wrote: On Wed, Sep 01, 2021 at 07:18:03PM +0100, Peter Maydell wrote: On Wed, 1 Sept 2021 at 18:01, Richard W.M. Jones wrote: This avoids the following assertion when the kernel initializes X.509 certificates: [7.315373] Loading compiled-in X.509 certificates qemu-system-arm: ../tcg/tcg.c:3063: temp_allocate_frame: Assertion `align <= TCG_TARGET_STACK_ALIGN' failed. Fixes: commit c1c091948ae Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1999878 Cc: qemu-sta...@nongnu.org Tested-by: Richard W.M. Jones Signed-off-by: Richard W.M. Jones --- tcg/arm/tcg-target.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h index d113b7f8db..09df3b39a1 100644 --- a/tcg/arm/tcg-target.h +++ b/tcg/arm/tcg-target.h @@ -115,7 +115,7 @@ extern bool use_neon_instructions; #endif /* used for function call generation */ -#define TCG_TARGET_STACK_ALIGN 8 +#define TCG_TARGET_STACK_ALIGN 16 #define TCG_TARGET_CALL_ALIGN_ARGS 1 #define TCG_TARGET_CALL_STACK_OFFSET 0 The 32-bit Arm procedure call standard only guarantees 8-alignment of SP, not 16-alignment, so I suspect this is not the correct fix. Wouldn't it be a good idea if asserts in TCG dumped out something useful about the guest code? Because I can only reproduce this bug in a very awkward batch environment I need to collect as much information from log messages as possible. Is the failure case short enough to allow -d ... logging to be taken? That's usually the most useful info, but it's so huge it's often not feasible. Anyway, the problem here is that the arm backend now supports Neon, and it will try to do some operations on 128 bit types, where at least the loads and stores require 16-alignment. But nothing is enforcing that we have 16 alignment. (Without the assert we'd probably blunder on and fault on an unaligned access.) Correct. And while we could actively align the stack, that makes the prologue and epilogue overly complicated, and we can't just use pop {reg-list}. My preferred solution is to add another per-backend define that says what the required alignment for vector stack spills, and then *don't* add the :128 bit alignment specifier to the vector load/store insns we emit. I'll wait til I get home in 10 days or so before I tackle this. r~
Re: [PATCH] tcg/arm: Increase stack alignment for function generation
On Thu, Sep 02, 2021 at 08:36:56AM +0100, Peter Maydell wrote: > On Wed, 1 Sept 2021 at 21:24, Richard W.M. Jones wrote: > > > > On Wed, Sep 01, 2021 at 09:17:07PM +0100, Peter Maydell wrote: > > > On Wed, 1 Sept 2021 at 19:51, Richard W.M. Jones > > > wrote: > > > > > > > > On Wed, Sep 01, 2021 at 07:41:21PM +0100, Peter Maydell wrote: > > > > > Is the failure case short enough to allow -d ... logging to > > > > > be taken? That's usually the most useful info, but it's so huge > > > > > it's often not feasible. > > > > > > > > I can try -- what exact -d option would be useful? > > > > > > Depends what you're after. Personally I'm fairly sure I know > > > what's going on, I'm just not sure what the right fix is. > > > > Another question: We couldn't reproduce this even with the identical > > ARM guest kernel + initrd + command line using qemu-system-arm > > compiled for x86-64 host. This was a bit surprising! Was that bad > > luck or is there some reason why this bug might not be reproducible > > except on armv7 host? (Both cases use -machine accel=tcg). > > That's expected -- this is a bug in the codegen for arm hosts > (specifically 32-bit arm where Neon is available). tcg/i386/ > sets TCG_TARGET_STACK_ALIGN to 16, so it won't hit the assert. > > Yesterday I wrote: > > The prologue does seem to actively align to the > > specified value, not merely assume-and-preserve that alignment. > > but I was misreading the code -- it does just assume-and-preserve. > > Do you need an urgent fix/workaround for this? The simplest thing > is to wait for RTH to look at this, which is not likely to be before > the 13th. We can wait for Richard to take a look. Thanks, Rich. > Otherwise I think you can work around it with: > > --- a/tcg/arm/tcg-target.h > +++ b/tcg/arm/tcg-target.h > @@ -152,7 +152,7 @@ extern bool use_neon_instructions; > #define TCG_TARGET_HAS_qemu_st8_i32 0 > > #define TCG_TARGET_HAS_v64 use_neon_instructions > -#define TCG_TARGET_HAS_v128 use_neon_instructions > +#define TCG_TARGET_HAS_v128 0 > #define TCG_TARGET_HAS_v256 0 > > #define TCG_TARGET_HAS_andc_vec 1 > > though this is just a bodge that (hopefully) turns the use of v128 > off entirely. > > -- PMM -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org
Re: [PATCH] tcg/arm: Increase stack alignment for function generation
On Wed, 1 Sept 2021 at 21:24, Richard W.M. Jones wrote: > > On Wed, Sep 01, 2021 at 09:17:07PM +0100, Peter Maydell wrote: > > On Wed, 1 Sept 2021 at 19:51, Richard W.M. Jones wrote: > > > > > > On Wed, Sep 01, 2021 at 07:41:21PM +0100, Peter Maydell wrote: > > > > Is the failure case short enough to allow -d ... logging to > > > > be taken? That's usually the most useful info, but it's so huge > > > > it's often not feasible. > > > > > > I can try -- what exact -d option would be useful? > > > > Depends what you're after. Personally I'm fairly sure I know > > what's going on, I'm just not sure what the right fix is. > > Another question: We couldn't reproduce this even with the identical > ARM guest kernel + initrd + command line using qemu-system-arm > compiled for x86-64 host. This was a bit surprising! Was that bad > luck or is there some reason why this bug might not be reproducible > except on armv7 host? (Both cases use -machine accel=tcg). That's expected -- this is a bug in the codegen for arm hosts (specifically 32-bit arm where Neon is available). tcg/i386/ sets TCG_TARGET_STACK_ALIGN to 16, so it won't hit the assert. Yesterday I wrote: > The prologue does seem to actively align to the > specified value, not merely assume-and-preserve that alignment. but I was misreading the code -- it does just assume-and-preserve. Do you need an urgent fix/workaround for this? The simplest thing is to wait for RTH to look at this, which is not likely to be before the 13th. Otherwise I think you can work around it with: --- a/tcg/arm/tcg-target.h +++ b/tcg/arm/tcg-target.h @@ -152,7 +152,7 @@ extern bool use_neon_instructions; #define TCG_TARGET_HAS_qemu_st8_i32 0 #define TCG_TARGET_HAS_v64 use_neon_instructions -#define TCG_TARGET_HAS_v128 use_neon_instructions +#define TCG_TARGET_HAS_v128 0 #define TCG_TARGET_HAS_v256 0 #define TCG_TARGET_HAS_andc_vec 1 though this is just a bodge that (hopefully) turns the use of v128 off entirely. -- PMM
Re: [PATCH] tcg/arm: Increase stack alignment for function generation
On Wed, Sep 01, 2021 at 09:17:07PM +0100, Peter Maydell wrote: > On Wed, 1 Sept 2021 at 19:51, Richard W.M. Jones wrote: > > > > On Wed, Sep 01, 2021 at 07:41:21PM +0100, Peter Maydell wrote: > > > Is the failure case short enough to allow -d ... logging to > > > be taken? That's usually the most useful info, but it's so huge > > > it's often not feasible. > > > > I can try -- what exact -d option would be useful? > > Depends what you're after. Personally I'm fairly sure I know > what's going on, I'm just not sure what the right fix is. Another question: We couldn't reproduce this even with the identical ARM guest kernel + initrd + command line using qemu-system-arm compiled for x86-64 host. This was a bit surprising! Was that bad luck or is there some reason why this bug might not be reproducible except on armv7 host? (Both cases use -machine accel=tcg). Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org
Re: [PATCH] tcg/arm: Increase stack alignment for function generation
On Wed, 1 Sept 2021 at 19:51, Richard W.M. Jones wrote: > > On Wed, Sep 01, 2021 at 07:41:21PM +0100, Peter Maydell wrote: > > Is the failure case short enough to allow -d ... logging to > > be taken? That's usually the most useful info, but it's so huge > > it's often not feasible. > > I can try -- what exact -d option would be useful? Depends what you're after. Personally I'm fairly sure I know what's going on, I'm just not sure what the right fix is. -- PMM
Re: [PATCH] tcg/arm: Increase stack alignment for function generation
On Wed, Sep 01, 2021 at 07:41:21PM +0100, Peter Maydell wrote: > Is the failure case short enough to allow -d ... logging to > be taken? That's usually the most useful info, but it's so huge > it's often not feasible. I can try -- what exact -d option would be useful? Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org
Re: [PATCH] tcg/arm: Increase stack alignment for function generation
On Wed, 1 Sept 2021 at 19:30, Richard W.M. Jones wrote: > > On Wed, Sep 01, 2021 at 07:18:03PM +0100, Peter Maydell wrote: > > On Wed, 1 Sept 2021 at 18:01, Richard W.M. Jones wrote: > > > > > > This avoids the following assertion when the kernel initializes X.509 > > > certificates: > > > > > > [7.315373] Loading compiled-in X.509 certificates > > > qemu-system-arm: ../tcg/tcg.c:3063: temp_allocate_frame: Assertion `align > > > <= TCG_TARGET_STACK_ALIGN' failed. > > > > > > Fixes: commit c1c091948ae > > > Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1999878 > > > Cc: qemu-sta...@nongnu.org > > > Tested-by: Richard W.M. Jones > > > Signed-off-by: Richard W.M. Jones > > > --- > > > tcg/arm/tcg-target.h | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h > > > index d113b7f8db..09df3b39a1 100644 > > > --- a/tcg/arm/tcg-target.h > > > +++ b/tcg/arm/tcg-target.h > > > @@ -115,7 +115,7 @@ extern bool use_neon_instructions; > > > #endif > > > > > > /* used for function call generation */ > > > -#define TCG_TARGET_STACK_ALIGN 8 > > > +#define TCG_TARGET_STACK_ALIGN 16 > > > #define TCG_TARGET_CALL_ALIGN_ARGS 1 > > > #define TCG_TARGET_CALL_STACK_OFFSET 0 > > > > The 32-bit Arm procedure call standard only guarantees 8-alignment > > of SP, not 16-alignment, so I suspect this is not the correct fix. > > Wouldn't it be a good idea if asserts in TCG dumped out something > useful about the guest code? Because I can only reproduce this bug in > a very awkward batch environment I need to collect as much information > from log messages as possible. Is the failure case short enough to allow -d ... logging to be taken? That's usually the most useful info, but it's so huge it's often not feasible. Anyway, the problem here is that the arm backend now supports Neon, and it will try to do some operations on 128 bit types, where at least the loads and stores require 16-alignment. But nothing is enforcing that we have 16 alignment. (Without the assert we'd probably blunder on and fault on an unaligned access.) Rereading the TCG code, maybe your patch here is right. It's not clear to me whether TCG_TARGET_STACK_ALIGN is intended to specify "alignment the calling convention requires" (though that's what the comment above it suggests) or "alignment that TCG requires". The prologue does seem to actively align to the specified value, not merely assume-and-preserve that alignment. -- PMM
Re: [PATCH] tcg/arm: Increase stack alignment for function generation
On Wed, Sep 01, 2021 at 07:18:03PM +0100, Peter Maydell wrote: > On Wed, 1 Sept 2021 at 18:01, Richard W.M. Jones wrote: > > > > This avoids the following assertion when the kernel initializes X.509 > > certificates: > > > > [7.315373] Loading compiled-in X.509 certificates > > qemu-system-arm: ../tcg/tcg.c:3063: temp_allocate_frame: Assertion `align > > <= TCG_TARGET_STACK_ALIGN' failed. > > > > Fixes: commit c1c091948ae > > Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1999878 > > Cc: qemu-sta...@nongnu.org > > Tested-by: Richard W.M. Jones > > Signed-off-by: Richard W.M. Jones > > --- > > tcg/arm/tcg-target.h | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h > > index d113b7f8db..09df3b39a1 100644 > > --- a/tcg/arm/tcg-target.h > > +++ b/tcg/arm/tcg-target.h > > @@ -115,7 +115,7 @@ extern bool use_neon_instructions; > > #endif > > > > /* used for function call generation */ > > -#define TCG_TARGET_STACK_ALIGN 8 > > +#define TCG_TARGET_STACK_ALIGN 16 > > #define TCG_TARGET_CALL_ALIGN_ARGS 1 > > #define TCG_TARGET_CALL_STACK_OFFSET 0 > > The 32-bit Arm procedure call standard only guarantees 8-alignment > of SP, not 16-alignment, so I suspect this is not the correct fix. Wouldn't it be a good idea if asserts in TCG dumped out something useful about the guest code? Because I can only reproduce this bug in a very awkward batch environment I need to collect as much information from log messages as possible. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org
Re: [PATCH] tcg/arm: Increase stack alignment for function generation
On Wed, 1 Sept 2021 at 18:01, Richard W.M. Jones wrote: > > This avoids the following assertion when the kernel initializes X.509 > certificates: > > [7.315373] Loading compiled-in X.509 certificates > qemu-system-arm: ../tcg/tcg.c:3063: temp_allocate_frame: Assertion `align <= > TCG_TARGET_STACK_ALIGN' failed. > > Fixes: commit c1c091948ae > Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1999878 > Cc: qemu-sta...@nongnu.org > Tested-by: Richard W.M. Jones > Signed-off-by: Richard W.M. Jones > --- > tcg/arm/tcg-target.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h > index d113b7f8db..09df3b39a1 100644 > --- a/tcg/arm/tcg-target.h > +++ b/tcg/arm/tcg-target.h > @@ -115,7 +115,7 @@ extern bool use_neon_instructions; > #endif > > /* used for function call generation */ > -#define TCG_TARGET_STACK_ALIGN 8 > +#define TCG_TARGET_STACK_ALIGN 16 > #define TCG_TARGET_CALL_ALIGN_ARGS 1 > #define TCG_TARGET_CALL_STACK_OFFSET 0 The 32-bit Arm procedure call standard only guarantees 8-alignment of SP, not 16-alignment, so I suspect this is not the correct fix. -- PMM
[PATCH] tcg/arm: Increase stack alignment for function generation
https://www.mail-archive.com/qemu-devel@nongnu.org/msg833146.html I tested this patch which simply increases the stack alignment to 16 bytes and it fixes the assertion failure and otherwise appears to work (in as far as it boots the libguestfs appliance). However I've no idea if this is the right thing to do. I also had to change the tabs to spaces otherwise checkpatch complains which makes the patch look a bit odd. Rich.
[PATCH] tcg/arm: Increase stack alignment for function generation
This avoids the following assertion when the kernel initializes X.509 certificates: [7.315373] Loading compiled-in X.509 certificates qemu-system-arm: ../tcg/tcg.c:3063: temp_allocate_frame: Assertion `align <= TCG_TARGET_STACK_ALIGN' failed. Fixes: commit c1c091948ae Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1999878 Cc: qemu-sta...@nongnu.org Tested-by: Richard W.M. Jones Signed-off-by: Richard W.M. Jones --- tcg/arm/tcg-target.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h index d113b7f8db..09df3b39a1 100644 --- a/tcg/arm/tcg-target.h +++ b/tcg/arm/tcg-target.h @@ -115,7 +115,7 @@ extern bool use_neon_instructions; #endif /* used for function call generation */ -#define TCG_TARGET_STACK_ALIGN 8 +#define TCG_TARGET_STACK_ALIGN 16 #define TCG_TARGET_CALL_ALIGN_ARGS 1 #define TCG_TARGET_CALL_STACK_OFFSET 0 -- 2.32.0