Re: [PATCH, ARM] Fix PR77904 testcase failure
Forgot the reference:

[1] https://gcc.gnu.org/ml/gcc-patches/2018-12/msg01308.html

On Monday, 31 December 2018, Thomas Preudhomme wrote:
> Hi Richard,
>
> On Thursday, 20 December 2018, Richard Earnshaw (lists) <richard.earns...@arm.com> wrote:
>> On 14/12/2018 23:28, Thomas Preudhomme wrote:
>>> Hi,
>>>
>>> Commit r242693 forced fp to be saved/restored when needed due to an
>>> instance of GCC using fp as a scratch register to save sp while it's
>>> being clobbered by an inline asm. The normal path in
>>> thumb1_compute_save_reg_mask saving callee-saved registers which are
>>> live in the function does not work in that case because fp is chosen to
>>> hold sp after that function is called.
>>>
>>> Since clobbering sp is now errored out by the compiler and this was the
>>> only case reported where fp was live but not marked as such when
>>> thumb1_compute_save_reg_mask is called, I believe the whole commit
>>> r242693 should be reverted.
>>>
>>> ChangeLog entries are as follows:
>>>
>>> *** gcc/ChangeLog ***
>>>
>>> 2018-12-14 Thomas Preud'homme
>>>
>>> Revert:
>>> 2016-11-22 Thomas Preud'homme
>>>
>>> PR target/77904
>>> * config/arm/arm.c (thumb1_compute_save_reg_mask): Mark frame pointer
>>> in save register mask if it is needed.
>>>
>>> *** gcc/testsuite/ChangeLog ***
>>>
>>> 2018-12-14 Thomas Preud'homme
>>>
>>> Revert:
>>> 2016-11-22 Thomas Preud'homme
>>>
>>> PR target/77904
>>> * gcc.target/arm/pr77904.c: New test.
>>>
>>> Testing: Built an arm-none-eabi GCC cross-compiler targeting Armv6S-M
>>> and regression testsuite does not show any regression.
>>>
>>> Ok for stage3?
>>
>> OK.
>>
>> R.
>
> Bernd suggested in [1] that the behaviour tested by pr77904.c might
> actually be a behaviour we can allow with a patch to add a dg-warning to
> the testcase. I'll wait for a resolution on that suggestion before
> deciding whether to commit this.
>
> Best regards,
>
> Thomas
Re: [PATCH, ARM] Fix PR77904 testcase failure
Hi Richard,

On Thursday, 20 December 2018, Richard Earnshaw (lists) <richard.earns...@arm.com> wrote:
> On 14/12/2018 23:28, Thomas Preudhomme wrote:
>> Hi,
>>
>> Commit r242693 forced fp to be saved/restored when needed due to an
>> instance of GCC using fp as a scratch register to save sp while it's
>> being clobbered by an inline asm. The normal path in
>> thumb1_compute_save_reg_mask saving callee-saved registers which are
>> live in the function does not work in that case because fp is chosen to
>> hold sp after that function is called.
>>
>> Since clobbering sp is now errored out by the compiler and this was the
>> only case reported where fp was live but not marked as such when
>> thumb1_compute_save_reg_mask is called, I believe the whole commit
>> r242693 should be reverted.
>>
>> ChangeLog entries are as follows:
>>
>> *** gcc/ChangeLog ***
>>
>> 2018-12-14 Thomas Preud'homme
>>
>> Revert:
>> 2016-11-22 Thomas Preud'homme
>>
>> PR target/77904
>> * config/arm/arm.c (thumb1_compute_save_reg_mask): Mark frame pointer
>> in save register mask if it is needed.
>>
>> *** gcc/testsuite/ChangeLog ***
>>
>> 2018-12-14 Thomas Preud'homme
>>
>> Revert:
>> 2016-11-22 Thomas Preud'homme
>>
>> PR target/77904
>> * gcc.target/arm/pr77904.c: New test.
>>
>> Testing: Built an arm-none-eabi GCC cross-compiler targeting Armv6S-M
>> and regression testsuite does not show any regression.
>>
>> Ok for stage3?
>
> OK.
>
> R.

Bernd suggested in [1] that the behaviour tested by pr77904.c might
actually be a behaviour we can allow with a patch to add a dg-warning to
the testcase. I'll wait for a resolution on that suggestion before
deciding whether to commit this.

Best regards,

Thomas
[PATCH, committed] Changing maintainer email address
Hi,

I've updated my email address in the MAINTAINERS file since I'm leaving my
company. I'll do the copyright assignment paperwork before contributing
any new patches.

Best regards,

Thomas

From c486e31b10ae0ec648ba256a92d5a4bcef1ef83d Mon Sep 17 00:00:00 2001
From: thopre01
Date: Fri, 21 Dec 2018 17:53:03 +
Subject: [PATCH] Update maintainer email address

2018-12-21 Thomas Preud'homme

	* MAINTAINERS (Write After Approval): Update my maintainer address.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@267330 138bc75d-0d04-0410-961f-82ee72b054a4
---
 ChangeLog   | 4
 MAINTAINERS | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 11cfa2a6789..a86c3fc40c0 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,7 @@
+2018-12-21 Thomas Preud'homme
+
+	* MAINTAINERS (Write After Approval): Update my maintainer address.
+
 2018-12-21 Gergö Barany
 
 	* MAINTAINERS (Write After Approval): Add myself.
diff --git a/MAINTAINERS b/MAINTAINERS
index dcf744d023b..8ccd0ca7c33 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -537,7 +537,7 @@ Paul Pluzhnikov
 Antoniu Pop
 Siddhesh Poyarekar
 Vidya Praveen
-Thomas Preud'homme
+Thomas Preud'homme
 Vladimir Prus
 Yao Qi
 Jerry Quinn
-- 
2.19.1
[PATCH, ARM, committed] Fix size-optimization-ieee testcase failure
I've committed the attached patch as obvious to fix the
gcc.target/arm/size-optimization-ieee-* testcase failures.

On some versions of DejaGnu, options in RUNTESTFLAGS are appended to the
command line, so any -mfloat-abi=softfp or -mfloat-abi=hard in there
overrides the -mfloat-abi=soft in the dg-options of the
size-optimization-ieee-* tests. The tests are still run, though, because
arm_soft_ok returns true if -mfloat-abi=soft is accepted, even if the file
is not compiled for softfloat due to a later -mfloat-abi on the command
line. This patch adds a dg-skip-if to those tests to ensure they are not
run in softfp or hard mode.

2018-12-21 Thomas Preud'homme

gcc/testsuite/
	* gcc.target/arm/size-optimization-ieee-1.c: Skip if passing
	-mfloat-abi=softfp or -mfloat-abi=hard.
	* gcc.target/arm/size-optimization-ieee-2.c: Likewise.
	* gcc.target/arm/size-optimization-ieee-3.c: Likewise.

From c13cca23aa64a07f66c80f14dbdd79c63163783c Mon Sep 17 00:00:00 2001
From: thopre01
Date: Fri, 21 Dec 2018 11:49:04 +
Subject: [PATCH] [ARM] Fix size-optimization-ieee testcase failure

On some version of dejagnu, options in RUNTESTFLAGS are appended to the
command-line and thus any -mfloat-abi=softfp or -mfloat-abi=hard in there
overwrite the -mfloat-abi=soft in the dg-options for
size-optimization-ieee-* tests. Test is still run though because
arm_soft_ok returns true if -mfloat-abi=soft is accepted, even if the file
is not compiled for softfloat due to a later -mfloat-abi on the command
line. This patch adds a dg-skip-if to those tests to ensure they are not
run in softfp or hard mode.

2018-12-21 Thomas Preud'homme

gcc/testsuite/
	* gcc.target/arm/size-optimization-ieee-1.c: Skip if passing
	-mfloat-abi=softfp or -mfloat-abi=hard.
	* gcc.target/arm/size-optimization-ieee-2.c: Likewise.
	* gcc.target/arm/size-optimization-ieee-3.c: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@267323 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/testsuite/ChangeLog                                 | 7 +++
 gcc/testsuite/gcc.target/arm/size-optimization-ieee-1.c | 1 +
 gcc/testsuite/gcc.target/arm/size-optimization-ieee-2.c | 1 +
 gcc/testsuite/gcc.target/arm/size-optimization-ieee-3.c | 1 +
 4 files changed, 10 insertions(+)

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index dcac93bb275..1569e7aaa0f 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,10 @@
+2018-12-21 Thomas Preud'homme
+
+	* gcc.target/arm/size-optimization-ieee-1.c: Skip if passing
+	-mfloat-abi=softfp or -mfloat-abi=hard.
+	* gcc.target/arm/size-optimization-ieee-2.c: Likewise.
+	* gcc.target/arm/size-optimization-ieee-3.c: Likewise.
+
 2018-12-21 Jakub Jelinek
 
 	PR target/88547
diff --git a/gcc/testsuite/gcc.target/arm/size-optimization-ieee-1.c b/gcc/testsuite/gcc.target/arm/size-optimization-ieee-1.c
index 34090f20fec..61475eb4c67 100644
--- a/gcc/testsuite/gcc.target/arm/size-optimization-ieee-1.c
+++ b/gcc/testsuite/gcc.target/arm/size-optimization-ieee-1.c
@@ -1,4 +1,5 @@
 /* { dg-do link { target arm_soft_ok } } */
+/* { dg-skip-if "Feature is -mfloat-abi=soft only" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=soft" } } */
 /* { dg-options "-mfloat-abi=soft" } */
 
 int
diff --git a/gcc/testsuite/gcc.target/arm/size-optimization-ieee-2.c b/gcc/testsuite/gcc.target/arm/size-optimization-ieee-2.c
index 75337894a9c..b4699271cea 100644
--- a/gcc/testsuite/gcc.target/arm/size-optimization-ieee-2.c
+++ b/gcc/testsuite/gcc.target/arm/size-optimization-ieee-2.c
@@ -1,4 +1,5 @@
 /* { dg-do link { target arm_soft_ok } } */
+/* { dg-skip-if "Feature is -mfloat-abi=soft only" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=soft" } } */
 /* { dg-options "-mfloat-abi=soft" } */
 
 int
diff --git a/gcc/testsuite/gcc.target/arm/size-optimization-ieee-3.c b/gcc/testsuite/gcc.target/arm/size-optimization-ieee-3.c
index 63c92b3bbb7..34b1ebe7afd 100644
--- a/gcc/testsuite/gcc.target/arm/size-optimization-ieee-3.c
+++ b/gcc/testsuite/gcc.target/arm/size-optimization-ieee-3.c
@@ -1,4 +1,5 @@
 /* { dg-do link { target arm_soft_ok } } */
+/* { dg-skip-if "Feature is -mfloat-abi=soft only" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=soft" } } */
 /* { dg-options "-mfloat-abi=soft" } */
 
 int
-- 
2.19.1
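
To make the failure mode concrete, here is a minimal sketch in C of the two
behaviours described in this message: the last -mfloat-abi= option on the
command line taking effect, and the condition that the added dg-skip-if
expresses (skip when some -mfloat-abi=* option is passed and
-mfloat-abi=soft is not among them, as I understand the include/exclude
option lists of dg-skip-if). The helpers effective_float_abi and
should_skip are hypothetical names used for illustration only; they are not
part of GCC or DejaGnu.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FLOAT_ABI_PREFIX "-mfloat-abi="

/* Hypothetical helper: return the value of the last -mfloat-abi= option
   in OPTS, or NULL if there is none.  Models "a later option overrides
   an earlier one".  */
static const char *
effective_float_abi (const char *const *opts, size_t n)
{
  const size_t len = strlen (FLOAT_ABI_PREFIX);
  const char *abi = NULL;

  assert (opts != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (strncmp (opts[i], FLOAT_ABI_PREFIX, len) == 0)
      abi = opts[i] + len;
  return abi;
}

/* Hypothetical model of the added dg-skip-if, applied to the flags
   passed by the user (e.g. through RUNTESTFLAGS): skip when at least
   one -mfloat-abi=* option is present and none of them is exactly
   -mfloat-abi=soft.  */
static int
should_skip (const char *const *opts, size_t n)
{
  const size_t len = strlen (FLOAT_ABI_PREFIX);
  int any_float_abi = 0;
  int has_soft = 0;

  for (size_t i = 0; i < n; i++)
    if (strncmp (opts[i], FLOAT_ABI_PREFIX, len) == 0)
      {
        any_float_abi = 1;
        if (strcmp (opts[i] + len, "soft") == 0)
          has_soft = 1;
      }
  return any_float_abi && !has_soft;
}
```

With the dg-options flag first and a RUNTESTFLAGS flag appended,
effective_float_abi reports the appended value, which is the situation the
skip directive is meant to catch.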
Re: [PATCH, ARM] Do softfloat when -mfpu set, -mfloat-abi=softfp and targeting Thumb-1
Good catch. Committed patch in attachment.

Best regards,

Thomas

On Wed, 19 Dec 2018 at 14:13, Richard Earnshaw (lists) wrote:
>
> On 14/12/2018 21:15, Thomas Preudhomme wrote:
> > Hi Richard,
> >
> > Thanks for catching the problem with this approach. Hopefully this
> > version should solve the real problem:
> >
> >
> > FP instructions are only enabled for TARGET_32BIT and TARGET_HARD_FLOAT
> > but GCC only gives an error when TARGET_HARD_FLOAT is true and -mfpu is
> > not set. Among other things, it makes some of the cmse tests (eg.
> > gcc.target/arm/cmse/baseline/softfp.c) fail when targeting
> > -march=armv8-m.base -mcmse -mfpu= -mfloat-abi=softfp. This
> > patch adds an extra check for TARGET_32BIT to TARGET_HARD_FLOAT such
> > that it is false on TARGET_THUMB1 targets even when a FPU is specified.
> >
> > ChangeLog entries are as follows:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-12-14 thomas Preud'homme
> >
> > * config/arm/arm.h (TARGET_HARD_FLOAT): Restrict to TARGET_32BIT
> > targets.
>
> Yes, this is better. And with this change, I think this line:
>
>   if (TARGET_HARD_FLOAT && !TARGET_THUMB1)
>
> in output_return_instruction() can be collapsed into simply
>
>
>   if (TARGET_HARD_FLOAT)
>
> OK with that change.
>
> R.
>
> >
> > *** gcc/testsuite/ChangeLog ***
> >
> > 2018-12-14 thomas Preud'homme
> >
> > * gcc.target/arm/cmse/baseline/softfp.c: Force an FPU.
> >
> > Testing: No testsuite regression when targeting arm-none-eabi Armv6S-M
> > with -mfloat-abi=softfp
> >
> > Is this ok for stage3?
> >
> > Best regards,
> >
> > Thomas
[PATCH, ARM] Fix PR77904 testcase failure
Hi,

Commit r242693 forced fp to be saved/restored when needed due to an
instance of GCC using fp as a scratch register to save sp while it's
being clobbered by an inline asm. The normal path in
thumb1_compute_save_reg_mask saving callee-saved registers which are
live in the function does not work in that case because fp is chosen to
hold sp after that function is called.

Since clobbering sp is now errored out by the compiler and this was the
only case reported where fp was live but not marked as such when
thumb1_compute_save_reg_mask is called, I believe the whole commit
r242693 should be reverted.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-12-14 Thomas Preud'homme

	Revert:
	2016-11-22 Thomas Preud'homme

	PR target/77904
	* config/arm/arm.c (thumb1_compute_save_reg_mask): Mark frame pointer
	in save register mask if it is needed.

*** gcc/testsuite/ChangeLog ***

2018-12-14 Thomas Preud'homme

	Revert:
	2016-11-22 Thomas Preud'homme

	PR target/77904
	* gcc.target/arm/pr77904.c: New test.

Testing: Built an arm-none-eabi GCC cross-compiler targeting Armv6S-M
and regression testsuite does not show any regression.

Ok for stage3?

Best regards,

Thomas

From 63c52e7bf932947be7122cdc63f6cdc913479259 Mon Sep 17 00:00:00 2001
From: Thomas Preud'homme
Date: Fri, 14 Dec 2018 16:02:59 +
Subject: [PATCH] [PATCH, ARM] Fix PR77904 testcase failure

Hi,

Commit r242693 forced fp to be saved/restored when needed due to an
instance of GCC using fp as a scratch register to save sp while it's
being clobbered by an inline asm. The normal path in
thumb1_compute_save_reg_mask saving callee-saved registers which are
live in the function does not work in that case because fp is chosen to
hold sp after that function is called.

Since clobbering sp is now errored out by the compiler and this was the
only case reported where fp was live but not marked as such when
thumb1_compute_save_reg_mask is called, I believe the whole commit
r242693 should be reverted.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-12-14 Thomas Preud'homme

	Revert:
	2016-11-22 Thomas Preud'homme

	PR target/77904
	* config/arm/arm.c (thumb1_compute_save_reg_mask): Mark frame pointer
	in save register mask if it is needed.

*** gcc/testsuite/ChangeLog ***

2018-12-14 Thomas Preud'homme

	Revert:
	2016-11-22 Thomas Preud'homme

	PR target/77904
	* gcc.target/arm/pr77904.c: New test.

Testing: Built an arm-none-eabi GCC cross-compiler targeting Armv6S-M
and regression testsuite does not show any regression.

Ok for stage3?

Best regards,

Thomas
---
 gcc/ChangeLog                          |  9 ++
 gcc/config/arm/arm.c                   |  4 ---
 gcc/testsuite/ChangeLog                |  8 +
 gcc/testsuite/gcc.target/arm/pr77904.c | 45 --
 4 files changed, 17 insertions(+), 49 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.target/arm/pr77904.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index d8e374fb15f..9caeb1d5e18 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2018-12-14 Thomas Preud'homme
+
+	Revert:
+	2016-11-22 Thomas Preud'homme
+
+	PR target/77904
+	* config/arm/arm.c (thumb1_compute_save_reg_mask): Mark frame pointer
+	in save register mask if it is needed.
+
 2018-11-27 Alan Modra
 
 	* config/rs6000/aix71.h (ASM_SPEC): Don't select default -maix64
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 40f0574e32e..2ab5d8abc33 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -19553,10 +19553,6 @@ thumb1_compute_save_core_reg_mask (void)
     if (df_regs_ever_live_p (reg) && callee_saved_reg_p (reg))
       mask |= 1 << reg;
 
-  /* Handle the frame pointer as a special case.  */
-  if (frame_pointer_needed)
-    mask |= 1 << HARD_FRAME_POINTER_REGNUM;
-
   if (flag_pic
       && !TARGET_SINGLE_PIC_BASE
       && arm_pic_register != INVALID_REGNUM
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 9e1f6d05a45..4e58c8940da 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,11 @@
+2018-12-14 Thomas Preud'homme
+
+	Revert:
+	2016-11-22 Thomas Preud'homme
+
+	PR target/77904
+	* gcc.target/arm/pr77904.c: New test.
+
 2018-11-27 Jozef Lawrynowicz
 
 	* lib/target-supports.exp
diff --git a/gcc/testsuite/gcc.target/arm/pr77904.c b/gcc/testsuite/gcc.target/arm/pr77904.c
deleted file mode 100644
index 76728c07e73..000
--- a/gcc/testsuite/gcc.target/arm/pr77904.c
+++ /dev/null
@@ -1,45 +0,0 @@
-/* { dg-do run } */
-/* { dg-options "-O2" } */
-
-__attribute__ ((noinline, noclone)) void
-clobber_sp (void)
-{
-  __asm volatile ("" : : : "sp");
-}
-
-int
-main (void)
-{
-  int ret;
-
-  __asm volatile ("mov\tr4, #0xf4\n\t"
-		  "mov\tr5, #0xf5\n\t"
-		  "mov\tr6, #0xf6\n\t"
-		  "mov\tr7, #0xf7\n\t"
-		  "mov\tr0, #0xf8\n\t"
-		  "mov\tr8, r0\n\t"
-		  "mov\tr0, #0xfa\n\t"
-		  "mov\tr10, r0"
-
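
As a rough illustration of what the revert changes, the following C sketch
models the save-mask computation with plain bitmasks, assuming a much
simplified view of thumb1_compute_save_core_reg_mask. The function
save_mask, its special_case_fp switch, and the use of r7 as the Thumb
frame pointer are assumptions of this sketch and are not taken from the
GCC sources.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption of this model: r7 is the hard frame pointer in Thumb state.  */
enum { MODEL_FP_REGNUM = 7 };

/* Simplified model: bit N of LIVE means register N is ever live, bit N of
   CALLEE_SAVED means register N is callee-saved.  SPECIAL_CASE_FP selects
   between the behaviour with the r242693 special case (nonzero) and the
   behaviour after the proposed revert (zero).  */
static uint32_t
save_mask (uint32_t live, uint32_t callee_saved,
           int frame_pointer_needed, int special_case_fp)
{
  /* Only the 16 core registers are modelled.  */
  assert ((live & ~UINT32_C (0xffff)) == 0);
  assert ((callee_saved & ~UINT32_C (0xffff)) == 0);

  /* Normal path: save callee-saved registers that are live.  */
  uint32_t mask = live & callee_saved;

  /* Special case added by r242693: always save fp when a frame pointer
     is needed, even if fp is not (yet) marked live.  */
  if (special_case_fp && frame_pointer_needed)
    mask |= UINT32_C (1) << MODEL_FP_REGNUM;

  return mask;
}
```

In this model the two variants differ only when a frame pointer is needed
and fp is not marked live at the time the mask is computed, which is the
situation the message above attributes to the sp-clobbering inline asm.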
Re: [PATCH, ARM] Do softfloat when -mfpu set, -mfloat-abi=softfp and targeting Thumb-1
Hi Richard,

Thanks for catching the problem with this approach. Hopefully this
version should solve the real problem:


FP instructions are only enabled for TARGET_32BIT and TARGET_HARD_FLOAT
but GCC only gives an error when TARGET_HARD_FLOAT is true and -mfpu is
not set. Among other things, it makes some of the cmse tests (e.g.
gcc.target/arm/cmse/baseline/softfp.c) fail when targeting
-march=armv8-m.base -mcmse -mfpu= -mfloat-abi=softfp. This
patch adds an extra check for TARGET_32BIT to TARGET_HARD_FLOAT such
that it is false on TARGET_THUMB1 targets even when an FPU is specified.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-12-14 Thomas Preud'homme

	* config/arm/arm.h (TARGET_HARD_FLOAT): Restrict to TARGET_32BIT
	targets.

*** gcc/testsuite/ChangeLog ***

2018-12-14 Thomas Preud'homme

	* gcc.target/arm/cmse/baseline/softfp.c: Force an FPU.

Testing: No testsuite regression when targeting arm-none-eabi Armv6S-M
with -mfloat-abi=softfp

Is this ok for stage3?

Best regards,

Thomas

On Thu, 29 Nov 2018 at 14:52, Richard Earnshaw (lists) wrote:
>
> On 29/11/2018 10:51, Thomas Preudhomme wrote:
> > Hi,
> >
> > FP instructions are only enabled for TARGET_32BIT and TARGET_HARD_FLOAT
> > but GCC only gives an error when TARGET_HARD_FLOAT is true and -mfpu is
> > not set. Among other things, it makes some of the cmse tests (eg.
> > gcc.target/arm/cmse/baseline/softfp.c) fail when targeting
> > -march=armv8-m.base -mfpu= -mfloat-abi=softfp. This patch
> > errors out when a Thumb-1 -like target is selected and a FPU is
> > specified, thus making such tests being skipped.
> >
> > ChangeLog entries are as follows:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-11-28 thomas Preud'homme
> >
> > * config/arm/arm.c (arm_options_perform_arch_sanity_checks): Error out
> > if targeting Thumb-1 with an FPU specified.
> >
> > *** gcc/testsuite/ChangeLog ***
> >
> > 2018-11-28 thomas Preud'homme
> >
> > * gcc.target/arm/thumb1_mfpu-1.c: New testcase.
> > * gcc.target/arm/thumb1_mfpu-2.c: Likewise.
> >
> > Testing: No testsuite regression when targeting arm-none-eabi Armv6S-M.
> > Fails as expected when targeting Armv6-M with an -mfpu or a default FPU.
> > Succeeds without.
> >
> > Is this ok for stage3?
> >
>
> This doesn't sound right. Specifically this bit...
>
> +  else if (TARGET_THUMB1
> +	   && bitmap_bit_p (arm_active_target.isa, isa_bit_vfpv2))
> +    error ("Thumb-1 does not allow FP instructions");
>
> If I use
>
> -mcpu=arm1176jzf-s -mfpu=auto -mfloat-abi=softfp -mthumb
>
> then that shouldn't error, since softfp and thumb is, in reality, just
> float-abi=soft (as there are no fp instructions in thumb). We also want
> it to work this way so that I can add the thumb/arm attribute to
> specific functions and have the compiler use HW float instructions when
> they are suitable.
>
>
> R.
>
> > Best regards,
> >
> > Thomas
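
The effect of restricting TARGET_HARD_FLOAT to TARGET_32BIT can be sketched
as a small predicate model in C. This is only an illustration under stated
assumptions: the struct target_model type and the two functions are
hypothetical, and the "before" predicate (float ABI is not soft and an FPU
is available) is my reading of the thread rather than a copy of the macro
in arm.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum float_abi_model { ABI_SOFT, ABI_SOFTFP, ABI_HARD };

/* Hypothetical, simplified description of the selected target.  */
struct target_model
{
  enum float_abi_model abi; /* -mfloat-abi= setting.  */
  bool has_fpu;             /* An FPU is specified or implied.  */
  bool is_32bit;            /* Arm or Thumb-2 state, i.e. not Thumb-1.  */
};

/* Assumed behaviour before the patch: Thumb-1 is not taken into account.  */
static bool
hard_float_before (const struct target_model *t)
{
  assert (t != NULL);
  return t->abi != ABI_SOFT && t->has_fpu;
}

/* Behaviour described in the message: additionally require a 32-bit
   instruction set, so the predicate is false for Thumb-1 even when an
   FPU is specified.  */
static bool
hard_float_after (const struct target_model *t)
{
  return hard_float_before (t) && t->is_32bit;
}
```

Under this model, a Thumb-1 target built with -mfloat-abi=softfp and an FPU
is treated as hard-float before the change and as soft-float after it,
while Arm-state or Thumb-2 code with the same options is unaffected.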
Re: [PATCH, libgcc/ARM & testsuite] Optimize executable size when using softfloat fmul/dmul
Hi Richard,

None. Are there any? All the ones I could find in the big switch selecting
tm_files and tmake_files in gcc/config.gcc include arm/elf.h. I tried to
build for arm-wince-pe but got: "Configuration arm-wince-pe not
supported".

However, note that to guarantee correct results the only requirement is
that a global symbol correctly overrides a weak symbol, and I see .weak
used in many other libgcc backends (e.g. i386). The "take the first
definition resolving an undefined reference and ignore the one in a
following object of a static library" behaviour is only needed to benefit
from the size optimization.

Best regards,

Thomas

On Fri, 7 Dec 2018 at 14:14, Richard Earnshaw (lists) wrote:
>
> On 19/11/2018 09:57, Thomas Preudhomme wrote:
> > Softfloat single precision and double precision floating-point
> > multiplication routines in libgcc share some code with the
> > floating-point division of their corresponding precision. As the code
> > is structured now, this leads to *all* division code being pulled in an
> > executable in softfloat mode even if only multiplication is
> > performed.
> >
> > This patch creates some new LIB1ASMFUNCS macros to also build files with
> > just the multiplication and shared code as weak symbols. By putting
> > these earlier in the static library, they can then be picked up when
> > only multiplication is used and they are overridden by the global
> > definition in the existing file containing both multiplication and
> > division code when division is needed.
> >
> > The patch also removes changes made to the FUNC_START and ARM_FUNC_START
> > macros in r218124 since the intent was to put multiplication and
> > division code into their own section in a later patch to achieve the
> > same size optimization. That approach relied on specific section layout
> > to ensure multiplication and division were not too far from the shared
> > bit of code in order for the branches to be within range. Due to lack of
> > guarantee regarding section layout, in particular with all the
> > possibility of linker scripts, this approach was chosen instead. This
> > patch keeps the two testcases that were posted by Tony Wang (an Arm
> > employee at the time) on the mailing list to implement this approach
> > and adds a new one, hence the attribution.
> >
> > ChangeLog entries are as follows:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-11-14 Thomas Preud'homme
> >
> > * config/arm/elf.h: Update comment about condition that need to
> > match with libgcc/config/arm/lib1funcs.S to also include
> > libgcc/config/arm/t-arm.
> > * doc/sourcebuild.texi (output-exists, output-exists-not): Rename
> > subsubsection these directives are in to "Check for output files".
> > Move scan-symbol to that section and add to it new scan-symbol-not
> > directive.
> >
> > *** gcc/testsuite/ChangeLog ***
> >
> > 2018-11-16 Tony Wang
> > Thomas Preud'homme
> >
> > * lib/lto.exp (lto-execute): Define output_file and testname_with_flags
> > to same value as execname.
> > (scan-symbol): Move and rename to ...
> > * lib/gcc-dg.exp (scan-symbol-common): This. Adapt into a
> > helper function returning true or false if a symbol is present.
> > (scan-symbol): New procedure.
> > (scan-symbol-not): Likewise.
> > * gcc.target/arm/size-optimization-ieee-1.c: New testcase.
> > * gcc.target/arm/size-optimization-ieee-2.c: Likewise.
> > * gcc.target/arm/size-optimization-ieee-3.c: Likewise.
> >
> > *** libgcc/ChangeLog ***
> >
> > 2018-11-16 Thomas Preud'homme
> >
> > * /config/arm/lib1funcs.S (FUNC_START): Remove unused sp_section
> > parameter and corresponding code.
> > (ARM_FUNC_START): Likewise in both definitions.
> > Also update footer comment about condition that need to match with
> > gcc/config/arm/elf.h to also include libgcc/config/arm/t-arm.
> > * config/arm/ieee754-df.S (muldf3): Also build it if L_arm_muldf3 is
> > defined. Weakly define it in this case.
> > * config/arm/ieee754-sf.S (mulsf3): Likewise with L_arm_mulsf3.
> > * config/arm/t-elf (LIB1ASMFUNCS): Build _arm_muldf3.o and
> > _arm_mulsf3.o before muldiv versions if targeting Thumb-1 only. Add
> > comment to keep condition in sync with the one in
> > libgcc/config/arm/lib1funcs.S and gcc/config/arm/elf.h.
> >
> > Testing: Bootstrapped on arm-linux-gnueabihf (Arm & Thumb-2) and
> > testsuite shows no
> > regression. Also built an arm-none-eabi cross com
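
The weak-versus-global override that the approach relies on can be shown
with a small C sketch, assuming a toolchain that supports
__attribute__ ((weak)) (GCC and Clang do on ELF targets). The names
mul_provider, MUL_ONLY_WEAK, MUL_DIV_GLOBAL and weak_copy_selected are
hypothetical and merely stand in for the multiplication-only and
multiplication-plus-division objects; the real routines are written in
assembly in libgcc.

```c
#include <assert.h>

enum provider { MUL_ONLY_WEAK = 1, MUL_DIV_GLOBAL = 2 };

/* Weak definition, standing in for the multiplication-only copy placed
   early in the static library.  A global (strong) definition of the same
   symbol in another object file, for example

     enum provider mul_provider (void) { return MUL_DIV_GLOBAL; }

   would take precedence over this one at link time.  */
__attribute__ ((weak)) enum provider
mul_provider (void)
{
  return MUL_ONLY_WEAK;
}

/* Return nonzero when the weak copy is the definition that was linked.  */
static int
weak_copy_selected (void)
{
  enum provider p = mul_provider ();

  assert (p == MUL_ONLY_WEAK || p == MUL_DIV_GLOBAL);
  return p == MUL_ONLY_WEAK;
}
```

Built on its own, this translation unit uses the weak copy. Linking it
together with an object that defines mul_provider globally would make
weak_copy_selected return 0, which corresponds to the "division is needed"
case in the message above.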
Re: [PATCH] [RFC] PR target/52813 and target/11807
[resending from the right address]

Hi Christophe,

Why not simply: "Clobber of unsupported" with an accompanying change of the documentation to state the extra bit you wanted to put in that error message? Perhaps even add a reference to the section of the documentation in the error message.

Best regards,

Thomas

On Wed, 12 Dec 2018 at 15:13, Christophe Lyon wrote:
>
> On Wed, 12 Dec 2018 at 14:19, Christophe Lyon wrote:
> >
> > On Wed, 12 Dec 2018 at 12:21, Thomas Preudhomme wrote:
> > >
> > > So my understanding is that the original code (CMSIS library) used to
> > > clobber sp because the asm statement was actually changing the sp.
> > > That in turn led GCC to try to save and restore sp which is not what
> > > CMSIS was expecting to happen. Changing sp without clobber as done now
> > > is probably the right solution and r242693 can be reverted. That will
> > > remove the failing test.
> > >
> >
> > OK, I read PR52813 too, but I'm not sure I fully understand the new status.
> > My understanding is that since this patch was committed, if an asm statement
> > clobbers sp, it is no longer allowed to actually declare it as a clobber
> > (this patch generates an error in such a case).
> > So the user is now expected to lie to the compiler when writing to
> > this kind of register (sp, pic register), by not declaring it as "clobber"?
> >
>
> I'm attaching a small patch which adds a more verbose error message
> along the lines of what I understand of the current status.
> I'm pretty sure I got (at least) the formatting wrong :)
>
> Christophe
>
> > > Best regards,
> > >
> > > Thomas
Re: [PATCH] [RFC] PR target/52813 and target/11807
So my understanding is that the original code (CMSIS library) used to clobber sp because the asm statement was actually changing the sp. That in turn led GCC to try to save and restore sp, which is not what CMSIS was expecting to happen. Changing sp without a clobber, as done now, is probably the right solution and r242693 can be reverted. That will remove the failing test.

Best regards,

Thomas

On Wed, 12 Dec 2018 at 10:30, Thomas Preudhomme wrote:
>
> Hi Christophe,
>
> That PR was about a bug occurring when sp was clobbered so if it cannot
> be clobbered anymore the whole commit (r242693) can be removed. Let me
> check why the original code that led to the PR is clobbering sp
> though.
>
> Best regards,
>
> Thomas
Re: [PATCH] [RFC] PR target/52813 and target/11807
Hi Christophe,

That PR was about a bug occurring when sp was clobbered, so if it cannot be clobbered anymore the whole commit (r242693) can be removed. Let me check why the original code that led to the PR was clobbering sp, though.

Best regards,

Thomas

On Wed, 12 Dec 2018 at 09:43, Christophe Lyon wrote:
>
> On Tue, 11 Dec 2018 at 16:52, Richard Sandiford wrote:
> >
> > Dimitar Dimitrov writes:
> > > On Monday, 10 December 2018, 11:21:53 EET Richard Sandiford wrote:
> > >> Dimitar Dimitrov writes:
> > >> > I have tested this fix on x86_64 host, and found no regression in the C
> > >> > and C++ testsuites. I'm marking this patch as RFC simply because I
> > >> > don't
> > >> > have experience with other architectures, and I don't have a setup to
> > >> > test all architectures supported by GCC.
> > >> >
> > >> > gcc/ChangeLog:
> > >> >
> > >> > 2018-12-07 Dimitar Dimitrov
> > >> >
> > >> >	* cfgexpand.c (asm_clobber_reg_is_valid): Also produce
> > >> >	error when stack pointer is clobbered.
> > >> >	(expand_asm_stmt): Refactor clobber check in separate function.
> > >> >
> > >> > gcc/testsuite/ChangeLog:
> > >> >
> > >> > 2018-12-07 Dimitar Dimitrov
> > >> >
> > >> >	* gcc.target/i386/pr52813.c: New test.
> > >> >
> > >> > Signed-off-by: Dimitar Dimitrov
> > >>
> > >> LGTM. Do you have a copyright assignment on file? 'Fraid this is
> > >> probably big enough to need one.
> > > Yes, I have copyright assignment.
> >
> > OK, great. I went ahead and applied the patch.
> >
>
> Hi,
>
> This patch introduces a regression on arm:
> FAIL: gcc.target/arm/pr77904.c (test for excess errors)
> Excess errors:
> /gcc/testsuite/gcc.target/arm/pr77904.c:7:3: error: Stack Pointer
> register clobbered by 'sp' in 'asm'
>
> Indeed the testcase has an explicit:
> __asm volatile ("" : : : "sp");
> which is now rejected.
>
> Thomas, is that mandatory to test your code to fix pr77904?
>
> Thanks,
>
> Christophe
>
> > Thanks,
> > Richard
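The clobber at the centre of this thread can be sketched in a few lines of C. This is only an illustration, not code from GCC, CMSIS or the testsuite: the function name pass_through_barrier is made up, and the rejected statement is kept inside a comment so that the file still compiles on any target.

```c
/* Illustrative only: pass_through_barrier is a hypothetical name, not GCC
   or CMSIS code. */
static int pass_through_barrier(int value)
{
    /* Still accepted: "memory" tells the compiler that the (empty) asm may
       read or write memory, so it acts as a compiler-level barrier. */
    __asm__ volatile ("" : : : "memory");

    /* The form used by gcc.target/arm/pr77904.c. It is left in a comment
       because, according to this thread, it is now rejected on arm with
       "Stack Pointer register clobbered by 'sp' in 'asm'":

       __asm volatile ("" : : : "sp");
    */
    return value;
}
```

Calling pass_through_barrier(42) simply returns 42; the point is only which clobber lists the compiler accepted at the time of this discussion.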
Re: [PATCH, ARM] Improve robustness of -mslow-flash-data
Hi Kyrill,

I've tested on armeb-none-eabi with -mslow-flash-data for both -mfloat-abi=hard and -mfloat-abi=soft. Both show no regression and the former shows some new PASS.

Regarding the part you are hesitant about, the code was taken from aarch64_reinterpret_float_as_int in config/aarch64/aarch64.c.

I'm not too keen on splitting the patch unless it's just for review (i.e. still committed as one) since the changes really go together. The tighter predicate and constraint are there to prevent the normal patterns from matching when -mslow-flash-data is in effect, while the new splitter and expander are there to deal with loads under those circumstances.

Best regards,

Thomas

On Fri, 30 Nov 2018 at 14:11, Kyrill Tkachov wrote:
>
> Hi Thomas,
>
> On 19/11/18 17:56, Thomas Preudhomme wrote:
> > Hi,
> >
> > Current code to handle -mslow-flash-data in machine description files
> > suffers from a number of issues which this patch fixes:
> >
> > 1) The insn_and_split in vfp.md to load a generic floating-point
> > constant via GPR first and move it to VFP register are guarded by
> > !reload_completed which is forbidden explicitly in the GCC internals
> > documentation section 17.2 point 3;
> >
> > 2) A number of testcases in the testsuite ICE under -mslow-flash-data
> > when targeting the hardfloat ABI [1];
> >
> > 3) Instructions performing load from literal pool are not disabled.
> >
> > These problems are addressed by 2 separate actions:
> >
> > 1) Making the splitters take a clobber and changing the expanders
> > accordingly to generate a mov with clobber in cases where a literal
> > pool would be used. The splitter can thus be enabled after reload since
> > it does not call gen_reg_rtx anymore;
> >
> > 2) Adding new predicates and constraints to disable literal pool loads
> > in existing instructions when -mslow-flash-data is in effect.
> >
>
> Please split these into two separate patches so we can more clearly see which
> changes address which problem
>
> > The patch also reworks the splitter for DFmode slightly to generate an
> > intermediate DI load instead of 2 intermediate SI loads, thus relying on
> > the existing DI splitters instead of redoing their job. Finally, the
> > patch adds some missing arm_fp_ok effective target to some of the
> > slow-flash-data testcases.
> >
> > [1]
> > c-c++-common/Wunused-var-3.c
> > gcc.c-torture/compile/pr72771.c
> > gcc.c-torture/compile/vector-5.c
> > gcc.c-torture/compile/vector-6.c
> > gcc.c-torture/execute/20030914-1.c
> > gcc.c-torture/execute/20050316-1.c
> > gcc.c-torture/execute/pr59643.c
> > gcc.dg/builtin-tgmath-1.c
> > gcc.dg/debug/pr55730.c
> > gcc.dg/graphite/interchange-7.c
> > gcc.dg/pr56890-2.c
> > gcc.dg/pr68474.c
> > gcc.dg/pr80286.c
> > gcc.dg/torture/pr35227.c
> > gcc.dg/torture/pr65077.c
> > gcc.dg/torture/pr86363.c
> > g++.dg/torture/pr81112.C
> > g++.dg/torture/pr82985.C
> > g++.dg/warn/Wunused-var-7.C
> > and a lot more in libstdc++ in special_functions/*_comp_ellint_* and
> > special_functions/*_ellint_* directories.
> >
> > ChangeLog entries are as follows:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-11-14 Thomas Preud'homme
> >
> > * config/arm/arm.md (arm_movdi): Split if -mslow-flash-data and
> > source is a constant that would be loaded by literal pool.
> > (movsf expander): Generate a no_literal_pool_sf_immediate insn if
> > -mslow-flash-data is present, targeting hardfloat ABI and source is
> > a float constant that cannot be loaded via vmov.
> > (movdf expander): Likewise but generate a
> > no_literal_pool_df_immediate insn.
> > (arm_movsf_soft_insn): Split if -mslow-flash-data and source is a
> > float constant that would be loaded by literal pool.
> > (softfloat constant movsf splitter): Splitter for the above case.
> > (movdf_soft_insn): Split if -mslow-flash-data and source is a float
> > constant that would be loaded by literal pool.
> > (softfloat constant movdf splitter): Splitter for the above case.
> > * config/arm/constraints.md (Pz): Document existing constraint.
> > (Ha): Define constraint.
> > (Tu): Likewise.
> > * config/arm/predicates.md (hard_sf_operand): New predicate.
> > (hard_df_operand): Likewise.
> > * config/arm/thumb2.md (thumb2_movsi_insn): Split if
> > -mslow-flash-data and constant would be loaded by literal pool.
> > * constant/arm/v
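As a rough picture of what "loading a floating-point constant via a GPR" means, the following C sketch reinterprets a float as its 32-bit pattern and splits it into two 16-bit halves. It is not the code of aarch64_reinterpret_float_as_int or of the patch; the helper names are hypothetical, the host is assumed to use IEEE 754 single precision, and the movw/movt remark in the comment is an assumption about one way such an immediate could be built without a literal pool.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the IEEE 754 bit pattern of a float, i.e. the
   integer value a general-purpose register would have to hold. */
static uint32_t float_bits_sketch(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);  /* well-defined way to reinterpret */
    return bits;
}

/* Low and high 16-bit halves of the pattern. Assumption for illustration:
   a 32-bit immediate can be assembled from two such halves (for example
   with a movw/movt pair) instead of being read from a literal pool. */
static uint16_t low_half_sketch(uint32_t bits)
{
    return (uint16_t)(bits & 0xffffu);
}

static uint16_t high_half_sketch(uint32_t bits)
{
    return (uint16_t)(bits >> 16);
}
```

For example, 1.0f has the bit pattern 0x3f800000, so its high half is 0x3f80 and its low half is 0.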
[PATCH, ARM] Error out when -mfpu set and targeting Thumb-1
Hi,

FP instructions are only enabled for TARGET_32BIT and TARGET_HARD_FLOAT, but GCC only gives an error when TARGET_HARD_FLOAT is true and -mfpu is not set. Among other things, this makes some of the cmse tests (e.g. gcc.target/arm/cmse/baseline/softfp.c) fail when targeting -march=armv8-m.base -mfpu= -mfloat-abi=softfp.

This patch errors out when a Thumb-1-like target is selected and an FPU is specified, so that such tests are skipped.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-11-28  Thomas Preud'homme

	* config/arm/arm.c (arm_options_perform_arch_sanity_checks): Error out
	if targeting Thumb-1 with an FPU specified.

*** gcc/testsuite/ChangeLog ***

2018-11-28  Thomas Preud'homme

	* gcc.target/arm/thumb1_mfpu-1.c: New testcase.
	* gcc.target/arm/thumb1_mfpu-2.c: Likewise.

Testing: No testsuite regression when targeting arm-none-eabi Armv6S-M. Fails as expected when targeting Armv6-M with an -mfpu or a default FPU. Succeeds without.

Is this ok for stage3?

Best regards,

Thomas

From 051e38552d7c596873e0303f6ec4272b26d50900 Mon Sep 17 00:00:00 2001
From: Thomas Preud'homme
Date: Tue, 27 Nov 2018 15:52:38 +
Subject: [PATCH] [PATCH, ARM] Error out when -mfpu set and targeting Thumb-1

---
 gcc/config/arm/arm.c                         | 3 +++
 gcc/testsuite/gcc.target/arm/thumb1_mfpu-1.c | 7 +++++++
 gcc/testsuite/gcc.target/arm/thumb1_mfpu-2.c | 8 ++++++++
 3 files changed, 18 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/arm/thumb1_mfpu-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/thumb1_mfpu-2.c

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 40f0574e32e..1a205123cf5 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3747,6 +3747,9 @@ arm_options_perform_arch_sanity_checks (void)
     {
       if (arm_abi == ARM_ABI_IWMMXT)
 	arm_pcs_default = ARM_PCS_AAPCS_IWMMXT;
+      else if (TARGET_THUMB1
+	       && bitmap_bit_p (arm_active_target.isa, isa_bit_vfpv2))
+	error ("Thumb-1 does not allow FP instructions");
       else if (TARGET_HARD_FLOAT_ABI)
 	{
 	  arm_pcs_default = ARM_PCS_AAPCS_VFP;
diff --git a/gcc/testsuite/gcc.target/arm/thumb1_mfpu-1.c b/gcc/testsuite/gcc.target/arm/thumb1_mfpu-1.c
new file mode 100644
index 000..5347e63f9b6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb1_mfpu-1.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_thumb1_ok } */
+/* { dg-skip-if "incompatible float ABI" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=softfp" } } */
+/* { dg-options "-mthumb -mfpu=vfp -mfloat-abi=softfp" } */
+/* { dg-error "Thumb-1 does not allow FP instructions" "" { target *-*-* } 0 } */
+
+int foo;
diff --git a/gcc/testsuite/gcc.target/arm/thumb1_mfpu-2.c b/gcc/testsuite/gcc.target/arm/thumb1_mfpu-2.c
new file mode 100644
index 000..941ed26ed01
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb1_mfpu-2.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_thumb1_ok } */
+/* { dg-skip-if "incompatible float ABI" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=softfp" } } */
+/* No need to skip in presence of -mfpu since arm_thumb1_ok will already fail
+   due to Thumb-1 with -mfpu which is tested by thumb1_mfpu-1 testcase.  */
+/* { dg-options "-mthumb -mfloat-abi=softfp" } */
+
+int foo;
-- 
2.19.1
Re: [PATCH, ARM, ping] Improve robustness of -mslow-flash-data
Ping?

Best regards,

Thomas

On 19/11/2018 17:56, Thomas Preudhomme wrote:

Hi,

Current code to handle -mslow-flash-data in machine description files suffers from a number of issues which this patch fixes:

1) The insn_and_split patterns in vfp.md that load a generic floating-point constant via a GPR first and move it to a VFP register are guarded by !reload_completed, which is explicitly forbidden by the GCC internals documentation, section 17.2 point 3;

2) A number of testcases in the testsuite ICE under -mslow-flash-data when targeting the hardfloat ABI [1];

3) Instructions performing loads from the literal pool are not disabled.

These problems are addressed by 2 separate actions:

1) Making the splitters take a clobber and changing the expanders accordingly to generate a mov with clobber in cases where a literal pool would be used. The splitter can thus be enabled after reload since it does not call gen_reg_rtx anymore;

2) Adding new predicates and constraints to disable literal pool loads in existing instructions when -mslow-flash-data is in effect.

The patch also reworks the splitter for DFmode slightly to generate an intermediate DI load instead of 2 intermediate SI loads, thus relying on the existing DI splitters instead of redoing their job. Finally, the patch adds some missing arm_fp_ok effective target to some of the slow-flash-data testcases.

[1]
c-c++-common/Wunused-var-3.c
gcc.c-torture/compile/pr72771.c
gcc.c-torture/compile/vector-5.c
gcc.c-torture/compile/vector-6.c
gcc.c-torture/execute/20030914-1.c
gcc.c-torture/execute/20050316-1.c
gcc.c-torture/execute/pr59643.c
gcc.dg/builtin-tgmath-1.c
gcc.dg/debug/pr55730.c
gcc.dg/graphite/interchange-7.c
gcc.dg/pr56890-2.c
gcc.dg/pr68474.c
gcc.dg/pr80286.c
gcc.dg/torture/pr35227.c
gcc.dg/torture/pr65077.c
gcc.dg/torture/pr86363.c
g++.dg/torture/pr81112.C
g++.dg/torture/pr82985.C
g++.dg/warn/Wunused-var-7.C
and a lot more in libstdc++ in special_functions/*_comp_ellint_* and special_functions/*_ellint_* directories.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-11-14  Thomas Preud'homme

	* config/arm/arm.md (arm_movdi): Split if -mslow-flash-data and
	source is a constant that would be loaded by literal pool.
	(movsf expander): Generate a no_literal_pool_sf_immediate insn if
	-mslow-flash-data is present, targeting hardfloat ABI and source is a
	float constant that cannot be loaded via vmov.
	(movdf expander): Likewise but generate a no_literal_pool_df_immediate
	insn.
	(arm_movsf_soft_insn): Split if -mslow-flash-data and source is a
	float constant that would be loaded by literal pool.
	(softfloat constant movsf splitter): Splitter for the above case.
	(movdf_soft_insn): Split if -mslow-flash-data and source is a float
	constant that would be loaded by literal pool.
	(softfloat constant movdf splitter): Splitter for the above case.
	* config/arm/constraints.md (Pz): Document existing constraint.
	(Ha): Define constraint.
	(Tu): Likewise.
	* config/arm/predicates.md (hard_sf_operand): New predicate.
	(hard_df_operand): Likewise.
	* config/arm/thumb2.md (thumb2_movsi_insn): Split if
	-mslow-flash-data and constant would be loaded by literal pool.
	* config/arm/vfp.md (thumb2_movsi_vfp): Likewise and disable constant
	load in VFP register.
	(movdi_vfp): Likewise.
	(thumb2_movsf_vfp): Use hard_sf_operand as predicate for source to
	prevent match for a constant load if -mslow-flash-data and constant
	cannot be loaded via vmov. Adapt constraint accordingly by using Ha
	instead of E for generic floating-point constant load.
	(thumb2_movdf_vfp): Likewise using hard_df_operand predicate instead.
	(no_literal_pool_df_immediate): Add a clobber to use as the
	intermediate general purpose register and also enable it after reload
	but disable it if the constant is a valid FP constant. Add constraints
	and generate a DI intermediate load rather than 2 SI loads.
	(no_literal_pool_sf_immediate): Add a clobber to use as the
	intermediate general purpose register and also enable it after reload.

*** gcc/testsuite/ChangeLog ***

2018-11-14  Thomas Preud'homme

	* gcc.target/arm/thumb2-slow-flash-data-2.c: Require arm_fp_ok
	effective target.
	* gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
	* gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
	* gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.

Testing: Built arm-none-eabi cross compilers for Armv7E-M defaulting to softfloat and hardfloat ABI which showed no regression and some FAIL->PASS for hardfloat ABI. Bootstrapped on Arm and Thumb-2 without any regression. Compiled SPEC2k6 without -mslow-flash-data and checked that code generation didn't change.

Is this ok for stage3?

Best regards,

Thomas

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index a773518cefaf8451e77fead9e072ee8ef39f1
Re: [PATCH, libgcc/ARM & testsuite, ping] Optimize executable size when using softfloat fmul/dmul
Ping?

Best regards,

Thomas

On Mon, 19 Nov 2018 at 10:51, Thomas Preudhomme wrote:
>
> FWIW, the testcases were taken from
> https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01026.html
>
> Previous approach for fixing tying of fmul to fdiv can be seen in
> https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01971.html. As mentioned
> in the cover letter, this patch went for a completely different
> approach and does not share any code besides the testcases.
>
> Best regards,
>
> Thomas
> On Mon, 19 Nov 2018 at 09:57, Thomas Preudhomme wrote:
> >
> > Softfloat single precision and double precision floating-point
> > multiplication routines in libgcc share some code with the
> > floating-point division of their corresponding precision. As the code
> > is structured now, this leads to *all* division code being pulled into
> > an executable in softfloat mode even if only multiplication is
> > performed.
> >
> > This patch creates some new LIB1ASMFUNCS macros to also build files with
> > just the multiplication and shared code as weak symbols. By putting
> > these earlier in the static library, they can then be picked up when
> > only multiplication is used and they are overridden by the global
> > definition in the existing file containing both multiplication and
> > division code when division is needed.
> >
> > The patch also removes changes made to the FUNC_START and ARM_FUNC_START
> > macros in r218124 since the intent was to put multiplication and
> > division code into their own section in a later patch to achieve the
> > same size optimization. That approach relied on specific section layout
> > to ensure multiplication and division were not too far from the shared
> > bit of code in order for the branches to be within range. Due to lack of
> > guarantee regarding section layout, in particular with all the
> > possibility of linker scripts, this approach was chosen instead. This
> > patch keeps the two testcases that were posted by Tony Wang (an Arm
> > employee at the time) on the mailing list to implement this approach
> > and adds a new one, hence the attribution.
> >
> > ChangeLog entries are as follows:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-11-14 Thomas Preud'homme
> >
> > * config/arm/elf.h: Update comment about condition that needs to
> > match with libgcc/config/arm/lib1funcs.S to also include
> > libgcc/config/arm/t-arm.
> > * doc/sourcebuild.texi (output-exists, output-exists-not): Rename
> > subsubsection these directives are in to "Check for output files".
> > Move scan-symbol to that section and add to it new scan-symbol-not
> > directive.
> >
> > *** gcc/testsuite/ChangeLog ***
> >
> > 2018-11-16 Tony Wang
> > Thomas Preud'homme
> >
> > * lib/lto.exp (lto-execute): Define output_file and testname_with_flags
> > to same value as execname.
> > (scan-symbol): Move and rename to ...
> > * lib/gcc-dg.exp (scan-symbol-common): This. Adapt into a
> > helper function returning true or false if a symbol is present.
> > (scan-symbol): New procedure.
> > (scan-symbol-not): Likewise.
> > * gcc.target/arm/size-optimization-ieee-1.c: New testcase.
> > * gcc.target/arm/size-optimization-ieee-2.c: Likewise.
> > * gcc.target/arm/size-optimization-ieee-3.c: Likewise.
> >
> > *** libgcc/ChangeLog ***
> >
> > 2018-11-16 Thomas Preud'homme
> >
> > * config/arm/lib1funcs.S (FUNC_START): Remove unused sp_section
> > parameter and corresponding code.
> > (ARM_FUNC_START): Likewise in both definitions.
> > Also update footer comment about condition that needs to match with
> > gcc/config/arm/elf.h to also include libgcc/config/arm/t-arm.
> > * config/arm/ieee754-df.S (muldf3): Also build it if L_arm_muldf3 is
> > defined. Weakly define it in this case.
> > * config/arm/ieee754-sf.S (mulsf3): Likewise with L_arm_mulsf3.
> > * config/arm/t-elf (LIB1ASMFUNCS): Build _arm_muldf3.o and
> > _arm_mulsf3.o before muldiv versions if targeting Thumb-1 only. Add
> > comment to keep condition in sync with the one in
> > libgcc/config/arm/lib1funcs.S and gcc/config/arm/elf.h.
> >
> > Testing: Bootstrapped on arm-linux-gnueabihf (Arm & Thumb-2) and
> > testsuite shows no
> > regression. Also built an arm-none-eabi cross compiler targeting
> > soft-float which also shows no regression. In particular newly added
> > tests and gcc.dg/lto/20081212-1 test pass.
> >
> >
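The weak-symbol mechanism the patch description relies on can be shown in a small C sketch. This is not the libgcc code, which is written in assembly; soft_mul_sketch is a made-up name, and the example assumes a GCC-compatible compiler that supports the weak attribute.

```c
/* Sketch only: soft_mul_sketch is a hypothetical stand-in for a routine such
   as the multiplication helper discussed in this message. */

/* Weak definition: the linker keeps it only if no strong (global) definition
   of the same symbol ends up in the link. A second object file that defines
   soft_mul_sketch without the attribute would silently replace this one. */
__attribute__((weak)) int soft_mul_sketch(int a, int b)
{
    return a * b;
}
```

Within a single file only the weak definition exists, so it is the one that gets called. The override case needs a second translation unit, which is what the placement of the multiplication-only objects earlier in the static library is described as achieving.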
Re: [PATCH, ARM, ping3] PR85434: Prevent spilling of stack protector guard's address on ARM
I'm talking about the PIC access to the guard's variable. See for example the pr85434.c testcase contributed with this patch when compiled for aarch64 with -Os -fpic -march=armv8-a -fstack-protector-strong:

(insn 227 226 228 33 (set (reg:DI 90)
        (high:DI (symbol_ref:DI ("_GLOBAL_OFFSET_TABLE_" "/data/dev/checkouts/private/linaro/gcc/gcc/testsuite/gcc.target/arm/pr85434.c":148:1 -1
     (nil))
(insn 228 227 229 33 (set (reg/f:DI 244)
        (unspec:DI [
                (mem/u/c:DI (lo_sum:DI (reg:DI 90)
                        (symbol_ref:DI ("__stack_chk_guard") [flags 0xc0] )) [0 S8 A8])
            ] UNSPEC_GOTSMALLPIC28K)) "/data/dev/checkouts/private/linaro/gcc/gcc/testsuite/gcc.target/arm/pr85434.c":148:1 -1
     (expr_list:REG_EQUAL (symbol_ref:DI ("__stack_chk_guard") [flags 0xc0] )
        (nil)))
(insn 229 228 230 33 (parallel [
            (set (reg:DI 245)
                (unspec:DI [
                        (mem/v/f/c:DI (plus:DI (reg/f:DI 85 virtual-stack-vars)
                                (const_int -8 [0xfff8])) [4 D.3715+0 S8 A64])
                        (mem/v/f/c:DI (reg/f:DI 244) [4 __stack_chk_guard+0 S8 A64])
                    ] UNSPEC_SP_TEST))
            (clobber (scratch:DI))
        ]) "/data/dev/checkouts/private/linaro/gcc/gcc/testsuite/gcc.target/arm/pr85434.c":148:1 -1
     (nil))

The unspec in insn 228 is not CSEd in my experiment despite the same instruction happening in the prologue to set the canary. In the arm backend it was, but the PIC access is of the form (mem (reg) (unspec offset)), i.e. the outermost rtx in the source is not an unspec.

Best regards,

Thomas

On Wed, 21 Nov 2018 at 17:54, Segher Boessenkool wrote:
>
> On Fri, Nov 16, 2018 at 02:56:46PM +, Thomas Preudhomme wrote:
> > In case of high register pressure in PIC mode, address of the stack
> > protector's guard can be spilled on ARM targets as shown in PR85434,
> > thus allowing an attacker to control what the canary would be compared
> > against. ARM does lack stack_protect_set and stack_protect_test insn
> > patterns; defining them does not help as the address is expanded
> > regularly and the patterns only deal with the copy and test of the
> > guard with the canary.
> >
> > This problem does not occur for x86 targets because the PIC access and
> > the test can be done in the same instruction. Aarch64 is exempt too
> > because PIC access insn patterns are mov of UNSPEC, which prevents
> > the second access in the epilogue from being CSEd in the cse_local pass
> > with the first access in the prologue.
>
> The unspecs are not CSEd because they are *different* unspecs (UNSPEC_SP_SET
> vs. UNSPEC_SP_TEST; they have different args too, different number of args
> even). Two identical unspecs can be CSEd just fine.
>
>
> Segher
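The weakness described in the quoted patch summary (the guard's address being spilled, so that an attacker controls what the canary is compared against) can be modelled in plain C. The sketch below is a simplified model, not GCC's generated code: the struct and all function names are hypothetical, and the "stack frame" is just a struct that the test code overwrites by hand.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a stack frame; every name here is illustrative. */
struct frame_sketch {
    const uintptr_t *spilled_guard_addr; /* guard address spilled to the stack */
    uintptr_t canary;                    /* copy of the guard made on entry */
};

/* Prologue: compute the guard address once, spill it, and store the canary. */
static void prologue_sketch(struct frame_sketch *f, const uintptr_t *guard)
{
    f->spilled_guard_addr = guard;
    f->canary = *guard;
}

/* Epilogue that reloads the spilled address: if an overflow rewrote both the
   canary and the spilled pointer, the comparison can be made to succeed. */
static bool check_via_spilled_addr(const struct frame_sketch *f)
{
    return *f->spilled_guard_addr == f->canary;
}

/* Epilogue that recomputes the guard address instead of trusting the frame. */
static bool check_via_fresh_addr(const struct frame_sketch *f,
                                 const uintptr_t *guard)
{
    return *guard == f->canary;
}
```

If an overflow sets both f.canary and f.spilled_guard_addr to attacker-chosen values, check_via_spilled_addr still reports success while check_via_fresh_addr detects the corruption, which is the distinction the thread is about.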
Re: [PATCH, ARM, ping3] PR85434: Prevent spilling of stack protector guard's address on ARM
Thanks Kyrill. Committed the attached patch. Best regards, Thomas On Wed, 21 Nov 2018 at 16:06, Kyrill Tkachov wrote: > > Hi Thomas, > > Sorry for the delay. > > On 16/11/18 14:56, Thomas Preudhomme wrote: > > Ping? > > > > Best regards, > > > > Thomas > > > > On Sat, 10 Nov 2018 at 15:07, Thomas Preudhomme > > wrote: > >> Thanks Kyrill. > >> > >> Updated patch in attachment. Best regards, > >> > >> Thomas > >> On Thu, 8 Nov 2018 at 15:53, Kyrill Tkachov > >> wrote: > >>> Hi Thomas, > >>> > >>> On 08/11/18 09:52, Thomas Preudhomme wrote: > >>>> Ping? > >>>> > >>>> Best regards, > >>>> > >>>> Thomas > >>>> > >>>> On Thu, 1 Nov 2018 at 16:03, Thomas Preudhomme > >>>> wrote: > >>>>> Ping? > >>>>> > >>>>> Best regards, > >>>>> > >>>>> Thomas > >>>>> On Fri, 26 Oct 2018 at 22:41, Thomas Preudhomme > >>>>> wrote: > >>>>>> Hi, > >>>>>> > >>>>>> Please find updated patch to fix PR85434: spilling of stack protector > >>>>>> guard's address on ARM. Quite a few changes have been made to the ARM > >>>>>> part since last round of review so I think it makes more sense to > >>>>>> review it anew. Ran bootstrap + regression testsuite + glibc build + > >>>>>> glibc regression testsuite for Arm and Thumb-2 and bootstrap + > >>>>>> regression testsuite for Thumb-1. GCC's regression testsuite was run > >>>>>> in 3 configurations in all those cases: > >>>>>> > >>>>>> - default configuration (no RUNTESTFLAGS) > >>>>>> - with -fstack-protector-all > >>>>>> - with -fPIC -fstack-protector-all (to exercise both codepath in stack > >>>>>> protector's split code) > >>>>>> > >>>>>> None of this show any regression beyond some new scan fail with > >>>>>> -fstack-protector-all or -fPIC due to unexpected code sequence for the > >>>>>> testcases concerned and some guality swing due to less optimization > >>>>>> with new stack protector on. > >>>>>> > >>>>>> Patch description and ChangeLog below. 
> >>>>>> > >>>>>> In case of high register pressure in PIC mode, address of the stack > >>>>>> protector's guard can be spilled on ARM targets as shown in PR85434, > >>>>>> thus allowing an attacker to control what the canary would be compared > >>>>>> against. ARM does lack stack_protect_set and stack_protect_test insn > >>>>>> patterns, defining them does not help as the address is expanded > >>>>>> regularly and the patterns only deal with the copy and test of the > >>>>>> guard with the canary. > >>>>>> > >>>>>> This problem does not occur for x86 targets because the PIC access and > >>>>>> the test can be done in the same instruction. Aarch64 is exempt too > >>>>>> because PIC access insn pattern are mov of UNSPEC which prevents it > >>>>>> from > >>>>>> the second access in the epilogue being CSEd in cse_local pass with the > >>>>>> first access in the prologue. > >>>>>> > >>>>>> The approach followed here is to create new "combined" set and test > >>>>>> standard pattern names that take the unexpanded guard and do the set or > >>>>>> test. This allows the target to use an opaque pattern (eg. using > >>>>>> UNSPEC) > >>>>>> to hide the individual instructions being generated to the compiler and > >>>>>> split the pattern into generic load, compare and branch instruction > >>>>>> after register allocator, therefore avoiding any spilling. This is here > >>>>>> implemented for the ARM targets. For targets not implementing these new > >>>>>> standard pattern names, the existing stack_protect_set and > >>>>>> stack_protect_test pattern names are used. > >>>>>> > >>>>>> To be able to split PIC access after re
Re: [PATCH, ARM, ping3] PR85434: Prevent spilling of stack protector guard's address on ARM
Yes, you did indeed, which is why I didn't include you in the To list. I've reworked the Arm part significantly since it was last approved; the ping is meant for the Arm maintainers. Thanks for enquiring about it.

Best regards,

Thomas

On Wed, 21 Nov 2018 at 00:32, Jeff Law wrote:
>
> On 11/16/18 7:56 AM, Thomas Preudhomme wrote:
> > Ping?
> I thought I acked the target independent stuff a while back. What's
> still waiting on review here?
>
> jeff
[PATCH, ARM] Improve robustness of -mslow-flash-data
Hi, Current code to handle -mslow-flash-data in machine description files suffers from a number of issues which this patch fixes: 1) The insn_and_split in vfp.md to load a generic floating-point constant via GPR first and move it to VFP register are guarded by !reload_completed which is forbidden explicitely in the GCC internals documentation section 17.2 point 3; 2) A number of testcase in the testsuite ICEs under -mslow-flash-data when targeting the hardfloat ABI [1]; 3) Instructions performing load from literal pool are not disabled. These problems are addressed by 2 separate actions: 1) Making the splitters take a clobber and changing the expanders accordingly to generate a mov with clobber in cases where a literal pool would be used. The splitter can thus be enabled after reload since it does not call gen_reg_rtx anymore; 2) Adding new predicates and constraints to disable literal pool loads in existing instructions when -mslow-flash-data is in effect. The patch also rework the splitter for DFmode slightly to generate an intermediate DI load instead of 2 intermediate SI loads, thus relying on the existing DI splitters instead of redoing their job. At last, the patch adds some missing arm_fp_ok effective target to some of the slow-flash-data testcases. [1] c-c++-common/Wunused-var-3.c gcc.c-torture/compile/pr72771.c gcc.c-torture/compile/vector-5.c gcc.c-torture/compile/vector-6.c gcc.c-torture/execute/20030914-1.c gcc.c-torture/execute/20050316-1.c gcc.c-torture/execute/pr59643.c gcc.dg/builtin-tgmath-1.c gcc.dg/debug/pr55730.c gcc.dg/graphite/interchange-7.c gcc.dg/pr56890-2.c gcc.dg/pr68474.c gcc.dg/pr80286.c gcc.dg/torture/pr35227.c gcc.dg/torture/pr65077.c gcc.dg/torture/pr86363.c g++.dg/torture/pr81112.C g++.dg/torture/pr82985.C g++.dg/warn/Wunused-var-7.C and a lot more in libstdc++ in special_functions/*_comp_ellint_* and special_functions/*_ellint_* directories. 
ChangeLog entries are as follows: *** gcc/ChangeLog *** 2018-11-14 Thomas Preud'homme * config/arm/arm.md (arm_movdi): Split if -mslow-flash-data and source is a constant that would be loaded by literal pool. (movsf expander): Generate a no_literal_pool_sf_immediate insn if -mslow-flash-data is present, targeting hardfloat ABI and source is a float constant that cannot be loaded via vmov. (movdf expander): Likewise but generate a no_literal_pool_df_immediate insn. (arm_movsf_soft_insn): Split if -mslow-flash-data and source is a float constant that would be loaded by literal pool. (softfloat constant movsf splitter): Splitter for the above case. (movdf_soft_insn): Split if -mslow-flash-data and source is a float constant that would be loaded by literal pool. (softfloat constant movdf splitter): Splitter for the above case. * config/arm/constraints.md (Pz): Document existing constraint. (Ha): Define constraint. (Tu): Likewise. * config/arm/predicates.md (hard_sf_operand): New predicate. (hard_df_operand): Likewise. * config/arm/thumb2.md (thumb2_movsi_insn): Split if -mslow-flash-data and constant would be loaded by literal pool. * config/arm/vfp.md (thumb2_movsi_vfp): Likewise and disable constant load in VFP register. (movdi_vfp): Likewise. (thumb2_movsf_vfp): Use hard_sf_operand as predicate for source to prevent match for a constant load if -mslow-flash-data and constant cannot be loaded via vmov. Adapt constraint accordingly by using Ha instead of E for generic floating-point constant load. (thumb2_movdf_vfp): Likewise using hard_df_operand predicate instead. (no_literal_pool_df_immediate): Add a clobber to use as the intermediate general purpose register and also enable it after reload but disable it if the constant is a valid FP constant. Add constraints and generate a DI intermediate load rather than 2 SI loads. (no_literal_pool_sf_immediate): Add a clobber to use as the intermediate general purpose register and also enable it after reload.
*** gcc/testsuite/ChangeLog *** 2018-11-14 Thomas Preud'homme * gcc.target/arm/thumb2-slow-flash-data-2.c: Require arm_fp_ok effective target. * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise. * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise. * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise. Testing: Built arm-none-eabi cross compilers for Armv7E-M defaulting to softfloat and hardfloat ABI which showed no regression and some FAIL->PASS for hardfloat ABI. Bootstrapped on Arm and Thumb-2 without any regression. Compiled SPEC2k6 without -mslow-flash-data and checked that code generation didn't change. Is this ok for stage3? Best regards, Thomas diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index
Re: [PATCH, libgcc/ARM & testsuite] Optimize executable size when using softfloat fmul/dmul
FWIW, the testcases were taken from https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01026.html Previous approach for fixing tying of fmul to fdiv can be seen in https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01971.html. As mentioned in the cover letter, this patch went for a completely different approach and does not share any code besides the testcases. Best regards, Thomas On Mon, 19 Nov 2018 at 09:57, Thomas Preudhomme wrote: > > Softfloat single precision and double precision floating-point > multiplication routines in libgcc share some code with the > floating-point division of their corresponding precision. As the code > is structured now, this leads to *all* division code being pulled in an > executable in softfloat mode even if only multiplication is > performed. > > This patch create some new LIB1ASMFUNCS macros to also build files with > just the multiplication and shared code as weak symbols. By putting > these earlier in the static library, they can then be picked up when > only multiplication is used and they are overriden by the global > definition in the existing file containing both multiplication and > division code when division is needed. > > The patch also removes changes made to the FUNC_START and ARM_FUNC_START > macros in r218124 since the intent was to put multiplication and > division code into their own section in a later patch to achieve the > same size optimization. That approach relied on specific section layout > to ensure multiplication and division were not too far from the shared > bit of code in order to the branches to be within range. Due to lack of > guarantee regarding section layout, in particular with all the > possibility of linker scripts, this approach was chosen instead. This > patch keeps the two testcases that were posted by Tony Wang (an Arm > employee at the time) on the mailing list to implement this approach > and adds a new one, hence the attribution. 
> > ChangeLog entries are as follows: > > *** gcc/ChangeLog *** > > 2018-11-14 Thomas Preud'homme > > * config/arm/elf.h: Update comment about condition that need to > match with libgcc/config/arm/lib1funcs.S to also include > libgcc/config/arm/t-arm. > * doc/sourcebuild.texi (output-exists, output-exists-not): Rename > subsubsection these directives are in to "Check for output files". > Move scan-symbol to that section and add to it new scan-symbol-not > directive. > > *** gcc/testsuite/ChangeLog *** > > 2018-11-16 Tony Wang > Thomas Preud'homme > > * lib/lto.exp (lto-execute): Define output_file and testname_with_flags > to same value as execname. > (scan-symbol): Move and rename to ... > * lib/gcc-dg.exp (scan-symbol-common): This. Adapt into a > helper function returning true or false if a symbol is present. > (scan-symbol): New procedure. > (scan-symbol-not): Likewise. > * gcc.target/arm/size-optimization-ieee-1.c: New testcase. > * gcc.target/arm/size-optimization-ieee-2.c: Likewise. > * gcc.target/arm/size-optimization-ieee-3.c: Likewise. > > *** libgcc/ChangeLog *** > > 2018-11-16 Thomas Preud'homme > > * /config/arm/lib1funcs.S (FUNC_START): Remove unused sp_section > parameter and corresponding code. > (ARM_FUNC_START): Likewise in both definitions. > Also update footer comment about condition that need to match with > gcc/config/arm/elf.h to also include libgcc/config/arm/t-arm. > * config/arm/ieee754-df.S (muldf3): Also build it if L_arm_muldf3 is > defined. Weakly define it in this case. > * config/arm/ieee754-sf.S (mulsf3): Likewise with L_arm_mulsf3. > * config/arm/t-elf (LIB1ASMFUNCS): Build _arm_muldf3.o and > _arm_mulsf3.o before muldiv versions if targeting Thumb-1 only. Add > comment to keep condition in sync with the one in > libgcc/config/arm/lib1funcs.S and gcc/config/arm/elf.h. > > Testing: Bootstrapped on arm-linux-gnueabihf (Arm & Thumb-2) and > testsuite shows no > regression. 
Also built an arm-none-eabi cross compiler targeting > soft-float which also shows no regression. In particular newly added > tests and gcc.dg/lto/20081212-1 test pass. > > Is this ok for stage3? > > Best regards, > > Thomas
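The weak half of the scheme discussed in this thread can be sketched in C as below. This is only an illustration under assumptions: demo_mul is a made-up stand-in for the multiplication routine, and the real patch works on assembly files and archive member order in libgcc rather than on C functions.

```c
#include <assert.h>

/* Hypothetical stand-in for the multiplication-only object: the routine
   is defined weakly, so it is used as long as nothing better shows up.
   A strong (global) definition of demo_mul in another object file,
   standing in for the combined multiplication and division file, would
   take precedence over this one at link time. */
__attribute__((weak)) float demo_mul(float a, float b)
{
    return a * b;
}

/* Usage example: with only this file linked, the weak definition is
   the one that gets called. */
void demo_mul_selfcheck(void)
{
    assert(demo_mul(3.0f, 4.0f) == 12.0f);
}
```

The size saving comes from the link order rather than from the attribute itself: a program that never divides only pulls in the small object, while one that also divides pulls in the combined object, whose global definition overrides the weak one.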
[PATCH, libgcc/ARM & testsuite] Optimize executable size when using softfloat fmul/dmul
Softfloat single precision and double precision floating-point multiplication routines in libgcc share some code with the floating-point division of their corresponding precision. As the code is structured now, this leads to *all* division code being pulled into an executable in softfloat mode even if only multiplication is performed. This patch creates some new LIB1ASMFUNCS macros to also build files with just the multiplication and shared code as weak symbols. By putting these earlier in the static library, they can then be picked up when only multiplication is used, and they are overridden by the global definition in the existing file containing both multiplication and division code when division is needed. The patch also removes changes made to the FUNC_START and ARM_FUNC_START macros in r218124 since the intent was to put multiplication and division code into their own section in a later patch to achieve the same size optimization. That approach relied on a specific section layout to ensure multiplication and division were not too far from the shared bit of code in order for the branches to be within range. Due to the lack of guarantee regarding section layout, in particular with all the possibilities offered by linker scripts, the weak symbol approach was chosen instead. This patch keeps the two testcases that were posted by Tony Wang (an Arm employee at the time) on the mailing list to implement the section-based approach and adds a new one, hence the attribution. ChangeLog entries are as follows: *** gcc/ChangeLog *** 2018-11-14 Thomas Preud'homme * config/arm/elf.h: Update comment about condition that need to match with libgcc/config/arm/lib1funcs.S to also include libgcc/config/arm/t-arm. * doc/sourcebuild.texi (output-exists, output-exists-not): Rename subsubsection these directives are in to "Check for output files". Move scan-symbol to that section and add to it new scan-symbol-not directive.
*** gcc/testsuite/ChangeLog *** 2018-11-16 Tony Wang Thomas Preud'homme * lib/lto.exp (lto-execute): Define output_file and testname_with_flags to same value as execname. (scan-symbol): Move and rename to ... * lib/gcc-dg.exp (scan-symbol-common): This. Adapt into a helper function returning true or false if a symbol is present. (scan-symbol): New procedure. (scan-symbol-not): Likewise. * gcc.target/arm/size-optimization-ieee-1.c: New testcase. * gcc.target/arm/size-optimization-ieee-2.c: Likewise. * gcc.target/arm/size-optimization-ieee-3.c: Likewise. *** libgcc/ChangeLog *** 2018-11-16 Thomas Preud'homme * /config/arm/lib1funcs.S (FUNC_START): Remove unused sp_section parameter and corresponding code. (ARM_FUNC_START): Likewise in both definitions. Also update footer comment about condition that need to match with gcc/config/arm/elf.h to also include libgcc/config/arm/t-arm. * config/arm/ieee754-df.S (muldf3): Also build it if L_arm_muldf3 is defined. Weakly define it in this case. * config/arm/ieee754-sf.S (mulsf3): Likewise with L_arm_mulsf3. * config/arm/t-elf (LIB1ASMFUNCS): Build _arm_muldf3.o and _arm_mulsf3.o before muldiv versions if targeting Thumb-1 only. Add comment to keep condition in sync with the one in libgcc/config/arm/lib1funcs.S and gcc/config/arm/elf.h. Testing: Bootstrapped on arm-linux-gnueabihf (Arm & Thumb-2) and testsuite shows no regression. Also built an arm-none-eabi cross compiler targeting soft-float which also shows no regression. In particular newly added tests and gcc.dg/lto/20081212-1 test pass. Is this ok for stage3? 
Best regards, Thomas From 8740697791f99b7175e188f049663883c39e51b0 Mon Sep 17 00:00:00 2001 From: Thomas Preud'homme Date: Fri, 26 Oct 2018 16:21:09 +0100 Subject: [PATCH] [PATCH, libgcc/ARM] Optimize executable size when using softfloat fmul/dmul Softfloat single precision and double precision floating-point multiplication routines in libgcc share some code with the floating-point division of their corresponding precision. As the code is structured now, this leads to *all* division code being pulled in an executable in softfloat mode even if only multiplication is performed. This patch create some new LIB1ASMFUNCS macros to also build files with just the multiplication and shared code as weak symbols. By putting these earlier in the static library, they can then be picked up when only multiplication is used and they are overriden by the global definition in the existing file containing both multiplication and division code when division is needed. The patch also removes changes made to the FUNC_START and ARM_FUNC_START macros in r218124 since the intent was to put multiplication and division code into their own section in a later patch to achieve the same size optimization. That approach relied on specific section layout to ensure multiplication and division were not too far from the shared bit of code in order to the branches to be within
Re: [PATCH, ARM, ping3] PR85434: Prevent spilling of stack protector guard's address on ARM
Ping? Best regards, Thomas On Sat, 10 Nov 2018 at 15:07, Thomas Preudhomme wrote: > > Thanks Kyrill. > > Updated patch in attachment. Best regards, > > Thomas > On Thu, 8 Nov 2018 at 15:53, Kyrill Tkachov > wrote: > > > > Hi Thomas, > > > > On 08/11/18 09:52, Thomas Preudhomme wrote: > > > Ping? > > > > > > Best regards, > > > > > > Thomas > > > > > > On Thu, 1 Nov 2018 at 16:03, Thomas Preudhomme > > > wrote: > > >> Ping? > > >> > > >> Best regards, > > >> > > >> Thomas > > >> On Fri, 26 Oct 2018 at 22:41, Thomas Preudhomme > > >> wrote: > > >>> Hi, > > >>> > > >>> Please find updated patch to fix PR85434: spilling of stack protector > > >>> guard's address on ARM. Quite a few changes have been made to the ARM > > >>> part since last round of review so I think it makes more sense to > > >>> review it anew. Ran bootstrap + regression testsuite + glibc build + > > >>> glibc regression testsuite for Arm and Thumb-2 and bootstrap + > > >>> regression testsuite for Thumb-1. GCC's regression testsuite was run > > >>> in 3 configurations in all those cases: > > >>> > > >>> - default configuration (no RUNTESTFLAGS) > > >>> - with -fstack-protector-all > > >>> - with -fPIC -fstack-protector-all (to exercise both codepath in stack > > >>> protector's split code) > > >>> > > >>> None of this show any regression beyond some new scan fail with > > >>> -fstack-protector-all or -fPIC due to unexpected code sequence for the > > >>> testcases concerned and some guality swing due to less optimization > > >>> with new stack protector on. > > >>> > > >>> Patch description and ChangeLog below. > > >>> > > >>> In case of high register pressure in PIC mode, address of the stack > > >>> protector's guard can be spilled on ARM targets as shown in PR85434, > > >>> thus allowing an attacker to control what the canary would be compared > > >>> against. 
ARM does lack stack_protect_set and stack_protect_test insn > > >>> patterns, defining them does not help as the address is expanded > > >>> regularly and the patterns only deal with the copy and test of the > > >>> guard with the canary. > > >>> > > >>> This problem does not occur for x86 targets because the PIC access and > > >>> the test can be done in the same instruction. Aarch64 is exempt too > > >>> because PIC access insn pattern are mov of UNSPEC which prevents it from > > >>> the second access in the epilogue being CSEd in cse_local pass with the > > >>> first access in the prologue. > > >>> > > >>> The approach followed here is to create new "combined" set and test > > >>> standard pattern names that take the unexpanded guard and do the set or > > >>> test. This allows the target to use an opaque pattern (eg. using UNSPEC) > > >>> to hide the individual instructions being generated to the compiler and > > >>> split the pattern into generic load, compare and branch instruction > > >>> after register allocator, therefore avoiding any spilling. This is here > > >>> implemented for the ARM targets. For targets not implementing these new > > >>> standard pattern names, the existing stack_protect_set and > > >>> stack_protect_test pattern names are used. > > >>> > > >>> To be able to split PIC access after register allocation, the functions > > >>> had to be augmented to force a new PIC register load and to control > > >>> which register it loads into. This is because sharing the PIC register > > >>> between prologue and epilogue could lead to spilling due to CSE again > > >>> which an attacker could use to control what the canary gets compared > > >>> against. > > >>> > > >>> ChangeLog entries are as follows: > > >>> > > >>> *** gcc/ChangeLog *** > > >>> > > >>> 2018-10-26 Thomas Preud'homme > > >>> > > >>> * target-insns.def (stack_protect_combined_set): Define new standard > > >>> pattern name. > > >>> (stack_protect_combined_test): Like
Re: [PATCH, ARM, ping2] PR85434: Prevent spilling of stack protector guard's address on ARM
Thanks Kyrill. Updated patch in attachment. Best regards, Thomas On Thu, 8 Nov 2018 at 15:53, Kyrill Tkachov wrote: > > Hi Thomas, > > On 08/11/18 09:52, Thomas Preudhomme wrote: > > Ping? > > > > Best regards, > > > > Thomas > > > > On Thu, 1 Nov 2018 at 16:03, Thomas Preudhomme > > wrote: > >> Ping? > >> > >> Best regards, > >> > >> Thomas > >> On Fri, 26 Oct 2018 at 22:41, Thomas Preudhomme > >> wrote: > >>> Hi, > >>> > >>> Please find updated patch to fix PR85434: spilling of stack protector > >>> guard's address on ARM. Quite a few changes have been made to the ARM > >>> part since last round of review so I think it makes more sense to > >>> review it anew. Ran bootstrap + regression testsuite + glibc build + > >>> glibc regression testsuite for Arm and Thumb-2 and bootstrap + > >>> regression testsuite for Thumb-1. GCC's regression testsuite was run > >>> in 3 configurations in all those cases: > >>> > >>> - default configuration (no RUNTESTFLAGS) > >>> - with -fstack-protector-all > >>> - with -fPIC -fstack-protector-all (to exercise both codepath in stack > >>> protector's split code) > >>> > >>> None of this show any regression beyond some new scan fail with > >>> -fstack-protector-all or -fPIC due to unexpected code sequence for the > >>> testcases concerned and some guality swing due to less optimization > >>> with new stack protector on. > >>> > >>> Patch description and ChangeLog below. > >>> > >>> In case of high register pressure in PIC mode, address of the stack > >>> protector's guard can be spilled on ARM targets as shown in PR85434, > >>> thus allowing an attacker to control what the canary would be compared > >>> against. ARM does lack stack_protect_set and stack_protect_test insn > >>> patterns, defining them does not help as the address is expanded > >>> regularly and the patterns only deal with the copy and test of the > >>> guard with the canary. 
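To make the risk described in the quoted paragraph concrete, here is a small host-side model in C. It is a sketch under assumptions, not code from the patch: the frame layout, the names and the constant values are all hypothetical, and the only point is that a check which reloads the guard through a spilled address can be steered by whoever overwrites that address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Epilogue-style check: reload the guard through guard_addr and compare
   it with the canary saved in the frame. */
static bool canary_matches(const uintptr_t *guard_addr, uintptr_t frame_canary)
{
    return *guard_addr == frame_canary;
}

/* Hypothetical frame in which the guard's address was spilled to the
   stack next to the canary. */
struct demo_frame {
    uintptr_t canary;              /* copy of the guard made in the prologue */
    const uintptr_t *spilled_addr; /* spilled address of the guard */
};

static const uintptr_t demo_real_guard = 0x5a5a5a5au;
static const uintptr_t demo_attacker_value = 0x41414141u;

/* The overflow rewrites both the canary and the spilled address. */
bool overflow_detected_with_spill(void)
{
    struct demo_frame frame = { demo_real_guard, &demo_real_guard };

    frame.canary = demo_attacker_value;
    frame.spilled_addr = &demo_attacker_value;
    return !canary_matches(frame.spilled_addr, frame.canary);
}

/* The address never reaches the stack, so only the canary is clobbered. */
bool overflow_detected_without_spill(void)
{
    uintptr_t frame_canary = demo_attacker_value;

    return !canary_matches(&demo_real_guard, frame_canary);
}

/* Usage example: the overflow goes unnoticed only in the spilled case. */
void demo_spill_effect(void)
{
    assert(!overflow_detected_with_spill());
    assert(overflow_detected_without_spill());
}
```

In the spilled case the comparison is between two attacker-chosen values, so it succeeds; keeping the address computation inside a pattern that is only split after register allocation removes the stack slot the model depends on.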
> >>> > >>> This problem does not occur for x86 targets because the PIC access and > >>> the test can be done in the same instruction. Aarch64 is exempt too > >>> because PIC access insn pattern are mov of UNSPEC which prevents it from > >>> the second access in the epilogue being CSEd in cse_local pass with the > >>> first access in the prologue. > >>> > >>> The approach followed here is to create new "combined" set and test > >>> standard pattern names that take the unexpanded guard and do the set or > >>> test. This allows the target to use an opaque pattern (eg. using UNSPEC) > >>> to hide the individual instructions being generated to the compiler and > >>> split the pattern into generic load, compare and branch instruction > >>> after register allocator, therefore avoiding any spilling. This is here > >>> implemented for the ARM targets. For targets not implementing these new > >>> standard pattern names, the existing stack_protect_set and > >>> stack_protect_test pattern names are used. > >>> > >>> To be able to split PIC access after register allocation, the functions > >>> had to be augmented to force a new PIC register load and to control > >>> which register it loads into. This is because sharing the PIC register > >>> between prologue and epilogue could lead to spilling due to CSE again > >>> which an attacker could use to control what the canary gets compared > >>> against. > >>> > >>> ChangeLog entries are as follows: > >>> > >>> *** gcc/ChangeLog *** > >>> > >>> 2018-10-26 Thomas Preud'homme > >>> > >>> * target-insns.def (stack_protect_combined_set): Define new standard > >>> pattern name. > >>> (stack_protect_combined_test): Likewise. > >>> * cfgexpand.c (stack_protect_prologue): Try new > >>> stack_protect_combined_set pattern first. > >>> * function.c (stack_protect_epilogue): Try new > >>> stack_protect_combined_test pattern first. 
> >>> * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now > >>> parameters to control which register to use as PIC register and force > >>> reloading PIC register respectively. Insert in the stream of insns if > >
Re: [PATCH, ARM, ping2] PR85434: Prevent spilling of stack protector guard's address on ARM
Ping? Best regards, Thomas On Thu, 1 Nov 2018 at 16:03, Thomas Preudhomme wrote: > > Ping? > > Best regards, > > Thomas > On Fri, 26 Oct 2018 at 22:41, Thomas Preudhomme > wrote: > > > > Hi, > > > > Please find updated patch to fix PR85434: spilling of stack protector > > guard's address on ARM. Quite a few changes have been made to the ARM > > part since last round of review so I think it makes more sense to > > review it anew. Ran bootstrap + regression testsuite + glibc build + > > glibc regression testsuite for Arm and Thumb-2 and bootstrap + > > regression testsuite for Thumb-1. GCC's regression testsuite was run > > in 3 configurations in all those cases: > > > > - default configuration (no RUNTESTFLAGS) > > - with -fstack-protector-all > > - with -fPIC -fstack-protector-all (to exercise both codepath in stack > > protector's split code) > > > > None of this show any regression beyond some new scan fail with > > -fstack-protector-all or -fPIC due to unexpected code sequence for the > > testcases concerned and some guality swing due to less optimization > > with new stack protector on. > > > > Patch description and ChangeLog below. > > > > In case of high register pressure in PIC mode, address of the stack > > protector's guard can be spilled on ARM targets as shown in PR85434, > > thus allowing an attacker to control what the canary would be compared > > against. ARM does lack stack_protect_set and stack_protect_test insn > > patterns, defining them does not help as the address is expanded > > regularly and the patterns only deal with the copy and test of the > > guard with the canary. > > > > This problem does not occur for x86 targets because the PIC access and > > the test can be done in the same instruction. Aarch64 is exempt too > > because PIC access insn pattern are mov of UNSPEC which prevents it from > > the second access in the epilogue being CSEd in cse_local pass with the > > first access in the prologue. 
> > > > The approach followed here is to create new "combined" set and test > > standard pattern names that take the unexpanded guard and do the set or > > test. This allows the target to use an opaque pattern (eg. using UNSPEC) > > to hide the individual instructions being generated to the compiler and > > split the pattern into generic load, compare and branch instruction > > after register allocator, therefore avoiding any spilling. This is here > > implemented for the ARM targets. For targets not implementing these new > > standard pattern names, the existing stack_protect_set and > > stack_protect_test pattern names are used. > > > > To be able to split PIC access after register allocation, the functions > > had to be augmented to force a new PIC register load and to control > > which register it loads into. This is because sharing the PIC register > > between prologue and epilogue could lead to spilling due to CSE again > > which an attacker could use to control what the canary gets compared > > against. > > > > ChangeLog entries are as follows: > > > > *** gcc/ChangeLog *** > > > > 2018-10-26 Thomas Preud'homme > > > > * target-insns.def (stack_protect_combined_set): Define new standard > > pattern name. > > (stack_protect_combined_test): Likewise. > > * cfgexpand.c (stack_protect_prologue): Try new > > stack_protect_combined_set pattern first. > > * function.c (stack_protect_epilogue): Try new > > stack_protect_combined_test pattern first. > > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now > > parameters to control which register to use as PIC register and force > > reloading PIC register respectively. Insert in the stream of insns if > > possible. > > (legitimize_pic_address): Expose above new parameters in prototype and > > adapt recursive calls accordingly. Use pic_reg if non null instead of > > cached one. > > (arm_load_pic_register): Add pic_reg parameter and use it if non null. 
> > (arm_legitimize_address): Adapt to new legitimize_pic_address > > prototype. > > (thumb_legitimize_address): Likewise. > > (arm_emit_call_insn): Adapt to require_pic_register prototype change. > > (arm_expand_prologue): Adapt to arm_load_pic_register prototype change. > > (thumb1_expand_prologue): Likewise. > > * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype > > change. > > (arm_load_pic_register): Likewise. > > * config/arm/predicated.md (guard_addr_operand): New predicate. > > (guard_operand): N
Re: [PATCH, ARM, ping] PR85434: Prevent spilling of stack protector guard's address on ARM
Ping? Best regards, Thomas On Fri, 26 Oct 2018 at 22:41, Thomas Preudhomme wrote: > > Hi, > > Please find updated patch to fix PR85434: spilling of stack protector > guard's address on ARM. Quite a few changes have been made to the ARM > part since last round of review so I think it makes more sense to > review it anew. Ran bootstrap + regression testsuite + glibc build + > glibc regression testsuite for Arm and Thumb-2 and bootstrap + > regression testsuite for Thumb-1. GCC's regression testsuite was run > in 3 configurations in all those cases: > > - default configuration (no RUNTESTFLAGS) > - with -fstack-protector-all > - with -fPIC -fstack-protector-all (to exercise both codepath in stack > protector's split code) > > None of this show any regression beyond some new scan fail with > -fstack-protector-all or -fPIC due to unexpected code sequence for the > testcases concerned and some guality swing due to less optimization > with new stack protector on. > > Patch description and ChangeLog below. > > In case of high register pressure in PIC mode, address of the stack > protector's guard can be spilled on ARM targets as shown in PR85434, > thus allowing an attacker to control what the canary would be compared > against. ARM does lack stack_protect_set and stack_protect_test insn > patterns, defining them does not help as the address is expanded > regularly and the patterns only deal with the copy and test of the > guard with the canary. > > This problem does not occur for x86 targets because the PIC access and > the test can be done in the same instruction. Aarch64 is exempt too > because PIC access insn pattern are mov of UNSPEC which prevents it from > the second access in the epilogue being CSEd in cse_local pass with the > first access in the prologue. > > The approach followed here is to create new "combined" set and test > standard pattern names that take the unexpanded guard and do the set or > test. This allows the target to use an opaque pattern (eg. 
using UNSPEC) > to hide the individual instructions being generated to the compiler and > split the pattern into generic load, compare and branch instruction > after register allocator, therefore avoiding any spilling. This is here > implemented for the ARM targets. For targets not implementing these new > standard pattern names, the existing stack_protect_set and > stack_protect_test pattern names are used. > > To be able to split PIC access after register allocation, the functions > had to be augmented to force a new PIC register load and to control > which register it loads into. This is because sharing the PIC register > between prologue and epilogue could lead to spilling due to CSE again > which an attacker could use to control what the canary gets compared > against. > > ChangeLog entries are as follows: > > *** gcc/ChangeLog *** > > 2018-10-26 Thomas Preud'homme > > * target-insns.def (stack_protect_combined_set): Define new standard > pattern name. > (stack_protect_combined_test): Likewise. > * cfgexpand.c (stack_protect_prologue): Try new > stack_protect_combined_set pattern first. > * function.c (stack_protect_epilogue): Try new > stack_protect_combined_test pattern first. > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now > parameters to control which register to use as PIC register and force > reloading PIC register respectively. Insert in the stream of insns if > possible. > (legitimize_pic_address): Expose above new parameters in prototype and > adapt recursive calls accordingly. Use pic_reg if non null instead of > cached one. > (arm_load_pic_register): Add pic_reg parameter and use it if non null. > (arm_legitimize_address): Adapt to new legitimize_pic_address > prototype. > (thumb_legitimize_address): Likewise. > (arm_emit_call_insn): Adapt to require_pic_register prototype change. > (arm_expand_prologue): Adapt to arm_load_pic_register prototype change. > (thumb1_expand_prologue): Likewise. 
> * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype > change. > (arm_load_pic_register): Likewise. > * config/arm/predicated.md (guard_addr_operand): New predicate. > (guard_operand): New predicate. > * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address > prototype change. > (builtin_setjmp_receiver expander): Adapt to thumb1_expand_prologue > prototype change. > (stack_protect_combined_set): New expander.. > (stack_protect_combined_set_insn): New insn_and_split pattern. > (stack_protect_set_insn): New insn pattern. > (stack_protect_combined_test): New expander. > (stack_protect_combined_test_insn): New insn_and_split pattern. > (arm_stack_protect_test_insn): New insn pattern. > *
Re: [PATCH, GCC/ARM, ping3] Fix PR87374: ICE with -mslow-flash-data and -mword-relocations
Ping? Best regards, Thomas On Tue, 23 Oct 2018 at 10:10, Thomas Preudhomme wrote: > > Ping? > > Best regards, > > Thomas > > On Mon, 15 Oct 2018 at 16:01, Thomas Preudhomme > wrote: > > > > Ping? > > > > Best regards, > > > > Thomas > > On Fri, 5 Oct 2018 at 17:50, Thomas Preudhomme > > wrote: > > > > > > Hi Ramana and Kyrill, > > > > > > I've reworked the patch to add some documentation of the option > > > conflict and reworked the -mword-relocation logic slightly to set the > > > variable explicitely in PIC mode rather than test for PIC and word > > > relocation everywhere. > > > > > > ChangeLog entries are now as follows: > > > > > > *** gcc/ChangeLog *** > > > > > > 2018-10-02 Thomas Preud'homme > > > > > > PR target/87374 > > > * config/arm/arm.c (arm_option_check_internal): Disable the combined > > > use of -mslow-flash-data and -mword-relocations. > > > (arm_option_override): Enable -mword-relocations if -fpic or -fPIC. > > > * config/arm/arm.md (SYMBOL_REF MOVT splitter): Stop checking for > > > flag_pic. > > > * doc/invoke.texi (-mword-relocations): Mention conflict with > > > -mslow-flash-data. > > > (-mslow-flash-data): Reciprocally. > > > > > > *** gcc/testsuite/ChangeLog *** > > > > > > 2018-09-25 Thomas Preud'homme > > > > > > PR target/87374 > > > * gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data and > > > -mword-relocations would be passed when compiling the test. > > > * gcc.target/arm/movsi_movt.c: Likewise. > > > * gcc.target/arm/pr81863.c: Likewise. > > > * gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise. > > > * gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise. > > > * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise. > > > * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise. > > > * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise. > > > * gcc.target/arm/tls-disable-literal-pool.c: Likewise. > > > > > > Is this ok for trunk? 
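The option handling listed in the ChangeLog above can be pictured with the following C sketch. The struct and function names are hypothetical and do not match GCC's internal variables; the sketch only restates the two rules given in this message, namely that -fpic or -fPIC turns on word relocations and that -mslow-flash-data cannot be combined with -mword-relocations.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical option record; field names are illustrative only. */
struct demo_arm_opts {
    bool pic;              /* -fpic or -fPIC */
    bool word_relocations; /* -mword-relocations */
    bool slow_flash_data;  /* -mslow-flash-data */
};

/* Override step: PIC code implies word relocations, so later code only
   has to test one flag. */
void demo_option_override(struct demo_arm_opts *opts)
{
    if (opts->pic)
        opts->word_relocations = true;
}

/* Check step: with both options set there is no way left to load an
   address, since literal pools and MOVW/MOVT are both forbidden. */
bool demo_options_conflict(const struct demo_arm_opts *opts)
{
    return opts->slow_flash_data && opts->word_relocations;
}

/* Usage example. */
void demo_option_selfcheck(void)
{
    struct demo_arm_opts both = { false, true, true };
    struct demo_arm_opts only_slow = { false, false, true };

    assert(demo_options_conflict(&both));
    assert(!demo_options_conflict(&only_slow));
}
```

Setting the flag once in the override step is what lets the MOVT splitter stop testing flag_pic separately, as the ChangeLog entry for arm.md notes.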
> > > > > > Best regards, > > > > > > Thomas > > > > > > On Tue, 2 Oct 2018 at 13:39, Ramana Radhakrishnan > > > wrote: > > > > > > > > On 02/10/2018 11:42, Thomas Preudhomme wrote: > > > > > Hi Ramana, > > > > > > > > > > On Thu, 27 Sep 2018 at 11:14, Ramana Radhakrishnan > > > > > wrote: > > > > >> > > > > >> On 27/09/2018 09:26, Kyrill Tkachov wrote: > > > > >>> Hi Thomas, > > > > >>> > > > > >>> On 26/09/18 18:39, Thomas Preudhomme wrote: > > > > >>>> Hi, > > > > >>>> > > > > >>>> GCC ICEs under -mslow-flash-data and -mword-relocations because > > > > >>>> there > > > > >>>> is no way to load an address, both literal pools and MOVW/MOVT > > > > >>>> being > > > > >>>> forbidden. This patch gives an error message when both options are > > > > >>>> specified by the user and adds the according dg-skip-if directives > > > > >>>> for > > > > >>>> tests that use either of these options. > > > > >>>> > > > > >>>> ChangeLog entries are as follows: > > > > >>>> > > > > >>>> *** gcc/ChangeLog *** > > > > >>>> > > > > >>>> 2018-09-25 Thomas Preud'homme > > > > >>>> > > > > >>>>PR target/87374 > > > > >>>>* config/arm/arm.c (arm_option_check_internal): Disable the > > > > >>>> combined > > > > >>>>use of -mslow-flash-data and -mword-relocations. > > > > >>>> > > > > >>>> *** gcc/testsuite/ChangeLog *** > > > > >>>> > > > > >>>> 2018-09-25 Thomas Preud'homme > > > > >>>> > > > > >>>>PR target/87374 > > > > >>>>* gcc.target/arm/movdi_movt.c: Skip if both > > > > >>>> -mslow-flash-data and > > > > >>>>-mwo
Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM
: New test. Is this ok for trunk? Best regards, Thomas On Thu, 25 Oct 2018 at 15:54, Thomas Preudhomme wrote: > > Good thing I did, found a missing earlyclobber in the process. > Rerunning all tests again. > > Best regards, > > Thomas > On Wed, 24 Oct 2018 at 10:13, Thomas Preudhomme > wrote: > > > > Please hold on for the reviews, found a small improvement that could > > be done. Am testing it right now, should have something by tonight or > > tomorrow. > > > > Best regards, > > > > Thomas > > On Tue, 23 Oct 2018 at 13:35, Thomas Preudhomme > > wrote: > > > > > > [Removing Jeff Law since middle end code hasn't changed] > > > > > > Hi, > > > > > > Given how memory operands are reloaded even with an X constraint, I've > > > reworked the patch for the combined set and combined test instruction > > > to keep the mem out of the match_operand and used an expander to > > > generate the right instruction pattern. I've also fixed some > > > longstanding issues with the patch when flag_pic is true and with > > > constraints for Thumb-1 that I hadn't noticed before due to using > > > dg-cmp-results in conjunction with test_summary which does not show > > > NA->FAIL (see [1]). > > > > > > All in all, I think the Arm code would do with a fresh review rather > > > than looking at the changes since last posted version. (unchanged) > > > ChangeLog entries are as follows: > > > > > > *** gcc/ChangeLog *** > > > > > > 2018-08-09 Thomas Preud'homme > > > > > > * target-insns.def (stack_protect_combined_set): Define new standard > > > pattern name. > > > (stack_protect_combined_test): Likewise. > > > * cfgexpand.c (stack_protect_prologue): Try new > > > stack_protect_combined_set pattern first. > > > * function.c (stack_protect_epilogue): Try new > > > stack_protect_combined_test pattern first. 
> > > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now > > > parameters to control which register to use as PIC register and force > > > reloading PIC register respectively. Insert in the stream of insns if > > > possible. > > > (legitimize_pic_address): Expose above new parameters in prototype and > > > adapt recursive calls accordingly. Use pic_reg if non null instead of > > > cached one. > > > (arm_load_pic_register): Add pic_reg parameter and use it if non null. > > > (arm_legitimize_address): Adapt to new legitimize_pic_address > > > prototype. > > > (thumb_legitimize_address): Likewise. > > > (arm_emit_call_insn): Adapt to require_pic_register prototype change. > > > (arm_expand_prologue): Adapt to arm_load_pic_register prototype > > > change. > > > (thumb1_expand_prologue): Likewise. > > > * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype > > > change. > > > (arm_load_pic_register): Likewise. > > > * config/arm/predicated.md (guard_addr_operand): New predicate. > > > (guard_operand): New predicate. > > > * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address > > > prototype change. > > > (builtin_setjmp_receiver expander): Adapt to thumb1_expand_prologue > > > prototype change. > > > (stack_protect_combined_set): New expander.. > > > (stack_protect_combined_set_insn): New insn_and_split pattern. > > > (stack_protect_set_insn): New insn pattern. > > > (stack_protect_combined_test): New expander. > > > (stack_protect_combined_test_insn): New insn_and_split pattern. > > > (stack_protect_test_insn): New insn pattern. > > > * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec. > > > (UNSPEC_SP_TEST): Likewise. > > > * doc/md.texi (stack_protect_combined_set): Document new standard > > > pattern name. > > > (stack_protect_set): Clarify that the operand for guard's address is > > > legal. > > > (stack_protect_combined_test): Document new standard pattern name. 
> > > (stack_protect_test): Clarify that the operand for guard's address is > > > legal. > > > > > > *** gcc/testsuite/ChangeLog *** > > > > > > 2018-07-05 Thomas Preud'homme > > > > > > * gcc.target/arm/pr85434.c: New test. > > > > > > Testing: Bootstrap and regression testing f
Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM
Good thing I did, found a missing earlyclobber in the process. Rerunning all tests again. Best regards, Thomas On Wed, 24 Oct 2018 at 10:13, Thomas Preudhomme wrote: > > Please hold on for the reviews, found a small improvement that could > be done. Am testing it right now, should have something by tonight or > tomorrow. > > Best regards, > > Thomas > On Tue, 23 Oct 2018 at 13:35, Thomas Preudhomme > wrote: > > > > [Removing Jeff Law since middle end code hasn't changed] > > > > Hi, > > > > Given how memory operands are reloaded even with an X constraint, I've > > reworked the patch for the combined set and combined test instruction > > to keep the mem out of the match_operand and used an expander to > > generate the right instruction pattern. I've also fixed some > > longstanding issues with the patch when flag_pic is true and with > > constraints for Thumb-1 that I hadn't noticed before due to using > > dg-cmp-results in conjunction with test_summary which does not show > > NA->FAIL (see [1]). > > > > All in all, I think the Arm code would do with a fresh review rather > > than looking at the changes since last posted version. (unchanged) > > ChangeLog entries are as follows: > > > > *** gcc/ChangeLog *** > > > > 2018-08-09 Thomas Preud'homme > > > > * target-insns.def (stack_protect_combined_set): Define new standard > > pattern name. > > (stack_protect_combined_test): Likewise. > > * cfgexpand.c (stack_protect_prologue): Try new > > stack_protect_combined_set pattern first. > > * function.c (stack_protect_epilogue): Try new > > stack_protect_combined_test pattern first. > > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now > > parameters to control which register to use as PIC register and force > > reloading PIC register respectively. Insert in the stream of insns if > > possible. > > (legitimize_pic_address): Expose above new parameters in prototype and > > adapt recursive calls accordingly. 
Use pic_reg if non null instead of > > cached one. > > (arm_load_pic_register): Add pic_reg parameter and use it if non null. > > (arm_legitimize_address): Adapt to new legitimize_pic_address > > prototype. > > (thumb_legitimize_address): Likewise. > > (arm_emit_call_insn): Adapt to require_pic_register prototype change. > > (arm_expand_prologue): Adapt to arm_load_pic_register prototype change. > > (thumb1_expand_prologue): Likewise. > > * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype > > change. > > (arm_load_pic_register): Likewise. > > * config/arm/predicated.md (guard_addr_operand): New predicate. > > (guard_operand): New predicate. > > * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address > > prototype change. > > (builtin_setjmp_receiver expander): Adapt to thumb1_expand_prologue > > prototype change. > > (stack_protect_combined_set): New expander.. > > (stack_protect_combined_set_insn): New insn_and_split pattern. > > (stack_protect_set_insn): New insn pattern. > > (stack_protect_combined_test): New expander. > > (stack_protect_combined_test_insn): New insn_and_split pattern. > > (stack_protect_test_insn): New insn pattern. > > * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec. > > (UNSPEC_SP_TEST): Likewise. > > * doc/md.texi (stack_protect_combined_set): Document new standard > > pattern name. > > (stack_protect_set): Clarify that the operand for guard's address is > > legal. > > (stack_protect_combined_test): Document new standard pattern name. > > (stack_protect_test): Clarify that the operand for guard's address is > > legal. > > > > *** gcc/testsuite/ChangeLog *** > > > > 2018-07-05 Thomas Preud'homme > > > > * gcc.target/arm/pr85434.c: New test. > > > > Testing: Bootstrap and regression testing for Arm, Thumb-1 and Thumb-2 > > with (i) default flags, (ii) an extra -fstack-protect-all and (iii) > > -fPIC -fstack-protect-all. A glibc build and testsuite run was also > > performed for Arm and Thumb-2. 
Default flags show no regression and > > the other runs have some expected scan-assembler failing (due to stack > > protector or fPIC code sequence), as well as guality fail (due to less > > optimized code with the new stack protector code) and some execution > > failures in sibcall-9 and sibcall-10 under -fPIC -fstack-protector-all
Re: [PATCH, contrib] dg-cmp-results: display NA->FAIL by default
Done. Committed patch and ChangeLog below.

*** contrib/ChangeLog ***

2018-10-25  Thomas Preud'homme

    * dg-cmp-results.sh: Print NA->FAIL and NA->UNRESOLVED changes at
    default verbosity.

diff --git a/contrib/dg-cmp-results.sh b/contrib/dg-cmp-results.sh
index 821d557a168..eb976f68f4a 100755
--- a/contrib/dg-cmp-results.sh
+++ b/contrib/dg-cmp-results.sh
@@ -137,8 +137,11 @@ function drop() {
 function compare(st, nm) {
     old = peek()
     if (old == 0) {
-        # This new test wasn't run last time.
-        if (verbose >= 2) printf("NA->%s:%s\n", st, nm)
+        # This new test wasn't run last time.
+        if(st == "FAIL" || st == "UNRESOLVED" || verbose >= 2) {
+            # New test fails or we want all changes
+            printf("NA->%s:%s\n", st, nm)
+        }
     }
     else {
         # Compare this new test to the first queued old one.
-- 
2.19.1

Best regards,

Thomas

On Thu, 25 Oct 2018 at 08:29, Richard Sandiford wrote:
>
> Thomas Preudhomme writes:
> > And now with the patch. My apologies for the omission.
> >
> > Best regards,
> >
> > Thomas
> > On Tue, 23 Oct 2018 at 12:08, Thomas Preudhomme
> > wrote:
> >>
> >> Hi,
> >>
> >> Currently, dg-cmp-results will not print anything for a test that was
> >> not run before, even if it is a FAIL now. This means that when
> >> contributing a code change together with a testcase in the same commit
> >> one must run dg-cmp-results twice: once to check for regression on a
> >> full testsuite run and once against the new testcase with -v -v. This
> >> also prevents using dg-cmp-results on sum files generated with
> >> test_summary since these would not contain PASS.
> >>
> >> This patch changes dg-cmp-results to print NA->FAIL changes by default.
> >>
> >> ChangeLog entry is as follows:
> >>
> >> *** contrib/ChangeLog ***
> >>
> >> 2018-10-23 Thomas Preud'homme
> >>
> >> * dg-cmp-results.sh: Print NA->FAIL changes at default verbosity.
> >>
> >> Is this ok for trunk?
> >>
> >> Best regards,
> >>
> >> Thomas
> >
> > From ab4272a15bdd8931ef683e234e7dd2e0d038df5f Mon Sep 17 00:00:00 2001
> > From: Thomas Preud'homme
> > Date: Tue, 23 Oct 2018 11:54:51 +0100
> > Subject: [PATCH] dg-cmp-results: display NA->FAIL by default
> >
> > Hi,
> >
> > Currently, dg-cmp-results will not print anything for a test that was
> > not run before, even if it is a FAIL now. This means that when
> > contributing a code change together with a testcase in the same commit
> > one must run dg-cmp-results twice: once to check for regression on a
> > full testsuite run and once against the new testcase with -v -v. This
> > also prevents using dg-cmp-results on sum files generated with
> > test_summary since these would not contain PASS.
> >
> > This patch changes dg-cmp-results to print NA->FAIL changes by default.
> >
> > ChangeLog entry is as follows:
> >
> > *** contrib/ChangeLog ***
> >
> > 2018-10-23 Thomas Preud'homme
> >
> > * dg-cmp-results.sh: Print NA->FAIL changes at default verbosity.
> >
> > Is this ok for trunk?
> >
> > Best regards,
> >
> > Thomas
> > ---
> > contrib/dg-cmp-results.sh | 7 +++++--
> > 1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/contrib/dg-cmp-results.sh b/contrib/dg-cmp-results.sh
> > index 821d557a168..921a9b9ca28 100755
> > --- a/contrib/dg-cmp-results.sh
> > +++ b/contrib/dg-cmp-results.sh
> > @@ -137,8 +137,11 @@ function drop() {
> > function compare(st, nm) {
> > old = peek()
> > if (old == 0) {
> > -# This new test wasn't run last time.
> > -if (verbose >= 2) printf("NA->%s:%s\n", st, nm)
> > + # This new test wasn't run last time.
> > + if(st == "FAIL" || verbose >= 2) {
> > + # New test fails or we want all changes
> > + printf("NA->%s:%s\n", st, nm)
> > + }
>
> Probably also worth doing this for UNRESOLVED, where some markup problem
> stops a test from doing anything useful.
>
> OK with that change, thanks.
>
> Richard
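For readers who do not have the awk script at hand, the reporting rule discussed in this message (print a previously unseen test when it is a FAIL or UNRESOLVED, or when verbosity is at least 2) can be sketched in Python roughly as follows. This is only an illustration: the function name report_new_test and its return-a-string interface are assumptions made for the example and are not part of dg-cmp-results.sh, which prints directly from its awk compare() function.

```python
from typing import Optional


def report_new_test(status: str, name: str, verbose: int = 0) -> Optional[str]:
    """Return the line to print for a test that has no result in the old run.

    Hypothetical helper mirroring the condition in the patch:
    a new test is reported if it FAILs, is UNRESOLVED, or if the
    user asked for all changes (verbosity of 2 or more, i.e. -v -v).
    """
    if status in ("FAIL", "UNRESOLVED") or verbose >= 2:
        return "NA->%s:%s" % (status, name)
    # New PASS (or other status) at default verbosity: stay silent.
    return None


if __name__ == "__main__":
    # Illustrative test names only.
    for status in ("PASS", "FAIL", "UNRESOLVED"):
        print(status, "->", report_new_test(status, "gcc.target/arm/pr85434.c"))
```

With this rule a single default-verbosity comparison is enough to notice a newly added testcase that fails, while new passing tests still require -v -v to be listed.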
Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM
Please hold on for the reviews, found a small improvement that could be done. Am testing it right now, should have something by tonight or tomorrow. Best regards, Thomas On Tue, 23 Oct 2018 at 13:35, Thomas Preudhomme wrote: > > [Removing Jeff Law since middle end code hasn't changed] > > Hi, > > Given how memory operands are reloaded even with an X constraint, I've > reworked the patch for the combined set and combined test instruction > to keep the mem out of the match_operand and used an expander to > generate the right instruction pattern. I've also fixed some > longstanding issues with the patch when flag_pic is true and with > constraints for Thumb-1 that I hadn't noticed before due to using > dg-cmp-results in conjunction with test_summary which does not show > NA->FAIL (see [1]). > > All in all, I think the Arm code would do with a fresh review rather > than looking at the changes since last posted version. (unchanged) > ChangeLog entries are as follows: > > *** gcc/ChangeLog *** > > 2018-08-09 Thomas Preud'homme > > * target-insns.def (stack_protect_combined_set): Define new standard > pattern name. > (stack_protect_combined_test): Likewise. > * cfgexpand.c (stack_protect_prologue): Try new > stack_protect_combined_set pattern first. > * function.c (stack_protect_epilogue): Try new > stack_protect_combined_test pattern first. > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now > parameters to control which register to use as PIC register and force > reloading PIC register respectively. Insert in the stream of insns if > possible. > (legitimize_pic_address): Expose above new parameters in prototype and > adapt recursive calls accordingly. Use pic_reg if non null instead of > cached one. > (arm_load_pic_register): Add pic_reg parameter and use it if non null. > (arm_legitimize_address): Adapt to new legitimize_pic_address > prototype. > (thumb_legitimize_address): Likewise. 
> (arm_emit_call_insn): Adapt to require_pic_register prototype change. > (arm_expand_prologue): Adapt to arm_load_pic_register prototype change. > (thumb1_expand_prologue): Likewise. > * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype > change. > (arm_load_pic_register): Likewise. > * config/arm/predicated.md (guard_addr_operand): New predicate. > (guard_operand): New predicate. > * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address > prototype change. > (builtin_setjmp_receiver expander): Adapt to thumb1_expand_prologue > prototype change. > (stack_protect_combined_set): New expander.. > (stack_protect_combined_set_insn): New insn_and_split pattern. > (stack_protect_set_insn): New insn pattern. > (stack_protect_combined_test): New expander. > (stack_protect_combined_test_insn): New insn_and_split pattern. > (stack_protect_test_insn): New insn pattern. > * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec. > (UNSPEC_SP_TEST): Likewise. > * doc/md.texi (stack_protect_combined_set): Document new standard > pattern name. > (stack_protect_set): Clarify that the operand for guard's address is > legal. > (stack_protect_combined_test): Document new standard pattern name. > (stack_protect_test): Clarify that the operand for guard's address is > legal. > > *** gcc/testsuite/ChangeLog *** > > 2018-07-05 Thomas Preud'homme > > * gcc.target/arm/pr85434.c: New test. > > Testing: Bootstrap and regression testing for Arm, Thumb-1 and Thumb-2 > with (i) default flags, (ii) an extra -fstack-protect-all and (iii) > -fPIC -fstack-protect-all. A glibc build and testsuite run was also > performed for Arm and Thumb-2. 
Default flags show no regression and > the other runs have some expected scan-assembler failing (due to stack > protector or fPIC code sequence), as well as guality fail (due to less > optimized code with the new stack protector code) and some execution > failures in sibcall-9 and sibcall-10 under -fPIC -fstack-protector-all > due to the PIC sequence for the global variable making the frame > layout different for the 2 functions (these become PASS if making the > global variable static). > > Is this ok for trunk? > > Best regards, > > Thomas > > [1] https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01412.html > > > On Tue, 25 Sep 2018 at 17:10, Kyrill Tkachov > wrote: > > > > Hi Thomas, > > > > On 29/08/18 10:51, Thomas Preudhomme wrote: > > > Resend hopefully without HTML this time. > > > > > > On Wed, 29 Aug 2018 at 10:49, Thomas Preudhomme
[PATCH, testsuite] Fix sibcall-9 and sibcall-10 with -fPIC
Hi,

gcc.dg/sibcall-9.c and gcc.dg/sibcall-10.c give execution failures on ARM
when compiled with -fPIC due to the PIC access to volatile variable v
creating an extra spill, which causes the frame sizes of the two recursive
functions to differ. Making the variable static solves the issue because
the variable can then be accessed in a PC-relative way, avoiding the spill,
while still testing sibling calls as originally intended.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

    * gcc.dg/sibcall-9.c: Make v static.
    * gcc.dg/sibcall-10.c: Likewise.

Tested both testcases with and without -fPIC and they now pass in both
cases when targeting arm-none-eabi. They also pass in both cases on
x86_64-linux-gnu.

Is this ok for trunk?

Best regards,

Thomas

From 27286120fe2d6a088d14d7e4f4b5b6fa6cc2bc41 Mon Sep 17 00:00:00 2001
From: Thomas Preud'homme
Date: Tue, 23 Oct 2018 14:01:31 +0100
Subject: [PATCH] [PATCH, testsuite] Fix sibcall-9 and sibcall-10 with -fPIC

Hi,

gcc.dg/sibcall-9.c and gcc.dg/sibcall-10.c give execution failures on ARM
when compiled with -fPIC due to the PIC access to volatile variable v
creating an extra spill, which causes the frame sizes of the two recursive
functions to differ. Making the variable static solves the issue because
the variable can then be accessed in a PC-relative way, avoiding the spill,
while still testing sibling calls as originally intended.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

    * gcc.dg/sibcall-9.c: Make v static.
    * gcc.dg/sibcall-10.c: Likewise.

Tested both testcases with and without -fPIC and they now pass in both
cases when targeting arm-none-eabi. They also pass in both cases on
x86_64-linux-gnu.

Is this ok for trunk?

Best regards,

Thomas
---
 gcc/testsuite/gcc.dg/sibcall-10.c | 2 +-
 gcc/testsuite/gcc.dg/sibcall-9.c  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/sibcall-10.c b/gcc/testsuite/gcc.dg/sibcall-10.c
index 54cc604aecf..4acca50e3e4 100644
--- a/gcc/testsuite/gcc.dg/sibcall-10.c
+++ b/gcc/testsuite/gcc.dg/sibcall-10.c
@@ -31,7 +31,7 @@ extern void exit (int);
 static ATTR void recurser_void1 (void);
 static ATTR void recurser_void2 (void);
 extern void track (void);
-volatile int v;
+static volatile int v;
 int n = 0;
 
 int main ()
diff --git a/gcc/testsuite/gcc.dg/sibcall-9.c b/gcc/testsuite/gcc.dg/sibcall-9.c
index fc3bd9dcf16..32b2e1d5d61 100644
--- a/gcc/testsuite/gcc.dg/sibcall-9.c
+++ b/gcc/testsuite/gcc.dg/sibcall-9.c
@@ -31,7 +31,7 @@ extern void exit (int);
 static ATTR void recurser_void1 (int);
 static ATTR void recurser_void2 (int);
 extern void track (int);
-volatile int v;
+static volatile int v;
 
 int main ()
 {
-- 
2.19.1
Re: [PATCH, contrib] dg-cmp-results: display NA->FAIL by default
And now with the patch. My apologies for the omission.

Best regards,

Thomas

On Tue, 23 Oct 2018 at 12:08, Thomas Preudhomme wrote:
>
> Hi,
>
> Currently, dg-cmp-results will not print anything for a test that was
> not run before, even if it is a FAIL now. This means that when
> contributing a code change together with a testcase in the same commit
> one must run dg-cmp-results twice: once to check for regression on a
> full testsuite run and once against the new testcase with -v -v. This
> also prevents using dg-cmp-results on sum files generated with
> test_summary since these would not contain PASS.
>
> This patch changes dg-cmp-results to print NA->FAIL changes by default.
>
> ChangeLog entry is as follows:
>
> *** contrib/ChangeLog ***
>
> 2018-10-23 Thomas Preud'homme
>
> * dg-cmp-results.sh: Print NA->FAIL changes at default verbosity.
>
> Is this ok for trunk?
>
> Best regards,
>
> Thomas

From ab4272a15bdd8931ef683e234e7dd2e0d038df5f Mon Sep 17 00:00:00 2001
From: Thomas Preud'homme
Date: Tue, 23 Oct 2018 11:54:51 +0100
Subject: [PATCH] dg-cmp-results: display NA->FAIL by default

Hi,

Currently, dg-cmp-results will not print anything for a test that was
not run before, even if it is a FAIL now. This means that when
contributing a code change together with a testcase in the same commit
one must run dg-cmp-results twice: once to check for regression on a
full testsuite run and once against the new testcase with -v -v. This
also prevents using dg-cmp-results on sum files generated with
test_summary since these would not contain PASS.

This patch changes dg-cmp-results to print NA->FAIL changes by default.

ChangeLog entry is as follows:

*** contrib/ChangeLog ***

2018-10-23  Thomas Preud'homme

    * dg-cmp-results.sh: Print NA->FAIL changes at default verbosity.

Is this ok for trunk?

Best regards,

Thomas
---
 contrib/dg-cmp-results.sh | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/contrib/dg-cmp-results.sh b/contrib/dg-cmp-results.sh
index 821d557a168..921a9b9ca28 100755
--- a/contrib/dg-cmp-results.sh
+++ b/contrib/dg-cmp-results.sh
@@ -137,8 +137,11 @@ function drop() {
 function compare(st, nm) {
     old = peek()
     if (old == 0) {
-        # This new test wasn't run last time.
-        if (verbose >= 2) printf("NA->%s:%s\n", st, nm)
+        # This new test wasn't run last time.
+        if(st == "FAIL" || verbose >= 2) {
+            # New test fails or we want all changes
+            printf("NA->%s:%s\n", st, nm)
+        }
     }
     else {
         # Compare this new test to the first queued old one.
-- 
2.19.1
Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM
[Removing Jeff Law since middle end code hasn't changed]

Hi,

Given how memory operands are reloaded even with an X constraint, I've
reworked the patch for the combined set and combined test instruction
to keep the mem out of the match_operand and used an expander to
generate the right instruction pattern. I've also fixed some
longstanding issues with the patch when flag_pic is true and with
constraints for Thumb-1 that I hadn't noticed before due to using
dg-cmp-results in conjunction with test_summary which does not show
NA->FAIL (see [1]).

All in all, I think the Arm code would do with a fresh review rather
than looking at the changes since last posted version. (unchanged)
ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-08-09  Thomas Preud'homme

    * target-insns.def (stack_protect_combined_set): Define new standard
    pattern name.
    (stack_protect_combined_test): Likewise.
    * cfgexpand.c (stack_protect_prologue): Try new
    stack_protect_combined_set pattern first.
    * function.c (stack_protect_epilogue): Try new
    stack_protect_combined_test pattern first.
    * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
    parameters to control which register to use as PIC register and force
    reloading PIC register respectively. Insert in the stream of insns if
    possible.
    (legitimize_pic_address): Expose above new parameters in prototype and
    adapt recursive calls accordingly. Use pic_reg if non null instead of
    cached one.
    (arm_load_pic_register): Add pic_reg parameter and use it if non null.
    (arm_legitimize_address): Adapt to new legitimize_pic_address
    prototype.
    (thumb_legitimize_address): Likewise.
    (arm_emit_call_insn): Adapt to require_pic_register prototype change.
    (arm_expand_prologue): Adapt to arm_load_pic_register prototype change.
    (thumb1_expand_prologue): Likewise.
    * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
    change.
    (arm_load_pic_register): Likewise.
    * config/arm/predicates.md (guard_addr_operand): New predicate.
    (guard_operand): New predicate.
    * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
    prototype change.
    (builtin_setjmp_receiver expander): Adapt to thumb1_expand_prologue
    prototype change.
    (stack_protect_combined_set): New expander.
    (stack_protect_combined_set_insn): New insn_and_split pattern.
    (stack_protect_set_insn): New insn pattern.
    (stack_protect_combined_test): New expander.
    (stack_protect_combined_test_insn): New insn_and_split pattern.
    (stack_protect_test_insn): New insn pattern.
    * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
    (UNSPEC_SP_TEST): Likewise.
    * doc/md.texi (stack_protect_combined_set): Document new standard
    pattern name.
    (stack_protect_set): Clarify that the operand for guard's address is
    legal.
    (stack_protect_combined_test): Document new standard pattern name.
    (stack_protect_test): Clarify that the operand for guard's address is
    legal.

*** gcc/testsuite/ChangeLog ***

2018-07-05  Thomas Preud'homme

    * gcc.target/arm/pr85434.c: New test.

Testing: Bootstrap and regression testing for Arm, Thumb-1 and Thumb-2
with (i) default flags, (ii) an extra -fstack-protector-all and (iii)
-fPIC -fstack-protector-all. A glibc build and testsuite run was also
performed for Arm and Thumb-2. Default flags show no regression and
the other runs have some expected scan-assembler failures (due to the
stack protector or -fPIC code sequence), as well as guality failures
(due to less optimized code with the new stack protector code) and some
execution failures in sibcall-9 and sibcall-10 under -fPIC
-fstack-protector-all due to the PIC sequence for the global variable
making the frame layout different for the 2 functions (these become
PASS if making the global variable static).

Is this ok for trunk?

Best regards,

Thomas

[1] https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01412.html

On Tue, 25 Sep 2018 at 17:10, Kyrill Tkachov wrote:
>
> Hi Thomas,
>
> On 29/08/18 10:51, Thomas Preudhomme wrote:
> > Resend hopefully without HTML this time.
> > > > On Wed, 29 Aug 2018 at 10:49, Thomas Preudhomme > > wrote: > >> Hi, > >> > >> I've reworked the patch fixing PR85434 (spilling of stack protector > >> guard's address on ARM) to address the testsuite regression on powerpc and > >> x86 as well as glibc testsuite regression on ARM. Issues were due to > >> unconditionally attempting to generate the new patterns. The code now > >> tests if there is a pattern for them for the target before generating > >> them. In the ARM side of the patch, I've also added a more specific > >> predicate for the new patterns. The new patch is found below. > >> > >> > >> In case of high register pressure in PIC mode, address of
[PATCH, contrib] dg-cmp-results: display NA->FAIL by default
Hi,

Currently, dg-cmp-results will not print anything for a test that was
not run before, even if it is a FAIL now. This means that when
contributing a code change together with a testcase in the same commit
one must run dg-cmp-results twice: once to check for regression on a
full testsuite run and once against the new testcase with -v -v. This
also prevents using dg-cmp-results on sum files generated with
test_summary since these would not contain PASS.

This patch changes dg-cmp-results to print NA->FAIL changes by default.

ChangeLog entry is as follows:

*** contrib/ChangeLog ***

2018-10-23  Thomas Preud'homme

    * dg-cmp-results.sh: Print NA->FAIL changes at default verbosity.

Is this ok for trunk?

Best regards,

Thomas
Re: [PATCH, GCC/ARM, ping2] Fix PR87374: ICE with -mslow-flash-data and -mword-relocations
Ping? Best regards, Thomas On Mon, 15 Oct 2018 at 16:01, Thomas Preudhomme wrote: > > Ping? > > Best regards, > > Thomas > On Fri, 5 Oct 2018 at 17:50, Thomas Preudhomme > wrote: > > > > Hi Ramana and Kyrill, > > > > I've reworked the patch to add some documentation of the option > > conflict and reworked the -mword-relocations logic slightly to set the > > variable explicitly in PIC mode rather than test for PIC and word > > relocation everywhere. > > > > ChangeLog entries are now as follows: > > > > *** gcc/ChangeLog *** > > > > 2018-10-02 Thomas Preud'homme > > > > PR target/87374 > > * config/arm/arm.c (arm_option_check_internal): Disable the combined > > use of -mslow-flash-data and -mword-relocations. > > (arm_option_override): Enable -mword-relocations if -fpic or -fPIC. > > * config/arm/arm.md (SYMBOL_REF MOVT splitter): Stop checking for > > flag_pic. > > * doc/invoke.texi (-mword-relocations): Mention conflict with > > -mslow-flash-data. > > (-mslow-flash-data): Reciprocally. > > > > *** gcc/testsuite/ChangeLog *** > > > > 2018-09-25 Thomas Preud'homme > > > > PR target/87374 > > * gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data and > > -mword-relocations would be passed when compiling the test. > > * gcc.target/arm/movsi_movt.c: Likewise. > > * gcc.target/arm/pr81863.c: Likewise. > > * gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise. > > * gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise. > > * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise. > > * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise. > > * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise. > > * gcc.target/arm/tls-disable-literal-pool.c: Likewise. > > > > Is this ok for trunk? 
> > > > Best regards, > > > > Thomas > > > > On Tue, 2 Oct 2018 at 13:39, Ramana Radhakrishnan > > wrote: > > > > > > On 02/10/2018 11:42, Thomas Preudhomme wrote: > > > > Hi Ramana, > > > > > > > > On Thu, 27 Sep 2018 at 11:14, Ramana Radhakrishnan > > > > wrote: > > > >> > > > >> On 27/09/2018 09:26, Kyrill Tkachov wrote: > > > >>> Hi Thomas, > > > >>> > > > >>> On 26/09/18 18:39, Thomas Preudhomme wrote: > > > >>>> Hi, > > > >>>> > > > >>>> GCC ICEs under -mslow-flash-data and -mword-relocations because there > > > >>>> is no way to load an address, both literal pools and MOVW/MOVT being > > > >>>> forbidden. This patch gives an error message when both options are > > > >>>> specified by the user and adds the according dg-skip-if directives > > > >>>> for > > > >>>> tests that use either of these options. > > > >>>> > > > >>>> ChangeLog entries are as follows: > > > >>>> > > > >>>> *** gcc/ChangeLog *** > > > >>>> > > > >>>> 2018-09-25 Thomas Preud'homme > > > >>>> > > > >>>>PR target/87374 > > > >>>>* config/arm/arm.c (arm_option_check_internal): Disable the > > > >>>> combined > > > >>>>use of -mslow-flash-data and -mword-relocations. > > > >>>> > > > >>>> *** gcc/testsuite/ChangeLog *** > > > >>>> > > > >>>> 2018-09-25 Thomas Preud'homme > > > >>>> > > > >>>>PR target/87374 > > > >>>>* gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data > > > >>>> and > > > >>>>-mword-relocations would be passed when compiling the test. > > > >>>>* gcc.target/arm/movsi_movt.c: Likewise. > > > >>>>* gcc.target/arm/pr81863.c: Likewise. > > > >>>>* gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise. > > > >>>>* gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise. > > > >>>>* gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise. > > > >>>>* gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise. > > > >>>>* gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise. >
Re: [PATCH, GCC/ARM] Fix PR87374: ICE with -mslow-flash-data and -mword-relocations
Ping? Best regards, Thomas On Fri, 5 Oct 2018 at 17:50, Thomas Preudhomme wrote: > > Hi Ramana and Kyrill, > > I've reworked the patch to add some documentation of the option > conflict and reworked the -mword-relocations logic slightly to set the > variable explicitly in PIC mode rather than test for PIC and word > relocation everywhere. > > ChangeLog entries are now as follows: > > *** gcc/ChangeLog *** > > 2018-10-02 Thomas Preud'homme > > PR target/87374 > * config/arm/arm.c (arm_option_check_internal): Disable the combined > use of -mslow-flash-data and -mword-relocations. > (arm_option_override): Enable -mword-relocations if -fpic or -fPIC. > * config/arm/arm.md (SYMBOL_REF MOVT splitter): Stop checking for > flag_pic. > * doc/invoke.texi (-mword-relocations): Mention conflict with > -mslow-flash-data. > (-mslow-flash-data): Reciprocally. > > *** gcc/testsuite/ChangeLog *** > > 2018-09-25 Thomas Preud'homme > > PR target/87374 > * gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data and > -mword-relocations would be passed when compiling the test. > * gcc.target/arm/movsi_movt.c: Likewise. > * gcc.target/arm/pr81863.c: Likewise. > * gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise. > * gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise. > * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise. > * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise. > * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise. > * gcc.target/arm/tls-disable-literal-pool.c: Likewise. > > Is this ok for trunk? 
Re: [PATCH, GCC/ARM] Fix PR87374: ICE with -mslow-flash-data and -mword-relocations
Hi Ramana and Kyrill,

I've reworked the patch to add some documentation of the option
conflict and reworked the -mword-relocations logic slightly to set the
variable explicitly in PIC mode rather than test for PIC and word
relocation everywhere.

ChangeLog entries are now as follows:

*** gcc/ChangeLog ***

2018-10-02  Thomas Preud'homme

    PR target/87374
    * config/arm/arm.c (arm_option_check_internal): Disable the combined
    use of -mslow-flash-data and -mword-relocations.
    (arm_option_override): Enable -mword-relocations if -fpic or -fPIC.
    * config/arm/arm.md (SYMBOL_REF MOVT splitter): Stop checking for
    flag_pic.
    * doc/invoke.texi (-mword-relocations): Mention conflict with
    -mslow-flash-data.
    (-mslow-flash-data): Reciprocally.

*** gcc/testsuite/ChangeLog ***

2018-09-25  Thomas Preud'homme

    PR target/87374
    * gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data and
    -mword-relocations would be passed when compiling the test.
    * gcc.target/arm/movsi_movt.c: Likewise.
    * gcc.target/arm/pr81863.c: Likewise.
    * gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise.
    * gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise.
    * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
    * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
    * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
    * gcc.target/arm/tls-disable-literal-pool.c: Likewise.

Is this ok for trunk?

Best regards,

Thomas

On Tue, 2 Oct 2018 at 13:39, Ramana Radhakrishnan wrote:
>
> On 02/10/2018 11:42, Thomas Preudhomme wrote:
> > Hi Ramana,
> >
> > On Thu, 27 Sep 2018 at 11:14, Ramana Radhakrishnan
> > wrote:
> >>
> >> On 27/09/2018 09:26, Kyrill Tkachov wrote:
> >>> Hi Thomas,
> >>>
> >>> On 26/09/18 18:39, Thomas Preudhomme wrote:
> >>>> Hi,
> >>>>
> >>>> GCC ICEs under -mslow-flash-data and -mword-relocations because there
> >>>> is no way to load an address, both literal pools and MOVW/MOVT being
> >>>> forbidden.
This patch gives an error message when both options are > >>>> specified by the user and adds the according dg-skip-if directives for > >>>> tests that use either of these options. > >>>> > >>>> ChangeLog entries are as follows: > >>>> > >>>> *** gcc/ChangeLog *** > >>>> > >>>> 2018-09-25 Thomas Preud'homme > >>>> > >>>>PR target/87374 > >>>>* config/arm/arm.c (arm_option_check_internal): Disable the > >>>> combined > >>>>use of -mslow-flash-data and -mword-relocations. > >>>> > >>>> *** gcc/testsuite/ChangeLog *** > >>>> > >>>> 2018-09-25 Thomas Preud'homme > >>>> > >>>>PR target/87374 > >>>>* gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data and > >>>>-mword-relocations would be passed when compiling the test. > >>>>* gcc.target/arm/movsi_movt.c: Likewise. > >>>>* gcc.target/arm/pr81863.c: Likewise. > >>>>* gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise. > >>>>* gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise. > >>>>* gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise. > >>>>* gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise. > >>>>* gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise. > >>>>* gcc.target/arm/tls-disable-literal-pool.c: Likewise. > >>>> > >>>> > >>>> Testing: Bootstrapped in Thumb-2 mode. No testsuite regression when > >>>> targeting arm-none-eabi. Modified tests get skipped as expected when > >>>> running the testsuite with -mslow-flash-data (pr81863.c) or > >>>> -mword-relocations (all the others). > >>>> > >>>> > >>>> Is this ok for trunk? I'd also appreciate guidance on whether this is > >>>> worth a backport. It's a simple patch but on the other hand it only > >>>> prevents some option combination, it does not fix anything so I have > >>>> mixed feelings. > >>> > >>> In my opinion -mslow-flash-data is more of a tuning option rather than a > >>> security/ABI feature > >>> and therefore erroring out on its combination with -mword-relocations > >>> feels odd. 
> >>> I'm leaning more towards making -mword-relocations or any other option > >>> that
Re: [PATCH, LRA] Never reload fixed form constraints memory operand
My bad, I used dg-cmp-results without verbosity, which didn't show the
problem. It starts to show it with -v -v; I'm not sure why. I'll have a
look right now and revert by the end of today if I cannot come up with
a fix. Does that sound ok?

Best regards,

Thomas

On Thu, 4 Oct 2018 at 12:31, H.J. Lu wrote:
>
> On Wed, Oct 3, 2018 at 8:12 PM Vladimir Makarov wrote:
> >
> > On 10/03/2018 12:47 PM, Thomas Preudhomme wrote:
> > > Best regards,
> > >
> > > Thomas
> > >
> > > never_reload_fixed_address_operand.patch
> > >
> > >
> > > From 2831d8b886d92513c2d30d43a6a989d2bbd0ceee Mon Sep 17 00:00:00 2001
> > > From: Thomas Preud'homme
> > > Date: Thu, 27 Sep 2018 09:50:12 +0100
> > > Subject: [PATCH] [PATCH, LRA] Never reload fixed form constraints memory
> > > operand
> > >
> > > Hi,
> > >
> > > The unconditional reload of address operand for recognized instruction
> > > in process_address_1 prevent the patch for fixing "PR85434: Address of
> > > stack protector guard spilled to stack on ARM" proposed at [1]. The code
> > > in this patch attempt to control which registers are used to make PIC
> > > access but the reload performed by process_address_1 will use generic
> > > PIC access. This patch removes the test for the instruction to be
> > > unrecognized to do the reload, thus always avoiding to reload address
> > > operand for fixed constraints (such as "X" used in the patch).
> > >
> > > [1]https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01838.html
> > >
> > > ChangeLog entry is as follows:
> > >
> > > *** gcc/ChangeLog ***
> > >
> > > 2018-10-03 Thomas Preud'homme
> > >
> > > * lra-constraints.c (process_address_1): Bail out for all
> > > satisfied fixed constraints.
> > >
> > > Testing: Successfully bootstrapped and regtested on:
> > > - arm-linux-gnueabihf (both Arm and Thumb2 mode)
> > > - aarch64-linux-gnu
> > > - x86_64-linux-gnu
> > > - i386-linux-gnu
> > > - sparc64-linux-gnu (gcc202)
> > > - powerpc64le-linux-gnu (gcc112)
> > >
> > > Is this ok for trunk?
> > >
> > OK. Thank you for testing all these targets, Thomas.
> >
>
> This caused:
>
> FAIL: gcc.target/i386/pr83317.c (internal compiler error)
> FAIL: gcc.target/i386/pr83317.c (test for excess errors)
>
> [hjl@gnu-4 gcc]$
> /export/build/gnu/tools-build/gcc-debug/build-x86_64-linux/gcc/xgcc
> -B/export/build/gnu/tools-build/gcc-debug/build-x86_64-linux/gcc/
> /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/pr83317.c
> -m32 -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers
> -fdiagnostics-color=never -O1 -fPIC -msse2 -mfpmath=sse -S -o
> pr83317.s
> during RTL pass: reload
> /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/pr83317.c:
> In function ‘foo’:
> /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/pr83317.c:21:1:
> internal compiler error: in lra_eliminate_reg_if_possible, at
> lra-eliminations.c:1393
> 0xea4875 lra_eliminate_reg_if_possible(rtx_def**)
> /export/gnu/import/git/sources/gcc/gcc/lra-eliminations.c:1393
> 0xe8a94c address_eliminator
> /export/gnu/import/git/sources/gcc/gcc/lra-constraints.c:362
> 0xe8aaf7 satisfies_memory_constraint_p
> /export/gnu/import/git/sources/gcc/gcc/lra-constraints.c:401
> 0xe8f947 process_alt_operands
> /export/gnu/import/git/sources/gcc/gcc/lra-constraints.c:2248
> 0xe93d1f curr_insn_transform
> /export/gnu/import/git/sources/gcc/gcc/lra-constraints.c:3861
> 0xe975f2 lra_constraints(bool)
> /export/gnu/import/git/sources/gcc/gcc/lra-constraints.c:4878
> 0xe7c458 lra(_IO_FILE*)
> /export/gnu/import/git/sources/gcc/gcc/lra.c:2446
> 0xe111ff do_reload
> /export/gnu/import/git/sources/gcc/gcc/ira.c:5469
> 0xe116f2 execute
> /export/gnu/import/git/sources/gcc/gcc/ira.c:5653
> Please submit a full bug report,
> with preprocessed source if appropriate.
> Please include the complete backtrace with any bug report.
> See <https://gcc.gnu.org/bugs/> for instructions.
> [hjl@gnu-4 gcc]$
>
>
> --
> H.J.
[PATCH, LRA] Never reload fixed form constraints memory operand
Hi,

The unconditional reload of the address operand for recognized
instructions in process_address_1 prevents the patch for fixing
"PR85434: Address of stack protector guard spilled to stack on ARM"
proposed at [1] from working. The code in that patch attempts to control
which registers are used to make the PIC access, but the reload
performed by process_address_1 will use a generic PIC access. This patch
removes the requirement that the instruction be unrecognized, so that
the address operand is never reloaded when a fixed-form constraint
(such as the "X" used in the patch) is satisfied.

[1] https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01838.html

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2018-10-03  Thomas Preud'homme

    * lra-constraints.c (process_address_1): Bail out for all
    satisfied fixed constraints.

Testing: Successfully bootstrapped and regtested on:
- arm-linux-gnueabihf (both Arm and Thumb2 mode)
- aarch64-linux-gnu
- x86_64-linux-gnu
- i386-linux-gnu
- sparc64-linux-gnu (gcc202)
- powerpc64le-linux-gnu (gcc112)

Is this ok for trunk?

Best regards,

Thomas

From 2831d8b886d92513c2d30d43a6a989d2bbd0ceee Mon Sep 17 00:00:00 2001
From: Thomas Preud'homme
Date: Thu, 27 Sep 2018 09:50:12 +0100
Subject: [PATCH] [PATCH, LRA] Never reload fixed form constraints memory
 operand

---
 gcc/lra-constraints.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index 774d1ff3aaa..c3edd9ef45d 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -3243,8 +3243,7 @@ process_address_1 (int nop, bool check_only_p,
   /* Do not attempt to decompose arbitrary addresses generated by combine
      for asm operands with loose constraints, e.g 'X'.  */
   else if (MEM_P (op)
-           && !(INSN_CODE (curr_insn) < 0
-                && get_constraint_type (cn) == CT_FIXED_FORM
+           && !(get_constraint_type (cn) == CT_FIXED_FORM
                 && constraint_satisfied_p (op, cn)))
     decompose_mem_address (&ad, op);
   else if (GET_CODE (op) == SUBREG
--
2.19.0
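For readers who want to see exactly which case the one-line change affects, here is a minimal sketch of the condition before and after the patch. It is only a model: struct operand_model and the three functions are hypothetical names invented for illustration, and each boolean stands in for the GCC predicate named in its comment. It assumes nothing about the rest of process_address_1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the state process_address_1 inspects.
   None of these names exist in GCC; they only model the condition.  */
struct operand_model {
  bool is_mem;                /* models MEM_P (op) */
  bool insn_recognized;       /* models INSN_CODE (curr_insn) >= 0 */
  bool fixed_form_constraint; /* models get_constraint_type (cn) == CT_FIXED_FORM */
  bool constraint_satisfied;  /* models constraint_satisfied_p (op, cn) */
};

/* Condition before the patch: only unrecognized insns (asm statements)
   with a satisfied fixed-form constraint skip address decomposition.  */
static bool decomposes_before(const struct operand_model *op)
{
  assert(op != NULL);
  return op->is_mem
         && !(!op->insn_recognized
              && op->fixed_form_constraint
              && op->constraint_satisfied);
}

/* Condition after the patch: any insn with a satisfied fixed-form
   constraint skips address decomposition.  */
static bool decomposes_after(const struct operand_model *op)
{
  assert(op != NULL);
  return op->is_mem
         && !(op->fixed_form_constraint && op->constraint_satisfied);
}

/* Count the input combinations on which the two conditions disagree.  */
static int count_behaviour_changes(void)
{
  int changes = 0;
  for (int bits = 0; bits < 16; bits++) {
    struct operand_model op = {
      .is_mem = (bits & 1) != 0,
      .insn_recognized = (bits & 2) != 0,
      .fixed_form_constraint = (bits & 4) != 0,
      .constraint_satisfied = (bits & 8) != 0,
    };
    if (decomposes_before(&op) != decomposes_after(&op))
      changes++;
  }
  return changes;
}
```

In this model the two conditions differ on a single combination: a memory operand of a recognized instruction whose fixed-form constraint is satisfied. That is the case the PR85434 patterns rely on.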
Re: [PATCH, GCC/ARM] Fix PR87374: ICE with -mslow-flash-data and -mword-relocations
Hi Ramana,

On Thu, 27 Sep 2018 at 11:14, Ramana Radhakrishnan wrote:
>
> On 27/09/2018 09:26, Kyrill Tkachov wrote:
> > Hi Thomas,
> >
> > On 26/09/18 18:39, Thomas Preudhomme wrote:
> >> Hi,
> >>
> >> GCC ICEs under -mslow-flash-data and -mword-relocations because there
> >> is no way to load an address, both literal pools and MOVW/MOVT being
> >> forbidden. This patch gives an error message when both options are
> >> specified by the user and adds the according dg-skip-if directives for
> >> tests that use either of these options.
> >>
> >> ChangeLog entries are as follows:
> >>
> >> *** gcc/ChangeLog ***
> >>
> >> 2018-09-25 Thomas Preud'homme
> >>
> >> PR target/87374
> >> * config/arm/arm.c (arm_option_check_internal): Disable the combined
> >> use of -mslow-flash-data and -mword-relocations.
> >>
> >> *** gcc/testsuite/ChangeLog ***
> >>
> >> 2018-09-25 Thomas Preud'homme
> >>
> >> PR target/87374
> >> * gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data and
> >> -mword-relocations would be passed when compiling the test.
> >> * gcc.target/arm/movsi_movt.c: Likewise.
> >> * gcc.target/arm/pr81863.c: Likewise.
> >> * gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise.
> >> * gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise.
> >> * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
> >> * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
> >> * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
> >> * gcc.target/arm/tls-disable-literal-pool.c: Likewise.
> >>
> >>
> >> Testing: Bootstrapped in Thumb-2 mode. No testsuite regression when
> >> targeting arm-none-eabi. Modified tests get skipped as expected when
> >> running the testsuite with -mslow-flash-data (pr81863.c) or
> >> -mword-relocations (all the others).
> >>
> >>
> >> Is this ok for trunk? I'd also appreciate guidance on whether this is
> >> worth a backport. It's a simple patch but on the other hand it only
> >> prevents some option combination, it does not fix anything so I have
> >> mixed feelings.
> >
> > In my opinion -mslow-flash-data is more of a tuning option rather than a
> > security/ABI feature
> > and therefore erroring out on its combination with -mword-relocations feels
> > odd.
> > I'm leaning more towards making -mword-relocations or any other option that
> > really requires constant pools
> > to bypass/disable the effects of -mslow-flash-data instead.
>
> -mslow-flash-data and -mword-relocations are contradictory in their
> expectations. mslow-flash-data is for not putting anything in the
> literal pool whereas mword-relocations is purely around the use of movw
> / movt instructions for word sized values. I wish we had called
> -mslow-flash-data something else (probably -mno-literal-pools).
> -mslow-flash-data is used primarily by M-profile users and
> -mword-relocations IIUC was a point fix for use in the Linux kernel for
> module loads at a time when not all module loaders in the linux kernel
> were fixed for the movw / movt relocations and armv7-a / thumb2 was in
> it's infancy :). Thus they are used by different constituencies in
> general and I wouldn't see them used together by actual users.

Technically, -mslow-flash-data does not forbid literal pools; it just
discourages them because they are slower than many instructions.
-mpure-code, on the other hand, reuses the same logic and does forbid
literal pools. We could treat -mslow-flash-data differently, but the
question is whether it is worth the trouble.

By the way, I've noticed that the documentation for -mword-relocations
says it defaults to on for -fpic and -fPIC, but when looking through the
code I saw that target_word_relocations is not set in these cases.
Rather, the initial commit that introduced -mword-relocations also
checks for flag_pic wherever it checks target_word_relocations. However,
a later commit added one more check for target_word_relocations but
nothing for flag_pic. I'm now consolidating this so that flag_pic sets
target_word_relocations. I'll do regression testing with -fPIC and then
post an updated patch.

>
> Considering the above, I would prefer a hard error rather than a warning
> as they are contradictory and I'd prefer that we error'd out. Further
> this bugzilla entry is probably created with fuzzing with a variety of
> options rather than from any real use case.
>
> Oh and yes, lets update invoke.texi while here.

Done. Will be part of the updated patch.

Best regards,

Thomas
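The option handling discussed in this message can be sketched as a small standalone model. This is an illustration under stated assumptions, not the code in arm.c: struct arm_opts_model, override_options and check_options are hypothetical names, the TARGET_NEON test from the patch is left out for brevity, and running the override step before the check step is assumed here rather than taken from GCC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the relevant option state.  The comments name
   the options discussed in the thread; this is not GCC's gcc_options.  */
struct arm_opts_model {
  bool pure_code;        /* -mpure-code */
  bool slow_flash_data;  /* -mslow-flash-data */
  bool word_relocations; /* -mword-relocations */
  bool pic;              /* -fpic or -fPIC */
  bool have_movt;        /* target has the MOVT instruction */
  bool m_profile;        /* M-profile target */
};

/* Models the consolidation described above: PIC turns word relocations
   on once, so later code only needs to test word_relocations.  */
static void override_options(struct arm_opts_model *o)
{
  assert(o != NULL);
  if (o->pic)
    o->word_relocations = true;
}

/* Models the two diagnostics of the patch and returns how many would
   be emitted for this option set.  */
static int check_options(const struct arm_opts_model *o)
{
  int errors = 0;

  assert(o != NULL);
  if (o->pure_code || o->slow_flash_data) {
    /* Only non-PIC code on M-profile targets with MOVT is supported.  */
    if (!o->have_movt || !o->m_profile || o->pic)
      errors++;
    /* No way to load an address: literal pools are avoided and
       MOVW/MOVT relocations are forbidden.  */
    if (o->word_relocations)
      errors++;
  }
  return errors;
}
```

In this model, -mslow-flash-data together with -mword-relocations on a suitable M-profile target gives one diagnostic, while either option alone gives none. With PIC enabled and the override applied first, both diagnostics fire.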
[PATCH, GCC/ARM] Fix PR87374: ICE with -mslow-flash-data and -mword-relocations
Hi,

GCC ICEs under -mslow-flash-data and -mword-relocations because there
is no way to load an address, both literal pools and MOVW/MOVT being
forbidden. This patch gives an error message when both options are
specified by the user and adds the according dg-skip-if directives for
tests that use either of these options.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-09-25  Thomas Preud'homme

    PR target/87374
    * config/arm/arm.c (arm_option_check_internal): Disable the combined
    use of -mslow-flash-data and -mword-relocations.

*** gcc/testsuite/ChangeLog ***

2018-09-25  Thomas Preud'homme

    PR target/87374
    * gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data and
    -mword-relocations would be passed when compiling the test.
    * gcc.target/arm/movsi_movt.c: Likewise.
    * gcc.target/arm/pr81863.c: Likewise.
    * gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise.
    * gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise.
    * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
    * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
    * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
    * gcc.target/arm/tls-disable-literal-pool.c: Likewise.

Testing: Bootstrapped in Thumb-2 mode. No testsuite regression when
targeting arm-none-eabi. Modified tests get skipped as expected when
running the testsuite with -mslow-flash-data (pr81863.c) or
-mword-relocations (all the others).

Is this ok for trunk? I'd also appreciate guidance on whether this is
worth a backport. It's a simple patch but on the other hand it only
prevents some option combination, it does not fix anything so I have
mixed feelings.

Best regards,

Thomas

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 6332e68df05..5beffc875c1 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -2893,17 +2893,22 @@ arm_option_check_internal (struct gcc_options *opts)
       flag_pic = 0;
     }

-  /* We only support -mpure-code and -mslow-flash-data on M-profile targets
-     with MOVT.  */
-  if ((target_pure_code || target_slow_flash_data)
-      && (!TARGET_HAVE_MOVT || arm_arch_notm || flag_pic || TARGET_NEON))
+  if (target_pure_code || target_slow_flash_data)
     {
       const char *flag = (target_pure_code ? "-mpure-code" :
                                              "-mslow-flash-data");

-      error ("%s only supports non-pic code on M-profile targets with the "
-             "MOVT instruction", flag);
-    }
+      /* We only support -mpure-code and -mslow-flash-data on M-profile targets
+         with MOVT.  */
+      if (!TARGET_HAVE_MOVT || arm_arch_notm || flag_pic || TARGET_NEON)
+        error ("%s only supports non-pic code on M-profile targets with the "
+               "MOVT instruction", flag);
+
+      /* Cannot load addresses: -mslow-flash-data forbids literal pool and
+         -mword-relocations forbids relocation of MOVT/MOVW.  */
+      if (target_word_relocations)
+        error ("%s incompatible with -mword-relocations", flag);
+    }
 }

 /* Recompute the global settings depending on target attribute options.  */
diff --git a/gcc/testsuite/gcc.target/arm/movdi_movt.c b/gcc/testsuite/gcc.target/arm/movdi_movt.c
index e2a28ccbd99..a01ffa0dc93 100644
--- a/gcc/testsuite/gcc.target/arm/movdi_movt.c
+++ b/gcc/testsuite/gcc.target/arm/movdi_movt.c
@@ -1,4 +1,5 @@
 /* { dg-do compile { target { arm_cortex_m && { arm_thumb2_ok || arm_thumb1_movt_ok } } } } */
+/* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
 /* { dg-options "-O2 -mslow-flash-data" } */

 unsigned long long
diff --git a/gcc/testsuite/gcc.target/arm/movsi_movt.c b/gcc/testsuite/gcc.target/arm/movsi_movt.c
index 3cf46e2fd17..19d202ecd33 100644
--- a/gcc/testsuite/gcc.target/arm/movsi_movt.c
+++ b/gcc/testsuite/gcc.target/arm/movsi_movt.c
@@ -1,4 +1,5 @@
 /* { dg-do compile { target { arm_cortex_m && { arm_thumb2_ok || arm_thumb1_movt_ok } } } } */
+/* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
 /* { dg-options "-O2 -mslow-flash-data" } */

 unsigned
diff --git a/gcc/testsuite/gcc.target/arm/pr81863.c b/gcc/testsuite/gcc.target/arm/pr81863.c
index 63b1ed66b2c..225a0c5cc2b 100644
--- a/gcc/testsuite/gcc.target/arm/pr81863.c
+++ b/gcc/testsuite/gcc.target/arm/pr81863.c
@@ -1,5 +1,6 @@
 /* testsuite/gcc.target/arm/pr48183.c */
 /* { dg-do compile } */
+/* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mslow-flash-data" } } */
 /* { dg-options "-O2 -mword-relocations -march=armv7-a -marm" } */
 /* { dg-final { scan-assembler-not "\[\\t \]+movw" } } */

diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-1.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-1.c
index 089a72b67f3..d10391a69ac 100644
--- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-1.c
+++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-1.c
@@ -6,6 +6,7 @@
 /* { dg-do compile } */
 /* {
Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM
Hi all, Ping? This new version changes both the middle-end and back-end part so will need a review for both of those. Best regards, Thomas On Wed, 29 Aug 2018 at 11:07, Thomas Preudhomme wrote: > > Forgot another important change in ARM backend: > > The expander were causing one too many indirection which was what > caused the test failure in glibc. The new expanders code skip the > creation of a move from the memory reference of the guard's address to > a register since this is done in the insn themselves. I think during > the initial implementation of the first version of the patch I had > issues with loading the address and used that to load the address. As > can be seen from the absence of regression on the runtime stack > protector test in glibc, this is now working properly, also confirmed > by manual inspection of the code. > > I've attached the interdiff from previous version for reference. > > Best regards, > > Thomas > On Wed, 29 Aug 2018 at 10:51, Thomas Preudhomme > wrote: > > > > Resend hopefully without HTML this time. > > > > On Wed, 29 Aug 2018 at 10:49, Thomas Preudhomme > > wrote: > > > > > > Hi, > > > > > > I've reworked the patch fixing PR85434 (spilling of stack protector > > > guard's address on ARM) to address the testsuite regression on powerpc > > > and x86 as well as glibc testsuite regression on ARM. Issues were due to > > > unconditionally attempting to generate the new patterns. The code now > > > tests if there is a pattern for them for the target before generating > > > them. In the ARM side of the patch, I've also added a more specific > > > predicate for the new patterns. The new patch is found below. > > > > > > > > > In case of high register pressure in PIC mode, address of the stack > > > protector's guard can be spilled on ARM targets as shown in PR85434, > > > thus allowing an attacker to control what the canary would be compared > > > against. 
ARM does lack stack_protect_set and stack_protect_test insn > > > patterns, defining them does not help as the address is expanded > > > regularly and the patterns only deal with the copy and test of the > > > guard with the canary. > > > > > > This problem does not occur for x86 targets because the PIC access and > > > the test can be done in the same instruction. Aarch64 is exempt too > > > because PIC access insn pattern are mov of UNSPEC which prevents it from > > > the second access in the epilogue being CSEd in cse_local pass with the > > > first access in the prologue. > > > > > > The approach followed here is to create new "combined" set and test > > > standard pattern names that take the unexpanded guard and do the set or > > > test. This allows the target to use an opaque pattern (eg. using UNSPEC) > > > to hide the individual instructions being generated to the compiler and > > > split the pattern into generic load, compare and branch instruction > > > after register allocator, therefore avoiding any spilling. This is here > > > implemented for the ARM targets. For targets not implementing these new > > > standard pattern names, the existing stack_protect_set and > > > stack_protect_test pattern names are used. > > > > > > To be able to split PIC access after register allocation, the functions > > > had to be augmented to force a new PIC register load and to control > > > which register it loads into. This is because sharing the PIC register > > > between prologue and epilogue could lead to spilling due to CSE again > > > which an attacker could use to control what the canary gets compared > > > against. > > > > > > ChangeLog entries are as follows: > > > > > > *** gcc/ChangeLog *** > > > > > > 2018-08-09 Thomas Preud'homme > > > > > > * target-insns.def (stack_protect_combined_set): Define new standard > > > pattern name. > > > (stack_protect_combined_test): Likewise. 
> > > * cfgexpand.c (stack_protect_prologue): Try new > > > stack_protect_combined_set pattern first. > > > * function.c (stack_protect_epilogue): Try new > > > stack_protect_combined_test pattern first. > > > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now > > > parameters to control which register to use as PIC register and force > > > reloading PIC register respectively. Insert in the stream of insns if > > > possible. > > >
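To illustrate why a spilled guard address matters, as the PR85434 description quoted above explains, here is a small model of the epilogue check. It is a sketch under simplifying assumptions: struct frame_model and both check functions are hypothetical, and the model treats the whole frame as writable by an attacker after a stack overflow. It does not reproduce the ARM code sequences of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a protected stack frame.  Everything in it is
   assumed to be attacker-writable once a stack buffer overflows.  */
struct frame_model {
  uintptr_t canary;                    /* copy of the guard stored by the prologue */
  const uintptr_t *spilled_guard_addr; /* guard address spilled under register pressure */
};

/* Epilogue check that reloads the guard's address from its stack slot.  */
static bool check_with_spilled_address(const struct frame_model *f)
{
  assert(f != NULL && f->spilled_guard_addr != NULL);
  return f->canary == *f->spilled_guard_addr;
}

/* Epilogue check that recomputes the guard's address without going
   through the stack, which is what the combined patterns aim for.  */
static bool check_with_recomputed_address(const struct frame_model *f,
                                          const uintptr_t *guard)
{
  assert(f != NULL && guard != NULL);
  return f->canary == *guard;
}
```

If an overflow rewrites both the canary and the spilled address so that the address points at a value equal to the new canary, the first check still passes, whereas the second compares against the real guard and detects the corruption.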
Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM
Forgot another important change in ARM backend: The expander were causing one too many indirection which was what caused the test failure in glibc. The new expanders code skip the creation of a move from the memory reference of the guard's address to a register since this is done in the insn themselves. I think during the initial implementation of the first version of the patch I had issues with loading the address and used that to load the address. As can be seen from the absence of regression on the runtime stack protector test in glibc, this is now working properly, also confirmed by manual inspection of the code. I've attached the interdiff from previous version for reference. Best regards, Thomas On Wed, 29 Aug 2018 at 10:51, Thomas Preudhomme wrote: > > Resend hopefully without HTML this time. > > On Wed, 29 Aug 2018 at 10:49, Thomas Preudhomme > wrote: > > > > Hi, > > > > I've reworked the patch fixing PR85434 (spilling of stack protector guard's > > address on ARM) to address the testsuite regression on powerpc and x86 as > > well as glibc testsuite regression on ARM. Issues were due to > > unconditionally attempting to generate the new patterns. The code now tests > > if there is a pattern for them for the target before generating them. In > > the ARM side of the patch, I've also added a more specific predicate for > > the new patterns. The new patch is found below. > > > > > > In case of high register pressure in PIC mode, address of the stack > > protector's guard can be spilled on ARM targets as shown in PR85434, > > thus allowing an attacker to control what the canary would be compared > > against. ARM does lack stack_protect_set and stack_protect_test insn > > patterns, defining them does not help as the address is expanded > > regularly and the patterns only deal with the copy and test of the > > guard with the canary. > > > > This problem does not occur for x86 targets because the PIC access and > > the test can be done in the same instruction. 
Aarch64 is exempt too > > because PIC access insn pattern are mov of UNSPEC which prevents it from > > the second access in the epilogue being CSEd in cse_local pass with the > > first access in the prologue. > > > > The approach followed here is to create new "combined" set and test > > standard pattern names that take the unexpanded guard and do the set or > > test. This allows the target to use an opaque pattern (eg. using UNSPEC) > > to hide the individual instructions being generated to the compiler and > > split the pattern into generic load, compare and branch instruction > > after register allocator, therefore avoiding any spilling. This is here > > implemented for the ARM targets. For targets not implementing these new > > standard pattern names, the existing stack_protect_set and > > stack_protect_test pattern names are used. > > > > To be able to split PIC access after register allocation, the functions > > had to be augmented to force a new PIC register load and to control > > which register it loads into. This is because sharing the PIC register > > between prologue and epilogue could lead to spilling due to CSE again > > which an attacker could use to control what the canary gets compared > > against. > > > > ChangeLog entries are as follows: > > > > *** gcc/ChangeLog *** > > > > 2018-08-09 Thomas Preud'homme > > > > * target-insns.def (stack_protect_combined_set): Define new standard > > pattern name. > > (stack_protect_combined_test): Likewise. > > * cfgexpand.c (stack_protect_prologue): Try new > > stack_protect_combined_set pattern first. > > * function.c (stack_protect_epilogue): Try new > > stack_protect_combined_test pattern first. > > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now > > parameters to control which register to use as PIC register and force > > reloading PIC register respectively. Insert in the stream of insns if > > possible. 
> > (legitimize_pic_address): Expose above new parameters in prototype and > > adapt recursive calls accordingly. > > (arm_legitimize_address): Adapt to new legitimize_pic_address > > prototype. > > (thumb_legitimize_address): Likewise. > > (arm_emit_call_insn): Adapt to new require_pic_register prototype. > > * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype > > change. > > * config/arm/predicated.md (guard_operand): New predicate. > > * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM
Resend hopefully without HTML this time. On Wed, 29 Aug 2018 at 10:49, Thomas Preudhomme wrote: > > Hi, > > I've reworked the patch fixing PR85434 (spilling of stack protector guard's > address on ARM) to address the testsuite regression on powerpc and x86 as > well as glibc testsuite regression on ARM. Issues were due to unconditionally > attempting to generate the new patterns. The code now tests if there is a > pattern for them for the target before generating them. In the ARM side of > the patch, I've also added a more specific predicate for the new patterns. > The new patch is found below. > > > In case of high register pressure in PIC mode, address of the stack > protector's guard can be spilled on ARM targets as shown in PR85434, > thus allowing an attacker to control what the canary would be compared > against. ARM does lack stack_protect_set and stack_protect_test insn > patterns, defining them does not help as the address is expanded > regularly and the patterns only deal with the copy and test of the > guard with the canary. > > This problem does not occur for x86 targets because the PIC access and > the test can be done in the same instruction. Aarch64 is exempt too > because PIC access insn pattern are mov of UNSPEC which prevents it from > the second access in the epilogue being CSEd in cse_local pass with the > first access in the prologue. > > The approach followed here is to create new "combined" set and test > standard pattern names that take the unexpanded guard and do the set or > test. This allows the target to use an opaque pattern (eg. using UNSPEC) > to hide the individual instructions being generated to the compiler and > split the pattern into generic load, compare and branch instruction > after register allocator, therefore avoiding any spilling. This is here > implemented for the ARM targets. For targets not implementing these new > standard pattern names, the existing stack_protect_set and > stack_protect_test pattern names are used. 
> > To be able to split PIC access after register allocation, the functions > had to be augmented to force a new PIC register load and to control > which register it loads into. This is because sharing the PIC register > between prologue and epilogue could lead to spilling due to CSE again > which an attacker could use to control what the canary gets compared > against. > > ChangeLog entries are as follows: > > *** gcc/ChangeLog *** > > 2018-08-09 Thomas Preud'homme > > * target-insns.def (stack_protect_combined_set): Define new standard > pattern name. > (stack_protect_combined_test): Likewise. > * cfgexpand.c (stack_protect_prologue): Try new > stack_protect_combined_set pattern first. > * function.c (stack_protect_epilogue): Try new > stack_protect_combined_test pattern first. > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now > parameters to control which register to use as PIC register and force > reloading PIC register respectively. Insert in the stream of insns if > possible. > (legitimize_pic_address): Expose above new parameters in prototype and > adapt recursive calls accordingly. > (arm_legitimize_address): Adapt to new legitimize_pic_address > prototype. > (thumb_legitimize_address): Likewise. > (arm_emit_call_insn): Adapt to new require_pic_register prototype. > * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype > change. > * config/arm/predicates.md (guard_operand): New predicate. > * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address > prototype change. > (stack_protect_combined_set): New insn_and_split pattern. > (stack_protect_set): New insn pattern. > (stack_protect_combined_test): New insn_and_split pattern. > (stack_protect_test): New insn pattern. > * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec. > (UNSPEC_SP_TEST): Likewise. > * doc/md.texi (stack_protect_combined_set): Document new standard > pattern name. 
> (stack_protect_set): Clarify that the operand for guard's address is > legal. > (stack_protect_combined_test): Document new standard pattern name. > (stack_protect_test): Clarify that the operand for guard's address is > legal. > > *** gcc/testsuite/ChangeLog *** > > 2018-07-05 Thomas Preud'homme > > * gcc.target/arm/pr85434.c: New test. > > > Testing: > > native x86_64: bootstrap + testsuite -> no regression, can see failures with > previous version of patch but not with new version > native powerpc64: bootstrap + testsuite -> no regression, can see failures > from pr86834 with previous vers
Re: [PATCH][GCC][AArch64] Limit movmem copies to TImode copies.
Hi Tamar,

Thanks for your patch. Just one comment about your ChangeLog entry for the testsuite change: shouldn't it mention that it is a new testcase? The patch you attached seems to create the file.

Best regards,

Thomas

On Mon, 13 Aug 2018 at 10:33, Tamar Christina wrote:
> Hi All,
>
> On AArch64 we have integer modes larger than TImode, and while we can generate
> moves for these they're not as efficient.
>
> So instead make sure we limit the maximum we can copy to TImode. This means
> copying a 16 byte struct will issue 1 TImode copy, which will be done using a
> single STP as we expect but a CImode sized copy won't issue CImode operations.
>
> Bootstrapped and regtested on aarch64-none-linux-gnu and no issues.
> Crosstested aarch64_be-none-elf and no issues.
>
> Ok for trunk?
>
> Thanks,
> Tamar
>
> gcc/
> 2018-08-13 Tamar Christina
>
> * config/aarch64/aarch64.c (aarch64_expand_movmem): Set TImode max.
>
> gcc/testsuite/
> 2018-08-13 Tamar Christina
>
> * gcc.target/aarch64/large_struct_copy_2.c: Add assembler scan.
>
> --
>
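To make the two cases in the quoted description concrete, here is a minimal sketch. It is not taken from the patch or from large_struct_copy_2.c: the names s16, s48, copy16 and copy48 are hypothetical, and 48 bytes is assumed to be the size of a CImode value. With the patch applied one would expect the 16-byte assignment to become a single TImode copy and the 48-byte one to be split into TImode-sized pieces; confirming that means reading the AArch64 assembly (for instance from -O2 -S), which this sketch does not do.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 16-byte aggregate: the size of one TImode move.  */
struct s16 { uint64_t a, b; };

/* Hypothetical 48-byte aggregate: assumed here to match CImode.  */
struct s48 { uint64_t v[6]; };

static_assert (sizeof (struct s16) == 16, "s16 should be 16 bytes");
static_assert (sizeof (struct s48) == 48, "s48 should be 48 bytes");

/* Plain structure assignments; how each copy is expanded (one TImode
   move, several of them, or a library call) is the compiler's choice.  */
void
copy16 (struct s16 *dst, const struct s16 *src)
{
  *dst = *src;
}

void
copy48 (struct s48 *dst, const struct s48 *src)
{
  *dst = *src;
}
```

The behaviour of the C code is the same on any target; only the instruction sequence chosen for the copies is expected to differ.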
[PATCH] Clarify source of tm.texi to copy for GFDL grant
When tm.texi.in is updated in the source tree, the following message gets displayed:

Verify that you have permission to grant a GFDL license for all
new text in tm.texi, then copy it to /gcc/doc/tm.texi.

Some colleagues and I have been confused several times by that message as to which tm.texi to copy, so I think it would be clearer to indicate the absolute path of the source as well. This patch achieves that.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2018-08-09 Thomas Preud'homme

* Makefile.in: Clarify which tm.texi to copy over to assert the right to grant a GFDL license for all.

Testing: Built GCC with a change in tm.texi.in and copied the file by copy/pasting the source and destination paths from the resulting message. The second build then succeeded.

Is this ok for trunk?

Best regards,

Thomas

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index e7d818d174c..d8d2b885f6d 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2504,7 +2504,7 @@ s-tm-texi: build/genhooks$(build_exeext) $(srcdir)/doc/tm.texi.in
   else \
     echo >&2 ; \
     echo Verify that you have permission to grant a GFDL license for all >&2 ; \
-    echo new text in tm.texi, then copy it to $(srcdir)/doc/tm.texi. >&2 ; \
+    echo new text in $(objdir)/tm.texi, then copy it to $(srcdir)/doc/tm.texi. >&2 ; \
     false; \
   fi
--
2.18.0
Re: [PATCH][GCC][Arm] Fix subreg crash in different way by enabling the FP16 pattern unconditionally.
On Thu, 26 Jul 2018 at 12:01, Tamar Christina wrote: > > Hi Thomas, > > > -Original Message- > > From: Thomas Preudhomme > > Sent: Thursday, July 26, 2018 09:29 > > To: Tamar Christina > > Cc: gcc-patches@gcc.gnu.org; nd ; Ramana Radhakrishnan > > ; Richard Earnshaw > > ; ni...@redhat.com; Kyrylo Tkachov > > > > Subject: Re: [PATCH][GCC][Arm] Fix subreg crash in different way by > > enabling the FP16 pattern unconditionally. > > > > Hi Tamar, > > > > On Wed, 25 Jul 2018 at 16:28, Tamar Christina > > wrote: > > > > > > Hi Thomas, > > > > > > Thanks for the review! > > > > > > > > > > > > > I don't believe the TARGET_FP16 guard to be needed, because the > > > > > pattern doesn't actually generate code and requires another > > > > > pattern for that, and a reg to reg move should always be possible > > > > > anyway. So allowing the force to register here is safe and it > > > > > allows the compiler to generate a correct error instead of ICEing in > > > > > an > > infinite loop. > > > > > > > > How about subreg to subreg move? Doesn't that expand to more insns > > > > (subreg to reg and reg to subreg)? Couldn't you improve the logic to > > > > check that there is actually a mode change so that if there isn't > > > > (like moving from one subreg to another) just expand to a single move? > > > > > > > > > > Yes, but that is not a new issue. My patch is simply removing the > > > TARGET_FP16 restrictions and merging two patterns that should be one > > using an iterator and nothing more. > > > > > > The redundant mov is already there and a different issue than the ICE I'm > > trying to fix. > > > > It's there for movv4hf and movv6hf but your patch extends this problem to > > movv2sf and movv4sf as well. > > I don't understand how it can. My patch just replaces one pattern for V4HF and > one for V8HF with one pattern operating on VH. > > ;; Vector modes for 16-bit floating-point support. 
> (define_mode_iterator VH [V8HF V4HF]) > > My pattern has absolutely no effect on V2SF and V4SF or any of the other > modes. My bad, I was looking at VF. > > > > > > > > > None of the code inside the expander is needed at all, the code really > > > only has an effect on subreg to subreg moves, as `force_reg` doesn't do > > anything when it's argument is already a reg. > > > > > > The comment in the expander (which was already there) is wrong. The > > > *reason* the ICE is fixed isn't because of the `force_reg`. It's > > > because of the mere presence of the expander itself. The expander > > > matches the standard mov$a optab and so this prevents > > emit_move_insn_1 from doing the move by subwords as it finds a pattern > > that's able to do the move. > > > > Could you then fix the comment in your patch as well? I hadn't understood > > the force_reg was not key here. You might want to update the following > > sentence from your patch description if you are going to include it in your > > commit message: > > I'll update the comment in the patch. The cover letter won't be included in > the commit, > But it does accurately reflect the current state of affairs. The patch will > do the force_reg, > It's just not the reason it works. Understood. > > > > > The way this is worked around in the back-end is that we have move > > patterns in neon.md that usually just force the register instead of checking > > with the back-end. > > > > "The way this is worked around (..) that just force the register" is what > > led > > me to believe the force_reg was important. > > > > > > > > The expander however always falls through and doesn’t stop RTL > > > generation. You could remove all the code in there and have it > > > properly match the *neon_mov instructions which will do the right > > > thing later at code generation time and avoid the redundant moves. My > > guess is the original `force_reg` was copied from the other patterns like > > `movti` and the existing `mov`. 
There It makes sense because the > > operands can be MEM or anything general_operand. > > > > > > However the redundant moves are a different problem than what I'm > > > trying to solve here. So I think that's another patch which requires > > > further > > testing. > > > > I was just thinking of restricting when
Re: [PATCH][GCC][Arm] Fix subreg crash in different way by enabling the FP16 pattern unconditionally.
Hi Tamar, On Wed, 25 Jul 2018 at 16:28, Tamar Christina wrote: > > Hi Thomas, > > Thanks for the review! > > > > > > > I don't believe the TARGET_FP16 guard to be needed, because the > > > pattern doesn't actually generate code and requires another pattern > > > for that, and a reg to reg move should always be possible anyway. So > > > allowing the force to register here is safe and it allows the compiler > > > to generate a correct error instead of ICEing in an infinite loop. > > > > How about subreg to subreg move? Doesn't that expand to more insns > > (subreg to reg and reg to subreg)? Couldn't you improve the logic to check > > that there is actually a mode change so that if there isn't (like moving > > from > > one subreg to another) just expand to a single move? > > > > Yes, but that is not a new issue. My patch is simply removing the TARGET_FP16 > restrictions and > merging two patterns that should be one using an iterator and nothing more. > > The redundant mov is already there and a different issue than the ICE I'm > trying to fix. It's there for movv4hf and movv6hf but your patch extends this problem to movv2sf and movv4sf as well. > > None of the code inside the expander is needed at all, the code really only > has an effect on subreg > to subreg moves, as `force_reg` doesn't do anything when it's argument is > already a reg. > > The comment in the expander (which was already there) is wrong. The *reason* > the ICE is fixed isn't > because of the `force_reg`. It's because of the mere presence of the expander > itself. The expander matches the > standard mov$a optab and so this prevents emit_move_insn_1 from doing the > move by subwords as it finds a pattern > that's able to do the move. Could you then fix the comment in your patch as well? I hadn't understood the force_reg was not key here. 
You might want to update the following sentence from your patch description if you are going to include it in your commit message: The way this is worked around in the back-end is that we have move patterns in neon.md that usually just force the register instead of checking with the back-end. "The way this is worked around (..) that just force the register" is what led me to believe the force_reg was important. > > The expander however always falls through and doesn’t stop RTL generation. > You could remove all the code in there and have > it properly match the *neon_mov instructions which will do the right thing > later at code generation time and avoid the redundant > moves. My guess is the original `force_reg` was copied from the other > patterns like `movti` and the existing `mov`. There it makes > sense because the operands can be MEM or anything general_operand. > > However the redundant moves are a different problem than what I'm trying to > solve here. So I think that's another patch which requires further > testing. I was just thinking of restricting when the force_reg happens, but if it can be removed completely I agree it should probably be done in a separate patch. Oh by the way, is there something that prevents those expanders from ever being used with a memory operand? I ask because the GCC internals manual contains the following passage for the mov standard pattern (bold marks added by me): "Second, these patterns are not used solely in the RTL generation pass. Even the reload pass can generate move insns to copy values from stack slots into temporary registers. When it does so, one of the operands is a hard register and the other is an operand that can need to be reloaded into a register. Therefore, when given such a pair of operands, the pattern must generate RTL which needs no reloading and needs no temporary registers—no registers other than the operands. 
For example, if you support the pattern with a define_expand, then in such a case the define_expand *mustn’t call force_reg* or any other such function which might generate new pseudo registers." Best regards, Thomas > > Regards, > Tamar > > > Best regards, > > > > Thomas > > > > > > > > This patch ensures gcc.target/arm/big-endian-subreg.c is fixed without > > > introducing any regressions while fixing > > > > > > gcc.dg/vect/vect-nop-move.c execution test > > > g++.dg/torture/vshuf-v2si.C -O3 -g execution test > > > g++.dg/torture/vshuf-v4si.C -O3 -g execution test > > > g++.dg/torture/vshuf-v8hi.C -O3 -g execution test > > > > > > Regtested on armeb-none-eabi and no regressions. > > > Bootstrapped on arm-none-linux-gnueabihf and no issues. > > > > > > > > > Ok for trunk? > > > > > > Thanks, > > > Tamar > > > > > > gcc/ > > > 2018-07-23 Tamar Christina > > > > > > PR target/84711 > > > * config/arm/arm.c (arm_can_change_mode_class): Disallow subreg. > > > * config/arm/neon.md (movv4hf, movv8hf): Refactored to.. > > > (mov): ..this and enable unconditionally. > > > > > > --
Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM
Hi Kyrill, Using memory_operand worked, the issues I encountered when using it in earlier versions of the patch must have been due to the missing test on address_operand in the preparation statements which I added later. Please find an updated patch in attachment. ChangeLog entry is as follows: *** gcc/ChangeLog *** 2018-07-05 Thomas Preud'homme * target-insns.def (stack_protect_combined_set): Define new standard pattern name. (stack_protect_combined_test): Likewise. * cfgexpand.c (stack_protect_prologue): Try new stack_protect_combined_set pattern first. * function.c (stack_protect_epilogue): Try new stack_protect_combined_test pattern first. * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now parameters to control which register to use as PIC register and force reloading PIC register respectively. Insert in the stream of insns if possible. (legitimize_pic_address): Expose above new parameters in prototype and adapt recursive calls accordingly. (arm_legitimize_address): Adapt to new legitimize_pic_address prototype. (thumb_legitimize_address): Likewise. (arm_emit_call_insn): Adapt to new require_pic_register prototype. * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype change. * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address prototype change. (stack_protect_combined_set): New insn_and_split pattern. (stack_protect_set): New insn pattern. (stack_protect_combined_test): New insn_and_split pattern. (stack_protect_test): New insn pattern. * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec. (UNSPEC_SP_TEST): Likewise. * doc/md.texi (stack_protect_combined_set): Document new standard pattern name. (stack_protect_set): Clarify that the operand for guard's address is legal. (stack_protect_combined_test): Document new standard pattern name. (stack_protect_test): Clarify that the operand for guard's address is legal. 
*** gcc/testsuite/ChangeLog *** 2018-07-05 Thomas Preud'homme * gcc.target/arm/pr85434.c: New test. Bootstrapped again for Arm and Thumb-2 and regtested with and without -fstack-protector-all without any regression. Best regards, Thomas On Thu, 19 Jul 2018 at 17:34, Thomas Preudhomme wrote: > > [Dropping Jeff Law from the list since he already commented on the > middle end parts] > > Hi Kyrill, > > On Thu, 19 Jul 2018 at 12:02, Kyrill Tkachov > wrote: > > > > Hi Thomas, > > > > On 17/07/18 12:02, Thomas Preudhomme wrote: > > > Fixed in attached patch. ChangeLog entries are unchanged: > > > > > > *** gcc/ChangeLog *** > > > > > > 2018-07-05 Thomas Preud'homme > > > > > > PR target/85434 > > > * target-insns.def (stack_protect_combined_set): Define new standard > > > pattern name. > > > (stack_protect_combined_test): Likewise. > > > * cfgexpand.c (stack_protect_prologue): Try new > > > stack_protect_combined_set pattern first. > > > * function.c (stack_protect_epilogue): Try new > > > stack_protect_combined_test pattern first. > > > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now > > > parameters to control which register to use as PIC register and force > > > reloading PIC register respectively. > > > (legitimize_pic_address): Expose above new parameters in prototype and > > > adapt recursive calls accordingly. > > > (arm_legitimize_address): Adapt to new legitimize_pic_address > > > prototype. > > > (thumb_legitimize_address): Likewise. > > > (arm_emit_call_insn): Adapt to new require_pic_register prototype. > > > * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype > > > change. > > > * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address > > > prototype change. > > > (stack_protect_combined_set): New insn_and_split pattern. > > > (stack_protect_set): New insn pattern. > > > (stack_protect_combined_test): New insn_and_split pattern. > > > (stack_protect_test): New insn pattern. 
> > > * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec. > > > (UNSPEC_SP_TEST): Likewise. > > > * doc/md.texi (stack_protect_combined_set): Document new standard > > > pattern name. > > > (stack_protect_set): Clarify that the operand for guard's address is > > > legal. > > > (stack_protect_combined_test): Document new standard pattern name. > > > (stack_protect_test): Clarify that the operand for guard's address is > &
Re: [PATCH][GCC][Arm] Fix subreg crash in different way by enabling the FP16 pattern unconditionally.
Hi Tamar, On Mon, 23 Jul 2018 at 17:56, Tamar Christina wrote: > > Hi All, > > My previous patch changed arm_can_change_mode_class to allow subregs of > 64bit registers on arm big-endian. However it seems that we can't do this > because the data in 64 bit VFP registers is stored in little-endian order, > even on big-endian. > > Allowing this change had a knock on effect that caused GCC's no-op detection > to think that loading from the first lane on arm big-endian is a no-op. This is > because we can't describe the weird ordering we have on D registers on > big-endian. > > The original issue comes from the fact that the code does > > ... foo (... bar) > { > return bar; > } > > The expansion of the return statement causes GCC to try to return the value in > a register. GCC will try to emit the move then, from MEM to REG (due to the > SSA temporary). It checks for a mov optab for this which isn't available and > then tries to do the move in bits using emit_move_multi_word. > > emit_move_multi_word will split the move into sub parts, but then needs to get > the sub parts and does this using subregs, but it's told it can't do subregs! > > The compiler is now stuck in an infinite loop. > > The way this is worked around in the back-end is that we have move patterns in > neon.md that usually just force the register instead of checking with the > back-end. This prevents emit_move_multi_word from being needed. However the > patterns for V4HF and V8HF were guarded by TARGET_NEON && TARGET_FP16. > > I don't believe the TARGET_FP16 guard to be needed, because the pattern > doesn't actually generate code and requires another pattern for that, and a reg to > reg move should always be possible anyway. So allowing the force to register here is > safe and it allows the compiler to generate a correct error instead of ICEing in an > infinite loop. How about subreg to subreg move? Doesn't that expand to more insns (subreg to reg and reg to subreg)? 
Couldn't you improve the logic to check that there is actually a mode change so that if there isn't (like moving from one subreg to another) just expand to a single move? Best regards, Thomas > > This patch ensures gcc.target/arm/big-endian-subreg.c is fixed without > introducing > any regressions while fixing > > gcc.dg/vect/vect-nop-move.c execution test > g++.dg/torture/vshuf-v2si.C -O3 -g execution test > g++.dg/torture/vshuf-v4si.C -O3 -g execution test > g++.dg/torture/vshuf-v8hi.C -O3 -g execution test > > Regtested on armeb-none-eabi and no regressions. > Bootstrapped on arm-none-linux-gnueabihf and no issues. > > > Ok for trunk? > > Thanks, > Tamar > > gcc/ > 2018-07-23 Tamar Christina > > PR target/84711 > * config/arm/arm.c (arm_can_change_mode_class): Disallow subreg. > * config/arm/neon.md (movv4hf, movv8hf): Refactored to.. > (mov): ..this and enable unconditionally. > > --
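For readers who want something compilable in the shape of the "... foo (... bar) { return bar; }" outline quoted above, here is a minimal sketch. It rests on assumptions: the type v4hi_t is a stand-in declared with the GCC vector extension (four 16-bit integer lanes), not the V4HF/V8HF types the thread is about, and check_foo is a hypothetical helper. It only shows the kind of function whose return value has to be moved as a whole 64-bit vector; it does not reproduce the ICE, which needs an arm big-endian target and the half-float vector modes.

```c
#include <assert.h>

/* Stand-in 64-bit vector type: four 16-bit lanes, declared with the
   GCC vector extension.  The real failure involved V4HF/V8HF.  */
typedef short v4hi_t __attribute__ ((vector_size (8)));

/* Same shape as the function outlined in the quoted description:
   the argument is returned unchanged.  */
v4hi_t
foo (v4hi_t bar)
{
  return bar;
}

/* Hypothetical self-check: every lane must survive the round trip.  */
void
check_foo (void)
{
  v4hi_t in = { 1, 2, 3, 4 };
  v4hi_t out = foo (in);
  for (int i = 0; i < 4; i++)
    assert (out[i] == in[i]);
}
```

On most targets this compiles to a trivial register move, which is the outcome the move expander is meant to guarantee.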
Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM
[Dropping Jeff Law from the list since he already commented on the middle end parts] Hi Kyrill, On Thu, 19 Jul 2018 at 12:02, Kyrill Tkachov wrote: > > Hi Thomas, > > On 17/07/18 12:02, Thomas Preudhomme wrote: > > Fixed in attached patch. ChangeLog entries are unchanged: > > > > *** gcc/ChangeLog *** > > > > 2018-07-05 Thomas Preud'homme > > > > PR target/85434 > > * target-insns.def (stack_protect_combined_set): Define new standard > > pattern name. > > (stack_protect_combined_test): Likewise. > > * cfgexpand.c (stack_protect_prologue): Try new > > stack_protect_combined_set pattern first. > > * function.c (stack_protect_epilogue): Try new > > stack_protect_combined_test pattern first. > > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now > > parameters to control which register to use as PIC register and force > > reloading PIC register respectively. > > (legitimize_pic_address): Expose above new parameters in prototype and > > adapt recursive calls accordingly. > > (arm_legitimize_address): Adapt to new legitimize_pic_address > > prototype. > > (thumb_legitimize_address): Likewise. > > (arm_emit_call_insn): Adapt to new require_pic_register prototype. > > * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype > > change. > > * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address > > prototype change. > > (stack_protect_combined_set): New insn_and_split pattern. > > (stack_protect_set): New insn pattern. > > (stack_protect_combined_test): New insn_and_split pattern. > > (stack_protect_test): New insn pattern. > > * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec. > > (UNSPEC_SP_TEST): Likewise. > > * doc/md.texi (stack_protect_combined_set): Document new standard > > pattern name. > > (stack_protect_set): Clarify that the operand for guard's address is > > legal. > > (stack_protect_combined_test): Document new standard pattern name. 
> > (stack_protect_test): Clarify that the operand for guard's address is > > legal. > > > > *** gcc/testsuite/ChangeLog *** > > > > 2018-07-05 Thomas Preud'homme > > > > PR target/85434 > > * gcc.target/arm/pr85434.c: New test. > > > > Sorry for the delay. Some comments inline. > > Kyrill > > diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c > index d6e3c382085..d1a893ac56e 100644 > --- a/gcc/cfgexpand.c > +++ b/gcc/cfgexpand.c > @@ -6105,8 +6105,18 @@ stack_protect_prologue (void) > { > tree guard_decl = targetm.stack_protect_guard (); > rtx x, y; > + struct expand_operand ops[2]; > > x = expand_normal (crtl->stack_protect_guard); > + create_fixed_operand (&ops[0], x); > + create_fixed_operand (&ops[1], DECL_RTL (guard_decl)); > + /* Allow the target to compute address of Y and copy it to X without > + leaking Y into a register. This combined address + copy pattern allows > + the target to prevent spilling of any intermediate results by splitting > + it after register allocator. */ > + if (maybe_expand_insn (targetm.code_for_stack_protect_combined_set, 2, > ops)) > +return; > + > if (guard_decl) > y = expand_normal (guard_decl); > else > diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h > index 8537262ce64..100844e659c 100644 > --- a/gcc/config/arm/arm-protos.h > +++ b/gcc/config/arm/arm-protos.h > @@ -67,7 +67,7 @@ extern int const_ok_for_dimode_op (HOST_WIDE_INT, enum > rtx_code); > extern int arm_split_constant (RTX_CODE, machine_mode, rtx, >HOST_WIDE_INT, rtx, rtx, int); > extern int legitimate_pic_operand_p (rtx); > -extern rtx legitimize_pic_address (rtx, machine_mode, rtx); > +extern rtx legitimize_pic_address (rtx, machine_mode, rtx, rtx, bool); > extern rtx legitimize_tls_address (rtx, rtx); > extern bool arm_legitimate_address_p (machine_mode, rtx, bool); > extern int arm_legitimate_address_outer_p (machine_mode, rtx, RTX_CODE, > int); > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c > index ec3abbcba9f..f4a970580c2 100644 > --- 
a/gcc/config/arm/arm.c > +++ b/gcc/config/arm/arm.c > @@ -7369,20 +7369,26 @@ legitimate_pic_operand_p (rtx x) > } > > /* Record that the current function needs a PIC register. Initialize > - cfun->machine->pic_reg if we have not already done so. */ > + cfun->machine->pic_reg if we have not already done so. > + > + If not NULL, PIC_REG
Re: [PATCH] Show valid options for -march and -mtune in --help=target for arm32 (PR driver/83193).
Hi Martin,

Why is this needed when -mfpu does not seem to need it for instance?

Regarding the patch:

> -print "Name(processor_type) Type(enum processor_type)"
> -print "Known ARM CPUs (for use with the -mcpu= and -mtune= options):\n"
> +print "Name(processor_type) Type(enum processor_type) ForceHelp"
> +print "Known ARM CPUs (for use with the -mtune= options):\n"

Why change the text beyond adding ForceHelp?

> +@item ForceHelp
> +This property is optional. If present, enum values is printed
> +in @option{--help} output.
> +

are printed

Thanks,

Thomas

On Wed, 18 Jul 2018 at 16:50, Martin Liška wrote:
>
> Hi.
>
> This introduces new ForceHelp option flag that helps to
> print valid option enum values that are not directly
> used as a type of an option.
>
> May I please ask ARM folks to test the patch?
> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> 2018-07-18 Martin Liska
>
> PR driver/83193
> * config/arm/arm-tables.opt: Add ForceHelp flag for
> processor_type and arch_name enum types.
> * config/arm/parsecpu.awk: Likewise.
> * doc/options.texi: Document new flag ForceHelp.
> * opt-read.awk: Parse ForceHelp and set it in construction.
> * optc-gen.awk: Likewise.
> * opts.c (print_filtered_help): Handle force_help option.
> * opts.h (struct cl_enum): New field force_help.
> ---
> gcc/config/arm/arm-tables.opt | 6 +++---
> gcc/config/arm/parsecpu.awk | 6 +++---
> gcc/doc/options.texi | 4
> gcc/opt-read.awk | 3 +++
> gcc/optc-gen.awk | 3 ++-
> gcc/opts.c | 3 ++-
> gcc/opts.h | 3 +++
> 7 files changed, 20 insertions(+), 8 deletions(-)
>
>
Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM
Fixed in attached patch. ChangeLog entries are unchanged: *** gcc/ChangeLog *** 2018-07-05 Thomas Preud'homme PR target/85434 * target-insns.def (stack_protect_combined_set): Define new standard pattern name. (stack_protect_combined_test): Likewise. * cfgexpand.c (stack_protect_prologue): Try new stack_protect_combined_set pattern first. * function.c (stack_protect_epilogue): Try new stack_protect_combined_test pattern first. * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now parameters to control which register to use as PIC register and force reloading PIC register respectively. (legitimize_pic_address): Expose above new parameters in prototype and adapt recursive calls accordingly. (arm_legitimize_address): Adapt to new legitimize_pic_address prototype. (thumb_legitimize_address): Likewise. (arm_emit_call_insn): Adapt to new require_pic_register prototype. * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype change. * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address prototype change. (stack_protect_combined_set): New insn_and_split pattern. (stack_protect_set): New insn pattern. (stack_protect_combined_test): New insn_and_split pattern. (stack_protect_test): New insn pattern. * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec. (UNSPEC_SP_TEST): Likewise. * doc/md.texi (stack_protect_combined_set): Document new standard pattern name. (stack_protect_set): Clarify that the operand for guard's address is legal. (stack_protect_combined_test): Document new standard pattern name. (stack_protect_test): Clarify that the operand for guard's address is legal. *** gcc/testsuite/ChangeLog *** 2018-07-05 Thomas Preud'homme PR target/85434 * gcc.target/arm/pr85434.c: New test. 
Best regards, Thomas On Mon, 16 Jul 2018 at 22:46, Jeff Law wrote: > > On 07/05/2018 08:48 AM, Thomas Preudhomme wrote: > > In case of high register pressure in PIC mode, address of the stack > > protector's guard can be spilled on ARM targets as shown in PR85434, > > thus allowing an attacker to control what the canary would be compared > > against. ARM does lack stack_protect_set and stack_protect_test insn > > patterns, defining them does not help as the address is expanded > > regularly and the patterns only deal with the copy and test of the > > guard with the canary. > > > > This problem does not occur for x86 targets because the PIC access and > > the test can be done in the same instruction. Aarch64 is exempt too > > because PIC access insn pattern are mov of UNSPEC which prevents it from > > the second access in the epilogue being CSEd in cse_local pass with the > > first access in the prologue. > > > > The approach followed here is to create new "combined" set and test > > standard pattern names that take the unexpanded guard and do the set or > > test. This allows the target to use an opaque pattern (eg. using UNSPEC) > > to hide the individual instructions being generated to the compiler and > > split the pattern into generic load, compare and branch instruction > > after register allocator, therefore avoiding any spilling. This is here > > implemented for the ARM targets. For targets not implementing these new > > standard pattern names, the existing stack_protect_set and > > stack_protect_test pattern names are used. > > > > To be able to split PIC access after register allocation, the functions > > had to be augmented to force a new PIC register load and to control > > which register it loads into. This is because sharing the PIC register > > between prologue and epilogue could lead to spilling due to CSE again > > which an attacker could use to control what the canary gets compared > > against. 
> > > > ChangeLog entries are as follows: > > > > *** gcc/ChangeLog *** > > > > 2018-07-05 Thomas Preud'homme > > > > PR target/85434 > > * target-insns.def (stack_protect_combined_set): Define new standard > > pattern name. > > (stack_protect_combined_test): Likewise. > > * cfgexpand.c (stack_protect_prologue): Try new > > stack_protect_combined_set pattern first. > > * function.c (stack_protect_epilogue): Try new > > stack_protect_combined_test pattern first. > > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now > > parameters to control which register to use as PIC register and force > > reloading PIC register respectively. > > (legitimize_pic_address): Expose above new parameters in prototype and > > adapt recursive calls
Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM
Adding Jeff and Eric since the patch adds an RTL target hook. Best regards, Thomas On Thu, 5 Jul 2018 at 15:48, Thomas Preudhomme wrote: > > In case of high register pressure in PIC mode, address of the stack > protector's guard can be spilled on ARM targets as shown in PR85434, > thus allowing an attacker to control what the canary would be compared > against. ARM does lack stack_protect_set and stack_protect_test insn > patterns, defining them does not help as the address is expanded > regularly and the patterns only deal with the copy and test of the > guard with the canary. > > This problem does not occur for x86 targets because the PIC access and > the test can be done in the same instruction. Aarch64 is exempt too > because PIC access insn pattern are mov of UNSPEC which prevents it from > the second access in the epilogue being CSEd in cse_local pass with the > first access in the prologue. > > The approach followed here is to create new "combined" set and test > standard pattern names that take the unexpanded guard and do the set or > test. This allows the target to use an opaque pattern (eg. using UNSPEC) > to hide the individual instructions being generated to the compiler and > split the pattern into generic load, compare and branch instruction > after register allocator, therefore avoiding any spilling. This is here > implemented for the ARM targets. For targets not implementing these new > standard pattern names, the existing stack_protect_set and > stack_protect_test pattern names are used. > > To be able to split PIC access after register allocation, the functions > had to be augmented to force a new PIC register load and to control > which register it loads into. This is because sharing the PIC register > between prologue and epilogue could lead to spilling due to CSE again > which an attacker could use to control what the canary gets compared > against. 
> > ChangeLog entries are as follows: > > *** gcc/ChangeLog *** > > 2018-07-05 Thomas Preud'homme > > PR target/85434 > * target-insns.def (stack_protect_combined_set): Define new standard > pattern name. > (stack_protect_combined_test): Likewise. > * cfgexpand.c (stack_protect_prologue): Try new > stack_protect_combined_set pattern first. > * function.c (stack_protect_epilogue): Try new > stack_protect_combined_test pattern first. > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now > parameters to control which register to use as PIC register and force > reloading PIC register respectively. > (legitimize_pic_address): Expose above new parameters in prototype and > adapt recursive calls accordingly. > (arm_legitimize_address): Adapt to new legitimize_pic_address > prototype. > (thumb_legitimize_address): Likewise. > (arm_emit_call_insn): Adapt to new require_pic_register prototype. > * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype > change. > * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address > prototype change. > (stack_protect_combined_set): New insn_and_split pattern. > (stack_protect_set): New insn pattern. > (stack_protect_combined_test): New insn_and_split pattern. > (stack_protect_test): New insn pattern. > * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec. > (UNSPEC_SP_TEST): Likewise. > * doc/md.texi (stack_protect_combined_set): Document new standard > pattern name. > (stack_protect_set): Clarify that the operand for guard's address is > legal. > (stack_protect_combined_test): Document new standard pattern name. > (stack_protect_test): Clarify that the operand for guard's address is > legal. > > *** gcc/testsuite/ChangeLog *** > > 2018-07-05 Thomas Preud'homme > > PR target/85434 > * gcc.target/arm/pr85434.c: New test. > > Testing: Bootstrapped on ARM in both Arm and Thumb-2 mode as well as on > Aarch64. 
Testsuite shows no regression on these 3 variants either both > with default flags and with -fstack-protector-all. > > Is this ok for trunk? If yes, would this be acceptable as a backport to > GCC 6, 7 and 8 provided that no regression is found? > > Best regards, > > Thomas From d917d48c2005e46154383589f203d06f3c6167e0 Mon Sep 17 00:00:00 2001 From: Thomas Preud'homme Date: Tue, 8 May 2018 15:47:05 +0100 Subject: [PATCH] PR85434: Prevent spilling of stack protector guard's address on ARM In case of high register pressure in PIC mode, address of the stack protector's guard can be spilled on ARM targets as shown in PR85434, thus allowing an attacker to control what the canary would be compared against. ARM does lack stack_protect
[PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM
In case of high register pressure in PIC mode, the address of the stack
protector's guard can be spilled on ARM targets as shown in PR85434,
thus allowing an attacker to control what the canary would be compared
against. ARM does lack stack_protect_set and stack_protect_test insn
patterns, but defining them does not help as the address is expanded
regularly and the patterns only deal with the copy and test of the
guard with the canary.

This problem does not occur for x86 targets because the PIC access and
the test can be done in the same instruction. AArch64 is exempt too
because its PIC access insn patterns are a mov of an UNSPEC, which
prevents the second access in the epilogue from being CSEd in the
cse_local pass with the first access in the prologue.

The approach followed here is to create new "combined" set and test
standard pattern names that take the unexpanded guard and do the set or
test. This allows the target to use an opaque pattern (e.g. using UNSPEC)
to hide the individual instructions being generated from the compiler and
to split the pattern into generic load, compare and branch instructions
after register allocation, therefore avoiding any spilling. This is
implemented here for the ARM targets. For targets not implementing these
new standard pattern names, the existing stack_protect_set and
stack_protect_test pattern names are used.

To be able to split the PIC access after register allocation, the PIC
functions (require_pic_register and legitimize_pic_address, see the
ChangeLog below) had to be augmented to force a new PIC register load and
to control which register it loads into. This is because sharing the PIC
register between prologue and epilogue could lead to spilling due to CSE
again, which an attacker could use to control what the canary gets
compared against.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-07-05  Thomas Preud'homme

    PR target/85434
    * target-insns.def (stack_protect_combined_set): Define new standard
    pattern name.
    (stack_protect_combined_test): Likewise.
    * cfgexpand.c (stack_protect_prologue): Try new
    stack_protect_combined_set pattern first.
    * function.c (stack_protect_epilogue): Try new
    stack_protect_combined_test pattern first.
    * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
    parameters to control which register to use as PIC register and force
    reloading PIC register respectively.
    (legitimize_pic_address): Expose above new parameters in prototype and
    adapt recursive calls accordingly.
    (arm_legitimize_address): Adapt to new legitimize_pic_address
    prototype.
    (thumb_legitimize_address): Likewise.
    (arm_emit_call_insn): Adapt to new require_pic_register prototype.
    * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
    change.
    * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
    prototype change.
    (stack_protect_combined_set): New insn_and_split pattern.
    (stack_protect_set): New insn pattern.
    (stack_protect_combined_test): New insn_and_split pattern.
    (stack_protect_test): New insn pattern.
    * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
    (UNSPEC_SP_TEST): Likewise.
    * doc/md.texi (stack_protect_combined_set): Document new standard
    pattern name.
    (stack_protect_set): Clarify that the operand for guard's address is
    legal.
    (stack_protect_combined_test): Document new standard pattern name.
    (stack_protect_test): Clarify that the operand for guard's address is
    legal.

*** gcc/testsuite/ChangeLog ***

2018-07-05  Thomas Preud'homme

    PR target/85434
    * gcc.target/arm/pr85434.c: New test.

Testing: Bootstrapped on ARM in both Arm and Thumb-2 mode as well as on
AArch64. The testsuite shows no regression on these 3 variants, both
with default flags and with -fstack-protector-all.

Is this ok for trunk? If yes, would this be acceptable as a backport to
GCC 6, 7 and 8 provided that no regression is found?

Best regards,

Thomas

From d917d48c2005e46154383589f203d06f3c6167e0 Mon Sep 17 00:00:00 2001
From: Thomas Preud'homme
Date: Tue, 8 May 2018 15:47:05 +0100
Subject: [PATCH] PR85434: Prevent spilling of stack protector guard's address on ARM
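To make the failure mode described above concrete, here is a minimal sketch in plain C. It is only a model of the problem, not GCC code and not the PR85434 testcase: the struct frame_model type and all four function names are hypothetical, and the "spilled address" is simply a pointer stored next to the canary. It is meant to show why an epilogue that reloads the guard through a spilled address can be fooled, while one that recomputes the address cannot.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a protected stack frame (not a GCC data structure). */
struct frame_model {
  uintptr_t canary;              /* copy of the guard placed in the frame */
  const uintptr_t *spilled_addr; /* guard address that got spilled to the stack */
};

/* Prologue: copy the guard into the frame. In the problematic case the
   guard's address is also spilled into the same frame. */
void model_prologue(struct frame_model *f, const uintptr_t *guard)
{
  f->canary = *guard;
  f->spilled_addr = guard;
}

/* Vulnerable epilogue: reloads the guard through the spilled address. */
bool check_via_spilled_address(const struct frame_model *f)
{
  return f->canary == *f->spilled_addr;
}

/* Robust epilogue: recomputes the guard's address instead of reusing a
   value that lived on the stack. */
bool check_via_fresh_address(const struct frame_model *f,
                             const uintptr_t *guard)
{
  return f->canary == *guard;
}

/* Overflow model: the attacker overwrites both the canary and the spilled
   address so that they agree with each other. */
void model_overflow(struct frame_model *f, const uintptr_t *attacker_value)
{
  f->canary = *attacker_value;
  f->spilled_addr = attacker_value;
}
```

After model_overflow() the check through the spilled address still passes although the frame has been corrupted, whereas the check against a freshly computed address fails as intended. Keeping the address computation inside an opaque pattern that is split after register allocation is how the patch avoids ever having such a stack copy.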
[ARM] Fix PR85434: spill of stack protector's guard address
I'll make a fool of myself but I still have further questions if you don't mind (see inline).

On Friday, 4 May 2018, Segher Boessenkool <seg...@kernel.crashing.org> wrote:
> Hi!
>
> On Wed, May 02, 2018 at 07:57:55AM +0100, Thomas Preudhomme wrote:
>> As mentioned in the ticket this was my first thought but this means
>> making the pattern aware of all the possible ways the address could be
>> accessed (PIC Vs non-PIC, Arm Vs Thumb-2 Vs Thumb-1) to decide how many
>> scratch registers are needed. I'd rather reuse the existing patterns as
>> much as possible to make sure they are well tested. Ideally I wanted a
>> way to mark a REG RTX so that it is never spilled and such that the
>> mark is propagated when the register is moved to another register or
>> propagated. But that is a bigger change so I decided it should be an
>> improvement for later but needed another solution right now.
>
> How would that work, esp. for pseudos? If too many regs have such a
> mark then the compiler will have to sorry() or similar, not a good
> thing at all.

I must be missing something: there should be as many pseudos with that mark as there are scratch registers in the new pattern doing the memory address load(s) + set / check. I'm guessing this is not as easy to achieve as it sounds.

>
>> By the way about making sure the address is not left in a register, I
>> have a question regarding the current stack_protect_set and
>> stack_protect_test patterns and their requirement to have registers
>> cleared afterwards: why is that necessary? Currently not all registers
>> are cleared and the guard is available in the canary before it is
>> overwritten anyway so I don't see how clearing the register adds any
>> extra security. What sort of attack is it protecting against?
>
> From md.texi:
>
> @item @samp{stack_protect_set}
> This pattern, if defined, moves a @code{ptr_mode} value from the memory
> in operand 1 to the memory in operand 0 without leaving the value in
> a register afterward. This is to avoid leaking the value some place
> that an attacker might use to rewrite the stack guard slot after
> having clobbered it.
>
> (etc.)

I've read that doc but what I don't understand is why the guard value being leaked in a register would be a problem if modified. The patterns as they are guarantee that the guard is always reloaded from its canonical location (e.g. TLS var). Because the patterns do not represent in RTL what they do, the compiler could not reuse the value left in a register. Are we worrying about optimizations the assembler could do?

>
> Having the canary in a global variable makes it a lot easier for exploit
> code to access it than if it is e.g. in TLS data. Actually leaking a
> pointer to it would make it extra easy...

If an attacker can execute code to access and modify the guard, why would s/he bother doing a stack overflow instead of just executing the code s/he wants to directly?

Best regards,

Thomas
Re: [ARM] Fix PR85434: spill of stack protector's guard address
Hi Segher,

As mentioned in the ticket this was my first thought but this means making the pattern aware of all the possible ways the address could be accessed (PIC Vs non-PIC, Arm Vs Thumb-2 Vs Thumb-1) to decide how many scratch registers are needed. I'd rather reuse the existing patterns as much as possible to make sure they are well tested. Ideally I wanted a way to mark a REG RTX so that it is never spilled and such that the mark is propagated when the register is moved to another register or propagated. But that is a bigger change so I decided it should be an improvement for later but needed another solution right now.

By the way about making sure the address is not left in a register, I have a question regarding the current stack_protect_set and stack_protect_test patterns and their requirement to have registers cleared afterwards: why is that necessary? Currently not all registers are cleared and the guard is available in the canary before it is overwritten anyway so I don't see how clearing the register adds any extra security. What sort of attack is it protecting against?

Best regards,

Thomas

On 29 April 2018 at 00:33, Segher Boessenkool <seg...@kernel.crashing.org> wrote:
> Hi!
>
> On Sat, Apr 28, 2018 at 12:32:26AM +0100, Thomas Preudhomme wrote:
>> On Arm (Aarch32 and Aarch64) the stack protector's guard is accessed by
>> loading its address first before loading its value from it as part of
>> the stack_protect_set or stack_protect_check insn pattern. This creates
>> the risk of spilling between the two.
>
>> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
>> index deab929..c7ced8f 100644
>> --- a/gcc/cfgexpand.c
>> +++ b/gcc/cfgexpand.c
>> @@ -6156,6 +6156,10 @@ stack_protect_prologue (void)
>>    tree guard_decl = targetm.stack_protect_guard ();
>>    rtx x, y;
>>
>> +  /* Prevent scheduling of instruction(s) between computation of the guard's
>> +     address and setting of the canari to avoid any spill of the guard's
>> +     address if computed outside the setting of the canari.  */
>> +  emit_insn (gen_blockage ());
>>    x = expand_normal (crtl->stack_protect_guard);
>>    if (guard_decl)
>>      y = expand_normal (guard_decl);
>
> [ etc. ]
>
> Why pessimise code for all targets (quite a lot), when it does not even
> fix the problem on Arm completely (or not obviously, anyway)?
>
> Instead, implement stack_protect_* and hide the memory accesses to the
> stored canary value (and make sure its address is not left in a register
> either!)
>
> I doubt this can be done completely target-independent, it will always
> be best effort that way, aka it won't really work.
>
>
> Segher
[ARM] Fix PR85434: spill of stack protector's guard address
On Arm (AArch32 and AArch64) the stack protector's guard is accessed by
loading its address first before loading its value from it as part of
the stack_protect_set or stack_protect_test insn pattern. This creates
the risk of spilling between the two. It is particularly likely on
AArch32 when compiling PIC code because
- computing the address takes several instructions (first compute the
  GOT base and then the GOT entry by adding an offset) which increases
  the likelihood of CSE
- the address computation can be CSEd due to the GOT entry computation
  being a MEM of the GOT base + an UNSPEC offset rather than an UNSPEC
  of a MEM like on AArch64.

This patch addresses both issues by (i) adding some scheduler barriers
around the stack protector code and (ii) making all memory loads
involved in computing the guard's address volatile. The use of volatile
rather than unspec was chosen so that the patterns for computing the
guard address can be the same as for normal global variable access, thus
reusing more code. Finally the patch also improves the documentation to
mention the need to be careful when computing the address of the guard.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2018-04-27  Thomas Preud'homme

    PR target/85434
    * cfgexpand.c (stack_protect_prologue): Emit scheduler barriers around
    stack protector code.
    * function.c (stack_protect_epilogue): Likewise.
    * config/arm/arm-protos.h (arm_stack_chk_guard_decl_p): Declare.
    * config/arm/arm.md (calculate_pic_address): Mark memory volatile if
    it is computing address of stack protector's guard.
    (calculate_pic_address splitter): Likewise.
    * config/arm/arm.c (require_pic_register): Add parameter to control
    whether to insert instruction at the end of the instruction stream.
    (legitimize_pic_address): Force computing PIC address at the end of
    instruction stream and adapt logic to change in calculate_pic_address
    insn pattern.
    (arm_stack_chk_guard_decl_p): New function.
    (arm_emit_call_insn): Adapt to change in require_pic_register().
    * target.def (TARGET_STACK_PROTECT_GUARD): Document requirement on
    guard's address computation to be careful about not spilling.
    * doc/tm.texi: Regenerate.

*** gcc/testsuite/ChangeLog ***

2018-04-27  Thomas Preud'homme

    PR target/85434
    * gcc.target/arm/pr85434.c: New testcase.

Testing: The code has been bootstrapped on an Armv8-A machine targeting:
- AArch32 ARM mode with -mfpu=neon-vfpv4 and hardfloat ABI
- AArch64

The testsuite has been run for the following sets of flags:
- arm-eabi-aem/-mthumb/-march=armv4t
- arm-eabi-aem/-marm/-march=armv7-a/-mfpu=vfpv3-d16/-mfloat-abi=softfp
- arm-eabi-aem/-mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard

(thereby testing the code for ARM, Thumb-2 and Thumb-1 mode) without any
regression.

Is it ok for trunk?

Best regards,

Thomas

From 76c48e31130f212721addeeca830477e3b6f5e10 Mon Sep 17 00:00:00 2001
From: Thomas Preud'homme
Date: Mon, 23 Apr 2018 14:37:11 +0100
Subject: [PATCH] [ARM] Fix PR85434: spill of stack protector's guard address
Re: [PATCH, GCC/ARM] Fix PR85261: ICE with FPSCR setter builtin
Hi Kyrill, On 11/04/18 10:02, Kyrill Tkachov wrote: Hi Thomas, On 09/04/18 15:29, Thomas Preudhomme wrote: Hi Ramana, On 06/04/18 17:17, Thomas Preudhomme wrote: > > > On 06/04/18 17:08, Ramana Radhakrishnan wrote: >> On 06/04/2018 16:54, Thomas Preudhomme wrote: >>> Instruction pattern for setting the FPSCR expects the input value to be >>> in a register. However, __builtin_arm_set_fpscr expander does not ensure >>> that this is the case and as a result GCC ICEs when the builtin is >>> called with a constant literal. >>> >>> This commit fixes the builtin to force the input value into a register. >>> It also remove the unneeded volatile in the existing fpscr test and >>> fixes the function prototype. >>> >>> ChangeLog entries are as follows: >>> >>> *** gcc/ChangeLog *** >>> >>> 2018-04-06 Thomas Preud'homme <thomas.preudho...@arm.com> >>> >>> PR target/85261 >>> * config/arm/arm-builtins.c (arm_expand_builtin): Force input operand >>> into register. >>> >>> *** gcc/testsuite/ChangeLog *** >>> >>> 2018-04-06 Thomas Preud'homme <thomas.preudho...@arm.com> >>> >>> PR target/85261 >>> * gcc.target/arm/fpscr.c: Add call to __builtin_arm_set_fpscr with >>> literal value. Expect 2 MCR instruction. Fix function prototype. >>> Remove volatile keyword. >>> >>> Testing: Built an arm-none-eabi GCC cross-compiler and testsuite shows >>> no regression. >>> >>> Is this ok for stage4? >>> >>> Best regards, >>> >>> Thomas >>> >> >> (sorry about the duplicate for those who get it) >> >> >> LGTM, though in this case I would prefer a bootstrap and regression run >> as this is automatically exercised most with gcc.dg/atomic_*.c and you >> really need this tested on linux than just bare-metal as I'm not sure >> how this gets tested on arm-none-eabi. > > Oh it is indeed. Didn't realized it was used anywhere. Will start bootstrap > right away. 
Done with --with-arch=armv8-a --with-mode=thumb --with-fpu=neon-vfpv4 --with-float=hard --enable-languages=c,c++,fortran --with-system-zlib --enable-plugins --enable-bootstrap. Testsuite for that GCC does not show any regression either. Ok to commit? Thanks for doing this. This is ok for trunk. > >> >> What about earlier branches, have you looked ? This is a silly target >> bug and fixes should go back to older branches in this particular case >> after baking this on trunk for some time. > > GCC 6 and 7 are affected as well and a backport will be done once it has baked > long enough of course. Will now bootstrap and regtest against GCC 6 and 7. Will let you know once that is finished. Backports show no regression on a bootstrapped arm-none-linux-gnueabihf GCC 6 & 7. Ok to commit those? Best regards, Thomas
Re: [PATCH, GCC/ARM] Fix PR85203: cmse_nonsecure_caller returns wrong result
Hi Kyrill, One week went by so I've committed the change to GCC 7 as announced. Best regards, Thomas On 05/04/18 16:36, Kyrill Tkachov wrote: On 05/04/18 16:13, Thomas Preudhomme wrote: Hi Kyrill, On 04/04/18 18:20, Thomas Preudhomme wrote: Hi Kyrill, On 04/04/18 18:19, Kyrill Tkachov wrote: Hi Thomas, On 04/04/18 18:03, Thomas Preudhomme wrote: Hi, __builtin_cmse_nonsecure_caller implementation returns true in almost all cases due to 2 separate bugs: * gen_addsi is used instead of gen_andsi to retrieve the lsb * the lsb boolean value is not negated but the specification [1] says the intrinsic should return true for a nonsecure caller and a nonsecure caller is characterized with LR's lsb being 0 This was not caught due to (1) lack of runtime test and (2) the existing RTL scan not taking into account that '.' matches newline in Tcl regular expressions. This patch fixes the implementation issues and improves testing of cmse_nonsecure_caller by (1) adding a runtime test for the secure caller case and (2) looking for an SET insn of an AND expression in the right function. This leaves the nonsecure caller case only partly tested since the exact value being AND and the negation are not covered by the scan and the existing test infrastructure does not allow 2 separate compilation and link to be performed. It is enough though to catch the current incorrect behavior. The patch also reorganize the scan directives in cmse-1.c to more easily identify what function they are intended to test in the file. ChangeLog entry is as follows: *** gcc/ChangeLog *** 2018-04-04 Thomas Preud'homme <thomas.preudho...@arm.com> PR target/85203 * config/arm/arm-builtins.c (arm_expand_builtin): Change expansion to perform a bitwise AND of the argument followed by a boolean negation of the result. 
*** gcc/testsuite/ChangeLog *** 2018-04-04 Thomas Preud'homme <thomas.preudho...@arm.com> PR target/85203 * gcc.target/arm/cmse/cmse-1.c: Tighten cmse_nonsecure_caller RTL scan to match a single insn of the baz function. Move scan directives at the end of the file below the functions they are trying to test for better readability. * gcc.target/arm/cmse/cmse-16.c: New testcase. Testing: No bootstrap since only M profile builtin code has been changed but regression testing for arm-none-eabi targeting Arm Cortex-M23 and Cortex-M33 shows no regression. Is this ok for stage4? Ok, thanks for fixing this. Does this need backporting to the branches? Yes to gcc-7-branch only. The patch applies cleanly on gcc-7-branch and the same testing shows no regression. Ok to apply to gcc-7-branch once the patch has baked for 7 days in trunk? Yes, thanks. Kyrill Best regards, Thomas
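The two bugs described in this thread can be sketched in portable C. This is only a model of the arithmetic, not the arm_expand_builtin code: the function names are hypothetical, lr stands for the value of the link register, and the mask value 1 is an assumption drawn from the description "retrieve the lsb".

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of the old expansion: an addition where a bitwise AND was meant,
   and no negation of the result. The sum is nonzero for every lr except
   0xFFFFFFFF, hence "true in almost all cases". */
bool nonsecure_caller_buggy(uint32_t lr)
{
  return (lr + 1u) != 0;
}

/* Model of the fixed expansion: isolate the lsb with AND, then negate,
   because a nonsecure caller is the one whose LR has its lsb equal to 0. */
bool nonsecure_caller_fixed(uint32_t lr)
{
  return (lr & 1u) == 0;
}
```

With an LR value whose lsb is set (a secure caller), the buggy model still answers true while the fixed one answers false, which is the case the new runtime test for the secure caller is meant to catch.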
Re: [PATCH, GCC/ARM] Fix PR85261: ICE with FPSCR setter builtin
Hi Ramana, On 06/04/18 17:17, Thomas Preudhomme wrote: On 06/04/18 17:08, Ramana Radhakrishnan wrote: On 06/04/2018 16:54, Thomas Preudhomme wrote: Instruction pattern for setting the FPSCR expects the input value to be in a register. However, __builtin_arm_set_fpscr expander does not ensure that this is the case and as a result GCC ICEs when the builtin is called with a constant literal. This commit fixes the builtin to force the input value into a register. It also remove the unneeded volatile in the existing fpscr test and fixes the function prototype. ChangeLog entries are as follows: *** gcc/ChangeLog *** 2018-04-06 Thomas Preud'homme <thomas.preudho...@arm.com> PR target/85261 * config/arm/arm-builtins.c (arm_expand_builtin): Force input operand into register. *** gcc/testsuite/ChangeLog *** 2018-04-06 Thomas Preud'homme <thomas.preudho...@arm.com> PR target/85261 * gcc.target/arm/fpscr.c: Add call to __builtin_arm_set_fpscr with literal value. Expect 2 MCR instruction. Fix function prototype. Remove volatile keyword. Testing: Built an arm-none-eabi GCC cross-compiler and testsuite shows no regression. Is this ok for stage4? Best regards, Thomas (sorry about the duplicate for those who get it) LGTM, though in this case I would prefer a bootstrap and regression run as this is automatically exercised most with gcc.dg/atomic_*.c and you really need this tested on linux than just bare-metal as I'm not sure how this gets tested on arm-none-eabi. Oh it is indeed. Didn't realized it was used anywhere. Will start bootstrap right away. Done with --with-arch=armv8-a --with-mode=thumb --with-fpu=neon-vfpv4 --with-float=hard --enable-languages=c,c++,fortran --with-system-zlib --enable-plugins --enable-bootstrap. Testsuite for that GCC does not show any regression either. Ok to commit? What about earlier branches, have you looked ? This is a silly target bug and fixes should go back to older branches in this particular case after baking this on trunk for some time. 
GCC 6 and 7 are affected as well and a backport will be done once it has baked long enough of course. Will now bootstrap and regtest against GCC 6 and 7. Will let you know once that is finished. Best regards, Thomas
Re: [PATCH, GCC/ARM] Fix PR85261: ICE with FPSCR setter builtin
On 06/04/18 17:08, Ramana Radhakrishnan wrote:
> On 06/04/2018 16:54, Thomas Preudhomme wrote:
>> Instruction pattern for setting the FPSCR expects the input value to be
>> in a register. However, __builtin_arm_set_fpscr expander does not ensure
>> that this is the case and as a result GCC ICEs when the builtin is
>> called with a constant literal.
>>
>> This commit fixes the builtin to force the input value into a register.
>> It also removes the unneeded volatile in the existing fpscr test and
>> fixes the function prototype.
>>
>> ChangeLog entries are as follows:
>>
>> *** gcc/ChangeLog ***
>>
>> 2018-04-06  Thomas Preud'homme  <thomas.preudho...@arm.com>
>>
>>     PR target/85261
>>     * config/arm/arm-builtins.c (arm_expand_builtin): Force input operand
>>     into register.
>>
>> *** gcc/testsuite/ChangeLog ***
>>
>> 2018-04-06  Thomas Preud'homme  <thomas.preudho...@arm.com>
>>
>>     PR target/85261
>>     * gcc.target/arm/fpscr.c: Add call to __builtin_arm_set_fpscr with
>>     literal value. Expect 2 MCR instructions. Fix function prototype.
>>     Remove volatile keyword.
>>
>> Testing: Built an arm-none-eabi GCC cross-compiler and testsuite shows
>> no regression.
>>
>> Is this ok for stage4?
>>
>> Best regards,
>>
>> Thomas
>>
>
> (sorry about the duplicate for those who get it)
>
> LGTM, though in this case I would prefer a bootstrap and regression run
> as this is automatically exercised most with gcc.dg/atomic_*.c and you
> really need this tested on Linux rather than just bare-metal as I'm not
> sure how this gets tested on arm-none-eabi.

Oh it is indeed. Didn't realize it was used anywhere. Will start bootstrap
right away.

> What about earlier branches, have you looked? This is a silly target
> bug and fixes should go back to older branches in this particular case
> after baking this on trunk for some time.

GCC 6 and 7 are affected as well and a backport will be done once it has baked
long enough of course.

Best regards,

Thomas
[PATCH, GCC/ARM] Fix PR85261: ICE with FPSCR setter builtin
Instruction pattern for setting the FPSCR expects the input value to be
in a register. However, __builtin_arm_set_fpscr expander does not ensure
that this is the case and as a result GCC ICEs when the builtin is
called with a constant literal.

This commit fixes the builtin to force the input value into a register.
It also removes the unneeded volatile in the existing fpscr test and
fixes the function prototype.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-04-06  Thomas Preud'homme

    PR target/85261
    * config/arm/arm-builtins.c (arm_expand_builtin): Force input operand
    into register.

*** gcc/testsuite/ChangeLog ***

2018-04-06  Thomas Preud'homme

    PR target/85261
    * gcc.target/arm/fpscr.c: Add call to __builtin_arm_set_fpscr with
    literal value. Expect 2 MCR instructions. Fix function prototype.
    Remove volatile keyword.

Testing: Built an arm-none-eabi GCC cross-compiler and testsuite shows
no regression.

Is this ok for stage4?

Best regards,

Thomas

diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 8940d1f6311bccf86664ab2eaa938735eec595f6..e100d933a77c5de4a13cb961d1bff40f57f2ea80 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -2592,7 +2592,7 @@ arm_expand_builtin (tree exp,
 	  icode = CODE_FOR_set_fpscr;
 	  arg0 = CALL_EXPR_ARG (exp, 0);
 	  op0 = expand_normal (arg0);
-	  pat = GEN_FCN (icode) (op0);
+	  pat = GEN_FCN (icode) (force_reg (SImode, op0));
 	}
       emit_insn (pat);
       return target;
diff --git a/gcc/testsuite/gcc.target/arm/fpscr.c b/gcc/testsuite/gcc.target/arm/fpscr.c
index 7b4d71d72d8964f6da0d0604bf59aeb4a895df43..4c3eaf7fcf75ad8582071ecb110fd1e4976a3b24 100644
--- a/gcc/testsuite/gcc.target/arm/fpscr.c
+++ b/gcc/testsuite/gcc.target/arm/fpscr.c
@@ -6,11 +6,14 @@
 /* { dg-add-options arm_fp } */
 
 void
-test_fpscr ()
+test_fpscr (void)
 {
-  volatile unsigned int status = __builtin_arm_get_fpscr ();
+  unsigned status;
+
+  __builtin_arm_set_fpscr (0);
+  status = __builtin_arm_get_fpscr ();
   __builtin_arm_set_fpscr (status);
 }
 
 /* { dg-final { scan-assembler "mrc\tp10, 7, r\[0-9\]+, cr1, cr0, 0" } } */
-/* { dg-final { scan-assembler "mcr\tp10, 7, r\[0-9\]+, cr1, cr0, 0" } } */
+/* { dg-final { scan-assembler-times "mcr\tp10, 7, r\[0-9\]+, cr1, cr0, 0" 2 } } */
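For readers who want to try the call shape that used to ICE outside the testsuite, here is a small sketch. The two builtins are the ones named in the patch; everything else is an assumption made for illustration: the FPSCR_READ and FPSCR_WRITE macros, the clear_then_restore_fpscr function, the preprocessor guard used to detect a 32-bit Arm target with floating-point hardware, and the host fallback that replaces FPSCR with a plain variable so the file still builds elsewhere.

```c
#if defined(__GNUC__) && defined(__arm__) && defined(__ARM_FP)
/* 32-bit Arm target with FP hardware: use the builtins from the patch.
   Macros are used so that a literal argument reaches the builtin as is. */
#define FPSCR_READ()   __builtin_arm_get_fpscr ()
#define FPSCR_WRITE(v) __builtin_arm_set_fpscr (v)
#else
/* Host fallback (illustration only): a variable stands in for FPSCR. */
static unsigned fpscr_model;
#define FPSCR_READ()   (fpscr_model)
#define FPSCR_WRITE(v) ((void) (fpscr_model = (v)))
#endif

/* Write a literal 0 to FPSCR, read it back, then restore the old value.
   Returns what was read back after the literal write. */
unsigned
clear_then_restore_fpscr (void)
{
  unsigned saved = FPSCR_READ ();
  unsigned cleared;

  FPSCR_WRITE (0);        /* constant operand: the case that used to ICE */
  cleared = FPSCR_READ ();
  FPSCR_WRITE (saved);    /* register operand: worked before the fix too */
  return cleared;
}
```

On an affected compiler the literal write is the statement expected to trigger the ICE; with the force_reg change both writes should expand to an MCR, which is what the updated scan-assembler-times directive counts.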
Re: [PATCH, GCC/ARM] Fix PR85203: cmse_nonsecure_caller returns wrong result
Hi Kyrill,

On 04/04/18 18:20, Thomas Preudhomme wrote:
> Hi Kyrill,
>
> On 04/04/18 18:19, Kyrill Tkachov wrote:
>> Hi Thomas,
>>
>> On 04/04/18 18:03, Thomas Preudhomme wrote:
>>> Hi,
>>>
>>> The __builtin_cmse_nonsecure_caller implementation returns true in
>>> almost all cases due to 2 separate bugs:
>>>
>>> * gen_addsi3 is used instead of gen_andsi3 to retrieve the lsb
>>> * the lsb boolean value is not negated, but the specification [1]
>>>   says the intrinsic should return true for a nonsecure caller, and
>>>   a nonsecure caller is characterized by LR's lsb being 0
>>>
>>> This was not caught due to (1) the lack of a runtime test and (2) the
>>> existing RTL scan not taking into account that '.' matches newline in
>>> Tcl regular expressions.
>>>
>>> This patch fixes the implementation issues and improves testing of
>>> cmse_nonsecure_caller by (1) adding a runtime test for the secure
>>> caller case and (2) looking for a SET insn of an AND expression in
>>> the right function. This leaves the nonsecure caller case only partly
>>> tested, since the exact value being ANDed and the negation are not
>>> covered by the scan and the existing test infrastructure does not
>>> allow two separate compilations and links to be performed. It is
>>> enough, though, to catch the current incorrect behavior.
>>>
>>> The patch also reorganizes the scan directives in cmse-1.c to make it
>>> easier to identify which function each of them is intended to test.
>>>
>>> ChangeLog entries are as follows:
>>>
>>> *** gcc/ChangeLog ***
>>>
>>> 2018-04-04  Thomas Preud'homme  <thomas.preudho...@arm.com>
>>>
>>> 	PR target/85203
>>> 	* config/arm/arm-builtins.c (arm_expand_builtin): Change
>>> 	expansion to perform a bitwise AND of the argument followed by a
>>> 	boolean negation of the result.
>>>
>>> *** gcc/testsuite/ChangeLog ***
>>>
>>> 2018-04-04  Thomas Preud'homme  <thomas.preudho...@arm.com>
>>>
>>> 	PR target/85203
>>> 	* gcc.target/arm/cmse/cmse-1.c: Tighten cmse_nonsecure_caller RTL
>>> 	scan to match a single insn of the baz function.  Move scan
>>> 	directives at the end of the file below the functions they are
>>> 	trying to test for better readability.
>>> 	* gcc.target/arm/cmse/cmse-16.c: New testcase.
>>>
>>> Testing: No bootstrap since only M profile builtin code has been
>>> changed, but regression testing for arm-none-eabi targeting Arm
>>> Cortex-M23 and Cortex-M33 shows no regression.
>>>
>>> Is this ok for stage4?
>>
>> Ok, thanks for fixing this.
>> Does this need backporting to the branches?
>
> Yes to gcc-7-branch only.

The patch applies cleanly on gcc-7-branch and the same testing shows no regression. Ok to apply to gcc-7-branch once the patch has baked for 7 days in trunk?

Best regards,

Thomas
Re: [PATCH, GCC/ARM] Fix PR85203: cmse_nonsecure_caller returns wrong result
Hi Kyrill,

On 04/04/18 18:19, Kyrill Tkachov wrote:
> Hi Thomas,
>
> On 04/04/18 18:03, Thomas Preudhomme wrote:
>> Hi,
>>
>> The __builtin_cmse_nonsecure_caller implementation returns true in
>> almost all cases due to 2 separate bugs:
>>
>> * gen_addsi3 is used instead of gen_andsi3 to retrieve the lsb
>> * the lsb boolean value is not negated, but the specification [1]
>>   says the intrinsic should return true for a nonsecure caller, and
>>   a nonsecure caller is characterized by LR's lsb being 0
>>
>> This was not caught due to (1) the lack of a runtime test and (2) the
>> existing RTL scan not taking into account that '.' matches newline in
>> Tcl regular expressions.
>>
>> This patch fixes the implementation issues and improves testing of
>> cmse_nonsecure_caller by (1) adding a runtime test for the secure
>> caller case and (2) looking for a SET insn of an AND expression in
>> the right function. This leaves the nonsecure caller case only partly
>> tested, since the exact value being ANDed and the negation are not
>> covered by the scan and the existing test infrastructure does not
>> allow two separate compilations and links to be performed. It is
>> enough, though, to catch the current incorrect behavior.
>>
>> The patch also reorganizes the scan directives in cmse-1.c to make it
>> easier to identify which function each of them is intended to test.
>>
>> ChangeLog entries are as follows:
>>
>> *** gcc/ChangeLog ***
>>
>> 2018-04-04  Thomas Preud'homme  <thomas.preudho...@arm.com>
>>
>> 	PR target/85203
>> 	* config/arm/arm-builtins.c (arm_expand_builtin): Change
>> 	expansion to perform a bitwise AND of the argument followed by a
>> 	boolean negation of the result.
>>
>> *** gcc/testsuite/ChangeLog ***
>>
>> 2018-04-04  Thomas Preud'homme  <thomas.preudho...@arm.com>
>>
>> 	PR target/85203
>> 	* gcc.target/arm/cmse/cmse-1.c: Tighten cmse_nonsecure_caller RTL
>> 	scan to match a single insn of the baz function.  Move scan
>> 	directives at the end of the file below the functions they are
>> 	trying to test for better readability.
>> 	* gcc.target/arm/cmse/cmse-16.c: New testcase.
>>
>> Testing: No bootstrap since only M profile builtin code has been
>> changed, but regression testing for arm-none-eabi targeting Arm
>> Cortex-M23 and Cortex-M33 shows no regression.
>>
>> Is this ok for stage4?
>
> Ok, thanks for fixing this.
> Does this need backporting to the branches?

Yes to gcc-7-branch only.

Best regards,

Thomas
Re: [PATCH, GCC/ARM] Fix PR85203: cmse_nonsecure_caller returns wrong result
Oops, forgot the link.

On 04/04/18 18:03, Thomas Preudhomme wrote:
> Hi,
>
> The __builtin_cmse_nonsecure_caller implementation returns true in
> almost all cases due to 2 separate bugs:
>
> * gen_addsi3 is used instead of gen_andsi3 to retrieve the lsb
> * the lsb boolean value is not negated, but the specification [1]
>   says the intrinsic should return true for a nonsecure caller, and
>   a nonsecure caller is characterized by LR's lsb being 0

[1] https://static.docs.arm.com/ecm0359818/10/ECM0359818_armv8m_security_extensions_reqs_on_dev_tools_1_0.pdf

Best regards,

Thomas

> This was not caught due to (1) the lack of a runtime test and (2) the
> existing RTL scan not taking into account that '.' matches newline in
> Tcl regular expressions.
>
> This patch fixes the implementation issues and improves testing of
> cmse_nonsecure_caller by (1) adding a runtime test for the secure
> caller case and (2) looking for a SET insn of an AND expression in
> the right function. This leaves the nonsecure caller case only partly
> tested, since the exact value being ANDed and the negation are not
> covered by the scan and the existing test infrastructure does not
> allow two separate compilations and links to be performed. It is
> enough, though, to catch the current incorrect behavior.
>
> The patch also reorganizes the scan directives in cmse-1.c to make it
> easier to identify which function each of them is intended to test.
>
> ChangeLog entries are as follows:
>
> *** gcc/ChangeLog ***
>
> 2018-04-04  Thomas Preud'homme  <thomas.preudho...@arm.com>
>
> 	PR target/85203
> 	* config/arm/arm-builtins.c (arm_expand_builtin): Change
> 	expansion to perform a bitwise AND of the argument followed by a
> 	boolean negation of the result.
>
> *** gcc/testsuite/ChangeLog ***
>
> 2018-04-04  Thomas Preud'homme  <thomas.preudho...@arm.com>
>
> 	PR target/85203
> 	* gcc.target/arm/cmse/cmse-1.c: Tighten cmse_nonsecure_caller RTL
> 	scan to match a single insn of the baz function.  Move scan
> 	directives at the end of the file below the functions they are
> 	trying to test for better readability.
> 	* gcc.target/arm/cmse/cmse-16.c: New testcase.
>
> Testing: No bootstrap since only M profile builtin code has been
> changed, but regression testing for arm-none-eabi targeting Arm
> Cortex-M23 and Cortex-M33 shows no regression.
>
> Is this ok for stage4?
>
> Best regards,
>
> Thomas
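The point about '.' matching newline in Tcl regular expressions can be illustrated with a small sketch. It is written in Python, where `re.DOTALL` is used only to imitate the Tcl behaviour described in the message; the dump text is made up for illustration and is not real GCC output, and the anchored pattern is a simplified stand-in for the idea behind the new directive, not the directive itself.

```python
import re

# Hypothetical, heavily simplified RTL dump: the AND lives in function
# "foo", while "baz" (the function under test) only contains a PLUS.
dump = """
;; Function foo
(insn 5 (set (reg:SI 110) (and:SI (reg:SI 111) (reg:SI 112))))
(insn 6 (set (reg:SI 113) (const_int 1)))

;; Function baz
(insn 7 (set (reg:SI 114) (plus:SI (reg:SI 115) (const_int 1))))
"""

old_pattern = r"and.*reg.*const_int 1"

# With '.' matching newline (imitated here with DOTALL) the pattern
# stretches across insns, so the scan passes although baz has no AND.
loose = re.search(old_pattern, dump, re.DOTALL) is not None

# If '.' stopped at newlines, the same pattern would not match at all.
strict = re.search(old_pattern, dump) is not None

# Anchoring on the function header and using negated character classes
# keeps the match inside baz, so the missing AND is detected.
anchored = re.search(
    r";; Function baz\n[^;]*\(and:SI \(reg:SI [^)]+\) \(const_int 1\)",
    dump,
) is not None

print(loose, strict, anchored)
```

In this made-up dump the loose scan succeeds for the wrong reason, which is the kind of false PASS the message describes.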
[PATCH, GCC/ARM] Fix PR85203: cmse_nonsecure_caller returns wrong result
Hi,

The __builtin_cmse_nonsecure_caller implementation returns true in almost all cases due to 2 separate bugs:

* gen_addsi3 is used instead of gen_andsi3 to retrieve the lsb
* the lsb boolean value is not negated, but the specification [1] says the intrinsic should return true for a nonsecure caller, and a nonsecure caller is characterized by LR's lsb being 0

This was not caught due to (1) the lack of a runtime test and (2) the existing RTL scan not taking into account that '.' matches newline in Tcl regular expressions.

This patch fixes the implementation issues and improves testing of cmse_nonsecure_caller by (1) adding a runtime test for the secure caller case and (2) looking for a SET insn of an AND expression in the right function. This leaves the nonsecure caller case only partly tested, since the exact value being ANDed and the negation are not covered by the scan and the existing test infrastructure does not allow two separate compilations and links to be performed. It is enough, though, to catch the current incorrect behavior.

The patch also reorganizes the scan directives in cmse-1.c to make it easier to identify which function each of them is intended to test.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-04-04  Thomas Preud'homme

	PR target/85203
	* config/arm/arm-builtins.c (arm_expand_builtin): Change
	expansion to perform a bitwise AND of the argument followed by a
	boolean negation of the result.

*** gcc/testsuite/ChangeLog ***

2018-04-04  Thomas Preud'homme

	PR target/85203
	* gcc.target/arm/cmse/cmse-1.c: Tighten cmse_nonsecure_caller RTL
	scan to match a single insn of the baz function.  Move scan
	directives at the end of the file below the functions they are
	trying to test for better readability.
	* gcc.target/arm/cmse/cmse-16.c: New testcase.

Testing: No bootstrap since only M profile builtin code has been changed, but regression testing for arm-none-eabi targeting Arm Cortex-M23 and Cortex-M33 shows no regression.

Is this ok for stage4?
Best regards,

Thomas

diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 8940d1f6311bccf86664ab2eaa938735eec595f6..184eb2a934308717b6e1054e376487a297f8d5de 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -2600,7 +2600,9 @@ arm_expand_builtin (tree exp,
     case ARM_BUILTIN_CMSE_NONSECURE_CALLER:
       target = gen_reg_rtx (SImode);
       op0 = arm_return_addr (0, NULL_RTX);
-      emit_insn (gen_addsi3 (target, op0, const1_rtx));
+      emit_insn (gen_andsi3 (target, op0, const1_rtx));
+      op1 = gen_rtx_EQ (SImode, target, const0_rtx);
+      emit_insn (gen_cstoresi4 (target, op1, target, const0_rtx));
       return target;

     case ARM_BUILTIN_TEXTRMSB:
diff --git a/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c b/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c
index c13272eed683aa06db027cd4646e5fe67817212b..f764153cb17b796ccd0d20abb78d5cf56be52911 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c
@@ -71,6 +71,20 @@ baz (void)
 {
   return cmse_nonsecure_caller ();
 }
+/* { dg-final { scan-assembler "baz:" } } */
+/* { dg-final { scan-assembler "__acle_se_baz:" } } */
+/* { dg-final { scan-assembler-not "\tcmse_nonsecure_caller" } } */
+/* Look for an andsi of 1 with a register in function baz, ie.
+
+;; Function baz
+
+(insn (set (reg:SI )
+ (and:SI (reg:SI )
+ (const_int 1 )
+ >
+(insn
+*/
+/* { dg-final { scan-rtl-dump "\n;; Function baz\[^\n\]*\[^(\]+\[^;\]*\n\\(insn \[^(\]+ \\(set \\(reg\[^:\]*:SI \[^)\]+\\)\[^(\]*\\(and:SI \\(reg\[^:\]*:SI \[^)\]+\\)\[^(\]*\\((const_int 1|reg\[^:\]*:SI) \[^)\]+\\)\[^(\]+(\\(nil\\)\[^(\]+)?\\(insn" expand } } */

 typedef int __attribute__ ((cmse_nonsecure_call)) (int_nsfunc_t) (void);

@@ -86,6 +100,11 @@ qux (int_nsfunc_t * callback)
 {
   fp = cmse_nsfptr_create (callback);
 }
+/* { dg-final { scan-assembler "qux:" } } */
+/* { dg-final { scan-assembler "__acle_se_qux:" } } */
+/* { dg-final { scan-assembler "bic" } } */
+/* { dg-final { scan-assembler "push\t\{r4, r5, r6" } } */
+/* { dg-final { scan-assembler "msr\tAPSR_nzcvq" } } */

 int call_callback (void)
 {
@@ -94,13 +113,4 @@ int call_callback (void)
   else
     return default_callback ();
 }
-/* { dg-final { scan-assembler "baz:" } } */
-/* { dg-final { scan-assembler "__acle_se_baz:" } } */
-/* { dg-final { scan-assembler "qux:" } } */
-/* { dg-final { scan-assembler "__acle_se_qux:" } } */
-/* { dg-final { scan-assembler-not "\tcmse_nonsecure_caller" } } */
-/* { dg-final { scan-rtl-dump "and.*reg.*const_int 1" expand } } */
-/* { dg-final { scan-assembler "bic" } } */
-/* { dg-final { scan-assembler "push\t\{r4, r5, r6" } } */
-/* { dg-final { scan-assembler "msr\tAPSR_nzcvq" } } */
 /* { dg-final { scan-assembler-times "bl\\s+__gnu_cmse_nonsecure_call" 1 } } */
diff --git
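As a rough illustration of the two bugs fixed in arm_expand_builtin, the following Python sketch models the old and new expansions on 32-bit return addresses. The function names and the sample LR values are hypothetical and only serve the example; the only fact taken from the thread is that a nonsecure caller is identified by LR's lsb being 0.

```python
MASK32 = 0xFFFFFFFF


def nonsecure_caller_buggy(lr: int) -> int:
    """Model of the old expansion: LR + 1, used directly as the result."""
    return (lr + 1) & MASK32


def nonsecure_caller_fixed(lr: int) -> int:
    """Model of the new expansion: (LR & 1) == 0, i.e. AND then negation."""
    return int((lr & 1) == 0)


# Hypothetical return addresses, chosen only for their least significant bit.
secure_lr = 0x10000001      # lsb set: secure caller
nonsecure_lr = 0x10000000   # lsb clear: nonsecure caller

for lr in (secure_lr, nonsecure_lr):
    print(hex(lr), bool(nonsecure_caller_buggy(lr)), bool(nonsecure_caller_fixed(lr)))
```

In this model the buggy variant is truthy for both callers and only yields 0 when LR + 1 wraps around, which matches the "true in almost all cases" observation.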
[arm-embedded][PATCH] Add multilib mapping for -mcpu=cortex-r52
Hi,

Currently -mcpu=cortex-r52 gets assigned the default multilib due to the lack of a mapping from -mcpu=cortex-r52 to an -march option. This is inconsistent with -march=armv8-r, which gets the thumb/v7-ar multilib. This patch adds the appropriate mapping.

ChangeLog entry is as follows:

*** gcc/ChangeLog.arm ***

2018-03-15  Thomas Preud'homme

	* config/arm/t-rmprofile: Add mapping from -mcpu=cortex-r52 to
	-march=armv7.

Testing: With a multilib build, -mcpu=cortex-r52 -print-multi-directory prints "." (i.e. the default multilib) without the patch but prints the expected thumb/v7-ar with the patch.

We've decided to apply this patch to the ARM/embedded-7-branch.

Best regards,

Thomas
[arm-embedded][PATCH] Add multilib mapping for -mcpu=cortex-m33+nodsp
Hi,

Currently -mcpu=cortex-m33+nodsp gets assigned the thumb multilib due to the lack of a mapping from -mcpu=cortex-m33+nodsp to an -march option. This leads to link failures, because Armv4T Thumb code from the multilib gets linked with the Armv8-M Mainline code being compiled. This patch adds the appropriate mapping.

ChangeLog entry is as follows:

*** gcc/ChangeLog.arm ***

2018-03-14  Thomas Preud'homme

	* config/arm/t-rmprofile: Add mapping from -mcpu=cortex-m33+nodsp to
	-march=armv8-m.main.

Testing: With a multilib build, a hello world fails to link without the patch but succeeds with the patch.

We've decided to apply this patch to the ARM/embedded-7-branch.

Best regards,

Thomas

diff --git a/gcc/config/arm/t-rmprofile b/gcc/config/arm/t-rmprofile
index a3a24d59fb29b42a36177bd2d2ebfae4e50e5a10..54411795215b8aff90ba9cfb806ec7b33db4caea 100644
--- a/gcc/config/arm/t-rmprofile
+++ b/gcc/config/arm/t-rmprofile
@@ -102,6 +102,7 @@ MULTILIB_MATCHES += march?armv7e-m=mcpu?cortex-m4
 MULTILIB_MATCHES += march?armv7e-m=mcpu?cortex-m7
 MULTILIB_MATCHES += march?armv8-m.base=mcpu?cortex-m23
 MULTILIB_MATCHES += march?armv8-m.main=mcpu?cortex-m33
+MULTILIB_MATCHES += march?armv8-m.main=mcpu?cortex-m33+nodsp
 MULTILIB_MATCHES += march?armv7=mcpu?cortex-r4
 MULTILIB_MATCHES += march?armv7=mcpu?cortex-r4f
 MULTILIB_MATCHES += march?armv7=mcpu?cortex-r5
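To make the effect of the new MULTILIB_MATCHES line concrete, here is a small Python sketch that reads such entries as "multilib option = equivalent option", with '?' standing for '='. It is a simplified model written for this note, not GCC's actual multilib selection code, and `parse_multilib_matches` is a hypothetical helper.

```python
def parse_multilib_matches(lines):
    """Map each equivalent option to the multilib option it stands for."""
    aliases = {}
    for line in lines:
        # Keep what follows "MULTILIB_MATCHES +=".
        _, _, entry = line.partition("+=")
        # Each entry reads "<multilib option>=<equivalent option>", with
        # '?' used in place of '=' inside the options themselves.
        canonical, alias = entry.strip().split("=", 1)
        aliases[alias.replace("?", "=")] = canonical.replace("?", "=")
    return aliases


before_patch = [
    "MULTILIB_MATCHES += march?armv8-m.base=mcpu?cortex-m23",
    "MULTILIB_MATCHES += march?armv8-m.main=mcpu?cortex-m33",
]
after_patch = before_patch + [
    "MULTILIB_MATCHES += march?armv8-m.main=mcpu?cortex-m33+nodsp",
]

option = "mcpu=cortex-m33+nodsp"
# Without a mapping there is nothing to select on, hence None here.
print(parse_multilib_matches(before_patch).get(option))
print(parse_multilib_matches(after_patch).get(option))
```

In this model the option has no entry before the patch, which corresponds to the fallback multilib described in the message, and resolves to march=armv8-m.main afterwards.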
[PATCH, GCC/testsuite] Fix FAIL display for some scan-*-times directives
Hi,

The scan-assembler-times and scan-tree-dump-times dejagnu directives show a different output in the summary files depending on whether they PASS or FAIL. This means that dg-cmp-results does not show a regression, because it does not see a connection between the two outputs. The difference comes from the FAIL showing the actual number of times the pattern was matched, presumably to help debugging.

This patch moves the information about the actual number of matches into a separate verbose message. This keeps the test message unchanged between PASS and FAIL but lets developers get the debug message with -v.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2018-03-13  Thomas Preud'homme

	* lib/scanasm.exp (scan-assembler-times): Move FAIL debug info into a
	separate verbose message.
	* lib/scandump.exp (scan-dump-times): Likewise.

Testing: Made modified versions of gcc.dg/nand.c and gcc.dg/torture/pr61772.c to FAIL their scan-assembler-times and scan-tree-dump-times directives respectively. Without the patch dg-cmp-results does not flag any regression, but it does with the patch.

Is this ok for stage 4?

Best regards,

Thomas

diff --git a/gcc/testsuite/lib/scanasm.exp b/gcc/testsuite/lib/scanasm.exp
index 3a775b0a812775193cf1181337a5b890cde74133..61e0f3f48aeea5785689c5df7a15dc2ccbc71029 100644
--- a/gcc/testsuite/lib/scanasm.exp
+++ b/gcc/testsuite/lib/scanasm.exp
@@ -266,7 +266,8 @@ proc scan-assembler-times { args } {
     if {$result_count == $times} {
 	pass "$testcase scan-assembler-times $pp_pattern $times"
     } else {
-	fail "$testcase scan-assembler-times $pp_pattern $times (found $result_count times)"
+	verbose -log "$testcase: $pp_pattern found $result_count times"
+	fail "$testcase scan-assembler-times $pp_pattern $times"
     }
 }

diff --git a/gcc/testsuite/lib/scandump.exp b/gcc/testsuite/lib/scandump.exp
index 4e3da972ae4ed09c9874eb384daf825e6e2dcde3..be8fbe8b461dc81d5683fe323c0913f678daa1e0 100644
--- a/gcc/testsuite/lib/scandump.exp
+++ b/gcc/testsuite/lib/scandump.exp
@@ -110,7 +110,8 @@ proc scan-dump-times { args } {
     if {$result_count == $times} {
 	pass "$testname"
     } else {
-	fail "$testname (found $result_count times)"
+	verbose -log "$testcase: pattern found $result_count times"
+	fail "$testname"
     }
 }
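The reason a differing FAIL message hides regressions can be sketched as follows. This Python model assumes that results are compared by the exact text following "PASS: " or "FAIL: "; it is a simplification written for this note rather than the logic of the real dg-cmp-results script, and the test names are made up.

```python
def regressions(before, after):
    """Return the names of tests that went from PASS to FAIL."""

    def parse(lines):
        results = {}
        for line in lines:
            # "PASS: name" or "FAIL: name"; the name is the lookup key.
            status, _, name = line.partition(": ")
            results[name] = status
        return results

    old, new = parse(before), parse(after)
    return sorted(
        name for name, status in new.items()
        if status == "FAIL" and old.get(name) == "PASS"
    )


baseline = ["PASS: gcc.dg/nand.c scan-assembler-times nand 2"]
# FAIL line in the old format, with the match count appended to the name.
fail_old_format = ["FAIL: gcc.dg/nand.c scan-assembler-times nand 2 (found 1 times)"]
# FAIL line in the new format, with the same name as the PASS line.
fail_new_format = ["FAIL: gcc.dg/nand.c scan-assembler-times nand 2"]

print(regressions(baseline, fail_old_format))
print(regressions(baseline, fail_new_format))
```

With the old format the failing test looks like a new, unrelated test name, so no PASS to FAIL transition is reported; with the new format the names line up.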
[PATCH, GCC/testsuite/ARM] Fix copysign_softfloat_1.c option directives
gcc.target/arm/copysign_softfloat_1.c's use of arm_arch_v6t2 in dg-add-options changes the architecture to -march=armv6t2. Since the test only requires a Thumb-2 capable architecture, we just need to add -mthumb on the command line, since arm_thumb2_ok guarantees by definition that doing so is enough to select Thumb-2. This fixes a warning on the command line when having -mcpu=cortex-m3 in RUNTESTFLAGS, for instance.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2018-03-01  Thomas Preud'homme
Re: [PATCH, GCC/testsuite] Fix dump-noaddr dumpbase
Finally committed to gcc-7-branch, sorry for doing this so late. I've merged the two commits into one. Patch attached for reference.

Best regards,

Thomas

On 05/12/17 21:26, Mike Stump wrote:
> On Dec 5, 2017, at 12:56 PM, Thomas Preudhomme <thomas.preudho...@foss.arm.com> wrote:
>>
>> Thanks, I've tested after the two commits and it works both in tree and
>> out of tree. It'll simplify comparing in tree results Vs out of tree
>> for us, thanks a lot!
>>
>> Would you consider a backport to stable branches if nobody complains
>> after a week?
>
> Yeah, back port is Ok.

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index b211dec4ffb20359f50bbc695481977282eb0525..b78c5f59bfc1121cf61071e41bd11551a9ab7122 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,12 @@
+2017-02-27  Thomas Preud'homme  <thomas.preudho...@arm.com>
+
+	Backport from mainline
+	2017-12-05  Matthew Gretton-Dann  <matthew.gretton-d...@arm.com>
+	with follow-up r255433 commit.
+
+	* gcc.c-torture/unsorted/dump-noaddr.x: Generate dump files in
+	tmpdir.
+
 2018-02-26  Carl Love  <c...@us.ibm.com>

 	Backport from mainline: commit 257747 on 2018-02-16.
diff --git a/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x b/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x
index d14d494570944b2be82c2575204cdbf4b15721ca..e86f36a1861fc4dc46bd449d78403f510ec4b920 100644
--- a/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x
+++ b/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x
@@ -9,14 +9,14 @@ proc dump_compare { src options } {

     # loop through all the options
     foreach option $option_list {
-	file delete -force dump1
-	file mkdir dump1
+	file delete -force $tmpdir/dump1
+	file mkdir $tmpdir/dump1
 	c-torture-compile $src "$option $options -dumpbase dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
-	file delete -force dump2
-	file mkdir dump2
+	file delete -force $tmpdir/dump2
+	file mkdir $tmpdir/dump2
 	c-torture-compile $src "$option $options -dumpbase dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
-	foreach dump1 [lsort [glob -nocomplain dump1/*]] {
-	    regsub dump1/ $dump1 dump2/ dump2
+	foreach dump1 [lsort [glob -nocomplain $tmpdir/dump1/*]] {
+	    set dump2 "$tmpdir/dump2/[file tail $dump1]"
 	    set dumptail "gcc.c-torture/unsorted/[file tail $dump1]"
 	    regsub {\.\d+((t|r|i)\.[^.]+)$} $dumptail {.*\1} dumptail
 	    set tmp [ diff "$dump1" "$dump2" ]
@@ -29,8 +29,8 @@ proc dump_compare { src options } {
 	    }
 	}
     }
-file delete -force dump1
-file delete -force dump2
+    file delete -force $tmpdir/dump1
+    file delete -force $tmpdir/dump2
 }

 dump_compare $src $options
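The two path manipulations in dump_compare (pairing each dump1 file with its dump2 counterpart, and replacing the pass number in the reported name) can be mirrored in a short Python sketch. The helper names and the sample path are hypothetical; the regular expression is the one used by the Tcl regsub in the patch.

```python
import posixpath
import re


def paired_dump(dump1_path, tmpdir):
    """Counterpart of: set dump2 "$tmpdir/dump2/[file tail $dump1]"."""
    return posixpath.join(tmpdir, "dump2", posixpath.basename(dump1_path))


def reported_name(dump1_path):
    """Counterpart of the dumptail computation: drop the directory and
    replace the pass number so the reported name is stable across runs."""
    tail = "gcc.c-torture/unsorted/" + posixpath.basename(dump1_path)
    return re.sub(r"\.\d+((t|r|i)\.[^.]+)$", r".*\1", tail)


# Hypothetical dump file name, used only for illustration.
sample = "/tmp/testrun/dump1/dump-noaddr.c.021t.ssa"
print(paired_dump(sample, "/tmp/testrun"))
print(reported_name(sample))
```

One property of building the second path from the file's tail is that the result does not depend on what the directory prefix happens to contain.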
[arm-embedded] Allow -mcpu=cortex-m33+nodsp
Hi,

We decided to apply the following patch to ARM/embedded-7-branch to support -mcpu=cortex-m33+nodsp.

DSP instructions are optional for Arm Cortex-M33, yet its -mcpu option does not allow +nodsp. Users are thus left with using -march=armv8-m.main -mtune=cortex-m33. This patch creates a new CPU, cortex-m33+nodsp, since there is no mechanism in GCC 7 for CPU extensions. Since GCC passes the -mcpu parameter down to GAS verbatim and GAS does not support +nodsp for cortex-m33, this patch also special-cases -mcpu=cortex-m33+nodsp in arm_file_start to output a .arch directive instead of .cpu.

2018-02-26  Thomas Preud'homme

	* config/arm/arm-cpus.in (cortex-m33+nodsp): New CPU.
	* config/arm/arm-cpu-cdata.h: Regenerate.
	* config/arm/arm-cpu-data.h: Likewise.
	* config/arm/arm-cpu.h: Likewise.
	* config/arm/arm-tables.opt: Likewise.
	* config/arm/arm-tune.md: Likewise.
	* config/arm/arm.c (arm_file_start): Special case
	-mcpu=cortex-m33+nodsp to emit .arch armv8-m.main instead.
	* doc/invoke.texi: Document cortex-m33+nodsp as a valid value for
	-mcpu and -mtune.

Testing: Compiled a hello world with -S -mcpu=cortex-m33 and with -S -mcpu=cortex-m33+nodsp and compared both assembly files. The latter correctly emits .arch armv8-m.main instead of .cpu cortex-m33.

Best regards,

Thomas

diff --git a/gcc/ChangeLog.arm b/gcc/ChangeLog.arm
index a98ecb028f6800a516f6cd252390ceac1e08911b..e09bd132d224aee511591143d86efff8bb156d60 100644
--- a/gcc/ChangeLog.arm
+++ b/gcc/ChangeLog.arm
@@ -1,3 +1,9 @@
+2018-02-26  Thomas Preud'homme
+
+	* config/arm/arm-cpus.in (cortex-m33+nodsp): Define.
+	* doc/invoke.texi: Document +nodsp as a valid extension for
+	-mcpu=cortex-m33.
+
 2017-11-23  Thomas Preud'homme

 	Cherry-pick from GCC 7
diff --git a/gcc/config/arm/arm-cpu-cdata.h b/gcc/config/arm/arm-cpu-cdata.h
index 27571c841d928fe9c331006bfc9608c4e75b60d8..f5e34c830ca28196ded0912c230f719a6ff5681e 100644
--- a/gcc/config/arm/arm-cpu-cdata.h
+++ b/gcc/config/arm/arm-cpu-cdata.h
@@ -789,6 +789,13 @@ static const struct arm_arch_core_flag arm_arch_core_flags[] =
     },
   },
   {
+    "cortex-m33+nodsp",
+    {
+      ISA_ARMv8m_main,
+      isa_nobit
+    },
+  },
+  {
     "cortex-r52",
     {
       ISA_ARMv8r,isa_bit_crc32,
diff --git a/gcc/config/arm/arm-cpu-data.h b/gcc/config/arm/arm-cpu-data.h
index e474efa02ed93a93ae00ac2057a9bc841c48b87f..30902ecabc6c72e46e6f6aa1d92b9980fd639dcd 100644
--- a/gcc/config/arm/arm-cpu-data.h
+++ b/gcc/config/arm/arm-cpu-data.h
@@ -1221,6 +1221,17 @@ static const struct processors all_cores[] =
     _v7m_tune
   },
   {
+    "cortex-m33+nodsp",
+    TARGET_CPU_cortexm33nodsp,
+    (TF_LDSCHED),
+    "8M_MAIN", BASE_ARCH_8M_MAIN,
+    {
+      ISA_ARMv8m_main,
+      isa_nobit
+    },
+    _v7m_tune
+  },
+  {
     "cortex-r52",
     TARGET_CPU_cortexr52,
     (TF_LDSCHED),
diff --git a/gcc/config/arm/arm-cpu.h b/gcc/config/arm/arm-cpu.h
index 502965081faa625abc93d97559517baf50972e1b..22566495fdf0da0ad75b81a5956eecb898c38684 100644
--- a/gcc/config/arm/arm-cpu.h
+++ b/gcc/config/arm/arm-cpu.h
@@ -130,6 +130,7 @@ enum processor_type
   TARGET_CPU_cortexa73cortexa53,
   TARGET_CPU_cortexm23,
   TARGET_CPU_cortexm33,
+  TARGET_CPU_cortexm33nodsp,
   TARGET_CPU_cortexr52,
   TARGET_CPU_arm_none
 };
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 5f18dfb35687888bc7f642785693f75658a96733..7368a067db92b384f83fdb4a0af6cb77cff4e6f4 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -1090,6 +1090,13 @@ begin cpu cortex-m33
  costs v7m
 end cpu cortex-m33

+begin cpu cortex-m33+nodsp
+ cname cortexm33nodsp
+ tune flags LDSCHED
+ architecture armv8-m.main
+ costs v7m
+end cpu cortex-m33+nodsp
+
 # V8 R-profile implementations.
 begin cpu cortex-r52
  cname cortexr52
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index ede44f497edd69390bbbe6de5a913430b546c547..a46bc3c7f8ba6048969bae4d37a7be3c5242ce6a 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -349,6 +349,9 @@ EnumValue
 Enum(processor_type) String(cortex-m33) Value( TARGET_CPU_cortexm33)

 EnumValue
+Enum(processor_type) String(cortex-m33+nodsp) Value( TARGET_CPU_cortexm33nodsp)
+
+EnumValue
 Enum(processor_type) String(cortex-r52) Value( TARGET_CPU_cortexr52)

 Enum
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index 519c0556fe76a5a391cd268bb50541c77a4596d4..542b7972d21cd3c9986229e91ce0841522e3b52f 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -57,5 +57,5 @@
 	cortexa73,exynosm1,xgene1,
 	cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,
 	cortexa73cortexa53,cortexm23,cortexm33,
-	cortexr52"
+	cortexm33nodsp,cortexr52"
 	(const (symbol_ref "((enum attr_tune) arm_tune)")))
diff --git a/gcc/config/arm/arm.c
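The arm.c part of the patch is cut off above, so the following Python sketch only models the behaviour described in the message: for the new pseudo-CPU the assembly file should carry a .arch directive, and otherwise the usual .cpu directive. The function name `cpu_directive` is hypothetical and this is not the code from arm_file_start.

```python
def cpu_directive(mcpu: str) -> str:
    """Pick the directive to emit at the top of the assembly file."""
    # GAS does not know "cortex-m33+nodsp", so the architecture is named
    # instead of passing the CPU string through verbatim.
    if mcpu == "cortex-m33+nodsp":
        return ".arch armv8-m.main"
    return ".cpu " + mcpu


for cpu in ("cortex-m33", "cortex-m33+nodsp"):
    print(cpu, "->", cpu_directive(cpu))
```

This corresponds to the comparison made in the Testing paragraph of the message.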
[PATCH, arm-embedded] Multilib mapping for Armv8-R
Hi,

We have decided to apply the following patch to the ARM/embedded-7-branch to provide better multilib selection for Armv8-R targets.

Because there is no multilib mapping for Armv8-R, the default multilib, built for -march=armv4t with softfloat floating-point arithmetic, is being used. This patch maps Armv8-R to the existing Armv7 multilibs instead. Note that the mapping for single-precision Armv8-R has been left out because there is no Arm implementation of that architecture variant.

Changelog entry is as follows:

*** gcc/ChangeLog ***

2018-02-26  Thomas Preud'homme

	* config/arm/t-rmprofile: Map Armv8-R and Armv8-R with CRC extension
	to Armv7 multilibs.

Testing: Ran -print-multi-directory for all combinations of -march=armv8-r/-march=armv8-r+crc with -mfpu=neon-fp-armv8/crypto-neon-fp-armv8. All gave the expected result. Details in appendix.

Is this ok for stage4?

Best regards,

Thomas

Appendix: output of -print-multi-directory for all supported Armv8-R configurations, single-precision FPU excepted.

% for ext in "" +crc; do cmd="arm-none-eabi-gcc -march=armv8-r${ext} -mfloat-abi=soft -print-multi-directory" ; echo -n "$cmd: " ; eval $cmd ; done
arm-none-eabi-gcc -march=armv8-r -mfloat-abi=soft -print-multi-directory: thumb/v7-ar
arm-none-eabi-gcc -march=armv8-r+crc -mfloat-abi=soft -print-multi-directory: thumb/v7-ar

% for ext in "" +crc; do for fpu in neon-fp-armv8 crypto-neon-fp-armv8; do cmd="arm-none-eabi-gcc -march=armv8-r${ext} -mfpu=${fpu} -mfloat-abi=softfp -print-multi-directory" ; echo -n "$cmd: " ; eval $cmd ; done ; done
arm-none-eabi-gcc -march=armv8-r -mfpu=neon-fp-armv8 -mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -march=armv8-r -mfpu=crypto-neon-fp-armv8 -mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -march=armv8-r+crc -mfpu=neon-fp-armv8 -mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -march=armv8-r+crc -mfpu=crypto-neon-fp-armv8 -mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp

% for ext in "" +crc; do for fpu in neon-fp-armv8 crypto-neon-fp-armv8; do cmd="arm-none-eabi-gcc -march=armv8-r${ext} -mfpu=${fpu} -mfloat-abi=hard -print-multi-directory" ; echo -n "$cmd: " ; eval $cmd ; done ; done
arm-none-eabi-gcc -march=armv8-r -mfpu=neon-fp-armv8 -mfloat-abi=hard -print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -march=armv8-r -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard -print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -march=armv8-r+crc -mfpu=neon-fp-armv8 -mfloat-abi=hard -print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -march=armv8-r+crc -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard -print-multi-directory: thumb/v7-ar/fpv3/hard

% for ext in "" +crc; do cmd="arm-none-eabi-gcc -mthumb -march=armv8-r${ext} -mfpu=${fpu} -mfloat-abi=soft -print-multi-directory" ; echo -n "$cmd: " ; eval $cmd ; done
arm-none-eabi-gcc -mthumb -march=armv8-r -mfpu=crypto-neon-fp-armv8 -mfloat-abi=soft -print-multi-directory: .
arm-none-eabi-gcc -mthumb -march=armv8-r+crc -mfpu=crypto-neon-fp-armv8 -mfloat-abi=soft -print-multi-directory: .

% for ext in "" +crc; do for fpu in neon-fp-armv8 crypto-neon-fp-armv8; do cmd="arm-none-eabi-gcc -mthumb -march=armv8-r${ext} -mfpu=${fpu} -mfloat-abi=softfp -print-multi-directory" ; echo -n "$cmd: " ; eval $cmd ; done ; done
arm-none-eabi-gcc -mthumb -march=armv8-r -mfpu=neon-fp-armv8 -mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -mthumb -march=armv8-r -mfpu=crypto-neon-fp-armv8 -mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -mthumb -march=armv8-r+crc -mfpu=neon-fp-armv8 -mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -mthumb -march=armv8-r+crc -mfpu=crypto-neon-fp-armv8 -mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp

% for ext in "" +crc; do for fpu in neon-fp-armv8 crypto-neon-fp-armv8; do cmd="arm-none-eabi-gcc -mthumb -march=armv8-r${ext} -mfpu=${fpu} -mfloat-abi=hard -print-multi-directory" ; echo -n "$cmd: " ; eval $cmd ; done ; done
arm-none-eabi-gcc -mthumb -march=armv8-r -mfpu=neon-fp-armv8 -mfloat-abi=hard -print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -mthumb -march=armv8-r -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard -print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -mthumb -march=armv8-r+crc -mfpu=neon-fp-armv8 -mfloat-abi=hard -print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -mthumb -march=armv8-r+crc -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard -print-multi-directory: thumb/v7-ar/fpv3/hard

diff --git a/gcc/config/arm/t-rmprofile b/gcc/config/arm/t-rmprofile
index d4bc9fde4c5544812bde4743ccc18d68c1c25132..a3a24d59fb29b42a36177bd2d2ebfae4e50e5a10 100644
--- a/gcc/config/arm/t-rmprofile
+++ b/gcc/config/arm/t-rmprofile
@@ -135,6 +135,8 @@
Re: [PATCH, GCC/ARM] Multilib mapping for Armv8-R
tfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+simd+crypto -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+fp.sp+simd+crypto -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+simd+crypto -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp

% for ext in "" +crc +fp.sp +simd +crypto +crc+fp.sp +crc+simd +crc+crypto +fp.sp+simd +fp.sp+crypto +simd+crypto +crc+fp.sp+simd +crc+fp.sp+crypto +crc+simd+crypto +fp.sp+simd+crypto +crc+fp.sp+simd+crypto ; do cmd="arm-none-eabi-gcc -march=armv8-r${ext} -mfloat-abi=hard -print-multi-directory" ; echo -n "$cmd: " ; eval $cmd ; done
arm-none-eabi-gcc -march=armv8-r -mfloat-abi=hard -print-multi-directory: .
arm-none-eabi-gcc -march=armv8-r+crc -mfloat-abi=hard -print-multi-directory: .
arm-none-eabi-gcc -march=armv8-r+fp.sp -mfloat-abi=hard -print-multi-directory: .
arm-none-eabi-gcc -march=armv8-r+simd -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+crypto -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp -mfloat-abi=hard -print-multi-directory: .
arm-none-eabi-gcc -march=armv8-r+crc+simd -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+crc+crypto -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+fp.sp+simd -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+fp.sp+crypto -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+simd+crypto -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+simd -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+crypto -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+crc+simd+crypto -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+fp.sp+simd+crypto -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+simd+crypto -mfloat-abi=hard -print-multi-directory: thumb/v7+fp/hard

On 13/02/18 10:27, Kyrill Tkachov wrote:
> Hi Thomas,
>
> On 13/02/18 10:24, Thomas Preudhomme wrote:
>> Hi,
>>
>> Due to there being no multilib mapping for Armv8-R, default multilib
>> targeting -march=armv4t with softfloat floating-point arithmetic is
>> being used. This patch maps it instead to the existing Armv7 multilibs.
>> Note that since there is no single-precision multilib compatible with
>> R profile, -march=armv8-r+fp.sp is mapped to -march=armv7 ie. Armv7
>> with softfloat floating-point.
>
> Thanks for doing this.
>
>> Changelog entry is as follows:
>>
>> *** gcc/ChangeLog ***
>>
>> 2018-02-12  Thomas Preud'homme  <thomas.preudho...@arm.com>
>>
>> 	* config/arm/t-multilib: Map Armv8-R to Armv7 multilibs.
>>
>> Testing: Ran -print-multi-directory for all combinations of extensions
>> one can pass to -march=armv8-r (including no extension but only
>> considering a single ordering of extension). All gave the expected
>> result. Details in appendix.
>>
>> Is this ok for stage4?
>>
>> Best regards,
>>
>> Thomas
>>
>> Appendix: output of -print-multi-directory for all extensions
>> available to -march=armv8-r
>
> Can you please add a representative subset of these as tests to
> gcc.target/arm/multilib.exp.
> That way we can have the peace of mind that they have sane mappings as
> we go forward.
>
> <snip, thanks for the results>
>
>> diff --git a/gcc/config/arm/t-multilib b/gcc/config/arm/t-multilib
>> index 2f790097670e1bf81b56b069a6b1582763aab6e9..cd5927a7c9ec053b4d5b9725f7b30daeca3b1aa3 100644
>> --- a/gcc/config/arm/t-multilib
>> +++ b/gcc/config/arm/t-multilib
>> @@ -70,6 +70,7 @@ v8_a_simd_variants := $(call all_feat_combs, simd crypto)
>>  v8_1_a_simd_variants := $(call all_feat_combs, simd crypto)
>>  v8_2_a_simd_variants := $(call all_feat_combs, simd fp16 fp16fml crypto dotprod)
>>  v8_4_a_simd_variants := $(call all_feat_combs, simd fp16 crypto)
>> +v8_r_nosimd_variants := $(call all_feat_combs, crc fp.sp)
>>
>>  ifneq (,$(HAS_APROFILE))
>>  include $(srcdir)/config/arm/t-aprofile
>> @@ -105,6 +106,20 @@ MULTILIB_MATCHES += march?armv7+fp=march?armv7-r+fp+idiv
>>  MULTILIB_MATCHES += $(foreach ARCH, $(all_early_arch), \
>>  march?armv5te+fp=march?$(ARCH)+fp)
>> +#
>> +# Armv8-r: map down onto common v7 code.
>
> Please use Armv8-R.
>
>> +# Note 1: there is no single-precision armv7 multilib so +fp.sp is mapped
>> +# down to softfloat armv7 (second MULTILIB_MATCHES).
>> +# Note 2: +fp.sp being a subset of +simd and +crypto, there is no need to
>> +# consider the combination of +fp.sp with a simd extension since matching
>> +# is run after canonicalization
>> +MULTILIB_MATCHES += march?armv7=march?armv8-r
>> +MULTILIB_MATCHES += $(foreach ARCH, $(v8_r_nosimd_var
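The extension list iterated over by the shell loops in this thread ("", +crc, +fp.sp, ... up to +crc+fp.sp+simd+crypto) is every subset of the four extensions in one fixed order. A minimal Python sketch that generates the same list is shown below; `extension_suffixes` is a hypothetical helper written for this note and is not meant to describe how the all_feat_combs make function is implemented.

```python
from itertools import combinations


def extension_suffixes(features):
    """Return every subset of features as a '+a+b' suffix, smallest first."""
    suffixes = []
    for size in range(len(features) + 1):
        for combo in combinations(features, size):
            suffixes.append("".join("+" + feature for feature in combo))
    return suffixes


suffixes = extension_suffixes(["crc", "fp.sp", "simd", "crypto"])
for suffix in suffixes:
    print("arm-none-eabi-gcc -march=armv8-r" + suffix
          + " -mfloat-abi=hard -print-multi-directory")
```

Generating the list this way avoids typing the 16 combinations by hand when adding the tests requested for gcc.target/arm/multilib.exp, assuming a single ordering of extensions is enough, as stated in the Testing paragraph.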
[PATCH, GCC/ARM] Multilib mapping for Armv8-R
Hi,

Because there is no multilib mapping for Armv8-R, the default multilib, which targets -march=armv4t with softfloat floating-point arithmetic, is currently used. This patch maps Armv8-R to the existing Armv7 multilibs instead. Note that since there is no single-precision multilib compatible with the R profile, -march=armv8-r+fp.sp is mapped to -march=armv7, i.e. Armv7 with softfloat floating-point.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2018-02-12  Thomas Preud'homme  <thomas.preudho...@arm.com>

    * config/arm/t-multilib: Map Armv8-R to Armv7 multilibs.

Testing: Ran -print-multi-directory for all combinations of extensions one can pass to -march=armv8-r (including no extension, but only considering a single ordering of extensions). All gave the expected result. Details in appendix.

Is this ok for stage4?

Best regards,

Thomas

Appendix: output of -print-multi-directory for all extensions available to -march=armv8-r

% for ext in "" +crc +fp.sp +simd +crypto +crc+fp.sp +crc+simd +crc+crypto +fp.sp+simd +fp.sp+crypto +simd+crypto +crc+fp.sp+simd +crc+fp.sp+crypto +crc+simd+crypto +fp.sp+simd+crypto +crc+fp.sp+simd+crypto ; do cmd="arm-none-eabi-gcc -march=armv8-r${ext} -mfloat-abi=soft -print-multi-directory" ; echo -n "$cmd: " ; eval $cmd ; done
arm-none-eabi-gcc -march=armv8-r -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+fp.sp -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+simd -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crypto -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+simd -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+crypto -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+fp.sp+simd -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+fp.sp+crypto -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+simd+crypto -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+simd -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+crypto -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+simd+crypto -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+fp.sp+simd+crypto -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+simd+crypto -mfloat-abi=soft -print-multi-directory: thumb/v7/nofp

% for ext in "" +crc +fp.sp +simd +crypto +crc+fp.sp +crc+simd +crc+crypto +fp.sp+simd +fp.sp+crypto +simd+crypto +crc+fp.sp+simd +crc+fp.sp+crypto +crc+simd+crypto +fp.sp+simd+crypto +crc+fp.sp+simd+crypto ; do cmd="arm-none-eabi-gcc -march=armv8-r${ext} -mfloat-abi=softfp -print-multi-directory" ; echo -n "$cmd: " ; eval $cmd ; done
arm-none-eabi-gcc -march=armv8-r -mfloat-abi=softfp -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc -mfloat-abi=softfp -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+fp.sp -mfloat-abi=softfp -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+simd -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crypto -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp -mfloat-abi=softfp -print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+simd -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+crypto -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+fp.sp+simd -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+fp.sp+crypto -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+simd+crypto -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+simd -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+crypto -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+simd+crypto -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+fp.sp+simd+crypto -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+simd+crypto -mfloat-abi=softfp -print-multi-directory: thumb/v7+fp/softfp

% for ext in "" +crc +fp.sp +simd +crypto +crc+fp.sp +crc+simd +crc+crypto
Re: [PATCH, GCC/testsuite] Fix dump-noaddr dumpbase
Hi Mike, Thanks, I've tested after the two commits and it works both in tree and out of tree. It'll simplify comparing in tree results Vs out of tree for us, thanks a lot! Would you consider a backport to stable branches if nobody complains after a week? Best regards, Thomas On 05/12/17 19:27, Mike Stump wrote: On Dec 5, 2017, at 11:11 AM, Thomas Preudhomme <thomas.preudho...@foss.arm.com> wrote: On 05/12/17 17:54, Andrew Pinski wrote: On Tue, Dec 5, 2017 at 9:50 AM, Thomas Preudhomme <thomas.preudho...@foss.arm.com> wrote: Hi, dump-noaddr test FAILS when $tmpdir is not the same as the directory where runtest is called from. Note that this does not happen when running make check because tmpdir is set to srcdir. In that case, file mkdir will create the directory in the current directory while GCC is invoked from tmpdir and hence -dumpbase look for dump1 and dump2 relative to tmpdir. This patch forces dumpbase to be relative to tmpdir which will work in all case. ChangeLog entry is as follows: *** gcc/testsuite/ChangeLog *** 2017-12-05 Thomas Preud'homme <thomas.preudho...@arm.com> * gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Set dump base relative to tmpdir. Testing: Successfully ran unsorted.exp via make check and out of tree testing using runtest from /test with tmpdir set in /test/site.exp to . Is this ok for stage3? https://gcc.gnu.org/ml/gcc-patches/2012-06/msg01752.html I don't remember where this discussion went last time. Maybe this time there will be a resolution :). FWIW, I agree with Matt, creating the dump in tmpdir makes more sense. I think his patch can be simplified though because the compiler seems to be invoked from tmpdir so it can at least be omitted from the -dumpbase. Sounds reasonable. I've added that on top of his patch and checked that in. Let us know if it works or not.
Re: [PATCH, GCC/testsuite] Fix dump-noaddr dumpbase
On 05/12/17 17:54, Andrew Pinski wrote: On Tue, Dec 5, 2017 at 9:50 AM, Thomas Preudhomme <thomas.preudho...@foss.arm.com> wrote: Hi, dump-noaddr test FAILS when $tmpdir is not the same as the directory where runtest is called from. Note that this does not happen when running make check because tmpdir is set to srcdir. In that case, file mkdir will create the directory in the current directory while GCC is invoked from tmpdir and hence -dumpbase look for dump1 and dump2 relative to tmpdir. This patch forces dumpbase to be relative to tmpdir which will work in all case. ChangeLog entry is as follows: *** gcc/testsuite/ChangeLog *** 2017-12-05 Thomas Preud'homme <thomas.preudho...@arm.com> * gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Set dump base relative to tmpdir. Testing: Successfully ran unsorted.exp via make check and out of tree testing using runtest from /test with tmpdir set in /test/site.exp to . Is this ok for stage3? https://gcc.gnu.org/ml/gcc-patches/2012-06/msg01752.html I don't remember where this discussion went last time. Maybe this time there will be a resolution :). FWIW, I agree with Matt, creating the dump in tmpdir makes more sense. I think his patch can be simplified though because the compiler seems to be invoked from tmpdir so it can at least be omitted from the -dumpbase. Best regards, Thomas
[PATCH, GCC/testsuite] Fix dump-noaddr dumpbase
Hi,

The dump-noaddr test FAILs when $tmpdir is not the same as the directory runtest is called from. Note that this does not happen when running make check because tmpdir is set to srcdir. When the two directories differ, file mkdir creates the directory in the current directory while GCC is invoked from tmpdir, and hence -dumpbase looks for dump1 and dump2 relative to tmpdir. This patch forces dumpbase to be relative to tmpdir, which will work in all cases.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-12-05  Thomas Preud'homme  <thomas.preudho...@arm.com>

    * gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Set dump base
    relative to tmpdir.

Testing: Successfully ran unsorted.exp via make check and out of tree testing using runtest from /test with tmpdir set in /test/site.exp to .

Is this ok for stage3?

Best regards,

Thomas

diff --git a/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x b/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x
index d14d494570944b2be82c2575204cdbf4b15721ca..68d6c3e38325cabbdd280ecf05e663dbcda99900 100644
--- a/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x
+++ b/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x
@@ -11,10 +11,10 @@ proc dump_compare { src options } {
     foreach option $option_list {
         file delete -force dump1
         file mkdir dump1
-        c-torture-compile $src "$option $options -dumpbase dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
+        c-torture-compile $src "$option $options -dumpbase [pwd]/dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
         file delete -force dump2
         file mkdir dump2
-        c-torture-compile $src "$option $options -dumpbase dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
+        c-torture-compile $src "$option $options -dumpbase [pwd]/dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
         foreach dump1 [lsort [glob -nocomplain dump1/*]] {
             regsub dump1/ $dump1 dump2/ dump2
             set dumptail "gcc.c-torture/unsorted/[file tail $dump1]"
[PATCH, GCC/testsuite] Improve fstack_protector effective target
Hi,

Effective target fstack_protector fails to return an error for newlib-based targets (such as arm-none-eabi targets), which do not support stack protector. This is because the test is too simplistic for GCC to generate stack protection code: it does not contain a local buffer and does not read unknown input. This commit adds a small local buffer holding a copy of the filename to trigger generation of stack protector code. The filename is used instead of the full path to ensure the copy fits in the local buffer.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-11-28  Thomas Preud'homme  <thomas.preudho...@arm.com>

    * lib/target-supports.exp (check_effective_target_fstack_protector):
    Copy filename in local buffer to trigger stack protection.

Testing: Ran gcc.dg/pr38616 on arm-none-eabi and arm-linux-gnueabihf; the former is now UNSUPPORTED while the latter continues to PASS.

Is this ok for stage3?

Best regards,

Thomas

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index d30fd368922713d3695f22710197ce7094c977cd..8aff16a25823ec48e76ad6ad8fdc8db998a45877 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -1064,7 +1064,11 @@ proc check_effective_target_static {} {
 # Return 1 if the target supports -fstack-protector
 proc check_effective_target_fstack_protector {} {
     return [check_runtime fstack_protector {
-        int main (void) { return 0; }
+        #include <string.h>
+        int main (int argc, char *argv[]) {
+          char buf[64];
+          return !strcpy (buf, strrchr (argv[0], '/'));
+        }
     } "-fstack-protector"]
 }
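For illustration only, here is a minimal sketch of the kind of function the description above relies on: one with a local character buffer that is filled from input the compiler cannot see. As far as I understand, plain -fstack-protector only instruments functions that call alloca or contain a sufficiently large character buffer, which is why an empty main is not enough. The name basename_length is hypothetical and this is not the code added to target-supports.exp; unlike the test program, the sketch also copes with a path that contains no '/'.

```c
#include <stddef.h>
#include <string.h>

/* Copy the last component of PATH into a small local buffer and return
   its length.  The local char array is what is expected to make
   -fstack-protector place a guard in this function's frame.  */
size_t
basename_length (const char *path)
{
  char buf[64];
  const char *slash = strrchr (path, '/');
  /* Fall back to the whole string when there is no directory part.  */
  const char *name = slash ? slash + 1 : path;

  /* Bounded copy so an over-long name cannot overflow BUF.  */
  strncpy (buf, name, sizeof buf - 1);
  buf[sizeof buf - 1] = '\0';
  return strlen (buf);
}
```

Compiling such a function with and without -fstack-protector and comparing the assembly should show whether the target emits the guard check.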
[arm-embedded] [PATCH, GCC/LTO, ping] Fix PR69866: LTO with def for weak alias in regular object file
Hi,

We have decided to apply the forwarded patch to the embedded-7-branch to fix an ICE when doing partial LTO with weak symbols.

ChangeLog entry is as follows:

2017-11-28  Thomas Preud'homme

    Backport from mainline
    2017-06-15  Jan Hubicka
                Thomas Preud'homme

    PR lto/69866
    * lto-symtab.c (lto_symtab_merge_symbols): Drop useless definitions
    that resolved externally.

    Backport from mainline
    2017-06-15  Thomas Preud'homme

    PR lto/69866
    * gcc.dg/lto/pr69866_0.c: New test.
    * gcc.dg/lto/pr69866_1.c: Likewise.

Best regards,

Thomas

--- Begin Message ---
Hi,
I am testing the following. Let me know if it works for you.

Honza

Index: lto/lto-symtab.c
===================================================================
--- lto/lto-symtab.c (revision 249213)
+++ lto/lto-symtab.c (working copy)
@@ -952,6 +952,42 @@
              if (tgt)
                node->resolve_alias (tgt, true);
            }
+         /* If the symbol was preempted outside IR, see if we want to get rid
+            of the definition.  */
+         if (node->analyzed
+             && !DECL_EXTERNAL (node->decl)
+             && (node->resolution == LDPR_PREEMPTED_REG
+                 || node->resolution == LDPR_RESOLVED_IR
+                 || node->resolution == LDPR_RESOLVED_EXEC
+                 || node->resolution == LDPR_RESOLVED_DYN))
+           {
+             DECL_EXTERNAL (node->decl) = 1;
+             /* If alias to local symbol was preempted by external definition,
+                we know it is not pointing to the local symbol.  Remove it.  */
+             if (node->alias
+                 && !node->weakref
+                 && !node->transparent_alias
+                 && node->get_alias_target ()->binds_to_current_def_p ())
+               {
+                 node->alias = false;
+                 node->remove_all_references ();
+                 node->definition = false;
+                 node->analyzed = false;
+                 node->cpp_implicit_alias = false;
+               }
+             else if (!node->alias
+                      && node->definition
+                      && node->get_availability () <= AVAIL_INTERPOSABLE)
+               {
+                 if ((cnode = dyn_cast <cgraph_node *> (node)) != NULL)
+                   cnode->reset ();
+                 else
+                   {
+                     node->analyzed = node->definition = false;
+                     node->remove_all_references ();
+                   }
+               }
+           }
          if (!(cnode = dyn_cast <cgraph_node *> (node))
              || !cnode->clone_of
--- End Message ---
Re: [PATCH, GCC/ARM] Factor out CMSE register clearing code
On 22/11/17 14:45, Kyrill Tkachov wrote: Hi Thomas, On 15/11/17 17:12, Thomas Preudhomme wrote: Hi, Functions cmse_nonsecure_call_clear_caller_saved and cmse_nonsecure_entry_clear_before_return both contain very similar code to clear registers. What's worse, they differ slightly at times so if a bug is found in one careful thoughts is needed to decide whether the other function needs fixing too. This commit addresses the situation by factoring the two pieces of code into a new function. In doing so the code generated to clear VFP registers in cmse_nonsecure_call now uses the same sequence as cmse_nonsecure_entry functions. Tests expectation are thus updated accordingly. ChangeLog entry are as follow: *** gcc/ChangeLog *** 2017-10-24 Thomas Preud'homme <thomas.preudho...@arm.com> * config/arm/arm.c (cmse_clear_registers): New function. (cmse_nonsecure_call_clear_caller_saved): Replace register clearing code by call to cmse_clear_registers. (cmse_nonsecure_entry_clear_before_return): Likewise. *** gcc/ChangeLog *** 2017-10-24 Thomas Preud'homme <thomas.preudho...@arm.com> * gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: Adapt expectations to vmov instructions now generated. * gcc.target/arm/cmse/mainline/hard-sp/cmse-7.c: Likewise. * gcc.target/arm/cmse/mainline/hard-sp/cmse-8.c: Likewise. * gcc.target/arm/cmse/mainline/hard/cmse-13.c: Likewise. * gcc.target/arm/cmse/mainline/hard/cmse-7.c: Likewise. * gcc.target/arm/cmse/mainline/hard/cmse-8.c: Likewise. Testing: bootstrapped on arm-linux-gnueabihf and no regression in the testsuite. Is this ok for trunk? This looks mostly ok, but I have a concern from reading the code that I'd like some help with... 
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 9b494e9529a4470c18192a4561e03d2f80e90797..22c9add0722974902b2a89b2b0a75759ff8ba37c 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -16991,6 +16991,128 @@ compute_not_to_clear_mask (tree arg_type, rtx arg_rtx, int regno, return not_to_clear_mask; } +/* Clear registers secret before doing a cmse_nonsecure_call or returning from + a cmse_nonsecure_entry function. TO_CLEAR_BITMAP indicates which registers + are to be fully cleared, using the value in register CLEARING_REG if more + efficient. The PADDING_BITS_LEN entries array PADDING_BITS_TO_CLEAR gives + the bits that needs to be cleared in caller-saved core registers, with + SCRATCH_REG used as a scratch register for that clearing. + + NOTE: one of three following assertions must hold: + - SCRATCH_REG is a low register + - CLEARING_REG is in the set of registers fully cleared (ie. its bit is set + in TO_CLEAR_BITMAP) + - CLEARING_REG is a low register. */ + +static void +cmse_clear_registers (sbitmap to_clear_bitmap, uint32_t *padding_bits_to_clear, + int padding_bits_len, rtx scratch_reg, rtx clearing_reg) +{ + bool saved_clearing = false; + rtx saved_clearing_reg = NULL_RTX; + int i, regno, clearing_regno, minregno = R0_REGNUM, maxregno = minregno - 1; + Here minregno becomes 0 and maxregno becomes -1... + gcc_assert (arm_arch_cmse); + + if (!bitmap_empty_p (to_clear_bitmap)) + { + minregno = bitmap_first_set_bit (to_clear_bitmap); + maxregno = bitmap_last_set_bit (to_clear_bitmap); + } ...and here is a path on maxregno may not be set to a proper register number... If bitmap is empty yes, ie. if no bit is set and no register should be cleared. + + for (regno = minregno; regno <= maxregno; regno++) + { + if (!bitmap_bit_p (to_clear_bitmap, regno)) + continue; + ...and here we iterate from minregno (potentially 0) to maxregno (potentially -1) which will lead to trouble. Are there any guarantees that this case will not occur? 
It absolutely does occur and that's on purpose. If maxregno is -1 it means there is no bit to clear and so it is fine to do nothing. Best regards, Thomas
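To make the point about the empty range concrete, here is a small standalone sketch of the same idiom. It is only an illustration under stated assumptions: a 64-bit mask stands in for the sbitmap, the search for the lowest and highest set bits is written out by hand, and count_cleared_regs is a hypothetical name that does not exist in arm.c.

```c
#include <stdint.h>

/* Stand-in for GCC's R0_REGNUM, assumed to be 0 for this sketch.  */
#define R0_REGNUM 0

/* Return how many registers the clearing loop would visit for the set
   of registers described by TO_CLEAR_MASK.  */
int
count_cleared_regs (uint64_t to_clear_mask)
{
  int regno, cleared = 0;
  /* Start with an empty range: min is 0 and max is -1.  */
  int minregno = R0_REGNUM, maxregno = minregno - 1;

  /* Narrow the range to the lowest and highest set bits, if any.  */
  for (regno = 0; regno < 64; regno++)
    if (to_clear_mask & (1ULL << regno))
      {
        if (maxregno < minregno)
          minregno = regno;
        maxregno = regno;
      }

  /* With an empty mask the condition is 0 <= -1, so the body never
     runs and nothing is cleared.  */
  for (regno = minregno; regno <= maxregno; regno++)
    {
      if (!(to_clear_mask & (1ULL << regno)))
        continue;
      cleared++;
    }
  return cleared;
}
```

Note that the range variables have to be signed for this to work; with unsigned variables -1 would wrap around and the loop would run.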
[PATCH, GCC/ARM] Remove useless variable in CMSE code
Hi,

Functions cmse_nonsecure_call_clear_caller_saved () and cmse_nonsecure_entry_clear_before_return () use a separate variable holding a pointer to the first entry of the padding_bits_to_clear array, which is used when calling function compute_not_to_clear_mask (). This does not save space over using &padding_bits_to_clear[0] directly, so this commit gets rid of it.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-11-08  Thomas Preud'homme  <thomas.preudho...@arm.com>

    * config/arm/arm.c (cmse_nonsecure_call_clear_caller_saved): Get rid of
    padding_bits_to_clear_ptr.
    (cmse_nonsecure_entry_clear_before_return): Likewise.

Testing: Bootstrapped an arm-none-linux-gnueabihf compiler and the regression testsuite does not show any regression.

Committed as obvious.

Best regards,

Thomas

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 7384b96fea0179334a6010b099df68c8e2a0fc32..bcb708c1b316ea08969e118fb0949b941ff19c27 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -17002,7 +17002,6 @@ cmse_nonsecure_call_clear_caller_saved (void)
          bool using_r4, first_param = true;
          function_args_iterator args_iter;
          uint32_t padding_bits_to_clear[4] = {0U, 0U, 0U, 0U};
-         uint32_t * padding_bits_to_clear_ptr = &padding_bits_to_clear[0];

          if (!NONDEBUG_INSN_P (insn))
            continue;
@@ -17086,7 +17085,7 @@ cmse_nonsecure_call_clear_caller_saved (void)
              to_clear_args_mask
                = compute_not_to_clear_mask (arg_type, arg_rtx,
                                             REGNO (arg_rtx),
-                                            padding_bits_to_clear_ptr);
+                                            &padding_bits_to_clear[0]);
              if (to_clear_args_mask)
                {
                  for (regno = R0_REGNUM; regno <= maxregno; regno++)
@@ -25134,7 +25133,6 @@ cmse_nonsecure_entry_clear_before_return (void)
 {
   int regno, maxregno = TARGET_HARD_FLOAT ? LAST_VFP_REGNUM : IP_REGNUM;
   uint32_t padding_bits_to_clear = 0;
-  uint32_t * padding_bits_to_clear_ptr = &padding_bits_to_clear;
   auto_sbitmap to_clear_bitmap (maxregno + 1);
   tree result_type;
   rtx result_rtl;
@@ -25187,7 +25185,7 @@ cmse_nonsecure_entry_clear_before_return (void)
       gcc_assert (REG_P (result_rtl));
       to_clear_return_mask
        = compute_not_to_clear_mask (result_type, result_rtl, 0,
-                                    padding_bits_to_clear_ptr);
+                                    &padding_bits_to_clear);
       if (to_clear_return_mask)
        {
          gcc_assert ((unsigned) maxregno < sizeof (long long) * __CHAR_BIT__);
Re: [PATCH, GCC/ARM] Use bitmap to control cmse_nonsecure_call register clearing
Thanks Kyrill. Committed the attached rebased patch (same patch but without the last hunk because a better fix was done in an earlier commit). Best regards, Thomas On 22/11/17 11:57, Kyrill Tkachov wrote: Hi Thomas, On 15/11/17 17:08, Thomas Preudhomme wrote: Hi, As part of r253256, cmse_nonsecure_entry_clear_before_return has been rewritten to use auto_sbitmap instead of an integer bitfield to control which register needs to be cleared. This commit continue this work in cmse_nonsecure_call_clear_caller_saved. ChangeLog entry is as follows: *** gcc/ChangeLog *** 2017-10-16 Thomas Preud'homme <thomas.preudho...@arm.com> * config/arm/arm.c (cmse_nonsecure_call_clear_caller_saved): Use auto_sbitap instead of integer bitfield to control register needing clearing. Testing: bootstrapped on arm-linux-gnueabihf and no regression in the testsuite. Is this ok for trunk? Ok for trunk. Thanks for this conversion. It's much easier to understand the code without having to think about the bitmasks and shifts. Kyrill Best regards, Thomas diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 106e3edce0d6f2518eb391c436c5213a78d1275b..092cd61d49382101bce9b8c5f04de31965dcdc77 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -17007,10 +17007,11 @@ cmse_nonsecure_call_clear_caller_saved (void) FOR_BB_INSNS (bb, insn) { - uint64_t to_clear_mask, float_mask; + unsigned address_regnum, regno, maxregno = + TARGET_HARD_FLOAT_ABI ? D7_VFP_REGNUM : NUM_ARG_REGS - 1; + auto_sbitmap to_clear_bitmap (maxregno + 1); rtx_insn *seq; rtx pat, call, unspec, reg, cleared_reg, tmp; - unsigned int regno, maxregno; rtx address; CUMULATIVE_ARGS args_so_far_v; cumulative_args_t args_so_far; @@ -17041,18 +17042,21 @@ cmse_nonsecure_call_clear_caller_saved (void) continue; /* Determine the caller-saved registers we need to clear. 
*/ - to_clear_mask = (1LL << (NUM_ARG_REGS)) - 1; - maxregno = NUM_ARG_REGS - 1; + bitmap_clear (to_clear_bitmap); + bitmap_set_range (to_clear_bitmap, R0_REGNUM, NUM_ARG_REGS); + /* Only look at the caller-saved floating point registers in case of -mfloat-abi=hard. For -mfloat-abi=softfp we will be using the lazy store and loads which clear both caller- and callee-saved registers. */ if (TARGET_HARD_FLOAT_ABI) { - float_mask = (1LL << (D7_VFP_REGNUM + 1)) - 1; - float_mask &= ~((1LL << FIRST_VFP_REGNUM) - 1); - to_clear_mask |= float_mask; - maxregno = D7_VFP_REGNUM; + auto_sbitmap float_bitmap (maxregno + 1); + + bitmap_clear (float_bitmap); + bitmap_set_range (float_bitmap, FIRST_VFP_REGNUM, +D7_VFP_REGNUM - FIRST_VFP_REGNUM + 1); + bitmap_ior (to_clear_bitmap, to_clear_bitmap, float_bitmap); } /* Make sure the register used to hold the function address is not @@ -17060,7 +17064,9 @@ cmse_nonsecure_call_clear_caller_saved (void) address = RTVEC_ELT (XVEC (unspec, 0), 0); gcc_assert (MEM_P (address)); gcc_assert (REG_P (XEXP (address, 0))); - to_clear_mask &= ~(1LL << REGNO (XEXP (address, 0))); + address_regnum = REGNO (XEXP (address, 0)); + if (address_regnum < R0_REGNUM + NUM_ARG_REGS) + bitmap_clear_bit (to_clear_bitmap, address_regnum); /* Set basic block of call insn so that df rescan is performed on insns inserted here. 
*/ @@ -17081,6 +17087,7 @@ cmse_nonsecure_call_clear_caller_saved (void) FOREACH_FUNCTION_ARGS (fntype, arg_type, args_iter) { rtx arg_rtx; + uint64_t to_clear_args_mask; machine_mode arg_mode = TYPE_MODE (arg_type); if (VOID_TYPE_P (arg_type)) @@ -17093,10 +17100,18 @@ cmse_nonsecure_call_clear_caller_saved (void) arg_rtx = arm_function_arg (args_so_far, arg_mode, arg_type, true); gcc_assert (REG_P (arg_rtx)); - to_clear_mask - &= ~compute_not_to_clear_mask (arg_type, arg_rtx, - REGNO (arg_rtx), - padding_bits_to_clear_ptr); + to_clear_args_mask + = compute_not_to_clear_mask (arg_type, arg_rtx, + REGNO (arg_rtx), + padding_bits_to_clear_ptr); + if (to_clear_args_mask) + { + for (regno = R0_REGNUM; regno <= maxregno; regno++) + { + if (to_clear_args_mask & (1ULL << regno)) + bitmap_clear_bit (to_clear_bitmap, regno); + } + } first_param = false; } @@ -17155,7 +17170,7 @@ cmse_nonsecure_call_clear_caller_saved (void) call. */ for (regno = R0_REGNUM; regno <= maxregno; regno++) { - if (!(to_clear_mask & (1LL << regno))) + if (!bitmap_bit_p (to_clear_bitmap, regno)) continue; /* If regno is an even vfp register and its successor is also to @@ -17164,7 +17179,7 @@ cmse_nonsecure_call_clear_caller_saved (void) { if (TARGET_VFP_DOUBLE &&
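As a rough illustration of why the bitmap form reads more easily, the sketch below builds the same register set both ways and checks that they agree. Everything in it is an assumption made for the example: the register numbers are illustrative values, struct reg_set is a toy replacement for auto_sbitmap, and none of the function names exist in arm.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Illustrative register numbers, assumed for this sketch only.  */
enum
{
  R0_REGNUM = 0,
  NUM_ARG_REGS = 4,
  FIRST_VFP_REGNUM = 16,
  D7_VFP_REGNUM = 31,
  MAX_REGS = 32
};

/* Old style: describe the set with shifts and masks.  */
uint64_t
to_clear_as_mask (bool hard_float_abi, unsigned address_regnum)
{
  uint64_t mask = (1ULL << NUM_ARG_REGS) - 1;

  /* The call address is assumed to live in a core register.  */
  assert (address_regnum < (unsigned) FIRST_VFP_REGNUM);
  if (hard_float_abi)
    {
      uint64_t float_mask = (1ULL << (D7_VFP_REGNUM + 1)) - 1;
      float_mask &= ~((1ULL << FIRST_VFP_REGNUM) - 1);
      mask |= float_mask;
    }
  /* The register holding the call address must not be cleared.  */
  mask &= ~(1ULL << address_regnum);
  return mask;
}

/* Toy bitmap: one flag per register.  */
struct reg_set
{
  bool bit[MAX_REGS];
};

/* Mark COUNT registers starting at START, like bitmap_set_range.  */
void
set_range (struct reg_set *set, unsigned start, unsigned count)
{
  unsigned i;

  for (i = start; i < start + count; i++)
    set->bit[i] = true;
}

/* New style: describe the set by naming ranges of registers.  */
void
to_clear_as_set (struct reg_set *set, bool hard_float_abi,
                 unsigned address_regnum)
{
  memset (set, 0, sizeof *set);
  set_range (set, R0_REGNUM, NUM_ARG_REGS);
  if (hard_float_abi)
    set_range (set, FIRST_VFP_REGNUM, D7_VFP_REGNUM - FIRST_VFP_REGNUM + 1);
  if (address_regnum < (unsigned) (R0_REGNUM + NUM_ARG_REGS))
    set->bit[address_regnum] = false;
}

/* Return true if both representations describe the same registers.  */
bool
representations_agree (bool hard_float_abi, unsigned address_regnum)
{
  struct reg_set set;
  uint64_t mask = to_clear_as_mask (hard_float_abi, address_regnum);
  unsigned regno;

  to_clear_as_set (&set, hard_float_abi, address_regnum);
  for (regno = 0; regno < MAX_REGS; regno++)
    if (set.bit[regno] != (bool) ((mask >> regno) & 1))
      return false;
  return true;
}
```

The range-based version states which registers are meant, whereas the mask version requires working out each shift by hand.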
Re: [PATCH] Use bswap framework in store-merging (PR tree-optimization/78821)
Hi Jakub, On 16/11/17 17:06, Jakub Jelinek wrote: Hi! This patch uses the bswap pass framework inside of the store merging pass to handle adjacent stores which produce together a 16/32/64 bit store of bswapped value (loaded or from SSA_NAME) or identity (usually only from SSA_NAME, the code prefers to use the existing store merging code if coming from identity load, because it e.g. can handle arbitrary sizes, not just 16/32/64 bits). There are small tweaks to the bswap code to make it usable inside of the store merging pass. Then when processing the stores, we record what find_bswap_or_nop_1 returns and do a small sanity check on it, and when doing coalesce_immediate_stores (i.e. the splitting into groups), we try for 64-bit, 32-bit and 16-bit sizes if we can extend/shift (according to endianity) and perform_symbolic_merge them together. If it is possible, we turn those 2+ adjacent stores that make together {64,32,16} bits into a separate group and process it specially later (we need to treat it as a single store rather than multiple, so split_group is only very lightweight for that case). Nice, the two finally merged! I took a look at the bswap part and it all looked good to me code and comment wise. I only have one small nit regarding a space/tab change (see below). Bootstrapped/regtested on {x86_64,i686,powerpc64le,powerpc64}-linux, ok for trunk? The cases this patch can handle are less common than rhs_code INTEGER_CST (stores of constants to adjacent memory) or MEM_REF (adjacent memory copying), but are more common than the bitwise ops, during combined x86_64+i686 bootstraps/regtests it triggered: lrotate_expr 974 2528 nop_expr 720 1711 (lrotate_expr stands for bswap, nop_expr for identity, the first column is the actual count of such new stores, the second is the original number of stores that have been optimized this way). 
Are you saying that lrotate_expr is just the title and it also includes 32- and 64-bit bswap, or is it only the count of lrotate_expr nodes?

2017-11-16  Jakub Jelinek

    PR tree-optimization/78821
    * gimple-ssa-store-merging.c (find_bswap_or_nop_load): Give up if
    base is TARGET_MEM_REF.  If base is not MEM_REF, set base_addr to the
    address of the base rather than the base itself.
    (find_bswap_or_nop_1): Just use pointer comparison for vuse check.
    (find_bswap_or_nop_finalize): New function.
    (find_bswap_or_nop): Use it.
    (bswap_replace): Return a tree rather than bool, change first
    argument from gimple * to gimple_stmt_iterator, allow inserting
    into an empty sequence, allow ins_stmt to be NULL - then emit
    all stmts into gsi.  Fix up MEM_REF address gimplification.
    (pass_optimize_bswap::execute): Adjust bswap_replace caller.
    Formatting fix.
    (struct store_immediate_info): Add N and INS_STMT non-static
    data members.
    (store_immediate_info::store_immediate_info): Initialize them
    from newly added ctor args.
    (merged_store_group::apply_stores): Formatting fixes.  Sort by
    bitpos at the end.
    (stmts_may_clobber_ref_p): For stores call also
    refs_anti_dependent_p.
    (gather_bswap_load_refs): New function.
    (imm_store_chain_info::try_coalesce_bswap): New method.
    (imm_store_chain_info::coalesce_immediate_stores): Use it.
    (split_group): Handle LROTATE_EXPR and NOP_EXPR rhs_code specially.
    (imm_store_chain_info::output_merged_store): Fail if number of
    new estimated stmts is bigger or equal than old.  Handle LROTATE_EXPR
    and NOP_EXPR rhs_code.
    (pass_store_merging::process_store): Compute n and ins_stmt, if
    ins_stmt is non-NULL and the store rhs is otherwise invalid, use
    LROTATE_EXPR rhs_code.  Pass n and ins_stmt to store_immediate_info
    ctor.
    (pass_store_merging::execute): Calculate dominators.

    * gcc.dg/store_merging_16.c: New test.
--- gcc/gimple-ssa-store-merging.c.jj 2017-11-16 10:45:09.239185205 +0100 +++ gcc/gimple-ssa-store-merging.c 2017-11-16 15:34:08.560080214 +0100 @@ -369,7 +369,10 @@ find_bswap_or_nop_load (gimple *stmt, tr base_addr = get_inner_reference (ref, , , , , , , ); - if (TREE_CODE (base_addr) == MEM_REF) + if (TREE_CODE (base_addr) == TARGET_MEM_REF) +/* Do not rewrite TARGET_MEM_REF. */ +return false; + else if (TREE_CODE (base_addr) == MEM_REF) { offset_int bit_offset = 0; tree off = TREE_OPERAND (base_addr, 1); @@ -401,6 +404,8 @@ find_bswap_or_nop_load (gimple *stmt, tr bitpos += bit_offset.to_shwi (); } + else +base_addr = build_fold_addr_expr (base_addr); if (bitpos % BITS_PER_UNIT) return false; @@ -743,8 +748,7 @@ find_bswap_or_nop_1 (gimple *stmt, struc if (TYPE_PRECISION (n1.type) != TYPE_PRECISION (n2.type)) return
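As an illustration of the source pattern this thread is about, as I understand the description, here is a sketch of four adjacent byte stores that together form one 32-bit store of a byte-swapped value, next to a hand-written merged form. All names are hypothetical, and the merged function is written by hand for comparison; it is not output of the pass.

```c
#include <stdint.h>
#include <string.h>

/* Four adjacent byte stores writing VAL most significant byte first.  */
void
store_be32 (unsigned char *p, uint32_t val)
{
  p[0] = (unsigned char) (val >> 24);
  p[1] = (unsigned char) (val >> 16);
  p[2] = (unsigned char) (val >> 8);
  p[3] = (unsigned char) val;
}

/* Portable byte swap, standing in for the bswap operation.  */
uint32_t
swap32 (uint32_t val)
{
  return (val >> 24) | ((val >> 8) & 0xff00u)
         | ((val << 8) & 0xff0000u) | (val << 24);
}

/* Return 1 if the host stores the least significant byte first.  */
int
host_is_little_endian (void)
{
  uint32_t one = 1;
  unsigned char first;

  memcpy (&first, &one, 1);
  return first == 1;
}

/* Hand-merged form: a single 32-bit store of the byte-swapped value on
   a little-endian host, or of the value itself on a big-endian one.  */
void
store_be32_merged (unsigned char *p, uint32_t val)
{
  uint32_t word = host_is_little_endian () ? swap32 (val) : val;

  memcpy (p, &word, sizeof word);
}

/* Return 1 if both forms leave the same four bytes in memory.  */
int
merged_store_matches (uint32_t val)
{
  unsigned char a[4], b[4];

  store_be32 (a, val);
  store_be32_merged (b, val);
  return memcmp (a, b, sizeof a) == 0;
}
```

On a little-endian target the first function corresponds to the bswap case, while the same four stores on a big-endian target would correspond to the identity case.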
[PATCH, GCC/ARM] Do not clobber r4 in Armv8-M nonsecure call
Hi,

Expanders for Armv8-M nonsecure call unnecessarily clobber r4 despite the libcall they perform not writing to r4. Furthermore, the requirement for the branch target address to be in r4, as expected by the libcall, is modeled in a convoluted way in the define_insn patterns: the address is a register match_operand constrained by the match_dup for the clobber, which is guaranteed to be r4 due to the expander.

This patch simplifies all this by simply requiring the address to be in r4 and removing the clobbers. Expanders are left alone because cmse_nonsecure_call_clear_caller_saved relies on branch target memory attributes which would be lost if expanding to reg:SI R4_REGNUM.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-10-24  Thomas Preud'homme  <thomas.preudho...@arm.com>

    * config/arm/arm.md (R4_REGNUM): Define constant.
    (nonsecure_call_internal): Remove r4 clobber.
    (nonsecure_call_value_internal): Likewise.
    * config/arm/thumb1.md (nonsecure_call_reg_thumb1_v5): Remove second
    clobber and resequence match_operands.
    (nonsecure_call_value_reg_thumb1_v5): Likewise.
    * config/arm/thumb2.md (nonsecure_call_reg_thumb2): Likewise.
    (nonsecure_call_value_reg_thumb2): Likewise.

Testing: Bootstrapped on arm-linux-gnueabihf and testsuite shows no regression.

Is this ok for trunk?
Best regards, Thomas diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index ddb9d8f359007c1d86d497aef0ff5fc0e4061813..6b0794ede9fbc5a4f41e1f4a92acb9b649a277bc 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -30,6 +30,7 @@ (define_constants [(R0_REGNUM 0) ; First CORE register (R1_REGNUM 1) ; Second CORE register + (R4_REGNUM 4) ; Fifth CORE register (IP_REGNUM 12) ; Scratch register (SP_REGNUM 13) ; Stack pointer (LR_REGNUM14) ; Return address register @@ -8118,14 +8119,13 @@ UNSPEC_NONSECURE_MEM) (match_operand 1 "general_operand" "")) (use (match_operand 2 "" "")) - (clobber (reg:SI LR_REGNUM)) - (clobber (reg:SI 4))])] + (clobber (reg:SI LR_REGNUM))])] "use_cmse" " { rtx tmp; tmp = copy_to_suggested_reg (XEXP (operands[0], 0), - gen_rtx_REG (SImode, 4), + gen_rtx_REG (SImode, R4_REGNUM), SImode); operands[0] = replace_equiv_address (operands[0], tmp); @@ -8210,14 +8210,13 @@ UNSPEC_NONSECURE_MEM) (match_operand 2 "general_operand" ""))) (use (match_operand 3 "" "")) - (clobber (reg:SI LR_REGNUM)) - (clobber (reg:SI 4))])] + (clobber (reg:SI LR_REGNUM))])] "use_cmse" " { rtx tmp; tmp = copy_to_suggested_reg (XEXP (operands[1], 0), - gen_rtx_REG (SImode, 4), + gen_rtx_REG (SImode, R4_REGNUM), SImode); operands[1] = replace_equiv_address (operands[1], tmp); diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md index 5d196a673355a7acf7d0ed30f21b997b815913f5..f91659386bf240172bd9a3076722683c8a50dff4 100644 --- a/gcc/config/arm/thumb1.md +++ b/gcc/config/arm/thumb1.md @@ -1732,12 +1732,11 @@ ) (define_insn "*nonsecure_call_reg_thumb1_v5" - [(call (unspec:SI [(mem:SI (match_operand:SI 0 "register_operand" "l*r"))] + [(call (unspec:SI [(mem:SI (reg:SI R4_REGNUM))] UNSPEC_NONSECURE_MEM) - (match_operand 1 "" "")) - (use (match_operand 2 "" "")) - (clobber (reg:SI LR_REGNUM)) - (clobber (match_dup 0))] + (match_operand 0 "" "")) + (use (match_operand 1 "" "")) + (clobber (reg:SI LR_REGNUM))] "TARGET_THUMB1 && use_cmse && 
!SIBLING_CALL_P (insn)" "bl\\t__gnu_cmse_nonsecure_call" [(set_attr "length" "4") @@ -1779,12 +1778,11 @@ (define_insn "*nonsecure_call_value_reg_thumb1_v5" [(set (match_operand 0 "" "") (call (unspec:SI - [(mem:SI (match_operand:SI 1 "register_operand" "l*r"))] + [(mem:SI (reg:SI R4_REGNUM))] UNSPEC_NONSECURE_MEM) - (match_operand 2 "" ""))) - (use (match_operand 3 "" "")) - (clobber (reg:SI LR_REGNUM)) - (clobber (match_dup 1))] + (match_operand 1 "" ""))) + (use (match_operand 2 "" "")) + (clobber (reg:SI LR_REGNUM))] "TARGET_THUMB1 && use_cmse" "bl\\t__gnu_cmse_nonsecure_call" [(set_attr "length" "4") diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md index 776d611d2538e790a5f504995050ffdfc51d7193..d56a8bd167575263edc2a4b3f66bda34a4a7a72a 100644 --- a/gcc/config/arm/thumb2.md +++ b/gcc/config/arm/thumb2.md @@ -555,12 +555,11 @@ ) (define_insn "*nonsecure_call_reg_thumb2" - [(call (unspec:SI [(mem:SI (match_operand:SI 0 "s_register_operand" "r"))] + [(call (unspec:SI [(mem:SI (reg:SI R4_REGNUM))] UNSPEC_NONSECURE_MEM) - (match_operand 1 "" "")) - (use (match_operand 2 "" "")) - (clobber (reg:SI LR_REGNUM)) - (clobber (match_dup 0))] + (match_operand 0 "" "")) + (use (match_operand 1 "" "")) + (clobber (reg:SI LR_REGNUM))] "TARGET_THUMB2 && use_cmse"
[PATCH, GCC/ARM] Factor out CMSE register clearing code
Hi, Functions cmse_nonsecure_call_clear_caller_saved and cmse_nonsecure_entry_clear_before_return both contain very similar code to clear registers. What's worse, they differ slightly in places, so if a bug is found in one, careful thought is needed to decide whether the other function needs fixing too. This commit addresses the situation by factoring the two pieces of code into a new function. In doing so, the code generated to clear VFP registers in cmse_nonsecure_call now uses the same sequence as cmse_nonsecure_entry functions. Test expectations are updated accordingly. ChangeLog entries are as follows: *** gcc/ChangeLog *** 2017-10-24 Thomas Preud'homme * config/arm/arm.c (cmse_clear_registers): New function. (cmse_nonsecure_call_clear_caller_saved): Replace register clearing code by call to cmse_clear_registers. (cmse_nonsecure_entry_clear_before_return): Likewise. *** gcc/testsuite/ChangeLog *** 2017-10-24 Thomas Preud'homme * gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: Adapt expectations to vmov instructions now generated. * gcc.target/arm/cmse/mainline/hard-sp/cmse-7.c: Likewise. * gcc.target/arm/cmse/mainline/hard-sp/cmse-8.c: Likewise. * gcc.target/arm/cmse/mainline/hard/cmse-13.c: Likewise. * gcc.target/arm/cmse/mainline/hard/cmse-7.c: Likewise. * gcc.target/arm/cmse/mainline/hard/cmse-8.c: Likewise. Testing: bootstrapped on arm-linux-gnueabihf and no regression in the testsuite. Is this ok for trunk? Best regards, Thomas diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 9b494e9529a4470c18192a4561e03d2f80e90797..22c9add0722974902b2a89b2b0a75759ff8ba37c 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -16991,6 +16991,128 @@ compute_not_to_clear_mask (tree arg_type, rtx arg_rtx, int regno, return not_to_clear_mask; } +/* Clear registers secret before doing a cmse_nonsecure_call or returning from + a cmse_nonsecure_entry function. 
TO_CLEAR_BITMAP indicates which registers + are to be fully cleared, using the value in register CLEARING_REG if more + efficient. The PADDING_BITS_LEN entries array PADDING_BITS_TO_CLEAR gives + the bits that needs to be cleared in caller-saved core registers, with + SCRATCH_REG used as a scratch register for that clearing. + + NOTE: one of three following assertions must hold: + - SCRATCH_REG is a low register + - CLEARING_REG is in the set of registers fully cleared (ie. its bit is set + in TO_CLEAR_BITMAP) + - CLEARING_REG is a low register. */ + +static void +cmse_clear_registers (sbitmap to_clear_bitmap, uint32_t *padding_bits_to_clear, + int padding_bits_len, rtx scratch_reg, rtx clearing_reg) +{ + bool saved_clearing = false; + rtx saved_clearing_reg = NULL_RTX; + int i, regno, clearing_regno, minregno = R0_REGNUM, maxregno = minregno - 1; + + gcc_assert (arm_arch_cmse); + + if (!bitmap_empty_p (to_clear_bitmap)) +{ + minregno = bitmap_first_set_bit (to_clear_bitmap); + maxregno = bitmap_last_set_bit (to_clear_bitmap); +} + clearing_regno = REGNO (clearing_reg); + + /* Clear padding bits. */ + gcc_assert (padding_bits_len <= NUM_ARG_REGS); + for (i = 0, regno = R0_REGNUM; i < padding_bits_len; i++, regno++) +{ + uint64_t mask; + rtx rtx16, dest, cleared_reg = gen_rtx_REG (SImode, regno); + + if (padding_bits_to_clear[i] == 0) + continue; + + /* If this is a Thumb-1 target and SCRATCH_REG is not a low register, use + CLEARING_REG as scratch. */ + if (TARGET_THUMB1 + && REGNO (scratch_reg) > LAST_LO_REGNUM) + { + /* clearing_reg is not to be cleared, copy its value into scratch_reg + such that we can use clearing_reg to clear the unused bits in the + arguments. 
*/ + if ((clearing_regno > maxregno + || !bitmap_bit_p (to_clear_bitmap, clearing_regno)) + && !saved_clearing) + { + gcc_assert (clearing_regno <= LAST_LO_REGNUM); + emit_move_insn (scratch_reg, clearing_reg); + saved_clearing = true; + saved_clearing_reg = scratch_reg; + } + scratch_reg = clearing_reg; + } + + /* Fill the lower half of the negated padding_bits_to_clear[i]. */ + mask = (~padding_bits_to_clear[i]) & 0xFFFF; + emit_move_insn (scratch_reg, gen_int_mode (mask, SImode)); + + /* Fill the top half of the negated padding_bits_to_clear[i]. */ + mask = (~padding_bits_to_clear[i]) >> 16; + rtx16 = gen_int_mode (16, SImode); + dest = gen_rtx_ZERO_EXTRACT (SImode, scratch_reg, rtx16, rtx16); + if (mask) + emit_insn (gen_rtx_SET (dest, gen_int_mode (mask, SImode))); + + emit_insn (gen_andsi3 (cleared_reg, cleared_reg, scratch_reg)); +} + if (saved_clearing) +emit_move_insn (clearing_reg, saved_clearing_reg); + + + /* Clear full registers. */ + + /* If not marked for clearing, clearing_reg already does not
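The padding-bit clearing in the hunk above builds the negated mask in two 16-bit halves and then ANDs it into the register. A minimal sketch of that arithmetic in plain C follows; `clear_padding_bits` is a hypothetical helper written for illustration, not a function from arm.c, and it works on ordinary integers rather than emitting RTL.

```c
#include <stdint.h>

/* Hypothetical helper, not GCC code: return REG with every bit that is
   set in PADDING cleared.  The AND mask is assembled from two 16-bit
   halves, mirroring a move of the lower half followed by an insertion
   of the top half into bits 16..31 of the scratch value.  */
uint32_t
clear_padding_bits (uint32_t reg, uint32_t padding)
{
  uint32_t low = (~padding) & 0xFFFFu;  /* lower half of the negated mask */
  uint32_t high = (~padding) >> 16;     /* top half of the negated mask */
  uint32_t scratch = low;               /* the move leaves bits 16..31 zero */

  /* Only insert the top half when it is non-zero; otherwise the move
     above has already produced the right value.  */
  if (high != 0)
    scratch |= high << 16;

  return reg & scratch;                 /* keep only the non-padding bits */
}
```

For instance, a padding mask of 0x0000FF00 applied to a register holding 0xFFFFFFFF leaves 0xFFFF00FF, since only bits 8 to 15 are treated as padding.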
[PATCH, GCC/ARM] Use bitmap to control cmse_nonsecure_call register clearing
Hi, As part of r253256, cmse_nonsecure_entry_clear_before_return has been rewritten to use auto_sbitmap instead of an integer bitfield to control which registers need to be cleared. This commit continues this work in cmse_nonsecure_call_clear_caller_saved. ChangeLog entry is as follows: *** gcc/ChangeLog *** 2017-10-16 Thomas Preud'homme * config/arm/arm.c (cmse_nonsecure_call_clear_caller_saved): Use auto_sbitmap instead of integer bitfield to control register needing clearing. Testing: bootstrapped on arm-linux-gnueabihf and no regression in the testsuite. Is this ok for trunk? Best regards, Thomas diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 9919f54242d9317125a104f9777d76a85de80e9b..7384b96fea0179334a6010b099df68c8e2a0fc32 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -16990,10 +16990,11 @@ cmse_nonsecure_call_clear_caller_saved (void) FOR_BB_INSNS (bb, insn) { - uint64_t to_clear_mask, float_mask; + unsigned address_regnum, regno, maxregno = + TARGET_HARD_FLOAT_ABI ? D7_VFP_REGNUM : NUM_ARG_REGS - 1; + auto_sbitmap to_clear_bitmap (maxregno + 1); rtx_insn *seq; rtx pat, call, unspec, reg, cleared_reg, tmp; - unsigned int regno, maxregno; rtx address; CUMULATIVE_ARGS args_so_far_v; cumulative_args_t args_so_far; @@ -17024,18 +17025,21 @@ cmse_nonsecure_call_clear_caller_saved (void) continue; /* Determine the caller-saved registers we need to clear. */ - to_clear_mask = (1LL << (NUM_ARG_REGS)) - 1; - maxregno = NUM_ARG_REGS - 1; + bitmap_clear (to_clear_bitmap); + bitmap_set_range (to_clear_bitmap, R0_REGNUM, NUM_ARG_REGS); + /* Only look at the caller-saved floating point registers in case of -mfloat-abi=hard. For -mfloat-abi=softfp we will be using the lazy store and loads which clear both caller- and callee-saved registers. 
*/ if (TARGET_HARD_FLOAT_ABI) { - float_mask = (1LL << (D7_VFP_REGNUM + 1)) - 1; - float_mask &= ~((1LL << FIRST_VFP_REGNUM) - 1); - to_clear_mask |= float_mask; - maxregno = D7_VFP_REGNUM; + auto_sbitmap float_bitmap (maxregno + 1); + + bitmap_clear (float_bitmap); + bitmap_set_range (float_bitmap, FIRST_VFP_REGNUM, +D7_VFP_REGNUM - FIRST_VFP_REGNUM + 1); + bitmap_ior (to_clear_bitmap, to_clear_bitmap, float_bitmap); } /* Make sure the register used to hold the function address is not @@ -17043,7 +17047,9 @@ cmse_nonsecure_call_clear_caller_saved (void) address = RTVEC_ELT (XVEC (unspec, 0), 0); gcc_assert (MEM_P (address)); gcc_assert (REG_P (XEXP (address, 0))); - to_clear_mask &= ~(1LL << REGNO (XEXP (address, 0))); + address_regnum = REGNO (XEXP (address, 0)); + if (address_regnum < R0_REGNUM + NUM_ARG_REGS) + bitmap_clear_bit (to_clear_bitmap, address_regnum); /* Set basic block of call insn so that df rescan is performed on insns inserted here. */ @@ -17064,6 +17070,7 @@ cmse_nonsecure_call_clear_caller_saved (void) FOREACH_FUNCTION_ARGS (fntype, arg_type, args_iter) { rtx arg_rtx; + uint64_t to_clear_args_mask; machine_mode arg_mode = TYPE_MODE (arg_type); if (VOID_TYPE_P (arg_type)) @@ -17076,10 +17083,18 @@ cmse_nonsecure_call_clear_caller_saved (void) arg_rtx = arm_function_arg (args_so_far, arg_mode, arg_type, true); gcc_assert (REG_P (arg_rtx)); - to_clear_mask - &= ~compute_not_to_clear_mask (arg_type, arg_rtx, - REGNO (arg_rtx), - padding_bits_to_clear_ptr); + to_clear_args_mask + = compute_not_to_clear_mask (arg_type, arg_rtx, + REGNO (arg_rtx), + padding_bits_to_clear_ptr); + if (to_clear_args_mask) + { + for (regno = R0_REGNUM; regno <= maxregno; regno++) + { + if (to_clear_args_mask & (1ULL << regno)) + bitmap_clear_bit (to_clear_bitmap, regno); + } + } first_param = false; } @@ -17138,7 +17153,7 @@ cmse_nonsecure_call_clear_caller_saved (void) call. 
*/ for (regno = R0_REGNUM; regno <= maxregno; regno++) { - if (!(to_clear_mask & (1LL << regno))) + if (!bitmap_bit_p (to_clear_bitmap, regno)) continue; /* If regno is an even vfp register and its successor is also to @@ -17147,7 +17162,7 @@ cmse_nonsecure_call_clear_caller_saved (void) { if (TARGET_VFP_DOUBLE && VFP_REGNO_OK_FOR_DOUBLE (regno) - && to_clear_mask & (1LL << (regno + 1))) + && bitmap_bit_p (to_clear_bitmap, (regno + 1))) emit_move_insn (gen_rtx_REG (DFmode, regno++), CONST0_RTX (DFmode)); else @@ -17161,7 +17176,6 @@ cmse_nonsecure_call_clear_caller_saved (void) seq = get_insns (); end_sequence (); emit_insn_before (seq, insn); - } } } @@ -25188,7 +25202,7 @@ cmse_nonsecure_entry_clear_before_return (void) if
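The move from shifts on a 64-bit integer to a bitmap can be pictured with a small stand-alone sketch. Everything below is an assumption made for illustration: `struct reg_bitmap`, `MAX_REGS` and the `reg_bitmap_*` helpers are hypothetical names, not GCC's sbitmap API. The sketch only mirrors the three operations the patch relies on: setting a range, clearing one bit behind a bounds check, and testing a bit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_REGS 128  /* hypothetical upper bound chosen for this sketch */

/* Hypothetical register set: one bit per register number.  */
struct reg_bitmap
{
  size_t n_bits;                     /* number of valid bits */
  unsigned char bits[MAX_REGS / 8];  /* storage, 8 registers per byte */
};

/* Make MAP an empty set of N_BITS registers (capped at MAX_REGS).  */
void
reg_bitmap_init (struct reg_bitmap *map, size_t n_bits)
{
  map->n_bits = n_bits <= MAX_REGS ? n_bits : MAX_REGS;
  memset (map->bits, 0, sizeof map->bits);
}

/* Mark COUNT registers starting at START, ignoring out-of-range ones.  */
void
reg_bitmap_set_range (struct reg_bitmap *map, size_t start, size_t count)
{
  for (size_t i = start; i < start + count && i < map->n_bits; i++)
    map->bits[i / 8] |= (unsigned char) (1u << (i % 8));
}

/* Unmark register BIT; a register outside the set is left untouched.  */
void
reg_bitmap_clear_bit (struct reg_bitmap *map, size_t bit)
{
  if (bit < map->n_bits)
    map->bits[bit / 8] &= (unsigned char) ~(1u << (bit % 8));
}

/* Return true if register BIT is marked.  */
bool
reg_bitmap_test (const struct reg_bitmap *map, size_t bit)
{
  return bit < map->n_bits && ((map->bits[bit / 8] >> (bit % 8)) & 1u) != 0;
}
```

With a set sized for four argument registers, marking the range 0..3 and then clearing the register that holds the call address leaves the other three marked, which is the shape of the logic in the hunks above.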
[PATCH, GCC/testsuite/ARM] Rework expectation for call to Armv8-M nonsecure function
Hi, Testcase gcc.target/arm/cmse/cmse-14.c checks whether bar is called via the __gnu_cmse_nonsecure_call libcall and not via a direct call. However, the pattern is a bit surprising in that it needs to explicitly allow "by" because it accepts anything before the 'b'. This patch rewrites the logic to look for b as the first non-whitespace letter, followed by anything (to match bl and conditional branches), followed by some whitespace and then bar. ChangeLog entry is as follows: *** gcc/testsuite/ChangeLog *** 2017-11-01 Thomas Preud'homme * gcc.target/arm/cmse/cmse-14.c: Change logic to match branch instruction to bar. Testing: Test still passes for both Armv8-M Baseline and Mainline. Is this ok for trunk? Best regards, Thomas diff --git a/gcc/testsuite/gcc.target/arm/cmse/cmse-14.c b/gcc/testsuite/gcc.target/arm/cmse/cmse-14.c index 701e9ee7e318a07278099548f9b7042a1fde1204..df1ea52bec533c36a738d7d3b2b2ff749b0f3713 100644 --- a/gcc/testsuite/gcc.target/arm/cmse/cmse-14.c +++ b/gcc/testsuite/gcc.target/arm/cmse/cmse-14.c @@ -10,4 +10,4 @@ int foo (void) } /* { dg-final { scan-assembler "bl\t__gnu_cmse_nonsecure_call" } } */ -/* { dg-final { scan-assembler-not "b\[^ y\n\]*\\s+bar" } } */ +/* { dg-final { scan-assembler-not "^(.*\\s)?bl?\[^\\s]*\\s+bar" } } */
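The new pattern can be tried outside DejaGnu with a rough C sketch using POSIX regcomp. This is an approximation under stated assumptions: the testsuite uses Tcl regular expressions, so `\s` is rewritten here as `[[:space:]]`, REG_NEWLINE is used so that `^` anchors at each line, and `calls_bar_directly` is a hypothetical helper name.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: return true if ASM_TEXT contains a branch whose
   operand starts with "bar", using a POSIX ERE translation of the
   pattern from the patch.  */
bool
calls_bar_directly (const char *asm_text)
{
  /* b, optional l, optional suffix (e.g. a condition code), whitespace,
     then bar; the mnemonic must start the line or follow whitespace.  */
  const char *pattern
    = "^(.*[[:space:]])?bl?[^[:space:]]*[[:space:]]+bar";
  regex_t re;

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NEWLINE | REG_NOSUB) != 0)
    return false;  /* treat a compile failure as "no match" in this sketch */

  bool found = regexec (&re, asm_text, 0, NULL, 0) == 0;
  regfree (&re);
  return found;
}
```

Under these assumptions, a line such as a tab, `bl`, a tab and `bar` is reported as a direct call, while the libcall line `bl` followed by `__gnu_cmse_nonsecure_call` is not.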
[PATCH, GCC/testsuite/ARM] Fix selection of effective target for cmse tests
Hi, Some of the tests in the gcc.target/arm/cmse directory (eg. gcc.target/arm/cmse/mainline/bitfield-4.c) are failing when run without an architecture specified in RUNTESTFLAGS due to them not adding the option to select an Armv8-M architecture. This patch fixes the issue by adding the right option from the exp file so that no architecture fiddling is necessary in the individual tests. ChangeLog entry is as follows: *** gcc/testsuite/ChangeLog *** 2017-11-03 Thomas Preud'homme* gcc.target/arm/cmse/cmse.exp: Add option to select Armv8-M Baseline or Armv8-M Mainline when running the respective tests. * gcc.target/arm/cmse/baseline/cmse-11.c: Remove architecture check and selection. * gcc.target/arm/cmse/baseline/cmse-13.c: Likewise. * gcc.target/arm/cmse/baseline/cmse-2.c: Likewise. * gcc.target/arm/cmse/baseline/cmse-6.c: Likewise. * gcc.target/arm/cmse/baseline/softfp.c: Likewise. * gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: Likewise. * gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: Likewise. * gcc.target/arm/cmse/mainline/hard-sp/cmse-7.c: Likewise. * gcc.target/arm/cmse/mainline/hard-sp/cmse-8.c: Likewise. * gcc.target/arm/cmse/mainline/hard/cmse-13.c: Likewise. * gcc.target/arm/cmse/mainline/hard/cmse-5.c: Likewise. * gcc.target/arm/cmse/mainline/hard/cmse-7.c: Likewise. * gcc.target/arm/cmse/mainline/hard/cmse-8.c: Likewise. * gcc.target/arm/cmse/mainline/soft/cmse-13.c: Likewise. * gcc.target/arm/cmse/mainline/soft/cmse-5.c: Likewise. * gcc.target/arm/cmse/mainline/soft/cmse-7.c: Likewise. * gcc.target/arm/cmse/mainline/soft/cmse-8.c: Likewise. * gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: Likewise. * gcc.target/arm/cmse/mainline/softfp-sp/cmse-7.c: Likewise. * gcc.target/arm/cmse/mainline/softfp-sp/cmse-8.c: Likewise. * gcc.target/arm/cmse/mainline/softfp/cmse-13.c: Likewise. * gcc.target/arm/cmse/mainline/softfp/cmse-5.c: Likewise. * gcc.target/arm/cmse/mainline/softfp/cmse-7.c: Likewise. * gcc.target/arm/cmse/mainline/softfp/cmse-8.c: Likewise. 
Testing: Running cmse.exp for both Armv8-M Baseline and Mainline shows no regression. Running it for a toolchain defaulting to Armv8-M Baseline but with RUNTESTFLAGS unset sees some FAIL->PASS. Is this ok for trunk? Best regards, Thomas diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c index 795544fe11d9d7f24086be16916a5bfee89d7b44..230b255963f56a6c29b91d2501b43fed6eda2476 100644 --- a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c +++ b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c @@ -1,7 +1,5 @@ /* { dg-do compile } */ /* { dg-options "-mcmse" } */ -/* { dg-require-effective-target arm_arch_v8m_base_ok } */ -/* { dg-add-options arm_arch_v8m_base } */ int __attribute__ ((cmse_nonsecure_call)) (*bar) (int); diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c index 7208a2cedd2f4f8296b2801d6f5e5d7838b26551..7ab3219e860e993e2eca3bbee2e885f59b7b3cb4 100644 --- a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c +++ b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c @@ -1,7 +1,5 @@ /* { dg-do compile } */ /* { dg-options "-mcmse" } */ -/* { dg-require-effective-target arm_arch_v8m_base_ok } */ -/* { dg-add-options arm_arch_v8m_base } */ #include "../cmse-13.x" diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-2.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-2.c index fec7dc10484b14db5796f5f431a9306c3b2e307c..d5115ecf2bdb3e87dc6a92244cb204e753f25b07 100644 --- a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-2.c +++ b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-2.c @@ -1,7 +1,5 @@ /* { dg-do compile } */ /* { dg-options "-mcmse" } */ -/* { dg-require-effective-target arm_arch_v8m_base_ok } */ -/* { dg-add-options arm_arch_v8m_base } */ extern float bar (void); diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-6.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-6.c index 
43d45e7a63e56edfebc203c8f0e516dc13fbbd65..cae4f343621d1a19a8893ea4950d33e5e1842fb5 100644 --- a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-6.c +++ b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-6.c @@ -1,7 +1,5 @@ /* { dg-do compile } */ /* { dg-options "-mcmse" } */ -/* { dg-require-effective-target arm_arch_v8m_base_ok } */ -/* { dg-add-options arm_arch_v8m_base } */ int __attribute__ ((cmse_nonsecure_call)) (*bar) (double); diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/softfp.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/softfp.c index ca76e12cd9287fd12b7eb7add638973f5d314939..3d383ff6ee17677120e3e1e81726785c30f3b25c 100644 --- a/gcc/testsuite/gcc.target/arm/cmse/baseline/softfp.c +++
[PATCH, GCC/ARM] Fix ICE in Armv8-M Security Extensions code
Hi, Commit r253825, which introduced some sanity checks for sbitmap, revealed a bug in the conversion of cmse_nonsecure_entry_clear_before_return () to using a bitmap structure. bitmap_and expects the two bitmaps to have the same length, yet the code in cmse_nonsecure_entry_clear_before_return () gives to_clear_bitmap and to_clear_arg_regs_bitmap different sizes, on the assumption that bitmap_and would behave as if the bits not allocated were in fact zero. This commit makes sure both bitmaps are equally sized. ChangeLog entry is as follows: *** gcc/ChangeLog *** 2017-11-13 Thomas Preud'homme * config/arm/arm.c (cmse_nonsecure_entry_clear_before_return): Allocate to_clear_arg_regs_bitmap to the same size as to_clear_bitmap. Testing: Bootstrapped GCC on arm-none-linux-gnueabihf target and testsuite shows no regression. Running cmse.exp tests for Armv8-M Baseline and Mainline shows FAIL->PASS for bitfield-1, bitfield-2, bitfield-3 and struct-1 testcases. Is this ok for trunk? Best regards, Thomas diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index db99303f3fb7a2196f48358e74fa4d98f31f045e..106e3edce0d6f2518eb391c436c5213a78d1275b 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -25205,7 +25205,8 @@ cmse_nonsecure_entry_clear_before_return (void) if (padding_bits_to_clear != 0) { rtx reg_rtx; - auto_sbitmap to_clear_arg_regs_bitmap (R0_REGNUM + NUM_ARG_REGS); + int to_clear_bitmap_size = SBITMAP_SIZE ((sbitmap) to_clear_bitmap); + auto_sbitmap to_clear_arg_regs_bitmap (to_clear_bitmap_size); /* Padding bits to clear is not 0 so we know we are dealing with returning a composite type, which only uses r0. Let's make sure that
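The size requirement behind this fix can be shown with a toy bitmap that, like the checked operation described in the message, refuses operands of different sizes. The type and helpers below (`struct small_bitmap`, `small_bitmap_alloc`, `small_bitmap_and`) are hypothetical and limited to 64 bits; they are not the sbitmap implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a sized bitmap; holds at most 64 bits.  */
struct small_bitmap
{
  size_t n_bits;  /* allocated size in bits */
  uint64_t bits;  /* bit I set means element I is a member */
};

/* Create an empty bitmap of N_BITS bits (N_BITS <= 64).  */
struct small_bitmap
small_bitmap_alloc (size_t n_bits)
{
  assert (n_bits <= 64);
  struct small_bitmap map = { n_bits, 0 };
  return map;
}

/* DST = A & B.  Operands of different sizes are rejected rather than
   treating the unallocated bits of the shorter one as zero.  */
void
small_bitmap_and (struct small_bitmap *dst, const struct small_bitmap *a,
                  const struct small_bitmap *b)
{
  assert (dst->n_bits == a->n_bits && a->n_bits == b->n_bits);
  dst->bits = a->bits & b->bits;
}
```

Allocating the second bitmap from the first one's `n_bits`, in the spirit of the patch sizing to_clear_arg_regs_bitmap from to_clear_bitmap, keeps the assertion satisfied whatever size the first bitmap was given.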
Re: [PATCH, GCC/testsuite/ARM] Consolidate sources for cmse tests
on-1.c: Likewise. * gcc.target/arm/cmse/union-2.x: New file. * gcc.target/arm/cmse/baseline/union-2.c: Remove code and include above file. * gcc.target/arm/cmse/mainline/union-2.c: Likewise. Testing: Running cmse.exp for both Armv8-M Baseline and Mainline shows no regression. Is this ok for trunk? Best regards, Thomas On 10/11/17 11:19, Thomas Preudhomme wrote: For the most part, testcases under gcc.target/arm/cmse/baseline and gcc.target/arm/cmse/mainline are duplicate copies that differ only in their dejagnu directives. Although there is no requirement for them to be similar, keeping them identical allows the generated code to be compared and makes the testcases easier to update when code generation changes for both architectures (if one needs updating, so does the other). Similarly, all the tests in gcc.target/arm/cmse/mainline/ have the same source but exist as duplicate copies. This patch moves all the code in the tests to a parent directory: gcc.target/arm/cmse for tests shared by Armv8-M Baseline and Mainline, and gcc.target/arm/cmse/mainline for tests shared *only* by the various float ABIs of Armv8-M Mainline. C includes are then used where the code used to sit. Note that the cmse-13.c test used to differ slightly, across the architectures and float ABIs tested, in the first floating-point constant passed to bar: sometimes 1.0 and sometimes 3.0. This patch settles on 3.0 to avoid confusion with the 1.0 constant used to clear VFP registers in some of the configurations. ChangeLog entry is as follows: *** gcc/testsuite/ChangeLog *** 2017-11-03 Thomas Preud'homme <thomas.preudho...@arm.com> * gcc.target/arm/cmse/bitfield-4.x: New file. * gcc.target/arm/cmse/baseline/bitfield-4.c: Remove code and include above file. * gcc.target/arm/cmse/mainline/bitfield-4.c: Likewise. * gcc.target/arm/cmse/bitfield-5.x: New file. * gcc.target/arm/cmse/baseline/bitfield-5.c: Remove code and include above file. * gcc.target/arm/cmse/mainline/bitfield-5.c: Likewise. 
* gcc.target/arm/cmse/bitfield-6.x: New file. * gcc.target/arm/cmse/baseline/bitfield-6.c: Remove code and include above file. * gcc.target/arm/cmse/mainline/bitfield-6.c: Likewise. * gcc.target/arm/cmse/bitfield-7.x: New file. * gcc.target/arm/cmse/baseline/bitfield-7.c: Remove code and include above file. * gcc.target/arm/cmse/mainline/bitfield-7.c: Likewise. * gcc.target/arm/cmse/bitfield-8.x: New file. * gcc.target/arm/cmse/baseline/bitfield-8.c: Remove code and include above file. * gcc.target/arm/cmse/mainline/bitfield-8.c: Likewise. * gcc.target/arm/cmse/bitfield-9.x: New file. * gcc.target/arm/cmse/baseline/bitfield-9.c: Remove code and include above file. * gcc.target/arm/cmse/mainline/bitfield-9.c: Likewise. * gcc.target/arm/cmse/bitfield-and-union.x: New file. * gcc.target/arm/cmse/baseline/bitfield-and-union-1.c: Rename into ... * gcc.target/arm/cmse/baseline/bitfield-and-union.c: This. Remove code and include above bitfield-and-union.x file. * gcc.target/arm/cmse/mainline/bitfield-and-union-1.c: Rename into ... * gcc.target/arm/cmse/mainline/bitfield-and-union.c: this. Remove code and include above bitfield-and-union.x file. * gcc.target/arm/cmse/cmse-13.x: New file. * gcc.target/arm/cmse/baseline/cmse-13.c: Remove code and include above file. * gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: Likewise. * gcc.target/arm/cmse/mainline/hard/cmse-13.c: Likewise. * gcc.target/arm/cmse/mainline/soft/cmse-13.c: Likewise. * gcc.target/arm/cmse/mainline/softfp/cmse-13.c: Likewise. * gcc.target/arm/cmse/cmse-5.x: New file. * gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: Remove code and include above file. * gcc.target/arm/cmse/mainline/har
[PATCH, GCC/testsuite/ARM] Consolidate sources for cmse tests
For the most part, testcases under gcc.target/arm/cmse/baseline and gcc.target/arm/cmse/mainline are duplicate copies that differ only in their dejagnu directives. Although there is no requirement for them to be similar, keeping them identical allows the generated code to be compared and makes the testcases easier to update when code generation changes for both architectures (if one needs updating, so does the other). Similarly, all the tests in gcc.target/arm/cmse/mainline/ have the same source but exist as duplicate copies. This patch moves all the code in the tests to a parent directory: gcc.target/arm/cmse for tests shared by Armv8-M Baseline and Mainline, and gcc.target/arm/cmse/mainline for tests shared *only* by the various float ABIs of Armv8-M Mainline. C includes are then used where the code used to sit. Note that the cmse-13.c test used to differ slightly, across the architectures and float ABIs tested, in the first floating-point constant passed to bar: sometimes 1.0 and sometimes 3.0. This patch settles on 3.0 to avoid confusion with the 1.0 constant used to clear VFP registers in some of the configurations. ChangeLog entry is as follows: *** gcc/testsuite/ChangeLog *** 2017-11-03 Thomas Preud'homme * gcc.target/arm/cmse/bitfield-4.x: New file. * gcc.target/arm/cmse/baseline/bitfield-4.c: Remove code and include above file. * gcc.target/arm/cmse/mainline/bitfield-4.c: Likewise. * gcc.target/arm/cmse/bitfield-5.x: New file. * gcc.target/arm/cmse/baseline/bitfield-5.c: Remove code and include above file. * gcc.target/arm/cmse/mainline/bitfield-5.c: Likewise. * gcc.target/arm/cmse/bitfield-6.x: New file. * gcc.target/arm/cmse/baseline/bitfield-6.c: Remove code and include above file. * gcc.target/arm/cmse/mainline/bitfield-6.c: Likewise. * gcc.target/arm/cmse/bitfield-7.x: New file. * gcc.target/arm/cmse/baseline/bitfield-7.c: Remove code and include above file. * gcc.target/arm/cmse/mainline/bitfield-7.c: Likewise. * gcc.target/arm/cmse/bitfield-8.x: New file. 
* gcc.target/arm/cmse/baseline/bitfield-8.c: Remove code and include above file. * gcc.target/arm/cmse/mainline/bitfield-8.c: Likewise. * gcc.target/arm/cmse/bitfield-9.x: New file. * gcc.target/arm/cmse/baseline/bitfield-9.c: Remove code and include above file. * gcc.target/arm/cmse/mainline/bitfield-9.c: Likewise. * gcc.target/arm/cmse/bitfield-and-union.x: New file. * gcc.target/arm/cmse/baseline/bitfield-and-union-1.c: Rename into ... * gcc.target/arm/cmse/baseline/bitfield-and-union.c: This. Remove code and include above bitfield-and-union.x file. * gcc.target/arm/cmse/mainline/bitfield-and-union-1.c: Rename into ... * gcc.target/arm/cmse/mainline/bitfield-and-union.c: this. Remove code and include above bitfield-and-union.x file. * gcc.target/arm/cmse/cmse-13.x: New file. * gcc.target/arm/cmse/baseline/cmse-13.c: Remove code and include above file. * gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: Likewise. * gcc.target/arm/cmse/mainline/hard/cmse-13.c: Likewise. * gcc.target/arm/cmse/mainline/soft/cmse-13.c: Likewise. * gcc.target/arm/cmse/mainline/softfp/cmse-13.c: Likewise. * gcc.target/arm/cmse/cmse-5.x: New file. * gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: Remove code and include above file. * gcc.target/arm/cmse/mainline/har
[PATCH, GCC/testsuite] Fix retrieval of testname
When gcc-dg-runtest is used to run a test, the test is run several times with different options. For clarity of the log, the test infrastructure then appends the options to the testname. This means that all the code that must deal with the testcase itself (eg. removing the output files after the test has run) needs to remove the options from the name. There is already a pattern (see below) for this in several places of the testsuite framework, but it is also missing in many places. This patch fixes all of these places. The pattern is as follows:

    set testcase [testname-for-summary]
    # The name might include a list of options; extract the file name.
    set testcase [lindex $testcase 0]

ChangeLog entry is as follows: *** gcc/testsuite/ChangeLog *** 2017-11-08 Thomas Preud'homme * lib/scanasm.exp (scan-assembler): Extract filename from testname used in summary. (scan-assembler-not): Likewise. (scan-hidden): Likewise. (scan-not-hidden): Likewise. (scan-stack-usage): Likewise. (scan-stack-usage-not): Likewise. (scan-assembler-times): Likewise. (scan-assembler-dem): Likewise. (scan-assembler-dem-not): Likewise. (object-size): Likewise. (scan-lto-assembler): Likewise. * lib/scandump.exp (scan-dump): Likewise. (scan-dump-times): Likewise. (scan-dump-not): Likewise. (scan-dump-dem): Likewise. (scan-dump-dem-not): Likewise. Testing: Ran testsuite on bootstrap aarch64-linux-gnu and x86_64-linux-gnu compiled with C, fortran and ada support without any regression. Is this ok for trunk? 
Best regards, Thomas diff --git a/gcc/testsuite/lib/scanasm.exp b/gcc/testsuite/lib/scanasm.exp index a66bb28253196410554405facefa8641d1020c1d..33286152f30df959a4bffa81634d0bfe7b898e8f 100644 --- a/gcc/testsuite/lib/scanasm.exp +++ b/gcc/testsuite/lib/scanasm.exp @@ -78,7 +78,9 @@ proc dg-scan { name positive testcase output_file orig_args } { proc scan-assembler { args } { set testcase [testname-for-summary] -set output_file "[file rootname [file tail $testcase]].s" +# The name might include a list of options; extract the file name. +set filename [lindex $testcase 0] +set output_file "[file rootname [file tail $filename]].s" dg-scan "scan-assembler" 1 $testcase $output_file $args } @@ -89,7 +91,9 @@ force_conventional_output_for scan-assembler proc scan-assembler-not { args } { set testcase [testname-for-summary] -set output_file "[file rootname [file tail $testcase]].s" +# The name might include a list of options; extract the file name. +set filename [lindex $testcase 0] +set output_file "[file rootname [file tail $filename]].s" dg-scan "scan-assembler-not" 0 $testcase $output_file $args } @@ -117,7 +121,9 @@ proc hidden-scan-for { symbol } { proc scan-hidden { args } { set testcase [testname-for-summary] -set output_file "[file rootname [file tail $testcase]].s" +# The name might include a list of options; extract the file name. +set filename [lindex $testcase 0] +set output_file "[file rootname [file tail $filename]].s" set symbol [lindex $args 0] @@ -133,7 +139,9 @@ proc scan-hidden { args } { proc scan-not-hidden { args } { set testcase [testname-for-summary] -set output_file "[file rootname [file tail $testcase]].s" +# The name might include a list of options; extract the file name. 
+set filename [lindex $testcase 0] +set output_file "[file rootname [file tail $filename]].s" set symbol [lindex $args 0] set hidden_scan [hidden-scan-for $symbol] @@ -163,7 +171,9 @@ proc scan-file-not { output_file args } { proc scan-stack-usage { args } { set testcase [testname-for-summary] -set output_file "[file rootname [file tail $testcase]].su" +# The name might include a list of options; extract the file name. +set filename [lindex $testcase 0] +set output_file "[file rootname [file tail $filename]].su" dg-scan "scan-file" 1 $testcase $output_file $args } @@ -173,7 +183,9 @@ proc scan-stack-usage { args } { proc scan-stack-usage-not { args } { set testcase [testname-for-summary] -set output_file "[file rootname [file tail $testcase]].su" +# The name might include a list of options; extract the file name. +set filename [lindex $testcase 0] +set output_file "[file rootname [file tail $filename]].su" dg-scan "scan-file-not" 0 $testcase $output_file $args } @@ -230,12 +242,14 @@ proc scan-assembler-times { args } { } set testcase [testname-for-summary] +# The name might include a list of options; extract the file name. +set filename [lindex $testcase 0] set pattern [lindex $args 0] set times [lindex $args 1] set pp_pattern [make_pattern_printable $pattern] # This must match the rule in gcc-dg.exp. -set output_file "[file rootname [file tail
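Why the options must be stripped before deriving the output file can be seen by redoing the same steps in C. The sketch below is an illustration only: `output_file_for_test` is a hypothetical helper, and it approximates Tcl's `lindex ... 0`, `file tail` and `file rootname` with simple string scanning (first whitespace-separated word, text after the last `/`, text before the last `.`).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: given a summary name such as "dir/foo.c -O2",
   write "foo" followed by EXT into OUT.  Returns 0 on success, or -1
   if OUT is too small.  */
int
output_file_for_test (const char *testname, const char *ext,
                      char *out, size_t out_len)
{
  /* Keep only the first word: the file name without the options.  */
  size_t name_len = strcspn (testname, " \t");

  /* Drop the directory part, roughly like Tcl's "file tail".  */
  const char *tail = testname;
  for (size_t i = 0; i < name_len; i++)
    if (testname[i] == '/')
      tail = testname + i + 1;
  size_t tail_len = name_len - (size_t) (tail - testname);

  /* Drop the last extension, roughly like Tcl's "file rootname".  */
  size_t root_len = tail_len;
  for (size_t i = tail_len; i > 0; i--)
    if (tail[i - 1] == '.')
      {
        root_len = i - 1;
        break;
      }

  int n = snprintf (out, out_len, "%.*s%s", (int) root_len, tail, ext);
  return (n >= 0 && (size_t) n < out_len) ? 0 : -1;
}
```

Skipping the first step would make the scan start from the whole summary name, so the later steps would operate on the trailing options instead of the source file, which is the situation the patch avoids by extracting the file name first.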
[PATCH, GCC/ARM] Fix cmse_nonsecure_entry return insn size
Hi,

A number of instructions are output in assembler form by output_return_instruction () when compiling a function with the cmse_nonsecure_entry attribute for Armv8-M Mainline with the hardfloat float ABI. However, the corresponding thumb2_cmse_entry_return insn pattern does not account for all these instructions when computing the length of the instruction. This may lead GCC to use the wrong branch instruction due to incorrect computation of the offset between the branch instruction's address and the target address.

This commit fixes the mismatch between what output_return_instruction () does and what the pattern thinks it does, and adds a note warning about this mismatch to the affected functions' heading comments to ensure the code does not get out of sync again.

Note: no test is provided because the C testcase is fragile (only works on GCC 6) and the extracted RTL test fails to compile due to bugs in the RTL frontend (PR82815 and PR82817).

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2017-10-30 Thomas Preud'homme

    * config/arm/arm.c (output_return_instruction): Add comments to indicate requirement for cmse_nonsecure_entry return to account for the size of clearing instruction output here.
    (thumb_exit): Likewise.
    * config/arm/thumb2.md (thumb2_cmse_entry_return): Fix length for return in hardfloat mode.

Testing: Bootstrapped on arm-linux-gnueabihf and testsuite shows no regression.

Is this ok for trunk?

Best regards,

Thomas

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 033ec255a577f782201527f57f45802bc0eb45e0..9919f54242d9317125a104f9777d76a85de80e9b 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -19417,7 +19417,12 @@ arm_get_vfp_saved_size (void)
 /* Generate a function exit sequence. If REALLY_RETURN is false, then do
    everything bar the final return instruction. If simple_return is true,
-   then do not output epilogue, because it has already been emitted in RTL. */
+   then do not output epilogue, because it has already been emitted in RTL.
+
+   Note: do not forget to update length attribute of corresponding insn pattern
+   when changing assembly output (eg. length attribute of
+   thumb2_cmse_entry_return when updating Armv8-M Mainline Security Extensions
+   register clearing sequences). */
 const char *
 output_return_instruction (rtx operand, bool really_return, bool reverse,
                            bool simple_return)
@@ -23950,7 +23955,12 @@ thumb_pop (FILE *f, unsigned long mask)
 /* Generate code to return from a thumb function.
    If 'reg_containing_return_addr' is -1, then the return address is
-   actually on the stack, at the stack pointer. */
+   actually on the stack, at the stack pointer.
+
+   Note: do not forget to update length attribute of corresponding insn pattern
+   when changing assembly output (eg. length attribute of epilogue_insns when
+   updating Armv8-M Baseline Security Extensions register clearing
+   sequences). */
 static void
 thumb_exit (FILE *f, int reg_containing_return_addr)
 {
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index b78c3d256aeafc2eeb3dcdc2b9b07b1af9df5294..776d611d2538e790a5f504995050ffdfc51d7193 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -1132,7 +1132,7 @@
    ; we adapt the length accordingly.
    (set (attr "length")
      (if_then_else (match_test "TARGET_HARD_FLOAT")
-       (const_int 12)
+       (const_int 34)
        (const_int 8)))
    ; We do not support predicate execution of returns from cmse_nonsecure_entry
    ; functions because we need to clear the APSR. Since predicable has to be
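To illustrate why an under-reported length attribute matters, here is a rough C sketch of the reasoning, not GCC's actual branch-shortening code. The values 12 and 34 come from the patch; the other instruction lengths are made up for the example, the function names are hypothetical, and the model ignores the PC bias of Thumb branches. The range used is that of the 16-bit Thumb conditional branch encoding (-256 to +254 bytes).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Reach of the 16-bit Thumb conditional branch encoding, in bytes.  */
#define SHORT_BRANCH_MIN (-256)
#define SHORT_BRANCH_MAX 254

/* Sum the byte lengths of the instructions lying between a forward
   branch and its target.  Simplified: the PC bias is ignored.  */
static int
forward_displacement (const int *lengths, size_t count)
{
  int total = 0;
  for (size_t i = 0; i < count; i++)
    {
      assert (lengths[i] >= 0);
      total += lengths[i];
    }
  return total;
}

/* Return true if a short branch can cover DISPLACEMENT bytes.  */
static bool
short_branch_reaches (int displacement)
{
  return displacement >= SHORT_BRANCH_MIN
         && displacement <= SHORT_BRANCH_MAX;
}
```

With hypothetical lengths {120, 120, 12} the estimated displacement is 252 bytes and a short branch looks safe; with the real size {120, 120, 34} it is 274 bytes and the short encoding would not reach.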
[PATCH, GCC/ARM] Allow +nodsp for -mcpu=cortex-m33
Hi,

DSP instructions are optional for Arm Cortex-M33, yet its -mcpu option does not allow +nodsp. Users are thus left with using -march=armv8-m.main -mtune=cortex-m33. This patch allows +nodsp to -mcpu=cortex-m33.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-10-11 Thomas Preud'homme

    * config/arm/arm-cpus.in (cortex-m33): Add nodsp option.
    * doc/invoke.texi: Document +nodsp as a valid extension for -mcpu=cortex-m33.

Tested by building an arm-none-eabi GCC cross-compiler and checking that __ARM_FEATURE_DSP is *not* defined when invoked with -mcpu=cortex-m33+nodsp.

Is this ok for trunk?

Best regards,

Thomas

diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 07de4c9375ba7a0df0d8bd00385e54a4042e5264..25fc429a8338e433b9fcd0ee385ff127423494c2 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -1516,6 +1516,7 @@ begin cpu cortex-m33
  architecture armv8-m.main+dsp
  fpu fpv5-sp-d16
  option nofp remove ALL_FP
+ option nodsp remove armv7em
  costs v7m
 end cpu cortex-m33
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9ad1fb339babe2ce8f45ecac2fa93d7b9ae5fd30..722d5cc2c0a020906e6df3260822cdd268245082 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15803,6 +15803,9 @@ Permissible names for this option are the same as those for
 The following extension options are common to the listed CPUs:
 
 @table @samp
+@item +nodsp
+Disable the DSP instructions on @samp{cortex-m33}.
+
 @item +nofp
 Disables the floating-point instructions on @samp{arm9e},
 @samp{arm946e-s}, @samp{arm966e-s}, @samp{arm968e-s}, @samp{arm10e},
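The check described in the testing note can be reproduced from C with a small probe. This is a minimal sketch: have_dsp_extension is a hypothetical name, and the only thing it relies on is the __ARM_FEATURE_DSP macro mentioned above.

```c
#include <stdio.h>

/* Hypothetical probe: return 1 when the compiler advertises the DSP
   instructions through __ARM_FEATURE_DSP, 0 otherwise.  */
static int
have_dsp_extension (void)
{
#ifdef __ARM_FEATURE_DSP
  return 1;
#else
  return 0;
#endif
}

/* Print the result in a form that is easy to compare between builds.  */
static void
report_dsp_extension (FILE *stream)
{
  fprintf (stream, "DSP extension: %s\n",
           have_dsp_extension () ? "available" : "not available");
}
```

Built with -mcpu=cortex-m33+nodsp, the probe would be expected to return 0; a plain -mcpu=cortex-m33 build would be expected to return 1.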
Re: [PATCH, GCC/ARM, ping] Remove ARMv8-M code for D17-D31
Committed (sorry for delay).

Best regards,

Thomas

On 06/09/17 09:12, Kyrill Tkachov wrote:

Hi Thomas,

On 05/09/17 10:04, Thomas Preudhomme wrote:

Ping?

This is ok if a bootstrap and test run on arm-none-linux-gnueabihf shows no problems.

Thanks,
Kyrill

Best regards,

Thomas

On 25/08/17 12:18, Thomas Preudhomme wrote:

Hi,

I've now also added a couple more changes:

* size to_clear_bitmap according to maxregno to be consistent with its use
* use TARGET_HARD_FLOAT directly instead of clear_vfpregs

Original message below (ChangeLog unchanged):

Function cmse_nonsecure_entry_clear_before_return has code to deal with high VFP registers (D16-D31) while ARMv8-M Baseline and Mainline both do not support more than 16 double VFP registers (D0-D15). This makes this security-sensitive code harder to read for not much benefit since the libcall for cmse_nonsecure_call functions does not deal with those high VFP registers anyway.

This commit gets rid of this code for simplicity and fixes 2 issues in the same function:

- stop the first loop when reaching maxregno to avoid dealing with VFP registers if targeting Thumb-1 or using -mfloat-abi=soft
- include maxregno in that loop

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-06-13 Thomas Preud'homme <thomas.preudho...@arm.com>

    * config/arm/arm.c (arm_option_override): Forbid ARMv8-M Security Extensions with more than 16 double VFP registers.
    (cmse_nonsecure_entry_clear_before_return): Remove second entry of to_clear_mask and all code related to it. Replace the remaining entry by a sbitmap and adapt code accordingly.

Testing: Testsuite shows no regression when run for ARMv8-M Baseline and ARMv8-M Mainline.

Is this ok for trunk?

Best regards,

Thomas

On 23/08/17 11:56, Thomas Preudhomme wrote:

Ping?

Best regards,

Thomas

On 17/07/17 17:25, Thomas Preudhomme wrote:

My bad, found an off-by-one error in the sizing of bitmaps. Please find the fixed patch in attachment.

ChangeLog entry is unchanged:

*** gcc/ChangeLog ***

2017-06-13 Thomas Preud'homme <thomas.preudho...@arm.com>

    * config/arm/arm.c (arm_option_override): Forbid ARMv8-M Security Extensions with more than 16 double VFP registers.
    (cmse_nonsecure_entry_clear_before_return): Remove second entry of to_clear_mask and all code related to it. Replace the remaining entry by a sbitmap and adapt code accordingly.

Best regards,

Thomas

On 17/07/17 09:52, Thomas Preudhomme wrote:

Ping?

Best regards,

Thomas

On 12/07/17 09:59, Thomas Preudhomme wrote:

Hi Richard,

On 07/07/17 15:19, Richard Earnshaw (lists) wrote:

Hmm, I think that's because really this is a partial conversion. It looks like doing this properly would involve moving that existing code to use sbitmaps as well. I think doing that would be better for long-term maintenance perspectives, but I'm not going to insist that you do it now.

There's also the assert later but I've found a way to improve it slightly. While switching to auto_sbitmap I also changed the code slightly to allocate bitmaps directly at the right size. Since the change is probably bigger than what you had in mind I'd appreciate it if you could give me an OK again. See updated patch in attachment.

ChangeLog entry is unchanged:

2017-06-13 Thomas Preud'homme <thomas.preudho...@arm.com>

    * config/arm/arm.c (arm_option_override): Forbid ARMv8-M Security Extensions with more than 16 double VFP registers.
    (cmse_nonsecure_entry_clear_before_return): Remove second entry of to_clear_mask and all code related to it. Replace the remaining entry by a sbitmap and adapt code accordingly.

As a result I'll let you take the call as to whether you keep this version or go back to your earlier patch. If you do decide to keep this version, then see the comment below.

Given the changes I'm happier with how the patch looks now, and getting it in can be a nice incentive to change other ARMv8-M Security Extension related code later on.

Best regards,

Thomas
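The two points raised in this thread (sizing the bitmap from maxregno and including maxregno in the loop) can be sketched as follows. This is a standalone illustration with a hypothetical reg_bitmap type; it is not GCC's sbitmap or auto_sbitmap API and not the code from the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define WORD_BITS (sizeof (unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for a register bitmap.  */
struct reg_bitmap
{
  unsigned int nbits;
  unsigned long *words;
};

/* Allocate a cleared bitmap able to hold registers 0..MAXREGNO inclusive,
   hence MAXREGNO + 1 bits.  Sizing it with MAXREGNO bits would be the
   off-by-one mentioned in the thread.  */
static struct reg_bitmap
reg_bitmap_alloc (unsigned int maxregno)
{
  struct reg_bitmap map;
  map.nbits = maxregno + 1;
  map.words = calloc ((map.nbits + WORD_BITS - 1) / WORD_BITS,
                      sizeof (unsigned long));
  assert (map.words != NULL);
  return map;
}

static void
reg_bitmap_set (struct reg_bitmap *map, unsigned int regno)
{
  assert (regno < map->nbits);
  map->words[regno / WORD_BITS] |= 1UL << (regno % WORD_BITS);
}

static int
reg_bitmap_test (const struct reg_bitmap *map, unsigned int regno)
{
  assert (regno < map->nbits);
  return (int) ((map->words[regno / WORD_BITS] >> (regno % WORD_BITS)) & 1UL);
}

/* Mark every register up to and including MAXREGNO; note the "<=".  */
static void
mark_all_up_to (struct reg_bitmap *map, unsigned int maxregno)
{
  for (unsigned int regno = 0; regno <= maxregno; regno++)
    reg_bitmap_set (map, regno);
}
```

The caller owns the storage and releases it with free (map.words) when done.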
[arm-embedded] [PATCH 3/3, GCC/ARM] Add support for ARM Cortex-R52 processor
Hi,

We have decided to apply the following patch to the embedded-7-branch to enable Arm Cortex-R52 support.

*** gcc/ChangeLog.arm ***

2017-09-04 Thomas Preud'homme <thomas.preudho...@arm.com>

    Backport from mainline
    2017-07-14 Thomas Preud'homme <thomas.preudho...@arm.com>

    * config/arm/arm-cpus.in (cortex-r52): Add new entry.
    (armv8-r): Set ARM Cortex-R52 as default CPU.
    * config/arm/arm-tables.opt: Regenerate.
    * config/arm/arm-tune.md: Regenerate.
    * config/arm/driver-arm.c (arm_cpu_table): Add entry for ARM Cortex-R52.
    * doc/invoke.texi: Mention -mtune=cortex-r52 and availability of fp.dp extension for -mcpu=cortex-r52.

Best regards,

Thomas

--- Begin Message ---

Hi,

On 29/06/17 16:13, Thomas Preudhomme wrote:

Please ignore this patch. I'll respin the patch on a more recent GCC.

Please find an updated patch in attachment.

This patch adds support for the recently announced ARM Cortex-R52 processor.

[1] https://developer.arm.com/products/processors/cortex-r/cortex-r52

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-07-14 Thomas Preud'homme <thomas.preudho...@arm.com>

    * config/arm/arm-cpus.in (cortex-r52): Add new entry.
    (armv8-r): Set ARM Cortex-R52 as default CPU.
    * config/arm/arm-tables.opt: Regenerate.
    * config/arm/arm-tune.md: Regenerate.
    * config/arm/driver-arm.c (arm_cpu_table): Add entry for ARM Cortex-R52.
    * doc/invoke.texi: Mention -mtune=cortex-r52 and availability of fp.dp extension for -mcpu=cortex-r52.

Tested by building an arm-none-eabi GCC cross-compiler targeting Cortex-R52 and building a hello world with it. Also checked that the .fpu option created by GCC for -mcpu=cortex-r52 and -mcpu=cortex-r52+nofp.dp is as expected (respectively .fpu neon-fp-armv8 and .fpu fpv5-sp-d16).

Is this ok for trunk?
Best regards,

Thomas

diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index e2ff297aed7514073dbb3bf5ee86964f202e5a14..d009a9e18acb093aefe0f9d8d6de49489fc2325c 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -381,7 +381,7 @@ begin arch armv8-m.main
 end arch armv8-m.main
 
 begin arch armv8-r
- tune for cortex-r4
+ tune for cortex-r52
  tune flags CO_PROC
  base 8R
  profile R
@@ -1315,6 +1315,16 @@ begin cpu cortex-m33
  costs v7m
 end cpu cortex-m33
 
+# V8 R-profile implementations.
+begin cpu cortex-r52
+ cname cortexr52
+ tune flags LDSCHED
+ architecture armv8-r+crc+simd
+ fpu neon-fp-armv8
+ option nofp.dp remove FP_DBL ALL_SIMD
+ costs cortex
+end cpu cortex-r52
+
 # FPU entries
 # format:
 # begin fpu
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index 51678c2566e841894c5c0e9c613c8c0f832e9988..4e508b1555a77628ff6e7cfea39c98b87caa840a 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -357,6 +357,9 @@ Enum(processor_type) String(cortex-m23) Value( TARGET_CPU_cortexm23)
 EnumValue
 Enum(processor_type) String(cortex-m33) Value( TARGET_CPU_cortexm33)
 
+EnumValue
+Enum(processor_type) String(cortex-r52) Value( TARGET_CPU_cortexr52)
+
 Enum
 Name(arm_arch) Type(int)
 Known ARM architectures (for use with the -march= option):
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index ba2c7d8ecfdbf6966ebf04b680d587a0e057b161..1b3f7a94cc78fac8abf1042ef60c81a74eaf24eb 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -57,5 +57,6 @@
     cortexa73,exynosm1,xgene1,
     cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,
     cortexa73cortexa53,cortexa55,cortexa75,
-    cortexa75cortexa55,cortexm23,cortexm33"
+    cortexa75cortexa55,cortexm23,cortexm33,
+    cortexr52"
     (const (symbol_ref "((enum attr_tune) arm_tune)")))
diff --git a/gcc/config/arm/driver-arm.c b/gcc/config/arm/driver-arm.c
index 16171d4e801af46ad549314d1f376e90d5bff57c..5c29b94caaba4ff6f89a191f1d8edcf10431c0b3 100644
--- a/gcc/config/arm/driver-arm.c
+++ b/gcc/config/arm/driver-arm.c
@@ -58,6 +58,7 @@ static struct vendor_cpu arm_cpu_table[] = {
     {"0xc15", "armv7-r", "cortex-r5"},
     {"0xc17", "armv7-r", "cortex-r7"},
     {"0xc18", "armv7-r", "cortex-r8"},
+    {"0xd13", "armv8-r+crc", "cortex-r52"},
     {"0xc20", "armv6-m", "cortex-m0"},
     {"0xc21", "armv6-m", "cortex-m1"},
     {"0xc23", "armv7-m", "cortex-m3"},
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e60edcae53ef3c995054b9b0229b5f0fccbb8462..a093b9bcf77b1f4b40992516e853826bb7d528d4 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15538,7 +15538,7 @@ Permissible names are: @samp{arm2}, @samp{arm250},
 @samp{cortex-a32}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a55},
 @samp{cortex-a57}, @samp{cort
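As an illustration of how a table such as arm_cpu_table can be consulted, here is a small standalone sketch. The entries are copied from the driver-arm.c hunk above; the cpu_entry struct, its field names and find_cpu_by_part are hypothetical and only mirror the three-column layout visible in the diff, not the driver's real lookup code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of one table row: part number, architecture, CPU.  */
struct cpu_entry
{
  const char *part_no;
  const char *arch_name;
  const char *cpu_name;
};

/* Rows taken from the diff above, including the new Cortex-R52 entry.  */
static const struct cpu_entry cpu_table[] = {
  {"0xc15", "armv7-r", "cortex-r5"},
  {"0xc17", "armv7-r", "cortex-r7"},
  {"0xc18", "armv7-r", "cortex-r8"},
  {"0xd13", "armv8-r+crc", "cortex-r52"},
  {"0xc20", "armv6-m", "cortex-m0"},
};

/* Return the row whose part number matches PART_NO, or NULL if none.  */
static const struct cpu_entry *
find_cpu_by_part (const char *part_no)
{
  size_t count = sizeof cpu_table / sizeof cpu_table[0];
  for (size_t i = 0; i < count; i++)
    if (strcmp (cpu_table[i].part_no, part_no) == 0)
      return &cpu_table[i];
  return NULL;
}
```

Looking up "0xd13" yields the cortex-r52 row with architecture armv8-r+crc; an unknown part number yields NULL.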
[arm-embedded] [PATCH, GCC/ARM] Rewire -mfpu=fp-armv8 as VFPv5 + D32 + DP
Hi,

We have decided to apply the following patch to the embedded-7-branch to enable ARMv8-R support.

ChangeLog entry is as follows:

*** gcc/ChangeLog.arm ***

2017-09-04 Thomas Preud'homme

    Backport from mainline
    2017-07-14 Thomas Preud'homme

    * config/arm/arm-isa.h (isa_bit_FP_ARMv8): Delete enumerator.
    (ISA_FP_ARMv8): Define as ISA_FPv5 and ISA_FP_D32.
    * config/arm/arm-cpus.in (armv8-r): Define fp.sp as enabling FPv5.
    (fp-armv8): Define it as FP_ARMv8 only.
    * config/arm/arm.h (TARGET_FPU_ARMV8): Delete.
    (TARGET_VFP_FP16INST): Define using TARGET_VFP5 rather than TARGET_FPU_ARMV8.
    * config/arm/arm.c (arm_rtx_costs_internal): Replace checks against TARGET_FPU_ARMV8 by checks against TARGET_VFP5.
    * config/arm/arm-builtins.c (arm_builtin_vectorized_function): Define first ARM_CHECK_BUILTIN_MODE definition using TARGET_VFP5 rather than TARGET_FPU_ARMV8.
    * config/arm/arm-c.c (arm_cpu_builtins): Likewise for __ARM_FEATURE_NUMERIC_MAXMIN macro definition.
    * config/arm/arm.md (cmov): Condition on TARGET_VFP5 rather than TARGET_FPU_ARMV8.
    * config/arm/neon.md (neon_vrint): Likewise.
    (neon_vcvt): Likewise.
    (neon_): Likewise.
    (3): Likewise.
    * config/arm/vfp.md (lsi2): Likewise.
    * config/arm/predicates.md (arm_cond_move_operator): Check against TARGET_VFP5 rather than TARGET_FPU_ARMV8 and fix spacing.

Best regards,

Thomas

--- Begin Message ---

Hi,

fp-armv8 is currently defined as a double precision FPv5 with 32 D registers *and* a special FP_ARMv8 bit. However, FP for ARMv8 should only bring 32 D registers on top of FPv5-D16, so this FP_ARMv8 bit is spurious. As a consequence, many instruction patterns which are guarded by TARGET_FPU_ARMV8 are unavailable to FPv5-D16 and FPv5-SP-D16.

This patch gets rid of TARGET_FPU_ARMV8 and rewires all uses to expressions based on TARGET_VFP5, TARGET_VFPD32 and TARGET_VFP_DOUBLE. It also redefines ISA_FP_ARMv8 to include the D32 capability to distinguish it from FPv5-D16. Finally, it sets the +fp.sp for ARMv8-R to enable FPv5-SP-D16 (i.e. FP for ARMv8 with single precision only and 16 D registers).

ChangeLog entry is as follows:

2017-07-07 Thomas Preud'homme

    * config/arm/arm-isa.h (isa_bit_FP_ARMv8): Delete enumerator.
    (ISA_FP_ARMv8): Define as ISA_FPv5 and ISA_FP_D32.
    * config/arm/arm-cpus.in (armv8-r): Define fp.sp as enabling FPv5.
    (fp-armv8): Define it as FP_ARMv8 only.
    * config/arm/arm.h (TARGET_FPU_ARMV8): Delete.
    (TARGET_VFP_FP16INST): Define using TARGET_VFP5 rather than TARGET_FPU_ARMV8.
    * config/arm/arm.c (arm_rtx_costs_internal): Replace checks against TARGET_FPU_ARMV8 by checks against TARGET_VFP5.
    * config/arm/arm-builtins.c (arm_builtin_vectorized_function): Define first ARM_CHECK_BUILTIN_MODE definition using TARGET_VFP5 rather than TARGET_FPU_ARMV8.
    * config/arm/arm-c.c (arm_cpu_builtins): Likewise for __ARM_FEATURE_NUMERIC_MAXMIN macro definition.
    * config/arm/arm.md (cmov): Condition on TARGET_VFP5 rather than TARGET_FPU_ARMV8.
    * config/arm/neon.md (neon_vrint): Likewise.
    (neon_vcvt): Likewise.
    (neon_): Likewise.
    (3): Likewise.
    * config/arm/vfp.md (lsi2): Likewise.
    * config/arm/predicates.md (arm_cond_move_operator): Check against TARGET_VFP5 rather than TARGET_FPU_ARMV8 and fix spacing.

Testing:

* Bootstrapped under ARMv8-A Thumb state and ran testsuite -> no regression
* Built Spec2000 and Spec2006 with -march=armv8-a+fp16 and compared objdump -> no code generation difference

Is this ok for trunk?

Best regards,

Thomas

diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 63ee880822c17eda55dd58438d61cbbba333b2c6..7504ed581c63a657a0dff48442633704bd252b2e 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -3098,7 +3098,7 @@ arm_builtin_vectorized_function (unsigned int fn, tree type_out, tree type_in)
    NULL_TREE is returned if no such builtin is available. */
 #undef ARM_CHECK_BUILTIN_MODE
 #define ARM_CHECK_BUILTIN_MODE(C)\
-  (TARGET_FPU_ARMV8 \
+  (TARGET_VFP5 \
    && flag_unsafe_math_optimizations \
    && ARM_CHECK_BUILTIN_MODE_1 (C))
 
diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
index a3daa3220a2bc4220dffdb7ca08ca9419bdac425..9178937b6d9e0fe5d0948701390c4cf01f4f8c7d 100644
--- a/gcc/config/arm/arm-c.c
+++ b/gcc/config/arm/arm-c.c
@@ -96,7 +96,7 @@ arm_cpu_builtins (struct cpp_reader* pfile)
                       || TARGET_ARM_ARCH_ISA_THUMB >=2));
 
   def_or_undef_macro (pfile, "__ARM_FEATURE_NUMERIC_MAXMIN",
-                      TARGET_ARM_ARCH >= 8 && TARGET_NEON && TARGET_FPU_ARMV8);
+                      TARGET_ARM_ARCH >= 8 && TARGET_NEON && TARGET_VFP5);
 
   def_or_undef_macro (pfile, "__ARM_FEATURE_SIMD32",
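The effect of guarding patterns on FPv5 rather than on a dedicated FP_ARMv8 bit can be modelled with plain bit masks. The sketch below is a simplified model based on the description in this message: all names (FEAT_*, FPU_*, has_all and the two predicates) are hypothetical and do not correspond to GCC's isa bits or TARGET_* macros.

```c
#include <stdbool.h>

/* Hypothetical feature bits, loosely following the message above.  */
enum fp_feature
{
  FEAT_FPV5 = 1 << 0,    /* FPv5 instructions */
  FEAT_FP_DBL = 1 << 1,  /* double precision */
  FEAT_FP_D32 = 1 << 2   /* 32 D registers instead of 16 */
};

/* Three FPU configurations named in the message.  */
#define FPU_FPV5_SP_D16 (FEAT_FPV5)
#define FPU_FPV5_D16 (FEAT_FPV5 | FEAT_FP_DBL)
#define FPU_FP_ARMV8 (FEAT_FPV5 | FEAT_FP_DBL | FEAT_FP_D32)

/* True if every bit of REQUIRED is present in FPU.  */
static bool
has_all (unsigned int fpu, unsigned int required)
{
  return (fpu & required) == required;
}

/* A pattern guarded on FPv5 alone is open to all three configurations.  */
static bool
pattern_enabled_vfp5 (unsigned int fpu)
{
  return has_all (fpu, FEAT_FPV5);
}

/* FP for ARMv8 expressed as existing capabilities rather than as a
   separate bit: FPv5 with double precision and 32 D registers.  */
static bool
is_fp_armv8 (unsigned int fpu)
{
  return has_all (fpu, FPU_FP_ARMV8);
}
```

In this model FPv5-SP-D16 and FPv5-D16 pass the FPv5 guard, while only the configuration with D32 is recognised as FP for ARMv8.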
[arm-embedded] [PATCH 2/3, GCC/ARM] Add support for ARMv8-R architecture
Hi,

We have decided to apply the following patch to the embedded-7-branch to enable ARMv8-R support.

ChangeLog entry is as follows:

*** gcc/ChangeLog.arm ***

2017-09-04 Thomas Preud'homme <thomas.preudho...@arm.com>

    Backport from mainline
    2017-07-06 Thomas Preud'homme <thomas.preudho...@arm.com>

    * config/arm/arm-cpus.in (armv8-r): Add new entry.
    * config/arm/arm-isa.h (ISA_ARMv8r): Define macro.
    * config/arm/arm-tables.opt: Regenerate.
    * config/arm/arm.h (enum base_architecture): Add BASE_ARCH_8R enumerator.
    * doc/invoke.texi: Mention -march=armv8-r and its extensions.

*** gcc/testsuite/ChangeLog ***

2017-09-04 Thomas Preud'homme <thomas.preudho...@arm.com>

    Backport from mainline
    2017-07-06 Thomas Preud'homme <thomas.preudho...@arm.com>

    * lib/target-supports.exp: Generate check_effective_target_arm_arch_v8r_ok, add_options_for_arm_arch_v8r and check_effective_target_arm_arch_v8r_multilib.

*** libgcc/ChangeLog ***

2017-09-04 Thomas Preud'homme <thomas.preudho...@arm.com>

    Backport from mainline
    2017-07-06 Thomas Preud'homme <thomas.preudho...@arm.com>

    * config/arm/lib1funcs.S: Defined __ARM_ARCH__ to 8 for ARMv8-R.

--- Begin Message ---

Please find an updated patch in attachment. ChangeLog entries are now as follows:

*** gcc/ChangeLog ***

2017-07-06 Thomas Preud'homme <thomas.preudho...@arm.com>

    * config/arm/arm-cpus.in (armv8-r): Add new entry.
    * config/arm/arm-isa.h (ISA_ARMv8r): Define macro.
    * config/arm/arm-tables.opt: Regenerate.
    * config/arm/arm.h (enum base_architecture): Add BASE_ARCH_8R enumerator.
    * doc/invoke.texi: Mention -march=armv8-r and its extensions.

*** gcc/testsuite/ChangeLog ***

2017-01-31 Thomas Preud'homme <thomas.preudho...@arm.com>

    * lib/target-supports.exp: Generate check_effective_target_arm_arch_v8r_ok, add_options_for_arm_arch_v8r and check_effective_target_arm_arch_v8r_multilib.

*** libgcc/ChangeLog ***

2017-01-31 Thomas Preud'homme <thomas.preudho...@arm.com>

    * config/arm/lib1funcs.S: Defined __ARM_ARCH__ to 8 for ARMv8-R.

Tested by building an arm-none-eabi GCC cross-compiler targeting ARMv8-R.

Is this ok for stage1?

Best regards,

Thomas

On 29/06/17 16:13, Thomas Preudhomme wrote:

Please ignore this patch. I'll respin the patch on a more recent GCC.

Best regards,

Thomas

On 29/06/17 14:55, Thomas Preudhomme wrote:

Hi,

This patch adds support for the ARMv8-R architecture [1], which was recently announced. User level instructions for ARMv8-R are the same as those in ARMv8-A AArch32 mode, so this patch defines ARMv8-R to have the same features as ARMv8-A in the ARM backend.

[1] https://developer.arm.com/products/architecture/r-profile/docs/ddi0568/latest/arm-architecture-reference-manual-supplement-armv8-for-the-armv8-r-aarch32-architecture-profile

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2017-01-31 Thomas Preud'homme <thomas.preudho...@arm.com>

    * config/arm/arm-cpus.in (armv8-r, armv8-r+rcr): Add new entry.
    * config/arm/arm-cpu-cdata.h: Regenerate.
    * config/arm/arm-cpu-data.h: Regenerate.
    * config/arm/arm-isa.h (ISA_ARMv8r): Define macro.
    * config/arm/arm-tables.opt: Regenerate.
    * config/arm/arm.h (enum base_architecture): Add BASE_ARCH_8R enumerator.
    * config/arm/bpabi.h (BE8_LINK_SPEC): Add entry for ARMv8-R and ARMv8-R with CRC extensions.
    * doc/invoke.texi: Mention -march=armv8-r and -march=armv8-r+crc options. Document meaning of -march=armv8-r+rcr.

*** gcc/testsuite/ChangeLog ***

2017-01-31 Thomas Preud'homme <thomas.preudho...@arm.com>

    * lib/target-supports.exp: Generate check_effective_target_arm_arch_v8r_ok, add_options_for_arm_arch_v8r and check_effective_target_arm_arch_v8r_multilib.

*** libgcc/ChangeLog ***

2017-01-31 Thomas Preud'homme <thomas.preudho...@arm.com>

    * config/arm/lib1funcs.S: Defined __ARM_ARCH__ to 8 for ARMv8-R.

Tested by building an arm-none-eabi GCC cross-compiler targeting ARMv8-R.

Is this ok for stage1?
Best regards,

Thomas

diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 946d543ebb29416da9b4928161607cccacaa78a7..f35128acb7d68c6a0592355b9d3d56ee8f826aca 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -380,6 +380,22 @@ begin arch armv8-m.main
  option nodsp remove bit_ARMv7em
 end arch armv8-m.main
 
+begin arch armv8-r
+ tune for cortex-r4
+ tune flags CO_PROC
+ base 8R
+ profile R
+ isa ARMv8r
+ option crc add bit_crc32
+# fp.sp => fp-armv8 (d16); simd => simd + fp-armv8 + d32 + double precision
+# note: no fp option for fp-armv8 (d16) + double precision at the moment
+ option fp.sp add FP_ARMv8
+ option simd add FP_ARMv8 NEON
+ option crypto add FP_ARMv8 CRYPTO
+ option nocrypto remove