from:"Thomas Preudhomme"

Re: [PATCH, ARM] Fix PR77904 testcase failure

2018-12-31 Thread Thomas Preudhomme

Forgot the reference:

[1] https://gcc.gnu.org/ml/gcc-patches/2018-12/msg01308.html

On Monday, 31 December 2018, Thomas Preudhomme 
wrote:
> Hi Richard,
>
> On Thursday, 20 December 2018, Richard Earnshaw (lists) <
richard.earns...@arm.com> wrote:
>> On 14/12/2018 23:28, Thomas Preudhomme wrote:
>>> Hi,
>>>
>>> Commit r242693 forced fp to be saved/restored when needed due to an
>>> instance of GCC using fp as a scratch register to save sp while it's
>>> being clobbered by an inline asm. The normal path in
>>> thumb1_compute_save_reg_mask saving callee-saved registers which are
>>> live in the function does not work in that case because fp is chosen to
>>> hold sp after that function is called.
>>>
>>> Since clobbering sp is now errored out by the compiler and this was the
>>> only case reported where fp was live but not marked as such when
>>> thumb1_compute_save_reg_mask is called, I believe the whole commit
>>> r242693 should be reverted.
>>>
>>> ChangeLog entries are as follows:
>>>
>>> *** gcc/ChangeLog ***
>>>
>>> 2018-12-14  Thomas Preud'homme  
>>>
>>> Revert:
>>> 2016-11-22  Thomas Preud'homme  
>>>
>>> PR target/77904
>>> * config/arm/arm.c (thumb1_compute_save_reg_mask): Mark frame pointer
>>> in save register mask if it is needed.
>>>
>>> *** gcc/testsuite/ChangeLog ***
>>>
>>> 2018-12-14  Thomas Preud'homme  
>>>
>>> Revert:
>>> 2016-11-22  Thomas Preud'homme  
>>>
>>> PR target/77904
>>> * gcc.target/arm/pr77904.c: New test.
>>>
>>> Testing: Built an arm-none-eabi GCC cross-compiler targeting Armv6S-M
>>> and regression testsuite does not show any regression.
>>>
>>> Ok for stage3?
>>
>> OK.
>>
>> R.
>
> Bernd suggested in [1] that the behaviour tested by pr77904.c might
> actually be a behaviour we can allow with a patch to add a dg-warning to
> the testcase. I'll wait for a resolution on that suggestion before deciding
> whether to commit this.
>
> Best regards,
>
> Thomas
>
>>
>>>
>>> Best regards,
>>>
>>> Thomas
>>>
>>>
>>> fix_pr77904_test_failure.patch
>>>
>>> From 63c52e7bf932947be7122cdc63f6cdc913479259 Mon Sep 17 00:00:00 2001
>>> From: Thomas Preud'homme 
>>> Date: Fri, 14 Dec 2018 16:02:59 +
>>> Subject: [PATCH] [PATCH, ARM] Fix PR77904 testcase failure
>>>
>>> Hi,
>>>
>>> Commit r242693 forced fp to be saved/restored when needed due to an
>>> instance of GCC using fp as a scratch register to save sp while it's
>>> being clobbered by an inline asm. The normal path in
>>> thumb1_compute_save_reg_mask saving callee-saved registers which are
>>> live in the function does not work in that case because fp is chosen to
>>> hold sp after that function is called.
>>>
>>> Since clobbering sp is now errored out by the compiler and this was the
>>> only case reported where fp was live but not marked as such when
>>> thumb1_compute_save_reg_mask is called, I believe the whole commit
>>> r242693 should be reverted.
>>>
>>> ChangeLog entries are as follows:
>>>
>>> *** gcc/ChangeLog ***
>>>
>>> 2018-12-14  Thomas Preud'homme  
>>>
>>> Revert:
>>> 2016-11-22  Thomas Preud'homme  
>>>
>>> PR target/77904
>>> * config/arm/arm.c (thumb1_compute_save_reg_mask): Mark frame pointer
>>> in save register mask if it is needed.
>>>
>>> *** gcc/testsuite/ChangeLog ***
>>>
>>> 2018-12-14  Thomas Preud'homme  
>>>
>>> Revert:
>>> 2016-11-22  Thomas Preud'homme  
>>>
>>> PR target/77904
>>> * gcc.target/arm/pr77904.c: New test.
>>>
>>> Testing: Built an arm-none-eabi GCC cross-compiler targeting Armv6S-M
>>> and regression testsuite does not show any regression.
>>>
>>> Ok for stage3?
>>>
>>> Best regards,
>>>
>>> Thomas
>>> ---
>>>  gcc/ChangeLog  |  9 ++
>>>  gcc/config/arm/arm.c   |  4 ---
>>>  gcc/testsuite/ChangeLog|  8 +
>>>  gcc/testsuite/gcc.target/arm/pr77904.c | 45 --
>>>  4 files changed, 17 insertions(+), 49 deletions(-)
>>>

Re: [PATCH, ARM] Fix PR77904 testcase failure

2018-12-31 Thread Thomas Preudhomme

Hi Richard,

On Thursday, 20 December 2018, Richard Earnshaw (lists) <
richard.earns...@arm.com> wrote:
> On 14/12/2018 23:28, Thomas Preudhomme wrote:
>> Hi,
>>
>> Commit r242693 forced fp to be saved/restored when needed due to an
>> instance of GCC using fp as a scratch register to save sp while it's
>> being clobbered by an inline asm. The normal path in
>> thumb1_compute_save_reg_mask saving callee-saved registers which are
>> live in the function does not work in that case because fp is chosen to
>> hold sp after that function is called.
>>
>> Since clobbering sp is now errored out by the compiler and this was the
>> only case reported where fp was live but not marked as such when
>> thumb1_compute_save_reg_mask is called, I believe the whole commit
>> r242693 should be reverted.
>>
>> ChangeLog entries are as follows:
>>
>> *** gcc/ChangeLog ***
>>
>> 2018-12-14  Thomas Preud'homme  
>>
>> Revert:
>> 2016-11-22  Thomas Preud'homme  
>>
>> PR target/77904
>> * config/arm/arm.c (thumb1_compute_save_reg_mask): Mark frame pointer
>> in save register mask if it is needed.
>>
>> *** gcc/testsuite/ChangeLog ***
>>
>> 2018-12-14  Thomas Preud'homme  
>>
>> Revert:
>> 2016-11-22  Thomas Preud'homme  
>>
>> PR target/77904
>> * gcc.target/arm/pr77904.c: New test.
>>
>> Testing: Built an arm-none-eabi GCC cross-compiler targeting Armv6S-M
>> and regression testsuite does not show any regression.
>>
>> Ok for stage3?
>
> OK.
>
> R.

Bernd suggested in [1] that the behaviour tested by pr77904.c might
actually be a behaviour we can allow with a patch to add a dg-warning to
the testcase. I'll wait for a resolution on that suggestion before deciding
whether to commit this.

Best regards,

Thomas

>
>>
>> Best regards,
>>
>> Thomas
>>
>>
>> fix_pr77904_test_failure.patch
>>
>> From 63c52e7bf932947be7122cdc63f6cdc913479259 Mon Sep 17 00:00:00 2001
>> From: Thomas Preud'homme 
>> Date: Fri, 14 Dec 2018 16:02:59 +
>> Subject: [PATCH] [PATCH, ARM] Fix PR77904 testcase failure
>>
>> Hi,
>>
>> Commit r242693 forced fp to be saved/restored when needed due to an
>> instance of GCC using fp as a scratch register to save sp while it's
>> being clobbered by an inline asm. The normal path in
>> thumb1_compute_save_reg_mask saving callee-saved registers which are
>> live in the function does not work in that case because fp is chosen to
>> hold sp after that function is called.
>>
>> Since clobbering sp is now errored out by the compiler and this was the
>> only case reported where fp was live but not marked as such when
>> thumb1_compute_save_reg_mask is called, I believe the whole commit
>> r242693 should be reverted.
>>
>> ChangeLog entries are as follows:
>>
>> *** gcc/ChangeLog ***
>>
>> 2018-12-14  Thomas Preud'homme  
>>
>> Revert:
>> 2016-11-22  Thomas Preud'homme  
>>
>> PR target/77904
>> * config/arm/arm.c (thumb1_compute_save_reg_mask): Mark frame pointer
>> in save register mask if it is needed.
>>
>> *** gcc/testsuite/ChangeLog ***
>>
>> 2018-12-14  Thomas Preud'homme  
>>
>> Revert:
>> 2016-11-22  Thomas Preud'homme  
>>
>> PR target/77904
>> * gcc.target/arm/pr77904.c: New test.
>>
>> Testing: Built an arm-none-eabi GCC cross-compiler targeting Armv6S-M
>> and regression testsuite does not show any regression.
>>
>> Ok for stage3?
>>
>> Best regards,
>>
>> Thomas
>> ---
>>  gcc/ChangeLog  |  9 ++
>>  gcc/config/arm/arm.c   |  4 ---
>>  gcc/testsuite/ChangeLog|  8 +
>>  gcc/testsuite/gcc.target/arm/pr77904.c | 45 --
>>  4 files changed, 17 insertions(+), 49 deletions(-)
>>  delete mode 100644 gcc/testsuite/gcc.target/arm/pr77904.c
>>
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index d8e374fb15f..9caeb1d5e18 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,12 @@
>> +2018-12-14  Thomas Preud'homme  
>> +
>> + Revert:
>> + 2016-11-22  Thomas Preud'homme  
>> +
>> + PR target/77904
>> + * config/arm/arm.c (thumb1_compute_save_reg_mask): Mark frame pointer
>> + in save register mask if it is needed.
>> +
>>  2018-11-27  Alan Modra  
>>
>>   * config/rs6000/aix

[PATCH, committed] Changing maintainer email address

2018-12-21 Thread Thomas Preudhomme

Hi,

I've updated my email address in MAINTAINERS file since I'm leaving my
company. I'll do the copyright assignment paperwork before
contributing any new patches.

Best regards,

Thomas
From c486e31b10ae0ec648ba256a92d5a4bcef1ef83d Mon Sep 17 00:00:00 2001
From: thopre01 
Date: Fri, 21 Dec 2018 17:53:03 +
Subject: [PATCH] Update maintainer email address

2018-12-21  Thomas Preud'homme  

* MAINTAINERS (Write After Approval): Update my maintainer address.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@267330 138bc75d-0d04-0410-961f-82ee72b054a4
---
 ChangeLog   | 4 
 MAINTAINERS | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 11cfa2a6789..a86c3fc40c0 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,7 @@
+2018-12-21  Thomas Preud'homme  
+
+	* MAINTAINERS (Write After Approval): Update my maintainer address.
+
 2018-12-21  Gergö Barany  
 
 	* MAINTAINERS (Write After Approval): Add myself.
diff --git a/MAINTAINERS b/MAINTAINERS
index dcf744d023b..8ccd0ca7c33 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -537,7 +537,7 @@ Paul Pluzhnikov	
 Antoniu Pop	
 Siddhesh Poyarekar
 Vidya Praveen	
-Thomas Preud'homme
+Thomas Preud'homme
 Vladimir Prus	
 Yao Qi		
 Jerry Quinn	
-- 
2.19.1

[PATCH, ARM, committed] Fix size-optimization-ieee testcase failure

2018-12-21 Thread Thomas Preudhomme

I've committed the obvious attached patch to fix the
gcc.target/arm/size-optimization-ieee-* testcase failures.

On some versions of DejaGnu, options in RUNTESTFLAGS are appended to the
command line, and thus any -mfloat-abi=softfp or -mfloat-abi=hard in
there overrides the -mfloat-abi=soft in the dg-options for the
size-optimization-ieee-* tests. The tests are still run, though, because
arm_soft_ok returns true if -mfloat-abi=soft is accepted, even if the
file is not compiled for softfloat due to a later -mfloat-abi on the
command line.

This patch adds a dg-skip-if to those tests to ensure they are not run
in softfp or hard mode.
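
As an illustration of the failure mode described above, here is a minimal C
sketch, assuming that the last -mfloat-abi= option on the command line is the
one that takes effect. The helper name effective_float_abi is hypothetical and
is not part of GCC or DejaGnu; it only models why options appended after the
dg-options change the ABI the file is actually compiled for.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a GCC or DejaGnu function): return the value of
   the last -mfloat-abi= option found in ARGV, or NULL if there is none.
   This models a "last option wins" rule.  */
static const char *
effective_float_abi (int argc, const char *const *argv)
{
  static const char prefix[] = "-mfloat-abi=";
  const size_t prefix_len = sizeof prefix - 1;
  const char *abi = NULL;

  assert (argc >= 0);
  for (int i = 0; i < argc; i++)
    if (strncmp (argv[i], prefix, prefix_len) == 0)
      abi = argv[i] + prefix_len;	/* A later option replaces an earlier one.  */
  return abi;
}
```

With only the dg-options the result is "soft"; once -mfloat-abi=hard is
appended after it, the result becomes "hard", which is the situation the
dg-skip-if is meant to catch.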

2018-12-21  Thomas Preud'homme  

gcc/testsuite/
* gcc.target/arm/size-optimization-ieee-1.c: Skip if passing
-mfloat-abi=softfp or -mfloat-abi=hard.
* gcc.target/arm/size-optimization-ieee-2.c: Likewise.
* gcc.target/arm/size-optimization-ieee-3.c: Likewise.
From c13cca23aa64a07f66c80f14dbdd79c63163783c Mon Sep 17 00:00:00 2001
From: thopre01 
Date: Fri, 21 Dec 2018 11:49:04 +
Subject: [PATCH] [ARM] Fix size-optimization-ieee testcase failure

On some versions of DejaGnu, options in RUNTESTFLAGS are appended to the
command line, and thus any -mfloat-abi=softfp or -mfloat-abi=hard in
there overrides the -mfloat-abi=soft in the dg-options for the
size-optimization-ieee-* tests. The tests are still run, though, because
arm_soft_ok returns true if -mfloat-abi=soft is accepted, even if the
file is not compiled for softfloat due to a later -mfloat-abi on the
command line.

This patch adds a dg-skip-if to those tests to ensure they are not run
in softfp or hard mode.

2018-12-21  Thomas Preud'homme  

gcc/testsuite/
* gcc.target/arm/size-optimization-ieee-1.c: Skip if passing
-mfloat-abi=softfp or -mfloat-abi=hard.
* gcc.target/arm/size-optimization-ieee-2.c: Likewise.
* gcc.target/arm/size-optimization-ieee-3.c: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@267323 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/testsuite/ChangeLog | 7 +++
 gcc/testsuite/gcc.target/arm/size-optimization-ieee-1.c | 1 +
 gcc/testsuite/gcc.target/arm/size-optimization-ieee-2.c | 1 +
 gcc/testsuite/gcc.target/arm/size-optimization-ieee-3.c | 1 +
 4 files changed, 10 insertions(+)

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index dcac93bb275..1569e7aaa0f 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,10 @@
+2018-12-21  Thomas Preud'homme  
+
+	* gcc.target/arm/size-optimization-ieee-1.c: Skip if passing
+	-mfloat-abi=softfp or -mfloat-abi=hard.
+	* gcc.target/arm/size-optimization-ieee-2.c: Likewise.
+	* gcc.target/arm/size-optimization-ieee-3.c: Likewise.
+
 2018-12-21  Jakub Jelinek  
 
 	PR target/88547
diff --git a/gcc/testsuite/gcc.target/arm/size-optimization-ieee-1.c b/gcc/testsuite/gcc.target/arm/size-optimization-ieee-1.c
index 34090f20fec..61475eb4c67 100644
--- a/gcc/testsuite/gcc.target/arm/size-optimization-ieee-1.c
+++ b/gcc/testsuite/gcc.target/arm/size-optimization-ieee-1.c
@@ -1,4 +1,5 @@
 /* { dg-do link { target arm_soft_ok } } */
+/* { dg-skip-if "Feature is -mfloat-abi=soft only" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=soft" } } */
 /* { dg-options "-mfloat-abi=soft" } */
 
 int
diff --git a/gcc/testsuite/gcc.target/arm/size-optimization-ieee-2.c b/gcc/testsuite/gcc.target/arm/size-optimization-ieee-2.c
index 75337894a9c..b4699271cea 100644
--- a/gcc/testsuite/gcc.target/arm/size-optimization-ieee-2.c
+++ b/gcc/testsuite/gcc.target/arm/size-optimization-ieee-2.c
@@ -1,4 +1,5 @@
 /* { dg-do link { target arm_soft_ok } } */
+/* { dg-skip-if "Feature is -mfloat-abi=soft only" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=soft" } } */
 /* { dg-options "-mfloat-abi=soft" } */
 
 int
diff --git a/gcc/testsuite/gcc.target/arm/size-optimization-ieee-3.c b/gcc/testsuite/gcc.target/arm/size-optimization-ieee-3.c
index 63c92b3bbb7..34b1ebe7afd 100644
--- a/gcc/testsuite/gcc.target/arm/size-optimization-ieee-3.c
+++ b/gcc/testsuite/gcc.target/arm/size-optimization-ieee-3.c
@@ -1,4 +1,5 @@
 /* { dg-do link { target arm_soft_ok } } */
+/* { dg-skip-if "Feature is -mfloat-abi=soft only" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=soft" } } */
 /* { dg-options "-mfloat-abi=soft" } */
 
 int
-- 
2.19.1

Re: [PATCH, ARM] Do softfloat when -mfpu set, -mfloat-abi=softfp and targeting Thumb-1

2018-12-19 Thread Thomas Preudhomme

Good catch.

Committed patch in attachment. Best regards,

Thomas
On Wed, 19 Dec 2018 at 14:13, Richard Earnshaw (lists)
 wrote:
>
> On 14/12/2018 21:15, Thomas Preudhomme wrote:
> > Hi Richard,
> >
> > Thanks for catching the problem with this approach. Hopefully this
> > version should solve the real problem:
> >
> >
> > FP instructions are only enabled for TARGET_32BIT and TARGET_HARD_FLOAT
> > but GCC only gives an error when TARGET_HARD_FLOAT is true and -mfpu is
> > not set. Among other things, it makes some of the cmse tests (eg.
> > gcc.target/arm/cmse/baseline/softfp.c) fail when targeting
> > -march=armv8-m.base -mcmse -mfpu= -mfloat-abi=softfp. This
> > patch adds an extra check for TARGET_32BIT to TARGET_HARD_FLOAT such
> > that it is false on TARGET_THUMB1 targets even when a FPU is specified.
> >
> > ChangeLog entries are as follows:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-12-14  thomas Preud'homme  
> >
> > * config/arm/arm.h (TARGET_HARD_FLOAT): Restrict to TARGET_32BIT
> > targets.
>
> Yes, this is better.  And with this change, I think this line:
>
>   if (TARGET_HARD_FLOAT && !TARGET_THUMB1)
>
> in output_return_instruction() can be collapsed into simply
>
>
> if (TARGET_HARD_FLOAT)
>
> OK with that change.
>
> R.
>
> >
> > *** gcc/testsuite/ChangeLog ***
> >
> > 2018-12-14  thomas Preud'homme  
> >
> > * gcc.target/arm/cmse/baseline/softfp.c: Force an FPU.
> >
> > Testing: No testsuite regression when targeting arm-none-eabi Armv6S-M
> > with -mfloat-abi=softfp
> >
> > Is this ok for stage3?
> >
> > Best regards,
> >
> > Thomas
> >
> > On Thu, 29 Nov 2018 at 14:52, Richard Earnshaw (lists)
> >  wrote:
> >>
> >> On 29/11/2018 10:51, Thomas Preudhomme wrote:
> >>> Hi,
> >>>
> >>> FP instructions are only enabled for TARGET_32BIT and TARGET_HARD_FLOAT
> >>> but GCC only gives an error when TARGET_HARD_FLOAT is true and -mfpu is
> >>> not set. Among other things, it makes some of the cmse tests (eg.
> >>> gcc.target/arm/cmse/baseline/softfp.c) fail when targeting
> >>> -march=armv8-m.base -mfpu= -mfloat-abi=softfp. This patch
> >>> errors out when a Thumb-1 -like target is selected and a FPU is
> >>> specified, thus making such tests being skipped.
> >>>
> >>> ChangeLog entries are as follows:
> >>>
> >>> *** gcc/ChangeLog ***
> >>>
> >>> 2018-11-28  thomas Preud'homme  
> >>>
> >>> * config/arm/arm.c (arm_options_perform_arch_sanity_checks): Error out
> >>> if targeting Thumb-1 with an FPU specified.
> >>>
> >>> *** gcc/testsuite/ChangeLog ***
> >>>
> >>> 2018-11-28  thomas Preud'homme  
> >>>
> >>> * gcc.target/arm/thumb1_mfpu-1.c: New testcase.
> >>> * gcc.target/arm/thumb1_mfpu-2.c: Likewise.
> >>>
> >>> Testing: No testsuite regression when targeting arm-none-eabi Armv6S-M.
> >>> Fails as expected when targeting Armv6-M with an -mfpu or a default FPU.
> >>> Succeeds without.
> >>>
> >>> Is this ok for stage3?
> >>>
> >>
> >> This doesn't sound right.  Specifically this bit...
> >>
> >> +  else if (TARGET_THUMB1
> >> +  && bitmap_bit_p (arm_active_target.isa, isa_bit_vfpv2))
> >> +   error ("Thumb-1 does not allow FP instructions");
> >>
> >> If I use
> >>
> >> -mcpu=arm1176jzf-s -mfpu=auto -mfloat-abi=softfp -mthumb
> >>
> >> then that shouldn't error, since softfp and thumb is, in reality, just
> >> float-abi=soft (as there are no fp instructions in thumb).  We also want
> >> it to work this way so that I can add the thumb/arm attribute to
> >> specific functions and have the compiler use HW float instructions when
> >> they are suitable.
> >>
> >>
> >> R.
> >>
> >>> Best regards,
> >>>
> >>> Thomas
> >>>
> >>>
> >>> thumb1_mfpu_error.patch
> >>>
> >>> From 051e38552d7c596873e0303f6ec4272b26d50900 Mon Sep 17 00:00:00 2001
> >>> From: Thomas Preud'homme 
> >>> Date: Tue, 27 Nov 2018 15:52:38 +
> >>> Subject: [PATCH] [PATCH, ARM] Error out when -mfpu set and targeting 

[PATCH, ARM] Fix PR77904 testcase failure

2018-12-14 Thread Thomas Preudhomme

Hi,

Commit r242693 forced fp to be saved/restored when needed due to an
instance of GCC using fp as a scratch register to save sp while it's
being clobbered by an inline asm. The normal path in
thumb1_compute_save_reg_mask saving callee-saved registers which are
live in the function does not work in that case because fp is chosen to
hold sp after that function is called.

Since clobbering sp is now errored out by the compiler and this was the
only case reported where fp was live but not marked as such when
thumb1_compute_save_reg_mask is called, I believe the whole commit
r242693 should be reverted.
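
To make the two code paths above concrete, here is a toy C model of the
register save mask. It is only a sketch under stated assumptions:
model_save_reg_mask, its parameters and FP_REGNUM_MODEL are hypothetical names
invented for illustration, r7 is assumed to be the Thumb-1 frame pointer, and
none of this is the actual arm.c code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this model: r7 is the Thumb-1 hard frame pointer.  */
enum { FP_REGNUM_MODEL = 7 };

/* Toy model of the save mask computation.  The normal path saves every
   register that is both live and callee-saved.  FORCE_FP models the special
   case added by r242693: fp is also saved whenever a frame pointer is
   needed, even if it was not yet marked live.  */
static uint32_t
model_save_reg_mask (uint32_t live, uint32_t callee_saved,
		     bool frame_pointer_needed, bool force_fp)
{
  uint32_t mask = live & callee_saved;

  assert (FP_REGNUM_MODEL < 32);
  if (force_fp && frame_pointer_needed)
    mask |= UINT32_C (1) << FP_REGNUM_MODEL;
  return mask;
}
```

In this model, reverting r242693 corresponds to always passing force_fp as
false, which is safe only as long as fp is marked live whenever it is used.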

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-12-14  Thomas Preud'homme  

Revert:
2016-11-22  Thomas Preud'homme  

PR target/77904
* config/arm/arm.c (thumb1_compute_save_reg_mask): Mark frame pointer
in save register mask if it is needed.

*** gcc/testsuite/ChangeLog ***

2018-12-14  Thomas Preud'homme  

Revert:
2016-11-22  Thomas Preud'homme  

PR target/77904
* gcc.target/arm/pr77904.c: New test.

Testing: Built an arm-none-eabi GCC cross-compiler targeting Armv6S-M
and regression testsuite does not show any regression.

Ok for stage3?

Best regards,

Thomas
From 63c52e7bf932947be7122cdc63f6cdc913479259 Mon Sep 17 00:00:00 2001
From: Thomas Preud'homme 
Date: Fri, 14 Dec 2018 16:02:59 +
Subject: [PATCH] [PATCH, ARM] Fix PR77904 testcase failure

Hi,

Commit r242693 forced fp to be saved/restored when needed due to an
instance of GCC using fp as a scratch register to save sp while it's
being clobbered by an inline asm. The normal path in
thumb1_compute_save_reg_mask saving callee-saved registers which are
live in the function does not work in that case because fp is chosen to
hold sp after that function is called.

Since clobbering sp is now errored out by the compiler and this was the
only case reported where fp was live but not marked as such when
thumb1_compute_save_reg_mask is called, I believe the whole commit
r242693 should be reverted.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-12-14  Thomas Preud'homme  

Revert:
2016-11-22  Thomas Preud'homme  

PR target/77904
* config/arm/arm.c (thumb1_compute_save_reg_mask): Mark frame pointer
in save register mask if it is needed.

*** gcc/testsuite/ChangeLog ***

2018-12-14  Thomas Preud'homme  

Revert:
2016-11-22  Thomas Preud'homme  

PR target/77904
* gcc.target/arm/pr77904.c: New test.

Testing: Built an arm-none-eabi GCC cross-compiler targeting Armv6S-M
and regression testsuite does not show any regression.

Ok for stage3?

Best regards,

Thomas
---
 gcc/ChangeLog  |  9 ++
 gcc/config/arm/arm.c   |  4 ---
 gcc/testsuite/ChangeLog|  8 +
 gcc/testsuite/gcc.target/arm/pr77904.c | 45 --
 4 files changed, 17 insertions(+), 49 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.target/arm/pr77904.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index d8e374fb15f..9caeb1d5e18 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2018-12-14  Thomas Preud'homme  
+
+	Revert:
+	2016-11-22  Thomas Preud'homme  
+
+	PR target/77904
+	* config/arm/arm.c (thumb1_compute_save_reg_mask): Mark frame pointer
+	in save register mask if it is needed.
+
 2018-11-27  Alan Modra  
 
 	* config/rs6000/aix71.h (ASM_SPEC): Don't select default -maix64
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 40f0574e32e..2ab5d8abc33 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -19553,10 +19553,6 @@ thumb1_compute_save_core_reg_mask (void)
 if (df_regs_ever_live_p (reg) && callee_saved_reg_p (reg))
   mask |= 1 << reg;
 
-  /* Handle the frame pointer as a special case.  */
-  if (frame_pointer_needed)
-mask |= 1 << HARD_FRAME_POINTER_REGNUM;
-
   if (flag_pic
   && !TARGET_SINGLE_PIC_BASE
   && arm_pic_register != INVALID_REGNUM
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 9e1f6d05a45..4e58c8940da 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,11 @@
+2018-12-14  Thomas Preud'homme  
+
+	Revert:
+	2016-11-22  Thomas Preud'homme  
+
+	PR target/77904
+	* gcc.target/arm/pr77904.c: New test.
+
 2018-11-27  Jozef Lawrynowicz  
 
 	* lib/target-supports.exp
diff --git a/gcc/testsuite/gcc.target/arm/pr77904.c b/gcc/testsuite/gcc.target/arm/pr77904.c
deleted file mode 100644
index 76728c07e73..000
--- a/gcc/testsuite/gcc.target/arm/pr77904.c
+++ /dev/null
@@ -1,45 +0,0 @@
-/* { dg-do run } */
-/* { dg-options "-O2" } */
-
-__attribute__ ((noinline, noclone)) void
-clobber_sp (void)
-{
-  __asm volatile ("" : : : "sp");
-}
-
-int
-main (void)
-{
-  int ret;
-
-  __asm volatile ("mov\tr4, #0xf4\n\t"
-		  "mov\tr5, #0xf5\n\t"
-		  "mov\tr6, #0xf6\n\t"
-		  "mov\tr7, #0xf7\n\t"
-		  "mov\tr0, #0xf8\n\t"
-		  "mov\tr8, r0\n\t"
-		  "mov\tr0, #0xfa\n\t"
-		  "mov\tr10, r0"
-

Re: [PATCH, ARM] Do softfloat when -mfpu set, -mfloat-abi=softfp and targeting Thumb-1

2018-12-14 Thread Thomas Preudhomme

Hi Richard,

Thanks for catching the problem with this approach. Hopefully this
version should solve the real problem:


FP instructions are only enabled for TARGET_32BIT and TARGET_HARD_FLOAT
but GCC only gives an error when TARGET_HARD_FLOAT is true and -mfpu is
not set. Among other things, it makes some of the cmse tests (eg.
gcc.target/arm/cmse/baseline/softfp.c) fail when targeting
-march=armv8-m.base -mcmse -mfpu= -mfloat-abi=softfp. This
patch adds an extra check for TARGET_32BIT to TARGET_HARD_FLOAT such
that it is false on TARGET_THUMB1 targets even when a FPU is specified.
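
The effect of the extra check can be sketched with a small C model. This is
an illustration only: struct target_model, its fields and model_hard_float
are hypothetical names, and the other two conditions in the predicate (a
non-soft float ABI and an available FPU) are assumptions about what
TARGET_HARD_FLOAT already tests, not a copy of the definition in arm.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum float_abi_model { ABI_SOFT, ABI_SOFTFP, ABI_HARD };

/* Hypothetical description of the selected target.  */
struct target_model
{
  enum float_abi_model float_abi;
  bool has_fpu;		/* An FPU was selected, e.g. through -mfpu.  */
  bool is_32bit;	/* Arm or Thumb-2; false when targeting Thumb-1.  */
};

/* Model of the patched predicate: hard float additionally requires a
   32-bit instruction set, so it is false on Thumb-1 even with an FPU.  */
static bool
model_hard_float (const struct target_model *t)
{
  assert (t != NULL);
  return t->float_abi != ABI_SOFT && t->has_fpu && t->is_32bit;
}
```

With this shape, softfp plus an FPU on a Thumb-1 target simply behaves like
soft float instead of triggering an error.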

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-12-14  Thomas Preud'homme

* config/arm/arm.h (TARGET_HARD_FLOAT): Restrict to TARGET_32BIT
targets.

*** gcc/testsuite/ChangeLog ***

2018-12-14  Thomas Preud'homme

* gcc.target/arm/cmse/baseline/softfp.c: Force an FPU.

Testing: No testsuite regression when targeting arm-none-eabi Armv6S-M
with -mfloat-abi=softfp

Is this ok for stage3?

Best regards,

Thomas

On Thu, 29 Nov 2018 at 14:52, Richard Earnshaw (lists)
 wrote:
>
> On 29/11/2018 10:51, Thomas Preudhomme wrote:
> > Hi,
> >
> > FP instructions are only enabled for TARGET_32BIT and TARGET_HARD_FLOAT
> > but GCC only gives an error when TARGET_HARD_FLOAT is true and -mfpu is
> > not set. Among other things, it makes some of the cmse tests (eg.
> > gcc.target/arm/cmse/baseline/softfp.c) fail when targeting
> > -march=armv8-m.base -mfpu= -mfloat-abi=softfp. This patch
> > errors out when a Thumb-1 -like target is selected and a FPU is
> > specified, thus making such tests being skipped.
> >
> > ChangeLog entries are as follows:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-11-28  thomas Preud'homme  
> >
> > * config/arm/arm.c (arm_options_perform_arch_sanity_checks): Error out
> > if targeting Thumb-1 with an FPU specified.
> >
> > *** gcc/testsuite/ChangeLog ***
> >
> > 2018-11-28  thomas Preud'homme  
> >
> > * gcc.target/arm/thumb1_mfpu-1.c: New testcase.
> > * gcc.target/arm/thumb1_mfpu-2.c: Likewise.
> >
> > Testing: No testsuite regression when targeting arm-none-eabi Armv6S-M.
> > Fails as expected when targeting Armv6-M with an -mfpu or a default FPU.
> > Succeeds without.
> >
> > Is this ok for stage3?
> >
>
> This doesn't sound right.  Specifically this bit...
>
> +  else if (TARGET_THUMB1
> +  && bitmap_bit_p (arm_active_target.isa, isa_bit_vfpv2))
> +   error ("Thumb-1 does not allow FP instructions");
>
> If I use
>
> -mcpu=arm1176jzf-s -mfpu=auto -mfloat-abi=softfp -mthumb
>
> then that shouldn't error, since softfp and thumb is, in reality, just
> float-abi=soft (as there are no fp instructions in thumb).  We also want
> it to work this way so that I can add the thumb/arm attribute to
> specific functions and have the compiler use HW float instructions when
> they are suitable.
>
>
> R.
>
> > Best regards,
> >
> > Thomas
> >
> >
> > thumb1_mfpu_error.patch
> >
> > From 051e38552d7c596873e0303f6ec4272b26d50900 Mon Sep 17 00:00:00 2001
> > From: Thomas Preud'homme 
> > Date: Tue, 27 Nov 2018 15:52:38 +
> > Subject: [PATCH] [PATCH, ARM] Error out when -mfpu set and targeting Thumb-1
> >
> > Hi,
> >
> > FP instructions are only enabled for TARGET_32BIT and TARGET_HARD_FLOAT
> > but GCC only gives an error when TARGET_HARD_FLOAT is true and -mfpu is
> > not set. Among other things, it makes some of the cmse tests (eg.
> > gcc.target/arm/cmse/baseline/softfp.c) fail when targeting
> > -march=armv8-m.base -mfpu= -mfloat-abi=softfp. This patch
> > errors out when a Thumb-1 -like target is selected and a FPU is
> > specified, thus making such tests being skipped.
> >
> > ChangeLog entries are as follows:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-11-28  thomas Preud'homme  
> >
> >   * config/arm/arm.c (arm_options_perform_arch_sanity_checks): Error out
> >   if targeting Thumb-1 with an FPU specified.
> >
> > *** gcc/testsuite/ChangeLog ***
> >
> > 2018-11-28  thomas Preud'homme  
> >
> >   * gcc.target/arm/thumb1_mfpu-1.c: New testcase.
> >   * gcc.target/arm/thumb1_mfpu-2.c: Likewise.
> >
> > Testing: No testsuite regression when targeting arm-none-eabi Armv6S-M.
> > Fails as expected when targeting Armv6-M with an -mfpu or a default FPU.
> > Succeeds without.
> >
> > Is this ok for stage3?
> >
> > Best regards,
> >
> > Thomas

Re: [PATCH, libgcc/ARM & testsuite] Optimize executable size when using softfloat fmul/dmul

2018-12-14 Thread Thomas Preudhomme

Hi Richard,

None, is there any? All the ones I could find in the big switch
selecting tm_files and tmake_files in gcc/config.gcc include
arm/elf.h. I tried to build for arm-wince-pe but got: "Configuration
arm-wince-pe not supported". However, note that to guarantee correct
results the only requirement is that a global symbol correctly overrides
a weak symbol, and I see .weak usage in many other libgcc
backends (e.g. i386). The "take the first definition resolving an
undefined reference and ignore the one in a following object of a static
library" behaviour is only needed to benefit from the size optimization.
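
For readers less familiar with weak symbols, here is a minimal C sketch of
the mechanism, assuming a GCC-compatible compiler and an object format that
supports weak definitions. The function name model_mul is hypothetical; the
real routines are written in assembly and use the .weak directive instead of
the attribute.

```c
#include <assert.h>

/* Weak definition: it is used as long as no global (strong) definition of
   the same symbol is linked in.  If another object file provides a global
   definition of model_mul, the linker picks that one and this definition
   is ignored, without a multiple-definition error.  */
__attribute__ ((weak)) int
model_mul (int a, int b)
{
  return a * b;
}

/* Caller that does not care which definition ends up being linked.  */
static int
model_square (int x)
{
  assert (x > -46341 && x < 46341);	/* Keep x * x within 32-bit int range.  */
  return model_mul (x, x);
}
```

When this file is linked on its own, the weak definition is the only one
available and is the one that gets called.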

Best regards,

Thomas
On Fri, 7 Dec 2018 at 14:14, Richard Earnshaw (lists)
 wrote:
>
> On 19/11/2018 09:57, Thomas Preudhomme wrote:
> > Softfloat single precision and double precision floating-point
> > multiplication routines in libgcc share some code with the
> > floating-point division of their corresponding precision. As the code
> > is structured now, this leads to *all* division code being pulled in an
> > executable in softfloat mode even if only multiplication is
> > performed.
> >
> > This patch creates some new LIB1ASMFUNCS macros to also build files with
> > just the multiplication and shared code as weak symbols. By putting
> > these earlier in the static library, they can then be picked up when
> > only multiplication is used and they are overridden by the global
> > definition in the existing file containing both multiplication and
> > division code when division is needed.
> >
> > The patch also removes changes made to the FUNC_START and ARM_FUNC_START
> > macros in r218124 since the intent was to put multiplication and
> > division code into their own section in a later patch to achieve the
> > same size optimization. That approach relied on specific section layout
> > to ensure multiplication and division were not too far from the shared
> > bit of code in order for the branches to be within range. Due to the lack of
> > guarantee regarding section layout, in particular with all the
> > possibilities of linker scripts, this approach was chosen instead. This
> > patch keeps the two testcases that were posted by Tony Wang (an Arm
> > employee at the time) on the mailing list to implement this approach
> > and adds a new one, hence the attribution.
> >
> > ChangeLog entries are as follows:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-11-14  Thomas Preud'homme  
> >
> > * config/arm/elf.h: Update comment about condition that need to
> > match with libgcc/config/arm/lib1funcs.S to also include
> > libgcc/config/arm/t-arm.
> > * doc/sourcebuild.texi (output-exists, output-exists-not): Rename
> > subsubsection these directives are in to "Check for output files".
> > Move scan-symbol to that section and add to it new scan-symbol-not
> > directive.
> >
> > *** gcc/testsuite/ChangeLog ***
> >
> > 2018-11-16  Tony Wang  
> > Thomas Preud'homme  
> >
> > * lib/lto.exp (lto-execute): Define output_file and testname_with_flags
> > to same value as execname.
> > (scan-symbol): Move and rename to ...
> > * lib/gcc-dg.exp (scan-symbol-common): This.  Adapt into a
> > helper function returning true or false if a symbol is present.
> > (scan-symbol): New procedure.
> > (scan-symbol-not): Likewise.
> > * gcc.target/arm/size-optimization-ieee-1.c: New testcase.
> > * gcc.target/arm/size-optimization-ieee-2.c: Likewise.
> > * gcc.target/arm/size-optimization-ieee-3.c: Likewise.
> >
> > *** libgcc/ChangeLog ***
> >
> > 2018-11-16  Thomas Preud'homme  
> >
> > * config/arm/lib1funcs.S (FUNC_START): Remove unused sp_section
> > parameter and corresponding code.
> > (ARM_FUNC_START): Likewise in both definitions.
> > Also update footer comment about condition that need to match with
> > gcc/config/arm/elf.h to also include libgcc/config/arm/t-arm.
> > * config/arm/ieee754-df.S (muldf3): Also build it if L_arm_muldf3 is
> > defined.  Weakly define it in this case.
> > * config/arm/ieee754-sf.S (mulsf3): Likewise with L_arm_mulsf3.
> > * config/arm/t-elf (LIB1ASMFUNCS): Build _arm_muldf3.o and
> > _arm_mulsf3.o before muldiv versions if targeting Thumb-1 only. Add
> > comment to keep condition in sync with the one in
> > libgcc/config/arm/lib1funcs.S and gcc/config/arm/elf.h.
> >
> > Testing: Bootstrapped on arm-linux-gnueabihf (Arm & Thumb-2) and
> > testsuite shows no
> > regression. Also built an arm-none-eabi cross com

Re: [PATCH] [RFC] PR target/52813 and target/11807

2018-12-12 Thread Thomas Preudhomme

[resending from the right address]

Hi Christophe,

Why not simply: "Clobber of  unsupported" with an
accompanying change of the documentation to state the extra bit you
wanted to put in that error message? Perhaps even add a reference to
the section of the documentation in the error message.


Best regards,


Thomas
On Wed, 12 Dec 2018 at 15:13, Christophe Lyon
 wrote:
>
> On Wed, 12 Dec 2018 at 14:19, Christophe Lyon
>  wrote:
> >
> > On Wed, 12 Dec 2018 at 12:21, Thomas Preudhomme
> >  wrote:
> > >
> > > So my understanding is that the original code (CMSIS library) used to
> > > clobber sp because the asm statement was actually changing the sp.
> > > That in turn led GCC to try to save and restore sp which is not what
> > > CMSIS was expecting to happen. Changing sp without clobber as done now
> > > is probably the right solution and r242693 can be reverted. That will
> > > remove the failing test.
> > >
> >
> > OK, I read PR52813 too, but I'm not sure I fully understand the new status.
> > My understanding is that since this patch was committed, if an asm statement
> > clobbers sp, it is no longer allowed to actually declare it as a clobber (this
> > patch generates an error in such a case).
> > So the user is now expected to lie to the compiler when writing to
> > this kind of register (sp, pic register), by not declaring it as "clobber"?
> >
>
> I'm attaching a small patch which adds a more verbose error message
> along the lines of what I understand of the current status.
> I'm pretty sure I got (at least) the formatting wrong :)
>
> Christophe
>
> >
> > > Best regards,
> > >
> > > Thomas
> > > On Wed, 12 Dec 2018 at 10:30, Thomas Preudhomme
> > >  wrote:
> > > >
> > > > Hi Christophe,
> > > >
> > > > That PR was about a bug occurring when sp was clobbered so if it cannot
> > > > be clobbered anymore the whole commit (r242693) can be removed. Let me
> > > > check why the original code that led to the PR is clobbering sp
> > > > though.
> > > >
> > > > Best regards,
> > > >
> > > > Thomas
> > > > On Wed, 12 Dec 2018 at 09:43, Christophe Lyon
> > > >  wrote:
> > > > >
> > > > > On Tue, 11 Dec 2018 at 16:52, Richard Sandiford
> > > > >  wrote:
> > > > > >
> > > > > > Dimitar Dimitrov  writes:
> > > > > > > On Monday, 10 December 2018, 11:21:53 EET Richard Sandiford wrote:
> > > > > > >> Dimitar Dimitrov  writes:
> > > > > > >> > I have tested this fix on x86_64 host, and found no regression in the C
> > > > > > >> > and C++ testsuites.  I'm marking this patch as RFC simply because I don't
> > > > > > >> > have experience with other architectures, and I don't have a setup to
> > > > > > >> > test all architectures supported by GCC.
> > > > > > >> >
> > > > > > >> > gcc/ChangeLog:
> > > > > > >> >
> > > > > > >> > 2018-12-07  Dimitar Dimitrov  
> > > > > > >> >
> > > > > > >> >* cfgexpand.c (asm_clobber_reg_is_valid): Also produce
> > > > > > >> >error when stack pointer is clobbered.
> > > > > > >> >(expand_asm_stmt): Refactor clobber check in separate function.
> > > > > > >> >
> > > > > > >> > gcc/testsuite/ChangeLog:
> > > > > > >> >
> > > > > > >> > 2018-12-07  Dimitar Dimitrov  
> > > > > > >> >
> > > > > > >> >* gcc.target/i386/pr52813.c: New test.
> > > > > > >> >
> > > > > > >> > Signed-off-by: Dimitar Dimitrov 
> > > > > > >>
> > > > > > >> LGTM.  Do you have a copyright assignment on file?  'Fraid this is
> > > > > > >> probably big enough to need one.
> > > > > > > Yes, I have copyright assignment.
> > > > > >
> > > > > > OK, great.  I went ahead and applied the patch.
> > > > > >
> > > > >
> > > > > Hi,
> > > > >
> > > > > This patch introduces a regression on arm:
> > > > > FAIL: gcc.target/arm/pr77904.c (test for excess errors)
> > > > > Excess errors:
> > > > > /gcc/testsuite/gcc.target/arm/pr77904.c:7:3: error: Stack Pointer
> > > > > register clobbered by 'sp' in 'asm'
> > > > >
> > > > > Indeed the testcase has an explicit:
> > > > >   __asm volatile ("" : : : "sp");
> > > > > which is now rejected.
> > > > >
> > > > > Thomas, is that mandatory to test your code to fix pr77904?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Christophe
> > > > >
> > > > > > Thanks,
> > > > > > Richard

Re: [PATCH] [RFC] PR target/52813 and target/11807

2018-12-12 Thread Thomas Preudhomme

So my understanding is that the original code (CMSIS library) used to
clobber sp because the asm statement was actually changing the sp.
That in turn led GCC to try to save and restore sp which is not what
CMSIS was expecting to happen. Changing sp without clobber as done now
is probably the right solution and r242693 can be reverted. That will
remove the failing test.

Best regards,

Thomas
On Wed, 12 Dec 2018 at 10:30, Thomas Preudhomme
 wrote:
>
> Hi Christophe,
>
> That PR was about a bug occurring when sp was clobbered so if it cannot
> be clobbered anymore the whole commit (r242693) can be removed. Let me
> check why the original code that led to the PR is clobbering sp
> though.
>
> Best regards,
>
> Thomas
> On Wed, 12 Dec 2018 at 09:43, Christophe Lyon
>  wrote:
> >
> > On Tue, 11 Dec 2018 at 16:52, Richard Sandiford
> >  wrote:
> > >
> > > Dimitar Dimitrov  writes:
> > > > On Monday, 10 December 2018, 11:21:53 EET Richard Sandiford wrote:
> > > >> Dimitar Dimitrov  writes:
> > > >> > I have tested this fix on x86_64 host, and found no regression in the C
> > > >> > and C++ testsuites.  I'm marking this patch as RFC simply because I don't
> > > >> > have experience with other architectures, and I don't have a setup to
> > > >> > test all architectures supported by GCC.
> > > >> >
> > > >> > gcc/ChangeLog:
> > > >> >
> > > >> > 2018-12-07  Dimitar Dimitrov  
> > > >> >
> > > >> >* cfgexpand.c (asm_clobber_reg_is_valid): Also produce
> > > >> >error when stack pointer is clobbered.
> > > >> >(expand_asm_stmt): Refactor clobber check in separate function.
> > > >> >
> > > >> > gcc/testsuite/ChangeLog:
> > > >> >
> > > >> > 2018-12-07  Dimitar Dimitrov  
> > > >> >
> > > >> >* gcc.target/i386/pr52813.c: New test.
> > > >> >
> > > >> > Signed-off-by: Dimitar Dimitrov 
> > > >>
> > > >> LGTM.  Do you have a copyright assignment on file?  'Fraid this is
> > > >> probably big enough to need one.
> > > > Yes, I have copyright assignment.
> > >
> > > OK, great.  I went ahead and applied the patch.
> > >
> >
> > Hi,
> >
> > This patch introduces a regression on arm:
> > FAIL: gcc.target/arm/pr77904.c (test for excess errors)
> > Excess errors:
> > /gcc/testsuite/gcc.target/arm/pr77904.c:7:3: error: Stack Pointer
> > register clobbered by 'sp' in 'asm'
> >
> > Indeed the testcase has an explicit:
> >   __asm volatile ("" : : : "sp");
> > which is now rejected.
> >
> > Thomas, is that mandatory to test your code to fix pr77904?
> >
> > Thanks,
> >
> > Christophe
> >
> > > Thanks,
> > > Richard

Re: [PATCH] [RFC] PR target/52813 and target/11807

2018-12-12 Thread Thomas Preudhomme

Hi Christophe,

That PR was about a bug occurring when sp was clobbered so if it cannot
be clobbered anymore the whole commit (r242693) can be removed. Let me
check why the original code that led to the PR is clobbering sp
though.

Best regards,

Thomas
On Wed, 12 Dec 2018 at 09:43, Christophe Lyon
 wrote:
>
> On Tue, 11 Dec 2018 at 16:52, Richard Sandiford
>  wrote:
> >
> > Dimitar Dimitrov  writes:
> > > On Monday, 10 December 2018, 11:21:53 EET Richard Sandiford wrote:
> > >> Dimitar Dimitrov  writes:
> > >> > I have tested this fix on x86_64 host, and found no regression in the C
> > >> > and C++ testsuites.  I'm marking this patch as RFC simply because I don't
> > >> > have experience with other architectures, and I don't have a setup to
> > >> > test all architectures supported by GCC.
> > >> >
> > >> > gcc/ChangeLog:
> > >> >
> > >> > 2018-12-07  Dimitar Dimitrov  
> > >> >
> > >> >* cfgexpand.c (asm_clobber_reg_is_valid): Also produce
> > >> >error when stack pointer is clobbered.
> > >> >(expand_asm_stmt): Refactor clobber check in separate function.
> > >> >
> > >> > gcc/testsuite/ChangeLog:
> > >> >
> > >> > 2018-12-07  Dimitar Dimitrov  
> > >> >
> > >> >* gcc.target/i386/pr52813.c: New test.
> > >> >
> > >> > Signed-off-by: Dimitar Dimitrov 
> > >>
> > >> LGTM.  Do you have a copyright assignment on file?  'Fraid this is
> > >> probably big enough to need one.
> > > Yes, I have copyright assignment.
> >
> > OK, great.  I went ahead and applied the patch.
> >
>
> Hi,
>
> This patch introduces a regression on arm:
> FAIL: gcc.target/arm/pr77904.c (test for excess errors)
> Excess errors:
> /gcc/testsuite/gcc.target/arm/pr77904.c:7:3: error: Stack Pointer
> register clobbered by 'sp' in 'asm'
>
> Indeed the testcase has an explicit:
>   __asm volatile ("" : : : "sp");
> which is now rejected.
>
> Thomas, is that mandatory to test your code to fix pr77904?
>
> Thanks,
>
> Christophe
>
> > Thanks,
> > Richard

Re: [PATCH, ARM] Improve robustness of -mslow-flash-data

2018-12-11 Thread Thomas Preudhomme

Hi Kyrill,

I've tested on armeb-none-eabi with -mslow-flash-data for both
-mfloat-abi=hard and -mfloat-abi=soft. Both show no regression and the
former shows some new PASS.

Regarding the part you are hesitant about, the code was taken from
aarch64_reinterpret_float_as_int in config/aarch64/aarch64.c. I'm not
too keen on splitting the patch unless it's just for review (i.e. still
committed as one) since the changes really go together. The tighter
predicate and constraint are there to prevent the normal patterns from
matching when -mslow-flash-data is in effect, while the new splitter and
expander are there to deal with loads under those circumstances.
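
As a rough illustration of what reinterpreting a float as an integer buys
here (a sketch only, not the code from aarch64.c or from the patch): once
the bit pattern of the constant is available as a 32-bit integer, it can be
built in a general purpose register from two 16-bit immediates, e.g. with a
movw/movt pair, without any literal pool load. The helper names float_bits,
low_half and high_half are made up for the example, and it assumes float is
a 32-bit IEEE 754 type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the bit pattern of F as a 32-bit integer.  memcpy is used
   instead of a pointer cast to stay clear of aliasing problems.
   Hypothetical helper, assumes a 32-bit IEEE 754 float.  */
static uint32_t
float_bits (float f)
{
  uint32_t bits;

  assert (sizeof bits == sizeof f);
  memcpy (&bits, &f, sizeof bits);
  return bits;
}

/* Low 16 bits of the pattern, i.e. what a movw-style immediate would
   carry.  */
static uint16_t
low_half (uint32_t bits)
{
  return (uint16_t) (bits & 0xffffu);
}

/* High 16 bits of the pattern, i.e. what a movt-style immediate would
   carry.  */
static uint16_t
high_half (uint32_t bits)
{
  return (uint16_t) (bits >> 16);
}
```

For instance float_bits (1.0f) gives 0x3f800000, so the two immediates
would be 0x0000 and 0x3f80.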

Best regards,

Thomas
On Fri, 30 Nov 2018 at 14:11, Kyrill Tkachov
 wrote:
>
> Hi Thomas,
>
> On 19/11/18 17:56, Thomas Preudhomme wrote:
> > Hi,
> >
> > Current code to handle -mslow-flash-data in machine description files
> > suffers from a number of issues which this patch fixes:
> >
> > 1) The insn_and_split in vfp.md to load a generic floating-point
> > constant via GPR first and move it to VFP register is guarded by
> > !reload_completed which is forbidden explicitly in the GCC internals
> > documentation section 17.2 point 3;
> >
> > 2) A number of testcases in the testsuite ICE under -mslow-flash-data
> > when targeting the hardfloat ABI [1];
> >
> > 3) Instructions performing load from literal pool are not disabled.
> >
> > These problems are addressed by 2 separate actions:
> >
> > 1) Making the splitters take a clobber and changing the expanders
> > accordingly to generate a mov with clobber in cases where a literal
> > pool would be used. The splitter can thus be enabled after reload since
> > it does not call gen_reg_rtx anymore;
> >
> > 2) Adding new predicates and constraints to disable literal pool loads
> > in existing instructions when -mslow-flash-data is in effect.
> >
>
> Please split these into two separate patches so we can more clearly see
> which changes address which problem.
>
> > The patch also reworks the splitter for DFmode slightly to generate an
> > intermediate DI load instead of 2 intermediate SI loads, thus relying on
> > the existing DI splitters instead of redoing their job. Lastly, the
> > patch adds some missing arm_fp_ok effective target to some of the
> > slow-flash-data testcases.
> >
> > [1]
> > c-c++-common/Wunused-var-3.c
> > gcc.c-torture/compile/pr72771.c
> > gcc.c-torture/compile/vector-5.c
> > gcc.c-torture/compile/vector-6.c
> > gcc.c-torture/execute/20030914-1.c
> > gcc.c-torture/execute/20050316-1.c
> > gcc.c-torture/execute/pr59643.c
> > gcc.dg/builtin-tgmath-1.c
> > gcc.dg/debug/pr55730.c
> > gcc.dg/graphite/interchange-7.c
> > gcc.dg/pr56890-2.c
> > gcc.dg/pr68474.c
> > gcc.dg/pr80286.c
> > gcc.dg/torture/pr35227.c
> > gcc.dg/torture/pr65077.c
> > gcc.dg/torture/pr86363.c
> > g++.dg/torture/pr81112.C
> > g++.dg/torture/pr82985.C
> > g++.dg/warn/Wunused-var-7.C
> > and a lot more in libstdc++ in special_functions/*_comp_ellint_* and
> > special_functions/*_ellint_* directories.
> >
> > ChangeLog entries are as follows:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-11-14  Thomas Preud'homme 
> >
> > * config/arm/arm.md (arm_movdi): Split if -mslow-flash-data and
> > source is a constant that would be loaded by literal pool.
> > (movsf expander): Generate a no_literal_pool_sf_immediate insn if
> > -mslow-flash-data is present, targeting hardfloat ABI and source is a
> > float constant that cannot be loaded via vmov.
> > (movdf expander): Likewise but generate a no_literal_pool_df_immediate
> > insn.
> > (arm_movsf_soft_insn): Split if -mslow-flash-data and source is a
> > float constant that would be loaded by literal pool.
> > (softfloat constant movsf splitter): Splitter for the above case.
> > (movdf_soft_insn): Split if -mslow-flash-data and source is a float
> > constant that would be loaded by literal pool.
> > (softfloat constant movdf splitter): Splitter for the above case.
> > * config/arm/constraints.md (Pz): Document existing constraint.
> > (Ha): Define constraint.
> > (Tu): Likewise.
> > * config/arm/predicates.md (hard_sf_operand): New predicate.
> > (hard_df_operand): Likewise.
> > * config/arm/thumb2.md (thumb2_movsi_insn): Split if
> > -mslow-flash-data and constant would be loaded by literal pool.
> > * constant/arm/v

[PATCH, ARM] Error out when -mfpu set and targeting Thumb-1

2018-11-29 Thread Thomas Preudhomme

Hi,

FP instructions are only enabled for TARGET_32BIT and TARGET_HARD_FLOAT
but GCC only gives an error when TARGET_HARD_FLOAT is true and -mfpu is
not set. Among other things, it makes some of the cmse tests (e.g.
gcc.target/arm/cmse/baseline/softfp.c) fail when targeting
-march=armv8-m.base -mfpu= -mfloat-abi=softfp. This patch
errors out when a Thumb-1-like target is selected and an FPU is
specified, so that such tests are skipped.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-11-28  Thomas Preud'homme

* config/arm/arm.c (arm_options_perform_arch_sanity_checks): Error out
if targeting Thumb-1 with an FPU specified.

*** gcc/testsuite/ChangeLog ***

2018-11-28  Thomas Preud'homme

* gcc.target/arm/thumb1_mfpu-1.c: New testcase.
* gcc.target/arm/thumb1_mfpu-2.c: Likewise.

Testing: No testsuite regression when targeting arm-none-eabi Armv6S-M.
Fails as expected when targeting Armv6-M with an -mfpu or a default FPU.
Succeeds without.

Is this ok for stage3?

Best regards,

Thomas
From 051e38552d7c596873e0303f6ec4272b26d50900 Mon Sep 17 00:00:00 2001
From: Thomas Preud'homme 
Date: Tue, 27 Nov 2018 15:52:38 +
Subject: [PATCH] [PATCH, ARM] Error out when -mfpu set and targeting Thumb-1

---
 gcc/config/arm/arm.c | 3 +++
 gcc/testsuite/gcc.target/arm/thumb1_mfpu-1.c | 7 +++
 gcc/testsuite/gcc.target/arm/thumb1_mfpu-2.c | 8 
 3 files changed, 18 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/arm/thumb1_mfpu-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/thumb1_mfpu-2.c

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 40f0574e32e..1a205123cf5 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3747,6 +3747,9 @@ arm_options_perform_arch_sanity_checks (void)
 {
   if (arm_abi == ARM_ABI_IWMMXT)
 	arm_pcs_default = ARM_PCS_AAPCS_IWMMXT;
+  else if (TARGET_THUMB1
+	   && bitmap_bit_p (arm_active_target.isa, isa_bit_vfpv2))
+	error ("Thumb-1 does not allow FP instructions");
   else if (TARGET_HARD_FLOAT_ABI)
 	{
 	  arm_pcs_default = ARM_PCS_AAPCS_VFP;
diff --git a/gcc/testsuite/gcc.target/arm/thumb1_mfpu-1.c b/gcc/testsuite/gcc.target/arm/thumb1_mfpu-1.c
new file mode 100644
index 000..5347e63f9b6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb1_mfpu-1.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_thumb1_ok } */
+/* { dg-skip-if "incompatible float ABI" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=softfp" } } */
+/* { dg-options "-mthumb -mfpu=vfp -mfloat-abi=softfp" } */
+/* { dg-error "Thumb-1 does not allow FP instructions" "" { target *-*-* } 0 } */
+
+int foo;
diff --git a/gcc/testsuite/gcc.target/arm/thumb1_mfpu-2.c b/gcc/testsuite/gcc.target/arm/thumb1_mfpu-2.c
new file mode 100644
index 000..941ed26ed01
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb1_mfpu-2.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_thumb1_ok } */
+/* { dg-skip-if "incompatible float ABI" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=softfp" } } */
+/* No need to skip in presence of -mfpu since arm_thumb1_ok will already fail
+   due to Thumb-1 with -mfpu which is tested by thumb1_mfpu-1 testcase.  */
+/* { dg-options "-mthumb -mfloat-abi=softfp" } */
+
+int foo;
-- 
2.19.1

Re: [PATCH, ARM, ping] Improve robustness of -mslow-flash-data

2018-11-26 Thread Thomas Preudhomme


Ping?

Best regards,

Thomas

On 19/11/2018 17:56, Thomas Preudhomme wrote:

Hi,

Current code to handle -mslow-flash-data in machine description files
suffers from a number of issues which this patch fixes:

1) The insn_and_split in vfp.md to load a generic floating-point
constant via GPR first and move it to VFP register is guarded by
!reload_completed which is forbidden explicitly in the GCC internals
documentation section 17.2 point 3;

2) A number of testcases in the testsuite ICE under -mslow-flash-data
when targeting the hardfloat ABI [1];

3) Instructions performing load from literal pool are not disabled.

These problems are addressed by 2 separate actions:

1) Making the splitters take a clobber and changing the expanders
accordingly to generate a mov with clobber in cases where a literal
pool would be used. The splitter can thus be enabled after reload since
it does not call gen_reg_rtx anymore;

2) Adding new predicates and constraints to disable literal pool loads
in existing instructions when -mslow-flash-data is in effect.

The patch also reworks the splitter for DFmode slightly to generate an
intermediate DI load instead of 2 intermediate SI loads, thus relying on
the existing DI splitters instead of redoing their job. Lastly, the
patch adds some missing arm_fp_ok effective target to some of the
slow-flash-data testcases.
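
To illustrate the DFmode change (only a sketch with made-up helper names,
not code from the patch, and assuming double is a 64-bit IEEE 754 type): the
constant is first viewed as one 64-bit integer, and a generic 64-bit split
then yields the two 32-bit words, instead of the DFmode code computing both
words itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the bit pattern of D as a single 64-bit integer, which plays
   the role of the intermediate DI value.  Hypothetical helper, assumes
   a 64-bit IEEE 754 double.  */
static uint64_t
double_bits (double d)
{
  uint64_t bits;

  assert (sizeof bits == sizeof d);
  memcpy (&bits, &d, sizeof bits);
  return bits;
}

/* Generic split of a 64-bit value into its low 32-bit word.  Nothing
   here is specific to floating-point.  */
static uint32_t
low_word (uint64_t bits)
{
  return (uint32_t) (bits & 0xffffffffu);
}

/* Generic split of a 64-bit value into its high 32-bit word.  */
static uint32_t
high_word (uint64_t bits)
{
  return (uint32_t) (bits >> 32);
}
```

For instance double_bits (1.0) gives 0x3ff0000000000000, which the generic
split turns into a high word of 0x3ff00000 and a low word of 0.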

[1]
c-c++-common/Wunused-var-3.c
gcc.c-torture/compile/pr72771.c
gcc.c-torture/compile/vector-5.c
gcc.c-torture/compile/vector-6.c
gcc.c-torture/execute/20030914-1.c
gcc.c-torture/execute/20050316-1.c
gcc.c-torture/execute/pr59643.c
gcc.dg/builtin-tgmath-1.c
gcc.dg/debug/pr55730.c
gcc.dg/graphite/interchange-7.c
gcc.dg/pr56890-2.c
gcc.dg/pr68474.c
gcc.dg/pr80286.c
gcc.dg/torture/pr35227.c
gcc.dg/torture/pr65077.c
gcc.dg/torture/pr86363.c
g++.dg/torture/pr81112.C
g++.dg/torture/pr82985.C
g++.dg/warn/Wunused-var-7.C
and a lot more in libstdc++ in special_functions/*_comp_ellint_* and
special_functions/*_ellint_* directories.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-11-14  Thomas Preud'homme  

 * config/arm/arm.md (arm_movdi): Split if -mslow-flash-data and
 source is a constant that would be loaded by literal pool.
 (movsf expander): Generate a no_literal_pool_sf_immediate insn if
 -mslow-flash-data is present, targeting hardfloat ABI and source is a
 float constant that cannot be loaded via vmov.
 (movdf expander): Likewise but generate a no_literal_pool_df_immediate
 insn.
 (arm_movsf_soft_insn): Split if -mslow-flash-data and source is a
 float constant that would be loaded by literal pool.
 (softfloat constant movsf splitter): Splitter for the above case.
 (movdf_soft_insn): Split if -mslow-flash-data and source is a float
 constant that would be loaded by literal pool.
 (softfloat constant movdf splitter): Splitter for the above case.
 * config/arm/constraints.md (Pz): Document existing constraint.
 (Ha): Define constraint.
 (Tu): Likewise.
 * config/arm/predicates.md (hard_sf_operand): New predicate.
 (hard_df_operand): Likewise.
 * config/arm/thumb2.md (thumb2_movsi_insn): Split if
 -mslow-flash-data and constant would be loaded by literal pool.
 * config/arm/vfp.md (thumb2_movsi_vfp): Likewise and disable constant
 load in VFP register.
 (movdi_vfp): Likewise.
 (thumb2_movsf_vfp): Use hard_sf_operand as predicate for source to
 prevent match for a constant load if -mslow-flash-data and constant
 cannot be loaded via vmov.  Adapt constraint accordingly by
 using Ha instead of E for generic floating-point constant load.
 (thumb2_movdf_vfp): Likewise using hard_df_operand predicate instead.
 (no_literal_pool_df_immediate): Add a clobber to use as the
 intermediate general purpose register and also enable it after reload
 but disable it if the constant is a valid FP constant.  Add constraints
 and generate a DI intermediate load rather than 2 SI loads.
 (no_literal_pool_sf_immediate): Add a clobber to use as the
 intermediate general purpose register and also enable it after
 reload.

*** gcc/testsuite/ChangeLog ***

2018-11-14  Thomas Preud'homme  

 * gcc.target/arm/thumb2-slow-flash-data-2.c: Require arm_fp_ok
 effective target.
 * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
 * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
 * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.

Testing: Built arm-none-eabi cross compilers for Armv7E-M defaulting to
softfloat and hardfloat ABI which showed no regression and some
FAIL->PASS for hardfloat ABI. Bootstrapped on Arm and Thumb-2 without any
regression. Compiled SPEC2k6 without -mslow-flash-data and checked that
code generation didn't change.

Is this ok for stage3?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index a773518cefaf8451e77fead9e072ee8ef39f1

Re: [PATCH, libgcc/ARM & testsuite, ping] Optimize executable size when using softfloat fmul/dmul

2018-11-26 Thread Thomas Preudhomme

Ping?

Best regards,

Thomas

On Mon, 19 Nov 2018 at 10:51, Thomas Preudhomme
 wrote:
>
> FWIW, the testcases were taken from
> https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01026.html
>
> Previous approach for fixing tying of fmul to fdiv can be seen in
> https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01971.html. As mentioned
> in the cover letter, this patch went for a completely different
> approach and does not share any code besides the testcases.
>
> Best regards,
>
> Thomas
> On Mon, 19 Nov 2018 at 09:57, Thomas Preudhomme
>  wrote:
> >
> > Softfloat single precision and double precision floating-point
> > multiplication routines in libgcc share some code with the
> > floating-point division of their corresponding precision. As the code
> > is structured now, this leads to *all* division code being pulled into an
> > executable in softfloat mode even if only multiplication is
> > performed.
> >
> > This patch creates some new LIB1ASMFUNCS macros to also build files with
> > just the multiplication and shared code as weak symbols. By putting
> > these earlier in the static library, they can then be picked up when
> > only multiplication is used and they are overridden by the global
> > definition in the existing file containing both multiplication and
> > division code when division is needed.
> >
> > The patch also removes changes made to the FUNC_START and ARM_FUNC_START
> > macros in r218124 since the intent was to put multiplication and
> > division code into their own section in a later patch to achieve the
> > same size optimization. That approach relied on specific section layout
> > to ensure multiplication and division were not too far from the shared
> > bit of code in order for the branches to be within range. Due to lack of
> > guarantee regarding section layout, in particular with all the
> > possibilities of linker scripts, this approach was chosen instead. This
> > patch keeps the two testcases that were posted by Tony Wang (an Arm
> > employee at the time) on the mailing list to implement this approach
> > and adds a new one, hence the attribution.
> >
> > ChangeLog entries are as follows:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-11-14  Thomas Preud'homme  
> >
> > * config/arm/elf.h: Update comment about conditions that need to
> > match with libgcc/config/arm/lib1funcs.S to also include
> > libgcc/config/arm/t-arm.
> > * doc/sourcebuild.texi (output-exists, output-exists-not): Rename
> > subsubsection these directives are in to "Check for output files".
> > Move scan-symbol to that section and add to it new scan-symbol-not
> > directive.
> >
> > *** gcc/testsuite/ChangeLog ***
> >
> > 2018-11-16  Tony Wang  
> > Thomas Preud'homme  
> >
> > * lib/lto.exp (lto-execute): Define output_file and testname_with_flags
> > to same value as execname.
> > (scan-symbol): Move and rename to ...
> > * lib/gcc-dg.exp (scan-symbol-common): This.  Adapt into a
> > helper function returning true or false if a symbol is present.
> > (scan-symbol): New procedure.
> > (scan-symbol-not): Likewise.
> > * gcc.target/arm/size-optimization-ieee-1.c: New testcase.
> > * gcc.target/arm/size-optimization-ieee-2.c: Likewise.
> > * gcc.target/arm/size-optimization-ieee-3.c: Likewise.
> >
> > *** libgcc/ChangeLog ***
> >
> > 2018-11-16  Thomas Preud'homme  
> >
> > * config/arm/lib1funcs.S (FUNC_START): Remove unused sp_section
> > parameter and corresponding code.
> > (ARM_FUNC_START): Likewise in both definitions.
> > Also update footer comment about conditions that need to match with
> > gcc/config/arm/elf.h to also include libgcc/config/arm/t-arm.
> > * config/arm/ieee754-df.S (muldf3): Also build it if L_arm_muldf3 is
> > defined.  Weakly define it in this case.
> > * config/arm/ieee754-sf.S (mulsf3): Likewise with L_arm_mulsf3.
> > * config/arm/t-elf (LIB1ASMFUNCS): Build _arm_muldf3.o and
> > _arm_mulsf3.o before muldiv versions if targeting Thumb-1 only. Add
> > comment to keep condition in sync with the one in
> > libgcc/config/arm/lib1funcs.S and gcc/config/arm/elf.h.
> >
> > Testing: Bootstrapped on arm-linux-gnueabihf (Arm & Thumb-2) and
> > testsuite shows no regression. Also built an arm-none-eabi cross
> > compiler targeting soft-float which also shows no regression. In
> > particular newly added tests and gcc.dg/lto/20081212-1 test pass.
> >
> >

Re: [PATCH, ARM, ping3] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-11-22 Thread Thomas Preudhomme

I'm talking about the PIC access to the guard's variable. See for
example the pr85434.c testcase contributed with this patch when
compiled for aarch64 with -Os -fpic -march=armv8-a
-fstack-protector-strong:

(insn 227 226 228 33 (set (reg:DI 90)
(high:DI (symbol_ref:DI ("_GLOBAL_OFFSET_TABLE_"
"/data/dev/checkouts/private/linaro/gcc/gcc/testsuite/gcc.target/arm/pr85434.c":148:1
-1
 (nil))
(insn 228 227 229 33 (set (reg/f:DI 244)
(unspec:DI [
(mem/u/c:DI (lo_sum:DI (reg:DI 90)
(symbol_ref:DI ("__stack_chk_guard") [flags
0xc0]  )) [0  S8 A8])
] UNSPEC_GOTSMALLPIC28K))
"/data/dev/checkouts/private/linaro/gcc/gcc/testsuite/gcc.target/arm/pr85434.c":148:1
-1
 (expr_list:REG_EQUAL (symbol_ref:DI ("__stack_chk_guard") [flags
0xc0]  )
(nil)))
(insn 229 228 230 33 (parallel [
(set (reg:DI 245)
(unspec:DI [
(mem/v/f/c:DI (plus:DI (reg/f:DI 85 virtual-stack-vars)
(const_int -8 [0xfff8]))
[4 D.3715+0 S8 A64])
(mem/v/f/c:DI (reg/f:DI 244) [4
__stack_chk_guard+0 S8 A64])
] UNSPEC_SP_TEST))
(clobber (scratch:DI))
]) 
"/data/dev/checkouts/private/linaro/gcc/gcc/testsuite/gcc.target/arm/pr85434.c":148:1
-1
 (nil))

The unspec in insn 228 is not CSEd in my experiment despite the same
instruction happening in the prologue to set the canary. In the arm
backend it was, but there the PIC access is of the form (mem (reg) (unspec
offset)), i.e. the outermost rtx in the source is not an unspec.

Best regards,

Thomas
On Wed, 21 Nov 2018 at 17:54, Segher Boessenkool
 wrote:
>
> On Fri, Nov 16, 2018 at 02:56:46PM +, Thomas Preudhomme wrote:
> > In case of high register pressure in PIC mode, address of the stack
> > protector's guard can be spilled on ARM targets as shown in PR85434,
> > thus allowing an attacker to control what the canary would be compared
> > against. ARM does lack stack_protect_set and stack_protect_test insn
> > patterns, but defining them does not help as the address is expanded
> > regularly and the patterns only deal with the copy and test of the
> > guard with the canary.
> >
> > This problem does not occur for x86 targets because the PIC access and
> > the test can be done in the same instruction. Aarch64 is exempt too
> > because PIC access insn patterns are mov of UNSPEC which prevents
> > the second access in the epilogue from being CSEd in cse_local pass with
> > the first access in the prologue.
>
> The unspecs are not CSEd because they are *different* unspecs (UNSPEC_SP_SET
> vs. UNSPEC_SP_TEST; they have different args too, different number of args
> even).  Two identical unspecs can be CSEd just fine.
>
>
> Segher

Re: [PATCH, ARM, ping3] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-11-22 Thread Thomas Preudhomme

Thanks Kyrill. Committed the attached patch.

Best regards,

Thomas
On Wed, 21 Nov 2018 at 16:06, Kyrill Tkachov
 wrote:
>
> Hi Thomas,
>
> Sorry for the delay.
>
> On 16/11/18 14:56, Thomas Preudhomme wrote:
> > Ping?
> >
> > Best regards,
> >
> > Thomas
> >
> > On Sat, 10 Nov 2018 at 15:07, Thomas Preudhomme
> >  wrote:
> >> Thanks Kyrill.
> >>
> >> Updated patch in attachment. Best regards,
> >>
> >> Thomas
> >> On Thu, 8 Nov 2018 at 15:53, Kyrill Tkachov  
> >> wrote:
> >>> Hi Thomas,
> >>>
> >>> On 08/11/18 09:52, Thomas Preudhomme wrote:
> >>>> Ping?
> >>>>
> >>>> Best regards,
> >>>>
> >>>> Thomas
> >>>>
> >>>> On Thu, 1 Nov 2018 at 16:03, Thomas Preudhomme
> >>>>  wrote:
> >>>>> Ping?
> >>>>>
> >>>>> Best regards,
> >>>>>
> >>>>> Thomas
> >>>>> On Fri, 26 Oct 2018 at 22:41, Thomas Preudhomme
> >>>>>  wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> Please find updated patch to fix PR85434: spilling of stack protector
> >>>>>> guard's address on ARM. Quite a few changes have been made to the ARM
> >>>>>> part since last round of review so I think it makes more sense to
> >>>>>> review it anew. Ran bootstrap + regression testsuite + glibc build +
> >>>>>> glibc regression testsuite for Arm and Thumb-2 and bootstrap +
> >>>>>> regression testsuite for Thumb-1. GCC's regression testsuite was run
> >>>>>> in 3 configurations in all those cases:
> >>>>>>
> >>>>>> - default configuration (no RUNTESTFLAGS)
> >>>>>> - with -fstack-protector-all
> >>>>>> - with -fPIC -fstack-protector-all (to exercise both code paths in stack
> >>>>>> protector's split code)
> >>>>>>
> >>>>>> None of these shows any regression beyond some new scan failures with
> >>>>>> -fstack-protector-all or -fPIC due to unexpected code sequence for the
> >>>>>> testcases concerned and some guality swing due to less optimization
> >>>>>> with new stack protector on.
> >>>>>>
> >>>>>> Patch description and ChangeLog below.
> >>>>>>
> >>>>>> In case of high register pressure in PIC mode, address of the stack
> >>>>>> protector's guard can be spilled on ARM targets as shown in PR85434,
> >>>>>> thus allowing an attacker to control what the canary would be compared
> >>>>>> against. ARM does lack stack_protect_set and stack_protect_test insn
> >>>>>> patterns, defining them does not help as the address is expanded
> >>>>>> regularly and the patterns only deal with the copy and test of the
> >>>>>> guard with the canary.
> >>>>>>
> >>>>>> This problem does not occur for x86 targets because the PIC access and
> >>>>>> the test can be done in the same instruction. AArch64 is exempt too
> >>>>>> because its PIC access insn patterns are a mov of an UNSPEC, which prevents
> >>>>>> the second access in the epilogue from being CSEd in the cse_local pass
> >>>>>> with the first access in the prologue.
> >>>>>>
> >>>>>> The approach followed here is to create new "combined" set and test
> >>>>>> standard pattern names that take the unexpanded guard and do the set or
> >>>>>> test. This allows the target to use an opaque pattern (eg. using UNSPEC)
> >>>>>> to hide the individual instructions being generated to the compiler and
> >>>>>> split the pattern into generic load, compare and branch instruction
> >>>>>> after register allocator, therefore avoiding any spilling. This is here
> >>>>>> implemented for the ARM targets. For targets not implementing these new
> >>>>>> standard pattern names, the existing stack_protect_set and
> >>>>>> stack_protect_test pattern names are used.
> >>>>>>
> >>>>>> To be able to split PIC access after re

Re: [PATCH, ARM, ping3] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-11-21 Thread Thomas Preudhomme

Yes, you did indeed, which is why I didn't include you in the To list.
I've reworked the Arm part significantly since it was last approved,
the ping is meant for the Arm maintainers.

Thanks for enquiring about it. Best regards,

Thomas
On Wed, 21 Nov 2018 at 00:32, Jeff Law  wrote:
>
> On 11/16/18 7:56 AM, Thomas Preudhomme wrote:
> > Ping?
> I thought I acked the target independent stuff a while back.  What's
> still waiting on review here?
>
> jeff
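
For readers joining the thread at this point, the problem behind PR85434 can
be illustrated with a toy model. This is only a sketch in Python, not GCC
code: memory is a dictionary, and every name in it (run_function, overflow,
the "frame.*" keys) is made up for the illustration. It shows why checking
the canary through a spilled copy of the guard's address is weaker than
recomputing that address in the epilogue.

```python
from typing import Callable, Dict, Optional

# Toy memory: maps an "address" (a name) to the value stored there.
Memory = Dict[str, object]


def run_function(reuse_spilled_address: bool,
                 attack: Optional[Callable[[Memory], None]] = None) -> bool:
    """Model one protected call; return True if the epilogue check passes."""
    memory: Memory = {"guard": 0x5EED1234}

    # Prologue: copy the guard into the frame as the canary. Under register
    # pressure the guard's address is also spilled to the frame.
    memory["frame.canary"] = memory["guard"]
    memory["frame.spilled_guard_addr"] = "guard"

    # Function body: an overflow may rewrite anything in the frame.
    if attack is not None:
        attack(memory)

    # Epilogue: compare the canary against the guard, reached either through
    # the spilled address or through a freshly computed one.
    if reuse_spilled_address:
        addr = str(memory["frame.spilled_guard_addr"])
    else:
        addr = "guard"
    return memory["frame.canary"] == memory[addr]


def overflow(memory: Memory) -> None:
    """Attacker write: fake guard, matching canary, redirected address."""
    memory["attacker_value"] = 0x41414141
    memory["frame.canary"] = 0x41414141
    memory["frame.spilled_guard_addr"] = "attacker_value"


if __name__ == "__main__":
    print("spilled address, check passes:", run_function(True, overflow))
    print("fresh address, check passes:  ", run_function(False, overflow))
```

With the spilled address the attacker chooses what the canary is compared
against, so the smashed frame goes unnoticed; with a fresh address the
comparison fails as intended.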

[PATCH, ARM] Improve robustness of -mslow-flash-data

2018-11-19 Thread Thomas Preudhomme


Hi,

Current code to handle -mslow-flash-data in machine description files
suffers from a number of issues which this patch fixes:

1) The insn_and_split patterns in vfp.md that load a generic floating-point
constant via a GPR first and then move it to a VFP register are guarded by
!reload_completed, which is explicitly forbidden in the GCC internals
documentation section 17.2 point 3;

2) A number of testcases in the testsuite ICE under -mslow-flash-data
when targeting the hardfloat ABI [1];

3) Instructions performing load from literal pool are not disabled.

These problems are addressed by 2 separate actions:

1) Making the splitters take a clobber and changing the expanders
accordingly to generate a mov with clobber in cases where a literal
pool would be used. The splitter can thus be enabled after reload since
it does not call gen_reg_rtx anymore;

2) Adding new predicates and constraints to disable literal pool loads
in existing instructions when -mslow-flash-data is in effect.

The patch also reworks the splitter for DFmode slightly to generate an
intermediate DI load instead of 2 intermediate SI loads, thus relying on
the existing DI splitters instead of redoing their job. Finally, the
patch adds the missing arm_fp_ok effective target to some of the
slow-flash-data testcases.
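
To make the GPR route concrete, here is a small Python sketch of the
arithmetic involved. It assumes the constant is built in the general purpose
register with a movw/movt pair of 16-bit immediates; that instruction choice
is an assumption made for illustration, since the description above only says
the constant goes through a GPR. The helper names are hypothetical.

```python
import struct
from typing import Tuple


def float_bits(value: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of value."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def double_bits(value: float) -> int:
    """Return the IEEE 754 double-precision bit pattern (the DI value)."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def split_halves(bits: int) -> Tuple[int, int]:
    """Split a 32-bit pattern into (low, high) 16-bit immediates."""
    return bits & 0xFFFF, (bits >> 16) & 0xFFFF


def build_in_gpr(low: int, high: int) -> int:
    """Model the register contents after the assumed movw/movt pair."""
    reg = low            # movw rN, #low  (writes low half, clears top half)
    reg |= high << 16    # movt rN, #high (writes top half, keeps low half)
    return reg


if __name__ == "__main__":
    bits = float_bits(1.5)
    low, high = split_halves(bits)
    print(f"SF 1.5 = 0x{bits:08x}: movw #0x{low:04x}, movt #0x{high:04x}")

    di = double_bits(1.0)
    # A DI value is two 32-bit words, each built the same way.
    print(f"DF 1.0 = 0x{di:016x}: words 0x{di & 0xFFFFFFFF:08x}, 0x{di >> 32:08x}")
```

No literal pool entry is needed in either case: the constant lives entirely
in instruction immediates and is then moved to the VFP register.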

[1]
c-c++-common/Wunused-var-3.c
gcc.c-torture/compile/pr72771.c
gcc.c-torture/compile/vector-5.c
gcc.c-torture/compile/vector-6.c
gcc.c-torture/execute/20030914-1.c
gcc.c-torture/execute/20050316-1.c
gcc.c-torture/execute/pr59643.c
gcc.dg/builtin-tgmath-1.c
gcc.dg/debug/pr55730.c
gcc.dg/graphite/interchange-7.c
gcc.dg/pr56890-2.c
gcc.dg/pr68474.c
gcc.dg/pr80286.c
gcc.dg/torture/pr35227.c
gcc.dg/torture/pr65077.c
gcc.dg/torture/pr86363.c
g++.dg/torture/pr81112.C
g++.dg/torture/pr82985.C
g++.dg/warn/Wunused-var-7.C
and a lot more in libstdc++ in special_functions/*_comp_ellint_* and
special_functions/*_ellint_* directories.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-11-14  Thomas Preud'homme  

* config/arm/arm.md (arm_movdi): Split if -mslow-flash-data and
source is a constant that would be loaded by literal pool.
(movsf expander): Generate a no_literal_pool_sf_immediate insn if
-mslow-flash-data is present, targeting hardfloat ABI and source is a
float constant that cannot be loaded via vmov.
(movdf expander): Likewise but generate a no_literal_pool_df_immediate
insn.
(arm_movsf_soft_insn): Split if -mslow-flash-data and source is a
float constant that would be loaded by literal pool.
(softfloat constant movsf splitter): Splitter for the above case.
(movdf_soft_insn): Split if -mslow-flash-data and source is a float
constant that would be loaded by literal pool.
(softfloat constant movdf splitter): Splitter for the above case.
* config/arm/constraints.md (Pz): Document existing constraint.
(Ha): Define constraint.
(Tu): Likewise.
* config/arm/predicates.md (hard_sf_operand): New predicate.
(hard_df_operand): Likewise.
* config/arm/thumb2.md (thumb2_movsi_insn): Split if
-mslow-flash-data and constant would be loaded by literal pool.
* config/arm/vfp.md (thumb2_movsi_vfp): Likewise and disable constant
load in VFP register.
(movdi_vfp): Likewise.
(thumb2_movsf_vfp): Use hard_sf_operand as predicate for source to
prevent match for a constant load if -mslow-flash-data and constant
cannot be loaded via vmov.  Adapt constraint accordingly by
using Ha instead of E for generic floating-point constant load.
(thumb2_movdf_vfp): Likewise using hard_df_operand predicate instead.
(no_literal_pool_df_immediate): Add a clobber to use as the
intermediate general purpose register and also enable it after reload
but disable it if the constant is a valid FP constant.  Add constraints and
generate a DI intermediate load rather than 2 SI loads.
(no_literal_pool_sf_immediate): Add a clobber to use as the
intermediate general purpose register and also enable it after
reload.

*** gcc/testsuite/ChangeLog ***

2018-11-14  Thomas Preud'homme  

* gcc.target/arm/thumb2-slow-flash-data-2.c: Require arm_fp_ok
effective target.
* gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
* gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
* gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.

Testing: Built arm-none-eabi cross compilers for Armv7E-M defaulting to
softfloat and hardfloat ABI which showed no regression and some
FAIL->PASS for hardfloat ABI. Bootstrapped on Arm and Thumb-2 without any
regression. Compiled SPEC2k6 without -mslow-flash-data and checked that
code generation didn't change.

Is this ok for stage3?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index

Re: [PATCH, libgcc/ARM & testsuite] Optimize executable size when using softfloat fmul/dmul

2018-11-19 Thread Thomas Preudhomme

FWIW, the testcases were taken from
https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01026.html

Previous approach for fixing tying of fmul to fdiv can be seen in
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01971.html. As mentioned
in the cover letter, this patch went for a completely different
approach and does not share any code besides the testcases.

Best regards,

Thomas
On Mon, 19 Nov 2018 at 09:57, Thomas Preudhomme
 wrote:
>
> Softfloat single precision and double precision floating-point
> multiplication routines in libgcc share some code with the
> floating-point division of their corresponding precision. As the code
> is structured now, this leads to *all* division code being pulled in an
> executable in softfloat mode even if only multiplication is
> performed.
>
> This patch creates some new LIB1ASMFUNCS macros to also build files with
> just the multiplication and shared code as weak symbols. By putting
> these earlier in the static library, they can then be picked up when
> only multiplication is used and they are overridden by the global
> definition in the existing file containing both multiplication and
> division code when division is needed.
>
> The patch also removes changes made to the FUNC_START and ARM_FUNC_START
> macros in r218124 since the intent was to put multiplication and
> division code into their own section in a later patch to achieve the
> same size optimization. That approach relied on specific section layout
> to ensure multiplication and division were not too far from the shared
> bit of code in order for the branches to be within range. Due to the lack of
> guarantee regarding section layout, in particular with all the
> possibilities of linker scripts, the present approach was chosen instead. This
> patch keeps the two testcases that were posted by Tony Wang (an Arm
> employee at the time) on the mailing list to implement this approach
> and adds a new one, hence the attribution.
>
> ChangeLog entries are as follows:
>
> *** gcc/ChangeLog ***
>
> 2018-11-14  Thomas Preud'homme  
>
> * config/arm/elf.h: Update comment about condition that need to
> match with libgcc/config/arm/lib1funcs.S to also include
> libgcc/config/arm/t-arm.
> * doc/sourcebuild.texi (output-exists, output-exists-not): Rename
> subsubsection these directives are in to "Check for output files".
> Move scan-symbol to that section and add to it new scan-symbol-not
> directive.
>
> *** gcc/testsuite/ChangeLog ***
>
> 2018-11-16  Tony Wang  
> Thomas Preud'homme  
>
> * lib/lto.exp (lto-execute): Define output_file and testname_with_flags
> to same value as execname.
> (scan-symbol): Move and rename to ...
> * lib/gcc-dg.exp (scan-symbol-common): This.  Adapt into a
> helper function returning true or false if a symbol is present.
> (scan-symbol): New procedure.
> (scan-symbol-not): Likewise.
> * gcc.target/arm/size-optimization-ieee-1.c: New testcase.
> * gcc.target/arm/size-optimization-ieee-2.c: Likewise.
> * gcc.target/arm/size-optimization-ieee-3.c: Likewise.
>
> *** libgcc/ChangeLog ***
>
> 2018-11-16  Thomas Preud'homme  
>
> * config/arm/lib1funcs.S (FUNC_START): Remove unused sp_section
> parameter and corresponding code.
> (ARM_FUNC_START): Likewise in both definitions.
> Also update footer comment about condition that need to match with
> gcc/config/arm/elf.h to also include libgcc/config/arm/t-arm.
> * config/arm/ieee754-df.S (muldf3): Also build it if L_arm_muldf3 is
> defined.  Weakly define it in this case.
> * config/arm/ieee754-sf.S (mulsf3): Likewise with L_arm_mulsf3.
> * config/arm/t-elf (LIB1ASMFUNCS): Build _arm_muldf3.o and
> _arm_mulsf3.o before muldiv versions if targeting Thumb-1 only. Add
> comment to keep condition in sync with the one in
> libgcc/config/arm/lib1funcs.S and gcc/config/arm/elf.h.
>
> Testing: Bootstrapped on arm-linux-gnueabihf (Arm & Thumb-2) and testsuite
> shows no regression. Also built an arm-none-eabi cross compiler targeting
> soft-float which also shows no regression. In particular newly added
> tests and gcc.dg/lto/20081212-1 test pass.
>
> Is this ok for stage3?
>
> Best regards,
>
> Thomas

[PATCH, libgcc/ARM & testsuite] Optimize executable size when using softfloat fmul/dmul

2018-11-19 Thread Thomas Preudhomme

Softfloat single precision and double precision floating-point
multiplication routines in libgcc share some code with the
floating-point division of their corresponding precision. As the code
is structured now, this leads to *all* division code being pulled in an
executable in softfloat mode even if only multiplication is
performed.

This patch creates some new LIB1ASMFUNCS macros to also build files with
just the multiplication and shared code as weak symbols. By putting
these earlier in the static library, they can then be picked up when
only multiplication is used and they are overridden by the global
definition in the existing file containing both multiplication and
division code when division is needed.
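
The resolution order described above can be sketched with a simplified model
of a single in-order scan of a static library. This is an illustration in
Python, not how any particular linker is implemented: the Member type and
link function are hypothetical, the member names are only indicative, and the
model assumes an archive member is pulled in solely when it defines a symbol
that is still undefined.

```python
from typing import Dict, List, NamedTuple, Set, Tuple


class Member(NamedTuple):
    name: str
    defines: Dict[str, str]  # symbol -> "weak" or "global"


def link(archive: List[Member],
         undefined: Set[str]) -> Tuple[List[str], Dict[str, Tuple[str, str]]]:
    """Scan the archive once, in order, and resolve the undefined symbols.

    Returns the members pulled in and, per symbol, (binding, member).
    """
    pending = set(undefined)
    pulled: List[str] = []
    resolved: Dict[str, Tuple[str, str]] = {}
    for member in archive:
        # A member is only extracted if it satisfies a pending reference.
        if not pending.intersection(member.defines):
            continue
        pulled.append(member.name)
        for symbol, binding in member.defines.items():
            current = resolved.get(symbol)
            # A global definition overrides an earlier weak one.
            if current is None or (current[0] == "weak" and binding == "global"):
                resolved[symbol] = (binding, member.name)
            pending.discard(symbol)
    return pulled, resolved


# Multiplication-only member placed before the combined one.
ARCHIVE = [
    Member("_arm_mulsf3.o", {"__aeabi_fmul": "weak"}),
    Member("_arm_muldivsf3.o", {"__aeabi_fmul": "global",
                                "__aeabi_fdiv": "global"}),
]

if __name__ == "__main__":
    print(link(ARCHIVE, {"__aeabi_fmul"}))
    print(link(ARCHIVE, {"__aeabi_fmul", "__aeabi_fdiv"}))
```

When only multiplication is referenced, the scan stops needing anything after
the first member, so the division code never enters the link. When division
is referenced too, the combined member is extracted and its global
multiplication symbol takes precedence over the weak one.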

The patch also removes changes made to the FUNC_START and ARM_FUNC_START
macros in r218124 since the intent was to put multiplication and
division code into their own section in a later patch to achieve the
same size optimization. That approach relied on specific section layout
to ensure multiplication and division were not too far from the shared
bit of code in order for the branches to be within range. Due to the lack of
guarantee regarding section layout, in particular with all the
possibilities of linker scripts, the present approach was chosen instead. This
patch keeps the two testcases that were posted by Tony Wang (an Arm
employee at the time) on the mailing list to implement this approach
and adds a new one, hence the attribution.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-11-14  Thomas Preud'homme  

* config/arm/elf.h: Update comment about condition that need to
match with libgcc/config/arm/lib1funcs.S to also include
libgcc/config/arm/t-arm.
* doc/sourcebuild.texi (output-exists, output-exists-not): Rename
subsubsection these directives are in to "Check for output files".
Move scan-symbol to that section and add to it new scan-symbol-not
directive.

*** gcc/testsuite/ChangeLog ***

2018-11-16  Tony Wang  
Thomas Preud'homme  

* lib/lto.exp (lto-execute): Define output_file and testname_with_flags
to same value as execname.
(scan-symbol): Move and rename to ...
* lib/gcc-dg.exp (scan-symbol-common): This.  Adapt into a
helper function returning true or false if a symbol is present.
(scan-symbol): New procedure.
(scan-symbol-not): Likewise.
* gcc.target/arm/size-optimization-ieee-1.c: New testcase.
* gcc.target/arm/size-optimization-ieee-2.c: Likewise.
* gcc.target/arm/size-optimization-ieee-3.c: Likewise.

*** libgcc/ChangeLog ***

2018-11-16  Thomas Preud'homme  

* config/arm/lib1funcs.S (FUNC_START): Remove unused sp_section
parameter and corresponding code.
(ARM_FUNC_START): Likewise in both definitions.
Also update footer comment about condition that need to match with
gcc/config/arm/elf.h to also include libgcc/config/arm/t-arm.
* config/arm/ieee754-df.S (muldf3): Also build it if L_arm_muldf3 is
defined.  Weakly define it in this case.
* config/arm/ieee754-sf.S (mulsf3): Likewise with L_arm_mulsf3.
* config/arm/t-elf (LIB1ASMFUNCS): Build _arm_muldf3.o and
_arm_mulsf3.o before muldiv versions if targeting Thumb-1 only. Add
comment to keep condition in sync with the one in
libgcc/config/arm/lib1funcs.S and gcc/config/arm/elf.h.

Testing: Bootstrapped on arm-linux-gnueabihf (Arm & Thumb-2) and testsuite
shows no regression. Also built an arm-none-eabi cross compiler targeting
soft-float which also shows no regression. In particular newly added
tests and gcc.dg/lto/20081212-1 test pass.

Is this ok for stage3?

Best regards,

Thomas
From 8740697791f99b7175e188f049663883c39e51b0 Mon Sep 17 00:00:00 2001
From: Thomas Preud'homme 
Date: Fri, 26 Oct 2018 16:21:09 +0100
Subject: [PATCH] [PATCH, libgcc/ARM] Optimize executable size when using
 softfloat fmul/dmul

Softfloat single precision and double precision floating-point
multiplication routines in libgcc share some code with the
floating-point division of their corresponding precision. As the code
is structured now, this leads to *all* division code being pulled in an
executable in softfloat mode even if only multiplication is
performed.

This patch creates some new LIB1ASMFUNCS macros to also build files with
just the multiplication and shared code as weak symbols. By putting
these earlier in the static library, they can then be picked up when
only multiplication is used and they are overridden by the global
definition in the existing file containing both multiplication and
division code when division is needed.

The patch also removes changes made to the FUNC_START and ARM_FUNC_START
macros in r218124 since the intent was to put multiplication and
division code into their own section in a later patch to achieve the
same size optimization. That approach relied on specific section layout
to ensure multiplication and division were not too far from the shared
bit of code in order for the branches to be within

Re: [PATCH, ARM, ping3] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-11-16 Thread Thomas Preudhomme

Ping?

Best regards,

Thomas

On Sat, 10 Nov 2018 at 15:07, Thomas Preudhomme
 wrote:
>
> Thanks Kyrill.
>
> Updated patch in attachment. Best regards,
>
> Thomas
> On Thu, 8 Nov 2018 at 15:53, Kyrill Tkachov  
> wrote:
> >
> > Hi Thomas,
> >
> > On 08/11/18 09:52, Thomas Preudhomme wrote:
> > > Ping?
> > >
> > > Best regards,
> > >
> > > Thomas
> > >
> > > On Thu, 1 Nov 2018 at 16:03, Thomas Preudhomme
> > >  wrote:
> > >> Ping?
> > >>
> > >> Best regards,
> > >>
> > >> Thomas
> > >> On Fri, 26 Oct 2018 at 22:41, Thomas Preudhomme
> > >>  wrote:
> > >>> Hi,
> > >>>
> > >>> Please find updated patch to fix PR85434: spilling of stack protector
> > >>> guard's address on ARM. Quite a few changes have been made to the ARM
> > >>> part since last round of review so I think it makes more sense to
> > >>> review it anew. Ran bootstrap + regression testsuite + glibc build +
> > >>> glibc regression testsuite for Arm and Thumb-2 and bootstrap +
> > >>> regression testsuite for Thumb-1. GCC's regression testsuite was run
> > >>> in 3 configurations in all those cases:
> > >>>
> > >>> - default configuration (no RUNTESTFLAGS)
> > >>> - with -fstack-protector-all
> > >>> - with -fPIC -fstack-protector-all (to exercise both codepaths in stack
> > >>> protector's split code)
> > >>>
> > >>> None of these show any regression beyond some new scan failures with
> > >>> -fstack-protector-all or -fPIC due to unexpected code sequence for the
> > >>> testcases concerned and some guality swing due to less optimization
> > >>> with new stack protector on.
> > >>>
> > >>> Patch description and ChangeLog below.
> > >>>
> > >>> In case of high register pressure in PIC mode, address of the stack
> > >>> protector's guard can be spilled on ARM targets as shown in PR85434,
> > >>> thus allowing an attacker to control what the canary would be compared
> > >>> against. ARM does lack stack_protect_set and stack_protect_test insn
> > >>> patterns, but defining them would not help as the address is expanded
> > >>> regularly and the patterns only deal with the copy and test of the
> > >>> guard with the canary.
> > >>>
> > >>> This problem does not occur for x86 targets because the PIC access and
> > >>> the test can be done in the same instruction. AArch64 is exempt too
> > >>> because its PIC access insn patterns are a mov of an UNSPEC, which prevents
> > >>> the second access in the epilogue from being CSEd in the cse_local pass
> > >>> with the first access in the prologue.
> > >>>
> > >>> The approach followed here is to create new "combined" set and test
> > >>> standard pattern names that take the unexpanded guard and do the set or
> > >>> test. This allows the target to use an opaque pattern (eg. using UNSPEC)
> > >>> to hide the individual instructions being generated to the compiler and
> > >>> split the pattern into generic load, compare and branch instruction
> > >>> after register allocator, therefore avoiding any spilling. This is here
> > >>> implemented for the ARM targets. For targets not implementing these new
> > >>> standard pattern names, the existing stack_protect_set and
> > >>> stack_protect_test pattern names are used.
> > >>>
> > >>> To be able to split PIC access after register allocation, the functions
> > >>> had to be augmented to force a new PIC register load and to control
> > >>> which register it loads into. This is because sharing the PIC register
> > >>> between prologue and epilogue could lead to spilling due to CSE again
> > >>> which an attacker could use to control what the canary gets compared
> > >>> against.
> > >>>
> > >>> ChangeLog entries are as follows:
> > >>>
> > >>> *** gcc/ChangeLog ***
> > >>>
> > >>> 2018-10-26  Thomas Preud'homme  
> > >>>
> > >>> * target-insns.def (stack_protect_combined_set): Define new standard
> > >>> pattern name.
> > >>> (stack_protect_combined_test): Like

Re: [PATCH, ARM, ping2] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-11-10 Thread Thomas Preudhomme

Thanks Kyrill.

Updated patch in attachment. Best regards,

Thomas
On Thu, 8 Nov 2018 at 15:53, Kyrill Tkachov  wrote:
>
> Hi Thomas,
>
> On 08/11/18 09:52, Thomas Preudhomme wrote:
> > Ping?
> >
> > Best regards,
> >
> > Thomas
> >
> > On Thu, 1 Nov 2018 at 16:03, Thomas Preudhomme
> >  wrote:
> >> Ping?
> >>
> >> Best regards,
> >>
> >> Thomas
> >> On Fri, 26 Oct 2018 at 22:41, Thomas Preudhomme
> >>  wrote:
> >>> Hi,
> >>>
> >>> Please find updated patch to fix PR85434: spilling of stack protector
> >>> guard's address on ARM. Quite a few changes have been made to the ARM
> >>> part since last round of review so I think it makes more sense to
> >>> review it anew. Ran bootstrap + regression testsuite + glibc build +
> >>> glibc regression testsuite for Arm and Thumb-2 and bootstrap +
> >>> regression testsuite for Thumb-1. GCC's regression testsuite was run
> >>> in 3 configurations in all those cases:
> >>>
> >>> - default configuration (no RUNTESTFLAGS)
> >>> - with -fstack-protector-all
> >>> - with -fPIC -fstack-protector-all (to exercise both codepaths in stack
> >>> protector's split code)
> >>>
> >>> None of these show any regression beyond some new scan failures with
> >>> -fstack-protector-all or -fPIC due to unexpected code sequence for the
> >>> testcases concerned and some guality swing due to less optimization
> >>> with new stack protector on.
> >>>
> >>> Patch description and ChangeLog below.
> >>>
> >>> In case of high register pressure in PIC mode, address of the stack
> >>> protector's guard can be spilled on ARM targets as shown in PR85434,
> >>> thus allowing an attacker to control what the canary would be compared
> >>> against. ARM does lack stack_protect_set and stack_protect_test insn
> >>> patterns, but defining them would not help as the address is expanded
> >>> regularly and the patterns only deal with the copy and test of the
> >>> guard with the canary.
> >>>
> >>> This problem does not occur for x86 targets because the PIC access and
> >>> the test can be done in the same instruction. AArch64 is exempt too
> >>> because its PIC access insn patterns are a mov of an UNSPEC, which prevents
> >>> the second access in the epilogue from being CSEd in the cse_local pass
> >>> with the first access in the prologue.
> >>>
> >>> The approach followed here is to create new "combined" set and test
> >>> standard pattern names that take the unexpanded guard and do the set or
> >>> test. This allows the target to use an opaque pattern (eg. using UNSPEC)
> >>> to hide the individual instructions being generated to the compiler and
> >>> split the pattern into generic load, compare and branch instruction
> >>> after register allocator, therefore avoiding any spilling. This is here
> >>> implemented for the ARM targets. For targets not implementing these new
> >>> standard pattern names, the existing stack_protect_set and
> >>> stack_protect_test pattern names are used.
> >>>
> >>> To be able to split PIC access after register allocation, the functions
> >>> had to be augmented to force a new PIC register load and to control
> >>> which register it loads into. This is because sharing the PIC register
> >>> between prologue and epilogue could lead to spilling due to CSE again
> >>> which an attacker could use to control what the canary gets compared
> >>> against.
> >>>
> >>> ChangeLog entries are as follows:
> >>>
> >>> *** gcc/ChangeLog ***
> >>>
> >>> 2018-10-26  Thomas Preud'homme  
> >>>
> >>> * target-insns.def (stack_protect_combined_set): Define new standard
> >>> pattern name.
> >>> (stack_protect_combined_test): Likewise.
> >>> * cfgexpand.c (stack_protect_prologue): Try new
> >>> stack_protect_combined_set pattern first.
> >>> * function.c (stack_protect_epilogue): Try new
> >>> stack_protect_combined_test pattern first.
> >>> * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
> >>> parameters to control which register to use as PIC register and force
> >>> reloading PIC register respectively.  Insert in the stream of insns if
> >

Re: [PATCH, ARM, ping2] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-11-08 Thread Thomas Preudhomme

Ping?

Best regards,

Thomas

On Thu, 1 Nov 2018 at 16:03, Thomas Preudhomme
 wrote:
>
> Ping?
>
> Best regards,
>
> Thomas
> On Fri, 26 Oct 2018 at 22:41, Thomas Preudhomme
>  wrote:
> >
> > Hi,
> >
> > Please find updated patch to fix PR85434: spilling of stack protector
> > guard's address on ARM. Quite a few changes have been made to the ARM
> > part since last round of review so I think it makes more sense to
> > review it anew. Ran bootstrap + regression testsuite + glibc build +
> > glibc regression testsuite for Arm and Thumb-2 and bootstrap +
> > regression testsuite for Thumb-1. GCC's regression testsuite was run
> > in 3 configurations in all those cases:
> >
> > - default configuration (no RUNTESTFLAGS)
> > - with -fstack-protector-all
> > - with -fPIC -fstack-protector-all (to exercise both codepaths in stack
> > protector's split code)
> >
> > None of these show any regression beyond some new scan failures with
> > -fstack-protector-all or -fPIC due to unexpected code sequence for the
> > testcases concerned and some guality swing due to less optimization
> > with new stack protector on.
> >
> > Patch description and ChangeLog below.
> >
> > In case of high register pressure in PIC mode, address of the stack
> > protector's guard can be spilled on ARM targets as shown in PR85434,
> > thus allowing an attacker to control what the canary would be compared
> > against. ARM does lack stack_protect_set and stack_protect_test insn
> > patterns, but defining them would not help as the address is expanded
> > regularly and the patterns only deal with the copy and test of the
> > guard with the canary.
> >
> > This problem does not occur for x86 targets because the PIC access and
> > the test can be done in the same instruction. AArch64 is exempt too
> > because its PIC access insn patterns are a mov of an UNSPEC, which prevents
> > the second access in the epilogue from being CSEd in the cse_local pass
> > with the first access in the prologue.
> >
> > The approach followed here is to create new "combined" set and test
> > standard pattern names that take the unexpanded guard and do the set or
> > test. This allows the target to use an opaque pattern (eg. using UNSPEC)
> > to hide the individual instructions being generated to the compiler and
> > split the pattern into generic load, compare and branch instruction
> > after register allocator, therefore avoiding any spilling. This is here
> > implemented for the ARM targets. For targets not implementing these new
> > standard pattern names, the existing stack_protect_set and
> > stack_protect_test pattern names are used.
> >
> > To be able to split PIC access after register allocation, the functions
> > had to be augmented to force a new PIC register load and to control
> > which register it loads into. This is because sharing the PIC register
> > between prologue and epilogue could lead to spilling due to CSE again
> > which an attacker could use to control what the canary gets compared
> > against.
> >
> > ChangeLog entries are as follows:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-10-26  Thomas Preud'homme  
> >
> > * target-insns.def (stack_protect_combined_set): Define new standard
> > pattern name.
> > (stack_protect_combined_test): Likewise.
> > * cfgexpand.c (stack_protect_prologue): Try new
> > stack_protect_combined_set pattern first.
> > * function.c (stack_protect_epilogue): Try new
> > stack_protect_combined_test pattern first.
> > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
> > parameters to control which register to use as PIC register and force
> > reloading PIC register respectively.  Insert in the stream of insns if
> > possible.
> > (legitimize_pic_address): Expose above new parameters in prototype and
> > adapt recursive calls accordingly.  Use pic_reg if non null instead of
> > cached one.
> > (arm_load_pic_register): Add pic_reg parameter and use it if non null.
> > (arm_legitimize_address): Adapt to new legitimize_pic_address
> > prototype.
> > (thumb_legitimize_address): Likewise.
> > (arm_emit_call_insn): Adapt to require_pic_register prototype change.
> > (arm_expand_prologue): Adapt to arm_load_pic_register prototype change.
> > (thumb1_expand_prologue): Likewise.
> > * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
> > change.
> > (arm_load_pic_register): Likewise.
> > * config/arm/predicates.md (guard_addr_operand): New predicate.
> > (guard_operand): N

Re: [PATCH, ARM, ping] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-11-01 Thread Thomas Preudhomme

Ping?

Best regards,

Thomas
On Fri, 26 Oct 2018 at 22:41, Thomas Preudhomme
 wrote:
>
> Hi,
>
> Please find updated patch to fix PR85434: spilling of stack protector
> guard's address on ARM. Quite a few changes have been made to the ARM
> part since last round of review so I think it makes more sense to
> review it anew. Ran bootstrap + regression testsuite + glibc build +
> glibc regression testsuite for Arm and Thumb-2 and bootstrap +
> regression testsuite for Thumb-1. GCC's regression testsuite was run
> in 3 configurations in all those cases:
>
> - default configuration (no RUNTESTFLAGS)
> - with -fstack-protector-all
> - with -fPIC -fstack-protector-all (to exercise both codepaths in stack
> protector's split code)
>
> None of these show any regression beyond some new scan failures with
> -fstack-protector-all or -fPIC due to unexpected code sequence for the
> testcases concerned and some guality swing due to less optimization
> with new stack protector on.
>
> Patch description and ChangeLog below.
>
> In case of high register pressure in PIC mode, address of the stack
> protector's guard can be spilled on ARM targets as shown in PR85434,
> thus allowing an attacker to control what the canary would be compared
> against. ARM does lack stack_protect_set and stack_protect_test insn
> patterns, but defining them would not help as the address is expanded
> regularly and the patterns only deal with the copy and test of the
> guard with the canary.
>
> This problem does not occur for x86 targets because the PIC access and
> the test can be done in the same instruction. AArch64 is exempt too
> because its PIC access insn patterns are a mov of an UNSPEC, which prevents
> the second access in the epilogue from being CSEd in the cse_local pass
> with the first access in the prologue.
>
> The approach followed here is to create new "combined" set and test
> standard pattern names that take the unexpanded guard and do the set or
> test. This allows the target to use an opaque pattern (eg. using UNSPEC)
> to hide the individual instructions being generated to the compiler and
> split the pattern into generic load, compare and branch instruction
> after register allocator, therefore avoiding any spilling. This is here
> implemented for the ARM targets. For targets not implementing these new
> standard pattern names, the existing stack_protect_set and
> stack_protect_test pattern names are used.
>
> To be able to split PIC access after register allocation, the functions
> had to be augmented to force a new PIC register load and to control
> which register it loads into. This is because sharing the PIC register
> between prologue and epilogue could lead to spilling due to CSE again
> which an attacker could use to control what the canary gets compared
> against.
>
> ChangeLog entries are as follows:
>
> *** gcc/ChangeLog ***
>
> 2018-10-26  Thomas Preud'homme  
>
> * target-insns.def (stack_protect_combined_set): Define new standard
> pattern name.
> (stack_protect_combined_test): Likewise.
> * cfgexpand.c (stack_protect_prologue): Try new
> stack_protect_combined_set pattern first.
> * function.c (stack_protect_epilogue): Try new
> stack_protect_combined_test pattern first.
> * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
> parameters to control which register to use as PIC register and force
> reloading PIC register respectively.  Insert in the stream of insns if
> possible.
> (legitimize_pic_address): Expose above new parameters in prototype and
> adapt recursive calls accordingly.  Use pic_reg if non null instead of
> cached one.
> (arm_load_pic_register): Add pic_reg parameter and use it if non null.
> (arm_legitimize_address): Adapt to new legitimize_pic_address
> prototype.
> (thumb_legitimize_address): Likewise.
> (arm_emit_call_insn): Adapt to require_pic_register prototype change.
> (arm_expand_prologue): Adapt to arm_load_pic_register prototype change.
> (thumb1_expand_prologue): Likewise.
> * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
> change.
> (arm_load_pic_register): Likewise.
> * config/arm/predicates.md (guard_addr_operand): New predicate.
> (guard_operand): New predicate.
> * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
> prototype change.
> (builtin_setjmp_receiver expander): Adapt to thumb1_expand_prologue
> prototype change.
> (stack_protect_combined_set): New expander.
> (stack_protect_combined_set_insn): New insn_and_split pattern.
> (stack_protect_set_insn): New insn pattern.
> (stack_protect_combined_test): New expander.
> (stack_protect_combined_test_insn): New insn_and_split pattern.
> (arm_stack_protect_test_insn): New insn pattern.
> *

Re: [PATCH, GCC/ARM, ping3] Fix PR87374: ICE with -mslow-flash-data and -mword-relocations

2018-10-30 Thread Thomas Preudhomme

Ping?

Best regards,

Thomas

On Tue, 23 Oct 2018 at 10:10, Thomas Preudhomme
 wrote:
>
> Ping?
>
> Best regards,
>
> Thomas
>
> On Mon, 15 Oct 2018 at 16:01, Thomas Preudhomme
>  wrote:
> >
> > Ping?
> >
> > Best regards,
> >
> > Thomas
> > On Fri, 5 Oct 2018 at 17:50, Thomas Preudhomme
> >  wrote:
> > >
> > > Hi Ramana and Kyrill,
> > >
> > > I've reworked the patch to add some documentation of the option
> > > conflict and reworked the -mword-relocation logic slightly to set the
> > > > variable explicitly in PIC mode rather than test for PIC and word
> > > relocation everywhere.
> > >
> > > ChangeLog entries are now as follows:
> > >
> > > *** gcc/ChangeLog ***
> > >
> > > 2018-10-02  Thomas Preud'homme  
> > >
> > > PR target/87374
> > > * config/arm/arm.c (arm_option_check_internal): Disable the combined
> > > use of -mslow-flash-data and -mword-relocations.
> > > (arm_option_override): Enable -mword-relocations if -fpic or -fPIC.
> > > * config/arm/arm.md (SYMBOL_REF MOVT splitter): Stop checking for
> > > flag_pic.
> > > * doc/invoke.texi (-mword-relocations): Mention conflict with
> > > -mslow-flash-data.
> > > (-mslow-flash-data): Reciprocally.
> > >
> > > *** gcc/testsuite/ChangeLog ***
> > >
> > > 2018-09-25  Thomas Preud'homme  
> > >
> > > PR target/87374
> > > * gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data and
> > > -mword-relocations would be passed when compiling the test.
> > > * gcc.target/arm/movsi_movt.c: Likewise.
> > > * gcc.target/arm/pr81863.c: Likewise.
> > > * gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise.
> > > * gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise.
> > > * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
> > > * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
> > > * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
> > > * gcc.target/arm/tls-disable-literal-pool.c: Likewise.
> > >
> > > Is this ok for trunk?
> > >
> > > Best regards,
> > >
> > > Thomas
> > >
> > > On Tue, 2 Oct 2018 at 13:39, Ramana Radhakrishnan
> > >  wrote:
> > > >
> > > > On 02/10/2018 11:42, Thomas Preudhomme wrote:
> > > > > Hi Ramana,
> > > > >
> > > > > On Thu, 27 Sep 2018 at 11:14, Ramana Radhakrishnan
> > > > >  wrote:
> > > > >>
> > > > >> On 27/09/2018 09:26, Kyrill Tkachov wrote:
> > > > >>> Hi Thomas,
> > > > >>>
> > > > >>> On 26/09/18 18:39, Thomas Preudhomme wrote:
> > > > >>>> Hi,
> > > > >>>>
> > > > > >>>> GCC ICEs under -mslow-flash-data and -mword-relocations because
> > > > > >>>> there is no way to load an address, both literal pools and
> > > > > >>>> MOVW/MOVT being forbidden. This patch gives an error message when
> > > > > >>>> both options are specified by the user and adds the corresponding
> > > > > >>>> dg-skip-if directives for tests that use either of these options.
> > > > >>>>
> > > > >>>> ChangeLog entries are as follows:
> > > > >>>>
> > > > >>>> *** gcc/ChangeLog ***
> > > > >>>>
> > > > >>>> 2018-09-25  Thomas Preud'homme  
> > > > >>>>
> > > > > >>>> PR target/87374
> > > > > >>>> * config/arm/arm.c (arm_option_check_internal): Disable the
> > > > > >>>> combined use of -mslow-flash-data and -mword-relocations.
> > > > >>>>
> > > > >>>> *** gcc/testsuite/ChangeLog ***
> > > > >>>>
> > > > >>>> 2018-09-25  Thomas Preud'homme  
> > > > >>>>
> > > > > >>>> PR target/87374
> > > > > >>>> * gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data and
> > > > > >>>> -mwo

Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-10-26 Thread Thomas Preudhomme

: New test.

Is this ok for trunk?

Best regards,

Thomas
On Thu, 25 Oct 2018 at 15:54, Thomas Preudhomme
 wrote:
>
> Good thing I did, found a missing earlyclobber in the process.
> Rerunning all tests again.
>
> Best regards,
>
> Thomas
> On Wed, 24 Oct 2018 at 10:13, Thomas Preudhomme
>  wrote:
> >
> > Please hold on for the reviews, found a small improvement that could
> > be done. Am testing it right now, should have something by tonight or
> > tomorrow.
> >
> > Best regards,
> >
> > Thomas
> > On Tue, 23 Oct 2018 at 13:35, Thomas Preudhomme
> >  wrote:
> > >
> > > [Removing Jeff Law since middle end code hasn't changed]
> > >
> > > Hi,
> > >
> > > > Given how memory operands are reloaded even with an X constraint, I've
> > > > reworked the patch for the combined set and combined test instructions
> > > > to keep the mem out of the match_operand and used an expander to
> > > generate the right instruction pattern. I've also fixed some
> > > longstanding issues with the patch when flag_pic is true and with
> > > constraints for Thumb-1 that I hadn't noticed before due to using
> > > dg-cmp-results in conjunction with test_summary which does not show
> > > NA->FAIL (see [1]).
> > >
> > > All in all, I think the Arm code would do with a fresh review rather
> > > than looking at the changes since last posted version. (unchanged)
> > > ChangeLog entries are as follows:
> > >
> > > *** gcc/ChangeLog ***
> > >
> > > 2018-08-09  Thomas Preud'homme  
> > >
> > > * target-insns.def (stack_protect_combined_set): Define new standard
> > > pattern name.
> > > (stack_protect_combined_test): Likewise.
> > > * cfgexpand.c (stack_protect_prologue): Try new
> > > stack_protect_combined_set pattern first.
> > > * function.c (stack_protect_epilogue): Try new
> > > stack_protect_combined_test pattern first.
> > > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
> > > parameters to control which register to use as PIC register and force
> > > reloading PIC register respectively.  Insert in the stream of insns if
> > > possible.
> > > (legitimize_pic_address): Expose above new parameters in prototype and
> > > adapt recursive calls accordingly.  Use pic_reg if non null instead of
> > > cached one.
> > > (arm_load_pic_register): Add pic_reg parameter and use it if non null.
> > > (arm_legitimize_address): Adapt to new legitimize_pic_address
> > > prototype.
> > > (thumb_legitimize_address): Likewise.
> > > (arm_emit_call_insn): Adapt to require_pic_register prototype change.
> > > (arm_expand_prologue): Adapt to arm_load_pic_register prototype 
> > > change.
> > > (thumb1_expand_prologue): Likewise.
> > > * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
> > > change.
> > > (arm_load_pic_register): Likewise.
> > > > * config/arm/predicates.md (guard_addr_operand): New predicate.
> > > > (guard_operand): New predicate.
> > > > * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
> > > > prototype change.
> > > > (builtin_setjmp_receiver expander): Adapt to thumb1_expand_prologue
> > > > prototype change.
> > > > (stack_protect_combined_set): New expander.
> > > (stack_protect_combined_set_insn): New insn_and_split pattern.
> > > (stack_protect_set_insn): New insn pattern.
> > > (stack_protect_combined_test): New expander.
> > > (stack_protect_combined_test_insn): New insn_and_split pattern.
> > > (stack_protect_test_insn): New insn pattern.
> > > * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
> > > (UNSPEC_SP_TEST): Likewise.
> > > * doc/md.texi (stack_protect_combined_set): Document new standard
> > > pattern name.
> > > (stack_protect_set): Clarify that the operand for guard's address is
> > > legal.
> > > (stack_protect_combined_test): Document new standard pattern name.
> > > (stack_protect_test): Clarify that the operand for guard's address is
> > > legal.
> > >
> > > *** gcc/testsuite/ChangeLog ***
> > >
> > > 2018-07-05  Thomas Preud'homme  
> > >
> > > * gcc.target/arm/pr85434.c: New test.
> > >
> > > Testing: Bootstrap and regression testing f

Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-10-25 Thread Thomas Preudhomme

Good thing I did, found a missing earlyclobber in the process.
Rerunning all tests again.

Best regards,

Thomas
On Wed, 24 Oct 2018 at 10:13, Thomas Preudhomme
 wrote:
>
> Please hold on for the reviews, found a small improvement that could
> be done. Am testing it right now, should have something by tonight or
> tomorrow.
>
> Best regards,
>
> Thomas
> On Tue, 23 Oct 2018 at 13:35, Thomas Preudhomme
>  wrote:
> >
> > [Removing Jeff Law since middle end code hasn't changed]
> >
> > Hi,
> >
> > Given how memory operands are reloaded even with an X constraint, I've
> > reworked the patch for the combined set and combined test instructions
> > to keep the mem out of the match_operand and used an expander to
> > generate the right instruction pattern. I've also fixed some
> > longstanding issues with the patch when flag_pic is true and with
> > constraints for Thumb-1 that I hadn't noticed before due to using
> > dg-cmp-results in conjunction with test_summary which does not show
> > NA->FAIL (see [1]).
> >
> > All in all, I think the Arm code would do with a fresh review rather
> > than looking at the changes since last posted version. (unchanged)
> > ChangeLog entries are as follows:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-08-09  Thomas Preud'homme  
> >
> > * target-insns.def (stack_protect_combined_set): Define new standard
> > pattern name.
> > (stack_protect_combined_test): Likewise.
> > * cfgexpand.c (stack_protect_prologue): Try new
> > stack_protect_combined_set pattern first.
> > * function.c (stack_protect_epilogue): Try new
> > stack_protect_combined_test pattern first.
> > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
> > parameters to control which register to use as PIC register and force
> > reloading PIC register respectively.  Insert in the stream of insns if
> > possible.
> > (legitimize_pic_address): Expose above new parameters in prototype and
> > adapt recursive calls accordingly.  Use pic_reg if non null instead of
> > cached one.
> > (arm_load_pic_register): Add pic_reg parameter and use it if non null.
> > (arm_legitimize_address): Adapt to new legitimize_pic_address
> > prototype.
> > (thumb_legitimize_address): Likewise.
> > (arm_emit_call_insn): Adapt to require_pic_register prototype change.
> > (arm_expand_prologue): Adapt to arm_load_pic_register prototype change.
> > (thumb1_expand_prologue): Likewise.
> > * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
> > change.
> > (arm_load_pic_register): Likewise.
> > * config/arm/predicates.md (guard_addr_operand): New predicate.
> > (guard_operand): New predicate.
> > * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
> > prototype change.
> > (builtin_setjmp_receiver expander): Adapt to thumb1_expand_prologue
> > prototype change.
> > (stack_protect_combined_set): New expander.
> > (stack_protect_combined_set_insn): New insn_and_split pattern.
> > (stack_protect_set_insn): New insn pattern.
> > (stack_protect_combined_test): New expander.
> > (stack_protect_combined_test_insn): New insn_and_split pattern.
> > (stack_protect_test_insn): New insn pattern.
> > * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
> > (UNSPEC_SP_TEST): Likewise.
> > * doc/md.texi (stack_protect_combined_set): Document new standard
> > pattern name.
> > (stack_protect_set): Clarify that the operand for guard's address is
> > legal.
> > (stack_protect_combined_test): Document new standard pattern name.
> > (stack_protect_test): Clarify that the operand for guard's address is
> > legal.
> >
> > *** gcc/testsuite/ChangeLog ***
> >
> > 2018-07-05  Thomas Preud'homme  
> >
> > * gcc.target/arm/pr85434.c: New test.
> >
> > Testing: Bootstrap and regression testing for Arm, Thumb-1 and Thumb-2
> > with (i) default flags, (ii) an extra -fstack-protector-all and (iii)
> > -fPIC -fstack-protector-all. A glibc build and testsuite run was also
> > performed for Arm and Thumb-2. Default flags show no regression and
> > the other runs have some expected scan-assembler failures (due to stack
> > protector or -fPIC code sequence), as well as guality failures (due to less
> > optimized code with the new stack protector code) and some execution
> > failures in sibcall-9 and sibcall-10 under -fPIC -fstack-protector-all

Re: [PATCH, contrib] dg-cmp-results: display NA->FAIL by default

2018-10-25 Thread Thomas Preudhomme

Done. Committed patch and ChangeLog below

*** contrib/ChangeLog ***

2018-10-25  Thomas Preud'homme  

* dg-cmp-results.sh: Print NA->FAIL and NA->UNRESOLVED changes at
default verbosity.


diff --git a/contrib/dg-cmp-results.sh b/contrib/dg-cmp-results.sh
index 821d557a168..eb976f68f4a 100755
--- a/contrib/dg-cmp-results.sh
+++ b/contrib/dg-cmp-results.sh
@@ -137,8 +137,11 @@ function drop() {
 function compare(st, nm) {
 old = peek()
 if (old == 0) {
-# This new test wasn't run last time.
-if (verbose >= 2) printf("NA->%s:%s\n", st, nm)
+   # This new test wasn't run last time.
+   if(st == "FAIL" || st == "UNRESOLVED" || verbose >= 2) {
+   # New test fails or we want all changes
+   printf("NA->%s:%s\n", st, nm)
+   }
 }
 else {
# Compare this new test to the first queued old one.
-- 
2.19.1

Best regards,

Thomas
On Thu, 25 Oct 2018 at 08:29, Richard Sandiford
 wrote:
>
> Thomas Preudhomme  writes:
> > And now with the patch. My apologies for the omission.
> >
> > Best regards,
> >
> > Thomas
> > On Tue, 23 Oct 2018 at 12:08, Thomas Preudhomme
> >  wrote:
> >>
> >> Hi,
> >>
> >> Currently, dg-cmp-results will not print anything for a test that was
> >> not run before, even if it is a FAIL now. This means that when
> >> contributing a code change together with a testcase in the same commit
> >> one must run dg-cmp-results twice: once to check for regression on a
> >> full testsuite run and once against the new testcase with -v -v. This
> >> also prevents using dg-cmp-results on sum files generated with
> >> test_summary since these would not contain PASS.
> >>
> >> This patch changes dg-cmp-results to print NA->FAIL changes by default.
> >>
> >> ChangeLog entry is as follows:
> >>
> >> *** contrib/ChangeLog ***
> >>
> >> 2018-10-23  Thomas Preud'homme  
> >>
> >> * dg-cmp-results.sh: Print NA->FAIL changes at default verbosity.
> >>
> >> Is this ok for trunk?
> >>
> >> Best regards,
> >>
> >> Thomas
> >
> > From ab4272a15bdd8931ef683e234e7dd2e0d038df5f Mon Sep 17 00:00:00 2001
> > From: Thomas Preud'homme 
> > Date: Tue, 23 Oct 2018 11:54:51 +0100
> > Subject: [PATCH] dg-cmp-results: display NA->FAIL by default
> >
> > Hi,
> >
> > Currently, dg-cmp-results will not print anything for a test that was
> > not run before, even if it is a FAIL now. This means that when
> > contributing a code change together with a testcase in the same commit
> > one must run dg-cmp-results twice: once to check for regression on a
> > full testsuite run and once against the new testcase with -v -v. This
> > also prevents using dg-cmp-results on sum files generated with
> > test_summary since these would not contain PASS.
> >
> > This patch changes dg-cmp-results to print NA->FAIL changes by default.
> >
> > ChangeLog entry is as follows:
> >
> > *** contrib/ChangeLog ***
> >
> > 2018-10-23  Thomas Preud'homme  
> >
> >   * dg-cmp-results.sh: Print NA->FAIL changes at default verbosity.
> >
> > Is this ok for trunk?
> >
> > Best regards,
> >
> > Thomas
> > ---
> >  contrib/dg-cmp-results.sh | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/contrib/dg-cmp-results.sh b/contrib/dg-cmp-results.sh
> > index 821d557a168..921a9b9ca28 100755
> > --- a/contrib/dg-cmp-results.sh
> > +++ b/contrib/dg-cmp-results.sh
> > @@ -137,8 +137,11 @@ function drop() {
> >  function compare(st, nm) {
> >  old = peek()
> >  if (old == 0) {
> > -# This new test wasn't run last time.
> > -if (verbose >= 2) printf("NA->%s:%s\n", st, nm)
> > + # This new test wasn't run last time.
> > + if(st == "FAIL" || verbose >= 2) {
> > + # New test fails or we want all changes
> > + printf("NA->%s:%s\n", st, nm)
> > + }
>
> Probably also worth doing this for UNRESOLVED, where some markup problem
> stops a test from doing anything useful.
>
> OK with that change, thanks.
>
> Richard
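
For reference, the reporting rule agreed on in this thread (print a
previously unseen test when it is a FAIL or UNRESOLVED, or when all changes
were requested with -v -v) can be restated as a small C predicate. This is
only a sketch of the decision, not code from dg-cmp-results.sh, which is an
awk script; the function name report_new_test and its parameters are
assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not part of dg-cmp-results.sh: decide whether a
   result for a test with no previous run (an "NA->..." transition)
   should be printed.  STATUS is the new result and VERBOSE the
   verbosity level (2 corresponds to -v -v).  */
static bool
report_new_test (const char *status, int verbose)
{
  assert (status != NULL);

  /* New failures and unresolved tests are always shown; any other new
     result is shown only when all changes were requested.  */
  return strcmp (status, "FAIL") == 0
	 || strcmp (status, "UNRESOLVED") == 0
	 || verbose >= 2;
}
```

With this rule a new PASS stays silent at default verbosity, while a new
FAIL or UNRESOLVED is reported without needing a second run with -v -v.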

Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-10-24 Thread Thomas Preudhomme

Please hold on for the reviews, found a small improvement that could
be done. Am testing it right now, should have something by tonight or
tomorrow.

Best regards,

Thomas
On Tue, 23 Oct 2018 at 13:35, Thomas Preudhomme
 wrote:
>
> [Removing Jeff Law since middle end code hasn't changed]
>
> Hi,
>
> Given how memory operands are reloaded even with an X constraint, I've
> reworked the patch for the combined set and combined test instructions
> to keep the mem out of the match_operand and used an expander to
> generate the right instruction pattern. I've also fixed some
> longstanding issues with the patch when flag_pic is true and with
> constraints for Thumb-1 that I hadn't noticed before due to using
> dg-cmp-results in conjunction with test_summary which does not show
> NA->FAIL (see [1]).
>
> All in all, I think the Arm code would do with a fresh review rather
> than looking at the changes since last posted version. (unchanged)
> ChangeLog entries are as follows:
>
> *** gcc/ChangeLog ***
>
> 2018-08-09  Thomas Preud'homme  
>
> * target-insns.def (stack_protect_combined_set): Define new standard
> pattern name.
> (stack_protect_combined_test): Likewise.
> * cfgexpand.c (stack_protect_prologue): Try new
> stack_protect_combined_set pattern first.
> * function.c (stack_protect_epilogue): Try new
> stack_protect_combined_test pattern first.
> * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
> parameters to control which register to use as PIC register and force
> reloading PIC register respectively.  Insert in the stream of insns if
> possible.
> (legitimize_pic_address): Expose above new parameters in prototype and
> adapt recursive calls accordingly.  Use pic_reg if non null instead of
> cached one.
> (arm_load_pic_register): Add pic_reg parameter and use it if non null.
> (arm_legitimize_address): Adapt to new legitimize_pic_address
> prototype.
> (thumb_legitimize_address): Likewise.
> (arm_emit_call_insn): Adapt to require_pic_register prototype change.
> (arm_expand_prologue): Adapt to arm_load_pic_register prototype change.
> (thumb1_expand_prologue): Likewise.
> * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
> change.
> (arm_load_pic_register): Likewise.
> * config/arm/predicates.md (guard_addr_operand): New predicate.
> (guard_operand): New predicate.
> * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
> prototype change.
> (builtin_setjmp_receiver expander): Adapt to thumb1_expand_prologue
> prototype change.
> (stack_protect_combined_set): New expander.
> (stack_protect_combined_set_insn): New insn_and_split pattern.
> (stack_protect_set_insn): New insn pattern.
> (stack_protect_combined_test): New expander.
> (stack_protect_combined_test_insn): New insn_and_split pattern.
> (stack_protect_test_insn): New insn pattern.
> * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
> (UNSPEC_SP_TEST): Likewise.
> * doc/md.texi (stack_protect_combined_set): Document new standard
> pattern name.
> (stack_protect_set): Clarify that the operand for guard's address is
> legal.
> (stack_protect_combined_test): Document new standard pattern name.
> (stack_protect_test): Clarify that the operand for guard's address is
> legal.
>
> *** gcc/testsuite/ChangeLog ***
>
> 2018-07-05  Thomas Preud'homme  
>
> * gcc.target/arm/pr85434.c: New test.
>
> Testing: Bootstrap and regression testing for Arm, Thumb-1 and Thumb-2
> with (i) default flags, (ii) an extra -fstack-protector-all and (iii)
> -fPIC -fstack-protector-all. A glibc build and testsuite run was also
> performed for Arm and Thumb-2. Default flags show no regression and
> the other runs have some expected scan-assembler failures (due to stack
> protector or -fPIC code sequence), as well as guality failures (due to less
> optimized code with the new stack protector code) and some execution
> failures in sibcall-9 and sibcall-10 under -fPIC -fstack-protector-all
> due to the PIC sequence for the global variable making the frame
> layout different for the 2 functions (these become PASS if making the
> global variable static).
>
> Is this ok for trunk?
>
> Best regards,
>
> Thomas
>
> [1] https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01412.html
>
>
> On Tue, 25 Sep 2018 at 17:10, Kyrill Tkachov
>  wrote:
> >
> > Hi Thomas,
> >
> > On 29/08/18 10:51, Thomas Preudhomme wrote:
> > > Resend hopefully without HTML this time.
> > >
> > > On Wed, 29 Aug 2018 at 10:49, Thomas Preudhomme

[PATCH, testsuite] Fix sibcall-9 and sibcall-10 with -fPIC

2018-10-23 Thread Thomas Preudhomme

Hi,

gcc.dg/sibcall-9.c and gcc.dg/sibcall-10.c give execution failures
on ARM when compiled with -fPIC due to the PIC access to volatile
variable v creating an extra spill which causes the frame size of the
two recursive functions to be different. Making the variable static
solves the issue because the variable can be accessed in a PC-relative
way, which avoids the spill, while still testing sibling calls as
originally intended.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

* gcc.dg/sibcall-9.c: Make v static.
* gcc.dg/sibcall-10.c: Likewise.

Tested both testcases with and without -fPIC and they now pass in both
cases when targeting arm-none-eabi. They also pass in both cases on
x86_64-linux-gnu.

Is this ok for trunk?

Best regards,

Thomas
From 27286120fe2d6a088d14d7e4f4b5b6fa6cc2bc41 Mon Sep 17 00:00:00 2001
From: Thomas Preud'homme 
Date: Tue, 23 Oct 2018 14:01:31 +0100
Subject: [PATCH] [PATCH, testsuite] Fix sibcall-9 and sibcall-10 with -fPIC

Hi,

gcc.dg/sibcall-9.c and gcc.dg/sibcall-10.c give execution failures
on ARM when compiled with -fPIC due to the PIC access to volatile
variable v creating an extra spill which causes the frame size of the
two recursive functions to be different. Making the variable static
solves the issue because the variable can be accessed in a PC-relative
way, which avoids the spill, while still testing sibling calls as
originally intended.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

	* gcc.dg/sibcall-9.c: Make v static.
	* gcc.dg/sibcall-10.c: Likewise.

Tested both testcases with and without -fPIC and they now pass in both
cases when targeting arm-none-eabi. They also pass in both cases on
x86_64-linux-gnu.

Is this ok for trunk?

Best regards,

Thomas
---
 gcc/testsuite/gcc.dg/sibcall-10.c | 2 +-
 gcc/testsuite/gcc.dg/sibcall-9.c  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/sibcall-10.c b/gcc/testsuite/gcc.dg/sibcall-10.c
index 54cc604aecf..4acca50e3e4 100644
--- a/gcc/testsuite/gcc.dg/sibcall-10.c
+++ b/gcc/testsuite/gcc.dg/sibcall-10.c
@@ -31,7 +31,7 @@ extern void exit (int);
 static ATTR void recurser_void1 (void);
 static ATTR void recurser_void2 (void);
 extern void track (void);
-volatile int v;
+static volatile int v;
 
 int n = 0;
 int main ()
diff --git a/gcc/testsuite/gcc.dg/sibcall-9.c b/gcc/testsuite/gcc.dg/sibcall-9.c
index fc3bd9dcf16..32b2e1d5d61 100644
--- a/gcc/testsuite/gcc.dg/sibcall-9.c
+++ b/gcc/testsuite/gcc.dg/sibcall-9.c
@@ -31,7 +31,7 @@ extern void exit (int);
 static ATTR void recurser_void1 (int);
 static ATTR void recurser_void2 (int);
 extern void track (int);
-volatile int v;
+static volatile int v;
 
 int main ()
 {
-- 
2.19.1
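
As an illustration of the one-line change above, here is a minimal C sketch
of a file-scope volatile counter with internal linkage. It is not the
content of sibcall-9.c or sibcall-10.c; the function name bump_counter is
made up for the example. The point is only that `static` keeps the variable
local to the translation unit, which, as described in the message above,
lets an ARM -fPIC build address it PC-relatively, while `volatile` still
forces every access to be performed.

```c
#include <assert.h>

/* Internal linkage: the symbol cannot be preempted by another module,
   so position-independent code need not go through the GOT for it.  */
static volatile int v;

/* Hypothetical function, for illustration only: increment the volatile
   counter N times and return its current value.  */
int
bump_counter (int n)
{
  assert (n >= 0);

  for (int i = 0; i < n; i++)
    v++;	/* volatile: each read and write must actually happen */

  return v;
}
```

Comparing the assembly for this file built with and without `static` on
`v` (for example with -S -fPIC on an ARM target) is one way to see the
difference in how the variable is addressed.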

Re: [PATCH, contrib] dg-cmp-results: display NA->FAIL by default

2018-10-23 Thread Thomas Preudhomme

And now with the patch. My apologies for the omission.

Best regards,

Thomas
On Tue, 23 Oct 2018 at 12:08, Thomas Preudhomme
 wrote:
>
> Hi,
>
> Currently, dg-cmp-results will not print anything for a test that was
> not run before, even if it is a FAIL now. This means that when
> contributing a code change together with a testcase in the same commit
> one must run dg-cmp-results twice: once to check for regression on a
> full testsuite run and once against the new testcase with -v -v. This
> also prevents using dg-cmp-results on sum files generated with
> test_summary since these would not contain PASS.
>
> This patch changes dg-cmp-results to print NA->FAIL changes by default.
>
> ChangeLog entry is as follows:
>
> *** contrib/ChangeLog ***
>
> 2018-10-23  Thomas Preud'homme  
>
> * dg-cmp-results.sh: Print NA->FAIL changes at default verbosity.
>
> Is this ok for trunk?
>
> Best regards,
>
> Thomas
From ab4272a15bdd8931ef683e234e7dd2e0d038df5f Mon Sep 17 00:00:00 2001
From: Thomas Preud'homme 
Date: Tue, 23 Oct 2018 11:54:51 +0100
Subject: [PATCH] dg-cmp-results: display NA->FAIL by default

Hi,

Currently, dg-cmp-results will not print anything for a test that was
not run before, even if it is a FAIL now. This means that when
contributing a code change together with a testcase in the same commit
one must run dg-cmp-results twice: once to check for regression on a
full testsuite run and once against the new testcase with -v -v. This
also prevents using dg-cmp-results on sum files generated with
test_summary since these would not contain PASS.

This patch changes dg-cmp-results to print NA->FAIL changes by default.

ChangeLog entry is as follows:

*** contrib/ChangeLog ***

2018-10-23  Thomas Preud'homme  

	* dg-cmp-results.sh: Print NA->FAIL changes at default verbosity.

Is this ok for trunk?

Best regards,

Thomas
---
 contrib/dg-cmp-results.sh | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/contrib/dg-cmp-results.sh b/contrib/dg-cmp-results.sh
index 821d557a168..921a9b9ca28 100755
--- a/contrib/dg-cmp-results.sh
+++ b/contrib/dg-cmp-results.sh
@@ -137,8 +137,11 @@ function drop() {
 function compare(st, nm) {
 old = peek()
 if (old == 0) {
-# This new test wasn't run last time.
-if (verbose >= 2) printf("NA->%s:%s\n", st, nm)
+	# This new test wasn't run last time.
+	if(st == "FAIL" || verbose >= 2) {
+	# New test fails or we want all changes
+	printf("NA->%s:%s\n", st, nm)
+	}
 }
 else {
 	# Compare this new test to the first queued old one.
-- 
2.19.1

Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-10-23 Thread Thomas Preudhomme

[Removing Jeff Law since middle end code hasn't changed]

Hi,

Given how memory operands are reloaded even with an X constraint, I've
reworked the patch for the combined set and combined test instructions
to keep the mem out of the match_operand and used an expander to
generate the right instruction pattern. I've also fixed some
longstanding issues with the patch when flag_pic is true and with
constraints for Thumb-1 that I hadn't noticed before due to using
dg-cmp-results in conjunction with test_summary which does not show
NA->FAIL (see [1]).

All in all, I think the Arm code would do with a fresh review rather
than looking at the changes since last posted version. (unchanged)
ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-08-09  Thomas Preud'homme  

* target-insns.def (stack_protect_combined_set): Define new standard
pattern name.
(stack_protect_combined_test): Likewise.
* cfgexpand.c (stack_protect_prologue): Try new
stack_protect_combined_set pattern first.
* function.c (stack_protect_epilogue): Try new
stack_protect_combined_test pattern first.
* config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
parameters to control which register to use as PIC register and force
reloading PIC register respectively.  Insert in the stream of insns if
possible.
(legitimize_pic_address): Expose above new parameters in prototype and
adapt recursive calls accordingly.  Use pic_reg if non null instead of
cached one.
(arm_load_pic_register): Add pic_reg parameter and use it if non null.
(arm_legitimize_address): Adapt to new legitimize_pic_address
prototype.
(thumb_legitimize_address): Likewise.
(arm_emit_call_insn): Adapt to require_pic_register prototype change.
(arm_expand_prologue): Adapt to arm_load_pic_register prototype change.
(thumb1_expand_prologue): Likewise.
* config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
change.
(arm_load_pic_register): Likewise.
* config/arm/predicates.md (guard_addr_operand): New predicate.
(guard_operand): New predicate.
* config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
prototype change.
(builtin_setjmp_receiver expander): Adapt to thumb1_expand_prologue
prototype change.
(stack_protect_combined_set): New expander.
(stack_protect_combined_set_insn): New insn_and_split pattern.
(stack_protect_set_insn): New insn pattern.
(stack_protect_combined_test): New expander.
(stack_protect_combined_test_insn): New insn_and_split pattern.
(stack_protect_test_insn): New insn pattern.
* config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
(UNSPEC_SP_TEST): Likewise.
* doc/md.texi (stack_protect_combined_set): Document new standard
pattern name.
(stack_protect_set): Clarify that the operand for guard's address is
legal.
(stack_protect_combined_test): Document new standard pattern name.
(stack_protect_test): Clarify that the operand for guard's address is
legal.

*** gcc/testsuite/ChangeLog ***

2018-07-05  Thomas Preud'homme  

* gcc.target/arm/pr85434.c: New test.

Testing: Bootstrap and regression testing for Arm, Thumb-1 and Thumb-2
with (i) default flags, (ii) an extra -fstack-protector-all and (iii)
-fPIC -fstack-protector-all. A glibc build and testsuite run was also
performed for Arm and Thumb-2. Default flags show no regression and
the other runs have some expected scan-assembler failures (due to stack
protector or -fPIC code sequence), as well as guality failures (due to less
optimized code with the new stack protector code) and some execution
failures in sibcall-9 and sibcall-10 under -fPIC -fstack-protector-all
due to the PIC sequence for the global variable making the frame
layout different for the 2 functions (these become PASS if making the
global variable static).

Is this ok for trunk?

Best regards,

Thomas

[1] https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01412.html
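
The "try the combined pattern first" entries in the ChangeLog above amount
to a simple fallback: use the target's combined pattern when it provides
one, otherwise use the existing one. The C sketch below models only that
dispatch. It is not GCC code: the struct target_patterns, the emit
callbacks and expand_guard_set are all hypothetical names invented for the
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tags saying which pattern was used.  */
enum { EMITTED_COMBINED = 1, EMITTED_PLAIN = 2 };

/* Hypothetical callback type standing in for "emit this pattern".  */
typedef int (*emit_fn) (void);

/* Hypothetical per-target table; combined_set is NULL when the target
   does not implement the combined pattern.  */
struct target_patterns
{
  emit_fn combined_set;
  emit_fn plain_set;
};

static int emit_combined (void) { return EMITTED_COMBINED; }
static int emit_plain (void) { return EMITTED_PLAIN; }

/* Prefer the combined pattern and fall back to the plain one.  */
static int
expand_guard_set (const struct target_patterns *t)
{
  assert (t != NULL && t->plain_set != NULL);

  if (t->combined_set != NULL)
    return t->combined_set ();

  return t->plain_set ();
}
```

A target that fills in combined_set gets the opaque set-and-store sequence;
one that leaves it NULL keeps its current behaviour, which is why other
targets are unaffected by the middle-end part of the change.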

On Tue, 25 Sep 2018 at 17:10, Kyrill Tkachov
 wrote:
>
> Hi Thomas,
>
> On 29/08/18 10:51, Thomas Preudhomme wrote:
> > Resend hopefully without HTML this time.
> >
> > On Wed, 29 Aug 2018 at 10:49, Thomas Preudhomme
> >  wrote:
> >> Hi,
> >>
> >> I've reworked the patch fixing PR85434 (spilling of stack protector 
> >> guard's address on ARM) to address the testsuite regression on powerpc and 
> >> x86 as well as glibc testsuite regression on ARM. Issues were due to 
> >> unconditionally attempting to generate the new patterns. The code now 
> >> tests if there is a pattern for them for the target before generating 
> >> them. In the ARM side of the patch, I've also added a more specific 
> >> predicate for the new patterns. The new patch is found below.
> >>
> >>
> >> In case of high register pressure in PIC mode, address of

[PATCH, contrib] dg-cmp-results: display NA->FAIL by default

2018-10-23 Thread Thomas Preudhomme

Hi,

Currently, dg-cmp-results will not print anything for a test that was
not run before, even if it is a FAIL now. This means that when
contributing a code change together with a testcase in the same commit
one must run dg-cmp-results twice: once to check for regression on a
full testsuite run and once against the new testcase with -v -v. This
also prevents using dg-cmp-results on sum files generated with
test_summary since these would not contain PASS.

This patch changes dg-cmp-results to print NA->FAIL changes by default.
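
The change in behaviour can be pictured as a small decision function.
What follows is only a sketch in C of the intended rule, not the actual
logic of dg-cmp-results.sh: the enum, the function name and the
"patched" parameter are hypothetical, and only the two transitions
discussed in this message are modelled.

```c
#include <stdbool.h>

/* Hypothetical test states; STATE_NA means the test is absent from the
   old results.  */
enum test_state { STATE_NA, STATE_PASS, STATE_FAIL };

/* Return true if the transition OLD_STATE -> NEW_STATE should be printed.
   Assumption: every transition other than PASS->FAIL and NA->FAIL is left
   out of this sketch and simply reported as "not printed".  */
bool
report_transition (enum test_state old_state, enum test_state new_state,
                   int verbosity, bool patched)
{
  if (old_state == STATE_PASS && new_state == STATE_FAIL)
    return true;                        /* a regression is always shown */
  if (old_state == STATE_NA && new_state == STATE_FAIL)
    return patched || verbosity >= 2;   /* previously needed -v -v */
  return false;
}
```

With patched set, a new failing testcase shows up at verbosity 0, which
is the single-run workflow described above.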

ChangeLog entry is as follows:

*** contrib/ChangeLog ***

2018-10-23  Thomas Preud'homme  

* dg-cmp-results.sh: Print NA->FAIL changes at default verbosity.

Is this ok for trunk?

Best regards,

Thomas

Re: [PATCH, GCC/ARM, ping2] Fix PR87374: ICE with -mslow-flash-data and -mword-relocations

2018-10-23 Thread Thomas Preudhomme

Ping?

Best regards,

Thomas

On Mon, 15 Oct 2018 at 16:01, Thomas Preudhomme
 wrote:
>
> Ping?
>
> Best regards,
>
> Thomas
> On Fri, 5 Oct 2018 at 17:50, Thomas Preudhomme
>  wrote:
> >
> > Hi Ramana and Kyrill,
> >
> > I've reworked the patch to add some documentation of the option
> > conflict and reworked the -mword-relocations logic slightly to set the
> > variable explicitly in PIC mode rather than test for PIC and word
> > relocation everywhere.
> >
> > ChangeLog entries are now as follows:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-10-02  Thomas Preud'homme  
> >
> > PR target/87374
> > * config/arm/arm.c (arm_option_check_internal): Disable the combined
> > use of -mslow-flash-data and -mword-relocations.
> > (arm_option_override): Enable -mword-relocations if -fpic or -fPIC.
> > * config/arm/arm.md (SYMBOL_REF MOVT splitter): Stop checking for
> > flag_pic.
> > * doc/invoke.texi (-mword-relocations): Mention conflict with
> > -mslow-flash-data.
> > (-mslow-flash-data): Reciprocally.
> >
> > *** gcc/testsuite/ChangeLog ***
> >
> > 2018-09-25  Thomas Preud'homme  
> >
> > PR target/87374
> > * gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data and
> > -mword-relocations would be passed when compiling the test.
> > * gcc.target/arm/movsi_movt.c: Likewise.
> > * gcc.target/arm/pr81863.c: Likewise.
> > * gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise.
> > * gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise.
> > * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
> > * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
> > * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
> > * gcc.target/arm/tls-disable-literal-pool.c: Likewise.
> >
> > Is this ok for trunk?
> >
> > Best regards,
> >
> > Thomas
> >
> > On Tue, 2 Oct 2018 at 13:39, Ramana Radhakrishnan
> >  wrote:
> > >
> > > On 02/10/2018 11:42, Thomas Preudhomme wrote:
> > > > Hi Ramana,
> > > >
> > > > On Thu, 27 Sep 2018 at 11:14, Ramana Radhakrishnan
> > > >  wrote:
> > > >>
> > > >> On 27/09/2018 09:26, Kyrill Tkachov wrote:
> > > >>> Hi Thomas,
> > > >>>
> > > >>> On 26/09/18 18:39, Thomas Preudhomme wrote:
> > > >>>> Hi,
> > > >>>>
> > > >>>> GCC ICEs under -mslow-flash-data and -mword-relocations because there
> > > >>>> is no way to load an address, both literal pools and MOVW/MOVT being
> > > >>>> forbidden. This patch gives an error message when both options are
> > > >>>> specified by the user and adds the corresponding dg-skip-if directives
> > > >>>> for tests that use either of these options.
> > > >>>>
> > > >>>> ChangeLog entries are as follows:
> > > >>>>
> > > >>>> *** gcc/ChangeLog ***
> > > >>>>
> > > >>>> 2018-09-25  Thomas Preud'homme  
> > > >>>>
> > > >>>>   PR target/87374
> > > >>>>   * config/arm/arm.c (arm_option_check_internal): Disable the combined
> > > >>>>   use of -mslow-flash-data and -mword-relocations.
> > > >>>>
> > > >>>> *** gcc/testsuite/ChangeLog ***
> > > >>>>
> > > >>>> 2018-09-25  Thomas Preud'homme  
> > > >>>>
> > > >>>>   PR target/87374
> > > >>>>   * gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data and
> > > >>>>   -mword-relocations would be passed when compiling the test.
> > > >>>>   * gcc.target/arm/movsi_movt.c: Likewise.
> > > >>>>   * gcc.target/arm/pr81863.c: Likewise.
> > > >>>>   * gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise.
> > > >>>>   * gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise.
> > > >>>>   * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
> > > >>>>   * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
> > > >>>>   * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
>

Re: [PATCH, GCC/ARM] Fix PR87374: ICE with -mslow-flash-data and -mword-relocations

2018-10-15 Thread Thomas Preudhomme

Ping?

Best regards,

Thomas
On Fri, 5 Oct 2018 at 17:50, Thomas Preudhomme
 wrote:
>
> Hi Ramana and Kyrill,
>
> I've reworked the patch to add some documentation of the option
> conflict and reworked the -mword-relocations logic slightly to set the
> variable explicitly in PIC mode rather than test for PIC and word
> relocation everywhere.
>
> ChangeLog entries are now as follows:
>
> *** gcc/ChangeLog ***
>
> 2018-10-02  Thomas Preud'homme  
>
> PR target/87374
> * config/arm/arm.c (arm_option_check_internal): Disable the combined
> use of -mslow-flash-data and -mword-relocations.
> (arm_option_override): Enable -mword-relocations if -fpic or -fPIC.
> * config/arm/arm.md (SYMBOL_REF MOVT splitter): Stop checking for
> flag_pic.
> * doc/invoke.texi (-mword-relocations): Mention conflict with
> -mslow-flash-data.
> (-mslow-flash-data): Reciprocally.
>
> *** gcc/testsuite/ChangeLog ***
>
> 2018-09-25  Thomas Preud'homme  
>
> PR target/87374
> * gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data and
> -mword-relocations would be passed when compiling the test.
> * gcc.target/arm/movsi_movt.c: Likewise.
> * gcc.target/arm/pr81863.c: Likewise.
> * gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise.
> * gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise.
> * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
> * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
> * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
> * gcc.target/arm/tls-disable-literal-pool.c: Likewise.
>
> Is this ok for trunk?
>
> Best regards,
>
> Thomas
>
> On Tue, 2 Oct 2018 at 13:39, Ramana Radhakrishnan
>  wrote:
> >
> > On 02/10/2018 11:42, Thomas Preudhomme wrote:
> > > Hi Ramana,
> > >
> > > On Thu, 27 Sep 2018 at 11:14, Ramana Radhakrishnan
> > >  wrote:
> > >>
> > >> On 27/09/2018 09:26, Kyrill Tkachov wrote:
> > >>> Hi Thomas,
> > >>>
> > >>> On 26/09/18 18:39, Thomas Preudhomme wrote:
> > >>>> Hi,
> > >>>>
> > >>>> GCC ICEs under -mslow-flash-data and -mword-relocations because there
> > >>>> is no way to load an address, both literal pools and MOVW/MOVT being
> > >>>> forbidden. This patch gives an error message when both options are
> > > >>>> specified by the user and adds the corresponding dg-skip-if directives for
> > >>>> tests that use either of these options.
> > >>>>
> > >>>> ChangeLog entries are as follows:
> > >>>>
> > >>>> *** gcc/ChangeLog ***
> > >>>>
> > >>>> 2018-09-25  Thomas Preud'homme  
> > >>>>
> > > >>>>   PR target/87374
> > > >>>>   * config/arm/arm.c (arm_option_check_internal): Disable the combined
> > > >>>>   use of -mslow-flash-data and -mword-relocations.
> > >>>>
> > >>>> *** gcc/testsuite/ChangeLog ***
> > >>>>
> > >>>> 2018-09-25  Thomas Preud'homme  
> > >>>>
> > > >>>>   PR target/87374
> > > >>>>   * gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data and
> > > >>>>   -mword-relocations would be passed when compiling the test.
> > > >>>>   * gcc.target/arm/movsi_movt.c: Likewise.
> > > >>>>   * gcc.target/arm/pr81863.c: Likewise.
> > > >>>>   * gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise.
> > > >>>>   * gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise.
> > > >>>>   * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
> > > >>>>   * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
> > > >>>>   * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
> > > >>>>   * gcc.target/arm/tls-disable-literal-pool.c: Likewise.
> > >>>>
> > >>>>
> > >>>> Testing: Bootstrapped in Thumb-2 mode. No testsuite regression when
> > >>>> targeting arm-none-eabi. Modified tests get skipped as expected when
> > >>>> running the testsuite with -mslow-flash-data (pr81863.c) or
> > >>>> -mword-relocations (all the others).
> > >>>>
> > >>>>
> > >>>> Is this ok for trunk? I'd also appreciate guidance on whether

Re: [PATCH, GCC/ARM] Fix PR87374: ICE with -mslow-flash-data and -mword-relocations

2018-10-05 Thread Thomas Preudhomme

Hi Ramana and Kyrill,

I've reworked the patch to add some documentation of the option
conflict and reworked the -mword-relocations logic slightly to set the
variable explicitly in PIC mode rather than test for PIC and word
relocation everywhere.

ChangeLog entries are now as follows:

*** gcc/ChangeLog ***

2018-10-02  Thomas Preud'homme  

PR target/87374
* config/arm/arm.c (arm_option_check_internal): Disable the combined
use of -mslow-flash-data and -mword-relocations.
(arm_option_override): Enable -mword-relocations if -fpic or -fPIC.
* config/arm/arm.md (SYMBOL_REF MOVT splitter): Stop checking for
flag_pic.
* doc/invoke.texi (-mword-relocations): Mention conflict with
-mslow-flash-data.
(-mslow-flash-data): Reciprocally.

*** gcc/testsuite/ChangeLog ***

2018-09-25  Thomas Preud'homme  

PR target/87374
* gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data and
-mword-relocations would be passed when compiling the test.
* gcc.target/arm/movsi_movt.c: Likewise.
* gcc.target/arm/pr81863.c: Likewise.
* gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise.
* gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise.
* gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
* gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
* gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
* gcc.target/arm/tls-disable-literal-pool.c: Likewise.

Is this ok for trunk?

Best regards,

Thomas

On Tue, 2 Oct 2018 at 13:39, Ramana Radhakrishnan
 wrote:
>
> On 02/10/2018 11:42, Thomas Preudhomme wrote:
> > Hi Ramana,
> >
> > On Thu, 27 Sep 2018 at 11:14, Ramana Radhakrishnan
> >  wrote:
> >>
> >> On 27/09/2018 09:26, Kyrill Tkachov wrote:
> >>> Hi Thomas,
> >>>
> >>> On 26/09/18 18:39, Thomas Preudhomme wrote:
> >>>> Hi,
> >>>>
> >>>> GCC ICEs under -mslow-flash-data and -mword-relocations because there
> >>>> is no way to load an address, both literal pools and MOVW/MOVT being
> >>>> forbidden. This patch gives an error message when both options are
> >>>> specified by the user and adds the corresponding dg-skip-if directives for
> >>>> tests that use either of these options.
> >>>>
> >>>> ChangeLog entries are as follows:
> >>>>
> >>>> *** gcc/ChangeLog ***
> >>>>
> >>>> 2018-09-25  Thomas Preud'homme  
> >>>>
> >>>>   PR target/87374
> >>>>   * config/arm/arm.c (arm_option_check_internal): Disable the combined
> >>>>   use of -mslow-flash-data and -mword-relocations.
> >>>>
> >>>> *** gcc/testsuite/ChangeLog ***
> >>>>
> >>>> 2018-09-25  Thomas Preud'homme  
> >>>>
> >>>>   PR target/87374
> >>>>   * gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data and
> >>>>   -mword-relocations would be passed when compiling the test.
> >>>>   * gcc.target/arm/movsi_movt.c: Likewise.
> >>>>   * gcc.target/arm/pr81863.c: Likewise.
> >>>>   * gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise.
> >>>>   * gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise.
> >>>>   * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
> >>>>   * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
> >>>>   * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
> >>>>   * gcc.target/arm/tls-disable-literal-pool.c: Likewise.
> >>>>
> >>>>
> >>>> Testing: Bootstrapped in Thumb-2 mode. No testsuite regression when
> >>>> targeting arm-none-eabi. Modified tests get skipped as expected when
> >>>> running the testsuite with -mslow-flash-data (pr81863.c) or
> >>>> -mword-relocations (all the others).
> >>>>
> >>>>
> >>>> Is this ok for trunk? I'd also appreciate guidance on whether this is
> >>>> worth a backport. It's a simple patch but on the other hand it only
> >>>> prevents some option combination, it does not fix anything so I have
> >>>> mixed feelings.
> >>>
> >>> In my opinion -mslow-flash-data is more of a tuning option than a
> >>> security/ABI feature and therefore erroring out on its combination
> >>> with -mword-relocations feels odd.
> >>> I'm leaning more towards making -mword-relocations or any other option 
> >>> that

Re: [PATCH, LRA] Never reload fixed form constraints memory operand

2018-10-04 Thread Thomas Preudhomme

My bad, I used dg-cmp-results without verbosity, which didn't show the
problem. It only starts to show it with -v -v; I'm not sure why. I'll
have a look right now and revert by the end of today if I cannot come up
with a fix. Does that sound ok?

Best regards,

Thomas
On Thu, 4 Oct 2018 at 12:31, H.J. Lu  wrote:
>
> On Wed, Oct 3, 2018 at 8:12 PM Vladimir Makarov  wrote:
> >
> > On 10/03/2018 12:47 PM, Thomas Preudhomme wrote:
> > > Best regards,
> > >
> > > Thomas
> > >
> > > never_reload_fixed_address_operand.patch
> > >
> > >
> > >  From 2831d8b886d92513c2d30d43a6a989d2bbd0ceee Mon Sep 17 00:00:00 2001
> > > From: Thomas Preud'homme
> > > Date: Thu, 27 Sep 2018 09:50:12 +0100
> > > Subject: [PATCH] [PATCH, LRA] Never reload fixed form constraints memory
> > >   operand
> > >
> > > Hi,
> > >
> > > > The unconditional reload of the address operand for recognized
> > > > instructions in process_address_1 prevents the patch fixing "PR85434:
> > > > Address of stack protector guard spilled to stack on ARM" proposed at
> > > > [1] from working. The code in that patch attempts to control which
> > > > registers are used to make the PIC access, but the reload performed by
> > > > process_address_1 will use a generic PIC access. This patch removes the
> > > > requirement that the instruction be unrecognized for the reload to be
> > > > skipped, so the address operand is never reloaded for satisfied fixed
> > > > form constraints (such as "X" used in that patch).
> > >
> > > [1]https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01838.html
> > >
> > > ChangeLog entry is as follows:
> > >
> > > *** gcc/ChangeLog ***
> > >
> > > 2018-10-03  Thomas Preud'homme
> > >
> > >   * lra-constraints.c (process_address_1): Bail out for all
> > >   satisfied fixed constraints.
> > >
> > > Testing: Successfully bootstrapped and regtested on:
> > > - arm-linux-gnueabihf (both Arm and Thumb2 mode)
> > > - aarch64-linux-gnu
> > > - x86_64-linux-gnu
> > > - i386-linux-gnu
> > > - sparc64-linux-gnu (gcc202)
> > > - powerpc64le-linux-gnu (gcc112)
> > >
> > > Is this ok for trunk?
> > >
> > OK. Thank you for testing all these targets, Thomas.
> >
>
> This caused:
>
> FAIL: gcc.target/i386/pr83317.c (internal compiler error)
> FAIL: gcc.target/i386/pr83317.c (test for excess errors)
>
> [hjl@gnu-4 gcc]$
> /export/build/gnu/tools-build/gcc-debug/build-x86_64-linux/gcc/xgcc
> -B/export/build/gnu/tools-build/gcc-debug/build-x86_64-linux/gcc/
> /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/pr83317.c
>  -m32   -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers
> -fdiagnostics-color=never   -O1 -fPIC -msse2 -mfpmath=sse -S -o
> pr83317.s
> during RTL pass: reload
> /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/pr83317.c:
> In function ‘foo’:
> /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/pr83317.c:21:1:
> internal compiler error: in lra_eliminate_reg_if_possible, at
> lra-eliminations.c:1393
> 0xea4875 lra_eliminate_reg_if_possible(rtx_def**)
> /export/gnu/import/git/sources/gcc/gcc/lra-eliminations.c:1393
> 0xe8a94c address_eliminator
> /export/gnu/import/git/sources/gcc/gcc/lra-constraints.c:362
> 0xe8aaf7 satisfies_memory_constraint_p
> /export/gnu/import/git/sources/gcc/gcc/lra-constraints.c:401
> 0xe8f947 process_alt_operands
> /export/gnu/import/git/sources/gcc/gcc/lra-constraints.c:2248
> 0xe93d1f curr_insn_transform
> /export/gnu/import/git/sources/gcc/gcc/lra-constraints.c:3861
> 0xe975f2 lra_constraints(bool)
> /export/gnu/import/git/sources/gcc/gcc/lra-constraints.c:4878
> 0xe7c458 lra(_IO_FILE*)
> /export/gnu/import/git/sources/gcc/gcc/lra.c:2446
> 0xe111ff do_reload
> /export/gnu/import/git/sources/gcc/gcc/ira.c:5469
> 0xe116f2 execute
> /export/gnu/import/git/sources/gcc/gcc/ira.c:5653
> Please submit a full bug report,
> with preprocessed source if appropriate.
> Please include the complete backtrace with any bug report.
> See <https://gcc.gnu.org/bugs/> for instructions.
> [hjl@gnu-4 gcc]$
>
>
> --
> H.J.

[PATCH, LRA] Never reload fixed form constraints memory operand

2018-10-03 Thread Thomas Preudhomme

Hi,

The unconditional reload of the address operand for recognized
instructions in process_address_1 prevents the patch fixing "PR85434:
Address of stack protector guard spilled to stack on ARM" proposed at
[1] from working. The code in that patch attempts to control which
registers are used to make the PIC access, but the reload performed by
process_address_1 will use a generic PIC access. This patch removes the
requirement that the instruction be unrecognized for the reload to be
skipped, so the address operand is never reloaded for satisfied fixed
form constraints (such as "X" used in that patch).
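
To make the change in the condition easier to follow, here is a small
standalone model in C of the test guarding the call to
decompose_mem_address, before and after the patch. This is a sketch
only: the struct and both function names are made up for illustration,
and each field stands in for the GCC expression named in its comment.

```c
#include <stdbool.h>

/* Simplified stand-ins for the values tested in process_address_1.  */
struct operand_model
{
  int insn_code;    /* INSN_CODE (curr_insn); < 0 means unrecognized */
  bool fixed_form;  /* get_constraint_type (cn) == CT_FIXED_FORM */
  bool satisfied;   /* constraint_satisfied_p (op, cn) */
};

/* Before the patch: decomposition of the MEM address is skipped only
   for unrecognized insns.  */
bool
skip_decompose_before (const struct operand_model *m)
{
  return m->insn_code < 0 && m->fixed_form && m->satisfied;
}

/* After the patch: it is skipped for any satisfied fixed form
   constraint, whether or not the insn is recognized.  */
bool
skip_decompose_after (const struct operand_model *m)
{
  return m->fixed_form && m->satisfied;
}
```

The two predicates only differ for recognized instructions
(insn_code >= 0) whose operand satisfies a fixed form constraint, which
is the case the PR85434 patch relies on.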

[1] https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01838.html

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2018-10-03  Thomas Preud'homme  

* lra-constraints.c (process_address_1): Bail out for all
satisfied fixed constraints.

Testing: Successfully bootstrapped and regtested on:
- arm-linux-gnueabihf (both Arm and Thumb2 mode)
- aarch64-linux-gnu
- x86_64-linux-gnu
- i386-linux-gnu
- sparc64-linux-gnu (gcc202)
- powerpc64le-linux-gnu (gcc112)

Is this ok for trunk?

Best regards,

Thomas
From 2831d8b886d92513c2d30d43a6a989d2bbd0ceee Mon Sep 17 00:00:00 2001
From: Thomas Preud'homme 
Date: Thu, 27 Sep 2018 09:50:12 +0100
Subject: [PATCH] [PATCH, LRA] Never reload fixed form constraints memory
 operand

Hi,

The unconditional reload of the address operand for recognized
instructions in process_address_1 prevents the patch fixing "PR85434:
Address of stack protector guard spilled to stack on ARM" proposed at
[1] from working. The code in that patch attempts to control which
registers are used to make the PIC access, but the reload performed by
process_address_1 will use a generic PIC access. This patch removes the
requirement that the instruction be unrecognized for the reload to be
skipped, so the address operand is never reloaded for satisfied fixed
form constraints (such as "X" used in that patch).

[1] https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01838.html

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2018-10-03  Thomas Preud'homme  

	* lra-constraints.c (process_address_1): Bail out for all
	satisfied fixed constraints.

Testing: Successfully bootstrapped and regtested on:
- arm-linux-gnueabihf (both Arm and Thumb2 mode)
- aarch64-linux-gnu
- x86_64-linux-gnu
- i386-linux-gnu
- sparc64-linux-gnu (gcc202)
- powerpc64le-linux-gnu (gcc112)

Is this ok for trunk?

Best regards,

Thomas
---
 gcc/lra-constraints.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index 774d1ff3aaa..c3edd9ef45d 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -3243,8 +3243,7 @@ process_address_1 (int nop, bool check_only_p,
   /* Do not attempt to decompose arbitrary addresses generated by combine
  for asm operands with loose constraints, e.g 'X'.  */
   else if (MEM_P (op)
-	   && !(INSN_CODE (curr_insn) < 0
-		&& get_constraint_type (cn) == CT_FIXED_FORM
+	   && !(get_constraint_type (cn) == CT_FIXED_FORM
 	&& constraint_satisfied_p (op, cn)))
 decompose_mem_address (&ad, op);
   else if (GET_CODE (op) == SUBREG
-- 
2.19.0

Re: [PATCH, GCC/ARM] Fix PR87374: ICE with -mslow-flash-data and -mword-relocations

2018-10-02 Thread Thomas Preudhomme

Hi Ramana,

On Thu, 27 Sep 2018 at 11:14, Ramana Radhakrishnan
 wrote:
>
> On 27/09/2018 09:26, Kyrill Tkachov wrote:
> > Hi Thomas,
> >
> > On 26/09/18 18:39, Thomas Preudhomme wrote:
> >> Hi,
> >>
> >> GCC ICEs under -mslow-flash-data and -mword-relocations because there
> >> is no way to load an address, both literal pools and MOVW/MOVT being
> >> forbidden. This patch gives an error message when both options are
> >> specified by the user and adds the corresponding dg-skip-if directives for
> >> tests that use either of these options.
> >>
> >> ChangeLog entries are as follows:
> >>
> >> *** gcc/ChangeLog ***
> >>
> >> 2018-09-25  Thomas Preud'homme  
> >>
> >>   PR target/87374
> >>   * config/arm/arm.c (arm_option_check_internal): Disable the combined
> >>   use of -mslow-flash-data and -mword-relocations.
> >>
> >> *** gcc/testsuite/ChangeLog ***
> >>
> >> 2018-09-25  Thomas Preud'homme  
> >>
> >>   PR target/87374
> >>   * gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data and
> >>   -mword-relocations would be passed when compiling the test.
> >>   * gcc.target/arm/movsi_movt.c: Likewise.
> >>   * gcc.target/arm/pr81863.c: Likewise.
> >>   * gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise.
> >>   * gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise.
> >>   * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
> >>   * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
> >>   * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
> >>   * gcc.target/arm/tls-disable-literal-pool.c: Likewise.
> >>
> >>
> >> Testing: Bootstrapped in Thumb-2 mode. No testsuite regression when
> >> targeting arm-none-eabi. Modified tests get skipped as expected when
> >> running the testsuite with -mslow-flash-data (pr81863.c) or
> >> -mword-relocations (all the others).
> >>
> >>
> >> Is this ok for trunk? I'd also appreciate guidance on whether this is
> >> worth a backport. It's a simple patch but on the other hand it only
> >> prevents some option combination, it does not fix anything so I have
> >> mixed feelings.
> >
> > In my opinion -mslow-flash-data is more of a tuning option than a
> > security/ABI feature and therefore erroring out on its combination
> > with -mword-relocations feels odd.
> > I'm leaning more towards making -mword-relocations or any other option
> > that really requires constant pools bypass/disable the effects of
> > -mslow-flash-data instead.
>
> -mslow-flash-data and -mword-relocations are contradictory in their
> expectations. -mslow-flash-data is for not putting anything in the
> literal pool whereas -mword-relocations is purely around the use of movw
> / movt instructions for word sized values. I wish we had called
> -mslow-flash-data something else (probably -mno-literal-pools).
> -mslow-flash-data is used primarily by M-profile users and
> -mword-relocations IIUC was a point fix for use in the Linux kernel for
> module loads at a time when not all module loaders in the Linux kernel
> were fixed for the movw / movt relocations and armv7-a / thumb2 was in
> its infancy :). Thus they are used by different constituencies in
> general and I wouldn't see them used together by actual users.

Technically, -mslow-flash-data does not forbid literal pools, it just
discourages them because they are slower than many instructions.
-mpure-code, on the other hand, reuses the same logic and does forbid
literal pools. We could treat -mslow-flash-data differently but the
question is whether it is worth the trouble.

By the way, I've noticed that the documentation for -mword-relocations
says it defaults to on for -fpic and -fPIC, but when looking through
the code I saw that target_word_relocations is not set in these cases.
Rather, the initial commit that introduced -mword-relocations also
checks for flag_pic wherever it checks target_word_relocations. However,
a later commit added one more check for target_word_relocations but
nothing for flag_pic. I'm now consolidating this so that flag_pic sets
target_word_relocations. I'll do a regression test run with -fPIC and
then post an updated patch.
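
As an illustration of that consolidation, here is a minimal sketch in C.
It is not the arm.c code: the struct and the two function names are
hypothetical, and only the three variables mentioned in this thread are
modelled.

```c
#include <stdbool.h>

/* Hypothetical model of the option variables discussed in this thread.  */
struct arm_opts_model
{
  bool flag_pic;
  bool target_word_relocations;
  bool target_slow_flash_data;
};

/* Consolidation: PIC turns word relocations on once, up front, so that
   later code does not have to test flag_pic separately.  */
void
override_options (struct arm_opts_model *o)
{
  if (o->flag_pic)
    o->target_word_relocations = true;
}

/* After the override, a single variable tells whether MOVW/MOVT may be
   used to load a symbol address.  */
bool
movw_movt_allowed (const struct arm_opts_model *o)
{
  return !o->target_word_relocations;
}
```

With this shape, a check added later that forgets about flag_pic still
behaves correctly, since the PIC case is already folded into
target_word_relocations.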

>
> Considering the above, I would prefer a hard error rather than a warning
> as the two options are contradictory. Further, this bugzilla entry was
> probably created by fuzzing with a variety of options rather than from
> any real use case.
>
> Oh and yes, let's update invoke.texi while here.

Done. Will be part of the updated patch.

Best regards,

Thomas

[PATCH, GCC/ARM] Fix PR87374: ICE with -mslow-flash-data and -mword-relocations

2018-09-26 Thread Thomas Preudhomme

Hi,

GCC ICEs under -mslow-flash-data and -mword-relocations because there
is no way to load an address, both literal pools and MOVW/MOVT being
forbidden. This patch gives an error message when both options are
specified by the user and adds the corresponding dg-skip-if directives
for tests that use either of these options.
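
The reasoning behind the ICE can be summarised with a short sketch in C.
It is a simplified model of the description above rather than compiler
code: the enum and function names are invented for illustration, and the
preference for MOVW/MOVT when both strategies are available is an
arbitrary choice made for this example.

```c
#include <stdbool.h>

/* Hypothetical ways of materialising a symbol address.  */
enum addr_load { LOAD_NONE, LOAD_LITERAL_POOL, LOAD_MOVW_MOVT };

/* Pick a strategy given the two options.  As described above,
   -mword-relocations rules out MOVW/MOVT and -mslow-flash-data rules
   out literal pools.  */
enum addr_load
pick_addr_load (bool slow_flash_data, bool word_relocations)
{
  if (!word_relocations)
    return LOAD_MOVW_MOVT;
  if (!slow_flash_data)
    return LOAD_LITERAL_POOL;
  return LOAD_NONE;   /* the combination this patch rejects */
}
```

Only the case where both options are set leaves no strategy at all,
which is why the patch reports an error up front instead of letting the
compiler fail later.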

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-09-25  Thomas Preud'homme  

PR target/87374
* config/arm/arm.c (arm_option_check_internal): Disable the combined
use of -mslow-flash-data and -mword-relocations.

*** gcc/testsuite/ChangeLog ***

2018-09-25  Thomas Preud'homme  

PR target/87374
* gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data and
-mword-relocations would be passed when compiling the test.
* gcc.target/arm/movsi_movt.c: Likewise.
* gcc.target/arm/pr81863.c: Likewise.
* gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise.
* gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise.
* gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
* gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
* gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
* gcc.target/arm/tls-disable-literal-pool.c: Likewise.


Testing: Bootstrapped in Thumb-2 mode. No testsuite regression when
targeting arm-none-eabi. Modified tests get skipped as expected when
running the testsuite with -mslow-flash-data (pr81863.c) or
-mword-relocations (all the others).


Is this ok for trunk? I'd also appreciate guidance on whether this is
worth a backport. It's a simple patch but on the other hand it only
prevents some option combination, it does not fix anything so I have
mixed feelings.

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 6332e68df05..5beffc875c1 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -2893,17 +2893,22 @@ arm_option_check_internal (struct gcc_options *opts)
   flag_pic = 0;
 }
 
-  /* We only support -mpure-code and -mslow-flash-data on M-profile targets
- with MOVT.  */
-  if ((target_pure_code || target_slow_flash_data)
-  && (!TARGET_HAVE_MOVT || arm_arch_notm || flag_pic || TARGET_NEON))
+  if (target_pure_code || target_slow_flash_data)
 {
   const char *flag = (target_pure_code ? "-mpure-code" :
 	 "-mslow-flash-data");
-  error ("%s only supports non-pic code on M-profile targets with the "
-	 "MOVT instruction", flag);
-}
 
+  /* We only support -mpure-code and -mslow-flash-data on M-profile targets
+	 with MOVT.  */
+  if (!TARGET_HAVE_MOVT || arm_arch_notm || flag_pic || TARGET_NEON)
+	error ("%s only supports non-pic code on M-profile targets with the "
+	   "MOVT instruction", flag);
+
+  /* Cannot load addresses: -mslow-flash-data forbids literal pool and
+	 -mword-relocations forbids relocation of MOVT/MOVW.  */
+  if (target_word_relocations)
+	error ("%s incompatible with -mword-relocations", flag);
+}
 }
 
 /* Recompute the global settings depending on target attribute options.  */
diff --git a/gcc/testsuite/gcc.target/arm/movdi_movt.c b/gcc/testsuite/gcc.target/arm/movdi_movt.c
index e2a28ccbd99..a01ffa0dc93 100644
--- a/gcc/testsuite/gcc.target/arm/movdi_movt.c
+++ b/gcc/testsuite/gcc.target/arm/movdi_movt.c
@@ -1,4 +1,5 @@
 /* { dg-do compile { target { arm_cortex_m && { arm_thumb2_ok || arm_thumb1_movt_ok } } } } */
+/* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
 /* { dg-options "-O2 -mslow-flash-data" } */
 
 unsigned long long
diff --git a/gcc/testsuite/gcc.target/arm/movsi_movt.c b/gcc/testsuite/gcc.target/arm/movsi_movt.c
index 3cf46e2fd17..19d202ecd33 100644
--- a/gcc/testsuite/gcc.target/arm/movsi_movt.c
+++ b/gcc/testsuite/gcc.target/arm/movsi_movt.c
@@ -1,4 +1,5 @@
 /* { dg-do compile { target { arm_cortex_m && { arm_thumb2_ok || arm_thumb1_movt_ok } } } } */
+/* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
 /* { dg-options "-O2 -mslow-flash-data" } */
 
 unsigned
diff --git a/gcc/testsuite/gcc.target/arm/pr81863.c b/gcc/testsuite/gcc.target/arm/pr81863.c
index 63b1ed66b2c..225a0c5cc2b 100644
--- a/gcc/testsuite/gcc.target/arm/pr81863.c
+++ b/gcc/testsuite/gcc.target/arm/pr81863.c
@@ -1,5 +1,6 @@
 /* testsuite/gcc.target/arm/pr48183.c */
 /* { dg-do compile } */
+/* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mslow-flash-data" } } */
 /* { dg-options "-O2 -mword-relocations -march=armv7-a -marm" } */
 /* { dg-final { scan-assembler-not "\[\\t \]+movw" } } */
 
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-1.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-1.c
index 089a72b67f3..d10391a69ac 100644
--- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-1.c
+++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-1.c
@@ -6,6 +6,7 @@
 /* { dg-do compile } */
 /* {

Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-09-13 Thread Thomas Preudhomme

Hi all,

Ping? This new version changes both the middle-end and back-end parts,
so it will need a review for both of those.

Best regards,

Thomas
On Wed, 29 Aug 2018 at 11:07, Thomas Preudhomme
 wrote:
>
> Forgot another important change in ARM backend:
>
> The expanders were causing one indirection too many, which was what
> caused the test failure in glibc. The new expander code skips the
> creation of a move from the memory reference of the guard's address to
> a register since this is done in the insns themselves. I think during
> the initial implementation of the first version of the patch I had
> issues with loading the address and used that to load the address. As
> can be seen from the absence of regression on the runtime stack
> protector test in glibc, this is now working properly, also confirmed
> by manual inspection of the code.
>
> I've attached the interdiff from previous version for reference.
>
> Best regards,
>
> Thomas
> On Wed, 29 Aug 2018 at 10:51, Thomas Preudhomme
>  wrote:
> >
> > Resend hopefully without HTML this time.
> >
> > On Wed, 29 Aug 2018 at 10:49, Thomas Preudhomme
> >  wrote:
> > >
> > > Hi,
> > >
> > > I've reworked the patch fixing PR85434 (spilling of stack protector 
> > > guard's address on ARM) to address the testsuite regression on powerpc 
> > > and x86 as well as glibc testsuite regression on ARM. Issues were due to 
> > > unconditionally attempting to generate the new patterns. The code now 
> > > tests if there is a pattern for them for the target before generating 
> > > them. In the ARM side of the patch, I've also added a more specific 
> > > predicate for the new patterns. The new patch is found below.
> > >
> > >
> > > > In case of high register pressure in PIC mode, the address of the stack
> > > > protector's guard can be spilled on ARM targets as shown in PR85434,
> > > > thus allowing an attacker to control what the canary would be compared
> > > > against. ARM does lack stack_protect_set and stack_protect_test insn
> > > > patterns, but defining them would not help as the address is expanded
> > > > regularly and the patterns only deal with the copy and test of the
> > > > guard with the canary.
> > >
> > > > This problem does not occur for x86 targets because the PIC access and
> > > > the test can be done in the same instruction. AArch64 is exempt too
> > > > because its PIC access insn patterns are movs of an UNSPEC, which
> > > > prevents the second access in the epilogue from being CSEd in the
> > > > cse_local pass with the first access in the prologue.
> > >
> > > The approach followed here is to create new "combined" set and test
> > > standard pattern names that take the unexpanded guard and do the set or
> > > test. This allows the target to use an opaque pattern (eg. using UNSPEC)
> > > to hide the individual instructions being generated to the compiler and
> > > split the pattern into generic load, compare and branch instruction
> > > after register allocator, therefore avoiding any spilling. This is here
> > > implemented for the ARM targets. For targets not implementing these new
> > > standard pattern names, the existing stack_protect_set and
> > > stack_protect_test pattern names are used.
> > >
> > > To be able to split PIC access after register allocation, the functions
> > > had to be augmented to force a new PIC register load and to control
> > > which register it loads into. This is because sharing the PIC register
> > > between prologue and epilogue could lead to spilling due to CSE again
> > > which an attacker could use to control what the canary gets compared
> > > against.
> > >
> > > ChangeLog entries are as follows:
> > >
> > > *** gcc/ChangeLog ***
> > >
> > > 2018-08-09  Thomas Preud'homme  
> > >
> > > * target-insns.def (stack_protect_combined_set): Define new standard
> > > pattern name.
> > > (stack_protect_combined_test): Likewise.
> > > * cfgexpand.c (stack_protect_prologue): Try new
> > > stack_protect_combined_set pattern first.
> > > * function.c (stack_protect_epilogue): Try new
> > > stack_protect_combined_test pattern first.
> > > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
> > > parameters to control which register to use as PIC register and force
> > > reloading PIC register respectively.  Insert in the stream of insns if
> > > possible.
> > >

Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-08-29 Thread Thomas Preudhomme

I forgot to mention another important change in the ARM backend:

The expanders were causing one indirection too many, which is what
caused the test failure in glibc. The new expander code skips the
creation of a move from the memory reference of the guard's address to
a register, since this is done in the insns themselves. I think that
during the initial implementation of the first version of the patch I
had issues with loading the address and used that move to load it. As
can be seen from the absence of regressions on the runtime stack
protector test in glibc, this now works properly, which I also confirmed
by manual inspection of the code.
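
To illustrate what "one indirection too many" means here, the following is a minimal sketch in C. It is only a toy model and not the code from the patch: the `mem` array, the `SLOT`/`GUARD`/`OTHER` indices and both helper functions are hypothetical names made up for this illustration. Memory is modelled as a small array so that the extra load yields a wrong value instead of crashing.

```c
#include <assert.h>

/* Toy memory: mem[SLOT] holds the address of the guard, mem[GUARD]
   holds the guard value itself.  All names here are hypothetical.  */
enum { SLOT = 0, GUARD = 1, OTHER = 3 };
static const unsigned mem[4] = {
  [SLOT] = GUARD,   /* the guard's address, stored in memory */
  [GUARD] = OTHER,  /* the guard value (happens to be a valid index) */
  [OTHER] = 0       /* unrelated data */
};

/* Intended sequence: load the guard's address, then load the guard.  */
static unsigned
load_guard (void)
{
  unsigned addr = mem[SLOT];  /* first load: guard's address */
  return mem[addr];           /* second load: guard value */
}

/* Faulty sequence: a separate move already loaded through the memory
   reference, and the insn then performs its own two loads on top.  */
static unsigned
load_guard_extra_indirection (void)
{
  unsigned moved = mem[SLOT];  /* extra load done by the expander */
  unsigned addr = mem[moved];  /* insn treats the result as the slot */
  return mem[addr];            /* and loads once more */
}
```

In this model `load_guard` returns the guard value while `load_guard_extra_indirection` returns unrelated data, which is the kind of mismatch a runtime stack protector test would catch.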

I've attached the interdiff from previous version for reference.

Best regards,

Thomas
On Wed, 29 Aug 2018 at 10:51, Thomas Preudhomme
 wrote:
>
> Resend hopefully without HTML this time.
>
> On Wed, 29 Aug 2018 at 10:49, Thomas Preudhomme
>  wrote:
> >
> > Hi,
> >
> > I've reworked the patch fixing PR85434 (spilling of stack protector guard's 
> > address on ARM) to address the testsuite regression on powerpc and x86 as 
> > well as glibc testsuite regression on ARM. Issues were due to 
> > unconditionally attempting to generate the new patterns. The code now tests 
> > if there is a pattern for them for the target before generating them. In 
> > the ARM side of the patch, I've also added a more specific predicate for 
> > the new patterns. The new patch is found below.
> >
> >
> > In case of high register pressure in PIC mode, the address of the stack
> > protector's guard can be spilled on ARM targets as shown in PR85434,
> > thus allowing an attacker to control what the canary would be compared
> > against. ARM does lack stack_protect_set and stack_protect_test insn
> > patterns, but defining them would not help because the address is
> > expanded regularly and the patterns only deal with the copy and test of
> > the guard with the canary.
> >
> > This problem does not occur for x86 targets because the PIC access and
> > the test can be done in the same instruction. AArch64 is exempt too
> > because its PIC access insn patterns are movs of UNSPEC, which prevents
> > the second access in the epilogue from being CSEd in the cse_local pass
> > with the first access in the prologue.
> >
> > The approach followed here is to create new "combined" set and test
> > standard pattern names that take the unexpanded guard and do the set or
> > test. This allows the target to use an opaque pattern (e.g. using UNSPEC)
> > to hide the individual instructions being generated from the compiler and
> > split the pattern into generic load, compare and branch instructions
> > after register allocation, therefore avoiding any spilling. This is
> > implemented here for the ARM targets. For targets not implementing these
> > new standard pattern names, the existing stack_protect_set and
> > stack_protect_test pattern names are used.
> >
> > To be able to split PIC access after register allocation, the functions
> > had to be augmented to force a new PIC register load and to control
> > which register it loads into. This is because sharing the PIC register
> > between prologue and epilogue could lead to spilling due to CSE again
> > which an attacker could use to control what the canary gets compared
> > against.
> >
> > ChangeLog entries are as follows:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-08-09  Thomas Preud'homme  
> >
> > * target-insns.def (stack_protect_combined_set): Define new standard
> > pattern name.
> > (stack_protect_combined_test): Likewise.
> > * cfgexpand.c (stack_protect_prologue): Try new
> > stack_protect_combined_set pattern first.
> > * function.c (stack_protect_epilogue): Try new
> > stack_protect_combined_test pattern first.
> > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
> > parameters to control which register to use as PIC register and force
> > reloading PIC register respectively.  Insert in the stream of insns if
> > possible.
> > (legitimize_pic_address): Expose above new parameters in prototype and
> > adapt recursive calls accordingly.
> > (arm_legitimize_address): Adapt to new legitimize_pic_address
> > prototype.
> > (thumb_legitimize_address): Likewise.
> > (arm_emit_call_insn): Adapt to new require_pic_register prototype.
> > * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
> > change.
> > * config/arm/predicates.md (guard_operand): New predicate.
> > * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address

Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-08-29 Thread Thomas Preudhomme

Resend hopefully without HTML this time.

On Wed, 29 Aug 2018 at 10:49, Thomas Preudhomme
 wrote:
>
> Hi,
>
> I've reworked the patch fixing PR85434 (spilling of stack protector guard's 
> address on ARM) to address the testsuite regression on powerpc and x86 as 
> well as glibc testsuite regression on ARM. Issues were due to unconditionally 
> attempting to generate the new patterns. The code now tests if there is a 
> pattern for them for the target before generating them. In the ARM side of 
> the patch, I've also added a more specific predicate for the new patterns. 
> The new patch is found below.
>
>
> In case of high register pressure in PIC mode, the address of the stack
> protector's guard can be spilled on ARM targets as shown in PR85434,
> thus allowing an attacker to control what the canary would be compared
> against. ARM does lack stack_protect_set and stack_protect_test insn
> patterns, but defining them would not help because the address is
> expanded regularly and the patterns only deal with the copy and test of
> the guard with the canary.
>
> This problem does not occur for x86 targets because the PIC access and
> the test can be done in the same instruction. AArch64 is exempt too
> because its PIC access insn patterns are movs of UNSPEC, which prevents
> the second access in the epilogue from being CSEd in the cse_local pass
> with the first access in the prologue.
>
> The approach followed here is to create new "combined" set and test
> standard pattern names that take the unexpanded guard and do the set or
> test. This allows the target to use an opaque pattern (e.g. using UNSPEC)
> to hide the individual instructions being generated from the compiler and
> split the pattern into generic load, compare and branch instructions
> after register allocation, therefore avoiding any spilling. This is
> implemented here for the ARM targets. For targets not implementing these
> new standard pattern names, the existing stack_protect_set and
> stack_protect_test pattern names are used.
>
> To be able to split PIC access after register allocation, the functions
> had to be augmented to force a new PIC register load and to control
> which register it loads into. This is because sharing the PIC register
> between prologue and epilogue could lead to spilling due to CSE again
> which an attacker could use to control what the canary gets compared
> against.
>
> ChangeLog entries are as follows:
>
> *** gcc/ChangeLog ***
>
> 2018-08-09  Thomas Preud'homme  
>
> * target-insns.def (stack_protect_combined_set): Define new standard
> pattern name.
> (stack_protect_combined_test): Likewise.
> * cfgexpand.c (stack_protect_prologue): Try new
> stack_protect_combined_set pattern first.
> * function.c (stack_protect_epilogue): Try new
> stack_protect_combined_test pattern first.
> * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
> parameters to control which register to use as PIC register and force
> reloading PIC register respectively.  Insert in the stream of insns if
> possible.
> (legitimize_pic_address): Expose above new parameters in prototype and
> adapt recursive calls accordingly.
> (arm_legitimize_address): Adapt to new legitimize_pic_address
> prototype.
> (thumb_legitimize_address): Likewise.
> (arm_emit_call_insn): Adapt to new require_pic_register prototype.
> * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
> change.
> * config/arm/predicates.md (guard_operand): New predicate.
> * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
> prototype change.
> (stack_protect_combined_set): New insn_and_split pattern.
> (stack_protect_set): New insn pattern.
> (stack_protect_combined_test): New insn_and_split pattern.
> (stack_protect_test): New insn pattern.
> * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
> (UNSPEC_SP_TEST): Likewise.
> * doc/md.texi (stack_protect_combined_set): Document new standard
> pattern name.
> (stack_protect_set): Clarify that the operand for guard's address is
> legal.
> (stack_protect_combined_test): Document new standard pattern name.
> (stack_protect_test): Clarify that the operand for guard's address is
> legal.
>
> *** gcc/testsuite/ChangeLog ***
>
> 2018-07-05  Thomas Preud'homme  
>
> * gcc.target/arm/pr85434.c: New test.
>
>
> Testing:
>
> native x86_64: bootstrap + testsuite -> no regression, can see failures with 
> previous version of patch but not with new version
> native powerpc64: bootstrap + testsuite -> no regression, can see failures 
> from pr86834 with previous vers

Re: [PATCH][GCC][AArch64] Limit movmem copies to TImode copies.

2018-08-13 Thread Thomas Preudhomme

Hi Tamar,

Thanks for your patch.

Just one comment about your ChangeLog entry for the testsuite change:
shouldn't it mention that it is a new testcase? The patch you attached
seems to create the file.

Best regards,

Thomas

On Mon, 13 Aug 2018 at 10:33, Tamar Christina 
wrote:

> Hi All,
>
> On AArch64 we have integer modes larger than TImode, and while we can
> generate moves for these they're not as efficient.
>
> So instead make sure we limit the maximum we can copy to TImode.  This
> means copying a 16 byte struct will issue 1 TImode copy, which will be
> done using a single STP as we expect, but a CImode sized copy won't
> issue CImode operations.
>
> Bootstrapped and regtested on aarch64-none-linux-gnu and no issues.
> Crosstested aarch64_be-none-elf and no issues.
>
> Ok for trunk?
>
> Thanks,
> Tamar
>
> gcc/
> 2018-08-13  Tamar Christina  
>
> * config/aarch64/aarch64.c (aarch64_expand_movmem): Set TImode max.
>
> gcc/testsuite/
> 2018-08-13  Tamar Christina  
>
> * gcc.target/aarch64/large_struct_copy_2.c: Add assembler scan.
>
> --
>

[PATCH] Clarify source of tm.texi to copy for GFDL grant

2018-08-09 Thread Thomas Preudhomme

When tm.texi.in is updated in the source tree, the following message
gets displayed:

Verify that you have permission to grant a GFDL license for all
new text in tm.texi, then copy it to /gcc/doc/tm.texi.

Some colleagues and I have been confused several times by that
message as to which tm.texi to copy, so I think it would be clearer to
indicate the absolute path of the source as well. This patch achieves
that.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2018-08-09  Thomas Preud'homme  

* Makefile.in: Clarify which tm.texi to copy over to assert the
right to grant a GFDL license for all.

Testing: Built GCC with a change in tm.texi.in and copied by
copy/pasting the source and destination path from the resulting message.
Second build then succeeded.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index e7d818d174c..d8d2b885f6d 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2504,7 +2504,7 @@ s-tm-texi: build/genhooks$(build_exeext) $(srcdir)/doc/tm.texi.in
 	else \
 	  echo >&2 ; \
 	  echo Verify that you have permission to grant a GFDL license for all >&2 ; \
-	  echo new text in tm.texi, then copy it to $(srcdir)/doc/tm.texi. >&2 ; \
+	  echo new text in $(objdir)/tm.texi, then copy it to $(srcdir)/doc/tm.texi. >&2 ; \
 	  false; \
 	fi
 
-- 
2.18.0

Re: [PATCH][GCC][Arm] Fix subreg crash in different way by enabling the FP16 pattern unconditionally.

2018-07-26 Thread Thomas Preudhomme

On Thu, 26 Jul 2018 at 12:01, Tamar Christina  wrote:
>
> Hi Thomas,
>
> > -----Original Message-----
> > From: Thomas Preudhomme 
> > Sent: Thursday, July 26, 2018 09:29
> > To: Tamar Christina 
> > Cc: gcc-patches@gcc.gnu.org; nd ; Ramana Radhakrishnan
> > ; Richard Earnshaw
> > ; ni...@redhat.com; Kyrylo Tkachov
> > 
> > Subject: Re: [PATCH][GCC][Arm] Fix subreg crash in different way by
> > enabling the FP16 pattern unconditionally.
> >
> > Hi Tamar,
> >
> > On Wed, 25 Jul 2018 at 16:28, Tamar Christina 
> > wrote:
> > >
> > > Hi Thomas,
> > >
> > > Thanks for the review!
> > >
> > > > >
> > > > > I don't believe the TARGET_FP16 guard to be needed, because the
> > > > > pattern doesn't actually generate code and requires another
> > > > > pattern for that, and a reg to reg move should always be possible
> > > > > anyway. So allowing the force to register here is safe and it
> > > > > > allows the compiler to generate a correct error instead of ICEing in
> > > > > > an infinite loop.
> > > >
> > > > How about subreg to subreg move? Doesn't that expand to more insns
> > > > (subreg to reg and reg to subreg)? Couldn't you improve the logic to
> > > > check that there is actually a mode change so that if there isn't
> > > > (like moving from one subreg to another) just expand to a single move?
> > > >
> > >
> > > > Yes, but that is not a new issue. My patch is simply removing the
> > > > TARGET_FP16 restrictions and merging two patterns that should be one
> > > > using an iterator and nothing more.
> > > >
> > > > The redundant mov is already there and a different issue than the ICE
> > > > I'm trying to fix.
> >
> > > It's there for movv4hf and movv8hf but your patch extends this problem to
> > > movv2sf and movv4sf as well.
>
> I don't understand how it can. My patch just replaces one pattern for V4HF and
> one for V8HF with one pattern operating on VH.
>
> ;; Vector modes for 16-bit floating-point support.
> (define_mode_iterator VH [V8HF V4HF])
>
> My pattern has absolutely no effect on V2SF and V4SF or any of the other
> modes.

My bad, I was looking at VF.

>
> >
> > >
> > > > None of the code inside the expander is needed at all, the code really
> > > > only has an effect on subreg to subreg moves, as `force_reg` doesn't do
> > > > anything when its argument is already a reg.
> > > >
> > > > The comment in the expander (which was already there) is wrong. The
> > > > *reason* the ICE is fixed isn't because of the `force_reg`. It's
> > > > because of the mere presence of the expander itself. The expander
> > > > matches the standard mov$a optab and so this prevents
> > > > emit_move_insn_1 from doing the move by subwords as it finds a pattern
> > > > that's able to do the move.
> >
> > Could you then fix the comment in your patch as well? I hadn't understood
> > the force_reg was not key here. You might want to update the following
> > sentence from your patch description if you are going to include it in your
> > commit message:
>
> I'll update the comment in the patch. The cover letter won't be included
> in the commit, but it does accurately reflect the current state of
> affairs. The patch will do the force_reg, it's just not the reason it
> works.

Understood.

>
> >
> > The way this is worked around in the back-end is that we have move
> > patterns in neon.md that usually just force the register instead of checking
> > with the back-end.
> >
> > "The way this is worked around (..) that just force the register" is what
> > led me to believe the force_reg was important.
> >
> > >
> > > > The expander however always falls through and doesn’t stop RTL
> > > > generation. You could remove all the code in there and have it
> > > > properly match the *neon_mov instructions which will do the right
> > > > thing later at code generation time and avoid the redundant moves.  My
> > > > guess is the original `force_reg` was copied from the other patterns
> > > > like `movti` and the existing `mov`. There it makes sense because the
> > > > operands can be MEM or anything general_operand.
> > > >
> > > > However the redundant moves are a different problem than what I'm
> > > > trying to solve here. So I think that's another patch which requires
> > > > further testing.
> >
> > I was just thinking of restricting when

Re: [PATCH][GCC][Arm] Fix subreg crash in different way by enabling the FP16 pattern unconditionally.

2018-07-26 Thread Thomas Preudhomme

Hi Tamar,

On Wed, 25 Jul 2018 at 16:28, Tamar Christina  wrote:
>
> Hi Thomas,
>
> Thanks for the review!
>
> > >
> > > I don't believe the TARGET_FP16 guard to be needed, because the
> > > pattern doesn't actually generate code and requires another pattern
> > > for that, and a reg to reg move should always be possible anyway. So
> > > allowing the force to register here is safe and it allows the compiler
> > > to generate a correct error instead of ICEing in an infinite loop.
> >
> > How about subreg to subreg move? Doesn't that expand to more insns
> > (subreg to reg and reg to subreg)? Couldn't you improve the logic to check
> > that there is actually a mode change so that if there isn't (like moving
> > from one subreg to another) just expand to a single move?
> >
>
> Yes, but that is not a new issue. My patch is simply removing the
> TARGET_FP16 restrictions and merging two patterns that should be one using
> an iterator and nothing more.
>
> The redundant mov is already there and a different issue than the ICE I'm
> trying to fix.

It's there for movv4hf and movv8hf but your patch extends this problem
to movv2sf and movv4sf as well.

>
> None of the code inside the expander is needed at all, the code really only
> has an effect on subreg to subreg moves, as `force_reg` doesn't do anything
> when its argument is already a reg.
>
> The comment in the expander (which was already there) is wrong. The *reason*
> the ICE is fixed isn't because of the `force_reg`. It's because of the mere
> presence of the expander itself. The expander matches the standard mov$a
> optab and so this prevents emit_move_insn_1 from doing the move by subwords
> as it finds a pattern that's able to do the move.

Could you then fix the comment in your patch as well? I hadn't
understood the force_reg was not key here. You might want to update
the following sentence from your patch description if you are going to
include it in your commit message:

The way this is worked around in the back-end is that we have move patterns in
neon.md that usually just force the register instead of checking with the
back-end.

"The way this is worked around (..) that just force the register" is
what led me to believe the force_reg was important.

>
> The expander however always falls through and doesn’t stop RTL generation.
> You could remove all the code in there and have it properly match the
> *neon_mov instructions which will do the right thing later at code
> generation time and avoid the redundant moves.  My guess is the original
> `force_reg` was copied from the other patterns like `movti` and the existing
> `mov`. There it makes sense because the operands can be MEM or anything
> general_operand.
>
> However the redundant moves are a different problem than what I'm trying to
> solve here. So I think that's another patch which requires further testing.

I was just thinking of restricting when the force_reg happens, but
if it can be removed completely I agree it should probably be done in
a separate patch.

Oh, by the way, is there something that prevents those expanders from ever
being used with a memory operand? I ask because the GCC internals manual
contains the following passage for the mov standard pattern (bold marks
added by me):

"Second, these patterns are not used solely in the RTL generation pass. Even
the reload pass can generate move insns to copy values from stack slots into
temporary registers. When it does so, one of the operands is a hard register
and the other is an operand that can need to be reloaded into a register.
Therefore, when given such a pair of operands, the pattern must generate RTL
which needs no reloading and needs no temporary registers—no registers other
than the operands. For example, if you support the pattern with a
define_expand, then in such a case the define_expand *mustn’t call force_reg*
or any other such function which might generate new pseudo registers."
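
The constraint quoted above can be sketched with a small toy model in C. This is not GCC code: `enum operand_kind`, `struct move_plan` and `expand_mov` are hypothetical names for illustration only, and the `can_create_pseudo` flag merely stands in for GCC's `can_create_pseudo_p ()` check, which is assumed here to be false once reload has started. The idea is that the expander only forces the source into a fresh register when new pseudos are still allowed.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy operand kinds; hypothetical, not GCC's rtx codes.  */
enum operand_kind { OP_REG, OP_MEM };

struct move_plan
{
  bool new_pseudo;  /* true if the expander created a new pseudo */
};

/* Toy model of a mov expander's preparation statements.
   CAN_CREATE_PSEUDO stands in for GCC's can_create_pseudo_p ().  */
static struct move_plan
expand_mov (enum operand_kind dest, enum operand_kind src,
            bool can_create_pseudo)
{
  struct move_plan plan = { false };

  /* Forcing a source that is already a register is a no-op, so a new
     pseudo is only needed for a non-register source, and only when the
     destination is not a register either.  */
  if (can_create_pseudo && dest != OP_REG && src != OP_REG)
    plan.new_pseudo = true;

  return plan;
}
```

With the flag cleared, as it would be for a move generated by reload, the model never asks for a new pseudo regardless of the operands, which is the property the internals manual requires.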

Best regards,

Thomas

>
> Regards,
> Tamar
>
> > Best regards,
> >
> > Thomas
> >
> > >
> > > This patch ensures gcc.target/arm/big-endian-subreg.c is fixed without
> > > introducing any regressions while fixing
> > >
> > > gcc.dg/vect/vect-nop-move.c execution test
> > > g++.dg/torture/vshuf-v2si.C   -O3 -g  execution test
> > > g++.dg/torture/vshuf-v4si.C   -O3 -g  execution test
> > > g++.dg/torture/vshuf-v8hi.C   -O3 -g  execution test
> > >
> > > Regtested on armeb-none-eabi and no regressions.
> > > Bootstrapped on arm-none-linux-gnueabihf and no issues.
> > >
> > >
> > > Ok for trunk?
> > >
> > > Thanks,
> > > Tamar
> > >
> > > gcc/
> > > 2018-07-23  Tamar Christina  
> > >
> > > PR target/84711
> > > * config/arm/arm.c (arm_can_change_mode_class): Disallow subreg.
> > > * config/arm/neon.md (movv4hf, movv8hf): Refactored to..
> > > (mov): ..this and enable unconditionally.
> > >
> > > --

Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-07-25 Thread Thomas Preudhomme

Hi Kyrill,

Using memory_operand worked; the issues I encountered when using it in
earlier versions of the patch must have been due to the missing test
on address_operand in the preparation statements, which I added later.
Please find an updated patch attached. ChangeLog entry is as
follows:

*** gcc/ChangeLog ***

2018-07-05  Thomas Preud'homme  

* target-insns.def (stack_protect_combined_set): Define new standard
pattern name.
(stack_protect_combined_test): Likewise.
* cfgexpand.c (stack_protect_prologue): Try new
stack_protect_combined_set pattern first.
* function.c (stack_protect_epilogue): Try new
stack_protect_combined_test pattern first.
* config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
parameters to control which register to use as PIC register and force
reloading PIC register respectively.  Insert in the stream of insns if
possible.
(legitimize_pic_address): Expose above new parameters in prototype and
adapt recursive calls accordingly.
(arm_legitimize_address): Adapt to new legitimize_pic_address
prototype.
(thumb_legitimize_address): Likewise.
(arm_emit_call_insn): Adapt to new require_pic_register prototype.
* config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
change.
* config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
prototype change.
(stack_protect_combined_set): New insn_and_split pattern.
(stack_protect_set): New insn pattern.
(stack_protect_combined_test): New insn_and_split pattern.
(stack_protect_test): New insn pattern.
* config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
(UNSPEC_SP_TEST): Likewise.
* doc/md.texi (stack_protect_combined_set): Document new standard
pattern name.
(stack_protect_set): Clarify that the operand for guard's address is
legal.
(stack_protect_combined_test): Document new standard pattern name.
(stack_protect_test): Clarify that the operand for guard's address is
legal.

*** gcc/testsuite/ChangeLog ***

2018-07-05  Thomas Preud'homme  

* gcc.target/arm/pr85434.c: New test.

Bootstrapped again for Arm and Thumb-2 and regtested with and without
-fstack-protector-all without any regression.
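
For readers following the cfgexpand.c and function.c ChangeLog entries above, the "try the new pattern first" logic can be sketched as follows. This is a simplified model under stated assumptions, not the patch itself: `struct target_patterns`, `emit_fn`, `emit_ok` and `expand_guard_set` are hypothetical names, a NULL entry stands for a pattern the target does not define, and the final fallback to a plain move is an assumption about the generic code path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-target table: NULL means the target does not
   provide that pattern.  */
typedef int (*emit_fn) (void);

struct target_patterns
{
  emit_fn stack_protect_combined_set;
  emit_fn stack_protect_set;
};

enum emitted { EMIT_COMBINED, EMIT_SEPARATE, EMIT_GENERIC_MOVE };

/* Stand-in for a pattern that expands successfully.  */
static int
emit_ok (void)
{
  return 1;
}

/* Prefer the combined pattern, then the existing stack_protect_set
   pattern, and finally fall back to an ordinary move.  */
static enum emitted
expand_guard_set (const struct target_patterns *t)
{
  if (t->stack_protect_combined_set && t->stack_protect_combined_set ())
    return EMIT_COMBINED;
  if (t->stack_protect_set && t->stack_protect_set ())
    return EMIT_SEPARATE;
  return EMIT_GENERIC_MOVE;
}
```

Checking for the presence of a pattern before using it is what keeps targets without the combined patterns on their existing path.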

Best regards,

Thomas
On Thu, 19 Jul 2018 at 17:34, Thomas Preudhomme
 wrote:
>
> [Dropping Jeff Law from the list since he already commented on the
> middle end parts]
>
> Hi Kyrill,
>
> On Thu, 19 Jul 2018 at 12:02, Kyrill Tkachov
>  wrote:
> >
> > Hi Thomas,
> >
> > On 17/07/18 12:02, Thomas Preudhomme wrote:
> > > Fixed in attached patch. ChangeLog entries are unchanged:
> > >
> > > *** gcc/ChangeLog ***
> > >
> > > 2018-07-05  Thomas Preud'homme 
> > >
> > > PR target/85434
> > > * target-insns.def (stack_protect_combined_set): Define new standard
> > > pattern name.
> > > (stack_protect_combined_test): Likewise.
> > > * cfgexpand.c (stack_protect_prologue): Try new
> > > stack_protect_combined_set pattern first.
> > > * function.c (stack_protect_epilogue): Try new
> > > stack_protect_combined_test pattern first.
> > > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
> > > parameters to control which register to use as PIC register and force
> > > reloading PIC register respectively.
> > > (legitimize_pic_address): Expose above new parameters in prototype and
> > > adapt recursive calls accordingly.
> > > (arm_legitimize_address): Adapt to new legitimize_pic_address
> > > prototype.
> > > (thumb_legitimize_address): Likewise.
> > > (arm_emit_call_insn): Adapt to new require_pic_register prototype.
> > > * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
> > > change.
> > > * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
> > > prototype change.
> > > (stack_protect_combined_set): New insn_and_split pattern.
> > > (stack_protect_set): New insn pattern.
> > > (stack_protect_combined_test): New insn_and_split pattern.
> > > (stack_protect_test): New insn pattern.
> > > * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
> > > (UNSPEC_SP_TEST): Likewise.
> > > * doc/md.texi (stack_protect_combined_set): Document new standard
> > > pattern name.
> > > (stack_protect_set): Clarify that the operand for guard's address is
> > > legal.
> > > (stack_protect_combined_test): Document new standard pattern name.
> > > (stack_protect_test): Clarify that the operand for guard's address is
> > > > legal.

Re: [PATCH][GCC][Arm] Fix subreg crash in different way by enabling the FP16 pattern unconditionally.

2018-07-25 Thread Thomas Preudhomme

Hi Tamar,

On Mon, 23 Jul 2018 at 17:56, Tamar Christina  wrote:
>
> Hi All,
>
> My previous patch changed arm_can_change_mode_class to allow subregs of
> 64bit registers on arm big-endian.  However it seems that we can't do this
> because the data in 64 bit VFP registers is stored in little-endian order,
> even on big-endian.
>
> Allowing this change had a knock on effect that caused GCC's no-op detection
> to think that loading from the first lane on arm big-endian is a no-op.  This
> is because we can't describe the weird ordering we have on D registers on
> big-endian.
>
> The original issue comes from the fact that the code does
>
> ... foo (... bar)
> {
>   return bar;
> }
>
> The expansion of the return statement causes GCC to try to return the value in
> a register.  GCC will try to emit the move then, from MEM to REG (due to the
> SSA temporary.).  It checks for a mov optab for this which isn't available and
> then tries to do the move in bits using emit_move_multi_word.
>
> emit_move_multi_word will split the move into sub parts, but then needs to get
> the sub parts and does this using subregs, but it's told it can't do subregs!
>
> The compiler is now stuck in an infinite loop.
>
> The way this is worked around in the back-end is that we have move patterns in
> neon.md that usually just force the register instead of checking with the
> back-end. This prevents emit_move_multi_word from being needed.  However the
> pattern for V4HF and V8HF were guarded by TARGET_NEON && TARGET_FP16.
>
> I don't believe the TARGET_FP16 guard to be needed, because the pattern
> doesn't actually generate code and requires another pattern for that, and a
> reg to reg move should always be possible anyway. So allowing the force to
> register here is safe and it allows the compiler to generate a correct error
> instead of ICEing in an infinite loop.

How about subreg to subreg move? Doesn't that expand to more insns
(subreg to reg and reg to subreg)? Couldn't you improve the logic to
check that there is actually a mode change so that if there isn't
(like moving from one subreg to another) just expand to a single move?

Best regards,

Thomas

>
> This patch ensures gcc.target/arm/big-endian-subreg.c is fixed without
> introducing any regressions while fixing
>
> gcc.dg/vect/vect-nop-move.c execution test
> g++.dg/torture/vshuf-v2si.C   -O3 -g  execution test
> g++.dg/torture/vshuf-v4si.C   -O3 -g  execution test
> g++.dg/torture/vshuf-v8hi.C   -O3 -g  execution test
>
> Regtested on armeb-none-eabi and no regressions.
> Bootstrapped on arm-none-linux-gnueabihf and no issues.
>
>
> Ok for trunk?
>
> Thanks,
> Tamar
>
> gcc/
> 2018-07-23  Tamar Christina  
>
> PR target/84711
> * config/arm/arm.c (arm_can_change_mode_class): Disallow subreg.
> * config/arm/neon.md (movv4hf, movv8hf): Refactored to..
> (mov): ..this and enable unconditionally.
>
> --

Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-07-19 Thread Thomas Preudhomme

[Dropping Jeff Law from the list since he already commented on the
middle end parts]

Hi Kyrill,

On Thu, 19 Jul 2018 at 12:02, Kyrill Tkachov
 wrote:
>
> Hi Thomas,
>
> On 17/07/18 12:02, Thomas Preudhomme wrote:
> > Fixed in attached patch. ChangeLog entries are unchanged:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-07-05  Thomas Preud'homme 
> >
> > PR target/85434
> > * target-insns.def (stack_protect_combined_set): Define new standard
> > pattern name.
> > (stack_protect_combined_test): Likewise.
> > * cfgexpand.c (stack_protect_prologue): Try new
> > stack_protect_combined_set pattern first.
> > * function.c (stack_protect_epilogue): Try new
> > stack_protect_combined_test pattern first.
> > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
> > parameters to control which register to use as PIC register and force
> > reloading PIC register respectively.
> > (legitimize_pic_address): Expose above new parameters in prototype and
> > adapt recursive calls accordingly.
> > (arm_legitimize_address): Adapt to new legitimize_pic_address
> > prototype.
> > (thumb_legitimize_address): Likewise.
> > (arm_emit_call_insn): Adapt to new require_pic_register prototype.
> > * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
> > change.
> > * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
> > prototype change.
> > (stack_protect_combined_set): New insn_and_split pattern.
> > (stack_protect_set): New insn pattern.
> > (stack_protect_combined_test): New insn_and_split pattern.
> > (stack_protect_test): New insn pattern.
> > * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
> > (UNSPEC_SP_TEST): Likewise.
> > * doc/md.texi (stack_protect_combined_set): Document new standard
> > pattern name.
> > (stack_protect_set): Clarify that the operand for guard's address is
> > legal.
> > (stack_protect_combined_test): Document new standard pattern name.
> > (stack_protect_test): Clarify that the operand for guard's address is
> > legal.
> >
> > *** gcc/testsuite/ChangeLog ***
> >
> > 2018-07-05  Thomas Preud'homme 
> >
> > PR target/85434
> > * gcc.target/arm/pr85434.c: New test.
> >
>
> Sorry for the delay. Some comments inline.
>
> Kyrill
>
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index d6e3c382085..d1a893ac56e 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -6105,8 +6105,18 @@ stack_protect_prologue (void)
>   {
> tree guard_decl = targetm.stack_protect_guard ();
> rtx x, y;
> +  struct expand_operand ops[2];
>
> x = expand_normal (crtl->stack_protect_guard);
> +  create_fixed_operand (&ops[0], x);
> +  create_fixed_operand (&ops[1], DECL_RTL (guard_decl));
> +  /* Allow the target to compute address of Y and copy it to X without
> + leaking Y into a register.  This combined address + copy pattern allows
> + the target to prevent spilling of any intermediate results by splitting
> + it after register allocator.  */
> +  if (maybe_expand_insn (targetm.code_for_stack_protect_combined_set, 2, ops))
> +return;
> +
> if (guard_decl)
>   y = expand_normal (guard_decl);
> else
> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
> index 8537262ce64..100844e659c 100644
> --- a/gcc/config/arm/arm-protos.h
> +++ b/gcc/config/arm/arm-protos.h
> @@ -67,7 +67,7 @@ extern int const_ok_for_dimode_op (HOST_WIDE_INT, enum rtx_code);
>   extern int arm_split_constant (RTX_CODE, machine_mode, rtx,
>HOST_WIDE_INT, rtx, rtx, int);
>   extern int legitimate_pic_operand_p (rtx);
> -extern rtx legitimize_pic_address (rtx, machine_mode, rtx);
> +extern rtx legitimize_pic_address (rtx, machine_mode, rtx, rtx, bool);
>   extern rtx legitimize_tls_address (rtx, rtx);
>   extern bool arm_legitimate_address_p (machine_mode, rtx, bool);
>   extern int arm_legitimate_address_outer_p (machine_mode, rtx, RTX_CODE, int);
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index ec3abbcba9f..f4a970580c2 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -7369,20 +7369,26 @@ legitimate_pic_operand_p (rtx x)
>   }
>
>   /* Record that the current function needs a PIC register.  Initialize
> -   cfun->machine->pic_reg if we have not already done so.  */
> +   cfun->machine->pic_reg if we have not already done so.
> +
> +   If not NULL, PIC_REG

Re: [PATCH] Show valid options for -march and -mtune in --help=target for arm32 (PR driver/83193).

2018-07-18 Thread Thomas Preudhomme

Hi Martin,

Why is this needed when -mfpu does not seem to need it for instance?
Regarding the patch:

> -print "Name(processor_type) Type(enum processor_type)"
> -print "Known ARM CPUs (for use with the -mcpu= and -mtune= options):\n"
> +print "Name(processor_type) Type(enum processor_type) ForceHelp"
> +print "Known ARM CPUs (for use with the -mtune= options):\n"

Why change the text beyond adding ForceHelp?

> +@item ForceHelp
> +This property is optional.  If present, enum values is printed
> +in @option{--help} output.
> +

are printed

Thanks,

Thomas
On Wed, 18 Jul 2018 at 16:50, Martin Liška  wrote:
>
> Hi.
>
> This introduces a new ForceHelp option flag that helps to
> print valid option enum values that are not directly
> used as a type of an option.
>
> May I please ask ARM folks to test the patch?
> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> 2018-07-18  Martin Liska  
>
> PR driver/83193
> * config/arm/arm-tables.opt: Add ForceHelp flag for
> processor_type and arch_name enum types.
> * config/arm/parsecpu.awk: Likewise.
> * doc/options.texi: Document new flag ForceHelp.
> * opt-read.awk: Parse ForceHelp and set it in construction.
> * optc-gen.awk: Likewise.
> * opts.c (print_filtered_help): Handle force_help option.
> * opts.h (struct cl_enum): New field force_help.
> ---
>  gcc/config/arm/arm-tables.opt | 6 +++---
>  gcc/config/arm/parsecpu.awk   | 6 +++---
>  gcc/doc/options.texi  | 4 
>  gcc/opt-read.awk  | 3 +++
>  gcc/optc-gen.awk  | 3 ++-
>  gcc/opts.c| 3 ++-
>  gcc/opts.h| 3 +++
>  7 files changed, 20 insertions(+), 8 deletions(-)
>
>
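The intent of the ForceHelp flag described above can be sketched in C. This is only a minimal model under stated assumptions: `struct enum_info`, `should_list_enum` and `count_listed` are hypothetical names, and apart from `force_help` (named in the ChangeLog) the fields are invented for illustration; the real `struct cl_enum` and `print_filtered_help` are likely organised differently.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified description of an option enum.  Only
   force_help is named in the ChangeLog above; the other fields are
   assumptions made for this sketch.  */
struct enum_info
{
  const char *help;          /* Heading printed before the values.  */
  bool used_as_option_type;  /* Some option takes this enum directly.  */
  bool force_help;           /* Set by the ForceHelp property.  */
};

/* Decide whether --help output should list the enum's values.  */
bool
should_list_enum (const struct enum_info *e)
{
  return e->help != NULL && (e->used_as_option_type || e->force_help);
}

/* Count how many of the N enums in TABLE would be listed.  */
size_t
count_listed (const struct enum_info *table, size_t n)
{
  size_t listed = 0;
  for (size_t i = 0; i < n; i++)
    if (should_list_enum (&table[i]))
      listed++;
  return listed;
}
```

In this model an enum that no option uses directly is still listed once `force_help` is set, which is the behaviour the patch description asks for.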

Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-07-17 Thread Thomas Preudhomme

Fixed in attached patch. ChangeLog entries are unchanged:

*** gcc/ChangeLog ***

2018-07-05  Thomas Preud'homme  

PR target/85434
* target-insns.def (stack_protect_combined_set): Define new standard
pattern name.
(stack_protect_combined_test): Likewise.
* cfgexpand.c (stack_protect_prologue): Try new
stack_protect_combined_set pattern first.
* function.c (stack_protect_epilogue): Try new
stack_protect_combined_test pattern first.
* config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
parameters to control which register to use as PIC register and force
reloading PIC register respectively.
(legitimize_pic_address): Expose above new parameters in prototype and
adapt recursive calls accordingly.
(arm_legitimize_address): Adapt to new legitimize_pic_address
prototype.
(thumb_legitimize_address): Likewise.
(arm_emit_call_insn): Adapt to new require_pic_register prototype.
* config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
change.
* config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
prototype change.
(stack_protect_combined_set): New insn_and_split pattern.
(stack_protect_set): New insn pattern.
(stack_protect_combined_test): New insn_and_split pattern.
(stack_protect_test): New insn pattern.
* config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
(UNSPEC_SP_TEST): Likewise.
* doc/md.texi (stack_protect_combined_set): Document new standard
pattern name.
(stack_protect_set): Clarify that the operand for guard's address is
legal.
(stack_protect_combined_test): Document new standard pattern name.
(stack_protect_test): Clarify that the operand for guard's address is
legal.

*** gcc/testsuite/ChangeLog ***

2018-07-05  Thomas Preud'homme  

PR target/85434
* gcc.target/arm/pr85434.c: New test.

Best regards,

Thomas
On Mon, 16 Jul 2018 at 22:46, Jeff Law  wrote:
>
> On 07/05/2018 08:48 AM, Thomas Preudhomme wrote:
> > In case of high register pressure in PIC mode, address of the stack
> > protector's guard can be spilled on ARM targets as shown in PR85434,
> > thus allowing an attacker to control what the canary would be compared
> > against. ARM does lack stack_protect_set and stack_protect_test insn
> > patterns, but defining them would not help: the address is still
> > expanded in the regular way, and the patterns only deal with the copy
> > and test of the guard with the canary.
> >
> > This problem does not occur for x86 targets because the PIC access and
> > the test can be done in the same instruction. Aarch64 is exempt too
> > because its PIC access insn patterns are a mov of an UNSPEC, which
> > prevents the second access in the epilogue from being CSEd in the
> > cse_local pass with the first access in the prologue.
> >
> > The approach followed here is to create new "combined" set and test
> > standard pattern names that take the unexpanded guard and do the set or
> > test. This allows the target to use an opaque pattern (e.g. using UNSPEC)
> > to hide the individual instructions being generated from the compiler and
> > to split the pattern into generic load, compare and branch instructions
> > after register allocation, therefore avoiding any spilling. This is
> > implemented here for the ARM targets. For targets not implementing these
> > new standard pattern names, the existing stack_protect_set and
> > stack_protect_test pattern names are used.
> >
> > To be able to split PIC access after register allocation, the PIC helper
> > functions (require_pic_register and legitimize_pic_address) had to be
> > augmented to force a new PIC register load and to control which register
> > it loads into. This is because sharing the PIC register between prologue
> > and epilogue could again lead to spilling due to CSE, which an attacker
> > could use to control what the canary gets compared against.
> >
> > ChangeLog entries are as follows:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-07-05  Thomas Preud'homme  
> >
> > PR target/85434
> > * target-insns.def (stack_protect_combined_set): Define new standard
> > pattern name.
> > (stack_protect_combined_test): Likewise.
> > * cfgexpand.c (stack_protect_prologue): Try new
> > stack_protect_combined_set pattern first.
> > * function.c (stack_protect_epilogue): Try new
> > stack_protect_combined_test pattern first.
> > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
> > parameters to control which register to use as PIC register and force
> > reloading PIC register respectively.
> > (legitimize_pic_address): Expose above new parameters in prototype and
> > adapt recursive calls

Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-07-10 Thread Thomas Preudhomme

Adding Jeff and Eric since the patch adds an RTL target hook.

Best regards,

Thomas

On Thu, 5 Jul 2018 at 15:48, Thomas Preudhomme
 wrote:
>
> In case of high register pressure in PIC mode, address of the stack
> protector's guard can be spilled on ARM targets as shown in PR85434,
> thus allowing an attacker to control what the canary would be compared
> against. ARM does lack stack_protect_set and stack_protect_test insn
> patterns, but defining them would not help: the address is still
> expanded in the regular way, and the patterns only deal with the copy
> and test of the guard with the canary.
>
> This problem does not occur for x86 targets because the PIC access and
> the test can be done in the same instruction. Aarch64 is exempt too
> because its PIC access insn patterns are a mov of an UNSPEC, which
> prevents the second access in the epilogue from being CSEd in the
> cse_local pass with the first access in the prologue.
>
> The approach followed here is to create new "combined" set and test
> standard pattern names that take the unexpanded guard and do the set or
> test. This allows the target to use an opaque pattern (e.g. using UNSPEC)
> to hide the individual instructions being generated from the compiler and
> to split the pattern into generic load, compare and branch instructions
> after register allocation, therefore avoiding any spilling. This is
> implemented here for the ARM targets. For targets not implementing these
> new standard pattern names, the existing stack_protect_set and
> stack_protect_test pattern names are used.
>
> To be able to split PIC access after register allocation, the PIC helper
> functions (require_pic_register and legitimize_pic_address) had to be
> augmented to force a new PIC register load and to control which register
> it loads into. This is because sharing the PIC register between prologue
> and epilogue could again lead to spilling due to CSE, which an attacker
> could use to control what the canary gets compared against.
>
> ChangeLog entries are as follows:
>
> *** gcc/ChangeLog ***
>
> 2018-07-05  Thomas Preud'homme  
>
> PR target/85434
> * target-insns.def (stack_protect_combined_set): Define new standard
> pattern name.
> (stack_protect_combined_test): Likewise.
> * cfgexpand.c (stack_protect_prologue): Try new
> stack_protect_combined_set pattern first.
> * function.c (stack_protect_epilogue): Try new
> stack_protect_combined_test pattern first.
> * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
> parameters to control which register to use as PIC register and force
> reloading PIC register respectively.
> (legitimize_pic_address): Expose above new parameters in prototype and
> adapt recursive calls accordingly.
> (arm_legitimize_address): Adapt to new legitimize_pic_address
> prototype.
> (thumb_legitimize_address): Likewise.
> (arm_emit_call_insn): Adapt to new require_pic_register prototype.
> * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
> change.
> * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
> prototype change.
> (stack_protect_combined_set): New insn_and_split pattern.
> (stack_protect_set): New insn pattern.
> (stack_protect_combined_test): New insn_and_split pattern.
> (stack_protect_test): New insn pattern.
> * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
> (UNSPEC_SP_TEST): Likewise.
> * doc/md.texi (stack_protect_combined_set): Document new standard
> pattern name.
> (stack_protect_set): Clarify that the operand for guard's address is
> legal.
> (stack_protect_combined_test): Document new standard pattern name.
> (stack_protect_test): Clarify that the operand for guard's address is
> legal.
>
> *** gcc/testsuite/ChangeLog ***
>
> 2018-07-05  Thomas Preud'homme  
>
> PR target/85434
> * gcc.target/arm/pr85434.c: New test.
>
> Testing: Bootstrapped on ARM in both Arm and Thumb-2 mode as well as on
> Aarch64. The testsuite shows no regression on any of these 3 variants,
> either with default flags or with -fstack-protector-all.
>
> Is this ok for trunk? If yes, would this be acceptable as a backport to
> GCC 6, 7 and 8 provided that no regression is found?
>
> Best regards,
>
> Thomas
From d917d48c2005e46154383589f203d06f3c6167e0 Mon Sep 17 00:00:00 2001
From: Thomas Preud'homme 
Date: Tue, 8 May 2018 15:47:05 +0100
Subject: [PATCH] PR85434: Prevent spilling of stack protector guard's address
 on ARM

In case of high register pressure in PIC mode, address of the stack
protector's guard can be spilled on ARM targets as shown in PR85434,
thus allowing an attacker to control what the canary would be compared
against. ARM does lack stack_protect

[PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-07-05 Thread Thomas Preudhomme

In case of high register pressure in PIC mode, address of the stack
protector's guard can be spilled on ARM targets as shown in PR85434,
thus allowing an attacker to control what the canary would be compared
against. ARM does lack stack_protect_set and stack_protect_test insn
patterns, but defining them would not help: the address is still
expanded in the regular way, and the patterns only deal with the copy
and test of the guard with the canary.
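The risk described above can be illustrated with a toy model in plain C. This is a sketch under stated assumptions, not compiler output: `struct frame`, `real_guard` and the two check functions are hypothetical names, and the "overflow" is simulated by ordinary assignments rather than by real memory corruption.

```c
#include <stdint.h>

/* Stand-in for the real guard (e.g. __stack_chk_guard).  */
static uintptr_t real_guard = 0xC0FFEEu;

/* Toy frame: everything here is reachable by a simulated overflow.  */
struct frame
{
  uintptr_t *spilled_guard_addr;  /* Spill slot holding the guard's address.  */
  uintptr_t canary;               /* Copy of the guard made in the prologue.  */
};

/* Epilogue check that reloads the guard through the spilled address.  */
static int
check_via_spill (const struct frame *f)
{
  return *f->spilled_guard_addr == f->canary;
}

/* Epilogue check that recomputes the guard's address instead.  */
static int
check_via_fresh_address (const struct frame *f)
{
  return real_guard == f->canary;
}

/* Simulate prologue, overflow and epilogue.  Returns 1 if the corrupted
   frame passes the check, i.e. the attack goes undetected.  */
int
attack_undetected (int use_spilled_address)
{
  static uintptr_t attacker_value = 0x41414141u;
  struct frame f;

  /* Prologue: compute the guard's address, spill it, set the canary.  */
  f.spilled_guard_addr = &real_guard;
  f.canary = *f.spilled_guard_addr;

  /* Overflow: the attacker rewrites both the spill slot and the canary.  */
  f.spilled_guard_addr = &attacker_value;
  f.canary = attacker_value;

  return use_spilled_address ? check_via_spill (&f)
                             : check_via_fresh_address (&f);
}
```

In this model `attack_undetected (1)` reports the attack as undetected because the comparison uses an attacker-controlled address, whereas `attack_undetected (0)` catches it because the guard's address never lived on the stack.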

This problem does not occur for x86 targets because the PIC access and
the test can be done in the same instruction. Aarch64 is exempt too
because its PIC access insn patterns are a mov of an UNSPEC, which
prevents the second access in the epilogue from being CSEd in the
cse_local pass with the first access in the prologue.

The approach followed here is to create new "combined" set and test
standard pattern names that take the unexpanded guard and do the set or
test. This allows the target to use an opaque pattern (e.g. using UNSPEC)
to hide the individual instructions being generated from the compiler and
to split the pattern into generic load, compare and branch instructions
after register allocation, therefore avoiding any spilling. This is
implemented here for the ARM targets. For targets not implementing these new
standard pattern names, the existing stack_protect_set and
stack_protect_test pattern names are used.
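The resulting order of preference in the prologue can be sketched as follows. This is a simplified model with hypothetical names (`struct target_patterns`, `expand_prologue`, `gen_ok`, `gen_fail`); the real code goes through the target hooks and expand_operand machinery, and the final fallback to an ordinary move reflects my understanding of the existing cfgexpand.c behaviour rather than anything stated in this message.

```c
#include <stddef.h>

/* Hypothetical stand-in for a target's pattern generators.  A NULL
   entry means the target does not define that pattern; a generator
   returns nonzero when it emitted code.  */
typedef int (*pattern_gen) (void);

struct target_patterns
{
  pattern_gen combined_set;  /* stack_protect_combined_set  */
  pattern_gen plain_set;     /* stack_protect_set  */
};

enum prologue_expansion
{
  EXPANDED_COMBINED,
  EXPANDED_PLAIN,
  EXPANDED_MOVE
};

/* Try the combined pattern first, then the existing pattern, and
   finally fall back to an ordinary move of the guard into the canary.  */
enum prologue_expansion
expand_prologue (const struct target_patterns *t)
{
  if (t->combined_set != NULL && t->combined_set ())
    return EXPANDED_COMBINED;
  if (t->plain_set != NULL && t->plain_set ())
    return EXPANDED_PLAIN;
  return EXPANDED_MOVE;
}

/* Trivial generators used to exercise the sketch.  */
int gen_ok (void) { return 1; }
int gen_fail (void) { return 0; }
```

A target that only defines the existing pattern keeps its current behaviour, which is why the change should be neutral for targets other than ARM.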

To be able to split PIC access after register allocation, the PIC helper
functions (require_pic_register and legitimize_pic_address) had to be
augmented to force a new PIC register load and to control which register
it loads into. This is because sharing the PIC register between prologue
and epilogue could again lead to spilling due to CSE, which an attacker
could use to control what the canary gets compared against.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-07-05  Thomas Preud'homme  

PR target/85434
* target-insns.def (stack_protect_combined_set): Define new standard
pattern name.
(stack_protect_combined_test): Likewise.
* cfgexpand.c (stack_protect_prologue): Try new
stack_protect_combined_set pattern first.
* function.c (stack_protect_epilogue): Try new
stack_protect_combined_test pattern first.
* config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
parameters to control which register to use as PIC register and force
reloading PIC register respectively.
(legitimize_pic_address): Expose above new parameters in prototype and
adapt recursive calls accordingly.
(arm_legitimize_address): Adapt to new legitimize_pic_address
prototype.
(thumb_legitimize_address): Likewise.
(arm_emit_call_insn): Adapt to new require_pic_register prototype.
* config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
change.
* config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
prototype change.
(stack_protect_combined_set): New insn_and_split pattern.
(stack_protect_set): New insn pattern.
(stack_protect_combined_test): New insn_and_split pattern.
(stack_protect_test): New insn pattern.
* config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
(UNSPEC_SP_TEST): Likewise.
* doc/md.texi (stack_protect_combined_set): Document new standard
pattern name.
(stack_protect_set): Clarify that the operand for guard's address is
legal.
(stack_protect_combined_test): Document new standard pattern name.
(stack_protect_test): Clarify that the operand for guard's address is
legal.

*** gcc/testsuite/ChangeLog ***

2018-07-05  Thomas Preud'homme  

PR target/85434
* gcc.target/arm/pr85434.c: New test.

Testing: Bootstrapped on ARM in both Arm and Thumb-2 mode as well as on
Aarch64. The testsuite shows no regression on any of these 3 variants,
either with default flags or with -fstack-protector-all.

Is this ok for trunk? If yes, would this be acceptable as a backport to
GCC 6, 7 and 8 provided that no regression is found?

Best regards,

Thomas
From d917d48c2005e46154383589f203d06f3c6167e0 Mon Sep 17 00:00:00 2001
From: Thomas Preud'homme 
Date: Tue, 8 May 2018 15:47:05 +0100
Subject: [PATCH] PR85434: Prevent spilling of stack protector guard's address
 on ARM

In case of high register pressure in PIC mode, address of the stack
protector's guard can be spilled on ARM targets as shown in PR85434,
thus allowing an attacker to control what the canary would be compared
against. ARM does lack stack_protect_set and stack_protect_test insn
patterns, but defining them would not help: the address is still
expanded in the regular way, and the patterns only deal with the copy
and test of the guard with the canary.

This problem does not occur for x86 targets because the PIC access and
the test can be done in the same instruction. Aarch64 is exempt too
because its PIC access insn patterns are a mov of an UNSPEC, which
prevents the second access in the epilogue from being CSEd in the
cse_local pass with the first access in the prologue.

The approach followed here is to create new "combined" set and test
standard pattern names

[ARM] Fix PR85434: spill of stack protector's guard address

2018-05-03 Thread Thomas Preudhomme

I'll make a fool of myself but I still have further questions if you don't
mind (see inline).

On Friday, 4 May 2018, Segher Boessenkool <seg...@kernel.crashing.org>
wrote:
> Hi!
>
> On Wed, May 02, 2018 at 07:57:55AM +0100, Thomas Preudhomme wrote:
>> As mentioned in the ticket this was my first thought but this means
>> making the pattern aware of all the possible ways the address could be
>> accessed (PIC Vs non-PIC, Arm Vs Thumb-2 Vs Thumb-1) to decide how many
>> scratch registers are needed. I'd rather reuse the existing patterns as
>> much as possible to make sure they are well tested. Ideally I wanted a
>> way to mark a REG RTX so that it is never spilled and such that the
>> mark is propagated when the register is moved to another register or
>> propagated. But that is a bigger change so I decided it should be an
>> improvement for later, and that another solution was needed right now.
>
> How would that work, esp. for pseudos?  If too many regs have such a
> mark then the compiler will have to sorry() or similar, not a good
> thing at all.

I'm missing something: there should be the same number of pseudos with that
mark as there are scratches in the new pattern doing memory address load(s) +
set / check. I'm guessing this is not as easy to achieve as it sounds.

>
>> By the way about making sure the address is not left in a register, I
>> have a question regarding the current stack_protect_set and
>> stack_protect_test patterns and their requirement to have registers
>> cleared afterwards: why is that necessary? Currently not all registers
>> are cleared and the guard is available in the canary before it is
>> overwritten anyway so I don't see how clearing the register adds any
>> extra security. What sort of attack is it protecting against?
>
> From md.texi:
>
> @item @samp{stack_protect_set}
> This pattern, if defined, moves a @code{ptr_mode} value from the memory
> in operand 1 to the memory in operand 0 without leaving the value in
> a register afterward.  This is to avoid leaking the value some place
> that an attacker might use to rewrite the stack guard slot after
> having clobbered it.
>
> (etc.)

I've read that doc but what I don't understand is why the guard value being
leaked in a register would be a problem if modified. The patterns as they
are guarantee the guard is always reloaded from its canonical location
(e.g. TLS var). Because the patterns do not represent in RTL what they do,
the compiler could not reuse the value left in a register. Are we worrying
about optimizations the assembler could do?

>
> Having the canary in a global variable makes it a lot easier for exploit
> code to access it than if it is e.g. in TLS data.  Actually leaking a
> pointer to it would make it extra easy...

If an attacker can execute code to access and modify the guard, why would
s/he bother doing a stack overflow instead of just executing the code s/he
wants directly?

Best regards,

Thomas

Re: [ARM] Fix PR85434: spill of stack protector's guard address

2018-05-02 Thread Thomas Preudhomme

Hi Segher,

As mentioned in the ticket this was my first thought but this means
making the pattern aware of all the possible ways the address could be
accessed (PIC Vs non-PIC, Arm Vs Thumb-2 Vs Thumb-1) to decide how many
scratch registers are needed. I'd rather reuse the existing patterns as
much as possible to make sure they are well tested. Ideally I wanted a
way to mark a REG RTX so that it is never spilled and such that the
mark is propagated when the register is moved to another register or
propagated. But that is a bigger change so I decided it should be an
improvement for later, and that another solution was needed right now.

By the way about making sure the address is not left in a register, I
have a question regarding the current stack_protect_set and
stack_protect_test patterns and their requirement to have registers
cleared afterwards: why is that necessary? Currently not all registers
are cleared and the guard is available in the canary before it is
overwritten anyway so I don't see how clearing the register adds any
extra security. What sort of attack is it protecting against?

Best regards,

Thomas

On 29 April 2018 at 00:33, Segher Boessenkool
<seg...@kernel.crashing.org> wrote:
> Hi!
>
> On Sat, Apr 28, 2018 at 12:32:26AM +0100, Thomas Preudhomme wrote:
>> On Arm (Aarch32 and Aarch64) the stack protector's guard is accessed by
>> first loading its address and then loading its value from that address
>> as part of the stack_protect_set or stack_protect_test insn pattern.
>> This creates the risk of the address being spilled between the two loads.
>
>> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
>> index deab929..c7ced8f 100644
>> --- a/gcc/cfgexpand.c
>> +++ b/gcc/cfgexpand.c
>> @@ -6156,6 +6156,10 @@ stack_protect_prologue (void)
>>tree guard_decl = targetm.stack_protect_guard ();
>>rtx x, y;
>>
>> +  /* Prevent scheduling of instruction(s) between computation of the guard's
>> + address and setting of the canary to avoid any spill of the guard's
>> + address if computed outside the setting of the canary.  */
>> +  emit_insn (gen_blockage ());
>>x = expand_normal (crtl->stack_protect_guard);
>>if (guard_decl)
>>  y = expand_normal (guard_decl);
>
> [ etc. ]
>
> Why pessimise code for all targets (quite a lot), when it does not even
> fix the problem on Arm completely (or not obviously, anyway)?
>
> Instead, implement stack_protect_* and hide the memory accesses to the
> stored canary value (and make sure its address is not left in a register
> either!)
>
> I doubt this can be done completely target-independent, it will always
> be best effort that way, aka it won't really work.
>
>
> Segher

[ARM] Fix PR85434: spill of stack protector's guard address

2018-04-27 Thread Thomas Preudhomme

On Arm (Aarch32 and Aarch64) the stack protector's guard is accessed by
first loading its address and then loading its value from that address
as part of the stack_protect_set or stack_protect_test insn pattern.
This creates the risk of the address being spilled between the two loads.

It is particularly likely on Aarch32 when compiling PIC code because:
- computing the address takes several instructions (first compute the
  GOT base and then the GOT entry by adding an offset) which increases
  the likelihood of CSE
- the address computation can be CSEd due to the GOT entry computation
  being a MEM of the GOT base + an UNSPEC offset rather than an UNSPEC
  of a MEM like on AArch64.

This patch addresses both issues by (i) adding some scheduler barriers
around the stack protector code and (ii) making all memory loads
involved in computing the guard's address volatile. The use of volatile
rather than unspec was chosen so that the patterns for computing the
guard address can be the same as for normal global variable access thus
reusing more code. Finally the patch also improves the documentation to
mention the need to be careful when computing the address of the guard.
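As a rough source-level analogue of the two measures (not the RTL mechanism the patch itself uses, which relies on gen_blockage and volatile MEMs), the effect of a barrier and of volatile loads can be sketched in GNU C. The names `guard_slot`, `COMPILER_BARRIER` and `guard_loads_differ` are hypothetical and chosen for this illustration only.

```c
#include <stdint.h>

static uintptr_t guard_slot = 0x5AFEu;

/* Empty asm with a "memory" clobber: the compiler may not move memory
   accesses across it (GNU C extension).  */
#define COMPILER_BARRIER() __asm__ volatile ("" ::: "memory")

/* Load the guard twice.  Because the pointer is volatile-qualified, the
   second load cannot be replaced by the value of the first (no CSE).  */
uintptr_t
guard_loads_differ (void)
{
  volatile uintptr_t *p = &guard_slot;

  COMPILER_BARRIER ();
  uintptr_t first = *p;
  COMPILER_BARRIER ();
  uintptr_t second = *p;
  COMPILER_BARRIER ();

  return first ^ second;  /* 0 unless the slot changed in between.  */
}
```

The point of the analogy is only that each access is re-issued rather than shared with an earlier one, so no long-lived copy of the earlier result needs to be kept around.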

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2018-04-27  Thomas Preud'homme  

PR target/85434
* cfgexpand.c (stack_protect_prologue): Emit scheduler barriers around
stack protector code.
* function.c (stack_protect_epilogue): Likewise.
* config/arm/arm-protos.h (arm_stack_chk_guard_decl_p): Declare.
* config/arm/arm.md (calculate_pic_address): Mark memory volatile if
is computing address of stack protector's guard.
(calculate_pic_address splitter): Likewise.
* config/arm/arm.c (require_pic_register): Add parameter to control
whether to insert instruction at the end of the instruction stream.
(legitimize_pic_address): Force computing PIC address at the end of
instruction stream and adapt logic to change in calculate_pic_address
insn pattern.
(arm_stack_chk_guard_decl_p): New function.
(arm_emit_call_insn): Adapt to change in require_pic_register().
* target.def (TARGET_STACK_PROTECT_GUARD): Document requirement on
guard's address computation to be careful about not spilling.
* doc/tm.texi: Regenerate.

*** gcc/testsuite/ChangeLog ***

2018-04-27  Thomas Preud'homme  

PR target/85434
* gcc.target/arm/pr85434.c: New testcase.

Testing: The code has been bootstrapped on an Armv8-A machine targeting:
- Aarch32 ARM mode with -mfpu=neon-vfpv4 and hardfloat ABI
- Aarch64
Testsuite has been run for the following sets of flags:

- arm-eabi-aem/-mthumb/-march=armv4t
- arm-eabi-aem/-marm/-march=armv7-a/-mfpu=vfpv3-d16/-mfloat-abi=softfp
- arm-eabi-aem/-mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard

(thereby testing the code for ARM, Thumb-2 and Thumb-1 mode) without any
regression.

Is it ok for trunk?

Best regards,

Thomas
From 76c48e31130f212721addeeca830477e3b6f5e10 Mon Sep 17 00:00:00 2001
From: Thomas Preud'homme 
Date: Mon, 23 Apr 2018 14:37:11 +0100
Subject: [PATCH] [ARM] Fix PR85434: spill of stack protector's guard address

On Arm (Aarch32 and Aarch64) the stack protector's guard is accessed by
first loading its address and then loading its value from that address
as part of the stack_protect_set or stack_protect_test insn pattern.
This creates the risk of the address being spilled between the two loads.

It is particularly likely on Aarch32 when compiling PIC code because:
- computing the address takes several instructions (first compute the
  GOT base and then the GOT entry by adding an offset) which increases
  the likelihood of CSE
- the address computation can be CSEd due to the GOT entry computation
  being a MEM of the GOT base + an UNSPEC offset rather than an UNSPEC
  of a MEM like on AArch64.

This patch addresses both issues by (i) adding some scheduler barriers
around the stack protector code and (ii) making all memory loads
involved in computing the guard's address volatile. The use of volatile
rather than unspec was chosen so that the patterns for computing the
guard address can be the same as for normal global variable access thus
reusing more code. Finally the patch also improves the documentation to
mention the need to be careful when computing the address of the guard.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2018-04-27  Thomas Preud'homme  

	* cfgexpand.c (stack_protect_prologue): Emit scheduler barriers around
	stack protector code.
	* function.c (stack_protect_epilogue): Likewise.
	* config/arm/arm-protos.h (arm_stack_chk_guard_decl_p): Declare.
	* config/arm/arm.md (calculate_pic_address): Mark memory volatile if
	is computing address of stack protector's guard.
	(calculate_pic_address splitter): Likewise.
	* config/arm/arm.c (require_pic_register): Add parameter to control

Re: [PATCH, GCC/ARM] Fix PR85261: ICE with FPSCR setter builtin

2018-04-18 Thread Thomas Preudhomme


Hi Kyrill,

On 11/04/18 10:02, Kyrill Tkachov wrote:

Hi Thomas,

On 09/04/18 15:29, Thomas Preudhomme wrote:

Hi Ramana,

On 06/04/18 17:17, Thomas Preudhomme wrote:
>
>
> On 06/04/18 17:08, Ramana Radhakrishnan wrote:
>> On 06/04/2018 16:54, Thomas Preudhomme wrote:
>>> Instruction pattern for setting the FPSCR expects the input value to be
>>> in a register. However, __builtin_arm_set_fpscr expander does not ensure
>>> that this is the case and as a result GCC ICEs when the builtin is
>>> called with a constant literal.
>>>
>>> This commit fixes the builtin to force the input value into a register.
>>> It also removes the unneeded volatile in the existing fpscr test and
>>> fixes the function prototype.
>>>
>>> ChangeLog entries are as follows:
>>>
>>> *** gcc/ChangeLog ***
>>>
>>> 2018-04-06  Thomas Preud'homme <thomas.preudho...@arm.com>
>>>
>>> PR target/85261
>>> * config/arm/arm-builtins.c (arm_expand_builtin): Force input operand
>>> into register.
>>>
>>> *** gcc/testsuite/ChangeLog ***
>>>
>>> 2018-04-06  Thomas Preud'homme <thomas.preudho...@arm.com>
>>>
>>> PR target/85261
>>> * gcc.target/arm/fpscr.c: Add call to __builtin_arm_set_fpscr with
>>> literal value.  Expect 2 MCR instructions. Fix function prototype.
>>> Remove volatile keyword.
>>>
>>> Testing: Built an arm-none-eabi GCC cross-compiler and testsuite shows
>>> no regression.
>>>
>>> Is this ok for stage4?
>>>
>>> Best regards,
>>>
>>> Thomas
>>>
>>
>> (sorry about the duplicate for those who get it)
>>
>>
>> LGTM, though in this case I would prefer a bootstrap and regression run
>> as this is automatically exercised most with gcc.dg/atomic_*.c and you
>> really need this tested on linux rather than just bare-metal as I'm not sure
>> how this gets tested on arm-none-eabi.
>
> Oh it is indeed. Didn't realize it was used anywhere. Will start bootstrap
> right away.

Done with --with-arch=armv8-a --with-mode=thumb --with-fpu=neon-vfpv4
--with-float=hard --enable-languages=c,c++,fortran --with-system-zlib
--enable-plugins --enable-bootstrap. Testsuite for that GCC does not show any
regression either.

Ok to commit?



Thanks for doing this.
This is ok for trunk.


>
>>
>> What about earlier branches, have you looked ? This is a silly target
>> bug and fixes should go back to older branches in this particular case
>> after baking this on trunk for some time.
>
> GCC 6 and 7 are affected as well and a backport will be done once it has baked
> long enough of course.

Will now bootstrap and regtest against GCC 6 and 7. Will let you know once that
is finished.


Backports show no regression on a bootstrapped arm-none-linux-gnueabihf GCC 6 & 
7. Ok to commit those?


Best regards,

Thomas

Re: [PATCH, GCC/ARM] Fix PR85203: cmse_nonsecure_caller returns wrong result

2018-04-11 Thread Thomas Preudhomme


Hi Kyrill,

One week went by so I've committed the change to GCC 7 as announced.

Best regards,

Thomas

On 05/04/18 16:36, Kyrill Tkachov wrote:


On 05/04/18 16:13, Thomas Preudhomme wrote:

Hi Kyrill,

On 04/04/18 18:20, Thomas Preudhomme wrote:

Hi Kyrill,

On 04/04/18 18:19, Kyrill Tkachov wrote:

Hi Thomas,

On 04/04/18 18:03, Thomas Preudhomme wrote:

Hi,

__builtin_cmse_nonsecure_caller implementation returns true in almost
all cases due to 2 separate bugs:

* gen_addsi3 is used instead of gen_andsi3 to retrieve the lsb
* the lsb boolean value is not negated but the specification [1] says
   the intrinsic should return true for a nonsecure caller and a
   nonsecure caller is characterized with LR's lsb being 0

This was not caught due to (1) lack of runtime test and (2) the existing
RTL scan not taking into account that '.' matches newline in Tcl regular
expressions.

This patch fixes the implementation issues and improves testing of
cmse_nonsecure_caller by (1) adding a runtime test for the secure caller
case and (2) looking for a SET insn of an AND expression in the right
function. This leaves the nonsecure caller case only partly tested
since the exact value being ANDed and the negation are not covered by the
scan and the existing test infrastructure does not allow two separate
compilations and links to be performed. It is enough though to catch the
current incorrect behavior.

The patch also reorganizes the scan directives in cmse-1.c to more easily
identify what function they are intended to test in the file.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2018-04-04  Thomas Preud'homme <thomas.preudho...@arm.com>

    PR target/85203
    * config/arm/arm-builtins.c (arm_expand_builtin): Change
    expansion to perform a bitwise AND of the argument followed by a
    boolean negation of the result.

*** gcc/testsuite/ChangeLog ***

2018-04-04  Thomas Preud'homme <thomas.preudho...@arm.com>

    PR target/85203
    * gcc.target/arm/cmse/cmse-1.c: Tighten cmse_nonsecure_caller RTL scan
    to match a single insn of the baz function.  Move scan directives at
    the end of the file below the functions they are trying to test for
    better readability.
    * gcc.target/arm/cmse/cmse-16.c: New testcase.

Testing: No bootstrap since only M profile builtin code has been changed
but regression testing for arm-none-eabi targeting Arm Cortex-M23 and
Cortex-M33 shows no regression.

Is this ok for stage4?



Ok, thanks for fixing this.
Does this need backporting to the branches?


Yes to gcc-7-branch only.


The patch applies cleanly on gcc-7-branch and the same testing shows no 
regression. Ok to apply to gcc-7-branch once the patch has baked for 7 days in 
trunk?



Yes, thanks.
Kyrill


Best regards,

Thomas

Re: [PATCH, GCC/ARM] Fix PR85261: ICE with FPSCR setter builtin

2018-04-09 Thread Thomas Preudhomme


Hi Ramana,

On 06/04/18 17:17, Thomas Preudhomme wrote:



On 06/04/18 17:08, Ramana Radhakrishnan wrote:

On 06/04/2018 16:54, Thomas Preudhomme wrote:

Instruction pattern for setting the FPSCR expects the input value to be
in a register. However, __builtin_arm_set_fpscr expander does not ensure
that this is the case and as a result GCC ICEs when the builtin is
called with a constant literal.

This commit fixes the builtin to force the input value into a register.
It also removes the unneeded volatile in the existing fpscr test and
fixes the function prototype.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-04-06  Thomas Preud'homme  <thomas.preudho...@arm.com>

PR target/85261
* config/arm/arm-builtins.c (arm_expand_builtin): Force input operand
into register.

*** gcc/testsuite/ChangeLog ***

2018-04-06  Thomas Preud'homme  <thomas.preudho...@arm.com>

PR target/85261
* gcc.target/arm/fpscr.c: Add call to __builtin_arm_set_fpscr with
literal value.  Expect 2 MCR instructions.  Fix function prototype.
Remove volatile keyword.

Testing: Built an arm-none-eabi GCC cross-compiler and testsuite shows
no regression.

Is this ok for stage4?

Best regards,

Thomas



(sorry about the duplicate for those who get it)


LGTM, though in this case I would prefer a bootstrap and regression run
as this is automatically exercised most with gcc.dg/atomic_*.c and you
really need this tested on linux rather than just bare-metal as I'm not sure
how this gets tested on arm-none-eabi.


Oh it is indeed. Didn't realize it was used anywhere. Will start bootstrap
right away.


Done with --with-arch=armv8-a --with-mode=thumb --with-fpu=neon-vfpv4 
--with-float=hard --enable-languages=c,c++,fortran --with-system-zlib 
--enable-plugins --enable-bootstrap. Testsuite for that GCC does not show any 
regression either.


Ok to commit?





What about earlier branches, have you looked ? This is a silly target
bug and fixes should go back to older branches in this particular case
after baking this on trunk for some time.


GCC 6 and 7 are affected as well and a backport will be done once it has baked 
long enough of course.


Will now bootstrap and regtest against GCC 6 and 7. Will let you know once that 
is finished.


Best regards,

Thomas

Re: [PATCH, GCC/ARM] Fix PR85261: ICE with FPSCR setter builtin

2018-04-06 Thread Thomas Preudhomme




On 06/04/18 17:08, Ramana Radhakrishnan wrote:

On 06/04/2018 16:54, Thomas Preudhomme wrote:

Instruction pattern for setting the FPSCR expects the input value to be
in a register. However, __builtin_arm_set_fpscr expander does not ensure
that this is the case and as a result GCC ICEs when the builtin is
called with a constant literal.

This commit fixes the builtin to force the input value into a register.
It also removes the unneeded volatile in the existing fpscr test and
fixes the function prototype.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-04-06  Thomas Preud'homme  <thomas.preudho...@arm.com>

PR target/85261
* config/arm/arm-builtins.c (arm_expand_builtin): Force input operand
into register.

*** gcc/testsuite/ChangeLog ***

2018-04-06  Thomas Preud'homme  <thomas.preudho...@arm.com>

PR target/85261
* gcc.target/arm/fpscr.c: Add call to __builtin_arm_set_fpscr with
literal value.  Expect 2 MCR instructions.  Fix function prototype.
Remove volatile keyword.

Testing: Built an arm-none-eabi GCC cross-compiler and testsuite shows
no regression.

Is this ok for stage4?

Best regards,

Thomas



(sorry about the duplicate for those who get it)


LGTM, though in this case I would prefer a bootstrap and regression run
as this is automatically exercised most with gcc.dg/atomic_*.c and you
really need this tested on linux rather than just bare-metal as I'm not sure
how this gets tested on arm-none-eabi.


Oh it is indeed. Didn't realize it was used anywhere. Will start bootstrap
right away.




What about earlier branches, have you looked ? This is a silly target
bug and fixes should go back to older branches in this particular case
after baking this on trunk for some time.


GCC 6 and 7 are affected as well and a backport will be done once it has baked 
long enough of course.


Best regards,

Thomas

[PATCH, GCC/ARM] Fix PR85261: ICE with FPSCR setter builtin

2018-04-06 Thread Thomas Preudhomme


Instruction pattern for setting the FPSCR expects the input value to be
in a register. However, __builtin_arm_set_fpscr expander does not ensure
that this is the case and as a result GCC ICEs when the builtin is
called with a constant literal.

This commit fixes the builtin to force the input value into a register.
It also removes the unneeded volatile in the existing fpscr test and
fixes the function prototype.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-04-06  Thomas Preud'homme  

PR target/85261
* config/arm/arm-builtins.c (arm_expand_builtin): Force input operand
into register.

*** gcc/testsuite/ChangeLog ***

2018-04-06  Thomas Preud'homme  

PR target/85261
* gcc.target/arm/fpscr.c: Add call to __builtin_arm_set_fpscr with
literal value.  Expect 2 MCR instructions.  Fix function prototype.
Remove volatile keyword.

Testing: Built an arm-none-eabi GCC cross-compiler and testsuite shows
no regression.

Is this ok for stage4?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 8940d1f6311bccf86664ab2eaa938735eec595f6..e100d933a77c5de4a13cb961d1bff40f57f2ea80 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -2592,7 +2592,7 @@ arm_expand_builtin (tree exp,
 	  icode = CODE_FOR_set_fpscr;
 	  arg0 = CALL_EXPR_ARG (exp, 0);
 	  op0 = expand_normal (arg0);
-	  pat = GEN_FCN (icode) (op0);
+	  pat = GEN_FCN (icode) (force_reg (SImode, op0));
 	}
   emit_insn (pat);
   return target;
diff --git a/gcc/testsuite/gcc.target/arm/fpscr.c b/gcc/testsuite/gcc.target/arm/fpscr.c
index 7b4d71d72d8964f6da0d0604bf59aeb4a895df43..4c3eaf7fcf75ad8582071ecb110fd1e4976a3b24 100644
--- a/gcc/testsuite/gcc.target/arm/fpscr.c
+++ b/gcc/testsuite/gcc.target/arm/fpscr.c
@@ -6,11 +6,14 @@
 /* { dg-add-options arm_fp } */
 
 void
-test_fpscr ()
+test_fpscr (void)
 {
-  volatile unsigned int status = __builtin_arm_get_fpscr ();
+  unsigned status;
+
+  __builtin_arm_set_fpscr (0);
+  status = __builtin_arm_get_fpscr ();
   __builtin_arm_set_fpscr (status);
 }
 
 /* { dg-final { scan-assembler "mrc\tp10, 7, r\[0-9\]+, cr1, cr0, 0" } } */
-/* { dg-final { scan-assembler "mcr\tp10, 7, r\[0-9\]+, cr1, cr0, 0" } } */
+/* { dg-final { scan-assembler-times "mcr\tp10, 7, r\[0-9\]+, cr1, cr0, 0" 2 } } */

Re: [PATCH, GCC/ARM] Fix PR85203: cmse_nonsecure_caller returns wrong result

2018-04-05 Thread Thomas Preudhomme


Hi Kyrill,

On 04/04/18 18:20, Thomas Preudhomme wrote:

Hi Kyrill,

On 04/04/18 18:19, Kyrill Tkachov wrote:

Hi Thomas,

On 04/04/18 18:03, Thomas Preudhomme wrote:

Hi,

__builtin_cmse_nonsecure_caller implementation returns true in almost
all cases due to 2 separate bugs:

* gen_addsi3 is used instead of gen_andsi3 to retrieve the lsb
* the lsb boolean value is not negated but the specification [1] says
   the intrinsic should return true for a nonsecure caller and a
   nonsecure caller is characterized with LR's lsb being 0

This was not caught due to (1) lack of runtime test and (2) the existing
RTL scan not taking into account that '.' matches newline in Tcl regular
expressions.

This patch fixes the implementation issues and improves testing of
cmse_nonsecure_caller by (1) adding a runtime test for the secure caller
case and (2) looking for a SET insn of an AND expression in the right
function. This leaves the nonsecure caller case only partly tested
since the exact value being ANDed and the negation are not covered by the
scan and the existing test infrastructure does not allow two separate
compilations and links to be performed. It is enough though to catch the
current incorrect behavior.

The patch also reorganizes the scan directives in cmse-1.c to more easily
identify what function they are intended to test in the file.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2018-04-04  Thomas Preud'homme <thomas.preudho...@arm.com>

    PR target/85203
    * config/arm/arm-builtins.c (arm_expand_builtin): Change
    expansion to perform a bitwise AND of the argument followed by a
    boolean negation of the result.

*** gcc/testsuite/ChangeLog ***

2018-04-04  Thomas Preud'homme <thomas.preudho...@arm.com>

    PR target/85203
    * gcc.target/arm/cmse/cmse-1.c: Tighten cmse_nonsecure_caller RTL scan
    to match a single insn of the baz function.  Move scan directives at
    the end of the file below the functions they are trying to test for
    better readability.
    * gcc.target/arm/cmse/cmse-16.c: New testcase.

Testing: No bootstrap since only M profile builtin code has been changed
but regression testing for arm-none-eabi targeting Arm Cortex-M23 and
Cortex-M33 shows no regression.

Is this ok for stage4?



Ok, thanks for fixing this.
Does this need backporting to the branches?


Yes to gcc-7-branch only.


The patch applies cleanly on gcc-7-branch and the same testing shows no 
regression. Ok to apply to gcc-7-branch once the patch has baked for 7 days in 
trunk?


Best regards,

Thomas

Re: [PATCH, GCC/ARM] Fix PR85203: cmse_nonsecure_caller returns wrong result

2018-04-04 Thread Thomas Preudhomme


Hi Kyrill,

On 04/04/18 18:19, Kyrill Tkachov wrote:

Hi Thomas,

On 04/04/18 18:03, Thomas Preudhomme wrote:

Hi,

__builtin_cmse_nonsecure_caller implementation returns true in almost
all cases due to 2 separate bugs:

* gen_addsi3 is used instead of gen_andsi3 to retrieve the lsb
* the lsb boolean value is not negated but the specification [1] says
   the intrinsic should return true for a nonsecure caller and a
   nonsecure caller is characterized with LR's lsb being 0

This was not caught due to (1) lack of runtime test and (2) the existing
RTL scan not taking into account that '.' matches newline in Tcl regular
expressions.

This patch fixes the implementation issues and improves testing of
cmse_nonsecure_caller by (1) adding a runtime test for the secure caller
case and (2) looking for a SET insn of an AND expression in the right
function. This leaves the nonsecure caller case only partly tested
since the exact value being ANDed and the negation are not covered by the
scan and the existing test infrastructure does not allow two separate
compilations and links to be performed. It is enough though to catch the
current incorrect behavior.

The patch also reorganizes the scan directives in cmse-1.c to more easily
identify what function they are intended to test in the file.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2018-04-04  Thomas Preud'homme <thomas.preudho...@arm.com>

    PR target/85203
    * config/arm/arm-builtins.c (arm_expand_builtin): Change
    expansion to perform a bitwise AND of the argument followed by a
    boolean negation of the result.

*** gcc/testsuite/ChangeLog ***

2018-04-04  Thomas Preud'homme <thomas.preudho...@arm.com>

    PR target/85203
    * gcc.target/arm/cmse/cmse-1.c: Tighten cmse_nonsecure_caller RTL scan
    to match a single insn of the baz function.  Move scan directives at
    the end of the file below the functions they are trying to test for
    better readability.
    * gcc.target/arm/cmse/cmse-16.c: New testcase.

Testing: No bootstrap since only M profile builtin code has been changed
but regression testing for arm-none-eabi targeting Arm Cortex-M23 and
Cortex-M33 shows no regression.

Is this ok for stage4?



Ok, thanks for fixing this.
Does this need backporting to the branches?


Yes to gcc-7-branch only.

Best regards,

Thomas

Re: [PATCH, GCC/ARM] Fix PR85203: cmse_nonsecure_caller returns wrong result

2018-04-04 Thread Thomas Preudhomme


Oops, forgot the link.

On 04/04/18 18:03, Thomas Preudhomme wrote:

Hi,

__builtin_cmse_nonsecure_caller implementation returns true in almost
all cases due to 2 separate bugs:

* gen_addsi3 is used instead of gen_andsi3 to retrieve the lsb
* the lsb boolean value is not negated but the specification [1] says
   the intrinsic should return true for a nonsecure caller and a
   nonsecure caller is characterized with LR's lsb being 0


[1] 
https://static.docs.arm.com/ecm0359818/10/ECM0359818_armv8m_security_extensions_reqs_on_dev_tools_1_0.pdf


Best regards,

Thomas



This was not caught due to (1) lack of runtime test and (2) the existing
RTL scan not taking into account that '.' matches newline in Tcl regular
expressions.

This patch fixes the implementation issues and improves testing of
cmse_nonsecure_caller by (1) adding a runtime test for the secure caller
case and (2) looking for a SET insn of an AND expression in the right
function. This leaves the nonsecure caller case only partly tested
since the exact value being ANDed and the negation are not covered by the
scan and the existing test infrastructure does not allow two separate
compilations and links to be performed. It is enough though to catch the
current incorrect behavior.

The patch also reorganizes the scan directives in cmse-1.c to more easily
identify what function they are intended to test in the file.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2018-04-04  Thomas Preud'homme  <thomas.preudho...@arm.com>

 PR target/85203
 * config/arm/arm-builtins.c (arm_expand_builtin): Change
 expansion to perform a bitwise AND of the argument followed by a
 boolean negation of the result.

*** gcc/testsuite/ChangeLog ***

2018-04-04  Thomas Preud'homme  <thomas.preudho...@arm.com>

 PR target/85203
 * gcc.target/arm/cmse/cmse-1.c: Tighten cmse_nonsecure_caller RTL scan
 to match a single insn of the baz function.  Move scan directives at
 the end of the file below the functions they are trying to test for
 better readability.
 * gcc.target/arm/cmse/cmse-16.c: New testcase.

Testing: No bootstrap since only M profile builtin code has been changed
but regression testing for arm-none-eabi targeting Arm Cortex-M23 and
Cortex-M33 shows no regression.

Is this ok for stage4?

Best regards,

Thomas

[PATCH, GCC/ARM] Fix PR85203: cmse_nonsecure_caller returns wrong result

2018-04-04 Thread Thomas Preudhomme


Hi,

__builtin_cmse_nonsecure_caller implementation returns true in almost
all cases due to 2 separate bugs:

* gen_addsi3 is used instead of gen_andsi3 to retrieve the lsb
* the lsb boolean value is not negated but the specification [1] says
  the intrinsic should return true for a nonsecure caller and a
  nonsecure caller is characterized with LR's lsb being 0
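
The two bugs can be illustrated with a small host-side model. This is only a
sketch: the function names and the sample LR values are made up for
illustration, and it models the arithmetic of the expansion rather than the
RTL GCC actually emits.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the old expansion: LR + 1.  The result is non-zero (true)
   for nearly every return address, whatever the lsb is.  */
static uint32_t
buggy_nonsecure_caller (uint32_t lr)
{
  return lr + 1u;
}

/* Model of the fixed expansion: isolate the lsb with an AND, then
   negate it, since a nonsecure caller has LR's lsb cleared.  */
static uint32_t
fixed_nonsecure_caller (uint32_t lr)
{
  return (lr & 1u) == 0u;
}

/* Self-check with hypothetical return addresses.  */
static void
check_model (void)
{
  const uint32_t secure_lr = 0x10000421u;    /* lsb set: secure caller */
  const uint32_t nonsecure_lr = 0x00200420u; /* lsb clear: nonsecure caller */

  /* The old expansion cannot tell the two callers apart.  */
  assert (buggy_nonsecure_caller (secure_lr) != 0);
  assert (buggy_nonsecure_caller (nonsecure_lr) != 0);

  /* The fixed expansion returns true only for the nonsecure caller.  */
  assert (fixed_nonsecure_caller (secure_lr) == 0);
  assert (fixed_nonsecure_caller (nonsecure_lr) == 1);
}
```

Calling check_model () from a test program exercises both versions on the
same inputs.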

This was not caught due to (1) lack of runtime test and (2) the existing
RTL scan not taking into account that '.' matches newline in Tcl regular
expressions.
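
The regex pitfall can be reproduced outside Tcl. The sketch below is an
analogy only: it uses POSIX regcomp, where '.' also matches a newline unless
REG_NEWLINE is given, and the helper name and dump text are invented for
illustration. It is not how the testsuite runs the scan.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if PATTERN matches somewhere in TEXT, 0 if it does not and
   -1 if the pattern fails to compile.  When DOT_MATCHES_NEWLINE is 0,
   REG_NEWLINE stops '.' from matching across line boundaries.  */
static int
scan_matches (const char *pattern, const char *text, int dot_matches_newline)
{
  regex_t re;
  int flags = REG_EXTENDED | REG_NOSUB;
  int rc;

  assert (pattern != NULL && text != NULL);
  if (!dot_matches_newline)
    flags |= REG_NEWLINE;
  if (regcomp (&re, pattern, flags) != 0)
    return -1;
  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}

/* Made-up dump fragment: there is no AND expression, only a PLUS, but
   the word "operand" on an earlier line contains "and".  */
static const char sample_dump[] =
  "/* operand setup */\n"
  "(insn 7 (set (reg:SI 110)\n"
  "    (plus:SI (reg:SI 14 lr)\n"
  "        (const_int 1))))\n";
```

With this text, scan_matches ("and.*reg.*const_int 1", sample_dump, 1) finds
a match spanning several lines even though the insn is a PLUS, while passing
0 as the last argument finds none.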

This patch fixes the implementation issues and improves testing of
cmse_nonsecure_caller by (1) adding a runtime test for the secure caller
case and (2) looking for a SET insn of an AND expression in the right
function. This leaves the nonsecure caller case only partly tested
since the exact value being ANDed and the negation are not covered by the
scan and the existing test infrastructure does not allow two separate
compilations and links to be performed. It is enough though to catch the
current incorrect behavior.

The patch also reorganizes the scan directives in cmse-1.c to more easily
identify what function they are intended to test in the file.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2018-04-04  Thomas Preud'homme  

PR target/85203
* config/arm/arm-builtins.c (arm_expand_builtin): Change
expansion to perform a bitwise AND of the argument followed by a
boolean negation of the result.

*** gcc/testsuite/ChangeLog ***

2018-04-04  Thomas Preud'homme  

PR target/85203
* gcc.target/arm/cmse/cmse-1.c: Tighten cmse_nonsecure_caller RTL scan
to match a single insn of the baz function.  Move scan directives at
the end of the file below the functions they are trying to test for
better readability.
* gcc.target/arm/cmse/cmse-16.c: New testcase.

Testing: No bootstrap since only M profile builtin code has been changed
but regression testing for arm-none-eabi targeting Arm Cortex-M23 and
Cortex-M33 shows no regression.

Is this ok for stage4?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 8940d1f6311bccf86664ab2eaa938735eec595f6..184eb2a934308717b6e1054e376487a297f8d5de 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -2600,7 +2600,9 @@ arm_expand_builtin (tree exp,
 case ARM_BUILTIN_CMSE_NONSECURE_CALLER:
   target = gen_reg_rtx (SImode);
   op0 = arm_return_addr (0, NULL_RTX);
-  emit_insn (gen_addsi3 (target, op0, const1_rtx));
+  emit_insn (gen_andsi3 (target, op0, const1_rtx));
+  op1 = gen_rtx_EQ (SImode, target, const0_rtx);
+  emit_insn (gen_cstoresi4 (target, op1, target, const0_rtx));
   return target;
 
 case ARM_BUILTIN_TEXTRMSB:
diff --git a/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c b/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c
index c13272eed683aa06db027cd4646e5fe67817212b..f764153cb17b796ccd0d20abb78d5cf56be52911 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c
@@ -71,6 +71,20 @@ baz (void)
 {
   return cmse_nonsecure_caller ();
 }
+/* { dg-final { scan-assembler "baz:" } } */
+/* { dg-final { scan-assembler "__acle_se_baz:" } } */
+/* { dg-final { scan-assembler-not "\tcmse_nonsecure_caller" } } */
+/* Look for an andsi of 1 with a register in function baz, ie.
+
+;; Function baz
+
+(insn  (set (reg:SI )
+ (and:SI (reg:SI )
+	 (const_int 1 )
+   >
+(insn
+*/
+/* { dg-final { scan-rtl-dump "\n;; Function baz\[^\n\]*\[^(\]+\[^;\]*\n\\(insn \[^(\]+ \\(set \\(reg\[^:\]*:SI \[^)\]+\\)\[^(\]*\\(and:SI \\(reg\[^:\]*:SI \[^)\]+\\)\[^(\]*\\((const_int 1|reg\[^:\]*:SI) \[^)\]+\\)\[^(\]+(\\(nil\\)\[^(\]+)?\\(insn" expand } } */
 
 typedef int __attribute__ ((cmse_nonsecure_call)) (int_nsfunc_t) (void);
 
@@ -86,6 +100,11 @@ qux (int_nsfunc_t * callback)
 {
   fp = cmse_nsfptr_create (callback);
 }
+/* { dg-final { scan-assembler "qux:" } } */
+/* { dg-final { scan-assembler "__acle_se_qux:" } } */
+/* { dg-final { scan-assembler "bic" } } */
+/* { dg-final { scan-assembler "push\t\{r4, r5, r6" } } */
+/* { dg-final { scan-assembler "msr\tAPSR_nzcvq" } } */
 
 int call_callback (void)
 {
@@ -94,13 +113,4 @@ int call_callback (void)
   else
 return default_callback ();
 }
-/* { dg-final { scan-assembler "baz:" } } */
-/* { dg-final { scan-assembler "__acle_se_baz:" } } */
-/* { dg-final { scan-assembler "qux:" } } */
-/* { dg-final { scan-assembler "__acle_se_qux:" } } */
-/* { dg-final { scan-assembler-not "\tcmse_nonsecure_caller" } } */
-/* { dg-final { scan-rtl-dump "and.*reg.*const_int 1" expand } } */
-/* { dg-final { scan-assembler "bic" } } */
-/* { dg-final { scan-assembler "push\t\{r4, r5, r6" } } */
-/* { dg-final { scan-assembler "msr\tAPSR_nzcvq" } } */
 /* { dg-final { scan-assembler-times "bl\\s+__gnu_cmse_nonsecure_call" 1 } } */
diff --git

[arm-embedded][PATCH] Add multilib mapping for -mcpu=cortex-r52

2018-03-15 Thread Thomas Preudhomme


Hi,

Currently -mcpu=cortex-r52 gets assigned the default multilib due to
lack of mapping from -mcpu=cortex-r52 to an -march option. This is
inconsistent with -march=armv8-r which gets the thumb/v7-ar multilib.

This patch adds the appropriate mapping.

ChangeLog entry is as follows:

*** gcc/ChangeLog.arm ***

2018-03-15  Thomas Preud'homme  

* config/arm/t-rmprofile: Add mapping from -mcpu=cortex-r52 to
-march=armv7.

Testing: -mcpu=cortex-r52 -print-multi-directory prints . (i.e. the default
multilib) without the patch with a multilib build but prints the
expected thumb/v7-ar with the patch.

We've decided to apply this patch to the ARM/embedded-7-branch.

Best regards,

Thomas

[arm-embedded][PATCH] Add multilib mapping for -mcpu=cortex-m33+nodsp

2018-03-15 Thread Thomas Preudhomme


Hi,

Currently -mcpu=cortex-m33+nodsp gets assigned the thumb multilib due to
lack of mapping from -mcpu=cortex-m33+nodsp to an -march option. This
leads to link failures because Armv4T Thumb code from the multilib gets
linked with the Armv8-M Mainline code being compiled.

This patch adds the appropriate mapping.

ChangeLog entry is as follows:

*** gcc/ChangeLog.arm ***

2018-03-14  Thomas Preud'homme  

* config/arm/t-rmprofile: Add mapping from -mcpu=cortex-m33+nodsp to
-march=armv8-m.main.

Testing: A hello world fails to link without the patch with a multilib
build but succeeds with the patch.

We've decided to apply this patch to the ARM/embedded-7-branch.

Best regards,

Thomas
diff --git a/gcc/config/arm/t-rmprofile b/gcc/config/arm/t-rmprofile
index a3a24d59fb29b42a36177bd2d2ebfae4e50e5a10..54411795215b8aff90ba9cfb806ec7b33db4caea 100644
--- a/gcc/config/arm/t-rmprofile
+++ b/gcc/config/arm/t-rmprofile
@@ -102,6 +102,7 @@ MULTILIB_MATCHES   += march?armv7e-m=mcpu?cortex-m4
 MULTILIB_MATCHES   += march?armv7e-m=mcpu?cortex-m7
 MULTILIB_MATCHES   += march?armv8-m.base=mcpu?cortex-m23
 MULTILIB_MATCHES   += march?armv8-m.main=mcpu?cortex-m33
+MULTILIB_MATCHES   += march?armv8-m.main=mcpu?cortex-m33+nodsp
 MULTILIB_MATCHES   += march?armv7=mcpu?cortex-r4
 MULTILIB_MATCHES   += march?armv7=mcpu?cortex-r4f
 MULTILIB_MATCHES   += march?armv7=mcpu?cortex-r5

[PATCH, GCC/testsuite] Fix FAIL display for some scan-*-times directives

2018-03-13 Thread Thomas Preudhomme


Hi,

scan-assembler-times and scan-tree-dump-times dejagnu directives show a
different output in the summary files depending on whether they PASS or
FAIL. This means that dg-cmp-results would not show a regression because
it would not see a connection between the two outputs.

The difference comes from the FAIL showing the actual number of times
the pattern was matched, presumably to help debugging. This patch moves
the info regarding the actual number of times the pattern matched into a
separate verbose message. This keeps the test name unchanged between
PASS and FAIL but lets developers have the required debug message with -v.
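
To illustrate why the extra text hides regressions, here is a rough model of
a name-based comparison. It is a sketch under assumptions: summary_line and
same_test are hypothetical helpers, and the comparison is simplified to
"everything after the status prefix must be identical"; the real
dg-cmp-results script is not reproduced here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a summary line for a scan-*-times style check.  With OLD_STYLE
   set, a FAIL gets the "(found N times)" suffix the patch removes.  */
static void
summary_line (char *buf, size_t size, const char *name,
	      int times, int found, int old_style)
{
  assert (buf != NULL && size > 0);
  if (found == times)
    snprintf (buf, size, "PASS: %s %d", name, times);
  else if (old_style)
    snprintf (buf, size, "FAIL: %s %d (found %d times)", name, times, found);
  else
    snprintf (buf, size, "FAIL: %s %d", name, times);
}

/* Return 1 if both summary lines name the same test, i.e. the text
   after the first ':' is identical.  */
static int
same_test (const char *a, const char *b)
{
  const char *name_a = strchr (a, ':');
  const char *name_b = strchr (b, ':');

  if (name_a == NULL || name_b == NULL)
    return 0;
  return strcmp (name_a + 1, name_b + 1) == 0;
}
```

In this model a PASS line and an old-style FAIL line for the same directive
compare as different tests, so the PASS to FAIL transition goes unnoticed,
whereas the new-style FAIL line pairs up with the PASS line.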

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2018-03-13  Thomas Preud'homme  

* lib/scanasm.exp (scan-assembler-times): Move FAIL debug info into a
separate verbose message.
* lib/scandump.exp (scan-dump-times): Likewise.

Testing: Made a modified version of gcc.dg/nand.c and
gcc.dg/torture/pr61772.c to FAIL their scan-assembler-times and
scan-tree-dump-times respective directives. Without the patch
dg-cmp-results does not flag any regression but does with the patch.

Is this ok for stage 4?

Best regards,

Thomas
diff --git a/gcc/testsuite/lib/scanasm.exp b/gcc/testsuite/lib/scanasm.exp
index 3a775b0a812775193cf1181337a5b890cde74133..61e0f3f48aeea5785689c5df7a15dc2ccbc71029 100644
--- a/gcc/testsuite/lib/scanasm.exp
+++ b/gcc/testsuite/lib/scanasm.exp
@@ -266,7 +266,8 @@ proc scan-assembler-times { args } {
 if {$result_count == $times} {
 	pass "$testcase scan-assembler-times $pp_pattern $times"
 } else {
-	fail "$testcase scan-assembler-times $pp_pattern $times (found $result_count times)"
+	verbose -log "$testcase: $pp_pattern found $result_count times"
+	fail "$testcase scan-assembler-times $pp_pattern $times"
 }
 }
 
diff --git a/gcc/testsuite/lib/scandump.exp b/gcc/testsuite/lib/scandump.exp
index 4e3da972ae4ed09c9874eb384daf825e6e2dcde3..be8fbe8b461dc81d5683fe323c0913f678daa1e0 100644
--- a/gcc/testsuite/lib/scandump.exp
+++ b/gcc/testsuite/lib/scandump.exp
@@ -110,7 +110,8 @@ proc scan-dump-times { args } {
 if {$result_count == $times} {
 pass "$testname"
 } else {
-fail "$testname (found $result_count times)"
+	verbose -log "$testcase: pattern found $result_count times"
+fail "$testname"
 }
 }

[PATCH, GCC/testsuite/ARM] Fix copysign_softfloat_1.c option directives

2018-03-01 Thread Thomas Preudhomme


gcc.target/arm/copysign_softfloat_1.c's use of arm_arch_v6t2 in
dg-add-options changes the architecture to -march=armv6t2. Since the test
only requires a Thumb-2 capable architecture, we just need to add -mthumb
on the command line since arm_thumb2_ok guarantees by definition that
doing that is enough to select Thumb-2. This fixes a warning on the
command line when having -mcpu=cortex-m3 in RUNTESTFLAGS for instance.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2018-03-01  Thomas Preud'homme

Re: [PATCH, GCC/testsuite] Fix dump-noaddr dumpbase

2018-03-01 Thread Thomas Preudhomme

Finally committed to gcc-7-branch, sorry for doing this so late. I've merged the 
two commits into one. Patch attached for reference.


Best regards,

Thomas

On 05/12/17 21:26, Mike Stump wrote:

On Dec 5, 2017, at 12:56 PM, Thomas Preudhomme <thomas.preudho...@foss.arm.com> 
wrote:


Thanks, I've tested after the two commits and it works both in tree and out of 
tree. It'll simplify comparing in tree results Vs out of tree for us, thanks a 
lot!

Would you consider a backport to stable branches if nobody complains after a 
week?


Yeah, back port is Ok.

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index b211dec4ffb20359f50bbc695481977282eb0525..b78c5f59bfc1121cf61071e41bd11551a9ab7122 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,12 @@
+2017-02-27  Thomas Preud'homme  <thomas.preudho...@arm.com>
+
+	Backport from mainline
+	2017-12-05  Matthew Gretton-Dann  <matthew.gretton-d...@arm.com>
+	with follow-up r255433 commit.
+
+	* gcc.c-torture/unsorted/dump-noaddr.x: Generate dump files in
+	tmpdir.
+
 2018-02-26  Carl Love  <c...@us.ibm.com>
 
 	Backport from mainline: commit 257747 on 2018-02-16.
diff --git a/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x b/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x
index d14d494570944b2be82c2575204cdbf4b15721ca..e86f36a1861fc4dc46bd449d78403f510ec4b920 100644
--- a/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x
+++ b/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x
@@ -9,14 +9,14 @@ proc dump_compare { src options } {
 
 # loop through all the options
 foreach option $option_list {
-	file delete -force dump1
-	file mkdir dump1
+	file delete -force $tmpdir/dump1
+	file mkdir $tmpdir/dump1
 	c-torture-compile $src "$option $options -dumpbase dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
-	file delete -force dump2
-	file mkdir dump2
+	file delete -force $tmpdir/dump2
+	file mkdir $tmpdir/dump2
 	c-torture-compile $src "$option $options -dumpbase dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
-	foreach dump1 [lsort [glob -nocomplain dump1/*]] {
-	regsub dump1/ $dump1 dump2/ dump2
+	foreach dump1 [lsort [glob -nocomplain $tmpdir/dump1/*]] {
+	set dump2 "$tmpdir/dump2/[file tail $dump1]"
 	set dumptail "gcc.c-torture/unsorted/[file tail $dump1]"
 	regsub {\.\d+((t|r|i)\.[^.]+)$} $dumptail {.*\1} dumptail
 	set tmp [ diff "$dump1" "$dump2" ]
@@ -29,8 +29,8 @@ proc dump_compare { src options } {
 	}
 	}
 }
-file delete -force dump1
-file delete -force dump2
+file delete -force $tmpdir/dump1
+file delete -force $tmpdir/dump2
 }
 
 dump_compare $src $options

[arm-embedded] Allow -mcpu=cortex-m33+nodsp

2018-02-27 Thread Thomas Preudhomme


Hi, we decided to apply the following patch to ARM/embedded-7-branch to
support -mcpu=cortex-m33+nodsp.

DSP instructions are optional for Arm Cortex-M33, yet its -mcpu option
does not allow +nodsp. Users are thus left with using
-march=armv8-m.main -mtune=cortex-m33. This patch creates a new cpu
cortex-m33+nodsp since there is no mechanism on GCC 7 for CPU
extensions. Since GCC passes the -mcpu parameter down to GAS verbatim
and GAS does not support +nodsp for cortex-m33, this patch also
special cases -mcpu=cortex-m33+nodsp in arm_file_start to output a .arch
directive instead of .cpu.

2018-02-26  Thomas Preud'homme  

* config/arm/arm-cpus.in (cortex-m33+nodsp): New CPU.
* config/arm/arm-cpu-cdata.h: Regenerate.
* config/arm/arm-cpu-data.h: Likewise.
* config/arm/arm-cpu.h: Likewise.
* config/arm/arm-tables.opt: Likewise.
* config/arm/arm-tune.md: Likewise.
* config/arm/arm.c (arm_file_start): Special case
-mcpu=cortex-m33+nodsp to emit .arch armv8-m.main instead.
* doc/invoke.texi: Document cortex-m33+nodsp as a valid value for -mcpu
and -mtune.

Testing: Compiled a hello world with -S -mcpu=cortex-m33 and with
-S -mcpu=cortex-m33+nodsp and compared both assembly files. The latter
correctly emits .arch armv8-m.main instead of .cpu cortex-m33.

Best regards,

Thomas
diff --git a/gcc/ChangeLog.arm b/gcc/ChangeLog.arm
index a98ecb028f6800a516f6cd252390ceac1e08911b..e09bd132d224aee511591143d86efff8bb156d60 100644
--- a/gcc/ChangeLog.arm
+++ b/gcc/ChangeLog.arm
@@ -1,3 +1,9 @@
+2018-02-26  Thomas Preud'homme  
+
+	* config/arm/arm-cpus.in (cortex-m33+nodsp): Define.
+	* doc/invoke.texi: Document +nodsp as a valid extension for
+	-mcpu=cortex-m33.
+
 2017-11-23  Thomas Preud'homme  
 
 	Cherry-pick from GCC 7
diff --git a/gcc/config/arm/arm-cpu-cdata.h b/gcc/config/arm/arm-cpu-cdata.h
index 27571c841d928fe9c331006bfc9608c4e75b60d8..f5e34c830ca28196ded0912c230f719a6ff5681e 100644
--- a/gcc/config/arm/arm-cpu-cdata.h
+++ b/gcc/config/arm/arm-cpu-cdata.h
@@ -789,6 +789,13 @@ static const struct arm_arch_core_flag arm_arch_core_flags[] =
 },
   },
   {
+"cortex-m33+nodsp",
+{
+  ISA_ARMv8m_main,
+  isa_nobit
+},
+  },
+  {
 "cortex-r52",
 {
   ISA_ARMv8r,isa_bit_crc32,
diff --git a/gcc/config/arm/arm-cpu-data.h b/gcc/config/arm/arm-cpu-data.h
index e474efa02ed93a93ae00ac2057a9bc841c48b87f..30902ecabc6c72e46e6f6aa1d92b9980fd639dcd 100644
--- a/gcc/config/arm/arm-cpu-data.h
+++ b/gcc/config/arm/arm-cpu-data.h
@@ -1221,6 +1221,17 @@ static const struct processors all_cores[] =
 _v7m_tune
   },
   {
+"cortex-m33+nodsp",
+TARGET_CPU_cortexm33nodsp,
+(TF_LDSCHED),
+"8M_MAIN", BASE_ARCH_8M_MAIN,
+{
+  ISA_ARMv8m_main,
+  isa_nobit
+},
+_v7m_tune
+  },
+  {
 "cortex-r52",
 TARGET_CPU_cortexr52,
 (TF_LDSCHED),
diff --git a/gcc/config/arm/arm-cpu.h b/gcc/config/arm/arm-cpu.h
index 502965081faa625abc93d97559517baf50972e1b..22566495fdf0da0ad75b81a5956eecb898c38684 100644
--- a/gcc/config/arm/arm-cpu.h
+++ b/gcc/config/arm/arm-cpu.h
@@ -130,6 +130,7 @@ enum processor_type
   TARGET_CPU_cortexa73cortexa53,
   TARGET_CPU_cortexm23,
   TARGET_CPU_cortexm33,
+  TARGET_CPU_cortexm33nodsp,
   TARGET_CPU_cortexr52,
   TARGET_CPU_arm_none
 };
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 5f18dfb35687888bc7f642785693f75658a96733..7368a067db92b384f83fdb4a0af6cb77cff4e6f4 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -1090,6 +1090,13 @@ begin cpu cortex-m33
  costs v7m
 end cpu cortex-m33
 
+begin cpu cortex-m33+nodsp
+ cname cortexm33nodsp
+ tune flags LDSCHED
+ architecture armv8-m.main
+ costs v7m
+end cpu cortex-m33+nodsp
+
 # V8 R-profile implementations.
 begin cpu cortex-r52
  cname cortexr52
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index ede44f497edd69390bbbe6de5a913430b546c547..a46bc3c7f8ba6048969bae4d37a7be3c5242ce6a 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -349,6 +349,9 @@ EnumValue
 Enum(processor_type) String(cortex-m33) Value( TARGET_CPU_cortexm33)
 
 EnumValue
+Enum(processor_type) String(cortex-m33+nodsp) Value( TARGET_CPU_cortexm33nodsp)
+
+EnumValue
 Enum(processor_type) String(cortex-r52) Value( TARGET_CPU_cortexr52)
 
 Enum
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index 519c0556fe76a5a391cd268bb50541c77a4596d4..542b7972d21cd3c9986229e91ce0841522e3b52f 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -57,5 +57,5 @@
 	cortexa73,exynosm1,xgene1,
 	cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,
 	cortexa73cortexa53,cortexm23,cortexm33,
-	cortexr52"
+	cortexm33nodsp,cortexr52"
 	(const (symbol_ref "((enum attr_tune) arm_tune)")))
diff --git a/gcc/config/arm/arm.c

[PATCH, arm-embedded] Multilib mapping for Armv8-R

2018-02-27 Thread Thomas Preudhomme


Hi,

We have decided to apply the following patch to the
ARM/embedded-7-branch to provide better multilib for Armv8-R targets.

Because there is no multilib mapping for Armv8-R, the default multilib
built for -march=armv4t with softfloat floating-point arithmetic is
used. This patch maps Armv8-R instead to the existing Armv7 multilibs.
Note that the mapping for single-precision Armv8-R has been left out
because there is no Arm implementation of that architecture variant.

Changelog entry is as follows:

*** gcc/ChangeLog ***

2018-02-26  Thomas Preud'homme  

* config/arm/t-rmprofile: Map Armv8-R and Armv8-R with CRC extension to
Armv7 multilibs.

Testing:

Ran -print-multi-directory for all combinations of
-march=armv8-r/-march=armv8-r+crc with
-mfpu=neon-fp-armv8/crypto-neon-fp-armv8. All gave the expected result. Details
in appendix.

Is this ok for stage4?

Best regards,

Thomas

Appendix: output of -print-multi-directory for all supported Armv8-R
configurations, single-precision FPU excepted.

% for ext in "" +crc; do cmd="arm-none-eabi-gcc -march=armv8-r${ext} 
-mfloat-abi=soft -print-multi-directory" ; echo -n "$cmd: " ; eval $cmd ; done
arm-none-eabi-gcc -march=armv8-r -mfloat-abi=soft -print-multi-directory: 
thumb/v7-ar
arm-none-eabi-gcc -march=armv8-r+crc -mfloat-abi=soft -print-multi-directory: 
thumb/v7-ar


% for ext in "" +crc; do for fpu in neon-fp-armv8 crypto-neon-fp-armv8; do 
cmd="arm-none-eabi-gcc -march=armv8-r${ext} -mfpu=${fpu} -mfloat-abi=softfp 
-print-multi-directory" ; echo -n "$cmd: " ; eval $cmd ; done ; done
arm-none-eabi-gcc -march=armv8-r -mfpu=neon-fp-armv8 -mfloat-abi=softfp 
-print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -march=armv8-r -mfpu=crypto-neon-fp-armv8 -mfloat-abi=softfp 
-print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -march=armv8-r+crc -mfpu=neon-fp-armv8 -mfloat-abi=softfp 
-print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -march=armv8-r+crc -mfpu=crypto-neon-fp-armv8 
-mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp


% for ext in "" +crc; do for fpu in neon-fp-armv8 crypto-neon-fp-armv8; do 
cmd="arm-none-eabi-gcc -march=armv8-r${ext} -mfpu=${fpu} -mfloat-abi=hard 
-print-multi-directory" ; echo -n "$cmd: " ; eval $cmd ; done ; done
arm-none-eabi-gcc -march=armv8-r -mfpu=neon-fp-armv8 -mfloat-abi=hard 
-print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -march=armv8-r -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard 
-print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -march=armv8-r+crc -mfpu=neon-fp-armv8 -mfloat-abi=hard 
-print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -march=armv8-r+crc -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard 
-print-multi-directory: thumb/v7-ar/fpv3/hard


% for ext in "" +crc; do cmd="arm-none-eabi-gcc -mthumb -march=armv8-r${ext} 
-mfpu=${fpu} -mfloat-abi=soft -print-multi-directory" ; echo -n "$cmd: " ; eval 
$cmd ; done
arm-none-eabi-gcc -mthumb -march=armv8-r -mfpu=crypto-neon-fp-armv8 
-mfloat-abi=soft -print-multi-directory: .
arm-none-eabi-gcc -mthumb -march=armv8-r+crc -mfpu=crypto-neon-fp-armv8 
-mfloat-abi=soft -print-multi-directory: .


% for ext in "" +crc; do for fpu in neon-fp-armv8 crypto-neon-fp-armv8; do 
cmd="arm-none-eabi-gcc -mthumb -march=armv8-r${ext} -mfpu=${fpu} 
-mfloat-abi=softfp -print-multi-directory" ; echo -n "$cmd: " ; eval $cmd ; done 
; done
arm-none-eabi-gcc -mthumb -march=armv8-r -mfpu=neon-fp-armv8 -mfloat-abi=softfp 
-print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -mthumb -march=armv8-r -mfpu=crypto-neon-fp-armv8 
-mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -mthumb -march=armv8-r+crc -mfpu=neon-fp-armv8 
-mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -mthumb -march=armv8-r+crc -mfpu=crypto-neon-fp-armv8 
-mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp


% for ext in "" +crc; do for fpu in neon-fp-armv8 crypto-neon-fp-armv8; do 
cmd="arm-none-eabi-gcc -mthumb -march=armv8-r${ext} -mfpu=${fpu} 
-mfloat-abi=hard -print-multi-directory" ; echo -n "$cmd: " ; eval $cmd ; done ; 
done
arm-none-eabi-gcc -mthumb -march=armv8-r -mfpu=neon-fp-armv8 -mfloat-abi=hard 
-print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -mthumb -march=armv8-r -mfpu=crypto-neon-fp-armv8 
-mfloat-abi=hard -print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -mthumb -march=armv8-r+crc -mfpu=neon-fp-armv8 
-mfloat-abi=hard -print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -mthumb -march=armv8-r+crc -mfpu=crypto-neon-fp-armv8 
-mfloat-abi=hard -print-multi-directory: thumb/v7-ar/fpv3/hard
diff --git a/gcc/config/arm/t-rmprofile b/gcc/config/arm/t-rmprofile
index d4bc9fde4c5544812bde4743ccc18d68c1c25132..a3a24d59fb29b42a36177bd2d2ebfae4e50e5a10 100644
--- a/gcc/config/arm/t-rmprofile
+++ b/gcc/config/arm/t-rmprofile
@@ -135,6 +135,8 @@

Re: [PATCH, GCC/ARM] Multilib mapping for Armv8-R

2018-02-16 Thread Thomas Preudhomme

tfp 
-print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+simd+crypto -mfloat-abi=softfp 
-print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+fp.sp+simd+crypto -mfloat-abi=softfp 
-print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+simd+crypto -mfloat-abi=softfp 
-print-multi-directory: thumb/v7+fp/softfp


% for ext in "" +crc +fp.sp +simd +crypto +crc+fp.sp +crc+simd +crc+crypto 
+fp.sp+simd +fp.sp+crypto +simd+crypto +crc+fp.sp+simd +crc+fp.sp+crypto 
+crc+simd+crypto +fp.sp+simd+crypto +crc+fp.sp+simd+crypto ; do 
cmd="arm-none-eabi-gcc -march=armv8-r${ext} -mfloat-abi=hard 
-print-multi-directory" ; echo -n "$cmd: " ; eval $cmd ; done

arm-none-eabi-gcc -march=armv8-r -mfloat-abi=hard -print-multi-directory: .
arm-none-eabi-gcc -march=armv8-r+crc -mfloat-abi=hard -print-multi-directory: .
arm-none-eabi-gcc -march=armv8-r+fp.sp -mfloat-abi=hard -print-multi-directory: 
.
arm-none-eabi-gcc -march=armv8-r+simd -mfloat-abi=hard -print-multi-directory: 
thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+crypto -mfloat-abi=hard -print-multi-directory: 
thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp -mfloat-abi=hard 
-print-multi-directory: .
arm-none-eabi-gcc -march=armv8-r+crc+simd -mfloat-abi=hard 
-print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+crc+crypto -mfloat-abi=hard 
-print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+fp.sp+simd -mfloat-abi=hard 
-print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+fp.sp+crypto -mfloat-abi=hard 
-print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+simd+crypto -mfloat-abi=hard 
-print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+simd -mfloat-abi=hard 
-print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+crypto -mfloat-abi=hard 
-print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+crc+simd+crypto -mfloat-abi=hard 
-print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+fp.sp+simd+crypto -mfloat-abi=hard 
-print-multi-directory: thumb/v7+fp/hard
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+simd+crypto -mfloat-abi=hard 
-print-multi-directory: thumb/v7+fp/hard


On 13/02/18 10:27, Kyrill Tkachov wrote:

Hi Thomas,

On 13/02/18 10:24, Thomas Preudhomme wrote:

Hi,

Because there is no multilib mapping for Armv8-R, the default multilib
targeting -march=armv4t with softfloat floating-point arithmetic is
used. This patch maps Armv8-R instead to the existing Armv7 multilibs.
Note that since there is no single-precision multilib compatible with
the R profile, -march=armv8-r+fp.sp is mapped to -march=armv7, i.e. Armv7
with softfloat floating-point.



Thanks for doing this.


Changelog entry is as follows:

*** gcc/ChangeLog ***

2018-02-12  Thomas Preud'homme <thomas.preudho...@arm.com>

    * config/arm/t-multilib: Map Armv8-R to Armv7 multilibs.

Testing:

Ran -print-multi-directory for all combinations of extensions one can
pass to -march=armv8-r (including no extension, but only considering a
single ordering of extensions). All gave the expected result. Details in
appendix.

Is this ok for stage4?

Best regards,

Thomas

Appendix: output of -print-multi-directory for all extensions available
to -march=armv8-r



Can you please add a representative subset of these as tests to 
gcc.target/arm/multilib.exp.
That way we can have the peace of mind that they have sane mappings as we go 
forward.


<snip, thanks for the results>

diff --git a/gcc/config/arm/t-multilib b/gcc/config/arm/t-multilib
index 2f790097670e1bf81b56b069a6b1582763aab6e9..cd5927a7c9ec053b4d5b9725f7b30daeca3b1aa3 100644

--- a/gcc/config/arm/t-multilib
+++ b/gcc/config/arm/t-multilib
@@ -70,6 +70,7 @@ v8_a_simd_variants    := $(call all_feat_combs, simd crypto)
  v8_1_a_simd_variants    := $(call all_feat_combs, simd crypto)
  v8_2_a_simd_variants    := $(call all_feat_combs, simd fp16 fp16fml crypto dotprod)

  v8_4_a_simd_variants    := $(call all_feat_combs, simd fp16 crypto)
+v8_r_nosimd_variants    := $(call all_feat_combs, crc fp.sp)

  ifneq (,$(HAS_APROFILE))
  include $(srcdir)/config/arm/t-aprofile
@@ -105,6 +106,20 @@ MULTILIB_MATCHES    += march?armv7+fp=march?armv7-r+fp+idiv

  MULTILIB_MATCHES    += $(foreach ARCH, $(all_early_arch), \
   march?armv5te+fp=march?$(ARCH)+fp)
+#
+# Armv8-r: map down onto common v7 code.


Please use Armv8-R.



  +# Note 1: there is no single-precision armv7 multilib so +fp.sp is mapped
+# down to softfloat armv7 (second MULTILIB_MATCHES).
+# Note 2: +fp.sp being a subset of +simd and +crypto, there is no need to
+# consider the combination of +fp.sp with a simd extension since matching
+# is run after canonicalization
+MULTILIB_MATCHES    += march?armv7=march?armv8-r
+MULTILIB_MATCHES    += $(foreach ARCH, $(v8_r_nosimd_var

[PATCH, GCC/ARM] Multilib mapping for Armv8-R

2018-02-13 Thread Thomas Preudhomme


Hi,

Because there is no multilib mapping for Armv8-R, the default multilib
targeting -march=armv4t with softfloat floating-point arithmetic is
used. This patch maps Armv8-R instead to the existing Armv7 multilibs.
Note that since there is no single-precision multilib compatible with
the R profile, -march=armv8-r+fp.sp is mapped to -march=armv7, i.e. Armv7
with softfloat floating-point.
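
The mapping can be pictured with a small C model of the -print-multi-directory results reported in the appendix of this message. This is only a sketch of the observed output, not how GCC resolves multilibs (that is done by the MULTILIB_MATCHES rules in t-multilib); the function name armv8r_multilib_dir is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Illustrative model only: return the multilib directory listed in the
   appendix for an -march=armv8-r extension string EXT (e.g. "+crc+simd")
   and a -mfloat-abi value FLOAT_ABI.  */
const char *
armv8r_multilib_dir (const char *ext, const char *float_abi)
{
  int has_fp;

  assert (ext != NULL && float_abi != NULL);

  /* Only +simd and +crypto select an FP multilib; +fp.sp alone falls
     back to a non-FP one since no single-precision multilib exists.  */
  has_fp = strstr (ext, "simd") != NULL || strstr (ext, "crypto") != NULL;

  if (strcmp (float_abi, "soft") == 0)
    return "thumb/v7/nofp";
  if (strcmp (float_abi, "softfp") == 0)
    return has_fp ? "thumb/v7+fp/softfp" : "thumb/v7/nofp";

  /* -mfloat-abi=hard: without an FP extension the appendix shows ".".  */
  return has_fp ? "thumb/v7+fp/hard" : ".";
}
```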

Changelog entry is as follows:

*** gcc/ChangeLog ***

2018-02-12  Thomas Preud'homme  

* config/arm/t-multilib: Map Armv8-R to Armv7 multilibs.

Testing:

Ran -print-multi-directory for all combinations of extensions one can
pass to -march=armv8-r (including no extension, but only considering a
single ordering of extensions). All gave the expected result. Details in
appendix.

Is this ok for stage4?

Best regards,

Thomas

Appendix: output of -print-multi-directory for all extensions available
to -march=armv8-r

% for ext in "" +crc +fp.sp +simd +crypto +crc+fp.sp +crc+simd +crc+crypto 
+fp.sp+simd +fp.sp+crypto +simd+crypto +crc+fp.sp+simd +crc+fp.sp+crypto 
+crc+simd+crypto +fp.sp+simd+crypto +crc+fp.sp+simd+crypto ; do 
cmd="arm-none-eabi-gcc -march=armv8-r${ext} -mfloat-abi=soft 
-print-multi-directory" ; echo -n "$cmd: " ; eval $cmd ; done
arm-none-eabi-gcc -march=armv8-r -mfloat-abi=soft -print-multi-directory: 
thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc -mfloat-abi=soft -print-multi-directory: 
thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+fp.sp -mfloat-abi=soft -print-multi-directory: 
thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+simd -mfloat-abi=soft -print-multi-directory: 
thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crypto -mfloat-abi=soft -print-multi-directory: 
thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp -mfloat-abi=soft 
-print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+simd -mfloat-abi=soft 
-print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+crypto -mfloat-abi=soft 
-print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+fp.sp+simd -mfloat-abi=soft 
-print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+fp.sp+crypto -mfloat-abi=soft 
-print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+simd+crypto -mfloat-abi=soft 
-print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+simd -mfloat-abi=soft 
-print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+crypto -mfloat-abi=soft 
-print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+simd+crypto -mfloat-abi=soft 
-print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+fp.sp+simd+crypto -mfloat-abi=soft 
-print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+simd+crypto -mfloat-abi=soft 
-print-multi-directory: thumb/v7/nofp


% for ext in "" +crc +fp.sp +simd +crypto +crc+fp.sp +crc+simd +crc+crypto 
+fp.sp+simd +fp.sp+crypto +simd+crypto +crc+fp.sp+simd +crc+fp.sp+crypto 
+crc+simd+crypto +fp.sp+simd+crypto +crc+fp.sp+simd+crypto ; do 
cmd="arm-none-eabi-gcc -march=armv8-r${ext} -mfloat-abi=softfp 
-print-multi-directory" ; echo -n "$cmd: " ; eval $cmd ; done
arm-none-eabi-gcc -march=armv8-r -mfloat-abi=softfp -print-multi-directory: 
thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc -mfloat-abi=softfp -print-multi-directory: 
thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+fp.sp -mfloat-abi=softfp 
-print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+simd -mfloat-abi=softfp -print-multi-directory: 
thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crypto -mfloat-abi=softfp 
-print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp -mfloat-abi=softfp 
-print-multi-directory: thumb/v7/nofp
arm-none-eabi-gcc -march=armv8-r+crc+simd -mfloat-abi=softfp 
-print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+crypto -mfloat-abi=softfp 
-print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+fp.sp+simd -mfloat-abi=softfp 
-print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+fp.sp+crypto -mfloat-abi=softfp 
-print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+simd+crypto -mfloat-abi=softfp 
-print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+simd -mfloat-abi=softfp 
-print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+crypto -mfloat-abi=softfp 
-print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+simd+crypto -mfloat-abi=softfp 
-print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+fp.sp+simd+crypto -mfloat-abi=softfp 
-print-multi-directory: thumb/v7+fp/softfp
arm-none-eabi-gcc -march=armv8-r+crc+fp.sp+simd+crypto -mfloat-abi=softfp 
-print-multi-directory: thumb/v7+fp/softfp


% for ext in "" +crc +fp.sp +simd +crypto +crc+fp.sp +crc+simd +crc+crypto

Re: [PATCH, GCC/testsuite] Fix dump-noaddr dumpbase

2017-12-05 Thread Thomas Preudhomme


Hi Mike,

Thanks, I've tested after the two commits and it works both in tree and out of 
tree. It'll simplify comparing in-tree results with out-of-tree results for us, thanks a lot!


Would you consider a backport to stable branches if nobody complains after a 
week?

Best regards,

Thomas

On 05/12/17 19:27, Mike Stump wrote:

On Dec 5, 2017, at 11:11 AM, Thomas Preudhomme <thomas.preudho...@foss.arm.com> 
wrote:


On 05/12/17 17:54, Andrew Pinski wrote:

On Tue, Dec 5, 2017 at 9:50 AM, Thomas Preudhomme
<thomas.preudho...@foss.arm.com> wrote:

Hi,

The dump-noaddr test FAILs when $tmpdir is not the same as the directory
where runtest is called from. Note that this does not happen when
running make check because tmpdir is set to srcdir.

In that case, file mkdir will create the directory in the current
directory while GCC is invoked from tmpdir and hence -dumpbase looks
for dump1 and dump2 relative to tmpdir. This patch forces dumpbase to
be relative to tmpdir, which will work in all cases.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-12-05  Thomas Preud'homme  <thomas.preudho...@arm.com>

 * gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Set dump base
 relative to tmpdir.

Testing: Successfully ran unsorted.exp via make check and out of tree
testing using runtest from /test with tmpdir set in
/test/site.exp to .

Is this ok for stage3?

https://gcc.gnu.org/ml/gcc-patches/2012-06/msg01752.html
I don't remember where this discussion went last time.
Maybe this time there will be a resolution :).


FWIW, I agree with Matt, creating the dump in tmpdir makes more sense. I think 
his patch can be simplified though because the compiler seems to be invoked 
from tmpdir so it can at least be omitted from the -dumpbase.


Sounds reasonable.  I've added that on top of his patch and checked that in.  
Let us know if it works or not.

Re: [PATCH, GCC/testsuite] Fix dump-noaddr dumpbase

2017-12-05 Thread Thomas Preudhomme




On 05/12/17 17:54, Andrew Pinski wrote:

On Tue, Dec 5, 2017 at 9:50 AM, Thomas Preudhomme
<thomas.preudho...@foss.arm.com> wrote:

Hi,

The dump-noaddr test FAILs when $tmpdir is not the same as the directory
where runtest is called from. Note that this does not happen when
running make check because tmpdir is set to srcdir.

In that case, file mkdir will create the directory in the current
directory while GCC is invoked from tmpdir and hence -dumpbase looks
for dump1 and dump2 relative to tmpdir. This patch forces dumpbase to
be relative to tmpdir, which will work in all cases.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-12-05  Thomas Preud'homme  <thomas.preudho...@arm.com>

 * gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Set dump base
 relative to tmpdir.

Testing: Successfully ran unsorted.exp via make check and out of tree
testing using runtest from /test with tmpdir set in
/test/site.exp to .

Is this ok for stage3?


https://gcc.gnu.org/ml/gcc-patches/2012-06/msg01752.html
I don't remember where this discussion went last time.
Maybe this time there will be a resolution :).


FWIW, I agree with Matt, creating the dump in tmpdir makes more sense. I think 
his patch can be simplified though because the compiler seems to be invoked from 
tmpdir so it can at least be omitted from the -dumpbase.


Best regards,

Thomas

[PATCH, GCC/testsuite] Fix dump-noaddr dumpbase

2017-12-05 Thread Thomas Preudhomme


Hi,

The dump-noaddr test FAILs when $tmpdir is not the same as the directory
where runtest is called from. Note that this does not happen when
running make check because tmpdir is set to srcdir.

In that case, file mkdir will create the directory in the current
directory while GCC is invoked from tmpdir and hence -dumpbase looks
for dump1 and dump2 relative to tmpdir. This patch forces dumpbase to
be relative to tmpdir, which will work in all cases.
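
The underlying idea, anchoring a dump path to one known directory so that it no longer depends on the caller's working directory, can be sketched in C as follows. The test itself is written in Tcl; the helper make_dump_path and the example paths are hypothetical and only illustrate the path construction.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: build "<tmpdir>/<subdir>/<name>" in BUF so that
   the creator and the consumer of the dump directory agree on its
   location.  Returns 0 on success, -1 if the result does not fit.  */
int
make_dump_path (char *buf, size_t len, const char *tmpdir,
		const char *subdir, const char *name)
{
  int n;

  assert (buf != NULL && tmpdir != NULL && subdir != NULL && name != NULL);

  n = snprintf (buf, len, "%s/%s/%s", tmpdir, subdir, name);
  return (n < 0 || (size_t) n >= len) ? -1 : 0;
}
```

A relative "dump1/foo.c" resolves differently for two processes started in different directories, whereas the path built here is the same for both.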

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-12-05  Thomas Preud'homme  

* gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Set dump base
relative to tmpdir.

Testing: Successfully ran unsorted.exp via make check and out of tree
testing using runtest from /test with tmpdir set in
/test/site.exp to .

Is this ok for stage3?

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x b/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x
index d14d494570944b2be82c2575204cdbf4b15721ca..68d6c3e38325cabbdd280ecf05e663dbcda99900 100644
--- a/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x
+++ b/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x
@@ -11,10 +11,10 @@ proc dump_compare { src options } {
 foreach option $option_list {
 	file delete -force dump1
 	file mkdir dump1
-	c-torture-compile $src "$option $options -dumpbase dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
+	c-torture-compile $src "$option $options -dumpbase [pwd]/dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
 	file delete -force dump2
 	file mkdir dump2
-	c-torture-compile $src "$option $options -dumpbase dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
+	c-torture-compile $src "$option $options -dumpbase [pwd]/dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
 	foreach dump1 [lsort [glob -nocomplain dump1/*]] {
 	regsub dump1/ $dump1 dump2/ dump2
 	set dumptail "gcc.c-torture/unsorted/[file tail $dump1]"

[PATCH, GCC/testsuite] Improve fstack_protector effective target

2017-11-30 Thread Thomas Preudhomme


Hi,

Effective target fstack_protector fails to return an error for
newlib-based targets (such as arm-none-eabi targets) which do not
support stack protector. This is due to the test being too simplistic for
stack protection code to be generated by GCC: it does not contain a
local buffer and does not read unknown input.

This commit adds a small local buffer with a copy of the filename to
trigger stack protector code to be generated. The filename is used
instead of the full path so as to ensure the size will fit in the local
buffer.
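
For illustration, here is a standalone C sketch of the same idea: a function with a small local buffer that receives a copy of the file name. It is a hypothetical variant, not the code in the patch: unlike the patch it skips the '/' itself, falls back to the whole string when there is no '/', and checks the length before copying. The name probe_stack_protector is an assumption.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical variant of the probe: copy the last component of PATH
   into a local buffer, giving -fstack-protector a buffer to guard.
   Returns 0 on success, 1 if the name does not fit in the buffer.  */
int
probe_stack_protector (const char *path)
{
  char buf[64];
  const char *slash, *name;

  assert (path != NULL);

  /* Use the text after the last '/', or all of PATH if it has none.  */
  slash = strrchr (path, '/');
  name = slash != NULL ? slash + 1 : path;

  if (strlen (name) >= sizeof buf)
    return 1;

  /* strcpy returns its destination, so this yields 0 on success.  */
  return strcpy (buf, name) == NULL;
}
```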

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-11-28  Thomas Preud'homme  

* lib/target-supports.exp (check_effective_target_fstack_protector):
Copy filename in local buffer to trigger stack protection.

Testing: Ran gcc.dg/pr38616 on arm-none-eabi and arm-linux-gnueabihf,
the former is now UNSUPPORTED while the latter continues to PASS.

Is this ok for stage3?

Best regards,

Thomas
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index d30fd368922713d3695f22710197ce7094c977cd..8aff16a25823ec48e76ad6ad8fdc8db998a45877 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -1064,7 +1064,11 @@ proc check_effective_target_static {} {
 # Return 1 if the target supports -fstack-protector
 proc check_effective_target_fstack_protector {} {
 return [check_runtime fstack_protector {
-	int main (void) { return 0; }
+	#include <string.h>
+	int main (int argc, char *argv[]) {
+	  char buf[64];
+	  return !strcpy (buf, strrchr (argv[0], '/'));
+	}
 } "-fstack-protector"]
 }

[arm-embedded] [PATCH, GCC/LTO, ping] Fix PR69866: LTO with def for weak alias in regular object file

2017-11-28 Thread Thomas Preudhomme


Hi,

We have decided to apply the forwarded patch to the embedded-7-branch to fix an 
ICE when doing partial LTO with weak symbols.


ChangeLog entry is as follows:

2017-11-28  Thomas Preud'homme  

Backport from mainline
2017-06-15  Jan Hubicka  
Thomas Preud'homme  

PR lto/69866
* lto-symtab.c (lto_symtab_merge_symbols): Drop useless definitions
that resolved externally.

Backport from mainline
2017-06-15  Thomas Preud'homme  

PR lto/69866
* gcc.dg/lto/pr69866_0.c: New test.
* gcc.dg/lto/pr69866_1.c: Likewise.


Best regards,

Thomas
--- Begin Message ---
Hi,
I am testing the following. Let me know if it works for you.

Honza

Index: lto/lto-symtab.c
===
--- lto/lto-symtab.c(revision 249213)
+++ lto/lto-symtab.c(working copy)
@@ -952,6 +952,42 @@
  if (tgt)
node->resolve_alias (tgt, true);
}
+ /* If the symbol was preempted outside IR, see if we want to get rid
+of the definition.  */
+ if (node->analyzed
+ && !DECL_EXTERNAL (node->decl)
+ && (node->resolution == LDPR_PREEMPTED_REG
+ || node->resolution == LDPR_RESOLVED_IR
+ || node->resolution == LDPR_RESOLVED_EXEC
+ || node->resolution == LDPR_RESOLVED_DYN))
+   {
+ DECL_EXTERNAL (node->decl) = 1;
+ /* If alias to local symbol was preempted by external definition,
+we know it is not pointing to the local symbol.  Remove it.  */
+ if (node->alias
+ && !node->weakref
+ && !node->transparent_alias
+ && node->get_alias_target ()->binds_to_current_def_p ())
+   {
+ node->alias = false;
+ node->remove_all_references ();
+ node->definition = false;
+ node->analyzed = false;
+ node->cpp_implicit_alias = false;
+   }
+ else if (!node->alias
+  && node->definition
+  && node->get_availability () <= AVAIL_INTERPOSABLE)
+   {
+ if ((cnode = dyn_cast <cgraph_node *> (node)) != NULL)
+   cnode->reset ();
+ else
+   {
+ node->analyzed = node->definition = false;
+ node->remove_all_references ();
+   }
+   }
+   }
 
  if (!(cnode = dyn_cast <cgraph_node *> (node))
  || !cnode->clone_of
--- End Message ---

Re: [PATCH, GCC/ARM] Factor out CMSE register clearing code

2017-11-22 Thread Thomas Preudhomme




On 22/11/17 14:45, Kyrill Tkachov wrote:

Hi Thomas,

On 15/11/17 17:12, Thomas Preudhomme wrote:

Hi,

Functions cmse_nonsecure_call_clear_caller_saved and
cmse_nonsecure_entry_clear_before_return both contain very similar code
to clear registers. What's worse, they differ slightly at times so if a
bug is found in one, careful thought is needed to decide whether the
other function needs fixing too.

This commit addresses the situation by factoring the two pieces of code
into a new function. In doing so the code generated to clear VFP
registers in cmse_nonsecure_call now uses the same sequence as
cmse_nonsecure_entry functions. Test expectations are thus updated
accordingly.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2017-10-24  Thomas Preud'homme <thomas.preudho...@arm.com>

    * config/arm/arm.c (cmse_clear_registers): New function.
    (cmse_nonsecure_call_clear_caller_saved): Replace register clearing
    code by call to cmse_clear_registers.
    (cmse_nonsecure_entry_clear_before_return): Likewise.

*** gcc/testsuite/ChangeLog ***

2017-10-24  Thomas Preud'homme <thomas.preudho...@arm.com>

    * gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: Adapt expectations
    to vmov instructions now generated.
    * gcc.target/arm/cmse/mainline/hard-sp/cmse-7.c: Likewise.
    * gcc.target/arm/cmse/mainline/hard-sp/cmse-8.c: Likewise.
    * gcc.target/arm/cmse/mainline/hard/cmse-13.c: Likewise.
    * gcc.target/arm/cmse/mainline/hard/cmse-7.c: Likewise.
    * gcc.target/arm/cmse/mainline/hard/cmse-8.c: Likewise.

Testing: bootstrapped on arm-linux-gnueabihf and no regression in the
testsuite.

Is this ok for trunk?



This looks mostly ok, but I have a concern from reading the code that I'd like 
some help with...


diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 9b494e9529a4470c18192a4561e03d2f80e90797..22c9add0722974902b2a89b2b0a75759ff8ba37c 100644

--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -16991,6 +16991,128 @@ compute_not_to_clear_mask (tree arg_type, rtx arg_rtx, int regno,

    return not_to_clear_mask;
  }

+/* Clear registers secret before doing a cmse_nonsecure_call or returning from
+   a cmse_nonsecure_entry function.  TO_CLEAR_BITMAP indicates which registers
+   are to be fully cleared, using the value in register CLEARING_REG if more
+   efficient.  The PADDING_BITS_LEN entries array PADDING_BITS_TO_CLEAR gives
+   the bits that needs to be cleared in caller-saved core registers, with
+   SCRATCH_REG used as a scratch register for that clearing.
+
+   NOTE: one of three following assertions must hold:
+   - SCRATCH_REG is a low register
+   - CLEARING_REG is in the set of registers fully cleared (ie. its bit is set
+ in TO_CLEAR_BITMAP)
+   - CLEARING_REG is a low register.  */
+
+static void
+cmse_clear_registers (sbitmap to_clear_bitmap, uint32_t *padding_bits_to_clear,
+  int padding_bits_len, rtx scratch_reg, rtx clearing_reg)
+{
+  bool saved_clearing = false;
+  rtx saved_clearing_reg = NULL_RTX;
+  int i, regno, clearing_regno, minregno = R0_REGNUM, maxregno = minregno - 1;
+

Here minregno becomes 0 and maxregno becomes -1...

+  gcc_assert (arm_arch_cmse);
+
+  if (!bitmap_empty_p (to_clear_bitmap))
+    {
+  minregno = bitmap_first_set_bit (to_clear_bitmap);
+  maxregno = bitmap_last_set_bit (to_clear_bitmap);
+    }


...and here is a path on which maxregno may not be set to a proper register number...



Yes, if the bitmap is empty, i.e. if no bit is set and no register should be cleared.



+
+  for (regno = minregno; regno <= maxregno; regno++)
+    {
+  if (!bitmap_bit_p (to_clear_bitmap, regno))
+    continue;
+

...and here we iterate from minregno (potentially 0) to maxregno (potentially 
-1) which will lead to trouble.

Are there any guarantees that this case will not occur?


It absolutely does occur and that's on purpose. If maxregno is -1 it means there 
is no bit to clear and so it is fine to do nothing.
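
A minimal C sketch of why the empty range is harmless: with minregno at 0 and maxregno at -1 the loop condition is false on entry, so the body never runs. For brevity the sketch uses a plain 32-bit mask in place of GCC's sbitmap, and count_regs_to_clear is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Count the registers a loop of the same shape would visit.  BITMAP is
   a 32-bit mask standing in for the sbitmap (an assumption).  */
int
count_regs_to_clear (uint32_t bitmap)
{
  int minregno = 0, maxregno = minregno - 1;	/* Empty range by default.  */
  int regno, count = 0;

  if (bitmap != 0)
    {
      /* Lowest and highest set bit, like bitmap_first_set_bit and
	 bitmap_last_set_bit in the patch.  */
      for (minregno = 0; !(bitmap & (UINT32_C (1) << minregno)); minregno++)
	;
      for (maxregno = 31; !(bitmap & (UINT32_C (1) << maxregno)); maxregno--)
	;
    }

  /* With an empty bitmap this is "0 <= -1", false on the first test.  */
  for (regno = minregno; regno <= maxregno; regno++)
    {
      if (!(bitmap & (UINT32_C (1) << regno)))
	continue;
      count++;
    }

  return count;
}
```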


Best regards,

Thomas

[PATCH, GCC/ARM] Remove useless variable in CMSE code

2017-11-22 Thread Thomas Preudhomme


Hi,

Functions cmse_nonsecure_call_clear_caller_saved () and
cmse_nonsecure_entry_clear_before_return () use a separate variable
holding a pointer to the first entry of the padding_bits_to_clear array,
which is used when calling function compute_not_to_clear_mask ().  This
does not save space over using &padding_bits_to_clear[0] directly, so this
commit gets rid of it.
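
As an illustration only, the following sketch contrasts the two styles. record_padding is a hypothetical stand-in for a callee that writes through the pointer it receives; it is not the real compute_not_to_clear_mask. Both callers end up with the same array contents, which is why the extra pointer variable buys nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a callee that records padding bits through the
   pointer it is given.  */
void record_padding(uint32_t *padding_bits, uint32_t bits)
{
  assert(padding_bits != NULL);
  *padding_bits |= bits;
}

/* Style before the cleanup: a separate variable holds the address.  */
uint32_t via_pointer_variable(void)
{
  uint32_t padding_bits_to_clear[4] = {0U, 0U, 0U, 0U};
  uint32_t *padding_bits_to_clear_ptr = &padding_bits_to_clear[0];

  record_padding(padding_bits_to_clear_ptr, 0xff00U);
  return padding_bits_to_clear[0];
}

/* Style after the cleanup: the address is taken at the call site.  */
uint32_t via_direct_address(void)
{
  uint32_t padding_bits_to_clear[4] = {0U, 0U, 0U, 0U};

  record_padding(&padding_bits_to_clear[0], 0xff00U);
  return padding_bits_to_clear[0];
}
```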

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-11-08  Thomas Preud'homme  

* config/arm/arm.c (cmse_nonsecure_call_clear_caller_saved): Get rid of
padding_bits_to_clear_ptr.
(cmse_nonsecure_entry_clear_before_return): Likewise.

Testing: Bootstrapped an arm-none-linux-gnueabihf compiler and
regression test does not show any regression.

Committed as obvious.

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 7384b96fea0179334a6010b099df68c8e2a0fc32..bcb708c1b316ea08969e118fb0949b941ff19c27 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -17002,7 +17002,6 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  bool using_r4, first_param = true;
 	  function_args_iterator args_iter;
 	  uint32_t padding_bits_to_clear[4] = {0U, 0U, 0U, 0U};
-	  uint32_t * padding_bits_to_clear_ptr = &padding_bits_to_clear[0];
 
 	  if (!NONDEBUG_INSN_P (insn))
 	continue;
@@ -17086,7 +17085,7 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  to_clear_args_mask
 		= compute_not_to_clear_mask (arg_type, arg_rtx,
 	 REGNO (arg_rtx),
-	 padding_bits_to_clear_ptr);
+	 &padding_bits_to_clear[0]);
 	  if (to_clear_args_mask)
 		{
 		  for (regno = R0_REGNUM; regno <= maxregno; regno++)
@@ -25134,7 +25133,6 @@ cmse_nonsecure_entry_clear_before_return (void)
 {
   int regno, maxregno = TARGET_HARD_FLOAT ? LAST_VFP_REGNUM : IP_REGNUM;
   uint32_t padding_bits_to_clear = 0;
-  uint32_t * padding_bits_to_clear_ptr = &padding_bits_to_clear;
   auto_sbitmap to_clear_bitmap (maxregno + 1);
   tree result_type;
   rtx result_rtl;
@@ -25187,7 +25185,7 @@ cmse_nonsecure_entry_clear_before_return (void)
   gcc_assert (REG_P (result_rtl));
   to_clear_return_mask
 	= compute_not_to_clear_mask (result_type, result_rtl, 0,
- padding_bits_to_clear_ptr);
+ &padding_bits_to_clear);
   if (to_clear_return_mask)
 	{
 	  gcc_assert ((unsigned) maxregno < sizeof (long long) * __CHAR_BIT__);

Re: [PATCH, GCC/ARM] Use bitmap to control cmse_nonsecure_call register clearing

2017-11-22 Thread Thomas Preudhomme


Thanks Kyrill.

Committed the attached rebased patch (same patch but without the last hunk 
because a better fix was done in an earlier commit).


Best regards,

Thomas

On 22/11/17 11:57, Kyrill Tkachov wrote:

Hi Thomas,

On 15/11/17 17:08, Thomas Preudhomme wrote:

Hi,

As part of r253256, cmse_nonsecure_entry_clear_before_return has been
rewritten to use auto_sbitmap instead of an integer bitfield to control
which registers need to be cleared. This commit continues this work in
cmse_nonsecure_call_clear_caller_saved.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-10-16  Thomas Preud'homme <thomas.preudho...@arm.com>

    * config/arm/arm.c (cmse_nonsecure_call_clear_caller_saved): Use
    auto_sbitmap instead of integer bitfield to control register needing
    clearing.

Testing: bootstrapped on arm-linux-gnueabihf and no regression in the
testsuite.

Is this ok for trunk?



Ok for trunk.
Thanks for this conversion. It's much easier to understand the code
without having to think about the bitmasks and shifts.

Kyrill


Best regards,

Thomas


diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 106e3edce0d6f2518eb391c436c5213a78d1275b..092cd61d49382101bce9b8c5f04de31965dcdc77 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -17007,10 +17007,11 @@ cmse_nonsecure_call_clear_caller_saved (void)
 
   FOR_BB_INSNS (bb, insn)
 	{
-	  uint64_t to_clear_mask, float_mask;
+	  unsigned address_regnum, regno, maxregno =
+	TARGET_HARD_FLOAT_ABI ? D7_VFP_REGNUM : NUM_ARG_REGS - 1;
+	  auto_sbitmap to_clear_bitmap (maxregno + 1);
 	  rtx_insn *seq;
 	  rtx pat, call, unspec, reg, cleared_reg, tmp;
-	  unsigned int regno, maxregno;
 	  rtx address;
 	  CUMULATIVE_ARGS args_so_far_v;
 	  cumulative_args_t args_so_far;
@@ -17041,18 +17042,21 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	continue;
 
 	  /* Determine the caller-saved registers we need to clear.  */
-	  to_clear_mask = (1LL << (NUM_ARG_REGS)) - 1;
-	  maxregno = NUM_ARG_REGS - 1;
+	  bitmap_clear (to_clear_bitmap);
+	  bitmap_set_range (to_clear_bitmap, R0_REGNUM, NUM_ARG_REGS);
+
 	  /* Only look at the caller-saved floating point registers in case of
 	 -mfloat-abi=hard.  For -mfloat-abi=softfp we will be using the
 	 lazy store and loads which clear both caller- and callee-saved
 	 registers.  */
 	  if (TARGET_HARD_FLOAT_ABI)
 	{
-	  float_mask = (1LL << (D7_VFP_REGNUM + 1)) - 1;
-	  float_mask &= ~((1LL << FIRST_VFP_REGNUM) - 1);
-	  to_clear_mask |= float_mask;
-	  maxregno = D7_VFP_REGNUM;
+	  auto_sbitmap float_bitmap (maxregno + 1);
+
+	  bitmap_clear (float_bitmap);
+	  bitmap_set_range (float_bitmap, FIRST_VFP_REGNUM,
+D7_VFP_REGNUM - FIRST_VFP_REGNUM + 1);
+	  bitmap_ior (to_clear_bitmap, to_clear_bitmap, float_bitmap);
 	}
 
 	  /* Make sure the register used to hold the function address is not
@@ -17060,7 +17064,9 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  address = RTVEC_ELT (XVEC (unspec, 0), 0);
 	  gcc_assert (MEM_P (address));
 	  gcc_assert (REG_P (XEXP (address, 0)));
-	  to_clear_mask &= ~(1LL << REGNO (XEXP (address, 0)));
+	  address_regnum = REGNO (XEXP (address, 0));
+	  if (address_regnum < R0_REGNUM + NUM_ARG_REGS)
+	bitmap_clear_bit (to_clear_bitmap, address_regnum);
 
 	  /* Set basic block of call insn so that df rescan is performed on
 	 insns inserted here.  */
@@ -17081,6 +17087,7 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  FOREACH_FUNCTION_ARGS (fntype, arg_type, args_iter)
 	{
 	  rtx arg_rtx;
+	  uint64_t to_clear_args_mask;
 	  machine_mode arg_mode = TYPE_MODE (arg_type);
 
 	  if (VOID_TYPE_P (arg_type))
@@ -17093,10 +17100,18 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  arg_rtx = arm_function_arg (args_so_far, arg_mode, arg_type,
 	  true);
 	  gcc_assert (REG_P (arg_rtx));
-	  to_clear_mask
-		&= ~compute_not_to_clear_mask (arg_type, arg_rtx,
-	   REGNO (arg_rtx),
-	   padding_bits_to_clear_ptr);
+	  to_clear_args_mask
+		= compute_not_to_clear_mask (arg_type, arg_rtx,
+	 REGNO (arg_rtx),
+	 padding_bits_to_clear_ptr);
+	  if (to_clear_args_mask)
+		{
+		  for (regno = R0_REGNUM; regno <= maxregno; regno++)
+		{
+		  if (to_clear_args_mask & (1ULL << regno))
+			bitmap_clear_bit (to_clear_bitmap, regno);
+		}
+		}
 
 	  first_param = false;
 	}
@@ -17155,7 +17170,7 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	 call.  */
 	  for (regno = R0_REGNUM; regno <= maxregno; regno++)
 	{
-	  if (!(to_clear_mask & (1LL << regno)))
+	  if (!bitmap_bit_p (to_clear_bitmap, regno))
 		continue;
 
 	  /* If regno is an even vfp register and its successor is also to
@@ -17164,7 +17179,7 @@ cmse_nonsecure_call_clear_caller_saved (void)
 		{
 		  if (TARGET_VFP_DOUBLE
 		  &&

Re: [PATCH] Use bswap framework in store-merging (PR tree-optimization/78821)

2017-11-17 Thread Thomas Preudhomme


Hi Jakub,

On 16/11/17 17:06, Jakub Jelinek wrote:

Hi!

This patch uses the bswap pass framework inside of the store merging
pass to handle adjacent stores which produce together a 16/32/64 bit
store of bswapped value (loaded or from SSA_NAME) or identity (usually
only from SSA_NAME, the code prefers to use the existing store merging
code if coming from identity load, because it e.g. can handle arbitrary
sizes, not just 16/32/64 bits).

There are small tweaks to the bswap code to make it usable inside of
the store merging pass.  Then when processing the stores, we record
what find_bswap_or_nop_1 returns and do a small sanity check on it,
and when doing coalesce_immediate_stores (i.e. the splitting into
groups), we try for 64-bit, 32-bit and 16-bit sizes if we can extend/shift
(according to endianity) and perform_symbolic_merge them together.
If it is possible, we turn those 2+ adjacent stores that make together
{64,32,16} bits into a separate group and process it specially later
(we need to treat it as a single store rather than multiple, so
split_group is only very lightweight for that case).
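
For readers unfamiliar with the pattern, here is a small sketch of the kind of source code this targets. It is an illustration under assumptions, not code from the patch: the function names are made up, and __builtin_bswap32 is the GCC/Clang builtin. Four adjacent byte stores write a 32-bit value most significant byte first; the same bytes can be produced by one 32-bit store of either the byte-swapped value (little-endian host, the bswap case) or the value itself (big-endian host, the identity case).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Four adjacent byte stores that write VAL most significant byte first.  */
void store_be32_bytes(unsigned char *p, uint32_t val)
{
  assert(p != NULL);
  p[0] = (unsigned char) (val >> 24);
  p[1] = (unsigned char) (val >> 16);
  p[2] = (unsigned char) (val >> 8);
  p[3] = (unsigned char) val;
}

/* Returns nonzero when the lowest-addressed byte of an integer is its least
   significant byte.  */
int host_is_little_endian(void)
{
  const uint16_t probe = 1;
  unsigned char first;

  memcpy(&first, &probe, 1);
  return first == 1;
}

/* Hand-written equivalent using a single 32-bit store: a byte swap on a
   little-endian host, the identity on a big-endian one.  */
void store_be32_merged(unsigned char *p, uint32_t val)
{
  uint32_t out = host_is_little_endian() ? __builtin_bswap32(val) : val;

  assert(p != NULL);
  memcpy(p, &out, sizeof out);
}
```

Both functions leave the same four bytes in memory; the second form is roughly what a merged store would look like if written by hand.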


Nice, the two finally merged! I took a look at the bswap part and it all looked 
good to me code and comment wise. I only have one small nit regarding a 
space/tab change (see below).




Bootstrapped/regtested on {x86_64,i686,powerpc64le,powerpc64}-linux, ok for 
trunk?

The cases this patch can handle are less common than rhs_code INTEGER_CST
(stores of constants to adjacent memory) or MEM_REF (adjacent memory
copying), but are more common than the bitwise ops, during combined
x86_64+i686 bootstraps/regtests it triggered:
lrotate_expr  974   2528
nop_expr  720   1711
(lrotate_expr stands for bswap, nop_expr for identity, the first column is
the actual count of such new stores, the second is the original number of
stores that have been optimized this way).


Are you saying that lrotate_expr is just the title and it also includes 32- and 
64-bit bswap or is it only the count of lrotate_expr nodes?




2017-11-16  Jakub Jelinek  

PR tree-optimization/78821
* gimple-ssa-store-merging.c (find_bswap_or_nop_load): Give up
if base is TARGET_MEM_REF.  If base is not MEM_REF, set base_addr
to the address of the base rather than the base itself.
(find_bswap_or_nop_1): Just use pointer comparison for vuse check.
(find_bswap_or_nop_finalize): New function.
(find_bswap_or_nop): Use it.
(bswap_replace): Return a tree rather than bool, change first
argument from gimple * to gimple_stmt_iterator, allow inserting
into an empty sequence, allow ins_stmt to be NULL - then emit
all stmts into gsi.  Fix up MEM_REF address gimplification.
(pass_optimize_bswap::execute): Adjust bswap_replace caller.
Formatting fix.
(struct store_immediate_info): Add N and INS_STMT non-static
data members.
(store_immediate_info::store_immediate_info): Initialize them
from newly added ctor args.
(merged_store_group::apply_stores): Formatting fixes.  Sort by
bitpos at the end.
(stmts_may_clobber_ref_p): For stores call also
refs_anti_dependent_p.
(gather_bswap_load_refs): New function.
(imm_store_chain_info::try_coalesce_bswap): New method.
(imm_store_chain_info::coalesce_immediate_stores): Use it.
(split_group): Handle LROTATE_EXPR and NOP_EXPR rhs_code specially.
(imm_store_chain_info::output_merged_store): Fail if number of
new estimated stmts is bigger or equal than old.  Handle LROTATE_EXPR
and NOP_EXPR rhs_code.
(pass_store_merging::process_store): Compute n and ins_stmt, if
ins_stmt is non-NULL and the store rhs is otherwise invalid, use
LROTATE_EXPR rhs_code.  Pass n and ins_stmt to store_immediate_info
ctor.
(pass_store_merging::execute): Calculate dominators.

* gcc.dg/store_merging_16.c: New test.

--- gcc/gimple-ssa-store-merging.c.jj   2017-11-16 10:45:09.239185205 +0100
+++ gcc/gimple-ssa-store-merging.c  2017-11-16 15:34:08.560080214 +0100
@@ -369,7 +369,10 @@ find_bswap_or_nop_load (gimple *stmt, tr
base_addr = get_inner_reference (ref, , , , ,
   , , );
  
-  if (TREE_CODE (base_addr) == MEM_REF)

+  if (TREE_CODE (base_addr) == TARGET_MEM_REF)
+/* Do not rewrite TARGET_MEM_REF.  */
+return false;
+  else if (TREE_CODE (base_addr) == MEM_REF)
  {
offset_int bit_offset = 0;
tree off = TREE_OPERAND (base_addr, 1);
@@ -401,6 +404,8 @@ find_bswap_or_nop_load (gimple *stmt, tr
  
bitpos += bit_offset.to_shwi ();

  }
+  else
+base_addr = build_fold_addr_expr (base_addr);
  
if (bitpos % BITS_PER_UNIT)

  return false;
@@ -743,8 +748,7 @@ find_bswap_or_nop_1 (gimple *stmt, struc
  if (TYPE_PRECISION (n1.type) != TYPE_PRECISION (n2.type))
return

[PATCH, GCC/ARM] Do not clobber r4 in Armv8-M nonsecure call

2017-11-15 Thread Thomas Preudhomme


Hi,

Expanders for Armv8-M nonsecure call unnecessarily clobber r4 despite
the libcall they perform not writing to r4.  Furthermore, the
requirement for the branch target address to be in r4 as expected by
the libcall is modeled in a convoluted way in the define_insn patterns:
the address is a register match_operand constrained by the match_dup
for the clobber which is guaranteed to be r4 due to the expander.

This patch simplifies all this by simply requiring the address to be in
r4 and removing the clobbers. Expanders are left alone because
cmse_nonsecure_call_clear_caller_saved relies on branch target memory
attributes which would be lost if expanding to reg:SI R4_REGNUM.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-10-24  Thomas Preud'homme  

* config/arm/arm.md (R4_REGNUM): Define constant.
(nonsecure_call_internal): Remove r4 clobber.
(nonsecure_call_value_internal): Likewise.
* config/arm/thumb1.md (nonsecure_call_reg_thumb1_v5): Remove second
clobber and resequence match_operands.
(nonsecure_call_value_reg_thumb1_v5): Likewise.
* config/arm/thumb2.md (nonsecure_call_reg_thumb2): Likewise.
(nonsecure_call_value_reg_thumb2): Likewise.

Testing: Bootstrapped on arm-linux-gnueabihf and testsuite shows no
regression.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index ddb9d8f359007c1d86d497aef0ff5fc0e4061813..6b0794ede9fbc5a4f41e1f4a92acb9b649a277bc 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -30,6 +30,7 @@
 (define_constants
   [(R0_REGNUM 0)	; First CORE register
(R1_REGNUM	  1)	; Second CORE register
+   (R4_REGNUM	  4)	; Fifth CORE register
(IP_REGNUM	 12)	; Scratch register
(SP_REGNUM	 13)	; Stack pointer
(LR_REGNUM	 14)	; Return address register
@@ -8118,14 +8119,13 @@
 			   UNSPEC_NONSECURE_MEM)
 		(match_operand 1 "general_operand" ""))
 	  (use (match_operand 2 "" ""))
-	  (clobber (reg:SI LR_REGNUM))
-	  (clobber (reg:SI 4))])]
+	  (clobber (reg:SI LR_REGNUM))])]
   "use_cmse"
   "
   {
 rtx tmp;
 tmp = copy_to_suggested_reg (XEXP (operands[0], 0),
- gen_rtx_REG (SImode, 4),
+ gen_rtx_REG (SImode, R4_REGNUM),
  SImode);
 
 operands[0] = replace_equiv_address (operands[0], tmp);
@@ -8210,14 +8210,13 @@
 UNSPEC_NONSECURE_MEM)
 			 (match_operand 2 "general_operand" "")))
 	  (use (match_operand 3 "" ""))
-	  (clobber (reg:SI LR_REGNUM))
-	  (clobber (reg:SI 4))])]
+	  (clobber (reg:SI LR_REGNUM))])]
   "use_cmse"
   "
   {
 rtx tmp;
 tmp = copy_to_suggested_reg (XEXP (operands[1], 0),
- gen_rtx_REG (SImode, 4),
+ gen_rtx_REG (SImode, R4_REGNUM),
  SImode);
 
 operands[1] = replace_equiv_address (operands[1], tmp);
diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
index 5d196a673355a7acf7d0ed30f21b997b815913f5..f91659386bf240172bd9a3076722683c8a50dff4 100644
--- a/gcc/config/arm/thumb1.md
+++ b/gcc/config/arm/thumb1.md
@@ -1732,12 +1732,11 @@
 )
 
 (define_insn "*nonsecure_call_reg_thumb1_v5"
-  [(call (unspec:SI [(mem:SI (match_operand:SI 0 "register_operand" "l*r"))]
+  [(call (unspec:SI [(mem:SI (reg:SI R4_REGNUM))]
 		UNSPEC_NONSECURE_MEM)
-	 (match_operand 1 "" ""))
-   (use (match_operand 2 "" ""))
-   (clobber (reg:SI LR_REGNUM))
-   (clobber (match_dup 0))]
+	 (match_operand 0 "" ""))
+   (use (match_operand 1 "" ""))
+   (clobber (reg:SI LR_REGNUM))]
   "TARGET_THUMB1 && use_cmse && !SIBLING_CALL_P (insn)"
   "bl\\t__gnu_cmse_nonsecure_call"
   [(set_attr "length" "4")
@@ -1779,12 +1778,11 @@
 (define_insn "*nonsecure_call_value_reg_thumb1_v5"
   [(set (match_operand 0 "" "")
 	(call (unspec:SI
-	   [(mem:SI (match_operand:SI 1 "register_operand" "l*r"))]
+	   [(mem:SI (reg:SI R4_REGNUM))]
 	   UNSPEC_NONSECURE_MEM)
-	  (match_operand 2 "" "")))
-   (use (match_operand 3 "" ""))
-   (clobber (reg:SI LR_REGNUM))
-   (clobber (match_dup 1))]
+	  (match_operand 1 "" "")))
+   (use (match_operand 2 "" ""))
+   (clobber (reg:SI LR_REGNUM))]
   "TARGET_THUMB1 && use_cmse"
   "bl\\t__gnu_cmse_nonsecure_call"
   [(set_attr "length" "4")
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index 776d611d2538e790a5f504995050ffdfc51d7193..d56a8bd167575263edc2a4b3f66bda34a4a7a72a 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -555,12 +555,11 @@
 )
 
 (define_insn "*nonsecure_call_reg_thumb2"
-  [(call (unspec:SI [(mem:SI (match_operand:SI 0 "s_register_operand" "r"))]
+  [(call (unspec:SI [(mem:SI (reg:SI R4_REGNUM))]
 		UNSPEC_NONSECURE_MEM)
-	 (match_operand 1 "" ""))
-   (use (match_operand 2 "" ""))
-   (clobber (reg:SI LR_REGNUM))
-   (clobber (match_dup 0))]
+	 (match_operand 0 "" ""))
+   (use (match_operand 1 "" ""))
+   (clobber (reg:SI LR_REGNUM))]
   "TARGET_THUMB2 && use_cmse"

[PATCH, GCC/ARM] Factor out CMSE register clearing code

2017-11-15 Thread Thomas Preudhomme


Hi,

Functions cmse_nonsecure_call_clear_caller_saved and
cmse_nonsecure_entry_clear_before_return both contain very similar code
to clear registers. What's worse, they differ slightly at times, so if a
bug is found in one, careful thought is needed to decide whether the
other function needs fixing too.

This commit addresses the situation by factoring the two pieces of code
into a new function. In doing so the code generated to clear VFP
registers in cmse_nonsecure_call now uses the same sequence as
cmse_nonsecure_entry functions. Test expectations are thus updated
accordingly.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2017-10-24  Thomas Preud'homme  

* config/arm/arm.c (cmse_clear_registers): New function.
(cmse_nonsecure_call_clear_caller_saved): Replace register clearing
code by call to cmse_clear_registers.
(cmse_nonsecure_entry_clear_before_return): Likewise.

*** gcc/testsuite/ChangeLog ***

2017-10-24  Thomas Preud'homme  

* gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: Adapt expectations
to vmov instructions now generated.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-8.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-8.c: Likewise.

Testing: bootstrapped on arm-linux-gnueabihf and no regression in the
testsuite.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 9b494e9529a4470c18192a4561e03d2f80e90797..22c9add0722974902b2a89b2b0a75759ff8ba37c 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -16991,6 +16991,128 @@ compute_not_to_clear_mask (tree arg_type, rtx arg_rtx, int regno,
   return not_to_clear_mask;
 }
 
+/* Clear registers secret before doing a cmse_nonsecure_call or returning from
+   a cmse_nonsecure_entry function.  TO_CLEAR_BITMAP indicates which registers
+   are to be fully cleared, using the value in register CLEARING_REG if more
+   efficient.  The PADDING_BITS_LEN entries array PADDING_BITS_TO_CLEAR gives
+   the bits that needs to be cleared in caller-saved core registers, with
+   SCRATCH_REG used as a scratch register for that clearing.
+
+   NOTE: one of three following assertions must hold:
+   - SCRATCH_REG is a low register
+   - CLEARING_REG is in the set of registers fully cleared (ie. its bit is set
+ in TO_CLEAR_BITMAP)
+   - CLEARING_REG is a low register.  */
+
+static void
+cmse_clear_registers (sbitmap to_clear_bitmap, uint32_t *padding_bits_to_clear,
+		  int padding_bits_len, rtx scratch_reg, rtx clearing_reg)
+{
+  bool saved_clearing = false;
+  rtx saved_clearing_reg = NULL_RTX;
+  int i, regno, clearing_regno, minregno = R0_REGNUM, maxregno = minregno - 1;
+
+  gcc_assert (arm_arch_cmse);
+
+  if (!bitmap_empty_p (to_clear_bitmap))
+{
+  minregno = bitmap_first_set_bit (to_clear_bitmap);
+  maxregno = bitmap_last_set_bit (to_clear_bitmap);
+}
+  clearing_regno = REGNO (clearing_reg);
+
+  /* Clear padding bits.  */
+  gcc_assert (padding_bits_len <= NUM_ARG_REGS);
+  for (i = 0, regno = R0_REGNUM; i < padding_bits_len; i++, regno++)
+{
+  uint64_t mask;
+  rtx rtx16, dest, cleared_reg = gen_rtx_REG (SImode, regno);
+
+  if (padding_bits_to_clear[i] == 0)
+	continue;
+
+  /* If this is a Thumb-1 target and SCRATCH_REG is not a low register, use
+	 CLEARING_REG as scratch.  */
+  if (TARGET_THUMB1
+	  && REGNO (scratch_reg) > LAST_LO_REGNUM)
+	{
+	  /* clearing_reg is not to be cleared, copy its value into scratch_reg
+	 such that we can use clearing_reg to clear the unused bits in the
+	 arguments.  */
+	  if ((clearing_regno > maxregno
+	   || !bitmap_bit_p (to_clear_bitmap, clearing_regno))
+	  && !saved_clearing)
+	{
+	  gcc_assert (clearing_regno <= LAST_LO_REGNUM);
+	  emit_move_insn (scratch_reg, clearing_reg);
+	  saved_clearing = true;
+	  saved_clearing_reg = scratch_reg;
+	}
+	  scratch_reg = clearing_reg;
+	}
+
+  /* Fill the lower half of the negated padding_bits_to_clear[i].  */
+  mask = (~padding_bits_to_clear[i]) & 0xFFFF;
+  emit_move_insn (scratch_reg, gen_int_mode (mask, SImode));
+
+  /* Fill the top half of the negated padding_bits_to_clear[i].  */
+  mask = (~padding_bits_to_clear[i]) >> 16;
+  rtx16 = gen_int_mode (16, SImode);
+  dest = gen_rtx_ZERO_EXTRACT (SImode, scratch_reg, rtx16, rtx16);
+  if (mask)
+	emit_insn (gen_rtx_SET (dest, gen_int_mode (mask, SImode)));
+
+  emit_insn (gen_andsi3 (cleared_reg, cleared_reg, scratch_reg));
+}
+  if (saved_clearing)
+emit_move_insn (clearing_reg, saved_clearing_reg);
+
+
+  /* Clear full registers.  */
+
+  /* If not marked for clearing, clearing_reg already does not

[PATCH, GCC/ARM] Use bitmap to control cmse_nonsecure_call register clearing

2017-11-15 Thread Thomas Preudhomme


Hi,

As part of r253256, cmse_nonsecure_entry_clear_before_return has been
rewritten to use auto_sbitmap instead of an integer bitfield to control
which registers need to be cleared. This commit continues this work in
cmse_nonsecure_call_clear_caller_saved.
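
The general idea of replacing shift-and-mask arithmetic with named bitmap operations can be sketched as follows. This is a toy stand-in written for illustration: toy_bitmap and its functions are hypothetical and only loosely mirror the names of GCC's sbitmap operations (bitmap_set_range, bitmap_clear_bit, bitmap_bit_p); they are not GCC's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_BITMAP_MAX_BITS 128u

/* Hypothetical fixed-capacity bitmap, one bit per register number.  */
typedef struct
{
  unsigned nbits;
  unsigned char bytes[TOY_BITMAP_MAX_BITS / 8];
} toy_bitmap;

/* Make MAP an all-zero bitmap of NBITS bits.  */
void toy_bitmap_init(toy_bitmap *map, unsigned nbits)
{
  assert(nbits <= TOY_BITMAP_MAX_BITS);
  map->nbits = nbits;
  memset(map->bytes, 0, sizeof map->bytes);
}

/* Set COUNT consecutive bits starting at bit START.  */
void toy_bitmap_set_range(toy_bitmap *map, unsigned start, unsigned count)
{
  unsigned bit;

  assert(start + count <= map->nbits);
  for (bit = start; bit < start + count; bit++)
    map->bytes[bit / 8] |= (unsigned char) (1u << (bit % 8));
}

/* Clear a single bit.  */
void toy_bitmap_clear_bit(toy_bitmap *map, unsigned bit)
{
  assert(bit < map->nbits);
  map->bytes[bit / 8] &= (unsigned char) ~(1u << (bit % 8));
}

/* Test a single bit.  */
bool toy_bitmap_bit_p(const toy_bitmap *map, unsigned bit)
{
  assert(bit < map->nbits);
  return (map->bytes[bit / 8] >> (bit % 8)) & 1u;
}
```

With such an interface, "mark the first four registers, then exclude one of them" reads as a set_range call followed by a clear_bit call, with no explicit shifts at the call sites.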

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-10-16  Thomas Preud'homme  

* config/arm/arm.c (cmse_nonsecure_call_clear_caller_saved): Use
auto_sbitmap instead of integer bitfield to control register needing
clearing.

Testing: bootstrapped on arm-linux-gnueabihf and no regression in the
testsuite.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 9919f54242d9317125a104f9777d76a85de80e9b..7384b96fea0179334a6010b099df68c8e2a0fc32 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -16990,10 +16990,11 @@ cmse_nonsecure_call_clear_caller_saved (void)
 
   FOR_BB_INSNS (bb, insn)
 	{
-	  uint64_t to_clear_mask, float_mask;
+	  unsigned address_regnum, regno, maxregno =
+	TARGET_HARD_FLOAT_ABI ? D7_VFP_REGNUM : NUM_ARG_REGS - 1;
+	  auto_sbitmap to_clear_bitmap (maxregno + 1);
 	  rtx_insn *seq;
 	  rtx pat, call, unspec, reg, cleared_reg, tmp;
-	  unsigned int regno, maxregno;
 	  rtx address;
 	  CUMULATIVE_ARGS args_so_far_v;
 	  cumulative_args_t args_so_far;
@@ -17024,18 +17025,21 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	continue;
 
 	  /* Determine the caller-saved registers we need to clear.  */
-	  to_clear_mask = (1LL << (NUM_ARG_REGS)) - 1;
-	  maxregno = NUM_ARG_REGS - 1;
+	  bitmap_clear (to_clear_bitmap);
+	  bitmap_set_range (to_clear_bitmap, R0_REGNUM, NUM_ARG_REGS);
+
 	  /* Only look at the caller-saved floating point registers in case of
 	 -mfloat-abi=hard.  For -mfloat-abi=softfp we will be using the
 	 lazy store and loads which clear both caller- and callee-saved
 	 registers.  */
 	  if (TARGET_HARD_FLOAT_ABI)
 	{
-	  float_mask = (1LL << (D7_VFP_REGNUM + 1)) - 1;
-	  float_mask &= ~((1LL << FIRST_VFP_REGNUM) - 1);
-	  to_clear_mask |= float_mask;
-	  maxregno = D7_VFP_REGNUM;
+	  auto_sbitmap float_bitmap (maxregno + 1);
+
+	  bitmap_clear (float_bitmap);
+	  bitmap_set_range (float_bitmap, FIRST_VFP_REGNUM,
+D7_VFP_REGNUM - FIRST_VFP_REGNUM + 1);
+	  bitmap_ior (to_clear_bitmap, to_clear_bitmap, float_bitmap);
 	}
 
 	  /* Make sure the register used to hold the function address is not
@@ -17043,7 +17047,9 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  address = RTVEC_ELT (XVEC (unspec, 0), 0);
 	  gcc_assert (MEM_P (address));
 	  gcc_assert (REG_P (XEXP (address, 0)));
-	  to_clear_mask &= ~(1LL << REGNO (XEXP (address, 0)));
+	  address_regnum = REGNO (XEXP (address, 0));
+	  if (address_regnum < R0_REGNUM + NUM_ARG_REGS)
+	bitmap_clear_bit (to_clear_bitmap, address_regnum);
 
 	  /* Set basic block of call insn so that df rescan is performed on
 	 insns inserted here.  */
@@ -17064,6 +17070,7 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  FOREACH_FUNCTION_ARGS (fntype, arg_type, args_iter)
 	{
 	  rtx arg_rtx;
+	  uint64_t to_clear_args_mask;
 	  machine_mode arg_mode = TYPE_MODE (arg_type);
 
 	  if (VOID_TYPE_P (arg_type))
@@ -17076,10 +17083,18 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  arg_rtx = arm_function_arg (args_so_far, arg_mode, arg_type,
 	  true);
 	  gcc_assert (REG_P (arg_rtx));
-	  to_clear_mask
-		&= ~compute_not_to_clear_mask (arg_type, arg_rtx,
-	   REGNO (arg_rtx),
-	   padding_bits_to_clear_ptr);
+	  to_clear_args_mask
+		= compute_not_to_clear_mask (arg_type, arg_rtx,
+	 REGNO (arg_rtx),
+	 padding_bits_to_clear_ptr);
+	  if (to_clear_args_mask)
+		{
+		  for (regno = R0_REGNUM; regno <= maxregno; regno++)
+		{
+		  if (to_clear_args_mask & (1ULL << regno))
+			bitmap_clear_bit (to_clear_bitmap, regno);
+		}
+		}
 
 	  first_param = false;
 	}
@@ -17138,7 +17153,7 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	 call.  */
 	  for (regno = R0_REGNUM; regno <= maxregno; regno++)
 	{
-	  if (!(to_clear_mask & (1LL << regno)))
+	  if (!bitmap_bit_p (to_clear_bitmap, regno))
 		continue;
 
 	  /* If regno is an even vfp register and its successor is also to
@@ -17147,7 +17162,7 @@ cmse_nonsecure_call_clear_caller_saved (void)
 		{
 		  if (TARGET_VFP_DOUBLE
 		  && VFP_REGNO_OK_FOR_DOUBLE (regno)
-		  && to_clear_mask & (1LL << (regno + 1)))
+		  && bitmap_bit_p (to_clear_bitmap, (regno + 1)))
 		emit_move_insn (gen_rtx_REG (DFmode, regno++),
 CONST0_RTX (DFmode));
 		  else
@@ -17161,7 +17176,6 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  seq = get_insns ();
 	  end_sequence ();
 	  emit_insn_before (seq, insn);
-
 	}
 }
 }
@@ -25188,7 +25202,7 @@ cmse_nonsecure_entry_clear_before_return (void)
   if

[PATCH, GCC/testsuite/ARM] Rework expectation for call to Armv8-M nonsecure function

2017-11-15 Thread Thomas Preudhomme


Hi,

Testcase gcc.target/arm/cmse/cmse-14.c checks whether bar is called via
__gnu_cmse_nonsecure_call libcall and not via a direct call. However the
pattern is a bit surprising in that it needs to explicitly allow "by"
due to allowing anything before the 'b'.

This patch rewrites the logic to look for b as a first non-whitespace
letter followed by anything (to match bl and conditional branches)
followed by some spaces and then bar.
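
The intent of the new pattern can be tried out with the sketch below. It is an approximation under assumptions: it uses POSIX regcomp/regexec rather than the Tcl regular expressions DejaGnu uses, it translates \s to [[:space:]], and it checks one assembly line at a time. line_branches_to_bar is a hypothetical helper name.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if LINE looks like a branch to bar,
   0 if it does not, and -1 if the pattern fails to compile.  */
int line_branches_to_bar(const char *line)
{
  /* POSIX ERE translation of the pattern in the patch: a word starting
     with 'b' at the start of the line or after whitespace, then
     whitespace, then "bar".  */
  static const char pattern[] =
    "^(.*[[:space:]])?bl?[^[:space:]]*[[:space:]]+bar";
  regex_t re;
  int rc;

  assert(line != NULL);
  if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec(&re, line, 0, NULL, 0);
  regfree(&re);
  return rc == 0;
}
```

In this sketch a line such as "bl bar" or "bne bar" (tab-separated) is reported as a branch to bar, while the libcall line "bl __gnu_cmse_nonsecure_call" is not.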

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-11-01  Thomas Preud'homme  

* gcc.target/arm/cmse/cmse-14.c: Change logic to match branch
instruction to bar.

Testing: Test still passes for both Armv8-M Baseline and Mainline.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.target/arm/cmse/cmse-14.c b/gcc/testsuite/gcc.target/arm/cmse/cmse-14.c
index 701e9ee7e318a07278099548f9b7042a1fde1204..df1ea52bec533c36a738d7d3b2b2ff749b0f3713 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/cmse-14.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/cmse-14.c
@@ -10,4 +10,4 @@ int foo (void)
 }
 
 /* { dg-final { scan-assembler "bl\t__gnu_cmse_nonsecure_call" } } */
-/* { dg-final { scan-assembler-not "b\[^ y\n\]*\\s+bar" } } */
+/* { dg-final { scan-assembler-not "^(.*\\s)?bl?\[^\\s]*\\s+bar" } } */

[PATCH, GCC/testsuite/ARM] Fix selection of effective target for cmse tests

2017-11-15 Thread Thomas Preudhomme


Hi,

Some of the tests in the gcc.target/arm/cmse directory (eg.
gcc.target/arm/cmse/mainline/bitfield-4.c) are failing when run without
an architecture specified in RUNTESTFLAGS due to them not adding the
option to select an Armv8-M architecture.

This patch fixes the issue by adding the right option from the exp file
so that no architecture fiddling is necessary in the individual tests.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-11-03  Thomas Preud'homme  

* gcc.target/arm/cmse/cmse.exp: Add option to select Armv8-M Baseline
or Armv8-M Mainline when running the respective tests.
* gcc.target/arm/cmse/baseline/cmse-11.c: Remove architecture check and
selection.
* gcc.target/arm/cmse/baseline/cmse-13.c: Likewise.
* gcc.target/arm/cmse/baseline/cmse-2.c: Likewise.
* gcc.target/arm/cmse/baseline/cmse-6.c: Likewise.
* gcc.target/arm/cmse/baseline/softfp.c: Likewise.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-8.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-8.c: Likewise.
* gcc.target/arm/cmse/mainline/soft/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/soft/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/soft/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/soft/cmse-8.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp-sp/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp-sp/cmse-8.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp/cmse-8.c: Likewise.

Testing: Running cmse.exp for both Armv8-M Baseline and Mainline shows
no regression. Running it for a toolchain defaulting to Armv8-M Baseline
but with RUNTESTFLAGS unset sees some FAIL->PASS.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c
index 795544fe11d9d7f24086be16916a5bfee89d7b44..230b255963f56a6c29b91d2501b43fed6eda2476 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c
@@ -1,7 +1,5 @@
 /* { dg-do compile } */
 /* { dg-options "-mcmse" }  */
-/* { dg-require-effective-target arm_arch_v8m_base_ok } */
-/* { dg-add-options arm_arch_v8m_base } */
 
 int __attribute__ ((cmse_nonsecure_call)) (*bar) (int);
 
diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c
index 7208a2cedd2f4f8296b2801d6f5e5d7838b26551..7ab3219e860e993e2eca3bbee2e885f59b7b3cb4 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c
@@ -1,7 +1,5 @@
 /* { dg-do compile } */
 /* { dg-options "-mcmse" } */
-/* { dg-require-effective-target arm_arch_v8m_base_ok } */
-/* { dg-add-options arm_arch_v8m_base } */
 
 #include "../cmse-13.x"
 
diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-2.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-2.c
index fec7dc10484b14db5796f5f431a9306c3b2e307c..d5115ecf2bdb3e87dc6a92244cb204e753f25b07 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-2.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-2.c
@@ -1,7 +1,5 @@
 /* { dg-do compile } */
 /* { dg-options "-mcmse" }  */
-/* { dg-require-effective-target arm_arch_v8m_base_ok } */
-/* { dg-add-options arm_arch_v8m_base } */
 
 extern float bar (void);
 
diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-6.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-6.c
index 43d45e7a63e56edfebc203c8f0e516dc13fbbd65..cae4f343621d1a19a8893ea4950d33e5e1842fb5 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-6.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-6.c
@@ -1,7 +1,5 @@
 /* { dg-do compile } */
 /* { dg-options "-mcmse" }  */
-/* { dg-require-effective-target arm_arch_v8m_base_ok } */
-/* { dg-add-options arm_arch_v8m_base } */
 
 int __attribute__ ((cmse_nonsecure_call)) (*bar) (double);
 
diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/softfp.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/softfp.c
index ca76e12cd9287fd12b7eb7add638973f5d314939..3d383ff6ee17677120e3e1e81726785c30f3b25c 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/baseline/softfp.c
+++

[PATCH, GCC/ARM] Fix ICE in Armv8-M Security Extensions code

2017-11-15 Thread Thomas Preudhomme


Hi,

Commit r253825, which introduced some sanity checks for sbitmap, revealed
a bug in the conversion of cmse_nonsecure_entry_clear_before_return ()
to using bitmap structures. bitmap_and expects the two bitmaps to have
the same length, yet the code in
cmse_nonsecure_entry_clear_before_return () gives different sizes to
to_clear_bitmap and to_clear_arg_regs_bitmap, on the assumption that
bitmap_and would behave as if the unallocated bits were in fact zero.
This commit makes sure both bitmaps are equally sized.
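
To illustrate the size requirement, here is a minimal, self-contained
sketch. It does not use GCC's sbitmap API: toy_bitmap, toy_alloc,
toy_same_size and toy_and are hypothetical names made up for this
example, and the only point is that an AND over two fixed-size bitmaps
is well defined only when both have the same number of bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy fixed-size bitmap; a stand-in for illustration, not GCC's sbitmap.  */
struct toy_bitmap
{
  unsigned int n_bits;
  unsigned char *bits;	/* One byte per bit, for simplicity.  */
};

/* Allocate a zeroed bitmap of N_BITS bits (N_BITS must be non-zero).  */
static struct toy_bitmap
toy_alloc (unsigned int n_bits)
{
  struct toy_bitmap b = { n_bits, calloc (n_bits, 1) };
  assert (b.bits != NULL);
  return b;
}

/* Return true if A and B may be combined, ie. if they have the same size.  */
static bool
toy_same_size (const struct toy_bitmap *a, const struct toy_bitmap *b)
{
  return a->n_bits == b->n_bits;
}

/* DST = A & B.  All three bitmaps must have the same size.  */
static void
toy_and (struct toy_bitmap *dst, const struct toy_bitmap *a,
	 const struct toy_bitmap *b)
{
  assert (toy_same_size (a, b) && toy_same_size (dst, a));
  for (unsigned int i = 0; i < a->n_bits; i++)
    dst->bits[i] = a->bits[i] & b->bits[i];
}
```

In this toy model, allocating the second bitmap with the size of the
first one, as the patch does with SBITMAP_SIZE, is what keeps the
assertion in toy_and from firing.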

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-11-13  Thomas Preud'homme  

* config/arm/arm.c (cmse_nonsecure_entry_clear_before_return): Allocate
to_clear_arg_regs_bitmap to the same size as to_clear_bitmap.

Testing: Bootstrapped GCC on arm-none-linux-gnueabihf target and
testsuite shows no regression. Running cmse.exp tests for Armv8-M
Baseline and Mainline shows FAIL->PASS for bitfield-1, bitfield-2,
bitfield-3 and struct-1 testcases.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index db99303f3fb7a2196f48358e74fa4d98f31f045e..106e3edce0d6f2518eb391c436c5213a78d1275b 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -25205,7 +25205,8 @@ cmse_nonsecure_entry_clear_before_return (void)
   if (padding_bits_to_clear != 0)
 {
   rtx reg_rtx;
-  auto_sbitmap to_clear_arg_regs_bitmap (R0_REGNUM + NUM_ARG_REGS);
+  int to_clear_bitmap_size = SBITMAP_SIZE ((sbitmap) to_clear_bitmap);
+  auto_sbitmap to_clear_arg_regs_bitmap (to_clear_bitmap_size);
 
   /* Padding bits to clear is not 0 so we know we are dealing with
 	 returning a composite type, which only uses r0.  Let's make sure that

Re: [PATCH, GCC/testsuite/ARM] Consolidate sources for cmse tests

2017-11-10 Thread Thomas Preudhomme

on-1.c: Likewise.
* gcc.target/arm/cmse/union-2.x: New file.
* gcc.target/arm/cmse/baseline/union-2.c: Remove code and include above
file.
* gcc.target/arm/cmse/mainline/union-2.c: Likewise.

Testing: Running cmse.exp for both Armv8-M Baseline and Mainline
shows no regression.

Is this ok for trunk?

Best regards,

Thomas

On 10/11/17 11:19, Thomas Preudhomme wrote:

For the most part, testcases under gcc.target/arm/cmse/baseline and
gcc.target/arm/cmse/mainline are duplicate copies that differ only in
their dejagnu directives. Although there is no requirement for them to be
similar, keeping them identical allows the generated code to be compared
and makes it easier to update the testcases when code generation changes
for both architectures (if one needs updating, so does the other).

Similarly all the tests in gcc.target/arm/cmse/mainline/ have
the same source but are duplicate copies.

This patch moves all the code in the tests to a parent directory:
gcc.target/arm/cmse for tests shared by Armv8-M Baseline and Mainline
and gcc.target/arm/cmse/mainline for tests shared *only* by the various
float ABIs of Armv8-M Mainline. C includes are then used where the code
used to sit.

Note that the cmse-13.c test used to differ slightly between the
architectures and float ABIs tested in the first floating-point constant
passed to bar: sometimes 1.0 and sometimes 3.0. This patch settles on
3.0 to avoid confusion with the 1.0 constant used to clear VFP registers
in some of the configurations.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-11-03  Thomas Preud'homme  <thomas.preudho...@arm.com>

 * gcc.target/arm/cmse/bitfield-4.x: New file.
 * gcc.target/arm/cmse/baseline/bitfield-4.c: Remove code and include
 above file.
 * gcc.target/arm/cmse/mainline/bitfield-4.c: Likewise.
 * gcc.target/arm/cmse/bitfield-5.x: New file.
 * gcc.target/arm/cmse/baseline/bitfield-5.c: Remove code and include
 above file.
 * gcc.target/arm/cmse/mainline/bitfield-5.c: Likewise.
 * gcc.target/arm/cmse/bitfield-6.x: New file.
 * gcc.target/arm/cmse/baseline/bitfield-6.c: Remove code and include
 above file.
 * gcc.target/arm/cmse/mainline/bitfield-6.c: Likewise.
 * gcc.target/arm/cmse/bitfield-7.x: New file.
 * gcc.target/arm/cmse/baseline/bitfield-7.c: Remove code and include
 above file.
 * gcc.target/arm/cmse/mainline/bitfield-7.c: Likewise.
 * gcc.target/arm/cmse/bitfield-8.x: New file.
 * gcc.target/arm/cmse/baseline/bitfield-8.c: Remove code and include
 above file.
 * gcc.target/arm/cmse/mainline/bitfield-8.c: Likewise.
 * gcc.target/arm/cmse/bitfield-9.x: New file.
 * gcc.target/arm/cmse/baseline/bitfield-9.c: Remove code and include
 above file.
 * gcc.target/arm/cmse/mainline/bitfield-9.c: Likewise.
 * gcc.target/arm/cmse/bitfield-and-union.x: New file.
 * gcc.target/arm/cmse/baseline/bitfield-and-union-1.c: Rename into ...
 * gcc.target/arm/cmse/baseline/bitfield-and-union.c: This.  Remove code
 and include above bitfield-and-union.x file.
 * gcc.target/arm/cmse/mainline/bitfield-and-union-1.c: Rename into ...
 * gcc.target/arm/cmse/mainline/bitfield-and-union.c: this.  Remove code
 and include above bitfield-and-union.x file.
 * gcc.target/arm/cmse/cmse-13.x: New file.
 * gcc.target/arm/cmse/baseline/cmse-13.c: Remove code and include above
 file.
 * gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: Likewise.
 * gcc.target/arm/cmse/mainline/hard/cmse-13.c: Likewise.
 * gcc.target/arm/cmse/mainline/soft/cmse-13.c: Likewise.
 * gcc.target/arm/cmse/mainline/softfp/cmse-13.c: Likewise.
 * gcc.target/arm/cmse/cmse-5.x: New file.
 * gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: Remove code and
 include above file.
 * gcc.target/arm/cmse/mainline/har

[PATCH, GCC/testsuite/ARM] Consolidate sources for cmse tests

2017-11-10 Thread Thomas Preudhomme


For the most part, testcases under gcc.target/arm/cmse/baseline and
gcc.target/arm/cmse/mainline are duplicate copies that differ only in
their dejagnu directives. Although there is no requirement for them to be
similar, keeping them identical allows the generated code to be compared
and makes it easier to update the testcases when code generation changes
for both architectures (if one needs updating, so does the other).

Similarly all the tests in gcc.target/arm/cmse/mainline/ have
the same source but are duplicate copies.

This patch moves all the code in the tests to a parent directory:
gcc.target/arm/cmse for tests shared by Armv8-M Baseline and Mainline
and gcc.target/arm/cmse/mainline for tests shared *only* by the various
float ABIs of Armv8-M Mainline. C includes are then used where the code
used to sit.

Note that the cmse-13.c test used to differ slightly between the
architectures and float ABIs tested in the first floating-point constant
passed to bar: sometimes 1.0 and sometimes 3.0. This patch settles on
3.0 to avoid confusion with the 1.0 constant used to clear VFP registers
in some of the configurations.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-11-03  Thomas Preud'homme  

* gcc.target/arm/cmse/bitfield-4.x: New file.
* gcc.target/arm/cmse/baseline/bitfield-4.c: Remove code and include
above file.
* gcc.target/arm/cmse/mainline/bitfield-4.c: Likewise.
* gcc.target/arm/cmse/bitfield-5.x: New file.
* gcc.target/arm/cmse/baseline/bitfield-5.c: Remove code and include
above file.
* gcc.target/arm/cmse/mainline/bitfield-5.c: Likewise.
* gcc.target/arm/cmse/bitfield-6.x: New file.
* gcc.target/arm/cmse/baseline/bitfield-6.c: Remove code and include
above file.
* gcc.target/arm/cmse/mainline/bitfield-6.c: Likewise.
* gcc.target/arm/cmse/bitfield-7.x: New file.
* gcc.target/arm/cmse/baseline/bitfield-7.c: Remove code and include
above file.
* gcc.target/arm/cmse/mainline/bitfield-7.c: Likewise.
* gcc.target/arm/cmse/bitfield-8.x: New file.
* gcc.target/arm/cmse/baseline/bitfield-8.c: Remove code and include
above file.
* gcc.target/arm/cmse/mainline/bitfield-8.c: Likewise.
* gcc.target/arm/cmse/bitfield-9.x: New file.
* gcc.target/arm/cmse/baseline/bitfield-9.c: Remove code and include
above file.
* gcc.target/arm/cmse/mainline/bitfield-9.c: Likewise.
* gcc.target/arm/cmse/bitfield-and-union.x: New file.
* gcc.target/arm/cmse/baseline/bitfield-and-union-1.c: Rename into ...
* gcc.target/arm/cmse/baseline/bitfield-and-union.c: This.  Remove code
and include above bitfield-and-union.x file.
* gcc.target/arm/cmse/mainline/bitfield-and-union-1.c: Rename into ...
* gcc.target/arm/cmse/mainline/bitfield-and-union.c: this.  Remove code
and include above bitfield-and-union.x file.
* gcc.target/arm/cmse/cmse-13.x: New file.
* gcc.target/arm/cmse/baseline/cmse-13.c: Remove code and include above
file.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/soft/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp/cmse-13.c: Likewise.
* gcc.target/arm/cmse/cmse-5.x: New file.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: Remove code and
include above file.
* gcc.target/arm/cmse/mainline/har

[PATCH, GCC/testsuite] Fix retrieval of testname

2017-11-09 Thread Thomas Preudhomme


When gcc-dg-runtest is used to run a test, the test is run several times
with different options. For clarity of the log, the test infrastructure
then appends the options to the testname. This means that all the code
that must deal with the testcase itself (eg. removing the output files
after the test has run) needs to remove the options from the name.

There is already a pattern (see below) for this in several places of the
testsuite framework but it is also missing in many places. This patch
fixes all of these places. The pattern is as follows:

set testcase [testname-for-summary]
; The name might include a list of options; extract the file name.
set testcase [lindex $testcase 0]
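
For readers less familiar with Tcl, the effect of the pattern combined
with the "file tail" and "file rootname" calls used in scanasm.exp can
be sketched in C as below. This is only an illustration under the
assumption that the testname is a file path optionally followed by
space-separated options; output_file_for is a hypothetical helper, not
part of the testsuite.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write into OUT (of size OUT_SIZE) the output file name for TESTNAME with
   extension EXT: keep only the first space-separated word of TESTNAME, drop
   its directory part and replace its extension.  Return 0 on success, -1 if
   OUT is too small.  */
static int
output_file_for (const char *testname, const char *ext, char *out,
		 size_t out_size)
{
  /* The name might include a list of options; extract the file name.  */
  size_t len = strcspn (testname, " ");
  const char *start = testname;

  /* Rough equivalent of "file tail": skip everything up to the last '/'.  */
  for (size_t i = 0; i < len; i++)
    if (testname[i] == '/')
      start = testname + i + 1;
  len -= (size_t) (start - testname);

  /* Rough equivalent of "file rootname": drop the last ".suffix", if any.  */
  for (size_t i = len; i > 0; i--)
    if (start[i - 1] == '.')
      {
	len = i - 1;
	break;
      }

  int n = snprintf (out, out_size, "%.*s%s", (int) len, start, ext);
  return (n >= 0 && (size_t) n < out_size) ? 0 : -1;
}
```

Because the options are cut off before the directory and extension are
processed, an option that itself contains a dot or a slash cannot leak
into the computed file name.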

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-11-08  Thomas Preud'homme  

* lib/scanasm.exp (scan-assembler): Extract filename from testname used
in summary.
(scan-assembler-not): Likewise.
(scan-hidden): Likewise.
(scan-not-hidden): Likewise.
(scan-stack-usage): Likewise.
(scan-stack-usage-not): Likewise.
(scan-assembler-times): Likewise.
(scan-assembler-dem): Likewise.
(scan-assembler-dem-not): Likewise.
(object-size): Likewise.
(scan-lto-assembler): Likewise.
* lib/scandump.exp (scan-dump): Likewise.
(scan-dump-times): Likewise.
(scan-dump-not): Likewise.
(scan-dump-dem): Likewise.
(scan-dump-dem-not): Likewise.

Testing: Ran testsuite on bootstrap aarch64-linux-gnu and
x86_64-linux-gnu compiled with C, fortran and ada support without any 
regression.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/testsuite/lib/scanasm.exp b/gcc/testsuite/lib/scanasm.exp
index a66bb28253196410554405facefa8641d1020c1d..33286152f30df959a4bffa81634d0bfe7b898e8f 100644
--- a/gcc/testsuite/lib/scanasm.exp
+++ b/gcc/testsuite/lib/scanasm.exp
@@ -78,7 +78,9 @@ proc dg-scan { name positive testcase output_file orig_args } {
 
 proc scan-assembler { args } {
 set testcase [testname-for-summary]
-set output_file "[file rootname [file tail $testcase]].s"
+# The name might include a list of options; extract the file name.
+set filename [lindex $testcase 0]
+set output_file "[file rootname [file tail $filename]].s"
 dg-scan "scan-assembler" 1 $testcase $output_file $args
 }
 
@@ -89,7 +91,9 @@ force_conventional_output_for scan-assembler
 
 proc scan-assembler-not { args } {
 set testcase [testname-for-summary]
-set output_file "[file rootname [file tail $testcase]].s"
+# The name might include a list of options; extract the file name.
+set filename [lindex $testcase 0]
+set output_file "[file rootname [file tail $filename]].s"
 
 dg-scan "scan-assembler-not" 0 $testcase $output_file $args
 }
@@ -117,7 +121,9 @@ proc hidden-scan-for { symbol } {
 
 proc scan-hidden { args } {
 set testcase [testname-for-summary]
-set output_file "[file rootname [file tail $testcase]].s"
+# The name might include a list of options; extract the file name.
+set filename [lindex $testcase 0]
+set output_file "[file rootname [file tail $filename]].s"
 
 set symbol [lindex $args 0]
 
@@ -133,7 +139,9 @@ proc scan-hidden { args } {
 
 proc scan-not-hidden { args } {
 set testcase [testname-for-summary]
-set output_file "[file rootname [file tail $testcase]].s"
+# The name might include a list of options; extract the file name.
+set filename [lindex $testcase 0]
+set output_file "[file rootname [file tail $filename]].s"
 
 set symbol [lindex $args 0]
 set hidden_scan [hidden-scan-for $symbol]
@@ -163,7 +171,9 @@ proc scan-file-not { output_file args } {
 
 proc scan-stack-usage { args } {
 set testcase [testname-for-summary]
-set output_file "[file rootname [file tail $testcase]].su"
+# The name might include a list of options; extract the file name.
+set filename [lindex $testcase 0]
+set output_file "[file rootname [file tail $filename]].su"
 
 dg-scan "scan-file" 1 $testcase $output_file $args
 }
@@ -173,7 +183,9 @@ proc scan-stack-usage { args } {
 
 proc scan-stack-usage-not { args } {
 set testcase [testname-for-summary]
-set output_file "[file rootname [file tail $testcase]].su"
+# The name might include a list of options; extract the file name.
+set filename [lindex $testcase 0]
+set output_file "[file rootname [file tail $filename]].su"
 
 dg-scan "scan-file-not" 0 $testcase $output_file $args
 }
@@ -230,12 +242,14 @@ proc scan-assembler-times { args } {
 }
 
 set testcase [testname-for-summary]
+# The name might include a list of options; extract the file name.
+set filename [lindex $testcase 0]
 set pattern [lindex $args 0]
 set times [lindex $args 1]
 set pp_pattern [make_pattern_printable $pattern]
 
 # This must match the rule in gcc-dg.exp.
-set output_file "[file rootname [file tail

[PATCH, GCC/ARM] Fix cmse_nonsecure_entry return insn size

2017-11-08 Thread Thomas Preudhomme


Hi,

A number of instructions are output in assembler form by
output_return_instruction () when compiling a function with the
cmse_nonsecure_entry attribute for Armv8-M Mainline with hardfloat float
ABI. However, the corresponding thumb2_cmse_entry_return insn pattern
does not account for all these instructions when computing the length
of the instruction.

This may lead GCC to use the wrong branching instruction due to
incorrect computation of the offset between the branch instruction's
address and the target address.
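
As a rough illustration of the failure mode, consider the toy model
below. The lengths 12 and 34 are the old and new hardfloat values from
the thumb2.md hunk of this patch; everything else (the function names
and the 256-byte reach of the short branch) is a hypothetical
simplification and not how GCC's branch shortening is implemented.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reach, in bytes, of a short forward branch in this model.  */
#define TOY_SHORT_BRANCH_RANGE 256

/* Return true if a short branch located BYTES_BEFORE bytes ahead of an insn
   of length INSN_LENGTH can jump to the label right after that insn.  */
static bool
toy_short_branch_reaches (int bytes_before, int insn_length)
{
  return bytes_before + insn_length <= TOY_SHORT_BRANCH_RANGE;
}

/* Return true if choosing the branch from the CLAIMED length picks a short
   branch that cannot reach its target given the ACTUAL length.  */
static bool
toy_branch_miscompiled (int bytes_before, int claimed, int actual)
{
  return toy_short_branch_reaches (bytes_before, claimed)
	 && !toy_short_branch_reaches (bytes_before, actual);
}
```

With 230 bytes of code before the return, a claimed length of 12 makes
the short branch look sufficient (242 bytes) while the 34 bytes actually
emitted put the target out of its reach (264 bytes).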

This commit fixes the mismatch between what output_return_instruction ()
does and what the pattern thinks it does and adds a note warning about
mismatch in the affected functions' heading comments to ensure code does
not get out of sync again.

Note: no test is provided because the C testcase is fragile (only works
on GCC 6) and the extracted RTL test fails to compile due to bugs in the
RTL frontend (PR82815 and PR82817).

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2017-10-30  Thomas Preud'homme  

* config/arm/arm.c (output_return_instruction): Add comments to
indicate requirement for cmse_nonsecure_entry return to account
for the size of clearing instruction output here.
(thumb_exit): Likewise.
* config/arm/thumb2.md (thumb2_cmse_entry_return): Fix length for
return in hardfloat mode.

Testing: Bootstrapped on arm-linux-gnueabihf and testsuite shows no
regression.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 033ec255a577f782201527f57f45802bc0eb45e0..9919f54242d9317125a104f9777d76a85de80e9b 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -19417,7 +19417,12 @@ arm_get_vfp_saved_size (void)
 
 /* Generate a function exit sequence.  If REALLY_RETURN is false, then do
everything bar the final return instruction.  If simple_return is true,
-   then do not output epilogue, because it has already been emitted in RTL.  */
+   then do not output epilogue, because it has already been emitted in RTL.
+
+   Note: do not forget to update length attribute of corresponding insn pattern
+   when changing assembly output (eg. length attribute of
+   thumb2_cmse_entry_return when updating Armv8-M Mainline Security Extensions
+   register clearing sequences).  */
 const char *
 output_return_instruction (rtx operand, bool really_return, bool reverse,
bool simple_return)
@@ -23950,7 +23955,12 @@ thumb_pop (FILE *f, unsigned long mask)
 
 /* Generate code to return from a thumb function.
If 'reg_containing_return_addr' is -1, then the return address is
-   actually on the stack, at the stack pointer.  */
+   actually on the stack, at the stack pointer.
+
+   Note: do not forget to update length attribute of corresponding insn pattern
+   when changing assembly output (eg. length attribute of epilogue_insns when
+   updating Armv8-M Baseline Security Extensions register clearing
+   sequences).  */
 static void
 thumb_exit (FILE *f, int reg_containing_return_addr)
 {
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index b78c3d256aeafc2eeb3dcdc2b9b07b1af9df5294..776d611d2538e790a5f504995050ffdfc51d7193 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -1132,7 +1132,7 @@
; we adapt the length accordingly.
(set (attr "length")
  (if_then_else (match_test "TARGET_HARD_FLOAT")
-  (const_int 12)
+  (const_int 34)
   (const_int 8)))
; We do not support predicate execution of returns from cmse_nonsecure_entry
; functions because we need to clear the APSR.  Since predicable has to be

[PATCH, GCC/ARM] Allow +nodsp for -mcpu=cortex-m33

2017-10-16 Thread Thomas Preudhomme


Hi,

DSP instructions are optional for Arm Cortex-M33, yet its -mcpu option
does not allow +nodsp. Users are thus left with using
-march=armv8-m.main -mtune=cortex-m33. This patch allows +nodsp to
-mcpu=cortex-m33.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-10-11  Thomas Preud'homme  

* config/arm/arm-cpus.in (cortex-m33): Add nodsp option.
* doc/invoke.texi: Document +nodsp as a valid extension for
-mcpu=cortex-m33.

Tested by building an arm-none-eabi GCC cross-compiler and checking that
__ARM_FEATURE_DSP is *not* defined when invoked with
-mcpu=cortex-m33+nodsp.
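
A check along those lines can be written as a small C file such as the
sketch below. dsp_feature_defined is a hypothetical helper name; the
only thing taken from the test above is that __ARM_FEATURE_DSP is
expected to be undefined with -mcpu=cortex-m33+nodsp.

```c
#include <assert.h>

/* Return 1 if the compiler predefines __ARM_FEATURE_DSP for the selected
   target, 0 otherwise.  */
static int
dsp_feature_defined (void)
{
#ifdef __ARM_FEATURE_DSP
  return 1;
#else
  return 0;
#endif
}
```

Alternatively, dumping the predefined macros with -dM -E on an empty
input and searching for __ARM_FEATURE_DSP should give the same answer
without compiling any code.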

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 07de4c9375ba7a0df0d8bd00385e54a4042e5264..25fc429a8338e433b9fcd0ee385ff127423494c2 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -1516,6 +1516,7 @@ begin cpu cortex-m33
  architecture armv8-m.main+dsp
  fpu fpv5-sp-d16
  option nofp remove ALL_FP
+ option nodsp remove armv7em
  costs v7m
 end cpu cortex-m33
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9ad1fb339babe2ce8f45ecac2fa93d7b9ae5fd30..722d5cc2c0a020906e6df3260822cdd268245082 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15803,6 +15803,9 @@ Permissible names for this option are the same as those for
 The following extension options are common to the listed CPUs:
 
 @table @samp
+@item +nodsp
+Disable the DSP instructions on @samp{cortex-m33}.
+
 @item  +nofp
 Disables the floating-point instructions on @samp{arm9e},
 @samp{arm946e-s}, @samp{arm966e-s}, @samp{arm968e-s}, @samp{arm10e},

Re: [PATCH, GCC/ARM, ping] Remove ARMv8-M code for D17-D31

2017-09-28 Thread Thomas Preudhomme


Committed (sorry for delay).

Best regards,

Thomas

On 06/09/17 09:12, Kyrill Tkachov wrote:

Hi Thomas,

On 05/09/17 10:04, Thomas Preudhomme wrote:

Ping?



This is ok if a bootstrap and test run on arm-none-linux-gnueabihf shows no 
problems.

Thanks,
Kyrill


Best regards,

Thomas

On 25/08/17 12:18, Thomas Preudhomme wrote:

Hi,

I've now also added a couple more changes:

* size to_clear_bitmap according to maxregno to be consistent with its use
* use directly TARGET_HARD_FLOAT instead of clear_vfpregs


Original message below (ChangeLog unchanged):

Function cmse_nonsecure_entry_clear_before_return has code to deal with
high VFP registers (D16-D31) while ARMv8-M Baseline and Mainline both do
not support more than 16 double VFP registers (D0-D15). This makes this
security-sensitive code harder to read for not much benefit since
libcalls for cmse_nonsecure_call functions do not deal with those high
VFP registers anyway.

This commit gets rid of this code for simplicity and fixes 2 issues in
the same function:

- stop the first loop when reaching maxregno to avoid dealing with VFP
   registers if targeting Thumb-1 or using -mfloat-abi=soft
- include maxregno in that loop
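
The two loop fixes and the sizing of the bitmap according to maxregno
boil down to the inclusive bound shown in the sketch below. It is a toy
model with hypothetical names (toy_mark_to_clear, TOY_MAX_REGS) and
register numbers, not the code from arm.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical upper bound on register numbers for this sketch.  */
#define TOY_MAX_REGS 64

/* Mark registers FIRST to MAXREGNO, both included, in TO_CLEAR.  Return the
   number of bits a bitmap needs in order to hold bit MAXREGNO.  */
static int
toy_mark_to_clear (bool to_clear[TOY_MAX_REGS], int first, int maxregno)
{
  assert (first >= 0 && maxregno < TOY_MAX_REGS);
  memset (to_clear, 0, TOY_MAX_REGS * sizeof (bool));
  /* "<=" rather than "<": MAXREGNO itself must be visited, and nothing
     beyond it.  */
  for (int regno = first; regno <= maxregno; regno++)
    to_clear[regno] = true;
  return maxregno + 1;
}
```

A bitmap indexed by register number needs maxregno + 1 bits, which is
the kind of off-by-one that is easy to get wrong when sizing it.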

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-06-13  Thomas Preud'homme <thomas.preudho...@arm.com>

 * config/arm/arm.c (arm_option_override): Forbid ARMv8-M Security
 Extensions with more than 16 double VFP registers.
 (cmse_nonsecure_entry_clear_before_return): Remove second entry of
 to_clear_mask and all code related to it.  Replace the remaining
 entry by a sbitmap and adapt code accordingly.

Testing: Testsuite shows no regression when run for ARMv8-M Baseline and
ARMv8-M Mainline.

Is this ok for trunk?

Best regards,

Thomas

On 23/08/17 11:56, Thomas Preudhomme wrote:

Ping?

Best regards,

Thomas

On 17/07/17 17:25, Thomas Preudhomme wrote:
My bad, found an off-by-one error in the sizing of bitmaps. Please find 
fixed patch in attachment.


ChangeLog entry is unchanged:

*** gcc/ChangeLog ***

2017-06-13  Thomas Preud'homme <thomas.preudho...@arm.com>

 * config/arm/arm.c (arm_option_override): Forbid ARMv8-M Security
 Extensions with more than 16 double VFP registers.
 (cmse_nonsecure_entry_clear_before_return): Remove second entry of
 to_clear_mask and all code related to it.  Replace the remaining
 entry by a sbitmap and adapt code accordingly.

Best regards,

Thomas

On 17/07/17 09:52, Thomas Preudhomme wrote:

Ping?

Best regards,

Thomas

On 12/07/17 09:59, Thomas Preudhomme wrote:

Hi Richard,

On 07/07/17 15:19, Richard Earnshaw (lists) wrote:


Hmm, I think that's because really this is a partial conversion.  It
looks like doing this properly would involve moving that existing code
to use sbitmaps as well.  I think doing that would be better for
long-term maintenance perspectives, but I'm not going to insist that you
do it now.


There's also the assert later but I've found a way to improve it 
slightly. While switching to auto_sbitmap I also changed the code 
slightly to allocate directly bitmaps to the right size. Since the change 
is probably bigger than what you had in mind I'd appreciate if you can 
give me an OK again. See updated patch in attachment. ChangeLog entry is 
unchanged:


2017-06-13  Thomas Preud'homme <thomas.preudho...@arm.com>

 * config/arm/arm.c (arm_option_override): Forbid ARMv8-M Security
 Extensions with more than 16 double VFP registers.
 (cmse_nonsecure_entry_clear_before_return): Remove second entry of
 to_clear_mask and all code related to it.  Replace the remaining
 entry by a sbitmap and adapt code accordingly.



As a result I'll let you take the call as to whether you keep this
version or go back to your earlier patch.  If you do decide to keep this
version, then see the comment below.


Given the changes I'm happier with how the patch looks now, and getting 
it in can be a nice incentive to change other ARMv8-M Security 
Extension related code later on.


Best regards,

Thomas

[arm-embedded] [PATCH 3/3, GCC/ARM] Add support for ARM Cortex-R52 processor

2017-09-06 Thread Thomas Preudhomme


Hi,

We have decided to apply the following patch to the embedded-7-branch to enable 
Arm Cortex-R52 support.


*** gcc/ChangeLog.arm ***

2017-09-04  Thomas Preud'homme  <thomas.preudho...@arm.com>

 Backport from mainline
 2017-07-14  Thomas Preud'homme  <thomas.preudho...@arm.com>

* config/arm/arm-cpus.in (cortex-r52): Add new entry.
(armv8-r): Set ARM Cortex-R52 as default CPU.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Regenerate.
* config/arm/driver-arm.c (arm_cpu_table): Add entry for ARM
Cortex-R52.
* doc/invoke.texi: Mention -mtune=cortex-r52 and availability of fp.dp
extension for -mcpu=cortex-r52.

Best regards,

Thomas
--- Begin Message ---

Hi,

On 29/06/17 16:13, Thomas Preudhomme wrote:

Please ignore this patch. I'll respin the patch on a more recent GCC.


Please find an updated patch in attachment.

This patch adds support for the recently announced ARM Cortex-R52
processor.

[1] https://developer.arm.com/products/processors/cortex-r/cortex-r52

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-07-14  Thomas Preud'homme  <thomas.preudho...@arm.com>

* config/arm/arm-cpus.in (cortex-r52): Add new entry.
(armv8-r): Set ARM Cortex-R52 as default CPU.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Regenerate.
* config/arm/driver-arm.c (arm_cpu_table): Add entry for ARM
Cortex-R52.
* doc/invoke.texi: Mention -mtune=cortex-r52 and availability of fp.dp
extension for -mcpu=cortex-r52.

Tested by building an arm-none-eabi GCC cross-compiler targeting Cortex-R52 and 
building a hello world with it. Also checked that the .fpu option created by 
GCC for -mcpu=cortex-r52 and -mcpu=cortex-r52+nofp.dp is as expected 
(respectively .fpu neon-fp-armv8 and .fpu fpv5-sp-d16).


Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index e2ff297aed7514073dbb3bf5ee86964f202e5a14..d009a9e18acb093aefe0f9d8d6de49489fc2325c 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -381,7 +381,7 @@ begin arch armv8-m.main
 end arch armv8-m.main
 
 begin arch armv8-r
- tune for cortex-r4
+ tune for cortex-r52
  tune flags CO_PROC
  base 8R
  profile R
@@ -1315,6 +1315,16 @@ begin cpu cortex-m33
  costs v7m
 end cpu cortex-m33
 
+# V8 R-profile implementations.
+begin cpu cortex-r52
+ cname cortexr52
+ tune flags LDSCHED
+ architecture armv8-r+crc+simd
+ fpu neon-fp-armv8
+ option nofp.dp remove FP_DBL ALL_SIMD
+ costs cortex
+end cpu cortex-r52
+
 # FPU entries
 # format:
 # begin fpu 
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index 51678c2566e841894c5c0e9c613c8c0f832e9988..4e508b1555a77628ff6e7cfea39c98b87caa840a 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -357,6 +357,9 @@ Enum(processor_type) String(cortex-m23) Value( TARGET_CPU_cortexm23)
 EnumValue
 Enum(processor_type) String(cortex-m33) Value( TARGET_CPU_cortexm33)
 
+EnumValue
+Enum(processor_type) String(cortex-r52) Value( TARGET_CPU_cortexr52)
+
 Enum
 Name(arm_arch) Type(int)
 Known ARM architectures (for use with the -march= option):
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index ba2c7d8ecfdbf6966ebf04b680d587a0e057b161..1b3f7a94cc78fac8abf1042ef60c81a74eaf24eb 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -57,5 +57,6 @@
 	cortexa73,exynosm1,xgene1,
 	cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,
 	cortexa73cortexa53,cortexa55,cortexa75,
-	cortexa75cortexa55,cortexm23,cortexm33"
+	cortexa75cortexa55,cortexm23,cortexm33,
+	cortexr52"
 	(const (symbol_ref "((enum attr_tune) arm_tune)")))
diff --git a/gcc/config/arm/driver-arm.c b/gcc/config/arm/driver-arm.c
index 16171d4e801af46ad549314d1f376e90d5bff57c..5c29b94caaba4ff6f89a191f1d8edcf10431c0b3 100644
--- a/gcc/config/arm/driver-arm.c
+++ b/gcc/config/arm/driver-arm.c
@@ -58,6 +58,7 @@ static struct vendor_cpu arm_cpu_table[] = {
 {"0xc15", "armv7-r", "cortex-r5"},
 {"0xc17", "armv7-r", "cortex-r7"},
 {"0xc18", "armv7-r", "cortex-r8"},
+{"0xd13", "armv8-r+crc", "cortex-r52"},
 {"0xc20", "armv6-m", "cortex-m0"},
 {"0xc21", "armv6-m", "cortex-m1"},
 {"0xc23", "armv7-m", "cortex-m3"},
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e60edcae53ef3c995054b9b0229b5f0fccbb8462..a093b9bcf77b1f4b40992516e853826bb7d528d4 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15538,7 +15538,7 @@ Permissible names are: @samp{arm2}, @samp{arm250},
 @samp{cortex-a32}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a55},
 @samp{cortex-a57}, @samp{cort

[arm-embedded] [PATCH, GCC/ARM] Rewire -mfpu=fp-armv8 as VFPv5 + D32 + DP

2017-09-06 Thread Thomas Preudhomme


Hi,

We have decided to apply the following patch to the embedded-7-branch to enable 
ARMv8-R support.



ChangeLog entry is as follows:

*** gcc/ChangeLog.arm ***

2017-09-04  Thomas Preud'homme  

 Backport from mainline
 2017-07-14  Thomas Preud'homme  

* config/arm/arm-isa.h (isa_bit_FP_ARMv8): Delete enumerator.
(ISA_FP_ARMv8): Define as ISA_FPv5 and ISA_FP_D32.
* config/arm/arm-cpus.in (armv8-r): Define fp.sp as enabling FPv5.
(fp-armv8): Define it as FP_ARMv8 only.
* config/arm/arm.h (TARGET_FPU_ARMV8): Delete.
(TARGET_VFP_FP16INST): Define using TARGET_VFP5 rather than
TARGET_FPU_ARMV8.
* config/arm/arm.c (arm_rtx_costs_internal): Replace checks against
TARGET_FPU_ARMV8 by checks against TARGET_VFP5.
* config/arm/arm-builtins.c (arm_builtin_vectorized_function): Define
first ARM_CHECK_BUILTIN_MODE definition using TARGET_VFP5 rather
than TARGET_FPU_ARMV8.
* config/arm/arm-c.c (arm_cpu_builtins): Likewise for
__ARM_FEATURE_NUMERIC_MAXMIN macro definition.
* config/arm/arm.md (cmov): Condition on TARGET_VFP5 rather than
TARGET_FPU_ARMV8.
* config/arm/neon.md (neon_vrint): Likewise.
(neon_vcvt): Likewise.
(neon_): Likewise.
(3): Likewise.
* config/arm/vfp.md (lsi2): Likewise.
* config/arm/predicates.md (arm_cond_move_operator): Check against
TARGET_VFP5 rather than TARGET_FPU_ARMV8 and fix spacing.

Best regards,

Thomas
--- Begin Message ---

Hi,

fp-armv8 is currently defined as a double precision FPv5 with 32 D
registers *and* a special FP_ARMv8 bit. However, FP for ARMv8 should only
bring 32 D registers on top of FPv5-D16, so this FP_ARMv8 bit is
spurious. As a consequence, many instruction patterns which are guarded
by TARGET_FPU_ARMV8 are unavailable to FPv5-D16 and FPv5-SP-D16.

This patch gets rid of TARGET_FPU_ARMV8 and rewires all uses to
expressions based on TARGET_VFP5, TARGET_VFPD32 and TARGET_VFP_DOUBLE.
It also redefines ISA_FP_ARMv8 to include the D32 capability to
distinguish it from FPv5-D16. Finally, it sets the +fp.sp for ARMv8-R to
enable FPv5-SP-D16 (i.e. FP for ARMv8 with single precision only and 16 D
registers).
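
To illustrate the change described above, here is a minimal, self-contained C
sketch. It is not GCC code: the FEAT_* bits, the FPU_* configurations and the
old_guard/new_guard/is_fp_armv8 helpers are hypothetical names made up for
this example. It models the old scheme, where a separate FP_ARMv8-style bit
gated the patterns, against the new one, where the guard only tests for FPv5
and fp-armv8 is simply FPv5 + double precision + D32.

```c
#include <stdbool.h>

/* Hypothetical feature bits; illustrative only, not GCC's isa_bit_* values.  */
enum fpu_feature
{
  FEAT_VFP5 = 1u << 0,          /* FPv5 instructions.  */
  FEAT_DOUBLE = 1u << 1,        /* Double precision support.  */
  FEAT_D32 = 1u << 2,           /* 32 D registers.  */
  FEAT_OLD_ARMV8_BIT = 1u << 3  /* The extra bit used by the old scheme.  */
};

/* Hypothetical FPU configurations built from the bits above.  */
#define FPU_FPV5_SP_D16 (FEAT_VFP5)
#define FPU_FPV5_D16 (FEAT_VFP5 | FEAT_DOUBLE)
#define FPU_FP_ARMV8_NEW (FEAT_VFP5 | FEAT_DOUBLE | FEAT_D32)
#define FPU_FP_ARMV8_OLD (FPU_FP_ARMV8_NEW | FEAT_OLD_ARMV8_BIT)

/* Old-style guard: only true when the extra bit is set, so FPv5-D16 and
   FPv5-SP-D16 never pass it.  */
static inline bool
old_guard (unsigned int fpu)
{
  return (fpu & FEAT_OLD_ARMV8_BIT) != 0;
}

/* New-style guard: true for any FPU that provides FPv5.  */
static inline bool
new_guard (unsigned int fpu)
{
  return (fpu & FEAT_VFP5) != 0;
}

/* fp-armv8 under the new scheme: FPv5 with double precision and D32.  */
static inline bool
is_fp_armv8 (unsigned int fpu)
{
  return (fpu & FPU_FP_ARMV8_NEW) == FPU_FP_ARMV8_NEW;
}
```

With these definitions, old_guard (FPU_FPV5_D16) is false while
new_guard (FPU_FPV5_D16) is true, which mirrors the intent of replacing
TARGET_FPU_ARMV8 checks by TARGET_VFP5 checks.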

ChangeLog entry is as follows:

2017-07-07  Thomas Preud'homme  

* config/arm/arm-isa.h (isa_bit_FP_ARMv8): Delete enumerator.
(ISA_FP_ARMv8): Define as ISA_FPv5 and ISA_FP_D32.
* config/arm/arm-cpus.in (armv8-r): Define fp.sp as enabling FPv5.
(fp-armv8): Define it as FP_ARMv8 only.
* config/arm/arm.h (TARGET_FPU_ARMV8): Delete.
(TARGET_VFP_FP16INST): Define using TARGET_VFP5 rather than
TARGET_FPU_ARMV8.
* config/arm/arm.c (arm_rtx_costs_internal): Replace checks against
TARGET_FPU_ARMV8 by checks against TARGET_VFP5.
* config/arm/arm-builtins.c (arm_builtin_vectorized_function): Define
first ARM_CHECK_BUILTIN_MODE definition using TARGET_VFP5 rather
than TARGET_FPU_ARMV8.
* config/arm/arm-c.c (arm_cpu_builtins): Likewise for
__ARM_FEATURE_NUMERIC_MAXMIN macro definition.
* config/arm/arm.md (cmov): Condition on TARGET_VFP5 rather than
TARGET_FPU_ARMV8.
* config/arm/neon.md (neon_vrint): Likewise.
(neon_vcvt): Likewise.
(neon_): Likewise.
(3): Likewise.
* config/arm/vfp.md (lsi2): Likewise.
* config/arm/predicates.md (arm_cond_move_operator): Check against
TARGET_VFP5 rather than TARGET_FPU_ARMV8 and fix spacing.

Testing:
  * Bootstrapped under ARMv8-A Thumb state and ran testsuite -> no regression
  * Built Spec2000 and Spec2006 with -march=armv8-a+fp16 and compared objdump
-> no code generation difference


Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 63ee880822c17eda55dd58438d61cbbba333b2c6..7504ed581c63a657a0dff48442633704bd252b2e 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -3098,7 +3098,7 @@ arm_builtin_vectorized_function (unsigned int fn, tree type_out, tree type_in)
NULL_TREE is returned if no such builtin is available.  */
 #undef ARM_CHECK_BUILTIN_MODE
 #define ARM_CHECK_BUILTIN_MODE(C)\
-  (TARGET_FPU_ARMV8   \
+  (TARGET_VFP5   \
&& flag_unsafe_math_optimizations \
&& ARM_CHECK_BUILTIN_MODE_1 (C))
 
diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
index a3daa3220a2bc4220dffdb7ca08ca9419bdac425..9178937b6d9e0fe5d0948701390c4cf01f4f8c7d 100644
--- a/gcc/config/arm/arm-c.c
+++ b/gcc/config/arm/arm-c.c
@@ -96,7 +96,7 @@ arm_cpu_builtins (struct cpp_reader* pfile)
 		   || TARGET_ARM_ARCH_ISA_THUMB >=2));
 
   def_or_undef_macro (pfile, "__ARM_FEATURE_NUMERIC_MAXMIN",
-		  TARGET_ARM_ARCH >= 8 && TARGET_NEON && TARGET_FPU_ARMV8);
+		  TARGET_ARM_ARCH >= 8 && TARGET_NEON && TARGET_VFP5);
 
   def_or_undef_macro (pfile, "__ARM_FEATURE_SIMD32",

[arm-embedded] [PATCH 2/3, GCC/ARM] Add support for ARMv8-R architecture

2017-09-06 Thread Thomas Preudhomme


Hi,

We have decided to apply the following patch to the embedded-7-branch to enable 
ARMv8-R support.


ChangeLog entry is as follows:

*** gcc/ChangeLog.arm ***

2017-09-04  Thomas Preud'homme  <thomas.preudho...@arm.com>

 Backport from mainline
 2017-07-06  Thomas Preud'homme  <thomas.preudho...@arm.com>

* config/arm/arm-cpus.in (armv8-r): Add new entry.
* config/arm/arm-isa.h (ISA_ARMv8r): Define macro.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm.h (enum base_architecture): Add BASE_ARCH_8R
enumerator.
* doc/invoke.texi: Mention -march=armv8-r and its extensions.

*** gcc/testsuite/ChangeLog ***


2017-09-04  Thomas Preud'homme  <thomas.preudho...@arm.com>

 Backport from mainline
 2017-07-06  Thomas Preud'homme  <thomas.preudho...@arm.com>

* lib/target-supports.exp: Generate
check_effective_target_arm_arch_v8r_ok, add_options_for_arm_arch_v8r
and check_effective_target_arm_arch_v8r_multilib.

*** libgcc/ChangeLog ***


2017-09-04  Thomas Preud'homme  <thomas.preudho...@arm.com>

 Backport from mainline
 2017-07-06  Thomas Preud'homme  <thomas.preudho...@arm.com>

* config/arm/lib1funcs.S: Define __ARM_ARCH__ to 8 for ARMv8-R.
--- Begin Message ---

Please find an updated patch in attachment. ChangeLog entries are now as follows:

*** gcc/ChangeLog ***

2017-07-06  Thomas Preud'homme  <thomas.preudho...@arm.com>

* config/arm/arm-cpus.in (armv8-r): Add new entry.
* config/arm/arm-isa.h (ISA_ARMv8r): Define macro.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm.h (enum base_architecture): Add BASE_ARCH_8R
enumerator.
* doc/invoke.texi: Mention -march=armv8-r and its extensions.

*** gcc/testsuite/ChangeLog ***

2017-01-31  Thomas Preud'homme  <thomas.preudho...@arm.com>

* lib/target-supports.exp: Generate
check_effective_target_arm_arch_v8r_ok, add_options_for_arm_arch_v8r
and check_effective_target_arm_arch_v8r_multilib.

*** libgcc/ChangeLog ***

2017-01-31  Thomas Preud'homme  <thomas.preudho...@arm.com>

* config/arm/lib1funcs.S: Define __ARM_ARCH__ to 8 for ARMv8-R.


Tested by building an arm-none-eabi GCC cross-compiler targeting
ARMv8-R.

Is this ok for stage1?

Best regards,

Thomas

On 29/06/17 16:13, Thomas Preudhomme wrote:

Please ignore this patch. I'll respin the patch on a more recent GCC.

Best regards,

Thomas

On 29/06/17 14:55, Thomas Preudhomme wrote:

Hi,

This patch adds support for the ARMv8-R architecture [1], which was recently
announced. User-level instructions for ARMv8-R are the same as those in
ARMv8-A AArch32 mode, so this patch defines ARMv8-R to have the same
features as ARMv8-A in the ARM backend.

[1] 
https://developer.arm.com/products/architecture/r-profile/docs/ddi0568/latest/arm-architecture-reference-manual-supplement-armv8-for-the-armv8-r-aarch32-architecture-profile 
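
As a rough way to see which architecture a given compiler invocation targets,
the sketch below reads the ACLE predefined macros __ARM_ARCH and
__ARM_ARCH_PROFILE. The helper names arch_version and arch_profile are
hypothetical and only serve this example; the expectation that a compiler
targeting ARMv8-R reports version 8 and profile 'R' follows from ACLE and is
an assumption here, not something verified against this patch. On hosts where
the macros are not defined the helpers fall back to 0 and '?'.

```c
/* Return the architecture version reported by the compiler through the
   ACLE macro __ARM_ARCH, or 0 when the macro is not defined (for example
   when building for a non-Arm host).  */
static inline int
arch_version (void)
{
#if defined (__ARM_ARCH)
  return __ARM_ARCH;
#else
  return 0;
#endif
}

/* Return the ACLE architecture profile letter ('A', 'R' or 'M'), or '?'
   when __ARM_ARCH_PROFILE is not defined.  */
static inline char
arch_profile (void)
{
#if defined (__ARM_ARCH_PROFILE)
  return (char) __ARM_ARCH_PROFILE;
#else
  return '?';
#endif
}
```

A test guarded by such checks could, for instance, skip itself unless
arch_version () is 8 and arch_profile () is 'R'.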



ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2017-01-31  Thomas Preud'homme  <thomas.preudho...@arm.com>

 * config/arm/arm-cpus.in (armv8-r, armv8-r+crc): Add new entry.
 * config/arm/arm-cpu-cdata.h: Regenerate.
 * config/arm/arm-cpu-data.h: Regenerate.
 * config/arm/arm-isa.h (ISA_ARMv8r): Define macro.
 * config/arm/arm-tables.opt: Regenerate.
 * config/arm/arm.h (enum base_architecture): Add BASE_ARCH_8R
 enumerator.
 * config/arm/bpabi.h (BE8_LINK_SPEC): Add entry for ARMv8-R and
 ARMv8-R with CRC extensions.
 * doc/invoke.texi: Mention -march=armv8-r and -march=armv8-r+crc
 options.  Document meaning of -march=armv8-r+crc.

*** gcc/testsuite/ChangeLog ***

2017-01-31  Thomas Preud'homme  <thomas.preudho...@arm.com>

 * lib/target-supports.exp: Generate
 check_effective_target_arm_arch_v8r_ok, add_options_for_arm_arch_v8r
 and check_effective_target_arm_arch_v8r_multilib.

*** libgcc/ChangeLog ***

2017-01-31  Thomas Preud'homme  <thomas.preudho...@arm.com>

 * config/arm/lib1funcs.S: Define __ARM_ARCH__ to 8 for ARMv8-R.

Tested by building an arm-none-eabi GCC cross-compiler targeting
ARMv8-R.

Is this ok for stage1?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 946d543ebb29416da9b4928161607cccacaa78a7..f35128acb7d68c6a0592355b9d3d56ee8f826aca 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -380,6 +380,22 @@ begin arch armv8-m.main
  option nodsp remove bit_ARMv7em
 end arch armv8-m.main
 
+begin arch armv8-r
+ tune for cortex-r4
+ tune flags CO_PROC
+ base 8R
+ profile R
+ isa ARMv8r
+ option crc add bit_crc32
+# fp.sp => fp-armv8 (d16); simd => simd + fp-armv8 + d32 + double precision
+# note: no fp option for fp-armv8 (d16) + double precision at the moment
+ option fp.sp add FP_ARMv8
+ option simd add FP_ARMv8 NEON
+ option crypto add FP_ARMv8 CRYPTO
+ option nocrypto remove
