Re: [PATCH] testsuite: Verify r0-r3 are extended with CMSE

2024-04-30 Thread Richard Earnshaw (lists)
On 30/04/2024 16:37, Torbjorn SVENSSON wrote:
> 
> 
> On 2024-04-30 17:11, Richard Earnshaw (lists) wrote:
>> On 27/04/2024 15:13, Torbjörn SVENSSON wrote:
>>> Add regression test to the existing zero/sign extend tests for CMSE to
>>> verify that r0, r1, r2 and r3 are properly extended, not just r0.
>>>
>>> Test is done using -O0 to ensure the instructions are in a predictable
>>> order.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * gcc.target/arm/cmse/extend-param.c: Add regression test.
>>>
>>> Signed-off-by: Torbjörn SVENSSON 
>>> ---
>>>   .../gcc.target/arm/cmse/extend-param.c    | 20 ++-
>>>   1 file changed, 19 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/gcc/testsuite/gcc.target/arm/cmse/extend-param.c 
>>> b/gcc/testsuite/gcc.target/arm/cmse/extend-param.c
>>> index 01fac786238..b8b8ecbff56 100644
>>> --- a/gcc/testsuite/gcc.target/arm/cmse/extend-param.c
>>> +++ b/gcc/testsuite/gcc.target/arm/cmse/extend-param.c
>>> @@ -93,4 +93,22 @@ __attribute__((cmse_nonsecure_entry)) char 
>>> boolSecureFunc (bool index) {
>>>   return 0;
>>>     return array[index];
>>>   -}
>>> \ No newline at end of file
>>> +}
>>> +
>>> +/*
>>> +**__acle_se_boolCharShortEnumSecureFunc:
>>> +**    ...
>>> +**    uxtb    r0, r0
>>> +**    uxtb    r1, r1
>>> +**    uxth    r2, r2
>>> +**    uxtb    r3, r3
>>> +**    ...
>>> +*/
>>> +__attribute__((cmse_nonsecure_entry,optimize(0))) char 
>>> boolCharShortEnumSecureFunc (bool a, unsigned char b, unsigned short c, 
>>> enum offset d) {
>>> +
>>> +  size_t index = a + b + c + d;
>>> +  if (index >= ARRAY_SIZE)
>>> +    return 0;
>>> +  return array[index];
>>> +
>>> +}
>>
>> Ok, but please can you add '-fshort-enums' to dg-options to ensure this test 
>> still behaves correctly if run with a different default (I missed that last 
>> time around).
> 
> Ok, I'll add that to extend-param.c. Do you want me to also add it to the 
> extend-return.c test case?
> 
> Kind regards,
> Torbjörn

Yes please, if it has the same issue.

R.


Re: [PATCH] testsuite: Verify r0-r3 are extended with CMSE

2024-04-30 Thread Richard Earnshaw (lists)
On 27/04/2024 15:13, Torbjörn SVENSSON wrote:
> Add regression test to the existing zero/sign extend tests for CMSE to
> verify that r0, r1, r2 and r3 are properly extended, not just r0.
> 
> Test is done using -O0 to ensure the instructions are in a predictable
> order.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/cmse/extend-param.c: Add regression test.
> 
> Signed-off-by: Torbjörn SVENSSON 
> ---
>  .../gcc.target/arm/cmse/extend-param.c| 20 ++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/cmse/extend-param.c 
> b/gcc/testsuite/gcc.target/arm/cmse/extend-param.c
> index 01fac786238..b8b8ecbff56 100644
> --- a/gcc/testsuite/gcc.target/arm/cmse/extend-param.c
> +++ b/gcc/testsuite/gcc.target/arm/cmse/extend-param.c
> @@ -93,4 +93,22 @@ __attribute__((cmse_nonsecure_entry)) char boolSecureFunc 
> (bool index) {
>  return 0;
>return array[index];
>  
> -}
> \ No newline at end of file
> +}
> +
> +/*
> +**__acle_se_boolCharShortEnumSecureFunc:
> +**   ...
> +**   uxtbr0, r0
> +**   uxtbr1, r1
> +**   uxthr2, r2
> +**   uxtbr3, r3
> +**   ...
> +*/
> +__attribute__((cmse_nonsecure_entry,optimize(0))) char 
> boolCharShortEnumSecureFunc (bool a, unsigned char b, unsigned short c, enum 
> offset d) {
> +
> +  size_t index = a + b + c + d;
> +  if (index >= ARRAY_SIZE)
> +return 0;
> +  return array[index];
> +
> +}

Ok, but please can you add '-fshort-enums' to dg-options to ensure this test 
still behaves correctly if run with a different default (I missed that last 
time around).

R.


Re: [PATCH][GCC] aarch64: Fix SCHEDULER_IDENT for Cortex-A510

2024-04-26 Thread Richard Earnshaw (lists)
On 25/04/2024 15:59, Richard Ball wrote:
> Hi Richard,
> 
> I committed this combined patch (with Cortex-A520) for trunk 
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=cab53aae43cf94171b01320c08302e47a5daa391
>  
> 
> 
> Am I ok to commit just the Cortex-A510 half into gcc-12 and gcc-13.

Yes, if that's the correct thing to do there.

R.

> 
> Thanks,
> Richard Ball
> --
> *From:* Richard Ball
> *Sent:* 12 March 2024 14:08
> *To:* gcc-patches@gcc.gnu.org ; Richard Earnshaw 
> ; Richard Sandiford ; 
> Marcus Shawcroft 
> *Subject:* [PATCH][GCC] aarch64: Fix SCHEDULER_IDENT for Cortex-A510
>  
> The SCHEDULER_IDENT for this CPU was incorrectly
> set to cortexa55, which is incorrect. This can cause
> sub-optimal asm to be generated.
> 
> Ok for trunk?
> 
> Can I also backport this to gcc-12 and gcc-13?
> 
> gcc/ChangeLog:
>     PR target/114272
>     * config/aarch64/aarch64-cores.def (AARCH64_CORE):
>     Change SCHEDULER_IDENT from cortexa55 to cortexa53
>     for Cortex-A510.



Re: [PATCH] arm: Zero/Sign extends for CMSE security

2024-04-26 Thread Richard Earnshaw (lists)
On 26/04/2024 09:39, Torbjorn SVENSSON wrote:
> Hi,
> 
> On 2024-04-25 16:25, Richard Ball wrote:
>> Hi Torbjorn,
>>
>> Thanks very much for the comments.
>> I think given that the code that handles this, is within a 
>> FOREACH_FUNCTION_ARGS loop.
>> It seems a fairly safe assumption that if the code works for one that it 
>> will work for all.
>> To go back and add extra tests to me seems a little overkill.
> 
> For verifying that the implementation does the right thing now, no, but for 
> verifying against future regressions, then yes.
> 
> So, from a regression point of view, I think it makes sense to have the check 
> that more than the first argument is managed properly.
> 
> Kind regards,
> Torbjörn

Feel free to post some additional tests, Torbjorn.

R.


Re: [PATCH] arm: Zero/Sign extends for CMSE security

2024-04-25 Thread Richard Earnshaw (lists)
On 24/04/2024 16:55, Richard Ball wrote:
> This patch makes the following changes:
> 
> 1) When calling a secure function from non-secure code then any arguments
>smaller than 32-bits that are passed in registers are zero- or 
> sign-extended.
> 2) After a non-secure function returns into secure code then any return value
>smaller than 32-bits that is passed in a register is  zero- or 
> sign-extended.
> 
> This patch addresses the following CVE-2024-0151.
> 
> gcc/ChangeLog:
> PR target/114837
> * config/arm/arm.cc (cmse_nonsecure_call_inline_register_clear):
>   Add zero/sign extend.
> (arm_expand_prologue): Add zero/sign extend.
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.target/arm/cmse/extend-param.c: New test.
> * gcc.target/arm/cmse/extend-return.c: New test.

OK.  And OK to backport to active branches.

R.


Re: [PATCH] [testsuite] [arm] require arm_v8_1m_main for pacbti tests

2024-04-19 Thread Richard Earnshaw (lists)
On 19/04/2024 13:45, Alexandre Oliva wrote:
> On Apr 16, 2024, "Richard Earnshaw (lists)"  wrote:
> 
>> The require-effective-target flags test whether a specific set of
>> flags will make the compilation work, so they need to be used in
>> conjunction with the corresponding dg-add-options flags that then
>> apply those options.
> 
> *nod*, that's the theory.  Problem is the architectures suported by
> [add_options_for_]arm_arch_*[_ok] do not match exactly those expected by
> the tests, and I can't quite tell whether the subtle changes they would
> introduce would change what they intend to test, or even whether the
> differences are irrelevant, or would be sensible to add as variants to
> the dg machinery.  I think it would take someone more familiar than I am
> with all of the ARM variants to do this correctly.  I don't even know
> how these changes would need to be tested to be sure they remain
> correct.

It's ok to add additional variations to the table of variants in 
target-supports.exp, but we should avoid writing new specific run-time 
functions unless we really want an executable test.

I started doing some cleanup of the Arm tests infrastructure during phase 3, 
but stopped during phase 4 as I wanted to minimise the changes being made now.  
I plan to go back and work on it some more once stage 1 re-opens.

> 
> Would you be willing to take it from here, or would you accept the patch
> as an incremental yet imperfect improvement, or would you prefer to
> guide me in making it correct, and in verifying it (there are questions
> below)?  I don't have a lot of cycles to put into this (we've already
> worked around the testsuite bugs we ran into), but it would be desirable
> to get a fix into GCC as well, if we can converge on one without
> unreasonably burdening anyone.
> 
> 
>   v8_1m_main "-march=armv8.1-m.main+fp -mthumb" __ARM_ARCH_8M_MAIN__
>   v8_1m_main_pacbti "-march=armv8.1-m.main+pacbti+fp -mthumb"
>   "__ARM_ARCH_8M_MAIN__ && __ARM_FEATURE_BTI && 
> __ARM_FEATURE_PAUTH
> 
> Why do these have +fp in -march but not in the v8_1m* arch name?

It's ... complicated :)

The +fp is there because, with the move to having -mfpu=auto as the default, we 
need to avoid problems when the compiler has been configured with 
--with-float=hard, which requires the extension register set (fp or vector 
support) even if the test code itself doesn't care.  The best way to handle 
this in most cases is to give the architecture strings a default FPU 
specification (ie +fp). 

> 
> 
> gcc/testsuite/g++.target/arm/pac-1.C:
> /* { dg-options "-march=armv8.1-m.main+mve+pacbti -mbranch-protection=pac-ret 
> -mthumb -mfloat-abi=hard -g -O0" } */
> 
> v8_1m_main_pacbti plus +mve minus +fp.
> Do we need a dg arch for that?

I'd be inclined to drop +mve from this one; there's nothing I can see in the 
test that would generate mve instructions, so I think it's irrelevant.  We can 
use the existing v8_1m_main_pacbti operations.

> 
> 
> gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c:
> /* { dg-additional-options "-march=armv8.1-m.main+pacbti+fp --save-temps 
> -mfloat-abi=hard" } */
> gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c:
> /* { dg-options "-march=armv8.1-m.main+fp+pacbti" } */
> 
> v8_1m_main_pacbti minus -mthumb.
> AFAICT the -mthumb is redundant.

Nearly, but not quite.  Although the gcc driver knows that m-profile 
architectures require thumb, that's not enough to override an explicit -marm 
from a testsuite configuration run, so if your site.exp file adds -marm in a 
test configuration we need to override that or the test will fail.  But the 
table based list of options will do that for you.

> 
> 
> gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c:
> /* { dg-options "-march=armv8-m.main+fp -mfloat-abi=softfp" } */
> 
> v8_1m_main minus -mthumb.
> AFAICT the -mthumb is redundant.

As above

> 
> 
> gcc/testsuite/gcc.target/arm/bti-1.c:
> /* { dg-options "-march=armv8.1-m.main -mthumb -mfloat-abi=softfp 
> -mbranch-protection=bti --save-temps" } */
> gcc/testsuite/gcc.target/arm/bti-2.c:
> /* { dg-options "-march=armv8.1-m.main -mthumb -mfloat-abi=softfp 
> -mbranch-protection=bti --save-temps" } */
> 
> v8_1m_main minus +fp.> 
> Can these be bumped to +fp, or do we need an extra dg arch?
> 
> Are these missing +pacbti?

The tests themselves do not require fp, but if we use the effective-target 
rules (arm_arch_v8_1m_main), we can remove the -march, -mthumb and -mfloat-abi 
flags from these tests.

These tests for BTI should NOT have +pacbti: they're testing that the compiler 
generates the right nop-based implementation that is backw

Re: [PATCH]AArch64: remove reliance on register allocator for simd/gpreg costing. [PR114741]

2024-04-18 Thread Richard Earnshaw (lists)
On 18/04/2024 11:11, Tamar Christina wrote:
> Hi All,
> 
> In PR114741 we see that we have a regression in codegen when SVE is enable 
> where
> the simple testcase:
> 
> void foo(unsigned v, unsigned *p)
> {
> *p = v & 1;
> }
> 
> generates
> 
> foo:
> fmovs31, w0
> and z31.s, z31.s, #1
> str s31, [x1]
> ret
> 
> instead of:
> 
> foo:
> and w0, w0, 1
> str w0, [x1]
> ret
> 
> This causes an impact it not just codesize but also performance.  This is 
> caused
> by the use of the ^ constraint modifier in the pattern 3.
> 
> The documentation states that this modifier should only have an effect on the
> alternative costing in that a particular alternative is to be preferred unless
> a non-psuedo reload is needed.
> 
> The pattern was trying to convey that whenever both r and w are required, that
> it should prefer r unless a reload is needed.  This is because if a reload is
> needed then we can construct the constants more flexibly on the SIMD side.
> 
> We were using this so simplify the implementation and to get generic cases 
> such
> as:
> 
> double negabs (double x)
> {
>unsigned long long y;
>memcpy (, , sizeof(double));
>y = y | (1UL << 63);
>memcpy (, , sizeof(double));
>return x;
> }
> 
> which don't go through an expander.
> However the implementation of ^ in the register allocator is not according to
> the documentation in that it also has an effect during coloring.  During 
> initial
> register class selection it applies a penalty to a class, similar to how ? 
> does.
> 
> In this example the penalty makes the use of GP regs expensive enough that it 
> no
> longer considers them:
> 
> r106: preferred FP_REGS, alternative NO_REGS, allocno FP_REGS
> ;;3--> b  0: i   9 r106=r105&0x1
> :cortex_a53_slot_any:GENERAL_REGS+0(-1)FP_REGS+1(1)PR_LO_REGS+0(0)
>  PR_HI_REGS+0(0):model 4
> 
> which is not the expected behavior.  For GCC 14 this is a conservative fix.
> 
> 1. we remove the ^ modifier from the logical optabs.
> 
> 2. In order not to regress copysign we then move the copysign expansion to
>directly use the SIMD variant.  Since copysign only supports floating point
>modes this is fine and no longer relies on the register allocator to select
>the right alternative.
> 
> It once again regresses the general case, but this case wasn't optimized in
> earlier GCCs either so it's not a regression in GCC 14.  This change gives
> strict better codegen than earlier GCCs and still optimizes the important 
> cases.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
> 
>   PR target/114741
>   * config/aarch64/aarch64.md (3): Remove ^ from alt 2.
>   (copysign3): Use SIMD version of IOR directly.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR target/114741
>   * gcc.target/aarch64/fneg-abs_2.c: Update codegen.
>   * gcc.target/aarch64/fneg-abs_4.c: xfail for now.
>   * gcc.target/aarch64/pr114741.c: New test.
> 
> ---
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 
> 385a669b9b3c31cc9108a660e881b9091c71fc7c..dbde066f7478bec51a8703b017ea553aa98be309
>  100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -4811,7 +4811,7 @@ (define_insn "3"
>""
>{@ [ cons: =0 , 1  , 2; attrs: type , arch  ]
>   [ r, %r , r; logic_reg   , * ] \t%0, 
> %1, %2
> - [ rk   , ^r ,  ; logic_imm   , * ] \t%0, 
> %1, %2
> + [ rk   , r  ,  ; logic_imm   , * ] \t%0, 
> %1, %2
>   [ w, 0  ,  ; *   , sve   ] \t%Z0., 
> %Z0., #%2
>   [ w, w  , w; neon_logic  , simd  ] 
> \t%0., %1., %2.
>}
> @@ -7192,22 +7192,29 @@ (define_expand "copysign3"
> (match_operand:GPF 2 "nonmemory_operand")]
>"TARGET_SIMD"
>  {
> -  machine_mode int_mode = mode;
> -  rtx bitmask = gen_reg_rtx (int_mode);
> -  emit_move_insn (bitmask, GEN_INT (HOST_WIDE_INT_M1U
> - << (GET_MODE_BITSIZE (mode) - 1)));
> +  rtx signbit_const = GEN_INT (HOST_WIDE_INT_M1U
> +<< (GET_MODE_BITSIZE (mode) - 1));
>/* copysign (x, -1) should instead be expanded as orr with the sign
>   bit.  */
>rtx op2_elt = unwrap_const_vec_duplicate (operands[2]);
>if (GET_CODE (op2_elt) == CONST_DOUBLE
>&& real_isneg (CONST_DOUBLE_REAL_VALUE (op2_elt)))
>  {
> -  emit_insn (gen_ior3 (
> - lowpart_subreg (int_mode, operands[0], mode),
> - lowpart_subreg (int_mode, operands[1], mode), bitmask));
> +  rtx v_bitmask
> + = force_reg (V2mode,
> +  gen_const_vec_duplicate (V2mode,
> +   signbit_const));
> +
> +  emit_insn (gen_iorv23 (
> + lowpart_subreg (V2mode, operands[0], mode),
> + lowpart_subreg 

Re: [PATCH] [testsuite] [arm] accept empty init for bfloat16

2024-04-16 Thread Richard Earnshaw (lists)
On 16/04/2024 04:50, Alexandre Oliva wrote:
> 
> Complete r13-2205, adjusting an arm-specific test that expects a
> no-longer-issued error at an empty initializer.
> 
> Regstrapped on x86_64-linux-gnu.  Also tested with gcc-13 on arm-,
> aarch64-, x86- and x86_64-vxworks7r2.  Ok to install?
> 
> 
> for  gcc/testsuite/ChangeLog
> 
>   * gcc.target/arm/bfloat16_scalar_typecheck.c: Accept C23
> empty initializers.
> ---
>  .../gcc.target/arm/bfloat16_scalar_typecheck.c |4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/bfloat16_scalar_typecheck.c 
> b/gcc/testsuite/gcc.target/arm/bfloat16_scalar_typecheck.c
> index 8c80c55bc9f4c..04ede93bda152 100644
> --- a/gcc/testsuite/gcc.target/arm/bfloat16_scalar_typecheck.c
> +++ b/gcc/testsuite/gcc.target/arm/bfloat16_scalar_typecheck.c
> @@ -42,7 +42,7 @@ bfloat16_t footest (bfloat16_t scalar0)
>short initi_1_4 = glob_bfloat; /* { dg-error {invalid conversion from type 
> 'bfloat16_t'} } */
>double initi_1_5 = glob_bfloat; /* { dg-error {invalid conversion from 
> type 'bfloat16_t'} } */
>  
> -  bfloat16_t scalar2_1 = {}; /* { dg-error {empty scalar initializer} } */
> +  bfloat16_t scalar2_1 = {};
>bfloat16_t scalar2_2 = { glob_bfloat };
>bfloat16_t scalar2_3 = { 0 }; /* { dg-error {invalid conversion to type 
> 'bfloat16_t'} } */
>bfloat16_t scalar2_4 = { 0.1 }; /* { dg-error {invalid conversion to type 
> 'bfloat16_t'} } */
> @@ -94,7 +94,7 @@ bfloat16_t footest (bfloat16_t scalar0)
>  
>/* Compound literals.  */
>  
> -  (bfloat16_t) {}; /* { dg-error {empty scalar initializer} } */
> +  (bfloat16_t) {};
>(bfloat16_t) { glob_bfloat };
>(bfloat16_t) { 0 }; /* { dg-error {invalid conversion to type 
> 'bfloat16_t'} } */
>(bfloat16_t) { 0.1 }; /* { dg-error {invalid conversion to type 
> 'bfloat16_t'} } */
> 


This test is checking for errors.  Perhaps it would be better to select an 
older version of the standard and then set pedantic-error mode.

R.


Re: [testsuite] [aarch64] Require fpic effective target

2024-04-16 Thread Richard Earnshaw (lists)
On 16/04/2024 04:08, Alexandre Oliva wrote:
> Regstrapped on x86_64-linux-gnu.  Also tested with gcc-13 on arm-,
> aarch64-, x86- and x86_64-vxworks7r2.  Ok to install?
> 
> Co-authored-by: Olivier Hainque 
> 
> for  gcc/testsuite/ChangeLog
> 
>   * gcc.target/aarch64/pr94201.c: Add missing
>   dg-require-effective-target fpic.
>   * gcc.target/aarch64/pr103085.c: Likewise.
> 
> ---
>  gcc/testsuite/gcc.target/aarch64/pr103085.c |1 +
>  gcc/testsuite/gcc.target/aarch64/pr94201.c  |1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr103085.c 
> b/gcc/testsuite/gcc.target/aarch64/pr103085.c
> index dbc9c15b71f22..347280ed42b2d 100644
> --- a/gcc/testsuite/gcc.target/aarch64/pr103085.c
> +++ b/gcc/testsuite/gcc.target/aarch64/pr103085.c
> @@ -1,5 +1,6 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -fstack-protector-strong -fPIC" } */
> +/* { dg-require-effective-target fpic } */
>  
>  void g(int*);
>  void
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr94201.c 
> b/gcc/testsuite/gcc.target/aarch64/pr94201.c
> index 691761691868a..3b9b79059e02b 100644
> --- a/gcc/testsuite/gcc.target/aarch64/pr94201.c
> +++ b/gcc/testsuite/gcc.target/aarch64/pr94201.c
> @@ -1,5 +1,6 @@
>  /* { dg-do compile } */
>  /* { dg-options "-mcmodel=tiny -mabi=ilp32 -fPIC" } */
> +/* { dg-require-effective-target fpic } */
>  
>  extern int bar (void *);
>  extern long long a;
> 


OK

R.


Re: [PATCH] [testsuite] [arm] require arm_v8_1m_main for pacbti tests

2024-04-16 Thread Richard Earnshaw (lists)
On 16/04/2024 04:48, Alexandre Oliva wrote:
> 
> arm pac and bti tests that use -march=armv8.1-m.main get an implicit
> -mthumb, that is incompatible with vxworks kernel mode.  Declaring the
> requirement for a 8.1-m.main-compatible toolchain is enough to avoid
> those fails, because the toolchain feature test fails in kernel mode.
> 
> Regstrapped on x86_64-linux-gnu.  Also tested with gcc-13 on arm-,
> aarch64-, x86- and x86_64-vxworks7r2.  Ok to install?
> 
> 
> for  gcc/testsuite/ChangeLog
> 
>   * g++.target/arm/pac-1.C: Require arm_arch_v8_1m_main.
>   * gcc.target/arm/acle/pacbti-m-predef-11.c: Likewise.
>   * gcc.target/arm/acle/pacbti-m-predef-12.c: Likewise.
>   * gcc.target/arm/acle/pacbti-m-predef-7.c: Likewise.
>   * gcc.target/arm/bti-1.c: Likewise.
>   * gcc.target/arm/bti-2.c: Likewise.
> ---
>  gcc/testsuite/g++.target/arm/pac-1.C   |1 +
>  .../gcc.target/arm/acle/pacbti-m-predef-11.c   |1 +
>  .../gcc.target/arm/acle/pacbti-m-predef-12.c   |1 +
>  .../gcc.target/arm/acle/pacbti-m-predef-7.c|1 +
>  gcc/testsuite/gcc.target/arm/bti-1.c   |1 +
>  gcc/testsuite/gcc.target/arm/bti-2.c   |1 +
>  6 files changed, 6 insertions(+)
> 
> diff --git a/gcc/testsuite/g++.target/arm/pac-1.C 
> b/gcc/testsuite/g++.target/arm/pac-1.C
> index f671a27b048c6..f48ad6cc5cb65 100644
> --- a/gcc/testsuite/g++.target/arm/pac-1.C
> +++ b/gcc/testsuite/g++.target/arm/pac-1.C
> @@ -2,6 +2,7 @@
>  /* { dg-do compile } */
>  /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
> "-mcpu=*" } } */
>  /* { dg-options "-march=armv8.1-m.main+mve+pacbti 
> -mbranch-protection=pac-ret -mthumb -mfloat-abi=hard -g -O0" } */
> +/* { dg-require-effective-target arm_arch_v8_1m_main_ok } */

The require-effective-target flags test whether a specific set of flags will 
make the compilation work, so they need to be used in conjunction with the 
corresponding dg-add-options flags that then apply those options.  It isn't 
safe to just add a different architecture flag instead.  So if you're going to 
use this effective target, you should use it along with "dg-add-options 
arm_arch_v8_1m_main" (ie the effective-target name minus the trailing '_ok'), 
and then replace dg-options with dg-additional-options adding the remaining 
flags.  You can then remove the dg-skip-if as well because that's what the 
require-effective-target flag is doing.  So something like

dg-do compile
dg-require-effective-target arm_arch_v8_1m_main_ok
dg-add-options arm_arch_v8_1m_main
dg-additional-options "-mbranch-protection=pac-ret -g -O0"

But this test is also adding pacbti to the architecture flags, so it would 
probably be better to use v8_1m_main_pacbti_ok as the effective target.  It's 
not identical to the options above, but it's probably sufficient for this test. 
 Each test below will need checking for the exact flags that are needed for the 
test in question.


>  
>  __attribute__((noinline)) void
>  fn1 (int a, int b, int c)
> diff --git a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c 
> b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c
> index 6a5ae92c567f3..dba4f491cfea7 100644
> --- a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c
> +++ b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c
> @@ -1,6 +1,7 @@
>  /* { dg-do compile } */
>  /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
> "-mcpu=*" "-mfloat-abi=*" } } */
>  /* { dg-options "-march=armv8.1-m.main+fp+pacbti" } */
> +/* { dg-require-effective-target arm_arch_v8_1m_main_ok } */
>  
>  #if (__ARM_FEATURE_BTI != 1)
>  #error "Feature test macro __ARM_FEATURE_BTI_DEFAULT should be defined to 1."
> diff --git a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c 
> b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c
> index db40b17c3b030..308a41eb4ba4c 100644
> --- a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c
> +++ b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c
> @@ -1,6 +1,7 @@
>  /* { dg-do compile } */
>  /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
> "-mcpu=*" } } */
>  /* { dg-options "-march=armv8-m.main+fp -mfloat-abi=softfp" } */
> +/* { dg-require-effective-target arm_arch_v8_1m_main_ok } */
>  
>  #if defined (__ARM_FEATURE_BTI)
>  #error "Feature test macro __ARM_FEATURE_BTI should not be defined."
> diff --git a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c 
> b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c
> index 1b25907635e24..10836a84bde56 100644
> --- a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c
> +++ b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c
> @@ -1,6 +1,7 @@
>  /* { dg-do compile } */
>  /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
> "-mcpu=*" } } */
>  /* { dg-additional-options "-march=armv8.1-m.main+pacbti+fp --save-temps 
> -mfloat-abi=hard" } */
> +/* { 

Re: [PATCH 1/1] aarch64: Sync aarch64-sys-regs.def with Binutils

2024-03-20 Thread Richard Earnshaw (lists)
On 20/03/2024 11:21, Yury Khrustalev wrote:
> This patch updates `aarch64-sys-regs.def', bringing it into sync with
> the Binutils source.
> 
> gcc/ChangeLog:
> 
> * config/aarch64/aarch64-sys-regs.def: Copy from Binutils.

Thanks, I've pushed this.  It's trivial enough and there's value of keeping it 
in sync with binutils.

One comment though, there should be one hard tab before "* config/..."; you 
seem to have some other random characters there that looked like white space.

R.

> ---
>  gcc/config/aarch64/aarch64-sys-regs.def | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/gcc/config/aarch64/aarch64-sys-regs.def 
> b/gcc/config/aarch64/aarch64-sys-regs.def
> index 6a948171d6e..8b65673a5d6 100644
> --- a/gcc/config/aarch64/aarch64-sys-regs.def
> +++ b/gcc/config/aarch64/aarch64-sys-regs.def
> @@ -521,6 +521,7 @@
>SYSREG ("id_aa64isar0_el1",CPENC (3,0,0,6,0),  F_REG_READ, 
> AARCH64_NO_FEATURES)
>SYSREG ("id_aa64isar1_el1",CPENC (3,0,0,6,1),  F_REG_READ, 
> AARCH64_NO_FEATURES)
>SYSREG ("id_aa64isar2_el1",CPENC (3,0,0,6,2),  F_REG_READ, 
> AARCH64_NO_FEATURES)
> +  SYSREG ("id_aa64isar3_el1",CPENC (3,0,0,6,3),  F_REG_READ, 
> AARCH64_NO_FEATURES)
>SYSREG ("id_aa64mmfr0_el1",CPENC (3,0,0,7,0),  F_REG_READ, 
> AARCH64_NO_FEATURES)
>SYSREG ("id_aa64mmfr1_el1",CPENC (3,0,0,7,1),  F_REG_READ, 
> AARCH64_NO_FEATURES)
>SYSREG ("id_aa64mmfr2_el1",CPENC (3,0,0,7,2),  F_REG_READ, 
> AARCH64_NO_FEATURES)



Re: [PATCH] arm: [MVE intrinsics] Fix support for loads [PR target/114323]

2024-03-18 Thread Richard Earnshaw (lists)




On 15/03/2024 20:08, Christophe Lyon wrote:

The testcase in this PR shows that we would load from an uninitialized
location, because the vld1 instrinsics are reported as "const". This
is because function_instance::reads_global_state_p() does not take
CP_READ_MEMORY into account.  Fixing this gives vld1 the "pure"
attribute instead, and solves the problem.

2024-03-15  Christophe Lyon  

PR target/114323
gcc/
* config/arm/arm-mve-builtins.cc
(function_instance::reads_global_state_p): Take CP_READ_MEMORY
into account.

gcc/testsuite/
* gcc.target/arm/mve/pr114323.c: New.


OK.

R.


---
  gcc/config/arm/arm-mve-builtins.cc  |  2 +-
  gcc/testsuite/gcc.target/arm/mve/pr114323.c | 22 +
  2 files changed, 23 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/gcc.target/arm/mve/pr114323.c

diff --git a/gcc/config/arm/arm-mve-builtins.cc 
b/gcc/config/arm/arm-mve-builtins.cc
index 2f2c0f4a02a..6a5775c67e5 100644
--- a/gcc/config/arm/arm-mve-builtins.cc
+++ b/gcc/config/arm/arm-mve-builtins.cc
@@ -746,7 +746,7 @@ function_instance::reads_global_state_p () const
if (flags & CP_READ_FPCR)
  return true;
  
-  return false;

+  return flags & CP_READ_MEMORY;
  }
  
  /* Return true if calls to the function could modify some form of

diff --git a/gcc/testsuite/gcc.target/arm/mve/pr114323.c 
b/gcc/testsuite/gcc.target/arm/mve/pr114323.c
new file mode 100644
index 000..bd9127b886a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/pr114323.c
@@ -0,0 +1,22 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_mve_hw } */
+/* { dg-options "-O2" } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+
+#include 
+
+__attribute__((noipa))
+uint32x4_t foo (void) {
+  uint32x4_t V0 = vld1q_u32(((const uint32_t[4]){1, 2, 3, 4}));
+  return V0;
+}
+
+int main(void)
+{
+  uint32_t buf[4];
+ vst1q_u32 (buf, foo());
+
+  for (int i = 0; i < 4; i++)
+if (buf[i] != i+1)
+  __builtin_abort ();
+}


Re: [PATCH] testsuite: Turn errors back into warnings in arm/acle/cde-mve-error-2.c

2024-03-18 Thread Richard Earnshaw (lists)




On 15/03/2024 15:13, Thiago Jung Bauermann wrote:


Hello,

"Richard Earnshaw (lists)"  writes:


On 13/01/2024 20:46, Thiago Jung Bauermann wrote:

diff --git a/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c 
b/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c
index 5b7774825442..da283a06a54d 100644
--- a/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c
+++ b/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c
@@ -2,6 +2,7 @@

  /* { dg-do assemble } */
  /* { dg-require-effective-target arm_v8_1m_main_cde_mve_fp_ok } */
+/* { dg-options "-fpermissive" } */
  /* { dg-add-options arm_v8_1m_main_cde_mve_fp } */

  /* The error checking files are split since there are three kinds of
@@ -115,73 +116,73 @@ uint8x16_t test_bad_immediates (uint8x16_t n, uint8x16_t 
m, int someval,

/* `imm' is of wrong type.  */
accum += __arm_vcx1q_u8 (0, "");/* { dg-error {argument 
2 to '__builtin_arm_vcx1qv16qi' must be a constant immediate in range \[0-4095\]} } */
-  /* { dg-warning {passing argument 2 of '__builtin_arm_vcx1qv16qi' makes integer from 
pointer without a cast \[-Wint-conversion\]} "" { target *-*-* } 117 } */
+  /* { dg-warning {passing argument 2 of '__builtin_arm_vcx1qv16qi' makes integer from 
pointer without a cast \[-Wint-conversion\]} "" { target *-*-* } 118 } */


Absolute line numbers are a pain, but I think we can use '.-1' (without the 
quotes) in
these cases to minimize the churn.


That worked, thank you for the tip.


If that works, ok with that change.


I took the opportunity to request commit access to the GCC repo so that
I can commit the patch myself. Sorry for the delay. I'll commit it as
soon as I get it.

Thank you for the patch review! I'm including below the updated version.


I pushed this, thanks.

R.



--
Thiago


 From 78e70788da5ed849d7828b0219d3aa8955ad0fea Mon Sep 17 00:00:00 2001
From: Thiago Jung Bauermann 
Date: Sat, 13 Jan 2024 14:28:07 -0300
Subject: [PATCH v2] testsuite: Turn errors back into warnings in
  arm/acle/cde-mve-error-2.c
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Since commit 2c3db94d9fd ("c: Turn int-conversion warnings into
permerrors") the test fails with errors such as:

   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
32)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
33)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
34)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
35)
 ⋮
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 118 (test for 
warnings, line 117)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
119)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 120 (test for 
warnings, line 119)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
121)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 122 (test for 
warnings, line 121)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
123)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 124 (test for 
warnings, line 123)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
125)
 ⋮
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0  (test for excess errors)

There's a total of 1016 errors.  Here's a sample of the excess errors:

   Excess errors:
   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:117:31: 
error: passing argument 2 of '__builtin_arm_vcx1qv16qi' makes integer from 
pointer without a cast [-Wint-conversion]
   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:119:3: 
error: passing argument 3 of '__builtin_arm_vcx1qav16qi' makes integer from 
pointer without a cast [-Wint-conversion]
   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:121:3: 
error: passing argument 3 of '__builtin_arm_vcx2qv16qi' makes integer from 
pointer without a cast [-Wint-conversion]
   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:123:3: 
error: passing argument 3 of '__builtin_arm_vcx2qv16qi' makes integer from 
pointer without a cast [-Wint-conversion]

The test expects these messages to be warnings, not errors.  My first try
was to change it to expect them as errors instead.  This didn't work, IIUC
because the error prevents the compiler from continuing processing the file
and thus other errors which are expected by the test don't get emitted.

Therefore, add -fpermissive so that the test behaves as it did previously.
Because of the additional line in the header, the line numbers of the
expected warnings don't match anymore so replace them with ".-1" as
suggested by Richard Earnshaw.

Tested on armv8l-linux-gnueabihf.

gcc/testsuite/ChangeLog:
* gcc.target/arm/acle/cde-mve-error-

Re: [PATCH] aarch64: Fix TImode __sync_*_compare_and_exchange expansion with LSE [PR114310]

2024-03-14 Thread Richard Earnshaw (lists)




On 14/03/2024 08:37, Jakub Jelinek wrote:

Hi!

The following testcase ICEs with LSE atomics.
The problem is that the @atomic_compare_and_swap expander uses
aarch64_reg_or_zero predicate for the desired operand, which is fine,
given that for most of the modes and even for TImode in some cases
it can handle zero immediate just fine, but the TImode
@aarch64_compare_and_swap_lse just uses register_operand for
that operand instead, again intentionally so, because the casp,
caspa, caspl and caspal instructions need to use a pair of consecutive
registers for the operand and xzr is just one register and we can't
just store zero into the link register to emulate pair of zeros.

So, the following patch fixes that by forcing the newval operand into
a register for the TImode LSE case.

Bootstrapped/regtested on aarch64-linux, ok for trunk?


An alternative fix would be to use a mode_attr to pick a different 
predicate for TImode.  But that's probably just a matter of taste; I'm 
not sure that one would be better than the other in reality.


OK (or with my suggestion if you prefer).

R.



2024-03-14  Jakub Jelinek  

PR target/114310
* config/aarch64/aarch64.cc (aarch64_expand_compare_and_swap): For
TImode force newval into a register.

* gcc.dg/pr114310.c: New test.

--- gcc/config/aarch64/aarch64.cc.jj2024-03-12 10:16:12.024101665 +0100
+++ gcc/config/aarch64/aarch64.cc   2024-03-13 18:55:39.147986554 +0100
@@ -24693,6 +24693,8 @@ aarch64_expand_compare_and_swap (rtx ope
  rval = copy_to_mode_reg (r_mode, oldval);
else
emit_move_insn (rval, gen_lowpart (r_mode, oldval));
+  if (mode == TImode)
+   newval = force_reg (mode, newval);
  
emit_insn (gen_aarch64_compare_and_swap_lse (mode, rval, mem,

   newval, mod_s));
--- gcc/testsuite/gcc.dg/pr114310.c.jj  2024-03-13 19:09:25.322597418 +0100
+++ gcc/testsuite/gcc.dg/pr114310.c 2024-03-13 19:08:50.802073314 +0100
@@ -0,0 +1,20 @@
+/* PR target/114310 */
+/* { dg-do run { target int128 } } */
+
+volatile __attribute__((aligned (sizeof (__int128_t __int128_t v = 10;
+
+int
+main ()
+{
+#if __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
+  if (__sync_val_compare_and_swap (, (__int128_t) 10, (__int128_t) 0) != 10)
+__builtin_abort ();
+  if (__sync_val_compare_and_swap (, (__int128_t) 10, (__int128_t) 15) != 0)
+__builtin_abort ();
+  if (__sync_val_compare_and_swap (, (__int128_t) 0, (__int128_t) 42) != 0)
+__builtin_abort ();
+  if (__sync_val_compare_and_swap (, (__int128_t) 31, (__int128_t) 35) != 42)
+__builtin_abort ();
+#endif
+  return 0;
+}

Jakub



Re: [PATCH] arm: fix c23 0-named-args caller-side stdarg

2024-03-07 Thread Richard Earnshaw (lists)
On 06/03/2024 20:28, Alexandre Oliva wrote:
> On Mar  1, 2024, "Richard Earnshaw (lists)"  wrote:
> 
>> On 01/03/2024 04:38, Alexandre Oliva wrote:
>>> Thanks for the review.
> 
>> For closure, Jakub has just pushed a patch to the generic code, so I
>> don't think we need this now.
> 
> ACK.  I see the c2x-stdarg-4.c test is now passing in our arm-eabi
> gcc-13 tree.  Thank you all.
> 
> Alas, the same nightly build showed a new riscv fail in c23-stdarg-6.c,
> that also got backported to gcc-13.  Presumably it's failing in the
> trunk as well, both riscv32-elf and riscv64-elf.
> 
> I haven't looked into whether it's a regression brought about by the
> patch or just a new failure mode that the new test exposed.  Either way,
> I'm not sure whether to link this new failure to any of the associated
> PRs or to file a new one, but, FTR, I'm going to look into it.
> 

I'd suggest a new pr.  It's easier to track than re-opening an existing on.

R.

> -- 
> Alexandre Oliva, happy hacker    https://FSFLA.org/blogs/lxo/ 
> <https://FSFLA.org/blogs/lxo/>
>    Free Software Activist   GNU Toolchain Engineer
> More tolerance and less prejudice are key for inclusion and diversity
> Excluding neuro-others for not behaving ""normal"" is *not* inclusive



Re: [PATCH] arm: Support -mfdpic for more targets

2024-03-06 Thread Richard Earnshaw (lists)
On 06/03/2024 05:07, Fangrui Song wrote:
> On Fri, Feb 23, 2024 at 7:33 PM Fangrui Song  wrote:
>>
>> From: Fangrui Song 
>>
>> Targets that are not arm*-*-uclinuxfdpiceabi can use -S -mfdpic, but -c
>> -mfdpic does not pass --fdpic to gas.  This is an unnecessary
>> restriction.  Just define the ASM_SPEC in bpabi.h.
>>
>> Additionally, use armelf[b]_linux_fdpiceabi emulations for -mfdpic in
>> linux-eabi.h.  This will allow a future musl fdpic port to use the
>> desired BFD emulation.
>>
>> gcc/ChangeLog:
>>
>> * config/arm/bpabi.h (TARGET_FDPIC_ASM_SPEC): Transform -mfdpic.
>> * config/arm/linux-eabi.h (TARGET_FDPIC_LINKER_EMULATION): Define.
>> (SUBTARGET_EXTRA_LINK_SPEC): Use TARGET_FDPIC_LINKER_EMULATION
>> if -mfdpic.
>> ---
>>  gcc/config/arm/bpabi.h  | 2 +-
>>  gcc/config/arm/linux-eabi.h | 5 -
>>  2 files changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/config/arm/bpabi.h b/gcc/config/arm/bpabi.h
>> index 7a279f3ed3c..6778be1a8bf 100644
>> --- a/gcc/config/arm/bpabi.h
>> +++ b/gcc/config/arm/bpabi.h
>> @@ -55,7 +55,7 @@
>>  #define TARGET_FIX_V4BX_SPEC " %{mcpu=arm8|mcpu=arm810|mcpu=strongarm*"\
>>    "|march=armv4|mcpu=fa526|mcpu=fa626:--fix-v4bx}"
>>
>> -#define TARGET_FDPIC_ASM_SPEC ""
>> +#define TARGET_FDPIC_ASM_SPEC "%{mfdpic: --fdpic}"
>>
>>  #define BE8_LINK_SPEC  \
>>    "%{!r:%{!mbe32:%:be8_linkopt(%{mlittle-endian:little}"   \
>> diff --git a/gcc/config/arm/linux-eabi.h b/gcc/config/arm/linux-eabi.h
>> index eef791f6a02..0c5c58e4928 100644
>> --- a/gcc/config/arm/linux-eabi.h
>> +++ b/gcc/config/arm/linux-eabi.h
>> @@ -46,12 +46,15 @@
>>  #undef  TARGET_LINKER_EMULATION
>>  #if TARGET_BIG_ENDIAN_DEFAULT
>>  #define TARGET_LINKER_EMULATION "armelfb_linux_eabi"
>> +#define TARGET_FDPIC_LINKER_EMULATION "armelfb_linux_fdpiceabi"
>>  #else
>>  #define TARGET_LINKER_EMULATION "armelf_linux_eabi"
>> +#define TARGET_FDPIC_LINKER_EMULATION "armelf_linux_fdpiceabi"
>>  #endif
>>
>>  #undef  SUBTARGET_EXTRA_LINK_SPEC
>> -#define SUBTARGET_EXTRA_LINK_SPEC " -m " TARGET_LINKER_EMULATION
>> +#define SUBTARGET_EXTRA_LINK_SPEC " -m %{mfdpic: " \
>> +  TARGET_FDPIC_LINKER_EMULATION ";:" TARGET_LINKER_EMULATION "}"
>>
>>  /* GNU/Linux on ARM currently supports three dynamic linkers:
>> - ld-linux.so.2 - for the legacy ABI
>> --
>> 2.44.0.rc1.240.g4c46232300-goog
>>
> 
> Ping:)
> 

We're in stage4 at present and this is new material.  I'll look at it after the 
branch has been cut.

R.

> 
> -- 
> 宋方睿



Re: [PATCH v2] testsuite, arm: Fix testcase arm/pr112337.c to check for the options first

2024-03-05 Thread Richard Earnshaw (lists)
On 19/02/2024 10:11, Saurabh Jha wrote:
> 
> On 2/9/2024 2:57 PM, Richard Earnshaw (lists) wrote:
>> On 30/01/2024 17:07, Saurabh Jha wrote:
>>> Hey,
>>>
>>> Previously, this test was added to fix this bug: 
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112337. However, it did not 
>>> check the compilation options before using them, leading to errors.
>>>
>>> This patch fixes the test by first checking whether it can use the options 
>>> before using them.
>>>
>>> Tested for arm-none-eabi and found no regressions. The output of check-gcc 
>>> with RUNTESTFLAGS="arm.exp=*" changed like this:
>>>
>>> Before:
>>> # of expected passes  5963
>>> # of unexpected failures  64
>>>
>>> After:
>>> # of expected passes  5964
>>> # of unexpected failures  63
>>>
>>> Ok for master?
>>>
>>> Regards,
>>> Saurabh
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>>  * gcc.target/arm/pr112337.c: Check whether we can use the 
>>> compilation options before using them.
>> My apologies for missing this earlier.  It didn't show up in patchwork. 
>> That's most likely because the attachment is a binary blob instead of 
>> text/plain.  That also means that the Linaro CI system hasn't seen this 
>> patch either.  Please can you fix your mailer to add plain text patch files.
>>
>> -/* { dg-options "-O2 -march=armv8.1-m.main+fp.dp+mve.fp -mfloat-abi=hard" } 
>> */
>> +/* { dg-require-effective-target arm_hard_ok } */
>> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
>> +/* { dg-options "-O2 -mfloat-abi=hard" } */
>> +/* { dg-add-options arm_v8_1m_mve } */
>>
>> This is moving in the right direction, but it adds more than necessary now: 
>> checking for, and adding -mfloat-abi=hard is not necessary any more as 
>> arm_v8_1m_mve_ok will work out what float-abi flags are needed to make the 
>> options work. (What's more, it will prevent the test from running if the 
>> base configuration of the compiler is incompatible with the hard float ABI, 
>> which is more than we need.).
>>
>> So please can you re-spin removing the hard-float check and removing that 
>> from dg-options.
>>
>> Thanks,
>> R.
> 
> Hi Richard,
> 
> Agreed with your comments. Please find the patch with the suggested changes 
> attached.
> 
> Regards,
> 
> Saurabh
> 


Thanks, I've pushed this.  Next time, please can you put the commit message 
inside the patch, so that I can apply things automatically.  Eg: 

>From 1c92c94074449929f40cea99a6450bcde3aec12f Mon Sep 17 00:00:00 2001
From: Saurabh Jha 
Date: Tue, 30 Jan 2024 15:03:36 +
Subject: [PATCH] Fix testcase pr112337.c to check the options [PR112337]

gcc.target/arm/pr112337.c was failing to validate that adding MVE options
was compatible with the test environment, so add the missing checks.

gcc/testsuite/ChangeLog:

PR target/112337
* gcc.target/arm/pr112337.c: Check for, then use the right MVE
options.

---
 gcc/testsuite/gcc.target/arm/pr112337.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/pr112337.c 
b/gcc/testsuite/gcc.target/arm/pr112337.c

...


Re: [PATCH v4] aarch64,arm: Move branch-protection data to targets

2024-03-01 Thread Richard Earnshaw (lists)
On 11/01/2024 14:35, Szabolcs Nagy wrote:
> The branch-protection types are target specific, not the same on arm
> and aarch64.  This currently affects pac-ret+b-key, but there will be
> a new type on aarch64 that is not relevant for arm.
> 
> After the move, change aarch_ identifiers to aarch64_ or arm_ as
> appropriate.
> 
> Refactor aarch_validate_mbranch_protection to take the target specific
> branch-protection types as an argument.
> 
> In case of invalid input currently no hints are provided: the way
> branch-protection types and subtypes can be mixed makes it difficult
> without causing confusion.
> 
> gcc/ChangeLog:
> 
>   * config/aarch64/aarch64.md: Rename aarch_ to aarch64_.
>   * config/aarch64/aarch64.opt: Likewise.
>   * config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Likewise.
>   * config/aarch64/aarch64.cc (aarch64_expand_prologue): Likewise.
>   (aarch64_expand_epilogue): Likewise.
>   (aarch64_post_cfi_startproc): Likewise.
>   (aarch64_handle_no_branch_protection): Copy and rename.
>   (aarch64_handle_standard_branch_protection): Likewise.
>   (aarch64_handle_pac_ret_protection): Likewise.
>   (aarch64_handle_pac_ret_leaf): Likewise.
>   (aarch64_handle_pac_ret_b_key): Likewise.
>   (aarch64_handle_bti_protection): Likewise.
>   (aarch64_override_options): Update branch protection validation.
>   (aarch64_handle_attr_branch_protection): Likewise.
>   * config/arm/aarch-common-protos.h (aarch_validate_mbranch_protection):
>   Pass branch protection type description as argument.
>   (struct aarch_branch_protect_type): Move from aarch-common.h.
>   * config/arm/aarch-common.cc (aarch_handle_no_branch_protection):
>   Remove.
>   (aarch_handle_standard_branch_protection): Remove.
>   (aarch_handle_pac_ret_protection): Remove.
>   (aarch_handle_pac_ret_leaf): Remove.
>   (aarch_handle_pac_ret_b_key): Remove.
>   (aarch_handle_bti_protection): Remove.
>   (aarch_validate_mbranch_protection): Pass branch protection type
>   description as argument.
>   * config/arm/aarch-common.h (enum aarch_key_type): Remove.
>   (struct aarch_branch_protect_type): Remove.
>   * config/arm/arm-c.cc (arm_cpu_builtins): Remove aarch_ra_sign_key.
>   * config/arm/arm.cc (arm_handle_no_branch_protection): Copy and rename.
>   (arm_handle_standard_branch_protection): Likewise.
>   (arm_handle_pac_ret_protection): Likewise.
>   (arm_handle_pac_ret_leaf): Likewise.
>   (arm_handle_bti_protection): Likewise.
>   (arm_configure_build_target): Update branch protection validation.
>   * config/arm/arm.opt: Remove aarch_ra_sign_key.
> ---
> v4:
> - pass types as argument to validation.
> - make target specific types data static.
> 
>  gcc/config/aarch64/aarch64-c.cc  |  4 +-
>  gcc/config/aarch64/aarch64.cc| 75 
>  gcc/config/aarch64/aarch64.md|  2 +-
>  gcc/config/aarch64/aarch64.opt   |  2 +-
>  gcc/config/arm/aarch-common-protos.h | 19 ++-
>  gcc/config/arm/aarch-common.cc   | 71 --
>  gcc/config/arm/aarch-common.h| 20 
>  gcc/config/arm/arm-c.cc  |  2 -
>  gcc/config/arm/arm.cc| 55 +---
>  gcc/config/arm/arm.opt   |  3 --
>  10 files changed, 145 insertions(+), 108 deletions(-)
> 


OK

R.



Re: [PATCH v6 5/5] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-03-01 Thread Richard Earnshaw (lists)
On 27/02/2024 13:56, Andre Vieira wrote:
> 
> This patch adds support for MVE Tail-Predicated Low Overhead Loops by using 
> the
> doloop funcitonality added to support predicated vectorized hardware loops.
> 
> gcc/ChangeLog:
> 
>   * config/arm/arm-protos.h (arm_target_bb_ok_for_lob): Change
>   declaration to pass basic_block.
>   (arm_attempt_dlstp_transform): New declaration.
>   * config/arm/arm.cc (TARGET_LOOP_UNROLL_ADJUST): Define targethook.
>   (TARGET_PREDICT_DOLOOP_P): Likewise.
>   (arm_target_bb_ok_for_lob): Adapt condition.
>   (arm_mve_get_vctp_lanes): New function.
>   (arm_dl_usage_type): New internal enum.
>   (arm_get_required_vpr_reg): New function.
>   (arm_get_required_vpr_reg_param): New function.
>   (arm_get_required_vpr_reg_ret_val): New function.
>   (arm_mve_get_loop_vctp): New function.
>   (arm_mve_insn_predicated_by): New function.
>   (arm_mve_across_lane_insn_p): New function.
>   (arm_mve_load_store_insn_p): New function.
>   (arm_mve_impl_pred_on_outputs_p): New function.
>   (arm_mve_impl_pred_on_inputs_p): New function.
>   (arm_last_vect_def_insn): New function.
>   (arm_mve_impl_predicated_p): New function.
>   (arm_mve_check_reg_origin_is_num_elems): New function.
>   (arm_mve_dlstp_check_inc_counter): New function.
>   (arm_mve_dlstp_check_dec_counter): New function.
>   (arm_mve_loop_valid_for_dlstp): New function.
>   (arm_predict_doloop_p): New function.
>   (arm_loop_unroll_adjust): New function.
>   (arm_emit_mve_unpredicated_insn_to_seq): New function.
>   (arm_attempt_dlstp_transform): New function.
>   * config/arm/arm.opt (mdlstp): New option.
>   * config/arm/iteratords.md (dlstp_elemsize, letp_num_lanes,
>   letp_num_lanes_neg, letp_num_lanes_minus_1): New attributes.
>   (DLSTP, LETP): New iterators.
>   (predicated_doloop_end_internal): New pattern.
>   (dlstp_insn): New pattern.
>   * config/arm/thumb2.md (doloop_end): Adapt to support tail-predicated
>   loops.
>   (doloop_begin): Likewise.
>   * config/arm/types.md (mve_misc): New mve type to represent
>   predicated_loop_end insn sequences.
>   * config/arm/unspecs.md:
>   (DLSTP8, DLSTP16, DLSTP32, DSLTP64,
>   LETP8, LETP16, LETP32, LETP64): New unspecs for DLSTP and LETP.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/lob.h: Add new helpers.
>   * gcc.target/arm/lob1.c: Use new helpers.
>   * gcc.target/arm/lob6.c: Likewise.
>   * gcc.target/arm/dlstp-compile-asm-1.c: New test.
>   * gcc.target/arm/dlstp-compile-asm-2.c: New test.
>   * gcc.target/arm/dlstp-compile-asm-3.c: New test.
>   * gcc.target/arm/dlstp-int8x16.c: New test.
>   * gcc.target/arm/dlstp-int8x16-run.c: New test.
>   * gcc.target/arm/dlstp-int16x8.c: New test.
>   * gcc.target/arm/dlstp-int16x8-run.c: New test.
>   * gcc.target/arm/dlstp-int32x4.c: New test.
>   * gcc.target/arm/dlstp-int32x4-run.c: New test.
>   * gcc.target/arm/dlstp-int64x2.c: New test.
>   * gcc.target/arm/dlstp-int64x2-run.c: New test.
>   * gcc.target/arm/dlstp-invalid-asm.c: New test.
> 
> Co-authored-by: Stam Markianos-Wright 
> ---
>  gcc/config/arm/arm-protos.h   |4 +-
>  gcc/config/arm/arm.cc | 1249 -
>  gcc/config/arm/arm.opt|3 +
>  gcc/config/arm/iterators.md   |   15 +
>  gcc/config/arm/mve.md |   50 +
>  gcc/config/arm/thumb2.md  |  138 +-
>  gcc/config/arm/types.md   |6 +-
>  gcc/config/arm/unspecs.md |   14 +-
>  gcc/testsuite/gcc.target/arm/lob.h|  128 +-
>  gcc/testsuite/gcc.target/arm/lob1.c   |   23 +-
>  gcc/testsuite/gcc.target/arm/lob6.c   |8 +-
>  .../gcc.target/arm/mve/dlstp-compile-asm-1.c  |  146 ++
>  .../gcc.target/arm/mve/dlstp-compile-asm-2.c  |  749 ++
>  .../gcc.target/arm/mve/dlstp-compile-asm-3.c  |   46 +
>  .../gcc.target/arm/mve/dlstp-int16x8-run.c|   44 +
>  .../gcc.target/arm/mve/dlstp-int16x8.c|   31 +
>  .../gcc.target/arm/mve/dlstp-int32x4-run.c|   45 +
>  .../gcc.target/arm/mve/dlstp-int32x4.c|   31 +
>  .../gcc.target/arm/mve/dlstp-int64x2-run.c|   48 +
>  .../gcc.target/arm/mve/dlstp-int64x2.c|   28 +
>  .../gcc.target/arm/mve/dlstp-int8x16-run.c|   44 +
>  .../gcc.target/arm/mve/dlstp-int8x16.c|   32 +
>  .../gcc.target/arm/mve/dlstp-invalid-asm.c|  521 +++
>  23 files changed, 3321 insertions(+), 82 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-2.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-3.c
>  create mode 100644 

Re: [PATCH v6 4/5] doloop: Add support for predicated vectorized loops

2024-03-01 Thread Richard Earnshaw (lists)
On 27/02/2024 13:56, Andre Vieira wrote:
> 
> This patch adds support in the target agnostic doloop pass for the detection 
> of
> predicated vectorized hardware loops.  Arm is currently the only target that
> will make use of this feature.
> 
> gcc/ChangeLog:
> 
>   * df-core.cc (df_bb_regno_only_def_find): New helper function.
>   * df.h (df_bb_regno_only_def_find): Declare new function.
>   * loop-doloop.cc (doloop_condition_get): Add support for detecting
>   predicated vectorized hardware loops.
>   (doloop_modify): Add support for GTU condition checks.
>   (doloop_optimize): Update costing computation to support alterations to
>   desc->niter_expr by the backend.
> 
> Co-authored-by: Stam Markianos-Wright 
> ---
>  gcc/df-core.cc |  15 +
>  gcc/df.h   |   1 +
>  gcc/loop-doloop.cc | 164 +++--
>  3 files changed, 113 insertions(+), 67 deletions(-)
> 

As discussed, I think we should wait for gcc-15 for this[*]; I know it was 
initially submitted during stage1 but it's had to go through a lot of revision 
since then and we're very close to wanting to cut the release branch.

R.

[*] Unless an independent reviewer wants to sign this off anyway.


Re: [PATCH v6 3/5] arm: Fix a wrong attribute use and remove unused unspecs and iterators

2024-03-01 Thread Richard Earnshaw (lists)
On 27/02/2024 13:56, Andre Vieira wrote:
> 
> This patch fixes the erroneous use of a mode attribute without a mode iterator
> in the pattern and removes unused unspecs and iterators.
> 
> gcc/ChangeLog:
> 
>   * config/arm/iterators.md (supf): Remove VMLALDAVXQ_U, VMLALDAVXQ_P_U,
>   VMLALDAVAXQ_U cases.
>   (VMLALDAVXQ): Remove iterator.
>   (VMLALDAVXQ_P): Likewise.
>   (VMLALDAVAXQ): Likewise.
>   * config/arm/mve.md (mve_vstrwq_p_fv4sf): Replace use of 
>   mode iterator attribute with V4BI mode.
>   * config/arm/unspecs.md (VMLALDAVXQ_U, VMLALDAVXQ_P_U,
>   VMLALDAVAXQ_U): Remove unused unspecs.
> ---
>  gcc/config/arm/iterators.md | 9 +++--
>  gcc/config/arm/mve.md   | 2 +-
>  gcc/config/arm/unspecs.md   | 3 ---
>  3 files changed, 4 insertions(+), 10 deletions(-)
> 

OK

R.


Re: [PATCH v6 2/5] arm: Annotate instructions with mve_safe_imp_xlane_pred

2024-03-01 Thread Richard Earnshaw (lists)
On 27/02/2024 13:56, Andre Vieira wrote:
> 
> This patch annotates some MVE across lane instructions with a new attribute.
> We use this attribute to let the compiler know that these instructions can be
> safely implicitly predicated when tail predicating if their operands are
> guaranteed to have zeroed tail predicated lanes.  These instructions were
> selected because having the value 0 in those lanes or 'tail-predicating' those
> lanes have the same effect.
> 
> gcc/ChangeLog:
> 
>   * config/arm/arm.md (mve_safe_imp_xlane_pred): New attribute.
>   * config/arm/iterators.md (mve_vmaxmin_safe_imp): New iterator
>   attribute.
>   * config/arm/mve.md (vaddvq_s, vaddvq_u, vaddlvq_s, vaddlvq_u,
>   vaddvaq_s, vaddvaq_u, vmaxavq_s, vmaxvq_u, vmladavq_s, vmladavq_u,
>   vmladavxq_s, vmlsdavq_s, vmlsdavxq_s, vaddlvaq_s, vaddlvaq_u,
>   vmlaldavq_u, vmlaldavq_s, vmlaldavq_u, vmlaldavxq_s, vmlsldavq_s,
>   vmlsldavxq_s, vrmlaldavhq_u, vrmlaldavhq_s, vrmlaldavhxq_s,
>   vrmlsldavhq_s, vrmlsldavhxq_s, vrmlaldavhaq_s, vrmlaldavhaq_u,
>   vrmlaldavhaxq_s, vrmlsldavhaq_s, vrmlsldavhaxq_s, vabavq_s, vabavq_u,
>   vmladavaq_u, vmladavaq_s, vmladavaxq_s, vmlsdavaq_s, vmlsdavaxq_s,
>   vmlaldavaq_s, vmlaldavaq_u, vmlaldavaxq_s, vmlsldavaq_s,
>   vmlsldavaxq_s): Added mve_safe_imp_xlane_pred.
> ---
>  gcc/config/arm/arm.md   |  6 ++
>  gcc/config/arm/iterators.md |  8 
>  gcc/config/arm/mve.md   | 12 
>  3 files changed, 26 insertions(+)
> 

OK

R.


Re: [PATCH v6 1/5] arm: Add define_attr to to create a mapping between MVE predicated and unpredicated insns

2024-03-01 Thread Richard Earnshaw (lists)
On 27/02/2024 13:56, Andre Vieira wrote:
> 
> This patch adds an attribute to the mve md patterns to be able to identify
> predicable MVE instructions and what their predicated and unpredicated 
> variants
> are.  This attribute is used to encode the icode of the unpredicated variant 
> of
> an instruction in its predicated variant.
> 
> This will make it possible for us to transform VPT-predicated insns in
> the insn chain into their unpredicated equivalents when transforming the loop
> into a MVE Tail-Predicated Low Overhead Loop. For example:
> `mve_vldrbq_z_ -> mve_vldrbq_`.
> 
> gcc/ChangeLog:
> 
>   * config/arm/arm.md (mve_unpredicated_insn): New attribute.
>   * config/arm/arm.h (MVE_VPT_PREDICATED_INSN_P): New define.
>   (MVE_VPT_UNPREDICATED_INSN_P): Likewise.
>   (MVE_VPT_PREDICABLE_INSN_P): Likewise.
>   * config/arm/vec-common.md (mve_vshlq_): Add attribute.
>   * config/arm/mve.md (arm_vcx1q_p_v16qi): Add attribute.
>   (arm_vcx1qv16qi): Likewise.
>   (arm_vcx1qav16qi): Likewise.
>   (arm_vcx1qv16qi): Likewise.
>   (arm_vcx2q_p_v16qi): Likewise.
>   (arm_vcx2qv16qi): Likewise.
>   (arm_vcx2qav16qi): Likewise.
>   (arm_vcx2qv16qi): Likewise.
>   (arm_vcx3q_p_v16qi): Likewise.
>   (arm_vcx3qv16qi): Likewise.
>   (arm_vcx3qav16qi): Likewise.
>   (arm_vcx3qv16qi): Likewise.
>   (@mve_q_): Likewise.
>   (@mve_q_int_): Likewise.
>   (@mve_q_v4si): Likewise.
>   (@mve_q_n_): Likewise.
>   (@mve_q_r_): Likewise.
>   (@mve_q_f): Likewise.
>   (@mve_q_m_): Likewise.
>   (@mve_q_m_n_): Likewise.
>   (@mve_q_m_r_): Likewise.
>   (@mve_q_m_f): Likewise.
>   (@mve_q_int_m_): Likewise.
>   (@mve_q_p_v4si): Likewise.
>   (@mve_q_p_): Likewise.
>   (@mve_q_): Likewise.
>   (@mve_q_f): Likewise.
>   (@mve_q_m_): Likewise.
>   (@mve_q_m_f): Likewise.
>   (mve_vq_f): Likewise.
>   (mve_q): Likewise.
>   (mve_q_f): Likewise.
>   (mve_vadciq_v4si): Likewise.
>   (mve_vadciq_m_v4si): Likewise.
>   (mve_vadcq_v4si): Likewise.
>   (mve_vadcq_m_v4si): Likewise.
>   (mve_vandq_): Likewise.
>   (mve_vandq_f): Likewise.
>   (mve_vandq_m_): Likewise.
>   (mve_vandq_m_f): Likewise.
>   (mve_vandq_s): Likewise.
>   (mve_vandq_u): Likewise.
>   (mve_vbicq_): Likewise.
>   (mve_vbicq_f): Likewise.
>   (mve_vbicq_m_): Likewise.
>   (mve_vbicq_m_f): Likewise.
>   (mve_vbicq_m_n_): Likewise.
>   (mve_vbicq_n_): Likewise.
>   (mve_vbicq_s): Likewise.
>   (mve_vbicq_u): Likewise.
>   (@mve_vclzq_s): Likewise.
>   (mve_vclzq_u): Likewise.
>   (@mve_vcmp_q_): Likewise.
>   (@mve_vcmp_q_n_): Likewise.
>   (@mve_vcmp_q_f): Likewise.
>   (@mve_vcmp_q_n_f): Likewise.
>   (@mve_vcmp_q_m_f): Likewise.
>   (@mve_vcmp_q_m_n_): Likewise.
>   (@mve_vcmp_q_m_): Likewise.
>   (@mve_vcmp_q_m_n_f): Likewise.
>   (mve_vctpq): Likewise.
>   (mve_vctpq_m): Likewise.
>   (mve_vcvtaq_): Likewise.
>   (mve_vcvtaq_m_): Likewise.
>   (mve_vcvtbq_f16_f32v8hf): Likewise.
>   (mve_vcvtbq_f32_f16v4sf): Likewise.
>   (mve_vcvtbq_m_f16_f32v8hf): Likewise.
>   (mve_vcvtbq_m_f32_f16v4sf): Likewise.
>   (mve_vcvtmq_): Likewise.
>   (mve_vcvtmq_m_): Likewise.
>   (mve_vcvtnq_): Likewise.
>   (mve_vcvtnq_m_): Likewise.
>   (mve_vcvtpq_): Likewise.
>   (mve_vcvtpq_m_): Likewise.
>   (mve_vcvtq_from_f_): Likewise.
>   (mve_vcvtq_m_from_f_): Likewise.
>   (mve_vcvtq_m_n_from_f_): Likewise.
>   (mve_vcvtq_m_n_to_f_): Likewise.
>   (mve_vcvtq_m_to_f_): Likewise.
>   (mve_vcvtq_n_from_f_): Likewise.
>   (mve_vcvtq_n_to_f_): Likewise.
>   (mve_vcvtq_to_f_): Likewise.
>   (mve_vcvttq_f16_f32v8hf): Likewise.
>   (mve_vcvttq_f32_f16v4sf): Likewise.
>   (mve_vcvttq_m_f16_f32v8hf): Likewise.
>   (mve_vcvttq_m_f32_f16v4sf): Likewise.
>   (mve_vdwdupq_m_wb_u_insn): Likewise.
>   (mve_vdwdupq_wb_u_insn): Likewise.
>   (mve_veorq_s>): Likewise.
>   (mve_veorq_u>): Likewise.
>   (mve_veorq_f): Likewise.
>   (mve_vidupq_m_wb_u_insn): Likewise.
>   (mve_vidupq_u_insn): Likewise.
>   (mve_viwdupq_m_wb_u_insn): Likewise.
>   (mve_viwdupq_wb_u_insn): Likewise.
>   (mve_vldrbq_): Likewise.
>   (mve_vldrbq_gather_offset_): Likewise.
>   (mve_vldrbq_gather_offset_z_): Likewise.
>   (mve_vldrbq_z_): Likewise.
>   (mve_vldrdq_gather_base_v2di): Likewise.
>   (mve_vldrdq_gather_base_wb_v2di_insn): Likewise.
>   (mve_vldrdq_gather_base_wb_z_v2di_insn): Likewise.
>   (mve_vldrdq_gather_base_z_v2di): Likewise.
>   (mve_vldrdq_gather_offset_v2di): Likewise.
>   (mve_vldrdq_gather_offset_z_v2di): Likewise.
>   (mve_vldrdq_gather_shifted_offset_v2di): Likewise.
>   (mve_vldrdq_gather_shifted_offset_z_v2di): Likewise.
>   (mve_vldrhq_): Likewise.
>   

Re: [PATCH] arm: Fixed C23 call compatibility with arm-none-eabi

2024-03-01 Thread Richard Earnshaw (lists)
On 19/02/2024 09:13, Torbjörn SVENSSON wrote:
> Ok for trunk and releases/gcc-13?
> Regtested on top of 945cb8490cb for arm-none-eabi, without any regression.
> 
> Backporting to releases/gcc-13 will change -std=c23 to -std=c2x.

Jakub has just pushed a different fix for this, so I don't think we need this 
now.

R.


> 
> --
> 
> In commit 4fe34cdcc80ac225b80670eabc38ac5e31ce8a5a, -std=c23 support was
> introduced to support functions without any named arguments.  For
> arm-none-eabi, this is not as simple as placing all arguments on the
> stack.  Align the caller to use r0, r1, r2 and r3 for arguments even for
> functions without any named arguments, as specified in the AAPCS.
> 
> Verify that the generic test case have the arguments are in the right
> order and add ARM specific test cases.
> 
> gcc/ChangeLog:
> 
>   * calls.h: Added the type of the function to function_arg_info.
>   * calls.cc: Save the type of the function.
>   * config/arm/arm.cc: Check in the AAPCS layout function if
>   function has no named args.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/torture/c23-stdarg-split-1a.c: Detect out of order
>   arguments.
>   * gcc.dg/torture/c23-stdarg-split-1b.c: Likewise.
>   * gcc.target/arm/aapcs/align_vaarg3.c: New test.
>   * gcc.target/arm/aapcs/align_vaarg4.c: New test.
> 
> Signed-off-by: Torbjörn SVENSSON 
> Co-authored-by: Yvan ROUX 
> ---
>  gcc/calls.cc  |  2 +-
>  gcc/calls.h   | 20 --
>  gcc/config/arm/arm.cc | 13 ---
>  .../gcc.dg/torture/c23-stdarg-split-1a.c  |  4 +-
>  .../gcc.dg/torture/c23-stdarg-split-1b.c  | 15 +---
>  .../gcc.target/arm/aapcs/align_vaarg3.c   | 37 +++
>  .../gcc.target/arm/aapcs/align_vaarg4.c   | 31 
>  7 files changed, 102 insertions(+), 20 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/aapcs/align_vaarg3.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/aapcs/align_vaarg4.c
> 
> diff --git a/gcc/calls.cc b/gcc/calls.cc
> index 01f44734743..a1cc283b952 100644
> --- a/gcc/calls.cc
> +++ b/gcc/calls.cc
> @@ -1376,7 +1376,7 @@ initialize_argument_information (int num_actuals 
> ATTRIBUTE_UNUSED,
>with those made by function.cc.  */
>  
>/* See if this argument should be passed by invisible reference.  */
> -  function_arg_info arg (type, argpos < n_named_args);
> +  function_arg_info arg (type, fntype, argpos < n_named_args);
>if (pass_by_reference (args_so_far_pnt, arg))
>   {
> const bool callee_copies
> diff --git a/gcc/calls.h b/gcc/calls.h
> index 464a4e34e33..88836559ebe 100644
> --- a/gcc/calls.h
> +++ b/gcc/calls.h
> @@ -35,24 +35,33 @@ class function_arg_info
>  {
>  public:
>function_arg_info ()
> -: type (NULL_TREE), mode (VOIDmode), named (false),
> +: type (NULL_TREE), fntype (NULL_TREE), mode (VOIDmode), named (false),
>pass_by_reference (false)
>{}
>  
>/* Initialize an argument of mode MODE, either before or after promotion.  
> */
>function_arg_info (machine_mode mode, bool named)
> -: type (NULL_TREE), mode (mode), named (named), pass_by_reference (false)
> +: type (NULL_TREE), fntype (NULL_TREE), mode (mode), named (named),
> +pass_by_reference (false)
>{}
>  
>/* Initialize an unpromoted argument of type TYPE.  */
>function_arg_info (tree type, bool named)
> -: type (type), mode (TYPE_MODE (type)), named (named),
> +: type (type), fntype (NULL_TREE), mode (TYPE_MODE (type)), named 
> (named),
>pass_by_reference (false)
>{}
>  
> +  /* Initialize an unpromoted argument of type TYPE with a known function 
> type
> + FNTYPE.  */
> +  function_arg_info (tree type, tree fntype, bool named)
> +: type (type), fntype (fntype), mode (TYPE_MODE (type)), named (named),
> +pass_by_reference (false)
> +  {}
> +
>/* Initialize an argument with explicit properties.  */
>function_arg_info (tree type, machine_mode mode, bool named)
> -: type (type), mode (mode), named (named), pass_by_reference (false)
> +: type (type), fntype (NULL_TREE), mode (mode), named (named),
> +pass_by_reference (false)
>{}
>  
>/* Return true if the gimple-level type is an aggregate.  */
> @@ -96,6 +105,9 @@ public:
>   libgcc support functions).  */
>tree type;
>  
> +  /* The type of the function that has this argument, or null if not known.  
> */
> +  tree fntype;
> +
>/* The mode of the argument.  Depending on context, this might be
>   the mode of the argument type or the mode after promotion.  */
>machine_mode mode;
> diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
> index 1cd69268ee9..98e149e5b7e 100644
> --- a/gcc/config/arm/arm.cc
> +++ b/gcc/config/arm/arm.cc
> @@ -7006,7 +7006,7 @@ aapcs_libcall_value (machine_mode mode)
> numbers referred to here are those in the 

Re: [PATCH] arm: fix c23 0-named-args caller-side stdarg

2024-03-01 Thread Richard Earnshaw (lists)
On 01/03/2024 04:38, Alexandre Oliva wrote:
> Hello, Matthew,
> 
> Thanks for the review.

For closure, Jakub has just pushed a patch to the generic code, so I don't 
think we need this now.

R.

> 
> On Feb 26, 2024, Matthew Malcomson  wrote:
> 
>> I think you're right that the AAPCS32 requires all arguments to be passed in
>> registers for this testcase.
>> (Nit on the commit-message: It says that your reading of the AAPCS32
>> suggests
>> that the *caller* is correct -- I believe based on the change you
>> suggested you
>> meant *callee* is correct in expecting arguments in registers.)
> 
> Ugh, yeah, sorry about the typo.
> 
>> The approach you suggest looks OK to me -- I do notice that it doesn't
>> fix the
>> legacy ABI's of `atpcs` and `apcs` and guess it would be nicer to have them
>> working at the same time though would defer to maintainers on how
>> important that
>> is.
>> (For the benefit of others reading) I don't believe there is any ABI concern
>> with this since it's fixing something that is currently not working at
>> all and
>> only applies to c23 (so a change shouldn't have too much of an impact).
> 
>> You mention you chose to make the change in the arm backend rather
>> than general
>> code due to hesitancy to change the generic ABI-affecting code. That makes
>> sense to me, certainly at this late stage in the development cycle.
> 
> *nod* I wrote the patch in the following context: I hit the problem on
> the very first toolchain I started transitioning to gcc-13.  I couldn't
> really fathom the notion that this breakage could have survived an
> entire release cycle if it affected many targets, and sort of held on to
> an assumption that the abi used by our arm-eabi toolchain had to be an
> uncommon one.
> 
> All of this hypothesizing falls apart by the now apparent knowledge that
> the test is faling elsewhere as well, even on other ARM ABIs, it just
> hadn't been addressed yet.  I'm glad we're getting there :-)
> 
>> From a quick check on c23-stdarg-4.c it does look like the below
>> change ends up
>> with the same codegen as your patch (except in the case of those
>> legacy ABI's,
>> where the below does make the caller and callee ABI match AFAICT):
> 
>> ```
>>   diff --git a/gcc/calls.cc b/gcc/calls.cc
>>   index 01f44734743..0b302f633ed 100644
>>   --- a/gcc/calls.cc
>>   +++ b/gcc/calls.cc
>>   @@ -2970,14 +2970,15 @@ expand_call (tree exp, rtx target, int ignore)
>>     we do not have any reliable way to pass unnamed args in
>>     registers, so we must force them into memory.  */
> 
>>   -  if (type_arg_types != 0
>>   +  if ((type_arg_types != 0 || TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
>>  && targetm.calls.strict_argument_naming (args_so_far))
>>    ;
>>  else if (type_arg_types != 0
>>  && ! targetm.calls.pretend_outgoing_varargs_named
>> (args_so_far))
>>    /* Don't include the last named arg.  */
>>    --n_named_args;
>>   -  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
>>   +  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype)
>>   +    && ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>>    n_named_args = 0;
>>  else
>>    /* Treat all args as named.  */
>> ```
> 
>> Do you agree that this makes sense (i.e. is there something I'm
>> completely missing)?
> 
> Yeah, your argument is quite convincing, and the target knobs are indeed
> in line with the change you suggest, whereas the current code seems to
> deviate from them.
> 
> With my ABI designer hat on, however, I see that there's room for ABIs
> to make decisions about 0-args stdargs that go differently from stdargs
> with leading named args, from prototyped functions, and even from
> prototypeless functions, and we might end up needing more knobs to deal
> with such custom decisions.  We can cross that bridge if/when we get to
> it, though.
> 
>> (lm32 mcore msp430 gcn cris fr30 frv h8300 arm v850 rx pru)
> 
> Interesting that ppc64le is not on your list.  There's PR107453 about
> that, and another thread is discussing a fix for it that is somewhat
> different from what you propose (presumably because the way the problem
> manifests on ppc64le is different), but it also tweaks expand_call.
> 
> I'll copy you when following up there.
> 



Re: [PATCH] testsuite: Turn errors back into warnings in arm/acle/cde-mve-error-2.c

2024-03-01 Thread Richard Earnshaw (lists)
On 13/01/2024 20:46, Thiago Jung Bauermann wrote:
> Since commit 2c3db94d9fd ("c: Turn int-conversion warnings into
> permerrors") the test fails with errors such as:
> 
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 32)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 33)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 34)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 35)
> ⋮
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 118 (test for 
> warnings, line 117)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 119)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 120 (test for 
> warnings, line 119)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 121)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 122 (test for 
> warnings, line 121)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 123)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 124 (test for 
> warnings, line 123)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 125)
> ⋮
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0  (test for excess errors)
> 
> There's a total of 1016 errors.  Here's a sample of the excess errors:
> 
>   Excess errors:
>   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:117:31: 
> error: passing argument 2 of '__builtin_arm_vcx1qv16qi' makes integer from 
> pointer without a cast [-Wint-conversion]
>   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:119:3: 
> error: passing argument 3 of '__builtin_arm_vcx1qav16qi' makes integer from 
> pointer without a cast [-Wint-conversion]
>   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:121:3: 
> error: passing argument 3 of '__builtin_arm_vcx2qv16qi' makes integer from 
> pointer without a cast [-Wint-conversion]
>   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:123:3: 
> error: passing argument 3 of '__builtin_arm_vcx2qv16qi' makes integer from 
> pointer without a cast [-Wint-conversion]
> 
> The test expects these messages to be warnings, not errors.  My first try
> was to change it to expect them as errors instead.  This didn't work, IIUC
> because the error prevents the compiler from continuing processing the file
> and thus other errors which are expected by the test don't get emitted.
> 
> Therefore, add -fpermissive so that the test behaves as it did previously.
> Because of the additional line in the header, I had to adjust the line
> numbers of the expected warnings.
> 
> Tested on armv8l-linux-gnueabihf.
> 
> gcc/testsuite/ChangeLog:
>   * gcc.target/arm/acle/cde-mve-error-2.c: Add -fpermissive.
> ---
>  .../gcc.target/arm/acle/cde-mve-error-2.c | 63 ++-
>  1 file changed, 32 insertions(+), 31 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c 
> b/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c
> index 5b7774825442..da283a06a54d 100644
> --- a/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c
> +++ b/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c
> @@ -2,6 +2,7 @@
>  
>  /* { dg-do assemble } */
>  /* { dg-require-effective-target arm_v8_1m_main_cde_mve_fp_ok } */
> +/* { dg-options "-fpermissive" } */
>  /* { dg-add-options arm_v8_1m_main_cde_mve_fp } */
>  
>  /* The error checking files are split since there are three kinds of
> @@ -115,73 +116,73 @@ uint8x16_t test_bad_immediates (uint8x16_t n, 
> uint8x16_t m, int someval,
>  
>/* `imm' is of wrong type.  */
>accum += __arm_vcx1q_u8 (0, "");/* { dg-error 
> {argument 2 to '__builtin_arm_vcx1qv16qi' must be a constant immediate in 
> range \[0-4095\]} } */
> -  /* { dg-warning {passing argument 2 of '__builtin_arm_vcx1qv16qi' makes 
> integer from pointer without a cast \[-Wint-conversion\]} "" { target *-*-* } 
> 117 } */
> +  /* { dg-warning {passing argument 2 of '__builtin_arm_vcx1qv16qi' makes 
> integer from pointer without a cast \[-Wint-conversion\]} "" { target *-*-* } 
> 118 } */

Absolute line numbers are a pain, but I think we can use '.-1' (without the 
quotes) in these cases to minimize the churn.

If that works, ok with that change.

R.



Re: [PATCH] testsuite: Fix fallout of turning warnings into errors on 32-bit Arm

2024-03-01 Thread Richard Earnshaw (lists)
On 01/03/2024 14:23, Andre Vieira (lists) wrote:
> Hi Thiago,
> 
> Thanks for this, LGTM but I can't approve this, CC'ing Richard.
> 
> Do have a nitpick, in the gcc/testsuite/ChangeLog: remove 'gcc/testsuite' 
> from bullet points 2-4.
> 

Yes, this is OK with the change Andre mentioned (your push will fail if you 
don't fix that).

R.

PS, if you've set up GCC git customizations (see 
contrib/gcc-git-customization.sh), you can verify things like this with 'git 
gcc-verify HEAD^..HEAD'


> Kind regards,
> Andre
> 
> On 13/01/2024 00:55, Thiago Jung Bauermann wrote:
>> Since commits 2c3db94d9fd ("c: Turn int-conversion warnings into
>> permerrors") and 55e94561e97e ("c: Turn -Wimplicit-function-declaration
>> into a permerror") these tests fail with errors such as:
>>
>>    FAIL: gcc.target/arm/pr59858.c (test for excess errors)
>>    FAIL: gcc.target/arm/pr65647.c (test for excess errors)
>>    FAIL: gcc.target/arm/pr65710.c (test for excess errors)
>>    FAIL: gcc.target/arm/pr97969.c (test for excess errors)
>>
>> Here's one example of the excess errors:
>>
>>    FAIL: gcc.target/arm/pr65647.c (test for excess errors)
>>    Excess errors:
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:6:17: error: 
>> initialization of 'int' from 'int *' makes integer from pointer without a 
>> cast [-Wint-conversion]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:6:51: error: 
>> initialization of 'int' from 'int *' makes integer from pointer without a 
>> cast [-Wint-conversion]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:6:62: error: 
>> initialization of 'int' from 'int *' makes integer from pointer without a 
>> cast [-Wint-conversion]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:7:48: error: 
>> initialization of 'int' from 'int *' makes integer from pointer without a 
>> cast [-Wint-conversion]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:8:9: error: 
>> initialization of 'int' from 'int *' makes integer from pointer without a 
>> cast [-Wint-conversion]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:24:5: error: 
>> initialization of 'int' from 'int *' makes integer from pointer without a 
>> cast [-Wint-conversion]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:25:5: error: 
>> initialization of 'int' from 'struct S1 *' makes integer from pointer 
>> without a cast [-Wint-conversion]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:41:3: error: 
>> implicit declaration of function 'fn3'; did you mean 'fn2'? 
>> [-Wimplicit-function-declaration]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:46:3: error: 
>> implicit declaration of function 'fn5'; did you mean 'fn4'? 
>> [-Wimplicit-function-declaration]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:57:16: error: 
>> implicit declaration of function 'fn6'; did you mean 'fn4'? 
>> [-Wimplicit-function-declaration]
>>
>> PR rtl-optimization/59858 and PR target/65710 test the fix of an ICE.
>> PR target/65647 and PR target/97969 test for a compilation infinite loop.
>>
>> Therefore, add -fpermissive so that the tests behave as they did previously.
>> Tested on armv8l-linux-gnueabihf.
>>
>> gcc/testsuite/ChangeLog:
>> * gcc.target/arm/pr59858.c: Add -fpermissive.
>> * gcc/testsuite/gcc.target/arm/pr65647.c: Likewise.
>> * gcc/testsuite/gcc.target/arm/pr65710.c: Likewise.
>> * gcc/testsuite/gcc.target/arm/pr97969.c: Likewise.
>> ---
>>   gcc/testsuite/gcc.target/arm/pr59858.c | 2 +-
>>   gcc/testsuite/gcc.target/arm/pr65647.c | 2 +-
>>   gcc/testsuite/gcc.target/arm/pr65710.c | 2 +-
>>   gcc/testsuite/gcc.target/arm/pr97969.c | 2 +-
>>   4 files changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/gcc/testsuite/gcc.target/arm/pr59858.c 
>> b/gcc/testsuite/gcc.target/arm/pr59858.c
>> index 3360b48e8586..9336edfce277 100644
>> --- a/gcc/testsuite/gcc.target/arm/pr59858.c
>> +++ b/gcc/testsuite/gcc.target/arm/pr59858.c
>> @@ -1,5 +1,5 @@
>>   /* { dg-do compile } */
>> -/* { dg-options "-march=armv5te -fno-builtin -mfloat-abi=soft -mthumb 
>> -fno-stack-protector -Os -fno-tree-loop-optimize -fno-tree-dominator-opts 
>> -fPIC -w" } */
>> +/* { dg-options "-march=armv5te -fno-builtin -mfloat-abi=soft -mthumb 
>> -fno-stack-protector -Os -fno-tree-loop-optimize -fno-tree-dominator-opts 
>> -fPIC -w -fpermissive" } */
>>   /* { dg-require-effective-target fpic } */
>>   /* { dg-skip-if "Incompatible command line options: -mfloat-abi=soft 
>> -mfloat-abi=hard" { *-*-* } { "-mfloat-abi=hard" } { "" } } */
>>   /* { dg-require-effective-target arm_arch_v5te_thumb_ok } */
>> diff --git a/gcc/testsuite/gcc.target/arm/pr65647.c 
>> b/gcc/testsuite/gcc.target/arm/pr65647.c
>> index 26b4e399f6be..3cbf6b804ec0 100644
>> --- a/gcc/testsuite/gcc.target/arm/pr65647.c
>> +++ b/gcc/testsuite/gcc.target/arm/pr65647.c
>> @@ -1,7 +1,7 @@
>>   /* { dg-do compile } */
>>   /* { dg-require-effective-target arm_arch_v6m_ok } */
>>   /* { 

Re: [PATCH] calls: Further fixes for TYPE_NO_NAMED_ARGS_STDARG_P handling [PR107453]

2024-03-01 Thread Richard Earnshaw (lists)
On 29/02/2024 15:55, Jakub Jelinek wrote:
> On Thu, Feb 29, 2024 at 02:14:05PM +, Richard Earnshaw wrote:
>>> I tried the above on arm, aarch64 and x86_64 and that seems fine,
>>> including the new testcase you added.
>>>
>>
>> I should mention though, that INIT_CUMULATIVE_ARGS on arm ignores
>> n_named_args entirely, it doesn't need it (I don't think it even existed
>> when the AAPCS code was added).
> 
> So far I've just checked that the new testcase passes not just on
> x86_64/i686-linux, but also on {powerpc64le,s390x,aarch64}-linux
> with vanilla trunk.
> Haven't posted this patch in patch form, plus while I'm not really sure
> whether setting n_named_args to 0 or not changing in the
> !pretend_outgoing_varargs_named is right, the setting to 0 feels more
> correct to me.  If structure_value_addr_parm is 1, the function effectively
> has a single named argument and then ... args and if the target wants
> n_named_args to be number of named arguments except the last, then that
> should be 0 rather than 1.
> 
> Thus, is the following patch ok for trunk then?
> 
> 2024-02-29  Jakub Jelinek  
> 
>   PR target/107453

PR 114136

Would be more appropriate for this, I think.

Otherwise, OK.

R.

>   * calls.cc (expand_call): For TYPE_NO_NAMED_ARGS_STDARG_P set
>   n_named_args initially before INIT_CUMULATIVE_ARGS to
>   structure_value_addr_parm rather than 0, after it don't modify
>   it if strict_argument_naming and clear only if
>   !pretend_outgoing_varargs_named.
> 
> --- gcc/calls.cc.jj   2024-01-22 11:48:08.045847508 +0100
> +++ gcc/calls.cc  2024-02-29 16:24:47.799855912 +0100
> @@ -2938,7 +2938,7 @@ expand_call (tree exp, rtx target, int i
>/* Count the struct value address, if it is passed as a parm.  */
>+ structure_value_addr_parm);
>else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
> -n_named_args = 0;
> +n_named_args = structure_value_addr_parm;
>else
>  /* If we know nothing, treat all args as named.  */
>  n_named_args = num_actuals;
> @@ -2970,14 +2970,15 @@ expand_call (tree exp, rtx target, int i
>   we do not have any reliable way to pass unnamed args in
>   registers, so we must force them into memory.  */
>  
> -  if (type_arg_types != 0
> +  if ((type_arg_types != 0 || TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
>&& targetm.calls.strict_argument_naming (args_so_far))
>  ;
>else if (type_arg_types != 0
>  && ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>  /* Don't include the last named arg.  */
>  --n_named_args;
> -  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
> +  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype)
> +&& ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>  n_named_args = 0;
>else
>  /* Treat all args as named.  */
> 
>   Jakub
> 



Re: [PATCH] calls: Further fixes for TYPE_NO_NAMED_ARGS_STDARG_P handling [PR107453]

2024-03-01 Thread Richard Earnshaw (lists)
On 29/02/2024 17:56, Jakub Jelinek wrote:
> On Thu, Feb 29, 2024 at 05:51:03PM +0000, Richard Earnshaw (lists) wrote:
>> Oh, but wait!  Perhaps that now falls into the initial 'if' clause and we 
>> never reach the point where you pick zero.  So perhaps I'm worrying about 
>> nothing.
> 
> If you are worried about the
> +  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype)
> +  && ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>  n_named_args = 0;
> case in the patch, we know at that point that the initial n_named_args is
> equal to structure_value_addr_parm, so either 0, in that case
> --n_named_args;
> would yield the undesirable negative value, so we want 0 instead; for that
> case we could as well just have ; in there instead of n_named_args = 0;,
> or it is 1, in that case --n_named_args; would turn that into 0.
> 
>   Jakub
> 

No, I was thinking about the case of strict_argument_naming when the first 
argument is the artificial return value pointer.  In that case we'd want 
n_named_args=1.

But I think it's a non-issue as that will be caught by 

  if ((type_arg_types != 0 || TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
   && targetm.calls.strict_argument_naming (args_so_far))
 ;

R.


Re: [PATCH v1 08/13] aarch64: Add Cygwin and MinGW environments for AArch64

2024-02-29 Thread Richard Earnshaw (lists)
On 29/02/2024 17:55, Andrew Pinski (QUIC) wrote:
>> -Original Message-
>> From: Maxim Kuvyrkov 
>> Sent: Thursday, February 29, 2024 9:46 AM
>> To: Andrew Pinski (QUIC) 
>> Cc: Evgeny Karpov ; Andrew Pinski
>> ; Richard Sandiford ; gcc-
>> patc...@gcc.gnu.org; 10wa...@gmail.com; m...@harmstone.com; Zac
>> Walker ; Ron Riddle
>> ; Radek Barton 
>> Subject: Re: [PATCH v1 08/13] aarch64: Add Cygwin and MinGW
>> environments for AArch64
>>
>> WARNING: This email originated from outside of Qualcomm. Please be wary
>> of any links or attachments, and do not enable macros.
>>
>>> On Feb 29, 2024, at 21:35, Andrew Pinski (QUIC)
>>  wrote:
>>>
>>>
>>>
 -Original Message-
 From: Evgeny Karpov 
 Sent: Thursday, February 29, 2024 8:46 AM
 To: Andrew Pinski 
 Cc: Richard Sandiford ; gcc-
 patc...@gcc.gnu.org; 10wa...@gmail.com; Maxim Kuvyrkov
 ; m...@harmstone.com; Zac Walker
 ; Ron Riddle ;
 Radek Barton ; Andrew Pinski (QUIC)
 
 Subject: [PATCH v1 08/13] aarch64: Add Cygwin and MinGW environments
 for AArch64

 Wednesday, February 28, 2024 2:00 AM
 Andrew Pinski wrote:

> What does this mean with respect to C++ exceptions? Or you using
> SJLJ exceptions support or the dwarf unwinding ones without SEH
>> support?
> I am not sure if SJLJ exceptions is well tested any more in GCC either.
>
> Also I have a question if you ran the full GCC/G++ testsuites and
> what were the results?
> If you did run it, did you use a cross compiler or the native
> compiler? Did you do a bootstrap (GCC uses C++ but no exceptions
>> though)?

 As mentioned in the cover letter and the thread, the current
 contribution covers only the C scope.
 Exception handling is fully disabled for now.
 There is an experimental build with C++ and SEH, however, it is not
 included in the plan for the current contribution.

 https://github.com/Windows-on-ARM-Experiments/mingw-woarm64-
>> build

> If you run using a cross compiler, did you use ssh or some other
> route to run the applications?
>
> Thanks,
> Andrew Pinski

 GitHub Actions are used to cross-compile toolchains, packages and
 tests, and execute tests on Windows Arm64.
>>>
>>> This does not answer my question because what you are running is just
>> simple testcases and not the FULL GCC testsuite.
>>> So again have you ran the GCC testsuite and do you have a dejagnu board to
>> be able to execute the binaries?
>>> I think without the GCC testsuite ran to find all of the known failures, 
>>> you are
>> going to be running into many issues.
>>> The GCC testsuite includes many tests for ABI corner cases and many
>> features that you will most likely not think about testing using your simple
>> testcases.
>>> In fact I suspect there will be some of the aarch64 testcases which will 
>>> need
>> to be modified for the windows ABI which you have not done yet.
>>
>> Hi Andrew,
>>
>> We (Linaro) have a prototype CI loop setup for testing aarch64-w64-
>> mingw32, and we have results for gcc-c and libatomic -- see [1].
>>
>> The results are far from clean, but that's expected.  This patch series aims 
>> at
>> enabling C hello-world only, and subsequent patch series will improve the
>> state of the port.
>>
>> [1] https://ci.linaro.org/job/tcwg_gnu_mingw_check_gcc--master-woa64-
>> build/6/artifact/artifacts/sumfiles/
> 
> Looking at these results, this port is not in any shape or form to be 
> upstreamed right now. Even simple -g will cause failures.
> Note we don't need a clean testsuite run but the patch series is not even 
> allowing enabling hello world due to the -g not being able to used.
> 

It seemed to me as though the patch was posted for comments, not for immediate 
inclusion.  I agree this isn't ready for committing yet, but neither should the 
submitters wait until it's perfect before posting it.

I think it's gcc-15 material, so now is about the right time to be thinking 
about it.

R.

> Thanks,
> Amdrew Pinski
> 
>>
>> Thanks,
>>
>> --
>> Maxim Kuvyrkov
>> https://www.linaro.org
> 



Re: [PATCH] calls: Further fixes for TYPE_NO_NAMED_ARGS_STDARG_P handling [PR107453]

2024-02-29 Thread Richard Earnshaw (lists)
On 29/02/2024 17:38, Jakub Jelinek wrote:
> On Thu, Feb 29, 2024 at 05:23:25PM +0000, Richard Earnshaw (lists) wrote:
>> On 29/02/2024 15:55, Jakub Jelinek wrote:
>>> On Thu, Feb 29, 2024 at 02:14:05PM +, Richard Earnshaw wrote:
>>>>> I tried the above on arm, aarch64 and x86_64 and that seems fine,
>>>>> including the new testcase you added.
>>>>>
>>>>
>>>> I should mention though, that INIT_CUMULATIVE_ARGS on arm ignores
>>>> n_named_args entirely, it doesn't need it (I don't think it even existed
>>>> when the AAPCS code was added).
>>>
>>> So far I've just checked that the new testcase passes not just on
>>> x86_64/i686-linux, but also on {powerpc64le,s390x,aarch64}-linux
>>> with vanilla trunk.
>>> Haven't posted this patch in patch form, plus while I'm not really sure
>>> whether setting n_named_args to 0 or not changing in the
>>> !pretend_outgoing_varargs_named is right, the setting to 0 feels more
>>> correct to me.  If structure_value_addr_parm is 1, the function effectively
>>> has a single named argument and then ... args and if the target wants
>>> n_named_args to be number of named arguments except the last, then that
>>> should be 0 rather than 1.
>>>
>>> Thus, is the following patch ok for trunk then?
>>
>> The comment at the start of the section says
>>
>>   /* Now possibly adjust the number of named args.
>>  Normally, don't include the last named arg if anonymous args follow.
>>  We do include the last named arg if
>>  targetm.calls.strict_argument_naming() returns nonzero.
>>  (If no anonymous args follow, the result of list_length is actually
>>  one too large.  This is harmless.)
>>
>> So in the case of strict_argument_naming perhaps it should return 1, but 0 
>> for other cases.
> 
> The TYPE_NO_NAMED_ARGS_STDARG_P (funtype) case is as if type_arg_types != 0
> and list_length (type_arg_types) == 0, i.e. no user named arguments.
> As list_length (NULL) returns 0, perhaps it could be even handled just the
> by changing all the type_arg_types != 0 checks to
> type_arg_types != 0 || TYPE_NO_NAMED_ARGS_STDARG_P (funtype)
> There are just 2 cases I'm worried about, one is that I think rest of
> calls.cc nor the backends are prepared to see n_named_args -1 after the
> adjustments, I think it is better to use 0, and then the question is what
> the !strict_argument_naming && !pretend_outgoing_varargs_named case
> wants to do for the aggregate return.  The patch as posted for
> void foo (...); void bar () { foo (1, 2, 3); }
> will set n_named_args initially to 0 (no named args) and with the
> adjustments for strict_argument_naming 0, otherwise for !pretend
> 0 as well, otherwise 3.
> For
> struct { char buf[4096]; } baz (...); void qux () { baz (1, 2, 3); }
> the patch sets n_named_args initially to 1 (the hidden return) and
> with the arguments for strict keep it at 1, for !pretend 0 and otherwise
> 3.
> 
> So, which case do you think is handled incorrectly with that?

The way I was thinking about it (and testing it on Arm) was to look at 
n_named_args for the cases of a traditional varargs case, then reduce that by 
one (except it can't ever be negative).

So for 

void f(...);
void g(int, ...);
struct S { int a[32]; };

struct S h (...);
struct S i (int, ...);

void a ()
{
  struct S x;
  f(1, 2, 3, 4);
  g(1, 2, 3, 4);
  x = h (1, 2, 3, 4);
  x = i (1, 2, 3, 4);
}

There are various permutations that could lead to answers of 0, 1, 2, 4 and 5 
depending on how those various targets treat each case and how the result 
pointer address is handled.  My suspicion is that for a target that has strict 
argument naming and the result pointer passed as a first argument, the answer 
for the 'h()' call should be 1, not zero.  

Oh, but wait!  Perhaps that now falls into the initial 'if' clause and we never 
reach the point where you pick zero.  So perhaps I'm worrying about nothing.

R.

> 
>   Jakub
> 



Re: [PATCH] calls: Further fixes for TYPE_NO_NAMED_ARGS_STDARG_P handling [PR107453]

2024-02-29 Thread Richard Earnshaw (lists)
On 29/02/2024 15:55, Jakub Jelinek wrote:
> On Thu, Feb 29, 2024 at 02:14:05PM +, Richard Earnshaw wrote:
>>> I tried the above on arm, aarch64 and x86_64 and that seems fine,
>>> including the new testcase you added.
>>>
>>
>> I should mention though, that INIT_CUMULATIVE_ARGS on arm ignores
>> n_named_args entirely, it doesn't need it (I don't think it even existed
>> when the AAPCS code was added).
> 
> So far I've just checked that the new testcase passes not just on
> x86_64/i686-linux, but also on {powerpc64le,s390x,aarch64}-linux
> with vanilla trunk.
> Haven't posted this patch in patch form, plus while I'm not really sure
> whether setting n_named_args to 0 or not changing in the
> !pretend_outgoing_varargs_named is right, the setting to 0 feels more
> correct to me.  If structure_value_addr_parm is 1, the function effectively
> has a single named argument and then ... args and if the target wants
> n_named_args to be number of named arguments except the last, then that
> should be 0 rather than 1.
> 
> Thus, is the following patch ok for trunk then?

The comment at the start of the section says

  /* Now possibly adjust the number of named args.
 Normally, don't include the last named arg if anonymous args follow.
 We do include the last named arg if
 targetm.calls.strict_argument_naming() returns nonzero.
 (If no anonymous args follow, the result of list_length is actually
 one too large.  This is harmless.)

So in the case of strict_argument_naming perhaps it should return 1, but 0 for 
other cases.

R.

> 
> 2024-02-29  Jakub Jelinek  
> 
>   PR target/107453
>   * calls.cc (expand_call): For TYPE_NO_NAMED_ARGS_STDARG_P set
>   n_named_args initially before INIT_CUMULATIVE_ARGS to
>   structure_value_addr_parm rather than 0, after it don't modify
>   it if strict_argument_naming and clear only if
>   !pretend_outgoing_varargs_named.
> 
> --- gcc/calls.cc.jj   2024-01-22 11:48:08.045847508 +0100
> +++ gcc/calls.cc  2024-02-29 16:24:47.799855912 +0100
> @@ -2938,7 +2938,7 @@ expand_call (tree exp, rtx target, int i
>/* Count the struct value address, if it is passed as a parm.  */
>+ structure_value_addr_parm);
>else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
> -n_named_args = 0;
> +n_named_args = structure_value_addr_parm;
>else
>  /* If we know nothing, treat all args as named.  */
>  n_named_args = num_actuals;
> @@ -2970,14 +2970,15 @@ expand_call (tree exp, rtx target, int i
>   we do not have any reliable way to pass unnamed args in
>   registers, so we must force them into memory.  */
>  
> -  if (type_arg_types != 0
> +  if ((type_arg_types != 0 || TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
>&& targetm.calls.strict_argument_naming (args_so_far))
>  ;
>else if (type_arg_types != 0
>  && ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>  /* Don't include the last named arg.  */
>  --n_named_args;
> -  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
> +  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype)
> +&& ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>  n_named_args = 0;
>else
>  /* Treat all args as named.  */
> 
>   Jakub
> 



Re: [PATCH] calls: Fix up TYPE_NO_NAMED_ARGS_STDARG_P handling [PR107453]

2024-02-29 Thread Richard Earnshaw (lists)
On 27/02/2024 17:25, Jakub Jelinek wrote:
> On Tue, Feb 27, 2024 at 04:41:32PM +, Richard Earnshaw wrote:
>>> 2023-01-09  Jakub Jelinek  
>>>
>>> PR target/107453
>>> * calls.cc (expand_call): For calls with
>>> TYPE_NO_NAMED_ARGS_STDARG_P (funtype) use zero for n_named_args.
>>> Formatting fix.
>>
>> This one has been festering for a while; both Alexandre and Torbjorn have 
>> attempted to fix it recently, but I'm not sure either is really right...
>>
>> On Arm this is causing all anonymous arguments to be passed on the stack,
>> which is incorrect per the ABI.  On a target that uses
>> 'pretend_outgoing_vararg_named', why is it correct to set n_named_args to
>> zero?  Is it enough to guard both the statements you've added with
>> !targetm.calls.pretend_outgoing_args_named?
> 
> I'm afraid I haven't heard of that target hook before.
> All I was doing with that change was fixing a regression reported in the PR
> for ppc64le/sparc/nvptx/loongarch at least.
> 
> The TYPE_NO_NAMED_ARGS_STDARG_P functions (C23 fns like void foo (...) {})
> have NULL type_arg_types, so the list_length (type_arg_types) isn't done for
> it, but it should be handled as if it was non-NULL but list length was 0.
> 
> So, for the
>   if (type_arg_types != 0)
> n_named_args
>   = (list_length (type_arg_types)
>  /* Count the struct value address, if it is passed as a parm.  */
>  + structure_value_addr_parm);
>   else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
> n_named_args = 0;
>   else
> /* If we know nothing, treat all args as named.  */
> n_named_args = num_actuals;
> case, I think guarding it by any target hooks is wrong, although
> I guess it should have been
> n_named_args = structure_value_addr_parm;
> instead of
> n_named_args = 0;
> 
> For the second
>   if (type_arg_types != 0
>   && targetm.calls.strict_argument_naming (args_so_far))
> ;
>   else if (type_arg_types != 0
>&& ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
> /* Don't include the last named arg.  */
> --n_named_args;
>   else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
> n_named_args = 0;
>   else
> /* Treat all args as named.  */
> n_named_args = num_actuals;
> bet (but no testing done, don't even know which targets return what for
> those hooks) we should treat those as if type_arg_types was non-NULL
> with 0 elements in the list, except the --n_named_args doesn't make sense
> because that would decrease it to -1.
> So perhaps
>   if ((type_arg_types != 0 || TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
>   && targetm.calls.strict_argument_naming (args_so_far))
> ;
>   else if (type_arg_types != 0
>&& ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
> /* Don't include the last named arg.  */
> --n_named_args;
>   else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype)
>  && ! targetm.calls.pretend_outgoing_varargs_named (args_so_far)))
> ;
>   else
> /* Treat all args as named.  */
> n_named_args = num_actuals;

I tried the above on arm, aarch64 and x86_64 and that seems fine, including the 
new testcase you added.

R.

> 
> (or n_named_args = 0; instead of ; before the final else?  Dunno).
> I guess we need some testsuite coverage for caller/callee ABI match of
> struct S { char p[64]; };
> struct S foo (...);
> 
>   Jakub
> 



Re: [PATCH] ARM: Fix conditional execution [PR113915]

2024-02-26 Thread Richard Earnshaw (lists)
On 26/02/2024 16:05, Wilco Dijkstra wrote:
> Hi Richard,
> 
>> Did you test this on a thumb1 target?  It seems to me that the target parts 
>> that you've
>> removed were likely related to that.  In fact, I don't see why this test 
>> would need to be changed at all.
> 
> The testcase explicitly forces a Thumb-2 target (arm_arch_v6t2). The patterns
> were wrong for Thumb-2 indeed, and the testcase was explicitly testing for 
> this.
> There is a separate builtin-bswap-2.c for Thumb-1 target (arm_arch_v6m).
> 
> Cheers,
> Wilco

That's why statements like:

* gcc.target/arm/builtin-bswap-1.c: Fix test.

are less than helpful.  Perhaps if you'd said what you actually changed that 
would have made it more obvious.

So OK, but please fix the commit message to say what you did.

R.


Re: [PATCH] ARM: Fix conditional execution [PR113915]

2024-02-26 Thread Richard Earnshaw (lists)
On 23/02/2024 15:46, Wilco Dijkstra wrote:
> Hi Richard,
> 
>> This bit isn't.  The correct fix here is to fix the pattern(s) concerned to 
>> add the missing predicate.
>>
>> Note that builtin-bswap.x explicitly mentions predicated mnemonics in the 
>> comments.
> 
> I fixed the patterns in v2. There are likely some more, plus we could likely 
> merge many t1 and t2
> patterns where the only difference is predication. But those cleanups are for 
> another time...
> 
> Cheers,
> Wilco
> 
> v2: Add predicable to the rev patterns.
> 
> By default most patterns can be conditionalized on Arm targets.  However
> Thumb-2 predication requires the "predicable" attribute be explicitly
> set to "yes".  Most patterns are shared between Arm and Thumb(-2) and are
> marked with "predicable".  Given this sharing, it does not make sense to
> use a different default for Arm.  So only consider conditional execution
> of instructions that have the predicable attribute set to yes.  This ensures
> that patterns not explicitly marked as such are never conditionally executed.
> 
> Passes regress and bootstrap, OK for commit?
> 
> gcc/ChangeLog:
> PR target/113915
> * config/arm/arm.md (NOCOND): Improve comment.
> (arm_rev*) Add predicable.
> * config/arm/arm.cc (arm_final_prescan_insn): Add check for
> PREDICABLE_YES.
> 
> gcc/testsuite/ChangeLog:
> PR target/113915
> * gcc.target/arm/builtin-bswap-1.c: Fix test.
> 
> ---
> 
> diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
> index 
> 1cd69268ee986a0953cc85ab259355d2191250ac..6a35fe44138135998877a9fb74c2a82a7f99dcd5
>  100644
> --- a/gcc/config/arm/arm.cc
> +++ b/gcc/config/arm/arm.cc
> @@ -25613,11 +25613,12 @@ arm_final_prescan_insn (rtx_insn *insn)
> break;
>  
>   case INSN:
> -   /* Instructions using or affecting the condition codes make it
> -  fail.  */
> +   /* Check the instruction is explicitly marked as predicable.
> +  Instructions using or affecting the condition codes are not.  
> */
> scanbody = PATTERN (this_insn);
> if (!(GET_CODE (scanbody) == SET
>   || GET_CODE (scanbody) == PARALLEL)
> +   || get_attr_predicable (this_insn) != PREDICABLE_YES
> || get_attr_conds (this_insn) != CONDS_NOCOND)
>   fail = TRUE;
> break;
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index 
> 5816409f86f1106b410c5e21d77e599b485f85f2..81237a61d4a2ebcfb77e47c2bd29137aba28a521
>  100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -307,6 +307,8 @@
>  ;
>  ; NOCOND means that the instruction does not use or alter the condition
>  ;   codes but can be converted into a conditionally exectuted instruction.
> +;   Given that NOCOND is the default for most instructions if omitted,
> +;   the attribute predicable must be set to yes as well.
>  
>  (define_attr "conds" "use,set,clob,unconditional,nocond"
>   (if_then_else
> @@ -12547,6 +12549,7 @@
>revsh%?\t%0, %1"
>[(set_attr "arch" "t1,t2,32")
> (set_attr "length" "2,2,4")
> +   (set_attr "predicable" "no,yes,yes")
> (set_attr "type" "rev")]
>  )
>  
> @@ -12560,6 +12563,7 @@
> rev16%?\t%0, %1"
>[(set_attr "arch" "t1,t2,32")
> (set_attr "length" "2,2,4")
> +   (set_attr "predicable" "no,yes,yes")
> (set_attr "type" "rev")]
>  )
>  
> @@ -12584,6 +12588,7 @@
> rev16%?\t%0, %1"
>[(set_attr "arch" "t1,t2,32")
> (set_attr "length" "2,2,4")
> +   (set_attr "predicable" "no,yes,yes")
> (set_attr "type" "rev")]
>  )
>  
> @@ -12619,6 +12624,7 @@
> rev16%?\t%0, %1"
>[(set_attr "arch" "t1,t2,32")
> (set_attr "length" "2,2,4")
> +   (set_attr "predicable" "no,yes,yes")
> (set_attr "type" "rev")]
>  )
>  
> diff --git a/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c 
> b/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c
> index 
> c1e7740d14d3ca4e93a71e38b12f82c19791a204..1a311a6a5af647d40abd553e5d0ba1273c76d288
>  100644
> --- a/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c
> +++ b/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c
> @@ -5,14 +5,11 @@
> of the instructions.  Add an -mtune option known to facilitate that.  */
>  /* { dg-additional-options "-O2 -mtune=cortex-a53" } */
>  /* { dg-final { scan-assembler-not "orr\[ \t\]" } } */
> -/* { dg-final { scan-assembler-times "revsh\\t" 1 { target { arm_nothumb } } 
> } }  */
> -/* { dg-final { scan-assembler-times "revshne\\t" 1 { target { arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "revsh\\t" 2 { target { ! arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "rev16\\t" 1 { target { arm_nothumb } } 
> } }  */
> -/* { dg-final { scan-assembler-times "rev16ne\\t" 1 { target { arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "rev16\\t" 2 { target { ! arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "rev\\t" 2 

Re: [PATCH v1 00/13] Add aarch64-w64-mingw32 target

2024-02-22 Thread Richard Earnshaw (lists)
On 21/02/2024 17:47, Evgeny Karpov wrote:
> Hello,
> 
> We would like to take your attention to the review of changes for the
> new GCC target, aarch64-w64-mingw32. The new target will be
> supported, tested, added to CI, and maintained by Linaro. This marks
> the first of three planned patch series contributing to the GCC C
> compiler's support for Windows Arm64.
> 
> 1. Minimal aarch64-w64-mingw32 C implementation to cross-compile
> hello-world with libgcc for Windows Arm64 using MinGW.
> 2. Extension of the aarch64-w64-mingw32 C implementation to
> cross-compile OpenSSL, OpenBLAS, FFmpeg, and libjpeg-turbo. All
> packages successfully pass tests.
> 3. Addition of call stack support for debugging, resolution of
> optimization issues in the C compiler, and DLL export/import for the
> aarch64-w64-mingw32 target.
> 
> This patch series introduces the 1st point, which involves building
> hello-world for the aarch64-w64-mingw32 target. The patch depends on
> the binutils changes for the aarch64-w64-mingw32 target that have
> already been merged.
> 
> The binutils should include recent relocation fixes.
> f87eaf8ff3995a5888c6dc4996a20c770e6bcd36
> aarch64: Add new relocations and limit COFF AArch64 relocation offsets
> 
> The series is structured in a way to trivially show that it should not
> affect any other targets.
> 
> In this patch, several changes have been made to support the
> aarch64-w64-mingw32 target for GCC. The modifications include the
> definition of the MS ABI for aarch64, adjustments to FIXED_REGISTERS
> and STATIC_CHAIN_REGNUM for different ABIs, and specific definitions
> for COFF format on AArch64. Additionally, the patch reuses MinGW
>  types and definitions from i386, relocating them to a new
> mingw folder for shared usage between both targets.
> 
> MinGW-specific options have been introduced for AArch64, along with
> override options for aarch64-w64-mingw32. Builtin stack probing for
> override options for aarch64-w64-mingw32. Builtin stack probing for
> AArch64 has been enabled as an alternative for chkstk. Symbol name
> encoding and section information handling for aarch64-w64-mingw32 have
> been incorporated, and the MinGW environment has been added, which
> will also be utilized for defining the Cygwin environment in the
> future.
> 
> The patch includes renaming "x86 Windows Options" to "Cygwin and MinGW
> Options," which now encompasses AArch64 as well. AArch64-specific
> Cygwin and MinGW Options have been introduced for the unique
> requirements of the AArch64 architecture.
> 
> Function type declaration and named sections support have been added.
> The necessary objects for Cygwin and MinGW have been built for the
> aarch64-w64-mingw32 target, and relevant files such as msformat-c.cc
> and winnt-d.cc have been moved to the mingw folder for reuse in
> AArch64.
> 
> Furthermore, the aarch64-w64-mingw32 target has been included in both
> libatomic and libgcc, ensuring support for the AArch64 architecture
> within these libraries. These changes collectively enhance the
> capabilities of GCC for the specified target.
> 
> Coauthors: Zac Walker ,
> Mark Harmstone   and
> Ron Riddle 
> 
> Refactored, prepared, and validated by 
> Radek Barton  and 
> Evgeny Karpov 
> 
> Special thanks to the Linaro GNU toolchain team for internal review
> and assistance in preparing the patch series!
> 
> Regards,
> Evgeny

Thanks for posting this.

I've only read quickly through this patch series and responded where I think 
some action is obviously required.  That doesn't necessarily mean the other 
patches are perfect, though, just that nothing immediately caught my attention.

R.

> 
> 
> Zac Walker (13):
>   Introduce aarch64-w64-mingw32 target
>   aarch64: The aarch64-w64-mingw32 target implements the MS ABI
>   aarch64: Mark x18 register as a fixed register for MS ABI
>   aarch64: Add aarch64-w64-mingw32 COFF
>   Reuse MinGW from i386 for AArch64
>   Rename section and encoding functions from i386 which will be used in
> aarch64
>   Exclude i386 functionality from aarch64 build
>   aarch64: Add Cygwin and MinGW environments for AArch64
>   aarch64: Add SEH to machine_function
>   Rename "x86 Windows Options" to "Cygwin and MinGW Options"
>   aarch64: Build and add objects for Cygwin and MinGW for AArch64
>   aarch64: Add aarch64-w64-mingw32 target to libatomic
>   Add aarch64-w64-mingw32 target to libgcc
> 
>  fixincludes/mkfixinc.sh   |   3 +-
>  gcc/config.gcc|  47 +++--
>  gcc/config/aarch64/aarch64-coff.h |  92 +
>  gcc/config/aarch64/aarch64-opts.h |   7 +
>  gcc/config/aarch64/aarch64-protos.h   |   5 +
>  gcc/config/aarch64/aarch64.h  |  25 ++-
>  gcc/config/aarch64/cygming.h  | 178 ++
>  gcc/config/i386/cygming.h |  18 +-
>  gcc/config/i386/cygming.opt.urls  |  30 ---
>  gcc/config/i386/i386-protos.h  

Re: [PATCH v1 13/13] Add aarch64-w64-mingw32 target to libgcc

2024-02-22 Thread Richard Earnshaw (lists)
On 21/02/2024 18:40, Evgeny Karpov wrote:
> 
+aarch64-*-mingw*)

This doesn't match the glob pattern you added to config.gcc in an earlier 
patch, but see my comment on that.  The two should really be consistent with 
each other or you might get build failures late on.

R.


Re: [PATCH v1 10/13] Rename "x86 Windows Options" to "Cygwin and MinGW Options"

2024-02-22 Thread Richard Earnshaw (lists)
On 21/02/2024 18:38, Evgeny Karpov wrote:
> 
For this change you might want to put some form of re-direct in the manual 
under the old name so that anybody used to looking for the old entry will know 
where things have been moved to.  Something like

x86 Windows Options
  See xref(Cygwin and MinGW Options).

R.


Re: [PATCH v1 08/13] aarch64: Add Cygwin and MinGW environments for AArch64

2024-02-22 Thread Richard Earnshaw (lists)
On 21/02/2024 18:36, Evgeny Karpov wrote:
> 
+/* GNU as supports weak symbols on PECOFF.  */
+#ifdef HAVE_GAS_WEAK

Can't we assume this is true?  It was most likely needed on i386 because 
support goes back longer than the assembler had this feature, but it looks like 
it was added in 2000, or thereabouts, so significantly before aarch64 was 
supported in the assembler.

+#ifndef HAVE_GAS_ALIGNED_COMM

And this was added to GCC in 2009, which probably means it predates 
aarch64-coff support in gas as well.

R.


Re: [PATCH v1 03/13] aarch64: Mark x18 register as a fixed register for MS ABI

2024-02-22 Thread Richard Earnshaw (lists)
On 21/02/2024 18:30, Evgeny Karpov wrote:
> 
+   tm_defines="${tm_defines} TARGET_ARM64_MS_ABI=1"

I missed this on first reading...

The GCC port name uses AARCH64, please use that internally rather than other 
names.  The only time when we should be using ARM64 is when it's needed for 
compatibility with other compilers and that doesn't apply here AFAICT.

R.


Re: [PATCH v1 03/13] aarch64: Mark x18 register as a fixed register for MS ABI

2024-02-22 Thread Richard Earnshaw (lists)
On 21/02/2024 18:30, Evgeny Karpov wrote:
> 
+/* X18 reserved for the TEB on Windows.  */
+#ifdef TARGET_ARM64_MS_ABI
+# define FIXED_X18 1
+# define CALL_USED_X18 0
+#else
+# define FIXED_X18 0
+# define CALL_USED_X18 1
+#endif

I'm not overly keen on ifdefs like this (and the one below), it can get quite 
confusing if we have to support more than a couple of ABIs.  Perhaps we could 
create a couple of new headers, one for the EABI (which all existing targets 
would then need to include) and one for the MS ABI.  Then the mingw port would 
use that instead of the EABI header.

An alternative is to make all this dynamic, based on the setting of the 
aarch64_calling_abi enum and to make the adjustments in 
aarch64_conditional_register_usage.

+# define CALL_USED_X18 0

Is that really correct?  If the register is really reserved, but some code 
modifies it anyway, this will cause the compiler to restore the old value at 
the end of a function; generally, for a reserved register, code that knows what 
it's doing would want to make permanent changes to this value.

+#ifdef TARGET_ARM64_MS_ABI
+# define STATIC_CHAIN_REGNUM   R17_REGNUM
+#else
+# define STATIC_CHAIN_REGNUM   R18_REGNUM
+#endif

If we went the enum way, we'd want something like

#define STATIC_CHAIN_REGNUM (calling_abi == AARCH64_CALLING_ABI_MS ? R17_REGNUM 
: R18_REGNUM)

R.


Re: [PATCH v1 02/13] aarch64: The aarch64-w64-mingw32 target implements

2024-02-22 Thread Richard Earnshaw (lists)
On 21/02/2024 18:26, Evgeny Karpov wrote:
> 
+/* Available call ABIs.  */
+enum calling_abi
+{
+  AARCH64_EABI = 0,
+  MS_ABI = 1
+};
+

The convention in this file seems to be that all enum types to start with 
aarch64.  Also, the enumeration values should start with the name of the 
enumeration type in upper case, so:

enum aarch64_calling_abi
{
  AARCH64_CALLING_ABI_EABI,
  AARCH64_CALLING_ABI_MS
};

or something very much like that.

R.


Re: [PATCH v1 01/13] Introduce aarch64-w64-mingw32 target

2024-02-22 Thread Richard Earnshaw (lists)
On 21/02/2024 18:16, Evgeny Karpov wrote:
> 
+aarch64*-*-mingw*)

Other targets are a bit inconsistent here as well, but, as Andrew mentioned, if 
you don't want to handle big-endian, it might be better to match 
aarch64-*-mingw* here.

R.


Re: [PATCH v1 05/13] Reuse MinGW from i386 for AArch64

2024-02-22 Thread Richard Earnshaw (lists)
On 21/02/2024 21:34, rep.dot@gmail.com wrote:
> On 21 February 2024 19:34:43 CET, Evgeny Karpov  
> wrote:
>>
> 
> Please use git send-email. Your mail ends up as empty as here, otherwise.

I don't see anything wrong with it; niether does patchwork 
(https://patchwork.sourceware.org/project/gcc/list/?series=31191) nor does the 
Linaro CI bot.  So perhaps it's your mailer that's misconfigured.

> 
> The ChangeLog has to be expressed in present tense, as mandated by the 
> standard; s/Moved/Move/g etc.

Agreed, but that's a detail that we can get to once the patch has been properly 
reviewed.

> 
> In any sane world ( and in gcc ) to fold, respectively a folder, is something 
> else compared to a directory ( which you probably mean when moving a file 
> from one directory to another directory as you seem to do ).
> 
> Most of the free world has left COFF behind since several decades, so I won't 
> comment on that. YMMV.

This isn't helpful.  Windows platforms use (a derivative of) COFF, so that's 
what the tools need to use when targetting that platform.

R.



Re: [PATCH] ARM: Fix conditional execution [PR113915]

2024-02-21 Thread Richard Earnshaw (lists)
On 21/02/2024 14:34, Wilco Dijkstra wrote:
> 
> By default most patterns can be conditionalized on Arm targets.  However
> Thumb-2 predication requires the "predicable" attribute be explicitly
> set to "yes".  Most patterns are shared between Arm and Thumb(-2) and are
> marked with "predicable".  Given this sharing, it does not make sense to
> use a different default for Arm.  So only consider conditional execution
> of instructions that have the predicable attribute set to yes.  This ensures
> that patterns not explicitly marked as such are never accidentally 
> conditionally executed like in the PR.
> 
> GLIBC codesize was ~0.014% worse due to atomic operations now being
> unconditional and a small number of patterns not setting "predicable".
> 
> Passes regress and bootstrap, OK for commit?
> 
> gcc/ChangeLog:
> PR target/113915
> * config/arm/arm.md (NOCOND): Improve comment.
> * config/arm/arm.cc (arm_final_prescan_insn): Add check for
> PREDICABLE_YES.
> 
> gcc/testsuite/ChangeLog:
> PR target/113915
> * gcc.target/arm/builtin-bswap-1.c: Fix test.
> 
> ---
> 
> diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
> index 
> c44047c377a802d0c1dc1406df1b88a6b079607b..29771d284831a995adcf9adbb525396fbabb1ea2
>  100644
> --- a/gcc/config/arm/arm.cc
> +++ b/gcc/config/arm/arm.cc
> @@ -25610,11 +25610,12 @@ arm_final_prescan_insn (rtx_insn *insn)
> break;
>  
>   case INSN:
> -   /* Instructions using or affecting the condition codes make it
> -  fail.  */
> +   /* Check the instruction is explicitly marked as predicable.
> +  Instructions using or affecting the condition codes are not.  
> */
> scanbody = PATTERN (this_insn);
> if (!(GET_CODE (scanbody) == SET
>   || GET_CODE (scanbody) == PARALLEL)
> +   || get_attr_predicable (this_insn) != PREDICABLE_YES
> || get_attr_conds (this_insn) != CONDS_NOCOND)
>   fail = TRUE;
> break;
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index 
> 5816409f86f1106b410c5e21d77e599b485f85f2..671f093862259c2c0df93a986fc22fa56a8ea6c7
>  100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -307,6 +307,8 @@
>  ;
>  ; NOCOND means that the instruction does not use or alter the condition
>  ;   codes but can be converted into a conditionally exectuted instruction.
> +;   Given that NOCOND is the default for most instructions if omitted,
> +;   the attribute predicable must be set to yes as well.
>  
>  (define_attr "conds" "use,set,clob,unconditional,nocond"
>   (if_then_else

While this is ok, 

> diff --git a/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c 
> b/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c
> index 
> c1e7740d14d3ca4e93a71e38b12f82c19791a204..3de7cea81c1128c2fe5a9e1216e6b027d26bcab9
>  100644
> --- a/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c
> +++ b/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c
> @@ -5,14 +5,8 @@
> of the instructions.  Add an -mtune option known to facilitate that.  */
>  /* { dg-additional-options "-O2 -mtune=cortex-a53" } */
>  /* { dg-final { scan-assembler-not "orr\[ \t\]" } } */
> -/* { dg-final { scan-assembler-times "revsh\\t" 1 { target { arm_nothumb } } 
> } }  */
> -/* { dg-final { scan-assembler-times "revshne\\t" 1 { target { arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "revsh\\t" 2 { target { ! arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "rev16\\t" 1 { target { arm_nothumb } } 
> } }  */
> -/* { dg-final { scan-assembler-times "rev16ne\\t" 1 { target { arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "rev16\\t" 2 { target { ! arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "rev\\t" 2 { target { arm_nothumb } } } 
> }  */
> -/* { dg-final { scan-assembler-times "revne\\t" 2 { target { arm_nothumb } } 
> } }  */
> -/* { dg-final { scan-assembler-times "rev\\t" 4 { target { ! arm_nothumb } } 
> } }  */
> +/* { dg-final { scan-assembler-times "revsh\\t" 2 } }  */
> +/* { dg-final { scan-assembler-times "rev16\\t" 2 } }  */
> +/* { dg-final { scan-assembler-times "rev\\t" 4 } }  */
>  
>  #include "builtin-bswap.x"

This bit isn't.  The correct fix here is to fix the pattern(s) concerned to add 
the missing predicate.

Note that builtin-bswap.x explicitly mentions predicated mnemonics in the 
comments.

R.


Re: [PATCH]AArch64: xfail modes_1.f90 [PR107071]

2024-02-19 Thread Richard Earnshaw (lists)
On 19/02/2024 10:58, Tamar Christina wrote:
>> -Original Message-
>> From: Tamar Christina
>> Sent: Thursday, February 15, 2024 11:05 AM
>> To: Richard Earnshaw (lists) ; gcc-
>> patc...@gcc.gnu.org
>> Cc: nd ; Marcus Shawcroft ; Kyrylo
>> Tkachov ; Richard Sandiford
>> 
>> Subject: RE: [PATCH]AArch64: xfail modes_1.f90 [PR107071]
>>
>>> -Original Message-
>>> From: Richard Earnshaw (lists) 
>>> Sent: Thursday, February 15, 2024 11:01 AM
>>> To: Tamar Christina ; gcc-patches@gcc.gnu.org
>>> Cc: nd ; Marcus Shawcroft ;
>> Kyrylo
>>> Tkachov ; Richard Sandiford
>>> 
>>> Subject: Re: [PATCH]AArch64: xfail modes_1.f90 [PR107071]
>>>
>>> On 15/02/2024 10:57, Tamar Christina wrote:
>>>> Hi All,
>>>>
>>>> This test has never worked on AArch64 since the day it was committed.  It 
>>>> has
>>>> a number of issues that prevent it from working on AArch64:
>>>>
>>>> 1.  IEEE does not require that FP operations raise a SIGFPE for FP 
>>>> operations,
>>>>     only that an exception is raised somehow.
>>>>
>>>> 2. Most Arm designed cores don't raise SIGFPE and instead set a status 
>>>> register
>>>>    and some partner cores raise a SIGILL instead.
>>>>
>>>> 3. The way it checks for feenableexcept doesn't really work for AArch64.
>>>>
>>>> As such this test doesn't seem to really provide much value on AArch64 so 
>>>> we
>>>> should just xfail it.
>>>>
>>>> Regtested on aarch64-none-linux-gnu and no issues.
>>>>
>>>> Ok for master?
>>>
>>> Wouldn't it be better to just skip the test.  XFAIL just adds clutter to 
>>> verbose
>> output
>>> and suggests that someday the tools might be fixed for this case.
>>>
>>> Better still would be a new dg-requires fp_exceptions_raise_sigfpe as a 
>>> guard for
>>> the test.
>>
> 
> It looks like this is similar to 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78314 so
> I'll just similarly skip it.
> 
> --- inline copy of patch ---
> 
> diff --git a/gcc/testsuite/gfortran.dg/ieee/modes_1.f90 
> b/gcc/testsuite/gfortran.dg/ieee/modes_1.f90
> index 
> 205c47f38007d06116289c19d6b23cf3bf83bd48..e29d8c678e6e51c3f2e5dac53c7703bb18a99ac4
>  100644
> --- a/gcc/testsuite/gfortran.dg/ieee/modes_1.f90
> +++ b/gcc/testsuite/gfortran.dg/ieee/modes_1.f90
> @@ -1,5 +1,5 @@
>  ! { dg-do run }
> -!
> +! { dg-skip-if "PR libfortran/78314" { aarch64*-*-gnu* arm*-*-gnueabi 
> arm*-*-gnueabihf } }
>  ! Test IEEE_MODES_TYPE, IEEE_GET_MODES and IEEE_SET_MODES
>  
> Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?

OK, but please give the fortran maintainers 24hrs to comment before pushing.

R.

> 
> Thanks,
> Tamar
> 
> gcc/testsuite/ChangeLog:
> 
>   PR fortran/107071
>   * gfortran.dg/ieee/modes_1.f90: skip aarch64, arm.



Re: [PATCH][GCC][Arm] Missing optimization pattern for rev16 on architectures with thumb1

2024-02-15 Thread Richard Earnshaw (lists)
On 12/02/2024 13:48, Matthieu Longo wrote:
> This patch marks a rev16 test as XFAIL for architectures having only Thumb1 
> support. The generated code is functionally correct, but the optimization is 
> disabled when -mthumb is equivalent to Thumb1. Fixing the root issue would 
> requires changes that are not suitable for GCC14 stage 4.
> 
> More information at https://linaro.atlassian.net/browse/GNU-1141
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.target/arm/rev16_2.c: XFAIL when compiled with Thumb1.

Thanks, I've tweaked the commit message slightly and pushed this.

R.


Re: [PATCH]AArch64: xfail modes_1.f90 [PR107071]

2024-02-15 Thread Richard Earnshaw (lists)
On 15/02/2024 10:57, Tamar Christina wrote:
> Hi All,
> 
> This test has never worked on AArch64 since the day it was committed.  It has
> a number of issues that prevent it from working on AArch64:
> 
> 1.  IEEE does not require that FP operations raise a SIGFPE for FP operations,
>     only that an exception is raised somehow.
> 
> 2. Most Arm designed cores don't raise SIGFPE and instead set a status 
> register
>    and some partner cores raise a SIGILL instead.
> 
> 3. The way it checks for feenableexcept doesn't really work for AArch64.
> 
> As such this test doesn't seem to really provide much value on AArch64 so we
> should just xfail it.
> 
> Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?

Wouldn't it be better to just skip the test.  XFAIL just adds clutter to 
verbose output and suggests that someday the tools might be fixed for this case.

Better still would be a new dg-requires fp_exceptions_raise_sigfpe as a guard 
for the test.

R.

> 
> Thanks,
> Tamar
> 
> gcc/testsuite/ChangeLog:
> 
>     PR fortran/107071
>     * gfortran.dg/ieee/modes_1.f90: xfail aarch64.
> 
> --- inline copy of patch --
> diff --git a/gcc/testsuite/gfortran.dg/ieee/modes_1.f90 
> b/gcc/testsuite/gfortran.dg/ieee/modes_1.f90
> index 
> 205c47f38007d06116289c19d6b23cf3bf83bd48..3667571969427ae7b2b96684ec1af8b3fdd4985f
>  100644
> --- a/gcc/testsuite/gfortran.dg/ieee/modes_1.f90
> +++ b/gcc/testsuite/gfortran.dg/ieee/modes_1.f90
> @@ -1,4 +1,4 @@
> -! { dg-do run }
> +! { dg-do run { xfail { aarch64*-*-* } } }
>  !
>  ! Test IEEE_MODES_TYPE, IEEE_GET_MODES and IEEE_SET_MODES
>  
> 
> 
> 
> 
> -- 



Re: [PATCH] Arm: Fix incorrect tailcall-generation for indirect calls [PR113780]

2024-02-14 Thread Richard Earnshaw (lists)
On 14/02/2024 09:20, Tejas Belagod wrote:
> On 2/7/24 11:41 PM, Richard Earnshaw (lists) wrote:
>> On 07/02/2024 07:59, Tejas Belagod wrote:
>>> This patch fixes a bug that causes indirect calls in PAC-enabled functions
>>> to be tailcalled incorrectly when all argument registers R0-R3 are used.
>>>
>>> Tested on arm-none-eabi for armv8.1-m.main. OK for trunk?
>>>
>>> 2024-02-07  Tejas Belagod  
>>>
>>> PR target/113780
>>> * gcc/config/arm.cc (arm_function_ok_for_sibcall): Don't allow tailcalls
>>>   for indirect calls with 4 or more arguments in pac-enabled functions.
>>>
>>> * gcc.target/arm/pac-sibcall.c: New.
>>> ---
>>>   gcc/config/arm/arm.cc  | 12 
>>>   gcc/testsuite/gcc.target/arm/pac-sibcall.c | 11 +++
>>>   2 files changed, 19 insertions(+), 4 deletions(-)
>>>   create mode 100644 gcc/testsuite/gcc.target/arm/pac-sibcall.c
>>>
>>> diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
>>> index c44047c377a..c1f8286a4d4 100644
>>> --- a/gcc/config/arm/arm.cc
>>> +++ b/gcc/config/arm/arm.cc
>>> @@ -7980,10 +7980,14 @@ arm_function_ok_for_sibcall (tree decl, tree exp)
>>>     && DECL_WEAK (decl))
>>>   return false;
>>>   -  /* We cannot do a tailcall for an indirect call by descriptor if all 
>>> the
>>> - argument registers are used because the only register left to load the
>>> - address is IP and it will already contain the static chain.  */
>>> -  if (!decl && CALL_EXPR_BY_DESCRIPTOR (exp) && !flag_trampolines)
>>> +  /* We cannot do a tailcall for an indirect call by descriptor or for an
>>> + indirect call in a pac-enabled function if all the argument registers
>>> + are used because the only register left to load the address is IP and
>>> + it will already contain the static chain or the PAC signature in the
>>> + case of PAC-enabled functions.  */
>>
>> This comment is becoming a bit unwieldy.  I suggest restructuring it as:
>>
>> We cannot tailcall an indirect call by descriptor if all the call-clobbered
>> general registers are live (r0-r3 and ip).  This can happen when:
>>    - IP contains the static chain, or
>>    - IP is needed for validating the PAC signature.
>>
>>
>>> +  if (!decl
>>> +  && ((CALL_EXPR_BY_DESCRIPTOR (exp) && !flag_trampolines)
>>> +  || arm_current_function_pac_enabled_p()))
>>>   {
>>>     tree fntype = TREE_TYPE (TREE_TYPE (CALL_EXPR_FN (exp)));
>>>     CUMULATIVE_ARGS cum;
>>> diff --git a/gcc/testsuite/gcc.target/arm/pac-sibcall.c 
>>> b/gcc/testsuite/gcc.target/arm/pac-sibcall.c
>>> new file mode 100644
>>> index 000..c57bf7a952c
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/arm/pac-sibcall.c
>>> @@ -0,0 +1,11 @@
>>> +/* Testing return address signing.  */
>>> +/* { dg-do compile } */
>>> +/* { dg-require-effective-target mbranch_protection_ok } */
>>> +/* { dg-options " -mcpu=cortex-m85 -mbranch-protection=pac-ret+leaf -O2" } 
>>> */
>>
>> No, you can't just add options like this, you need to first check that they 
>> won't result in conflicts with other options on the command line.  See 
>> https://gcc.gnu.org/pipermail/gcc-patches/2024-January/644077.html for an 
>> example of how to handle this.
>>
> Thanks for the review, Richard. Respin attached.
> 
> Thanks,
> Tejas.
> 
>>> +
>>> +void fail(void (*f)(int, int, int, int))
>>> +{
>>> +  f(1, 2, 3, 4);
>>> +}
>>> +
>>> +/* { dg-final { scan-assembler-not "bx\tip\t@ indirect register sibling 
>>> call" } } */
>>
>> R.
>>
+++ b/gcc/testsuite/gcc.target/arm/pac-sibcall.c
@@ -0,0 +1,14 @@
+/* If all call-clobbered general registers are live (r0-r3, ip), disable
+   indirect tail-call for a PAC-enabled function.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target mbranch_protection_ok } */
This only checks if -mbranch-protection can work with the existing 
architecture/cpu; not with the flags you're about to add below.  You should 
check for arm_arch_v8_1m_main_pacbti_ok instead; then you can assume that 
-mbranch-protection can be added.

+/* { dg-add-options arm_arch_v8_1m_main_pacbti } */
+/* { dg-additional-options "-mbranch-protection=pac-ret+leaf -O2" } */

Otherwise this is OK if you fix the above.

R.


Re: [PATCH] testsuite, arm: Fix testcase arm/pr112337.c to check for the options first

2024-02-09 Thread Richard Earnshaw (lists)
On 30/01/2024 17:07, Saurabh Jha wrote:
> Hey,
> 
> Previously, this test was added to fix this bug: 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112337. However, it did not 
> check the compilation options before using them, leading to errors.
> 
> This patch fixes the test by first checking whether it can use the options 
> before using them.
> 
> Tested for arm-none-eabi and found no regressions. The output of check-gcc 
> with RUNTESTFLAGS="arm.exp=*" changed like this:
> 
> Before:
> # of expected passes  5963
> # of unexpected failures  64
> 
> After:
> # of expected passes  5964
> # of unexpected failures  63
> 
> Ok for master?
> 
> Regards,
> Saurabh
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.target/arm/pr112337.c: Check whether we can use the compilation 
> options before using them.

My apologies for missing this earlier.  It didn't show up in patchwork. That's 
most likely because the attachment is a binary blob instead of text/plain.  
That also means that the Linaro CI system hasn't seen this patch either.  
Please can you fix your mailer to add plain text patch files.

-/* { dg-options "-O2 -march=armv8.1-m.main+fp.dp+mve.fp -mfloat-abi=hard" } */
+/* { dg-require-effective-target arm_hard_ok } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-options "-O2 -mfloat-abi=hard" } */
+/* { dg-add-options arm_v8_1m_mve } */

This is moving in the right direction, but it adds more than necessary now: 
checking for, and adding -mfloat-abi=hard is not necessary any more as 
arm_v8_1m_mve_ok will work out what float-abi flags are needed to make the 
options work. (What's more, it will prevent the test from running if the base 
configuration of the compiler is incompatible with the hard float ABI, which is 
more than we need.).

So please can you re-spin removing the hard-float check and removing that from 
dg-options.

Thanks,
R.


Re: [PATCH] Arm: Fix incorrect tailcall-generation for indirect calls [PR113780]

2024-02-07 Thread Richard Earnshaw (lists)
On 07/02/2024 07:59, Tejas Belagod wrote:
> This patch fixes a bug that causes indirect calls in PAC-enabled functions
> to be tailcalled incorrectly when all argument registers R0-R3 are used.
> 
> Tested on arm-none-eabi for armv8.1-m.main. OK for trunk?
> 
> 2024-02-07  Tejas Belagod  
> 
>   PR target/113780
>   * gcc/config/arm.cc (arm_function_ok_for_sibcall): Don't allow tailcalls
>   for indirect calls with 4 or more arguments in pac-enabled functions.
> 
>   * gcc.target/arm/pac-sibcall.c: New.
> ---
>  gcc/config/arm/arm.cc  | 12 
>  gcc/testsuite/gcc.target/arm/pac-sibcall.c | 11 +++
>  2 files changed, 19 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/pac-sibcall.c
> 
> diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
> index c44047c377a..c1f8286a4d4 100644
> --- a/gcc/config/arm/arm.cc
> +++ b/gcc/config/arm/arm.cc
> @@ -7980,10 +7980,14 @@ arm_function_ok_for_sibcall (tree decl, tree exp)
>&& DECL_WEAK (decl))
>  return false;
>  
> -  /* We cannot do a tailcall for an indirect call by descriptor if all the
> - argument registers are used because the only register left to load the
> - address is IP and it will already contain the static chain.  */
> -  if (!decl && CALL_EXPR_BY_DESCRIPTOR (exp) && !flag_trampolines)
> +  /* We cannot do a tailcall for an indirect call by descriptor or for an
> + indirect call in a pac-enabled function if all the argument registers
> + are used because the only register left to load the address is IP and
> + it will already contain the static chain or the PAC signature in the
> + case of PAC-enabled functions.  */

This comment is becoming a bit unwieldy.  I suggest restructuring it as:

We cannot tailcall an indirect call by descriptor if all the call-clobbered
general registers are live (r0-r3 and ip).  This can happen when:
  - IP contains the static chain, or
  - IP is needed for validating the PAC signature.


> +  if (!decl
> +  && ((CALL_EXPR_BY_DESCRIPTOR (exp) && !flag_trampolines)
> +   || arm_current_function_pac_enabled_p()))
>  {
>tree fntype = TREE_TYPE (TREE_TYPE (CALL_EXPR_FN (exp)));
>CUMULATIVE_ARGS cum;
> diff --git a/gcc/testsuite/gcc.target/arm/pac-sibcall.c 
> b/gcc/testsuite/gcc.target/arm/pac-sibcall.c
> new file mode 100644
> index 000..c57bf7a952c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/pac-sibcall.c
> @@ -0,0 +1,11 @@
> +/* Testing return address signing.  */
> +/* { dg-do compile } */
> +/* { dg-require-effective-target mbranch_protection_ok } */
> +/* { dg-options " -mcpu=cortex-m85 -mbranch-protection=pac-ret+leaf -O2" } */

No, you can't just add options like this, you need to first check that they 
won't result in conflicts with other options on the command line.  See 
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/644077.html for an 
example of how to handle this.

> +
> +void fail(void (*f)(int, int, int, int))
> +{
> +  f(1, 2, 3, 4);
> +}
> +
> +/* { dg-final { scan-assembler-not "bx\tip\t@ indirect register sibling 
> call" } } */

R.



Re: [PATCH v2] arm: Fix missing bti instruction for virtual thunks

2024-02-02 Thread Richard Earnshaw (lists)
On 26/01/2024 15:31, Richard Ball wrote:
> v2: Formatting and test options fix.
> 
> Adds missing bti instruction at the beginning of a virtual
> thunk, when bti is enabled.
> 
> gcc/ChangeLog:
> 
>   * config/arm/arm.cc (arm_output_mi_thunk): Emit
>   insn for bti_c when bti is enabled.
> 
> gcc/testsuite/ChangeLog:
> 
> * lib/target-supports.exp: Add v8_1_m_main_pacbti.
> * g++.target/arm/bti_thunk.C: New test.

OK, thanks.

R.


Re: [PATCH v3 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-01-31 Thread Richard Earnshaw (lists)
On 30/01/2024 14:09, Andre Simoes Dias Vieira wrote:
> Hi Richard,
> 
> Thanks for the reviews, I'm making these changes but just a heads up.
> 
> When hardcoding LR_REGNUM like this we need to change the way we compare the 
> register in doloop_condition_get. This function currently compares the rtx 
> nodes by address, which I think happens to be fine before we assign hard 
> registers, as I suspect we always share the rtx node for the same pseudo, but 
> when assigning registers it seems like we create copies, so things like:
> `XEXP (inc_src, 0) == reg` will fail for
> inc_src: (plus (reg LR) (const_int -n)'
> reg: (reg LR)
> 
> Instead I will substitute the operand '==' with calls to 'rtx_equal_p (op1, 
> op2, NULL)'.

Yes, that's fine.

R.

> 
> Sound good?
> 
> Kind regards,
> Andre
> 
> ____
> From: Richard Earnshaw (lists) 
> Sent: Tuesday, January 30, 2024 11:36 AM
> To: Andre Simoes Dias Vieira; gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov; Stam Markianos-Wright
> Subject: Re: [PATCH v3 2/2] arm: Add support for MVE Tail-Predicated Low 
> Overhead Loops
> 
> On 19/01/2024 14:40, Andre Vieira wrote:
>>
>> Respin after comments from Kyrill and rebase. I also removed an if-then-else
>> construct in arm_mve_check_reg_origin_is_num_elems similar to the other 
>> functions
>> Kyrill pointed out.
>>
>> After an earlier comment from Richard Sandiford I also added comments to the
>> two tail predication patterns added to explain the need for the unspecs.
> 
> [missing ChangeLog]
> 
> I'm just going to focus on loop-doloop.c in this reply, I'll respond to the 
> other bits in a follow-up.
> 
>   2)  (set (reg) (plus (reg) (const_int -1))
> - (set (pc) (if_then_else (reg != 0)
> -(label_ref (label))
> -(pc))).
> +(set (pc) (if_then_else (reg != 0)
> +(label_ref (label))
> +(pc))).
> 
>   Some targets (ARM) do the comparison before the branch, as in the
>   following form:
> 
> - 3) (parallel [(set (cc) (compare ((plus (reg) (const_int -1), 0)))
> -   (set (reg) (plus (reg) (const_int -1)))])
> -(set (pc) (if_then_else (cc == NE)
> ...
> 
> 
> This comment is becoming confusing.  Really the text leading up to 3)... 
> should be inside 3.  Something like:
> 
>   3) Some targets (ARM) do the comparison before the branch, as in the
>   following form:
> 
>   (parallel [(set (cc) (compare (plus (reg) (const_int -1)) 0))
>  (set (reg) (plus (reg) (const_int -1)))])
>   (set (pc) (if_then_else (cc == NE)
>   (label_ref (label))
>   (pc)))])
> 
> 
> The same issue on the comment structure also applies to the new point 4...
> 
> +  The ARM target also supports a special case of a counter that 
> decrements
> +  by `n` and terminating in a GTU condition.  In that case, the compare 
> and
> +  branch are all part of one insn, containing an UNSPEC:
> +
> +  4) (parallel [
> +   (set (pc)
> +   (if_then_else (gtu (unspec:SI [(plus:SI (reg:SI 14 lr)
> +   (const_int -n))])
> +  (const_int n-1]))
> +   (label_ref)
> +   (pc)))
> +   (set (reg:SI 14 lr)
> +(plus:SI (reg:SI 14 lr)
> + (const_int -n)))
> + */
> 
> I think this needs a bit more clarification.  Specifically that this 
> construct supports a predicated vectorized do loop.  Also, the placement of 
> the unspec inside the comparison is ugnly and unnecessary.  It should be 
> sufficient to have the unspec inside a USE expression, which the mid-end can 
> then ignore entirely.  So
> 
> (parallel
>  [(set (pc) (if_then_else (gtu (plus (reg) (const_int -n))
>(const_int n-1))
>   (label_ref) (pc)))
>   (set (reg) (plus (reg) (const_int -n)))
>   (additional clobbers and uses)])
> 
> For Arm, we then add a (use (unspec [(const_int 0)] N)) that is specific to 
> this pattern to stop anything else from matching it.
> 
> Note that we don't need to mention that the register is 'LR' or the modes, 
> those are specific to a particular backend, not the generic pattern we want 
> to match.
> 
> +  || !CONST_INT_P (XEXP (inc_src, 1))
> +  || INTVAL (XEXP (inc_src, 1)) >= 0)
>  re

Re: [PATCH v3 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-01-30 Thread Richard Earnshaw (lists)
On 19/01/2024 14:40, Andre Vieira wrote:
> 
> Respin after comments from Kyrill and rebase. I also removed an if-then-else
> construct in arm_mve_check_reg_origin_is_num_elems similar to the other 
> functions
> Kyrill pointed out.
> 
> After an earlier comment from Richard Sandiford I also added comments to the
> two tail predication patterns added to explain the need for the unspecs.

[missing ChangeLog]

I'm just going to focus on loop-doloop.c in this reply, I'll respond to the 
other bits in a follow-up.

  2)  (set (reg) (plus (reg) (const_int -1))
- (set (pc) (if_then_else (reg != 0)
-(label_ref (label))
-(pc))).  
+(set (pc) (if_then_else (reg != 0)
+(label_ref (label))
+(pc))).
 
  Some targets (ARM) do the comparison before the branch, as in the
  following form:
 
- 3) (parallel [(set (cc) (compare ((plus (reg) (const_int -1), 0)))
-   (set (reg) (plus (reg) (const_int -1)))])
-(set (pc) (if_then_else (cc == NE)
...


This comment is becoming confusing.  Really the text leading up to 3)... should 
be inside 3.  Something like:

  3) Some targets (ARM) do the comparison before the branch, as in the
  following form:
 
  (parallel [(set (cc) (compare (plus (reg) (const_int -1)) 0))
 (set (reg) (plus (reg) (const_int -1)))])
  (set (pc) (if_then_else (cc == NE)
  (label_ref (label))
  (pc)))])


The same issue on the comment structure also applies to the new point 4...

+  The ARM target also supports a special case of a counter that decrements
+  by `n` and terminating in a GTU condition.  In that case, the compare and
+  branch are all part of one insn, containing an UNSPEC:
+
+  4) (parallel [
+   (set (pc)
+   (if_then_else (gtu (unspec:SI [(plus:SI (reg:SI 14 lr)
+   (const_int -n))])
+  (const_int n-1]))
+   (label_ref)
+   (pc)))
+   (set (reg:SI 14 lr)
+(plus:SI (reg:SI 14 lr)
+ (const_int -n)))
+ */

I think this needs a bit more clarification.  Specifically that this construct 
supports a predicated vectorized do loop.  Also, the placement of the unspec 
inside the comparison is ugnly and unnecessary.  It should be sufficient to 
have the unspec inside a USE expression, which the mid-end can then ignore 
entirely.  So

(parallel
 [(set (pc) (if_then_else (gtu (plus (reg) (const_int -n))
   (const_int n-1))
  (label_ref) (pc)))
  (set (reg) (plus (reg) (const_int -n)))
  (additional clobbers and uses)])

For Arm, we then add a (use (unspec [(const_int 0)] N)) that is specific to 
this pattern to stop anything else from matching it.

Note that we don't need to mention that the register is 'LR' or the modes, 
those are specific to a particular backend, not the generic pattern we want to 
match.

+  || !CONST_INT_P (XEXP (inc_src, 1))
+  || INTVAL (XEXP (inc_src, 1)) >= 0)
 return 0;
+  int dec_num = abs (INTVAL (XEXP (inc_src, 1)));

We can just use '-INTVAL(...)' here, we've verified just above that the 
constant is negative.

-  if ((XEXP (condition, 0) == reg)
+  /* For the ARM special case of having a GTU: re-form the condition without
+ the unspec for the benefit of the middle-end.  */
+  if (GET_CODE (condition) == GTU)
+{
+  condition = gen_rtx_fmt_ee (GTU, VOIDmode, inc_src,
+ GEN_INT (dec_num - 1));
+  return condition;
+}

If you make the change I mentioned above, this re-forming isn't needed any 
more, so the arm-specific comment goes away
 
-   {
+{
  if (GET_CODE (pattern) != PARALLEL)
  /*  For the second form we expect:

You've fixed the indentation of the brace (good), but the body of the braced 
expression needs re-indenting as well.

R.



Re: [PATCH v3 1/2] arm: Add define_attr to to create a mapping between MVE predicated and unpredicated insns

2024-01-30 Thread Richard Earnshaw (lists)
On 19/01/2024 14:40, Andre Vieira wrote:
> 
> Reposting for testing purposes, no changes from v2 (other than rebase).

We seem to have lost the ChangeLog for this hunk :(

The code itself looks OK, though.


Re: [PATCH] Make gcc.target/arm/bics_3.c testcase a bit more generic [PR113542]

2024-01-25 Thread Richard Earnshaw (lists)
On 25/01/2024 10:29, Maxim Kuvyrkov wrote:
> After fwprop improvement in r14-8319-g86de9b66480, codegen in
> bics_3.c test changed from "bics" to "bic" instruction, with
> the overall instruction stream remaining at the same quality.
> 
> This patch makes the scan-assembler directive accept both
> "bics" and "bic".
> 
> BEFORE r14-8319-g86de9b66480:
>   bicsr0, r0, r1 @ 9  [c=4 l=4]  *andsi_notsi_si_compare0_scratch
>   mov r0, #1  @ 23[c=4 l=4]  *thumb2_movsi_vfp/1
>   it  eq
>   moveq   r0, #0  @ 26[c=8 l=4]  *p *thumb2_movsi_vfp/2
>   bx  lr  @ 29[c=8 l=4]  *thumb2_return
> 
> AFTER r14-8319-g86de9b66480:
>   bic r0, r0, r1  @ 8 [c=4 l=4]  andsi_notsi_si
>   subsr0, r0, #0  @ 22[c=4 l=4]  cmpsi2_addneg/0
>   it  ne
>   movne   r0, #1  @ 23[c=8 l=4]  *p *thumb2_movsi_vfp/2
>   bx  lr  @ 26[c=8 l=4]  *thumb2_return
> 
> gcc/testsuite/ChangeLog:
> 
>   PR target/113542
>   * gcc.target/arm/bics_3.c: Update scan-assembler directive.
> ---
>  gcc/testsuite/gcc.target/arm/bics_3.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/bics_3.c 
> b/gcc/testsuite/gcc.target/arm/bics_3.c
> index e056b264e15..c5bed3c92d2 100644
> --- a/gcc/testsuite/gcc.target/arm/bics_3.c
> +++ b/gcc/testsuite/gcc.target/arm/bics_3.c
> @@ -35,6 +35,6 @@ main (void)
>return 0;
>  }
>  
> -/* { dg-final { scan-assembler-times "bics\tr\[0-9\]+, r\[0-9\]+, r\[0-9\]+" 
> 2 } } */
> -/* { dg-final { scan-assembler-times "bics\tr\[0-9\]+, r\[0-9\]+, r\[0-9\]+, 
> .sl #2" 1 } } */
> +/* { dg-final { scan-assembler-times "bics?\tr\[0-9\]+, r\[0-9\]+, 
> r\[0-9\]+" 2 } } */
> +/* { dg-final { scan-assembler-times "bics?\tr\[0-9\]+, r\[0-9\]+, 
> r\[0-9\]+, .sl #2" 1 } } */
>  


The test was added (r6-823-g0454e698401a3e) specifically to check that a BICS 
instruction was being generated.  Whether or not that is right is somewhat 
debatable, but this change seems to be papering over a different issue.

Either we should generate BICS, making this change incorrect, or we should 
disable the test for thumb code on the basis that this isn't really a win.

But really, we should fix the compiler to do better here.  We really want 
something like

BICS  r0, r0, r1  // r0 is 0 or non-zero
MOVNE r0, #1  // convert all non-zero to 1

in Arm state (ie using the BICS instruction to set the result to zero); and in 
thumb2, perhaps something like:

BICS  r0, r0, r1
ITne
MOVNE r0, #1

or maybe even better:

BIC  r0, r0, r1
SUBS r1, r0, #1
SBC  r0, r0, r1

which is slightly better than BICS because SUBS breaks a condition-code chain 
(all the flag bits are set).

There are similar quality issues for other NE(arith-op, 0) cases; we just don't 
have tests for those.

R.


Re: [PATCH] arm: Fix missing bti instruction for virtual thunks

2024-01-24 Thread Richard Earnshaw (lists)
On 23/01/2024 15:53, Richard Ball wrote:
> Adds missing bti instruction at the beginning of a virtual
> thunk, when bti is enabled.
> 
> gcc/ChangeLog:
> 
>   * config/arm/arm.cc (arm_output_mi_thunk): Emit
>   insn for bti_c when bti is enabled.
> 
> gcc/testsuite/ChangeLog:
> 
> * g++.target/arm/bti_thunk.C: New test.


diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index 
e5a944486d7bd583627b0e22dfe8f95862e975bb..91eee8be7c1a59118fbf443557561fb3e0689d61
 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -29257,6 +29257,8 @@ arm_output_mi_thunk (FILE *file, tree thunk, 
HOST_WIDE_INT delta,
   const char *fnname = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (thunk));
 
   assemble_start_function (thunk, fnname);
+  if (aarch_bti_enabled ())
+emit_insn (aarch_gen_bti_c());

Missing space between ...bit_c and the parenthesis.

   if (TARGET_32BIT)
 arm32_output_mi_thunk (file, thunk, delta, vcall_offset, function);
   else

diff --git a/gcc/testsuite/g++.target/arm/bti_thunk.C 
b/gcc/testsuite/g++.target/arm/bti_thunk.C
new file mode 100644
index 
..5c4a8e5a8d74581eca2b877c000a5b34ddca0e9b
--- /dev/null
+++ b/gcc/testsuite/g++.target/arm/bti_thunk.C
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-march=armv8.1-m.main+pacbti -O1 -mbranch-protection=bti 
--save-temps" } */

You can't just add options like this; they may not work with other options 
passed by the testsuite framework.  Instead, you should a suitable entry to 
lib/target-supports.exp in the table starting "foreach { armfunc armflag 
armdefs } {" that tests whether the options can be safely added, and then use 
dg-require-effective-target and dg-add-options for your new set of options.

\ No newline at end of file

Please add one :)

R.


Re: [PATCH][GCC][Arm] Add pattern for bswap + rotate -> rev16 [Bug 108933]

2024-01-22 Thread Richard Earnshaw (lists)
On 22/01/2024 12:18, Matthieu Longo wrote:
> rev16 pattern was not recognised anymore as a change in the bswap tree
> pass was introducing a new GIMPLE form, not recognized by the assembly
> final transformation pass.
> 
> More details in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108933
> 
> gcc/ChangeLog:
> 
>     PR target/108933
>     * config/arm/arm.md (*arm_rev16si2_alt3): new pattern to convert
>   a bswap + rotate by 16 bits into rev16

ChangeLog entries need to be written as sentences, so start with a capital 
letter and end with a full stop; continuation lines should start in column 8 
(one hard tab, don't use spaces).  But in this case, "New pattern." is 
sufficient.

> 
> gcc/testsuite/ChangeLog:
> 
>     PR target/108933
>     * gcc.target/arm/rev16.c: Moved to...
>     * gcc.target/arm/rev16_1.c: ...here.
>     * gcc.target/arm/rev16_2.c: New test to check that rev16 is
>   emitted.


+;; Similar pattern to match (rotate (bswap) 16)
+(define_insn "*arm_rev16si2_alt3"
+  [(set (match_operand:SI 0 "register_operand" "=l,r")
+(rotate:SI (bswap:SI (match_operand:SI 1 "register_operand" "l,r"))
+ (const_int 16)))]
+  "arm_arch6"
+  "rev16\\t%0, %1"
+  [(set_attr "arch" "t,32")
+   (set_attr "length" "2,4")
+   (set_attr "type" "rev")]
+)
+

Unfortunately, this is insufficient.  When generating Arm or Thumb2 code (but 
not thumb1) we also have to handle conditional execution: we need to have '%?' 
in the output template at the point where a condition code might be needed.  
That means we need separate output templates for all three alternatives (as we 
need a 16-bit variant for thumb2 that's conditional and a 16-bit for thumb1 
that isn't).  See the output of arm_rev16 for a guide of what is really needed.

I note that the arm_rev16si2_alt1, and arm_rev16si2_alt2 patterns are incorrect 
in this regard as well; that will need fixing.

I also see that arm_rev16si2 currently expands to the alt1 variant above; given 
that the preferred canonical form would now appear to use bswap + rotate, we 
should change that as well.  In fact, we can merge your new pattern with the 
expand entirely and eliminate the need to call gen_arm_rev16si2_alt1.  
Something like:

(define_insn "arm_rev16si2"
  [(set (match_operand:SI 0 "s_register_operand")
(rotate:SI (bswap:SI (match_operand:SI 1 "s_register_operand")) 
(const_int 16))]
  "arm_arch6"
  "@
  rev16...
  ...


R.



Re: [PATCH] arm: Fix parsecpu.awk for aliases [PR113030]

2024-01-22 Thread Richard Earnshaw (lists)
On 21/01/2024 07:29, Andrew Pinski wrote:
> So the problem here is the 2 functions check_cpu and check_arch use
> the wrong variable to check if an alias is valid for that cpu/arch.
> check_cpu uses cpu_optaliases instead of cpu_opt_alias. cpu_optaliases
> is an array of index'ed by the cpuname that contains all of the valid aliases
> for that cpu but cpu_opt_alias is an double index array which is index'ed
> by cpuname and the alias which provides what is the alias for that option.
> Similar thing happens for check_arch and arch_optaliases vs arch_optaliases.
> 
> Tested by running:
> ```
> awk -f config/arm/parsecpu.awk -v cmd="chkarch armv7-a+simd" 
> config/arm/arm-cpus.in
> awk -f config/arm/parsecpu.awk -v cmd="chkarch armv7-a+neon" 
> config/arm/arm-cpus.in
> awk -f config/arm/parsecpu.awk -v cmd="chkarch armv7-a+neon-vfpv3" 
> config/arm/arm-cpus.in
> ```
> And they don't return error back.
> 
> gcc/ChangeLog:
> 
>   PR target/113030
>   * config/arm/parsecpu.awk (check_cpu): Use cpu_opt_alias
>   instead of cpu_optaliases.
>   (check_arch): Use arch_opt_alias instead of arch_optaliases.

OK

Thanks,

R.

> 
> Signed-off-by: Andrew Pinski 
> ---
>  gcc/config/arm/parsecpu.awk | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/config/arm/parsecpu.awk b/gcc/config/arm/parsecpu.awk
> index ddd4f3b440a..384462bdb5b 100644
> --- a/gcc/config/arm/parsecpu.awk
> +++ b/gcc/config/arm/parsecpu.awk
> @@ -529,7 +529,7 @@ function check_cpu (name) {
>  
>  for (n = 2; n <= exts; n++) {
>   if (!((cpu_name, extensions[n]) in cpu_opt_remove)  \
> - && !((cpu_name, extensions[n]) in cpu_optaliases)) {
> + && !((cpu_name, extensions[n]) in cpu_opt_alias)) {
>   return "error"
>   }
>  }
> @@ -552,7 +552,7 @@ function check_arch (name) {
>  
>  for (n = 2; n <= exts; n++) {
>   if (!((extensions[1], extensions[n]) in arch_opt_remove)\
> - && !((extensions[1], extensions[n]) in arch_optaliases)) {
> + && !((extensions[1], extensions[n]) in arch_opt_alias)) {
>   return "error"
>   }
>  }



Re: [PATCH v3 00/12] [GCC] arm: vld1q vst1 vst1q vst1 intrinsics

2024-01-12 Thread Richard Earnshaw (lists)
On 02/01/2024 09:23, ezra.sito...@arm.com wrote:
> From: Ezra Sitorus 
> 
> Add vld1q, vst1, vst1q and vst1 intrinsics to arm port.
> 
> Ezra Sitorus (12):
>   [GCC] arm: vld1q_types_x2 ACLE intrinsics
>   [GCC] arm: vld1q_types_x3 ACLE intrinsics
>   [GCC] arm: vld1q_types_x4 ACLE intrinsics
>   [GCC] arm: vst1_types_x2 ACLE intrinsics
>   [GCC] arm: vst1_types_x3 ACLE intrinsics
>   [GCC] arm: vst1_types_x4 ACLE intrinsics
>   [GCC] arm: vst1q_types_x2 ACLE intrinsics
>   [GCC] arm: vst1q_types_x3 ACLE intrinsics
>   [GCC] arm: vst1q_types_x4 ACLE intrinsics
>   [GCC] arm: vld1_types_x2 ACLE intrinsics
>   [GCC] arm: vld1_types_x3 ACLE intrinsics
>   [GCC] arm: vld1_types_x4 ACLE intrinsics
> 
>  gcc/config/arm/arm_neon.h | 2032 ++---
>  gcc/config/arm/arm_neon_builtins.def  |   12 +
>  gcc/config/arm/iterators.md   |6 +
>  gcc/config/arm/neon.md|  249 ++
>  gcc/config/arm/unspecs.md |8 +
>  .../gcc.target/arm/simd/vld1_base_xN_1.c  |  176 ++
>  .../gcc.target/arm/simd/vld1_bf16_xN_1.c  |   23 +
>  .../gcc.target/arm/simd/vld1_fp16_xN_1.c  |   23 +
>  .../gcc.target/arm/simd/vld1_p64_xN_1.c   |   23 +
>  .../gcc.target/arm/simd/vld1q_base_xN_1.c |  183 ++
>  .../gcc.target/arm/simd/vld1q_bf16_xN_1.c |   24 +
>  .../gcc.target/arm/simd/vld1q_fp16_xN_1.c |   24 +
>  .../gcc.target/arm/simd/vld1q_p64_xN_1.c  |   24 +
>  .../gcc.target/arm/simd/vst1_base_xN_1.c  |  176 ++
>  .../gcc.target/arm/simd/vst1_bf16_xN_1.c  |   22 +
>  .../gcc.target/arm/simd/vst1_fp16_xN_1.c  |   23 +
>  .../gcc.target/arm/simd/vst1_p64_xN_1.c   |   23 +
>  .../gcc.target/arm/simd/vst1q_base_xN_1.c |  185 ++
>  .../gcc.target/arm/simd/vst1q_bf16_xN_1.c |   24 +
>  .../gcc.target/arm/simd/vst1q_fp16_xN_1.c |   24 +
>  .../gcc.target/arm/simd/vst1q_p64_xN_1.c  |   24 +
>  21 files changed, 3018 insertions(+), 290 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1_base_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1_bf16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1_fp16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1_p64_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1q_base_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1q_bf16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1q_fp16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1q_p64_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1_base_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1_bf16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1_fp16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1_p64_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1q_base_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1q_bf16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1q_fp16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1q_p64_xN_1.c
> 

Thanks, I've pushed this series.

Reviewing this series did highlight a couple of issues with the existing code 
base (not your patch); I'll follow up on these separately.

R.


Re: [PATCH v2 7/7] aarch64,arm: Move branch-protection data to targets

2024-01-11 Thread Richard Earnshaw (lists)
On 11/01/2024 14:43, Szabolcs Nagy wrote:
> The 12/07/2023 13:13, Richard Earnshaw wrote:
>> On 03/11/2023 15:36, Szabolcs Nagy wrote:
>>> * config/aarch64/aarch64.cc (aarch_handle_no_branch_protection): Copy.
>>> (aarch_handle_standard_branch_protection): Copy.
>>> (aarch_handle_pac_ret_protection): Copy.
>>> (aarch_handle_pac_ret_leaf): Copy.
>>> (aarch_handle_pac_ret_b_key): Copy.
>>> (aarch_handle_bti_protection): Copy.
>>
>> I think all of the above functions that have been moved back from
>> aarch-common should be renamed back to aarch64_..., unless they are directly
>> referenced statically by code in aarch-common.c.
> 
> done.
> 
>>> +const struct aarch_branch_protect_type aarch_branch_protect_types[] = {
>>
>> can this be made static now?  And maybe pass the structure as a parameter if
>> that's not done already.
> 
> done in v4.
> 
>> It would be nice if, when we raise an error, we could print out the list of
>> valid options (and modifiers), much like we do on Arm for -march/-mcpu.
>>
>> eg.
>> $ gcc -mcpu=crotex-a8
>> cc1: error: unrecognised -mcpu target: crotex-a8
>> cc1: note: valid arguments are: arm8 arm810 strongarm strongarm110 fa526
>> [...rest of list]; did you mean ‘cortex-a8’?
> 
> i implemented this with candidates_list_and_hint but it does
> not work very well if the typo is in a subtype, so i think
> this should be done in a separate patch if at all.
> 

I'd build the candidates list from all the types + subtypes, so that the 
suggestion code has a full list to pick from; but fair enough.

R.


Re: [PATCH v7] libgfortran: Replace mutex with rwlock

2023-12-14 Thread Richard Earnshaw (lists)
On 09/12/2023 15:39, Lipeng Zhu wrote:
> This patch try to introduce the rwlock and split the read/write to
> unit_root tree and unit_cache with rwlock instead of the mutex to
> increase CPU efficiency. In the get_gfc_unit function, the percentage
> to step into the insert_unit function is around 30%, in most instances,
> we can get the unit in the phase of reading the unit_cache or unit_root
> tree. So split the read/write phase by rwlock would be an approach to
> make it more parallel.
> 
> BTW, the IPC metrics can gain around 9x in our test
> server with 220 cores. The benchmark we used is
> https://github.com/rwesson/NEAT
> 
> libgcc/ChangeLog:
> 
>   * gthr-posix.h (__GTHREAD_RWLOCK_INIT): New macro.
>   (__gthrw): New function.
>   (__gthread_rwlock_rdlock): New function.
>   (__gthread_rwlock_tryrdlock): New function.
>   (__gthread_rwlock_wrlock): New function.
>   (__gthread_rwlock_trywrlock): New function.
>   (__gthread_rwlock_unlock): New function.
> 
> libgfortran/ChangeLog:
> 
>   * io/async.c (DEBUG_LINE): New macro.
>   * io/async.h (RWLOCK_DEBUG_ADD): New macro.
>   (CHECK_RDLOCK): New macro.
>   (CHECK_WRLOCK): New macro.
>   (TAIL_RWLOCK_DEBUG_QUEUE): New macro.
>   (IN_RWLOCK_DEBUG_QUEUE): New macro.
>   (RDLOCK): New macro.
>   (WRLOCK): New macro.
>   (RWUNLOCK): New macro.
>   (RD_TO_WRLOCK): New macro.
>   (INTERN_RDLOCK): New macro.
>   (INTERN_WRLOCK): New macro.
>   (INTERN_RWUNLOCK): New macro.
>   * io/io.h (struct gfc_unit): Change UNIT_LOCK to UNIT_RWLOCK in
>   a comment.
>   (unit_lock): Remove including associated internal_proto.
>   (unit_rwlock): New declarations including associated internal_proto.
>   (dec_waiting_unlocked): Use WRLOCK and RWUNLOCK on unit_rwlock
>   instead of __gthread_mutex_lock and __gthread_mutex_unlock on
>   unit_lock.
>   * io/transfer.c (st_read_done_worker): Use WRLOCK and RWUNLOCK on
>   unit_rwlock instead of LOCK and UNLOCK on unit_lock.
>   (st_write_done_worker): Likewise.
>   * io/unit.c: Change UNIT_LOCK to UNIT_RWLOCK in 'IO locking rules'
>   comment. Use unit_rwlock variable instead of unit_lock variable.
>   (get_gfc_unit_from_unit_root): New function.
>   (get_gfc_unit): Use RDLOCK, WRLOCK and RWUNLOCK on unit_rwlock
>   instead of LOCK and UNLOCK on unit_lock.
>   (close_unit_1): Use WRLOCK and RWUNLOCK on unit_rwlock instead of
>   LOCK and UNLOCK on unit_lock.
>   (close_units): Likewise.
>   (newunit_alloc): Use RWUNLOCK on unit_rwlock instead of UNLOCK on
>   unit_lock.
>   * io/unix.c (find_file): Use RDLOCK and RWUNLOCK on unit_rwlock
>   instead of LOCK and UNLOCK on unit_lock.
>   (flush_all_units): Use WRLOCK and RWUNLOCK on unit_rwlock instead
>   of LOCK and UNLOCK on unit_lock.
> 

It looks like this has broken builds on arm-none-eabi when using newlib:

In file included from /work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran
/runtime/error.c:27:
/work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/io.h: In function 
‘dec_waiting_unlocked’:
/work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/io.h:1023:3: error
: implicit declaration of function ‘WRLOCK’ [-Wimplicit-function-declaration]
 1023 |   WRLOCK (_rwlock);
  |   ^~
/work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/io.h:1025:3: error
: implicit declaration of function ‘RWUNLOCK’ [-Wimplicit-function-declaration]
 1025 |   RWUNLOCK (_rwlock);
  |   ^~~~


R.

> ---
> v1 -> v2:
> Limit the pthread_rwlock usage in libgcc only when __cplusplus isn't defined.
> 
> v2 -> v3:
> Rebase the patch with trunk branch.
> 
> v3 -> v4:
> Update the comments.
> 
> v4 -> v5:
> Fix typos and code formatter.
> 
> v5 -> v6:
> Add unit tests.
> 
> v6 -> v7:
> Update ChangeLog and code formatter.
> 
> Reviewed-by: Hongjiu Lu 
> Reviewed-by: Bernhard Reutner-Fischer 
> Reviewed-by: Thomas Koenig 
> Reviewed-by: Jakub Jelinek 
> Signed-off-by: Lipeng Zhu 
> ---
>  libgcc/gthr-posix.h   |  60 +++
>  libgfortran/io/async.c|   4 +
>  libgfortran/io/async.h| 151 ++
>  libgfortran/io/io.h   |  15 +-
>  libgfortran/io/transfer.c |   8 +-
>  libgfortran/io/unit.c | 117 +-
>  libgfortran/io/unix.c |  16 +-
>  .../testsuite/libgomp.fortran/rwlock_1.f90|  33 
>  .../testsuite/libgomp.fortran/rwlock_2.f90|  22 +++
>  .../testsuite/libgomp.fortran/rwlock_3.f90|  18 +++
>  10 files changed, 386 insertions(+), 58 deletions(-)
>  create mode 100644 libgomp/testsuite/libgomp.fortran/rwlock_1.f90
>  create mode 100644 libgomp/testsuite/libgomp.fortran/rwlock_2.f90
>  create mode 100644 libgomp/testsuite/libgomp.fortran/rwlock_3.f90
> 
> diff --git 

Re: [PATCH] testsuite, arm: Fix up pr112337.c test

2023-12-01 Thread Richard Earnshaw (lists)
On 01/12/2023 13:45, Christophe Lyon wrote:
> On Fri, 1 Dec 2023 at 13:44, Richard Earnshaw (lists)
>  wrote:
>>
>> On 01/12/2023 11:28, Saurabh Jha wrote:
>>> Hey,
>>>
>>> I introduced this test "gcc/testsuite/gcc.target/arm/mve/pr112337.c" in 
>>> this commit 2365aae84de030bbb006edac18c9314812fc657b before. This had an 
>>> error which I unfortunately missed. This patch fixes that test.
>>>
>>> Did regression testing on arm-none-eabi and found no regressions. Output of 
>>> running gcc/contrib/compare_tests is this:
>>>
>>> """
>>> Tests that now work, but didn't before (2 tests):
>>>
>>> arm-eabi-aem/-marm/-march=armv7-a/-mfpu=vfpv3-d16/-mfloat-abi=softfp: 
>>> gcc.target/arm/mve/pr112337.c (test for excess errors)
>>> arm-eabi-aem/-mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard:
>>>  gcc.target/arm/mve/pr112337.c (test for excess errors)
>>> """
>>>
>>> Ok for trunk? I don't have commit access so could someone please commit on 
>>> my behalf?
>>>
>>> Regards,
>>> Saurabh
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * gcc.target/arm/mve/pr112337.c: Fix the testcase
>>
>>
>> Hmm, could this be related to the changes Christophe made recently to change 
>> the way MVE vector types were set up internally?  If so, this might indicate 
>> an issue that's going to affect real users with existing code.
>>
> 
> My change was only about vector types, here the problem is with a
> pointer to a scalar.
> Anyway, I ran the test with my commit reverted and it still fails in
> the same way, so I think this patch is needed.
> 
> Thanks,
> 
> Christophe
> 
>> Christophe?
>>
>> R.

Ok, thanks for checking.  In that case, Saurabh, your patch is OK, but please 
change 'Fix testcase' to 'Use int32_t instead of int.'

Note that ChangeLog entries end with a full stop.

R.


Re: [PATCH] testsuite, arm: Fix up pr112337.c test

2023-12-01 Thread Richard Earnshaw (lists)
On 01/12/2023 11:28, Saurabh Jha wrote:
> Hey,
> 
> I introduced this test "gcc/testsuite/gcc.target/arm/mve/pr112337.c" in this 
> commit 2365aae84de030bbb006edac18c9314812fc657b before. This had an error 
> which I unfortunately missed. This patch fixes that test.
> 
> Did regression testing on arm-none-eabi and found no regressions. Output of 
> running gcc/contrib/compare_tests is this:
> 
> """
> Tests that now work, but didn't before (2 tests):
> 
> arm-eabi-aem/-marm/-march=armv7-a/-mfpu=vfpv3-d16/-mfloat-abi=softfp: 
> gcc.target/arm/mve/pr112337.c (test for excess errors)
> arm-eabi-aem/-mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard:
>  gcc.target/arm/mve/pr112337.c (test for excess errors)
> """
> 
> Ok for trunk? I don't have commit access so could someone please commit on my 
> behalf?
> 
> Regards,
> Saurabh
> 
> gcc/testsuite/ChangeLog:
> 
>     * gcc.target/arm/mve/pr112337.c: Fix the testcase


Hmm, could this be related to the changes Christophe made recently to change 
the way MVE vector types were set up internally?  If so, this might indicate an 
issue that's going to affect real users with existing code.

Christophe?

R.


Re: [PATCH] AArch64/testsuite: Use non-capturing parentheses with ccmp_1.c

2023-11-22 Thread Richard Earnshaw (lists)
On 22/11/2023 15:21, Maciej W. Rozycki wrote:
> Use non-capturing parentheses for the subexpressions used with 
> `scan-assembler-times', to avoid a quirk with double-counting.
> 
>   gcc/testsuite/
>   * gcc.target/aarch64/ccmp_1.c: Use non-capturing parentheses 
>   with `scan-assembler-times'.

OK

R.

> ---
> Hi,
> 
>  Here's another one.  I realised my original regexp used to grep the tree 
> for `scan-assembler-times' with subexpressions was too strict and with an 
> updated pattern I found this second test case that does regress once the 
> `scan-assembler-times' double-counting quirk has been fixed.
> 
>  As with the ARM change we don't need capturing parentheses here, usually 
> used for back references, so let's just avoid the double-counting quirk 
> altogether and make our matching here work whether the quirk has been 
> fixed or not.
> 
>  Verified for the `aarch64-linux-gnu' target with the quirk fix submitted 
> as  
> and the aarch64.exp subset of the C language test suite.  OK to apply?
> 
>   Maciej
> ---
>  gcc/testsuite/gcc.target/aarch64/ccmp_1.c |4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> gcc-aarch64-test-ccmp_1-non-capturing.diff
> Index: gcc/gcc/testsuite/gcc.target/aarch64/ccmp_1.c
> ===
> --- gcc.orig/gcc/testsuite/gcc.target/aarch64/ccmp_1.c
> +++ gcc/gcc/testsuite/gcc.target/aarch64/ccmp_1.c
> @@ -86,8 +86,8 @@ f13 (int a, int b)
>  /* { dg-final { scan-assembler "cmp\t(.)+35" } } */
>  
>  /* { dg-final { scan-assembler-times "\tcmp\tw\[0-9\]+, 0" 4 } } */
> -/* { dg-final { scan-assembler-times "fcmpe\t(.)+0\\.0" 2 } } */
> -/* { dg-final { scan-assembler-times "fcmp\t(.)+0\\.0" 2 } } */
> +/* { dg-final { scan-assembler-times "fcmpe\t(?:.)+0\\.0" 1 } } */
> +/* { dg-final { scan-assembler-times "fcmp\t(?:.)+0\\.0" 1 } } */
>  
>  /* { dg-final { scan-assembler "adds\t" } } */
>  /* { dg-final { scan-assembler-times "\tccmp\t" 11 } } */



Re: [PATCH 10/11] aarch64: Fix branch-protection error message tests

2023-10-13 Thread Richard Earnshaw (lists)
On 05/09/2023 16:00, Richard Sandiford via Gcc-patches wrote:
> Szabolcs Nagy  writes:
>> Update tests for the new branch-protection parser errors.
>>
>> gcc/testsuite/ChangeLog:
>>
>>  * gcc.target/aarch64/branch-protection-attr.c: Update.
>>  * gcc.target/aarch64/branch-protection-option.c: Update.
> 
> OK, thanks.  (And I agree these are better messages. :))
> 
> I think that's the last of the AArch64-specific ones.  The others
> will need to be reviewed by Kyrill or Richard.
> 
> Richard
> 
>> ---
>>  gcc/testsuite/gcc.target/aarch64/branch-protection-attr.c   | 6 +++---
>>  gcc/testsuite/gcc.target/aarch64/branch-protection-option.c | 2 +-
>>  2 files changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/gcc/testsuite/gcc.target/aarch64/branch-protection-attr.c 
>> b/gcc/testsuite/gcc.target/aarch64/branch-protection-attr.c
>> index 272000c2747..dae2a758a56 100644
>> --- a/gcc/testsuite/gcc.target/aarch64/branch-protection-attr.c
>> +++ b/gcc/testsuite/gcc.target/aarch64/branch-protection-attr.c
>> @@ -4,19 +4,19 @@ void __attribute__ ((target("branch-protection=leaf")))
>>  foo1 ()
>>  {
>>  }
>> -/* { dg-error {invalid protection type 'leaf' in 
>> 'target\("branch-protection="\)' pragma or attribute} "" { target *-*-* } 5 
>> } */
>> +/* { dg-error {invalid argument 'leaf' for 
>> 'target\("branch-protection="\)'} "" { target *-*-* } 5 } */
>>  /* { dg-error {pragma or attribute 'target\("branch-protection=leaf"\)' is 
>> not valid} "" { target *-*-* } 5 } */

'leaf' is really a modifier for the other branch protection strategies; perhaps 
it would be better to describe it as that.

But this brings up another issue/question.  If the compiler has been configured 
with, say, '--enable-branch-protection=standard' or some other variety, is 
there (or do we want) a way to extend that to leaf functions without changing 
the underlying strategy?

>>  
>>  void __attribute__ ((target("branch-protection=none+pac-ret")))
>>  foo2 ()
>>  {
>>  }
>> -/* { dg-error "unexpected 'pac-ret' after 'none'" "" { target *-*-* } 12 } 
>> */
>> +/* { dg-error {argument 'none' can only appear alone in 
>> 'target\("branch-protection="\)'} "" { target *-*-* } 12 } */

Or maybe better still: "branch protection strategies 'none' and 'pac-ret' are 
incompatible".

>>  /* { dg-error {pragma or attribute 
>> 'target\("branch-protection=none\+pac-ret"\)' is not valid} "" { target 
>> *-*-* } 12 } */
>>  
>>  void __attribute__ ((target("branch-protection=")))
>>  foo3 ()
>>  {
>>  }
>> -/* { dg-error {missing argument to 'target\("branch-protection="\)' pragma 
>> or attribute} "" { target *-*-* } 19 } */
>> +/* { dg-error {invalid argument '' for 'target\("branch-protection="\)'} "" 
>> { target *-*-* } 19 } */
>>  /* { dg-error {pragma or attribute 'target\("branch-protection="\)' is not 
>> valid} "" { target *-*-* } 19 } */
>> diff --git a/gcc/testsuite/gcc.target/aarch64/branch-protection-option.c 
>> b/gcc/testsuite/gcc.target/aarch64/branch-protection-option.c
>> index 1b3bf4ee2b8..e2f847a31c4 100644
>> --- a/gcc/testsuite/gcc.target/aarch64/branch-protection-option.c
>> +++ b/gcc/testsuite/gcc.target/aarch64/branch-protection-option.c
>> @@ -1,4 +1,4 @@
>>  /* { dg-do "compile" } */
>>  /* { dg-options "-mbranch-protection=leaf -mbranch-protection=none+pac-ret" 
>> } */
>>  
>> -/* { dg-error "unexpected 'pac-ret' after 'none'"  "" { target *-*-* } 0 } 
>> */
>> +/* { dg-error "argument 'none' can only appear alone in 
>> '-mbranch-protection='" "" { target *-*-* } 0 } */

But this is all a matter of taste.

However, this patch should be merged with the patch that changes the error 
messages.  Or has that already gone in?

R


Re: Principles of the C99 testsuite conversion

2023-10-11 Thread Richard Earnshaw (lists)
On 11/10/2023 14:56, Jeff Law wrote:
> 
> 
> On 10/11/23 04:39, Florian Weimer wrote:
>> I've started to look at what it is required to convert the testsuite to
>> C99 (without implicit ints, without implicit function declarations, and
>> a few other legacy language features).
> I bet those older tests originating from c-torture will be a bit painful.  
> Torbjorn liked having them minimized, to the point of squashing out nearly 
> everything he considered extraneous.  I'd bet many of those older tests are 
> going to need lots of changes.
> 

I've often wondered just how much of the original c-torture suite is still 
relevant today.  Most of those tests were written at a  time when the compiler 
expanded tree directly into RTL and I suspect that today the tests never get 
even close to tickling the original bug they were intended to validate.

R.



Re: [PATCH 6/6] aarch64: Add front-end argument type checking for target builtins

2023-10-10 Thread Richard Earnshaw (lists)
On 09/10/2023 14:12, Victor Do Nascimento wrote:
> 
> 
> On 10/7/23 12:53, Richard Sandiford wrote:
>> Richard Earnshaw  writes:
>>> On 03/10/2023 16:18, Victor Do Nascimento wrote:
 In implementing the ACLE read/write system register builtins it was
 observed that leaving argument type checking to be done at expand-time
 meant that poorly-formed function calls were being "fixed" by certain
 optimization passes, meaning bad code wasn't being properly picked up
 in checking.

 Example:

     const char *regname = "amcgcr_el0";
     long long a = __builtin_aarch64_rsr64 (regname);

 is reduced by the ccp1 pass to

     long long a = __builtin_aarch64_rsr64 ("amcgcr_el0");

 As these functions require an argument of STRING_CST type, there needs
 to be a check carried out by the front-end capable of picking this up.

 The introduced `check_general_builtin_call' function will be called by
 the TARGET_CHECK_BUILTIN_CALL hook whenever a call to a builtin
 belonging to the AARCH64_BUILTIN_GENERAL category is encountered,
 carrying out any appropriate checks associated with a particular
 builtin function code.
>>>
>>> Doesn't this prevent reasonable wrapping of the __builtin... names with
>>> something more palatable?  Eg:
>>>
>>> static inline __attribute__(("always_inline")) long long get_sysreg_ll
>>> (const char *regname)
>>> {
>>>     return __builtin_aarch64_rsr64 (regname);
>>> }
>>>
>>> ...
>>>     long long x = get_sysreg_ll("amcgcr_el0");
>>> ...
>>
>> I think it's case of picking your poison.  If we didn't do this,
>> and only checked later, then it's unlikely that GCC and Clang would
>> be consistent about when a constant gets folded soon enough.
>>
>> But yeah, it means that the above would need to be a macro in C.
>> Enlightened souls using C++ could instead do:
>>
>>    template
>>    long long get_sysreg_ll()
>>    {
>>  return __builtin_aarch64_rsr64(regname);
>>    }
>>
>>    ... get_sysreg_ll<"amcgcr_el0">() ...
>>
>> Or at least I hope so.  Might be nice to have a test for this.
>>
>> Thanks,
>> Richard
> 
> As Richard Earnshaw mentioned, this does break the use of `static inline 
> __attribute__(("always_inline"))', something I had found out in my testing.  
> My chosen implementation was indeed, to quote Richard Sandiford, a case of 
> "picking your poison" to have things line up with Clang and behaving 
> consistently across optimization levels.
> 
> Relaxing the the use of `TARGET_CHECK_BUILTIN_CALL' meant optimizations were 
> letting too many things through. Example:
> 
> const char *regname = "amcgcr_el0";
> long long a = __builtin_aarch64_rsr64 (regname);
> 
> gets folded to
> 
> long long a = __builtin_aarch64_rsr64 ("amcgcr_el0");
> 
> and compilation passes at -01 even though it fails at -O0.
> 
> I had, however, not given any thought to the use of a template as a valid C++ 
> alternative.
> 
> I will evaluate the use of templates and add tests accordingly.

This just seems inconsistent with all the builtins we already have that require 
literal constants for parameters.  For example (to pick just one of many), 
vshr_n_q8(), where the second parameter must be a literal value.  In practice 
we accept anything that resolves to a compile-time constant integer expression 
and rely on that to avoid having to have hundreds of macros binding the ACLE 
names to the underlying builtin equivalents.

Furthermore, I don't really see the problem with the examples you cite.  It's 
not as though the user can change these at run-time and expect to get a 
different register.

R.

> 
> Cheers,
> Victor



Re: [PATCH v2] Add a GCC Security policy

2023-10-05 Thread Richard Earnshaw (lists)
On 28/09/2023 12:55, Siddhesh Poyarekar wrote:
> +Security features implemented in GCC
> +
> +
[...]
> +
> +Similarly, GCC may transform code in a way that the correctness of
> +the expressed algorithm is preserved, but supplementary properties
> +that are not specifically expressible in a high-level language
> +are not preserved. Examples of such supplementary properties
> +include absence of sensitive data in the program's address space
> +after an attempt to wipe it, or data-independent timing of code.
> +When the source code attempts to express such properties, failure
> +to preserve them in resulting machine code is not a security issue
> +in GCC.

I think it would be worth mentioning here that compilers interpret source code 
according to an abstract machine defined by the source language.  Properties of 
a program that cannot be described in the abstract machine may not be 
translated into the generated machine code.

This is, fundamentally, describing the 'as if' rule.

R.


Re: [PATCH] AArch64: Remove BTI from outline atomics

2023-09-26 Thread Richard Earnshaw (lists)
On 26/09/2023 14:46, Wilco Dijkstra wrote:
> 
> The outline atomic functions have hidden visibility and can only be called
> directly.  Therefore we can remove the BTI at function entry.  This improves
> security by reducing the number of indirect entry points in a binary.
> The BTI markings on the objects are still emitted.

Please can you add a comment to that effect in the source code.  OK with that 
change.

R.

> 
> Passes regress, OK for commit?
> 
> libgcc/ChangeLog:
>     * config/aarch64/lse.S (BTI_C): Remove define.
> 
> ---
> 
> diff --git a/libgcc/config/aarch64/lse.S b/libgcc/config/aarch64/lse.S
> index 
> ba05047ff02b6fc5752235bffa924fc4a2f48c04..dbfb83fb09083641bf06c50b631a5f27bdf61b80
>  100644
> --- a/libgcc/config/aarch64/lse.S
> +++ b/libgcc/config/aarch64/lse.S
> @@ -163,8 +163,6 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
> If not, see
>  #define tmp3    14
>  #define tmp4    13
>  
> -#define BTI_C  hint    34
> -
>  /* Start and end a function.  */
>  .macro  STARTFN name
>  .text
> @@ -174,7 +172,6 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
> If not, see
>  .type   \name, %function
>  .cfi_startproc
>  \name:
> -   BTI_C
>  .endm
>  
>  .macro  ENDFN name



Re: [PATCH] AArch64: Fix strict-align cpymem/setmem [PR103100]

2023-09-20 Thread Richard Earnshaw (lists)
On 20/09/2023 14:50, Wilco Dijkstra wrote:
> 
> The cpymemdi/setmemdi implementation doesn't fully support strict alignment.
> Block the expansion if the alignment is less than 16 with STRICT_ALIGNMENT.
> Clean up the condition when to use MOPS.
> 
> Passes regress/bootstrap, OK for commit?
> 
> gcc/ChangeLog/
> PR target/103100
> * config/aarch64/aarch64.md (cpymemdi): Remove pattern condition.

Shouldn't this be a separate patch?  It's not immediately obvious that this is 
a necessary part of this change.

> (setmemdi): Likewise.
> * config/aarch64/aarch64.cc (aarch64_expand_cpymem): Support
> strict-align.  Cleanup condition for using MOPS.
> (aarch64_expand_setmem): Likewise.
> 
> ---
> 
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 
> dd6874d13a75f20d10a244578afc355b25c73da2..8f3bfb91c0f4ec43f37fe9289a66092a29a47e4d
>  100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -25261,27 +25261,23 @@ aarch64_expand_cpymem (rtx *operands)
>int mode_bits;
>rtx dst = operands[0];
>rtx src = operands[1];
> +  unsigned align = INTVAL (operands[3]);

This should read the value with UINTVAL.  Given the useful range of the 
alignment, it should be OK that we're not using unsigned HWI.

>rtx base;
>machine_mode cur_mode = BLKmode;
> +  bool size_p = optimize_function_for_size_p (cfun);
>  
> -  /* Variable-sized memcpy can go through the MOPS expansion if available.  
> */
> -  if (!CONST_INT_P (operands[2]))
> +  /* Variable-sized or strict-align copies may use the MOPS expansion.  */
> +  if (!CONST_INT_P (operands[2]) || (STRICT_ALIGNMENT && align < 16))
>  return aarch64_expand_cpymem_mops (operands);

So what about align=4 and copying, for example, 8 or 12 bytes; wouldn't we want 
a sequence of LDR/STR in that case?  Doesn't this fall back to MOPS too eagerly?


>  
>unsigned HOST_WIDE_INT size = INTVAL (operands[2]);
>  
> -  /* Try to inline up to 256 bytes or use the MOPS threshold if available.  
> */
> -  unsigned HOST_WIDE_INT max_copy_size
> -= TARGET_MOPS ? aarch64_mops_memcpy_size_threshold : 256;
> -
> -  bool size_p = optimize_function_for_size_p (cfun);
> +  /* Try to inline up to 256 bytes.  */
> +  unsigned max_copy_size = 256;
> +  unsigned max_mops_size = aarch64_mops_memcpy_size_threshold;

I find this name slightly confusing.  Surely it's min_mops_size (since above 
that we want to use MOPS rather than inlined loads/stores).  But why not just 
use aarch64_mops_memcpy_size_threshold directly in the one place it's used?

>  
> -  /* Large constant-sized cpymem should go through MOPS when possible.
> - It should be a win even for size optimization in the general case.
> - For speed optimization the choice between MOPS and the SIMD sequence
> - depends on the size of the copy, rather than number of instructions,
> - alignment etc.  */
> -  if (size > max_copy_size)
> +  /* Large copies use MOPS when available or a library call.  */
> +  if (size > max_copy_size || (TARGET_MOPS && size > max_mops_size))
>  return aarch64_expand_cpymem_mops (operands);
>  
>int copy_bits = 256;
> @@ -25445,12 +25441,13 @@ aarch64_expand_setmem (rtx *operands)

Similar comments apply to this code as well.

>unsigned HOST_WIDE_INT len;
>rtx dst = operands[0];
>rtx val = operands[2], src;
> +  unsigned align = INTVAL (operands[3]);
>rtx base;
>machine_mode cur_mode = BLKmode, next_mode;
>  
> -  /* If we don't have SIMD registers or the size is variable use the MOPS
> - inlined sequence if possible.  */
> -  if (!CONST_INT_P (operands[1]) || !TARGET_SIMD)
> +  /* Variable-sized or strict-align memset may use the MOPS expansion.  */
> +  if (!CONST_INT_P (operands[1]) || !TARGET_SIMD
> +  || (STRICT_ALIGNMENT && align < 16))
>  return aarch64_expand_setmem_mops (operands);
>  
>bool size_p = optimize_function_for_size_p (cfun);
> @@ -25458,10 +25455,13 @@ aarch64_expand_setmem (rtx *operands)

And here.

>/* Default the maximum to 256-bytes when considering only libcall vs
>   SIMD broadcast sequence.  */
>unsigned max_set_size = 256;
> +  unsigned max_mops_size = aarch64_mops_memset_size_threshold;
>  
>len = INTVAL (operands[1]);
> -  if (len > max_set_size && !TARGET_MOPS)
> -return false;
> +
> +  /* Large memset uses MOPS when available or a library call.  */
> +  if (len > max_set_size || (TARGET_MOPS && len > max_mops_size))
> +return aarch64_expand_setmem_mops (operands);
>  
>int cst_val = !!(CONST_INT_P (val) && (INTVAL (val) != 0));
>/* The MOPS sequence takes:
> @@ -25474,12 +25474,6 @@ aarch64_expand_setmem (rtx *operands)
>   the arguments + 1 for the call.  */
>unsigned libcall_cost = 4;
>  
> -  /* Upper bound check.  For large constant-sized setmem use the MOPS 
> sequence
> - when available.  */
> -  if (TARGET_MOPS
> -  && len >= 

Re: [PATCH 2/2] libstdc++: Add dg-require-thread-fence in several tests

2023-09-11 Thread Richard Earnshaw (lists) via Gcc-patches
On 11/09/2023 16:22, Jonathan Wakely via Gcc-patches wrote:
> On Mon, 11 Sept 2023 at 14:57, Christophe Lyon
>  wrote:
>>
>>
>>
>> On Mon, 11 Sept 2023 at 15:12, Jonathan Wakely  wrote:
>>>
>>> On Mon, 11 Sept 2023 at 13:36, Christophe Lyon
>>>  wrote:



 On Mon, 11 Sept 2023 at 12:59, Jonathan Wakely  wrote:
>
> On Sun, 10 Sept 2023 at 20:31, Christophe Lyon
>  wrote:
>>
>> Some targets like arm-eabi with newlib and default settings rely on
>> __sync_synchronize() to ensure synchronization.  Newlib does not
>> implement it by default, to make users aware they have to take special
>> care.
>>
>> This makes a few tests fail to link.
>
> Does this mean those features are unusable on the target, or just that
> users need to provide their own __sync_synchronize to use them?


 IIUC the user is expected to provide them.
 Looks like we discussed this in the past :-)
 In  https://gcc.gnu.org/legacy-ml/gcc-patches/2016-10/msg01632.html,
 see the pointer to Ramana's comment: 
 https://gcc.gnu.org/ml/gcc-patches/2015-05/msg02751.html
>>>
>>> Oh yes, thanks for the reminder!
>>>

 The default arch for arm-eabi is armv4t which is very old.
 When running the testsuite with something more recent (either as default 
 by configuring GCC --with-arch=XXX or by forcing -march/-mcpu via 
 dejagnu's target-board), the compiler generates barrier instructions and 
 there are no such errors.
>>>
>>> Ah yes, that's fine then.
>>>
 For instance, here is a log with the defaults:
 https://git.linaro.org/toolchain/ci/base-artifacts/tcwg_gnu_embed_check_gcc/master-arm_eabi.git/tree/00-sumfiles?h=linaro-local/ci/tcwg_gnu_embed_check_gcc/master-arm_eabi
 and a log when we target cortex-m0 which is still a very small cpu but has 
 barriers:
 https://git.linaro.org/toolchain/ci/base-artifacts/tcwg_gnu_embed_check_gcc/master-thumb_m0_eabi.git/tree/00-sumfiles?h=linaro-local/ci/tcwg_gnu_embed_check_gcc/master-thumb_m0_eabi

 I somehow wanted to get rid of such errors with the default 
 configuration
>>>
>>> Yep, that makes sense, and we'll still be testing them for newer
>>> arches on the target, so it's not completely disabling those parts of
>>> the testsuite.
>>>
>>> But I'm still curious why some of those tests need this change. I
>>> think the ones I noted below are probably failing for some other
>>> reasons.
>>>
>> Just looked at  23_containers/span/back_assert_neg.cc, the linker says it 
>> needs
>> arm-eabi/libstdc++-v3/src/.libs/libstdc++.a(debug.o) to resolve
>> ./back_assert_neg-back_assert_neg.o (std::__glibcxx_assert_fail(char const*, 
>> int, char const*, char const*))
>> and indeed debug.o has a reference to __sync_synchronize
> 
> Aha, that's just because I put __glibcxx_assert_fail in debug.o, but
> there are no dependencies on anything else in that file, including the
> _M_detach member function that uses atomics.
> 
> This would also be solved by -Wl,--gc-sections :-)
> 
> I think it would be better to move __glibcxx_assert_fail to a new
> file, so that it doesn't make every assertion unnecessarily depend on
> __sync_synchronize. I'll do that now.
> 
> We could also make the atomics in debug.o conditional, so that debug
> mode doesn't depend on __sync_synchronize for single-threaded targets.
> Does the arm4t arch have pthreads support in newlib?  I didn't bother
> making the use of atomics conditional, because performance is not
> really a priority for debug mode bookkeeping. But the problem here
> isn't just a slight performance overhead of atomics, it's that they
> aren't even supported for arm4t.

I might be wrong, but I don't think newlib has any support for pthreads.

R.
> 



Re: [PATCH] rtl: Forward declare rtx_code

2023-08-23 Thread Richard Earnshaw (lists) via Gcc-patches
On 23/08/2023 16:49, Richard Sandiford via Gcc-patches wrote:
> Richard Earnshaw via Gcc-patches  writes:
>> Now that we require C++ 11, we can safely forward declare rtx_code
>> so that we can use it in target hooks.
>>
>> gcc/ChangeLog
>>  * coretypes.h (rtx_code): Add forward declaration.
>>  * rtl.h (rtx_code): Make compatible with forward declaration.
>> ---
>>  gcc/coretypes.h | 4 
>>  gcc/rtl.h   | 2 +-
>>  2 files changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/gcc/coretypes.h b/gcc/coretypes.h
>> index ca8837cef67..51e9ce0 100644
>> --- a/gcc/coretypes.h
>> +++ b/gcc/coretypes.h
>> @@ -100,6 +100,10 @@ struct gimple;
>>  typedef gimple *gimple_seq;
>>  struct gimple_stmt_iterator;
>>  
>> +/* Forward declare rtx_code, so that we can use it in target hooks without
>> +   needing to pull in rtl.h.  */
>> +enum rtx_code : unsigned;
>> +
>>  /* Forward decls for leaf gimple subclasses (for individual gimple codes).
>> Keep this in the same order as the corresponding codes in gimple.def.  */
>>  
>> diff --git a/gcc/rtl.h b/gcc/rtl.h
>> index e1c51156f90..0e9491b89b4 100644
>> --- a/gcc/rtl.h
>> +++ b/gcc/rtl.h
>> @@ -45,7 +45,7 @@ class predefined_function_abi;
>>  /* Register Transfer Language EXPRESSIONS CODES */
>>  
>>  #define RTX_CODEenum rtx_code
>> -enum rtx_code  {
>> +enum rtx_code : unsigned {
>>  
>>  #define DEF_RTL_EXPR(ENUM, NAME, FORMAT, CLASS)   ENUM ,
>>  #include "rtl.def"  /* rtl expressions are documented here */
> 
> Given:
> 
>   #define RTX_CODE_BITSIZE 8
> 
> there might be some value in making it uint8_t rather than unsigned.
> Preapproved if you agree.
> 
> But the patch as posted is a strict improvement over the status quo,
> so it's also OK as-is.
> 
> Thanks,
> Richard

I did think about that, but there were two reasons for not doing so:
- it presumes we would never want more than 8 bits for rtx_code (well, not 
quite, 
but it would make it more work to change this).
- it would probably lead to more zero-extension operations happening in the 
compiler

I'll put my patch in as is.

R.


Re: [PATCH] aarch64: fix format specifier

2023-08-21 Thread Richard Earnshaw (lists) via Gcc-patches

On 18/08/2023 17:37, FX Coudert via Gcc-patches wrote:

A rather trivial fix for fprintf() specifier of a HOST_WIDE_INT value.
Tested on aarch64-apple-darwin. OK to commit?

FX



OK.

R.


Re: [RFC] GCC Security policy

2023-08-09 Thread Richard Earnshaw (lists) via Gcc-patches

On 08/08/2023 20:39, Carlos O'Donell via Gcc-patches wrote:

On 8/8/23 13:46, David Edelsohn wrote:

I believe that upstream projects for components that are imported
into GCC should be responsible for their security policy, including
libgo, gofrontend, libsanitizer (other than local patches), zlib,
libtool, libphobos, libcody, libffi, eventually Rust libcore, etc.


I agree completely.

We can reference the upstream and direct people to follow upstream security
policy for these bundled components.

Any other policy risks having conflicting guidance between the projects,
which is not useful for security policy.

There might be exceptions to this rule, particularly when the downstream
wants to accept particular risks while upstream does not; but none of these
components are in that case IMO.



I agree with that, but with one caveat.  Our policy should state what we 
 do once upstream has addressed the issue.


R.


Re: [RFC] GCC Security policy

2023-08-08 Thread Richard Earnshaw (lists) via Gcc-patches

On 08/08/2023 15:40, Siddhesh Poyarekar wrote:

On 2023-08-08 10:37, Jakub Jelinek wrote:

On Tue, Aug 08, 2023 at 10:30:10AM -0400, Siddhesh Poyarekar wrote:

Do you have a suggestion for the language to address libgcc, libstdc++,
etc. and libiberty, libbacktrace, etc.?


I'll work on this a bit and share a draft.


BTW, I think we should perhaps differentiate between production ready
libraries (e.g. libgcc, libstdc++, libgomp, libatomic, libgfortran, 
libquadmath,
libssp) vs. e.g. the sanitizer libraries which are meant for debugging 
and


Agreed, that's why I need some time to sort all of the libraries gcc 
builds to categorize them into various levels of support in terms of 
safety re. untrusted input.


Thanks,
Sid


Related to this, our coding standards should really reflect what we 
consider good practice these days.  eg.  There are many library APIs 
around that were once considered acceptable that frankly we would be 
better uninventing.


R.


Re: [PATCH] aarch64: Fix warnings during libgcc build

2023-07-11 Thread Richard Earnshaw (lists) via Gcc-patches

On 11/07/2023 15:54, Richard Earnshaw (lists) via Gcc-patches wrote:

On 11/07/2023 10:37, Florian Weimer via Gcc-patches wrote:

libgcc/

* config/aarch64/aarch64-unwind.h (aarch64_cie_signed_with_b_key):
Add missing const qualifier.  Cast from const unsigned char *
to const char *.  Use __builtin_strchr to avoid an implicit
function declaration.
* config/aarch64/linux-unwind.h (aarch64_fallback_frame_state):
Add missing cast.

---
diff --git a/libgcc/config/aarch64/linux-unwind.h 
b/libgcc/config/aarch64/linux-unwind.h

index 00eba866049..93da7a9537d 100644
--- a/libgcc/config/aarch64/linux-unwind.h
+++ b/libgcc/config/aarch64/linux-unwind.h
@@ -77,7 +77,7 @@ aarch64_fallback_frame_state (struct _Unwind_Context 
*context,

  }
    rt_ = context->cfa;
-  sc = _->uc.uc_mcontext;
+  sc = (struct sigcontext *) _->uc.uc_mcontext;
  /* This define duplicates the definition in aarch64.md */
  #define SP_REGNUM 31




This looks somewhat dubious.  I'm not particularly familiar with the 
kernel headers, but a quick look suggests an mcontext_t is nothing like 
a sigcontext_t.  So isn't the cast just papering over some more 
fundamental problem?


R.


Sorry, I was looking at the wrong set of headers.  It looks like these 
have to match. But in that case, I think we should have a comment about 
that here to explain the suspicious cast.


R.


Re: [PATCH] aarch64: Fix warnings during libgcc build

2023-07-11 Thread Richard Earnshaw (lists) via Gcc-patches

On 11/07/2023 10:37, Florian Weimer via Gcc-patches wrote:

libgcc/

* config/aarch64/aarch64-unwind.h (aarch64_cie_signed_with_b_key):
Add missing const qualifier.  Cast from const unsigned char *
to const char *.  Use __builtin_strchr to avoid an implicit
function declaration.
* config/aarch64/linux-unwind.h (aarch64_fallback_frame_state):
Add missing cast.

---
diff --git a/libgcc/config/aarch64/linux-unwind.h 
b/libgcc/config/aarch64/linux-unwind.h
index 00eba866049..93da7a9537d 100644
--- a/libgcc/config/aarch64/linux-unwind.h
+++ b/libgcc/config/aarch64/linux-unwind.h
@@ -77,7 +77,7 @@ aarch64_fallback_frame_state (struct _Unwind_Context *context,
  }
  
rt_ = context->cfa;

-  sc = _->uc.uc_mcontext;
+  sc = (struct sigcontext *) _->uc.uc_mcontext;
  
  /* This define duplicates the definition in aarch64.md */

  #define SP_REGNUM 31




This looks somewhat dubious.  I'm not particularly familiar with the 
kernel headers, but a quick look suggests an mcontext_t is nothing like 
a sigcontext_t.  So isn't the cast just papering over some more 
fundamental problem?


R.


Re: [PATCH 2/2] [testsuite, arm]: Make mve_fp_fpu[12].c accept single or double precision FPU

2023-06-28 Thread Richard Earnshaw (lists) via Gcc-patches

On 28/06/2023 10:26, Christophe Lyon via Gcc-patches wrote:

This tests currently expect a directive containing .fpu fpv5-sp-d16
and thus may fail if the test is executed for instance with
-march=armv8.1-m.main+mve.fp+fp.dp

This patch accepts either fpv5-sp-d16 or fpv5-d16 to avoid the failure.

2023-06-28  Christophe Lyon  

gcc/testsuite/
* gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c: Fix .fpu
scan-assembler.
* gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c: Likewise.
---
  gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c | 2 +-
  gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
index e375327fb97..8358a616bb5 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
@@ -12,4 +12,4 @@ foo1 (int8x16_t value)
return b;
  }
  
-/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" }  } */

+/* { dg-final { scan-assembler "\.fpu fpv5(-sp|)-d16" }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
index 1fca1100cf0..5dd2feefc35 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
@@ -12,4 +12,4 @@ foo1 (int8x16_t value)
return b;
  }
  
-/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" }  } */

+/* { dg-final { scan-assembler "\.fpu fpv5(-sp|)-d16" }  } */


OK.


Re: [PATCH 1/2] [testsuite,arm]: Make nomve_fp_1.c require arm_fp

2023-06-28 Thread Richard Earnshaw (lists) via Gcc-patches

On 28/06/2023 10:26, Christophe Lyon via Gcc-patches wrote:

If GCC is configured with the default (soft) -mfloat-abi, and we don't
override the target_board test flags appropriately,
gcc.target/arm/mve/general-c/nomve_fp_1.c fails for lack of
-mfloat-abi=softfp or -mfloat-abi=hard, because it doesn't use
dg-add-options arm_v8_1m_mve (on purpose, see comment in the test).

Require and use the options needed for arm_fp to fix this problem.

2023-06-28  Christophe Lyon  

gcc/testsuite/
* gcc.target/arm/mve/general-c/nomve_fp_1.c: Require arm_fp.
---
  gcc/testsuite/gcc.target/arm/mve/general-c/nomve_fp_1.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/gcc/testsuite/gcc.target/arm/mve/general-c/nomve_fp_1.c 
b/gcc/testsuite/gcc.target/arm/mve/general-c/nomve_fp_1.c
index 21c2af16a61..c9d279ead68 100644
--- a/gcc/testsuite/gcc.target/arm/mve/general-c/nomve_fp_1.c
+++ b/gcc/testsuite/gcc.target/arm/mve/general-c/nomve_fp_1.c
@@ -1,9 +1,11 @@
  /* { dg-do compile } */
  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-require-effective-target arm_fp_ok } */
  /* Do not use dg-add-options arm_v8_1m_mve, because this might expand to "",
 which could imply mve+fp depending on the user settings. We want to make
 sure the '+fp' extension is not enabled.  */
  /* { dg-options "-mfpu=auto -march=armv8.1-m.main+mve" } */
+/* { dg-add-options arm_fp } */
  
  #include 
  


OK.


Re: [PATCH v5] MIPS: Add speculation_barrier support

2023-06-08 Thread Richard Earnshaw (lists) via Gcc-patches



On 01/06/2023 05:26, YunQiang Su wrote:

speculation_barrier for MIPS needs sync+jr.hb (r2+),
so we implement __speculation_barrier in libgcc, like arm32 does.

gcc/ChangeLog:
* config/mips/mips-protos.h (mips_emit_speculation_barrier): New
 prototype.
* config/mips/mips.cc (speculation_barrier_libfunc): New static
 variable.
(mips_init_libfuncs): Initialize it.
(mips_emit_speculation_barrier): New function.
* config/mips/mips.md (speculation_barrier): Call
 mips_emit_speculation_barrier.

libgcc/ChangeLog:
* config/mips/lib1funcs.S: New file.
define __speculation_barrier and include mips16.S.
* config/mips/t-mips: define LIB1ASMSRC as mips/lib1funcs.S.
define LIB1ASMFUNCS as _speculation_barrier.
set version info for __speculation_barrier.
* config/mips/libgcc-mips.ver: New file.
* config/mips/t-mips16: don't define LIB1ASMSRC as mips16.S
included in lib1funcs.S now.
---


Please remember to cite PR86793 when committing this fix.

R.


  gcc/config/mips/mips-protos.h  |  2 +
  gcc/config/mips/mips.cc| 12 ++
  gcc/config/mips/mips.md| 12 ++
  libgcc/config/mips/lib1funcs.S | 65 ++
  libgcc/config/mips/libgcc-mips.ver | 21 ++
  libgcc/config/mips/t-mips  |  7 
  libgcc/config/mips/t-mips16|  3 +-
  7 files changed, 120 insertions(+), 2 deletions(-)
  create mode 100644 libgcc/config/mips/lib1funcs.S
  create mode 100644 libgcc/config/mips/libgcc-mips.ver

diff --git a/gcc/config/mips/mips-protos.h b/gcc/config/mips/mips-protos.h
index 20483469105..da7902c235b 100644
--- a/gcc/config/mips/mips-protos.h
+++ b/gcc/config/mips/mips-protos.h
@@ -388,4 +388,6 @@ extern void mips_register_frame_header_opt (void);
  extern void mips_expand_vec_cond_expr (machine_mode, machine_mode, rtx *);
  extern void mips_expand_vec_cmp_expr (rtx *);
  
+extern void mips_emit_speculation_barrier_function (void);

+
  #endif /* ! GCC_MIPS_PROTOS_H */
diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index ca491b981a3..c1d1691306e 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -13611,6 +13611,9 @@ mips_autovectorize_vector_modes (vector_modes *modes, 
bool)
return 0;
  }
  
+

+static GTY (()) rtx speculation_barrier_libfunc;
+
  /* Implement TARGET_INIT_LIBFUNCS.  */
  
  static void

@@ -13680,6 +13683,7 @@ mips_init_libfuncs (void)
synchronize_libfunc = init_one_libfunc ("__sync_synchronize");
init_sync_libfuncs (UNITS_PER_WORD);
  }
+  speculation_barrier_libfunc = init_one_libfunc ("__speculation_barrier");
  }
  
  /* Build up a multi-insn sequence that loads label TARGET into $AT.  */

@@ -19092,6 +19096,14 @@ mips_avoid_hazard (rtx_insn *after, rtx_insn *insn, 
int *hilo_delay,
}
  }
  
+/* Emit a speculation barrier.

+   JR.HB is needed, so we put speculation_barrier_libfunc in libgcc.  */
+void
+mips_emit_speculation_barrier_function ()
+{
+  emit_library_call (speculation_barrier_libfunc, LCT_NORMAL, VOIDmode);
+}
+
  /* A SEQUENCE is breakable iff the branch inside it has a compact form
 and the target has compact branches.  */
  
diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md

index ac1d77afc7d..5d04ac566dd 100644
--- a/gcc/config/mips/mips.md
+++ b/gcc/config/mips/mips.md
@@ -160,6 +160,8 @@
;; The `.insn' pseudo-op.
UNSPEC_INSN_PSEUDO
UNSPEC_JRHB
+
+  VUNSPEC_SPECULATION_BARRIER
  ])
  
  (define_constants

@@ -7455,6 +7457,16 @@
mips_expand_conditional_move (operands);
DONE;
  })
+
+(define_expand "speculation_barrier"
+  [(unspec_volatile [(const_int 0)] VUNSPEC_SPECULATION_BARRIER)]
+  ""
+  "
+  mips_emit_speculation_barrier_function ();
+  DONE;
+  "
+)
+
  
  ;;
  ;;  
diff --git a/libgcc/config/mips/lib1funcs.S b/libgcc/config/mips/lib1funcs.S
new file mode 100644
index 000..97a3655e8ab
--- /dev/null
+++ b/libgcc/config/mips/lib1funcs.S
@@ -0,0 +1,65 @@
+/* Copyright (C) 2023 Free Software Foundation, Inc.
+
+This file is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 3, or (at your option) any
+later version.
+
+This file is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, 

Re: [PATCH][GCC][AArch64] convert some patterns to new MD syntax

2023-06-08 Thread Richard Earnshaw (lists) via Gcc-patches

On 08/06/2023 11:00, Tamar Christina via Gcc-patches wrote:

Hi All,

This converts some patterns in the AArch64 backend to use the new
compact syntax.

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master?

gcc/ChangeLog:

* config/aarch64/aarch64.md (arches): Add nosimd.
(*mov_aarch64, *movsi_aarch64, *movdi_aarch64): Rewrite to
compact syntax.

Thanks,
Tamar


A few nits but ok apart from that:



--- inline copy of patch ---

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 
8b8951d7b14aa1a8858fdc24bf6f9dd3d927d5ea..601173338a9068f7694867c8e6e78f9b10f32a17
 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -366,7 +366,7 @@ (define_constants
  ;; As a convenience, "fp_q" means "fp" + the ability to move between
  ;; Q registers and is equivalent to "simd".
  
-(define_enum "arches" [ any rcpc8_4 fp fp_q simd sve fp16])

+(define_enum "arches" [ any rcpc8_4 fp fp_q simd nosimd sve fp16])
  
  (define_enum_attr "arch" "arches" (const_string "any"))
  
@@ -397,6 +397,9 @@ (define_attr "arch_enabled" "no,yes"

(and (eq_attr "arch" "fp_q, simd")
 (match_test "TARGET_SIMD"))
  
+	(and (eq_attr "arch" "nosimd")

+(match_test "!TARGET_SIMD"))
+
(and (eq_attr "arch" "fp16")
 (match_test "TARGET_FP_F16INST"))
  
@@ -1206,44 +1209,27 @@ (define_expand "mov"

  )
  
  (define_insn "*mov_aarch64"

-  [(set (match_operand:SHORT 0 "nonimmediate_operand" "=r,r,w,r  ,r,w, 
m,m,r,w,w")
-   (match_operand:SHORT 1 "aarch64_mov_operand"  " 
r,M,D,Usv,m,m,rZ,w,w,rZ,w"))]
+  [(set (match_operand:SHORT 0 "nonimmediate_operand")
+   (match_operand:SHORT 1 "aarch64_mov_operand"))]
"(register_operand (operands[0], mode)
  || aarch64_reg_or_zero (operands[1], mode))"
-{
-   switch (which_alternative)
- {
- case 0:
-   return "mov\t%w0, %w1";
- case 1:
-   return "mov\t%w0, %1";
- case 2:
-   return aarch64_output_scalar_simd_mov_immediate (operands[1],
-   mode);
- case 3:
-   return aarch64_output_sve_cnt_immediate (\"cnt\", \"%x0\", operands[1]);
- case 4:
-   return "ldr\t%w0, %1";
- case 5:
-   return "ldr\t%0, %1";
- case 6:
-   return "str\t%w1, %0";
- case 7:
-   return "str\t%1, %0";
- case 8:
-   return TARGET_SIMD ? "umov\t%w0, %1.[0]" : "fmov\t%w0, %s1";
- case 9:
-   return TARGET_SIMD ? "dup\t%0., %w1" : "fmov\t%s0, %w1";
- case 10:
-   return TARGET_SIMD ? "dup\t%0, %1.[0]" : "fmov\t%s0, %s1";
- default:
-   gcc_unreachable ();
- }
-}
-  ;; The "mov_imm" type for CNT is just a placeholder.
-  [(set_attr "type" "mov_reg,mov_imm,neon_move,mov_imm,load_4,load_4,store_4,
-store_4,neon_to_gp,neon_from_gp,neon_dup")
-   (set_attr "arch" "*,*,simd,sve,*,*,*,*,*,*,*")]
+  {@ [cons: =0, 1; attrs: type, arch]
+ [r , r; mov_reg, * ] mov\t%w0, %w1

  ^
This space seems redundant as all alternatives have a single letter for 
the first constraint.  Perhaps this is a hang-over from when the first 
alternative had '=r'?




+ [r , M; mov_imm, * ] mov\t%w0, %1
+ [w , D; neon_move  , simd  ] << 
aarch64_output_scalar_simd_mov_immediate (operands[1], mode);
+ /* The "mov_imm" type for CNT is just a placeholder.  */
+ [r , Usv  ; mov_imm, sve   ] << aarch64_output_sve_cnt_immediate ("cnt", 
"%x0", operands[1]);
+ [r , m; load_4 , * ] ldr\t%w0, %1
+ [w , m; load_4 , * ] ldr\t%0, %1
+ [m , rZ   ; store_4, * ] str\\t%w1, %0


I'd write "rZ" as "r Z" to make it clear that the constraints are not a 
multi-letter constraint.



+ [m , w; store_4, * ] str\t%1, %0
+ [r , w; neon_to_gp  , simd  ] umov\t%w0, %1.[0]
+ [r , w; neon_to_gp  , nosimd] fmov\t%w0, %s1 /*foo */
+ [w , rZ   ; neon_from_gp, simd  ] dup\t%0., %w1
+ [w , rZ   ; neon_from_gp, nosimd] fmov\t%s0, %w1
+ [w , w; neon_dup   , simd  ] dup\t%0, %1.[0]
+ [w , w; neon_dup   , nosimd] fmov\t%s0, %s1
+  }
  )
  
  (define_expand "mov"

@@ -1280,79 +1266,71 @@ (define_expand "mov"
  )
  
  (define_insn_and_split "*movsi_aarch64"

-  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r,r, r,w, m, m,  r,  
r,  r, w,r,w, w")
-   (match_operand:SI 1 "aarch64_mov_operand"  " 
r,r,k,M,n,Usv,m,m,rZ,w,Usw,Usa,Ush,rZ,w,w,Ds"))]
+  [(set (match_operand:SI 0 "nonimmediate_operand")
+   (match_operand:SI 1 "aarch64_mov_operand"))]
"(register_operand (operands[0], SImode)
  || aarch64_reg_or_zero (operands[1], SImode))"
-  "@
-   mov\\t%w0, %w1
-   mov\\t%w0, %w1
-   mov\\t%w0, %w1
-   mov\\t%w0, %1
-   #
-   * return aarch64_output_sve_cnt_immediate (\"cnt\", \"%x0\", operands[1]);
-   ldr\\t%w0, %1
-   ldr\\t%s0, %1
-   

Re: [PATCH v2] machine descriptor: New compact syntax for insn and insn_split in Machine Descriptions.

2023-06-08 Thread Richard Earnshaw (lists) via Gcc-patches

On 08/06/2023 11:29, Richard Earnshaw (lists) via Gcc-patches wrote:

On 08/06/2023 11:12, Andreas Schwab wrote:

On Jun 08 2023, Tamar Christina via Gcc-patches wrote:

@@ -713,6 +714,183 @@ you can use @samp{*} inside of a @samp{@@} 
multi-alternative template:

  @end group
  @end smallexample
+@node Compact Syntax
+@section Compact Syntax
+@cindex compact syntax
+
+In cases where the number of alternatives in a @code{define_insn} or
+@code{define_insn_and_split} are large then it may be beneficial to 
use the


 is large



Or perhaps better still:

When a define_insn or define_insn_and split has many alternatives it may 
be beneficial to ...


R.


Or perhaps even s/many/multiple/.  It doesn't have to have very many to 
make this new syntax preferable, IMO.


R.


Re: [PATCH v2] machine descriptor: New compact syntax for insn and insn_split in Machine Descriptions.

2023-06-08 Thread Richard Earnshaw (lists) via Gcc-patches

On 08/06/2023 11:12, Andreas Schwab wrote:

On Jun 08 2023, Tamar Christina via Gcc-patches wrote:


@@ -713,6 +714,183 @@ you can use @samp{*} inside of a @samp{@@} 
multi-alternative template:
  @end group
  @end smallexample
  
+@node Compact Syntax

+@section Compact Syntax
+@cindex compact syntax
+
+In cases where the number of alternatives in a @code{define_insn} or
+@code{define_insn_and_split} are large then it may be beneficial to use the


 is large



Or perhaps better still:

When a define_insn or define_insn_and split has many alternatives it may 
be beneficial to ...


R.


Re: [PATCH v2] machine descriptor: New compact syntax for insn and insn_split in Machine Descriptions.

2023-06-06 Thread Richard Earnshaw (lists) via Gcc-patches

On 06/06/2023 13:49, Richard Sandiford via Gcc-patches wrote:

Tamar Christina  writes:

int operand_number; /* Operand index in the big array.  */
int output_format;  /* INSN_OUTPUT_FORMAT_*.  */
+  bool compact_syntax_p;
struct operand_data operand[MAX_MAX_OPERANDS];  };

@@ -700,12 +702,57 @@ process_template (class data *d, const char

*template_code)

  if (sp != ep)
message_at (d->loc, "trailing whitespace in output template");

- while (cp < sp)
+ /* Check for any unexpanded iterators.  */
+ if (bp[0] != '*' && d->compact_syntax_p)


I assume the bp[0] != '*' condition skips the check for C code blocks.
Genuine question, but are you sure we want that?  C code often includes asm
strings (in quotes), such as for the SVE CNT[BHWD] example.

Extending the check would mean that any use of <...> for C++ templates will
need to be quoted, but explicit instantiation is pretty rare in .md files.  It 
would
also look weird for conditions.

Either way is fine, just asking.


I excluded it entirely to avoid also running afoul of the binary operators. So 
e.g.
* a < b && b > c ? foo : bar shouldn't trigger it.   It seemed more trouble 
than it's
worth to try to get correct.


Yeah.  I agree it's probably better to skip.


+  }
+
+  /* Adds a character to the end of the string.  */  void add (char
+ c)  {
+con += c;
+  }
+
+  /* Output the string in the form of a brand-new char *, then effectively
+ clear the internal string by resetting len to 0.  */  char * out
+ ()


Formatting: no need for a space before "out".


+  {
+/* Final character is always a trailing comma, so strip it out.
+ */


trailing ',', ';' or ']', rather than just a comma?


Ah no, this is a bit of a lazy intercalate, when the alternatives are pushed in 
it's
not easy to tell how many there will be (because we don't keep track of it in 
this part),
so we just always add a trailing "," and ignore the last char on output.  
Validation of the
alternative counts themselves is done later by the normal machinery.


Ah, I get it now, thanks.


+}
+
+  return index;
+}
+
+/* Modify the attributes list to make space for the implicitly declared
+   attributes in the attrs: list.  */
+
+static void
+create_missing_attributes (rtx x, file_location /* loc */,
+vec_conlist ) {
+  if (attrs.empty ())
+return;
+
+  unsigned int attr_index = GET_CODE (x) == DEFINE_INSN ? 4 : 3;
+ vec_conlist missing;
+
+  /* This is an O(n*m) loop but it's fine, both n and m will always be very
+ small.  */


Agreed that quadraticness isn't a problem.  But I wonder how many people
would write an explicit placeholder set_attr.  Unlike match_operand and
match_scratch, a placeholder set_attr doesn't carry any additional
information.

It might be simpler to drop add_attributes and add all attributes
unconditionally in this function instead.  If the user tries to specify the same
attribute using both syntaxes, the pattern would end up with two definitions
of the same attribute, which ought to be flagged by existing code.



This was done to support the (in arm backend) common thing of having attributes
which are either too complex to add inline in the new syntax or that just 
repeat a
value.

i.e. it's to allow cases like this:

   [(set_attr "length")
(set_attr "predicable" "yes")
(set_attr "predicable_short_it")
(set_attr "arch")
(set (attr "type") (if_then_else (match_operand 2 "const_int_operand" "")
  (const_string "alu_imm")
  (const_string "alu_sreg")))

Where your attrs contains:

   {@ [cons: =0, 1, 2; attrs: length, predicable_short_it, arch]


Yeah, agree it needs to be possible to define things like "type"
in this way.


You also want it for the case where every alternative takes the same 
value, eg the "predicable - yes" attr.


R.




However you're right, I could simply say that you must omit the set_attr in 
attrs and just
merge the two lists?  I think that's what you were alluding to?


Yeah, that's right.  Or just concatenate them and rely on later
error checking (which should give reasonable diagnostics).

Thanks,
Richard




Re: [PATCH] RFC: New compact syntax for insn and insn_split in Machine Descriptions

2023-05-16 Thread Richard Earnshaw (lists) via Gcc-patches

On 24/04/2023 09:33, Richard Sandiford via Gcc-patches wrote:

Richard Sandiford  writes:

Tamar Christina  writes:

Hi All,

This patch adds support for a compact syntax for specifying constraints in
instruction patterns. Credit for the idea goes to Richard Earnshaw.

I am sending up this RFC to get feedback for it's inclusion in GCC 14.
With this new syntax we want a clean break from the current limitations to make
something that is hopefully easier to use and maintain.

The idea behind this compact syntax is that often times it's quite hard to
correlate the entries in the constrains list, attributes and instruction lists.

One has to count and this often is tedious.  Additionally when changing a single
line in the insn multiple lines in a diff change, making it harder to see what's
going on.

This new syntax takes into account many of the common things that are done in MD
files.   It's also worth saying that this version is intended to deal with the
common case of a string based alternatives.   For C chunks we have some ideas
but those are not intended to be addressed here.

It's easiest to explain with an example:

normal syntax:

(define_insn_and_split "*movsi_aarch64"
   [(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r,r, r,w, m, m,  r,  
r,  r, w,r,w, w")
(match_operand:SI 1 "aarch64_mov_operand"  " 
r,r,k,M,n,Usv,m,m,rZ,w,Usw,Usa,Ush,rZ,w,w,Ds"))]
   "(register_operand (operands[0], SImode)
 || aarch64_reg_or_zero (operands[1], SImode))"
   "@
mov\\t%w0, %w1
mov\\t%w0, %w1
mov\\t%w0, %w1
mov\\t%w0, %1
#
* return aarch64_output_sve_cnt_immediate (\"cnt\", \"%x0\", operands[1]);
ldr\\t%w0, %1
ldr\\t%s0, %1
str\\t%w1, %0
str\\t%s1, %0
adrp\\t%x0, %A1\;ldr\\t%w0, [%x0, %L1]
adr\\t%x0, %c1
adrp\\t%x0, %A1
fmov\\t%s0, %w1
fmov\\t%w0, %s1
fmov\\t%s0, %s1
* return aarch64_output_scalar_simd_mov_immediate (operands[1], SImode);"
   "CONST_INT_P (operands[1]) && !aarch64_move_imm (INTVAL (operands[1]), 
SImode)
 && REG_P (operands[0]) && GP_REGNUM_P (REGNO (operands[0]))"
[(const_int 0)]
"{
aarch64_expand_mov_immediate (operands[0], operands[1]);
DONE;
 }"
   ;; The "mov_imm" type for CNT is just a placeholder.
   [(set_attr "type" "mov_reg,mov_reg,mov_reg,mov_imm,mov_imm,mov_imm,load_4,

load_4,store_4,store_4,load_4,adr,adr,f_mcr,f_mrc,fmov,neon_move")
(set_attr "arch"   "*,*,*,*,*,sve,*,fp,*,fp,*,*,*,fp,fp,fp,simd")
(set_attr "length" "4,4,4,4,*,  4,4, 4,4, 4,8,4,4, 4, 4, 4,   4")
]
)

New syntax:

(define_insn_and_split "*movsi_aarch64"
   [(set (match_operand:SI 0 "nonimmediate_operand")
(match_operand:SI 1 "aarch64_mov_operand"))]
   "(register_operand (operands[0], SImode)
 || aarch64_reg_or_zero (operands[1], SImode))"
   "@@ (cons: 0 1; attrs: type arch length)
[=r, r  ; mov_reg  , *   , 4] mov\t%w0, %w1
[k , r  ; mov_reg  , *   , 4] ^
[r , k  ; mov_reg  , *   , 4] ^
[r , M  ; mov_imm  , *   , 4] mov\t%w0, %1
[r , n  ; mov_imm  , *   , *] #
[r , Usv; mov_imm  , sve , 4] << aarch64_output_sve_cnt_immediate ('cnt', 
'%x0', operands[1]);
[r , m  ; load_4   , *   , 4] ldr\t%w0, %1
[w , m  ; load_4   , fp  , 4] ldr\t%s0, %1
[m , rZ ; store_4  , *   , 4] str\t%w1, %0
[m , w  ; store_4  , fp  , 4] str\t%s1, %0
[r , Usw; load_4   , *   , 8] adrp\t%x0, %A1;ldr\t%w0, [%x0, %L1]
[r , Usa; adr  , *   , 4] adr\t%x0, %c1
[r , Ush; adr  , *   , 4] adrp\t%x0, %A1
[w , rZ ; f_mcr, fp  , 4] fmov\t%s0, %w1
[r , w  ; f_mrc, fp  , 4] fmov\t%w0, %s1
[w , w  ; fmov , fp  , 4] fmov\t%s0, %s1
[w , Ds ; neon_move, simd, 4] << aarch64_output_scalar_simd_mov_immediate 
(operands[1], SImode);"
   "CONST_INT_P (operands[1]) && !aarch64_move_imm (INTVAL (operands[1]), 
SImode)
 && REG_P (operands[0]) && GP_REGNUM_P (REGNO (operands[0]))"
   [(const_int 0)]
   {
 aarch64_expand_mov_immediate (operands[0], operands[1]);
 DONE;
   }
   ;; The "mov_imm" type for CNT is just a placeholder.
)

The patch contains some more rewritten examples for both Arm and AArch64.  I
have included them for examples in this RFC but the final version posted in
GCC 14 will have these split out.

The main syntax rules are as follows (See docs for full rules):
   - Template must start with "@@" to use the new syntax.
   - "@@" is followed by a layout in parentheses which is "cons:" followed by
 a list of match_operand/match_scratch IDs, then a semicolon, then the
 same for attributes ("attrs:"). Both sections are optional (so you can
 use only cons, or only attrs, or both), and cons must come before attrs
 if present.
   - Each alternative begins with any amount of whitespace.
   - Following the whitespace is a comma-separated list of constraints and/or
 attributes within brackets [], with sections separated by a semicolon.
   - Following the closing ']' is any amount of 

Re: [GCC][PATCH 13/15, v5] arm: Add support for dwarf debug directives and pseudo hard-register for PAC feature.

2023-01-13 Thread Richard Earnshaw (lists) via Gcc-patches

On 13/01/2023 22:12, Jakub Jelinek wrote:

On Fri, Jan 13, 2023 at 09:58:26PM +, Richard Earnshaw (lists) wrote:

> I'm afraid increasing number of DWARF registers is ABI incompatible change.
> E.g. libgcc __frame_state_for function fills in:
> typedef struct frame_state
> {
>    void *cfa;
>    void *eh_ptr;
>    long cfa_offset;
>    long args_size;
>    long reg_or_offset[PRE_GCC3_DWARF_FRAME_REGISTERS+1];
>    unsigned short cfa_reg;
>    unsigned short retaddr_column;
>    char saved[PRE_GCC3_DWARF_FRAME_REGISTERS+1];
> } frame_state;
> 
> structure, where PRE_GCC3_DWARF_FRAME_REGISTERS defaults to

> __LIBGCC_DWARF_FRAME_REGISTERS__, which is defined to
> DWARF_FRAME_REGISTERS, which defaults to FIRST_PSEUDO_REGISTER.
> So, changing FIRST_PSEUDO_REGISTER is an ABI change unless you arrange for
> PRE_GCC3_DWARF_FRAME_REGISTERS to be defined to the old value.
> 
>  Jakub
> 


So where's the red flag that warns about this?

I also note that Richard Sandiford made a similar type of change for AArch64
in r10-4195 (183bfdafc6f1f98711c5400498a7268cc1441096) and nothing was said
about that at the time.

It seems incredibly fragile to me to have some ABI based off the number of
machine registers.


It is.  The new unwinder fortunately doesn't suffer from this (at least I
think it doesn't), but in older gccs the unwinder could be split across 
different

objects, having e.g. parts of the unwinder in one shared library and another
part in another one, each built by different GCC version.

Guess targets which weren't supported in GCC 2.x are ok, while
__frame_state_for is in libgcc, nothing calls it, so while such changes
change the ABI, nothing likely cares.
But for older targets it is a problem.

And it is hard to catch this in the testsuite, one would either need to
hardcode the count for each target in the test, or test with mixing GCC 2.x
compiled code with current trunk.

Before the introduction of libgcc_eh.a etc., parts of the unwinder was e.g.
exported from glibc.
See e.g. 
https://gcc.gnu.org/legacy-ml/gcc-patches/2001-07/threads.html#00472 
<https://gcc.gnu.org/legacy-ml/gcc-patches/2001-07/threads.html#00472>

for some details.

     Jakub



So:
1) GCC-2.* didn't support the EABI, which is all we support these days.
2) the Arm port updated FIRST_PSEUDO_REGISTER in 2019 in r10-4441 
(16155ccf588a403c033ccd7743329671bcfb27d5) and I didn't see any fallout 
from that.
3) The Arm port uses the unwinding mechanism defined by the ABI, not the 
dwarf2 based tables.


So I'm inclined to think this probably isn't going to be a problem in 
reality.


R.


Re: [GCC][PATCH 13/15, v5] arm: Add support for dwarf debug directives and pseudo hard-register for PAC feature.

2023-01-13 Thread Richard Earnshaw (lists) via Gcc-patches

On 13/01/2023 18:02, Jakub Jelinek via Gcc-patches wrote:

On Fri, Jan 13, 2023 at 05:44:15PM +, Srinath Parvathaneni via Gcc-patches 
wrote:

Hello,

This patch teaches the DWARF support in gcc about RA_AUTH_CODE pseudo 
hard-register and also
updates the ".save", ".cfi_register", ".cfi_offset", ".cfi_restore" directives 
accordingly.
This patch also adds support to emit ".pacspval" directive when "pac ip, lr, 
sp" instruction
in generated in the assembly.

RA_AUTH_CODE register number is 107 and it's dwarf register number is 143.


I'm afraid increasing number of DWARF registers is ABI incompatible change.
E.g. libgcc __frame_state_for function fills in:
typedef struct frame_state
{
   void *cfa;
   void *eh_ptr;
   long cfa_offset;
   long args_size;
   long reg_or_offset[PRE_GCC3_DWARF_FRAME_REGISTERS+1];
   unsigned short cfa_reg;
   unsigned short retaddr_column;
   char saved[PRE_GCC3_DWARF_FRAME_REGISTERS+1];
} frame_state;

structure, where PRE_GCC3_DWARF_FRAME_REGISTERS defaults to
__LIBGCC_DWARF_FRAME_REGISTERS__, which is defined to
DWARF_FRAME_REGISTERS, which defaults to FIRST_PSEUDO_REGISTER.
So, changing FIRST_PSEUDO_REGISTER is an ABI change unless you arrange for
PRE_GCC3_DWARF_FRAME_REGISTERS to be defined to the old value.

Jakub



So where's the red flag that warns about this?

I also note that Richard Sandiford made a similar type of change for 
AArch64 in r10-4195 (183bfdafc6f1f98711c5400498a7268cc1441096) and 
nothing was said about that at the time.


It seems incredibly fragile to me to have some ABI based off the number 
of machine registers.


R.


[COMMITTED] arm: fix bootstrap failure following automatic mode selection patch

2021-03-09 Thread Richard Earnshaw (lists) via Gcc-patches
Fix a signed vs unsigned comparison in last change.

gcc:
* common/config/arm/arm-common.c (arm_config_default): Change type
of 'i' to unsigned.
---
 gcc/common/config/arm/arm-common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/common/config/arm/arm-common.c
b/gcc/common/config/arm/arm-common.c
index 5b03b86724d..9980af6885c 100644
--- a/gcc/common/config/arm/arm-common.c
+++ b/gcc/common/config/arm/arm-common.c
@@ -248,7 +248,7 @@ check_isa_bits_for (const enum isa_feature* bits,
enum isa_feature bit)
 static const char *
 arm_config_default (const char *name)
 {
-  int i;
+  unsigned i;
if (configure_default_options[0].name == NULL)
 return NULL;
-- 
2.30.0



Re: [PATCH] arm: Ignore --with-mode when CPU only supports one instruction set.

2021-03-03 Thread Richard Earnshaw (lists) via Gcc-patches
On 03/03/2021 14:11, Christophe Lyon via Gcc-patches wrote:
> On Wed, 3 Mar 2021 at 14:55, Richard Earnshaw (lists)
>  wrote:
>>
>> Hopefully this change will reduce the number of times Christophe is
>> needing to tweak the testsuite.
>>
> 
> Thanks!
> 
> I guess this means we can now do some cleanup in the testsuite?
> Did you have a quick look at the amount of tests involved?
> 

No, I wasn't expecting to change the existing tests again where you've
already done this.  But hopefully you won't need to do any more changes
for this reason in future.

R.

> Christophe
> 
>> --
>>
>> Arm processors can support up to two instruction sets.  Some early
>> cores only support the traditional A32 (Arm) instructions, while some
>> more recent devices only support T32 (Thumb) instructions.
>>
>> When configuring the compiler, --with-mode can be used to select the
>> default instruction set to target if the user has not made an explicit
>> choice, but this can cause needless problems if the default is not
>> supported by the requested CPU.
>>
>> To fix this this patch adjusts the way that the --with-mode selection
>> is processed so that it can take into account the selected CPU or
>> architecture and not create a meaningless combination.
>>
>> gcc:
>> * common/config/arm/arm-common.c: Include configargs.h.
>> (arm_config_default): New function.
>> (arm_target_mode): Renamed from arm_target_thumb_only.  Handle
>> processors that do not support Thumb.  Take into account the
>> --with-mode configuration setting for selecting the default.
>> * config/arm/arm.h (OPTION_DEFAULT_SPECS): Remove entry for 'mode'.
>> (TARGET_MODE_SPEC_FUNCTIONS): Update for function name change.
>> ---
>>  gcc/common/config/arm/arm-common.c | 49 ++
>>  gcc/config/arm/arm.h   | 10 +++---
>>  2 files changed, 49 insertions(+), 10 deletions(-)
>>
>>



[PATCH] arm: Ignore --with-mode when CPU only supports one instruction set.

2021-03-03 Thread Richard Earnshaw (lists) via Gcc-patches
Hopefully this change will reduce the number of times Christophe is
needing to tweak the testsuite.

--

Arm processors can support up to two instruction sets.  Some early
cores only support the traditional A32 (Arm) instructions, while some
more recent devices only support T32 (Thumb) instructions.

When configuring the compiler, --with-mode can be used to select the
default instruction set to target if the user has not made an explicit
choice, but this can cause needless problems if the default is not
supported by the requested CPU.

To fix this this patch adjusts the way that the --with-mode selection
is processed so that it can take into account the selected CPU or
architecture and not create a meaningless combination.

gcc:
* common/config/arm/arm-common.c: Include configargs.h.
(arm_config_default): New function.
(arm_target_mode): Renamed from arm_target_thumb_only.  Handle
processors that do not support Thumb.  Take into account the
--with-mode configuration setting for selecting the default.
* config/arm/arm.h (OPTION_DEFAULT_SPECS): Remove entry for 'mode'.
(TARGET_MODE_SPEC_FUNCTIONS): Update for function name change.
---
 gcc/common/config/arm/arm-common.c | 49 ++
 gcc/config/arm/arm.h   | 10 +++---
 2 files changed, 49 insertions(+), 10 deletions(-)


diff --git a/gcc/common/config/arm/arm-common.c b/gcc/common/config/arm/arm-common.c
index 98824517c7b..5b03b86724d 100644
--- a/gcc/common/config/arm/arm-common.c
+++ b/gcc/common/config/arm/arm-common.c
@@ -33,6 +33,8 @@
 #include "sbitmap.h"
 #include "diagnostic.h"
 
+#include "configargs.h"
+
 /* Set default optimization options.  */
 static const struct default_options arm_option_optimization_table[] =
   {
@@ -240,16 +242,34 @@ check_isa_bits_for (const enum isa_feature* bits, enum isa_feature bit)
   return false;
 }
 
+/* Look up NAME in the configuration defaults for this build of the
+   the compiler.  Return the value associated with that name, or NULL
+   if no value is found.  */
+static const char *
+arm_config_default (const char *name)
+{
+  int i;
+
+  if (configure_default_options[0].name == NULL)
+return NULL;
+
+  for (i = 0; i < ARRAY_SIZE (configure_default_options); i++)
+if (strcmp (configure_default_options[i].name, name) == 0)
+  return configure_default_options[i].value;
+
+  return NULL;
+}
+
 /* Called by the driver to check whether the target denoted by current
-   command line options is a Thumb-only target.  ARGV is an array of
-   tupples (normally only one) where the first element of the tupple
-   is 'cpu' or 'arch' and the second is the option passed to the
-   compiler for that.  An architecture tupple is always taken in
-   preference to a cpu tupple and the last of each type always
+   command line options is a Thumb-only, or ARM-only, target.  ARGV is
+   an array of tupples (normally only one) where the first element of
+   the tupple is 'cpu' or 'arch' and the second is the option passed
+   to the compiler for that.  An architecture tupple is always taken
+   in preference to a cpu tupple and the last of each type always
overrides any earlier setting.  */
 
 const char *
-arm_target_thumb_only (int argc, const char **argv)
+arm_target_mode (int argc, const char **argv)
 {
   const char *arch = NULL;
   const char *cpu = NULL;
@@ -285,6 +305,9 @@ arm_target_thumb_only (int argc, const char **argv)
   if (arch_opt && !check_isa_bits_for (arch_opt->common.isa_bits,
 	   isa_bit_notm))
 	return "-mthumb";
+  if (arch_opt && !check_isa_bits_for (arch_opt->common.isa_bits,
+	   isa_bit_thumb))
+	return "-marm";
 }
   else if (cpu)
 {
@@ -294,6 +317,20 @@ arm_target_thumb_only (int argc, const char **argv)
   if (cpu_opt && !check_isa_bits_for (cpu_opt->common.isa_bits,
 	  isa_bit_notm))
 	return "-mthumb";
+  if (cpu_opt && !check_isa_bits_for (cpu_opt->common.isa_bits,
+	   isa_bit_thumb))
+	return "-marm";
+}
+
+  const char *default_mode = arm_config_default ("mode");
+  if (default_mode)
+{
+  if (strcmp (default_mode, "thumb") == 0)
+	return "-mthumb";
+  else if (strcmp (default_mode, "arm") == 0)
+	return "-marm";
+  else
+	gcc_unreachable ();
 }
 
   /* Compiler hasn't been configured with a default, and the CPU
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 6bc03ada0bf..113c015c455 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -390,7 +390,10 @@ emission of floating point pcs attributes.  */
--with-float is ignored if -mfloat-abi is specified.
--with-fpu is ignored if -mfpu is specified.
--with-abi is ignored if -mabi is specified.
-   --with-tls is ignored if -mtls-dialect is specified. */
+   --with-tls is ignored if -mtls-dialect is specified.
+   Note: --with-mode is not handled here, that has a special rule
+   TARGET_MODE_CHECK that also takes into account the 

[PATCH] arm: force use of r4 for __gnu_cmse_nonsecure_call when !FPCXT [PR99271]

2021-02-25 Thread Richard Earnshaw (lists) via Gcc-patches

Commit r10-6017 relaxed the constraint on thumb2 calls to
__gnu_cmse_nonsecure_call to allow any register for the call address.
Although the initial code expansion continues to use r4 with the FPCXT
extension is not enabled, the change was unsafe because subsequent
optimizations could use the additional freedom to change which
register was being used.

To fix this we need to split the output patterns in the machine
description to use distinct recognizers: one with the additional
freedom when FPCXT is enabled an another that retains the original
restrictions when the extension is not available.

gcc:
PR target/99271
* config/arm/thumb2.md (nonsecure_call_reg_thumb2_fpcxt): New pattern.
(nonsecure_call_value_reg_thumb2_fpcxt): Likewise.
(nonsecure_call_reg_thumb2): Restrict to using r4 for the callee
address and disable when the FPCXT is not available.
(nonsecure_call_value_reg_thumb2): Likewise.

gcc/testsuite:
* gcc.target/arm/cmse/cmse-18.c: New test.
---
 gcc/config/arm/thumb2.md| 47 ++---
 gcc/testsuite/gcc.target/arm/cmse/cmse-18.c | 11 +
 2 files changed, 42 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/cmse/cmse-18.c


diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index d7fd96c270e..5772f4d0b76 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -536,19 +536,26 @@ (define_insn "*call_reg_thumb2"
   [(set_attr "type" "call")]
 )
 
-(define_insn "*nonsecure_call_reg_thumb2"
+(define_insn "*nonsecure_call_reg_thumb2_fpcxt"
   [(call (unspec:SI [(mem:SI (match_operand:SI 0 "s_register_operand" "l*r"))]
 		UNSPEC_NONSECURE_MEM)
 	 (match_operand 1 "" ""))
(use (match_operand 2 "" ""))
(clobber (reg:SI LR_REGNUM))]
-  "TARGET_THUMB2 && use_cmse"
-  {
-if (TARGET_HAVE_FPCXT_CMSE)
-  return "blxns\\t%0";
-else
-  return "bl\\t__gnu_cmse_nonsecure_call";
-  }
+  "TARGET_THUMB2 && use_cmse && TARGET_HAVE_FPCXT_CMSE"
+  "blxns\\t%0"
+  [(set_attr "length" "4")
+   (set_attr "type" "call")]
+)
+
+(define_insn "*nonsecure_call_reg_thumb2"
+  [(call (unspec:SI [(mem:SI (reg:SI R4_REGNUM))]
+		UNSPEC_NONSECURE_MEM)
+	 (match_operand 0 "" ""))
+   (use (match_operand 1 "" ""))
+   (clobber (reg:SI LR_REGNUM))]
+  "TARGET_THUMB2 && use_cmse && !TARGET_HAVE_FPCXT_CMSE"
+  "bl\\t__gnu_cmse_nonsecure_call"
   [(set_attr "length" "4")
(set_attr "type" "call")]
 )
@@ -564,7 +571,7 @@ (define_insn "*call_value_reg_thumb2"
   [(set_attr "type" "call")]
 )
 
-(define_insn "*nonsecure_call_value_reg_thumb2"
+(define_insn "*nonsecure_call_value_reg_thumb2_fpcxt"
   [(set (match_operand 0 "" "")
 	(call
 	 (unspec:SI [(mem:SI (match_operand:SI 1 "register_operand" "l*r"))]
@@ -572,13 +579,21 @@ (define_insn "*nonsecure_call_value_reg_thumb2"
 	 (match_operand 2 "" "")))
(use (match_operand 3 "" ""))
(clobber (reg:SI LR_REGNUM))]
-  "TARGET_THUMB2 && use_cmse"
-  {
-if (TARGET_HAVE_FPCXT_CMSE)
-  return "blxns\\t%1";
-else
-  return "bl\\t__gnu_cmse_nonsecure_call";
-  }
+  "TARGET_THUMB2 && use_cmse && TARGET_HAVE_FPCXT_CMSE"
+  "blxns\\t%1"
+  [(set_attr "length" "4")
+   (set_attr "type" "call")]
+)
+
+(define_insn "*nonsecure_call_value_reg_thumb2"
+  [(set (match_operand 0 "" "")
+	(call
+	 (unspec:SI [(mem:SI (reg:SI R4_REGNUM))] UNSPEC_NONSECURE_MEM)
+	 (match_operand 1 "" "")))
+   (use (match_operand 2 "" ""))
+   (clobber (reg:SI LR_REGNUM))]
+  "TARGET_THUMB2 && use_cmse && !TARGET_HAVE_FPCXT_CMSE"
+  "bl\\t__gnu_cmse_nonsecure_call"
   [(set_attr "length" "4")
(set_attr "type" "call")]
 )
diff --git a/gcc/testsuite/gcc.target/arm/cmse/cmse-18.c b/gcc/testsuite/gcc.target/arm/cmse/cmse-18.c
new file mode 100644
index 000..e1ff09257b7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/cmse/cmse-18.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-mcmse -fdump-rtl-final-slim" } */
+
+typedef void (*f)(int) __attribute__((cmse_nonsecure_call));
+
+void bar(f func, int a)
+{
+  func(a);
+}
+
+/* { dg-final { scan-rtl-dump "call unspec\\\[\\\[r4:SI\\\]\\\]" "final" } } */



Re: [PATCH v3] arm: Low overhead loop handle long range branches [PR98931]

2021-02-11 Thread Richard Earnshaw (lists) via Gcc-patches
On 10/02/2021 17:44, Andrea Corallo via Gcc-patches wrote:
> Andrea Corallo via Gcc-patches  writes:
> 
>> "Richard Earnshaw (lists)"  writes:
>>
>>> On 09/02/2021 16:27, Andrea Corallo via Gcc-patches wrote:
>>>> Jakub Jelinek  writes:
>>>>
>>>>> On Tue, Feb 09, 2021 at 03:09:43PM +0100, Jakub Jelinek via Gcc-patches 
>>>>> wrote:
>>>>>>>"TARGET_32BIT && TARGET_HAVE_LOB"
>>>>>>> -  "le\t%|lr, %l0")
>>>>>>> +  "*
>>>>>>> +  if (get_attr_length (insn) == 4)
>>>>>>> +return \"le\\t%|lr, %l0\";
>>>>>>> +  else
>>>>>>> +return \"subs\\t%|lr, #1\;bne\\t%l0\";
>>>>>>> +  "
>>>>>>
>>>>>> Why not
>>>>>> {
>>>>>>   if (get_attr_length (insn) == 4)
>>>>>> return "le\t%|lr, %l0";
>>>>>>   else
>>>>>> return "subs\t%|lr, #1;bne\t%l0";
>>>>>> }
>>>>>> instead?  Seems the arm backend uses "*..." more than the more modern {},
>>>>>> but one needs to backslash prefix a lot which makes it less readable?
>>>>>
>>>>> Where "more modern" is introduced 19.5 years ago ;)
>>>>>
>>>>>   Jakub
>>>>
>>>> I guess we really like traditions :)
>>>>
>>>> Attached second version addressing this.
>>>>
>>>> Thanks
>>>>
>>>>   Andrea
>>>>
>>>
>>> You're missing a clobber of the condition codes in the RTL.  This wasn't
>>> needed before, but is now.
>>>
>>> R.
>>
>> Hi Richard,
>>
>> thanks for reviewing, I guess this is going to be a good learning moment
>> for me.
>>
>> What we originally expand is:
>>
>> (insn 2396 2360 2397 3 (parallel [
>> (set (reg:CC_NZ 100 cc)
>> (compare:CC_NZ (plus:SI (reg:SI 14 lr)
>> (const_int -1 [0x]))
>> (const_int 0 [0])))
>> (set (reg:SI 14 lr)
>> (plus:SI (reg:SI 14 lr)
>> (const_int -1 [0x])))
>> ]) "p1.c":4:21 -1
>>  (nil))
>> (jump_insn 2397 2396 2365 3 (set (pc)
>> (if_then_else (ne (reg:CC_NZ 100 cc)
>> (const_int 0 [0]))
>> (label_ref:SI 2361)
>> (pc))) "p1.c":4:21 273 {arm_cond_branch}
>>  (expr_list:REG_DEAD (reg:CC_NZ 100 cc)
>> (int_list:REG_BR_PROB 1062895996 (nil)))
>>  -> 2361)
>>
>> Combine recognizing cc:CC_NZ as a dead reg and rewriting the two insns
>> as:
>>
>> (jump_insn 2397 2396 2365 3 (parallel [
>> (set (pc)
>> (if_then_else (ne (reg:SI 14 lr)
>> (const_int 1 [0x1]))
>> (label_ref:SI 2361)
>> (pc)))
>> (set (reg:SI 14 lr)
>> (plus:SI (reg:SI 14 lr)
>> (const_int -1 [0x])))
>> ]) "p1.c":4:21 1047 {*doloop_end_internal}
>>  (int_list:REG_BR_PROB 1062895996 (nil))
>>  -> 2361)
>>
>> I originally thought that because the write of reg:CC_NZ is explicit in
>> the first pattern we expand this was sufficient, but I now understand
>> I'm wrong and combine should produce a pattern still expressing this.
>> Now the question is how to do that.
>>
>> If I add the clobber '(clobber (reg:CC CC_REGNUM))' inside the parallel
>> of *doloop_end_internal as last element of the vector we ICE in
>> 'add_clobbers' called during combine, apparently 'add_clobbers' does not
>> handle the insn_code_number.
>>
>> If I add it as second element of the parallel combine is not combining
>> the two insns.
>>
>> If I place the clobber outside the parallel as a second element of the
>> insn vector combine is crashing in 'recog_for_combine_1'.
>>
>> So the question is probably: where should the clobber be positioned
>> canonically to have this working?
>>
>> Thanks!
>>
>>   Andrea
> 
> Righ, I've been explained by a knowledgeable colleague that the
> 'parallel' is implicit in the 'define_insn' and there's no need to
> express it (interestgly this is confusing the code generating
> 'add_clobbers').
> 
> The attached patch rewrites the pattern as such and adds the missing
> clobber.
> 
> Tests are running, okay for trunk when done with these?
> 
> Regards
> 
>   Andrea
> 
+  [(set (attr "length")
+(if_then_else
+(lt (minus (pc) (match_dup 0)) (const_int 1024))
+   (const_int 4)
+   (const_int 6)))
+   (set_attr "type" "branch")])

Shouldn't that be using "ltu" rather than "lt", so that if, for some
reason, the branch has been retargeted to come after the branch, then
the test will still fail and we'll get the comparison variant back.

Otherwise OK.

R.



Re: [PATCH v2] arm: Low overhead loop handle long range branches [PR98931]

2021-02-09 Thread Richard Earnshaw (lists) via Gcc-patches
On 09/02/2021 16:27, Andrea Corallo via Gcc-patches wrote:
> Jakub Jelinek  writes:
> 
>> On Tue, Feb 09, 2021 at 03:09:43PM +0100, Jakub Jelinek via Gcc-patches 
>> wrote:
"TARGET_32BIT && TARGET_HAVE_LOB"
 -  "le\t%|lr, %l0")
 +  "*
 +  if (get_attr_length (insn) == 4)
 +return \"le\\t%|lr, %l0\";
 +  else
 +return \"subs\\t%|lr, #1\;bne\\t%l0\";
 +  "
>>>
>>> Why not
>>> {
>>>   if (get_attr_length (insn) == 4)
>>> return "le\t%|lr, %l0";
>>>   else
>>> return "subs\t%|lr, #1;bne\t%l0";
>>> }
>>> instead?  Seems the arm backend uses "*..." more than the more modern {},
>>> but one needs to backslash prefix a lot which makes it less readable?
>>
>> Where "more modern" is introduced 19.5 years ago ;)
>>
>>  Jakub
> 
> I guess we really like traditions :)
> 
> Attached second version addressing this.
> 
> Thanks
> 
>   Andrea
> 

You're missing a clobber of the condition codes in the RTL.  This wasn't
needed before, but is now.

R.


Re: [PATCH] arm: [testsuite] fix lob tests for -mfloat-abi=hard

2020-11-26 Thread Richard Earnshaw (lists) via Gcc-patches
On 26/11/2020 13:53, Andrea Corallo via Gcc-patches wrote:
> Hi all,
> 
> I'd like to submit the following simple patch to clean some Low Loop
> Overhead test failing on hard float configurations.
> 
> lob2.c and lob5.c are failing with: "'-mfloat-abi=hard': selected 
> processor lacks an FPU".
> 
> lob3.c and lob5.c got "-mfloat-abi=soft and -mfloat-abi=hard may not
> be used together".
> 
> Okay for trunk?
> 
> Thanks
>   Andrea
>   
> 

I think it would be better to try to do this with suitable
require-effective-target rules (or something similar).  Forcing options
should generally be a last resort and in particular using -mfpu should
really be avoided as we're trying to move away from that.

diff --git a/gcc/testsuite/gcc.target/arm/lob4.c
b/gcc/testsuite/gcc.target/arm/lob4.c
...
-/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } {
"-marm" "-mcpu=*" } } */
+/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } {
"-marm" "-mcpu=*" "-mfloat-abi=hard" } } */
 /* { dg-options "-march=armv8.1-m.main -mthumb -O3 --save-temps
-mfloat-abi=soft" } */
 /* { dg-require-effective-target arm_softfloat } */

Why is the effective target arm_softfloat not solving this particular
conflict?

R.


[PATCH] arm: correctly handle negating INT_MIN in arm_split_atomic_op [PR97534]

2020-11-24 Thread Richard Earnshaw (lists) via Gcc-patches
arm_split_atomic_op handles subtracting a constant by converting it
into addition of the negated constant.  But if the type of the operand
is int and the constant is -1 we currently end up generating invalid
RTL which can lead to an abort later on.

The problem is that in a HOST_WIDE_INT, INT_MIN is represented as
0x8000 and the negation of this is 0x8000, but
that's not a valid constant for use in SImode operations.

The fix is straight-forward which is to use gen_int_mode rather than
simply GEN_INT.  This knows how to correctly sign-extend the negated
constant when this is needed.

gcc/
PR target/97534
* config/arm/arm.c (arm_split_atomic_op): Use gen_int_mode when
negating a const_int.
gcc/testsuite/
* gcc.dg/pr97534.c: New test.
---
 gcc/config/arm/arm.c   | 2 +-
 gcc/testsuite/gcc.dg/pr97534.c | 9 +
 2 files changed, 10 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr97534.c

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 568e1530f24..56ed556b098 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -30824,7 +30824,7 @@ arm_split_atomic_op (enum rtx_code code, rtx old_out, 
rtx new_out, rtx mem,
 case MINUS:
   if (CONST_INT_P (value))
{
- value = GEN_INT (-INTVAL (value));
+ value = gen_int_mode (-INTVAL (value), wmode);
  code = PLUS;
}
   /* FALLTHRU */
diff --git a/gcc/testsuite/gcc.dg/pr97534.c b/gcc/testsuite/gcc.dg/pr97534.c
new file mode 100644
index 000..b363a322aa5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr97534.c
@@ -0,0 +1,9 @@
+/* PR target/97534 - ICE in decompose on arm*-*-*.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c11 -O2 -g" } */
+
+int f (int a)
+{
+  int b;
+  __atomic_fetch_sub(, (int)(-__INT_MAX__ - 1), (int)0);
+}
-- 
2.17.1



Re: [AArch64] Add --with-tune configure flag

2020-11-19 Thread Richard Earnshaw (lists) via Gcc-patches
On 19/11/2020 14:40, Wilco Dijkstra via Gcc-patches wrote:
> Hi,
> 
     As for your second patch, --with-cpu-64 could be a simple alias indeed,
     but what is the exact definition/expected behaviour of a --with-cpu-32
     on a target that only supports 64-bit code? The AArch64 target cannot
     generate AArch32 code, so we shouldn't silently accept it.
>>>
>>> IMO allowing users to specify all the flags available on x86 is important.
>>>
>>
>> This isn't about general users though; it's about how you configure the
>> compiler and that's not all the same.  I don't mind the --with-cpu-64 as
>> a strict alias for --with-cpu, but --with-cpu-32 is both redundant and
>> misleading as it might give the impression that it does something useful.
> 
> We could make it do something useful, for example emit a warning, an error
> or default to -mabi=ilp32 (since that is similar to what other targets do).
> Anything is better than being the only target that doesn't support it...
> 
> Cheers,
> Wilco
> 

Having the same option have a completely different meaning would be even
worse than not having the option at all.  So no, that's a non-starter.

It's not like these configure options have wide-spread usage at present.

R.


Re: [AArch64] Add --with-tune configure flag

2020-11-19 Thread Richard Earnshaw (lists) via Gcc-patches
On 18/11/2020 17:16, Pop, Sebastian via Gcc-patches wrote:
> Hi,
> 
> On 11/18/20, 10:17 AM, "Wilco Dijkstra"  wrote:
>>I presume you're trying to unify the --with- options across most targets?
> 
> Yes, my intention was to provide the same configure options on arm64
> as on x86, such that projects that already use those options can change
> cpu name to "neoverse-n1" and that will build a compiler with the right
> tuning for Graviton2.
> 
> Allowing arm64 users to specify all the flags available on x86 is important.
> 
>>That would be very useful! However there are significant differences 
>> between
>>targets in how they interpret options like --with-arch=native (or 
>> -march). So
>>those differences also need to be looked at and fixed to avoid unexpected 
>> results.
>>
>>As for the first patch, I think support for --witch-tune requires more 
>> changes.
>>Without proper processing of a --with-tune, you get an incorrect 
>> architecture
>>version (if say the CPU you tune for is newer than the --with-cpu/arch
>>or default).
>>
>>   I posted patches to add --with-tune and fix various issues a while back:
>>https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553865.html
>>https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553866.html
> 
> Thanks for pointing me to your patches, I was not aware of these changes.
> I see that your patches enable more use cases and fix several bugs.
> These changes would definitely be good to have in trunk and branches.
> 
> My patch was the minimal change to enable --with-tune=neoverse-n1
> 
>>As for your second patch, --with-cpu-64 could be a simple alias indeed,
>>but what is the exact definition/expected behaviour of a --with-cpu-32
>>on a target that only supports 64-bit code? The AArch64 target cannot
>>generate AArch32 code, so we shouldn't silently accept it.
> 
> IMO allowing users to specify all the flags available on x86 is important.
> 

This isn't about general users though; it's about how you configure the
compiler and that's not all the same.  I don't mind the --with-cpu-64 as
a strict alias for --with-cpu, but --with-cpu-32 is both redundant and
misleading as it might give the impression that it does something useful.

R.

> Thanks,
> Sebastian
> 



Re: [PATCH] libgcc: Add a weak stub for __sync_synchronize

2020-11-17 Thread Richard Earnshaw (lists) via Gcc-patches
On 17/11/2020 15:18, Bernd Edlinger wrote:
> On 11/17/20 1:44 PM, Richard Earnshaw (lists) wrote:
>> On 03/11/2020 15:08, Bernd Edlinger wrote:
>>> Hi,
>>>
>>> this fixes a problem with a missing symbol __sync_synchronize
>>> which happens when newlib is used together with libstdc++ for
>>> the non-threaded simulator target arm-none-eabi.
>>>
>>> There are several questions on stackoverflow about this issue.
>>>
>>> I would like to add a weak symbol for this target, since this
>>> is only a default implementation and not meant to override a
>>> possibly more sophisticated synchronization function from the
>>> c-runtime.
>>>
>>>
>>> Regression tested successfully on arm-none-eabi with newlib-3.3.0.
>>>
>>> Is it OK for trunk?
>>>
>>>
>>> Thanks
>>> Bernd.
>>>
>>
>> I seem to recall that this was a deliberate decision - you can't guess
>> this correctly, at least when trying to build portable code - you just
>> have to know which runtime you will be using.
>>
> 
> Therefore I suggest to use the weak attribute.  It is on purpose not
> implementing all of the atomics.
> 
> The use case, is a C++ program which initializes a local static variable.
> 
> $ cat test.cc
> #include 
> main(int argc, char **argv)
> {
>   static std::string x = "test";
>   return 0;
> }
> 
> compiles to this:
> sub sp, sp, #20
> str r0, [fp, #-24]
> str r1, [fp, #-28]
> ldr r3, .L14
> ldrbr4, [r3]
> bl  __sync_synchronize
> and r3, r4, #255
> and r3, r3, #1
> cmp r3, #0
> moveq   r3, #1
> movne   r3, #0
> and r3, r3, #255
> cmp r3, #0
> beq .L8
> ldr r0, .L14
> bl  __cxa_guard_acquire
> mov r3, r0
> 
> so __sync_synchronize is not defined in newlib since the target (arm-sim)
> is known to be not multi-threaded,
> but __cxa_guard_acquire is also not a thread safe function,
> because __GTHREADS is not defined by libgcc, since it is known
> at configure time, that the target does not support threads.
> So libstdc++ does not try to use a mutex or any atomics either,
> because it is not compiled with __GTHREADS.
> 
> I can further narrow down the patch by only defining this function when
> __GTHREADS is not defined, to make it more clear.
> 
> 
>> I think Ramana had some changes in the works at one point to address
>> (some) of this, but I'm not sure what happened to them.  Ramana?
>>
>>
>> +#if defined (__ARM_ARCH_6__) || defined (__ARM_ARCH_6J__)   \
>> +|| defined (__ARM_ARCH_6K__) || defined (__ARM_ARCH_6T2__)  \
>> +|| defined (__ARM_ARCH_6Z__) || defined (__ARM_ARCH_6ZK__)  \
>> +|| defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
>> +#if defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
>>
>> Ug, no!  Use the ACLE macros to avoid this sort of mess.
>>
> 
> Ah, thanks, copy-paste from freebsd-atomic.c :)
> 
> 
> I've attached the updated patch.
> Is it OK?
> 
> 
> Thanks
> Bernd.
> 

libgcc is *still* the wrong place for this.  It belongs in the system
library (eg newlib, or glibc, or whatever), which knows about the system
it's running on.  (Sorry, I should have said this before, but I've
context-switched this out since it's been a long time since it came up).

This hack will just lead to silent code failure of the worst kind
(non-reproducable, racy) at runtime.

R.


Re: [PATCH] libgcc: Add a weak stub for __sync_synchronize

2020-11-17 Thread Richard Earnshaw (lists) via Gcc-patches
On 03/11/2020 15:08, Bernd Edlinger wrote:
> Hi,
> 
> this fixes a problem with a missing symbol __sync_synchronize
> which happens when newlib is used together with libstdc++ for
> the non-threaded simulator target arm-none-eabi.
> 
> There are several questions on stackoverflow about this issue.
> 
> I would like to add a weak symbol for this target, since this
> is only a default implementation and not meant to override a
> possibly more sophisticated synchronization function from the
> c-runtime.
> 
> 
> Regression tested successfully on arm-none-eabi with newlib-3.3.0.
> 
> Is it OK for trunk?
> 
> 
> Thanks
> Bernd.
> 

I seem to recall that this was a deliberate decision - you can't guess
this correctly, at least when trying to build portable code - you just
have to know which runtime you will be using.

I think Ramana had some changes in the works at one point to address
(some) of this, but I'm not sure what happened to them.  Ramana?


+#if defined (__ARM_ARCH_6__) || defined (__ARM_ARCH_6J__)   \
+|| defined (__ARM_ARCH_6K__) || defined (__ARM_ARCH_6T2__)  \
+|| defined (__ARM_ARCH_6Z__) || defined (__ARM_ARCH_6ZK__)  \
+|| defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
+#if defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)

Ug, no!  Use the ACLE macros to avoid this sort of mess.

R.


  1   2   3   4   5   6   7   8   9   10   >