from:"Richard Earnshaw"

Re: [PATCH] testsuite: Verify r0-r3 are extended with CMSE

2024-04-30 Thread Richard Earnshaw (lists)

On 30/04/2024 16:37, Torbjorn SVENSSON wrote:
> 
> 
> On 2024-04-30 17:11, Richard Earnshaw (lists) wrote:
>> On 27/04/2024 15:13, Torbjörn SVENSSON wrote:
>>> Add regression test to the existing zero/sign extend tests for CMSE to
>>> verify that r0, r1, r2 and r3 are properly extended, not just r0.
>>>
>>> Test is done using -O0 to ensure the instructions are in a predictable
>>> order.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * gcc.target/arm/cmse/extend-param.c: Add regression test.
>>>
>>> Signed-off-by: Torbjörn SVENSSON 
>>> ---
>>>   .../gcc.target/arm/cmse/extend-param.c    | 20 ++-
>>>   1 file changed, 19 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/gcc/testsuite/gcc.target/arm/cmse/extend-param.c 
>>> b/gcc/testsuite/gcc.target/arm/cmse/extend-param.c
>>> index 01fac786238..b8b8ecbff56 100644
>>> --- a/gcc/testsuite/gcc.target/arm/cmse/extend-param.c
>>> +++ b/gcc/testsuite/gcc.target/arm/cmse/extend-param.c
>>> @@ -93,4 +93,22 @@ __attribute__((cmse_nonsecure_entry)) char 
>>> boolSecureFunc (bool index) {
>>>   return 0;
>>>     return array[index];
>>>   -}
>>> \ No newline at end of file
>>> +}
>>> +
>>> +/*
>>> +**__acle_se_boolCharShortEnumSecureFunc:
>>> +**    ...
>>> +**    uxtb    r0, r0
>>> +**    uxtb    r1, r1
>>> +**    uxth    r2, r2
>>> +**    uxtb    r3, r3
>>> +**    ...
>>> +*/
>>> +__attribute__((cmse_nonsecure_entry,optimize(0))) char 
>>> boolCharShortEnumSecureFunc (bool a, unsigned char b, unsigned short c, 
>>> enum offset d) {
>>> +
>>> +  size_t index = a + b + c + d;
>>> +  if (index >= ARRAY_SIZE)
>>> +    return 0;
>>> +  return array[index];
>>> +
>>> +}
>>
>> Ok, but please can you add '-fshort-enums' to dg-options to ensure this test 
>> still behaves correctly if run with a different default (I missed that last 
>> time around).
> 
> Ok, I'll add that to extend-param.c. Do you want me to also add it to the 
> extend-return.c test case?
> 
> Kind regards,
> Torbjörn

Yes please, if it has the same issue.

R.
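
The reason for pinning '-fshort-enums' can be sketched in a few lines of C. This is only an illustration under stated assumptions: the three-value `enum offset` below is a hypothetical stand-in (its definition is not shown in this thread), and the helper names `extend_from_bits` and `enum_offset_extend_bits` are invented here. With short enums the type occupies one byte, so a `uxtb` on r3 is what the test expects; when the enum is int-sized it already fills the register and no narrowing extension would be expected, so the scan pattern would not match.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the enum the test passes in r3.  */
enum offset { zero = 0, one = 1, two = 2 };

/* Number of low bits a callee must zero-extend from for an unsigned
   argument of SIZE bytes held in a 32-bit register.  A result of 0
   means the value already fills the register.  */
unsigned
extend_from_bits (size_t size)
{
  assert (size == 1 || size == 2 || size == 4);
  return size < 4 ? (unsigned) size * 8u : 0u;
}

/* Result for 'enum offset' under whatever enum layout this file is
   compiled with: 8 with -fshort-enums, 0 when the enum is int-sized.  */
unsigned
enum_offset_extend_bits (void)
{
  return extend_from_bits (sizeof (enum offset));
}
```

Compiling the same file with and without -fshort-enums changes what `enum_offset_extend_bits` returns, which is the dependency on the configured default that the review comment is guarding against.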

Re: [PATCH] testsuite: Verify r0-r3 are extended with CMSE

2024-04-30 Thread Richard Earnshaw (lists)

On 27/04/2024 15:13, Torbjörn SVENSSON wrote:
> Add regression test to the existing zero/sign extend tests for CMSE to
> verify that r0, r1, r2 and r3 are properly extended, not just r0.
> 
> Test is done using -O0 to ensure the instructions are in a predictable
> order.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/cmse/extend-param.c: Add regression test.
> 
> Signed-off-by: Torbjörn SVENSSON 
> ---
>  .../gcc.target/arm/cmse/extend-param.c| 20 ++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/cmse/extend-param.c 
> b/gcc/testsuite/gcc.target/arm/cmse/extend-param.c
> index 01fac786238..b8b8ecbff56 100644
> --- a/gcc/testsuite/gcc.target/arm/cmse/extend-param.c
> +++ b/gcc/testsuite/gcc.target/arm/cmse/extend-param.c
> @@ -93,4 +93,22 @@ __attribute__((cmse_nonsecure_entry)) char boolSecureFunc 
> (bool index) {
>  return 0;
>return array[index];
>  
> -}
> \ No newline at end of file
> +}
> +
> +/*
> +**__acle_se_boolCharShortEnumSecureFunc:
> +**   ...
> +**   uxtb    r0, r0
> +**   uxtb    r1, r1
> +**   uxth    r2, r2
> +**   uxtb    r3, r3
> +**   ...
> +*/
> +__attribute__((cmse_nonsecure_entry,optimize(0))) char 
> boolCharShortEnumSecureFunc (bool a, unsigned char b, unsigned short c, enum 
> offset d) {
> +
> +  size_t index = a + b + c + d;
> +  if (index >= ARRAY_SIZE)
> +    return 0;
> +  return array[index];
> +
> +}

Ok, but please can you add '-fshort-enums' to dg-options to ensure this test 
still behaves correctly if run with a different default (I missed that last 
time around).

R.

Re: [PATCH][GCC] aarch64: Fix SCHEDULER_IDENT for Cortex-A510

2024-04-26 Thread Richard Earnshaw (lists)

On 25/04/2024 15:59, Richard Ball wrote:
> Hi Richard,
> 
> I committed this combined patch (with Cortex-A520) for trunk 
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=cab53aae43cf94171b01320c08302e47a5daa391
>  
> <https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=cab53aae43cf94171b01320c08302e47a5daa391>
> 
> Am I ok to commit just the Cortex-A510 half into gcc-12 and gcc-13?

Yes, if that's the correct thing to do there.

R.

> 
> Thanks,
> Richard Ball
> --
> *From:* Richard Ball
> *Sent:* 12 March 2024 14:08
> *To:* gcc-patches@gcc.gnu.org ; Richard Earnshaw 
> ; Richard Sandiford ; 
> Marcus Shawcroft 
> *Subject:* [PATCH][GCC] aarch64: Fix SCHEDULER_IDENT for Cortex-A510
>  
> The SCHEDULER_IDENT for this CPU was incorrectly
> set to cortexa55. This can cause
> sub-optimal asm to be generated.
> 
> Ok for trunk?
> 
> Can I also backport this to gcc-12 and gcc-13?
> 
> gcc/ChangeLog:
>     PR target/114272
>     * config/aarch64/aarch64-cores.def (AARCH64_CORE):
>     Change SCHEDULER_IDENT from cortexa55 to cortexa53
>     for Cortex-A510.

Re: [PATCH] arm: Zero/Sign extends for CMSE security

2024-04-26 Thread Richard Earnshaw (lists)

On 26/04/2024 09:39, Torbjorn SVENSSON wrote:
> Hi,
> 
> On 2024-04-25 16:25, Richard Ball wrote:
>> Hi Torbjorn,
>>
>> Thanks very much for the comments.
>> Given that the code that handles this is within a
>> FOREACH_FUNCTION_ARGS loop, it seems a fairly safe assumption that if the
>> code works for one argument it will work for all.
>> Going back to add extra tests seems a little overkill to me.
> 
> For verifying that the implementation does the right thing now, no, but for 
> verifying against future regressions, then yes.
> 
> So, from a regression point of view, I think it makes sense to have the check 
> that more than the first argument is managed properly.
> 
> Kind regards,
> Torbjörn

Feel free to post some additional tests, Torbjorn.

R.

Re: [PATCH] arm: Zero/Sign extends for CMSE security

2024-04-25 Thread Richard Earnshaw (lists)

On 24/04/2024 16:55, Richard Ball wrote:
> This patch makes the following changes:
> 
> 1) When calling a secure function from non-secure code then any arguments
>    smaller than 32-bits that are passed in registers are zero- or
>    sign-extended.
> 2) After a non-secure function returns into secure code then any return value
>    smaller than 32-bits that is passed in a register is zero- or
>    sign-extended.
> 
> This patch addresses CVE-2024-0151.
> 
> gcc/ChangeLog:
> PR target/114837
> * config/arm/arm.cc (cmse_nonsecure_call_inline_register_clear):
>   Add zero/sign extend.
> (arm_expand_prologue): Add zero/sign extend.
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.target/arm/cmse/extend-param.c: New test.
> * gcc.target/arm/cmse/extend-return.c: New test.

OK.  And OK to backport to active branches.

R.
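
To make the effect of the two changes concrete, here is a small host-side model in C. It is a sketch, not code from the patch: the `model_*` helpers are hypothetical names that mimic what the UXTB, UXTH and SXTB instructions do to a 32-bit register value, and the sample register contents are made up. The point is that a non-secure caller controls all 32 bits of r0-r3, so the secure side can only rely on the declared narrow type after the extension has been applied.

```c
#include <assert.h>
#include <stdint.h>

/* Model of UXTB: zero-extend the low 8 bits of a register value.  */
uint32_t
model_uxtb (uint32_t reg)
{
  return reg & 0xffu;
}

/* Model of UXTH: zero-extend the low 16 bits of a register value.  */
uint32_t
model_uxth (uint32_t reg)
{
  return reg & 0xffffu;
}

/* Model of SXTB: sign-extend the low 8 bits of a register value.  */
int32_t
model_sxtb (uint32_t reg)
{
  return (int32_t) ((reg & 0xffu) ^ 0x80u) - 0x80;
}

/* A non-secure caller can leave anything in the upper bits of r0.
   Only after extension does the value match the declared type.  */
void
check_extension_model (void)
{
  uint32_t r0 = 0x00010005u;       /* 'unsigned char' 5 with stray upper bits */
  assert (r0 >= 256u);             /* raw register breaks a "< 256" assumption */
  assert (model_uxtb (r0) == 5u);  /* extended value is the intended argument */
}
```

Without the extension, an index computed from the raw register could pass or bypass a bounds check in ways the source code never allows for a value of the declared type.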

Re: [PATCH] [testsuite] [arm] require arm_v8_1m_main for pacbti tests

2024-04-19 Thread Richard Earnshaw (lists)

On 19/04/2024 13:45, Alexandre Oliva wrote:
> On Apr 16, 2024, "Richard Earnshaw (lists)"  wrote:
> 
>> The require-effective-target flags test whether a specific set of
>> flags will make the compilation work, so they need to be used in
>> conjunction with the corresponding dg-add-options flags that then
>> apply those options.
> 
> *nod*, that's the theory.  Problem is the architectures supported by
> [add_options_for_]arm_arch_*[_ok] do not match exactly those expected by
> the tests, and I can't quite tell whether the subtle changes they would
> introduce would change what they intend to test, or even whether the
> differences are irrelevant, or would be sensible to add as variants to
> the dg machinery.  I think it would take someone more familiar than I am
> with all of the ARM variants to do this correctly.  I don't even know
> how these changes would need to be tested to be sure they remain
> correct.

It's ok to add additional variations to the table of variants in 
target-supports.exp, but we should avoid writing new specific run-time 
functions unless we really want an executable test.

I started doing some cleanup of the Arm tests infrastructure during stage 3, 
but stopped during stage 4 as I wanted to minimise the changes being made now.  
I plan to go back and work on it some more once stage 1 re-opens.

> 
> Would you be willing to take it from here, or would you accept the patch
> as an incremental yet imperfect improvement, or would you prefer to
> guide me in making it correct, and in verifying it (there are questions
> below)?  I don't have a lot of cycles to put into this (we've already
> worked around the testsuite bugs we ran into), but it would be desirable
> to get a fix into GCC as well, if we can converge on one without
> unreasonably burdening anyone.
> 
> 
>   v8_1m_main "-march=armv8.1-m.main+fp -mthumb" __ARM_ARCH_8M_MAIN__
>   v8_1m_main_pacbti "-march=armv8.1-m.main+pacbti+fp -mthumb"
>   "__ARM_ARCH_8M_MAIN__ && __ARM_FEATURE_BTI && __ARM_FEATURE_PAUTH"
> 
> Why do these have +fp in -march but not in the v8_1m* arch name?

It's ... complicated :)

The +fp is there because, with the move to having -mfpu=auto as the default, we 
need to avoid problems when the compiler has been configured with 
--with-float=hard, which requires the extension register set (fp or vector 
support) even if the test code itself doesn't care.  The best way to handle 
this in most cases is to give the architecture strings a default FPU 
specification (ie +fp). 

> 
> 
> gcc/testsuite/g++.target/arm/pac-1.C:
> /* { dg-options "-march=armv8.1-m.main+mve+pacbti -mbranch-protection=pac-ret 
> -mthumb -mfloat-abi=hard -g -O0" } */
> 
> v8_1m_main_pacbti plus +mve minus +fp.
> Do we need a dg arch for that?

I'd be inclined to drop +mve from this one; there's nothing I can see in the 
test that would generate mve instructions, so I think it's irrelevant.  We can 
use the existing v8_1m_main_pacbti operations.

> 
> 
> gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c:
> /* { dg-additional-options "-march=armv8.1-m.main+pacbti+fp --save-temps 
> -mfloat-abi=hard" } */
> gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c:
> /* { dg-options "-march=armv8.1-m.main+fp+pacbti" } */
> 
> v8_1m_main_pacbti minus -mthumb.
> AFAICT the -mthumb is redundant.

Nearly, but not quite.  Although the gcc driver knows that m-profile 
architectures require thumb, that's not enough to override an explicit -marm 
from a testsuite configuration run, so if your site.exp file adds -marm in a 
test configuration we need to override that or the test will fail.  But the 
table based list of options will do that for you.

> 
> 
> gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c:
> /* { dg-options "-march=armv8-m.main+fp -mfloat-abi=softfp" } */
> 
> v8_1m_main minus -mthumb.
> AFAICT the -mthumb is redundant.

As above

> 
> 
> gcc/testsuite/gcc.target/arm/bti-1.c:
> /* { dg-options "-march=armv8.1-m.main -mthumb -mfloat-abi=softfp 
> -mbranch-protection=bti --save-temps" } */
> gcc/testsuite/gcc.target/arm/bti-2.c:
> /* { dg-options "-march=armv8.1-m.main -mthumb -mfloat-abi=softfp 
> -mbranch-protection=bti --save-temps" } */
> 
> v8_1m_main minus +fp.
> Can these be bumped to +fp, or do we need an extra dg arch?
> 
> Are these missing +pacbti?

The tests themselves do not require fp, but if we use the effective-target 
rules (arm_arch_v8_1m_main), we can remove the -march, -mthumb and -mfloat-abi 
flags from these tests.

These tests for BTI should NOT have +pacbti: they're testing that the compiler 
generates the right nop-based implementation that is backw

Re: [PATCH]AArch64: remove reliance on register allocator for simd/gpreg costing. [PR114741]

2024-04-18 Thread Richard Earnshaw (lists)

On 18/04/2024 11:11, Tamar Christina wrote:
> Hi All,
> 
> In PR114741 we see that we have a regression in codegen when SVE is enabled,
> where the simple testcase:
> 
> void foo(unsigned v, unsigned *p)
> {
> *p = v & 1;
> }
> 
> generates
> 
> foo:
> fmov s31, w0
> and z31.s, z31.s, #1
> str s31, [x1]
> ret
> 
> instead of:
> 
> foo:
> and w0, w0, 1
> str w0, [x1]
> ret
> 
> This has an impact not just on code size but also on performance.  This is
> caused by the use of the ^ constraint modifier in the pattern 3.
> 
> The documentation states that this modifier should only have an effect on the
> alternative costing in that a particular alternative is to be preferred unless
> a non-pseudo reload is needed.
> 
> The pattern was trying to convey that whenever both r and w are required, that
> it should prefer r unless a reload is needed.  This is because if a reload is
> needed then we can construct the constants more flexibly on the SIMD side.
> 
> We were using this to simplify the implementation and to get generic cases
> such as:
> 
> double negabs (double x)
> {
>    unsigned long long y;
>    memcpy (&y, &x, sizeof(double));
>    y = y | (1UL << 63);
>    memcpy (&x, &y, sizeof(double));
>    return x;
> }
> 
> which don't go through an expander.
> However the implementation of ^ in the register allocator is not according to
> the documentation in that it also has an effect during coloring.  During 
> initial
> register class selection it applies a penalty to a class, similar to how ? 
> does.
> 
> In this example the penalty makes the use of GP regs expensive enough that it 
> no
> longer considers them:
> 
> r106: preferred FP_REGS, alternative NO_REGS, allocno FP_REGS
> ;;3--> b  0: i   9 r106=r105&0x1
> :cortex_a53_slot_any:GENERAL_REGS+0(-1)FP_REGS+1(1)PR_LO_REGS+0(0)
>  PR_HI_REGS+0(0):model 4
> 
> which is not the expected behavior.  For GCC 14 this is a conservative fix.
> 
> 1. we remove the ^ modifier from the logical optabs.
> 
> 2. In order not to regress copysign we then move the copysign expansion to
>directly use the SIMD variant.  Since copysign only supports floating point
>modes this is fine and no longer relies on the register allocator to select
>the right alternative.
> 
> It once again regresses the general case, but this case wasn't optimized in
> earlier GCCs either so it's not a regression in GCC 14.  This change gives
> strictly better codegen than earlier GCCs and still optimizes the important
> cases.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
> 
>   PR target/114741
>   * config/aarch64/aarch64.md (3): Remove ^ from alt 2.
>   (copysign3): Use SIMD version of IOR directly.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR target/114741
>   * gcc.target/aarch64/fneg-abs_2.c: Update codegen.
>   * gcc.target/aarch64/fneg-abs_4.c: xfail for now.
>   * gcc.target/aarch64/pr114741.c: New test.
> 
> ---
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 
> 385a669b9b3c31cc9108a660e881b9091c71fc7c..dbde066f7478bec51a8703b017ea553aa98be309
>  100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -4811,7 +4811,7 @@ (define_insn "3"
>""
>{@ [ cons: =0 , 1  , 2; attrs: type , arch  ]
>   [ r, %r , r; logic_reg   , * ] \t%0, 
> %1, %2
> - [ rk   , ^r ,  ; logic_imm   , * ] \t%0, 
> %1, %2
> + [ rk   , r  ,  ; logic_imm   , * ] \t%0, 
> %1, %2
>   [ w, 0  ,  ; *   , sve   ] \t%Z0., 
> %Z0., #%2
>   [ w, w  , w; neon_logic  , simd  ] 
> \t%0., %1., %2.
>}
> @@ -7192,22 +7192,29 @@ (define_expand "copysign3"
> (match_operand:GPF 2 "nonmemory_operand")]
>"TARGET_SIMD"
>  {
> -  machine_mode int_mode = mode;
> -  rtx bitmask = gen_reg_rtx (int_mode);
> -  emit_move_insn (bitmask, GEN_INT (HOST_WIDE_INT_M1U
> - << (GET_MODE_BITSIZE (mode) - 1)));
> +  rtx signbit_const = GEN_INT (HOST_WIDE_INT_M1U
> +<< (GET_MODE_BITSIZE (mode) - 1));
>/* copysign (x, -1) should instead be expanded as orr with the sign
>   bit.  */
>rtx op2_elt = unwrap_const_vec_duplicate (operands[2]);
>if (GET_CODE (op2_elt) == CONST_DOUBLE
>&& real_isneg (CONST_DOUBLE_REAL_VALUE (op2_elt)))
>  {
> -  emit_insn (gen_ior3 (
> - lowpart_subreg (int_mode, operands[0], mode),
> - lowpart_subreg (int_mode, operands[1], mode), bitmask));
> +  rtx v_bitmask
> + = force_reg (V2mode,
> +  gen_const_vec_duplicate (V2mode,
> +   signbit_const));
> +
> +  emit_insn (gen_iorv23 (
> + lowpart_subreg (V2mode, operands[0], mode),
> + lowpart_subreg

Re: [PATCH] [testsuite] [arm] accept empty init for bfloat16

2024-04-16 Thread Richard Earnshaw (lists)

On 16/04/2024 04:50, Alexandre Oliva wrote:
> 
> Complete r13-2205, adjusting an arm-specific test that expects a
> no-longer-issued error at an empty initializer.
> 
> Regstrapped on x86_64-linux-gnu.  Also tested with gcc-13 on arm-,
> aarch64-, x86- and x86_64-vxworks7r2.  Ok to install?
> 
> 
> for  gcc/testsuite/ChangeLog
> 
>   * gcc.target/arm/bfloat16_scalar_typecheck.c: Accept C23
> empty initializers.
> ---
>  .../gcc.target/arm/bfloat16_scalar_typecheck.c |4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/bfloat16_scalar_typecheck.c 
> b/gcc/testsuite/gcc.target/arm/bfloat16_scalar_typecheck.c
> index 8c80c55bc9f4c..04ede93bda152 100644
> --- a/gcc/testsuite/gcc.target/arm/bfloat16_scalar_typecheck.c
> +++ b/gcc/testsuite/gcc.target/arm/bfloat16_scalar_typecheck.c
> @@ -42,7 +42,7 @@ bfloat16_t footest (bfloat16_t scalar0)
>short initi_1_4 = glob_bfloat; /* { dg-error {invalid conversion from type 
> 'bfloat16_t'} } */
>double initi_1_5 = glob_bfloat; /* { dg-error {invalid conversion from 
> type 'bfloat16_t'} } */
>  
> -  bfloat16_t scalar2_1 = {}; /* { dg-error {empty scalar initializer} } */
> +  bfloat16_t scalar2_1 = {};
>bfloat16_t scalar2_2 = { glob_bfloat };
>bfloat16_t scalar2_3 = { 0 }; /* { dg-error {invalid conversion to type 
> 'bfloat16_t'} } */
>bfloat16_t scalar2_4 = { 0.1 }; /* { dg-error {invalid conversion to type 
> 'bfloat16_t'} } */
> @@ -94,7 +94,7 @@ bfloat16_t footest (bfloat16_t scalar0)
>  
>/* Compound literals.  */
>  
> -  (bfloat16_t) {}; /* { dg-error {empty scalar initializer} } */
> +  (bfloat16_t) {};
>(bfloat16_t) { glob_bfloat };
>(bfloat16_t) { 0 }; /* { dg-error {invalid conversion to type 
> 'bfloat16_t'} } */
>(bfloat16_t) { 0.1 }; /* { dg-error {invalid conversion to type 
> 'bfloat16_t'} } */
> 


This test is checking for errors.  Perhaps it would be better to select an 
older version of the standard and then set pedantic-error mode.

R.

Re: [testsuite] [aarch64] Require fpic effective target

2024-04-16 Thread Richard Earnshaw (lists)

On 16/04/2024 04:08, Alexandre Oliva wrote:
> Regstrapped on x86_64-linux-gnu.  Also tested with gcc-13 on arm-,
> aarch64-, x86- and x86_64-vxworks7r2.  Ok to install?
> 
> Co-authored-by: Olivier Hainque 
> 
> for  gcc/testsuite/ChangeLog
> 
>   * gcc.target/aarch64/pr94201.c: Add missing
>   dg-require-effective-target fpic.
>   * gcc.target/aarch64/pr103085.c: Likewise.
> 
> ---
>  gcc/testsuite/gcc.target/aarch64/pr103085.c |1 +
>  gcc/testsuite/gcc.target/aarch64/pr94201.c  |1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr103085.c 
> b/gcc/testsuite/gcc.target/aarch64/pr103085.c
> index dbc9c15b71f22..347280ed42b2d 100644
> --- a/gcc/testsuite/gcc.target/aarch64/pr103085.c
> +++ b/gcc/testsuite/gcc.target/aarch64/pr103085.c
> @@ -1,5 +1,6 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -fstack-protector-strong -fPIC" } */
> +/* { dg-require-effective-target fpic } */
>  
>  void g(int*);
>  void
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr94201.c 
> b/gcc/testsuite/gcc.target/aarch64/pr94201.c
> index 691761691868a..3b9b79059e02b 100644
> --- a/gcc/testsuite/gcc.target/aarch64/pr94201.c
> +++ b/gcc/testsuite/gcc.target/aarch64/pr94201.c
> @@ -1,5 +1,6 @@
>  /* { dg-do compile } */
>  /* { dg-options "-mcmodel=tiny -mabi=ilp32 -fPIC" } */
> +/* { dg-require-effective-target fpic } */
>  
>  extern int bar (void *);
>  extern long long a;
> 


OK

R.

Re: [PATCH] [testsuite] [arm] require arm_v8_1m_main for pacbti tests

2024-04-16 Thread Richard Earnshaw (lists)

On 16/04/2024 04:48, Alexandre Oliva wrote:
> 
> arm pac and bti tests that use -march=armv8.1-m.main get an implicit
> -mthumb, that is incompatible with vxworks kernel mode.  Declaring the
> requirement for a 8.1-m.main-compatible toolchain is enough to avoid
> those fails, because the toolchain feature test fails in kernel mode.
> 
> Regstrapped on x86_64-linux-gnu.  Also tested with gcc-13 on arm-,
> aarch64-, x86- and x86_64-vxworks7r2.  Ok to install?
> 
> 
> for  gcc/testsuite/ChangeLog
> 
>   * g++.target/arm/pac-1.C: Require arm_arch_v8_1m_main.
>   * gcc.target/arm/acle/pacbti-m-predef-11.c: Likewise.
>   * gcc.target/arm/acle/pacbti-m-predef-12.c: Likewise.
>   * gcc.target/arm/acle/pacbti-m-predef-7.c: Likewise.
>   * gcc.target/arm/bti-1.c: Likewise.
>   * gcc.target/arm/bti-2.c: Likewise.
> ---
>  gcc/testsuite/g++.target/arm/pac-1.C   |1 +
>  .../gcc.target/arm/acle/pacbti-m-predef-11.c   |1 +
>  .../gcc.target/arm/acle/pacbti-m-predef-12.c   |1 +
>  .../gcc.target/arm/acle/pacbti-m-predef-7.c|1 +
>  gcc/testsuite/gcc.target/arm/bti-1.c   |1 +
>  gcc/testsuite/gcc.target/arm/bti-2.c   |1 +
>  6 files changed, 6 insertions(+)
> 
> diff --git a/gcc/testsuite/g++.target/arm/pac-1.C 
> b/gcc/testsuite/g++.target/arm/pac-1.C
> index f671a27b048c6..f48ad6cc5cb65 100644
> --- a/gcc/testsuite/g++.target/arm/pac-1.C
> +++ b/gcc/testsuite/g++.target/arm/pac-1.C
> @@ -2,6 +2,7 @@
>  /* { dg-do compile } */
>  /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
> "-mcpu=*" } } */
>  /* { dg-options "-march=armv8.1-m.main+mve+pacbti 
> -mbranch-protection=pac-ret -mthumb -mfloat-abi=hard -g -O0" } */
> +/* { dg-require-effective-target arm_arch_v8_1m_main_ok } */

The require-effective-target flags test whether a specific set of flags will 
make the compilation work, so they need to be used in conjunction with the 
corresponding dg-add-options flags that then apply those options.  It isn't 
safe to just add a different architecture flag instead.  So if you're going to 
use this effective target, you should use it along with "dg-add-options 
arm_arch_v8_1m_main" (ie the effective-target name minus the trailing '_ok'), 
and then replace dg-options with dg-additional-options adding the remaining 
flags.  You can then remove the dg-skip-if as well because that's what the 
require-effective-target flag is doing.  So something like

dg-do compile
dg-require-effective-target arm_arch_v8_1m_main_ok
dg-add-options arm_arch_v8_1m_main
dg-additional-options "-mbranch-protection=pac-ret -g -O0"

But this test is also adding pacbti to the architecture flags, so it would 
probably be better to use v8_1m_main_pacbti_ok as the effective target.  It's 
not identical to the options above, but it's probably sufficient for this test. 
 Each test below will need checking for the exact flags that are needed for the 
test in question.
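
Written out as the header of a test file, the directives sketched above would look roughly as follows. This is only an illustration of the directive syntax: the function body is a hypothetical placeholder rather than the contents of pac-1.C, and whether the plain v8_1m_main target or the pacbti variant is the right choice still has to be decided per test, as noted above.

```c
/* { dg-do compile } */
/* { dg-require-effective-target arm_arch_v8_1m_main_ok } */
/* { dg-add-options arm_arch_v8_1m_main } */
/* { dg-additional-options "-mbranch-protection=pac-ret -g -O0" } */

/* Hypothetical placeholder body; a real test keeps its own code here.  */
int
pac_ret_placeholder (int a, int b)
{
  return a + b;
}
```

The effective-target check and the dg-add-options line use the same table entry, so the flags that were checked are the flags that get applied, and no separate dg-skip-if is needed.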


>  
>  __attribute__((noinline)) void
>  fn1 (int a, int b, int c)
> diff --git a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c 
> b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c
> index 6a5ae92c567f3..dba4f491cfea7 100644
> --- a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c
> +++ b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c
> @@ -1,6 +1,7 @@
>  /* { dg-do compile } */
>  /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
> "-mcpu=*" "-mfloat-abi=*" } } */
>  /* { dg-options "-march=armv8.1-m.main+fp+pacbti" } */
> +/* { dg-require-effective-target arm_arch_v8_1m_main_ok } */
>  
>  #if (__ARM_FEATURE_BTI != 1)
>  #error "Feature test macro __ARM_FEATURE_BTI_DEFAULT should be defined to 1."
> diff --git a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c 
> b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c
> index db40b17c3b030..308a41eb4ba4c 100644
> --- a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c
> +++ b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c
> @@ -1,6 +1,7 @@
>  /* { dg-do compile } */
>  /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
> "-mcpu=*" } } */
>  /* { dg-options "-march=armv8-m.main+fp -mfloat-abi=softfp" } */
> +/* { dg-require-effective-target arm_arch_v8_1m_main_ok } */
>  
>  #if defined (__ARM_FEATURE_BTI)
>  #error "Feature test macro __ARM_FEATURE_BTI should not be defined."
> diff --git a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c 
> b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c
> index 1b25907635e24..10836a84bde56 100644
> --- a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c
> +++ b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c
> @@ -1,6 +1,7 @@
>  /* { dg-do compile } */
>  /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
> "-mcpu=*" } } */
>  /* { dg-additional-options "-march=armv8.1-m.main+pacbti+fp --save-temps 
> -mfloat-abi=hard" } */
> +/* {

Re: [PATCH 1/1] aarch64: Sync aarch64-sys-regs.def with Binutils

2024-03-20 Thread Richard Earnshaw (lists)

On 20/03/2024 11:21, Yury Khrustalev wrote:
> This patch updates `aarch64-sys-regs.def', bringing it into sync with
> the Binutils source.
> 
> gcc/ChangeLog:
> 
> * config/aarch64/aarch64-sys-regs.def: Copy from Binutils.

Thanks, I've pushed this.  It's trivial enough and there's value in keeping it 
in sync with binutils.

One comment though, there should be one hard tab before "* config/..."; you 
seem to have some other random characters there that looked like white space.

R.

> ---
>  gcc/config/aarch64/aarch64-sys-regs.def | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/gcc/config/aarch64/aarch64-sys-regs.def 
> b/gcc/config/aarch64/aarch64-sys-regs.def
> index 6a948171d6e..8b65673a5d6 100644
> --- a/gcc/config/aarch64/aarch64-sys-regs.def
> +++ b/gcc/config/aarch64/aarch64-sys-regs.def
> @@ -521,6 +521,7 @@
>SYSREG ("id_aa64isar0_el1",CPENC (3,0,0,6,0),  F_REG_READ, 
> AARCH64_NO_FEATURES)
>SYSREG ("id_aa64isar1_el1",CPENC (3,0,0,6,1),  F_REG_READ, 
> AARCH64_NO_FEATURES)
>SYSREG ("id_aa64isar2_el1",CPENC (3,0,0,6,2),  F_REG_READ, 
> AARCH64_NO_FEATURES)
> +  SYSREG ("id_aa64isar3_el1",CPENC (3,0,0,6,3),  F_REG_READ, 
> AARCH64_NO_FEATURES)
>SYSREG ("id_aa64mmfr0_el1",CPENC (3,0,0,7,0),  F_REG_READ, 
> AARCH64_NO_FEATURES)
>SYSREG ("id_aa64mmfr1_el1",CPENC (3,0,0,7,1),  F_REG_READ, 
> AARCH64_NO_FEATURES)
>SYSREG ("id_aa64mmfr2_el1",CPENC (3,0,0,7,2),  F_REG_READ, 
> AARCH64_NO_FEATURES)

Re: [PATCH] arm: [MVE intrinsics] Fix support for loads [PR target/114323]

2024-03-18 Thread Richard Earnshaw (lists)

On 15/03/2024 20:08, Christophe Lyon wrote:

The testcase in this PR shows that we would load from an uninitialized
location, because the vld1 intrinsics are reported as "const". This
is because function_instance::reads_global_state_p() does not take
CP_READ_MEMORY into account.  Fixing this gives vld1 the "pure"
attribute instead, and solves the problem.

2024-03-15  Christophe Lyon  

PR target/114323
gcc/
* config/arm/arm-mve-builtins.cc
(function_instance::reads_global_state_p): Take CP_READ_MEMORY
into account.

gcc/testsuite/
* gcc.target/arm/mve/pr114323.c: New.


OK.

R.


---
  gcc/config/arm/arm-mve-builtins.cc  |  2 +-
  gcc/testsuite/gcc.target/arm/mve/pr114323.c | 22 +
  2 files changed, 23 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/gcc.target/arm/mve/pr114323.c

diff --git a/gcc/config/arm/arm-mve-builtins.cc 
b/gcc/config/arm/arm-mve-builtins.cc
index 2f2c0f4a02a..6a5775c67e5 100644
--- a/gcc/config/arm/arm-mve-builtins.cc
+++ b/gcc/config/arm/arm-mve-builtins.cc
@@ -746,7 +746,7 @@ function_instance::reads_global_state_p () const
if (flags & CP_READ_FPCR)
  return true;
  
-  return false;

+  return flags & CP_READ_MEMORY;
  }
  
  /* Return true if calls to the function could modify some form of

diff --git a/gcc/testsuite/gcc.target/arm/mve/pr114323.c 
b/gcc/testsuite/gcc.target/arm/mve/pr114323.c
new file mode 100644
index 000..bd9127b886a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/pr114323.c
@@ -0,0 +1,22 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_mve_hw } */
+/* { dg-options "-O2" } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+
+#include <arm_mve.h>
+
+__attribute__((noipa))
+uint32x4_t foo (void) {
+  uint32x4_t V0 = vld1q_u32(((const uint32_t[4]){1, 2, 3, 4}));
+  return V0;
+}
+
+int main(void)
+{
+  uint32_t buf[4];
+ vst1q_u32 (buf, foo());
+
+  for (int i = 0; i < 4; i++)
+if (buf[i] != i+1)
+  __builtin_abort ();
+}
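
The const/pure distinction behind this fix can be shown with plain GCC function attributes, independent of MVE. The sketch below is an assumption-laden illustration, not code from the patch: `sum4_pure` and `sum_after_init` are hypothetical names. A function that reads memory through a pointer may be declared "pure" but not "const"; with "const" the compiler is allowed to assume the result does not depend on memory contents, so the stores that initialise the buffer could be treated as dead, which matches the uninitialized load described above.

```c
#include <assert.h>
#include <stdint.h>

/* 'pure': the result may depend on memory the argument points to, so
   stores that feed the call must be kept.  Marking this 'const'
   instead would be incorrect for a function that dereferences P.  */
__attribute__((pure, noinline))
uint32_t
sum4_pure (const uint32_t *p)
{
  return p[0] + p[1] + p[2] + p[3];
}

/* The initialisation of BUF must happen before the call because the
   callee is only 'pure', not 'const'.  */
uint32_t
sum_after_init (void)
{
  uint32_t buf[4] = { 1, 2, 3, 4 };
  return sum4_pure (buf);
}
```

This mirrors the shape of the new test, where the compound literal must actually be written to memory before the load intrinsic reads it.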

Re: [PATCH] testsuite: Turn errors back into warnings in arm/acle/cde-mve-error-2.c

2024-03-18 Thread Richard Earnshaw (lists)

On 15/03/2024 15:13, Thiago Jung Bauermann wrote:


Hello,

"Richard Earnshaw (lists)"  writes:


On 13/01/2024 20:46, Thiago Jung Bauermann wrote:

diff --git a/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c 
b/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c
index 5b7774825442..da283a06a54d 100644
--- a/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c
+++ b/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c
@@ -2,6 +2,7 @@

  /* { dg-do assemble } */
  /* { dg-require-effective-target arm_v8_1m_main_cde_mve_fp_ok } */
+/* { dg-options "-fpermissive" } */
  /* { dg-add-options arm_v8_1m_main_cde_mve_fp } */

  /* The error checking files are split since there are three kinds of
@@ -115,73 +116,73 @@ uint8x16_t test_bad_immediates (uint8x16_t n, uint8x16_t 
m, int someval,

/* `imm' is of wrong type.  */
accum += __arm_vcx1q_u8 (0, "");/* { dg-error {argument 
2 to '__builtin_arm_vcx1qv16qi' must be a constant immediate in range \[0-4095\]} } */
-  /* { dg-warning {passing argument 2 of '__builtin_arm_vcx1qv16qi' makes integer from 
pointer without a cast \[-Wint-conversion\]} "" { target *-*-* } 117 } */
+  /* { dg-warning {passing argument 2 of '__builtin_arm_vcx1qv16qi' makes integer from 
pointer without a cast \[-Wint-conversion\]} "" { target *-*-* } 118 } */


Absolute line numbers are a pain, but I think we can use '.-1' (without the
quotes) in these cases to minimize the churn.
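
As an illustration of the relative form, here is a minimal fragment that uses '.-1' in place of an absolute line number. It is a sketch with a deliberately simple diagnostic (an unused variable) rather than the int-conversion warning from the test under review, and `relative_line_demo` is a hypothetical name. The directive sits on the line after the construct that triggers the warning, and '.-1' tells DejaGnu to expect the diagnostic one line above the directive.

```c
/* { dg-do compile } */
/* { dg-options "-Wunused-variable" } */

int
relative_line_demo (void)
{
  int unused;
  /* { dg-warning {unused variable 'unused'} "" { target *-*-* } .-1 } */
  return 0;
}
```

Because the offset is relative, adding or removing lines earlier in the file, such as an extra dg-options line in the header, does not require renumbering every directive.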


That worked, thank you for the tip.


If that works, ok with that change.


I took the opportunity to request commit access to the GCC repo so that
I can commit the patch myself. Sorry for the delay. I'll commit it as
soon as I get it.

Thank you for the patch review! I'm including below the updated version.


I pushed this, thanks.

R.



--
Thiago


 From 78e70788da5ed849d7828b0219d3aa8955ad0fea Mon Sep 17 00:00:00 2001
From: Thiago Jung Bauermann 
Date: Sat, 13 Jan 2024 14:28:07 -0300
Subject: [PATCH v2] testsuite: Turn errors back into warnings in
  arm/acle/cde-mve-error-2.c
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Since commit 2c3db94d9fd ("c: Turn int-conversion warnings into
permerrors") the test fails with errors such as:

   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
32)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
33)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
34)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
35)
 ⋮
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 118 (test for 
warnings, line 117)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
119)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 120 (test for 
warnings, line 119)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
121)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 122 (test for 
warnings, line 121)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
123)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 124 (test for 
warnings, line 123)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
125)
 ⋮
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0  (test for excess errors)

There's a total of 1016 errors.  Here's a sample of the excess errors:

   Excess errors:
   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:117:31: error: passing argument 2 of '__builtin_arm_vcx1qv16qi' makes integer from pointer without a cast [-Wint-conversion]
   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:119:3: error: passing argument 3 of '__builtin_arm_vcx1qav16qi' makes integer from pointer without a cast [-Wint-conversion]
   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:121:3: error: passing argument 3 of '__builtin_arm_vcx2qv16qi' makes integer from pointer without a cast [-Wint-conversion]
   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:123:3: error: passing argument 3 of '__builtin_arm_vcx2qv16qi' makes integer from pointer without a cast [-Wint-conversion]

The test expects these messages to be warnings, not errors.  My first try
was to change it to expect them as errors instead.  This didn't work, IIUC
because the error prevents the compiler from continuing processing the file
and thus other errors which are expected by the test don't get emitted.

Therefore, add -fpermissive so that the test behaves as it did previously.
Because of the additional line in the header, the line numbers of the
expected warnings don't match anymore so replace them with ".-1" as
suggested by Richard Earnshaw.

Tested on armv8l-linux-gnueabihf.

gcc/testsuite/ChangeLog:
* gcc.target/arm/acle/cde-mve-error-

Re: [PATCH] aarch64: Fix TImode __sync_*_compare_and_exchange expansion with LSE [PR114310]

2024-03-14 Thread Richard Earnshaw (lists)





On 14/03/2024 08:37, Jakub Jelinek wrote:

Hi!

The following testcase ICEs with LSE atomics.
The problem is that the @atomic_compare_and_swap expander uses
aarch64_reg_or_zero predicate for the desired operand, which is fine,
given that for most of the modes and even for TImode in some cases
it can handle zero immediate just fine, but the TImode
@aarch64_compare_and_swap_lse just uses register_operand for
that operand instead, again intentionally so, because the casp,
caspa, caspl and caspal instructions need to use a pair of consecutive
registers for the operand and xzr is just one register and we can't
just store zero into the link register to emulate pair of zeros.

So, the following patch fixes that by forcing the newval operand into
a register for the TImode LSE case.

Bootstrapped/regtested on aarch64-linux, ok for trunk?


An alternative fix would be to use a mode_attr to pick a different 
predicate for TImode.  But that's probably just a matter of taste; I'm 
not sure that one would be better than the other in reality.


OK (or with my suggestion if you prefer).

R.



2024-03-14  Jakub Jelinek  

PR target/114310
* config/aarch64/aarch64.cc (aarch64_expand_compare_and_swap): For
TImode force newval into a register.

* gcc.dg/pr114310.c: New test.

--- gcc/config/aarch64/aarch64.cc.jj  2024-03-12 10:16:12.024101665 +0100
+++ gcc/config/aarch64/aarch64.cc   2024-03-13 18:55:39.147986554 +0100
@@ -24693,6 +24693,8 @@ aarch64_expand_compare_and_swap (rtx ope
  rval = copy_to_mode_reg (r_mode, oldval);
else
emit_move_insn (rval, gen_lowpart (r_mode, oldval));
+  if (mode == TImode)
+   newval = force_reg (mode, newval);
  
emit_insn (gen_aarch64_compare_and_swap_lse (mode, rval, mem,

   newval, mod_s));
--- gcc/testsuite/gcc.dg/pr114310.c.jj  2024-03-13 19:09:25.322597418 +0100
+++ gcc/testsuite/gcc.dg/pr114310.c 2024-03-13 19:08:50.802073314 +0100
@@ -0,0 +1,20 @@
+/* PR target/114310 */
+/* { dg-do run { target int128 } } */
+
+volatile __attribute__((aligned (sizeof (__int128_t)))) __int128_t v = 10;
+
+int
+main ()
+{
+#if __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
+  if (__sync_val_compare_and_swap (&v, (__int128_t) 10, (__int128_t) 0) != 10)
+    __builtin_abort ();
+  if (__sync_val_compare_and_swap (&v, (__int128_t) 10, (__int128_t) 15) != 0)
+    __builtin_abort ();
+  if (__sync_val_compare_and_swap (&v, (__int128_t) 0, (__int128_t) 42) != 0)
+    __builtin_abort ();
+  if (__sync_val_compare_and_swap (&v, (__int128_t) 31, (__int128_t) 35) != 42)
+    __builtin_abort ();
+#endif
+  return 0;
+}

Jakub
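
The return values the testcase expects follow from the semantics of
__sync_val_compare_and_swap: it returns the previous contents of the object
and stores the new value only when those contents equal the expected value.
A minimal, non-atomic model of that sequence is sketched here; model_val_cas
and replay_pr114310_sequence are hypothetical helpers, and long long is an
assumed stand-in for __int128_t so that the sketch builds on any target.

```c
/* Non-atomic model of __sync_val_compare_and_swap, for illustration only.
   model_val_cas is a hypothetical helper, not a GCC builtin, and it uses
   long long as a stand-in for __int128_t.  */
long long
model_val_cas (long long *ptr, long long expected, long long desired)
{
  long long old = *ptr;
  if (old == expected)
    *ptr = desired;
  return old;
}

/* Replays the four steps of the testcase starting from v = 10 and returns
   the number of steps whose result differs from what the test expects.  */
int
replay_pr114310_sequence (void)
{
  long long v = 10;
  int mismatches = 0;

  mismatches += model_val_cas (&v, 10, 0) != 10;   /* succeeds, v becomes 0 */
  mismatches += model_val_cas (&v, 10, 15) != 0;   /* fails, v stays 0 */
  mismatches += model_val_cas (&v, 0, 42) != 0;    /* succeeds, v becomes 42 */
  mismatches += model_val_cas (&v, 31, 35) != 42;  /* fails, v stays 42 */
  return mismatches;
}
```

The first step is the one that passes zero as the new value, which is the
operand the patch forces into a register for TImode.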

Re: [PATCH v2] [testsuite] Fixup dg-options in {gcc, g++, gfortran}.dg/vect.exp tests

2024-03-13 Thread Richard Earnshaw





On 13/03/2024 12:12, Maxim Kuvyrkov wrote:

Changes in v2:
- Better changelog entry.
- NFC.


This patch has been tested on
- aarch64-linux-gnu
- arm-linux-gnueabihf (VFP, NEON disabled by default),
- arm-none-eabi (Soft-FP)
with the following [expected] differences in the test results:

   - FAIL now PASS [FAIL => PASS]:
   Executed from: gcc:gcc.dg/vect/vect.exp
 gcc:gcc.dg/vect/vect.exp=gcc.dg/vect/pr113576.c (test for excess errors)
 gcc:gcc.dg/vect/vect.exp=gcc.dg/vect/pr113576.c -flto -ffat-lto-objects (test for excess errors)

   - UNSUPPORTED disappears[UNSUP=> ]:
   Executed from: g++:g++.dg/vect/vect.exp
 g++:g++.dg/vect/vect.exp=g++.dg/vect/pr84556.cc  -std=gnu++98

   - UNSUPPORTED appears   [ =>UNSUP]:
   Executed from: g++:g++.dg/vect/vect.exp
 g++:g++.dg/vect/vect.exp=g++.dg/vect/pr84556.cc  -std=c++98

   - UNRESOLVED disappears [UNRES=> ]:
   Executed from: gcc:gcc.dg/vect/vect.exp
 gcc:gcc.dg/vect/vect.exp=gcc.dg/vect/pr113576.c -flto -ffat-lto-objects compilation failed to produce executable
 gcc:gcc.dg/vect/vect.exp=gcc.dg/vect/pr113576.c compilation failed to produce executable

This patch was motivated by gcc.dg/vect/pr113576.c, which currently
fails to compile for ARM targets without NEON.

=== CUT ===

Testsuites driven by vect.exp rely on check_vect_support_and_set_flags
to set appropriate DEFAULT_VECTFLAGS for a given target (e.g., add
-mfpu=neon for arm-linux-gnueabi).  Unfortunately, these flags are
overwritten by dg-options directive, which can cause tests to fail.

Behavior of dg-options is documented in vect.exp files, but not
all developers look at the .exp file when adding a new testcase.
This caused a few dg-options directives to be used instead of
the more appropriate dg-additional-options.

This patch changes target-independent dg-options into
dg-additional-options.  This patch does not touch target-specific
dg-options and target-specific tests to avoid disturbing the gentle
balance of target-specific vectorization.

This patch also removes a couple of unneeded "dg-do run" directives
to avoid failures on compile-only targets.  Default action is, again,
set by check_vect_support_and_set_flags.

Lastly, I avoided renaming tests that use -O options to O-*
filename format because this support is not consistent between
gcc.dg/vect/, g++.dg/vect/, and gfortran.dg/vect/ testsuites.
It seems dg-additional-options is cleaner.

This patch does the following,
1. do not change target-specific tests, e.g., gcc.dg/vect/costmodel/riscv/*;
2. do not change { dg-options FOO { target { target-*-pattern } } };
3. do not remove { dg-do run { target { target-*-pattern } } };
4. change { dg-options FOO } to { dg-additional-options FOO };
5. remove { dg-do run } in several tests, where it is clearly not needed.
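
As a minimal sketch of item 4, assuming a hypothetical testcase placed under
gcc.dg/vect/ (the file contents, the function name sum_array and the choice
of -O3 as the test-specific flag are all illustrative, not taken from the
patch):

```c
/* Hypothetical vect.exp testcase, for illustration only.  */
/* { dg-additional-options "-O3" } */
/* The directive above appends -O3 to the flags chosen by
   check_vect_support_and_set_flags.  Writing dg-options "-O3" instead
   would overwrite those flags, e.g. dropping -mfpu=neon on Arm.  */

#define N 64

/* Simple reduction loop of the kind a vectorizer test might contain.  */
int
sum_array (const int *a)
{
  int sum = 0;
  for (int i = 0; i < N; i++)
    sum += a[i];
  return sum;
}
```

Outside the testsuite the directive is just a comment, so the file still
compiles as ordinary C.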

gcc/testsuite/ChangeLog:

PR testsuite/114307
* gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c: Remove dg-run.
* gcc.dg/vect/complex/complex-operations-run.c: Likewise.
* gcc.dg/vect/pr113576.c: Remove dg-run.  Use dg-additional-options for
test-specific flags.
* gcc.dg/vect/gimplefe-40.c: Use dg-additional-options for
test-specific flags.
* gcc.dg/vect/gimplefe-41.c: Likewise.
* gcc.dg/vect/pr101145inf.c: Likewise.
* gcc.dg/vect/pr101145inf_1.c: Likewise.
* gcc.dg/vect/pr108316.c: Likewise.
* gcc.dg/vect/pr109011-1.c: Likewise.
* gcc.dg/vect/pr109011-2.c: Likewise.
* gcc.dg/vect/pr109011-3.c: Likewise.
* gcc.dg/vect/pr109011-4.c: Likewise.
* gcc.dg/vect/pr109011-5.c: Likewise.
* gcc.dg/vect/pr111846.c: Likewise.
* gcc.dg/vect/pr111860-2.c: Likewise.
* gcc.dg/vect/pr111860-3.c: Likewise.
* gcc.dg/vect/pr113002.c: Likewise.
* gcc.dg/vect/pr84711.c: Likewise.
* gcc.dg/vect/pr85597.c: Likewise.
* gcc.dg/vect/pr88497-1.c: Likewise.
* gcc.dg/vect/pr88497-2.c: Likewise.
* gcc.dg/vect/pr88497-3.c: Likewise.
* gcc.dg/vect/pr88497-4.c: Likewise.
* gcc.dg/vect/pr88497-5.c: Likewise.
* gcc.dg/vect/pr88497-7.c: Likewise.
* gcc.dg/vect/pr92347.c: Likewise.
* gcc.dg/vect/pr93069.c: Likewise.
* gcc.dg/vect/pr97241.c: Likewise.
* gcc.dg/vect/pr99102.c: Likewise.
* gcc.dg/vect/vect-early-break_65.c: Likewise.
* gcc.dg/vect/vect-fold-1.c: Likewise.
* gcc.dg/vect/vect-ifcvt-19.c: Likewise.
* gcc.dg/vect/vect-ifcvt-20.c: Likewise.
* gcc.dg/vect/vect-reduc-epilogue-gaps.c: Likewise.
* gcc.dg/vect/vect-singleton_1.c: Likewise.
* g++.dg/vect/pr84556.cc: Likewise.
* gfortran.dg/vect/fast-math-mgrid-resid.f: Likewise.
* gfortran.dg/vect/pr77848.f: Likewise.
* gfortran.dg/vect/pr90913.f90: Likewise.


OK.

(I wonder how many of the target-specific additional options are

Re: [PATCH][GCC] aarch64: Fix SCHEDULER_IDENT for Cortex-A520

2024-03-13 Thread Richard Earnshaw





On 12/03/2024 14:08, Richard Ball wrote:

The SCHEDULER_IDENT for this CPU was incorrectly
set to cortexa55. This can cause sub-optimal asm
to be generated.

Ok for trunk?

gcc/ChangeLog:
PR target/114272
* config/aarch64/aarch64-cores.def (AARCH64_CORE):
Change SCHEDULER_IDENT from cortexa55 to cortexa53
for Cortex-A520.


I don't see having this as a separate patch to the one for Cortex-A510 
as having any value.


Please merge the two together.  A merged patch is pre-approved.

R.

Re: [PATCH] [testsuite] Fixup dg-options in {gcc, g++, gfortran}.dg/vect.exp tests

2024-03-13 Thread Richard Earnshaw





On 13/03/2024 10:58, Maxim Kuvyrkov wrote:

This patch has been tested on
- aarch64-linux-gnu
- arm-linux-gnueabihf (VFP, NEON disabled by default),
- arm-none-eabi (Soft-FP)
with the following [expected] differences in the test results:

   - FAIL now PASS [FAIL => PASS]:
   Executed from: gcc:gcc.dg/vect/vect.exp
 gcc:gcc.dg/vect/vect.exp=gcc.dg/vect/pr113576.c (test for excess errors)
 gcc:gcc.dg/vect/vect.exp=gcc.dg/vect/pr113576.c -flto -ffat-lto-objects (test for excess errors)

   - UNSUPPORTED disappears[UNSUP=> ]:
   Executed from: g++:g++.dg/vect/vect.exp
 g++:g++.dg/vect/vect.exp=g++.dg/vect/pr84556.cc  -std=gnu++98

   - UNSUPPORTED appears   [ =>UNSUP]:
   Executed from: g++:g++.dg/vect/vect.exp
 g++:g++.dg/vect/vect.exp=g++.dg/vect/pr84556.cc  -std=c++98

   - UNRESOLVED disappears [UNRES=> ]:
   Executed from: gcc:gcc.dg/vect/vect.exp
 gcc:gcc.dg/vect/vect.exp=gcc.dg/vect/pr113576.c -flto -ffat-lto-objects compilation failed to produce executable
 gcc:gcc.dg/vect/vect.exp=gcc.dg/vect/pr113576.c compilation failed to produce executable

This patch was motivated by gcc.dg/vect/pr113576.c, which currently
fails to compile for ARM targets without NEON.

=== CUT ===

Testsuites driven by vect.exp rely on check_vect_support_and_set_flags
to set appropriate DEFAULT_VECTFLAGS for a given target (e.g., add
-mfpu=neon for arm-linux-gnueabi).  Unfortunately, these flags are
overwritten by dg-options directive, which can cause tests to fail.

Behavior of dg-options is documented in vect.exp files, but not
all developers look at the .exp file when adding a new testcase.
This caused a few dg-options directives to be used instead of
the more appropriate dg-additional-options.

This patch changes target-independent dg-options into
dg-additional-options.  This patch does not touch target-specific
dg-options and target-specific tests to avoid disturbing the gentle
balance of target-specific vectorization.

This patch also removes a couple of unneeded "dg-do run" directives
to avoid failures on compile-only targets.  Default action is, again,
set by check_vect_support_and_set_flags.

Lastly, I avoided renaming tests that use -O options to O-*
filename format because this support is not consistent between
gcc.dg/vect/, g++.dg/vect/, and gfortran.dg/vect/ testsuites.
It seems dg-additional-options is cleaner.

This patch does the following,
1. do not change target-specific tests, e.g., gcc.dg/vect/costmodel/riscv/*;
2. do not change { dg-options FOO { target { target-*-pattern } } };
3. do not remove { dg-do run { target { target-*-pattern } } };
4. change { dg-options FOO } to { dg-additional-options FOO };
5. remove { dg-do run } in several tests, where it is clearly not needed.

gcc/testsuite/ChangeLog:

PR testsuite/114307
* g++.dg/vect/pr84556.cc: Fixup.
* gcc.dg/vect/complex/complex-operations-run.c Fixup.
* gcc.dg/vect/gimplefe-40.c Fixup.
* gcc.dg/vect/gimplefe-41.c Fixup.
* gcc.dg/vect/pr101145inf.c Fixup.
* gcc.dg/vect/pr101145inf_1.c Fixup.
* gcc.dg/vect/pr108316.c Fixup.
* gcc.dg/vect/pr109011-1.c Fixup.
* gcc.dg/vect/pr109011-2.c Fixup.
* gcc.dg/vect/pr109011-3.c Fixup.
* gcc.dg/vect/pr109011-4.c Fixup.
* gcc.dg/vect/pr109011-5.c Fixup.
* gcc.dg/vect/pr111846.c Fixup.
* gcc.dg/vect/pr111860-2.c Fixup.
* gcc.dg/vect/pr111860-3.c Fixup.
* gcc.dg/vect/pr113002.c Fixup.
* gcc.dg/vect/pr113576.c Fixup.
* gcc.dg/vect/pr84711.c Fixup.
* gcc.dg/vect/pr85597.c Fixup.
* gcc.dg/vect/pr88497-1.c Fixup.
* gcc.dg/vect/pr88497-2.c Fixup.
* gcc.dg/vect/pr88497-3.c Fixup.
* gcc.dg/vect/pr88497-4.c Fixup.
* gcc.dg/vect/pr88497-5.c Fixup.
* gcc.dg/vect/pr88497-7.c Fixup.
* gcc.dg/vect/pr92347.c Fixup.
* gcc.dg/vect/pr93069.c Fixup.
* gcc.dg/vect/pr97241.c Fixup.
* gcc.dg/vect/pr99102.c Fixup.
* gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c Fixup.
* gcc.dg/vect/vect-early-break_65.c Fixup.
* gcc.dg/vect/vect-fold-1.c Fixup.
* gcc.dg/vect/vect-ifcvt-19.c Fixup.
* gcc.dg/vect/vect-ifcvt-20.c Fixup.
* gcc.dg/vect/vect-reduc-epilogue-gaps.c Fixup.
* gcc.dg/vect/vect-singleton_1.c Fixup.
* gfortran.dg/vect/fast-math-mgrid-resid.f Fixup.
* gfortran.dg/vect/pr77848.f Fixup.
* gfortran.dg/vect/pr90913.f90 Fixup.


Thanks for looking into this, I agree that changing to 
dg-additional-options looks the right choice.


The only thing to be wary of is that later 'dg-options' directives may 
override dg-additional-options directives; you might want to test at 
least one target where there are target-specific dg-options that you've 
not modified.


The patch is OK, but the ChangeLog is not!  Fixup doesn't

[COMMITTED] arm: testsuite: tweak bics_3.c [PR113542]

2024-03-08 Thread Richard Earnshaw


This test was too simple, which meant that the compiler was sometimes
able to find a better optimization of the code than using a BICS
instruction.  Fix this by changing the test slightly to produce a
sequence where BICS should always be the preferred solution.

gcc/testsuite:
PR target/113542
* gcc.target/arm/bics_3.c: Adjust code to something which should
always result in BICS.
---
 gcc/testsuite/gcc.target/arm/bics_3.c | 19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/bics_3.c b/gcc/testsuite/gcc.target/arm/bics_3.c
index e056b264e15..4d6938948a1 100644
--- a/gcc/testsuite/gcc.target/arm/bics_3.c
+++ b/gcc/testsuite/gcc.target/arm/bics_3.c
@@ -2,13 +2,11 @@
 /* { dg-options "-O2 --save-temps -fno-inline" } */
 /* { dg-require-effective-target arm32 } */
 
-extern void abort (void);
-
 int
 bics_si_test (int a, int b)
 {
-  if (a & ~b)
-    return 1;
+  if ((a & ~b) >= 0)
+    return 3;
   else
     return 0;
 }
@@ -16,8 +14,8 @@ bics_si_test (int a, int b)
 int
 bics_si_test2 (int a, int b)
 {
-  if (a & ~ (b << 2))
-    return 1;
+  if ((a & ~ (b << 2)) >= 0)
+    return 3;
   else
     return 0;
 }
@@ -28,13 +26,12 @@ main (void)
   int a = 5;
   int b = 5;
   int c = 20;
-  if (bics_si_test (a, b))
-    abort ();
-  if (bics_si_test2 (c, b))
-    abort ();
+  if (bics_si_test (a, b) != 3)
+    __builtin_abort ();
+  if (bics_si_test2 (c, b) != 3)
+    __builtin_abort ();
   return 0;
 }
 
 /* { dg-final { scan-assembler-times "bics\tr\[0-9\]+, r\[0-9\]+, r\[0-9\]+" 2 } } */
 /* { dg-final { scan-assembler-times "bics\tr\[0-9\]+, r\[0-9\]+, r\[0-9\]+, .sl #2" 1 } } */
-
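
A quick standalone check of the adjusted test logic, assuming the same
inputs as main () in the patch (a = 5, b = 5, c = 20).  Both masked values
come out as zero, so the ">= 0" branch is taken and each function returns 3.
The two functions are copied from the patched test; everything else about
how they would be driven is an assumption of this sketch.

```c
/* Functions as they appear in the adjusted bics_3.c.  */
int
bics_si_test (int a, int b)
{
  /* 5 & ~5 == 0, which is >= 0.  */
  if ((a & ~b) >= 0)
    return 3;
  else
    return 0;
}

int
bics_si_test2 (int a, int b)
{
  /* With a = 20 and b = 5: b << 2 == 20 and 20 & ~20 == 0.  */
  if ((a & ~ (b << 2)) >= 0)
    return 3;
  else
    return 0;
}
```

Because the condition now depends on the sign of the AND-NOT result rather
than on whether it is non-zero, a negative result such as -1 & ~0 selects
the other branch and returns 0.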

Re: [PATCH] arm: fix c23 0-named-args caller-side stdarg

2024-03-07 Thread Richard Earnshaw (lists)

On 06/03/2024 20:28, Alexandre Oliva wrote:
> On Mar  1, 2024, "Richard Earnshaw (lists)"  wrote:
> 
>> On 01/03/2024 04:38, Alexandre Oliva wrote:
>>> Thanks for the review.
> 
>> For closure, Jakub has just pushed a patch to the generic code, so I
>> don't think we need this now.
> 
> ACK.  I see the c2x-stdarg-4.c test is now passing in our arm-eabi
> gcc-13 tree.  Thank you all.
> 
> Alas, the same nightly build showed a new riscv fail in c23-stdarg-6.c,
> that also got backported to gcc-13.  Presumably it's failing in the
> trunk as well, both riscv32-elf and riscv64-elf.
> 
> I haven't looked into whether it's a regression brought about by the
> patch or just a new failure mode that the new test exposed.  Either way,
> I'm not sure whether to link this new failure to any of the associated
> PRs or to file a new one, but, FTR, I'm going to look into it.
> 

I'd suggest a new pr.  It's easier to track than re-opening an existing one.

R.

> -- 
> Alexandre Oliva, happy hacker    https://FSFLA.org/blogs/lxo/
>    Free Software Activist   GNU Toolchain Engineer
> More tolerance and less prejudice are key for inclusion and diversity
> Excluding neuro-others for not behaving "normal" is *not* inclusive

[PATCH] gomp: testsuite: improve compatibility of bad-array-section-3.c [PR113428]

2024-03-06 Thread Richard Earnshaw


This test generates different warnings on ilp32 targets because the size
of an integer matches the size of a pointer.  Avoid this by using
signed char.

gcc/testsuite:

PR testsuite/113428
* gcc.dg/gomp/bad-array-section-c-3.c: Use signed char instead
of int.
---

I think this fixes the issues seen on ilp32 machines, without substantially
changing what the test does, but a second set of eyes wouldn't hurt.

 gcc/testsuite/gcc.dg/gomp/bad-array-section-c-3.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/gomp/bad-array-section-c-3.c b/gcc/testsuite/gcc.dg/gomp/bad-array-section-c-3.c
index 8be15ced8c0..431af71c422 100644
--- a/gcc/testsuite/gcc.dg/gomp/bad-array-section-c-3.c
+++ b/gcc/testsuite/gcc.dg/gomp/bad-array-section-c-3.c
@@ -1,15 +1,15 @@
 /* { dg-do compile } */
 
 struct S {
-  int *ptr;
+  signed char *ptr;
 };
 
 int main()
 {
-  int arr[20];
+  signed char arr[20];
 
   /* Reject array section in compound initialiser.  */
-#pragma omp target map( (struct S) { .ptr = (int *) arr[5:5] } )
+#pragma omp target map( (struct S) { .ptr = (signed char *) arr[5:5] } )
 /* { dg-error {expected '\]' before ':' token} "" { target *-*-* } .-1 } */
 /* { dg-warning {cast to pointer from integer of different size} "" { target *-*-* } .-2 } */
 /* { dg-message {sorry, unimplemented: unsupported map expression} "" { target *-*-* } .-3 } */
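
To illustrate why the element type matters here: the "cast to pointer from
integer of different size" warning depends on whether the value being cast
has the same width as a pointer.  A minimal sketch, using a hypothetical
helper named differs_from_pointer_size that is not part of the test:

```c
#include <stddef.h>

/* Returns nonzero when a value of VALUE_SIZE bytes has a different width
   from a pointer, which is the situation the warning describes.
   Hypothetical helper, for illustration only.  */
int
differs_from_pointer_size (size_t value_size)
{
  return value_size != sizeof (void *);
}

/* On an ilp32 target sizeof (int) == sizeof (void *), so an int element
   does not differ; on lp64 it does.  */
int
int_element_differs (void)
{
  return differs_from_pointer_size (sizeof (int));
}

/* A signed char element is one byte and so differs from the pointer size
   on both kinds of target, assuming pointers are wider than one byte.  */
int
signed_char_element_differs (void)
{
  return differs_from_pointer_size (sizeof (signed char));
}
```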

Re: [PATCH] arm: Support -mfdpic for more targets

2024-03-06 Thread Richard Earnshaw (lists)

On 06/03/2024 05:07, Fangrui Song wrote:
> On Fri, Feb 23, 2024 at 7:33 PM Fangrui Song  wrote:
>>
>> From: Fangrui Song 
>>
>> Targets that are not arm*-*-uclinuxfdpiceabi can use -S -mfdpic, but -c
>> -mfdpic does not pass --fdpic to gas.  This is an unnecessary
>> restriction.  Just define the ASM_SPEC in bpabi.h.
>>
>> Additionally, use armelf[b]_linux_fdpiceabi emulations for -mfdpic in
>> linux-eabi.h.  This will allow a future musl fdpic port to use the
>> desired BFD emulation.
>>
>> gcc/ChangeLog:
>>
>> * config/arm/bpabi.h (TARGET_FDPIC_ASM_SPEC): Transform -mfdpic.
>> * config/arm/linux-eabi.h (TARGET_FDPIC_LINKER_EMULATION): Define.
>> (SUBTARGET_EXTRA_LINK_SPEC): Use TARGET_FDPIC_LINKER_EMULATION
>> if -mfdpic.
>> ---
>>  gcc/config/arm/bpabi.h  | 2 +-
>>  gcc/config/arm/linux-eabi.h | 5 ++++-
>>  2 files changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/config/arm/bpabi.h b/gcc/config/arm/bpabi.h
>> index 7a279f3ed3c..6778be1a8bf 100644
>> --- a/gcc/config/arm/bpabi.h
>> +++ b/gcc/config/arm/bpabi.h
>> @@ -55,7 +55,7 @@
>>  #define TARGET_FIX_V4BX_SPEC " %{mcpu=arm8|mcpu=arm810|mcpu=strongarm*"\
>>    "|march=armv4|mcpu=fa526|mcpu=fa626:--fix-v4bx}"
>>
>> -#define TARGET_FDPIC_ASM_SPEC ""
>> +#define TARGET_FDPIC_ASM_SPEC "%{mfdpic: --fdpic}"
>>
>>  #define BE8_LINK_SPEC  \
>>    "%{!r:%{!mbe32:%:be8_linkopt(%{mlittle-endian:little}"   \
>> diff --git a/gcc/config/arm/linux-eabi.h b/gcc/config/arm/linux-eabi.h
>> index eef791f6a02..0c5c58e4928 100644
>> --- a/gcc/config/arm/linux-eabi.h
>> +++ b/gcc/config/arm/linux-eabi.h
>> @@ -46,12 +46,15 @@
>>  #undef  TARGET_LINKER_EMULATION
>>  #if TARGET_BIG_ENDIAN_DEFAULT
>>  #define TARGET_LINKER_EMULATION "armelfb_linux_eabi"
>> +#define TARGET_FDPIC_LINKER_EMULATION "armelfb_linux_fdpiceabi"
>>  #else
>>  #define TARGET_LINKER_EMULATION "armelf_linux_eabi"
>> +#define TARGET_FDPIC_LINKER_EMULATION "armelf_linux_fdpiceabi"
>>  #endif
>>
>>  #undef  SUBTARGET_EXTRA_LINK_SPEC
>> -#define SUBTARGET_EXTRA_LINK_SPEC " -m " TARGET_LINKER_EMULATION
>> +#define SUBTARGET_EXTRA_LINK_SPEC " -m %{mfdpic: " \
>> +  TARGET_FDPIC_LINKER_EMULATION ";:" TARGET_LINKER_EMULATION "}"
>>
>>  /* GNU/Linux on ARM currently supports three dynamic linkers:
>> - ld-linux.so.2 - for the legacy ABI
>> --
>> 2.44.0.rc1.240.g4c46232300-goog
>>
> 
> Ping:)
> 

We're in stage4 at present and this is new material.  I'll look at it after the 
branch has been cut.

R.

> 
> -- 
> 宋方睿

[PATCH] arm: check for low register before applying peephole [PR113510]

2024-03-05 Thread Richard Earnshaw


For thumb1, when using a peephole to fuse

mov reg, #const
add reg, reg, SP

into

add reg, SP, #const

we must first check that reg is a low register, otherwise we will ICE
when trying to recognize the resulting insn.

gcc/ChangeLog:

PR target/113510
* config/arm/thumb1.md (peephole2 to fuse mov imm/add SP): Use
low_register_operand.
---

This appears to have gone latent again, but checked against the known
failing version.

 gcc/config/arm/thumb1.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
index 14d6df580af..d7074b43f60 100644
--- a/gcc/config/arm/thumb1.md
+++ b/gcc/config/arm/thumb1.md
@@ -113,7 +113,7 @@ (define_insn_and_split "*thumb1_addsi3"
 ;; Reloading and elimination of the frame pointer can
 ;; sometimes cause this optimization to be missed.
 (define_peephole2
-  [(set (match_operand:SI 0 "arm_general_register_operand" "")
+  [(set (match_operand:SI 0 "low_register_operand" "")
 	(match_operand:SI 1 "const_int_operand" ""))
(set (match_dup 0)
 	(plus:SI (match_dup 0) (reg:SI SP_REGNUM)))]
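
The condition the new predicate enforces can be sketched in C.  Thumb-1 low
registers are r0-r7; the helper names and the plain 0-15 register numbering
below are assumptions for illustration and are not GCC's implementation of
low_register_operand.

```c
#include <stdbool.h>

/* Hypothetical numbering for this sketch: r0-r15 map to 0-15.  */
enum { SKETCH_LAST_LOW_REGNUM = 7 };

/* True for the Thumb-1 low registers r0-r7.  */
bool
sketch_is_low_register (unsigned regno)
{
  return regno <= SKETCH_LAST_LOW_REGNUM;
}

/* The fused form "add reg, SP, #const" is only usable when the
   destination is a low register, so the peephole must not fire
   otherwise.  */
bool
sketch_can_fuse_mov_add_sp (unsigned dest_regno)
{
  return sketch_is_low_register (dest_regno);
}
```

With the old arm_general_register_operand predicate a high register such as
r8 could match, which is the case that led to the ICE described above.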

Re: [PATCH v2] testsuite, arm: Fix testcase arm/pr112337.c to check for the options first

2024-03-05 Thread Richard Earnshaw (lists)

On 19/02/2024 10:11, Saurabh Jha wrote:
> 
> On 2/9/2024 2:57 PM, Richard Earnshaw (lists) wrote:
>> On 30/01/2024 17:07, Saurabh Jha wrote:
>>> Hey,
>>>
>>> Previously, this test was added to fix this bug: 
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112337. However, it did not 
>>> check the compilation options before using them, leading to errors.
>>>
>>> This patch fixes the test by first checking whether it can use the options 
>>> before using them.
>>>
>>> Tested for arm-none-eabi and found no regressions. The output of check-gcc 
>>> with RUNTESTFLAGS="arm.exp=*" changed like this:
>>>
>>> Before:
>>> # of expected passes  5963
>>> # of unexpected failures  64
>>>
>>> After:
>>> # of expected passes  5964
>>> # of unexpected failures  63
>>>
>>> Ok for master?
>>>
>>> Regards,
>>> Saurabh
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>>  * gcc.target/arm/pr112337.c: Check whether we can use the 
>>> compilation options before using them.
>> My apologies for missing this earlier.  It didn't show up in patchwork. 
>> That's most likely because the attachment is a binary blob instead of 
>> text/plain.  That also means that the Linaro CI system hasn't seen this 
>> patch either.  Please can you fix your mailer to add plain text patch files.
>>
>> -/* { dg-options "-O2 -march=armv8.1-m.main+fp.dp+mve.fp -mfloat-abi=hard" } */
>> +/* { dg-require-effective-target arm_hard_ok } */
>> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
>> +/* { dg-options "-O2 -mfloat-abi=hard" } */
>> +/* { dg-add-options arm_v8_1m_mve } */
>>
>> This is moving in the right direction, but it adds more than necessary now: 
>> checking for, and adding -mfloat-abi=hard is not necessary any more as 
>> arm_v8_1m_mve_ok will work out what float-abi flags are needed to make the 
>> options work. (What's more, it will prevent the test from running if the 
>> base configuration of the compiler is incompatible with the hard float ABI, 
>> which is more than we need.)
>>
>> So please can you re-spin removing the hard-float check and removing that 
>> from dg-options.
>>
>> Thanks,
>> R.
> 
> Hi Richard,
> 
> Agreed with your comments. Please find the patch with the suggested changes 
> attached.
> 
> Regards,
> 
> Saurabh
> 


Thanks, I've pushed this.  Next time, please can you put the commit message 
inside the patch, so that I can apply things automatically.  Eg: 

From 1c92c94074449929f40cea99a6450bcde3aec12f Mon Sep 17 00:00:00 2001
From: Saurabh Jha 
Date: Tue, 30 Jan 2024 15:03:36 +
Subject: [PATCH] Fix testcase pr112337.c to check the options [PR112337]

gcc.target/arm/pr112337.c was failing to validate that adding MVE options
was compatible with the test environment, so add the missing checks.

gcc/testsuite/ChangeLog:

PR target/112337
* gcc.target/arm/pr112337.c: Check for, then use the right MVE
options.

---
 gcc/testsuite/gcc.target/arm/pr112337.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/pr112337.c b/gcc/testsuite/gcc.target/arm/pr112337.c

...

Re: [PATCH v4] aarch64,arm: Move branch-protection data to targets

2024-03-01 Thread Richard Earnshaw (lists)

On 11/01/2024 14:35, Szabolcs Nagy wrote:
> The branch-protection types are target specific, not the same on arm
> and aarch64.  This currently affects pac-ret+b-key, but there will be
> a new type on aarch64 that is not relevant for arm.
> 
> After the move, change aarch_ identifiers to aarch64_ or arm_ as
> appropriate.
> 
> Refactor aarch_validate_mbranch_protection to take the target specific
> branch-protection types as an argument.
> 
> In case of invalid input currently no hints are provided: the way
> branch-protection types and subtypes can be mixed makes it difficult
> without causing confusion.
> 
> gcc/ChangeLog:
> 
>   * config/aarch64/aarch64.md: Rename aarch_ to aarch64_.
>   * config/aarch64/aarch64.opt: Likewise.
>   * config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Likewise.
>   * config/aarch64/aarch64.cc (aarch64_expand_prologue): Likewise.
>   (aarch64_expand_epilogue): Likewise.
>   (aarch64_post_cfi_startproc): Likewise.
>   (aarch64_handle_no_branch_protection): Copy and rename.
>   (aarch64_handle_standard_branch_protection): Likewise.
>   (aarch64_handle_pac_ret_protection): Likewise.
>   (aarch64_handle_pac_ret_leaf): Likewise.
>   (aarch64_handle_pac_ret_b_key): Likewise.
>   (aarch64_handle_bti_protection): Likewise.
>   (aarch64_override_options): Update branch protection validation.
>   (aarch64_handle_attr_branch_protection): Likewise.
>   * config/arm/aarch-common-protos.h (aarch_validate_mbranch_protection):
>   Pass branch protection type description as argument.
>   (struct aarch_branch_protect_type): Move from aarch-common.h.
>   * config/arm/aarch-common.cc (aarch_handle_no_branch_protection):
>   Remove.
>   (aarch_handle_standard_branch_protection): Remove.
>   (aarch_handle_pac_ret_protection): Remove.
>   (aarch_handle_pac_ret_leaf): Remove.
>   (aarch_handle_pac_ret_b_key): Remove.
>   (aarch_handle_bti_protection): Remove.
>   (aarch_validate_mbranch_protection): Pass branch protection type
>   description as argument.
>   * config/arm/aarch-common.h (enum aarch_key_type): Remove.
>   (struct aarch_branch_protect_type): Remove.
>   * config/arm/arm-c.cc (arm_cpu_builtins): Remove aarch_ra_sign_key.
>   * config/arm/arm.cc (arm_handle_no_branch_protection): Copy and rename.
>   (arm_handle_standard_branch_protection): Likewise.
>   (arm_handle_pac_ret_protection): Likewise.
>   (arm_handle_pac_ret_leaf): Likewise.
>   (arm_handle_bti_protection): Likewise.
>   (arm_configure_build_target): Update branch protection validation.
>   * config/arm/arm.opt: Remove aarch_ra_sign_key.
> ---
> v4:
> - pass types as argument to validation.
> - make target specific types data static.
> 
>  gcc/config/aarch64/aarch64-c.cc  |  4 +-
>  gcc/config/aarch64/aarch64.cc| 75 
>  gcc/config/aarch64/aarch64.md|  2 +-
>  gcc/config/aarch64/aarch64.opt   |  2 +-
>  gcc/config/arm/aarch-common-protos.h | 19 ++-
>  gcc/config/arm/aarch-common.cc   | 71 --
>  gcc/config/arm/aarch-common.h| 20 
>  gcc/config/arm/arm-c.cc  |  2 -
>  gcc/config/arm/arm.cc| 55 +---
>  gcc/config/arm/arm.opt   |  3 --
>  10 files changed, 145 insertions(+), 108 deletions(-)
> 


OK

R.

Re: [PATCH v6 5/5] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-03-01 Thread Richard Earnshaw (lists)

On 27/02/2024 13:56, Andre Vieira wrote:
> 
> This patch adds support for MVE Tail-Predicated Low Overhead Loops by using the
> doloop functionality added to support predicated vectorized hardware loops.
> 
> gcc/ChangeLog:
> 
>   * config/arm/arm-protos.h (arm_target_bb_ok_for_lob): Change
>   declaration to pass basic_block.
>   (arm_attempt_dlstp_transform): New declaration.
>   * config/arm/arm.cc (TARGET_LOOP_UNROLL_ADJUST): Define targethook.
>   (TARGET_PREDICT_DOLOOP_P): Likewise.
>   (arm_target_bb_ok_for_lob): Adapt condition.
>   (arm_mve_get_vctp_lanes): New function.
>   (arm_dl_usage_type): New internal enum.
>   (arm_get_required_vpr_reg): New function.
>   (arm_get_required_vpr_reg_param): New function.
>   (arm_get_required_vpr_reg_ret_val): New function.
>   (arm_mve_get_loop_vctp): New function.
>   (arm_mve_insn_predicated_by): New function.
>   (arm_mve_across_lane_insn_p): New function.
>   (arm_mve_load_store_insn_p): New function.
>   (arm_mve_impl_pred_on_outputs_p): New function.
>   (arm_mve_impl_pred_on_inputs_p): New function.
>   (arm_last_vect_def_insn): New function.
>   (arm_mve_impl_predicated_p): New function.
>   (arm_mve_check_reg_origin_is_num_elems): New function.
>   (arm_mve_dlstp_check_inc_counter): New function.
>   (arm_mve_dlstp_check_dec_counter): New function.
>   (arm_mve_loop_valid_for_dlstp): New function.
>   (arm_predict_doloop_p): New function.
>   (arm_loop_unroll_adjust): New function.
>   (arm_emit_mve_unpredicated_insn_to_seq): New function.
>   (arm_attempt_dlstp_transform): New function.
>   * config/arm/arm.opt (mdlstp): New option.
>   * config/arm/iterators.md (dlstp_elemsize, letp_num_lanes,
>   letp_num_lanes_neg, letp_num_lanes_minus_1): New attributes.
>   (DLSTP, LETP): New iterators.
>   (predicated_doloop_end_internal): New pattern.
>   (dlstp_insn): New pattern.
>   * config/arm/thumb2.md (doloop_end): Adapt to support tail-predicated
>   loops.
>   (doloop_begin): Likewise.
>   * config/arm/types.md (mve_misc): New mve type to represent
>   predicated_loop_end insn sequences.
>   * config/arm/unspecs.md:
>   (DLSTP8, DLSTP16, DLSTP32, DLSTP64,
>   LETP8, LETP16, LETP32, LETP64): New unspecs for DLSTP and LETP.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/lob.h: Add new helpers.
>   * gcc.target/arm/lob1.c: Use new helpers.
>   * gcc.target/arm/lob6.c: Likewise.
>   * gcc.target/arm/dlstp-compile-asm-1.c: New test.
>   * gcc.target/arm/dlstp-compile-asm-2.c: New test.
>   * gcc.target/arm/dlstp-compile-asm-3.c: New test.
>   * gcc.target/arm/dlstp-int8x16.c: New test.
>   * gcc.target/arm/dlstp-int8x16-run.c: New test.
>   * gcc.target/arm/dlstp-int16x8.c: New test.
>   * gcc.target/arm/dlstp-int16x8-run.c: New test.
>   * gcc.target/arm/dlstp-int32x4.c: New test.
>   * gcc.target/arm/dlstp-int32x4-run.c: New test.
>   * gcc.target/arm/dlstp-int64x2.c: New test.
>   * gcc.target/arm/dlstp-int64x2-run.c: New test.
>   * gcc.target/arm/dlstp-invalid-asm.c: New test.
> 
> Co-authored-by: Stam Markianos-Wright 
> ---
>  gcc/config/arm/arm-protos.h   |4 +-
>  gcc/config/arm/arm.cc | 1249 -
>  gcc/config/arm/arm.opt|3 +
>  gcc/config/arm/iterators.md   |   15 +
>  gcc/config/arm/mve.md |   50 +
>  gcc/config/arm/thumb2.md  |  138 +-
>  gcc/config/arm/types.md   |6 +-
>  gcc/config/arm/unspecs.md |   14 +-
>  gcc/testsuite/gcc.target/arm/lob.h|  128 +-
>  gcc/testsuite/gcc.target/arm/lob1.c   |   23 +-
>  gcc/testsuite/gcc.target/arm/lob6.c   |8 +-
>  .../gcc.target/arm/mve/dlstp-compile-asm-1.c  |  146 ++
>  .../gcc.target/arm/mve/dlstp-compile-asm-2.c  |  749 ++
>  .../gcc.target/arm/mve/dlstp-compile-asm-3.c  |   46 +
>  .../gcc.target/arm/mve/dlstp-int16x8-run.c|   44 +
>  .../gcc.target/arm/mve/dlstp-int16x8.c|   31 +
>  .../gcc.target/arm/mve/dlstp-int32x4-run.c|   45 +
>  .../gcc.target/arm/mve/dlstp-int32x4.c|   31 +
>  .../gcc.target/arm/mve/dlstp-int64x2-run.c|   48 +
>  .../gcc.target/arm/mve/dlstp-int64x2.c|   28 +
>  .../gcc.target/arm/mve/dlstp-int8x16-run.c|   44 +
>  .../gcc.target/arm/mve/dlstp-int8x16.c|   32 +
>  .../gcc.target/arm/mve/dlstp-invalid-asm.c|  521 +++
>  23 files changed, 3321 insertions(+), 82 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-2.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-3.c
>  create mode 100644

Re: [PATCH v6 4/5] doloop: Add support for predicated vectorized loops

2024-03-01 Thread Richard Earnshaw (lists)

On 27/02/2024 13:56, Andre Vieira wrote:
> 
> This patch adds support in the target agnostic doloop pass for the detection of
> predicated vectorized hardware loops.  Arm is currently the only target that
> will make use of this feature.
> 
> gcc/ChangeLog:
> 
>   * df-core.cc (df_bb_regno_only_def_find): New helper function.
>   * df.h (df_bb_regno_only_def_find): Declare new function.
>   * loop-doloop.cc (doloop_condition_get): Add support for detecting
>   predicated vectorized hardware loops.
>   (doloop_modify): Add support for GTU condition checks.
>   (doloop_optimize): Update costing computation to support alterations to
>   desc->niter_expr by the backend.
> 
> Co-authored-by: Stam Markianos-Wright 
> ---
>  gcc/df-core.cc |  15 +
>  gcc/df.h   |   1 +
>  gcc/loop-doloop.cc | 164 +++--
>  3 files changed, 113 insertions(+), 67 deletions(-)
> 

As discussed, I think we should wait for gcc-15 for this[*]; I know it was 
initially submitted during stage1 but it's had to go through a lot of revision 
since then and we're very close to wanting to cut the release branch.

R.

[*] Unless an independent reviewer wants to sign this off anyway.

Re: [PATCH v6 3/5] arm: Fix a wrong attribute use and remove unused unspecs and iterators

2024-03-01 Thread Richard Earnshaw (lists)

On 27/02/2024 13:56, Andre Vieira wrote:
> 
> This patch fixes the erroneous use of a mode attribute without a mode iterator
> in the pattern and removes unused unspecs and iterators.
> 
> gcc/ChangeLog:
> 
>   * config/arm/iterators.md (supf): Remove VMLALDAVXQ_U, VMLALDAVXQ_P_U,
>   VMLALDAVAXQ_U cases.
>   (VMLALDAVXQ): Remove iterator.
>   (VMLALDAVXQ_P): Likewise.
>   (VMLALDAVAXQ): Likewise.
>   * config/arm/mve.md (mve_vstrwq_p_fv4sf): Replace use of 
>   mode iterator attribute with V4BI mode.
>   * config/arm/unspecs.md (VMLALDAVXQ_U, VMLALDAVXQ_P_U,
>   VMLALDAVAXQ_U): Remove unused unspecs.
> ---
>  gcc/config/arm/iterators.md | 9 +++--
>  gcc/config/arm/mve.md   | 2 +-
>  gcc/config/arm/unspecs.md   | 3 ---
>  3 files changed, 4 insertions(+), 10 deletions(-)
> 

OK

R.

Re: [PATCH v6 2/5] arm: Annotate instructions with mve_safe_imp_xlane_pred

2024-03-01 Thread Richard Earnshaw (lists)

On 27/02/2024 13:56, Andre Vieira wrote:
> 
> This patch annotates some MVE across lane instructions with a new attribute.
> We use this attribute to let the compiler know that these instructions can be
> safely implicitly predicated when tail predicating if their operands are
> guaranteed to have zeroed tail predicated lanes.  These instructions were
> selected because having the value 0 in those lanes or 'tail-predicating' those
> lanes have the same effect.
> 
> gcc/ChangeLog:
> 
>   * config/arm/arm.md (mve_safe_imp_xlane_pred): New attribute.
>   * config/arm/iterators.md (mve_vmaxmin_safe_imp): New iterator
>   attribute.
>   * config/arm/mve.md (vaddvq_s, vaddvq_u, vaddlvq_s, vaddlvq_u,
>   vaddvaq_s, vaddvaq_u, vmaxavq_s, vmaxvq_u, vmladavq_s, vmladavq_u,
>   vmladavxq_s, vmlsdavq_s, vmlsdavxq_s, vaddlvaq_s, vaddlvaq_u,
>   vmlaldavq_u, vmlaldavq_s, vmlaldavq_u, vmlaldavxq_s, vmlsldavq_s,
>   vmlsldavxq_s, vrmlaldavhq_u, vrmlaldavhq_s, vrmlaldavhxq_s,
>   vrmlsldavhq_s, vrmlsldavhxq_s, vrmlaldavhaq_s, vrmlaldavhaq_u,
>   vrmlaldavhaxq_s, vrmlsldavhaq_s, vrmlsldavhaxq_s, vabavq_s, vabavq_u,
>   vmladavaq_u, vmladavaq_s, vmladavaxq_s, vmlsdavaq_s, vmlsdavaxq_s,
>   vmlaldavaq_s, vmlaldavaq_u, vmlaldavaxq_s, vmlsldavaq_s,
>   vmlsldavaxq_s): Added mve_safe_imp_xlane_pred.
> ---
>  gcc/config/arm/arm.md   |  6 ++
>  gcc/config/arm/iterators.md |  8 
>  gcc/config/arm/mve.md   | 12 
>  3 files changed, 26 insertions(+)
> 

OK

R.
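
The claim in the patch description above, that an across-lane add gives the same result whether the tail lanes are predicated off or simply hold zero, can be illustrated with a small scalar model. This is only a sketch in plain C, not MVE code: the 4-lane width and the helper names (`load_zeroing`, `addv_predicated`, `addv_unpredicated`) are assumptions made up for the illustration, and it covers only the additive case.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 4 /* hypothetical vector width, chosen for the illustration */

/* Model of a zeroing load: lanes at or beyond ACTIVE read as 0.  */
static void load_zeroing(int32_t dst[LANES], const int32_t *src, size_t active)
{
    for (size_t i = 0; i < LANES; i++)
        dst[i] = (i < active) ? src[i] : 0;
}

/* Model of a predicated across-lane add: only active lanes contribute.  */
static int32_t addv_predicated(const int32_t v[LANES], size_t active)
{
    int32_t sum = 0;
    for (size_t i = 0; i < LANES && i < active; i++)
        sum += v[i];
    return sum;
}

/* Model of the unpredicated across-lane add: every lane contributes.  */
static int32_t addv_unpredicated(const int32_t v[LANES])
{
    int32_t sum = 0;
    for (size_t i = 0; i < LANES; i++)
        sum += v[i];
    return sum;
}
```

With three active lanes loaded through `load_zeroing`, both reductions see the same three values plus a zero, so they agree. Operations for which zero is not a neutral element need separate care, which this sketch does not attempt to model.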

Re: [PATCH v6 1/5] arm: Add define_attr to create a mapping between MVE predicated and unpredicated insns

2024-03-01 Thread Richard Earnshaw (lists)

On 27/02/2024 13:56, Andre Vieira wrote:
> 
> This patch adds an attribute to the mve md patterns to be able to identify
> predicable MVE instructions and what their predicated and unpredicated variants
> are.  This attribute is used to encode the icode of the unpredicated variant of
> an instruction in its predicated variant.
> 
> This will make it possible for us to transform VPT-predicated insns in
> the insn chain into their unpredicated equivalents when transforming the loop
> into a MVE Tail-Predicated Low Overhead Loop. For example:
> `mve_vldrbq_z_ -> mve_vldrbq_`.
> 
> gcc/ChangeLog:
> 
>   * config/arm/arm.md (mve_unpredicated_insn): New attribute.
>   * config/arm/arm.h (MVE_VPT_PREDICATED_INSN_P): New define.
>   (MVE_VPT_UNPREDICATED_INSN_P): Likewise.
>   (MVE_VPT_PREDICABLE_INSN_P): Likewise.
>   * config/arm/vec-common.md (mve_vshlq_): Add attribute.
>   * config/arm/mve.md (arm_vcx1q_p_v16qi): Add attribute.
>   (arm_vcx1qv16qi): Likewise.
>   (arm_vcx1qav16qi): Likewise.
>   (arm_vcx1qv16qi): Likewise.
>   (arm_vcx2q_p_v16qi): Likewise.
>   (arm_vcx2qv16qi): Likewise.
>   (arm_vcx2qav16qi): Likewise.
>   (arm_vcx2qv16qi): Likewise.
>   (arm_vcx3q_p_v16qi): Likewise.
>   (arm_vcx3qv16qi): Likewise.
>   (arm_vcx3qav16qi): Likewise.
>   (arm_vcx3qv16qi): Likewise.
>   (@mve_q_): Likewise.
>   (@mve_q_int_): Likewise.
>   (@mve_q_v4si): Likewise.
>   (@mve_q_n_): Likewise.
>   (@mve_q_r_): Likewise.
>   (@mve_q_f): Likewise.
>   (@mve_q_m_): Likewise.
>   (@mve_q_m_n_): Likewise.
>   (@mve_q_m_r_): Likewise.
>   (@mve_q_m_f): Likewise.
>   (@mve_q_int_m_): Likewise.
>   (@mve_q_p_v4si): Likewise.
>   (@mve_q_p_): Likewise.
>   (@mve_q_): Likewise.
>   (@mve_q_f): Likewise.
>   (@mve_q_m_): Likewise.
>   (@mve_q_m_f): Likewise.
>   (mve_vq_f): Likewise.
>   (mve_q): Likewise.
>   (mve_q_f): Likewise.
>   (mve_vadciq_v4si): Likewise.
>   (mve_vadciq_m_v4si): Likewise.
>   (mve_vadcq_v4si): Likewise.
>   (mve_vadcq_m_v4si): Likewise.
>   (mve_vandq_): Likewise.
>   (mve_vandq_f): Likewise.
>   (mve_vandq_m_): Likewise.
>   (mve_vandq_m_f): Likewise.
>   (mve_vandq_s): Likewise.
>   (mve_vandq_u): Likewise.
>   (mve_vbicq_): Likewise.
>   (mve_vbicq_f): Likewise.
>   (mve_vbicq_m_): Likewise.
>   (mve_vbicq_m_f): Likewise.
>   (mve_vbicq_m_n_): Likewise.
>   (mve_vbicq_n_): Likewise.
>   (mve_vbicq_s): Likewise.
>   (mve_vbicq_u): Likewise.
>   (@mve_vclzq_s): Likewise.
>   (mve_vclzq_u): Likewise.
>   (@mve_vcmp_q_): Likewise.
>   (@mve_vcmp_q_n_): Likewise.
>   (@mve_vcmp_q_f): Likewise.
>   (@mve_vcmp_q_n_f): Likewise.
>   (@mve_vcmp_q_m_f): Likewise.
>   (@mve_vcmp_q_m_n_): Likewise.
>   (@mve_vcmp_q_m_): Likewise.
>   (@mve_vcmp_q_m_n_f): Likewise.
>   (mve_vctpq): Likewise.
>   (mve_vctpq_m): Likewise.
>   (mve_vcvtaq_): Likewise.
>   (mve_vcvtaq_m_): Likewise.
>   (mve_vcvtbq_f16_f32v8hf): Likewise.
>   (mve_vcvtbq_f32_f16v4sf): Likewise.
>   (mve_vcvtbq_m_f16_f32v8hf): Likewise.
>   (mve_vcvtbq_m_f32_f16v4sf): Likewise.
>   (mve_vcvtmq_): Likewise.
>   (mve_vcvtmq_m_): Likewise.
>   (mve_vcvtnq_): Likewise.
>   (mve_vcvtnq_m_): Likewise.
>   (mve_vcvtpq_): Likewise.
>   (mve_vcvtpq_m_): Likewise.
>   (mve_vcvtq_from_f_): Likewise.
>   (mve_vcvtq_m_from_f_): Likewise.
>   (mve_vcvtq_m_n_from_f_): Likewise.
>   (mve_vcvtq_m_n_to_f_): Likewise.
>   (mve_vcvtq_m_to_f_): Likewise.
>   (mve_vcvtq_n_from_f_): Likewise.
>   (mve_vcvtq_n_to_f_): Likewise.
>   (mve_vcvtq_to_f_): Likewise.
>   (mve_vcvttq_f16_f32v8hf): Likewise.
>   (mve_vcvttq_f32_f16v4sf): Likewise.
>   (mve_vcvttq_m_f16_f32v8hf): Likewise.
>   (mve_vcvttq_m_f32_f16v4sf): Likewise.
>   (mve_vdwdupq_m_wb_u_insn): Likewise.
>   (mve_vdwdupq_wb_u_insn): Likewise.
>   (mve_veorq_s>): Likewise.
>   (mve_veorq_u>): Likewise.
>   (mve_veorq_f): Likewise.
>   (mve_vidupq_m_wb_u_insn): Likewise.
>   (mve_vidupq_u_insn): Likewise.
>   (mve_viwdupq_m_wb_u_insn): Likewise.
>   (mve_viwdupq_wb_u_insn): Likewise.
>   (mve_vldrbq_): Likewise.
>   (mve_vldrbq_gather_offset_): Likewise.
>   (mve_vldrbq_gather_offset_z_): Likewise.
>   (mve_vldrbq_z_): Likewise.
>   (mve_vldrdq_gather_base_v2di): Likewise.
>   (mve_vldrdq_gather_base_wb_v2di_insn): Likewise.
>   (mve_vldrdq_gather_base_wb_z_v2di_insn): Likewise.
>   (mve_vldrdq_gather_base_z_v2di): Likewise.
>   (mve_vldrdq_gather_offset_v2di): Likewise.
>   (mve_vldrdq_gather_offset_z_v2di): Likewise.
>   (mve_vldrdq_gather_shifted_offset_v2di): Likewise.
>   (mve_vldrdq_gather_shifted_offset_z_v2di): Likewise.
>   (mve_vldrhq_): Likewise.
>

Re: [PATCH] arm: Fixed C23 call compatibility with arm-none-eabi

2024-03-01 Thread Richard Earnshaw (lists)

On 19/02/2024 09:13, Torbjörn SVENSSON wrote:
> Ok for trunk and releases/gcc-13?
> Regtested on top of 945cb8490cb for arm-none-eabi, without any regression.
> 
> Backporting to releases/gcc-13 will change -std=c23 to -std=c2x.

Jakub has just pushed a different fix for this, so I don't think we need this 
now.

R.


> 
> --
> 
> In commit 4fe34cdcc80ac225b80670eabc38ac5e31ce8a5a, -std=c23 support was
> introduced to support functions without any named arguments.  For
> arm-none-eabi, this is not as simple as placing all arguments on the
> stack.  Align the caller to use r0, r1, r2 and r3 for arguments even for
> functions without any named arguments, as specified in the AAPCS.
> 
> Verify that the generic test case has the arguments in the right
> order and add ARM specific test cases.
> 
> gcc/ChangeLog:
> 
>   * calls.h: Added the type of the function to function_arg_info.
>   * calls.cc: Save the type of the function.
>   * config/arm/arm.cc: Check in the AAPCS layout function if
>   function has no named args.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/torture/c23-stdarg-split-1a.c: Detect out of order
>   arguments.
>   * gcc.dg/torture/c23-stdarg-split-1b.c: Likewise.
>   * gcc.target/arm/aapcs/align_vaarg3.c: New test.
>   * gcc.target/arm/aapcs/align_vaarg4.c: New test.
> 
> Signed-off-by: Torbjörn SVENSSON 
> Co-authored-by: Yvan ROUX 
> ---
>  gcc/calls.cc  |  2 +-
>  gcc/calls.h   | 20 --
>  gcc/config/arm/arm.cc | 13 ---
>  .../gcc.dg/torture/c23-stdarg-split-1a.c  |  4 +-
>  .../gcc.dg/torture/c23-stdarg-split-1b.c  | 15 +---
>  .../gcc.target/arm/aapcs/align_vaarg3.c   | 37 +++
>  .../gcc.target/arm/aapcs/align_vaarg4.c   | 31 
>  7 files changed, 102 insertions(+), 20 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/aapcs/align_vaarg3.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/aapcs/align_vaarg4.c
> 
> diff --git a/gcc/calls.cc b/gcc/calls.cc
> index 01f44734743..a1cc283b952 100644
> --- a/gcc/calls.cc
> +++ b/gcc/calls.cc
> @@ -1376,7 +1376,7 @@ initialize_argument_information (int num_actuals 
> ATTRIBUTE_UNUSED,
>with those made by function.cc.  */
>  
>/* See if this argument should be passed by invisible reference.  */
> -  function_arg_info arg (type, argpos < n_named_args);
> +  function_arg_info arg (type, fntype, argpos < n_named_args);
>if (pass_by_reference (args_so_far_pnt, arg))
>   {
> const bool callee_copies
> diff --git a/gcc/calls.h b/gcc/calls.h
> index 464a4e34e33..88836559ebe 100644
> --- a/gcc/calls.h
> +++ b/gcc/calls.h
> @@ -35,24 +35,33 @@ class function_arg_info
>  {
>  public:
>function_arg_info ()
> -: type (NULL_TREE), mode (VOIDmode), named (false),
> +: type (NULL_TREE), fntype (NULL_TREE), mode (VOIDmode), named (false),
>pass_by_reference (false)
>{}
>  
>/* Initialize an argument of mode MODE, either before or after promotion.  
> */
>function_arg_info (machine_mode mode, bool named)
> -: type (NULL_TREE), mode (mode), named (named), pass_by_reference (false)
> +: type (NULL_TREE), fntype (NULL_TREE), mode (mode), named (named),
> +pass_by_reference (false)
>{}
>  
>/* Initialize an unpromoted argument of type TYPE.  */
>function_arg_info (tree type, bool named)
> -: type (type), mode (TYPE_MODE (type)), named (named),
> +: type (type), fntype (NULL_TREE), mode (TYPE_MODE (type)), named 
> (named),
>pass_by_reference (false)
>{}
>  
> +  /* Initialize an unpromoted argument of type TYPE with a known function 
> type
> + FNTYPE.  */
> +  function_arg_info (tree type, tree fntype, bool named)
> +: type (type), fntype (fntype), mode (TYPE_MODE (type)), named (named),
> +pass_by_reference (false)
> +  {}
> +
>/* Initialize an argument with explicit properties.  */
>function_arg_info (tree type, machine_mode mode, bool named)
> -: type (type), mode (mode), named (named), pass_by_reference (false)
> +: type (type), fntype (NULL_TREE), mode (mode), named (named),
> +pass_by_reference (false)
>{}
>  
>/* Return true if the gimple-level type is an aggregate.  */
> @@ -96,6 +105,9 @@ public:
>   libgcc support functions).  */
>tree type;
>  
> +  /* The type of the function that has this argument, or null if not known.  
> */
> +  tree fntype;
> +
>/* The mode of the argument.  Depending on context, this might be
>   the mode of the argument type or the mode after promotion.  */
>machine_mode mode;
> diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
> index 1cd69268ee9..98e149e5b7e 100644
> --- a/gcc/config/arm/arm.cc
> +++ b/gcc/config/arm/arm.cc
> @@ -7006,7 +7006,7 @@ aapcs_libcall_value (machine_mode mode)
> numbers referred to here are those in the

Re: [PATCH] arm: fix c23 0-named-args caller-side stdarg

2024-03-01 Thread Richard Earnshaw (lists)

On 01/03/2024 04:38, Alexandre Oliva wrote:
> Hello, Matthew,
> 
> Thanks for the review.

For closure, Jakub has just pushed a patch to the generic code, so I don't 
think we need this now.

R.

> 
> On Feb 26, 2024, Matthew Malcomson  wrote:
> 
>> I think you're right that the AAPCS32 requires all arguments to be passed in
>> registers for this testcase.
>> (Nit on the commit-message: It says that your reading of the AAPCS32
>> suggests
>> that the *caller* is correct -- I believe based on the change you
>> suggested you
>> meant *callee* is correct in expecting arguments in registers.)
> 
> Ugh, yeah, sorry about the typo.
> 
>> The approach you suggest looks OK to me -- I do notice that it doesn't
>> fix the
>> legacy ABI's of `atpcs` and `apcs` and guess it would be nicer to have them
>> working at the same time though would defer to maintainers on how
>> important that
>> is.
>> (For the benefit of others reading) I don't believe there is any ABI concern
>> with this since it's fixing something that is currently not working at
>> all and
>> only applies to c23 (so a change shouldn't have too much of an impact).
> 
>> You mention you chose to make the change in the arm backend rather
>> than general
>> code due to hesitancy to change the generic ABI-affecting code. That makes
>> sense to me, certainly at this late stage in the development cycle.
> 
> *nod* I wrote the patch in the following context: I hit the problem on
> the very first toolchain I started transitioning to gcc-13.  I couldn't
> really fathom the notion that this breakage could have survived an
> entire release cycle if it affected many targets, and sort of held on to
> an assumption that the abi used by our arm-eabi toolchain had to be an
> uncommon one.
> 
> All of this hypothesizing falls apart by the now apparent knowledge that
> the test is failing elsewhere as well, even on other ARM ABIs, it just
> hadn't been addressed yet.  I'm glad we're getting there :-)
> 
>> From a quick check on c23-stdarg-4.c it does look like the below
>> change ends up
>> with the same codegen as your patch (except in the case of those
>> legacy ABI's,
>> where the below does make the caller and callee ABI match AFAICT):
> 
>> ```
>>   diff --git a/gcc/calls.cc b/gcc/calls.cc
>>   index 01f44734743..0b302f633ed 100644
>>   --- a/gcc/calls.cc
>>   +++ b/gcc/calls.cc
>>   @@ -2970,14 +2970,15 @@ expand_call (tree exp, rtx target, int ignore)
>>     we do not have any reliable way to pass unnamed args in
>>     registers, so we must force them into memory.  */
> 
>>   -  if (type_arg_types != 0
>>   +  if ((type_arg_types != 0 || TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
>>  && targetm.calls.strict_argument_naming (args_so_far))
>>    ;
>>  else if (type_arg_types != 0
>>  && ! targetm.calls.pretend_outgoing_varargs_named
>> (args_so_far))
>>    /* Don't include the last named arg.  */
>>    --n_named_args;
>>   -  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
>>   +  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype)
>>   +    && ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>>    n_named_args = 0;
>>  else
>>    /* Treat all args as named.  */
>> ```
> 
>> Do you agree that this makes sense (i.e. is there something I'm
>> completely missing)?
> 
> Yeah, your argument is quite convincing, and the target knobs are indeed
> in line with the change you suggest, whereas the current code seems to
> deviate from them.
> 
> With my ABI designer hat on, however, I see that there's room for ABIs
> to make decisions about 0-args stdargs that go differently from stdargs
> with leading named args, from prototyped functions, and even from
> prototypeless functions, and we might end up needing more knobs to deal
> with such custom decisions.  We can cross that bridge if/when we get to
> it, though.
> 
>> (lm32 mcore msp430 gcn cris fr30 frv h8300 arm v850 rx pru)
> 
> Interesting that ppc64le is not on your list.  There's PR107453 about
> that, and another thread is discussing a fix for it that is somewhat
> different from what you propose (presumably because the way the problem
> manifests on ppc64le is different), but it also tweaks expand_call.
> 
> I'll copy you when following up there.
>

Re: [PATCH] testsuite: Turn errors back into warnings in arm/acle/cde-mve-error-2.c

2024-03-01 Thread Richard Earnshaw (lists)

On 13/01/2024 20:46, Thiago Jung Bauermann wrote:
> Since commit 2c3db94d9fd ("c: Turn int-conversion warnings into
> permerrors") the test fails with errors such as:
> 
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 32)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 33)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 34)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 35)
> ⋮
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 118 (test for 
> warnings, line 117)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 119)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 120 (test for 
> warnings, line 119)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 121)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 122 (test for 
> warnings, line 121)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 123)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 124 (test for 
> warnings, line 123)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 125)
> ⋮
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0  (test for excess errors)
> 
> There's a total of 1016 errors.  Here's a sample of the excess errors:
> 
>   Excess errors:
>   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:117:31: 
> error: passing argument 2 of '__builtin_arm_vcx1qv16qi' makes integer from 
> pointer without a cast [-Wint-conversion]
>   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:119:3: 
> error: passing argument 3 of '__builtin_arm_vcx1qav16qi' makes integer from 
> pointer without a cast [-Wint-conversion]
>   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:121:3: 
> error: passing argument 3 of '__builtin_arm_vcx2qv16qi' makes integer from 
> pointer without a cast [-Wint-conversion]
>   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:123:3: 
> error: passing argument 3 of '__builtin_arm_vcx2qv16qi' makes integer from 
> pointer without a cast [-Wint-conversion]
> 
> The test expects these messages to be warnings, not errors.  My first try
> was to change it to expect them as errors instead.  This didn't work, IIUC
> because the error prevents the compiler from continuing processing the file
> and thus other errors which are expected by the test don't get emitted.
> 
> Therefore, add -fpermissive so that the test behaves as it did previously.
> Because of the additional line in the header, I had to adjust the line
> numbers of the expected warnings.
> 
> Tested on armv8l-linux-gnueabihf.
> 
> gcc/testsuite/ChangeLog:
>   * gcc.target/arm/acle/cde-mve-error-2.c: Add -fpermissive.
> ---
>  .../gcc.target/arm/acle/cde-mve-error-2.c | 63 ++-
>  1 file changed, 32 insertions(+), 31 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c 
> b/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c
> index 5b7774825442..da283a06a54d 100644
> --- a/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c
> +++ b/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c
> @@ -2,6 +2,7 @@
>  
>  /* { dg-do assemble } */
>  /* { dg-require-effective-target arm_v8_1m_main_cde_mve_fp_ok } */
> +/* { dg-options "-fpermissive" } */
>  /* { dg-add-options arm_v8_1m_main_cde_mve_fp } */
>  
>  /* The error checking files are split since there are three kinds of
> @@ -115,73 +116,73 @@ uint8x16_t test_bad_immediates (uint8x16_t n, 
> uint8x16_t m, int someval,
>  
>/* `imm' is of wrong type.  */
>accum += __arm_vcx1q_u8 (0, "");/* { dg-error 
> {argument 2 to '__builtin_arm_vcx1qv16qi' must be a constant immediate in 
> range \[0-4095\]} } */
> -  /* { dg-warning {passing argument 2 of '__builtin_arm_vcx1qv16qi' makes 
> integer from pointer without a cast \[-Wint-conversion\]} "" { target *-*-* } 
> 117 } */
> +  /* { dg-warning {passing argument 2 of '__builtin_arm_vcx1qv16qi' makes 
> integer from pointer without a cast \[-Wint-conversion\]} "" { target *-*-* } 
> 118 } */

Absolute line numbers are a pain, but I think we can use '.-1' (without the 
quotes) in these cases to minimize the churn.

If that works, ok with that change.

R.
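
As an illustration of the relative line number form suggested above, here is a minimal, self-contained test file sketch. It is not the cde-mve-error-2.c change: it uses a `-Wconversion` warning as a stand-in so that the file still compiles without `-fpermissive`, and the function name `narrow` and the shortened message pattern are assumptions made for the example.

```c
/* { dg-do compile } */
/* { dg-options "-Wconversion" } */

/* Narrow an int to a char; with -Wconversion the initialization below
   is expected to be diagnosed.  */
char narrow (int value)
{
  char c = value;
  /* The ".-1" means "the line before this directive", so adding or
     removing lines elsewhere in the file does not require renumbering.  */
  /* { dg-warning "may change value" "" { target *-*-* } .-1 } */
  return c;
}
```

A directive that shares a line with the construct it checks needs no line argument at all; the relative form is only needed when, as in the patch under review, the directive sits on a following line.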

Re: [PATCH] testsuite: Fix fallout of turning warnings into errors on 32-bit Arm

2024-03-01 Thread Richard Earnshaw (lists)

On 01/03/2024 14:23, Andre Vieira (lists) wrote:
> Hi Thiago,
> 
> Thanks for this, LGTM but I can't approve this, CC'ing Richard.
> 
> Do have a nitpick, in the gcc/testsuite/ChangeLog: remove 'gcc/testsuite' 
> from bullet points 2-4.
> 

Yes, this is OK with the change Andre mentioned (your push will fail if you 
don't fix that).

R.

PS, if you've set up GCC git customizations (see 
contrib/gcc-git-customization.sh), you can verify things like this with 'git 
gcc-verify HEAD^..HEAD'


> Kind regards,
> Andre
> 
> On 13/01/2024 00:55, Thiago Jung Bauermann wrote:
>> Since commits 2c3db94d9fd ("c: Turn int-conversion warnings into
>> permerrors") and 55e94561e97e ("c: Turn -Wimplicit-function-declaration
>> into a permerror") these tests fail with errors such as:
>>
>>    FAIL: gcc.target/arm/pr59858.c (test for excess errors)
>>    FAIL: gcc.target/arm/pr65647.c (test for excess errors)
>>    FAIL: gcc.target/arm/pr65710.c (test for excess errors)
>>    FAIL: gcc.target/arm/pr97969.c (test for excess errors)
>>
>> Here's one example of the excess errors:
>>
>>    FAIL: gcc.target/arm/pr65647.c (test for excess errors)
>>    Excess errors:
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:6:17: error: 
>> initialization of 'int' from 'int *' makes integer from pointer without a 
>> cast [-Wint-conversion]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:6:51: error: 
>> initialization of 'int' from 'int *' makes integer from pointer without a 
>> cast [-Wint-conversion]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:6:62: error: 
>> initialization of 'int' from 'int *' makes integer from pointer without a 
>> cast [-Wint-conversion]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:7:48: error: 
>> initialization of 'int' from 'int *' makes integer from pointer without a 
>> cast [-Wint-conversion]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:8:9: error: 
>> initialization of 'int' from 'int *' makes integer from pointer without a 
>> cast [-Wint-conversion]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:24:5: error: 
>> initialization of 'int' from 'int *' makes integer from pointer without a 
>> cast [-Wint-conversion]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:25:5: error: 
>> initialization of 'int' from 'struct S1 *' makes integer from pointer 
>> without a cast [-Wint-conversion]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:41:3: error: 
>> implicit declaration of function 'fn3'; did you mean 'fn2'? 
>> [-Wimplicit-function-declaration]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:46:3: error: 
>> implicit declaration of function 'fn5'; did you mean 'fn4'? 
>> [-Wimplicit-function-declaration]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:57:16: error: 
>> implicit declaration of function 'fn6'; did you mean 'fn4'? 
>> [-Wimplicit-function-declaration]
>>
>> PR rtl-optimization/59858 and PR target/65710 test the fix of an ICE.
>> PR target/65647 and PR target/97969 test for a compilation infinite loop.
>>
>> Therefore, add -fpermissive so that the tests behave as they did previously.
>> Tested on armv8l-linux-gnueabihf.
>>
>> gcc/testsuite/ChangeLog:
>> * gcc.target/arm/pr59858.c: Add -fpermissive.
>> * gcc/testsuite/gcc.target/arm/pr65647.c: Likewise.
>> * gcc/testsuite/gcc.target/arm/pr65710.c: Likewise.
>> * gcc/testsuite/gcc.target/arm/pr97969.c: Likewise.
>> ---
>>   gcc/testsuite/gcc.target/arm/pr59858.c | 2 +-
>>   gcc/testsuite/gcc.target/arm/pr65647.c | 2 +-
>>   gcc/testsuite/gcc.target/arm/pr65710.c | 2 +-
>>   gcc/testsuite/gcc.target/arm/pr97969.c | 2 +-
>>   4 files changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/gcc/testsuite/gcc.target/arm/pr59858.c 
>> b/gcc/testsuite/gcc.target/arm/pr59858.c
>> index 3360b48e8586..9336edfce277 100644
>> --- a/gcc/testsuite/gcc.target/arm/pr59858.c
>> +++ b/gcc/testsuite/gcc.target/arm/pr59858.c
>> @@ -1,5 +1,5 @@
>>   /* { dg-do compile } */
>> -/* { dg-options "-march=armv5te -fno-builtin -mfloat-abi=soft -mthumb 
>> -fno-stack-protector -Os -fno-tree-loop-optimize -fno-tree-dominator-opts 
>> -fPIC -w" } */
>> +/* { dg-options "-march=armv5te -fno-builtin -mfloat-abi=soft -mthumb 
>> -fno-stack-protector -Os -fno-tree-loop-optimize -fno-tree-dominator-opts 
>> -fPIC -w -fpermissive" } */
>>   /* { dg-require-effective-target fpic } */
>>   /* { dg-skip-if "Incompatible command line options: -mfloat-abi=soft 
>> -mfloat-abi=hard" { *-*-* } { "-mfloat-abi=hard" } { "" } } */
>>   /* { dg-require-effective-target arm_arch_v5te_thumb_ok } */
>> diff --git a/gcc/testsuite/gcc.target/arm/pr65647.c 
>> b/gcc/testsuite/gcc.target/arm/pr65647.c
>> index 26b4e399f6be..3cbf6b804ec0 100644
>> --- a/gcc/testsuite/gcc.target/arm/pr65647.c
>> +++ b/gcc/testsuite/gcc.target/arm/pr65647.c
>> @@ -1,7 +1,7 @@
>>   /* { dg-do compile } */
>>   /* { dg-require-effective-target arm_arch_v6m_ok } */
>>   /* {

Re: [PATCH] calls: Further fixes for TYPE_NO_NAMED_ARGS_STDARG_P handling [PR107453]

2024-03-01 Thread Richard Earnshaw (lists)

On 29/02/2024 15:55, Jakub Jelinek wrote:
> On Thu, Feb 29, 2024 at 02:14:05PM +0000, Richard Earnshaw wrote:
>>> I tried the above on arm, aarch64 and x86_64 and that seems fine,
>>> including the new testcase you added.
>>>
>>
>> I should mention though, that INIT_CUMULATIVE_ARGS on arm ignores
>> n_named_args entirely, it doesn't need it (I don't think it even existed
>> when the AAPCS code was added).
> 
> So far I've just checked that the new testcase passes not just on
> x86_64/i686-linux, but also on {powerpc64le,s390x,aarch64}-linux
> with vanilla trunk.
> Haven't posted this patch in patch form, plus while I'm not really sure
> whether setting n_named_args to 0 or leaving it unchanged in the
> !pretend_outgoing_varargs_named case is right, the setting to 0 feels more
> correct to me.  If structure_value_addr_parm is 1, the function effectively
> has a single named argument and then ... args and if the target wants
> n_named_args to be number of named arguments except the last, then that
> should be 0 rather than 1.
> 
> Thus, is the following patch ok for trunk then?
> 
> 2024-02-29  Jakub Jelinek  
> 
>   PR target/107453

PR 114136

Would be more appropriate for this, I think.

Otherwise, OK.

R.

>   * calls.cc (expand_call): For TYPE_NO_NAMED_ARGS_STDARG_P set
>   n_named_args initially before INIT_CUMULATIVE_ARGS to
>   structure_value_addr_parm rather than 0, after it don't modify
>   it if strict_argument_naming and clear only if
>   !pretend_outgoing_varargs_named.
> 
> --- gcc/calls.cc.jj   2024-01-22 11:48:08.045847508 +0100
> +++ gcc/calls.cc  2024-02-29 16:24:47.799855912 +0100
> @@ -2938,7 +2938,7 @@ expand_call (tree exp, rtx target, int i
>/* Count the struct value address, if it is passed as a parm.  */
>+ structure_value_addr_parm);
>else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
> -n_named_args = 0;
> +n_named_args = structure_value_addr_parm;
>else
>  /* If we know nothing, treat all args as named.  */
>  n_named_args = num_actuals;
> @@ -2970,14 +2970,15 @@ expand_call (tree exp, rtx target, int i
>   we do not have any reliable way to pass unnamed args in
>   registers, so we must force them into memory.  */
>  
> -  if (type_arg_types != 0
> +  if ((type_arg_types != 0 || TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
>&& targetm.calls.strict_argument_naming (args_so_far))
>  ;
>else if (type_arg_types != 0
>  && ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>  /* Don't include the last named arg.  */
>  --n_named_args;
> -  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
> +  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype)
> +&& ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>  n_named_args = 0;
>else
>  /* Treat all args as named.  */
> 
>   Jakub
>

Re: [PATCH] calls: Further fixes for TYPE_NO_NAMED_ARGS_STDARG_P handling [PR107453]

2024-03-01 Thread Richard Earnshaw (lists)

On 29/02/2024 17:56, Jakub Jelinek wrote:
> On Thu, Feb 29, 2024 at 05:51:03PM +0000, Richard Earnshaw (lists) wrote:
>> Oh, but wait!  Perhaps that now falls into the initial 'if' clause and we 
>> never reach the point where you pick zero.  So perhaps I'm worrying about 
>> nothing.
> 
> If you are worried about the
> +  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype)
> +  && ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>  n_named_args = 0;
> case in the patch, we know at that point that the initial n_named_args is
> equal to structure_value_addr_parm, so either 0, in that case
> --n_named_args;
> would yield the undesirable negative value, so we want 0 instead; for that
> case we could as well just have ; in there instead of n_named_args = 0;,
> or it is 1, in that case --n_named_args; would turn that into 0.
> 
>   Jakub
> 

No, I was thinking about the case of strict_argument_naming when the first 
argument is the artificial return value pointer.  In that case we'd want 
n_named_args=1.

But I think it's a non-issue as that will be caught by 

  if ((type_arg_types != 0 || TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
      && targetm.calls.strict_argument_naming (args_so_far))
    ;

R.

Re: [PATCH v1 08/13] aarch64: Add Cygwin and MinGW environments for AArch64

2024-02-29 Thread Richard Earnshaw (lists)

On 29/02/2024 17:55, Andrew Pinski (QUIC) wrote:
>> -----Original Message-----
>> From: Maxim Kuvyrkov 
>> Sent: Thursday, February 29, 2024 9:46 AM
>> To: Andrew Pinski (QUIC) 
>> Cc: Evgeny Karpov ; Andrew Pinski
>> ; Richard Sandiford ; gcc-
>> patc...@gcc.gnu.org; 10wa...@gmail.com; m...@harmstone.com; Zac
>> Walker ; Ron Riddle
>> ; Radek Barton 
>> Subject: Re: [PATCH v1 08/13] aarch64: Add Cygwin and MinGW
>> environments for AArch64
>>
>> WARNING: This email originated from outside of Qualcomm. Please be wary
>> of any links or attachments, and do not enable macros.
>>
>>> On Feb 29, 2024, at 21:35, Andrew Pinski (QUIC)
>>  wrote:
>>>
>>>
>>>
>>>> -----Original Message-----
>>>> From: Evgeny Karpov
>>>> Sent: Thursday, February 29, 2024 8:46 AM
>>>> To: Andrew Pinski
>>>> Cc: Richard Sandiford ; gcc-
>>>> patc...@gcc.gnu.org; 10wa...@gmail.com; Maxim Kuvyrkov
>>>> ; m...@harmstone.com; Zac Walker
>>>> ; Ron Riddle ;
>>>> Radek Barton ; Andrew Pinski (QUIC)
>>>>
>>>> Subject: [PATCH v1 08/13] aarch64: Add Cygwin and MinGW environments
>>>> for AArch64
>>>>
>>>> Wednesday, February 28, 2024 2:00 AM
>>>> Andrew Pinski wrote:
>>>>
>>>>> What does this mean with respect to C++ exceptions? Or you using
>>>>> SJLJ exceptions support or the dwarf unwinding ones without SEH
>>>>> support?
>>>>> I am not sure if SJLJ exceptions is well tested any more in GCC either.
>>>>>
>>>>> Also I have a question if you ran the full GCC/G++ testsuites and
>>>>> what were the results?
>>>>> If you did run it, did you use a cross compiler or the native
>>>>> compiler? Did you do a bootstrap (GCC uses C++ but no exceptions
>>>>> though)?
>>>>
>>>> As mentioned in the cover letter and the thread, the current
>>>> contribution covers only the C scope.
>>>> Exception handling is fully disabled for now.
>>>> There is an experimental build with C++ and SEH, however, it is not
>>>> included in the plan for the current contribution.
>>>>
>>>> https://github.com/Windows-on-ARM-Experiments/mingw-woarm64-build
>>>>
>>>>> If you run using a cross compiler, did you use ssh or some other
>>>>> route to run the applications?
>>>>>
>>>>> Thanks,
>>>>> Andrew Pinski
>>>>
>>>> GitHub Actions are used to cross-compile toolchains, packages and
>>>> tests, and execute tests on Windows Arm64.
>>>
>>> This does not answer my question because what you are running is just
>>> simple testcases and not the FULL GCC testsuite.
>>> So again have you run the GCC testsuite and do you have a dejagnu board to
>>> be able to execute the binaries?
>>> I think without the GCC testsuite run to find all of the known failures,
>>> you are going to be running into many issues.
>>> The GCC testsuite includes many tests for ABI corner cases and many
>>> features that you will most likely not think about testing using your
>>> simple testcases.
>>> In fact I suspect there will be some of the aarch64 testcases which will
>>> need to be modified for the windows ABI which you have not done yet.
>>
>> Hi Andrew,
>>
>> We (Linaro) have a prototype CI loop setup for testing
>> aarch64-w64-mingw32, and we have results for gcc-c and libatomic -- see [1].
>>
>> The results are far from clean, but that's expected.  This patch series aims
>> at enabling C hello-world only, and subsequent patch series will improve the
>> state of the port.
>>
>> [1] https://ci.linaro.org/job/tcwg_gnu_mingw_check_gcc--master-woa64-build/6/artifact/artifacts/sumfiles/
> 
> Looking at these results, this port is not in any shape or form to be 
> upstreamed right now. Even simple -g will cause failures.
> Note we don't need a clean testsuite run but the patch series is not even 
> allowing enabling hello world due to -g not being able to be used.
> 

It seemed to me as though the patch was posted for comments, not for immediate 
inclusion.  I agree this isn't ready for committing yet, but neither should the 
submitters wait until it's perfect before posting it.

I think it's gcc-15 material, so now is about the right time to be thinking 
about it.

R.

> Thanks,
> Andrew Pinski
> 
>>
>> Thanks,
>>
>> --
>> Maxim Kuvyrkov
>> https://www.linaro.org
>

Re: [PATCH] calls: Further fixes for TYPE_NO_NAMED_ARGS_STDARG_P handling [PR107453]

2024-02-29 Thread Richard Earnshaw (lists)

On 29/02/2024 17:38, Jakub Jelinek wrote:
> On Thu, Feb 29, 2024 at 05:23:25PM +0000, Richard Earnshaw (lists) wrote:
>> On 29/02/2024 15:55, Jakub Jelinek wrote:
>>> On Thu, Feb 29, 2024 at 02:14:05PM +0000, Richard Earnshaw wrote:
>>>>> I tried the above on arm, aarch64 and x86_64 and that seems fine,
>>>>> including the new testcase you added.
>>>>>
>>>>
>>>> I should mention though, that INIT_CUMULATIVE_ARGS on arm ignores
>>>> n_named_args entirely, it doesn't need it (I don't think it even existed
>>>> when the AAPCS code was added).
>>>
>>> So far I've just checked that the new testcase passes not just on
>>> x86_64/i686-linux, but also on {powerpc64le,s390x,aarch64}-linux
>>> with vanilla trunk.
>>> Haven't posted this patch in patch form, plus while I'm not really sure
>>> whether setting n_named_args to 0 or not changing in the
>>> !pretend_outgoing_varargs_named is right, the setting to 0 feels more
>>> correct to me.  If structure_value_addr_parm is 1, the function effectively
>>> has a single named argument and then ... args and if the target wants
>>> n_named_args to be number of named arguments except the last, then that
>>> should be 0 rather than 1.
>>>
>>> Thus, is the following patch ok for trunk then?
>>
>> The comment at the start of the section says
>>
>>   /* Now possibly adjust the number of named args.
>>      Normally, don't include the last named arg if anonymous args follow.
>>      We do include the last named arg if
>>      targetm.calls.strict_argument_naming() returns nonzero.
>>      (If no anonymous args follow, the result of list_length is actually
>>      one too large.  This is harmless.)
>>
>> So in the case of strict_argument_naming perhaps it should return 1, but 0 
>> for other cases.
> 
> The TYPE_NO_NAMED_ARGS_STDARG_P (funtype) case is as if type_arg_types != 0
> and list_length (type_arg_types) == 0, i.e. no user named arguments.
> As list_length (NULL) returns 0, perhaps it could even be handled just
> by changing all the type_arg_types != 0 checks to
> type_arg_types != 0 || TYPE_NO_NAMED_ARGS_STDARG_P (funtype)
> There are just 2 cases I'm worried about, one is that I think rest of
> calls.cc nor the backends are prepared to see n_named_args -1 after the
> adjustments, I think it is better to use 0, and then the question is what
> the !strict_argument_naming && !pretend_outgoing_varargs_named case
> wants to do for the aggregate return.  The patch as posted for
> void foo (...); void bar () { foo (1, 2, 3); }
> will set n_named_args initially to 0 (no named args) and with the
> adjustments for strict_argument_naming 0, otherwise for !pretend
> 0 as well, otherwise 3.
> For
> struct { char buf[4096]; } baz (...); void qux () { baz (1, 2, 3); }
> the patch sets n_named_args initially to 1 (the hidden return) and
> with the arguments for strict keep it at 1, for !pretend 0 and otherwise
> 3.
> 
> So, which case do you think is handled incorrectly with that?

The way I was thinking about it (and testing it on Arm) was to look at 
n_named_args for the cases of a traditional varargs case, then reduce that by 
one (except it can't ever be negative).

So for 

void f(...);
void g(int, ...);
struct S { int a[32]; };

struct S h (...);
struct S i (int, ...);

void a ()
{
  struct S x;
  f(1, 2, 3, 4);
  g(1, 2, 3, 4);
  x = h (1, 2, 3, 4);
  x = i (1, 2, 3, 4);
}

There are various permutations that could lead to answers of 0, 1, 2, 4 and 5 
depending on how those various targets treat each case and how the result 
pointer address is handled.  My suspicion is that for a target that has strict 
argument naming and the result pointer passed as a first argument, the answer 
for the 'h()' call should be 1, not zero.  

Oh, but wait!  Perhaps that now falls into the initial 'if' clause and we never 
reach the point where you pick zero.  So perhaps I'm worrying about nothing.
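
To make the permutations above concrete, here is a minimal C model of the two
n_named_args steps as they read with the patch applied.  It is only a sketch,
not code from calls.cc: struct call_shape, its field names, the call_f..call_i
constants and model_n_named_args are invented for illustration.  It assumes
that struct S is returned through a hidden pointer counted as a parameter
(structure_value_addr_parm == 1), that the type list for 'int, ...' has
length 1, and it ignores num_complex_actuals.

```c
#include <stdbool.h>

/* Hypothetical summary of one call site; all names are illustrative.  */
struct call_shape
{
  bool has_arg_types;    /* type_arg_types != 0 */
  int n_listed;          /* list_length (type_arg_types) */
  bool no_named_stdarg;  /* TYPE_NO_NAMED_ARGS_STDARG_P (funtype) */
  int sret;              /* structure_value_addr_parm (assumed 0 or 1) */
  int n_call_args;       /* arguments written at the call site */
};

/* The four calls in a (), under the assumptions stated above.  */
const struct call_shape call_f = { false, 0, true, 0, 4 };  /* void f (...) */
const struct call_shape call_g = { true, 1, false, 0, 4 };  /* void g (int, ...) */
const struct call_shape call_h = { false, 0, true, 1, 4 };  /* struct S h (...) */
const struct call_shape call_i = { true, 1, false, 1, 4 };  /* struct S i (int, ...) */

/* STRICT and PRETEND stand for the results of the strict_argument_naming
   and pretend_outgoing_varargs_named target hooks.  */
int
model_n_named_args (struct call_shape c, bool strict, bool pretend)
{
  int num_actuals = c.n_call_args + c.sret;
  int n_named_args;

  /* First step: the raw count computed before INIT_CUMULATIVE_ARGS.  */
  if (c.has_arg_types)
    n_named_args = c.n_listed + c.sret;
  else if (c.no_named_stdarg)
    n_named_args = c.sret;
  else
    n_named_args = num_actuals;

  /* Second step: the adjustment made after INIT_CUMULATIVE_ARGS.  */
  if ((c.has_arg_types || c.no_named_stdarg) && strict)
    {
      /* Keep the raw count, including any hidden return pointer.  */
    }
  else if (c.has_arg_types && !pretend)
    --n_named_args;       /* Don't include the last named arg.  */
  else if (c.no_named_stdarg && !pretend)
    n_named_args = 0;     /* Avoid decrementing below zero.  */
  else
    n_named_args = num_actuals;

  return n_named_args;
}
```

Under these assumptions model_n_named_args (call_h, true, false) is 1, so the
h() call keeps the result pointer as a named argument on a strict target; with
neither hook set it is 0, and with only pretend_outgoing_varargs_named set it
is 5.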

R.

> 
>   Jakub
>

Re: [PATCH] calls: Further fixes for TYPE_NO_NAMED_ARGS_STDARG_P handling [PR107453]

2024-02-29 Thread Richard Earnshaw (lists)

On 29/02/2024 15:55, Jakub Jelinek wrote:
> On Thu, Feb 29, 2024 at 02:14:05PM +0000, Richard Earnshaw wrote:
>>> I tried the above on arm, aarch64 and x86_64 and that seems fine,
>>> including the new testcase you added.
>>>
>>
>> I should mention though, that INIT_CUMULATIVE_ARGS on arm ignores
>> n_named_args entirely, it doesn't need it (I don't think it even existed
>> when the AAPCS code was added).
> 
> So far I've just checked that the new testcase passes not just on
> x86_64/i686-linux, but also on {powerpc64le,s390x,aarch64}-linux
> with vanilla trunk.
> Haven't posted this patch in patch form, plus while I'm not really sure
> whether setting n_named_args to 0 or not changing in the
> !pretend_outgoing_varargs_named is right, the setting to 0 feels more
> correct to me.  If structure_value_addr_parm is 1, the function effectively
> has a single named argument and then ... args and if the target wants
> n_named_args to be number of named arguments except the last, then that
> should be 0 rather than 1.
> 
> Thus, is the following patch ok for trunk then?

The comment at the start of the section says

  /* Now possibly adjust the number of named args.
     Normally, don't include the last named arg if anonymous args follow.
     We do include the last named arg if
     targetm.calls.strict_argument_naming() returns nonzero.
     (If no anonymous args follow, the result of list_length is actually
     one too large.  This is harmless.)

So in the case of strict_argument_naming perhaps it should return 1, but 0 for 
other cases.

R.

> 
> 2024-02-29  Jakub Jelinek  
> 
>   PR target/107453
>   * calls.cc (expand_call): For TYPE_NO_NAMED_ARGS_STDARG_P set
>   n_named_args initially before INIT_CUMULATIVE_ARGS to
>   structure_value_addr_parm rather than 0, after it don't modify
>   it if strict_argument_naming and clear only if
>   !pretend_outgoing_varargs_named.
> 
> --- gcc/calls.cc.jj   2024-01-22 11:48:08.045847508 +0100
> +++ gcc/calls.cc  2024-02-29 16:24:47.799855912 +0100
> @@ -2938,7 +2938,7 @@ expand_call (tree exp, rtx target, int i
>/* Count the struct value address, if it is passed as a parm.  */
>+ structure_value_addr_parm);
>else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
> -n_named_args = 0;
> +n_named_args = structure_value_addr_parm;
>else
>  /* If we know nothing, treat all args as named.  */
>  n_named_args = num_actuals;
> @@ -2970,14 +2970,15 @@ expand_call (tree exp, rtx target, int i
>   we do not have any reliable way to pass unnamed args in
>   registers, so we must force them into memory.  */
>  
> -  if (type_arg_types != 0
> +  if ((type_arg_types != 0 || TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
>&& targetm.calls.strict_argument_naming (args_so_far))
>  ;
>else if (type_arg_types != 0
>  && ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>  /* Don't include the last named arg.  */
>  --n_named_args;
> -  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
> +  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype)
> +&& ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>  n_named_args = 0;
>else
>  /* Treat all args as named.  */
> 
>   Jakub
>

Re: [PATCH] calls: Fix up TYPE_NO_NAMED_ARGS_STDARG_P handling [PR107453]

2024-02-29 Thread Richard Earnshaw




On 29/02/2024 14:10, Richard Earnshaw (lists) wrote:
> On 27/02/2024 17:25, Jakub Jelinek wrote:
>> On Tue, Feb 27, 2024 at 04:41:32PM +0000, Richard Earnshaw wrote:
>>>> 2023-01-09  Jakub Jelinek  
>>>>
>>>>PR target/107453
>>>>* calls.cc (expand_call): For calls with
>>>>TYPE_NO_NAMED_ARGS_STDARG_P (funtype) use zero for n_named_args.
>>>>Formatting fix.
>>>
>>> This one has been festering for a while; both Alexandre and Torbjorn have 
>>> attempted to fix it recently, but I'm not sure either is really right...
>>>
>>> On Arm this is causing all anonymous arguments to be passed on the stack,
>>> which is incorrect per the ABI.  On a target that uses
>>> 'pretend_outgoing_varargs_named', why is it correct to set n_named_args to
>>> zero?  Is it enough to guard both the statements you've added with
>>> !targetm.calls.pretend_outgoing_varargs_named?
>>
>> I'm afraid I haven't heard of that target hook before.
>> All I was doing with that change was fixing a regression reported in the PR
>> for ppc64le/sparc/nvptx/loongarch at least.
>>
>> The TYPE_NO_NAMED_ARGS_STDARG_P functions (C23 fns like void foo (...) {})
>> have NULL type_arg_types, so the list_length (type_arg_types) isn't done for
>> it, but it should be handled as if it was non-NULL but list length was 0.
>>
>> So, for the
>>   if (type_arg_types != 0)
>> n_named_args
>>   = (list_length (type_arg_types)
>>  /* Count the struct value address, if it is passed as a parm.  */
>>  + structure_value_addr_parm);
>>   else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
>> n_named_args = 0;
>>   else
>> /* If we know nothing, treat all args as named.  */
>> n_named_args = num_actuals;
>> case, I think guarding it by any target hooks is wrong, although
>> I guess it should have been
>> n_named_args = structure_value_addr_parm;
>> instead of
>> n_named_args = 0;
>>
>> For the second
>>   if (type_arg_types != 0
>>   && targetm.calls.strict_argument_naming (args_so_far))
>> ;
>>   else if (type_arg_types != 0
>>&& ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>> /* Don't include the last named arg.  */
>> --n_named_args;
>>   else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
>> n_named_args = 0;
>>   else
>> /* Treat all args as named.  */
>> n_named_args = num_actuals;
>> bet (but no testing done, don't even know which targets return what for
>> those hooks) we should treat those as if type_arg_types was non-NULL
>> with 0 elements in the list, except the --n_named_args doesn't make sense
>> because that would decrease it to -1.
>> So perhaps
>>   if ((type_arg_types != 0 || TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
>>   && targetm.calls.strict_argument_naming (args_so_far))
>> ;
>>   else if (type_arg_types != 0
>>&& ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>> /* Don't include the last named arg.  */
>> --n_named_args;
>>   else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype)
>> && ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>> ;
>>   else
>> /* Treat all args as named.  */
>> n_named_args = num_actuals;
> 
> I tried the above on arm, aarch64 and x86_64 and that seems fine, including 
> the new testcase you added.
> 

I should mention though, that INIT_CUMULATIVE_ARGS on arm ignores n_named_args 
entirely, it doesn't need it (I don't think it even existed when the AAPCS code 
was added).

R.

> R.
> 
>>
>> (or n_named_args = 0; instead of ; before the final else?  Dunno).
>> I guess we need some testsuite coverage for caller/callee ABI match of
>> struct S { char p[64]; };
>> struct S foo (...);
>>
>>  Jakub
>>
>

Re: [PATCH] calls: Fix up TYPE_NO_NAMED_ARGS_STDARG_P handling [PR107453]

2024-02-29 Thread Richard Earnshaw (lists)

On 27/02/2024 17:25, Jakub Jelinek wrote:
> On Tue, Feb 27, 2024 at 04:41:32PM +0000, Richard Earnshaw wrote:
>>> 2023-01-09  Jakub Jelinek  
>>>
>>> PR target/107453
>>> * calls.cc (expand_call): For calls with
>>> TYPE_NO_NAMED_ARGS_STDARG_P (funtype) use zero for n_named_args.
>>> Formatting fix.
>>
>> This one has been festering for a while; both Alexandre and Torbjorn have 
>> attempted to fix it recently, but I'm not sure either is really right...
>>
>> On Arm this is causing all anonymous arguments to be passed on the stack,
>> which is incorrect per the ABI.  On a target that uses
>> 'pretend_outgoing_varargs_named', why is it correct to set n_named_args to
>> zero?  Is it enough to guard both the statements you've added with
>> !targetm.calls.pretend_outgoing_varargs_named?
> 
> I'm afraid I haven't heard of that target hook before.
> All I was doing with that change was fixing a regression reported in the PR
> for ppc64le/sparc/nvptx/loongarch at least.
> 
> The TYPE_NO_NAMED_ARGS_STDARG_P functions (C23 fns like void foo (...) {})
> have NULL type_arg_types, so the list_length (type_arg_types) isn't done for
> it, but it should be handled as if it was non-NULL but list length was 0.
> 
> So, for the
>   if (type_arg_types != 0)
> n_named_args
>   = (list_length (type_arg_types)
>  /* Count the struct value address, if it is passed as a parm.  */
>  + structure_value_addr_parm);
>   else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
> n_named_args = 0;
>   else
> /* If we know nothing, treat all args as named.  */
> n_named_args = num_actuals;
> case, I think guarding it by any target hooks is wrong, although
> I guess it should have been
> n_named_args = structure_value_addr_parm;
> instead of
> n_named_args = 0;
> 
> For the second
>   if (type_arg_types != 0
>   && targetm.calls.strict_argument_naming (args_so_far))
> ;
>   else if (type_arg_types != 0
>&& ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
> /* Don't include the last named arg.  */
> --n_named_args;
>   else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
> n_named_args = 0;
>   else
> /* Treat all args as named.  */
> n_named_args = num_actuals;
> bet (but no testing done, don't even know which targets return what for
> those hooks) we should treat those as if type_arg_types was non-NULL
> with 0 elements in the list, except the --n_named_args doesn't make sense
> because that would decrease it to -1.
> So perhaps
>   if ((type_arg_types != 0 || TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
>   && targetm.calls.strict_argument_naming (args_so_far))
> ;
>   else if (type_arg_types != 0
>&& ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
> /* Don't include the last named arg.  */
> --n_named_args;
>   else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype)
>  && ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
> ;
>   else
> /* Treat all args as named.  */
> n_named_args = num_actuals;

I tried the above on arm, aarch64 and x86_64 and that seems fine, including the 
new testcase you added.

R.

> 
> (or n_named_args = 0; instead of ; before the final else?  Dunno).
> I guess we need some testsuite coverage for caller/callee ABI match of
> struct S { char p[64]; };
> struct S foo (...);
> 
>   Jakub
>

Re: [PATCH] calls: Fix up TYPE_NO_NAMED_ARGS_STDARG_P handling [PR107453]

2024-02-27 Thread Richard Earnshaw

[resending, apologies, I accidentally CC'd the wrong person last time]

On 27/02/2024 16:41, Richard Earnshaw wrote:
> 
> 
> On 09/01/2023 10:32, Jakub Jelinek via Gcc-patches wrote:
>> Hi!
>>
>> On powerpc64le-linux, the following patch fixes
>> -FAIL: gcc.dg/c2x-stdarg-4.c execution test
>> -FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O0  execution test
>> -FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O1  execution test
>> -FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O2  execution test
>> -FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O2 -flto 
>> -fno-use-linker-plugin -flto-partition=none  execution test
>> -FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O2 -flto -fuse-linker-plugin 
>> -fno-fat-lto-objects  execution test
>> -FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O3 -g  execution test
>> -FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -Os  execution test
>> The problem is mismatch between the caller and callee side.
>> On the callee side, we do:
>>   /* NAMED_ARG is a misnomer.  We really mean 'non-variadic'. */
>>   if (!cfun->stdarg)
>> data->arg.named = 1;  /* No variadic parms.  */
>>   else if (DECL_CHAIN (parm))
>> data->arg.named = 1;  /* Not the last non-variadic parm. */
>>   else if (targetm.calls.strict_argument_naming (all->args_so_far))
>> data->arg.named = 1;  /* Only variadic ones are unnamed.  */
>>   else
>> data->arg.named = 0;  /* Treat as variadic.  */
>> which is later passed to the target hooks to determine if a particular
>> argument is named or not.  Now, cfun->stdarg is determined from the stdarg_p
>> call, which for the new C2X TYPE_NO_NAMED_ARGS_STDARG_P function types
>> (rettype fn (...)) returns true.  Such functions have no named arguments,
>> so data->arg.named will be 0 in function.cc.  But on the caller side,
>> as TYPE_NO_NAMED_ARGS_STDARG_P function types have TYPE_ARG_TYPES NULL,
>> we instead treat those calls as unprototyped even when they are prototyped
>> - /* If we know nothing, treat all args as named.  */ n_named_args = 
>> num_actuals;
>> in 2 spots.  We need to treat the TYPE_NO_NAMED_ARGS_STDARG_P cases as
>> prototyped with no named arguments.
>>
>> Bootstrapped/regtested on x86_64-linux, i686-linux, powerpc64le-linux (where
>> it fixes the above failures), aarch64-linux and s390x-linux, ok for trunk?
>>
>> 2023-01-09  Jakub Jelinek  
>>
>>  PR target/107453
>>  * calls.cc (expand_call): For calls with
>>  TYPE_NO_NAMED_ARGS_STDARG_P (funtype) use zero for n_named_args.
>>  Formatting fix.
> 
> This one has been festering for a while; both Alexandre and Torbjorn have 
> attempted to fix it recently, but I'm not sure either is really right...
> 
> On Arm this is causing all anonymous arguments to be passed on the stack, 
> which is incorrect per the ABI.  On a target that uses 
> 'pretend_outgoing_varargs_named', why is it correct to set n_named_args to 
> zero?  Is it enough to guard both the statements you've added with 
> !targetm.calls.pretend_outgoing_varargs_named?
> 
> R.
> 
>>
>> --- gcc/calls.cc.jj  2023-01-02 09:32:28.834192105 +0100
>> +++ gcc/calls.cc 2023-01-06 14:52:14.740594896 +0100
>> @@ -2908,8 +2908,8 @@ expand_call (tree exp, rtx target, int i
>>  }
>>  
>>/* Count the arguments and set NUM_ACTUALS.  */
>> -  num_actuals =
>> -call_expr_nargs (exp) + num_complex_actuals + structure_value_addr_parm;
>> +  num_actuals
>> += call_expr_nargs (exp) + num_complex_actuals + 
>> structure_value_addr_parm;
>>  
>>/* Compute number of named args.
>>   First, do a raw count of the args for INIT_CUMULATIVE_ARGS.  */
>> @@ -2919,6 +2919,8 @@ expand_call (tree exp, rtx target, int i
>>= (list_length (type_arg_types)
>>   /* Count the struct value address, if it is passed as a parm.  */
>>   + structure_value_addr_parm);
>> +  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
>> +n_named_args = 0;
>>else
>>  /* If we know nothing, treat all args as named.  */
>>  n_named_args = num_actuals;
>> @@ -2957,6 +2959,8 @@ expand_call (tree exp, rtx target, int i
>> && ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>>  /* Don't include the last named arg.  */
>>  --n_named_args;
>> +  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
>> +n_named_args = 0;
>>else
>>  /* Treat all args as named.  */
>>  n_named_args = num_actuals;
>>
>>  Jakub
>>

Re: [PATCH] calls: Fix up TYPE_NO_NAMED_ARGS_STDARG_P handling [PR107453]

2024-02-27 Thread Richard Earnshaw




On 09/01/2023 10:32, Jakub Jelinek via Gcc-patches wrote:
> Hi!
> 
> On powerpc64le-linux, the following patch fixes
> -FAIL: gcc.dg/c2x-stdarg-4.c execution test
> -FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O0  execution test
> -FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O1  execution test
> -FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O2  execution test
> -FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O2 -flto 
> -fno-use-linker-plugin -flto-partition=none  execution test
> -FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects  execution test
> -FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O3 -g  execution test
> -FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -Os  execution test
> The problem is mismatch between the caller and callee side.
> On the callee side, we do:
>   /* NAMED_ARG is a misnomer.  We really mean 'non-variadic'. */
>   if (!cfun->stdarg)
> data->arg.named = 1;  /* No variadic parms.  */
>   else if (DECL_CHAIN (parm))
> data->arg.named = 1;  /* Not the last non-variadic parm. */
>   else if (targetm.calls.strict_argument_naming (all->args_so_far))
> data->arg.named = 1;  /* Only variadic ones are unnamed.  */
>   else
> data->arg.named = 0;  /* Treat as variadic.  */
> which is later passed to the target hooks to determine if a particular
> argument is named or not.  Now, cfun->stdarg is determined from the stdarg_p
> call, which for the new C2X TYPE_NO_NAMED_ARGS_STDARG_P function types
> (rettype fn (...)) returns true.  Such functions have no named arguments,
> so data->arg.named will be 0 in function.cc.  But on the caller side,
> as TYPE_NO_NAMED_ARGS_STDARG_P function types have TYPE_ARG_TYPES NULL,
> we instead treat those calls as unprototyped even when they are prototyped
> - /* If we know nothing, treat all args as named.  */ n_named_args = 
> num_actuals;
> in 2 spots.  We need to treat the TYPE_NO_NAMED_ARGS_STDARG_P cases as
> prototyped with no named arguments.
> 
> Bootstrapped/regtested on x86_64-linux, i686-linux, powerpc64le-linux (where
> it fixes the above failures), aarch64-linux and s390x-linux, ok for trunk?
> 
> 2023-01-09  Jakub Jelinek  
> 
>   PR target/107453
>   * calls.cc (expand_call): For calls with
>   TYPE_NO_NAMED_ARGS_STDARG_P (funtype) use zero for n_named_args.
>   Formatting fix.

This one has been festering for a while; both Alexandre and Torbjorn have 
attempted to fix it recently, but I'm not sure either is really right...

On Arm this is causing all anonymous arguments to be passed on the stack, which 
is incorrect per the ABI.  On a target that uses 
'pretend_outgoing_varargs_named', why is it correct to set n_named_args to zero?
Is it enough to guard both the statements you've added with 
!targetm.calls.pretend_outgoing_varargs_named?
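
As a rough illustration of the question, the sketch below models only the
'rettype fn (...)' case, where type_arg_types is NULL.  The function names are
hypothetical and this is not code from calls.cc; PRETEND stands for the result
of the pretend_outgoing_varargs_named hook, and how a back end then lays out
anonymous arguments is target-specific and not modelled.

```c
#include <stdbool.h>

/* n_named_args after the 2023 change for a callee declared as
   `rettype fn (...)'.  Both earlier branches test type_arg_types != 0,
   so neither is taken and the hooks are never consulted.  */
int
named_after_2023_change (int num_actuals, bool pretend)
{
  (void) num_actuals;
  (void) pretend;
  return 0;
}

/* The same case if both added statements were guarded by the negated
   hook, as asked above: a pretending target falls through to
   "treat all args as named".  */
int
named_with_hook_guard (int num_actuals, bool pretend)
{
  if (!pretend)
    return 0;
  return num_actuals;
}

/* Arguments the target would then regard as anonymous.  */
int
anonymous_args (int num_actuals, int n_named_args)
{
  return num_actuals - n_named_args;
}
```

For a call with three arguments on a target where the hook returns true, the
first variant leaves all three anonymous, while the guarded variant leaves
none.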

R.

> 
> --- gcc/calls.cc.jj   2023-01-02 09:32:28.834192105 +0100
> +++ gcc/calls.cc  2023-01-06 14:52:14.740594896 +0100
> @@ -2908,8 +2908,8 @@ expand_call (tree exp, rtx target, int i
>  }
>  
>/* Count the arguments and set NUM_ACTUALS.  */
> -  num_actuals =
> -call_expr_nargs (exp) + num_complex_actuals + structure_value_addr_parm;
> +  num_actuals
> += call_expr_nargs (exp) + num_complex_actuals + 
> structure_value_addr_parm;
>  
>/* Compute number of named args.
>   First, do a raw count of the args for INIT_CUMULATIVE_ARGS.  */
> @@ -2919,6 +2919,8 @@ expand_call (tree exp, rtx target, int i
>= (list_length (type_arg_types)
>/* Count the struct value address, if it is passed as a parm.  */
>+ structure_value_addr_parm);
> +  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
> +n_named_args = 0;
>else
>  /* If we know nothing, treat all args as named.  */
>  n_named_args = num_actuals;
> @@ -2957,6 +2959,8 @@ expand_call (tree exp, rtx target, int i
>  && ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>  /* Don't include the last named arg.  */
>  --n_named_args;
> +  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
> +n_named_args = 0;
>else
>  /* Treat all args as named.  */
>  n_named_args = num_actuals;
> 
>   Jakub
>

[PATCH] arm: warn about deprecation of iwmmx in mmintrin.h

2024-02-27 Thread Richard Earnshaw


GCC 13's changes file documents that iwmmx is deprecated.  Raise the bar
by warning when the mmintrin.h header is included by users, but provide
a way to suppress the warning.

gcc:
* config/arm/mmintrin.h: Warn if this header is included without
defining __ENABLE_DEPRECATED_IWMMXT.
---
 gcc/config/arm/mmintrin.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/config/arm/mmintrin.h b/gcc/config/arm/mmintrin.h
index 07659502bf2..e9cc3ddd7ab 100644
--- a/gcc/config/arm/mmintrin.h
+++ b/gcc/config/arm/mmintrin.h
@@ -28,6 +28,9 @@
 #error mmintrin.h included without enabling WMMX/WMMX2 instructions (e.g. -march=iwmmxt or -march=iwmmxt2)
 #endif
 
+#ifndef __ENABLE_DEPRECATED_IWMMXT
+#warning support for WMMX/WMMX2 is deprecated and will be removed in GCC 15.  Define __ENABLE_DEPRECATED_IWMMXT to suppress this warning
+#endif
 
 #if defined __cplusplus
 extern "C" {
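
A minimal sketch of how user code might opt out of the new warning, assuming
the patch is applied as posted.  The guard below is a stand-in that mirrors
the added lines so the example builds on any target; real code would include
<mmintrin.h> at that point.  The helper iwmmxt_deprecation_suppressed is
hypothetical and exists only to make the effect observable.

```c
/* Opt in first: the definition must precede the #include in real code.  */
#define __ENABLE_DEPRECATED_IWMMXT

/* Stand-in for the guard added to gcc/config/arm/mmintrin.h; a real
   user would write `#include <mmintrin.h>' here instead.  */
#ifndef __ENABLE_DEPRECATED_IWMMXT
#warning support for WMMX/WMMX2 is deprecated
#endif

/* Hypothetical helper, for illustration only: reports whether this
   translation unit opted in.  */
int
iwmmxt_deprecation_suppressed (void)
{
#ifdef __ENABLE_DEPRECATED_IWMMXT
  return 1;
#else
  return 0;
#endif
}
```

Passing -D__ENABLE_DEPRECATED_IWMMXT on the compiler command line should have
the same effect without touching the source.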

Re: [PATCH] ARM: Fix conditional execution [PR113915]

2024-02-26 Thread Richard Earnshaw (lists)

On 26/02/2024 16:05, Wilco Dijkstra wrote:
> Hi Richard,
> 
>> Did you test this on a thumb1 target?  It seems to me that the target parts
>> that you've removed were likely related to that.  In fact, I don't see why
>> this test would need to be changed at all.
> 
> The testcase explicitly forces a Thumb-2 target (arm_arch_v6t2). The patterns
> were wrong for Thumb-2 indeed, and the testcase was explicitly testing for
> this.
> There is a separate builtin-bswap-2.c for Thumb-1 target (arm_arch_v6m).
> 
> Cheers,
> Wilco

That's why statements like:

* gcc.target/arm/builtin-bswap-1.c: Fix test.

are less than helpful.  Perhaps if you'd said what you actually changed that 
would have made it more obvious.

So OK, but please fix the commit message to say what you did.

R.

Re: [PATCH] ARM: Fix conditional execution [PR113915]

2024-02-26 Thread Richard Earnshaw (lists)

On 23/02/2024 15:46, Wilco Dijkstra wrote:
> Hi Richard,
> 
>> This bit isn't.  The correct fix here is to fix the pattern(s) concerned to 
>> add the missing predicate.
>>
>> Note that builtin-bswap.x explicitly mentions predicated mnemonics in the 
>> comments.
> 
> I fixed the patterns in v2. There are likely some more, plus we could likely
> merge many t1 and t2 patterns where the only difference is predication. But
> those cleanups are for another time...
> 
> Cheers,
> Wilco
> 
> v2: Add predicable to the rev patterns.
> 
> By default most patterns can be conditionalized on Arm targets.  However
> Thumb-2 predication requires the "predicable" attribute be explicitly
> set to "yes".  Most patterns are shared between Arm and Thumb(-2) and are
> marked with "predicable".  Given this sharing, it does not make sense to
> use a different default for Arm.  So only consider conditional execution
> of instructions that have the predicable attribute set to yes.  This ensures
> that patterns not explicitly marked as such are never conditionally executed.
> 
> Passes regress and bootstrap, OK for commit?
> 
> gcc/ChangeLog:
> PR target/113915
> * config/arm/arm.md (NOCOND): Improve comment.
> (arm_rev*): Add predicable.
> * config/arm/arm.cc (arm_final_prescan_insn): Add check for
> PREDICABLE_YES.
> 
> gcc/testsuite/ChangeLog:
> PR target/113915
> * gcc.target/arm/builtin-bswap-1.c: Fix test.
> 
> ---
> 
> diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
> index 1cd69268ee986a0953cc85ab259355d2191250ac..6a35fe44138135998877a9fb74c2a82a7f99dcd5 100644
> --- a/gcc/config/arm/arm.cc
> +++ b/gcc/config/arm/arm.cc
> @@ -25613,11 +25613,12 @@ arm_final_prescan_insn (rtx_insn *insn)
> break;
>  
>   case INSN:
> -   /* Instructions using or affecting the condition codes make it
> -  fail.  */
> +   /* Check the instruction is explicitly marked as predicable.
> +  Instructions using or affecting the condition codes are not.  
> */
> scanbody = PATTERN (this_insn);
> if (!(GET_CODE (scanbody) == SET
>   || GET_CODE (scanbody) == PARALLEL)
> +   || get_attr_predicable (this_insn) != PREDICABLE_YES
> || get_attr_conds (this_insn) != CONDS_NOCOND)
>   fail = TRUE;
> break;
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index 5816409f86f1106b410c5e21d77e599b485f85f2..81237a61d4a2ebcfb77e47c2bd29137aba28a521 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -307,6 +307,8 @@
>  ;
>  ; NOCOND means that the instruction does not use or alter the condition
>  ;   codes but can be converted into a conditionally exectuted instruction.
> +;   Given that NOCOND is the default for most instructions if omitted,
> +;   the attribute predicable must be set to yes as well.
>  
>  (define_attr "conds" "use,set,clob,unconditional,nocond"
>   (if_then_else
> @@ -12547,6 +12549,7 @@
>revsh%?\t%0, %1"
>[(set_attr "arch" "t1,t2,32")
> (set_attr "length" "2,2,4")
> +   (set_attr "predicable" "no,yes,yes")
> (set_attr "type" "rev")]
>  )
>  
> @@ -12560,6 +12563,7 @@
> rev16%?\t%0, %1"
>[(set_attr "arch" "t1,t2,32")
> (set_attr "length" "2,2,4")
> +   (set_attr "predicable" "no,yes,yes")
> (set_attr "type" "rev")]
>  )
>  
> @@ -12584,6 +12588,7 @@
> rev16%?\t%0, %1"
>[(set_attr "arch" "t1,t2,32")
> (set_attr "length" "2,2,4")
> +   (set_attr "predicable" "no,yes,yes")
> (set_attr "type" "rev")]
>  )
>  
> @@ -12619,6 +12624,7 @@
> rev16%?\t%0, %1"
>[(set_attr "arch" "t1,t2,32")
> (set_attr "length" "2,2,4")
> +   (set_attr "predicable" "no,yes,yes")
> (set_attr "type" "rev")]
>  )
>  
> diff --git a/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c 
> b/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c
> index 
> c1e7740d14d3ca4e93a71e38b12f82c19791a204..1a311a6a5af647d40abd553e5d0ba1273c76d288
>  100644
> --- a/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c
> +++ b/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c
> @@ -5,14 +5,11 @@
> of the instructions.  Add an -mtune option known to facilitate that.  */
>  /* { dg-additional-options "-O2 -mtune=cortex-a53" } */
>  /* { dg-final { scan-assembler-not "orr\[ \t\]" } } */
> -/* { dg-final { scan-assembler-times "revsh\\t" 1 { target { arm_nothumb } } 
> } }  */
> -/* { dg-final { scan-assembler-times "revshne\\t" 1 { target { arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "revsh\\t" 2 { target { ! arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "rev16\\t" 1 { target { arm_nothumb } } 
> } }  */
> -/* { dg-final { scan-assembler-times "rev16ne\\t" 1 { target { arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "rev16\\t" 2 { target { ! arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "rev\\t" 2

[committed] arm: fix ICE with vectorized reciprocal division [PR108120]

2024-02-23 Thread Richard Earnshaw


The expand pattern for reciprocal division was enabled for all math
optimization modes, but the patterns it was generating were not
enabled unless -funsafe-math-optimizations was enabled.  This leads to
an ICE when the pattern we generate cannot be recognized.

Fixed by only enabling vector division when doing unsafe math.

gcc:

PR target/108120
* config/arm/neon.md (div3): Rename from div3.
Gate with ARM_HAVE_NEON__ARITH.

gcc/testsuite:
PR target/108120
* gcc.target/arm/neon-recip-div-1.c: New file.
---
 gcc/config/arm/neon.md  |  4 ++--
 gcc/testsuite/gcc.target/arm/neon-recip-div-1.c | 16 
 2 files changed, 18 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/neon-recip-div-1.c

diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 17c90f436c6..fa4a7aeda35 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -553,11 +553,11 @@ (define_insn "*mul3_neon"
Enabled with -funsafe-math-optimizations -freciprocal-math
and disabled for -Os since it increases code size .  */
 
-(define_expand "div3"
+(define_expand "div3"
   [(set (match_operand:VCVTF 0 "s_register_operand")
 (div:VCVTF (match_operand:VCVTF 1 "s_register_operand")
 		  (match_operand:VCVTF 2 "s_register_operand")))]
-  "TARGET_NEON && !optimize_size
+  "ARM_HAVE_NEON__ARITH && !optimize_size
&& flag_reciprocal_math"
   {
 rtx rec = gen_reg_rtx (mode);
diff --git a/gcc/testsuite/gcc.target/arm/neon-recip-div-1.c b/gcc/testsuite/gcc.target/arm/neon-recip-div-1.c
new file mode 100644
index 000..e15c3ca5fe9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon-recip-div-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O3 -freciprocal-math -fno-unsafe-math-optimizations -save-temps" } */
+/* { dg-add-options arm_neon } */
+
+int *a;
+int n;
+void b() {
+  int c;
+  for (c = 0; c < 10; c++)
+a[c] = (float)c / n;
+}
+/* We should not ICE, or get a vectorized reciprocal instruction when unsafe
+   math optimizations are disabled.  */
+/* { dg-final { scan-assembler-not "vrecpe\\.f32\\t\[qd\].*" } } */
+/* { dg-final { scan-assembler-not "vrecps\\.f32\\t\[qd\].*" } } */

Re: [PATCH v1 00/13] Add aarch64-w64-mingw32 target

2024-02-22 Thread Richard Earnshaw (lists)

On 21/02/2024 17:47, Evgeny Karpov wrote:
> Hello,
> 
> We would like to draw your attention to the review of changes for the
> new GCC target, aarch64-w64-mingw32. The new target will be
> supported, tested, added to CI, and maintained by Linaro. This marks
> the first of three planned patch series contributing to the GCC C
> compiler's support for Windows Arm64.
> 
> 1. Minimal aarch64-w64-mingw32 C implementation to cross-compile
> hello-world with libgcc for Windows Arm64 using MinGW.
> 2. Extension of the aarch64-w64-mingw32 C implementation to
> cross-compile OpenSSL, OpenBLAS, FFmpeg, and libjpeg-turbo. All
> packages successfully pass tests.
> 3. Addition of call stack support for debugging, resolution of
> optimization issues in the C compiler, and DLL export/import for the
> aarch64-w64-mingw32 target.
> 
> This patch series introduces the 1st point, which involves building
> hello-world for the aarch64-w64-mingw32 target. The patch depends on
> the binutils changes for the aarch64-w64-mingw32 target that have
> already been merged.
> 
> The binutils should include recent relocation fixes.
> f87eaf8ff3995a5888c6dc4996a20c770e6bcd36
> aarch64: Add new relocations and limit COFF AArch64 relocation offsets
> 
> The series is structured in a way to trivially show that it should not
> affect any other targets.
> 
> In this patch, several changes have been made to support the
> aarch64-w64-mingw32 target for GCC. The modifications include the
> definition of the MS ABI for aarch64, adjustments to FIXED_REGISTERS
> and STATIC_CHAIN_REGNUM for different ABIs, and specific definitions
> for COFF format on AArch64. Additionally, the patch reuses MinGW
>  types and definitions from i386, relocating them to a new
> mingw folder for shared usage between both targets.
> 
> MinGW-specific options have been introduced for AArch64, along with
> override options for aarch64-w64-mingw32. Builtin stack probing for
> AArch64 has been enabled as an alternative for chkstk. Symbol name
> encoding and section information handling for aarch64-w64-mingw32 have
> been incorporated, and the MinGW environment has been added, which
> will also be utilized for defining the Cygwin environment in the
> future.
> 
> The patch includes renaming "x86 Windows Options" to "Cygwin and MinGW
> Options," which now encompasses AArch64 as well. AArch64-specific
> Cygwin and MinGW Options have been introduced for the unique
> requirements of the AArch64 architecture.
> 
> Function type declaration and named sections support have been added.
> The necessary objects for Cygwin and MinGW have been built for the
> aarch64-w64-mingw32 target, and relevant files such as msformat-c.cc
> and winnt-d.cc have been moved to the mingw folder for reuse in
> AArch64.
> 
> Furthermore, the aarch64-w64-mingw32 target has been included in both
> libatomic and libgcc, ensuring support for the AArch64 architecture
> within these libraries. These changes collectively enhance the
> capabilities of GCC for the specified target.
> 
> Coauthors: Zac Walker ,
> Mark Harmstone   and
> Ron Riddle 
> 
> Refactored, prepared, and validated by 
> Radek Barton  and 
> Evgeny Karpov 
> 
> Special thanks to the Linaro GNU toolchain team for internal review
> and assistance in preparing the patch series!
> 
> Regards,
> Evgeny

Thanks for posting this.

I've only read quickly through this patch series and responded where I think 
some action is obviously required.  That doesn't necessarily mean the other 
patches are perfect, though, just that nothing immediately caught my attention.

R.

> 
> 
> Zac Walker (13):
>   Introduce aarch64-w64-mingw32 target
>   aarch64: The aarch64-w64-mingw32 target implements the MS ABI
>   aarch64: Mark x18 register as a fixed register for MS ABI
>   aarch64: Add aarch64-w64-mingw32 COFF
>   Reuse MinGW from i386 for AArch64
>   Rename section and encoding functions from i386 which will be used in
> aarch64
>   Exclude i386 functionality from aarch64 build
>   aarch64: Add Cygwin and MinGW environments for AArch64
>   aarch64: Add SEH to machine_function
>   Rename "x86 Windows Options" to "Cygwin and MinGW Options"
>   aarch64: Build and add objects for Cygwin and MinGW for AArch64
>   aarch64: Add aarch64-w64-mingw32 target to libatomic
>   Add aarch64-w64-mingw32 target to libgcc
> 
>  fixincludes/mkfixinc.sh   |   3 +-
>  gcc/config.gcc|  47 +++--
>  gcc/config/aarch64/aarch64-coff.h |  92 +
>  gcc/config/aarch64/aarch64-opts.h |   7 +
>  gcc/config/aarch64/aarch64-protos.h   |   5 +
>  gcc/config/aarch64/aarch64.h  |  25 ++-
>  gcc/config/aarch64/cygming.h  | 178 ++
>  gcc/config/i386/cygming.h |  18 +-
>  gcc/config/i386/cygming.opt.urls  |  30 ---
>  gcc/config/i386/i386-protos.h

Re: [PATCH v1 13/13] Add aarch64-w64-mingw32 target to libgcc

2024-02-22 Thread Richard Earnshaw (lists)

On 21/02/2024 18:40, Evgeny Karpov wrote:
> 
+aarch64-*-mingw*)

This doesn't match the glob pattern you added to config.gcc in an earlier 
patch, but see my comment on that.  The two should really be consistent with 
each other or you might get build failures later on.

R.

Re: [PATCH v1 10/13] Rename "x86 Windows Options" to "Cygwin and MinGW Options"

2024-02-22 Thread Richard Earnshaw (lists)

On 21/02/2024 18:38, Evgeny Karpov wrote:
> 
For this change you might want to put some form of re-direct in the manual 
under the old name so that anybody used to looking for the old entry will know 
where things have been moved to.  Something like

x86 Windows Options
  See xref(Cygwin and MinGW Options).

R.

Re: [PATCH v1 08/13] aarch64: Add Cygwin and MinGW environments for AArch64

2024-02-22 Thread Richard Earnshaw (lists)

On 21/02/2024 18:36, Evgeny Karpov wrote:
> 
+/* GNU as supports weak symbols on PECOFF.  */
+#ifdef HAVE_GAS_WEAK

Can't we assume this is true?  It was most likely needed on i386 because 
support goes back longer than the assembler had this feature, but it looks like 
it was added in 2000, or thereabouts, so significantly before aarch64 was 
supported in the assembler.

+#ifndef HAVE_GAS_ALIGNED_COMM

And this was added to GCC in 2009, which probably means it predates 
aarch64-coff support in gas as well.

R.

Re: [PATCH v1 03/13] aarch64: Mark x18 register as a fixed register for MS ABI

2024-02-22 Thread Richard Earnshaw (lists)

On 21/02/2024 18:30, Evgeny Karpov wrote:
> 
+   tm_defines="${tm_defines} TARGET_ARM64_MS_ABI=1"

I missed this on first reading...

The GCC port name uses AARCH64, please use that internally rather than other 
names.  The only time when we should be using ARM64 is when it's needed for 
compatibility with other compilers and that doesn't apply here AFAICT.

R.

Re: [PATCH v1 03/13] aarch64: Mark x18 register as a fixed register for MS ABI

2024-02-22 Thread Richard Earnshaw (lists)

On 21/02/2024 18:30, Evgeny Karpov wrote:
> 
+/* X18 reserved for the TEB on Windows.  */
+#ifdef TARGET_ARM64_MS_ABI
+# define FIXED_X18 1
+# define CALL_USED_X18 0
+#else
+# define FIXED_X18 0
+# define CALL_USED_X18 1
+#endif

I'm not overly keen on ifdefs like this (and the one below); it can get quite 
confusing if we have to support more than a couple of ABIs.  Perhaps we could 
create a couple of new headers, one for the EABI (which all existing targets 
would then need to include) and one for the MS ABI.  Then the mingw port would 
use that instead of the EABI header.

An alternative is to make all this dynamic, based on the setting of the 
aarch64_calling_abi enum and to make the adjustments in 
aarch64_conditional_register_usage.

+# define CALL_USED_X18 0

Is that really correct?  If the register is really reserved, but some code 
modifies it anyway, this will cause the compiler to restore the old value at 
the end of a function; generally, for a reserved register, code that knows what 
it's doing would want to make permanent changes to this value.

+#ifdef TARGET_ARM64_MS_ABI
+# define STATIC_CHAIN_REGNUM   R17_REGNUM
+#else
+# define STATIC_CHAIN_REGNUM   R18_REGNUM
+#endif

If we went the enum way, we'd want something like

#define STATIC_CHAIN_REGNUM (calling_abi == AARCH64_CALLING_ABI_MS ? R17_REGNUM 
: R18_REGNUM)
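
As a standalone illustration of the enum-based alternative, the sketch below
models the idea outside GCC.  It is not GCC code: struct reg_usage, the
register-count constant and both function names are hypothetical stand-ins,
and marking x18 as both fixed and call-used for the MS ABI is an assumption
that follows the CALL_USED_X18 concern raised above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC internals; not the real definitions.  */
enum aarch64_calling_abi
{
  AARCH64_CALLING_ABI_EABI,
  AARCH64_CALLING_ABI_MS
};

enum { R17_REGNUM = 17, R18_REGNUM = 18, NUM_GP_REGS = 31 };

/* Toy replacement for the fixed_regs / call_used_regs tables.  */
struct reg_usage
{
  bool fixed[NUM_GP_REGS];
  bool call_used[NUM_GP_REGS];
};

/* Select the static chain register from the ABI at run time instead of
   with #ifdef.  */
int
static_chain_regnum (enum aarch64_calling_abi abi)
{
  return abi == AARCH64_CALLING_ABI_MS ? R17_REGNUM : R18_REGNUM;
}

/* Model of adjusting register usage in a conditional-register-usage hook:
   for the MS ABI, x18 is reserved, so mark it fixed.  It is also marked
   call-used here (an assumption) so that it is never saved and restored.  */
void
conditional_register_usage (struct reg_usage *usage,
			    enum aarch64_calling_abi abi)
{
  if (abi == AARCH64_CALLING_ABI_MS)
    {
      usage->fixed[R18_REGNUM] = true;
      usage->call_used[R18_REGNUM] = true;
    }
}
```

With this shape, supporting a further ABI would mean adding an enumerator and
a branch rather than another layer of preprocessor conditionals.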

R.

Re: [PATCH v1 02/13] aarch64: The aarch64-w64-mingw32 target implements

2024-02-22 Thread Richard Earnshaw (lists)

On 21/02/2024 18:26, Evgeny Karpov wrote:
> 
+/* Available call ABIs.  */
+enum calling_abi
+{
+  AARCH64_EABI = 0,
+  MS_ABI = 1
+};
+

The convention in this file seems to be for all enum types to start with 
aarch64.  Also, the enumeration values should start with the name of the 
enumeration type in upper case, so:

enum aarch64_calling_abi
{
  AARCH64_CALLING_ABI_EABI,
  AARCH64_CALLING_ABI_MS
};

or something very much like that.

R.

Re: [PATCH v1 01/13] Introduce aarch64-w64-mingw32 target

2024-02-22 Thread Richard Earnshaw (lists)

On 21/02/2024 18:16, Evgeny Karpov wrote:
> 
+aarch64*-*-mingw*)

Other targets are a bit inconsistent here as well, but, as Andrew mentioned, if 
you don't want to handle big-endian, it might be better to match 
aarch64-*-mingw* here.
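
To make the difference between the two globs concrete, here is a small
sketch using POSIX fnmatch, which follows the same shell-style matching rules
as a case label.  The helper name triplet_matches is hypothetical, and
aarch64_be-w64-mingw32 is used only as an example of a big-endian triplet,
not as a target anybody has proposed.

```c
#include <fnmatch.h>
#include <stdbool.h>

/* Return true if TRIPLET matches the shell-style PATTERN, in the same way
   a case label in a configure script would match it.  */
bool
triplet_matches (const char *pattern, const char *triplet)
{
  return fnmatch (pattern, triplet, 0) == 0;
}
```

Here "aarch64*-*-mingw*" accepts both aarch64-w64-mingw32 and
aarch64_be-w64-mingw32, while "aarch64-*-mingw*" accepts only the former.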

R.

Re: [PATCH v1 05/13] Reuse MinGW from i386 for AArch64

2024-02-22 Thread Richard Earnshaw (lists)

On 21/02/2024 21:34, rep.dot@gmail.com wrote:
> On 21 February 2024 19:34:43 CET, Evgeny Karpov  
> wrote:
>>
> 
> Please use git send-email. Your mail ends up as empty as here, otherwise.

I don't see anything wrong with it; neither does patchwork 
(https://patchwork.sourceware.org/project/gcc/list/?series=31191) nor does the 
Linaro CI bot.  So perhaps it's your mailer that's misconfigured.

> 
> The ChangeLog has to be expressed in present tense, as mandated by the 
> standard; s/Moved/Move/g etc.

Agreed, but that's a detail that we can get to once the patch has been properly 
reviewed.

> 
> In any sane world ( and in gcc ) to fold, respectively a folder, is something 
> else compared to a directory ( which you probably mean when moving a file 
> from one directory to another directory as you seem to do ).
> 
> Most of the free world has left COFF behind since several decades, so I won't 
> comment on that. YMMV.

This isn't helpful.  Windows platforms use (a derivative of) COFF, so that's 
what the tools need to use when targeting that platform.

R.

Re: [PATCH] ARM: Fix conditional execution [PR113915]

2024-02-21 Thread Richard Earnshaw (lists)

On 21/02/2024 14:34, Wilco Dijkstra wrote:
> 
> By default most patterns can be conditionalized on Arm targets.  However
> Thumb-2 predication requires the "predicable" attribute be explicitly
> set to "yes".  Most patterns are shared between Arm and Thumb(-2) and are
> marked with "predicable".  Given this sharing, it does not make sense to
> use a different default for Arm.  So only consider conditional execution
> of instructions that have the predicable attribute set to yes.  This ensures
> that patterns not explicitly marked as such are never accidentally 
> conditionally executed like in the PR.
> 
> GLIBC codesize was ~0.014% worse due to atomic operations now being
> unconditional and a small number of patterns not setting "predicable".
> 
> Passes regress and bootstrap, OK for commit?
> 
> gcc/ChangeLog:
> PR target/113915
> * config/arm/arm.md (NOCOND): Improve comment.
> * config/arm/arm.cc (arm_final_prescan_insn): Add check for
> PREDICABLE_YES.
> 
> gcc/testsuite/ChangeLog:
> PR target/113915
> * gcc.target/arm/builtin-bswap-1.c: Fix test.
> 
> ---
> 
> diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
> index 
> c44047c377a802d0c1dc1406df1b88a6b079607b..29771d284831a995adcf9adbb525396fbabb1ea2
>  100644
> --- a/gcc/config/arm/arm.cc
> +++ b/gcc/config/arm/arm.cc
> @@ -25610,11 +25610,12 @@ arm_final_prescan_insn (rtx_insn *insn)
> break;
>  
>   case INSN:
> -   /* Instructions using or affecting the condition codes make it
> -  fail.  */
> +   /* Check the instruction is explicitly marked as predicable.
> +  Instructions using or affecting the condition codes are not.  
> */
> scanbody = PATTERN (this_insn);
> if (!(GET_CODE (scanbody) == SET
>   || GET_CODE (scanbody) == PARALLEL)
> +   || get_attr_predicable (this_insn) != PREDICABLE_YES
> || get_attr_conds (this_insn) != CONDS_NOCOND)
>   fail = TRUE;
> break;
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index 
> 5816409f86f1106b410c5e21d77e599b485f85f2..671f093862259c2c0df93a986fc22fa56a8ea6c7
>  100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -307,6 +307,8 @@
>  ;
>  ; NOCOND means that the instruction does not use or alter the condition
>  ;   codes but can be converted into a conditionally exectuted instruction.
> +;   Given that NOCOND is the default for most instructions if omitted,
> +;   the attribute predicable must be set to yes as well.
>  
>  (define_attr "conds" "use,set,clob,unconditional,nocond"
>   (if_then_else

While this is ok, 

> diff --git a/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c 
> b/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c
> index 
> c1e7740d14d3ca4e93a71e38b12f82c19791a204..3de7cea81c1128c2fe5a9e1216e6b027d26bcab9
>  100644
> --- a/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c
> +++ b/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c
> @@ -5,14 +5,8 @@
> of the instructions.  Add an -mtune option known to facilitate that.  */
>  /* { dg-additional-options "-O2 -mtune=cortex-a53" } */
>  /* { dg-final { scan-assembler-not "orr\[ \t\]" } } */
> -/* { dg-final { scan-assembler-times "revsh\\t" 1 { target { arm_nothumb } } 
> } }  */
> -/* { dg-final { scan-assembler-times "revshne\\t" 1 { target { arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "revsh\\t" 2 { target { ! arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "rev16\\t" 1 { target { arm_nothumb } } 
> } }  */
> -/* { dg-final { scan-assembler-times "rev16ne\\t" 1 { target { arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "rev16\\t" 2 { target { ! arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "rev\\t" 2 { target { arm_nothumb } } } 
> }  */
> -/* { dg-final { scan-assembler-times "revne\\t" 2 { target { arm_nothumb } } 
> } }  */
> -/* { dg-final { scan-assembler-times "rev\\t" 4 { target { ! arm_nothumb } } 
> } }  */
> +/* { dg-final { scan-assembler-times "revsh\\t" 2 } }  */
> +/* { dg-final { scan-assembler-times "rev16\\t" 2 } }  */
> +/* { dg-final { scan-assembler-times "rev\\t" 4 } }  */
>  
>  #include "builtin-bswap.x"

This bit isn't.  The correct fix here is to fix the pattern(s) concerned to add 
the missing predicate.

Note that builtin-bswap.x explicitly mentions predicated mnemonics in the 
comments.
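
For reference, the condition that the arm.cc hunk quoted above ends up with
can be modelled in isolation.  This is only a sketch: struct insn_model and
the enums are simplified, hypothetical stand-ins for the RTL body code and
the "predicable" and "conds" attributes, not the real GCC types.

```c
#include <stdbool.h>

/* Simplified, hypothetical model of the properties the check looks at.  */
enum body_code { BODY_SET, BODY_PARALLEL, BODY_OTHER };
enum predicable_attr { PREDICABLE_NO, PREDICABLE_YES };
enum conds_attr
{
  CONDS_USE, CONDS_SET, CONDS_CLOB, CONDS_UNCONDITIONAL, CONDS_NOCOND
};

struct insn_model
{
  enum body_code body;
  enum predicable_attr predicable;
  enum conds_attr conds;
};

/* Mirror of the patched test: conditionalization fails unless the body is
   a SET or PARALLEL, the insn is explicitly predicable, and it neither
   uses nor alters the condition codes.  */
bool
insn_blocks_conditionalization (const struct insn_model *insn)
{
  return !(insn->body == BODY_SET || insn->body == BODY_PARALLEL)
	 || insn->predicable != PREDICABLE_YES
	 || insn->conds != CONDS_NOCOND;
}
```

Under this model a pattern that leaves "predicable" unset is never
conditionalized, even though its "conds" attribute defaults to nocond.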

R.

Re: [PATCH]AArch64: xfail modes_1.f90 [PR107071]

2024-02-19 Thread Richard Earnshaw (lists)

On 19/02/2024 10:58, Tamar Christina wrote:
>> -Original Message-
>> From: Tamar Christina
>> Sent: Thursday, February 15, 2024 11:05 AM
>> To: Richard Earnshaw (lists) ; gcc-
>> patc...@gcc.gnu.org
>> Cc: nd ; Marcus Shawcroft ; Kyrylo
>> Tkachov ; Richard Sandiford
>> 
>> Subject: RE: [PATCH]AArch64: xfail modes_1.f90 [PR107071]
>>
>>> -Original Message-
>>> From: Richard Earnshaw (lists) 
>>> Sent: Thursday, February 15, 2024 11:01 AM
>>> To: Tamar Christina ; gcc-patches@gcc.gnu.org
>>> Cc: nd ; Marcus Shawcroft ;
>> Kyrylo
>>> Tkachov ; Richard Sandiford
>>> 
>>> Subject: Re: [PATCH]AArch64: xfail modes_1.f90 [PR107071]
>>>
>>> On 15/02/2024 10:57, Tamar Christina wrote:
>>>> Hi All,
>>>>
>>>> This test has never worked on AArch64 since the day it was committed.  It 
>>>> has
>>>> a number of issues that prevent it from working on AArch64:
>>>>
>>>> 1.  IEEE does not require that FP operations raise a SIGFPE for FP 
>>>> operations,
>>>>     only that an exception is raised somehow.
>>>>
>>>> 2. Most Arm designed cores don't raise SIGFPE and instead set a status 
>>>> register
>>>>    and some partner cores raise a SIGILL instead.
>>>>
>>>> 3. The way it checks for feenableexcept doesn't really work for AArch64.
>>>>
>>>> As such this test doesn't seem to really provide much value on AArch64 so 
>>>> we
>>>> should just xfail it.
>>>>
>>>> Regtested on aarch64-none-linux-gnu and no issues.
>>>>
>>>> Ok for master?
>>>
>>> Wouldn't it be better to just skip the test?  XFAIL just adds clutter to 
>>> verbose
>> output
>>> and suggests that someday the tools might be fixed for this case.
>>>
>>> Better still would be a new dg-requires fp_exceptions_raise_sigfpe as a 
>>> guard for
>>> the test.
>>
> 
> It looks like this is similar to 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78314 so
> I'll just similarly skip it.
> 
> --- inline copy of patch ---
> 
> diff --git a/gcc/testsuite/gfortran.dg/ieee/modes_1.f90 
> b/gcc/testsuite/gfortran.dg/ieee/modes_1.f90
> index 
> 205c47f38007d06116289c19d6b23cf3bf83bd48..e29d8c678e6e51c3f2e5dac53c7703bb18a99ac4
>  100644
> --- a/gcc/testsuite/gfortran.dg/ieee/modes_1.f90
> +++ b/gcc/testsuite/gfortran.dg/ieee/modes_1.f90
> @@ -1,5 +1,5 @@
>  ! { dg-do run }
> -!
> +! { dg-skip-if "PR libfortran/78314" { aarch64*-*-gnu* arm*-*-gnueabi 
> arm*-*-gnueabihf } }
>  ! Test IEEE_MODES_TYPE, IEEE_GET_MODES and IEEE_SET_MODES
>  
> Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?

OK, but please give the fortran maintainers 24hrs to comment before pushing.

R.

> 
> Thanks,
> Tamar
> 
> gcc/testsuite/ChangeLog:
> 
>   PR fortran/107071
>   * gfortran.dg/ieee/modes_1.f90: skip aarch64, arm.

Re: [PATCH][GCC][Arm] Missing optimization pattern for rev16 on architectures with thumb1

2024-02-15 Thread Richard Earnshaw (lists)

On 12/02/2024 13:48, Matthieu Longo wrote:
> This patch marks a rev16 test as XFAIL for architectures having only Thumb1 
> support. The generated code is functionally correct, but the optimization is 
> disabled when -mthumb is equivalent to Thumb1. Fixing the root issue would 
> require changes that are not suitable for GCC14 stage 4.
> 
> More information at https://linaro.atlassian.net/browse/GNU-1141
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.target/arm/rev16_2.c: XFAIL when compiled with Thumb1.

Thanks, I've tweaked the commit message slightly and pushed this.

R.

Re: [PATCH]AArch64: xfail modes_1.f90 [PR107071]

2024-02-15 Thread Richard Earnshaw (lists)

On 15/02/2024 10:57, Tamar Christina wrote:
> Hi All,
> 
> This test has never worked on AArch64 since the day it was committed.  It has
> a number of issues that prevent it from working on AArch64:
> 
> 1.  IEEE does not require that FP operations raise a SIGFPE for FP operations,
>     only that an exception is raised somehow.
> 
> 2. Most Arm designed cores don't raise SIGFPE and instead set a status 
> register
>    and some partner cores raise a SIGILL instead.
> 
> 3. The way it checks for feenableexcept doesn't really work for AArch64.
> 
> As such this test doesn't seem to really provide much value on AArch64 so we
> should just xfail it.
> 
> Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?

Wouldn't it be better to just skip the test?  XFAIL just adds clutter to 
verbose output and suggests that someday the tools might be fixed for this case.

Better still would be a new dg-requires fp_exceptions_raise_sigfpe as a guard 
for the test.

R.

> 
> Thanks,
> Tamar
> 
> gcc/testsuite/ChangeLog:
> 
>     PR fortran/107071
>     * gfortran.dg/ieee/modes_1.f90: xfail aarch64.
> 
> --- inline copy of patch --
> diff --git a/gcc/testsuite/gfortran.dg/ieee/modes_1.f90 
> b/gcc/testsuite/gfortran.dg/ieee/modes_1.f90
> index 
> 205c47f38007d06116289c19d6b23cf3bf83bd48..3667571969427ae7b2b96684ec1af8b3fdd4985f
>  100644
> --- a/gcc/testsuite/gfortran.dg/ieee/modes_1.f90
> +++ b/gcc/testsuite/gfortran.dg/ieee/modes_1.f90
> @@ -1,4 +1,4 @@
> -! { dg-do run }
> +! { dg-do run { xfail { aarch64*-*-* } } }
>  !
>  ! Test IEEE_MODES_TYPE, IEEE_GET_MODES and IEEE_SET_MODES
>  
> 
> 
> 
> 
> --

Re: [PATCH] Arm: Fix incorrect tailcall-generation for indirect calls [PR113780]

2024-02-14 Thread Richard Earnshaw (lists)

On 14/02/2024 09:20, Tejas Belagod wrote:
> On 2/7/24 11:41 PM, Richard Earnshaw (lists) wrote:
>> On 07/02/2024 07:59, Tejas Belagod wrote:
>>> This patch fixes a bug that causes indirect calls in PAC-enabled functions
>>> to be tailcalled incorrectly when all argument registers R0-R3 are used.
>>>
>>> Tested on arm-none-eabi for armv8.1-m.main. OK for trunk?
>>>
>>> 2024-02-07  Tejas Belagod  
>>>
>>> PR target/113780
>>> * gcc/config/arm.cc (arm_function_ok_for_sibcall): Don't allow tailcalls
>>>   for indirect calls with 4 or more arguments in pac-enabled functions.
>>>
>>> * gcc.target/arm/pac-sibcall.c: New.
>>> ---
>>>   gcc/config/arm/arm.cc  | 12 
>>>   gcc/testsuite/gcc.target/arm/pac-sibcall.c | 11 +++
>>>   2 files changed, 19 insertions(+), 4 deletions(-)
>>>   create mode 100644 gcc/testsuite/gcc.target/arm/pac-sibcall.c
>>>
>>> diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
>>> index c44047c377a..c1f8286a4d4 100644
>>> --- a/gcc/config/arm/arm.cc
>>> +++ b/gcc/config/arm/arm.cc
>>> @@ -7980,10 +7980,14 @@ arm_function_ok_for_sibcall (tree decl, tree exp)
>>>     && DECL_WEAK (decl))
>>>   return false;
>>>   -  /* We cannot do a tailcall for an indirect call by descriptor if all 
>>> the
>>> - argument registers are used because the only register left to load the
>>> - address is IP and it will already contain the static chain.  */
>>> -  if (!decl && CALL_EXPR_BY_DESCRIPTOR (exp) && !flag_trampolines)
>>> +  /* We cannot do a tailcall for an indirect call by descriptor or for an
>>> + indirect call in a pac-enabled function if all the argument registers
>>> + are used because the only register left to load the address is IP and
>>> + it will already contain the static chain or the PAC signature in the
>>> + case of PAC-enabled functions.  */
>>
>> This comment is becoming a bit unwieldy.  I suggest restructuring it as:
>>
>> We cannot tailcall an indirect call by descriptor if all the call-clobbered
>> general registers are live (r0-r3 and ip).  This can happen when:
>>    - IP contains the static chain, or
>>    - IP is needed for validating the PAC signature.
>>
>>
>>> +  if (!decl
>>> +  && ((CALL_EXPR_BY_DESCRIPTOR (exp) && !flag_trampolines)
>>> +  || arm_current_function_pac_enabled_p()))
>>>   {
>>>     tree fntype = TREE_TYPE (TREE_TYPE (CALL_EXPR_FN (exp)));
>>>     CUMULATIVE_ARGS cum;
>>> diff --git a/gcc/testsuite/gcc.target/arm/pac-sibcall.c 
>>> b/gcc/testsuite/gcc.target/arm/pac-sibcall.c
>>> new file mode 100644
>>> index 000..c57bf7a952c
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/arm/pac-sibcall.c
>>> @@ -0,0 +1,11 @@
>>> +/* Testing return address signing.  */
>>> +/* { dg-do compile } */
>>> +/* { dg-require-effective-target mbranch_protection_ok } */
>>> +/* { dg-options " -mcpu=cortex-m85 -mbranch-protection=pac-ret+leaf -O2" } 
>>> */
>>
>> No, you can't just add options like this, you need to first check that they 
>> won't result in conflicts with other options on the command line.  See 
>> https://gcc.gnu.org/pipermail/gcc-patches/2024-January/644077.html for an 
>> example of how to handle this.
>>
> Thanks for the review, Richard. Respin attached.
> 
> Thanks,
> Tejas.
> 
>>> +
>>> +void fail(void (*f)(int, int, int, int))
>>> +{
>>> +  f(1, 2, 3, 4);
>>> +}
>>> +
>>> +/* { dg-final { scan-assembler-not "bx\tip\t@ indirect register sibling 
>>> call" } } */
>>
>> R.
>>
+++ b/gcc/testsuite/gcc.target/arm/pac-sibcall.c
@@ -0,0 +1,14 @@
+/* If all call-clobbered general registers are live (r0-r3, ip), disable
+   indirect tail-call for a PAC-enabled function.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target mbranch_protection_ok } */
This only checks if -mbranch-protection can work with the existing 
architecture/cpu; not with the flags you're about to add below.  You should 
check for arm_arch_v8_1m_main_pacbti_ok instead; then you can assume that 
-mbranch-protection can be added.

+/* { dg-add-options arm_arch_v8_1m_main_pacbti } */
+/* { dg-additional-options "-mbranch-protection=pac-ret+leaf -O2" } */

Otherwise this is OK if you fix the above.
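
Putting the directives discussed above together, the test might end up looking
roughly like the sketch below.  The effective-target name
arm_arch_v8_1m_main_pacbti_ok is taken from the comment above; the function
body and the final scan pattern are copied from the earlier revision of the
patch, and the exact layout is an assumption.

```c
/* If all call-clobbered general registers are live (r0-r3, ip), disable
   indirect tail-call for a PAC-enabled function.  */

/* { dg-do compile } */
/* { dg-require-effective-target arm_arch_v8_1m_main_pacbti_ok } */
/* { dg-add-options arm_arch_v8_1m_main_pacbti } */
/* { dg-additional-options "-mbranch-protection=pac-ret+leaf -O2" } */

/* All four argument registers are used, so no free register is left for
   the call address in a PAC-enabled function.  */
void fail(void (*f)(int, int, int, int))
{
  f(1, 2, 3, 4);
}

/* { dg-final { scan-assembler-not "bx\tip\t@ indirect register sibling call" } } */
```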

R.

Re: [PATCH] testsuite: Disable test for incompatible Arm targets

2024-02-13 Thread Richard Earnshaw





On 13/02/2024 10:44, Torbjörn SVENSSON wrote:

Ok for trunk and releases/gcc-13?

The alternative approach (that is changing the result a bit) is to drop
the special treatment for arm*-*-*. I'm not sure whether this is preferred, or
whether to just disable the test for incompatible flags for arm*-*-*.

--

The test assumes it's okay to supply -march=armv7-a+simd, but it depends
on what target you are running the tests for.  For example, running the
GCC testsuite for Cortex-M0 produces the following entry in the logs:


Running the testsuite with -mcpu= in runtest/site.exp flags will uncover 
a whole host of problems with tests that try to specify an architecture. 
 It's essentially broken/unsupported at present.


I have some ideas for how to fix this properly, but they will have to 
wait for gcc-15 now.  In the mean time, I'd rather we didn't try to 
paper over the problem by putting random changes into the tests right now.


R.



Testing gcc.dg/pr41574.c
doing compile
Executing on host: arm-none-eabi-gcc .../pr41574.c  -mthumb -march=armv6s-m 
-mcpu=cortex-m0 -mfloat-abi=soft   -fdiagnostics-plain-output  -O2 
-march=armv7-a -mfloat-abi=softfp -mfpu=neon -fno-unsafe-math-optimizations 
-fdump-rtl-combine -ffat-lto-objects -S -o pr41574.s(timeout = 800)
spawn -ignore SIGHUP arm-none-eabi-gcc .../pr41574.c -mthumb -march=armv6s-m 
-mcpu=cortex-m0 -mfloat-abi=soft -fdiagnostics-plain-output -O2 -march=armv7-a 
-mfloat-abi=softfp -mfpu=neon -fno-unsafe-math-optimizations -fdump-rtl-combine 
-ffat-lto-objects -S -o pr41574.s
pid is 9799 -9799
cc1: warning: switch '-mcpu=cortex-m0' conflicts with switch 
'-march=armv7-a+simd'
pid is -1
output is cc1: warning: switch '-mcpu=cortex-m0' conflicts with switch 
'-march=armv7-a+simd'
  status 0
FAIL: gcc.dg/pr41574.c (test for excess errors)
Excess errors:
cc1: warning: switch '-mcpu=cortex-m0' conflicts with switch 
'-march=armv7-a+simd'
PASS: gcc.dg/pr41574.c scan-rtl-dump-not combine "\\(plus:DF \\(mult:DF"

Patch has been verified on Linux.

gcc/testsuite/ChangeLog:

* gcc.dg/pr41574.c: Disable test for Arm targets incompatible
with -march=armv7-a+simd.

Signed-off-by: Torbjörn SVENSSON 
---
  gcc/testsuite/gcc.dg/pr41574.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/pr41574.c b/gcc/testsuite/gcc.dg/pr41574.c
index 062c0044532..f6af0c34273 100644
--- a/gcc/testsuite/gcc.dg/pr41574.c
+++ b/gcc/testsuite/gcc.dg/pr41574.c
@@ -1,6 +1,7 @@
  /* { dg-do compile } */
-/* { dg-options "-O2 -march=armv7-a -mfloat-abi=softfp -mfpu=neon 
-fno-unsafe-math-optimizations -fdump-rtl-combine" { target { arm*-*-* } } } */
-/* { dg-options "-O2 -fno-unsafe-math-optimizations -fdump-rtl-combine" { 
target { ! arm*-*-* } } } */
+/* { dg-options "-O2 -fno-unsafe-math-optimizations -fdump-rtl-combine" } */
+/* { dg-require-effective-target arm_arch_v7a_neon_multilib { target { 
arm*-*-* } } } */
+/* { dg-additional-options "-march=armv7-a -mfloat-abi=softfp -mfpu=neon" { 
target { arm*-*-* } } } */
  
  
  static const double one=1.0;

Re: [PATCH] testsuite, arm: Fix testcase arm/pr112337.c to check for the options first

2024-02-09 Thread Richard Earnshaw (lists)

On 30/01/2024 17:07, Saurabh Jha wrote:
> Hey,
> 
> Previously, this test was added to fix this bug: 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112337. However, it did not 
> check the compilation options before using them, leading to errors.
> 
> This patch fixes the test by first checking whether it can use the options 
> before using them.
> 
> Tested for arm-none-eabi and found no regressions. The output of check-gcc 
> with RUNTESTFLAGS="arm.exp=*" changed like this:
> 
> Before:
> # of expected passes  5963
> # of unexpected failures  64
> 
> After:
> # of expected passes  5964
> # of unexpected failures  63
> 
> Ok for master?
> 
> Regards,
> Saurabh
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.target/arm/pr112337.c: Check whether we can use the compilation 
> options before using them.

My apologies for missing this earlier.  It didn't show up in patchwork. That's 
most likely because the attachment is a binary blob instead of text/plain.  
That also means that the Linaro CI system hasn't seen this patch either.  
Please can you fix your mailer to add plain text patch files.

-/* { dg-options "-O2 -march=armv8.1-m.main+fp.dp+mve.fp -mfloat-abi=hard" } */
+/* { dg-require-effective-target arm_hard_ok } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-options "-O2 -mfloat-abi=hard" } */
+/* { dg-add-options arm_v8_1m_mve } */

This is moving in the right direction, but it adds more than necessary now: 
checking for, and adding -mfloat-abi=hard is not necessary any more as 
arm_v8_1m_mve_ok will work out what float-abi flags are needed to make the 
options work. (What's more, it will prevent the test from running if the base 
configuration of the compiler is incompatible with the hard float ABI, which is 
more than we need.)

So please can you re-spin removing the hard-float check and removing that from 
dg-options.
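
As a rough sketch of what I mean, the header of the test would then reduce to something like the following.  It uses only the directives already quoted above; the function body is a hypothetical placeholder, not the real contents of pr112337.c.

```c
/* { dg-require-effective-target arm_v8_1m_mve_ok } */
/* { dg-options "-O2" } */
/* { dg-add-options arm_v8_1m_mve } */

/* Hypothetical placeholder body; the real test body stays as it is.  */
int
placeholder (int x)
{
  return x + 1;
}
```

The float-abi choice is then left to arm_v8_1m_mve_ok, so the test is not skipped merely because the base configuration does not use the hard-float ABI.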

Thanks,
R.

[PATCH] arm: testsuite: fix issues relating to fp16 alternative testing

2024-02-08 Thread Richard Earnshaw


The v*_fp16_xN_1.c tests on Arm have been unstable since they were
added.  This is not a problem with the tests themselves, or even the
patches that were added, but with the testsuite infrastructure.  It
turned out that another set of dg- tests for fp16 were corrupting the
cached set of options used by the new tests, leading to running the
tests with incorrect flags.

So the primary goal of this patch is to fix the incorrect internal
caching of the options needed to enable fp16 alternative format on
Arm: the code was storing the result in the same variable that was
being used for neon_fp16 and this was leading to testsuite instability
for tests that were checking for neon with fp16.

But in cleaning this up I also noted that we weren't then applying the
flags correctly having detected what they were, so we also address
that.

I suspect there are still some further issues to address here, since
the framework does not correctly test that the multilibs and startup
code enable alternative format; but this is still an improvement over
what we had before.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp
(check_effective_target_arm_fp16_alternative_ok_nocache): Use
et_arm_fp16_alternative_flags to cache the result.  Improve test
for FP16 availability.
(add_options_for_arm_fp16_alternative): Use
et_arm_fp16_alternative_flags.
* g++.dg/ext/arm-fp16/arm-fp16-ops-3.C: Update dg-* flags.
* g++.dg/ext/arm-fp16/arm-fp16-ops-4.C: Likewise.
* gcc.dg/torture/arm-fp16-int-convert-alt.c: Likewise.
* gcc.dg/torture/arm-fp16-ops-3.c: Likewise.
* gcc.dg/torture/arm-fp16-ops-4.c: Likewise.
* gcc.target/arm/fp16-aapcs-3.c: Likewise.
* gcc.target/arm/fp16-aapcs-4.c: Likewise.
* gcc.target/arm/fp16-compile-alt-1.c: Likewise.
* gcc.target/arm/fp16-compile-alt-10.c: Likewise.
* gcc.target/arm/fp16-compile-alt-11.c: Likewise.
* gcc.target/arm/fp16-compile-alt-12.c: Likewise.
* gcc.target/arm/fp16-compile-alt-2.c: Likewise.
* gcc.target/arm/fp16-compile-alt-3.c: Likewise.
* gcc.target/arm/fp16-compile-alt-4.c: Likewise.
* gcc.target/arm/fp16-compile-alt-5.c: Likewise.
* gcc.target/arm/fp16-compile-alt-6.c: Likewise.
* gcc.target/arm/fp16-compile-alt-7.c: Likewise.
* gcc.target/arm/fp16-compile-alt-8.c: Likewise.
* gcc.target/arm/fp16-compile-alt-9.c: Likewise.
* gcc.target/arm/fp16-rounding-alt-1.c: Likewise.
---
 .../g++.dg/ext/arm-fp16/arm-fp16-ops-3.C |  2 +-
 .../g++.dg/ext/arm-fp16/arm-fp16-ops-4.C |  3 ++-
 .../gcc.dg/torture/arm-fp16-int-convert-alt.c|  2 +-
 gcc/testsuite/gcc.dg/torture/arm-fp16-ops-3.c|  2 +-
 gcc/testsuite/gcc.dg/torture/arm-fp16-ops-4.c|  3 ++-
 gcc/testsuite/gcc.target/arm/fp16-aapcs-3.c  |  3 ++-
 gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c  |  3 ++-
 .../gcc.target/arm/fp16-compile-alt-1.c  |  2 +-
 .../gcc.target/arm/fp16-compile-alt-10.c |  3 ++-
 .../gcc.target/arm/fp16-compile-alt-11.c |  3 ++-
 .../gcc.target/arm/fp16-compile-alt-12.c |  2 +-
 .../gcc.target/arm/fp16-compile-alt-2.c  |  2 +-
 .../gcc.target/arm/fp16-compile-alt-3.c  |  2 +-
 .../gcc.target/arm/fp16-compile-alt-4.c  |  2 +-
 .../gcc.target/arm/fp16-compile-alt-5.c  |  2 +-
 .../gcc.target/arm/fp16-compile-alt-6.c  |  2 +-
 .../gcc.target/arm/fp16-compile-alt-7.c  |  3 ++-
 .../gcc.target/arm/fp16-compile-alt-8.c  |  2 +-
 .../gcc.target/arm/fp16-compile-alt-9.c  |  2 +-
 .../gcc.target/arm/fp16-rounding-alt-1.c |  4 +++-
 gcc/testsuite/lib/target-supports.exp| 16 
 21 files changed, 37 insertions(+), 28 deletions(-)

diff --git a/gcc/testsuite/g++.dg/ext/arm-fp16/arm-fp16-ops-3.C b/gcc/testsuite/g++.dg/ext/arm-fp16/arm-fp16-ops-3.C
index 29080c7514f..5eceb3074df 100644
--- a/gcc/testsuite/g++.dg/ext/arm-fp16/arm-fp16-ops-3.C
+++ b/gcc/testsuite/g++.dg/ext/arm-fp16/arm-fp16-ops-3.C
@@ -1,6 +1,6 @@
 /* Test various operators on __fp16 and mixed __fp16/float operands.  */
 /* { dg-do run { target arm*-*-* } } */
 /* { dg-require-effective-target arm_fp16_alternative_ok } */
-/* { dg-options "-mfp16-format=alternative" } */
+/* { dg-add-options arm_fp16_alternative } */
 
 #include "arm-fp16-ops.h"
diff --git a/gcc/testsuite/g++.dg/ext/arm-fp16/arm-fp16-ops-4.C b/gcc/testsuite/g++.dg/ext/arm-fp16/arm-fp16-ops-4.C
index 4be8883faad..d86019f1469 100644
--- a/gcc/testsuite/g++.dg/ext/arm-fp16/arm-fp16-ops-4.C
+++ b/gcc/testsuite/g++.dg/ext/arm-fp16/arm-fp16-ops-4.C
@@ -1,6 +1,7 @@
 /* Test various operators on __fp16 and mixed __fp16/float operands.  */
 /* { dg-do run { target arm*-*-* } } */
 /* { dg-require-effective-target arm_fp16_alternative_ok } */
-/* { dg-options "-mfp16-format=alternative -ffast-math" } */
+/* { dg-options "-ffast-math" } */
+/* {

Re: [PATCH] Arm: Fix incorrect tailcall-generation for indirect calls [PR113780]

2024-02-07 Thread Richard Earnshaw (lists)

On 07/02/2024 07:59, Tejas Belagod wrote:
> This patch fixes a bug that causes indirect calls in PAC-enabled functions
> to be tailcalled incorrectly when all argument registers R0-R3 are used.
> 
> Tested on arm-none-eabi for armv8.1-m.main. OK for trunk?
> 
> 2024-02-07  Tejas Belagod  
> 
>   PR target/113780
>   * gcc/config/arm.cc (arm_function_ok_for_sibcall): Don't allow tailcalls
>   for indirect calls with 4 or more arguments in pac-enabled functions.
> 
>   * gcc.target/arm/pac-sibcall.c: New.
> ---
>  gcc/config/arm/arm.cc  | 12 
>  gcc/testsuite/gcc.target/arm/pac-sibcall.c | 11 +++
>  2 files changed, 19 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/pac-sibcall.c
> 
> diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
> index c44047c377a..c1f8286a4d4 100644
> --- a/gcc/config/arm/arm.cc
> +++ b/gcc/config/arm/arm.cc
> @@ -7980,10 +7980,14 @@ arm_function_ok_for_sibcall (tree decl, tree exp)
>&& DECL_WEAK (decl))
>  return false;
>  
> -  /* We cannot do a tailcall for an indirect call by descriptor if all the
> - argument registers are used because the only register left to load the
> - address is IP and it will already contain the static chain.  */
> -  if (!decl && CALL_EXPR_BY_DESCRIPTOR (exp) && !flag_trampolines)
> +  /* We cannot do a tailcall for an indirect call by descriptor or for an
> + indirect call in a pac-enabled function if all the argument registers
> + are used because the only register left to load the address is IP and
> + it will already contain the static chain or the PAC signature in the
> + case of PAC-enabled functions.  */

This comment is becoming a bit unwieldy.  I suggest restructuring it as:

We cannot tailcall an indirect call by descriptor if all the call-clobbered
general registers are live (r0-r3 and ip).  This can happen when:
  - IP contains the static chain, or
  - IP is needed for validating the PAC signature.


> +  if (!decl
> +  && ((CALL_EXPR_BY_DESCRIPTOR (exp) && !flag_trampolines)
> +   || arm_current_function_pac_enabled_p()))
>  {
>tree fntype = TREE_TYPE (TREE_TYPE (CALL_EXPR_FN (exp)));
>CUMULATIVE_ARGS cum;
> diff --git a/gcc/testsuite/gcc.target/arm/pac-sibcall.c 
> b/gcc/testsuite/gcc.target/arm/pac-sibcall.c
> new file mode 100644
> index 000..c57bf7a952c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/pac-sibcall.c
> @@ -0,0 +1,11 @@
> +/* Testing return address signing.  */
> +/* { dg-do compile } */
> +/* { dg-require-effective-target mbranch_protection_ok } */
> +/* { dg-options " -mcpu=cortex-m85 -mbranch-protection=pac-ret+leaf -O2" } */

No, you can't just add options like this, you need to first check that they 
won't result in conflicts with other options on the command line.  See 
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/644077.html for an 
example of how to handle this.

> +
> +void fail(void (*f)(int, int, int, int))
> +{
> +  f(1, 2, 3, 4);
> +}
> +
> +/* { dg-final { scan-assembler-not "bx\tip\t@ indirect register sibling 
> call" } } */

R.

Re: [PATCH v2] arm: Fix missing bti instruction for virtual thunks

2024-02-02 Thread Richard Earnshaw (lists)

On 26/01/2024 15:31, Richard Ball wrote:
> v2: Formatting and test options fix.
> 
> Adds missing bti instruction at the beginning of a virtual
> thunk, when bti is enabled.
> 
> gcc/ChangeLog:
> 
>   * config/arm/arm.cc (arm_output_mi_thunk): Emit
>   insn for bti_c when bti is enabled.
> 
> gcc/testsuite/ChangeLog:
> 
> * lib/target-supports.exp: Add v8_1_m_main_pacbti.
> * g++.target/arm/bti_thunk.C: New test.

OK, thanks.

R.

Re: [PATCH v3 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-01-31 Thread Richard Earnshaw (lists)

On 30/01/2024 14:09, Andre Simoes Dias Vieira wrote:
> Hi Richard,
> 
> Thanks for the reviews, I'm making these changes but just a heads up.
> 
> When hardcoding LR_REGNUM like this we need to change the way we compare the 
> register in doloop_condition_get. This function currently compares the rtx 
> nodes by address, which I think happens to be fine before we assign hard 
> registers, as I suspect we always share the rtx node for the same pseudo, but 
> when assigning registers it seems like we create copies, so things like:
> `XEXP (inc_src, 0) == reg` will fail for
> inc_src: (plus (reg LR) (const_int -n))
> reg: (reg LR)
> 
> Instead I will replace the '==' operator with calls to 'rtx_equal_p (op1, 
> op2, NULL)'.

Yes, that's fine.
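
For anyone following along, the difference can be shown with a toy model in C.  This is only a sketch: `toy_rtx` and `toy_equal_p` are made-up names for illustration and are not GCC's rtx type or its rtx_equal_p; the point is just that two separately created nodes for the same hard register compare unequal by address but equal by structure.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a register node; hypothetical, not GCC's rtx.  */
struct toy_rtx
{
  int code;   /* Kind of node, e.g. "register".  */
  int regno;  /* Register number.  */
};

/* Structural comparison: true when both nodes describe the same
   register, even if they are distinct objects in memory.  */
bool
toy_equal_p (const struct toy_rtx *a, const struct toy_rtx *b)
{
  if (a == b)
    return true;
  if (a == NULL || b == NULL)
    return false;
  return a->code == b->code && a->regno == b->regno;
}

/* Build two separate copies of the same register node and report
   whether they differ by address while still being structurally
   equal.  */
bool
copies_differ_by_address_only (void)
{
  struct toy_rtx lr_a = { 1, 14 };
  struct toy_rtx lr_b = { 1, 14 };
  return (&lr_a != &lr_b) && toy_equal_p (&lr_a, &lr_b);
}
```

With shared nodes the address test and the structural test agree; once copies exist only the structural test still matches.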

R.

> 
> Sound good?
> 
> Kind regards,
> Andre
> 
> 
> From: Richard Earnshaw (lists) 
> Sent: Tuesday, January 30, 2024 11:36 AM
> To: Andre Simoes Dias Vieira; gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov; Stam Markianos-Wright
> Subject: Re: [PATCH v3 2/2] arm: Add support for MVE Tail-Predicated Low 
> Overhead Loops
> 
> On 19/01/2024 14:40, Andre Vieira wrote:
>>
>> Respin after comments from Kyrill and rebase. I also removed an if-then-else
>> construct in arm_mve_check_reg_origin_is_num_elems similar to the other 
>> functions
>> Kyrill pointed out.
>>
>> After an earlier comment from Richard Sandiford I also added comments to the
>> two tail predication patterns added to explain the need for the unspecs.
> 
> [missing ChangeLog]
> 
> I'm just going to focus on loop-doloop.c in this reply, I'll respond to the 
> other bits in a follow-up.
> 
>   2)  (set (reg) (plus (reg) (const_int -1))
> - (set (pc) (if_then_else (reg != 0)
> -(label_ref (label))
> -(pc))).
> +(set (pc) (if_then_else (reg != 0)
> +(label_ref (label))
> +(pc))).
> 
>   Some targets (ARM) do the comparison before the branch, as in the
>   following form:
> 
> - 3) (parallel [(set (cc) (compare ((plus (reg) (const_int -1), 0)))
> -   (set (reg) (plus (reg) (const_int -1)))])
> -(set (pc) (if_then_else (cc == NE)
> ...
> 
> 
> This comment is becoming confusing.  Really the text leading up to 3)... 
> should be inside 3.  Something like:
> 
>   3) Some targets (ARM) do the comparison before the branch, as in the
>   following form:
> 
>   (parallel [(set (cc) (compare (plus (reg) (const_int -1)) 0))
>  (set (reg) (plus (reg) (const_int -1)))])
>   (set (pc) (if_then_else (cc == NE)
>   (label_ref (label))
>   (pc)))])
> 
> 
> The same issue on the comment structure also applies to the new point 4...
> 
> +  The ARM target also supports a special case of a counter that 
> decrements
> +  by `n` and terminating in a GTU condition.  In that case, the compare 
> and
> +  branch are all part of one insn, containing an UNSPEC:
> +
> +  4) (parallel [
> +   (set (pc)
> +   (if_then_else (gtu (unspec:SI [(plus:SI (reg:SI 14 lr)
> +   (const_int -n))])
> +  (const_int n-1]))
> +   (label_ref)
> +   (pc)))
> +   (set (reg:SI 14 lr)
> +(plus:SI (reg:SI 14 lr)
> + (const_int -n)))
> + */
> 
> I think this needs a bit more clarification.  Specifically that this 
> construct supports a predicated vectorized do loop.  Also, the placement of 
> the unspec inside the comparison is ugly and unnecessary.  It should be 
> sufficient to have the unspec inside a USE expression, which the mid-end can 
> then ignore entirely.  So
> 
> (parallel
>  [(set (pc) (if_then_else (gtu (plus (reg) (const_int -n))
>(const_int n-1))
>   (label_ref) (pc)))
>   (set (reg) (plus (reg) (const_int -n)))
>   (additional clobbers and uses)])
> 
> For Arm, we then add a (use (unspec [(const_int 0)] N)) that is specific to 
> this pattern to stop anything else from matching it.
> 
> Note that we don't need to mention that the register is 'LR' or the modes, 
> those are specific to a particular backend, not the generic pattern we want 
> to match.
> 
> +  || !CONST_INT_P (XEXP (inc_src, 1))
> +  || INTVAL (XEXP (inc_src, 1)) >= 0)
>  re

Re: [PATCH v3 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-01-30 Thread Richard Earnshaw (lists)

On 19/01/2024 14:40, Andre Vieira wrote:
> 
> Respin after comments from Kyrill and rebase. I also removed an if-then-else
> construct in arm_mve_check_reg_origin_is_num_elems similar to the other 
> functions
> Kyrill pointed out.
> 
> After an earlier comment from Richard Sandiford I also added comments to the
> two tail predication patterns added to explain the need for the unspecs.

[missing ChangeLog]

I'm just going to focus on loop-doloop.c in this reply, I'll respond to the 
other bits in a follow-up.

  2)  (set (reg) (plus (reg) (const_int -1))
- (set (pc) (if_then_else (reg != 0)
-(label_ref (label))
-(pc))).  
+(set (pc) (if_then_else (reg != 0)
+(label_ref (label))
+(pc))).

  Some targets (ARM) do the comparison before the branch, as in the
  following form:

- 3) (parallel [(set (cc) (compare ((plus (reg) (const_int -1), 0)))
-   (set (reg) (plus (reg) (const_int -1)))])
-(set (pc) (if_then_else (cc == NE)
...

This comment is becoming confusing.  Really the text leading up to 3)... should 
be inside 3.  Something like:

  3) Some targets (ARM) do the comparison before the branch, as in the
  following form:

  (parallel [(set (cc) (compare (plus (reg) (const_int -1)) 0))
 (set (reg) (plus (reg) (const_int -1)))])
  (set (pc) (if_then_else (cc == NE)
  (label_ref (label))
  (pc)))])

The same issue on the comment structure also applies to the new point 4...

+  The ARM target also supports a special case of a counter that decrements
+  by `n` and terminating in a GTU condition.  In that case, the compare and
+  branch are all part of one insn, containing an UNSPEC:
+
+  4) (parallel [
+   (set (pc)
+   (if_then_else (gtu (unspec:SI [(plus:SI (reg:SI 14 lr)
+   (const_int -n))])
+  (const_int n-1]))
+   (label_ref)
+   (pc)))
+   (set (reg:SI 14 lr)
+(plus:SI (reg:SI 14 lr)
+ (const_int -n)))
+ */

I think this needs a bit more clarification.  Specifically that this construct 
supports a predicated vectorized do loop.  Also, the placement of the unspec 
inside the comparison is ugly and unnecessary.  It should be sufficient to 
have the unspec inside a USE expression, which the mid-end can then ignore 
entirely.  So

(parallel
 [(set (pc) (if_then_else (gtu (plus (reg) (const_int -n))
   (const_int n-1))
  (label_ref) (pc)))
  (set (reg) (plus (reg) (const_int -n)))
  (additional clobbers and uses)])

For Arm, we then add a (use (unspec [(const_int 0)] N)) that is specific to 
this pattern to stop anything else from matching it.

Note that we don't need to mention that the register is 'LR' or the modes, 
those are specific to a particular backend, not the generic pattern we want to 
match.

+  || !CONST_INT_P (XEXP (inc_src, 1))
+  || INTVAL (XEXP (inc_src, 1)) >= 0)
 return 0;
+  int dec_num = abs (INTVAL (XEXP (inc_src, 1)));

We can just use '-INTVAL(...)' here, we've verified just above that the 
constant is negative.
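
In plain C terms the idea is roughly the following sketch.  The helper name `decrement_amount` is hypothetical and only stands in for the surrounding code; the assertion mirrors the `>= 0` bail-out just above.

```c
#include <assert.h>

/* Hypothetical helper: INC is the loop increment, which the caller
   has already checked to be negative.  */
long long
decrement_amount (long long inc)
{
  /* The sign is known, so a plain negation gives the magnitude and
     no call to abs is needed.  */
  assert (inc < 0);
  return -inc;
}
```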

-  if ((XEXP (condition, 0) == reg)
+  /* For the ARM special case of having a GTU: re-form the condition without
+ the unspec for the benefit of the middle-end.  */
+  if (GET_CODE (condition) == GTU)
+{
+  condition = gen_rtx_fmt_ee (GTU, VOIDmode, inc_src,
+ GEN_INT (dec_num - 1));
+  return condition;
+}

If you make the change I mentioned above, this re-forming isn't needed any 
more, so the arm-specific comment goes away.

-   {
+{
  if (GET_CODE (pattern) != PARALLEL)
  /*  For the second form we expect:

You've fixed the indentation of the brace (good), but the body of the braced 
expression needs re-indenting as well.

R.

Re: [PATCH v3 1/2] arm: Add define_attr to create a mapping between MVE predicated and unpredicated insns

2024-01-30 Thread Richard Earnshaw (lists)

On 19/01/2024 14:40, Andre Vieira wrote:
> 
> Reposting for testing purposes, no changes from v2 (other than rebase).

We seem to have lost the ChangeLog for this hunk :(

The code itself looks OK, though.

Re: [PATCH][GCC][Arm] Add pattern for bswap + rotate -> rev16 [Bug 108933]

2024-01-29 Thread Richard Earnshaw


On 29/01/2024 14:14, Matthieu Longo wrote:

Hi Richard,

Please find below the new patch where I addressed your comments and 
updated the changelog.


rev16 pattern was not recognised anymore as a change in the bswap tree
pass was introducing a new GIMPLE form, not recognized by the assembly
final transformation pass.

More details in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108933

gcc/ChangeLog:

 PR target/108933
 * config/arm/arm.md (arm_rev16si2): Convert to define_insn.
 Correct generated RTL.
 (arm_rev16si2_alt1): Correctly handle conditional execution.
 (arm_rev16si2_alt2): Likewise.

gcc/testsuite/ChangeLog:

 PR target/108933
 * gcc.target/arm/rev16.c: Moved to...
 * gcc.target/arm/rev16_1.c: ...here.
 * gcc.target/arm/rev16_2.c: New test to check that rev16 is
 emitted.


Thanks.  I've tweaked the commit message very slightly and pushed this.

Could you please prepare backports for gcc-11 thru 13?  It should just 
be a matter of cherry-picking the commit.


R.



On 2024-01-22 16:25, Richard Earnshaw (lists) wrote:

On 22/01/2024 12:18, Matthieu Longo wrote:

rev16 pattern was not recognised anymore as a change in the bswap tree
pass was introducing a new GIMPLE form, not recognized by the assembly
final transformation pass.

More details in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108933

gcc/ChangeLog:

 PR target/108933
 * config/arm/arm.md (*arm_rev16si2_alt3): new pattern to 
convert

   a bswap + rotate by 16 bits into rev16


ChangeLog entries need to be written as sentences, so start with a 
capital letter and end with a full stop; continuation lines should 
start in column 8 (one hard tab, don't use spaces).  But in this case, 
"New pattern." is sufficient.




gcc/testsuite/ChangeLog:

 PR target/108933
 * gcc.target/arm/rev16.c: Moved to...
 * gcc.target/arm/rev16_1.c: ...here.
 * gcc.target/arm/rev16_2.c: New test to check that rev16 is
   emitted.



+;; Similar pattern to match (rotate (bswap) 16)
+(define_insn "*arm_rev16si2_alt3"
+  [(set (match_operand:SI 0 "register_operand" "=l,r")
+    (rotate:SI (bswap:SI (match_operand:SI 1 "register_operand" 
"l,r"))

+ (const_int 16)))]
+  "arm_arch6"
+  "rev16\\t%0, %1"
+  [(set_attr "arch" "t,32")
+   (set_attr "length" "2,4")
+   (set_attr "type" "rev")]
+)
+

Unfortunately, this is insufficient.  When generating Arm or Thumb2 
code (but not thumb1) we also have to handle conditional execution: we 
need to have '%?' in the output template at the point where a 
condition code might be needed.  That means we need separate output 
templates for all three alternatives (as we need a 16-bit variant for 
thumb2 that's conditional and a 16-bit for thumb1 that isn't).  See 
the output of arm_rev16 for a guide of what is really needed.


I note that the arm_rev16si2_alt1, and arm_rev16si2_alt2 patterns are 
incorrect in this regard as well; that will need fixing.


I also see that arm_rev16si2 currently expands to the alt1 variant 
above; given that the preferred canonical form would now appear to use 
bswap + rotate, we should change that as well.  In fact, we can merge 
your new pattern with the expand entirely and eliminate the need to 
call gen_arm_rev16si2_alt1.  Something like:


(define_insn "arm_rev16si2"
   [(set (match_operand:SI 0 "s_register_operand")
 (rotate:SI (bswap:SI (match_operand:SI 1 "s_register_operand"))
    (const_int 16)))]
   "arm_arch6"
   "@
   rev16...
   ...


R.

Re: [PATCH] Make gcc.target/arm/bics_3.c testcase a bit more generic [PR113542]

2024-01-25 Thread Richard Earnshaw (lists)

On 25/01/2024 10:29, Maxim Kuvyrkov wrote:
> After fwprop improvement in r14-8319-g86de9b66480, codegen in
> bics_3.c test changed from "bics" to "bic" instruction, with
> the overall instruction stream remaining at the same quality.
> 
> This patch makes the scan-assembler directive accept both
> "bics" and "bic".
> 
> BEFORE r14-8319-g86de9b66480:
>   bics    r0, r0, r1  @ 9   [c=4 l=4]  *andsi_notsi_si_compare0_scratch
>   mov     r0, #1      @ 23  [c=4 l=4]  *thumb2_movsi_vfp/1
>   it      eq
>   moveq   r0, #0      @ 26  [c=8 l=4]  *p *thumb2_movsi_vfp/2
>   bx      lr          @ 29  [c=8 l=4]  *thumb2_return
> 
> AFTER r14-8319-g86de9b66480:
>   bic     r0, r0, r1  @ 8   [c=4 l=4]  andsi_notsi_si
>   subs    r0, r0, #0  @ 22  [c=4 l=4]  cmpsi2_addneg/0
>   it      ne
>   movne   r0, #1      @ 23  [c=8 l=4]  *p *thumb2_movsi_vfp/2
>   bx      lr          @ 26  [c=8 l=4]  *thumb2_return
> 
> gcc/testsuite/ChangeLog:
> 
>   PR target/113542
>   * gcc.target/arm/bics_3.c: Update scan-assembler directive.
> ---
>  gcc/testsuite/gcc.target/arm/bics_3.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/bics_3.c 
> b/gcc/testsuite/gcc.target/arm/bics_3.c
> index e056b264e15..c5bed3c92d2 100644
> --- a/gcc/testsuite/gcc.target/arm/bics_3.c
> +++ b/gcc/testsuite/gcc.target/arm/bics_3.c
> @@ -35,6 +35,6 @@ main (void)
>return 0;
>  }
>  
> -/* { dg-final { scan-assembler-times "bics\tr\[0-9\]+, r\[0-9\]+, r\[0-9\]+" 
> 2 } } */
> -/* { dg-final { scan-assembler-times "bics\tr\[0-9\]+, r\[0-9\]+, r\[0-9\]+, 
> .sl #2" 1 } } */
> +/* { dg-final { scan-assembler-times "bics?\tr\[0-9\]+, r\[0-9\]+, 
> r\[0-9\]+" 2 } } */
> +/* { dg-final { scan-assembler-times "bics?\tr\[0-9\]+, r\[0-9\]+, 
> r\[0-9\]+, .sl #2" 1 } } */
>  


The test was added (r6-823-g0454e698401a3e) specifically to check that a BICS 
instruction was being generated.  Whether or not that is right is somewhat 
debatable, but this change seems to be papering over a different issue.

Either we should generate BICS, making this change incorrect, or we should 
disable the test for thumb code on the basis that this isn't really a win.

But really, we should fix the compiler to do better here.  We really want 
something like

BICS  r0, r0, r1  // r0 is 0 or non-zero
MOVNE r0, #1  // convert all non-zero to 1

in Arm state (ie using the BICS instruction to set the result to zero); and in 
thumb2, perhaps something like:

BICS  r0, r0, r1
IT    ne
MOVNE r0, #1

or maybe even better:

BIC  r0, r0, r1
SUBS r1, r0, #1
SBC  r0, r0, r1

which is slightly better than BICS because SUBS breaks a condition-code chain 
(all the flag bits are set).

There are similar quality issues for other NE(arith-op, 0) cases; we just don't 
have tests for those.
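
To check the arithmetic of the BIC/SUBS/SBC sequence, here is a small C model.  It is only a sketch: the function name `bic_ne_zero` is made up, and it assumes the usual Arm flag semantics, namely that SUBS sets the carry flag when there is no borrow and that SBC computes Rn - Rm - (1 - C).

```c
#include <stdint.h>

/* Model of:
     BIC  r0, r0, r1
     SUBS r1, r0, #1
     SBC  r0, r0, r1
   Returns the final value of r0.  */
uint32_t
bic_ne_zero (uint32_t r0, uint32_t r1)
{
  r0 = r0 & ~r1;                /* BIC  r0, r0, r1 */
  uint32_t carry = (r0 >= 1);   /* SUBS sets C when there is no borrow.  */
  r1 = r0 - 1;                  /* SUBS r1, r0, #1 */
  r0 = r0 - r1 - (1 - carry);   /* SBC  r0, r0, r1 */
  return r0;
}
```

For a non-zero BIC result x, the carry is set and the SBC computes x - (x - 1), i.e. 1; for a zero result the borrow makes the SBC wrap back to 0.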

R.

Re: [PATCH] arm: Fix missing bti instruction for virtual thunks

2024-01-24 Thread Richard Earnshaw (lists)

On 23/01/2024 15:53, Richard Ball wrote:
> Adds missing bti instruction at the beginning of a virtual
> thunk, when bti is enabled.
> 
> gcc/ChangeLog:
> 
>   * config/arm/arm.cc (arm_output_mi_thunk): Emit
>   insn for bti_c when bti is enabled.
> 
> gcc/testsuite/ChangeLog:
> 
> * g++.target/arm/bti_thunk.C: New test.


diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index e5a944486d7bd583627b0e22dfe8f95862e975bb..91eee8be7c1a59118fbf443557561fb3e0689d61 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -29257,6 +29257,8 @@ arm_output_mi_thunk (FILE *file, tree thunk, 
HOST_WIDE_INT delta,
   const char *fnname = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (thunk));
 
   assemble_start_function (thunk, fnname);
+  if (aarch_bti_enabled ())
+emit_insn (aarch_gen_bti_c());

Missing space between ...bti_c and the parenthesis.

   if (TARGET_32BIT)
 arm32_output_mi_thunk (file, thunk, delta, vcall_offset, function);
   else

diff --git a/gcc/testsuite/g++.target/arm/bti_thunk.C 
b/gcc/testsuite/g++.target/arm/bti_thunk.C
new file mode 100644
index 
..5c4a8e5a8d74581eca2b877c000a5b34ddca0e9b
--- /dev/null
+++ b/gcc/testsuite/g++.target/arm/bti_thunk.C
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-march=armv8.1-m.main+pacbti -O1 -mbranch-protection=bti 
--save-temps" } */

You can't just add options like this; they may not work with other options 
passed by the testsuite framework.  Instead, you should add a suitable entry to 
lib/target-supports.exp in the table starting "foreach { armfunc armflag 
armdefs } {" that tests whether the options can be safely added, and then use 
dg-require-effective-target and dg-add-options for your new set of options.

\ No newline at end of file

Please add one :)

R.

Re: [PATCH][GCC][Arm] Add pattern for bswap + rotate -> rev16 [Bug 108933]

2024-01-22 Thread Richard Earnshaw (lists)

On 22/01/2024 12:18, Matthieu Longo wrote:
> rev16 pattern was not recognised anymore as a change in the bswap tree
> pass was introducing a new GIMPLE form, not recognized by the assembly
> final transformation pass.
> 
> More details in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108933
> 
> gcc/ChangeLog:
> 
>     PR target/108933
>     * config/arm/arm.md (*arm_rev16si2_alt3): new pattern to convert
>   a bswap + rotate by 16 bits into rev16

ChangeLog entries need to be written as sentences, so start with a capital 
letter and end with a full stop; continuation lines should start in column 8 
(one hard tab, don't use spaces).  But in this case, "New pattern." is 
sufficient.

> 
> gcc/testsuite/ChangeLog:
> 
>     PR target/108933
>     * gcc.target/arm/rev16.c: Moved to...
>     * gcc.target/arm/rev16_1.c: ...here.
>     * gcc.target/arm/rev16_2.c: New test to check that rev16 is
>   emitted.

+;; Similar pattern to match (rotate (bswap) 16)
+(define_insn "*arm_rev16si2_alt3"
+  [(set (match_operand:SI 0 "register_operand" "=l,r")
+(rotate:SI (bswap:SI (match_operand:SI 1 "register_operand" "l,r"))
+ (const_int 16)))]
+  "arm_arch6"
+  "rev16\\t%0, %1"
+  [(set_attr "arch" "t,32")
+   (set_attr "length" "2,4")
+   (set_attr "type" "rev")]
+)
+

Unfortunately, this is insufficient.  When generating Arm or Thumb2 code (but 
not thumb1) we also have to handle conditional execution: we need to have '%?' 
in the output template at the point where a condition code might be needed.  
That means we need separate output templates for all three alternatives (as we 
need a 16-bit variant for thumb2 that's conditional and a 16-bit for thumb1 
that isn't).  See the output of arm_rev16 for a guide of what is really needed.

I note that the arm_rev16si2_alt1, and arm_rev16si2_alt2 patterns are incorrect 
in this regard as well; that will need fixing.

I also see that arm_rev16si2 currently expands to the alt1 variant above; given 
that the preferred canonical form would now appear to use bswap + rotate, we 
should change that as well.  In fact, we can merge your new pattern with the 
expand entirely and eliminate the need to call gen_arm_rev16si2_alt1.  
Something like:

(define_insn "arm_rev16si2"
  [(set (match_operand:SI 0 "s_register_operand")
(rotate:SI (bswap:SI (match_operand:SI 1 "s_register_operand"))
   (const_int 16)))]
  "arm_arch6"
  "@
  rev16...
  ...
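
As a sanity check on why (rotate (bswap x) 16) is the same operation as rev16, here is a small C model.  It is a sketch only; the helper names `toy_bswap32`, `toy_rotate16` and `toy_rev16` are made up for illustration and are not GCC or ACLE functions.

```c
#include <stdint.h>

/* Reverse all four bytes of a 32-bit value, as bswap:SI does.  */
uint32_t
toy_bswap32 (uint32_t x)
{
  return (x >> 24) | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Rotate by 16 bits; left and right rotation coincide for this
   amount on a 32-bit value.  */
uint32_t
toy_rotate16 (uint32_t x)
{
  return (x << 16) | (x >> 16);
}

/* Swap the two bytes within each 16-bit half, as REV16 does.  */
uint32_t
toy_rev16 (uint32_t x)
{
  return ((x & 0x00ff00ffu) << 8) | ((x & 0xff00ff00u) >> 8);
}
```

For example, 0xAABBCCDD byte-swaps to 0xDDCCBBAA, and rotating that by 16 gives 0xBBAADDCC, which is what swapping the bytes inside each halfword of the input produces directly.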

R.

Re: [PATCH] arm: Fix parsecpu.awk for aliases [PR113030]

2024-01-22 Thread Richard Earnshaw (lists)

On 21/01/2024 07:29, Andrew Pinski wrote:
> So the problem here is that the two functions check_cpu and check_arch use
> the wrong variable to check if an alias is valid for that cpu/arch.
> check_cpu uses cpu_optaliases instead of cpu_opt_alias. cpu_optaliases
> is an array indexed by the cpuname that contains all of the valid aliases
> for that cpu, but cpu_opt_alias is a doubly indexed array, indexed
> by cpuname and the alias, which records what that alias stands for.
> A similar thing happens for check_arch with arch_optaliases vs arch_opt_alias.
> 
> Tested by running:
> ```
> awk -f config/arm/parsecpu.awk -v cmd="chkarch armv7-a+simd" 
> config/arm/arm-cpus.in
> awk -f config/arm/parsecpu.awk -v cmd="chkarch armv7-a+neon" 
> config/arm/arm-cpus.in
> awk -f config/arm/parsecpu.awk -v cmd="chkarch armv7-a+neon-vfpv3" 
> config/arm/arm-cpus.in
> ```
> And they don't return error back.
> 
> gcc/ChangeLog:
> 
>   PR target/113030
>   * config/arm/parsecpu.awk (check_cpu): Use cpu_opt_alias
>   instead of cpu_optaliases.
>   (check_arch): Use arch_opt_alias instead of arch_optaliases.

OK

Thanks,

R.

> 
> Signed-off-by: Andrew Pinski 
> ---
>  gcc/config/arm/parsecpu.awk | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/config/arm/parsecpu.awk b/gcc/config/arm/parsecpu.awk
> index ddd4f3b440a..384462bdb5b 100644
> --- a/gcc/config/arm/parsecpu.awk
> +++ b/gcc/config/arm/parsecpu.awk
> @@ -529,7 +529,7 @@ function check_cpu (name) {
>  
>  for (n = 2; n <= exts; n++) {
>   if (!((cpu_name, extensions[n]) in cpu_opt_remove)  \
> - && !((cpu_name, extensions[n]) in cpu_optaliases)) {
> + && !((cpu_name, extensions[n]) in cpu_opt_alias)) {
>   return "error"
>   }
>  }
> @@ -552,7 +552,7 @@ function check_arch (name) {
>  
>  for (n = 2; n <= exts; n++) {
>   if (!((extensions[1], extensions[n]) in arch_opt_remove)\
> - && !((extensions[1], extensions[n]) in arch_optaliases)) {
> + && !((extensions[1], extensions[n]) in arch_opt_alias)) {
>   return "error"
>   }
>  }

Re: [PATCH v3 00/12] [GCC] arm: vld1q vst1 vst1q vst1 intrinsics

2024-01-12 Thread Richard Earnshaw (lists)

On 02/01/2024 09:23, ezra.sito...@arm.com wrote:
> From: Ezra Sitorus 
> 
> Add vld1q, vst1, vst1q and vld1 intrinsics to arm port.
> 
> Ezra Sitorus (12):
>   [GCC] arm: vld1q_types_x2 ACLE intrinsics
>   [GCC] arm: vld1q_types_x3 ACLE intrinsics
>   [GCC] arm: vld1q_types_x4 ACLE intrinsics
>   [GCC] arm: vst1_types_x2 ACLE intrinsics
>   [GCC] arm: vst1_types_x3 ACLE intrinsics
>   [GCC] arm: vst1_types_x4 ACLE intrinsics
>   [GCC] arm: vst1q_types_x2 ACLE intrinsics
>   [GCC] arm: vst1q_types_x3 ACLE intrinsics
>   [GCC] arm: vst1q_types_x4 ACLE intrinsics
>   [GCC] arm: vld1_types_x2 ACLE intrinsics
>   [GCC] arm: vld1_types_x3 ACLE intrinsics
>   [GCC] arm: vld1_types_x4 ACLE intrinsics
> 
>  gcc/config/arm/arm_neon.h | 2032 ++---
>  gcc/config/arm/arm_neon_builtins.def  |   12 +
>  gcc/config/arm/iterators.md   |6 +
>  gcc/config/arm/neon.md|  249 ++
>  gcc/config/arm/unspecs.md |8 +
>  .../gcc.target/arm/simd/vld1_base_xN_1.c  |  176 ++
>  .../gcc.target/arm/simd/vld1_bf16_xN_1.c  |   23 +
>  .../gcc.target/arm/simd/vld1_fp16_xN_1.c  |   23 +
>  .../gcc.target/arm/simd/vld1_p64_xN_1.c   |   23 +
>  .../gcc.target/arm/simd/vld1q_base_xN_1.c |  183 ++
>  .../gcc.target/arm/simd/vld1q_bf16_xN_1.c |   24 +
>  .../gcc.target/arm/simd/vld1q_fp16_xN_1.c |   24 +
>  .../gcc.target/arm/simd/vld1q_p64_xN_1.c  |   24 +
>  .../gcc.target/arm/simd/vst1_base_xN_1.c  |  176 ++
>  .../gcc.target/arm/simd/vst1_bf16_xN_1.c  |   22 +
>  .../gcc.target/arm/simd/vst1_fp16_xN_1.c  |   23 +
>  .../gcc.target/arm/simd/vst1_p64_xN_1.c   |   23 +
>  .../gcc.target/arm/simd/vst1q_base_xN_1.c |  185 ++
>  .../gcc.target/arm/simd/vst1q_bf16_xN_1.c |   24 +
>  .../gcc.target/arm/simd/vst1q_fp16_xN_1.c |   24 +
>  .../gcc.target/arm/simd/vst1q_p64_xN_1.c  |   24 +
>  21 files changed, 3018 insertions(+), 290 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1_base_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1_bf16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1_fp16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1_p64_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1q_base_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1q_bf16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1q_fp16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1q_p64_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1_base_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1_bf16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1_fp16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1_p64_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1q_base_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1q_bf16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1q_fp16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1q_p64_xN_1.c
> 

Thanks, I've pushed this series.

Reviewing this series did highlight a couple of issues with the existing code 
base (not your patch); I'll follow up on these separately.

R.

Re: [PATCH v2 7/7] aarch64,arm: Move branch-protection data to targets

2024-01-11 Thread Richard Earnshaw (lists)

On 11/01/2024 14:43, Szabolcs Nagy wrote:
> The 12/07/2023 13:13, Richard Earnshaw wrote:
>> On 03/11/2023 15:36, Szabolcs Nagy wrote:
>>> * config/aarch64/aarch64.cc (aarch_handle_no_branch_protection): Copy.
>>> (aarch_handle_standard_branch_protection): Copy.
>>> (aarch_handle_pac_ret_protection): Copy.
>>> (aarch_handle_pac_ret_leaf): Copy.
>>> (aarch_handle_pac_ret_b_key): Copy.
>>> (aarch_handle_bti_protection): Copy.
>>
>> I think all of the above functions that have been moved back from
>> aarch-common should be renamed back to aarch64_..., unless they are directly
>> referenced statically by code in aarch-common.c.
> 
> done.
> 
>>> +const struct aarch_branch_protect_type aarch_branch_protect_types[] = {
>>
>> can this be made static now?  And maybe pass the structure as a parameter if
>> that's not done already.
> 
> done in v4.
> 
>> It would be nice if, when we raise an error, we could print out the list of
>> valid options (and modifiers), much like we do on Arm for -march/-mcpu.
>>
>> eg.
>> $ gcc -mcpu=crotex-a8
>> cc1: error: unrecognised -mcpu target: crotex-a8
>> cc1: note: valid arguments are: arm8 arm810 strongarm strongarm110 fa526
>> [...rest of list]; did you mean ‘cortex-a8’?
> 
> i implemented this with candidates_list_and_hint but it does
> not work very well if the typo is in a subtype, so i think
> this should be done in a separate patch if at all.
> 

I'd build the candidates list from all the types + subtypes, so that the 
suggestion code has a full list to pick from; but fair enough.

R.
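
To illustrate the suggestion above, here is a minimal sketch (not GCC's actual candidates_list_and_hint code) that puts types and subtypes into one flat candidate list and picks the nearest entry by plain Levenshtein distance. The names demo_candidates, demo_edit_distance and demo_best_hint are hypothetical, and the candidate strings are simply the branch-protection types and subtypes quoted in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flat list: types first, then the pac-ret subtypes.  */
static const char *const demo_candidates[] = {
  "none", "standard", "pac-ret", "bti",
  "leaf", "b-key"
};

static size_t
demo_min3 (size_t a, size_t b, size_t c)
{
  size_t m = a < b ? a : b;
  return m < c ? m : c;
}

/* Plain Levenshtein distance using a single row of the DP table.
   Assumes B is shorter than 32 bytes (true for the list above).  */
static size_t
demo_edit_distance (const char *a, const char *b)
{
  size_t la = strlen (a), lb = strlen (b);
  size_t row[32];
  assert (lb < 32);
  for (size_t j = 0; j <= lb; j++)
    row[j] = j;
  for (size_t i = 1; i <= la; i++)
    {
      size_t diag = row[0];	/* Value of D[i-1][j-1].  */
      row[0] = i;
      for (size_t j = 1; j <= lb; j++)
	{
	  size_t up = row[j];	/* Value of D[i-1][j].  */
	  size_t cost = a[i - 1] == b[j - 1] ? 0 : 1;
	  row[j] = demo_min3 (up + 1, row[j - 1] + 1, diag + cost);
	  diag = up;
	}
    }
  return row[lb];
}

/* Return the candidate closest to ARG; the first one wins on a tie.  */
static const char *
demo_best_hint (const char *arg)
{
  const char *best = NULL;
  size_t best_dist = (size_t) -1;
  size_t n = sizeof demo_candidates / sizeof demo_candidates[0];
  for (size_t i = 0; i < n; i++)
    {
      size_t d = demo_edit_distance (arg, demo_candidates[i]);
      if (d < best_dist)
	{
	  best_dist = d;
	  best = demo_candidates[i];
	}
    }
  return best;
}
```

With this list, demo_best_hint ("lead") picks "leaf" even though "leaf" is a subtype, which is the case a types-only list would miss.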

Re: [libatomic PATCH] Fix testsuite regressions on ARM [raspberry pi].

2024-01-10 Thread Richard Earnshaw





On 08/01/2024 16:07, Roger Sayle wrote:


Bootstrapping GCC on arm-linux-gnueabihf with --with-arch=armv6 currently
has a large number of FAILs in libatomic (regressions since last time I
attempted this).  The failure mode is related to IFUNC handling with the
file tas_8_2_.o containing an unresolved reference to the function
libat_test_and_set_1_i2.

Bearing in mind I've no idea what's going on, the following one line
change, to build tas_1_2_.o when building tas_8_2_.o, resolves the problem
for me and restores the libatomic testsuite to 44 expected passes and 5
unsupported tests [from 22 unexpected failures and 22 unresolved testcases].

If this looks like the correct fix, I'm not confident about rebuilding
Makefile.in with the correct version of automake, so I'd very much appreciate
it if someone/the reviewer/maintainer could please check this in for me.
Thanks in advance.


2024-01-08  Roger Sayle  

libatomic/ChangeLog
 * Makefile.am: Build tas_1_2_.o on ARCH_ARM_LINUX
 * Makefile.in: Regenerate.


Roger
--



Hi Roger,

I don't really understand all this make foo :( so I'm not sure if this 
is the right fix either.  If this is, as you say, a regression, have you 
been able to track down when it first started to occur?  That might also 
help me to understand what changed to cause this.


Perhaps we should have a PR for this, to make tracking the fixes easier.

R.

Re: [PATCH][GCC][Arm] Define __ARM_FEATURE_BF16 when +bf16 feature is enabled

2024-01-10 Thread Richard Earnshaw





On 08/01/2024 17:21, Matthieu Longo wrote:

Hi,

Arm GCC backend does not define __ARM_FEATURE_BF16 when +bf16 is 
specified (via -march option, or target pragma) whereas it is supposed 
to be tested before including arm_bf16.h (as specified in ACLE document: 
https://arm-software.github.io/acle/main/acle.html#arm_bf16h).


gcc/ChangeLog:

     * config/arm/arm-c.cc (arm_cpu_builtins): define 
__ARM_FEATURE_BF16

     * config/arm/arm.h: define TARGET_BF16

Ok for master ?

Matthieu
index 2e181bf7f36bab1209d5358e65d9513541683632..21ca22ac71119eda4ff01709aa95002ca13b1813 100644

--- a/gcc/config/arm/arm-c.cc
+++ b/gcc/config/arm/arm-c.cc
@@ -425,12 +425,14 @@ arm_cpu_builtins (struct cpp_reader* pfile)
   arm_arch_cde_coproc);

   def_or_undef_macro (pfile, "__ARM_FEATURE_MATMUL_INT8", TARGET_I8MM);
+
+  def_or_undef_macro (pfile, "__ARM_FEATURE_BF16", TARGET_BF16);
+  def_or_undef_macro (pfile, "__ARM_BF16_FORMAT_ALTERNATIVE",
+ TARGET_BF16_FP);
   def_or_undef_macro (pfile, "__ARM_FEATURE_BF16_SCALAR_ARITHMETIC",
  TARGET_BF16_FP);
   def_or_undef_macro (pfile, "__ARM_FEATURE_BF16_VECTOR_ARITHMETIC",
  TARGET_BF16_SIMD);
-  def_or_undef_macro (pfile, "__ARM_BF16_FORMAT_ALTERNATIVE",
- TARGET_BF16_FP || TARGET_BF16_SIMD);

Why is the definition of __ARM_BF16_FORMAT_ALTERNATIVE changed?  And why 
is there no explanation of that change?  It doesn't seem directly related 
to $subject.


R.

 }

 void

Re: [PATCH v2] libgfortran: Bugfix if not define HAVE_ATOMIC_FETCH_ADD

2024-01-10 Thread Richard Earnshaw


On 05/01/2024 01:43, Lipeng Zhu wrote:

This patch tries to fix the bug that occurs in the dec_waiting_unlocked
function when HAVE_ATOMIC_FETCH_ADD is not defined. As io.h does
not include async.h, the WRLOCK and RWUNLOCK macros are
undefined.

libgfortran/ChangeLog:

* io/io.h (dec_waiting_unlocked): Use
__gthread_rwlock_wrlock/__gthread_rwlock_unlock or
__gthread_mutex_lock/__gthread_mutex_unlock functions
to replace WRLOCK and RWUNLOCK macros.

Signed-off-by: Lipeng Zhu 


Has this been committed yet?

R.

---
  libgfortran/io/io.h | 10 --
  1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/libgfortran/io/io.h b/libgfortran/io/io.h
index 15daa0995b1..c7f0f7d7d9e 100644
--- a/libgfortran/io/io.h
+++ b/libgfortran/io/io.h
@@ -1020,9 +1020,15 @@ dec_waiting_unlocked (gfc_unit *u)
  #ifdef HAVE_ATOMIC_FETCH_ADD
    (void) __atomic_fetch_add (&u->waiting, -1, __ATOMIC_RELAXED);
  #else
-  WRLOCK (&unit_rwlock);
+#ifdef __GTHREAD_RWLOCK_INIT
+  __gthread_rwlock_wrlock (&unit_rwlock);
+  u->waiting--;
+  __gthread_rwlock_unlock (&unit_rwlock);
+#else
+  __gthread_mutex_lock (&unit_rwlock);
    u->waiting--;
-  RWUNLOCK (&unit_rwlock);
+  __gthread_mutex_unlock (&unit_rwlock);
+#endif
  #endif
  }
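
The shape of this fix can be sketched outside libgfortran as follows. This is an illustration only: it uses POSIX pthread types in place of the __gthread wrappers, and struct demo_unit, demo_lock, DEMO_HAVE_ATOMIC_FETCH_ADD and DEMO_HAVE_RWLOCK are hypothetical stand-ins for gfc_unit, unit_rwlock, HAVE_ATOMIC_FETCH_ADD and __GTHREAD_RWLOCK_INIT. The point is that the function calls the locking primitives directly, so it no longer depends on macros that are only defined in another header.

```c
#include <assert.h>
#include <stddef.h>
#include <pthread.h>

/* Hypothetical stand-in for gfc_unit; only the counter matters here.  */
struct demo_unit
{
  int waiting;
};

/* One lock object whose type depends on what the platform offers.  */
#ifdef DEMO_HAVE_RWLOCK
static pthread_rwlock_t demo_lock = PTHREAD_RWLOCK_INITIALIZER;
#else
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
#endif

/* Decrement U->waiting using the cheapest primitive available.  */
static void
demo_dec_waiting_unlocked (struct demo_unit *u)
{
  assert (u != NULL);
#ifdef DEMO_HAVE_ATOMIC_FETCH_ADD
  /* Fast path: no lock needed at all.  */
  (void) __atomic_fetch_add (&u->waiting, -1, __ATOMIC_RELAXED);
#elif defined (DEMO_HAVE_RWLOCK)
  /* Writers take the rwlock exclusively.  */
  pthread_rwlock_wrlock (&demo_lock);
  u->waiting--;
  pthread_rwlock_unlock (&demo_lock);
#else
  /* Fallback for targets with neither atomics nor rwlocks.  */
  pthread_mutex_lock (&demo_lock);
  u->waiting--;
  pthread_mutex_unlock (&demo_lock);
#endif
}
```

With neither macro defined the mutex branch is compiled, which corresponds to the configuration that was failing to build.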

Re: [PATCH v7] libgfortran: Replace mutex with rwlock

2023-12-15 Thread Richard Earnshaw





On 15/12/2023 11:31, Lipeng Zhu wrote:



On 2023/12/14 23:50, Richard Earnshaw (lists) wrote:

On 09/12/2023 15:39, Lipeng Zhu wrote:

This patch introduces an rwlock and uses it, instead of the mutex, to
separate reads from writes on the unit_root tree and unit_cache, in order
to increase CPU efficiency. In the get_gfc_unit function, only around 30%
of calls step into the insert_unit function; in most instances we can get
the unit while reading the unit_cache or unit_root tree. Splitting the
read and write phases with an rwlock is therefore one way to make this
more parallel.

BTW, the IPC metrics can gain around 9x in our test
server with 220 cores. The benchmark we used is
https://github.com/rwesson/NEAT

libgcc/ChangeLog:

* gthr-posix.h (__GTHREAD_RWLOCK_INIT): New macro.
(__gthrw): New function.
(__gthread_rwlock_rdlock): New function.
(__gthread_rwlock_tryrdlock): New function.
(__gthread_rwlock_wrlock): New function.
(__gthread_rwlock_trywrlock): New function.
(__gthread_rwlock_unlock): New function.

libgfortran/ChangeLog:

* io/async.c (DEBUG_LINE): New macro.
* io/async.h (RWLOCK_DEBUG_ADD): New macro.
(CHECK_RDLOCK): New macro.
(CHECK_WRLOCK): New macro.
(TAIL_RWLOCK_DEBUG_QUEUE): New macro.
(IN_RWLOCK_DEBUG_QUEUE): New macro.
(RDLOCK): New macro.
(WRLOCK): New macro.
(RWUNLOCK): New macro.
(RD_TO_WRLOCK): New macro.
(INTERN_RDLOCK): New macro.
(INTERN_WRLOCK): New macro.
(INTERN_RWUNLOCK): New macro.
* io/io.h (struct gfc_unit): Change UNIT_LOCK to UNIT_RWLOCK in
a comment.
(unit_lock): Remove including associated internal_proto.
(unit_rwlock): New declarations including associated internal_proto.
(dec_waiting_unlocked): Use WRLOCK and RWUNLOCK on unit_rwlock
instead of __gthread_mutex_lock and __gthread_mutex_unlock on
unit_lock.
* io/transfer.c (st_read_done_worker): Use WRLOCK and RWUNLOCK on
unit_rwlock instead of LOCK and UNLOCK on unit_lock.
(st_write_done_worker): Likewise.
* io/unit.c: Change UNIT_LOCK to UNIT_RWLOCK in 'IO locking rules'
comment. Use unit_rwlock variable instead of unit_lock variable.
(get_gfc_unit_from_unit_root): New function.
(get_gfc_unit): Use RDLOCK, WRLOCK and RWUNLOCK on unit_rwlock
instead of LOCK and UNLOCK on unit_lock.
(close_unit_1): Use WRLOCK and RWUNLOCK on unit_rwlock instead of
LOCK and UNLOCK on unit_lock.
(close_units): Likewise.
(newunit_alloc): Use RWUNLOCK on unit_rwlock instead of UNLOCK on
unit_lock.
* io/unix.c (find_file): Use RDLOCK and RWUNLOCK on unit_rwlock
instead of LOCK and UNLOCK on unit_lock.
(flush_all_units): Use WRLOCK and RWUNLOCK on unit_rwlock instead
of LOCK and UNLOCK on unit_lock.



It looks like this has broken builds on arm-none-eabi when using newlib:

In file included from /work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/runtime/error.c:27:
/work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/io.h: In function ‘dec_waiting_unlocked’:
/work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/io.h:1023:3: error: implicit declaration of function ‘WRLOCK’ [-Wimplicit-function-declaration]
 1023 |   WRLOCK (&unit_rwlock);
      |   ^~~~~~
/work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/io.h:1025:3: error: implicit declaration of function ‘RWUNLOCK’ [-Wimplicit-function-declaration]
 1025 |   RWUNLOCK (&unit_rwlock);
      |   ^~~~~~~~


R.


Hi Richard,

The root cause is that the WRLOCK and RWUNLOCK macros are not defined in 
io.h. The x86 build does not fail because HAVE_ATOMIC_FETCH_ADD is 
defined there, so these macros are never used. The code logic is shown 
below:

#ifdef HAVE_ATOMIC_FETCH_ADD
   (void) __atomic_fetch_add (&u->waiting, -1, __ATOMIC_RELAXED);
#else
   WRLOCK (&unit_rwlock);
   u->waiting--;
   RWUNLOCK (&unit_rwlock);
#endif

I have just drafted a patch to try to fix this bug. As I don't have an Arm 
platform, could you help validate whether it fixes the problem on Arm?


diff --git a/libgfortran/io/io.h b/libgfortran/io/io.h
index 15daa0995b1..c7f0f7d7d9e 100644
--- a/libgfortran/io/io.h
+++ b/libgfortran/io/io.h
@@ -1020,9 +1020,15 @@ dec_waiting_unlocked (gfc_unit *u)
  #ifdef HAVE_ATOMIC_FETCH_ADD
    (void) __atomic_fetch_add (&u->waiting, -1, __ATOMIC_RELAXED);
  #else
-  WRLOCK (&unit_rwlock);
+#ifdef __GTHREAD_RWLOCK_INIT
+  __gthread_rwlock_wrlock (&unit_rwlock);
+  u->waiting--;
+  __gthread_rwlock_unlock (&unit_rwlock);
+#else
+  __gthread_mutex_lock (&unit_rwlock);
    u->waiting--;
-  RWUNLOCK (&unit_rwlock);
+  __gthread_mutex_unlock (&unit_rwlock);
+#endif
  #endif
  }


Lipeng Zhu


Hi Lipeng,

Thanks for the quick reply.  I can confirm that with the above change 
the bootstrap failure is fixed.  However, this shouldn't be considered a 
formal review; libgfortran is not really my area.


I'll be away now until January 2nd.

Richard.

Re: [PATCH v7] libgfortran: Replace mutex with rwlock

2023-12-14 Thread Richard Earnshaw (lists)

On 09/12/2023 15:39, Lipeng Zhu wrote:
> This patch introduces an rwlock and uses it, instead of the mutex, to
> separate reads from writes on the unit_root tree and unit_cache, in order
> to increase CPU efficiency. In the get_gfc_unit function, only around 30%
> of calls step into the insert_unit function; in most instances we can get
> the unit while reading the unit_cache or unit_root tree. Splitting the
> read and write phases with an rwlock is therefore one way to make this
> more parallel.
> 
> BTW, the IPC metrics can gain around 9x in our test
> server with 220 cores. The benchmark we used is
> https://github.com/rwesson/NEAT
> 
> libgcc/ChangeLog:
> 
>   * gthr-posix.h (__GTHREAD_RWLOCK_INIT): New macro.
>   (__gthrw): New function.
>   (__gthread_rwlock_rdlock): New function.
>   (__gthread_rwlock_tryrdlock): New function.
>   (__gthread_rwlock_wrlock): New function.
>   (__gthread_rwlock_trywrlock): New function.
>   (__gthread_rwlock_unlock): New function.
> 
> libgfortran/ChangeLog:
> 
>   * io/async.c (DEBUG_LINE): New macro.
>   * io/async.h (RWLOCK_DEBUG_ADD): New macro.
>   (CHECK_RDLOCK): New macro.
>   (CHECK_WRLOCK): New macro.
>   (TAIL_RWLOCK_DEBUG_QUEUE): New macro.
>   (IN_RWLOCK_DEBUG_QUEUE): New macro.
>   (RDLOCK): New macro.
>   (WRLOCK): New macro.
>   (RWUNLOCK): New macro.
>   (RD_TO_WRLOCK): New macro.
>   (INTERN_RDLOCK): New macro.
>   (INTERN_WRLOCK): New macro.
>   (INTERN_RWUNLOCK): New macro.
>   * io/io.h (struct gfc_unit): Change UNIT_LOCK to UNIT_RWLOCK in
>   a comment.
>   (unit_lock): Remove including associated internal_proto.
>   (unit_rwlock): New declarations including associated internal_proto.
>   (dec_waiting_unlocked): Use WRLOCK and RWUNLOCK on unit_rwlock
>   instead of __gthread_mutex_lock and __gthread_mutex_unlock on
>   unit_lock.
>   * io/transfer.c (st_read_done_worker): Use WRLOCK and RWUNLOCK on
>   unit_rwlock instead of LOCK and UNLOCK on unit_lock.
>   (st_write_done_worker): Likewise.
>   * io/unit.c: Change UNIT_LOCK to UNIT_RWLOCK in 'IO locking rules'
>   comment. Use unit_rwlock variable instead of unit_lock variable.
>   (get_gfc_unit_from_unit_root): New function.
>   (get_gfc_unit): Use RDLOCK, WRLOCK and RWUNLOCK on unit_rwlock
>   instead of LOCK and UNLOCK on unit_lock.
>   (close_unit_1): Use WRLOCK and RWUNLOCK on unit_rwlock instead of
>   LOCK and UNLOCK on unit_lock.
>   (close_units): Likewise.
>   (newunit_alloc): Use RWUNLOCK on unit_rwlock instead of UNLOCK on
>   unit_lock.
>   * io/unix.c (find_file): Use RDLOCK and RWUNLOCK on unit_rwlock
>   instead of LOCK and UNLOCK on unit_lock.
>   (flush_all_units): Use WRLOCK and RWUNLOCK on unit_rwlock instead
>   of LOCK and UNLOCK on unit_lock.
> 

It looks like this has broken builds on arm-none-eabi when using newlib:

In file included from /work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/runtime/error.c:27:
/work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/io.h: In function ‘dec_waiting_unlocked’:
/work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/io.h:1023:3: error: implicit declaration of function ‘WRLOCK’ [-Wimplicit-function-declaration]
 1023 |   WRLOCK (&unit_rwlock);
      |   ^~~~~~
/work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/io.h:1025:3: error: implicit declaration of function ‘RWUNLOCK’ [-Wimplicit-function-declaration]
 1025 |   RWUNLOCK (&unit_rwlock);
      |   ^~~~~~~~


R.

> ---
> v1 -> v2:
> Limit the pthread_rwlock usage in libgcc only when __cplusplus isn't defined.
> 
> v2 -> v3:
> Rebase the patch with trunk branch.
> 
> v3 -> v4:
> Update the comments.
> 
> v4 -> v5:
> Fix typos and code formatter.
> 
> v5 -> v6:
> Add unit tests.
> 
> v6 -> v7:
> Update ChangeLog and code formatter.
> 
> Reviewed-by: Hongjiu Lu 
> Reviewed-by: Bernhard Reutner-Fischer 
> Reviewed-by: Thomas Koenig 
> Reviewed-by: Jakub Jelinek 
> Signed-off-by: Lipeng Zhu 
> ---
>  libgcc/gthr-posix.h   |  60 +++
>  libgfortran/io/async.c|   4 +
>  libgfortran/io/async.h| 151 ++
>  libgfortran/io/io.h   |  15 +-
>  libgfortran/io/transfer.c |   8 +-
>  libgfortran/io/unit.c | 117 +-
>  libgfortran/io/unix.c |  16 +-
>  .../testsuite/libgomp.fortran/rwlock_1.f90|  33 
>  .../testsuite/libgomp.fortran/rwlock_2.f90|  22 +++
>  .../testsuite/libgomp.fortran/rwlock_3.f90|  18 +++
>  10 files changed, 386 insertions(+), 58 deletions(-)
>  create mode 100644 libgomp/testsuite/libgomp.fortran/rwlock_1.f90
>  create mode 100644 libgomp/testsuite/libgomp.fortran/rwlock_2.f90
>  create mode 100644 libgomp/testsuite/libgomp.fortran/rwlock_3.f90
> 
> diff --git

Re: [PING][PATCH 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2023-12-12 Thread Richard Earnshaw





On 30/11/2023 12:55, Stamatis Markianos-Wright wrote:

Hi Andre,

Thanks for the comments, see latest revision attached.

On 27/11/2023 12:47, Andre Vieira (lists) wrote:

Hi Stam,

Just some comments.

+/* Recursively scan through the DF chain backwards within the basic block and
+   determine if any of the USEs of the original insn (or the USEs of the insns

s/Recursively scan/Scan/ as you no longer recurse, thanks for that by 
the way :)

+   where thy were DEF-ed, etc., recursively) were affected by implicit VPT

remove recursively for the same reasons.

+  if (!CONST_INT_P (cond_counter_iv.step) || !CONST_INT_P (cond_temp_iv.step))
+    return NULL;
+  /* Look at the steps and swap around the rtx's if needed. Error out if
+     one of them cannot be identified as constant.  */
+  if (INTVAL (cond_counter_iv.step) != 0 && INTVAL (cond_temp_iv.step) != 0)
+    return NULL;

Move the comment above the if before, as the erroring out it talks 
about is there.

Done


+  emit_note_after ((enum insn_note)NOTE_KIND (insn), BB_END (body));
 space after 'insn_note)'

@@ -173,14 +176,14 @@ doloop_condition_get (rtx_insn *doloop_pat)
   if (! REG_P (reg))
 return 0;
 -  /* Check if something = (plus (reg) (const_int -1)).
+  /* Check if something = (plus (reg) (const_int -n)).
  On IA-64, this decrement is wrapped in an if_then_else.  */
   inc_src = SET_SRC (inc);
   if (GET_CODE (inc_src) == IF_THEN_ELSE)
 inc_src = XEXP (inc_src, 1);
   if (GET_CODE (inc_src) != PLUS
   || XEXP (inc_src, 0) != reg
-  || XEXP (inc_src, 1) != constm1_rtx)
+  || !CONST_INT_P (XEXP (inc_src, 1)))

Do we ever check that inc_src is negative? We used to check if it was 
-1, now we only check it's a constant, but not a negative one, so I 
suspect this needs a:

|| INTVAL (XEXP (inc_src, 1)) >= 0

Good point. Done


@@ -492,7 +519,8 @@ doloop_modify (class loop *loop, class niter_desc 
*desc,

 case GE:
   /* Currently only GE tests against zero are supported.  */
   gcc_assert (XEXP (condition, 1) == const0_rtx);
-
+  /* FALLTHRU */
+    case GTU:
   noloop = constm1_rtx;

I spent a very long time staring at this trying to understand why 
noloop = constm1_rtx for GTU, where I thought it should've been (count 
& (n-1)). For the current use of doloop it doesn't matter because ARM 
is the only target using it and you set desc->noloop_assumptions to 
null_rtx in 'arm_attempt_dlstp_transform' so noloop is never used. 
However, if a different target accepts this GTU pattern then this 
target agnostic code will do the wrong thing.  I suggest we either:
 - set noloop to what we think might be the correct value, which if 
you ask me should be 'count & (XEXP (condition, 1))',
 - or add a gcc_assert (GET_CODE (condition) != GTU); under the if 
(desc->noloop_assumption); part and document why.  I have a slight 
preference for the assert given otherwise we are adding code that we 
can't test.


Yea, that's true tbh. I've done the latter, but also separated out the 
"case GTU:" and added a comment, so that it's more clear that the noloop 
things aren't used in the only implemented GTU case (Arm)


Thank you :)



LGTM otherwise (but I don't have the power to approve this ;)).

Kind regards,
Andre

From: Stamatis Markianos-Wright 
Sent: Thursday, November 16, 2023 11:36 AM
To: Stamatis Markianos-Wright via Gcc-patches; Richard Earnshaw; 
Richard Sandiford; Kyrylo Tkachov
Subject: [PING][PATCH 2/2] arm: Add support for MVE Tail-Predicated 
Low Overhead Loops


Pinging back to the top of reviewers' inboxes due to worry about Stage 1
End in a few days :)


See the last email for the latest version of the 2/2 patch. The 1/2
patch is A-Ok from Kyrill's earlier target-backend review.


On 10/11/2023 12:41, Stamatis Markianos-Wright wrote:


On 06/11/2023 17:29, Stamatis Markianos-Wright wrote:


On 06/11/2023 11:24, Richard Sandiford wrote:

Stamatis Markianos-Wright  writes:
One of the main reasons for reading the arm bits was to try to 
answer

the question: if we switch to a downcounting loop with a GE
condition,
how do we make sure that the start value is not a large unsigned
number that is interpreted as negative by GE?  E.g. if the loop
originally counted up in steps of N and used an LTU condition,
it could stop at a value in the range [INT_MAX + 1, UINT_MAX].
But the loop might never iterate if we start counting down from
most values in that range.

Does the patch handle that?
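
The concern can be made concrete with a small model. This is a simplified sketch, not the RTL the patch generates: demo_as_signed, demo_up_ltu_trips and demo_down_signed_trips are hypothetical helpers, and the down-counting loop is modelled as "keep going while the remaining count, read as a signed 32-bit value, is positive". The up-counting count assumes the induction variable does not wrap.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of X as a signed 32-bit value.  */
static int32_t
demo_as_signed (uint32_t x)
{
  int32_t s;
  memcpy (&s, &x, sizeof s);
  return s;
}

/* Trip count of: for (i = 0; i < n; i += step), with an unsigned
   (LTU) comparison and no wraparound of i.  */
static uint64_t
demo_up_ltu_trips (uint32_t n, uint32_t step)
{
  assert (step != 0);
  return ((uint64_t) n + step - 1) / step;
}

/* Trip count of a loop that starts at N, subtracts STEP each time and
   continues only while the counter is positive as a signed value.
   Intended for small STEP values.  */
static uint64_t
demo_down_signed_trips (uint32_t n, uint32_t step)
{
  assert (step != 0);
  uint64_t trips = 0;
  for (uint32_t c = n; demo_as_signed (c) > 0; c -= step)
    trips++;
  return trips;
}
```

For n = 10 and step = 4 both forms iterate three times. For n = 0x80000001 the unsigned up-counting loop would iterate 0x20000001 times, while the signed down-counting model sees a negative start value and does not iterate at all.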

So AFAICT this is actually handled in the generic code in
`doloop_valid_p`:

These kinds of loops fail because they are "desc->infinite", so no
loop-doloop conversion is attempted at all (even for standard
dls/le loops)

Thanks to that check I haven't been able to trigger anything like the
behaviour you describe, do you think the doloop_valid_p checks are
robust enough?

The loops I was thinking of are provably no

Re: [PATCH v2 0/3] [GCC] arm: vld1q_types_xN ACLE intrinsics

2023-12-12 Thread Richard Earnshaw


Pushed, thanks.

R.

On 07/12/2023 15:21, ezra.sito...@arm.com wrote:

Add xN variants of vld1q_types intrinsic.

Re: [PATCH 1/2] arm: Add define_attr to to create a mapping between MVE predicated and unpredicated insns

2023-12-12 Thread Richard Earnshaw





On 06/11/2023 11:20, Stamatis Markianos-Wright wrote:

Patch has already been approved at:

https://gcc.gnu.org/pipermail/gcc-patches/2023-September/630326.html


... But I'm sending this again for archiving on the list after rebasing


A couple of minor nits:

1)

+#define MVE_VPT_PREDICABLE_INSN_P(INSN) \
+  (recog_memoized (INSN) >= 0 \
+  && get_attr_mve_unpredicated_insn (INSN) != 0) \

I think it's better to write "!= CODE_FOR_nothing".

+(define_attr "mve_unpredicated_insn" "" (const_int 0))
+

And the default value here should similarly be 'symbol_ref 
"CODE_FOR_nothing"'.


So that the style matches the symbol refs elsewhere.


2)
+(define_insn "*predicated_doloop_end_internal"
+  [(set (pc)
+   (if_then_else
+  (ge (plus:SI (reg:SI LR_REGNUM)
+   (match_operand:SI 0 "const_int_operand" ""))
+   (const_int 0))
+(label_ref (match_operand 1 "" ""))
+(pc)))
+   (set (reg:SI LR_REGNUM)
+   (plus:SI (reg:SI LR_REGNUM) (match_dup 0)))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_32BIT && TARGET_HAVE_LOB && TARGET_HAVE_MVE && TARGET_THUMB2"

TARGET_THUMB2 => TARGET_32BIT, so the first test is redundant.  In fact, 
given that TARGET_HAVE_LOB => armv8.1-m.main => thumb2, why do we need 
either?


So
TARGET_HAVE_LOB && TARGET_HAVE_MVE
should be sufficient.


+(define_insn "dlstp_insn"
+  [
+(set (reg:SI LR_REGNUM)
+(unspec:SI [(match_operand:SI 0 "s_register_operand" "r")]
+ DLSTP))
+  ]
+  "TARGET_32BIT && TARGET_HAVE_LOB && TARGET_HAVE_MVE && TARGET_THUMB2"

Same here.

Otherwise, OK.

R.

Re: [PATCH v2 0/3] [GCC] arm: vst1_types_xN ACLE intrinsics

2023-12-08 Thread Richard Earnshaw

Sorry, Ezra, but I've taken the decision to back out all four of the patch 
series related to this.  I think the problems that the CI has shown up 
need to be addressed first, and the fixes don't seem to be entirely trivial.


R.

On 07/12/2023 16:44, Richard Earnshaw wrote:

Pushed, thanks.

R.


On 07/12/2023 15:28, ezra.sito...@arm.com wrote:

Add xN variants of vst1_types intrinsic.

Re: [PATCH v2 0/3] [GCC] arm: vld1_types_xN ACLE intrinsics

2023-12-07 Thread Richard Earnshaw


Pushed, thanks.

R.


On 07/12/2023 15:41, ezra.sito...@arm.com wrote:

Add xN variants of vld1_types intrinsic.

Re: [PATCH v2 0/3] [GCC] arm: vst1q_types_xN ACLE intrinsics

2023-12-07 Thread Richard Earnshaw


Pushed, thanks.

R.


On 07/12/2023 15:36, ezra.sito...@arm.com wrote:

Add xN variants of vst1q_types intrinsic.

Re: [PATCH v2 0/3] [GCC] arm: vst1_types_xN ACLE intrinsics

2023-12-07 Thread Richard Earnshaw


Pushed, thanks.

R.


On 07/12/2023 15:28, ezra.sito...@arm.com wrote:

Add xN variants of vst1_types intrinsic.

Re: [PATCH v2 7/7] aarch64,arm: Move branch-protection data to targets

2023-12-07 Thread Richard Earnshaw





On 03/11/2023 15:36, Szabolcs Nagy wrote:

The branch-protection types are target specific, not the same on arm
and aarch64.  This currently affects pac-ret+b-key, but there will be
a new type on aarch64 that is not relevant for arm.

gcc/ChangeLog:

* config/aarch64/aarch64-opts.h (enum aarch64_key_type): Rename to ...
(enum aarch_key_type): ... this.
* config/aarch64/aarch64.cc (aarch_handle_no_branch_protection): Copy.
(aarch_handle_standard_branch_protection): Copy.
(aarch_handle_pac_ret_protection): Copy.
(aarch_handle_pac_ret_leaf): Copy.
(aarch_handle_pac_ret_b_key): Copy.
(aarch_handle_bti_protection): Copy.


I think all of the above functions that have been moved back from 
aarch-common should be renamed back to aarch64_..., unless they are 
directly referenced statically by code in aarch-common.c.

* config/arm/aarch-common.cc (aarch_handle_no_branch_protection):
Remove.
(aarch_handle_standard_branch_protection): Remove.
(aarch_handle_pac_ret_protection): Remove.
(aarch_handle_pac_ret_leaf): Remove.
(aarch_handle_pac_ret_b_key): Remove.
(aarch_handle_bti_protection): Remove.
* config/arm/aarch-common.h (enum aarch_key_type): Remove.
(struct aarch_branch_protect_type): Declare.
* config/arm/arm-c.cc (arm_cpu_builtins): Remove aarch_ra_sign_key.
* config/arm/arm.cc (aarch_handle_no_branch_protection): Copy.
(aarch_handle_standard_branch_protection): Copy.
(aarch_handle_pac_ret_protection): Copy.
(aarch_handle_pac_ret_leaf): Copy.
(aarch_handle_bti_protection): Copy.
(arm_configure_build_target): Copy.


And the same here.


* config/arm/arm.opt: Remove aarch_ra_sign_key.
---
unchanged compared to v1.
---
  gcc/config/aarch64/aarch64-opts.h |  6 ++--
  gcc/config/aarch64/aarch64.cc | 55 +++
  gcc/config/arm/aarch-common.cc| 55 ---
  gcc/config/arm/aarch-common.h | 11 +++
  gcc/config/arm/arm-c.cc   |  2 --
  gcc/config/arm/arm.cc | 52 +
  gcc/config/arm/arm.opt|  3 --
  7 files changed, 109 insertions(+), 75 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-opts.h 
b/gcc/config/aarch64/aarch64-opts.h
index 831e28ab52a..1abae1442b5 100644
--- a/gcc/config/aarch64/aarch64-opts.h
+++ b/gcc/config/aarch64/aarch64-opts.h
@@ -103,9 +103,9 @@ enum stack_protector_guard {
  };
  
  /* The key type that -msign-return-address should use.  */

-enum aarch64_key_type {
-  AARCH64_KEY_A,
-  AARCH64_KEY_B
+enum aarch_key_type {
+  AARCH_KEY_A,
+  AARCH_KEY_B
  };
  
  /* An enum specifying how to handle load and store pairs using

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 4f7f707b675..9739223831f 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -18620,6 +18620,61 @@ aarch64_set_asm_isa_flags (aarch64_feature_flags flags)
    aarch64_set_asm_isa_flags (&global_options, flags);
  }
  
+static void

+aarch_handle_no_branch_protection (void)
+{
+  aarch_ra_sign_scope = AARCH_FUNCTION_NONE;
+  aarch_enable_bti = 0;
+}
+
+static void
+aarch_handle_standard_branch_protection (void)
+{
+  aarch_ra_sign_scope = AARCH_FUNCTION_NON_LEAF;
+  aarch_ra_sign_key = AARCH_KEY_A;
+  aarch_enable_bti = 1;
+}
+
+static void
+aarch_handle_pac_ret_protection (void)
+{
+  aarch_ra_sign_scope = AARCH_FUNCTION_NON_LEAF;
+  aarch_ra_sign_key = AARCH_KEY_A;
+}
+
+static void
+aarch_handle_pac_ret_leaf (void)
+{
+  aarch_ra_sign_scope = AARCH_FUNCTION_ALL;
+}
+
+static void
+aarch_handle_pac_ret_b_key (void)
+{
+  aarch_ra_sign_key = AARCH_KEY_B;
+}
+
+static void
+aarch_handle_bti_protection (void)
+{
+  aarch_enable_bti = 1;
+}
+
+static const struct aarch_branch_protect_type aarch_pac_ret_subtypes[] = {
+  { "leaf", false, aarch_handle_pac_ret_leaf, NULL, 0 },
+  { "b-key", false, aarch_handle_pac_ret_b_key, NULL, 0 },
+  { NULL, false, NULL, NULL, 0 }
+};
+
+const struct aarch_branch_protect_type aarch_branch_protect_types[] = {


can this be made static now?  And maybe pass the structure as a 
parameter if that's not done already.




+  { "none", true, aarch_handle_no_branch_protection, NULL, 0 },
+  { "standard", true, aarch_handle_standard_branch_protection, NULL, 0 },
+  { "pac-ret", false, aarch_handle_pac_ret_protection, aarch_pac_ret_subtypes,
+ARRAY_SIZE (aarch_pac_ret_subtypes) },
+  { "bti", false, aarch_handle_bti_protection, NULL, 0 },
+  { NULL, false, NULL, NULL, 0 }
+};
+
  /* Implement TARGET_OPTION_OVERRIDE.  This is called once in the beginning
 and is used to parse the -m{cpu,tune,arch} strings and setup the initial
 tuning structs.  In particular it must set selected_tune and
diff --git a/gcc/config/arm/aarch-common.cc b/gcc/config/arm/aarch-common.cc
index 159c61b786c..92e1248f83f 100644
---

Re: [PATCH v2 6/7] aarch64,arm: Fix branch-protection= parsing

2023-12-07 Thread Richard Earnshaw





On 03/11/2023 15:36, Szabolcs Nagy wrote:

Refactor the parsing to have a single API and fix a few parsing issues:

- Different handling of "bti+none" and "none+bti": these should be
   rejected because "none" can only appear alone.

- Accepted empty strings such as "bti++pac-ret" or "bti+", this bug
   was caused by using strtok_r.



These now print
  error: invalid argument ‘’ for ‘-mbranch-protection=’

which is OK, but might be a bit confusing.  Perhaps we could change this 
specific case to "missing feature or flag for '-mbranch-protection'".


The ideal solution (IMO) would be if we could print something like

  in option
  -mbranch-protection=+bti
  ^
  |
  missing feature or flag

much like we do for source code diagnostics now.

However, I don't know if our framework could handle that for things from 
the command line, and it's not important enough to do now.



- Memory got leaked (str_root was never freed). And two buffers got
   allocated when one is enough.

The callbacks now have no failure mode, only parsing can fail and
all failures are handled locally.  The "-mbranch-protection=" vs
"target("branch-protection=")" difference in the error message is
handled by a separate argument to aarch_validate_mbranch_protection.
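
The parsing rules listed above can be sketched roughly as follows.  This
is only an illustration, not the code from the patch: the names bp_type,
token_len and validate_branch_protection are hypothetical, and the table
is a cut-down model of aarch_branch_protect_types with just the "alone"
flag and the subtype list.  The point is that splitting on '+' by hand
(rather than with strtok_r) makes empty tokens visible, and that an
"alone" type is rejected unless it is the whole string.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry, loosely modelled on the patch.  */
struct bp_type
{
  const char *name;
  bool alone;			/* Must be the only type in the string.  */
  const char *const *subtypes;	/* NULL-terminated list, or NULL.  */
};

static const char *const pac_ret_subtypes[] = { "leaf", "b-key", NULL };

static const struct bp_type bp_types[] = {
  { "none", true, NULL },
  { "standard", true, NULL },
  { "pac-ret", false, pac_ret_subtypes },
  { "bti", false, NULL },
  { NULL, false, NULL }
};

/* Length of the token starting at S; a token ends at '+' or NUL.
   Unlike strtok_r, an empty token yields 0 instead of being skipped.  */
static size_t
token_len (const char *s)
{
  return strcspn (s, "+");
}

static bool
token_is (const char *s, size_t len, const char *name)
{
  return strlen (name) == len && strncmp (s, name, len) == 0;
}

/* Return true if STR is a well-formed branch-protection string.  */
bool
validate_branch_protection (const char *str)
{
  const struct bp_type *cur = NULL;	/* Most recent top-level type.  */
  bool first = true;

  for (;;)
    {
      size_t len = token_len (str);
      bool matched = false;

      if (len == 0)
	return false;	/* "", "bti+", "+bti", "bti++pac-ret".  */

      /* A subtype of the current type is tried first.  */
      if (cur && cur->subtypes)
	for (const char *const *s = cur->subtypes; *s; s++)
	  if (token_is (str, len, *s))
	    matched = true;

      if (!matched)
	{
	  const struct bp_type *t;
	  for (t = bp_types; t->name; t++)
	    if (token_is (str, len, t->name))
	      break;
	  if (!t->name)
	    return false;	/* Unknown type or misplaced subtype.  */
	  /* An "alone" type must be both the first and the last token.  */
	  if (t->alone && (!first || str[len] != '\0'))
	    return false;
	  cur = t;
	}

      first = false;
      if (str[len] == '\0')
	return true;
      str += len + 1;	/* Step over the '+'.  */
    }
}
```

With this sketch "bti+none" and "none+bti" are both rejected for the same
reason, and no buffer is allocated at all, so there is nothing to leak.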

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_override_options): Update.
(aarch64_handle_attr_branch_protection): Update.
* config/arm/aarch-common-protos.h (aarch_parse_branch_protection):
Remove.
(aarch_validate_mbranch_protection): Add new argument.
* config/arm/aarch-common.cc (aarch_handle_no_branch_protection):
Update.
(aarch_handle_standard_branch_protection): Update.
(aarch_handle_pac_ret_protection): Update.
(aarch_handle_pac_ret_leaf): Update.
(aarch_handle_pac_ret_b_key): Update.
(aarch_handle_bti_protection): Update.
(aarch_parse_branch_protection): Remove.
(next_tok): New.
(aarch_validate_mbranch_protection): Rewrite.
* config/arm/aarch-common.h (struct aarch_branch_protect_type):
Add field "alone".
* config/arm/arm.cc (arm_configure_build_target): Update.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/branch-protection-attr.c: Update.
* gcc.target/aarch64/branch-protection-option.c: Update.


This is OK.  If you want to do the simple tweak for the error message 
for the case I mention above, consider that pre-approved.


R.


---
v2: merge tests updates into the patch
error message is not changed, see previous discussion:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633945.html
---
  gcc/config/aarch64/aarch64.cc |  37 +--
  gcc/config/arm/aarch-common-protos.h  |   5 +-
  gcc/config/arm/aarch-common.cc| 214 --
  gcc/config/arm/aarch-common.h |  14 +-
  gcc/config/arm/arm.cc |   3 +-
  .../aarch64/branch-protection-attr.c  |   6 +-
  .../aarch64/branch-protection-option.c|   2 +-
  7 files changed, 113 insertions(+), 168 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index f8e8fefc8d8..4f7f707b675 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -18642,7 +18642,8 @@ aarch64_override_options (void)
  aarch64_validate_sls_mitigation (aarch64_harden_sls_string);
  
if (aarch64_branch_protection_string)

-aarch_validate_mbranch_protection (aarch64_branch_protection_string);
+aarch_validate_mbranch_protection (aarch64_branch_protection_string,
+  "-mbranch-protection=");
  
/* -mcpu=CPU is shorthand for -march=ARCH_FOR_CPU, -mtune=CPU.

   If either of -march or -mtune is given, they override their
@@ -19016,34 +19017,12 @@ aarch64_handle_attr_cpu (const char *str)
  
  /* Handle the argument STR to the branch-protection= attribute.  */
  
- static bool

- aarch64_handle_attr_branch_protection (const char* str)
- {
-  char *err_str = (char *) xmalloc (strlen (str) + 1);
-  enum aarch_parse_opt_result res = aarch_parse_branch_protection (str,
-  &err_str);
-  bool success = false;
-  switch (res)
-{
- case AARCH_PARSE_MISSING_ARG:
-   error ("missing argument to % pragma 
or"
- " attribute");
-   break;
- case AARCH_PARSE_INVALID_ARG:
-   error ("invalid protection type %qs in % pragma or attribute", err_str);
-   break;
- case AARCH_PARSE_OK:
-   success = true;
-  /* Fall through.  */
- case AARCH_PARSE_INVALID_FEATURE:
-   break;
- default:
-   gcc_unreachable ();
-}
-  free (err_str);
-  return success;
- }
+static bool
+aarch64_handle_attr_branch_protection (const char* str)
+{
+  return aarch_validate_mbranch_protection (str,
+

Re: [PATCH] [arm] testsuite: make mve_intrinsic_type_overloads-int.c libc-agnostic

2023-12-06 Thread Richard Earnshaw


Sorry, I only just spotted this while looking at something else.


On 23/05/2023 15:41, Christophe Lyon via Gcc-patches wrote:

Glibc defines int32_t as 'int' while newlib defines it as 'long int'.

Although these correspond to the same size, g++ complains when using the
'wrong' version:
   invalid conversion from 'long int*' to 'int32_t*' {aka 'int*'} [-fpermissive]
or
   invalid conversion from 'int*' to 'int32_t*' {aka 'long int*'} [-fpermissive]

when calling vst1q(int32*, int32x4_t) with a first parameter of type
'long int *' (resp. 'int *')

To make this test pass with any type of toolchain, this patch defines
'word_type' according to which libc is in use.

2023-05-23  Christophe Lyon  

gcc/testsuite/
* gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-int.c:
Support both definitions of int32_t.
---
  .../mve_intrinsic_type_overloads-int.c| 28 ++-
  1 file changed, 15 insertions(+), 13 deletions(-)

diff --git 
a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-int.c
 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-int.c
index 7947dc024bc..ab51cc8b323 100644
--- 
a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-int.c
+++ 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-int.c
@@ -47,14 +47,22 @@ foo2 (short * addr, int16x8_t value)
vst1q (addr, value);
  }
  
-void

-foo3 (int * addr, int32x4_t value)
-{
-  vst1q (addr, value); /* { dg-warning "invalid conversion" "" { target c++ } 
} */
-}
+/* Glibc defines int32_t as 'int' while newlib defines it as 'long int'.
+
+   Although these correspond to the same size, g++ complains when using the
+   'wrong' version:
+  invalid conversion from 'long int*' to 'int32_t*' {aka 'int*'} [-fpermissive]
+
+  The trick below is to make this test pass whether using glibc-based or
+  newlib-based toolchains.  */
  
+#if defined(__GLIBC__)

+#define word_type int
+#else
+#define word_type long int
+#endif


GCC #defines __INT32_TYPE__ for this and should be more reliable than 
trying to detect one specific library implementation.  Did you try that?
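
A minimal sketch of that suggestion, assuming a C11 compiler that
predefines __INT32_TYPE__ and __UINT32_TYPE__ (GCC does).  The name
word_type comes from the patch; uword_type, store_word and
store_via_word_type are hypothetical and only stand in for the
overloaded intrinsic call in the test.

```c
#include <stdint.h>

/* __INT32_TYPE__ and __UINT32_TYPE__ are predefined by the compiler and
   expand to the underlying types of int32_t and uint32_t on the target,
   whichever C library is in use.  */
typedef __INT32_TYPE__ word_type;
typedef __UINT32_TYPE__ uword_type;

/* Evaluate to 1 only if T is exactly the same type as (u)int32_t.  */
#define IS_INT32(T) _Generic ((T) 0, int32_t: 1, default: 0)
#define IS_UINT32(T) _Generic ((T) 0, uint32_t: 1, default: 0)

_Static_assert (IS_INT32 (word_type), "word_type must be int32_t");
_Static_assert (IS_UINT32 (uword_type), "uword_type must be uint32_t");

/* Stand-in for a call that takes 'int32_t *', such as vst1q.  */
static void
store_word (int32_t *addr, int32_t value)
{
  *addr = value;
}

int
store_via_word_type (void)
{
  word_type slot = 0;
  store_word (&slot, 42);	/* Pointer types match on any libc.  */
  return (int) slot;
}
```

A separate unsigned typedef avoids writing 'unsigned word_type', which
only works while word_type is a macro.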



  void
-foo4 (long * addr, int32x4_t value)
+foo3 (word_type * addr, int32x4_t value)
  {
vst1q (addr, value);
  }
@@ -78,13 +86,7 @@ foo7 (unsigned short * addr, uint16x8_t value)
  }
  
  void

-foo8 (unsigned int * addr, uint32x4_t value)
-{
-  vst1q (addr, value); /* { dg-warning "invalid conversion" "" { target c++ } 
} */
-}
-
-void
-foo9 (unsigned long * addr, uint32x4_t value)
+foo8 (unsigned word_type * addr, uint32x4_t value)
  {
vst1q (addr, value);
  }


R.

Re: [PATCH] libgfortran: Fix -Wincompatible-pointer-types errors

2023-12-05 Thread Richard Earnshaw





On 05/12/2023 10:59, Jakub Jelinek wrote:

On Tue, Dec 05, 2023 at 10:57:50AM +, Richard Earnshaw wrote:

On 05/12/2023 10:51, Jakub Jelinek wrote:

On Tue, Dec 05, 2023 at 10:47:34AM +, Richard Earnshaw wrote:

The following patch makes libgfortran build on i686-linux after hacking up
--- kinds.h.xx  2023-12-05 00:23:00.133365064 +0100
+++ kinds.h 2023-12-05 11:19:24.409679808 +0100
@@ -10,8 +10,8 @@ typedef GFC_INTEGER_2 GFC_LOGICAL_2;
#define HAVE_GFC_LOGICAL_2
#define HAVE_GFC_INTEGER_2
-typedef int32_t GFC_INTEGER_4;
-typedef uint32_t GFC_UINTEGER_4;
+typedef long GFC_INTEGER_4;
+typedef unsigned long GFC_UINTEGER_4;


That doesn't look right for a 64-bit processor.  Presumably 4 means 4 bytes,


i686-linux is an ILP32 target, which I chose exactly because I regularly build
it, had a tree with it around and because unlike 64-bit targets there are 2
standard 32-bit signed integer types.  Though, normally int32_t there is
int rather than long int and so the errors only appeared after this hack.



My point is that on aarch64/x86_64 etc, this will make GFC_INTEGER_4 a
64-bit type, whereas previously it was 32-bit.


Sure.  The above patch is a hack for a generated header.  I'm not proposing
that as a change, just explaining how I've verified the actual patch on
i686-linux with such a hack.

Jakub



Ah, I understand now.

I've successfully built arm and aarch64 cross toolchains with this patch 
(newlib).  So LGTM, thanks.


R.

Re: [PATCH] libgfortran: Fix -Wincompatible-pointer-types errors

2023-12-05 Thread Richard Earnshaw





On 05/12/2023 10:51, Jakub Jelinek wrote:

On Tue, Dec 05, 2023 at 10:47:34AM +, Richard Earnshaw wrote:

The following patch makes libgfortran build on i686-linux after hacking up
--- kinds.h.xx  2023-12-05 00:23:00.133365064 +0100
+++ kinds.h 2023-12-05 11:19:24.409679808 +0100
@@ -10,8 +10,8 @@ typedef GFC_INTEGER_2 GFC_LOGICAL_2;
   #define HAVE_GFC_LOGICAL_2
   #define HAVE_GFC_INTEGER_2
-typedef int32_t GFC_INTEGER_4;
-typedef uint32_t GFC_UINTEGER_4;
+typedef long GFC_INTEGER_4;
+typedef unsigned long GFC_UINTEGER_4;


That doesn't look right for a 64-bit processor.  Presumably 4 means 4 bytes,


i686-linux is an ILP32 target, which I chose exactly because I regularly build
it, had a tree with it around and because unlike 64-bit targets there are 2
standard 32-bit signed integer types.  Though, normally int32_t there is
int rather than long int and so the errors only appeared after this hack.



My point is that on aarch64/x86_64 etc, this will make GFC_INTEGER_4 a 
64-bit type, whereas previously it was 32-bit.


R.


Jakub

Re: [PATCH] libgfortran: Fix -Wincompatible-pointer-types errors

2023-12-05 Thread Richard Earnshaw





On 05/12/2023 10:33, Jakub Jelinek wrote:

Hi!

On Tue, Dec 05, 2023 at 10:46:02AM +0100, Florian Weimer wrote:

Presumably the fixes will look like this?

diff --git a/libgfortran/io/list_read.c b/libgfortran/io/list_read.c
index db3330060ce..4fcc77dbf83 100644
--- a/libgfortran/io/list_read.c
+++ b/libgfortran/io/list_read.c
@@ -2987,13 +2987,13 @@ nml_read_obj (st_parameter_dt *dtp, namelist_info *nl, 
index_type offset,
/* If this object has a User Defined procedure, call it.  */
if (nl->dtio_sub != NULL)
  {
-   int unit = dtp->u.p.current_unit->unit_number;
+   GFC_INTEGER_4 unit = dtp->u.p.current_unit->unit_number;
char iotype[] = "NAMELIST";
gfc_charlen_type iotype_len = 8;
char tmp_iomsg[IOMSG_LEN] = "";
char *child_iomsg;
gfc_charlen_type child_iomsg_len;
-   int noiostat;
+   GFC_INTEGER_4 noiostat;
int *child_iostat = NULL;
gfc_full_array_i4 vlist;
formatted_dtio dtio_ptr = (formatted_dtio)nl->dtio_sub;


That seems insufficient.

The following patch makes libgfortran build on i686-linux after hacking up
--- kinds.h.xx  2023-12-05 00:23:00.133365064 +0100
+++ kinds.h 2023-12-05 11:19:24.409679808 +0100
@@ -10,8 +10,8 @@ typedef GFC_INTEGER_2 GFC_LOGICAL_2;
  #define HAVE_GFC_LOGICAL_2
  #define HAVE_GFC_INTEGER_2
  
-typedef int32_t GFC_INTEGER_4;

-typedef uint32_t GFC_UINTEGER_4;
+typedef long GFC_INTEGER_4;
+typedef unsigned long GFC_UINTEGER_4;


That doesn't look right for a 64-bit processor.  Presumably 4 means 4 
bytes, but long will generally be 8 on such targets.
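
To make the size concern concrete, here is a small sketch.  The typedef
names kind4_fixed and kind4_hacked are hypothetical stand-ins for the
two definitions of GFC_INTEGER_4 in the hunk above; nothing here is
taken from kinds.h itself.

```c
#include <limits.h>
#include <stdint.h>

typedef int32_t kind4_fixed;	/* Original definition: always 32 bits.  */
typedef long kind4_hacked;	/* Hacked definition: width follows the ABI.  */

/* This holds on every target that provides int32_t.  */
_Static_assert (sizeof (kind4_fixed) * CHAR_BIT == 32,
		"a 4-byte kind must be 32 bits wide");

/* Return 1 if the hacked typedef is still 32 bits wide (ILP32 targets
   such as i686-linux) and 0 otherwise (LP64 targets such as aarch64
   or x86_64 Linux, where long is 64 bits).  */
int
hacked_kind_is_32_bits (void)
{
  return sizeof (kind4_hacked) * CHAR_BIT == 32;
}
```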


R.


  typedef GFC_INTEGER_4 GFC_LOGICAL_4;
  #define HAVE_GFC_LOGICAL_4
  #define HAVE_GFC_INTEGER_4
in the build dir to emulate what newlib aarch64 is doing:

2023-12-05  Florian Weimer  
Jakub Jelinek  

* io/list_read.c (list_formatted_read_scalar) :
Change types of unit and noiostat to GFC_INTEGER_4 from int, change
type of child_iostat to GFC_INTEGER_4 * from int *, formatting
fixes.
(nml_read_obj): Likewise.
* io/write.c (list_formatted_write_scalar) : Likewise.
(nml_write_obj): Likewise.
* io/transfer.c (unformatted_read, unformatted_write): Likewise.

--- libgfortran/io/list_read.c.jj   2023-05-09 00:07:26.161168737 +0200
+++ libgfortran/io/list_read.c  2023-12-05 11:25:31.837426653 +0100
@@ -2189,14 +2189,14 @@ list_formatted_read_scalar (st_parameter
break;
  case BT_CLASS:
{
- int unit = dtp->u.p.current_unit->unit_number;
+ GFC_INTEGER_4 unit = dtp->u.p.current_unit->unit_number;
  char iotype[] = "LISTDIRECTED";
gfc_charlen_type iotype_len = 12;
  char tmp_iomsg[IOMSG_LEN] = "";
  char *child_iomsg;
  gfc_charlen_type child_iomsg_len;
- int noiostat;
- int *child_iostat = NULL;
+ GFC_INTEGER_4 noiostat;
+ GFC_INTEGER_4 *child_iostat = NULL;
  gfc_full_array_i4 vlist;
  
  	  GFC_DESCRIPTOR_DATA(&vlist) = NULL;

@@ -2204,8 +2204,8 @@ list_formatted_read_scalar (st_parameter
  
  	  /* Set iostat, intent(out).  */

  noiostat = 0;
- child_iostat = (dtp->common.flags & IOPARM_HAS_IOSTAT) ?
- dtp->common.iostat : &noiostat;
+ child_iostat = ((dtp->common.flags & IOPARM_HAS_IOSTAT)
+ ? dtp->common.iostat : &noiostat);
  
  	  /* Set iomsge, intent(inout).  */

  if (dtp->common.flags & IOPARM_HAS_IOMSG)
@@ -2987,14 +2987,14 @@ nml_read_obj (st_parameter_dt *dtp, name
/* If this object has a User Defined procedure, call it.  */
if (nl->dtio_sub != NULL)
  {
-   int unit = dtp->u.p.current_unit->unit_number;
+   GFC_INTEGER_4 unit = dtp->u.p.current_unit->unit_number;
char iotype[] = "NAMELIST";
gfc_charlen_type iotype_len = 8;
char tmp_iomsg[IOMSG_LEN] = "";
char *child_iomsg;
gfc_charlen_type child_iomsg_len;
-   int noiostat;
-   int *child_iostat = NULL;
+   GFC_INTEGER_4 noiostat;
+   GFC_INTEGER_4 *child_iostat = NULL;
gfc_full_array_i4 vlist;
formatted_dtio dtio_ptr = (formatted_dtio)nl->dtio_sub;
  
@@ -3006,8 +3006,8 @@ nml_read_obj (st_parameter_dt *dtp, name
  
  		/* Set iostat, intent(out).  */

noiostat = 0;
-   child_iostat = (dtp->common.flags & IOPARM_HAS_IOSTAT) ?
-   dtp->common.iostat : &noiostat;
+   child_iostat = ((dtp->common.flags & IOPARM_HAS_IOSTAT)
+   ? dtp->common.iostat : &noiostat);
  
  		/* Set iomsg, intent(inout).  */

if (dtp->common.flags & IOPARM_HAS_IOMSG)
--- libgfortran/io/write.c.jj   2023-09-28

Re: [PATCH v3 10/11] c: Turn -Wincompatible-pointer-types into a permerror

2023-12-05 Thread Richard Earnshaw





On 05/12/2023 09:46, Florian Weimer wrote:

* Richard Earnshaw:


(I think it's this patch, not one of the others in the series).

This breaks building libgfortran with newlib on arm and aarch64:


/work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/list_read.c:2208:46:
error: pointer type mismatch in conditional expression
[-Wincompatible-pointer-types]
  2208 |   dtp->common.iostat : &noiostat;
   |  ^
/work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/list_read.c:2208:27:
note: first expression has type ‘GFC_INTEGER_4 *’ {aka ‘long int *’}
  2208 |   dtp->common.iostat : &noiostat;
   |   ^~
/work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/list_read.c:2208:48:
note: second expression has type ‘int *’
  2208 |   dtp->common.iostat : &noiostat;
   |^
/work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/list_read.c:2224:34:
error: passing argument 2 of ‘dtp->u.p.fdtio_ptr’ from incompatible
pointer type [-Wincompatible-pointer-types]
  2224 |   dtp->u.p.fdtio_ptr (p, &unit, iotype, &vlist,
   |  ^
   |  |
   |  int *
/work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/list_read.c:2224:34:
note: expected ‘GFC_INTEGER_4 *’ {aka ‘long int *’} but argument is of
type ‘int *’
/work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/list_read.c:2225:31:
error: passing argument 5 of ‘dtp->u.p.fdtio_ptr’ from incompatible
pointer type [-Wincompatible-pointer-types]
  2225 |   child_iostat, child_iomsg,
   |   ^~~~
   |   |
   |   int *
/work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/list_read.c:2225:31:
note: expected ‘GFC_INTEGER_4 *’ {aka ‘long int *’} but argument is of
type ‘int *’


Presumably the fixes will look like this?

diff --git a/libgfortran/io/list_read.c b/libgfortran/io/list_read.c
index db3330060ce..4fcc77dbf83 100644
--- a/libgfortran/io/list_read.c
+++ b/libgfortran/io/list_read.c
@@ -2987,13 +2987,13 @@ nml_read_obj (st_parameter_dt *dtp, namelist_info *nl, 
index_type offset,
/* If this object has a User Defined procedure, call it.  */
if (nl->dtio_sub != NULL)
  {
-   int unit = dtp->u.p.current_unit->unit_number;
+   GFC_INTEGER_4 unit = dtp->u.p.current_unit->unit_number;
char iotype[] = "NAMELIST";
gfc_charlen_type iotype_len = 8;
char tmp_iomsg[IOMSG_LEN] = "";
char *child_iomsg;
gfc_charlen_type child_iomsg_len;
-   int noiostat;
+   GFC_INTEGER_4 noiostat;
int *child_iostat = NULL;
gfc_full_array_i4 vlist;
formatted_dtio dtio_ptr = (formatted_dtio)nl->dtio_sub;


Apparently the targets I built define GFC_INTEGER_4 as int, so this
didn't show up.
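
For illustration, a reduced sketch of why the mismatch only appears on
some targets and how matching the declared types avoids it.  The names
gfc_int4, io_common, HAS_IOSTAT and run_child_io are hypothetical
stand-ins, with gfc_int4 deliberately defined as 'long int' to mimic the
newlib case; this is not the libgfortran code.

```c
#include <stddef.h>

typedef long int gfc_int4;	/* Stand-in for GFC_INTEGER_4 on newlib.  */

struct io_common
{
  unsigned flags;
  gfc_int4 *iostat;
};

#define HAS_IOSTAT 1u

/* 1 if P has type 'gfc_int4 *', 0 for any other pointer type.  */
#define IS_GFC_INT4_PTR(p) _Generic ((p), gfc_int4 *: 1, default: 0)

/* 'int *' and 'long int *' are distinct types even where int and long
   have the same size, so only one of them can match.  */
_Static_assert (IS_GFC_INT4_PTR ((gfc_int4 *) 0), "same type");
_Static_assert (!IS_GFC_INT4_PTR ((int *) 0), "int * is a distinct type");

gfc_int4
run_child_io (struct io_common *common)
{
  gfc_int4 noiostat = 0;	/* Declared as 'int' before the fix.  */
  gfc_int4 *child_iostat;

  /* Both arms now have type 'gfc_int4 *', so the conditional
     expression no longer mixes pointer types.  */
  child_iostat = ((common->flags & HAS_IOSTAT)
		  ? common->iostat : &noiostat);

  *child_iostat = 7;	/* Pretend the child procedure set a status.  */
  return *child_iostat;
}
```

Where GFC_INTEGER_4 is plain int, the old declarations happen to have
the right type, which is why the build only breaks elsewhere.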


It looks reasonable to me, but I'm not a real user of libgfortran, so 
there's possibly something more subtle that I've missed (I can't even 
guarantee I copied all of the errors from the build log).


I've copied Tobias as a fortran maintainer, but I don't know if this is 
his forte either, though perhaps he might know whose it is.


R.



Thanks,
Florian

Re: [PATCH v3 10/11] c: Turn -Wincompatible-pointer-types into a permerror

2023-12-05 Thread Richard Earnshaw

* { dg-message "note: expected '\[^'\n\]*' but argument is of type '\[^'\n\]*'" 
"note: expected" { target *-*-* } .-1 } */
+
+
+DECIMAL_COMPOSITE_DECL(128); /* { dg-error "incompatible types when assigning to 
type '\[^\n\]*' from type '\[^\n\]*'" } */
+/* { dg-message "note: expected '\[^'\n\]*' but argument is of type '\[^'\n\]*'" 
"note: expected" { target *-*-* } .-1 } */
+
+
+int main()
+{
+  DECIMAL_COMPOSITE_TEST(32);  /* { dg-error "incompatible pointer type" } */
+  DECIMAL_COMPOSITE_TEST(64);  /* { dg-error "incompatible pointer type" } */
+  DECIMAL_COMPOSITE_TEST(128); /* { dg-error "incompatible pointer type" } */
+
+  return 0;
+}
+
+/* The invalid function redeclarations might also trigger:
+   { dg-prune-output "-Warray-parameter" } */
diff --git a/gcc/testsuite/gcc.dg/dfp/composite-type.c 
b/gcc/testsuite/gcc.dg/dfp/composite-type.c
index ce7d5c1a0a0..2eb601400b5 100644
--- a/gcc/testsuite/gcc.dg/dfp/composite-type.c
+++ b/gcc/testsuite/gcc.dg/dfp/composite-type.c
@@ -1,5 +1,5 @@
  /* { dg-do compile } */
-/* { dg-options "-O -Wall -ftrack-macro-expansion=0" } */
+/* { dg-options "-fpermissive -O -Wall -ftrack-macro-expansion=0" } */
  
  /* C99 6.2.7: Compatible type and composite type.  */
  
diff --git a/gcc/testsuite/gcc.dg/diag-aka-1.c b/gcc/testsuite/gcc.dg/diag-aka-1.c

index 3383c1c263b..485a8a5f85d 100644
--- a/gcc/testsuite/gcc.dg/diag-aka-1.c
+++ b/gcc/testsuite/gcc.dg/diag-aka-1.c
@@ -1,5 +1,5 @@
  /* { dg-do compile } */
-/* { dg-options "-Wc++-compat" } */
+/* { dg-options "-fpermissive -Wc++-compat" } */
  
  typedef struct A { int i; } B;

  typedef struct T { int i; } *T; /* { dg-warning "using 'T' as both a typedef and a 
tag is invalid" } */
diff --git a/gcc/testsuite/gcc.dg/diag-aka-1a.c 
b/gcc/testsuite/gcc.dg/diag-aka-1a.c
new file mode 100644
index 000..d161b785e7d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/diag-aka-1a.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-Wc++-compat" } */
+
+typedef struct A { int i; } B;
+typedef struct T { int i; } *T; /* { dg-warning "using 'T' as both a typedef and a 
tag is invalid" } */
+typedef const float TFA;
+typedef TFA TFB;
+typedef TFB TFC;
+typedef int IA[];
+typedef IA *IAP;
+extern IAP arr[];
+
+void fn1 (B *); /* { dg-message "expected 'B \\*' {aka 'struct A \\*'} but argument 
is of type 'struct B \\*'" } */
+void fn2 (TFC *);
+
+void
+bar (B *b, int *i)
+{
+  fn1 ((struct B *) b); /* { dg-error "passing argument" } */
+  fn2 (i); /* { dg-error "passing argument" } */
+  sizeof (arr); /* { dg-error "invalid application of .sizeof. to incomplete type 
.int \\(\\*\\\[\\\]\\)\\\[\\\]." } */
+}
+
+int
+foo (void *a)
+{
+  T t = a; /* { dg-warning "request for implicit conversion from 'void \\*' to 'T' 
{aka 'struct T \\*'} not" } */
+  return t->i;
+}
diff --git a/gcc/testsuite/gcc.dg/enum-compat-1.c 
b/gcc/testsuite/gcc.dg/enum-compat-1.c
index 5fb150cee79..b7352f6ddc3 100644
--- a/gcc/testsuite/gcc.dg/enum-compat-1.c
+++ b/gcc/testsuite/gcc.dg/enum-compat-1.c
@@ -3,7 +3,7 @@
  /* Origin: Joseph Myers , based on
 PR c/6024 from Richard Earnshaw  */
  /* { dg-do compile } */
-/* { dg-options "" } */
+/* { dg-options "-fpermissive" } */
  
  /* Original test from PR c/6024.  */

  enum e1 {a, b};
diff --git a/gcc/testsuite/gcc.dg/enum-compat-2.c 
b/gcc/testsuite/gcc.dg/enum-compat-2.c
new file mode 100644
index 000..69509012480
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/enum-compat-2.c
@@ -0,0 +1,32 @@
+/* Test that enumerated types are only considered compatible when they
+   are the same type.  PR c/6024.  */
+/* Origin: Joseph Myers , based on
+   PR c/6024 from Richard Earnshaw  */
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+/* Original test from PR c/6024.  */
+enum e1 {a, b};
+enum e2 {c, d};
+
+void f(enum e1); /* { dg-error "prototype" "error at decl" } */
+
+void f(x)
+ enum e2 x; /* { dg-error "doesn't match prototype" } */
+{
+  return;
+}
+
+/* Other compatibility tests.  */
+enum e3 { A };
+enum e4 { B };
+
+enum e3 v3;
+enum e4 *p = &v3; /* { dg-error "incompatible" "incompatible pointer" } */
+enum e3 *q = &v3;
+
+void g(enum e3); /* { dg-message "note: previous declaration" "error at first 
decl" } */
+void g(enum e4); /* { dg-error "conflicting types" "error at second decl" } */
+
+void h(enum e3);
+void h(enum e3);
diff --git a/gcc/testsuite/gcc.dg/func-ptr-conv-1.c 
b/gcc/testsuite/gcc.dg/func-ptr-conv-1.c
index 5c8a101343e..7c2876cc998 100644
--- a/gcc/testsuite/gcc.dg/func-ptr-conv-1.c
+++ b/gcc/testsuite/gcc.dg/func-ptr-conv-1.c
@@ -5,7 +5,7 @@
 are not permitted.  PR c/11234.  */
  /* Origin: Joseph Myers  */
  /* { dg-do compile } *

Re: [PATCH] testsuite, arm: Fix up pr112337.c test

2023-12-01 Thread Richard Earnshaw (lists)

On 01/12/2023 13:45, Christophe Lyon wrote:
> On Fri, 1 Dec 2023 at 13:44, Richard Earnshaw (lists)
>  wrote:
>>
>> On 01/12/2023 11:28, Saurabh Jha wrote:
>>> Hey,
>>>
>>> I introduced this test "gcc/testsuite/gcc.target/arm/mve/pr112337.c" in 
>>> this commit 2365aae84de030bbb006edac18c9314812fc657b before. This had an 
>>> error which I unfortunately missed. This patch fixes that test.
>>>
>>> Did regression testing on arm-none-eabi and found no regressions. Output of 
>>> running gcc/contrib/compare_tests is this:
>>>
>>> """
>>> Tests that now work, but didn't before (2 tests):
>>>
>>> arm-eabi-aem/-marm/-march=armv7-a/-mfpu=vfpv3-d16/-mfloat-abi=softfp: 
>>> gcc.target/arm/mve/pr112337.c (test for excess errors)
>>> arm-eabi-aem/-mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard:
>>>  gcc.target/arm/mve/pr112337.c (test for excess errors)
>>> """
>>>
>>> Ok for trunk? I don't have commit access so could someone please commit on 
>>> my behalf?
>>>
>>> Regards,
>>> Saurabh
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * gcc.target/arm/mve/pr112337.c: Fix the testcase
>>
>>
>> Hmm, could this be related to the changes Christophe made recently to change 
>> the way MVE vector types were set up internally?  If so, this might indicate 
>> an issue that's going to affect real users with existing code.
>>
> 
> My change was only about vector types, here the problem is with a
> pointer to a scalar.
> Anyway, I ran the test with my commit reverted and it still fails in
> the same way, so I think this patch is needed.
> 
> Thanks,
> 
> Christophe
> 
>> Christophe?
>>
>> R.

Ok, thanks for checking.  In that case, Saurabh, your patch is OK, but please 
change 'Fix testcase' to 'Use int32_t instead of int.'

Note that ChangeLog entries end with a full stop.

R.

Re: [PATCH] testsuite, arm: Fix up pr112337.c test

2023-12-01 Thread Richard Earnshaw (lists)

On 01/12/2023 11:28, Saurabh Jha wrote:
> Hey,
> 
> I introduced this test "gcc/testsuite/gcc.target/arm/mve/pr112337.c" in this 
> commit 2365aae84de030bbb006edac18c9314812fc657b before. This had an error 
> which I unfortunately missed. This patch fixes that test.
> 
> Did regression testing on arm-none-eabi and found no regressions. Output of 
> running gcc/contrib/compare_tests is this:
> 
> """
> Tests that now work, but didn't before (2 tests):
> 
> arm-eabi-aem/-marm/-march=armv7-a/-mfpu=vfpv3-d16/-mfloat-abi=softfp: 
> gcc.target/arm/mve/pr112337.c (test for excess errors)
> arm-eabi-aem/-mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard:
>  gcc.target/arm/mve/pr112337.c (test for excess errors)
> """
> 
> Ok for trunk? I don't have commit access so could someone please commit on my 
> behalf?
> 
> Regards,
> Saurabh
> 
> gcc/testsuite/ChangeLog:
> 
>     * gcc.target/arm/mve/pr112337.c: Fix the testcase


Hmm, could this be related to the changes Christophe made recently to change 
the way MVE vector types were set up internally?  If so, this might indicate an 
issue that's going to affect real users with existing code.

Christophe?

R.

Re: [PATCH] aarch64: modify Ampere CPU tunings on reassociation/FMA

2023-11-30 Thread Richard Earnshaw





On 30/11/2023 08:27, Di Zhao OS wrote:

This patch modifies tunings for ampere1/ampere1a/ampere1b, to:

1. Allow reassociation on FP additions.
2. Avoid generating loop-dependant FMA chains. Added a tuning
option for this.

Bootstrapped and tested. Is this ok for trunk?

Thanks,
Di Zhao

gcc/ChangeLog:

 * config/aarch64/aarch64-tuning-flags.def 
(AARCH64_EXTRA_TUNING_OPTION):
 New tuing option to avoid cross-loop FMA.


typo: tuning


 * config/aarch64/aarch64.cc (aarch64_override_options_internal): Set
 param_avoid_fma_max_bits according to tuning option.
 * config/aarch64/tuning_models/ampere1.h: Modify tunings related with
 FMA.
 * config/aarch64/tuning_models/ampere1a.h: Modify tunings related with
 FMA.
 * config/aarch64/tuning_models/ampere1b.h: Modify tunings related with
 FMA.



You need to mention the name of the structure you're modifying.  Also we 
usually just write 'Likewise.' if the change is identical in effect to 
the change immediately above, so


* config/aarch64/tuning_models/ampere1.h (ampere1_tunings):
Modify tunings related with FMA.
* config/aarch64/tuning_models/ampere1a.h (ampere1a_tunings):
Likewise.
* config/aarch64/tuning_models/ampere1b.h (ampere1b_tunings):
Likewise.

Finally, watch your line length.  The total length of the line should 
not go beyond column 72 in commit log entries, unless that involves 
breaking a single word on a line.
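
As a rough aid for the line-length rule, a checker could look something
like the sketch below.  It is only an illustration under simplifying
assumptions: first_overlong_line and MAX_COLUMN are hypothetical names,
every byte (including a tab) is counted as one column, and a long line
is tolerated only when it has no space or tab after its indentation,
i.e. when it is a single unbreakable word.

```c
#include <stddef.h>
#include <string.h>

#define MAX_COLUMN 72

/* Return the 1-based number of the first line in TEXT that is longer
   than MAX_COLUMN bytes and has a space or tab after its leading
   indentation where it could have been broken.  Return 0 if no such
   line exists.  */
size_t
first_overlong_line (const char *text)
{
  size_t lineno = 1;

  while (*text)
    {
      size_t len = strcspn (text, "\n");

      if (len > MAX_COLUMN)
	{
	  /* strspn stops at the newline at the latest, so indent <= len.  */
	  size_t indent = strspn (text, " \t");
	  for (size_t i = indent; i < len; i++)
	    if (text[i] == ' ' || text[i] == '\t')
	      return lineno;
	}

      text += len;
      if (*text == '\n')
	text++;
      lineno++;
    }
  return 0;
}
```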


Otherwise, this is OK.

R.


---
  gcc/config/aarch64/aarch64-tuning-flags.def | 2 ++
  gcc/config/aarch64/aarch64.cc   | 6 ++
  gcc/config/aarch64/tuning_models/ampere1.h  | 2 +-
  gcc/config/aarch64/tuning_models/ampere1a.h | 4 ++--
  gcc/config/aarch64/tuning_models/ampere1b.h | 5 +++--
  5 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-tuning-flags.def 
b/gcc/config/aarch64/aarch64-tuning-flags.def
index 774568e9106..f28a73839a6 100644
--- a/gcc/config/aarch64/aarch64-tuning-flags.def
+++ b/gcc/config/aarch64/aarch64-tuning-flags.def
@@ -47,4 +47,6 @@ AARCH64_EXTRA_TUNING_OPTION ("use_new_vector_costs", 
USE_NEW_VECTOR_COSTS)
  
  AARCH64_EXTRA_TUNING_OPTION ("matched_vector_throughput", MATCHED_VECTOR_THROUGHPUT)
  
+AARCH64_EXTRA_TUNING_OPTION ("avoid_cross_loop_fma", AVOID_CROSS_LOOP_FMA)

+
  #undef AARCH64_EXTRA_TUNING_OPTION
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 64684258b7b..28bc70a787f 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -16083,6 +16083,12 @@ aarch64_override_options_internal (struct gcc_options 
*opts)
&& opts->x_optimize >= aarch64_tune_params.prefetch->default_opt_level)
  opts->x_flag_prefetch_loop_arrays = 1;
  
+  /* Avoid loop-dependant FMA chains.  */

+  if (aarch64_tune_params.extra_tuning_flags
+  & AARCH64_EXTRA_TUNE_AVOID_CROSS_LOOP_FMA)
+SET_OPTION_IF_UNSET (opts, &global_options_set, param_avoid_fma_max_bits,
+512);
+
aarch64_override_options_after_change_1 (opts);
  }
  
diff --git a/gcc/config/aarch64/tuning_models/ampere1.h b/gcc/config/aarch64/tuning_models/ampere1.h

index 8d2a1c69610..a144e8f94b3 100644
--- a/gcc/config/aarch64/tuning_models/ampere1.h
+++ b/gcc/config/aarch64/tuning_models/ampere1.h
@@ -104,7 +104,7 @@ static const struct tune_params ampere1_tunings =
2,  /* min_div_recip_mul_df.  */
0,  /* max_case_values.  */
tune_params::AUTOPREFETCHER_WEAK,   /* autoprefetcher_model.  */
-  (AARCH64_EXTRA_TUNE_NONE),   /* tune_flags.  */
+  (AARCH64_EXTRA_TUNE_AVOID_CROSS_LOOP_FMA),   /* tune_flags.  */
&ampere1_prefetch_tune,
AARCH64_LDP_STP_POLICY_ALIGNED,   /* ldp_policy_model.  */
AARCH64_LDP_STP_POLICY_ALIGNED/* stp_policy_model.  */
diff --git a/gcc/config/aarch64/tuning_models/ampere1a.h 
b/gcc/config/aarch64/tuning_models/ampere1a.h
index c419ffb3c1a..f688ed08a79 100644
--- a/gcc/config/aarch64/tuning_models/ampere1a.h
+++ b/gcc/config/aarch64/tuning_models/ampere1a.h
@@ -50,13 +50,13 @@ static const struct tune_params ampere1a_tunings =
"32:16",  /* loop_align.  */
2,  /* int_reassoc_width.  */
4,  /* fp_reassoc_width.  */
-  1,   /* fma_reassoc_width.  */
+  4,   /* fma_reassoc_width.  */
2,  /* vec_reassoc_width.  */
2,  /* min_div_recip_mul_sf.  */
2,  /* min_div_recip_mul_df.  */
0,  /* max_case_values.  */
tune_params::AUTOPREFETCHER_WEAK,   /* autoprefetcher_model.  */
-  (AARCH64_EXTRA_TUNE_NONE),   /* tune_flags.  */
+  (AARCH64_EXTRA_TUNE_AVOID_CROSS_LOOP_FMA),   /* tune_flags.  */
&ampere1_prefetch_tune,
AARCH64_LDP_STP_POLICY_ALIGNED,   /* ldp_policy_model.  */
AARCH64_LDP_STP_POLICY_ALIGNED/* stp_policy_model.  */
diff --git a/gcc/config/aarch64/tuning_models/ampere1b.h 
b/gcc/config/aarch64/tuning_models/ampere1b.h
index c4928f50d29..a98b6a980f7 100644
--- a/gcc/config/aarch64/tuning_models/ampere1b.h
+++

Re: [PATCH v2] AArch64: Fix strict-align cpymem/setmem [PR103100]

2023-11-30 Thread Richard Earnshaw





On 29/11/2023 18:09, Richard Sandiford wrote:

Wilco Dijkstra  writes:

v2: Use UINTVAL, rename max_mops_size.

The cpymemdi/setmemdi implementation doesn't fully support strict alignment.
Block the expansion if the alignment is less than 16 with STRICT_ALIGNMENT.
Clean up the condition when to use MOPS.

Passes regress/bootstrap, OK for commit?

gcc/ChangeLog/
 PR target/103100
 * config/aarch64/aarch64.md (cpymemdi): Remove pattern condition.
 (setmemdi): Likewise.
 * config/aarch64/aarch64.cc (aarch64_expand_cpymem): Support
 strict-align.  Cleanup condition for using MOPS.
 (aarch64_expand_setmem): Likewise.

---

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 
dd6874d13a75f20d10a244578afc355b25c73da2..8a12894d6b80de1031d6e7d02dca680c57bce136
 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -25261,27 +25261,23 @@ aarch64_expand_cpymem (rtx *operands)
int mode_bits;
rtx dst = operands[0];
rtx src = operands[1];
+  unsigned align = UINTVAL (operands[3]);
rtx base;
machine_mode cur_mode = BLKmode;
+  bool size_p = optimize_function_for_size_p (cfun);

-  /* Variable-sized memcpy can go through the MOPS expansion if available.  */
-  if (!CONST_INT_P (operands[2]))
+  /* Variable-sized or strict-align copies may use the MOPS expansion.  */
+  if (!CONST_INT_P (operands[2]) || (STRICT_ALIGNMENT && align < 16))
  return aarch64_expand_cpymem_mops (operands);

-  unsigned HOST_WIDE_INT size = INTVAL (operands[2]);
-
-  /* Try to inline up to 256 bytes or use the MOPS threshold if available.  */
-  unsigned HOST_WIDE_INT max_copy_size
-= TARGET_MOPS ? aarch64_mops_memcpy_size_threshold : 256;
+  unsigned HOST_WIDE_INT size = UINTVAL (operands[2]);

-  bool size_p = optimize_function_for_size_p (cfun);
+  /* Try to inline up to 256 bytes.  */
+  unsigned max_copy_size = 256;
+  unsigned mops_threshold = aarch64_mops_memcpy_size_threshold;

-  /* Large constant-sized cpymem should go through MOPS when possible.
- It should be a win even for size optimization in the general case.
- For speed optimization the choice between MOPS and the SIMD sequence
- depends on the size of the copy, rather than number of instructions,
- alignment etc.  */
-  if (size > max_copy_size)
+  /* Large copies use MOPS when available or a library call.  */
+  if (size > max_copy_size || (TARGET_MOPS && size > mops_threshold))
  return aarch64_expand_cpymem_mops (operands);


It feels a little unintuitive to be calling aarch64_expand_cpymem_mops
for !TARGET_MOPS, but that's pre-existing, and I can see there are
arguments both ways.

Although !TARGET_SIMD is a niche interest on current trunk, it becomes
important for streaming-compatible mode.  So we might want to look
again at the different handling of !TARGET_SIMD in this function (where
we lower the copy size but not the threshold) and aarch64_expand_setmem
(where we bail out early).  That's not something for this patch though,
just mentioning it.
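
For reference, the constant-size and alignment checks in the patched
aarch64_expand_cpymem hunk can be modelled outside the compiler roughly as
below.  This is only a sketch: struct copy_target, struct copy_request,
choose_cpymem_strategy and the enum names are hypothetical, plain booleans
stand in for TARGET_MOPS and STRICT_ALIGNMENT, and the threshold is passed in
rather than read from aarch64_mops_memcpy_size_threshold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the state the real function reads from
   TARGET_MOPS, STRICT_ALIGNMENT and the MOPS memcpy threshold.  */
struct copy_target
{
  bool have_mops;
  bool strict_align;
  unsigned mops_threshold;
};

/* Hypothetical description of one cpymem request (operands[2] and
   operands[3] in the quoted hunk).  */
struct copy_request
{
  bool size_is_constant;
  unsigned long long size;
  unsigned align;
};

enum copy_strategy
{
  /* Hand over to the MOPS helper; per the patch comment this means MOPS
     when available or a library call.  */
  COPY_DEFER_TO_MOPS_HELPER,
  /* Continue to the inline expansion code that follows in the function.  */
  COPY_TRY_INLINE
};

enum copy_strategy
choose_cpymem_strategy (const struct copy_target *target,
                        const struct copy_request *req)
{
  const unsigned max_copy_size = 256;

  assert (target != NULL && req != NULL);

  /* Variable-sized or under-aligned strict-align copies.  */
  if (!req->size_is_constant || (target->strict_align && req->align < 16))
    return COPY_DEFER_TO_MOPS_HELPER;

  /* Large copies, or copies above the MOPS threshold when MOPS exists.  */
  if (req->size > max_copy_size
      || (target->have_mops && req->size > target->mops_threshold))
    return COPY_DEFER_TO_MOPS_HELPER;

  return COPY_TRY_INLINE;
}
```

With a threshold of 128, a 200-byte constant copy defers to the helper when
have_mops is set and stays on the inline path otherwise, which is the case
the !TARGET_MOPS remark above is about.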

The patch is OK with me, but please give Richard E a day to object.


This is fine by me.

R.



Thanks,
Richard



int copy_bits = 256;
@@ -25445,12 +25441,13 @@ aarch64_expand_setmem (rtx *operands)
unsigned HOST_WIDE_INT len;
rtx dst = operands[0];
rtx val = operands[2], src;
+  unsigned align = UINTVAL (operands[3]);
rtx base;
machine_mode cur_mode = BLKmode, next_mode;

-  /* If we don't have SIMD registers or the size is variable use the MOPS
- inlined sequence if possible.  */
-  if (!CONST_INT_P (operands[1]) || !TARGET_SIMD)
+  /* Variable-sized or strict-align memset may use the MOPS expansion.  */
+  if (!CONST_INT_P (operands[1]) || !TARGET_SIMD
+  || (STRICT_ALIGNMENT && align < 16))
  return aarch64_expand_setmem_mops (operands);

bool size_p = optimize_function_for_size_p (cfun);
@@ -25458,10 +25455,13 @@ aarch64_expand_setmem (rtx *operands)
/* Default the maximum to 256-bytes when considering only libcall vs
   SIMD broadcast sequence.  */
unsigned max_set_size = 256;
+  unsigned mops_threshold = aarch64_mops_memset_size_threshold;

-  len = INTVAL (operands[1]);
-  if (len > max_set_size && !TARGET_MOPS)
-return false;
+  len = UINTVAL (operands[1]);
+
+  /* Large memset uses MOPS when available or a library call.  */
+  if (len > max_set_size || (TARGET_MOPS && len > mops_threshold))
+return aarch64_expand_setmem_mops (operands);

int cst_val = !!(CONST_INT_P (val) && (INTVAL (val) != 0));
/* The MOPS sequence takes:
@@ -25474,12 +25474,6 @@ aarch64_expand_setmem (rtx *operands)
   the arguments + 1 for the call.  */
unsigned libcall_cost = 4;

-  /* Upper bound check.  For large constant-sized setmem use the MOPS sequence
- when available.  */
-  if (TARGET_MOPS
-  && len >= (unsigned HOST_WIDE_INT) aarch64_mops_memset_size_threshold)
-return aarch64_expand_setmem_mops

Re: [PATCH] testsuite/arm: Fix bfloat16_vector_typecheck_[12].c tests

2023-11-30 Thread Richard Earnshaw





On 30/11/2023 10:15, Christophe Lyon wrote:

After commit r14-5617-gb8592186611, int32x[24]_t types now use
elements of 'long int' type instead of 'int' on arm-eabi (it's still
'int' on arm-linux-gnueabihf).  Both are 32-bit types anyway.

This patch adjusts the two tests so that they optionally accept 'long '
before 'int' in the expected error message.
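
The effect of the optional 'long ' can be checked outside DejaGnu with a
small matcher.  This is a sketch under assumptions: it uses POSIX
regcomp/regexec rather than the Tcl regular expressions that dg-error
patterns go through, and since POSIX extended syntax has no (?:...)
non-capturing group, a plain group (long )? stands in for it.
matches_int_diag is a hypothetical helper name.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if MESSAGE contains the diagnostic fragment with either
   'int' or 'long int' as the element type.  */
bool
matches_int_diag (const char *message)
{
  regex_t re;
  bool matched;

  assert (message != NULL);

  /* "(long )?" makes the "long " prefix optional.  */
  if (regcomp (&re, "initializing type '(long )?int' using type",
               REG_EXTENDED | REG_NOSUB) != 0)
    return false;

  matched = regexec (&re, message, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}
```

Both spellings of the element type are accepted, while 'short int' and
'long long int' are still rejected because the opening quote must be followed
directly by 'int' or 'long int'.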

2023-11-30  Christophe Lyon  

gcc/testsuite/
* gcc.target/arm/bfloat16_vector_typecheck_1.c: Update expected
error message.
* gcc.target/arm/bfloat16_vector_typecheck_2.c: Likewise.


OK.

R.


---
  gcc/testsuite/gcc.target/arm/bfloat16_vector_typecheck_1.c | 4 ++--
  gcc/testsuite/gcc.target/arm/bfloat16_vector_typecheck_2.c | 2 +-
  2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/bfloat16_vector_typecheck_1.c b/gcc/testsuite/gcc.target/arm/bfloat16_vector_typecheck_1.c
index f3c350b4cfc..470c13125fb 100644
--- a/gcc/testsuite/gcc.target/arm/bfloat16_vector_typecheck_1.c
+++ b/gcc/testsuite/gcc.target/arm/bfloat16_vector_typecheck_1.c
@@ -119,9 +119,9 @@ bfloat16x4_t footest (bfloat16x4_t vector0)
(bfloat16x4_t) { is_a_short_vec }; /* { dg-error {incompatible types when initializing type '__bf16' using type 'int16x4_t'} } */
  
(bfloat16x4_t) { glob_bfloat_vec }; /* { dg-error {incompatible types when initializing type '__bf16' using type 'bfloat16x4_t'} } */

-  (int32x4_t) { glob_bfloat_vec }; /* { dg-error {incompatible types when initializing type 'int' using type 'bfloat16x4_t'} } */
+  (int32x4_t) { glob_bfloat_vec }; /* { dg-error {incompatible types when initializing type '(?:long )?int' using type 'bfloat16x4_t'} } */
(float32x4_t) { glob_bfloat_vec }; /* { dg-error {incompatible types when initializing type 'float' using type 'bfloat16x4_t'} } */
-  (int32x2_t) { glob_bfloat_vec }; /* { dg-error {incompatible types when initializing type 'int' using type 'bfloat16x4_t'} } */
+  (int32x2_t) { glob_bfloat_vec }; /* { dg-error {incompatible types when initializing type '(?:long )?int' using type 'bfloat16x4_t'} } */
(float16x4_t) { glob_bfloat_vec }; /* { dg-error {incompatible types when initializing type '__fp16' using type 'bfloat16x4_t'} } */
(int16x4_t) { glob_bfloat_vec }; /* { dg-error {incompatible types when initializing type 'short int' using type 'bfloat16x4_t'} } */
  
diff --git a/gcc/testsuite/gcc.target/arm/bfloat16_vector_typecheck_2.c b/gcc/testsuite/gcc.target/arm/bfloat16_vector_typecheck_2.c
index de0ade52c10..4e0d37907ce 100644
--- a/gcc/testsuite/gcc.target/arm/bfloat16_vector_typecheck_2.c
+++ b/gcc/testsuite/gcc.target/arm/bfloat16_vector_typecheck_2.c
@@ -111,7 +111,7 @@ bfloat16x8_t footest (bfloat16x8_t vector0)
(bfloat16x8_t) { is_a_short_vec }; /* { dg-error {incompatible types when initializing type '__bf16' using type 'int16x8_t'} } */
  
(bfloat16x8_t) { glob_bfloat_vec }; /* { dg-error {incompatible types when initializing type '__bf16' using type 'bfloat16x8_t'} } */

-  (int32x4_t) { glob_bfloat_vec }; /* { dg-error {incompatible types when initializing type 'int' using type 'bfloat16x8_t'} } */
+  (int32x4_t) { glob_bfloat_vec }; /* { dg-error {incompatible types when initializing type '(?:long )?int' using type 'bfloat16x8_t'} } */
(float32x4_t) { glob_bfloat_vec }; /* { dg-error {incompatible types when initializing type 'float' using type 'bfloat16x8_t'} } */
(int64x2_t) { glob_bfloat_vec }; /* { dg-error {incompatible types when initializing type 'long long int' using type 'bfloat16x8_t'} } */
(float16x8_t) { glob_bfloat_vec }; /* { dg-error {incompatible types when initializing type '__fp16' using type 'bfloat16x8_t'} } */
