from:"Richard Earnshaw \(lists\)"

Re: [PATCH] aarch64: Fix typo in aarch64-ldp-fusion.cc:combine_reg_notes [PR114936]

2024-05-07 Thread Richard Earnshaw (lists)

On 03/05/2024 15:45, Alex Coplan wrote:
> This fixes a typo in combine_reg_notes in the load/store pair fusion
> pass.  As it stands, the calls to filter_notes store any
> REG_FRAME_RELATED_EXPR to fr_expr with the following association:
> 
>  - i2 -> fr_expr[0]
>  - i1 -> fr_expr[1]
> 
> but then the checks inside the following if statement expect the
> opposite (more natural) association, i.e.:
> 
>  - i2 -> fr_expr[1]
>  - i1 -> fr_expr[0]
> 
> this patch fixes the oversight by swapping the fr_expr indices in the
> calls to filter_notes.
> 
> In hindsight it would probably have been less confusing / error-prone to
> have combine_reg_notes take an array of two insns, then we wouldn't have
> to mix 1-based and 0-based indexing as well as remembering to call
> filter_notes in reverse program order.  This however is a minimal fix
> for backporting purposes.
> 
> Many thanks to Matthew for spotting this typo and pointing it out to me.
> 
> Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk and the 14
> branch after the 14.1 release?
> 
> Thanks,
> Alex
> 
> gcc/ChangeLog:
> 
>   PR target/114936
>   * config/aarch64/aarch64-ldp-fusion.cc (combine_reg_notes):
>   Ensure insn iN has its REG_FRAME_RELATED_EXPR (if any) stored in
>   FR_EXPR[N-1], thus matching the correspondence expected by the
>   copy_rtx calls.


OK.

R.

Re: [PATCH] testsuite: Verify r0-r3 are extended with CMSE

2024-04-30 Thread Richard Earnshaw (lists)

On 30/04/2024 16:37, Torbjorn SVENSSON wrote:
> 
> 
> On 2024-04-30 17:11, Richard Earnshaw (lists) wrote:
>> On 27/04/2024 15:13, Torbjörn SVENSSON wrote:
>>> Add regression test to the existing zero/sign extend tests for CMSE to
>>> verify that r0, r1, r2 and r3 are properly extended, not just r0.
>>>
>>> Test is done using -O0 to ensure the instructions are in a predictable
>>> order.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * gcc.target/arm/cmse/extend-param.c: Add regression test.
>>>
>>> Signed-off-by: Torbjörn SVENSSON 
>>> ---
>>>   .../gcc.target/arm/cmse/extend-param.c    | 20 ++-
>>>   1 file changed, 19 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/gcc/testsuite/gcc.target/arm/cmse/extend-param.c 
>>> b/gcc/testsuite/gcc.target/arm/cmse/extend-param.c
>>> index 01fac786238..b8b8ecbff56 100644
>>> --- a/gcc/testsuite/gcc.target/arm/cmse/extend-param.c
>>> +++ b/gcc/testsuite/gcc.target/arm/cmse/extend-param.c
>>> @@ -93,4 +93,22 @@ __attribute__((cmse_nonsecure_entry)) char 
>>> boolSecureFunc (bool index) {
>>>   return 0;
>>>     return array[index];
>>>   -}
>>> \ No newline at end of file
>>> +}
>>> +
>>> +/*
>>> +**__acle_se_boolCharShortEnumSecureFunc:
>>> +**    ...
>>> +**    uxtb    r0, r0
>>> +**    uxtb    r1, r1
>>> +**    uxth    r2, r2
>>> +**    uxtb    r3, r3
>>> +**    ...
>>> +*/
>>> +__attribute__((cmse_nonsecure_entry,optimize(0))) char 
>>> boolCharShortEnumSecureFunc (bool a, unsigned char b, unsigned short c, 
>>> enum offset d) {
>>> +
>>> +  size_t index = a + b + c + d;
>>> +  if (index >= ARRAY_SIZE)
>>> +    return 0;
>>> +  return array[index];
>>> +
>>> +}
>>
>> Ok, but please can you add '-fshort-enums' to dg-options to ensure this test 
>> still behaves correctly if run with a different default (I missed that last 
>> time around).
> 
> Ok, I'll add that to extend-param.c. Do you want me to also add it to the 
> extend-return.c test case?
> 
> Kind regards,
> Torbjörn

Yes please, if it has the same issue.

R.

Re: [PATCH] testsuite: Verify r0-r3 are extended with CMSE

2024-04-30 Thread Richard Earnshaw (lists)

On 27/04/2024 15:13, Torbjörn SVENSSON wrote:
> Add regression test to the existing zero/sign extend tests for CMSE to
> verify that r0, r1, r2 and r3 are properly extended, not just r0.
> 
> Test is done using -O0 to ensure the instructions are in a predictable
> order.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/cmse/extend-param.c: Add regression test.
> 
> Signed-off-by: Torbjörn SVENSSON 
> ---
>  .../gcc.target/arm/cmse/extend-param.c| 20 ++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/cmse/extend-param.c 
> b/gcc/testsuite/gcc.target/arm/cmse/extend-param.c
> index 01fac786238..b8b8ecbff56 100644
> --- a/gcc/testsuite/gcc.target/arm/cmse/extend-param.c
> +++ b/gcc/testsuite/gcc.target/arm/cmse/extend-param.c
> @@ -93,4 +93,22 @@ __attribute__((cmse_nonsecure_entry)) char boolSecureFunc 
> (bool index) {
>  return 0;
>return array[index];
>  
> -}
> \ No newline at end of file
> +}
> +
> +/*
> +**__acle_se_boolCharShortEnumSecureFunc:
> +**   ...
> +**   uxtbr0, r0
> +**   uxtbr1, r1
> +**   uxthr2, r2
> +**   uxtbr3, r3
> +**   ...
> +*/
> +__attribute__((cmse_nonsecure_entry,optimize(0))) char 
> boolCharShortEnumSecureFunc (bool a, unsigned char b, unsigned short c, enum 
> offset d) {
> +
> +  size_t index = a + b + c + d;
> +  if (index >= ARRAY_SIZE)
> +return 0;
> +  return array[index];
> +
> +}

Ok, but please can you add '-fshort-enums' to dg-options to ensure this test 
still behaves correctly if run with a different default (I missed that last 
time around).

R.

Re: [PATCH][GCC] aarch64: Fix SCHEDULER_IDENT for Cortex-A510

2024-04-26 Thread Richard Earnshaw (lists)

On 25/04/2024 15:59, Richard Ball wrote:
> Hi Richard,
> 
> I committed this combined patch (with Cortex-A520) for trunk 
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=cab53aae43cf94171b01320c08302e47a5daa391
>  
> 
> 
> Am I ok to commit just the Cortex-A510 half into gcc-12 and gcc-13.

Yes, if that's the correct thing to do there.

R.

> 
> Thanks,
> Richard Ball
> --
> *From:* Richard Ball
> *Sent:* 12 March 2024 14:08
> *To:* gcc-patches@gcc.gnu.org ; Richard Earnshaw 
> ; Richard Sandiford ; 
> Marcus Shawcroft 
> *Subject:* [PATCH][GCC] aarch64: Fix SCHEDULER_IDENT for Cortex-A510
>  
> The SCHEDULER_IDENT for this CPU was incorrectly
> set to cortexa55, which is incorrect. This can cause
> sub-optimal asm to be generated.
> 
> Ok for trunk?
> 
> Can I also backport this to gcc-12 and gcc-13?
> 
> gcc/ChangeLog:
>     PR target/114272
>     * config/aarch64/aarch64-cores.def (AARCH64_CORE):
>     Change SCHEDULER_IDENT from cortexa55 to cortexa53
>     for Cortex-A510.

Re: [PATCH] arm: Zero/Sign extends for CMSE security

2024-04-26 Thread Richard Earnshaw (lists)

On 26/04/2024 09:39, Torbjorn SVENSSON wrote:
> Hi,
> 
> On 2024-04-25 16:25, Richard Ball wrote:
>> Hi Torbjorn,
>>
>> Thanks very much for the comments.
>> I think given that the code that handles this, is within a 
>> FOREACH_FUNCTION_ARGS loop.
>> It seems a fairly safe assumption that if the code works for one that it 
>> will work for all.
>> To go back and add extra tests to me seems a little overkill.
> 
> For verifying that the implementation does the right thing now, no, but for 
> verifying against future regressions, then yes.
> 
> So, from a regression point of view, I think it makes sense to have the check 
> that more than the first argument is managed properly.
> 
> Kind regards,
> Torbjörn

Feel free to post some additional tests, Torbjorn.

R.

Re: [PATCH] arm: Zero/Sign extends for CMSE security

2024-04-25 Thread Richard Earnshaw (lists)

On 24/04/2024 16:55, Richard Ball wrote:
> This patch makes the following changes:
> 
> 1) When calling a secure function from non-secure code then any arguments
>smaller than 32-bits that are passed in registers are zero- or 
> sign-extended.
> 2) After a non-secure function returns into secure code then any return value
>smaller than 32-bits that is passed in a register is  zero- or 
> sign-extended.
> 
> This patch addresses the following CVE-2024-0151.
> 
> gcc/ChangeLog:
> PR target/114837
> * config/arm/arm.cc (cmse_nonsecure_call_inline_register_clear):
>   Add zero/sign extend.
> (arm_expand_prologue): Add zero/sign extend.
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.target/arm/cmse/extend-param.c: New test.
> * gcc.target/arm/cmse/extend-return.c: New test.

OK.  And OK to backport to active branches.

R.

Re: Updated Sourceware infrastructure plans

2024-04-23 Thread Richard Earnshaw (lists) via Gcc

On 23/04/2024 09:56, Mark Wielaard wrote:
> Hi,
> 
> On Mon, Apr 22, 2024 at 11:51:00PM -0400, Jason Merrill wrote:
>> On Mon, Apr 22, 2024 at 11:24 PM Tom Tromey  wrote:
>>> Jason> Someone mentioned earlier that gerrit was previously tried
>>> Jason> unsuccessfully.
>>>
>>> We tried it and gdb and then abandoned it.  We tried to integrate it
>>> into the traditional gdb development style, having it send email to
>>> gdb-patches.  I found these somewhat hard to read and in the end we
>>> agreed not to use it.
>>>
>>> I've come around again to thinking we should probably abandon email
>>> instead.  For me the main benefit is that gerrit has patch tracking,
>>> unlike our current system, where losing patches is fairly routine.
>>
>> Indeed.  Though Patchwork is another option for patch tracking, that glibc
>> seem to be having success with.
> 
> Patchworks works if you have people that like it and keep on top of
> it. For elfutils Aaron and I are probably the only ones that use it,
> but if we just go over it once a week it keeps being managable and
> nobody else needs to care. That is also why it seems to work for
> glibc. If you can carve out an hour a week going over the submitted
> patches and delegate them then it is a really useful patch tracking
> tool. Obviously that only works if the patch flow isn't overwhelming
> to begin with...
> 
> I'll work with Sergio who setup the original gerrit instance to
> upgrade it to the latest gerrit version so people try it out. One nice
> thing he did was to automate self-service user registration. Although
> that is one of the things I don't really like about it. As Tom said it
> feels like gerrit is an all or nothing solution that has to be
> mandated to be used and requires everybody to have a centralized
> login. But if you do want that then how Sergio set it up is pretty
> nice. It is just one more thing to monitor for spam accounts...
> 
> Cheers,
> 
> Mark

I've been using patchwork with GCC since, roughly, last year's cauldron.  Its 
main weakness is a poor search function for finding relevant patches, which 
means that since most patches in the queue aren't being managed it's a bit 
hit-and-miss finding the relevant patches.

Its other problem is that it expects a particular workflow model, particularly 
not replying to an existing patch discussion with an updated patch (it expects 
patches to be reposted as an entire series with a new version number, 
Linux-style).

But there are some benefits to using it: I can integrate it with my mail client 
- it can show me the patch series in patchwork when I receive a mail directly; 
it integrates well with git with the git-pw module, so I can pull an entire 
patch series off the list into my worktree from the command line just by 
knowing the patch series number; and I can manage/track patches in bundles, 
which is great if I don't have time in any particular day to deal with the 
email volume.

Finally, it's been integrated with our CI systems (thanks Linaro!), so it can 
automatically pull reviews and run validations on them, then report the results 
back; often before I've even had time to look at the patch.

R.

Re: Updated Sourceware infrastructure plans

2024-04-23 Thread Richard Earnshaw (lists) via Gcc

On 23/04/2024 04:24, Tom Tromey wrote:
> Jason> Someone mentioned earlier that gerrit was previously tried
> Jason> unsuccessfully.
> 
> We tried it and gdb and then abandoned it.  We tried to integrate it
> into the traditional gdb development style, having it send email to
> gdb-patches.  I found these somewhat hard to read and in the end we
> agreed not to use it.
> 
> I've come around again to thinking we should probably abandon email
> instead.  For me the main benefit is that gerrit has patch tracking,
> unlike our current system, where losing patches is fairly routine.
> 
> Jason> I think this is a common pattern in GCC at least: someone has an
> Jason> idea for a workflow improvement, and gets it working, but it
> Jason> isn't widely adopted.
> 
> It essentially has to be mandated, IMO.
> 
> For GCC this seems somewhat harder since the community is larger, so
> there's more people to convince.
> 
> Tom

I've been forced to use gerrit occasionally.  I hate it.  No, I LOATHE it.  The 
UI is anything but intuitive with features hidden behind unobvious selections.  
IMO It's not a tool for a casual developer, which makes it a bad introduction 
to developing software.

R.

Re: [PATCH] [testsuite] [arm] require arm_v8_1m_main for pacbti tests

2024-04-19 Thread Richard Earnshaw (lists)

On 19/04/2024 13:45, Alexandre Oliva wrote:
> On Apr 16, 2024, "Richard Earnshaw (lists)"  wrote:
> 
>> The require-effective-target flags test whether a specific set of
>> flags will make the compilation work, so they need to be used in
>> conjunction with the corresponding dg-add-options flags that then
>> apply those options.
> 
> *nod*, that's the theory.  Problem is the architectures suported by
> [add_options_for_]arm_arch_*[_ok] do not match exactly those expected by
> the tests, and I can't quite tell whether the subtle changes they would
> introduce would change what they intend to test, or even whether the
> differences are irrelevant, or would be sensible to add as variants to
> the dg machinery.  I think it would take someone more familiar than I am
> with all of the ARM variants to do this correctly.  I don't even know
> how these changes would need to be tested to be sure they remain
> correct.

It's ok to add additional variations to the table of variants in 
target-supports.exp, but we should avoid writing new specific run-time 
functions unless we really want an executable test.

I started doing some cleanup of the Arm tests infrastructure during phase 3, 
but stopped during phase 4 as I wanted to minimise the changes being made now.  
I plan to go back and work on it some more once stage 1 re-opens.

> 
> Would you be willing to take it from here, or would you accept the patch
> as an incremental yet imperfect improvement, or would you prefer to
> guide me in making it correct, and in verifying it (there are questions
> below)?  I don't have a lot of cycles to put into this (we've already
> worked around the testsuite bugs we ran into), but it would be desirable
> to get a fix into GCC as well, if we can converge on one without
> unreasonably burdening anyone.
> 
> 
>   v8_1m_main "-march=armv8.1-m.main+fp -mthumb" __ARM_ARCH_8M_MAIN__
>   v8_1m_main_pacbti "-march=armv8.1-m.main+pacbti+fp -mthumb"
>   "__ARM_ARCH_8M_MAIN__ && __ARM_FEATURE_BTI && 
> __ARM_FEATURE_PAUTH
> 
> Why do these have +fp in -march but not in the v8_1m* arch name?

It's ... complicated :)

The +fp is there because, with the move to having -mfpu=auto as the default, we 
need to avoid problems when the compiler has been configured with 
--with-float=hard, which requires the extension register set (fp or vector 
support) even if the test code itself doesn't care.  The best way to handle 
this in most cases is to give the architecture strings a default FPU 
specification (ie +fp). 

> 
> 
> gcc/testsuite/g++.target/arm/pac-1.C:
> /* { dg-options "-march=armv8.1-m.main+mve+pacbti -mbranch-protection=pac-ret 
> -mthumb -mfloat-abi=hard -g -O0" } */
> 
> v8_1m_main_pacbti plus +mve minus +fp.
> Do we need a dg arch for that?

I'd be inclined to drop +mve from this one; there's nothing I can see in the 
test that would generate mve instructions, so I think it's irrelevant.  We can 
use the existing v8_1m_main_pacbti operations.

> 
> 
> gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c:
> /* { dg-additional-options "-march=armv8.1-m.main+pacbti+fp --save-temps 
> -mfloat-abi=hard" } */
> gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c:
> /* { dg-options "-march=armv8.1-m.main+fp+pacbti" } */
> 
> v8_1m_main_pacbti minus -mthumb.
> AFAICT the -mthumb is redundant.

Nearly, but not quite.  Although the gcc driver knows that m-profile 
architectures require thumb, that's not enough to override an explicit -marm 
from a testsuite configuration run, so if your site.exp file adds -marm in a 
test configuration we need to override that or the test will fail.  But the 
table based list of options will do that for you.

> 
> 
> gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c:
> /* { dg-options "-march=armv8-m.main+fp -mfloat-abi=softfp" } */
> 
> v8_1m_main minus -mthumb.
> AFAICT the -mthumb is redundant.

As above

> 
> 
> gcc/testsuite/gcc.target/arm/bti-1.c:
> /* { dg-options "-march=armv8.1-m.main -mthumb -mfloat-abi=softfp 
> -mbranch-protection=bti --save-temps" } */
> gcc/testsuite/gcc.target/arm/bti-2.c:
> /* { dg-options "-march=armv8.1-m.main -mthumb -mfloat-abi=softfp 
> -mbranch-protection=bti --save-temps" } */
> 
> v8_1m_main minus +fp.> 
> Can these be bumped to +fp, or do we need an extra dg arch?
> 
> Are these missing +pacbti?

The tests themselves do not require fp, but if we use the effective-target 
rules (arm_arch_v8_1m_main), we can remove the -march, -mthumb and -mfloat-abi 
flags from these tests.

These tests for BTI should NOT have +pacbti: they're testing that the compiler 
generates the right nop-based implementation that is backw

Re: [PATCH]AArch64: remove reliance on register allocator for simd/gpreg costing. [PR114741]

2024-04-18 Thread Richard Earnshaw (lists)

On 18/04/2024 11:11, Tamar Christina wrote:
> Hi All,
> 
> In PR114741 we see that we have a regression in codegen when SVE is enable 
> where
> the simple testcase:
> 
> void foo(unsigned v, unsigned *p)
> {
> *p = v & 1;
> }
> 
> generates
> 
> foo:
> fmovs31, w0
> and z31.s, z31.s, #1
> str s31, [x1]
> ret
> 
> instead of:
> 
> foo:
> and w0, w0, 1
> str w0, [x1]
> ret
> 
> This causes an impact it not just codesize but also performance.  This is 
> caused
> by the use of the ^ constraint modifier in the pattern 3.
> 
> The documentation states that this modifier should only have an effect on the
> alternative costing in that a particular alternative is to be preferred unless
> a non-psuedo reload is needed.
> 
> The pattern was trying to convey that whenever both r and w are required, that
> it should prefer r unless a reload is needed.  This is because if a reload is
> needed then we can construct the constants more flexibly on the SIMD side.
> 
> We were using this so simplify the implementation and to get generic cases 
> such
> as:
> 
> double negabs (double x)
> {
>unsigned long long y;
>memcpy (, , sizeof(double));
>y = y | (1UL << 63);
>memcpy (, , sizeof(double));
>return x;
> }
> 
> which don't go through an expander.
> However the implementation of ^ in the register allocator is not according to
> the documentation in that it also has an effect during coloring.  During 
> initial
> register class selection it applies a penalty to a class, similar to how ? 
> does.
> 
> In this example the penalty makes the use of GP regs expensive enough that it 
> no
> longer considers them:
> 
> r106: preferred FP_REGS, alternative NO_REGS, allocno FP_REGS
> ;;3--> b  0: i   9 r106=r105&0x1
> :cortex_a53_slot_any:GENERAL_REGS+0(-1)FP_REGS+1(1)PR_LO_REGS+0(0)
>  PR_HI_REGS+0(0):model 4
> 
> which is not the expected behavior.  For GCC 14 this is a conservative fix.
> 
> 1. we remove the ^ modifier from the logical optabs.
> 
> 2. In order not to regress copysign we then move the copysign expansion to
>directly use the SIMD variant.  Since copysign only supports floating point
>modes this is fine and no longer relies on the register allocator to select
>the right alternative.
> 
> It once again regresses the general case, but this case wasn't optimized in
> earlier GCCs either so it's not a regression in GCC 14.  This change gives
> strict better codegen than earlier GCCs and still optimizes the important 
> cases.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
> 
>   PR target/114741
>   * config/aarch64/aarch64.md (3): Remove ^ from alt 2.
>   (copysign3): Use SIMD version of IOR directly.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR target/114741
>   * gcc.target/aarch64/fneg-abs_2.c: Update codegen.
>   * gcc.target/aarch64/fneg-abs_4.c: xfail for now.
>   * gcc.target/aarch64/pr114741.c: New test.
> 
> ---
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 
> 385a669b9b3c31cc9108a660e881b9091c71fc7c..dbde066f7478bec51a8703b017ea553aa98be309
>  100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -4811,7 +4811,7 @@ (define_insn "3"
>""
>{@ [ cons: =0 , 1  , 2; attrs: type , arch  ]
>   [ r, %r , r; logic_reg   , * ] \t%0, 
> %1, %2
> - [ rk   , ^r ,  ; logic_imm   , * ] \t%0, 
> %1, %2
> + [ rk   , r  ,  ; logic_imm   , * ] \t%0, 
> %1, %2
>   [ w, 0  ,  ; *   , sve   ] \t%Z0., 
> %Z0., #%2
>   [ w, w  , w; neon_logic  , simd  ] 
> \t%0., %1., %2.
>}
> @@ -7192,22 +7192,29 @@ (define_expand "copysign3"
> (match_operand:GPF 2 "nonmemory_operand")]
>"TARGET_SIMD"
>  {
> -  machine_mode int_mode = mode;
> -  rtx bitmask = gen_reg_rtx (int_mode);
> -  emit_move_insn (bitmask, GEN_INT (HOST_WIDE_INT_M1U
> - << (GET_MODE_BITSIZE (mode) - 1)));
> +  rtx signbit_const = GEN_INT (HOST_WIDE_INT_M1U
> +<< (GET_MODE_BITSIZE (mode) - 1));
>/* copysign (x, -1) should instead be expanded as orr with the sign
>   bit.  */
>rtx op2_elt = unwrap_const_vec_duplicate (operands[2]);
>if (GET_CODE (op2_elt) == CONST_DOUBLE
>&& real_isneg (CONST_DOUBLE_REAL_VALUE (op2_elt)))
>  {
> -  emit_insn (gen_ior3 (
> - lowpart_subreg (int_mode, operands[0], mode),
> - lowpart_subreg (int_mode, operands[1], mode), bitmask));
> +  rtx v_bitmask
> + = force_reg (V2mode,
> +  gen_const_vec_duplicate (V2mode,
> +   signbit_const));
> +
> +  emit_insn (gen_iorv23 (
> + lowpart_subreg (V2mode, operands[0], mode),
> + lowpart_subreg

Re: [PATCH] [testsuite] [arm] accept empty init for bfloat16

2024-04-16 Thread Richard Earnshaw (lists)

On 16/04/2024 04:50, Alexandre Oliva wrote:
> 
> Complete r13-2205, adjusting an arm-specific test that expects a
> no-longer-issued error at an empty initializer.
> 
> Regstrapped on x86_64-linux-gnu.  Also tested with gcc-13 on arm-,
> aarch64-, x86- and x86_64-vxworks7r2.  Ok to install?
> 
> 
> for  gcc/testsuite/ChangeLog
> 
>   * gcc.target/arm/bfloat16_scalar_typecheck.c: Accept C23
> empty initializers.
> ---
>  .../gcc.target/arm/bfloat16_scalar_typecheck.c |4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/bfloat16_scalar_typecheck.c 
> b/gcc/testsuite/gcc.target/arm/bfloat16_scalar_typecheck.c
> index 8c80c55bc9f4c..04ede93bda152 100644
> --- a/gcc/testsuite/gcc.target/arm/bfloat16_scalar_typecheck.c
> +++ b/gcc/testsuite/gcc.target/arm/bfloat16_scalar_typecheck.c
> @@ -42,7 +42,7 @@ bfloat16_t footest (bfloat16_t scalar0)
>short initi_1_4 = glob_bfloat; /* { dg-error {invalid conversion from type 
> 'bfloat16_t'} } */
>double initi_1_5 = glob_bfloat; /* { dg-error {invalid conversion from 
> type 'bfloat16_t'} } */
>  
> -  bfloat16_t scalar2_1 = {}; /* { dg-error {empty scalar initializer} } */
> +  bfloat16_t scalar2_1 = {};
>bfloat16_t scalar2_2 = { glob_bfloat };
>bfloat16_t scalar2_3 = { 0 }; /* { dg-error {invalid conversion to type 
> 'bfloat16_t'} } */
>bfloat16_t scalar2_4 = { 0.1 }; /* { dg-error {invalid conversion to type 
> 'bfloat16_t'} } */
> @@ -94,7 +94,7 @@ bfloat16_t footest (bfloat16_t scalar0)
>  
>/* Compound literals.  */
>  
> -  (bfloat16_t) {}; /* { dg-error {empty scalar initializer} } */
> +  (bfloat16_t) {};
>(bfloat16_t) { glob_bfloat };
>(bfloat16_t) { 0 }; /* { dg-error {invalid conversion to type 
> 'bfloat16_t'} } */
>(bfloat16_t) { 0.1 }; /* { dg-error {invalid conversion to type 
> 'bfloat16_t'} } */
> 


This test is checking for errors.  Perhaps it would be better to select an 
older version of the standard and then set pedantic-error mode.

R.

Re: [testsuite] [aarch64] Require fpic effective target

2024-04-16 Thread Richard Earnshaw (lists)

On 16/04/2024 04:08, Alexandre Oliva wrote:
> Regstrapped on x86_64-linux-gnu.  Also tested with gcc-13 on arm-,
> aarch64-, x86- and x86_64-vxworks7r2.  Ok to install?
> 
> Co-authored-by: Olivier Hainque 
> 
> for  gcc/testsuite/ChangeLog
> 
>   * gcc.target/aarch64/pr94201.c: Add missing
>   dg-require-effective-target fpic.
>   * gcc.target/aarch64/pr103085.c: Likewise.
> 
> ---
>  gcc/testsuite/gcc.target/aarch64/pr103085.c |1 +
>  gcc/testsuite/gcc.target/aarch64/pr94201.c  |1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr103085.c 
> b/gcc/testsuite/gcc.target/aarch64/pr103085.c
> index dbc9c15b71f22..347280ed42b2d 100644
> --- a/gcc/testsuite/gcc.target/aarch64/pr103085.c
> +++ b/gcc/testsuite/gcc.target/aarch64/pr103085.c
> @@ -1,5 +1,6 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -fstack-protector-strong -fPIC" } */
> +/* { dg-require-effective-target fpic } */
>  
>  void g(int*);
>  void
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr94201.c 
> b/gcc/testsuite/gcc.target/aarch64/pr94201.c
> index 691761691868a..3b9b79059e02b 100644
> --- a/gcc/testsuite/gcc.target/aarch64/pr94201.c
> +++ b/gcc/testsuite/gcc.target/aarch64/pr94201.c
> @@ -1,5 +1,6 @@
>  /* { dg-do compile } */
>  /* { dg-options "-mcmodel=tiny -mabi=ilp32 -fPIC" } */
> +/* { dg-require-effective-target fpic } */
>  
>  extern int bar (void *);
>  extern long long a;
> 


OK

R.

Re: [PATCH] [testsuite] [arm] require arm_v8_1m_main for pacbti tests

2024-04-16 Thread Richard Earnshaw (lists)

On 16/04/2024 04:48, Alexandre Oliva wrote:
> 
> arm pac and bti tests that use -march=armv8.1-m.main get an implicit
> -mthumb, that is incompatible with vxworks kernel mode.  Declaring the
> requirement for a 8.1-m.main-compatible toolchain is enough to avoid
> those fails, because the toolchain feature test fails in kernel mode.
> 
> Regstrapped on x86_64-linux-gnu.  Also tested with gcc-13 on arm-,
> aarch64-, x86- and x86_64-vxworks7r2.  Ok to install?
> 
> 
> for  gcc/testsuite/ChangeLog
> 
>   * g++.target/arm/pac-1.C: Require arm_arch_v8_1m_main.
>   * gcc.target/arm/acle/pacbti-m-predef-11.c: Likewise.
>   * gcc.target/arm/acle/pacbti-m-predef-12.c: Likewise.
>   * gcc.target/arm/acle/pacbti-m-predef-7.c: Likewise.
>   * gcc.target/arm/bti-1.c: Likewise.
>   * gcc.target/arm/bti-2.c: Likewise.
> ---
>  gcc/testsuite/g++.target/arm/pac-1.C   |1 +
>  .../gcc.target/arm/acle/pacbti-m-predef-11.c   |1 +
>  .../gcc.target/arm/acle/pacbti-m-predef-12.c   |1 +
>  .../gcc.target/arm/acle/pacbti-m-predef-7.c|1 +
>  gcc/testsuite/gcc.target/arm/bti-1.c   |1 +
>  gcc/testsuite/gcc.target/arm/bti-2.c   |1 +
>  6 files changed, 6 insertions(+)
> 
> diff --git a/gcc/testsuite/g++.target/arm/pac-1.C 
> b/gcc/testsuite/g++.target/arm/pac-1.C
> index f671a27b048c6..f48ad6cc5cb65 100644
> --- a/gcc/testsuite/g++.target/arm/pac-1.C
> +++ b/gcc/testsuite/g++.target/arm/pac-1.C
> @@ -2,6 +2,7 @@
>  /* { dg-do compile } */
>  /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
> "-mcpu=*" } } */
>  /* { dg-options "-march=armv8.1-m.main+mve+pacbti 
> -mbranch-protection=pac-ret -mthumb -mfloat-abi=hard -g -O0" } */
> +/* { dg-require-effective-target arm_arch_v8_1m_main_ok } */

The require-effective-target flags test whether a specific set of flags will 
make the compilation work, so they need to be used in conjunction with the 
corresponding dg-add-options flags that then apply those options.  It isn't 
safe to just add a different architecture flag instead.  So if you're going to 
use this effective target, you should use it along with "dg-add-options 
arm_arch_v8_1m_main" (ie the effective-target name minus the trailing '_ok'), 
and then replace dg-options with dg-additional-options adding the remaining 
flags.  You can then remove the dg-skip-if as well because that's what the 
require-effective-target flag is doing.  So something like

dg-do compile
dg-require-effective-target arm_arch_v8_1m_main_ok
dg-add-options arm_arch_v8_1m_main
dg-additional-options "-mbranch-protection=pac-ret -g -O0"

But this test is also adding pacbti to the architecture flags, so it would 
probably be better to use v8_1m_main_pacbti_ok as the effective target.  It's 
not identical to the options above, but it's probably sufficient for this test. 
 Each test below will need checking for the exact flags that are needed for the 
test in question.


>  
>  __attribute__((noinline)) void
>  fn1 (int a, int b, int c)
> diff --git a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c 
> b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c
> index 6a5ae92c567f3..dba4f491cfea7 100644
> --- a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c
> +++ b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c
> @@ -1,6 +1,7 @@
>  /* { dg-do compile } */
>  /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
> "-mcpu=*" "-mfloat-abi=*" } } */
>  /* { dg-options "-march=armv8.1-m.main+fp+pacbti" } */
> +/* { dg-require-effective-target arm_arch_v8_1m_main_ok } */
>  
>  #if (__ARM_FEATURE_BTI != 1)
>  #error "Feature test macro __ARM_FEATURE_BTI_DEFAULT should be defined to 1."
> diff --git a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c 
> b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c
> index db40b17c3b030..308a41eb4ba4c 100644
> --- a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c
> +++ b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c
> @@ -1,6 +1,7 @@
>  /* { dg-do compile } */
>  /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
> "-mcpu=*" } } */
>  /* { dg-options "-march=armv8-m.main+fp -mfloat-abi=softfp" } */
> +/* { dg-require-effective-target arm_arch_v8_1m_main_ok } */
>  
>  #if defined (__ARM_FEATURE_BTI)
>  #error "Feature test macro __ARM_FEATURE_BTI should not be defined."
> diff --git a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c 
> b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c
> index 1b25907635e24..10836a84bde56 100644
> --- a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c
> +++ b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c
> @@ -1,6 +1,7 @@
>  /* { dg-do compile } */
>  /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
> "-mcpu=*" } } */
>  /* { dg-additional-options "-march=armv8.1-m.main+pacbti+fp --save-temps 
> -mfloat-abi=hard" } */
> +/* {

Re: Patches submission policy change

2024-04-08 Thread Richard Earnshaw (lists) via Gcc

On 03/04/2024 14:23, Christophe Lyon via Gcc wrote:
> On Wed, 3 Apr 2024 at 14:59, Joel Sherrill  wrote:
>>
>> Another possible issue which may be better now than in years past
>> is that the versions of autoconf/automake required often had to be
>> installed by hand. I think newlib has gotten better but before the
>> rework on its Makefile/configure, I had a special install of autotools
>> which precisely matched what it required.
>>
>> And that led to very few people being able to successfully regenerate.
>>
>> Is that avoidable?
>>
>> OTOH the set of people touching these files may be small enough that
>> pain isn't an issue.
>>
> 
> For binutils/gcc/gdb we still have to use specific versions which are
> generally not the distro's ones.

That's because at least some distros modify autoconf to their own taste/needs, 
so that it does not generate the same output as the officially released 
version.  Furthermore, they provide no mechanism to make their version revert 
back to the original behaviour.

R.

Re: [PATCH 1/1] aarch64: Sync aarch64-sys-regs.def with Binutils

2024-03-20 Thread Richard Earnshaw (lists)

On 20/03/2024 11:21, Yury Khrustalev wrote:
> This patch updates `aarch64-sys-regs.def', bringing it into sync with
> the Binutils source.
> 
> gcc/ChangeLog:
> 
> * config/aarch64/aarch64-sys-regs.def: Copy from Binutils.

Thanks, I've pushed this.  It's trivial enough and there's value of keeping it 
in sync with binutils.

One comment though, there should be one hard tab before "* config/..."; you 
seem to have some other random characters there that looked like white space.

R.

> ---
>  gcc/config/aarch64/aarch64-sys-regs.def | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/gcc/config/aarch64/aarch64-sys-regs.def 
> b/gcc/config/aarch64/aarch64-sys-regs.def
> index 6a948171d6e..8b65673a5d6 100644
> --- a/gcc/config/aarch64/aarch64-sys-regs.def
> +++ b/gcc/config/aarch64/aarch64-sys-regs.def
> @@ -521,6 +521,7 @@
>SYSREG ("id_aa64isar0_el1",CPENC (3,0,0,6,0),  F_REG_READ, 
> AARCH64_NO_FEATURES)
>SYSREG ("id_aa64isar1_el1",CPENC (3,0,0,6,1),  F_REG_READ, 
> AARCH64_NO_FEATURES)
>SYSREG ("id_aa64isar2_el1",CPENC (3,0,0,6,2),  F_REG_READ, 
> AARCH64_NO_FEATURES)
> +  SYSREG ("id_aa64isar3_el1",CPENC (3,0,0,6,3),  F_REG_READ, 
> AARCH64_NO_FEATURES)
>SYSREG ("id_aa64mmfr0_el1",CPENC (3,0,0,7,0),  F_REG_READ, 
> AARCH64_NO_FEATURES)
>SYSREG ("id_aa64mmfr1_el1",CPENC (3,0,0,7,1),  F_REG_READ, 
> AARCH64_NO_FEATURES)
>SYSREG ("id_aa64mmfr2_el1",CPENC (3,0,0,7,2),  F_REG_READ, 
> AARCH64_NO_FEATURES)

Re: [PATCH] arm: [MVE intrinsics] Fix support for loads [PR target/114323]

2024-03-18 Thread Richard Earnshaw (lists)





On 15/03/2024 20:08, Christophe Lyon wrote:

The testcase in this PR shows that we would load from an uninitialized
location, because the vld1 instrinsics are reported as "const". This
is because function_instance::reads_global_state_p() does not take
CP_READ_MEMORY into account.  Fixing this gives vld1 the "pure"
attribute instead, and solves the problem.

2024-03-15  Christophe Lyon  

PR target/114323
gcc/
* config/arm/arm-mve-builtins.cc
(function_instance::reads_global_state_p): Take CP_READ_MEMORY
into account.

gcc/testsuite/
* gcc.target/arm/mve/pr114323.c: New.


OK.

R.


---
  gcc/config/arm/arm-mve-builtins.cc  |  2 +-
  gcc/testsuite/gcc.target/arm/mve/pr114323.c | 22 +
  2 files changed, 23 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/gcc.target/arm/mve/pr114323.c

diff --git a/gcc/config/arm/arm-mve-builtins.cc 
b/gcc/config/arm/arm-mve-builtins.cc
index 2f2c0f4a02a..6a5775c67e5 100644
--- a/gcc/config/arm/arm-mve-builtins.cc
+++ b/gcc/config/arm/arm-mve-builtins.cc
@@ -746,7 +746,7 @@ function_instance::reads_global_state_p () const
if (flags & CP_READ_FPCR)
  return true;
  
-  return false;

+  return flags & CP_READ_MEMORY;
  }
  
  /* Return true if calls to the function could modify some form of

diff --git a/gcc/testsuite/gcc.target/arm/mve/pr114323.c 
b/gcc/testsuite/gcc.target/arm/mve/pr114323.c
new file mode 100644
index 000..bd9127b886a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/pr114323.c
@@ -0,0 +1,22 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_mve_hw } */
+/* { dg-options "-O2" } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+
+#include 
+
+__attribute__((noipa))
+uint32x4_t foo (void) {
+  uint32x4_t V0 = vld1q_u32(((const uint32_t[4]){1, 2, 3, 4}));
+  return V0;
+}
+
+int main(void)
+{
+  uint32_t buf[4];
+ vst1q_u32 (buf, foo());
+
+  for (int i = 0; i < 4; i++)
+if (buf[i] != i+1)
+  __builtin_abort ();
+}

Re: [PATCH] testsuite: Turn errors back into warnings in arm/acle/cde-mve-error-2.c

2024-03-18 Thread Richard Earnshaw (lists)





On 15/03/2024 15:13, Thiago Jung Bauermann wrote:


Hello,

"Richard Earnshaw (lists)"  writes:


On 13/01/2024 20:46, Thiago Jung Bauermann wrote:

diff --git a/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c 
b/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c
index 5b7774825442..da283a06a54d 100644
--- a/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c
+++ b/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c
@@ -2,6 +2,7 @@

  /* { dg-do assemble } */
  /* { dg-require-effective-target arm_v8_1m_main_cde_mve_fp_ok } */
+/* { dg-options "-fpermissive" } */
  /* { dg-add-options arm_v8_1m_main_cde_mve_fp } */

  /* The error checking files are split since there are three kinds of
@@ -115,73 +116,73 @@ uint8x16_t test_bad_immediates (uint8x16_t n, uint8x16_t 
m, int someval,

/* `imm' is of wrong type.  */
accum += __arm_vcx1q_u8 (0, "");/* { dg-error {argument 
2 to '__builtin_arm_vcx1qv16qi' must be a constant immediate in range \[0-4095\]} } */
-  /* { dg-warning {passing argument 2 of '__builtin_arm_vcx1qv16qi' makes integer from 
pointer without a cast \[-Wint-conversion\]} "" { target *-*-* } 117 } */
+  /* { dg-warning {passing argument 2 of '__builtin_arm_vcx1qv16qi' makes integer from 
pointer without a cast \[-Wint-conversion\]} "" { target *-*-* } 118 } */


Absolute line numbers are a pain, but I think we can use '.-1' (without the 
quotes) in
these cases to minimize the churn.


That worked, thank you for the tip.


If that works, ok with that change.


I took the opportunity to request commit access to the GCC repo so that
I can commit the patch myself. Sorry for the delay. I'll commit it as
soon as I get it.

Thank you for the patch review! I'm including below the updated version.


I pushed this, thanks.

R.



--
Thiago


 From 78e70788da5ed849d7828b0219d3aa8955ad0fea Mon Sep 17 00:00:00 2001
From: Thiago Jung Bauermann 
Date: Sat, 13 Jan 2024 14:28:07 -0300
Subject: [PATCH v2] testsuite: Turn errors back into warnings in
  arm/acle/cde-mve-error-2.c
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Since commit 2c3db94d9fd ("c: Turn int-conversion warnings into
permerrors") the test fails with errors such as:

   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
32)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
33)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
34)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
35)
 ⋮
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 118 (test for 
warnings, line 117)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
119)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 120 (test for 
warnings, line 119)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
121)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 122 (test for 
warnings, line 121)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
123)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 124 (test for 
warnings, line 123)
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
125)
 ⋮
   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0  (test for excess errors)

There's a total of 1016 errors.  Here's a sample of the excess errors:

   Excess errors:
   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:117:31: 
error: passing argument 2 of '__builtin_arm_vcx1qv16qi' makes integer from 
pointer without a cast [-Wint-conversion]
   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:119:3: 
error: passing argument 3 of '__builtin_arm_vcx1qav16qi' makes integer from 
pointer without a cast [-Wint-conversion]
   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:121:3: 
error: passing argument 3 of '__builtin_arm_vcx2qv16qi' makes integer from 
pointer without a cast [-Wint-conversion]
   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:123:3: 
error: passing argument 3 of '__builtin_arm_vcx2qv16qi' makes integer from 
pointer without a cast [-Wint-conversion]

The test expects these messages to be warnings, not errors.  My first try
was to change it to expect them as errors instead.  This didn't work, IIUC
because the error prevents the compiler from continuing processing the file
and thus other errors which are expected by the test don't get emitted.

Therefore, add -fpermissive so that the test behaves as it did previously.
Because of the additional line in the header, the line numbers of the
expected warnings don't match anymore so replace them with ".-1" as
suggested by Richard Earnshaw.

Tested on armv8l-linux-gnueabihf.

gcc/testsuite/ChangeLog:
* gcc.target/arm/acle/cde-mve-error-

Re: [PATCH] aarch64: Fix TImode __sync_*_compare_and_exchange expansion with LSE [PR114310]

2024-03-14 Thread Richard Earnshaw (lists)





On 14/03/2024 08:37, Jakub Jelinek wrote:

Hi!

The following testcase ICEs with LSE atomics.
The problem is that the @atomic_compare_and_swap expander uses
aarch64_reg_or_zero predicate for the desired operand, which is fine,
given that for most of the modes and even for TImode in some cases
it can handle zero immediate just fine, but the TImode
@aarch64_compare_and_swap_lse just uses register_operand for
that operand instead, again intentionally so, because the casp,
caspa, caspl and caspal instructions need to use a pair of consecutive
registers for the operand and xzr is just one register and we can't
just store zero into the link register to emulate pair of zeros.

So, the following patch fixes that by forcing the newval operand into
a register for the TImode LSE case.

Bootstrapped/regtested on aarch64-linux, ok for trunk?


An alternative fix would be to use a mode_attr to pick a different 
predicate for TImode.  But that's probably just a matter of taste; I'm 
not sure that one would be better than the other in reality.


OK (or with my suggestion if you prefer).

R.



2024-03-14  Jakub Jelinek  

PR target/114310
* config/aarch64/aarch64.cc (aarch64_expand_compare_and_swap): For
TImode force newval into a register.

* gcc.dg/pr114310.c: New test.

--- gcc/config/aarch64/aarch64.cc.jj2024-03-12 10:16:12.024101665 +0100
+++ gcc/config/aarch64/aarch64.cc   2024-03-13 18:55:39.147986554 +0100
@@ -24693,6 +24693,8 @@ aarch64_expand_compare_and_swap (rtx ope
  rval = copy_to_mode_reg (r_mode, oldval);
else
emit_move_insn (rval, gen_lowpart (r_mode, oldval));
+  if (mode == TImode)
+   newval = force_reg (mode, newval);
  
emit_insn (gen_aarch64_compare_and_swap_lse (mode, rval, mem,

   newval, mod_s));
--- gcc/testsuite/gcc.dg/pr114310.c.jj  2024-03-13 19:09:25.322597418 +0100
+++ gcc/testsuite/gcc.dg/pr114310.c 2024-03-13 19:08:50.802073314 +0100
@@ -0,0 +1,20 @@
+/* PR target/114310 */
+/* { dg-do run { target int128 } } */
+
+volatile __attribute__((aligned (sizeof (__int128_t __int128_t v = 10;
+
+int
+main ()
+{
+#if __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
+  if (__sync_val_compare_and_swap (, (__int128_t) 10, (__int128_t) 0) != 10)
+__builtin_abort ();
+  if (__sync_val_compare_and_swap (, (__int128_t) 10, (__int128_t) 15) != 0)
+__builtin_abort ();
+  if (__sync_val_compare_and_swap (, (__int128_t) 0, (__int128_t) 42) != 0)
+__builtin_abort ();
+  if (__sync_val_compare_and_swap (, (__int128_t) 31, (__int128_t) 35) != 42)
+__builtin_abort ();
+#endif
+  return 0;
+}

Jakub

Re: [PATCH] arm: fix c23 0-named-args caller-side stdarg

2024-03-07 Thread Richard Earnshaw (lists)

On 06/03/2024 20:28, Alexandre Oliva wrote:
> On Mar  1, 2024, "Richard Earnshaw (lists)"  wrote:
> 
>> On 01/03/2024 04:38, Alexandre Oliva wrote:
>>> Thanks for the review.
> 
>> For closure, Jakub has just pushed a patch to the generic code, so I
>> don't think we need this now.
> 
> ACK.  I see the c2x-stdarg-4.c test is now passing in our arm-eabi
> gcc-13 tree.  Thank you all.
> 
> Alas, the same nightly build showed a new riscv fail in c23-stdarg-6.c,
> that also got backported to gcc-13.  Presumably it's failing in the
> trunk as well, both riscv32-elf and riscv64-elf.
> 
> I haven't looked into whether it's a regression brought about by the
> patch or just a new failure mode that the new test exposed.  Either way,
> I'm not sure whether to link this new failure to any of the associated
> PRs or to file a new one, but, FTR, I'm going to look into it.
> 

I'd suggest a new pr.  It's easier to track than re-opening an existing on.

R.

> -- 
> Alexandre Oliva, happy hacker    https://FSFLA.org/blogs/lxo/ 
> <https://FSFLA.org/blogs/lxo/>
>    Free Software Activist   GNU Toolchain Engineer
> More tolerance and less prejudice are key for inclusion and diversity
> Excluding neuro-others for not behaving ""normal"" is *not* inclusive

Re: Help needed with maintainer-mode

2024-03-06 Thread Richard Earnshaw (lists) via Gcc

On 06/03/2024 15:04, Andrew Carlotti via Gcc wrote:
> On Thu, Feb 29, 2024 at 06:39:54PM +0100, Christophe Lyon via Gcc wrote:
>> On Thu, 29 Feb 2024 at 12:00, Mark Wielaard  wrote:
>>>
>>> Hi Christophe,
>>>
>>> On Thu, Feb 29, 2024 at 11:22:33AM +0100, Christophe Lyon via Gcc wrote:
 I've noticed that sourceware's buildbot has a small script
 "autoregen.py" which does not use the project's build system, but
 rather calls aclocal/autoheader/automake/autoconf in an ad-hoc way.
 Should we replicate that?
>>>
>>> That python script works across gcc/binutils/gdb:
>>> https://sourceware.org/cgit/builder/tree/builder/containers/autoregen.py
>>>
>>> It is installed into a container file that has the exact autoconf and
>>> automake version needed to regenerate the autotool files:
>>> https://sourceware.org/cgit/builder/tree/builder/containers/Containerfile-autotools
>>>
>>> And it was indeed done this way because that way the files are
>>> regenerated in a reproducible way. Which wasn't the case when using 
>>> --enable-maintainer-mode (and autoreconfig also doesn't work).
>>
>> I see. So it is possibly incomplete, in the sense that it may lack
>> some of the steps that maintainer-mode would perform?
>> For instance, gas for aarch64 has some *opcodes*.c files that need
>> regenerating before committing. The regeneration step is enabled in
>> maintainer-mode, so I guess the autoregen bots on Sourceware would
>> miss a problem with these files?
>>
>> Thanks,
>>
>> Christophe
> 
> Speaking of opcodes/aarch64-{asm|dis|opc}-2.c - why are these in the source
> directory in the first place?  For a similar situation in GCC (gimple-match,
> generic-match, insn-emit, etc.) we write the generated files to the build
> directory, and generation is always enabled.  I see no reason not to do the
> same thing for aarch64-{asm|dis|opc}-2.c.
> 

GCC supports building a canadian cross, but binutils doesn't.  To put them in 
the build area would require setting up host compiler as well and I don't think 
there's currently enough appetite for doing that extra work.

But every do has its day; maybe the time has come...

R.

> Andrew
> 
>>>
>>> It is run on all commits and warns if it detects a change in the
>>> (checked in) generated files.
>>> https://builder.sourceware.org/buildbot/#/builders/gcc-autoregen
>>> https://builder.sourceware.org/buildbot/#/builders/binutils-gdb-autoregen
>>>
>>> Cheers,
>>>
>>> Mark

Re: [PATCH] arm: Support -mfdpic for more targets

2024-03-06 Thread Richard Earnshaw (lists)

On 06/03/2024 05:07, Fangrui Song wrote:
> On Fri, Feb 23, 2024 at 7:33 PM Fangrui Song  wrote:
>>
>> From: Fangrui Song 
>>
>> Targets that are not arm*-*-uclinuxfdpiceabi can use -S -mfdpic, but -c
>> -mfdpic does not pass --fdpic to gas.  This is an unnecessary
>> restriction.  Just define the ASM_SPEC in bpabi.h.
>>
>> Additionally, use armelf[b]_linux_fdpiceabi emulations for -mfdpic in
>> linux-eabi.h.  This will allow a future musl fdpic port to use the
>> desired BFD emulation.
>>
>> gcc/ChangeLog:
>>
>> * config/arm/bpabi.h (TARGET_FDPIC_ASM_SPEC): Transform -mfdpic.
>> * config/arm/linux-eabi.h (TARGET_FDPIC_LINKER_EMULATION): Define.
>> (SUBTARGET_EXTRA_LINK_SPEC): Use TARGET_FDPIC_LINKER_EMULATION
>> if -mfdpic.
>> ---
>>  gcc/config/arm/bpabi.h  | 2 +-
>>  gcc/config/arm/linux-eabi.h | 5 -
>>  2 files changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/config/arm/bpabi.h b/gcc/config/arm/bpabi.h
>> index 7a279f3ed3c..6778be1a8bf 100644
>> --- a/gcc/config/arm/bpabi.h
>> +++ b/gcc/config/arm/bpabi.h
>> @@ -55,7 +55,7 @@
>>  #define TARGET_FIX_V4BX_SPEC " %{mcpu=arm8|mcpu=arm810|mcpu=strongarm*"\
>>    "|march=armv4|mcpu=fa526|mcpu=fa626:--fix-v4bx}"
>>
>> -#define TARGET_FDPIC_ASM_SPEC ""
>> +#define TARGET_FDPIC_ASM_SPEC "%{mfdpic: --fdpic}"
>>
>>  #define BE8_LINK_SPEC  \
>>    "%{!r:%{!mbe32:%:be8_linkopt(%{mlittle-endian:little}"   \
>> diff --git a/gcc/config/arm/linux-eabi.h b/gcc/config/arm/linux-eabi.h
>> index eef791f6a02..0c5c58e4928 100644
>> --- a/gcc/config/arm/linux-eabi.h
>> +++ b/gcc/config/arm/linux-eabi.h
>> @@ -46,12 +46,15 @@
>>  #undef  TARGET_LINKER_EMULATION
>>  #if TARGET_BIG_ENDIAN_DEFAULT
>>  #define TARGET_LINKER_EMULATION "armelfb_linux_eabi"
>> +#define TARGET_FDPIC_LINKER_EMULATION "armelfb_linux_fdpiceabi"
>>  #else
>>  #define TARGET_LINKER_EMULATION "armelf_linux_eabi"
>> +#define TARGET_FDPIC_LINKER_EMULATION "armelf_linux_fdpiceabi"
>>  #endif
>>
>>  #undef  SUBTARGET_EXTRA_LINK_SPEC
>> -#define SUBTARGET_EXTRA_LINK_SPEC " -m " TARGET_LINKER_EMULATION
>> +#define SUBTARGET_EXTRA_LINK_SPEC " -m %{mfdpic: " \
>> +  TARGET_FDPIC_LINKER_EMULATION ";:" TARGET_LINKER_EMULATION "}"
>>
>>  /* GNU/Linux on ARM currently supports three dynamic linkers:
>> - ld-linux.so.2 - for the legacy ABI
>> --
>> 2.44.0.rc1.240.g4c46232300-goog
>>
> 
> Ping:)
> 

We're in stage4 at present and this is new material.  I'll look at it after the 
branch has been cut.

R.

> 
> -- 
> 宋方睿

Re: [PATCH v2] testsuite, arm: Fix testcase arm/pr112337.c to check for the options first

2024-03-05 Thread Richard Earnshaw (lists)

On 19/02/2024 10:11, Saurabh Jha wrote:
> 
> On 2/9/2024 2:57 PM, Richard Earnshaw (lists) wrote:
>> On 30/01/2024 17:07, Saurabh Jha wrote:
>>> Hey,
>>>
>>> Previously, this test was added to fix this bug: 
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112337. However, it did not 
>>> check the compilation options before using them, leading to errors.
>>>
>>> This patch fixes the test by first checking whether it can use the options 
>>> before using them.
>>>
>>> Tested for arm-none-eabi and found no regressions. The output of check-gcc 
>>> with RUNTESTFLAGS="arm.exp=*" changed like this:
>>>
>>> Before:
>>> # of expected passes  5963
>>> # of unexpected failures  64
>>>
>>> After:
>>> # of expected passes  5964
>>> # of unexpected failures  63
>>>
>>> Ok for master?
>>>
>>> Regards,
>>> Saurabh
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>>  * gcc.target/arm/pr112337.c: Check whether we can use the 
>>> compilation options before using them.
>> My apologies for missing this earlier.  It didn't show up in patchwork. 
>> That's most likely because the attachment is a binary blob instead of 
>> text/plain.  That also means that the Linaro CI system hasn't seen this 
>> patch either.  Please can you fix your mailer to add plain text patch files.
>>
>> -/* { dg-options "-O2 -march=armv8.1-m.main+fp.dp+mve.fp -mfloat-abi=hard" } 
>> */
>> +/* { dg-require-effective-target arm_hard_ok } */
>> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
>> +/* { dg-options "-O2 -mfloat-abi=hard" } */
>> +/* { dg-add-options arm_v8_1m_mve } */
>>
>> This is moving in the right direction, but it adds more than necessary now: 
>> checking for, and adding -mfloat-abi=hard is not necessary any more as 
>> arm_v8_1m_mve_ok will work out what float-abi flags are needed to make the 
>> options work. (What's more, it will prevent the test from running if the 
>> base configuration of the compiler is incompatible with the hard float ABI, 
>> which is more than we need.).
>>
>> So please can you re-spin removing the hard-float check and removing that 
>> from dg-options.
>>
>> Thanks,
>> R.
> 
> Hi Richard,
> 
> Agreed with your comments. Please find the patch with the suggested changes 
> attached.
> 
> Regards,
> 
> Saurabh
> 


Thanks, I've pushed this.  Next time, please can you put the commit message 
inside the patch, so that I can apply things automatically.  Eg: 

>From 1c92c94074449929f40cea99a6450bcde3aec12f Mon Sep 17 00:00:00 2001
From: Saurabh Jha 
Date: Tue, 30 Jan 2024 15:03:36 +
Subject: [PATCH] Fix testcase pr112337.c to check the options [PR112337]

gcc.target/arm/pr112337.c was failing to validate that adding MVE options
was compatible with the test environment, so add the missing checks.

gcc/testsuite/ChangeLog:

PR target/112337
* gcc.target/arm/pr112337.c: Check for, then use the right MVE
options.

---
 gcc/testsuite/gcc.target/arm/pr112337.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/pr112337.c 
b/gcc/testsuite/gcc.target/arm/pr112337.c

...

Re: Help needed with maintainer-mode

2024-03-05 Thread Richard Earnshaw (lists) via Gcc

On 04/03/2024 20:04, Jonathan Wakely wrote:
> On Mon, 4 Mar 2024 at 19:27, Vladimir Mezentsev
>  wrote:
>>
>>
>>
>> On 3/4/24 09:38, Richard Earnshaw (lists) wrote:
>>> Tools like git (and svn before it) don't try to maintain time-stamps on 
>>> patches, the tool just modifies the file and the timestamp comes from the 
>>> time of the modification.  That's fine if there is nothing regenerated 
>>> within the repository (it's pure original source), but will cause problems 
>>> if there are generated files as their time stamps aren't necessarily 
>>> correct.  `gcc_update --touch` addresses that by ensuring all the generated 
>>> files are retouched when needed.
>>
>> Why do we save generated files in the source tree?
>> What will be the problem if we remove Makefile.in and configure from
>> source tree and will run `autoreconf -i -f` before building ?
> 
> Having the exact correct versions of autoconf and automake increases
> the barrier for new contributors to start work. And to regenerate
> everything, they also need autogen, mkinfo, etc.

It's worse than that.  They might need multiple versions of those tools because 
different subtrees are built with different, subtly incompatible, versions of 
those tools.

R.

Re: Help needed with maintainer-mode

2024-03-04 Thread Richard Earnshaw (lists) via Gcc

On 04/03/2024 16:42, Christophe Lyon wrote:
> On Mon, 4 Mar 2024 at 16:41, Richard Earnshaw  
> wrote:
>>
>>
>>
>> On 04/03/2024 15:36, Richard Earnshaw (lists) wrote:
>> > On 04/03/2024 14:46, Christophe Lyon via Gcc wrote:
>> >> On Mon, 4 Mar 2024 at 12:25, Jonathan Wakely  
>> >> wrote:
>> >>>
>> >>> On Mon, 4 Mar 2024 at 10:44, Christophe Lyon via Gcc  
>> >>> wrote:
>> >>>>
>> >>>> Hi!
>> >>>>
>> >>>> On Mon, 4 Mar 2024 at 10:36, Thomas Schwinge  
>> >>>> wrote:
>> >>>>>
>> >>>>> Hi!
>> >>>>>
>> >>>>> On 2024-03-04T00:30:05+, Sam James  wrote:
>> >>>>>> Mark Wielaard  writes:
>> >>>>>>> On Fri, Mar 01, 2024 at 05:32:15PM +0100, Christophe Lyon wrote:
>> >>>>>>>> [...], I read
>> >>>>>>>> https://gcc.gnu.org/wiki/Regenerating_GCC_Configuration 
>> >>>>>>>> <https://gcc.gnu.org/wiki/Regenerating_GCC_Configuration> 
>> >>>>>>>> <https://gcc.gnu.org/wiki/Regenerating_GCC_Configuration 
>> >>>>>>>> <https://gcc.gnu.org/wiki/Regenerating_GCC_Configuration>>
>> >>>>>>>> which basically says "run autoreconf in every dir where there is a
>> >>>>>>>> configure script"
>> >>>>>>>> but this is not exactly what autoregen.py is doing. IIRC it is based
>> >>>>>>>> on a script from Martin Liska, do you know/remember why it follows a
>> >>>>>>>> different process?
>> >>>>>>>
>> >>>>>>> CCing Sam and Arsen who helped refine the autoregen.py script, who
>> >>>>>>> might remember more details. We wanted a script that worked for both
>> >>>>>>> gcc and binutils-gdb. And as far as I know autoreconf simply didn't
>> >>>>>>> work in all directories. We also needed to skip some directories that
>> >>>>>>> did contain a configure script, but that were imported (gotools,
>> >>>>>>> readline, minizip).
>> >>>>>>
>> >>>>>> What we really need to do is, for a start, land tschwinge/aoliva's 
>> >>>>>> patches [0]
>> >>>>>> for AC_CONFIG_SUBDIRS.
>> >>>>>
>> >>>>> Let me allocate some time this week to get that one completed.
>> >>>>>
>> >>>>>> Right now, the current situation is janky and it's nowhere near
>> >>>>>> idiomatic autotools usage. It is not a comfortable experience
>> >>>>>> interacting with it even as someone who is familiar and happy with 
>> >>>>>> using
>> >>>>>> autotools otherwise.
>> >>>>>>
>> >>>>>> I didn't yet play with maintainer-mode myself but I also didn't see 
>> >>>>>> much
>> >>>>>> point given I knew of more fundamental problems like this.
>> >>>>>>
>> >>>>>> [0] 
>> >>>>>> https://inbox.sourceware.org/gcc-patches/oril72c4yh@lxoliva.fsfla.org/
>> >>>>>>  
>> >>>>>> <https://inbox.sourceware.org/gcc-patches/oril72c4yh@lxoliva.fsfla.org/>
>> >>>>>>  
>> >>>>>> <https://inbox.sourceware.org/gcc-patches/oril72c4yh@lxoliva.fsfla.org/
>> >>>>>>  
>> >>>>>> <https://inbox.sourceware.org/gcc-patches/oril72c4yh@lxoliva.fsfla.org/>>
>> >>>>>
>> >>>>
>> >>>> Thanks for the background. I didn't follow that discussion at that time 
>> >>>> :-)
>> >>>>
>> >>>> So... I was confused because I noticed many warnings when doing a simple
>> >>>> find . -name configure |while read f; do echo $f;d=$(dirname $f) &&
>> >>>> autoreconf -f $d && echo $d; done
>> >>>> as suggested by https://gcc.gnu.org/wiki/Regenerating_GCC_Configuration 
>> >>>> <https://gcc.gnu.org/wiki/Regenerating_GCC_Configuration> 
>> >>>> <https://gcc.gnu.org/wiki/Regenerating_GCC

Re: Help needed with maintainer-mode

2024-03-04 Thread Richard Earnshaw (lists) via Gcc

On 04/03/2024 14:46, Christophe Lyon via Gcc wrote:
> On Mon, 4 Mar 2024 at 12:25, Jonathan Wakely  wrote:
>>
>> On Mon, 4 Mar 2024 at 10:44, Christophe Lyon via Gcc  wrote:
>>>
>>> Hi!
>>>
>>> On Mon, 4 Mar 2024 at 10:36, Thomas Schwinge  wrote:

 Hi!

 On 2024-03-04T00:30:05+, Sam James  wrote:
> Mark Wielaard  writes:
>> On Fri, Mar 01, 2024 at 05:32:15PM +0100, Christophe Lyon wrote:
>>> [...], I read
>>> https://gcc.gnu.org/wiki/Regenerating_GCC_Configuration
>>> which basically says "run autoreconf in every dir where there is a
>>> configure script"
>>> but this is not exactly what autoregen.py is doing. IIRC it is based
>>> on a script from Martin Liska, do you know/remember why it follows a
>>> different process?
>>
>> CCing Sam and Arsen who helped refine the autoregen.py script, who
>> might remember more details. We wanted a script that worked for both
>> gcc and binutils-gdb. And as far as I know autoreconf simply didn't
>> work in all directories. We also needed to skip some directories that
>> did contain a configure script, but that were imported (gotools,
>> readline, minizip).
>
> What we really need to do is, for a start, land tschwinge/aoliva's 
> patches [0]
> for AC_CONFIG_SUBDIRS.

 Let me allocate some time this week to get that one completed.

> Right now, the current situation is janky and it's nowhere near
> idiomatic autotools usage. It is not a comfortable experience
> interacting with it even as someone who is familiar and happy with using
> autotools otherwise.
>
> I didn't yet play with maintainer-mode myself but I also didn't see much
> point given I knew of more fundamental problems like this.
>
> [0] 
> https://inbox.sourceware.org/gcc-patches/oril72c4yh@lxoliva.fsfla.org/

>>>
>>> Thanks for the background. I didn't follow that discussion at that time :-)
>>>
>>> So... I was confused because I noticed many warnings when doing a simple
>>> find . -name configure |while read f; do echo $f;d=$(dirname $f) &&
>>> autoreconf -f $d && echo $d; done
>>> as suggested by https://gcc.gnu.org/wiki/Regenerating_GCC_Configuration
>>>
>>> Then I tried with autoregen.py, and saw the same and now just
>>> checked Sourceware's bot logs and saw the same numerous warnings at
>>> least in GCC (didn't check binutils yet). Looks like this is
>>> "expected" 
>>>
>>> I started looking at auto-regenerating these files in our CI a couple
>>> of weeks ago, after we received several "complaints" from contributors
>>> saying that our precommit CI was useless / bothering since it didn't
>>> regenerate files, leading to false alarms.
>>> But now I'm wondering how such contributors regenerate the files
>>> impacted by their patches before committing, they probably just
>>> regenerate things in their subdir of interest, not noticing the whole
>>> picture :-(
>>>
>>> As a first step, we can probably use autoregen.py too, and declare
>>> maintainer-mode broken. However, I do notice that besides the rules
>>> about regenerating configure/Makefile.in/..., maintainer-mode is also
>>> used to update some files.
>>> In gcc:
>>> fixincludes: fixincl.x
>>> libffi: doc/version.texi
>>> libgfortran: some stuff :-)
>>> libiberty: functions.texi
>>
>> My recently proposed patch adds the first of those to gcc_update, the
>> other should be done too.
>> https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647027.html
>>
> 
> This script touches files such that they appear more recent than their
> dependencies,
> so IIUC even if one uses --enable-maintainer-mode, it will have no effect.
> For auto* files, this is "fine" as we can run autoreconf or
> autoregen.py before starting configure+build, but what about other
> files?
> For instance, if we have to test a patch which implies changes to
> fixincludes/fixincl.x, how should we proceed?
> 1- git checkout (with possibly "wrong" timestamps)
> 2- apply patch-to-test
> 3- contrib/gcc_update -t
> 4- configure --enable-maintainer-mode

If you ran

git reset --hard master // restore state to 'master'
contrib/gcc_update // pull latest code

then anything coming from upstream will be touched automatically.  You really 
don't want to re-touch the files after patching unless you're sure they've all 
been patched correctly, it will break if there's anything regenerated that's 
missing.

R.

> 
> I guess --enable-maintainer-mode would be largely (if not completely)
> disabled by step 3 above?
> And we should probably swap steps 2 and 3, so that the effects of
> patch-to-test are handled by make?
> 
> Thanks,
> 
> Christophe
> 
>>
>>
>>>
>>> in binutils/bfd:
>>> gdb/sim
>>> bfd/po/SRC-POTFILES.in
>>> bfd/po/BLD-POTFILES.in
>>> bfd/bfd-in2.h
>>> bfd/libbfd.h
>>> bfd/libcoff.h
>>> binutils/po/POTFILES.in
>>> gas/po/POTFILES.in
>>> opcodes/i386*.h
>>> gdb/copying.c
>>> gdb/data-directory/*.xml
>>> gold/po/POTFILES.in

Re: [PATCH v4] aarch64,arm: Move branch-protection data to targets

2024-03-01 Thread Richard Earnshaw (lists)

On 11/01/2024 14:35, Szabolcs Nagy wrote:
> The branch-protection types are target specific, not the same on arm
> and aarch64.  This currently affects pac-ret+b-key, but there will be
> a new type on aarch64 that is not relevant for arm.
> 
> After the move, change aarch_ identifiers to aarch64_ or arm_ as
> appropriate.
> 
> Refactor aarch_validate_mbranch_protection to take the target specific
> branch-protection types as an argument.
> 
> In case of invalid input currently no hints are provided: the way
> branch-protection types and subtypes can be mixed makes it difficult
> without causing confusion.
> 
> gcc/ChangeLog:
> 
>   * config/aarch64/aarch64.md: Rename aarch_ to aarch64_.
>   * config/aarch64/aarch64.opt: Likewise.
>   * config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Likewise.
>   * config/aarch64/aarch64.cc (aarch64_expand_prologue): Likewise.
>   (aarch64_expand_epilogue): Likewise.
>   (aarch64_post_cfi_startproc): Likewise.
>   (aarch64_handle_no_branch_protection): Copy and rename.
>   (aarch64_handle_standard_branch_protection): Likewise.
>   (aarch64_handle_pac_ret_protection): Likewise.
>   (aarch64_handle_pac_ret_leaf): Likewise.
>   (aarch64_handle_pac_ret_b_key): Likewise.
>   (aarch64_handle_bti_protection): Likewise.
>   (aarch64_override_options): Update branch protection validation.
>   (aarch64_handle_attr_branch_protection): Likewise.
>   * config/arm/aarch-common-protos.h (aarch_validate_mbranch_protection):
>   Pass branch protection type description as argument.
>   (struct aarch_branch_protect_type): Move from aarch-common.h.
>   * config/arm/aarch-common.cc (aarch_handle_no_branch_protection):
>   Remove.
>   (aarch_handle_standard_branch_protection): Remove.
>   (aarch_handle_pac_ret_protection): Remove.
>   (aarch_handle_pac_ret_leaf): Remove.
>   (aarch_handle_pac_ret_b_key): Remove.
>   (aarch_handle_bti_protection): Remove.
>   (aarch_validate_mbranch_protection): Pass branch protection type
>   description as argument.
>   * config/arm/aarch-common.h (enum aarch_key_type): Remove.
>   (struct aarch_branch_protect_type): Remove.
>   * config/arm/arm-c.cc (arm_cpu_builtins): Remove aarch_ra_sign_key.
>   * config/arm/arm.cc (arm_handle_no_branch_protection): Copy and rename.
>   (arm_handle_standard_branch_protection): Likewise.
>   (arm_handle_pac_ret_protection): Likewise.
>   (arm_handle_pac_ret_leaf): Likewise.
>   (arm_handle_bti_protection): Likewise.
>   (arm_configure_build_target): Update branch protection validation.
>   * config/arm/arm.opt: Remove aarch_ra_sign_key.
> ---
> v4:
> - pass types as argument to validation.
> - make target specific types data static.
> 
>  gcc/config/aarch64/aarch64-c.cc  |  4 +-
>  gcc/config/aarch64/aarch64.cc| 75 
>  gcc/config/aarch64/aarch64.md|  2 +-
>  gcc/config/aarch64/aarch64.opt   |  2 +-
>  gcc/config/arm/aarch-common-protos.h | 19 ++-
>  gcc/config/arm/aarch-common.cc   | 71 --
>  gcc/config/arm/aarch-common.h| 20 
>  gcc/config/arm/arm-c.cc  |  2 -
>  gcc/config/arm/arm.cc| 55 +---
>  gcc/config/arm/arm.opt   |  3 --
>  10 files changed, 145 insertions(+), 108 deletions(-)
> 


OK

R.

Re: [PATCH v6 5/5] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-03-01 Thread Richard Earnshaw (lists)

On 27/02/2024 13:56, Andre Vieira wrote:
> 
> This patch adds support for MVE Tail-Predicated Low Overhead Loops by using 
> the
> doloop funcitonality added to support predicated vectorized hardware loops.
> 
> gcc/ChangeLog:
> 
>   * config/arm/arm-protos.h (arm_target_bb_ok_for_lob): Change
>   declaration to pass basic_block.
>   (arm_attempt_dlstp_transform): New declaration.
>   * config/arm/arm.cc (TARGET_LOOP_UNROLL_ADJUST): Define targethook.
>   (TARGET_PREDICT_DOLOOP_P): Likewise.
>   (arm_target_bb_ok_for_lob): Adapt condition.
>   (arm_mve_get_vctp_lanes): New function.
>   (arm_dl_usage_type): New internal enum.
>   (arm_get_required_vpr_reg): New function.
>   (arm_get_required_vpr_reg_param): New function.
>   (arm_get_required_vpr_reg_ret_val): New function.
>   (arm_mve_get_loop_vctp): New function.
>   (arm_mve_insn_predicated_by): New function.
>   (arm_mve_across_lane_insn_p): New function.
>   (arm_mve_load_store_insn_p): New function.
>   (arm_mve_impl_pred_on_outputs_p): New function.
>   (arm_mve_impl_pred_on_inputs_p): New function.
>   (arm_last_vect_def_insn): New function.
>   (arm_mve_impl_predicated_p): New function.
>   (arm_mve_check_reg_origin_is_num_elems): New function.
>   (arm_mve_dlstp_check_inc_counter): New function.
>   (arm_mve_dlstp_check_dec_counter): New function.
>   (arm_mve_loop_valid_for_dlstp): New function.
>   (arm_predict_doloop_p): New function.
>   (arm_loop_unroll_adjust): New function.
>   (arm_emit_mve_unpredicated_insn_to_seq): New function.
>   (arm_attempt_dlstp_transform): New function.
>   * config/arm/arm.opt (mdlstp): New option.
>   * config/arm/iteratords.md (dlstp_elemsize, letp_num_lanes,
>   letp_num_lanes_neg, letp_num_lanes_minus_1): New attributes.
>   (DLSTP, LETP): New iterators.
>   (predicated_doloop_end_internal): New pattern.
>   (dlstp_insn): New pattern.
>   * config/arm/thumb2.md (doloop_end): Adapt to support tail-predicated
>   loops.
>   (doloop_begin): Likewise.
>   * config/arm/types.md (mve_misc): New mve type to represent
>   predicated_loop_end insn sequences.
>   * config/arm/unspecs.md:
>   (DLSTP8, DLSTP16, DLSTP32, DSLTP64,
>   LETP8, LETP16, LETP32, LETP64): New unspecs for DLSTP and LETP.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/lob.h: Add new helpers.
>   * gcc.target/arm/lob1.c: Use new helpers.
>   * gcc.target/arm/lob6.c: Likewise.
>   * gcc.target/arm/dlstp-compile-asm-1.c: New test.
>   * gcc.target/arm/dlstp-compile-asm-2.c: New test.
>   * gcc.target/arm/dlstp-compile-asm-3.c: New test.
>   * gcc.target/arm/dlstp-int8x16.c: New test.
>   * gcc.target/arm/dlstp-int8x16-run.c: New test.
>   * gcc.target/arm/dlstp-int16x8.c: New test.
>   * gcc.target/arm/dlstp-int16x8-run.c: New test.
>   * gcc.target/arm/dlstp-int32x4.c: New test.
>   * gcc.target/arm/dlstp-int32x4-run.c: New test.
>   * gcc.target/arm/dlstp-int64x2.c: New test.
>   * gcc.target/arm/dlstp-int64x2-run.c: New test.
>   * gcc.target/arm/dlstp-invalid-asm.c: New test.
> 
> Co-authored-by: Stam Markianos-Wright 
> ---
>  gcc/config/arm/arm-protos.h   |4 +-
>  gcc/config/arm/arm.cc | 1249 -
>  gcc/config/arm/arm.opt|3 +
>  gcc/config/arm/iterators.md   |   15 +
>  gcc/config/arm/mve.md |   50 +
>  gcc/config/arm/thumb2.md  |  138 +-
>  gcc/config/arm/types.md   |6 +-
>  gcc/config/arm/unspecs.md |   14 +-
>  gcc/testsuite/gcc.target/arm/lob.h|  128 +-
>  gcc/testsuite/gcc.target/arm/lob1.c   |   23 +-
>  gcc/testsuite/gcc.target/arm/lob6.c   |8 +-
>  .../gcc.target/arm/mve/dlstp-compile-asm-1.c  |  146 ++
>  .../gcc.target/arm/mve/dlstp-compile-asm-2.c  |  749 ++
>  .../gcc.target/arm/mve/dlstp-compile-asm-3.c  |   46 +
>  .../gcc.target/arm/mve/dlstp-int16x8-run.c|   44 +
>  .../gcc.target/arm/mve/dlstp-int16x8.c|   31 +
>  .../gcc.target/arm/mve/dlstp-int32x4-run.c|   45 +
>  .../gcc.target/arm/mve/dlstp-int32x4.c|   31 +
>  .../gcc.target/arm/mve/dlstp-int64x2-run.c|   48 +
>  .../gcc.target/arm/mve/dlstp-int64x2.c|   28 +
>  .../gcc.target/arm/mve/dlstp-int8x16-run.c|   44 +
>  .../gcc.target/arm/mve/dlstp-int8x16.c|   32 +
>  .../gcc.target/arm/mve/dlstp-invalid-asm.c|  521 +++
>  23 files changed, 3321 insertions(+), 82 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-2.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-3.c
>  create mode 100644

Re: [PATCH v6 4/5] doloop: Add support for predicated vectorized loops

2024-03-01 Thread Richard Earnshaw (lists)

On 27/02/2024 13:56, Andre Vieira wrote:
> 
> This patch adds support in the target agnostic doloop pass for the detection 
> of
> predicated vectorized hardware loops.  Arm is currently the only target that
> will make use of this feature.
> 
> gcc/ChangeLog:
> 
>   * df-core.cc (df_bb_regno_only_def_find): New helper function.
>   * df.h (df_bb_regno_only_def_find): Declare new function.
>   * loop-doloop.cc (doloop_condition_get): Add support for detecting
>   predicated vectorized hardware loops.
>   (doloop_modify): Add support for GTU condition checks.
>   (doloop_optimize): Update costing computation to support alterations to
>   desc->niter_expr by the backend.
> 
> Co-authored-by: Stam Markianos-Wright 
> ---
>  gcc/df-core.cc |  15 +
>  gcc/df.h   |   1 +
>  gcc/loop-doloop.cc | 164 +++--
>  3 files changed, 113 insertions(+), 67 deletions(-)
> 

As discussed, I think we should wait for gcc-15 for this[*]; I know it was 
initially submitted during stage1 but it's had to go through a lot of revision 
since then and we're very close to wanting to cut the release branch.

R.

[*] Unless an independent reviewer wants to sign this off anyway.

Re: [PATCH v6 3/5] arm: Fix a wrong attribute use and remove unused unspecs and iterators

2024-03-01 Thread Richard Earnshaw (lists)

On 27/02/2024 13:56, Andre Vieira wrote:
> 
> This patch fixes the erroneous use of a mode attribute without a mode iterator
> in the pattern and removes unused unspecs and iterators.
> 
> gcc/ChangeLog:
> 
>   * config/arm/iterators.md (supf): Remove VMLALDAVXQ_U, VMLALDAVXQ_P_U,
>   VMLALDAVAXQ_U cases.
>   (VMLALDAVXQ): Remove iterator.
>   (VMLALDAVXQ_P): Likewise.
>   (VMLALDAVAXQ): Likewise.
>   * config/arm/mve.md (mve_vstrwq_p_fv4sf): Replace use of 
>   mode iterator attribute with V4BI mode.
>   * config/arm/unspecs.md (VMLALDAVXQ_U, VMLALDAVXQ_P_U,
>   VMLALDAVAXQ_U): Remove unused unspecs.
> ---
>  gcc/config/arm/iterators.md | 9 +++--
>  gcc/config/arm/mve.md   | 2 +-
>  gcc/config/arm/unspecs.md   | 3 ---
>  3 files changed, 4 insertions(+), 10 deletions(-)
> 

OK

R.

Re: [PATCH v6 2/5] arm: Annotate instructions with mve_safe_imp_xlane_pred

2024-03-01 Thread Richard Earnshaw (lists)

On 27/02/2024 13:56, Andre Vieira wrote:
> 
> This patch annotates some MVE across lane instructions with a new attribute.
> We use this attribute to let the compiler know that these instructions can be
> safely implicitly predicated when tail predicating if their operands are
> guaranteed to have zeroed tail predicated lanes.  These instructions were
> selected because having the value 0 in those lanes or 'tail-predicating' those
> lanes have the same effect.
> 
> gcc/ChangeLog:
> 
>   * config/arm/arm.md (mve_safe_imp_xlane_pred): New attribute.
>   * config/arm/iterators.md (mve_vmaxmin_safe_imp): New iterator
>   attribute.
>   * config/arm/mve.md (vaddvq_s, vaddvq_u, vaddlvq_s, vaddlvq_u,
>   vaddvaq_s, vaddvaq_u, vmaxavq_s, vmaxvq_u, vmladavq_s, vmladavq_u,
>   vmladavxq_s, vmlsdavq_s, vmlsdavxq_s, vaddlvaq_s, vaddlvaq_u,
>   vmlaldavq_u, vmlaldavq_s, vmlaldavq_u, vmlaldavxq_s, vmlsldavq_s,
>   vmlsldavxq_s, vrmlaldavhq_u, vrmlaldavhq_s, vrmlaldavhxq_s,
>   vrmlsldavhq_s, vrmlsldavhxq_s, vrmlaldavhaq_s, vrmlaldavhaq_u,
>   vrmlaldavhaxq_s, vrmlsldavhaq_s, vrmlsldavhaxq_s, vabavq_s, vabavq_u,
>   vmladavaq_u, vmladavaq_s, vmladavaxq_s, vmlsdavaq_s, vmlsdavaxq_s,
>   vmlaldavaq_s, vmlaldavaq_u, vmlaldavaxq_s, vmlsldavaq_s,
>   vmlsldavaxq_s): Added mve_safe_imp_xlane_pred.
> ---
>  gcc/config/arm/arm.md   |  6 ++
>  gcc/config/arm/iterators.md |  8 
>  gcc/config/arm/mve.md   | 12 
>  3 files changed, 26 insertions(+)
> 

OK

R.

Re: [PATCH v6 1/5] arm: Add define_attr to to create a mapping between MVE predicated and unpredicated insns

2024-03-01 Thread Richard Earnshaw (lists)

On 27/02/2024 13:56, Andre Vieira wrote:
> 
> This patch adds an attribute to the mve md patterns to be able to identify
> predicable MVE instructions and what their predicated and unpredicated 
> variants
> are.  This attribute is used to encode the icode of the unpredicated variant 
> of
> an instruction in its predicated variant.
> 
> This will make it possible for us to transform VPT-predicated insns in
> the insn chain into their unpredicated equivalents when transforming the loop
> into a MVE Tail-Predicated Low Overhead Loop. For example:
> `mve_vldrbq_z_ -> mve_vldrbq_`.
> 
> gcc/ChangeLog:
> 
>   * config/arm/arm.md (mve_unpredicated_insn): New attribute.
>   * config/arm/arm.h (MVE_VPT_PREDICATED_INSN_P): New define.
>   (MVE_VPT_UNPREDICATED_INSN_P): Likewise.
>   (MVE_VPT_PREDICABLE_INSN_P): Likewise.
>   * config/arm/vec-common.md (mve_vshlq_): Add attribute.
>   * config/arm/mve.md (arm_vcx1q_p_v16qi): Add attribute.
>   (arm_vcx1qv16qi): Likewise.
>   (arm_vcx1qav16qi): Likewise.
>   (arm_vcx1qv16qi): Likewise.
>   (arm_vcx2q_p_v16qi): Likewise.
>   (arm_vcx2qv16qi): Likewise.
>   (arm_vcx2qav16qi): Likewise.
>   (arm_vcx2qv16qi): Likewise.
>   (arm_vcx3q_p_v16qi): Likewise.
>   (arm_vcx3qv16qi): Likewise.
>   (arm_vcx3qav16qi): Likewise.
>   (arm_vcx3qv16qi): Likewise.
>   (@mve_q_): Likewise.
>   (@mve_q_int_): Likewise.
>   (@mve_q_v4si): Likewise.
>   (@mve_q_n_): Likewise.
>   (@mve_q_r_): Likewise.
>   (@mve_q_f): Likewise.
>   (@mve_q_m_): Likewise.
>   (@mve_q_m_n_): Likewise.
>   (@mve_q_m_r_): Likewise.
>   (@mve_q_m_f): Likewise.
>   (@mve_q_int_m_): Likewise.
>   (@mve_q_p_v4si): Likewise.
>   (@mve_q_p_): Likewise.
>   (@mve_q_): Likewise.
>   (@mve_q_f): Likewise.
>   (@mve_q_m_): Likewise.
>   (@mve_q_m_f): Likewise.
>   (mve_vq_f): Likewise.
>   (mve_q): Likewise.
>   (mve_q_f): Likewise.
>   (mve_vadciq_v4si): Likewise.
>   (mve_vadciq_m_v4si): Likewise.
>   (mve_vadcq_v4si): Likewise.
>   (mve_vadcq_m_v4si): Likewise.
>   (mve_vandq_): Likewise.
>   (mve_vandq_f): Likewise.
>   (mve_vandq_m_): Likewise.
>   (mve_vandq_m_f): Likewise.
>   (mve_vandq_s): Likewise.
>   (mve_vandq_u): Likewise.
>   (mve_vbicq_): Likewise.
>   (mve_vbicq_f): Likewise.
>   (mve_vbicq_m_): Likewise.
>   (mve_vbicq_m_f): Likewise.
>   (mve_vbicq_m_n_): Likewise.
>   (mve_vbicq_n_): Likewise.
>   (mve_vbicq_s): Likewise.
>   (mve_vbicq_u): Likewise.
>   (@mve_vclzq_s): Likewise.
>   (mve_vclzq_u): Likewise.
>   (@mve_vcmp_q_): Likewise.
>   (@mve_vcmp_q_n_): Likewise.
>   (@mve_vcmp_q_f): Likewise.
>   (@mve_vcmp_q_n_f): Likewise.
>   (@mve_vcmp_q_m_f): Likewise.
>   (@mve_vcmp_q_m_n_): Likewise.
>   (@mve_vcmp_q_m_): Likewise.
>   (@mve_vcmp_q_m_n_f): Likewise.
>   (mve_vctpq): Likewise.
>   (mve_vctpq_m): Likewise.
>   (mve_vcvtaq_): Likewise.
>   (mve_vcvtaq_m_): Likewise.
>   (mve_vcvtbq_f16_f32v8hf): Likewise.
>   (mve_vcvtbq_f32_f16v4sf): Likewise.
>   (mve_vcvtbq_m_f16_f32v8hf): Likewise.
>   (mve_vcvtbq_m_f32_f16v4sf): Likewise.
>   (mve_vcvtmq_): Likewise.
>   (mve_vcvtmq_m_): Likewise.
>   (mve_vcvtnq_): Likewise.
>   (mve_vcvtnq_m_): Likewise.
>   (mve_vcvtpq_): Likewise.
>   (mve_vcvtpq_m_): Likewise.
>   (mve_vcvtq_from_f_): Likewise.
>   (mve_vcvtq_m_from_f_): Likewise.
>   (mve_vcvtq_m_n_from_f_): Likewise.
>   (mve_vcvtq_m_n_to_f_): Likewise.
>   (mve_vcvtq_m_to_f_): Likewise.
>   (mve_vcvtq_n_from_f_): Likewise.
>   (mve_vcvtq_n_to_f_): Likewise.
>   (mve_vcvtq_to_f_): Likewise.
>   (mve_vcvttq_f16_f32v8hf): Likewise.
>   (mve_vcvttq_f32_f16v4sf): Likewise.
>   (mve_vcvttq_m_f16_f32v8hf): Likewise.
>   (mve_vcvttq_m_f32_f16v4sf): Likewise.
>   (mve_vdwdupq_m_wb_u_insn): Likewise.
>   (mve_vdwdupq_wb_u_insn): Likewise.
>   (mve_veorq_s>): Likewise.
>   (mve_veorq_u>): Likewise.
>   (mve_veorq_f): Likewise.
>   (mve_vidupq_m_wb_u_insn): Likewise.
>   (mve_vidupq_u_insn): Likewise.
>   (mve_viwdupq_m_wb_u_insn): Likewise.
>   (mve_viwdupq_wb_u_insn): Likewise.
>   (mve_vldrbq_): Likewise.
>   (mve_vldrbq_gather_offset_): Likewise.
>   (mve_vldrbq_gather_offset_z_): Likewise.
>   (mve_vldrbq_z_): Likewise.
>   (mve_vldrdq_gather_base_v2di): Likewise.
>   (mve_vldrdq_gather_base_wb_v2di_insn): Likewise.
>   (mve_vldrdq_gather_base_wb_z_v2di_insn): Likewise.
>   (mve_vldrdq_gather_base_z_v2di): Likewise.
>   (mve_vldrdq_gather_offset_v2di): Likewise.
>   (mve_vldrdq_gather_offset_z_v2di): Likewise.
>   (mve_vldrdq_gather_shifted_offset_v2di): Likewise.
>   (mve_vldrdq_gather_shifted_offset_z_v2di): Likewise.
>   (mve_vldrhq_): Likewise.
>

Re: [PATCH] arm: Fixed C23 call compatibility with arm-none-eabi

2024-03-01 Thread Richard Earnshaw (lists)

On 19/02/2024 09:13, Torbjörn SVENSSON wrote:
> Ok for trunk and releases/gcc-13?
> Regtested on top of 945cb8490cb for arm-none-eabi, without any regression.
> 
> Backporting to releases/gcc-13 will change -std=c23 to -std=c2x.

Jakub has just pushed a different fix for this, so I don't think we need this 
now.

R.


> 
> --
> 
> In commit 4fe34cdcc80ac225b80670eabc38ac5e31ce8a5a, -std=c23 support was
> introduced to support functions without any named arguments.  For
> arm-none-eabi, this is not as simple as placing all arguments on the
> stack.  Align the caller to use r0, r1, r2 and r3 for arguments even for
> functions without any named arguments, as specified in the AAPCS.
> 
> Verify that the generic test case have the arguments are in the right
> order and add ARM specific test cases.
> 
> gcc/ChangeLog:
> 
>   * calls.h: Added the type of the function to function_arg_info.
>   * calls.cc: Save the type of the function.
>   * config/arm/arm.cc: Check in the AAPCS layout function if
>   function has no named args.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/torture/c23-stdarg-split-1a.c: Detect out of order
>   arguments.
>   * gcc.dg/torture/c23-stdarg-split-1b.c: Likewise.
>   * gcc.target/arm/aapcs/align_vaarg3.c: New test.
>   * gcc.target/arm/aapcs/align_vaarg4.c: New test.
> 
> Signed-off-by: Torbjörn SVENSSON 
> Co-authored-by: Yvan ROUX 
> ---
>  gcc/calls.cc  |  2 +-
>  gcc/calls.h   | 20 --
>  gcc/config/arm/arm.cc | 13 ---
>  .../gcc.dg/torture/c23-stdarg-split-1a.c  |  4 +-
>  .../gcc.dg/torture/c23-stdarg-split-1b.c  | 15 +---
>  .../gcc.target/arm/aapcs/align_vaarg3.c   | 37 +++
>  .../gcc.target/arm/aapcs/align_vaarg4.c   | 31 
>  7 files changed, 102 insertions(+), 20 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/aapcs/align_vaarg3.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/aapcs/align_vaarg4.c
> 
> diff --git a/gcc/calls.cc b/gcc/calls.cc
> index 01f44734743..a1cc283b952 100644
> --- a/gcc/calls.cc
> +++ b/gcc/calls.cc
> @@ -1376,7 +1376,7 @@ initialize_argument_information (int num_actuals 
> ATTRIBUTE_UNUSED,
>with those made by function.cc.  */
>  
>/* See if this argument should be passed by invisible reference.  */
> -  function_arg_info arg (type, argpos < n_named_args);
> +  function_arg_info arg (type, fntype, argpos < n_named_args);
>if (pass_by_reference (args_so_far_pnt, arg))
>   {
> const bool callee_copies
> diff --git a/gcc/calls.h b/gcc/calls.h
> index 464a4e34e33..88836559ebe 100644
> --- a/gcc/calls.h
> +++ b/gcc/calls.h
> @@ -35,24 +35,33 @@ class function_arg_info
>  {
>  public:
>function_arg_info ()
> -: type (NULL_TREE), mode (VOIDmode), named (false),
> +: type (NULL_TREE), fntype (NULL_TREE), mode (VOIDmode), named (false),
>pass_by_reference (false)
>{}
>  
>/* Initialize an argument of mode MODE, either before or after promotion.  
> */
>function_arg_info (machine_mode mode, bool named)
> -: type (NULL_TREE), mode (mode), named (named), pass_by_reference (false)
> +: type (NULL_TREE), fntype (NULL_TREE), mode (mode), named (named),
> +pass_by_reference (false)
>{}
>  
>/* Initialize an unpromoted argument of type TYPE.  */
>function_arg_info (tree type, bool named)
> -: type (type), mode (TYPE_MODE (type)), named (named),
> +: type (type), fntype (NULL_TREE), mode (TYPE_MODE (type)), named 
> (named),
>pass_by_reference (false)
>{}
>  
> +  /* Initialize an unpromoted argument of type TYPE with a known function 
> type
> + FNTYPE.  */
> +  function_arg_info (tree type, tree fntype, bool named)
> +: type (type), fntype (fntype), mode (TYPE_MODE (type)), named (named),
> +pass_by_reference (false)
> +  {}
> +
>/* Initialize an argument with explicit properties.  */
>function_arg_info (tree type, machine_mode mode, bool named)
> -: type (type), mode (mode), named (named), pass_by_reference (false)
> +: type (type), fntype (NULL_TREE), mode (mode), named (named),
> +pass_by_reference (false)
>{}
>  
>/* Return true if the gimple-level type is an aggregate.  */
> @@ -96,6 +105,9 @@ public:
>   libgcc support functions).  */
>tree type;
>  
> +  /* The type of the function that has this argument, or null if not known.  
> */
> +  tree fntype;
> +
>/* The mode of the argument.  Depending on context, this might be
>   the mode of the argument type or the mode after promotion.  */
>machine_mode mode;
> diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
> index 1cd69268ee9..98e149e5b7e 100644
> --- a/gcc/config/arm/arm.cc
> +++ b/gcc/config/arm/arm.cc
> @@ -7006,7 +7006,7 @@ aapcs_libcall_value (machine_mode mode)
> numbers referred to here are those in the

Re: [PATCH] arm: fix c23 0-named-args caller-side stdarg

2024-03-01 Thread Richard Earnshaw (lists)

On 01/03/2024 04:38, Alexandre Oliva wrote:
> Hello, Matthew,
> 
> Thanks for the review.

For closure, Jakub has just pushed a patch to the generic code, so I don't 
think we need this now.

R.

> 
> On Feb 26, 2024, Matthew Malcomson  wrote:
> 
>> I think you're right that the AAPCS32 requires all arguments to be passed in
>> registers for this testcase.
>> (Nit on the commit-message: It says that your reading of the AAPCS32
>> suggests
>> that the *caller* is correct -- I believe based on the change you
>> suggested you
>> meant *callee* is correct in expecting arguments in registers.)
> 
> Ugh, yeah, sorry about the typo.
> 
>> The approach you suggest looks OK to me -- I do notice that it doesn't
>> fix the
>> legacy ABI's of `atpcs` and `apcs` and guess it would be nicer to have them
>> working at the same time though would defer to maintainers on how
>> important that
>> is.
>> (For the benefit of others reading) I don't believe there is any ABI concern
>> with this since it's fixing something that is currently not working at
>> all and
>> only applies to c23 (so a change shouldn't have too much of an impact).
> 
>> You mention you chose to make the change in the arm backend rather
>> than general
>> code due to hesitancy to change the generic ABI-affecting code. That makes
>> sense to me, certainly at this late stage in the development cycle.
> 
> *nod* I wrote the patch in the following context: I hit the problem on
> the very first toolchain I started transitioning to gcc-13.  I couldn't
> really fathom the notion that this breakage could have survived an
> entire release cycle if it affected many targets, and sort of held on to
> an assumption that the abi used by our arm-eabi toolchain had to be an
> uncommon one.
> 
> All of this hypothesizing falls apart by the now apparent knowledge that
> the test is faling elsewhere as well, even on other ARM ABIs, it just
> hadn't been addressed yet.  I'm glad we're getting there :-)
> 
>> From a quick check on c23-stdarg-4.c it does look like the below
>> change ends up
>> with the same codegen as your patch (except in the case of those
>> legacy ABI's,
>> where the below does make the caller and callee ABI match AFAICT):
> 
>> ```
>>   diff --git a/gcc/calls.cc b/gcc/calls.cc
>>   index 01f44734743..0b302f633ed 100644
>>   --- a/gcc/calls.cc
>>   +++ b/gcc/calls.cc
>>   @@ -2970,14 +2970,15 @@ expand_call (tree exp, rtx target, int ignore)
>>     we do not have any reliable way to pass unnamed args in
>>     registers, so we must force them into memory.  */
> 
>>   -  if (type_arg_types != 0
>>   +  if ((type_arg_types != 0 || TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
>>  && targetm.calls.strict_argument_naming (args_so_far))
>>    ;
>>  else if (type_arg_types != 0
>>  && ! targetm.calls.pretend_outgoing_varargs_named
>> (args_so_far))
>>    /* Don't include the last named arg.  */
>>    --n_named_args;
>>   -  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
>>   +  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype)
>>   +    && ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>>    n_named_args = 0;
>>  else
>>    /* Treat all args as named.  */
>> ```
> 
>> Do you agree that this makes sense (i.e. is there something I'm
>> completely missing)?
> 
> Yeah, your argument is quite convincing, and the target knobs are indeed
> in line with the change you suggest, whereas the current code seems to
> deviate from them.
> 
> With my ABI designer hat on, however, I see that there's room for ABIs
> to make decisions about 0-args stdargs that go differently from stdargs
> with leading named args, from prototyped functions, and even from
> prototypeless functions, and we might end up needing more knobs to deal
> with such custom decisions.  We can cross that bridge if/when we get to
> it, though.
> 
>> (lm32 mcore msp430 gcn cris fr30 frv h8300 arm v850 rx pru)
> 
> Interesting that ppc64le is not on your list.  There's PR107453 about
> that, and another thread is discussing a fix for it that is somewhat
> different from what you propose (presumably because the way the problem
> manifests on ppc64le is different), but it also tweaks expand_call.
> 
> I'll copy you when following up there.
>

Re: [PATCH] testsuite: Turn errors back into warnings in arm/acle/cde-mve-error-2.c

2024-03-01 Thread Richard Earnshaw (lists)

On 13/01/2024 20:46, Thiago Jung Bauermann wrote:
> Since commit 2c3db94d9fd ("c: Turn int-conversion warnings into
> permerrors") the test fails with errors such as:
> 
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 32)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 33)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 34)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 35)
> ⋮
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 118 (test for 
> warnings, line 117)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 119)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 120 (test for 
> warnings, line 119)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 121)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 122 (test for 
> warnings, line 121)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 123)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   at line 124 (test for 
> warnings, line 123)
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0   (test for errors, line 
> 125)
> ⋮
>   FAIL: gcc.target/arm/acle/cde-mve-error-2.c   -O0  (test for excess errors)
> 
> There's a total of 1016 errors.  Here's a sample of the excess errors:
> 
>   Excess errors:
>   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:117:31: 
> error: passing argument 2 of '__builtin_arm_vcx1qv16qi' makes integer from 
> pointer without a cast [-Wint-conversion]
>   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:119:3: 
> error: passing argument 3 of '__builtin_arm_vcx1qav16qi' makes integer from 
> pointer without a cast [-Wint-conversion]
>   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:121:3: 
> error: passing argument 3 of '__builtin_arm_vcx2qv16qi' makes integer from 
> pointer without a cast [-Wint-conversion]
>   /path/gcc.git/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c:123:3: 
> error: passing argument 3 of '__builtin_arm_vcx2qv16qi' makes integer from 
> pointer without a cast [-Wint-conversion]
> 
> The test expects these messages to be warnings, not errors.  My first try
> was to change it to expect them as errors instead.  This didn't work, IIUC
> because the error prevents the compiler from continuing processing the file
> and thus other errors which are expected by the test don't get emitted.
> 
> Therefore, add -fpermissive so that the test behaves as it did previously.
> Because of the additional line in the header, I had to adjust the line
> numbers of the expected warnings.
> 
> Tested on armv8l-linux-gnueabihf.
> 
> gcc/testsuite/ChangeLog:
>   * gcc.target/arm/acle/cde-mve-error-2.c: Add -fpermissive.
> ---
>  .../gcc.target/arm/acle/cde-mve-error-2.c | 63 ++-
>  1 file changed, 32 insertions(+), 31 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c 
> b/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c
> index 5b7774825442..da283a06a54d 100644
> --- a/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c
> +++ b/gcc/testsuite/gcc.target/arm/acle/cde-mve-error-2.c
> @@ -2,6 +2,7 @@
>  
>  /* { dg-do assemble } */
>  /* { dg-require-effective-target arm_v8_1m_main_cde_mve_fp_ok } */
> +/* { dg-options "-fpermissive" } */
>  /* { dg-add-options arm_v8_1m_main_cde_mve_fp } */
>  
>  /* The error checking files are split since there are three kinds of
> @@ -115,73 +116,73 @@ uint8x16_t test_bad_immediates (uint8x16_t n, 
> uint8x16_t m, int someval,
>  
>/* `imm' is of wrong type.  */
>accum += __arm_vcx1q_u8 (0, "");/* { dg-error 
> {argument 2 to '__builtin_arm_vcx1qv16qi' must be a constant immediate in 
> range \[0-4095\]} } */
> -  /* { dg-warning {passing argument 2 of '__builtin_arm_vcx1qv16qi' makes 
> integer from pointer without a cast \[-Wint-conversion\]} "" { target *-*-* } 
> 117 } */
> +  /* { dg-warning {passing argument 2 of '__builtin_arm_vcx1qv16qi' makes 
> integer from pointer without a cast \[-Wint-conversion\]} "" { target *-*-* } 
> 118 } */

Absolute line numbers are a pain, but I think we can use '.-1' (without the 
quotes) in these cases to minimize the churn.

If that works, ok with that change.

R.

Re: [PATCH] testsuite: Fix fallout of turning warnings into errors on 32-bit Arm

2024-03-01 Thread Richard Earnshaw (lists)

On 01/03/2024 14:23, Andre Vieira (lists) wrote:
> Hi Thiago,
> 
> Thanks for this, LGTM but I can't approve this, CC'ing Richard.
> 
> Do have a nitpick, in the gcc/testsuite/ChangeLog: remove 'gcc/testsuite' 
> from bullet points 2-4.
> 

Yes, this is OK with the change Andre mentioned (your push will fail if you 
don't fix that).

R.

PS, if you've set up GCC git customizations (see 
contrib/gcc-git-customization.sh), you can verify things like this with 'git 
gcc-verify HEAD^..HEAD'


> Kind regards,
> Andre
> 
> On 13/01/2024 00:55, Thiago Jung Bauermann wrote:
>> Since commits 2c3db94d9fd ("c: Turn int-conversion warnings into
>> permerrors") and 55e94561e97e ("c: Turn -Wimplicit-function-declaration
>> into a permerror") these tests fail with errors such as:
>>
>>    FAIL: gcc.target/arm/pr59858.c (test for excess errors)
>>    FAIL: gcc.target/arm/pr65647.c (test for excess errors)
>>    FAIL: gcc.target/arm/pr65710.c (test for excess errors)
>>    FAIL: gcc.target/arm/pr97969.c (test for excess errors)
>>
>> Here's one example of the excess errors:
>>
>>    FAIL: gcc.target/arm/pr65647.c (test for excess errors)
>>    Excess errors:
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:6:17: error: 
>> initialization of 'int' from 'int *' makes integer from pointer without a 
>> cast [-Wint-conversion]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:6:51: error: 
>> initialization of 'int' from 'int *' makes integer from pointer without a 
>> cast [-Wint-conversion]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:6:62: error: 
>> initialization of 'int' from 'int *' makes integer from pointer without a 
>> cast [-Wint-conversion]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:7:48: error: 
>> initialization of 'int' from 'int *' makes integer from pointer without a 
>> cast [-Wint-conversion]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:8:9: error: 
>> initialization of 'int' from 'int *' makes integer from pointer without a 
>> cast [-Wint-conversion]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:24:5: error: 
>> initialization of 'int' from 'int *' makes integer from pointer without a 
>> cast [-Wint-conversion]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:25:5: error: 
>> initialization of 'int' from 'struct S1 *' makes integer from pointer 
>> without a cast [-Wint-conversion]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:41:3: error: 
>> implicit declaration of function 'fn3'; did you mean 'fn2'? 
>> [-Wimplicit-function-declaration]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:46:3: error: 
>> implicit declaration of function 'fn5'; did you mean 'fn4'? 
>> [-Wimplicit-function-declaration]
>>    /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:57:16: error: 
>> implicit declaration of function 'fn6'; did you mean 'fn4'? 
>> [-Wimplicit-function-declaration]
>>
>> PR rtl-optimization/59858 and PR target/65710 test the fix of an ICE.
>> PR target/65647 and PR target/97969 test for a compilation infinite loop.
>>
>> Therefore, add -fpermissive so that the tests behave as they did previously.
>> Tested on armv8l-linux-gnueabihf.
>>
>> gcc/testsuite/ChangeLog:
>> * gcc.target/arm/pr59858.c: Add -fpermissive.
>> * gcc/testsuite/gcc.target/arm/pr65647.c: Likewise.
>> * gcc/testsuite/gcc.target/arm/pr65710.c: Likewise.
>> * gcc/testsuite/gcc.target/arm/pr97969.c: Likewise.
>> ---
>>   gcc/testsuite/gcc.target/arm/pr59858.c | 2 +-
>>   gcc/testsuite/gcc.target/arm/pr65647.c | 2 +-
>>   gcc/testsuite/gcc.target/arm/pr65710.c | 2 +-
>>   gcc/testsuite/gcc.target/arm/pr97969.c | 2 +-
>>   4 files changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/gcc/testsuite/gcc.target/arm/pr59858.c 
>> b/gcc/testsuite/gcc.target/arm/pr59858.c
>> index 3360b48e8586..9336edfce277 100644
>> --- a/gcc/testsuite/gcc.target/arm/pr59858.c
>> +++ b/gcc/testsuite/gcc.target/arm/pr59858.c
>> @@ -1,5 +1,5 @@
>>   /* { dg-do compile } */
>> -/* { dg-options "-march=armv5te -fno-builtin -mfloat-abi=soft -mthumb 
>> -fno-stack-protector -Os -fno-tree-loop-optimize -fno-tree-dominator-opts 
>> -fPIC -w" } */
>> +/* { dg-options "-march=armv5te -fno-builtin -mfloat-abi=soft -mthumb 
>> -fno-stack-protector -Os -fno-tree-loop-optimize -fno-tree-dominator-opts 
>> -fPIC -w -fpermissive" } */
>>   /* { dg-require-effective-target fpic } */
>>   /* { dg-skip-if "Incompatible command line options: -mfloat-abi=soft 
>> -mfloat-abi=hard" { *-*-* } { "-mfloat-abi=hard" } { "" } } */
>>   /* { dg-require-effective-target arm_arch_v5te_thumb_ok } */
>> diff --git a/gcc/testsuite/gcc.target/arm/pr65647.c 
>> b/gcc/testsuite/gcc.target/arm/pr65647.c
>> index 26b4e399f6be..3cbf6b804ec0 100644
>> --- a/gcc/testsuite/gcc.target/arm/pr65647.c
>> +++ b/gcc/testsuite/gcc.target/arm/pr65647.c
>> @@ -1,7 +1,7 @@
>>   /* { dg-do compile } */
>>   /* { dg-require-effective-target arm_arch_v6m_ok } */
>>   /* {

Re: [PATCH] calls: Further fixes for TYPE_NO_NAMED_ARGS_STDARG_P handling [PR107453]

2024-03-01 Thread Richard Earnshaw (lists)

On 29/02/2024 15:55, Jakub Jelinek wrote:
> On Thu, Feb 29, 2024 at 02:14:05PM +, Richard Earnshaw wrote:
>>> I tried the above on arm, aarch64 and x86_64 and that seems fine,
>>> including the new testcase you added.
>>>
>>
>> I should mention though, that INIT_CUMULATIVE_ARGS on arm ignores
>> n_named_args entirely, it doesn't need it (I don't think it even existed
>> when the AAPCS code was added).
> 
> So far I've just checked that the new testcase passes not just on
> x86_64/i686-linux, but also on {powerpc64le,s390x,aarch64}-linux
> with vanilla trunk.
> Haven't posted this patch in patch form, plus while I'm not really sure
> whether setting n_named_args to 0 or not changing in the
> !pretend_outgoing_varargs_named is right, the setting to 0 feels more
> correct to me.  If structure_value_addr_parm is 1, the function effectively
> has a single named argument and then ... args and if the target wants
> n_named_args to be number of named arguments except the last, then that
> should be 0 rather than 1.
> 
> Thus, is the following patch ok for trunk then?
> 
> 2024-02-29  Jakub Jelinek  
> 
>   PR target/107453

PR 114136

Would be more appropriate for this, I think.

Otherwise, OK.

R.

>   * calls.cc (expand_call): For TYPE_NO_NAMED_ARGS_STDARG_P set
>   n_named_args initially before INIT_CUMULATIVE_ARGS to
>   structure_value_addr_parm rather than 0, after it don't modify
>   it if strict_argument_naming and clear only if
>   !pretend_outgoing_varargs_named.
> 
> --- gcc/calls.cc.jj   2024-01-22 11:48:08.045847508 +0100
> +++ gcc/calls.cc  2024-02-29 16:24:47.799855912 +0100
> @@ -2938,7 +2938,7 @@ expand_call (tree exp, rtx target, int i
>/* Count the struct value address, if it is passed as a parm.  */
>+ structure_value_addr_parm);
>else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
> -n_named_args = 0;
> +n_named_args = structure_value_addr_parm;
>else
>  /* If we know nothing, treat all args as named.  */
>  n_named_args = num_actuals;
> @@ -2970,14 +2970,15 @@ expand_call (tree exp, rtx target, int i
>   we do not have any reliable way to pass unnamed args in
>   registers, so we must force them into memory.  */
>  
> -  if (type_arg_types != 0
> +  if ((type_arg_types != 0 || TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
>&& targetm.calls.strict_argument_naming (args_so_far))
>  ;
>else if (type_arg_types != 0
>  && ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>  /* Don't include the last named arg.  */
>  --n_named_args;
> -  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
> +  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype)
> +&& ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>  n_named_args = 0;
>else
>  /* Treat all args as named.  */
> 
>   Jakub
>

Re: [PATCH] calls: Further fixes for TYPE_NO_NAMED_ARGS_STDARG_P handling [PR107453]

2024-03-01 Thread Richard Earnshaw (lists)

On 29/02/2024 17:56, Jakub Jelinek wrote:
> On Thu, Feb 29, 2024 at 05:51:03PM +0000, Richard Earnshaw (lists) wrote:
>> Oh, but wait!  Perhaps that now falls into the initial 'if' clause and we 
>> never reach the point where you pick zero.  So perhaps I'm worrying about 
>> nothing.
> 
> If you are worried about the
> +  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype)
> +  && ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>  n_named_args = 0;
> case in the patch, we know at that point that the initial n_named_args is
> equal to structure_value_addr_parm, so either 0, in that case
> --n_named_args;
> would yield the undesirable negative value, so we want 0 instead; for that
> case we could as well just have ; in there instead of n_named_args = 0;,
> or it is 1, in that case --n_named_args; would turn that into 0.
> 
>   Jakub
> 

No, I was thinking about the case of strict_argument_naming when the first 
argument is the artificial return value pointer.  In that case we'd want 
n_named_args=1.

But I think it's a non-issue as that will be caught by 

  if ((type_arg_types != 0 || TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
   && targetm.calls.strict_argument_naming (args_so_far))
 ;

R.

Re: [PATCH v1 08/13] aarch64: Add Cygwin and MinGW environments for AArch64

2024-02-29 Thread Richard Earnshaw (lists)

On 29/02/2024 17:55, Andrew Pinski (QUIC) wrote:
>> -Original Message-
>> From: Maxim Kuvyrkov 
>> Sent: Thursday, February 29, 2024 9:46 AM
>> To: Andrew Pinski (QUIC) 
>> Cc: Evgeny Karpov ; Andrew Pinski
>> ; Richard Sandiford ; gcc-
>> patc...@gcc.gnu.org; 10wa...@gmail.com; m...@harmstone.com; Zac
>> Walker ; Ron Riddle
>> ; Radek Barton 
>> Subject: Re: [PATCH v1 08/13] aarch64: Add Cygwin and MinGW
>> environments for AArch64
>>
>> WARNING: This email originated from outside of Qualcomm. Please be wary
>> of any links or attachments, and do not enable macros.
>>
>>> On Feb 29, 2024, at 21:35, Andrew Pinski (QUIC)
>>  wrote:
>>>
>>>
>>>
 -Original Message-
 From: Evgeny Karpov 
 Sent: Thursday, February 29, 2024 8:46 AM
 To: Andrew Pinski 
 Cc: Richard Sandiford ; gcc-
 patc...@gcc.gnu.org; 10wa...@gmail.com; Maxim Kuvyrkov
 ; m...@harmstone.com; Zac Walker
 ; Ron Riddle ;
 Radek Barton ; Andrew Pinski (QUIC)
 
 Subject: [PATCH v1 08/13] aarch64: Add Cygwin and MinGW environments
 for AArch64

 Wednesday, February 28, 2024 2:00 AM
 Andrew Pinski wrote:

> What does this mean with respect to C++ exceptions? Or you using
> SJLJ exceptions support or the dwarf unwinding ones without SEH
>> support?
> I am not sure if SJLJ exceptions is well tested any more in GCC either.
>
> Also I have a question if you ran the full GCC/G++ testsuites and
> what were the results?
> If you did run it, did you use a cross compiler or the native
> compiler? Did you do a bootstrap (GCC uses C++ but no exceptions
>> though)?

 As mentioned in the cover letter and the thread, the current
 contribution covers only the C scope.
 Exception handling is fully disabled for now.
 There is an experimental build with C++ and SEH, however, it is not
 included in the plan for the current contribution.

 https://github.com/Windows-on-ARM-Experiments/mingw-woarm64-
>> build

> If you run using a cross compiler, did you use ssh or some other
> route to run the applications?
>
> Thanks,
> Andrew Pinski

 GitHub Actions are used to cross-compile toolchains, packages and
 tests, and execute tests on Windows Arm64.
>>>
>>> This does not answer my question because what you are running is just
>> simple testcases and not the FULL GCC testsuite.
>>> So again have you ran the GCC testsuite and do you have a dejagnu board to
>> be able to execute the binaries?
>>> I think without the GCC testsuite ran to find all of the known failures, 
>>> you are
>> going to be running into many issues.
>>> The GCC testsuite includes many tests for ABI corner cases and many
>> features that you will most likely not think about testing using your simple
>> testcases.
>>> In fact I suspect there will be some of the aarch64 testcases which will 
>>> need
>> to be modified for the windows ABI which you have not done yet.
>>
>> Hi Andrew,
>>
>> We (Linaro) have a prototype CI loop setup for testing aarch64-w64-
>> mingw32, and we have results for gcc-c and libatomic -- see [1].
>>
>> The results are far from clean, but that's expected.  This patch series aims 
>> at
>> enabling C hello-world only, and subsequent patch series will improve the
>> state of the port.
>>
>> [1] https://ci.linaro.org/job/tcwg_gnu_mingw_check_gcc--master-woa64-
>> build/6/artifact/artifacts/sumfiles/
> 
> Looking at these results, this port is not in any shape or form to be 
> upstreamed right now. Even simple -g will cause failures.
> Note we don't need a clean testsuite run but the patch series is not even 
> allowing enabling hello world due to the -g not being able to used.
> 

It seemed to me as though the patch was posted for comments, not for immediate 
inclusion.  I agree this isn't ready for committing yet, but neither should the 
submitters wait until it's perfect before posting it.

I think it's gcc-15 material, so now is about the right time to be thinking 
about it.

R.

> Thanks,
> Amdrew Pinski
> 
>>
>> Thanks,
>>
>> --
>> Maxim Kuvyrkov
>> https://www.linaro.org
>

Re: [PATCH] calls: Further fixes for TYPE_NO_NAMED_ARGS_STDARG_P handling [PR107453]

2024-02-29 Thread Richard Earnshaw (lists)

On 29/02/2024 17:38, Jakub Jelinek wrote:
> On Thu, Feb 29, 2024 at 05:23:25PM +0000, Richard Earnshaw (lists) wrote:
>> On 29/02/2024 15:55, Jakub Jelinek wrote:
>>> On Thu, Feb 29, 2024 at 02:14:05PM +, Richard Earnshaw wrote:
>>>>> I tried the above on arm, aarch64 and x86_64 and that seems fine,
>>>>> including the new testcase you added.
>>>>>
>>>>
>>>> I should mention though, that INIT_CUMULATIVE_ARGS on arm ignores
>>>> n_named_args entirely, it doesn't need it (I don't think it even existed
>>>> when the AAPCS code was added).
>>>
>>> So far I've just checked that the new testcase passes not just on
>>> x86_64/i686-linux, but also on {powerpc64le,s390x,aarch64}-linux
>>> with vanilla trunk.
>>> Haven't posted this patch in patch form, plus while I'm not really sure
>>> whether setting n_named_args to 0 or not changing in the
>>> !pretend_outgoing_varargs_named is right, the setting to 0 feels more
>>> correct to me.  If structure_value_addr_parm is 1, the function effectively
>>> has a single named argument and then ... args and if the target wants
>>> n_named_args to be number of named arguments except the last, then that
>>> should be 0 rather than 1.
>>>
>>> Thus, is the following patch ok for trunk then?
>>
>> The comment at the start of the section says
>>
>>   /* Now possibly adjust the number of named args.
>>  Normally, don't include the last named arg if anonymous args follow.
>>  We do include the last named arg if
>>  targetm.calls.strict_argument_naming() returns nonzero.
>>  (If no anonymous args follow, the result of list_length is actually
>>  one too large.  This is harmless.)
>>
>> So in the case of strict_argument_naming perhaps it should return 1, but 0 
>> for other cases.
> 
> The TYPE_NO_NAMED_ARGS_STDARG_P (funtype) case is as if type_arg_types != 0
> and list_length (type_arg_types) == 0, i.e. no user named arguments.
> As list_length (NULL) returns 0, perhaps it could be even handled just the
> by changing all the type_arg_types != 0 checks to
> type_arg_types != 0 || TYPE_NO_NAMED_ARGS_STDARG_P (funtype)
> There are just 2 cases I'm worried about, one is that I think rest of
> calls.cc nor the backends are prepared to see n_named_args -1 after the
> adjustments, I think it is better to use 0, and then the question is what
> the !strict_argument_naming && !pretend_outgoing_varargs_named case
> wants to do for the aggregate return.  The patch as posted for
> void foo (...); void bar () { foo (1, 2, 3); }
> will set n_named_args initially to 0 (no named args) and with the
> adjustments for strict_argument_naming 0, otherwise for !pretend
> 0 as well, otherwise 3.
> For
> struct { char buf[4096]; } baz (...); void qux () { baz (1, 2, 3); }
> the patch sets n_named_args initially to 1 (the hidden return) and
> with the arguments for strict keep it at 1, for !pretend 0 and otherwise
> 3.
> 
> So, which case do you think is handled incorrectly with that?

The way I was thinking about it (and testing it on Arm) was to look at 
n_named_args for the cases of a traditional varargs case, then reduce that by 
one (except it can't ever be negative).

So for 

void f(...);
void g(int, ...);
struct S { int a[32]; };

struct S h (...);
struct S i (int, ...);

void a ()
{
  struct S x;
  f(1, 2, 3, 4);
  g(1, 2, 3, 4);
  x = h (1, 2, 3, 4);
  x = i (1, 2, 3, 4);
}

There are various permutations that could lead to answers of 0, 1, 2, 4 and 5 
depending on how those various targets treat each case and how the result 
pointer address is handled.  My suspicion is that for a target that has strict 
argument naming and the result pointer passed as a first argument, the answer 
for the 'h()' call should be 1, not zero.  

Oh, but wait!  Perhaps that now falls into the initial 'if' clause and we never 
reach the point where you pick zero.  So perhaps I'm worrying about nothing.

R.

> 
>   Jakub
>

Re: [PATCH] calls: Further fixes for TYPE_NO_NAMED_ARGS_STDARG_P handling [PR107453]

2024-02-29 Thread Richard Earnshaw (lists)

On 29/02/2024 15:55, Jakub Jelinek wrote:
> On Thu, Feb 29, 2024 at 02:14:05PM +, Richard Earnshaw wrote:
>>> I tried the above on arm, aarch64 and x86_64 and that seems fine,
>>> including the new testcase you added.
>>>
>>
>> I should mention though, that INIT_CUMULATIVE_ARGS on arm ignores
>> n_named_args entirely, it doesn't need it (I don't think it even existed
>> when the AAPCS code was added).
> 
> So far I've just checked that the new testcase passes not just on
> x86_64/i686-linux, but also on {powerpc64le,s390x,aarch64}-linux
> with vanilla trunk.
> Haven't posted this patch in patch form, plus while I'm not really sure
> whether setting n_named_args to 0 or not changing in the
> !pretend_outgoing_varargs_named is right, the setting to 0 feels more
> correct to me.  If structure_value_addr_parm is 1, the function effectively
> has a single named argument and then ... args and if the target wants
> n_named_args to be number of named arguments except the last, then that
> should be 0 rather than 1.
> 
> Thus, is the following patch ok for trunk then?

The comment at the start of the section says

  /* Now possibly adjust the number of named args.
 Normally, don't include the last named arg if anonymous args follow.
 We do include the last named arg if
 targetm.calls.strict_argument_naming() returns nonzero.
 (If no anonymous args follow, the result of list_length is actually
 one too large.  This is harmless.)

So in the case of strict_argument_naming perhaps it should return 1, but 0 for 
other cases.

R.

> 
> 2024-02-29  Jakub Jelinek  
> 
>   PR target/107453
>   * calls.cc (expand_call): For TYPE_NO_NAMED_ARGS_STDARG_P set
>   n_named_args initially before INIT_CUMULATIVE_ARGS to
>   structure_value_addr_parm rather than 0, after it don't modify
>   it if strict_argument_naming and clear only if
>   !pretend_outgoing_varargs_named.
> 
> --- gcc/calls.cc.jj   2024-01-22 11:48:08.045847508 +0100
> +++ gcc/calls.cc  2024-02-29 16:24:47.799855912 +0100
> @@ -2938,7 +2938,7 @@ expand_call (tree exp, rtx target, int i
>/* Count the struct value address, if it is passed as a parm.  */
>+ structure_value_addr_parm);
>else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
> -n_named_args = 0;
> +n_named_args = structure_value_addr_parm;
>else
>  /* If we know nothing, treat all args as named.  */
>  n_named_args = num_actuals;
> @@ -2970,14 +2970,15 @@ expand_call (tree exp, rtx target, int i
>   we do not have any reliable way to pass unnamed args in
>   registers, so we must force them into memory.  */
>  
> -  if (type_arg_types != 0
> +  if ((type_arg_types != 0 || TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
>&& targetm.calls.strict_argument_naming (args_so_far))
>  ;
>else if (type_arg_types != 0
>  && ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>  /* Don't include the last named arg.  */
>  --n_named_args;
> -  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
> +  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype)
> +&& ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>  n_named_args = 0;
>else
>  /* Treat all args as named.  */
> 
>   Jakub
>

Re: [PATCH] calls: Fix up TYPE_NO_NAMED_ARGS_STDARG_P handling [PR107453]

2024-02-29 Thread Richard Earnshaw (lists)

On 27/02/2024 17:25, Jakub Jelinek wrote:
> On Tue, Feb 27, 2024 at 04:41:32PM +, Richard Earnshaw wrote:
>>> 2023-01-09  Jakub Jelinek  
>>>
>>> PR target/107453
>>> * calls.cc (expand_call): For calls with
>>> TYPE_NO_NAMED_ARGS_STDARG_P (funtype) use zero for n_named_args.
>>> Formatting fix.
>>
>> This one has been festering for a while; both Alexandre and Torbjorn have 
>> attempted to fix it recently, but I'm not sure either is really right...
>>
>> On Arm this is causing all anonymous arguments to be passed on the stack,
>> which is incorrect per the ABI.  On a target that uses
>> 'pretend_outgoing_vararg_named', why is it correct to set n_named_args to
>> zero?  Is it enough to guard both the statements you've added with
>> !targetm.calls.pretend_outgoing_args_named?
> 
> I'm afraid I haven't heard of that target hook before.
> All I was doing with that change was fixing a regression reported in the PR
> for ppc64le/sparc/nvptx/loongarch at least.
> 
> The TYPE_NO_NAMED_ARGS_STDARG_P functions (C23 fns like void foo (...) {})
> have NULL type_arg_types, so the list_length (type_arg_types) isn't done for
> it, but it should be handled as if it was non-NULL but list length was 0.
> 
> So, for the
>   if (type_arg_types != 0)
> n_named_args
>   = (list_length (type_arg_types)
>  /* Count the struct value address, if it is passed as a parm.  */
>  + structure_value_addr_parm);
>   else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
> n_named_args = 0;
>   else
> /* If we know nothing, treat all args as named.  */
> n_named_args = num_actuals;
> case, I think guarding it by any target hooks is wrong, although
> I guess it should have been
> n_named_args = structure_value_addr_parm;
> instead of
> n_named_args = 0;
> 
> For the second
>   if (type_arg_types != 0
>   && targetm.calls.strict_argument_naming (args_so_far))
> ;
>   else if (type_arg_types != 0
>&& ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
> /* Don't include the last named arg.  */
> --n_named_args;
>   else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
> n_named_args = 0;
>   else
> /* Treat all args as named.  */
> n_named_args = num_actuals;
> bet (but no testing done, don't even know which targets return what for
> those hooks) we should treat those as if type_arg_types was non-NULL
> with 0 elements in the list, except the --n_named_args doesn't make sense
> because that would decrease it to -1.
> So perhaps
>   if ((type_arg_types != 0 || TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
>   && targetm.calls.strict_argument_naming (args_so_far))
> ;
>   else if (type_arg_types != 0
>&& ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
> /* Don't include the last named arg.  */
> --n_named_args;
>   else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype)
>  && ! targetm.calls.pretend_outgoing_varargs_named (args_so_far)))
> ;
>   else
> /* Treat all args as named.  */
> n_named_args = num_actuals;

I tried the above on arm, aarch64 and x86_64 and that seems fine, including the 
new testcase you added.

R.

> 
> (or n_named_args = 0; instead of ; before the final else?  Dunno).
> I guess we need some testsuite coverage for caller/callee ABI match of
> struct S { char p[64]; };
> struct S foo (...);
> 
>   Jakub
>

Re: Help needed with maintainer-mode

2024-02-29 Thread Richard Earnshaw (lists) via Gcc

On 29/02/2024 10:22, Christophe Lyon via Gcc wrote:
> Hi!
> 
> Sorry for cross-posting, but I'm not sure the rules/guidelines are the
> same in gcc vs binutils/gdb.
> 
> TL;DR: are there some guidelines about how to use/enable maintainer-mode?
> 
> In the context of the Linaro CI, I've been looking at enabling
> maintainer-mode at configure time in our configurations where we test
> patches before they are committed (aka "precommit CI", which relies on
> patchwork).
> 
> Indeed, auto-generated files are not part of patch submissions, and
> when a patch implies regenerating some files before building, we
> currently report wrong failures because we don't perform such updates.
> 
> I hoped improving this would be as simple as adding
> --enable-maintainer-mode when configuring, after making sure
> autoconf-2.69 and automake-1.15.1 were in the PATH (using our host's
> libtool and gettext seems OK).
> 
> However, doing so triggered several problems, which look like race
> conditions in the build system (we build at -j160):
> - random build errors in binutils / gdb with messages like "No rule to
> make target 'po/BLD-POTFILES.in". I managed to reproduce something
> similar manually once, I noticed an empty Makefile; the build logs are
> of course difficult to read, so I couldn't figure out yet what could
> cause this.
> 
> - random build failures in gcc in fixincludes. I think this is a race
> condition because fixincludes is updated concurrently both from
> /fixincludes and $buillddir/fixincludes. Probably fixable in gcc
> Makefiles.
> 
> - I've seen other errors when building gcc like
> configure.ac:25: error: possibly undefined macro: AM_ENABLE_MULTILIB
> from libquadmath. I haven't investigated this yet.
> 
> I've read binutils' README-maintainer-mode, which contains a warning
> about distclean, but we don't use this: we start our builds from a
> scratch directory.
> 
> So... I'm wondering if there are some "official" guidelines about how
> to regenerate files, and/or use maintainer-mode?  Maybe I missed a
> "magic" make target (eg 'make autoreconf-all') that should be executed
> after configure and before 'make all'?
> 
> I've noticed that sourceware's buildbot has a small script
> "autoregen.py" which does not use the project's build system, but
> rather calls aclocal/autoheader/automake/autoconf in an ad-hoc way.
> Should we replicate that?
> 
> Thanks,
> 
> Christophe

There are other potential gotchas as well, such as the manual copying of the 
generated tm.texi back into the source repo due to relicensing.  Perhaps we 
haven't encountered that one because patches generally contain that duplicated 
output.

If we want a CI to work reliably, then perhaps we should reconsider our policy 
of stripping out regenerated code.  We have a number of developer practices, 
such as replying to an existing patch with an updated version that the CI can't 
handle easily (especially if the patch is part of a series), so there may be 
space for a discussion on how to work smarter.

My calendar says we have a toolchain office hours meeting today, perhaps this 
would be worth bringing up.

R.

Re: [PATCH] ARM: Fix conditional execution [PR113915]

2024-02-26 Thread Richard Earnshaw (lists)

On 26/02/2024 16:05, Wilco Dijkstra wrote:
> Hi Richard,
> 
>> Did you test this on a thumb1 target?  It seems to me that the target parts 
>> that you've
>> removed were likely related to that.  In fact, I don't see why this test 
>> would need to be changed at all.
> 
> The testcase explicitly forces a Thumb-2 target (arm_arch_v6t2). The patterns
> were wrong for Thumb-2 indeed, and the testcase was explicitly testing for 
> this.
> There is a separate builtin-bswap-2.c for Thumb-1 target (arm_arch_v6m).
> 
> Cheers,
> Wilco

That's why statements like:

* gcc.target/arm/builtin-bswap-1.c: Fix test.

are less than helpful.  Perhaps if you'd said what you actually changed that 
would have made it more obvious.

So OK, but please fix the commit message to say what you did.

R.

Re: [PATCH] ARM: Fix conditional execution [PR113915]

2024-02-26 Thread Richard Earnshaw (lists)

On 23/02/2024 15:46, Wilco Dijkstra wrote:
> Hi Richard,
> 
>> This bit isn't.  The correct fix here is to fix the pattern(s) concerned to 
>> add the missing predicate.
>>
>> Note that builtin-bswap.x explicitly mentions predicated mnemonics in the 
>> comments.
> 
> I fixed the patterns in v2. There are likely some more, plus we could likely 
> merge many t1 and t2
> patterns where the only difference is predication. But those cleanups are for 
> another time...
> 
> Cheers,
> Wilco
> 
> v2: Add predicable to the rev patterns.
> 
> By default most patterns can be conditionalized on Arm targets.  However
> Thumb-2 predication requires the "predicable" attribute be explicitly
> set to "yes".  Most patterns are shared between Arm and Thumb(-2) and are
> marked with "predicable".  Given this sharing, it does not make sense to
> use a different default for Arm.  So only consider conditional execution
> of instructions that have the predicable attribute set to yes.  This ensures
> that patterns not explicitly marked as such are never conditionally executed.
> 
> Passes regress and bootstrap, OK for commit?
> 
> gcc/ChangeLog:
> PR target/113915
> * config/arm/arm.md (NOCOND): Improve comment.
> (arm_rev*) Add predicable.
> * config/arm/arm.cc (arm_final_prescan_insn): Add check for
> PREDICABLE_YES.
> 
> gcc/testsuite/ChangeLog:
> PR target/113915
> * gcc.target/arm/builtin-bswap-1.c: Fix test.
> 
> ---
> 
> diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
> index 
> 1cd69268ee986a0953cc85ab259355d2191250ac..6a35fe44138135998877a9fb74c2a82a7f99dcd5
>  100644
> --- a/gcc/config/arm/arm.cc
> +++ b/gcc/config/arm/arm.cc
> @@ -25613,11 +25613,12 @@ arm_final_prescan_insn (rtx_insn *insn)
> break;
>  
>   case INSN:
> -   /* Instructions using or affecting the condition codes make it
> -  fail.  */
> +   /* Check the instruction is explicitly marked as predicable.
> +  Instructions using or affecting the condition codes are not.  
> */
> scanbody = PATTERN (this_insn);
> if (!(GET_CODE (scanbody) == SET
>   || GET_CODE (scanbody) == PARALLEL)
> +   || get_attr_predicable (this_insn) != PREDICABLE_YES
> || get_attr_conds (this_insn) != CONDS_NOCOND)
>   fail = TRUE;
> break;
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index 
> 5816409f86f1106b410c5e21d77e599b485f85f2..81237a61d4a2ebcfb77e47c2bd29137aba28a521
>  100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -307,6 +307,8 @@
>  ;
>  ; NOCOND means that the instruction does not use or alter the condition
>  ;   codes but can be converted into a conditionally exectuted instruction.
> +;   Given that NOCOND is the default for most instructions if omitted,
> +;   the attribute predicable must be set to yes as well.
>  
>  (define_attr "conds" "use,set,clob,unconditional,nocond"
>   (if_then_else
> @@ -12547,6 +12549,7 @@
>revsh%?\t%0, %1"
>[(set_attr "arch" "t1,t2,32")
> (set_attr "length" "2,2,4")
> +   (set_attr "predicable" "no,yes,yes")
> (set_attr "type" "rev")]
>  )
>  
> @@ -12560,6 +12563,7 @@
> rev16%?\t%0, %1"
>[(set_attr "arch" "t1,t2,32")
> (set_attr "length" "2,2,4")
> +   (set_attr "predicable" "no,yes,yes")
> (set_attr "type" "rev")]
>  )
>  
> @@ -12584,6 +12588,7 @@
> rev16%?\t%0, %1"
>[(set_attr "arch" "t1,t2,32")
> (set_attr "length" "2,2,4")
> +   (set_attr "predicable" "no,yes,yes")
> (set_attr "type" "rev")]
>  )
>  
> @@ -12619,6 +12624,7 @@
> rev16%?\t%0, %1"
>[(set_attr "arch" "t1,t2,32")
> (set_attr "length" "2,2,4")
> +   (set_attr "predicable" "no,yes,yes")
> (set_attr "type" "rev")]
>  )
>  
> diff --git a/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c 
> b/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c
> index 
> c1e7740d14d3ca4e93a71e38b12f82c19791a204..1a311a6a5af647d40abd553e5d0ba1273c76d288
>  100644
> --- a/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c
> +++ b/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c
> @@ -5,14 +5,11 @@
> of the instructions.  Add an -mtune option known to facilitate that.  */
>  /* { dg-additional-options "-O2 -mtune=cortex-a53" } */
>  /* { dg-final { scan-assembler-not "orr\[ \t\]" } } */
> -/* { dg-final { scan-assembler-times "revsh\\t" 1 { target { arm_nothumb } } 
> } }  */
> -/* { dg-final { scan-assembler-times "revshne\\t" 1 { target { arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "revsh\\t" 2 { target { ! arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "rev16\\t" 1 { target { arm_nothumb } } 
> } }  */
> -/* { dg-final { scan-assembler-times "rev16ne\\t" 1 { target { arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "rev16\\t" 2 { target { ! arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "rev\\t" 2

Re: [PATCH v1 00/13] Add aarch64-w64-mingw32 target

2024-02-22 Thread Richard Earnshaw (lists)

On 21/02/2024 17:47, Evgeny Karpov wrote:
> Hello,
> 
> We would like to take your attention to the review of changes for the
> new GCC target, aarch64-w64-mingw32. The new target will be
> supported, tested, added to CI, and maintained by Linaro. This marks
> the first of three planned patch series contributing to the GCC C
> compiler's support for Windows Arm64.
> 
> 1. Minimal aarch64-w64-mingw32 C implementation to cross-compile
> hello-world with libgcc for Windows Arm64 using MinGW.
> 2. Extension of the aarch64-w64-mingw32 C implementation to
> cross-compile OpenSSL, OpenBLAS, FFmpeg, and libjpeg-turbo. All
> packages successfully pass tests.
> 3. Addition of call stack support for debugging, resolution of
> optimization issues in the C compiler, and DLL export/import for the
> aarch64-w64-mingw32 target.
> 
> This patch series introduces the 1st point, which involves building
> hello-world for the aarch64-w64-mingw32 target. The patch depends on
> the binutils changes for the aarch64-w64-mingw32 target that have
> already been merged.
> 
> The binutils should include recent relocation fixes.
> f87eaf8ff3995a5888c6dc4996a20c770e6bcd36
> aarch64: Add new relocations and limit COFF AArch64 relocation offsets
> 
> The series is structured in a way to trivially show that it should not
> affect any other targets.
> 
> In this patch, several changes have been made to support the
> aarch64-w64-mingw32 target for GCC. The modifications include the
> definition of the MS ABI for aarch64, adjustments to FIXED_REGISTERS
> and STATIC_CHAIN_REGNUM for different ABIs, and specific definitions
> for COFF format on AArch64. Additionally, the patch reuses MinGW
>  types and definitions from i386, relocating them to a new
> mingw folder for shared usage between both targets.
> 
> MinGW-specific options have been introduced for AArch64, along with
> override options for aarch64-w64-mingw32. Builtin stack probing for
> override options for aarch64-w64-mingw32. Builtin stack probing for
> AArch64 has been enabled as an alternative for chkstk. Symbol name
> encoding and section information handling for aarch64-w64-mingw32 have
> been incorporated, and the MinGW environment has been added, which
> will also be utilized for defining the Cygwin environment in the
> future.
> 
> The patch includes renaming "x86 Windows Options" to "Cygwin and MinGW
> Options," which now encompasses AArch64 as well. AArch64-specific
> Cygwin and MinGW Options have been introduced for the unique
> requirements of the AArch64 architecture.
> 
> Function type declaration and named sections support have been added.
> The necessary objects for Cygwin and MinGW have been built for the
> aarch64-w64-mingw32 target, and relevant files such as msformat-c.cc
> and winnt-d.cc have been moved to the mingw folder for reuse in
> AArch64.
> 
> Furthermore, the aarch64-w64-mingw32 target has been included in both
> libatomic and libgcc, ensuring support for the AArch64 architecture
> within these libraries. These changes collectively enhance the
> capabilities of GCC for the specified target.
> 
> Coauthors: Zac Walker ,
> Mark Harmstone   and
> Ron Riddle 
> 
> Refactored, prepared, and validated by 
> Radek Barton  and 
> Evgeny Karpov 
> 
> Special thanks to the Linaro GNU toolchain team for internal review
> and assistance in preparing the patch series!
> 
> Regards,
> Evgeny

Thanks for posting this.

I've only read quickly through this patch series and responded where I think 
some action is obviously required.  That doesn't necessarily mean the other 
patches are perfect, though, just that nothing immediately caught my attention.

R.

> 
> 
> Zac Walker (13):
>   Introduce aarch64-w64-mingw32 target
>   aarch64: The aarch64-w64-mingw32 target implements the MS ABI
>   aarch64: Mark x18 register as a fixed register for MS ABI
>   aarch64: Add aarch64-w64-mingw32 COFF
>   Reuse MinGW from i386 for AArch64
>   Rename section and encoding functions from i386 which will be used in
> aarch64
>   Exclude i386 functionality from aarch64 build
>   aarch64: Add Cygwin and MinGW environments for AArch64
>   aarch64: Add SEH to machine_function
>   Rename "x86 Windows Options" to "Cygwin and MinGW Options"
>   aarch64: Build and add objects for Cygwin and MinGW for AArch64
>   aarch64: Add aarch64-w64-mingw32 target to libatomic
>   Add aarch64-w64-mingw32 target to libgcc
> 
>  fixincludes/mkfixinc.sh   |   3 +-
>  gcc/config.gcc|  47 +++--
>  gcc/config/aarch64/aarch64-coff.h |  92 +
>  gcc/config/aarch64/aarch64-opts.h |   7 +
>  gcc/config/aarch64/aarch64-protos.h   |   5 +
>  gcc/config/aarch64/aarch64.h  |  25 ++-
>  gcc/config/aarch64/cygming.h  | 178 ++
>  gcc/config/i386/cygming.h |  18 +-
>  gcc/config/i386/cygming.opt.urls  |  30 ---
>  gcc/config/i386/i386-protos.h

Re: [PATCH v1 13/13] Add aarch64-w64-mingw32 target to libgcc

2024-02-22 Thread Richard Earnshaw (lists)

On 21/02/2024 18:40, Evgeny Karpov wrote:
> 
+aarch64-*-mingw*)

This doesn't match the glob pattern you added to config.gcc in an earlier 
patch, but see my comment on that.  The two should really be consistent with 
each other or you might get build failures late on.

R.

Re: [PATCH v1 10/13] Rename "x86 Windows Options" to "Cygwin and MinGW Options"

2024-02-22 Thread Richard Earnshaw (lists)

On 21/02/2024 18:38, Evgeny Karpov wrote:
> 
For this change you might want to put some form of re-direct in the manual 
under the old name so that anybody used to looking for the old entry will know 
where things have been moved to.  Something like

x86 Windows Options
  See xref(Cygwin and MinGW Options).

R.

Re: [PATCH v1 08/13] aarch64: Add Cygwin and MinGW environments for AArch64

2024-02-22 Thread Richard Earnshaw (lists)

On 21/02/2024 18:36, Evgeny Karpov wrote:
> 
+/* GNU as supports weak symbols on PECOFF.  */
+#ifdef HAVE_GAS_WEAK

Can't we assume this is true?  It was most likely needed on i386 because 
support goes back longer than the assembler had this feature, but it looks like 
it was added in 2000, or thereabouts, so significantly before aarch64 was 
supported in the assembler.

+#ifndef HAVE_GAS_ALIGNED_COMM

And this was added to GCC in 2009, which probably means it predates 
aarch64-coff support in gas as well.

R.

Re: [PATCH v1 03/13] aarch64: Mark x18 register as a fixed register for MS ABI

2024-02-22 Thread Richard Earnshaw (lists)

On 21/02/2024 18:30, Evgeny Karpov wrote:
> 
+   tm_defines="${tm_defines} TARGET_ARM64_MS_ABI=1"

I missed this on first reading...

The GCC port name uses AARCH64, please use that internally rather than other 
names.  The only time when we should be using ARM64 is when it's needed for 
compatibility with other compilers and that doesn't apply here AFAICT.

R.

Re: [PATCH v1 03/13] aarch64: Mark x18 register as a fixed register for MS ABI

2024-02-22 Thread Richard Earnshaw (lists)

On 21/02/2024 18:30, Evgeny Karpov wrote:
> 
+/* X18 reserved for the TEB on Windows.  */
+#ifdef TARGET_ARM64_MS_ABI
+# define FIXED_X18 1
+# define CALL_USED_X18 0
+#else
+# define FIXED_X18 0
+# define CALL_USED_X18 1
+#endif

I'm not overly keen on ifdefs like this (and the one below), it can get quite 
confusing if we have to support more than a couple of ABIs.  Perhaps we could 
create a couple of new headers, one for the EABI (which all existing targets 
would then need to include) and one for the MS ABI.  Then the mingw port would 
use that instead of the EABI header.

An alternative is to make all this dynamic, based on the setting of the 
aarch64_calling_abi enum and to make the adjustments in 
aarch64_conditional_register_usage.

+# define CALL_USED_X18 0

Is that really correct?  If the register is really reserved, but some code 
modifies it anyway, this will cause the compiler to restore the old value at 
the end of a function; generally, for a reserved register, code that knows what 
it's doing would want to make permanent changes to this value.

+#ifdef TARGET_ARM64_MS_ABI
+# define STATIC_CHAIN_REGNUM   R17_REGNUM
+#else
+# define STATIC_CHAIN_REGNUM   R18_REGNUM
+#endif

If we went the enum way, we'd want something like

#define STATIC_CHAIN_REGNUM (calling_abi == AARCH64_CALLING_ABI_MS ? R17_REGNUM 
: R18_REGNUM)

R.

Re: [PATCH v1 02/13] aarch64: The aarch64-w64-mingw32 target implements

2024-02-22 Thread Richard Earnshaw (lists)

On 21/02/2024 18:26, Evgeny Karpov wrote:
> 
+/* Available call ABIs.  */
+enum calling_abi
+{
+  AARCH64_EABI = 0,
+  MS_ABI = 1
+};
+

The convention in this file seems to be that all enum types to start with 
aarch64.  Also, the enumeration values should start with the name of the 
enumeration type in upper case, so:

enum aarch64_calling_abi
{
  AARCH64_CALLING_ABI_EABI,
  AARCH64_CALLING_ABI_MS
};

or something very much like that.

R.

Re: [PATCH v1 01/13] Introduce aarch64-w64-mingw32 target

2024-02-22 Thread Richard Earnshaw (lists)

On 21/02/2024 18:16, Evgeny Karpov wrote:
> 
+aarch64*-*-mingw*)

Other targets are a bit inconsistent here as well, but, as Andrew mentioned, if 
you don't want to handle big-endian, it might be better to match 
aarch64-*-mingw* here.

R.

Re: [PATCH v1 05/13] Reuse MinGW from i386 for AArch64

2024-02-22 Thread Richard Earnshaw (lists)

On 21/02/2024 21:34, rep.dot@gmail.com wrote:
> On 21 February 2024 19:34:43 CET, Evgeny Karpov  
> wrote:
>>
> 
> Please use git send-email. Your mail ends up as empty as here, otherwise.

I don't see anything wrong with it; niether does patchwork 
(https://patchwork.sourceware.org/project/gcc/list/?series=31191) nor does the 
Linaro CI bot.  So perhaps it's your mailer that's misconfigured.

> 
> The ChangeLog has to be expressed in present tense, as mandated by the 
> standard; s/Moved/Move/g etc.

Agreed, but that's a detail that we can get to once the patch has been properly 
reviewed.

> 
> In any sane world ( and in gcc ) to fold, respectively a folder, is something 
> else compared to a directory ( which you probably mean when moving a file 
> from one directory to another directory as you seem to do ).
> 
> Most of the free world has left COFF behind since several decades, so I won't 
> comment on that. YMMV.

This isn't helpful.  Windows platforms use (a derivative of) COFF, so that's 
what the tools need to use when targetting that platform.

R.

Re: [PATCH] ARM: Fix conditional execution [PR113915]

2024-02-21 Thread Richard Earnshaw (lists)

On 21/02/2024 14:34, Wilco Dijkstra wrote:
> 
> By default most patterns can be conditionalized on Arm targets.  However
> Thumb-2 predication requires the "predicable" attribute be explicitly
> set to "yes".  Most patterns are shared between Arm and Thumb(-2) and are
> marked with "predicable".  Given this sharing, it does not make sense to
> use a different default for Arm.  So only consider conditional execution
> of instructions that have the predicable attribute set to yes.  This ensures
> that patterns not explicitly marked as such are never accidentally 
> conditionally executed like in the PR.
> 
> GLIBC codesize was ~0.014% worse due to atomic operations now being
> unconditional and a small number of patterns not setting "predicable".
> 
> Passes regress and bootstrap, OK for commit?
> 
> gcc/ChangeLog:
> PR target/113915
> * config/arm/arm.md (NOCOND): Improve comment.
> * config/arm/arm.cc (arm_final_prescan_insn): Add check for
> PREDICABLE_YES.
> 
> gcc/testsuite/ChangeLog:
> PR target/113915
> * gcc.target/arm/builtin-bswap-1.c: Fix test.
> 
> ---
> 
> diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
> index 
> c44047c377a802d0c1dc1406df1b88a6b079607b..29771d284831a995adcf9adbb525396fbabb1ea2
>  100644
> --- a/gcc/config/arm/arm.cc
> +++ b/gcc/config/arm/arm.cc
> @@ -25610,11 +25610,12 @@ arm_final_prescan_insn (rtx_insn *insn)
> break;
>  
>   case INSN:
> -   /* Instructions using or affecting the condition codes make it
> -  fail.  */
> +   /* Check the instruction is explicitly marked as predicable.
> +  Instructions using or affecting the condition codes are not.  
> */
> scanbody = PATTERN (this_insn);
> if (!(GET_CODE (scanbody) == SET
>   || GET_CODE (scanbody) == PARALLEL)
> +   || get_attr_predicable (this_insn) != PREDICABLE_YES
> || get_attr_conds (this_insn) != CONDS_NOCOND)
>   fail = TRUE;
> break;
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index 
> 5816409f86f1106b410c5e21d77e599b485f85f2..671f093862259c2c0df93a986fc22fa56a8ea6c7
>  100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -307,6 +307,8 @@
>  ;
>  ; NOCOND means that the instruction does not use or alter the condition
>  ;   codes but can be converted into a conditionally exectuted instruction.
> +;   Given that NOCOND is the default for most instructions if omitted,
> +;   the attribute predicable must be set to yes as well.
>  
>  (define_attr "conds" "use,set,clob,unconditional,nocond"
>   (if_then_else

While this is ok, 

> diff --git a/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c 
> b/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c
> index 
> c1e7740d14d3ca4e93a71e38b12f82c19791a204..3de7cea81c1128c2fe5a9e1216e6b027d26bcab9
>  100644
> --- a/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c
> +++ b/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c
> @@ -5,14 +5,8 @@
> of the instructions.  Add an -mtune option known to facilitate that.  */
>  /* { dg-additional-options "-O2 -mtune=cortex-a53" } */
>  /* { dg-final { scan-assembler-not "orr\[ \t\]" } } */
> -/* { dg-final { scan-assembler-times "revsh\\t" 1 { target { arm_nothumb } } 
> } }  */
> -/* { dg-final { scan-assembler-times "revshne\\t" 1 { target { arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "revsh\\t" 2 { target { ! arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "rev16\\t" 1 { target { arm_nothumb } } 
> } }  */
> -/* { dg-final { scan-assembler-times "rev16ne\\t" 1 { target { arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "rev16\\t" 2 { target { ! arm_nothumb } 
> } } }  */
> -/* { dg-final { scan-assembler-times "rev\\t" 2 { target { arm_nothumb } } } 
> }  */
> -/* { dg-final { scan-assembler-times "revne\\t" 2 { target { arm_nothumb } } 
> } }  */
> -/* { dg-final { scan-assembler-times "rev\\t" 4 { target { ! arm_nothumb } } 
> } }  */
> +/* { dg-final { scan-assembler-times "revsh\\t" 2 } }  */
> +/* { dg-final { scan-assembler-times "rev16\\t" 2 } }  */
> +/* { dg-final { scan-assembler-times "rev\\t" 4 } }  */
>  
>  #include "builtin-bswap.x"

This bit isn't.  The correct fix here is to fix the pattern(s) concerned to add 
the missing predicate.

Note that builtin-bswap.x explicitly mentions predicated mnemonics in the 
comments.

R.

Re: [PATCH]AArch64: xfail modes_1.f90 [PR107071]

2024-02-19 Thread Richard Earnshaw (lists)

On 19/02/2024 10:58, Tamar Christina wrote:
>> -Original Message-
>> From: Tamar Christina
>> Sent: Thursday, February 15, 2024 11:05 AM
>> To: Richard Earnshaw (lists) ; gcc-
>> patc...@gcc.gnu.org
>> Cc: nd ; Marcus Shawcroft ; Kyrylo
>> Tkachov ; Richard Sandiford
>> 
>> Subject: RE: [PATCH]AArch64: xfail modes_1.f90 [PR107071]
>>
>>> -Original Message-
>>> From: Richard Earnshaw (lists) 
>>> Sent: Thursday, February 15, 2024 11:01 AM
>>> To: Tamar Christina ; gcc-patches@gcc.gnu.org
>>> Cc: nd ; Marcus Shawcroft ;
>> Kyrylo
>>> Tkachov ; Richard Sandiford
>>> 
>>> Subject: Re: [PATCH]AArch64: xfail modes_1.f90 [PR107071]
>>>
>>> On 15/02/2024 10:57, Tamar Christina wrote:
>>>> Hi All,
>>>>
>>>> This test has never worked on AArch64 since the day it was committed.  It 
>>>> has
>>>> a number of issues that prevent it from working on AArch64:
>>>>
>>>> 1.  IEEE does not require that FP operations raise a SIGFPE for FP 
>>>> operations,
>>>>     only that an exception is raised somehow.
>>>>
>>>> 2. Most Arm designed cores don't raise SIGFPE and instead set a status 
>>>> register
>>>>    and some partner cores raise a SIGILL instead.
>>>>
>>>> 3. The way it checks for feenableexcept doesn't really work for AArch64.
>>>>
>>>> As such this test doesn't seem to really provide much value on AArch64 so 
>>>> we
>>>> should just xfail it.
>>>>
>>>> Regtested on aarch64-none-linux-gnu and no issues.
>>>>
>>>> Ok for master?
>>>
>>> Wouldn't it be better to just skip the test.  XFAIL just adds clutter to 
>>> verbose
>> output
>>> and suggests that someday the tools might be fixed for this case.
>>>
>>> Better still would be a new dg-requires fp_exceptions_raise_sigfpe as a 
>>> guard for
>>> the test.
>>
> 
> It looks like this is similar to 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78314 so
> I'll just similarly skip it.
> 
> --- inline copy of patch ---
> 
> diff --git a/gcc/testsuite/gfortran.dg/ieee/modes_1.f90 
> b/gcc/testsuite/gfortran.dg/ieee/modes_1.f90
> index 
> 205c47f38007d06116289c19d6b23cf3bf83bd48..e29d8c678e6e51c3f2e5dac53c7703bb18a99ac4
>  100644
> --- a/gcc/testsuite/gfortran.dg/ieee/modes_1.f90
> +++ b/gcc/testsuite/gfortran.dg/ieee/modes_1.f90
> @@ -1,5 +1,5 @@
>  ! { dg-do run }
> -!
> +! { dg-skip-if "PR libfortran/78314" { aarch64*-*-gnu* arm*-*-gnueabi 
> arm*-*-gnueabihf } }
>  ! Test IEEE_MODES_TYPE, IEEE_GET_MODES and IEEE_SET_MODES
>  
> Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?

OK, but please give the fortran maintainers 24hrs to comment before pushing.

R.

> 
> Thanks,
> Tamar
> 
> gcc/testsuite/ChangeLog:
> 
>   PR fortran/107071
>   * gfortran.dg/ieee/modes_1.f90: skip aarch64, arm.

Re: [PATCH][GCC][Arm] Missing optimization pattern for rev16 on architectures with thumb1

2024-02-15 Thread Richard Earnshaw (lists)

On 12/02/2024 13:48, Matthieu Longo wrote:
> This patch marks a rev16 test as XFAIL for architectures having only Thumb1 
> support. The generated code is functionally correct, but the optimization is 
> disabled when -mthumb is equivalent to Thumb1. Fixing the root issue would 
> requires changes that are not suitable for GCC14 stage 4.
> 
> More information at https://linaro.atlassian.net/browse/GNU-1141
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.target/arm/rev16_2.c: XFAIL when compiled with Thumb1.

Thanks, I've tweaked the commit message slightly and pushed this.

R.

Re: [PATCH]AArch64: xfail modes_1.f90 [PR107071]

2024-02-15 Thread Richard Earnshaw (lists)

On 15/02/2024 10:57, Tamar Christina wrote:
> Hi All,
> 
> This test has never worked on AArch64 since the day it was committed.  It has
> a number of issues that prevent it from working on AArch64:
> 
> 1.  IEEE does not require that FP operations raise a SIGFPE for FP operations,
>     only that an exception is raised somehow.
> 
> 2. Most Arm designed cores don't raise SIGFPE and instead set a status 
> register
>    and some partner cores raise a SIGILL instead.
> 
> 3. The way it checks for feenableexcept doesn't really work for AArch64.
> 
> As such this test doesn't seem to really provide much value on AArch64 so we
> should just xfail it.
> 
> Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?

Wouldn't it be better to just skip the test.  XFAIL just adds clutter to 
verbose output and suggests that someday the tools might be fixed for this case.

Better still would be a new dg-requires fp_exceptions_raise_sigfpe as a guard 
for the test.

R.

> 
> Thanks,
> Tamar
> 
> gcc/testsuite/ChangeLog:
> 
>     PR fortran/107071
>     * gfortran.dg/ieee/modes_1.f90: xfail aarch64.
> 
> --- inline copy of patch --
> diff --git a/gcc/testsuite/gfortran.dg/ieee/modes_1.f90 
> b/gcc/testsuite/gfortran.dg/ieee/modes_1.f90
> index 
> 205c47f38007d06116289c19d6b23cf3bf83bd48..3667571969427ae7b2b96684ec1af8b3fdd4985f
>  100644
> --- a/gcc/testsuite/gfortran.dg/ieee/modes_1.f90
> +++ b/gcc/testsuite/gfortran.dg/ieee/modes_1.f90
> @@ -1,4 +1,4 @@
> -! { dg-do run }
> +! { dg-do run { xfail { aarch64*-*-* } } }
>  !
>  ! Test IEEE_MODES_TYPE, IEEE_GET_MODES and IEEE_SET_MODES
>  
> 
> 
> 
> 
> --

Re: [PATCH] Arm: Fix incorrect tailcall-generation for indirect calls [PR113780]

2024-02-14 Thread Richard Earnshaw (lists)

On 14/02/2024 09:20, Tejas Belagod wrote:
> On 2/7/24 11:41 PM, Richard Earnshaw (lists) wrote:
>> On 07/02/2024 07:59, Tejas Belagod wrote:
>>> This patch fixes a bug that causes indirect calls in PAC-enabled functions
>>> to be tailcalled incorrectly when all argument registers R0-R3 are used.
>>>
>>> Tested on arm-none-eabi for armv8.1-m.main. OK for trunk?
>>>
>>> 2024-02-07  Tejas Belagod  
>>>
>>> PR target/113780
>>> * gcc/config/arm.cc (arm_function_ok_for_sibcall): Don't allow tailcalls
>>>   for indirect calls with 4 or more arguments in pac-enabled functions.
>>>
>>> * gcc.target/arm/pac-sibcall.c: New.
>>> ---
>>>   gcc/config/arm/arm.cc  | 12 
>>>   gcc/testsuite/gcc.target/arm/pac-sibcall.c | 11 +++
>>>   2 files changed, 19 insertions(+), 4 deletions(-)
>>>   create mode 100644 gcc/testsuite/gcc.target/arm/pac-sibcall.c
>>>
>>> diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
>>> index c44047c377a..c1f8286a4d4 100644
>>> --- a/gcc/config/arm/arm.cc
>>> +++ b/gcc/config/arm/arm.cc
>>> @@ -7980,10 +7980,14 @@ arm_function_ok_for_sibcall (tree decl, tree exp)
>>>     && DECL_WEAK (decl))
>>>   return false;
>>>   -  /* We cannot do a tailcall for an indirect call by descriptor if all 
>>> the
>>> - argument registers are used because the only register left to load the
>>> - address is IP and it will already contain the static chain.  */
>>> -  if (!decl && CALL_EXPR_BY_DESCRIPTOR (exp) && !flag_trampolines)
>>> +  /* We cannot do a tailcall for an indirect call by descriptor or for an
>>> + indirect call in a pac-enabled function if all the argument registers
>>> + are used because the only register left to load the address is IP and
>>> + it will already contain the static chain or the PAC signature in the
>>> + case of PAC-enabled functions.  */
>>
>> This comment is becoming a bit unwieldy.  I suggest restructuring it as:
>>
>> We cannot tailcall an indirect call by descriptor if all the call-clobbered
>> general registers are live (r0-r3 and ip).  This can happen when:
>>    - IP contains the static chain, or
>>    - IP is needed for validating the PAC signature.
>>
>>
>>> +  if (!decl
>>> +  && ((CALL_EXPR_BY_DESCRIPTOR (exp) && !flag_trampolines)
>>> +  || arm_current_function_pac_enabled_p()))
>>>   {
>>>     tree fntype = TREE_TYPE (TREE_TYPE (CALL_EXPR_FN (exp)));
>>>     CUMULATIVE_ARGS cum;
>>> diff --git a/gcc/testsuite/gcc.target/arm/pac-sibcall.c 
>>> b/gcc/testsuite/gcc.target/arm/pac-sibcall.c
>>> new file mode 100644
>>> index 000..c57bf7a952c
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/arm/pac-sibcall.c
>>> @@ -0,0 +1,11 @@
>>> +/* Testing return address signing.  */
>>> +/* { dg-do compile } */
>>> +/* { dg-require-effective-target mbranch_protection_ok } */
>>> +/* { dg-options " -mcpu=cortex-m85 -mbranch-protection=pac-ret+leaf -O2" } 
>>> */
>>
>> No, you can't just add options like this, you need to first check that they 
>> won't result in conflicts with other options on the command line.  See 
>> https://gcc.gnu.org/pipermail/gcc-patches/2024-January/644077.html for an 
>> example of how to handle this.
>>
> Thanks for the review, Richard. Respin attached.
> 
> Thanks,
> Tejas.
> 
>>> +
>>> +void fail(void (*f)(int, int, int, int))
>>> +{
>>> +  f(1, 2, 3, 4);
>>> +}
>>> +
>>> +/* { dg-final { scan-assembler-not "bx\tip\t@ indirect register sibling 
>>> call" } } */
>>
>> R.
>>
+++ b/gcc/testsuite/gcc.target/arm/pac-sibcall.c
@@ -0,0 +1,14 @@
+/* If all call-clobbered general registers are live (r0-r3, ip), disable
+   indirect tail-call for a PAC-enabled function.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target mbranch_protection_ok } */
This only checks if -mbranch-protection can work with the existing 
architecture/cpu; not with the flags you're about to add below.  You should 
check for arm_arch_v8_1m_main_pacbti_ok instead; then you can assume that 
-mbranch-protection can be added.

+/* { dg-add-options arm_arch_v8_1m_main_pacbti } */
+/* { dg-additional-options "-mbranch-protection=pac-ret+leaf -O2" } */

Otherwise this is OK if you fix the above.

R.

Re: [PATCH] testsuite, arm: Fix testcase arm/pr112337.c to check for the options first

2024-02-09 Thread Richard Earnshaw (lists)

On 30/01/2024 17:07, Saurabh Jha wrote:
> Hey,
> 
> Previously, this test was added to fix this bug: 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112337. However, it did not 
> check the compilation options before using them, leading to errors.
> 
> This patch fixes the test by first checking whether it can use the options 
> before using them.
> 
> Tested for arm-none-eabi and found no regressions. The output of check-gcc 
> with RUNTESTFLAGS="arm.exp=*" changed like this:
> 
> Before:
> # of expected passes  5963
> # of unexpected failures  64
> 
> After:
> # of expected passes  5964
> # of unexpected failures  63
> 
> Ok for master?
> 
> Regards,
> Saurabh
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.target/arm/pr112337.c: Check whether we can use the compilation 
> options before using them.

My apologies for missing this earlier.  It didn't show up in patchwork. That's 
most likely because the attachment is a binary blob instead of text/plain.  
That also means that the Linaro CI system hasn't seen this patch either.  
Please can you fix your mailer to add plain text patch files.

-/* { dg-options "-O2 -march=armv8.1-m.main+fp.dp+mve.fp -mfloat-abi=hard" } */
+/* { dg-require-effective-target arm_hard_ok } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-options "-O2 -mfloat-abi=hard" } */
+/* { dg-add-options arm_v8_1m_mve } */

This is moving in the right direction, but it adds more than necessary now: 
checking for, and adding -mfloat-abi=hard is not necessary any more as 
arm_v8_1m_mve_ok will work out what float-abi flags are needed to make the 
options work. (What's more, it will prevent the test from running if the base 
configuration of the compiler is incompatible with the hard float ABI, which is 
more than we need.).

So please can you re-spin removing the hard-float check and removing that from 
dg-options.

Thanks,
R.

Re: [PATCH] Arm: Fix incorrect tailcall-generation for indirect calls [PR113780]

2024-02-07 Thread Richard Earnshaw (lists)

On 07/02/2024 07:59, Tejas Belagod wrote:
> This patch fixes a bug that causes indirect calls in PAC-enabled functions
> to be tailcalled incorrectly when all argument registers R0-R3 are used.
> 
> Tested on arm-none-eabi for armv8.1-m.main. OK for trunk?
> 
> 2024-02-07  Tejas Belagod  
> 
>   PR target/113780
>   * gcc/config/arm.cc (arm_function_ok_for_sibcall): Don't allow tailcalls
>   for indirect calls with 4 or more arguments in pac-enabled functions.
> 
>   * gcc.target/arm/pac-sibcall.c: New.
> ---
>  gcc/config/arm/arm.cc  | 12 
>  gcc/testsuite/gcc.target/arm/pac-sibcall.c | 11 +++
>  2 files changed, 19 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/pac-sibcall.c
> 
> diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
> index c44047c377a..c1f8286a4d4 100644
> --- a/gcc/config/arm/arm.cc
> +++ b/gcc/config/arm/arm.cc
> @@ -7980,10 +7980,14 @@ arm_function_ok_for_sibcall (tree decl, tree exp)
>&& DECL_WEAK (decl))
>  return false;
>  
> -  /* We cannot do a tailcall for an indirect call by descriptor if all the
> - argument registers are used because the only register left to load the
> - address is IP and it will already contain the static chain.  */
> -  if (!decl && CALL_EXPR_BY_DESCRIPTOR (exp) && !flag_trampolines)
> +  /* We cannot do a tailcall for an indirect call by descriptor or for an
> + indirect call in a pac-enabled function if all the argument registers
> + are used because the only register left to load the address is IP and
> + it will already contain the static chain or the PAC signature in the
> + case of PAC-enabled functions.  */

This comment is becoming a bit unwieldy.  I suggest restructuring it as:

We cannot tailcall an indirect call by descriptor if all the call-clobbered
general registers are live (r0-r3 and ip).  This can happen when:
  - IP contains the static chain, or
  - IP is needed for validating the PAC signature.


> +  if (!decl
> +  && ((CALL_EXPR_BY_DESCRIPTOR (exp) && !flag_trampolines)
> +   || arm_current_function_pac_enabled_p()))
>  {
>tree fntype = TREE_TYPE (TREE_TYPE (CALL_EXPR_FN (exp)));
>CUMULATIVE_ARGS cum;
> diff --git a/gcc/testsuite/gcc.target/arm/pac-sibcall.c 
> b/gcc/testsuite/gcc.target/arm/pac-sibcall.c
> new file mode 100644
> index 000..c57bf7a952c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/pac-sibcall.c
> @@ -0,0 +1,11 @@
> +/* Testing return address signing.  */
> +/* { dg-do compile } */
> +/* { dg-require-effective-target mbranch_protection_ok } */
> +/* { dg-options " -mcpu=cortex-m85 -mbranch-protection=pac-ret+leaf -O2" } */

No, you can't just add options like this, you need to first check that they 
won't result in conflicts with other options on the command line.  See 
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/644077.html for an 
example of how to handle this.

> +
> +void fail(void (*f)(int, int, int, int))
> +{
> +  f(1, 2, 3, 4);
> +}
> +
> +/* { dg-final { scan-assembler-not "bx\tip\t@ indirect register sibling 
> call" } } */

R.

Re: [PATCH v2] arm: Fix missing bti instruction for virtual thunks

2024-02-02 Thread Richard Earnshaw (lists)

On 26/01/2024 15:31, Richard Ball wrote:
> v2: Formatting and test options fix.
> 
> Adds missing bti instruction at the beginning of a virtual
> thunk, when bti is enabled.
> 
> gcc/ChangeLog:
> 
>   * config/arm/arm.cc (arm_output_mi_thunk): Emit
>   insn for bti_c when bti is enabled.
> 
> gcc/testsuite/ChangeLog:
> 
> * lib/target-supports.exp: Add v8_1_m_main_pacbti.
> * g++.target/arm/bti_thunk.C: New test.

OK, thanks.

R.

Re: [PATCH v3 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-01-31 Thread Richard Earnshaw (lists)

On 30/01/2024 14:09, Andre Simoes Dias Vieira wrote:
> Hi Richard,
> 
> Thanks for the reviews, I'm making these changes but just a heads up.
> 
> When hardcoding LR_REGNUM like this we need to change the way we compare the 
> register in doloop_condition_get. This function currently compares the rtx 
> nodes by address, which I think happens to be fine before we assign hard 
> registers, as I suspect we always share the rtx node for the same pseudo, but 
> when assigning registers it seems like we create copies, so things like:
> `XEXP (inc_src, 0) == reg` will fail for
> inc_src: (plus (reg LR) (const_int -n)'
> reg: (reg LR)
> 
> Instead I will substitute the operand '==' with calls to 'rtx_equal_p (op1, 
> op2, NULL)'.

Yes, that's fine.

R.

> 
> Sound good?
> 
> Kind regards,
> Andre
> 
> ____
> From: Richard Earnshaw (lists) 
> Sent: Tuesday, January 30, 2024 11:36 AM
> To: Andre Simoes Dias Vieira; gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov; Stam Markianos-Wright
> Subject: Re: [PATCH v3 2/2] arm: Add support for MVE Tail-Predicated Low 
> Overhead Loops
> 
> On 19/01/2024 14:40, Andre Vieira wrote:
>>
>> Respin after comments from Kyrill and rebase. I also removed an if-then-else
>> construct in arm_mve_check_reg_origin_is_num_elems similar to the other 
>> functions
>> Kyrill pointed out.
>>
>> After an earlier comment from Richard Sandiford I also added comments to the
>> two tail predication patterns added to explain the need for the unspecs.
> 
> [missing ChangeLog]
> 
> I'm just going to focus on loop-doloop.c in this reply, I'll respond to the 
> other bits in a follow-up.
> 
>   2)  (set (reg) (plus (reg) (const_int -1))
> - (set (pc) (if_then_else (reg != 0)
> -(label_ref (label))
> -(pc))).
> +(set (pc) (if_then_else (reg != 0)
> +(label_ref (label))
> +(pc))).
> 
>   Some targets (ARM) do the comparison before the branch, as in the
>   following form:
> 
> - 3) (parallel [(set (cc) (compare ((plus (reg) (const_int -1), 0)))
> -   (set (reg) (plus (reg) (const_int -1)))])
> -(set (pc) (if_then_else (cc == NE)
> ...
> 
> 
> This comment is becoming confusing.  Really the text leading up to 3)... 
> should be inside 3.  Something like:
> 
>   3) Some targets (ARM) do the comparison before the branch, as in the
>   following form:
> 
>   (parallel [(set (cc) (compare (plus (reg) (const_int -1)) 0))
>  (set (reg) (plus (reg) (const_int -1)))])
>   (set (pc) (if_then_else (cc == NE)
>   (label_ref (label))
>   (pc)))])
> 
> 
> The same issue on the comment structure also applies to the new point 4...
> 
> +  The ARM target also supports a special case of a counter that 
> decrements
> +  by `n` and terminating in a GTU condition.  In that case, the compare 
> and
> +  branch are all part of one insn, containing an UNSPEC:
> +
> +  4) (parallel [
> +   (set (pc)
> +   (if_then_else (gtu (unspec:SI [(plus:SI (reg:SI 14 lr)
> +   (const_int -n))])
> +  (const_int n-1]))
> +   (label_ref)
> +   (pc)))
> +   (set (reg:SI 14 lr)
> +(plus:SI (reg:SI 14 lr)
> + (const_int -n)))
> + */
> 
> I think this needs a bit more clarification.  Specifically that this 
> construct supports a predicated vectorized do loop.  Also, the placement of 
> the unspec inside the comparison is ugnly and unnecessary.  It should be 
> sufficient to have the unspec inside a USE expression, which the mid-end can 
> then ignore entirely.  So
> 
> (parallel
>  [(set (pc) (if_then_else (gtu (plus (reg) (const_int -n))
>(const_int n-1))
>   (label_ref) (pc)))
>   (set (reg) (plus (reg) (const_int -n)))
>   (additional clobbers and uses)])
> 
> For Arm, we then add a (use (unspec [(const_int 0)] N)) that is specific to 
> this pattern to stop anything else from matching it.
> 
> Note that we don't need to mention that the register is 'LR' or the modes, 
> those are specific to a particular backend, not the generic pattern we want 
> to match.
> 
> +  || !CONST_INT_P (XEXP (inc_src, 1))
> +  || INTVAL (XEXP (inc_src, 1)) >= 0)
>  re

Re: [PATCH v3 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-01-30 Thread Richard Earnshaw (lists)

On 19/01/2024 14:40, Andre Vieira wrote:
> 
> Respin after comments from Kyrill and rebase. I also removed an if-then-else
> construct in arm_mve_check_reg_origin_is_num_elems similar to the other 
> functions
> Kyrill pointed out.
> 
> After an earlier comment from Richard Sandiford I also added comments to the
> two tail predication patterns added to explain the need for the unspecs.

[missing ChangeLog]

I'm just going to focus on loop-doloop.c in this reply, I'll respond to the 
other bits in a follow-up.

  2)  (set (reg) (plus (reg) (const_int -1))
- (set (pc) (if_then_else (reg != 0)
-(label_ref (label))
-(pc))).  
+(set (pc) (if_then_else (reg != 0)
+(label_ref (label))
+(pc))).

  Some targets (ARM) do the comparison before the branch, as in the
  following form:

- 3) (parallel [(set (cc) (compare ((plus (reg) (const_int -1), 0)))
-   (set (reg) (plus (reg) (const_int -1)))])
-(set (pc) (if_then_else (cc == NE)
...

This comment is becoming confusing.  Really the text leading up to 3)... should 
be inside 3.  Something like:

  3) Some targets (ARM) do the comparison before the branch, as in the
  following form:

  (parallel [(set (cc) (compare (plus (reg) (const_int -1)) 0))
 (set (reg) (plus (reg) (const_int -1)))])
  (set (pc) (if_then_else (cc == NE)
  (label_ref (label))
  (pc)))])

The same issue on the comment structure also applies to the new point 4...

+  The ARM target also supports a special case of a counter that decrements
+  by `n` and terminating in a GTU condition.  In that case, the compare and
+  branch are all part of one insn, containing an UNSPEC:
+
+  4) (parallel [
+   (set (pc)
+   (if_then_else (gtu (unspec:SI [(plus:SI (reg:SI 14 lr)
+   (const_int -n))])
+  (const_int n-1]))
+   (label_ref)
+   (pc)))
+   (set (reg:SI 14 lr)
+(plus:SI (reg:SI 14 lr)
+ (const_int -n)))
+ */

I think this needs a bit more clarification.  Specifically that this construct 
supports a predicated vectorized do loop.  Also, the placement of the unspec 
inside the comparison is ugnly and unnecessary.  It should be sufficient to 
have the unspec inside a USE expression, which the mid-end can then ignore 
entirely.  So

(parallel
 [(set (pc) (if_then_else (gtu (plus (reg) (const_int -n))
   (const_int n-1))
  (label_ref) (pc)))
  (set (reg) (plus (reg) (const_int -n)))
  (additional clobbers and uses)])

For Arm, we then add a (use (unspec [(const_int 0)] N)) that is specific to 
this pattern to stop anything else from matching it.

Note that we don't need to mention that the register is 'LR' or the modes, 
those are specific to a particular backend, not the generic pattern we want to 
match.

+  || !CONST_INT_P (XEXP (inc_src, 1))
+  || INTVAL (XEXP (inc_src, 1)) >= 0)
 return 0;
+  int dec_num = abs (INTVAL (XEXP (inc_src, 1)));

We can just use '-INTVAL(...)' here, we've verified just above that the 
constant is negative.

-  if ((XEXP (condition, 0) == reg)
+  /* For the ARM special case of having a GTU: re-form the condition without
+ the unspec for the benefit of the middle-end.  */
+  if (GET_CODE (condition) == GTU)
+{
+  condition = gen_rtx_fmt_ee (GTU, VOIDmode, inc_src,
+ GEN_INT (dec_num - 1));
+  return condition;
+}

If you make the change I mentioned above, this re-forming isn't needed any 
more, so the arm-specific comment goes away

-   {
+{
  if (GET_CODE (pattern) != PARALLEL)
  /*  For the second form we expect:

You've fixed the indentation of the brace (good), but the body of the braced 
expression needs re-indenting as well.

R.

Re: [PATCH v3 1/2] arm: Add define_attr to to create a mapping between MVE predicated and unpredicated insns

2024-01-30 Thread Richard Earnshaw (lists)

On 19/01/2024 14:40, Andre Vieira wrote:
> 
> Reposting for testing purposes, no changes from v2 (other than rebase).

We seem to have lost the ChangeLog for this hunk :(

The code itself looks OK, though.

Re: [PATCH] Make gcc.target/arm/bics_3.c testcase a bit more generic [PR113542]

2024-01-25 Thread Richard Earnshaw (lists)

On 25/01/2024 10:29, Maxim Kuvyrkov wrote:
> After fwprop improvement in r14-8319-g86de9b66480, codegen in
> bics_3.c test changed from "bics" to "bic" instruction, with
> the overall instruction stream remaining at the same quality.
> 
> This patch makes the scan-assembler directive accept both
> "bics" and "bic".
> 
> BEFORE r14-8319-g86de9b66480:
>   bicsr0, r0, r1 @ 9  [c=4 l=4]  *andsi_notsi_si_compare0_scratch
>   mov r0, #1  @ 23[c=4 l=4]  *thumb2_movsi_vfp/1
>   it  eq
>   moveq   r0, #0  @ 26[c=8 l=4]  *p *thumb2_movsi_vfp/2
>   bx  lr  @ 29[c=8 l=4]  *thumb2_return
> 
> AFTER r14-8319-g86de9b66480:
>   bic r0, r0, r1  @ 8 [c=4 l=4]  andsi_notsi_si
>   subsr0, r0, #0  @ 22[c=4 l=4]  cmpsi2_addneg/0
>   it  ne
>   movne   r0, #1  @ 23[c=8 l=4]  *p *thumb2_movsi_vfp/2
>   bx  lr  @ 26[c=8 l=4]  *thumb2_return
> 
> gcc/testsuite/ChangeLog:
> 
>   PR target/113542
>   * gcc.target/arm/bics_3.c: Update scan-assembler directive.
> ---
>  gcc/testsuite/gcc.target/arm/bics_3.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/bics_3.c 
> b/gcc/testsuite/gcc.target/arm/bics_3.c
> index e056b264e15..c5bed3c92d2 100644
> --- a/gcc/testsuite/gcc.target/arm/bics_3.c
> +++ b/gcc/testsuite/gcc.target/arm/bics_3.c
> @@ -35,6 +35,6 @@ main (void)
>return 0;
>  }
>  
> -/* { dg-final { scan-assembler-times "bics\tr\[0-9\]+, r\[0-9\]+, r\[0-9\]+" 
> 2 } } */
> -/* { dg-final { scan-assembler-times "bics\tr\[0-9\]+, r\[0-9\]+, r\[0-9\]+, 
> .sl #2" 1 } } */
> +/* { dg-final { scan-assembler-times "bics?\tr\[0-9\]+, r\[0-9\]+, 
> r\[0-9\]+" 2 } } */
> +/* { dg-final { scan-assembler-times "bics?\tr\[0-9\]+, r\[0-9\]+, 
> r\[0-9\]+, .sl #2" 1 } } */
>  


The test was added (r6-823-g0454e698401a3e) specifically to check that a BICS 
instruction was being generated.  Whether or not that is right is somewhat 
debatable, but this change seems to be papering over a different issue.

Either we should generate BICS, making this change incorrect, or we should 
disable the test for thumb code on the basis that this isn't really a win.

But really, we should fix the compiler to do better here.  We really want 
something like

BICS  r0, r0, r1  // r0 is 0 or non-zero
MOVNE r0, #1  // convert all non-zero to 1

in Arm state (ie using the BICS instruction to set the result to zero); and in 
thumb2, perhaps something like:

BICS  r0, r0, r1
ITne
MOVNE r0, #1

or maybe even better:

BIC  r0, r0, r1
SUBS r1, r0, #1
SBC  r0, r0, r1

which is slightly better than BICS because SUBS breaks a condition-code chain 
(all the flag bits are set).

There are similar quality issues for other NE(arith-op, 0) cases; we just don't 
have tests for those.

R.

Re: [PATCH] arm: Fix missing bti instruction for virtual thunks

2024-01-24 Thread Richard Earnshaw (lists)

On 23/01/2024 15:53, Richard Ball wrote:
> Adds missing bti instruction at the beginning of a virtual
> thunk, when bti is enabled.
> 
> gcc/ChangeLog:
> 
>   * config/arm/arm.cc (arm_output_mi_thunk): Emit
>   insn for bti_c when bti is enabled.
> 
> gcc/testsuite/ChangeLog:
> 
> * g++.target/arm/bti_thunk.C: New test.


diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index 
e5a944486d7bd583627b0e22dfe8f95862e975bb..91eee8be7c1a59118fbf443557561fb3e0689d61
 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -29257,6 +29257,8 @@ arm_output_mi_thunk (FILE *file, tree thunk, 
HOST_WIDE_INT delta,
   const char *fnname = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (thunk));
 
   assemble_start_function (thunk, fnname);
+  if (aarch_bti_enabled ())
+emit_insn (aarch_gen_bti_c());

Missing space between ...bit_c and the parenthesis.

   if (TARGET_32BIT)
 arm32_output_mi_thunk (file, thunk, delta, vcall_offset, function);
   else

diff --git a/gcc/testsuite/g++.target/arm/bti_thunk.C 
b/gcc/testsuite/g++.target/arm/bti_thunk.C
new file mode 100644
index 
..5c4a8e5a8d74581eca2b877c000a5b34ddca0e9b
--- /dev/null
+++ b/gcc/testsuite/g++.target/arm/bti_thunk.C
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-march=armv8.1-m.main+pacbti -O1 -mbranch-protection=bti 
--save-temps" } */

You can't just add options like this; they may not work with other options 
passed by the testsuite framework.  Instead, you should a suitable entry to 
lib/target-supports.exp in the table starting "foreach { armfunc armflag 
armdefs } {" that tests whether the options can be safely added, and then use 
dg-require-effective-target and dg-add-options for your new set of options.

\ No newline at end of file

Please add one :)

R.

Re: [PATCH][GCC][Arm] Add pattern for bswap + rotate -> rev16 [Bug 108933]

2024-01-22 Thread Richard Earnshaw (lists)

On 22/01/2024 12:18, Matthieu Longo wrote:
> rev16 pattern was not recognised anymore as a change in the bswap tree
> pass was introducing a new GIMPLE form, not recognized by the assembly
> final transformation pass.
> 
> More details in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108933
> 
> gcc/ChangeLog:
> 
>     PR target/108933
>     * config/arm/arm.md (*arm_rev16si2_alt3): new pattern to convert
>   a bswap + rotate by 16 bits into rev16

ChangeLog entries need to be written as sentences, so start with a capital 
letter and end with a full stop; continuation lines should start in column 8 
(one hard tab, don't use spaces).  But in this case, "New pattern." is 
sufficient.

> 
> gcc/testsuite/ChangeLog:
> 
>     PR target/108933
>     * gcc.target/arm/rev16.c: Moved to...
>     * gcc.target/arm/rev16_1.c: ...here.
>     * gcc.target/arm/rev16_2.c: New test to check that rev16 is
>   emitted.

+;; Similar pattern to match (rotate (bswap) 16)
+(define_insn "*arm_rev16si2_alt3"
+  [(set (match_operand:SI 0 "register_operand" "=l,r")
+(rotate:SI (bswap:SI (match_operand:SI 1 "register_operand" "l,r"))
+ (const_int 16)))]
+  "arm_arch6"
+  "rev16\\t%0, %1"
+  [(set_attr "arch" "t,32")
+   (set_attr "length" "2,4")
+   (set_attr "type" "rev")]
+)
+

Unfortunately, this is insufficient.  When generating Arm or Thumb2 code (but 
not thumb1) we also have to handle conditional execution: we need to have '%?' 
in the output template at the point where a condition code might be needed.  
That means we need separate output templates for all three alternatives (as we 
need a 16-bit variant for thumb2 that's conditional and a 16-bit for thumb1 
that isn't).  See the output of arm_rev16 for a guide of what is really needed.

I note that the arm_rev16si2_alt1, and arm_rev16si2_alt2 patterns are incorrect 
in this regard as well; that will need fixing.

I also see that arm_rev16si2 currently expands to the alt1 variant above; given 
that the preferred canonical form would now appear to use bswap + rotate, we 
should change that as well.  In fact, we can merge your new pattern with the 
expand entirely and eliminate the need to call gen_arm_rev16si2_alt1.  
Something like:

(define_insn "arm_rev16si2"
  [(set (match_operand:SI 0 "s_register_operand")
(rotate:SI (bswap:SI (match_operand:SI 1 "s_register_operand")) 
(const_int 16))]
  "arm_arch6"
  "@
  rev16...
  ...

R.

Re: [PATCH] arm: Fix parsecpu.awk for aliases [PR113030]

2024-01-22 Thread Richard Earnshaw (lists)

On 21/01/2024 07:29, Andrew Pinski wrote:
> So the problem here is the 2 functions check_cpu and check_arch use
> the wrong variable to check if an alias is valid for that cpu/arch.
> check_cpu uses cpu_optaliases instead of cpu_opt_alias. cpu_optaliases
> is an array of index'ed by the cpuname that contains all of the valid aliases
> for that cpu but cpu_opt_alias is an double index array which is index'ed
> by cpuname and the alias which provides what is the alias for that option.
> Similar thing happens for check_arch and arch_optaliases vs arch_optaliases.
> 
> Tested by running:
> ```
> awk -f config/arm/parsecpu.awk -v cmd="chkarch armv7-a+simd" 
> config/arm/arm-cpus.in
> awk -f config/arm/parsecpu.awk -v cmd="chkarch armv7-a+neon" 
> config/arm/arm-cpus.in
> awk -f config/arm/parsecpu.awk -v cmd="chkarch armv7-a+neon-vfpv3" 
> config/arm/arm-cpus.in
> ```
> And they don't return error back.
> 
> gcc/ChangeLog:
> 
>   PR target/113030
>   * config/arm/parsecpu.awk (check_cpu): Use cpu_opt_alias
>   instead of cpu_optaliases.
>   (check_arch): Use arch_opt_alias instead of arch_optaliases.

OK

Thanks,

R.

> 
> Signed-off-by: Andrew Pinski 
> ---
>  gcc/config/arm/parsecpu.awk | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/config/arm/parsecpu.awk b/gcc/config/arm/parsecpu.awk
> index ddd4f3b440a..384462bdb5b 100644
> --- a/gcc/config/arm/parsecpu.awk
> +++ b/gcc/config/arm/parsecpu.awk
> @@ -529,7 +529,7 @@ function check_cpu (name) {
>  
>  for (n = 2; n <= exts; n++) {
>   if (!((cpu_name, extensions[n]) in cpu_opt_remove)  \
> - && !((cpu_name, extensions[n]) in cpu_optaliases)) {
> + && !((cpu_name, extensions[n]) in cpu_opt_alias)) {
>   return "error"
>   }
>  }
> @@ -552,7 +552,7 @@ function check_arch (name) {
>  
>  for (n = 2; n <= exts; n++) {
>   if (!((extensions[1], extensions[n]) in arch_opt_remove)\
> - && !((extensions[1], extensions[n]) in arch_optaliases)) {
> + && !((extensions[1], extensions[n]) in arch_opt_alias)) {
>   return "error"
>   }
>  }

Re: [PATCH v3 00/12] [GCC] arm: vld1q vst1 vst1q vst1 intrinsics

2024-01-12 Thread Richard Earnshaw (lists)

On 02/01/2024 09:23, ezra.sito...@arm.com wrote:
> From: Ezra Sitorus 
> 
> Add vld1q, vst1, vst1q and vst1 intrinsics to arm port.
> 
> Ezra Sitorus (12):
>   [GCC] arm: vld1q_types_x2 ACLE intrinsics
>   [GCC] arm: vld1q_types_x3 ACLE intrinsics
>   [GCC] arm: vld1q_types_x4 ACLE intrinsics
>   [GCC] arm: vst1_types_x2 ACLE intrinsics
>   [GCC] arm: vst1_types_x3 ACLE intrinsics
>   [GCC] arm: vst1_types_x4 ACLE intrinsics
>   [GCC] arm: vst1q_types_x2 ACLE intrinsics
>   [GCC] arm: vst1q_types_x3 ACLE intrinsics
>   [GCC] arm: vst1q_types_x4 ACLE intrinsics
>   [GCC] arm: vld1_types_x2 ACLE intrinsics
>   [GCC] arm: vld1_types_x3 ACLE intrinsics
>   [GCC] arm: vld1_types_x4 ACLE intrinsics
> 
>  gcc/config/arm/arm_neon.h | 2032 ++---
>  gcc/config/arm/arm_neon_builtins.def  |   12 +
>  gcc/config/arm/iterators.md   |6 +
>  gcc/config/arm/neon.md|  249 ++
>  gcc/config/arm/unspecs.md |8 +
>  .../gcc.target/arm/simd/vld1_base_xN_1.c  |  176 ++
>  .../gcc.target/arm/simd/vld1_bf16_xN_1.c  |   23 +
>  .../gcc.target/arm/simd/vld1_fp16_xN_1.c  |   23 +
>  .../gcc.target/arm/simd/vld1_p64_xN_1.c   |   23 +
>  .../gcc.target/arm/simd/vld1q_base_xN_1.c |  183 ++
>  .../gcc.target/arm/simd/vld1q_bf16_xN_1.c |   24 +
>  .../gcc.target/arm/simd/vld1q_fp16_xN_1.c |   24 +
>  .../gcc.target/arm/simd/vld1q_p64_xN_1.c  |   24 +
>  .../gcc.target/arm/simd/vst1_base_xN_1.c  |  176 ++
>  .../gcc.target/arm/simd/vst1_bf16_xN_1.c  |   22 +
>  .../gcc.target/arm/simd/vst1_fp16_xN_1.c  |   23 +
>  .../gcc.target/arm/simd/vst1_p64_xN_1.c   |   23 +
>  .../gcc.target/arm/simd/vst1q_base_xN_1.c |  185 ++
>  .../gcc.target/arm/simd/vst1q_bf16_xN_1.c |   24 +
>  .../gcc.target/arm/simd/vst1q_fp16_xN_1.c |   24 +
>  .../gcc.target/arm/simd/vst1q_p64_xN_1.c  |   24 +
>  21 files changed, 3018 insertions(+), 290 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1_base_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1_bf16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1_fp16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1_p64_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1q_base_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1q_bf16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1q_fp16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1q_p64_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1_base_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1_bf16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1_fp16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1_p64_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1q_base_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1q_bf16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1q_fp16_xN_1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1q_p64_xN_1.c
> 

Thanks, I've pushed this series.

Reviewing this series did highlight a couple of issues with the existing code 
base (not your patch); I'll follow up on these separately.

R.

Re: [PATCH v2 7/7] aarch64,arm: Move branch-protection data to targets

2024-01-11 Thread Richard Earnshaw (lists)

On 11/01/2024 14:43, Szabolcs Nagy wrote:
> The 12/07/2023 13:13, Richard Earnshaw wrote:
>> On 03/11/2023 15:36, Szabolcs Nagy wrote:
>>> * config/aarch64/aarch64.cc (aarch_handle_no_branch_protection): Copy.
>>> (aarch_handle_standard_branch_protection): Copy.
>>> (aarch_handle_pac_ret_protection): Copy.
>>> (aarch_handle_pac_ret_leaf): Copy.
>>> (aarch_handle_pac_ret_b_key): Copy.
>>> (aarch_handle_bti_protection): Copy.
>>
>> I think all of the above functions that have been moved back from
>> aarch-common should be renamed back to aarch64_..., unless they are directly
>> referenced statically by code in aarch-common.c.
> 
> done.
> 
>>> +const struct aarch_branch_protect_type aarch_branch_protect_types[] = {
>>
>> can this be made static now?  And maybe pass the structure as a parameter if
>> that's not done already.
> 
> done in v4.
> 
>> It would be nice if, when we raise an error, we could print out the list of
>> valid options (and modifiers), much like we do on Arm for -march/-mcpu.
>>
>> eg.
>> $ gcc -mcpu=crotex-a8
>> cc1: error: unrecognised -mcpu target: crotex-a8
>> cc1: note: valid arguments are: arm8 arm810 strongarm strongarm110 fa526
>> [...rest of list]; did you mean ‘cortex-a8’?
> 
> i implemented this with candidates_list_and_hint but it does
> not work very well if the typo is in a subtype, so i think
> this should be done in a separate patch if at all.
> 

I'd build the candidates list from all the types + subtypes, so that the 
suggestion code has a full list to pick from; but fair enough.

R.

Re: Discussion about arm/aarch64 testcase failures seen with patch for PR111673

2023-12-14 Thread Richard Earnshaw (lists) via Gcc

On 14/12/2023 07:17, Surya Kumari Jangala via Gcc wrote:
> Hi Richard,
> Thanks a lot for your response!
> 
> Another failure reported by the Linaro CI is as follows:
> 
> Running gcc:gcc.dg/dg.exp ...
> FAIL: gcc.dg/ira-shrinkwrap-prep-1.c scan-rtl-dump pro_and_epilogue 
> "Performing shrink-wrapping"
> FAIL: gcc.dg/pr10474.c scan-rtl-dump pro_and_epilogue "Performing 
> shrink-wrapping"
> 
> I analyzed the failures and the root cause is the same for both the failures.
> 
> The test pr10474.c is as follows:
> 
> void f(int *i)
> {
> if (!i)
> return;
> else
> {
> __builtin_printf("Hi");
> *i=0;
> }
> }
> 
> 
> With the patch (for PR111673), x1 (volatile) is being assigned to hold value 
> of
> x0 (first parameter). Since it is a volatile, x1 is saved to the stack as 
> there
> is a call later on. The save to the stack is generated in the LRA pass. The 
> save
> is generated in the entry basic block. Due to the usage of the stack pointer 
> in
> the entry bb, the testcase fails to be shrink wrapped.

I'm not entirely sure I understand what you mean from a quick glance.  Do you 
mean that X1 has the /v flag marked on it (ie it's printed in dumps as 
"reg/v")?  If so, that's not volatile, it just means that the register is 
associated with a user variable (as opposed to a compiler-generated temporary 
variable):

>From the manual:

@item REG_USERVAR_P (@var{x})
In a @code{reg}, nonzero if it corresponds to a variable present in
the user's source code.  Zero for temporaries generated internally by
the compiler.  Stored in the @code{volatil} field and printed as
@samp{/v}.

There are several other cases where we re-use this bit on different RTL 
constructs to mean things other than 'volatile': it pretty much only has the 
conventional meaning on MEM objects.

> 
> The reason why LRA generates the store insn in the entry bb is as follows:
> LRA emits insns to save volatile registers in the inheritance/splitting pass.
> In this pass, LRA builds EBBs (Extended Basic Block) and traverses the insns 
> in
> the EBBs in reverse order from the last insn to the first insn. When LRA sees 
> a
> write to a pseudo (that has been assigned a volatile register), and there is a
> read following the write, with an intervening call insn between the write and 
> read,
> then LRA generates a spill immediately after the write and a restore 
> immediately
> before the read. In the above test, there is an EBB containing the entry bb 
> and
> the bb with the printf call. In the entry bb, there is a write to x1 
> (basically
> a copy from x0 to x1) and in the printf bb, there is a read of x1 after the 
> call
> insn. So LRA generates a spill in the entry bb.
> 
> Without patch, x19 is chosen to hold the value of x0. Since x19 is a 
> non-volatile,
> the input RTL to the shrink wrap pass does not have any code to save x19 to 
> the
> stack. Only the insn that copies x0 to x19 is present in the entry bb. In the
> shrink wrap pass, this insn is moved down the cfg to the bb containing the 
> call
> to printf, thereby allowing prolog to be allocated only where needed. Thus 
> shrink
> wrap succeeds.
> 
> 
> Shrink wrap can be made to succeed if the save of x1 occurs just before the 
> call
> insn, instead of generating it after the write to x1. This will ensure that 
> the
> spill does not occur in the entry bb. In fact, it is more efficient if the 
> save
> occurs only in the path containing the printf call instead of occurring in the
> entry bb.
> 
> I have a patch (bootstrapped and regtested on powerpc) that makes changes in
> LRA to save volatile registers before a call instead of after the write to the
> volatile. With this patch, both the above tests pass.
> 
> Since the patch for PR111673 has been approved by Vladimir, I plan to
> commit the patch to trunk. And I will fix the test failures after doing the
> commit.
> 

I think I'd probably understand this better if you could give some example RTL 
(before and after).  Do you have that?

R.

> Regards,
> Surya
> 
> 
> 
> On 28/11/23 7:18 pm, Richard Earnshaw wrote:
>>
>>
>> On 28/11/2023 12:52, Surya Kumari Jangala wrote:
>>> Hi Richard,
>>> Thanks a lot for your response!
>>>
>>> Another failure reported by the Linaro CI is as follows :
>>> (Note: I am planning to send a separate mail for each failure, as this will 
>>> make
>>> the discussion easy to track)
>>>
>>> FAIL: gcc.target/aarch64/sve/acle/general/cpy_1.c -march=armv8.2-a+sve 
>>> -moverride=tune=none  check-function-bodies dup_x0_m
>>>
>>> Expected code:
>>>
>>>    ...
>>>    add (x[0-9]+), x0, #?1
>>>    mov (p[0-7])\.b, p15\.b
>>>    mov z0\.d, \2/m, \1
>>>    ...
>>>    ret
>>>
>>>
>>> Code obtained w/o patch:
>>>  addvl   sp, sp, #-1
>>>  str p15, [sp]
>>>  add x0, x0, 1
>>>  mov p3.b, p15.b
>>>  mov z0.d, p3/m, x0
>>>  ldr p15, [sp]

Re: [PATCH v7] libgfortran: Replace mutex with rwlock

2023-12-14 Thread Richard Earnshaw (lists)

On 09/12/2023 15:39, Lipeng Zhu wrote:
> This patch try to introduce the rwlock and split the read/write to
> unit_root tree and unit_cache with rwlock instead of the mutex to
> increase CPU efficiency. In the get_gfc_unit function, the percentage
> to step into the insert_unit function is around 30%, in most instances,
> we can get the unit in the phase of reading the unit_cache or unit_root
> tree. So split the read/write phase by rwlock would be an approach to
> make it more parallel.
> 
> BTW, the IPC metrics can gain around 9x in our test
> server with 220 cores. The benchmark we used is
> https://github.com/rwesson/NEAT
> 
> libgcc/ChangeLog:
> 
>   * gthr-posix.h (__GTHREAD_RWLOCK_INIT): New macro.
>   (__gthrw): New function.
>   (__gthread_rwlock_rdlock): New function.
>   (__gthread_rwlock_tryrdlock): New function.
>   (__gthread_rwlock_wrlock): New function.
>   (__gthread_rwlock_trywrlock): New function.
>   (__gthread_rwlock_unlock): New function.
> 
> libgfortran/ChangeLog:
> 
>   * io/async.c (DEBUG_LINE): New macro.
>   * io/async.h (RWLOCK_DEBUG_ADD): New macro.
>   (CHECK_RDLOCK): New macro.
>   (CHECK_WRLOCK): New macro.
>   (TAIL_RWLOCK_DEBUG_QUEUE): New macro.
>   (IN_RWLOCK_DEBUG_QUEUE): New macro.
>   (RDLOCK): New macro.
>   (WRLOCK): New macro.
>   (RWUNLOCK): New macro.
>   (RD_TO_WRLOCK): New macro.
>   (INTERN_RDLOCK): New macro.
>   (INTERN_WRLOCK): New macro.
>   (INTERN_RWUNLOCK): New macro.
>   * io/io.h (struct gfc_unit): Change UNIT_LOCK to UNIT_RWLOCK in
>   a comment.
>   (unit_lock): Remove including associated internal_proto.
>   (unit_rwlock): New declarations including associated internal_proto.
>   (dec_waiting_unlocked): Use WRLOCK and RWUNLOCK on unit_rwlock
>   instead of __gthread_mutex_lock and __gthread_mutex_unlock on
>   unit_lock.
>   * io/transfer.c (st_read_done_worker): Use WRLOCK and RWUNLOCK on
>   unit_rwlock instead of LOCK and UNLOCK on unit_lock.
>   (st_write_done_worker): Likewise.
>   * io/unit.c: Change UNIT_LOCK to UNIT_RWLOCK in 'IO locking rules'
>   comment. Use unit_rwlock variable instead of unit_lock variable.
>   (get_gfc_unit_from_unit_root): New function.
>   (get_gfc_unit): Use RDLOCK, WRLOCK and RWUNLOCK on unit_rwlock
>   instead of LOCK and UNLOCK on unit_lock.
>   (close_unit_1): Use WRLOCK and RWUNLOCK on unit_rwlock instead of
>   LOCK and UNLOCK on unit_lock.
>   (close_units): Likewise.
>   (newunit_alloc): Use RWUNLOCK on unit_rwlock instead of UNLOCK on
>   unit_lock.
>   * io/unix.c (find_file): Use RDLOCK and RWUNLOCK on unit_rwlock
>   instead of LOCK and UNLOCK on unit_lock.
>   (flush_all_units): Use WRLOCK and RWUNLOCK on unit_rwlock instead
>   of LOCK and UNLOCK on unit_lock.
> 

It looks like this has broken builds on arm-none-eabi when using newlib:

In file included from /work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran
/runtime/error.c:27:
/work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/io.h: In function 
‘dec_waiting_unlocked’:
/work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/io.h:1023:3: error
: implicit declaration of function ‘WRLOCK’ [-Wimplicit-function-declaration]
 1023 |   WRLOCK (_rwlock);
  |   ^~
/work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/io.h:1025:3: error
: implicit declaration of function ‘RWUNLOCK’ [-Wimplicit-function-declaration]
 1025 |   RWUNLOCK (_rwlock);
  |   ^~~~


R.

> ---
> v1 -> v2:
> Limit the pthread_rwlock usage in libgcc only when __cplusplus isn't defined.
> 
> v2 -> v3:
> Rebase the patch with trunk branch.
> 
> v3 -> v4:
> Update the comments.
> 
> v4 -> v5:
> Fix typos and code formatter.
> 
> v5 -> v6:
> Add unit tests.
> 
> v6 -> v7:
> Update ChangeLog and code formatter.
> 
> Reviewed-by: Hongjiu Lu 
> Reviewed-by: Bernhard Reutner-Fischer 
> Reviewed-by: Thomas Koenig 
> Reviewed-by: Jakub Jelinek 
> Signed-off-by: Lipeng Zhu 
> ---
>  libgcc/gthr-posix.h   |  60 +++
>  libgfortran/io/async.c|   4 +
>  libgfortran/io/async.h| 151 ++
>  libgfortran/io/io.h   |  15 +-
>  libgfortran/io/transfer.c |   8 +-
>  libgfortran/io/unit.c | 117 +-
>  libgfortran/io/unix.c |  16 +-
>  .../testsuite/libgomp.fortran/rwlock_1.f90|  33 
>  .../testsuite/libgomp.fortran/rwlock_2.f90|  22 +++
>  .../testsuite/libgomp.fortran/rwlock_3.f90|  18 +++
>  10 files changed, 386 insertions(+), 58 deletions(-)
>  create mode 100644 libgomp/testsuite/libgomp.fortran/rwlock_1.f90
>  create mode 100644 libgomp/testsuite/libgomp.fortran/rwlock_2.f90
>  create mode 100644 libgomp/testsuite/libgomp.fortran/rwlock_3.f90
> 
> diff --git

Re: [PATCH] testsuite, arm: Fix up pr112337.c test

2023-12-01 Thread Richard Earnshaw (lists)

On 01/12/2023 13:45, Christophe Lyon wrote:
> On Fri, 1 Dec 2023 at 13:44, Richard Earnshaw (lists)
>  wrote:
>>
>> On 01/12/2023 11:28, Saurabh Jha wrote:
>>> Hey,
>>>
>>> I introduced this test "gcc/testsuite/gcc.target/arm/mve/pr112337.c" in 
>>> this commit 2365aae84de030bbb006edac18c9314812fc657b before. This had an 
>>> error which I unfortunately missed. This patch fixes that test.
>>>
>>> Did regression testing on arm-none-eabi and found no regressions. Output of 
>>> running gcc/contrib/compare_tests is this:
>>>
>>> """
>>> Tests that now work, but didn't before (2 tests):
>>>
>>> arm-eabi-aem/-marm/-march=armv7-a/-mfpu=vfpv3-d16/-mfloat-abi=softfp: 
>>> gcc.target/arm/mve/pr112337.c (test for excess errors)
>>> arm-eabi-aem/-mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard:
>>>  gcc.target/arm/mve/pr112337.c (test for excess errors)
>>> """
>>>
>>> Ok for trunk? I don't have commit access so could someone please commit on 
>>> my behalf?
>>>
>>> Regards,
>>> Saurabh
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * gcc.target/arm/mve/pr112337.c: Fix the testcase
>>
>>
>> Hmm, could this be related to the changes Christophe made recently to change 
>> the way MVE vector types were set up internally?  If so, this might indicate 
>> an issue that's going to affect real users with existing code.
>>
> 
> My change was only about vector types, here the problem is with a
> pointer to a scalar.
> Anyway, I ran the test with my commit reverted and it still fails in
> the same way, so I think this patch is needed.
> 
> Thanks,
> 
> Christophe
> 
>> Christophe?
>>
>> R.

Ok, thanks for checking.  In that case, Saurabh, your patch is OK, but please 
change 'Fix testcase' to 'Use int32_t instead of int.'

Note that ChangeLog entries end with a full stop.

R.

Re: [PATCH] testsuite, arm: Fix up pr112337.c test

2023-12-01 Thread Richard Earnshaw (lists)

On 01/12/2023 11:28, Saurabh Jha wrote:
> Hey,
> 
> I introduced this test "gcc/testsuite/gcc.target/arm/mve/pr112337.c" in this 
> commit 2365aae84de030bbb006edac18c9314812fc657b before. This had an error 
> which I unfortunately missed. This patch fixes that test.
> 
> Did regression testing on arm-none-eabi and found no regressions. Output of 
> running gcc/contrib/compare_tests is this:
> 
> """
> Tests that now work, but didn't before (2 tests):
> 
> arm-eabi-aem/-marm/-march=armv7-a/-mfpu=vfpv3-d16/-mfloat-abi=softfp: 
> gcc.target/arm/mve/pr112337.c (test for excess errors)
> arm-eabi-aem/-mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard:
>  gcc.target/arm/mve/pr112337.c (test for excess errors)
> """
> 
> Ok for trunk? I don't have commit access so could someone please commit on my 
> behalf?
> 
> Regards,
> Saurabh
> 
> gcc/testsuite/ChangeLog:
> 
>     * gcc.target/arm/mve/pr112337.c: Fix the testcase


Hmm, could this be related to the changes Christophe made recently to change 
the way MVE vector types were set up internally?  If so, this might indicate an 
issue that's going to affect real users with existing code.

Christophe?

R.

Re: [PATCH] AArch64/testsuite: Use non-capturing parentheses with ccmp_1.c

2023-11-22 Thread Richard Earnshaw (lists)

On 22/11/2023 15:21, Maciej W. Rozycki wrote:
> Use non-capturing parentheses for the subexpressions used with 
> `scan-assembler-times', to avoid a quirk with double-counting.
> 
>   gcc/testsuite/
>   * gcc.target/aarch64/ccmp_1.c: Use non-capturing parentheses 
>   with `scan-assembler-times'.

OK

R.

> ---
> Hi,
> 
>  Here's another one.  I realised my original regexp used to grep the tree 
> for `scan-assembler-times' with subexpressions was too strict and with an 
> updated pattern I found this second test case that does regress once the 
> `scan-assembler-times' double-counting quirk has been fixed.
> 
>  As with the ARM change we don't need capturing parentheses here, usually 
> used for back references, so let's just avoid the double-counting quirk 
> altogether and make our matching here work whether the quirk has been 
> fixed or not.
> 
>  Verified for the `aarch64-linux-gnu' target with the quirk fix submitted 
> as  
> and the aarch64.exp subset of the C language test suite.  OK to apply?
> 
>   Maciej
> ---
>  gcc/testsuite/gcc.target/aarch64/ccmp_1.c |4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> gcc-aarch64-test-ccmp_1-non-capturing.diff
> Index: gcc/gcc/testsuite/gcc.target/aarch64/ccmp_1.c
> ===
> --- gcc.orig/gcc/testsuite/gcc.target/aarch64/ccmp_1.c
> +++ gcc/gcc/testsuite/gcc.target/aarch64/ccmp_1.c
> @@ -86,8 +86,8 @@ f13 (int a, int b)
>  /* { dg-final { scan-assembler "cmp\t(.)+35" } } */
>  
>  /* { dg-final { scan-assembler-times "\tcmp\tw\[0-9\]+, 0" 4 } } */
> -/* { dg-final { scan-assembler-times "fcmpe\t(.)+0\\.0" 2 } } */
> -/* { dg-final { scan-assembler-times "fcmp\t(.)+0\\.0" 2 } } */
> +/* { dg-final { scan-assembler-times "fcmpe\t(?:.)+0\\.0" 1 } } */
> +/* { dg-final { scan-assembler-times "fcmp\t(?:.)+0\\.0" 1 } } */
>  
>  /* { dg-final { scan-assembler "adds\t" } } */
>  /* { dg-final { scan-assembler-times "\tccmp\t" 11 } } */

Re: [PATCH 10/11] aarch64: Fix branch-protection error message tests

2023-10-13 Thread Richard Earnshaw (lists)

On 05/09/2023 16:00, Richard Sandiford via Gcc-patches wrote:
> Szabolcs Nagy  writes:
>> Update tests for the new branch-protection parser errors.
>>
>> gcc/testsuite/ChangeLog:
>>
>>  * gcc.target/aarch64/branch-protection-attr.c: Update.
>>  * gcc.target/aarch64/branch-protection-option.c: Update.
> 
> OK, thanks.  (And I agree these are better messages. :))
> 
> I think that's the last of the AArch64-specific ones.  The others
> will need to be reviewed by Kyrill or Richard.
> 
> Richard
> 
>> ---
>>  gcc/testsuite/gcc.target/aarch64/branch-protection-attr.c   | 6 +++---
>>  gcc/testsuite/gcc.target/aarch64/branch-protection-option.c | 2 +-
>>  2 files changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/gcc/testsuite/gcc.target/aarch64/branch-protection-attr.c 
>> b/gcc/testsuite/gcc.target/aarch64/branch-protection-attr.c
>> index 272000c2747..dae2a758a56 100644
>> --- a/gcc/testsuite/gcc.target/aarch64/branch-protection-attr.c
>> +++ b/gcc/testsuite/gcc.target/aarch64/branch-protection-attr.c
>> @@ -4,19 +4,19 @@ void __attribute__ ((target("branch-protection=leaf")))
>>  foo1 ()
>>  {
>>  }
>> -/* { dg-error {invalid protection type 'leaf' in 
>> 'target\("branch-protection="\)' pragma or attribute} "" { target *-*-* } 5 
>> } */
>> +/* { dg-error {invalid argument 'leaf' for 
>> 'target\("branch-protection="\)'} "" { target *-*-* } 5 } */
>>  /* { dg-error {pragma or attribute 'target\("branch-protection=leaf"\)' is 
>> not valid} "" { target *-*-* } 5 } */

'leaf' is really a modifier for the other branch protection strategies; perhaps 
it would be better to describe it as that.

But this brings up another issue/question.  If the compiler has been configured 
with, say, '--enable-branch-protection=standard' or some other variety, is 
there (or do we want) a way to extend that to leaf functions without changing 
the underlying strategy?

>>  
>>  void __attribute__ ((target("branch-protection=none+pac-ret")))
>>  foo2 ()
>>  {
>>  }
>> -/* { dg-error "unexpected 'pac-ret' after 'none'" "" { target *-*-* } 12 } 
>> */
>> +/* { dg-error {argument 'none' can only appear alone in 
>> 'target\("branch-protection="\)'} "" { target *-*-* } 12 } */

Or maybe better still: "branch protection strategies 'none' and 'pac-ret' are 
incompatible".

>>  /* { dg-error {pragma or attribute 
>> 'target\("branch-protection=none\+pac-ret"\)' is not valid} "" { target 
>> *-*-* } 12 } */
>>  
>>  void __attribute__ ((target("branch-protection=")))
>>  foo3 ()
>>  {
>>  }
>> -/* { dg-error {missing argument to 'target\("branch-protection="\)' pragma 
>> or attribute} "" { target *-*-* } 19 } */
>> +/* { dg-error {invalid argument '' for 'target\("branch-protection="\)'} "" 
>> { target *-*-* } 19 } */
>>  /* { dg-error {pragma or attribute 'target\("branch-protection="\)' is not 
>> valid} "" { target *-*-* } 19 } */
>> diff --git a/gcc/testsuite/gcc.target/aarch64/branch-protection-option.c 
>> b/gcc/testsuite/gcc.target/aarch64/branch-protection-option.c
>> index 1b3bf4ee2b8..e2f847a31c4 100644
>> --- a/gcc/testsuite/gcc.target/aarch64/branch-protection-option.c
>> +++ b/gcc/testsuite/gcc.target/aarch64/branch-protection-option.c
>> @@ -1,4 +1,4 @@
>>  /* { dg-do "compile" } */
>>  /* { dg-options "-mbranch-protection=leaf -mbranch-protection=none+pac-ret" 
>> } */
>>  
>> -/* { dg-error "unexpected 'pac-ret' after 'none'"  "" { target *-*-* } 0 } 
>> */
>> +/* { dg-error "argument 'none' can only appear alone in 
>> '-mbranch-protection='" "" { target *-*-* } 0 } */

But this is all a matter of taste.

However, this patch should be merged with the patch that changes the error 
messages.  Or has that already gone in?

R

Re: Principles of the C99 testsuite conversion

2023-10-11 Thread Richard Earnshaw (lists)

On 11/10/2023 14:56, Jeff Law wrote:
> 
> 
> On 10/11/23 04:39, Florian Weimer wrote:
>> I've started to look at what it is required to convert the testsuite to
>> C99 (without implicit ints, without implicit function declarations, and
>> a few other legacy language features).
> I bet those older tests originating from c-torture will be a bit painful.  
> Torbjorn liked having them minimized, to the point of squashing out nearly 
> everything he considered extraneous.  I'd bet many of those older tests are 
> going to need lots of changes.
> 

I've often wondered just how much of the original c-torture suite is still 
relevant today.  Most of those tests were written at a  time when the compiler 
expanded tree directly into RTL and I suspect that today the tests never get 
even close to tickling the original bug they were intended to validate.

R.

Re: Register allocation cost question

2023-10-11 Thread Richard Earnshaw (lists) via Gcc

On 11/10/2023 09:58, Andrew Stubbs wrote:
> On 11/10/2023 07:54, Chung-Lin Tang wrote:
>>
>>
>> On 2023/10/10 11:11 PM, Andrew Stubbs wrote:
>>> Hi all,
>>>
>>> I'm trying to add a new register set to the GCN port, but I've hit a
>>> problem I don't understand.
>>>
>>> There are 256 new registers (each 2048 bit vector register) but the
>>> register file has to be divided between all the running hardware
>>> threads; if you can use fewer registers you can get more parallelism,
>>> which means that it's important that they're allocated in order.
>>>
>>> The problem is that they're not allocated in order. Somehow the IRA pass
>>> is calculating different costs for the registers within the class. It
>>> seems to prefer registers a32, a96, a160, and a224.
>>>
>>> The internal regno are 448, 512, 576, 640. These are not random numbers!
>>> They all have zero for the 6 LSB.
>>>
>>> What could cause this? Did I overrun some magic limit? What target hook
>>> might I have miscoded?
>>>
>>> I'm also seeing wrong-code bugs when I allow more than 32 new registers,
>>> but that might be an unrelated problem. Or the allocation is broken? I'm
>>> still analyzing this.
>>>
>>> If it matters, ... the new registers can't be used for general purposes,
>>> so I'm trying to set them up as a temporary spill destination. This
>>> means they're typically not busy. It feels like it shouldn't be this
>>> hard... :(
>>
>> Have you tried experimenting with REG_ALLOC_ORDER? I see that the GCN port 
>> currently isn't using this target macro.
> 
> The default definition is 0,1,2,3,4 and is already the desired behaviour.
> 
> Andrew

You may need to define HONOR_REG_ALLOC_ORDER though.

Re: [PATCH 6/6] aarch64: Add front-end argument type checking for target builtins

2023-10-10 Thread Richard Earnshaw (lists)

On 09/10/2023 14:12, Victor Do Nascimento wrote:
> 
> 
> On 10/7/23 12:53, Richard Sandiford wrote:
>> Richard Earnshaw  writes:
>>> On 03/10/2023 16:18, Victor Do Nascimento wrote:
 In implementing the ACLE read/write system register builtins it was
 observed that leaving argument type checking to be done at expand-time
 meant that poorly-formed function calls were being "fixed" by certain
 optimization passes, meaning bad code wasn't being properly picked up
 in checking.

 Example:

     const char *regname = "amcgcr_el0";
     long long a = __builtin_aarch64_rsr64 (regname);

 is reduced by the ccp1 pass to

     long long a = __builtin_aarch64_rsr64 ("amcgcr_el0");

 As these functions require an argument of STRING_CST type, there needs
 to be a check carried out by the front-end capable of picking this up.

 The introduced `check_general_builtin_call' function will be called by
 the TARGET_CHECK_BUILTIN_CALL hook whenever a call to a builtin
 belonging to the AARCH64_BUILTIN_GENERAL category is encountered,
 carrying out any appropriate checks associated with a particular
 builtin function code.
>>>
>>> Doesn't this prevent reasonable wrapping of the __builtin... names with
>>> something more palatable?  Eg:
>>>
>>> static inline __attribute__(("always_inline")) long long get_sysreg_ll
>>> (const char *regname)
>>> {
>>>     return __builtin_aarch64_rsr64 (regname);
>>> }
>>>
>>> ...
>>>     long long x = get_sysreg_ll("amcgcr_el0");
>>> ...
>>
>> I think it's case of picking your poison.  If we didn't do this,
>> and only checked later, then it's unlikely that GCC and Clang would
>> be consistent about when a constant gets folded soon enough.
>>
>> But yeah, it means that the above would need to be a macro in C.
>> Enlightened souls using C++ could instead do:
>>
>>    template
>>    long long get_sysreg_ll()
>>    {
>>  return __builtin_aarch64_rsr64(regname);
>>    }
>>
>>    ... get_sysreg_ll<"amcgcr_el0">() ...
>>
>> Or at least I hope so.  Might be nice to have a test for this.
>>
>> Thanks,
>> Richard
> 
> As Richard Earnshaw mentioned, this does break the use of `static inline 
> __attribute__(("always_inline"))', something I had found out in my testing.  
> My chosen implementation was indeed, to quote Richard Sandiford, a case of 
> "picking your poison" to have things line up with Clang and behaving 
> consistently across optimization levels.
> 
> Relaxing the the use of `TARGET_CHECK_BUILTIN_CALL' meant optimizations were 
> letting too many things through. Example:
> 
> const char *regname = "amcgcr_el0";
> long long a = __builtin_aarch64_rsr64 (regname);
> 
> gets folded to
> 
> long long a = __builtin_aarch64_rsr64 ("amcgcr_el0");
> 
> and compilation passes at -01 even though it fails at -O0.
> 
> I had, however, not given any thought to the use of a template as a valid C++ 
> alternative.
> 
> I will evaluate the use of templates and add tests accordingly.

This just seems inconsistent with all the builtins we already have that require 
literal constants for parameters.  For example (to pick just one of many), 
vshr_n_q8(), where the second parameter must be a literal value.  In practice 
we accept anything that resolves to a compile-time constant integer expression 
and rely on that to avoid having to have hundreds of macros binding the ACLE 
names to the underlying builtin equivalents.

Furthermore, I don't really see the problem with the examples you cite.  It's 
not as though the user can change these at run-time and expect to get a 
different register.

R.

> 
> Cheers,
> Victor

Re: Documenting common C/C++ options

2023-10-10 Thread Richard Earnshaw (lists) via Gcc

On 10/10/2023 11:46, Richard Earnshaw (lists) via Gcc wrote:
> On 10/10/2023 10:47, Florian Weimer via Gcc wrote:
>> Currently, -fsigned-char and -funsigned-char are only documented as C
>> language options, although they work for C++ as well (and Objective-C
>> and Objective-C++, I assume, but I have not tested this).  There does
>> not seem to be a place for this kind of options in the manual.
>>
>> The options -fshort-enums and -fshort-wchar are documented under
>> code-generation options, but this seems to be a bit of a stretch because
>> (at least for -fshort-wchar), these too seem to be more about front-end
>> behavior.
>>
>> What would be a good way to address this?
>>
>> Thanks,
>> Florian
>>
> 
> 
> All of these are ABI; so where ever it goes, it should be documented that 
> changing them will potentially cause issues with any pre-compiled object 
> files having different settings.
> 
> R.

And you can add -f[un]signed-bitfield to that list as well.

R.

Re: Documenting common C/C++ options

2023-10-10 Thread Richard Earnshaw (lists) via Gcc

On 10/10/2023 10:47, Florian Weimer via Gcc wrote:
> Currently, -fsigned-char and -funsigned-char are only documented as C
> language options, although they work for C++ as well (and Objective-C
> and Objective-C++, I assume, but I have not tested this).  There does
> not seem to be a place for this kind of options in the manual.
> 
> The options -fshort-enums and -fshort-wchar are documented under
> code-generation options, but this seems to be a bit of a stretch because
> (at least for -fshort-wchar), these too seem to be more about front-end
> behavior.
> 
> What would be a good way to address this?
> 
> Thanks,
> Florian
> 


All of these are ABI; so where ever it goes, it should be documented that 
changing them will potentially cause issues with any pre-compiled object files 
having different settings.

R.

Re: [PATCH v2] Add a GCC Security policy

2023-10-05 Thread Richard Earnshaw (lists)

On 28/09/2023 12:55, Siddhesh Poyarekar wrote:
> +Security features implemented in GCC
> +
> +
[...]
> +
> +Similarly, GCC may transform code in a way that the correctness of
> +the expressed algorithm is preserved, but supplementary properties
> +that are not specifically expressible in a high-level language
> +are not preserved. Examples of such supplementary properties
> +include absence of sensitive data in the program's address space
> +after an attempt to wipe it, or data-independent timing of code.
> +When the source code attempts to express such properties, failure
> +to preserve them in resulting machine code is not a security issue
> +in GCC.

I think it would be worth mentioning here that compilers interpret source code 
according to an abstract machine defined by the source language.  Properties of 
a program that cannot be described in the abstract machine may not be 
translated into the generated machine code.

This is, fundamentally, describing the 'as if' rule.

R.

Re: [PATCH] AArch64: Remove BTI from outline atomics

2023-09-26 Thread Richard Earnshaw (lists)

On 26/09/2023 14:46, Wilco Dijkstra wrote:
> 
> The outline atomic functions have hidden visibility and can only be called
> directly.  Therefore we can remove the BTI at function entry.  This improves
> security by reducing the number of indirect entry points in a binary.
> The BTI markings on the objects are still emitted.

Please can you add a comment to that effect in the source code.  OK with that 
change.

R.

> 
> Passes regress, OK for commit?
> 
> libgcc/ChangeLog:
>     * config/aarch64/lse.S (BTI_C): Remove define.
> 
> ---
> 
> diff --git a/libgcc/config/aarch64/lse.S b/libgcc/config/aarch64/lse.S
> index 
> ba05047ff02b6fc5752235bffa924fc4a2f48c04..dbfb83fb09083641bf06c50b631a5f27bdf61b80
>  100644
> --- a/libgcc/config/aarch64/lse.S
> +++ b/libgcc/config/aarch64/lse.S
> @@ -163,8 +163,6 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
> If not, see
>  #define tmp3    14
>  #define tmp4    13
>  
> -#define BTI_C  hint    34
> -
>  /* Start and end a function.  */
>  .macro  STARTFN name
>  .text
> @@ -174,7 +172,6 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
> If not, see
>  .type   \name, %function
>  .cfi_startproc
>  \name:
> -   BTI_C
>  .endm
>  
>  .macro  ENDFN name

Re: [PATCH] AArch64: Fix strict-align cpymem/setmem [PR103100]

2023-09-20 Thread Richard Earnshaw (lists)

On 20/09/2023 14:50, Wilco Dijkstra wrote:
> 
> The cpymemdi/setmemdi implementation doesn't fully support strict alignment.
> Block the expansion if the alignment is less than 16 with STRICT_ALIGNMENT.
> Clean up the condition when to use MOPS.
> 
> Passes regress/bootstrap, OK for commit?
> 
> gcc/ChangeLog/
> PR target/103100
> * config/aarch64/aarch64.md (cpymemdi): Remove pattern condition.

Shouldn't this be a separate patch?  It's not immediately obvious that this is 
a necessary part of this change.

> (setmemdi): Likewise.
> * config/aarch64/aarch64.cc (aarch64_expand_cpymem): Support
> strict-align.  Cleanup condition for using MOPS.
> (aarch64_expand_setmem): Likewise.
> 
> ---
> 
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 
> dd6874d13a75f20d10a244578afc355b25c73da2..8f3bfb91c0f4ec43f37fe9289a66092a29a47e4d
>  100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -25261,27 +25261,23 @@ aarch64_expand_cpymem (rtx *operands)
>int mode_bits;
>rtx dst = operands[0];
>rtx src = operands[1];
> +  unsigned align = INTVAL (operands[3]);

This should read the value with UINTVAL.  Given the useful range of the 
alignment, it should be OK that we're not using unsigned HWI.

>rtx base;
>machine_mode cur_mode = BLKmode;
> +  bool size_p = optimize_function_for_size_p (cfun);
>  
> -  /* Variable-sized memcpy can go through the MOPS expansion if available.  
> */
> -  if (!CONST_INT_P (operands[2]))
> +  /* Variable-sized or strict-align copies may use the MOPS expansion.  */
> +  if (!CONST_INT_P (operands[2]) || (STRICT_ALIGNMENT && align < 16))
>  return aarch64_expand_cpymem_mops (operands);

So what about align=4 and copying, for example, 8 or 12 bytes; wouldn't we want 
a sequence of LDR/STR in that case?  Doesn't this fall back to MOPS too eagerly?


>  
>unsigned HOST_WIDE_INT size = INTVAL (operands[2]);
>  
> -  /* Try to inline up to 256 bytes or use the MOPS threshold if available.  
> */
> -  unsigned HOST_WIDE_INT max_copy_size
> -= TARGET_MOPS ? aarch64_mops_memcpy_size_threshold : 256;
> -
> -  bool size_p = optimize_function_for_size_p (cfun);
> +  /* Try to inline up to 256 bytes.  */
> +  unsigned max_copy_size = 256;
> +  unsigned max_mops_size = aarch64_mops_memcpy_size_threshold;

I find this name slightly confusing.  Surely it's min_mops_size (since above 
that we want to use MOPS rather than inlined loads/stores).  But why not just 
use aarch64_mops_memcpy_size_threshold directly in the one place it's used?

>  
> -  /* Large constant-sized cpymem should go through MOPS when possible.
> - It should be a win even for size optimization in the general case.
> - For speed optimization the choice between MOPS and the SIMD sequence
> - depends on the size of the copy, rather than number of instructions,
> - alignment etc.  */
> -  if (size > max_copy_size)
> +  /* Large copies use MOPS when available or a library call.  */
> +  if (size > max_copy_size || (TARGET_MOPS && size > max_mops_size))
>  return aarch64_expand_cpymem_mops (operands);
>  
>int copy_bits = 256;
> @@ -25445,12 +25441,13 @@ aarch64_expand_setmem (rtx *operands)

Similar comments apply to this code as well.

>unsigned HOST_WIDE_INT len;
>rtx dst = operands[0];
>rtx val = operands[2], src;
> +  unsigned align = INTVAL (operands[3]);
>rtx base;
>machine_mode cur_mode = BLKmode, next_mode;
>  
> -  /* If we don't have SIMD registers or the size is variable use the MOPS
> - inlined sequence if possible.  */
> -  if (!CONST_INT_P (operands[1]) || !TARGET_SIMD)
> +  /* Variable-sized or strict-align memset may use the MOPS expansion.  */
> +  if (!CONST_INT_P (operands[1]) || !TARGET_SIMD
> +  || (STRICT_ALIGNMENT && align < 16))
>  return aarch64_expand_setmem_mops (operands);
>  
>bool size_p = optimize_function_for_size_p (cfun);
> @@ -25458,10 +25455,13 @@ aarch64_expand_setmem (rtx *operands)

And here.

>/* Default the maximum to 256-bytes when considering only libcall vs
>   SIMD broadcast sequence.  */
>unsigned max_set_size = 256;
> +  unsigned max_mops_size = aarch64_mops_memset_size_threshold;
>  
>len = INTVAL (operands[1]);
> -  if (len > max_set_size && !TARGET_MOPS)
> -return false;
> +
> +  /* Large memset uses MOPS when available or a library call.  */
> +  if (len > max_set_size || (TARGET_MOPS && len > max_mops_size))
> +return aarch64_expand_setmem_mops (operands);
>  
>int cst_val = !!(CONST_INT_P (val) && (INTVAL (val) != 0));
>/* The MOPS sequence takes:
> @@ -25474,12 +25474,6 @@ aarch64_expand_setmem (rtx *operands)
>   the arguments + 1 for the call.  */
>unsigned libcall_cost = 4;
>  
> -  /* Upper bound check.  For large constant-sized setmem use the MOPS 
> sequence
> - when available.  */
> -  if (TARGET_MOPS
> -  && len >=

Re: [PATCH 2/2] libstdc++: Add dg-require-thread-fence in several tests

2023-09-11 Thread Richard Earnshaw (lists) via Gcc-patches

On 11/09/2023 16:22, Jonathan Wakely via Gcc-patches wrote:
> On Mon, 11 Sept 2023 at 14:57, Christophe Lyon
>  wrote:
>>
>>
>>
>> On Mon, 11 Sept 2023 at 15:12, Jonathan Wakely  wrote:
>>>
>>> On Mon, 11 Sept 2023 at 13:36, Christophe Lyon
>>>  wrote:



 On Mon, 11 Sept 2023 at 12:59, Jonathan Wakely  wrote:
>
> On Sun, 10 Sept 2023 at 20:31, Christophe Lyon
>  wrote:
>>
>> Some targets like arm-eabi with newlib and default settings rely on
>> __sync_synchronize() to ensure synchronization.  Newlib does not
>> implement it by default, to make users aware they have to take special
>> care.
>>
>> This makes a few tests fail to link.
>
> Does this mean those features are unusable on the target, or just that
> users need to provide their own __sync_synchronize to use them?


 IIUC the user is expected to provide them.
 Looks like we discussed this in the past :-)
 In  https://gcc.gnu.org/legacy-ml/gcc-patches/2016-10/msg01632.html,
 see the pointer to Ramana's comment: 
 https://gcc.gnu.org/ml/gcc-patches/2015-05/msg02751.html
>>>
>>> Oh yes, thanks for the reminder!
>>>

 The default arch for arm-eabi is armv4t which is very old.
 When running the testsuite with something more recent (either as default 
 by configuring GCC --with-arch=XXX or by forcing -march/-mcpu via 
 dejagnu's target-board), the compiler generates barrier instructions and 
 there are no such errors.
>>>
>>> Ah yes, that's fine then.
>>>
 For instance, here is a log with the defaults:
 https://git.linaro.org/toolchain/ci/base-artifacts/tcwg_gnu_embed_check_gcc/master-arm_eabi.git/tree/00-sumfiles?h=linaro-local/ci/tcwg_gnu_embed_check_gcc/master-arm_eabi
 and a log when we target cortex-m0 which is still a very small cpu but has 
 barriers:
 https://git.linaro.org/toolchain/ci/base-artifacts/tcwg_gnu_embed_check_gcc/master-thumb_m0_eabi.git/tree/00-sumfiles?h=linaro-local/ci/tcwg_gnu_embed_check_gcc/master-thumb_m0_eabi

 I somehow wanted to get rid of such errors with the default 
 configuration
>>>
>>> Yep, that makes sense, and we'll still be testing them for newer
>>> arches on the target, so it's not completely disabling those parts of
>>> the testsuite.
>>>
>>> But I'm still curious why some of those tests need this change. I
>>> think the ones I noted below are probably failing for some other
>>> reasons.
>>>
>> Just looked at  23_containers/span/back_assert_neg.cc, the linker says it 
>> needs
>> arm-eabi/libstdc++-v3/src/.libs/libstdc++.a(debug.o) to resolve
>> ./back_assert_neg-back_assert_neg.o (std::__glibcxx_assert_fail(char const*, 
>> int, char const*, char const*))
>> and indeed debug.o has a reference to __sync_synchronize
> 
> Aha, that's just because I put __glibcxx_assert_fail in debug.o, but
> there are no dependencies on anything else in that file, including the
> _M_detach member function that uses atomics.
> 
> This would also be solved by -Wl,--gc-sections :-)
> 
> I think it would be better to move __glibcxx_assert_fail to a new
> file, so that it doesn't make every assertion unnecessarily depend on
> __sync_synchronize. I'll do that now.
> 
> We could also make the atomics in debug.o conditional, so that debug
> mode doesn't depend on __sync_synchronize for single-threaded targets.
> Does the arm4t arch have pthreads support in newlib?  I didn't bother
> making the use of atomics conditional, because performance is not
> really a priority for debug mode bookkeeping. But the problem here
> isn't just a slight performance overhead of atomics, it's that they
> aren't even supported for arm4t.

I might be wrong, but I don't think newlib has any support for pthreads.

R.
>

Re: Cauldron schedule: diagnostics and security features talks

2023-09-11 Thread Richard Earnshaw (lists) via Gcc

On 08/09/2023 19:18, Siddhesh Poyarekar wrote:
> Hello,
> 
> I want to begin by apologizing because I know from first hand experience that 
> scheduling can be an immensely painful job.
> 
> The Cauldron 2023 schedule[1] looks packed and I noticed that Qing and 
> David's talks on security features and diagnostics respectively are in the 
> same time slot.  Both those sessions are likely to have pretty big overlaps 
> in audience IMO since the topics are thematically related.  Is there a way in 
> which they could be put in different time slots?
> 
> IIRC they were in different time slots before and were probably moved around 
> to cater for another conflict (hence maybe making it harder to move them 
> again) but I figured I'd rather air my request and be turned down than have 
> to make the difficult choice :)
> 
> Thanks!
> Sid
> 
> [1] https://gcc.gnu.org/wiki/cauldron2023

This has been pointed out privately as well.  I'll have one more go at juggling 
the order, but you're right, this has been the most painful part of the 
organising so far.

R.

Re: Question on aarch64 prologue code.

2023-09-06 Thread Richard Earnshaw (lists) via Gcc

On 06/09/2023 15:03, Iain Sandoe wrote:
> Hi Richard,
> 
>> On 6 Sep 2023, at 13:43, Richard Sandiford via Gcc  wrote:
>>
>> Iain Sandoe  writes:
> 
>>> On the Darwin aarch64 port, we have a number of cleanup test fails (pretty 
>>> much corresponding to the [still open] 
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39244).  However, let’s assume 
>>> that bug could be a red herring..
>>>
>>> the underlying reason is missing CFI for the set of the FP which [with 
>>> Darwin’s LLVM libunwind impl.] breaks the unwind through the function that 
>>> triggers a signal.
>>
>> Just curious, do you have more details about why that is?  If the unwinder
>> is sophisticated enough to process CFI, it seems odd that it requires the
>> CFA to be defined in terms of the frame pointer.
> 
> Let me see if I can answer that below.
> 
> 
> 
>>> <——
>>>
>>> I have currently worked around this by defining a 
>>> TARGET_FRAME_POINTER_REQUIRED which returns true unless the function is a 
>>> leaf (if that’s the correct solution, then all is fine).
>>
>> I suppose it depends on why the frame-pointer-based CFA is important
>> for Darwin.  If it's due to a more general requirement for a frame
>> pointer to be used, then yeah, that's probably the right fix.
> 
> The Darwin ABI  mandates a frame pointer (although it is omitted by clang for 
> leaf functions).
> 
>>  If it's
>> more a quirk of the unwinder. then we could probably expose whatever
>> that quirk is as a new status bit.  Target-independent code in
>> dwarf2cfi.cc would then need to be aware as well.
> 
> (I suspect) it is the interaction between the mandatory FP and the fact that 
> GCC lays out the stack differently from the other Darwin toolchains at 
> present [port Issue #19].
> 
> For the system toolchain, 30 and 29 are always placed first, right below the 
> SP (other callee saves are made below that in a specified order and always in 
> pairs - presumably, with an uneccessary spill half the time) - Actually, I 
> had a look at the weekend, but cannot find specific documentation on this 
> particular aspect of the ABI  (but, of course, the de facto ABI is what the 
> system toolchain does, regardless of presence/absence of any such doc).
> 
> However (speculation) that means that the FP is not saved where the system 
> tools expect it, maybe that is confusing the unwinder absent the fp cfa.  Of 
> course, it could also just be an unwinder bug that is never triggered by 
> clang’s codegen.
> 
> GCC’s different layout currently defeats compact unwinding on all but leaf 
> frames, so one day I want to fix it ...
> .. however making this change is quite heavy lifting and I think there are 
> higher priorities for the port (so, as long as we have working unwind and no 
> observable fallout, I am deferring that change).
> 
> Note that Darwin’s ABI also has a red zone (but we have not yet made any use 
> of this, since there is no existing aarch64 impl. and I’ve not had time to 
> get to it).  However, AFAICS that is an optimisation - we can still be 
> correct without it.
> 
>>> ———
>>>
>>> However, it does seem odd that the existing code sets up the FP, but never 
>>> produces any CFA for it.
>>>
>>> So is this a possible bug, or just that I misunderstand the relevant set of 
>>> circumstances?
>>
>> emit_frame_chain fulfills an ABI requirement that every non-leaf function
>> set up a frame-chain record.  When emit_frame_chain && !frame_pointer_needed,
>> we set up the FP for ABI purposes only.  GCC can still access everything
>> relative to the stack pointer, and it can still describe the CFI based
>> purely on the stack pointer.
> 
> Thanks that makes sense
> - I guess libunwind is never used with aarch64 linux, even in a clang/llvm 
> toolchain.
>>
>> glibc-based systems only need the CFA to be based on the frame pointer
>> if the stack pointer moves during the body of the function (usually due
>> to alloca or VLAs).
> 
> I’d have to poke more at the unwinder code and do some more debugging - it 
> seems reasonable that it could work for any unwinder that’s based on DWARF 
> (although, if we have completely missing unwind info, then the different 
> stack layout would surely defeat any fallback proceedure).
> 

This is only a guess, but it sounds to me like the issue might be that although 
we create a frame record, we don't use the frame pointer for accessing stack 
variables unless SP can't be used (eg: because the function calls alloca()).  
This tends to be more efficient because offset addressing for SP is more 
flexible.  If we wanted to switch to making FP be the canonical frame address 
register we'd need to change all the code gen to use FP in addressing as well 
(or end up with some really messy translation when emitting debug information).

R.

> thanks
> Iain
> 
>>
>> Thanks,
>> Richard
>

Re: [PATCH] rtl: Forward declare rtx_code

2023-08-23 Thread Richard Earnshaw (lists) via Gcc-patches

On 23/08/2023 16:49, Richard Sandiford via Gcc-patches wrote:
> Richard Earnshaw via Gcc-patches  writes:
>> Now that we require C++ 11, we can safely forward declare rtx_code
>> so that we can use it in target hooks.
>>
>> gcc/ChangeLog
>>  * coretypes.h (rtx_code): Add forward declaration.
>>  * rtl.h (rtx_code): Make compatible with forward declaration.
>> ---
>>  gcc/coretypes.h | 4 
>>  gcc/rtl.h   | 2 +-
>>  2 files changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/gcc/coretypes.h b/gcc/coretypes.h
>> index ca8837cef67..51e9ce0 100644
>> --- a/gcc/coretypes.h
>> +++ b/gcc/coretypes.h
>> @@ -100,6 +100,10 @@ struct gimple;
>>  typedef gimple *gimple_seq;
>>  struct gimple_stmt_iterator;
>>  
>> +/* Forward declare rtx_code, so that we can use it in target hooks without
>> +   needing to pull in rtl.h.  */
>> +enum rtx_code : unsigned;
>> +
>>  /* Forward decls for leaf gimple subclasses (for individual gimple codes).
>> Keep this in the same order as the corresponding codes in gimple.def.  */
>>  
>> diff --git a/gcc/rtl.h b/gcc/rtl.h
>> index e1c51156f90..0e9491b89b4 100644
>> --- a/gcc/rtl.h
>> +++ b/gcc/rtl.h
>> @@ -45,7 +45,7 @@ class predefined_function_abi;
>>  /* Register Transfer Language EXPRESSIONS CODES */
>>  
>>  #define RTX_CODEenum rtx_code
>> -enum rtx_code  {
>> +enum rtx_code : unsigned {
>>  
>>  #define DEF_RTL_EXPR(ENUM, NAME, FORMAT, CLASS)   ENUM ,
>>  #include "rtl.def"  /* rtl expressions are documented here */
> 
> Given:
> 
>   #define RTX_CODE_BITSIZE 8
> 
> there might be some value in making it uint8_t rather than unsigned.
> Preapproved if you agree.
> 
> But the patch as posted is a strict improvement over the status quo,
> so it's also OK as-is.
> 
> Thanks,
> Richard

I did think about that, but there were two reasons for not doing so:
- it presumes we would never want more than 8 bits for rtx_code (well, not 
quite, 
but it would make it more work to change this).
- it would probably lead to more zero-extension operations happening in the 
compiler

I'll put my patch in as is.

R.

Re: [PATCH] aarch64: fix format specifier

2023-08-21 Thread Richard Earnshaw (lists) via Gcc-patches


On 18/08/2023 17:37, FX Coudert via Gcc-patches wrote:

A rather trivial fix for fprintf() specifier of a HOST_WIDE_INT value.
Tested on aarch64-apple-darwin. OK to commit?

FX



OK.

R.

Re: [RFC] GCC Security policy

2023-08-09 Thread Richard Earnshaw (lists) via Gcc-patches


On 08/08/2023 20:39, Carlos O'Donell via Gcc-patches wrote:

On 8/8/23 13:46, David Edelsohn wrote:

I believe that upstream projects for components that are imported
into GCC should be responsible for their security policy, including
libgo, gofrontend, libsanitizer (other than local patches), zlib,
libtool, libphobos, libcody, libffi, eventually Rust libcore, etc.


I agree completely.

We can reference the upstream and direct people to follow upstream security
policy for these bundled components.

Any other policy risks having conflicting guidance between the projects,
which is not useful for security policy.

There might be exceptions to this rule, particularly when the downstream
wants to accept particular risks while upstream does not; but none of these
components are in that case IMO.



I agree with that, but with one caveat.  Our policy should state what we 
 do once upstream has addressed the issue.


R.

Re: [RFC] GCC Security policy

2023-08-08 Thread Richard Earnshaw (lists) via Gcc-patches


On 08/08/2023 15:40, Siddhesh Poyarekar wrote:

On 2023-08-08 10:37, Jakub Jelinek wrote:

On Tue, Aug 08, 2023 at 10:30:10AM -0400, Siddhesh Poyarekar wrote:

Do you have a suggestion for the language to address libgcc, libstdc++,
etc. and libiberty, libbacktrace, etc.?


I'll work on this a bit and share a draft.


BTW, I think we should perhaps differentiate between production ready
libraries (e.g. libgcc, libstdc++, libgomp, libatomic, libgfortran, 
libquadmath,
libssp) vs. e.g. the sanitizer libraries which are meant for debugging 
and


Agreed, that's why I need some time to sort all of the libraries gcc 
builds to categorize them into various levels of support in terms of 
safety re. untrusted input.


Thanks,
Sid


Related to this, our coding standards should really reflect what we 
consider good practice these days.  eg.  There are many library APIs 
around that were once considered acceptable that frankly we would be 
better uninventing.


R.

Re: There Might a Bug in the Compiler: When Calling Weak Defined Function

2023-08-07 Thread Richard Earnshaw (lists) via Gcc


On 07/08/2023 16:51, Şahin Duran via Gcc wrote:

Dear GCC Developers,

I think I've just discovered a bug/ undefined situation in the compiler.
When I try to call a weakly defined function, compiler successfully
generates the code of calling procedure. However, this calling procedure is
nothing but branching to address 0 which results in segmentation fault. I
am not sure if this is the case for the latest version of GCC but it is for
GCC 4.9.2 and many online compilers. I just thought that maybe including a
rule that generates compilation error when the user defines a weak function
and calls it without actually implementing it. You may find the results in
the attachments.

Kind regards,
I am looking forward to hearing from you about this.
Şahin Duran




If a function might be weak, you need to test that it is defined before 
calling it.  So this is a bug in your program.  You need to write


  if (add)
add (31, 31);

Or something like that which checks that the symbol has been defined.

R.


Attachments:

Source Code:
#include 
#include 
#include "header.h"

__attribute__((weak)) int add(int,int);

int main(int argc, char *argv[]) {
printf("%x",add);
add(31,31);
return 0;
}
terminal result : 0

Disassembly (on a 64bit AMD Machine):
0x00401530 <+0>: push   rbp
0x00401531 <+1>: movrbp,rsp
0x00401534 <+4>: subrsp,0x20
0x00401538 <+8>: movDWORD PTR [rbp+0x10],ecx
0x0040153b <+11>: movQWORD PTR [rbp+0x18],rdx
0x0040153f <+15>: call   0x402100 <__main>
0x00401544 <+20>: movrdx,QWORD PTR [rip+0x2ed5]#
0x404420 <.refptr.add>
0x0040154b <+27>: learcx,[rip+0x2aae]# 0x404000
0x00401552 <+34>: call   0x402b18 
=> 0x00401557 <+39>: movedx,0x1f
0x0040155c <+44>: movecx,0x1f
0x00401561 <+49>: call   0x0
0x00401566 <+54>: moveax,0x0
0x0040156b <+59>: addrsp,0x20
0x0040156f <+63>: poprbp
0x00401570 <+64>: ret

Re: GNU Tools Cauldron 2023

2023-08-07 Thread Richard Earnshaw (lists) via Gcc


We have now finalized the ticket price for this year's Cauldron at £75.
Sarah is now contacting those who have already registered to arrange
payment.

If you have not yet registered then there is still time.  Registration
closes at 12 Noon BST (7am EDT) on Friday 1st September, and all tickets 
must be paid for by 6pm BST (1pm EDT) on Monday 4th September.  This is 
to allow time to finalize catering requirements before the event.


We have a full schedule for the event, so registration of talks is now
closed.  If you have not yet submitted a talk but wish to do so, please
do contact the organisers: we will fit you in if we can, but priority
will go to those who have already submitted something.

Richard.

On 25/07/2023 18:01, Richard Earnshaw via Gcc wrote:
It is now just under 2 months until the GNU Tools Cauldron. Registration 
is still open, but we would really appreciate it if you could register 
as soon as possible so that we have a clear idea of the numbers.


Richard.

On 05/06/2023 14:59, Richard Earnshaw wrote:

We are pleased to invite you all to the next GNU Tools Cauldron,
taking place in Cambridge, UK, on September 22-24, 2023.

As for the previous instances, we have setup a wiki page for
details:

https://gcc.gnu.org/wiki/cauldron2023 




Like last year, we are having to charge for attendance.  We are still
working out what we will need to charge, but it will be no more than 
£250.


Attendance will remain free for community volunteers and others who do
not have a commercial backer and we will be providing a small number of
travel bursaries for students to attend.

For all details of how to register, and how to submit a proposal for a
track session, please see the wiki page.

The Cauldron is organized by a group of volunteers. We are keen to add
some more people so others can stand down. If you'd like to be part of
that organizing committee, please email the same address.

This announcement is being sent to the main mailing list of the
following groups: GCC, GDB, binutils, CGEN, DejaGnu, newlib and glibc.

Please feel free to share with other groups as appropriate.

Richard (on behalf of the GNU Tools Cauldron organizing committee).

Re: [PATCH] aarch64: Fix warnings during libgcc build

2023-07-11 Thread Richard Earnshaw (lists) via Gcc-patches


On 11/07/2023 15:54, Richard Earnshaw (lists) via Gcc-patches wrote:

On 11/07/2023 10:37, Florian Weimer via Gcc-patches wrote:

libgcc/

* config/aarch64/aarch64-unwind.h (aarch64_cie_signed_with_b_key):
Add missing const qualifier.  Cast from const unsigned char *
to const char *.  Use __builtin_strchr to avoid an implicit
function declaration.
* config/aarch64/linux-unwind.h (aarch64_fallback_frame_state):
Add missing cast.

---
diff --git a/libgcc/config/aarch64/linux-unwind.h 
b/libgcc/config/aarch64/linux-unwind.h

index 00eba866049..93da7a9537d 100644
--- a/libgcc/config/aarch64/linux-unwind.h
+++ b/libgcc/config/aarch64/linux-unwind.h
@@ -77,7 +77,7 @@ aarch64_fallback_frame_state (struct _Unwind_Context 
*context,

  }
    rt_ = context->cfa;
-  sc = _->uc.uc_mcontext;
+  sc = (struct sigcontext *) _->uc.uc_mcontext;
  /* This define duplicates the definition in aarch64.md */
  #define SP_REGNUM 31




This looks somewhat dubious.  I'm not particularly familiar with the 
kernel headers, but a quick look suggests an mcontext_t is nothing like 
a sigcontext_t.  So isn't the cast just papering over some more 
fundamental problem?


R.


Sorry, I was looking at the wrong set of headers.  It looks like these 
have to match. But in that case, I think we should have a comment about 
that here to explain the suspicious cast.


R.

Re: [PATCH] aarch64: Fix warnings during libgcc build

2023-07-11 Thread Richard Earnshaw (lists) via Gcc-patches


On 11/07/2023 10:37, Florian Weimer via Gcc-patches wrote:

libgcc/

* config/aarch64/aarch64-unwind.h (aarch64_cie_signed_with_b_key):
Add missing const qualifier.  Cast from const unsigned char *
to const char *.  Use __builtin_strchr to avoid an implicit
function declaration.
* config/aarch64/linux-unwind.h (aarch64_fallback_frame_state):
Add missing cast.

---
diff --git a/libgcc/config/aarch64/linux-unwind.h 
b/libgcc/config/aarch64/linux-unwind.h
index 00eba866049..93da7a9537d 100644
--- a/libgcc/config/aarch64/linux-unwind.h
+++ b/libgcc/config/aarch64/linux-unwind.h
@@ -77,7 +77,7 @@ aarch64_fallback_frame_state (struct _Unwind_Context *context,
  }
  
rt_ = context->cfa;

-  sc = _->uc.uc_mcontext;
+  sc = (struct sigcontext *) _->uc.uc_mcontext;
  
  /* This define duplicates the definition in aarch64.md */

  #define SP_REGNUM 31




This looks somewhat dubious.  I'm not particularly familiar with the 
kernel headers, but a quick look suggests an mcontext_t is nothing like 
a sigcontext_t.  So isn't the cast just papering over some more 
fundamental problem?


R.

Re: wishlist: support for shorter pointers

2023-07-03 Thread Richard Earnshaw (lists) via Gcc


On 03/07/2023 17:42, Rafał Pietrak via Gcc wrote:

Hi Ian,

W dniu 3.07.2023 o 17:07, Ian Lance Taylor pisze:
On Wed, Jun 28, 2023 at 11:21 PM Rafał Pietrak via Gcc 
 wrote:

[]

I was thinking about that, and it doesn't look as requiring that deep
rewrites. ABI spec, that  could accomodate the functionality could be as
little as one additional attribute to linker segments.


If I understand correctly, you are looking for something like the x32
mode that was available for a while on x86_64 processors:
https://en.wikipedia.org/wiki/X32_ABI .  That was a substantial amount
of work including changes to the compiler, assembler, linker, standard
library, and kernel.  And at least to me it's never seemed
particularly popular.


Yes.

And WiKi reporting up to 40% performance improvements in some corner 
cases is impressive and encouraging. I believe, that the reported 
average of 5-8% improvement would be significantly better within MCU 
tiny resources environment. In MCU world, such improvement could mean 
fit-nofit of a project into a particular device.


-R


I think you need to be very careful when reading benchmarketing (sic) 
numbers like this.  Firstly, this is a 32-bit vs 64-bit measurement; 
secondly, the benchmark (spec 2000) is very old now and IIRC was not 
fully optimized for 64-bit processors (it predates the 64-bit version of 
the x86 instruction set); thirdly, there are benchmarks in SPEC which 
are very sensitive to cache size and the 32-bit ABI just happened to 
allow them to fit enough data in the caches to make the numbers leap.


R.

Re: gcc tricore porting

2023-07-03 Thread Richard Earnshaw (lists) via Gcc


On 03/07/2023 15:34, Joel Sherrill wrote:

On Mon, Jul 3, 2023, 4:33 AM Claudio Eterno 
wrote:


Hi Joel, I'll give an answer ASAP on the newlib and libgloss...
I supposed your question were about the licences question on newlib,
instead you were really asking what changed on the repo libs...



It was a bit of both. If they put the right licenses on the newlib and
libgloss ports, you should be able to use them and eventually submit them.
But GCC, binutils, and gdb would be gpl and require an assignment to the
FSF. That is all I meant.


It's not quite as restricted as that.  For GCC, I suggest reading 
https://gcc.gnu.org/contribute.html#legal for more details.


I think there are similar processes in place for binutils as well.  (I'm 
not quite so sure for GDB).


R.



An option here is to reach out to the authors and ask if they are willing
to do the FSF assignment. If they are, then any GPL licensed code from them
might be a baseline.

It looks like their current products may be based on LLVM.

--joel


C.



Il giorno dom 2 lug 2023 alle ore 19:53 Claudio Eterno <
eterno.clau...@gmail.com> ha scritto:


Hi Joel, can you give me more info regarding newlib or libgloss cases?
Unfortunately I'm a newbie on th9is world...
Thank you,
Claudio

Il giorno dom 2 lug 2023 alle ore 17:38 Joel Sherrill 
ha scritto:




On Sun, Jul 2, 2023, 3:29 AM Claudio Eterno 
wrote:


Hi, Joel and Mikael
taking a look at the code it seems that the repo owner is higtech
 but we have no confirmations.
In fact, after a comparison with gcc 9.4.0 original files i see this on
a lot of ("WITH_HIGHTEC") [intl.c]:
[image: image.png]
Probably this version of gcc is a basic version of their tricore-gcc
and probably works fine but that repo doesn't show any extra info.
Seems also impossible to contact the owner (that account doesn't show
any email or other info)..
Honestly with these conditions, from gcc development point of view,
that repo has no value.



Without an assignment, you can't submit that code. That's a blocker on
using it if there isn't one.

But you can file an issue against the repo asking questions.


Anyway this is a good starting point...




Maybe not if you can't submit it. Anything that needs to be GOL licensed
and owned by the FSF is off limits.

But areas with permissive licenses might be ok if they stuck with those.
Look at what they did with newlib and libgloss.

--joel



C.



Il giorno lun 19 giu 2023 alle ore 18:55 Joel Sherrill 
ha scritto:




On Mon, Jun 19, 2023, 10:36 AM Mikael Pettersson via Gcc <
gcc@gcc.gnu.org> wrote:


(Note I'm reading the gcc mailing list via the Web archives, which
doesn't let me
create "proper" replies. Oh well.)

On Sun Jun 18 09:58:56 GMT 2023,  wrote:

Hi, this is my first time with open source development. I worked in
automotive for 22 years and we (generally) were using tricore

series for

these products. GCC doesn't compile on that platform. I left my

work some

days ago and so I'll have some spare time in the next few months. I

would

like to know how difficult it is to port the tricore platform on

gcc and if

during this process somebody can support me as tutor and... also if

the gcc

team is interested in this item...


https://github.com/volumit has a port of gcc + binutils + newlib +
gdb
to Tricore,
and it's not _that_ ancient. I have no idea where it originates from
or how complete
it is, but I do know the gcc-4.9.4 based one builds with some tweaks.




https://github.com/volumit/package_494 says there is a port in

process to gcc 9. Perhaps digging in and assessing that would be a good
start.



One question is whether that code has proper assignments on file for
ultimate inclusion. That should be part of your assessment.

--joel





I don't know anything more about it, I'm just a collector of

cross-compilers for
obscure / lost / forgotten / abandoned targets.

/Mikael





--
Claudio Eterno
via colle dell'Assietta 17
10036 Settimo Torinese (TO)





--
Claudio Eterno
via colle dell'Assietta 17
10036 Settimo Torinese (TO)




--
Claudio Eterno
via colle dell'Assietta 17
10036 Settimo Torinese (TO)

Re: wishlist: support for shorter pointers

2023-06-28 Thread Richard Earnshaw (lists) via Gcc


On 28/06/2023 17:07, Martin Uecker wrote:

Am Mittwoch, dem 28.06.2023 um 16:44 +0100 schrieb Richard Earnshaw (lists):

On 28/06/2023 15:51, Rafał Pietrak via Gcc wrote:

Hi Martin,

W dniu 28.06.2023 o 15:00, Martin Uecker pisze:


Sounds like named address spaces to me:
https://gcc.gnu.org/onlinedocs/gcc/Named-Address-Spaces.html


Only to same extend, and only in x86 case.

The goal of the wish-item I've describe is to shorten pointers. I may be
wrong and have misread the specs, but the "address spaces"
implementation you've pointed out don't look like doing that. In
particular the AVR variant applies to devices that have a "native int"
of 16-bits, and those devices (most of them) have address space no
larger. So there is no gain. Their pointers cover all their address
space and if one wanted to have shorter pointers ... like 12-bits -
those wouldn't "nicely fit into register", or 8-bits - those would
reduce the "addressable" space to 256 bytes, which is VERY tight for any
practical application.

Additionally, the AVR case is explained as "only for rodata" - this
completely dismisses it from my use.

To explain a little more: the functionality I'm looking for is something
like x86 implementation of that "address spaces". The key functionality
here is the additional register like fs/gs (an address offset register).
IMHO the feature/implementation in question would HAVE TO use additional
register instead of letting linker adjust them at link time, because
those "short" pointers would need to be load-and-stored dynamically and
changed dynamically at runtime. That's why I've put an example of ARM
instruction that does this. Again IMHO the only "syntactic" feature,that
is required for a compiler to do "the right thing" is to make compiler
consider segment (segment name, ordinary linker segment name) where a
particular pointer target resides. Then if that segment where data (of
that pointer) reside is declared "short pointers", then compiler loads
and uses additional register pointing to the base of that segment. Quite
like intel segments work in hardware.

Naturally, although I have hints on such mechanism behavior, I have no
skills to even imagine where to tweak the sources to achieve that.



I think I understand what you're asking for but:
1) You'd need a new ABI specification to handle this, probably involving
register assignments (for the 'segment' addresses), the initialization
of those at startup, assembler and linker extensions to allow for
relocations describing the symbols, etc.
2) Implementations for all of the above (it would be a lot of work -
weeks to months, not days).  Little existing code, including most of the
hand-written assembly routines is likely to be compatible with the
register conventions you'd need to define, so all that code would need
auditing and alternatives developed.
3) I doubt it would be an overall win in the end.

I base the last assertion on the fact that you'd now have three values
in many addresses, the base (segment), the pointer and then a final
offset.  This means quite a bit more code being generated, so you trade
smaller pointers in your data section for more code in your code
section.  For example,

struct f
{
    int a;
    int b;
};

int func (struct f *p)
{
    return p->b;
}

would currently compile to something like

ldr r0, [r0, #4]
bx lr

but with the new, shorter, pointer you'd end up with

add r0, r_seg, r0
ldr r0, [r0, #4]
bx lr

In some cases it might be even worse as you'd end up with
zero-extensions of the pointer values as well.



I do not quite understand why this wouldn't work with
named address spaces?

__near struct {
   int x;
   int y;
};

int func (__near struct f *p)
{
return p->b;
}

could produce exactly such code?   If you need multiple
such segments one could have __near0, ..., __near9.

Such a pointer could also be converted to a regular
pointer, which could reduce code overhead.

Martin


Named address spaces, as they exist today, don't really do anything (at 
least, in the Arm port).  A pointer is still 32-bits in size, so they 
become just syntactic sugar.


If you're going to use them as 'bases', then you still have to define 
how the base address is accessed - it doesn't just happen by magic.


R.






R.


-R



Best,
Martin

Am Dienstag, dem 27.06.2023 um 14:26 +0200 schrieb Rafał Pietrak via Gcc:

Hello everybody,

I'm not quite sure if this is correct mailbox for this suggestion (may
be "embedded" would be better), but let me present it first (and while
the examples is from ARM stm32 environment, the issue would equally
apply to i386 or even amd64). So:

1. Small MPU (like stm32f103) would normally have small amount of RAM,
and even somewhat larger variant do have its memory "partitioned/
dedicated" to various subsystems (like CloseCoupledMemory, Ethe

Re: wishlist: support for shorter pointers

2023-06-28 Thread Richard Earnshaw (lists) via Gcc


On 28/06/2023 15:51, Rafał Pietrak via Gcc wrote:

Hi Martin,

W dniu 28.06.2023 o 15:00, Martin Uecker pisze:


Sounds like named address spaces to me:
https://gcc.gnu.org/onlinedocs/gcc/Named-Address-Spaces.html


Only to same extend, and only in x86 case.

The goal of the wish-item I've describe is to shorten pointers. I may be 
wrong and have misread the specs, but the "address spaces" 
implementation you've pointed out don't look like doing that. In 
particular the AVR variant applies to devices that have a "native int" 
of 16-bits, and those devices (most of them) have address space no 
larger. So there is no gain. Their pointers cover all their address 
space and if one wanted to have shorter pointers ... like 12-bits - 
those wouldn't "nicely fit into register", or 8-bits - those would 
reduce the "addressable" space to 256 bytes, which is VERY tight for any 
practical application.


Additionally, the AVR case is explained as "only for rodata" - this 
completely dismisses it from my use.


To explain a little more: the functionality I'm looking for is something 
like x86 implementation of that "address spaces". The key functionality 
here is the additional register like fs/gs (an address offset register). 
IMHO the feature/implementation in question would HAVE TO use additional 
register instead of letting linker adjust them at link time, because 
those "short" pointers would need to be load-and-stored dynamically and 
changed dynamically at runtime. That's why I've put an example of ARM 
instruction that does this. Again IMHO the only "syntactic" feature,that 
is required for a compiler to do "the right thing" is to make compiler 
consider segment (segment name, ordinary linker segment name) where a 
particular pointer target resides. Then if that segment where data (of 
that pointer) reside is declared "short pointers", then compiler loads 
and uses additional register pointing to the base of that segment. Quite 
like intel segments work in hardware.


Naturally, although I have hints on such mechanism behavior, I have no 
skills to even imagine where to tweak the sources to achieve that.



I think I understand what you're asking for but:
1) You'd need a new ABI specification to handle this, probably involving 
register assignments (for the 'segment' addresses), the initialization 
of those at startup, assembler and linker extensions to allow for 
relocations describing the symbols, etc.
2) Implementations for all of the above (it would be a lot of work - 
weeks to months, not days).  Little existing code, including most of the 
hand-written assembly routines is likely to be compatible with the 
register conventions you'd need to define, so all that code would need 
auditing and alternatives developed.

3) I doubt it would be an overall win in the end.

I base the last assertion on the fact that you'd now have three values 
in many addresses, the base (segment), the pointer and then a final 
offset.  This means quite a bit more code being generated, so you trade 
smaller pointers in your data section for more code in your code 
section.  For example,


struct f
{
  int a;
  int b;
};

int func (struct f *p)
{
  return p->b;
}

would currently compile to something like

ldr r0, [r0, #4]
bx lr

but with the new, shorter, pointer you'd end up with

add r0, r_seg, r0
ldr r0, [r0, #4]
bx lr

In some cases it might be even worse as you'd end up with 
zero-extensions of the pointer values as well.


R.


-R



Best,
Martin

Am Dienstag, dem 27.06.2023 um 14:26 +0200 schrieb Rafał Pietrak via Gcc:

Hello everybody,

I'm not quite sure if this is correct mailbox for this suggestion (may
be "embedded" would be better), but let me present it first (and while
the examples is from ARM stm32 environment, the issue would equally
apply to i386 or even amd64). So:

1. Small MPU (like stm32f103) would normally have small amount of RAM,
and even somewhat larger variant do have its memory "partitioned/
dedicated" to various subsystems (like CloseCoupledMemory, Ethernet
buffers, USB buffs, etc).

2. to address any location within those sections of that memory (or
their entire RAM) it would suffice to use 16-bit pointers.

3. still, declaring a pointer in GCC always allocate "natural" size of a
pointer in given architecture. In case of ARM stm32 it would be 32-bits.

4. programs using pointers do keep them around in structures. So
programs with heavy use of pointers have those structures like 2 times
larger then necessary  if only pointers were 16-bit. And memory in
those devices is scarce.

5. the same thing applies to 64-bit world. Programs that don't require
huge memories but do use pointers excessively, MUST take up 64-bit for a
pointer no matter what.

So I was wondering if it would be feasible for GCC to allow SEGMENT to
be declared as "small" (like 16-bit addressable in 32-bit CPU, or 32-bit
addressable in 64-bit CPU), and ANY pointer declared to reference
location

Re: [PATCH 2/2] [testsuite, arm]: Make mve_fp_fpu[12].c accept single or double precision FPU

2023-06-28 Thread Richard Earnshaw (lists) via Gcc-patches


On 28/06/2023 10:26, Christophe Lyon via Gcc-patches wrote:

This tests currently expect a directive containing .fpu fpv5-sp-d16
and thus may fail if the test is executed for instance with
-march=armv8.1-m.main+mve.fp+fp.dp

This patch accepts either fpv5-sp-d16 or fpv5-d16 to avoid the failure.

2023-06-28  Christophe Lyon  

gcc/testsuite/
* gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c: Fix .fpu
scan-assembler.
* gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c: Likewise.
---
  gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c | 2 +-
  gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
index e375327fb97..8358a616bb5 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
@@ -12,4 +12,4 @@ foo1 (int8x16_t value)
return b;
  }
  
-/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" }  } */

+/* { dg-final { scan-assembler "\.fpu fpv5(-sp|)-d16" }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
index 1fca1100cf0..5dd2feefc35 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
@@ -12,4 +12,4 @@ foo1 (int8x16_t value)
return b;
  }
  
-/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" }  } */

+/* { dg-final { scan-assembler "\.fpu fpv5(-sp|)-d16" }  } */


OK.

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 1127 matches

Mail list logo