[PATCH] AArch64: aarch64_class_max_nregs mishandles 64-bit structure modes [PR112577]

2024-01-16 Thread Tejas Belagod
The target hook aarch64_class_max_nregs returns the incorrect result for 64-bit structure modes like V31DImode or V41DFmode etc. The nregs calculation divides by the size of a full AdvSIMD vector register, but for 64-bit modes the register size ought to be UNITS_PER_VREG / 2. This patch fixes the register size.
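
The arithmetic behind this fix can be sketched as follows. This is only an illustration, not the GCC implementation: the function name, its parameters, and the value 16 for UNITS_PER_VREG (bytes in a 128-bit AdvSIMD register) are assumptions made for the example.

```python
# Illustrative sketch only; not the code in GCC's aarch64 backend.
# Assumption: UNITS_PER_VREG is 16 bytes (a 128-bit AdvSIMD register).
UNITS_PER_VREG = 16


def ceil_div(a, b):
    """Integer division rounding up."""
    return -(-a // b)


def class_max_nregs(mode_size_bytes, is_64bit_struct_mode):
    """Hypothetical helper: registers needed to hold a mode of the given size.

    For 64-bit structure modes each element occupies half a vector
    register, so the divisor is UNITS_PER_VREG / 2.
    """
    if is_64bit_struct_mode:
        reg_size = UNITS_PER_VREG // 2
    else:
        reg_size = UNITS_PER_VREG
    return ceil_div(mode_size_bytes, reg_size)


# A structure of three 64-bit vectors is 24 bytes in total.
three_by_64 = 3 * 8
print(class_max_nregs(three_by_64, False))  # divides by the full register size
print(class_max_nregs(three_by_64, True))   # divides by UNITS_PER_VREG / 2
```

With the full register size as divisor, the 24-byte mode appears to need only two registers; with the halved size it needs three, one per 64-bit element.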

[PATCH] aarch64: Fix function multiversioning mangling

2024-01-16 Thread Andrew Carlotti
It would be neater if the middle end for target_clones used a target hook for version name mangling, so we only do version name mangling once. However, that would require more intrusive refactoring that will have to wait till Stage 1. This patch builds upon the testsuite additions in patch 1/5

[patch,avr,applied] Add support for AVR16EB, AVR16EA and AVR32EA devices

2024-01-16 Thread Georg-Johann Lay
This adds some more entries to avr-mcus.def. Johann -- AVR: Add AVR16EB, AVR16EA and AVR32EA devices. gcc/ * config/avr/avr-mcus.def (avr16eb14, avr16eb20, avr16eb28, avr16eb32) (avr16ea28, avr16ea32, avr16ea48, avr32ea28, avr32ea32, avr32ea48): Add. *

Re: [PATCH] cfgexpand: Workaround CSE of ADDR_EXPRs in VAR_DECL partitioning [PR113372]

2024-01-16 Thread Jakub Jelinek
On Tue, Jan 16, 2024 at 10:00:09AM +0100, Richard Biener wrote: > I'm not sure how fancy we need to get with this workaround, so > changing to INTEGRAL_TYPE_P works for me. I'll go for it. BTW, I've also built linux kernel allyesconfig, and in there per the statistics gathering patch there are

Re: [PATCH v2] LoongArch: testsuite:Added additional vectorization "-mlsx" option.

2024-01-16 Thread Richard Biener
On Tue, Jan 16, 2024 at 5:58 AM Xi Ruoyao wrote: > > On Tue, 2024-01-16 at 10:57 +0800, chenxiaolong wrote: > > On Mon, 2024-01-15 at 15:50 +0800, Xi Ruoyao wrote: > > > On Mon, 2024-01-15 at 15:10 +0800, chenxiaolong wrote: > > > > On Mon, 2024-01-15 at 14:42 +0800, Xi Ruoyao wrote: > > > > > On

Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor

2024-01-16 Thread Xi Ruoyao
On Tue, 2024-01-16 at 17:53 +0800, xndcn wrote: > Thanks, so I add a test: atomic_float/compare_exchange_padding.cc, > which will fail due to timeout without the patch. Please resend in plain text instead of HTML. Sending in HTML causes the patch to be mangled. And libstdc++ patches should CC

Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor

2024-01-16 Thread xndcn
Thanks, so I add a test: atomic_float/compare_exchange_padding.cc, which will fail due to timeout without the patch. --- libstdc++-v3/ChangeLog: * include/bits/atomic_base.h: add __builtin_clear_padding in __atomic_float constructor. * testsuite/lib/dg-options.exp: enable libatomic for IA32

Re: [PATCH v2] LoongArch: testsuite:Added additional vectorization "-mlsx" option.

2024-01-16 Thread Xi Ruoyao
On Tue, 2024-01-16 at 12:58 +0800, Xi Ruoyao wrote: > On Tue, 2024-01-16 at 10:57 +0800, chenxiaolong wrote: > > On Mon, 2024-01-15 at 15:50 +0800, Xi Ruoyao wrote: > > > On Mon, 2024-01-15 at 15:10 +0800, chenxiaolong wrote: > > > > On Mon, 2024-01-15 at 14:42 +0800, Xi Ruoyao wrote: > > > > > On

Re: [PATCH] cfgexpand: Workaround CSE of ADDR_EXPRs in VAR_DECL partitioning [PR113372]

2024-01-16 Thread Richard Biener
On Tue, 16 Jan 2024, Jakub Jelinek wrote: > Hi! > > The following patch adds a quick workaround to bugs in VAR_DECL > partitioning. > The problem is that there is no dependency between ADDR_EXPRs of local > decls and CLOBBERs of those vars, so VN can CSE uses of ADDR_EXPRs > (including ivopts

Re: [PATCH] libgcc: Fix __builtin_nested_func_ptr_{created,deleted} symbol versions [PR113402]

2024-01-16 Thread Richard Biener
On Tue, 16 Jan 2024, Jakub Jelinek wrote: > Hi! > > These symbols were exported at an incorrect symbol version, > the following patch fixes that. > > I believe we should also rename the symbols (__nested_func_ptr_* > or __gcc_nested_func_ptr_* or similar), __builtin_ in the name > doesn't look

Re: Re: [PATCH] test regression fix: Remove xfail for variable length targets of bb-slp-subgroups-3.c

2024-01-16 Thread Richard Biener
On Tue, 16 Jan 2024, juzhe.zh...@rivai.ai wrote: > I think it's vectorized by 128bit vector too. > > vector(4) int vect__9.9; > vector(4) int vect__2.6; > vector(4) int vect__1.5; > int _1; > int _5; > int _11; > int _13; > vector(4) int _27; > >[local count: 1073741824]: >

Re: [PATCH v2] test regression fix: Add vect128 for bb-slp-43.c

2024-01-16 Thread Richard Biener
On Tue, 16 Jan 2024, Juzhe-Zhong wrote: > gcc/testsuite/ChangeLog: OK > * gcc.dg/vect/bb-slp-43.c: Add vect128. > > --- > gcc/testsuite/gcc.dg/vect/bb-slp-43.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-43.c >

Re: [PATCH] PR rtl-optimization/111267: Improved forward propagation.

2024-01-16 Thread Richard Biener
On Tue, Jan 16, 2024 at 2:13 AM Roger Sayle wrote: > > > This patch resolves PR rtl-optimization/111267 by improving RTL-level > forward propagation. This x86_64 code quality regression was caused > (exposed) by my changes to improve how x86's (TImode) argument passing > is represented at the

[PATCH] libgcc: Fix __builtin_nested_func_ptr_{created,deleted} symbol versions [PR113402]

2024-01-16 Thread Jakub Jelinek
Hi! These symbols were exported at an incorrect symbol version, the following patch fixes that. I believe we should also rename the symbols (__nested_func_ptr_* or __gcc_nested_func_ptr_* or similar), __builtin_ in the name doesn't look right, but that will need more changes to make it work.

Re: [PATCH 2/5] tree: Extend DECL_FUNCTION_VERSIONED to an enum

2024-01-16 Thread Andrew Carlotti
On Mon, Jan 15, 2024 at 01:28:04PM +0100, Richard Biener wrote: > On Mon, Jan 15, 2024 at 12:27 PM Andrew Carlotti > wrote: > > > > This allows code to determine why a particular function is > > multiversioned. For now, this will primarily be used to preserve > > existing name mangling quirks

[PATCH] cfgexpand: Workaround CSE of ADDR_EXPRs in VAR_DECL partitioning [PR113372]

2024-01-16 Thread Jakub Jelinek
Hi! The following patch adds a quick workaround to bugs in VAR_DECL partitioning. The problem is that there is no dependency between ADDR_EXPRs of local decls and CLOBBERs of those vars, so VN can CSE uses of ADDR_EXPRs (including ivopts integral variants thereof), which can break

Re: Re: [PATCH] test regression fix: Remove xfail for variable length targets of bb-slp-subgroups-3.c

2024-01-16 Thread juzhe.zh...@rivai.ai
I think it's vectorized by 128bit vector too. vector(4) int vect__9.9; vector(4) int vect__2.6; vector(4) int vect__1.5; int _1; int _5; int _11; int _13; vector(4) int _27; [local count: 1073741824]: vect__1.5_24 = MEM [(int *)]; vect__2.6_25 = vect__1.5_24 + { 1, 2, 3,

[PATCH v2] test regression fix: Add vect128 for bb-slp-43.c

2024-01-16 Thread Juzhe-Zhong
gcc/testsuite/ChangeLog: * gcc.dg/vect/bb-slp-43.c: Add vect128. --- gcc/testsuite/gcc.dg/vect/bb-slp-43.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-43.c b/gcc/testsuite/gcc.dg/vect/bb-slp-43.c index dad2d24262d..8aedb06bf72
