Re: [PATCH 1/6] PowerPC: Add -mcpu=future option

2023-10-18 Thread Michael Meissner
This patch implements support for a potential future PowerPC CPU. Features added with -mcpu=future may or may not be added to new PowerPC processors. This patch adds support for the -mcpu=future option. If you use -mcpu=future, the macro __ARCH_PWR_FUTURE__ is defined, and the assembler

[PATCH 2/6] PowerPC: Make -mcpu=future enable -mblock-ops-vector-pair.

2023-10-18 Thread Michael Meissner
This patch re-enables generating load and store vector pair instructions when doing certain memory copy operations when -mcpu=future is used. During power10 development, it was determined that using store vector pair instructions was problematic in a few cases, so we disabled generating load

[PATCH 6/6] PowerPC: Add support for 1,024 bit DMR registers.

2023-10-18 Thread Michael Meissner
This patch is a preliminary patch to add the full 1,024 bit dense math registers (DMRs) for -mcpu=future. The MMA 512-bit accumulators map onto the top of the DMR register. This patch only adds the new 1,024 bit register support. It does not add support for any instructions that need 1,024 bit

[COMMITTED] Fix expansion of `(a & 2) != 1`

2023-10-18 Thread Andrew Pinski
I had a thinko in r14-1600-ge60593f3881c72a96a3fa4844d73e8a2cd14f670 where we would remove the `& CST` part if we ended up not calling expand_single_bit_test. This fixes the problem by introducing a new variable that will be used for calling expand_single_bit_test. As far as I know this can only
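A small C sketch of the kind of expression named in the subject line, for readers skimming the index. The function names are hypothetical and this is not the committed testcase. Since `a & 2` can only be 0 or 2, the comparison against 1 is true for every input; if the `& CST` part were dropped, the test would become `a != 1`, which is false for `a == 1`.

```c
#include <stdbool.h>

/* (a & 2) evaluates to 0 or 2, never 1, so this is true for every a. */
bool masked_ne_one(unsigned a)
{
    return (a & 2) != 1;
}

/* What the expression degenerates to if the mask is wrongly removed:
   this is false for a == 1, unlike the function above. */
bool unmasked_ne_one(unsigned a)
{
    return a != 1;
}
```

The two functions disagree only for `a == 1`, which is the case a miscompiled expansion would get wrong.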

[PATCH 0/6] PowerPC Future patches

2023-10-18 Thread Michael Meissner
This patch is very preliminary support for a potential new feature to the PowerPC that extends the current power10 MMA architecture. This feature may or may not be present in any specific future PowerPC processor. In the current MMA subsystem for Power10, there are 8 512-bit accumulator

[PATCH 5/6] PowerPC: Switch to dense math names for all MMA operations.

2023-10-18 Thread Michael Meissner
This patch changes the assembler instruction names for MMA instructions from the original name used in power10 to the new name when used with the dense math system. I.e. xvf64gerpp becomes dmxvf64gerpp. The assembler will emit the same bits for either spelling. The patches have been tested on

[PATCH 4/6] PowerPC: Make MMA insns support DMR registers.

2023-10-18 Thread Michael Meissner
This patch changes the MMA instructions to use either FPR registers (-mcpu=power10) or DMRs (-mcpu=future). In this patch, the existing MMA instruction names are used. A macro (__PPC_DMR__) is defined if the MMA instructions use the DMRs. The patches have been tested on both little and big

[PATCH 3/6] PowerPC: Add support for accumulators in DMR registers.

2023-10-18 Thread Michael Meissner
The MMA subsystem added the notion of accumulator registers as an optional feature of ISA 3.1 (power10). In ISA 3.1, these accumulators overlapped with the traditional floating point registers 0..31, but logically the accumulator registers were separate from the FPR registers. In ISA 3.1, it was

[PATCH][_Hashtable] Fix merge

2023-10-18 Thread François Dumont
libstdc++: [_Hashtable] Do not reuse untrusted cached hash code On merge reuse merged node cached hash code only if we are on the same type of hash and this hash is stateless. Usage of function pointers or std::function as hash functor will prevent this optimization. libstdc++-v3/ChangeLog  

[PATCH] aarch64: [PR110986] Emit csinv again for `a ? ~b : b`

2023-10-18 Thread Andrew Pinski
After r14-3110-g7fb65f10285, the canonical form for `a ? ~b : b` changed to `-(a) ^ b`. This means that for aarch64 we need to add a few new insn patterns to catch this and change it back to the canonical form for the aarch64 backend. A secondary pattern was needed to support a
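The equivalence behind the two forms can be checked in plain C. This is only an illustration, assuming the condition has been reduced to 0 or 1; the function names are hypothetical and nothing here is taken from the patch itself.

```c
#include <stdint.h>

/* Conditional form: invert b when a is nonzero. */
uint32_t cond_invert(int a, uint32_t b)
{
    return a ? ~b : b;
}

/* XOR form: -(a != 0) is 0 or -1, i.e. no bits or all bits set,
   so the XOR either leaves b unchanged or flips every bit. */
uint32_t xor_invert(int a, uint32_t b)
{
    return (uint32_t)-(a != 0) ^ b;
}
```

Both functions return the same value for any `a` and `b`; the patch concerns which of the two shapes the aarch64 backend can match to a single csinv.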

RE: [PATCH 0/3] Add Intel new cpu archs

2023-10-18 Thread Jiang, Haochen
> -Original Message- > From: Hongtao Liu > Sent: Wednesday, October 18, 2023 8:25 AM > To: Jiang, Haochen > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ; > ubiz...@gmail.com > Subject: Re: [PATCH 0/3] Add Intel new cpu archs > > On Mon, Oct 16, 2023 at 2:25 PM Haochen Jiang > wrote: >

Re: [Patch] Fortran: Support OpenMP's 'allocate' directive for stack vars

2023-10-18 Thread Thomas Schwinge
Hi Tobias! On 2023-10-13T15:29:52+0200, Tobias Burnus wrote: > => Updated patch attached When cherry-picking this commit 2d3dbf0eff668bed5f5f168b3cafd8590c54 "Fortran: Support OpenMP's 'allocate' directive for stack vars" on top of slightly older GCC sources (mentioning that just in case

Re: PING Re: [PATCH v2 RFA] diagnostic: add permerror variants with opt

2023-10-18 Thread Richard Biener
On Tue, Oct 17, 2023 at 9:51 PM Jason Merrill wrote: > > Ping? OK. Thanks, Richard. > On 10/3/23 17:09, Jason Merrill wrote: > > This revision changes from using DK_PEDWARN for permerror-with-option to > > using > > DK_PERMERROR. > > > > Tested x86_64-pc-linux-gnu. OK for trunk? > > > > --

Re: [r14-4629 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64

2023-10-18 Thread Jakub Jelinek
On Wed, Oct 18, 2023 at 07:14:36AM +, Richard Biener wrote: > It's interesting that when the target has AVX512 enabled we get > AVX512 style masks used also for SSE and AVX vector sizes but the > OMP SIMD clones for SSE and AVX vector sizes use SSE/AVX style > masks and only the AVX512 size

Re: [PATCH] RISC-V: Optimize consecutive permutation index pattern by vrgather.vi/vx

2023-10-18 Thread Robin Dapp
LGTM. Regards Robin

[PATCH] Re-instantiate integer mask to traditional vector mask support

2023-10-18 Thread Richard Biener
The following allows passing integer mask data as a traditional vector mask for OMP SIMD clone calls, which is required due to the limited set of OMP SIMD clones in the x86 ABI when using AVX512 but a preferred vector size of 256 bits. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

GCC 14.0.0 Status Report (2023-10-18), Stage 1 ends Nov. 19th

2023-10-18 Thread Richard Biener
Status == The GCC development branch which will become GCC 14 is in general development mode (Stage 1) and will transition to general bugfixing mode (Stage 3) at the start of Nov. 19th and from there to regression and documentation fixing mode (Stage 4) at the start of Jan. 8th. Please plan

[PATCH] Avoid compile time hog on vect_peel_nonlinear_iv_init for nonlinear induction vec_step_op_mul when iteration count is too big. 65; 6800; 1c There's loop in vect_peel_nonlinear_iv_init to get i

2023-10-18 Thread liuhongt
Also give up vectorization when niters_skip is negative which will be used for fully masked loop. Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. Ok for trunk? gcc/ChangeLog: PR tree-optimization/111820 PR tree-optimization/111833 * tree-vect-loop-manip.cc

Re: [PATCH V2 00/14] Refactor and cleanup vsetvl pass

2023-10-18 Thread Lehua Ding
Hi Patrick, I can locally reproduce the failure of the testcase slp-7.c (the reason is because I missed a small part of the code while splitting the patch). But I can't reproduce the testcases of the assembly check locally, can you help me to see how the compiler options of a failed case in

Re: [PATCH v1] RISC-V: Remove the type size restriction of vectorizer

2023-10-18 Thread Richard Biener
On Wed, Oct 18, 2023 at 3:20 AM wrote: > > From: Pan Li > > The vectorizable_call has one restriction on the size of the data type. > I.e. DF to DI is allowed but SF to DI isn't. You may see the message below > when trying to vectorize a function call like lrintf. > > void > test_lrintf (long *out, float *in,

[PATCH] x86: Correct ISA enabled for clients since Arrow Lake

2023-10-18 Thread Haochen Jiang
Hi all, I just found that since ISAs enabled on Sierra Forest changed, clients since Arrow Lake will wrongly enable ENQCMD according to the current code. To avoid messing up again in the future, I changed the dependency on how ISAs are enabled currently by making clients depend on clients and

Re: [PATCH] Support g++ 4.8 as a host compiler.

2023-10-18 Thread Eric Gallager
On Sun, Oct 15, 2023 at 7:43 AM Richard Sandiford wrote: > > "Roger Sayle" writes: > > I'd like to ping my patch for restoring bootstrap using g++ 4.8.5 > > (the system compiler on RHEL 7 and later systems). > > https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632008.html > > > > Note the

Re: [PATCH] Avoid compile time hog on vect_peel_nonlinear_iv_init for nonlinear induction vec_step_op_mul when iteration count is too big. 65; 6800; 1c There's loop in vect_peel_nonlinear_iv_init to

2023-10-18 Thread Hongtao Liu
On Wed, Oct 18, 2023 at 4:33 PM liuhongt wrote: > Cut from subject... There's a loop in vect_peel_nonlinear_iv_init to get init_expr * pow (step_expr, skip_niters). When skipn_iters is too big, compile time hogs. To avoid that, optimize init_expr * pow (step_expr, skip_niters) to init_expr <<
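The snippet above is cut off after `init_expr <<`, so the exact transformation in the patch is not visible here. Purely as an illustration of why a shift can replace the loop, the sketch below assumes a power-of-two step and wrapping 32-bit unsigned arithmetic; the function names and the `log2_step` parameter are hypothetical.

```c
#include <stdint.h>

/* Repeated multiplication: work grows linearly with skip_niters.
   Assumes log2_step < 32. */
uint32_t peel_by_loop(uint32_t init, unsigned log2_step, uint64_t skip_niters)
{
    uint32_t step = (uint32_t)1 << log2_step;
    for (uint64_t i = 0; i < skip_niters; i++)
        init *= step;              /* wraps modulo 2^32 */
    return init;
}

/* Closed form for a power-of-two step: init * 2^(log2_step * skip_niters).
   Once the total shift reaches the type width, the wrapped product is 0. */
uint32_t peel_by_shift(uint32_t init, unsigned log2_step, uint64_t skip_niters)
{
    if (log2_step == 0)
        return init;               /* step == 1: nothing changes */
    if (skip_niters >= 32)
        return 0;                  /* shift count would be >= 32 */
    uint64_t shift = (uint64_t)log2_step * skip_niters;
    return shift >= 32 ? 0 : init << shift;
}
```

The closed form runs in constant time no matter how large `skip_niters` is, which is the property that avoids the compile-time hog described in the message.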

Pushed: [PATCH] LoongArch: Use fcmp.caf.s instead of movgr2cf for zeroing a fcc

2023-10-18 Thread Xi Ruoyao
On Wed, 2023-10-18 at 09:34 +0800, chenglulu wrote: > > On 2023/10/17 at 10:24 PM, WANG Xuerui wrote: > > > > On 10/17/23 22:06, Xi Ruoyao wrote: > > > During the review of a LLVM change [1], on LA464 we found that zeroing > > "an" LLVM change (because the word LLVM is pronounced letter-by-letter) > > >

Re: [r14-4629 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64

2023-10-18 Thread Richard Biener
On Wed, 18 Oct 2023, Jiang, Haochen wrote: > On Linux/x86_64, > > 3179ad72f67f31824c444ef30ef171ad7495d274 is the first bad commit > commit 3179ad72f67f31824c444ef30ef171ad7495d274 > Author: Richard Biener rguent...@suse.de > Date: Fri Oct 13 12:32:51 2023 +0200 > >

Re: [r14-4629 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64

2023-10-18 Thread Richard Biener
On Wed, 18 Oct 2023, Jakub Jelinek wrote: > On Wed, Oct 18, 2023 at 07:14:36AM +, Richard Biener wrote: > > It's interesting that when the target has AVX512 enabled we get > > AVX512 style masks used also for SSE and AVX vector sizes but the > > OMP SIMD clones for SSE and AVX vector sizes

[PATCH] RISC-V: Add popcount fallback expander.

2023-10-18 Thread Robin Dapp
Hi, as I didn't manage to get back to the generic vectorizer fallback for popcount in time (still the generic costing problem) I figured I'd rather implement the popcount fallback in the riscv backend. It uses the WWG algorithm from libgcc. rvv.exp is unchanged, vect and dg.exp testsuites are
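For reference, a scalar C sketch of the classic parallel bit-count, assuming "WWG" here refers to the Wilkes-Wheeler-Gill algorithm. It is not the code from the patch or from libgcc, and the function name is hypothetical; a vector fallback would apply the same steps to every element.

```c
#include <stdint.h>

/* Parallel bit count for a 32-bit value. */
uint32_t popcount32(uint32_t x)
{
    /* Each 2-bit field now holds the number of set bits in that field. */
    x = x - ((x >> 1) & 0x55555555u);
    /* Sum adjacent 2-bit counts into 4-bit fields. */
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
    /* Sum adjacent 4-bit counts into bytes (each byte is at most 8). */
    x = (x + (x >> 4)) & 0x0f0f0f0fu;
    /* The multiply adds all four bytes into the top byte. */
    return (x * 0x01010101u) >> 24;
}
```

The sequence uses only shifts, masks, adds and one multiply, which is why it is a convenient fallback when no native popcount instruction is available.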

Re: [PATCH] RISC-V: Optimize consecutive permutation index pattern by vrgather.vi/vx

2023-10-18 Thread Lehua Ding
Committed, thanks Robin. On 2023/10/18 15:53, Robin Dapp wrote: LGTM. Regards Robin -- Best, Lehua (RiVAI) lehua.d...@rivai.ai

[PATCH] RISC-V: Add popcount fallback expander.

2023-10-18 Thread juzhe.zh...@rivai.ai
LGTM popcount patch. juzhe.zh...@rivai.ai

[PATCH V2 1/7] aarch64: Sync system register information with Binutils

2023-10-18 Thread Victor Do Nascimento
This patch adds the `aarch64-sys-regs.def' file, originally written for Binutils, to GCC. In so doing, it provides GCC with the necessary information for teaching the compiler about system registers known to the assembler and how these can be used. By aligning the representation of data common to

[PATCH V2 2/7] aarch64: Add support for aarch64-sys-regs.def

2023-10-18 Thread Victor Do Nascimento
This patch defines the structure of a new .def file used for representing the aarch64 system registers, what information it should hold and the basic framework in GCC to process this file. Entries in the aarch64-system-regs.def file should be as follows: SYSREG (NAME, CPENC (sn,op1,cn,cm,op2),

[PATCH V2 7/7] aarch64: Add system register duplication check selftest

2023-10-18 Thread Victor Do Nascimento
Add a build-time test to check whether system register data, as imported from `aarch64-sys-reg.def' has any duplicate entries. Duplicate entries are defined as any two SYSREG entries in the .def file which share the same encoding values (as specified by its `CPENC' field) and where the

[PATCH V2 4/7] aarch64: Add basic target_print_operand support for CONST_STRING

2023-10-18 Thread Victor Do Nascimento
Motivated by the need to print system register names in output assembly, this patch adds the required logic to `aarch64_print_operand' to accept rtxs of type CONST_STRING and process these accordingly. Consequently, an rtx such as: (set (reg/i:DI 0 x0) (unspec:DI [(const_string

aarch64: Replace duplicated selftests

2023-10-18 Thread Andrew Carlotti
Pushed as obvious. gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_test_fractional_cost): Test <= instead of testing < twice. diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index

Re: [V3][PATCH 1/3] Provide counted_by attribute to flexible array member field (PR108896)

2023-10-18 Thread Qing Zhao
> On Oct 18, 2023, at 11:18 AM, Siddhesh Poyarekar wrote: > > On 2023-10-18 10:51, Qing Zhao wrote: > + member FIELD_DECL is a valid field of the containing structure's > fieldlist, > + FIELDLIST, Report error and remove this attribute when it's not. */ > +static void

Re: Re: [PATCH] RISC-V: Add popcount fallback expander.

2023-10-18 Thread 钟居哲
By the way, could you mention this PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111791 and add the test from this PR? juzhe.zh...@rivai.ai From: Robin Dapp Date: 2023-10-18 21:51 To: juzhe.zh...@rivai.ai; gcc-patches; palmer; kito.cheng; jeffreyalaw CC: rdapp.gcc

Re: [PATCH v3 1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-10-18 Thread Jason Merrill
On 10/18/23 07:46, waffl3x wrote: Any progress on this, or do I need to coax the process along? :) Yeah, I've been working on it since the copyright assignment process has finished, originally I was going to note that on my next update which I had hoped to finish today or tomorrow. Well, in

Re: [V3][PATCH 1/3] Provide counted_by attribute to flexible array member field (PR108896)

2023-10-18 Thread Qing Zhao
Hi, Sid, Thanks a lot for your time and effort to review this patch set! And sorry for my late reply due to a long vacation immediately after Cauldron, just came back this Monday.. See my reply embedded below: > On Oct 5, 2023, at 2:51 PM, Siddhesh Poyarekar wrote: > > On 2023-08-25 11:24,

[PATCH V2 5/7] aarch64: Implement system register r/w arm ACLE intrinsic functions

2023-10-18 Thread Victor Do Nascimento
Implement the aarch64 intrinsics for reading and writing system registers with the following signatures: uint32_t __arm_rsr(const char *special_register); uint64_t __arm_rsr64(const char *special_register); void* __arm_rsrp(const char *special_register); float

[PATCH V2 0/7] aarch64: Add support for __arm_rsr and __arm_wsr ACLE function family

2023-10-18 Thread Victor Do Nascimento
This revision of the patch series addresses the following key pieces of upstream feedback: * `aarch64-sys-regs.def', being identical in content to the file with the same name in Binutils, now retains the copyright header from Binutils. * We migrate away from the binary search handling of

[0/3] target_version and aarch64 function multiversioning

2023-10-18 Thread Andrew Carlotti
This series adds support for function multiversioning on aarch64. There are a few minor issues in patch 2/3, that I intend to fix in future versions or follow-up patches. I also have some open questions about the correctness of existing function multiversioning implementations [1], that could

[2/3] [aarch64] Add function multiversioning support

2023-10-18 Thread Andrew Carlotti
This adds initial support for function multiversioning on aarch64 using the target_version and target_clones attributes. This mostly follows the Beta specification in the ACLE [1], with a few differences that remain to be fixed: - Symbol mangling for target_clones differs from that for target_version

Re: aarch64, vect, omp: Add SVE support for simd clones [PR 96342]

2023-10-18 Thread Andre Vieira (lists)
Hi, I noticed I had missed one of the preparatory patches at the start of this series (the first one), added now. I also removed the 'vect: Add vector_mode parameter to simd_clone_usable' since after review we no longer deemed it necessary. And replaced the old vect: Add

Re: [PATCH 1/8] parloops: Copy target and optimizations when creating a function clone

2023-10-18 Thread Andre Vieira (lists)
Just posting a rebase for completion. On 30/08/2023 13:31, Richard Biener wrote: On Wed, 30 Aug 2023, Andre Vieira (lists) wrote: SVE simd clones require to be compiled with a SVE target enabled or the argument types will not be created properly. To achieve this we need to copy

Re: [Patch 2/8] parloops: Allow poly nit and bound

2023-10-18 Thread Andre Vieira (lists)
Posting the changed patch for completion, already reviewed. On 30/08/2023 13:32, Richard Biener wrote: On Wed, 30 Aug 2023, Andre Vieira (lists) wrote: Teach parloops how to handle a poly nit and bound ahead of the changes to enable non-constant simdlen. Can you use poly_int_tree_p to

Re: [V3][PATCH 1/3] Provide counted_by attribute to flexible array member field (PR108896)

2023-10-18 Thread Qing Zhao
>>> + member FIELD_DECL is a valid field of the containing structure's >>> fieldlist, >>> + FIELDLIST, Report error and remove this attribute when it's not. */ >>> +static void >>> +verify_counted_by_attribute (tree fieldlist, tree field_decl) >>> +{ >>> + tree attr_counted_by =

Re: [V3][PATCH 1/3] Provide counted_by attribute to flexible array member field (PR108896)

2023-10-18 Thread Siddhesh Poyarekar
On 2023-10-18 10:51, Qing Zhao wrote: + member FIELD_DECL is a valid field of the containing structure's fieldlist, + FIELDLIST, Report error and remove this attribute when it's not. */ +static void +verify_counted_by_attribute (tree fieldlist, tree field_decl) +{ + tree attr_counted_by

[1/3] Add support for target_version attribute

2023-10-18 Thread Andrew Carlotti
This patch adds support for the "target_version" attribute to the middle end and the C++ frontend, which will be used to implement function multiversioning in the aarch64 backend. Note that C++ is currently the only frontend which supports multiversioning using the "target" attribute, whereas the

Re: [Backport RFA] lra: Avoid unfolded plus-0

2023-10-18 Thread Vladimir Makarov
On 10/18/23 09:37, Richard Sandiford wrote: Vlad, is it OK if I backport the patch below to fix https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111528 ? Jakub has given a conditional OK on irc. Ok.  It should be safe.  I don't expect any issues because of this.

[PATCH] vect: Allow same precision for bit-precision conversions.

2023-10-18 Thread Robin Dapp
Hi, even though there was no full conclusion yet I took the liberty of just posting this as a patch in case of further discussion. In PR/111794 we miss a vectorization because on riscv type precision and mode precision differ for mask types. We can still vectorize when allowing assignments with

RE: [x86 PATCH] PR target/110551: Fix reg allocation for widening multiplications.

2023-10-18 Thread Roger Sayle
Many thanks to Tobias Burnus for pointing out the mistake/typo in the PR number. This fix is for PR 110551, not PR 110511. I'll update the ChangeLog and filename of the new testcase, if approved. Sorry for any inconvenience/confusion. Cheers, Roger -- > -Original Message- > From:

Re: [PATCH 8/8] aarch64: Add SVE support for simd clones [PR 96342]

2023-10-18 Thread Andre Vieira (lists)
Rebased, no major changes, still needs review. On 30/08/2023 10:19, Andre Vieira (lists) via Gcc-patches wrote: This patch finalizes adding support for the generation of SVE simd clones when no simdlen is provided, following the ABI rules where the widest data type determines the minimum

[PATCH6/8] omp: Reorder call for TARGET_SIMD_CLONE_ADJUST (was Re: [PATCH7/8] vect: Add TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM)

2023-10-18 Thread Andre Vieira (lists)
This patch moves the call to TARGET_SIMD_CLONE_ADJUST until after the arguments and return types have been transformed into vector types. It also constructs the adjustments and retval modifications after this call, allowing targets to alter the types of the arguments and return of the clone

Re: [PATCH 5/8] vect: Use inbranch simdclones in masked loops

2023-10-18 Thread Andre Vieira (lists)
Rebased, needs review. On 30/08/2023 10:13, Andre Vieira (lists) via Gcc-patches wrote: This patch enables the compiler to use inbranch simdclones when generating masked loops in autovectorization. gcc/ChangeLog: * omp-simd-clone.cc (simd_clone_adjust_argument_types): Make function

Re: [Patch 3/8] vect: Fix vect_get_smallest_scalar_type for simd clones

2023-10-18 Thread Andre Vieira (lists)
Made it a local function and changed prototype according to comments. Is this OK? gcc/ChangeLog: * tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Special case simd clone calls and only use types that are mapped to vectors. (simd_clone_call_p): New

[PATCH 0/8] omp: Replace simd_clone_subparts with TYPE_VECTOR_SUBPARTS

2023-10-18 Thread Andre Vieira (lists)
Refactor simd clone handling code ahead of support for poly simdlen. gcc/ChangeLog: * omp-simd-clone.cc (simd_clone_subparts): Remove. (simd_clone_init_simd_arrays): Replace simd_clone_subparts with TYPE_VECTOR_SUBPARTS. (ipa_simd_modify_function_body):

Re: [PATCH 4/8] vect: don't allow fully masked loops with non-masked simd clones [PR 110485]

2023-10-18 Thread Andre Vieira (lists)
Rebased on top of trunk, minor change to check if loop_vinfo since we now do some slp vectorization for simd_clones. I assume the previous OK still holds. On 30/08/2023 13:54, Richard Biener wrote: On Wed, 30 Aug 2023, Andre Vieira (lists) wrote: When analyzing a loop and choosing a

Re: [PATCH] vect: Allow same precision for bit-precision conversions.

2023-10-18 Thread Richard Biener
> On 18.10.2023 at 16:19, Robin Dapp wrote: > > Hi, > > even though there was no full conclusion yet I took the liberty of > just posting this as a patch in case of further discussion. > > In PR/111794 we miss a vectorization because on riscv type precision and > mode precision differ for

Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-18 Thread Siddhesh Poyarekar
[Sorry, I forgot to respond to this] On 2023-10-06 16:01, Martin Uecker wrote: On Friday, 06.10.2023 at 06:50 -0400, Siddhesh Poyarekar wrote: On 2023-10-06 01:11, Martin Uecker wrote: On Thursday, 05.10.2023 at 15:35 -0700, Kees Cook wrote: On Thu, Oct 05, 2023 at 04:08:52PM

[3/3] WIP/RFC: Fix name mangling for target_clones

2023-10-18 Thread Andrew Carlotti
This is a partial patch to make the mangling of function version names for target_clones match those generated using the target or target_version attributes. It modifies the name of function versions, but does not yet rename the resolved symbol, resulting in a duplicate symbol name (and an error

RE: [Patch] nvptx: Use fatal_error when -march= is missing not an assert [PR111093]

2023-10-18 Thread Roger Sayle
Hi Tomas, Tobias and Tom, Thanks for asking. Interestingly, I've a patch (attached) from last year that tackled some of the issues here. The surface problem is that nvptx's march and misa are related in complicated ways. Specifying an arch defines the range of valid isa's, and specifying an

Re: [PATCH v2] swap: Fix incorrect lane extraction by vec_extract() [PR106770]

2023-10-18 Thread David Edelsohn
[Resending from correct email.] Hi, Surya Thanks for working on this issue and creating a patch. It helps if you explicitly send patches to Segher and me, and copy gcc-patches. +/* Return true if insn is a non-permuting load/store. */ +static bool +non_permuting_mem_insn (swap_web_entry

Re: [PATCH v2] swap: Fix incorrect lane extraction by vec_extract() [PR106770]

2023-10-18 Thread David Edelsohn
Hi, Surya Thanks for working on this issue and creating a patch. It helps if you explicitly send patches to Segher and me, and copy gcc-patches. +/* Return true if insn is a non-permuting load/store. */ +static bool +non_permuting_mem_insn (swap_web_entry *insn_entry, unsigned int i) +{ +

[PATCH V2 3/7] aarch64: Implement system register validation tools

2023-10-18 Thread Victor Do Nascimento
Given the implementation of a mechanism of encoding system registers into GCC, this patch provides the mechanism of validating their use by the compiler. In particular, this involves: 1. Ensuring a supplied string corresponds to a known system register name. System registers can be

[PATCH V2 6/7] aarch64: Add front-end argument type checking for target builtins

2023-10-18 Thread Victor Do Nascimento
In implementing the ACLE read/write system register builtins it was observed that leaving argument type checking to be done at expand-time meant that poorly-formed function calls were being "fixed" by certain optimization passes, meaning bad code wasn't being properly picked up in checking.

Re: [PATCH] libstdc++: testsuite: Enhance codecvt_unicode with tests for length()

2023-10-18 Thread Jonathan Wakely
On Tue, 17 Oct 2023 at 23:51, Dimitrij Mijoski wrote: > > We can test codecvt::length() with the same data that we test > codecvt::in(). For each call of in() we add another call to length(). > Some additional small cosmetic changes are applied. Thanks! I'll get this applied. > >

[PATCH] RISC-V: Fix failed hoist in LICM of vmv.v.x instruction

2023-10-18 Thread Juzhe-Zhong
Confirmed that the dynamic LMUL algorithm works well for choosing LMUL = 4 for the PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111848 However, it generates horrible register spills. The root cause is that we didn't hoist the vmv.v.x outside the loop, which increases the SLP loop register pressure. So,

[pushed] Darwin: Check as for .build_version support and use it if available.

2023-10-18 Thread Iain Sandoe
Tested on i686-darwin9, x86_64-darwin17,19,21 and x86_64-linux-gnu, pushed to master, thanks, Iain --- 8< --- This adds support for the minimum OS version data in assembler files. At present, we have no mechanism to detect the SDK version in use, and so that is omitted from build_versions. We

Re: [Patch] nvptx: Use fatal_error when -march= is missing not an assert [PR111093]

2023-10-18 Thread Thomas Schwinge
Hi Tobias! On 2023-10-16T11:18:45+0200, Tobias Burnus wrote: > While mkoffload ensures that there is always a -march=, nvptx's > cc1 can also be run directly. > > In my case, I wanted to know which target-specific #define are > available; hence, I did run: >accel/nvptx-none/cc1 -E -dM <

Re: [PATCH] RISC-V: Fix failed hoist in LICM of vmv.v.x instruction

2023-10-18 Thread juzhe.zh...@rivai.ai
Forget about this patch. Commit log code example is wrong, fixed it in V2: https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633420.html Thanks. juzhe.zh...@rivai.ai From: Juzhe-Zhong Date: 2023-10-18 18:21 To: gcc-patches CC: kito.cheng; kito.cheng; jeffreyalaw; rdapp.gcc;

Re: [Patch] OpenMP: Avoid ICE with LTO and 'omp allocate (was: [Patch] Fortran: Support OpenMP's 'allocate' directive for stack vars)

2023-10-18 Thread Jakub Jelinek
On Wed, Oct 18, 2023 at 12:56:01PM +0200, Tobias Burnus wrote: > On 18.10.23 11:36, Jakub Jelinek wrote: > > On Wed, Oct 18, 2023 at 11:12:44AM +0200, Thomas Schwinge wrote: > > > +FAIL: gfortran.dg/gomp/allocate-13.f90 -O (internal compiler > > > error: tree code 'statement_list' is not

RE: [PATCH v1] RISC-V: Remove the type size restriction of vectorizer

2023-10-18 Thread Li, Pan2
Thanks Richard, let's wait for a while in case there are comments from others due to not being familiar with these parts. Pan -Original Message- From: Richard Biener Sent: Wednesday, October 18, 2023 2:34 PM To: Li, Pan2 Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang ;

[PATCH V5] VECT: Enhance SLP of MASK_LEN_GATHER_LOAD[PR111721]

2023-10-18 Thread Juzhe-Zhong
This patch fixes the following FAILs in RISC-V regression: FAIL: gcc.dg/vect/vect-gather-1.c -flto -ffat-lto-objects scan-tree-dump vect "Loop contains only SLP stmts" FAIL: gcc.dg/vect/vect-gather-1.c scan-tree-dump vect "Loop contains only SLP stmts" FAIL: gcc.dg/vect/vect-gather-3.c -flto

Re: [Patch] Fortran: Support OpenMP's 'allocate' directive for stack vars

2023-10-18 Thread Jakub Jelinek
On Wed, Oct 18, 2023 at 11:12:44AM +0200, Thomas Schwinge wrote: > Hi Tobias! > > On 2023-10-13T15:29:52+0200, Tobias Burnus wrote: > > => Updated patch attached > > When cherry-picking this commit 2d3dbf0eff668bed5f5f168b3cafd8590c54 > "Fortran: Support OpenMP's 'allocate' directive for

Re: [Patch] OpenMP: Add ME support for 'omp allocate' stack variables

2023-10-18 Thread Thomas Schwinge
Hi Tobias! No need to change anything now, but in case that's useful later: On 2023-09-18T14:22:50+0200, Tobias Burnus wrote: > --- /dev/null > +++ b/libgomp/testsuite/libgomp.c/allocate-4.c > @@ -0,0 +1,84 @@ > +/* TODO: move to ../libgomp.c-c++-common once C++ is implemented. */ > +/* NOTE: {

[Patch] OpenMP: Avoid ICE with LTO and 'omp allocate (was: [Patch] Fortran: Support OpenMP's 'allocate' directive for stack vars)

2023-10-18 Thread Tobias Burnus
On 18.10.23 11:36, Jakub Jelinek wrote: On Wed, Oct 18, 2023 at 11:12:44AM +0200, Thomas Schwinge wrote: +FAIL: gfortran.dg/gomp/allocate-13.f90 -O (internal compiler error: tree code 'statement_list' is not supported in LTO streams) Any references to GENERIC code in clauses etc.

Re: [PING] [PATCH] Harmonize headers between both dg-extract-results scripts

2023-10-18 Thread Thomas Schwinge
Hi! On 2023-09-29T08:52:24-0600, Jeff Law wrote: > On 9/29/23 02:19, Paul Iannetta wrote: >> On Tue, Sep 26, 2023 at 08:29:11AM -0600, Jeff Law wrote: >>> On 9/25/23 03:55, Paul Iannetta wrote: On Mon, Sep 18, 2023 at 08:39:34AM +0200, Paul Iannetta wrote: > On Thu, Sep 14, 2023 at

Re: [PATCH] Support g++ 4.8 as a host compiler.

2023-10-18 Thread Richard Sandiford
Jakub Jelinek writes: > On Sun, Oct 15, 2023 at 12:43:10PM +0100, Richard Sandiford wrote: >> It seemed like there was considerable support for bumping the minimum >> to beyond 4.8. I think we should wait until a decision has been made >> before adding more 4.8 workarounds. > > I think adding a

Re: [PATCH] tree-ssa-math-opts: Fix up match_uaddc_usubc [PR111845]

2023-10-18 Thread Richard Biener
On Wed, 18 Oct 2023, Jakub Jelinek wrote: > Hi! > > GCC ICEs on the first testcase. Successful match_uaddc_usubc ends up with > some dead stmts which DCE will remove (hopefully) later all. > The ICE is because one of the dead stmts refers to a freed SSA_NAME. > The code already gsi_removes a

[PATCH v2] libstdc++: testsuite: Enhance codecvt_unicode with tests for length()

2023-10-18 Thread Dimitrij Mijoski
We can test codecvt::length() with the same data that we test codecvt::in(). For each call of in() we add another call to length(). Some additional small cosmetic changes are applied. libstdc++-v3/ChangeLog: * testsuite/22_locale/codecvt/codecvt_unicode.h: Test length() ---

Re: [PATCH] Support g++ 4.8 as a host compiler.

2023-10-18 Thread Jakub Jelinek
On Wed, Oct 18, 2023 at 11:23:49AM +0100, Richard Sandiford wrote: > > --- a/gcc/cse.cc > > +++ b/gcc/cse.cc > > @@ -4951,8 +4951,14 @@ cse_insn (rtx_insn *insn) > > && is_a (mode, _mode) > > && (extend_op = load_extend_op (int_mode)) != UNKNOWN) > > { > > +#if

Re: [PATCH v3 1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-10-18 Thread waffl3x
> Any progress on this, or do I need to coax the process along? :) Yeah, I've been working on it since the copyright assignment process has finished, originally I was going to note that on my next update which I had hoped to finish today or tomorrow. Well, in truth I was hoping to send one the

Re: [PATCH V5] VECT: Enhance SLP of MASK_LEN_GATHER_LOAD[PR111721]

2023-10-18 Thread juzhe.zh...@rivai.ai
Hi, this patch fixes the V4 issue. As Richard S commented previously: https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633178.html slp_op and mask_vectype are only initialised when mask_index >= 0. Shouldn't this code be under mask_index >= 0 too? Also, when do we encounter mismatched

Re: [Patch] OpenMP: Add ME support for 'omp allocate' stack variables

2023-10-18 Thread Tobias Burnus
Hi Thomas, On 18.10.23 11:44, Thomas Schwinge wrote: No need to change anything now, but in case that's useful later: [...] ..., just noting that '{ target c }', '{ target c++ }' are trivial to implement; see libgomp OpenACC testing: libgomp/testsuite/libgomp.oacc-c/c.exp:proc

Re: [PATCH] Support g++ 4.8 as a host compiler.

2023-10-18 Thread Jakub Jelinek
On Sun, Oct 15, 2023 at 12:43:10PM +0100, Richard Sandiford wrote: > It seemed like there was considerable support for bumping the minimum > to beyond 4.8. I think we should wait until a decision has been made > before adding more 4.8 workarounds. I think adding a workaround until that decision

Re: [PATCH] Avoid compile time hog on vect_peel_nonlinear_iv_init for nonlinear induction vec_step_op_mul when iteration count is too big. There's loop in vect_peel_nonlinear_iv_init to get

2023-10-18 Thread Richard Biener
On Wed, 18 Oct 2023, liuhongt wrote: > Also give up vectorization when niters_skip is negative which will be > used for fully masked loop. > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? > > gcc/ChangeLog: > > PR tree-optimization/111820 > PR

Re: [PATCH] Support g++ 4.8 as a host compiler.

2023-10-18 Thread Jakub Jelinek
On Wed, Oct 18, 2023 at 01:33:40PM +0200, Jakub Jelinek wrote: > Making it guaranteed that it has at least one argument say through > template poly_int(const U &, const T &...) {} > fixes it for 4.8/4.9 as well. So, perhaps (but so far totally untested, the other bootstrap is still running):
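
The "at least one argument" idea mentioned above can be sketched in isolation as follows. This is only an illustration of the general pattern, not the patch itself: `coeffs` is a hypothetical stand-in and is not GCC's poly_int, and the sketch does not attempt to reproduce the g++ 4.8/4.9 problem being discussed. The point is that a constructor template with one mandatory parameter followed by a parameter pack can never be chosen for a zero-argument call, so it cannot compete with the default constructor.

```cpp
#include <cstddef>

// Hypothetical stand-in for a small coefficient container (NOT GCC's poly_int).
template <std::size_t N>
struct coeffs {
    long v[N];

    // Default constructor: all coefficients are zero.
    coeffs() : v{} {}

    // One mandatory parameter (U) followed by a pack (T...): this template
    // always needs at least one argument, so a zero-argument call can only
    // resolve to the default constructor above.
    template <typename U, typename... T>
    coeffs(const U &first, const T &...rest)
        : v{static_cast<long>(first), static_cast<long>(rest)...} {
        static_assert(1 + sizeof...(T) <= N, "too many coefficients");
    }
};
```

With this shape, `coeffs<3> a(1, 2);` fills the first two elements and zero-initializes the third, while `coeffs<3> z;` uses the default constructor.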

Re: [PATCH] RISC-V: Add popcount fallback expander.

2023-10-18 Thread Robin Dapp
> I saw you didn't extend VI -> V_VLSI. I guess it will fail SLP on > popcount. Hehe, right, I just copied and pasted the expander from my old patch. Will adjust it and add the test. Regards Robin

[PATCH] tree-ssa-math-opts: Fix up match_uaddc_usubc [PR111845]

2023-10-18 Thread Jakub Jelinek
Hi! GCC ICEs on the first testcase. Successful match_uaddc_usubc ends up with some dead stmts which DCE will (hopefully) remove later. The ICE occurs because one of the dead stmts refers to a freed SSA_NAME. The code already gsi_removes a couple of stmts in the /* Remove some statements which

Re: [Patch] OpenMP: Add ME support for 'omp allocate' stack variables

2023-10-18 Thread Thomas Schwinge
Hi Tobias! On 2023-10-18T11:53:30+0200, Tobias Burnus wrote: > On 18.10.23 11:44, Thomas Schwinge wrote: >> No need to change anything now, but in case that's useful later: >> [...] >> ..., just noting that '{ target c }', '{ target c++ }' are trivial to >> implement; see libgomp OpenACC

[PATCH V2] RISC-V: Fix failed hoist in LICM of vmv.v.x instruction

2023-10-18 Thread Juzhe-Zhong
Confirmed that the dynamic LMUL algorithm works well, choosing LMUL = 4 for the PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111848 But it generates horrible register spilling. The root cause is that we didn't hoist the vmv.v.x outside the loop, which increases the SLP loop register pressure. So,

Re: [PATCH] libstdc++: testsuite: Enhance codecvt_unicode with tests for length()

2023-10-18 Thread Dimitrij Mijoski
On Wed, 2023-10-18 at 10:52 +0100, Jonathan Wakely wrote: > On Tue, 17 Oct 2023 at 23:51, Dimitrij Mijoski wrote: > > > > We can test codecvt::length() with the same data that we use to test > > codecvt::in(). For each call to in() we add another call to length(). > > Some additional small cosmetic

Re: [PATCH V2] RISC-V: Fix failed hoist in LICM of vmv.v.x instruction

2023-10-18 Thread juzhe.zh...@rivai.ai
More details of VSETVL bug:

Loop:
   10ddc: 9ed030d7  vmv1r.v v1,v13
   10de0: b21040d7  vncvt.x.x.w v1,v1
   10de4: 5e0785d7  vmv.v.v v11,v15
   10de8: b700a5d7  vmacc.vv v11,v1,v16
   10dec:

Re: [PATCH] RISC-V: Add popcount fallback expander.

2023-10-18 Thread Robin Dapp
> I saw you didn't extend VI -> V_VLSI. I guess it will fail SLP on popcount. Added VLS modes and your test in v2. Testsuite looks unchanged on my side (vect, dg, rvv). Regards Robin Subject: [PATCH v2] RISC-V: Add popcount fallback expander. I didn't manage to get back to the generic

[Backport RFA] lra: Avoid unfolded plus-0

2023-10-18 Thread Richard Sandiford
Vlad, is it OK if I backport the patch below to fix https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111528 ? Jakub has given a conditional OK on irc. Thanks, Richard Richard Sandiford writes: > While backporting another patch to an earlier release, I hit a > situation in which

Re: [PATCH] RISC-V: Add popcount fallback expander.

2023-10-18 Thread Robin Dapp
I didn't push this yet because it would have introduced an UNRESOLVED that my summary script didn't catch. Normally I go with just contrib/test_summary but that only filters out FAIL and XPASS. I should really be using compare_testsuite_log.py from riscv-gnu-toolchain/scripts. It was caused by

[avr,committed] LibF7: Implement a function that was missing for devices without MUL.

2023-10-18 Thread Georg-Johann Lay
This implements the worker function for double multiplication for devices without MUL instruction. Johann -- LibF7: Implement mul_mant for devices without MUL instruction. libgcc/config/avr/libf7/ * libf7-asm.sx (mul_mant): Implement for devices without MUL. * asm-defs.h

[committed] pru: Implement TARGET_INSN_COST

2023-10-18 Thread Dimitar Dimitrov
This patch slightly improves the embench-iot benchmark score for PRU code size. There is also a small improvement in a few real-world firmware programs. Embench-iot size -- Benchmark before after delta -

Re: [PATCH v3 1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-10-18 Thread waffl3x
> > I will try to get something done today, but I was struggling with > > writing some of the tests, there's also a lot more of them now. I also > > wrote a bunch of musings in comments that I would like feedback on. > > > > My most concrete question is, how exactly should I be testing a > >

Re: [x86 PATCH] PR 106245: Split (x<<31)>>31 as -(x&1) in i386.md

2023-10-18 Thread Uros Bizjak
On Tue, Oct 17, 2023 at 7:54 PM Roger Sayle wrote: > > > Hi Uros, > Thanks for the speedy review. > > > From: Uros Bizjak > > Sent: 17 October 2023 17:38 > > > > On Tue, Oct 17, 2023 at 3:08 PM Roger Sayle > > wrote: > > > > > > > > > This patch is the backend piece of a solution to PRs 101955
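
The identity named in the subject line, (x<<31)>>31 == -(x&1) for a 32-bit value, can be checked with a short sketch. This is not the i386.md change itself, only the arithmetic behind it; the function names are hypothetical. It assumes that converting an out-of-range unsigned value to int32_t wraps modulo 2^32 and that >> on a negative int32_t is an arithmetic shift (both guaranteed since C++20, and what GCC does in earlier modes).

```cpp
#include <cstdint>

// Shift form: move bit 0 into the sign position, then shift it back
// arithmetically so that it is replicated across all 32 bits.
// The left shift is done on an unsigned value to avoid signed overflow.
inline std::int32_t shift_form(std::uint32_t x) {
    return static_cast<std::int32_t>(x << 31) >> 31;
}

// Equivalent form: isolate bit 0 and negate it, giving 0 or -1.
inline std::int32_t neg_and_form(std::uint32_t x) {
    return -static_cast<std::int32_t>(x & 1u);
}

// True when both forms produce the same result for x.
inline bool forms_agree(std::uint32_t x) {
    return shift_form(x) == neg_and_form(x);
}
```

Both forms yield 0 for even inputs and -1 (all bits set) for odd inputs, since only bit 0 of x survives the left shift by 31.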
