Re: RISC-V: Fix round_32.c test on RV32

2024-05-31 Thread Jeff Law
On 5/27/24 4:17 PM, Jivan Hakobyan wrote: Ya, makes sense -- I guess the current values aren't that exciting for execution, but we could just add some more interesting ones... During the development of the patch, I have an issue with large numbers (2e34, -2e34). They are used in

Re: [RFC/RFA] [PATCH 02/12] Add built-ins and tests for bit-forward and bit-reversed CRCs

2024-05-31 Thread Jeff Law
On 5/28/24 12:44 AM, Richard Biener wrote: On Mon, May 27, 2024 at 5:16 PM Jeff Law wrote: On 5/27/24 12:38 AM, Richard Biener wrote: On Fri, May 24, 2024 at 10:44 AM Mariam Arutunian wrote: This patch introduces new built-in functions to GCC for computing bit-forward and bit

Re: [PATCH 5/5][v3] RISC-V: Avoid inserting after a GIMPLE_COND with SLP and early break

2024-05-31 Thread Jeff Law
On 5/31/24 7:44 AM, Richard Biener wrote: When vectorizing an early break loop with LENs (do we miss some check here to disallow this?) we can end up deciding to insert stmts after a GIMPLE_COND when doing SLP scheduling and trying to be conservative with placing of stmts only dependent on

Re: [PING] [PATCH] RISC-V: Add Zfbfmin extension

2024-05-31 Thread Jeff Law
On 5/30/24 5:38 AM, Xiao Zeng wrote: 1 In the previous patch, the libcall for BF16 was implemented: 2 Riscv provides Zfbfmin extension, which completes the "Scalar BF16 Converts":

Re: [PATCH] ifcvt: Clarify if_info.original_cost.

2024-05-31 Thread Jeff Law
On 5/31/24 9:03 AM, Robin Dapp wrote: Hi, before noce_find_if_block processes a block it sets up an if_info structure that holds the original costs. At that point the costs of the then/else blocks have not been added so we only care about the "if" cost. The code originally used BRANCH_COST

Re: [PATCH] RISC-V: Add min/max patterns for ifcvt.

2024-05-31 Thread Jeff Law
On 5/31/24 9:07 AM, Robin Dapp wrote: Hi, ifcvt likes to emit (set (if_then_else) (ge (reg 1) (reg2)) (reg 1) (reg 2)) which can be recognized as min/max patterns in the backend. This patch adds such patterns and the respective iterators as well as a test. This depends

Re: Reverted recent patches to resource.cc

2024-05-30 Thread Jeff Law
On 5/30/24 8:09 PM, Hans-Peter Nilsson wrote: Date: Wed, 29 May 2024 21:23:58 -0600 Cc: gcc-patches@gcc.gnu.org I don't bother with qemu.exp at all. I've set up binfmt handlers so that I can execute foreign binaries. So given a root filesystem, I can chroot into it and do whatever I

[to-be-committed] [RISC-V] Use Zbkb for general 64 bit constants when profitable

2024-05-30 Thread Jeff Law
Basically this adds the ability to generate two independent constants during synthesis, then bring them together with a pack instruction. Thus we never need to go out to the constant pool when zbkb is enabled. The worst sequence we ever generate is lui+addi+lui+addi+pack Obviously if either
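The synthesis described in this post can be modeled in plain C. This is only a sketch of the idea, not code from the patch: `pack64`, `li32` and `synthesize_with_pack` are hypothetical helper names. `li32` stands in for a `lui`+`addi` pair, which on RV64 leaves a sign-extended 32-bit value in the register.

```c
#include <stdint.h>

/* Model of the RV64 Zbkb instruction `pack rd, rs1, rs2`: the low 32 bits
   of rs1 become the low half of rd, the low 32 bits of rs2 the high half.
   The upper halves of both sources are ignored.  */
uint64_t pack64(uint64_t rs1, uint64_t rs2)
{
    return (rs2 << 32) | (rs1 & 0xffffffffu);
}

/* Hypothetical stand-in for lui+addi on RV64: a 32-bit value that is
   sign-extended into the 64-bit register.  */
uint64_t li32(uint32_t value)
{
    return (uint64_t)(int64_t)(int32_t)value;
}

/* Build each half independently, then join the halves with pack.  This
   mirrors the worst case lui+addi+lui+addi+pack named in the post.  */
uint64_t synthesize_with_pack(uint64_t c)
{
    uint64_t lo = li32((uint32_t)c);
    uint64_t hi = li32((uint32_t)(c >> 32));
    return pack64(lo, hi);
}
```

Because `pack` discards the upper half of each source register, the sign extension left behind by the low-half load does not disturb the result.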

Re: Reverted recent patches to resource.cc

2024-05-29 Thread Jeff Law
On 5/29/24 8:41 PM, Hans-Peter Nilsson wrote: I do bootstraps and regression testsuite runs on a variety of systems via qemu (alpha, m68k, aarch64, s390, ppc64, etc). It ain't fast, but it does work if QEMU is in pretty good shape and you can find a root filesystem to use. That might

Re: Reverted recent patches to resource.cc

2024-05-29 Thread Jeff Law
On 5/29/24 7:28 PM, Hans-Peter Nilsson wrote: From: Hans-Peter Nilsson Date: Mon, 27 May 2024 19:51:47 +0200 2: Does not depend on 1, but corrects an incidentally found wart: find_basic_block calls fail too often. Replace it with "modern" insn-to-basic-block cross-referencing. 3: Just

Re: [RFC/RFA] [PATCH 08/12] Add a new pass for naive CRC loops detection

2024-05-29 Thread Jeff Law
On 5/28/24 1:01 AM, Richard Biener wrote: On Fri, May 24, 2024 at 10:46 AM Mariam Arutunian wrote: This patch adds a new compiler pass aimed at identifying naive CRC implementations, characterized by the presence of a loop calculating a CRC (polynomial long division). Upon detection of a

[to-be-committed] [RISC-V] Use pack to handle repeating constants

2024-05-28 Thread Jeff Law
This patch utilizes zbkb to improve the code we generate for 64bit constants when the high half is a duplicate of the low half. Basically we generate the low half and use a pack instruction with that same register repeated. ie pack dest,src,src That gives us a maximum sequence of 3

Re: [RFC/RFA] [PATCH 08/12] Add a new pass for naive CRC loops detection

2024-05-27 Thread Jeff Law
On 5/24/24 2:42 AM, Mariam Arutunian wrote: This patch adds a new compiler pass aimed at identifying naive CRC implementations, characterized by the presence of a loop calculating a CRC (polynomial long division). Upon detection of a potential CRC, the pass prints an informational message.

Re: [PATCH 4/4] resource.cc: Remove redundant conditionals

2024-05-27 Thread Jeff Law
On 5/27/24 11:54 AM, Hans-Peter Nilsson wrote: Regtested cris-elf. Ok to commit? -- >8 -- No functional change. - We always have a target_hash_table and bb_ticks because init_resource_info is always called. These conditionals are an ancient artifact: it's been quite a while since

Re: [PATCH 3/4] resource.cc (mark_target_live_regs): Remove check for bb not found

2024-05-27 Thread Jeff Law
On 5/27/24 11:53 AM, Hans-Peter Nilsson wrote: Regtested cris-elf. Ok to commit? -- >8 -- No functional change. A "git diff -wb" (ignore whitespace diff) shows that this commit just removes a "if (b != -1)" after a "gcc_assert (b != -1)" and also removes the subsequent "else" clause.

Re: [PATCH 2/4] resource.cc: Replace calls to find_basic_block with cfgrtl BLOCK_FOR_INSN

2024-05-27 Thread Jeff Law
On 5/27/24 11:52 AM, Hans-Peter Nilsson wrote: Regtested cris-elf. Ok to commit? -- >8 -- ...and call compute_bb_for_insn in init_resource_info and free_bb_for_insn in free_resource_info. I put a gcc_unreachable in that else-clause for a failing find_basic_block in mark_target_live_regs

Re: [PATCH 1/4] resource.cc (mark_target_live_regs): Don't look past target insn, PR115182

2024-05-27 Thread Jeff Law
On 5/27/24 11:52 AM, Hans-Peter Nilsson wrote: The problem is in mark_target_live_regs: it consults a hash-table indexed by insn uid, where it tracks the currently live registers with a "generation" count to handle when it moves around insn, filling delay-slots. As a fall-back, it starts

Re: [PATCH 0/4] Some improvements to resource.cc, including fixing PR115182

2024-05-27 Thread Jeff Law
On 5/27/24 11:51 AM, Hans-Peter Nilsson wrote: The code in resource.cc is exclusively used by the delay-slot-filling machinery: the "dbr" pass in reorg.cc, sometimes referred to just as "reorg". Its implementation is quite arcane, scanning RTL, with only a little dash of cfgrtl. I'm sure

[to-be-committed] [RISC-V] Some basic patterns for zbkb code generation

2024-05-27 Thread Jeff Law
And here's Lyut's basic Zbkb support. Essentially it's four new patterns for packh, packw, pack plus a bridge pattern needed for packh. packw is a bit ugly as we need to match a sign extension in an inconvenient location. We pull it out so that the extension is exposed in a convenient place

Re: [RFC/RFA] [PATCH 12/12] Add tests for CRC detection and generation.

2024-05-27 Thread Jeff Law
On 5/27/24 12:39 AM, Richard Biener wrote: On Sat, May 25, 2024 at 8:34 PM Jeff Law wrote: On 5/24/24 2:42 AM, Mariam Arutunian wrote: gcc/testsuite/gcc.c-torture/compile/ * crc-11.c: New test. * crc-15.c: Likewise. * crc-16.c: Likewise. * crc-19.c: Likewise

Re: [RFC/RFA] [PATCH 02/12] Add built-ins and tests for bit-forward and bit-reversed CRCs

2024-05-27 Thread Jeff Law
On 5/27/24 12:38 AM, Richard Biener wrote: On Fri, May 24, 2024 at 10:44 AM Mariam Arutunian wrote: This patch introduces new built-in functions to GCC for computing bit-forward and bit-reversed CRCs. These builtins aim to provide efficient CRC calculation capabilities. When the target

[to-be-committed][RISC-V] Reassociate constants in logical ops

2024-05-26 Thread Jeff Law
This patch from Lyut will reassociate operands when we have shifted logical operations. This can simplify a constant that may not fit in a simm12 into a form that does fit into a simm12. The basic work was done by Lyut. I generalized it to handle XOR/OR. It stands on its own, but also

[to-be-committed] [RISC-V] Try inverting for constant synthesis

2024-05-26 Thread Jeff Law
So there's another class of constants we're failing to synthesize well. Specifically those where we can invert our original constant C into C' and C' takes at least 2 fewer instructions to synthesize than C. In that case we can initially generate C', then use xori with the constant -1 to flip
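A small C model of the inversion trick follows. It is a sketch under assumptions: the two function names are hypothetical, the instruction counts are supplied by the caller rather than computed, and the reading that the two-instruction margin exists to pay for the trailing `xori` is an inference from the snippet, not something the post states.

```c
#include <stdint.h>

/* Decide whether to synthesize the inverted constant C' first.  Both
   arguments are instruction counts supplied by the caller (a hypothetical
   interface).  C' must be at least two instructions cheaper than C,
   presumably because one instruction is then spent on the final xori.  */
int inversion_profitable(int cost_c, int cost_inverted)
{
    return cost_inverted <= cost_c - 2;
}

/* Model of `xori rd, rs, -1`: the immediate sign-extends to all ones,
   so every bit of the register is flipped.  */
uint64_t xori_minus_one(uint64_t rs)
{
    return rs ^ UINT64_MAX;
}
```

Applying `xori_minus_one` to C' = ~C returns C, which is why the extra instruction recovers the original constant.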

Re: [PATCHv2 2/2] libiberty/buildargv: handle input consisting of only white space

2024-05-26 Thread Jeff Law
On 2/10/24 10:26 AM, Andrew Burgess wrote: GDB makes use of the libiberty function buildargv for splitting the inferior (program being debugged) argument string in the case where the inferior is not being started under a shell. I have recently been working to improve this area of GDB, and

Re: [PATCHv2 1/2] libiberty/buildargv: POSIX behaviour for backslash handling

2024-05-26 Thread Jeff Law
On 2/10/24 10:26 AM, Andrew Burgess wrote: GDB makes use of the libiberty function buildargv for splitting the inferior (program being debugged) argument string in the case where the inferior is not being started under a shell. I have recently been working to improve this area of GDB, and

Re: [PATCH] Support libcall __float{,un}sibf by SF when it is not supported for _bf16

2024-05-26 Thread Jeff Law
On 12/20/23 4:17 AM, Jin Ma wrote: We don't have SI -> BF library functions, use SI -> SF -> BF instead. Although this can also be implemented in a target machine description, it is more appropriate to move into target independent code. gcc/ChangeLog: * optabs.cc (expand_float):

Re: [PATCH] gimple-vr-values:Add constraint for gimple-cond optimization

2024-05-26 Thread Jeff Law
On 11/22/23 10:47 PM, Feng Wang wrote: This patch adds another condition for gimple-cond optimization. Refer to the following test case. int foo1 (int data, int res) { res = data & 0xf; res |= res << 4; if (res < 0x22) return 0x22; return res; } with the compilation flag

Re: [PATCH] libcpp: Correct typo 'r' -> '\r'

2024-05-26 Thread Jeff Law
On 5/25/24 11:16 AM, Peter Damianov wrote: libcpp/ChangeLog: * lex.cc (do_peek_prev): Correct typo in argument to __builtin_expect() Thanks. I've pushed this to the trunk. jeff

Re: [PATCH v1] Gen-Match: Fix gen_kids_1 right hand braces mis-alignment

2024-05-26 Thread Jeff Law
On 5/25/24 6:39 PM, pan2...@intel.com wrote: From: Pan Li Notice some mis-alignment for gen_kids_1 right hand braces as below: if ((_q50 == _q20 && ! TREE_SIDE_EFFECTS (... { if ((_q51 == _q21 && ! TREE_SIDE_EFFECTS

[committed] [v2] More logical op simplifications in simplify-rtx.cc

2024-05-25 Thread Jeff Law
ng to the trunk. jeff commit 05daf617ea22e1d818295ed2d037456937e23530 Author: Jeff Law Date: Sat May 25 12:39:05 2024 -0600 [committed] [v2] More logical op simplifications in simplify-rtx.cc This is a revamp of what started as a target specific patch. Basically xalan (c

Re: [RFC/RFA] [PATCH 04/12] RISC-V: Add CRC built-ins tests for the target ZBC.

2024-05-25 Thread Jeff Law
On 5/24/24 2:41 AM, Mariam Arutunian wrote:   gcc/testsuite/gcc.target/riscv/     * crc-builtin-zbc32.c: New file.     * crc-builtin-zbc64.c: Likewise. OK once prerequisites are approved. jeff

Re: [RFC/RFA] [PATCH 12/12] Add tests for CRC detection and generation.

2024-05-25 Thread Jeff Law
On 5/24/24 2:42 AM, Mariam Arutunian wrote:   gcc/testsuite/gcc.c-torture/compile/     * crc-11.c: New test.     * crc-15.c: Likewise.     * crc-16.c: Likewise.     * crc-19.c: Likewise.     * crc-2.c: Likewise.     * crc-20.c: Likewise.     * crc-24.c: Likewise.     * crc-29.c:

Re: [RFC/RFA] [PATCH 03/12] RISC-V: Add CRC expander to generate faster CRC.

2024-05-25 Thread Jeff Law
On 5/24/24 2:41 AM, Mariam Arutunian wrote: If the target is ZBC or ZBKC, it uses clmul instruction for the CRC calculation. Otherwise, if the target is ZBKB, generates table-based CRC, but for reversing inputs and the output uses bswap and brev8 instructions. Add new tests to check CRC

Re: [RFC/RFA] [PATCH 02/12] Add built-ins and tests for bit-forward and bit-reversed CRCs

2024-05-25 Thread Jeff Law
On 5/24/24 2:41 AM, Mariam Arutunian wrote: This patch introduces new built-in functions to GCC for computing bit- forward and bit-reversed CRCs. These builtins aim to provide efficient CRC calculation capabilities. When the target architecture supports CRC operations (as indicated by the

Re: [RFC/RFA] [PATCH 01/12] Implement internal functions for efficient CRC computation

2024-05-25 Thread Jeff Law
On 5/24/24 2:41 AM, Mariam Arutunian wrote: Add two new internal functions (IFN_CRC, IFN_CRC_REV), to provide faster CRC generation. One performs bit-forward and the other bit-reversed CRC computation. If CRC optabs are supported, they are used for the CRC computation. Otherwise, table-based

Re: [RFC/RFA][PATCH 00/12] CRC optimization

2024-05-24 Thread Jeff Law
On 5/24/24 2:41 AM, Mariam Arutunian wrote: Hello! This patch set detects bitwise CRC implementation loops (with branches) in the GIMPLE optimizers and replaces them with more optimal CRC implementations in RTL. These patches introduce new internal functions, built-in functions, and

Re: [PATCH v2] RISC-V: Introduce -mrvv-allow-misalign.

2024-05-24 Thread Jeff Law
On 5/24/24 5:43 PM, Palmer Dabbelt wrote: I'm only reading Zicclsm as saying both scalar and vector misaligned accesses are supported, but nothing about the performance. I think it was in the vector docs.  It didn't say anything about performance, just a note that scalar & vector behavior

Re: [PATCH v2] RISC-V: Introduce -mrvv-allow-misalign.

2024-05-24 Thread Jeff Law
On 5/24/24 5:39 PM, Palmer Dabbelt wrote: On Fri, 24 May 2024 16:31:48 PDT (-0700), jeffreya...@gmail.com wrote: On 5/24/24 11:14 AM, Palmer Dabbelt wrote: On Fri, 24 May 2024 09:19:09 PDT (-0700), Robin Dapp wrote: We should have something in doc/invoke too, this one is going to be

Re: [PATCH] RISC-V: Avoid splitting store dataref groups during SLP discovery

2024-05-24 Thread Jeff Law
On 5/23/24 11:52 PM, Richard Biener wrote: This worked out so I pushed the change. The gcc.dg/vect/pr97428.c test is FAILing on RISC-V (it still gets 0 SLP), because of missed load permutations. I hope the followup reorg for the load side will fix this. It also FAILs

Re: [PATCH v2] RISC-V: Introduce -mrvv-allow-misalign.

2024-05-24 Thread Jeff Law
On 5/24/24 11:14 AM, Palmer Dabbelt wrote: On Fri, 24 May 2024 09:19:09 PDT (-0700), Robin Dapp wrote: We should have something in doc/invoke too, this one is going to be tricky for users.  We'll also have to define how this interacts with the existing -mstrict-align. Addressed the rest in

[to-be-committed][v2][RISC-V] Use bclri in constant synthesis

2024-05-23 Thread Jeff Law
Testing with Zbs enabled by default showed a minor logic error. After the loop clearing things with bclri, we can only use the sequence if we were able to clear all the necessary bits. If any bits are still on, then the bclr sequence turned out to not be profitable. -- So this is

[to-be-committed] [RISC-V] Use bclri in constant synthesis

2024-05-23 Thread Jeff Law
So this is conceptually similar to how we handled direct generation of bseti for constant synthesis, but this time for bclr. In the bclr case, we already have an expander for AND. So we just needed to adjust the predicate to accept another class of constant operands (those with a single bit

Re: [PATCH v2] Match: Support __builtin_add_overflow branch form for unsigned SAT_ADD

2024-05-23 Thread Jeff Law
On 5/23/24 6:14 AM, Richard Biener wrote: On Thu, May 23, 2024 at 1:08 PM Li, Pan2 wrote: I have a try to convert the PHI from Part-A to Part-B, aka PHI to _2 = phi_cond ? _1 : 255. And then we can do the matching on COND_EXPR in the underlying widen-mul pass. Unfortunately, meet some

Re: RISC-V: Fix round_32.c test on RV32

2024-05-22 Thread Jeff Law
On 5/22/24 12:15 PM, Palmer Dabbelt wrote: On Wed, 22 May 2024 11:01:16 PDT (-0700), jeffreya...@gmail.com wrote: On 5/22/24 6:47 AM, Jivan Hakobyan wrote: After 8367c996e55b2 commit several checks on round_32.c test started to fail. The reason is that we prevent rounding DF->SI->DF on

Re: RISC-V: Fix round_32.c test on RV32

2024-05-22 Thread Jeff Law
On 5/22/24 6:47 AM, Jivan Hakobyan wrote: After 8367c996e55b2 commit several checks on round_32.c test started to fail. The reason is that we prevent rounding DF->SI->DF on RV32 and instead of a conversion sequence we get calls to appropriate library functions. gcc/testsuite/ChangeLog:

Re: [PATCH] Fix PR rtl-optimization/115038

2024-05-22 Thread Jeff Law
On 5/20/24 1:13 AM, Eric Botcazou wrote: Hi, this is a regression present on mainline and 14 branch under the form of an ICE in seh_cfa_offset from config/i386/winnt.cc on the attached C++ testcase compiled with -O2 -fno-omit-frame-pointer. The problem directly comes from the

Re: [PATCH 4/4] Testsuite updates

2024-05-22 Thread Jeff Law
On 5/22/24 4:58 AM, Richard Biener wrote: RISC-V CI didn't trigger (not sure what magic is required). Both ARM and AARCH64 show that the "Vectorizing stmts using SLP" are a bit fragile because we sometimes cancel SLP because we want to use load/store-lanes. The RISC-V tag on the subject

Re: [PATCH] [tree-optimization/110279] fix testcase pr110279-1.c

2024-05-22 Thread Jeff Law
On 5/22/24 5:46 AM, Di Zhao OS wrote: The test case is for targets that support FMA. Previously the "target" selector is missed in dg-final command. Tested on x86_64-pc-linux-gnu. Thanks Di Zhao gcc/testsuite/ChangeLog: * gcc.dg/pr110279-1.c: add target selector. Rather than list

Re: [PATCH v1 2/2] RISC-V: Add test cases for __builtin_add_overflow branchless unsigned SAT_ADD

2024-05-21 Thread Jeff Law
On 5/19/24 12:37 AM, pan2...@intel.com wrote: From: Pan Li After we support branchless __builtin_add_overflow unsigned SAT_ADD from the middle end. Add more test cases to cover the functionalities. The below test suites are passed. * The rv64gcv fully regression test.

Re: [PATCH v1 2/2] RISC-V: Add test cases for branch form unsigned SAT_ADD

2024-05-21 Thread Jeff Law
On 5/20/24 5:01 AM, pan2...@intel.com wrote: From: Pan Li After we support branch form unsigned SAT_ADD from the middle end. Add more test cases to cover the functionalities. The below test suites are passed. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: *

Re: [PATCH v3 2/2] RISC-V: avoid LUI based const mat in alloca epilogue expansion

2024-05-21 Thread Jeff Law
On 5/20/24 5:32 PM, Vineet Gupta wrote: This is testsuite clean however there's a dwarf quirk which I want to run by the experts. The test that was tripping CI has following fragment: Before patch| After Patch --

Re: [PATCH v3 1/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]

2024-05-21 Thread Jeff Law
On 5/20/24 5:32 PM, Vineet Gupta wrote: Changes since v2: - Broke out the hunk corresponding to alloca in epilogue expansion in a separate patch. --- If the constant used for stack offset can be expressed as sum of two S12 values, the constant need not be materialized (in a reg) and
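The "sum of two S12 values" condition can be sketched in C as follows. This is an illustration only, not the function from the patch: `split_sum_of_two_s12` is a hypothetical name, S12 is taken to mean a signed 12-bit immediate in the range -2048..2047, and any stack-alignment constraint on the intermediate adjustment is deliberately ignored.

```c
#include <stdbool.h>
#include <stdint.h>

#define S12_MIN (-2048)
#define S12_MAX 2047

/* Try to express OFFSET as FIRST + SECOND where both parts fit in a
   signed 12-bit immediate.  Returns false when no such split exists,
   in which case the constant would have to be materialized in a
   register (e.g. via lui).  */
bool split_sum_of_two_s12(int64_t offset, int64_t *first, int64_t *second)
{
    if (offset < 2 * S12_MIN || offset > 2 * S12_MAX)
        return false;

    /* Take as much as one immediate can hold; the rest fits in the other.  */
    int64_t a = offset;
    if (a > S12_MAX)
        a = S12_MAX;
    if (a < S12_MIN)
        a = S12_MIN;

    *first = a;
    *second = offset - a;
    return true;
}
```

Under these assumptions the representable offsets run from -4096 to 4094; anything outside that range still needs the constant in a register.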

Re: [PATCH v1 2/2] RISC-V: Add test cases for __builtin_add_overflow branch form unsigned SAT_ADD

2024-05-21 Thread Jeff Law
On 5/21/24 4:53 AM, pan2...@intel.com wrote: From: Pan Li After we support __builtin_add_overflow branch form unsigned SAT_ADD from the middle end. Add more test cases to cover the functionalities. The below test suites are passed. * The rv64gcv fully regression test.

Re: [committed] PATCH for Re: Stepping down as maintainer for ARC and Epiphany

2024-05-21 Thread Jeff Law
On 5/21/24 8:02 AM, Paul Koning wrote: On May 21, 2024, at 9:57 AM, Jeff Law wrote: On 5/21/24 12:05 AM, Richard Biener via Gcc wrote: On Mon, May 20, 2024 at 4:45 PM Gerald Pfeifer wrote: On Wed, 5 Jul 2023, Joern Rennecke wrote: I haven't worked with these targets in years

Re: [committed] PATCH for Re: Stepping down as maintainer for ARC and Epiphany

2024-05-21 Thread Jeff Law
On 5/21/24 12:05 AM, Richard Biener via Gcc wrote: On Mon, May 20, 2024 at 4:45 PM Gerald Pfeifer wrote: On Wed, 5 Jul 2023, Joern Rennecke wrote: I haven't worked with these targets in years and can't really do sensible maintenance or reviews of patches for them. I am currently working

Re: [PATCH v3 2/2] RISC-V: avoid LUI based const mat in alloca epilogue expansion

2024-05-20 Thread Jeff Law
On 5/20/24 5:32 PM, Vineet Gupta wrote: This is testsuite clean however there's a dwarf quirk which I want to run by the experts. The test that was tripping CI has following fragment: Before patch| After Patch --

Re: [to-be-committed][RISC-V] Eliminate redundant bitmanip operation

2024-05-19 Thread Jeff Law
On 5/19/24 1:59 PM, Andrew Pinski wrote: On Sun, May 19, 2024 at 10:58 AM Jeff Law wrote: perl has some internal bitmap code. One of its implementation properties is that if you ask it to set a bit, the bit is first cleared. Unfortunately this is fairly hard to see in gimple/match due

[to-be-committed][RISC-V] Eliminate redundant bitmanip operation

2024-05-19 Thread Jeff Law
perl has some internal bitmap code. One of its implementation properties is that if you ask it to set a bit, the bit is first cleared. Unfortunately this is fairly hard to see in gimple/match due to type changes in the IL. But it is easy to see in the code we get from combine. So we just

Re: [PATCH v4] DSE: Fix ICE after allow vector type in get_stored_val

2024-05-19 Thread Jeff Law
On 5/2/24 7:51 PM, pan2...@intel.com wrote: From: Pan Li We allowed vector type for get_stored_val when read is less than or equal to store in previous. Unfortunately, the validate_subreg treats the vector type's size is less than vector register as invalid. Then we will have ICE here.

Re: [PATCH] Add widening expansion of MULT_HIGHPART_EXPR for integral modes

2024-05-19 Thread Jeff Law
On 5/19/24 3:40 AM, Eric Botcazou wrote: Hi, Just notice that this patch may result in some ICE when build libc++ for the riscv port, details as below. Please note not all configuration can reproduce this issue, feel free to ping me if you cannot reproduce this issue. CC more riscv port

[to-be-committed][RISC-V][PR target/115142] Do not create invalidate shift-add insn

2024-05-18 Thread Jeff Law
Repost, this time with the RISC-V tag so it's picked up by the CI system. This fixes a minor bug that showed up in the CI system, presumably with fuzz testing. Under the right circumstances, we could end up trying to emit a shift-add style sequence where the to-be-shifted operand was not a

[to-be-committed][PR target/115142] Do not create invalidate shift-add insn

2024-05-18 Thread Jeff Law
This fixes a minor bug that showed up in the CI system, presumably with fuzz testing. Under the right circumstances, we could end up trying to emit a shift-add style sequence where the to-be-shifted operand was not a register. This naturally leads to an unrecognized insn. The circumstances

Re: [PATCH] RISC-V: Fix "Nan-box the result of movbf on soft-bf16"

2024-05-17 Thread Jeff Law
On 5/15/24 7:55 PM, Xiao Zeng wrote: 1 According to unpriv-isa spec: 1.1 "FMV.H.X moves the half-precision value encoded in IEEE 754-2008 standard encoding from the

Re: [PATCH] RISC-V: Modify _Bfloat16 to __bf16

2024-05-17 Thread Jeff Law
On 5/17/24 2:19 AM, Kito Cheng wrote: LGTM, thanks for fixing this :) And just to be clear for Xiao, you can go ahead and commit this patch to the trunk. An ACK from Kito, Juzhe, Palmer, Robin or myself is all you need for a change that is isolated to RISC-V code. jeff

Re: [PATCH] RISC-V: Remove dead perm series code and document.

2024-05-17 Thread Jeff Law
On 5/17/24 9:27 AM, Robin Dapp wrote: Hi, with the introduction of shuffle_series_patterns the explicit handler code for a perm series is dead. This patch removes it and also adds a function-level comment to shuffle_series_patterns. Regtested on rv64gcv_zvfh_zvbb. Regards Robin

Re: [PATCH v1] RISC-V: Cleanup some temporally files [NFC]

2024-05-17 Thread Jeff Law
On 5/16/24 6:12 PM, Li, Pan2 wrote: Committed, thanks Juzhe. Thanks for cleaning up my little mess! Sorry about that. jeff

Re: [PATCH gcc-13] Fix RISC-V missing stack tie

2024-05-16 Thread Jeff Law
On 5/16/24 12:24 PM, Palmer Dabbelt wrote: gcc/ * config/riscv/riscv.cc (riscv_expand_prologue): Add missing stack tie for scalable and final stack adjustment if needed. Co-authored-by: Raphael Zinsly (cherry picked from commit

Re: [PATCH v5 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned scalar int

2024-05-16 Thread Jeff Law
On 5/16/24 5:58 AM, Richard Biener wrote: On Thu, May 16, 2024 at 11:35 AM Li, Pan2 wrote: OK. Thanks Richard for help and coaching. To double confirm, are you OK with this patch only or for the series patch(es) of SAT middle-end? Thanks again for reviewing and suggestions. For the

Re: [PATCH] tree-optimization/13962 - handle ptr-ptr compares in ptrs_compare_unequal

2024-05-16 Thread Jeff Law
On 5/16/24 6:03 AM, Richard Biener wrote: Now that we handle pt.null conservatively we can implement the missing tracking of constant pool entries (aka STRING_CST) and handle ptr-ptr compares using points-to info in ptrs_compare_unequal. Bootstrapped on x86_64-unknown-linux-gnu, (re-)testing

Re: [PATCH v2 1/2] RISC-V: Add cmpmemsi expansion

2024-05-15 Thread Jeff Law
On 5/15/24 12:49 AM, Christoph Müllner wrote: GCC has a generic cmpmemsi expansion via the by-pieces framework, which shows some room for target-specific optimizations. E.g. for comparing two aligned memory blocks of 15 bytes we get the following sequence: my_mem_cmp_aligned_15: li

Re: [PATCH] RISC-V: propgue/epilogue expansion code minor changes [NFC]

2024-05-15 Thread Jeff Law
On 5/15/24 12:55 PM, Vineet Gupta wrote: Saw this little room for improvement in current debugging of prologue/epilogue expansion code. --- Use the following pattern consistently `RTX_FRAME_RELATED_P (gen_insn (insn)) = 1` vs. calling gen_insn around apriori gen_xxx_insn () calls.

[to-be-committed][RISC-V] Improve some shift-add sequences

2024-05-15 Thread Jeff Law
ow selection between (x << C1) + C2 vs (x + C2') << C1 depending on the cost C2 vs C2'. gcc/testsuite * gcc.target/riscv/shift-add-1.c: New test. commit 03933cf8813b28587ceb7f6f66ac03d08c5de58b Author: Jeff Law Date: Thu Apr 4 13:35:54 2024 -0600 Optim
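The two forms mentioned in this snippet compute the same value whenever C2 is a multiple of 2^C1. A minimal C sketch of that equivalence, with hypothetical function names and the assumed relation C2' = C2 >> C1 (the snippet is truncated and does not spell the relation out):

```c
#include <assert.h>
#include <stdint.h>

/* (x << C1) + C2: shift first, then add the full-size constant.  */
uint64_t shift_then_add(uint64_t x, unsigned c1, uint64_t c2)
{
    return (x << c1) + c2;
}

/* (x + C2') << C1 with C2' = C2 >> C1: add the smaller constant first.
   Only equivalent when the low C1 bits of C2 are zero, which the
   assertion checks.  */
uint64_t add_then_shift(uint64_t x, unsigned c1, uint64_t c2)
{
    assert(c1 < 64 && (c2 & ((UINT64_C(1) << c1) - 1)) == 0);
    uint64_t c2_prime = c2 >> c1;
    return (x + c2_prime) << c1;
}
```

For example, with C1 = 3 and C2 = 8192 the constant does not fit a signed 12-bit immediate, while C2' = 1024 does, so the second form can use a single `addi`. Which form is cheaper in general depends on the relative cost of C2 and C2', as the post says.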

Re: [PATCH] RISC-V: Fix cbo.zero expansion for rv32

2024-05-15 Thread Jeff Law
On 5/15/24 12:48 AM, Christoph Müllner wrote: Emitting a DI pattern won't find a match for rv32 and manifests in the failing test case gcc.target/riscv/cmo-zicboz-zic64-1.c. Let's fix this in the expansion and also address the different code that gets generated for rv32/rv64. gcc/ChangeLog:

Re: [PATCH] RISC-V: Test cbo.zero expansion for rv32

2024-05-15 Thread Jeff Law
On 5/15/24 1:28 AM, Christoph Müllner wrote: We had an issue when expanding via cmo-zero for RV32. This was fixed upstream, but we don't have a RV32 test. Therefore, this patch introduces such a test. gcc/testsuite/ChangeLog: * gcc.target/riscv/cmo-zicboz-zic64-1.c: Fix for rv32.

[committed] Fix rv32 issues with recent zicboz work

2024-05-14 Thread Jeff Law
k-function-bodies clear_buf_123 Pushed to the trunk. Jeff commit e410ad74e5e4589aeb666aa298b2f933e7b5d9e7 Author: Jeff Law Date: Tue May 14 22:50:15 2024 -0600 [committed] Fix rv32 issues with recent zicboz work I should have double-checked the CI system before pushing Christoph'

Re: [PATCH] RISC-V: Implement -m{,no}fence-tso

2024-05-14 Thread Jeff Law
On 5/14/24 5:13 PM, Palmer Dabbelt wrote: Some processors from T-Head don't implement the `fence.tso` instruction natively and instead trap to firmware. This breaks some users who haven't yet updated the firmware and one could imagine it breaking users who are trying to build firmware if

Re: [PATCH v2 2/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]

2024-05-14 Thread Jeff Law
On 5/14/24 10:36 AM, Vineet Gupta wrote: On 5/14/24 08:44, Jeff Law wrote: On 5/14/24 8:51 AM, Patrick O'Neill wrote: I was able to find the summary info: Tests that now fail, but worked before (15 tests): libgomp: libgomp.fortran/simd7.f90   -O0  execution test libgomp: libgomp.fortran

Re: [PATCH v2 2/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]

2024-05-14 Thread Jeff Law
On 5/14/24 8:51 AM, Patrick O'Neill wrote: I was able to find the summary info: Tests that now fail, but worked before (15 tests): libgomp: libgomp.fortran/simd7.f90   -O0  execution test libgomp: libgomp.fortran/task2.f90   -O0  execution test libgomp: libgomp.fortran/vla2.f90   -O0 

Re: [PATCH 1/3] expr: Export clear_by_pieces()

2024-05-14 Thread Jeff Law
On 5/7/24 11:38 PM, Christoph Müllner wrote: Make clear_by_pieces() available to other parts of the compiler, similar to store_by_pieces(). gcc/ChangeLog: * expr.cc (clear_by_pieces): Remove static from clear_by_pieces. * expr.h (clear_by_pieces): Add prototype for

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Jeff Law
On 5/14/24 8:57 AM, Qing Zhao wrote: On May 13, 2024, at 20:14, Kees Cook wrote: On Tue, May 14, 2024 at 01:38:49AM +0200, Andrew Pinski wrote: On Mon, May 13, 2024, 11:41 PM Kees Cook wrote: But it makes no sense to warn about: void sparx5_set (int * ptr, struct nums * sg, int

Re: [PATCH v2 2/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]

2024-05-14 Thread Jeff Law
On 5/14/24 8:51 AM, Patrick O'Neill wrote: On 5/13/24 20:36, Jeff Law wrote: On 5/13/24 6:54 PM, Patrick O'Neill wrote: On 5/13/24 13:28, Jeff Law wrote: On 5/13/24 12:49 PM, Vineet Gupta wrote: If the constant used for stack offset can be expressed as sum of two S12 values

[to-be-committed][RISC-V] Remove redundant AND in shift-add sequence

2024-05-14 Thread Jeff Law
So this patch allows us to eliminate a redundant AND in some shift-add style sequences. I think the testcase was reduced from xz by the RAU team, but I'm not highly confident of that. Specifically the AND is masking off the upper 32 bits of the un-shifted value and there's an outer

Re: [PATCH v2 2/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]

2024-05-13 Thread Jeff Law
On 5/13/24 6:54 PM, Patrick O'Neill wrote: On 5/13/24 13:28, Jeff Law wrote: On 5/13/24 12:49 PM, Vineet Gupta wrote: If the constant used for stack offset can be expressed as sum of two S12 values, the constant need not be materialized (in a reg) and instead the two S12 bits can

Re: [PATCH v2 1/3] RISC-V: movmem for RISCV with V extension

2024-05-13 Thread Jeff Law
On 12/19/23 10:28 PM, Jeff Law wrote: On 12/19/23 02:53, Sergei Lewis wrote: gcc/ChangeLog * config/riscv/riscv.md (movmem): Use riscv_vector::expand_block_move, if and only if we know the entire operation can be performed using one vector load followed by one vector

Re: Follow up #1 (was Re: [PATCH v2 1/2] RISC-V: avoid LUI based const materialization ... [part of PR/106265])

2024-05-13 Thread Jeff Law
On 5/13/24 3:13 PM, Vineet Gupta wrote: On 5/13/24 11:49, Vineet Gupta wrote: 500.perlbench_r-0 | 1,214,534,029,025 | 1,212,887,959,387 | 500.perlbench_r-1 |740,383,419,739 | 739,280,308,163 | 500.perlbench_r-2 |692,074,638,817 | 691,118,734,547 | 502.gcc_r-0 |

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-13 Thread Jeff Law
On 5/13/24 1:48 PM, Qing Zhao wrote: -Warray-bounds is an important option to enable the Linux kernel to keep the array out-of-bound errors out of the source tree. However, due to the false positive warnings reported in PR109071 (-Warray-bounds false positive warnings due to code duplication

Re: [PATCH v2 2/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]

2024-05-13 Thread Jeff Law
On 5/13/24 12:49 PM, Vineet Gupta wrote: If the constant used for stack offset can be expressed as sum of two S12 values, the constant need not be materialized (in a reg) and instead the two S12 bits can be added to instructions involved with frame pointer. This avoids burning a register and
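The "sum of two S12 values" condition above can be sketched in C as follows. This is only an illustration under stated assumptions: S12 is taken to mean a signed 12-bit immediate in the range [-2048, 2047], and the helper names `fits_s12` and `split_sum_of_two_s12` are hypothetical. The greedy split shown here is not necessarily the split the patch itself chooses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed range of a signed 12-bit immediate (S12). */
#define S12_MIN (-2048)
#define S12_MAX 2047

/* Hypothetical helper: does v fit in a single S12 immediate? */
bool fits_s12(int64_t v)
{
    return v >= S12_MIN && v <= S12_MAX;
}

/* Hypothetical helper: try to express offset as first + second, where
   both parts fit in S12.  Returns false when no such pair exists, in
   which case the constant would have to be materialized another way. */
bool split_sum_of_two_s12(int64_t offset, int64_t *first, int64_t *second)
{
    if (offset < 2 * S12_MIN || offset > 2 * S12_MAX)
        return false;

    /* Greedy choice: put as much as possible into the first part. */
    int64_t a = offset;
    if (a > S12_MAX)
        a = S12_MAX;
    if (a < S12_MIN)
        a = S12_MIN;

    *first = a;
    *second = offset - a;
    assert(fits_s12(*first) && fits_s12(*second));
    return true;
}
```

With this split, an offset of 3000 becomes 2047 + 953, so two add-immediate style adjustments could apply it without a scratch register, while 4095 is just out of reach.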

Re: [PATCH v2 1/2] RISC-V: avoid LUI based const materialization ... [part of PR/106265]

2024-05-13 Thread Jeff Law
On 5/13/24 12:49 PM, Vineet Gupta wrote: Apologies for the delay in getting this out. Needed to fix one ICE with glibc build and fresh round of testing: both testsuite and SPEC runs (which are similar to v1 in terms of Cactu gains, but some more minor regressions elsewhere gcc). Again those

[to-be-committed][RISC-V] Improve AND with some constants

2024-05-13 Thread Jeff Law
If we have an AND with a constant operand and the constant operand requires synthesis, then we may be able to generate more efficient code than we do now. Essentially the need for constant synthesis gives us a budget for alternative ways to clear bits, which zext.w can do for bits 32..63

Re: [PATCH v1] RISC-V: Bugfix ICE for RVV intrinisc vfw on _Float16 scalar

2024-05-13 Thread Jeff Law
On 5/13/24 9:00 AM, Li, Pan2 wrote: Committed, thanks Juzhe and Kito. Let's wait for a while before backport to 14. Could you fix the formatting nits caught by the CI linter? === ERROR type #1: trailing operator (4 error(s)) === gcc/config/riscv/riscv-vector-builtins.cc:4641:39: if ((exts

[to-be-committed] [RISC-V] Improve single inverted bit extraction - v3

2024-05-12 Thread Jeff Law
The only change in v2 vs v3 is testsuite adjustments for the updated sequences and fixing the name of the second pattern. -- So this patch fixes a minor code generation inefficiency that (IIRC) the RAU team discovered a while ago in spec. If we want the inverted value of a single bit we

[to-be-committed] [RISC-V] Improve single inverted bit extraction - v2

2024-05-12 Thread Jeff Law
So the first version failed CI and after looking at the patch again, I think it can be improved. First, the output pattern might as well go ahead and use the zero_extract form. Second, we should be able to handle cases where all the ops are in word_mode as well as when the shift is in a

[to-be-committed] [RISC-V] Improve single inverted bit extraction

2024-05-12 Thread Jeff Law
So the first time I sent this, I attached the wrong patch. As a result the CI system wasn't happy. The second time I sent the right patch, but I don't see evidence the CI system ran the correct patch through. So I'm just starting over ;-) -- So this patch fixes a minor code generation

[to-be-committed][RISC-V] Improve usage of slli.uw in constant synthesis

2024-05-11 Thread Jeff Law
And an improvement to using slli.uw... I recently added the ability to use slli.uw in the synthesis path. That code was conditional on the right justified constant being a LUI_OPERAND after sign extending from bit 31 to bit 63. That code is working fine, but could be improved. Specifically
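To see why the sign extension from bit 31 to bit 63 matters when pairing LUI with slli.uw, here is a small C model of the two instructions on RV64. The function names `model_lui` and `model_slli_uw` are made up for this sketch; it models only the instruction semantics, not the synthesis code in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Model of RV64 lui: place the 20-bit immediate in bits 12..31 and
   sign-extend the 32-bit result to 64 bits. */
uint64_t model_lui(uint32_t imm20)
{
    assert(imm20 < (UINT32_C(1) << 20));
    uint64_t value = (uint64_t)imm20 << 12;
    if (value & UINT64_C(0x80000000))
        value |= UINT64_C(0xffffffff00000000);
    return value;
}

/* Model of Zba slli.uw: zero-extend the low 32 bits of rs1, then
   shift left by shamt. */
uint64_t model_slli_uw(uint64_t rs1, unsigned shamt)
{
    assert(shamt < 64);
    return (rs1 & UINT64_C(0xffffffff)) << shamt;
}
```

For example, `model_lui(0x80000)` yields 0xffffffff80000000, yet `model_slli_uw` of that value with a shift of 4 yields 0x800000000, because slli.uw discards the sign-extended upper half before shifting.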

[to-be-committed] RISC-V Fix minor regression in synthesis WRT bseti usage

2024-05-11 Thread Jeff Law
Overnight testing showed a small number of cases where constant synthesis was doing something dumb: specifically, generating more instructions than the number of bits set in the constant. It was a minor goof in the recent bseti code. In the code to first figure out what bits LUI could set, I

Re: [PATCH v2 1/4] Support for CodeView debugging format

2024-05-11 Thread Jeff Law
On 10/30/23 6:28 PM, Mark Harmstone wrote: This patch and the following add initial support for Microsoft's CodeView debugging format, as used by MSVC, to mingw targets. Note that you will need a recent version of binutils for this to be useful. The best way to view the output is to run

Re: [to-be-committed][RISC-V] Improve extraction of inverted single bit

2024-05-10 Thread Jeff Law
On 5/10/24 4:28 PM, Jeff Law wrote: So this patch fixes a minor code generation inefficiency that (IIRC) the RAU team discovered a while ago in spec. If we want the inverted value of a single bit we can use bext to extract the bit, then seq to invert the value (if viewed as a 0/1 truth
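The extract-then-invert sequence described above can be modeled in C as below. This is a sketch of the semantics only: `model_bext` mirrors the Zbs bext instruction on RV64 (the bit index is taken modulo 64), the comparison with zero stands in for the inverting step, and both function names are hypothetical. It does not show the sequence the patch generates.

```c
#include <assert.h>
#include <stdint.h>

/* Model of Zbs bext on RV64: extract one bit of x as 0 or 1.
   Only the low 6 bits of the index are used. */
uint64_t model_bext(uint64_t x, unsigned bit)
{
    return (x >> (bit & 63)) & 1;
}

/* Inverted single-bit extraction: 1 when the bit is clear, 0 when it
   is set.  Comparing the extracted bit with zero flips the 0/1 value. */
uint64_t inverted_bit(uint64_t x, unsigned bit)
{
    uint64_t extracted = model_bext(x, bit);
    assert(extracted <= 1);
    return extracted == 0;
}
```

For x = 0x8, bit 3 is set, so `inverted_bit(0x8, 3)` is 0, while `inverted_bit(0x8, 2)` is 1.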

Re: [wwwdocs] Add Cauldron2024

2024-05-10 Thread Jeff Law
On 5/7/24 4:34 AM, Jan Hubicka wrote: Hi, this adds Cauldron2024 to main page. OK? OK, of course. jeff

Re: [PATCH 4/4] RISC-V: Allow by-pieces to do overlapping accesses in block_move_straight

2024-05-10 Thread Jeff Law
On 5/7/24 11:17 PM, Christoph Müllner wrote: The current implementation of riscv_block_move_straight() emits a couple of loads/stores with maximum width (e.g. 8-byte for RV64). The remainder is handed over to move_by_pieces(). The by-pieces framework utilizes target hooks to decide about
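As a rough illustration of what overlapping accesses buy in a block move, the C sketch below copies between 8 and 16 bytes using exactly two 8-byte loads and two 8-byte stores, letting the second access overlap the first instead of falling back to narrower pieces. The function name `copy_8_to_16_overlapping` is hypothetical and this is not the code riscv_block_move_straight() emits; it only shows the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy n bytes (8 <= n <= 16) with two 8-byte accesses.  The second
   access starts at n - 8, so it overlaps the first when n < 16. */
void copy_8_to_16_overlapping(void *dst, const void *src, size_t n)
{
    assert(n >= 8 && n <= 16);

    uint64_t head, tail;

    /* Two full-width loads: the first and the last 8 bytes. */
    memcpy(&head, src, sizeof head);
    memcpy(&tail, (const unsigned char *)src + n - 8, sizeof tail);

    /* Two full-width stores to the matching positions. */
    memcpy(dst, &head, sizeof head);
    memcpy((unsigned char *)dst + n - 8, &tail, sizeof tail);
}
```

Copying 15 bytes this way touches bytes 0..7 and 7..14, where a non-overlapping scheme would need an 8-byte, a 4-byte, a 2-byte and a 1-byte access.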

Re: [PATCH 3/4] RISC-V: tune: Add setting for overlapping mem ops to tuning struct

2024-05-10 Thread Jeff Law
On 5/7/24 11:17 PM, Christoph Müllner wrote: This patch adds the field overlap_op_by_pieces to the struct riscv_tune_param, which is used by the TARGET_OVERLAP_OP_BY_PIECES_P() hook. This hook is used by the by-pieces infrastructure to decide if overlapping memory accesses should be emitted.
