Re: [to-be-committed][RISC-V] Eliminate redundant bitmanip operation

2024-05-19 Thread Jeff Law
On 5/19/24 1:59 PM, Andrew Pinski wrote: On Sun, May 19, 2024 at 10:58 AM Jeff Law wrote: perl has some internal bitmap code. One of its implementation properties is that if you ask it to set a bit, the bit is first cleared. Unfortunately this is fairly hard to see in gimple/match due

[to-be-committed][RISC-V] Eliminate redundant bitmanip operation

2024-05-19 Thread Jeff Law
perl has some internal bitmap code. One of its implementation properties is that if you ask it to set a bit, the bit is first cleared. Unfortunately this is fairly hard to see in gimple/match due to type changes in the IL. But it is easy to see in the code we get from combine. So we just
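The redundancy described above can be sketched in C. This is only an illustration of the source-level idiom, not the RTL pattern from the patch; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Clear bit n and then set it, as the snippet describes for perl's
   bitmap code.  The names here are illustrative assumptions.  */
uint64_t set_bit_after_clear(uint64_t word, unsigned n)
{
    assert(n < 64);
    uint64_t mask = UINT64_C(1) << n;
    word &= ~mask;   /* redundant: the bit is about to be set anyway */
    word |= mask;
    return word;
}

/* The single operation that produces the same result.  */
uint64_t set_bit(uint64_t word, unsigned n)
{
    assert(n < 64);
    return word | (UINT64_C(1) << n);
}
```

Both functions return the same value for every input, which is why the clearing step can be dropped.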

Re: [PATCH v4] DSE: Fix ICE after allow vector type in get_stored_val

2024-05-19 Thread Jeff Law
On 5/2/24 7:51 PM, pan2...@intel.com wrote: From: Pan Li We allowed vector type for get_stored_val when read is less than or equal to store in previous. Unfortunately, validate_subreg treats a vector type whose size is less than a vector register as invalid. Then we will have an ICE here.

Re: [PATCH] Add widening expansion of MULT_HIGHPART_EXPR for integral modes

2024-05-19 Thread Jeff Law
On 5/19/24 3:40 AM, Eric Botcazou wrote: Hi, Just noticed that this patch may result in an ICE when building libc++ for the riscv port, details below. Please note that not all configurations can reproduce this issue, feel free to ping me if you cannot reproduce this issue. CC more riscv port

[to-be-committed][RISC-V][PR target/115142] Do not create invalidate shift-add insn

2024-05-18 Thread Jeff Law
Repost, this time with the RISC-V tag so it's picked up by the CI system. This fixes a minor bug that showed up in the CI system, presumably with fuzz testing. Under the right circumstances, we could end up trying to emit a shift-add style sequence where the to-be-shifted operand was not a

[to-be-committed][PR target/115142] Do not create invalidate shift-add insn

2024-05-18 Thread Jeff Law
This fixes a minor bug that showed up in the CI system, presumably with fuzz testing. Under the right circumstances, we could end up trying to emit a shift-add style sequence where the to-be-shifted operand was not a register. This naturally leads to an unrecognized insn. The circumstances

Re: [PATCH] RISC-V: Fix "Nan-box the result of movbf on soft-bf16"

2024-05-17 Thread Jeff Law
On 5/15/24 7:55 PM, Xiao Zeng wrote: 1 According to unpriv-isa spec: 1.1 "FMV.H.X moves the half-precision value encoded in IEEE 754-2008 standard encoding from the

Re: [PATCH] RISC-V: Modify _Bfloat16 to __bf16

2024-05-17 Thread Jeff Law
On 5/17/24 2:19 AM, Kito Cheng wrote: LGTM, thanks for fixing this :) And just to be clear for Xiao, you can go ahead and commit this patch to the trunk. An ACK from Kito, Juzhe, Palmer, Robin or myself is all you need for a change that is isolated to RISC-V code. jeff

Re: [PATCH] RISC-V: Remove dead perm series code and document.

2024-05-17 Thread Jeff Law
On 5/17/24 9:27 AM, Robin Dapp wrote: Hi, with the introduction of shuffle_series_patterns the explicit handler code for a perm series is dead. This patch removes it and also adds a function-level comment to shuffle_series_patterns. Regtested on rv64gcv_zvfh_zvbb. Regards Robin

Re: [PATCH v1] RISC-V: Cleanup some temporally files [NFC]

2024-05-17 Thread Jeff Law
On 5/16/24 6:12 PM, Li, Pan2 wrote: Committed, thanks Juzhe. Thanks for cleaning up my little mess! Sorry about that. jeff

Re: [PATCH gcc-13] Fix RISC-V missing stack tie

2024-05-16 Thread Jeff Law
On 5/16/24 12:24 PM, Palmer Dabbelt wrote: gcc/ * config/riscv/riscv.cc (riscv_expand_prologue): Add missing stack tie for scalable and final stack adjustment if needed. Co-authored-by: Raphael Zinsly (cherry picked from commit

Re: [PATCH v5 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned scalar int

2024-05-16 Thread Jeff Law
On 5/16/24 5:58 AM, Richard Biener wrote: On Thu, May 16, 2024 at 11:35 AM Li, Pan2 wrote: OK. Thanks Richard for help and coaching. To double confirm, are you OK with this patch only or for the series patch(es) of SAT middle-end? Thanks again for reviewing and suggestions. For the

Re: [PATCH] tree-optimization/13962 - handle ptr-ptr compares in ptrs_compare_unequal

2024-05-16 Thread Jeff Law
On 5/16/24 6:03 AM, Richard Biener wrote: Now that we handle pt.null conservatively we can implement the missing tracking of constant pool entries (aka STRING_CST) and handle ptr-ptr compares using points-to info in ptrs_compare_unequal. Bootstrapped on x86_64-unknown-linux-gnu, (re-)testing

Re: [PATCH v2 1/2] RISC-V: Add cmpmemsi expansion

2024-05-15 Thread Jeff Law
On 5/15/24 12:49 AM, Christoph Müllner wrote: GCC has a generic cmpmemsi expansion via the by-pieces framework, which shows some room for target-specific optimizations. E.g. for comparing two aligned memory blocks of 15 bytes we get the following sequence: my_mem_cmp_aligned_15: li

Re: [PATCH] RISC-V: propgue/epilogue expansion code minor changes [NFC]

2024-05-15 Thread Jeff Law
On 5/15/24 12:55 PM, Vineet Gupta wrote: Saw this little room for improvement in current debugging of prologue/epilogue expansion code. --- Use the following pattern consistently `RTX_FRAME_RELATED_P (gen_insn (insn)) = 1` vs. calling gen_insn around apriori gen_xxx_insn () calls.

[to-be-committed][RISC-V] Improve some shift-add sequences

2024-05-15 Thread Jeff Law
ow selection between (x << C1) + C2 vs (x + C2') << C1 depending on the cost C2 vs C2'. gcc/testsuite * gcc.target/riscv/shift-add-1.c: New test. commit 03933cf8813b28587ceb7f6f66ac03d08c5de58b Author: Jeff Law Date: Thu Apr 4 13:35:54 2024 -0600 Optim
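The two forms named above compute the same value whenever the low C1 bits of C2 are zero. The sketch below assumes C2' = C2 >> C1, which follows from the algebra but is not spelled out in the snippet; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* (x << C1) + C2 */
uint64_t shift_then_add(uint64_t x, unsigned c1, uint64_t c2)
{
    assert(c1 < 64);
    return (x << c1) + c2;
}

/* (x + C2') << C1, with C2' = C2 >> C1 (assumption: the low C1 bits
   of C2 are zero, otherwise the rewrite would lose them).  */
uint64_t add_then_shift(uint64_t x, unsigned c1, uint64_t c2)
{
    assert(c1 < 64);
    assert((c2 & ((UINT64_C(1) << c1) - 1)) == 0);
    return (x + (c2 >> c1)) << c1;
}
```

Since the results agree (including on wraparound), a compiler is free to pick whichever constant, C2 or C2', is cheaper to materialize.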

Re: [PATCH] RISC-V: Fix cbo.zero expansion for rv32

2024-05-15 Thread Jeff Law
On 5/15/24 12:48 AM, Christoph Müllner wrote: Emitting a DI pattern won't find a match for rv32 and manifests in the failing test case gcc.target/riscv/cmo-zicboz-zic64-1.c. Let's fix this in the expansion and also address the different code that gets generated for rv32/rv64. gcc/ChangeLog:

Re: [PATCH] RISC-V: Test cbo.zero expansion for rv32

2024-05-15 Thread Jeff Law
On 5/15/24 1:28 AM, Christoph Müllner wrote: We had an issue when expanding via cmo-zero for RV32. This was fixed upstream, but we don't have a RV32 test. Therefore, this patch introduces such a test. gcc/testsuite/ChangeLog: * gcc.target/riscv/cmo-zicboz-zic64-1.c: Fix for rv32.

[committed] Fix rv32 issues with recent zicboz work

2024-05-14 Thread Jeff Law
k-function-bodies clear_buf_123 Pushed to the trunk. Jeff commit e410ad74e5e4589aeb666aa298b2f933e7b5d9e7 Author: Jeff Law Date: Tue May 14 22:50:15 2024 -0600 [committed] Fix rv32 issues with recent zicboz work I should have double-checked the CI system before pushing Christoph'

Re: [PATCH] RISC-V: Implement -m{,no}fence-tso

2024-05-14 Thread Jeff Law
On 5/14/24 5:13 PM, Palmer Dabbelt wrote: Some processors from T-Head don't implement the `fence.tso` instruction natively and instead trap to firmware. This breaks some users who haven't yet updated the firmware and one could imagine it breaking users who are trying to build firmware if

Re: [PATCH v2 2/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]

2024-05-14 Thread Jeff Law
On 5/14/24 10:36 AM, Vineet Gupta wrote: On 5/14/24 08:44, Jeff Law wrote: On 5/14/24 8:51 AM, Patrick O'Neill wrote: I was able to find the summary info: Tests that now fail, but worked before (15 tests): libgomp: libgomp.fortran/simd7.f90   -O0  execution test libgomp: libgomp.fortran

Re: [PATCH v2 2/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]

2024-05-14 Thread Jeff Law
On 5/14/24 8:51 AM, Patrick O'Neill wrote: I was able to find the summary info: Tests that now fail, but worked before (15 tests): libgomp: libgomp.fortran/simd7.f90   -O0  execution test libgomp: libgomp.fortran/task2.f90   -O0  execution test libgomp: libgomp.fortran/vla2.f90   -O0 

Re: [PATCH 1/3] expr: Export clear_by_pieces()

2024-05-14 Thread Jeff Law
On 5/7/24 11:38 PM, Christoph Müllner wrote: Make clear_by_pieces() available to other parts of the compiler, similar to store_by_pieces(). gcc/ChangeLog: * expr.cc (clear_by_pieces): Remove static from clear_by_pieces. * expr.h (clear_by_pieces): Add prototype for

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Jeff Law
On 5/14/24 8:57 AM, Qing Zhao wrote: On May 13, 2024, at 20:14, Kees Cook wrote: On Tue, May 14, 2024 at 01:38:49AM +0200, Andrew Pinski wrote: On Mon, May 13, 2024, 11:41 PM Kees Cook wrote: But it makes no sense to warn about: void sparx5_set (int * ptr, struct nums * sg, int

Re: [PATCH v2 2/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]

2024-05-14 Thread Jeff Law
On 5/14/24 8:51 AM, Patrick O'Neill wrote: On 5/13/24 20:36, Jeff Law wrote: On 5/13/24 6:54 PM, Patrick O'Neill wrote: On 5/13/24 13:28, Jeff Law wrote: On 5/13/24 12:49 PM, Vineet Gupta wrote: If the constant used for stack offset can be expressed as sum of two S12 values

[to-be-committed][RISC-V] Remove redundant AND in shift-add sequence

2024-05-14 Thread Jeff Law
So this patch allows us to eliminate a redundant AND in some shift-add style sequences. I think the testcase was reduced from xz by the RAU team, but I'm not highly confident of that. Specifically the AND is masking off the upper 32 bits of the un-shifted value and there's an outer

Re: [PATCH v2 2/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]

2024-05-13 Thread Jeff Law
On 5/13/24 6:54 PM, Patrick O'Neill wrote: On 5/13/24 13:28, Jeff Law wrote: On 5/13/24 12:49 PM, Vineet Gupta wrote: If the constant used for stack offset can be expressed as sum of two S12 values, the constant need not be materialized (in a reg) and instead the two S12 bits can

Re: [PATCH v2 1/3] RISC-V: movmem for RISCV with V extension

2024-05-13 Thread Jeff Law
On 12/19/23 10:28 PM, Jeff Law wrote: On 12/19/23 02:53, Sergei Lewis wrote: gcc/ChangeLog * config/riscv/riscv.md (movmem): Use riscv_vector::expand_block_move, if and only if we know the entire operation can be performed using one vector load followed by one vector

Re: Follow up #1 (was Re: [PATCH v2 1/2] RISC-V: avoid LUI based const materialization ... [part of PR/106265])

2024-05-13 Thread Jeff Law
On 5/13/24 3:13 PM, Vineet Gupta wrote: On 5/13/24 11:49, Vineet Gupta wrote: 500.perlbench_r-0 | 1,214,534,029,025 | 1,212,887,959,387 | 500.perlbench_r-1 |740,383,419,739 | 739,280,308,163 | 500.perlbench_r-2 |692,074,638,817 | 691,118,734,547 | 502.gcc_r-0 |

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-13 Thread Jeff Law
On 5/13/24 1:48 PM, Qing Zhao wrote: -Warray-bounds is an important option to enable the Linux kernel to keep the array out-of-bound errors out of the source tree. However, due to the false positive warnings reported in PR109071 (-Warray-bounds false positive warnings due to code duplication

Re: [PATCH v2 2/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]

2024-05-13 Thread Jeff Law
On 5/13/24 12:49 PM, Vineet Gupta wrote: If the constant used for stack offset can be expressed as sum of two S12 values, the constant need not be materialized (in a reg) and instead the two S12 bits can be added to instructions involved with frame pointer. This avoids burning a register and
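The idea of expressing an offset as the sum of two S12 values can be sketched as follows. This is a minimal illustration assuming S12 means a signed 12-bit immediate in the range [-2048, 2047]; the struct and function names are hypothetical, and any stack-alignment constraints the real patch may apply are not modeled.

```c
#include <assert.h>
#include <stdbool.h>

#define S12_MIN (-2048L)
#define S12_MAX 2047L

/* Result of trying to split an offset into two signed 12-bit parts.  */
struct s12_pair {
    bool ok;      /* true if offset == first + second with both in range */
    long first;
    long second;
};

struct s12_pair split_s12(long offset)
{
    struct s12_pair r = { false, 0, 0 };
    if (offset < 2 * S12_MIN || offset > 2 * S12_MAX)
        return r;                     /* cannot be a sum of two S12 values */
    /* Put as much as fits into the first immediate, the rest into the second.  */
    long a = offset;
    if (a > S12_MAX)
        a = S12_MAX;
    if (a < S12_MIN)
        a = S12_MIN;
    r.ok = true;
    r.first = a;
    r.second = offset - a;
    assert(r.second >= S12_MIN && r.second <= S12_MAX);
    return r;
}
```

When the split succeeds, both parts fit in instruction immediates, so no scratch register is needed to hold the full constant.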

Re: [PATCH v2 1/2] RISC-V: avoid LUI based const materialization ... [part of PR/106265]

2024-05-13 Thread Jeff Law
On 5/13/24 12:49 PM, Vineet Gupta wrote: Apologies for the delay in getting this out. Needed to fix one ICE with glibc build and fresh round of testing: both testsuite and SPEC runs (which are similar to v1 in terms of Cactu gains, but some more minor regressions elsewhere gcc). Again those

[to-be-committed][RISC-V] Improve AND with some constants

2024-05-13 Thread Jeff Law
If we have an AND with a constant operand and the constant operand requires synthesis, then we may be able to generate more efficient code than we do now. Essentially the need for constant synthesis gives us a budget for alternative ways to clear bits, which zext.w can do for bits 32..63

Re: [PATCH v1] RISC-V: Bugfix ICE for RVV intrinisc vfw on _Float16 scalar

2024-05-13 Thread Jeff Law
On 5/13/24 9:00 AM, Li, Pan2 wrote: Committed, thanks Juzhe and Kito. Let's wait for a while before backport to 14. Could you fix the formatting nits caught by the CI linter? === ERROR type #1: trailing operator (4 error(s)) === gcc/config/riscv/riscv-vector-builtins.cc:4641:39: if ((exts

[to-be-committed] [RISC-V] Improve single inverted bit extraction - v3

2024-05-12 Thread Jeff Law
The only change in v2 vs v3 is testsuite adjustments for the updated sequences and fixing the name of the second pattern. -- So this patch fixes a minor code generation inefficiency that (IIRC) the RAU team discovered a while ago in spec. If we want the inverted value of a single bit we

[to-be-committed] [RISC-V] Improve single inverted bit extraction - v2

2024-05-12 Thread Jeff Law
So the first version failed CI and after looking at the patch again, I think it can be improved. First, the output pattern might as well go ahead and use the zero_extract form. Second, we should be able to handle cases where all the ops are in word_mode as well as when the shift is in a

[to-be-committed] [RISC-V] Improve single inverted bit extraction

2024-05-12 Thread Jeff Law
So the first time I sent this, I attached the wrong patch. As a result the CI system wasn't happy. The second time I sent the right patch, but I don't see evidence the CI system ran the correct patch through. So I'm just starting over ;-) -- So this patch fixes a minor code generation

[to-be-committed][RISC-V] Improve usage of slli.uw in constant synthesis

2024-05-11 Thread Jeff Law
And an improvement to using slli.uw... I recently added the ability to use slli.uw in the synthesis path. That code was conditional on the right justified constant being a LUI_OPERAND after sign extending from bit 31 to bit 63. That code is working fine, but could be improved. Specifically

[to-be-committed] RISC-V Fix minor regression in synthesis WRT bseti usage

2024-05-11 Thread Jeff Law
Overnight testing showed a small number of cases where constant synthesis was doing something dumb. Specifically generating more instructions than the number of bits set in the constant. It was a minor goof in the recent bseti code. In the code to first figure out what bits LUI could set, I

Re: [PATCH v2 1/4] Support for CodeView debugging format

2024-05-11 Thread Jeff Law
On 10/30/23 6:28 PM, Mark Harmstone wrote: This patch and the following add initial support for Microsoft's CodeView debugging format, as used by MSVC, to mingw targets. Note that you will need a recent version of binutils for this to be useful. The best way to view the output is to run

Re: [to-be-committed][RISC-V] Improve extraction of inverted single bit

2024-05-10 Thread Jeff Law
On 5/10/24 4:28 PM, Jeff Law wrote: So this patch fixes a minor code generation inefficiency that (IIRC) the RAU team discovered a while ago in spec. If we want the inverted value of a single bit we can use bext to extract the bit, then seq to invert the value (if viewed as a 0/1 truth

Re: [wwwdocs] Add Cauldron2024

2024-05-10 Thread Jeff Law
On 5/7/24 4:34 AM, Jan Hubicka wrote: Hi, this adds Cauldron2024 to main page. OK? OK, of course. jeff

Re: [PATCH 4/4] RISC-V: Allow by-pieces to do overlapping accesses in block_move_straight

2024-05-10 Thread Jeff Law
On 5/7/24 11:17 PM, Christoph Müllner wrote: The current implementation of riscv_block_move_straight() emits a couple of loads/stores with maximum width (e.g. 8-byte for RV64). The remainder is handed over to move_by_pieces(). The by-pieces framework utilizes target hooks to decide about

Re: [PATCH 3/4] RISC-V: tune: Add setting for overlapping mem ops to tuning struct

2024-05-10 Thread Jeff Law
On 5/7/24 11:17 PM, Christoph Müllner wrote: This patch adds the field overlap_op_by_pieces to the struct riscv_tune_param, which is used by the TARGET_OVERLAP_OP_BY_PIECES_P() hook. This hook is used by the by-pieces infrastructure to decide if overlapping memory accesses should be emitted.
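What "overlapping memory accesses" means can be sketched with a 15-byte copy done as two 8-byte accesses that share one byte. This is an illustration in plain C, not the code the by-pieces infrastructure emits; the function name is hypothetical, and source and destination are assumed not to overlap each other.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy exactly 15 bytes using two 8-byte loads and two 8-byte stores.
   The second access starts at offset 7, so byte 7 is covered twice.  */
void *copy15_overlapping(void *dst, const void *src)
{
    uint64_t lo, hi;
    memcpy(&lo, src, 8);                            /* bytes 0..7  */
    memcpy(&hi, (const unsigned char *)src + 7, 8); /* bytes 7..14 */
    memcpy(dst, &lo, 8);
    memcpy((unsigned char *)dst + 7, &hi, 8);       /* rewrites byte 7 with the same value */
    return dst;
}
```

Without overlap, the same copy would typically need 8 + 4 + 2 + 1 byte accesses; whether the overlapping form is profitable is a per-microarchitecture tuning question.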

Re: [PATCH 2/4] RISC-V: Allow unaligned accesses in cpymemsi expansion

2024-05-10 Thread Jeff Law
On 5/7/24 11:17 PM, Christoph Müllner wrote: The RISC-V cpymemsi expansion is called whenever the by-pieces infrastructure will not take care of the builtin expansion. The code emitted by the by-pieces infrastructure may include unaligned accesses if

[to-be-committed][RISC-V] Improve extraction of inverted single bit

2024-05-10 Thread Jeff Law
So this patch fixes a minor code generation inefficiency that (IIRC) the RAU team discovered a while ago in spec. If we want the inverted value of a single bit we can use bext to extract the bit, then seq to invert the value (if viewed as a 0/1 truth value). The RTL is fairly convoluted, but
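The bext-then-seq idea above corresponds to the following C sketch. The function name is hypothetical, and the comments map each step to the instruction named in the snippet (seq with a zero operand is usually written seqz).

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if bit n of x is clear, 0 if it is set.  */
unsigned inverted_bit(uint64_t x, unsigned n)
{
    assert(n < 64);
    uint64_t bit = (x >> n) & 1;   /* bext: extract bit n as a 0/1 value */
    return bit == 0;               /* seqz: 1 exactly when the extracted bit was 0 */
}
```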

[RISC-V] Use shNadd for constant synthesis

2024-05-09 Thread Jeff Law
So here's the next idiom to improve constant synthesis. The basic idea here is to try and use shNadd to generate the constant when profitable. Let's take 0x30801. Right now that generates: li a0,3145728 addi a0,a0,1 slli a0,a0,12 addi

Re: [PATCH 1/4] RISC-V: Add test cases for cpymem expansion

2024-05-09 Thread Jeff Law
On 5/7/24 11:17 PM, Christoph Müllner wrote: We have two mechanisms in the RISC-V backend that expand cpymem pattern: a) by-pieces, b) riscv_expand_block_move() in riscv-string.cc. The by-pieces framework has higher priority and emits a sequence of up to 15 instructions (see

Re: [patch,avr] PR114981: Implement __builtin_powif in assembly

2024-05-09 Thread Jeff Law
On 5/8/24 4:10 AM, Georg-Johann Lay wrote: __builtin_powif is currently implemented in C, and this patch implements it (__powisf2) in assembly. Ok for master? Johann -- AVR: target/114981 - Tweak __powisf2 Implement __powisf2 in assembly. PR target/114981 libgcc/ *

Re: [PATCH 3/3] RISC-V: Add memset-zero expansion to cbo.zero

2024-05-09 Thread Jeff Law
On 5/7/24 11:38 PM, Christoph Müllner wrote: The Zicboz extension offers the cbo.zero instruction, which can be used to clean a memory region corresponding to a cache block. The Zic64b extension defines the cache block size to 64 bytes. If both extensions are available, it is possible to use

Re: [PATCH 2/3] RISC-V: testsuite: Make cmo tests LTO safe

2024-05-09 Thread Jeff Law
On 5/7/24 11:38 PM, Christoph Müllner wrote: Let's add '\t' to the instruction match pattern to avoid false positive matches when compiling with -flto. gcc/testsuite/ChangeLog: * gcc.target/riscv/cmo-zicbom-1.c: Add \t to test pattern. * gcc.target/riscv/cmo-zicbom-2.c:

Re: [PATCH 1/3] expr: Export clear_by_pieces()

2024-05-09 Thread Jeff Law
On 5/7/24 11:38 PM, Christoph Müllner wrote: Make clear_by_pieces() available to other parts of the compiler, similar to store_by_pieces(). gcc/ChangeLog: * expr.cc (clear_by_pieces): Remove static from clear_by_pieces. * expr.h (clear_by_pieces): Add prototype for

Re: [PATCH 2/2] RISC-V: Add cmpmemsi expansion

2024-05-09 Thread Jeff Law
On 5/7/24 11:52 PM, Christoph Müllner wrote: GCC has a generic cmpmemsi expansion via the by-pieces framework, which shows some room for target-specific optimizations. E.g. for comparing two aligned memory blocks of 15 bytes we get the following sequence: my_mem_cmp_aligned_15: li

Re: [PATCH 1/2] RISC-V: Add tests for cpymemsi expansion

2024-05-08 Thread Jeff Law
On 5/7/24 11:52 PM, Christoph Müllner wrote: cpymemsi expansion was available for RISC-V since the initial port. However, there are no tests to detect regressions. This patch adds such tests. Three of the tests target the expansion requirements (known length and alignment). One test reuses

Re: [PATCH gcc-13-backport] RISCV: Add -m(no)-omit-leaf-frame-pointer support.

2024-05-08 Thread Jeff Law
On 5/8/24 11:32 AM, Palmer Dabbelt wrote: From: Yanzhang Wang gcc/ChangeLog: * config/riscv/riscv.cc (riscv_save_reg_p): Save ra for leaf when enabling -mno-omit-leaf-frame-pointer (riscv_option_override): Override omit-frame-pointer.

[committed] [RISC-V] Provide splitting guidance to combine to faciliate shNadd.uw generation

2024-05-08 Thread Jeff Law
This fixes a minor code quality issue I found while comparing GCC and LLVM. Essentially we want to do a bit of re-association to generate shNadd.uw instructions. Combine does the right thing and finds all the necessary instructions, reassociates the operands, combines constants, etc. Where

Re: [PATCH v1 1/1] RISC-V: Nan-box the result of movbf on soft-bf16

2024-05-08 Thread Jeff Law
On 5/7/24 6:38 PM, Xiao Zeng wrote: 1 This patch implements the Nan-box of bf16. 2 Please refer to the Nan-box implementation of hf16 in: 3 The discussion about Nan-box can be found on the website:

Re: [PATCH v2 4/4] RISC-V: Cover sign-extensions in lshr3_zero_extend_4

2024-05-08 Thread Jeff Law
On 5/8/24 1:36 AM, Christoph Müllner wrote: The lshr3_zero_extend_4 pattern targets bit extraction with zero-extension. This pattern represents the canonical form of zero-extensions of a logical right shift. The same optimization can be applied to sign-extensions. Given the two optimizations

Re: [PATCH v2 3/4] RISC-V: Add zero_extract support for rv64gc

2024-05-08 Thread Jeff Law
On 5/8/24 1:36 AM, Christoph Müllner wrote: The combiner attempts to optimize a zero-extension of a logical right shift using zero_extract. We already utilize this optimization for those cases that result in a single instruction. Let's add an insn_and_split pattern that also matches the

Re: [PATCH v2 2/4] RISC-V: Cover sign-extensions in lshrsi3_zero_extend_2

2024-05-08 Thread Jeff Law
On 5/8/24 1:36 AM, Christoph Müllner wrote: The pattern lshrsi3_zero_extend_2 extracts the MSB bits of the lower 32-bit word and zero-extends it back to DImode. This is realized using srliw, which operates on 32-bit registers. The same optimization can be applied to sign-extensions when
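The zero-extension case described above can be written in C as below. This is a sketch with a hypothetical function name; it models the value the pattern computes, not the RTL itself. Note that srliw sign-extends its 32-bit result, but for a shift amount of at least 1 bit 31 of that result is always 0, so sign- and zero-extension agree.

```c
#include <assert.h>
#include <stdint.h>

/* Logical right shift of the lower 32-bit word, zero-extended to 64 bits.  */
uint64_t lshr_low_word_zext(uint64_t x, unsigned shamt)
{
    assert(shamt >= 1 && shamt < 32);
    uint32_t low = (uint32_t)x;        /* lower 32-bit word */
    return (uint64_t)(low >> shamt);   /* the value a single srliw would leave in the register */
}
```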

Re: [PATCH v2 1/4] RISC-V: Add test for sraiw-31 special case

2024-05-08 Thread Jeff Law
On 5/8/24 1:36 AM, Christoph Müllner wrote: We already optimize a sign-extension of a right-shift by 31 in si3_extend. Let's add a test for that (similar to zero-extend-1.c). gcc/testsuite/ChangeLog: * gcc.target/riscv/sign-extend-1.c: New test. OK jeff

[committed][RISC-V] Turn on overlap_op_by_pieces for generic-ooo tuning

2024-05-07 Thread Jeff Law
Per quick email exchange with Palmer. Given the triviality, I'm just pushing it. jeff commit 9f14f1978260148d4d6208dfd73df1858e623758 Author: Jeff Law Date: Tue May 7 15:34:16 2024 -0600 [committed][RISC-V] Turn on overlap_op_by_pieces for generic-ooo tuning Per quick email

Re: [committed] [RISC-V] Allow uarchs to set TARGET_OVERLAP_OP_BY_PIECES_P

2024-05-07 Thread Jeff Law
On 5/7/24 3:24 PM, Palmer Dabbelt wrote: @@ -529,6 +536,7 @@ static const struct riscv_tune_param generic_ooo_tune_info = { 4, /* fmv_cost */ false, /* slow_unaligned_access */ false,

[committed] [RISC-V] Allow uarchs to set TARGET_OVERLAP_OP_BY_PIECES_P

2024-05-07 Thread Jeff Law
This is almost exclusively work from the VRULL team. As we've discussed in the Tuesday meeting in the past, we'd like to have a knob in the tuning structure to indicate that overlapped stores during move_by_pieces expansion of memcpy & friends are acceptable. This patch adds that

Re: [PATCH] MATCH: Add some more value_replacement simplifications (a != 0 ? expr : 0) to match

2024-05-07 Thread Jeff Law
On 4/30/24 9:21 PM, Andrew Pinski wrote: This adds a few more of what is currently done in phiopt's value_replacement to match. I noticed this when I was hooking up phiopt's value_replacement code to use match and disabling the old code. But this can be done independently from the hooking up

Re: [PATCH v3] DCE __cxa_atexit calls where the function is pure/const [PR19661]

2024-05-07 Thread Jeff Law
On 5/4/24 5:58 PM, Andrew Pinski wrote: In C++ sometimes you have a destructor function which is "empty", for example with unions or with arrays. The front-end might not know it is empty either, so this should be done during optimization. To implement it I added it to DCE

Re: [patch,avr] PR114975: Better 8-bit parity detection.

2024-05-07 Thread Jeff Law
On 5/7/24 11:23 AM, Georg-Johann Lay wrote: Add a combine pattern for parity detection. Ok for master? Johann AVR: target/114975 - Add combine-pattern for __parityqi2. PR target/114975 gcc/ * config/avr/avr.md: Add combine pattern for 8-bit parity detection. gcc/testsuite/

Re: [patch,avr] PR114975: Better 8-bit popcount detection.

2024-05-07 Thread Jeff Law
On 5/7/24 11:25 AM, Georg-Johann Lay wrote: Add a pattern for better popcount detection. Ok for master? Johann -- AVR: target/114975 - Add combine-pattern for __popcountqi2. PR target/114975 gcc/ * config/avr/avr.md: Add combine pattern for 8-bit popcount detection.

Re: [PATCH][risc-v] libstdc++: Preserve signbit of nan when converting float to double [PR113578]

2024-05-07 Thread Jeff Law
On 5/7/24 9:36 AM, Andreas Schwab wrote: On May 07 2024, Jonathan Wakely wrote: +#ifdef __riscv + return _M_insert(__builtin_copysign((double)__f, + (double)-__builtin_signbit(__f)); Should this use static_cast? And it's missing a close

Re: [RFA][RISC-V] [PATCH v2] Enable inlining str* by default

2024-05-07 Thread Jeff Law
On 5/4/24 8:41 AM, Jeff Law wrote: The CI system caught a latent bug in the inline string comparison code that shows up with rv32+zbb.  It was hardcoding 64 when AFAICT it should have been using BITS_PER_WORD. So v2 with that fixed. So per the discussion in today's call I reviewed a couple

Re: [PATCH][risc-v] libstdc++: Preserve signbit of nan when converting float to double [PR113578]

2024-05-07 Thread Jeff Law
On 5/7/24 8:06 AM, Jonathan Wakely wrote: On Tue, 7 May 2024 at 14:57, Jeff Law wrote: On 5/7/24 7:49 AM, Jonathan Wakely wrote: Do we want this change for RISC-V, to fix PR113578? I haven't tested it on RISC-V, only on x86_64-linux (where it doesn't do anything). -- >8 -- libs

Re: [PATCH][risc-v] libstdc++: Preserve signbit of nan when converting float to double [PR113578]

2024-05-07 Thread Jeff Law
On 5/7/24 7:49 AM, Jonathan Wakely wrote: Do we want this change for RISC-V, to fix PR113578? I haven't tested it on RISC-V, only on x86_64-linux (where it doesn't do anything). -- >8 -- libstdc++-v3/ChangeLog: PR libstdc++/113578 * include/std/ostream

[RISC-V][V2] Fix incorrect if-then-else nesting of Zbs usage in constant synthesis

2024-05-06 Thread Jeff Law
Reposting without the patch that ignores whitespace. The CI system doesn't like including both patches, that'll generate a failure to apply and none of the tests actually get run. So I managed to goof the if-then-else level of the bseti bits last week. They were supposed to be a last ditch

Re: [PATCH 1/1] RISC-V: Add Zfbfmin extension to the -march= option

2024-05-06 Thread Jeff Law
On 4/11/24 9:32 PM, Xiao Zeng wrote: This patch would like to add new sub extension (aka Zfbfmin) to the -march= option. It introduces a new data type BF16. 1 The Zfbfmin extension depends on 'F', and the FLH, FSH, FMV.X.H, and FMV.H.X instructions as defined in the Zfh extension. 2 The

Re: [PATCH] RISC-V: Add zero_extract support for rv64gc

2024-05-06 Thread Jeff Law
On 5/6/24 3:42 PM, Vineet Gupta wrote: On 5/6/24 13:40, Christoph Müllner wrote: The combiner attempts to optimize a zero-extension of a logical right shift using zero_extract. We already utilize this optimization for those cases that result in a single instruction. Let's add a

Re: [NOT CODE REVIEW] [PATCH v3 1/1] [RISC-V] Add support for _Bfloat16

2024-05-06 Thread Jeff Law
On 5/5/24 6:38 PM, Xiao Zeng wrote: 1 At point , BF16 has already been completed "post public review". 2 LLVM has also added support for RISCV BF16 in and . 3 According to

Re: [PATCH] RISC-V: Document -mcmodel=large

2024-05-06 Thread Jeff Law
On 12/20/23 11:13 AM, Jeff Law wrote: On 12/20/23 11:08, Palmer Dabbelt wrote: This slipped through the cracks. Probably also NEWS-worthy. gcc/ChangeLog: * doc/invoke.texi (RISC-V): Add -mcmodel=large. OK. And yes, I think we're going to need to do a new/changes update

Re: [RFA][RISC-V] Use "uw" forms for constant synthesis

2024-05-06 Thread Jeff Law
On 5/4/24 6:53 PM, Jeff Law wrote: So another constant synthesis improvement. In this patch we're looking at cases where we'd like to be able to use lui+slli, but can't because of the sign extending nature of lui on TARGET_64BIT.  For example: 0x800110020UL.  The trunk currently

Re: [PATCH] RISC-V: Add zero_extract support for rv64gc

2024-05-06 Thread Jeff Law
On 5/6/24 2:40 PM, Christoph Müllner wrote: The combiner attempts to optimize a zero-extension of a logical right shift using zero_extract. We already utilize this optimization for those cases that result in a single instruction. Let's add an insn_and_split pattern that also matches the

Re: [PATCH v2 1/1] [RISC-V] Add support for _Bfloat16

2024-05-06 Thread Jeff Law
On 5/4/24 8:08 PM, Xiao Zeng wrote: https://github.com/ewlu/gcc-precommit-ci/issues/1412#issuecomment-2031568644 In the future, my patch will strictly adhere to the formatting suggestions provided by CI. No worries. Even those of us who have been working on the project for 30+ years

[RISC-V] Fix incorrect if-then-else nesting of Zbs usage in constant synthesis

2024-05-06 Thread Jeff Law
So I managed to goof the if-then-else level of the bseti bits last week. They were supposed to be a last ditch effort to improve the result, but ended up inside a conditional where they don't really belong. I almost always use Zba, Zbb and Zbs together, so it slipped by. So it's NFC if you

Re: [RFA][RISC-V] Use "uw" forms for constant synthesis

2024-05-05 Thread Jeff Law
On 5/4/24 6:53 PM, Jeff Law wrote: So another constant synthesis improvement. In this patch we're looking at cases where we'd like to be able to use lui+slli, but can't because of the sign extending nature of lui on TARGET_64BIT.  For example: 0x800110020UL.  The trunk currently

[RFA][RISC-V] Use "uw" forms for constant synthesis

2024-05-04 Thread Jeff Law
So another constant synthesis improvement. In this patch we're looking at cases where we'd like to be able to use lui+slli, but can't because of the sign extending nature of lui on TARGET_64BIT. For example: 0x800110020UL. The trunk currently generates 4 instructions for that constant,
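The sign-extension problem mentioned above can be seen with a small C model of the instructions involved. The function names are hypothetical models, not GCC internals, and the sequence shown is not claimed to be what the patch emits: it only builds the upper part (0x800110000) of the example constant to show why plain slli fails where the zero-extending slli.uw form from Zba works.

```c
#include <assert.h>
#include <stdint.h>

/* Model of RV64 lui: the 20-bit immediate lands in bits 31:12 and the
   result is sign-extended from bit 31 to 64 bits.  */
uint64_t rv64_lui(uint32_t imm20)
{
    assert(imm20 < (UINT32_C(1) << 20));
    uint64_t low = (uint64_t)imm20 << 12;
    if (low & UINT64_C(0x80000000))
        low |= UINT64_C(0xFFFFFFFF00000000);   /* sign extension */
    return low;
}

/* Model of slli: plain 64-bit left shift.  */
uint64_t rv64_slli(uint64_t rs1, unsigned shamt)
{
    assert(shamt < 64);
    return rs1 << shamt;
}

/* Model of Zba slli.uw: zero-extend the low 32 bits, then shift left.  */
uint64_t rv64_slli_uw(uint64_t rs1, unsigned shamt)
{
    assert(shamt < 64);
    return (uint64_t)(uint32_t)rs1 << shamt;
}
```

With lui 0x80011, the plain shift keeps the unwanted high 1 bits, while slli.uw discards them before shifting.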

Re: [PATCH v2 1/1] [RISC-V] Add support for _Bfloat16

2024-05-04 Thread Jeff Law
On 4/2/24 3:22 AM, Xiao Zeng wrote: 1 At point , BF16 has already been completed "post public review". 2 LLVM has also added support for RISCV BF16 in and . 3 According to

[RFA][RISC-V] [PATCH v2] Enable inlining str* by default

2024-05-04 Thread Jeff Law
The CI system caught a latent bug in the inline string comparison code that shows up with rv32+zbb. It was hardcoding 64 when AFAICT it should have been using BITS_PER_WORD. So v2 with that fixed. -- So with Christoph's patches from late 2022 we've had the ability to inline strlen, and

[RFA][RISC-V] Enable inlining str* by default

2024-05-03 Thread Jeff Law
So with Christoph's patches from late 2022 we've had the ability to inline strlen, and str[n]cmp (scalar). However, we never actually turned this capability on by default! This patch flips those defaults to allow inlining by default. It also fixes one bug exposed by our internal

Re: [PATCH] DCE __cxa_atexit calls where the function is pure/const [PR19661]

2024-05-03 Thread Jeff Law
On 5/2/24 3:56 PM, Andrew Pinski wrote: In C++ sometimes you have a destructor function which is "empty", for example with unions or with arrays. The front-end might not know it is empty either, so this should be done during optimization. To implement it I added it to DCE

[committed][RISC-V] Fix nearbyint failure on rv32 and formatting nits

2024-05-02 Thread Jeff Law
jeff commit 8367c996e55b2c54aeee25e446357a1015a1d11d Author: Jeff Law Date: Thu May 2 17:13:12 2024 -0600 [committed][RISC-V] Fix nearbyint failure on rv32 and formatting nits The CI system tripped an execution failure for rv32 with the ceil/round patch. The fundamental problem is th

Re: [committed] [RISC-V] Improve floor, ceil & related operations for RISC-V

2024-05-02 Thread Jeff Law
On 5/1/24 12:44 PM, Patrick O'Neill wrote: FAIL: gcc.target/riscv/rvv/autovec/unop/math-nearbyint-run-2.c execution test on rv32gcv newlib/linux. So the issue here is the code tried to handle DFmode inputs for rv32 by converting to a SImode integer. That's not a good idea on multiple
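
One reason a trip through a 32-bit integer is risky for double inputs can be sketched as follows. This is an illustrative model only: the function names are made up, and the saturating behaviour is an assumption about how a double to signed 32-bit conversion treats out-of-range values, not a description of the exact failing sequence. The point is that a double can carry a fractional part at magnitudes well beyond what a 32-bit integer can hold.

```python
import math

INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def to_int32_round_up(x):
    """Round a finite double up to an integer, saturating to the int32 range."""
    n = math.ceil(x)
    return max(INT32_MIN, min(INT32_MAX, n))


def ceil_via_int32(x):
    """Hypothetical ceil() built from a round trip through a 32-bit integer."""
    return float(to_int32_round_up(x))


small = 2.5
large = 3000000000.5   # exactly representable, has a fraction, exceeds INT32_MAX

print(ceil_via_int32(small), float(math.ceil(small)))
print(ceil_via_int32(large), float(math.ceil(large)))
```

For the small input both approaches agree. For the large input the 32-bit round trip saturates, while a correct ceil has to return 3000000001.0.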

Re: [PATCH 3/3] combine: initialize a local var

2024-05-02 Thread Jeff Law
On 5/2/24 12:59 PM, Vineet Gupta wrote: This is no logic change (but technically still a functional change). Ran into this when stepping thru combine code. @newpat has some random garbage for a bit until it is actually set. With the fix it remains 0 until actually set. gcc/ChangeLog:

Re: [PATCH 2/3] RISC-V: miscll comment fixes [NFC]

2024-05-02 Thread Jeff Law
On 5/2/24 12:59 PM, Vineet Gupta wrote: gcc/ChangeLog: * config/riscv/riscv.cc: Comment updates. * config/riscv/riscv.h: Ditto. OK jeff

Re: [PATCH 1/3] docs: rtl: document GET_MODE_INNER

2024-05-02 Thread Jeff Law
On 5/2/24 12:59 PM, Vineet Gupta wrote: gcc/ChangeLog * doc/rtl.texi: Add entry for GET_MODE_INNER. Signed-off-by: Vineet Gupta --- gcc/doc/rtl.texi | 4 1 file changed, 4 insertions(+) diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi index 8ea6588cb71f..f1643f41dfc6 100644

Re: [RFA][RISC-V] Improve constant synthesis for constants with 2 bits set

2024-05-02 Thread Jeff Law
On 5/2/24 11:28 AM, Palmer Dabbelt wrote: What's the "A" that you're requesting? Review/Approval :-) Build and regression tested on rv64gc.  OK for the trunk? The CI picked up some Zbb-subsuming targets too.  There's some minor comments, but Reviewed-by: Palmer Dabbelt Acked-by:

[committed] [RISC-V] Don't run new rounding tests on newlib risc-v targets

2024-05-02 Thread Jeff Law
commit 1e29da0b6508b23a7a6b14a7fb643b917a195003 Author: Jeff Law Date: Thu May 2 08:42:32 2024 -0600 [committed] [RISC-V] Don't run new rounding tests on newlib risc-v targets The new round_32.c and round_64.c tests depend on the optimizers to recognize the conversions feeding the floor/ceil calls and convert

Re: [committed] [RISC-V] Improve floor, ceil & related operations for RISC-V

2024-05-01 Thread Jeff Law
On 5/1/24 12:44 PM, Patrick O'Neill wrote: It also introduced: FAIL: gcc.target/riscv/rvv/autovec/unop/math-nearbyint-run-2.c execution test on rv32gcv newlib/linux. I think I see what's going on here as well. Need to ponder this one a bit longer, but I'm confident I'll be able to sort

Re: [committed] [RISC-V] Improve floor, ceil & related operations for RISC-V

2024-05-01 Thread Jeff Law
On 5/1/24 12:44 PM, Patrick O'Neill wrote: Hi Jeff, It looks like this patch's gcc.target/riscv/round_64.c testcase doesn't pass when run with newlib. So I expected this would ultimately end up being a case where certain builtins aren't enabled when we're using a newlib based C library

Re: [committed] [RISC-V] Improve floor, ceil & related operations for RISC-V

2024-05-01 Thread Jeff Law
On 5/1/24 12:44 PM, Patrick O'Neill wrote: Hi Jeff, It looks like this patch's gcc.target/riscv/round_64.c testcase doesn't pass when run with newlib. Looks like a testsuite error as much as anything. The test relies on the gimple optimizers to propagate the input parameters to their use

Re: [committed] [RISC-V] Improve floor, ceil & related operations for RISC-V

2024-05-01 Thread Jeff Law
On 5/1/24 12:44 PM, Patrick O'Neill wrote: Hi Jeff, It looks like this patch's gcc.target/riscv/round_64.c testcase doesn't pass when run with newlib. It also introduced: FAIL: gcc.target/riscv/rvv/autovec/unop/math-nearbyint-run-2.c execution test on rv32gcv newlib/linux.

[committed] [RISC-V] Trivial pattern cleanup

2024-05-01 Thread Jeff Law
; coordination branch. jeff commit 76ca6e1f8b1524b82a871ce29cf58c79e5e77e2b Author: Jeff Law Date: Wed May 1 12:43:37 2024 -0600 [committed] [RISC-V] Trivial pattern cleanup As I was reviewing and cleaning up some internal work, I noticed a particular idiom being used elsewhere in the RISC-V back

[committed] [RISC-V] Fix detection of store pair fusion cases

2024-05-01 Thread Jeff Law
Author: Jeff Law Date: Wed May 1 11:28:41 2024 -0600 [committed] [RISC-V] Fix detection of store pair fusion cases We've got the ability to count the number of store pair fusions happening in the front-end of the pipeline. When comparing some code from last year vs the current
