Re: [RFC] gcc: xtensa: use salt/saltu in xtensa_expand_scc

2023-09-08 Thread Takayuki 'January June' Suwa via Gcc-patches
Hi! On 2023/09/07 23:22, Max Filippov wrote: > gcc/ > * config/xtensa/predicates.md (xtensa_cstoresi_operator): Add > unsigned comparisons. > * config/xtensa/xtensa.cc (xtensa_expand_scc): Add code > generation of salt/saltu instructions. > * config/xtensa/xtensa.h

[PATCH] xtensa: Optimize several boolean evaluations of EQ/NE against constant zero

2023-09-08 Thread Takayuki 'January June' Suwa via Gcc-patches
An idiomatic implementation of boolean evaluation of whether a register is zero or not in Xtensa is to assign 0 and 1 to the temporary and destination, and then issue the MOV[EQ/NE]Z machine instruction (See 8.3.2 Instruction Idioms, Xtensa ISA refman., p.599): ;; A2 = (A3 != 0) ? 1 : 0;

Re: [PATCH] xtensa: Optimize boolean evaluation when SImode EQ/NE to zero if TARGET_MINMAX

2023-09-05 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/09/06 8:01, Max Filippov wrote: > Hi Suwa-san, Hi! > > On Tue, Sep 5, 2023 at 2:29 AM Takayuki 'January June' Suwa > wrote: >> >> This patch optimizes the boolean evaluation for equality to 0 in SImode >> using the MINU (Minimum Value Unsigned) machine instruction available >> when

[PATCH] xtensa: Optimize boolean evaluation when SImode EQ/NE to zero if TARGET_MINMAX

2023-09-05 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch optimizes the boolean evaluation for equality to 0 in SImode using the MINU (Minimum Value Unsigned) machine instruction available when TARGET_MINMAX is configured, for example, (x != 0) to MINU(x, 1) and (x == 0) to (MINU(x, 1) ^ 1). /* example */ int test0(int x) {
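The two identities named in this post can be checked in plain C. This is only a sketch: `minu`, `ne_zero_minu` and `eq_zero_minu` are hypothetical helper names, and `minu` is a software stand-in for the MINU instruction rather than anything taken from the patch.

```c
#include <stdint.h>

/* Stand-in for MINU: unsigned minimum of two 32-bit words (hypothetical helper). */
uint32_t minu(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/* (x != 0) expressed as MINU(x, 1): any non-zero word is clamped down to 1. */
int ne_zero_minu(int x)
{
    return (int)minu((uint32_t)x, 1u);
}

/* (x == 0) expressed as MINU(x, 1) ^ 1: invert the single result bit. */
int eq_zero_minu(int x)
{
    return (int)(minu((uint32_t)x, 1u) ^ 1u);
}
```

Negative inputs work because the comparison is unsigned, so -1 is treated as 0xFFFFFFFF and still clamps to 1.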

[PATCH] xtensa: Use HARD_REG_SET instead of bare integer

2023-07-03 Thread Takayuki 'January June' Suwa via Gcc-patches
gcc/ChangeLog: * config/xtensa/xtensa.cc (machine_function, xtensa_expand_prologue): Change to use HARD_REG_BIT and its macros. * config/xtensa/xtensa.md (peephole2: regmove elimination during DFmode input reload): Likewise. --- gcc/config/xtensa/xtensa.cc

[PATCH 1/2] xtensa: Fix missing mode warning in "*eqne_INT_MIN"

2023-07-01 Thread Takayuki 'January June' Suwa via Gcc-patches
gcc/ChangeLog: * config/xtensa/xtensa.md (*eqne_INT_MIN): Add missing ":SI" to the match_operator. --- gcc/config/xtensa/xtensa.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md index

[PATCH 2/2] xtensa: The use of CLAMPS instruction also requires TARGET_MINMAX, as well as TARGET_CLAMPS

2023-07-01 Thread Takayuki 'January June' Suwa via Gcc-patches
Because both smin and smax requiring TARGET_MINMAX are essential to the RTL representation. gcc/ChangeLog: * config/xtensa/xtensa.cc (xtensa_match_CLAMPS_imms_p): Simplify. * config/xtensa/xtensa.md (*xtensa_clamps): Add TARGET_MINMAX to the condition. ---

[PATCH 1/2] xtensa: Remove TARGET_MEMORY_MOVE_COST hook

2023-06-18 Thread Takayuki 'January June' Suwa via Gcc-patches
It used to always return a constant 4, which is the same as the default behavior, but doesn't take into account the effects of secondary reloads. Therefore, the implementation of this target hook is removed. gcc/ChangeLog: * config/xtensa/xtensa.cc (TARGET_MEMORY_MOVE_COST,

[PATCH 2/2] xtensa: constantsynth: Add new 2-insns synthesis pattern

2023-06-18 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch adds a new 2-instructions constant synthesis pattern: - A non-negative square value whose root can fit into a signed 12-bit: => "MOVI(.N) Ax, simm12" + "MULL Ax, Ax, Ax" Due to the execution cost of the integer multiply instruction (MULL), this synthesis works only when the 32-bit
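A rough sketch of the test that decides whether a constant qualifies for the MOVI + MULL synthesis. The function name `simm12_square_root` is hypothetical, and the sketch only considers roots 0 through 2047; whether the patch also accepts the negative root -2048 is not visible in this excerpt.

```c
#include <stdint.h>

/* Returns r in 0..2047 such that r * r == v, or -1 if no such root exists.
   Hypothetical helper; only non-negative roots are considered here. */
int32_t simm12_square_root(int32_t v)
{
    int32_t r;

    if (v < 0)
        return -1;
    for (r = 0; r <= 2047; r++) {
        if (r * r == v)
            return r;       /* MOVI Ax, r ; MULL Ax, Ax, Ax would yield v */
        if (r * r > v)
            break;          /* squares only grow, so stop early */
    }
    return -1;
}
```

For example, 1000000 would be synthesized from the root 1000, while 1000001 has no integer root and would fall back to some other method.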

Re: [PATCH v2] xtensa: Optimize boolean evaluation or branching when EQ/NE to zero in S[IF]mode

2023-06-05 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/06/06 0:15, Max Filippov wrote: > Hi Suwa-san, Hi! Thanks for your regtest every time. > > On Mon, Jun 5, 2023 at 2:37 AM Takayuki 'January June' Suwa > wrote: >> >> This patch optimizes the boolean evaluation of EQ/NE against zero >> by adding two insn_and_split patterns similar to

[PATCH v2] xtensa: Optimize boolean evaluation or branching when EQ/NE to zero in S[IF]mode

2023-06-05 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch optimizes the boolean evaluation of EQ/NE against zero by adding two insn_and_split patterns similar to SImode conditional store: "eq_zero": op0 = (op1 == 0) ? 1 : 0; op0 = clz(op1) >> 5; /* optimized (requires TARGET_NSA) */ "movsicc_ne0_reg_0": op0 = (op1 !=
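The "eq_zero" idiom relies on a count-leading-zeros that yields 32 for a zero input, so that shifting right by 5 leaves 1 only in that case. A portable sketch follows; `clz32` and `eq_zero_clz` are hypothetical names, and `clz32` is written out by hand because GCC's `__builtin_clz` is undefined for 0.

```c
#include <stdint.h>

/* Count leading zeros of a 32-bit word, returning 32 for 0 (hypothetical helper). */
unsigned clz32(uint32_t v)
{
    unsigned n = 0;

    if (v == 0)
        return 32;
    while (!(v & 0x80000000u)) {
        v <<= 1;
        n++;
    }
    return n;
}

/* (x == 0) ? 1 : 0, computed as clz(x) >> 5: only the value 32 has bit 5 set. */
int eq_zero_clz(int x)
{
    return (int)(clz32((uint32_t)x) >> 5);
}
```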

[PATCH] xtensa: Optimize boolean evaluation or branching when EQ/NE to INT_MIN

2023-06-03 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch optimizes both the boolean evaluation of and the branching on EQ/NE against INT_MIN (-2147483648), by taking advantage of the specification that the ABS machine instruction on Xtensa returns INT_MIN iff its input is INT_MIN, and a non-negative value otherwise. /* example */ int test0(int x) {
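The property being exploited can be modelled in C as follows. This is a sketch only: `abs_wrap` is a hypothetical stand-in for the ABS instruction, written with unsigned arithmetic because calling the C library `abs` on INT_MIN is undefined. The final conversion back to `int32_t` is implementation-defined in C and is assumed to wrap, as it does with GCC.

```c
#include <stdint.h>

/* Models a two's-complement ABS that wraps, so ABS(INT32_MIN) == INT32_MIN. */
int32_t abs_wrap(int32_t x)
{
    uint32_t u = (uint32_t)x;

    if (x < 0)
        u = 0u - u;         /* negate modulo 2^32 */
    return (int32_t)u;      /* assumed to wrap for u > INT32_MAX */
}

/* x == INT32_MIN exactly when the wrapped absolute value is still negative. */
int eq_int_min(int32_t x)
{
    return abs_wrap(x) < 0;
}
```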

[PATCH] xtensa: Optimize boolean evaluation or branching when EQ/NE to zero in S[IF]mode

2023-06-03 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch optimizes the boolean evaluation of EQ/NE against zero by adding two insn_and_split patterns similar to SImode conditional store: "eq_zero": op0 = (op1 == 0) ? 1 : 0; op0 = clz(op1) >> 5; /* optimized (requires TARGET_NSA) */ "movsicc_ne0_reg_0": op0 = (op1 !=

Re: [PATCH 2/3 v3] xtensa: Add 'adddi3' and 'subdi3' insn patterns

2023-06-01 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/06/01 23:20, Max Filippov wrote: > On Wed, May 31, 2023 at 11:01 PM Takayuki 'January June' Suwa > wrote: >> More optimized than the default RTL generation. >> >> gcc/ChangeLog: >> >> * config/xtensa/xtensa.md (adddi3, subdi3): >> New RTL generation patterns implemented

[PATCH 2/3 v3] xtensa: Add 'adddi3' and 'subdi3' insn patterns

2023-06-01 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/05/31 15:02, Max Filippov wrote: Hi! > On Tue, May 30, 2023 at 2:50 AM Takayuki 'January June' Suwa > wrote: >> >> Resubmitting the correct one due to a mistake in merging order of fixes. >> --- >> More optimized than the default RTL generation. >> >> gcc/ChangeLog: >> >> *

[PATCH 3/3 v2] xtensa: Optimize 'cstoresi4' insn pattern

2023-05-30 Thread Takayuki 'January June' Suwa via Gcc-patches
Resubmitting the correct one due to a mistake in merging order of fixes. --- This patch introduces more optimized implementations for the 6 cstoresi4 insn comparison methods (eq/ne/lt/le/gt/ge, however, required TARGET_NSA for eq). gcc/ChangeLog: * config/xtensa/xtensa.cc

[PATCH 2/3 v2] xtensa: Add 'adddi3' and 'subdi3' insn patterns

2023-05-30 Thread Takayuki 'January June' Suwa via Gcc-patches
Resubmitting the correct one due to a mistake in merging order of fixes. --- More optimized than the default RTL generation. gcc/ChangeLog: * config/xtensa/xtensa.md (adddi3, subdi3): New RTL generation patterns implemented according to the instruction idioms described

[PATCH 3/3] xtensa: Optimize 'cstoresi4' insn pattern

2023-05-30 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch introduces more optimized implementations for the 6 cstoresi4 insn comparison methods (eq/ne/lt/le/gt/ge, however, required TARGET_NSA for eq). gcc/ChangeLog: * config/xtensa/xtensa.cc (xtensa_expand_scc): Add dedicated optimization code for cstoresi4

[PATCH 2/3] xtensa: Add 'adddi3' and 'subdi3' insn patterns

2023-05-30 Thread Takayuki 'January June' Suwa via Gcc-patches
More optimized than the default RTL generation. gcc/ChangeLog: * config/xtensa/xtensa.md (adddi3, subdi3): New RTL generation patterns implemented according to the instruction idioms described in the Xtensa ISA reference manual (p. 600). --- gcc/config/xtensa/xtensa.md

[PATCH 1/3] xtensa: Improve "*shlrd_reg" insn pattern and its variant

2023-05-30 Thread Takayuki 'January June' Suwa via Gcc-patches
The insn "*shlrd_reg" shifts two registers with a funnel shifter by the third register to get a single word result: reg0 = (reg1 SHIFT_OP0 reg3) BIT_JOIN_OP (reg2 SHIFT_OP1 (32 - reg3)) where the funnel left shift is SHIFT_OP0 := ASHIFT, SHIFT_OP1 := LSHIFTRT and its right shift is SHIFT_OP0

[PATCH 3/3] xtensa: Rework 'setmemsi' insn pattern

2023-05-25 Thread Takayuki 'January June' Suwa via Gcc-patches
In order to reject voodoo estimation logic with lots of magic numbers, this patch revises the code to measure the costs of the three memset methods based on the actual emission size of the insn sequence corresponding to each method and choose the smallest one. gcc/ChangeLog: *

[PATCH 1/3] xtensa: Addendum of the commit e33d2dcb463161a110ac345a451132ce8b2b23d9

2023-05-25 Thread Takayuki 'January June' Suwa via Gcc-patches
gcc/ChangeLog: * config/xtensa/xtensa.md (*extzvsi-1bit_ashlsi3): Retract excessive line folding, and correct the value of the "length" insn attribute related to TARGET_DENSITY. (*extzvsi-1bit_addsubx): Ditto. --- gcc/config/xtensa/xtensa.md | 11 ++- 1

[PATCH 2/3] xtensa: Add 'subtraction from constant' insn pattern

2023-05-25 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch tries to eliminate the use of a temporary pseudo for '(minus:SI (const_int) (reg:SI))' if the addition of the negative constant value can be emitted in a single machine instruction. /* example */ int test0(int x) { return 1 - x; } int test1(int x) { return 100 - x;

[PATCH v2] xtensa: Optimize '(x & CST1_POW2) != 0 ? CST2_POW2 : 0'

2023-05-22 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/05/23 11:27, Max Filippov wrote: > Hi Suwa-san, Hi! > This change introduces a bunch of test failures on big endian configuration. > I believe that's because the starting bit position for zero_extract is counted > from different ends depending on the endianness. Oops, what a stupid

[PATCH 1/2] xtensa: Optimize '(x & CST1_POW2) != 0 ? CST2_POW2 : 0'

2023-05-22 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch decreases one machine instruction from "single bit extraction with shifting" operation, and tries to eliminate the conditional branch if CST2_POW2 doesn't fit into signed 12 bits with the help of ifcvt optimization. /* example #1 */ int test0(int x) { return (x & 1048576)

[PATCH 2/2] xtensa: Merge '*addx' and '*subx' insn patterns into one

2023-05-22 Thread Takayuki 'January June' Suwa via Gcc-patches
By making use of the 'addsub_operator' added in the last patch. gcc/ChangeLog: * config/xtensa/xtensa.md (*addsubx): Rename from '*addx', and change to also accept '*subx' pattern. (*subx): Remove. --- gcc/config/xtensa/xtensa.md | 31 +-- 1

[PATCH v2] xtensa: Make full transition to LRA

2023-05-08 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/05/08 22:43, Richard Biener wrote: [snip] >> -mlra > > If they were in any released compiler options should be kept > (doing nothing) for backward compatibility. Use for example > > mlra > Target WarnRemoved > Removed in GCC 14. This switch has no effect. > > or > > mlra > Target

[PATCH] xtensa: Make full transition to LRA

2023-05-08 Thread Takayuki 'January June' Suwa via Gcc-patches
gcc/ChangeLog: * config/xtensa/constraints.md (R, T, U): Change define_constraint to define_memory_constraint. * config/xtensa/xtensa.cc (xtensa_lra_p, TARGET_LRA_P): Remove. (xtensa_emit_move_sequence): Remove "if (reload_in_progress)" clause as it

[PATCH] xtensa: Remove REG_OK_STRICT and its derivatives

2023-03-12 Thread Takayuki 'January June' Suwa via Gcc-patches
Because GO_IF_LEGITIMATE_ADDRESS was deprecated a long time ago (see commit c6c3dba931548987c78719180e30ebc863404b89). gcc/ChangeLog: * config/xtensa/xtensa.h (REG_OK_STRICT, REG_OK_FOR_INDEX_P, REG_OK_FOR_BASE_P): Remove. --- gcc/config/xtensa/xtensa.h | 11 +-- 1 file

[PATCH] xtensa: Fix for enabling LRA

2023-03-07 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch makes LRA work well, with some exceptions (e.g. MI thunk generation due to pretending reload_completed). gcc/ChangeLog: * config/xtensa/constraints.md (R, T, U): Change define_constraint to define_memory_constraint. * config/xtensa/xtensa.cc

[PATCH] xtensa: Make use of CLAMPS instruction if configured

2023-02-26 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch introduces the use of CLAMPS instruction when the instruction is configured. /* example */ int test(int a) { if (a < -512) return -512; if (a > 511) return 511; return a; } ;; prereq: TARGET_CLAMPS test: clamps a2, a2, 9
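For readers unfamiliar with the instruction, the effect of `clamps a2, a2, 9` in the example above can be modelled like this. The function `clamps_model` is a hypothetical software model, not code from the patch, and it does not check the limited range of immediates that the ISA actually allows.

```c
#include <stdint.h>

/* Models CLAMPS: saturate x to the signed range -(2^imm) .. (2^imm - 1).
   Hypothetical helper; imm is assumed to be small enough that 1 << imm fits. */
int32_t clamps_model(int32_t x, unsigned imm)
{
    int32_t lo = -((int32_t)1 << imm);
    int32_t hi = ((int32_t)1 << imm) - 1;

    if (x < lo)
        return lo;
    if (x > hi)
        return hi;
    return x;
}
```

With `imm` equal to 9 the bounds are -512 and 511, which matches the two comparisons in the C source of the example.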

Re: [PATCH] gcc: xtensa: fix PR target/108919

2023-02-25 Thread Takayuki 'January June' Suwa via Gcc-patches
Hello, Max: On 2023/02/25 19:01, Max Filippov wrote: > gcc/ > PR target/108919 > > * config/xtensa/xtensa-protos.h > (xtensa_prepare_expand_call): Rename to xtensa_expand_call. > * config/xtensa/xtensa.cc (xtensa_prepare_expand_call): Rename > to

[PATCH 2/2] xtensa: Fix missing mode warnings in machine description

2023-02-22 Thread Takayuki 'January June' Suwa via Gcc-patches
gcc/ChangeLog: * config/xtensa/xtensa.md (zero_cost_loop_start, zero_cost_loop_end, loop_end): Add missing "SI:" to PLUS RTXes. --- gcc/config/xtensa/xtensa.md | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/gcc/config/xtensa/xtensa.md

[PATCH 1/2] xtensa: Fix non-fatal regression introduced by b2ef02e8cbbaf95fee98be255f697f47193960ec

2023-02-22 Thread Takayuki 'January June' Suwa via Gcc-patches
In commit b2ef02e8cbbaf95fee98be255f697f47193960ec, the sibling call insn included (use (reg:SI A0_REG)) to fix the problem, which added a USE chain unconditionally to the data flow of register A0 during the sibling call. As a result, df_regs_ever_live_p (A0_REG) returns true, so even if register

[PATCH] xtensa: Enforce return address saving when -Og is specified

2023-02-17 Thread Takayuki 'January June' Suwa via Gcc-patches
Leaf function often omits saving its return address to the stack slot, and this feature often makes debugging very confusing, especially for stack dump analysis. gcc/ChangeLog: * config/xtensa/xtensa.cc (xtensa_call_save_reg): Change to return true if register A0 (return address

[PATCH v5] xtensa: Eliminate unnecessary general-purpose reg-reg moves

2023-02-17 Thread Takayuki 'January June' Suwa via Gcc-patches
Register-register move instructions that can be easily seen as unnecessary by the human eye may remain in the compiled result. For example: /* example */ double test(double a, double b) { return __builtin_copysign(a, b); } test: add.n a3, a3, a3 extui a5, a5, 31, 1

[PATCH v7] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-02-16 Thread Takayuki 'January June' Suwa via Gcc-patches
In the case of the CALL0 ABI, values that must be retained before and after function calls are placed in the callee-saved registers (A12 through A15) and referenced later. However, it is often the case that the save and the reference are each only once and a simple register-register move (with

Re: [PATCH v6] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-02-16 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/02/16 7:18, Max Filippov wrote: > Hi Suwa-san, Hi! > > On Thu, Jan 26, 2023 at 7:17 PM Takayuki 'January June' Suwa > wrote: >> >> In the case of the CALL0 ABI, values that must be retained before and >> after function calls are placed in the callee-saved registers (A12 >> through A15)

[PATCH v6] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-01-26 Thread Takayuki 'January June' Suwa via Gcc-patches
In the case of the CALL0 ABI, values that must be retained before and after function calls are placed in the callee-saved registers (A12 through A15) and referenced later. However, it is often the case that the save and the reference are each only once and a simple register-register move (with

[PATCH v4] xtensa: Eliminate unnecessary general-purpose reg-reg moves

2023-01-23 Thread Takayuki 'January June' Suwa via Gcc-patches
Register-register move instructions that can be easily seen as unnecessary by the human eye may remain in the compiled result. For example: /* example */ double test(double a, double b) { return __builtin_copysign(a, b); } test: add.n a3, a3, a3 extui a5, a5, 31, 1

[PATCH v5] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-01-23 Thread Takayuki 'January June' Suwa via Gcc-patches
In the case of the CALL0 ABI, values that must be retained before and after function calls are placed in the callee-saved registers (A12 through A15) and referenced later. However, it is often the case that the save and the reference are each only once and a simple register-register move (with

Re: [PATCH v4] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-01-22 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/01/23 0:45, Max Filippov wrote: > On Fri, Jan 20, 2023 at 8:39 PM Takayuki 'January June' Suwa > wrote: >> On 2023/01/21 0:14, Max Filippov wrote: >>> After having this many attempts and getting to the issues that are >>> really hard to detect I wonder if the target backend is the right

Re: [PATCH v4] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-01-20 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/01/21 0:14, Max Filippov wrote: > Hi Suwa-san, Hi! > > On Wed, Jan 18, 2023 at 7:50 PM Takayuki 'January June' Suwa > wrote: >> >> In the previous patch, if insn is JUMP_INSN or CALL_INSN, it bypasses the >> reg check (possibly FAIL). >> >> = >> In the case of the CALL0 ABI, values

[PATCH] xtensa: Revise 89afb2e86fcb29c559b2957fdcbea0d01740c49b

2023-01-19 Thread Takayuki 'January June' Suwa via Gcc-patches
In the previously posted patch "xtensa: Make complex hard register clobber elimination more robust and accurate", the check code for insns that refer to the [DS]Cmode hard register before it is overwritten after it is clobbered is incomplete. Fortunately such insns are seldom emitted, so it

[PATCH v3] xtensa: Eliminate unnecessary general-purpose reg-reg moves

2023-01-18 Thread Takayuki 'January June' Suwa via Gcc-patches
Register-register move instructions that can be easily seen as unnecessary by the human eye may remain in the compiled result. For example: /* example */ double test(double a, double b) { return __builtin_copysign(a, b); } test: add.n a3, a3, a3 extui a5, a5, 31, 1

[PATCH v4] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-01-18 Thread Takayuki 'January June' Suwa via Gcc-patches
In the previous patch, if insn is JUMP_INSN or CALL_INSN, it bypasses the reg check (possibly FAIL). = In the case of the CALL0 ABI, values that must be retained before and after function calls are placed in the callee-saved registers (A12 through A15) and referenced later. However, it is

[PATCH] xtensa: Optimize inversion of the MSB

2023-01-17 Thread Takayuki 'January June' Suwa via Gcc-patches
Such an operation can be done with either a bitwise XOR or an addition of -2147483648, but the latter is one byte less if TARGET_DENSITY. gcc/ChangeLog: * config/xtensa/xtensa.md (xorsi3_internal): Rename from the original of "xorsi3". (xorsi3): New expansion pattern that emits
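The equivalence this post depends on is easy to confirm: adding 0x80000000 modulo 2^32 can only change bit 31, because any carry out of that bit is discarded. A small sketch with hypothetical function names:

```c
#include <stdint.h>

/* Invert the most significant bit with XOR. */
uint32_t flip_msb_xor(uint32_t x)
{
    return x ^ 0x80000000u;
}

/* Invert the most significant bit with a wrapping addition of 0x80000000
   (the bit pattern of -2147483648). */
uint32_t flip_msb_add(uint32_t x)
{
    return x + 0x80000000u;
}
```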

[PATCH v2] xtensa: Eliminate unnecessary general-purpose reg-reg moves

2023-01-17 Thread Takayuki 'January June' Suwa via Gcc-patches
Register-register move instructions that can be easily seen as unnecessary by the human eye may remain in the compiled result. For example: /* example */ double test(double a, double b) { return __builtin_copysign(a, b); } test: add.n a3, a3, a3 extui a5, a5, 31, 1

[PATCH v3] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-01-17 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/01/17 20:23, Max Filippov wrote: > Hi Suwa-san, Hi! > There's still a few regressions in tests with -fcompare-debug because > code generated with -g and without it is different: > E.g. check the following test with -g0 and -g: Again debug_insn is the problem... = In the case of the

[PATCH] xtensa: Eliminate unnecessary general-purpose reg-reg moves

2023-01-16 Thread Takayuki 'January June' Suwa via Gcc-patches
Register-register move instructions that can be easily seen as unnecessary by the human eye may remain in the compiled result. For example: /* example */ double test(double a, double b) { return __builtin_copysign(a, b); } test: add.n a3, a3, a3 extui a5, a5, 31, 1

[PATCH v2] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-01-16 Thread Takayuki 'January June' Suwa via Gcc-patches
In the case of the CALL0 ABI, values that must be retained before and after function calls are placed in the callee-saved registers (A12 through A15) and referenced later. However, it is often the case that the save and the reference are each only once and a simple register-register move (the

[PATCH] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-01-15 Thread Takayuki 'January June' Suwa via Gcc-patches
In the case of the CALL0 ABI, values that must be retained before and after function calls are placed in the callee-saved registers (A12 through A15) and referenced later. However, it is often the case that the save and the reference are each only once and a simple register-register move. e.g.

[PATCH] xtensa: Remove old broken tweak for leaf function

2023-01-13 Thread Takayuki 'January June' Suwa via Gcc-patches
In the before-IRA era, ORDER_REGS_FOR_LOCAL_ALLOC was called for each function in Xtensa, and there was register allocation table reordering for leaf functions to compensate for the poor performance of local-alloc. Today the adjustment hook is still called via its alternative

[PATCH 2/2] xtensa: Optimize ctzsi2 and ffssi2 a bit

2023-01-11 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch saves one byte when the Code Density Option is enabled. gcc/ChangeLog: * config/xtensa/xtensa.md (ctzsi2, ffssi2): Rearrange the emitting codes. --- gcc/config/xtensa/xtensa.md | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git

[PATCH 1/2] xtensa: Tune "*btrue" insn pattern

2023-01-11 Thread Takayuki 'January June' Suwa via Gcc-patches
This branch instruction has short encoding if EQ/NE comparison against immediate zero when the Code Density Option is enabled, but its "length" attribute was only for normal encoding. This patch fixes it. This patch also prevents undesirable replacement the comparison immediate zero of the

Re: [PATCH] ifcvt.cc: Prevent excessive if-conversion for conditional moves

2023-01-11 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/01/11 17:02, Robin Dapp wrote: > Hi, Hi! > >> On optimizing for speed, default_noce_conversion_profitable_p() allows >> plenty of headroom, so this patch has little impact. >> >> Also, if the target-specific cost estimate is accurate or allows for >> margins, the impact should be

[PATCH] ifcvt.cc: Prevent excessive if-conversion for conditional moves

2023-01-10 Thread Takayuki 'January June' Suwa via Gcc-patches
Currently, cond_move_process_if_block() does the conversion without balancing the cost of the converted sequence with the original one, but this should be checked by calling targetm.noce_conversion_profitable_p(). Doing so allows us to provide a way based on the target-specific cost estimate, to

[PATCH] xtensa: Make instruction cost estimation for size more accurate

2023-01-09 Thread Takayuki 'January June' Suwa via Gcc-patches
Until now, we applied COSTS_N_INSNS() (multiplying by 4) after dividing the instruction length by 3, so we couldn't express the difference less than modulo 3 in insn cost for size (e.g. 11 bytes and 12 bytes cost the same). This patch fixes that. ;; 2 bytes addi.n a2, a2, -1 ; cost 3 ;; 3
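The loss of precision described here comes from the order of the division and the multiplication. The sketch below illustrates it; `COSTS_N_INSNS` is reproduced with the same definition GCC uses (multiply by 4), but the function names and the round-up divisions are assumptions, chosen so that 11 and 12 bytes collide under the old order and a 2-byte instruction gets cost 3 under the new one, as in the excerpt. The actual rounding in the patch may differ.

```c
/* Same definition as in GCC's rtl.h, repeated so the sketch is standalone. */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Old order: divide the byte length by 3 first (round-up assumed), then scale. */
int size_cost_old(int len_bytes)
{
    return COSTS_N_INSNS((len_bytes + 2) / 3);
}

/* New order: scale first, then divide by 3 (round-up assumed). */
int size_cost_new(int len_bytes)
{
    return (COSTS_N_INSNS(len_bytes) + 2) / 3;
}
```

Under these assumptions the old order gives 16 for both 11 and 12 bytes, while the new order gives 15 and 16 respectively.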

[PATCH v2] xtensa: Optimize bitwise splicing operation

2023-01-07 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch optimizes the operation of cutting and splicing two register values at a specified bit position, in other words, combining (bitwise ORing) bits 0 through (C-1) of the register with bits C through 31 of the other, where C is the specified immediate integer 17 through 31. This typically

Re: [PATCH] xtensa: Optimize bitwise splicing operation

2023-01-07 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/01/08 6:53, Max Filippov wrote: > On Fri, Jan 6, 2023 at 6:55 PM Takayuki 'January June' Suwa > wrote: >> >> This patch optimizes the operation of cutting and splicing two register >> values at a specified bit position, in other words, combining (bitwise >> ORing) bits 0 through (C-1) of

[PATCH] xtensa: Optimize bitwise splicing operation

2023-01-06 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch optimizes the operation of cutting and splicing two register values at a specified bit position, in other words, combining (bitwise ORing) bits 0 through (C-1) of the register with bits C through 31 of the other, where C is the specified immediate integer 1 through 31. This typically
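In C terms, the splicing operation described in this post looks roughly like the following. The name `splice_bits` is hypothetical, and the sketch shows the source-level operation being recognized rather than the instruction sequence the patch emits for it.

```c
#include <stdint.h>

/* Take bits 0..(c-1) from lo_src and bits c..31 from hi_src.
   c must be in the range 1..31 so that the shift is well defined. */
uint32_t splice_bits(uint32_t lo_src, uint32_t hi_src, unsigned c)
{
    uint32_t low_mask = ((uint32_t)1 << c) - 1u;

    return (lo_src & low_mask) | (hi_src & ~low_mask);
}
```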

[PATCH v2] xtensa: Optimize stack frame adjustment more

2023-01-06 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch introduces a convenient helper function for integer immediate addition with scratch register as needed, that splits and emits either up to two ADDI/ADDMI machine instructions or an addition by register following an integer immediate load (which may later be transformed by

Re: [PATCH] xtensa: Optimize stack frame adjustment more

2023-01-06 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/01/06 17:05, Max Filippov wrote: > On Thu, Jan 5, 2023 at 10:57 PM Takayuki 'January June' Suwa > wrote: >> By using the helper function, it makes stack frame adjustment logic >> simplified and instruction count less in some cases. > > I've built a couple linux configurations with and

Re: [PATCH] xtensa: Optimize stack frame adjustment more

2023-01-05 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/01/06 15:26, Max Filippov wrote: > On Thu, Jan 5, 2023 at 7:35 PM Takayuki 'January June' Suwa > wrote: >> On second thought, it cannot be a good idea to split addition/subtraction to >> the stack pointer. >> >>> -4aaf: b0a192 movi a9, 0x1b0 >>> -4ab2: 1f9a

Re: [PATCH] xtensa: Optimize stack frame adjustment more

2023-01-05 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/01/06 6:32, Max Filippov wrote: > Hi Suwa-san, Hi! > > On Thu, Jan 5, 2023 at 3:57 AM Takayuki 'January June' Suwa > wrote: >> >> This patch introduces a convenient helper function for integer immediate >> addition with scratch register as needed, that splits and emits either >> up to

[PATCH] xtensa: Optimize stack frame adjustment more

2023-01-05 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch introduces a convenient helper function for integer immediate addition with scratch register as needed, that splits and emits either up to two ADDI/ADDMI machine instructions or an addition by register following an immediate integer load (which may later be transformed by

[PATCH] xtensa: Check DF availability before use

2022-12-29 Thread Takayuki 'January June' Suwa via Gcc-patches
Perhaps no problem, but for safety. gcc/ChangeLog: * config/xtensa/xtensa.cc (xtensa_expand_prologue): Fix to check DF availability before use of DF_* macros. --- gcc/config/xtensa/xtensa.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git

[PATCH] xtensa: Apply a few minor fixes

2022-12-26 Thread Takayuki 'January June' Suwa via Gcc-patches
Almost cosmetic and no functional changes. gcc/ChangeLog: * config/xtensa/*: Tabify, and trim trailing spaces. * config/xtensa/xtensa.h (GP_RETURN, GP_RETURN_REG_COUNT): Change to GP_RETURN_FIRST and GP_RETURN_LAST, respectively. * config/xtensa/xtensa.cc

Re: [PATCH 2/2] xtensa: Implement new target hook: TARGET_CONSTANT_OK_FOR_CPROP_P

2022-09-12 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2022/09/13 4:34, Max Filippov wrote: Hi! > On Sun, Sep 11, 2022 at 1:50 PM Takayuki 'January June' Suwa > wrote: >> >> This patch implements new target hook TARGET_CONSTANT_OK_FOR_CPROP_P in >> order to exclude CONST_INTs that cannot fit into a MOVI machine instruction >> from cprop. >> >>

[PATCH 2/2] xtensa: Implement new target hook: TARGET_CONSTANT_OK_FOR_CPROP_P

2022-09-11 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch implements new target hook TARGET_CONSTANT_OK_FOR_CPROP_P in order to exclude CONST_INTs that cannot fit into a MOVI machine instruction from cprop. gcc/ChangeLog: * config/xtensa/xtensa.c (TARGET_CONSTANT_OK_FOR_CPROP_P): New macro definition.

[PATCH 1/2] Add new target hook: constant_ok_for_cprop_p

2022-09-11 Thread Takayuki 'January June' Suwa via Gcc-patches
Hi, Many RISC machines, as we know, have some restrictions on placing register-width constants in the source of load-immediate machine instructions, so the target must provide a solution for that in the machine description. A naive way would be to solve it early, ie. to replace with read

[PATCH] xtensa: constantsynth: Add new 3-insns synthesis pattern

2022-09-10 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch adds a new 3-instructions constant synthesis pattern: - A value that can fit into a signed 12-bit after a number of either bitwise left or right rotations: => "MOVI(.N) Ax, simm12" + "SSAI (1 ... 11) or (21 ... 31)" + "SRC Ax, Ax, Ax" gcc/ChangeLog: *

[PATCH v4 1/2] xtensa: Eliminate unused stack frame allocation/freeing

2022-09-08 Thread Takayuki 'January June' Suwa via Gcc-patches
Changes from v3: (xtensa_expand_prologue): Changed to exclude debug insns from DF use chain analysis. --- In the example below, 'x' is once placed on the stack frame and then read into registers as the argument value of bar(): /* example */ struct foo { int a, b; };

[PATCH v3 1/2] xtensa: Eliminate unused stack frame allocation/freeing

2022-09-07 Thread Takayuki 'January June' Suwa via Gcc-patches
Changes from v2: (xtensa_expand_prologue): Changed to check conditions for suppressing emit insns in advance, instead of tracking emitted and later replacing them with NOPs if they are found to be unnecessary. --- In the example below, 'x' is once placed on the stack frame and then read into

[PATCH v2 1/2] xtensa: Eliminate unused stack frame allocation/freeing

2022-09-02 Thread Takayuki 'January June' Suwa via Gcc-patches
Changes from v1: (xtensa_expand_epilogue): Fixed forgetting to consider hard_frame_pointer_rtx when sharing codes. --- In the example below, 'x' is once placed on the stack frame and then read into registers as the argument value of bar(): /* example */ struct foo { int a, b;

[PATCH 1/2] xtensa: Eliminate unused stack frame allocation/freeing

2022-08-31 Thread Takayuki 'January June' Suwa via Gcc-patches
In the example below, 'x' is once placed on the stack frame and then read into registers as the argument value of bar(): /* example */ struct foo { int a, b; }; extern struct foo bar(struct foo); struct foo test(void) { struct foo x = { 0, 1 }; return bar(x);

[PATCH 2/2] xtensa: Make complex hard register clobber elimination more robust and accurate

2022-08-31 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch eliminates all clobbers for complex hard registers that will be overwritten entirely afterwards (supersedence of 3867d414bd7d9e5b6fb2a51b1fb3d9e9e1eae9). gcc/ChangeLog: * config/xtensa/xtensa.md: Rewrite the split pattern that performs the abovementioned process so

[PATCH] xtensa: Improve indirect sibling call handling

2022-08-18 Thread Takayuki 'January June' Suwa via Gcc-patches
No longer needs the dedicated hard register (A11) for the address of the call and the split patterns for fixups, due to the introduction of appropriate register class and constraint. (Note: "ISC_REGS" contains a hard register A8 used as a "static chain" pointer for nested functions, but no

[PATCH] xtensa: Optimize stack pointer updates in function pro/epilogue under certain conditions

2022-08-17 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch enforces the use of the "addmi" machine instruction instead of addition/subtraction with two source registers for adjusting the stack pointer, if the adjustment fits into a signed 16-bit value and is also a multiple of 256. /* example */ void test(void) { char buffer[4096];
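The condition stated in this snippet (the adjustment fits into a signed 16-bit value and is a multiple of 256) can be sketched as a small predicate. This is only an illustration: the name `fits_addmi` is hypothetical and the code is not taken from the patch itself.

```c
#include <stdbool.h>

/* Hypothetical helper (not from the patch): returns true when a stack
   pointer adjustment satisfies the condition described above, i.e. it
   fits into a signed 16-bit value and is a multiple of 256. */
static bool fits_addmi(long adjust)
{
    bool is_simm16 = adjust >= -32768 && adjust <= 32767;
    bool is_multiple_of_256 = (adjust % 256) == 0;
    return is_simm16 && is_multiple_of_256;
}
```

Under these assumptions, the 4096-byte buffer in the example would qualify, while an adjustment such as 4100 would not.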

Re: [PATCH] xtensa: Prevent emitting integer additions of constant zero

2022-08-17 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2022/08/17 4:58, Max Filippov wrote: > Hi Suwa-san, Hi! > > On Tue, Aug 16, 2022 at 5:42 AM Takayuki 'January June' Suwa > wrote: >> >> In a few cases, obviously omittable add instructions can be emitted via >> invoking gen_addsi3. >> >> gcc/ChangeLog: >> >> * config/xtensa/xtensa.md

[PATCH] xtensa: Prevent emitting integer additions of constant zero

2022-08-16 Thread Takayuki 'January June' Suwa via Gcc-patches
In a few cases, obviously omittable add instructions can be emitted via invoking gen_addsi3. gcc/ChangeLog: * config/xtensa/xtensa.md (addsi3_internal): Rename from "addsi3". (addsi3): New define_expand in order to reject integer additions of constant zero. ---

[PATCH] xtensa: Turn on -fsplit-wide-types-early by default

2022-08-14 Thread Takayuki 'January June' Suwa via Gcc-patches
Since GCC10, the "subreg2" optimization pass was no longer tied to enabling "subreg1" unless -fsplit-wide-types-early was turned on (PR88233). However on the Xtensa port, the lack of "subreg2" can degrade the quality of the output code, especially for those that produce many D[FC]mode pseudos.

Re: [PATCH] lower-subreg, expr: Mitigate inefficiencies derived from "(clobber (reg X))" followed by "(set (subreg (reg X)) (...))"

2022-08-04 Thread Takayuki 'January June' Suwa via Gcc-patches
(sorry repost due to the lack of cc here) Hi! On 2022/08/04 18:49, Richard Sandiford wrote: > Takayuki 'January June' Suwa writes: >> Thanks for your response. >> >> On 2022/08/03 16:52, Richard Sandiford wrote: >>> Takayuki 'January June' Suwa via Gcc-patch

Re: [PATCH] lower-subreg, expr: Mitigate inefficiencies derived from "(clobber (reg X))" followed by "(set (subreg (reg X)) (...))"

2022-08-03 Thread Takayuki 'January June' Suwa via Gcc-patches
Thanks for your response. On 2022/08/03 16:52, Richard Sandiford wrote: > Takayuki 'January June' Suwa via Gcc-patches writes: >> Emitting "(clobber (reg X))" before "(set (subreg (reg X)) (...))" keeps >> data flow consistent, but it also increases register al

[PATCH] lower-subreg, expr: Mitigate inefficiencies derived from "(clobber (reg X))" followed by "(set (subreg (reg X)) (...))"

2022-08-02 Thread Takayuki 'January June' Suwa via Gcc-patches
Emitting "(clobber (reg X))" before "(set (subreg (reg X)) (...))" keeps data flow consistent, but it also increases register allocation pressure and thus often creates many unwanted register-to-register moves that cannot be optimized away. It seems just analogous to partial register stall which

[PATCH 2/2] xtensa: Fix conflicting hard regno between indirect sibcall fixups and EH_RETURN_STACKADJ_RTX

2022-07-29 Thread Takayuki 'January June' Suwa via Gcc-patches
The hard register A10 was already allocated for EH_RETURN_STACKADJ_RTX. (Exception handling and sibling calls may not apply at the same time, but the register is changed for safety.) gcc/ChangeLog: * config/xtensa/xtensa.md: Change hard register number used in the split patterns for indirect

[PATCH 1/2] xtensa: Add RTX costs for if_then_else

2022-07-29 Thread Takayuki 'January June' Suwa via Gcc-patches
It takes one machine instruction for both conditional branch and move. gcc/ChangeLog: * config/xtensa/xtensa.cc (xtensa_rtx_costs): Add new case for IF_THEN_ELSE. --- gcc/config/xtensa/xtensa.cc | 1 + 1 file changed, 1 insertion(+) diff --git a/gcc/config/xtensa/xtensa.cc

[PATCH] xtensa: Optimize "bitwise AND NOT with imm" followed by "branch if (not) equal to zero"

2022-07-22 Thread Takayuki 'January June' Suwa via Gcc-patches
The RTL combiner will transform "if ((x & C) == C) goto label;" into "if ((~x & C) == 0) goto label;" and will try to match it with the insn patterns. /* example */ void test_0(int a) { if ((char)a == 255) foo(); } void test_1(int a) { if ((unsigned short)a ==
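The two forms mentioned in this snippet, `(x & C) == C` and `(~x & C) == 0`, both ask whether every bit of the mask C is set in x. A minimal sketch that checks the equivalence over all 16-bit values of x for a given mask is shown here; the function names are hypothetical and the code only illustrates the identity, not the combiner or the insn patterns.

```c
#include <stdbool.h>
#include <stdint.h>

/* Form written in the source code: all bits of mask c are set in x. */
static bool all_mask_bits_set(uint32_t x, uint32_t c)
{
    return (x & c) == c;
}

/* Form produced by the transformation: no bit of mask c is clear in x. */
static bool no_mask_bit_clear(uint32_t x, uint32_t c)
{
    return (~x & c) == 0;
}

/* Hypothetical checker: compares both forms for every 16-bit x. */
static bool forms_agree(uint32_t c)
{
    for (uint32_t x = 0; x <= 0xFFFFu; x++) {
        if (all_mask_bits_set(x, c) != no_mask_bit_clear(x, c))
            return false;
    }
    return true;
}
```

The identity holds because a bit of C survives in `~x & C` exactly when that bit is clear in x.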

[PATCH] xtensa: Correct the relative RTX cost that corresponds to the Move Immediate "MOVI" instruction

2022-07-18 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch corrects the overestimation of the relative cost of '(set (reg) (const_int N))' where N fits into the instruction itself. In fact, such overestimation confuses the RTL loop invariant motion pass. As a result, it brings almost no negative impact from the speed point of view, but

[PATCH 1/2] xtensa: constantsynth: Make try to find shorter instruction

2022-07-15 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch allows the constant synthesis to choose shorter instruction if possible. /* example */ int test(void) { return 128 << 8; } ;; before test: movi a2, 0x100 addmi a2, a2, 0x7f00 ret.n ;; after test: movi.n a2, 1

[PATCH 2/2] xtensa: Optimize "bitwise AND with imm1" followed by "branch if (not) equal to imm2"

2022-07-15 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch enhances the effectiveness of the previously posted one: "xtensa: Optimize bitwise AND operation with some specific forms of constants". /* example */ extern void foo(int); void test(int a) { if ((a & (-1U << 8)) == (128 << 8)) /* 0 or one of "b4const" */

[PATCH] xtensa: Minor fix for FP constant synthesis

2022-07-13 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch fixes a non-fatal issue about negative constant values derived from FP constant synthesis on hosts whose 'long' is wider than 'int32_t'. It also replaces the dedicated code in the FP constant synthesis split pattern with the appropriate existing function call. gcc/ChangeLog: *

Re: [RFA] Improve initialization of objects when the initializer has trailing zeros.

2022-07-07 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2022/07/07 23:46, Jeff Law wrote: > This is an update to a patch originally posted by Takayuki Suwa a few months > ago. > > When we initialize an array from a STRING_CST we perform the initialization > in two steps.  The first step copies the STRING_CST to the destination.  The > second

[PATCH] xtensa: Optimize integer constant addition that is between -32896 and 32639

2022-06-26 Thread Takayuki 'January June' Suwa via Gcc-patches
Such constants are often subject to the constant synthesis: int test(int a) { return a - 31999; } test: movi a3, 1 addmi a3, a3, -0x7d00 add a2, a2, a3 ret This patch optimizes such case as follows: test: addi a2, a2, 1
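The range in the subject line is consistent with pairing one ADDI (immediate -128 to 127) with one ADDMI (a multiple of 256 from -32768 to 32512). A minimal sketch of splitting a constant into those two parts is shown here; the struct and function names are hypothetical, and this is not the code from the patch.

```c
#include <stdbool.h>

/* Hypothetical result type: value == lo + hi. */
struct addi_addmi_split {
    int lo; /* ADDI part, in -128..127 */
    int hi; /* ADDMI part, a multiple of 256 in -32768..32512 */
};

/* Hypothetical helper: splits `value` into an ADDI part and an ADDMI
   part. Returns false when the value lies outside -32896..32639. */
static bool split_constant(int value, struct addi_addmi_split *out)
{
    if (value < -32896 || value > 32639)
        return false;
    /* Sign-extended low 8 bits; unsigned arithmetic avoids relying on
       the behaviour of bitwise AND on negative ints. */
    int lo = (int)(((unsigned)value + 128u) & 255u) - 128;
    out->lo = lo;
    out->hi = value - lo;
    return true;
}
```

For the constant -31999 from the example, this yields a low part of 1 and a high part of -32000 (-0x7d00), the same two immediates that appear in the listing.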

[PATCH] xtensa: Fix buffer overflow

2022-06-21 Thread Takayuki 'January June' Suwa via Gcc-patches
Fortify buffer overflow message reported. (see https://github.com/earlephilhower/esp-quick-toolchain/issues/36) gcc/ChangeLog: * config/xtensa/xtensa.md (bswapsi2_internal): Enlarge the buffer that is obviously smaller than the template string given to sprintf(). ---

[PATCH 1/2] xtensa: Apply a few minor fixes

2022-06-19 Thread Takayuki 'January June' Suwa via Gcc-patches
No functional changes. gcc/ChangeLog: * config/xtensa/xtensa.cc (xtensa_emit_move_sequence): Use can_create_pseudo_p(), instead of using individual reload_in_progress and reload_completed. (xtensa_expand_block_set_small_loop): Use xtensa_simm8x256(), the

[PATCH 2/2] xtensa: Fix RTL insn cost estimation about relaxed MOVI instructions

2022-06-19 Thread Takayuki 'January June' Suwa via Gcc-patches
These instructions will all be converted to L32R ones with litpool entries by the assembler. gcc/ChangeLog: * config/xtensa/xtensa.cc (xtensa_is_insn_L32R_p): Consider relaxed MOVI instructions as L32R. --- gcc/config/xtensa/xtensa.cc | 22 ++ 1 file changed,

Re: [PATCH] xtensa: Defer storing integer constants into litpool until reload

2022-06-17 Thread Takayuki 'January June' Suwa via Gcc-patches
erratum: - extern unsigned int value; + extern unsigned short value; On 2022/06/17 22:47, Takayuki 'January June' Suwa via Gcc-patches wrote: > Storing integer constants into litpool in the early stage of compilation > hinders some integer optimizations. In fact, such i

[PATCH] xtensa: Defer storing integer constants into litpool until reload

2022-06-17 Thread Takayuki 'January June' Suwa via Gcc-patches
Storing integer constants into litpool in the early stage of compilation hinders some integer optimizations. In fact, such integer constants are not subject to the constant folding process. For example: extern unsigned int value; extern void foo(void); void test(void) { if

[PATCH v2 2/5] xtensa: Add support for sibling call optimization

2022-06-15 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2022/06/15 5:17, Max Filippov wrote: > Hi Suwa-san, hi! > This change results in a bunch of new regression test failures: > The code generated for e.g. gcc.c-torture/execute/921208-2.c looks like this: oh, PICed... indirect (incl. via function pointer, virtual functions and of course PIC ones
