Re: [PATCH] PR target/103069: Relax cmpxchg loop for x86 target

2021-11-15 Thread Uros Bizjak via Gcc-patches
On Sat, Nov 13, 2021 at 3:34 AM Hongyu Wang wrote: > > Hi, > > From the CPU's point of view, getting a cache line for writing is more > expensive than reading. See Appendix A.2 Spinlock in: > > https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/ > xeon-lock-scaling-analysis
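
The point quoted above, that obtaining a cache line for writing costs more than reading it, is commonly exploited by checking the lock with a plain load and attempting the atomic compare-and-exchange only when the lock looks free. Below is a minimal C11 sketch of that idea. The `spinlock_t` type and the function names are hypothetical and are not taken from the patch or from the Intel paper.

```c
#include <stdatomic.h>

/* Hypothetical lock type for illustration: 0 = free, 1 = held. */
typedef struct { atomic_int locked; } spinlock_t;

/* Try once; returns 1 on success, 0 otherwise.  The plain load comes
   first so that a held lock is detected without requesting the cache
   line for writing. */
int spin_trylock(spinlock_t *l)
{
    int expected = 0;

    if (atomic_load_explicit(&l->locked, memory_order_relaxed) != 0)
        return 0;
    return atomic_compare_exchange_strong_explicit(
        &l->locked, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

void spin_lock(spinlock_t *l)
{
    while (!spin_trylock(l))
        ;   /* a real implementation would likely pause or back off here */
}

void spin_unlock(spinlock_t *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

The sketch only shows the read-before-write ordering; it makes no claim about the instruction sequence GCC emits for it.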

Re: [PATCH] x86: Update -mtune=alderlake

2021-11-10 Thread Uros Bizjak via Gcc-patches
On Wed, Nov 10, 2021 at 10:09 AM Cui,Lili wrote: > > Hi Uros, > > This patch is to update mtune for alderlake. > > Bootstrap is ok, and no regressions for i386/x86-64 testsuite. > > OK for master? > > Update mtune for alderlake, Alder Lake Intel Hybrid Technology will not > support > Intel® AVX-5

Re: [i386] Fix couple of issues in large PIC model on x86-64/VxWorks

2021-11-08 Thread Uros Bizjak via Gcc-patches
On Tue, Oct 5, 2021 at 5:50 PM Eric Botcazou via Gcc-patches wrote: > > Hi, > > the first issue is that the !gotoff_operand path of legitimize_pic_address in > large PIC model does not make use of REG when it is available, which breaks > for thunks because new pseudo-registers can no longer be cre

Re: [PATCH] x86: Check leal/addl gcc.target/i386/amxtile-3.c for x32

2021-11-04 Thread Uros Bizjak via Gcc-patches
On Thu, Nov 4, 2021 at 3:44 PM H.J. Lu via Gcc-patches wrote: > > Check leal and addl for x32 to fix: > > FAIL: gcc.target/i386/amxtile-3.c scan-assembler addq[ \\t]+\\$12 > FAIL: gcc.target/i386/amxtile-3.c scan-assembler leaq[ \\t]+4 > FAIL: gcc.target/i386/amxtile-3.c scan-assembler leaq[ \\t]+

Re: [PATCH] ia32: Disallow mode(V1TI) [PR103020]

2021-11-02 Thread Uros Bizjak via Gcc-patches
On Tue, Nov 2, 2021 at 9:41 AM Jakub Jelinek wrote: > > Hi! > > As discussed in the PR, TImode isn't supported for -m32 on x86 (for the same > reason as on most 32-bit targets, no support for > 2 * BITS_PER_WORD > precision integers), but since PR32280 V1TImode is allowed with -msse in SSE > regs,

Re: [PATCH] x86_64: Improved implementation of TImode rotations.

2021-11-01 Thread Uros Bizjak via Gcc-patches
On Mon, Nov 1, 2021 at 5:45 PM Roger Sayle wrote: > > > This simple patch improves the implementation of 128-bit (TImode) > rotations on x86_64 (a missed optimization opportunity spotted > during the recent V1TImode improvements). > > Currently, the function: > > unsigned __int128 rotrti3(unsigned
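
The function in the quoted message is cut off, so the sketch below is not a reconstruction of it. It only illustrates what a 128-bit rotate right computes, assuming GCC's `unsigned __int128` extension on a 64-bit target. The name `rotr128` is hypothetical.

```c
/* unsigned __int128 is a GCC extension available on 64-bit targets. */
typedef unsigned __int128 u128;

/* Rotate x right by n bits; n is reduced modulo 128. */
u128 rotr128(u128 x, unsigned int n)
{
    n &= 127;
    if (n == 0)
        return x;            /* avoid the undefined shift by 128 below */
    return (x >> n) | (x << (128 - n));
}
```

The explicit check for `n == 0` is there because shifting a 128-bit value by 128 is undefined in C.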

Re: [PATCH Take #2] x86_64: Expand ashrv1ti (and PR target/102986)

2021-11-01 Thread Uros Bizjak via Gcc-patches
On Mon, Nov 1, 2021 at 9:43 AM Jakub Jelinek wrote: > > On Mon, Nov 01, 2021 at 08:27:12AM +0100, Uros Bizjak wrote: > > > Also, I wonder for all these patterns (previously and now added), > > > shouldn't > > > they have && TARGET_64BIT in conditions? I mean, we don't really support > > > scalar

Re: [PATCH Take #2] x86_64: Expand ashrv1ti (and PR target/102986)

2021-11-01 Thread Uros Bizjak via Gcc-patches
On Sun, Oct 31, 2021 at 11:02 AM Roger Sayle wrote: > > > Very many thanks to Jakub for proof-reading my patch, catching my silly > GNU-style > mistakes and making excellent suggestions. This revised patch incorporates > all of > his feedback, and has been tested on x86_64-pc-linux-gnu with make

Re: [PATCH] x86_64: Implement V1TI mode shifts/rotates by a constant

2021-10-25 Thread Uros Bizjak via Gcc-patches
On Mon, Oct 25, 2021 at 4:16 PM Roger Sayle wrote: > > > Hi Uros, > I believe the proposed sequences should be dramatically faster than LLVM's > implementation(s), due to the large latencies required to move values between > the vector and scalar parts on modern x86_64 microarchitectures. All of

Re: [PATCH] x86_64: Implement V1TI mode shifts/rotates by a constant

2021-10-25 Thread Uros Bizjak via Gcc-patches
On Sun, Oct 24, 2021 at 6:34 PM Roger Sayle wrote: > > > This patch provides RTL expanders to implement logical shifts and > rotates of 128-bit values (stored in vector integer registers) by > constant bit counts. Previously, GCC would transfer these values > to a pair of scalar registers (TImode

Re: [PATCH] x86_64: Add insn patterns for V1TI mode logic operations.

2021-10-22 Thread Uros Bizjak via Gcc-patches
On Fri, Oct 22, 2021 at 9:19 AM Roger Sayle wrote: > > > On x86_64, V1TI mode holds a 128-bit integer value in a (vector) SSE > register (where regular TI mode uses a pair of 64-bit general purpose > scalar registers). This patch improves the implementation of AND, IOR, > XOR and NOT on these val

Re: [PATCH] x86: Document -fcf-protection requires i686 or newer

2021-10-21 Thread Uros Bizjak via Gcc-patches
On Thu, Oct 21, 2021 at 6:47 PM H.J. Lu wrote: > > PR target/98667 > * doc/invoke.texi: Document -fcf-protection requires i686 or > newer. Obvious patch? Uros. > --- > gcc/doc/invoke.texi | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/gcc/doc/i

Re: [PATCH] x86: Adjust gcc.target/i386/pr22076.c

2021-10-21 Thread Uros Bizjak via Gcc-patches
On Thu, Oct 21, 2021 at 6:50 PM H.J. Lu wrote: > > On Tue, Oct 19, 2021 at 11:42 PM Uros Bizjak wrote: > > > > On Tue, Oct 19, 2021 at 8:23 PM H.J. Lu wrote: > > > > > > commit 247c407c83f0015f4b92d5f71e45b63192f6757e > > > Author: Roger Sayle > > > Date: Mon Oct 18 12:15:40 2021 +0100 > > >

Re: [PATCH] x86: Adjust gcc.target/i386/pr22076.c

2021-10-19 Thread Uros Bizjak via Gcc-patches
On Tue, Oct 19, 2021 at 8:23 PM H.J. Lu wrote: > > commit 247c407c83f0015f4b92d5f71e45b63192f6757e > Author: Roger Sayle > Date: Mon Oct 18 12:15:40 2021 +0100 > > Try placing RTL folded constants in the constant pool. > > My recent attempts to come up with a testcase for my patch to ev

Re: [PATCH][i386] target: support spaces in target attribute.

2021-10-18 Thread Uros Bizjak via Gcc-patches
On Mon, Oct 18, 2021 at 1:23 PM Martin Liška wrote: > > On 10/11/21 13:17, Martin Liška wrote: > > On 10/4/21 23:02, Andrew Pinski wrote: > >> It might be useful to skip tabs for the same reason as spaces really. > > > > Sure, be my guest. > > > > Martin > > May I please ping this i386-specific pa

[PATCH] i386: Fix ICE in ix86_print_operand_address [PR 102761]

2021-10-18 Thread Uros Bizjak via Gcc-patches
2021-10-18 Uroš Bizjak PR target/102761 gcc/ChangeLog: * config/i386/i386.c (ix86_print_operand_address): Error out for non-address_operand asm operands. gcc/testsuite/ChangeLog: * gcc.target/i386/pr102761.c: New test. Bootstrapped and regression tested on x86_64-linux-gnu {

Re: [PATCH] Allow early sets of SSE hard registers from standard_sse_constant_p

2021-10-15 Thread Uros Bizjak via Gcc-patches
On Fri, Oct 15, 2021 at 2:15 PM Roger Sayle wrote: > > > My previous patch, which was intended to reduce the differences seen by > the combination of -march=cascadelake and -m32, has additionally found > some more instances where this combination behaves differently to regular > x86_64-pc-linux-gn

Re: [PATCH v2] x86_64: Some SUBREG related optimization tweaks to i386 backend.

2021-10-13 Thread Uros Bizjak via Gcc-patches
On Wed, Oct 13, 2021 at 10:23 AM Roger Sayle wrote: > > > Good catch. I agree with Hongtao that although my testing revealed > no problems with the previous version of this patch, it makes sense to > call gen_reg_rtx to generate a pseudo intermediate instead of attempting > to reuse the existing

[PATCH] i386: Improve workaround for PR82524 LRA limitation [PR85730]

2021-10-12 Thread Uros Bizjak via Gcc-patches
As explained in PR82524, LRA is not able to reload strict_low_part inout operand with matched input operand. The patch introduces a workaround, where we allow LRA to generate an instruction with non-matched input operand which is split post reload to an instruction that inserts non-matched input op

Re: [PATCH][i386] Support reduc_{plus, smax, smin, umax, umin}_scal_v4qi.

2021-10-11 Thread Uros Bizjak via Gcc-patches
On Mon, Oct 11, 2021 at 8:26 AM liuhongt wrote: > > After providing expanders for reduc_umin/umax/smin/smax_scal_v4qi, > performance is a little bit faster than before for reduce operations > w/ options -O2 -march=haswell, -O2 -march=skylake-avx512 > and -Ofast -march=skylake-avx512. > > gcc/Cha

[PATCH] i386: Eliminate sign extension after logic operation [PR89954]

2021-09-30 Thread Uros Bizjak via Gcc-patches
Convert (sign_extend:WIDE (any_logic:NARROW (memory, immediate))) to (any_logic:WIDE (sign_extend (memory)), (sign_extend (immediate))). This eliminates sign extension after logic operation. 2021-09-30 Uroš Bizjak gcc/ PR target/89954 * config/i386/i386.md (sign_extend:WIDE (any_lo
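
The transformation above relies on sign extension commuting with bitwise logic operations. The sketch below checks that arithmetic identity exhaustively for 8-bit inputs widened to 32 bits, using AND as the example; the same loop works for OR and XOR. All function names are hypothetical, and since C promotes narrow operands to `int` anyway, this only illustrates the identity and is not a model of the RTL pattern.

```c
#include <stdint.h>

/* Narrow logic operation first, then sign-extend the result. */
int32_t logic_then_extend(int8_t mem, int8_t imm)
{
    int8_t narrow = (int8_t)(mem & imm);
    return (int32_t)narrow;
}

/* Sign-extend both inputs first, then do the logic operation wide. */
int32_t extend_then_logic(int8_t mem, int8_t imm)
{
    return (int32_t)mem & (int32_t)imm;
}

/* Returns 1 if both forms agree for every pair of 8-bit inputs. */
int forms_agree_for_all_inputs(void)
{
    for (int m = -128; m <= 127; m++)
        for (int i = -128; i <= 127; i++)
            if (logic_then_extend((int8_t)m, (int8_t)i)
                != extend_then_logic((int8_t)m, (int8_t)i))
                return 0;
    return 1;
}
```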

Re: [PATCH] i386: Don't emit fldpi etc. if -frounding-math [PR102498]

2021-09-28 Thread Uros Bizjak via Gcc-patches
On Tue, Sep 28, 2021 at 11:33 AM Jakub Jelinek wrote: > > Hi! > > i387 has instructions to store some transcendental numbers into the top of > stack. The problem is that what exact bit in the last place one gets for > those depends on the current rounding mode, the CPU knows the number with > slig

Re: [PATCH] [i386] Support reduc_{plus, smax, smin, umax, min}_scal_v4hi.

2021-09-28 Thread Uros Bizjak via Gcc-patches
On Tue, Sep 28, 2021 at 8:42 AM liuhongt wrote: > > Hi: > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? > > gcc/ChangeLog: > > PR target/102494 > * config/i386/i386-expand.c (emit_reduc_half): Handle V4HImode. > * config/i386/mmx.md (reduc_pl

Re: [PATCH] AVX512FP16:support basic 64/32bit vector type and operation.

2021-09-27 Thread Uros Bizjak via Gcc-patches
x-gnu{-m32,} and sde. > > OK for master with the updated one? I'd put this new pattern in mmx.md to keep 64bit/32bit modes in mmx.md, similar to e.g. FMA patterns among others. OK with the eventual above change. Thanks, Uros. > > Uros Bizjak via Gcc-patches wrote on Mon, Sep 27, 2021 at 7:

Re: [PATCH] AVX512FP16:support basic 64/32bit vector type and operation.

2021-09-27 Thread Uros Bizjak via Gcc-patches
On Mon, Sep 27, 2021 at 12:42 PM Hongyu Wang wrote: > > Hi Uros, > > This patch intends to support V4HF/V2HF vector type and basic operations. > > For 32bit target, V4HF vector is parsed same as __m64 type, V2HF > is parsed by stack and returned from GPR since it is not specified > by ABI. > > We

Re: [PATCH] [GIMPLE] Simplify (_Float16) ceil ((double) x) to .CEIL (x) when available.

2021-09-24 Thread Uros Bizjak via Gcc-patches
On Fri, Sep 24, 2021 at 1:26 PM liuhongt wrote: > > Hi: > Related discussion in [1] and PR. > > Bootstrapped and regtest on x86_64-linux-gnu{-m32,}. > Ok for trunk? > > [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574330.html > > gcc/ChangeLog: > > PR target/102464 >

Re: [PATCH] Support 64bit fma/fms/fnma/fnms under avx512vl.

2021-09-21 Thread Uros Bizjak via Gcc-patches
On Wed, Sep 22, 2021 at 7:09 AM liuhongt wrote: > > Hi: > fma/fms/fnma/fnmsv2sf4 are defined only under (TARGET_FMA || TARGET_FMA4). > The patch extend the expanders to TARGET_AVX512VL. > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? > > gcc/ChangeLog: > >

Re: [PATCH] x86-64: Remove HAVE_LD_PIE_COPYRELOC

2021-09-21 Thread Uros Bizjak via Gcc-patches
On Mon, Sep 20, 2021 at 8:20 PM Fāng-ruì Sòng via Gcc-patches wrote: > > PING^5 https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html > > On Sat, Sep 4, 2021 at 12:11 PM Fāng-ruì Sòng wrote: > > > > PING^4 https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html > > > > One major d

Re: [PATCH 3/4] [PATCH 3/4] x86: Properly handle USE_VECTOR_FP_CONVERTS/USE_VECTOR_CONVERTS

2021-09-16 Thread Uros Bizjak via Gcc-patches
On Fri, Sep 17, 2021 at 5:15 AM Cui, Lili wrote: > > > > -Original Message- > > From: Uros Bizjak > > Sent: Thursday, September 16, 2021 2:28 PM > > To: Cui, Lili > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ; H. J. Lu > > > > Subject: Re: [PATCH 3/4] [PATCH 3/4] x86: Properly handle >

[PATCH] [i386] Change ix86_decompose_address return type to bool.

2021-09-16 Thread Uros Bizjak via Gcc-patches
After a recent change only a boolean value is returned. 2021-09-16 Uroš Bizjak gcc/ * config/i386/i386-protos.h (ix86_decompose_address): Change return type to bool. * config/i386/i386.c (ix86_decompose_address): Ditto. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32

Re: [PATCH 2/4] [PATCH 2/4] x86: Update memcpy/memset inline strategies for -mtune=tremont

2021-09-15 Thread Uros Bizjak via Gcc-patches
On Wed, Sep 15, 2021 at 10:10 AM wrote: > > From: "H.J. Lu" > > Simplify memcpy and memset inline strategies to avoid branches for > -mtune=tremont: > > 1. Create Tremont cost model from generic cost model. > 2. With MOVE_RATIO and CLEAR_RATIO == 17, GCC will use integer/vector >load and store

Re: [PATCH 1/4] [PATCH 1/4] x86: Update -mtune=tremont

2021-09-15 Thread Uros Bizjak via Gcc-patches
On Wed, Sep 15, 2021 at 10:09 AM wrote: > > From: "H.J. Lu" > > Initial -mtune=tremont update > > 1. Use Haswell scheduling model. > 2. Assume that stack engine allows to execute push&pop instructions in > parallel. > 3. Prepare for scheduling pass as -mtune=generic. > 4. Use the same issue rate as

Re: [PATCH 4/4] [PATCH 4/4] x86: Add TARGET_SSE_PARTIAL_REG_[FP_]CONVERTS_DEPENDENCY

2021-09-15 Thread Uros Bizjak via Gcc-patches
On Wed, Sep 15, 2021 at 10:10 AM wrote: > > From: "H.J. Lu" > > 1. Replace TARGET_SSE_PARTIAL_REG_DEPENDENCY with > TARGET_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY in SSE FP to FP splitters. > 2. Replace TARGET_SSE_PARTIAL_REG_DEPENDENCY with > TARGET_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY in SSE INT

Re: [PATCH 3/4] [PATCH 3/4] x86: Properly handle USE_VECTOR_FP_CONVERTS/USE_VECTOR_CONVERTS

2021-09-15 Thread Uros Bizjak via Gcc-patches
On Wed, Sep 15, 2021 at 10:10 AM wrote: > > From: "H.J. Lu" > > Check TARGET_USE_VECTOR_FP_CONVERTS or TARGET_USE_VECTOR_CONVERTS when > handling avx_partial_xmm_update attribute. Don't convert AVX partial > XMM register update if vector packed SSE conversion should be used. > > gcc/ > >

Re: [PATCH] i386: support micro-levels in target{, _clone} attrs [PR101696]

2021-09-13 Thread Uros Bizjak via Gcc-patches
On Thu, Aug 12, 2021 at 5:32 PM Martin Liška wrote: > > On 8/12/21 5:26 PM, H.J. Lu wrote: > > Will it hurt if they have proper feature_priorities you added? > > No. They are unused, but we should use the proper priorities. gcc/ChangeLog: * common/config/i386/cpuinfo.h (cpu_indicator_init): Add s

[PATCH] [i386] Call force_reg unconditionally.

2021-08-26 Thread Uros Bizjak via Gcc-patches
There is no point to check RTXes before calling force_reg, force_reg checks for REG RTX by itself. 2021-08-26 Uroš Bizjak gcc/ * config/i386/i386.md (*btr_1): Call force_reg unconditionally. (conditional moves with memory inputs splitters): Ditto. * config/i386/sse.md (one_cmpl2):

[PATCH] [i386] Set all_regs to true in the call to replace_rtx [PR102057]

2021-08-26 Thread Uros Bizjak via Gcc-patches
We want to replace all REGs equal to FROM. 2021-08-26 Uroš Bizjak gcc/ PR target/102057 * config/i386/i386.md (cmove reg-reg move elimination peephole2s): Set all_regs to true in the call to replace_rtx. I was not able to create a testcase without warnings. Bootstrapped and regre

Re: [PATCH] i386: Add peephole for lea and zero extend [PR 101716]

2021-08-25 Thread Uros Bizjak via Gcc-patches
On Tue, Aug 24, 2021 at 5:22 PM Hongyu Wang wrote: > > Hi Uros, > > Sorry for the late update. I have tried adjusting the combine pass but > found it is not easy to modify shift const, so I came up with an > alternative solution with your patch. It matches the non-canonical > zero-extend in ix86_d

Re: [GCC-11] [PATCH 0/5] Finish and general-regs-only

2021-08-25 Thread Uros Bizjak via Gcc-patches
On Tue, Aug 24, 2021 at 4:57 PM H.J. Lu wrote: > > On Sun, Aug 15, 2021 at 11:11 PM Richard Biener > wrote: > > > > On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu wrote: > > > > > > and target("general-regs-only") function attribute > > > were added to GCC 11. But their implementations are incomplete

Re: [PATCH] i386: Add peephole for lea and zero extend [PR 101716]

2021-08-16 Thread Uros Bizjak via Gcc-patches
On Mon, Aug 16, 2021 at 11:18 AM Hongyu Wang wrote: > > > So, the question is if the combine pass really needs to zero-extend > > with 0xfffe, the left shift << 1 guarantees zero in the LSB, so > > 0x should be better and in line with canonical zero-extension > > RTX. > > The shift mas

Re: [PATCH] [i386] Fix ICE.

2021-08-16 Thread Uros Bizjak via Gcc-patches
On Mon, Aug 16, 2021 at 11:19 AM liuhongt wrote: > > Hi: > avx512f_scalef2 only accept register_operand for operands[1], > force it to reg in ldexp3. > > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. > Ok for trunk. > > gcc/ChangeLog: > > PR target/101930 > * config/

Re: [PATCH] i386: Add peephole for lea and zero extend [PR 101716]

2021-08-16 Thread Uros Bizjak via Gcc-patches
On Fri, Aug 13, 2021 at 9:21 AM Uros Bizjak wrote: > > On Fri, Aug 13, 2021 at 2:48 AM Hongyu Wang wrote: > > > > Hi, > > > > For lea + zero_extendsidi insns, if dest of lea and src of zext are the > > same, combine them with single leal under 64bit target since 32bit > > register will be automat

Re: [PATCH] i386: Add peephole for lea and zero extend [PR 101716]

2021-08-13 Thread Uros Bizjak via Gcc-patches
On Fri, Aug 13, 2021 at 2:48 AM Hongyu Wang wrote: > > Hi, > > For lea + zero_extendsidi insns, if dest of lea and src of zext are the > same, combine them with single leal under 64bit target since 32bit > register will be automatically zero-extended. > > Bootstrapped and regtested on x86_64-linux
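
The observation above is that, on a 64-bit target, writing a 32-bit register already zero-extends the value, so a separate zero-extension after a 32-bit address computation is redundant. A C-level sketch of the kind of source that produces such a pair follows. The function name is hypothetical, and whether a given GCC version emits a single `leal` for it depends on the version and options, which is not claimed here.

```c
#include <stdint.h>

/* 32-bit scaled sum followed by a zero-extension to 64 bits. */
uint64_t scaled_sum_zext(uint32_t base, uint32_t index)
{
    uint32_t sum = base + index * 4u;  /* wraps modulo 2^32 */
    return (uint64_t)sum;              /* upper 32 bits are zero */
}
```

The wrap-around matters: the result must be the 32-bit sum extended with zeros, not the full 64-bit sum of the extended inputs.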

Re: [PATCH] Extend ldexp{s, d}f3 to vscalefs{s, d} when TARGET_AVX512F and TARGET_SSE_MATH.

2021-08-12 Thread Uros Bizjak via Gcc-patches
On Thu, Aug 12, 2021 at 6:40 AM Hongtao Liu wrote: > > > > Hi: > > > > AVX512F supported vscalefs{s,d} which is the same as ldexp except the > > > > second operand should be floating point. > > > > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. > > > > > > > > gcc/ChangeLog: > > > >

Re: [PATCH] Extend ldexp{s, d}f3 to vscalefs{s, d} when TARGET_AVX512F and TARGET_SSE_MATH.

2021-08-11 Thread Uros Bizjak via Gcc-patches
On Wed, Aug 11, 2021 at 8:36 AM Uros Bizjak wrote: > > On Tue, Aug 10, 2021 at 2:13 PM liuhongt wrote: > > > > Hi: > > AVX512F supported vscalefs{s,d} which is the same as ldexp except the > > second operand should be floating point. > > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.

Re: [PATCH] Extend ldexp{s, d}f3 to vscalefs{s, d} when TARGET_AVX512F and TARGET_SSE_MATH.

2021-08-10 Thread Uros Bizjak via Gcc-patches
On Tue, Aug 10, 2021 at 2:13 PM liuhongt wrote: > > Hi: > AVX512F supported vscalefs{s,d} which is the same as ldexp except the > second operand should be floating point. > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. > > gcc/ChangeLog: > > PR target/98309 > * config

Re: [PATCH v3] x86: Optimize load of const all 1s FP vectors

2021-08-09 Thread Uros Bizjak via Gcc-patches
On Mon, Aug 9, 2021 at 7:47 PM H.J. Lu wrote: > > On Mon, Aug 9, 2021 at 8:27 AM Uros Bizjak wrote: > > > > On Mon, Aug 9, 2021 at 5:24 PM H.J. Lu wrote: > > > > > > On Sun, Aug 8, 2021 at 1:23 PM Uros Bizjak wrote: > > > > > > > > On Sat, Aug 7, 2021 at 4:41 PM H.J. Lu wrote: > > > > > > > >

Re: [PATCH v2] x86: Optimize load of const all 1s float vectors

2021-08-09 Thread Uros Bizjak via Gcc-patches
On Mon, Aug 9, 2021 at 5:24 PM H.J. Lu wrote: > > On Sun, Aug 8, 2021 at 1:23 PM Uros Bizjak wrote: > > > > On Sat, Aug 7, 2021 at 4:41 PM H.J. Lu wrote: > > > > > > Update vector_all_ones_operand to return true for const all 1s float > > > vectors. > > > > > > gcc/ > > > > > > PR target

[PATCH] i386: Name V2SF logic insns [PR101812]

2021-08-09 Thread Uros Bizjak via Gcc-patches
Name V2SF logic insns, so expand_simple_binop works with V2SF modes. 2021-08-09 Uroš Bizjak gcc/ PR target/101812 * config/i386/mmx.md (v2sf3): Rename from *mmx_v2sf3 gcc/testsuite/ PR target/101812 * gcc.target/i386/pr101812.c: New test. Bootstrapped and regression teste

Re: [PATCH] x86: Optimize load of const all 1s float vectors

2021-08-08 Thread Uros Bizjak via Gcc-patches
On Sat, Aug 7, 2021 at 4:41 PM H.J. Lu wrote: > > Update vector_all_ones_operand to return true for const all 1s float > vectors. > > gcc/ > > PR target/101804 > * config/i386/predicates.md (vector_all_ones_operand): Return > true for const all 1s float vectors. > > gcc/tes

[PATCH] i386: Fix conditional move reg-to-reg move elimination peepholes [PR101797]

2021-08-06 Thread Uros Bizjak via Gcc-patches
Add missing operand predicate, otherwise any RTX will match. 2021-08-06 Uroš Bizjak gcc/ PR target/101797 * config/i386/i386.md (cmove reg-to-reg move elimination peephole2s): Add general_gr_operand predicate to operand 3. gcc/testsuite/ PR target/101797 * gcc.target/i386/

Re: [PATCH v2] x86: Update STORE_MAX_PIECES

2021-08-04 Thread Uros Bizjak via Gcc-patches
On Wed, Aug 4, 2021 at 3:34 PM H.J. Lu wrote: > > On Tue, Aug 3, 2021 at 6:56 AM H.J. Lu wrote: > > > > 1. Update x86 STORE_MAX_PIECES to use OImode and XImode only if inter-unit > > move is enabled since x86 uses vec_duplicate, which is enabled only when > > inter-unit move is enabled, to implem

Re: [PATCH] x86: Avoid stack realignment when copying data with SSE register

2021-08-04 Thread Uros Bizjak via Gcc-patches
On Wed, Aug 4, 2021 at 3:20 PM H.J. Lu wrote: > > To avoid stack realignment, call ix86_gen_scratch_sse_rtx to get a > scratch SSE register to copy data with SSE register from one > memory location to another. > > gcc/ > > PR target/101772 > * config/i386/i386-expand.c (ix86_e

Re: [PATCH 5/6] AVX512FP16: Initial support for AVX512FP16 feature and scalar _Float16 instructions.

2021-08-04 Thread Uros Bizjak via Gcc-patches
On Mon, Aug 2, 2021 at 8:44 AM liuhongt wrote: > > From: "Guo, Xuepeng" > > gcc/ChangeLog: > > * common/config/i386/cpuinfo.h (get_available_features): > Detect FEATURE_AVX512FP16. > * common/config/i386/i386-common.c > (OPTION_MASK_ISA_AVX512FP16_SET, > OP

Re: [PATCH] [i386] Refine predicate of peephole2 to general_reg_operand. [PR target/101743]

2021-08-04 Thread Uros Bizjak via Gcc-patches
On Wed, Aug 4, 2021 at 5:33 AM liuhongt wrote: > > Hi: > The define_peephole2 which is added by r12-2640-gf7bf03cf69ccb7dc > should only work on general registers, considering that x86 also > supports mov instructions between gpr, sse reg, mask reg, limiting the > peephole2 predicate to general_

Re: [PATCH] x86: Use XMM31 for scratch SSE register

2021-08-03 Thread Uros Bizjak via Gcc-patches
On Tue, Aug 3, 2021 at 10:15 AM Hongtao Liu wrote: > > On Tue, Aug 3, 2021 at 4:03 PM Uros Bizjak via Gcc-patches > wrote: > > > > On Mon, Aug 2, 2021 at 7:47 PM H.J. Lu wrote: > > > > > > In 64-bit mode, use XMM31 for scratch SSE register to avoid vzerou

Re: [PATCH] x86: Use XMM31 for scratch SSE register

2021-08-03 Thread Uros Bizjak via Gcc-patches
On Mon, Aug 2, 2021 at 7:47 PM H.J. Lu wrote: > > In 64-bit mode, use XMM31 for scratch SSE register to avoid vzeroupper > if possible. > > gcc/ > > * config/i386/i386.c (ix86_gen_scratch_sse_rtx): In 64-bit mode, > try XMM31 to avoid vzeroupper. > > gcc/testsuite/ > > * gc

Re: [PATCH v7 03/10] x86: Update piecewise move and store

2021-08-02 Thread Uros Bizjak via Gcc-patches
On Mon, Aug 2, 2021 at 4:57 PM H.J. Lu wrote: > > On Mon, Aug 2, 2021 at 4:20 AM Uros Bizjak wrote: > > > > On Fri, Jul 30, 2021 at 11:32 PM H.J. Lu wrote: > > > > > > We can use TImode/OImode/XImode integers for piecewise move and store. > > > > > > 1. Define MAX_MOVE_MAX to 64, which is the co

Re: [PATCH v6 03/10] x86: Update piecewise move and store

2021-08-02 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 30, 2021 at 11:32 PM H.J. Lu wrote: > > We can use TImode/OImode/XImode integers for piecewise move and store. > > 1. Define MAX_MOVE_MAX to 64, which is the constant maximum number of > bytes that a single instruction can move quickly between memory and > registers or between two memo

Re: [PATCH] i386: Improve SImode constant - __builtin_clzll for -mno-lzcnt

2021-08-01 Thread Uros Bizjak via Gcc-patches
On Sun, Aug 1, 2021 at 7:12 PM H.J. Lu wrote: > > On Sat, Jul 31, 2021 at 12:53:44PM -0700, H.J. Lu wrote: > > On Fri, Jul 30, 2021 at 6:27 AM Jakub Jelinek via Gcc-patches > > wrote: > > > > > > On Fri, Jul 30, 2021 at 12:27:39PM +0200, Uros Bizjak wrote: > > > > Please put some space here, e.g.

Re: [PATCH] x86: Don't enable LZCNT/POPCNT if disabled explicitly

2021-07-30 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 30, 2021 at 3:04 PM H.J. Lu wrote: > > gcc/ > > PR target/101685 > * config/i386/i386-options.c (ix86_option_override_internal): > Don't enable LZCNT/POPCNT if they have been disabled explicitly. > > gcc/testsuite/ > > PR target/101685 > * gcc.ta

Re: [PATCH] i386: Improve extensions of __builtin_clz and constant - __builtin_clz for -mno-lzcnt [PR78103]

2021-07-30 Thread Uros Bizjak via Gcc-patches
On Wed, Jul 28, 2021 at 10:36 AM Jakub Jelinek wrote: > > Hi! > > This patch improves emitted code for the non-TARGET_LZCNT case. > As __builtin_clz* is UB on 0 argument and for !TARGET_LZCNT > CLZ_VALUE_DEFINED_AT_ZERO is 0, it is UB even at RTL time and so we > can take advantage of that and ass
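
A typical instance of "constant - __builtin_clz" is computing the index of the highest set bit. The sketch below assumes a 32-bit `unsigned int` and a nonzero argument, since `__builtin_clz (0)` is undefined as the message notes. The name `floor_log2_u32` is hypothetical, and nothing is claimed about the exact instructions a particular GCC version emits for it.

```c
/* Index of the highest set bit of x (floor of log2).
   Precondition: x != 0, because __builtin_clz (0) is undefined.
   Assumes unsigned int is 32 bits wide. */
int floor_log2_u32(unsigned int x)
{
    return 31 - __builtin_clz(x);
}
```

Because the zero case is undefined, the compiler is free to assume the argument is nonzero when simplifying the subtraction.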

Re: PING^1 [PATCH v2] x86: Check AVX512 without mask instructions

2021-07-30 Thread Uros Bizjak via Gcc-patches
> > On Fri, Jun 25, 2021 at 4:51 AM Hongtao Liu wrote: > > > > > > > > > > On Fri, Jun 25, 2021 at 12:13 AM Uros Bizjak via Gcc-patches > > > > > wrote: > > > > > > > > > > > > On Thu, Jun 24, 2021 at 2:12 PM H.J. L

Re: [x86_64 PATCH] Decrement followed by cmov improvements.

2021-07-30 Thread Uros Bizjak via Gcc-patches
On Mon, Jul 26, 2021 at 1:27 PM Roger Sayle wrote: > > > The following patch to the x86_64 backend improves the code generated > for a decrement followed by a conditional move. The primary change is > to recognize that after subtracting one, checking the result is -1 (or > equivalently that the o

Re: [PATCH 04/10] AVX512FP16: Initial support for AVX512FP16 feature and scalar _Float16 instructions.

2021-07-22 Thread Uros Bizjak via Gcc-patches
On Wed, Jul 21, 2021 at 9:44 AM liuhongt wrote: > > From: "Guo, Xuepeng" > > gcc/ChangeLog: > > * common/config/i386/cpuinfo.h (get_available_features): > Detect FEATURE_AVX512FP16. > * common/config/i386/i386-common.c > (OPTION_MASK_ISA_AVX512FP16_SET, > O

Re: [PATCH] x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions

2021-07-21 Thread Uros Bizjak via Gcc-patches
On Wed, Jul 21, 2021 at 14:23, H.J. Lu wrote: > Since > > commit 39671f87b2df6a1894cc11a161e4a7949d1ddccd > Author: H.J. Lu > Date: Thu Apr 15 05:59:48 2021 -0700 > > x86: Use crc32 target option for CRC32 intrinsics > > enabled OPTION_MASK_ISA_CRC32 for -msse4 and removed TARGET_

Re: [PATCH 03/10] [i386] libgcc: Enable hfmode soft-sf/df/xf/tf extensions and truncations.

2021-07-21 Thread Uros Bizjak via Gcc-patches
On Wed, Jul 21, 2021 at 9:43 AM liuhongt wrote: > > gcc/ChangeLog: > > * optabs-query.c (get_best_extraction_insn): Use word_mode for > HF field. > > libgcc/ChangeLog: > > * config/i386/32/sfp-machine.h (_FP_NANFRAC_H): New macro. > * config/i386/64/sfp-machine.h (_

Re: [PATCH 02/10] [i386] Enable _Float16 type for TARGET_SSE2 and above.

2021-07-21 Thread Uros Bizjak via Gcc-patches
On Wed, Jul 21, 2021 at 9:43 AM liuhongt wrote: > > gcc/ChangeLog: > > * config/i386/i386-modes.def (FLOAT_MODE): Define ieee HFmode. > * config/i386/i386.c (enum x86_64_reg_class): Add > X86_64_SSEHF_CLASS. > (merge_classes): Handle X86_64_SSEHF_CLASS. > (e

Re: [PATCH] Support logic shift left/right for avx512 mask type.

2021-07-21 Thread Uros Bizjak via Gcc-patches
On Wed, Jul 21, 2021 at 5:05 AM Hongtao Liu wrote: > > On Tue, Jul 20, 2021 at 9:41 PM Uros Bizjak wrote: > > > > On Tue, Jul 20, 2021 at 2:33 PM liuhongt wrote: > > > > > > Hi: > > > As mentioned in > > > https://gcc.gnu.org/pipermail/gcc-patches/2021-July/575420.html > > > > > > cut start

Re: [PATCH] Support logic shift left/right for avx512 mask type.

2021-07-20 Thread Uros Bizjak via Gcc-patches
On Tue, Jul 20, 2021 at 2:33 PM liuhongt wrote: > > Hi: > As mentioned in > https://gcc.gnu.org/pipermail/gcc-patches/2021-July/575420.html > > cut start- > > note for the lowpart we can just view-convert away the excess bits, > > fully re-using the mask. We generate surprisingly "good"

[PATCH] i386: Remove atomic_storedi_fpu and atomic_loaddi_fpu peepholes [PR100182]

2021-07-19 Thread Uros Bizjak via Gcc-patches
These patterns result in non-atomic sequence. 2021-07-21 Uroš Bizjak gcc/ PR target/100182 * config/i386/sync.md (define_peephole2 atomic_storedi_fpu): Remove. (define_peephole2 atomic_loaddi_fpu): Ditto. gcc/testsuite/ PR target/100182 * gcc.target/i386/pr71245-1.c: R

Re: [PATCH] ix86: Enable the GPR only instructions for -mgeneral-regs-only

2021-07-18 Thread Uros Bizjak via Gcc-patches
On Sun, Jul 18, 2021 at 3:40 AM H.J. Lu wrote: > > For -mgeneral-regs-only, enable the GPR only instructions which are > enabled implicitly by SSE ISAs unless they have been disabled explicitly. > > gcc/ > > PR target/101492 > * common/config/i386/i386-common.c (ix86_handle_option)

Re: [PATCH] x86: Don't issue vzeroupper if callee returns AVX register

2021-07-18 Thread Uros Bizjak via Gcc-patches
On Sun, Jul 18, 2021 at 6:47 PM H.J. Lu wrote: > > Don't issue vzeroupper before function call if callee returns AVX > register since callee must be compiled with AVX. > > gcc/ > > PR target/101495 > * config/i386/i386.c (ix86_check_avx_upper_stores): Moved before > ix86_av

[PATCH] i386: Fix ix86_hard_regno_mode_ok for TDmode on 32bit targets [PR101346]

2021-07-15 Thread Uros Bizjak via Gcc-patches
General regs on 32bit targets do not support 128bit modes, including TDmode. gcc/ 2021-07-15 Uroš Bizjak PR target/101346 * config/i386/i386.h (VALID_SSE_REG_MODE): Add TDmode. (VALID_INT_MODE_P): Add SDmode and DDmode. Add TDmode for TARGET_64BIT. (VALID_DFP_MODE_P): Remo

Re: [PATCH v3] vect: Recog mul_highpart pattern

2021-07-15 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 15, 2021 at 10:49, Kewen.Lin wrote: > on 2021/7/15 4:23 PM, Uros Bizjak wrote: > > On Thu, Jul 15, 2021 at 10:04 AM Kewen.Lin wrote: > >> > >> Hi Uros, > >> > >> on 2021/7/15 3:17 PM, Uros Bizjak wrote: > >>> On Thu, Jul 15, 2021 at 9:07 AM Kewen.Lin wrote: > > on 2

Re: [PATCH v3] vect: Recog mul_highpart pattern

2021-07-15 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 15, 2021 at 10:04 AM Kewen.Lin wrote: > > Hi Uros, > > on 2021/7/15 3:17 PM, Uros Bizjak wrote: > > On Thu, Jul 15, 2021 at 9:07 AM Kewen.Lin wrote: > >> > >> on 2021/7/14 3:45 PM, Kewen.Lin via Gcc-patches wrote: > >>> on 2021/7/14 2:38 PM, Richard Biener wrote: > On Tue, Jul 13, 2

Re: [PATCH v3] vect: Recog mul_highpart pattern

2021-07-15 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 15, 2021 at 9:07 AM Kewen.Lin wrote: > > on 2021/7/14 3:45 PM, Kewen.Lin via Gcc-patches wrote: > > on 2021/7/14 2:38 PM, Richard Biener wrote: > >> On Tue, Jul 13, 2021 at 4:59 PM Kewen.Lin wrote: > >>> > >>> on 2021/7/13 8:42 PM, Richard Biener wrote: > On Tue, Jul 13, 2021 at 12:

Re: [PATCH v3] x86: Don't enable UINTR in 32-bit mode

2021-07-13 Thread Uros Bizjak via Gcc-patches
On Tue, Jul 13, 2021 at 8:59 PM Jakub Jelinek wrote: > > On Tue, Jul 13, 2021 at 09:35:18AM -0700, H.J. Lu wrote: > > Here is the v3 patch. OK for master? > > From my POV LGTM, but please give Uros a chance to chime in. > > > From ceab81ef97ab102c410830c41ba7fea911170d1a Mon Sep 17 00:00:00 2001

[PATCH] i386: Fix vec_set expanders [PR101424]

2021-07-12 Thread Uros Bizjak via Gcc-patches
AVX does not support 32-byte integer compares, required by ix86_expand_vector_set_var. The following patch fixes vec_set expanders by introducing new vec_setm_avx2_operand predicate for AVX vector modes. gcc/ 2021-07-12 Uroš Bizjak PR target/101424 * config/i386/predicates.md (vec_se

[PATCH] Change the type of memory classification functions to bool

2021-07-09 Thread Uros Bizjak via Gcc-patches
2021-07-09 Uroš Bizjak gcc/ * recog.c (memory_address_addr_space_p): Change the type to bool. Return true/false instead of 1/0. (offsettable_memref_p): Ditto. (offsettable_nonstrict_memref_p): Ditto. (offsettable_address_addr_space_p): Ditto. Change the type of addressp

[PATCH] i386: Fix *udivmodsi4_pow2_zext_? patterns

2021-07-09 Thread Uros Bizjak via Gcc-patches
In addition to the obvious cut-n-pasto where *udivmodsi4_pow2_zext_2 never matches, limit the range of the immediate operand to prevent out of range immediate operand of AND instruction. Found by inspection, the patterns rarely match (if at all), since tree optimizers do the transformation before
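
As a rough illustration of the arithmetic behind these patterns (not the machine-description code itself, which the excerpt does not show), unsigned division and modulo by a power of two reduce to a shift and an AND with the mask `d - 1`. That mask is the value an AND immediate would have to hold. The names `is_pow2`, `udivmod_result` and `udivmod_pow2` are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: true when d is a nonzero power of two. */
static int is_pow2(uint32_t d) {
    return d != 0 && (d & (d - 1u)) == 0;
}

/* Hypothetical result type holding quotient and remainder together. */
typedef struct {
    uint32_t quot;
    uint32_t rem;
} udivmod_result;

/* Unsigned n / d and n % d for power-of-two d, using shift and mask. */
udivmod_result udivmod_pow2(uint32_t n, uint32_t d) {
    assert(is_pow2(d));

    /* Find log2(d) by shifting until only the top set bit remains. */
    unsigned shift = 0;
    while ((d >> shift) != 1u)
        shift++;

    /* Quotient is a right shift; remainder is an AND with d - 1. */
    udivmod_result r = { n >> shift, n & (d - 1u) };
    return r;
}
```

For large divisors the mask `d - 1` grows accordingly (for example `0x7fffffff` when `d` is `0x80000000`), which is presumably the kind of value the immediate range check is concerned with; the exact limit is not visible in the truncated message.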

Re: Ping: [PATCH] Darwin, X86: Adjust call clobbers to allow for lazy-binding [PR100152].

2021-07-09 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 9, 2021 at 10:25 AM Iain Sandoe wrote: > > (early) ping; > if possible I’d like to get this onto master in time to back-port for 11.2. > > > On 4 Jul 2021, at 21:08, Iain Sandoe wrote: > > > > Hi, > > > > (I’m not going to defend the status quo here, it seems a bit prone > > to confus

Re: [x86_64 PATCH]: Improvement to signed division of integer constant.

2021-07-08 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 8, 2021 at 10:25 AM Roger Sayle wrote: > > > This patch tweaks the way GCC handles 32-bit integer division on > x86_64, when the numerator is constant. Currently the function > > int foo (int x) { > return 100/x; > } > > generates the code: > foo: movl $100, %eax > clt
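
A small sketch of why a constant numerator may matter here, under the assumption (not confirmed by the truncated excerpt) that the point concerns the sign extension of the dividend. The x86 signed divide takes a double-width dividend, so the 32-bit numerator has to be sign-extended first. When the numerator is a compile-time constant, the upper half is known in advance. The helper `sign_extension_high_half` is hypothetical and only models that upper half.

```c
#include <assert.h>
#include <stdint.h>

/* The function quoted in the message: a constant numerator divided by x.
   Division by zero is undefined in C, so callers must pass x != 0. */
int foo(int x) {
    assert(x != 0);
    return 100 / x;
}

/* Hypothetical helper: the upper 32 bits produced by sign-extending a
   32-bit numerator to 64 bits. For a constant numerator this value is
   known at compile time: all zeros if non-negative, all ones otherwise. */
uint32_t sign_extension_high_half(int32_t numerator) {
    return numerator < 0 ? UINT32_MAX : 0u;
}
```

For the numerator 100 used in the message, the upper half is simply zero.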

[PATCH] i386: Add pack/unpack patterns for 32bit vectors [PR100637]

2021-07-08 Thread Uros Bizjak via Gcc-patches
V1SI mode shift is needed to shift 32bit operands and consequently we need to implement V1SI moves and pushes. 2021-07-08 Uroš Bizjak gcc/ PR target/100637 * config/i386/i386-expand.c (ix86_expand_sse_unpack): Handle V4QI mode. * config/i386/mmx.md (V_32): New mode iterator.

[PATCH] i386: Add variable vec_set for 32bit vectors [PR97194]

2021-07-06 Thread Uros Bizjak via Gcc-patches
To generate sane code a SSE4.1 variable PBLENDV instruction is needed. Also enable variable vec_set through vec_setm_operand predicate for TARGET_SSE4_1 instead of TARGET_AVX2. ix86_expand_vector_init_duplicate is able to emulate vpbroadcast{b,w} with pxor/pshufb. 2021-07-06 Uroš Bizjak gcc/

[PATCH] i386: Implement 4-byte vector (V4QI/V2HI) constant permutations [PR100637]

2021-07-05 Thread Uros Bizjak via Gcc-patches
2021-07-05 Uroš Bizjak gcc/ PR target/100637 * config/i386/i386-expand.c (ix86_split_mmx_punpck): Handle V4QI and V2HI modes. (expand_vec_perm_blend): Allow 4-byte vector modes with TARGET_SSE4_1. Handle V4QI mode. Emit mmx_pblendvb32 for 4-byte modes. (expand_vec_perm_p

Re: [PATCH] [i386] Remove rex64suffix for v?cvtt?(ss|sd)*2si

2021-07-02 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 2, 2021 at 12:48 PM Hongyu Wang wrote: > > > > > On Fri, Jul 2, 2021 at 10:30 AM Hongyu Wang wrote: > > > > > > Hi, > > > > > > For instructions like cvtss2si, there is no need to output the 'l' > > > or 'q' suffixes just like cvtss2usi, since the output operand is always > > > regist

Re: [PATCH] [i386] Remove rex64suffix for v?cvtt?(ss|sd)*2si

2021-07-02 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 2, 2021 at 10:30 AM Hongyu Wang wrote: > > Hi, > > For instructions like cvtss2si, there is no need to output the 'l' > or 'q' suffixes just like cvtss2usi, since the output operand is always > register and those suffixes are only used to distinguish ambiguous > memory operands. > > Bo

Re: [PATCH] i386: Disable param ira-consider-dup-in-all-alts [PR100328]

2021-07-02 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 2, 2021 at 4:28 AM Kewen.Lin wrote: > > Hi, > > With Hongtao's help (thanks), we got the SPEC2017 performance > evaluation result on x86_64 (see [1]), this new parameter > ira-consider-dup-in-all-alts has negative effects on i386. > Since we observed it can benefit ports aarch64 and rs

Re: [PATCH 0/2] Initial support for AVX512FP16

2021-07-02 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 2, 2021 at 8:25 AM Hongtao Liu wrote: > > > AVX512FP16 is disclosed, refer to [1]. > > > There're 100+ instructions for AVX512FP16, 67 gcc patches, for the > > > convenience of review, we divide the 67 patches into 2 major parts. > > > The first part is 2 patches containing bas

[PATCH] i386: Return true/false instead of 1/0 from predicates.

2021-07-01 Thread Uros Bizjak via Gcc-patches
No functional changes. 2021-07-01 Uroš Bizjak gcc/ * config/i386/predicates.md (ix86_endbr_immediate_operand): Return true/false instead of 1/0. (movq_parallel): Ditto. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Pushed to master. Uros. diff --git a/gcc/confi

[PATCH] Return true/false instead of 1/0 from generic predicates.

2021-07-01 Thread Uros Bizjak via Gcc-patches
No functional changes. 2021-07-01 Uroš Bizjak gcc/ * recog.c (general_operand): Return true/false instead of 1/0. (register_operand): Ditto. (immediate_operand): Ditto. (const_int_operand): Ditto. (const_scalar_int_operand): Ditto. (const_double_operand): Ditto. (pu

Re: [PATCH v6 1/2] x86: Convert CONST_WIDE_INT/CONST_VECTOR to broadcast

2021-07-01 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 1, 2021 at 2:42 PM H.J. Lu wrote: > > Hi Uros, > > On Thu, Jul 1, 2021 at 1:32 AM Hongtao Liu wrote: > > > > On Tue, Jun 29, 2021 at 6:16 AM H.J. Lu wrote: > > > > > > 1. Update move expanders to convert the CONST_WIDE_INT and CONST_VECTOR > > > operands to vector broadcast from an i

[PATCH] Change the type of predicates to bool.

2021-07-01 Thread Uros Bizjak via Gcc-patches
On Wed, Jun 30, 2021 at 12:50 PM Richard Biener wrote: > > On Wed, Jun 30, 2021 at 10:47 AM Uros Bizjak via Gcc-patches > wrote: > > > > This RFC patch changes the type of predicates to bool. However, some > > of the targets (e.g. x86) use indirect functions to

Re: [PATCH 0/2] Initial support for AVX512FP16

2021-07-01 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 1, 2021 at 2:40 PM H.J. Lu wrote: > > On Thu, Jul 1, 2021 at 4:10 AM Uros Bizjak wrote: > > > > [Sorry for double post, gcc-patches address was wrong in original post] > > > > On Thu, Jul 1, 2021 at 7:48 AM liuhongt wrote: > > > > > > Hi: > > > AVX512FP16 is disclosed, refer to [1]

Re: [PATCH 0/2] Initial support for AVX512FP16

2021-07-01 Thread Uros Bizjak via Gcc-patches
[Sorry for double post, gcc-patches address was wrong in original post] On Thu, Jul 1, 2021 at 7:48 AM liuhongt wrote: > > Hi: > AVX512FP16 is disclosed, refer to [1]. > There're 100+ instructions for AVX512FP16, 67 gcc patches, for the > convenience of review, we divide the 67 patches into

[PATCH] i386: Add integer nabs instructions [PR101044]

2021-07-01 Thread Uros Bizjak via Gcc-patches
The patch adds integer nabs "(NEG (ABS (...)))" instructions, adds STV conversion and adjusts STV cost calculations accordingly. When CMOV instruction is used to implement abs, the sign is determined from the preceding operand negation, and CMOVS is used to select between negated and non-negated v
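
For readers unfamiliar with the term, here is a minimal C sketch of what "nabs" computes at the source level, i.e. the negated absolute value written above as `(NEG (ABS (...)))`. It illustrates the semantics only, not the instruction pattern added by the patch; the function name `nabs` is chosen here for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Negated absolute value: returns -|x|.
   Selecting between x and -x based on the sign mirrors the "choose the
   negated or non-negated value" idea described in the message. */
int nabs(int x) {
    /* -x is only evaluated for x >= 0, so it cannot overflow. */
    int r = x < 0 ? x : -x;
    assert(r <= 0);
    return r;
}
```

Unlike `abs`, this form is well defined for `INT_MIN`, because the result of `-|x|` always fits in an `int`.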

[RFC PATCH] Change the type of predicates to bool.

2021-06-30 Thread Uros Bizjak via Gcc-patches
This RFC patch changes the type of predicates to bool. However, some of the targets (e.g. x86) use indirect functions to call the predicates, so without the local change, the build fails. Putting the patch through CI bots should weed out the problems, but I have no infrastructure to do it myself.

[PATCH] i386: Add V2SFmode vec_addsub pattern [PR95046]

2021-06-29 Thread Uros Bizjak via Gcc-patches
gcc/ 2021-06-21 Uroš Bizjak PR target/95046 * config/i386/mmx.md (vec_addsubv2sf3): New insn pattern. gcc/testsuite/ 2021-06-21 Uroš Bizjak PR target/95046 * gcc.target/i386/pr95046-9.c: New test. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Pushed to

Re: [PATCH][RFC] Add x86 subadd SLP pattern

2021-06-25 Thread Uros Bizjak via Gcc-patches
On Fri, Jun 25, 2021 at 8:48 AM Richard Biener wrote: > > On Thu, 24 Jun 2021, Uros Bizjak wrote: > > > On Thu, Jun 24, 2021 at 1:07 PM Richard Biener wrote: > > > > > This adds SLP pattern recognition for the SSE3/AVX [v]addsubp{ds} v0, v1 > > > instructions which compute { v0[0] - v1[0], v0[1]
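
The quoted description is cut off, so the following scalar sketch relies on the documented behaviour of the SSE3 ADDSUBPD/ADDSUBPS instructions rather than on the message text: even-indexed lanes are subtracted and odd-indexed lanes are added. The function name `addsub` and its array-based interface are assumptions made for illustration; this is the kind of loop body a vectorizer might match, not the vectorizer code.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar model of an alternating subtract/add over n lanes:
   out[i] = v0[i] - v1[i] for even i, v0[i] + v1[i] for odd i. */
void addsub(const double *v0, const double *v1, double *out, size_t n) {
    assert(v0 != NULL && v1 != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = (i % 2 == 0) ? v0[i] - v1[i] : v0[i] + v1[i];
}
```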
