Re: [PATCH] i386: Optimize code generation of __mm256_zextsi128_si256(__mm_set1_epi8(-1))

2022-09-22 Thread Hongtao Liu via Gcc-patches
On Fri, Sep 23, 2022 at 11:07 AM Hu, Lin1 wrote: > > Hi, Hongtao > > I have modified this patch and regtested on x86_64-pc-linux-gnu. > Ok. > BRs. > Lin > > -Original Message- > From: Hongtao Liu > Sent: Friday, September 23, 2022 9:48 AM > To: Hu,

Re: [PATCH] i386: Optimize code generation of __mm256_zextsi128_si256(__mm_set1_epi8(-1))

2022-09-22 Thread Hongtao Liu via Gcc-patches
On Thu, Sep 22, 2022 at 3:20 PM Hu, Lin1 via Gcc-patches wrote: > > Hi all, > > This patch aims to optimize code generation of > __mm256_zextsi128_si256(__mm_set1_epi8(-1)). Reduce the number of > instructions required to achieve the final result. > > Regtested on x86_64-pc-linux-gnu. Ok for tru

Re: [RFC PATCH] __trunc{tf, xf, df, sf, hf}bf2, __truncbfhf2 and __extendbfsf2

2022-09-22 Thread Hongtao Liu via Gcc-patches
On Thu, Sep 22, 2022 at 11:56 PM Jakub Jelinek wrote: > > On Tue, Sep 20, 2022 at 10:51:18AM +0200, Jakub Jelinek via Gcc-patches wrote: > > On Tue, Sep 20, 2022 at 11:35:07AM +0800, Hongtao Liu wrote: > > > > The question is (mainly for aarch64, arm and x86 backend mai

Re: [PATCH] [x86] Fix typo in floorv2sf2, should be register_operand for op1, not vector_operand.

2022-09-21 Thread Hongtao Liu via Gcc-patches
On Thu, Sep 22, 2022 at 9:17 AM liuhongt wrote: > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Verify 526.blend_r can be rebuilt with the fix. > > Ok for trunk? > > gcc/ChangeLog: > > PR target/106994 > * config/i386/mmx.md (floorv2sf2): Fix typo, use > reg

Re: [PATCH] Don't check can_vec_perm_const_p for nonlinear iv_init when it's constant.

2022-09-21 Thread Hongtao Liu via Gcc-patches
On Wed, Sep 21, 2022 at 3:41 PM Richard Biener via Gcc-patches wrote: > > On Wed, Sep 21, 2022 at 1:41 AM liuhongt via Gcc-patches > wrote: > > > > When init_expr is INTEGER_CST or REAL_CST, can_vec_perm_const_p is not > > necessary since there's no real vec_perm needed, but > > vec_gen_perm_mask

Re: [PATCH] c++: Implement P1467R9 - Extended floating-point types and standard names compiler part except for bfloat16 [PR106652]

2022-09-20 Thread Hongtao Liu via Gcc-patches
+My intel folk phoebe working for llvm side. On Tue, Sep 20, 2022 at 11:35 AM Hongtao Liu wrote: > > On Mon, Sep 12, 2022 at 4:06 PM Jakub Jelinek via Gcc-patches > wrote: > > > > Hi! > > > > The following patch implements the compiler part of C++23 > >

Re: [PATCH] c++: Implement P1467R9 - Extended floating-point types and standard names compiler part except for bfloat16 [PR106652]

2022-09-19 Thread Hongtao Liu via Gcc-patches
On Mon, Sep 12, 2022 at 4:06 PM Jakub Jelinek via Gcc-patches wrote: > > Hi! > > The following patch implements the compiler part of C++23 > P1467R9 - Extended floating-point types and standard names compiler part > by introducing _Float{16,32,64,128} as keywords and builtin types > like they are

Re: [PATCH] Fix incorrect handle in vectorizable_induction for mixed induction type.

2022-09-19 Thread Hongtao Liu via Gcc-patches
On Tue, Sep 20, 2022 at 10:23 AM liuhongt wrote: > > The code in vectorizable_induction for slp_node assumes all phi_info > have the same induction type (vect_step_op_add), but since we support > nonlinear induction, this could be handled incorrectly. > So the patch returns false when slp_node has mixed inducti

Re: [PATCH] [x86]Don't optimize cmp mem, 0 to load mem, reg + test reg, reg

2022-09-19 Thread Hongtao Liu via Gcc-patches
On Fri, Sep 16, 2022 at 9:38 PM Alexander Monakov via Gcc-patches wrote: > > On Fri, 16 Sep 2022, Uros Bizjak via Gcc-patches wrote: > > > On Fri, Sep 16, 2022 at 3:32 AM Jeff Law via Gcc-patches > > wrote: > > > > > > > > > On 9/15/22 19:06, liuhongt via Gcc-patches wrote: > > > > There's peepho

Re: [PATCH] Support 64-bit vectorization for single-precision floating rounding operation.

2022-09-19 Thread Hongtao Liu via Gcc-patches
On Tue, Sep 20, 2022 at 10:14 AM liuhongt wrote: > > Here's the list of operations the patch supports: > rint/nearbyint/ceil/floor/trunc/lrint/lceil/lfloor/round/lround. > > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,} > Ok for trunk? > > gcc/ChangeLog: > > PR target/106910 > * config

Re: [PATCH] [x86]Don't optimize cmp mem, 0 to load mem, reg + test reg, reg

2022-09-15 Thread Hongtao Liu via Gcc-patches
On Fri, Sep 16, 2022 at 9:09 AM liuhongt via Gcc-patches wrote: > > There's a peephole2 submitted in the 1990s which splits cmp mem, 0 into load mem, > reg + test reg, reg. I don't know the exact reason why gcc does this. > > For the latest x86 processors, ciscization should help the processor frontend > and also codesize, for

Re: [PATCH] Modernize ix86_builtin_vectorized_function with corresponding expanders.

2022-09-15 Thread Hongtao Liu via Gcc-patches
On Fri, Sep 16, 2022 at 8:55 AM liuhongt wrote: > > For ifloor/lfloor/iceil/lceil/irint/lrint/iround/lround when size of > in_mode is not equal out_mode, vectorizer doesn't go to internal fn > way,still left that part in the ix86_builtin_vectorized_function. > > Remove others builtins and add corr

Re: [PATCH] i386: Fixed vec_init_dup_v16bf [PR106887]

2022-09-14 Thread Hongtao Liu via Gcc-patches
On Thu, Sep 15, 2022 at 11:36 AM Kong, Lingling via Gcc-patches wrote: > > Hi > > The patch is to fix vec_init_dup_v16bf, add correct handle for v16bf mode in > ix86_expand_vector_init_duplicate. > Add testcase with sse2 without avx2. > > OK for master? > > gcc/ChangeLog: > > PR target/10

Re: [PATCH] Fix _mm512_cvt_roundps_ph to generate sae instruction.

2022-09-04 Thread Hongtao Liu via Gcc-patches
On Mon, Sep 5, 2022 at 10:44 AM liuhongt wrote: > > zmm-version vcvtps2ph is special, it encodes {sae} in evex, but put > round control in the imm. For intrinsic _mm512_cvt_roundps_ph (a, > imm), imm contains both {sae} and round control, we need to separate > it in the assembly output since vcvtp

Re: [PATCH] x86: Handle V8BF in expand_vec_perm_broadcast_1

2022-09-04 Thread Hongtao Liu via Gcc-patches
On Fri, Sep 2, 2022 at 4:08 PM Kong, Lingling wrote: > > Hi, > > I fixed it in a new patch. And added BF vector mode in SUBST_V and > avx512fmaskhalfmode for @vec_interleave_high. > Ok for trunk ? Ok. > > > > Hi, > > > > > > Handle E_V8BFmode in expand_vec_perm_broadcast_1 and > > ix86_expand_ve

Re: [PATCH] x86: Handle V8BF in expand_vec_perm_broadcast_1

2022-08-31 Thread Hongtao Liu via Gcc-patches
On Wed, Aug 31, 2022 at 2:52 PM Kong, Lingling via Gcc-patches wrote: > > Hi, > > Handle E_V8BFmode in expand_vec_perm_broadcast_1 and > ix86_expand_vector_init_duplicate. > Ok for trunk? > > gcc/ChangeLog: > > PR target/106742 > * config/i386/i386-expand.cc (ix86_expand_vector_in

Re: [PATCH] x86: Cast stride to __PTRDIFF_TYPE__ in AMX intrinsics

2022-08-28 Thread Hongtao Liu via Gcc-patches
On Sat, Aug 27, 2022 at 12:51 AM H.J. Lu wrote: > > On Mon, Aug 22, 2022 at 7:05 PM Hongtao Liu wrote: > > > > On Tue, Aug 23, 2022 at 1:02 AM H.J. Lu wrote: > > > > > > On 64-bit Windows, long is 32 bits and can't be used as stride in memory >

Re: [PATCH] Don't gimple fold ymm-version vblendvpd/vblendvps/vpblendvb w/o TARGET_AVX2

2022-08-24 Thread Hongtao Liu via Gcc-patches
On Wed, Aug 24, 2022 at 9:15 AM liuhongt wrote: > > Since 256-bit vector integer comparison is under TARGET_AVX2, > and gimple folding for vblendvpd/vblendvps/vpblendvb relies on that. > Restrict gimple fold condition to TARGET_AVX2. > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. >

Re: [PATCH] Add __m128bf16/__m256bf16/__m512bf16 type for bf16 abi test

2022-08-22 Thread Hongtao Liu via Gcc-patches
On Mon, Aug 22, 2022 at 10:16 AM Haochen Jiang via Gcc-patches wrote: > > Hi all, > > This patch added __m128bf16/__m256bf16/__m512bf16 type in testcases. Ok. > > BRs, > Haochen > > gcc/testsuite/ChangeLog: > > * gcc.target/x86_64/abi/bf16/bf16-helper.h: > Add _m128bf16/m256bf16/_m

Re: [PATCH] x86: Cast stride to __PTRDIFF_TYPE__ in AMX intrinsics

2022-08-22 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 23, 2022 at 1:02 AM H.J. Lu wrote: > > On 64-bit Windows, long is 32 bits and can't be used as stride in memory > operand when base is a pointer which is 64 bits. Cast stride to > __PTRDIFF_TYPE__, instead of long. Ok. > > PR target/106714 > * config/i386/amxtileintrin
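
As an illustration of the stride issue described in this thread, here is a minimal sketch in C. The helper `tile_row_addr` and the buffer `demo_tile` are hypothetical names invented for this example; they are not code from the patch or from the AMX intrinsics header. The sketch only assumes that `ptrdiff_t` (the type `__PTRDIFF_TYPE__` names in GCC) is pointer-sized on the target, whereas `long` is 32 bits on 64-bit Windows.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 4-row buffer with 16 bytes per row, for illustration only.  */
static uint8_t demo_tile[4 * 16];

/* Hypothetical helper (not from the patch): address of row ROW in a buffer
   whose rows are STRIDE bytes apart.  Taking the stride as ptrdiff_t keeps
   the offset arithmetic at pointer width on both LP64 and LLP64 targets;
   a 32-bit long would not match a 64-bit base pointer on 64-bit Windows.  */
static const uint8_t *
tile_row_addr (const void *base, ptrdiff_t stride, int row)
{
  return (const uint8_t *) base + (ptrdiff_t) row * stride;
}
```

With a 16-byte stride, `tile_row_addr (demo_tile, 16, 3)` points 48 bytes past the start of `demo_tile`.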

Re: [PATCH] Add ABI test for __bf16 type

2022-08-21 Thread Hongtao Liu via Gcc-patches
On Mon, Aug 22, 2022 at 9:02 AM Hongtao Liu wrote: > > On Sat, Aug 20, 2022 at 1:31 AM H.J. Lu wrote: > > > > On Thu, Aug 18, 2022 at 5:56 PM Hongtao Liu via Gcc-patches > > wrote: > > > > > > On Thu, Aug 18, 2022 at 3:36 PM Haochen Jiang via Gcc-pa

Re: [PATCH] Add ABI test for __bf16 type

2022-08-21 Thread Hongtao Liu via Gcc-patches
On Sat, Aug 20, 2022 at 1:31 AM H.J. Lu wrote: > > On Thu, Aug 18, 2022 at 5:56 PM Hongtao Liu via Gcc-patches > wrote: > > > > On Thu, Aug 18, 2022 at 3:36 PM Haochen Jiang via Gcc-patches > > wrote: > > > > > > Hi all, > > > > >

Re: [PATCH] x86: Support vector __bf16 type.

2022-08-16 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 16, 2022 at 3:50 PM Kong, Lingling via Gcc-patches wrote: > > Hi, > > The patch is to support vector init/broadcast/set/extract for __bf16 type. > The __bf16 type is a storage type. > > OK for master? Ok. > > gcc/ChangeLog: > > * config/i386/i386-expand.cc (ix86_expand_sse_movcc):

Re: [PATCH] i386 testsuite: cope with --enable-default-pie

2022-08-14 Thread Hongtao Liu via Gcc-patches
On Wed, Aug 10, 2022 at 1:42 PM Alexandre Oliva via Gcc-patches wrote: > > On Aug 9, 2022, Alexandre Oliva wrote: > > > Ping? > > https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598276.html > > Oops, sorry, I linked to the wrong patch. This is the one I meant to ping: > > https://gcc.gnu.or

Re: [RFC: PATCH] Extend vectorizer to handle nonlinear induction for neg, mul/lshift/rshift with a constant.

2022-08-04 Thread Hongtao Liu via Gcc-patches
On Thu, Aug 4, 2022 at 4:19 PM Richard Biener via Gcc-patches wrote: > > On Thu, Aug 4, 2022 at 6:29 AM liuhongt via Gcc-patches > wrote: > > > > For neg, the patch create a vec_init as [ a, -a, a, -a, ... ] and no > > vec_step is needed to update vectorized iv since vf is always multiple > > of
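
The neg case mentioned in this snippet can be sketched in plain C as follows. This is an illustration only, not the vectorizer's implementation: `VF`, `MAX_N`, `neg_iv_scalar`, `neg_iv_blocked` and `neg_iv_results_match` are hypothetical names, and the sketch assumes an even vectorization factor (the snippet is cut off before it says what vf is a multiple of).

```c
#include <string.h>

#define VF 4      /* assumed vectorization factor; must be even here */
#define MAX_N 64  /* capacity of the comparison buffers below */

/* Reference scalar loop: the induction variable is negated every iteration.
   (a must not be INT_MIN, since negating it overflows.)  */
static void
neg_iv_scalar (int a, int *out, int n)
{
  int x = a;
  for (int i = 0; i < n; i++)
    {
      out[i] = x;
      x = -x;
    }
}

/* Blocked version: build the initial vector [a, -a, a, -a] once.  Because
   VF is even, every block of VF iterations sees the same lane values, so
   no step is applied between blocks.  */
static void
neg_iv_blocked (int a, int *out, int n)
{
  int init[VF];
  for (int lane = 0; lane < VF; lane++)
    init[lane] = (lane % 2 == 0) ? a : -a;

  int i = 0;
  for (; i + VF <= n; i += VF)
    memcpy (&out[i], init, sizeof init);
  for (; i < n; i++)   /* scalar tail */
    out[i] = init[i % VF];
}

/* Returns 1 when both versions agree for the first n iterations
   (0 <= n <= MAX_N), and 0 otherwise.  */
static int
neg_iv_results_match (int a, int n)
{
  int ref[MAX_N], blk[MAX_N];
  if (n < 0 || n > MAX_N)
    return 0;
  neg_iv_scalar (a, ref, n);
  neg_iv_blocked (a, blk, n);
  return memcmp (ref, blk, (size_t) n * sizeof (int)) == 0;
}
```

The point of the sketch is that the blocked loop never updates `init`: with an even block size, iteration `i` and iteration `i + VF` always carry the same sign.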

Re: [PATCH] x86: Enable __bf16 type for TARGET_SSE2 and above

2022-08-03 Thread Hongtao Liu via Gcc-patches
On Wed, Aug 3, 2022 at 4:41 PM Kong, Lingling via Gcc-patches wrote: > > Hi, > > Old patch has some mistake in `*movbf_internal` , now disable BFmode constant > double move in `*movbf_internal`. LGTM. > > Thanks, > Lingling > > > -Original Message- > > From: Kong, Lingling > > Sent: Tues

Re: [PATCH] Move pass_cse_sincos after vectorizer.

2022-07-20 Thread Hongtao Liu via Gcc-patches
On Wed, Jul 20, 2022 at 3:59 PM Richard Biener via Gcc-patches wrote: > > On Wed, Jul 20, 2022 at 4:20 AM liuhongt wrote: > > > > __builtin_cexpi can't be vectorized since there's gap between it and > > vectorized sincos version(In libmvec, it passes a double and two > > double pointer and return

gcc-patches@gcc.gnu.org

2022-07-20 Thread Hongtao Liu via Gcc-patches
a.c. > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > > Ok for trunk? > > OK. > > Are there cases left your vectorizer patch handles over this one? No. > > Thanks, > Richard. > > > 2022-07-20 Richard Biener > > H

Re: [PATCH V2] Extend 16/32-bit vector bit_op patterns with (m, 0, i) alternative.

2022-07-20 Thread Hongtao Liu via Gcc-patches
On Wed, Jul 20, 2022 at 3:18 PM Uros Bizjak wrote: > > On Wed, Jul 20, 2022 at 8:54 AM Hongtao Liu wrote: > > > > On Wed, Jul 20, 2022 at 2:18 PM Uros Bizjak wrote: > > > > > > On Wed, Jul 20, 2022 at 8:14 AM Uros Bizjak wrote: > > > > > >

Re: [PATCH V2] Extend 16/32-bit vector bit_op patterns with (m, 0, i) alternative.

2022-07-19 Thread Hongtao Liu via Gcc-patches
On Wed, Jul 20, 2022 at 2:18 PM Uros Bizjak wrote: > > On Wed, Jul 20, 2022 at 8:14 AM Uros Bizjak wrote: > > > > On Wed, Jul 20, 2022 at 4:37 AM Hongtao Liu wrote: > > > > > > On Tue, Jul 19, 2022 at 5:37 PM Uros Bizjak wrote: > > > > > >

Re: [PATCH V2] Extend 16/32-bit vector bit_op patterns with (m, 0, i) alternative.

2022-07-19 Thread Hongtao Liu via Gcc-patches
On Tue, Jul 19, 2022 at 5:37 PM Uros Bizjak wrote: > > On Tue, Jul 19, 2022 at 8:56 AM Hongtao Liu wrote: > > > > On Tue, Jul 19, 2022 at 2:35 PM Uros Bizjak via Gcc-patches > > wrote: > > > > > > On Tue, Jul 19, 2022 at 8:07 AM liuhongt wrote:

Re: [PATCH V2] Extend 16/32-bit vector bit_op patterns with (m, 0, i) alternative.

2022-07-18 Thread Hongtao Liu via Gcc-patches
On Tue, Jul 19, 2022 at 2:35 PM Uros Bizjak via Gcc-patches wrote: > > On Tue, Jul 19, 2022 at 8:07 AM liuhongt wrote: > > > > And split it after reload. > > > > > You will need ix86_binary_operator_ok insn constraint here with > > > corresponding expander using ix86_fixup_binary_operands_no_copy

Re: [AVX512 PATCH] Add UNSPEC_MASKOP to kupck instructions in sse.md.

2022-07-17 Thread Hongtao Liu via Gcc-patches
On Sat, Jul 16, 2022 at 10:08 PM Roger Sayle wrote: > > > This AVX512 specific patch to sse.md is split out from an earlier patch: > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596199.html > > The new splitters proposed in that patch interfere with AVX512's > kunpckdq instruction which is

Re: [PATCH] x86: Disable sibcall if indirect_return attribute doesn't match

2022-07-14 Thread Hongtao Liu via Gcc-patches
On Fri, Jul 15, 2022 at 1:44 AM H.J. Lu via Gcc-patches wrote: > > When shadow stack is enabled, function with indirect_return attribute > may return via indirect jump. In this case, we need to disable sibcall > if caller doesn't have indirect_return attribute and indirect branch > tracking is en

Re: [PATCH] i386: Fix _mm_[u]comixx_{ss,sd} codegen and add PF result. [PR106113]

2022-07-14 Thread Hongtao Liu via Gcc-patches
On Thu, Jul 14, 2022 at 2:11 PM Kong, Lingling via Gcc-patches wrote: > > Hi, > > The patch is to fix _mm_[u]comixx_{ss,sd} codegen and add PF result. These > intrinsics have changed over time, like `_mm_comieq_ss ` old operation is > `RETURN ( a[31:0] == b[31:0] ) ? 1 : 0`, and new operation u

Re: [PATCH] Extend 64-bit vector bit_op patterns with ?r alternative

2022-07-14 Thread Hongtao Liu via Gcc-patches
On Thu, Jul 14, 2022 at 3:22 PM Uros Bizjak via Gcc-patches wrote: > > On Thu, Jul 14, 2022 at 7:33 AM liuhongt wrote: > > > > And split it to GPR-version instruction after reload. > > > > > ?r was introduced under the assumption that we want vector values > > > mostly in vector registers. Curren

Re: [PATCH] [RFC]Support vectorization for Complex type.

2022-07-14 Thread Hongtao Liu via Gcc-patches
On Thu, Jul 14, 2022 at 4:53 PM Hongtao Liu wrote: > > On Thu, Jul 14, 2022 at 4:20 PM Richard Biener > wrote: > > > > On Wed, Jul 13, 2022 at 9:34 AM Richard Biener > > wrote: > > > > > > On Wed, Jul 13, 2022 at 6:47 AM Hongtao Liu wrote: >

Re: [PATCH] [RFC]Support vectorization for Complex type.

2022-07-14 Thread Hongtao Liu via Gcc-patches
On Thu, Jul 14, 2022 at 4:20 PM Richard Biener wrote: > > On Wed, Jul 13, 2022 at 9:34 AM Richard Biener > wrote: > > > > On Wed, Jul 13, 2022 at 6:47 AM Hongtao Liu wrote: > > > > > > On Tue, Jul 12, 2022 at 10:12 PM Richard Biener > > > wrote:

Re: [PATCH] [RFC]Support vectorization for Complex type.

2022-07-12 Thread Hongtao Liu via Gcc-patches
On Tue, Jul 12, 2022 at 10:12 PM Richard Biener wrote: > > On Tue, Jul 12, 2022 at 6:11 AM Hongtao Liu wrote: > > > > On Mon, Jul 11, 2022 at 7:47 PM Richard Biener via Gcc-patches > > wrote: > > > > > > On Mon, Jul 11, 2022 at 5:44 AM liuhongt wro

Re: [PATCH] Allocate general register(memory/immediate) for 16/32/64-bit vector bit_op patterns.

2022-07-11 Thread Hongtao Liu via Gcc-patches
On Mon, Jul 11, 2022 at 4:03 PM Uros Bizjak via Gcc-patches wrote: > > On Mon, Jul 11, 2022 at 3:15 AM liuhongt wrote: > > > > And split it to GPR-version instruction after reload. > > > > This will enable below optimization for 16/32/64-bit vector bit_op > > > > - movd(%rdi), %xmm0 > >

Re: [PATCH] [RFC]Support vectorization for Complex type.

2022-07-11 Thread Hongtao Liu via Gcc-patches
On Mon, Jul 11, 2022 at 7:47 PM Richard Biener via Gcc-patches wrote: > > On Mon, Jul 11, 2022 at 5:44 AM liuhongt wrote: > > > > The patch only handles load/store(including ctor/permutation, except > > gather/scatter) for complex type, other operations don't needs to be > > handled since they wi

Re: [x86 PATCH] UNSPEC_PALIGNR optimizations and clean-ups.

2022-07-04 Thread Hongtao Liu via Gcc-patches
This revised patch has been tested on x86_64-pc-linux-gnu with make > bootstrap and make -k check, both with and without --target_board=unix{-32}, > with no new failures. Is this revised version Ok for mainline? Ok. > > > 2022-07-04 Roger Sayle > Hongtao Liu >

Re: [x86 PATCH] UNSPEC_PALIGNR optimizations and clean-ups.

2022-06-30 Thread Hongtao Liu via Gcc-patches
On Fri, Jul 1, 2022 at 10:12 AM Hongtao Liu wrote: > > On Fri, Jul 1, 2022 at 2:42 AM Roger Sayle wrote: > > > > > > This patch is a follow-up to Hongtao's fix for PR target/105854. That > > fix is perfectly correct, but the thing that caught my eye was w

Re: [PATCH] Add myself for write after approval

2022-06-30 Thread Hongtao Liu via Gcc-patches
I think this can be taken as an obvious fix without prior approval. "Obvious fixes can be committed without prior approval. Just check in the fix and copy it to gcc-patches." Quoted from https://gcc.gnu.org/gitwrite.html On Fri, Jul 1, 2022 at 10:02 AM Haochen Jiang via Gcc-patches wrote: > > Hi

Re: [x86 PATCH] UNSPEC_PALIGNR optimizations and clean-ups.

2022-06-30 Thread Hongtao Liu via Gcc-patches
On Fri, Jul 1, 2022 at 2:42 AM Roger Sayle wrote: > > > This patch is a follow-up to Hongtao's fix for PR target/105854. That > fix is perfectly correct, but the thing that caught my eye was why is > the compiler generating a shift by zero at all. Digging deeper it > turns out that we can easily

Re: [PATCH gcc 0/1] [PATCH] target: Fix asm generation for AVX builtins when using -masm=intel [PR106095]

2022-06-28 Thread Hongtao Liu via Gcc-patches
this. Yes for the case in your patch, I think it's a typo. But there could be some difference for operand modifiers between AT&T and Intel syntaxes in some patterns. .i.e the use of mode attr . > > On Tue, 2022-06-28 at 14:22 +0800, Hongtao Liu wrote: > > On Tue, J

Re: [PATCH gcc 0/1] [PATCH] target: Fix asm generation for AVX builtins when using -masm=intel [PR106095]

2022-06-27 Thread Hongtao Liu via Gcc-patches
On Tue, Jun 28, 2022 at 9:26 AM ~antoyo via Gcc-patches wrote: > > Hi. > > This fixes the following bug: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106095 The patch LGTM, thanks for handling this. > > It's the first time I work outside of the jit component, so please tell > me if I forgot anyt

Re: PING^1 [PATCH] x86: Skip ENDBR when emitting direct call/jmp to local function

2022-06-26 Thread Hongtao Liu via Gcc-patches
On Tue, Jun 21, 2022 at 3:50 AM Uros Bizjak via Gcc-patches wrote: > > On Mon, Jun 20, 2022 at 8:14 PM H.J. Lu wrote: > > > > On Tue, May 10, 2022 at 9:25 AM H.J. Lu wrote: > > > > > > Mark a function with SYMBOL_FLAG_FUNCTION_ENDBR when inserting ENDBR at > > > function entry. Skip the 4-byte

Re: [PATCH] Add optional __Bfloat16 support

2022-06-12 Thread Hongtao Liu via Gcc-patches
On Sat, Jun 11, 2022 at 1:46 AM H.J. Lu wrote: > > On Fri, Jun 10, 2022 at 7:44 AM H.J. Lu wrote: > > > > On Fri, Jun 10, 2022 at 2:38 AM Florian Weimer wrote: > > > > > > * liuhongt via Libc-alpha: > > > > > > > +\subsubsection{Special Types} > > > > + > > > > +The \code{__Bfloat16} type uses a

Re: [PATCH] testsuite: Add -mtune=generic to dg-options for two testcases.

2022-06-10 Thread Hongtao Liu via Gcc-patches
On Fri, Jun 10, 2022 at 4:45 PM Cui,Lili via Gcc-patches wrote: > > This patch is to change dg-options for two testcases. > > Use -mtune=generic to limit these two testcases. Because configuring them with > -mtune=cascadelake or znver3 will vectorize them. > > regtested on x86_64-linux-gnu{-m32,}.

Re: [PATCH] Add optional __Bfloat16 support

2022-06-10 Thread Hongtao Liu via Gcc-patches
On Fri, Jun 10, 2022 at 3:47 PM liuhongt via Libc-alpha wrote: > > Pass and return __Bfloat16 values in XMM registers. > > Background: > __Bfloat16 (BF16) is a new floating-point format that can accelerate machine > learning (deep learning training, in particular) algorithms. > It's first introdu

Re: [PATCH] Update {skylake,icelake,alderlake}_cost to add a bit preference to vector store.

2022-06-09 Thread Hongtao Liu via Gcc-patches
On Wed, Jun 8, 2022 at 11:44 AM Cui, Lili wrote: > > > -Original Message- > > From: Hongtao Liu > > Sent: Monday, June 6, 2022 1:25 PM > > To: H.J. Lu > > Cc: Cui, Lili ; Liu, Hongtao ; > > GCC > > Patches > > Subject: Re: [PATCH] U

Re: [PATCH] Disparages SSE_REGS alternatives sligntly with ?v instead of *v in *mov{si, di}_internal.

2022-06-07 Thread Hongtao Liu via Gcc-patches
On Tue, Jun 7, 2022 at 3:41 PM liuhongt via Gcc-patches wrote: > > So alternative v won't be ignored in record_reg_classes. > > Similar for *r alternatives in some vector patterns. > > It helps testcase in the PR, also RA now makes better decisions for > gcc.target/i386/extract-insert-combining.c

Re: [PATCH] Update {skylake,icelake,alderlake}_cost to add a bit preference to vector store.

2022-06-05 Thread Hongtao Liu via Gcc-patches
On Wed, Jun 1, 2022 at 11:56 PM H.J. Lu via Gcc-patches wrote: > > On Tue, May 31, 2022 at 10:06 PM Cui,Lili wrote: > > > > This patch is to update {skylake,icelake,alderlake}_cost to add a bit > > preference to vector store. > > Since the integer vector construction cost has changed, we need t

Re: [PATCH] x86: harmonize __builtin_ia32_psadbw*() types

2022-06-05 Thread Hongtao Liu via Gcc-patches
On Mon, Jun 6, 2022 at 3:17 AM Uros Bizjak via Gcc-patches wrote: > > On Thu, Jun 2, 2022 at 5:04 PM Jan Beulich wrote: > > > > The 64-bit, 128-bit, and 512-bit variants have VDI return type, in > > line with instruction behavior. Make the 256-bit builtin match, thus > > also making it match the

Re: [x86_64 PATCH] PR target/105791: Add V1TI to V_128_256 for xop_pcmov_v1ti.

2022-06-01 Thread Hongtao Liu via Gcc-patches
On Thu, Jun 2, 2022 at 2:24 PM Roger Sayle wrote: > > > This patch resolves PR target/105791 which is a regression that was > accidentally introduced for my workaround to PR tree-optimization/10566. > (a deeper problem in GCC's vectorizer creating VEC_COND_EXPR when it > shouldn't). The latest is

Re: [PATCH] Add a bit dislike for separate mem alternative when op is REG_P.

2022-05-31 Thread Hongtao Liu via Gcc-patches
On Wed, Jun 1, 2022 at 12:40 AM Richard Sandiford wrote: > > Vladimir Makarov via Gcc-patches writes: > > On 2022-05-29 23:05, Hongtao Liu wrote: > >> On Fri, May 27, 2022 at 5:12 AM Vladimir Makarov via Gcc-patches > >> wrote: > >>> > >>>

Re: [PATCH] Add a bit dislike for separate mem alternative when op is REG_P.

2022-05-30 Thread Hongtao Liu via Gcc-patches
On Mon, May 30, 2022 at 3:44 PM Alexander Monakov wrote: > > On Mon, 30 May 2022, Hongtao Liu wrote: > > > On Mon, May 30, 2022 at 2:22 PM Alexander Monakov via Gcc-patches > > wrote: > > > > > > > > The spill is mainly decided by 3 insns related

Re: [PATCH] Add a bit dislike for separate mem alternative when op is REG_P.

2022-05-30 Thread Hongtao Liu via Gcc-patches
On Mon, May 30, 2022 at 2:22 PM Alexander Monakov via Gcc-patches wrote: > > > > In the PR, the spill happens in the initial basic block of the function, > > > i.e. > > > the one with the highest frequency. > > > > > > Also as noted in the PR, swapping the 'unlikely' branch to 'likely' > > > avo

Re: [PATCH] Add a bit dislike for separate mem alternative when op is REG_P.

2022-05-29 Thread Hongtao Liu via Gcc-patches
On Fri, May 27, 2022 at 5:12 AM Vladimir Makarov via Gcc-patches wrote: > > > On 2022-05-24 23:39, liuhongt wrote: > > Right now, mem_cost for separate mem alternative is 1 * frequency which > > is pretty small and caused the unnecessary SSE spill in the PR, I've tried > > to rework backend cost mo

Re: [PATCH] Add a bit dislike for separate mem alternative when op is REG_P.

2022-05-24 Thread Hongtao Liu via Gcc-patches
On Wed, May 25, 2022 at 11:39 AM liuhongt via Gcc-patches wrote: > > Right now, mem_cost for separate mem alternative is 1 * frequency which > is pretty small and caused the unnecessary SSE spill in the PR, I've tried > to rework backend cost model, but RA is still not happy with that (regress > somewh

Re: [PATCH] [i386] recognize bzhi pattern when there's zero_extendsidi.

2022-05-17 Thread Hongtao Liu via Gcc-patches
On Tue, May 17, 2022 at 6:07 PM Uros Bizjak via Gcc-patches wrote: > > On Tue, May 17, 2022 at 5:06 AM liuhongt wrote: > > > > backend has > > > > 16550(define_insn "*bmi2_bzhi_3_2" > > 16551 [(set (match_operand:SWI48 0 "register_operand" "=r") > > 16552(and:SWI48 > > 16553 (pl

Re: [PATCH] Expand __builtin_memcmp_eq with ptest for OImode.

2022-05-17 Thread Hongtao Liu via Gcc-patches
On Tue, May 17, 2022 at 6:03 PM Uros Bizjak wrote: > > On Tue, May 17, 2022 at 3:33 AM Hongtao Liu wrote: > > > > On Mon, May 16, 2022 at 5:21 PM Uros Bizjak via Gcc-patches > > wrote: > > > > > > On Sat, May 7, 2022 at 7:05 AM liuhongt wrote: >

Re: [PATCH] [Middle-end] Enhance final_value_replacement_loop to handle bitwise induction.

2022-05-17 Thread Hongtao Liu via Gcc-patches
On Fri, May 13, 2022 at 7:16 PM Richard Biener wrote: > > On Fri, May 13, 2022 at 5:37 AM Hongtao Liu wrote: > > > > On Wed, May 11, 2022 at 4:45 PM Richard Biener via Gcc-patches > > wrote: > > > > > > On Mon, May 9, 2022 at 7:19 AM liuhongt wrote: >

Re: [committed] forwprop: Fix a typo and comment formatting

2022-05-17 Thread Hongtao Liu via Gcc-patches
thanks. On Tue, May 17, 2022 at 3:09 PM Jakub Jelinek via Gcc-patches wrote: > > Hi! > > When looking around the spot of the PR105591 fix, I've noticed a typo > and incorrectly formatted comment. > > Bootstrapped/regtested on x86_64-linux and i686-linux, committed to > trunk as obvious. > > 2022-

Re: [PATCH] [i386] recognize bzhi pattern when there's zero_extendsidi.

2022-05-16 Thread Hongtao Liu via Gcc-patches
On Tue, May 17, 2022 at 11:06 AM liuhongt via Gcc-patches wrote: > > backend has > > 16550(define_insn "*bmi2_bzhi_3_2" > 16551 [(set (match_operand:SWI48 0 "register_operand" "=r") > 16552(and:SWI48 > 16553 (plus:SWI48 > 16554(ashift:SWI48 (const_int 1) > 16555

Re: [PATCH v2] Optimize vpermtiw/b to vpunpcklqdq for certain cases.

2022-05-16 Thread Hongtao Liu via Gcc-patches
I've committed the patch. On Fri, May 13, 2022 at 5:22 PM liuhongt via Gcc-patches wrote: > > Here's updated patch which adds ix86_pre_reload_split () to those 2 > define_insn_and_splits. > > Assembly Optimization like: > - vmovq %xmm0, %xmm2 > - vmovdqa .LC0(%rip), %xmm0 >

Re: [PATCH] Expand __builtin_memcmp_eq with ptest for OImode.

2022-05-16 Thread Hongtao Liu via Gcc-patches
On Mon, May 16, 2022 at 5:21 PM Uros Bizjak via Gcc-patches wrote: > > On Sat, May 7, 2022 at 7:05 AM liuhongt wrote: > > > > This is adjusted patch only for OImode. > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > > Ok for trunk? > > > > gcc/ChangeLog: > > > > PR targe

Re: [PATCH] Expand __builtin_memcmp_eq with ptest for OImode.

2022-05-15 Thread Hongtao Liu via Gcc-patches
ping. On Sat, May 7, 2022 at 1:05 PM liuhongt via Gcc-patches wrote: > > This is adjusted patch only for OImode. > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? > > gcc/ChangeLog: > > PR target/104610 > * config/i386/i386-expand.cc (ix86_expand_branch

Re: [PATCH] [Middle-end] Enhance final_value_replacement_loop to handle bitwise induction.

2022-05-12 Thread Hongtao Liu via Gcc-patches
On Wed, May 11, 2022 at 4:45 PM Richard Biener via Gcc-patches wrote: > > On Mon, May 9, 2022 at 7:19 AM liuhongt wrote: > > > > This patch will enable below optimization: > > > > { > > - int bit; > > - long long unsigned int _1; > > - long long unsigned int _2; > > - > > [local count: 46

Re: [PATCH v2] Strip of a vector load which is only used partially.

2022-05-11 Thread Hongtao Liu via Gcc-patches
On Tue, May 10, 2022 at 2:54 PM Richard Biener via Gcc-patches wrote: > > On Mon, May 9, 2022 at 7:11 AM liuhongt via Gcc-patches > wrote: > > > > Here's the adjusted patch. > > Ok for trunk? > > > > Optimize > > > > _4 = VEC_PERM_EXPR <_1, _1, { 4, 5, 6, 7, 4, 5, 6, 7 }>; > > _5 = BIT_FIELD_REF <

Re: [PATCH] [i386] Optimize movzwl + vmovd/vmovq to vmovw.

2022-05-10 Thread Hongtao Liu via Gcc-patches
On Mon, May 9, 2022 at 4:28 PM Uros Bizjak wrote: > > On Mon, May 9, 2022 at 4:03 AM liuhongt wrote: > > > > Similarly optimize movl + vmovq to vmovd. > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > > Ok for trunk? > > > > gcc/ChangeLog: > > > > PR target/104915 > >

Re: [PATCH] [i386] Implement permutation with pslldq + psrldq + por when pshufb is not available.

2022-05-09 Thread Hongtao Liu via Gcc-patches
On Mon, May 9, 2022 at 4:19 PM Uros Bizjak wrote: > > On Mon, May 9, 2022 at 7:24 AM Hongtao Liu wrote: > > > > On Mon, May 9, 2022 at 1:22 PM liuhongt via Gcc-patches > > wrote: > > > > > > pand/pandn may be used to clear upper/lower bits of the oper

Re: [PATCH] Optimize vec_setv8{hi,hf}_0 + pmovzxbq to pmovzxbq.

2022-05-08 Thread Hongtao Liu via Gcc-patches
On Mon, May 9, 2022 at 2:43 PM liuhongt via Gcc-patches wrote: > > Clean up of 16-bit uppers is not needed for pmovzxbq/pmovsxbq. > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? > > gcc/ChangeLog: > > PR target/105072 > * config/i386/sse.md (*sse4_1_v2

Re: [PATCH] [i386] Implement permutation with pslldq + psrldq + por when pshufb is not available.

2022-05-08 Thread Hongtao Liu via Gcc-patches
On Mon, May 9, 2022 at 1:22 PM liuhongt via Gcc-patches wrote: > > pand/pandn may be used to clear upper/lower bits of the operands, in > that case there will be 4-5 instructions for permutation, and it's > still better than scalar codes. > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,

Re: [PATCH] Expand __builtin_memcmp_eq with ptest for OImode.

2022-05-08 Thread Hongtao Liu via Gcc-patches
On Sat, May 7, 2022 at 1:05 PM liuhongt via Gcc-patches wrote: > > This is adjusted patch only for OImode. > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? > > gcc/ChangeLog: > > PR target/104610 > * config/i386/i386-expand.cc (ix86_expand_branch): Use

Re: [PATCH] Expand __builtin_memcmp_eq with ptest for OI/TImode.

2022-05-05 Thread Hongtao Liu via Gcc-patches
On Thu, May 5, 2022 at 4:09 PM Uros Bizjak via Gcc-patches wrote: > > On Thu, May 5, 2022 at 9:50 AM Richard Biener via Gcc-patches > wrote: > > > > On Thu, May 5, 2022 at 9:37 AM liuhongt via Gcc-patches > > wrote: > > > > > > Enable optimization for TImode only under 32-bit target, for 64-bit

Re: [PATCH] Expand __builtin_memcmp_eq with ptest for OI/TImode.

2022-05-05 Thread Hongtao Liu via Gcc-patches
On Thu, May 5, 2022 at 3:37 PM liuhongt wrote: > > Enable optimization for TImode only under 32-bit target, for 64-bit > target there could be extra integer <-> sse move regarding psABI, > not efficient. > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,} > Ok for trunk? > > gcc/ChangeLo

Re: [PATCH] AVX512F: Add missing macro for mask(z?)_scalf_s[sd] [PR 105339]

2022-04-23 Thread Hongtao Liu via Gcc-patches
On Fri, Apr 22, 2022 at 8:43 PM Hongyu Wang wrote: > > > Please add the corresponding intrinsic test in sse-14.c > > Sorry for forgetting this part. Updated patch. Thanks. > LGTM. > On Fri, Apr 22, 2022 at 16:49, Hongtao Liu via Gcc-patches wrote: > > > > On Fri, Apr 22, 20

Re: [PATCH] AVX512F: Add missing macro for mask(z?)_scalf_s[sd] [PR 105339]

2022-04-22 Thread Hongtao Liu via Gcc-patches
On Fri, Apr 22, 2022 at 4:12 PM Hongyu Wang via Gcc-patches wrote: > > Hi, > > Add missing macro under O0 and adjust macro format for scalf > intrinsics. > Please add the corresponding intrinsic test in sse-14.c. > Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,}. > > Ok for master and backpor

Re: [x86_64 PATCH] Support pandn for V1TI mode (i.e. *andnotv1ti3).

2022-04-05 Thread Hongtao Liu via Gcc-patches
On Wed, Apr 6, 2022 at 5:56 AM Roger Sayle wrote: > > > > This simple patch allows the i386 backend to generate pandn instructions > > for V1TI mode. Currently, the testcase: > > > > typedef unsigned __int128 v1ti __attribute__ ((__vector_size__ (16))); > > v1ti andnot1(v1ti x, v1ti y) { return ~

Re: [PATCH V3] Split vector load from parm_del to elemental loads to avoid STLF stalls.

2022-04-04 Thread Hongtao Liu via Gcc-patches
On Fri, Apr 1, 2022 at 4:32 PM liuhongt via Gcc-patches wrote: > > Update in V3: > 1. Add -param=x86-stlf-window-ninsns= (default 64). > 2. Exclude call in the window. > > Since cfg is freed before machine_reorg, just do a rough calculation > of the window according to the layout. > Also according

Re: [PATCH] Split vector load from parm_del to elemental loads to avoid STLF stalls.

2022-04-01 Thread Hongtao Liu via Gcc-patches
On Fri, Apr 1, 2022 at 2:54 PM Richard Biener via Gcc-patches wrote: > > On Fri, Apr 1, 2022 at 8:47 AM liuhongt via Gcc-patches > wrote: > > > > Update in V2: > > 1. Use get_insns instead of FOR_EACH_BB_CFUN and FOR_BB_INSNS. > > 2. Return for any_uncondjump_p and ANY_RETURN_P. > > 3. Add dump i

Re: [PATCH] Split vector load from parm_del to elemental loads to avoid STLF stalls.

2022-03-31 Thread Hongtao Liu via Gcc-patches
On Thu, Mar 31, 2022 at 6:45 PM Richard Biener via Gcc-patches wrote: > > On Thu, Mar 31, 2022 at 7:51 AM liuhongt wrote: > > > > Since cfg is freed before machine_reorg, just do a rough calculation > > of the window according to the layout. > > Also according to an experiment on CLX, set window

Re: [PATCH] x86: Use x constraint on KL patterns

2022-03-25 Thread Hongtao Liu via Gcc-patches
On Sat, Mar 26, 2022 at 10:05 AM Hongyu Wang via Gcc-patches wrote: > > > > Is it possible to create a test case that gas would throw an error for > > > invalid operands? > > > > You can use -ffix-xmmN to disable XMM0-15. > > I mean can we create an intrinsic test for this PR that produces xmm16-3

Re: [PATCH] x86: Use x constraint on SSSE3 patterns with MMX operands

2022-03-25 Thread Hongtao Liu via Gcc-patches
On Sat, Mar 26, 2022 at 1:27 AM H.J. Lu via Gcc-patches wrote: > > Since PHADDW/PHADDD/PHADDSW/PHSUBW/PHSUBD/PHSUBSW/PSIGNB/PSIGNW/PSIGND > have no AVX512 version, replace the "Yv" register constraint with the > "x" register constraint. LGTM, please backport to GCC10/GCC11 branch. > > PR t

Re: [PATCH] x86: Use x constraint on KL patterns

2022-03-25 Thread Hongtao Liu via Gcc-patches
On Sat, Mar 26, 2022 at 4:50 AM H.J. Lu via Gcc-patches wrote: > > Since KL instructions have no AVX512 version, replace the "v" register > constraint with the "x" register constraint. > > PR target/105058 > * config/i386/sse.md (loadiwkey): Replace "v" with "x". > (aesu8):

Re: [PATCH][RFC] tree-optimization/101908 - avoid STLF fails when vectorizing

2022-03-25 Thread Hongtao Liu via Gcc-patches
On Fri, Mar 25, 2022 at 9:42 PM Richard Biener wrote: > > On Fri, 25 Mar 2022, Hongtao Liu wrote: > > > On Fri, Mar 25, 2022 at 8:11 PM Richard Biener via Gcc-patches > > wrote: > > > > > > Since we're now vectorizing by default at -O2 issues like P

Re: [PATCH][RFC] tree-optimization/101908 - avoid STLF fails when vectorizing

2022-03-25 Thread Hongtao Liu via Gcc-patches
On Fri, Mar 25, 2022 at 8:11 PM Richard Biener via Gcc-patches wrote: > > Since we're now vectorizing by default at -O2 issues like PR101908 > become more important where we apply basic-block vectorization to > parts of the function covering loads from function parameters passed > on the stack. S

Re: [PATCH] Fix ICE caused by NULL_RTX returned by lowpart_subreg.

2022-03-22 Thread Hongtao Liu via Gcc-patches
On Wed, Mar 23, 2022 at 2:05 PM liuhongt via Gcc-patches wrote: > > In validate_subreg, both (subreg:V2HF (reg:SI) 0) > and (subreg:V8HF (reg:V2HF) 0) are valid, but not > for (subreg:V8HF (reg:SI) 0) which causes ICE. > > Ideally it should be handled in validate_subreg to support > subreg for all

Re: [PATCH] [i386] Extend splitter pattern to reversed condition by swapping then and else rtx. [PR target/104982]

2022-03-21 Thread Hongtao Liu via Gcc-patches
On Mon, Mar 21, 2022 at 9:06 PM liuhongt wrote: > > Failed to match this instruction: > (set (reg/v:SI 88 [ z ]) > (if_then_else:SI (eq (zero_extract:SI (reg:SI 92) > (const_int 1 [0x1]) > (zero_extend:SI (subreg:QI (reg:SI 93) 0))) > (const_int 0 [0

Re: [PATCH v2] AVX512FP16: Fix wrong code for _mm_mask_f[c]madd.*sch [PR 104978]

2022-03-21 Thread Hongtao Liu via Gcc-patches
On Mon, Mar 21, 2022 at 7:52 PM Hongyu Wang via Gcc-patches wrote: > > Hi, > > For complex scalar intrinsics like _mm_mask_fcmadd_sch, the > mask should be ANDed with 1 to ensure the mask is bound to the lowest byte. > Use masked vmovss to perform the same operation, which omits the higher bits > of the mask. > > Bootst
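
The point about the mask can be illustrated with a small scalar model in plain C. This is only a sketch of the documented behaviour of a masked scalar move such as _mm_mask_move_ss (element 0 comes from b when bit 0 of the mask is set, otherwise from src; the upper elements come from a). The type `vec4f` and the function `mask_move_ss_model` are hypothetical names used for illustration; this is not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit vector of four floats. */
typedef struct { float f[4]; } vec4f;

/* Scalar model of a masked scalar move: only bit 0 of the mask k is
   consulted, so the higher mask bits have no effect on the result. */
vec4f mask_move_ss_model(vec4f src, uint8_t k, vec4f a, vec4f b)
{
    vec4f dst = a;                           /* upper elements copied from a */
    dst.f[0] = (k & 1) ? b.f[0] : src.f[0];  /* lowest element is masked */
    return dst;
}

/* Self-check: a mask with bit 0 clear but other bits set selects src. */
void mask_move_ss_model_selfcheck(void)
{
    vec4f src = {{1.0f, 2.0f, 3.0f, 4.0f}};
    vec4f a   = {{10.0f, 20.0f, 30.0f, 40.0f}};
    vec4f b   = {{100.0f, 200.0f, 300.0f, 400.0f}};
    assert(mask_move_ss_model(src, 0x01, a, b).f[0] == 100.0f);
    assert(mask_move_ss_model(src, 0xFE, a, b).f[0] == 1.0f);
}
```

In this model a mask of 0xFE behaves exactly like a mask of 0, which is the property the thread relies on when it talks about ignoring the higher bits of the mask.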

Re: [PATCH] AVX512FP16: Fix wrong code for _mm_mask_f[c]madd.*sch [PR 104978]

2022-03-20 Thread Hongtao Liu via Gcc-patches
m_mask_move_ss (__m128 src, __mmask8 k, __m128 a, __m128 b) https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=vmovss&ig_expand=3807,3081,3082,3084,3083,4837,4838 > > LLVM generates mask & 1 for these intrinsics. > > Hongtao Liu via Gcc-patches 于20

Re: [PATCH] AVX512FP16: Fix wrong code for _mm_mask_f[c]madd.*sch [PR 104978]

2022-03-20 Thread Hongtao Liu via Gcc-patches
On Sat, Mar 19, 2022 at 8:09 AM Hongyu Wang via Gcc-patches wrote: > > Hi, > > For complex scalar intrinsics like _mm_mask_fcmadd_sch, the > mask should be ANDed with 1 to ensure the mask is bound to the lowest byte. > > Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,} and sde. > > Ok for master? > > gcc

Re: [PATCH] AVX512FP16: Fix masm=intel output for vfc?(madd|mul)csh [PR 104977]

2022-03-20 Thread Hongtao Liu via Gcc-patches
On Sat, Mar 19, 2022 at 8:06 AM Hongyu Wang via Gcc-patches wrote: > > Hi, > > This patch fixes a typo in subst for the scalar complex mask_round operand. > > Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,} and sde. > > Ok for master? > Ok. > gcc/ChangeLog: > > PR target/104977 > * c

Re: [PATCH] x86: Correct march=sapphirerapids to base on icelake server

2022-03-18 Thread Hongtao Liu via Gcc-patches
On Fri, Mar 18, 2022 at 11:32 AM Cui,Lili wrote: > > Hi Hongtao, > > This patch corrects march=sapphirerapids to be based on icelake server > and updates sapphirerapids in the documentation. > > OK for master and backport to GCC 11? Ok. > > > gcc/Changelog: > > PR target/104963 >

Re: [PATCH] [i386] Add extra cost for unsigned_load which may have stall forward issue.

2022-03-17 Thread Hongtao Liu via Gcc-patches
On Wed, Mar 16, 2022 at 5:54 PM Richard Biener via Gcc-patches wrote: > > On Wed, Mar 16, 2022 at 3:19 AM liuhongt wrote: > > > > This patch only handles pure-slp for by-value passed parameters, which > > has nothing to do with IPA but psABI. For by-reference passed > > parameters IPA is required. >

Re: [x86 PATCH] PR target/94680: Clear upper bits of V2DF using movq (like V2DI).

2022-03-15 Thread Hongtao Liu via Gcc-patches
On Tue, Mar 15, 2022 at 10:52 PM Roger Sayle wrote: > > > This simple i386 patch unblocks a more significant change. The testcase > gcc.target/i386/sse2-pr94680.c isn't quite testing what's intended, and > alas the fix for PR target/94680 doesn't (yet) handle V2DF mode. > > For the first test fro

Re: [PATCH v2] x86: Also check _SOFT_FLOAT in <x86gprintrin.h>

2022-03-15 Thread Hongtao Liu via Gcc-patches
On Tue, Mar 15, 2022 at 10:40 PM H.J. Lu wrote: > > On Mon, Mar 14, 2022 at 7:31 AM H.J. Lu wrote: > > > > Push target("general-regs-only") in <x86gprintrin.h> if x87 is enabled. > > > > gcc/ > > > > PR target/104890 > > * config/i386/x86gprintrin.h: Also check _SOFT_FLOAT before > > push

Re: [PATCH] i386: Fix up _mm_loadu_si{16,32} [PR99754]

2022-03-14 Thread Hongtao Liu via Gcc-patches
On Mon, Mar 14, 2022 at 8:20 PM Hongtao Liu wrote: > > On Mon, Mar 14, 2022 at 7:25 PM Jakub Jelinek wrote: > > > > On Sun, Mar 13, 2022 at 09:34:10PM +0800, Hongtao Liu wrote: > > > LGTM, thanks for handling this. > > > > Thanks, committed. > >
