On Fri, Sep 23, 2022 at 11:07 AM Hu, Lin1 wrote:
>
> Hi, Hongtao
>
> I have modified this patch and regtested on x86_64-pc-linux-gnu.
>
Ok.
> BRs.
> Lin
>
> -Original Message-
> From: Hongtao Liu
> Sent: Friday, September 23, 2022 9:48 AM
> To: Hu,
On Thu, Sep 22, 2022 at 3:20 PM Hu, Lin1 via Gcc-patches
wrote:
>
> Hi all,
>
> This patch aims to optimize code generation of
> _mm256_zextsi128_si256(_mm_set1_epi8(-1)), reducing the number of
> instructions required to achieve the final result.
>
> Regtested on x86_64-pc-linux-gnu. Ok for tru
On Thu, Sep 22, 2022 at 11:56 PM Jakub Jelinek wrote:
>
> On Tue, Sep 20, 2022 at 10:51:18AM +0200, Jakub Jelinek via Gcc-patches wrote:
> > On Tue, Sep 20, 2022 at 11:35:07AM +0800, Hongtao Liu wrote:
> > > > The question is (mainly for aarch64, arm and x86 backend mai
On Thu, Sep 22, 2022 at 9:17 AM liuhongt wrote:
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Verify 526.blend_r can be rebuilt with the fix.
>
> Ok for trunk?
>
> gcc/ChangeLog:
>
> PR target/106994
> * config/i386/mmx.md (floorv2sf2): Fix typo, use
> reg
On Wed, Sep 21, 2022 at 3:41 PM Richard Biener via Gcc-patches
wrote:
>
> On Wed, Sep 21, 2022 at 1:41 AM liuhongt via Gcc-patches
> wrote:
> >
> > When init_expr is INTEGER_CST or REAL_CST, can_vec_perm_const_p is not
> > necessary since there's no real vec_perm needed, but
> > vec_gen_perm_mask
+My Intel colleague Phoebe, who works on the LLVM side.
On Tue, Sep 20, 2022 at 11:35 AM Hongtao Liu wrote:
>
> On Mon, Sep 12, 2022 at 4:06 PM Jakub Jelinek via Gcc-patches
> wrote:
> >
> > Hi!
> >
> > The following patch implements the compiler part of C++23
> >
On Mon, Sep 12, 2022 at 4:06 PM Jakub Jelinek via Gcc-patches
wrote:
>
> Hi!
>
> The following patch implements the compiler part of C++23
> P1467R9 - Extended floating-point types and standard names compiler part
> by introducing _Float{16,32,64,128} as keywords and builtin types
> like they are
On Tue, Sep 20, 2022 at 10:23 AM liuhongt wrote:
>
> The code in vectorizable_induction for slp_node assumes all phi_info
> have the same induction type (vect_step_op_add), but since we support
> nonlinear induction, it could be handled wrongly.
> So the patch returns false when slp_node has mixed inducti
On Fri, Sep 16, 2022 at 9:38 PM Alexander Monakov via Gcc-patches
wrote:
>
> On Fri, 16 Sep 2022, Uros Bizjak via Gcc-patches wrote:
>
> > On Fri, Sep 16, 2022 at 3:32 AM Jeff Law via Gcc-patches
> > wrote:
> > >
> > >
> > > On 9/15/22 19:06, liuhongt via Gcc-patches wrote:
> > > > There's peepho
On Tue, Sep 20, 2022 at 10:14 AM liuhongt wrote:
>
> Here's the list of functions the patch supports:
> rint/nearbyint/ceil/floor/trunc/lrint/lceil/lfloor/round/lround.
>
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}
> Ok for trunk?
>
> gcc/ChangeLog:
>
> PR target/106910
> * config
On Fri, Sep 16, 2022 at 9:09 AM liuhongt via Gcc-patches
wrote:
>
> There's a peephole2, submitted in the 1990s, which splits cmp mem, 0 into load mem,
> reg + test reg, reg. I don't know the exact reason why GCC does this.
>
> For the latest x86 processors, ciscization should help the processor frontend
> and also codesize, for
On Fri, Sep 16, 2022 at 8:55 AM liuhongt wrote:
>
> For ifloor/lfloor/iceil/lceil/irint/lrint/iround/lround, when the size of
> in_mode is not equal to that of out_mode, the vectorizer doesn't go the internal fn
> way, so that part is still left in ix86_builtin_vectorized_function.
>
> Remove the other builtins and add corr
On Thu, Sep 15, 2022 at 11:36 AM Kong, Lingling via Gcc-patches
wrote:
>
> Hi
>
> The patch fixes vec_init_dup_v16bf by adding correct handling for v16bf mode in
> ix86_expand_vector_init_duplicate.
> Add a testcase with sse2 without avx2.
>
> OK for master?
>
> gcc/ChangeLog:
>
> PR target/10
On Mon, Sep 5, 2022 at 10:44 AM liuhongt wrote:
>
> The zmm version of vcvtps2ph is special: it encodes {sae} in EVEX, but puts the
> round control in the imm. For intrinsic _mm512_cvt_roundps_ph (a,
> imm), imm contains both {sae} and round control, we need to separate
> it in the assembly output since vcvtp
On Fri, Sep 2, 2022 at 4:08 PM Kong, Lingling wrote:
>
> Hi,
>
> I fixed it in a new patch. And added BF vector mode in SUBST_V and
> avx512fmaskhalfmode for @vec_interleave_high.
> Ok for trunk ?
Ok.
>
> > > Hi,
> > >
> > > Handle E_V8BFmode in expand_vec_perm_broadcast_1 and
> > ix86_expand_ve
On Wed, Aug 31, 2022 at 2:52 PM Kong, Lingling via Gcc-patches
wrote:
>
> Hi,
>
> Handle E_V8BFmode in expand_vec_perm_broadcast_1 and
> ix86_expand_vector_init_duplicate.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> PR target/106742
> * config/i386/i386-expand.cc (ix86_expand_vector_in
On Sat, Aug 27, 2022 at 12:51 AM H.J. Lu wrote:
>
> On Mon, Aug 22, 2022 at 7:05 PM Hongtao Liu wrote:
> >
> > On Tue, Aug 23, 2022 at 1:02 AM H.J. Lu wrote:
> > >
> > > On 64-bit Windows, long is 32 bits and can't be used as stride in memory
> >
On Wed, Aug 24, 2022 at 9:15 AM liuhongt wrote:
>
> Since 256-bit vector integer comparison is under TARGET_AVX2,
> and gimple folding for vblendvpd/vblendvps/vpblendvb relies on that.
> Restrict gimple fold condition to TARGET_AVX2.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
>
On Mon, Aug 22, 2022 at 10:16 AM Haochen Jiang via Gcc-patches
wrote:
>
> Hi all,
>
> This patch added __m128bf16/__m256bf16/__m512bf16 type in testcases.
Ok.
>
> BRs,
> Haochen
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/x86_64/abi/bf16/bf16-helper.h:
> Add _m128bf16/m256bf16/_m
On Tue, Aug 23, 2022 at 1:02 AM H.J. Lu wrote:
>
> On 64-bit Windows, long is 32 bits and can't be used as stride in memory
> operand when base is a pointer which is 64 bits. Cast stride to
> __PTRDIFF_TYPE__, instead of long.
Ok.
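
To illustrate the point about stride width, here is a minimal sketch in C. The helper names `row_address` and `ptrdiff_matches_pointer_width` are hypothetical and are not taken from the patch; the only thing borrowed from the discussion is the cast to `__PTRDIFF_TYPE__`, a macro predefined by GCC (and Clang) that names the type behind `ptrdiff_t`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from the patch: compute the address of row
   `row` in a byte buffer.  The stride is cast to __PTRDIFF_TYPE__, which
   is pointer-sized on the common LP64 and LLP64 targets, whereas long is
   only 32 bits wide on 64-bit Windows.  */
static const uint8_t *
row_address (const uint8_t *base, int row, int stride)
{
  return base + (__PTRDIFF_TYPE__) stride * row;
}

/* Returns 1 when the pointer-difference type is as wide as a pointer
   on the target being compiled for.  */
static int
ptrdiff_matches_pointer_width (void)
{
  return sizeof (__PTRDIFF_TYPE__) == sizeof (void *);
}
```

The same arithmetic with a cast to long would be computed in 32 bits on an LLP64 target, which is the mismatch the patch description refers to.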
>
> PR target/106714
> * config/i386/amxtileintrin
On Mon, Aug 22, 2022 at 9:02 AM Hongtao Liu wrote:
>
> On Sat, Aug 20, 2022 at 1:31 AM H.J. Lu wrote:
> >
> > On Thu, Aug 18, 2022 at 5:56 PM Hongtao Liu via Gcc-patches
> > wrote:
> > >
> > > On Thu, Aug 18, 2022 at 3:36 PM Haochen Jiang via Gcc-pa
On Sat, Aug 20, 2022 at 1:31 AM H.J. Lu wrote:
>
> On Thu, Aug 18, 2022 at 5:56 PM Hongtao Liu via Gcc-patches
> wrote:
> >
> > On Thu, Aug 18, 2022 at 3:36 PM Haochen Jiang via Gcc-patches
> > wrote:
> > >
> > > Hi all,
> > >
> > &
On Tue, Aug 16, 2022 at 3:50 PM Kong, Lingling via Gcc-patches
wrote:
>
> Hi,
>
> The patch supports vector init/broadcast/set/extract for the __bf16 type.
> The __bf16 type is a storage type.
>
> OK for master?
Ok.
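
As a rough illustration of what "storage type" means here, the sketch below handles bf16 values purely as 16-bit patterns and does all arithmetic in float. The function names are hypothetical, and the conversion truncates the low 16 bits of the float (round toward zero); that is an assumption made for brevity, not necessarily the rounding used by the hardware or by the patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: keep the upper 16 bits of an IEEE-754 binary32
   value (sign, 8 exponent bits, 7 mantissa bits).  Truncation only.  */
static uint16_t
float_to_bf16_bits (float x)
{
  uint32_t u;
  memcpy (&u, &x, sizeof u);    /* type-pun without aliasing issues */
  return (uint16_t) (u >> 16);
}

/* Hypothetical helper: widen a bf16 bit pattern back to float by
   placing it in the upper half and zero-filling the lower 16 bits.  */
static float
bf16_bits_to_float (uint16_t h)
{
  uint32_t u = (uint32_t) h << 16;
  float x;
  memcpy (&x, &u, sizeof x);
  return x;
}
```

Values are loaded, stored, and moved as 16-bit quantities; any computation goes through the float conversions.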
>
> gcc/ChangeLog:
>
> * config/i386/i386-expand.cc (ix86_expand_sse_movcc):
On Wed, Aug 10, 2022 at 1:42 PM Alexandre Oliva via Gcc-patches
wrote:
>
> On Aug 9, 2022, Alexandre Oliva wrote:
>
> > Ping?
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598276.html
>
> Oops, sorry, I linked to the wrong patch. This is the one I meant to ping:
>
> https://gcc.gnu.or
On Thu, Aug 4, 2022 at 4:19 PM Richard Biener via Gcc-patches
wrote:
>
> On Thu, Aug 4, 2022 at 6:29 AM liuhongt via Gcc-patches
> wrote:
> >
> > For neg, the patch creates a vec_init as [ a, -a, a, -a, ... ] and no
> > vec_step is needed to update the vectorized iv since vf is always a multiple
> > of
On Wed, Aug 3, 2022 at 4:41 PM Kong, Lingling via Gcc-patches
wrote:
>
> Hi,
>
> The old patch had a mistake in `*movbf_internal`; this version disables BFmode constant
> double move in `*movbf_internal`.
LGTM.
>
> Thanks,
> Lingling
>
> > -Original Message-
> > From: Kong, Lingling
> > Sent: Tues
On Wed, Jul 20, 2022 at 3:59 PM Richard Biener via Gcc-patches
wrote:
>
> On Wed, Jul 20, 2022 at 4:20 AM liuhongt wrote:
> >
> > __builtin_cexpi can't be vectorized since there's a gap between it and the
> > vectorized sincos version (in libmvec, it takes a double and two
> > double pointers and return
a.c.
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > Ok for trunk?
>
> OK.
>
> Are there cases left your vectorizer patch handles over this one?
No.
>
> Thanks,
> Richard.
>
> > 2022-07-20 Richard Biener
> > H
On Wed, Jul 20, 2022 at 3:18 PM Uros Bizjak wrote:
>
> On Wed, Jul 20, 2022 at 8:54 AM Hongtao Liu wrote:
> >
> > On Wed, Jul 20, 2022 at 2:18 PM Uros Bizjak wrote:
> > >
> > > On Wed, Jul 20, 2022 at 8:14 AM Uros Bizjak wrote:
> > > >
> >
On Wed, Jul 20, 2022 at 2:18 PM Uros Bizjak wrote:
>
> On Wed, Jul 20, 2022 at 8:14 AM Uros Bizjak wrote:
> >
> > On Wed, Jul 20, 2022 at 4:37 AM Hongtao Liu wrote:
> > >
> > > On Tue, Jul 19, 2022 at 5:37 PM Uros Bizjak wrote:
> > > >
> >
On Tue, Jul 19, 2022 at 5:37 PM Uros Bizjak wrote:
>
> On Tue, Jul 19, 2022 at 8:56 AM Hongtao Liu wrote:
> >
> > On Tue, Jul 19, 2022 at 2:35 PM Uros Bizjak via Gcc-patches
> > wrote:
> > >
> > > On Tue, Jul 19, 2022 at 8:07 AM liuhongt wrote:
On Tue, Jul 19, 2022 at 2:35 PM Uros Bizjak via Gcc-patches
wrote:
>
> On Tue, Jul 19, 2022 at 8:07 AM liuhongt wrote:
> >
> > And split it after reload.
> >
> > > You will need ix86_binary_operator_ok insn constraint here with
> > > corresponding expander using ix86_fixup_binary_operands_no_copy
On Sat, Jul 16, 2022 at 10:08 PM Roger Sayle wrote:
>
>
> This AVX512 specific patch to sse.md is split out from an earlier patch:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596199.html
>
> The new splitters proposed in that patch interfere with AVX512's
> kunpckdq instruction which is
On Fri, Jul 15, 2022 at 1:44 AM H.J. Lu via Gcc-patches
wrote:
>
> When shadow stack is enabled, function with indirect_return attribute
> may return via indirect jump. In this case, we need to disable sibcall
> if caller doesn't have indirect_return attribute and indirect branch
> tracking is en
On Thu, Jul 14, 2022 at 2:11 PM Kong, Lingling via Gcc-patches
wrote:
>
> Hi,
>
> The patch fixes _mm_[u]comixx_{ss,sd} codegen and adds the PF result. These
> intrinsics have changed over time; for example, the old operation of `_mm_comieq_ss` is
> `RETURN ( a[31:0] == b[31:0] ) ? 1 : 0`, and new operation u
On Thu, Jul 14, 2022 at 3:22 PM Uros Bizjak via Gcc-patches
wrote:
>
> On Thu, Jul 14, 2022 at 7:33 AM liuhongt wrote:
> >
> > And split it to GPR-version instruction after reload.
> >
> > > ?r was introduced under the assumption that we want vector values
> > > mostly in vector registers. Curren
On Thu, Jul 14, 2022 at 4:53 PM Hongtao Liu wrote:
>
> On Thu, Jul 14, 2022 at 4:20 PM Richard Biener
> wrote:
> >
> > On Wed, Jul 13, 2022 at 9:34 AM Richard Biener
> > wrote:
> > >
> > > On Wed, Jul 13, 2022 at 6:47 AM Hongtao Liu wrote:
>
On Thu, Jul 14, 2022 at 4:20 PM Richard Biener
wrote:
>
> On Wed, Jul 13, 2022 at 9:34 AM Richard Biener
> wrote:
> >
> > On Wed, Jul 13, 2022 at 6:47 AM Hongtao Liu wrote:
> > >
> > > On Tue, Jul 12, 2022 at 10:12 PM Richard Biener
> > > wrote:
On Tue, Jul 12, 2022 at 10:12 PM Richard Biener
wrote:
>
> On Tue, Jul 12, 2022 at 6:11 AM Hongtao Liu wrote:
> >
> > On Mon, Jul 11, 2022 at 7:47 PM Richard Biener via Gcc-patches
> > wrote:
> > >
> > > On Mon, Jul 11, 2022 at 5:44 AM liuhongt wro
On Mon, Jul 11, 2022 at 4:03 PM Uros Bizjak via Gcc-patches
wrote:
>
> On Mon, Jul 11, 2022 at 3:15 AM liuhongt wrote:
> >
> > And split it to GPR-version instruction after reload.
> >
> > This will enable below optimization for 16/32/64-bit vector bit_op
> >
> > - movd(%rdi), %xmm0
> >
On Mon, Jul 11, 2022 at 7:47 PM Richard Biener via Gcc-patches
wrote:
>
> On Mon, Jul 11, 2022 at 5:44 AM liuhongt wrote:
> >
> > The patch only handles load/store (including ctor/permutation, except
> > gather/scatter) for complex types; other operations don't need to be
> > handled since they wi
> This revised patch has been tested on x86_64-pc-linux-gnu with make
> bootstrap and make -k check, both with and without --target_board=unix{-32},
> with no new failures. Is this revised version Ok for mainline?
Ok.
>
>
> 2022-07-04 Roger Sayle
> Hongtao Liu
>
On Fri, Jul 1, 2022 at 10:12 AM Hongtao Liu wrote:
>
> On Fri, Jul 1, 2022 at 2:42 AM Roger Sayle wrote:
> >
> >
> > This patch is a follow-up to Hongtao's fix for PR target/105854. That
> > fix is perfectly correct, but the thing that caught my eye was w
I think this can be taken as an obvious fix without prior approval.
"Obvious fixes can be committed without prior approval. Just check in
the fix and copy it to gcc-patches."
Quoted from https://gcc.gnu.org/gitwrite.html
On Fri, Jul 1, 2022 at 10:02 AM Haochen Jiang via Gcc-patches
wrote:
>
> Hi
On Fri, Jul 1, 2022 at 2:42 AM Roger Sayle wrote:
>
>
> This patch is a follow-up to Hongtao's fix for PR target/105854. That
> fix is perfectly correct, but the thing that caught my eye was why is
> the compiler generating a shift by zero at all. Digging deeper it
> turns out that we can easily
this.
Yes for the case in your patch, I think it's a typo.
But there could be some difference for operand modifiers between AT&T
and Intel syntaxes in some patterns.
i.e., the use of mode attr.
>
> On Tue, 2022-06-28 at 14:22 +0800, Hongtao Liu wrote:
> > On Tue, J
On Tue, Jun 28, 2022 at 9:26 AM ~antoyo via Gcc-patches
wrote:
>
> Hi.
>
> This fixes the following bug:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106095
The patch LGTM, thanks for handling this.
>
> It's the first time I work outside of the jit component, so please tell
> me if I forgot anyt
On Tue, Jun 21, 2022 at 3:50 AM Uros Bizjak via Gcc-patches
wrote:
>
> On Mon, Jun 20, 2022 at 8:14 PM H.J. Lu wrote:
> >
> > On Tue, May 10, 2022 at 9:25 AM H.J. Lu wrote:
> > >
> > > Mark a function with SYMBOL_FLAG_FUNCTION_ENDBR when inserting ENDBR at
> > > function entry. Skip the 4-byte
On Sat, Jun 11, 2022 at 1:46 AM H.J. Lu wrote:
>
> On Fri, Jun 10, 2022 at 7:44 AM H.J. Lu wrote:
> >
> > On Fri, Jun 10, 2022 at 2:38 AM Florian Weimer wrote:
> > >
> > > * liuhongt via Libc-alpha:
> > >
> > > > +\subsubsection{Special Types}
> > > > +
> > > > +The \code{__Bfloat16} type uses a
On Fri, Jun 10, 2022 at 4:45 PM Cui,Lili via Gcc-patches
wrote:
>
> This patch is to change dg-options for two testcases.
>
> Use -mtune=generic to limit these two testcases. Because configuring them with
> -mtune=cascadelake or znver3 will vectorize them.
>
> regtested on x86_64-linux-gnu{-m32,}.
On Fri, Jun 10, 2022 at 3:47 PM liuhongt via Libc-alpha
wrote:
>
> Pass and return __Bfloat16 values in XMM registers.
>
> Background:
> __Bfloat16 (BF16) is a new floating-point format that can accelerate machine
> learning (deep learning training, in particular) algorithms.
> It's first introdu
On Wed, Jun 8, 2022 at 11:44 AM Cui, Lili wrote:
>
> > -Original Message-
> > From: Hongtao Liu
> > Sent: Monday, June 6, 2022 1:25 PM
> > To: H.J. Lu
> > Cc: Cui, Lili ; Liu, Hongtao ;
> > GCC
> > Patches
> > Subject: Re: [PATCH] U
On Tue, Jun 7, 2022 at 3:41 PM liuhongt via Gcc-patches
wrote:
>
> So alternative v won't be ignored in record_reg_classes.
>
> Similar for *r alternatives in some vector patterns.
>
> It helps testcase in the PR, also RA now makes better decisions for
> gcc.target/i386/extract-insert-combining.c
On Wed, Jun 1, 2022 at 11:56 PM H.J. Lu via Gcc-patches
wrote:
>
> On Tue, May 31, 2022 at 10:06 PM Cui,Lili wrote:
> >
> > This patch is to update {skylake,icelake,alderlake}_cost to add a bit
> > preference to vector store.
> > Since the integer vector construction cost has changed, we need t
On Mon, Jun 6, 2022 at 3:17 AM Uros Bizjak via Gcc-patches
wrote:
>
> On Thu, Jun 2, 2022 at 5:04 PM Jan Beulich wrote:
> >
> > The 64-bit, 128-bit, and 512-bit variants have VDI return type, in
> > line with instruction behavior. Make the 256-bit builtin match, thus
> > also making it match the
On Thu, Jun 2, 2022 at 2:24 PM Roger Sayle wrote:
>
>
> This patch resolves PR target/105791 which is a regression that was
> accidentally introduced for my workaround to PR tree-optimization/10566.
> (a deeper problem in GCC's vectorizer creating VEC_COND_EXPR when it
> shouldn't). The latest is
On Wed, Jun 1, 2022 at 12:40 AM Richard Sandiford
wrote:
>
> Vladimir Makarov via Gcc-patches writes:
> > On 2022-05-29 23:05, Hongtao Liu wrote:
> >> On Fri, May 27, 2022 at 5:12 AM Vladimir Makarov via Gcc-patches
> >> wrote:
> >>>
> >>>
On Mon, May 30, 2022 at 3:44 PM Alexander Monakov wrote:
>
> On Mon, 30 May 2022, Hongtao Liu wrote:
>
> > On Mon, May 30, 2022 at 2:22 PM Alexander Monakov via Gcc-patches
> > wrote:
> > > >
> > > > The spill is mainly decided by 3 insns related
On Mon, May 30, 2022 at 2:22 PM Alexander Monakov via Gcc-patches
wrote:
>
> > > In the PR, the spill happens in the initial basic block of the function,
> > > i.e.
> > > the one with the highest frequency.
> > >
> > > Also as noted in the PR, swapping the 'unlikely' branch to 'likely'
> > > avo
On Fri, May 27, 2022 at 5:12 AM Vladimir Makarov via Gcc-patches
wrote:
>
>
> On 2022-05-24 23:39, liuhongt wrote:
> > Right now, mem_cost for separate mem alternative is 1 * frequency which
> > is pretty small and caused the unnecessary SSE spill in the PR, I've tried
> > to rework backend cost mo
On Wed, May 25, 2022 at 11:39 AM liuhongt via Gcc-patches
wrote:
>
> Right now, mem_cost for separate mem alternative is 1 * frequency which
> is pretty small and caused the unnecessary SSE spill in the PR, I've tried
> to rework backend cost model, but RA still not happy with that(regress
> somewh
On Tue, May 17, 2022 at 6:07 PM Uros Bizjak via Gcc-patches
wrote:
>
> On Tue, May 17, 2022 at 5:06 AM liuhongt wrote:
> >
> > backend has
> >
> > (define_insn "*bmi2_bzhi_3_2"
> >   [(set (match_operand:SWI48 0 "register_operand" "=r")
> >         (and:SWI48
> >           (pl
On Tue, May 17, 2022 at 6:03 PM Uros Bizjak wrote:
>
> On Tue, May 17, 2022 at 3:33 AM Hongtao Liu wrote:
> >
> > On Mon, May 16, 2022 at 5:21 PM Uros Bizjak via Gcc-patches
> > wrote:
> > >
> > > On Sat, May 7, 2022 at 7:05 AM liuhongt wrote:
>
On Fri, May 13, 2022 at 7:16 PM Richard Biener
wrote:
>
> On Fri, May 13, 2022 at 5:37 AM Hongtao Liu wrote:
> >
> > On Wed, May 11, 2022 at 4:45 PM Richard Biener via Gcc-patches
> > wrote:
> > >
> > > On Mon, May 9, 2022 at 7:19 AM liuhongt wrote:
>
thanks.
On Tue, May 17, 2022 at 3:09 PM Jakub Jelinek via Gcc-patches
wrote:
>
> Hi!
>
> When looking around the spot of the PR105591 fix, I've noticed a typo
> and incorrectly formatted comment.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, committed to
> trunk as obvious.
>
> 2022-
On Tue, May 17, 2022 at 11:06 AM liuhongt via Gcc-patches
wrote:
>
> backend has
>
> (define_insn "*bmi2_bzhi_3_2"
>   [(set (match_operand:SWI48 0 "register_operand" "=r")
>         (and:SWI48
>           (plus:SWI48
>             (ashift:SWI48 (const_int 1)
>
I've committed the patch.
On Fri, May 13, 2022 at 5:22 PM liuhongt via Gcc-patches
wrote:
>
> Here's updated patch which adds ix86_pre_reload_split () to those 2
> define_insn_and_splits.
>
> Assembly Optimization like:
> - vmovq %xmm0, %xmm2
> - vmovdqa .LC0(%rip), %xmm0
>
On Mon, May 16, 2022 at 5:21 PM Uros Bizjak via Gcc-patches
wrote:
>
> On Sat, May 7, 2022 at 7:05 AM liuhongt wrote:
> >
> > This is adjusted patch only for OImode.
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> > PR targe
ping.
On Sat, May 7, 2022 at 1:05 PM liuhongt via Gcc-patches
wrote:
>
> This is adjusted patch only for OImode.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> PR target/104610
> * config/i386/i386-expand.cc (ix86_expand_branch
On Wed, May 11, 2022 at 4:45 PM Richard Biener via Gcc-patches
wrote:
>
> On Mon, May 9, 2022 at 7:19 AM liuhongt wrote:
> >
> > This patch will enable below optimization:
> >
> > {
> > - int bit;
> > - long long unsigned int _1;
> > - long long unsigned int _2;
> > -
> > [local count: 46
On Tue, May 10, 2022 at 2:54 PM Richard Biener via Gcc-patches
wrote:
>
> On Mon, May 9, 2022 at 7:11 AM liuhongt via Gcc-patches
> wrote:
> >
> > Here's the adjusted patch.
> > Ok for trunk?
> >
> > Optimize
> >
> > _4 = VEC_PERM_EXPR <_1, _1, { 4, 5, 6, 7, 4, 5, 6, 7 }>;
> > _5 = BIT_FIELD_REF <
On Mon, May 9, 2022 at 4:28 PM Uros Bizjak wrote:
>
> On Mon, May 9, 2022 at 4:03 AM liuhongt wrote:
> >
> > Similarly optimize movl + vmovq to vmovd.
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> > PR target/104915
> >
On Mon, May 9, 2022 at 4:19 PM Uros Bizjak wrote:
>
> On Mon, May 9, 2022 at 7:24 AM Hongtao Liu wrote:
> >
> > On Mon, May 9, 2022 at 1:22 PM liuhongt via Gcc-patches
> > wrote:
> > >
> > > pand/pandn may be used to clear upper/lower bits of the oper
On Mon, May 9, 2022 at 2:43 PM liuhongt via Gcc-patches
wrote:
>
> Clean up of 16-bit uppers is not needed for pmovzxbq/pmovsxbq.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> PR target/105072
> * config/i386/sse.md (*sse4_1_v2
On Mon, May 9, 2022 at 1:22 PM liuhongt via Gcc-patches
wrote:
>
> pand/pandn may be used to clear upper/lower bits of the operands, in
> that case there will be 4-5 instructions for permutation, and it's
> still better than scalar codes.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,
On Sat, May 7, 2022 at 1:05 PM liuhongt via Gcc-patches
wrote:
>
> This is adjusted patch only for OImode.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> PR target/104610
> * config/i386/i386-expand.cc (ix86_expand_branch): Use
On Thu, May 5, 2022 at 4:09 PM Uros Bizjak via Gcc-patches
wrote:
>
> On Thu, May 5, 2022 at 9:50 AM Richard Biener via Gcc-patches
> wrote:
> >
> > On Thu, May 5, 2022 at 9:37 AM liuhongt via Gcc-patches
> > wrote:
> > >
> > > Enable optimization for TImode only under 32-bit target, for 64-bit
On Thu, May 5, 2022 at 3:37 PM liuhongt wrote:
>
> Enable optimization for TImode only under 32-bit target, for 64-bit
> target there could be extra integer <-> sse move regarding psABI,
> not efficient.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}
> Ok for trunk?
>
> gcc/ChangeLo
On Fri, Apr 22, 2022 at 8:43 PM Hongyu Wang wrote:
>
> > Please add the corresponding intrinsic test in sse-14.c
>
> Sorry for forgetting this part. Updated patch. Thanks.
>
LGTM.
> Hongtao Liu via Gcc-patches wrote on Fri, Apr 22, 2022 at 16:49:
> >
> > On Fri, Apr 22, 20
On Fri, Apr 22, 2022 at 4:12 PM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> Add missing macro under O0 and adjust macro format for scalf
> intrinsics.
>
Please add the corresponding intrinsic test in sse-14.c.
> Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,}.
>
> Ok for master and backpor
On Wed, Apr 6, 2022 at 5:56 AM Roger Sayle wrote:
>
>
>
> This simple patch allows the i386 backend to generate pandn instructions
>
> for V1TI mode. Currently, the testcase:
>
>
>
> typedef unsigned __int128 v1ti __attribute__ ((__vector_size__ (16)));
>
> v1ti andnot1(v1ti x, v1ti y) { return ~
On Fri, Apr 1, 2022 at 4:32 PM liuhongt via Gcc-patches
wrote:
>
> Update in V3:
> 1. Add -param=x86-stlf-window-ninsns= (default 64).
> 2. Exclude call in the window.
>
> Since cfg is freed before machine_reorg, just do a rough calculation
> of the window according to the layout.
> Also according
On Fri, Apr 1, 2022 at 2:54 PM Richard Biener via Gcc-patches
wrote:
>
> On Fri, Apr 1, 2022 at 8:47 AM liuhongt via Gcc-patches
> wrote:
> >
> > Update in V2:
> > 1. Use get_insns instead of FOR_EACH_BB_CFUN and FOR_BB_INSNS.
> > 2. Return for any_uncondjump_p and ANY_RETURN_P.
> > 3. Add dump i
On Thu, Mar 31, 2022 at 6:45 PM Richard Biener via Gcc-patches
wrote:
>
> On Thu, Mar 31, 2022 at 7:51 AM liuhongt wrote:
> >
> > Since cfg is freed before machine_reorg, just do a rough calculation
> > of the window according to the layout.
> > Also according to an experiment on CLX, set window
On Sat, Mar 26, 2022 at 10:05 AM Hongyu Wang via Gcc-patches
wrote:
>
> > > Is it possible to create a test case that gas would throw an error for
> > > invalid operands?
> >
> > You can use -ffix-xmmN to disable XMM0-15.
>
> I mean can we create an intrinsic test for this PR that produces xmm16-3
On Sat, Mar 26, 2022 at 1:27 AM H.J. Lu via Gcc-patches
wrote:
>
> Since PHADDW/PHADDD/PHADDSW/PHSUBW/PHSUBD/PHSUBSW/PSIGNB/PSIGNW/PSIGND
> have no AVX512 version, replace the "Yv" register constraint with the
> "x" register constraint.
LGTM, please backport to GCC10/GCC11 branch.
>
> PR t
On Sat, Mar 26, 2022 at 4:50 AM H.J. Lu via Gcc-patches
wrote:
>
> Since KL instructions have no AVX512 version, replace the "v" register
> constraint with the "x" register constraint.
>
> PR target/105058
> * config/i386/sse.md (loadiwkey): Replace "v" with "x".
> (aesu8):
On Fri, Mar 25, 2022 at 9:42 PM Richard Biener wrote:
>
> On Fri, 25 Mar 2022, Hongtao Liu wrote:
>
> > On Fri, Mar 25, 2022 at 8:11 PM Richard Biener via Gcc-patches
> > wrote:
> > >
> > > Since we're now vectorizing by default at -O2 issues like P
On Fri, Mar 25, 2022 at 8:11 PM Richard Biener via Gcc-patches
wrote:
>
> Since we're now vectorizing by default at -O2 issues like PR101908
> become more important where we apply basic-block vectorization to
> parts of the function covering loads from function parameters passed
> on the stack. S
On Wed, Mar 23, 2022 at 2:05 PM liuhongt via Gcc-patches
wrote:
>
> In validate_subreg, both (subreg:V2HF (reg:SI) 0)
> and (subreg:V8HF (reg:V2HF) 0) are valid, but not
> for (subreg:V8HF (reg:SI) 0) which causes ICE.
>
> Ideally it should be handled in validate_subreg to support
> subreg for all
On Mon, Mar 21, 2022 at 9:06 PM liuhongt wrote:
>
> Failed to match this instruction:
> (set (reg/v:SI 88 [ z ])
> (if_then_else:SI (eq (zero_extract:SI (reg:SI 92)
> (const_int 1 [0x1])
> (zero_extend:SI (subreg:QI (reg:SI 93) 0)))
> (const_int 0 [0
On Mon, Mar 21, 2022 at 7:52 PM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> For complex scalar intrinsics like _mm_mask_fcmadd_sch, the
> mask should be ANDed with 1 to ensure the mask is bound to the lowest bit.
> Use masked vmovss to perform the same operation, which ignores the higher bits
> of the mask.
>
> Bootst
m_mask_move_ss (__m128 src, __mmask8 k, __m128 a, __m128 b)
https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=vmovss&ig_expand=3807,3081,3082,3084,3083,4837,4838
>
> LLVM generates mask & 1 for these intrinsics.
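
A plain-C model may make the `mask & 1` point concrete. The sketch below mirrors the operand order of `_mm_mask_move_ss (src, k, a, b)` quoted above but uses an ordinary struct instead of `__m128`; the type `vec4f` and the function `model_mask_move_ss` are hypothetical names for illustration, not GCC or Intel APIs.

```c
#include <stdint.h>

/* Plain-C stand-in for a 128-bit vector of four floats.  */
typedef struct { float f[4]; } vec4f;

/* Model of a scalar masked move: element 0 is taken from b when bit 0
   of k is set and from src otherwise; elements 1..3 come from a.
   Only (k & 1) is consulted, so the higher mask bits have no effect.  */
static vec4f
model_mask_move_ss (vec4f src, uint8_t k, vec4f a, vec4f b)
{
  vec4f dst = a;
  dst.f[0] = (k & 1) ? b.f[0] : src.f[0];
  return dst;
}
```

With `k = 0xFE` the model picks `src.f[0]`, exactly as with `k = 0`, which is the behaviour the scalar intrinsics are expected to have.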
>
> Hongtao Liu via Gcc-patches wrote on 20
On Sat, Mar 19, 2022 at 8:09 AM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> For complex scalar intrinsics like _mm_mask_fcmadd_sch, the
> mask should be ANDed with 1 to ensure the mask is bound to the lowest bit.
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,} and sde.
>
> Ok for master?
>
> gcc
On Sat, Mar 19, 2022 at 8:06 AM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> This patch fixes typo in subst for scalar complex mask_round operand.
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,} and sde.
>
> Ok for master?
>
Ok.
> gcc/ChangeLog:
>
> PR target/104977
> * c
On Fri, Mar 18, 2022 at 11:32 AM Cui,Lili wrote:
>
> Hi Hongtao,
>
> This patch corrects march=sapphirerapids to be based on icelake server
> and updates sapphirerapids in the documentation.
>
> OK for master and backport to GCC 11?
Ok.
>
>
> gcc/Changelog:
>
> PR target/104963
>
On Wed, Mar 16, 2022 at 5:54 PM Richard Biener via Gcc-patches
wrote:
>
> On Wed, Mar 16, 2022 at 3:19 AM liuhongt wrote:
> >
> > This patch only handles pure-slp for by-value passed parameter which
> > has nothing to do with IPA but psABI. For by-reference passed
> > parameter IPA is required.
>
On Tue, Mar 15, 2022 at 10:52 PM Roger Sayle wrote:
>
>
> This simple i386 patch unblocks a more significant change. The testcase
> gcc.target/i386/sse2-pr94680.c isn't quite testing what's intended, and
> alas the fix for PR target/94680 doesn't (yet) handle V2DF mode.
>
> For the first test fro
On Tue, Mar 15, 2022 at 10:40 PM H.J. Lu wrote:
>
> On Mon, Mar 14, 2022 at 7:31 AM H.J. Lu wrote:
> >
> > Push target("general-regs-only") in <x86gprintrin.h> if x87 is enabled.
> >
> > gcc/
> >
> > PR target/104890
> > * config/i386/x86gprintrin.h: Also check _SOFT_FLOAT before
> > push
On Mon, Mar 14, 2022 at 8:20 PM Hongtao Liu wrote:
>
> On Mon, Mar 14, 2022 at 7:25 PM Jakub Jelinek wrote:
> >
> > On Sun, Mar 13, 2022 at 09:34:10PM +0800, Hongtao Liu wrote:
> > > LGTM, thanks for handling this.
> >
> > Thanks, committed.
> >