https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517
--- Comment #6 from Hongtao Liu ---
(In reply to rguent...@suse.de from comment #5)
> On Tue, 18 Jun 2024, liuhongt at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517
> >
> > --- Comment #4 from Hongtao Liu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517
--- Comment #4 from Hongtao Liu ---
(In reply to rguent...@suse.de from comment #3)
> On Tue, 18 Jun 2024, liuhongt at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517
> >
> > --- Comment #2 from Hongtao Liu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517
--- Comment #2 from Hongtao Liu ---
(In reply to Richard Biener from comment #1)
> Btw, I had opened PR115490 with my results for this already. Some mitigation
> should be from optimizing ISEL expansion to vcond_mask and I'd start with
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517
Bug ID: 115517
Summary: Fix regression after dropping uses of
vcond{,u,eq}_optab
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115021
--- Comment #5 from Hongtao Liu ---
It's fixed by r15-1100-gec985bc97a0157
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115463
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115462
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115452
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115452
Bug ID: 115452
Summary: ICE when dump stv2 for gcc.target/i386/pr70322-2.c
with -march=cascadelake
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115384
Hongtao Liu changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115365
--- Comment #7 from Hongtao Liu ---
+/* { dg-final { scan-rtl-dump-times {(?n)^(?!.*REG_EQUIV)(?=.*\(fix:SI)} 3
"final" } } */
Does this fix the testcase on solaris2?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115418
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115365
Hongtao Liu changed:
What|Removed |Added
Target|powerpc64le-linux-gnu, |powerpc64le-linux-gnu,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406
--- Comment #6 from Hongtao Liu ---
For a 1-element vector, when the backend doesn't support its vector mode, the scalar
mode is used for the type, which makes expand_vec_cond_expr_p use QImode for
the icode check (vcond_mask_qiqi).
It could also be the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406
--- Comment #5 from Hongtao Liu ---
> _2 = VEC_COND_EXPR <_1, { -1 }, { 0 }>;
Hmm, it should check vcond_mask_qiv1qi instead of vcond_mask_qiqi. I guess
since the backend doesn't support v1qi, TYPE_MODE of V is QImode, then it
wrongly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406
--- Comment #4 from Hongtao Liu ---
>
> and for _2 = VIEW_CONVERT_EXPR(_1); we explicitly
> clear the upper bits due to PR113576, and then we get 1 hit the abort.
It's not VIEW_CONVERT_EXPR that clears the upper bits, but _1 = { -1 };
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406
--- Comment #3 from Hongtao Liu ---
typedef __attribute__((__vector_size__ (1))) char V;
char
foo (V v)
{
return ((V) v == v)[0];
}
int
main ()
{
char x = foo ((V) { });
if (x != -1)
__builtin_abort ();
}
w/ vcond_mask_qiqi, it's
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406
Hongtao Liu changed:
What|Removed |Added
Status|NEW |ASSIGNED
--- Comment #2 from Hongtao Liu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115384
Hongtao Liu changed:
What|Removed |Added
Status|NEW |ASSIGNED
--- Comment #3 from Hongtao Liu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115334
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115365
--- Comment #5 from Hongtao Liu ---
(In reply to Rainer Orth from comment #4)
> Unfortunately, the fix broke 32-bit Solaris/SPARC in exchange:
>
> FAIL: gcc.dg/pr100927.c scan-rtl-dump-times final "(?n)(fix:SI" 3
>
/* { dg-final {
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115370
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115369
Bug ID: 115369
Summary: ifcvt failed to condition elimination
for __builtin_mul_overflow
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
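For readers unfamiliar with the builtin named in this summary, here is a minimal sketch of the kind of branch that if-conversion would ideally turn into straight-line code. The function `mul_or_saturate` and its saturation policy are illustrative assumptions, not the testcase from the report; only `__builtin_mul_overflow` itself is a documented GCC builtin.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical example: multiply two ints and clamp on overflow.
   __builtin_mul_overflow returns true when the mathematically exact
   product does not fit in *result. */
static int mul_or_saturate(int a, int b)
{
  int result;
  if (__builtin_mul_overflow(a, b, &result))
    /* Operands of different sign overflow towards INT_MIN,
       operands of the same sign overflow towards INT_MAX. */
    return ((a < 0) != (b < 0)) ? INT_MIN : INT_MAX;
  return result;
}

/* Small self-check showing the intended behaviour. */
static void mul_or_saturate_self_check(void)
{
  assert(mul_or_saturate(6, 7) == 42);
  assert(mul_or_saturate(INT_MAX, 2) == INT_MAX);
  assert(mul_or_saturate(INT_MIN, 2) == INT_MIN);
}
```

Whether the branch on the overflow flag is removed depends on the target and on the ifcvt cost model, which is what the report is about.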
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43618
Hongtao Liu changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43618
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115365
--- Comment #1 from Hongtao Liu ---
pr100927.c.349r.final:(fix:SI (reg:SF 32 0 [120])))
"../../gcc/intel-innersource/pr115365/gcc/testsuite/gcc.dg/pr100927.c":12:10
428 {*fix_truncsfsi2_p8}
pr100927.c.349r.final:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114428
Hongtao Liu changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115351
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115341
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115334
--- Comment #2 from Hongtao Liu ---
diff --git a/gcc/testsuite/gcc.dg/vect/pr112325.c
b/gcc/testsuite/gcc.dg/vect/pr112325.c
index dea6cca3b86..143903beab2 100644
--- a/gcc/testsuite/gcc.dg/vect/pr112325.c
+++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115334
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115299
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113609
Hongtao Liu changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115299
--- Comment #2 from Hongtao Liu ---
> Maybe r14-53-g675b1a7f113adb .
Probably; the current cost model may need adjustment.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115299
Bug ID: 115299
Summary: [14 regression] pr86722.c failed to eliminate branch.
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: missed-optimization, needs-bisection
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114125
Hongtao Liu changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84508
--- Comment #26 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #25)
> (In reply to Peter Cordes from comment #22)
> > Why are we adding an alignment requirement to _mm_storel_pd, the intrinsic
> > for MOVLPD?
> >
> From Intel
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84508
--- Comment #25 from Hongtao Liu ---
(In reply to Peter Cordes from comment #22)
> Why are we adding an alignment requirement to _mm_storel_pd, the intrinsic
> for MOVLPD?
>
From the Intel intrinsics guide[1], there's an explicit "mem_addr does not
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 112325, which changed state.
Bug 112325 Summary: Missed vectorization of reduction after unrolling
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112325
What|Removed |Added
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112325
Hongtao Liu changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67325
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115146
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115161
--- Comment #25 from Hongtao Liu ---
(In reply to Jakub Jelinek from comment #17)
> I don't think the cost of using UNSPEC would be significant if the backend
> tried to constant fold more target builtins. Anyway, with the proposed
> changes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114148
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114148
--- Comment #4 from Hongtao Liu ---
(In reply to r...@cebitec.uni-bielefeld.de from comment #3)
> To investigate further, I've added comparison functions to a reduced
> version of pr106010-7b.c, with
>
> void
> cmp_epi8 (_Complex unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115161
--- Comment #16 from Hongtao Liu ---
>
> That said, this change really won't help the backend which supposedly should
> have the same behavior regardless of -fno-trapping-math, because in that
> case it is the value
> of the result (which is
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115069
Hongtao Liu changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115161
--- Comment #11 from Hongtao Liu ---
(In reply to Jakub Jelinek from comment #10)
> Any of the floating point to integer intrinsics if they have out of range
> value (haven't checked whether floating point to unsigned intrinsic is a
> problem
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114427
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115021
--- Comment #4 from Hongtao Liu ---
(In reply to Hu Lin from comment #3)
> I found compiler allocates mem to the third source register of vpternlog in
> IRA after commit f55cdce3f8dd8503e080e35be59c5f5390f6d95e. And it cause the
> generate code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115069
--- Comment #16 from Hongtao Liu ---
> Should we also run a SPEC on with -O2 -mtune=generic -march=x86-64-v3 to see
> if there is any surprise?
Sure, I guess no.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115069
--- Comment #14 from Hongtao Liu ---
(In reply to Uroš Bizjak from comment #13)
> (In reply to Haochen Jiang from comment #12)
> > (In reply to Hongtao Liu from comment #11)
> > > (In reply to Haochen Jiang from comment #10)
> > > > A patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115146
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115069
--- Comment #11 from Hongtao Liu ---
(In reply to Haochen Jiang from comment #10)
> A patch like Comment 8 could definitely solve the problem. But I need to
> test more benchmarks to see if there is surprise.
>
> But, yes, as Uros said in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115069
--- Comment #5 from Hongtao Liu ---
(In reply to Krzysztof Kanas from comment #4)
> I bisected the issue and it seems that commit
> 0368fc54bc11f15bfa0ed9913fd0017815dfaa5d introduces regression.
I guess the real guilty commit is
commit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115116
Bug ID: 115116
Summary: [x86] rtx_cost is overestimated for big size memory.
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114514
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115115
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115101
Bug ID: 115101
Summary: [wrong code] with -O1 -floop-nest-optimize for
gcc.dg/graphite/interchange-8.c
Product: gcc
Version: 15.0
Status: UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101017
Hongtao Liu changed:
What|Removed |Added
CC||haochen.jiang at intel dot com
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114987
--- Comment #6 from Hongtao Liu ---
> I tried to move "vmovdqa %xmm1,0xd0(%rsp)" before "vmovdqa %xmm0,0xe0(%rsp)"
> and rebuilt the binary and it will save half the regression.
57.93 │200: vaddps 0xc0(%rsp),%ymm3,%ymm5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115021
Bug ID: 115021
Summary: [14/15 regression] unnecessary spill for vpternlog
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84508
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113090
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113079
Hongtao Liu changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114943
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114907
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114883
--- Comment #10 from Hongtao Liu ---
(In reply to Jakub Jelinek from comment #9)
> Created attachment 58073 [details]
> gcc14-pr114883.patch
>
> Full untested patch.
This will fix 521.wrf_r ICE, and pass runtime validation.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114883
--- Comment #5 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #4)
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index a6cf0a5546c..ae6abe00f3e 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114883
--- Comment #4 from Hongtao Liu ---
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index a6cf0a5546c..ae6abe00f3e 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -8505,7 +8505,8 @@ vect_transform_reduction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114883
--- Comment #3 from Hongtao Liu ---
Created attachment 58066
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58066&action=edit
reproduced testcase
gfortran -O2 -march=x86-64-v4 -fvect-cost-model=cheap.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114883
--- Comment #2 from Hongtao Liu ---
(In reply to Andrew Pinski from comment #1)
> Can you reduce the fortran code down for the ICE? It should not be hard, you
> can use delta even.
Let me try.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114883
Bug ID: 114883
Summary: 521.wrf_r ICE with -O2 -march=sapphirerapids
-fvect-cost-model=cheap
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110621
Hongtao Liu changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85048
--- Comment #16 from Hongtao Liu ---
(In reply to Matthias Kretz (Vir) from comment #15)
> So it seems that if at least one of the vector builtins involved in the
> expression is 512 bits GCC needs to locally increase prefer-vector-width to
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85048
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82731
--- Comment #7 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #4)
> (In reply to Hongtao Liu from comment #3)
> > Looks like ix86_vect_estimate_reg_pressure doesn't work here, taking a look.
>
> Oh, ix86_vect_estimate_reg_pressure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82731
--- Comment #4 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #3)
> Looks like ix86_vect_estimate_reg_pressure doesn't work here, taking a look.
Oh, ix86_vect_estimate_reg_pressure is only for loops; the BB vectorizer only uses
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82731
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114591
--- Comment #16 from Hongtao Liu ---
>
> 4952 /* See if a MEM has already been loaded with a widening operation;
> 4953 if it has, we can use a subreg of that. Many CISC machines
> 4954 also have such operations, but
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114591
--- Comment #15 from Hongtao Liu ---
> I don't see this as problematic. IIRC, there was a discussion in the past
> that a couple (two?) memory accesses from the same location close to each
> other can be faster (so, -O2, not -Os) than
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110027
--- Comment #19 from Hongtao Liu ---
(In reply to Jakub Jelinek from comment #17)
> Both of the posted patches are incorrect, this needs to be fixed in
> asan_emit_stack_protection, account for the different offsets[0] which
> happens when a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114591
--- Comment #12 from Hongtao Liu ---
short a;
short c;
short d;
void
foo (short b, short f)
{
c = b + a;
d = f + a;
}
foo(short, short):
        addw    a(%rip), %di
        addw    a(%rip), %si
        movw    %di, c(%rip)
        movw
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114591
--- Comment #11 from Hongtao Liu ---
unsigned v;
long long v2;
char foo ()
{
v2 = v;
return v;
}
This is related to *movqi_internal, and codegen has been worse since GCC 8.1
foo:
        movl    v(%rip), %eax
        movq    %rax,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114591
--- Comment #9 from Hongtao Liu ---
>
> It looks that different modes of memory read confuse LRA to not CSE the read.
>
> IMO, if the preloaded value is later accessed in different modes, LRA should
> leave it. Alternatively, LRA should CSE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114591
--- Comment #5 from Hongtao Liu ---
> My experience is memory cost for the operand with rm or separate r, m is
> different which impacts RA decision.
>
> https://gcc.gnu.org/pipermail/gcc-patches/2022-May/595573.html
Change operands[1]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114591
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66862
Hongtao Liu changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113288
Hongtao Liu changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114544
--- Comment #3 from Hongtao Liu ---
<__umodti3>:
...
58: 66 48 0f 6e c7 movq %rdi,%xmm0
5d: 66 48 0f 6e d6 movq %rsi,%xmm2
62: 66 0f 6c c2 punpcklqdq %xmm2,%xmm0
66:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114570
Bug ID: 114570
Summary: GCC doesn't perform good loop invariant code motion
for very long vector operations.
Product: gcc
Version: 14.0
Status: UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114556
Bug ID: 114556
Summary: weird loop unrolling when there's attribute aligned in
side the loop
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114544
--- Comment #2 from Hongtao Liu ---
Also for
void
foo2 (v128_t* a, v128_t* b)
{
c = (*a & *b)+ *b;
}
(insn 9 8 10 2 (set (reg:V1TI 108 [ _3 ])
(and:V1TI (reg:V1TI 99 [ _2 ])
(mem:V1TI (reg:DI 113) [1 *a_6(D)+0 S16
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114544
--- Comment #1 from Hongtao Liu ---
;; Turn SImode or DImode extraction from arbitrary SSE/AVX/AVX512F
;; vector modes into vec_extract*.
(define_split
  [(set (match_operand:SWI48x 0 "nonimmediate_operand")
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114544
Bug ID: 114544
Summary: [x86] stv should transform (subreg DI (V1TI) 8) as
(vec_select:DI (V2DI) (const_int 1))
Product: gcc
Version: 14.0
Status: UNCONFIRMED
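The equivalence in this summary can be modelled in plain GNU C. This is only a sketch under the assumption of x86 memory layout: reading 8 bytes at byte offset 8 of a 16-byte value (the analogue of `(subreg:DI (reg:V1TI) 8)`) gives the same bits as selecting lane 1 of the same value viewed as two 64-bit lanes (the analogue of `(vec_select:DI (reg:V2DI) (const_int 1))`). The helper names are hypothetical and this is not how the stv pass is implemented.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* GNU C vector extension: two 64-bit lanes in 16 bytes. */
typedef long long v2di __attribute__((vector_size(16)));

/* Model of the subreg form: the 8 bytes at byte offset 8. */
static uint64_t high_half_by_offset(const unsigned char bytes[16])
{
  uint64_t out;
  memcpy(&out, bytes + 8, sizeof out);
  return out;
}

/* Model of the vec_select form: lane 1 of the V2DI view. */
static uint64_t high_half_by_lane(const unsigned char bytes[16])
{
  v2di v;
  memcpy(&v, bytes, sizeof v);
  return (uint64_t) v[1];
}

/* Returns 1 when both views agree on a sample bit pattern. */
static int halves_agree(void)
{
  unsigned char bytes[16];
  assert(sizeof(v2di) == sizeof bytes);
  for (int i = 0; i < 16; i++)
    bytes[i] = (unsigned char) (0xA0 + i);
  return high_half_by_offset(bytes) == high_half_by_lane(bytes);
}
```

Expressing the access as a lane selection is what lets a backend match a vector extract pattern instead of going through a general-purpose register pair.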
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114514
--- Comment #3 from Hongtao Liu ---
(In reply to Andrew Pinski from comment #1)
> Confirmed.
>
> Note non sign bit can be improved too:
> ```
I assume you're talking about broadcast from imm or directly from constant
pool. GCC chooses the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114514
Bug ID: 114514
Summary: v16qi >> 7 can be optimized with vpcmpgtb
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component:
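The identity behind this summary can be checked with GNU C vector extensions. The sketch below assumes signed byte lanes and GCC's documented arithmetic right shift for negative values; the function names are hypothetical. An arithmetic shift by 7 leaves only copies of the sign bit in each lane (all ones for negative lanes, zero otherwise), which is exactly the mask a signed `0 > x` compare produces, and that compare is what vpcmpgtb computes.

```c
#include <assert.h>
#include <string.h>

/* GNU C vector extension: 16 signed byte lanes. */
typedef signed char v16qi __attribute__((vector_size(16)));

/* The form from the summary: per-lane arithmetic shift right by 7. */
static v16qi shift_form(v16qi v)
{
  return v >> 7;
}

/* The compare form: per-lane 0 > x, giving -1 (all ones) or 0. */
static v16qi compare_form(v16qi v)
{
  v16qi zero = { 0 };
  return zero > v;
}

/* Exhaustively compares both forms over every signed byte value.
   Returns 1 when they agree everywhere. */
static int forms_agree(void)
{
  assert(sizeof(v16qi) == 16);
  for (int base = -128; base <= 127; base += 16)
    {
      v16qi v;
      for (int i = 0; i < 16; i++)
        v[i] = (signed char) (base + i);
      v16qi a = shift_form(v);
      v16qi b = compare_form(v);
      if (memcmp(&a, &b, sizeof a) != 0)
        return 0;
    }
  return 1;
}
```

The compare form matters on x86 because there is no packed arithmetic shift for byte elements before AVX-512 style emulation, while a packed signed byte compare exists since SSE2.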
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114471
--- Comment #6 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #5)
> Maybe we should always use kmask under AVX512, currently only >= 128-bits
> vector of vector _Float16 use kmask, < 128 bits vector still use vector mask.
>
and we
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114471
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114429
Hongtao Liu changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---