https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94373
--- Comment #2 from Hongtao.liu ---
I think changing lea_cost from 2 to 1 for skylake can fix this regression.
Since it's stage4 now, I'll hold my patch.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94375
--- Comment #1 from Hongtao.liu ---
Try -mprefer-vector-width=128. In our experience, 256-bit vectorization does
not help 548.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94373
--- Comment #3 from Hongtao.liu ---
(In reply to Hongtao.liu from comment #2)
> I think
> Change lea_cost from 2 --> 1 in skylake can fix this regressions.
>
> Since it's stage4 now, i hold my patch.
To clarify: it's for -O2 -mtune=skylake-avx512
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94375
--- Comment #4 from Hongtao.liu ---
(In reply to Martin Jambor from comment #3)
> (In reply to Hongtao.liu from comment #1)
> > Try -mprefer-vector-width=128,256-bit vectorization is not helpful for 548
> > according to our experience.
>
> I hav
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94375
--- Comment #5 from Hongtao.liu ---
(In reply to Hongtao.liu from comment #4)
> (In reply to Martin Jambor from comment #3)
> > (In reply to Hongtao.liu from comment #1)
> > > Try -mprefer-vector-width=128,256-bit vectorization is not helpful for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94736
--- Comment #1 from Hongtao.liu ---
The indirect jump `goto *p` is optimized away, so there is no indirect jump
and no need to insert endbr64
: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: crazylht at gmail dot com
CC: hjl.tools at gmail dot com, tkoenig at gcc dot
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94118
--- Comment #2 from Hongtao.liu ---
(In reply to Frédéric Recoules from comment #0)
> The section 6.47.2.8 x86 Operand Modifiers of
> https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html is only about x86.
>
> As it was done for Operand Constrai
Assignee: unassigned at gcc dot gnu.org
Reporter: crazylht at gmail dot com
CC: hjl.tools at gmail dot com
Target Milestone: ---
Target: i386, x86-64
cat test.c
int foo (int* p1, int* p2, int scale)
{
int ret = *(p1 + scale * 4 + 11);
*p2 = 3;
int
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95078
--- Comment #2 from Hongtao.liu ---
(In reply to Richard Biener from comment #1)
> TER should go away, not be extended. So you are suggesting that we replace
>
> leaq 44(%rdi,%rdx,4), %rdx --- redundant could be fwprop
> mov
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 92611, which changed state.
Bug 92611 Summary: auto vectorization failed for type promotation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92611
What|Removed |Added
-
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92492
Bug 92492 depends on bug 92611, which changed state.
Bug 92611 Summary: auto vectorization failed for type promotation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92611
What|Removed |Added
-
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92611
Hongtao.liu changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658
--- Comment #13 from Hongtao.liu ---
*** Bug 92611 has been marked as a duplicate of this bug. ***
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94962
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94962
--- Comment #3 from Hongtao.liu ---
You're right, from intel SDM:
VEX.128 encoded version: Bits (MAXVL-1:128) of the destination register are
zeroed.
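The SDM sentence quoted above can be pictured with a small model in plain C. This is only a sketch: the 256-bit register is represented as eight 32-bit lanes (an assumption for illustration, taking MAXVL as 256), and `ymm_model` and `vex128_write` are hypothetical names, not intrinsics or GCC internals.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 256-bit vector register as 8 x 32-bit lanes. */
typedef struct { uint32_t lane[8]; } ymm_model;

/* Model of a VEX.128-encoded write: the low 128 bits receive the result,
   and the bits above 128 of the destination are zeroed. */
void vex128_write(ymm_model *dst, const uint32_t src[4])
{
    memcpy(dst->lane, src, 4 * sizeof(uint32_t));
    memset(dst->lane + 4, 0, 4 * sizeof(uint32_t));
}

/* Returns 1 if the upper 128 bits of the modelled register are all zero. */
int upper_half_is_zero(const ymm_model *r)
{
    for (int i = 4; i < 8; i++)
        if (r->lane[i] != 0)
            return 0;
    return 1;
}

/* Usage: start from an all-ones register, then do one VEX.128-style write. */
void check_vex128_model(void)
{
    ymm_model r;
    const uint32_t ones[4] = { 0xffffffffu, 0xffffffffu, 0xffffffffu, 0xffffffffu };
    memset(&r, 0xff, sizeof r);
    vex128_write(&r, ones);
    assert(r.lane[0] == 0xffffffffu);
    assert(upper_half_is_zero(&r));
}
```

In this model, whatever the upper half held before the write is gone afterwards, which is the property the quoted sentence describes.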
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94962
--- Comment #4 from Hongtao.liu ---
(In reply to Jakub Jelinek from comment #2)
> But such an instruction isn't always redundant, it really depends on what
> the previous setter of the register did, whether the upper 128 bit of the
> 256-bit regi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94962
--- Comment #6 from Hongtao.liu ---
(In reply to Nemo from comment #5)
> (In reply to Jakub Jelinek from comment #2)
>
> I would be happy if GCC could just emit optimal code (single vcmpeqd
> instruction) for this useful constant:
>
> _mm25
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658
--- Comment #16 from Hongtao.liu ---
(In reply to Uroš Bizjak from comment #15)
> I will leave truncations (Down Converts in Intel speak) which are AVX512F
> instructions to someone else. It should be easy to add missing patterns and
> tests foll
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658
--- Comment #17 from Hongtao.liu ---
Created attachment 48570
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48570&action=edit
0001-Add-missing-vector-truncmn2-expanders-PR92658.patch
Seems there're only truncmn2 for truncate, not expander
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95125
--- Comment #4 from Hongtao.liu ---
(In reply to Uroš Bizjak from comment #3)
> It turns out that a bunch of patterns have to be renamed (and testcases
> added).
>
> Easyhack, waiting for someone to show some love to conversion patterns in
> sse
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95125
--- Comment #5 from Hongtao.liu ---
(In reply to Uroš Bizjak from comment #3)
> It turns out that a bunch of patterns have to be renamed (and testcases
> added).
>
> Easyhack, waiting for someone to show some love to conversion patterns in
> sse
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658
--- Comment #20 from Hongtao.liu ---
(In reply to Mark Wielaard from comment #19)
> (In reply to CVS Commits from comment #18)
> > gcc/testsuite/ChangeLog:
> > * gcc.target/i386/pr92658-avx512f.c: New test.
> > * gcc.t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95256
--- Comment #6 from Hongtao.liu ---
(In reply to Arseny Solokha from comment #5)
> Is there some further work pending, or should this PR be closed now?
It's fixed.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95211
--- Comment #9 from Hongtao.liu ---
(In reply to Arseny Solokha from comment #8)
> Is there some further work pending, or should this PR be closed now?
Fixed in GCC11.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95453
--- Comment #2 from Hongtao.liu ---
Duplicate of PR95076?
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: crazylht at gmail dot com
Target Milestone: ---
Target: x86_64-*-* i?86-*-*
cat test.c
---
typedef unsigned char v16qi __attribute__ ((vector_size (16)));
v16qi
foo (v16qi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95488
--- Comment #1 from Hongtao.liu ---
I think it's this TYPE_SIGN (TREE_TYPE (REG_EXPR (op1))).
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95488
--- Comment #3 from Hongtao.liu ---
(In reply to Richard Biener from comment #2)
> (In reply to Hongtao.liu from comment #1)
> > I think it's this TYPE_SIGN (TREE_TYPE (REG_EXPR (op1))).
>
> That's not reliable. Mutliplication shouldn't care ab
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95488
--- Comment #4 from Hongtao.liu ---
(In reply to Hongtao.liu from comment #3)
> (In reply to Richard Biener from comment #2)
> > (In reply to Hongtao.liu from comment #1)
> > > I think it's this TYPE_SIGN (TREE_TYPE (REG_EXPR (op1))).
> >
> > Th
: missed-optimization
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: crazylht at gmail dot com
Target Milestone: ---
Target: x86_64-*-* i?86-*-*
cat test.c
---
typedef char v16qi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95400
--- Comment #5 from Hongtao.liu ---
(In reply to Martin Liška from comment #4)
> Can we backport the change to active branches?
Backport to GCC9 and GCC10.
Partially backport to GCC8 (drop the tremont and tigerlake parts).
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95488
--- Comment #5 from Hongtao.liu ---
Microbenchmark
cat test.c
#include
#include
#include
typedef char v16qi __attribute__ ((vector_size (16)));
extern v16qi interleave_mul (v16qi, v16qi);
extern v16qi extend_mul (v16qi, v16qi);
#defi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95524
--- Comment #1 from Hongtao.liu ---
Microbenchmark shows:
interleave_ashiftrt : 69023847
magic_ashiftrt : 62488066
That looks like about a 10% improvement.
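As a quick check of the arithmetic, the two counts above can be compared in two ways. The sketch below assumes that lower counts are better (the snippet does not state the unit), and the function names are made up for illustration.

```c
#include <assert.h>

/* Percentage by which `after` is smaller than `before`. */
double percent_reduction(double before, double after)
{
    return (before - after) / before * 100.0;
}

/* Percentage speedup of `after` relative to `before`, i.e. before/after - 1. */
double percent_speedup(double before, double after)
{
    return (before / after - 1.0) * 100.0;
}

/* Usage with the two counts reported in the comment. */
void check_reported_counts(void)
{
    const double interleave = 69023847.0;  /* interleave_ashiftrt */
    const double magic      = 62488066.0;  /* magic_ashiftrt */

    /* Roughly 9.5% fewer counts, or roughly 10.5% faster, depending on
       which of the two numbers is used as the base. */
    assert(percent_reduction(interleave, magic) > 9.4);
    assert(percent_reduction(interleave, magic) < 9.5);
    assert(percent_speedup(interleave, magic) > 10.4);
    assert(percent_speedup(interleave, magic) < 10.5);
}
```

Either reading is consistent with the "about 10%" figure.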
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95488
Hongtao.liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95524
--- Comment #2 from Hongtao.liu ---
Microbenchmark results on Skylake client:
---
benchmark Skylake client
ashift improvement
v16qi 13%
v32qi 5%
v64qi 7%
ashiftrt
v16qi 5%
v32q
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95524
--- Comment #3 from Hongtao.liu ---
(In reply to Hongtao.liu from comment #0)
> icc has
> ---
> ashift(char __vector(16)):
> vpsllw xmm1, xmm0, 5 #9.16
> vpand xmm0, xmm1, XMMWORD PTR .L_2il
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95488
--- Comment #9 from Hongtao.liu ---
(In reply to H.J. Lu from comment #8)
> -march=skylake-avx512 gave:
>
> [hjl@gnu-cfl-2 gcc]$
> /export/build/gnu/tools-build/gcc-debug/build-x86_64-linux/gcc/xgcc
> -B/export/build/gnu/tools-build/gcc-debug/b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95740
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95524
Hongtao.liu changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95488
Hongtao.liu changed:
What|Removed |Added
Status|REOPENED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95766
--- Comment #1 from Hongtao.liu ---
Shouldn't **a** be extended to int first?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95766
--- Comment #4 from Hongtao.liu ---
Simple case:
cat test.c:
int f(unsigned short a)
{
return a * 101;
}
gcc:
f(unsigned short):
movzwl %di, %eax
imull $101, %eax, %eax
ret
llvm:
f(unsigned short): # @f(unsigned short)
imull $101,
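For reference, the C-level semantics behind the question "shouldn't a be extended to int first?" can be sketched as follows. This assumes int is wider than unsigned short, as it is on x86. The function names are hypothetical and chosen so they do not clash with f in the testcase. The sketch says nothing about which compiler's code is preferable.

```c
#include <assert.h>

/* Same shape as the testcase: `a` is promoted to int before the multiply,
   so the product is computed at int width. */
int mul101(unsigned short a)
{
    return a * 101;
}

/* For contrast: the value obtained if only the low 16 bits of the product
   were kept. */
unsigned short mul101_low16(unsigned short a)
{
    return (unsigned short)(a * 101u);
}

/* Usage: the largest unsigned short shows the difference clearly. */
void check_mul101(void)
{
    assert(mul101(65535) == 6619035);     /* 65535 * 101 at int width */
    assert(mul101_low16(65535) == 65435); /* same product modulo 65536 */
}
```

The upper bits of the product matter here because the function returns int, so the result depends on the argument being treated as a zero-extended 16-bit value.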
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87767
--- Comment #7 from Hongtao.liu ---
A patch is posted at
https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549713.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96186
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96201
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #1
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: crazylht at gmail dot com
Target Milestone: ---
Target: i386, x86-64
When trying to relax
(define_expand "_eq3"
[(set (match_
Assignee: unassigned at gcc dot gnu.org
Reporter: crazylht at gmail dot com
Target Milestone: ---
cat test.c
---
typedef int v8si __attribute__ ((__vector_size__ (32)));
v8si
foo (v8si a, v8si b, v8si c, v8si d)
{
v8si e;
for (int i = 0; i != 8; i++)
e[i] = a[i] >
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96243
--- Comment #1 from Hongtao.liu ---
cut from cse.c
---
case RTX_COMPARE:
case RTX_COMM_COMPARE:
/* See what items are actually being compared and set FOLDED_ARG[01]
to those values and CODE to the actual
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: crazylht at gmail dot com
Target Milestone: ---
Target: i386, x86-64
cat test.c
---
typedef int v8si __attribute__ ((__vector_size__ (32)));
v8si
foo (v8si a, v8si b, v8si c, v8si d)
{
return a >
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96244
--- Comment #2 from Hongtao.liu ---
(In reply to Richard Biener from comment #1)
> so range-info is one index too pessimistic here. So IMHO it's not about
> "redundant" masked loads, it's about the fact that we end up with loads
> at all here.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96246
--- Comment #2 from Hongtao.liu ---
(In reply to Richard Biener from comment #1)
> With -mavx2 it works:
>
> vpcmpgtd %ymm1, %ymm0, %ymm0
> vpblendvb %ymm0, %ymm2, %ymm3, %ymm0
>
> not sure how _load comes into play
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96262
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96262
--- Comment #2 from Hongtao.liu ---
inline wi::storage_ref
wi::int_traits ::decompose (HOST_WIDE_INT *,
unsigned int precision,
const rtx_mode_t &x)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96273
Hongtao.liu changed:
What|Removed |Added
CC||ubizjak at gmail dot com
--- Comment #2 fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96271
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96262
--- Comment #3 from Hongtao.liu ---
A patch is posted at
https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550427.html
: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: crazylht at gmail dot com
CC: hjl.tools at gmail dot com
Target Milestone: ---
Target: i386, x86-64
ENDBR32 and ENDBR64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96476
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #1
-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: crazylht at gmail dot com
Target Milestone: ---
testcase not vectorized:
-
#include
inline unsigned opt(unsigned a, unsigned b, unsigned c, unsigned d) {
return a > b ? c : d;
}
void opt( unsig
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70314
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70314
--- Comment #6 from Hongtao.liu ---
Same issue mentioned in PR88808
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96243
Hongtao.liu changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96512
--- Comment #4 from Hongtao.liu ---
It's ok with GCC8.4.0.
/export/liuhongt/install/gcc8.4.0/bin/gcc -O1 -D_GCC_VEC_=1
-march=skylake-avx512 test.c -lm
./a.out
SIMD: avx512 -- vector size = 8
:: 0 == 0
:: 0.067 == 0.067
:: 0.13 =
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96535
--- Comment #1 from Hongtao.liu ---
For the command-line option, this is handled in process_options, which enables
flag_cunroll_grow_size; that is the flag that actually causes the loop in the
testcase to be unrolled.
cut from toplev.c
---
/* Unrolling all loops impl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96536
--- Comment #1 from Hongtao.liu ---
I'm testing patch like
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index b24a4557871..269c528c3ad 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -19132,15 +19132,15 @@
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96536
--- Comment #3 from Hongtao.liu ---
(In reply to Uroš Bizjak from comment #2)
> (In reply to Hongtao.liu from comment #1)
> > I'm testing patch like
>
> You can probably use gen_sub2_insn here.
>
> On a related note, "@" prefix can be used for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96551
--- Comment #1 from Hongtao.liu ---
For `vec_unpacku_float_hi_v16si` `vec_unpacku_float_lo_v16si`
---
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index cf083ca28aa..2e60f596bc1 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/confi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96562
--- Comment #3 from Hongtao.liu ---
a simple c testcase
typedef struct
{
unsigned char* p;
unsigned int a;
}st;
st foo (unsigned char* p, unsigned char* q)
{
return {p, (unsigned int)(q-p)};
}
There are two issues here.
1. gcc use memory
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96562
--- Comment #4 from Hongtao.liu ---
in ix86_expand_pinsr with
src:(reg:DI 88)
dst:(subreg:DI (reg:TI 84 [ D.1940 ]) 8)
pos: 64
size: 32
it goes into
---
20360
20361 case E_SImode:
20362 if (!TARGET_SSE4_1)
20363
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96574
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96562
--- Comment #6 from Hongtao.liu ---
I'm testing this patch
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index e194214804b..29809d69782 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96578
--- Comment #1 from Hongtao.liu ---
It's the same as PR96551.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96246
Hongtao.liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96536
--- Comment #5 from Hongtao.liu ---
(In reply to Uroš Bizjak from comment #4)
> Created attachment 49060 [details]
> Proposed patch
>
> Attached patch completely rewrites restore_stack_nonlocal expander.
>
> Can someone please test the patch on
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96350
Hongtao.liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96574
--- Comment #2 from Hongtao.liu ---
This testcase checks vector compare to integer mask, so I deleted the
scan-assembler for the vmov instruction and added -mprefer-vector-width=512 to
avoid depending on GCC's default arch.
--- a/gcc/t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96625
--- Comment #1 from Hongtao.liu ---
movabs rax,0x1ff8 --- it also clears the high 3 bits.
and rax,rdi
differs from
and rax,0xfff8
Using g++ -O2 test.c -S gives
---
movq %rdi, %rax
andq $-8, %rax
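The difference between the two masks can be checked in plain C. The hex immediates in the snippet appear shortened, so the two 64-bit constants below are assumptions for illustration only: one mask that clears the high 3 and low 3 bits, and one that clears only the low 3 bits (the same value as -8). The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants, not taken verbatim from the snippet. */
#define MASK_CLEAR_HIGH3_LOW3 UINT64_C(0x1ffffffffffffff8)
#define MASK_CLEAR_LOW3       UINT64_C(0xfffffffffffffff8) /* == (uint64_t)-8 */

/* movabs + and: clears the low 3 bits and the high 3 bits. */
uint64_t and_high3_low3(uint64_t x)
{
    return x & MASK_CLEAR_HIGH3_LOW3;
}

/* and with -8: clears only the low 3 bits. */
uint64_t and_low3(uint64_t x)
{
    return x & MASK_CLEAR_LOW3;
}

/* Usage: the results agree unless one of the top three bits is set. */
void check_masks(void)
{
    assert(and_low3(UINT64_C(0x1234)) == and_high3_low3(UINT64_C(0x1234)));
    assert(and_low3(UINT64_MAX) != and_high3_low3(UINT64_MAX));
    assert(and_low3(UINT64_MAX) == (uint64_t)-8);
}
```

Under these assumed constants the two sequences are not interchangeable in general, which matches the "differs from" remark above.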
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96574
--- Comment #4 from Hongtao.liu ---
Fixed in GCC11.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93897
--- Comment #6 from Hongtao.liu ---
Fixed in GCC11, backport to GCC10.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96562
--- Comment #9 from Hongtao.liu ---
Fixed in GCC11, backport to GCC10.
Assignee: unassigned at gcc dot gnu.org
Reporter: crazylht at gmail dot com
Target Milestone: ---
Target: i386, x86-64
On Linux/x86_64,
7123217afb33d4a2860f552ad778a819cc8dea5e is the first bad commit
commit 7123217afb33d4a2860f552ad778a819cc8dea5e
Author
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96667
--- Comment #1 from Hongtao.liu ---
The testcase needs to be adjusted.
I'll rewrite it as a C++ source file so that the vector compare operator can
be used directly.
--- a/gcc/testsuite/gcc.target/i386/avx512bw-pr96246-1.c
+++ b/gcc/testsuite/g++.t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96536
--- Comment #7 from Hongtao.liu ---
(In reply to Hongtao.liu from comment #5)
> (In reply to Uroš Bizjak from comment #4)
> > Created attachment 49060 [details]
> > Proposed patch
> >
> > Attached patch completely rewrites restore_stack_nonloc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88808
--- Comment #5 from Hongtao.liu ---
Fixed in GCC11.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88798
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88473
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71453
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96262
Hongtao.liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96744
--- Comment #2 from Hongtao.liu ---
Created attachment 49107
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49107&action=edit
Enable spill to mask only under m_core_AVX512
this patch will fail
cat test.c
#include
void
_mm512_2interse
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96755
--- Comment #2 from Hongtao.liu ---
Sorry for the typo
---
(define_split
[(set (match_operand:DI 0 "mask_reg_operand")
(zero_extend:DI
- (not:DI (match_operand:SI 1 "mask_reg_operand"]
+ (not:SI (match_operand:SI 1 "ma
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96744
--- Comment #7 from Hongtao.liu ---
(In reply to Uroš Bizjak from comment #5)
> (In reply to Hongtao.liu from comment #2)
>
> > Need to add define_insn for movp2qi/movp2hi?
>
> Yes, this is needed to cover some corner cases. Please see attachme
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96755
--- Comment #4 from Hongtao.liu ---
Fixed in GCC11.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96667
Hongtao.liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96744
--- Comment #9 from Hongtao.liu ---
(In reply to Hongtao.liu from comment #7)
> (In reply to Uroš Bizjak from comment #5)
> > (In reply to Hongtao.liu from comment #2)
> >
> > > Need to add define_insn for movp2qi/movp2hi?
> >
> > Yes, this is
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96744
--- Comment #10 from Hongtao.liu ---
(In reply to Uroš Bizjak from comment #3)
> Created attachment 49112 [details]
> Retune mask <-> general moves cost
>
> It looks to me that mask <-> general cost is too low, so the compiler now
> prefers thes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96246
--- Comment #6 from Hongtao.liu ---
(In reply to Nathan Sidwell from comment #5)
> FAIL: g++.target/i386/avx512bw-pr96246-2.C execution test
> FAIL: g++.target/i386/avx512vl-pr96246-2.C execution test
>
>
> the tests can fail at runtime, be
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96849
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96551
--- Comment #3 from Hongtao.liu ---
A patch is posted at
https://gcc.gnu.org/pipermail/gcc-patches/2020-August/552230.html
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: crazylht at gmail dot com
Target Milestone: ---
Target: x86_64-*-* i?86-*-*
On Linux/x86_64,
e740f3d73144abbca1ad98a04825c6bd63314a0b is the first bad commit commit