https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
Bug ID: 104582
Summary: Unoptimal code for __negdi2 (and others) from libgcc2
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Compo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104448
--- Comment #3 from Uroš Bizjak ---
(In reply to Jakub Jelinek from comment #2)
> r12-7147-g2f9ab267e725ddf2b6b44113e4fc4fb8b2a6adfb fixed this.
> So, shall we just add the testcase into the testsuite and be done with it?
I think so. -mno-xsave
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104522
--- Comment #5 from Uroš Bizjak ---
(In reply to Richard Biener from comment #4)
> But I do wonder whether real_from_target needs fixing to handle invalid
> input gracefully which is ultimately decode_ieee_extended?
long double foo (void)
{
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79754
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008
--- Comment #14 from Uroš Bizjak ---
Created attachment 52428
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52428&action=edit
Proposed patch
The attached patch implements:
fmod (a, p) = a - trunc (a/p) * p
drem (a, p) = a - roundeven (a
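
A minimal sketch of the fmod identity quoted above, fmod (a, p) = a - trunc (a/p) * p, may help when reading the patch description. It is not taken from the attached patch. The names fmod_sketch and trunc_via_cast are hypothetical, and the integer cast is only a stand-in for trunc so that the example does not depend on libm. It is valid only while the quotient fits in a long long, and like any fast-math expansion it does not handle infinities, NaNs or p == 0.

```c
/* Hypothetical stand-in for trunc (): rounds toward zero.
   Only valid while x fits in the range of long long.  */
static double
trunc_via_cast (double x)
{
  return (double) (long long) x;
}

/* Sketch of fmod (a, p) = a - trunc (a/p) * p.
   Assumes finite inputs and p != 0.  */
double
fmod_sketch (double a, double p)
{
  return a - trunc_via_cast (a / p) * p;
}
```

Because the quotient is truncated toward zero, the result keeps the sign of a, which matches the sign convention of fmod.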
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79754
Uroš Bizjak changed:
What|Removed |Added
Status|NEW |ASSIGNED
Assignee|unassigned at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008
--- Comment #13 from Uroš Bizjak ---
(In reply to Richard Biener from comment #12)
> Just as data-point on znver2 Uros testcase shows
>
> rguenther@ryzen:/tmp> gcc-11 t.c -Ofast -lm -march=znver2
> rguenther@ryzen:/tmp> numactl --physcpubind=3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008
--- Comment #10 from Uroš Bizjak ---
FYI, the following testcase:
--cut here--
#include <math.h>
float
__attribute__((noinline))
_fmodf (float x, float y)
{
return x - truncf (x/y) * y;
}
int
main ()
{
float a, b;
volatile float z;
for (a =
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104469
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104485
Uroš Bizjak changed:
What|Removed |Added
Depends on||103008
--- Comment #2 from Uroš Bizjak -
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104481
--- Comment #5 from Uroš Bizjak ---
-save-temps needs to be added to dg-options to cure the UNRESOLVED part.
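
As an illustration only, a DejaGnu test header along the following lines shows where -save-temps would go. The contents of pr35513-8.c are not shown in this listing, so the test body, the function name foo, the remaining options and the scanned pattern are all hypothetical. The general idea is that scan-assembler needs the generated .s file to be kept, and when the test is assembled rather than only compiled, the scan is reported as UNRESOLVED unless -save-temps is given.

```c
/* Hypothetical test file, not the real pr35513-8.c.  */
/* { dg-do assemble } */
/* { dg-options "-O2 -save-temps" } */

int
foo (void)
{
  return 42;
}

/* With -save-temps the .s file is preserved, so this scan can run.  */
/* { dg-final { scan-assembler "foo" } } */
```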
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104481
--- Comment #4 from Uroš Bizjak ---
(In reply to Richard Biener from comment #3)
> I'm also seeing those with GNU ld (GNU Binutils; SUSE Linux Enterprise 15)
> 2.37.20211103-7.26
Here with:
$ ld --version
GNU ld version 2.35-18.fc33
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104481
--- Comment #2 from Uroš Bizjak ---
spawn -ignore SIGHUP /hdd/uros/gcc-build-fast/gcc/xgcc
-B/hdd/uros/gcc-build-fast/gcc/ -fdiagnostics-plain-output -mx32 -O2 -fno-pic
-fexceptions -fasynchronous-unwind-tables -mno-direct-extern-access
-ffat-lt
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104481
Uroš Bizjak changed:
What|Removed |Added
CC||hjl.tools at gmail dot com
--- Comment #1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104481
Bug ID: 104481
Summary: gcc.target/i386/pr35513-8.c and
g++.target/i386/pr35513-[12].C testsuite failures
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104469
Uroš Bizjak changed:
What|Removed |Added
Ever confirmed|0 |1
Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104458
Uroš Bizjak changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Target Milestone|11.3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104462
Uroš Bizjak changed:
What|Removed |Added
Target Milestone|--- |11.4
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104467
Uroš Bizjak changed:
What|Removed |Added
Component|target |middle-end
--- Comment #2 from Uroš Bizja
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104462
Uroš Bizjak changed:
What|Removed |Added
Ever confirmed|0 |1
Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104445
--- Comment #9 from Uroš Bizjak ---
(In reply to rguent...@suse.de from comment #8)
> > (In reply to Richard Biener from comment #6)
> > > We are missing vec_extractv2sisi or vec_extractv8qiv4qi, with -mno-mmx
> > > -mavx.
> > > It seems we hav
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104458
Uroš Bizjak changed:
What|Removed |Added
Status|UNCONFIRMED |ASSIGNED
Assignee|unassigned at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104445
--- Comment #7 from Uroš Bizjak ---
(In reply to Richard Biener from comment #6)
> We are missing vec_extractv2sisi or vec_extractv8qiv4qi, with -mno-mmx -mavx.
> It seems we have addv2si3 available though.
vec_extractv2sisi is available in mmx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104445
--- Comment #5 from Uroš Bizjak ---
We do have:
(define_expand "vec_extractv4qiqi"
[(match_operand:QI 0 "register_operand")
(match_operand:V4QI 1 "register_operand")
(match_operand 2 "const_int_operand")]
"TARGET_SSE4_1"
{
ix86_expa
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104362
Uroš Bizjak changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104362
--- Comment #4 from Uroš Bizjak ---
Or simply:
--cut here--
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index ad5a5caa413..dd5584fb8ed 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -7400,7 +7400,8 @@ f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104362
--- Comment #3 from Uroš Bizjak ---
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index ad5a5caa413..a61a5390127 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -7403,6 +7403,10 @@ find_drap_reg (void)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104151
--- Comment #12 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #10)
> (In reply to Hongtao.liu from comment #4)
> > Also there's separate issue, codegen for below is not optimal
> > gimple:
> > _11 = VIEW_CONVERT_EXPR(a_3(D))
> > asm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104306
Bug ID: 104306
Summary: Use secondary_reload for optimized interunit reg-reg
moves
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Pr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104151
--- Comment #10 from Uroš Bizjak ---
(In reply to Hongtao.liu from comment #4)
> Also there's separate issue, codegen for below is not optimal
> gimple:
> _11 = VIEW_CONVERT_EXPR(a_3(D))
> asm:
> mov QWORD PTR [rsp-24], rdi
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104054
--- Comment #8 from Uroš Bizjak ---
Without debug instructions, the compiler is able to rename insns to:
65: di:DI=si:DI
66: dx:DI=r11:DI
74: cx:QI=0x1
REG_EQUAL 0x1
41: L41:
42: NOTE_INSN_BASIC_BLOCK 6
43: NOTE_INSN_DEL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104054
--- Comment #7 from Uroš Bizjak ---
For some reason the pass does not detect usage of Register si in (insn 55):
(debug_insn 55 54 56 6 (var_location:TI b (reg/v:TI 4 si [orig:86 b ] [86])) -1
(nil))
Register ax (1):
Register dx (1):
Regis
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104054
--- Comment #5 from Uroš Bizjak ---
Could be a red herring, but in _.rnreg dump:
Register r9 (1): 75 [GENERAL_REGS] 18 [ALL_REGS] 97 [GENERAL_REGS]
Register r10 (1): 76 [GENERAL_REGS] 18 [ALL_REGS] 23 [GENERAL_REGS]
...
Register di (1): 55 [ALL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104054
Uroš Bizjak changed:
What|Removed |Added
Keywords|wrong-code |
--- Comment #4 from Uroš Bizjak ---
(In
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104054
--- Comment #3 from Uroš Bizjak ---
The first difference is in rnreg pass, w/o -g:
28: L28:
29: NOTE_INSN_BASIC_BLOCK 4
30: [`i']=0
63: di:DI=r9:DI <--- here
64: dx:DI=r10:DI
9: r8:HI=0x5
REG_EQUAL 0x5
98: {cx:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104003
Uroš Bizjak changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104001
--- Comment #4 from Uroš Bizjak ---
(In reply to Hongtao.liu from comment #2)
> I'm testing
>
> 1 file changed, 3 insertions(+), 3 deletions(-)
> gcc/config/i386/i386.md | 6 +++---
>
> modified gcc/config/i386/i386.md
> @@ -10455,7 +10455,7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104003
--- Comment #2 from Uroš Bizjak ---
(define_insn "*xop_pcmov_<mode>"
- [(set (match_operand:VI_32 0 "register_operand" "=x")
-(if_then_else:VI_32
- (match_operand:VI_32 3 "register_operand" "x")
- (match_operand:VI_32 1 "re
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104003
Uroš Bizjak changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103935
Uroš Bizjak changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103997
Uroš Bizjak changed:
What|Removed |Added
Target||x86
Keywords|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103997
Bug ID: 103997
Summary: gcc.target/i386/pr88531-??.c scan-assembler-times
FAILs
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Prior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100637
Uroš Bizjak changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88670
Bug 88670 depends on bug 103948, which changed state.
Bug 103948 Summary: Vectorizer does not use vec_cmpMN without vcondMN pattern
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948
What|Removed |Added
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 103948, which changed state.
Bug 103948 Summary: Vectorizer does not use vec_cmpMN without vcondMN pattern
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948
What|Removed |Added
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948
Uroš Bizjak changed:
What|Removed |Added
Target Milestone|--- |12.0
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948
--- Comment #7 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #6)
> I'll try your proposed patch from Comment #5 later today and report here.
Yes, the patch works for me.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948
--- Comment #6 from Uroš Bizjak ---
(In reply to Richard Biener from comment #5)
> I guess that tree-vect-generic.c is not up-to-date with gimple-isel.cc. We
> should probably somehow factor out relevant pieces.
>
> Note vector lowering will s
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103935
--- Comment #3 from Uroš Bizjak ---
(In reply to Richard Biener from comment #2)
> no longer xfailed. I suggest to re-add the { xfail *-*-* } to the
> profitability check.
You mean xfail for non-x86 targets?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948
--- Comment #4 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #3)
> diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
> index 78e388d82f6..871366f3b7e 100644
> --- a/gcc/optabs-tree.c
> +++ b/gcc/optabs-tree.c
> @@ -502,6 +502,9 @@
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948
--- Comment #3 from Uroš Bizjak ---
diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
index 78e388d82f6..871366f3b7e 100644
--- a/gcc/optabs-tree.c
+++ b/gcc/optabs-tree.c
@@ -502,6 +502,9 @@ expand_vec_cond_expr_p (tree value_type, tree cmp_op
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948
--- Comment #2 from Uroš Bizjak ---
Created attachment 52146
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52146&action=edit
The complete testcase
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948
--- Comment #1 from Uroš Bizjak ---
Created attachment 52145
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52145&action=edit
Patch that illustrates the problem on x86 target
This patch should vectorize all integer relational operations w
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948
Bug ID: 103948
Summary: Vectorizer does not use vec_cmpMN without vcondMN
pattern
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Pri
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103941
Bug ID: 103941
Summary: uavgv2qi3_ceil is not used
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103935
--- Comment #1 from Uroš Bizjak ---
As said in the patch submission:
I have changed scan-tree-dump patterns in g++.dg/vect/slp-pr98855.cc
to check that no SLP vectorization was performed. The existing
scan-tree-dump-times was too fragile, since
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103928
--- Comment #12 from Uroš Bizjak ---
(In reply to Manuel Lauss from comment #10)
> So it was either fixed in trunk in the last 20 hours, or pgo build broke
> gcc, or "-mno-xop" fixed it.
The fix for PR103905 was pushed to the master in the last
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103928
--- Comment #11 from Uroš Bizjak ---
(In reply to Martin Liška from comment #8)
> > No, bdver4 does not include XOP.
>
> Ohh, didn't know that...
Sorry, I was wrong:
{"bdver4", PROCESSOR_BDVER4, CPU_BDVER4,
PTA_64BIT | PTA_MMX | PTA_SSE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103928
--- Comment #7 from Uroš Bizjak ---
(In reply to Martin Liška from comment #6)
> Then you may be affected by PR103905 which is fixed on the current master.
> Please pull to tip of master branch.
No, bdver4 does not include XOP.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94440
--- Comment #21 from Uroš Bizjak ---
Fixed?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103915
Uroš Bizjak changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92860
Bug 92860 depends on bug 103905, which changed state.
Bug 103905 Summary: [12 Regression] Miscompiled i386-expand.c with
-march=bdver1 and -O3 since r12-1789-g836328b2c99f5b8d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103905
What
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103905
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103915
Uroš Bizjak changed:
What|Removed |Added
Status|NEW |ASSIGNED
Assignee|unassigned at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103905
Uroš Bizjak changed:
What|Removed |Added
Attachment #52120|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103905
Uroš Bizjak changed:
What|Removed |Added
Attachment #52123|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103905
--- Comment #6 from Uroš Bizjak ---
@Jakub: It looks like the problem is in expand_vec_perm_pshufb, where the
permutation vector is recalculated for partial vectors:
if (vmode == V4QImode
|| vmode == V8QImode)
{
rtx m128 = GEN_INT (-12
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103905
--- Comment #4 from Uroš Bizjak ---
Created attachment 52123
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52123&action=edit
Patch that disables XOP permute with partial vectors
Please try this patch.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103905
--- Comment #3 from Uroš Bizjak ---
(In reply to Martin Liška from comment #1)
> Created attachment 52120 [details]
> Isolated test-case
>
> Isolated test-case where only the miscompiled function
> ix86_expand_vec_extract_even_odd uses -O3.
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103905
--- Comment #2 from Uroš Bizjak ---
The referred patch adds:
+;; Pack/unpack vector modes
+(define_mode_attr mmxpackmode
+ [(V4HI "V8QI") (V2SI "V4HI")])
+
+(define_expand "vec_pack_trunc_<mode>"
+ [(match_operand:<mmxpackmode> 0 "register_operand")
+ (match_
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103900
Uroš Bizjak changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103861
--- Comment #7 from Uroš Bizjak ---
(In reply to Richard Biener from comment #6)
> Not fully fixed I guess?
Not yet. I have a bunch of follow-up patches for various operations.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103900
Uroš Bizjak changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103900
--- Comment #6 from Uroš Bizjak ---
(In reply to Martin Liška from comment #5)
> No, it still crashes with the current master (g:fbb592407c9):
Ah, the compiler is blindly trying to generate V2QI XOR due to missing
one_cmplv2qi2 pattern. I have
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103900
--- Comment #2 from Uroš Bizjak ---
Looks fixed, does not ICE for me with:
GNU C17 (GCC) version 12.0.0 20220104 (experimental) [master
r12-6200-g62c8b21d48a] (x86_64-pc-linux-gnu)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103894
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103894
Uroš Bizjak changed:
What|Removed |Added
Status|NEW |ASSIGNED
Assignee|unassigned at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103861
--- Comment #3 from Uroš Bizjak ---
The patched compiler compiles the testcase from Comment #0 on x86_64 with -O2
to:
plus:
        movl    %edi, %edx
        movl    %esi, %eax
        addb    %sil, %dl
        addb    %ah, %dh
movl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103861
--- Comment #2 from Uroš Bizjak ---
Created attachment 52087
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52087&action=edit
Prototype patch to vectorize with v2qi vectors
Patch that implements V2QI moves, logic and basic arithmetic ope
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103861
--- Comment #1 from Uroš Bizjak ---
Also:
char r[2], a[2], b[2];
void foo (void)
{
int i;
for (i = 0; i < 2; i++)
r[i] = a[i] + b[i];
}
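
For reference, the loop above can be written by hand as a single two-byte vector addition using the GCC vector extension. This is only a sketch of the V2QI operation the vectorizer is expected to produce, not compiler output. The type name v2qi and the function foo_vec are hypothetical, and the arrays are redeclared so that the block stands alone.

```c
#include <string.h>

/* Two chars packed into one 2-byte vector (V2QI in GCC's mode naming).  */
typedef char v2qi __attribute__ ((vector_size (2)));

char r[2], a[2], b[2];

/* Hand-written equivalent of the scalar loop: one element-wise add.  */
void
foo_vec (void)
{
  v2qi va, vb, vr;

  memcpy (&va, a, sizeof va);   /* load both elements of a */
  memcpy (&vb, b, sizeof vb);   /* load both elements of b */
  vr = va + vb;                 /* element-wise addition */
  memcpy (r, &vr, sizeof vr);   /* store both results */
}
```

The memcpy calls are used instead of pointer casts to avoid any aliasing or alignment assumptions about the char arrays.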
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103861
Bug ID: 103861
Summary: [i386] vectorize v2qi vectors
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
A
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103842
--- Comment #6 from Uroš Bizjak ---
(In reply to Jakub Jelinek from comment #5)
> Created attachment 52068 [details]
> gcc12-pr103842.patch
>
> Untested fix.
The patch is OK.
Thanks,
Uros.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797
--- Comment #17 from Uroš Bizjak ---
(In reply to hubicka from comment #16)
> > >
> > > It could be done, but I was under impression that the sequence to load
> > > 1.0f
> > > into topmost elements nullifies the benefit of operation to divide
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797
--- Comment #14 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #13)
> Created attachment 52051 [details]
> Patch that implements v2sf division
This patch also enables vectorization of the testcase from Comment #7. Using
-ffast-math,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797
--- Comment #13 from Uroš Bizjak ---
Created attachment 52051
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52051&action=edit
Patch that implements v2sf division
Please try the attached patch, for the following testcase:
--cut here--
fl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797
--- Comment #12 from Uroš Bizjak ---
(In reply to Jakub Jelinek from comment #10)
> At least on your short testcase clang doesn't use divps either.
> We do support mulv2sf3, addv2sf3 etc. but not divv2sf3 I bet because with
> TARGET_MMX_WITH_SSE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103772
Uroš Bizjak changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103772
Uroš Bizjak changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103750
--- Comment #9 from Uroš Bizjak ---
(In reply to Thiago Macieira from comment #0)
> Testcase:
...
> The assembly for this produces:
>
> vmovdqu16 (%rdi), %ymm1
> vmovdqu16 32(%rdi), %ymm2
> vpcmpuw $0, %ymm0,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103753
Bug ID: 103753
Summary: Unoptimal avx2 V16HF vector insert to element 0
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102812
Uroš Bizjak changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571
--- Comment #28 from Uroš Bizjak ---
(In reply to Hongtao.liu from comment #18)
> codegen for foo1/foo2 is suboptimal under -mavx2, i guess we can have
> vec_setv16hf_0 and with vpblendw.
True, some opportunities are missing from expand_vec_per
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571
--- Comment #27 from Uroš Bizjak ---
(In reply to Hongtao.liu from comment #17)
> (In reply to Hongtao.liu from comment #16)
> > There're already testcases for vec_extract/vec_set/vec_duplicate, but those
> > testcases are written under TARGET_A
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571
--- Comment #25 from Uroš Bizjak ---
(In reply to Hongtao.liu from comment #22)
> Yes, besides TARGET_VECTOR_MODE_SUPPORTED_P, other part in the attached
> patch looks fine, the condition should be binded to real instructions but
> not mode.
OK
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571
Uroš Bizjak changed:
What|Removed |Added
Attachment #51950|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571
Uroš Bizjak changed:
What|Removed |Added
CC||rguenth at gcc dot gnu.org
--- Comment #2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571
Uroš Bizjak changed:
What|Removed |Added
Attachment #51948|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571
--- Comment #13 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #12)
> Hongtao, can you please review the patch and perhaps test it a bit more?
This part is missing from ix86_expand_vector_set_var:
--cut here
@@ -15912,7 +15921,8 @@
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571
--- Comment #12 from Uroš Bizjak ---
(In reply to Hongtao.liu from comment #10)
> Sure.
Please find attached the complete patch that enables HF vector modes in Comment
#11. The patch survives bootstrap and regression test and works OK for the
f