Component: fortran
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
A couple of issues with -ffpe-trap and -ffpe-summary options:
a) Invalid argument report should be switched:
$ gfortran -ffpe-summary=aaa ac.f90
f951: Fatal Error: Argument
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110587
--- Comment #20 from Uroš Bizjak ---
Can we revert the Comment #13 kludge now?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762
--- Comment #22 from Uroš Bizjak ---
It looks to me that partial vector half-float instructions have the same issue.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110832
--- Comment #10 from Uroš Bizjak ---
(In reply to Hongtao.liu from comment #9)
> for mov_internal, we can just set alternative (v,v) with mode DI, then
> it will use vmovq, for other alternatives which set sse_regs, the
> instructions has
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110832
--- Comment #8 from Uroš Bizjak ---
(In reply to Richard Biener from comment #6)
> Do we know whether we could in theory improve the sanitizing by optimization
> without -funsafe-math-optimizations (I think -fno-trapping-math,
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110832
--- Comment #7 from Uroš Bizjak ---
(In reply to Richard Biener from comment #6)
> Do we know whether we could in theory improve the sanitizing by optimization
> without -funsafe-math-optimizations (I think -fno-trapping-math,
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110832
Uroš Bizjak changed:
What|Removed |Added
CC||ubizjak at gmail dot com
--- Comment #5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110832
--- Comment #4 from Uroš Bizjak ---
Created attachment 55652
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55652&action=edit
Patch to recover performance for -funsafe-math-optimizations
This patch will recover performance with
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762
Uroš Bizjak changed:
What|Removed |Added
CC|uros at gcc dot gnu.org|
Target Milestone|11.5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91838
--- Comment #18 from Uroš Bizjak ---
(In reply to Richard Biener from comment #17)
> Interestingly even with -mno-sse we somehow have a shift for V2QImode.
This is implemented by a combination of shl rl,cl and shl rh,cl, so no XMM
registers are
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110788
--- Comment #3 from Uroš Bizjak ---
(In reply to Richard Biener from comment #0)
> I suppose it could also be a missed optimization in REE since I think
> the HImode regs should already be zero-extended?
No, only SImode moves have implicit zero
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762
--- Comment #18 from Uroš Bizjak ---
(In reply to Richard Biener from comment #17)
> > compiles to:
> >
> > movq %xmm1, %xmm1 # 8 [c=4 l=4] *vec_concatv4sf_0
> > movq %xmm0, %xmm0 # 9 [c=4 l=4]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762
Uroš Bizjak changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762
--- Comment #13 from Uroš Bizjak ---
I think we should put all partial vector V2SF operations under
!flag_trapping_math.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762
--- Comment #10 from Uroš Bizjak ---
(In reply to Richard Biener from comment #7)
> I guess for the specific usage we need to wrap this in an UNSPEC?
Probably, so a MOVQ xmm, xmm insn should be emitted for __builtin_ia32_storelps
(AKA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762
--- Comment #3 from Uroš Bizjak ---
(In reply to Richard Biener from comment #1)
> So what's the issue? That this is wrong for -ftrapping-math? Or that the
> return value has undefined contents in the upper half? (I don't think the
> ABI
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110717
Uroš Bizjak changed:
What|Removed |Added
Assignee|ubizjak at gmail dot com |unassigned at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110717
Uroš Bizjak changed:
What|Removed |Added
Target Milestone|--- |14.0
CC|uros at gcc dot
at gcc dot gnu.org |ubizjak at gmail dot com
--- Comment #4 from Uroš Bizjak ---
Created attachment 55578
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55578&action=edit
Proposed patch
Patch in testing.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |FIXED
Target Milestone|14.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206
--- Comment #16 from Uroš Bizjak ---
v2 patch at [1].
[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624491.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206
--- Comment #15 from Uroš Bizjak ---
Created attachment 55537
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55537&action=edit
Proposed patch.
v2 patch in testing.
This version prevents emission of invalid REG_EQUAL note in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106966
Uroš Bizjak changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206
--- Comment #14 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #10)
> (In reply to Uroš Bizjak from comment #9)
> > and simplify_replace_rtx simplifies the above to:
> >
> > (gdb) p debug_rtx (src)
> > (const_vector:V8HI [
> >
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206
--- Comment #13 from Uroš Bizjak ---
(In reply to Richard Biener from comment #12)
> I can see cprop1 adds the REG_EQUAL note:
>
> (insn 22 21 23 4 (set (reg:V8HI 100)
> (zero_extend:V8HI (vec_select:V8QI (subreg:V16QI (reg:V4QI 98) 0)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106966
--- Comment #10 from Uroš Bizjak ---
(In reply to matoro from comment #9)
> (In reply to Uroš Bizjak from comment #8)
> > Created attachment 55504 [details]
> > Proposed patch.
> >
> > Can someone please bootstrap and test the attached patch?
dot gnu.org |ubizjak at gmail dot com
Status|NEW |ASSIGNED
--- Comment #11 from Uroš Bizjak ---
Patch at [1].
[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/623933.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110588
Uroš Bizjak changed:
What|Removed |Added
See Also||https://gcc.gnu.org/bugzill
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206
--- Comment #10 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #9)
> and simplify_replace_rtx simplifies the above to:
>
> (gdb) p debug_rtx (src)
> (const_vector:V8HI [
> (const_int 204 [0xcc]) repeated x8
> ])
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206
--- Comment #9 from Uroš Bizjak ---
Some more digging through the code:
In cprop.cc/try_replace_reg, we try to simplify the source of the set given our
substitution:
Breakpoint 1, try_replace_reg (from=0x7fffe9f0b7f8, to=0x7fffe9f099e0,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206
--- Comment #8 from Uroš Bizjak ---
The testcase needs __attribute__((noinline)) to suppress unwanted constant
propagation with recent gcc.
void
__attribute__((noinline))
foo (U u, u16 c, V *r)
...
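A minimal sketch of what the attribute does, using hypothetical names (`scale`, `run`) in place of the testcase's own `U`, `u16` and `V` types, which this excerpt does not show. The attribute keeps the callee from being inlined into its caller, so constants at the call site are not folded into the body by inlining:

```c
#include <stdint.h>

/* Hypothetical stand-in for the testcase function.  noinline keeps the
   function from being inlined into run (), so the multiplication is
   still emitted as code instead of being folded at the call site.  */
__attribute__((noinline))
void scale (uint16_t c, uint16_t *r)
{
  *r = (uint16_t) (*r * c);
}

uint16_t run (void)
{
  uint16_t v = 3;
  scale (7, &v);
  return v;
}
```

The result is the same with or without the attribute; only the generated code differs, which is what a code-generation testcase needs to preserve.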
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106966
Uroš Bizjak changed:
What|Removed |Added
Summary|alpha cross build crashes |[12/13/14 Regression] alpha
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106966
--- Comment #8 from Uroš Bizjak ---
Created attachment 55504
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55504&action=edit
Proposed patch.
Can someone please bootstrap and test the attached patch?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106966
Uroš Bizjak changed:
What|Removed |Added
CC||doko at gcc dot gnu.org
--- Comment #7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110597
Uroš Bizjak changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110533
--- Comment #2 from Uroš Bizjak ---
(In reply to Andrew Pinski from comment #1)
> >clobbering other parameters and callee-saved registers.
>
>
> (insn 2 8 3 2 (set (reg:DI 84)
> (reg:DI 5 di [ aD.2522 ])) "/app/example.cpp":3:25 -1
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311
--- Comment #39 from Uroš Bizjak ---
(In reply to anlauf from comment #36)
> Breakpoint 2, rng_stream.rng_stream_s::mmm_mod (x1=330289839997,
> x2=4294967087) at rng_stream_sub.f90:336
> 336 res = mod (x1, x2)
> (gdb) info float
> R7:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311
--- Comment #35 from Uroš Bizjak ---
(In reply to anlauf from comment #33)
> (In reply to Jakub Jelinek from comment #32)
> > Then maybe r13-6361-g8020c9c42349f51f75239b
> > is the commit that changed it?
> > Would be good to put a breakpoint
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110479
--- Comment #1 from Uroš Bizjak ---
(In reply to Thomas Koenig from comment #0)
> movl %edi, %ecx
This one? It is needed because SAL wants its count argument in %cl and the first
argument is passed in %edi (as mandated by the x86_64 ABI).
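A small illustration of the situation described, assuming a hypothetical function `shift_left` (the reporter's original source is not shown in this excerpt). The shift count is the first argument, so it arrives in %edi, while a variable-count SAL reads its count from %cl; without BMI2 shift instructions the compiler therefore has to copy %edi to %ecx first:

```c
/* Hypothetical example: count is the first argument and is passed in
   %edi under the x86_64 ABI, but a variable-count SAL takes its count
   from %cl, hence the extra register copy.  The mask keeps the shift
   count within the range that is defined in C.  */
unsigned int shift_left (unsigned int count, unsigned int value)
{
  return value << (count & 31);
}
```

Swapping the parameter order would not remove the copy, since no integer argument is passed in %ecx in the first position under this ABI.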
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104610
--- Comment #21 from Uroš Bizjak ---
Just before the patch from Comment #20, the compiler creates (-O2 -mavx):
--cut here--
vmovdqa .LC1(%rip), %xmm0
vmovdqa %xmm0, -24(%rsp)
vmovdqu (%rdi), %xmm0
vpxor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110372
Uroš Bizjak changed:
What|Removed |Added
Last reconfirmed||2023-06-26
Component|target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110372
--- Comment #1 from Uroš Bizjak ---
Before reload, we have this sequence:
--cut here--
(insn 34 4 2 2 (set (reg:TI 119)
(reg:TI 20 xmm0 [ u ])) "pr110372.c":8:1 89 {*movti_internal}
(expr_list:REG_DEAD (reg:TI 20 xmm0 [ u ])
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109812
--- Comment #18 from Uroš Bizjak ---
One interesting observation:
clang is able to do this:
0.09 │ │ vmovddup -0x8(%rdx,%rsi,1),%xmm3 ▒
...
0.11 │ │ vfmadd231sd %xmm2,%xmm3,%xmm1▒
...
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105935
--- Comment #4 from Uroš Bizjak ---
(In reply to Francois-Xavier Coudert from comment #3)
> > These two functions are available from libiberty.
>
> Are we linking runtime libraries like libgfortran against libiberty? I
> thought that was only
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105935
--- Comment #2 from Uroš Bizjak ---
(In reply to Francois-Xavier Coudert from comment #1)
> Created attachment 55363 [details]
> Proposed patch
>
> The issue is real, but I would suggest that snprintf() and vsnprintf()
> should always be
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206
--- Comment #7 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #4)
> cprop1 pass does not consider paradoxical subreg and for (insn 22) claims
> that it equals 8 elements of QImode:
8 elements of HImode.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206
--- Comment #6 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #3)
> However, VPMULLW needs all 8 QImode elements, but %xmm4 only has 4 loaded;
To be consistent, VPSRLVW and VPMULLW use HImode elements.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206
Uroš Bizjak changed:
What|Removed |Added
Component|target |rtl-optimization
--- Comment #5 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206
--- Comment #4 from Uroš Bizjak ---
cprop1 pass does not consider paradoxical subreg and for (insn 22) claims that
it equals 8 elements of QImode:
(insn 21 19 22 4 (set (reg:V4QI 98)
(mem/u/c:V4QI (symbol_ref/u:DI ("*.LC1") [flags
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206
--- Comment #3 from Uroš Bizjak ---
Here is the problem:
vmovd .LC1(%rip), %xmm4 # 21[c=4 l=10] *movv4qi_internal/4
...
vpmovzxbw %xmm4, %xmm4 # 22[c=10 l=6] sse4_1_zero_extendv8qiv8hi2/2
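The mismatch can be modelled in plain C. This is only a sketch under stated assumptions: the helper names are hypothetical, the contents of .LC1 are not shown here (the value 0xcc is taken from the const_vector quoted elsewhere in this PR), and the bytes beyond the four loaded ones are taken to be zero, as a vmovd load from memory leaves them:

```c
#include <stdint.h>
#include <string.h>

/* Model of vpmovzxbw: zero-extend the low 8 bytes of a 16-byte
   register into 8 16-bit elements.  */
static void zero_extend_v8qi_v8hi (const uint8_t src[16], uint16_t dst[8])
{
  for (int i = 0; i < 8; i++)
    dst[i] = src[i];
}

/* Model of a 4-byte vmovd load: the four loaded bytes land in the low
   lanes and the rest of the register is cleared.  */
static void load_v4qi (const uint8_t mem[4], uint8_t reg[16])
{
  memset (reg, 0, 16);
  memcpy (reg, mem, 4);
}

/* Count how many of the 8 widened elements hold the loaded value.  */
int elements_matching (uint8_t value)
{
  const uint8_t mem[4] = { value, value, value, value };
  uint8_t reg[16];
  uint16_t wide[8];
  int n = 0;

  load_v4qi (mem, reg);
  zero_extend_v8qi_v8hi (reg, wide);
  for (int i = 0; i < 8; i++)
    n += (wide[i] == value);
  return n;
}
```

In this model only four of the eight HImode elements carry the constant, so a note claiming the value repeated x8 would not describe the register contents.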
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110087
--- Comment #9 from Uroš Bizjak ---
(In reply to Andrew Pinski from comment #8)
> Please file this separately, since it is a different issue.
PR110155.
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
Following testcase:
--cut here--
#include <stdbool.h>
_Bool foo (void);
int bar (int r)
{
if (foo ())
r++;
return r;
}
--cut here--
compiles (gcc -O2) to:
movl %edi, %ebx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110087
--- Comment #7 from Uroš Bizjak ---
Similar conversion, not performed by gcc:
--cut here--
#include <stdbool.h>
_Bool foo (void);
int bar (int r)
{
if (foo ())
r++;
return r;
}
--cut here--
gcc -O2:
movl %edi, %ebx
call
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110089
--- Comment #5 from Uroš Bizjak ---
The important pattern in i386.md is *sub2, which allows CCGOCmode
compare. This means that garbage in the Overflow and Carry flags is allowed.
In ix86_cc_modes_compatible, CCmode is returned for combination of
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110087
--- Comment #1 from Uroš Bizjak ---
BTW: If the result of foo is random, then cmove gets badly predicted.
Considering the problems with cmove on x86 (even without bad prediction), the
above optimization can be quite important. Clang does it.
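For illustration, and assuming the optimization meant here is replacing the conditional update with plain logic on the call result, the two forms can be compared as below. The names `bar_branchy` and `bar_or` are hypothetical, and `foo` is given a toy definition only so that the sketch is self-contained:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the external foo (): alternates between
   true and false on successive calls.  */
static bool next = false;
bool foo (void) { next = !next; return next; }

/* Conditional form: a compiler may implement this with a branch or a
   cmove on the result of foo ().  */
bool bar_branchy (bool r)
{
  if (foo ())
    r = true;
  return r;
}

/* Logical form: the result of foo () is simply OR-ed into r, so no
   branch or cmove is needed.  */
bool bar_or (bool r)
{
  return r | foo ();
}
```

Both functions call foo () exactly once and return the same value for the same inputs, so the rewrite preserves side effects.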
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
Following testcase:
--cut here--
#include <stdbool.h>
_Bool foo (void);
_Bool bar (_Bool r)
{
if (foo ())
r = true;
return r;
}
--cut here--
compiles for x86_64 target (-O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110041
Uroš Bizjak changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Version|unknown
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110021
Uroš Bizjak changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110021
Uroš Bizjak changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109982
--- Comment #3 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #1)
> Also fails with "-mtune=znver1 -mavx":
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x004048ef in func_21 (p_22=0x41b330 , p_23=0, p_24=8) at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109982
--- Comment #2 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #1)
> (gdb) p/x $rdx
> $3 = 0x41a824
>
> Unaligned access.
Actually, just a garbage value.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109982
--- Comment #1 from Uroš Bizjak ---
Also fails with "-mtune=znver1 -mavx":
Program received signal SIGSEGV, Segmentation fault.
0x004048ef in func_21 (p_22=0x41b330 , p_23=0, p_24=8) at
runData/keep/in.11.c:597
597 in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91838
Uroš Bizjak changed:
What|Removed |Added
Resolution|FIXED |---
Status|RESOLVED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109866
--- Comment #2 from Uroš Bizjak ---
A small improvement would be:
subl %esi, %edi
je .L5
testl %edi, %edi
jle .L3
jmp h()
.L3:
jmp t()
.L5:
jmp g()
Not to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109866
--- Comment #1 from Uroš Bizjak ---
(In reply to Andrew Pinski from comment #0)
> Take:
> ```
> int g(void); int h(void); int t(void);
> int f(int a, int b)
> {
> int c = a - b;
> if(c == 0)
> return g();
> if (c > 0)
> return
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109807
Uroš Bizjak changed:
What|Removed |Added
CC||jamborm at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109825
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |DUPLICATE
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109807
Uroš Bizjak changed:
What|Removed |Added
CC||slyfox at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109838
--- Comment #2 from Uroš Bizjak ---
*** This bug has been marked as a duplicate of bug 109807 ***
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109807
Uroš Bizjak changed:
What|Removed |Added
Resolution|DUPLICATE |FIXED
--- Comment #13 from Uroš Bizjak
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109807
--- Comment #11 from Uroš Bizjak ---
(In reply to David Binderman from comment #10)
> (In reply to Uroš Bizjak from comment #8)
> > Fixed.
>
> I don't think so. The code I gave seems still to crash the compiler:
Yes, the cost function is
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109807
Uroš Bizjak changed:
What|Removed |Added
Resolution|FIXED |DUPLICATE
--- Comment #9 from Uroš
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109825
Uroš Bizjak changed:
What|Removed |Added
CC||haochen.jiang at intel dot com
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 109797, which changed state.
Bug 109797 Summary: 456.hmmer compiled with -O2 -flto regressed by 15% on AMD
zen3 between r14-487-g6f18f344338b37 and r14-540-gb7fe38c14e5f1b
|--- |14.0
Resolution|--- |FIXED
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
--- Comment #12 from Uroš Bizjak ---
Fixed.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109825
--- Comment #4 from Uroš Bizjak ---
Like this:
--cut here--
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 62fe06fdbaa..e6091b8bd35 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -20417,14 +20417,12
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109825
--- Comment #3 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #2)
> (In reply to Richard Biener from comment #1)
> > I think this was just fixed?
>
> No, the asked mode is V2HImode, so it should also be added.
OTOH, it is a bit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109825
--- Comment #2 from Uroš Bizjak ---
(In reply to Richard Biener from comment #1)
> I think this was just fixed?
No, the asked mode is V2HImode, so it should also be added.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109797
--- Comment #9 from Uroš Bizjak ---
Created attachment 55057
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55057&action=edit
Patch to enable mulv2si for TARGET_SSE4_1 only
The alternative approach is to enable mulv2si for TARGET_SSE4_1 only.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109807
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
|1
Status|UNCONFIRMED |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
--- Comment #6 from Uroš Bizjak ---
Created attachment 55053
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55053&action=edit
Propo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109797
--- Comment #6 from Uroš Bizjak ---
Created attachment 55045
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55045&action=edit
Proposed patch
Proposed patch to fix the ix86_multiplication_cost for non-SSE4 V2SI emulation.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109797
--- Comment #7 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #6)
> Created attachment 55045 [details]
> Proposed patch
>
> Proposed patch to fix the ix86_multiplication_cost for non-SSE4 V2SI
> emulation.
Martin, does this patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109807
--- Comment #5 from Uroš Bizjak ---
(In reply to Haochen Jiang from comment #4)
> (In reply to Uroš Bizjak from comment #2)
> > (In reply to Haochen Jiang from comment #1)
> > > I further checked the reason, V2SI should never dropped into that
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109797
--- Comment #5 from Uroš Bizjak ---
(In reply to Richard Biener from comment #4)
> No, it's indeed plain -O2 with the default architecture level, thus SSE2
> only.
>
> For the case of "complex" expansions we might want to bite the bullet and
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109807
--- Comment #2 from Uroš Bizjak ---
(In reply to Haochen Jiang from comment #1)
> I further checked the reason, V2SI should never dropped into that function
> because we have no pattern under V2SI.
>
> I suppose it is because
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109797
--- Comment #3 from Uroš Bizjak ---
(In reply to Andrew Pinski from comment #1)
> Maybe:
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;
> h=919642fa4b2bc4c32910336dd200d53766801c80
Is this with -msse4? In case of TARGET_SSE4_1 the revision
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109764
--- Comment #3 from Uroš Bizjak ---
(In reply to Richard Biener from comment #2)
> Confirmed. Pattern recog recognizes the widening multiplication but not a
> highpart multiplication. That's currently missing.
Please note that the following
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109764
--- Comment #1 from Uroš Bizjak ---
Created attachment 55017
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55017&action=edit
Patch that adds mulv2si3_highpart expander
The compiler should vectorize the testcase using "mulv2si3_highpart"
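As a sketch of the kind of loop a high-part multiplication pattern targets: each product is widened to 64 bits and only the upper 32 bits are kept. This is one common way to write it and is not claimed to be the exact testcase from the report, which is truncated in this excerpt:

```c
#define N 2

unsigned int ur[N], ua[N], ub[N];

/* High-part multiplication: widen each product to 64 bits and keep
   only the upper 32 bits of the result.  */
void mulh (void)
{
  int i;

  for (i = 0; i < N; i++)
    ur[i] = (unsigned int) (((unsigned long long) ua[i] * ub[i]) >> 32);
}
```

For example, 0x80000000 times 4 is 0x200000000, so the stored high part is 2.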
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
The following testcase:
--cut here--
#define N 2
unsigned int ur[N], ua[N], ub[N];
void mulh (void)
{
int i;
for (i = 0; i < N; i++)
ur[i] = ((unsig
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109690
--- Comment #6 from Uroš Bizjak ---
The missing pattern was committed as part of:
commit r14-493-g919642fa4b2bc4c32910336dd200d53766801c80
Author: Uros Bizjak
Date: Fri May 5 14:10:18 2023 +0200
i386: Introduce mulv2si3 instruction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109690
--- Comment #5 from Uroš Bizjak ---
Created attachment 55002
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55002&action=edit
Patch that introduces mulv2si3
The compiled code with -march=znver1 is now the same as without the flag:
loop:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109733
Uroš Bizjak changed:
What|Removed |Added
Target Milestone|--- |14.0
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109733
Uroš Bizjak changed:
What|Removed |Added
Attachment #54996|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109733
Uroš Bizjak changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109733
--- Comment #1 from Uroš Bizjak ---
The patched compiler just happens to trigger the existing problem where:
(insn 188 416 379 18 (parallel [
(set (reg:SI 72 k4 [orig:121 _114 ] [121])
(ashift:SI (reg:SI 70 k2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66207
--- Comment #13 from Uroš Bizjak ---
(In reply to Richard Biener from comment #11)
> I wonder if we can for simplicity deprecate non EV6 ... does any other
> existing architecture use this functionality?
To be more precise: is there a target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66207
--- Comment #12 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #10)
> (In reply to Uroš Bizjak from comment #6)
> > So, LRA testresults are clean on alphaev68-linux-gnu.
>
> Please note that the above applies to alpha*EV6*, not
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66207
--- Comment #10 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #6)
> So, LRA testresults are clean on alphaev68-linux-gnu.
Please note that the above applies to alpha*EV6*, not plain alpha.
Plain alpha is !BWX architecture and uses
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101096
Uroš Bizjak changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #1