https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105935
--- Comment #4 from Uroš Bizjak ---
(In reply to Francois-Xavier Coudert from comment #3)
> > These two functions are available from libiberty.
>
> Are we linking runtime libraries like libgfortran against libiberty? I
> thought that was only f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105935
--- Comment #2 from Uroš Bizjak ---
(In reply to Francois-Xavier Coudert from comment #1)
> Created attachment 55363 [details]
> Proposed patch
>
> The issue is real, but I would suggest that snprintf() and vsnprintf()
> should always be availa
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206
--- Comment #7 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #4)
> cprop1 pass does not consider paradoxical subreg and for (insn 22) claims
> that it equals 8 elements of QImode:
8 elements of HImode.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206
--- Comment #6 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #3)
> However, VPMULLW needs all 8 QImode elements, but %xmm4 only has 4 loaded;
To be consistent, VPSRLVW and VPMULLW use HImode elements.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206
Uroš Bizjak changed:
What|Removed |Added
Component|target |rtl-optimization
--- Comment #5 from Uroš
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206
--- Comment #4 from Uroš Bizjak ---
cprop1 pass does not consider paradoxical subreg and for (insn 22) claims that
it equals 8 elements of QImode:
(insn 21 19 22 4 (set (reg:V4QI 98)
(mem/u/c:V4QI (symbol_ref/u:DI ("*.LC1") [flags 0x2])
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206
--- Comment #3 from Uroš Bizjak ---
Here is the problem:
vmovd .LC1(%rip), %xmm4 # 21[c=4 l=10] *movv4qi_internal/4
...
vpmovzxbw %xmm4, %xmm4 # 22[c=10 l=6] sse4_1_zero_extendv8qiv8hi2/2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110087
--- Comment #9 from Uroš Bizjak ---
(In reply to Andrew Pinski from comment #8)
> Please file this separately, since it is a different issue.
PR110155.
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
Following testcase:
--cut here--
#include
_Bool foo (void);
int bar (int r)
{
if (foo ())
r++;
return r;
}
--cut here--
compiles (gcc -O2) to:
movl %edi, %ebx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110087
--- Comment #7 from Uroš Bizjak ---
Similar conversion, not performed by gcc:
--cut here--
#include
_Bool foo (void);
int bar (int r)
{
if (foo ())
r++;
return r;
}
--cut here--
gcc -O2:
movl %edi, %ebx
call
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110089
--- Comment #5 from Uroš Bizjak ---
The important pattern in i386.md is *sub2, which allows CCGOCmode
compare. This means that garbage in the Overflow and Carry flags is allowed.
In ix86_cc_modes_compatible, CCmode is returned for combination of CC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110087
--- Comment #1 from Uroš Bizjak ---
BTW: If the result of foo is random, then cmove gets badly predicted.
Considering the problems with cmove on x86 (even without bad prediction), the
above optimization can be quite important. Clang does it.
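For illustration only, here is a minimal sketch of the kind of branch-free rewrite this discussion appears to be about. It assumes the conversion in question replaces the conditional update with a bitwise OR (for the _Bool testcase) or an addition (for the int testcase). The foo() stub, the foo_result variable, and the function names bar_branch, bar_or, inc_branch and inc_add are hypothetical stand-ins and do not come from the bug report.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the external foo() of the testcases;
   foo_result lets a caller choose what foo() returns.  */
static bool foo_result;
static bool foo (void) { return foo_result; }

/* Branchy form of the _Bool testcase.  */
static bool bar_branch (bool r)
{
  if (foo ())
    r = true;
  return r;
}

/* Assumed branch-free equivalent: OR the flag into r.  */
static bool bar_or (bool r)
{
  return r | foo ();
}

/* Branchy form of the int testcase.  */
static int inc_branch (int r)
{
  if (foo ())
    r++;
  return r;
}

/* Assumed branch-free equivalent: add the 0/1 result of foo().  */
static int inc_add (int r)
{
  return r + foo ();
}
```

Both pairs call foo() exactly once on every path, so the rewrite does not change side effects. Whether the branch-free form is actually faster depends on the target and on how predictable foo() is.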
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
Following testcase:
--cut here--
#include <stdbool.h>
_Bool foo (void);
_Bool bar (_Bool r)
{
if (foo ())
r = true;
return r;
}
--cut here--
compiles for x86_64 target (-O2) to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110041
Uroš Bizjak changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Version|unknown
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110021
Uroš Bizjak changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110021
Uroš Bizjak changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109982
--- Comment #3 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #1)
> Also fails with "-mtune=znver1 -mavx":
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x004048ef in func_21 (p_22=0x41b330 , p_23=0, p_24=8) at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109982
--- Comment #2 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #1)
> (gdb) p/x $rdx
> $3 = 0x41a824
>
> Unaligned access.
Actually, just a garbage value.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109982
--- Comment #1 from Uroš Bizjak ---
Also fails with "-mtune=znver1 -mavx":
Program received signal SIGSEGV, Segmentation fault.
0x004048ef in func_21 (p_22=0x41b330 , p_23=0, p_24=8) at
runData/keep/in.11.c:597
597 in runData/keep/i
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91838
Uroš Bizjak changed:
What|Removed |Added
Resolution|FIXED |---
Status|RESOLVED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109866
--- Comment #2 from Uroš Bizjak ---
A small improvement would be:
subl %esi, %edi
je .L5
testl %edi, %edi
jle .L3
jmp h()
.L3:
jmp t()
.L5:
jmp g()
Not to mentio
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109866
--- Comment #1 from Uroš Bizjak ---
(In reply to Andrew Pinski from comment #0)
> Take:
> ```
> int g(void); int h(void); int t(void);
> int f(int a, int b)
> {
> int c = a - b;
> if(c == 0)
> return g();
> if (c > 0)
> return h();
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109807
Uroš Bizjak changed:
What|Removed |Added
CC||jamborm at gcc dot gnu.org
--- Comment #1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109825
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |DUPLICATE
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109807
Uroš Bizjak changed:
What|Removed |Added
CC||slyfox at gcc dot gnu.org
--- Comment #14
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109838
--- Comment #2 from Uroš Bizjak ---
*** This bug has been marked as a duplicate of bug 109807 ***
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109807
Uroš Bizjak changed:
What|Removed |Added
Resolution|DUPLICATE |FIXED
--- Comment #13 from Uroš Bizjak -
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109807
--- Comment #11 from Uroš Bizjak ---
(In reply to David Binderman from comment #10)
> (In reply to Uroš Bizjak from comment #8)
> > Fixed.
>
> I don't think so. The code I gave seems still to crash the compiler:
Yes, the cost function is ICEin
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109807
Uroš Bizjak changed:
What|Removed |Added
Resolution|FIXED |DUPLICATE
--- Comment #9 from Uroš Bizjak
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109825
Uroš Bizjak changed:
What|Removed |Added
CC||haochen.jiang at intel dot com
--- Commen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 109797, which changed state.
Bug 109797 Summary: 456.hmmer compiled with -O2 -flto regressed by 15% on AMD
zen3 between r14-487-g6f18f344338b37 and r14-540-gb7fe38c14e5f1b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=1097
|--- |14.0
Resolution|--- |FIXED
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
--- Comment #12 from Uroš Bizjak ---
Fixed.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109825
--- Comment #4 from Uroš Bizjak ---
Like this:
--cut here--
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 62fe06fdbaa..e6091b8bd35 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -20417,14 +20417,12
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109825
--- Comment #3 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #2)
> (In reply to Richard Biener from comment #1)
> > I think this was just fixed?
>
> No, the asked mode is V2HImode, so it should also be added.
OTOH, it is a bit str
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109825
--- Comment #2 from Uroš Bizjak ---
(In reply to Richard Biener from comment #1)
> I think this was just fixed?
No, the asked mode is V2HImode, so it should also be added.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109797
--- Comment #9 from Uroš Bizjak ---
Created attachment 55057
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55057&action=edit
Patch to enable mulv2si for TARGET_SSE4_1 only
The alternative approach is to enable mulv2si for TARGET_SSE4_1 o
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109807
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
|1
Status|UNCONFIRMED |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
--- Comment #6 from Uroš Bizjak ---
Created attachment 55053
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55053&acti
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109797
--- Comment #6 from Uroš Bizjak ---
Created attachment 55045
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55045&action=edit
Proposed patch
Proposed patch to fix the ix86_multiplication_cost for non-SSE4 V2SI emulation.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109797
--- Comment #7 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #6)
> Created attachment 55045 [details]
> Proposed patch
>
> Proposed patch to fix the ix86_multiplication_cost for non-SSE4 V2SI
> emulation.
Martin, does this patch f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109807
--- Comment #5 from Uroš Bizjak ---
(In reply to Haochen Jiang from comment #4)
> (In reply to Uroš Bizjak from comment #2)
> > (In reply to Haochen Jiang from comment #1)
> > > I further checked the reason, V2SI should never dropped into that f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109797
--- Comment #5 from Uroš Bizjak ---
(In reply to Richard Biener from comment #4)
> No, it's indeed plain -O2 with the default architecture level, thus SSE2
> only.
>
> For the case of "complex" expansions we might want to bite the bullet and
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109807
--- Comment #2 from Uroš Bizjak ---
(In reply to Haochen Jiang from comment #1)
> I further checked the reason, V2SI should never dropped into that function
> because we have no pattern under V2SI.
>
> I suppose it is because -march=cascadelake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109797
--- Comment #3 from Uroš Bizjak ---
(In reply to Andrew Pinski from comment #1)
> Maybe:
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;
> h=919642fa4b2bc4c32910336dd200d53766801c80
Is this with -msse4? In case of TARGET_SSE4_1 the revision just
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109764
--- Comment #3 from Uroš Bizjak ---
(In reply to Richard Biener from comment #2)
> Confirmed. Pattern recog recognizes the widening multiplication but not a
> highpart multiplication. That's currently missing.
Please note that the following t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109764
--- Comment #1 from Uroš Bizjak ---
Created attachment 55017
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55017&action=edit
Patch that adds mulv2si3_highpart expander
The compiler should vectorize the testcase using "mulv2si3_highpart"
: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
The following testcase:
--cut here--
#define N 2
unsigned int ur[N], ua[N], ub[N];
void mulh (void)
{
int i;
for (i = 0; i < N; i++)
ur[i] = ((unsig
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109690
--- Comment #6 from Uroš Bizjak ---
The missing pattern was committed as part of:
commit r14-493-g919642fa4b2bc4c32910336dd200d53766801c80
Author: Uros Bizjak
Date: Fri May 5 14:10:18 2023 +0200
i386: Introduce mulv2si3 instruction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109690
--- Comment #5 from Uroš Bizjak ---
Created attachment 55002
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55002&action=edit
Patch that introduces mulv2si3
The compiled code with -march=znver1 is now the same as without the flag:
loop:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109733
Uroš Bizjak changed:
What|Removed |Added
Target Milestone|--- |14.0
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109733
Uroš Bizjak changed:
What|Removed |Added
Attachment #54996|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109733
Uroš Bizjak changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109733
--- Comment #1 from Uroš Bizjak ---
The patched compiler just happens to trigger the existing problem where:
(insn 188 416 379 18 (parallel [
(set (reg:SI 72 k4 [orig:121 _114 ] [121])
(ashift:SI (reg:SI 70 k2 [orig:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66207
--- Comment #13 from Uroš Bizjak ---
(In reply to Richard Biener from comment #11)
> I wonder if we can for simplicity deprecate non EV6 ... does any other
> existing architecture use this functionality?
To be more precise: is there a target tha
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66207
--- Comment #12 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #10)
> (In reply to Uroš Bizjak from comment #6)
> > So, LRA testresults are clean on alphaev68-linux-gnu.
>
> Please note that the above applies to alpha*EV6*, not plain
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66207
--- Comment #10 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #6)
> So, LRA testresults are clean on alphaev68-linux-gnu.
Please note that the above applies to alpha*EV6*, not plain alpha.
Plain alpha is !BWX architecture and uses
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101096
Uroš Bizjak changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109549
--- Comment #7 from Uroš Bizjak ---
(In reply to Andrew Pinski from comment #2)
> Note for some x86 cores having 2 or more cmove back to back is worse than a
> conditional jump so maybe the testcase is now catching what it should happen
> ...
P
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94908
--- Comment #12 from Uroš Bizjak ---
Implemented also for x86.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109483
--- Comment #4 from Uroš Bizjak ---
(In reply to Richard Biener from comment #2)
> Note that clang seems to propagate the constant equivalence which we
> instead un-propagate. With -fdisable-tree-uncprop1 you'll get the
> expected code:
>
> f
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
Following testcase (int3 mnemonic is for marker only):
--cut here--
_Bool foo (int cnt)
{
if (cnt == -1)
{
_Bool success;
asm volatile
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109351
--- Comment #2 from Uroš Bizjak ---
Patch at https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615074.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109351
Uroš Bizjak changed:
What|Removed |Added
CC||vmakarov at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85048
--- Comment #12 from Uroš Bizjak ---
(In reply to Hongtao.liu from comment #9)
> With the patch, we can generate optimized code expect for those 16 {u,}qq
> cases, since the ABI doesn't support 1024-bit vector.
Can't these be vectorized using pa
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109276
--- Comment #17 from Uroš Bizjak ---
(In reply to Jakub Jelinek from comment #16)
> (In reply to Uroš Bizjak from comment #15)
> > (In reply to Jakub Jelinek from comment #13)
> > > asks for a DImode stack slot, ix86_local_alignment newly doesn'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109276
--- Comment #15 from Uroš Bizjak ---
(In reply to Jakub Jelinek from comment #13)
> asks for a DImode stack slot, ix86_local_alignment newly doesn't lower the
> alignment
> which isn't good for -mpreferred-stack-boundary=2.
IIRC, DImode FILD/FI
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
This warning/error is actually emitted when compiling
drivers/infiniband/core/user_mad.c linux source file.
The testcase:
--cut here
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109290
--- Comment #2 from Uroš Bizjak ---
Created attachment 54761
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54761&action=edit
Minimized testcase
-Warray-bounds -fno-delete-null-pointer-checks -O2
In function ‘btrfs_show_u64’,
inlined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109290
--- Comment #1 from Uroš Bizjak ---
Created attachment 54760
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54760&action=edit
Preprocessed file
-Warray-bounds -O2 -fno-strict-aliasing -fcf-protection=branch
-fno-delete-null-pointer-checks
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
Happens while compiling recent linux kernel. Several instances of ... in the
same place:
In function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109233
--- Comment #13 from Uroš Bizjak ---
(In reply to Martin Liška from comment #7)
> Note, the linux kernel disables the -Werror of the warning for GCC 11 and 12:
> https://github.com/torvalds/linux/blob/
> a1effab7a3a35a837dd9d2b974a1bc4939df1ad5/
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109233
Uroš Bizjak changed:
What|Removed |Added
Attachment #54729|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109233
--- Comment #5 from Uroš Bizjak ---
Created attachment 54731
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54731&action=edit
Even more minimized testcase.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109233
--- Comment #3 from Uroš Bizjak ---
Created attachment 54729
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54729&action=edit
Minimized testcase
WIP, but *substantially* minimized.
gcc -O2 -Warray-bounds:
tg3-6.c: In function ‘tg3_init_
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109233
--- Comment #2 from Uroš Bizjak ---
As can be seen from the preprocessed file, tp->irq_max is set to:
tp->irq_max = 1;
or
tp->irq_max = (4 + 1);
and the compilation warns in tg3_init_one at:
for (i = 0; i < tp->irq_max; i++) {
struct
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109233
--- Comment #1 from Uroš Bizjak ---
Created attachment 54719
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54719&action=edit
Preprocessed file
-O2 -Warray-bounds:
In function ‘tg3_init_one’,
inlined from ‘tg3_init_one’ at
drivers/ne
: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
There is another bogus array bounds warning when compiling linux in:
drivers/net/ethernet/broadcom/tg3.c: In function ‘tg3_init_one
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109215
--- Comment #1 from Uroš Bizjak ---
The minimized testcase:
--cut here--
#define SB_FREEZE_COMPLETE 4
struct lock_class_key { };
struct file_system_type {
struct lock_class_key s_writers_key[(SB_FREEZE_COMPLETE - 1)];
struct lock_class_key
: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
The linux kernel compile fails with gcc-13 in super.c with:
fs/super.c: In function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109052
--- Comment #5 from Uroš Bizjak ---
(In reply to Vladimir Makarov from comment #4)
> So I think the current patch is probably an adequate solution.
Perhaps the compiler should also try to swap input operands to fit the combined
insn when commu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109092
--- Comment #5 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #2)
> (In reply to Andrew Pinski from comment #1)
>
> > The issue is register_operand accepts subreg but then REGNO is checked on
> > it.
> > That is obviously wrong. It
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109092
--- Comment #2 from Uroš Bizjak ---
(In reply to Andrew Pinski from comment #1)
> The issue is register_operand accepts subreg but then REGNO is checked on it.
> That is obviously wrong. It should be "REG_P (operands[1]) && REGNO
> (operands[1]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109088
--- Comment #1 from Uroš Bizjak ---
Please read https://gcc.gnu.org/bugs/ on how to correctly report a bug.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109079
Uroš Bizjak changed:
What|Removed |Added
Component|target |rtl-optimization
--- Comment #2 from Uroš
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94908
Uroš Bizjak changed:
What|Removed |Added
Attachment #54607|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94908
--- Comment #9 from Uroš Bizjak ---
(In reply to Hongtao.liu from comment #8)
> I'm thinking of something like below so it can be matched both by
> expand_vselect_vconcat in ix86_expand_vec_perm_const_1 and patterns created
> by pass_combine(the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94908
--- Comment #7 from Uroš Bizjak ---
Created attachment 54607
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54607&action=edit
Proposed patch
Patch in testing.
Attached patch produces (-O2 -msse4.1):
f:
subq $24, %rsp
x
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109052
--- Comment #2 from Uroš Bizjak ---
The original testcase is:
double foo (double a, double b)
{
double z = __builtin_fmod (a, 3.14);
return z * b;
}
-O2 -fno-math-errno:
foo:
fldl .LC0(%rip)
movsd %xmm0, -8(%rsp)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109052
Uroš Bizjak changed:
What|Removed |Added
CC||vmakarov at gcc dot gnu.org
Key
-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
Following testcase:
--cut here--
double foo (double a)
{
double tmp = a;
asm ("" : "+t" (tmp));
return a * tmp;
}
--cut here--
compiles wi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109028
--- Comment #3 from Uroš Bizjak ---
(In reply to Andrew Pinski from comment #1)
> X87 code generation is definitely not as optimized as other code really.
You are wrong here.
> Also fcmov is newish.
It is because fcmov would require two memor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108922
Uroš Bizjak changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108922
--- Comment #30 from Uroš Bizjak ---
(In reply to Jakub Jelinek from comment #29)
> Note, fmod_optab is only used on i?86 (where because of the commit mentioned
> here it was limited to finite math only) and rs6000 (which guards it on
> unsafe m
|ASSIGNED
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
--- Comment #28 from Uroš Bizjak ---
I think that we cleared all questions here. I'll prepare the revert later
today.
On a related note, it would be nice if Intel corrected the table 3-30
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108922
--- Comment #26 from Uroš Bizjak ---
(In reply to Jan Kratochvil from comment #23)
> Created attachment 54542 [details]
> fmoderrno.cpp
>
> (In reply to Uroš Bizjak from comment #21)
> > When g:93ba85fdd253b4b9cf2b9e54e8e5969b1a3db098 is revert
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108922
--- Comment #21 from Uroš Bizjak ---
(In reply to Alexander Monakov from comment #19)
> I get the feeling that you're ignoring me, but gcc-4.8.3 was already
> emitting a helper fmod call for setting errno without any flag_errno_math
> checks in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108922
--- Comment #20 from Uroš Bizjak ---
(In reply to Jakub Jelinek from comment #16)
> More questionable is the #Z case, where Table 8-11 just talks about
> Divide or reverse divide operation Returns an ∞ signed with the exclusive
> OR of the
> w
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108922
--- Comment #17 from Uroš Bizjak ---
(In reply to Jakub Jelinek from comment #16)
> Doesn't the SDM guarantee the right behavior though?
Indeed, this is what is missing from Table 3-31.
> It is true that the FPREM results table says * and ** i
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108922
--- Comment #14 from Uroš Bizjak ---
(In reply to Jan Kratochvil from comment #13)
> The question is whether gcc can rely on the undocumented Intel behavior as
> described in Comment 7. glibc already relies on it anyway.
I don't think this is t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108922
--- Comment #12 from Uroš Bizjak ---
(In reply to Jan Kratochvil from comment #8)
> The revert makes it 13x faster. But the produced code still falls back to
> calling glibc fmod() as shown in the disassembly in Comment 0.
> If I use the "fprem