Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
The following testcases should use the VPMOV downconvert instruction with AVX512VL:
void
foo (unsigned short* p1, unsigned short* p2, char
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101058
Uroš Bizjak changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101058
--- Comment #9 from Uroš Bizjak ---
(In reply to Jakub Jelinek from comment #8)
> Though, when this *punpckwd define_insn_and_split handles all possible
> constant permutations for V2HImode, shouldn't ix86_vectorize_vec_perm_const
> say so:
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101058
--- Comment #6 from Uroš Bizjak ---
(In reply to Jakub Jelinek from comment #5)
We can split directly to sse2_pshuflw_1, avoiding mmx_pshufw_1. These two
actually generate the same instruction (PSHUFLW) when XMM registers are
involved.
at gcc dot gnu.org |ubizjak at gmail dot com
Ever confirmed|0 |1
Status|UNCONFIRMED |ASSIGNED
--- Comment #4 from Uroš Bizjak ---
Created attachment 51007
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51007&acti
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101044
--- Comment #1 from Uroš Bizjak ---
The first neg also sets the sign flag (SF) for the following CMOVS.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101021
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101023
Uroš Bizjak changed:
What|Removed |Added
CC||hjl.tools at gmail dot com
--- Comment #1
|1
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
Status|UNCONFIRMED |ASSIGNED
--- Comment #1 from Uroš Bizjak ---
Created attachment 50982
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50982&acti
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
Following testcase:
--cut here--
typedef char S;
typedef S VV __attribute__((vector_size(16 * sizeof(S))));
VV ref_perm_pshufd
gcc dot gnu.org |ubizjak at gmail dot com
Status|UNCONFIRMED |RESOLVED
Target Milestone|--- |12.0
--- Comment #3 from Uroš Bizjak ---
Fixed.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43526
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |INVALID
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100936
--- Comment #1 from Uroš Bizjak ---
Proposed patch:
--cut here--
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 04649b42122..0773a4a9ba8 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13531,7 +13531,7 @
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
Following testcase:
--cut here--
__seg_gs int var = 123;
static int
*foo (void)
{
int *addr;
asm ("lea %p1, %0" : "=r"(addr) : "
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
Currently, the build allows a define_insn RTX without an insn template. It would be
nice to detect this invalid RTX and error out during
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100626
Uroš Bizjak changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100722
Uroš Bizjak changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
|--- |12.0
Ever confirmed|0 |1
Status|UNCONFIRMED |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
--- Comment #1 from Uroš Bizjak ---
Missing push insns for vector modes (the same as
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100696
Uroš Bizjak changed:
What|Removed |Added
CC||rsandifo at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100701
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
at gcc dot gnu.org |ubizjak at gmail dot com
CC|uros at gcc dot gnu.org|
--- Comment #2 from Uroš Bizjak ---
(In reply to Richard Biener from comment #1)
> orq %rdi, %rsi
> pshuflw $0, %xmm3, %xmm0
> movq %xmm0, %rbp
>
-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
Following testcases:
--cut here--
#define N 4
short r[N], a[N], b[N];
unsigned short ur[N], ua[N], ub[N];
void mul (void)
{
int i;
for (i = 0; i < N; i++)
r[i] =
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100626
Uroš Bizjak changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
at gcc dot gnu.org |ubizjak at gmail dot com
Last reconfirmed||2021-05-17
Ever confirmed|0 |1
--- Comment #1 from Uroš Bizjak ---
Created attachment 50822
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50822&action=edi
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
Following testcases involving 4 byte vectors, e.g.:
typedef char __v4qi __attribute__ ((__vector_size__ (4)));
__v4qi foo (__v4qi a, __v4qi b, __v4qi c)
{
return (a & ~b) + c;
}
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100626
--- Comment #3 from Uroš Bizjak ---
*di3_doubleword calls split_double_mode with:
op0: (subreg:DI (reg/v:SI 89 [ li_18 ]) 0)
op1: (reg:DI 90 [ uc_4 ])
op2: (mem/c:DI (plus:SI (reg/f:SI 19 frame)
(const_int -4 [0xfffc]))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98218
--- Comment #16 from Uroš Bizjak ---
(In reply to David Binderman from comment #15)
> Bug first appears sometime between git hash 21dfb22920ce32fc,
> dated yesterday and git hash 097fde5e7514e909, dated today.
Fixed by PR100581.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100581
Uroš Bizjak changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100581
--- Comment #3 from Uroš Bizjak ---
(In reply to Alex Coplan from comment #1)
> Is it valid to create a vector type with total size less than the element
> size? Shouldn't this be rejected?
No, the generated code is:
vmovq ff_b(%rip)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100581
Uroš Bizjak changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98218
--- Comment #13 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #12)
> Yeah, this is a non-existent SSE "cmove". I tried to find all paths where
> this should divert to a sequence of logic instructions or PBLENDB, but due
> to plethora
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98218
--- Comment #12 from Uroš Bizjak ---
(In reply to David Binderman from comment #11)
> I might be seeing something similar:
>
> caxcpy.f: In function 'caxcpy':
> caxcpy.f:53:72: error: unrecognizable insn:
> 53 | end subroutine
> |
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98218
Uroš Bizjak changed:
What|Removed |Added
Assignee|ubizjak at gmail dot com |unassigned at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98375
Bug 98375 depends on bug 98218, which changed state.
Bug 98218 Summary: [TARGET_MMX_WITH_SSE] Implement 64bit vector compares
(AVX512 masked compares missing)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98218
What|Removed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98218
Uroš Bizjak changed:
What|Removed |Added
Summary|[TARGET_MMX_WITH_SSE] Miss |[TARGET_MMX_WITH_SSE]
|v
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100461
Uroš Bizjak changed:
What|Removed |Added
CC||hjl.tools at gmail dot com
--- Comment #1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100445
--- Comment #10 from Uroš Bizjak ---
Following patch fixes the failures:
--cut here--
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 4dfe7d6c282..61b2f921f41 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100445
--- Comment #9 from Uroš Bizjak ---
ix86_use_mask_cmp_p should be refined; it has an early return for 64bit modes:
if (GET_MODE_SIZE (mode) == 64)
return true;
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100445
--- Comment #6 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #5)
> ix86_expand_sse_movcc has special TARGET_XOP path, so the following patch is
> needed:
Ah, you beat me by the second ;)
Anyway, I have no XOP target, so probably y
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100445
--- Comment #5 from Uroš Bizjak ---
ix86_expand_sse_movcc has special TARGET_XOP path, so the following patch is
needed:
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 347295afbb5..667dd057e0d 100644
--- a/gcc/config/i386/mm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98218
Uroš Bizjak changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98375
Bug 98375 depends on bug 98218, which changed state.
Bug 98218 Summary: [TARGET_MMX_WITH_SSE] Miss vec_cmpmn/vcondmn expander for
64bit vector
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98218
What|Removed |Ad
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342
--- Comment #8 from Uroš Bizjak ---
FYI, this whole analysis was done with Fedora 33 system compiler:
gcc version 10.3.1 20210422 (Red Hat 10.3.1-1) (GCC)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342
--- Comment #7 from Uroš Bizjak ---
I have traced a bit where (insn 2275) and (insn 2287) come from.
In _.ira, we have:
613: r125:QI=r2067:DI#0
...
659: zero_extract(r2080:DI,0x8,0x8)=r125:QI#0
And in _.reload, a DImode reload is insert
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342
--- Comment #5 from Uroš Bizjak ---
The problem can be seen in _.pro_and_epilogue pass:
Starting with:
_.cmpelim
2741: r14:DI=[sp:DI+0x38]
...
368: di:DI=r14:DI
...
613: si:QI=r14:QI
...
2737: bp:DI=r14:DI
...
658: strict_low_
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342
--- Comment #4 from Uroš Bizjak ---
The problematic insn is:
401cec: 44 89 f6 mov %r14d,%esi
This one should be 64 bit wide,
movl %r14d, %esi # 613 [c=4 l=3] *movqi_internal/2
but is actually a QIm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342
--- Comment #3 from Uroš Bizjak ---
For some reason the *input* value at BSWAP insn is truncated to 32bits.
v256u128 v256u128_1 =
SHLV (SHLSV (__builtin_bswap64 (u128_0), (v256u128) (0 < v256u128_0)) <=
0, v256u128_0);
u128_0 i
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100355
--- Comment #3 from Uroš Bizjak ---
(In reply to Christophe Lyon from comment #2)
> Tried that, but it's not taken into account.
>
> ieee.exp uses c-torture-execute, maybe that function does not honor dg
> directives? (none of the tests under i
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98375
Bug 98375 depends on bug 98060, which changed state.
Bug 98060 Summary: Failure to optimize cmp+setnb+add to cmp+sbb
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98060
What|Removed |Added
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98060
Uroš Bizjak changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100312
Uroš Bizjak changed:
What|Removed |Added
Assignee|rguenth at gcc dot gnu.org |ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82735
--- Comment #11 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #9)
> (In reply to Richard Biener from comment #4)
> > Indeed as far as I understand an unspec volatile isn't sth clobbering
> > registers (not even memory?!). The insn i
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82735
--- Comment #9 from Uroš Bizjak ---
(In reply to Richard Biener from comment #4)
> Indeed as far as I understand an unspec volatile isn't sth clobbering
> registers (not even memory?!). The insn is missing inputs/outputs
> (we might be able to m
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82735
--- Comment #8 from Uroš Bizjak ---
(In reply to Hongtao.liu from comment #7)
> Confirmed, let me fix this.
Please note that the current definition of vzeroupper does not model the effects of
the instruction at all. The current definition is intende
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100041
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182
Uroš Bizjak changed:
What|Removed |Added
Attachment #50649|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182
--- Comment #17 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #16)
> (In reply to Jakub Jelinek from comment #15)
> > Yes, but do they preserve all the bits and never modify any bit patterns,
> > including qNaNs and sNaNs? I though
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182
--- Comment #16 from Uroš Bizjak ---
(In reply to Jakub Jelinek from comment #15)
> Yes, but do they preserve all the bits and never modify any bit patterns,
> including qNaNs and sNaNs? I thought the point of using the fistp was that
> it pres
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182
--- Comment #14 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #13)
> DFmode loads and stores *are* atomic, this is what the optimization is based
> on.
Loads and stores to/from x87 and SSE registers, to be clear.
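For readers who want something concrete to compile, here is a minimal sketch of a 64-bit atomic store and load in C11. It is an illustration only and not a testcase from this bug; the names `shared_value`, `publish_value` and `observe_value` are hypothetical. How GCC lowers these accesses depends on the target and options. The comment above only says that 8-byte loads and stores through x87 and SSE registers are atomic, which is what allows a single access to be used on a 32-bit target.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared variable; not taken from the bug report. */
_Atomic int64_t shared_value;

/* Store all 64 bits as one atomic access. */
void publish_value(int64_t v)
{
    atomic_store_explicit(&shared_value, v, memory_order_seq_cst);
}

/* Load all 64 bits as one atomic access. */
int64_t observe_value(void)
{
    return atomic_load_explicit(&shared_value, memory_order_seq_cst);
}
```

Compiling this with something like -m32 -O2 -S and inspecting the assembly is one way to see which instruction sequence a given GCC version picks.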
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182
--- Comment #13 from Uroš Bizjak ---
(In reply to Jakub Jelinek from comment #12)
> They do. Though, in the combined patch I'm still a little bit worried about
> the first 4 modified peephole2s, the last 4 look good to me.
> The last 4 are wher
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182
--- Comment #11 from Uroš Bizjak ---
Jakub, do these two patches fix your failures?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182
--- Comment #10 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #9)
> (In reply to Jakub Jelinek from comment #8)
> > I think there are 8 those peephole2s rather than just 4 (I've been looking
> > for
> > rtx_equal_p (XEXP.*, 0) in sy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182
--- Comment #9 from Uroš Bizjak ---
(In reply to Jakub Jelinek from comment #8)
> I think there are 8 those peephole2s rather than just 4 (I've been looking
> for
> rtx_equal_p (XEXP.*, 0) in sync.md
No, the others are not problematic.
dot gnu.org|
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
--- Comment #7 from Uroš Bizjak ---
(In reply to Jakub Jelinek from comment #1)
> In this particular case it is the sync.md:398 peephole2:
> (define_peephole2
> [(set (match_ope
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100119
--- Comment #2 from Uroš Bizjak ---
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index dda08ff67f2..5a7a00c13bd 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -1550,6 +1550,8 @@ ix
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100041
Uroš Bizjak changed:
What|Removed |Added
Target Milestone|11.0|12.0
--- Comment #20 from Uroš Bizjak --
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100041
--- Comment #18 from Uroš Bizjak ---
(In reply to Jakub Jelinek from comment #17)
> Can we go with #c15 for GCC11 and do #c16 for GCC12?
I'd like to kill the option for GCC11, and the solution is safer than #c15.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100041
Uroš Bizjak changed:
What|Removed |Added
Target|x86_64-linux-musl |x86_64
Target Milestone|---
at gcc dot gnu.org |ubizjak at gmail dot com
--- Comment #16 from Uroš Bizjak ---
Created attachment 50568
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50568&action=edit
Proposed patch
Attached patch disables -m96bit-long-double for 64-bit targets.
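A quick way to see what the option affects is to look at the storage size of long double. The sketch below is only an illustration; the helper name `long_double_bytes` is hypothetical. On 32-bit x86 Linux the value is typically 12 (the 96-bit layout), and on x86-64 it is typically 16.

```c
#include <stddef.h>

/* Hypothetical helper: storage size of long double in bytes for the
   target and options this file was compiled with. */
size_t long_double_bytes(void)
{
    return sizeof(long double);
}
```

Building the same file with -m32 and with -m64 and comparing the results shows the difference between the two layouts.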
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100041
--- Comment #15 from Uroš Bizjak ---
(In reply to Richard Biener from comment #12)
> A possible solution might be to disallow the -m64 -m96bit-long-double
> combination, the documentation suggests -m128bit-long-double was intended
> as an "optim
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100041
--- Comment #13 from Uroš Bizjak ---
See PR79514.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100021
--- Comment #2 from Uroš Bizjak ---
Also, you are passing -march=sandybridge, but the profiler seems to show
Skylake (SKX) target. The STV pass heavily depends on target costs, and when
-march=skylake is passed, the conversion is avoided.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100021
--- Comment #1 from Uroš Bizjak ---
This is not vectorization; the compiler uses vector registers to perform
scalar operations. This is the STV (scalar-to-vector) pass in action; you can use
-mno-stv to avoid the transformation.
The transformation
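As a rough illustration of the kind of scalar code STV can pick up, here is a minimal sketch. It is not the reporter's testcase, and the names `stv_a`, `stv_b`, `stv_r` and `stv_andnot` are hypothetical. Whether the conversion actually happens depends on the target costs selected by -march and -mtune.

```c
#include <stdint.h>

/* Hypothetical globals; not taken from the bug report. */
uint64_t stv_a, stv_b, stv_r;

/* 64-bit scalar logic on memory operands. On a 32-bit x86 target with
   SSE2 enabled (for example -m32 -O2 -msse2), GCC may carry this out in
   a vector register instead of a pair of 32-bit integer registers.
   Compiling with -mno-stv keeps it in integer registers. */
void stv_andnot(void)
{
    stv_r = stv_a & ~stv_b;
}
```

Comparing the assembly produced with and without -mno-stv shows whether the pass fired for a given set of options.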
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99930
--- Comment #6 from Uroš Bizjak ---
(In reply to Jakub Jelinek from comment #4)
> Is there some reason why the patterns are written that way rather than split
> immediately into the AND or XOR? Perhaps it could be done on SUBREGs to
> make it va
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99652
--- Comment #5 from Uroš Bizjak ---
inline long double
foo (void)
{
return 1.0;
}
gcc -S -O2 -mno-80387 double.c
double.c: In function ‘foo’:
double.c:3:1: error: x87 register return with x87 disabled
3 | {
| ^
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99601
--- Comment #3 from Uroš Bizjak ---
(In reply to CVS Commits from comment #1)
> The master branch has been updated by Nathan Sidwell :
>
> https://gcc.gnu.org/g:770d3487ef18a71f65626c182625889eee29f580
There is a typo in the selector:
+// { dg-
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856
--- Comment #34 from Uroš Bizjak ---
(In reply to rguent...@suse.de from comment #32)
> what about reload_completed? We really only want to do this after RA.
No need for it; this is the peephole2 pass, which *always* runs after reload.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99405
--- Comment #2 from Uroš Bizjak ---
(In reply to Jakub Jelinek from comment #1)
> Created attachment 50306 [details]
> gcc11-pr99405.patch
>
> Untested fix.
- (match_operand:SI 2 "register_operand" "c")
+ (match_operand:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856
--- Comment #31 from Uroš Bizjak ---
(In reply to Richard Biener from comment #29)
> The simplified variant below works but IMHO matches cases we do not
> want to transform. I can't find any example on how to achieve that
> though.
I think that
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856
--- Comment #28 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #27)
> (In reply to Richard Biener from comment #26)
> > but that doesn't seem to match for some unknown reason.
> Try this:
The latency problem with the original testca
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856
--- Comment #27 from Uroš Bizjak ---
(In reply to Richard Biener from comment #26)
> but that doesn't seem to match for some unknown reason.
Try this:
(define_peephole2
[(match_scratch:DI 5 "Yv")
(set (match_operand:DI 0 "sse_reg_operand")
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856
--- Comment #24 from Uroš Bizjak ---
(In reply to Richard Biener from comment #22)
> That works to avoid the vpinsrq. I guess the case of a mem operand
> behaves similar to a gpr (plus the load uop), at least I don't have any
> contrary evidenc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856
--- Comment #21 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #20)
> (In reply to Richard Biener from comment #18)
> > Even on Skylake it's 2 (movq) + 3 (vpinsr), so there it's 6 vs. 3. Not
> > sure if we should somehow do this late
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856
--- Comment #20 from Uroš Bizjak ---
(In reply to Richard Biener from comment #18)
> Even on Skylake it's 2 (movq) + 3 (vpinsr), so there it's 6 vs. 3. Not
> sure if we should somehow do this late somehow (peephole or splitter) since
> it requir
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 99083, which changed state.
Bug 99083 Summary: Big run-time regressions of 519.lbm_r with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083
What|Removed |Added
-
||patch
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
Resolution|FIXED |---
--- Comment #13 from Uroš Bizjak ---
(In reply to Martin Jambor from comment #12)
> For the record, I have benchmarked the patches f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 99083, which changed state.
Bug 99083 Summary: Big run-time regressions of 519.lbm_r with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083
What|Removed |Added
-
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083
Uroš Bizjak changed:
What|Removed |Added
Target Milestone|--- |11.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083
Uroš Bizjak changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99115
--- Comment #4 from Uroš Bizjak ---
Compiles OK with:
GNU C++14 (GCC) version 8.4.1 20210216 [releases/gcc-8 revision
c6513400d84:39c49bc104d:1f3a07da9b6bcfa4733750826746bd18ac6f20db]
(alpha-unknown-openbsd6.8)
built as a cross from x86_64-linu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99115
Uroš Bizjak changed:
What|Removed |Added
Known to work||11.0
--- Comment #3 from Uroš Bizjak ---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083
--- Comment #10 from Uroš Bizjak ---
(In reply to Richard Biener from comment #7)
> There are a lot of targets that define REG_ALLOC_ORDER ^
> HONOR_REG_ALLOC_ORDER and thus are affected by this change...
The following patch should solve this is
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083
--- Comment #8 from Uroš Bizjak ---
(In reply to Richard Biener from comment #7)
> Btw, for GCC 11 it might be tempting to simply revert the "no-op" change?
I agree, this is the safest way at this time. The situation now looks like
going into ra
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083
--- Comment #6 from Uroš Bizjak ---
As a side note, it is strange that ADJUST_REG_ALLOC_ORDER somehow requires
REG_ALLOC_ORDER to be defined (cf. Comment #3), while its documentation says:
The macro body should not assume anything about the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083
--- Comment #5 from Uroš Bizjak ---
Martin, can you please benchmark the patch from Comment #4?
The patch is not totally trivial, because it introduces HONOR_REG_ALLOC_ORDER
to x86 and this define disables some other code in ira-color.c,
assign_
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083
--- Comment #4 from Uroš Bizjak ---
Created attachment 50185
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50185&action=edit
Proposed patch
Proposed patch that fixes ira-color.c and introduces HONOR_REG_ALLOC_ORDER.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083
--- Comment #3 from Uroš Bizjak ---
It looks to me like another one is in reload1.c, find_reg:
if (this_cost < best_cost
/* Among registers with equal cost, prefer caller-saved ones, or
use REG_ALLOC_ORDER if
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083
--- Comment #1 from Uroš Bizjak ---
This should be a no-op. According to the documentation:
--q--
Macro: REG_ALLOC_ORDER
If defined, an initializer for a vector of integers, containing the numbers
of hard registers in the order in which GCC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99025
--- Comment #2 from Uroš Bizjak ---
Comment on attachment 50154
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50154
gcc11-pr99025.patch
>2021-02-09 Jakub Jelinek
>+ if (SUBREG_P (operands[1]))
>+operands[1] = force_reg (V2SFmode,