[Bug target/101096] New: AVX512 VPMOV instruction should be used to downconvert vectors

2021-06-16 Thread ubizjak at gmail dot com via Gcc-bugs
Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: ubizjak at gmail dot com Target Milestone: --- Following testcases should use VPMOV downconvert instruction with AVX512VL: void foo (unsigned short* p1, unsigned short* p2, char
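The testcase in this snippet is cut off, so the following is only a minimal sketch of the kind of narrowing the report is about: truncating 16-bit elements to 8-bit ones, which a VPMOV-family instruction (VPMOVWB) can do for a whole vector at once. The name `narrow_u16_to_u8` and its signature are assumptions for illustration and are not taken from the bug report. Whether GCC emits VPMOV for such a loop depends on the compiler version and on flags such as `-mavx512vl -mavx512bw`.

```c
#include <stddef.h>

/* Hypothetical helper (not from the bug report):
   truncate each 16-bit element to its low 8 bits. */
void narrow_u16_to_u8(const unsigned short *src, unsigned char *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (unsigned char)src[i];
}
```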

[Bug target/101058] [12 Regression] ICE in extract_insn, at recog.c:2770 since r12-1215-g8d7dae0eb366a88a

2021-06-14 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101058 Uroš Bizjak changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug target/101058] [12 Regression] ICE in extract_insn, at recog.c:2770 since r12-1215-g8d7dae0eb366a88a

2021-06-14 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101058 --- Comment #9 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #8) > Though, when this *punpckwd define_insn_and_split handles all possible > constant permutations for V2HImode, shouldn't ix86_vectorize_vec_perm_const > say so: >

[Bug target/101058] [12 Regression] ICE in extract_insn, at recog.c:2770 since r12-1215-g8d7dae0eb366a88a

2021-06-14 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101058 --- Comment #6 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #5) We can split directly to sse2_pshuflw_1, avoiding mmx_pshufw_1. These two actually generate the same instruction (PSHUFLW) when XMM registers are involved.

[Bug target/101058] [12 Regression] ICE in extract_insn, at recog.c:2770 since r12-1215-g8d7dae0eb366a88a

2021-06-14 Thread ubizjak at gmail dot com via Gcc-bugs
at gcc dot gnu.org |ubizjak at gmail dot com Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED --- Comment #4 from Uroš Bizjak --- Created attachment 51007 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51007&acti

[Bug rtl-optimization/101044] -ABS(A) produces two neg instructions

2021-06-13 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101044 --- Comment #1 from Uroš Bizjak --- The first neg also sets sign flag (SF) for the following CMOVS.
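For readers without the report at hand, a minimal sketch of source with the -ABS(A) shape named in the bug title might look like this. The function name `neg_abs` is an assumption; the report's actual testcase is not shown in this snippet. The comment above explains why one of the two NEGs is not redundant: it also produces the sign flag that the following CMOVS consumes.

```c
#include <stdlib.h>

/* Hypothetical reproducer shape: negate the absolute value of a.
   (INT_MIN is left out of consideration, since abs(INT_MIN) overflows.) */
int neg_abs(int a)
{
    return -abs(a);
}
```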

[Bug target/101021] PSHUFB is emitted instead of PSHUFD, PSHUFLW and PSHUFHW with -msse4

2021-06-11 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101021 Uroš Bizjak changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/101023] [11/12 Regression] wrong code with -mstackrealign

2021-06-11 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101023 Uroš Bizjak changed: What|Removed |Added CC||hjl.tools at gmail dot com --- Comment #1

[Bug target/101021] PSHUFB is emitted instead of PSHUFD, PSHUFLW and PSHUFHW with -msse4

2021-06-11 Thread ubizjak at gmail dot com via Gcc-bugs
|1 Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com Status|UNCONFIRMED |ASSIGNED --- Comment #1 from Uroš Bizjak --- Created attachment 50982 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50982&acti

[Bug target/101021] New: PSHUFB is emitted instead of PSHUFD, PSHUFLW and PSHUFHW with -msse4

2021-06-10 Thread ubizjak at gmail dot com via Gcc-bugs
Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: ubizjak at gmail dot com Target Milestone: --- Following testcase: --cut here-- typedef char S; typedef S VV __attribute__((vector_size(16 * sizeof(S)))); VV ref_perm_pshufd

[Bug target/100936] %p and %P modifiers should not emit segment overrides

2021-06-09 Thread ubizjak at gmail dot com via Gcc-bugs
gcc dot gnu.org |ubizjak at gmail dot com Status|UNCONFIRMED |RESOLVED Target Milestone|--- |12.0 --- Comment #3 from Uroš Bizjak --- Fixed.

[Bug target/43526] ICE: in construct_container, at config/i386/i386.c:5733 with -m96bit-long-double at x86_64-linux

2021-06-08 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43526 Uroš Bizjak changed: What|Removed |Added Resolution|--- |INVALID Status|UNCONFIRMED

[Bug target/100936] %p and %P modifiers should not emit segment overrides

2021-06-06 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100936 --- Comment #1 from Uroš Bizjak --- Proposed patch: --cut here-- diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 04649b42122..0773a4a9ba8 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -13531,7 +13531,7 @

[Bug target/100936] New: %p and %P modifiers should not emit segment overrides

2021-06-06 Thread ubizjak at gmail dot com via Gcc-bugs
Component: target Assignee: unassigned at gcc dot gnu.org Reporter: ubizjak at gmail dot com Target Milestone: --- Following testcase: --cut here-- __seg_gs int var = 123; static int *foo (void) { int *addr; asm ("lea %p1, %0" : "=r"(addr) : "

[Bug middle-end/100880] New: The build should error out for define_insn without insn template

2021-06-02 Thread ubizjak at gmail dot com via Gcc-bugs
Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: ubizjak at gmail dot com Target Milestone: --- Currently, the build allows define_insn RTX without insn template. It would be nice to detect this invalid RTX and error out during

[Bug target/100626] [11/12 Regression] ICE Segmentation fault (during RTL pass: split1) since r11-165-geb72dc663e9070b2

2021-05-25 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100626 Uroš Bizjak changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug target/100722] [12 Regression] ice in extract_insn with many vector_size(4) arguments

2021-05-23 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100722 Uroš Bizjak changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug c/100722] ice in extract_insn, at recog.c:2770

2021-05-22 Thread ubizjak at gmail dot com via Gcc-bugs
|--- |12.0 Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com --- Comment #1 from Uroš Bizjak --- Missing push insns for vector modes (the same as

[Bug tree-optimization/100696] mult_higpart is not vectorized

2021-05-20 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100696 Uroš Bizjak changed: What|Removed |Added CC||rsandifo at gcc dot gnu.org --- Comment #

[Bug target/100701] [12 Regression] wrong code with -O -fschedule-insns2

2021-05-20 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100701 Uroš Bizjak changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/100701] [12 Regression] wrong code with -O -fschedule-insns2

2021-05-20 Thread ubizjak at gmail dot com via Gcc-bugs
at gcc dot gnu.org |ubizjak at gmail dot com CC|uros at gcc dot gnu.org| --- Comment #2 from Uroš Bizjak --- (In reply to Richard Biener from comment #1) > orq %rdi, %rsi > pshuflw $0, %xmm3, %xmm0 > movq %xmm0, %rbp >

[Bug tree-optimization/100696] New: mult_higpart is not vectorized

2021-05-20 Thread ubizjak at gmail dot com via Gcc-bugs
-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ubizjak at gmail dot com Target Milestone: --- Following testcases: --cut here-- #define N 4 short r[N], a[N], b[N]; unsigned short ur[N], ua[N], ub[N]; void mul (void) { int i; for (i = 0; i < N; i++) r[i] =
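The loop body in this snippet is truncated and is left as it appears. As an independent sketch, a signed high-part multiply over the same arrays is commonly written as below; the shift by 16 and the function name `mul_high` are assumptions, not a reconstruction of the report's testcase.

```c
#define N 4

short r[N], a[N], b[N];

/* Assumed loop body: keep the upper 16 bits of each 32-bit product. */
void mul_high(void)
{
    for (int i = 0; i < N; i++)
        r[i] = (short)(((int)a[i] * b[i]) >> 16);
}
```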

[Bug target/100626] [11/12 Regression] ICE Segmentation fault (during RTL pass: split1) since r11-165-geb72dc663e9070b2

2021-05-17 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100626 Uroš Bizjak changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com

[Bug target/100637] [i386] Vectorize 4-byte vectors

2021-05-17 Thread ubizjak at gmail dot com via Gcc-bugs
at gcc dot gnu.org |ubizjak at gmail dot com Last reconfirmed||2021-05-17 Ever confirmed|0 |1 --- Comment #1 from Uroš Bizjak --- Created attachment 50822 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50822&action=edi

[Bug target/100637] New: [i386] Vectorize 4-byte vectors

2021-05-17 Thread ubizjak at gmail dot com via Gcc-bugs
Assignee: unassigned at gcc dot gnu.org Reporter: ubizjak at gmail dot com Target Milestone: --- Following testcases involving 4 byte vectors, e.g.: typedef char __v4qi __attribute__ ((__vector_size__ (4))); __v4qi foo (__v4qi a, __v4qi b, __v4qi c) { return (a & ~b) + c; }

[Bug target/100626] [11/12 Regression] ICE Segmentation fault (during RTL pass: split1) since r11-165-geb72dc663e9070b2

2021-05-17 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100626 --- Comment #3 from Uroš Bizjak --- *di3_doubleword calls split_double_mode with: op0: (subreg:DI (reg/v:SI 89 [ li_18 ]) 0) op1: (reg:DI 90 [ uc_4 ]) op2: (mem/c:DI (plus:SI (reg/f:SI 19 frame) (const_int -4 [0xfffc]))

[Bug target/98218] [TARGET_MMX_WITH_SSE] Implement 64bit vector compares (AVX512 masked compares missing)

2021-05-13 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98218 --- Comment #16 from Uroš Bizjak --- (In reply to David Binderman from comment #15) > Bug first appears sometime between git hash 21dfb22920ce32fc, > dated yesterday and git hash 097fde5e7514e909, dated today. Fixed by PR100581.

[Bug target/100581] [12 Regression] ICE in extract_insn, at recog.c:2770 since r12-731-gb1f7fd8a2a5558da

2021-05-13 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100581 Uroš Bizjak changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug target/100581] [12 Regression] ICE in extract_insn, at recog.c:2770 since r12-731-gb1f7fd8a2a5558da

2021-05-13 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100581 --- Comment #3 from Uroš Bizjak --- (In reply to Alex Coplan from comment #1) > Is it valid to create a vector type with total size less than the element > size? Shouldn't this be rejected? No, the generated code is: vmovq ff_b(%rip)

[Bug target/100581] [12 Regression] ICE in extract_insn, at recog.c:2770 since r12-731-gb1f7fd8a2a5558da

2021-05-13 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100581 Uroš Bizjak changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com

[Bug target/98218] [TARGET_MMX_WITH_SSE] Implement 64bit vector compares (AVX512 masked compares missing)

2021-05-12 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98218 --- Comment #13 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #12) > Yeah, this is a non-existent SSE "cmove". I tried to find all paths where > this should divert to a sequence of logic instructions or PBLENDB, but due > to plethora

[Bug target/98218] [TARGET_MMX_WITH_SSE] Implement 64bit vector compares (AVX512 masked compares missing)

2021-05-12 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98218 --- Comment #12 from Uroš Bizjak --- (In reply to David Binderman from comment #11) > I might be seeing something similar: > > caxcpy.f: In function 'caxcpy': > caxcpy.f:53:72: error: unrecognizable insn: > 53 | end subroutine > |

[Bug target/98218] [TARGET_MMX_WITH_SSE] Implement 64bit vector compares (AVX512 masked compares missing)

2021-05-11 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98218 Uroš Bizjak changed: What|Removed |Added Assignee|ubizjak at gmail dot com |unassigned at gcc dot gnu.org

[Bug other/98375] [meta bug] GCC 12 pending patches

2021-05-11 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98375 Bug 98375 depends on bug 98218, which changed state. Bug 98218 Summary: [TARGET_MMX_WITH_SSE] Implement 64bit vector compares (AVX512 masked compares missing) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98218 What|Removed

[Bug target/98218] [TARGET_MMX_WITH_SSE] Implement 64bit vector compares (AVX512 masked compares missing)

2021-05-11 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98218 Uroš Bizjak changed: What|Removed |Added Summary|[TARGET_MMX_WITH_SSE] Miss |[TARGET_MMX_WITH_SSE] |v

[Bug target/100461] [11/12 Regression] mingw build broken due to change of rdtsc implementation

2021-05-06 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100461 Uroš Bizjak changed: What|Removed |Added CC||hjl.tools at gmail dot com --- Comment #1

[Bug target/100445] [12 Regression] ice during RTL pass: vregs

2021-05-06 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100445 --- Comment #10 from Uroš Bizjak --- Following patch fixes the failures: --cut here-- diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c index 4dfe7d6c282..61b2f921f41 100644 --- a/gcc/config/i386/i386-expand.c +++ b/gcc

[Bug target/100445] [12 Regression] ice during RTL pass: vregs

2021-05-06 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100445 --- Comment #9 from Uroš Bizjak --- ix86_use_mask_cmp_p should be refined, it has an early return for 64bit modes: if (GET_MODE_SIZE (mode) == 64) return true;

[Bug target/100445] [12 Regression] ice during RTL pass: vregs

2021-05-06 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100445 --- Comment #6 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #5) > ix86_expand_sse_movcc has special TARGET_XOP path, so the following patch is > needed: Ah, you beat me by the second ;) Anyway, I have no XOP target, so probably y

[Bug target/100445] [12 Regression] ice during RTL pass: vregs

2021-05-06 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100445 --- Comment #5 from Uroš Bizjak --- ix86_expand_sse_movcc has special TARGET_XOP path, so the following patch is needed: diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index 347295afbb5..667dd057e0d 100644 --- a/gcc/config/i386/mm

[Bug target/98218] [TARGET_MMX_WITH_SSE] Miss vec_cmpmn/vcondmn expander for 64bit vector

2021-05-05 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98218 Uroš Bizjak changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug other/98375] [meta bug] GCC 12 pending patches

2021-05-05 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98375 Bug 98375 depends on bug 98218, which changed state. Bug 98218 Summary: [TARGET_MMX_WITH_SSE] Miss vec_cmpmn/vcondmn expander for 64bit vector https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98218 What|Removed |Ad

[Bug rtl-optimization/100342] [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2

2021-05-04 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342 --- Comment #8 from Uroš Bizjak --- FYI, this whole analysis was done with Fedora 33 system compiler: gcc version 10.3.1 20210422 (Red Hat 10.3.1-1) (GCC)

[Bug rtl-optimization/100342] [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2

2021-05-04 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342 --- Comment #7 from Uroš Bizjak --- I have traced a bit where (insn 2275) and (insn 2287) come from. In _.ira, we have: 613: r125:QI=r2067:DI#0 ... 659: zero_extract(r2080:DI,0x8,0x8)=r125:QI#0 And in _.reload, a DImode reload is insert

[Bug rtl-optimization/100342] [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2

2021-05-04 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342 --- Comment #5 from Uroš Bizjak --- The problem can be seen in _.pro_and_epilogue pass: Starting with: _.cmpelim 2741: r14:DI=[sp:DI+0x38] ... 368: di:DI=r14:DI ... 613: si:QI=r14:QI ... 2737: bp:DI=r14:DI ... 658: strict_low_

[Bug rtl-optimization/100342] [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2

2021-05-04 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342 --- Comment #4 from Uroš Bizjak --- The problematic insn is: 401cec: 44 89 f6 mov %r14d,%esi This one should be 64 bit wide, movl %r14d, %esi # 613 [c=4 l=3] *movqi_internal/2 but is actually a QIm

[Bug rtl-optimization/100342] [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2

2021-05-04 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100342 --- Comment #3 from Uroš Bizjak --- For some reason the *input* value at BSWAP insn is truncated to 32bits. v256u128 v256u128_1 = SHLV (SHLSV (__builtin_bswap64 (u128_0), (v256u128) (0 < v256u128_0)) <= 0, v256u128_0); u128_0 i

[Bug testsuite/100355] gcc.c-torture/execute/ieee/cdivchkld.c needs fmaxl

2021-05-03 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100355 --- Comment #3 from Uroš Bizjak --- (In reply to Christophe Lyon from comment #2) > Tried that, but it's not taken into account. > > ieee.exp uses c-torture-execute, maybe that function does not honor dg > directives? (none of the tests under i

[Bug other/98375] [meta bug] GCC 12 pending patches

2021-04-30 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98375 Bug 98375 depends on bug 98060, which changed state. Bug 98060 Summary: Failure to optimize cmp+setnb+add to cmp+sbb https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98060 What|Removed |Added ---

[Bug target/98060] Failure to optimize cmp+setnb+add to cmp+sbb

2021-04-30 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98060 Uroš Bizjak changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug target/100312] __builtin_ia32_maskloadpd256 and friends should be pure

2021-04-29 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100312 Uroš Bizjak changed: What|Removed |Added Assignee|rguenth at gcc dot gnu.org |ubizjak at gmail dot com

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-04-28 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182 Uroš Bizjak changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/82735] _mm256_zeroupper does not invalidate previously computed registers

2021-04-28 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82735 --- Comment #11 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #9) > (In reply to Richard Biener from comment #4) > > Indeed as far as I understand an unspec volatile isn't sth clobbering > > registers (not even memory?!). The insn i

[Bug target/82735] _mm256_zeroupper does not invalidate previously computed registers

2021-04-28 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82735 --- Comment #9 from Uroš Bizjak --- (In reply to Richard Biener from comment #4) > Indeed as far as I understand an unspec volatile isn't sth clobbering > registers (not even memory?!). The insn is missing inputs/outputs > (we might be able to m

[Bug target/82735] _mm256_zeroupper does not invalidate previously computed registers

2021-04-28 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82735 --- Comment #8 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #7) > Confirmed, let me fix this. Please note that the current definition of vzeroupper does not model effects of the instruction at all. The current definition is intende

[Bug target/100041] ICE in curr_insn_transform, at lra-constraints.c:4022

2021-04-24 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100041 Uroš Bizjak changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-04-23 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182 Uroš Bizjak changed: What|Removed |Added Attachment #50649|0 |1 is obsolete|

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-04-23 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182 --- Comment #17 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #16) > (In reply to Jakub Jelinek from comment #15) > > Yes, but do they preserve all the bits and never modify any bit patterns, > > including qNaNs and sNaNs? I though

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-04-23 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182 --- Comment #16 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #15) > Yes, but do they preserve all the bits and never modify any bit patterns, > including qNaNs and sNaNs? I thought the point of using the fistp was that > it pres

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-04-23 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182 --- Comment #14 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #13) > DFmode loads and stores *are* atomic, this is what the optimization is based > on. Loads and stores to/from x87 and SSE registers, to be clear.

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-04-23 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182 --- Comment #13 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #12) > They do. Though, in the combined patch I'm still a little bit worried about > the first 4 modified peephole2s, the last 4 look good to me. > The last 4 are wher

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-04-22 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182 --- Comment #11 from Uroš Bizjak --- Jakub, do these two patches fix your failures?

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-04-22 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182 --- Comment #10 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #9) > (In reply to Jakub Jelinek from comment #8) > > I think there are 8 those peephole2s rather than just 4 (I've been looking > > for > > rtx_equal_p (XEXP.*, 0) in sy

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-04-22 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182 --- Comment #9 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #8) > I think there are 8 those peephole2s rather than just 4 (I've been looking > for > rtx_equal_p (XEXP.*, 0) in sync.md No, the other are not problematic.

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-04-22 Thread ubizjak at gmail dot com via Gcc-bugs
dot gnu.org| Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com --- Comment #7 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #1) > In this particular case it is the sync.md:398 peephole2: > (define_peephole2 > [(set (match_ope

[Bug target/100119] [x86] Conversion unsigned int -> double produces -0 (-m32 -msse2 -mfpmath=sse)

2021-04-19 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100119 --- Comment #2 from Uroš Bizjak --- diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c index dda08ff67f2..5a7a00c13bd 100644 --- a/gcc/config/i386/i386-expand.c +++ b/gcc/config/i386/i386-expand.c @@ -1550,6 +1550,8 @@ ix
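The snippet shows only the start of a patch, so the conditions that trigger the bug are not visible here. As a hedged sketch, a check for the symptom in the bug title (a negative zero coming out of an unsigned int to double conversion) could be written as follows; both function names are assumptions for illustration.

```c
#include <math.h>

/* Hypothetical helper: the conversion the bug title refers to. */
double uint_to_double(unsigned int u)
{
    return (double)u;
}

/* Returns nonzero when x is a negative zero. */
int is_negative_zero(double x)
{
    return x == 0.0 && signbit(x);
}
```

On a correct compiler, `is_negative_zero(uint_to_double(0u))` is expected to be 0 under the default rounding mode.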

[Bug target/100041] ICE in curr_insn_transform, at lra-constraints.c:4022

2021-04-12 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100041 Uroš Bizjak changed: What|Removed |Added Target Milestone|11.0|12.0 --- Comment #20 from Uroš Bizjak --

[Bug target/100041] ICE in curr_insn_transform, at lra-constraints.c:4022

2021-04-12 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100041 --- Comment #18 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #17) > Can we go with #c15 for GCC11 and do #c16 for GCC12? I'd like to kill the option for GCC11, and the solution is safer than #c15.

[Bug target/100041] ICE in curr_insn_transform, at lra-constraints.c:4022

2021-04-12 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100041 Uroš Bizjak changed: What|Removed |Added Target|x86_64-linux-musl |x86_64 Target Milestone|---

[Bug target/100041] ICE in curr_insn_transform, at lra-constraints.c:4022

2021-04-12 Thread ubizjak at gmail dot com via Gcc-bugs
at gcc dot gnu.org |ubizjak at gmail dot com --- Comment #16 from Uroš Bizjak --- Created attachment 50568 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50568&action=edit Proposed patch Attached patch disables -m96bit-long-double for 64-bit targets.

[Bug target/100041] ICE in curr_insn_transform, at lra-constraints.c:4022

2021-04-12 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100041 --- Comment #15 from Uroš Bizjak --- (In reply to Richard Biener from comment #12) > A possible solution might be to disallow the -m64 -m96bit-long-double > combination, the documentation suggests -m128bit-long-double was intended > as an "optim

[Bug target/100041] ICE in curr_insn_transform, at lra-constraints.c:4022

2021-04-12 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100041 --- Comment #13 from Uroš Bizjak --- See PR79514.

[Bug target/100021] [9/10/11 Regression] std::clamp unprofitable vectorization on -march=nehalem/.../broadwell

2021-04-11 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100021 --- Comment #2 from Uroš Bizjak --- Also, you are passing -march=sandybridge, but the profiler seems to show Skylake (SKX) target. The STV pass heavily depends on target costs, and when -march=skylake is passed, the conversion is avoided.

[Bug target/100021] [9/10/11 Regression] std::clamp unprofitable vectorization on -march=nehalem/.../broadwell

2021-04-11 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100021 --- Comment #1 from Uroš Bizjak --- This is not vectorization, but the compiler uses vector registers to perform scalar operations. This is STV (scalar-to-vector) pass in action, you can use -mno-stv to avoid transformation. The transformation
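To illustrate the point that this is scalar code rather than vectorization, here is a minimal sketch of a purely scalar clamp, the C analogue of what `std::clamp` computes. The name `clamp_ll` and the choice of `long long` are assumptions; the report's actual source is not shown in the snippet. Compiling such a function with and without `-mno-stv` and comparing the assembly is one way to see whether the STV pass converted it, which, as noted in the thread, depends on the target costs.

```c
/* Hypothetical scalar clamp: no vector types appear in the source. */
long long clamp_ll(long long v, long long lo, long long hi)
{
    long long t = v < lo ? lo : v;   /* max(v, lo) */
    return t > hi ? hi : t;          /* min(t, hi) */
}
```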

[Bug rtl-optimization/99930] Failure to optimize floating point -abs(x) in nontrivial code at -O2/3

2021-04-07 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99930 --- Comment #6 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #4) > Is there some reason why the patterns are written that way rather than split > immediately into the AND or XOR? Perhaps it could be done on SUBREGs to > make it va

[Bug target/99652] inline doesn't with -mno-sse

2021-03-18 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99652 --- Comment #5 from Uroš Bizjak --- inline long double foo (void) { return 1.0; } gcc -S -O2 -mno-80387 double.c double.c: In function ‘foo’: double.c:3:1: error: x87 register return with x87 disabled 3 | { | ^

[Bug c++/99601] [11 regression] g++.dg/modules/iostream-1_b.C on x86_64 with -m32

2021-03-15 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99601 --- Comment #3 from Uroš Bizjak --- (In reply to CVS Commits from comment #1) > The master branch has been updated by Nathan Sidwell : > > https://gcc.gnu.org/g:770d3487ef18a71f65626c182625889eee29f580 There is a typo in the selector: +// { dg-

[Bug tree-optimization/98856] [11 Regression] botan AES-128/XTS is slower by ~17% since r11-6649-g285fa338b06b804e72997c4d876ecf08a9c083af

2021-03-05 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856 --- Comment #34 from Uroš Bizjak --- (In reply to rguent...@suse.de from comment #32) > what about reload_completed? We really only want to do this after RA. No need for it, this is peephole2 pass that *always* runs after reload.

[Bug target/99405] Rotate with mask not optimized on x86 for QI/HImode rotates

2021-03-05 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99405 --- Comment #2 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #1) > Created attachment 50306 [details] > gcc11-pr99405.patch > > Untested fix. - (match_operand:SI 2 "register_operand" "c") + (match_operand:
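As a sketch of the source shape the bug title describes (a narrow rotate whose count is masked), an 8-bit rotate-left can be written as below. The name `rotl8` is an assumption and the report's own testcase is not shown in this snippet. Masking the count with 7 keeps both shift amounts in range, which is the idiom a compiler can recognize as a single rotate instruction.

```c
/* Hypothetical 8-bit rotate-left with a masked count. */
unsigned char rotl8(unsigned char x, unsigned int n)
{
    n &= 7;                                  /* count modulo the 8-bit width */
    return (unsigned char)((x << n) | (x >> (-n & 7)));
}
```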

[Bug tree-optimization/98856] [11 Regression] botan AES-128/XTS is slower by ~17% since r11-6649-g285fa338b06b804e72997c4d876ecf08a9c083af

2021-03-05 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856 --- Comment #31 from Uroš Bizjak --- (In reply to Richard Biener from comment #29) > The simplified variant below works but IMHO matches cases we do not > want to transform. I can't find any example on how to achieve that > though. I think that

[Bug tree-optimization/98856] [11 Regression] botan AES-128/XTS is slower by ~17% since r11-6649-g285fa338b06b804e72997c4d876ecf08a9c083af

2021-03-05 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856 --- Comment #28 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #27) > (In reply to Richard Biener from comment #26) > > but that doesn't seem to match for some unknown reason. > Try this: The latency problem with the original testca

[Bug tree-optimization/98856] [11 Regression] botan AES-128/XTS is slower by ~17% since r11-6649-g285fa338b06b804e72997c4d876ecf08a9c083af

2021-03-05 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856 --- Comment #27 from Uroš Bizjak --- (In reply to Richard Biener from comment #26) > but that doesn't seem to match for some unknown reason. Try this: (define_peephole2 [(match_scratch:DI 5 "Yv") (set (match_operand:DI 0 "sse_reg_operand")

[Bug tree-optimization/98856] [11 Regression] botan AES-128/XTS is slower by ~17% since r11-6649-g285fa338b06b804e72997c4d876ecf08a9c083af

2021-03-05 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856 --- Comment #24 from Uroš Bizjak --- (In reply to Richard Biener from comment #22) > That works to avoid the vpinsrq. I guess the case of a mem operand > behaves similar to a gpr (plus the load uop), at least I don't have any > contrary evidenc

[Bug tree-optimization/98856] [11 Regression] botan AES-128/XTS is slower by ~17% since r11-6649-g285fa338b06b804e72997c4d876ecf08a9c083af

2021-03-04 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856 --- Comment #21 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #20) > (In reply to Richard Biener from comment #18) > > Even on Skylake it's 2 (movq) + 3 (vpinsr), so there it's 6 vs. 3. Not > > sure if we should somehow do this late

[Bug tree-optimization/98856] [11 Regression] botan AES-128/XTS is slower by ~17% since r11-6649-g285fa338b06b804e72997c4d876ecf08a9c083af

2021-03-04 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856 --- Comment #20 from Uroš Bizjak --- (In reply to Richard Biener from comment #18) > Even on Skylake it's 2 (movq) + 3 (vpinsr), so there it's 6 vs. 3. Not > sure if we should somehow do this late somehow (peephole or splitter) since > it requir

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2021-02-25 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163 Bug 26163 depends on bug 99083, which changed state. Bug 99083 Summary: Big run-time regressions of 519.lbm_r with LTO https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083 What|Removed |Added -

[Bug target/99083] Big run-time regressions of 519.lbm_r with LTO

2021-02-25 Thread ubizjak at gmail dot com via Gcc-bugs
||patch Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com Resolution|FIXED |--- --- Comment #13 from Uroš Bizjak --- (In reply to Martin Jambor from comment #12) > For the record, I have benchmarked the patches f

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2021-02-21 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163 Bug 26163 depends on bug 99083, which changed state. Bug 99083 Summary: Big run-time regressions of 519.lbm_r with LTO https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083 What|Removed |Added -

[Bug target/99083] Big run-time regressions of 519.lbm_r with LTO

2021-02-21 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083 Uroš Bizjak changed: What|Removed |Added Target Milestone|--- |11.0

[Bug target/99083] Big run-time regressions of 519.lbm_r with LTO

2021-02-21 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083 Uroš Bizjak changed: What|Removed |Added Resolution|--- |FIXED Status|NEW

[Bug target/99115] ICE in extract_insn, at recog.c:2309 on alpha (error: unrecognizable insn) with -O2

2021-02-17 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99115 --- Comment #4 from Uroš Bizjak --- Compiles OK with: GNU C++14 (GCC) version 8.4.1 20210216 [releases/gcc-8 revision c6513400d84:39c49bc104d:1f3a07da9b6bcfa4733750826746bd18ac6f20db] (alpha-unknown-openbsd6.8) built as a cross from x86_64-linu

[Bug target/99115] ICE in extract_insn, at recog.c:2309 on alpha (error: unrecognizable insn) with -O2

2021-02-16 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99115 Uroš Bizjak changed: What|Removed |Added Known to work||11.0 --- Comment #3 from Uroš Bizjak ---

[Bug target/99083] Big run-time regressions of 519.lbm_r with LTO

2021-02-15 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083 --- Comment #10 from Uroš Bizjak --- (In reply to Richard Biener from comment #7) > There are a lot of targets that define REG_ALLOC_ORDER ^ > HONOR_REG_ALLOC_ORDER and thus are affected by this change... The following patch should solve this is

[Bug target/99083] Big run-time regressions of 519.lbm_r with LTO

2021-02-15 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083 --- Comment #8 from Uroš Bizjak --- (In reply to Richard Biener from comment #7) > Btw, for GCC 11 it might be tempting to simply revert the "no-op" change? I agree, this is the safest way at this time. The situation now looks like going into ra

[Bug target/99083] Big run-time regressions of 519.lbm_r with LTO

2021-02-15 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083 --- Comment #6 from Uroš Bizjak --- As a side note, it is strange that ADJUST_REG_ALLOC_ORDER somehow requires REG_ALLOC_ORDER to be defined (cf. Comment #3), while its documentation says: The macro body should not assume anything about the

[Bug target/99083] Big run-time regressions of 519.lbm_r with LTO

2021-02-15 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083 --- Comment #5 from Uroš Bizjak --- Martin, can you please benchmark the patch from Comment #4? The patch is not totally trivial, because it introduces HONOR_REG_ALLOC_ORDER to x86 and this define disables some other code in ira-color.c, assign_

[Bug target/99083] Big run-time regressions of 519.lbm_r with LTO

2021-02-15 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083 --- Comment #4 from Uroš Bizjak --- Created attachment 50185 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50185&action=edit Proposed patch Proposed patch that fixes ira-color.c and introduces HONOR_REG_ALLOC_ORDER.

[Bug target/99083] Big run-time regressions of 519.lbm_r with LTO

2021-02-15 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083 --- Comment #3 from Uroš Bizjak --- It looks to me like another one is in reload1.c, find_reg: if (this_cost < best_cost /* Among registers with equal cost, prefer caller-saved ones, or use REG_ALLOC_ORDER if

[Bug target/99083] Big run-time regressions of 519.lbm_r with LTO

2021-02-13 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083 --- Comment #1 from Uroš Bizjak --- This should be a no-op. According to the documentation: --q-- Macro: REG_ALLOC_ORDER If defined, an initializer for a vector of integers, containing the numbers of hard registers in the order in which GCC

[Bug target/99025] [11 Regression] ICE Segmentation fault since r11-6351-g12ae2bc70846a2be

2021-02-10 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99025 --- Comment #2 from Uroš Bizjak --- Comment on attachment 50154 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50154 gcc11-pr99025.patch >2021-02-09 Jakub Jelinek >+ if (SUBREG_P (operands[1])) >+operands[1] = force_reg (V2SFmode,
