[Bug target/95766] Failure to directly use vpbroadcastd for _mm_set1_epi32 when passing unsigned short
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95766 --- Comment #11 from Kirill Yukhin --- (In reply to Jakub Jelinek from comment #10) > Kirill, any thoughts on that? I'd prefer your variant, w/o unspecs.
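For context, the call shape named in the bug title can be sketched as below. This is an illustrative snippet, not the testcase attached to the bug: the helper name `broadcast_u16` and the scalar fallback are assumptions, added only so that the code also builds on targets without SSE2.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: zero-extend a 16-bit value to 32 bits and
   replicate it into four 32-bit lanes. */
static void broadcast_u16(unsigned short x, int32_t out[4])
{
#if defined(__SSE2__)
    /* The unsigned short argument is converted to int before the
       broadcast; this is the call shape the bug title describes. */
    __m128i v = _mm_set1_epi32(x);
    _mm_storeu_si128((__m128i *)out, v);
#else
    /* Portable fallback producing the same lanes. */
    for (int i = 0; i < 4; i++)
        out[i] = (int32_t)x;
#endif
}
```

Which instructions the compiler actually picks for the broadcast can be checked by compiling with optimization and inspecting the assembly (for example with `-O2 -S`).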
[Bug target/95144] Many AVX-512 functions take an int instead of unsigned int
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95144 Kirill Yukhin changed: What|Removed |Added Last reconfirmed||2020-06-16 CC||kyukhin at gcc dot gnu.org Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED --- Comment #2 from Kirill Yukhin --- Similar bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65744
[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163 Bug 26163 depends on bug 68633, which changed state. Bug 68633 Summary: [i386, AVX-512] Spec2006/434.zeus miscompares when executed on KNL https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68633 What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED
[Bug other/84613] [meta-bug] SPEC compiler performance issues
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84613 Bug 84613 depends on bug 68633, which changed state. Bug 68633 Summary: [i386, AVX-512] Spec2006/434.zeus miscompares when executed on KNL https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68633 What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED
[Bug target/68633] [i386, AVX-512] Spec2006/434.zeus miscompares when executed on KNL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68633 Kirill Yukhin changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED --- Comment #3 from Kirill Yukhin --- Fixed.
[Bug other/84613] [meta-bug] SPEC compiler performance issues
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84613 Bug 84613 depends on bug 68627, which changed state. Bug 68627 Summary: [i386, AVX-512] Illegal insn generated while compiling spec2k6/437.leslie3d for KNL https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68627 What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED
[Bug target/68627] [i386, AVX-512] Illegal insn generated while compiling spec2k6/437.leslie3d for KNL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68627 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #3 from Kirill Yukhin --- Fixed.
[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163 Bug 26163 depends on bug 68627, which changed state. Bug 68627 Summary: [i386, AVX-512] Illegal insn generated while compiling spec2k6/437.leslie3d for KNL https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68627 What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED
[Bug target/83828] FAIL: gcc.target/i386/avx512vpopcntdqvl-vpopcntq-1.c execution test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83828 --- Comment #12 from Kirill Yukhin --- Author: kyukhin Date: Mon Feb 12 06:14:15 2018 New Revision: 257579 URL: https://gcc.gnu.org/viewcvs?rev=257579=gcc=rev Log: Fix AVX-512 popcnt and bitalg tests. gcc/testsuite/ PR target/83828 * gcc.target/i386/avx512bitalg-vpopcntb-1.c: Fix test. * gcc.target/i386/avx512bitalg-vpopcntw-1.c: Ditto. * gcc.target/i386/avx512bitalg-vpshufbitqmb-1.c: Ditto. * gcc.target/i386/avx512vpopcntdq-vpopcntd-1.c: Ditto. * gcc.target/i386/avx512vpopcntdq-vpopcntq-1.c: Ditto. Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntb-1.c trunk/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntw-1.c trunk/gcc/testsuite/gcc.target/i386/avx512bitalg-vpshufbitqmb-1.c trunk/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd-1.c trunk/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq-1.c
[Bug target/83828] FAIL: gcc.target/i386/avx512vpopcntdqvl-vpopcntq-1.c execution test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83828 --- Comment #10 from Kirill Yukhin --- HJ, I cannot reproduce this failure on a recent SDE. Here's what I have in gcc.log:
spawn -ignore SIGHUP /export/kyukhin/gcc/bld-svn/build-x86_64-linux/gcc/xgcc -B/export/kyukhin/gcc/bld-svn/build-x86_64-linux/gcc/ /export/kyukhin/gcc/svn/trunk/gcc/testsuite/gcc.target/i386/avx512bitalgvl-vpopcntb-1.c -fno-diagnostics-show-caret -fdiagnostics-color=never -O2 -mavx512vl -mavx512bitalg -mavx512bw -lm -o ./avx512bitalgvl-vpopcntb-1.exe
PASS: gcc.target/i386/avx512bitalgvl-vpopcntb-1.c (test for excess errors)
Setting LD_LIBRARY_PATH to :/export/kyukhin/gcc/bld-svn/build-x86_64-linux/gcc:/export/kyukhin/gcc/bld-svn/build-x86_64-linux/gcc/32:
spawn /home/kyukhin/bin/dejagnu/sde-sim ./avx512bitalgvl-vpopcntb-1.exe
PASS: gcc.target/i386/avx512bitalgvl-vpopcntb-1.c execution test
I've also verified manually that the test PASSes and is not SKIPPED. Could you please send some more info on the failure?
[Bug target/83828] FAIL: gcc.target/i386/avx512vpopcntdqvl-vpopcntq-1.c execution test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83828 --- Comment #8 from Kirill Yukhin --- Author: kyukhin Date: Tue Jan 30 08:21:22 2018 New Revision: 257173 URL: https://gcc.gnu.org/viewcvs?rev=257173=gcc=rev Log: Fix AVX-512BITALG test failures gcc/testsuite PR target/83828 * gcc.target/i386/avx512bitalg-vpopcntb-1.c: Fix test. * gcc.target/i386/avx512bitalg-vpopcntw-1.c: Ditto. * gcc.target/i386/avx512bitalgvl-vpopcntb-1.c: Ditto. * gcc.target/i386/avx512bitalgvl-vpopcntw-1.c: Ditto. Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntb-1.c trunk/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntw-1.c trunk/gcc/testsuite/gcc.target/i386/avx512bitalgvl-vpopcntb-1.c trunk/gcc/testsuite/gcc.target/i386/avx512bitalgvl-vpopcntw-1.c
[Bug target/83828] FAIL: gcc.target/i386/avx512vpopcntdqvl-vpopcntq-1.c execution test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83828 --- Comment #7 from Kirill Yukhin --- On the other hand, if a masked variant of vpopcnt[w,q] is issued, there is no way for reload to put a 32/64-bit mask into a mask register, since kmov[d,q] are only available under the -mavx512bw switch. We can require the user to pass -mavx512bw along with -mavx512bitalg if they are going to use masked variants of the corresponding intrinsics. Then only the tests need to be fixed.
[Bug target/83828] FAIL: gcc.target/i386/avx512vpopcntdqvl-vpopcntq-1.c execution test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83828 Kirill Yukhin changed: What|Removed |Added Status|NEW |ASSIGNED CC||kyukhin at gcc dot gnu.org --- Comment #6 from Kirill Yukhin --- It looks like the avx512bw requirement in avx512bitalgintrin.h is excessive.
[Bug target/82983] [8 Regression] ICE in extract_insn, at recog.c:2305 w/ GFMI
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82983 --- Comment #1 from Kirill Yukhin --- Author: kyukhin Date: Thu Nov 16 06:14:54 2017 New Revision: 254797 URL: https://gcc.gnu.org/viewcvs?rev=254797=gcc=rev Log: Fix GFNI check which didn't work properly in gfni+sse case gcc/ PR target/82983 * config/i386/gfniintrin.h: Add sse check. * config/i386/i386.c (ix86_expand_builtin): Fix gfni check. Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/gfniintrin.h trunk/gcc/config/i386/i386.c
[Bug target/82812] ICE in emit_move_insn, at expr.c:3706
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82812 --- Comment #3 from Kirill Yukhin --- Author: kyukhin Date: Tue Nov 7 19:11:08 2017 New Revision: 254507 URL: https://gcc.gnu.org/viewcvs?rev=254507=gcc=rev Log: Fix SSE bits dependencies. gcc/ PR target/82812 * common/config/i386/i386-common.c (OPTION_MASK_ISA_GENERAL_REGS_ONLY_UNSET): Remove MPX from flag. (ix86_handle_option): Move MPX to isa_flags2 and GFNI to isa_flags. * config/i386/i386-c.c (ix86_target_macros_internal): Ditto. * config/i386/i386.opt: Ditto. * config/i386/i386.c (ix86_target_string): Ditto. (ix86_option_override_internal): Ditto. (ix86_init_mpx_builtins): Move MPX to args2. (ix86_expand_builtin): Special handling for OPTION_MASK_ISA_GFNI. * config/i386/i386-builtin.def (__builtin_ia32_vgf2p8affineinvqb_v64qi, __builtin_ia32_vgf2p8affineinvqb_v64qi_mask, __builtin_ia32_vgf2p8affineinvqb_v32qi, __builtin_ia32_vgf2p8affineinvqb_v32qi_mask, __builtin_ia32_vgf2p8affineinvqb_v16qi, __builtin_ia32_vgf2p8affineinvqb_v16qi_mask): Move to ARGS array. Modified: trunk/gcc/ChangeLog trunk/gcc/common/config/i386/i386-common.c trunk/gcc/config/i386/i386-builtin.def trunk/gcc/config/i386/i386-c.c trunk/gcc/config/i386/i386.c trunk/gcc/config/i386/i386.opt
[Bug tree-optimization/80133] [bootstrap] ICE during build on PPC64-linux.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80133 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #4 from Kirill Yukhin --- I was trying to build GCC with a really old host compiler. After I upgraded the host GCC to 4.6, the issue was resolved.
[Bug testsuite/81058] FAIL: gcc.target/i386/avx512bw-vpmovu?swb-1.c scan-assembler-times vpmovu?swb.*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81058 --- Comment #4 from Kirill Yukhin --- Confirmed.
[Bug target/81022] invalid address with pointer type casting
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81022 --- Comment #2 from Kirill Yukhin --- The Intrinsics Guide [1] says of this intrinsic: "Store the lower double-precision (64-bit) floating-point element from a into memory. mem_addr does not need to be aligned on any particular boundary." [1] - https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm_store_sd=5157
[Bug target/73350] AVX512: GCC optimizes away rounding flags
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=73350 --- Comment #8 from Kirill Yukhin --- Author: kyukhin Date: Thu Jun 8 11:24:50 2017 New Revision: 249009 URL: https://gcc.gnu.org/viewcvs?rev=249009=gcc=rev Log: [PR73350][PR80862] Improve subst for RC-capable insns. PR target/73350,80862 gcc/ * config/i386/subst.md (round): Fix round pattern. * config/i386/i386.c (ix86_erase_embedded_rounding): Fix erasing rounding for the fixed pattern. gcc/testsuite/ * gcc.target/i386/pr73350.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr73350.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386.c trunk/gcc/config/i386/subst.md trunk/gcc/testsuite/ChangeLog
[Bug bootstrap/80133] [bootstrap] ICE during build on PPC64-linux.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80133 --- Comment #2 from Kirill Yukhin --- Caused by r241649.
[Bug bootstrap/80133] [bootstrap] ICE during build on PPC64-linux.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80133 --- Comment #1 from Kirill Yukhin --- I am not familiar with Power; maybe this can help:
[kyukhin@localhost build2]$ lscpu
Architecture:          ppc64
Byte Order:            Big Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    2
Core(s) per socket:    1
Socket(s):             4
NUMA node(s):          2
Model:                 IBM,9117-MMA
L1d cache:             64K
L1i cache:             64K
L2 cache:              4096K
L3 cache:              32768K
NUMA node0 CPU(s):     0-3
NUMA node1 CPU(s):     4-7
[Bug bootstrap/80133] New: [bootstrap] ICE during build on PPC64-linux.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80133 Bug ID: 80133 Summary: [bootstrap] ICE during build on PPC64-linux. Product: gcc Version: tree-ssa Status: UNCONFIRMED Severity: normal Priority: P3 Component: bootstrap Assignee: unassigned at gcc dot gnu.org Reporter: kyukhin at gcc dot gnu.org Target Milestone: ---
I see on recent trunk:
[kyukhin@localhost build2]$ cd powerpc64-unknown-linux-gnu/libgcc/
[kyukhin@localhost libgcc]$ make
# If this is the top-level multilib, build all the other
# multilibs.
/export/kyukhin/gcc/build2/./gcc/xgcc -B/export/kyukhin/gcc/build2/./gcc/ -B/usr/local/powerpc64-unknown-linux-gnu/bin/ -B/usr/local/powerpc64-unknown-linux-gnu/lib/ -isystem /usr/local/powerpc64-unknown-linux-gnu/include -isystem /usr/local/powerpc64-unknown-linux-gnu/sys-include -g -O2 -O2 -g -O2 -DIN_GCC -W -Wall -Wwrite-strings -Wcast-qual -Wno-format -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem ./include -fPIC -mlong-double-128 -mno-minimal-toc -g -DIN_LIBGCC2 -fbuilding-libgcc -fno-stack-protector -fPIC -mlong-double-128 -mno-minimal-toc -I. -I. -I../.././gcc -I/export/kyukhin/gcc/git/gcc/libgcc -I/export/kyukhin/gcc/git/gcc/libgcc/. -I/export/kyukhin/gcc/git/gcc/libgcc/../gcc -I/export/kyukhin/gcc/git/gcc/libgcc/../include -I/export/kyukhin/gcc/git/gcc/libgcc/../libdecnumber/dpd -I/export/kyukhin/gcc/git/gcc/libgcc/../libdecnumber -DHAVE_CC_TLS -o _muldi3.o -MT _muldi3.o -MD -MP -MF _muldi3.dep -DL_muldi3 -c /export/kyukhin/gcc/git/gcc/libgcc/libgcc2.c -fvisibility=hidden -DHIDE_EXPORTS
In file included from /export/kyukhin/gcc/git/gcc/libgcc/libgcc2.c:56:0:
/export/kyukhin/gcc/git/gcc/libgcc/libgcc2.c: In function '__multi3':
/export/kyukhin/gcc/git/gcc/libgcc/libgcc2.h:203:20: internal compiler error: Segmentation fault
 #define __NDW(a,b) __ ## a ## ti ## b
 ^
/export/kyukhin/gcc/git/gcc/libgcc/libgcc2.h:273:18: note: in expansion of macro '__NDW'
 #define __muldi3 __NDW(mul,3)
 ^
/export/kyukhin/gcc/git/gcc/libgcc/libgcc2.c:547:1: note: in expansion of macro '__muldi3'
 __muldi3 (DWtype u, DWtype v)
 ^~~~
0x10e3e383 crash_signal
        /export/kyukhin/gcc/git/gcc/gcc/toplev.c:337
0x11366fb0 inchash::add_expr(tree_node const*, inchash::hash&, unsigned int)
        /export/kyukhin/gcc/git/gcc/gcc/tree.c:
0x10e8fd93 iterative_hash_expr
        /export/kyukhin/gcc/git/gcc/gcc/tree.h:4794
0x10e94767 tree_operand_hash::hash(tree_node* const&)
        /export/kyukhin/gcc/git/gcc/gcc/tree-hash-traits.h:34
0x11a2f30b hash
        /export/kyukhin/gcc/git/gcc/gcc/hash-map-traits.h:48
0x11a2e9f3 get
        /export/kyukhin/gcc/git/gcc/gcc/hash-map.h:150
0x11a2d1ff execute
        /export/kyukhin/gcc/git/gcc/gcc/gimple-ssa-store-merging.c:1456
Please submit a full bug report, with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
make: *** [_muldi3.o] Error 1
Configured on RHEL6.5:
$ /export/kyukhin/gcc/git/gcc/configure --with-mpfr=/home/kyukhin/bin --with-gmp=/home/kyukhin/bin --with-mpc=/home/kyukhin/bin --enable-languages=c,c++,fortran,lto --disable-multilib
I can build gcc-6-branch with this config.
[Bug target/76731] [AVX512] _mm512_i32gather_epi32 and other scatter/gather routines have incorrect signature
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=76731 --- Comment #10 from Kirill Yukhin --- (In reply to Andrew Senkevich from comment #8) > I think we should follow here declarations from icc headers to be compatible > with it. Okay. Could you please state which rules ICC follows for all gather/scatter intrinsics? Could we use void const * for base in all gather intrinsics? What about scatters?
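The question about `void const *` can be illustrated with a scalar model of a 32-bit gather. This is only a sketch of the addressing rule (element address = base + index * scale, in bytes); `gather_i32_model` is a hypothetical helper and not an intrinsic from GCC or ICC headers. Because the base is used purely as a byte address, an untyped pointer is a natural fit for it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of a 32-bit gather: out[i] is loaded from
   base + idx[i] * scale, where scale is a byte multiplier.
   Hypothetical helper for illustration only. */
static void gather_i32_model(const void *base, const int32_t *idx,
                             int scale, int32_t *out, size_t n)
{
    const unsigned char *bytes = (const unsigned char *)base;
    for (size_t i = 0; i < n; i++) {
        /* memcpy avoids any alignment or aliasing assumptions
           about the gathered address. */
        memcpy(&out[i], bytes + (ptrdiff_t)idx[i] * scale, sizeof out[i]);
    }
}
```

With a scale equal to `sizeof(int32_t)` the indices behave like ordinary array subscripts; other scales address fields inside larger records.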
[Bug tree-optimization/70729] Loop marked with omp simd pragma is not vectorized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70729 --- Comment #28 from Kirill Yukhin --- Author: kyukhin Date: Fri Jul 1 09:42:01 2016 New Revision: 237907 URL: https://gcc.gnu.org/viewcvs?rev=237907=gcc=rev Log: PR tree-optimization/70729 gcc/ * tree-vectorizer.c (adjust_simduid_builtins): Nullify safelen field of loop since it can be not valid after transformation. Modified: trunk/gcc/ChangeLog trunk/gcc/tree-vectorizer.c
[Bug target/71346] [AVX-512] AVX-512VL insn emitted when it is disabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71346 --- Comment #3 from Kirill Yukhin --- Author: kyukhin Date: Tue May 31 08:05:24 2016 New Revision: 236909 URL: https://gcc.gnu.org/viewcvs?rev=236909=gcc=rev Log: AVX-512. Limit constraint for scalar operand in split to AVX-512VL. PR target/71346 gcc/ * config/i386/sse.md (define_insn_and_split "*vec_extractv4sf_0"): Use `Yv' for scalar operand. testsuite/ * gcc.target/i386/pr71346.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr71346.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/sse.md trunk/gcc/testsuite/ChangeLog
[Bug target/71346] [AVX-512] AVX-512VL insn emitted when it is disabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71346 --- Comment #2 from Kirill Yukhin --- It looks like the issue is in the split. This one-liner solves the issue:
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index b348f2d..1267897 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -6837,7 +6837,7 @@
   "operands[1] = gen_lowpart (SFmode, operands[1]);")
 
 (define_insn_and_split "*sse4_1_extractps"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=rm,rm,rm,v,v")
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=rm,rm,rm,Yv,Yv")
 	(vec_select:SF
 	  (match_operand:V4SF 1 "register_operand" "Yr,*x,v,0,v")
 	  (parallel [(match_operand:SI 2 "const_0_to_3_operand" "n,n,n,n,n")])))]
[Bug target/71346] [AVX-512] AVX-512VL insn emitted when it is disabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71346 --- Comment #1 from Kirill Yukhin --- Created attachment 38598 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38598=edit Reproducer
[Bug target/71346] [AVX-512] AVX-512VL insn emitted when it is disabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71346 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2016-05-30 Assignee|unassigned at gcc dot gnu.org |kyukhin at gcc dot gnu.org Ever confirmed|0 |1
[Bug target/71346] New: [AVX-512] AVX-512VL insn emitted when it is disabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71346 Bug ID: 71346 Summary: [AVX-512] AVX-512VL insn emitted when it is disabled Product: gcc Version: 7.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: kyukhin at gcc dot gnu.org Target Milestone: ---
Testcase attached. Started from Jakub's r235968.
Reproduce:
./cc1 1.c -dp -m64 -march=knl -Ofast -quiet -o repro.s 2>/dev/null ; cat repro.s |grep shufps |grep xmm17
Those insns belong to AVX-512VL:
vshufps $255, %xmm15, %xmm15, %xmm17 # 680 sse_shufps_v4sf/2 [length = 7]
vshufps $85, %xmm12, %xmm12, %xmm17 # 682 sse_shufps_v4sf/2 [length = 7]
[Bug target/70981] [7 regression] gcc.target/i386/avx512f-vprord-1.c FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70981 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2016-05-17 CC||kyukhin at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Kirill Yukhin --- Confirmed.
[Bug target/70902] [7 Regression] GCC freezes while compiling for 'skylake-avx512' target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70902 Kirill Yukhin changed: What|Removed |Added CC||kyukhin at gcc dot gnu.org, ||ubizjak at gmail dot com --- Comment #2 from Kirill Yukhin --- Started from Uros's r235523.
[Bug target/70728] GCC trunk emits invalid assembly for knl target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70728 Kirill Yukhin changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #5 from Kirill Yukhin --- Done
[Bug target/70728] GCC trunk emits invalid assembly for knl target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70728 --- Comment #4 from Kirill Yukhin --- Author: kyukhin Date: Wed Apr 27 12:09:45 2016 New Revision: 235487 URL: https://gcc.gnu.org/viewcvs?rev=235487=gcc=rev Log: AVX-512. PR target/70728. Use separate constraint for AVX-512BW PR target/70728 gcc/ * gcc/config/i386/sse.md (define_insn "3"): Extract AVX-512BW constraint from AVX. gcc/testsuite/ * gcc.target/i386/pr70728.c: New test. Added: branches/gcc-6-branch/gcc/testsuite/gcc.target/i386/pr70728.c Modified: branches/gcc-6-branch/gcc/ChangeLog branches/gcc-6-branch/gcc/config/i386/sse.md branches/gcc-6-branch/gcc/testsuite/ChangeLog
[Bug tree-optimization/68030] Redundant address calculations in vectorized loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68030 Kirill Yukhin changed: What|Removed |Added CC||amker.cheng at gmail dot com --- Comment #4 from Kirill Yukhin --- Hello Bin, Is it possible to handle the issue using current ivopt?
[Bug target/70728] GCC trunk emits invalid assembly for knl target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70728 --- Comment #3 from Kirill Yukhin --- Author: kyukhin Date: Thu Apr 21 15:29:29 2016 New Revision: 235344 URL: https://gcc.gnu.org/viewcvs?rev=235344=gcc=rev Log: AVX-512. PR target/70728. Use separate constraint for AVX-512BW PR target/70728 gcc/ * gcc/config/i386/sse.md (define_insn "3"): Extract AVX-512BW constraint from AVX. gcc/testsuite/ * gcc.target/i386/pr70728.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr70728.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/sse.md trunk/gcc/testsuite/ChangeLog
[Bug target/70728] GCC trunk emits invalid assembly for knl target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70728 --- Comment #2 from Kirill Yukhin --- This is a 5/6 regression
[Bug target/70728] GCC trunk emits invalid assembly for knl target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70728 Kirill Yukhin changed: What|Removed |Added Target||i?86/x86_64 Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2016-04-19 CC||kyukhin at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Kirill Yukhin --- I'll take a look.
[Bug target/70662] vpbroadcastq assemble failure with -masm=intel -mavx512vbmi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70662 Kirill Yukhin changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #8 from Kirill Yukhin --- Done
[Bug target/70662] vpbroadcastq assemble failure with -masm=intel -mavx512vbmi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70662 --- Comment #6 from Kirill Yukhin --- Author: kyukhin Date: Fri Apr 15 15:17:31 2016 New Revision: 235038 URL: https://gcc.gnu.org/viewcvs?rev=235038=gcc=rev Log: AVX-512. Fix mode size check. PR target/70662 gcc/ * config/i386/sse.md(define_insn "_vec_dup"): Fix mode size check. Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/sse.md
[Bug target/70662] vpbroadcastq assemble failure with -masm=intel -mavx512vbmi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70662 --- Comment #5 from Kirill Yukhin --- Author: kyukhin Date: Fri Apr 15 15:13:42 2016 New Revision: 235037 URL: https://gcc.gnu.org/viewcvs?rev=235037=gcc=rev Log: AVX-512, Fix mode size check. PR target/70662 gcc/ * config/i386/sse.md(define_insn "_vec_dup"): Fix mode size check. Modified: branches/gcc-5-branch/gcc/ChangeLog branches/gcc-5-branch/gcc/config/i386/sse.md
[Bug target/70662] vpbroadcastq assemble failure with -masm=intel -mavx512vbmi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70662 --- Comment #3 from Kirill Yukhin --- Author: kyukhin Date: Fri Apr 15 09:36:31 2016 New Revision: 235013 URL: https://gcc.gnu.org/viewcvs?rev=235013=gcc=rev Log: AVX-512. Use proper mem ops modifier for Intel syntax in broadcast pattern. PR target/70662 gcc/ * config/i386/sse.md: Use proper memory operand modifiers. gcc/testsuite/ * gcc.target/i386/pr70662.c: New test. Added: branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/pr70662.c Modified: branches/gcc-5-branch/gcc/ChangeLog branches/gcc-5-branch/gcc/config/i386/sse.md branches/gcc-5-branch/gcc/testsuite/ChangeLog
[Bug target/70662] vpbroadcastq assemble failure with -masm=intel -mavx512vbmi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70662 --- Comment #2 from Kirill Yukhin --- Author: kyukhin Date: Fri Apr 15 08:25:49 2016 New Revision: 235008 URL: https://gcc.gnu.org/viewcvs?rev=235008=gcc=rev Log: AVX-512. Fix mem operand modifier for Intel syntax. PR target/70662 gcc/ * config/i386/sse.md: Use proper memory operand modifiers. testsuite/gcc/ * gcc.target/i386/pr70662.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr70662.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/sse.md trunk/gcc/testsuite/ChangeLog
[Bug target/70662] vpbroadcastq assemble failure with -masm=intel -mavx512vbmi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70662 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2016-04-14 Assignee|unassigned at gcc dot gnu.org |kyukhin at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Kirill Yukhin --- I'll take a look.
[Bug tree-optimization/70577] [6 regression] tree-ssa/prefetch-5.c scan-tree-dump-times aprefetch failures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70577 Kirill Yukhin changed: What|Removed |Added CC||kyukhin at gcc dot gnu.org --- Comment #8 from Kirill Yukhin --- This commit caused miscompare of spec2000/178.galgel on -march=skylake-avx512 (-Ofast -flto -funroll-loops): Newton iteration # 0Maximal derivative = 0.1526E-07 Newton iteration # 0Maximal derivative = 0.3901E-07
[Bug target/59683] ICE: in classify_argument, at config/i386/i386.c:6637 with #pragma GCC target("avx512f")
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59683 --- Comment #3 from Kirill Yukhin --- This hunk from Jakub's fix for PR61925 makes the test work:
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index a41efa4..6aebaed 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -4962,6 +4962,15 @@ static GTY(()) tree ix86_previous_fndecl;
 void
 ix86_reset_previous_fndecl (void)
 {
+  tree new_tree = target_option_current_node;
+  cl_target_option_restore (_options, TREE_TARGET_OPTION (new_tree));
+  if (TREE_TARGET_GLOBALS (new_tree))
+    restore_target_globals (TREE_TARGET_GLOBALS (new_tree));
+  else if (new_tree == target_option_default_node)
+    restore_target_globals (_target_globals);
+  else
+    TREE_TARGET_GLOBALS (new_tree) = save_target_globals_default_opts ();
+
   ix86_previous_fndecl = NULL_TREE;
 }
[Bug target/64386] ICE: in extract_insn, at recog.c:2327 (unrecognizable insn) with -mavx512bw
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64386 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #4 from Kirill Yukhin --- Done.
[Bug target/70510] ICE: output_operand: invalid %-code with -mavx512bw -masm=intel when emitting vpbroatcast
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70510 --- Comment #3 from Kirill Yukhin --- (In reply to Uroš Bizjak from comment #2) > (In reply to Kirill Yukhin from comment #1) > > will take a look. > > I have patch in testing: > Oh, great! Thanks!
[Bug target/70510] ICE: output_operand: invalid %-code with -mavx512bw -masm=intel when emitting vpbroatcast
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70510 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2016-04-05 Ever confirmed|0 |1
[Bug target/64387] ICE: in extract_insn, at recog.c:2327 (unrecognizable insn) with -ffloat-store -mavx512er
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64387 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED CC||kyukhin at gcc dot gnu.org Resolution|--- |FIXED --- Comment #6 from Kirill Yukhin --- Done
[Bug target/64393] ICE: in extract_insn, at recog.c:2327 (unrecognizable insn) with -mavx512vbmi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64393 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #4 from Kirill Yukhin --- Done
[Bug target/70510] ICE: output_operand: invalid %-code with -mavx512bw -masm=intel when emitting vpbroatcast
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70510 Kirill Yukhin changed: What|Removed |Added CC||kyukhin at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |kyukhin at gcc dot gnu.org --- Comment #1 from Kirill Yukhin --- will take a look.
[Bug target/70453] gcc generates invalid instruction vextractu64x4 (should be: vextracti64x4)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70453 Kirill Yukhin changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #8 from Kirill Yukhin --- Done.
[Bug target/70453] gcc generates invalid instruction vextractu64x4 (should be: vextracti64x4)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70453 --- Comment #7 from Kirill Yukhin --- Author: kyukhin Date: Thu Mar 31 15:25:33 2016 New Revision: 234635 URL: https://gcc.gnu.org/viewcvs?rev=234635=gcc=rev Log: Fix PR target/70453. gcc/ * config/i386/sse.md (define_mode_attr shuffletype): Fix typo. gcc/testsuite/ * gcc.target/i386/pr70453.c: New test. Added: branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/pr70453.c Modified: branches/gcc-5-branch/gcc/ChangeLog branches/gcc-5-branch/gcc/config/i386/sse.md branches/gcc-5-branch/gcc/testsuite/ChangeLog
[Bug target/70453] gcc generates invalid instruction vextractu64x4 (should be: vextracti64x4)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70453 --- Comment #6 from Kirill Yukhin --- Author: kyukhin Date: Thu Mar 31 15:23:29 2016 New Revision: 234634 URL: https://gcc.gnu.org/viewcvs?rev=234634=gcc=rev Log: Fix PR target/70453. gcc/ * config/i386/sse.md (define_mode_attr shuffletype): Fix typo. gcc/testsuite/ * gcc.target/i386/pr70453.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr70453.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/sse.md trunk/gcc/testsuite/ChangeLog
[Bug tree-optimization/70479] FMA is not reassociated causing x2 slowdown vs. ICC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70479 --- Comment #3 from Kirill Yukhin --- (In reply to Richard Biener from comment #2) > You mean we fail to handle ternary associative tree codes in GIMPLE reassoc? > Yes, that's true. It's not going to be easy to retro-fit there > implementation-wise. With rebalancing you mean handling reassoc-width > 1? Hi Richard, yes to both.
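The effect being discussed can be sketched in plain C. This is an assumed, simplified model and not the attached reproducer: `dot_single_chain` mirrors a reduction where every multiply-add feeds the next one through a single accumulator, while `dot_four_chains` shows the rebalanced form with several independent accumulators that are summed at the end (roughly what a reassociation width above 1 would permit). Both function names are hypothetical.

```c
#include <stddef.h>

/* One accumulator: each multiply-add depends on the previous one,
   so the operations form a single serial dependency chain. */
static float dot_single_chain(const float *a, const float *b, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += a[i] * b[i];
    return acc;
}

/* Four independent accumulators: the four chains can overlap in the
   pipeline and are combined with plain additions at the end. */
static float dot_four_chains(const float *a, const float *b, size_t n)
{
    float acc0 = 0.0f, acc1 = 0.0f, acc2 = 0.0f, acc3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        acc0 += a[i] * b[i];
        acc1 += a[i + 1] * b[i + 1];
        acc2 += a[i + 2] * b[i + 2];
        acc3 += a[i + 3] * b[i + 3];
    }
    /* Remainder elements go into the first accumulator. */
    for (; i < n; i++)
        acc0 += a[i] * b[i];
    return (acc0 + acc1) + (acc2 + acc3);
}
```

The two functions agree exactly only when no rounding occurs; in general the rebalanced form changes the order of floating-point additions, which is why a compiler needs permission such as `-Ofast` to perform this transformation on its own.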
[Bug tree-optimization/70479] FMA is not reassociated causing x2 slowdown vs. ICC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70479 --- Comment #1 from Kirill Yukhin --- (In reply to Kirill Yukhin from comment #0)
> Compile:
> GCC: g++ -march=haswell -Ofast -flto -fopenmp-simd -fpermissive m.cpp -o m.gcc
> ICC: icpc -O3 -ipo -fpermissive -xAVX2 -qopenmp m.cpp -o m.icc
Correct compile commands (the original ones are for Haswell):
GCC: g++ -march=knl -Ofast -flto -fopenmp-simd -fpermissive m.cpp -o m.gcc
ICC: icpc -O3 -ipo -fpermissive -xMIC-AVX512 -qopenmp m.cpp -o m.icc
[Bug tree-optimization/70479] New: FMA is not reassociated causing x2 slowdown vs. ICC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70479

Bug ID: 70479
Summary: FMA is not reassociated causing x2 slowdown vs. ICC
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: kyukhin at gcc dot gnu.org
Target Milestone: ---

Created attachment 38146
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38146=edit
Reproducer

Attached example demonstrates the issue.
GCC is recent trunk. ICC is v16.

Compile:
GCC: g++ -march=haswell -Ofast -flto -fopenmp-simd -fpermissive m.cpp -o m.gcc
ICC: icpc -O3 -ipo -fpermissive -xAVX2 -qopenmp m.cpp -o m.icc

Run:
GCC: time ./m.gcc 2 2
ICC: time ./m.icc 2 2

Hot spot generated by GCC (annotated w/ perf hit counts):
 157 │8d0:┌─→vbroad 0x4(%r13),%zmm0
 193 ││ lea    0x1(%rdx),%edx
 173 ││ vmulps (%r14,%rax,1),%zmm0,%zmm0
2943 ││ vbroad 0x60(%r13),%zmm1
 166 ││ vbroad 0x5c(%r13),%zmm2
 151 ││ vbroad 0x58(%r13),%zmm3
 144 ││ vbroad 0x54(%r13),%zmm4
 164 ││ vbroad 0x50(%r13),%zmm5
 170 ││ vbroad 0x4c(%r13),%zmm6
 162 ││ vbroad 0x48(%r13),%zmm7
 162 ││ vbroad 0x44(%r13),%zmm8
 154 ││ vbroad 0x40(%r13),%zmm9
 172 ││ vbroad 0x3c(%r13),%zmm10
 167 ││ vbroad 0x38(%r13),%zmm11
 172 ││ vbroad 0x34(%r13),%zmm12
 171 ││ vbroad 0x30(%r13),%zmm13
 161 ││ vbroad 0x2c(%r13),%zmm14
 176 ││ vbroad 0x28(%r13),%zmm15
 139 ││ vbroad 0x24(%r13),%zmm16
 180 ││ vbroad 0x20(%r13),%zmm17
 158 ││ vbroad 0x1c(%r13),%zmm18
 165 ││ vbroad 0x18(%r13),%zmm19
 140 ││ vbroad 0x10(%r13),%zmm21
 179 ││ vbroad 0xc(%r13),%zmm22
 146 ││ vbroad 0x8(%r13),%zmm23
 170 ││ vbroad 0x0(%r13),%zmm24
 170 ││ vbroad 0x14(%r13),%zmm20
 168 ││ vfmadd (%r15,%rax,1),%zmm24,%zmm0
2732 ││ mov    0xb8(%rsp),%rcx
 172 ││ vfmadd (%r11,%rax,1),%zmm23,%zmm0
1649 ││ vfmadd (%rsi,%rax,1),%zmm22,%zmm0
3413 ││ vfmadd (%rcx,%rax,1),%zmm21,%zmm0
3653 ││ mov    0xc0(%rsp),%rcx
 182 ││ vfmadd (%rcx,%rax,1),%zmm20,%zmm0
2806 ││ mov    0xc8(%rsp),%rcx
 176 ││ vfmadd (%rcx,%rax,1),%zmm19,%zmm0
2439 ││ mov    0xd0(%rsp),%rcx
 179 ││ vfmadd (%rcx,%rax,1),%zmm18,%zmm0
2562 ││ mov    0xd8(%rsp),%rcx
 197 ││ vfmadd (%rcx,%rax,1),%zmm17,%zmm0
2867 ││ mov    0xe0(%rsp),%rcx
 141 ││ vfmadd (%rcx,%rax,1),%zmm16,%zmm0
3200 ││ mov    0xe8(%rsp),%rcx
 156 ││ vfmadd (%rcx,%rax,1),%zmm15,%zmm0
3557 ││ mov    0xf0(%rsp),%rcx
 158 ││ vfmadd (%rcx,%rax,1),%zmm14,%zmm0
     ││ mov    0xf8(%rsp),%rcx
 143 ││ vfmadd (%rcx,%rax,1),%zmm13,%zmm0
3004 ││ mov    0x100(%rsp),%rcx
 177 ││ vfmadd (%rcx,%rax,1),%zmm12,%zmm0
2876 ││ mov    0x108(%rsp),%rcx
 144 ││ vfmadd (%rcx,%rax,1),%zmm11,%zmm0
2838 ││ mov    0x110(%rsp),%rcx
 168 ││ vfmadd (%rcx,%rax,1),%zmm10,%zmm0
2503 ││ mov    0x118(%rsp),%rcx
 203 ││ vfmadd (%rcx,%rax,1),%zmm9,%zmm0
2471 ││ mov    0x120(%rsp),%rcx
 185 ││ vfmadd (%rcx,%rax,1),%zmm8,%zmm0
2153 ││ mov    0x128(%rsp),%rcx
 152 ││ vfmadd (%r12,%rax,1),%zmm7,%zmm0
2091 ││ vfmadd (%rbx,%rax,1),%zmm6,%zmm0
3049 ││ vfmadd (%r10,%rax,1),%zmm5,%zmm0
3737 ││ vfmadd (%r9,%rax,1),%zmm4,%zmm0
3665 ││ vfmadd (%r8,%rax,1),%zmm3,%zmm0
3627 ││ vfmadd (%rdi,%rax,1),%zmm2,%zmm0
3804 ││ vfmadd (%rcx,%rax,1),%zmm1,%zmm0
4052 ││ mov    0x130(%rsp),%rcx
 160 ││ cmp    0x138(%rsp),%edx
 534 ││ vmovup %zmm0,(%rcx,%rax,1)
3235 ││ lea    0x40(%rax),%rax
 161 │└──jb 8d0

Hot spot generated by ICC (annotated w/ perf hit counts):
 344 │47a:┌─→vmulps 0x204c(%r11,%r14,4),%zmm27,%zmm2
 821 ││ vmulps 0x2050(%r11,%r14,4),%zmm4,%zmm1
 318 ││ vmulps 0x2040(%r11,%r14,4),%zmm6,%zmm29
 818 ││ vmulps 0x1840(%r11,%r14,4),%zmm7,%zmm31
 275 ││ vfmadd 0x1838(%r11,%r14,4),%zmm9,%zmm31
1234 ││ vfmadd 0x183c(%r11,%r14,4),%zmm8,%zmm29
 442 ││ vfmadd 0x2044(%r11,%r14,4),%zmm5,%zmm2
1110 ││ vfmadd 0x2048(%r11,%r14,4),%zmm28,%zmm1
 337 ││ vaddps %zmm29,%zmm31,%zmm0
1047 ││ vaddps %zmm1,%zmm2,%zmm3
 655 ││ vmulps 0x1830(%r11,%r14,4),%zmm11,%zmm30
 956 ││ vmulps 0x1834(%r11,%r14,4),%zmm10,%zmm2
 296 ││ vmulps 0x1024(%r11,%r14,4),%zmm15,%zmm1
1050 ││ vmulps 0x1028(%r11,%r14,4),%zmm14,%zmm31
 294 ││ vaddps %zmm0,%zmm3,%zmm3
1057 ││ vfmadd 0x102c(%r11,%r14,4),%zmm13,%zmm30
 344 ││ vfmadd 0x1030(%r11,%r14,4),%zmm12,%zmm2
 911 ││ vfmadd 0x1020(%r11,%r14,4),%zmm16,%zmm31
 332 ││ vfmadd 0x820(%r11,%r14,4),%zmm17,%zmm1
 885 ││ vaddps %z
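The difference between the two hot spots can be sketched in plain C. This is only an illustration under assumptions: the function names, the scalar loops and the four-way split are hypothetical and are not taken from the attached reproducer. In the GCC listing every vfmadd accumulates into %zmm0, so each FMA has to wait for the previous one; the ICC listing appears to keep several partial sums and merge them with vaddps.

```c
#include <assert.h>
#include <stddef.h>

/* Single accumulator: one serial dependency chain, as in the GCC loop. */
float dot_chain(const float *coef, const float *x, size_t n)
{
    assert(coef != NULL && x != NULL);
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += coef[i] * x[i];
    return acc;
}

/* Four independent partial sums merged at the end.
   The split factor of four is an assumption made for this sketch. */
float dot_split(const float *coef, const float *x, size_t n)
{
    assert(coef != NULL && x != NULL);
    float a0 = 0.0f, a1 = 0.0f, a2 = 0.0f, a3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += coef[i]     * x[i];
        a1 += coef[i + 1] * x[i + 1];
        a2 += coef[i + 2] * x[i + 2];
        a3 += coef[i + 3] * x[i + 3];
    }
    for (; i < n; i++)              /* leftover terms */
        a0 += coef[i] * x[i];
    return (a0 + a1) + (a2 + a3);
}
```

Reassociating a floating-point sum changes rounding in general, so a compiler may only rewrite the first form into the second under options such as -Ofast or -ffast-math.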
[Bug target/70453] gcc generates invalid instruction vextractu64x4 (should be: vextracti64x4)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70453 Kirill Yukhin changed: What|Removed |Added Attachment #38133|0 |1 is obsolete|| --- Comment #5 from Kirill Yukhin --- Created attachment 38135 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38135=edit The patch Woops, this one.
[Bug target/70453] gcc generates invalid instruction vextractu64x4 (should be: vextracti64x4)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70453 --- Comment #3 from Kirill Yukhin --- Created attachment 38133 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38133=edit Proposed patch I am reg-testing trivial patch
[Bug target/70453] gcc generates invalid instruction vextractu64x4 (should be: vextracti64x4)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70453 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2016-03-30 Ever confirmed|0 |1 --- Comment #2 from Kirill Yukhin --- Confirmed
[Bug target/70453] gcc generates invalid instruction vextractu64x4 (should be: vextracti64x4)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70453 Kirill Yukhin changed: What|Removed |Added CC||kyukhin at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |kyukhin at gcc dot gnu.org --- Comment #1 from Kirill Yukhin --- Will look
[Bug target/70429] Wrong code with -O1.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70429

Kirill Yukhin changed:

What|Removed |Added
CC||kyukhin at gcc dot gnu.org

--- Comment #4 from Kirill Yukhin ---
Seems like combiner performs invalid reassociation.
This trivial addition to Jakub's PR70222 fix makes test work:

--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -10526,7 +10526,7 @@ simplify_shift_const_1 (enum rtx_code code, machine_mode result_mode,
            {
              /* For ((unsigned) (cstULL >> count)) >> cst2 we have to make
                 sure the result will be masked.  See PR70222.  */
-             if (code == LSHIFTRT
+             if ((code == LSHIFTRT || code == ASHIFTRT)
                  && mode != result_mode
                  && !merge_outer_ops (&outer_op, &outer_const, AND,
                                       GET_MODE_MASK (result_mode)
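Why the mask mentioned in the quoted comment matters can be shown with a small C sketch. It is an illustration only: the function names are hypothetical, it models only the logical (unsigned) case, and it is not the combiner's actual code. Merging the two shifts of ((unsigned) (cstULL >> count)) >> cst2 into a single wide shift lets bits from above the narrow type leak into the result unless the result is masked. The patch above applies the same guard to ASHIFTRT; arithmetic shifts are left out here because right-shifting negative signed values is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: shift in 64 bits, truncate to 32 bits, then shift again. */
uint32_t shift_two_step(uint64_t v, unsigned count, unsigned cst2)
{
    assert(count < 64 && cst2 < 32);
    return (uint32_t)(v >> count) >> cst2;
}

/* Naive merge: bits above bit 31 of (v >> count) leak into the result. */
uint32_t shift_merged_unmasked(uint64_t v, unsigned count, unsigned cst2)
{
    assert(count + cst2 < 64 && cst2 < 32);
    return (uint32_t)(v >> (count + cst2));
}

/* Merge plus mask: clear the top cst2 bits that the narrow logical
   shift would have filled with zeros. */
uint32_t shift_merged_masked(uint64_t v, unsigned count, unsigned cst2)
{
    assert(count + cst2 < 64 && cst2 < 32);
    return (uint32_t)(v >> (count + cst2)) & (UINT32_MAX >> cst2);
}
```

With v = 0xFFFFFFFF00000000, count = 4 and cst2 = 4, the two-step and masked versions agree while the unmasked merge does not.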
[Bug target/70406] ICE: in extract_insn, at recog.c:2287 (unrecognizable insn) with -mtune=pentium2 -mavx512f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70406 Kirill Yukhin changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #6 from Kirill Yukhin --- Done.
[Bug target/70406] ICE: in extract_insn, at recog.c:2287 (unrecognizable insn) with -mtune=pentium2 -mavx512f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70406 --- Comment #5 from Kirill Yukhin --- Author: kyukhin Date: Mon Mar 28 08:01:56 2016 New Revision: 234501 URL: https://gcc.gnu.org/viewcvs?rev=234501=gcc=rev Log: PR target/70406. gcc/ * config/i386/i386.md (define_split, andn): Fix modes. gcc/testsuite/ * gcc.target/i386/pr70406.c: New test. Added: branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/pr70406.c Modified: branches/gcc-5-branch/gcc/ChangeLog branches/gcc-5-branch/gcc/config/i386/i386.md branches/gcc-5-branch/gcc/testsuite/ChangeLog
[Bug target/70406] ICE: in extract_insn, at recog.c:2287 (unrecognizable insn) with -mtune=pentium2 -mavx512f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70406 --- Comment #4 from Kirill Yukhin --- Author: kyukhin Date: Mon Mar 28 07:59:44 2016 New Revision: 234500 URL: https://gcc.gnu.org/viewcvs?rev=234500=gcc=rev Log: PR target/70406 gcc/ * config/i386/i386.md (define_split, andn): Fix modes. gcc/testsuite/ * gcc.target/i386/pr70406.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr70406.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386.md trunk/gcc/testsuite/ChangeLog
[Bug target/70406] ICE: in extract_insn, at recog.c:2287 (unrecognizable insn) with -mtune=pentium2 -mavx512f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70406 --- Comment #3 from Kirill Yukhin --- Created attachment 38095 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38095=edit Bootstrapped/regtested patch Will submit to gcc-patches shortly
[Bug target/70406] ICE: in extract_insn, at recog.c:2287 (unrecognizable insn) with -mtune=pentium2 -mavx512f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70406 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2016-03-25 Ever confirmed|0 |1 --- Comment #2 from Kirill Yukhin --- Reproduced.
[Bug target/70325] ICE on __builtin_ia32_storedquqi256_mask
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70325 Kirill Yukhin changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #7 from Kirill Yukhin --- Done.
[Bug target/70325] ICE on __builtin_ia32_storedquqi256_mask
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70325 --- Comment #6 from Kirill Yukhin --- Author: kyukhin Date: Tue Mar 22 11:13:44 2016 New Revision: 234396 URL: https://gcc.gnu.org/viewcvs?rev=234396=gcc=rev Log: PR target/70325. gcc/ * config/i386/i386.c (def_builtin): Handle OPTION_MASK_ISA_AVX512VL to be and-ed with other bits. (const struct builtin_description bdesc_special_args[]): Remove duplicate ISA bits. gcc/testsuite/ * gcc.target/i386/pr70325.c: New test. Added: branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/pr70325.c Modified: branches/gcc-5-branch/gcc/ChangeLog branches/gcc-5-branch/gcc/config/i386/i386.c branches/gcc-5-branch/gcc/testsuite/ChangeLog
[Bug target/70325] ICE on __builtin_ia32_storedquqi256_mask
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70325 --- Comment #5 from Kirill Yukhin --- Author: kyukhin Date: Tue Mar 22 11:09:03 2016 New Revision: 234395 URL: https://gcc.gnu.org/viewcvs?rev=234395=gcc=rev Log: PR target/70325 gcc/ * config/i386/i386.c (def_builtin): Handle OPTION_MASK_ISA_AVX512VL to be and-ed with other bits. (const struct builtin_description bdesc_special_args[]): Remove duplicate ISA bits. gcc/testsuite/ * gcc.target/i386/pr70325.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr70325.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386.c trunk/gcc/testsuite/ChangeLog
[Bug target/70293] [ICE, AVX-512] Wrong reg constraints in vec_dup
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70293 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #6 from Kirill Yukhin --- Done
[Bug target/70325] ICE on __builtin_ia32_storedquqi256_mask
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70325 Kirill Yukhin changed: What|Removed |Added Status|RESOLVED|ASSIGNED Last reconfirmed||2016-03-22 Resolution|FIXED |--- Ever confirmed|0 |1 --- Comment #4 from Kirill Yukhin --- Sorry, closed by mistake
[Bug target/70325] ICE on __builtin_ia32_storedquqi256_mask
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70325 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #3 from Kirill Yukhin --- Done.
[Bug target/70325] ICE on __builtin_ia32_storedquqi256_mask
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70325

--- Comment #2 from Kirill Yukhin ---
I am testing this patch:

commit e88ceeabc50634012fa21f47625934d9a2c2e160
Author: Kirill Yukhin
Date: Mon Mar 21 14:28:58 2016 +0300

    AVX-512. Fix PR70325.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 3d8dbc4..2c56ee7 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -32431,7 +32431,7 @@ def_builtin (HOST_WIDE_INT mask, const char *name,
     mask &= ~OPTION_MASK_ISA_64BIT;
   if (mask == 0
-      || (mask & ix86_isa_flags) != 0
+      || (mask & ix86_isa_flags) == mask
       || (lang_hooks.builtin_function
           == lang_hooks.builtin_function_ext_scope))
diff --git a/gcc/testsuite/gcc.target/i386/pr70325.c b/gcc/testsuite/gcc.target/i386/pr70325.c
new file mode 100644
index 000..e2b9342
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr70325.c
@@ -0,0 +1,12 @@
+/* PR target/70325 */
+/* { dg-do compile } */
+/* { dg-options "-mavx512vl -O2" } */
+
+typedef char C __attribute((__vector_size__(32)));
+typedef int I __attribute((__vector_size__(32)));
+
+void
+f(int a,I b)
+{
+  __builtin_ia32_storedquqi256_mask((C*)f,(C)b,a); /* { dg-warning "implicit declaration of function" } */
+}
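The one-line change in the patch above replaces an "any required bit is enabled" test with an "all required bits are enabled" test. A minimal C sketch of the difference follows; the DEMO_ISA_* bit values and the function names are hypothetical and do not correspond to GCC's real OPTION_MASK_ISA_* definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. */
#define DEMO_ISA_AVX512BW (UINT64_C(1) << 0)
#define DEMO_ISA_AVX512VL (UINT64_C(1) << 1)

/* Old-style test: true as soon as one required bit is enabled. */
bool any_required_bit_enabled(uint64_t required, uint64_t enabled)
{
    return (required & enabled) != 0;
}

/* New-style test: true only if every required bit is enabled. */
bool all_required_bits_enabled(uint64_t required, uint64_t enabled)
{
    return (required & enabled) == required;
}
```

For a builtin that needs both bits, enabling only one of them (as -mavx512vl alone does in the test case) passes the first test but not the second.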
[Bug target/70293] [ICE, AVX-512] Wrong reg constraints in vec_dup
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70293 --- Comment #5 from Kirill Yukhin --- Author: kyukhin Date: Mon Mar 21 10:53:50 2016 New Revision: 234364 URL: https://gcc.gnu.org/viewcvs?rev=234364=gcc=rev Log: PR target/70293. gcc/ * config/i386 (define_insn "*vec_dup"/AVX2): Block third alternative for AVX-512VL target, gcc/testsuite/ * gcc.target/i386/pr70293.c: New test. Added: branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/pr70293.c Modified: branches/gcc-5-branch/gcc/ChangeLog branches/gcc-5-branch/gcc/config/i386/sse.md branches/gcc-5-branch/gcc/testsuite/ChangeLog
[Bug target/70293] [ICE, AVX-512] Wrong reg constraints in vec_dup
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70293 --- Comment #4 from Kirill Yukhin --- Author: kyukhin Date: Mon Mar 21 10:51:04 2016 New Revision: 234363 URL: https://gcc.gnu.org/viewcvs?rev=234363=gcc=rev Log: PR target/70293 gcc/ * config/i386 (define_insn "*vec_dup"/AVX2): Block third alternative for AVX-512VL target, gcc/testsuite/ * gcc.target/i386/pr70293.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr70293.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/sse.md trunk/gcc/testsuite/ChangeLog
[Bug target/70325] ICE on __builtin_ia32_storedquqi256_mask
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70325 Kirill Yukhin changed: What|Removed |Added CC||kyukhin at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |kyukhin at gcc dot gnu.org --- Comment #1 from Kirill Yukhin --- Reproducible with: ./xg++ -B. -O2 -S 1.c
[Bug target/70293] New: [ICE, AVX-512] Wrong reg constraints in vec_dup
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70293

Bug ID: 70293
Summary: [ICE, AVX-512] Wrong reg constraints in vec_dup
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: kyukhin at gcc dot gnu.org
Target Milestone: ---

Created attachment 38018
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38018=edit
Reproducer

Attached testcase ICEs when compiled as:
./xgcc -B. -mtune=broadwell -mavx512vl -O2 -S ~/pixman-sse.i

1_0.32.6-r0/pixman-0.32.6/pixman/pixman-sse2.c: In function ‘fast_composite_scaled_bilinear_sse2__8__none_OVER’:
/home/donn/c/8.x/wrl-projects/intel-skylake-standard-glibc_std/bitbake_build/tmp/work/skylake-avx512-64-wrs-linux/pixman/1_0.32.6-r0/pixman-0.32.6/pixman/pixman-sse2.c:6059:1: error: insn does not satisfy its constraints:
(insn 5050 5049 1065 58 (set (reg/v:V8HI 56 xmm19 [orig:670 D.27517 ] [670])
    (vec_duplicate:V8HI (vec_select:HI (reg/v:V8HI 56 xmm19 [orig:670 D.27517 ] [670])
        (parallel [
            (const_int 0 [0])
        ] /home/donn/c/8.x/wrl-projects/intel-skylake-standard-glibc_std/bitbake_build/tmp/sysroots/x86_64-linux/usr/lib/x86_64-wrs-linux/gcc/x86_64-wrs-linux/5.2.0/include/emmintrin.h:606 4153 {avx2_pbroadcastv8hi}
    (nil))
/home/donn/c/8.x/wrl-projects/intel-skylake-standard-glibc_std/bitbake_build/tmp/work/skylake-avx512-64-wrs-linux/pixman/1_0.32.6-r0/pixman-0.32.6/pixman/pixman-sse2.c:6059:1: internal compiler error: in extract_constrain_insn, at recog.c:2190
0xdaccab _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
        /export/users/kyukhin/gcc/git/gcc2/gcc/rtl-error.c:108
0xdacd0b _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
        /export/users/kyukhin/gcc/git/gcc2/gcc/rtl-error.c:119
0xd50b31 extract_constrain_insn(rtx_insn*)
        /export/users/kyukhin/gcc/git/gcc2/gcc/recog.c:2190
0xd5f1d3 copyprop_hardreg_forward_1
        /export/users/kyukhin/gcc/git/gcc2/gcc/regcprop.c:774
0xd60afe execute
        /export/users/kyukhin/gcc/git/gcc2/gcc/regcprop.c:1280
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
[Bug target/70293] [ICE, AVX-512] Wrong reg constraints in vec_dup
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70293

--- Comment #1 from Kirill Yukhin ---
We've got duplication of patterns (make mddump):

;; /export/users/kyukhin/gcc/git/gcc2/gcc/config/i386/sse.md: 17107
(define_insn ("avx2_pbroadcastv8hi")
     [
        (set (match_operand:V8HI 0 ("register_operand") ("=x"))
            (vec_duplicate:V8HI (vec_select:HI (match_operand:V8HI 1 ("nonimmediate_operand") ("xm"))
                    (parallel [
                            (const_int 0 [0])
                        ]
    ] ("TARGET_AVX2") ("vpbroadcastw\t{%1, %0|%0, %w1}")
     [
        (set_attr ("type") ("ssemov"))
        (set_attr ("prefix_extra") ("1"))
        (set_attr ("prefix") ("vex"))
        (set_attr ("mode") ("TI"))
    ])
...
(define_insn ("avx512vl_vec_dupv8hi")
     [
        (set (match_operand:V8HI 0 ("register_operand") ("=v"))
            (vec_duplicate:V8HI (vec_select:HI (match_operand:V8HI 1 ("nonimmediate_operand") ("vm"))
                    (parallel [
                            (const_int 0 [0])
                        ]
    ] ("(TARGET_AVX512BW) && (TARGET_AVX512VL)") ("vpbroadcastw\t{%1, %0|%0, %1}")
     [
        (set_attr ("type") ("ssemov"))
        (set_attr ("prefix") ("evex"))
        (set_attr ("mode") ("TI"))
    ])

That's why we've got unsatisfied constraints on xmmN, N>15.
[Bug target/70293] [ICE, AVX-512] Wrong reg constraints in vec_dup
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70293 --- Comment #2 from Kirill Yukhin --- Created attachment 38020 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38020=edit Proposed patch Attached patch solves the issue by blocking AVX2's broadcast pattern alternative: $r->Yi, which is subject of split2
[Bug target/70293] [ICE, AVX-512] Wrong reg constraints in vec_dup
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70293 --- Comment #3 from Kirill Yukhin --- Regtest is in progress
[Bug target/70028] Error: operand size mismatch for `kmovw' (wrong assembly generated) with -mavx512bw -masm=intel
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70028 --- Comment #4 from Kirill Yukhin --- (In reply to Jakub Jelinek from comment #3) > Created attachment 37835 [details] > gcc6-pr70028.patch > > So what about this patch then? I don't see kmov* used with %k in other > patterns, where "m" could appear. Hi Jakub, patch is fine to me.
[Bug target/70028] Error: operand size mismatch for `kmovw' (wrong assembly generated) with -mavx512bw -masm=intel
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70028 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2016-03-01 CC||kyukhin at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #2 from Kirill Yukhin --- Confirmed. The issue is that operand modifier passed in .md file is %k1, which stands for SI mode. It should be 32b reg or 16b memory, i.e. %ebx and WORD.
[Bug tree-optimization/69980] New: [6 regression] Supposedly wrong SLP code emitted
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69980

Bug ID: 69980
Summary: [6 regression] Supposedly wrong SLP code emitted
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: kyukhin at gcc dot gnu.org
Target Milestone: ---

Created attachment 37806
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37806=edit
Reproducer

Hello,
The attached test fails at run time when compiled as follows:
$ gfortran -m64 -Ofast repro.f90 -msse

When compiled w/ -O2 it works fine.
The second loop nest is just for verification.

The issue lives here:
      mumax = 0;
      do k=1,26
        do i=1,3
          mumax(i) = max(mumax(i), mu(i,k)+mu(i,k))
        end do
      end do

Looks like SLP emits some wrong permutations here.
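For cross-checking, the loop nest quoted above can be written as a scalar C reference. This is a sketch under assumptions: the element type (float), the function name and the NI/NK macros are hypothetical, since the report does not show the declarations of mu and mumax.

```c
#include <assert.h>
#include <stddef.h>

#define NI 3    /* inner extent: i = 1..3 in the Fortran loop */
#define NK 26   /* outer extent: k = 1..26 in the Fortran loop */

/* mu is indexed as mu[k][i], which matches the memory layout of the
   column-major Fortran array mu(i,k). */
void mumax_reference(float mu[NK][NI], float mumax[NI])
{
    assert(mu != NULL && mumax != NULL);
    for (size_t i = 0; i < NI; i++)
        mumax[i] = 0.0f;                      /* mumax = 0 */
    for (size_t k = 0; k < NK; k++)
        for (size_t i = 0; i < NI; i++) {
            float doubled = mu[k][i] + mu[k][i];
            if (doubled > mumax[i])           /* max(mumax(i), ...) */
                mumax[i] = doubled;
        }
}
```

Each of the three results depends only on its own column index i, so a vectorized version must not mix lanes across different i.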
[Bug tree-optimization/69956] New: [ICE] Wrong vector type @ fold-const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69956

Bug ID: 69956
Summary: [ICE] Wrong vector type @ fold-const
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: kyukhin at gcc dot gnu.org
Target Milestone: ---

Created attachment 37789
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37789=edit
Reproducer

Hello,
The attached testcase produces an ICE when compiled as follows:
gcc -S -O2 -march=skylake-avx512 repro.i -ftree-vectorize

I have observed the ICE since 02.02.2016.

/nfs/ims/home/kyukhin/repro.i:2:1: internal compiler error: tree check: expected vector_type, have integer_type in const_unop, at fold-const.c:1665
 fn1() {
 ^~~
0xda1f9c tree_check_failed(tree_node const*, char const*, int, char const*, ...)
        /export/users/gnutester/stability/svn/trunk/gcc/tree.c:9637
0x860742 tree_check(tree_node*, char const*, int, char const*, tree_code)
        /export/users/gnutester/stability/svn/trunk/gcc/tree.h:3006
0x860742 const_unop(tree_code, tree_node*, tree_node*)
        /export/users/gnutester/stability/svn/trunk/gcc/fold-const.c:1665
0xe7f639 gimple_resimplify1(gimple**, code_helper*, tree_node*, tree_node**, tree_node* (*)(tree_node*))
        /export/users/gnutester/stability/svn/trunk/gcc/gimple-match-head.c:85
0xee84b3 gimple_simplify(gimple*, code_helper*, tree_node**, gimple**, tree_node* (*)(tree_node*), tree_node* (*)(tree_node*))
        /export/users/gnutester/stability/svn/trunk/gcc/gimple-match-head.c:622
0x8a0933 gimple_fold_stmt_to_constant_1(gimple*, tree_node* (*)(tree_node*), tree_node* (*)(tree_node*))
        /export/users/gnutester/stability/svn/trunk/gcc/gimple-fold.c:4981
0xc409d2 back_propagate_equivalences
        /export/users/gnutester/stability/svn/trunk/gcc/tree-ssa-dom.c:881
0xc409d2 record_temporary_equivalences(edge_def*, const_and_copies*, avail_exprs_stack*)
        /export/users/gnutester/stability/svn/trunk/gcc/tree-ssa-dom.c:963
0xd0663a thread_through_normal_block
        /export/users/gnutester/stability/svn/trunk/gcc/tree-ssa-threadedge.c:858
0xd07a22 thread_across_edge(gcond*, edge_def*, bool, const_and_copies*, avail_exprs_stack*, tree_node* (*)(gimple*, gimple*, avail_exprs_stack*))
        /export/users/gnutester/stability/svn/trunk/gcc/tree-ssa-threadedge.c:1005
0xc404c0 dom_opt_dom_walker::thread_across_edge(edge_def*)
        /export/users/gnutester/stability/svn/trunk/gcc/tree-ssa-dom.c:989
0xc406eb dom_opt_dom_walker::after_dom_children(basic_block_def*)
        /export/users/gnutester/stability/svn/trunk/gcc/tree-ssa-dom.c:1423
0x11a47a7 dom_walker::walk(basic_block_def*)
        /export/users/gnutester/stability/svn/trunk/gcc/domwalk.c:307
0xc432a0 execute
        /export/users/gnutester/stability/svn/trunk/gcc/tree-ssa-dom.c:614
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.

I suspect scalar masks.
[Bug tree-optimization/69882] New: [6 regression] Excessive reduction statements generated by SLP
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69882

Bug ID: 69882
Summary: [6 regression] Excessive reduction statements generated by SLP
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: kyukhin at gcc dot gnu.org
Target Milestone: ---

Created attachment 37743
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37743=edit
Reproducer

Hello,
Attached test case emits wrong reduction statements.

Compile:
$ trunk/64/20160220/bin/gfortran -o repro -static -m64 -Ofast -mavx repro.f90

Execution ABORTs.
Works fine when compiled w/ -O0.

Extract from vectorizer dump:
  :
  # k_239 = PHI <k.4_11(48), k_266(56)>
  # c_I_lsm.10_241 = PHI <c_I_lsm.10_48(48), M.0_249(56)>
  # c_I_lsm.11_242 = PHI <c_I_lsm.11_3(48), M.0_252(56)>
  # vectp_a.47_406 = PHI <vectp_a.48_402(48), vectp_a.47_407(56)>
  # vect_M.50_410 = PHI <vect_cst__412(48), vect_M.50_411(56)>
  # ivtmp_420 = PHI <0(48), ivtmp_421(56)>
  _245 = (integer(kind=8)) k_239;
  _246 = _245 * 4;
  _247 = _246 + -4;
  _248 = *a_22(D)[_247];
  M.0_249 = MAX_EXPR <_248, c_I_lsm.10_241>;
  _250 = _246 + -3;
  vect__248.49_408 = MEM[(real(kind=8) *)vectp_a.47_406];        <-- SLP
  vectp_a.47_409 = vectp_a.47_406 + 32;
  _251 = *a_22(D)[_250];
  vect_M.50_411 = MAX_EXPR <vect__248.49_408, vect_M.50_410>;    <-- SLP
  M.0_252 = MAX_EXPR <_251, c_I_lsm.11_242>;
  k_266 = k_239 + 1;
  vectp_a.47_407 = vectp_a.47_409 + 32;                          <-- SLP
  ivtmp_421 = ivtmp_420 + 1;
  if (ivtmp_421 >= bnd.44_361)
    goto ;
  else
    goto ;

  :
  ...
  # REMAINDER
  k_377 = k_365 + 1;
  if (k_365 == 26)
    goto ;
  else
    goto ;

  :
  goto ;

  :
  # k_381 = PHI <k_266(49)>
  # c_I_lsm.10_384 = PHI <M.0_249(49)>
  # c_I_lsm.11_386 = PHI <M.0_252(49)>
  # c_I_lsm.13_389 = PHI <c_I_lsm.13_84(49)>
  # c_I_lsm.12_392 = PHI <c_I_lsm.12_13(49)>
  # vect_M.50_413 = PHI <vect_M.50_411(49)>
  stmp_M.51_414 = BIT_FIELD_REF <vect_M.50_413, 64, 0>;
  stmp_M.51_415 = BIT_FIELD_REF <vect_M.50_413, 64, 64>;
  stmp_M.51_416 = BIT_FIELD_REF <vect_M.50_413, 64, 128>;
  stmp_M.51_417 = BIT_FIELD_REF <vect_M.50_413, 64, 192>;
  stmp_M.51_418 = MAX_EXPR <stmp_M.51_414, stmp_M.51_416>;   # <-- WHOT??
  stmp_M.51_419 = MAX_EXPR <stmp_M.51_415, stmp_M.51_417>;   # <-- DITTO.
  _401 = (integer(kind=4)) ratio_mult_vf.45_364;
  tmp.46_400 = k.4_11 + _401;
  if (niters.42_358 == ratio_mult_vf.45_364)
    goto ;
  else
    goto ;

Those 2 SSA names are then stored to the 1st and 2nd array elements.
[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671 --- Comment #24 from Kirill Yukhin --- (In reply to rguent...@suse.de from comment #23) > On Wed, 17 Feb 2016, jakub at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671 > > > > --- Comment #22 from Jakub Jelinek --- > > Created attachment 37722 [details] > > --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37722=edit > > gcc6-pr69671.patch > > > > Actually, on a closer look, I believe the only problem are the patterns that > > use a vector_move_operand "0C" inside of vec_select with only constants as > > the > > parallel's operands. Because fwprop is able to propagate constants into > > instructions (thus undo the CSE effect), but doesn't do anything on these, > > because it also simplifies them, so instead of the expected say > > (vec_select:V4QI (const_vector:V16QI [ > > (const_int 0 [0]) > > (const_int 0 [0]) > > (const_int 0 [0]) > > (const_int 0 [0]) > > (const_int 0 [0]) > > (const_int 0 [0]) > > (const_int 0 [0]) > > (const_int 0 [0]) > > (const_int 0 [0]) > > (const_int 0 [0]) > > (const_int 0 [0]) > > (const_int 0 [0]) > > (const_int 0 [0]) > > (const_int 0 [0]) > > (const_int 0 [0]) > > (const_int 0 [0]) > > ]) > > (parallel [ > > (const_int 0 [0]) > > (const_int 1 [0x1]) > > (const_int 2 [0x2]) > > (const_int 3 [0x3]) > > ])) > > we get in there simplified: > > (const_vector:V4QI [ > > (const_int 0 [0]) > > (const_int 0 [0]) > > (const_int 0 [0]) > > (const_int 0 [0]) > > ]) > > So, by adding extra patterns for that simplification fwprop is able to do > > its > > job even if CSE did a better job. > > Of course then I wonder why we didn't simplify this in the first place > when generating RTL and need to wait for forwprop ... > > But yes, sounds like the easiest way to go forward. Agree.
[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671 --- Comment #21 from Kirill Yukhin --- I am going to fix the issue in v7 for sure. But from the current point of view this is going to be a major pattern refactoring, and hence the patch will be thousands of lines. If this might be ported - I can put an XFAIL on the tests.
[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671

--- Comment #14 from Kirill Yukhin ---
Okay, I've tried:

1. Ran AVX-512 testing on Spec2006 and saw no impact of the one-liner:
   Geomeans:
   INT : 5.11  5.11  +0.05%
   FP  : 2.73  2.73  -0.08%
   ALL : 3.54  3.54  -0.02%

2. Tried Uroš's proposal. Adding to the guilty pattern a condition like this:
   "TARGET_AVX512VL
    && ((REG_P (operands[2])
         && REG_P (operands[0])
         && REGNO (operands[0]) == REGNO (operands[2]))
        || (operands[2] == CONST0_RTX (mode)))"
   No success either. The problem is that a zero-masked built-in has a
   register as its second source at expand, which is only later
   rematerialized to zero. So, setting this condition will lead to an ICE
   in recog @ expand.

So, for v6 it looks like we need to remove the one-liner.

For v7 we need to extend define_subst a bit to allow multiple output patterns.
E.g. currently:

(define_subst "mask"
  [(set (match_operand:SUBST_V 0)
        (match_operand:SUBST_V 1))]
  "TARGET_AVX512F"
  [(set (match_dup 0)
        (vec_merge:SUBST_V
          (match_dup 1)
          (match_operand:SUBST_V 2 "vector_move_operand" "0C")
          (match_operand: 3 "register_operand" "Yk")))])

It would solve the problem if we had this instead:

(define_subst "mask"
  [(set (match_operand:SUBST_V 0)
        (match_operand:SUBST_V 1))]
  "TARGET_AVX512F"
  [(set (match_dup 0)
        (vec_merge:SUBST_V
          (match_dup 1)
          (match_dup 0)
          (match_operand: 3 "register_operand" "Yk")))])
  [(set (match_dup 0)
        (vec_merge:SUBST_V
          (match_dup 1)
          (match_operand:SUBST_V 2 "const0_operand" "C")
          (match_operand: 3 "register_operand" "Yk")))])

Opinions?
[Bug libfortran/69651] Usage of unitialized pointer io/list_read.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69651 --- Comment #3 from Kirill Yukhin --- Created attachment 37627 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37627=edit Reproducer src Reproducer
[Bug libfortran/69651] Usage of unitialized pointer io/list_read.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69651 --- Comment #4 from Kirill Yukhin --- Created attachment 37628 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37628=edit Reproducer input
[Bug libfortran/69651] Usage of unitialized pointer io/list_read.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69651

--- Comment #5 from Kirill Yukhin ---
A bug in fortran's IO RT has emerged during 21 Apr 2016, between r54 and r92; looks like it's caused by the same revision –r71 (libgfortran/io/list_read.c), which probably just triggers another hidden bug.

Trying two builds (as of 21 and 22 Apr):

$ gfortran-20160421 -O0 T.f90 -static
$ ./a.out
res, (1) ==1 ! ### Ok

$ gfortran-20160422 -O0 T.f90 -static
$ ./a.out
res, (1) == 80 @p¼B ### FAIL – garbage is read in
[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671

--- Comment #5 from Kirill Yukhin ---
(In reply to ktkachov from comment #3)
> CC'ing Kirill for AVX512 opinion

I suppose that there's something wrong w/ the MD patterns.
E.g. for the provided example the pattern is:

;; /export/users/kyukhin/gcc/git/gcc/gcc/config/i386/sse.md: 9199
(define_insn ("avx512vl_truncatev4siv4qi2_mask")
     [
        (set (match_operand:V16QI 0 ("register_operand") ("=v"))
            (vec_concat:V16QI (vec_merge:V4QI (truncate:V4QI (match_operand:V4SI 1 ("register_operand") ("v")))
                    (vec_select:V4QI (match_operand:V16QI 2 ("vector_move_operand") ("0C"))
                        (parallel [
                                (const_int 0 [0])
                                (const_int 1 [0x1])
                                (const_int 2 [0x2])
                                (const_int 3 [0x3])
                            ]))
                    (match_operand:QI 3 ("register_operand") ("Yk")))
                (const_vector:V12QI [
                        (const_int 0 [0])
                        (const_int 0 [0])
                        (const_int 0 [0])
                        (const_int 0 [0])
                        (const_int 0 [0])
                        (const_int 0 [0])
                        (const_int 0 [0])
                        (const_int 0 [0])
                        (const_int 0 [0])
                        (const_int 0 [0])
                        (const_int 0 [0])
                        (const_int 0 [0])

Right now I think that the predicate of the 2nd operand is not correct. The operand should be either const0_rtx (of the corresponding mode) or a duplicate of operand 0 (the result, actually). This is what the constraint says. However, the predicate simply says that the operand is either const0_rtx or nonimmediate: no connection with operand 0.
[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671 Kirill Yukhin changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |kyukhin at gcc dot gnu.org
[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671 --- Comment #6 from Kirill Yukhin --- This bug seems to be mine.
[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671 --- Comment #8 from Kirill Yukhin --- (In reply to Jakub Jelinek from comment #7) > So do you want to use reg_or_0_operand? I don't think we usually tie output > with input already in the predicates, except when match_dup is used. That is the issue. reg_or_0_operand won't work (although it is better than "vector_move_operand" since it prohibits memory). We want the 2nd operand to be either: 1. const0_rtx 2. match_dup 0 I cannot see in gcc/genpreds.c whether a predicate for one operand can refer to another operand. We might invent some complicated subst, but the patterns look too complicated for that. Maybe extend genpreds.c and friends by introducing a new kind of predicate which takes (op, mode, operands) instead of (op, mode). Not sure about the amount of effort, though. Really hope there's some simpler solution.
[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671 --- Comment #10 from Kirill Yukhin --- (In reply to Jakub Jelinek from comment #9) > But something like that might remove the flexibility from the register > allocator. > > Wonder why the RA in this case doesn't see that the value loaded into that > pseudo register is CONST0_RTX which satisfies the C constraint and doesn't > undo CSE (rematerialize) in that case if it doesn't have that value already > loaded in the matching register to the output one. Then I see two options: 1. Split all patterns into match_dup and 0_operand variants by hand 2. Implement a dedicated subst for such patterns which will do p.1 while processing MD. Not sure it'll be easy.
[Bug libfortran/69651] New: Usage of unitialized pointer io/list_read.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69651

Bug ID: 69651
Summary: Usage of unitialized pointer io/list_read.c
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libfortran
Assignee: unassigned at gcc dot gnu.org
Reporter: kyukhin at gcc dot gnu.org
Target Milestone: ---

Unfortunately I have no testcase. But code itself looks awful to me:

/* Worker function to save a KIND=4 character to a string buffer,
   enlarging the buffer as necessary.  */

static void
push_char4 (st_parameter_dt *dtp, int c)
{
  gfc_char4_t *new, *p = (gfc_char4_t *) dtp->u.p.saved_string;

  if (p == NULL)
    {
      dtp->u.p.saved_string = xcalloc (SCRATCH_SIZE, sizeof (gfc_char4_t));
      dtp->u.p.saved_length = SCRATCH_SIZE;
      dtp->u.p.saved_used = 0;
      p = (gfc_char4_t *) dtp->u.p.saved_string;
    }

  if (dtp->u.p.saved_used >= dtp->u.p.saved_length)
    {
      dtp->u.p.saved_length = 2 * dtp->u.p.saved_length;
      p = xrealloc (p, dtp->u.p.saved_length * sizeof (gfc_char4_t));
      memset4 (new + dtp->u.p.saved_used, 0,   // <-- ??? new==junk ???
               dtp->u.p.saved_length - dtp->u.p.saved_used);
    }

  p[dtp->u.p.saved_used++] = c;
}

It was introduced w/ r210948 (https://gcc.gnu.org/ml/fortran/2014-05/msg00149.html).
Before that, new was [at least] initialized.
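A grow-and-zero buffer without the uninitialized pointer could look like the sketch below. It is a standalone illustration, not the libgfortran fix: struct char4_buf, push_char4_sketch, the small SCRATCH_SIZE and the use of plain calloc/realloc/memset instead of xcalloc/xrealloc/memset4 are all assumptions made so that the example compiles on its own.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SCRATCH_SIZE 4              /* kept small for illustration */

typedef uint32_t char4_t;           /* stand-in for gfc_char4_t */

struct char4_buf {                  /* hypothetical, replaces dtp->u.p.saved_* */
    char4_t *data;
    size_t length;                  /* allocated elements */
    size_t used;                    /* elements in use */
};

/* Append c, doubling the buffer when full. Returns 0 on success and
   -1 if allocation fails (the buffer is left unchanged in that case). */
int push_char4_sketch(struct char4_buf *b, char4_t c)
{
    assert(b != NULL);
    if (b->data == NULL) {
        b->data = calloc(SCRATCH_SIZE, sizeof *b->data);
        if (b->data == NULL)
            return -1;
        b->length = SCRATCH_SIZE;
        b->used = 0;
    }
    if (b->used >= b->length) {
        size_t new_length = 2 * b->length;
        char4_t *grown = realloc(b->data, new_length * sizeof *grown);
        if (grown == NULL)
            return -1;
        /* Zero only the new tail, through the pointer realloc returned. */
        memset(grown + b->used, 0, (new_length - b->used) * sizeof *grown);
        b->data = grown;            /* store the possibly moved block */
        b->length = new_length;
    }
    b->data[b->used++] = c;
    return 0;
}
```

The two points of the sketch are that the tail is cleared through the pointer returned by the reallocation, and that this pointer is written back to the owning structure before further use.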
[Bug target/69120] sse2_shufpd_v2df_mask has wrong name
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69120 --- Comment #2 from Kirill Yukhin --- Looked closely. The name was chosen intentionally to simplify the "sse2_shufpd" expand. If we want to fix this name, a new subst attribute needs to be introduced and if () emit_insn (avx512vl_... else emit_insn (sse2_... inserted into the expand. Besides the expand, this template is never called by name. So I would rather leave the name unchanged and keep things simple.
[Bug target/69120] sse2_shufpd_v2df_mask has wrong name
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69120 --- Comment #1 from Kirill Yukhin --- Will fix.