[Bug rtl-optimization/71956] New: [i686][7 Regression] 176.gcc fails on 32 bits when compiled with -march=core-avx2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71956
Bug ID: 71956
Summary: [i686][7 Regression] 176.gcc fails on 32 bits when compiled with -march=core-avx2
Product: gcc
Version: 7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: izamyatin at gmail dot com
Target Milestone: ---

Bisection points to r235765.
My exact options are -O3 -mfpmath=sse -march=core-avx2 -m32
Will try to bisect the test sources.
[Bug target/71088] [i386, AVX-512, Perf] vpermi2ps instead of vpermps emitted
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71088 --- Comment #1 from Igor Zamyatin --- Fixed by r237982
[Bug libgcc/71559] ICE in ix86_fp_cmp_code_to_pcmp_immediate, at config/i386/i386.c:23042 (KNL/AVX512)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71559 Igor Zamyatin changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #2 from Igor Zamyatin --- Created attachment 38715 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38715&action=edit Raw fortran source file. The failure can be seen on trunk with gfortran -march=knl -Ofast -fno-finite-math-only prja.f. Adding Ilya to look at it.
[Bug fortran/69368] [6 Regression] spec2006 test case 416.gamess fails with the g++ 6.0 compiler starting with r232508
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 Igor Zamyatin changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #94 from Igor Zamyatin --- We currently observe a miscompare on x86 (32 and 64 bits), and bisection points to r235817. No source bisecting yet, however.
[Bug lto/71089] [7 Regression] Failed to build 483.xalancbmk in SPEC CPU 2006
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71089 Igor Zamyatin changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #1 from Igor Zamyatin --- May be related to PR71015
[Bug rtl-optimization/69052] [6 Regression] Performance regression after r229402.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69052 Igor Zamyatin changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #3 from Igor Zamyatin --- (In reply to amker from comment #2) > It's my change, I will look into it. Any plans on this?
[Bug target/69344] [6 Regression] 435.gromacs regression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69344 Igor Zamyatin changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #1 from Igor Zamyatin --- Same as PR69274
[Bug tree-optimization/68775] spec2006 test case 465.tonto fails with the gcc 6.0 fortran compiler
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68775 Igor Zamyatin changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #2 from Igor Zamyatin --- > > Note that it seems that tonto also fails (since forever) with AVX2 on x86_64. Hmmm, at what optset are you seeing the failure?
[Bug tree-optimization/68654] [6 Regression] CoreMark Pro performance degradation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68654 --- Comment #6 from Igor Zamyatin --- Created attachment 36961 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36961&action=edit Dumps
Profilers show that core_state_transition and calc_func indeed became slower after r228668. The first difference in the dumps seems to start in the expand pass. I attached the dumps; a quick look shows extra register copies in several places.
Options that were used: -m32 -Ofast -funroll-loops -flto -static -march=core-avx2
[Bug fortran/68486] [6 Regression] 187.facerec in SPEC CPU 2000 failed to build
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68486 Igor Zamyatin changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #8 from Igor Zamyatin ---
This one fails for me with -O2:

Subroutine Foo ()
  Real(4), Allocatable, Save :: tmp (:, :)
  Real(4), Pointer, Save :: arr (:, :, :)
  Integer :: l, m, n
  tmp = (CSHIFT(CSHIFT(arr (:,:,l),m,2),n,1))
End Subroutine Foo
[Bug tree-optimization/68502] New: [6 Regression][i686] spec2000/179.art runfails after r222914
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68502
Bug ID: 68502
Summary: [6 Regression][i686] spec2000/179.art runfails after r222914
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: izamyatin at gmail dot com
Target Milestone: ---

There is a segmentation fault during execution.

r222914 is:

r222914 | rguenth | 2015-05-08 18:13:55 +0300 (Fri, 08 May 2015) | 15 lines
Changed paths:
   M /trunk/gcc/ChangeLog
   M /trunk/gcc/testsuite/ChangeLog
   A /trunk/gcc/testsuite/gcc.dg/vect/slp-41.c
   M /trunk/gcc/tree-vect-data-refs.c
   M /trunk/gcc/tree-vect-stmts.c

Options I used: -m32 -static -O3 -mfpmath=sse -march=core-avx2
-m64 is ok
[Bug tree-optimization/67800] [6 Regression] Missed vectorization opportunity on x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67800 --- Comment #3 from Igor Zamyatin --- Richard, do you have any plans regarding this?
[Bug rtl-optimization/67749] FAIL: gcc.dg/ifcvt-2.c scan-rtl-dump ce1 "3 true changes made"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67749 Igor Zamyatin changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #1 from Igor Zamyatin --- Any progress on this?
[Bug tree-optimization/66142] Loop is not vectorized because not sufficient support for GOMP_SIMD_LANE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66142 Igor Zamyatin changed: What|Removed |Added CC||enkovich.gnu at gmail dot com --- Comment #19 from Igor Zamyatin --- Richard, would you be able to look at this in some time?
[Bug lto/66752] spec2000 255.vortex performance compiled with GCC is ~20% lower than with CLANG
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66752 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #13 from Igor Zamyatin izamyatin at gmail dot com --- Why has the patch been reverted?
[Bug bootstrap/66638] [6 Regression] profiledbootstrap failure on x86-64 with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66638 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||amker at gcc dot gnu.org, ||izamyatin at gmail dot com --- Comment #1 from Igor Zamyatin izamyatin at gmail dot com --- r224020 is guilty
[Bug rtl-optimization/64081] [5 Regression] r217827 prevents RTL loop unroll
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64081 --- Comment #15 from Igor Zamyatin izamyatin at gmail dot com --- Got access to an AIX machine; planning to look at it next week
[Bug tree-optimization/65136] New: VRP inserts unnecessary constant copy in the loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65136
Bug ID: 65136
Summary: VRP inserts unnecessary constant copy in the loop
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: izamyatin at gmail dot com

For the following piece of code

int foo (unsigned int cc)
{
  while ( cc >> 16 )
    {
      cc = (cc & 0xffff) + (cc >> 16);
    }
  return cc == 1;
}

at -O2 we have

        movl    %edi, %eax
        shrl    $16, %eax
        testl   %eax, %eax
        je      .L2
        .p2align 4,,10
        .p2align 3
.L3:
        movzwl  %di, %edi
        addl    %eax, %edi
        movl    $1, %eax
        movl    %edi, %edx
        shrl    $16, %edx
        testl   %edx, %edx
        jne     .L3

while with -fno-tree-vrp

        jmp     .L8
        .p2align 4,,10
        .p2align 3
.L3:
        movzwl  %di, %edi
        addl    %eax, %edi
.L8:
        movl    %edi, %eax
        shrl    $16, %eax
        testl   %eax, %eax
        jne     .L3

VRP changes the loop in the following way:

  <bb 4>:
  # cc_12 = PHI <cc_2(D)(3), cc_5(4)>
  # _13 = PHI <_11(3), 1(4)>
  _4 = cc_12 & 65535;
  cc_5 = _13 + _4;
  _3 = cc_5 >> 16;
  if (_3 != 0)
    goto <bb 4>;
  else
    goto <bb 5>;

which results in a constant copy being inserted.
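A possible reading of where the constant 1 in the second PHI comes from, worked through as arithmetic: after one trip through the loop, cc is at most 0xffff + 0xffff = 0x1fffe, so the shifted value on the back edge can only be 0 or 1, and the loop is re-entered only when it is 1. The sketch below checks that bound at run time. It assumes a 32-bit unsigned int and assumes the operators in the report are `>>` and `& 0xffff` (consistent with the `shrl $16`, `movzwl` and `65535` in the dumps); `max_backedge_shift` is a hypothetical helper, not something from the report.

```c
/* Hypothetical helper: run the loop from the report and return the
   largest value of (cc >> 16) observed after a loop iteration, i.e.
   the value that would flow along the back edge.  Returns 0 if the
   loop body never produces a nonzero high half.  */
unsigned int
max_backedge_shift (unsigned int cc)
{
  unsigned int hi = cc >> 16;     /* value on loop entry (_11) */
  unsigned int max_seen = 0;

  while (hi != 0)
    {
      /* Low half is at most 0xffff and hi is at most 0xffff,
         so the sum is at most 0x1fffe.  */
      cc = (cc & 0xffff) + hi;
      hi = cc >> 16;              /* therefore 0 or 1 */
      if (hi > max_seen)
        max_seen = hi;
    }
  return max_seen;
}
```

For any input the function returns 0 or 1, which matches the `1(4)` operand of the PHI: when the back edge is taken, the shifted value is known to be exactly 1.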
[Bug tree-optimization/64739] Spurious array subscript is above array bounds warning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64739 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #2 from Igor Zamyatin izamyatin at gmail dot com --- Looks quite similar to PR64277
[Bug rtl-optimization/64081] [5 Regression] r217827 prevents RTL loop unroll
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64081 --- Comment #13 from Igor Zamyatin izamyatin at gmail dot com ---
(In reply to David Edelsohn from comment #12)
> GCC on AIX. One can use gcc111 in the GCC Compiler Farm.
Thanks! I've sent a request for access to gcc111 but have had no response so far...
[Bug rtl-optimization/64081] [5 Regression] r217827 prevents RTL loop unroll
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64081 --- Comment #11 from Igor Zamyatin izamyatin at gmail dot com --- Could you please provide details of your compiler configuration for me to try to reproduce the problem?
[Bug rtl-optimization/64081] [5 Regression] r217827 prevents RTL loop unroll
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64081 --- Comment #8 from Igor Zamyatin izamyatin at gmail dot com --- Created attachment 34524 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34524&action=edit patch to try AIX bootstrap
[Bug rtl-optimization/64081] [5 Regression] r217827 prevents RTL loop unroll
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64081 --- Comment #9 from Igor Zamyatin izamyatin at gmail dot com --- David, could you please try the attached patch?
[Bug tree-optimization/64277] [4.9/5 Regression] Incorrect warning array subscript is above array bounds
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64277 --- Comment #6 from Igor Zamyatin izamyatin at gmail dot com ---
The cunroll phase completely unrolls (7 times) the post-loop that was generated by the vectorizer, and later vrp complains about those unrolled iterations.

Note that for the test without if (nc 3), i.e.

void foo(short a[], short m)
{
  int i,j;
  int f1[10];
  short nc;

  nc = m + 1;
  for (i = 0, j = m; i < nc; i++, j--)
    {
      a[i] = f1[i];
      a[j] = i;
    }
  return;
}

vrp doesn't complain, while cunroll still performs the 7-times complete unroll.

Most probably the merge block generated for the original testcase affects vrp's work:

  <bb 14>:
  # prephitmp_39 = PHI <pretmp_40(3), _450(13), _451(40)>
  j_18 = (int) m_7(D);
  if (prephitmp_39 0)
    goto <bb 16>;
  else
    goto <bb 15>;
[Bug tree-optimization/64277] [4.9/5.0 Regression] Incorrect warning array subscript is above array bounds
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64277 --- Comment #5 from Igor Zamyatin izamyatin at gmail dot com --- BTW, making nc and m int instead of short eliminates the warning
[Bug tree-optimization/64277] [4.9/5.0 Regression] Incorrect warning array subscript is above array bounds
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64277 --- Comment #4 from Igor Zamyatin izamyatin at gmail dot com ---
I see the warning (used -O3 -mssse3 -Wall) on current trunk configured as
../configure --enable-clocale=gnu --with-system-zlib --enable-shared --with-demangler-in-ld --with-fpmath=sse --enable-checking=release --enable-languages=c,c++,fortran
[Bug lto/64043] [5 Regression] ICE (segfault) with LTO: in tree_check/tree.h:2758 get_binfo_at_offset/tree.c:11914
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64043 --- Comment #16 from Igor Zamyatin izamyatin at gmail dot com --- Hi, Honza! I still see those performance degradations for spec2006 tests. Could you please check them on your side?
[Bug target/64368] [5 Regression] Several libstdc++ test failures on darwin and others after r218964.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64368 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #13 from Igor Zamyatin izamyatin at gmail dot com --- Hi! Any plans on fixing this?
[Bug tree-optimization/64277] [4.9/5.0 Regression] Incorrect warning array subscript is above array bounds
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64277 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #1 from Igor Zamyatin izamyatin at gmail dot com --- Warnings are issued by vrp2. It happens when we have both vector and scalar versions of code. Something seems to confuse VRP analysis, probably the reason is that m and nc are shorts. Changing them to int makes warnings disappear
[Bug lto/64043] [5 Regression] ICE (segfault) with LTO: in tree_check/tree.h:2758 get_binfo_at_offset/tree.c:11914
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64043 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #13 from Igor Zamyatin izamyatin at gmail dot com --- I also see performance degradation for several spec2006 tests after this commit (mostly on FP tests). E.g. on Haswell, 433.milc shows about a 7-8% degradation when compiled with -O3 -flto -funroll-loops -march=core-avx2. I have not investigated this so far, though.
[Bug lto/64043] [5 Regression] ICE (segfault) with LTO: in tree_check/tree.h:2758 get_binfo_at_offset/tree.c:11914
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64043 --- Comment #15 from Igor Zamyatin izamyatin at gmail dot com --- Just checked: -Ofast -flto -funroll-loops -static -m64 -march=core-avx2 was used everywhere (not -O3 as I mentioned before)
[Bug target/64342] [5 Regression] Tests failing when compiled with '-m32 -fpic' after r216154.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64342 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com, ||vmakarov at redhat dot com --- Comment #1 from Igor Zamyatin izamyatin at gmail dot com --- avx512f-kandnw-1.c and funcspec-5.c seem to be non-PIC related issues. I asked Kirill to look at them. The others are not stability issues but rather performance issues: the generated code is less efficient than it should be. In one case, for some reason, the compiler uses callee-saved ebx in PIC mode instead of edx as in non-PIC mode, and in the xmm case the compiler uses the stack in PIC mode instead of an xmm register as in non-PIC mode. I see that the differences between PIC and non-PIC modes start in the reload pass, so I'd like Vlad to look at these cases.
[Bug rtl-optimization/64286] Redundant extend removal ignores vector element type
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64286 --- Comment #1 from Igor Zamyatin izamyatin at gmail dot com ---
Perhaps something like below to restrict ree for such cases?

diff --git a/gcc/ree.c b/gcc/ree.c
index 3376901..92370ea 100644
--- a/gcc/ree.c
+++ b/gcc/ree.c
@@ -1004,6 +1004,11 @@ add_removable_extension (const_rtx expr, rtx_insn *insn,
       struct df_link *defs, *def;
       ext_cand *cand;
 
+      if (!SCALAR_INT_MODE_P (GET_MODE (dest))
+          && (GET_MODE_UNIT_PRECISION (mode) !=
+              GET_MODE_UNIT_PRECISION (GET_MODE (XEXP (src, 0)))))
+        return;
+
       /* First, make sure we can get all the reaching definitions.  */
       defs = get_defs (insn, XEXP (src, 0), NULL);
       if (!defs)
[Bug rtl-optimization/64316] New: [5 Regression] ICE in simplify_const_unary_operation after r218503
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64316
Bug ID: 64316
Summary: [5 Regression] ICE in simplify_const_unary_operation after r218503
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: izamyatin at gmail dot com
Target: x86

Created attachment 34284 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34284&action=edit Reproducer

For the attached reproducer compiled with -O3 -march=core-avx2 there is:

internal compiler error: in simplify_const_unary_operation, at simplify-rtx.c:1676
 }
 ^
0xbd8b9b simplify_const_unary_operation(rtx_code, machine_mode, rtx_def*, machine_mode)
        ../../gcc/simplify-rtx.c:1676
0xbd6045 simplify_unary_operation(rtx_code, machine_mode, rtx_def*, machine_mode)
        ../../gcc/simplify-rtx.c:822
0xbd4f75 simplify_gen_unary(rtx_code, machine_mode, rtx_def*, machine_mode)
        ../../gcc/simplify-rtx.c:395
0x136761f if_then_else_cond
        ../../gcc/combine.c:8748
0x135e155 combine_simplify_rtx
        ../../gcc/combine.c:5394
0x135dea3 subst
        ../../gcc/combine.c:5331
0x135dca2 subst
        ../../gcc/combine.c:5276
0x135dca2 subst
        ../../gcc/combine.c:5276
0x1357a53 try_combine
        ../../gcc/combine.c:3250
0x1352ba8 combine_instructions
        ../../gcc/combine.c:1301
0x13740c3 rest_of_handle_combine
        ../../gcc/combine.c:14052
0x137416a execute
        ../../gcc/combine.c:14095
Please submit a full bug report, with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See http://gcc.gnu.org/bugs.html for instructions.

This actually leads to miscompilation of spec2006/403.gcc with -march=core-avx2
[Bug rtl-optimization/64317] New: [5 Regression] Ineffective allocation of PIC base register
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64317
Bug ID: 64317
Summary: [5 Regression] Ineffective allocation of PIC base register
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: izamyatin at gmail dot com
CC: vmakarov at redhat dot com
Target: i686

Created attachment 34285 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34285&action=edit Reproducer

For the attached test compiled with -O2 -m32 -fPIE -pie after r218059 we generate

        call    __x86.get_pc_thunk.ax
        addl    $_GLOBAL_OFFSET_TABLE_, %eax
        subl    $28, %esp
        .cfi_def_cfa_offset 48
        movl    48(%esp), %edi
        movl    %eax, 12(%esp)          <--- PIC reg spill
        testl   %edi, %edi
        je      .L8
        movl    12(%esp), %eax          <--- PIC reg fill
        xorl    %esi, %esi
        movl    c@GOT(%eax), %ebp
        .p2align 4,,10
        .p2align 3
.L4:
        movl    12(%esp), %ebx          <--- PIC reg fill
        addl    $1, %esi
        call    bar@PLT

while for r218058 there is no spill and only a reg-reg fill:

        call    __x86.get_pc_thunk.di
        addl    $_GLOBAL_OFFSET_TABLE_, %edi
        subl    $12, %esp
        .cfi_def_cfa_offset 32
        movl    32(%esp), %eax
        testl   %eax, %eax
        je      .L8
        movl    c@GOT(%edi), %ebp
        xorl    %esi, %esi
        .p2align 4,,10
        .p2align 3
.L4:
        movl    %edi, %ebx
        addl    $1, %esi
        call    bar@PLT
[Bug rtl-optimization/64151] [5 Regression] r218266 caused many regressions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64151 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #4 from Igor Zamyatin izamyatin at gmail dot com --- I also see ~5% regression on eg spec2006/456.hmmer for i686 with -O3
[Bug tree-optimization/64058] [5 Regression] Performance degradation after r216304
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64058 --- Comment #5 from Igor Zamyatin izamyatin at gmail dot com --- But at the same time difference in good and bad .optimized dumps seems to me insignificant (only some postfix numbers of variables).
[Bug tree-optimization/64081] New: [5 Regression] r217827 prevents RTL loop unroll
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64081
Bug ID: 64081
Summary: [5 Regression] r217827 prevents RTL loop unroll
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: izamyatin at gmail dot com
Target: x86

Created attachment 34123 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34123&action=edit reproducer

Noticed that for the attached test RTL loop unrolling no longer happens. This is because of changes in dom: namely, I see in the dumps that dom2 complicates the loop structure (probably because of the changes in lookup_avail_expr?). Looks like r217827 didn't intend this :)

Options that should be used: just -O2 -funroll-loops
[Bug tree-optimization/64058] [5 Regression] Performance degradation after r216304
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64058 --- Comment #4 from Igor Zamyatin izamyatin at gmail dot com ---
Partition maps differ.

216303:
Partition 0 (_1 - 1 101 200 252 267 316 348 )
Partition 16 (l1_lsm.7_159 - 106 159 238 253 )

and for 216304:
Partition 3 (l1_lsm.7_58 - 58 106 238 253 315 316 )
Partition 31 (u1_lsm.6_252 - 101 252 267 314 348 )

And also for 216304 there is
Coalesce list: (267)u1_lsm.6_252 (315)l1_lsm.7_58 [map: 70, 4] : Fail due to conflict
although for 216303 there is
Coalesce list: (1)_1 (253)l1_lsm.7_159 [map: 0, 32] : Fail due to conflict
[Bug tree-optimization/64058] New: Performance degradation after r216304
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64058
Bug ID: 64058
Summary: Performance degradation after r216304
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: izamyatin at gmail dot com
Target: x86

Created attachment 34101 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34101&action=edit reproducer

Got a performance regression for code similar to the attached test (well, rather specific; my simple attempts to reduce it further failed). The root cause of the regression is not obvious at first glance, but I noticed that after r216304 we have l1=u1 placed differently after the expand pass. It seems that for r216303 this assignment was sunk lower, which potentially affects live ranges and thus performance.

GCC options: -Ofast -flto -m32
[Bug tree-optimization/64058] Performance degradation after r216304
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64058 --- Comment #1 from Igor Zamyatin izamyatin at gmail dot com --- Created attachment 34102 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34102&action=edit good dump
[Bug tree-optimization/64058] Performance degradation after r216304
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64058 --- Comment #2 from Igor Zamyatin izamyatin at gmail dot com --- Created attachment 34103 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34103&action=edit bad dump
[Bug tree-optimization/63962] New: [5 Regression][x86] Code pessimization after r217213
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63962
Bug ID: 63962
Summary: [5 Regression][x86] Code pessimization after r217213
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: izamyatin at gmail dot com
Target: x86

While investigating some performance regressions on 32 bits on trunk (just -O2 -m32) I noticed that after r217213 forward propagation makes the code worse for the following testcase:

struct TT {
  int * c;
  int s;
} FF;

long foo (int t1, int t2)
{
  unsigned int i, i1;
  static int *c, s;
  for (i = 0; i < t1; i++)
    {
      c = FF.c + FF.s - 1;
      s = (int)(*c--);
      for (i1 = 2; i1 < t2; i1++)
        s += (int)(*c--);
    }
  return s;
}

For r217212 I see in 068t.forwprop2:

  <bb 3>:
  _8 = FF.c;
  _9 = FF.s;
  _10 = (sizetype) _9;
  _11 = _10 + 1073741823;
  _12 = _11 * 4;
  c.0_13 = _8 + _12;
  c = c.0_13;
  c.3_15 = c.0_13 + 4294967292;
  c = c.3_15;
  s.4_17 = *c.0_13;
  s = s.4_17;
  # DEBUG i1 = 2
  goto <bb 5>;

while for r217213 the code contains one more addition:

  <bb 3>:
  _8 = FF.c;
  _9 = FF.s;
  _10 = (sizetype) _9;
  _11 = _10 + 1073741823;
  _12 = _11 * 4;
  c.0_13 = _8 + _12;
  c = c.0_13;
  _31 = _12 + 4294967292;        <---
  c.3_15 = _8 + _31;
  c = c.3_15;
  s.4_17 = *c.0_13;
  s = s.4_17;
  # DEBUG i1 = 2
  goto <bb 5>;

I can try to cook up a runtime test if necessary.
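A runtime test along the lines the report offers could look roughly like the sketch below. It is only an illustration: the two loop comparisons are assumed to be `<` (the operators are missing from the archived text), and `run_case` is a hypothetical driver that is not part of the report. The driver is meant to be called with t2 no larger than the array length, so that `c` is never decremented past the start of the array.

```c
/* Testcase from the report; the '<' in both loop conditions is an
   assumption, since the archived text dropped the operators.  */
struct TT
{
  int *c;
  int s;
} FF;

long
foo (int t1, int t2)
{
  unsigned int i, i1;
  static int *c, s;

  for (i = 0; i < t1; i++)
    {
      c = FF.c + FF.s - 1;          /* point at the last element */
      s = (int) (*c--);
      for (i1 = 2; i1 < t2; i1++)   /* walk backwards, t2 - 2 more reads */
        s += (int) (*c--);
    }
  return s;
}

/* Hypothetical driver (not from the report): point FF at DATA, which
   holds N ints, and run foo.  Callers should keep T2 <= N so that c
   stays within DATA.  */
long
run_case (int *data, int n, int t1, int t2)
{
  FF.c = data;
  FF.s = n;
  return foo (t1, t2);
}
```

With a five-element array {1, 2, 3, 4, 5} and t2 = 5, each outer iteration reads the last four elements, so the expected result is 5 + 4 + 3 + 2 = 14. Timing a call with a large t1 before and after r217213 would be one way to measure the effect of the extra addition.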
[Bug target/63897] [5.0 regression] gcc.dg/torture/vector-2.c fails at on x86_64-apple-darwin14
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63897 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||vmakarov at redhat dot com --- Comment #5 from Igor Zamyatin izamyatin at gmail dot com ---
This again looks like an RA issue (similar to the issue discussed in PR63620).

After ira we have

(insn/f 64 3 2 2 (parallel [
      (set (reg:SI 94)
        (unspec:SI [ (const_int 0 [0]) ] UNSPEC_SET_GOT))
      (clobber (reg:CC 17 flags))
    ]) 683 {set_got}
  (expr_list:REG_UNUSED (reg:CC 17 flags)
    (expr_list:REG_EQUIV (unspec:SI [ (const_int 0 [0]) ] UNSPEC_SET_GOT)
      (expr_list:REG_CFA_FLUSH_QUEUE (nil)
        (nil)
.
(insn 36 32 37 4 (set (reg/v:V4SI 92 [ t ])
    (vec_merge:V4SI (vec_duplicate:V4SI (const_int 1 [0x1]))
      (const_vector:V4SI [ (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) ])
      (const_int 1 [0x1]))) /nfs/ims/home/izamyati/test_63897.c:11 2456 {vec_setv4si_0}
  (expr_list:REG_EQUAL (const_vector:V4SI [ (const_int 1 [0x1]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) ])
    (nil)))

and after RA

(insn/f 64 3 2 2 (parallel [
      (set (reg:SI 0 ax [94])                 <--- got is in ax
        (unspec:SI [ (const_int 0 [0]) ] UNSPEC_SET_GOT))
      (clobber (reg:CC 17 flags))
    ]) 683 {set_got}
  (expr_list:REG_EQUIV (unspec:SI [ (const_int 0 [0]) ] UNSPEC_SET_GOT)
    (expr_list:REG_CFA_FLUSH_QUEUE (nil)
      (nil
...
(call_insn/i 21 20 22 2 (set (reg:SI 0 ax)    <--- ax is changed
    (call (mem:QI (symbol_ref:SI (memcmp) [flags 0x41] function_decl 0x141d86bd0 __builtin_memcmp) [0 __builtin_memcmp S1 A8])
      (const_int 16 [0x10]))) /nfs/ims/home/izamyati/test_63897.c:21 664 {*call_value}
  (expr_list:REG_EH_REGION (const_int 0 [0])
    (nil))
  (nil))
(insn 22 21 23 2 (parallel [
      (set (reg/f:SI 7 sp)
        (plus:SI (reg/f:SI 7 sp)
          (const_int 16 [0x10])))
      (clobber (reg:CC 17 flags))
    ]) /nfs/ims/home/izamyati/test_63897.c:21 220 {*addsi_1}
  (expr_list:REG_ARGS_SIZE (const_int 0 [0])
    (nil)))
(insn 23 22 26 2 (set (reg:SI 0 ax [102])
    (reg:SI 0 ax)) /nfs/ims/home/izamyati/test_63897.c:21 90 {*movsi_internal}
  (nil))
.
(insn 72 32 36 4 (set (reg:SI 0 ax [116])     <--- ax is again used as got
    (plus:SI (reg:SI 0 ax [94])
      (const:SI (unspec:SI [ (symbol_ref/u:SI (*LC0) [flags 0x2]) ] UNSPEC_MACHOPIC_OFFSET /nfs/ims/home/izamyati/test_63897.c:11 213 {*leasi}
  (expr_list:REG_EQUAL (symbol_ref/u:SI (*LC0) [flags 0x2])
    (nil)))
(insn 36 72 37 4 (set (reg/v:V4SI 21 xmm0 [orig:92 t ] [92])
    (vec_merge:V4SI (vec_duplicate:V4SI (mem/u/c:SI (reg:SI 0 ax [116]) [0 S4 A32]))
      (const_vector:V4SI [ (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) ])
      (const_int 1 [0x1]))) /nfs/ims/home/izamyati/test_63897.c:11 2456 {vec_setv4si_0}
  (expr_list:REG_EQUAL (const_vector:V4SI [ (const_int 1 [0x1]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) ])
    (nil)))
[Bug target/63897] [5.0 regression] gcc.dg/torture/vector-2.c fails at on x86_64-apple-darwin14
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63897 --- Comment #6 from Igor Zamyatin izamyatin at gmail dot com --- after RA=after reload
[Bug ipa/63814] g++.dg/ipa/pr61160-1.C fails with -m32 on darwin14
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63814 --- Comment #12 from Igor Zamyatin izamyatin at gmail dot com --- So far it seems the issue is unlikely to be caused by the PIC-related changes in i686: the test passes with -fno-devirtualize.
[Bug sanitizer/63845] [5 Regression] c-c++-common/asan/bitfield-[12345].c fails on i?86 -with -fpic
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63845 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #3 from Igor Zamyatin izamyatin at gmail dot com --- I already posted a patch - http://gcc.gnu.org/ml/gcc-patches/2014-10/msg03318.html Will ping it today
[Bug ipa/63814] g++.dg/ipa/pr61160-1.C fails with -m32 on darwin14
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63814 --- Comment #11 from Igor Zamyatin izamyatin at gmail dot com --- Will take a look. Thanks!
[Bug sanitizer/63846] c-c++-common/asan/misalign-[12].c fails on i?86 with -fpic
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63846 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #1 from Igor Zamyatin izamyatin at gmail dot com --- According to this - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63534#c48 it's the same as PR63845 (for which I've already posted a patch)
[Bug ipa/63814] g++.dg/ipa/pr61160-1.C fails with -m32 on darwin14
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63814 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||evstupac at gmail dot com, ||izamyatin at gmail dot com --- Comment #7 from Igor Zamyatin izamyatin at gmail dot com --- So, is this a compile-time failure or a runtime failure (or both for the two tests)?
[Bug target/63534] [5 Regression] Bootstrap failure on x86_64/i686-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63534 --- Comment #67 from Igor Zamyatin izamyatin at gmail dot com --- Posted a patch here - http://gcc.gnu.org/ml/gcc-patches/2014-10/msg03318.html The discussion has currently stopped here - http://gcc.gnu.org/ml/gcc-patches/2014-11/msg00320.html
[Bug bootstrap/63622] [5.0 Regression] Bootstrap fails on x86_64-apple-darwin1[34] after revision r216305
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63622 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #20 from Igor Zamyatin izamyatin at gmail dot com --- This is mentioned here - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63534#c9 Fix for this is under review, start of the discussion is here - http://gcc.gnu.org/ml/gcc-patches/2014-10/msg01727.html
[Bug target/63534] [5 Regression] Bootstrap failure on x86_64/i686-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63534 --- Comment #50 from Igor Zamyatin izamyatin at gmail dot com ---
In addition r216154 breaks a lot of asan tests with -m32: see https://gcc.gnu.org/ml/gcc-testresults/2014-10/msg02834.html

Could you please try the following patch?

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 5580ea8..508db5d 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -1715,6 +1715,9 @@ expand_used_vars (void)
 
   init_vars_expansion ();
 
+  if (targetm.use_pseudo_pic_reg ())
+    pic_offset_table_rtx = gen_reg_rtx (Pmode);
+
   hash_map<tree, tree> ssa_name_decls;
   for (i = 0; i < SA.map->num_partitions; i++)
     {
diff --git a/gcc/function.c b/gcc/function.c
index ee229ad..dab691d 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -3464,11 +3464,6 @@ assign_parms (tree fndecl)
 
   fnargs.release ();
 
-  /* Initialize pic_offset_table_rtx with a pseudo register
-     if required.  */
-  if (targetm.use_pseudo_pic_reg ())
-    pic_offset_table_rtx = gen_reg_rtx (Pmode);
-
   /* Output all parameter conversion instructions (possibly including calls)
      now that all parameters have been copied out of hard registers.  */
   emit_insn (all.first_conversion_insn);
[Bug rtl-optimization/63620] RELOAD lost SET_GOT dependency on Darwin
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63620 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #5 from Igor Zamyatin izamyatin at gmail dot com ---
(In reply to Uroš Bizjak from comment #3)
> Confirmed. This will affect all SSE targets.
Have you managed to reproduce the issue on i686?
> The patch at Comment #2 will just paper over the issue.
Yeah, it was just a temporary fix for Darwin folks
[Bug target/63534] [5 Regression] Bootstrap failure on x86_64/i686-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63534 --- Comment #49 from Igor Zamyatin izamyatin at gmail dot com --- Testing a patch to fix asan failures
[Bug target/63615] New: [i686][5 Regression] FAIL: gcc.target/i386/addr-sel-1.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63615
Bug ID: 63615
Summary: [i686][5 Regression] FAIL: gcc.target/i386/addr-sel-1.c
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: izamyatin at gmail dot com
Target: i686

There is a new failure for gcc.target/i386/addr-sel-1.c on i686 after r216462.

For this test we now have in the asm file

        movl    4(%esp), %eax
        leal    1(%eax), %edx
        movsbl  a(%edx), %ecx
        movsbl  b(%edx), %eax
        addl    %ecx, %eax
        ret

instead of the earlier

        movl    4(%esp), %eax
        movsbl  a+1(%eax), %ecx
        movsbl  b+1(%eax), %eax
        addl    %ecx, %eax
        ret

It seems the change restricts some optimizations in postreload. Now in .postreload there are

(insn 6 21 8 2 (parallel [
      (set (reg:SI 1 dx [orig:83 D.1733 ] [83])
        (plus:SI (reg:SI 0 ax [96])
          (const_int 1 [0x1])))
      (clobber (reg:CC 17 flags))
    ]) ../gcc/testsuite/gcc.target/i386/addr-sel-1.c:13 220 {*addsi_1}
  (nil))
(insn 8 6 10 2 (set (reg:SI 2 cx [orig:93 D.1733 ] [93])
    (sign_extend:SI (mem/j:QI (plus:SI (reg:SI 1 dx [orig:83 D.1733 ] [83])
          (symbol_ref:SI (a) var_decl 0x7f7b4a70fc60 a)) [0 a S1 A8]))) ../gcc/testsuite/gcc.target/i386/addr-sel-1.c:13 148 {extendqisi2}
  (nil))

while earlier there were

(insn 6 21 8 2 (parallel [
      (set (reg:SI 1 dx [orig:83 D.1733 ] [83])
        (plus:SI (reg:SI 0 ax [96])
          (const_int 1 [0x1])))
      (clobber (reg:CC 17 flags))
    ]) addr-sel-1-good.c:13 220 {*addsi_1}
  (nil))
(insn 8 6 10 2 (set (reg:SI 2 cx [orig:93 D.1733 ] [93])
    (sign_extend:SI (mem/j:QI (plus:SI (reg:SI 0 ax [96])
          (const:SI (plus:SI (symbol_ref:SI (a) var_decl 0x7fbf3c296c60 a)
              (const_int 1 [0x1] [0 a S1 A8]))) addr-sel-1-good.c:13 148 {extendqisi2}
  (nil))

so insn 6 is not needed.
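For readers without the testsuite at hand, source of roughly the following shape would produce the two sign-extending byte loads seen above. This is a guess inferred from the assembly, not a copy of addr-sel-1.c; the array initializers are added here only so that the result can be checked.

```c
/* Assumed shape of the test, inferred from the assembly in the report:
   two char arrays indexed at i + 1.  The initializers are illustrative
   and not taken from the testsuite.  */
char a[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
char b[10] = { 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 };

int
f (int i)
{
  /* In the good code the "+ 1" is folded into the displacement,
     giving a+1(%eax) and b+1(%eax), so no separate leal is needed.  */
  return a[i + 1] + b[i + 1];
}
```

The point of the test is the addressing mode: the constant offset can live in the symbol displacement, so computing i + 1 into a separate register first (the leal in the new code) is redundant work.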
[Bug c/63592] Linux kernel build failure due to duplicate exported symbols
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63592 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #4 from Igor Zamyatin izamyatin at gmail dot com --- The same could be seen for 253.perlbmk and 400.perlbench tests from spec2K/2006 suites
[Bug bootstrap/63536] [5 Regression] bootstrap failed when configured with --with-cpu=slm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63536 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #6 from Igor Zamyatin izamyatin at gmail dot com --- Should be fixed now
[Bug target/63534] [5 Regression] Bootstrap failure on x86_64/i686-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63534 --- Comment #21 from Igor Zamyatin izamyatin at gmail dot com --- (In reply to Iain Sandoe from comment #20) libtool: compile: /GCC/ml/gcc-trunk-appleas/./gcc/xgcc -B/GCC/ml/gcc-trunk-appleas/./gcc/ -B/compilers/gcc-trunk/x86_64-apple-darwin12/bin/ -B/compilers/gcc-trunk/x86_64-apple-darwin12/lib/ -isystem /compilers/gcc-trunk/x86_64-apple-darwin12/include -isystem /compilers/gcc-trunk/x86_64-apple-darwin12/sys-include -DHAVE_CONFIG_H -I. -I/GCC/gcc-trunk/libquadmath -I /GCC/gcc-trunk/libquadmath/../include -g -O2 -m32 -MT math/frexpq.lo -MD -MP -MF math/.deps/frexpq.Tpo -c /GCC/gcc-trunk/libquadmath/math/frexpq.c -fno-common -DPIC -o math/.libs/frexpq.o /var/folders/tj/17r7407j14d324dzf67cnvxm000114/T//ccahZ8x6.s:68:non- relocatable subtraction expression, LC0 minus L1$pb /var/folders/tj/17r7407j14d324dzf67cnvxm000114/T//ccahZ8x6.s:68:symbol: L1$pb can't be undefined in a subtraction expression /var/folders/tj/17r7407j14d324dzf67cnvxm000114/T//ccahZ8x6.s:unknown: Undefined local symbol L1$pb make[6]: *** [math/frexpq.lo] Error 1 make[5]: *** [all] Error 2 Can we look at the rtl dumps and probably asm file?
[Bug target/63534] [5 Regression] Bootstrap failure on x86_64/i686-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63534 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #14 from Igor Zamyatin izamyatin at gmail dot com --- Thanks! That's define_insn_and_split nonlocal_goto_receiver where the issue comes from. Seems now we need to handle this split somewhat similar to the second approach in solving of the profiling issue
[Bug bootstrap/63523] [5.0 regression] gcc/cp/pt.c -Werror=format breaks bootstrap on sparc-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63523 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #1 from Igor Zamyatin izamyatin at gmail dot com --- Also on i686
[Bug c/63307] [4.9/5 Regression] Cilk+ breaks -fcompare-debug bootstrap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63307 --- Comment #5 from Igor Zamyatin izamyatin at gmail dot com --- (In reply to Jakub Jelinek from comment #4)
> I don't think so. They copy declarations, i.e. create new declarations, and the different ordering of their DECL_UID values may result in code generation differences (e.g. various other spots in the compiler sort based on DECL_UIDs, if you create them in pretty random order, you'll surely trigger some -fcompare-debug (perhaps not with current limited testsuite coverage, but with other tests).

Right, thanks for the clarification. Will prepare the whole patch then
[Bug c/63307] [4.9/5 Regression] Cilk+ breaks -fcompare-debug bootstrap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63307 --- Comment #3 from Igor Zamyatin izamyatin at gmail dot com --- (In reply to Jakub Jelinek from comment #2)
> + vec_arglist.release();
> Formatting. You could use auto_vec, perhaps with some stack allocated initial buffer if you think say 16 vector elements would be typically enough.

Is it ok to have auto_vec declaration outside the routine?

> Also, what about all the remaining 3 callbacks that create or may create decls and have the same problem? for_local_cb, wrapper_local_cb and declare_one_free_variable.

These are callbacks that seem to be safe in the sense of random ordering - perform some 1 to 1 mapping
[Bug c/63307] [4.9/5 Regression] Cilk+ breaks -fcompare-debug bootstrap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63307 --- Comment #1 from Igor Zamyatin izamyatin at gmail dot com --- Would like to ask here first - will something like following be ok:

diff --git a/gcc/c-family/cilk.c b/gcc/c-family/cilk.c
index bf549ad..f453bc5 100644
--- a/gcc/c-family/cilk.c
+++ b/gcc/c-family/cilk.c
@@ -36,6 +36,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "toplev.h"
 #include "cgraph.h"
 #include "diagnostic.h"
+#include "vec.h"
 #include "cilk.h"
 
 enum add_variable_type {
@@ -332,15 +333,23 @@ create_cilk_helper_decl (struct wrapper_data *wd)
   return fndecl;
 }
 
+typedef struct
+{
+  tree parm;
+  tree arg;
+} decl_pair;
+
+static vec<decl_pair> vec_arglist;
+
 /* A function used by walk tree to find wrapper parms.  */
 
 static bool
 wrapper_parm_cb (const void *key0, void **val0, void *data)
 {
-  struct wrapper_data *wd = (struct wrapper_data *) data;
   tree arg = * (tree *)key0;
   tree val = (tree)*val0;
   tree parm;
+  decl_pair dp;
 
   if (val == error_mark_node || val == arg)
     return true;
@@ -370,25 +379,48 @@ wrapper_parm_cb (const void *key0, void **val0, void *data)
     }
   else
     parm = val;
-  TREE_CHAIN (parm) = wd->parms;
-  wd->parms = parm;
-  wd->argtypes = tree_cons (NULL_TREE, TREE_TYPE (parm), wd->argtypes);
-  wd->arglist = tree_cons (NULL_TREE, arg, wd->arglist);
+
+  dp.parm = parm;
+  dp.arg = arg;
+  vec_arglist.safe_push(dp);
   return true;
 }
 
 /* This function is used to build a wrapper of a certain type.  */
 
+static int
+compare_decls (const void *a, const void *b)
+{
+const decl_pair* t1 = (const decl_pair*) a;
+const decl_pair* t2 = (const decl_pair*) b;
+
+return DECL_UID(t1->arg) DECL_UID(t2->arg);
+}
+
 static void
 build_wrapper_type (struct wrapper_data *wd)
 {
+  unsigned int j;
+  decl_pair * c;
   wd->arglist = NULL_TREE;
   wd->parms = NULL_TREE;
   wd->argtypes = void_list_node;
 
-  pointer_map_traverse (wd->decl_map, wrapper_parm_cb, wd);
+  vec_arglist.create (0);
+  pointer_map_traverse (wd->decl_map, wrapper_parm_cb, NULL);
   gcc_assert (wd->type != CILK_BLOCK_FOR);
 
+  vec_arglist.qsort(compare_decls);
+
+  FOR_EACH_VEC_ELT (vec_arglist, j, c)
+{
+  TREE_CHAIN (c->parm) = wd->parms;
+  wd->parms = c->parm;
+  wd->argtypes = tree_cons (NULL_TREE, TREE_TYPE (c->parm), wd->argtypes);
+  wd->arglist = tree_cons (NULL_TREE, c->arg, wd->arglist);
+}
+  vec_arglist.release();
+
   /* Now build a function.
      Its return type is void (all side effects are via explicit parameters).
      Its parameters are WRAPPER_PARMS with type WRAPPER_TYPES.

Bootstrapped successfully with GCC_COMPARE_DEBUG=1
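The idea in the patch above is to collect the (parm, arg) pairs first and then sort them by a stable key, so that the resulting order no longer depends on pointer-map traversal order. The operator between the two DECL_UID calls in compare_decls did not survive in this archive, so the sketch below does not try to reproduce it. It only shows, in plain C and under assumptions, a qsort comparator that returns a negative, zero, or positive value as qsort requires. The struct `uid_pair`, its fields, and both function names are hypothetical stand-ins, not GCC types.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a (parm, arg) pair keyed by a
   DECL_UID-like unsigned identifier. */
struct uid_pair {
    unsigned int uid;
    int payload;
};

/* Three-way comparison on the key.  Written without subtraction so
   it cannot overflow or wrap for large unsigned values. */
static int compare_by_uid(const void *a, const void *b)
{
    const struct uid_pair *p1 = a;
    const struct uid_pair *p2 = b;

    return (p1->uid > p2->uid) - (p1->uid < p2->uid);
}

/* Sort the collected pairs so later processing sees them in key
   order, regardless of the order in which they were gathered. */
void sort_pairs_by_uid(struct uid_pair *pairs, size_t n)
{
    assert(pairs != NULL || n == 0);
    if (n > 1)
        qsort(pairs, n, sizeof *pairs, compare_by_uid);
}
```

Since the keys in this sketch are assumed to be unique, the fact that qsort is not a stable sort does not affect the result.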
[Bug bootstrap/63235] building fails with --disable-bootstrap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63235 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #14 from Igor Zamyatin izamyatin at gmail dot com --- Seems, started after r215538
[Bug bootstrap/63235] building fails with --disable-bootstrap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63235 --- Comment #15 from Igor Zamyatin izamyatin at gmail dot com --- Sorry, it's r215537
[Bug other/62002] -fcilkplus switch breaks format attribute.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62002 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #2 from Igor Zamyatin izamyatin at gmail dot com --- Am I correct that adding -fcilkplus just fixes the bug somehow? I see that regular trunk g++ gives an error for

    struct foo { void bar(void *my_object, char const *, ...) __attribute__((__format__(__printf__, 2, 3))); };

which is supposed to be correct according to the docs (https://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html#Function-Attributes)
[Bug other/62002] -fcilkplus switch breaks format attribute.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62002 --- Comment #4 from Igor Zamyatin izamyatin at gmail dot com --- Right, it is mentioned explicitly in the docs. Will take a look
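The point mentioned in the docs is that for non-static C++ member functions the implicit `this` argument counts as argument 1, so the explicit arguments are numbered from 2. As a hedged illustration in plain C, where there is no implicit argument, the sketch below uses a free function with the same explicit parameters. Here `my_object` really is argument 1, the format string is argument 2, and the variadic arguments start at 3, so `(2, 3)` is the matching pair. The function `bar` as a free function, the `last_message` buffer, and `check_bar` are assumptions for illustration and are not taken from the bug report.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical buffer that records the most recent formatted text. */
static char last_message[128];

/* Free-function analogue: argument 1 is my_object, argument 2 is the
   format string, and checking of variadic arguments starts at 3. */
int bar(void *my_object, const char *fmt, ...)
    __attribute__((__format__(__printf__, 2, 3)));

int bar(void *my_object, const char *fmt, ...)
{
    va_list ap;
    int n;

    (void) my_object;   /* unused in this sketch */
    assert(fmt != NULL);

    va_start(ap, fmt);
    n = vsnprintf(last_message, sizeof last_message, fmt, ap);
    va_end(ap);
    return n;
}

/* Usage check: the call type-checks against the format string. */
void check_bar(void)
{
    assert(bar(NULL, "%d-%s", 7, "x") == 3);
    assert(last_message[0] == '7' && last_message[3] == '\0');
}
```

For the member function in the report, counting `this` as argument 1 puts the format string at index 3 and the first variadic argument at index 4.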
[Bug bootstrap/62009] New: Bootstrap failure on i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62009 Bug ID: 62009 Summary: Bootstrap failure on i686 Product: gcc Version: 4.10.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: bootstrap Assignee: unassigned at gcc dot gnu.org Reporter: izamyatin at gmail dot com Target: i686 After r213517 | tbsaunde | 2014-08-02 15:34:54 +0400 (Sat, 02 Aug 2014) | 27 lines convert many uses of pointer_map to hash_map gcc/c-family/ * cilk.c: Use hash_map instead of pointer_map. gcc/c/ * c-typeck.c: Use hash_map instead of pointer_map. gcc/cp/ * optimize.c, semantics.c: Use hash_map instead of pointer_map. gcc/ * hash-map.h (default_hashmap_traits::mark_key_deleted): Fix cast. (hash_map::remove): New method. (hash_map::traverse): New method. * cgraph.h, except.c, except.h, gimple-ssa-strength-reduction.c, ipa-utils.c, lto-cgraph.c, lto-streamer.h, omp-low.c, predict.c, tree-cfg.c, tree-cfgcleanup.c, tree-eh.c, tree-eh.h, tree-inline.c, tree-inline.h, tree-nested.c, tree-sra.c, tree-ssa-loop-im.c, tree-ssa-loop-ivopts.c, tree-ssa-reassoc.c, tree-ssa-structalias.c, tree-ssa.c, tree-ssa.h, var-tracking.c: Use hash_map instead of pointer_map. 
bootstrap for the following configuration CC=gcc -m32 CXX=g++ -m32 ../src-trunk/configure --enable-clocale=gnu --with-system-zlib --enable-shared --with-demangler-in-ld i686-linux --with-fpmath=sse --enable-languages=c,c++,fortran,java,lto,objc fails like: ../../src-trunk/gcc/dwarf2cfi.c:706:1: internal compiler error: in splice, at vec.h:844 def_cfa_0 (dw_cfa_location *old_cfa, dw_cfa_location *new_cfa) ^ 0x8c0c15f vec_edge_var_map, va_heap, vl_embed::splice(vec_edge_var_map, va_heap, vl_embed) ../../src-trunk/gcc/vec.h:844 0x8c0bd72 vec_edge_var_map, va_heap, vl_ptr::splice(vec_edge_var_map, va_heap, vl_ptr) ../../src-trunk/gcc/vec.h:1495 0x8c0b783 vec_edge_var_map, va_heap, vl_ptr::safe_splice(vec_edge_var_map, va_heap, vl_ptr) ../../src-trunk/gcc/vec.h:1512 0x8c06770 redirect_edge_var_map_dup(edge_def*, edge_def*) ../../src-trunk/gcc/tree-ssa.c:113 0x859bbac redirect_edge_succ_nodup(edge_def*, basic_block_def*) ../../src-trunk/gcc/cfghooks.c:437 0x8c068e3 ssa_redirect_edge(edge_def*, basic_block_def*) ../../src-trunk/gcc/tree-ssa.c:173 0x8a2c86b gimple_try_redirect_by_replacing_jump ../../src-trunk/gcc/tree-cfg.c:5419 0x8a2c90b gimple_redirect_edge_and_branch ../../src-trunk/gcc/tree-cfg.c:5450 0x859b940 redirect_edge_and_branch(edge_def*, basic_block_def*) ../../src-trunk/gcc/cfghooks.c:356 0x8a38335 remove_forwarder_block ../../src-trunk/gcc/tree-cfgcleanup.c:445 0x8a38b5b cleanup_tree_cfg_bb ../../src-trunk/gcc/tree-cfgcleanup.c:633 0x8a38c57 cleanup_tree_cfg_1 ../../src-trunk/gcc/tree-cfgcleanup.c:675 0x8a38d88 cleanup_tree_cfg_noloop ../../src-trunk/gcc/tree-cfgcleanup.c:731 0x8a38ea2 cleanup_tree_cfg() ../../src-trunk/gcc/tree-cfgcleanup.c:786 0x8906f96 execute_function_todo ../../src-trunk/gcc/passes.c:1702 0x8906426 do_per_function ../../src-trunk/gcc/passes.c:1476 0x89072c5 execute_todo ../../src-trunk/gcc/passes.c:1806 Please submit a full bug report,
[Bug middle-end/57541] [Cilkplus]: internal compiler error: in gimplify_expr, at gimplify.c:7809
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57541 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #14 from Igor Zamyatin izamyatin at gmail dot com --- I have a bunch of changes in my local tree that should fix this issue. Hope to send them out next week
[Bug middle-end/61734] [4.10 Regression] Regression in ABS_EXPR recognition
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61734 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #6 from Igor Zamyatin izamyatin at gmail dot com --- Eric, do you have any plans regarding this issue?
[Bug tree-optimization/61576] [4.10 Regression] wrong code at -O3 on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61576 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #5 from Igor Zamyatin izamyatin at gmail dot com --- Patch is posted at http://gcc.gnu.org/ml/gcc-patches/2014-06/msg01866.html
[Bug c/61191] cilkplus ICE on syntax error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61191 --- Comment #6 from Igor Zamyatin izamyatin at gmail dot com --- Thanks!
[Bug lto/61256] [4.10 regression] Building spec2000/252.eon with LTO got a compfail after r210522
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61256 --- Comment #1 from Igor Zamyatin izamyatin at gmail dot com --- Fixed by r210672
[Bug c/61191] cilkplus ICE on syntax error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61191 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #4 from Igor Zamyatin izamyatin at gmail dot com --- Yeah, I think some more options for the test should be added
[Bug lto/61256] New: [4.10 regression] Building spec2000/252.eon with LTO got a compfail after r210522
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61256 Bug ID: 61256 Summary: [4.10 regression] Building spec2000/252.eon with LTO got a compfail after r210522 Product: gcc Version: 4.10.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: lto Assignee: unassigned at gcc dot gnu.org Reporter: izamyatin at gmail dot com Linking on x86 as follows g++ -m64 -Ofast -flto -funroll-loops -m64 -Ofast -flto -funroll-loops -DSPEC_CPU2000_LP64 ... gives lto1: internal compiler error: in gimple_get_virt_method_for_vtable, at gimple-fold.c:3276 0x730833 gimple_get_virt_method_for_vtable(long, tree_node*, unsigned long, bool*) ../../gcc/gimple-fold.c:3276 0x730a23 gimple_get_virt_method_for_binfo(long, tree_node*, bool*) ../../gcc/gimple-fold.c:3377 0x77a133 record_target_from_binfo ../../gcc/ipa-devirt.c:867 0x77a30f record_target_from_binfo ../../gcc/ipa-devirt.c:884 0x77a9bb possible_polymorphic_call_targets_1 ../../gcc/ipa-devirt.c:931 0x77e609 possible_polymorphic_call_targets(tree_node*, long, ipa_polymorphic_call_context, bool*, void**, int*) ../../gcc/ipa-devirt.c:1743 0x7a46f9 possible_polymorphic_call_targets ../../gcc/ipa-utils.h:121 0x7a46f9 walk_polymorphic_call_targets ../../gcc/ipa.c:177 0x7a46f9 symtab_remove_unreachable_nodes(bool, _IO_FILE*) ../../gcc/ipa.c:407 0x858ec7 execute_todo ../../gcc/passes.c:1843 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See http://gcc.gnu.org/bugs.html for instructions. lto-wrapper: g++ returned 1 exit status /usr/bin/ld: lto-wrapper failed collect2: error: ld returned 1 exit status specmake: *** [eon] Error 1 Also 471.omnetpp from spec2006 fails with the same error
[Bug target/60882] New: [ARM] Execution fail on spec2K/197.parser
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60882 Bug ID: 60882 Summary: [ARM] Execution fail on spec2K/197.parser Product: gcc Version: 4.10.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: izamyatin at gmail dot com Looks like the infinite recursion of read_dict.c/insert_list routine Options: -Ofast -funroll-loops -flto -marm -mcpu=cortex-a15 -mfloat-abi=hard -mfpu=neon Compiler: Target: arm-linux-gnueabihf Configured with: /configure --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin --enable-objc-gc --disable-sjlj-exceptions --with-arch=armv7-a --with-float=hard --with-mode=thumb --with-fpu=vfpv3-d16 --target=arm-linux-gnueabihf --host=arm-linux-gnueabihf --build=arm-linux-gnueabihf --with-multiarch-defaults=arm-linux-gnueabihf --enable-bootstrap=no --enable-languages=c,c++,fortran --enable-shared --enable-linker-build-id --disable-werror Thread model: posix gcc version 4.10.0 20140416 (experimental) (GCC) I believe 4.9.0 also has this fail
[Bug middle-end/60469] simple cilk plus program ICEs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60469 --- Comment #10 from Igor Zamyatin izamyatin at gmail dot com --- (In reply to Jakub Jelinek from comment #9) (In reply to H.J. Lu from comment #8) (In reply to H.J. Lu from comment #7) (In reply to Igor Zamyatin from comment #6) Yes, I was going to post it after complete testing You should set DECL_SEEN_IN_BIND_EXPR_P when setting DECL_CONTEXT, similar to gimple_add_tmp_var. Or we can use create_tmp_var. That is much better idea, it will handle tons of other things, like setting DECL_ARTIFICIAL/DECL_IGNORED_P flags etc. In C++ FE, cp-array-notation.c apparently uses get_temp_regvar, which is also fine (but only defined in C++ FE). Yes, I tried create_tmp_var but it was undefined so I thought it's not a good idea... Will try further with it then
[Bug middle-end/60469] simple cilk plus program ICEs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60469 --- Comment #12 from Igor Zamyatin izamyatin at gmail dot com --- Thanks, will post a patch after the testing
[Bug middle-end/60469] simple cilk plus program ICEs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60469 --- Comment #4 from Igor Zamyatin izamyatin at gmail dot com --- Following works for me and shows no new errors in regtesting. Not sure it is a good idea though... diff --git a/gcc/c/c-array-notation.c b/gcc/c/c-array-notation.c index 6a5631c..d7c6772 100644 --- a/gcc/c/c-array-notation.c +++ b/gcc/c/c-array-notation.c @@ -284,6 +284,7 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree *new_var) { an_loop_info[ii].var = build_decl (location, VAR_DECL, NULL_TREE, integer_type_node); + DECL_CONTEXT (an_loop_info[ii].var) = current_function_decl; an_loop_info[ii].ind_init = build_modify_expr (location, an_loop_info[ii].var, TREE_TYPE (an_loop_info[ii].var), NOP_EXPR, @@ -783,6 +784,7 @@ build_array_notation_expr (location_t location, tree lhs, tree lhs_origtype, { lhs_an_loop_info[ii].var = build_decl (location, VAR_DECL, NULL_TREE, integer_type_node); +DECL_CONTEXT (lhs_an_loop_info[ii].var) = current_function_decl; lhs_an_loop_info[ii].ind_init = build_modify_expr (location, lhs_an_loop_info[ii].var, TREE_TYPE (lhs_an_loop_info[ii].var), NOP_EXPR, @@ -795,6 +797,7 @@ build_array_notation_expr (location_t location, tree lhs, tree lhs_origtype, integer. 
*/ rhs_an_loop_info[ii].var = build_decl (location, VAR_DECL, NULL_TREE, integer_type_node); + DECL_CONTEXT (rhs_an_loop_info[ii].var) = current_function_decl; rhs_an_loop_info[ii].ind_init = build_modify_expr (location, rhs_an_loop_info[ii].var, TREE_TYPE (rhs_an_loop_info[ii].var), NOP_EXPR, @@ -972,6 +975,7 @@ fix_conditional_array_notations_1 (tree stmt) { an_loop_info[ii].var = build_decl (location, VAR_DECL, NULL_TREE, integer_type_node); + DECL_CONTEXT (an_loop_info[ii].var) = current_function_decl; an_loop_info[ii].ind_init = build_modify_expr (location, an_loop_info[ii].var, TREE_TYPE (an_loop_info[ii].var), NOP_EXPR, @@ -1069,6 +1073,7 @@ fix_array_notation_expr (location_t location, enum tree_code code, { an_loop_info[ii].var = build_decl (location, VAR_DECL, NULL_TREE, integer_type_node); + DECL_CONTEXT (an_loop_info[ii].var) = current_function_decl; an_loop_info[ii].ind_init = build_modify_expr (location, an_loop_info[ii].var, TREE_TYPE (an_loop_info[ii].var), NOP_EXPR, @@ -1165,6 +1170,7 @@ fix_array_notation_call_expr (tree arg) { an_loop_info[ii].var = build_decl (location, VAR_DECL, NULL_TREE, integer_type_node); + DECL_CONTEXT (an_loop_info[ii].var) = current_function_decl; an_loop_info[ii].ind_init = build_modify_expr (location, an_loop_info[ii].var, TREE_TYPE (an_loop_info[ii].var), NOP_EXPR, location, diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 7441784..b61a995 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -1732,6 +1732,7 @@ gimplify_var_or_parm_decl (tree *expr_p) be really nice if the front end wouldn't leak these at all. Currently the only known culprit is C++ destructors, as seen in g++.old-deja/g++.jason/binding.C. */ +#if 0 if (TREE_CODE (decl) == VAR_DECL !DECL_SEEN_IN_BIND_EXPR_P (decl) !TREE_STATIC (decl) !DECL_EXTERNAL (decl) @@ -1740,6 +1741,7 @@ gimplify_var_or_parm_decl (tree *expr_p) gcc_assert (seen_error ()); return GS_ERROR; } +#endif /* When within an OpenMP context, notice uses of variables. 
*/ if (gimplify_omp_ctxp omp_notice_variable (gimplify_omp_ctxp, decl, true))
[Bug middle-end/60469] simple cilk plus program ICEs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60469 --- Comment #6 from Igor Zamyatin izamyatin at gmail dot com --- Yes, I was going to post it after complete testing
[Bug middle-end/60467] ICE with -fcilkplus
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60467 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #2 from Igor Zamyatin izamyatin at gmail dot com --- I've also looked at this. I think this one should go

diff --git a/gcc/c-family/cilk.c b/gcc/c-family/cilk.c
index 6a7bf4f..bf549ad 100644
--- a/gcc/c-family/cilk.c
+++ b/gcc/c-family/cilk.c
@@ -99,7 +99,6 @@ cilk_set_spawn_marker (location_t loc, tree fcall)
        it.  */
     return false;
   else if (TREE_CODE (fcall) != CALL_EXPR
-           && TREE_CODE (fcall) != FUNCTION_DECL
            /* In C++, TARGET_EXPR is generated when we have an overloaded
               '=' operator.  */
            && TREE_CODE (fcall) != TARGET_EXPR)
[Bug middle-end/60469] simple cilk plus program ICEs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60469 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #2 from Igor Zamyatin izamyatin at gmail dot com --- ICE could be seen even for

    void foo() { asm("" ::: "memory"); }

    #define ALEN 1024

    int main(int argc, char* argv[])
    {
      int b[ALEN];
      b[:] = 100;
      _Cilk_spawn foo();
      return 0;
    }

The bad VAR_DECL here is the initial variable for the loop that is created while expanding the array notation expression. The problem seems to be in how this var_decl is created. In the C++ case create_temp_var is used, which sets up DECL_CONTEXT. In the C case build_decl is used, which does not fill in DECL_CONTEXT
[Bug middle-end/60682] [4.9 Regression][OpenMP] ICE on an assignment of local variable inside SIMD loop
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60682 --- Comment #3 from Igor Zamyatin izamyatin at gmail dot com --- Thanks for the quick fix!
[Bug middle-end/60682] New: [4.9 Regression][OpenMP] ICE on an assignment of local variable inside SIMD loop
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60682 Bug ID: 60682 Summary: [4.9 Regression][OpenMP] ICE on an assignment of local variable inside SIMD loop Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: izamyatin at gmail dot com

It seems r207629 (the fix for PR59984) introduces this issue. The test is

    class V3 {
    public:
      float v[1];
      V3() {}
      V3(const V3& x) { v[0] = x.v[0]; }
    };

    struct CCC {
      V3 a[16];
    };

    void foo(int num, CCC cc) {
    #pragma omp simd
      for(int i = 0; i < num; ++i) {
        V3 v3;
        cc.a[i] = v3;
      }
    }

Compilation flags: -O1 -fopenmp

ICE:

    internal compiler error: in create_tmp_var, at gimple-expr.c:506
    cc.a[i] = v3;
    ^
    0x97ab03 create_tmp_var(tree_node*, char const*)

Note that privatizing v3 makes the ICE disappear.
[Bug middle-end/60586] New: [Cilk+] Parameters evaluation happens inside spawn worker
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60586 Bug ID: 60586 Summary: [Cilk+] Parameters evaluation happens inside spawn worker Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: izamyatin at gmail dot com

The following test (compiled with e.g. -O2 -fcilkplus -lcilkrts)

    #include <stdio.h>
    #include <cilk/cilk.h>
    #include <unistd.h>

    int noop(int x) { return x; }

    int post_increment(int *x)
    {
      sleep(1);
      return (*x)++;
    }

    int main(int argc, char *argv[])
    {
      int m = 5;
      int n = m;
      int r = cilk_spawn noop(post_increment(&n));
      int n2 = n;
      cilk_sync;
      printf("After sync: m = %d, n = %d, r = %d, n2 = %d\n", m, n, r, n2);
      if (r != m || n2 != m + 1)
        printf("FAILED\n");
      else
        printf("PASSED\n");
      return 0;
    }

outputs

    After sync: m = 5, n = 6, r = 5, n2 = 5
    FAILED

That happens because post_increment is called inside the spawn worker, which is incorrect.
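The ordering that the test in this report expects can be modelled in plain serial C, without any Cilk runtime: the argument of the spawned call is evaluated by the parent first, and only then does the continuation read `n`. The sketch below is an assumption-based model of that expected ordering, not a description of how the Cilk Plus implementation lowers a spawn. The struct `spawn_result` and the function `spawn_serial_model` are hypothetical names, and the `sleep(1)` from the original test is left out.

```c
#include <assert.h>

/* Hypothetical record of the values the original test prints. */
struct spawn_result {
    int n;
    int r;
    int n2;
};

static int noop(int x) { return x; }

static int post_increment(int *x) { return (*x)++; }

/* Serial model of: int r = cilk_spawn noop(post_increment(&n));
   The argument is computed in the parent before the call that
   would be spawned, so the continuation sees the incremented n. */
struct spawn_result spawn_serial_model(int m)
{
    struct spawn_result res;
    int n = m;

    int arg = post_increment(&n);   /* evaluated by the parent */
    assert(n == m + 1);             /* increment already visible */

    int r = noop(arg);              /* stands in for the spawned call */
    int n2 = n;                     /* continuation reads n */

    res.n = n;
    res.r = r;
    res.n2 = n2;
    return res;
}
```

With `m = 5` this model yields `r == 5` and `n2 == 6`, which is the condition the test checks before printing PASSED.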
[Bug c++/60189] ICE with invalid use of _Cilk_sync
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60189 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #2 from Igor Zamyatin izamyatin at gmail dot com --- Now on trunk I see the following cilk_test_60189.c: In function ‘foo’: cilk_test_60189.c:3:16: error: expected ‘;’ before ‘return’ _Cilk_sync return; ^ cilk_test_60189.c:3:16: error: expected ‘_Cilk_spawn’ before ‘_Cilk_sync’
[Bug c++/60082] Certain Cilk keywords executable Hanging for -O1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60082 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #6 from Igor Zamyatin izamyatin at gmail dot com --- Fixed after r207623?
[Bug c++/60189] ICE with invalid use of _Cilk_sync
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60189 --- Comment #3 from Igor Zamyatin izamyatin at gmail dot com --- Ah, g++ gives the ICE :(
[Bug tree-optimization/53787] Possible IPA-SRA / IPA-CP improvement
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53787 --- Comment #18 from Igor Zamyatin izamyatin at gmail dot com --- Martin, I checked the patch and can confirm it gives necessary speedup on the test (UMTmk_1.1) Thanks!
[Bug bootstrap/60343] New: r208155 breaks bootstrap
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60343 Bug ID: 60343 Summary: r208155 breaks bootstrap Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: bootstrap Assignee: unassigned at gcc dot gnu.org Reporter: izamyatin at gmail dot com Message: ../../gcc/lra-assigns.c: In function ‘int spill_for(int, bitmap)’: ../../gcc/lra-assigns.c:901:4: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare] = LRA_MAX_CONSIDERED_RELOAD_PSEUDOS) Could be seen on x86_64, say, for ../configure --enable-clocale=gnu --with-system-zlib --enable-shared --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran,java,lto,objc
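As background on why this warning exists, the following is a small C sketch of what happens when an `int` is compared with an `unsigned int`. It is not the code at lra-assigns.c:901, which is not shown in this report; the function names are hypothetical. The conversion is written as an explicit cast so the sketch itself does not trigger -Wsign-compare.

```c
#include <assert.h>

/* What the usual arithmetic conversions do for "s < u" when s is int
   and u is unsigned int: s is converted to unsigned first, so a
   negative s becomes a very large value. */
int converted_less(int s, unsigned int u)
{
    return (unsigned int) s < u;
}

/* One way to get the mathematically expected answer: handle the
   negative case before converting. */
int safe_less(int s, unsigned int u)
{
    return s < 0 || (unsigned int) s < u;
}

/* Usage check showing where the two versions differ. */
void check_sign_compare(void)
{
    assert(converted_less(-1, 1u) == 0);  /* -1 wraps to UINT_MAX */
    assert(safe_less(-1, 1u) == 1);
    assert(safe_less(2, 3u) == 1);
}
```

With -Werror in effect during bootstrap, any such mixed comparison that is left implicit stops the build, which matches the failure quoted above.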
[Bug c/59984] OpenMP and Cilk Plus SIMD pragma makes loop incorrect
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59984 --- Comment #4 from Igor Zamyatin izamyatin at gmail dot com --- vect details show that v1.0_14 = v1 and v2.1_15 = v2 are treated as invariants: test.c:24:14: note: --vectorizing statement: v1.0_14 = v1; test.c:24:14: note: transform statement. test.c:24:14: note: transform load. ncopies = 1 test.c:24:14: note: create vector_type-pointer variable to type: vector(4) int vectorizing a pointer ref: v1 test.c:24:14: note: created vectp_v1.11_40 test.c:24:14: note: add new stmt: vect_v1.12_37 = MEM[(int *)vectp_v1.10_39]; test.c:24:14: note: hoisting out of the vectorized loop: v1.0_14 = v1; test.c:24:14: note: created new init_stmt: vect_cst_.13_7 = {v1.0_8, v1.0_8, v1.0_8, v1.0_8}; test.c:24:14: note: --vectorizing statement: v2.1_15 = v2; test.c:24:14: note: transform statement. test.c:24:14: note: transform load. ncopies = 1 test.c:24:14: note: create vector_type-pointer variable to type: vector(4) int vectorizing a pointer ref: v2 test.c:24:14: note: created vectp_v2.15_1 test.c:24:14: note: add new stmt: vect_v2.16_60 = MEM[(int *)vectp_v2.14_58]; test.c:24:14: note: hoisting out of the vectorized loop: v2.1_15 = v2; test.c:24:14: note: created new init_stmt: vect_cst_.17_62 = {v2.1_61, v2.1_61, v2.1_61, v2.1_61}; Step for both loads determined as 0. Seems support for such case should be explicitly added
[Bug c/59984] OpenMP and Cilk Plus SIMD pragma makes loop incorrect
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59984 --- Comment #3 from Igor Zamyatin izamyatin at gmail dot com --- Vectorizer dump snippet for main: foo.simdclone.0 (vect__12.7_3, vect_cst_.8_53, vect_cst_.8_53, vect_cst_.9_51, vect_cst_.9_51); GIMPLE_NOP vect_v1.12_37 = MEM[(int *)vectp_v1.10_39]; (1) v1.0_14 = v1; vect_v2.16_60 = MEM[(int *)vectp_v2.14_58]; (2) v2.1_15 = v2; vect__16.18_63 = vect_cst_.13_7 * vect_cst_.17_62; --- constants instead of _16 = v1.0_14 * v2.1_15; vect_v1.12_37 and MEM[(int *)vectp_a.19_65] = vect__16.18_63; vect_v2.16_60 Then DCE destroys (1) and (2) and later LIM hoists the multiplication away from the loop.
[Bug tree-optimization/59597] [4.9 Regression] Performance degradation on Coremark after r205074
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59597 --- Comment #4 from Igor Zamyatin izamyatin at gmail dot com --- That would be great, thanks in advance!
[Bug target/59379] [4.9 Regression] gomp_init_num_threads is compiled into an infinite loop with --with-arch=corei7 --with-cpu=slm
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59379 --- Comment #13 from Igor Zamyatin izamyatin at gmail dot com --- I meant that with 3-stage gcc of r204980 testcase from the attachment was compiled and ran successfully, i.e. no infinite loop. Currently debugging shows that routine mul_double_wide_with_sign (which is actually inlined into aff_combination_scale) from double_int.c is miscompiled: in one case for a bad revision new_coef got value which is different from the value for new_coef for r204980
[Bug target/59379] [4.9 Regression] gomp_init_num_threads is compiled into an infinite loop with --with-arch=corei7 --with-cpu=slm
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59379 --- Comment #14 from Igor Zamyatin izamyatin at gmail dot com --- I meant new_coef from aff_combination_scale
[Bug target/59379] [4.9 Regression] gomp_init_num_threads is compiled into an infinite loop with --with-arch=corei7 --with-cpu=slm
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59379 --- Comment #10 from Igor Zamyatin izamyatin at gmail dot com --- I could build profiled bootstrap for r204980 successfully
[Bug tree-optimization/59597] New: Performance degradation on Coremark after r205074
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59597 Bug ID: 59597 Summary: Performance degradation on Coremark after r205074 Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: izamyatin at gmail dot com Target: x86 Created attachment 31510 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31510&action=edit reduced test Degradation could be seen at -Ofast for the attached test, which is similar to the Coremark code. It seems that jump threading here performs unnecessary node duplication and as a result if-conversion doesn't happen.
[Bug tree-optimization/54742] Switch elimination in FSM loop
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54742 --- Comment #34 from Igor Zamyatin izamyatin at gmail dot com --- Done - http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59597
[Bug tree-optimization/59591] New: ICE in vect_get_vec_def_for_stmt_copy, at tree-vect-stmts.c:156 for -march=core-avx2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59591 Bug ID: 59591 Summary: ICE in vect_get_vec_def_for_stmt_copy, at tree-vect-stmts.c:156 for -march=core-avx2 Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: izamyatin at gmail dot com Target: x86 Started after r206069 and reproduced on 481.wrf from spec2006 Reduced testcase attached Options for reproducing gfortran -O2 -ftree-vectorize -march=core-avx2 -c