[Bug middle-end/83653] [6/7/8 Regression] GCC fails to remove a can't-happen call on ia64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83653

Aldy Hernandez changed:

           What    |Removed                     |Added
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2018-01-11
                 CC|                            |aldyh at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #4 from Aldy Hernandez ---
I can reproduce this on mainline, however the testcase is suspect.

I see that page_ref_sub() is defined as:

    int __ia64_asr_i = ((nr))
    ...
    if (__ia64_asr_i == xxx)
    else if (__ia64_asr_i == yyy)
    else if (__ia64_asr_i == yyy)
    etc
    else
      _tmp = __bad_increment_for_ia64_fetch_and_add();

The thing I see is that NR doesn't seem like an inlineable constant when passed from the caller:

    nr = 1UL << compound_order(page)
    ...
    page_ref_sub(page, nr)

because:

    unsigned int compound_order(struct page *page)
    {
      if (!PageHead(page))
        return 0;
      return page[1].compound_order;
    }

And sure enough...after early inlining, both compound_order and page_ref_sub are inlined into shmem_add_to_page_cache and we can see:

    _117 = MEM[(struct page *)page_49(D) + 56B].D.16951.D.16950.compound_order;

There's no way the compiler can know that _117 is a known constant if it's reading the value from memory.

OTOH, the other *_bad* thinggies do get inlined correctly because they depend on sizeof(stuff), whose size can be determined at compile time.

Matthew, could you double check here? Maybe I missed something, but perhaps a reduced testcase would help analyze better (at least for me :)).
[Bug tree-optimization/81703] memcpy folding defeats strlen optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81703

--- Comment #2 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Thu Jan 11 04:37:48 2018
New Revision: 256475

URL: https://gcc.gnu.org/viewcvs?rev=256475&root=gcc&view=rev
Log:
2018-01-11  Martin Sebor
            Prathamesh Kulkarni

        PR tree-optimization/83501
        PR tree-optimization/81703
        * tree-ssa-strlen.c (get_string_cst): Rename...
        (get_string_len): ...to this. Handle global constants.
        (handle_char_store): Adjust.

testsuite/
        * gcc.dg/strlenopt-39.c: New test-case.
        * gcc.dg/pr81703.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.dg/pr81703.c
    trunk/gcc/testsuite/gcc.dg/strlenopt-39.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-strlen.c
[Bug target/83781] [8 Regression] Bootstrap failed on x86 with --with-arch=corei7 --with-cpu=corei7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83781

--- Comment #8 from Martin Sebor ---
Author: msebor
Date: Thu Jan 11 05:13:57 2018
New Revision: 256477

URL: https://gcc.gnu.org/viewcvs?rev=256477&root=gcc&view=rev
Log:
PR tree-optimization/83781 - Bootstrap failed on x86 with --with-arch=corei7 --with-cpu=corei7

gcc/ChangeLog:
        * gimple-fold.c (get_range_strlen): Avoid treating arrays of pointers
        as string arrays.

gcc/testsuite/ChangeLog:
        * gcc.dg/strlenopt-42.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/strlenopt-42.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/gimple-fold.c
    trunk/gcc/testsuite/ChangeLog
[Bug c/83782] New: Inconsistent address for hidden ifunc in a shared library
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83782

            Bug ID: 83782
           Summary: Inconsistent address for hidden ifunc in a shared library
           Product: gcc
           Version: 7.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rafael.espindola at gmail dot com
  Target Milestone: ---

Created attachment 43092
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43092&action=edit
testcase

If a function with hidden visibility is implemented with an ifunc, different translation units have different opinions as to what its address is.

The translation unit implementing the function gets its address with

    movq foo@GOTPCREL(%rip), %rax

That is, it finds the address of the actual function after it is selected.

Any other translation unit will use

    leaq foo(%rip), %rax

which the linker translates to the address of the plt.

The net result is that the attached program finds two addresses for the same function:

    $ ./t
    0x7fc5f331360a 0x7fc5f3313510

Given the desire to allow ifuncs to be used by changing just the implementation (not the declaration), I think the only solution is to use the plt address in both translation units.
[Bug tree-optimization/83784] New: Missed optimization with bitfield
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83784

            Bug ID: 83784
           Summary: Missed optimization with bitfield
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: daniel.santos at pobox dot com
  Target Milestone: ---

Created attachment 43095
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43095&action=edit
test case

The layout of bitfields in memory is, of course, undefined in the C standard and is implementation-dependent. But when I happen to guess correctly how gcc will lay it out, I would like these pack and unpack functions to compile out. I'm only doing this because I need to know which 32-bit portion of a 64-bit value holds one of the fields (for futex operations) and bitfields are syntactically easier to work with. But due to this flaw, I have to go back to shifting, ANDing, ORing, etc.

The attached test case is probably not as simple as it could be as I'm testing both 32 and 64-bit code on x86, but the below is probably a decent summary (for 64-bits):

    union u {
      unsigned long ulong_val;
      struct {
        unsigned long a:4;
        unsigned long b:60;
      };
    };

    union u pack(union u in)
    {
      union u ret;
      ret.ulong_val |= in.b;
      ret.ulong_val <<= 4;
      ret.ulong_val |= in.a;
      return ret;
    }

The above pack function compiles into the no-op I would expect:

    pack:
    .LFB12:
            .cfi_startproc
            movq %rdi, %rax
            ret
            .cfi_endproc

But if I use three bitfields, my pack function is no longer a no-op:

    union u {
      unsigned long ulong_val;
      struct {
        unsigned long a:4;
        unsigned long b:30;
        unsigned long c:30;
      };
    };

    union u pack( union u in )
    {
      union u ret;
      ret.ulong_val = in.c;
      ret.ulong_val <<= 30;
      ret.ulong_val |= in.b;
      ret.ulong_val <<= 4;
      ret.ulong_val |= in.a;
      return ret;
    }

And here's the output (with hex immediates for ANDs):

    pack:
    .LFB11:
            .cfi_startproc
            movq %rdi, %rax
            movq %rdi, %rdx
            andl $0xf, %edi
            shrq $34, %rax
            shrq $4, %rdx
            salq $30, %rax
            andl $0x3fff, %edx
            orq %rdx, %rax
            salq $4, %rax
            orq %rdi, %rax
            ret
            .cfi_endproc

Possibly related to bug #15596 and maybe even a duplicate of bug #35363, but I'm uncertain. I have only tested on gcc 5.4.0 and 8 from git so far and only x86, but I'm going to *guess* this is a tree-optimization issue and not the x86 backend.
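For readers following along, the "shifting, ANDing, ORing" fallback the reporter mentions could look roughly like the sketch below. It is not taken from the attached test case: the names `struct fields`, `pack_fields` and `unpack_fields` are made up for illustration, and the field order (a in the low 4 bits, then b, then c) simply mirrors the shifts in the three-field pack() above.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths taken from the three-field union in the report. */
enum { A_BITS = 4, B_BITS = 30, C_BITS = 30 };

/* Hypothetical unpacked representation; not part of the original test case. */
struct fields {
    uint64_t a, b, c;
};

/* Mask with the low `bits` bits set (bits is always < 64 here). */
static uint64_t low_mask(unsigned bits)
{
    return (UINT64_C(1) << bits) - 1;
}

/* Pack a, b and c into one 64-bit word: a in bits 0..3, b in 4..33, c in 34..63. */
uint64_t pack_fields(struct fields f)
{
    assert(f.a <= low_mask(A_BITS));
    assert(f.b <= low_mask(B_BITS));
    assert(f.c <= low_mask(C_BITS));
    return (f.c << (A_BITS + B_BITS)) | (f.b << A_BITS) | f.a;
}

/* Inverse of pack_fields(). */
struct fields unpack_fields(uint64_t v)
{
    struct fields f;
    f.a = v & low_mask(A_BITS);
    f.b = (v >> A_BITS) & low_mask(B_BITS);
    f.c = (v >> (A_BITS + B_BITS)) & low_mask(C_BITS);
    return f;
}
```

Because the three widths add up to 64, packing the result of unpacking any 64-bit value returns that value, which is the same identity the report expects the compiler to discover for the bitfield version.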
[Bug c++/83778] [8 regression] g++.dg/ext/altivec-cell-2.C fails starting with r256448
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83778 --- Comment #2 from David Malcolm --- Presumably we should simply strip the location from arg, though there are some places with: /* Call get_element_number to validate arg1 if it is a constant. */ if (TREE_CODE (arg1) == INTEGER_CST) (void) get_element_number (TREE_TYPE (arg0), arg1); which suggests any such stripping needs to happen there.
[Bug middle-end/83653] [6/7/8 Regression] GCC fails to remove a can't-happen call on ia64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83653

--- Comment #5 from Matthew Wilcox ---
Hi Aldy! Thanks for looking into this.

Yes, I agree, there's no way that GCC can know this is a constant, but that *should* have been taken care of. Please pardon me copying and pasting from the original source file rather than the preprocessed source, but I find it utterly impossible to work with the preprocessed source ...

    #define atomic_sub_return(i,v)                                      \
    ({                                                                  \
        int __ia64_asr_i = (i);                                         \
        (__builtin_constant_p(i)                                        \
         && (   (__ia64_asr_i ==  1) || (__ia64_asr_i ==   4)           \
             || (__ia64_asr_i ==  8) || (__ia64_asr_i ==  16)           \
             || (__ia64_asr_i == -1) || (__ia64_asr_i ==  -4)           \
             || (__ia64_asr_i == -8) || (__ia64_asr_i == -16)))         \
            ? ia64_fetch_and_add(-__ia64_asr_i, &(v)->counter)          \
            : ia64_atomic_sub(__ia64_asr_i, v);                         \
    })

That __builtin_constant_p() *should* have led GCC to throw up its hands, not bother checking for the +/- 1, 4, 8, 16 cases and just call ia64_atomic_sub().

Looking at the disassembly, I see a BBB bundle, indicating quite strongly to me that it is testing for all of these cases, and the __builtin_constant_p is being ... ignored? misunderstood?

Thanks!
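A reduced, architecture-neutral sketch of the pattern under discussion may help when reasoning about it. This is not the kernel code: `fast_fetch_and_add` and `slow_atomic_sub` are hypothetical stand-ins for ia64_fetch_and_add() and ia64_atomic_sub(), the macro is simplified to a plain conditional expression, and `__builtin_constant_p` is assumed to be available (GCC or a compatible compiler).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the fast path (ia64_fetch_and_add in the report). */
int fast_fetch_and_add(int inc, int *counter)
{
    assert(counter != NULL);
    *counter += inc;
    return *counter;
}

/* Hypothetical stand-in for the generic path (ia64_atomic_sub in the report). */
int slow_atomic_sub(int i, int *counter)
{
    assert(counter != NULL);
    *counter -= i;
    return *counter;
}

/* Same shape as atomic_sub_return(): the value comparisons sit behind
   __builtin_constant_p(i), so they are only meant to matter when the
   compiler can prove that i is a compile-time constant. */
#define sub_return(i, v)                                        \
    ((__builtin_constant_p(i)                                   \
      && ((i) == 1 || (i) == 4 || (i) == 8 || (i) == 16))       \
         ? fast_fetch_and_add(-(i), (v))                        \
         : slow_atomic_sub((i), (v)))
```

Both branches compute the same result, so `sub_return(4, &c)` and `sub_return(n, &c)` differ only in which helper is called. The expectation stated in this comment is that for a value loaded from memory the first operand of `&&` folds to 0 and the comparison chain disappears entirely.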
[Bug c++/83778] [8 regression] g++.dg/ext/altivec-cell-2.C fails starting with r256448
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83778

--- Comment #1 from David Malcolm ---
It does look like an issue with r256448, but I haven't been able to reproduce it here yet.

There are 3 in-tree copies of get_element_number, in 3 backends; each has 2 users per backend; they all look like:

    static int
    get_element_number (tree vec_type, tree arg)
    {
      unsigned HOST_WIDE_INT elt, max = TYPE_VECTOR_SUBPARTS (vec_type) - 1;

      if (!tree_fits_uhwi_p (arg)
          || (elt = tree_to_uhwi (arg), elt > max))
        {
          error ("selector must be an integer constant in the range 0..%wi", max);
          return 0;
        }

      return elt;
    }

which is going to fail if a location wrapper node around an INTEGER_CST is passed for ARG, rather than a plain INTEGER_CST.
[Bug tree-optimization/81703] memcpy folding defeats strlen optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81703 prathamesh3492 at gcc dot gnu.org changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #3 from prathamesh3492 at gcc dot gnu.org --- Fixed.
[Bug tree-optimization/83781] [8 Regression] Bootstrap failed on x86 with --with-arch=corei7 --with-cpu=corei7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83781 Martin Sebor changed: What|Removed |Added Status|ASSIGNED|RESOLVED Component|target |tree-optimization Resolution|--- |FIXED --- Comment #9 from Martin Sebor --- r256477 should fix the bootstrap failure. Please open a new bug for any new issues with the same commit.
[Bug c/83781] New: [8 Regression] Bootstrap failed on x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83781 Bug ID: 83781 Summary: [8 Regression] Bootstrap failed on x86 Product: gcc Version: 8.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: hjl.tools at gmail dot com CC: msebor at gcc dot gnu.org Target Milestone: --- I got trunk/gcc/hsa-dump.c ../../src-trunk/gcc/hsa-dump.c: In function ‘void dump_hsa_symbol(FILE*, hsa_symbol*)’: ../../src-trunk/gcc/hsa-dump.c:784:21: error: ‘%s’ directive writing up to 71 bytes into a region of size 62 [-Werror=format-overflow=] sprintf (buf, "__%s_%i", hsa_seg_name (symbol->m_segment), ^ ../../src-trunk/gcc/hsa-dump.c:784:15: note: ‘sprintf’ output between 5 and 86 bytes into a destination of size 64 sprintf (buf, "__%s_%i", hsa_seg_name (symbol->m_segment), ^~ symbol->m_name_number); and ../../src-trunk/gcc/config/i386/i386.c: In function 'const char* output_set_got(rtx, rtx)': ../../src-trunk/gcc/config/i386/i386.c:10653:20: error: '%s' directive writing up to 323 bytes into a region of size 13 [-Werror=format-overflow=] sprintf (name, "__x86.get_pc_thunk.%s", reg_names[regno]); ^~~ ../../src-trunk/gcc/config/i386/i386.c:10653:13: note: 'sprintf' output between 20 and 343 bytes into a destination of size 32 sprintf (name, "__x86.get_pc_thunk.%s", reg_names[regno]); ^ r256454 is OK and r25646 failed. It may be caused by r256457.
[Bug target/83781] [8 Regression] Bootstrap failed on x86 with --with-arch=corei7 --with-cpu=corei7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83781

Martin Sebor changed:

           What    |Removed                       |Added
             Status|NEW                           |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |msebor at gcc dot gnu.org

--- Comment #6 from Martin Sebor ---
I can reproduce the problem with the following test case. Let me take care of it.

    $ cat z.c && gcc -O2 -S -Wall z.c
    const char* const a[32] = { "1", "12", "123" };

    char d[4];

    void f (int i)
    {
      __builtin_sprintf (d, "%s", a[i]);
    }
    z.c: In function ‘f’:
    z.c:7:26: warning: ‘%s’ directive writing up to 255 bytes into a region of size 4 [-Wformat-overflow=]
       __builtin_sprintf (d, "%s", a[i]);
                          ^~
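Independent of the false positive being fixed here, code that formats a string of unknown length into a small buffer can sidestep this class of warning by bounding the write. The following is only a sketch of that general technique, not the change made for this PR; `names` and `format_name` are illustrative names, with the table contents borrowed from the test case above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Same shape as the array in the test case; entries 3..31 are NULL. */
static const char *const names[32] = { "1", "12", "123" };

/* Copy names[i] into dst without ever writing more than size bytes.
   Returns 1 if the whole string fit, 0 if it was truncated.
   Out-of-range indices and NULL entries are treated as "". */
int format_name(char *dst, size_t size, int i)
{
    assert(dst != NULL && size > 0);
    const char *s = (i >= 0 && i < 32 && names[i] != NULL) ? names[i] : "";
    int n = snprintf(dst, size, "%s", s);
    return n >= 0 && (size_t)n < size;
}
```

snprintf() returns the length the full output would have had, so comparing that value against the buffer size is enough to detect truncation.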
[Bug target/83768] ARM: wrong optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83768 --- Comment #2 from Sergey Organov --- 4.8.3 doesn't have the issue, and I don't have fast access to any 4.9. So presumably it has been fixed between 5.4.0 and 5.4.1... It'd still be nice to know if there is some optimization switch in 5.4.0 to be turned off to overcome the problem. Any idea?
[Bug target/83781] [8 Regression] Bootstrap failed on x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83781 Andrew Pinski changed: What|Removed |Added Keywords||build, diagnostic Target||x86_64-linux-gnu Component|c |target Target Milestone|--- |8.0
[Bug target/83781] [8 Regression] Bootstrap failed on x86 with --with-arch=corei7 --with-cpu=corei7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83781 --- Comment #4 from Martin Sebor --- The gcc/hsa-dump.c warning doesn't seem to correspond to the latest sources. hsa_seg_name() returns one of a number of short strings, the longest being "UNKNOWN_SEGMENT" but the warning says the %s argument can be up to 71 characters long.
[Bug c++/83780] New: False positive alignment error with -fsanitize=undefined with virtual base
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83780

            Bug ID: 83780
           Summary: False positive alignment error with -fsanitize=undefined with virtual base
           Product: gcc
           Version: 7.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: securesneakers at gmail dot com
  Target Milestone: ---

Created attachment 43091
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43091&action=edit
Minimal example that reproduces the issue

The attached program generates false misalignment errors when compiled with -fsanitize=undefined:

    $ g++ --version
    g++ (GCC) 7.2.1 20171224
    $ uname -s -m
    Linux x86_64
    $ g++ -std=c++11 -O2 -fsanitize=undefined minimal.cpp && ./a.out
    minimal.cpp:9:8: runtime error: constructor call on misaligned address 0x7ffdd1e1e658 for type 'struct Base2', which requires 16 byte alignment

The attached example contains the following hierarchy:

    struct alignas(16) Base1 { };
    struct Base2 : virtual Base1 { };
    struct Base3 : virtual Base2 { };

alignof(Base2) is set to 16 due to the alignment of its base class. But when Base3 is instantiated, Base2 is placed with an alignment of 8, as it should be according to the Itanium C++ ABI (because its non-virtual alignment is 8): https://refspecs.linuxfoundation.org/cxxabi-1.75.html#class-types. Yet the sanitizer complains about the alignment not being 16.

It seems that the sanitizer checks the address using the "normal" alignment when the "non-virtual alignment" should be used.
[Bug target/83781] [8 Regression] Bootstrap failed on x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83781

--- Comment #2 from Martin Sebor ---
Created attachment 43093
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43093&action=edit
x86_64-linux tests summary.

My x86_64 bootstrap and regression test run of the patch succeeded with the attached test summary. Let me quickly retry with the top of trunk and look into the warning.
[Bug target/83781] [8 Regression] Bootstrap failed on x86 with --with-arch=corei7 --with-cpu=corei7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83781

--- Comment #7 from Martin Sebor ---
Created attachment 43094
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43094&action=edit
Preliminary patch.

I'm testing the attached patch.
[Bug tree-optimization/83501] [8 Regression] strlen(a) not folded after strcpy(a, "...")
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83501

--- Comment #8 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Thu Jan 11 04:37:48 2018
New Revision: 256475

URL: https://gcc.gnu.org/viewcvs?rev=256475&root=gcc&view=rev
Log:
2018-01-11  Martin Sebor
            Prathamesh Kulkarni

        PR tree-optimization/83501
        PR tree-optimization/81703
        * tree-ssa-strlen.c (get_string_cst): Rename...
        (get_string_len): ...to this. Handle global constants.
        (handle_char_store): Adjust.

testsuite/
        * gcc.dg/strlenopt-39.c: New test-case.
        * gcc.dg/pr81703.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.dg/pr81703.c
    trunk/gcc/testsuite/gcc.dg/strlenopt-39.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-strlen.c
[Bug libstdc++/83709] Inserting duplicates into an unordered associative containers causes the container to invalidate iterators
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83709 François Dumont changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED Target Milestone|--- |8.0 --- Comment #3 from François Dumont --- Should be fixed now, thanks again for reporting.
[Bug ipa/83532] [8 Regression] ICE in apply_scale, at profile-count.h:955
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83532

Jan Hubicka changed:

           What    |Removed                     |Added
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2018-01-10
     Ever confirmed|0                           |1

--- Comment #1 from Jan Hubicka ---
Does this help? It seems we omit the adjustment when producing a partial function clone.

Index: tree-inline.c
===
--- tree-inline.c (revision 256324)
+++ tree-inline.c (working copy)
@@ -2683,7 +2683,6 @@
   profile_count den = ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count;
   profile_count num = entry_block_map->count;
 
-  profile_count::adjust_for_ipa_scaling (&num, &den);
   cfun_to_copy = id->src_cfun = DECL_STRUCT_FUNCTION (callee_fndecl);
 
@@ -2707,6 +2706,8 @@
       ENTRY_BLOCK_PTR_FOR_FN (cfun)->count = den;
     }
 
+  profile_count::adjust_for_ipa_scaling (&num, &den);
+
   /* Must have a CFG here at this point.  */
   gcc_assert (ENTRY_BLOCK_PTR_FOR_FN
               (DECL_STRUCT_FUNCTION (callee_fndecl)));
[Bug target/83008] [performance] Is it better to avoid extra instructions in data passing between loops?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008 --- Comment #22 from rguenther at suse dot de --- On Wed, 10 Jan 2018, sergey.shalnov at intel dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008 > > --- Comment #21 from sergey.shalnov at intel dot com --- > Thanks Richard for your comments. > Based on our discussion I've produced the patch attached and > run it on SPEC2017intrate/fprate on skylake server (with [-Ofast -flto > -march=skylake-avx512 -mfpmath=sse -funroll-loops]). > Please note, I used your last proposed patch and changed loop trip count > calculation ("ncopies_for_cost * nunits / group_size" is always 1). > > I see the following performance changes: > SPEC CPU 2017 intrate > 500.perlbench_r -0.7% > 525.x264_r +7.2% > Geomean: +0.8% > > SPEC CPU 2017 fprate > 527.cam4_r -1.1% > 538.imagick_r +4.7% > 544.nab_r +3.6% > Geomean: +0.6% > > I believe that after appropriate cost model tweaks for other targets a gain > could be observed but I haven't checked it carefully. > It provides a good performance gain for the original case and a few other > improvements. > > Can you please take a look at the patch and comment (or might propose another > method)? It mixes several things, the last one (> to >= change in cost evaluation clearly wrong). The skylake_cost changes look somewhat odd to me. I'll attach my current SLP costing adjustment patch (after the SVE changes the old didn't build anymore). 
> Sergey > > From 41e5094cbdce72d4cc5e04fc3d11c01c3c1adbb7 Mon Sep 17 00:00:00 2001 > From: Sergey Shalnov> Date: Tue, 9 Jan 2018 14:37:14 +0100 > Subject: [PATCH, SLP] SLP_common_algo_changes > > --- > gcc/config/i386/x86-tune-costs.h | 4 ++-- > gcc/tree-vect-slp.c | 41 > ++-- > 2 files changed, 33 insertions(+), 12 deletions(-) > > diff --git a/gcc/config/i386/x86-tune-costs.h > b/gcc/config/i386/x86-tune-costs.h > index 312467d..3e0f904 100644 > --- a/gcc/config/i386/x86-tune-costs.h > +++ b/gcc/config/i386/x86-tune-costs.h > @@ -1555,7 +1555,7 @@ struct processor_costs skylake_cost = { >{4, 4, 4}, /* cost of loading integer registers >in QImode, HImode and SImode. >Relative to reg-reg move (2). */ > - {6, 6, 6}, /* cost of storing integer registers > */ > + {6, 6, 4}, /* cost of storing integer registers. > */ >2, /* cost of reg,reg fld/fst */ >{6, 6, 8}, /* cost of loading fp registers >in SFmode, DFmode and XFmode */ > @@ -1570,7 +1570,7 @@ struct processor_costs skylake_cost = { >{6, 6, 6, 10, 20}, /* cost of loading SSE registers >in 32,64,128,256 and 512-bit */ >{6, 6, 6, 10, 20}, /* cost of unaligned loads. */ > - {8, 8, 8, 8, 16},/* cost of storing SSE registers > + {8, 8, 8, 16, 32}, /* cost of storing SSE registers >in 32,64,128,256 and 512-bit */ >{8, 8, 8, 8, 16},/* cost of unaligned stores. */ >2, 2,/* SSE->integer and > integer->SSE moves */ > diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c > index 0ca42b4..7e63a1c 100644 > --- a/gcc/tree-vect-slp.c > +++ b/gcc/tree-vect-slp.c > @@ -1815,18 +1815,39 @@ vect_analyze_slp_cost_1 (slp_instance instance, > slp_tree node, >enum vect_def_type dt; >if (!op || op == lhs) > continue; > - if (vect_is_simple_use (op, stmt_info->vinfo, _stmt, )) > + if (vect_is_simple_use (op, stmt_info->vinfo, _stmt, ) > +&& (dt == vect_constant_def || dt == vect_external_def)) > { > /* Without looking at the actual initializer a vector of > constants can be implemented as load from the constant pool. > -??? 
We need to pass down stmt_info for a vector type > -even if it points to the wrong stmt. */ > - if (dt == vect_constant_def) > - record_stmt_cost (prologue_cost_vec, 1, vector_load, > - stmt_info, 0, vect_prologue); > - else if (dt == vect_external_def) > - record_stmt_cost (prologue_cost_vec, 1, vec_construct, > - stmt_info, 0, vect_prologue); > +When all elements are the same we can use a splat. */ > + unsigned group_size = SLP_TREE_SCALAR_STMTS (node).length (); > + tree elt = NULL_TREE; > + unsigned nelt = 0; > + for (unsigned j = 0; j < ncopies_for_cost; ++j) > + for (unsigned k = 0; k < group_size; ++k) > + { > + if (nelt == 0) > + elt =
[Bug target/83008] [performance] Is it better to avoid extra instructions in data passing between loops?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008

--- Comment #23 from Richard Biener ---
Created attachment 43084
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43084&action=edit
SLP costing for constants/externs improvement
[Bug ada/83765] LTO bootstrap with Ada fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83765

--- Comment #1 from Richard Biener ---
Doesn't seem to work :/ I guess making the old_die && declaration more prevalent might work.

Index: gcc/dwarf2out.c
===
--- gcc/dwarf2out.c (revision 256378)
+++ gcc/dwarf2out.c (working copy)
@@ -22044,6 +22044,11 @@ gen_subprogram_die (tree decl, dw_die_re
   int declaration = (current_function_decl != decl
                      || class_or_namespace_scope_p (context_die));
 
+  /* A declaration that has been previously dumped needs no
+     additional information.  */
+  if (old_die && declaration)
+    return;
+
   /* Now that the C++ front end lazily declares artificial member fns, we
      might need to retrofit the declaration into its class.  */
   if (!declaration && !origin && !old_die
@@ -22084,11 +22089,6 @@ gen_subprogram_die (tree decl, dw_die_re
      much as possible.  */
   else if (old_die)
     {
-      /* A declaration that has been previously dumped needs no
-         additional information.  */
-      if (declaration)
-        return;
-
       if (!get_AT_flag (old_die, DW_AT_declaration)
           /* We can have a normal definition following an inline one in the
              case of redefinition of GNU C extern inlines.
[Bug libstdc++/67632] explicit instantiation omits copy constructor and others
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67632 --- Comment #9 from Anthony Chuah --- Forgot to add: this bug exists also for clang 5.0.0. % clang++ -std=c++11 --gcc-toolchain=/path/to/gcc t.o x.cc /tmp/x-063634.o: In function `copy(std::unordered_map> const&)': x.cc:(.text+0x18): undefined reference to `std::unordered_map >::unordered_map(std::unordered_map > const&)' clang-5.0: error: linker command failed with exit code 1 (use -v to see invocation)
[Bug c++/81702] [7/8 Regression] ICE in gimple_get_virt_method_for_vtable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81702 Richard Biener changed: What|Removed |Added CC||lesliezhai at llvm dot org.cn --- Comment #13 from Richard Biener --- *** Bug 83764 has been marked as a duplicate of this bug. ***
[Bug middle-end/83764] internal compiler error: in gimple_get_virt_method_for_vtable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83764 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |DUPLICATE --- Comment #4 from Richard Biener --- dup. *** This bug has been marked as a duplicate of bug 81702 ***
[Bug libstdc++/67632] explicit instantiation omits copy constructor and others
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67632

--- Comment #10 from Andreas Schwab ---
*** Bug 83766 has been marked as a duplicate of this bug. ***
[Bug c++/83766] Bug 67632 not fixed yet
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83766

Andreas Schwab changed:

           What    |Removed                     |Added
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #1 from Andreas Schwab ---
dup

*** This bug has been marked as a duplicate of bug 67632 ***
[Bug target/83008] [performance] Is it better to avoid extra instructions in data passing between loops?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008 --- Comment #25 from rguenther at suse dot de --- On Wed, 10 Jan 2018, sergey.shalnov at intel dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008 > > --- Comment #24 from sergey.shalnov at intel dot com --- > Richard, > The latest "SLP costing for constants/externs improvement" patch generates the > same code as baseline for the test example. > > Are you sure that "num_vects_to_check" should 1 if vector is not constant? I > would expect "num_vects_to_check = ncopies_for_cost;" here: > > 1863 else > 1864{ > 1865 num_vects_to_check = 1; > 1866 nelt_limit = group_size; > 1867} Yes, that's correct for variable length vectors (SVE)
[Bug target/83687] [6/7/8 Regression] ARM NEON invalid optimisation for vabd/vabdl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83687 ktkachov at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2018-01-10 CC||ktkachov at gcc dot gnu.org Summary|ARM NEON invalid|[6/7/8 Regression] ARM NEON |optimisation for vabd/vabdl |invalid optimisation for ||vabd/vabdl Ever confirmed|0 |1 Known to fail||4.8.5, 4.9.4, 5.4.1, 6.4.1, ||7.2.1, 8.0 --- Comment #1 from ktkachov at gcc dot gnu.org --- Confirmed on all active branches. Combining VSUB + VABS into VABD in this way is not valid for integers.
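To see why the combination is invalid for integers, it may help to model a single 8-bit lane in plain C. The sketch below rests on an assumption about the instruction semantics: integer VSUB and VABS wrap modulo 2^8, while VABD.S8 forms the difference at full precision before taking the absolute value and truncating. The function names are invented for this illustration and are not NEON intrinsics.

```c
#include <assert.h>
#include <stdint.h>

/* One 8-bit lane of VSUB followed by VABS; the return value is the raw lane bits. */
uint8_t sub_then_abs_s8(int8_t a, int8_t b)
{
    uint8_t d = (uint8_t)(a - b);   /* VSUB: difference truncated to 8 bits */
    if (d & 0x80)                   /* lane is negative when read as signed */
        d = (uint8_t)(0u - d);      /* VABS: negate, again modulo 2^8 */
    return d;
}

/* One 8-bit lane of VABD.S8, modelled with the subtraction done in int. */
uint8_t abd_s8(int8_t a, int8_t b)
{
    int diff = a - b;               /* range -255..255, cannot wrap in int */
    if (diff < 0)
        diff = -diff;
    return (uint8_t)diff;           /* 0..255 always fits in 8 bits */
}

/* Small inputs agree, but inputs whose difference exceeds 127 do not. */
void demo_vabd_counterexample(void)
{
    assert(sub_then_abs_s8(10, 3) == abd_s8(10, 3));
    assert(abd_s8(127, -128) == 255);
    assert(sub_then_abs_s8(127, -128) == 1);
}
```

Under this model the two sequences agree whenever the true difference fits in a signed 8-bit lane and diverge as soon as the subtraction wraps, which matches the statement that the transformation is only safe where no such wraparound can occur.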
[Bug debug/83765] LTO bootstrap with Ada fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83765

--- Comment #5 from Richard Biener ---
Author: rguenth
Date: Wed Jan 10 14:23:29 2018
New Revision: 256428

URL: https://gcc.gnu.org/viewcvs?rev=256428&root=gcc&view=rev
Log:
2018-01-10  Richard Biener

        PR debug/83765
        * dwarf2out.c (gen_subprogram_die): Hoist old_die && declaration
        early out so it also covers the case where we have a non-NULL
        origin.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/dwarf2out.c
[Bug ipa/83054] [8 Regression] ICE in operator>, at profile-count.h:823
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83054 Martin Liška changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2018-01-10 Assignee|unassigned at gcc dot gnu.org |marxin at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #2 from Martin Liška --- I have patch for it.
[Bug middle-end/81657] [8 Regression] FAIL: gcc.dg/20050503-1.c scan-assembler-not call
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81657 Richard Biener changed: What|Removed |Added Priority|P3 |P1
[Bug tree-optimization/83055] [8 Regression] ICE in operator>, at profile-count.h:834
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83055 --- Comment #5 from Martin Liška --- (In reply to Martin Liška from comment #4) > One another test-case with a bit different back-trace: > > $ g++ > /home/marxin/Programming/gcc/gcc/testsuite/g++.old-deja/g++.mike/p789a.C > /dev/null -Wsuggest-final-types -Ofast > during IPA pass: devirt > /home/marxin/Programming/gcc/gcc/testsuite/g++.old-deja/g++.mike/p789a.C:44: > 1: internal compiler error: in operator>, at profile-count.h:827 > } > ^ > 0xba372c profile_count::operator>(long) const > ../../gcc/profile-count.h:827 > 0xba372c ipa_devirt > ../../gcc/ipa-devirt.c:3750 > 0xba372c execute > ../../gcc/ipa-devirt.c:3892 This one is dup of PR83054, however https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83055#c1 is a different story.
[Bug middle-end/82004] [8 Regression] SPEC CPU2017 628.pop2_s miscompare
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82004 Richard Biener changed: What|Removed |Added Priority|P3 |P1
[Bug lto/81968] [8 regression] early lto debug objects make Solaris ld SEGV
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81968 Richard Biener changed: What|Removed |Added Priority|P3 |P1
[Bug c++/81917] [6/7/8 Regression] internal compiler error: in finish_member_declaration, at cp/semantics.c:3004
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81917 Richard Biener changed: What|Removed |Added Priority|P3 |P2
[Bug debug/82425] [8 regression] gcc.dg/guality/inline-params-2.c fail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82425 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #2 from Richard Biener --- Fixed.
[Bug debug/82425] [8 regression] gcc.dg/guality/inline-params-2.c fail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82425 --- Comment #3 from Richard Biener --- Author: rguenth Date: Wed Jan 10 14:41:34 2018 New Revision: 256429 URL: https://gcc.gnu.org/viewcvs?rev=256429&root=gcc&view=rev Log: 2018-01-10 Richard Biener PR debug/82425 * gcc.dg/guality/inline-params-2.c: Un-XFAIL for slim LTO. Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.dg/guality/inline-params-2.c
[Bug target/82682] [8 Regression] FAIL: gcc.target/i386/pr50038.c scan-assembler-times movzbl 2 (found 3 times) since r253958
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82682 Richard Biener changed: What|Removed |Added Priority|P3 |P1
[Bug c++/82249] [8 Regression] wrong mismatched argument pack lengths
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82249 --- Comment #5 from Benjamin Buch --- The code is accepted by clang since version 4.0. Older versions probably don't support constexpr lambdas.
[Bug tree-optimization/82965] [8 regression][armeb] gcc.dg/vect/pr79347.c starts failing after r254379
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82965 Richard Biener changed: What|Removed |Added Priority|P3 |P1 Status|UNCONFIRMED |NEW Last reconfirmed||2018-01-10 Component|other |tree-optimization Ever confirmed|0 |1
[Bug tree-optimization/83043] [8 Regression] FAIL: libgomp.graphite/force-parallel-1.c scan-tree-dump-times graphite "2 loops carried no dependency" 1 (found 0 times)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83043 Richard Biener changed: What|Removed |Added Priority|P3 |P1 Status|UNCONFIRMED |NEW Last reconfirmed||2018-01-10 Ever confirmed|0 |1
[Bug preprocessor/83063] [8 Regression] ICE on an invalid preprocessor snippet
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83063 Richard Biener changed: What|Removed |Added Priority|P3 |P1 Version|unknown |8.0
[Bug ipa/83179] [8 regression] gcc.dg/ipa/inline-1.c fail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83179 Richard Biener changed: What|Removed |Added Priority|P3 |P1
[Bug c++/83186] [8 regression] internal compiler error: in build_address, at cp/typeck.c:5667
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83186 Richard Biener changed: What|Removed |Added Priority|P3 |P1
[Bug tree-optimization/83189] [8 regression] internal compiler error: in probability_in, at profile-count.h:1050
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83189 Richard Biener changed: What|Removed |Added Priority|P3 |P1
[Bug target/68834] ICE: in decompose_normal_address, at rtlanal.c:6086 with -O2 -fPIC --param=sched-autopref-queue-depth=1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68834 --- Comment #2 from Segher Boessenkool --- Likely the same issue as PR83629.
[Bug middle-end/83764] internal compiler error: in gimple_get_virt_method_for_vtable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83764 Adam Lackorzynski changed: What|Removed |Added CC||adam at os dot inf.tu-dresden.de --- Comment #3 from Adam Lackorzynski --- This is a duplicate of #81702, which has already been fixed.
[Bug target/83760] [8 Regression] [SH] ICE in maybe_record_trace_start building glibc tst-copy_file_range.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83760 Richard Biener changed: What|Removed |Added Priority|P3 |P4 Target Milestone|--- |8.0
[Bug target/83761] bfin: ICE: in require, at machmode.h:292
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83761 --- Comment #1 from Sebastian Huber --- Created attachment 43086 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43086&action=edit Makefile to build the cross GCC

Use

  make clone
  make install/bin/bfin-rtems5-ld
  make install/bin/bfin-rtems5-gcc

to build the cross GCC to reproduce the problem.
[Bug target/83008] [performance] Is it better to avoid extra instructions in data passing between loops?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008 --- Comment #26 from sergey.shalnov at intel dot com --- Sorry, did you mean "arm_sve.h" on ARM? In this case we have machine-specific code in the common part of the GCC code. Should we make it a machine-dependent callback function, because having "num_vects_to_check = 1" is incorrect for SSE?
[Bug bootstrap/82831] [8 Regression] Broken PGO bootstrap after r254379
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82831 --- Comment #31 from Martin Liška --- Author: marxin Date: Wed Jan 10 10:54:20 2018 New Revision: 256422 URL: https://gcc.gnu.org/viewcvs?rev=256422&root=gcc&view=rev Log: Clean up partitioning in try_optimize_cfg (PR bootstrap/82831). 2018-01-10 Martin Liska PR bootstrap/82831 * basic-block.h (CLEANUP_NO_PARTITIONING): New define. * bb-reorder.c (pass_reorder_blocks::execute): Do not clean up partitioning. * cfgcleanup.c (try_optimize_cfg): Fix up partitioning if CLEANUP_NO_PARTITIONING is not set. Modified: trunk/gcc/ChangeLog trunk/gcc/basic-block.h trunk/gcc/bb-reorder.c trunk/gcc/cfgcleanup.c
[Bug tree-optimization/81184] [8 regression] gcc.dg/pr21643.c and gcc.dg/tree-ssa/phi-opt-11.c fail starting with r249450
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81184 --- Comment #6 from Christophe Lyon --- Note that I posted a related patch some time ago: https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01477.html
[Bug tree-optimization/83753] ICE: in exact_div, at poly-int.h:2139
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83753 --- Comment #4 from Vidya Praveen --- Thanks for fixing this, Richard!
[Bug c++/81843] [8 Regression] ICE on valid C++11 code with variadic templates: in tsubst_pack_expansion, at cp/pt.c:11524
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81843 Richard Biener changed: What|Removed |Added Priority|P3 |P1
[Bug c++/82249] [8 Regression] wrong mismatched argument pack lengths
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82249 Richard Biener changed: What|Removed |Added Keywords|diagnostic |rejects-valid Priority|P3 |P1 Known to work||7.2.1 --- Comment #4 from Richard Biener --- GCC 7 accepts the code. My version of clang rejects it with

t.C:5:6: error: constexpr variable 'calc' must be initialized by a constant expression
[](auto n, auto dim)noexcept{ return dim; };
^~~
[Bug c++/82514] [8 Regression] ICE: in operator[], at vec.h:749
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82514 Richard Biener changed: What|Removed |Added Priority|P3 |P1
[Bug tree-optimization/82604] [8 Regression] SPEC CPU2006 410.bwaves ~50% performance regression with trunk@253679 when ftree-parallelize-loops is used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82604 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2018-01-10 Ever confirmed|0 |1 --- Comment #9 from Richard Biener --- Confirmed at least.
[Bug c++/82728] [8 regression] Incorrect -Wunused-but-set-variable warning with a const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82728 Richard Biener changed: What|Removed |Added Priority|P3 |P1
[Bug c/78768] -Walloca-larger-than and -Wformat-length warnings disabled by -flto
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78768 --- Comment #13 from Richard Biener --- Author: rguenth Date: Wed Jan 10 14:51:07 2018 New Revision: 256430 URL: https://gcc.gnu.org/viewcvs?rev=256430&root=gcc&view=rev Log: 2018-01-10 Richard Biener PR testsuite/78768 * gcc.dg/pr78768.c: Un-XFAIL. Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.dg/pr78768.c
[Bug sanitizer/82824] [8 regression] libsanitizer fails to build: VM_MEMORY_OS_ALLOC_ONCE undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82824 Richard Biener changed: What|Removed |Added Priority|P3 |P1
[Bug rtl-optimization/82982] [8 Regression] ICE: qsort checking failed (error: qsort comparator non-negative on sorted output: 5) in ready_sort_real in haifa scheduler
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82982 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2018-01-10 Ever confirmed|0 |1
[Bug boehm-gc/66848] boehm-gc fails test suite on x86_64-apple-darwin15
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66848 Andrew Haley changed: What|Removed |Added Status|WAITING |RESOLVED CC||aph at gcc dot gnu.org Resolution|--- |WONTFIX --- Comment #35 from Andrew Haley --- Boehm GC is gone from GCC sources.
[Bug ipa/83051] [8 Regression] ICE on valid code at -O3: in edge_badness, at ipa-inline.c:1024
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83051 Richard Biener changed: What|Removed |Added Priority|P3 |P1
[Bug ipa/83054] [8 Regression] ICE in operator>, at profile-count.h:823
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83054 Richard Biener changed: What|Removed |Added Priority|P3 |P1
[Bug tree-optimization/83081] [8 regression][arm] gcc.dg/pr80218.c fails since r254888
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83081 Richard Biener changed: What|Removed |Added Priority|P3 |P1 Status|UNCONFIRMED |NEW Last reconfirmed|2017-12-19 00:00:00 |2018-01-10 CC||hubicka at gcc dot gnu.org Ever confirmed|0 |1
[Bug debug/83157] [6/7/8 regression] gcc.dg/guality/pr41616-1.c fail, inline instances refer to concrete instance as abstract origin
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83157 Richard Biener changed: What|Removed |Added Priority|P3 |P2
[Bug c++/83160] [8 regression] lvalue required as unary ‘&’ operand
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83160 Richard Biener changed: What|Removed |Added Priority|P3 |P1
[Bug target/83629] [7/8 Regression] ICE: in decompose_normal_address, at rtlanal.c:6329 with -O2 -fPIC -frename-registers --param=sched-autopref-queue-depth=nnn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83629 --- Comment #4 from Segher Boessenkool --- Author: segher Date: Wed Jan 10 15:13:07 2018 New Revision: 256432 URL: https://gcc.gnu.org/viewcvs?rev=256432&root=gcc&view=rev Log: rs6000: Wrap diff of immediates in const (PR83629) In various of our 32-bit load_toc patterns we take the difference of two immediates (labels) as a term to something bigger; but this isn't canonical RTL, it needs to be wrapped in CONST. PR target/83629 * config/rs6000/rs6000.md (load_toc_v4_PIC_2, load_toc_v4_PIC_3b, load_toc_v4_PIC_3c): Wrap const term in CONST RTL. testsuite/ PR target/83629 * gcc.target/powerpc/pr83629.c: New testcase. Added: trunk/gcc/testsuite/gcc.target/powerpc/pr83629.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/rs6000/rs6000.md trunk/gcc/testsuite/ChangeLog
[Bug target/83330] [7/8 Regression] generating unaligned store to stack for SSE register with -mno-push-args
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83330 Richard Biener changed: What|Removed |Added Priority|P3 |P2
[Bug c++/83769] New: Statement expression inside lambda defined and evaluated in global scope fails to compile with optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83769

Bug ID: 83769
Summary: Statement expression inside lambda defined and evaluated in global scope fails to compile with optimizations
Product: gcc
Version: 4.8.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: giel+gcc at mortis dot eu
Target Milestone: ---

Created attachment 43089 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43089&action=edit small example reproducing the problem

Evaluating a statement expression (such as the one used by htons in glibc) inside a lambda defined at global scope (an Immediately Invoked Lambda Expression (IILE) used as the initializer for a global constant) compiles successfully at -O0 but fails to compile at -O1 or higher. I'm building with -std=c++11.

I'm getting this error message:

> : In lambda function:
> :9:29: error: statement-expressions are not allowed outside functions
> nor in template-argument lists
>addr.sin_port = htons(KLocalPort);

A reduced example is attached.

I initially discovered this on GCC 4.8.2 (after discovering that a release build fails on CI, while a debug build on my local dev machine succeeds). Testing on godbolt.org confirms that this behaviour was already present in GCC 4.5.3 (-std=c++0x was necessary on that version) and is still present in GCC 7.2.

I also tested Clang (on godbolt.org only) and noticed that versions 3.0 up to and including 3.5.1 have the same behaviour (succeed with -O0, fail with a complaint about a statement expression at file scope with -O1). Clang 3.6 no longer has this problem though.
[Bug target/81535] [8 regression] gcc.target/powerpc/pr79439.c fails starting with r250442
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81535 --- Comment #7 from Jakub Jelinek --- Any further progress? I see you've posted something, but no further follow-ups from you or Segher.
[Bug tree-optimization/81635] [8 Regression] nvptx SLP test cases regressions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81635 Richard Biener changed: What|Removed |Added Priority|P3 |P1
[Bug target/82005] [8 regression] early lto debug creates invalid assembly on Darwin
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82005 Richard Biener changed: What|Removed |Added Priority|P3 |P1
[Bug middle-end/82362] [8 Regression] SPEC CPU2006 436.cactusADM ~7% performance deviation with trunk@251713
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82362 Richard Biener changed: What|Removed |Added CC||wilco at gcc dot gnu.org Component|fortran |middle-end Severity|enhancement |normal
[Bug testsuite/82770] [8 regression] gcc.dg/pr78768.c xpass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82770 Richard Biener changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #4 from Richard Biener --- With -v you can clearly see the warnings are emitted from lto1. If you'd use fat LTO objects you'd see them twice but the testcase requires a linker plugin. So I think the XFAIL needs to simply be removed. Will commit that change now.
[Bug bootstrap/81033] [8 Regression] Revision r249019 breaks bootstrap on darwin
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81033 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #30 from Jakub Jelinek --- Any progress on this? Or shall we just completely disable hot/cold function partitioning on darwin till then?
[Bug target/83768] New: ARM: wrong optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83768

Bug ID: 83768
Summary: ARM: wrong optimization
Product: gcc
Version: 5.4.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: sorganov at gmail dot com
Target Milestone: ---

GCC 5.4.0 optimizes the following code to an empty function, when in fact the code does clear 'count' bytes starting from 'buf' (it looks strange as it's a simplified version of real code). Note that GCC 6.3.0 produces correct code for the case.

$ cat clean.c
void clean(unsigned char* buf, int count)
{
  int linear = 0;
  do {
    count -= linear;
    while(linear--)
      *buf++ = 0;
    linear = count;
  } while(linear > 0);
}

Could you please tell me if/when this has already been fixed in 5.4.x versions, and whether there is any particular optimization I can turn off as a work-around for GCC 5.4.0?

$ /opt/arm-linux-5.4/bin/arm-linux-gnueabi-gcc -v
Using built-in specs.
COLLECT_GCC=/opt/arm-linux-5.4/bin/arm-linux-gnueabi-gcc
COLLECT_LTO_WRAPPER=/opt/arm-linux-5.4/libexec/gcc/arm-linux-gnueabi/5.4.0/lto-wrapper
Target: arm-linux-gnueabi
Configured with: ../src/gcc-5.4.0/configure --prefix=/opt/arm-linux-5.4 --target=arm-linux-gnueabi --disable-multilib --disable-nls --enable-shared --enable-threads --enable-tls --disable-decimal-float --disable-libmudflap --disable-libquadmath --disable-libsanitizer --disable-libssp --disable-multilib --with-float=hard --with-gnu-as --with-gnu-ld --without-cloog --without-isl --with-sysroot=/opt/arm-linux-5.4/arm-linux-gnueabi/sysroot --enable-languages=c,c++ --enable-__cxa_atexit --enable-static --enable-shared --enable-threads --enable-long-long --disable-lto --disable-libgomp
Thread model: posix
gcc version 5.4.0 (GCC)

$ /opt/arm-linux-5.4/bin/arm-linux-gnueabi-gcc -O2 -c clean.c -o clean.o -save-temps && cat clean.s
        .cpu arm10tdmi
        .eabi_attribute 28, 1
        .fpu vfp
        .eabi_attribute 20, 1
        .eabi_attribute 21, 1
        .eabi_attribute 23, 3
        .eabi_attribute 24, 1
        .eabi_attribute 25, 1
        .eabi_attribute 26, 2
        .eabi_attribute 30, 2
        .eabi_attribute 34, 0
        .eabi_attribute 18, 4
        .arm
        .syntax divided
        .file "clean.c"
        .text
        .align 2
        .global clean
        .type clean, %function
clean:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        bx lr
        .size clean, .-clean
        .ident "GCC: (GNU) 5.4.0"
        .section .note.GNU-stack,"",%progbits
$
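The behaviour the reporter expects from clean() can be checked at run time with a small harness. The following is only a sketch, assuming a hosted target with a working C library. The helper clean_clears_exactly() and its 64-byte scratch buffer are hypothetical and not part of the report; clean() itself is copied from it.

```c
#include <assert.h>
#include <string.h>

/* Function under test, copied from the report.  */
void
clean (unsigned char *buf, int count)
{
  int linear = 0;
  do
    {
      count -= linear;
      while (linear--)
        *buf++ = 0;
      linear = count;
    }
  while (linear > 0);
}

/* Hypothetical helper: returns 1 if clean() zeroed exactly the first
   COUNT bytes of a 0xAA-filled scratch buffer and left the rest alone,
   0 otherwise.  */
int
clean_clears_exactly (int count)
{
  unsigned char scratch[64];
  int i;

  assert (count >= 0 && count <= (int) sizeof scratch);
  memset (scratch, 0xAA, sizeof scratch);
  clean (scratch, count);

  for (i = 0; i < (int) sizeof scratch; i++)
    if (scratch[i] != (i < count ? 0 : 0xAA))
      return 0;
  return 1;
}
```

With a correctly compiled clean(), clean_clears_exactly(n) should return 1 for every n from 0 to 64. With the empty function shown in the assembly above, it would return 0 for any n greater than 0. To mirror the report, clean() would probably need to stay in its own translation unit built with -O2, so that the caller cannot inline it.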
[Bug tree-optimization/83055] [8 Regression] ICE in operator>, at profile-count.h:834
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83055 Richard Biener changed: What|Removed |Added Priority|P3 |P1 Version|7.0 |8.0
[Bug lto/83121] [8 Regression] ICE: in linemap_ordinary_map_lookup, at libcpp/line-map.c:995
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83121 Richard Biener changed: What|Removed |Added Priority|P3 |P1
[Bug ipa/83178] [8 regression] g++.dg/ipa/devirt-22.C fail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83178 Richard Biener changed: What|Removed |Added Priority|P3 |P1
[Bug target/83203] [8 Regression] Inefficient int to avx2 vector conversion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83203 Richard Biener changed: What|Removed |Added Priority|P3 |P1 Version|7.2.1 |8.0
[Bug c++/81327] [8 Regression] cast to void* does not suppress -Wclass-memaccess
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81327 --- Comment #8 from Ville Voutilainen --- I can confirm that this fixes the woes in building and using Qt as far as QVector is concerned. Now that we have this fix, I see that there are other bit-blasts that don't use casts. Would it be possible to remove this warning from -Wall?
[Bug tree-optimization/83288] [8 Regression] polyhedron gas_dyn 2-fold compile-time regression caused by r255103
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83288 Richard Biener changed: What|Removed |Added Priority|P3 |P1 --- Comment #5 from Richard Biener --- Not solved by the big_speedup typo fix. Let's see if the fix for the other accidental change fixes it...
[Bug tree-optimization/83435] [8 Regression] ICE in set_value_range, at tree-vrp.c:211
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83435 Richard Biener changed: What|Removed |Added Priority|P3 |P1 Status|NEW |ASSIGNED Version|7.2.1 |8.0 Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org --- Comment #3 from Richard Biener --- Mine.
[Bug target/83629] [7/8 Regression] ICE: in decompose_normal_address, at rtlanal.c:6329 with -O2 -fPIC -frename-registers --param=sched-autopref-queue-depth=nnn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83629 --- Comment #5 from Segher Boessenkool --- Fixed on trunk; awaiting backports.
[Bug rtl-optimization/83770] New: [8 Regression] ICE in create_preheader, at cfgloopmanip.c:1536
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83770

Bug ID: 83770
Summary: [8 Regression] ICE in create_preheader, at cfgloopmanip.c:1536
Product: gcc
Version: 8.0
Status: UNCONFIRMED
Keywords: ice-on-valid-code
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: asolokha at gmx dot com
Target Milestone: ---

gcc-8.0.0-alpha20180107 snapshot (r256324) w/ r256420 applied on top of it still ICEs when compiling the following snippet w/ -O2 (-O3, -Ofast) -fselective-scheduling2 -fsel-sched-pipelining -fno-tree-vrp:

int qt, h1;

void
py (void)
{
  h1 = 1 / (!!qt && !!h1);
  for (;;)
    {
    }
}

% gcc-8.0.0-alpha20180107 -O2 -fselective-scheduling2 -fsel-sched-pipelining -fno-tree-vrp -c aeu1f56a.c
during RTL pass: sched2
aeu1f56a.c: In function 'py':
aeu1f56a.c:10:1: internal compiler error: in create_preheader, at cfgloopmanip.c:1536
 }
 ^
0x5cafc0 create_preheader(loop*, int)
        /var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180107/work/gcc-8-20180107/gcc/cfgloopmanip.c:1535
0x9820b7 create_preheaders(int)
        /var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180107/work/gcc-8-20180107/gcc/cfgloopmanip.c:1552
0xbe5229 apply_loop_flags
        /var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180107/work/gcc-8-20180107/gcc/loop-init.c:64
0xbe5afc loop_optimizer_init(unsigned int)
        /var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180107/work/gcc-8-20180107/gcc/loop-init.c:123
0xd070fd sel_init_pipelining()
        /var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180107/work/gcc-8-20180107/gcc/sel-sched-ir.c:6093
0xd07522 sel_find_rgns()
        /var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180107/work/gcc-8-20180107/gcc/sel-sched-ir.c:6240
0xcfd732 find_rgns
        /var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180107/work/gcc-8-20180107/gcc/sched-rgn.c:1077
0xcfd732 sched_rgn_init(bool)
        /var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180107/work/gcc-8-20180107/gcc/sched-rgn.c:3250
0xd1e821 sel_global_init
        /var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180107/work/gcc-8-20180107/gcc/sel-sched.c:7661
0xd1e821 run_selective_scheduling()
        /var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180107/work/gcc-8-20180107/gcc/sel-sched.c:7715
0xcfebc5 rest_of_handle_sched2
        /var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180107/work/gcc-8-20180107/gcc/sched-rgn.c:3729
0xcfebc5 execute
        /var/tmp/portage/sys-devel/gcc-8.0.0_alpha20180107/work/gcc-8-20180107/gcc/sched-rgn.c:3873
[Bug rtl-optimization/83575] [8 Regression] ICE: verify_flow_info failed (error: multiple hot/cold transitions found)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83575 --- Comment #7 from Arseny Solokha --- Is it safe to close this PR now?
[Bug target/83781] [8 Regression] Bootstrap failed on x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83781 --- Comment #1 from H.J. Lu --- (In reply to H.J. Lu from comment #0) > > r256454 is OK and r25646 failed. It may be caused by r256457. Oops. r256463 failed.
[Bug fortran/79383] USE statement error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79383 kargl at gcc dot gnu.org changed: What|Removed |Added CC||kargl at gcc dot gnu.org --- Comment #2 from kargl at gcc dot gnu.org --- The attached code compiles with both 7-branch and trunk. When the resulting executable is run I get 15.10 on output. Is this the expected behavior?
[Bug target/83781] [8 Regression] Bootstrap failed on x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83781 H.J. Lu changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2018-01-11 Ever confirmed|0 |1 --- Comment #3 from H.J. Lu --- Please configure GCC with --with-arch=corei7 --with-cpu=corei7 --prefix=/usr/8.0.0 --enable-clocale=gnu --with-system-zlib --enable-shared --with-demangler-in-ld --enable-libmpx --with-fpmath=sse
[Bug target/83781] [8 Regression] Bootstrap failed on x86 with --with-arch=corei7 --with-cpu=corei7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83781 --- Comment #5 from H.J. Lu --- On i686, it failed with --prefix=/usr/8.0.0 --enable-clocale=gnu --with-system-zlib --enable-shared --with-demangler-in-ld --enable-libmpx i686-linux --with-fpmath=sse
[Bug c++/81327] [8 Regression] cast to void* does not suppress -Wclass-memaccess
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81327 --- Comment #13 from Ville Voutilainen --- I understand that, but considering that I plan to convince the committee that the bit-blasts like Qt does should be well-defined, the warning is a bit eager, cast or no cast. And since it does break existing users, I would prefer having the warning be an opt-in rather than opt-out.
[Bug other/83737] [nvptx] FAIL: gcc.dg/stdint-width-1.c (test for excess errors) for with newlib stdint.h
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83737 --- Comment #5 from Tom de Vries --- (In reply to jos...@codesourcery.com from comment #4) > Most configurations (for which the libc used has a working stdint.h) > should probably be using use_gcc_stdint=wrap, so that GCC's stdint.h > includes libc's for hosted compilations but GCC's own for freestanding > compilations. *-*-elf configurations do, for example. I'd advise adding > that for nvptx. I've built nvptx using "use_gcc_stdint=wrap", and indeed that fixed the failure. I'll do full testing and commit.
[Bug target/83775] Segfault in arm_declare_function_name()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83775 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #1 from Jakub Jelinek --- I'm seeing this in my 2 recently built cross-compilers from x86_64-linux to arm*-linux-gnueabi too. I think it is pretty much on any testcase.
[Bug middle-end/81897] [6/7 Regression] spurious -Wmaybe-uninitialized warning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81897 --- Comment #20 from Aldy Hernandez --- (In reply to Aldy Hernandez from comment #18) > (In reply to Arnd Bergmann from comment #16) > > Created attachment 43056 [details] > > linux/net/ipv6/route.c, preprocessed and compressed > > > > To test the patch, I reverted the workaround that was added to the kernel > > when I originally reported this. Unfortunately the warning is still there, > > only the reduced version is fixed. I attached the preprocessed source now, > > test with > > > > $ x86_64-linux-gcc-8.0.0 -fno-strict-aliasing -Wall -O2 -Wno-pointer-sign -c > > route-1.i > > /git/arm-soc/net/ipv6/route.c: In function 'inet6_rtm_getroute': > > /git/arm-soc/net/ipv6/route.c:4350:9: warning: 'dst' may be used > > uninitialized in this function [-Wmaybe-uninitialized] > >goto errout; > > ^~ > > } > > > > Reducing this with the latest gcc-8.0.0 snapshot gave me > > > > enum { true } fn1(); > > int inet6_rtm_getroute_iif, inet6_rtm_getroute_rt, inet6_rtm_getroute_rtm_0; > > int *inet6_rtm_getroute___trans_tmp_8; > > int fn2(); > > void fn3() { > > int *dst; > > _Bool fibmatch = inet6_rtm_getroute_rtm_0 & 2; > > if (inet6_rtm_getroute_iif) { > > if (!fibmatch) > > dst = inet6_rtm_getroute___trans_tmp_8; > > static _Bool __warned; > > fn2() && __warned &(); > > __warned = true; > > } else if (!fibmatch) > > dst = 0; > > if (fibmatch) > > dst = 0; > > inet6_rtm_getroute_rt = *dst; > > } > > For both the reduced and full testcase in comment #16, I see this warning > present in GCC 5, 6, 7, and 8 (mainline). So this does not look like a > regression in the recent past. > > I also tested the reduced case in this comment for gcc4.5 (at least at > r150051), and the warning is present there as well. I didn't test the full > testcase, as they were other warnings that muddled the waters in such > ancient version. 
> > If you feel this reduced case is still problematic, I think we should open a > new PR, and mark it as a regression if you find a GCC far enough back that > does not exhibit the warning. If you can't find a GCC for which this has > regressed, then at least mark it as an enhancement request. FWIW, the reduced testcase in comment 16 is fixed by Andrew's and my upcoming work for the next release cycle (GCC 9). Sorry :(. So if you open a PR, feel free to CC or assign it to me. (Not so for the test in comment 17.)
[Bug fortran/82841] Segfault in gfc_simplify_transfer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82841 kargl at gcc dot gnu.org changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED Target Milestone|--- |6.5 --- Comment #5 from kargl at gcc dot gnu.org --- Fixed on 6-branch, 7-branch, and trunk. Closing. Thanks for the bug report.