[Bug fortran/91496] !GCC$ directives error if mistyped or unknown
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91496 Nick changed: What|Removed |Added Version|4.7.2 |7.4.0 --- Comment #1 from Nick --- I have bumped version from 4 to 7, to make it *newer*.
[Bug fortran/91496] New: !GCC$ directives error if mistyped or unknown
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91496

            Bug ID: 91496
           Summary: !GCC$ directives error if mistyped or unknown
           Product: gcc
           Version: 4.7.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: nickpapior at gmail dot com
  Target Milestone: ---

In Fortran code I would like to use directives such as:

    !GCC$ unroll <>

Compiling a Fortran source file with the above directive fails with:

    Error: Unclassifiable GCC directive at (1)

A small test code:

    program test
      integer :: i
      real :: a(3)
      !GCC$ unroll 3
      do i = 1, 3
        a(i) = 2.
      end do
      print *, sum(a)
    end program

Tested versions:

    Failed:  4.7.2
    Failed:  4.8.3
    Failed:  4.8.4
    Failed:  4.8.5
    Failed:  4.9.1
    Failed:  4.9.2
    Failed:  5.1.0
    Failed:  5.2.0
    Failed:  5.3.0
    Failed:  5.4.0
    Failed:  6.1.0
    Failed:  6.2.0
    Failed:  6.3.0
    Failed:  6.4.0
    Failed:  6.5.0
    Failed:  7.1.0
    Failed:  7.2.0
    Failed:  7.3.0
    Failed:  7.4.0
    Success: 8.1.0
    Success: 8.2.0
    Success: 8.3.0

I would recommend that non-existent or mistyped directives should *never* issue errors, only warnings. The reasoning is that whenever new directives are added, the source would otherwise need preprocessor statements that check the gfortran version in order to decide whether each directive is accepted. This becomes cumbersome and unnecessary.
[Bug rtl-optimization/91154] [10 Regression] 456.hmmer regression on Haswell caused by r272922
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91154

--- Comment #32 from Uroš Bizjak ---
(In reply to H.J. Lu from comment #31)
> > No, IMO IRA should be "fixed" to avoid stack temporary and (based on some
> > cost metric) use direct move for paradoxical subregs.
>
> The problem is
>
>   /* Moves between SSE and integer units are expensive.  */
>   if (SSE_CLASS_P (class1) != SSE_CLASS_P (class2))
>
>     /* ??? By keeping returned value relatively high, we limit the number
>        of moves between integer and SSE registers for all targets.
>        Additionally, high value prevents problem with x86_modes_tieable_p(),
>        where integer modes in SSE registers are not tieable
>        because of missing QImode and HImode moves to, from or between
>        MMX/SSE registers.  */
>     return MAX (8, SSE_CLASS_P (class1)
>                 ? ix86_cost->hard_register.sse_to_integer
>                 : ix86_cost->hard_register.integer_to_sse);
>
> The minimum cost of moves between SSE and integer units is 8.

I guess this should be reviewed. This is from reload time, nowadays we never actually disable sse <-> int moves, we use preferred_for_{speed,size} attributes. Also, at least for TARGET_MMX_WITH_SSE targets, we can change the penalty for MMX moves.

These sort of changes should be backed by runtime benchmark results.

Thanks for heads up, but let's take this cost issue elsewhere.
[Bug middle-end/89544] Argument marshalling incorrectly assumes stack slots are naturally aligned.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89544 --- Comment #8 from Bernd Edlinger --- Author: edlinger Date: Tue Aug 20 05:32:49 2019 New Revision: 274691 URL: https://gcc.gnu.org/viewcvs?rev=274691&root=gcc&view=rev Log: 2019-08-20 Bernd Edlinger PR middle-end/89544 * function.c (assign_parm_find_stack_rtl): Use larger alignment when possible. testsuite: 2019-08-20 Bernd Edlinger PR middle-end/89544 * gcc.target/arm/unaligned-argument-1.c: New test. * gcc.target/arm/unaligned-argument-2.c: New test. Added: trunk/gcc/testsuite/gcc.target/arm/unaligned-argument-1.c trunk/gcc/testsuite/gcc.target/arm/unaligned-argument-2.c Modified: trunk/gcc/ChangeLog trunk/gcc/function.c trunk/gcc/testsuite/ChangeLog
[Bug driver/18206] -dynamic-linker option seems to be badly named and broken
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18206 Eric Gallager changed: What|Removed |Added Component|target |driver --- Comment #4 from Eric Gallager --- (In reply to sandra from comment #2) > I was discussing this with Joseph Myers earlier today. He said "There isn't > meant to be a GCC driver -dynamic-linker option." and pointed at > > https://gcc.gnu.org/ml/gcc-patches/2010-12/msg00194.html > > but apparently some backends have broken again since then: > arm/freebsd.h c6x/uclinux-elf.h rs6000/freebsd64.h > > I found through experimentation that nios2-linux-gnu-gcc accepts > -dynamic-linker as a link option without error, but totally ignores it. > Joseph's explanation is that the driver accepts it as -d but > doesn't check the . If you try compiling with that option and not > just linking, gcc does diagnose that the are invalid. > ...so it seems to me, then, that the "driver" component would make more sense here than "target"
[Bug other/44210] Warning discoverability: generate parts of invoke.texi directly from .opt files to make it easier to find warning flag relationships
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44210 Eric Gallager changed: What|Removed |Added Depends on||44209 Summary|Extended warning control: |Warning discoverability: |like -Wevery -show-warnings |generate parts of ||invoke.texi directly from ||.opt files to make it ||easier to find warning flag ||relationships --- Comment #9 from Eric Gallager --- (In reply to Manuel López-Ibáñez from comment #8) > EnabledBy is already implemented. Also, -Wall --help=warnings shows which > warnings are enabled by -Wall. > > The only remaining thing is to generate parts of invoke.texi directly from > the .opt file. Retitling then. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44209 [Bug 44209] [meta-bug] Some warnings are not linked to diagnostics options
[Bug libstdc++/91495] New: std::transform_reduce with unary op is implemented in the parallel case but not the basic case
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91495

            Bug ID: 91495
           Summary: std::transform_reduce with unary op is implemented in
                    the parallel case but not the basic case
           Product: gcc
           Version: 9.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: a.boettcher at gmail dot com
  Target Milestone: ---

The ISO standard, per 29.8.5, requires a definition of the function template below, which is conspicuously missing in GCC 9 even though the parallel versions exist.

template<class InputIterator, class T,
         class BinaryOperation, class UnaryOperation>
  T transform_reduce(InputIterator first, InputIterator last, T init,
                     BinaryOperation binary_op, UnaryOperation unary_op);

Seems like an obvious mistake.
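For reference, a minimal sketch of a call that selects the sequential unary-op overload named in the report, as declared in C++17 `<numeric>`. The function name `sum_of_squares` and the lambda are illustrative only; whether a particular GCC 9 installation accepts this is exactly what the report is about.

```cpp
#include <functional>
#include <numeric>
#include <vector>

// Sum of squares via the sequential overload
//   transform_reduce(first, last, init, binary_op, unary_op):
// unary_op is applied to each element, binary_op combines the results.
int sum_of_squares(const std::vector<int>& v) {
    return std::transform_reduce(v.begin(), v.end(), 0,
                                 std::plus<>{},
                                 [](int x) { return x * x; });
}
```

Compiling this with `-std=c++17` is a quick way to check whether the overload is visible in a given toolchain.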
[Bug c/91494] New: Performance Regression when upgrading from 8.3.0 to 9.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91494 Bug ID: 91494 Summary: Performance Regression when upgrading from 8.3.0 to 9.0 Product: gcc Version: 9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: mc_george123 at hotmail dot com Target Milestone: --- During the Phoronix tests of the botan-1.4.0-blowfish and crafty-1.4.4 benchmarks, there are performance regressions in the compilation process between versions 8.3.0-433 and 9.0-454.
[Bug middle-end/91433] Performance Regression when upgrading from 8.3.0 to 9.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91433 --- Comment #2 from George Fan --- The compiler options for botan are "-fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt", while the compiler options for crafty are "-pthread -lstdc++ -fprofile-use -lm". The sub-architecture is Coffee Lake.
[Bug c++/91493] New: g++ 9.2.1 crashes compiling clickhouse
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91493

            Bug ID: 91493
           Summary: g++ 9.2.1 crashes compiling clickhouse
           Product: gcc
           Version: 9.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rafaeldtinoco at ubuntu dot com
  Target Milestone: ---

The following function:

std::string toString(const ColumnDefaultKind kind)
{
    static const std::unordered_map map{
        { ColumnDefaultKind::Default, AliasNames::DEFAULT },
        { ColumnDefaultKind::Materialized, AliasNames::MATERIALIZED },
        { ColumnDefaultKind::Alias, AliasNames::ALIAS }
    };

    const auto it = map.find(kind);
    return it != std::end(map) ? it->second : throw Exception{"Invalid ColumnDefaultKind", ErrorCodes::LOGICAL_ERROR};
}

causes gcc9 (with attached dump) to crash, while another similar function (with related syntax):

ColumnDefaultKind columnDefaultKindFromString(const std::string & str)
{
    static const std::unordered_map map{
        { AliasNames::DEFAULT, ColumnDefaultKind::Default },
        { AliasNames::MATERIALIZED, ColumnDefaultKind::Materialized },
        { AliasNames::ALIAS, ColumnDefaultKind::Alias }
    };

    const auto it = map.find(str);
    return it != std::end(map) ? it->second : throw Exception{"Unknown column default specifier: " + str, ErrorCodes::LOGICAL_ERROR};
}

does not. Changing the syntax to:

std::string toString(const ColumnDefaultKind kind)
{
    static const std::unordered_map map{
        { ColumnDefaultKind::Default, AliasNames::DEFAULT },
        { ColumnDefaultKind::Materialized, AliasNames::MATERIALIZED },
        { ColumnDefaultKind::Alias, AliasNames::ALIAS }
    };

    const auto it = map.find(kind);
    // Throw when the key is missing; only dereference a valid iterator.
    if (it == std::end(map))
        throw Exception{"Invalid ColumnDefaultKind", ErrorCodes::LOGICAL_ERROR};
    return it->second;
}

fixes the issue.
[Bug lto/91478] FAIL: gcc.dg/debug/pr41893-1.c -gdwarf-2 -g1 (test for excess errors)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91478 --- Comment #3 from dave.anglin at bell dot net --- On 2019-08-19 2:51 a.m., rguenth at gcc dot gnu.org wrote: > Is this a new failure, thus can it be bisected somehow? The failure was introduced in r273662: https://gcc.gnu.org/ml/gcc-cvs/2019-07/msg00827.html
[Bug rtl-optimization/91347] [7/8/9/10 Regression] pointer_string in linux vsprintf.c is miscompiled when sibling calls are optimized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91347 --- Comment #18 from dave.anglin at bell dot net --- On 2019-08-19 4:36 a.m., ebotcazou at gcc dot gnu.org wrote: > Created attachment 46728 > --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46728&action=edit > Execution test Works on hppa without -fno-inline.
[Bug ada/91492] [10 regression] Ada documentation issue starting with r274637
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91492 Hans-Peter Nilsson changed: What|Removed |Added CC||hp at gcc dot gnu.org --- Comment #1 from Hans-Peter Nilsson --- Apparently also happens for targets where ada isn't enabled (like cris-elf).
[Bug c++/79817] GCC does not recognize [[deprecated]] attribute for namespace
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79817 Marek Polacek changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |mpolacek at gcc dot gnu.org --- Comment #4 from Marek Polacek --- I have a WIP patch: w.C: In function ‘int main()’: w.C:5:9: warning: ‘ns’ is deprecated [-Wdeprecated-declarations] 5 | ns::i = 0; | ^ w.C:1:26: note: declared here 1 | namespace [[deprecated]] ns { int i ; } | ^~
[Bug rtl-optimization/91154] [10 Regression] 456.hmmer regression on Haswell caused by r272922
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91154

--- Comment #31 from H.J. Lu ---
(In reply to Uroš Bizjak from comment #28)
> (In reply to Richard Biener from comment #26)
> > This is the powers of simplify_subreg I guess.  We're lucky it doesn't do
> > this to arbitrary arithmetic.
> >
> > So we need to really change all defs we introduce to vector modes instead of
> > making our life easy and using paradoxical subregs all over the place.
>
> No, IMO IRA should be "fixed" to avoid stack temporary and (based on some
> cost metric) use direct move for paradoxical subregs.

The problem is

  /* Moves between SSE and integer units are expensive.  */
  if (SSE_CLASS_P (class1) != SSE_CLASS_P (class2))

    /* ??? By keeping returned value relatively high, we limit the number
       of moves between integer and SSE registers for all targets.
       Additionally, high value prevents problem with x86_modes_tieable_p(),
       where integer modes in SSE registers are not tieable
       because of missing QImode and HImode moves to, from or between
       MMX/SSE registers.  */
    return MAX (8, SSE_CLASS_P (class1)
                ? ix86_cost->hard_register.sse_to_integer
                : ix86_cost->hard_register.integer_to_sse);

The minimum cost of moves between SSE and integer units is 8.
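The effect of the `MAX (8, ...)` clamp can be illustrated with a stand-alone sketch. The struct and function names below are hypothetical stand-ins, not GCC's actual types; only the clamping expression mirrors the quoted code.

```cpp
#include <algorithm>

// Hypothetical stand-in for the two per-CPU cost table entries quoted above.
struct HardRegisterCosts {
    int sse_to_integer;
    int integer_to_sse;
};

// Mirrors the quoted expression: the tuned cost is used only when it is
// at least 8; any cheaper table entry is raised to 8.
int sse_int_move_cost(const HardRegisterCosts& c, bool from_sse) {
    return std::max(8, from_sse ? c.sse_to_integer : c.integer_to_sse);
}
```

With a table entry of 2 the function still returns 8, so lowering a tuning value below 8 would have no effect on this path.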
[Bug libstdc++/91486] future::wait_for and shared_timed_mutex::wait_for do not work properly with float duration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91486 Jonathan Wakely changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2019-08-19 Ever confirmed|0 |1 --- Comment #4 from Jonathan Wakely --- Yes that probably makes sense to do.
[Bug c++/91484] Error message: std::is_constructible with incomplete types.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91484 Jonathan Wakely changed: What|Removed |Added Status|UNCONFIRMED |WAITING Last reconfirmed||2019-08-19 Ever confirmed|0 |1 --- Comment #1 from Jonathan Wakely --- Please read https://gcc.gnu.org/bugs (as requested when creating a new bug) and provide the required information.
[Bug fortran/91426] Different colors for errors with multiple locations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91426 --- Comment #4 from David Malcolm --- (In reply to Thomas Koenig from comment #2) > Having had occasion to look at a few hundred multi-line error messages > today, I have now changed my mind on what I would consider best :-) > > I now think different colors for primary and secondary error message > (if we stick with a maximum of two) is actually quite good. > > What would help a lot if the markers (1) and (2) in > > Error: Duplicate statement label 10 at (1) and (2) > > were also colored the same as the two markers under the > text of the program. > > Would this be doable / would others also find that useful? The patch I've just attached ought to do this (though it's just a crude prototype - it only works for the gfc_error_opt case). With that caveat, how does the output look?
[Bug fortran/91426] Different colors for errors with multiple locations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91426 --- Comment #3 from David Malcolm --- Created attachment 46732 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46732&action=edit Prototype patch to colorize the (1) and (2) in the example given
[Bug ada/91492] New: [10 regression] Ada documentation issue starting with r274637
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91492 Bug ID: 91492 Summary: [10 regression] Ada documentation issue starting with r274637 Product: gcc Version: 10.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: ada Assignee: unassigned at gcc dot gnu.org Reporter: seurer at gcc dot gnu.org Target Milestone: --- FAIL: compiler driver --help=ada option(s): "^ +-.*[^:.]$" absent from output: " -fdump-scos Dump Source Coverage Obligations" Did a test case not get updated with the recent Ada changes?
[Bug tree-optimization/91491] New: [9 Regression] glib2.0 build not working when built with -O2 on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91491 Bug ID: 91491 Summary: [9 Regression] glib2.0 build not working when built with -O2 on x86_64-linux-gnu Product: gcc Version: 9.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: doko at debian dot org Target Milestone: --- Created attachment 46731 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46731&action=edit preprocessed source [forwarded from https://bugs.debian.org/931921] Seen with 9.2.0 on x86_64-linux-gnu, works with GCC 8, or with GCC 9 -O1. Further tracked down to: On Fri, 02 Aug 2019 at 19:49:20 +0100, Simon McVittie wrote: > If you compile test_run_seed() with -O1, and the rest of gtestutils.c > with -O2, the clutter test hangs. Binary-searching through the extra optimizations enabled by -O2 [1] led me to the minimal change being: if you modify test_run_seed() to add __attribute__((optimize("no-tree-pre"))) then the clutter test passes. Without that attribute it fails. Attaching the preprocessed gtestutils source.
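A sketch of how such a per-function workaround is spelled. The macro name, function name and body are placeholders and not the glib code; the `optimize` attribute is GCC-specific, hence the guard.

```cpp
// Disable tree PRE for a single function while the rest of the
// translation unit keeps its -O2 settings. GCC-only attribute.
#if defined(__GNUC__) && !defined(__clang__)
#  define NO_TREE_PRE __attribute__((optimize("no-tree-pre")))
#else
#  define NO_TREE_PRE
#endif

// Placeholder standing in for the affected function.
NO_TREE_PRE int seed_checksum(const char* s) {
    int h = 0;
    for (; *s != '\0'; ++s)
        h = h * 31 + *s;
    return h;
}
```

This only narrows down which pass is involved; it is a diagnostic aid rather than a fix for the underlying miscompilation.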
[Bug ipa/89330] IPA inliner touches released cgraph_edges
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89330 Martin Jambor changed: What|Removed |Added Status|REOPENED|RESOLVED Resolution|--- |FIXED --- Comment #23 from Martin Jambor --- (In reply to Martin Liška from comment #22) > Should be fixed now. So let's assume it is.
[Bug c++/79817] GCC does not recognize [[deprecated]] attribute for namespace
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79817 Marek Polacek changed: What|Removed |Added CC||mpolacek at gcc dot gnu.org --- Comment #3 from Marek Polacek --- This is relevant to range-v3 which renamed the "view" namespace to "views", following the PascalCase -> snake_case change.
[Bug target/91386] [9 regression] open-iscsi iscsiadm miscompiled by LTO on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91386 Richard Earnshaw changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED Target Milestone|--- |9.3 --- Comment #24 from Richard Earnshaw --- Fixed for gcc-9.3.
[Bug target/91386] [9 regression] open-iscsi iscsiadm miscompiled by LTO on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91386 --- Comment #23 from Richard Earnshaw --- Author: rearnsha Date: Mon Aug 19 16:11:30 2019 New Revision: 274675 URL: https://gcc.gnu.org/viewcvs?rev=274675&root=gcc&view=rev Log: [aarch64] PR target/91386 Use copy_rtx to avoid modifying original insns in peep2 pattern PR target/91386 is a situation where a peephole2 pattern substitution is discarded late because the selected instructions contain frame-related notes that we cannot redistribute (because the pattern has more than one insn in the output). Unfortunately, the original insns were being modified during the generation, so after the undo we are left with corrupt RTL. We avoid this by ensuring that the modifications are always made on a copy, so that the original insns are never changed. Backport from mainline 2019-09-09 Richard Earnshaw PR target/91386 * config/aarch64/aarch64.c (aarch64_gen_adjusted_ldpstp): Use copy_rtx to preserve the contents of the original insns. Modified: branches/gcc-9-branch/gcc/ChangeLog branches/gcc-9-branch/gcc/config/aarch64/aarch64.c
[Bug libstdc++/91067] [9/10 Regression] Clang compiler can't link executable if std::filesystem::directory_iterator is encountered
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91067 --- Comment #17 from Viktor Ostashevskyi --- OK, I got the following today with GCC 9.2 using "-O2 -fno-inline -flto=20": ld.bfd: /tmp/tests.oKru4z.ltrans32.ltrans.o: in function `std::__shared_ptr::operator=(std::__shared_ptr&&)': c++/9.2.0/bits/shared_ptr_base.h:1265: undefined reference to `std::__shared_ptr::__shared_ptr(std::__shared_ptr&&)' The code base is huge, so it is hard to minimize a test case. I am not even sure whether LTO or libstdc++ is at fault.
[Bug middle-end/91490] New: [9/10 Regression] bogus argument missing terminating nul warning on strlen of a flexible array member
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91490

            Bug ID: 91490
           Summary: [9/10 Regression] bogus argument missing terminating
                    nul warning on strlen of a flexible array member
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: msebor at gcc dot gnu.org
  Target Milestone: ---

The strlen call in f() below compiles with no warning and is successfully folded to a constant, but the equivalent call in g() triggers two spurious instances of the same warning and is not folded. The warning is new in GCC 9 so GCC 8 compiles both functions without one, and folds neither call.

$ cat a.c && gcc -O2 -S -Wall -Wextra -fdump-tree-optimized=/dev/stdout a.c
struct A { char n, s[]; };

const struct A a1 = { 3, "321" };

int f (void)
{
  return __builtin_strlen (a1.s);   // no warning, folded to 3
}

const struct A a2 = { 3, { 3, 2, 1, 0 } };

int g (void)
{
  return __builtin_strlen (a2.s);   // bogus warning, not folded
}
a.c: In function ‘g’:
a.c:14:30: warning: ‘strlen’ argument missing terminating nul [-Wstringop-overflow=]
   14 |   return __builtin_strlen (a2.s);   // bogus warning, not folded
      |                            ~~^~
a.c:10:16: note: referenced argument declared here
   10 | const struct A a2 = { 3, { 3, 2, 1, 0 } };
      |                ^~
a.c:14:30: warning: ‘strlen’ argument missing terminating nul [-Wstringop-overflow=]
   14 |   return __builtin_strlen (a2.s);   // bogus warning, not folded
      |                            ~~^~
a.c:10:16: note: referenced argument declared here
   10 | const struct A a2 = { 3, { 3, 2, 1, 0 } };
      |                ^~

;; Function f (f, funcdef_no=0, decl_uid=1910, cgraph_uid=1, symbol_order=1)

f ()
{
  [local count: 1073741824]:
  return 3;

}


;; Function g (g, funcdef_no=1, decl_uid=1914, cgraph_uid=2, symbol_order=3)

g ()
{
  long unsigned int _1;
  int _3;

  [local count: 1073741824]:
  _1 = __builtin_strlen (&a2.s);
  _3 = (int) _1;
  return _3;

}
[Bug tree-optimization/37242] missed FRE opportunity because of signedness of addition
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37242

--- Comment #26 from Richard Biener ---
      /* Match arithmetic done in a different type where we can easily
         substitute the result from some earlier sign-changed or widened
         operation.  */
      if (INTEGRAL_TYPE_P (type)
          && TREE_CODE (rhs1) == SSA_NAME
          /* We only handle sign-changes or zero-extension -> & mask.  */

this is sign-extension... I think if the inner op has undefined overflow we can widen it(?). That fixes the new testcase.

Index: gcc/tree-ssa-sccvn.c
===================================================================
--- gcc/tree-ssa-sccvn.c        (revision 274670)
+++ gcc/tree-ssa-sccvn.c        (working copy)
@@ -4312,8 +4312,11 @@ visit_nary_op (tree lhs, gassign *stmt)
         operation.  */
       if (INTEGRAL_TYPE_P (type)
          && TREE_CODE (rhs1) == SSA_NAME
-         /* We only handle sign-changes or zero-extension -> & mask.  */
-         && ((TYPE_UNSIGNED (TREE_TYPE (rhs1))
+         /* We only handle sign-changes, zero-extension -> & mask or
+            sign-extension if we know the inner operation doesn't
+            overflow.  */
+         && (((TYPE_UNSIGNED (TREE_TYPE (rhs1))
+               || TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (rhs1)))
               && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (rhs1)))
              || TYPE_PRECISION (type) == TYPE_PRECISION (TREE_TYPE (rhs1))))
        {
@@ -4347,7 +4350,8 @@ visit_nary_op (tree lhs, gassign *stmt)
                    {
                      unsigned lhs_prec = TYPE_PRECISION (type);
                      unsigned rhs_prec = TYPE_PRECISION (TREE_TYPE (rhs1));
-                     if (lhs_prec == rhs_prec)
+                     if (lhs_prec == rhs_prec
+                         || TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (rhs1)))
                        {
                          gimple_match_op match_op (gimple_match_cond::UNCOND,
                                                    NOP_EXPR, type, ops[0]);

For the benchmark, PRE inserts code that is redundant in this way:

Found partial redundancy for expression {nop_expr,maxIdx_31} (0012)
Inserted _72 = (long unsigned int) maxIdx_42; in predecessor 4 (0052)

but late FRE can get rid of it again.
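The distinction drawn in the comment above can be checked with plain integer arithmetic. This is only an illustration of the identities involved, not of the value-numbering code itself; the function names are made up for the example.

```cpp
#include <cstdint>

// Unsigned (wrapping) inner add: widening first and masking afterwards
// reproduces the narrow result. This is the "zero-extension -> & mask" case.
bool zero_ext_matches_with_mask(std::uint32_t a) {
    std::uint64_t narrow_then_widen =
        static_cast<std::uint64_t>(static_cast<std::uint32_t>(a + 1u));
    std::uint64_t widen_then_add = static_cast<std::uint64_t>(a) + 1u;
    return narrow_then_widen == (widen_then_add & 0xffffffffu);
}

// Signed inner add: provided a + 1 does not overflow (which the language
// leaves undefined anyway), sign-extension commutes with the addition and
// no mask is needed. Precondition: a != INT32_MAX.
bool sign_ext_commutes(std::int32_t a) {
    std::int64_t narrow_then_widen = static_cast<std::int64_t>(a + 1);
    std::int64_t widen_then_add = static_cast<std::int64_t>(a) + 1;
    return narrow_then_widen == widen_then_add;
}
```

For `a == UINT32_MAX` the unmasked wide sum is 4294967296 while the narrow sum wraps to 0, which is why the unsigned case needs the mask and the signed case can rely on overflow being undefined.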
[Bug tree-optimization/91403] GCC fails with ICE.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91403 --- Comment #5 from Richard Biener --- Author: rguenth Date: Mon Aug 19 14:45:38 2019 New Revision: 274672 URL: https://gcc.gnu.org/viewcvs?rev=274672&root=gcc&view=rev Log: 2019-08-19 Richard Biener PR tree-optimization/91403 * tree-scalar-evolution.c (follow_ssa_edge_binary): Inline cases we can handle with tail-recursion... (follow_ssa_edge_expr): ... here. Do so. Modified: trunk/gcc/ChangeLog trunk/gcc/tree-scalar-evolution.c
[Bug c++/85125] constant expression with const_cast UB does not emit error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85125 Marek Polacek changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #6 from Marek Polacek --- Done for 10.1 via: Author: mpolacek Date: Mon Aug 19 13:59:13 2019 New Revision: 274671 URL: https://gcc.gnu.org/viewcvs?rev=274671&root=gcc&view=rev Log: PR c++/91264 - detect modifying const objects in constexpr. * constexpr.c (modifying_const_object_error): New function. (cxx_eval_call_expression): Set TREE_READONLY on a CONSTRUCTOR of a const-qualified object after it's been fully constructed. (modifying_const_object_p): New function. (cxx_eval_store_expression): Detect modifying a const object during constant expression evaluation. (cxx_eval_increment_expression): Use a better location when building up the store. (cxx_eval_constant_expression) : Mark a constant object's constructor TREE_READONLY. * g++.dg/cpp1y/constexpr-tracking-const1.C: New test. * g++.dg/cpp1y/constexpr-tracking-const2.C: New test. * g++.dg/cpp1y/constexpr-tracking-const3.C: New test. * g++.dg/cpp1y/constexpr-tracking-const4.C: New test. * g++.dg/cpp1y/constexpr-tracking-const5.C: New test. * g++.dg/cpp1y/constexpr-tracking-const6.C: New test. * g++.dg/cpp1y/constexpr-tracking-const7.C: New test. * g++.dg/cpp1y/constexpr-tracking-const8.C: New test. * g++.dg/cpp1y/constexpr-tracking-const9.C: New test. * g++.dg/cpp1y/constexpr-tracking-const10.C: New test. * g++.dg/cpp1y/constexpr-tracking-const11.C: New test. * g++.dg/cpp1y/constexpr-tracking-const12.C: New test. * g++.dg/cpp1y/constexpr-tracking-const13.C: New test. * g++.dg/cpp1y/constexpr-tracking-const14.C: New test. 
Added: trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const1.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const10.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const11.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const12.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const13.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const14.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const2.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const3.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const4.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const5.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const6.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const7.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const8.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const9.C Modified: trunk/gcc/cp/ChangeLog trunk/gcc/cp/constexpr.c trunk/gcc/testsuite/ChangeLog
[Bug c++/91264] modifying const-qual object in constexpr context not detected
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91264 Marek Polacek changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #6 from Marek Polacek --- Implemented in GCC 10.1.
[Bug c++/91264] modifying const-qual object in constexpr context not detected
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91264 --- Comment #5 from Marek Polacek --- Author: mpolacek Date: Mon Aug 19 13:59:13 2019 New Revision: 274671 URL: https://gcc.gnu.org/viewcvs?rev=274671&root=gcc&view=rev Log: PR c++/91264 - detect modifying const objects in constexpr. * constexpr.c (modifying_const_object_error): New function. (cxx_eval_call_expression): Set TREE_READONLY on a CONSTRUCTOR of a const-qualified object after it's been fully constructed. (modifying_const_object_p): New function. (cxx_eval_store_expression): Detect modifying a const object during constant expression evaluation. (cxx_eval_increment_expression): Use a better location when building up the store. (cxx_eval_constant_expression) : Mark a constant object's constructor TREE_READONLY. * g++.dg/cpp1y/constexpr-tracking-const1.C: New test. * g++.dg/cpp1y/constexpr-tracking-const2.C: New test. * g++.dg/cpp1y/constexpr-tracking-const3.C: New test. * g++.dg/cpp1y/constexpr-tracking-const4.C: New test. * g++.dg/cpp1y/constexpr-tracking-const5.C: New test. * g++.dg/cpp1y/constexpr-tracking-const6.C: New test. * g++.dg/cpp1y/constexpr-tracking-const7.C: New test. * g++.dg/cpp1y/constexpr-tracking-const8.C: New test. * g++.dg/cpp1y/constexpr-tracking-const9.C: New test. * g++.dg/cpp1y/constexpr-tracking-const10.C: New test. * g++.dg/cpp1y/constexpr-tracking-const11.C: New test. * g++.dg/cpp1y/constexpr-tracking-const12.C: New test. * g++.dg/cpp1y/constexpr-tracking-const13.C: New test. * g++.dg/cpp1y/constexpr-tracking-const14.C: New test. 
Added: trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const1.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const10.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const11.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const12.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const13.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const14.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const2.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const3.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const4.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const5.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const6.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const7.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const8.C trunk/gcc/testsuite/g++.dg/cpp1y/constexpr-tracking-const9.C Modified: trunk/gcc/cp/ChangeLog trunk/gcc/cp/constexpr.c trunk/gcc/testsuite/ChangeLog
[Bug libstdc++/91486] future::wait_for and shared_timed_mutex::wait_for do not work properly with float duration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91486 --- Comment #3 from John Salmon --- I grep'ed the latest devel source tree (git sha: afadff66) for occurrences of now\(\). The same bug appears several times in include/experimental/io_context and include/experimental/timer. The underlying problem is that operator+(time_point, duration) has well-defined but surprising and error-prone semantics when duration's Rep is float. Maybe it would be better to define a less error-prone helper function, e.g., template time_point __timepoint_plus_duration(const time_point&, duration&); and to use it consistently whenever adding a time_point to a duration?
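One possible reading of the suggested helper, as a sketch only. The name follows the comment but drops the reserved leading underscores; it is hypothetical and not part of libstdc++, and rounding up with `std::chrono::ceil` (C++17) is an assumption about the intended behaviour.

```cpp
#include <chrono>

// Hypothetical helper: round the duration up to the time_point's own
// duration type before adding, so the result keeps the clock's native
// representation instead of being promoted to a floating-point one.
template <class Clock, class Dur, class Rep, class Period>
std::chrono::time_point<Clock, Dur>
timepoint_plus_duration(const std::chrono::time_point<Clock, Dur>& tp,
                        const std::chrono::duration<Rep, Period>& d) {
    return tp + std::chrono::ceil<Dur>(d);
}
```

By contrast, a plain `tp + d` with a `float` rep yields a `time_point` whose duration type is the common type of the two operands, that is, a floating-point duration, which is the well-defined but surprising behaviour described above.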
[Bug rtl-optimization/91347] [7/8/9/10 Regression] pointer_string in linux vsprintf.c is miscompiled when sibling calls are optimized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91347 Eric Botcazou changed: What|Removed |Added Attachment #46728|0 |1 is obsolete|| --- Comment #17 from Eric Botcazou --- Created attachment 46730 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46730&action=edit Updated execution test
[Bug rtl-optimization/91347] [7/8/9/10 Regression] pointer_string in linux vsprintf.c is miscompiled when sibling calls are optimized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91347 --- Comment #16 from Eric Botcazou --- > Yes. It aborts with current gcc-9.2.1 and passes with patched gcc-10. > -fno-inline is needed. Thanks. Let's drop the -fno-inline and put it in gcc.c-torture/execute.
asking for __attribute__((aligned())) clarification
All,

this is my first post on these lists, so please bear with me.

My question is about gcc's __attribute__((aligned())). Please consider the following code:

#include <stdint.h>

typedef uint32_t uuint32_t __attribute__((aligned(1)));

uint32_t getuuint32(uint8_t p[]) {
    return *(uuint32_t*)p;
}

This is meant to prevent gcc from producing hard faults/address errors on architectures that do not support unaligned access to shorts/ints (e.g. some ARMs, some m68k). On these architectures, gcc is supposed to replace the 32 bit access with a series of 8 or 16 bit accesses.

I originally came from gcc 4.6.4 (yes, pretty old), where this did not work: gcc did not respect the aligned(1) attribute in its code generation (i.e. it generated a 'normal' pointer dereference, which consequently crashed when the code executed). To be fair, it is my understanding that the gcc manuals never promised this *would* work.

As far as I can tell, the documentation has not really changed regarding lowering alignment for variables, and it does not appear to say anything specific about pointer dereference on single, misaligned variables. I was therefore pretty astonished to see this working on newer gcc versions (tried 6.2 and 7.4). gcc even appears to know the differences within an architecture (68000 generates a bytewise copy while ColdFire, which supports unaligned access, copies a 32 bit value).

My question: is this now intended behaviour we can rely on? If yes, can we have the documentation updated to clearly state that this use case is valid?

Thank you.
Markus
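Independent of how the aligned(1) question is answered, a commonly used portable way to express an unaligned load is to `memcpy` into a local variable; compilers typically lower this to whatever access sequence the target allows. A minimal sketch, with an illustrative function name that does not come from the post:

```cpp
#include <cstdint>
#include <cstring>

// Read a 32-bit value in native byte order from a possibly misaligned
// address without ever forming a misaligned uint32_t pointer.
std::uint32_t load_u32_unaligned(const std::uint8_t* p) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}
```

This does not depend on any attribute semantics, at the cost of being slightly more verbose than the typedef approach.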
[Bug rtl-optimization/91154] [10 Regression] 456.hmmer regression on Haswell caused by r272922
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91154 --- Comment #30 from Richard Biener --- (In reply to Richard Biener from comment #29) > (In reply to Uroš Bizjak from comment #27) > > (In reply to rguent...@suse.de from comment #25) > > > and STV converting single-instruction 'chains': > > > > > > Collected chain #40... > > > insns: 381 > > > defs to convert: r463, r465 > > > Computing gain for chain #40... > > > Instruction gain 8 for 381: > > > {r465:SI=smin(r463:SI,[`numBins']);clobber > > > flags:CC;} > > > REG_DEAD r463:SI > > > REG_UNUSED flags:CC > > > Instruction conversion gain: 8 > > > Registers conversion cost: 4 > > > Total gain: 4 > > > Converting chain #40... > > > > Is this in STV1 pass? This (pre-combine) pass should be enabled only for > > TImode conversion, a semi-hack where 64bit targets convert memory access to > > TImode. General STV should not be ran before combine. > > Yes, this is STV1. My patch to enable SImode and DImode chains didn't change > where the pass runs or enable the 2nd run out of compile-time concerns. > > Indeed changing this fixes the issue. I'm going to benchmark it on > 300.twolf. Hmm, it regresses the gcc.target/i386/minmax-6.c though and thus cactusADM (IIRC).
[Bug lto/91478] FAIL: gcc.dg/debug/pr41893-1.c -gdwarf-2 -g1 (test for excess errors)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91478 --- Comment #2 from dave.anglin at bell dot net --- On 2019-08-19 2:51 a.m., rguenth at gcc dot gnu.org wrote: > Is this a new failure, thus can it be bisected somehow? New. I can say at this point that r273635 was okay. There was a testsuite problem with r274010 but it appears that it might have been okay as well.
[Bug rtl-optimization/91347] [7/8/9/10 Regression] pointer_string in linux vsprintf.c is miscompiled when sibling calls are optimized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91347 --- Comment #15 from dave.anglin at bell dot net --- On 2019-08-19 4:35 a.m., ebotcazou at gcc dot gnu.org wrote: > OK, thanks. Can you check that the testcase to be attached is a valid > execution test for trunk when compiled with -O2 -fno-inline ? Yes. It aborts with current gcc-9.2.1 and passes with patched gcc-10. -fno-inline is needed.
[Bug rtl-optimization/91154] [10 Regression] 456.hmmer regression on Haswell caused by r272922
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91154 --- Comment #29 from Richard Biener --- (In reply to Uroš Bizjak from comment #27) > (In reply to rguent...@suse.de from comment #25) > > and STV converting single-instruction 'chains': > > > > Collected chain #40... > > insns: 381 > > defs to convert: r463, r465 > > Computing gain for chain #40... > > Instruction gain 8 for 381: {r465:SI=smin(r463:SI,[`numBins']);clobber > > flags:CC;} > > REG_DEAD r463:SI > > REG_UNUSED flags:CC > > Instruction conversion gain: 8 > > Registers conversion cost: 4 > > Total gain: 4 > > Converting chain #40... > > Is this in STV1 pass? This (pre-combine) pass should be enabled only for > TImode conversion, a semi-hack where 64bit targets convert memory access to > TImode. General STV should not be ran before combine. Yes, this is STV1. My patch to enable SImode and DImode chains didn't change where the pass runs or enable the 2nd run out of compile-time concerns. Indeed changing this fixes the issue. I'm going to benchmark it on 300.twolf. > > to me the "spill" to (%rsp) looks suspicious and even more so > > the vector(!) memory use in vpminsd. RA could have used > > > > movd %eax, %xmm1 > > vpminsd %xmm1, %xmm0, %xmm1 > > > > no? IRA allocates the pseudo to memory. Testcase: > > This is how IRA handles subregs. Please note, that the memory is correctly > aligned, so vector load does not trip alignment trap. However, on x86 this > approach triggers store forwarding stall.
[Bug rtl-optimization/91154] [10 Regression] 456.hmmer regression on Haswell caused by r272922
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91154

--- Comment #28 from Uroš Bizjak ---
(In reply to Richard Biener from comment #26)
> This is the power of simplify_subreg, I guess. We're lucky it doesn't do
> this to arbitrary arithmetic.
>
> So we really need to change all defs we introduce to vector modes instead of
> making our life easy and using paradoxical subregs all over the place.

No, IMO IRA should be "fixed" to avoid the stack temporary and (based on some cost metric) use a direct move for paradoxical subregs.
[Bug rtl-optimization/91154] [10 Regression] 456.hmmer regression on Haswell caused by r272922
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91154 --- Comment #27 from Uroš Bizjak --- (In reply to rguent...@suse.de from comment #25) > On Sat, 17 Aug 2019, ubizjak at gmail dot com wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91154 > > > > --- Comment #24 from Uroš Bizjak --- > > It looks that the patch introduced a (small?) runtime regression of 5% in > > SPEC2000 300.twolf on haswell [1]. Maybe worth looking at. > > Biggest changes when benchmarking -mno-stv (base) against -mstv (peak): > >7.28% 37789 twolf_peak.none twolf_peak.none [.] ucxx2 >4.21% 25709 twolf_base.none twolf_base.none [.] ucxx2 >3.72% 22584 twolf_base.none twolf_base.none [.] new_dbox >2.48% 22364 twolf_peak.none twolf_peak.none [.] new_dbox >1.49% 8270 twolf_base.none twolf_base.none [.] sub_penal >1.12% 7576 twolf_peak.none twolf_peak.none [.] sub_penal >1.36% 9314 twolf_peak.none twolf_peak.none [.] > old_assgnto_new2 >1.11% 5257 twolf_base.none twolf_base.none [.] > old_assgnto_new2 > > and in ucxx2 I see > > 0.17 │ mov%eax,(%rsp) > 3.55 │ vpmins (%rsp),%xmm0,%xmm1 >│ test %eax,%eax > 0.22 │ vmovd %xmm1,%r8d > 0.80 │ cmovs %esi,%r8d > > This is from code like > > a1LoBin = Trybin/binWidth < 0 ? 0 : (Trybin>numBins ? numBins : Trybin) > > with only the inner one recognized as MIN because 'numBins' is only > ever loaded conditionally and we don't speculate it. So we expand > from > > _41 = _40 / binWidth.15_36; > if (_41 >= 0) > goto ; [59.00%] > else > goto ; [41.00%] > > bb5: > numBins.26_42 = numBins; > iftmp.19_315 = MIN_EXPR <_41, numBins.26_42>; > > bb6: > # iftmp.19_267 = PHI > > ending up with > > movl%r9d, %eax > cltd > idivl %ecx > movl%eax, (%rsp) > vpminsd (%rsp), %xmm0, %xmm1 > testl %eax, %eax > vmovd %xmm1, %r11d > cmovs %esi, %r11d > > and STV converting single-instruction 'chains': > > Collected chain #40... > insns: 381 > defs to convert: r463, r465 > Computing gain for chain #40... 
> Instruction gain 8 for 381: {r465:SI=smin(r463:SI,[`numBins']);clobber
> flags:CC;}
> REG_DEAD r463:SI
> REG_UNUSED flags:CC
> Instruction conversion gain: 8
> Registers conversion cost: 4
> Total gain: 4
> Converting chain #40...

Is this in STV1 pass? This (pre-combine) pass should be enabled only for TImode conversion, a semi-hack where 64bit targets convert memory access to TImode. General STV should not be run before combine.

> to me the "spill" to (%rsp) looks suspicious and even more so
> the vector(!) memory use in vpminsd. RA could have used
>
> movd %eax, %xmm1
> vpminsd %xmm1, %xmm0, %xmm1
>
> no? IRA allocates the pseudo to memory. Testcase:

This is how IRA handles subregs. Please note that the memory is correctly aligned, so the vector load does not trip an alignment trap. However, on x86 this approach triggers a store-forwarding stall.
[Bug rtl-optimization/91154] [10 Regression] 456.hmmer regression on Haswell caused by r272922
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91154

--- Comment #26 from Richard Biener ---
This is the power of simplify_subreg, I guess. We're lucky it doesn't do this to arbitrary arithmetic.

So we really need to change all defs we introduce to vector modes instead of making our life easy and using paradoxical subregs all over the place.
[Bug c/91489] New: misplaced stack pointer when __ms_hook_prologue__ attribute is used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91489

            Bug ID: 91489
           Summary: misplaced stack pointer when __ms_hook_prologue__
                    attribute is used
           Product: gcc
           Version: 9.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gofmanp at gmail dot com
  Target Milestone: ---

Created attachment 46729
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46729&action=edit
Preprocessed test program (gcc -v -save-temps -m32 -O2 ./file.c)

gcc -v:
Using built-in specs.
COLLECT_GCC=/usr/bin/gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/9/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 9.1.1 20190503 (Red Hat 9.1.1-1) (GCC)

OS: Linux, Fedora 30 (x86_64)

The following test program crashes with SEGFAULT (on return from second call to test_func()) when compiled as 'gcc -m32 -O2':

    #include <stdio.h>

    unsigned int __attribute__ ((noinline)) __attribute__((__stdcall__))
    __attribute__((__ms_hook_prologue__))
    test_func( unsigned long *size )
    {
        static int once;

        if (once++ == 0)
            printf("(%p): stub\n", size);
        return 1;
    }

    int main(int argc, char **argv)
    {
        printf("%#x.\n", test_func(NULL));
        printf("%#x.\n", test_func(NULL));
    }

The stack pointer is wrong in one of the code paths in test_func(). Here is the snippet from 'objdump -d a.out':

 80491e0:  8b ff                 mov    %edi,%edi
 80491e2:  55                    push   %ebp
 80491e3:  8b ec                 mov    %esp,%ebp
 80491e5:  a1 1c c0 04 08        mov    0x804c01c,%eax
 80491ea:  8d 50 01              lea    0x1(%eax),%edx
 80491ed:  89 15 1c c0 04 08     mov    %edx,0x804c01c
 80491f3:  85 c0                 test   %eax,%eax
 80491f5:  74 09                 je     8049200
 80491f7:  b8 01 00 00 00        mov    $0x1,%eax
 ; the stack pointer is wrong here, need 'leave' or equivalent
 80491fc:  c2 04 00              ret    $0x4
 80491ff:  90                    nop
 8049200:  5d                    pop    %ebp
 8049201:  83 ec 14              sub    $0x14,%esp
 8049204:  ff 74 24 18           pushl  0x18(%esp)
 8049208:  68 0c a0 04 08        push   $0x804a00c
 804920d:  e8 2e fe ff ff        call   8049040
 8049212:  b8 01 00 00 00        mov    $0x1,%eax
 8049217:  83 c4 1c              add    $0x1c,%esp
 804921a:  c2 04 00              ret    $0x4

The problem does not occur without __attribute__((__ms_hook_prologue__)) (no stack frame is generated in that case), or without the -O2 compiler flag.

The problem originates from here: https://bugs.winehq.org/show_bug.cgi?id=47633
[Bug rtl-optimization/91154] [10 Regression] 456.hmmer regression on Haswell caused by r272922
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91154

--- Comment #25 from rguenther at suse dot de ---
On Sat, 17 Aug 2019, ubizjak at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91154
>
> --- Comment #24 from Uroš Bizjak ---
> It looks that the patch introduced a (small?) runtime regression of 5% in
> SPEC2000 300.twolf on haswell [1]. Maybe worth looking at.

Biggest changes when benchmarking -mno-stv (base) against -mstv (peak):

   7.28%  37789  twolf_peak.none  twolf_peak.none  [.] ucxx2
   4.21%  25709  twolf_base.none  twolf_base.none  [.] ucxx2
   3.72%  22584  twolf_base.none  twolf_base.none  [.] new_dbox
   2.48%  22364  twolf_peak.none  twolf_peak.none  [.] new_dbox
   1.49%   8270  twolf_base.none  twolf_base.none  [.] sub_penal
   1.12%   7576  twolf_peak.none  twolf_peak.none  [.] sub_penal
   1.36%   9314  twolf_peak.none  twolf_peak.none  [.] old_assgnto_new2
   1.11%   5257  twolf_base.none  twolf_base.none  [.] old_assgnto_new2

and in ucxx2 I see

  0.17 │ mov    %eax,(%rsp)
  3.55 │ vpmins (%rsp),%xmm0,%xmm1
       │ test   %eax,%eax
  0.22 │ vmovd  %xmm1,%r8d
  0.80 │ cmovs  %esi,%r8d

This is from code like

  a1LoBin = Trybin/binWidth < 0 ? 0 : (Trybin>numBins ? numBins : Trybin)

with only the inner one recognized as MIN because 'numBins' is only ever loaded conditionally and we don't speculate it. So we expand from

  _41 = _40 / binWidth.15_36;
  if (_41 >= 0)
    goto ; [59.00%]
  else
    goto ; [41.00%]

bb5:
  numBins.26_42 = numBins;
  iftmp.19_315 = MIN_EXPR <_41, numBins.26_42>;

bb6:
  # iftmp.19_267 = PHI

ending up with

        movl    %r9d, %eax
        cltd
        idivl   %ecx
        movl    %eax, (%rsp)
        vpminsd (%rsp), %xmm0, %xmm1
        testl   %eax, %eax
        vmovd   %xmm1, %r11d
        cmovs   %esi, %r11d

and STV converting single-instruction 'chains':

Collected chain #40...
  insns: 381
  defs to convert: r463, r465
Computing gain for chain #40...
  Instruction gain 8 for 381: {r465:SI=smin(r463:SI,[`numBins']);clobber flags:CC;}
      REG_DEAD r463:SI
      REG_UNUSED flags:CC
  Instruction conversion gain: 8
  Registers conversion cost: 4
  Total gain: 4
Converting chain #40...

to me the "spill" to (%rsp) looks suspicious and even more so the vector(!) memory use in vpminsd. RA could have used

        movd    %eax, %xmm1
        vpminsd %xmm1, %xmm0, %xmm1

no? IRA allocates the pseudo to memory. Testcase:

extern int numBins;
extern int binOffst;
extern int binWidth;
extern int Trybin;
void foo (int);

void bar (int aleft, int axcenter)
{
  int a1LoBin = (((Trybin=((axcenter + aleft)-binOffst)/binWidth)<0) ? 0 : ((Trybin>numBins) ? numBins : Trybin));
  foo (a1LoBin);
}

STV had emitted

(insn 10 9 38 2 (parallel [
            (set (reg:SI 93)
                (div:SI (reg:SI 92)
...
(insn 38 10 12 2 (set (subreg:V4SI (reg:SI 98) 0)
        (vec_merge:V4SI (vec_duplicate:V4SI (reg:SI 93))
            (const_vector:V4SI [
                    (const_int 0 [0]) repeated x4
                ])
            (const_int 1 [0x1]))) "t.c":9:56 -1
     (nil))
...
(insn 39 31 32 2 (set (reg:SI 99)
        (mem/c:SI (symbol_ref:DI ("numBins") [flags 0x40] ) [1 numBins+0 S4 A32])) "t.c":9:75 -1
     (nil))
(insn 32 39 37 2 (set (subreg:V4SI (reg:SI 95) 0)
        (smin:V4SI (subreg:V4SI (reg:SI 98) 0)
            (subreg:V4SI (reg:SI 99) 0))) "t.c":9:75 3657 {*sse4_1_sminv4si3}
     (nil))

but then combine eliminated the copy...

Trying 38 -> 32:
   38: r98:SI#0=vec_merge(vec_duplicate(r93:SI),const_vector,0x1)
   32: r95:SI#0=smin(r98:SI#0,r99:SI#0)
      REG_DEAD r99:SI
      REG_DEAD r98:SI
Successfully matched this instruction:
(set (subreg:V4SI (reg:SI 95) 0)
    (smin:V4SI (subreg:V4SI (reg:SI 93) 0)
        (subreg:V4SI (reg:SI 99 [ numBins ]) 0)))
allowing combination of insns 38 and 32
original costs 4 + 40 = 44
replacement cost 40

...running into the issue I tried to fix with making the live-range split copy more "explicit" ...

So it looks like STV "forgot" to convert 'reg:SI 99' which is a memory load it split out. But even when fixing that combine now forwards both ops, eliminating the LR split again :/

Disabling combine results in the expected variant (not going through the stack):

        idivl   binWidth(%rip)
        vmovd   %eax, %xmm0
        testl   %eax, %eax
        movl    %eax, Trybin(%rip)
        vpminsd %xmm1, %xmm0, %xmm0
        vmovd   %xmm0, %eax
        cmovns  %eax, %edi
        jmp     foo

so while

Index: gcc/config/i386/i386-features.c
===
--- gcc/config/i386/i386-features.c (revision 274666)
+++ gcc/config/i386/i386-features.c (working copy)
@@ -910,7 +910,9 @@ general_scalar_chain::convert_op (rtx *o
     {
       rtx tmp = gen_reg_rtx (GET_MODE (*op));

-      emit_insn_before (gen_
[Bug tree-optimization/91403] GCC fails with ICE.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91403

--- Comment #4 from Richard Biener ---
OK, so I have a patch to fix the recursion depth in SCEV analysis but then we hit the next one in SLSR, in my case because with -O0 there's no tailcall performed but even with -O2 we don't tailcall it.

#8  0x01fbb1d7 in replace_unconditional_candidate (c=0x200a1b60)
    at /space/rguenther/src/svn/trunk2/gcc/gimple-ssa-strength-reduction.c:2223
#9  0x01fbc47a in replace_uncond_cands_and_profitable_phis (
    c=0x200a1b60)
    at /space/rguenther/src/svn/trunk2/gcc/gimple-ssa-strength-reduction.c:2625
#10 0x01fbc4bc in replace_uncond_cands_and_profitable_phis (
    c=0x200a1ae0)
    at /space/rguenther/src/svn/trunk2/gcc/gimple-ssa-strength-reduction.c:2631
...
#599120 0x01fbc4bc in replace_uncond_cands_and_profitable_phis (
    c=0x33ba1b0)
    at /space/rguenther/src/svn/trunk2/gcc/gimple-ssa-strength-reduction.c:2631
2631        replace_uncond_cands_and_profitable_phis (lookup_cand (c->dependent));

given the structure a worklist would be necessary to fix things there.
[Bug rtl-optimization/91347] [7/8/9/10 Regression] pointer_string in linux vsprintf.c is miscompiled when sibling calls are optimized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91347 --- Comment #14 from Sven Schnelle --- I tested the patch with my (previously broken) kernel Build, and the issue seems to be fixed. Thanks!
[Bug ada/65696] ASAN reports global-buffer-overrun for local tagged types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65696

--- Comment #3 from pmderodat at gcc dot gnu.org ---
Author: pmderodat
Date: Mon Aug 19 08:36:39 2019
New Revision: 274654

URL: https://gcc.gnu.org/viewcvs?rev=274654&root=gcc&view=rev
Log:
[Ada] Buffer reading overflow in dispatch table initialization

For tagged types not defined at library level that derive from library level tagged types the compiler may generate code to initialize their dispatch table of predefined primitives copying from the parent type data stored in memory after the dispatch table of the parent; that is, at runtime the initialization of dispatch tables overflows reading the parent dispatch table. This problem does not affect the execution of the program since the target dispatch table always has enough space to store the extra data, and after such copy the compiler generates code to complete the initialization of the dispatch table.

The following test must compile and execute without errors.

package pkg_a is
   type Root is tagged null record;
end pkg_a;

with pkg_a;
procedure main is
   type Derived is new pkg_a.Root with null record;  -- Test
begin
   null;
end main;

Command:
  gnatmake -q main -fsanitize=address; ./main

2019-08-19  Javier Miranda

gcc/ada/

        PR ada/65696
        * exp_atag.ads, exp_atag.adb (Build_Inherit_Predefined_Prims):
        Adding formal to specify how many predefined primitives are
        inherited from the parent type.
        * exp_disp.adb (Number_Of_Predefined_Prims): New subprogram.
        (Make_Secondary_DT): Compute the number of predefined primitives
        of all tagged types (including tagged types not defined at
        library level). Previously we unconditionally relied on the
        Max_Predef_Prims constant value when building the dispatch
        tables of tagged types not defined at library level (thus
        consuming more memory for their dispatch tables than required).
        (Make_DT): Compute the number of predefined primitives that must
        be inherited from their parent type when building the dispatch
        tables of tagged types not defined at library level. Previously
        we unconditionally relied on the Max_Predef_Prims constant value
        when building the dispatch tables of tagged types not defined at
        library level (thus copying more data than required from the
        parent type).

Modified:
    trunk/gcc/ada/ChangeLog
    trunk/gcc/ada/exp_atag.adb
    trunk/gcc/ada/exp_atag.ads
    trunk/gcc/ada/exp_disp.adb
[Bug tree-optimization/37242] missed FRE opportunity because of signedness of addition
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37242

Richard Biener changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
             Status|NEW                          |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org|rguenth at gcc dot gnu.org

--- Comment #25 from Richard Biener ---
I think parts of this were fixed with the fix for PR45397 (r245752). I can't really tell whether the issue in the benchmark is fixed, though. I still see in PRE

   [local count: 1014686024]:
  # top_52 = PHI
  # maxIdx_54 = PHI
  _4 = (long unsigned int) maxIdx_54;
  _5 = _4 * 4;
  _6 = numbers_37(D) + _5;
  _7 = *_6;
  _8 = _4 + 1;
  _9 = _8 * 4;
  _10 = numbers_37(D) + _9;
  _11 = *_10;
  if (_7 < _11)
    goto ; [50.00%]
  else
    goto ; [50.00%]

   [local count: 507343012]:
  maxIdx_42 = maxIdx_54 + 1;
  _68 = (long unsigned int) maxIdx_42;
  _70 = _68 * 4;
  _72 = numbers_37(D) + _70;
  pretmp_74 = *_72;

suggesting it is not fixed in GCC 8 at least. Same with GCC 9 and trunk.

Testcase:

unsigned long a, b;
void foo (int m, int f)
{
  unsigned long tem = (unsigned long)m;
  a = tem + 1;
  if (f)
    {
      int tem2 = m + 1;
      b = (unsigned long)tem2;
    }
}

Note that to value-number the two expressions the same, you need to apply the knowledge that inside if (f), m + 1 cannot overflow.
[Bug rtl-optimization/91347] [7/8/9/10 Regression] pointer_string in linux vsprintf.c is miscompiled when sibling calls are optimized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91347 --- Comment #13 from Eric Botcazou --- Created attachment 46728 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46728&action=edit Execution test
[Bug rtl-optimization/91347] [7/8/9/10 Regression] pointer_string in linux vsprintf.c is miscompiled when sibling calls are optimized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91347 --- Comment #12 from Eric Botcazou --- > hppa-unknown-linux-gnu built successfully with change and there were no test > regressions: > https://gcc.gnu.org/ml/gcc-testresults/2019-08/msg01861.html > > Looks good to me. OK, thanks. Can you check that the testcase to be attached is a valid execution test for trunk when compiled with -O2 -fno-inline ?
[Bug libstdc++/91488] [9/10 Regression] char_traits::length causes "inlining failed in call to always_inline" error with -fgnu-tm -O2 -std=c++17
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91488

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |9.4
[Bug tree-optimization/91482] __builtin_assume_aligned should not break write combining
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91482

Richard Biener changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
           Keywords|                             |missed-optimization
             Status|UNCONFIRMED                  |ASSIGNED
   Last reconfirmed|                             |2019-08-19
          Component|rtl-optimization             |tree-optimization
           Assignee|unassigned at gcc dot gnu.org|rguenth at gcc dot gnu.org
     Ever confirmed|0                            |1

--- Comment #1 from Richard Biener ---
Confirmed. Store-merging rejects it because we get rid of the __builtin_assume_aligned only in fab.
[Bug target/91472] [8/9/10 Regression] gmp testsuite segfaults with gcc-8 and gcc-9, works fine with gcc-7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91472 --- Comment #6 from Eric Botcazou --- I'll have a look.