[Bug ipa/98265] [10/11 Regression] gcc-10 has significantly worse code generated with -O2 compared to -O1 (or gcc-9 -O2) when using the Eigen C++ library
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98265

Jan Hubicka changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #10 from Jan Hubicka ---
Trunk now generates on the unreduced testcase:

        .file   "test.cpp"
        .text
        .p2align 4
        .globl  _Z1f
        .type   _Z1f, @function
_Z1f:
.LFB6287:
        .cfi_startproc
        mulss   %xmm3, %xmm0
        movq    %rdi, %rax
        mulss   %xmm3, %xmm1
        mulss   %xmm3, %xmm2
        movss   %xmm0, (%rdi)
        movss   %xmm1, 4(%rdi)
        movss   %xmm2, 8(%rdi)
        ret
        .cfi_endproc
.LFE6287:
        .size   _Z1f, .-_Z1f
        .ident  "GCC: (GNU) 11.0.1 20210331 (experimental)"
        .section        .note.GNU-stack,"",@progbits
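For readers who do not want to decode the listing: the assembly in comment #10 multiplies three floats by a fourth and stores the results through a hidden return pointer (%rdi, copied to %rax). A minimal plain C++ sketch of code with that shape follows. It is not the testcase from this bug; `Vec3` and `scale` are hypothetical stand-ins for the Eigen type and the function under test.

```cpp
// Hypothetical stand-in for a 3-component Eigen vector (assumption, not
// the type used in the testcase).
struct Vec3 {
    float x, y, z;
    Vec3(float a, float b, float c) : x(a), y(b), z(c) {}
    // The user-provided copy constructor makes the type non-trivial for
    // the purposes of calls, so on x86-64 SysV it is returned in memory
    // through a hidden pointer, like the stores to (%rdi) in the listing.
    Vec3(const Vec3& o) : x(o.x), y(o.y), z(o.z) {}
};

// Scales three components by s; corresponds to the three mulss/movss
// pairs once everything has been inlined.
Vec3 scale(float a, float b, float c, float s) {
    return Vec3(a * s, b * s, c * s);
}
```

Calling `scale(1.0f, 2.0f, 3.0f, 2.0f)` yields the components 2, 4 and 6. When every constructor and operator is inlined, nothing but the multiplies and stores should remain, which is what the trunk output shows.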
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98265

--- Comment #9 from CVS Commits ---
The master branch has been updated by Jan Hubicka:

https://gcc.gnu.org/g:3064fc21aa29d8e04b23c0b52dc4f67de1da6b2f

commit r11-7948-g3064fc21aa29d8e04b23c0b52dc4f67de1da6b2f
Author: Jan Hubicka
Date:   Thu Apr 1 12:11:39 2021 +0200

    Add testcase for PR98265

    gcc/testsuite/ChangeLog:

    2021-04-01  Jan Hubicka

            PR ipa/98265
            * gcc.dg/tree-ssa/pr98265.C: New test.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98265

--- Comment #8 from CVS Commits ---
The releases/gcc-10 branch has been updated by Jan Hubicka:

https://gcc.gnu.org/g:42c22a4d724b4a4b0183f4412c3d42c9cca29d30

commit r10-9646-g42c22a4d724b4a4b0183f4412c3d42c9cca29d30
Author: Jan Hubicka
Date:   Wed Mar 31 22:44:20 2021 +0200

    Make USES_COMDAT_LOCAL CIF_FINAL_NORMAL

    USES_COMDAT_LOCAL is incorrectly defined as CIF_FINAL_ERROR, which makes
    the inliner miss some inlines of functions in a comdat section that were
    previously split.

    2021-03-31  Jan Hubicka

            PR ipa/98265
            * cif-code.def (USES_COMDAT_LOCAL): Make CIF_FINAL_NORMAL.

    (cherry picked from commit e7fd3b783238d034018443e43a58ff87908b4db6)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98265

--- Comment #7 from CVS Commits ---
The master branch has been updated by Jan Hubicka:

https://gcc.gnu.org/g:e7fd3b783238d034018443e43a58ff87908b4db6

commit r11-7940-ge7fd3b783238d034018443e43a58ff87908b4db6
Author: Jan Hubicka
Date:   Wed Mar 31 22:44:20 2021 +0200

    Make USES_COMDAT_LOCAL CIF_FINAL_NORMAL

    USES_COMDAT_LOCAL is incorrectly defined as CIF_FINAL_ERROR, which makes
    the inliner miss some inlines of functions in a comdat section that were
    previously split.

    2021-03-31  Jan Hubicka

            PR ipa/98265
            * cif-code.def (USES_COMDAT_LOCAL): Make CIF_FINAL_NORMAL.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98265

--- Comment #6 from Jan Hubicka ---
I am testing

diff --git a/gcc/cif-code.def b/gcc/cif-code.def
index 2f430cf1c39..39b89da155f 100644
--- a/gcc/cif-code.def
+++ b/gcc/cif-code.def
@@ -125,7 +125,7 @@ DEFCIFCODE(OPTIMIZATION_MISMATCH, CIF_FINAL_ERROR,
           N_("optimization level attribute mismatch"))
 
 /* We can't inline because the callee refers to comdat-local symbols.  */
-DEFCIFCODE(USES_COMDAT_LOCAL, CIF_FINAL_ERROR,
+DEFCIFCODE(USES_COMDAT_LOCAL, CIF_FINAL_NORMAL,
           N_("callee refers to comdat-local symbols"))
 
 /* We can't inline because of mismatched caller/callee
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98265

--- Comment #5 from Jan Hubicka ---
We do not inline CwiseNullaryOp because it uses comdat local symbols.
This is because we do split the function and the .part stays local.
At least we should recompute if function calls comdat local after
comdat local function is inlined.

IPA function summary for Eigen::Matrix should_inline(float, float, float, float)/2 inlinable fp_expression
  global time:     36.00
  self size:       21
  global size:     29
  min size:        24
  self stack:      16
  global stack:    20
    size:19.00, time:18.50
    size:4.50, time:3.50,  executed if:(not inlined)
  calls:
    operator*.isra/90 inlined
      freq:1.00
      Stack frame offset 16, callee self size 4
    Eigen::CwiseNullaryOp< , >::CwiseNullaryOp(long int, long int, Eigen::scalar_constant_op) [with = Eigen::scalar_constant_op; PlainObjectType = Eigen::Matrix]/20 callee refers to comdat-local symbols
      freq:1.00 loop depth: 0 size: 5 time: 14 callee size: 6 stack: 0
       op0 is compile time invariant
       op0 points to local or readonly memory
       op1 is compile time invariant
       op2 is compile time invariant
       op2 points to local or readonly memory
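The suggestion in comment #5 (recompute whether a function still calls a comdat-local symbol once the comdat-local callee has been inlined) can be pictured with a toy call graph. This is a sketch under stated assumptions: `ToyNode`, `ToyEdge` and `calls_comdat_local` are hypothetical names rather than GCC's cgraph API, and the graph is assumed to be acyclic.

```cpp
#include <vector>

struct ToyNode;

// One call site in the toy graph.
struct ToyEdge {
    ToyNode* callee;
    bool inlined;  // true once the callee body was merged into the caller
};

struct ToyNode {
    bool comdat_local = false;     // e.g. a ".part" clone left by splitting
    std::vector<ToyEdge> callees;  // outgoing calls
};

// Returns true if any call that still exists after inlining targets a
// comdat-local node. Inlined edges are followed, because the callee's
// own calls now belong to the caller. Assumes no cycles.
inline bool calls_comdat_local(const ToyNode& n) {
    for (const ToyEdge& e : n.callees) {
        if (e.inlined) {
            if (calls_comdat_local(*e.callee)) return true;
        } else if (e.callee->comdat_local) {
            return true;
        }
    }
    return false;
}
```

In this model a constructor that calls its split-off `.part` reports true while that call is a real call, and false as soon as the edge is marked inlined. Re-running the query after each inlining step is the kind of recomputation the comment asks for; a result cached before the `.part` was inlined would keep blocking the outer inline.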
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98265

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98265

Martin Liška changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
   Last reconfirmed|                              |2021-01-12
     Ever confirmed|0                             |1
             Status|UNCONFIRMED                   |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |hubicka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98265

--- Comment #4 from Martin Liška ---
Created attachment 49947
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49947=edit
Reduced test-case

I reduced a test-case where GCC 10 does not inline everything in the
function called 'should_inline'.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98265

--- Comment #3 from Kartik Mohta ---
This is a simple example to demonstrate the problem I've noticed in a bigger
program. Do the inlining limits depend on the size of the TU?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98265

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|gcc-10 has significantly    |[10/11 Regression] gcc-10
                   |worse code generated with   |has significantly worse
                   |-O2 compared to -O1 (or     |code generated with -O2
                   |gcc-9 -O2) when using the   |compared to -O1 (or gcc-9
                   |Eigen C++ library           |-O2) when using the Eigen
                   |                            |C++ library
                 CC|                            |hubicka at gcc dot gnu.org
   Target Milestone|---                         |10.3
           Keywords|                            |missed-optimization

--- Comment #2 from Richard Biener ---
The TU might be too small so we run into inline limits too quickly?