[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677 Jan Hubicka changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #17 from Jan Hubicka --- Fixed.
[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677 --- Comment #16 from Jan Hubicka --- Author: hubicka Date: Thu Feb 14 14:49:03 2019 New Revision: 268880 URL: https://gcc.gnu.org/viewcvs?rev=268880=gcc=rev Log: PR lto/88677 Fix PR number. Modified: trunk/gcc/ChangeLog
[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #15 from Jakub Jelinek --- Patch has been approved, are you going to commit it?
[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677 Jan Hubicka changed: What|Removed |Added Assignee|rguenther at suse dot de |hubicka at gcc dot gnu.org --- Comment #14 from Jan Hubicka --- I have posted updated patch to https://gcc.gnu.org/ml/gcc-patches/2019-02/msg00683.html The original analysis was correct - I placed code clearing TREE_READONLY too late.
[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677 Jan Hubicka changed: What|Removed |Added Assignee|hubicka at gcc dot gnu.org |rguenther at suse dot de --- Comment #13 from Jan Hubicka --- So even with the patch for TYPE_NEEDS_CONSTRUCTING I get the following difference in fre1 dump (and it is first one). I also use -fno-strict-aliasing --- test.ii.033t.fre1 2019-02-10 13:27:45.531305878 +0100 +++ dd/test.ii.033t.fre12019-02-10 13:27:34.895310810 +0100 @@ -120,7 +120,7 @@ n::p (struct n * const this) { int SR.3; - struct D D.2531; + struct D D.2527; struct D D.2507; unsigned int _1; @@ -221,9 +221,9 @@ marking outgoing edge 6 -> 8 executable Processing block 5: BB8 Value numbering stmt = SR.9_27 = MEM[(struct D *)]; -Setting value number of SR.9_27 to SR.8_23 (changed) +Setting value number of SR.9_27 to SR.9_27 (changed) marking outgoing edge 8 -> 9 executable -Making available beyond BB8 SR.9_27 for value SR.8_23 +Making available beyond BB8 SR.9_27 for value SR.9_27 Processing block 6: BB7 Value numbering stmt = SR.9_26 = 0; RHS 0 simplified to 0 @@ -233,12 +233,11 @@ Value numbering stmt = .MEM_14 = PHI <.MEM_6(7), .MEM_6(8)> Setting value number of .MEM_14 to .MEM_6 (changed) Value numbering stmt = SR.9_12 = PHI -Marking CSEd to PHI node SR.8_11 = PHI -Setting value number of SR.9_12 to SR.8_11 (changed) -SR.8_11 is available for SR.8_11 +Setting value number of SR.9_12 to SR.9_12 (changed) +Making available beyond BB9 SR.9_12 for value SR.9_12 Value numbering stmt = MEM[(struct D *)] = SR.9_12; No store match -Value numbering store MEM[(struct D *)] to SR.8_11 +Value numbering store MEM[(struct D *)] to SR.9_12 Setting value number of .MEM_18 to .MEM_18 (changed) Value numbering stmt = return; marking outgoing edge 9 -> 1 executable @@ -252,9 +251,7 @@ Value numbering stmt = : Value numbering stmt = resx 1 RPO iteration over 10 blocks visited 10 blocks in total discovering 10 executable blocks iterating 1.0 times, a block was visited max. 1 times -RPO tracked 2 values available at 3 locations and 17 lattice elements -Replaced redundant PHI node defining SR.9_12 with SR.8_11 -Removing dead stmt SR.9_12 = PHI <0(7), SR.9_27(8)> +RPO tracked 4 values available at 4 locations and 17 lattice elements Removing dead stmt SR.9_26 = 0; Removing dead stmt _10 = a_7(D); Removing dead stmt SR.7_24 = SR.8_11; @@ -281,9 +278,9 @@ int SR.8; int SR.7; unsigned int a; - struct D D.2547; - struct D D.2545; - struct D D.2544; + struct D D.2543; + struct D D.2541; + struct D D.2540; struct n a; struct D D.2498; struct D D.2497; @@ -313,7 +310,8 @@ SR.9_27 = MEM[(struct D *)]; : - MEM[(struct D *)] = SR.8_11; + # SR.9_12 = PHI <0(5), SR.9_27(6)> + MEM[(struct D *)] = SR.9_12; return; : Richi, do you have any idea what happens here? If needed I can analyze it more...
[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677 --- Comment #12 from Jan Hubicka --- I have posted patch which removes the problematic TYPE_NEEDS_CONSTRUCTION to https://gcc.gnu.org/ml/gcc-patches/2019-02/msg00584.html It however does not fix the difference. I suppose it may be TBAA diverging somehow?
[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677 --- Comment #11 from Jan Hubicka --- > > I guess the FE needs the info. But yes, dropping TREE_READONLY > for TYPE_NEEDS_CONSTRUCTION decls at some point may make sense. > > Generally it would be nice to know (for alias analysis) that > an object is not written to via a pointer. For const globals > with a constructor such writes can only happen inside the global > ctor function(s). Not sure if IPA reference could be of help If I have global ctor of C++ readonly structure, I would expect to be able to call external function to construct its parts that may call back to the unit, so it seems to me that we can not track much? If the ctor is leaf we probably can track down the info, but won't currently. It is soemthing that ipa-ref could try to do if it was not lame.
[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677 --- Comment #10 from rguenther at suse dot de --- On Mon, 7 Jan 2019, hubicka at ucw dot cz wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677 > > --- Comment #9 from Jan Hubicka --- > > https://gcc.gnu > > -> You can't simply remove this flag. You could set it to true > > conservatively. > > Or we could stop marking globals that need constructing TREE_READONLY. > > I see, I was under impression that we already drop TREE_READONLY for > them. But it is bit more subtle and thus I was indeed to quick to drop > streaming here. In: > > struct c { int a; c() { a=1; } }; > const c cc; > const int dd=1; > extern const c ee; > int foo() { return ee.a; } > > I get: > const c ee/8 (const c ee) @0x7f68d301ee80 > Type: variable > Visibility: external public > References: > Referring: _Z3foov/5 (read) > Availability: not-ready > Varpool flags: read-only > const int dd/4 (const int dd) @0x7f68d301ec80 > Type: variable definition analyzed > Visibility: force_output no_reorder > Aux: @0x7f68d316c450 > References: > Referring: > Availability: not-ready > Varpool flags: initialized read-only const-value-known > const c cc/3 (const c cc) @0x7f68d301e900 > Type: variable definition analyzed > Visibility: force_output no_reorder > Aux: @0x7f68d301ec80 > References: > Referring: _Z41__static_initialization_and_destruction_0ii/6 (addr) > Availability: not-ready > Varpool flags: > > So EE is indeed read-only, but CC is not. > I guess it is because constructor to CC is output locally and we clear > the flag, while constructor of EE is output externally. What is utility > of this information? Can we realy in some context that value of EE did > not change? > > It seems to me that it is completely useless info, so I would be in > favor of dropping TREE_READONLY flag probably at cgraph > construction time (because I suppose it is relevant to front-end) I guess the FE needs the info. But yes, dropping TREE_READONLY for TYPE_NEEDS_CONSTRUCTION decls at some point may make sense. Generally it would be nice to know (for alias analysis) that an object is not written to via a pointer. For const globals with a constructor such writes can only happen inside the global ctor function(s). Not sure if IPA reference could be of help here, it is really language stuff. There's also const_cast and mutable where things get into murky territorry...
[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677 --- Comment #9 from Jan Hubicka --- > https://gcc.gnu > -> You can't simply remove this flag. You could set it to true > conservatively. > Or we could stop marking globals that need constructing TREE_READONLY. I see, I was under impression that we already drop TREE_READONLY for them. But it is bit more subtle and thus I was indeed to quick to drop streaming here. In: struct c { int a; c() { a=1; } }; const c cc; const int dd=1; extern const c ee; int foo() { return ee.a; } I get: const c ee/8 (const c ee) @0x7f68d301ee80 Type: variable Visibility: external public References: Referring: _Z3foov/5 (read) Availability: not-ready Varpool flags: read-only const int dd/4 (const int dd) @0x7f68d301ec80 Type: variable definition analyzed Visibility: force_output no_reorder Aux: @0x7f68d316c450 References: Referring: Availability: not-ready Varpool flags: initialized read-only const-value-known const c cc/3 (const c cc) @0x7f68d301e900 Type: variable definition analyzed Visibility: force_output no_reorder Aux: @0x7f68d301ec80 References: Referring: _Z41__static_initialization_and_destruction_0ii/6 (addr) Availability: not-ready Varpool flags: So EE is indeed read-only, but CC is not. I guess it is because constructor to CC is output locally and we clear the flag, while constructor of EE is output externally. What is utility of this information? Can we realy in some context that value of EE did not change? It seems to me that it is completely useless info, so I would be in favor of dropping TREE_READONLY flag probably at cgraph construction time (because I suppose it is relevant to front-end) Honza
[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677 Richard Biener changed: What|Removed |Added Keywords||wrong-code Status|ASSIGNED|NEW CC||rguenth at gcc dot gnu.org Assignee|rguenth at gcc dot gnu.org |hubicka at gcc dot gnu.org --- Comment #8 from Richard Biener --- Value numbering stmt = SR.10_27 = MEM[(struct D *)]; -Setting value number of SR.10_27 to SR.10_27 (changed) +Setting value number of SR.10_27 to SR.9_23 (changed) but then of course the code-difference : - # SR.10_12 = PHI <0(5), SR.10_27(6)> return; is somewhat spurious since nothing uses SR.10_12, we're getting to removal only because eliding SR.10_27 enables DCE. : SR.10_27 = MEM[(struct D *)]; : # SR.10_12 = PHI <0(5), SR.10_27(6)> return; So the code-gen difference is likely because of the following which makes dropping TYPE_NEEDS_CONSTRUCTING a suspicious change. /* Return true if VAR may be aliased. A variable is considered as maybe aliased if it has its address taken by the local TU or possibly by another TU and might be modified through a pointer. */ static inline bool may_be_aliased (const_tree var) { return (TREE_CODE (var) != CONST_DECL && (TREE_PUBLIC (var) || DECL_EXTERNAL (var) || TREE_ADDRESSABLE (var)) && !((TREE_STATIC (var) || TREE_PUBLIC (var) || DECL_EXTERNAL (var)) && ((TREE_READONLY (var) && !TYPE_NEEDS_CONSTRUCTING (TREE_TYPE (var))) || (TREE_CODE (var) == VAR_DECL && DECL_NONALIASED (var); } the variable in question is unit-size align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x769e6a80 fields public abstract external autoinline decl_3 QI t.ii:2:29 align:16 warn_if_not_align:0 context full-name "constexpr D::D(D&&) noexcept ()" not-really-extern chain > context full-name "const class D" needs-constructor X(constX&) this=(X&) n_parents=1 use_template=1 interface-unknown pointer_to_this reference_to_this > readonly addressable used public external read decl_2 VOID t.ii:17:19 align:8 warn_if_not_align:0 context chain > may_be_aliased tries to avoid claiming R/O data can be written to, but of course that's not true for the global ctor. -> You can't simply remove this flag. You could set it to true conservatively. Or we could stop marking globals that need constructing TREE_READONLY.
[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677 Richard Biener changed: What|Removed |Added Priority|P3 |P1 Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org --- Comment #7 from Richard Biener --- I will have a look.
[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677 Richard Biener changed: What|Removed |Added Version|unknown |9.0 Target Milestone|--- |9.0
[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677 Martin Liška changed: What|Removed |Added Keywords|needs-reduction | --- Comment #6 from Martin Liška --- Reduced test-case: $ cat test2.ii template class c; template class D : public c { public: static D d() { return D(0, e, f); } enum g { e }; enum h { f }; D(b *, g, h) : i() {} b i; }; struct j { int k; }; template <> class c { public: void l(); }; extern const D m; class n { unsigned o; public: D p() { return o ? D::d() : m; } }; void fn1() { n a; a.p().l(); a.p(); } Minimal options: -O2 Difference for -fdump-tree-fre1: $ diff -u 2 1 --- 2 2019-01-04 09:13:20.637935502 +0100 +++ 1 2019-01-04 09:13:17.961848506 +0100 @@ -57,7 +57,7 @@ n::p (struct n * const this) { int SR.3; - struct D D.2531; + struct D D.2527; struct D D.2507; unsigned int _1; @@ -108,9 +108,9 @@ int SR.8; int SR.7; unsigned int a; - struct D D.2547; - struct D D.2545; - struct D D.2544; + struct D D.2543; + struct D D.2541; + struct D D.2540; struct n a; struct D D.2498; struct D D.2497; @@ -140,7 +140,8 @@ SR.9_27 = MEM[(struct D *)]; : - MEM[(struct D *)] = SR.8_11; + # SR.9_12 = PHI <0(5), SR.9_27(6)> + MEM[(struct D *)] = SR.9_12; return; :
[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677 Martin Liška changed: What|Removed |Added Keywords||needs-reduction --- Comment #5 from Martin Liška --- I'm reducing that test-case..
[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677 --- Comment #4 from Jan Hubicka --- This drops TYPE_NEEDS_CONSTRUCTING. I checked the uses jan@skylake:~/trunk/gcc> grep TYPE_NEEDS_CONSTRU *.c gimplify.c: || TYPE_NEEDS_CONSTRUCTING (TREE_TYPE (decl print-tree.c: if (TYPE_NEEDS_CONSTRUCTING (node)) tree.c: TYPE_NEEDS_CONSTRUCTING (type) = 0; tree.c: verify_variant_match (TYPE_NEEDS_CONSTRUCTING); tree-inline.c: if (TYPE_NEEDS_CONSTRUCTING (TREE_TYPE (p))) I convinced myself that gimply and tree-inline uses are there only for inlining/gimplifying within frontend. Perhaps they can also be dropped? Honza
[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677 Martin Liška changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2019-01-03 Known to work||8.2.0 Summary|Divergence in -O2 and -O2 |[9 Regression] Divergence |-flto early opts|in -O2 and -O2 -flto early ||opts Ever confirmed|0 |1 Known to fail||9.0 --- Comment #3 from Martin Liška --- Started with r265841.