[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts

2019-02-14 Thread hubicka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677

Jan Hubicka  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #17 from Jan Hubicka  ---
Fixed.

[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts

2019-02-14 Thread hubicka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677

--- Comment #16 from Jan Hubicka  ---
Author: hubicka
Date: Thu Feb 14 14:49:03 2019
New Revision: 268880

URL: https://gcc.gnu.org/viewcvs?rev=268880=gcc=rev
Log:
PR lto/88677
Fix PR number.

Modified:
trunk/gcc/ChangeLog

[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts

2019-02-14 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #15 from Jakub Jelinek  ---
Patch has been approved, are you going to commit it?

[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts

2019-02-11 Thread hubicka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677

Jan Hubicka  changed:

   What|Removed |Added

   Assignee|rguenther at suse dot de   |hubicka at gcc dot 
gnu.org

--- Comment #14 from Jan Hubicka  ---
I have posted updated patch to
https://gcc.gnu.org/ml/gcc-patches/2019-02/msg00683.html
The original analysis was correct - I placed code clearing TREE_READONLY too
late.

[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts

2019-02-10 Thread hubicka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677

Jan Hubicka  changed:

   What|Removed |Added

   Assignee|hubicka at gcc dot gnu.org |rguenther at suse dot de

--- Comment #13 from Jan Hubicka  ---
So even with the patch for TYPE_NEEDS_CONSTRUCTING I get the following
difference in fre1 dump (and it is first one). I also use -fno-strict-aliasing
--- test.ii.033t.fre1   2019-02-10 13:27:45.531305878 +0100
+++ dd/test.ii.033t.fre12019-02-10 13:27:34.895310810 +0100
@@ -120,7 +120,7 @@
 n::p (struct n * const this)
 {
   int SR.3;
-  struct D D.2531;
+  struct D D.2527;
   struct D D.2507;
   unsigned int _1;

@@ -221,9 +221,9 @@
 marking outgoing edge 6 -> 8 executable
 Processing block 5: BB8
 Value numbering stmt = SR.9_27 = MEM[(struct D *)];
-Setting value number of SR.9_27 to SR.8_23 (changed)
+Setting value number of SR.9_27 to SR.9_27 (changed)
 marking outgoing edge 8 -> 9 executable
-Making available beyond BB8 SR.9_27 for value SR.8_23
+Making available beyond BB8 SR.9_27 for value SR.9_27
 Processing block 6: BB7
 Value numbering stmt = SR.9_26 = 0;
 RHS 0 simplified to 0
@@ -233,12 +233,11 @@
 Value numbering stmt = .MEM_14 = PHI <.MEM_6(7), .MEM_6(8)>
 Setting value number of .MEM_14 to .MEM_6 (changed)
 Value numbering stmt = SR.9_12 = PHI 
-Marking CSEd to PHI node SR.8_11 = PHI 
-Setting value number of SR.9_12 to SR.8_11 (changed)
-SR.8_11 is available for SR.8_11
+Setting value number of SR.9_12 to SR.9_12 (changed)
+Making available beyond BB9 SR.9_12 for value SR.9_12
 Value numbering stmt = MEM[(struct D *)] = SR.9_12;
 No store match
-Value numbering store MEM[(struct D *)] to SR.8_11
+Value numbering store MEM[(struct D *)] to SR.9_12
 Setting value number of .MEM_18 to .MEM_18 (changed)
 Value numbering stmt = return;
 marking outgoing edge 9 -> 1 executable
@@ -252,9 +251,7 @@
 Value numbering stmt = :
 Value numbering stmt = resx 1
 RPO iteration over 10 blocks visited 10 blocks in total discovering 10
executable blocks iterating 1.0 times, a block was visited max. 1 times
-RPO tracked 2 values available at 3 locations and 17 lattice elements
-Replaced redundant PHI node defining SR.9_12 with SR.8_11
-Removing dead stmt SR.9_12 = PHI <0(7), SR.9_27(8)>
+RPO tracked 4 values available at 4 locations and 17 lattice elements
 Removing dead stmt SR.9_26 = 0;
 Removing dead stmt _10 = a_7(D);
 Removing dead stmt SR.7_24 = SR.8_11;
@@ -281,9 +278,9 @@
   int SR.8;
   int SR.7;
   unsigned int a;
-  struct D D.2547;
-  struct D D.2545;
-  struct D D.2544;
+  struct D D.2543;
+  struct D D.2541;
+  struct D D.2540;
   struct n a;
   struct D D.2498;
   struct D D.2497;
@@ -313,7 +310,8 @@
   SR.9_27 = MEM[(struct D *)];

:
-  MEM[(struct D *)] = SR.8_11;
+  # SR.9_12 = PHI <0(5), SR.9_27(6)>
+  MEM[(struct D *)] = SR.9_12;
   return;

:

Richi, do you have any idea what happens here? If needed I can analyze it
more...

[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts

2019-02-10 Thread hubicka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677

--- Comment #12 from Jan Hubicka  ---
I have posted patch which removes the problematic TYPE_NEEDS_CONSTRUCTION to
https://gcc.gnu.org/ml/gcc-patches/2019-02/msg00584.html
It however does not fix the difference.  I suppose it may be TBAA diverging
somehow?

[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts

2019-01-08 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677

--- Comment #11 from Jan Hubicka  ---
> 
> I guess the FE needs the info.  But yes, dropping TREE_READONLY
> for TYPE_NEEDS_CONSTRUCTION decls at some point may make sense.
> 
> Generally it would be nice to know (for alias analysis) that
> an object is not written to via a pointer.  For const globals
> with a constructor such writes can only happen inside the global
> ctor function(s).  Not sure if IPA reference could be of help

If I have global ctor of C++ readonly structure, I would expect to be
able to call external function to construct its parts that may call back
to the unit, so it seems to me that we can not track much?

If the ctor is leaf we probably can track down the info, but won't
currently. It is soemthing that ipa-ref could try to do if it was not
lame.

[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts

2019-01-08 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677

--- Comment #10 from rguenther at suse dot de  ---
On Mon, 7 Jan 2019, hubicka at ucw dot cz wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677
> 
> --- Comment #9 from Jan Hubicka  ---
> > https://gcc.gnu 
> > -> You can't simply remove this flag.  You could set it to true 
> > conservatively.
> > Or we could stop marking globals that need constructing TREE_READONLY.
> 
> I see, I was under impression that we already drop TREE_READONLY for
> them. But it is bit more subtle and thus I was indeed to quick to drop
> streaming here.  In:
> 
> struct c { int a; c() { a=1; } };
> const c cc;
> const int dd=1;
> extern const c ee;
> int foo() { return ee.a; }
> 
> I get:
> const c ee/8 (const c ee) @0x7f68d301ee80
>   Type: variable
>   Visibility: external public
>   References: 
>   Referring: _Z3foov/5 (read)
>   Availability: not-ready
>   Varpool flags: read-only
> const int dd/4 (const int dd) @0x7f68d301ec80
>   Type: variable definition analyzed
>   Visibility: force_output no_reorder
>   Aux: @0x7f68d316c450
>   References: 
>   Referring: 
>   Availability: not-ready
>   Varpool flags: initialized read-only const-value-known
> const c cc/3 (const c cc) @0x7f68d301e900
>   Type: variable definition analyzed
>   Visibility: force_output no_reorder
>   Aux: @0x7f68d301ec80
>   References: 
>   Referring: _Z41__static_initialization_and_destruction_0ii/6 (addr)
>   Availability: not-ready
>   Varpool flags:
> 
> So EE is indeed read-only, but CC is not.
> I guess it is because constructor to CC is output locally and we clear
> the flag, while constructor of EE is output externally. What is utility
> of this information? Can we realy in some context that value of EE did
> not change?
> 
> It seems to me that it is completely useless info, so I would be in
> favor of dropping TREE_READONLY flag probably at cgraph
> construction time (because I suppose it is relevant to front-end)

I guess the FE needs the info.  But yes, dropping TREE_READONLY
for TYPE_NEEDS_CONSTRUCTION decls at some point may make sense.

Generally it would be nice to know (for alias analysis) that
an object is not written to via a pointer.  For const globals
with a constructor such writes can only happen inside the global
ctor function(s).  Not sure if IPA reference could be of help
here, it is really language stuff.  There's also const_cast
and mutable where things get into murky territorry...

[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts

2019-01-07 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677

--- Comment #9 from Jan Hubicka  ---
> https://gcc.gnu 
> -> You can't simply remove this flag.  You could set it to true 
> conservatively.
> Or we could stop marking globals that need constructing TREE_READONLY.

I see, I was under impression that we already drop TREE_READONLY for
them. But it is bit more subtle and thus I was indeed to quick to drop
streaming here.  In:

struct c { int a; c() { a=1; } };
const c cc;
const int dd=1;
extern const c ee;
int foo() { return ee.a; }

I get:
const c ee/8 (const c ee) @0x7f68d301ee80
  Type: variable
  Visibility: external public
  References: 
  Referring: _Z3foov/5 (read)
  Availability: not-ready
  Varpool flags: read-only
const int dd/4 (const int dd) @0x7f68d301ec80
  Type: variable definition analyzed
  Visibility: force_output no_reorder
  Aux: @0x7f68d316c450
  References: 
  Referring: 
  Availability: not-ready
  Varpool flags: initialized read-only const-value-known
const c cc/3 (const c cc) @0x7f68d301e900
  Type: variable definition analyzed
  Visibility: force_output no_reorder
  Aux: @0x7f68d301ec80
  References: 
  Referring: _Z41__static_initialization_and_destruction_0ii/6 (addr)
  Availability: not-ready
  Varpool flags:

So EE is indeed read-only, but CC is not.
I guess it is because constructor to CC is output locally and we clear
the flag, while constructor of EE is output externally. What is utility
of this information? Can we realy in some context that value of EE did
not change?

It seems to me that it is completely useless info, so I would be in
favor of dropping TREE_READONLY flag probably at cgraph
construction time (because I suppose it is relevant to front-end)

Honza

[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts

2019-01-07 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677

Richard Biener  changed:

   What|Removed |Added

   Keywords||wrong-code
 Status|ASSIGNED|NEW
 CC||rguenth at gcc dot gnu.org
   Assignee|rguenth at gcc dot gnu.org |hubicka at gcc dot 
gnu.org

--- Comment #8 from Richard Biener  ---
 Value numbering stmt = SR.10_27 = MEM[(struct D *)];
-Setting value number of SR.10_27 to SR.10_27 (changed)
+Setting value number of SR.10_27 to SR.9_23 (changed)

but then of course the code-difference

:
-  # SR.10_12 = PHI <0(5), SR.10_27(6)>
   return;

is somewhat spurious since nothing uses SR.10_12, we're getting to
removal only because eliding SR.10_27 enables DCE.

   :
  SR.10_27 = MEM[(struct D *)];

   :
  # SR.10_12 = PHI <0(5), SR.10_27(6)>
  return;


So the code-gen difference is likely because of the following which makes
dropping TYPE_NEEDS_CONSTRUCTING a suspicious change.

/* Return true if VAR may be aliased.  A variable is considered as
   maybe aliased if it has its address taken by the local TU
   or possibly by another TU and might be modified through a pointer.  */

static inline bool
may_be_aliased (const_tree var)
{
  return (TREE_CODE (var) != CONST_DECL
  && (TREE_PUBLIC (var)
  || DECL_EXTERNAL (var)
  || TREE_ADDRESSABLE (var))
  && !((TREE_STATIC (var) || TREE_PUBLIC (var) || DECL_EXTERNAL (var))
   && ((TREE_READONLY (var)
&& !TYPE_NEEDS_CONSTRUCTING (TREE_TYPE (var)))
   || (TREE_CODE (var) == VAR_DECL
   && DECL_NONALIASED (var);
}

the variable in question is

 
unit-size 
align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x769e6a80
fields 
public abstract external autoinline decl_3 QI t.ii:2:29 align:16
warn_if_not_align:0 context 
full-name "constexpr D::D(D&&) noexcept ()"
not-really-extern chain >
context 
full-name "const class D"
needs-constructor X(constX&) this=(X&) n_parents=1 use_template=1
interface-unknown
pointer_to_this  reference_to_this
>
readonly addressable used public external read decl_2 VOID t.ii:17:19
align:8 warn_if_not_align:0 context  chain >

may_be_aliased tries to avoid claiming R/O data can be written to, but of
course that's not true for the global ctor.

-> You can't simply remove this flag.  You could set it to true conservatively.
Or we could stop marking globals that need constructing TREE_READONLY.

[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts

2019-01-07 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P1
 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot 
gnu.org

--- Comment #7 from Richard Biener  ---
I will have a look.

[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts

2019-01-07 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677

Richard Biener  changed:

   What|Removed |Added

Version|unknown |9.0
   Target Milestone|--- |9.0

[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts

2019-01-04 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677

Martin Liška  changed:

   What|Removed |Added

   Keywords|needs-reduction |

--- Comment #6 from Martin Liška  ---
Reduced test-case:

$ cat test2.ii
template  class c;
template  class D : public c {
public:
  static D d() { return D(0, e, f); }
  enum g { e };
  enum h { f };
  D(b *, g, h) : i() {}
  b i;
};
struct j {
  int k;
};
template <> class c {
public:
  void l();
};
extern const D m;
class n {
  unsigned o;

public:
  D p() { return o ? D::d() : m; }
};
void fn1() {
  n a;
  a.p().l();
  a.p();
}

Minimal options: -O2

Difference for -fdump-tree-fre1:

$ diff -u 2 1
--- 2   2019-01-04 09:13:20.637935502 +0100
+++ 1   2019-01-04 09:13:17.961848506 +0100
@@ -57,7 +57,7 @@
 n::p (struct n * const this)
 {
   int SR.3;
-  struct D D.2531;
+  struct D D.2527;
   struct D D.2507;
   unsigned int _1;

@@ -108,9 +108,9 @@
   int SR.8;
   int SR.7;
   unsigned int a;
-  struct D D.2547;
-  struct D D.2545;
-  struct D D.2544;
+  struct D D.2543;
+  struct D D.2541;
+  struct D D.2540;
   struct n a;
   struct D D.2498;
   struct D D.2497;
@@ -140,7 +140,8 @@
   SR.9_27 = MEM[(struct D *)];

:
-  MEM[(struct D *)] = SR.8_11;
+  # SR.9_12 = PHI <0(5), SR.9_27(6)>
+  MEM[(struct D *)] = SR.9_12;
   return;

:

[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts

2019-01-03 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677

Martin Liška  changed:

   What|Removed |Added

   Keywords||needs-reduction

--- Comment #5 from Martin Liška  ---
I'm reducing that test-case..

[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts

2019-01-03 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677

--- Comment #4 from Jan Hubicka  ---
This drops TYPE_NEEDS_CONSTRUCTING.  I checked the uses
jan@skylake:~/trunk/gcc> grep TYPE_NEEDS_CONSTRU *.c
gimplify.c:   || TYPE_NEEDS_CONSTRUCTING (TREE_TYPE (decl
print-tree.c:  if (TYPE_NEEDS_CONSTRUCTING (node))
tree.c:  TYPE_NEEDS_CONSTRUCTING (type) = 0;
tree.c:  verify_variant_match (TYPE_NEEDS_CONSTRUCTING);
tree-inline.c:  if (TYPE_NEEDS_CONSTRUCTING (TREE_TYPE (p)))

I convinced myself that gimply and tree-inline uses are there only for
inlining/gimplifying within frontend. Perhaps they can also be dropped?

Honza

[Bug lto/88677] [9 Regression] Divergence in -O2 and -O2 -flto early opts

2019-01-03 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88677

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-01-03
  Known to work||8.2.0
Summary|Divergence in -O2 and -O2   |[9 Regression] Divergence
   |-flto early opts|in -O2 and -O2 -flto early
   ||opts
 Ever confirmed|0   |1
  Known to fail||9.0

--- Comment #3 from Martin Liška  ---
Started with r265841.