[Bug tree-optimization/79622] [6/7 Regression] Wrong code w/ -O2 -floop-nest-optimize

2018-10-26 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79622

Jakub Jelinek  changed:

   What|Removed |Added

   Target Milestone|6.5 |7.4

--- Comment #11 from Jakub Jelinek  ---
GCC 6 branch is being closed

[Bug tree-optimization/79622] [6/7 Regression] Wrong code w/ -O2 -floop-nest-optimize

2017-09-19 Thread spop at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79622

--- Comment #10 from Sebastian Pop  ---
> So a black-box would be a set of stmts rather than a whole GIMPLE BB

Correct: this can be an abstract view of the IR.  The only place where we want
to start transforming the code is in the code generation.  We should be able to
interrupt graphite at any point (maybe due to a compute-out) and leave the
original unmodified IR.  Code generation should not fail and it should be
linear time in number of statements, such that when we start code generation we
know that it will succeed in a short amount of compilation time.

> You mean this tagging of associativeness is not yet done?

Yes, we removed the tagging code when we removed the out-of-ssa translation.
The original tagging relied on the name of the arrays that we created to find
whether the reduction was associative.  This caused some performance
regressions of loops not interchanged anymore (for example the swim loop.)

[Bug tree-optimization/79622] [6/7 Regression] Wrong code w/ -O2 -floop-nest-optimize

2017-09-19 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79622

--- Comment #9 from rguenther at suse dot de  ---
On Mon, 18 Sep 2017, spop at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79622
> 
> --- Comment #8 from Sebastian Pop  ---
> > I would have expected at least each memory op to be in a separate "black 
> > box"
> 
> We could have a pass before graphite that splits BBs with more than one 
> write into blocks that contain one data write with all the operations 
> and data reads needed to compute the stored value.  This would allow 
> more freedom to schedule BBs around.

Yeah, somewhat iffy but good enough for experiments I guess.  But I
think we should have different BBs for reads as well?  Though that
means we'll handle all the data flow from reads to writes with
scalar references ...

In the end this splitting of BBs should be an internal detail in the
GRAPHITE data structures and not applied to GIMPLE.  Basically
have the black-boxes be writes with all reads/operations to compute
the value implicitely included.  Code-gen would have to deal with
this by eventually duplicating stmts if they end up un-CSEd by
the scheduler.  So a black-box would be a set of stmts rather than
a whole GIMPLE BB.  First step would be abstracting the iteration
over GIMPLE stmts in a black-box I guess.

> > if you follow the original go-out-of-SSA approach you'd have their effects
> > on the CFG edges.  So a more complete fix would similarly handle uses.
> 
> In other words: how do we handle reductions?
> As you remember, the original way was to expose reductions by rewriting
> out-of-SSA
> scalar dependences crossing basic blocks (loop-phi nodes, loop-close-phi
> nodes,)
> tagging the properties of the reduction (commutative, associative)
> on the array, and adding that info to the data dependence graph.
> By adding those properties to the dependence graph, we give the scheduler
> more freedom to select transforms.
> 
> We moved away from rewriting scalar dependences out-of-SSA because we do not
> want to transform the code if the scheduler has no better transform to be 
> done:
> we do not want to leave around inefficient memory reads/writes.
> Instead, we handle SSA names and create scalar references added to the
> dependence graph.  We still need to tag scalar reductions with their
> associative properties to allow the scheduler to reorder the computations.

You mean this tagging of associativeness is not yet done?

[Bug tree-optimization/79622] [6/7 Regression] Wrong code w/ -O2 -floop-nest-optimize

2017-09-18 Thread spop at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79622

--- Comment #8 from Sebastian Pop  ---
> I would have expected at least each memory op to be in a separate "black box"

We could have a pass before graphite that splits BBs with more than one write
into blocks that contain one data write with all the operations and data reads
needed to compute the stored value.  This would allow more freedom to schedule
BBs around.

> if you follow the original go-out-of-SSA approach you'd have their effects
> on the CFG edges.  So a more complete fix would similarly handle uses.

In other words: how do we handle reductions?
As you remember, the original way was to expose reductions by rewriting
out-of-SSA
scalar dependences crossing basic blocks (loop-phi nodes, loop-close-phi
nodes,)
tagging the properties of the reduction (commutative, associative)
on the array, and adding that info to the data dependence graph.
By adding those properties to the dependence graph, we give the scheduler
more freedom to select transforms.

We moved away from rewriting scalar dependences out-of-SSA because we do not
want to transform the code if the scheduler has no better transform to be done:
we do not want to leave around inefficient memory reads/writes.
Instead, we handle SSA names and create scalar references added to the
dependence graph.  We still need to tag scalar reductions with their
associative properties to allow the scheduler to reorder the computations.

[Bug tree-optimization/79622] [6/7 Regression] Wrong code w/ -O2 -floop-nest-optimize

2017-09-18 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79622

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
  Known to work||8.0
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot 
gnu.org
Summary|[6/7/8 Regression] Wrong|[6/7 Regression] Wrong code
   |code w/ -O2 |w/ -O2 -floop-nest-optimize
   |-floop-nest-optimize|

--- Comment #7 from Richard Biener  ---
Fixed on trunk sofar.

[Bug tree-optimization/79622] [6/7 Regression] Wrong code w/ -O2 -floop-nest-optimize

2017-03-01 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79622

Jakub Jelinek  changed:

   What|Removed |Added

   Priority|P2  |P4
 CC||jakub at gcc dot gnu.org

[Bug tree-optimization/79622] [6/7 Regression] Wrong code w/ -O2 -floop-nest-optimize

2017-02-20 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79622

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2
 CC||rguenth at gcc dot gnu.org
   Target Milestone|--- |6.4

[Bug tree-optimization/79622] [6/7 Regression] Wrong code w/ -O2 -floop-nest-optimize

2017-02-20 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79622

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2017-02-20
 CC||marxin at gcc dot gnu.org
Summary|[7 Regression] Wrong code   |[6/7 Regression] Wrong code
   |w/ -O2 -floop-nest-optimize |w/ -O2 -floop-nest-optimize
 Ever confirmed|0   |1

--- Comment #1 from Martin Liška  ---
Confirmed, started with r230200.