[Bug tree-optimization/79622] [6/7 Regression] Wrong code w/ -O2 -floop-nest-optimize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79622 Jakub Jelinek changed: What|Removed |Added Target Milestone|6.5 |7.4 --- Comment #11 from Jakub Jelinek --- GCC 6 branch is being closed
[Bug tree-optimization/79622] [6/7 Regression] Wrong code w/ -O2 -floop-nest-optimize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79622 --- Comment #10 from Sebastian Pop --- > So a black-box would be a set of stmts rather than a whole GIMPLE BB Correct: this can be an abstract view of the IR. The only place where we want to start transforming the code is in the code generation. We should be able to interrupt graphite at any point (maybe due to a compute-out) and leave the original unmodified IR. Code generation should not fail and it should be linear time in number of statements, such that when we start code generation we know that it will succeed in a short amount of compilation time. > You mean this tagging of associativeness is not yet done? Yes, we removed the tagging code when we removed the out-of-ssa translation. The original tagging relied on the name of the arrays that we created to find whether the reduction was associative. This caused some performance regressions of loops not interchanged anymore (for example the swim loop.)
[Bug tree-optimization/79622] [6/7 Regression] Wrong code w/ -O2 -floop-nest-optimize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79622 --- Comment #9 from rguenther at suse dot de --- On Mon, 18 Sep 2017, spop at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79622 > > --- Comment #8 from Sebastian Pop --- > > I would have expected at least each memory op to be in a separate "black > > box" > > We could have a pass before graphite that splits BBs with more than one > write into blocks that contain one data write with all the operations > and data reads needed to compute the stored value. This would allow > more freedom to schedule BBs around. Yeah, somewhat iffy but good enough for experiments I guess. But I think we should have different BBs for reads as well? Though that means we'll handle all the data flow from reads to writes with scalar references ... In the end this splitting of BBs should be an internal detail in the GRAPHITE data structures and not applied to GIMPLE. Basically have the black-boxes be writes with all reads/operations to compute the value implicitely included. Code-gen would have to deal with this by eventually duplicating stmts if they end up un-CSEd by the scheduler. So a black-box would be a set of stmts rather than a whole GIMPLE BB. First step would be abstracting the iteration over GIMPLE stmts in a black-box I guess. > > if you follow the original go-out-of-SSA approach you'd have their effects > > on the CFG edges. So a more complete fix would similarly handle uses. > > In other words: how do we handle reductions? > As you remember, the original way was to expose reductions by rewriting > out-of-SSA > scalar dependences crossing basic blocks (loop-phi nodes, loop-close-phi > nodes,) > tagging the properties of the reduction (commutative, associative) > on the array, and adding that info to the data dependence graph. > By adding those properties to the dependence graph, we give the scheduler > more freedom to select transforms. > > We moved away from rewriting scalar dependences out-of-SSA because we do not > want to transform the code if the scheduler has no better transform to be > done: > we do not want to leave around inefficient memory reads/writes. > Instead, we handle SSA names and create scalar references added to the > dependence graph. We still need to tag scalar reductions with their > associative properties to allow the scheduler to reorder the computations. You mean this tagging of associativeness is not yet done?
[Bug tree-optimization/79622] [6/7 Regression] Wrong code w/ -O2 -floop-nest-optimize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79622 --- Comment #8 from Sebastian Pop --- > I would have expected at least each memory op to be in a separate "black box" We could have a pass before graphite that splits BBs with more than one write into blocks that contain one data write with all the operations and data reads needed to compute the stored value. This would allow more freedom to schedule BBs around. > if you follow the original go-out-of-SSA approach you'd have their effects > on the CFG edges. So a more complete fix would similarly handle uses. In other words: how do we handle reductions? As you remember, the original way was to expose reductions by rewriting out-of-SSA scalar dependences crossing basic blocks (loop-phi nodes, loop-close-phi nodes,) tagging the properties of the reduction (commutative, associative) on the array, and adding that info to the data dependence graph. By adding those properties to the dependence graph, we give the scheduler more freedom to select transforms. We moved away from rewriting scalar dependences out-of-SSA because we do not want to transform the code if the scheduler has no better transform to be done: we do not want to leave around inefficient memory reads/writes. Instead, we handle SSA names and create scalar references added to the dependence graph. We still need to tag scalar reductions with their associative properties to allow the scheduler to reorder the computations.
[Bug tree-optimization/79622] [6/7 Regression] Wrong code w/ -O2 -floop-nest-optimize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79622 Richard Biener changed: What|Removed |Added Status|NEW |ASSIGNED Known to work||8.0 Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Summary|[6/7/8 Regression] Wrong|[6/7 Regression] Wrong code |code w/ -O2 |w/ -O2 -floop-nest-optimize |-floop-nest-optimize| --- Comment #7 from Richard Biener --- Fixed on trunk sofar.
[Bug tree-optimization/79622] [6/7 Regression] Wrong code w/ -O2 -floop-nest-optimize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79622 Jakub Jelinek changed: What|Removed |Added Priority|P2 |P4 CC||jakub at gcc dot gnu.org
[Bug tree-optimization/79622] [6/7 Regression] Wrong code w/ -O2 -floop-nest-optimize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79622 Richard Biener changed: What|Removed |Added Priority|P3 |P2 CC||rguenth at gcc dot gnu.org Target Milestone|--- |6.4
[Bug tree-optimization/79622] [6/7 Regression] Wrong code w/ -O2 -floop-nest-optimize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79622 Martin Liška changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2017-02-20 CC||marxin at gcc dot gnu.org Summary|[7 Regression] Wrong code |[6/7 Regression] Wrong code |w/ -O2 -floop-nest-optimize |w/ -O2 -floop-nest-optimize Ever confirmed|0 |1 --- Comment #1 from Martin Liška --- Confirmed, started with r230200.