Re: [PATCH][GRAPHITE] Simplify SCOP detection

2017-09-28 Thread Sebastian Pop
On Wed, Sep 27, 2017 at 2:21 AM, Richard Biener  wrote:
> On Tue, 26 Sep 2017, Sebastian Pop wrote:
>
>> On Tue, Sep 26, 2017 at 7:03 AM, Richard Biener  wrote:
>>
>> >
>> > The following is the result of me trying to understand SCOP detection
>> > and the validity checks spread around the machinery.  It removes several
>> > quadraticnesses by folding validity checks into
>> > scop_detection::harmful_loop_in_region where we already walk over all
>> > BBs in the region and process individual found loops.
>> >
>> > It also rewrites build_scop_depth/build_scop_breadth into something
>> > I can undestand.
>> >
>> > Bootstrap and regtest is running on x86_64-unknown-linux-gnu (graphite.exp
>> > for all langs is happy, so is SPEC CPU 2006 testing where the statistics
>> > agree before/after the patch).
>> >
>> > I'll apply this after the bootstrap finished.
>> >
>>
>> Have you tried to bootstrap with BOOT_CFLAGS="-O2 -fgraphite-identity"?
>
> I do "-O2 -g -floop-nest-optimize"

Very good.

> but I guess -fgraphite-identity
> should catch more issues?

It would systematically exercise the scop detection and code generation.
When isl scheduler does not find a better schedule, we would not bother
running the code gen part.

> Hmm, maybe -floop-nest-optimize and
> -fgraphite-identity should be combinable
>
> Index: gcc/graphite-optimize-isl.c
> ===
> --- gcc/graphite-optimize-isl.c (revision 253203)
> +++ gcc/graphite-optimize-isl.c (working copy)
> @@ -189,7 +189,7 @@ optimize_isl (scop_p scop)
> print_schedule_ast (dump_file, scop->original_schedule, scop);
>isl_schedule_free (scop->transformed_schedule);
>scop->transformed_schedule = isl_schedule_copy
> (scop->original_schedule);
> -  return false;
> +  return flag_graphite_identity || flag_loop_parallelize_all;

Yes.

>  }
>
>return true;
>
> I'll test/commit the above.

ok.

>
>>
>> > Richard.
>> >
>> > 2017-09-26  Richard Biener  
>> >
>> > * graphite-scop-detection.c (scop_detection::build_scop_depth):
>> > Rewrite,
>> > fold in ...
>> > (scop_detection::build_scop_breadth): ... this.  Removed.
>> > (scop_detection::loop_is_valid_in_scop): Fold into single caller.
>> > (scop_detection::harmful_stmt_in_bb): Likewise.
>> > (scop_detection::graphite_can_represent_stmt): Likewise.
>> > (scop_detection::loop_body_is_valid_scop): Likewise.  Remove
>> > recursion.
>> > (scop_detection::can_represent_loop): Remove recursion, fold in
>> > ...
>> > (scop_detection::can_represent_loop_1): ... this.  Removed.
>> > (scop_detection::harmful_loop_in_region): Simplify after inlining
>> > the above and remove more quadraticness.
>> > (build_scops): Adjust.
>> > * tree-data-ref.c (loop_nest_has_data_refs): Remove pointless
>> > quadraticness.
>> >
>> >
>> This goes in the right direction: it cuts down compilation time.
>> As it is not a trivial change, I need some time to understand how
>> the scop detection works with this change.
>
> The only functional change should be that the SESE composition now
> works top-down instead of working its way bottom-up.  It's not clear
> whether we do more or less work that way

So we went from top-down to bottom-up,
and now with this change we go back to top-down.
I think both algorithms are equivalent in terms of number
of times we validate statements.

We explained the current implementation of the scop detection in:
http://impact.gforge.inria.fr/impact2016/papers/impact2016-kumar.pdf
http://impact.gforge.inria.fr/impact2016/papers/impact2016-kumar-slides.pdf

Here is what happens on an example:

loop_1 {
  loop_2 {
stmt_1
  }
  stmt_2
  loop_3 {
stmt_3
  }
}

- with a top down scop detection, we would start the analysis with loop_1,
and start validating that every stmt in its body (stmt_1, stmt_2,
and finally stmt_3) can be represented in the polyhedral representation.
If at any moment the analysis returns "cannot represent", it would go one
level down and try to validate the immediate sub loop loop_2.
Let's assume that stmt_1 can be represented, and so it would try to
extend the scop by validating stmt_2 and then its sibling loop_3, and say
we fail on validating stmt_3.  All done, max scop is stmt_1 in loop_2
followed by stmt_2.

- with a bottom up we would start from the inner loop_2, it passes
validation of stmt_1, then we extend the scop by validating stmt_2,
and then we fail at validation of stmt_3.  All done, max scop is stmt_1 in
loop_2 followed by stmt_2.  In the bottom-up process we don't
have to validate the outer loop_1.

Supposing that there is no fail in the process, then a top-down detection
would be faster as it does not need to validate one by one the inner loops:
it just goes in one pass over the stmts of loop_1 body.

> I think we can 

Re: [PATCH][GRAPHITE] Simplify SCOP detection

2017-09-27 Thread Richard Biener
On Tue, 26 Sep 2017, Sebastian Pop wrote:

> On Tue, Sep 26, 2017 at 7:03 AM, Richard Biener  wrote:
> 
> >
> > The following is the result of me trying to understand SCOP detection
> > and the validity checks spread around the machinery.  It removes several
> > quadraticnesses by folding validity checks into
> > scop_detection::harmful_loop_in_region where we already walk over all
> > BBs in the region and process individual found loops.
> >
> > It also rewrites build_scop_depth/build_scop_breadth into something
> > I can undestand.
> >
> > Bootstrap and regtest is running on x86_64-unknown-linux-gnu (graphite.exp
> > for all langs is happy, so is SPEC CPU 2006 testing where the statistics
> > agree before/after the patch).
> >
> > I'll apply this after the bootstrap finished.
> >
> 
> Have you tried to bootstrap with BOOT_CFLAGS="-O2 -fgraphite-identity"?

I do "-O2 -g -floop-nest-optimize" but I guess -fgraphite-identity
should catch more issues?  Hmm, maybe -floop-nest-optimize and
-fgraphite-identity should be combinable

Index: gcc/graphite-optimize-isl.c
===
--- gcc/graphite-optimize-isl.c (revision 253203)
+++ gcc/graphite-optimize-isl.c (working copy)
@@ -189,7 +189,7 @@ optimize_isl (scop_p scop)
print_schedule_ast (dump_file, scop->original_schedule, scop);
   isl_schedule_free (scop->transformed_schedule);
   scop->transformed_schedule = isl_schedule_copy 
(scop->original_schedule);
-  return false;
+  return flag_graphite_identity || flag_loop_parallelize_all;
 }
 
   return true;

I'll test/commit the above.

> 
> > Richard.
> >
> > 2017-09-26  Richard Biener  
> >
> > * graphite-scop-detection.c (scop_detection::build_scop_depth):
> > Rewrite,
> > fold in ...
> > (scop_detection::build_scop_breadth): ... this.  Removed.
> > (scop_detection::loop_is_valid_in_scop): Fold into single caller.
> > (scop_detection::harmful_stmt_in_bb): Likewise.
> > (scop_detection::graphite_can_represent_stmt): Likewise.
> > (scop_detection::loop_body_is_valid_scop): Likewise.  Remove
> > recursion.
> > (scop_detection::can_represent_loop): Remove recursion, fold in
> > ...
> > (scop_detection::can_represent_loop_1): ... this.  Removed.
> > (scop_detection::harmful_loop_in_region): Simplify after inlining
> > the above and remove more quadraticness.
> > (build_scops): Adjust.
> > * tree-data-ref.c (loop_nest_has_data_refs): Remove pointless
> > quadraticness.
> >
> >
> This goes in the right direction: it cuts down compilation time.
> As it is not a trivial change, I need some time to understand how
> the scop detection works with this change.

The only functional change should be that the SESE composition now
works top-down instead of working its way bottom-up.  It's not clear
whether we do more or less work that way but at least the function
is now readable -- I think we can structure it bottom-up as well
without doing the confusing two-function way (which I believe
did quite some duplicate work but I never was sure...).

Richard.


Re: [PATCH][GRAPHITE] Simplify SCOP detection

2017-09-26 Thread Sebastian Pop
On Tue, Sep 26, 2017 at 7:03 AM, Richard Biener  wrote:

>
> The following is the result of me trying to understand SCOP detection
> and the validity checks spread around the machinery.  It removes several
> quadraticnesses by folding validity checks into
> scop_detection::harmful_loop_in_region where we already walk over all
> BBs in the region and process individual found loops.
>
> It also rewrites build_scop_depth/build_scop_breadth into something
> I can undestand.
>
> Bootstrap and regtest is running on x86_64-unknown-linux-gnu (graphite.exp
> for all langs is happy, so is SPEC CPU 2006 testing where the statistics
> agree before/after the patch).
>
> I'll apply this after the bootstrap finished.
>

Have you tried to bootstrap with BOOT_CFLAGS="-O2 -fgraphite-identity"?


> Richard.
>
> 2017-09-26  Richard Biener  
>
> * graphite-scop-detection.c (scop_detection::build_scop_depth):
> Rewrite,
> fold in ...
> (scop_detection::build_scop_breadth): ... this.  Removed.
> (scop_detection::loop_is_valid_in_scop): Fold into single caller.
> (scop_detection::harmful_stmt_in_bb): Likewise.
> (scop_detection::graphite_can_represent_stmt): Likewise.
> (scop_detection::loop_body_is_valid_scop): Likewise.  Remove
> recursion.
> (scop_detection::can_represent_loop): Remove recursion, fold in
> ...
> (scop_detection::can_represent_loop_1): ... this.  Removed.
> (scop_detection::harmful_loop_in_region): Simplify after inlining
> the above and remove more quadraticness.
> (build_scops): Adjust.
> * tree-data-ref.c (loop_nest_has_data_refs): Remove pointless
> quadraticness.
>
>
This goes in the right direction: it cuts down compilation time.
As it is not a trivial change, I need some time to understand how
the scop detection works with this change.

Sebastian