On Fri, 22 Sep 2017, Sebastian Pop wrote:

> On Fri, Sep 22, 2017 at 8:03 AM, Richard Biener <rguent...@suse.de> wrote:
> 
> >
> > This simplifies canonicalize_loop_closed_ssa and does other minimal
> > TLC.  It also adds a testcase I reduced from a stupid mistake I made
> > when reworking canonicalize_loop_closed_ssa.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> >
> > SPEC CPU 2006 is happy with it, current statistics on x86_64 with
> > -Ofast -march=haswell -floop-nest-optimize are
> >
> >  61 loop nests "optimized"
> >  45 loop nest transforms cancelled because of code generation issues
> >  21 loop nest optimizations timed out the 350000 ISL "operations" we allow
> >
> > I say "optimized" because the usual transform I've seen is static tiling
> > as enforced by GRAPHITE according to --param loop-block-tile-size.
> > There's no way to automagically figure what kind of transform ISL did
> >
> 
> Here is how to automate (without magic) the detection
> of the transform that isl did.
> 
> The problem solved by isl is the minimization of strides
> in memory.  To do this, we need to give the isl scheduler
> the validity dependence graph; in graphite-optimize-isl.c,
> see the validity (RAW, WAR, WAW) and the proximity
> (RAR + validity) maps.  The proximity map does include
> read-after-read dependences, as the isl scheduler needs to
> minimize strides between consecutive reads.
> 
> When you apply the schedule to the dependence graph,
> you can tell from the result the strides in memory.  A good
> way to say whether a transform was beneficial is to sum up
> all memory strides and make sure that the sum
> decreases after the transform.  We could add a printf with
> the sum of strides before and after the transform, and have
> the testcases check for that.
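
To make the stride-sum idea concrete, here is a toy sketch in plain C.
It is only an illustration under stated assumptions: it does not use
the isl API or any GRAPHITE data structure, it models a single
row-major 2-D array visited once per element, and the names
sum_of_strides and transform_is_beneficial are hypothetical.  The
"transform" is a plain loop interchange, and the metric is the sum of
absolute address differences between consecutive accesses.

```c
#include <assert.h>
#include <stdlib.h>

/* Element offset of A[i][j] in a row-major array with COLS columns.  */
static long
address (long i, long j, long cols)
{
  return i * cols + j;
}

/* Sum of absolute address differences between consecutive accesses
   when every element of a ROWS x COLS array is visited once.
   INTERCHANGED == 0: i is the outer loop, j the inner loop.
   INTERCHANGED != 0: j is the outer loop, i the inner loop.
   Hypothetical helper for illustration, not part of GCC or isl.  */
long
sum_of_strides (long rows, long cols, int interchanged)
{
  assert (rows > 0 && cols > 0);

  long outer = interchanged ? cols : rows;
  long inner = interchanged ? rows : cols;
  long sum = 0;
  long prev = 0;
  int have_prev = 0;

  for (long o = 0; o < outer; o++)
    for (long n = 0; n < inner; n++)
      {
        long i = interchanged ? n : o;
        long j = interchanged ? o : n;
        long cur = address (i, j, cols);

        /* The first access has no predecessor, so it adds no stride.  */
        if (have_prev)
          sum += labs (cur - prev);
        prev = cur;
        have_prev = 1;
      }
  return sum;
}

/* Nonzero if the stride sum strictly decreased after the transform.  */
int
transform_is_beneficial (long before, long after)
{
  return after < before;
}
```

For a 4 x 3 array, the column-first order (interchanged) has a stride
sum of 43 and the row-first order has 11, so interchanging from
column-first to row-first counts as beneficial under this metric.  A
testcase could print both sums and scan the dump for them.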

Interesting.  Can you perhaps show me in code how to do that?

Thanks,
Richard.
