On Mon, 25 Sep 2017, Bin.Cheng wrote:

> On Mon, Sep 25, 2017 at 1:46 PM, Richard Biener <rguent...@suse.de> wrote:
> > On Mon, 25 Sep 2017, Richard Biener wrote:
> >
> >> On Fri, 22 Sep 2017, Richard Biener wrote:
> >>
> >> >
> >> > This simplifies canonicalize_loop_closed_ssa and does other minimal
> >> > TLC.  It also adds a testcase I reduced from a stupid mistake I made
> >> > when reworking canonicalize_loop_closed_ssa.
> >> >
> >> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> >> >
> >> > SPEC CPU 2006 is happy with it, current statistics on x86_64 with
> >> > -Ofast -march=haswell -floop-nest-optimize are
> >> >
> >> >  61 loop nests "optimized"
> >> >  45 loop nest transforms cancelled because of code generation issues
> >> >  21 loop nest optimizations timed out the 350000 ISL "operations" we allow
> >>
> >> Overall compile time (with -j6) is 695 sec. w/o -floop-nest-optimize
> >> and 709 sec. with (this was with release checking).
> >>
> >> A single-run has 416.gamess (580s -> 618s),
> >> 436.cactusADM (206s -> 182s), 437.leslie3d (228s -> 218s),
> >> 450.soplex (229s -> 226s), 465.tonto (428s -> 425s),
> >> 401.bzip2 (383s -> 379s), 462.libquantum (352s -> 343s),
> >> ignoring +-2s changes.  Will do a 3-run for those to confirm
> >> (it would be only a single regression for 416.gamess).
> >
> > 416.gamess regression confirmed, 450.soplex improvement as well,
> 436/437 improvements?  450.soplex (229s -> 226s) looks like noise.

base is with -floop-nest-optimize, peak without.

Benchmark       Base ref  Base run  Base ratio     Peak ref  Peak run  Peak ratio
416.gamess         19580       619        31.7 S      19580       576        34.0 *
416.gamess         19580       614        31.9 S      19580       577        33.9 S
416.gamess         19580       618        31.7 *      19580       576        34.0 S
436.cactusADM      11950       194        61.5 S      11950       204        58.5 S
436.cactusADM      11950       184        65.0 S      11950       187        63.8 *
436.cactusADM      11950       186        64.1 *      11950       186        64.1 S
437.leslie3d        9400       219        43.0 S       9400       218        43.1 S
437.leslie3d        9400       219        43.0 *       9400       223        42.1 S
437.leslie3d        9400       218        43.0 S       9400       223        42.2 *
450.soplex          8340       225        37.0 S       8340       231        36.1 S
450.soplex          8340       226        36.9 *       8340       230        36.3 *
450.soplex          8340       227        36.8 S       8340       229        36.4 S
465.tonto           9840       426        23.1 S       9840       427        23.0 *
465.tonto           9840       424        23.2 S       9840       430        22.9 S
465.tonto           9840       425        23.2 *       9840       425        23.2 S
401.bzip2           9650       379        25.5 S       9650       378        25.5 S
401.bzip2           9650       379        25.5 *       9650       380        25.4 *
401.bzip2           9650       379        25.5 S       9650       380        25.4 S
462.libquantum     20720       351        59.0 *      20720       349        59.4 S
462.libquantum     20720       351        59.0 S      20720       345        60.1 *
462.libquantum     20720       352        58.8 S      20720       344        60.2 S
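
As a rough cross-check of the table above, here is a minimal Python sketch
(not part of the original measurements; the names RUNS, DELTAS and
median_delta_percent are made up for illustration). It takes the median of
the three run times per benchmark and reports how much slower or faster
base (with -floop-nest-optimize) is relative to peak (without). The medians
happen to coincide with the rows marked `*`.

```python
from statistics import median

# Run times in seconds, copied from the three-run table above.
# Each entry is (base runs, peak runs); base is with
# -floop-nest-optimize, peak is without.
RUNS = {
    "416.gamess":     ([619, 614, 618], [576, 577, 576]),
    "436.cactusADM":  ([194, 184, 186], [204, 187, 186]),
    "437.leslie3d":   ([219, 219, 218], [218, 223, 223]),
    "450.soplex":     ([225, 226, 227], [231, 230, 229]),
    "465.tonto":      ([426, 424, 425], [427, 430, 425]),
    "401.bzip2":      ([379, 379, 379], [378, 380, 380]),
    "462.libquantum": ([351, 351, 352], [349, 345, 344]),
}


def median_delta_percent(base, peak):
    """Percent change of the median base time relative to the median peak time."""
    b, p = median(base), median(peak)
    return 100.0 * (b - p) / p


# Positive values: the -floop-nest-optimize build is slower.
DELTAS = {name: median_delta_percent(b, p) for name, (b, p) in RUNS.items()}

if __name__ == "__main__":
    for name, delta in DELTAS.items():
        print(f"{name:15s} {delta:+6.2f}%")
```

By this measure 416.gamess is about 7.3% slower with -floop-nest-optimize,
while every other benchmark stays within roughly 2% either way.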



> Thanks,
> bin
> > in the three-run 462.libquantum regresses (344s -> 351s) so I suppose
> > that's noise.
> >
> > Richard.
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
HRB 21284 (AG Nuernberg)
