-----Original Message-----
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: Monday, June 16, 2014 7:55 PM
To: Ajit Kumar Agarwal
Cc: gcc@gcc.gnu.org; Vladimir Makarov; Michael Eager; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: Register Pressure guided Unroll and Jam in GCC !!

On Mon, Jun 16, 2014 at 4:14 PM, Ajit Kumar Agarwal <ajit.kumar.agar...@xilinx.com> wrote:
> Hello All:
>
> I have worked on the Open64 compiler, where register-pressure-guided
> Unroll and Jam gave a good performance improvement on the C and C++
> SPEC benchmarks and also on the Fortran benchmarks.
>
> Unroll and Jam increases the register pressure in the unrolled loop,
> which leads to more spills and fetches and degrades the performance of
> the unrolled loop. The cache-locality benefit achieved through Unroll
> and Jam is degraded by the spill instructions caused by the increased
> register pressure. It is better to decide the unroll factor of the loop
> based on a performance model of the register pressure.
>
> Most loop optimizations, like Unroll and Jam, are implemented in the
> high-level IR. Register-pressure-based Unroll and Jam requires
> calculating register pressure in the high-level IR, similar to the
> register pressure calculated during register allocation. This makes the
> implementation complex.
>
> To overcome this, the Open64 compiler makes the unrolling decisions both
> in the high-level IR and at the code generation level. Some of the
> decisions are deferred to the end of code generation. The advantage of
> this approach is that it can use the register pressure information
> calculated by the register allocator, which makes the implementation
> much simpler and less complex.
>
> Can we have this approach in GCC, making the Unroll and Jam decisions in
> the high-level IR and deferring some of them to the code generation
> level, as Open64 does?
>
> Please let me know what you think.

>>Sure, you can for example compute validity of the transform during the GIMPLE
>>loop opts, annotate the loop meta-information with the desired transform and
>>apply it (or not) later during RTL unrolling.

Thanks !! Has RTL unrolling already been implemented?

Richard.

> Thanks & Regards
> Ajit
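
To make the idea in this thread concrete, here is a minimal sketch in C. It is not GCC or Open64 code. `choose_unroll_factor` is a hypothetical helper that models the "annotate early, decide late" split: the high-level pass records a desired factor, and a later stage clamps it to what fits in the available registers. The linear demand model (`base_regs + regs_per_copy * factor`) is an assumption made for illustration only; a real compiler would use the register allocator's own pressure estimate. `matvec_jam2` shows by hand what Unroll and Jam by 2 does to a matrix-vector product: the outer loop is unrolled twice and the two inner-loop copies are fused, so each `x[j]` is loaded once for two rows, at the cost of one more live accumulator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical late-stage decision: return the largest factor in
   [1, desired_factor] whose estimated register demand fits in avail_regs.
   Assumed model: demand(f) = base_regs + regs_per_copy * f. */
int choose_unroll_factor(int avail_regs, int base_regs,
                         int regs_per_copy, int desired_factor)
{
    assert(desired_factor >= 1);
    int best = 1;                       /* factor 1 means "do not unroll" */
    for (int f = 2; f <= desired_factor; f++) {
        if (base_regs + regs_per_copy * f > avail_regs)
            break;                      /* larger factors would spill */
        best = f;
    }
    return best;
}

/* Reference loop nest: y = A * x, with A stored row-major as n x m. */
void matvec(size_t n, size_t m, const double *a, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++) {
        double s = 0.0;
        for (size_t j = 0; j < m; j++)
            s += a[i * m + j] * x[j];
        y[i] = s;
    }
}

/* The same computation after Unroll and Jam by 2 on the outer loop. */
void matvec_jam2(size_t n, size_t m, const double *a, const double *x,
                 double *y)
{
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        double s0 = 0.0, s1 = 0.0;      /* one accumulator per jammed row */
        for (size_t j = 0; j < m; j++) {
            double xj = x[j];           /* loaded once, reused for both rows */
            s0 += a[i * m + j] * xj;
            s1 += a[(i + 1) * m + j] * xj;
        }
        y[i] = s0;
        y[i + 1] = s1;
    }
    for (; i < n; i++) {                /* remainder row when n is odd */
        double s = 0.0;
        for (size_t j = 0; j < m; j++)
            s += a[i * m + j] * x[j];
        y[i] = s;
    }
}
```

Under the assumed model, with 16 available registers, 4 used by the loop regardless of unrolling, and 3 more per jammed copy, a desired factor of 8 is clamped to 4. Each row's sum is accumulated in the same order in both versions, so `matvec_jam2` computes the same values as `matvec`.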