On 11/24/2014 06:47 AM, Ajit Kumar Agarwal wrote:
> All:
>
> The optimization of reducing save and restore of the callee and caller
> saved register has been the attention of increasing the performance of
> the benchmark. The callee saved registers is saved at the entry and
> restore at the exit of the procedure if the register is reused inside
> the procedure whereas the caller save registers at the caller site is
> saved before the call and the restore after the variable is live and
> spans through the call.
>
> The GCC port has done some optimization whereas the call-used registers
> are live inside the procedure and has been set as 1 bit then it will not
> be saved and restored. This is based on the data flow analysis.
>
> The callee saved registers is useful when there are multiple calls in
> the call graph whereas the caller save registers are useful if the call
> is the leaf procedure then the saving before the call and restore after
> the call will be useful and increases the performance.
>
> By traversing the call graph in depth-first-order and the bottom-up
> approach we can propagate the save and restore at the procedure entry
> and exit to the upper regions of the call graph which reduces the save
> and restore at all the lower regions across the various lower calls.
> These decisions can be made based on the frequency of the call in the
> call graph as proposed by Fred Chow.
>

It is hard to implement, as you need to change already generated code (in callees or callers, depending on the order in which you generate code for the functions in the call graph). You cannot decide what to do without running register allocation (RA) on at least one function (caller or callee). Some rough heuristics are possible at the level of the existing call graph, but I guess they will probably hurt more than improve the code because the estimation will be very inaccurate.
However, if you create an infrastructure to do such things (keeping the RTL of all compiled functions available), it could be useful for other projects too, e.g. for Minimum Cost Interprocedural Register Allocation (see http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.48.5914), which is a more general approach to what you propose.
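
To make the bottom-up propagation discussed above concrete, here is a rough toy sketch in Python. It is not GCC code and not Chow's actual algorithm: the function names, register names, frequency numbers and the hoisting rule (move a save to the callers when they run less often than the callee) are all assumptions made for illustration. It also assumes an acyclic call graph in which every function is reachable from the root. Note that it needs the register usage of every function up front, which is exactly the kind of whole-program information mentioned above.

```python
from collections import defaultdict


def post_order(call_graph, root):
    """Return functions bottom-up (callees before callers); assumes no recursion."""
    seen, order = set(), []

    def visit(fn):
        if fn in seen:
            return
        seen.add(fn)
        for callee in call_graph.get(fn, ()):
            visit(callee)
        order.append(fn)

    visit(root)
    return order


def place_saves(call_graph, root, uses, freq):
    """Decide, per function, which callee-saved registers it saves/restores.

    uses: function -> registers it clobbers (hypothetical input)
    freq: function -> estimated execution count (hypothetical input)
    """
    callers = defaultdict(list)
    for fn, callees in call_graph.items():
        for callee in callees:
            callers[callee].append(fn)

    # Registers each function is currently responsible for preserving.
    pending = {fn: set(regs) for fn, regs in uses.items()}
    saves = {}
    for fn in post_order(call_graph, root):
        regs = pending.get(fn, set())
        parents = callers.get(fn, [])
        # Toy heuristic: hoist when all callers together run less often.
        if parents and sum(freq[p] for p in parents) < freq[fn]:
            for p in parents:
                pending.setdefault(p, set()).update(regs)
            saves[fn] = set()
        else:
            saves[fn] = set(regs)
    return saves


call_graph = {"main": ["driver"], "driver": ["leaf"], "leaf": []}
uses = {"main": set(), "driver": {"r20"}, "leaf": {"r19"}}
freq = {"main": 1, "driver": 1, "leaf": 1000}
saves = place_saves(call_graph, "main", uses, freq)
```

In this example the hot leaf no longer saves anything, and its register is saved once in driver instead. With real code the frequencies are only estimates, which is where the inaccuracy I mentioned comes in.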