I will try this method for trunk later. thanks Carrot
On Wed, Apr 20, 2011 at 4:48 PM, Richard Earnshaw <rearn...@arm.com> wrote: > > On Wed, 2011-04-20 at 16:26 +0800, Carrot Wei wrote: >> On Tue, Apr 19, 2011 at 8:55 PM, Richard Earnshaw <rearn...@arm.com> wrote: >> > >> > On Tue, 2011-04-19 at 17:41 +0800, Guozhi Wei wrote: >> >> Reload pass tries to determine the stack frame, so it needs to check the >> >> push/pop lr optimization opportunity. One of the criteria is if there is >> >> any >> >> far jump inside the function. Unfortunately at this time gcc can't decide >> >> each >> >> instruction's length and basic block layout, so it can't know the offset >> >> of >> >> a jump. To be conservative it assumes every jump is a far jump. So any >> >> jump >> >> in a function will prevent this push/pop lr optimization. >> >> >> >> To enable the push/pop lr optimization in reload pass, I compute the >> >> possible >> >> maximum length of the function body. If the length is not large enough, >> >> far >> >> jump is not necessary, so we can safely do push/pop lr optimization. >> >> >> >> Tested on arm qemu with options -march=armv5te -mthumb, without >> >> regression. >> >> >> >> This patch is for google/main. >> >> >> >> 2011-04-19 Guozhi Wei <car...@google.com> >> >> >> >> Google ref 40255. >> >> * gcc/config/arm/arm.c (SHORTEST_FAR_JUMP_LENGTH): New constant. >> >> (estimate_function_length): New function. >> >> (thumb_far_jump_used_p): No far jump is needed in short function. >> >> >> > >> > Setting aside for the moment Richi's issue with hot/cold sections, this >> > isn't safe. Firstly get_attr_length() doesn't return the worst case >> > length; and secondly, it doesn't take into account the size of reload >> > insns that are still on the reloads stack -- these are only emitted >> > right at the end of the reload pass. Both of these would need to be >> > addressed before this can be safely done. >> > >> > It's worth noting here that in the dim and distant past we used to try >> > to estimate the size of the function and eliminate redundant saves of >> > R14, but the code had to be removed because it was too fragile; but it >> > looks like some vestiges of the code are still in the compiler. >> > >> > A slightly less optimistic approach, but one that is much safer is to >> > scan the function after reload has completed and see if we can avoid >> > having to push LR. We can do this if: >> > >> I guess "less optimistic" is relative to the ideal optimization >> situation, I believe it is still much better than current result. Do >> you think if arm_reorg() is appropriate place to do this? >> > > Making the decision in a single pass would certainly be the best > approach; and arm_reorg is certainly going to come after all other major > code re-arrangements. Indeed, you should probably do this after the > minipool placement so that you can be sure that these don't bulk up the > body of the function too much. > > As you are doing the elimination late on in the compilation you can do a > better job of estimation by calling shorten_branches() to work out the > precise length of each insn. Then you can simply scan over the insns to > work out if there is a branch that still needs r14. > > R. > > > >