On Mon, Jul 27, 2015 at 3:36 AM, James Greenhalgh <james.greenha...@arm.com> wrote:
> On Mon, Jul 27, 2015 at 10:52:58AM +0100, pins...@gmail.com wrote:
>> > On Jul 27, 2015, at 2:26 AM, Jiong Wang <jiong.w...@arm.com> wrote:
>> >
>> > Andrew Pinski writes:
>> >
>> >>> On Fri, Jul 24, 2015 at 2:07 AM, Jiong Wang <jiong.w...@arm.com> wrote:
>> >>>
>> >>> James Greenhalgh writes:
>> >>>
>> >>>>> On Wed, May 20, 2015 at 01:35:41PM +0100, Jiong Wang wrote:
>> >>>>> Current IRA still use both target macros in a few places.
>> >>>>>
>> >>>>> Tell IRA to use the order we defined rather than with it's own cost
>> >>>>> calculation. Allocate caller saved first, then callee saved.
>> >>>>>
>> >>>>> This is especially useful for LR/x30, as it's free to allocate and is
>> >>>>> pure caller saved when used in leaf function.
>> >>>>>
>> >>>>> Haven't noticed significant impact on benchmarks, but by grepping some
>> >>>>> keywords like "Spilling", "Push.*spill" etc in ira rtl dump, the number
>> >>>>> is smaller.
>> >>>>>
>> >>>>> OK for trunk?
>> >>>>
>> >>>> OK, sorry for the delay.
>> >>>>
>> >>>> It might be mail client mangling, but please check that the trailing
>> >>>> slashes
>> >>>> line up in the version that gets committed.
>> >>>>
>> >>>> Thanks,
>> >>>> James
>> >>>>
>> >>>>> 2015-05-19 Jiong. Wang <jiong.w...@arm.com>
>> >>>>>
>> >>>>> gcc/
>> >>>>> PR 63521
>> >>>>> * config/aarch64/aarch64.h (REG_ALLOC_ORDER): Define.
>> >>>>> (HONOR_REG_ALLOC_ORDER): Define.
>> >>>
>> >>> Patch reverted.
>> >>
>> >> I did not see a reason why this patch was reverted. Maybe I am
>> >> missing an email or something.
>> >
>> > There are several execution regressions under gcc testsuite, although as
>> > far as I can see it's this patch exposed hidding bugs in those
>> > testcases, but there might be one other issue, so to be conservative, I
>> > temporarily reverted this patch.
>>
>> If you are talking about:
>> gcc.target/aarch64/aapcs64/func-ret-2.c execution
>> Etc.
>>
>> These test cases are too dependent on the original register allocation order
>> and really can be safely ignored. Really these three tests should be moved or
>> written in a more sane way.
>
> Yup, completely agreed - but the testcases do throw up something
> interesting. If we are allocating registers to hold 128-bit values, and
> we pick x7 as highest preference, we implicitly allocate x8 along with it.
> I think we probably see the same thing if the first thing we do in a
> function is a structure copy through a back-end expanded movmem, which
> will likely begin with a 128-bit LDP using x7, x8.
>
> If the argument for this patch is that we prefer to allocate x7-x0 first,
> followed by x8, then we've potentially made a sub-optimal decision, our
> allocation order for 128-bit values is x7,x8,x5,x6 etc.
>
> My hunch is that we *might* get better code generation in this corner case
> out of some permutation of the allocation order for argument
> registers. I'm thinking something along the lines of
>
> {x6, x5, x4, x7, x3, x2, x1, x0, x8, ... }
>
> I asked Jiong to take a look at that, and I agree with his decision to
> reduce the churn on trunk and just revert the patch until we've come to
> a conclusion based on some evidence - rather than just my hunch! I agree
> that it would be harmless on trunk from a testing point of view, but I
> think Jiong is right to revert the patch until we better understand the
> code-generation implications.
>
> Of course, it might be that I am completely wrong! If you've already taken
> a look at using a register allocation order like the example I gave and
> have something to share, I'd be happy to read your advice!
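
For reference, the pairing concern quoted above can be sketched with a
toy model. This is not GCC code and not how IRA actually works: it only
assumes that a 128-bit value takes the first register in the preference
order whose next-numbered register is also free. The names alloc_pair,
pairs_touch_x8, described_order and suggested_order are hypothetical,
and both orders are cut off at x8 because the rest of the suggested
order is elided in the thread.

```c
#include <assert.h>

#define NUM_REGS 9 /* model x0..x8 only; the rest of the order is elided */

/* Assumption: the reverted order as described in the thread (x7-x0, then x8). */
static const int described_order[NUM_REGS] = {7, 6, 5, 4, 3, 2, 1, 0, 8};
/* The permutation suggested in the thread, truncated at x8. */
static const int suggested_order[NUM_REGS] = {6, 5, 4, 7, 3, 2, 1, 0, 8};

/* Claim the first free pair (xN, xN+1) in preference order.
   Returns N, or -1 when no pair is free.  */
int alloc_pair(const int *order, int *used)
{
    for (int i = 0; i < NUM_REGS; i++) {
        int r = order[i];
        assert(r >= 0 && r < NUM_REGS);
        /* A pair based at the last modelled register has no partner here. */
        if (r + 1 < NUM_REGS && !used[r] && !used[r + 1]) {
            used[r] = used[r + 1] = 1;
            return r;
        }
    }
    return -1;
}

/* Allocate up to `count` 128-bit values; return 1 if x8 ended up in use.  */
int pairs_touch_x8(const int *order, int count)
{
    int used[NUM_REGS] = {0};
    for (int i = 0; i < count; i++)
        if (alloc_pair(order, used) < 0)
            break;
    return used[8];
}
```

Under this model, pairs_touch_x8(described_order, 1) returns 1, since
the very first pair is (x7,x8), while pairs_touch_x8(suggested_order, 4)
returns 0, since four pairs fit in x0-x7. Whether the real allocator
behaves this way is exactly the open question.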

Any news on this patch? It has been a year since it was reverted for
a bad test that was failing.

Thanks,
Andrew

>
> Thanks,
> James
>