On Mon, Jul 27, 2015 at 3:36 AM, James Greenhalgh
<james.greenha...@arm.com> wrote:
> On Mon, Jul 27, 2015 at 10:52:58AM +0100, pins...@gmail.com wrote:
>> > On Jul 27, 2015, at 2:26 AM, Jiong Wang <jiong.w...@arm.com> wrote:
>> >
>> > Andrew Pinski writes:
>> >
>> >>> On Fri, Jul 24, 2015 at 2:07 AM, Jiong Wang <jiong.w...@arm.com> wrote:
>> >>>
>> >>> James Greenhalgh writes:
>> >>>
>> >>>>> On Wed, May 20, 2015 at 01:35:41PM +0100, Jiong Wang wrote:
>> >>>>> Current IRA still use both target macros in a few places.
>> >>>>>
>> >>>>> Tell IRA to use the order we defined rather than with it's own cost
>> >>>>> calculation. Allocate caller saved first, then callee saved.
>> >>>>>
>> >>>>> This is especially useful for LR/x30, as it's free to allocate and is
>> >>>>> pure caller saved when used in leaf function.
>> >>>>>
>> >>>>> Haven't noticed significant impact on benchmarks, but by grepping some
>> >>>>> keywords like "Spilling", "Push.*spill" etc in ira rtl dump, the number
>> >>>>> is smaller.
>> >>>>>
>> >>>>> OK for trunk?
>> >>>>
>> >>>> OK, sorry for the delay.
>> >>>>
>> >>>> It might be mail client mangling, but please check that the trailing 
>> >>>> slashes
>> >>>> line up in the version that gets committed.
>> >>>>
>> >>>> Thanks,
>> >>>> James
>> >>>>
>> >>>>> 2015-05-19  Jiong. Wang  <jiong.w...@arm.com>
>> >>>>>
>> >>>>> gcc/
>> >>>>>  PR 63521
>> >>>>>  * config/aarch64/aarch64.h (REG_ALLOC_ORDER): Define.
>> >>>>>  (HONOR_REG_ALLOC_ORDER): Define.
>> >>>
>> >>> Patch reverted.
>> >>
>> >> I did not see a reason why this patch was reverted.  Maybe I am
>> >> missing an email or something.
>> >
>> > There are several execution regressions under gcc testsuite, although as
>> > far as I can see it's this patch exposed hidding bugs in those
>> > testcases, but there might be one other issue, so to be conservative, I
>> > temporarily reverted this patch.
>>
>> If you are talking about:
>> gcc.target/aarch64/aapcs64/func-ret-2.c execution
>> Etc.
>>
>> These test cases are too dependent on the original register allocation order
>> and really can be safely ignored. Really these three tests should be moved or
>> written in a more sane way.
>
> Yup, completely agreed - but the testcases do throw up something
> interesting. If we are allocating registers to hold 128-bit values, and
> we pick x7 as highest preference, we implicitly allocate x8 along with it.
> I think we probably see the same thing if the first thing we do in a
> function is a structure copy through a back-end expanded movmem, which
> will likely begin with a 128-bit LDP using x7, x8.
>
> If the argument for this patch is that we prefer to allocate x7-x0 first,
> followed by x8, then we've potentially made a sub-optimal decision, our
> allocation order for 128-bit values is x7,x8,x5,x6 etc.
>
> My hunch is that we *might* get better code generation in this corner case
> out of some permutation of the allocation order for argument
> registers. I'm thinking something along the lines of
>
>   {x6, x5, x4, x7, x3, x2, x1, x0, x8, ... }
>
> I asked Jiong to take a look at that, and I agree with his decision to
> reduce the churn on trunk and just revert the patch until we've come to
> a conclusion based on some evidence - rather than just my hunch! I agree
> that it would be harmless on trunk from a testing point of view, but I
> think Jiong is right to revert the patch until we better understand the
> code-generation implications.
>
> Of course, it might be that I am completely wrong! If you've already taken
> a look at using a register allocation order like the example I gave and
> have something to share, I'd be happy to read your advice!

Any news on this patch?  It has been a year since it was reverted for
a bad test that was failing.

Thanks,
Andrew

>
> Thanks,
> James
>

Reply via email to