Re: RFC: 2->2 combine patch

Segher Boessenkool Mon, 20 Jun 2016 07:38:33 -0700

On Mon, Jun 20, 2016 at 02:39:06PM +0100, Kyrill Tkachov wrote:
> >So I tried out the patch below.  It decreases code size on most targets
> >(mostly fixed length insn targets), and increases it a small bit on some
> >variable length insn targets (doing an op twice, instead of doing it once
> >and doing a move).  It looks to be all good there too, but there are so
> >many changes that it is almost impossible to really check.
> >
> >So: can people try this out with their favourite benchmarks, please?
> 
> I hope to give this a run on AArch64 but I'll probably manage to get to it
> only next week.
> In the meantime I've had a quick look at some SPEC2006 codegen on aarch64.
> Some benchmarks decrease in size, others increase. One recurring theme I
> spotted is shifts being repeatedly combined with arithmetic operations
> rather than being computed once and reusing the result. For example:
>     lsl    x30, x15, 3
>     add    x4, x5, x30
>     add    x9, x7, x30
>     add    x24, x8, x30
>     add    x10, x0, x30
>     add    x2, x22, x30
> 
> becoming (modulo regalloc fluctuations):
>     add    x14, x2, x15, lsl 3
>     add    x13, x22, x15, lsl 3
>     add    x21, x4, x15, lsl 3
>     add    x6, x0, x15, lsl 3
>     add    x3, x30, x15, lsl 3
> 
> which, while saving one instruction can be harmful overall because the
> extra shift operation in the arithmetic instructions can increase the
> latency of the instruction. I believe the aarch64 rtx costs should convey
> this information. Do you expect RTX costs to gate this behaviour?


Yes, RTX costs are used for *all* of combine's combinations.  So it seems
your add,lsl patterns are the same cost as plain add?  If they were more
expensive, combine would reject this combination.
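
To make the cost gate concrete, here is a toy model of the decision for the
shift example above.  It is only a sketch, not combine's actual code: the
function names and the cost numbers are invented for illustration, and it
assumes the 2->2 case keeps the original lsl (other insns still use its
result) while replacing the add with an add,lsl.

```python
# Toy model of a cost gate for a 2->2 combination.
# NOT GCC's implementation: names and cost units are hypothetical.

COST_LSL = 4  # assumed cost of a standalone shift
COST_ADD = 4  # assumed cost of a plain add


def combination_allowed(old_costs, new_costs):
    """Accept a combination only if it does not raise the total cost."""
    return sum(new_costs) <= sum(old_costs)


def gate_add_lsl(cost_add_lsl):
    """Decide whether 'lsl; add' may become 'lsl; add,lsl'.

    The lsl appears on both sides because its result is still needed
    by the other users, so only the add's cost actually changes.
    """
    old = [COST_LSL, COST_ADD]
    new = [COST_LSL, cost_add_lsl]
    return combination_allowed(old, new)


if __name__ == "__main__":
    # add,lsl priced the same as a plain add: the combination goes through.
    print(gate_add_lsl(COST_ADD))
    # add,lsl priced higher than a plain add: the combination is rejected.
    print(gate_add_lsl(COST_ADD + 2))
```

In this model, pricing the shifted form even slightly above a plain add is
enough to keep the single lsl and the plain adds.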

> Some occurrences that hurt code size look like:
>     cmp    x8, x11, asr 5
> 
> being transformed into:
>     asr    x12, x11, 5
>     cmp    x12, x8, uxtw //zero-extend x8
> with the user of the condition code inverted to match the change in order
> of operands to the comparisons.
> I haven't looked at the RTL dumps yet to figure out why this is happening,
> it could be a backend RTL representation issue.

That could be a target thing yes, hard to tell; it's not clear to me
what combination is made here (if any).


Segher
