On Mon, Jun 20, 2016 at 02:39:06PM +0100, Kyrill Tkachov wrote:
> >So I tried out the patch below.  It decreases code size on most targets
> >(mostly fixed length insn targets), and increases it a small bit on some
> >variable length insn targets (doing an op twice, instead of doing it once
> >and doing a move).  It looks to be all good there too, but there are so
> >many changes that it is almost impossible to really check.
> >
> >So: can people try this out with their favourite benchmarks, please?
>
> I hope to give this a run on AArch64 but I'll probably manage to get to it
> only next week.
> In the meantime I've had a quick look at some SPEC2006 codegen on aarch64.
> Some benchmarks decrease in size, others increase. One recurring theme I
> spotted is shifts being repeatedly combined with arithmetic operations
> rather than being computed once and reusing the result. For example:
>     lsl    x30, x15, 3
>     add    x4, x5, x30
>     add    x9, x7, x30
>     add    x24, x8, x30
>     add    x10, x0, x30
>     add    x2, x22, x30
>
> becoming (modulo regalloc fluctuations):
>     add    x14, x2, x15, lsl 3
>     add    x13, x22, x15, lsl 3
>     add    x21, x4, x15, lsl 3
>     add    x6, x0, x15, lsl 3
>     add    x3, x30, x15, lsl 3
>
> which, while saving one instruction can be harmful overall because the
> extra shift operation in the arithmetic instructions can increase the
> latency of the instruction. I believe the aarch64 rtx costs should convey
> this information. Do you expect RTX costs to gate this behaviour?

Yes, RTX costs are used for *all* of combine's combinations.  So it seems
your add,lsl patterns are the same cost as plain add?  If it were more
expensive, combine would reject this combination.

> Some occurrences that hurt code size look like:
>     cmp    x8, x11, asr 5
>
> being transformed into:
>     asr    x12, x11, 5
>     cmp    x12, x8, uxtw    //zero-extend x8
> with the user of the condition code inverted to match the change in order
> of operands to the comparisons.
> I haven't looked at the RTL dumps yet to figure out why this is happening,
> it could be a backend RTL representation issue.

That could be a target thing yes, hard to tell; it's not clear to me
what combination is made here (if any).


Segher
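
To make the cost gate described above concrete, here is a rough sketch of the rule as stated in this message: a combination is rejected only when the result is more expensive than what it replaces. This is a simplified model, not GCC's actual code. The function names and the cost values are hypothetical, and the sketch assumes the shift still has other users, so it stays in place and only the add changes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cost units; real values come from the target's RTX costs. */
enum { COST_LSL = 4, COST_ADD = 4 };

/* Simplified model of the gate: reject only if the replacement costs more
   than the instructions it replaces.  Illustration only, not GCC's code. */
static bool combination_accepted(int old_cost, int new_cost)
{
    assert(old_cost >= 0 && new_cost >= 0);
    return new_cost <= old_cost;
}

/* Models folding the shift into one add while the shift itself is kept
   (assumed to still have other users).  add_shift_cost is whatever cost
   the target assigns to an add with a shifted operand. */
static bool fold_shift_into_add(int add_shift_cost)
{
    int old_cost = COST_LSL + COST_ADD;        /* lsl + plain add */
    int new_cost = COST_LSL + add_shift_cost;  /* lsl + add with shift */
    return combination_accepted(old_cost, new_cost);
}
```

With these made-up numbers, `fold_shift_into_add(COST_ADD)` is accepted, which matches the behaviour seen when add,lsl is costed the same as a plain add. `fold_shift_into_add(COST_ADD + 1)` is rejected, which is what a higher cost for the shifted form would achieve.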