On Mon, Aug 7, 2017 at 1:36 PM, Michael Collison <michael.colli...@arm.com> wrote:
> This patch improves code generation for shifts with subtract instructions
> where the first operand to the subtract is equal to the bit-size of the
> operation.
>
>
> long f1(long x, int i)
> {
>   return x >> (64 - i);
> }
>
> int f2(int x, int i)
> {
>   return x << (32 - i);
> }
>
>
> With trunk at -O2 we generate:
>
> f1:
>         mov     w2, 64
>         sub     w1, w2, w1
>         asr     x0, x0, x1
>         ret
>
> f2:
>         mov     w2, 32
>         sub     w1, w2, w1
>         lsl     w0, w0, w1
>         ret
>
> with the patch we generate:
>
> f1:
>         neg     w2, w1
>         asr     x0, x0, x2
>         ret
>         .size   f1, .-f1
>         .align  2
>         .p2align 3,,7
>         .global f2
>         .type   f2, %function
> f2:
>         neg     w2, w1
>         lsl     w0, w0, w2
>         ret
>
> Okay for trunk?

Shouldn't this be handled in simplify-rtx instead of an aarch64-specific
pattern? That is, simplify:
(SHIFT A (32 - B)) -> (SHIFT A (AND (NEG B) 31))
etc., or maybe not. I don't mind either way after thinking about it more.

Thanks,
Andrew

>
> 2017-08-07  Michael Collison  <michael.colli...@arm.com>
>
>         * config/aarch64/aarch64.md (*aarch64_reg_<optab>_minus<mode>3):
>         New pattern.
>
> 2016-08-07  Michael Collison  <michael.colli...@arm.com>
>
>         * gcc.target/aarch64/var_shift_mask_2.c: New test.
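
For reference, the arithmetic behind the suggested simplification can be checked with a small C sketch. It assumes the shift count is truncated to the low bits of the register, which is how I understand the AArch64 variable shifts to behave. The helper names (masked_sub, masked_neg, count_mismatches, check_shift_counts) are made up for illustration and are not part of the patch or of GCC. The shift itself is deliberately not performed in C, since shifting by the full width is undefined there; only the masked counts are compared.

```c
#include <assert.h>

/* Hypothetical helpers, not taken from the patch.  "bits" is the
   operation width and is assumed to be a power of two (32 or 64).  */

/* Shift count as written in the source: (bits - b), truncated to the
   low log2(bits) bits, modelling a truncating hardware shift.  */
unsigned masked_sub(unsigned bits, unsigned b)
{
    return (bits - b) & (bits - 1u);
}

/* Shift count after the proposed rewrite: (AND (NEG b) (bits - 1)).  */
unsigned masked_neg(unsigned bits, unsigned b)
{
    return (0u - b) & (bits - 1u);
}

/* Count the values of b in [0, 256) for which the two forms disagree,
   for both the 32-bit and the 64-bit case.  */
int count_mismatches(void)
{
    int mismatches = 0;
    for (unsigned b = 0; b < 256; b++) {
        if (masked_sub(32u, b) != masked_neg(32u, b))
            mismatches++;
        if (masked_sub(64u, b) != masked_neg(64u, b))
            mismatches++;
    }
    return mismatches;
}

/* Aborts if the two forms ever differ over the tested range.  */
void check_shift_counts(void)
{
    assert(count_mismatches() == 0);
}
```

Calling check_shift_counts() from a main() should return without aborting, because bits is a multiple of itself and therefore drops out once the count is reduced modulo bits. This only covers the count arithmetic; whether a generic simplify-rtx rule is valid would still depend on the target actually truncating shift counts.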