This patch improves code generation for shifts combined with AND operations that can be omitted based on the size of the operation. AArch64 variable shifts use only the low 5 bits (W-regs) or 6 bits (X-regs) of the shift amount, so masking operations that merely clear the higher bits can be removed. When the shift instructions operate on all-register arguments they truncate the shift/rotate amount modulo the size of the registers they operate on: 32 for W-regs, 64 for X-regs. This allows us to optimize away such redundant masking instructions.
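As a minimal illustration of that semantics (my own sketch, not part of the patch), the register-operand shift forms behave as if the amount were masked in hardware:

/* Illustrative model only, not from the patch: LSL Wd, Wn, Wm
   computes Wn << (Wm MOD 32), i.e. only bits [4:0] of the shift
   register are consumed; the X-reg form uses bits [5:0].  */
unsigned int
model_lsl_w (unsigned int wn, unsigned int wm)
{
  return wn << (wm & 31);  /* the AND models the hardware truncation */
}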
The patch applies only to integer shift instructions; vector shifts are excluded.

unsigned int
f1 (unsigned int x, int y)
{
  return x << (y & 31);
}

unsigned long
f3 (unsigned long bit_addr)
{
  unsigned long bitnumb = bit_addr & 63;
  return (1L << bitnumb);
}

With trunk at -O2 we generate:

f1:
	and	w1, w1, 31
	lsl	w0, w0, w1
	ret
	.size	f1, .-f1
	.align	2
	.p2align 3,,7
	.global	f3
	.type	f3, %function
f3:
	and	x0, x0, 63
	mov	x1, 1
	lsl	x0, x1, x0
	ret

With the patch we generate:

f1:
	lsl	w0, w0, w1
	ret
f3:
	mov	x1, 1
	lsl	x0, x1, x0
	ret

Okay for trunk?

2017-05-17  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
	    Michael Collison  <michael.colli...@arm.com>

	PR target/70119
	* config/aarch64/aarch64.md (*aarch64_<optab>_reg_<mode>3_mask1):
	New pattern.
	(*aarch64_reg_<mode>3_neg_mask2): New pattern.
	(*aarch64_reg_<mode>3_minus_mask): New pattern.
	(*aarch64_<optab>_reg_di3_mask2): New pattern.
	* config/aarch64/aarch64.c (aarch64_rtx_costs): Account for cost
	of shift when the shift amount is masked with constant equal to
	the size of the mode.

2017-05-17  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
	    Michael Collison  <michael.colli...@arm.com>

	PR target/70119
	* gcc.target/aarch64/var_shift_mask_1.c: New test.
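(For completeness, an illustration of my own, not one of the new testcases, of the kind of code I believe the neg-mask patterns above are aimed at, where the explicit AND on a negated shift amount is likewise redundant:

/* Sketch only, not from the patch: the AND should be removable
   because a W-reg LSR already truncates the amount to bits [4:0].  */
unsigned int
f_neg (unsigned int x, unsigned int n)
{
  return x >> (-n & 31);
}
)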
pr70119.patch