Re: [PATCH][AArch64][2/3] Recognise rev16 operations on SImode and DImode data
On Wed, Mar 19, 2014 at 9:55 AM, Kyrill Tkachov wrote:
> Hi all,
>
> This patch adds a recogniser for the bitmask, shift, orr sequence of
> instructions that can be used to reverse the bytes in 16-bit halfwords (for
> the sequence itself look at the testcase included in the patch). This can be
> implemented with a rev16 instruction.
> Since the shifts can occur in any order and there are no canonicalisation
> rules for where they appear in the expression we have to have two patterns
> to match both cases.
>
> The rtx costs function is updated to recognise the pattern and cost it
> appropriately by using the rev field of the cost tables introduced in patch
> [1/3]. The rtx costs helper functions that are used to recognise those
> bitwise operations are placed in config/arm/aarch-common.c so that they can
> be reused by both arm and aarch64.

The ARM bits of this are OK if there are no regressions.

> I've added an execute testcase but no scan-assembler tests since
> conceptually in the future the combiner might decide to not use a rev
> instruction due to rtx costs. We can at least test that the code generated
> is functionally correct though.
>
> Tested aarch64-none-elf.

What about arm-none-eabi :) ?

> Ok for stage1?
>
> [gcc/]
> 2014-03-19  Kyrylo Tkachov
>
> 	* config/aarch64/aarch64.md (rev16<mode>2): New pattern.
> 	(rev16<mode>2_alt): Likewise.
> 	* config/aarch64/aarch64.c (aarch64_rtx_costs): Handle rev16 case.
> 	* config/arm/aarch-common.c (aarch_rev16_shright_mask_imm_p): New.
> 	(aarch_rev16_shleft_mask_imm_p): Likewise.
> 	(aarch_rev16_p_1): Likewise.
> 	(aarch_rev16_p): Likewise.
> 	* config/arm/aarch-common-protos.h (aarch_rev16_p): Declare extern.
> 	(aarch_rev16_shright_mask_imm_p): Likewise.
> 	(aarch_rev16_shleft_mask_imm_p): Likewise.
>
> [gcc/testsuite/]
> 2014-03-19  Kyrylo Tkachov
>
> 	* gcc.target/aarch64/rev16_1.c: New test.
Re: [PATCH][AArch64][2/3] Recognise rev16 operations on SImode and DImode data
On 28/03/14 14:21, Ramana Radhakrishnan wrote:
> On Wed, Mar 19, 2014 at 9:55 AM, Kyrill Tkachov wrote:
>> Hi all,
>>
>> This patch adds a recogniser for the bitmask, shift, orr sequence of
>> instructions that can be used to reverse the bytes in 16-bit halfwords (for
>> the sequence itself look at the testcase included in the patch). This can be
>> implemented with a rev16 instruction.
>> Since the shifts can occur in any order and there are no canonicalisation
>> rules for where they appear in the expression we have to have two patterns
>> to match both cases.
>>
>> The rtx costs function is updated to recognise the pattern and cost it
>> appropriately by using the rev field of the cost tables introduced in patch
>> [1/3]. The rtx costs helper functions that are used to recognise those
>> bitwise operations are placed in config/arm/aarch-common.c so that they can
>> be reused by both arm and aarch64.
>
> The ARM bits of this are OK if there are no regressions.
>
>> I've added an execute testcase but no scan-assembler tests since
>> conceptually in the future the combiner might decide to not use a rev
>> instruction due to rtx costs. We can at least test that the code generated
>> is functionally correct though.
>>
>> Tested aarch64-none-elf.
>
> What about arm-none-eabi :) ?

Tested arm-none-eabi and bootstrap on arm linux together with patch [3/3]
in the series :)

Kyrill

>> Ok for stage1?
>>
>> [gcc/]
>> 2014-03-19  Kyrylo Tkachov
>>
>> 	* config/aarch64/aarch64.md (rev16<mode>2): New pattern.
>> 	(rev16<mode>2_alt): Likewise.
>> 	* config/aarch64/aarch64.c (aarch64_rtx_costs): Handle rev16 case.
>> 	* config/arm/aarch-common.c (aarch_rev16_shright_mask_imm_p): New.
>> 	(aarch_rev16_shleft_mask_imm_p): Likewise.
>> 	(aarch_rev16_p_1): Likewise.
>> 	(aarch_rev16_p): Likewise.
>> 	* config/arm/aarch-common-protos.h (aarch_rev16_p): Declare extern.
>> 	(aarch_rev16_shright_mask_imm_p): Likewise.
>> 	(aarch_rev16_shleft_mask_imm_p): Likewise.
>>
>> [gcc/testsuite/]
>> 2014-03-19  Kyrylo Tkachov
>>
>> 	* gcc.target/aarch64/rev16_1.c: New test.
Re: [PATCH][AArch64][2/3] Recognise rev16 operations on SImode and DImode data
On 19 March 2014 09:55, Kyrill Tkachov wrote:
> [gcc/]
> 2014-03-19  Kyrylo Tkachov
>
> 	* config/aarch64/aarch64.md (rev16<mode>2): New pattern.
> 	(rev16<mode>2_alt): Likewise.
> 	* config/aarch64/aarch64.c (aarch64_rtx_costs): Handle rev16 case.
> 	* config/arm/aarch-common.c (aarch_rev16_shright_mask_imm_p): New.
> 	(aarch_rev16_shleft_mask_imm_p): Likewise.
> 	(aarch_rev16_p_1): Likewise.
> 	(aarch_rev16_p): Likewise.
> 	* config/arm/aarch-common-protos.h (aarch_rev16_p): Declare extern.
> 	(aarch_rev16_shright_mask_imm_p): Likewise.
> 	(aarch_rev16_shleft_mask_imm_p): Likewise.
>
> [gcc/testsuite/]
> 2014-03-19  Kyrylo Tkachov
>
> 	* gcc.target/aarch64/rev16_1.c: New test.

The aarch64 parts are OK for stage-1.

/Marcus
[PATCH][AArch64][2/3] Recognise rev16 operations on SImode and DImode data
Hi all,

This patch adds a recogniser for the bitmask, shift, orr sequence of
instructions that can be used to reverse the bytes in 16-bit halfwords (for
the sequence itself look at the testcase included in the patch). This can be
implemented with a rev16 instruction.
Since the shifts can occur in any order and there are no canonicalisation
rules for where they appear in the expression we have to have two patterns
to match both cases.

The rtx costs function is updated to recognise the pattern and cost it
appropriately by using the rev field of the cost tables introduced in patch
[1/3]. The rtx costs helper functions that are used to recognise those
bitwise operations are placed in config/arm/aarch-common.c so that they can
be reused by both arm and aarch64.

I've added an execute testcase but no scan-assembler tests since
conceptually in the future the combiner might decide to not use a rev
instruction due to rtx costs. We can at least test that the code generated
is functionally correct though.

Tested aarch64-none-elf.

Ok for stage1?

[gcc/]
2014-03-19  Kyrylo Tkachov

	* config/aarch64/aarch64.md (rev16<mode>2): New pattern.
	(rev16<mode>2_alt): Likewise.
	* config/aarch64/aarch64.c (aarch64_rtx_costs): Handle rev16 case.
	* config/arm/aarch-common.c (aarch_rev16_shright_mask_imm_p): New.
	(aarch_rev16_shleft_mask_imm_p): Likewise.
	(aarch_rev16_p_1): Likewise.
	(aarch_rev16_p): Likewise.
	* config/arm/aarch-common-protos.h (aarch_rev16_p): Declare extern.
	(aarch_rev16_shright_mask_imm_p): Likewise.
	(aarch_rev16_shleft_mask_imm_p): Likewise.
[gcc/testsuite/]
2014-03-19  Kyrylo Tkachov

	* gcc.target/aarch64/rev16_1.c: New test.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index ebd58c0..41761ae 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4682,6 +4682,16 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
       return false;
 
     case IOR:
+      if (aarch_rev16_p (x))
+        {
+          *cost = COSTS_N_INSNS (1);
+
+          if (speed)
+            *cost += extra_cost->alu.rev;
+
+          return true;
+        }
+      /* Fall through.  */
     case XOR:
     case AND:
     cost_logic:
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 99a6ac8..a23452b 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3173,6 +3173,38 @@
   [(set_attr "type" "rev")]
 )
 
+;; There are no canonicalisation rules for the position of the lshiftrt, ashift
+;; operations within an IOR/AND RTX, therefore we have two patterns matching
+;; each valid permutation.
+
+(define_insn "rev16<mode>2"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+        (ior:GPI (and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand" "r")
+                                      (const_int 8))
+                          (match_operand:GPI 3 "const_int_operand" "n"))
+                 (and:GPI (lshiftrt:GPI (match_dup 1)
+                                        (const_int 8))
+                          (match_operand:GPI 2 "const_int_operand" "n"))))]
+  "aarch_rev16_shleft_mask_imm_p (operands[3], <MODE>mode)
+   && aarch_rev16_shright_mask_imm_p (operands[2], <MODE>mode)"
+  "rev16\\t%<w>0, %<w>1"
+  [(set_attr "type" "rev")]
+)
+
+(define_insn "rev16<mode>2_alt"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+        (ior:GPI (and:GPI (lshiftrt:GPI (match_operand:GPI 1 "register_operand" "r")
+                                        (const_int 8))
+                          (match_operand:GPI 2 "const_int_operand" "n"))
+                 (and:GPI (ashift:GPI (match_dup 1)
+                                      (const_int 8))
+                          (match_operand:GPI 3 "const_int_operand" "n"))))]
+  "aarch_rev16_shleft_mask_imm_p (operands[3], <MODE>mode)
+   && aarch_rev16_shright_mask_imm_p (operands[2], <MODE>mode)"
+  "rev16\\t%<w>0, %<w>1"
+  [(set_attr "type" "rev")]
+)
+
 ;; zero_extend version of above
 (define_insn "*bswapsi2_uxtw"
   [(set (match_operand:DI 0 "register_operand" "=r")
diff --git a/gcc/config/arm/aarch-common-protos.h b/gcc/config/arm/aarch-common-protos.h
index d97ee61..08c4c7a 100644
--- a/gcc/config/arm/aarch-common-protos.h
+++ b/gcc/config/arm/aarch-common-protos.h
@@ -23,6 +23,9 @@
 #ifndef GCC_AARCH_COMMON_PROTOS_H
 #define GCC_AARCH_COMMON_PROTOS_H
 
+extern bool aarch_rev16_p (rtx);
+extern bool aarch_rev16_shleft_mask_imm_p (rtx, enum machine_mode);
+extern bool aarch_rev16_shright_mask_imm_p (rtx, enum machine_mode);
 extern int arm_early_load_addr_dep (rtx, rtx);
 extern int arm_early_store_addr_dep (rtx, rtx);
 extern int arm_mac_accumulator_is_mul_result (rtx, rtx);
diff --git a/gcc/config/arm/aarch-common.c b/gcc/config/arm/aarch-common.c
index c11f7e9..75ed3fd 100644
--- a/gcc/config/arm/aarch-common.c
+++ b/gcc/config/arm/aarch-common.c
@@ -155,6 +155,79 @@ arm_get_set_operands (rtx producer, rtx consumer,
   return 0;
 }
 
+bool
+aarch_rev16_s