On 14/02/12 17:30, Richard Earnshaw wrote:
> On 14/02/12 16:53, Andrew Stubbs wrote:
>> Hi all,
>>
>> I've discovered that GCC does not use ARM conditional execution for
>> 16-bit Thumb opcodes in many cases. It's fine for individual
>> instructions, but if-conversion of basic blocks with more than one
>> instruction fails.
>>
>> E.g.
>>
>> int
>> foo (int a, int b)
>> {
>> if (a != b)
>> {
>> a = a << b;
>> a = a >> 1;
>> }
>>
>> return a + b;
>> }
>>
>> The current compiler gives:
>>
>> foo:
>> cmp r0, r1
>> beq .L2
>> lsls r0, r0, r1
>> asrs r0, r0, #1
>> .L2:
>> adds r0, r0, r1
>> bx lr
>>
>> With my patch I get this:
>>
>> foo:
>> cmp r0, r1
>> itt ne
>> lslne r0, r0, r1
>> asrne r0, r0, #1
>> adds r0, r0, r1
>> bx lr
>>
>> The problem comes from the fact that the compiler prefers "lsls" over
>> "lsl" because the former is a 16-bit encoding, and the latter a 32-bit
>> encoding. There's actually a peephole optimization defined to make this
>> happen wherever the CC register is not live.
>>
>> This is fine in unconditional code, but the CC register clobber means
>> that it's only possible to convert it to conditional code if it is the
>> last instruction in the IT block, so if-conversion fails on the above
>> example.
>>
>> My patch introduces a new target hook "IFCVT_OVERRIDE_MODIFIED_TEST"
>> that allows the CC clobber to be ignored on such instructions, and uses
>> IFCVT_MODIFY_INSN to convert from "lsls" to "lsl<c>" where possible.
>>
>> I've also introduced a new instruction attribute "it_cc" to indicate
>> which instruction patterns are affected.
>>
>> OK for trunk, once stage 1 reopens?
>>
>> Andrew
>>
>
> Bernds checked in a patch last year (or maybe even before that) to make
> the selection of flags clobbered insns run very late (certainly after
> condexec had run), so I'm not sure why you're not seeing this.
>
Hm, you mentioned some peepholes. Try removing them....
R.
> R.
>
>>
>> ifcvt_modify_insn.patch
>>
>>
>> 2012-02-14 Andrew Stubbs <[email protected]>
>>
>> gcc/
>> * config/arm/arm-protos.h (arm_ifcvt_modify_insn): New prototype.
>> (arm_ifcvt_override_modified_test): New prototype.
>> * config/arm/arm.c (thumb_insn_suitable_for_ifcvt): New function.
>> (arm_ifcvt_override_modified_test): New function.
>> (arm_ifcvt_modify_insn): New function.
>> * config/arm/arm.h (IFCVT_OVERRIDE_MODIFIED_TEST): New macro.
>> (IFCVT_MODIFY_INSN): New macro.
>> * config/arm/thumb2.md (it_cc): New attribute.
>> (thumb2_alusi3_short): Set it_cc attribute.
>> (thumb2_shiftsi3_short, thumb2_mov<mode>_shortim): Likewise.
>> (thumb2_addsi_short, thumb2_subsi_short): Likewise.
>> (thumb2_mulsi_short, thumb2_one_cmplsi2_short): Likewise.
>> (thumb2_negsi2_short): Likewise.
>> * doc/tm.texi: Regenerate.
>> * doc/tm.texi.in (IFCVT_OVERRIDE_MODIFIED_TEST): Document.
>> * ifcvt.c (cond_exec_process_insns): Add IFCVT_OVERRIDE_MODIFIED_TEST.
>>
>> gcc/testsuite/
>> * gcc.target/arm/thumb-ifcvt.c: New test case.
>>
>> ---
>> gcc/config/arm/arm-protos.h | 5 ++
>> gcc/config/arm/arm.c | 67
>> ++++++++++++++++++++++++++++
>> gcc/config/arm/arm.h | 5 ++
>> gcc/config/arm/thumb2.md | 11 +++++
>> gcc/doc/tm.texi | 13 +++++
>> gcc/doc/tm.texi.in | 13 +++++
>> gcc/ifcvt.c | 14 +++++-
>> gcc/testsuite/gcc.target/arm/thumb-ifcvt.c | 19 ++++++++
>> 8 files changed, 146 insertions(+), 1 deletions(-)
>> create mode 100644 gcc/testsuite/gcc.target/arm/thumb-ifcvt.c
>>
>> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
>> index 296550a..67396f0 100644
>> --- a/gcc/config/arm/arm-protos.h
>> +++ b/gcc/config/arm/arm-protos.h
>> @@ -244,4 +244,9 @@ extern const struct tune_params *current_tune;
>> extern int vfp3_const_double_for_fract_bits (rtx);
>> #endif /* RTX_CODE */
>>
>> +#ifdef BB_HEAD
>> +extern int arm_ifcvt_override_modified_test (rtx, rtx);
>> +extern rtx arm_ifcvt_modify_insn (ce_if_block_t *, rtx, rtx);
>> +#endif
>> +
>> #endif /* ! GCC_ARM_PROTOS_H */
>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>> index 0bded8d..803e1c9 100644
>> --- a/gcc/config/arm/arm.c
>> +++ b/gcc/config/arm/arm.c
>> @@ -25139,5 +25139,72 @@ vfp3_const_double_for_fract_bits (rtx operand)
>> return 0;
>> }
>>
>> +/* Find the portion of INSN that is suitable for if-conversion.
>> +
>> + Some Thumb mode 16-bit instructions have two variants: one that is
>> + unconditional but clobbers the condition flags, and one that is
>> + conditional (in an IT block) and does not clobber anything.
>> +
>> + There are 32-bit variants that are unconditional but don't clobber
>> + anything, so the peephole2 pass adds the clobber in order to use the
>> + smaller instruction encoding. Unfortunately this defeats the
>> + if-conversion pass since CC must not be modified in an IT block.
>> +
>> + The peephole can be reversed, and the instruction converted to
>> + conditional execution without affecting the size optimization.
>> +
>> + This function detects these instructions and returns the sub-expression
>> + that contains the operation, without the clobber. */
>> +
>> +static rtx
>> +thumb_insn_suitable_for_ifcvt (rtx pattern, rtx insn)
>> +{
>> + if (GET_CODE (pattern) == PARALLEL
>> + && XVECLEN (pattern, 0) == 2)
>> + {
>> + rtx op = XVECEXP (pattern, 0, 0);
>> + rtx clobber = XVECEXP (pattern, 0, 1);
>> +
>> + if (GET_CODE (clobber) == CLOBBER
>> + && GET_CODE (XEXP (clobber, 0)) == REG
>> + && REGNO (XEXP (clobber, 0)) == CC_REGNUM
>> + && get_attr_it_cc (insn) == IT_CC_NOCLOBBER)
>> + return op;
>> + }
>> +
>> + return NULL_RTX;
>> +}
>> +
>> +/* Prevent if-conversion thinking that certain Thumb1 instructions
>> + clobber CC. These instruction in fact do not when in an IT block. */
>> +int
>> +arm_ifcvt_override_modified_test (rtx test, rtx insn)
>> +{
>> + if (thumb_insn_suitable_for_ifcvt (PATTERN (insn), insn) != NULL_RTX)
>> + return 0; /* Force *not* modified. */
>> +
>> + return -1;
>> +}
>> +
>> +/* Modify insns to enable conditional execution.
>> +
>> + This function removes CC clobers that impede if-conversion. See the
>> + comments on thumb_insn_suitable_for_ifcvt. */
>> +rtx
>> +arm_ifcvt_modify_insn (ce_if_block_t *ce, rtx pattern, rtx insn)
>> +{
>> + rtx op, cond;
>> +
>> + gcc_assert (GET_CODE (pattern) == COND_EXEC);
>> +
>> + cond = XEXP (pattern, 0);
>> + op = thumb_insn_suitable_for_ifcvt (XEXP (pattern, 1), insn);
>> +
>> + if (op != NULL_RTX)
>> + return gen_rtx_COND_EXEC (VOIDmode, cond, op);
>> +
>> + return pattern;
>> +}
>> +
>> #include "gt-arm.h"
>>
>> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
>> index 5a78125..0b7f332 100644
>> --- a/gcc/config/arm/arm.h
>> +++ b/gcc/config/arm/arm.h
>> @@ -2220,4 +2220,9 @@ extern const char *host_detect_local_cpu (int argc,
>> const char **argv);
>>
>> #define DRIVER_SELF_SPECS MCPU_MTUNE_NATIVE_SPECS
>>
>> +#define IFCVT_OVERRIDE_MODIFIED_TEST(TEST,INSN) \
>> + arm_ifcvt_override_modified_test ((TEST), (INSN))
>> +#define IFCVT_MODIFY_INSN(CE,PATTERN,INSN) \
>> + (PATTERN) = arm_ifcvt_modify_insn ((CE), (PATTERN), (INSN))
>> +
>> #endif /* ! GCC_ARM_H */
>> diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
>> index 05585da..454c42f 100644
>> --- a/gcc/config/arm/thumb2.md
>> +++ b/gcc/config/arm/thumb2.md
>> @@ -24,6 +24,9 @@
>> ;; changes made in armv5t as "thumb2". These are considered part
>> ;; the 16-bit Thumb-1 instruction set.
>>
>> +;; Does an insn clobber CC if it appears in an IT block. */
>> +(define_attr "it_cc" "undef,clobber,noclobber" (const_string "undef"))
>> +
>> (define_insn "*thumb2_incscc"
>> [(set (match_operand:SI 0 "s_register_operand" "=r,r")
>> (plus:SI (match_operator:SI 2 "arm_comparison_operator"
>> @@ -674,6 +677,7 @@
>> && GET_CODE(operands[3]) != MINUS"
>> "%I3%!\\t%0, %1, %2"
>> [(set_attr "predicable" "yes")
>> + (set_attr "it_cc" "noclobber")
>> (set_attr "length" "2")]
>> )
>>
>> @@ -708,6 +712,7 @@
>> || REG_P(operands[2]))"
>> "* return arm_output_shift(operands, 2);"
>> [(set_attr "predicable" "yes")
>> + (set_attr "it_cc" "noclobber")
>> (set_attr "shift" "1")
>> (set_attr "length" "2")
>> (set (attr "type") (if_then_else (match_operand 2 "const_int_operand" "")
>> @@ -736,6 +741,7 @@
>> "TARGET_THUMB2 && reload_completed"
>> "mov%!\t%0, %1"
>> [(set_attr "predicable" "yes")
>> + (set_attr "it_cc" "noclobber")
>> (set_attr "length" "2")]
>> )
>>
>> @@ -778,6 +784,7 @@
>> return \"add%!\\t%0, %1, %2\";
>> "
>> [(set_attr "predicable" "yes")
>> + (set_attr "it_cc" "noclobber")
>> (set_attr "length" "2")]
>> )
>>
>> @@ -789,6 +796,7 @@
>> "TARGET_THUMB2 && reload_completed"
>> "sub%!\\t%0, %1, %2"
>> [(set_attr "predicable" "yes")
>> + (set_attr "it_cc" "noclobber")
>> (set_attr "length" "2")]
>> )
>>
>> @@ -905,6 +913,7 @@
>> "TARGET_THUMB2 && optimize_size && reload_completed"
>> "mul%!\\t%0, %2, %0"
>> [(set_attr "predicable" "yes")
>> + (set_attr "it_cc" "noclobber")
>> (set_attr "length" "2")
>> (set_attr "insn" "muls")])
>>
>> @@ -999,6 +1008,7 @@
>> "TARGET_THUMB2 && reload_completed"
>> "mvn%!\t%0, %1"
>> [(set_attr "predicable" "yes")
>> + (set_attr "it_cc" "noclobber")
>> (set_attr "length" "2")]
>> )
>>
>> @@ -1022,6 +1032,7 @@
>> "TARGET_THUMB2 && reload_completed"
>> "neg%!\t%0, %1"
>> [(set_attr "predicable" "yes")
>> + (set_attr "it_cc" "noclobber")
>> (set_attr "length" "2")]
>> )
>>
>> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
>> index 6d41cee..7d09ddb 100644
>> --- a/gcc/doc/tm.texi
>> +++ b/gcc/doc/tm.texi
>> @@ -10879,6 +10879,19 @@ if-statements into conditions combined by
>> @code{and} and @code{or} operations.
>> being processed and about to be turned into a condition.
>> @end defmac
>>
>> +@defmac IFCVT_OVERRIDE_MODIFIED_TEST (@var{test}, @var{insn})
>> +Used if the target has instructions that modify the condition
>> +register when executed unconditionally, but do not do so when executed
>> +conditionally, or vice-versa. @var{TEST} contains the condition test, and
>> +@var{INSN} contains the original insn to be conditionalized. The macro
>> +should return minus-one if there is to be no override, zero to disregard
>> +an apparent modified condition register, or any other value to indicate
>> +that the instruction does modify the condition register, even if it does
>> +not appear to. Additionally, if conditionalizing the instruction does
>> +change its behaviour, @code{IFCVT_MODIFY_INSN} should make the appropriate
>> +adjustments to match.
>> +@end defmac
>> +
>> @defmac IFCVT_MODIFY_INSN (@var{ce_info}, @var{pattern}, @var{insn})
>> A C expression to modify the @var{PATTERN} of an @var{INSN} that is to
>> be converted to conditional execution format. @var{ce_info} points to
>> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
>> index 396f244..52d9e96 100644
>> --- a/gcc/doc/tm.texi.in
>> +++ b/gcc/doc/tm.texi.in
>> @@ -10759,6 +10759,19 @@ if-statements into conditions combined by
>> @code{and} and @code{or} operations.
>> being processed and about to be turned into a condition.
>> @end defmac
>>
>> +@defmac IFCVT_OVERRIDE_MODIFIED_TEST (@var{test}, @var{insn})
>> +Used if the target has instructions that modify the condition
>> +register when executed unconditionally, but do not do so when executed
>> +conditionally, or vice-versa. @var{TEST} contains the condition test, and
>> +@var{INSN} contains the original insn to be conditionalized. The macro
>> +should return minus-one if there is to be no override, zero to disregard
>> +an apparent modified condition register, or any other value to indicate
>> +that the instruction does modify the condition register, even if it does
>> +not appear to. Additionally, if conditionalizing the instruction does
>> +change its behaviour, @code{IFCVT_MODIFY_INSN} should make the appropriate
>> +adjustments to match.
>> +@end defmac
>> +
>> @defmac IFCVT_MODIFY_INSN (@var{ce_info}, @var{pattern}, @var{insn})
>> A C expression to modify the @var{PATTERN} of an @var{INSN} that is to
>> be converted to conditional execution format. @var{ce_info} points to
>> diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
>> index ce60ce2..ff620da 100644
>> --- a/gcc/ifcvt.c
>> +++ b/gcc/ifcvt.c
>> @@ -310,6 +310,7 @@ cond_exec_process_insns (ce_if_block_t *ce_info
>> ATTRIBUTE_UNUSED,
>> rtx insn;
>> rtx xtest;
>> rtx pattern;
>> + int override_modified = -1;
>>
>> if (!start || !end)
>> return FALSE;
>> @@ -338,7 +339,18 @@ cond_exec_process_insns (ce_if_block_t *ce_info
>> ATTRIBUTE_UNUSED,
>> if (must_be_last)
>> return FALSE;
>>
>> - if (modified_in_p (test, insn))
>> + /* Some instructions modify CC flags when executed unconditionally
>> + but not when executed conditionally.
>> + Return values:
>> + -1 = no override
>> + 0 = force false
>> + other = force true. */
>> +#ifdef IFCVT_OVERRIDE_MODIFIED_TEST
>> + override_modified = IFCVT_OVERRIDE_MODIFIED_TEST (test, insn);
>> +#endif
>> +
>> + if ((override_modified == -1 && modified_in_p (test, insn))
>> + || override_modified)
>> {
>> if (!mod_ok)
>> return FALSE;
>> diff --git a/gcc/testsuite/gcc.target/arm/thumb-ifcvt.c
>> b/gcc/testsuite/gcc.target/arm/thumb-ifcvt.c
>> new file mode 100644
>> index 0000000..b03bbce
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/arm/thumb-ifcvt.c
>> @@ -0,0 +1,19 @@
>> +/* Check that Thumb 16-bit shifts can be if-converted. */
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target arm_thumb2_ok } */
>> +/* { dg-options "-O2" } */
>> +
>> +int
>> +foo (int a, int b)
>> +{
>> + if (a != b)
>> + {
>> + a = a << b;
>> + a = a >> 1;
>> + }
>> +
>> + return a + b;
>> +}
>> +
>> +/* { dg-final { scan-assembler "lslne" } } */
>> +/* { dg-final { scan-assembler "asrne" } } */
>
>
>