On Thu, May 5, 2011 at 5:42 PM, Richard Earnshaw <rearn...@arm.com> wrote:
>
> On Thu, 2011-05-05 at 14:51 +0800, Guozhi Wei wrote:
> > Hi
> >
> > This is the third part of the fix for
> >
> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47855
> >
> > This patch contains the length computation/refinement for the insn
> > patterns "*thumb2_movsi_insn", "*thumb2_cbz" and "*thumb2_cbnz".
> >
> > At the same time this patch revealed two bugs.  The first is the
> > maximum offset of cbz/cbnz: it should be 126, but it is 128 in the
> > patterns "*thumb2_cbz" and "*thumb2_cbnz".  The second is that only
> > the 2-register form of the shift instructions can be 16-bit, but the
> > 3-register form is allowed in "*thumb2_shiftsi3_short" and the
> > related peephole2.  The fix for both is also contained in this patch.
> >
> > The patch has been tested on arm qemu.
> >
> > thanks
> > Carrot
> >
> >
> > 2011-05-05  Guozhi Wei  <car...@google.com>
> >
> >         PR target/47855
> >         * config/arm/thumb2.md (thumb2_movsi_insn): Add length attribute.
> >         (thumb2_shiftsi3_short and peephole2): Remove 3-register case.
> >         (thumb2_cbz): Refine length computation.
> >         (thumb2_cbnz): Likewise.
> >
>
> Hmm, although these changes are all related to length calculations, they
> are really three patches that are unrelated to each other.  It would be
> easier to review this if they were kept separate.
>
> 1) thumb2_shiftsi3_short
> This appears to be a straight bug.  We are putting out a 32-bit
> instruction when we are claiming it to be only 16 bits.  This is OK.
>
> 2) thumb2_movsi_insn
> There are two things here.
> a) Thumb2 has a 16-bit move instruction for all core
> register-to-register transfers, so the separation of alternatives 1 and
> 2 is unnecessary -- just code these as "rk".
Done.

> b) The ldm form does not support unaligned memory accesses.  I'm aware
> that work is being done to add unaligned support to GCC for ARM, so I
> need to find out whether this patch will interfere with those changes.
> I'll try to find out what the situation is here and get back to you.
>
> 3) thumb2_cbz and thumb2_cbnz
> The range calculations look wrong here.  Remember that the 'pc' as far
> as GCC is concerned is the address of the start of the insn.  So for a
> backwards branch you need to account for all the bytes in the insn
> pattern that occur before the branch instruction itself, and secondly
> you also have to remember that the 'pc' that the CPU uses is the address
> of the branch instruction plus 4.  All these conspire to reduce the
> backwards range of a short branch to several bytes less than the 256
> that you currently have coded.

The usage of 'pc' is more complex than I thought; I understood it after
reading the comment in arm.md.  The description at
http://gcc.gnu.org/onlinedocs/gccint/Insn-Lengths.html#Insn-Lengths is not
right for the forward branch case.  The ranges are now adjusted
accordingly.

The patch has been tested on arm qemu in thumb2 mode.

thanks
Carrot


2011-05-06  Guozhi Wei  <car...@google.com>

        PR target/47855
        * config/arm/thumb2.md (thumb2_movsi_insn): Add length attribute.
        (thumb2_shiftsi3_short and peephole2): Remove 3-register case.
        (thumb2_cbz): Refine length computation.
        (thumb2_cbnz): Likewise.

Index: config/arm/thumb2.md
===================================================================
--- config/arm/thumb2.md	(revision 173350)
+++ config/arm/thumb2.md	(working copy)
@@ -165,23 +165,46 @@
 ;; regs.  The high register alternatives are not taken into account when
 ;; choosing register preferences in order to reflect their expense.
 (define_insn "*thumb2_movsi_insn"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,r,r,l ,*hk,m,*m")
-	(match_operand:SI 1 "general_operand"	   "rk ,I,K,j,mi,*mi,l,*hk"))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,r,r,l ,*rk,Uu,*m")
+	(match_operand:SI 1 "general_operand"	   "rk ,I,K,j,Uu,*mi,l ,*rk"))]
   "TARGET_THUMB2 && ! TARGET_IWMMXT
    && !(TARGET_HARD_FLOAT && TARGET_VFP)
    && (   register_operand (operands[0], SImode)
       || register_operand (operands[1], SImode))"
-  "@
-   mov%?\\t%0, %1
-   mov%?\\t%0, %1
-   mvn%?\\t%0, #%B1
-   movw%?\\t%0, %1
-   ldr%?\\t%0, %1
-   ldr%?\\t%0, %1
-   str%?\\t%1, %0
-   str%?\\t%1, %0"
+  "*
+  switch (which_alternative)
+    {
+    case 0: return \"mov%?\\t%0, %1\";
+    case 1: return \"mov%?\\t%0, %1\";
+    case 2: return \"mvn%?\\t%0, #%B1\";
+    case 3: return \"movw%?\\t%0, %1\";
+
+    case 4:
+      if (GET_CODE (XEXP (operands[1], 0)) == POST_INC)
+	{
+	  operands[1] = XEXP (XEXP (operands[1], 0), 0);
+	  return \"ldm%(ia%)\t%1!, {%0}\";
+	}
+      else
+	return \"ldr%?\\t%0, %1\";
+
+    case 5: return \"ldr%?\\t%0, %1\";
+
+    case 6:
+      if (GET_CODE (XEXP (operands[0], 0)) == POST_INC)
+	{
+	  operands[0] = XEXP (XEXP (operands[0], 0), 0);
+	  return \"stm%(ia%)\t%0!, {%1}\";
+	}
+      else
+	return \"str%?\\t%1, %0\";
+
+    case 7: return \"str%?\\t%1, %0\";
+    default: gcc_unreachable ();
+    }"
   [(set_attr "type" "*,*,*,*,load1,load1,store1,store1")
    (set_attr "predicable" "yes")
+   (set_attr "length" "2,4,4,4,2,4,2,4")
    (set_attr "pool_range" "*,*,*,*,1020,4096,*,*")
    (set_attr "neg_pool_range" "*,*,*,*,0,0,*,*")]
 )
@@ -685,7 +708,8 @@
   "TARGET_THUMB2
    && peep2_regno_dead_p(0, CC_REGNUM)
    && ((GET_CODE(operands[3]) != ROTATE && GET_CODE(operands[3]) != ROTATERT)
-       || REG_P(operands[2]))"
+       || REG_P(operands[2]))
+   && (CONSTANT_P (operands[2]) || (operands[0] == operands[1]))"
   [(parallel
     [(set (match_dup 0)
	   (match_op_dup 3
@@ -696,10 +720,10 @@
 )
 
 (define_insn "*thumb2_shiftsi3_short"
-  [(set (match_operand:SI 0 "low_register_operand" "=l")
+  [(set (match_operand:SI 0 "low_register_operand" "=l,l")
	(match_operator:SI 3 "shift_operator"
-	 [(match_operand:SI 1 "low_register_operand" "l")
-	  (match_operand:SI 2 "low_reg_or_int_operand" "lM")]))
+	 [(match_operand:SI 1 "low_register_operand" "0,l")
+	  (match_operand:SI 2 "low_reg_or_int_operand" "l,M")]))
    (clobber (reg:CC CC_REGNUM))]
   "TARGET_THUMB2 && reload_completed
    && ((GET_CODE(operands[3]) != ROTATE && GET_CODE(operands[3]) != ROTATERT)
@@ -707,7 +731,7 @@
   "* return arm_output_shift(operands, 2);"
   [(set_attr "predicable" "yes")
    (set_attr "shift" "1")
-   (set_attr "length" "2")
+   (set_attr "length" "2,2")
    (set (attr "type") (if_then_else (match_operand 2 "const_int_operand" "")
		      (const_string "alu_shift")
		      (const_string "alu_shift_reg")))]
@@ -965,13 +989,23 @@
   else
     return \"cmp\\t%0, #0\;beq\\t%l1\";
   "
-  [(set (attr "length")
-        (if_then_else
-	    (and (ge (minus (match_dup 1) (pc)) (const_int 2))
-	         (le (minus (match_dup 1) (pc)) (const_int 128))
-		 (eq (symbol_ref ("which_alternative")) (const_int 0)))
-	    (const_int 2)
-	    (const_int 8)))]
+  [(set (attr "length")
+        (if_then_else
+	    (eq (symbol_ref ("which_alternative")) (const_int 0))
+	    (if_then_else
+		(and (ge (minus (match_dup 1) (pc)) (const_int 2))
+		     (le (minus (match_dup 1) (pc)) (const_int 128)))
+		(const_int 2)
+		(if_then_else
+		    (and (ge (minus (match_dup 1) (pc)) (const_int -250))
+			 (le (minus (match_dup 1) (pc)) (const_int 256)))
+		    (const_int 4)
+		    (const_int 6)))
+	    (if_then_else
+		(and (ge (minus (match_dup 1) (pc)) (const_int -248))
+		     (le (minus (match_dup 1) (pc)) (const_int 256)))
+		(const_int 6)
+		(const_int 8))))]
 )
 
 (define_insn "*thumb2_cbnz"
@@ -988,13 +1022,23 @@
   else
     return \"cmp\\t%0, #0\;bne\\t%l1\";
   "
-  [(set (attr "length")
-        (if_then_else
-	    (and (ge (minus (match_dup 1) (pc)) (const_int 2))
-	         (le (minus (match_dup 1) (pc)) (const_int 128))
-		 (eq (symbol_ref ("which_alternative")) (const_int 0)))
-	    (const_int 2)
-	    (const_int 8)))]
+  [(set (attr "length")
+        (if_then_else
+	    (eq (symbol_ref ("which_alternative")) (const_int 0))
+	    (if_then_else
+		(and (ge (minus (match_dup 1) (pc)) (const_int 2))
+		     (le (minus (match_dup 1) (pc)) (const_int 128)))
+		(const_int 2)
+		(if_then_else
+		    (and (ge (minus (match_dup 1) (pc)) (const_int -250))
+			 (le (minus (match_dup 1) (pc)) (const_int 256)))
+		    (const_int 4)
+		    (const_int 6)))
+	    (if_then_else
+		(and (ge (minus (match_dup 1) (pc)) (const_int -248))
+		     (le (minus (match_dup 1) (pc)) (const_int 256)))
+		(const_int 6)
+		(const_int 8))))]
 )
 
 ;; 16-bit complement
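
For anyone tracing the revised cbz/cbnz length attribute, the nested
if_then_else can be transcribed into ordinary control flow.  Here is a
rough Python sketch: the function name is mine, the bounds are copied
verbatim from the pattern, and `offset` stands for the value of
`(minus (match_dup 1) (pc))` in bytes:

```python
def cbz_length(offset, which_alternative):
    """Byte length the revised *thumb2_cbz / *thumb2_cbnz length
    attribute would pick for a given label offset from GCC's notion
    of pc.  which_alternative 0 is the low-register alternative,
    which can emit a bare 16-bit cbz/cbnz when the label is in range."""
    if which_alternative == 0:
        if 2 <= offset <= 128:
            return 2   # cbz/cbnz alone
        if -250 <= offset <= 256:
            return 4   # 16-bit cmp followed by a 16-bit beq/bne
        return 6       # 16-bit cmp followed by a 32-bit beq/bne
    # Other alternative: the cmp itself is a 32-bit instruction.
    if -248 <= offset <= 256:
        return 6
    return 8
```

Note how the backwards bounds (-250/-248) stop a few bytes short of a raw
-256: exactly as the review explains, the cmp bytes that precede the
branch, plus the CPU treating pc as the branch address plus 4, both eat
into the backward reach measured from the start of the insn.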
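
The extra peephole2 condition `(CONSTANT_P (operands[2]) || (operands[0]
== operands[1]))` encodes which shifts actually have a 16-bit Thumb-2
form.  As a hypothetical predicate (the name and argument types are
illustrative, not GCC's internal representation):

```python
def shift_fits_16bit(dest, src, amount):
    """True when a flag-setting Thumb-2 shift has a 16-bit encoding.

    The 16-bit forms are the immediate shift (e.g. lsls Rd, Rm, #imm)
    and the 2-register form (lsls Rdn, Rm), which requires the
    destination and first source to be the same register.  The
    3-register form exists only as a 32-bit instruction, which is the
    bug the patch fixes.  `amount` is an int immediate or a register
    name string."""
    return isinstance(amount, int) or dest == src
```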