On Tue, Jul 17, 2012 at 9:47 PM, Ramana Radhakrishnan <ramana.radhakrish...@linaro.org> wrote: > Carrot, > > Sorry about the delayed response. > > On 3 July 2012 12:28, Carrot Wei <car...@google.com> wrote: >> On Thu, Jun 28, 2012 at 12:14 AM, Ramana Radhakrishnan >> <ramana.radhakrish...@linaro.org> wrote: >>> On 28 May 2012 11:08, Carrot Wei <car...@google.com> wrote: >>>> Hi >>>> >>>> This is the second part of the patches that deals with 64bit and. It >>>> directly >>>> extends the patterns anddi3, anddi3_insn and anddi3_neon to handle 64bit >>>> constant operands. >>>> >>> >>> Comments about const_di_ok_for_op still apply from my review of your add >>> patch. >>> >>> However I don't see why and /ior / xor with constants that have either >>> the low or high parts set can't be expanded directly into ands of >>> subregs with moves of zero's or the original value especially if you >>> aren't looking at doing 64 bit operations in neon .With Neon being >>> used for 64 bit arithmetic it gets more interesting. >>> >>> Finally this should target PR target/53189. >>> >> >> Hi Ramana >> >> Thanks for the review. Following is the updated patch according to >> your comments. > > You've missed answering this part of my review :) > >>> However I don't see why and /ior / xor with constants that have either >>> the low or high parts set can't be expanded directly into ands of >>> subregs with moves of zero's or the original value especially if you >>> aren't looking at doing 64 bit operations in neon .With Neon being >>> used for 64 bit arithmetic it gets more interesting. > It has been handled by the const_ok_for_dimode_op and the output part of corresponding SI mode insn. Let's take the IOR case as an example.
In the const_ok_for_dimode_op patch --- arm.c (revision 189278) +++ arm.c (working copy) @@ -2524,6 +2524,16 @@ case PLUS: return arm_not_operand (hi, SImode) && arm_add_operand (lo, SImode); + case IOR: + if ((const_ok_for_arm (hi_val) || (hi_val == 0xFFFFFFFF)) + && (const_ok_for_arm (lo_val) || (lo_val == 0xFFFFFFFF))) + return 1; + if (TARGET_THUMB2 + && (const_ok_for_arm (lo_val) || const_ok_for_arm (~lo_val)) + && (const_ok_for_arm (hi_val) || const_ok_for_arm (~hi_val))) + return 1; + return 0; + default: return 0; } The 0xFFFFFFFF is not valid arm mode immediate, but ior 0XFFFFFFFF results in all 1's, so it is also allowed in an iordi3 insn. And the patch in iorsi3_insn pattern explicitly check the all 0's and all 1's cases, and output either a simple register mov instruction or instruction mov all 1's to the destination. @@ -2902,10 +2915,29 @@ (ior:SI (match_operand:SI 1 "s_register_operand" "%r,r,r") (match_operand:SI 2 "reg_or_int_operand" "rI,K,?n")))] "TARGET_32BIT" - "@ - orr%?\\t%0, %1, %2 - orn%?\\t%0, %1, #%B2 - #" + "* + { + if (CONST_INT_P (operands[2])) + { + HOST_WIDE_INT i = INTVAL (operands[2]) & 0xFFFFFFFF; + if (i == 0xFFFFFFFF) + return \"mvn%?\\t%0, #0\"; + if (i == 0) + { + if (!rtx_equal_p (operands[0], operands[1])) + return \"mov%?\\t%0, %1\"; + else + return \"\"; + } + } + + switch (which_alternative) + { + case 0: return \"orr%?\\t%0, %1, %2\"; + case 1: return \"orn%?\\t%0, %1, #%B2\"; + case 2: return \"#\"; + } + }" "TARGET_32BIT && GET_CODE (operands[2]) == CONST_INT && !(const_ok_for_arm (INTVAL (operands[2])) > Is there any reason why we don't split such cases earlier into the > constituent moves and the associated ands earlier than reload in the > non-Neon case? > I referenced pattern arm_adddi3 which is split after reload_completed. And the pattern arm_subdi3 is even not split. I guess they keep the original constant longer may benefit some optimizations involving constants. But it may also lose flexibility in other cases. > In addition, it would be good to have some tests for Thumb2 that deal > with the replicated constants for Thumb2 . Can you have a look at > creating some tests similar to the thumb2*replicated*.c tests in > gcc.target/arm but for 64 bit constants ? > The new test cases involving thumb2 replicated constants are added as following. thanks Carrot 2012-07-18 Wei Guozhi <car...@google.com> PR target/53189 * gcc.target/arm/pr53189-10.c: New testcase. * gcc.target/arm/pr53189-11.c: New testcase. * gcc.target/arm/pr53189-12.c: New testcase. Index: pr53189-10.c =================================================================== --- pr53189-10.c (revision 0) +++ pr53189-10.c (revision 0) @@ -0,0 +1,9 @@ +/* { dg-options "-mthumb -O2" } */ +/* { dg-require-effective-target arm_thumb2_ok } */ +/* { dg-final { scan-assembler-not "mov" } } */ +/* { dg-final { scan-assembler "and.*#-16843010" } } */ + +void t0p(long long * p) +{ + *p &= 0x9fefefefe; +} Index: pr53189-11.c =================================================================== --- pr53189-11.c (revision 0) +++ pr53189-11.c (revision 0) @@ -0,0 +1,9 @@ +/* { dg-options "-mthumb -O2" } */ +/* { dg-require-effective-target arm_thumb2_ok } */ +/* { dg-final { scan-assembler-not "mov" } } */ +/* { dg-final { scan-assembler "eor.*#-1426019584" } } */ + +void t0p(long long * p) +{ + *p ^= 0x7ab00ab00; +} Index: pr53189-12.c =================================================================== --- pr53189-12.c (revision 0) +++ pr53189-12.c (revision 0) @@ -0,0 +1,9 @@ +/* { dg-options "-mthumb -O2" } */ +/* { dg-require-effective-target arm_thumb2_ok } */ +/* { dg-final { scan-assembler-not "mov" } } */ +/* { dg-final { scan-assembler "orr.*#13435085" } } */ + +void t0p(long long * p) +{ + *p |= 0x500cd00cd; +}