Hi Michael,

On 19 May 2017 at 09:21, Richard Sandiford <richard.sandif...@linaro.org> wrote:
> Thanks for doing this.  Just a couple of comments about the .md stuff:
>
> Michael Collison <michael.colli...@arm.com> writes:
>> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
>> index 5adc5ed..c6ae670 100644
>> --- a/gcc/config/aarch64/aarch64.md
>> +++ b/gcc/config/aarch64/aarch64.md
>> @@ -3999,6 +3999,92 @@
>>    }
>>  )
>>
>> +;; When the LSL, LSR, ASR, ROR instructions operate on all register
>> +;; arguments they truncate the shift/rotate amount by the size of the
>> +;; registers they operate on: 32 for W-regs, 64 for X-regs.  This allows
>> +;; us to optimise away such redundant masking instructions.  GCC can do
>> +;; that automatically when SHIFT_COUNT_TRUNCATED is true, but we can't
>> +;; enable it for TARGET_SIMD because some of the SISD shift alternatives
>> +;; don't perform this truncation.  So this pattern exists to catch such
>> +;; cases.
>> +
>> +(define_insn "*aarch64_<optab>_reg_<mode>3_mask1"
>> +  [(set (match_operand:GPI 0 "register_operand" "=r")
>> +     (SHIFT:GPI
>> +       (match_operand:GPI 1 "register_operand" "r")
>> +       (subreg:QI (and:GPI
>> +                   (match_operand:GPI 2 "register_operand" "r")
>> +                   (match_operand 3 "const_int_operand" "n")) 0)))]
>> +  "(~INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode)-1)) == 0"
>> +  "<shift>\t%<w>0, %<w>1, %<w>2"
>> +  [(set_attr "type" "shift_reg")]
>> +)
>
> (subreg:QI (...) 0) is only correct for little endian.  For big endian
> it needs to be 3 for SI or 7 for DI.  You could probably handle that
> using something like:
>
>   (match_operator:QI 2 "lowpart_subreg"
>     [(and:GPI ...)])
>
> and defining a lowpart_subreg predicate that checks the SUBREG_BYTE.
> Or just leave the subreg as-is and add !BYTES_BIG_ENDIAN to the insn
> condition.
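
For what it's worth, a minimal sketch of that predicate (untested; it
leans on the existing subreg_lowpart_p helper rather than checking
SUBREG_BYTE by hand) might look like this in predicates.md:

  ;; Match a SUBREG that selects the low part of its inner operand,
  ;; independently of endianness.
  (define_predicate "lowpart_subreg"
    (and (match_code "subreg")
         (match_test "subreg_lowpart_p (op)")))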
>

I can confirm that the new tests pass on little-endian, but fail on
big-endian.
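
For reference, the kind of source the first pattern is meant to catch
looks like this (a reduced example along the lines of the new tests,
not the actual testsuite file):

  int
  shift32 (int x, int n)
  {
    /* The mask is redundant because lsl on W-regs already truncates
       the shift amount modulo 32, so with the new pattern this should
       compile to a single lsl w0, w0, w1.  */
    return x << (n & 31);
  }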

Thanks,

Christophe


>> +(define_insn_and_split "*aarch64_reg_<mode>3_neg_mask2"
>> +  [(set (match_operand:GPI 0 "register_operand" "=r")
>> +     (SHIFT:GPI
>> +       (match_operand:GPI 1 "register_operand" "r")
>> +       (subreg:QI (neg:SI (and:SI
>> +                           (match_operand:SI 2 "register_operand" "r")
>> +                           (match_operand 3 "const_int_operand" "n"))) 0)))]
>> +  "((~INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode)-1)) == 0)
>> +   && can_create_pseudo_p ()"
>> +  "#"
>> +  "&& true"
>> +  [(const_int 0)]
>> +  {
>> +    rtx tmp = gen_reg_rtx (SImode);
>> +
>> +    emit_insn (gen_negsi2 (tmp, operands[2]));
>> +    rtx tmp2 = simplify_gen_subreg (QImode, tmp, SImode, 0);
>> +    emit_insn (gen_<optab><mode>3 (operands[0], operands[1], tmp2));
>> +    DONE;
>> +  }
>> +)
>
> Insn patterns shouldn't check can_create_pseudo_p, because there's no
> guarantee that the associated split happens before RA.  In this case it
> should be safe to reuse operand 0 after RA if you change it to:
>
>   [(set (match_operand:GPI 0 "register_operand" "=&r")
>         ...)]
>
> and then:
>
>   rtx tmp = (can_create_pseudo_p ()
>              ? gen_reg_rtx (SImode)
>              : gen_lowpart (SImode, operands[0]));
>
> Thanks,
> Richard
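
Folding Richard's two fragments into the posted pattern gives
something like the following (an untested sketch on my side; the
subreg endianness issue noted above would still apply here too):

  (define_insn_and_split "*aarch64_reg_<mode>3_neg_mask2"
    [(set (match_operand:GPI 0 "register_operand" "=&r")
          (SHIFT:GPI
            (match_operand:GPI 1 "register_operand" "r")
            (subreg:QI (neg:SI (and:SI
                                 (match_operand:SI 2 "register_operand" "r")
                                 (match_operand 3 "const_int_operand" "n"))) 0)))]
    "(~INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode) - 1)) == 0"
    "#"
    "&& true"
    [(const_int 0)]
    {
      /* Before reload, take a fresh pseudo; after reload, reuse the
         earlyclobbered output register as the scratch.  */
      rtx tmp = (can_create_pseudo_p ()
                 ? gen_reg_rtx (SImode)
                 : gen_lowpart (SImode, operands[0]));

      emit_insn (gen_negsi2 (tmp, operands[2]));
      rtx tmp2 = simplify_gen_subreg (QImode, tmp, SImode, 0);
      emit_insn (gen_<optab><mode>3 (operands[0], operands[1], tmp2));
      DONE;
    }
  )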
