> I believe this patch to be nothing but an improvement over the current
> state, and that a fix to the constraint problem should be a separate patch.
>
> In that basis, am I OK to commit?
One minor nit:
> (define_special_predicate "shift_operator"
>...
>+ (ior (match_test "GET_CODE (XEXP (op, 1)) == CONST_INT
>+ && ((unsigned HOST_WIDE_INT) INTVAL (XEXP (op, 1))) <
>32")
>+ (match_test "REG_P (XEXP (op, 1))"))))
We're already enforcing the REG_P elsewhere, and it's only valid in some
contexts, so I'd change this to:
(match_test "GET_CODE (XEXP (op, 1)) != CONST_INT
|| ((unsigned HOST_WIDE_INT) INTVAL (XEXP (op, 1))) < 32")
> Now, let me explain the other problem:
>
> As it stands, all shift-related patterns, for ARM or Thumb2 mode, permit
> a shift to be expressed as either a shift type and amount (register or
> constant), or as a multiply and power-of-two constant.
Added complication is that only ARM mode accepts a register.
> Problem scenario 1:
>
> Consider pattern (plus (mult r1 r2) r3).
>
> It so happens that reload knows that r2 contains a constant, say 20,
> so reload checks to see if that could be converted to an immediate.
> Now, 20 is not a power of two, so recog would reject it, but it is in
> the range 0..31 so it does match the 'M' constraint. Oops!
Though as you mention below we the predicate don't allow the second operand to
be a register, so this can never happen. Reload may do unexpected things, but
if it starts randomly changing valid const_int values then we have much bigger
problems.
> Problem scenario 2:
>
> Consider pattern (ashiftrt r1 r2).
>
> Again, it so happens that reload knows that r2 contains a constant, in
> this case let's say 64, so again reload checks to see if that could
> be converted to an immediate. This time, 64 is not in the range
> 0..31, so recog would reject it, but it is a power of two, so it does
> match the 'M' constraint. Again, oops!
>
> I see two ways to fix this properly:
>
> 1. Duplicate all the patterns in the machine description, once for the
> mult case, and once for the other cases. This could probably be
> done with a code iterator, if preferred.
>
> 2. Redefine the canonical form of (plus (mult .. 2^n) ..) such that it
> always uses the (presumably cheaper) shift-and-add option. However,
> this would require all other targets where madd really is the best
> option to fix it up. (I'd imagine that two instructions for shift
> and add would be cheaper speed wise, if properly scheduled, on most
> targets? That doesn't help the size optimization though.)
3. Consistently accept both power-of-two and 0..31 for shifts. Large shift
counts give undefined results[1], so replace them with an arbitrary value
(e.g. 0) during assembly output. Argualy not an entirely "proper" fix, but I
think it'll keep everything happy.
> However, it's not obvious to me that this needs fixing:
> * The failure mode would be an ICE, and we've not seen any.
Then again noone noticed the negative-shift ICE until recently :-/
> * There's a comment in arm.c:shift_op that suggests that this can't
> happen, somehow, at least in the mult case.
> - I'm not sure exactly how reload works, but it seems reasonable
> that it will never try to convert a register to an immediate
> because the pattern does not allow registers in the first place.
> - This logic doesn't hold in the opposite case though.
> Have I explained all that clearly?
I think you've convered most of it.
For bonus points we should probably disallow MULT in the arm_shiftsi3 pattern,
stop it interacting with the regulat mulsi3 pattern in undesirable ways.
Paul
[1] Or at least not any result gcc will be expecting.