Re: [PATCH] ARM cmpsi2_addneg fix follow-up (PR target/89506)

2019-03-19 Thread Ramana Radhakrishnan
On 04/03/2019 09:04, Jakub Jelinek wrote:
> On Fri, Mar 01, 2019 at 03:41:33PM +, Wilco Dijkstra wrote:
>> > and regtest revealed two code size
>> > regressions because of that.  Is -1 vs. 1 the only case of immediate
>> > valid for both "I" and "L" constraints where the former is longer than the
>> > latter?
>> 
>> Yes -1 is the only case which can result in larger code on Thumb-2, so -1 
>> should
>> be disallowed by the I constraint (or even better, the underlying query). 
>> That way
>> it will work correctly for all add/sub patterns, not just this one.
> 
> So, over the weekend I've bootstrapped/regtested on armv7hl-linux-gnueabi
> following two possible follow-ups which handle the -1 and 1 cases right
> (prefer the instruction with #1 for thumb2), 0 and INT_MIN (use subs) and
> for others use subs if both constraints match, otherwise adds.
> 
> The first one uses constraints and no C code in the output, I believe it is
> actually more expensive for compile time, because if one just reads what
> constrain_operands needs to do for another constraint, it is quite a lot.
> I've tried to at least not introduce new constraints for this, there is no
> constraint for number 1 (or for number -1).
> The Pu constraint is thumb2 only for numbers 1..8, and the alternative uses
> I constraint for the negation of it, i.e. -8..-1, only -1 from this is
> valid for I.  If that matches, we emit adds with #1, otherwise just prefer
> subs over adds.
> 
> The other swaps the alternatives similarly to the above, but for the special
> case of desirable adds with #1 uses C code instead of another alternative.
> 
> Ok for trunk (which one)?
> 
>      Jakub


Option #2 is better (the C code variant)- for something like this adding 
one more constraint is a bit painful.

Ok and watch out for any regressions as usual.

Ramana


Patch ping^2 (Re: [PATCH] ARM cmpsi2_addneg fix follow-up (PR target/89506))

2019-03-19 Thread Jakub Jelinek
On Tue, Mar 12, 2019 at 12:43:29PM +0100, Jakub Jelinek wrote:
> On Mon, Mar 04, 2019 at 10:04:01AM +0100, Jakub Jelinek wrote:
> > The first one uses constraints and no C code in the output, I believe it is
> > actually more expensive for compile time, because if one just reads what
> > constrain_operands needs to do for another constraint, it is quite a lot.
> > I've tried to at least not introduce new constraints for this, there is no
> > constraint for number 1 (or for number -1).
> > The Pu constraint is thumb2 only for numbers 1..8, and the alternative uses
> > I constraint for the negation of it, i.e. -8..-1, only -1 from this is
> > valid for I.  If that matches, we emit adds with #1, otherwise just prefer
> > subs over adds.
> > 
> > The other swaps the alternatives similarly to the above, but for the special
> > case of desirable adds with #1 uses C code instead of another alternative.
> > 
> > Ok for trunk (which one)?
> 
> I'd like to ping these patches:
> 
> > 2019-03-04  Jakub Jelinek  
> > 
> > PR target/89506
> > * config/arm/arm.md (cmpsi2_addneg): Swap the alternatives, add
> > another alternative with I constraint for operands[2] and Pu
> > for operands[3] and emit adds in that case, don't use C code to
> > emit the instruction.
> 
> or:
> 
> > 2019-03-04  Jakub Jelinek  
> > 
> > PR target/89506
> > * config/arm/arm.md (cmpsi2_addneg): Swap the alternatives and use
> > subs for the first alternative except when operands[3] is 1.

Ping.

Jakub


Patch ping (Re: [PATCH] ARM cmpsi2_addneg fix follow-up (PR target/89506))

2019-03-12 Thread Jakub Jelinek
On Mon, Mar 04, 2019 at 10:04:01AM +0100, Jakub Jelinek wrote:
> The first one uses constraints and no C code in the output, I believe it is
> actually more expensive for compile time, because if one just reads what
> constrain_operands needs to do for another constraint, it is quite a lot.
> I've tried to at least not introduce new constraints for this, there is no
> constraint for number 1 (or for number -1).
> The Pu constraint is thumb2 only for numbers 1..8, and the alternative uses
> I constraint for the negation of it, i.e. -8..-1, only -1 from this is
> valid for I.  If that matches, we emit adds with #1, otherwise just prefer
> subs over adds.
> 
> The other swaps the alternatives similarly to the above, but for the special
> case of desirable adds with #1 uses C code instead of another alternative.
> 
> Ok for trunk (which one)?

I'd like to ping these patches:

> 2019-03-04  Jakub Jelinek  
> 
>   PR target/89506
>   * config/arm/arm.md (cmpsi2_addneg): Swap the alternatives, add
>   another alternative with I constraint for operands[2] and Pu
>   for operands[3] and emit adds in that case, don't use C code to
>   emit the instruction.

or:

> 2019-03-04  Jakub Jelinek  
> 
>   PR target/89506
>   * config/arm/arm.md (cmpsi2_addneg): Swap the alternatives and use
>   subs for the first alternative except when operands[3] is 1.

Jakub


[PATCH] ARM cmpsi2_addneg fix follow-up (PR target/89506)

2019-03-04 Thread Jakub Jelinek
On Fri, Mar 01, 2019 at 03:41:33PM +, Wilco Dijkstra wrote:
> > and regtest revealed two code size
> > regressions because of that.  Is -1 vs. 1 the only case of immediate
> > valid for both "I" and "L" constraints where the former is longer than the
> > latter?
> 
> Yes -1 is the only case which can result in larger code on Thumb-2, so -1 
> should
> be disallowed by the I constraint (or even better, the underlying query). 
> That way
> it will work correctly for all add/sub patterns, not just this one.

So, over the weekend I've bootstrapped/regtested on armv7hl-linux-gnueabi
following two possible follow-ups which handle the -1 and 1 cases right
(prefer the instruction with #1 for thumb2), 0 and INT_MIN (use subs) and
for others use subs if both constraints match, otherwise adds.

The first one uses constraints and no C code in the output, I believe it is
actually more expensive for compile time, because if one just reads what
constrain_operands needs to do for another constraint, it is quite a lot.
I've tried to at least not introduce new constraints for this, there is no
constraint for number 1 (or for number -1).
The Pu constraint is thumb2 only for numbers 1..8, and the alternative uses
I constraint for the negation of it, i.e. -8..-1, only -1 from this is
valid for I.  If that matches, we emit adds with #1, otherwise just prefer
subs over adds.

The other swaps the alternatives similarly to the above, but for the special
case of desirable adds with #1 uses C code instead of another alternative.

Ok for trunk (which one)?

Jakub
2019-03-04  Jakub Jelinek  

PR target/89506
* config/arm/arm.md (cmpsi2_addneg): Swap the alternatives, add
another alternative with I constraint for operands[2] and Pu
for operands[3] and emit adds in that case, don't use C code to
emit the instruction.

--- gcc/config/arm/arm.md.jj2019-03-02 09:04:25.550794239 +0100
+++ gcc/config/arm/arm.md   2019-03-02 17:08:13.036725812 +0100
@@ -857,31 +857,31 @@ (define_insn "*compare_negsi_si"
(set_attr "type" "alus_sreg")]
 )
 
-;; This is the canonicalization of addsi3_compare0_for_combiner when the
+;; This is the canonicalization of subsi3_compare when the
 ;; addend is a constant.
+;; For 0 and INT_MIN it is essential that we use subs, as adds will result
+;; in different condition codes (like cmn rather than like cmp), so that
+;; alternative comes first.  Both I and L constraints can match for any
+;; 0x??00 where except for 0 and INT_MIN it doesn't matter what we choose,
+;; and also for -1 and 1 with TARGET_THUMB2, in that case prefer instruction
+;; with #1 as it is shorter.  The first alternative will use adds ?, ?, #1 over
+;; subs ?, ?, #-1, the second alternative will use subs for #0 or #2147483648
+;; or any other case where both I and L constraints match.
 (define_insn "cmpsi2_addneg"
   [(set (reg:CC CC_REGNUM)
(compare:CC
-(match_operand:SI 1 "s_register_operand" "r,r")
-(match_operand:SI 2 "arm_addimm_operand" "L,I")))
-   (set (match_operand:SI 0 "s_register_operand" "=r,r")
+(match_operand:SI 1 "s_register_operand" "r,r,r")
+(match_operand:SI 2 "arm_addimm_operand" "I,I,L")))
+   (set (match_operand:SI 0 "s_register_operand" "=r,r,r")
(plus:SI (match_dup 1)
-(match_operand:SI 3 "arm_addimm_operand" "I,L")))]
+(match_operand:SI 3 "arm_addimm_operand" "Pu,L,I")))]
   "TARGET_32BIT
&& (INTVAL (operands[2])
== trunc_int_for_mode (-INTVAL (operands[3]), SImode))"
-{
-  /* For 0 and INT_MIN it is essential that we use subs, as adds
- will result in different condition codes (like cmn rather than
- like cmp).  For other immediates, we should choose whatever
- will have smaller encoding.  */
-  if (operands[2] == const0_rtx
-  || INTVAL (operands[2]) == -HOST_WIDE_INT_C (0x8000)
-  || which_alternative == 1)
-return "subs%?\\t%0, %1, #%n3";
-  else
-return "adds%?\\t%0, %1, %3";
-}
+  "@
+   adds%?\\t%0, %1, %3
+   subs%?\\t%0, %1, #%n3
+   adds%?\\t%0, %1, %3"
   [(set_attr "conds" "set")
(set_attr "type" "alus_sreg")]
 )
2019-03-04  Jakub Jelinek  

PR target/89506
* config/arm/arm.md (cmpsi2_addneg): Swap the alternatives and use
subs for the first alternative except when operands[3] is 1.

--- gcc/config/arm/arm.md.jj2019-03-02 09:04:25.550794239 +0100
+++ gcc/config/arm/arm.md   2019-03-02 09:41:03.501404694 +0100
@@ -857,27 +857,27 @@ (define_insn "*compare_negsi_si"
(set_attr "type" "alus_sreg")]
 )
 
-;; This is the canonicalization of addsi3_compare0_for_combiner when the
+;; This is the canonicalization of subsi3_compare when the
 ;; addend is a constant.
 (define_insn "cmpsi2_addneg"
   [(set (reg:CC CC_REGNUM)
(compare:CC
 (match_operand:SI 1 "s_register_operand" "r,r")
-(match_operand:SI 2 "arm_addimm_operand" "L,I")))
+(match_operand:SI 2