RE: [PATCH][Aarch64] Add support for overflow add and sub operations

2017-08-01 Thread Michael Collison
Updated the patch per Richard's comments, in particular the issues relating to
the use of NE: "Use of ne is wrong here.  The condition register should be set to 
the result of a compare rtl construct.  The same applies elsewhere within this 
patch.  NE is then used on the result of the comparison.  The mode of the 
compare then indicates what might or might not be valid in the way the 
comparison is finally constructed."
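
After that change, the CC_V-setting add presumably takes roughly the shape of
the sketch below (made-up pseudo register numbers, illustrative only; see the
.md patterns in the patch for the real definitions):

;; Set the flags from a COMPARE in CC_V mode: the DI-widened exact sum
;; against the sign-extension of the truncated SI sum.  They differ
;; exactly when the signed addition overflowed.
(parallel
  [(set (reg:CC_V CC_REGNUM)
        (compare:CC_V
          (plus:DI (sign_extend:DI (reg:SI 100))
                   (sign_extend:DI (reg:SI 101)))
          (sign_extend:DI (plus:SI (reg:SI 100) (reg:SI 101)))))
   ;; ... while also producing the SImode sum itself.
   (set (reg:SI 102)
        (plus:SI (reg:SI 100) (reg:SI 101)))])

The overflow result is then obtained by testing NE on the CC_V-mode flags
register, which aarch64_get_condition_code_1 maps onto the vs/vc conditions
used by cset.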

Okay for trunk?

2017-08-01  Michael Collison  <michael.colli...@arm.com>
Richard Henderson <r...@redhat.com>

* config/aarch64/aarch64-modes.def (CC_V): New.
* config/aarch64/aarch64-protos.h
(aarch64_add_128bit_scratch_regs): Declare.
(aarch64_subv_128bit_scratch_regs): Declare.
(aarch64_expand_subvti): Declare.
(aarch64_gen_unlikely_cbranch): Declare.
* config/aarch64/aarch64.c (aarch64_select_cc_mode): Test
for signed overflow using CC_Vmode.
(aarch64_get_condition_code_1): Handle CC_Vmode.
(aarch64_gen_unlikely_cbranch): New function.
(aarch64_add_128bit_scratch_regs): New function.
(aarch64_subv_128bit_scratch_regs): New function.
(aarch64_expand_subvti): New function.
* config/aarch64/aarch64.md (addv4, uaddv4): New.
(addti3): Create simpler code if low part is already known to be 0.
(addvti4, uaddvti4): New.
(*add3_compareC_cconly_imm): New.
(*add3_compareC_cconly): New.
(*add3_compareC_imm): New.
(*add3_compareC): Rename from add3_compare1; do not
handle constants within this pattern.
(*add3_compareV_cconly_imm): New.
(*add3_compareV_cconly): New.
(*add3_compareV_imm): New.
(add3_compareV): New.
(add3_carryinC, add3_carryinV): New.
(*add3_carryinC_zero, *add3_carryinV_zero): New.
(*add3_carryinC, *add3_carryinV): New.
(subv4, usubv4): New.
(subti): Handle op1 zero.
(subvti4, usubvti4): New.
(*sub3_compare1_imm): New.
(sub3_carryinCV): New.
(*sub3_carryinCV_z1_z2, *sub3_carryinCV_z1): New.
(*sub3_carryinCV_z2, *sub3_carryinCV): New.
* testsuite/gcc.target/arm/builtin_sadd_128.c: New testcase.
* testsuite/gcc.target/arm/builtin_saddl.c: New testcase.
* testsuite/gcc.target/arm/builtin_saddll.c: New testcase.
* testsuite/gcc.target/arm/builtin_uadd_128.c: New testcase.
* testsuite/gcc.target/arm/builtin_uaddl.c: New testcase.
* testsuite/gcc.target/arm/builtin_uaddll.c: New testcase.
* testsuite/gcc.target/arm/builtin_ssub_128.c: New testcase.
* testsuite/gcc.target/arm/builtin_ssubl.c: New testcase.
* testsuite/gcc.target/arm/builtin_ssubll.c: New testcase.
* testsuite/gcc.target/arm/builtin_usub_128.c: New testcase.
* testsuite/gcc.target/arm/builtin_usubl.c: New testcase.
* testsuite/gcc.target/arm/builtin_usubll.c: New testcase.

-Original Message-
From: Richard Earnshaw (lists) [mailto:richard.earns...@arm.com] 
Sent: Wednesday, July 5, 2017 2:38 AM
To: Michael Collison <michael.colli...@arm.com>; Christophe Lyon 
<christophe.l...@linaro.org>
Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>
Subject: Re: [PATCH][Aarch64] Add support for overflow add and sub operations

On 19/05/17 22:11, Michael Collison wrote:
> Christophe,
> 
> I had a typo in the two test cases: "addcs" should have been "adcs". I caught 
> this previously but submitted the previous patch incorrectly. Updated patch 
> attached.
> 
> Okay for trunk?
> 

Apologies for the delay responding; I've been procrastinating over this
one.  In part it's due to the size of the patch, with very little
top-level description of the motivation and overall approach to the
problem.

It would really help review if this could be split into multiple patches with a 
description of what each stage achieves.

Anyway, there are a couple of obvious formatting issues to deal with first, 
before we get into the details of the patch.

> -Original Message-
> From: Christophe Lyon [mailto:christophe.l...@linaro.org]
> Sent: Friday, May 19, 2017 3:59 AM
> To: Michael Collison <michael.colli...@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>
> Subject: Re: [PATCH][Aarch64] Add support for overflow add and sub 
> operations
> 
> Hi Michael,
> 
> 
> On 19 May 2017 at 07:12, Michael Collison <michael.colli...@arm.com> wrote:
>> Hi,
>>
>> This patch improves code generation for builtin arithmetic overflow
>> operations for the aarch64 backend. As an example, for a simple test case
>> such as:
>>
>> int
>> f (int x, int y, int *ovf)
>> {
>>   int res;
>>   *ovf = __builtin_sadd_overf

Re: [PATCH][Aarch64] Add support for overflow add and sub operations

2017-07-06 Thread Richard Earnshaw (lists)
On 06/07/17 08:29, Michael Collison wrote:
> Richard,
> 
> Can you explain "Use of ne is wrong here.  The condition register should
> be set to the result of a compare rtl construct.  The same applies
> elsewhere within this patch.  NE is then used on the result of the
> comparison.  The mode of the compare then indicates what might or might
> not be valid in the way the comparison is finally constructed."?
> 
> Why is "ne" wrong? I don't doubt you are correct, but I see nothing in
> the internals manual that forbids it. I want to understand what issues
> this exposes.
> 

Because the idiomatic form on a machine with a flags register is

CCreg:mode = COMPARE:mode (A, B)

which is then used with

 cond-op (CCreg:mode, 0)

where cond-op is NE, EQ, GE, ... as appropriate.
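
For instance, a minimal RTL sketch of that shape (made-up pseudo register
numbers, purely illustrative rather than any actual pattern from the patch)
would be:

;; 1) the condition register is set to the result of a COMPARE ...
(set (reg:CC CC_REGNUM)
     (compare:CC (reg:SI 100) (reg:SI 101)))

;; 2) ... and the comparison is then consumed by applying a condition
;;    operator such as NE to the CC register and zero.
(set (reg:SI 102)
     (ne:SI (reg:CC CC_REGNUM) (const_int 0)))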


> As you indicate, I used this idiom in the arm port when I added the
> overflow operations there as well.  Additionally, other targets seem to
> use the comparison operators this way (i386 for the umulv).

Some targets really have boolean predicate operations that set results
explicitly in GP registers as the truth of A < B, etc.  On those
machines using

 pred-reg = cond-op (A, B)

makes sense, but not on ARM or AArch64.

R.

> 
> Regards,
> 
> Michael Collison
> 
> -Original Message-
> From: Richard Earnshaw (lists) [mailto:richard.earns...@arm.com]
> Sent: Wednesday, July 5, 2017 2:38 AM
> To: Michael Collison <michael.colli...@arm.com>; Christophe Lyon
> <christophe.l...@linaro.org>
> Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>
> Subject: Re: [PATCH][Aarch64] Add support for overflow add and sub
> operations
> 
> On 19/05/17 22:11, Michael Collison wrote:
>> Christophe,
>> 
>> I had a typo in the two test cases: "addcs" should have been "adcs". I 
>> caught this previously but submitted the previous patch incorrectly. Updated 
>> patch attached.
>> 
>> Okay for trunk?
>> 
> 
> Apologies for the delay responding; I've been procrastinating over this
> one.  In part it's due to the size of the patch, with very little
> top-level description of the motivation and overall approach to
> the problem.
> 
> It would really help review if this could be split into multiple patches
> with a description of what each stage achieves.
> 
> Anyway, there are a couple of obvious formatting issues to deal with
> first, before we get into the details of the patch.
> 
>> -----Original Message-----
>> From: Christophe Lyon [mailto:christophe.l...@linaro.org]
>> Sent: Friday, May 19, 2017 3:59 AM
>> To: Michael Collison <michael.colli...@arm.com>
>> Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>
>> Subject: Re: [PATCH][Aarch64] Add support for overflow add and sub 
>> operations
>> 
>> Hi Michael,
>> 
>> 
>> On 19 May 2017 at 07:12, Michael Collison <michael.colli...@arm.com> wrote:
>>> Hi,
>>>
>>> This patch improves code generation for builtin arithmetic overflow
>>> operations for the aarch64 backend. As an example, for a simple test case
>>> such as:
>>>
>>> int
>>> f (int x, int y, int *ovf)
>>> {
>>>   int res;
>>>   *ovf = __builtin_sadd_overflow (x, y, &res);
>>>   return res;
>>> }
>>>
>>> Current trunk at -O2 generates
>>>
>>> f:
>>> mov w3, w0
>>> mov w4, 0
>>> add w0, w0, w1
>>> tbnz w1, #31, .L4
>>> cmp w0, w3
>>> blt .L3
>>> .L2:
>>> str w4, [x2]
>>> ret
>>> .p2align 3
>>> .L4:
>>> cmp w0, w3
>>> ble .L2
>>> .L3:
>>> mov w4, 1
>>> b   .L2
>>>
>>>
>>> With the patch this now generates:
>>>
>>> f:
>>> adds w0, w0, w1
>>> cset w1, vs
>>> str w1, [x2]
>>> ret
>>>
>>>
>>> Original patch from Richard Henderson:
>>>
>>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01903.html
>>>
>>>
>>> Okay for trunk?
>>>
>>> 2017-05-17  Michael Collison  <michael.colli...@arm.com>
>>> Richard Henderson <r...@redhat.com>
>>>
>>> * config/aarch64/aarch64-modes.def (CC_V): New.
>>> * config/aarch64/aarch64-protos.h
>>> (aarch

RE: [PATCH][Aarch64] Add support for overflow add and sub operations

2017-07-06 Thread Michael Collison
Richard,

Can you explain "Use of ne is wrong here.  The condition register should be set 
to the result of a compare rtl construct.  The same applies elsewhere within 
this patch.  NE is then used on the result of the comparison.  The mode of the 
compare then indicates what might or might not be valid in the way the 
comparison is finally constructed."?

Why is "ne" wrong? I don't doubt you are correct, but I see nothing in the 
internals manual that forbids it. I want to understand what issues this exposes.

As you indicate, I used this idiom in the arm port when I added the overflow 
operations there as well.  Additionally, other targets seem to use the comparison 
operators this way (i386 for the umulv).

Regards,

Michael Collison

-Original Message-
From: Richard Earnshaw (lists) [mailto:richard.earns...@arm.com] 
Sent: Wednesday, July 5, 2017 2:38 AM
To: Michael Collison <michael.colli...@arm.com>; Christophe Lyon 
<christophe.l...@linaro.org>
Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>
Subject: Re: [PATCH][Aarch64] Add support for overflow add and sub operations

On 19/05/17 22:11, Michael Collison wrote:
> Christophe,
> 
> I had a typo in the two test cases: "addcs" should have been "adcs". I caught 
> this previously but submitted the previous patch incorrectly. Updated patch 
> attached.
> 
> Okay for trunk?
> 

Apologies for the delay responding; I've been procrastinating over this
one.  In part it's due to the size of the patch, with very little
top-level description of the motivation and overall approach to the
problem.

It would really help review if this could be split into multiple patches with a 
description of what each stage achieves.

Anyway, there are a couple of obvious formatting issues to deal with first, 
before we get into the details of the patch.

> -Original Message-
> From: Christophe Lyon [mailto:christophe.l...@linaro.org]
> Sent: Friday, May 19, 2017 3:59 AM
> To: Michael Collison <michael.colli...@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>
> Subject: Re: [PATCH][Aarch64] Add support for overflow add and sub 
> operations
> 
> Hi Michael,
> 
> 
> On 19 May 2017 at 07:12, Michael Collison <michael.colli...@arm.com> wrote:
>> Hi,
>>
>> This patch improves code generation for builtin arithmetic overflow
>> operations for the aarch64 backend. As an example, for a simple test case
>> such as:
>>
>> int
>> f (int x, int y, int *ovf)
>> {
>>   int res;
>>   *ovf = __builtin_sadd_overflow (x, y, &res);
>>   return res;
>> }
>>
>> Current trunk at -O2 generates
>>
>> f:
>> mov w3, w0
>> mov w4, 0
>> add w0, w0, w1
>> tbnz w1, #31, .L4
>> cmp w0, w3
>> blt .L3
>> .L2:
>> str w4, [x2]
>> ret
>> .p2align 3
>> .L4:
>> cmp w0, w3
>> ble .L2
>> .L3:
>> mov w4, 1
>> b   .L2
>>
>>
>> With the patch this now generates:
>>
>> f:
>> adds w0, w0, w1
>> cset w1, vs
>> str w1, [x2]
>> ret
>>
>>
>> Original patch from Richard Henderson:
>>
>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01903.html
>>
>>
>> Okay for trunk?
>>
>> 2017-05-17  Michael Collison  <michael.colli...@arm.com>
>> Richard Henderson <r...@redhat.com>
>>
>> * config/aarch64/aarch64-modes.def (CC_V): New.
>> * config/aarch64/aarch64-protos.h
>> (aarch64_add_128bit_scratch_regs): Declare
>> (aarch64_add_128bit_scratch_regs): Declare.
>> (aarch64_expand_subvti): Declare.
>> (aarch64_gen_unlikely_cbranch): Declare
>> * config/aarch64/aarch64.c (aarch64_select_cc_mode): Test
>> for signed overflow using CC_Vmode.
>> (aarch64_get_condition_code_1): Handle CC_Vmode.
>> (aarch64_gen_unlikely_cbranch): New function.
>> (aarch64_add_128bit_scratch_regs): New function.
>> (aarch64_subv_128bit_scratch_regs): New function.
>> (aarch64_expand_subvti): New function.
>> * config/aarch64/aarch64.md (addv4, uaddv4): New.
>> (addti3): Create simpler code if low part is already known to be 0.
>> (addvti4, uaddvti4): New.
>> (*add3_compareC_cconly_imm): New.
>> (*add3_compareC_cconly): New.
>> (*add3_compareC_imm): New.
>&g

Re: [PATCH][Aarch64] Add support for overflow add and sub operations

2017-07-05 Thread Richard Earnshaw (lists)
On 19/05/17 22:11, Michael Collison wrote:
> Christophe,
> 
> I had a typo in the two test cases: "addcs" should have been "adcs". I caught 
> this previously but submitted the previous patch incorrectly. Updated patch 
> attached.
> 
> Okay for trunk?
> 

Apologies for the delay responding; I've been procrastinating over this
one.  In part it's due to the size of the patch, with very little
top-level description of the motivation and overall approach to
the problem.

It would really help review if this could be split into multiple patches
with a description of what each stage achieves.

Anyway, there are a couple of obvious formatting issues to deal with
first, before we get into the details of the patch.

> -Original Message-
> From: Christophe Lyon [mailto:christophe.l...@linaro.org] 
> Sent: Friday, May 19, 2017 3:59 AM
> To: Michael Collison <michael.colli...@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>
> Subject: Re: [PATCH][Aarch64] Add support for overflow add and sub operations
> 
> Hi Michael,
> 
> 
> On 19 May 2017 at 07:12, Michael Collison <michael.colli...@arm.com> wrote:
>> Hi,
>>
>> This patch improves code generation for builtin arithmetic overflow
>> operations for the aarch64 backend. As an example, for a simple test case
>> such as:
>>
>> int
>> f (int x, int y, int *ovf)
>> {
>>   int res;
>>   *ovf = __builtin_sadd_overflow (x, y, &res);
>>   return res;
>> }
>>
>> Current trunk at -O2 generates
>>
>> f:
>> mov w3, w0
>> mov w4, 0
>> add w0, w0, w1
>> tbnz w1, #31, .L4
>> cmp w0, w3
>> blt .L3
>> .L2:
>> str w4, [x2]
>> ret
>> .p2align 3
>> .L4:
>> cmp w0, w3
>> ble .L2
>> .L3:
>> mov w4, 1
>> b   .L2
>>
>>
>> With the patch this now generates:
>>
>> f:
>> adds w0, w0, w1
>> cset w1, vs
>> str w1, [x2]
>> ret
>>
>>
>> Original patch from Richard Henderson:
>>
>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01903.html
>>
>>
>> Okay for trunk?
>>
>> 2017-05-17  Michael Collison  <michael.colli...@arm.com>
>> Richard Henderson <r...@redhat.com>
>>
>> * config/aarch64/aarch64-modes.def (CC_V): New.
>> * config/aarch64/aarch64-protos.h
>> (aarch64_add_128bit_scratch_regs): Declare
>> (aarch64_add_128bit_scratch_regs): Declare.
>> (aarch64_expand_subvti): Declare.
>> (aarch64_gen_unlikely_cbranch): Declare
>> * config/aarch64/aarch64.c (aarch64_select_cc_mode): Test
>> for signed overflow using CC_Vmode.
>> (aarch64_get_condition_code_1): Handle CC_Vmode.
>> (aarch64_gen_unlikely_cbranch): New function.
>> (aarch64_add_128bit_scratch_regs): New function.
>> (aarch64_subv_128bit_scratch_regs): New function.
>> (aarch64_expand_subvti): New function.
>> * config/aarch64/aarch64.md (addv4, uaddv4): New.
>> (addti3): Create simpler code if low part is already known to be 0.
>> (addvti4, uaddvti4): New.
>> (*add3_compareC_cconly_imm): New.
>> (*add3_compareC_cconly): New.
>> (*add3_compareC_imm): New.
>> (*add3_compareC): Rename from add3_compare1; do not
>> handle constants within this pattern.
>> (*add3_compareV_cconly_imm): New.
>> (*add3_compareV_cconly): New.
>> (*add3_compareV_imm): New.
>> (add3_compareV): New.
>> (add3_carryinC, add3_carryinV): New.
>> (*add3_carryinC_zero, *add3_carryinV_zero): New.
>> (*add3_carryinC, *add3_carryinV): New.
>> (subv4, usubv4): New.
>> (subti): Handle op1 zero.
>> (subvti4, usub4ti4): New.
>> (*sub3_compare1_imm): New.
>> (sub3_carryinCV): New.
>> (*sub3_carryinCV_z1_z2, *sub3_carryinCV_z1): New.
>> (*sub3_carryinCV_z2, *sub3_carryinCV): New.
>> * testsuite/gcc.target/arm/builtin_sadd_128.c: New testcase.
>> * testsuite/gcc.target/arm/builtin_saddl.c: New testcase.
>> * testsuite/gcc.target/arm/builtin_saddll.c: New testcase.
>> * testsuite/gcc.target/arm/builtin_uadd_128.c: New testcase.
>> * testsuite/g

RE: [PING^2][PATCH][Aarch64] Add support for overflow add and sub operations

2017-06-12 Thread Michael Collison
Ping ^2. Updated patch posted here:

https://gcc.gnu.org/ml/gcc-patches/2017-05/msg01615.html




RE: [PING][PATCH][Aarch64] Add support for overflow add and sub operations

2017-06-01 Thread Michael Collison
Ping. Testsuite issue resolved. Okay for trunk?

-Original Message-
From: Christophe Lyon [mailto:christophe.l...@linaro.org] 
Sent: Friday, May 19, 2017 3:59 AM
To: Michael Collison <michael.colli...@arm.com>
Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>
Subject: Re: [PATCH][Aarch64] Add support for overflow add and sub operations

Hi Michael,


On 19 May 2017 at 07:12, Michael Collison <michael.colli...@arm.com> wrote:
> Hi,
>
> This patch improves code generation for builtin arithmetic overflow
> operations for the aarch64 backend. As an example, for a simple test case such
> as:
>
> int
> f (int x, int y, int *ovf)
> {
>   int res;
>   *ovf = __builtin_sadd_overflow (x, y, &res);
>   return res;
> }
>
> Current trunk at -O2 generates
>
> f:
> mov w3, w0
> mov w4, 0
> add w0, w0, w1
> tbnz w1, #31, .L4
> cmp w0, w3
> blt .L3
> .L2:
> str w4, [x2]
> ret
> .p2align 3
> .L4:
> cmp w0, w3
> ble .L2
> .L3:
> mov w4, 1
> b   .L2
>
>
> With the patch this now generates:
>
> f:
> adds w0, w0, w1
> cset w1, vs
> str w1, [x2]
> ret
>
>
> Original patch from Richard Henderson:
>
> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01903.html
>
>
> Okay for trunk?
>
> 2017-05-17  Michael Collison  <michael.colli...@arm.com>
> Richard Henderson <r...@redhat.com>
>
> * config/aarch64/aarch64-modes.def (CC_V): New.
> * config/aarch64/aarch64-protos.h
> (aarch64_add_128bit_scratch_regs): Declare
> (aarch64_add_128bit_scratch_regs): Declare.
> (aarch64_expand_subvti): Declare.
> (aarch64_gen_unlikely_cbranch): Declare
> * config/aarch64/aarch64.c (aarch64_select_cc_mode): Test
> for signed overflow using CC_Vmode.
> (aarch64_get_condition_code_1): Handle CC_Vmode.
> (aarch64_gen_unlikely_cbranch): New function.
> (aarch64_add_128bit_scratch_regs): New function.
> (aarch64_subv_128bit_scratch_regs): New function.
> (aarch64_expand_subvti): New function.
> * config/aarch64/aarch64.md (addv4, uaddv4): New.
> (addti3): Create simpler code if low part is already known to be 0.
> (addvti4, uaddvti4): New.
> (*add3_compareC_cconly_imm): New.
> (*add3_compareC_cconly): New.
> (*add3_compareC_imm): New.
> (*add3_compareC): Rename from add3_compare1; do not
> handle constants within this pattern.
> (*add3_compareV_cconly_imm): New.
> (*add3_compareV_cconly): New.
> (*add3_compareV_imm): New.
> (add3_compareV): New.
> (add3_carryinC, add3_carryinV): New.
> (*add3_carryinC_zero, *add3_carryinV_zero): New.
> (*add3_carryinC, *add3_carryinV): New.
> (subv4, usubv4): New.
> (subti): Handle op1 zero.
> (subvti4, usub4ti4): New.
> (*sub3_compare1_imm): New.
> (sub3_carryinCV): New.
> (*sub3_carryinCV_z1_z2, *sub3_carryinCV_z1): New.
> (*sub3_carryinCV_z2, *sub3_carryinCV): New.
> * testsuite/gcc.target/arm/builtin_sadd_128.c: New testcase.
> * testsuite/gcc.target/arm/builtin_saddl.c: New testcase.
> * testsuite/gcc.target/arm/builtin_saddll.c: New testcase.
> * testsuite/gcc.target/arm/builtin_uadd_128.c: New testcase.
> * testsuite/gcc.target/arm/builtin_uaddl.c: New testcase.
> * testsuite/gcc.target/arm/builtin_uaddll.c: New testcase.
> * testsuite/gcc.target/arm/builtin_ssub_128.c: New testcase.
> * testsuite/gcc.target/arm/builtin_ssubl.c: New testcase.
> * testsuite/gcc.target/arm/builtin_ssubll.c: New testcase.
> * testsuite/gcc.target/arm/builtin_usub_128.c: New testcase.
> * testsuite/gcc.target/arm/builtin_usubl.c: New testcase.
> * testsuite/gcc.target/arm/builtin_usubll.c: New testcase.

I've tried your patch, and 2 of the new tests FAIL:
gcc.target/aarch64/builtin_sadd_128.c scan-assembler addcs
gcc.target/aarch64/builtin_uadd_128.c scan-assembler addcs

Am I missing something?

Thanks,

Christophe


RE: [PATCH][Aarch64] Add support for overflow add and sub operations

2017-05-19 Thread Michael Collison
Christophe,

I had a typo in the two test cases: "addcs" should have been "adcs". I caught 
this previously but submitted the previous patch incorrectly. Updated patch 
attached.

Okay for trunk?

-Original Message-
From: Christophe Lyon [mailto:christophe.l...@linaro.org] 
Sent: Friday, May 19, 2017 3:59 AM
To: Michael Collison <michael.colli...@arm.com>
Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>
Subject: Re: [PATCH][Aarch64] Add support for overflow add and sub operations

Hi Michael,


On 19 May 2017 at 07:12, Michael Collison <michael.colli...@arm.com> wrote:
> Hi,
>
> This patch improves code generation for builtin arithmetic overflow
> operations for the aarch64 backend. As an example, for a simple test case such
> as:
>
> int
> f (int x, int y, int *ovf)
> {
>   int res;
>   *ovf = __builtin_sadd_overflow (x, y, &res);
>   return res;
> }
>
> Current trunk at -O2 generates
>
> f:
> mov w3, w0
> mov w4, 0
> add w0, w0, w1
> tbnz w1, #31, .L4
> cmp w0, w3
> blt .L3
> .L2:
> str w4, [x2]
> ret
> .p2align 3
> .L4:
> cmp w0, w3
> ble .L2
> .L3:
> mov w4, 1
> b   .L2
>
>
> With the patch this now generates:
>
> f:
> adds w0, w0, w1
> cset w1, vs
> str w1, [x2]
> ret
>
>
> Original patch from Richard Henderson:
>
> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01903.html
>
>
> Okay for trunk?
>
> 2017-05-17  Michael Collison  <michael.colli...@arm.com>
> Richard Henderson <r...@redhat.com>
>
> * config/aarch64/aarch64-modes.def (CC_V): New.
> * config/aarch64/aarch64-protos.h
> (aarch64_add_128bit_scratch_regs): Declare
> (aarch64_add_128bit_scratch_regs): Declare.
> (aarch64_expand_subvti): Declare.
> (aarch64_gen_unlikely_cbranch): Declare
> * config/aarch64/aarch64.c (aarch64_select_cc_mode): Test
> for signed overflow using CC_Vmode.
> (aarch64_get_condition_code_1): Handle CC_Vmode.
> (aarch64_gen_unlikely_cbranch): New function.
> (aarch64_add_128bit_scratch_regs): New function.
> (aarch64_subv_128bit_scratch_regs): New function.
> (aarch64_expand_subvti): New function.
> * config/aarch64/aarch64.md (addv4, uaddv4): New.
> (addti3): Create simpler code if low part is already known to be 0.
> (addvti4, uaddvti4): New.
> (*add3_compareC_cconly_imm): New.
> (*add3_compareC_cconly): New.
> (*add3_compareC_imm): New.
> (*add3_compareC): Rename from add3_compare1; do not
> handle constants within this pattern.
> (*add3_compareV_cconly_imm): New.
> (*add3_compareV_cconly): New.
> (*add3_compareV_imm): New.
> (add3_compareV): New.
> (add3_carryinC, add3_carryinV): New.
> (*add3_carryinC_zero, *add3_carryinV_zero): New.
> (*add3_carryinC, *add3_carryinV): New.
> (subv4, usubv4): New.
> (subti): Handle op1 zero.
> (subvti4, usub4ti4): New.
> (*sub3_compare1_imm): New.
> (sub3_carryinCV): New.
> (*sub3_carryinCV_z1_z2, *sub3_carryinCV_z1): New.
> (*sub3_carryinCV_z2, *sub3_carryinCV): New.
> * testsuite/gcc.target/arm/builtin_sadd_128.c: New testcase.
> * testsuite/gcc.target/arm/builtin_saddl.c: New testcase.
> * testsuite/gcc.target/arm/builtin_saddll.c: New testcase.
> * testsuite/gcc.target/arm/builtin_uadd_128.c: New testcase.
> * testsuite/gcc.target/arm/builtin_uaddl.c: New testcase.
> * testsuite/gcc.target/arm/builtin_uaddll.c: New testcase.
> * testsuite/gcc.target/arm/builtin_ssub_128.c: New testcase.
> * testsuite/gcc.target/arm/builtin_ssubl.c: New testcase.
> * testsuite/gcc.target/arm/builtin_ssubll.c: New testcase.
> * testsuite/gcc.target/arm/builtin_usub_128.c: New testcase.
> * testsuite/gcc.target/arm/builtin_usubl.c: New testcase.
> * testsuite/gcc.target/arm/builtin_usubll.c: New testcase.

I've tried your patch, and 2 of the new tests FAIL:
gcc.target/aarch64/builtin_sadd_128.c scan-assembler addcs
gcc.target/aarch64/builtin_uadd_128.c scan-assembler addcs

Am I missing something?

Thanks,

Christophe


pr6308v2.patch
Description: pr6308v2.patch


Re: [PATCH][Aarch64] Add support for overflow add and sub operations

2017-05-19 Thread Christophe Lyon
Hi Michael,


On 19 May 2017 at 07:12, Michael Collison  wrote:
> Hi,
>
> This patch improves code generation for builtin arithmetic overflow
> operations for the aarch64 backend. As an example, for a simple test case such
> as:
>
> int
> f (int x, int y, int *ovf)
> {
>   int res;
>   *ovf = __builtin_sadd_overflow (x, y, &res);
>   return res;
> }
>
> Current trunk at -O2 generates
>
> f:
> mov w3, w0
> mov w4, 0
> add w0, w0, w1
> tbnz w1, #31, .L4
> cmp w0, w3
> blt .L3
> .L2:
> str w4, [x2]
> ret
> .p2align 3
> .L4:
> cmp w0, w3
> ble .L2
> .L3:
> mov w4, 1
> b   .L2
>
>
> With the patch this now generates:
>
> f:
> adds w0, w0, w1
> cset w1, vs
> str w1, [x2]
> ret
>
>
> Original patch from Richard Henderson:
>
> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01903.html
>
>
> Okay for trunk?
>
> 2017-05-17  Michael Collison  
> Richard Henderson 
>
> * config/aarch64/aarch64-modes.def (CC_V): New.
> * config/aarch64/aarch64-protos.h
> (aarch64_add_128bit_scratch_regs): Declare
> (aarch64_add_128bit_scratch_regs): Declare.
> (aarch64_expand_subvti): Declare.
> (aarch64_gen_unlikely_cbranch): Declare
> * config/aarch64/aarch64.c (aarch64_select_cc_mode): Test
> for signed overflow using CC_Vmode.
> (aarch64_get_condition_code_1): Handle CC_Vmode.
> (aarch64_gen_unlikely_cbranch): New function.
> (aarch64_add_128bit_scratch_regs): New function.
> (aarch64_subv_128bit_scratch_regs): New function.
> (aarch64_expand_subvti): New function.
> * config/aarch64/aarch64.md (addv4, uaddv4): New.
> (addti3): Create simpler code if low part is already known to be 0.
> (addvti4, uaddvti4): New.
> (*add3_compareC_cconly_imm): New.
> (*add3_compareC_cconly): New.
> (*add3_compareC_imm): New.
> (*add3_compareC): Rename from add3_compare1; do not
> handle constants within this pattern.
> (*add3_compareV_cconly_imm): New.
> (*add3_compareV_cconly): New.
> (*add3_compareV_imm): New.
> (add3_compareV): New.
> (add3_carryinC, add3_carryinV): New.
> (*add3_carryinC_zero, *add3_carryinV_zero): New.
> (*add3_carryinC, *add3_carryinV): New.
> (subv4, usubv4): New.
> (subti): Handle op1 zero.
> (subvti4, usub4ti4): New.
> (*sub3_compare1_imm): New.
> (sub3_carryinCV): New.
> (*sub3_carryinCV_z1_z2, *sub3_carryinCV_z1): New.
> (*sub3_carryinCV_z2, *sub3_carryinCV): New.
> * testsuite/gcc.target/arm/builtin_sadd_128.c: New testcase.
> * testsuite/gcc.target/arm/builtin_saddl.c: New testcase.
> * testsuite/gcc.target/arm/builtin_saddll.c: New testcase.
> * testsuite/gcc.target/arm/builtin_uadd_128.c: New testcase.
> * testsuite/gcc.target/arm/builtin_uaddl.c: New testcase.
> * testsuite/gcc.target/arm/builtin_uaddll.c: New testcase.
> * testsuite/gcc.target/arm/builtin_ssub_128.c: New testcase.
> * testsuite/gcc.target/arm/builtin_ssubl.c: New testcase.
> * testsuite/gcc.target/arm/builtin_ssubll.c: New testcase.
> * testsuite/gcc.target/arm/builtin_usub_128.c: New testcase.
> * testsuite/gcc.target/arm/builtin_usubl.c: New testcase.
> * testsuite/gcc.target/arm/builtin_usubll.c: New testcase.

I've tried your patch, and 2 of the new tests FAIL:
gcc.target/aarch64/builtin_sadd_128.c scan-assembler addcs
gcc.target/aarch64/builtin_uadd_128.c scan-assembler addcs

Am I missing something?

Thanks,

Christophe


[PATCH][Aarch64] Add support for overflow add and sub operations

2017-05-18 Thread Michael Collison
Hi,

This patch improves code generation for builtin arithmetic overflow operations
for the aarch64 backend. As an example, for a simple test case such as:

int
f (int x, int y, int *ovf)
{
  int res;
  *ovf = __builtin_sadd_overflow (x, y, &res);
  return res;
}

Current trunk at -O2 generates

f:
mov w3, w0
mov w4, 0
add w0, w0, w1
tbnz w1, #31, .L4
cmp w0, w3
blt .L3
.L2:
str w4, [x2]
ret
.p2align 3
.L4:
cmp w0, w3
ble .L2
.L3:
mov w4, 1
b   .L2


With the patch this now generates:

f:
adds w0, w0, w1
cset w1, vs
str w1, [x2]
ret


Original patch from Richard Henderson:

https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01903.html


Okay for trunk?

2017-05-17  Michael Collison  
Richard Henderson 

* config/aarch64/aarch64-modes.def (CC_V): New.
* config/aarch64/aarch64-protos.h
(aarch64_add_128bit_scratch_regs): Declare
(aarch64_add_128bit_scratch_regs): Declare.
(aarch64_expand_subvti): Declare.
(aarch64_gen_unlikely_cbranch): Declare
* config/aarch64/aarch64.c (aarch64_select_cc_mode): Test
for signed overflow using CC_Vmode.
(aarch64_get_condition_code_1): Handle CC_Vmode.
(aarch64_gen_unlikely_cbranch): New function.
(aarch64_add_128bit_scratch_regs): New function.
(aarch64_subv_128bit_scratch_regs): New function.
(aarch64_expand_subvti): New function.
* config/aarch64/aarch64.md (addv4, uaddv4): New.
(addti3): Create simpler code if low part is already known to be 0.
(addvti4, uaddvti4): New.
(*add3_compareC_cconly_imm): New.
(*add3_compareC_cconly): New.
(*add3_compareC_imm): New.
(*add3_compareC): Rename from add3_compare1; do not
handle constants within this pattern.
(*add3_compareV_cconly_imm): New.
(*add3_compareV_cconly): New.
(*add3_compareV_imm): New.
(add3_compareV): New.
(add3_carryinC, add3_carryinV): New.
(*add3_carryinC_zero, *add3_carryinV_zero): New.
(*add3_carryinC, *add3_carryinV): New.
(subv4, usubv4): New.
(subti): Handle op1 zero.
(subvti4, usubvti4): New.
(*sub3_compare1_imm): New.
(sub3_carryinCV): New.
(*sub3_carryinCV_z1_z2, *sub3_carryinCV_z1): New.
(*sub3_carryinCV_z2, *sub3_carryinCV): New.
* testsuite/gcc.target/arm/builtin_sadd_128.c: New testcase.
* testsuite/gcc.target/arm/builtin_saddl.c: New testcase.
* testsuite/gcc.target/arm/builtin_saddll.c: New testcase.
* testsuite/gcc.target/arm/builtin_uadd_128.c: New testcase.
* testsuite/gcc.target/arm/builtin_uaddl.c: New testcase.
* testsuite/gcc.target/arm/builtin_uaddll.c: New testcase.
* testsuite/gcc.target/arm/builtin_ssub_128.c: New testcase.
* testsuite/gcc.target/arm/builtin_ssubl.c: New testcase.
* testsuite/gcc.target/arm/builtin_ssubll.c: New testcase.
* testsuite/gcc.target/arm/builtin_usub_128.c: New testcase.
* testsuite/gcc.target/arm/builtin_usubl.c: New testcase.
* testsuite/gcc.target/arm/builtin_usubll.c: New testcase.


PR6308.patch
Description: PR6308.patch


[Ping^2][PATCH][Aarch64] Add support for overflow add and sub operations

2017-01-12 Thread Michael Collison

Ping.

Link to original post: https://gcc.gnu.org/ml/gcc-patches/2016-11/msg03119.html


[Ping][PATCH][Aarch64] Add support for overflow add and sub operations

2016-12-08 Thread Michael Collison
Ping.

Link to original post: https://gcc.gnu.org/ml/gcc-patches/2016-11/msg03119.html


[PATCH][Aarch64] Add support for overflow add and sub operations

2016-11-30 Thread Michael Collison
Hi,

This patch improves code generation for builtin arithmetic overflow operations
for the aarch64 backend. As an example, for a simple test case such as:

int
f (int x, int y, int *ovf)
{
  int res;
  *ovf = __builtin_sadd_overflow (x, y, &res);
  return res;
}

Current trunk at -O2 generates

f:
mov w3, w0
mov w4, 0
add w0, w0, w1
tbnz w1, #31, .L4
cmp w0, w3
blt .L3
.L2:
str w4, [x2]
ret
.p2align 3
.L4:
cmp w0, w3
ble .L2
.L3:
mov w4, 1
b   .L2


With the patch this now generates:

f:
adds w0, w0, w1
cset w1, vs
str w1, [x2]
ret

Tested on aarch64-linux-gnu with no regressions. Okay for trunk?


2016-11-30  Michael Collison  
Richard Henderson 

* config/aarch64/aarch64-modes.def (CC_V): New.
* config/aarch64/aarch64.c (aarch64_select_cc_mode): Test
for signed overflow using CC_Vmode.
(aarch64_get_condition_code_1): Handle CC_Vmode.
* config/aarch64/aarch64.md (addv4, uaddv4): New.
(addti3): Create simpler code if low part is already known to be 0.
(addvti4, uaddvti4): New.
(*add3_compareC_cconly_imm): New.
(*add3_compareC_cconly): New.
(*add3_compareC_imm): New.
(*add3_compareC): Rename from add3_compare1; do not
handle constants within this pattern.
(*add3_compareV_cconly_imm): New.
(*add3_compareV_cconly): New.
(*add3_compareV_imm): New.
(add3_compareV): New.
(add3_carryinC, add3_carryinV): New.
(*add3_carryinC_zero, *add3_carryinV_zero): New.
(*add3_carryinC, *add3_carryinV): New.
(subv4, usubv4): New.
(subti): Handle op1 zero.
(subvti4, usub4ti4): New.
(*sub3_compare1_imm): New.
(sub3_carryinCV): New.
(*sub3_carryinCV_z1_z2, *sub3_carryinCV_z1): New.
(*sub3_carryinCV_z2, *sub3_carryinCV): New


rth_overflow_ipreview1.patch
Description: rth_overflow_ipreview1.patch