Re: [PATCH][AArch64] Improve csinc/csneg/csinv opportunities on immediates

2015-07-24 Thread Kyrill Tkachov


On 17/07/15 20:00, pins...@gmail.com wrote:





On Jul 17, 2015, at 9:58 PM, Kyrill Tkachov  wrote:



On 10/07/15 14:45, Kyrill Tkachov wrote:

On 10/07/15 10:00, pins...@gmail.com wrote:



On Jul 10, 2015, at 1:47 AM, Kyrill Tkachov  wrote:

Hi Andrew,


On 10/07/15 09:40, pins...@gmail.com wrote:




On Jul 10, 2015, at 1:34 AM, Kyrill Tkachov  wrote:

Hi all,

Currently when evaluating expressions like (a ? 24 : 25) we will move 24 and 25 
into
registers and perform a csel on them.  This misses the opportunity to instead 
move just 24
into a register and then perform a csinc, saving us an instruction and a 
register use.
Similarly for csneg and csinv.

This patch implements that idea by allowing such pairs of immediates in 
*cmov_insn
and adding an early splitter that performs the necessary transformation.

The testcase included in the patch demonstrates the kind of opportunities that 
are now picked up.

With this patch I see about 9.6% more csinc instructions being generated for 
SPEC2006
and the generated code looks objectively better (i.e. fewer mov-immediates and 
slightly
lower register pressure).

Bootstrapped and tested on aarch64.

Ok for trunk?

I think this is the wrong place for this optimization. It should happen in 
expr.c and we should produce cond_expr on the gimple level.

I had considered it, but I wasn't sure how general the conditional 
increment/negate/inverse operations
are to warrant a midend implementation. Do you mean the 
expand_cond_expr_using_cmove function in expr.c?

Yes and we can expand it to even have a target hook on how to expand them if 
needed.

I played around in that part and it seems that by the time it gets to expansion 
the midend
doesn't have a cond_expr of the two immediates, it's a PHI node with the 
immediates already expanded.
I have not been able to get it to match a cond_expr of two immediates there, 
although that could be
because I'm unfamiliar with that part of the codebase.

So by the time we reach expansion time we don't have a COND_EXPR of two 
immediates, so I tried getting
the code in expr.c to do the right thing, but it didn't work out.
This patch catches this opportunity at the RTL level and could catch such cases 
if they were to be
generated by any of the pre-combine RTL passes. Or do you reckon looking for 
these patterns in RTL
ifcvt is the way to go? I think it would be somewhat messy to express the 
CSNEG, CSINV opportunities
there as we don't have optabs for conditional negate and invert, but 
conditional increment would work,
though in the aarch64 case we can only do a conditional by 1 rather than a 
general conditional add.

Right as I said, I have a patch to phiopt to produce the cond_expr when it is 
useful. That is create cond_expr before we even get to rtl.


Ok, and if that's done, then we should be able to catch them in 
expand_cond_expr_using_cmove
and do more optimal target-specific tricks there.
If you already have something for phiopt would it be possible to send it to 
gcc-patches so we can
play around with the expr.c part of this problem? Even a rough version would be 
helpful.

Thanks,
Kyrill




Thanks,
Andrew



Kyrill



Kyrill


There is already a standard pattern for condition add so the a ? Const1 : 
const2 can be handled in the a generic way without much troubles. We should 
handle it better in rtl  ifcvt too (that should be an easier patch). The neg 
and not cases are very target specific but can be handled by a target hook and 
expand it directly to it.



   I have patches to do both but I have not got around to cleaning them up. If 
anyone wants them, I can send a link to my current gcc 5.1 sources with them 
included.

Any chance you can post them on gcc-patches even as a rough idea of what needs 
to be done?

I posted my expr patch a few years ago but I never got around to rth's 
comments. This was the generic increment patch. Basically aarch64 should be 
implementing that pattern too.


The main reason why this should be handled in gimple is that ifcvt on the rtl 
level is not cheap and does not catch all of the cases the simple expansion of 
phi-opt does. I can dig that patch up and I will be doing that next week 
anyways.

Thanks,
Andrew


Thanks,
Kyrill


   Thanks,
Andrew


Thanks,
Kyrill

2015-07-10  Kyrylo Tkachov  

 * config/aarch64/aarch64.md (*cmov_insn): Move stricter
 check for operands 3 and 4 to pattern predicate.  Allow immediates
 that can be expressed as csinc/csneg/csinv.  New define_split.
 (*csinv3_insn): Rename to...
 (csinv3_insn): ... This.
 * config/aarch64/aarch64.h (AARCH64_IMMS_OK_FOR_CSNEG): New macro.
 (AARCH64_IMMS_OK_FOR_CSINC): Likewise.
 (AARCH64_IMMS_OK_FOR_CSINV): Likewise.
 * config/aarch64/aarch64.c (aarch64_imms_ok_for_cond_op_1):
 New function.
 (aarch64_imms_ok_for_cond_op): Likewise.
 * config/aarch64/aarch64-protos.h (aarch64_imms_ok_for_cond_op_1):
 Declare prototype.
 (aarch64_imms_ok_for_cond_op): Likewise

Re: [PATCH][AArch64] Improve csinc/csneg/csinv opportunities on immediates

2015-07-17 Thread pinskia




> On Jul 17, 2015, at 9:58 PM, Kyrill Tkachov  wrote:
> 
> 
>> On 10/07/15 14:45, Kyrill Tkachov wrote:
>>> On 10/07/15 10:00, pins...@gmail.com wrote:
>>> 
>>> 
 On Jul 10, 2015, at 1:47 AM, Kyrill Tkachov  wrote:
 
 Hi Andrew,
 
> On 10/07/15 09:40, pins...@gmail.com wrote:
> 
> 
> 
>> On Jul 10, 2015, at 1:34 AM, Kyrill Tkachov  
>> wrote:
>> 
>> Hi all,
>> 
>> Currently when evaluating expressions like (a ? 24 : 25) we will move 24 
>> and 25 into
>> registers and perform a csel on them.  This misses the opportunity to 
>> instead move just 24
>> into a register and then perform a csinc, saving us an instruction and a 
>> register use.
>> Similarly for csneg and csinv.
>> 
>> This patch implements that idea by allowing such pairs of immediates in 
>> *cmov_insn
>> and adding an early splitter that performs the necessary transformation.
>> 
>> The testcase included in the patch demonstrates the kind of 
>> opportunities that are now picked up.
>> 
>> With this patch I see about 9.6% more csinc instructions being generated 
>> for SPEC2006
>> and the generated code looks objectively better (i.e. fewer 
>> mov-immediates and slightly
>> lower register pressure).
>> 
>> Bootstrapped and tested on aarch64.
>> 
>> Ok for trunk?
> I think this is the wrong place for this optimization. It should happen 
> in expr.c and we should produce cond_expr on the gimple level.
 I had considered it, but I wasn't sure how general the conditional 
 increment/negate/inverse operations
 are to warrant a midend implementation. Do you mean the 
 expand_cond_expr_using_cmove function in expr.c?
>>> Yes and we can expand it to even have a target hook on how to expand them 
>>> if needed.
>> I played around in that part and it seems that by the time it gets to 
>> expansion the midend
>> doesn't have a cond_expr of the two immediates, it's a PHI node with the 
>> immediates already expanded.
>> I have not been able to get it to match a cond_expr of two immediates there, 
>> although that could be
>> because I'm unfamiliar with that part of the codebase.
> 
> So by the time we reach expansion time we don't have a COND_EXPR of two 
> immediates, so I tried getting
> the code in expr.c to do the right thing, but it didn't work out.
> This patch catches this opportunity at the RTL level and could catch such 
> cases if they were to be
> generated by any of the pre-combine RTL passes. Or do you reckon looking for 
> these patterns in RTL
> ifcvt is the way to go? I think it would be somewhat messy to express the 
> CSNEG, CSINV opportunities
> there as we don't have optabs for conditional negate and invert, but 
> conditional increment would work,
> though in the aarch64 case we can only do a conditional by 1 rather than a 
> general conditional add.

Right as I said, I have a patch to phiopt to produce the cond_expr when it is 
useful. That is create cond_expr before we even get to rtl. 

Thanks,
Andrew


> 
> Kyrill
> 
> 
>> 
>> Kyrill
>> 
>>> There is already a standard pattern for condition add so the a ? Const1 : 
>>> const2 can be handled in the a generic way without much troubles. We should 
>>> handle it better in rtl  ifcvt too (that should be an easier patch). The 
>>> neg and not cases are very target specific but can be handled by a target 
>>> hook and expand it directly to it.
>>> 
>>> 
>   I have patches to do both but I have not got around to cleaning them 
> up. If anyone wants them, I can send a link to my current gcc 5.1 sources 
> with them included.
 Any chance you can post them on gcc-patches even as a rough idea of what 
 needs to be done?
>>> I posted my expr patch a few years ago but I never got around to rth's 
>>> comments. This was the generic increment patch. Basically aarch64 should be 
>>> implementing that pattern too.
>>> 
>>> 
>>> The main reason why this should be handled in gimple is that ifcvt on the 
>>> rtl level is not cheap and does not catch all of the cases the simple 
>>> expansion of phi-opt does. I can dig that patch up and I will be doing that 
>>> next week anyways.
>>> 
>>> Thanks,
>>> Andrew
>>> 
 Thanks,
 Kyrill
 
>   Thanks,
> Andrew
> 
>> Thanks,
>> Kyrill
>> 
>> 2015-07-10  Kyrylo Tkachov  
>> 
>> * config/aarch64/aarch64.md (*cmov_insn): Move stricter
>> check for operands 3 and 4 to pattern predicate.  Allow immediates
>> that can be expressed as csinc/csneg/csinv.  New define_split.
>> (*csinv3_insn): Rename to...
>> (csinv3_insn): ... This.
>> * config/aarch64/aarch64.h (AARCH64_IMMS_OK_FOR_CSNEG): New macro.
>> (AARCH64_IMMS_OK_FOR_CSINC): Likewise.
>> (AARCH64_IMMS_OK_FOR_CSINV): Likewise.
>> * config/aarch64/aarch64.c (aarch64_imms_ok_for_cond_op_1):
>> New fun

Re: [PATCH][AArch64] Improve csinc/csneg/csinv opportunities on immediates

2015-07-17 Thread Kyrill Tkachov


On 10/07/15 14:45, Kyrill Tkachov wrote:

On 10/07/15 10:00, pins...@gmail.com wrote:




On Jul 10, 2015, at 1:47 AM, Kyrill Tkachov  wrote:

Hi Andrew,


On 10/07/15 09:40, pins...@gmail.com wrote:




On Jul 10, 2015, at 1:34 AM, Kyrill Tkachov  wrote:

Hi all,

Currently when evaluating expressions like (a ? 24 : 25) we will move 24 and 25 
into
registers and perform a csel on them.  This misses the opportunity to instead 
move just 24
into a register and then perform a csinc, saving us an instruction and a 
register use.
Similarly for csneg and csinv.

This patch implements that idea by allowing such pairs of immediates in 
*cmov_insn
and adding an early splitter that performs the necessary transformation.

The testcase included in the patch demonstrates the kind of opportunities that 
are now picked up.

With this patch I see about 9.6% more csinc instructions being generated for 
SPEC2006
and the generated code looks objectively better (i.e. fewer mov-immediates and 
slightly
lower register pressure).

Bootstrapped and tested on aarch64.

Ok for trunk?

I think this is the wrong place for this optimization. It should happen in 
expr.c and we should produce cond_expr on the gimple level.

I had considered it, but I wasn't sure how general the conditional 
increment/negate/inverse operations
are to warrant a midend implementation. Do you mean the 
expand_cond_expr_using_cmove function in expr.c?

Yes and we can expand it to even have a target hook on how to expand them if 
needed.

I played around in that part and it seems that by the time it gets to expansion 
the midend
doesn't have a cond_expr of the two immediates, it's a PHI node with the 
immediates already expanded.
I have not been able to get it to match a cond_expr of two immediates there, 
although that could be
because I'm unfamiliar with that part of the codebase.


So by the time we reach expansion time we don't have a COND_EXPR of two 
immediates, so I tried getting
the code in expr.c to do the right thing, but it didn't work out.
This patch catches this opportunity at the RTL level and could catch such cases 
if they were to be
generated by any of the pre-combine RTL passes. Or do you reckon looking for 
these patterns in RTL
ifcvt is the way to go? I think it would be somewhat messy to express the 
CSNEG, CSINV opportunities
there as we don't have optabs for conditional negate and invert, but 
conditional increment would work,
though in the aarch64 case we can only do a conditional by 1 rather than a 
general conditional add.

Kyrill




Kyrill


There is already a standard pattern for condition add so the a ? Const1 : 
const2 can be handled in the a generic way without much troubles. We should 
handle it better in rtl  ifcvt too (that should be an easier patch). The neg 
and not cases are very target specific but can be handled by a target hook and 
expand it directly to it.


   
I have patches to do both but I have not got around to cleaning them up. If anyone wants them, I can send a link to my current gcc 5.1 sources with them included.

Any chance you can post them on gcc-patches even as a rough idea of what needs 
to be done?

I posted my expr patch a few years ago but I never got around to rth's 
comments. This was the generic increment patch. Basically aarch64 should be 
implementing that pattern too.


The main reason why this should be handled in gimple is that ifcvt on the rtl 
level is not cheap and does not catch all of the cases the simple expansion of 
phi-opt does. I can dig that patch up and I will be doing that next week 
anyways.

Thanks,
Andrew


Thanks,
Kyrill

   
Thanks,

Andrew


Thanks,
Kyrill

2015-07-10  Kyrylo Tkachov  

 * config/aarch64/aarch64.md (*cmov_insn): Move stricter
 check for operands 3 and 4 to pattern predicate.  Allow immediates
 that can be expressed as csinc/csneg/csinv.  New define_split.
 (*csinv3_insn): Rename to...
 (csinv3_insn): ... This.
 * config/aarch64/aarch64.h (AARCH64_IMMS_OK_FOR_CSNEG): New macro.
 (AARCH64_IMMS_OK_FOR_CSINC): Likewise.
 (AARCH64_IMMS_OK_FOR_CSINV): Likewise.
 * config/aarch64/aarch64.c (aarch64_imms_ok_for_cond_op_1):
 New function.
 (aarch64_imms_ok_for_cond_op): Likewise.
 * config/aarch64/aarch64-protos.h (aarch64_imms_ok_for_cond_op_1):
 Declare prototype.
 (aarch64_imms_ok_for_cond_op): Likewise.

2015-07-10  Kyrylo Tkachov  

 * gcc.target/aarch64/cond-op-imm_1.c: New test.





Re: [PATCH][AArch64] Improve csinc/csneg/csinv opportunities on immediates

2015-07-10 Thread Kyrill Tkachov


On 10/07/15 10:00, pins...@gmail.com wrote:





On Jul 10, 2015, at 1:47 AM, Kyrill Tkachov  wrote:

Hi Andrew,


On 10/07/15 09:40, pins...@gmail.com wrote:




On Jul 10, 2015, at 1:34 AM, Kyrill Tkachov  wrote:

Hi all,

Currently when evaluating expressions like (a ? 24 : 25) we will move 24 and 25 
into
registers and perform a csel on them.  This misses the opportunity to instead 
move just 24
into a register and then perform a csinc, saving us an instruction and a 
register use.
Similarly for csneg and csinv.

This patch implements that idea by allowing such pairs of immediates in 
*cmov_insn
and adding an early splitter that performs the necessary transformation.

The testcase included in the patch demonstrates the kind of opportunities that 
are now picked up.

With this patch I see about 9.6% more csinc instructions being generated for 
SPEC2006
and the generated code looks objectively better (i.e. fewer mov-immediates and 
slightly
lower register pressure).

Bootstrapped and tested on aarch64.

Ok for trunk?

I think this is the wrong place for this optimization. It should happen in 
expr.c and we should produce cond_expr on the gimple level.

I had considered it, but I wasn't sure how general the conditional 
increment/negate/inverse operations
are to warrant a midend implementation. Do you mean the 
expand_cond_expr_using_cmove function in expr.c?

Yes and we can expand it to even have a target hook on how to expand them if 
needed.


I played around in that part and it seems that by the time it gets to expansion 
the midend
doesn't have a cond_expr of the two immediates, it's a PHI node with the 
immediates already expanded.
I have not been able to get it to match a cond_expr of two immediates there, 
although that could be
because I'm unfamiliar with that part of the codebase.

Kyrill



There is already a standard pattern for condition add so the a ? Const1 : 
const2 can be handled in the a generic way without much troubles. We should 
handle it better in rtl  ifcvt too (that should be an easier patch). The neg 
and not cases are very target specific but can be handled by a target hook and 
expand it directly to it.


  
I have patches to do both but I have not got around to cleaning them up. If anyone wants them, I can send a link to my current gcc 5.1 sources with them included.

Any chance you can post them on gcc-patches even as a rough idea of what needs 
to be done?


I posted my expr patch a few years ago but I never got around to rth's 
comments. This was the generic increment patch. Basically aarch64 should be 
implementing that pattern too.


The main reason why this should be handled in gimple is that ifcvt on the rtl 
level is not cheap and does not catch all of the cases the simple expansion of 
phi-opt does. I can dig that patch up and I will be doing that next week 
anyways.

Thanks,
Andrew


Thanks,
Kyrill

  
Thanks,

Andrew


Thanks,
Kyrill

2015-07-10  Kyrylo Tkachov  

* config/aarch64/aarch64.md (*cmov_insn): Move stricter
check for operands 3 and 4 to pattern predicate.  Allow immediates
that can be expressed as csinc/csneg/csinv.  New define_split.
(*csinv3_insn): Rename to...
(csinv3_insn): ... This.
* config/aarch64/aarch64.h (AARCH64_IMMS_OK_FOR_CSNEG): New macro.
(AARCH64_IMMS_OK_FOR_CSINC): Likewise.
(AARCH64_IMMS_OK_FOR_CSINV): Likewise.
* config/aarch64/aarch64.c (aarch64_imms_ok_for_cond_op_1):
New function.
(aarch64_imms_ok_for_cond_op): Likewise.
* config/aarch64/aarch64-protos.h (aarch64_imms_ok_for_cond_op_1):
Declare prototype.
(aarch64_imms_ok_for_cond_op): Likewise.

2015-07-10  Kyrylo Tkachov  

* gcc.target/aarch64/cond-op-imm_1.c: New test.





Re: [PATCH][AArch64] Improve csinc/csneg/csinv opportunities on immediates

2015-07-10 Thread pinskia




> On Jul 10, 2015, at 1:47 AM, Kyrill Tkachov  wrote:
> 
> Hi Andrew,
> 
>> On 10/07/15 09:40, pins...@gmail.com wrote:
>> 
>> 
>> 
>>> On Jul 10, 2015, at 1:34 AM, Kyrill Tkachov  wrote:
>>> 
>>> Hi all,
>>> 
>>> Currently when evaluating expressions like (a ? 24 : 25) we will move 24 
>>> and 25 into
>>> registers and perform a csel on them.  This misses the opportunity to 
>>> instead move just 24
>>> into a register and then perform a csinc, saving us an instruction and a 
>>> register use.
>>> Similarly for csneg and csinv.
>>> 
>>> This patch implements that idea by allowing such pairs of immediates in 
>>> *cmov_insn
>>> and adding an early splitter that performs the necessary transformation.
>>> 
>>> The testcase included in the patch demonstrates the kind of opportunities 
>>> that are now picked up.
>>> 
>>> With this patch I see about 9.6% more csinc instructions being generated 
>>> for SPEC2006
>>> and the generated code looks objectively better (i.e. fewer mov-immediates 
>>> and slightly
>>> lower register pressure).
>>> 
>>> Bootstrapped and tested on aarch64.
>>> 
>>> Ok for trunk?
>> I think this is the wrong place for this optimization. It should happen in 
>> expr.c and we should produce cond_expr on the gimple level.
> 
> I had considered it, but I wasn't sure how general the conditional 
> increment/negate/inverse operations
> are to warrant a midend implementation. Do you mean the 
> expand_cond_expr_using_cmove function in expr.c?

Yes and we can expand it to even have a target hook on how to expand them if 
needed. 

There is already a standard pattern for condition add so the a ? Const1 : 
const2 can be handled in the a generic way without much troubles. We should 
handle it better in rtl  ifcvt too (that should be an easier patch). The neg 
and not cases are very target specific but can be handled by a target hook and 
expand it directly to it. 


> 
>>  
>> I have patches to do both but I have not got around to cleaning them up. If 
>> anyone wants them, I can send a link to my current gcc 5.1 sources with them 
>> included.
> 
> Any chance you can post them on gcc-patches even as a rough idea of what 
> needs to be done?


I posted my expr patch a few years ago but I never got around to rth's 
comments. This was the generic increment patch. Basically aarch64 should be 
implementing that pattern too. 


The main reason why this should be handled in gimple is that ifcvt on the rtl 
level is not cheap and does not catch all of the cases the simple expansion of 
phi-opt does. I can dig that patch up and I will be doing that next week 
anyways. 

Thanks,
Andrew

> 
> Thanks,
> Kyrill
> 
>>  
>> Thanks,
>> Andrew
>> 
>>> Thanks,
>>> Kyrill
>>> 
>>> 2015-07-10  Kyrylo Tkachov  
>>> 
>>>* config/aarch64/aarch64.md (*cmov_insn): Move stricter
>>>check for operands 3 and 4 to pattern predicate.  Allow immediates
>>>that can be expressed as csinc/csneg/csinv.  New define_split.
>>>(*csinv3_insn): Rename to...
>>>(csinv3_insn): ... This.
>>>* config/aarch64/aarch64.h (AARCH64_IMMS_OK_FOR_CSNEG): New macro.
>>>(AARCH64_IMMS_OK_FOR_CSINC): Likewise.
>>>(AARCH64_IMMS_OK_FOR_CSINV): Likewise.
>>>* config/aarch64/aarch64.c (aarch64_imms_ok_for_cond_op_1):
>>>New function.
>>>(aarch64_imms_ok_for_cond_op): Likewise.
>>>* config/aarch64/aarch64-protos.h (aarch64_imms_ok_for_cond_op_1):
>>>Declare prototype.
>>>(aarch64_imms_ok_for_cond_op): Likewise.
>>> 
>>> 2015-07-10  Kyrylo Tkachov  
>>> 
>>>* gcc.target/aarch64/cond-op-imm_1.c: New test.
>>> 
> 


Re: [PATCH][AArch64] Improve csinc/csneg/csinv opportunities on immediates

2015-07-10 Thread Kyrill Tkachov

Hi Andrew,

On 10/07/15 09:40, pins...@gmail.com wrote:





On Jul 10, 2015, at 1:34 AM, Kyrill Tkachov  wrote:

Hi all,

Currently when evaluating expressions like (a ? 24 : 25) we will move 24 and 25 
into
registers and perform a csel on them.  This misses the opportunity to instead 
move just 24
into a register and then perform a csinc, saving us an instruction and a 
register use.
Similarly for csneg and csinv.

This patch implements that idea by allowing such pairs of immediates in 
*cmov_insn
and adding an early splitter that performs the necessary transformation.

The testcase included in the patch demonstrates the kind of opportunities that 
are now picked up.

With this patch I see about 9.6% more csinc instructions being generated for 
SPEC2006
and the generated code looks objectively better (i.e. fewer mov-immediates and 
slightly
lower register pressure).

Bootstrapped and tested on aarch64.

Ok for trunk?

I think this is the wrong place for this optimization. It should happen in 
expr.c and we should produce cond_expr on the gimple level.


I had considered it, but I wasn't sure how general the conditional 
increment/negate/inverse operations
 are to warrant a midend implementation. Do you mean the 
expand_cond_expr_using_cmove function in expr.c?

  


I have patches to do both but I have not got around to cleaning them up. If 
anyone wants them, I can send a link to my current gcc 5.1 sources with them 
included.


Any chance you can post them on gcc-patches even as a rough idea of what needs 
to be done?

Thanks,
Kyrill

  


Thanks,
Andrew


Thanks,
Kyrill

2015-07-10  Kyrylo Tkachov  

* config/aarch64/aarch64.md (*cmov_insn): Move stricter
check for operands 3 and 4 to pattern predicate.  Allow immediates
that can be expressed as csinc/csneg/csinv.  New define_split.
(*csinv3_insn): Rename to...
(csinv3_insn): ... This.
* config/aarch64/aarch64.h (AARCH64_IMMS_OK_FOR_CSNEG): New macro.
(AARCH64_IMMS_OK_FOR_CSINC): Likewise.
(AARCH64_IMMS_OK_FOR_CSINV): Likewise.
* config/aarch64/aarch64.c (aarch64_imms_ok_for_cond_op_1):
New function.
(aarch64_imms_ok_for_cond_op): Likewise.
* config/aarch64/aarch64-protos.h (aarch64_imms_ok_for_cond_op_1):
Declare prototype.
(aarch64_imms_ok_for_cond_op): Likewise.

2015-07-10  Kyrylo Tkachov  

* gcc.target/aarch64/cond-op-imm_1.c: New test.





Re: [PATCH][AArch64] Improve csinc/csneg/csinv opportunities on immediates

2015-07-10 Thread pinskia




> On Jul 10, 2015, at 1:34 AM, Kyrill Tkachov  wrote:
> 
> Hi all,
> 
> Currently when evaluating expressions like (a ? 24 : 25) we will move 24 and 
> 25 into
> registers and perform a csel on them.  This misses the opportunity to instead 
> move just 24
> into a register and then perform a csinc, saving us an instruction and a 
> register use.
> Similarly for csneg and csinv.
> 
> This patch implements that idea by allowing such pairs of immediates in 
> *cmov_insn
> and adding an early splitter that performs the necessary transformation.
> 
> The testcase included in the patch demonstrates the kind of opportunities 
> that are now picked up.
> 
> With this patch I see about 9.6% more csinc instructions being generated for 
> SPEC2006
> and the generated code looks objectively better (i.e. fewer mov-immediates 
> and slightly
> lower register pressure).
> 
> Bootstrapped and tested on aarch64.
> 
> Ok for trunk?

I think this is the wrong place for this optimization. It should happen in 
expr.c and we should produce cond_expr on the gimple level. 

I have patches to do both but I have not got around to cleaning them up. If 
anyone wants them, I can send a link to my current gcc 5.1 sources with them 
included. 

Thanks,
Andrew

> 
> Thanks,
> Kyrill
> 
> 2015-07-10  Kyrylo Tkachov  
> 
>* config/aarch64/aarch64.md (*cmov_insn): Move stricter
>check for operands 3 and 4 to pattern predicate.  Allow immediates
>that can be expressed as csinc/csneg/csinv.  New define_split.
>(*csinv3_insn): Rename to...
>(csinv3_insn): ... This.
>* config/aarch64/aarch64.h (AARCH64_IMMS_OK_FOR_CSNEG): New macro.
>(AARCH64_IMMS_OK_FOR_CSINC): Likewise.
>(AARCH64_IMMS_OK_FOR_CSINV): Likewise.
>* config/aarch64/aarch64.c (aarch64_imms_ok_for_cond_op_1):
>New function.
>(aarch64_imms_ok_for_cond_op): Likewise.
>* config/aarch64/aarch64-protos.h (aarch64_imms_ok_for_cond_op_1):
>Declare prototype.
>(aarch64_imms_ok_for_cond_op): Likewise.
> 
> 2015-07-10  Kyrylo Tkachov  
> 
>* gcc.target/aarch64/cond-op-imm_1.c: New test.
>