Re: [PATCH][AArch64] Replace insn to zero up DF register

2016-03-10 Thread James Greenhalgh
On Thu, Mar 10, 2016 at 10:32:15AM -0600, Evandro Menezes wrote:
> >I agree to postpone until GCC 7.
> >
> >[AArch64] Replace insn to zero up SIMD registers
> >
> >gcc/
> >* config/aarch64/aarch64.md
> >(*movhf_aarch64): Add "movi %0, #0" to zero up register.
> >(*movsf_aarch64): Likewise and add "simd" attributes.
> >(*movdf_aarch64): Likewise.
> >
> >This patch removes the FP attributes from the HF, SF, DF, TF moves.
> 
> And now, with the patch. :-/
> 

Thanks for sticking with it. This is OK for GCC 7 when development
opens.

Remember to mention the most recent changes in your Changelog entry
(Remove "fp" attribute from *movhf_aarch64 and *movtf_aarch64).

Thanks,
James



Re: [PATCH][AArch64] Replace insn to zero up DF register

2016-03-10 Thread Evandro Menezes

On 03/10/16 10:27, Evandro Menezes wrote:

On 03/10/16 07:23, James Greenhalgh wrote:

On Wed, Mar 09, 2016 at 03:35:43PM -0600, Evandro Menezes wrote:

On 03/01/16 13:08, Evandro Menezes wrote:

On 03/01/16 13:02, Wilco Dijkstra wrote:

Evandro Menezes wrote:

The meaning of these attributes are not clear to me.  Is there a
reference somewhere about which insns are FP or SIMD or neither?

The meaning should be clear, "fp" is a floating point
instruction, "simd" a SIMD one
as defined in ARM-ARM.

Indeed, I had to add the Y for the f_mcr insn to match it with 
nosimd.
However, I didn't feel that it should be moved to the right, 
since it's

already disparaged.  Am I missing something detail?

It might not matter for this specific case, but I have seen
reload forcing the very
first alternative without looking at any costs or preferences -
as long as it is legal.
This suggests we need to order alternatives from most preferred
alternative to least
preferred one.

I think it is good enough for commit, James?

Methinks that my issue with those attributes is that I'm not as
fluent in AArch64 as I'd like to be.

Please, feel free to edit the patch changing the order then.

Replace insn to zero up SIMD registers

gcc/
 * config/aarch64/aarch64.md
 (*movhf_aarch64): Add "movi %0, #0" to zero up register.
 (*movsf_aarch64): Likewise and add "simd" and "fp" 
attributes.

 (*movdf_aarch64): Likewise.

Swapped the order of the constraints to favor MOVI.

Just say the word...
I'm wondering whether this is appropriate for GCC6 now that we are so 
late
in the development cycle. Additionally, I have some comments on your 
patch:


diff --git a/gcc/config/aarch64/aarch64.md 
b/gcc/config/aarch64/aarch64.md

index 68676c9..4502a58 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1163,11 +1163,12 @@
  )
(define_insn "*movhf_aarch64"
-  [(set (match_operand:HF 0 "nonimmediate_operand" "=w, 
?r,w,w,m,r,m ,r")
-(match_operand:HF 1 "general_operand"  "?rY, 
w,w,m,w,m,rY,r"))]
+  [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w 
,?r,w,w,m,r,m ,r")
+(match_operand:HF 1 "general_operand"  "Y ,?rY, 
w,w,m,w,m,rY,r"))]

"TARGET_FLOAT && (register_operand (operands[0], HFmode)
  || aarch64_reg_or_fp_zero (operands[1], HFmode))"
"@
+   movi\\t%0.4h, #0
 mov\\t%0.h[0], %w1
 umov\\t%w0, %1.h[0]
 mov\\t%0.h[0], %1.h[0]
@@ -1176,18 +1177,19 @@
 ldrh\\t%w0, %1
 strh\\t%w1, %0
 mov\\t%w0, %w1"
-  [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\
+  [(set_attr "type" "neon_move,neon_from_gp,neon_to_gp,neon_move,\
   f_loads,f_stores,load1,store1,mov_reg")
-   (set_attr "simd" "yes,yes,yes,*,*,*,*,*")
-   (set_attr "fp"   "*,*,*,yes,yes,*,*,*")]
+   (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")
+   (set_attr "fp"   "*,*,*,*,yes,yes,*,*,*")]
  )
(define_insn "*movsf_aarch64"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w  
,w,m,r,m ,r")
-(match_operand:SF 1 "general_operand"  "?rY, 
w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w ,?r,w,w  
,w,m,r,m ,r")
+(match_operand:SF 1 "general_operand"  "Y ,?rY, 
w,w,Ufc,m,w,m,rY,r"))]

"TARGET_FLOAT && (register_operand (operands[0], SFmode)
  || aarch64_reg_or_fp_zero (operands[1], SFmode))"
"@
+   movi\\t%0.2s, #0
 fmov\\t%s0, %w1
 fmov\\t%w0, %s1
 fmov\\t%s0, %s1
@@ -1197,16 +1199,19 @@
 ldr\\t%w0, %1
 str\\t%w1, %0
 mov\\t%w0, %w1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\
- f_loads,f_stores,load1,store1,mov_reg")]
+  [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconsts,\
+ f_loads,f_stores,load1,store1,mov_reg")
+   (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*")
+   (set_attr "fp"   "*,*,*,yes,yes,yes,yes,*,*,*")]
  )

This fp attribute looks wrong to me. The two fmov instructions that move
between core and FP registers should be tagged "yes". However, this is
irrelevant as the whole pattern is guarded by TARGET_FLOAT.

It would be clearer to drop the FP attribute entirely, so as not to give
the erroneous impression that some alternatives in this insn are enabled
for !TARGET_FLOAT.


You mean to remove the FP attribute from all, HF, SF, DF, TF?


  (define_insn "*movdf_aarch64"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w  
,w,m,r,m ,r")
-(match_operand:DF 1 "general_operand"  "?rY, 
w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w ,?r,w,w  
,w,m,r,m ,r")
+(match_operand:DF 1 "general_operand"  "Y ,?rY, 
w,w,Ufc,m,w,m,rY,r"))]

"TARGET_FLOAT && (register_operand (operands[0], DFmode)
  || aarch64_reg_or_fp_zero (operands[1], DFmode))"
"@
+   movi\\t%d0, #0
 fmov\\t%d0, %x1
 fmov\\t%x0, %d1
 fmov\\t%d0, %d1
@@ -1216,8 +1221,10 @@
 ldr\\t%x0, %1
 str\\t%x1, %0
 mov\\t%x0, %x1"
-  

Re: [PATCH][AArch64] Replace insn to zero up DF register

2016-03-10 Thread Evandro Menezes

On 03/10/16 07:23, James Greenhalgh wrote:

On Wed, Mar 09, 2016 at 03:35:43PM -0600, Evandro Menezes wrote:

On 03/01/16 13:08, Evandro Menezes wrote:

On 03/01/16 13:02, Wilco Dijkstra wrote:

Evandro Menezes wrote:

The meaning of these attributes are not clear to me.  Is there a
reference somewhere about which insns are FP or SIMD or neither?

The meaning should be clear, "fp" is a floating point
instruction, "simd" a SIMD one
as defined in ARM-ARM.


Indeed, I had to add the Y for the f_mcr insn to match it with nosimd.
However, I didn't feel that it should be moved to the right, since it's
already disparaged.  Am I missing something detail?

It might not matter for this specific case, but I have seen
reload forcing the very
first alternative without looking at any costs or preferences -
as long as it is legal.
This suggests we need to order alternatives from most preferred
alternative to least
preferred one.

I think it is good enough for commit, James?

Methinks that my issue with those attributes is that I'm not as
fluent in AArch64 as I'd like to be.

Please, feel free to edit the patch changing the order then.

Replace insn to zero up SIMD registers

gcc/
 * config/aarch64/aarch64.md
 (*movhf_aarch64): Add "movi %0, #0" to zero up register.
 (*movsf_aarch64): Likewise and add "simd" and "fp" attributes.
 (*movdf_aarch64): Likewise.

Swapped the order of the constraints to favor MOVI.

Just say the word...

I'm wondering whether this is appropriate for GCC6 now that we are so late
in the development cycle. Additionally, I have some comments on your patch:


diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 68676c9..4502a58 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1163,11 +1163,12 @@
  )
  
  (define_insn "*movhf_aarch64"

-  [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r")
-   (match_operand:HF 1 "general_operand"  "?rY, w,w,m,w,m,rY,r"))]
+  [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w  ,?r,w,w,m,r,m ,r")
+   (match_operand:HF 1 "general_operand"  "Y ,?rY, w,w,m,w,m,rY,r"))]
"TARGET_FLOAT && (register_operand (operands[0], HFmode)
  || aarch64_reg_or_fp_zero (operands[1], HFmode))"
"@
+   movi\\t%0.4h, #0
 mov\\t%0.h[0], %w1
 umov\\t%w0, %1.h[0]
 mov\\t%0.h[0], %1.h[0]
@@ -1176,18 +1177,19 @@
 ldrh\\t%w0, %1
 strh\\t%w1, %0
 mov\\t%w0, %w1"
-  [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\
+  [(set_attr "type" "neon_move,neon_from_gp,neon_to_gp,neon_move,\
   f_loads,f_stores,load1,store1,mov_reg")
-   (set_attr "simd" "yes,yes,yes,*,*,*,*,*")
-   (set_attr "fp"   "*,*,*,yes,yes,*,*,*")]
+   (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")
+   (set_attr "fp"   "*,*,*,*,yes,yes,*,*,*")]
  )
  
  (define_insn "*movsf_aarch64"

-  [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-   (match_operand:SF 1 "general_operand"  "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w  ,?r,w,w  ,w,m,r,m 
,r")
+   (match_operand:SF 1 "general_operand"  "Y ,?rY, 
w,w,Ufc,m,w,m,rY,r"))]
"TARGET_FLOAT && (register_operand (operands[0], SFmode)
  || aarch64_reg_or_fp_zero (operands[1], SFmode))"
"@
+   movi\\t%0.2s, #0
 fmov\\t%s0, %w1
 fmov\\t%w0, %s1
 fmov\\t%s0, %s1
@@ -1197,16 +1199,19 @@
 ldr\\t%w0, %1
 str\\t%w1, %0
 mov\\t%w0, %w1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\
- f_loads,f_stores,load1,store1,mov_reg")]
+  [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconsts,\
+ f_loads,f_stores,load1,store1,mov_reg")
+   (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*")
+   (set_attr "fp"   "*,*,*,yes,yes,yes,yes,*,*,*")]
  )

This fp attribute looks wrong to me. The two fmov instructions that move
between core and FP registers should be tagged "yes". However, this is
irrelevant as the whole pattern is guarded by TARGET_FLOAT.

It would be clearer to drop the FP attribute entirely, so as not to give
the erroneous impression that some alternatives in this insn are enabled
for !TARGET_FLOAT.


You mean to remove the FP attribute from all, HF, SF, DF, TF?


  (define_insn "*movdf_aarch64"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-   (match_operand:DF 1 "general_operand"  "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w  ,?r,w,w  ,w,m,r,m 
,r")
+   (match_operand:DF 1 "general_operand"  "Y ,?rY, 
w,w,Ufc,m,w,m,rY,r"))]
"TARGET_FLOAT && (register_operand (operands[0], DFmode)
  || aarch64_reg_or_fp_zero (operands[1], DFmode))"
"@
+   movi\\t%d0, #0
 fmov\\t%d0, %x1
 fmov\\t%x0, %d1
 fmov\\t%d0, %d1
@@ -1216,8 +1221,10 @@
 ldr\\t%x0, %1
 str\\t%x1, %0
 mov\\t%x0, %x1"
-  [(set_attr "type" 

Re: [PATCH][AArch64] Replace insn to zero up DF register

2016-03-10 Thread James Greenhalgh
On Wed, Mar 09, 2016 at 03:35:43PM -0600, Evandro Menezes wrote:
> On 03/01/16 13:08, Evandro Menezes wrote:
> >On 03/01/16 13:02, Wilco Dijkstra wrote:
> >>Evandro Menezes wrote:
> >>>The meaning of these attributes are not clear to me.  Is there a
> >>>reference somewhere about which insns are FP or SIMD or neither?
> >>The meaning should be clear, "fp" is a floating point
> >>instruction, "simd" a SIMD one
> >>as defined in ARM-ARM.
> >>
> >>>Indeed, I had to add the Y for the f_mcr insn to match it with nosimd.
> >>>However, I didn't feel that it should be moved to the right, since it's
> >>>already disparaged.  Am I missing something detail?
> >>It might not matter for this specific case, but I have seen
> >>reload forcing the very
> >>first alternative without looking at any costs or preferences -
> >>as long as it is legal.
> >>This suggests we need to order alternatives from most preferred
> >>alternative to least
> >>preferred one.
> >>
> >>I think it is good enough for commit, James?
> >
> >Methinks that my issue with those attributes is that I'm not as
> >fluent in AArch64 as I'd like to be.
> >
> >Please, feel free to edit the patch changing the order then.
> 
>Replace insn to zero up SIMD registers
> 
>gcc/
> * config/aarch64/aarch64.md
> (*movhf_aarch64): Add "movi %0, #0" to zero up register.
> (*movsf_aarch64): Likewise and add "simd" and "fp" attributes.
> (*movdf_aarch64): Likewise.
> 
> Swapped the order of the constraints to favor MOVI.
> 
> Just say the word...

I'm wondering whether this is appropriate for GCC6 now that we are so late
in the development cycle. Additionally, I have some comments on your patch:

> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 68676c9..4502a58 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -1163,11 +1163,12 @@
>  )
>  
>  (define_insn "*movhf_aarch64"
> -  [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r")
> - (match_operand:HF 1 "general_operand"  "?rY, w,w,m,w,m,rY,r"))]
> +  [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w  ,?r,w,w,m,r,m ,r")
> + (match_operand:HF 1 "general_operand"  "Y ,?rY, w,w,m,w,m,rY,r"))]
>"TARGET_FLOAT && (register_operand (operands[0], HFmode)
>  || aarch64_reg_or_fp_zero (operands[1], HFmode))"
>"@
> +   movi\\t%0.4h, #0
> mov\\t%0.h[0], %w1
> umov\\t%w0, %1.h[0]
> mov\\t%0.h[0], %1.h[0]
> @@ -1176,18 +1177,19 @@
> ldrh\\t%w0, %1
> strh\\t%w1, %0
> mov\\t%w0, %w1"
> -  [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\
> +  [(set_attr "type" "neon_move,neon_from_gp,neon_to_gp,neon_move,\
>   f_loads,f_stores,load1,store1,mov_reg")
> -   (set_attr "simd" "yes,yes,yes,*,*,*,*,*")
> -   (set_attr "fp"   "*,*,*,yes,yes,*,*,*")]
> +   (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")
> +   (set_attr "fp"   "*,*,*,*,yes,yes,*,*,*")]
>  )
>  
>  (define_insn "*movsf_aarch64"
> -  [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
> - (match_operand:SF 1 "general_operand"  "?rY, w,w,Ufc,m,w,m,rY,r"))]
> +  [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w  ,?r,w,w  ,w,m,r,m 
> ,r")
> + (match_operand:SF 1 "general_operand"  "Y ,?rY, 
> w,w,Ufc,m,w,m,rY,r"))]
>"TARGET_FLOAT && (register_operand (operands[0], SFmode)
>  || aarch64_reg_or_fp_zero (operands[1], SFmode))"
>"@
> +   movi\\t%0.2s, #0
> fmov\\t%s0, %w1
> fmov\\t%w0, %s1
> fmov\\t%s0, %s1
> @@ -1197,16 +1199,19 @@
> ldr\\t%w0, %1
> str\\t%w1, %0
> mov\\t%w0, %w1"
> -  [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\
> - f_loads,f_stores,load1,store1,mov_reg")]
> +  [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconsts,\
> + f_loads,f_stores,load1,store1,mov_reg")
> +   (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*")
> +   (set_attr "fp"   "*,*,*,yes,yes,yes,yes,*,*,*")]
>  )

This fp attribute looks wrong to me. The two fmov instructions that move
between core and FP registers should be tagged "yes". However, this is
irrelevant as the whole pattern is guarded by TARGET_FLOAT.

It would be clearer to drop the FP attribute entirely, so as not to give
the erroneous impression that some alternatives in this insn are enabled
for !TARGET_FLOAT.

>  (define_insn "*movdf_aarch64"
> -  [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
> - (match_operand:DF 1 "general_operand"  "?rY, w,w,Ufc,m,w,m,rY,r"))]
> +  [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w  ,?r,w,w  ,w,m,r,m 
> ,r")
> + (match_operand:DF 1 "general_operand"  "Y ,?rY, 
> w,w,Ufc,m,w,m,rY,r"))]
>"TARGET_FLOAT && (register_operand (operands[0], DFmode)
>  || aarch64_reg_or_fp_zero (operands[1], DFmode))"
>"@
> +   movi\\t%d0, #0
> fmov\\t%d0, %x1
> fmov\\t%x0, %d1
> fmov\\t%d0, %d1
> @@ -1216,8 +1221,10 @@

Re: [PATCH][AArch64] Replace insn to zero up DF register

2016-03-09 Thread Evandro Menezes

On 03/01/16 13:08, Evandro Menezes wrote:

On 03/01/16 13:02, Wilco Dijkstra wrote:

Evandro Menezes wrote:

The meaning of these attributes are not clear to me.  Is there a
reference somewhere about which insns are FP or SIMD or neither?
The meaning should be clear, "fp" is a floating point instruction, 
"simd" a SIMD one

as defined in ARM-ARM.


Indeed, I had to add the Y for the f_mcr insn to match it with nosimd.
However, I didn't feel that it should be moved to the right, since it's
already disparaged.  Am I missing something detail?
It might not matter for this specific case, but I have seen reload 
forcing the very
first alternative without looking at any costs or preferences - as 
long as it is legal.
This suggests we need to order alternatives from most preferred 
alternative to least

preferred one.

I think it is good enough for commit, James?


Methinks that my issue with those attributes is that I'm not as fluent 
in AArch64 as I'd like to be.


Please, feel free to edit the patch changing the order then.


   Replace insn to zero up SIMD registers

   gcc/
* config/aarch64/aarch64.md
(*movhf_aarch64): Add "movi %0, #0" to zero up register.
(*movsf_aarch64): Likewise and add "simd" and "fp" attributes.
(*movdf_aarch64): Likewise.

Swapped the order of the constraints to favor MOVI.

Just say the word...

Thank you,

--
Evandro Menezes

>From bcb76a4c864436930e1236e7ce35d9e689adf075 Mon Sep 17 00:00:00 2001
From: Evandro Menezes 
Date: Mon, 19 Oct 2015 18:31:48 -0500
Subject: [PATCH] Replace insn to zero up SIMD registers

gcc/
	* config/aarch64/aarch64.md
	(*movhf_aarch64): Add "movi %0, #0" to zero up register.
	(*movsf_aarch64): Likewise and add "simd" and "fp" attributes.
	(*movdf_aarch64): Likewise.
---
 gcc/config/aarch64/aarch64.md | 33 -
 1 file changed, 20 insertions(+), 13 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 68676c9..4502a58 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1163,11 +1163,12 @@
 )
 
 (define_insn "*movhf_aarch64"
-  [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r")
-	(match_operand:HF 1 "general_operand"  "?rY, w,w,m,w,m,rY,r"))]
+  [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w  ,?r,w,w,m,r,m ,r")
+	(match_operand:HF 1 "general_operand"  "Y ,?rY, w,w,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], HFmode)
 || aarch64_reg_or_fp_zero (operands[1], HFmode))"
   "@
+   movi\\t%0.4h, #0
mov\\t%0.h[0], %w1
umov\\t%w0, %1.h[0]
mov\\t%0.h[0], %1.h[0]
@@ -1176,18 +1177,19 @@
ldrh\\t%w0, %1
strh\\t%w1, %0
mov\\t%w0, %w1"
-  [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\
+  [(set_attr "type" "neon_move,neon_from_gp,neon_to_gp,neon_move,\
  f_loads,f_stores,load1,store1,mov_reg")
-   (set_attr "simd" "yes,yes,yes,*,*,*,*,*")
-   (set_attr "fp"   "*,*,*,yes,yes,*,*,*")]
+   (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")
+   (set_attr "fp"   "*,*,*,*,yes,yes,*,*,*")]
 )
 
 (define_insn "*movsf_aarch64"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:SF 1 "general_operand"  "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w  ,?r,w,w  ,w,m,r,m ,r")
+	(match_operand:SF 1 "general_operand"  "Y ,?rY, w,w,Ufc,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], SFmode)
 || aarch64_reg_or_fp_zero (operands[1], SFmode))"
   "@
+   movi\\t%0.2s, #0
fmov\\t%s0, %w1
fmov\\t%w0, %s1
fmov\\t%s0, %s1
@@ -1197,16 +1199,19 @@
ldr\\t%w0, %1
str\\t%w1, %0
mov\\t%w0, %w1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\
- f_loads,f_stores,load1,store1,mov_reg")]
+  [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconsts,\
+ f_loads,f_stores,load1,store1,mov_reg")
+   (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*")
+   (set_attr "fp"   "*,*,*,yes,yes,yes,yes,*,*,*")]
 )
 
 (define_insn "*movdf_aarch64"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:DF 1 "general_operand"  "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w  ,?r,w,w  ,w,m,r,m ,r")
+	(match_operand:DF 1 "general_operand"  "Y ,?rY, w,w,Ufc,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], DFmode)
 || aarch64_reg_or_fp_zero (operands[1], DFmode))"
   "@
+   movi\\t%d0, #0
fmov\\t%d0, %x1
fmov\\t%x0, %d1
fmov\\t%d0, %d1
@@ -1216,8 +1221,10 @@
ldr\\t%x0, %1
str\\t%x1, %0
mov\\t%x0, %x1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconstd,\
- f_loadd,f_stored,load1,store1,mov_reg")]
+  [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconstd,\
+ f_loadd,f_stored,load1,store1,mov_reg")
+   (set_attr "simd" 

Re: [PATCH][AArch64] Replace insn to zero up DF register

2016-03-01 Thread Evandro Menezes

On 03/01/16 13:02, Wilco Dijkstra wrote:

Evandro Menezes wrote:

The meaning of these attributes are not clear to me.  Is there a
reference somewhere about which insns are FP or SIMD or neither?

The meaning should be clear, "fp" is a floating point instruction, "simd" a 
SIMD one
as defined in ARM-ARM.


Indeed, I had to add the Y for the f_mcr insn to match it with nosimd.
However, I didn't feel that it should be moved to the right, since it's
already disparaged.  Am I missing something detail?

It might not matter for this specific case, but I have seen reload forcing the 
very
first alternative without looking at any costs or preferences - as long as it 
is legal.
This suggests we need to order alternatives from most preferred alternative to 
least
preferred one.

I think it is good enough for commit, James?


Methinks that my issue with those attributes is that I'm not as fluent 
in AArch64 as I'd like to be.


Please, feel free to edit the patch changing the order then.

Thank you,

--
Evandro Menezes



Re: [PATCH][AArch64] Replace insn to zero up DF register

2016-03-01 Thread Wilco Dijkstra
Evandro Menezes wrote:
>
> The meaning of these attributes are not clear to me.  Is there a
> reference somewhere about which insns are FP or SIMD or neither?

The meaning should be clear, "fp" is a floating point instruction, "simd" a 
SIMD one
as defined in ARM-ARM.

> Indeed, I had to add the Y for the f_mcr insn to match it with nosimd.
> However, I didn't feel that it should be moved to the right, since it's
> already disparaged.  Am I missing something detail?

It might not matter for this specific case, but I have seen reload forcing the 
very
first alternative without looking at any costs or preferences - as long as it 
is legal.
This suggests we need to order alternatives from most preferred alternative to 
least
preferred one.

I think it is good enough for commit, James?

Wilco



Re: [PATCH][AArch64] Replace insn to zero up DF register

2016-02-29 Thread Evandro Menezes

On 02/29/16 12:07, Wilco Dijkstra wrote:

Evandro Menezes  wrote:

Please, verify the new "simd" and "fp" attributes for SF and DF.

Both movsf and movdf should be:

(set_attr "simd" "*,yes,*,*,*,*,*,*,*,*")
(set_attr "fp"   "*,*,*,yes,yes,yes,yes,*,*,*")

Did you check that with -mcpu=generic+nosimd you get fmov s0, wzr?
In my version I kept the Y on the fmov and placed the neon_mov first.


The meaning of these attributes are not clear to me.  Is there a 
reference somewhere about which insns are FP or SIMD or neither?


Indeed, I had to add the Y for the f_mcr insn to match it with nosimd.  
However, I didn't feel that it should be moved to the right, since it's 
already disparaged.  Am I missing something detail?


Thank you for the review,

--
Evandro Menezes

>From 952c0f74da98efd7fcb37b2cfe3c17518a619088 Mon Sep 17 00:00:00 2001
From: Evandro Menezes 
Date: Mon, 19 Oct 2015 18:31:48 -0500
Subject: [PATCH] Replace insn to zero up SIMD registers

gcc/
	* config/aarch64/aarch64.md
	(*movhf_aarch64): Add "movi %0, #0" to zero up register.
	(*movsf_aarch64): Likewise and add "simd" and "fp" attributes.
	(*movdf_aarch64): Likewise.
---
 gcc/config/aarch64/aarch64.md | 33 -
 1 file changed, 20 insertions(+), 13 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 68676c9..416e065 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1163,12 +1163,13 @@
 )
 
 (define_insn "*movhf_aarch64"
-  [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r")
-	(match_operand:HF 1 "general_operand"  "?rY, w,w,m,w,m,rY,r"))]
+  [(set (match_operand:HF 0 "nonimmediate_operand" "=w, w,?r,w,w,m,r,m ,r")
+	(match_operand:HF 1 "general_operand"  "?rY,Y, w,w,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], HFmode)
 || aarch64_reg_or_fp_zero (operands[1], HFmode))"
   "@
mov\\t%0.h[0], %w1
+   movi\\t%0.4h, #0
umov\\t%w0, %1.h[0]
mov\\t%0.h[0], %1.h[0]
ldr\\t%h0, %1
@@ -1176,19 +1177,20 @@
ldrh\\t%w0, %1
strh\\t%w1, %0
mov\\t%w0, %w1"
-  [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\
+  [(set_attr "type" "neon_from_gp,neon_move,neon_to_gp,neon_move,\
  f_loads,f_stores,load1,store1,mov_reg")
-   (set_attr "simd" "yes,yes,yes,*,*,*,*,*")
-   (set_attr "fp"   "*,*,*,yes,yes,*,*,*")]
+   (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")
+   (set_attr "fp"   "*,*,*,*,yes,yes,*,*,*")]
 )
 
 (define_insn "*movsf_aarch64"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:SF 1 "general_operand"  "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=w, w,?r,w,w  ,w,m,r,m ,r")
+	(match_operand:SF 1 "general_operand"  "?rY,Y, w,w,Ufc,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], SFmode)
 || aarch64_reg_or_fp_zero (operands[1], SFmode))"
   "@
fmov\\t%s0, %w1
+   movi\\t%0.2s, #0
fmov\\t%w0, %s1
fmov\\t%s0, %s1
fmov\\t%s0, %1
@@ -1197,17 +1199,20 @@
ldr\\t%w0, %1
str\\t%w1, %0
mov\\t%w0, %w1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\
- f_loads,f_stores,load1,store1,mov_reg")]
+  [(set_attr "type" "f_mcr,neon_move,f_mrc,fmov,fconsts,\
+ f_loads,f_stores,load1,store1,mov_reg")
+   (set_attr "simd" "*,yes,*,*,*,*,*,*,*,*")
+   (set_attr "fp"   "*,*,*,yes,yes,yes,yes,*,*,*")]
 )
 
 (define_insn "*movdf_aarch64"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:DF 1 "general_operand"  "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=w, w,?r,w,w  ,w,m,r,m ,r")
+	(match_operand:DF 1 "general_operand"  "?rY,Y, w,w,Ufc,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], DFmode)
 || aarch64_reg_or_fp_zero (operands[1], DFmode))"
   "@
fmov\\t%d0, %x1
+   movi\\t%d0, #0
fmov\\t%x0, %d1
fmov\\t%d0, %d1
fmov\\t%d0, %1
@@ -1216,8 +1221,10 @@
ldr\\t%x0, %1
str\\t%x1, %0
mov\\t%x0, %x1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconstd,\
- f_loadd,f_stored,load1,store1,mov_reg")]
+  [(set_attr "type" "f_mcr,neon_move,f_mrc,fmov,fconstd,\
+ f_loadd,f_stored,load1,store1,mov_reg")
+   (set_attr "simd" "*,yes,*,*,*,*,*,*,*,*")
+   (set_attr "fp"   "*,*,*,yes,yes,yes,yes,*,*,*")]
 )
 
 (define_insn "*movtf_aarch64"
-- 
2.6.3



Re: [PATCH][AArch64] Replace insn to zero up DF register

2016-02-29 Thread Wilco Dijkstra
Evandro Menezes  wrote:
> 
> Please, verify the new "simd" and "fp" attributes for SF and DF.

Both movsf and movdf should be:

(set_attr "simd" "*,yes,*,*,*,*,*,*,*,*")
(set_attr "fp"   "*,*,*,yes,yes,yes,yes,*,*,*")

Did you check that with -mcpu=generic+nosimd you get fmov s0, wzr?
In my version I kept the Y on the fmov and placed the neon_mov first.

Wilco



Re: [PATCH][AArch64] Replace insn to zero up DF register

2016-02-26 Thread Evandro Menezes

On 02/26/16 06:37, Wilco Dijkstra wrote:

Evandro Menezes  wrote:

I have a question though: is it necessary to add the "fp" and "simd"
attributes to both movsf_aarch64 and movdf_aarch64 as well?

You need at least the "simd" attribute, but providing "fp" as well is clearer
(in principle the TARGET_FLOAT check in the pattern condition is
redundant as a result, but the movhf and movtf patterns already do both).

Also you want to use the smallest possible SIMD size as these are
scalar operations and some microarchitectures execute 64-bit operations
more efficiently than 128-bit ones, so:

 mov\\t%0.h[0], %w1
+   movi\\t%0.4h, #0
 umov\\t%w0, %1.h[0]

 fmov\\t%s0, %w1
+   movi\\t%0.2s, #0
 fmov\\t%w0, %s1

With those changes it should be ready for commit once you get the OK from 
James/Marcus.


Replace insn to zero up SIMD registers

gcc/
* config/aarch64/aarch64.md
(*movhf_aarch64): Add "movi %0, #0" to zero up register.
(*movsf_aarch64): Likewise and add "simd" and "fp" attributes.
(*movdf_aarch64): Likewise.

Please, verify the new "simd" and "fp" attributes for SF and DF.

Thank you,

--
Evandro Menezes

>From d447411ea85f065faa77d56cb143eb07a467 Mon Sep 17 00:00:00 2001
From: Evandro Menezes 
Date: Mon, 19 Oct 2015 18:31:48 -0500
Subject: [PATCH] Replace insn to zero up SIMD registers

gcc/
	* config/aarch64/aarch64.md
	(*movhf_aarch64): Add "movi %0, #0" to zero up register.
	(*movsf_aarch64): Likewise and add "simd" and "fp" attributes.
	(*movdf_aarch64): Likewise.
---
 gcc/config/aarch64/aarch64.md | 33 -
 1 file changed, 20 insertions(+), 13 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 68676c9..a5661ed 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1163,12 +1163,13 @@
 )
 
 (define_insn "*movhf_aarch64"
-  [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r")
-	(match_operand:HF 1 "general_operand"  "?rY, w,w,m,w,m,rY,r"))]
+  [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w,?r,w,w,m,r,m ,r")
+	(match_operand:HF 1 "general_operand"  "?r,Y, w,w,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], HFmode)
 || aarch64_reg_or_fp_zero (operands[1], HFmode))"
   "@
mov\\t%0.h[0], %w1
+   movi\\t%0.4h, #0
umov\\t%w0, %1.h[0]
mov\\t%0.h[0], %1.h[0]
ldr\\t%h0, %1
@@ -1176,19 +1177,20 @@
ldrh\\t%w0, %1
strh\\t%w1, %0
mov\\t%w0, %w1"
-  [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\
+  [(set_attr "type" "neon_from_gp,neon_move,neon_to_gp,neon_move,\
  f_loads,f_stores,load1,store1,mov_reg")
-   (set_attr "simd" "yes,yes,yes,*,*,*,*,*")
-   (set_attr "fp"   "*,*,*,yes,yes,*,*,*")]
+   (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")
+   (set_attr "fp"   "*,*,*,*,yes,yes,*,*,*")]
 )
 
 (define_insn "*movsf_aarch64"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:SF 1 "general_operand"  "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w,?r,w,w  ,w,m,r,m ,r")
+	(match_operand:SF 1 "general_operand"  "?r,Y, w,w,Ufc,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], SFmode)
 || aarch64_reg_or_fp_zero (operands[1], SFmode))"
   "@
fmov\\t%s0, %w1
+   movi\\t%0.2s, #0
fmov\\t%w0, %s1
fmov\\t%s0, %s1
fmov\\t%s0, %1
@@ -1197,17 +1199,20 @@
ldr\\t%w0, %1
str\\t%w1, %0
mov\\t%w0, %w1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\
- f_loads,f_stores,load1,store1,mov_reg")]
+  [(set_attr "type" "f_mcr,neon_move,f_mrc,fmov,fconsts,\
+ f_loads,f_stores,load1,store1,mov_reg")
+   (set_attr "simd" "yes,yes,yes,yes,yes,*,*,*,*,*")
+   (set_attr "fp"   "yes,*,yes,yes,yes,yes,yes,*,*,yes")]
 )
 
 (define_insn "*movdf_aarch64"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:DF 1 "general_operand"  "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w,?r,w,w  ,w,m,r,m ,r")
+	(match_operand:DF 1 "general_operand"  "?r,Y, w,w,Ufc,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], DFmode)
 || aarch64_reg_or_fp_zero (operands[1], DFmode))"
   "@
fmov\\t%d0, %x1
+   movi\\t%d0, #0
fmov\\t%x0, %d1
fmov\\t%d0, %d1
fmov\\t%d0, %1
@@ -1216,8 +1221,10 @@
ldr\\t%x0, %1
str\\t%x1, %0
mov\\t%x0, %x1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconstd,\
- f_loadd,f_stored,load1,store1,mov_reg")]
+  [(set_attr "type" "f_mcr,neon_move,f_mrc,fmov,fconstd,\
+ f_loadd,f_stored,load1,store1,mov_reg")
+   (set_attr "simd" "yes,yes,yes,yes,yes,*,*,*,*,*")
+   (set_attr "fp"   "yes,*,yes,yes,yes,yes,yes,*,*,yes")]
 )
 
 (define_insn "*movtf_aarch64"
-- 
2.6.3


Re: [PATCH][AArch64] Replace insn to zero up DF register

2016-02-26 Thread Wilco Dijkstra
Evandro Menezes  wrote:
>
> I have a question though: is it necessary to add the "fp" and "simd"
> attributes to both movsf_aarch64 and movdf_aarch64 as well?

You need at least the "simd" attribute, but providing "fp" as well is clearer
(in principle the TARGET_FLOAT check in the pattern condition is 
redundant as a result, but the movhf and movtf patterns already do both).

Also you want to use the smallest possible SIMD size as these are
scalar operations and some microarchitectures execute 64-bit operations
more efficiently than 128-bit ones, so:

mov\\t%0.h[0], %w1
+   movi\\t%0.4h, #0
umov\\t%w0, %1.h[0]

fmov\\t%s0, %w1
+   movi\\t%0.2s, #0
fmov\\t%w0, %s1

With those changes it should be ready for commit once you get the OK from 
James/Marcus.

Wilco



Re: [PATCH][AArch64] Replace insn to zero up DF register

2016-01-27 Thread Evandro Menezes

On 01/22/16 07:52, Wilco Dijkstra wrote:

On 12/16/2015 03:30 PM, Evandro Menezes wrote:

 On 10/30/2015 05:24 AM, Marcus Shawcroft wrote:

 On 20 October 2015 at 00:40, Evandro Menezes  
wrote:

 In the existing targets, it seems that it's always faster to zero 
up a DF

 register with "movi %d0, #0" instead of "fmov %d0, xzr".

 This patch modifies the respective pattern.


 Hi Evandro,

 This patch changes the generic, u architecture independent instruction
 selection. The ARM ARM (C3.5.3) makes a specific recommendation about
 the choice of instruction in this situation and the current
 implementation in GCC follows that recommendation.  Wilco has also
 picked up on this issue he has the same patch internal to ARM along
 with an ongoing discussion with ARM architecture folk regarding this
 recommendation.  I'm reluctant to take this patch right now on the
 basis that it runs contrary to ARM ARM recommendation pending the
 conclusion of Wilco's discussion with ARM architecture folk.


 Have you had a chance to discuss this internally further?

Yes, it was decided to remove the recommendation from future ARM ARM's.

Several review comments on your patch 
(https://patchwork.ozlabs.org/patch/532736):

* This should be added to movhf, movsf and movdf - movtf already does this.
* It is important to set the "fp" and "simd" attributes so that the movi 
variant can
only be selected if it is available.



Hi, Wilco.

   2016-01-27 Evandro Menezes 

Replace insn to zero up SIMD registers

gcc/
* config/aarch64/aarch64.md
(*movhf_aarch64): Add "movi %0, #0" to zero up register.
(*movsf_aarch64): Likewise.
(*movdf_aarch64): Likewise.

When this decision is final, methinks that this patch would be close to 
addressing this change.


I have a question though: is it necessary to add the "fp" and "simd" 
attributes to both movsf_aarch64 and movdf_aarch64 as well?


Thank you,

--
Evandro Menezes

>From b390d411b56cfcdedf4601d0487baf1ef84e79ab Mon Sep 17 00:00:00 2001
From: Evandro Menezes 
Date: Mon, 19 Oct 2015 18:31:48 -0500
Subject: [PATCH] Replace insn to zero up SIMD registers

gcc/
	* config/aarch64/aarch64.md
	(*movhf_aarch64): Add "movi %0, #0" to zero up register.
	(*movsf_aarch64): Likewise.
	(*movdf_aarch64): Likewise.
---
 gcc/config/aarch64/aarch64.md | 25 ++---
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index f6c8eb1..43a3854 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1164,64 +1164,67 @@
   operands[1] = force_reg (mode, operands[1]);
   }
 )
 
 (define_insn "*movhf_aarch64"
-  [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r")
-	(match_operand:HF 1 "general_operand"  "?rY, w,w,m,w,m,rY,r"))]
+  [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w,?r,w,w,m,r,m ,r")
+	(match_operand:HF 1 "general_operand"  "?r,Y, w,w,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], HFmode)
 || aarch64_reg_or_fp_zero (operands[1], HFmode))"
   "@
mov\\t%0.h[0], %w1
+   movi\\t%0.8h, #0
umov\\t%w0, %1.h[0]
mov\\t%0.h[0], %1.h[0]
ldr\\t%h0, %1
str\\t%h1, %0
ldrh\\t%w0, %1
strh\\t%w1, %0
mov\\t%w0, %w1"
-  [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\
+  [(set_attr "type" "neon_from_gp,neon_move,neon_to_gp,neon_move,\
  f_loads,f_stores,load1,store1,mov_reg")
-   (set_attr "simd" "yes,yes,yes,*,*,*,*,*")
-   (set_attr "fp"   "*,*,*,yes,yes,*,*,*")]
+   (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")
+   (set_attr "fp"   "*,*,*,*,yes,yes,*,*,*")]
 )
 
 (define_insn "*movsf_aarch64"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:SF 1 "general_operand"  "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w,?r,w,w  ,w,m,r,m ,r")
+	(match_operand:SF 1 "general_operand"  "?r,Y, w,w,Ufc,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], SFmode)
 || aarch64_reg_or_fp_zero (operands[1], SFmode))"
   "@
fmov\\t%s0, %w1
+   movi\\t%0.4s, #0
fmov\\t%w0, %s1
fmov\\t%s0, %s1
fmov\\t%s0, %1
ldr\\t%s0, %1
str\\t%s1, %0
ldr\\t%w0, %1
str\\t%w1, %0
mov\\t%w0, %w1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\
+  [(set_attr "type" "f_mcr,neon_move,f_mrc,fmov,fconsts,\
  f_loads,f_stores,load1,store1,mov_reg")]
 )
 
 (define_insn "*movdf_aarch64"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:DF 1 "general_operand"  "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w,?r,w,w 

Re: [PATCH][AArch64] Replace insn to zero up DF register

2016-01-22 Thread Wilco Dijkstra
On 12/16/2015 03:30 PM, Evandro Menezes wrote:
>
>    On 10/30/2015 05:24 AM, Marcus Shawcroft wrote:
>
>    On 20 October 2015 at 00:40, Evandro Menezes  
>wrote:
>
>    In the existing targets, it seems that it's always faster to zero 
>up a DF
>
>    register with "movi %d0, #0" instead of "fmov %d0, xzr".
>
>    This patch modifies the respective pattern.
>
>
>    Hi Evandro,
>
>    This patch changes the generic, u architecture independent instruction
>    selection. The ARM ARM (C3.5.3) makes a specific recommendation about
>    the choice of instruction in this situation and the current
>    implementation in GCC follows that recommendation.  Wilco has also
>    picked up on this issue he has the same patch internal to ARM along
>    with an ongoing discussion with ARM architecture folk regarding this
>    recommendation.  I'm reluctant to take this patch right now on the
>    basis that it runs contrary to ARM ARM recommendation pending the
>    conclusion of Wilco's discussion with ARM architecture folk.
>
>
>    Have you had a chance to discuss this internally further?

Yes, it was decided to remove the recommendation from future ARM ARM's.

Several review comments on your patch 
(https://patchwork.ozlabs.org/patch/532736):

* This should be added to movhf, movsf and movdf - movtf already does this.
* It is important to set the "fp" and "simd" attributes so that the movi 
variant can
   only be selected if it is available.

Cheers,
Wilco



Re: [PATCH][AArch64] Replace insn to zero up DF register

2016-01-12 Thread Evandro Menezes

On 12/16/2015 03:30 PM, Evandro Menezes wrote:

On 10/30/2015 05:24 AM, Marcus Shawcroft wrote:
On 20 October 2015 at 00:40, Evandro Menezes  
wrote:
In the existing targets, it seems that it's always faster to zero up 
a DF

register with "movi %d0, #0" instead of "fmov %d0, xzr".

This patch modifies the respective pattern.


Hi Evandro,

This patch changes the generic, u architecture independent instruction
selection. The ARM ARM (C3.5.3) makes a specific recommendation about
the choice of instruction in this situation and the current
implementation in GCC follows that recommendation.  Wilco has also
picked up on this issue he has the same patch internal to ARM along
with an ongoing discussion with ARM architecture folk regarding this
recommendation.  I'm reluctant to take this patch right now on the
basis that it runs contrary to ARM ARM recommendation pending the
conclusion of Wilco's discussion with ARM architecture folk.


Have you had a chance to discuss this internally further?



Ping.

--
Evandro Menezes



Re: [PATCH][AArch64] Replace insn to zero up DF register

2015-12-16 Thread Evandro Menezes

On 10/30/2015 05:24 AM, Marcus Shawcroft wrote:

On 20 October 2015 at 00:40, Evandro Menezes  wrote:

In the existing targets, it seems that it's always faster to zero up a DF
register with "movi %d0, #0" instead of "fmov %d0, xzr".

This patch modifies the respective pattern.


Hi Evandro,

This patch changes the generic, u architecture independent instruction
selection. The ARM ARM (C3.5.3) makes a specific recommendation about
the choice of instruction in this situation and the current
implementation in GCC follows that recommendation.  Wilco has also
picked up on this issue he has the same patch internal to ARM along
with an ongoing discussion with ARM architecture folk regarding this
recommendation.  I'm reluctant to take this patch right now on the
basis that it runs contrary to ARM ARM recommendation pending the
conclusion of Wilco's discussion with ARM architecture folk.



Marcus,

Have you had a chance to discuss this internally further?

Thank you,

--
Evandro Menezes



Re: [PATCH][AArch64] Replace insn to zero up DF register

2015-12-03 Thread Evandro Menezes

On 11/09/2015 04:59 PM, Evandro Menezes wrote:

Hi, Marcus.

Have you an update from the architecture folks about this?

Thank you,


Marcus?

--
Evandro Menezes



Re: [PATCH][AArch64] Replace insn to zero up DF register

2015-11-19 Thread Evandro Menezes

On 10/30/2015 05:24 AM, Marcus Shawcroft wrote:

On 20 October 2015 at 00:40, Evandro Menezes  wrote:

In the existing targets, it seems that it's always faster to zero up a DF
register with "movi %d0, #0" instead of "fmov %d0, xzr".

This patch modifies the respective pattern.


Hi Evandro,

This patch changes the generic, u architecture independent instruction
selection. The ARM ARM (C3.5.3) makes a specific recommendation about
the choice of instruction in this situation and the current
implementation in GCC follows that recommendation.  Wilco has also
picked up on this issue he has the same patch internal to ARM along
with an ongoing discussion with ARM architecture folk regarding this
recommendation.  I'm reluctant to take this patch right now on the
basis that it runs contrary to ARM ARM recommendation pending the
conclusion of Wilco's discussion with ARM architecture folk.


Ping.

--
Evandro Menezes



Re: [PATCH][AArch64] Replace insn to zero up DF register

2015-11-09 Thread Evandro Menezes

Hi, Marcus.

Have you an update from the architecture folks about this?

Thank you,

--
Evandro Menezes

On 10/30/2015 05:24 AM, Marcus Shawcroft wrote:

On 20 October 2015 at 00:40, Evandro Menezes  wrote:

In the existing targets, it seems that it's always faster to zero up a DF
register with "movi %d0, #0" instead of "fmov %d0, xzr".

This patch modifies the respective pattern.


Hi Evandro,

This patch changes the generic, u architecture independent instruction
selection. The ARM ARM (C3.5.3) makes a specific recommendation about
the choice of instruction in this situation and the current
implementation in GCC follows that recommendation.  Wilco has also
picked up on this issue he has the same patch internal to ARM along
with an ongoing discussion with ARM architecture folk regarding this
recommendation.  I'm reluctant to take this patch right now on the
basis that it runs contrary to ARM ARM recommendation pending the
conclusion of Wilco's discussion with ARM architecture folk.

Cheers
/Marcus





Re: [PATCH][AArch64] Replace insn to zero up DF register

2015-10-30 Thread Marcus Shawcroft
On 20 October 2015 at 00:40, Evandro Menezes  wrote:
> In the existing targets, it seems that it's always faster to zero up a DF
> register with "movi %d0, #0" instead of "fmov %d0, xzr".
>
> This patch modifies the respective pattern.


Hi Evandro,

This patch changes the generic, u architecture independent instruction
selection. The ARM ARM (C3.5.3) makes a specific recommendation about
the choice of instruction in this situation and the current
implementation in GCC follows that recommendation.  Wilco has also
picked up on this issue he has the same patch internal to ARM along
with an ongoing discussion with ARM architecture folk regarding this
recommendation.  I'm reluctant to take this patch right now on the
basis that it runs contrary to ARM ARM recommendation pending the
conclusion of Wilco's discussion with ARM architecture folk.

Cheers
/Marcus


Re: [PATCH][AArch64] Replace insn to zero up DF register

2015-10-28 Thread Evandro Menezes

Ping.

--
Evandro Menezes

On 10/20/2015 09:27 AM, Andrew Pinski wrote:

On Tue, Oct 20, 2015 at 7:59 AM, Andrew Pinski  wrote:

On Tue, Oct 20, 2015 at 7:51 AM, Andrew Pinski  wrote:

On Tue, Oct 20, 2015 at 7:40 AM, Evandro Menezes  wrote:

In the existing targets, it seems that it's always faster to zero up a DF
register with "movi %d0, #0" instead of "fmov %d0, xzr".

I think for ThunderX 1, this change will not make a difference.  So I
am neutral on this change.

Actually depending on fmov is decoded in our pipeline, this change
might actually be worse.  Currently fmov with an immediate is 1 cycle
while movi is two cycles.  Let me double check how internally on how
it is decoded and if it is 1 cycle or two.

Ok, my objections are removed as I talked with the architectures here
at Cavium and using movi is better in this case.

Thanks,
Andrew


Thanks,
Andrew


Thanks,
Andrew


This patch modifies the respective pattern.

Please, commit if it's alright.

Thank you,

--
Evandro Menezes





Re: [PATCH][AArch64] Replace insn to zero up DF register

2015-10-20 Thread Andrew Pinski
On Tue, Oct 20, 2015 at 7:59 AM, Andrew Pinski  wrote:
> On Tue, Oct 20, 2015 at 7:51 AM, Andrew Pinski  wrote:
>> On Tue, Oct 20, 2015 at 7:40 AM, Evandro Menezes  
>> wrote:
>>> In the existing targets, it seems that it's always faster to zero up a DF
>>> register with "movi %d0, #0" instead of "fmov %d0, xzr".
>>
>> I think for ThunderX 1, this change will not make a difference.  So I
>> am neutral on this change.
>
> Actually depending on fmov is decoded in our pipeline, this change
> might actually be worse.  Currently fmov with an immediate is 1 cycle
> while movi is two cycles.  Let me double check how internally on how
> it is decoded and if it is 1 cycle or two.

Ok, my objections are removed as I talked with the architectures here
at Cavium and using movi is better in this case.

Thanks,
Andrew

>
> Thanks,
> Andrew
>
>>
>> Thanks,
>> Andrew
>>
>>>
>>> This patch modifies the respective pattern.
>>>
>>> Please, commit if it's alright.
>>>
>>> Thank you,
>>>
>>> --
>>> Evandro Menezes
>>>


Re: [PATCH][AArch64] Replace insn to zero up DF register

2015-10-19 Thread Andrew Pinski
On Tue, Oct 20, 2015 at 7:40 AM, Evandro Menezes  wrote:
> In the existing targets, it seems that it's always faster to zero up a DF
> register with "movi %d0, #0" instead of "fmov %d0, xzr".

I think for ThunderX 1, this change will not make a difference.  So I
am neutral on this change.

Thanks,
Andrew

>
> This patch modifies the respective pattern.
>
> Please, commit if it's alright.
>
> Thank you,
>
> --
> Evandro Menezes
>


Re: [PATCH][AArch64] Replace insn to zero up DF register

2015-10-19 Thread Andrew Pinski
On Tue, Oct 20, 2015 at 7:51 AM, Andrew Pinski  wrote:
> On Tue, Oct 20, 2015 at 7:40 AM, Evandro Menezes  
> wrote:
>> In the existing targets, it seems that it's always faster to zero up a DF
>> register with "movi %d0, #0" instead of "fmov %d0, xzr".
>
> I think for ThunderX 1, this change will not make a difference.  So I
> am neutral on this change.

Actually depending on fmov is decoded in our pipeline, this change
might actually be worse.  Currently fmov with an immediate is 1 cycle
while movi is two cycles.  Let me double check how internally on how
it is decoded and if it is 1 cycle or two.

Thanks,
Andrew

>
> Thanks,
> Andrew
>
>>
>> This patch modifies the respective pattern.
>>
>> Please, commit if it's alright.
>>
>> Thank you,
>>
>> --
>> Evandro Menezes
>>