Re: [PATCH][AArch64] Replace insn to zero up DF register
On Thu, Mar 10, 2016 at 10:32:15AM -0600, Evandro Menezes wrote: > >I agree to postpone until GCC 7. > > > >[AArch64] Replace insn to zero up SIMD registers > > > >gcc/ > >* config/aarch64/aarch64.md > >(*movhf_aarch64): Add "movi %0, #0" to zero up register. > >(*movsf_aarch64): Likewise and add "simd" attributes. > >(*movdf_aarch64): Likewise. > > > >This patch removes the FP attributes from the HF, SF, DF, TF moves. > > And now, with the patch. :-/ > Thanks for sticking with it. This is OK for GCC 7 when development opens. Remember to mention the most recent changes in your Changelog entry (Remove "fp" attribute from *movhf_aarch64 and *movtf_aarch64). Thanks, James
Re: [PATCH][AArch64] Replace insn to zero up DF register
On 03/10/16 10:27, Evandro Menezes wrote: On 03/10/16 07:23, James Greenhalgh wrote: On Wed, Mar 09, 2016 at 03:35:43PM -0600, Evandro Menezes wrote: On 03/01/16 13:08, Evandro Menezes wrote: On 03/01/16 13:02, Wilco Dijkstra wrote: Evandro Menezes wrote: The meaning of these attributes are not clear to me. Is there a reference somewhere about which insns are FP or SIMD or neither? The meaning should be clear, "fp" is a floating point instruction, "simd" a SIMD one as defined in ARM-ARM. Indeed, I had to add the Y for the f_mcr insn to match it with nosimd. However, I didn't feel that it should be moved to the right, since it's already disparaged. Am I missing something detail? It might not matter for this specific case, but I have seen reload forcing the very first alternative without looking at any costs or preferences - as long as it is legal. This suggests we need to order alternatives from most preferred alternative to least preferred one. I think it is good enough for commit, James? Methinks that my issue with those attributes is that I'm not as fluent in AArch64 as I'd like to be. Please, feel free to edit the patch changing the order then. Replace insn to zero up SIMD registers gcc/ * config/aarch64/aarch64.md (*movhf_aarch64): Add "movi %0, #0" to zero up register. (*movsf_aarch64): Likewise and add "simd" and "fp" attributes. (*movdf_aarch64): Likewise. Swapped the order of the constraints to favor MOVI. Just say the word... I'm wondering whether this is appropriate for GCC6 now that we are so late in the development cycle. Additionally, I have some comments on your patch: diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 68676c9..4502a58 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -1163,11 +1163,12 @@ ) (define_insn "*movhf_aarch64" - [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r") -(match_operand:HF 1 "general_operand" "?rY, w,w,m,w,m,rY,r"))] + [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w ,?r,w,w,m,r,m ,r") +(match_operand:HF 1 "general_operand" "Y ,?rY, w,w,m,w,m,rY,r"))] "TARGET_FLOAT && (register_operand (operands[0], HFmode) || aarch64_reg_or_fp_zero (operands[1], HFmode))" "@ + movi\\t%0.4h, #0 mov\\t%0.h[0], %w1 umov\\t%w0, %1.h[0] mov\\t%0.h[0], %1.h[0] @@ -1176,18 +1177,19 @@ ldrh\\t%w0, %1 strh\\t%w1, %0 mov\\t%w0, %w1" - [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\ + [(set_attr "type" "neon_move,neon_from_gp,neon_to_gp,neon_move,\ f_loads,f_stores,load1,store1,mov_reg") - (set_attr "simd" "yes,yes,yes,*,*,*,*,*") - (set_attr "fp" "*,*,*,yes,yes,*,*,*")] + (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*") + (set_attr "fp" "*,*,*,*,yes,yes,*,*,*")] ) (define_insn "*movsf_aarch64" - [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w ,w,m,r,m ,r") -(match_operand:SF 1 "general_operand" "?rY, w,w,Ufc,m,w,m,rY,r"))] + [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w ,?r,w,w ,w,m,r,m ,r") +(match_operand:SF 1 "general_operand" "Y ,?rY, w,w,Ufc,m,w,m,rY,r"))] "TARGET_FLOAT && (register_operand (operands[0], SFmode) || aarch64_reg_or_fp_zero (operands[1], SFmode))" "@ + movi\\t%0.2s, #0 fmov\\t%s0, %w1 fmov\\t%w0, %s1 fmov\\t%s0, %s1 @@ -1197,16 +1199,19 @@ ldr\\t%w0, %1 str\\t%w1, %0 mov\\t%w0, %w1" - [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\ - f_loads,f_stores,load1,store1,mov_reg")] + [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconsts,\ + f_loads,f_stores,load1,store1,mov_reg") + (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*") + (set_attr "fp" "*,*,*,yes,yes,yes,yes,*,*,*")] ) This fp attribute looks wrong to me. The two fmov instructions that move between core and FP registers should be tagged "yes". However, this is irrelevant as the whole pattern is guarded by TARGET_FLOAT. It would be clearer to drop the FP attribute entirely, so as not to give the erroneous impression that some alternatives in this insn are enabled for !TARGET_FLOAT. You mean to remove the FP attribute from all, HF, SF, DF, TF? (define_insn "*movdf_aarch64" - [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w ,w,m,r,m ,r") -(match_operand:DF 1 "general_operand" "?rY, w,w,Ufc,m,w,m,rY,r"))] + [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w ,?r,w,w ,w,m,r,m ,r") +(match_operand:DF 1 "general_operand" "Y ,?rY, w,w,Ufc,m,w,m,rY,r"))] "TARGET_FLOAT && (register_operand (operands[0], DFmode) || aarch64_reg_or_fp_zero (operands[1], DFmode))" "@ + movi\\t%d0, #0 fmov\\t%d0, %x1 fmov\\t%x0, %d1 fmov\\t%d0, %d1 @@ -1216,8 +1221,10 @@ ldr\\t%x0, %1 str\\t%x1, %0 mov\\t%x0, %x1" -
Re: [PATCH][AArch64] Replace insn to zero up DF register
On 03/10/16 07:23, James Greenhalgh wrote: On Wed, Mar 09, 2016 at 03:35:43PM -0600, Evandro Menezes wrote: On 03/01/16 13:08, Evandro Menezes wrote: On 03/01/16 13:02, Wilco Dijkstra wrote: Evandro Menezes wrote: The meaning of these attributes are not clear to me. Is there a reference somewhere about which insns are FP or SIMD or neither? The meaning should be clear, "fp" is a floating point instruction, "simd" a SIMD one as defined in ARM-ARM. Indeed, I had to add the Y for the f_mcr insn to match it with nosimd. However, I didn't feel that it should be moved to the right, since it's already disparaged. Am I missing something detail? It might not matter for this specific case, but I have seen reload forcing the very first alternative without looking at any costs or preferences - as long as it is legal. This suggests we need to order alternatives from most preferred alternative to least preferred one. I think it is good enough for commit, James? Methinks that my issue with those attributes is that I'm not as fluent in AArch64 as I'd like to be. Please, feel free to edit the patch changing the order then. Replace insn to zero up SIMD registers gcc/ * config/aarch64/aarch64.md (*movhf_aarch64): Add "movi %0, #0" to zero up register. (*movsf_aarch64): Likewise and add "simd" and "fp" attributes. (*movdf_aarch64): Likewise. Swapped the order of the constraints to favor MOVI. Just say the word... I'm wondering whether this is appropriate for GCC6 now that we are so late in the development cycle. Additionally, I have some comments on your patch: diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 68676c9..4502a58 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -1163,11 +1163,12 @@ ) (define_insn "*movhf_aarch64" - [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r") - (match_operand:HF 1 "general_operand" "?rY, w,w,m,w,m,rY,r"))] + [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w ,?r,w,w,m,r,m ,r") + (match_operand:HF 1 "general_operand" "Y ,?rY, w,w,m,w,m,rY,r"))] "TARGET_FLOAT && (register_operand (operands[0], HFmode) || aarch64_reg_or_fp_zero (operands[1], HFmode))" "@ + movi\\t%0.4h, #0 mov\\t%0.h[0], %w1 umov\\t%w0, %1.h[0] mov\\t%0.h[0], %1.h[0] @@ -1176,18 +1177,19 @@ ldrh\\t%w0, %1 strh\\t%w1, %0 mov\\t%w0, %w1" - [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\ + [(set_attr "type" "neon_move,neon_from_gp,neon_to_gp,neon_move,\ f_loads,f_stores,load1,store1,mov_reg") - (set_attr "simd" "yes,yes,yes,*,*,*,*,*") - (set_attr "fp" "*,*,*,yes,yes,*,*,*")] + (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*") + (set_attr "fp" "*,*,*,*,yes,yes,*,*,*")] ) (define_insn "*movsf_aarch64" - [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w ,w,m,r,m ,r") - (match_operand:SF 1 "general_operand" "?rY, w,w,Ufc,m,w,m,rY,r"))] + [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w ,?r,w,w ,w,m,r,m ,r") + (match_operand:SF 1 "general_operand" "Y ,?rY, w,w,Ufc,m,w,m,rY,r"))] "TARGET_FLOAT && (register_operand (operands[0], SFmode) || aarch64_reg_or_fp_zero (operands[1], SFmode))" "@ + movi\\t%0.2s, #0 fmov\\t%s0, %w1 fmov\\t%w0, %s1 fmov\\t%s0, %s1 @@ -1197,16 +1199,19 @@ ldr\\t%w0, %1 str\\t%w1, %0 mov\\t%w0, %w1" - [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\ - f_loads,f_stores,load1,store1,mov_reg")] + [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconsts,\ + f_loads,f_stores,load1,store1,mov_reg") + (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*") + (set_attr "fp" "*,*,*,yes,yes,yes,yes,*,*,*")] ) This fp attribute looks wrong to me. The two fmov instructions that move between core and FP registers should be tagged "yes". However, this is irrelevant as the whole pattern is guarded by TARGET_FLOAT. It would be clearer to drop the FP attribute entirely, so as not to give the erroneous impression that some alternatives in this insn are enabled for !TARGET_FLOAT. You mean to remove the FP attribute from all, HF, SF, DF, TF? (define_insn "*movdf_aarch64" - [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w ,w,m,r,m ,r") - (match_operand:DF 1 "general_operand" "?rY, w,w,Ufc,m,w,m,rY,r"))] + [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w ,?r,w,w ,w,m,r,m ,r") + (match_operand:DF 1 "general_operand" "Y ,?rY, w,w,Ufc,m,w,m,rY,r"))] "TARGET_FLOAT && (register_operand (operands[0], DFmode) || aarch64_reg_or_fp_zero (operands[1], DFmode))" "@ + movi\\t%d0, #0 fmov\\t%d0, %x1 fmov\\t%x0, %d1 fmov\\t%d0, %d1 @@ -1216,8 +1221,10 @@ ldr\\t%x0, %1 str\\t%x1, %0 mov\\t%x0, %x1" - [(set_attr "type"
Re: [PATCH][AArch64] Replace insn to zero up DF register
On Wed, Mar 09, 2016 at 03:35:43PM -0600, Evandro Menezes wrote: > On 03/01/16 13:08, Evandro Menezes wrote: > >On 03/01/16 13:02, Wilco Dijkstra wrote: > >>Evandro Menezes wrote: > >>>The meaning of these attributes are not clear to me. Is there a > >>>reference somewhere about which insns are FP or SIMD or neither? > >>The meaning should be clear, "fp" is a floating point > >>instruction, "simd" a SIMD one > >>as defined in ARM-ARM. > >> > >>>Indeed, I had to add the Y for the f_mcr insn to match it with nosimd. > >>>However, I didn't feel that it should be moved to the right, since it's > >>>already disparaged. Am I missing something detail? > >>It might not matter for this specific case, but I have seen > >>reload forcing the very > >>first alternative without looking at any costs or preferences - > >>as long as it is legal. > >>This suggests we need to order alternatives from most preferred > >>alternative to least > >>preferred one. > >> > >>I think it is good enough for commit, James? > > > >Methinks that my issue with those attributes is that I'm not as > >fluent in AArch64 as I'd like to be. > > > >Please, feel free to edit the patch changing the order then. > >Replace insn to zero up SIMD registers > >gcc/ > * config/aarch64/aarch64.md > (*movhf_aarch64): Add "movi %0, #0" to zero up register. > (*movsf_aarch64): Likewise and add "simd" and "fp" attributes. > (*movdf_aarch64): Likewise. > > Swapped the order of the constraints to favor MOVI. > > Just say the word... I'm wondering whether this is appropriate for GCC6 now that we are so late in the development cycle. Additionally, I have some comments on your patch: > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md > index 68676c9..4502a58 100644 > --- a/gcc/config/aarch64/aarch64.md > +++ b/gcc/config/aarch64/aarch64.md > @@ -1163,11 +1163,12 @@ > ) > > (define_insn "*movhf_aarch64" > - [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r") > - (match_operand:HF 1 "general_operand" "?rY, w,w,m,w,m,rY,r"))] > + [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w ,?r,w,w,m,r,m ,r") > + (match_operand:HF 1 "general_operand" "Y ,?rY, w,w,m,w,m,rY,r"))] >"TARGET_FLOAT && (register_operand (operands[0], HFmode) > || aarch64_reg_or_fp_zero (operands[1], HFmode))" >"@ > + movi\\t%0.4h, #0 > mov\\t%0.h[0], %w1 > umov\\t%w0, %1.h[0] > mov\\t%0.h[0], %1.h[0] > @@ -1176,18 +1177,19 @@ > ldrh\\t%w0, %1 > strh\\t%w1, %0 > mov\\t%w0, %w1" > - [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\ > + [(set_attr "type" "neon_move,neon_from_gp,neon_to_gp,neon_move,\ > f_loads,f_stores,load1,store1,mov_reg") > - (set_attr "simd" "yes,yes,yes,*,*,*,*,*") > - (set_attr "fp" "*,*,*,yes,yes,*,*,*")] > + (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*") > + (set_attr "fp" "*,*,*,*,yes,yes,*,*,*")] > ) > > (define_insn "*movsf_aarch64" > - [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w ,w,m,r,m ,r") > - (match_operand:SF 1 "general_operand" "?rY, w,w,Ufc,m,w,m,rY,r"))] > + [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w ,?r,w,w ,w,m,r,m > ,r") > + (match_operand:SF 1 "general_operand" "Y ,?rY, > w,w,Ufc,m,w,m,rY,r"))] >"TARGET_FLOAT && (register_operand (operands[0], SFmode) > || aarch64_reg_or_fp_zero (operands[1], SFmode))" >"@ > + movi\\t%0.2s, #0 > fmov\\t%s0, %w1 > fmov\\t%w0, %s1 > fmov\\t%s0, %s1 > @@ -1197,16 +1199,19 @@ > ldr\\t%w0, %1 > str\\t%w1, %0 > mov\\t%w0, %w1" > - [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\ > - f_loads,f_stores,load1,store1,mov_reg")] > + [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconsts,\ > + f_loads,f_stores,load1,store1,mov_reg") > + (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*") > + (set_attr "fp" "*,*,*,yes,yes,yes,yes,*,*,*")] > ) This fp attribute looks wrong to me. The two fmov instructions that move between core and FP registers should be tagged "yes". However, this is irrelevant as the whole pattern is guarded by TARGET_FLOAT. It would be clearer to drop the FP attribute entirely, so as not to give the erroneous impression that some alternatives in this insn are enabled for !TARGET_FLOAT. > (define_insn "*movdf_aarch64" > - [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w ,w,m,r,m ,r") > - (match_operand:DF 1 "general_operand" "?rY, w,w,Ufc,m,w,m,rY,r"))] > + [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w ,?r,w,w ,w,m,r,m > ,r") > + (match_operand:DF 1 "general_operand" "Y ,?rY, > w,w,Ufc,m,w,m,rY,r"))] >"TARGET_FLOAT && (register_operand (operands[0], DFmode) > || aarch64_reg_or_fp_zero (operands[1], DFmode))" >"@ > + movi\\t%d0, #0 > fmov\\t%d0, %x1 > fmov\\t%x0, %d1 > fmov\\t%d0, %d1 > @@ -1216,8 +1221,10 @@
Re: [PATCH][AArch64] Replace insn to zero up DF register
On 03/01/16 13:08, Evandro Menezes wrote: On 03/01/16 13:02, Wilco Dijkstra wrote: Evandro Menezes wrote: The meaning of these attributes are not clear to me. Is there a reference somewhere about which insns are FP or SIMD or neither? The meaning should be clear, "fp" is a floating point instruction, "simd" a SIMD one as defined in ARM-ARM. Indeed, I had to add the Y for the f_mcr insn to match it with nosimd. However, I didn't feel that it should be moved to the right, since it's already disparaged. Am I missing something detail? It might not matter for this specific case, but I have seen reload forcing the very first alternative without looking at any costs or preferences - as long as it is legal. This suggests we need to order alternatives from most preferred alternative to least preferred one. I think it is good enough for commit, James? Methinks that my issue with those attributes is that I'm not as fluent in AArch64 as I'd like to be. Please, feel free to edit the patch changing the order then. Replace insn to zero up SIMD registers gcc/ * config/aarch64/aarch64.md (*movhf_aarch64): Add "movi %0, #0" to zero up register. (*movsf_aarch64): Likewise and add "simd" and "fp" attributes. (*movdf_aarch64): Likewise. Swapped the order of the constraints to favor MOVI. Just say the word... Thank you, -- Evandro Menezes >From bcb76a4c864436930e1236e7ce35d9e689adf075 Mon Sep 17 00:00:00 2001 From: Evandro MenezesDate: Mon, 19 Oct 2015 18:31:48 -0500 Subject: [PATCH] Replace insn to zero up SIMD registers gcc/ * config/aarch64/aarch64.md (*movhf_aarch64): Add "movi %0, #0" to zero up register. (*movsf_aarch64): Likewise and add "simd" and "fp" attributes. (*movdf_aarch64): Likewise. --- gcc/config/aarch64/aarch64.md | 33 - 1 file changed, 20 insertions(+), 13 deletions(-) diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 68676c9..4502a58 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -1163,11 +1163,12 @@ ) (define_insn "*movhf_aarch64" - [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r") - (match_operand:HF 1 "general_operand" "?rY, w,w,m,w,m,rY,r"))] + [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w ,?r,w,w,m,r,m ,r") + (match_operand:HF 1 "general_operand" "Y ,?rY, w,w,m,w,m,rY,r"))] "TARGET_FLOAT && (register_operand (operands[0], HFmode) || aarch64_reg_or_fp_zero (operands[1], HFmode))" "@ + movi\\t%0.4h, #0 mov\\t%0.h[0], %w1 umov\\t%w0, %1.h[0] mov\\t%0.h[0], %1.h[0] @@ -1176,18 +1177,19 @@ ldrh\\t%w0, %1 strh\\t%w1, %0 mov\\t%w0, %w1" - [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\ + [(set_attr "type" "neon_move,neon_from_gp,neon_to_gp,neon_move,\ f_loads,f_stores,load1,store1,mov_reg") - (set_attr "simd" "yes,yes,yes,*,*,*,*,*") - (set_attr "fp" "*,*,*,yes,yes,*,*,*")] + (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*") + (set_attr "fp" "*,*,*,*,yes,yes,*,*,*")] ) (define_insn "*movsf_aarch64" - [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w ,w,m,r,m ,r") - (match_operand:SF 1 "general_operand" "?rY, w,w,Ufc,m,w,m,rY,r"))] + [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w ,?r,w,w ,w,m,r,m ,r") + (match_operand:SF 1 "general_operand" "Y ,?rY, w,w,Ufc,m,w,m,rY,r"))] "TARGET_FLOAT && (register_operand (operands[0], SFmode) || aarch64_reg_or_fp_zero (operands[1], SFmode))" "@ + movi\\t%0.2s, #0 fmov\\t%s0, %w1 fmov\\t%w0, %s1 fmov\\t%s0, %s1 @@ -1197,16 +1199,19 @@ ldr\\t%w0, %1 str\\t%w1, %0 mov\\t%w0, %w1" - [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\ - f_loads,f_stores,load1,store1,mov_reg")] + [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconsts,\ + f_loads,f_stores,load1,store1,mov_reg") + (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*") + (set_attr "fp" "*,*,*,yes,yes,yes,yes,*,*,*")] ) (define_insn "*movdf_aarch64" - [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w ,w,m,r,m ,r") - (match_operand:DF 1 "general_operand" "?rY, w,w,Ufc,m,w,m,rY,r"))] + [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w ,?r,w,w ,w,m,r,m ,r") + (match_operand:DF 1 "general_operand" "Y ,?rY, w,w,Ufc,m,w,m,rY,r"))] "TARGET_FLOAT && (register_operand (operands[0], DFmode) || aarch64_reg_or_fp_zero (operands[1], DFmode))" "@ + movi\\t%d0, #0 fmov\\t%d0, %x1 fmov\\t%x0, %d1 fmov\\t%d0, %d1 @@ -1216,8 +1221,10 @@ ldr\\t%x0, %1 str\\t%x1, %0 mov\\t%x0, %x1" - [(set_attr "type" "f_mcr,f_mrc,fmov,fconstd,\ - f_loadd,f_stored,load1,store1,mov_reg")] + [(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconstd,\ + f_loadd,f_stored,load1,store1,mov_reg") + (set_attr "simd"
Re: [PATCH][AArch64] Replace insn to zero up DF register
On 03/01/16 13:02, Wilco Dijkstra wrote: Evandro Menezes wrote: The meaning of these attributes are not clear to me. Is there a reference somewhere about which insns are FP or SIMD or neither? The meaning should be clear, "fp" is a floating point instruction, "simd" a SIMD one as defined in ARM-ARM. Indeed, I had to add the Y for the f_mcr insn to match it with nosimd. However, I didn't feel that it should be moved to the right, since it's already disparaged. Am I missing something detail? It might not matter for this specific case, but I have seen reload forcing the very first alternative without looking at any costs or preferences - as long as it is legal. This suggests we need to order alternatives from most preferred alternative to least preferred one. I think it is good enough for commit, James? Methinks that my issue with those attributes is that I'm not as fluent in AArch64 as I'd like to be. Please, feel free to edit the patch changing the order then. Thank you, -- Evandro Menezes
Re: [PATCH][AArch64] Replace insn to zero up DF register
Evandro Menezes wrote: > > The meaning of these attributes are not clear to me. Is there a > reference somewhere about which insns are FP or SIMD or neither? The meaning should be clear, "fp" is a floating point instruction, "simd" a SIMD one as defined in ARM-ARM. > Indeed, I had to add the Y for the f_mcr insn to match it with nosimd. > However, I didn't feel that it should be moved to the right, since it's > already disparaged. Am I missing something detail? It might not matter for this specific case, but I have seen reload forcing the very first alternative without looking at any costs or preferences - as long as it is legal. This suggests we need to order alternatives from most preferred alternative to least preferred one. I think it is good enough for commit, James? Wilco
Re: [PATCH][AArch64] Replace insn to zero up DF register
On 02/29/16 12:07, Wilco Dijkstra wrote: Evandro Menezeswrote: Please, verify the new "simd" and "fp" attributes for SF and DF. Both movsf and movdf should be: (set_attr "simd" "*,yes,*,*,*,*,*,*,*,*") (set_attr "fp" "*,*,*,yes,yes,yes,yes,*,*,*") Did you check that with -mcpu=generic+nosimd you get fmov s0, wzr? In my version I kept the Y on the fmov and placed the neon_mov first. The meaning of these attributes are not clear to me. Is there a reference somewhere about which insns are FP or SIMD or neither? Indeed, I had to add the Y for the f_mcr insn to match it with nosimd. However, I didn't feel that it should be moved to the right, since it's already disparaged. Am I missing something detail? Thank you for the review, -- Evandro Menezes >From 952c0f74da98efd7fcb37b2cfe3c17518a619088 Mon Sep 17 00:00:00 2001 From: Evandro Menezes Date: Mon, 19 Oct 2015 18:31:48 -0500 Subject: [PATCH] Replace insn to zero up SIMD registers gcc/ * config/aarch64/aarch64.md (*movhf_aarch64): Add "movi %0, #0" to zero up register. (*movsf_aarch64): Likewise and add "simd" and "fp" attributes. (*movdf_aarch64): Likewise. --- gcc/config/aarch64/aarch64.md | 33 - 1 file changed, 20 insertions(+), 13 deletions(-) diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 68676c9..416e065 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -1163,12 +1163,13 @@ ) (define_insn "*movhf_aarch64" - [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r") - (match_operand:HF 1 "general_operand" "?rY, w,w,m,w,m,rY,r"))] + [(set (match_operand:HF 0 "nonimmediate_operand" "=w, w,?r,w,w,m,r,m ,r") + (match_operand:HF 1 "general_operand" "?rY,Y, w,w,m,w,m,rY,r"))] "TARGET_FLOAT && (register_operand (operands[0], HFmode) || aarch64_reg_or_fp_zero (operands[1], HFmode))" "@ mov\\t%0.h[0], %w1 + movi\\t%0.4h, #0 umov\\t%w0, %1.h[0] mov\\t%0.h[0], %1.h[0] ldr\\t%h0, %1 @@ -1176,19 +1177,20 @@ ldrh\\t%w0, %1 strh\\t%w1, %0 mov\\t%w0, %w1" - [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\ + [(set_attr "type" "neon_from_gp,neon_move,neon_to_gp,neon_move,\ f_loads,f_stores,load1,store1,mov_reg") - (set_attr "simd" "yes,yes,yes,*,*,*,*,*") - (set_attr "fp" "*,*,*,yes,yes,*,*,*")] + (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*") + (set_attr "fp" "*,*,*,*,yes,yes,*,*,*")] ) (define_insn "*movsf_aarch64" - [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w ,w,m,r,m ,r") - (match_operand:SF 1 "general_operand" "?rY, w,w,Ufc,m,w,m,rY,r"))] + [(set (match_operand:SF 0 "nonimmediate_operand" "=w, w,?r,w,w ,w,m,r,m ,r") + (match_operand:SF 1 "general_operand" "?rY,Y, w,w,Ufc,m,w,m,rY,r"))] "TARGET_FLOAT && (register_operand (operands[0], SFmode) || aarch64_reg_or_fp_zero (operands[1], SFmode))" "@ fmov\\t%s0, %w1 + movi\\t%0.2s, #0 fmov\\t%w0, %s1 fmov\\t%s0, %s1 fmov\\t%s0, %1 @@ -1197,17 +1199,20 @@ ldr\\t%w0, %1 str\\t%w1, %0 mov\\t%w0, %w1" - [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\ - f_loads,f_stores,load1,store1,mov_reg")] + [(set_attr "type" "f_mcr,neon_move,f_mrc,fmov,fconsts,\ + f_loads,f_stores,load1,store1,mov_reg") + (set_attr "simd" "*,yes,*,*,*,*,*,*,*,*") + (set_attr "fp" "*,*,*,yes,yes,yes,yes,*,*,*")] ) (define_insn "*movdf_aarch64" - [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w ,w,m,r,m ,r") - (match_operand:DF 1 "general_operand" "?rY, w,w,Ufc,m,w,m,rY,r"))] + [(set (match_operand:DF 0 "nonimmediate_operand" "=w, w,?r,w,w ,w,m,r,m ,r") + (match_operand:DF 1 "general_operand" "?rY,Y, w,w,Ufc,m,w,m,rY,r"))] "TARGET_FLOAT && (register_operand (operands[0], DFmode) || aarch64_reg_or_fp_zero (operands[1], DFmode))" "@ fmov\\t%d0, %x1 + movi\\t%d0, #0 fmov\\t%x0, %d1 fmov\\t%d0, %d1 fmov\\t%d0, %1 @@ -1216,8 +1221,10 @@ ldr\\t%x0, %1 str\\t%x1, %0 mov\\t%x0, %x1" - [(set_attr "type" "f_mcr,f_mrc,fmov,fconstd,\ - f_loadd,f_stored,load1,store1,mov_reg")] + [(set_attr "type" "f_mcr,neon_move,f_mrc,fmov,fconstd,\ + f_loadd,f_stored,load1,store1,mov_reg") + (set_attr "simd" "*,yes,*,*,*,*,*,*,*,*") + (set_attr "fp" "*,*,*,yes,yes,yes,yes,*,*,*")] ) (define_insn "*movtf_aarch64" -- 2.6.3
Re: [PATCH][AArch64] Replace insn to zero up DF register
Evandro Menezeswrote: > > Please, verify the new "simd" and "fp" attributes for SF and DF. Both movsf and movdf should be: (set_attr "simd" "*,yes,*,*,*,*,*,*,*,*") (set_attr "fp" "*,*,*,yes,yes,yes,yes,*,*,*") Did you check that with -mcpu=generic+nosimd you get fmov s0, wzr? In my version I kept the Y on the fmov and placed the neon_mov first. Wilco
Re: [PATCH][AArch64] Replace insn to zero up DF register
On 02/26/16 06:37, Wilco Dijkstra wrote: Evandro Menezeswrote: I have a question though: is it necessary to add the "fp" and "simd" attributes to both movsf_aarch64 and movdf_aarch64 as well? You need at least the "simd" attribute, but providing "fp" as well is clearer (in principle the TARGET_FLOAT check in the pattern condition is redundant as a result, but the movhf and movtf patterns already do both). Also you want to use the smallest possible SIMD size as these are scalar operations and some microarchitectures execute 64-bit operations more efficiently than 128-bit ones, so: mov\\t%0.h[0], %w1 + movi\\t%0.4h, #0 umov\\t%w0, %1.h[0] fmov\\t%s0, %w1 + movi\\t%0.2s, #0 fmov\\t%w0, %s1 With those changes it should be ready for commit once you get the OK from James/Marcus. Replace insn to zero up SIMD registers gcc/ * config/aarch64/aarch64.md (*movhf_aarch64): Add "movi %0, #0" to zero up register. (*movsf_aarch64): Likewise and add "simd" and "fp" attributes. (*movdf_aarch64): Likewise. Please, verify the new "simd" and "fp" attributes for SF and DF. Thank you, -- Evandro Menezes >From d447411ea85f065faa77d56cb143eb07a467 Mon Sep 17 00:00:00 2001 From: Evandro Menezes Date: Mon, 19 Oct 2015 18:31:48 -0500 Subject: [PATCH] Replace insn to zero up SIMD registers gcc/ * config/aarch64/aarch64.md (*movhf_aarch64): Add "movi %0, #0" to zero up register. (*movsf_aarch64): Likewise and add "simd" and "fp" attributes. (*movdf_aarch64): Likewise. --- gcc/config/aarch64/aarch64.md | 33 - 1 file changed, 20 insertions(+), 13 deletions(-) diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 68676c9..a5661ed 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -1163,12 +1163,13 @@ ) (define_insn "*movhf_aarch64" - [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r") - (match_operand:HF 1 "general_operand" "?rY, w,w,m,w,m,rY,r"))] + [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w,?r,w,w,m,r,m ,r") + (match_operand:HF 1 "general_operand" "?r,Y, w,w,m,w,m,rY,r"))] "TARGET_FLOAT && (register_operand (operands[0], HFmode) || aarch64_reg_or_fp_zero (operands[1], HFmode))" "@ mov\\t%0.h[0], %w1 + movi\\t%0.4h, #0 umov\\t%w0, %1.h[0] mov\\t%0.h[0], %1.h[0] ldr\\t%h0, %1 @@ -1176,19 +1177,20 @@ ldrh\\t%w0, %1 strh\\t%w1, %0 mov\\t%w0, %w1" - [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\ + [(set_attr "type" "neon_from_gp,neon_move,neon_to_gp,neon_move,\ f_loads,f_stores,load1,store1,mov_reg") - (set_attr "simd" "yes,yes,yes,*,*,*,*,*") - (set_attr "fp" "*,*,*,yes,yes,*,*,*")] + (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*") + (set_attr "fp" "*,*,*,*,yes,yes,*,*,*")] ) (define_insn "*movsf_aarch64" - [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w ,w,m,r,m ,r") - (match_operand:SF 1 "general_operand" "?rY, w,w,Ufc,m,w,m,rY,r"))] + [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w,?r,w,w ,w,m,r,m ,r") + (match_operand:SF 1 "general_operand" "?r,Y, w,w,Ufc,m,w,m,rY,r"))] "TARGET_FLOAT && (register_operand (operands[0], SFmode) || aarch64_reg_or_fp_zero (operands[1], SFmode))" "@ fmov\\t%s0, %w1 + movi\\t%0.2s, #0 fmov\\t%w0, %s1 fmov\\t%s0, %s1 fmov\\t%s0, %1 @@ -1197,17 +1199,20 @@ ldr\\t%w0, %1 str\\t%w1, %0 mov\\t%w0, %w1" - [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\ - f_loads,f_stores,load1,store1,mov_reg")] + [(set_attr "type" "f_mcr,neon_move,f_mrc,fmov,fconsts,\ + f_loads,f_stores,load1,store1,mov_reg") + (set_attr "simd" "yes,yes,yes,yes,yes,*,*,*,*,*") + (set_attr "fp" "yes,*,yes,yes,yes,yes,yes,*,*,yes")] ) (define_insn "*movdf_aarch64" - [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w ,w,m,r,m ,r") - (match_operand:DF 1 "general_operand" "?rY, w,w,Ufc,m,w,m,rY,r"))] + [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w,?r,w,w ,w,m,r,m ,r") + (match_operand:DF 1 "general_operand" "?r,Y, w,w,Ufc,m,w,m,rY,r"))] "TARGET_FLOAT && (register_operand (operands[0], DFmode) || aarch64_reg_or_fp_zero (operands[1], DFmode))" "@ fmov\\t%d0, %x1 + movi\\t%d0, #0 fmov\\t%x0, %d1 fmov\\t%d0, %d1 fmov\\t%d0, %1 @@ -1216,8 +1221,10 @@ ldr\\t%x0, %1 str\\t%x1, %0 mov\\t%x0, %x1" - [(set_attr "type" "f_mcr,f_mrc,fmov,fconstd,\ - f_loadd,f_stored,load1,store1,mov_reg")] + [(set_attr "type" "f_mcr,neon_move,f_mrc,fmov,fconstd,\ + f_loadd,f_stored,load1,store1,mov_reg") + (set_attr "simd" "yes,yes,yes,yes,yes,*,*,*,*,*") + (set_attr "fp" "yes,*,yes,yes,yes,yes,yes,*,*,yes")] ) (define_insn "*movtf_aarch64" -- 2.6.3
Re: [PATCH][AArch64] Replace insn to zero up DF register
Evandro Menezeswrote: > > I have a question though: is it necessary to add the "fp" and "simd" > attributes to both movsf_aarch64 and movdf_aarch64 as well? You need at least the "simd" attribute, but providing "fp" as well is clearer (in principle the TARGET_FLOAT check in the pattern condition is redundant as a result, but the movhf and movtf patterns already do both). Also you want to use the smallest possible SIMD size as these are scalar operations and some microarchitectures execute 64-bit operations more efficiently than 128-bit ones, so: mov\\t%0.h[0], %w1 + movi\\t%0.4h, #0 umov\\t%w0, %1.h[0] fmov\\t%s0, %w1 + movi\\t%0.2s, #0 fmov\\t%w0, %s1 With those changes it should be ready for commit once you get the OK from James/Marcus. Wilco
Re: [PATCH][AArch64] Replace insn to zero up DF register
On 01/22/16 07:52, Wilco Dijkstra wrote: On 12/16/2015 03:30 PM, Evandro Menezes wrote: On 10/30/2015 05:24 AM, Marcus Shawcroft wrote: On 20 October 2015 at 00:40, Evandro Menezeswrote: In the existing targets, it seems that it's always faster to zero up a DF register with "movi %d0, #0" instead of "fmov %d0, xzr". This patch modifies the respective pattern. Hi Evandro, This patch changes the generic, u architecture independent instruction selection. The ARM ARM (C3.5.3) makes a specific recommendation about the choice of instruction in this situation and the current implementation in GCC follows that recommendation. Wilco has also picked up on this issue he has the same patch internal to ARM along with an ongoing discussion with ARM architecture folk regarding this recommendation. I'm reluctant to take this patch right now on the basis that it runs contrary to ARM ARM recommendation pending the conclusion of Wilco's discussion with ARM architecture folk. Have you had a chance to discuss this internally further? Yes, it was decided to remove the recommendation from future ARM ARM's. Several review comments on your patch (https://patchwork.ozlabs.org/patch/532736): * This should be added to movhf, movsf and movdf - movtf already does this. * It is important to set the "fp" and "simd" attributes so that the movi variant can only be selected if it is available. Hi, Wilco. 2016-01-27 Evandro Menezes Replace insn to zero up SIMD registers gcc/ * config/aarch64/aarch64.md (*movhf_aarch64): Add "movi %0, #0" to zero up register. (*movsf_aarch64): Likewise. (*movdf_aarch64): Likewise. When this decision is final, methinks that this patch would be close to addressing this change. I have a question though: is it necessary to add the "fp" and "simd" attributes to both movsf_aarch64 and movdf_aarch64 as well? Thank you, -- Evandro Menezes >From b390d411b56cfcdedf4601d0487baf1ef84e79ab Mon Sep 17 00:00:00 2001 From: Evandro Menezes Date: Mon, 19 Oct 2015 18:31:48 -0500 Subject: [PATCH] Replace insn to zero up SIMD registers gcc/ * config/aarch64/aarch64.md (*movhf_aarch64): Add "movi %0, #0" to zero up register. (*movsf_aarch64): Likewise. (*movdf_aarch64): Likewise. --- gcc/config/aarch64/aarch64.md | 25 ++--- 1 file changed, 14 insertions(+), 11 deletions(-) diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index f6c8eb1..43a3854 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -1164,64 +1164,67 @@ operands[1] = force_reg (mode, operands[1]); } ) (define_insn "*movhf_aarch64" - [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r") - (match_operand:HF 1 "general_operand" "?rY, w,w,m,w,m,rY,r"))] + [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w,?r,w,w,m,r,m ,r") + (match_operand:HF 1 "general_operand" "?r,Y, w,w,m,w,m,rY,r"))] "TARGET_FLOAT && (register_operand (operands[0], HFmode) || aarch64_reg_or_fp_zero (operands[1], HFmode))" "@ mov\\t%0.h[0], %w1 + movi\\t%0.8h, #0 umov\\t%w0, %1.h[0] mov\\t%0.h[0], %1.h[0] ldr\\t%h0, %1 str\\t%h1, %0 ldrh\\t%w0, %1 strh\\t%w1, %0 mov\\t%w0, %w1" - [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\ + [(set_attr "type" "neon_from_gp,neon_move,neon_to_gp,neon_move,\ f_loads,f_stores,load1,store1,mov_reg") - (set_attr "simd" "yes,yes,yes,*,*,*,*,*") - (set_attr "fp" "*,*,*,yes,yes,*,*,*")] + (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*") + (set_attr "fp" "*,*,*,*,yes,yes,*,*,*")] ) (define_insn "*movsf_aarch64" - [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w ,w,m,r,m ,r") - (match_operand:SF 1 "general_operand" "?rY, w,w,Ufc,m,w,m,rY,r"))] + [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w,?r,w,w ,w,m,r,m ,r") + (match_operand:SF 1 "general_operand" "?r,Y, w,w,Ufc,m,w,m,rY,r"))] "TARGET_FLOAT && (register_operand (operands[0], SFmode) || aarch64_reg_or_fp_zero (operands[1], SFmode))" "@ fmov\\t%s0, %w1 + movi\\t%0.4s, #0 fmov\\t%w0, %s1 fmov\\t%s0, %s1 fmov\\t%s0, %1 ldr\\t%s0, %1 str\\t%s1, %0 ldr\\t%w0, %1 str\\t%w1, %0 mov\\t%w0, %w1" - [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\ + [(set_attr "type" "f_mcr,neon_move,f_mrc,fmov,fconsts,\ f_loads,f_stores,load1,store1,mov_reg")] ) (define_insn "*movdf_aarch64" - [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w ,w,m,r,m ,r") - (match_operand:DF 1 "general_operand" "?rY, w,w,Ufc,m,w,m,rY,r"))] + [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w,?r,w,w
Re: [PATCH][AArch64] Replace insn to zero up DF register
On 12/16/2015 03:30 PM, Evandro Menezes wrote: > > On 10/30/2015 05:24 AM, Marcus Shawcroft wrote: > > On 20 October 2015 at 00:40, Evandro Menezes>wrote: > > In the existing targets, it seems that it's always faster to zero >up a DF > > register with "movi %d0, #0" instead of "fmov %d0, xzr". > > This patch modifies the respective pattern. > > > Hi Evandro, > > This patch changes the generic, u architecture independent instruction > selection. The ARM ARM (C3.5.3) makes a specific recommendation about > the choice of instruction in this situation and the current > implementation in GCC follows that recommendation. Wilco has also > picked up on this issue he has the same patch internal to ARM along > with an ongoing discussion with ARM architecture folk regarding this > recommendation. I'm reluctant to take this patch right now on the > basis that it runs contrary to ARM ARM recommendation pending the > conclusion of Wilco's discussion with ARM architecture folk. > > > Have you had a chance to discuss this internally further? Yes, it was decided to remove the recommendation from future ARM ARM's. Several review comments on your patch (https://patchwork.ozlabs.org/patch/532736): * This should be added to movhf, movsf and movdf - movtf already does this. * It is important to set the "fp" and "simd" attributes so that the movi variant can only be selected if it is available. Cheers, Wilco
Re: [PATCH][AArch64] Replace insn to zero up DF register
On 12/16/2015 03:30 PM, Evandro Menezes wrote: On 10/30/2015 05:24 AM, Marcus Shawcroft wrote: On 20 October 2015 at 00:40, Evandro Menezeswrote: In the existing targets, it seems that it's always faster to zero up a DF register with "movi %d0, #0" instead of "fmov %d0, xzr". This patch modifies the respective pattern. Hi Evandro, This patch changes the generic, u architecture independent instruction selection. The ARM ARM (C3.5.3) makes a specific recommendation about the choice of instruction in this situation and the current implementation in GCC follows that recommendation. Wilco has also picked up on this issue he has the same patch internal to ARM along with an ongoing discussion with ARM architecture folk regarding this recommendation. I'm reluctant to take this patch right now on the basis that it runs contrary to ARM ARM recommendation pending the conclusion of Wilco's discussion with ARM architecture folk. Have you had a chance to discuss this internally further? Ping. -- Evandro Menezes
Re: [PATCH][AArch64] Replace insn to zero up DF register
On 10/30/2015 05:24 AM, Marcus Shawcroft wrote: On 20 October 2015 at 00:40, Evandro Menezeswrote: In the existing targets, it seems that it's always faster to zero up a DF register with "movi %d0, #0" instead of "fmov %d0, xzr". This patch modifies the respective pattern. Hi Evandro, This patch changes the generic, u architecture independent instruction selection. The ARM ARM (C3.5.3) makes a specific recommendation about the choice of instruction in this situation and the current implementation in GCC follows that recommendation. Wilco has also picked up on this issue he has the same patch internal to ARM along with an ongoing discussion with ARM architecture folk regarding this recommendation. I'm reluctant to take this patch right now on the basis that it runs contrary to ARM ARM recommendation pending the conclusion of Wilco's discussion with ARM architecture folk. Marcus, Have you had a chance to discuss this internally further? Thank you, -- Evandro Menezes
Re: [PATCH][AArch64] Replace insn to zero up DF register
On 11/09/2015 04:59 PM, Evandro Menezes wrote: Hi, Marcus. Have you an update from the architecture folks about this? Thank you, Marcus? -- Evandro Menezes
Re: [PATCH][AArch64] Replace insn to zero up DF register
On 10/30/2015 05:24 AM, Marcus Shawcroft wrote: On 20 October 2015 at 00:40, Evandro Menezeswrote: In the existing targets, it seems that it's always faster to zero up a DF register with "movi %d0, #0" instead of "fmov %d0, xzr". This patch modifies the respective pattern. Hi Evandro, This patch changes the generic, u architecture independent instruction selection. The ARM ARM (C3.5.3) makes a specific recommendation about the choice of instruction in this situation and the current implementation in GCC follows that recommendation. Wilco has also picked up on this issue he has the same patch internal to ARM along with an ongoing discussion with ARM architecture folk regarding this recommendation. I'm reluctant to take this patch right now on the basis that it runs contrary to ARM ARM recommendation pending the conclusion of Wilco's discussion with ARM architecture folk. Ping. -- Evandro Menezes
Re: [PATCH][AArch64] Replace insn to zero up DF register
Hi, Marcus. Have you an update from the architecture folks about this? Thank you, -- Evandro Menezes On 10/30/2015 05:24 AM, Marcus Shawcroft wrote: On 20 October 2015 at 00:40, Evandro Menezeswrote: In the existing targets, it seems that it's always faster to zero up a DF register with "movi %d0, #0" instead of "fmov %d0, xzr". This patch modifies the respective pattern. Hi Evandro, This patch changes the generic, u architecture independent instruction selection. The ARM ARM (C3.5.3) makes a specific recommendation about the choice of instruction in this situation and the current implementation in GCC follows that recommendation. Wilco has also picked up on this issue he has the same patch internal to ARM along with an ongoing discussion with ARM architecture folk regarding this recommendation. I'm reluctant to take this patch right now on the basis that it runs contrary to ARM ARM recommendation pending the conclusion of Wilco's discussion with ARM architecture folk. Cheers /Marcus
Re: [PATCH][AArch64] Replace insn to zero up DF register
On 20 October 2015 at 00:40, Evandro Menezeswrote: > In the existing targets, it seems that it's always faster to zero up a DF > register with "movi %d0, #0" instead of "fmov %d0, xzr". > > This patch modifies the respective pattern. Hi Evandro, This patch changes the generic, u architecture independent instruction selection. The ARM ARM (C3.5.3) makes a specific recommendation about the choice of instruction in this situation and the current implementation in GCC follows that recommendation. Wilco has also picked up on this issue he has the same patch internal to ARM along with an ongoing discussion with ARM architecture folk regarding this recommendation. I'm reluctant to take this patch right now on the basis that it runs contrary to ARM ARM recommendation pending the conclusion of Wilco's discussion with ARM architecture folk. Cheers /Marcus
Re: [PATCH][AArch64] Replace insn to zero up DF register
Ping. -- Evandro Menezes On 10/20/2015 09:27 AM, Andrew Pinski wrote: On Tue, Oct 20, 2015 at 7:59 AM, Andrew Pinskiwrote: On Tue, Oct 20, 2015 at 7:51 AM, Andrew Pinski wrote: On Tue, Oct 20, 2015 at 7:40 AM, Evandro Menezes wrote: In the existing targets, it seems that it's always faster to zero up a DF register with "movi %d0, #0" instead of "fmov %d0, xzr". I think for ThunderX 1, this change will not make a difference. So I am neutral on this change. Actually depending on fmov is decoded in our pipeline, this change might actually be worse. Currently fmov with an immediate is 1 cycle while movi is two cycles. Let me double check how internally on how it is decoded and if it is 1 cycle or two. Ok, my objections are removed as I talked with the architectures here at Cavium and using movi is better in this case. Thanks, Andrew Thanks, Andrew Thanks, Andrew This patch modifies the respective pattern. Please, commit if it's alright. Thank you, -- Evandro Menezes
Re: [PATCH][AArch64] Replace insn to zero up DF register
On Tue, Oct 20, 2015 at 7:59 AM, Andrew Pinskiwrote: > On Tue, Oct 20, 2015 at 7:51 AM, Andrew Pinski wrote: >> On Tue, Oct 20, 2015 at 7:40 AM, Evandro Menezes >> wrote: >>> In the existing targets, it seems that it's always faster to zero up a DF >>> register with "movi %d0, #0" instead of "fmov %d0, xzr". >> >> I think for ThunderX 1, this change will not make a difference. So I >> am neutral on this change. > > Actually depending on fmov is decoded in our pipeline, this change > might actually be worse. Currently fmov with an immediate is 1 cycle > while movi is two cycles. Let me double check how internally on how > it is decoded and if it is 1 cycle or two. Ok, my objections are removed as I talked with the architectures here at Cavium and using movi is better in this case. Thanks, Andrew > > Thanks, > Andrew > >> >> Thanks, >> Andrew >> >>> >>> This patch modifies the respective pattern. >>> >>> Please, commit if it's alright. >>> >>> Thank you, >>> >>> -- >>> Evandro Menezes >>>
Re: [PATCH][AArch64] Replace insn to zero up DF register
On Tue, Oct 20, 2015 at 7:40 AM, Evandro Menezeswrote: > In the existing targets, it seems that it's always faster to zero up a DF > register with "movi %d0, #0" instead of "fmov %d0, xzr". I think for ThunderX 1, this change will not make a difference. So I am neutral on this change. Thanks, Andrew > > This patch modifies the respective pattern. > > Please, commit if it's alright. > > Thank you, > > -- > Evandro Menezes >
Re: [PATCH][AArch64] Replace insn to zero up DF register
On Tue, Oct 20, 2015 at 7:51 AM, Andrew Pinskiwrote: > On Tue, Oct 20, 2015 at 7:40 AM, Evandro Menezes > wrote: >> In the existing targets, it seems that it's always faster to zero up a DF >> register with "movi %d0, #0" instead of "fmov %d0, xzr". > > I think for ThunderX 1, this change will not make a difference. So I > am neutral on this change. Actually depending on fmov is decoded in our pipeline, this change might actually be worse. Currently fmov with an immediate is 1 cycle while movi is two cycles. Let me double check how internally on how it is decoded and if it is 1 cycle or two. Thanks, Andrew > > Thanks, > Andrew > >> >> This patch modifies the respective pattern. >> >> Please, commit if it's alright. >> >> Thank you, >> >> -- >> Evandro Menezes >>