[Patch ARM] Allow auto-vectorizer to use vfma.
Hi,

This allows the auto-vectorizer to use vfma under -Ofast or -ffast-math. I have a follow-up patch which will add support for these from arm_neon.h as well, before someone asks. It's being regression tested as we speak and will follow shortly.

Tested on A15 silicon native with no regressions. Committed.

regards,
Ramana

2012-09-11  Ramana Radhakrishnan  <ramana.radhakrish...@arm.com>
	    Matthew Gretton-Dann  <matthew.gretton-d...@arm.com>

	* config/arm/neon.md (fma<VCVTF:mode>4): New pattern.
	(*fmsub<VCVTF:mode>4): Likewise.
	* doc/sourcebuild.texi (arm_neon_v2_ok, arm_neon_v2_hw): Document it.

2012-09-11  Ramana Radhakrishnan  <ramana.radhakrish...@arm.com>
	    Matthew Gretton-Dann  <matthew.gretton-d...@arm.com>

	* gcc.target/arm/neon-vfma-1.c: New testcase.
	* gcc.target/arm/neon-vfms-1.c: Likewise.
	* gcc.target/arm/neon-vmla-1.c: Update test to use int instead of float.
	* gcc.target/arm/neon-vmls-1.c: Likewise.
	* lib/target-supports.exp (add_options_for_arm_neonv2): New function.
	(check_effective_target_arm_neonv2_ok_nocache): Likewise.
	(check_effective_target_arm_neonv2_ok): Likewise.
	(check_effective_target_arm_neonv2_hw): Likewise.
	(check_effective_target_arm_neonv2): Likewise.

diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index a929546..4821bb7 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -707,6 +707,33 @@
                     (const_string "neon_mla_qqq_32_qqd_32_scalar")]
 )
 
+;; Fused multiply-accumulate
+(define_insn "fma<VCVTF:mode>4"
+  [(set (match_operand:VCVTF 0 "register_operand" "=w")
+        (fma:VCVTF (match_operand:VCVTF 1 "register_operand" "w")
+                   (match_operand:VCVTF 2 "register_operand" "w")
+                   (match_operand:VCVTF 3 "register_operand" "0")))]
+  "TARGET_NEON && TARGET_FMA && flag_unsafe_math_optimizations"
+  "vfma%?.<V_if_elem>\\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
+  [(set (attr "neon_type")
+        (if_then_else (match_test "<Is_d_reg>")
+                      (const_string "neon_fp_vmla_ddd")
+                      (const_string "neon_fp_vmla_qqq")))]
+)
+
+(define_insn "*fmsub<VCVTF:mode>4"
+  [(set (match_operand:VCVTF 0 "register_operand" "=w")
+        (fma:VCVTF (neg:VCVTF (match_operand:VCVTF 1 "register_operand" "w"))
+                   (match_operand:VCVTF 2 "register_operand" "w")
+                   (match_operand:VCVTF 3 "register_operand" "0")))]
+  "TARGET_NEON && TARGET_FMA && flag_unsafe_math_optimizations"
+  "vfms%?.<V_if_elem>\\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
+  [(set (attr "neon_type")
+        (if_then_else (match_test "<Is_d_reg>")
+                      (const_string "neon_fp_vmla_ddd")
+                      (const_string "neon_fp_vmla_qqq")))]
+)
+
 (define_insn "ior<mode>3"
   [(set (match_operand:VDQ 0 "s_register_operand" "=w,w")
         (ior:VDQ (match_operand:VDQ 1 "s_register_operand" "w,0")
diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 7e9dbe3..3fe52ad 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1525,11 +1525,19 @@ ARM target supports generating NEON instructions.
 @item arm_neon_hw
 Test system supports executing NEON instructions.
 
+@item arm_neonv2_hw
+Test system supports executing NEON v2 instructions.
+
 @item arm_neon_ok
 @anchor{arm_neon_ok}
 ARM Target supports @code{-mfpu=neon -mfloat-abi=softfp} or compatible
 options.  Some multilibs may be incompatible with these options.
 
+@item arm_neonv2_ok
+@anchor{arm_neon_ok}
+ARM Target supports @code{-mfpu=neon -mfloat-abi=softfp} or compatible
+options.  Some multilibs may be incompatible with these options.
+
 @item arm_neon_fp16_ok
 @anchor{arm_neon_fp16_ok}
 ARM Target supports @code{-mfpu=neon-fp16 -mfloat-abi=softfp} or compatible
diff --git a/gcc/testsuite/gcc.target/arm/neon-vfma-1.c b/gcc/testsuite/gcc.target/arm/neon-vfma-1.c
new file mode 100644
index 000..a003a82
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon-vfma-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neonv2_ok } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+/* { dg-add-options arm_neonv2 } */
+/* { dg-final { scan-assembler "vfma\\.f32\[ \]+\[dDqQ]" } } */
+
+/* Verify that VFMA is used.  */
+void f1(int n, float a, float x[], float y[]) {
+  int i;
+  for (i = 0; i < n; ++i)
+    y[i] = a * x[i] + y[i];
+}
diff --git a/gcc/testsuite/gcc.target/arm/neon-vfms-1.c b/gcc/testsuite/gcc.target/arm/neon-vfms-1.c
new file mode 100644
index 000..8cefd8a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon-vfms-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neonv2_ok } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+/* { dg-add-options arm_neonv2 } */
+/* { dg-final { scan-assembler "vfms\\.f32\[ \]+\[dDqQ]" } } */
+
+/* Verify that VFMS is used.  */
+void f1(int n, float a, float x[], float y[]) {
+  int i;
+  for (i = 0; i < n; ++i)
+    y[i] = a * -x[i] + y[i];
+}
diff --git a/gcc/testsuite/gcc.target/arm/neon-vmla-1.c b/gcc/testsuite/gcc.target/arm/neon-vmla-1.c
index
Re: [Patch ARM] Allow auto-vectorizer to use vfma.
Hi,

your patch broke bootstrapping here:

/home/tob/projects/gcc-git/gcc/gcc/doc//sourcebuild.texi:1537: Node `arm_neon_ok' previously defined at line 1532.

(Sorry for only complaining about those issues today.)

Tobias

On 09/11/2012 02:54 PM, Ramana Radhakrishnan wrote:
> Hi, This allows the auto-vectorizer to use vfma under -Ofast or -ffast-math. I have a follow-up patch which will add support for these from arm_neon.h as well before someone asks. It's being regression tested as we speak and that'll follow shortly. Tested on A15 silicon native with no regressions. Committed.
Re: [Patch ARM] Allow auto-vectorizer to use vfma.
> your patch broke bootstrapping here:
> /home/tob/projects/gcc-git/gcc/gcc/doc//sourcebuild.texi:1537: Node `arm_neon_ok' previously defined at line 1532.
>
> (Sorry for only complaining about those issues today.)

No need to feel sorry about that. It is Really Bad that people apparently don't test their patches properly.

Ciao!
Steven
Re: [Patch ARM] Allow auto-vectorizer to use vfma.
On 09/11/2012 03:08 PM, Tobias Burnus wrote:
> your patch broke bootstrapping here:
> /home/tob/projects/gcc-git/gcc/gcc/doc//sourcebuild.texi:1537: Node `arm_neon_ok' previously defined at line 1532.

I fixed it (Rev. 191181) with the attached patch. arm_neon_ok should have been arm_neon2_ok. (I also changed spaces into tabs in the ChangeLog.)

Tobias

PS: Fortunately, documentation changes do not require an all-language bootstrap.

On 09/11/2012 02:54 PM, Ramana Radhakrishnan wrote:
> Hi, This allows the auto-vectorizer to use vfma under -Ofast or -ffast-math. I have a follow-up patch which will add support for these from arm_neon.h as well before someone asks. It's being regression tested as we speak and that'll follow shortly. Tested on A15 silicon native with no regressions. Committed.

Index: gcc/ChangeLog
===================================================================
--- gcc/ChangeLog	(revision 191180)
+++ gcc/ChangeLog	(working copy)
@@ -1,9 +1,13 @@
+2012-09-11  Tobias Burnus  <bur...@net-b.de>
+
+	* doc/sourcebuild.texi (arm_neon_v2_ok): Fix @anchor.
+
 2012-09-11  Ramana Radhakrishnan  <ramana.radhakrish...@arm.com>
-            Matthew Gretton-Dann  <matthew.gretton-d...@arm.com>
+	    Matthew Gretton-Dann  <matthew.gretton-d...@arm.com>
 
-        * config/arm/neon.md (fma<VCVTF:mode>4): New pattern.
-        (*fmsub<VCVTF:mode>4): Likewise.
-        * doc/sourcebuild.texi (arm_neon_v2_ok, arm_neon_v2_hw): Document it.
+	* config/arm/neon.md (fma<VCVTF:mode>4): New pattern.
+	(*fmsub<VCVTF:mode>4): Likewise.
+	* doc/sourcebuild.texi (arm_neon_v2_ok, arm_neon_v2_hw): Document it.
 
 2012-09-11  Aldy Hernandez  <al...@redhat.com>
 
Index: gcc/doc/sourcebuild.texi
===================================================================
--- gcc/doc/sourcebuild.texi	(revision 191180)
+++ gcc/doc/sourcebuild.texi	(working copy)
@@ -1534,7 +1534,7 @@ ARM Target supports @code{-mfpu=neon -mfloat-abi=s
 options.  Some multilibs may be incompatible with these options.
 
 @item arm_neonv2_ok
-@anchor{arm_neon_ok}
+@anchor{arm_neon2_ok}
 ARM Target supports @code{-mfpu=neon -mfloat-abi=softfp} or compatible
 options.  Some multilibs may be incompatible with these options.
 
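The bootstrap failure above came from two @anchor commands carrying the same name. As a rough illustration of that kind of check, here is a small C sketch that counts repeated @anchor{NAME} entries in a piece of Texinfo text. The helper count_duplicate_anchors and its fixed-size tables are hypothetical and much simpler than whatever makeinfo really does (it ignores @node names and Texinfo escapes).

```c
#include <stddef.h>
#include <string.h>

#define MAX_ANCHORS 64   /* arbitrary limits for this sketch */
#define MAX_NAME    64

/* Return how many @anchor{NAME} commands in TEXT reuse a NAME that an
   earlier @anchor in the same text already defined.  */
int count_duplicate_anchors(const char *text)
{
    static const char tag[] = "@anchor{";
    char seen[MAX_ANCHORS][MAX_NAME];
    int nseen = 0;
    int dups = 0;

    for (const char *p = strstr(text, tag); p != NULL; p = strstr(p, tag)) {
        p += sizeof tag - 1;              /* step past "@anchor{" */
        const char *end = strchr(p, '}');
        if (end == NULL)
            break;                        /* unterminated command: stop */
        size_t len = (size_t)(end - p);
        if (len >= MAX_NAME)
            continue;                     /* too long for this sketch */

        int found = 0;
        for (int i = 0; i < nseen; ++i)
            if (strlen(seen[i]) == len && memcmp(seen[i], p, len) == 0)
                found = 1;

        if (found) {
            ++dups;
        } else if (nseen < MAX_ANCHORS) {
            memcpy(seen[nseen], p, len);  /* remember the new name */
            seen[nseen][len] = '\0';
            ++nseen;
        }
    }
    return dups;
}
```

Run over the two @item entries from the original patch, which both use @anchor{arm_neon_ok}, this reports one duplicate; with distinct anchor names it reports none.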
Re: [Patch ARM] Allow auto-vectorizer to use vfma.
On 09/11/12 14:17, Tobias Burnus wrote:
> On 09/11/2012 03:08 PM, Tobias Burnus wrote:
>> your patch broke bootstrapping here:
>> /home/tob/projects/gcc-git/gcc/gcc/doc//sourcebuild.texi:1537: Node `arm_neon_ok' previously defined at line 1532.
>
> I fixed it (Rev. 191181) with the attached patch. arm_neon_ok should have been arm_neon2_ok. (I also changed spaces into tabs in the ChangeLog.)

Nearly: should be arm_neonv2_ok rather than arm_neon_ok. I've realized another issue with the command line and committed this as obvious after checking that the documentation built fine.

Thanks and apologies for the slip-up. I've changed machines recently and something's not OK in this new setup.

regards,
Ramana

2012-09-11  Ramana Radhakrishnan  <ramana.radhakrish...@arm.com>

	* doc/sourcebuild.texi (arm_neon_v2_ok): Adjust command line.

Index: gcc/doc/sourcebuild.texi
===================================================================
--- gcc/doc/sourcebuild.texi	(revision 191181)
+++ gcc/doc/sourcebuild.texi	(revision 191182)
@@ -1534,8 +1534,8 @@ ARM Target supports @code{-mfpu=neon -mf
 options.  Some multilibs may be incompatible with these options.
 
 @item arm_neonv2_ok
-@anchor{arm_neon2_ok}
-ARM Target supports @code{-mfpu=neon -mfloat-abi=softfp} or compatible
+@anchor{arm_neonv2_ok}
+ARM Target supports @code{-mfpu=neon-vfpv4 -mfloat-abi=softfp} or compatible
 options.  Some multilibs may be incompatible with these options.
 
 @item arm_neon_fp16_ok
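Since the thread mentions both the -mfpu=neon-vfpv4 option and a follow-up for arm_neon.h, here is a hedged sketch of how user code might use a fused multiply-accumulate intrinsic when it is available and fall back to plain C otherwise. It is not the follow-up patch. It assumes a compiler that defines the ACLE macros __ARM_NEON (or the older __ARM_NEON__) and __ARM_FEATURE_FMA and provides vfmaq_f32 in arm_neon.h; the function name axpy and the HAVE_NEON_FMA macro are made up for illustration.

```c
#include <stddef.h>

#if (defined(__ARM_NEON) || defined(__ARM_NEON__)) && defined(__ARM_FEATURE_FMA)
#include <arm_neon.h>
#define HAVE_NEON_FMA 1   /* hypothetical local macro */
#else
#define HAVE_NEON_FMA 0
#endif

/* y[i] += a * x[i], four lanes at a time when NEON FMA is available.  */
void axpy(size_t n, float a, const float *x, float *y)
{
    size_t i = 0;
#if HAVE_NEON_FMA
    const float32x4_t va = vdupq_n_f32(a);
    for (; i + 4 <= n; i += 4) {
        float32x4_t vx = vld1q_f32(x + i);
        float32x4_t vy = vld1q_f32(y + i);
        /* vfmaq_f32(acc, b, c) computes acc + b * c in each lane.  */
        vst1q_f32(y + i, vfmaq_f32(vy, va, vx));
    }
#endif
    /* Scalar tail, or the whole loop when NEON FMA is not available.  */
    for (; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

On an ARM target this would typically be built with something like -mfpu=neon-vfpv4 -mfloat-abi=softfp, the options named in the corrected documentation; on any other target the scalar loop is used and the file still compiles.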