[PATCH] D26858: [AArch64] Don't constrain the assembler when using -mgeneral-regs-only

2016-11-18 Thread silviu.bara...@arm.com via cfe-commits
sbaranga created this revision.
sbaranga added reviewers: jmolloy, rengolin, t.p.northover.
sbaranga added a subscriber: cfe-commits.
Herald added a subscriber: aemerson.

We use the neonasm, cryptoasm, fp-armv8asm and fullfp16asm features
to keep assembling instructions that would otherwise be rejected
once the neon, crypto and fp-armv8 features are disabled.

This makes the -mgeneral-regs-only behaviour compatible with gcc.

Fixes https://llvm.org/bugs/show_bug.cgi?id=30792
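
The "last occurrence wins" lookup described above can be sketched outside of clang as follows. This is only an illustration under stated assumptions: applyGeneralRegsOnly is a hypothetical name, and std::map and std::string stand in for the llvm::StringMap and StringRef types used by the driver.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-alone model of the driver logic in this patch.
void applyGeneralRegsOnly(std::vector<std::string> &Features) {
  // Record the index of the last occurrence of each feature name.
  std::map<std::string, std::size_t> LastOpt;
  for (std::size_t I = 0, N = Features.size(); I < N; ++I) {
    const std::string &Name = Features[I];
    assert(!Name.empty() && (Name[0] == '-' || Name[0] == '+'));
    LastOpt[Name.substr(1)] = I;
  }

  // If the last mention of a feature enabled it, keep the matching
  // assembler-only feature enabled.
  for (const char *Feat : {"neon", "crypto", "fp-armv8", "fullfp16"}) {
    const std::string Enabled = std::string("+") + Feat;
    auto It = LastOpt.find(Feat);
    if (It != LastOpt.end() && Features[It->second] == Enabled)
      Features.push_back(Enabled + "asm");
  }

  // Code generation itself is still restricted to the general registers.
  Features.push_back("-fp-armv8");
  Features.push_back("-neon");
  Features.push_back("-crypto");
}
```

With an input of "+neon", "+crypto", "-crypto", only "+neonasm" is added, because the last mention of crypto disables it.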


https://reviews.llvm.org/D26858

Files:
  docs/UsersManual.rst
  lib/Driver/Tools.cpp
  test/Driver/aarch64-mgeneral_regs_only.c


Index: lib/Driver/Tools.cpp
===
--- lib/Driver/Tools.cpp
+++ lib/Driver/Tools.cpp
@@ -2623,9 +2623,33 @@
 D.Diag(diag::err_drv_clang_unsupported) << A->getAsString(Args);
 
   if (Args.getLastArg(options::OPT_mgeneral_regs_only)) {
+// Find the last of each feature.
+llvm::StringMap<unsigned> LastOpt;
+for (unsigned I = 0, N = Features.size(); I < N; ++I) {
+  StringRef Name = Features[I];
+  assert(Name[0] == '-' || Name[0] == '+');
+  LastOpt[Name.drop_front(1)] = I;
+}
+
+llvm::StringMap<unsigned>::iterator I = LastOpt.find("neon");
+if (I != LastOpt.end() && Features[I->second] == "+neon")
+  Features.push_back("+neonasm");
+
+I = LastOpt.find("crypto");
+if (I != LastOpt.end() && Features[I->second] == "+crypto")
+  Features.push_back("+cryptoasm");
+
+I = LastOpt.find("fp-armv8");
+if (I != LastOpt.end() && Features[I->second] == "+fp-armv8")
+  Features.push_back("+fp-armv8asm");
+
+I = LastOpt.find("fullfp16");
+if (I != LastOpt.end() && Features[I->second] == "+fullfp16")
+  Features.push_back("+fullfp16asm");
+
 Features.push_back("-fp-armv8");
-Features.push_back("-crypto");
 Features.push_back("-neon");
+Features.push_back("-crypto");
   }
 
   // En/disable crc
Index: docs/UsersManual.rst
===
--- docs/UsersManual.rst
+++ docs/UsersManual.rst
@@ -1188,7 +1188,8 @@
Generate code which only uses the general purpose registers.
 
This option restricts the generated code to use general registers
-   only. This only applies to the AArch64 architecture.
+   only but does not restrict the assembler. This only applies to the
+   AArch64 architecture.
 
 .. option:: -mcompact-branches=[values]
 
Index: test/Driver/aarch64-mgeneral_regs_only.c
===
--- test/Driver/aarch64-mgeneral_regs_only.c
+++ test/Driver/aarch64-mgeneral_regs_only.c
@@ -4,6 +4,19 @@
 // RUN:   | FileCheck --check-prefix=CHECK-NO-FP %s
 // RUN: %clang -target arm64-linux-eabi -mgeneral-regs-only %s -### 2>&1 \
 // RUN:   | FileCheck --check-prefix=CHECK-NO-FP %s
+// RUN: %clang -target aarch64-linux-eabi -mgeneral-regs-only -mfpu=crypto-neon-fp-armv8 -%s -### 2>&1 \
+// RUN:   | FileCheck --check-prefix=CHECK-FP %s
+
+// CHECK-NO-FP: "-target-feature" "+neonasm"
+// CHECK-NO-FP-NOT: "-target-feature" "+fp-armv8asm"
+// CHECK-NO-FP-NOT: "-target-feature" "+cryptoasm"
 // CHECK-NO-FP: "-target-feature" "-fp-armv8"
 // CHECK-NO-FP: "-target-feature" "-crypto"
 // CHECK-NO-FP: "-target-feature" "-neon"
+
+// CHECK-FP: "-target-feature" "+neonasm"
+// CHECK-FP: "-target-feature" "+fp-armv8asm"
+// CHECK-FP: "-target-feature" "+cryptoasm"
+// CHECK-FP: "-target-feature" "-fp-armv8"
+// CHECK-FP: "-target-feature" "-crypto"
+// CHECK-FP: "-target-feature" "-neon"



[PATCH] D26858: [AArch64] Don't constrain the assembler when using -mgeneral-regs-only

2016-11-18 Thread silviu.bara...@arm.com via cfe-commits
sbaranga updated this revision to Diff 78541.
sbaranga added a comment.

Update regression tests.


https://reviews.llvm.org/D26858

Files:
  docs/UsersManual.rst
  lib/Driver/Tools.cpp
  test/Driver/aarch64-mgeneral_regs_only.c


Index: lib/Driver/Tools.cpp
===
--- lib/Driver/Tools.cpp
+++ lib/Driver/Tools.cpp
@@ -2623,9 +2623,33 @@
 D.Diag(diag::err_drv_clang_unsupported) << A->getAsString(Args);
 
   if (Args.getLastArg(options::OPT_mgeneral_regs_only)) {
+// Find the last of each feature.
+llvm::StringMap<unsigned> LastOpt;
+for (unsigned I = 0, N = Features.size(); I < N; ++I) {
+  StringRef Name = Features[I];
+  assert(Name[0] == '-' || Name[0] == '+');
+  LastOpt[Name.drop_front(1)] = I;
+}
+
+llvm::StringMap<unsigned>::iterator I = LastOpt.find("neon");
+if (I != LastOpt.end() && Features[I->second] == "+neon")
+  Features.push_back("+neonasm");
+
+I = LastOpt.find("crypto");
+if (I != LastOpt.end() && Features[I->second] == "+crypto")
+  Features.push_back("+cryptoasm");
+
+I = LastOpt.find("fp-armv8");
+if (I != LastOpt.end() && Features[I->second] == "+fp-armv8")
+  Features.push_back("+fp-armv8asm");
+
+I = LastOpt.find("fullfp16");
+if (I != LastOpt.end() && Features[I->second] == "+fullfp16")
+  Features.push_back("+fullfp16asm");
+
 Features.push_back("-fp-armv8");
-Features.push_back("-crypto");
 Features.push_back("-neon");
+Features.push_back("-crypto");
   }
 
   // En/disable crc
Index: docs/UsersManual.rst
===
--- docs/UsersManual.rst
+++ docs/UsersManual.rst
@@ -1188,7 +1188,8 @@
Generate code which only uses the general purpose registers.
 
This option restricts the generated code to use general registers
-   only. This only applies to the AArch64 architecture.
+   only but does not restrict the assembler. This only applies to the
+   AArch64 architecture.
 
 .. option:: -mcompact-branches=[values]
 
Index: test/Driver/aarch64-mgeneral_regs_only.c
===
--- test/Driver/aarch64-mgeneral_regs_only.c
+++ test/Driver/aarch64-mgeneral_regs_only.c
@@ -4,6 +4,18 @@
 // RUN:   | FileCheck --check-prefix=CHECK-NO-FP %s
 // RUN: %clang -target arm64-linux-eabi -mgeneral-regs-only %s -### 2>&1 \
 // RUN:   | FileCheck --check-prefix=CHECK-NO-FP %s
+// RUN: %clang -target aarch64-linux-eabi -mgeneral-regs-only -march=armv8.1a+crypto %s -### 2>&1 \
+// RUN:   | FileCheck --check-prefix=CHECK-FP %s
+
+// CHECK-NO-FP: "-target-feature" "+neonasm"
+// CHECK-NO-FP-NOT: "-target-feature" "+fp-armv8asm"
+// CHECK-NO-FP-NOT: "-target-feature" "+cryptoasm"
 // CHECK-NO-FP: "-target-feature" "-fp-armv8"
-// CHECK-NO-FP: "-target-feature" "-crypto"
 // CHECK-NO-FP: "-target-feature" "-neon"
+// CHECK-NO-FP: "-target-feature" "-crypto"
+
+// CHECK-FP: "-target-feature" "+neonasm"
+// CHECK-FP: "-target-feature" "+cryptoasm"
+// CHECK-FP: "-target-feature" "-fp-armv8"
+// CHECK-FP: "-target-feature" "-neon"
+// CHECK-FP: "-target-feature" "-crypto"



Re: [PATCH] D13127: [ARM] Upgrade codegen for vld[234] and vst[234] to communicate a 0 address space

2015-09-30 Thread silviu.bara...@arm.com via cfe-commits
sbaranga accepted this revision.
sbaranga added a comment.
This revision is now accepted and ready to land.

LGTM


http://reviews.llvm.org/D13127



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [PATCH] D18963: PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4

2016-04-28 Thread silviu.bara...@arm.com via cfe-commits
sbaranga added a comment.

Thanks, r267869!

-Silviu


http://reviews.llvm.org/D18963





[PATCH] D19665: [ARM] Guard the declarations of f16 to f32 vcvt intrinsics in arm_neon.h by testing __ARM_FP

2016-04-28 Thread silviu.bara...@arm.com via cfe-commits
sbaranga created this revision.
sbaranga added a reviewer: rengolin.
sbaranga added subscribers: t.p.northover, cfe-commits.
Herald added subscribers: rengolin, aemerson.

Conversions between float and half are only available when the
target has the half-precision extension. Guard these intrinsics
so that they don't cause crashes in the backend.

Fixes PR27550.
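
As a rough illustration of what the "(__ARM_FP & 2)" guard tests: ACLE describes __ARM_FP as a bit mask in which bit 1 (value 2) indicates half precision, bit 2 single precision and bit 3 double precision. The sketch below models that test on a plain integer; the constant and function names are made up for this example and do not appear in the patch.

```cpp
// Bit values of __ARM_FP as described by ACLE; the names are illustrative.
constexpr unsigned kHalfPrecision = 1u << 1;   // 0x2
constexpr unsigned kSinglePrecision = 1u << 2; // 0x4
constexpr unsigned kDoublePrecision = 1u << 3; // 0x8

// Mirrors the ArchGuard expression "(__ARM_FP & 2)" on an explicit value.
constexpr bool hasHalfPrecision(unsigned ArmFp) {
  return (ArmFp & kHalfPrecision) != 0;
}

// Half, single and double precision all present: the guard passes.
static_assert(hasHalfPrecision(kHalfPrecision | kSinglePrecision |
                               kDoublePrecision),
              "0xE includes the half-precision bit");
// Single and double precision only: the guarded intrinsics stay undeclared.
static_assert(!hasHalfPrecision(kSinglePrecision | kDoublePrecision),
              "0xC lacks the half-precision bit");
```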

http://reviews.llvm.org/D19665

Files:
  include/clang/Basic/arm_neon.td
  test/CodeGen/arm-negative-fp16.c

Index: include/clang/Basic/arm_neon.td
===
--- include/clang/Basic/arm_neon.td
+++ include/clang/Basic/arm_neon.td
@@ -704,8 +704,12 @@
 

 // E.3.22 Converting vectors
 
-def VCVT_F16_F32 : SInst<"vcvt_f16_f32", "md", "Hf">;
-def VCVT_F32_F16 : SInst<"vcvt_f32_f16", "wd", "h">;
+let ArchGuard = "(__ARM_FP & 2)" in {
+  def VCVT_F16_F32 : SInst<"vcvt_f16_f32", "md", "Hf">;
+  def VCVT_F32_F16 : SInst<"vcvt_f32_f16", "wd", "h">;
+  def VCVT_HIGH_F16_F32 : SOpInst<"vcvt_high_f16", "hmj", "Hf", OP_VCVT_NA_HI_F16>;
+  def VCVT_HIGH_F32_F16 : SOpInst<"vcvt_high_f32", "wk", "h", OP_VCVT_EX_HI_F32>;
+}
 
 def VCVT_S32 : SInst<"vcvt_s32", "xd",  "fQf">;
 def VCVT_U32 : SInst<"vcvt_u32", "ud",  "fQf">;
@@ -981,8 +985,6 @@
 def VCVT_U64 : SInst<"vcvt_u64", "ud",  "dQd">;
 def VCVT_F64 : SInst<"vcvt_f64", "Fd",  "lUlQlQUl">;
 
-def VCVT_HIGH_F16_F32 : SOpInst<"vcvt_high_f16", "hmj", "Hf", OP_VCVT_NA_HI_F16>;
-def VCVT_HIGH_F32_F16 : SOpInst<"vcvt_high_f32", "wk", "h", OP_VCVT_EX_HI_F32>;
 def VCVT_HIGH_F32_F64 : SOpInst<"vcvt_high_f32", "qfj", "d", OP_VCVT_NA_HI_F32>;
 def VCVT_HIGH_F64_F32 : SOpInst<"vcvt_high_f64", "wj", "f", OP_VCVT_EX_HI_F64>;
 
Index: test/CodeGen/arm-negative-fp16.c
===
--- /dev/null
+++ test/CodeGen/arm-negative-fp16.c
@@ -0,0 +1,18 @@
+// RUN: %clang_cc1 -triple thumbv7-none-eabi %s -target-feature +neon -target-feature -fp16 -fsyntax-only -verify
+
+#include <arm_neon.h>
+
+float16x4_t test_vcvt_f16_f32(float32x4_t a) {
+  return vcvt_f16_f32(a); // expected-warning{{implicit declaration of function 'vcvt_f16_f32'}}  expected-error{{returning 'int' from a function with incompatible result type 'float16x4_t'}}
+}
+
+float32x4_t test_vcvt_f32_f16(float16x4_t a) {
+  return vcvt_f32_f16(a); // expected-warning{{implicit declaration of function 'vcvt_f32_f16'}} expected-error{{returning 'int' from a function with incompatible result type 'float32x4_t'}}
+}
+
+float32x4_t test_vcvt_high_f32_f16(float16x8_t a) {
+  return vcvt_high_f32_f16(a); // expected-warning{{implicit declaration of function 'vcvt_high_f32_f16'}} expected-error{{returning 'int' from a function with incompatible result type 'float32x4_t'}}
+}
+float16x8_t test_vcvt_high_f16_f32(float16x4_t a, float32x4_t b) {
+  return vcvt_high_f16_f32(a, b); // expected-warning{{implicit declaration of function 'vcvt_high_f16_f32'}} expected-error{{returning 'int' from a function with incompatible result type 'float16x8_t'}}
+}



Re: [PATCH] D19665: [ARM] Guard the declarations of f16 to f32 vcvt intrinsics in arm_neon.h by testing __ARM_FP

2016-04-29 Thread silviu.bara...@arm.com via cfe-commits
sbaranga updated this revision to Diff 5.
sbaranga added a comment.

Don't change the AArch64 intrinsics and move the test to Sema.


http://reviews.llvm.org/D19665

Files:
  include/clang/Basic/arm_neon.td
  test/Sema/arm-no-fp16.c

Index: include/clang/Basic/arm_neon.td
===
--- include/clang/Basic/arm_neon.td
+++ include/clang/Basic/arm_neon.td
@@ -704,8 +704,10 @@
 

 // E.3.22 Converting vectors
 
-def VCVT_F16_F32 : SInst<"vcvt_f16_f32", "md", "Hf">;
-def VCVT_F32_F16 : SInst<"vcvt_f32_f16", "wd", "h">;
+let ArchGuard = "(__ARM_FP & 2)" in {
+  def VCVT_F16_F32 : SInst<"vcvt_f16_f32", "md", "Hf">;
+  def VCVT_F32_F16 : SInst<"vcvt_f32_f16", "wd", "h">;
+}
 
 def VCVT_S32 : SInst<"vcvt_s32", "xd",  "fQf">;
 def VCVT_U32 : SInst<"vcvt_u32", "ud",  "fQf">;
Index: test/Sema/arm-no-fp16.c
===
--- /dev/null
+++ test/Sema/arm-no-fp16.c
@@ -0,0 +1,11 @@
+// RUN: %clang_cc1 -triple thumbv7-none-eabi %s -target-feature +neon -target-feature -fp16 -fsyntax-only -verify
+
+#include <arm_neon.h>
+
+float16x4_t test_vcvt_f16_f32(float32x4_t a) {
+  return vcvt_f16_f32(a); // expected-warning{{implicit declaration of function 'vcvt_f16_f32'}}  expected-error{{returning 'int' from a function with incompatible result type 'float16x4_t'}}
+}
+
+float32x4_t test_vcvt_f32_f16(float16x4_t a) {
+  return vcvt_f32_f16(a); // expected-warning{{implicit declaration of function 'vcvt_f32_f16'}} expected-error{{returning 'int' from a function with incompatible result type 'float32x4_t'}}
+}




Re: [PATCH] D19665: [ARM] Guard the declarations of f16 to f32 vcvt intrinsics in arm_neon.h by testing __ARM_FP

2016-04-29 Thread silviu.bara...@arm.com via cfe-commits
sbaranga added inline comments.


Comment at: include/clang/Basic/arm_neon.td:710-711
@@ -709,2 +709,4 @@
+  def VCVT_F32_F16 : SInst<"vcvt_f32_f16", "wd", "h">;
+}
 
 def VCVT_S32 : SInst<"vcvt_s32", "xd",  "fQf">;

Thanks for catching this!


http://reviews.llvm.org/D19665





Re: [PATCH] D19665: [ARM] Guard the declarations of f16 to f32 vcvt intrinsics in arm_neon.h by testing __ARM_FP

2016-04-29 Thread silviu.bara...@arm.com via cfe-commits
sbaranga added a comment.

Thanks, r268047!

Cheers,
Silviu


http://reviews.llvm.org/D19665





[PATCH] D18963: PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4

2016-04-11 Thread silviu.bara...@arm.com via cfe-commits
sbaranga created this revision.
sbaranga added a reviewer: t.p.northover.
sbaranga added a subscriber: cfe-commits.
Herald added subscribers: rengolin, aemerson.

According to the ACLE spec, "__ARM_FEATURE_FMA is defined to 1 if
the hardware floating-point architecture supports fused floating-point
multiply-accumulate".

This changes clang's behaviour from emitting this macro for v7-A and v7-R
cores to only emitting it when the target has VFPv4 (and therefore support
for the floating point multiply-accumulate instruction).

Fixes PR27216
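
The effect of the extra check can be sketched as below. Only VFP4FPU, ArchVersion, CPUProfile, CPUAttr and FPU are names taken from the patch; the Target struct, the other flag names, the bit values and the two helper functions are assumptions made for the example.

```cpp
#include <string>

// Hypothetical FPU bit flags; the actual values in Targets.cpp may differ.
enum FPUMode : unsigned {
  VFP2FPU = 1u << 0,
  VFP3FPU = 1u << 1,
  VFP4FPU = 1u << 2,
  NeonFPU = 1u << 3
};

// Hypothetical bundle of the values the condition reads.
struct Target {
  unsigned ArchVersion;
  std::string CPUProfile;
  std::string CPUAttr;
  unsigned FPU;
};

// Condition before this patch: the FPU is not consulted at all.
bool definesFmaBefore(const Target &T) {
  return T.ArchVersion >= 7 && (T.CPUProfile != "M" || T.CPUAttr == "7EM");
}

// Condition in this revision: additionally require VFPv4.
bool definesFmaAfter(const Target &T) {
  return definesFmaBefore(T) && (T.FPU & VFP4FPU) != 0;
}
```

A v7-A target with only VFPv3 and NEON satisfies the old condition but not the new one, which is the behaviour change described above.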

http://reviews.llvm.org/D18963

Files:
  lib/Basic/Targets.cpp
  test/CodeGen/arm-neon-fma.c
  test/Preprocessor/arm-acle-6.5.c
  test/Sema/arm_vfma.c

Index: lib/Basic/Targets.cpp
===
--- lib/Basic/Targets.cpp
+++ lib/Basic/Targets.cpp
@@ -4927,7 +4927,8 @@
 Builder.defineMacro("__ARM_FP16_ARGS", "1");
 
 // ACLE 6.5.3 Fused multiply-accumulate (FMA)
-if (ArchVersion >= 7 && (CPUProfile != "M" || CPUAttr == "7EM"))
+if (ArchVersion >= 7 && (CPUProfile != "M" || CPUAttr == "7EM") &&
+(FPU & VFP4FPU))
   Builder.defineMacro("__ARM_FEATURE_FMA", "1");
 
 // Subtarget options.
Index: test/CodeGen/arm-neon-fma.c
===
--- test/CodeGen/arm-neon-fma.c
+++ test/CodeGen/arm-neon-fma.c
@@ -3,6 +3,7 @@
 // RUN:   -target-cpu cortex-a8 \
 // RUN:   -mfloat-abi hard \
 // RUN:   -ffreestanding \
+// RUN:   -target-feature +vfp4 \
 // RUN:   -emit-llvm -o - %s | opt -S -mem2reg | FileCheck %s
 
 #include <arm_neon.h>
Index: test/Preprocessor/arm-acle-6.5.c
===
--- test/Preprocessor/arm-acle-6.5.c
+++ test/Preprocessor/arm-acle-6.5.c
@@ -49,10 +49,13 @@
 
 // CHECK-NO-FMA-NOT: __ARM_FEATURE_FMA
 
-// RUN: %clang -target armv7a-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
-// RUN: %clang -target armv7r-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv7a-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv7a-eabi -mfpu=neon-vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv7r-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv7r-eabi -mfpu=neon-vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
 // RUN: %clang -target armv7em-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
-// RUN: %clang -target armv8-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv8-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv8-eabi -mfpu=neon-vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
 
 // CHECK-FMA: __ARM_FEATURE_FMA 1
 
Index: test/Sema/arm_vfma.c
===
--- test/Sema/arm_vfma.c
+++ test/Sema/arm_vfma.c
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -triple thumbv7s-apple-ios7.0 -target-feature +neon -fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple thumbv7s-apple-ios7.0 -target-feature +neon -target-feature +vfp4 -fsyntax-only -verify %s
 #include <arm_neon.h>
 
 // expected-no-diagnostics



Re: [PATCH] D18963: PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4

2016-04-11 Thread silviu.bara...@arm.com via cfe-commits
sbaranga added inline comments.


Comment at: lib/Basic/Targets.cpp:4931
@@ -4931,1 +4930,3 @@
+if (ArchVersion >= 7 && (CPUProfile != "M" || CPUAttr == "7EM") &&
+(FPU & VFP4FPU))
   Builder.defineMacro("__ARM_FEATURE_FMA", "1");

rengolin wrote:
> I think just two checks are necessary, here:
> 
> (FPU & VFPV4FPU) || (ArchVersion > 7)
> 
> and make sure that the right FPU flag is set from the right cores, plus 
> "+vfp4".
Yes, that should be sufficient.


Comment at: test/CodeGen/arm-neon-fma.c:6
@@ -5,2 +5,3 @@
 // RUN:   -ffreestanding \
+// RUN:   -target-feature +vfp4 \
 // RUN:   -emit-llvm -o - %s | opt -S -mem2reg | FileCheck %s

rengolin wrote:
> why not change the cpu to a core that has vfp4?
> 
> I know the test is about FMA, not the CPU, but this is a combination that 
> will never occur in the wild...
Sure, good point.


Comment at: test/Sema/arm_vfma.c:1
@@ -1,2 +1,2 @@
-// RUN: %clang_cc1 -triple thumbv7s-apple-ios7.0 -target-feature +neon -fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple thumbv7s-apple-ios7.0 -target-feature +neon -target-feature +vfp4 -fsyntax-only -verify %s
 #include <arm_neon.h>

rengolin wrote:
> It's possible that v7 Apple cores always have FMA? I'd make sure of that 
> before forcing the flag here. We don't want to disable it inadvertently.
> 
> @t.p.northover, can you confirm Apple's support for VFP4?
If they do support it but don't have the vfp4 feature, then before this patch
clang/llvm wouldn't have emitted an fma/vfma instruction under any
circumstances (because the backend will not generate it). The backend would
instead legalize it with fmaf() libcalls, but that's not the correct behaviour
according to the spec.
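
To illustrate why the macro has to match what the backend can really emit, here is a minimal sketch of user code that keys off __ARM_FEATURE_FMA. The function name mulAdd is hypothetical and nothing below is part of the patch.

```cpp
#include <cmath>

// Hypothetical user code: take the fused path only when the compiler
// advertises hardware FMA support through the ACLE macro.
float mulAdd(float A, float B, float C) {
#if defined(__ARM_FEATURE_FMA)
  // Expected to map to a hardware fused multiply-accumulate.
  return std::fma(A, B, C);
#else
  // Plain multiply followed by add; avoids relying on an fmaf() libcall.
  return A * B + C;
#endif
}
```

If the macro were defined on a target without VFPv4, code like this would pick the first branch and end up calling the fmaf() library routine, which is the situation described above.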


http://reviews.llvm.org/D18963





Re: [PATCH] D18963: PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4

2016-04-11 Thread silviu.bara...@arm.com via cfe-commits
sbaranga updated this revision to Diff 53254.
sbaranga added a comment.

Apply review comments from Renato:

- simplify condition for enabling __ARM_FEATURE_FMA
- use cortex-a7 instead of cortex-a8 for testing since this is a real use case.


http://reviews.llvm.org/D18963

Files:
  lib/Basic/Targets.cpp
  test/CodeGen/arm-neon-fma.c
  test/Preprocessor/arm-acle-6.5.c
  test/Sema/arm_vfma.c

Index: lib/Basic/Targets.cpp
===
--- lib/Basic/Targets.cpp
+++ lib/Basic/Targets.cpp
@@ -4927,7 +4927,7 @@
 Builder.defineMacro("__ARM_FP16_ARGS", "1");
 
 // ACLE 6.5.3 Fused multiply-accumulate (FMA)
-if (ArchVersion >= 7 && (CPUProfile != "M" || CPUAttr == "7EM"))
+if (ArchVersion >= 7 && (FPU & VFP4FPU))
   Builder.defineMacro("__ARM_FEATURE_FMA", "1");
 
 // Subtarget options.
Index: test/CodeGen/arm-neon-fma.c
===
--- test/CodeGen/arm-neon-fma.c
+++ test/CodeGen/arm-neon-fma.c
@@ -1,6 +1,6 @@
 // RUN: %clang_cc1 -triple thumbv7-none-linux-gnueabihf \
 // RUN:   -target-abi aapcs \
-// RUN:   -target-cpu cortex-a8 \
+// RUN:   -target-cpu cortex-a7 \
 // RUN:   -mfloat-abi hard \
 // RUN:   -ffreestanding \
 // RUN:   -emit-llvm -o - %s | opt -S -mem2reg | FileCheck %s
Index: test/Preprocessor/arm-acle-6.5.c
===
--- test/Preprocessor/arm-acle-6.5.c
+++ test/Preprocessor/arm-acle-6.5.c
@@ -49,10 +49,13 @@
 
 // CHECK-NO-FMA-NOT: __ARM_FEATURE_FMA
 
-// RUN: %clang -target armv7a-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
-// RUN: %clang -target armv7r-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv7a-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv7a-eabi -mfpu=vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv7r-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv7r-eabi -mfpu=vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
 // RUN: %clang -target armv7em-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
-// RUN: %clang -target armv8-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv8-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv8-eabi -mfpu=vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
 
 // CHECK-FMA: __ARM_FEATURE_FMA 1
 
Index: test/Sema/arm_vfma.c
===
--- test/Sema/arm_vfma.c
+++ test/Sema/arm_vfma.c
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -triple thumbv7s-apple-ios7.0 -target-feature +neon -fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple thumbv7s-apple-ios7.0 -target-feature +neon -target-feature +vfp4 -fsyntax-only -verify %s
 #include <arm_neon.h>
 
 // expected-no-diagnostics



Re: [PATCH] D18963: PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4

2016-04-12 Thread silviu.bara...@arm.com via cfe-commits
sbaranga added inline comments.


Comment at: test/Sema/arm_vfma.c:1
@@ -1,2 +1,2 @@
-// RUN: %clang_cc1 -triple thumbv7s-apple-ios7.0 -target-feature +neon -fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple thumbv7s-apple-ios7.0 -target-feature +neon -target-feature +vfp4 -fsyntax-only -verify %s
 #include <arm_neon.h>

t.p.northover wrote:
> rengolin wrote:
> > t.p.northover wrote:
> > > v7s is Swift, which has FMA. v7 for us is Cortex-A9, which I think also 
> > > has FMA (not that it matters much these days).
> > v7 is Cortex-A8, and neither A8 nor A9 have FMA in VFP, only NEON.
> > 
> > Does Swift have FMA in VFP? or just NEON?
> Sorry, it appears virtually every part of my statement was wrong then. v7 
> really does seem to be Cortex-A8 even for us, and Swift doesn't have scalar 
> VFMA.
The error seems to come from how getDefaultFPU() is called when the
CPU is not specified. It turns out that it gets called with an empty CPU string
(perhaps we meant to call it with either "generic" or the CPU that was set in
ARMTargetInfo, which does get correctly recognized as swift in this case).

FWIW, Cortex-A9 doesn't have FMA.


http://reviews.llvm.org/D18963





Re: [PATCH] D18963: PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4

2016-04-12 Thread silviu.bara...@arm.com via cfe-commits
sbaranga updated this revision to Diff 53409.
sbaranga added a comment.

If no CPU has been passed on the command line, use the generic CPU when
selecting features/FPU, instead of using an empty string (which is not
recognized by the TargetParser).
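
The fallback amounts to the following, shown here as a stand-alone sketch. The helper name effectiveCPU is hypothetical; the patch performs the same substitution inline at the top of initFeatureMap.

```cpp
#include <string>

// Hypothetical helper: map an unspecified (empty) CPU name to "generic" so
// that later lookups never see an empty string.
std::string effectiveCPU(const std::string &CPU) {
  return CPU.empty() ? std::string("generic") : CPU;
}
```

A CPU name that was explicitly set, such as "swift", passes through unchanged.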


http://reviews.llvm.org/D18963

Files:
  lib/Basic/Targets.cpp
  test/CodeGen/arm-long-calls.c
  test/CodeGen/arm-neon-fma.c
  test/CodeGen/arm-no-movt.c
  test/Preprocessor/arm-acle-6.5.c
  test/Sema/arm_vfma.c
  test/Sema/neon-vector-types-support.c

Index: lib/Basic/Targets.cpp
===
--- lib/Basic/Targets.cpp
+++ lib/Basic/Targets.cpp
@@ -4707,6 +4707,8 @@
   initFeatureMap(llvm::StringMap<bool> &Features, DiagnosticsEngine &Diags,
  StringRef CPU,
  const std::vector<std::string> &FeaturesVec) const override {
+if (CPU == "")
+  CPU = "generic";
 
 std::vector TargetFeatures;
 unsigned Arch = llvm::ARM::parseArch(getTriple().getArchName());
@@ -4927,7 +4929,7 @@
 Builder.defineMacro("__ARM_FP16_ARGS", "1");
 
 // ACLE 6.5.3 Fused multiply-accumulate (FMA)
-if (ArchVersion >= 7 && (CPUProfile != "M" || CPUAttr == "7EM"))
+if (ArchVersion >= 7 && (FPU & VFP4FPU))
   Builder.defineMacro("__ARM_FEATURE_FMA", "1");
 
 // Subtarget options.
Index: test/CodeGen/arm-no-movt.c
===
--- test/CodeGen/arm-no-movt.c
+++ test/CodeGen/arm-no-movt.c
@@ -1,7 +1,7 @@
-// RUN: %clang_cc1 -triple thumbv7-apple-ios5  -target-feature +no-movt -emit-llvm -o - %s | FileCheck -check-prefix=NO-MOVT %s
+// RUN: %clang_cc1 -triple thumbv7-apple-ios5 -target-feature +no-movt -emit-llvm -o - %s | FileCheck -check-prefix=NO-MOVT %s
 // RUN: %clang_cc1 -triple thumbv7-apple-ios5 -emit-llvm -o - %s | FileCheck -check-prefix=MOVT %s
 
-// NO-MOVT: attributes #0 = { {{.*}} "target-features"="+no-movt"
-// MOVT-NOT: attributes #0 = { {{.*}} "target-features"="+no-movt"
+// NO-MOVT: attributes #0 = { {{.*}} "target-features"="{{.*}}+no-movt{{.*}}"
+// MOVT-NOT: attributes #0 = { {{.*}} "target-features"="{{.*}}+no-movt{{.*}}"
 
 int foo1(int a) { return a; }
Index: test/CodeGen/arm-long-calls.c
===
--- test/CodeGen/arm-long-calls.c
+++ test/CodeGen/arm-long-calls.c
@@ -1,7 +1,7 @@
 // RUN: %clang_cc1 -triple thumbv7-apple-ios5  -target-feature +long-calls -emit-llvm -o - %s | FileCheck -check-prefix=LONGCALL %s
 // RUN: %clang_cc1 -triple thumbv7-apple-ios5 -emit-llvm -o - %s | FileCheck -check-prefix=NOLONGCALL %s
 
-// LONGCALL: attributes #0 = { {{.*}} "target-features"="+long-calls"
-// NOLONGCALL-NOT: attributes #0 = { {{.*}} "target-features"="+long-calls"
+// LONGCALL: attributes #0 = { {{.*}} "target-features"="{{.*}}+long-calls{{.*}}"
+// NOLONGCALL-NOT: attributes #0 = { {{.*}} "target-features"="{{.*}}+long-calls{{.*}}"
 
 int foo1(int a) { return a; }
Index: test/CodeGen/arm-neon-fma.c
===
--- test/CodeGen/arm-neon-fma.c
+++ test/CodeGen/arm-neon-fma.c
@@ -1,6 +1,6 @@
 // RUN: %clang_cc1 -triple thumbv7-none-linux-gnueabihf \
 // RUN:   -target-abi aapcs \
-// RUN:   -target-cpu cortex-a8 \
+// RUN:   -target-cpu cortex-a7 \
 // RUN:   -mfloat-abi hard \
 // RUN:   -ffreestanding \
 // RUN:   -emit-llvm -o - %s | opt -S -mem2reg | FileCheck %s
Index: test/Preprocessor/arm-acle-6.5.c
===
--- test/Preprocessor/arm-acle-6.5.c
+++ test/Preprocessor/arm-acle-6.5.c
@@ -49,10 +49,13 @@
 
 // CHECK-NO-FMA-NOT: __ARM_FEATURE_FMA
 
-// RUN: %clang -target armv7a-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
-// RUN: %clang -target armv7r-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv7a-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv7a-eabi -mfpu=vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv7r-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv7r-eabi -mfpu=vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
 // RUN: %clang -target armv7em-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
-// RUN: %clang -target armv8-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv8-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv8-eabi -mfpu=vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
 
 // CHECK-FMA: __ARM_FEATURE_FMA 1
 
Index: test/Sema/arm_vfma.c
===
--- test/Sema/arm_vfma.c
+++ test/Sema/arm_vfma.c
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -triple thumbv7s-apple-ios7.0 -target-feature +neon -fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple t

Re: [PATCH] D18963: PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4

2016-04-12 Thread silviu.bara...@arm.com via cfe-commits
sbaranga added a comment.

I've updated the patch to fix the defaults when the cpu is not specified. 
Renato, Tim, could you have a look at this again please?

Thanks,
Silviu


http://reviews.llvm.org/D18963





Re: [PATCH] D18963: PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4

2016-04-22 Thread silviu.bara...@arm.com via cfe-commits
sbaranga added a comment.

A gentle ping?

Cheers,
Silviu


http://reviews.llvm.org/D18963





Re: [PATCH] D18963: PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4

2016-04-26 Thread silviu.bara...@arm.com via cfe-commits
sbaranga added inline comments.


Comment at: lib/Basic/Targets.cpp:4710
@@ -4709,1 +4709,3 @@
  const std::vector &FeaturesVec) const override {
+if (CPU == "")
+  CPU = "generic";

rengolin wrote:
> This change is unrelated and may bring side effects into clang. I'd keep this 
> out and investigate it in another patch with the appropriate tests. If you 
> just force the target-feature in the test, this corner case won't be relevant 
> in this patch.
Ok, that makes sense. I'll revert to the previous revision of this patch.


http://reviews.llvm.org/D18963





Re: [PATCH] D18963: PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4

2016-04-26 Thread silviu.bara...@arm.com via cfe-commits
sbaranga updated this revision to Diff 55018.
sbaranga added a comment.

Address the latest review comments (which means rolling back to the previous
revision of the patch).


http://reviews.llvm.org/D18963

Files:
  lib/Basic/Targets.cpp
  test/CodeGen/arm-neon-fma.c
  test/Preprocessor/arm-acle-6.5.c
  test/Sema/arm_vfma.c

Index: lib/Basic/Targets.cpp
===
--- lib/Basic/Targets.cpp
+++ lib/Basic/Targets.cpp
@@ -4931,7 +4931,7 @@
 Builder.defineMacro("__ARM_FP16_ARGS", "1");
 
 // ACLE 6.5.3 Fused multiply-accumulate (FMA)
-if (ArchVersion >= 7 && (CPUProfile != "M" || CPUAttr == "7EM"))
+if (ArchVersion >= 7 && (FPU & VFP4FPU))
   Builder.defineMacro("__ARM_FEATURE_FMA", "1");
 
 // Subtarget options.
Index: test/CodeGen/arm-neon-fma.c
===
--- test/CodeGen/arm-neon-fma.c
+++ test/CodeGen/arm-neon-fma.c
@@ -1,6 +1,6 @@
 // RUN: %clang_cc1 -triple thumbv7-none-linux-gnueabihf \
 // RUN:   -target-abi aapcs \
-// RUN:   -target-cpu cortex-a8 \
+// RUN:   -target-cpu cortex-a7 \
 // RUN:   -mfloat-abi hard \
 // RUN:   -ffreestanding \
 // RUN:   -emit-llvm -o - %s | opt -S -mem2reg | FileCheck %s
Index: test/Preprocessor/arm-acle-6.5.c
===
--- test/Preprocessor/arm-acle-6.5.c
+++ test/Preprocessor/arm-acle-6.5.c
@@ -49,10 +49,13 @@
 
 // CHECK-NO-FMA-NOT: __ARM_FEATURE_FMA
 
-// RUN: %clang -target armv7a-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
-// RUN: %clang -target armv7r-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv7a-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv7a-eabi -mfpu=vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv7r-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv7r-eabi -mfpu=vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
 // RUN: %clang -target armv7em-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
-// RUN: %clang -target armv8-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv8-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv8-eabi -mfpu=vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
 
 // CHECK-FMA: __ARM_FEATURE_FMA 1
 
Index: test/Sema/arm_vfma.c
===
--- test/Sema/arm_vfma.c
+++ test/Sema/arm_vfma.c
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -triple thumbv7s-apple-ios7.0 -target-feature +neon -fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple thumbv7-none-eabi -target-feature +neon -target-feature +vfp4 -fsyntax-only -verify %s
 #include 
 
 // expected-no-diagnostics



Re: [PATCH] D18963: PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4

2016-04-26 Thread silviu.bara...@arm.com via cfe-commits
sbaranga added inline comments.


Comment at: test/Sema/arm_vfma.c:1
@@ -1,2 +1,2 @@
-// RUN: %clang_cc1 -triple thumbv7s-apple-ios7.0 -target-feature +neon -fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple thumbv7-none-eabi -target-feature +neon -target-feature +vfp4 -fsyntax-only -verify %s
 #include 

I updated this test, but used thumbv7-none-eabi here, since VFPv4 requires at 
least v7.
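
For reference, the difference between the old and new conditions for defining
__ARM_FEATURE_FMA can be sketched in isolation as below. This is only an
illustration: the flag values in FPUMode and the helper names definesFMAOld and
definesFMANew are assumptions made for the example, not the actual definitions
in Targets.cpp.

```cpp
#include <cassert>
#include <string>

// Hypothetical bit flags, loosely modeled on the FPU mask used in the patch.
enum FPUMode : unsigned {
  VFP2FPU = 1u << 0,
  VFP3FPU = 1u << 1,
  VFP4FPU = 1u << 2,
  NeonFPU = 1u << 3
};

// Old condition: depends only on architecture version and profile, so it is
// true for any v7-A target regardless of the FPU that is actually enabled.
bool definesFMAOld(unsigned ArchVersion, const std::string &CPUProfile,
                   const std::string &CPUAttr) {
  return ArchVersion >= 7 && (CPUProfile != "M" || CPUAttr == "7EM");
}

// New condition: the macro is only defined when the VFPv4 bit is set in the
// FPU mask (and the architecture is at least v7).
bool definesFMANew(unsigned ArchVersion, unsigned FPU) {
  return ArchVersion >= 7 && (FPU & VFP4FPU) != 0;
}
```

Under these assumptions, a v7-A target with only NEON and VFPv3 enabled
satisfies the old condition but not the new one, which is the behaviour change
the updated arm-acle-6.5.c RUN lines check for.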


http://reviews.llvm.org/D18963


