[PATCH] D26858: [AArch64] Don't constrain the assembler when using -mgeneral-regs-only
sbaranga created this revision.
sbaranga added reviewers: jmolloy, rengolin, t.p.northover.
sbaranga added a subscriber: cfe-commits.
Herald added a subscriber: aemerson.

We use the neonasm, cryptoasm, fp-armv8asm and fullfp16asm features to enable the assembling of instructions that were disabled when disabling neon, crypto, and fp-armv8 features.

This makes the -mgeneral-regs-only behaviour compatible with gcc.

Fixes https://llvm.org/bugs/show_bug.cgi?id=30792

https://reviews.llvm.org/D26858

Files:
  docs/UsersManual.rst
  lib/Driver/Tools.cpp
  test/Driver/aarch64-mgeneral_regs_only.c

Index: lib/Driver/Tools.cpp
===================================================================
--- lib/Driver/Tools.cpp
+++ lib/Driver/Tools.cpp
@@ -2623,9 +2623,33 @@
       D.Diag(diag::err_drv_clang_unsupported) << A->getAsString(Args);

   if (Args.getLastArg(options::OPT_mgeneral_regs_only)) {
+    // Find the last of each feature.
+    llvm::StringMap<unsigned> LastOpt;
+    for (unsigned I = 0, N = Features.size(); I < N; ++I) {
+      StringRef Name = Features[I];
+      assert(Name[0] == '-' || Name[0] == '+');
+      LastOpt[Name.drop_front(1)] = I;
+    }
+
+    llvm::StringMap<unsigned>::iterator I = LastOpt.find("neon");
+    if (I != LastOpt.end() && Features[I->second] == "+neon")
+      Features.push_back("+neonasm");
+
+    I = LastOpt.find("crypto");
+    if (I != LastOpt.end() && Features[I->second] == "+crypto")
+      Features.push_back("+cryptoasm");
+
+    I = LastOpt.find("fp-armv8");
+    if (I != LastOpt.end() && Features[I->second] == "+fp-armv8")
+      Features.push_back("+fp-armv8asm");
+
+    I = LastOpt.find("fullfp16");
+    if (I != LastOpt.end() && Features[I->second] == "+fullfp16")
+      Features.push_back("+fullfp16asm");
+
     Features.push_back("-fp-armv8");
-    Features.push_back("-crypto");
     Features.push_back("-neon");
+    Features.push_back("-crypto");
   }

   // En/disable crc
Index: docs/UsersManual.rst
===================================================================
--- docs/UsersManual.rst
+++ docs/UsersManual.rst
@@ -1188,7 +1188,8 @@
    Generate code which only uses the general purpose registers.

    This option restricts the generated code to use general registers
-   only. This only applies to the AArch64 architecture.
+   only but does not restrict the assembler. This only applies to the
+   AArch64 architecture.

 .. option:: -mcompact-branches=[values]

Index: test/Driver/aarch64-mgeneral_regs_only.c
===================================================================
--- test/Driver/aarch64-mgeneral_regs_only.c
+++ test/Driver/aarch64-mgeneral_regs_only.c
@@ -4,6 +4,19 @@
 // RUN: | FileCheck --check-prefix=CHECK-NO-FP %s
 // RUN: %clang -target arm64-linux-eabi -mgeneral-regs-only %s -### 2>&1 \
 // RUN: | FileCheck --check-prefix=CHECK-NO-FP %s
+// RUN: %clang -target aarch64-linux-eabi -mgeneral-regs-only -mfpu=crypto-neon-fp-armv8 -%s -### 2>&1 \
+// RUN: | FileCheck --check-prefix=CHECK-FP %s
+
+// CHECK-NO-FP: "-target-feature" "+neonasm"
+// CHECK-NO-FP-NOT: "-target-feature" "+fp-armv8asm"
+// CHECK-NO-FP-NOT: "-target-feature" "+cryptoasm"
 // CHECK-NO-FP: "-target-feature" "-fp-armv8"
 // CHECK-NO-FP: "-target-feature" "-crypto"
 // CHECK-NO-FP: "-target-feature" "-neon"
+
+// CHECK-FP: "-target-feature" "+neonasm"
+// CHECK-FP: "-target-feature" "+fp-armv8asm"
+// CHECK-FP: "-target-feature" "+cryptoasm"
+// CHECK-FP: "-target-feature" "-fp-armv8"
+// CHECK-FP: "-target-feature" "-crypto"
+// CHECK-FP: "-target-feature" "-neon"
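The "last occurrence wins" logic in the Tools.cpp hunk can be sketched outside of clang. This is only an illustration under stated assumptions: `applyGeneralRegsOnly` is a hypothetical name, and standard containers stand in for `llvm::StringMap` and `StringRef`. It is not the driver code itself.

```cpp
#include <cstddef>
#include <initializer_list>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper (not part of clang): mirrors the patch's logic using
// std::map / std::string instead of llvm::StringMap / StringRef.
std::vector<std::string>
applyGeneralRegsOnly(std::vector<std::string> Features) {
  // Record the index of the last occurrence of each feature name.
  std::map<std::string, std::size_t> LastOpt;
  for (std::size_t I = 0, N = Features.size(); I < N; ++I) {
    const std::string &Name = Features[I];
    // The patch asserts on the prefix; this sketch skips malformed entries.
    if (Name.empty() || (Name[0] != '+' && Name[0] != '-'))
      continue;
    LastOpt[Name.substr(1)] = I;
  }

  // If the last mention of a feature enabled it, add the matching "asm"
  // feature so the assembler still accepts those instructions.
  for (const char *Name : {"neon", "crypto", "fp-armv8", "fullfp16"}) {
    auto It = LastOpt.find(Name);
    if (It != LastOpt.end() && Features[It->second] == std::string("+") + Name)
      Features.push_back(std::string("+") + Name + "asm");
  }

  // Code generation must not use the FP/SIMD registers.
  Features.push_back("-fp-armv8");
  Features.push_back("-neon");
  Features.push_back("-crypto");
  return Features;
}
```

For the input `{"+neon", "-crypto", "+crypto"}` the sketch appends `+neonasm` and `+cryptoasm` before the three disabling entries, because the last mention of each feature enabled it.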
[PATCH] D26858: [AArch64] Don't constrain the assembler when using -mgeneral-regs-only
sbaranga updated this revision to Diff 78541.
sbaranga added a comment.

Update regression tests.

https://reviews.llvm.org/D26858

Files:
  docs/UsersManual.rst
  lib/Driver/Tools.cpp
  test/Driver/aarch64-mgeneral_regs_only.c

Index: lib/Driver/Tools.cpp
===================================================================
--- lib/Driver/Tools.cpp
+++ lib/Driver/Tools.cpp
@@ -2623,9 +2623,33 @@
       D.Diag(diag::err_drv_clang_unsupported) << A->getAsString(Args);

   if (Args.getLastArg(options::OPT_mgeneral_regs_only)) {
+    // Find the last of each feature.
+    llvm::StringMap<unsigned> LastOpt;
+    for (unsigned I = 0, N = Features.size(); I < N; ++I) {
+      StringRef Name = Features[I];
+      assert(Name[0] == '-' || Name[0] == '+');
+      LastOpt[Name.drop_front(1)] = I;
+    }
+
+    llvm::StringMap<unsigned>::iterator I = LastOpt.find("neon");
+    if (I != LastOpt.end() && Features[I->second] == "+neon")
+      Features.push_back("+neonasm");
+
+    I = LastOpt.find("crypto");
+    if (I != LastOpt.end() && Features[I->second] == "+crypto")
+      Features.push_back("+cryptoasm");
+
+    I = LastOpt.find("fp-armv8");
+    if (I != LastOpt.end() && Features[I->second] == "+fp-armv8")
+      Features.push_back("+fp-armv8asm");
+
+    I = LastOpt.find("fullfp16");
+    if (I != LastOpt.end() && Features[I->second] == "+fullfp16")
+      Features.push_back("+fullfp16asm");
+
     Features.push_back("-fp-armv8");
-    Features.push_back("-crypto");
     Features.push_back("-neon");
+    Features.push_back("-crypto");
   }

   // En/disable crc
Index: docs/UsersManual.rst
===================================================================
--- docs/UsersManual.rst
+++ docs/UsersManual.rst
@@ -1188,7 +1188,8 @@
    Generate code which only uses the general purpose registers.

    This option restricts the generated code to use general registers
-   only. This only applies to the AArch64 architecture.
+   only but does not restrict the assembler. This only applies to the
+   AArch64 architecture.

 .. option:: -mcompact-branches=[values]

Index: test/Driver/aarch64-mgeneral_regs_only.c
===================================================================
--- test/Driver/aarch64-mgeneral_regs_only.c
+++ test/Driver/aarch64-mgeneral_regs_only.c
@@ -4,6 +4,18 @@
 // RUN: | FileCheck --check-prefix=CHECK-NO-FP %s
 // RUN: %clang -target arm64-linux-eabi -mgeneral-regs-only %s -### 2>&1 \
 // RUN: | FileCheck --check-prefix=CHECK-NO-FP %s
+// RUN: %clang -target aarch64-linux-eabi -mgeneral-regs-only -march=armv8.1a+crypto %s -### 2>&1 \
+// RUN: | FileCheck --check-prefix=CHECK-FP %s
+
+// CHECK-NO-FP: "-target-feature" "+neonasm"
+// CHECK-NO-FP-NOT: "-target-feature" "+fp-armv8asm"
+// CHECK-NO-FP-NOT: "-target-feature" "+cryptoasm"
 // CHECK-NO-FP: "-target-feature" "-fp-armv8"
-// CHECK-NO-FP: "-target-feature" "-crypto"
 // CHECK-NO-FP: "-target-feature" "-neon"
+// CHECK-NO-FP: "-target-feature" "-crypto"
+
+// CHECK-FP: "-target-feature" "+neonasm"
+// CHECK-FP: "-target-feature" "+cryptoasm"
+// CHECK-FP: "-target-feature" "-fp-armv8"
+// CHECK-FP: "-target-feature" "-neon"
+// CHECK-FP: "-target-feature" "-crypto"
Re: [PATCH] D13127: [ARM] Upgrade codegen for vld[234] and vst[234] to communicate a 0 address space
sbaranga accepted this revision.
sbaranga added a comment.
This revision is now accepted and ready to land.

LGTM

http://reviews.llvm.org/D13127

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
Re: [PATCH] D18963: PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4
sbaranga added a comment.

Thanks, r267869!

-Silviu

http://reviews.llvm.org/D18963
[PATCH] D19665: [ARM] Guard the declarations of f16 to f32 vcvt intrinsics in arm_neon.h by testing __ARM_FP
sbaranga created this revision.
sbaranga added a reviewer: rengolin.
sbaranga added subscribers: t.p.northover, cfe-commits.
Herald added subscribers: rengolin, aemerson.

Conversions between float and half are only available when the target has the half-precision extension. Guard these intrinsics so that they don't cause crashes in the backend.

Fixes PR27550.

http://reviews.llvm.org/D19665

Files:
  include/clang/Basic/arm_neon.td
  test/CodeGen/arm-negative-fp16.c

Index: include/clang/Basic/arm_neon.td
===================================================================
--- include/clang/Basic/arm_neon.td
+++ include/clang/Basic/arm_neon.td
@@ -704,8 +704,12 @@
 // E.3.22 Converting vectors
-def VCVT_F16_F32 : SInst<"vcvt_f16_f32", "md", "Hf">;
-def VCVT_F32_F16 : SInst<"vcvt_f32_f16", "wd", "h">;
+let ArchGuard = "(__ARM_FP & 2)" in {
+  def VCVT_F16_F32 : SInst<"vcvt_f16_f32", "md", "Hf">;
+  def VCVT_F32_F16 : SInst<"vcvt_f32_f16", "wd", "h">;
+  def VCVT_HIGH_F16_F32 : SOpInst<"vcvt_high_f16", "hmj", "Hf", OP_VCVT_NA_HI_F16>;
+  def VCVT_HIGH_F32_F16 : SOpInst<"vcvt_high_f32", "wk", "h", OP_VCVT_EX_HI_F32>;
+}
 def VCVT_S32 : SInst<"vcvt_s32", "xd", "fQf">;
 def VCVT_U32 : SInst<"vcvt_u32", "ud", "fQf">;
@@ -981,8 +985,6 @@
 def VCVT_U64 : SInst<"vcvt_u64", "ud", "dQd">;
 def VCVT_F64 : SInst<"vcvt_f64", "Fd", "lUlQlQUl">;
-def VCVT_HIGH_F16_F32 : SOpInst<"vcvt_high_f16", "hmj", "Hf", OP_VCVT_NA_HI_F16>;
-def VCVT_HIGH_F32_F16 : SOpInst<"vcvt_high_f32", "wk", "h", OP_VCVT_EX_HI_F32>;
 def VCVT_HIGH_F32_F64 : SOpInst<"vcvt_high_f32", "qfj", "d", OP_VCVT_NA_HI_F32>;
 def VCVT_HIGH_F64_F32 : SOpInst<"vcvt_high_f64", "wj", "f", OP_VCVT_EX_HI_F64>;
Index: test/CodeGen/arm-negative-fp16.c
===================================================================
--- /dev/null
+++ test/CodeGen/arm-negative-fp16.c
@@ -0,0 +1,18 @@
+// RUN: %clang_cc1 -triple thumbv7-none-eabi %s -target-feature +neon -target-feature -fp16 -fsyntax-only -verify
+
+#include <arm_neon.h>
+
+float16x4_t test_vcvt_f16_f32(float32x4_t a) {
+  return vcvt_f16_f32(a); // expected-warning{{implicit declaration of function 'vcvt_f16_f32'}} expected-error{{returning 'int' from a function with incompatible result type 'float16x4_t'}}
+}
+
+float32x4_t test_vcvt_f32_f16(float16x4_t a) {
+  return vcvt_f32_f16(a); // expected-warning{{implicit declaration of function 'vcvt_f32_f16'}} expected-error{{returning 'int' from a function with incompatible result type 'float32x4_t'}}
+}
+
+float32x4_t test_vcvt_high_f32_f16(float16x8_t a) {
+  return vcvt_high_f32_f16(a); // expected-warning{{implicit declaration of function 'vcvt_high_f32_f16'}} expected-error{{returning 'int' from a function with incompatible result type 'float32x4_t'}}
+}
+float16x8_t test_vcvt_high_f16_f32(float16x4_t a, float32x4_t b) {
+  return vcvt_high_f16_f32(a, b); // expected-warning{{implicit declaration of function 'vcvt_high_f16_f32'}} expected-error{{returning 'int' from a function with incompatible result type 'float16x8_t'}}
+}
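The guard `(__ARM_FP & 2)` tests a single bit of the `__ARM_FP` bitmap. As a rough sketch, assuming the ACLE bit layout (bit 1 for half precision, bit 2 for single, bit 3 for double), the check can be written as a plain function. The constant and function names below are hypothetical and exist only for this illustration.

```cpp
// Bit positions of __ARM_FP as described by ACLE (assumed here):
// bit 1 = half precision, bit 2 = single precision, bit 3 = double precision.
constexpr unsigned ArmFpHalf = 1u << 1;   // 0x2
constexpr unsigned ArmFpSingle = 1u << 2; // 0x4
constexpr unsigned ArmFpDouble = 1u << 3; // 0x8

// Hypothetical helper: true when a guard of the form "(__ARM_FP & 2)" would
// pass for the given __ARM_FP value.
constexpr bool hasHalfPrecision(unsigned ArmFp) {
  return (ArmFp & ArmFpHalf) != 0;
}
```

Under that layout, a value of `0xE` (half, single and double) passes the guard, while `0xC` (single and double only) does not, so the guarded intrinsics would not be declared.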
Re: [PATCH] D19665: [ARM] Guard the declarations of f16 to f32 vcvt intrinsics in arm_neon.h by testing __ARM_FP
sbaranga updated this revision to Diff 5.
sbaranga added a comment.

Don't change the AArch64 intrinsics and move the test to Sema.

http://reviews.llvm.org/D19665

Files:
  include/clang/Basic/arm_neon.td
  test/Sema/arm-no-fp16.c

Index: include/clang/Basic/arm_neon.td
===================================================================
--- include/clang/Basic/arm_neon.td
+++ include/clang/Basic/arm_neon.td
@@ -704,8 +704,10 @@
 // E.3.22 Converting vectors
-def VCVT_F16_F32 : SInst<"vcvt_f16_f32", "md", "Hf">;
-def VCVT_F32_F16 : SInst<"vcvt_f32_f16", "wd", "h">;
+let ArchGuard = "(__ARM_FP & 2)" in {
+  def VCVT_F16_F32 : SInst<"vcvt_f16_f32", "md", "Hf">;
+  def VCVT_F32_F16 : SInst<"vcvt_f32_f16", "wd", "h">;
+}
 def VCVT_S32 : SInst<"vcvt_s32", "xd", "fQf">;
 def VCVT_U32 : SInst<"vcvt_u32", "ud", "fQf">;
Index: test/Sema/arm-no-fp16.c
===================================================================
--- /dev/null
+++ test/Sema/arm-no-fp16.c
@@ -0,0 +1,11 @@
+// RUN: %clang_cc1 -triple thumbv7-none-eabi %s -target-feature +neon -target-feature -fp16 -fsyntax-only -verify
+
+#include <arm_neon.h>
+
+float16x4_t test_vcvt_f16_f32(float32x4_t a) {
+  return vcvt_f16_f32(a); // expected-warning{{implicit declaration of function 'vcvt_f16_f32'}} expected-error{{returning 'int' from a function with incompatible result type 'float16x4_t'}}
+}
+
+float32x4_t test_vcvt_f32_f16(float16x4_t a) {
+  return vcvt_f32_f16(a); // expected-warning{{implicit declaration of function 'vcvt_f32_f16'}} expected-error{{returning 'int' from a function with incompatible result type 'float32x4_t'}}
+}
Re: [PATCH] D19665: [ARM] Guard the declarations of f16 to f32 vcvt intrinsics in arm_neon.h by testing __ARM_FP
sbaranga added inline comments.

Comment at: include/clang/Basic/arm_neon.td:710-711
@@ -709,2 +709,4 @@
+  def VCVT_F32_F16 : SInst<"vcvt_f32_f16", "wd", "h">;
+}
 def VCVT_S32 : SInst<"vcvt_s32", "xd", "fQf">;

Thanks for catching this!

http://reviews.llvm.org/D19665
Re: [PATCH] D19665: [ARM] Guard the declarations of f16 to f32 vcvt intrinsics in arm_neon.h by testing __ARM_FP
sbaranga added a comment.

Thanks, r268047!

Cheers,
Silviu

http://reviews.llvm.org/D19665
[PATCH] D18963: PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4
sbaranga created this revision.
sbaranga added a reviewer: t.p.northover.
sbaranga added a subscriber: cfe-commits.
Herald added subscribers: rengolin, aemerson.

According to the ACLE spec, "__ARM_FEATURE_FMA is defined to 1 if the hardware floating-point architecture supports fused floating-point multiply-accumulate".

This changes clang's behaviour from emitting this macro for v7-A and v7-R cores to only emitting it when the target has VFPv4 (and therefore support for the floating point multiply-accumulate instruction).

Fixes PR27216

http://reviews.llvm.org/D18963

Files:
  lib/Basic/Targets.cpp
  test/CodeGen/arm-neon-fma.c
  test/Preprocessor/arm-acle-6.5.c
  test/Sema/arm_vfma.c

Index: lib/Basic/Targets.cpp
===================================================================
--- lib/Basic/Targets.cpp
+++ lib/Basic/Targets.cpp
@@ -4927,7 +4927,8 @@
       Builder.defineMacro("__ARM_FP16_ARGS", "1");

     // ACLE 6.5.3 Fused multiply-accumulate (FMA)
-    if (ArchVersion >= 7 && (CPUProfile != "M" || CPUAttr == "7EM"))
+    if (ArchVersion >= 7 && (CPUProfile != "M" || CPUAttr == "7EM") &&
+        (FPU & VFP4FPU))
       Builder.defineMacro("__ARM_FEATURE_FMA", "1");

     // Subtarget options.
Index: test/CodeGen/arm-neon-fma.c
===================================================================
--- test/CodeGen/arm-neon-fma.c
+++ test/CodeGen/arm-neon-fma.c
@@ -3,6 +3,7 @@
 // RUN: -target-cpu cortex-a8 \
 // RUN: -mfloat-abi hard \
 // RUN: -ffreestanding \
+// RUN: -target-feature +vfp4 \
 // RUN: -emit-llvm -o - %s | opt -S -mem2reg | FileCheck %s

 #include <arm_neon.h>
Index: test/Preprocessor/arm-acle-6.5.c
===================================================================
--- test/Preprocessor/arm-acle-6.5.c
+++ test/Preprocessor/arm-acle-6.5.c
@@ -49,10 +49,13 @@

 // CHECK-NO-FMA-NOT: __ARM_FEATURE_FMA

-// RUN: %clang -target armv7a-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
-// RUN: %clang -target armv7r-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv7a-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv7a-eabi -mfpu=neon-vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv7r-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv7r-eabi -mfpu=neon-vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
 // RUN: %clang -target armv7em-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
-// RUN: %clang -target armv8-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv8-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv8-eabi -mfpu=neon-vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA

 // CHECK-FMA: __ARM_FEATURE_FMA 1

Index: test/Sema/arm_vfma.c
===================================================================
--- test/Sema/arm_vfma.c
+++ test/Sema/arm_vfma.c
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -triple thumbv7s-apple-ios7.0 -target-feature +neon -fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple thumbv7s-apple-ios7.0 -target-feature +neon -target-feature +vfp4 -fsyntax-only -verify %s
 #include <arm_neon.h>

 // expected-no-diagnostics
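To see why the accuracy of this macro matters to users, here is a minimal sketch of how portable code might consume `__ARM_FEATURE_FMA`. The function name `mulAdd` is hypothetical; the only assumption taken from the thread is that the macro is defined to 1 when the target has hardware fused multiply-accumulate.

```cpp
#include <cmath>

// Hypothetical user code: computes a * b + c, preferring a fused operation
// only when the compiler advertises hardware FMA support.
inline float mulAdd(float a, float b, float c) {
#if defined(__ARM_FEATURE_FMA)
  // Expected to map to a hardware fused multiply-accumulate (one rounding).
  return std::fma(a, b, c);
#else
  // Separate multiply and add. Calling std::fma here could fall back to a
  // slow library routine on targets without the instruction.
  return a * b + c;
#endif
}
```

If the macro were defined on a target without VFPv4, code like this would take the `std::fma` path and could end up in a libcall, which is the situation the patch aims to avoid. Note that a compiler may still contract `a * b + c` on its own depending on its floating-point contraction settings.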
Re: [PATCH] D18963: PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4
sbaranga added inline comments.

Comment at: lib/Basic/Targets.cpp:4931
@@ -4931,1 +4930,3 @@
+    if (ArchVersion >= 7 && (CPUProfile != "M" || CPUAttr == "7EM") &&
+        (FPU & VFP4FPU))
       Builder.defineMacro("__ARM_FEATURE_FMA", "1");

rengolin wrote:
> I think just two checks are necessary, here:
>
>   (FPU & VFPV4FPU) || (ArchVersion > 7)
>
> and make sure that the right FPU flag is set from the right cores, plus
> "+vfp4".

Yes, that should be sufficient.

Comment at: test/CodeGen/arm-neon-fma.c:6
@@ -5,2 +5,3 @@
 // RUN: -ffreestanding \
+// RUN: -target-feature +vfp4 \
 // RUN: -emit-llvm -o - %s | opt -S -mem2reg | FileCheck %s

rengolin wrote:
> why not change the cpu to a core that has vfp4?
>
> I know the test is about FMA, not the CPU, but this is a combination that
> will never occur in the wild...

Sure, good point.

Comment at: test/Sema/arm_vfma.c:1
@@ -1,2 +1,2 @@
-// RUN: %clang_cc1 -triple thumbv7s-apple-ios7.0 -target-feature +neon -fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple thumbv7s-apple-ios7.0 -target-feature +neon -target-feature +vfp4 -fsyntax-only -verify %s
 #include <arm_neon.h>

rengolin wrote:
> It's possible that v7 Apple cores always have FMA? I'd make sure of that
> before forcing the flag here. We don't want to disable it inadvertently.
>
> @t.p.northover, can you confirm Apple's support for VFP4?

If they do support it and don't have the vfp4 feature, then before this patch clang/llvm wouldn't have emitted a fma/vfma instruction anyway in any circumstances (because the backend will not generate it). The backend would instead legalize it with fmaf() libcalls - but that's not the correct behaviour according to the spec.

http://reviews.llvm.org/D18963
Re: [PATCH] D18963: PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4
sbaranga updated this revision to Diff 53254.
sbaranga added a comment.

Apply review comments from Renato:
- simplify condition for enabling __ARM_FEATURE_FMA
- use cortex-a7 instead of cortex-a8 for testing since this is a real use case.

http://reviews.llvm.org/D18963

Files:
  lib/Basic/Targets.cpp
  test/CodeGen/arm-neon-fma.c
  test/Preprocessor/arm-acle-6.5.c
  test/Sema/arm_vfma.c

Index: lib/Basic/Targets.cpp
===================================================================
--- lib/Basic/Targets.cpp
+++ lib/Basic/Targets.cpp
@@ -4927,7 +4927,7 @@
       Builder.defineMacro("__ARM_FP16_ARGS", "1");

     // ACLE 6.5.3 Fused multiply-accumulate (FMA)
-    if (ArchVersion >= 7 && (CPUProfile != "M" || CPUAttr == "7EM"))
+    if (ArchVersion >= 7 && (FPU & VFP4FPU))
       Builder.defineMacro("__ARM_FEATURE_FMA", "1");

     // Subtarget options.
Index: test/CodeGen/arm-neon-fma.c
===================================================================
--- test/CodeGen/arm-neon-fma.c
+++ test/CodeGen/arm-neon-fma.c
@@ -1,6 +1,6 @@
 // RUN: %clang_cc1 -triple thumbv7-none-linux-gnueabihf \
 // RUN: -target-abi aapcs \
-// RUN: -target-cpu cortex-a8 \
+// RUN: -target-cpu cortex-a7 \
 // RUN: -mfloat-abi hard \
 // RUN: -ffreestanding \
 // RUN: -emit-llvm -o - %s | opt -S -mem2reg | FileCheck %s
Index: test/Preprocessor/arm-acle-6.5.c
===================================================================
--- test/Preprocessor/arm-acle-6.5.c
+++ test/Preprocessor/arm-acle-6.5.c
@@ -49,10 +49,13 @@

 // CHECK-NO-FMA-NOT: __ARM_FEATURE_FMA

-// RUN: %clang -target armv7a-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
-// RUN: %clang -target armv7r-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv7a-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv7a-eabi -mfpu=vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv7r-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv7r-eabi -mfpu=vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
 // RUN: %clang -target armv7em-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
-// RUN: %clang -target armv8-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv8-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv8-eabi -mfpu=vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA

 // CHECK-FMA: __ARM_FEATURE_FMA 1

Index: test/Sema/arm_vfma.c
===================================================================
--- test/Sema/arm_vfma.c
+++ test/Sema/arm_vfma.c
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -triple thumbv7s-apple-ios7.0 -target-feature +neon -fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple thumbv7s-apple-ios7.0 -target-feature +neon -target-feature +vfp4 -fsyntax-only -verify %s
 #include <arm_neon.h>

 // expected-no-diagnostics
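The difference between the original condition and the simplified one in this diff can be sketched as two standalone predicates. This is an illustration only: the function names are hypothetical, and the bit value chosen for `VFP4FPU` is an assumption made for the sketch, not clang's actual enumerator value.

```cpp
#include <string>

// Illustrative bit value only; the real enumerator in clang may differ.
constexpr unsigned VFP4FPU = 1u << 2;

// Condition before the patch: depends only on the architecture profile.
bool definesFmaBefore(unsigned ArchVersion, const std::string &CPUProfile,
                      const std::string &CPUAttr) {
  return ArchVersion >= 7 && (CPUProfile != "M" || CPUAttr == "7EM");
}

// Condition in this diff: the FPU must actually provide VFPv4.
bool definesFmaAfter(unsigned ArchVersion, unsigned FPU) {
  return ArchVersion >= 7 && (FPU & VFP4FPU) != 0;
}
```

Under these assumptions, an ARMv7-A target with no VFPv4 bit set satisfies the old predicate but not the new one, which matches the armv7a-eabi RUN line moving from CHECK-FMA to CHECK-NO-FMA.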
Re: [PATCH] D18963: PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4
sbaranga added inline comments.

Comment at: test/Sema/arm_vfma.c:1
@@ -1,2 +1,2 @@
-// RUN: %clang_cc1 -triple thumbv7s-apple-ios7.0 -target-feature +neon -fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple thumbv7s-apple-ios7.0 -target-feature +neon -target-feature +vfp4 -fsyntax-only -verify %s
 #include <arm_neon.h>

t.p.northover wrote:
> rengolin wrote:
> > t.p.northover wrote:
> > > v7s is Swift, which has FMA. v7 for us is Cortex-A9, which I think also
> > > has FMA (not that it matters much these days).
> > v7 is Cortex-A8, and neither A8 nor A9 have FMA in VFP, only NEON.
> >
> > Does Swift have FMA in VFP? or just NEON?
> Sorry, it appears virtually every part of my statement was wrong then. v7
> really does seem to be Cortex-A8 even for us, and Swift doesn't have scalar
> VFMA.

The error seems to be coming from how getDefaultFPU() is called when the cpu is not specified. It turns out that it gets called with an empty CPU string (perhaps we meant to call it with either "generic" or the CPU that was set in ARMTargetInfo, which does get correctly recognized as swift in this case).

FWIW, Cortex-A9 doesn't have FMA,

http://reviews.llvm.org/D18963
Re: [PATCH] D18963: PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4
sbaranga updated this revision to Diff 53409.
sbaranga added a comment.

If no cpu has been passed to the command line, use the generic cpu when selecting features/FPU, instead of using an empty string (which is not recognized by the TargetParser).

http://reviews.llvm.org/D18963

Files:
  lib/Basic/Targets.cpp
  test/CodeGen/arm-long-calls.c
  test/CodeGen/arm-neon-fma.c
  test/CodeGen/arm-no-movt.c
  test/Preprocessor/arm-acle-6.5.c
  test/Sema/arm_vfma.c
  test/Sema/neon-vector-types-support.c

Index: lib/Basic/Targets.cpp
===================================================================
--- lib/Basic/Targets.cpp
+++ lib/Basic/Targets.cpp
@@ -4707,6 +4707,8 @@
   initFeatureMap(llvm::StringMap<bool> &Features, DiagnosticsEngine &Diags,
                  StringRef CPU,
                  const std::vector<std::string> &FeaturesVec) const override {
+    if (CPU == "")
+      CPU = "generic";

     std::vector<const char*> TargetFeatures;
     unsigned Arch = llvm::ARM::parseArch(getTriple().getArchName());
@@ -4927,7 +4929,7 @@
       Builder.defineMacro("__ARM_FP16_ARGS", "1");

     // ACLE 6.5.3 Fused multiply-accumulate (FMA)
-    if (ArchVersion >= 7 && (CPUProfile != "M" || CPUAttr == "7EM"))
+    if (ArchVersion >= 7 && (FPU & VFP4FPU))
       Builder.defineMacro("__ARM_FEATURE_FMA", "1");

     // Subtarget options.
Index: test/CodeGen/arm-no-movt.c
===================================================================
--- test/CodeGen/arm-no-movt.c
+++ test/CodeGen/arm-no-movt.c
@@ -1,7 +1,7 @@
-// RUN: %clang_cc1 -triple thumbv7-apple-ios5 -target-feature +no-movt -emit-llvm -o - %s | FileCheck -check-prefix=NO-MOVT %s
+// RUN: %clang_cc1 -triple thumbv7-apple-ios5 -target-feature +no-movt -emit-llvm -o - %s | FileCheck -check-prefix=NO-MOVT %s
 // RUN: %clang_cc1 -triple thumbv7-apple-ios5 -emit-llvm -o - %s | FileCheck -check-prefix=MOVT %s

-// NO-MOVT: attributes #0 = { {{.*}} "target-features"="+no-movt"
-// MOVT-NOT: attributes #0 = { {{.*}} "target-features"="+no-movt"
+// NO-MOVT: attributes #0 = { {{.*}} "target-features"="{{.*}}+no-movt{{.*}}"
+// MOVT-NOT: attributes #0 = { {{.*}} "target-features"="{{.*}}+no-movt{{.*}}"

 int foo1(int a) { return a; }
Index: test/CodeGen/arm-long-calls.c
===================================================================
--- test/CodeGen/arm-long-calls.c
+++ test/CodeGen/arm-long-calls.c
@@ -1,7 +1,7 @@
 // RUN: %clang_cc1 -triple thumbv7-apple-ios5 -target-feature +long-calls -emit-llvm -o - %s | FileCheck -check-prefix=LONGCALL %s
 // RUN: %clang_cc1 -triple thumbv7-apple-ios5 -emit-llvm -o - %s | FileCheck -check-prefix=NOLONGCALL %s

-// LONGCALL: attributes #0 = { {{.*}} "target-features"="+long-calls"
-// NOLONGCALL-NOT: attributes #0 = { {{.*}} "target-features"="+long-calls"
+// LONGCALL: attributes #0 = { {{.*}} "target-features"="{{.*}}+long-calls{{.*}}"
+// NOLONGCALL-NOT: attributes #0 = { {{.*}} "target-features"="{{.*}}+long-calls{{.*}}"

 int foo1(int a) { return a; }
Index: test/CodeGen/arm-neon-fma.c
===================================================================
--- test/CodeGen/arm-neon-fma.c
+++ test/CodeGen/arm-neon-fma.c
@@ -1,6 +1,6 @@
 // RUN: %clang_cc1 -triple thumbv7-none-linux-gnueabihf \
 // RUN: -target-abi aapcs \
-// RUN: -target-cpu cortex-a8 \
+// RUN: -target-cpu cortex-a7 \
 // RUN: -mfloat-abi hard \
 // RUN: -ffreestanding \
 // RUN: -emit-llvm -o - %s | opt -S -mem2reg | FileCheck %s
Index: test/Preprocessor/arm-acle-6.5.c
===================================================================
--- test/Preprocessor/arm-acle-6.5.c
+++ test/Preprocessor/arm-acle-6.5.c
@@ -49,10 +49,13 @@

 // CHECK-NO-FMA-NOT: __ARM_FEATURE_FMA

-// RUN: %clang -target armv7a-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
-// RUN: %clang -target armv7r-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv7a-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv7a-eabi -mfpu=vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv7r-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv7r-eabi -mfpu=vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
 // RUN: %clang -target armv7em-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
-// RUN: %clang -target armv8-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv8-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv8-eabi -mfpu=vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA

 // CHECK-FMA: __ARM_FEATURE_FMA 1

Index: test/Sema/arm_vfma.c
===================================================================
--- test/Sema/arm_vfma.c
+++ test/Sema/arm_vfma.c
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -triple thumbv7s-apple-ios7.0 -target-feature +neon -fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple t
Re: [PATCH] D18963: PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4
sbaranga added a comment. I've updated the patch to fix the defaults when the cpu is not specified. Renato, Tim, could you have a look at this again please? Thanks, Silviu http://reviews.llvm.org/D18963 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
Re: [PATCH] D18963: PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4
sbaranga added a comment. A gentle ping? Cheers, Silviu http://reviews.llvm.org/D18963
Re: [PATCH] D18963: PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4
sbaranga added inline comments.

Comment at: lib/Basic/Targets.cpp:4710
@@ -4709,1 +4709,3 @@
 const std::vector &FeaturesVec) const override {
+if (CPU == "")
+  CPU = "generic";

rengolin wrote:
> This change is unrelated and may bring side effects into clang. I'd keep this
> out and investigate it in another patch with the appropriate tests. If you
> just force the target-feature in the test, this corner case won't be relevant
> in this patch.

Ok, that makes sense. I'll revert to the previous revision of this patch.

http://reviews.llvm.org/D18963
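For readers following the thread, the fallback under discussion can be sketched in isolation as below. This is only an illustration of the idea (treat an unspecified CPU as "generic" before any feature/FPU lookup); `effectiveCPU` and `checkEffectiveCPU` are hypothetical names and are not part of Clang or of the patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not a Clang API: map an unspecified (empty) CPU name
// to "generic" so that later feature/FPU lookups never see an empty string.
std::string effectiveCPU(const std::string &CPU) {
  return CPU.empty() ? std::string("generic") : CPU;
}

// Small self-check of the fallback behaviour described in the review.
void checkEffectiveCPU() {
  assert(effectiveCPU("") == "generic");          // no CPU given
  assert(effectiveCPU("cortex-a7") == "cortex-a7"); // explicit CPU is kept
}
```

As the review notes, a change of this kind affects every target that omits the CPU, which is presumably why it was split out of this patch.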
Re: [PATCH] D18963: PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4
sbaranga updated this revision to Diff 55018.
sbaranga added a comment.

Address the latest review comments (which means rolling back to the last change).

http://reviews.llvm.org/D18963

Files:
  lib/Basic/Targets.cpp
  test/CodeGen/arm-neon-fma.c
  test/Preprocessor/arm-acle-6.5.c
  test/Sema/arm_vfma.c

Index: lib/Basic/Targets.cpp
===
--- lib/Basic/Targets.cpp
+++ lib/Basic/Targets.cpp
@@ -4931,7 +4931,7 @@
 Builder.defineMacro("__ARM_FP16_ARGS", "1");

 // ACLE 6.5.3 Fused multiply-accumulate (FMA)
-if (ArchVersion >= 7 && (CPUProfile != "M" || CPUAttr == "7EM"))
+if (ArchVersion >= 7 && (FPU & VFP4FPU))
   Builder.defineMacro("__ARM_FEATURE_FMA", "1");

 // Subtarget options.
Index: test/CodeGen/arm-neon-fma.c
===
--- test/CodeGen/arm-neon-fma.c
+++ test/CodeGen/arm-neon-fma.c
@@ -1,6 +1,6 @@
 // RUN: %clang_cc1 -triple thumbv7-none-linux-gnueabihf \
 // RUN: -target-abi aapcs \
-// RUN: -target-cpu cortex-a8 \
+// RUN: -target-cpu cortex-a7 \
 // RUN: -mfloat-abi hard \
 // RUN: -ffreestanding \
 // RUN: -emit-llvm -o - %s | opt -S -mem2reg | FileCheck %s
Index: test/Preprocessor/arm-acle-6.5.c
===
--- test/Preprocessor/arm-acle-6.5.c
+++ test/Preprocessor/arm-acle-6.5.c
@@ -49,10 +49,13 @@

 // CHECK-NO-FMA-NOT: __ARM_FEATURE_FMA

-// RUN: %clang -target armv7a-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
-// RUN: %clang -target armv7r-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv7a-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv7a-eabi -mfpu=vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv7r-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv7r-eabi -mfpu=vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
 // RUN: %clang -target armv7em-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
-// RUN: %clang -target armv8-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA
+// RUN: %clang -target armv8-eabi -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-NO-FMA
+// RUN: %clang -target armv8-eabi -mfpu=vfpv4 -x c -E -dM %s -o - | FileCheck %s -check-prefix CHECK-FMA

 // CHECK-FMA: __ARM_FEATURE_FMA 1

Index: test/Sema/arm_vfma.c
===
--- test/Sema/arm_vfma.c
+++ test/Sema/arm_vfma.c
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -triple thumbv7s-apple-ios7.0 -target-feature +neon -fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple thumbv7-none-eabi -target-feature +neon -target-feature +vfp4 -fsyntax-only -verify %s
 #include

 // expected-no-diagnostics
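The effect of the changed condition in lib/Basic/Targets.cpp can be sketched as two standalone predicates. This is a simplified model, not the Clang code: only the name `VFP4FPU` and the two conditions come from the diff; the other enumerators, the bit values, and the function names `definesFMAOld` / `definesFMANew` are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>

// Illustrative FPU bit flags. Only VFP4FPU is named in the patch; the other
// enumerators and all bit values are assumptions for this sketch.
enum FPUMode : unsigned {
  VFP2FPU = 1u << 0,
  VFP3FPU = 1u << 1,
  VFP4FPU = 1u << 2,
  NeonFPU = 1u << 3
};

// Old condition from the diff: depends only on architecture version and
// profile, so any v7-A/v7-R target gets the macro regardless of its FPU.
bool definesFMAOld(unsigned ArchVersion, const std::string &CPUProfile,
                   const std::string &CPUAttr) {
  return ArchVersion >= 7 && (CPUProfile != "M" || CPUAttr == "7EM");
}

// New condition from the diff: the macro additionally requires the VFPv4 bit.
bool definesFMANew(unsigned ArchVersion, unsigned FPU) {
  return ArchVersion >= 7 && (FPU & VFP4FPU) != 0;
}
```

Under this model a v7-A target with only VFPv3 and NEON satisfied the old condition but not the new one, which matches the test updates above that add -mfpu=vfpv4 or switch the CPU from cortex-a8 to cortex-a7.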
Re: [PATCH] D18963: PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4
sbaranga added inline comments.

Comment at: test/Sema/arm_vfma.c:1
@@ -1,2 +1,2 @@
-// RUN: %clang_cc1 -triple thumbv7s-apple-ios7.0 -target-feature +neon -fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple thumbv7-none-eabi -target-feature +neon -target-feature +vfp4 -fsyntax-only -verify %s
 #include

I updated this test, but used thumbv7-none-eabi here, since VFPv4 requires at least v7.

http://reviews.llvm.org/D18963
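From the user's side, the point of defining `__ARM_FEATURE_FMA` only when VFPv4 is present is that code can rely on the macro as a feature test. A minimal sketch of such a guard follows; `hasArmFMA` and `mulAdd` are hypothetical helper names, and the fallback branch is an assumption about how a caller might handle targets without fused multiply-accumulate.

```cpp
#include <cassert>
#include <cmath>

// Reports whether the compiler advertises fused multiply-accumulate through
// the ACLE feature macro for the current target.
constexpr bool hasArmFMA() {
#ifdef __ARM_FEATURE_FMA
  return true;
#else
  return false;
#endif
}

// Hypothetical helper: use a fused multiply-add when the macro is defined,
// otherwise fall back to a separate multiply and add.
float mulAdd(float a, float b, float c) {
#ifdef __ARM_FEATURE_FMA
  return std::fma(a, b, c); // single rounding step
#else
  return a * b + c;         // result may be rounded twice
#endif
}
```

With the patch applied, building this for armv7a-eabi without -mfpu=vfpv4 would be expected to take the fallback branch, in line with the CHECK-NO-FMA lines in test/Preprocessor/arm-acle-6.5.c.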