[PATCH] D31167: Use FPContractModeKind universally
hfinkel added inline comments. Comment at: cfe/trunk/include/clang/Basic/LangOptions.def:220 +/// \brief FP_CONTRACT mode (on/off/fast). +ENUM_LANGOPT(DefaultFPContractMode, FPContractModeKind, 2, FPC_Off, "FP contraction type") LANGOPT(NoBitFieldTypeAlign , 1, 0, "bit-field type alignment") anemet wrote: > hfinkel wrote: > > anemet wrote: > > > hfinkel wrote: > > > > yaxunl wrote: > > > > > hfinkel wrote: > > > > > > yaxunl wrote: > > > > > > > anemet wrote: > > > > > > > > yaxunl wrote: > > > > > > > > > anemet wrote: > > > > > > > > > > yaxunl wrote: > > > > > > > > > > > This change seemed to cause some performance degradations > > > > > > > > > > > in some OpenCL applications. > > > > > > > > > > > > > > > > > > > > > > This option used to be on by default. Now it is off by > > > > > > > > > > > default. > > > > > > > > > > > > > > > > > > > > > > There are applications do separated compile/link/codegen > > > > > > > > > > > stages. In the codegen stage, clang is invoked with .bc > > > > > > > > > > > as input, therefore this option is off by default, > > > > > > > > > > > whereas it was on by default before this change. > > > > > > > > > > > > > > > > > > > > > > Is there any reason not to keep the original behavior? > > > > > > > > > > > > > > > > > > > > > > Thanks. > > > > > > > > > > > This change seemed to cause some performance degradations > > > > > > > > > > > in some OpenCL applications. > > > > > > > > > > > > > > > > > > > > > > This option used to be on by default. Now it is off by > > > > > > > > > > > default. > > > > > > > > > > > > > > > > > > > > It's always been off. It was set to fast for CUDA which > > > > > > > > > > should still be the case. See the changes further down on > > > > > > > > > > the patch. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > There are applications do separated compile/link/codegen > > > > > > > > > > > stages. In the codegen stage, clang is invoked with .bc > > > > > > > > > > > as input, therefore this option is off by default, > > > > > > > > > > > whereas it was on by default before this change. > > > > > > > > > > > > > > > > > > > > > > Is there any reason not to keep the original behavior? > > > > > > > > > > > > > > > > > > > > Sorry but I am not sure what changed, can you elaborate on > > > > > > > > > > what you're doing and how things used to work for you? > > > > > > > > > Before your change, -ffp-contract was a codegen option > > > > > > > > > defined as > > > > > > > > > > > > > > > > > > > > > > > > > > > ``` > > > > > > > > > ENUM_CODEGENOPT(FPContractMode, FPContractModeKind, 2, FPC_On) > > > > > > > > > ``` > > > > > > > > > > > > > > > > > > therefore the default value as on. After your change, it > > > > > > > > > becomes off by default. > > > > > > > > No. -ffp-contract=on was a FE option and -ffp-contract=fast was > > > > > > > > a backend option. I still don't understand your use-case. Can > > > > > > > > you include a small testcase how this used to work before? > > > > > > > -ffp-contract=on used to be a codegen option before this change. > > > > > > > In CodeGen/BackendUtil.cpp, before this change: > > > > > > > > > > > > > > ``` > > > > > > > switch (CodeGenOpts.getFPContractMode()) { > > > > > > > switch (LangOpts.getDefaultFPContractMode()) { > > > > > > > case LangOptions::FPC_Off: > > > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Strict; > > > > > > > break; > > > > > > > case LangOptions::FPC_On: > > > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Standard; > > > > > > > break; > > > > > > > case LangOptions::FPC_Fast: > > > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Fast; > > > > > > > break; > > > > > > > } > > > > > > > ``` > > > > > > > By default, -fp-contract=on, which results in > > > > > > > llvm::FPOpFusion::Standard in the backend. This options allows > > > > > > > backend to translate llvm.fmuladd to FMA machine instructions. > > > > > > > > > > > > > > After this change, it becomes: > > > > > > > > > > > > > > ``` > > > > > > > switch (LangOpts.getDefaultFPContractMode()) { > > > > > > > case LangOptions::FPC_Off: > > > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Strict; > > > > > > > break; > > > > > > > case LangOptions::FPC_On: > > > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Standard; > > > > > > > break; > > > > > > > case LangOptions::FPC_Fast: > > > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Fast; > > > > > > > break; > > > > > > > } > > > > > > > ``` > > > > > > > now by default -ffp-contract=off, which results in > > > > > > > llvm::FPOpFusion::Strict in the backend. This option disables > > > > > > > backend translating llvm.fmuladd to FMA machine instructions in > > > > > > > certain situations. > > > > > > > > > > > > > > A simple lit test is as follows: > > > > > > > > > > > > > > > >
[PATCH] D31167: Use FPContractModeKind universally
yaxunl added a comment. In https://reviews.llvm.org/D31167#725705, @mehdi_amini wrote: > I believe considering the goal of moving to using per-instruction FMF and > kills the global backend option, this leads to a bitcode upgrade issue: > OpenCL (or other) bitcode that were generated bitcode don't have the right > FMF and will be upgraded conservatively. > Performance regression when upgrading bitcode are to be expected in general, > so it is not a bug. > To recover, an option for an OpenCL backend would be to add a pass that set > the expected FMF everywhere after bitcode upgrade. Actually we are not concerned about old bitcodes. We are doing separate compile/link/codegen, but the bitcode we use in codegen is fresh from the same revision of llvm/clang. As long as the FMF carries the correct info from Clang we should be OK. Repository: rL LLVM https://reviews.llvm.org/D31167 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D31167: Use FPContractModeKind universally
mehdi_amini added a comment. I believe considering the goal of moving to using per-instruction FMF and kills the global backend option, this leads to a bitcode upgrade issue: OpenCL (or other) bitcode that were generated bitcode don't have the right FMF and will be upgraded conservatively. Performance regression when upgrading bitcode are to be expected in general, so it is not a bug. To recover, an option for an OpenCL backend would be to add a pass that set the expected FMF everywhere after bitcode upgrade. Repository: rL LLVM https://reviews.llvm.org/D31167 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D31167: Use FPContractModeKind universally
anemet added inline comments. Comment at: cfe/trunk/include/clang/Basic/LangOptions.def:220 +/// \brief FP_CONTRACT mode (on/off/fast). +ENUM_LANGOPT(DefaultFPContractMode, FPContractModeKind, 2, FPC_Off, "FP contraction type") LANGOPT(NoBitFieldTypeAlign , 1, 0, "bit-field type alignment") hfinkel wrote: > anemet wrote: > > hfinkel wrote: > > > yaxunl wrote: > > > > hfinkel wrote: > > > > > yaxunl wrote: > > > > > > anemet wrote: > > > > > > > yaxunl wrote: > > > > > > > > anemet wrote: > > > > > > > > > yaxunl wrote: > > > > > > > > > > This change seemed to cause some performance degradations > > > > > > > > > > in some OpenCL applications. > > > > > > > > > > > > > > > > > > > > This option used to be on by default. Now it is off by > > > > > > > > > > default. > > > > > > > > > > > > > > > > > > > > There are applications do separated compile/link/codegen > > > > > > > > > > stages. In the codegen stage, clang is invoked with .bc as > > > > > > > > > > input, therefore this option is off by default, whereas it > > > > > > > > > > was on by default before this change. > > > > > > > > > > > > > > > > > > > > Is there any reason not to keep the original behavior? > > > > > > > > > > > > > > > > > > > > Thanks. > > > > > > > > > > This change seemed to cause some performance degradations > > > > > > > > > > in some OpenCL applications. > > > > > > > > > > > > > > > > > > > > This option used to be on by default. Now it is off by > > > > > > > > > > default. > > > > > > > > > > > > > > > > > > It's always been off. It was set to fast for CUDA which > > > > > > > > > should still be the case. See the changes further down on > > > > > > > > > the patch. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > There are applications do separated compile/link/codegen > > > > > > > > > > stages. In the codegen stage, clang is invoked with .bc as > > > > > > > > > > input, therefore this option is off by default, whereas it > > > > > > > > > > was on by default before this change. > > > > > > > > > > > > > > > > > > > > Is there any reason not to keep the original behavior? > > > > > > > > > > > > > > > > > > Sorry but I am not sure what changed, can you elaborate on > > > > > > > > > what you're doing and how things used to work for you? > > > > > > > > Before your change, -ffp-contract was a codegen option defined > > > > > > > > as > > > > > > > > > > > > > > > > > > > > > > > > ``` > > > > > > > > ENUM_CODEGENOPT(FPContractMode, FPContractModeKind, 2, FPC_On) > > > > > > > > ``` > > > > > > > > > > > > > > > > therefore the default value as on. After your change, it > > > > > > > > becomes off by default. > > > > > > > No. -ffp-contract=on was a FE option and -ffp-contract=fast was a > > > > > > > backend option. I still don't understand your use-case. Can you > > > > > > > include a small testcase how this used to work before? > > > > > > -ffp-contract=on used to be a codegen option before this change. In > > > > > > CodeGen/BackendUtil.cpp, before this change: > > > > > > > > > > > > ``` > > > > > > switch (CodeGenOpts.getFPContractMode()) { > > > > > > switch (LangOpts.getDefaultFPContractMode()) { > > > > > > case LangOptions::FPC_Off: > > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Strict; > > > > > > break; > > > > > > case LangOptions::FPC_On: > > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Standard; > > > > > > break; > > > > > > case LangOptions::FPC_Fast: > > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Fast; > > > > > > break; > > > > > > } > > > > > > ``` > > > > > > By default, -fp-contract=on, which results in > > > > > > llvm::FPOpFusion::Standard in the backend. This options allows > > > > > > backend to translate llvm.fmuladd to FMA machine instructions. > > > > > > > > > > > > After this change, it becomes: > > > > > > > > > > > > ``` > > > > > > switch (LangOpts.getDefaultFPContractMode()) { > > > > > > case LangOptions::FPC_Off: > > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Strict; > > > > > > break; > > > > > > case LangOptions::FPC_On: > > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Standard; > > > > > > break; > > > > > > case LangOptions::FPC_Fast: > > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Fast; > > > > > > break; > > > > > > } > > > > > > ``` > > > > > > now by default -ffp-contract=off, which results in > > > > > > llvm::FPOpFusion::Strict in the backend. This option disables > > > > > > backend translating llvm.fmuladd to FMA machine instructions in > > > > > > certain situations. > > > > > > > > > > > > A simple lit test is as follows: > > > > > > > > > > > > > > > > > > ``` > > > > > > ; RUN: %clang_cc1 -triple amdgcn -S -o - %s| FileCheck %s > > > > > > > > > > > > define amdgpu_kernel void @f(double addrspace(1)* nocapture %out, > > > > > > double %a, double %b, double
[PATCH] D31167: Use FPContractModeKind universally
hfinkel added inline comments. Comment at: cfe/trunk/include/clang/Basic/LangOptions.def:220 +/// \brief FP_CONTRACT mode (on/off/fast). +ENUM_LANGOPT(DefaultFPContractMode, FPContractModeKind, 2, FPC_Off, "FP contraction type") LANGOPT(NoBitFieldTypeAlign , 1, 0, "bit-field type alignment") anemet wrote: > hfinkel wrote: > > yaxunl wrote: > > > hfinkel wrote: > > > > yaxunl wrote: > > > > > anemet wrote: > > > > > > yaxunl wrote: > > > > > > > anemet wrote: > > > > > > > > yaxunl wrote: > > > > > > > > > This change seemed to cause some performance degradations in > > > > > > > > > some OpenCL applications. > > > > > > > > > > > > > > > > > > This option used to be on by default. Now it is off by > > > > > > > > > default. > > > > > > > > > > > > > > > > > > There are applications do separated compile/link/codegen > > > > > > > > > stages. In the codegen stage, clang is invoked with .bc as > > > > > > > > > input, therefore this option is off by default, whereas it > > > > > > > > > was on by default before this change. > > > > > > > > > > > > > > > > > > Is there any reason not to keep the original behavior? > > > > > > > > > > > > > > > > > > Thanks. > > > > > > > > > This change seemed to cause some performance degradations in > > > > > > > > > some OpenCL applications. > > > > > > > > > > > > > > > > > > This option used to be on by default. Now it is off by > > > > > > > > > default. > > > > > > > > > > > > > > > > It's always been off. It was set to fast for CUDA which should > > > > > > > > still be the case. See the changes further down on the patch. > > > > > > > > > > > > > > > > > > > > > > > > > > There are applications do separated compile/link/codegen > > > > > > > > > stages. In the codegen stage, clang is invoked with .bc as > > > > > > > > > input, therefore this option is off by default, whereas it > > > > > > > > > was on by default before this change. > > > > > > > > > > > > > > > > > > Is there any reason not to keep the original behavior? > > > > > > > > > > > > > > > > Sorry but I am not sure what changed, can you elaborate on what > > > > > > > > you're doing and how things used to work for you? > > > > > > > Before your change, -ffp-contract was a codegen option defined as > > > > > > > > > > > > > > > > > > > > > ``` > > > > > > > ENUM_CODEGENOPT(FPContractMode, FPContractModeKind, 2, FPC_On) > > > > > > > ``` > > > > > > > > > > > > > > therefore the default value as on. After your change, it becomes > > > > > > > off by default. > > > > > > No. -ffp-contract=on was a FE option and -ffp-contract=fast was a > > > > > > backend option. I still don't understand your use-case. Can you > > > > > > include a small testcase how this used to work before? > > > > > -ffp-contract=on used to be a codegen option before this change. In > > > > > CodeGen/BackendUtil.cpp, before this change: > > > > > > > > > > ``` > > > > > switch (CodeGenOpts.getFPContractMode()) { > > > > > switch (LangOpts.getDefaultFPContractMode()) { > > > > > case LangOptions::FPC_Off: > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Strict; > > > > > break; > > > > > case LangOptions::FPC_On: > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Standard; > > > > > break; > > > > > case LangOptions::FPC_Fast: > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Fast; > > > > > break; > > > > > } > > > > > ``` > > > > > By default, -fp-contract=on, which results in > > > > > llvm::FPOpFusion::Standard in the backend. This options allows > > > > > backend to translate llvm.fmuladd to FMA machine instructions. > > > > > > > > > > After this change, it becomes: > > > > > > > > > > ``` > > > > > switch (LangOpts.getDefaultFPContractMode()) { > > > > > case LangOptions::FPC_Off: > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Strict; > > > > > break; > > > > > case LangOptions::FPC_On: > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Standard; > > > > > break; > > > > > case LangOptions::FPC_Fast: > > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Fast; > > > > > break; > > > > > } > > > > > ``` > > > > > now by default -ffp-contract=off, which results in > > > > > llvm::FPOpFusion::Strict in the backend. This option disables backend > > > > > translating llvm.fmuladd to FMA machine instructions in certain > > > > > situations. > > > > > > > > > > A simple lit test is as follows: > > > > > > > > > > > > > > > ``` > > > > > ; RUN: %clang_cc1 -triple amdgcn -S -o - %s| FileCheck %s > > > > > > > > > > define amdgpu_kernel void @f(double addrspace(1)* nocapture %out, > > > > > double %a, double %b, double %c) local_unnamed_addr #0 { > > > > > entry: > > > > > ; CHECK: v_fma_f64 > > > > > %0 = tail call double @llvm.fmuladd.f64(double %b, double %c, > > > > > double %a) > > > > > store double %0, double addrspace(1)* %out, align 8, !tbaa !6 > > > > >
[PATCH] D31167: Use FPContractModeKind universally
anemet added inline comments. Comment at: cfe/trunk/include/clang/Basic/LangOptions.def:220 +/// \brief FP_CONTRACT mode (on/off/fast). +ENUM_LANGOPT(DefaultFPContractMode, FPContractModeKind, 2, FPC_Off, "FP contraction type") LANGOPT(NoBitFieldTypeAlign , 1, 0, "bit-field type alignment") hfinkel wrote: > yaxunl wrote: > > hfinkel wrote: > > > yaxunl wrote: > > > > anemet wrote: > > > > > yaxunl wrote: > > > > > > anemet wrote: > > > > > > > yaxunl wrote: > > > > > > > > This change seemed to cause some performance degradations in > > > > > > > > some OpenCL applications. > > > > > > > > > > > > > > > > This option used to be on by default. Now it is off by default. > > > > > > > > > > > > > > > > There are applications do separated compile/link/codegen > > > > > > > > stages. In the codegen stage, clang is invoked with .bc as > > > > > > > > input, therefore this option is off by default, whereas it was > > > > > > > > on by default before this change. > > > > > > > > > > > > > > > > Is there any reason not to keep the original behavior? > > > > > > > > > > > > > > > > Thanks. > > > > > > > > This change seemed to cause some performance degradations in > > > > > > > > some OpenCL applications. > > > > > > > > > > > > > > > > This option used to be on by default. Now it is off by default. > > > > > > > > > > > > > > It's always been off. It was set to fast for CUDA which should > > > > > > > still be the case. See the changes further down on the patch. > > > > > > > > > > > > > > > > > > > > > > > There are applications do separated compile/link/codegen > > > > > > > > stages. In the codegen stage, clang is invoked with .bc as > > > > > > > > input, therefore this option is off by default, whereas it was > > > > > > > > on by default before this change. > > > > > > > > > > > > > > > > Is there any reason not to keep the original behavior? > > > > > > > > > > > > > > Sorry but I am not sure what changed, can you elaborate on what > > > > > > > you're doing and how things used to work for you? > > > > > > Before your change, -ffp-contract was a codegen option defined as > > > > > > > > > > > > > > > > > > ``` > > > > > > ENUM_CODEGENOPT(FPContractMode, FPContractModeKind, 2, FPC_On) > > > > > > ``` > > > > > > > > > > > > therefore the default value as on. After your change, it becomes > > > > > > off by default. > > > > > No. -ffp-contract=on was a FE option and -ffp-contract=fast was a > > > > > backend option. I still don't understand your use-case. Can you > > > > > include a small testcase how this used to work before? > > > > -ffp-contract=on used to be a codegen option before this change. In > > > > CodeGen/BackendUtil.cpp, before this change: > > > > > > > > ``` > > > > switch (CodeGenOpts.getFPContractMode()) { > > > > switch (LangOpts.getDefaultFPContractMode()) { > > > > case LangOptions::FPC_Off: > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Strict; > > > > break; > > > > case LangOptions::FPC_On: > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Standard; > > > > break; > > > > case LangOptions::FPC_Fast: > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Fast; > > > > break; > > > > } > > > > ``` > > > > By default, -fp-contract=on, which results in > > > > llvm::FPOpFusion::Standard in the backend. This options allows backend > > > > to translate llvm.fmuladd to FMA machine instructions. > > > > > > > > After this change, it becomes: > > > > > > > > ``` > > > > switch (LangOpts.getDefaultFPContractMode()) { > > > > case LangOptions::FPC_Off: > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Strict; > > > > break; > > > > case LangOptions::FPC_On: > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Standard; > > > > break; > > > > case LangOptions::FPC_Fast: > > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Fast; > > > > break; > > > > } > > > > ``` > > > > now by default -ffp-contract=off, which results in > > > > llvm::FPOpFusion::Strict in the backend. This option disables backend > > > > translating llvm.fmuladd to FMA machine instructions in certain > > > > situations. > > > > > > > > A simple lit test is as follows: > > > > > > > > > > > > ``` > > > > ; RUN: %clang_cc1 -triple amdgcn -S -o - %s| FileCheck %s > > > > > > > > define amdgpu_kernel void @f(double addrspace(1)* nocapture %out, > > > > double %a, double %b, double %c) local_unnamed_addr #0 { > > > > entry: > > > > ; CHECK: v_fma_f64 > > > > %0 = tail call double @llvm.fmuladd.f64(double %b, double %c, double > > > > %a) > > > > store double %0, double addrspace(1)* %out, align 8, !tbaa !6 > > > > ret void > > > > } > > > > > > > > declare double @llvm.fmuladd.f64(double, double, double) #1 > > > > > > > > attributes #0 = { nounwind > > > > "correctly-rounded-divide-sqrt-fp-math"="false" > > > > "disable-tail-calls"="false" "less-precise-fpmad"="false"
[PATCH] D31167: Use FPContractModeKind universally
hfinkel added inline comments. Comment at: cfe/trunk/include/clang/Basic/LangOptions.def:220 +/// \brief FP_CONTRACT mode (on/off/fast). +ENUM_LANGOPT(DefaultFPContractMode, FPContractModeKind, 2, FPC_Off, "FP contraction type") LANGOPT(NoBitFieldTypeAlign , 1, 0, "bit-field type alignment") yaxunl wrote: > hfinkel wrote: > > yaxunl wrote: > > > anemet wrote: > > > > yaxunl wrote: > > > > > anemet wrote: > > > > > > yaxunl wrote: > > > > > > > This change seemed to cause some performance degradations in some > > > > > > > OpenCL applications. > > > > > > > > > > > > > > This option used to be on by default. Now it is off by default. > > > > > > > > > > > > > > There are applications do separated compile/link/codegen stages. > > > > > > > In the codegen stage, clang is invoked with .bc as input, > > > > > > > therefore this option is off by default, whereas it was on by > > > > > > > default before this change. > > > > > > > > > > > > > > Is there any reason not to keep the original behavior? > > > > > > > > > > > > > > Thanks. > > > > > > > This change seemed to cause some performance degradations in some > > > > > > > OpenCL applications. > > > > > > > > > > > > > > This option used to be on by default. Now it is off by default. > > > > > > > > > > > > It's always been off. It was set to fast for CUDA which should > > > > > > still be the case. See the changes further down on the patch. > > > > > > > > > > > > > > > > > > > > There are applications do separated compile/link/codegen stages. > > > > > > > In the codegen stage, clang is invoked with .bc as input, > > > > > > > therefore this option is off by default, whereas it was on by > > > > > > > default before this change. > > > > > > > > > > > > > > Is there any reason not to keep the original behavior? > > > > > > > > > > > > Sorry but I am not sure what changed, can you elaborate on what > > > > > > you're doing and how things used to work for you? > > > > > Before your change, -ffp-contract was a codegen option defined as > > > > > > > > > > > > > > > ``` > > > > > ENUM_CODEGENOPT(FPContractMode, FPContractModeKind, 2, FPC_On) > > > > > ``` > > > > > > > > > > therefore the default value as on. After your change, it becomes off > > > > > by default. > > > > No. -ffp-contract=on was a FE option and -ffp-contract=fast was a > > > > backend option. I still don't understand your use-case. Can you > > > > include a small testcase how this used to work before? > > > -ffp-contract=on used to be a codegen option before this change. In > > > CodeGen/BackendUtil.cpp, before this change: > > > > > > ``` > > > switch (CodeGenOpts.getFPContractMode()) { > > > switch (LangOpts.getDefaultFPContractMode()) { > > > case LangOptions::FPC_Off: > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Strict; > > > break; > > > case LangOptions::FPC_On: > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Standard; > > > break; > > > case LangOptions::FPC_Fast: > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Fast; > > > break; > > > } > > > ``` > > > By default, -fp-contract=on, which results in llvm::FPOpFusion::Standard > > > in the backend. This options allows backend to translate llvm.fmuladd to > > > FMA machine instructions. > > > > > > After this change, it becomes: > > > > > > ``` > > > switch (LangOpts.getDefaultFPContractMode()) { > > > case LangOptions::FPC_Off: > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Strict; > > > break; > > > case LangOptions::FPC_On: > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Standard; > > > break; > > > case LangOptions::FPC_Fast: > > > Options.AllowFPOpFusion = llvm::FPOpFusion::Fast; > > > break; > > > } > > > ``` > > > now by default -ffp-contract=off, which results in > > > llvm::FPOpFusion::Strict in the backend. This option disables backend > > > translating llvm.fmuladd to FMA machine instructions in certain > > > situations. > > > > > > A simple lit test is as follows: > > > > > > > > > ``` > > > ; RUN: %clang_cc1 -triple amdgcn -S -o - %s| FileCheck %s > > > > > > define amdgpu_kernel void @f(double addrspace(1)* nocapture %out, double > > > %a, double %b, double %c) local_unnamed_addr #0 { > > > entry: > > > ; CHECK: v_fma_f64 > > > %0 = tail call double @llvm.fmuladd.f64(double %b, double %c, double %a) > > > store double %0, double addrspace(1)* %out, align 8, !tbaa !6 > > > ret void > > > } > > > > > > declare double @llvm.fmuladd.f64(double, double, double) #1 > > > > > > attributes #0 = { nounwind > > > "correctly-rounded-divide-sqrt-fp-math"="false" > > > "disable-tail-calls"="false" "less-precise-fpmad"="false" > > > "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" > > > "no-jump-tables"="false" "no-nans-fp-math"="false" > > > "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" > > > "stack-protector-buffer-size"="8" > > >
[PATCH] D31167: Use FPContractModeKind universally
yaxunl added inline comments. Comment at: cfe/trunk/include/clang/Basic/LangOptions.def:220 +/// \brief FP_CONTRACT mode (on/off/fast). +ENUM_LANGOPT(DefaultFPContractMode, FPContractModeKind, 2, FPC_Off, "FP contraction type") LANGOPT(NoBitFieldTypeAlign , 1, 0, "bit-field type alignment") hfinkel wrote: > yaxunl wrote: > > anemet wrote: > > > yaxunl wrote: > > > > anemet wrote: > > > > > yaxunl wrote: > > > > > > This change seemed to cause some performance degradations in some > > > > > > OpenCL applications. > > > > > > > > > > > > This option used to be on by default. Now it is off by default. > > > > > > > > > > > > There are applications do separated compile/link/codegen stages. In > > > > > > the codegen stage, clang is invoked with .bc as input, therefore > > > > > > this option is off by default, whereas it was on by default before > > > > > > this change. > > > > > > > > > > > > Is there any reason not to keep the original behavior? > > > > > > > > > > > > Thanks. > > > > > > This change seemed to cause some performance degradations in some > > > > > > OpenCL applications. > > > > > > > > > > > > This option used to be on by default. Now it is off by default. > > > > > > > > > > It's always been off. It was set to fast for CUDA which should still > > > > > be the case. See the changes further down on the patch. > > > > > > > > > > > > > > > > > There are applications do separated compile/link/codegen stages. In > > > > > > the codegen stage, clang is invoked with .bc as input, therefore > > > > > > this option is off by default, whereas it was on by default before > > > > > > this change. > > > > > > > > > > > > Is there any reason not to keep the original behavior? > > > > > > > > > > Sorry but I am not sure what changed, can you elaborate on what > > > > > you're doing and how things used to work for you? > > > > Before your change, -ffp-contract was a codegen option defined as > > > > > > > > > > > > ``` > > > > ENUM_CODEGENOPT(FPContractMode, FPContractModeKind, 2, FPC_On) > > > > ``` > > > > > > > > therefore the default value as on. After your change, it becomes off by > > > > default. > > > No. -ffp-contract=on was a FE option and -ffp-contract=fast was a backend > > > option. I still don't understand your use-case. Can you include a small > > > testcase how this used to work before? > > -ffp-contract=on used to be a codegen option before this change. In > > CodeGen/BackendUtil.cpp, before this change: > > > > ``` > > switch (CodeGenOpts.getFPContractMode()) { > > switch (LangOpts.getDefaultFPContractMode()) { > > case LangOptions::FPC_Off: > > Options.AllowFPOpFusion = llvm::FPOpFusion::Strict; > > break; > > case LangOptions::FPC_On: > > Options.AllowFPOpFusion = llvm::FPOpFusion::Standard; > > break; > > case LangOptions::FPC_Fast: > > Options.AllowFPOpFusion = llvm::FPOpFusion::Fast; > > break; > > } > > ``` > > By default, -fp-contract=on, which results in llvm::FPOpFusion::Standard in > > the backend. This options allows backend to translate llvm.fmuladd to FMA > > machine instructions. > > > > After this change, it becomes: > > > > ``` > > switch (LangOpts.getDefaultFPContractMode()) { > > case LangOptions::FPC_Off: > > Options.AllowFPOpFusion = llvm::FPOpFusion::Strict; > > break; > > case LangOptions::FPC_On: > > Options.AllowFPOpFusion = llvm::FPOpFusion::Standard; > > break; > > case LangOptions::FPC_Fast: > > Options.AllowFPOpFusion = llvm::FPOpFusion::Fast; > > break; > > } > > ``` > > now by default -ffp-contract=off, which results in > > llvm::FPOpFusion::Strict in the backend. This option disables backend > > translating llvm.fmuladd to FMA machine instructions in certain situations. > > > > A simple lit test is as follows: > > > > > > ``` > > ; RUN: %clang_cc1 -triple amdgcn -S -o - %s| FileCheck %s > > > > define amdgpu_kernel void @f(double addrspace(1)* nocapture %out, double > > %a, double %b, double %c) local_unnamed_addr #0 { > > entry: > > ; CHECK: v_fma_f64 > > %0 = tail call double @llvm.fmuladd.f64(double %b, double %c, double %a) > > store double %0, double addrspace(1)* %out, align 8, !tbaa !6 > > ret void > > } > > > > declare double @llvm.fmuladd.f64(double, double, double) #1 > > > > attributes #0 = { nounwind "correctly-rounded-divide-sqrt-fp-math"="false" > > "disable-tail-calls"="false" "less-precise-fpmad"="false" > > "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" > > "no-jump-tables"="false" "no-nans-fp-math"="false" > > "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" > > "stack-protector-buffer-size"="8" > > "target-features"="+fp64-fp16-denormals,-fp32-denormals" > > "unsafe-fp-math"="false" "use-soft-float"="false" } > > attributes #1 = { nounwind readnone } > > > > ``` > > The test passes before this change and fails after this change. > I'm missing something
[PATCH] D31167: Use FPContractModeKind universally
hfinkel added inline comments. Comment at: cfe/trunk/include/clang/Basic/LangOptions.def:220 +/// \brief FP_CONTRACT mode (on/off/fast). +ENUM_LANGOPT(DefaultFPContractMode, FPContractModeKind, 2, FPC_Off, "FP contraction type") LANGOPT(NoBitFieldTypeAlign , 1, 0, "bit-field type alignment") yaxunl wrote: > anemet wrote: > > yaxunl wrote: > > > anemet wrote: > > > > yaxunl wrote: > > > > > This change seemed to cause some performance degradations in some > > > > > OpenCL applications. > > > > > > > > > > This option used to be on by default. Now it is off by default. > > > > > > > > > > There are applications do separated compile/link/codegen stages. In > > > > > the codegen stage, clang is invoked with .bc as input, therefore this > > > > > option is off by default, whereas it was on by default before this > > > > > change. > > > > > > > > > > Is there any reason not to keep the original behavior? > > > > > > > > > > Thanks. > > > > > This change seemed to cause some performance degradations in some > > > > > OpenCL applications. > > > > > > > > > > This option used to be on by default. Now it is off by default. > > > > > > > > It's always been off. It was set to fast for CUDA which should still > > > > be the case. See the changes further down on the patch. > > > > > > > > > > > > > > There are applications do separated compile/link/codegen stages. In > > > > > the codegen stage, clang is invoked with .bc as input, therefore this > > > > > option is off by default, whereas it was on by default before this > > > > > change. > > > > > > > > > > Is there any reason not to keep the original behavior? > > > > > > > > Sorry but I am not sure what changed, can you elaborate on what you're > > > > doing and how things used to work for you? > > > Before your change, -ffp-contract was a codegen option defined as > > > > > > > > > ``` > > > ENUM_CODEGENOPT(FPContractMode, FPContractModeKind, 2, FPC_On) > > > ``` > > > > > > therefore the default value as on. After your change, it becomes off by > > > default. > > No. -ffp-contract=on was a FE option and -ffp-contract=fast was a backend > > option. I still don't understand your use-case. Can you include a small > > testcase how this used to work before? > -ffp-contract=on used to be a codegen option before this change. In > CodeGen/BackendUtil.cpp, before this change: > > ``` > switch (CodeGenOpts.getFPContractMode()) { > switch (LangOpts.getDefaultFPContractMode()) { > case LangOptions::FPC_Off: > Options.AllowFPOpFusion = llvm::FPOpFusion::Strict; > break; > case LangOptions::FPC_On: > Options.AllowFPOpFusion = llvm::FPOpFusion::Standard; > break; > case LangOptions::FPC_Fast: > Options.AllowFPOpFusion = llvm::FPOpFusion::Fast; > break; > } > ``` > By default, -fp-contract=on, which results in llvm::FPOpFusion::Standard in > the backend. This options allows backend to translate llvm.fmuladd to FMA > machine instructions. > > After this change, it becomes: > > ``` > switch (LangOpts.getDefaultFPContractMode()) { > case LangOptions::FPC_Off: > Options.AllowFPOpFusion = llvm::FPOpFusion::Strict; > break; > case LangOptions::FPC_On: > Options.AllowFPOpFusion = llvm::FPOpFusion::Standard; > break; > case LangOptions::FPC_Fast: > Options.AllowFPOpFusion = llvm::FPOpFusion::Fast; > break; > } > ``` > now by default -ffp-contract=off, which results in llvm::FPOpFusion::Strict > in the backend. This option disables backend translating llvm.fmuladd to FMA > machine instructions in certain situations. > > A simple lit test is as follows: > > > ``` > ; RUN: %clang_cc1 -triple amdgcn -S -o - %s| FileCheck %s > > define amdgpu_kernel void @f(double addrspace(1)* nocapture %out, double %a, > double %b, double %c) local_unnamed_addr #0 { > entry: > ; CHECK: v_fma_f64 > %0 = tail call double @llvm.fmuladd.f64(double %b, double %c, double %a) > store double %0, double addrspace(1)* %out, align 8, !tbaa !6 > ret void > } > > declare double @llvm.fmuladd.f64(double, double, double) #1 > > attributes #0 = { nounwind "correctly-rounded-divide-sqrt-fp-math"="false" > "disable-tail-calls"="false" "less-precise-fpmad"="false" > "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" > "no-jump-tables"="false" "no-nans-fp-math"="false" > "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" > "stack-protector-buffer-size"="8" > "target-features"="+fp64-fp16-denormals,-fp32-denormals" > "unsafe-fp-math"="false" "use-soft-float"="false" } > attributes #1 = { nounwind readnone } > > ``` > The test passes before this change and fails after this change. I'm missing something here. We're calling for OpenCL: Opts.setDefaultFPContractMode(LangOptions::FPC_On); which seems like it should change the result returned by getDefaultFPContractMode(). Why does it not? Repository: rL LLVM https://reviews.llvm.org/D31167
[PATCH] D31167: Use FPContractModeKind universally
yaxunl added inline comments. Comment at: cfe/trunk/include/clang/Basic/LangOptions.def:220 +/// \brief FP_CONTRACT mode (on/off/fast). +ENUM_LANGOPT(DefaultFPContractMode, FPContractModeKind, 2, FPC_Off, "FP contraction type") LANGOPT(NoBitFieldTypeAlign , 1, 0, "bit-field type alignment") anemet wrote: > yaxunl wrote: > > anemet wrote: > > > yaxunl wrote: > > > > This change seemed to cause some performance degradations in some > > > > OpenCL applications. > > > > > > > > This option used to be on by default. Now it is off by default. > > > > > > > > There are applications do separated compile/link/codegen stages. In the > > > > codegen stage, clang is invoked with .bc as input, therefore this > > > > option is off by default, whereas it was on by default before this > > > > change. > > > > > > > > Is there any reason not to keep the original behavior? > > > > > > > > Thanks. > > > > This change seemed to cause some performance degradations in some > > > > OpenCL applications. > > > > > > > > This option used to be on by default. Now it is off by default. > > > > > > It's always been off. It was set to fast for CUDA which should still be > > > the case. See the changes further down on the patch. > > > > > > > > > > > There are applications do separated compile/link/codegen stages. In the > > > > codegen stage, clang is invoked with .bc as input, therefore this > > > > option is off by default, whereas it was on by default before this > > > > change. > > > > > > > > Is there any reason not to keep the original behavior? > > > > > > Sorry but I am not sure what changed, can you elaborate on what you're > > > doing and how things used to work for you? > > Before your change, -ffp-contract was a codegen option defined as > > > > > > ``` > > ENUM_CODEGENOPT(FPContractMode, FPContractModeKind, 2, FPC_On) > > ``` > > > > therefore the default value as on. After your change, it becomes off by > > default. > No. -ffp-contract=on was a FE option and -ffp-contract=fast was a backend > option. I still don't understand your use-case. Can you include a small > testcase how this used to work before? -ffp-contract=on used to be a codegen option before this change. In CodeGen/BackendUtil.cpp, before this change: ``` switch (CodeGenOpts.getFPContractMode()) { switch (LangOpts.getDefaultFPContractMode()) { case LangOptions::FPC_Off: Options.AllowFPOpFusion = llvm::FPOpFusion::Strict; break; case LangOptions::FPC_On: Options.AllowFPOpFusion = llvm::FPOpFusion::Standard; break; case LangOptions::FPC_Fast: Options.AllowFPOpFusion = llvm::FPOpFusion::Fast; break; } ``` By default, -fp-contract=on, which results in llvm::FPOpFusion::Standard in the backend. This options allows backend to translate llvm.fmuladd to FMA machine instructions. After this change, it becomes: ``` switch (LangOpts.getDefaultFPContractMode()) { case LangOptions::FPC_Off: Options.AllowFPOpFusion = llvm::FPOpFusion::Strict; break; case LangOptions::FPC_On: Options.AllowFPOpFusion = llvm::FPOpFusion::Standard; break; case LangOptions::FPC_Fast: Options.AllowFPOpFusion = llvm::FPOpFusion::Fast; break; } ``` now by default -ffp-contract=off, which results in llvm::FPOpFusion::Strict in the backend. This option disables backend translating llvm.fmuladd to FMA machine instructions in certain situations. A simple lit test is as follows: ``` ; RUN: %clang_cc1 -triple amdgcn -S -o - %s| FileCheck %s define amdgpu_kernel void @f(double addrspace(1)* nocapture %out, double %a, double %b, double %c) local_unnamed_addr #0 { entry: ; CHECK: v_fma_f64 %0 = tail call double @llvm.fmuladd.f64(double %b, double %c, double %a) store double %0, double addrspace(1)* %out, align 8, !tbaa !6 ret void } declare double @llvm.fmuladd.f64(double, double, double) #1 attributes #0 = { nounwind "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-features"="+fp64-fp16-denormals,-fp32-denormals" "unsafe-fp-math"="false" "use-soft-float"="false" } attributes #1 = { nounwind readnone } ``` The test passes before this change and fails after this change. Comment at: cfe/trunk/lib/Frontend/CompilerInvocation.cpp:1638 Opts.LaxVectorConversions = 0; -Opts.DefaultFPContract = 1; +Opts.setDefaultFPContractMode(LangOptions::FPC_On); Opts.NativeHalfType = 1; yaxunl wrote: > hfinkel wrote: > > yaxunl wrote: > > > hfinkel wrote: > > > > Looks like the intent is certainly to have kept the OpenCL default the > > > > same here. > > > Well we also use clang to compile LLVM IR to ISA. Before this change, > > > fp-contract
[PATCH] D31167: Use FPContractModeKind universally
anemet added inline comments. Comment at: cfe/trunk/include/clang/Basic/LangOptions.def:220 +/// \brief FP_CONTRACT mode (on/off/fast). +ENUM_LANGOPT(DefaultFPContractMode, FPContractModeKind, 2, FPC_Off, "FP contraction type") LANGOPT(NoBitFieldTypeAlign , 1, 0, "bit-field type alignment") yaxunl wrote: > anemet wrote: > > yaxunl wrote: > > > This change seemed to cause some performance degradations in some OpenCL > > > applications. > > > > > > This option used to be on by default. Now it is off by default. > > > > > > There are applications do separated compile/link/codegen stages. In the > > > codegen stage, clang is invoked with .bc as input, therefore this option > > > is off by default, whereas it was on by default before this change. > > > > > > Is there any reason not to keep the original behavior? > > > > > > Thanks. > > > This change seemed to cause some performance degradations in some OpenCL > > > applications. > > > > > > This option used to be on by default. Now it is off by default. > > > > It's always been off. It was set to fast for CUDA which should still be > > the case. See the changes further down on the patch. > > > > > > > > There are applications do separated compile/link/codegen stages. In the > > > codegen stage, clang is invoked with .bc as input, therefore this option > > > is off by default, whereas it was on by default before this change. > > > > > > Is there any reason not to keep the original behavior? > > > > Sorry but I am not sure what changed, can you elaborate on what you're > > doing and how things used to work for you? > Before your change, -ffp-contract was a codegen option defined as > > > ``` > ENUM_CODEGENOPT(FPContractMode, FPContractModeKind, 2, FPC_On) > ``` > > therefore the default value as on. After your change, it becomes off by > default. No. -ffp-contract=on was a FE option and -ffp-contract=fast was a backend option. I still don't understand your use-case. Can you include a small testcase how this used to work before? Repository: rL LLVM https://reviews.llvm.org/D31167 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D31167: Use FPContractModeKind universally
yaxunl added inline comments. Comment at: cfe/trunk/lib/Frontend/CompilerInvocation.cpp:1638 Opts.LaxVectorConversions = 0; -Opts.DefaultFPContract = 1; +Opts.setDefaultFPContractMode(LangOptions::FPC_On); Opts.NativeHalfType = 1; hfinkel wrote: > yaxunl wrote: > > hfinkel wrote: > > > Looks like the intent is certainly to have kept the OpenCL default the > > > same here. > > Well we also use clang to compile LLVM IR to ISA. Before this change, > > fp-contract was on so that backend did fp contract. However, now clang does > > not turn on fp-contract when compiling LLVM IRs. > That's the equivalent of 'Fast'. Maybe this should also be set to Fast then. Agree. That should fix this issue. Repository: rL LLVM https://reviews.llvm.org/D31167 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D31167: Use FPContractModeKind universally
hfinkel added inline comments. Comment at: cfe/trunk/lib/Frontend/CompilerInvocation.cpp:1638 Opts.LaxVectorConversions = 0; -Opts.DefaultFPContract = 1; +Opts.setDefaultFPContractMode(LangOptions::FPC_On); Opts.NativeHalfType = 1; yaxunl wrote: > hfinkel wrote: > > Looks like the intent is certainly to have kept the OpenCL default the same > > here. > Well we also use clang to compile LLVM IR to ISA. Before this change, > fp-contract was on so that backend did fp contract. However, now clang does > not turn on fp-contract when compiling LLVM IRs. That's the equivalent of 'Fast'. Maybe this should also be set to Fast then. Repository: rL LLVM https://reviews.llvm.org/D31167 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D31167: Use FPContractModeKind universally
yaxunl added inline comments. Comment at: cfe/trunk/lib/Frontend/CompilerInvocation.cpp:1638 Opts.LaxVectorConversions = 0; -Opts.DefaultFPContract = 1; +Opts.setDefaultFPContractMode(LangOptions::FPC_On); Opts.NativeHalfType = 1; hfinkel wrote: > Looks like the intent is certainly to have kept the OpenCL default the same > here. Well we also use clang to compile LLVM IR to ISA. Before this change, fp-contract was on so that backend did fp contract. However, now clang does not turn on fp-contract when compiling LLVM IRs. Repository: rL LLVM https://reviews.llvm.org/D31167 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D31167: Use FPContractModeKind universally
yaxunl added inline comments. Comment at: cfe/trunk/include/clang/Basic/LangOptions.def:220 +/// \brief FP_CONTRACT mode (on/off/fast). +ENUM_LANGOPT(DefaultFPContractMode, FPContractModeKind, 2, FPC_Off, "FP contraction type") LANGOPT(NoBitFieldTypeAlign , 1, 0, "bit-field type alignment") anemet wrote: > yaxunl wrote: > > This change seemed to cause some performance degradations in some OpenCL > > applications. > > > > This option used to be on by default. Now it is off by default. > > > > There are applications do separated compile/link/codegen stages. In the > > codegen stage, clang is invoked with .bc as input, therefore this option is > > off by default, whereas it was on by default before this change. > > > > Is there any reason not to keep the original behavior? > > > > Thanks. > > This change seemed to cause some performance degradations in some OpenCL > > applications. > > > > This option used to be on by default. Now it is off by default. > > It's always been off. It was set to fast for CUDA which should still be the > case. See the changes further down on the patch. > > > > > There are applications do separated compile/link/codegen stages. In the > > codegen stage, clang is invoked with .bc as input, therefore this option is > > off by default, whereas it was on by default before this change. > > > > Is there any reason not to keep the original behavior? > > Sorry but I am not sure what changed, can you elaborate on what you're doing > and how things used to work for you? Before your change, -ffp-contract was a codegen option defined as ``` ENUM_CODEGENOPT(FPContractMode, FPContractModeKind, 2, FPC_On) ``` therefore the default value as on. After your change, it becomes off by default. Repository: rL LLVM https://reviews.llvm.org/D31167 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D31167: Use FPContractModeKind universally
hfinkel added inline comments. Comment at: cfe/trunk/lib/Frontend/CompilerInvocation.cpp:1638 Opts.LaxVectorConversions = 0; -Opts.DefaultFPContract = 1; +Opts.setDefaultFPContractMode(LangOptions::FPC_On); Opts.NativeHalfType = 1; Looks like the intent is certainly to have kept the OpenCL default the same here. Repository: rL LLVM https://reviews.llvm.org/D31167 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D31167: Use FPContractModeKind universally
anemet added inline comments. Comment at: cfe/trunk/include/clang/Basic/LangOptions.def:220 +/// \brief FP_CONTRACT mode (on/off/fast). +ENUM_LANGOPT(DefaultFPContractMode, FPContractModeKind, 2, FPC_Off, "FP contraction type") LANGOPT(NoBitFieldTypeAlign , 1, 0, "bit-field type alignment") yaxunl wrote: > This change seemed to cause some performance degradations in some OpenCL > applications. > > This option used to be on by default. Now it is off by default. > > There are applications do separated compile/link/codegen stages. In the > codegen stage, clang is invoked with .bc as input, therefore this option is > off by default, whereas it was on by default before this change. > > Is there any reason not to keep the original behavior? > > Thanks. > This change seemed to cause some performance degradations in some OpenCL > applications. > > This option used to be on by default. Now it is off by default. It's always been off. It was set to fast for CUDA which should still be the case. See the changes further down on the patch. > > There are applications do separated compile/link/codegen stages. In the > codegen stage, clang is invoked with .bc as input, therefore this option is > off by default, whereas it was on by default before this change. > > Is there any reason not to keep the original behavior? Sorry but I am not sure what changed, can you elaborate on what you're doing and how things used to work for you? Repository: rL LLVM https://reviews.llvm.org/D31167 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D31167: Use FPContractModeKind universally
yaxunl added inline comments. Comment at: cfe/trunk/include/clang/Basic/LangOptions.def:220 +/// \brief FP_CONTRACT mode (on/off/fast). +ENUM_LANGOPT(DefaultFPContractMode, FPContractModeKind, 2, FPC_Off, "FP contraction type") LANGOPT(NoBitFieldTypeAlign , 1, 0, "bit-field type alignment") This change seemed to cause some performance degradations in some OpenCL applications. This option used to be on by default. Now it is off by default. There are applications do separated compile/link/codegen stages. In the codegen stage, clang is invoked with .bc as input, therefore this option is off by default, whereas it was on by default before this change. Is there any reason not to keep the original behavior? Thanks. Repository: rL LLVM https://reviews.llvm.org/D31167 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D31167: Use FPContractModeKind universally
rnk added inline comments. Comment at: cfe/trunk/include/clang/Basic/LangOptions.h:217 /// Adjust BinaryOperator::FPFeatures to match the bit-field size of this. - unsigned fp_contract : 1; + LangOptions::FPContractModeKind fp_contract : 2; }; anemet wrote: > rnk wrote: > > Please do not use bitfields with enum types, it's a good way to break the > > build on Windows. This change triggered this clang-cl warning: > > ``` > > C:\src\llvm-project\clang\include\clang/Basic/LangOptions.h(208,17): > > warning: implicit truncation from 'clang::LangOptions::FPContractModeKind' > > to bit-field changes value from 2 to -2 [-Wbitfield-constant-conversion] > > fp_contract = LangOptions::FPC_Fast; > > ^ ~ > > ``` > Noted and thanks for the fix! Unfortunately the warning wasn't showing up on > my host. I'll take a look why. Clang doesn't emit that warning on Posix because it wouldn't be true. The implicit underlying type of the enum on non-Windows is 'unsigned', not 'int'. We could issue a portability warning, but we wouldn't be able to turn it on by default because many users don't care about Windows portability. Anyway, sorry about the bother. This is one of the reasons we just use 'unsigned' for all our bitfields. =/ Repository: rL LLVM https://reviews.llvm.org/D31167 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D31167: Use FPContractModeKind universally
anemet added inline comments. Comment at: cfe/trunk/include/clang/Basic/LangOptions.h:217 /// Adjust BinaryOperator::FPFeatures to match the bit-field size of this. - unsigned fp_contract : 1; + LangOptions::FPContractModeKind fp_contract : 2; }; rnk wrote: > Please do not use bitfields with enum types, it's a good way to break the > build on Windows. This change triggered this clang-cl warning: > ``` > C:\src\llvm-project\clang\include\clang/Basic/LangOptions.h(208,17): > warning: implicit truncation from 'clang::LangOptions::FPContractModeKind' to > bit-field changes value from 2 to -2 [-Wbitfield-constant-conversion] > fp_contract = LangOptions::FPC_Fast; > ^ ~ > ``` Noted and thanks for the fix! Unfortunately the warning wasn't showing up on my host. I'll take a look why. Repository: rL LLVM https://reviews.llvm.org/D31167 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D31167: Use FPContractModeKind universally
rnk added inline comments. Comment at: cfe/trunk/include/clang/Basic/LangOptions.h:217 /// Adjust BinaryOperator::FPFeatures to match the bit-field size of this. - unsigned fp_contract : 1; + LangOptions::FPContractModeKind fp_contract : 2; }; Please do not use bitfields with enum types, it's a good way to break the build on Windows. This change triggered this clang-cl warning: ``` C:\src\llvm-project\clang\include\clang/Basic/LangOptions.h(208,17): warning: implicit truncation from 'clang::LangOptions::FPContractModeKind' to bit-field changes value from 2 to -2 [-Wbitfield-constant-conversion] fp_contract = LangOptions::FPC_Fast; ^ ~ ``` Repository: rL LLVM https://reviews.llvm.org/D31167 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D31167: Use FPContractModeKind universally
This revision was automatically updated to reflect the committed changes. Closed by commit rL299027: Use FPContractModeKind universally (authored by anemet). Changed prior to commit: https://reviews.llvm.org/D31167?vs=92423=93406#toc Repository: rL LLVM https://reviews.llvm.org/D31167 Files: cfe/trunk/include/clang/AST/Expr.h cfe/trunk/include/clang/AST/ExprCXX.h cfe/trunk/include/clang/Basic/LangOptions.def cfe/trunk/include/clang/Basic/LangOptions.h cfe/trunk/include/clang/Frontend/CodeGenOptions.def cfe/trunk/include/clang/Frontend/CodeGenOptions.h cfe/trunk/lib/CodeGen/BackendUtil.cpp cfe/trunk/lib/CodeGen/CGExprScalar.cpp cfe/trunk/lib/Frontend/CompilerInvocation.cpp cfe/trunk/lib/Sema/SemaAttr.cpp Index: cfe/trunk/lib/Sema/SemaAttr.cpp === --- cfe/trunk/lib/Sema/SemaAttr.cpp +++ cfe/trunk/lib/Sema/SemaAttr.cpp @@ -450,13 +450,16 @@ void Sema::ActOnPragmaFPContract(tok::OnOffSwitch OOS) { switch (OOS) { case tok::OOS_ON: -FPFeatures.setFPContractable(true); +FPFeatures.setAllowFPContractWithinStatement(); break; case tok::OOS_OFF: -FPFeatures.setFPContractable(false); +FPFeatures.setDisallowFPContract(); break; case tok::OOS_DEFAULT: -FPFeatures.setFPContractable(getLangOpts().DefaultFPContract); +if (getLangOpts().getDefaultFPContractMode() == LangOptions::FPC_On) + FPFeatures.setAllowFPContractWithinStatement(); +else + FPFeatures.setDisallowFPContract(); break; } } Index: cfe/trunk/lib/Frontend/CompilerInvocation.cpp === --- cfe/trunk/lib/Frontend/CompilerInvocation.cpp +++ cfe/trunk/lib/Frontend/CompilerInvocation.cpp @@ -816,18 +816,6 @@ } } - if (Arg *A = Args.getLastArg(OPT_ffp_contract)) { -StringRef Val = A->getValue(); -if (Val == "fast") - Opts.setFPContractMode(CodeGenOptions::FPC_Fast); -else if (Val == "on") - Opts.setFPContractMode(CodeGenOptions::FPC_On); -else if (Val == "off") - Opts.setFPContractMode(CodeGenOptions::FPC_Off); -else - Diags.Report(diag::err_drv_invalid_value) << A->getAsString(Args) << Val; - } - if (Arg *A = Args.getLastArg(OPT_fdenormal_fp_math_EQ)) { StringRef Val = A->getValue(); if (Val == "ieee") @@ -1647,7 +1635,7 @@ Opts.AltiVec = 0; Opts.ZVector = 0; Opts.LaxVectorConversions = 0; -Opts.DefaultFPContract = 1; +Opts.setDefaultFPContractMode(LangOptions::FPC_On); Opts.NativeHalfType = 1; Opts.NativeHalfArgsAndReturns = 1; // Include default header file for OpenCL. @@ -1658,6 +1646,9 @@ Opts.CUDA = IK == IK_CUDA || IK == IK_PreprocessedCuda || LangStd == LangStandard::lang_cuda; + if (Opts.CUDA) +// Set default FP_CONTRACT to FAST. +Opts.setDefaultFPContractMode(LangOptions::FPC_Fast); Opts.RenderScript = IK == IK_RenderScript; if (Opts.RenderScript) { @@ -2282,6 +2273,18 @@ Args.hasArg(OPT_cl_unsafe_math_optimizations) || Args.hasArg(OPT_cl_fast_relaxed_math); + if (Arg *A = Args.getLastArg(OPT_ffp_contract)) { +StringRef Val = A->getValue(); +if (Val == "fast") + Opts.setDefaultFPContractMode(LangOptions::FPC_Fast); +else if (Val == "on") + Opts.setDefaultFPContractMode(LangOptions::FPC_On); +else if (Val == "off") + Opts.setDefaultFPContractMode(LangOptions::FPC_Off); +else + Diags.Report(diag::err_drv_invalid_value) << A->getAsString(Args) << Val; + } + Opts.RetainCommentsFromSystemHeaders = Args.hasArg(OPT_fretain_comments_from_system_headers); @@ -2536,10 +2539,6 @@ // triple used for host compilation. if (LangOpts.CUDAIsDevice) Res.getTargetOpts().HostTriple = Res.getFrontendOpts().AuxTriple; - -// Set default FP_CONTRACT to FAST. -if (!Args.hasArg(OPT_ffp_contract)) - Res.getCodeGenOpts().setFPContractMode(CodeGenOptions::FPC_Fast); } // FIXME: Override value name discarding when asan or msan is used because the Index: cfe/trunk/lib/CodeGen/CGExprScalar.cpp === --- cfe/trunk/lib/CodeGen/CGExprScalar.cpp +++ cfe/trunk/lib/CodeGen/CGExprScalar.cpp @@ -2672,12 +2672,7 @@ "Only fadd/fsub can be the root of an fmuladd."); // Check whether this op is marked as fusable. - if (!op.FPFeatures.isFPContractable()) -return nullptr; - - // Check whether -ffp-contract=on. (If -ffp-contract=off/fast, fusing is - // either disabled, or handled entirely by the LLVM backend). - if (CGF.CGM.getCodeGenOpts().getFPContractMode() != CodeGenOptions::FPC_On) + if (!op.FPFeatures.allowFPContractWithinStatement()) return nullptr; // We have a potentially fusable op. Look for a mul on one of the operands. Index: cfe/trunk/lib/CodeGen/BackendUtil.cpp
[PATCH] D31167: Use FPContractModeKind universally
aaron.ballman added inline comments. Comment at: include/clang/Basic/LangOptions.h:92 + enum FPContractModeKind { +FPC_Off,// Form fused FP ops only where result will not be affected. +FPC_On, // Form fused FP ops according to FP_CONTRACT rules. anemet wrote: > aaron.ballman wrote: > > I think you mean effected rather than affected. > I think the verb is affect; effect is the noun. Quick grep confirms: > > /org/llvm$ git grep affected | wc -l > 109 > /org/llvm$ git grep effected | wc -l >2 > I think I was thrown by "form" being the verb in that sentence, but you are correct. :-) https://reviews.llvm.org/D31167 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D31167: Use FPContractModeKind universally
anemet added inline comments. Comment at: include/clang/Basic/LangOptions.h:92 + enum FPContractModeKind { +FPC_Off,// Form fused FP ops only where result will not be affected. +FPC_On, // Form fused FP ops according to FP_CONTRACT rules. aaron.ballman wrote: > I think you mean effected rather than affected. I think the verb is affect; effect is the noun. Quick grep confirms: /org/llvm$ git grep affected | wc -l 109 /org/llvm$ git grep effected | wc -l 2 https://reviews.llvm.org/D31167 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D31167: Use FPContractModeKind universally
aaron.ballman accepted this revision. aaron.ballman added a comment. This revision is now accepted and ready to land. Aside from a minor comment nit, LGTM Comment at: include/clang/Basic/LangOptions.h:92 + enum FPContractModeKind { +FPC_Off,// Form fused FP ops only where result will not be affected. +FPC_On, // Form fused FP ops according to FP_CONTRACT rules. I think you mean effected rather than affected. https://reviews.llvm.org/D31167 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D31167: Use FPContractModeKind universally
anemet created this revision. FPContractModeKind is the codegen option flag which is already ternary (off, on, fast). This makes it universally the type for the contractable info across the front-end: - In FPOptions (i.e. in the Sema + in the expression nodes). - In LangOpts::DefaultFPContractMode which is the option that initializes FPOptions in the Sema. Another way to look at this change is that before fp-contractable on/off were the only states handled to the front-end: - For "on", FMA folding was performed by the front-end - For "fast", we simply forwarded the flag to TargetOptions to handle it in LLVM Now off/on/fast are all exposed because for fast we will generate fast-math-flags during CodeGen. This is toward moving fp-contraction=fast from an LLVM TargetOption to a FastMathFlag in order to fix PR25721. https://reviews.llvm.org/D31167 Files: include/clang/AST/Expr.h include/clang/AST/ExprCXX.h include/clang/Basic/LangOptions.def include/clang/Basic/LangOptions.h include/clang/Frontend/CodeGenOptions.def include/clang/Frontend/CodeGenOptions.h lib/CodeGen/BackendUtil.cpp lib/CodeGen/CGExprScalar.cpp lib/Frontend/CompilerInvocation.cpp lib/Sema/SemaAttr.cpp Index: lib/Sema/SemaAttr.cpp === --- lib/Sema/SemaAttr.cpp +++ lib/Sema/SemaAttr.cpp @@ -450,13 +450,16 @@ void Sema::ActOnPragmaFPContract(tok::OnOffSwitch OOS) { switch (OOS) { case tok::OOS_ON: -FPFeatures.setFPContractable(true); +FPFeatures.setAllowFPContractWithinStatement(); break; case tok::OOS_OFF: -FPFeatures.setFPContractable(false); +FPFeatures.setDisallowFPContract(); break; case tok::OOS_DEFAULT: -FPFeatures.setFPContractable(getLangOpts().DefaultFPContract); +if (getLangOpts().getDefaultFPContractMode() == LangOptions::FPC_On) + FPFeatures.setAllowFPContractWithinStatement(); +else + FPFeatures.setDisallowFPContract(); break; } } Index: lib/Frontend/CompilerInvocation.cpp === --- lib/Frontend/CompilerInvocation.cpp +++ lib/Frontend/CompilerInvocation.cpp @@ -811,18 +811,6 @@ } } - if (Arg *A = Args.getLastArg(OPT_ffp_contract)) { -StringRef Val = A->getValue(); -if (Val == "fast") - Opts.setFPContractMode(CodeGenOptions::FPC_Fast); -else if (Val == "on") - Opts.setFPContractMode(CodeGenOptions::FPC_On); -else if (Val == "off") - Opts.setFPContractMode(CodeGenOptions::FPC_Off); -else - Diags.Report(diag::err_drv_invalid_value) << A->getAsString(Args) << Val; - } - if (Arg *A = Args.getLastArg(OPT_fdenormal_fp_math_EQ)) { StringRef Val = A->getValue(); if (Val == "ieee") @@ -1625,7 +1613,7 @@ Opts.ZVector = 0; Opts.CXXOperatorNames = 1; Opts.LaxVectorConversions = 0; -Opts.DefaultFPContract = 1; +Opts.setDefaultFPContractMode(LangOptions::FPC_On); Opts.NativeHalfType = 1; Opts.NativeHalfArgsAndReturns = 1; // Include default header file for OpenCL. @@ -1636,6 +1624,9 @@ Opts.CUDA = IK == IK_CUDA || IK == IK_PreprocessedCuda || LangStd == LangStandard::lang_cuda; + if (Opts.CUDA) +// Set default FP_CONTRACT to FAST. +Opts.setDefaultFPContractMode(LangOptions::FPC_Fast); Opts.RenderScript = IK == IK_RenderScript; if (Opts.RenderScript) { @@ -2263,6 +2254,18 @@ Args.hasArg(OPT_cl_unsafe_math_optimizations) || Args.hasArg(OPT_cl_fast_relaxed_math); + if (Arg *A = Args.getLastArg(OPT_ffp_contract)) { +StringRef Val = A->getValue(); +if (Val == "fast") + Opts.setDefaultFPContractMode(LangOptions::FPC_Fast); +else if (Val == "on") + Opts.setDefaultFPContractMode(LangOptions::FPC_On); +else if (Val == "off") + Opts.setDefaultFPContractMode(LangOptions::FPC_Off); +else + Diags.Report(diag::err_drv_invalid_value) << A->getAsString(Args) << Val; + } + Opts.RetainCommentsFromSystemHeaders = Args.hasArg(OPT_fretain_comments_from_system_headers); @@ -2516,10 +2519,6 @@ // triple used for host compilation. if (LangOpts.CUDAIsDevice) Res.getTargetOpts().HostTriple = Res.getFrontendOpts().AuxTriple; - -// Set default FP_CONTRACT to FAST. -if (!Args.hasArg(OPT_ffp_contract)) - Res.getCodeGenOpts().setFPContractMode(CodeGenOptions::FPC_Fast); } // FIXME: Override value name discarding when asan or msan is used because the Index: lib/CodeGen/CGExprScalar.cpp === --- lib/CodeGen/CGExprScalar.cpp +++ lib/CodeGen/CGExprScalar.cpp @@ -2663,12 +2663,7 @@ "Only fadd/fsub can be the root of an fmuladd."); // Check whether this op is marked as fusable. - if (!op.FPFeatures.isFPContractable()) -return nullptr; - - // Check whether -ffp-contract=on. (If -ffp-contract=off/fast,