[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-06-10 Thread Joshua Cranmer via cfe-commits


@@ -15883,6 +15883,95 @@ The returned value is completely identical to the 
input except for the sign bit;
 in particular, if the input is a NaN, then the quiet/signaling bit and payload
 are perfectly preserved.
 
+.. _i_fminmax_family:
+
+'``llvm.min.*``' Intrinsics Comparation
+^^^
+
+Standard:
+"
+
+IEEE754 and ISO C define some min/max operations, and they have some 
differences
+on working with qNaN/sNaN and +0.0/-0.0. Here is the list:
+
+.. list-table::
+   :header-rows: 2
+
+   * - ``ISO C``
+ - fmin/fmax
+ - none
+ - fmininum/fmaximum
+ - fminimum_num/fmaximum_num
+
+   * - ``IEEE754``
+ - none
+ - nimNUM/maxNUM (2008)
+ - minimum/maximum (2019)
+ - minimumNumber/maximumNumber (2019)
+
+   * - ``+0.0 vs -0.0``
+ - either one
+ - +0.0 > -0.0

jcranmer-intel wrote:

My copy of IEEE 754-2008 says of minNum:

> minNum(x, y) is the canonicalized number x if x < y, y if y < x, the
> canonicalized number if one operand is a number and the other a quiet NaN.
> Otherwise it is either x or y, canonicalized (this means results might differ
> among implementations).

IEEE 754 doesn't guarantee the sign of `minNum(-0.0, +0.0)`, just like `fmin`.

Note also that TS 18661-1 bound C `fmin` to IEEE 754-2008 `minNum` 
(https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1711.pdf), so `fmin` and 
`minNum` should also be considered equivalent.

(Also, while I'm being pedantic, the spelling used in IEEE 754 is `minNum`, not 
`minNUM`.)
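
To illustrate the point about `fmin`, here is a minimal sketch, assuming a libm whose `fmin` follows the `minNum` binding. The helper names are made up for illustration. The NaN case is well defined; the signed-zero case is reported rather than asserted, because either zero is a conforming result.

```cpp
#include <cmath>
#include <limits>

// True when fmin returns the numeric operand if the other one is a quiet NaN,
// which is what IEEE 754-2008 minNum specifies.
bool fmin_ignores_quiet_nan() {
  const double qnan = std::numeric_limits<double>::quiet_NaN();
  return std::fmin(1.0, qnan) == 1.0 && std::fmin(qnan, 1.0) == 1.0;
}

// Reports the sign of fmin(-0.0, +0.0). Either answer is allowed, so callers
// should not rely on the value returned here.
bool fmin_of_zeros_is_negative() {
  return std::signbit(std::fmin(-0.0, 0.0));
}
```

Whatever `fmin_of_zeros_is_negative()` returns, the result still compares equal to zero.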

https://github.com/llvm/llvm-project/pull/93841
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-06-10 Thread Joshua Cranmer via cfe-commits


@@ -15868,6 +15868,51 @@ The returned value is completely identical to the 
input except for the sign bit;
 in particular, if the input is a NaN, then the quiet/signaling bit and payload
 are perfectly preserved.
 
+.. _i_fminmax_family:
+
+'``llvm.min.*``' Intrinsics Comparation
+^^^
+
+.. list-table::
+   :header-rows: 1
+   :widths: 16 28 28 28
+
+   * - Operation
+ - minnum/maxnum
+ - minimum/maximum
+ - minimumnum/maximumnum
+
+   * - ``NUM vs qNaN``
+ - NUM, no exception
+ - qNaN, no exception
+ - qNaN, no exception
+
+   * - ``NUM vs sNaN``
+ - qNaN, invalid exception
+ - qNaN, invalid exception
+ - NUM, invalid exception
+
+   * - ``qNaN vs sNaN``
+ - qNaN, invalid exception
+ - qNaN, invalid exception
+ - qNaN, invalid exception
+
+   * - ``sNaN vs sNaN``
+ - qNaN, invalid exception
+ - qNaN, invalid exception
+ - qNaN, invalid exception
+
+   * - ``+0.0 vs -0.0``
+ - either one
+ - +0.0(max)/-0.0(min)
+ - +0.0(max)/-0.0(min)
+
+   * - ``NUM vs NUM``
+ - larger(max)/smaller(min)
+ - larger(max)/smaller(min)
+ - larger(max)/smaller(min)

jcranmer-intel wrote:

The current text still isn't covering the interaction of sNaN with LLVM's 
NaN-handling policy.

It's also not entirely accurate to say that invalid exceptions are raised when 
FP exceptions are unspecified without constrained intrinsics.

https://github.com/llvm/llvm-project/pull/93841


[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-06-10 Thread Joshua Cranmer via cfe-commits


@@ -15883,6 +15883,95 @@ The returned value is completely identical to the 
input except for the sign bit;
 in particular, if the input is a NaN, then the quiet/signaling bit and payload
 are perfectly preserved.
 
+.. _i_fminmax_family:
+
+'``llvm.min.*``' Intrinsics Comparation
+^^^
+
+Standard:
+"
+
+IEEE754 and ISO C define some min/max operations, and they have some 
differences
+on working with qNaN/sNaN and +0.0/-0.0. Here is the list:
+
+.. list-table::
+   :header-rows: 2
+
+   * - ``ISO C``
+ - fmin/fmax
+ - none
+ - fmininum/fmaximum
+ - fminimum_num/fmaximum_num
+
+   * - ``IEEE754``
+ - none
+ - nimNUM/maxNUM (2008)
+ - minimum/maximum (2019)
+ - minimumNumber/maximumNumber (2019)
+
+   * - ``+0.0 vs -0.0``
+ - either one
+ - +0.0 > -0.0
+ - +0.0 > -0.0
+ - +0.0 > -0.0
+
+   * - ``NUM/qNaN vs sNaN``
+ - qNaN, invalid exception
+ - qNaN, invalid exception
+ - qNaN, invalid exception
+ - NUM/qNaN, invalid exception
+
+   * - ``NUM/qNaN vs qNaN``
+ - NUM/qNaN, no excpetion

jcranmer-intel wrote:

"exception" is misspelled in this column.

https://github.com/llvm/llvm-project/pull/93841


[clang] [clang] Report erroneous floating point results in _Complex math (PR #90588)

2024-06-07 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel approved this pull request.


https://github.com/llvm/llvm-project/pull/90588


[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-05-31 Thread Joshua Cranmer via cfe-commits


@@ -1723,6 +1723,18 @@ class MachineIRBuilder {
 return buildInstr(TargetOpcode::G_FMAXNUM_IEEE, {Dst}, {Src0, Src1}, 
Flags);
   }
 
+  MachineInstrBuilder
+  buildFMinimumNUM(const DstOp &Dst, const SrcOp &Src0, const SrcOp &Src1,

jcranmer-intel wrote:

How about `FMinimumnum` instead?

https://github.com/llvm/llvm-project/pull/93841


[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-05-31 Thread Joshua Cranmer via cfe-commits


@@ -9130,6 +9142,15 @@ void SelectionDAGBuilder::visitCall(const CallInst &I) {
 if (visitBinaryFloatCall(I, ISD::FMAXNUM))
   return;
 break;
+  case LibFunc_fminimum_num:
+  case LibFunc_fminimum_numf:
+if (visitBinaryFloatCall(I, ISD::FMINIMUMNUM))
+  return;
+break;
+  case LibFunc_fmaximum_num:

jcranmer-intel wrote:

Methinks there's quite a few missing case statements here.

https://github.com/llvm/llvm-project/pull/93841


[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-05-31 Thread Joshua Cranmer via cfe-commits


@@ -631,6 +631,46 @@ TEST(APFloatTest, Maximum) {
   EXPECT_TRUE(std::isnan(maximum(nan, f1).convertToDouble()));
 }
 
+TEST(APFloatTest, MinimumNumber) {
+  APFloat f1(1.0);
+  APFloat f2(2.0);
+  APFloat zp(0.0);
+  APFloat zn(-0.0);
+  APFloat nan = APFloat::getNaN(APFloat::IEEEdouble());
+  APFloat snan = APFloat::getSNaN(APFloat::IEEEdouble());
+
+  EXPECT_EQ(1.0, minimumnum(f1, f2).convertToDouble());
+  EXPECT_EQ(1.0, minimumnum(f2, f1).convertToDouble());
+  EXPECT_EQ(-0.0, minimumnum(zp, zn).convertToDouble());
+  EXPECT_EQ(-0.0, minimumnum(zn, zp).convertToDouble());

jcranmer-intel wrote:

I don't think this is testing what you want to test, as `0.0 == -0.0`
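
A small sketch of why the equality check cannot catch a wrong sign, and what a sign-aware check might look like. The helper names are hypothetical and plain `double` stands in for `APFloat`:

```cpp
#include <cmath>

// operator== treats +0.0 and -0.0 as equal, so an equality-based expectation
// passes no matter which zero was produced.
bool zeros_compare_equal() { return 0.0 == -0.0; }

// A sign-aware check: the value has to be zero and have its sign bit set.
bool is_negative_zero(double v) { return v == 0.0 && std::signbit(v); }
```

A test built on something like `is_negative_zero` fails when `+0.0` comes back, which the `EXPECT_EQ(-0.0, ...)` form does not.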

https://github.com/llvm/llvm-project/pull/93841


[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-05-31 Thread Joshua Cranmer via cfe-commits


@@ -1275,6 +1283,14 @@ let IntrProperties = [IntrInaccessibleMemOnly, 
IntrWillReturn, IntrStrictFP] in
[ LLVMMatchType<0>,
  LLVMMatchType<0>,
  llvm_metadata_ty ]>;
+  def int_experimental_constrained_maximumnum : DefaultAttrsIntrinsic<[ 
llvm_anyfloat_ty ],
+   [ LLVMMatchType<0>,
+ LLVMMatchType<0>,
+ llvm_metadata_ty ]>;
+  def int_experimental_constrained_minimumnum : DefaultAttrsIntrinsic<[ 
llvm_anyfloat_ty ],
+   [ LLVMMatchType<0>,
+ LLVMMatchType<0>,
+ llvm_metadata_ty ]>;

jcranmer-intel wrote:

You still need documentation for all these extra intrinsics.

https://github.com/llvm/llvm-project/pull/93841


[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-05-31 Thread Joshua Cranmer via cfe-commits


@@ -15868,6 +15868,51 @@ The returned value is completely identical to the 
input except for the sign bit;
 in particular, if the input is a NaN, then the quiet/signaling bit and payload
 are perfectly preserved.
 
+.. _i_fminmax_family:
+
+'``llvm.min.*``' Intrinsics Comparation
+^^^
+
+.. list-table::
+   :header-rows: 1
+   :widths: 16 28 28 28
+
+   * - Operation
+ - minnum/maxnum
+ - minimum/maximum
+ - minimumnum/maximumnum
+
+   * - ``NUM vs qNaN``
+ - NUM, no exception
+ - qNaN, no exception
+ - qNaN, no exception

jcranmer-intel wrote:

This line is not correct. (`minimumnum(x, qNaN)` is `x`)
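
For reference, a rough sketch of the value semantics I have in mind for IEEE 754-2019 `minimumNumber`. It is written on plain `double`, ignores exception flags and NaN payloads, and `minimum_number` is a made-up name, not anything in the patch:

```cpp
#include <cmath>
#include <limits>

// Sketch of minimumNumber: a NaN operand is ignored when the other operand is
// a number, and -0.0 is ordered below +0.0.
double minimum_number(double x, double y) {
  if (std::isnan(x))
    return std::isnan(y) ? std::numeric_limits<double>::quiet_NaN() : y;
  if (std::isnan(y))
    return x;
  if (x == y)
    return std::signbit(x) ? x : y; // picks -0.0 when the zeros differ in sign
  return x < y ? x : y;
}
```

Only the both-NaN case yields a NaN; `minimum_number(x, qNaN)` is `x`.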

https://github.com/llvm/llvm-project/pull/93841


[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-05-31 Thread Joshua Cranmer via cfe-commits


@@ -15868,6 +15868,51 @@ The returned value is completely identical to the 
input except for the sign bit;
 in particular, if the input is a NaN, then the quiet/signaling bit and payload
 are perfectly preserved.
 
+.. _i_fminmax_family:
+
+'``llvm.min.*``' Intrinsics Comparation
+^^^
+
+.. list-table::
+   :header-rows: 1
+   :widths: 16 28 28 28
+
+   * - Operation
+ - minnum/maxnum
+ - minimum/maximum
+ - minimumnum/maximumnum
+
+   * - ``NUM vs qNaN``
+ - NUM, no exception
+ - qNaN, no exception
+ - qNaN, no exception
+
+   * - ``NUM vs sNaN``
+ - qNaN, invalid exception
+ - qNaN, invalid exception
+ - NUM, invalid exception
+
+   * - ``qNaN vs sNaN``
+ - qNaN, invalid exception
+ - qNaN, invalid exception
+ - qNaN, invalid exception
+
+   * - ``sNaN vs sNaN``
+ - qNaN, invalid exception
+ - qNaN, invalid exception
+ - qNaN, invalid exception
+
+   * - ``+0.0 vs -0.0``
+ - either one
+ - +0.0(max)/-0.0(min)
+ - +0.0(max)/-0.0(min)
+
+   * - ``NUM vs NUM``
+ - larger(max)/smaller(min)
+ - larger(max)/smaller(min)
+ - larger(max)/smaller(min)

jcranmer-intel wrote:

It feels like this section needs a callout on how the sNaN entries interact 
with our current NaN-handling policy.

https://github.com/llvm/llvm-project/pull/93841


[clang] [clang] Macro for constant rounding mode (PR #92699)

2024-05-27 Thread Joshua Cranmer via cfe-commits

jcranmer-intel wrote:

Overall, I'm not opposed to this patch.

This new macro should probably be mentioned somewhere in the clang user 
documentation.

> The way this requirement is formulated indicates that it could be implemented 
> using preprocessor facility. Such implementation would require a builtin 
> macro that is set in the region where pragma FENV_ROUND is in effect and 
> reflects constant rounding mode.

Has there been any discussion with gcc and libc implementations about how they 
plan to implement this requirement? I'm not entirely convinced this is the best 
approach, especially given that "`printf` and `scanf` families" is on the list 
of functions that need to be affected, and that's a decent amount of functions 
to have to wrap with round-mode-aware-variants.

> This change introduces macro ROUNDING_MODE, which is a string dependent on 
> the constant rounding mode

It expands to an identifier, not a string literal, right?
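
The distinction matters for how the macro can be used. A sketch with hypothetical stand-in macros (`MODE_AS_IDENT`, `MODE_AS_STRING`, and `convert_rtz` are invented here, not taken from the patch): an identifier can be token-pasted onto a function name, whereas a string literal can only be inspected as data.

```cpp
#include <cstring>

// Hypothetical stand-ins for the two possible expansions.
#define MODE_AS_IDENT rtz     // expands to an identifier token
#define MODE_AS_STRING "rtz"  // expands to a string literal

// Two-level paste so that the mode macro is expanded before ## is applied.
#define PASTE_(a, b) a##_##b
#define PASTE(a, b) PASTE_(a, b)

// Hypothetical rounding-mode-specific variant of some conversion routine.
inline int convert_rtz(double v) { return static_cast<int>(v); }

// Selects the variant at preprocessing time: expands to convert_rtz(v).
inline int convert_current(double v) {
  return PASTE(convert, MODE_AS_IDENT)(v);
}

// A string literal cannot be pasted into a name; it can only be compared.
inline bool mode_is_rtz() { return std::strcmp(MODE_AS_STRING, "rtz") == 0; }
```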

https://github.com/llvm/llvm-project/pull/92699


[clang] [clang] Report erroneous floating point results in _Complex math (PR #90588)

2024-05-23 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel commented:

After banging my head on the desk trying to figure out why the tests were 
giving such strange results, I realized the issue...

Could you please use `__builtin_complex` in the test to construct the complex 
infinities instead of `__builtin_infinity() * 1.0j`?
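
To show the kind of surprise involved, here is an analogous sketch using `std::complex` rather than `_Complex` and the builtins (the function names are made up). Multiplying infinity by an imaginary unit that carries a zero real part computes `inf * 0` for the real component, whereas building the value from its components keeps the real part at zero.

```cpp
#include <cmath>
#include <complex>
#include <limits>

// "Infinity times i" by multiplication: the real part is inf * 0.0.
std::complex<double> imaginary_inf_by_multiply() {
  const double inf = std::numeric_limits<double>::infinity();
  return inf * std::complex<double>(0.0, 1.0);
}

// The same intent, built component-wise: the real part stays exactly 0.0.
std::complex<double> imaginary_inf_by_parts() {
  const double inf = std::numeric_limits<double>::infinity();
  return std::complex<double>(0.0, inf);
}
```

The first form ends up with a NaN real part, which is not the complex infinity the test means to construct.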

https://github.com/llvm/llvm-project/pull/90588


[clang] Implementation of '#pragma STDC FENV_ROUND' (PR #89617)

2024-05-17 Thread Joshua Cranmer via cfe-commits


@@ -5980,6 +5987,64 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
,
   return Ret;
 }
 
+static bool endsWithRoundingModeSuffix(StringRef FuncName) {
+  size_t Underscore = FuncName.find_last_of("_");
+  if (Underscore == StringRef::npos || Underscore < 2)
+    return false;
+  StringRef Suffix = FuncName.substr(Underscore + 1);
+  static const StringRef RMSuffixes[] = {"rtz", "rte", "rtp", "rtn", "rhaz",
+                                         "rz",  "rn",  "ru",  "rd"};
+  for (auto RM : RMSuffixes) {
+    if (Suffix == RM)
+      return true;
+  }
+  return false;
+}
+
+bool CodeGenFunction::requiresDynamicRounding(const CGCallee &Callee) {
+  if (Callee.isOrdinary()) {
+    const Decl *CalleeDecl = Callee.getAbstractInfo().getCalleeDecl().getDecl();
+    if (const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(CalleeDecl)) {
+      IdentifierInfo *FuncNameII = FD->getDeclName().getAsIdentifierInfo();
+      if (FuncNameII) {
+        StringRef FuncName = FuncNameII->getName();
+        // If a reserved identifier ends with rounding mode suffix preceded by
+        // underscore, this function does not need the previous dynamic rounding
+        // mode to be set.
+        if (isReservedInAllContexts(
+                FuncNameII->isReserved(getContext().getLangOpts()))) {
+          if (endsWithRoundingModeSuffix(FuncName))
+            return false;
+        }
+      }
+    }
+  }
+  return true;
+}
+
+/// Sets dynamic rounding mode for the function called in the region where
+/// pragma FENV_ROUND is in effect.
+void CodeGenFunction::setRoundingModeForCall(const CGCallee &Callee) {
+  if (Target.hasStaticRounding() || Callee.isBuiltin() ||
+      !requiresDynamicRounding(Callee))
+    return;
+  if (!CurrentRoundingIsStatic || !DynamicRoundingMode)
+    return;
+  Builder.CreateCall(CGM.getIntrinsic(llvm::Intrinsic::set_rounding),
+                     DynamicRoundingMode);
+  CurrentRoundingIsStatic = false;

jcranmer-intel wrote:

I'm not seeing logic here to effect the proper handling of, e.g., `sqrt`. Is 
this planned in a future patch? If so, add in a TODO note.

https://github.com/llvm/llvm-project/pull/89617


[clang] Implementation of '#pragma STDC FENV_ROUND' (PR #89617)

2024-05-17 Thread Joshua Cranmer via cfe-commits


@@ -5980,6 +5987,64 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
,
   return Ret;
 }
 
+static bool endsWithRoundingModeSuffix(StringRef FuncName) {
+  size_t Underscore = FuncName.find_last_of("_");
+  if (Underscore == StringRef::npos || Underscore < 2)
+    return false;
+  StringRef Suffix = FuncName.substr(Underscore + 1);
+  static const StringRef RMSuffixes[] = {"rtz", "rte", "rtp", "rtn", "rhaz",
+                                         "rz",  "rn",  "ru",  "rd"};
+  for (auto RM : RMSuffixes) {
+    if (Suffix == RM)
+      return true;
+  }
+  return false;
+}
+
+bool CodeGenFunction::requiresDynamicRounding(const CGCallee &Callee) {
+  if (Callee.isOrdinary()) {
+    const Decl *CalleeDecl = Callee.getAbstractInfo().getCalleeDecl().getDecl();
+    if (const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(CalleeDecl)) {
+      IdentifierInfo *FuncNameII = FD->getDeclName().getAsIdentifierInfo();
+      if (FuncNameII) {
+        StringRef FuncName = FuncNameII->getName();
+        // If a reserved identifier ends with rounding mode suffix preceded by
+        // underscore, this function does not need the previous dynamic rounding
+        // mode to be set.

jcranmer-intel wrote:

Where is this rule coming from?

https://github.com/llvm/llvm-project/pull/89617


[clang] Implementation of '#pragma STDC FENV_ROUND' (PR #89617)

2024-05-17 Thread Joshua Cranmer via cfe-commits


@@ -1232,6 +1232,14 @@ class TargetInfo : public TransferrableTargetInfo,
 return true;
   }
 
+  /// Returns true, if an operations that depends on rounding mode can be
+  /// implemented without changing FP environment. In this case the rounding
+  /// mode is encoded in the bits of implementing instruction.

jcranmer-intel wrote:

I'm not sure we actually have any (non-target-specific) IR constructs at the 
moment that actually implement static rounding mode. The documentation for 
constrained intrinsics says:

> For values other than “round.dynamic” optimization passes may assume that the 
> actual runtime rounding mode (as defined in a target-specific manner) matches 
> the specified rounding mode, but this is not guaranteed.

It's also the case that static rounding mode may be a less-than-global 
decision. X86 AVX512/AVX10 has static rounding mode, which is a subtarget 
consideration, but even then, it's not entirely clear that they would be 
absolutely preferred.

https://github.com/llvm/llvm-project/pull/89617


[clang] Implementation of '#pragma STDC FENV_ROUND' (PR #89617)

2024-05-17 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel commented:

Sorry for just thinking of this now, but we should also have tests for some of 
the builtins like `__builtin_fma` or `__builtin_sqrt`.

https://github.com/llvm/llvm-project/pull/89617
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Implementation of '#pragma STDC FENV_ROUND' (PR #89617)

2024-05-17 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel edited 
https://github.com/llvm/llvm-project/pull/89617


[clang] [clang] Use constant rounding mode for floating literals (PR #90877)

2024-05-10 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel commented:

I'm generally happy with the testing and semantics at this point.

https://github.com/llvm/llvm-project/pull/90877


[clang] [clang] Use constant rounding mode for floating literals (PR #90877)

2024-05-03 Thread Joshua Cranmer via cfe-commits


@@ -79,3 +79,16 @@ float V7 = []() -> float {
   0x0.01p0F);
 }();
 // CHECK: @V7 = {{.*}} float 1.00e+00
+
+template struct L {
+  constexpr L() : value(V) {}
+  float value;
+};
+
+#pragma STDC FENV_ROUND FE_DOWNWARD

jcranmer-intel wrote:

The interaction of these pragmas with C++ is underspecified, although for the 
most part, you can make pretty reasonable expectations about what the results 
should be. I'm also planning on bringing a paper to C++ to clarify the 
semantics somewhat, with the intent mostly being "just don't allow the pragma 
in most places."

That said, templates do add in all sorts of fun extra complexity, and there are 
at least three scenarios worth testing (with regards to interpretation of 
constants within the template body):
* Implicit template instantiation (you already test this)
* Explicit template instantiation
* Template specialization

I can definitely agree that implicit template instantiation should inherit from 
the rounding mode of the definition. Template specialization probably 
*shouldn't* inherit from the original template definition. Explicit 
instantiation... I'm not sure? I can argue that one either way.
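
For concreteness, the three scenarios could be laid out roughly as below. This is only a sketch: `Lit` is a made-up template, and the pragma lines are left as comments because support for `FENV_ROUND` varies between compilers; they just mark where a mode change would sit relative to each construct.

```cpp
// #pragma STDC FENV_ROUND FE_DOWNWARD   (mode at the primary definition)
template <int Tag> struct Lit {
  static constexpr float value() { return 0.1f; }
};

// #pragma STDC FENV_ROUND FE_UPWARD     (mode at the later points below)

// 1. Implicit instantiation: triggered by use, body comes from the template.
inline float implicit_use() { return Lit<0>::value(); }

// 2. Explicit instantiation: requested here, body still from the template.
template struct Lit<1>;

// 3. Explicit specialization: a separate body with its own literal.
template <> struct Lit<2> {
  static constexpr float value() { return 0.1f; }
};
```

With the pragmas commented out all three produce the same constant; the open question is which of them would pick up the mode of the definition and which the mode of the later point.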

https://github.com/llvm/llvm-project/pull/90877


[clang] [clang] Use constant rounding mode for floating literals (PR #90877)

2024-05-03 Thread Joshua Cranmer via cfe-commits

jcranmer-intel wrote:

I've been doing some testing, and I do want to confirm my understanding of the 
C standard here.

From what I can tell, macro expansion (phase 4) happens before constants are 
parsed (phase 7). As a result, if you have code like this:
```c
#define CONSTANT 0.1f
```
the interpretation of `0.1f` depends on the state of the pragma at point of use 
of the macro, not point of declaration of the macro. Given that pragmas can be 
defined with `_Pragma`, with lambdas or statement expressions (albeit neither 
of which is standard C), it should be possible to create a macro that evaluates 
a constant with a given rounding mode:
```c++
#define rendevous(x) _Pragma(#x)
#define CONSTANT(RM, x) ([](){ rendevous(STDC FENV_ROUND RM); return x; }())
```

This does seem to indeed be the behavior of the current implementation, but I 
would like to see some tests in the test code confirming that it's the macro 
use, not the `#define` that determines the interpretation of the constant, and 
another test for the `_Pragma` case.

https://github.com/llvm/llvm-project/pull/90877


[clang] Fix -fno-unsafe-math-optimizations behavior (PR #89473)

2024-04-30 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel approved this pull request.


https://github.com/llvm/llvm-project/pull/89473


[clang] Implementation of '#pragma STDC FENV_ROUND' (PR #89617)

2024-04-26 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel requested changes to this pull request.

I haven't fully tested the changes yet, so right now I'm looking at the test to 
figure out how much is supported. Nevertheless, I can already tell that this is 
not complete support.

7.6.2p4 does clearly state that floating constants need to be evaluated 
according to the standard rounding mode, so that the constant `0.1` evaluates 
to a different value in `FE_DOWNWARD` versus `FE_UPWARD`.
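
As a rough illustration of why the two modes must give different constants, the sketch below emulates the downward and upward roundings of `0.1` to `float` without touching the FP environment. The helper names are made up, and the `double` literal `0.1` stands in for the exact decimal value, which is an approximation in itself.

```cpp
#include <cmath>

// Largest float that is not above d (d assumed finite and within float range).
float round_down_to_float(double d) {
  float f = static_cast<float>(d); // rounded per the current mode
  return static_cast<double>(f) > d ? std::nextafter(f, -INFINITY) : f;
}

// Smallest float that is not below d (same assumptions as above).
float round_up_to_float(double d) {
  float f = static_cast<float>(d);
  return static_cast<double>(f) < d ? std::nextafter(f, INFINITY) : f;
}
```

Since `0.1` is not exactly representable, the two results are adjacent floats that differ by one ULP, so a translator that ignores the pragma for constants gets one of the two cases wrong.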

I'm not seeing from the tests how the code is handling calls to functions. 
Calls to all functions outside of a finite list (see same paragraph) need to 
restore the saved dynamic rounding mode for the duration of the call.

I'd like to see tests covering casts and conversions better.

https://github.com/llvm/llvm-project/pull/89617


[clang] Clean up denormal handling with -ffp-model, -ffast-math, etc. (PR #89477)

2024-04-26 Thread Joshua Cranmer via cfe-commits

jcranmer-intel wrote:

> The "etc." is eliding -fno-fast-math, -funsafe-math-optimizations, and 
> -fno-unsafe-math-optimizations

Maybe "fast-math-ish flags" is a good summary of the lot?

https://github.com/llvm/llvm-project/pull/89477


[clang] Clean up denormal handling with -ffp-model, -ffast-math, etc. (PR #89477)

2024-04-26 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel commented:

This may need some release notes adjustments as well; I already have a todo to 
revisit the release notes around release time to make sure we get the summary 
of the denormal handling flag changes right.

https://github.com/llvm/llvm-project/pull/89477


[clang] Disable FTZ/DAZ when compiling shared libraries by default. (PR #80475)

2024-04-25 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel closed 
https://github.com/llvm/llvm-project/pull/80475


[clang] Disable FTZ/DAZ when compiling shared libraries by default. (PR #80475)

2024-04-25 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel updated 
https://github.com/llvm/llvm-project/pull/80475

From 971cc613e994a308f939f68247257b65e04c74fa Mon Sep 17 00:00:00 2001
From: Joshua Cranmer 
Date: Fri, 2 Feb 2024 10:35:29 -0800
Subject: [PATCH 1/5] Disable FTZ/DAZ when compiling shared libraries by
 default.

This fixes https://github.com/llvm/llvm-project/issues/57589, and aligns
Clang with the behavior of current versions of gcc. There is a new option,
-mdaz-ftz, to control the linking of the file that sets FTZ/DAZ on startup, and
this flag is on by default if -ffast-math is present and -shared isn't.
---
 clang/docs/ReleaseNotes.rst   |  6 +
 clang/docs/UsersManual.rst| 15 
 clang/include/clang/Driver/Options.td |  5 
 clang/lib/Driver/ToolChain.cpp| 15 ++--
 clang/test/Driver/linux-ld.c  | 34 +++
 5 files changed, 68 insertions(+), 7 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index d1f7293a842bb6..8fd41d07c57262 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -207,6 +207,12 @@ Non-comprehensive list of changes in this release
 - ``__typeof_unqual__`` is available in all C modes as an extension, which 
behaves
   like ``typeof_unqual`` from C23, similar to ``__typeof__`` and ``typeof``.
 
+
+* Code compiled with ``-shared`` and ``-ffast-math`` will no longer enable
+  flush-to-zero floating-point mode by default. This decision can be overridden
+  with use of ``-mdaz-ftz``. This behavior now matches GCC's behavior.
+  (`#57589 <https://github.com/llvm/llvm-project/issues/57589>`_)
+
 New Compiler Flags
 --
 - ``-fsanitize=implicit-bitfield-conversion`` checks implicit truncation and
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index 8df40566fcba3d..d0326f01d251e0 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -1506,7 +1506,8 @@ floating point semantic models: precise (the default), 
strict, and fast.
 
* ``-ffp-contract=fast``
 
-   Note: ``-ffast-math`` causes ``crtfastmath.o`` to be linked with code. See
+   Note: ``-ffast-math`` causes ``crtfastmath.o`` to be linked with code unless
+   ``-shared`` or ``-mno-daz-ftz`` is present. See
:ref:`crtfastmath.o` for more details.
 
 .. option:: -fno-fast-math
@@ -1560,7 +1561,8 @@ floating point semantic models: precise (the default), 
strict, and fast.
  ``-ffp-contract``.
 
Note: ``-fno-fast-math`` implies ``-fdenormal-fp-math=ieee``.
-   ``-fno-fast-math`` causes ``crtfastmath.o`` to not be linked with code.
+   ``-fno-fast-math`` causes ``crtfastmath.o`` to not be linked with code
+   unless ``-mdaz-ftz`` is present.
 
 .. option:: -fdenormal-fp-math=
 
@@ -1938,10 +1940,13 @@ by using ``#pragma STDC FENV_ROUND`` with a value other 
than ``FE_DYNAMIC``.
 
 A note about ``crtfastmath.o``
 ^^
-``-ffast-math`` and ``-funsafe-math-optimizations`` cause ``crtfastmath.o`` to 
be
-automatically linked,  which adds a static constructor that sets the FTZ/DAZ
+``-ffast-math`` and ``-funsafe-math-optimizations`` without the ``-shared``
+option cause ``crtfastmath.o`` to be
+automatically linked, which adds a static constructor that sets the FTZ/DAZ
 bits in MXCSR, affecting not only the current compilation unit but all static
-and shared libraries included in the program.
+and shared libraries included in the program. This decision can be overridden
+by using either the flag ``-mdaz-ftz`` or ``-mno-daz-ftz`` to respectively
+link or not link ``crtfastmath.o``.
 
 .. _FLT_EVAL_METHOD:
 
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 922bda721dc780..f59b6962daf261 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -2615,6 +2615,11 @@ defm protect_parens : BoolFOption<"protect-parens",
   "floating-point expressions are evaluated">,
   NegFlag>;
 
+defm daz_ftz : SimpleMFlag<"daz-ftz",
+  "Globally set", "Do not globally set",
+  " the denormals-are-zero (DAZ) and flush-to-zero (FTZ) bits in the "
+  "floating-point control register on program startup.">;
+
 def ffor_scope : Flag<["-"], "ffor-scope">, Group;
 def fno_for_scope : Flag<["-"], "fno-for-scope">, Group;
 
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 237092ed07e5dc..a3979e91589e88 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -1307,9 +1307,14 @@ void ToolChain::AddCCKextLibArgs(const ArgList ,
 
 bool ToolChain::isFastMathRuntimeAvailable(const ArgList ,
std::string ) const {
+  // Don't implicitly link in mode-changing libraries in a shared library, 
since
+  // this can have very deleterious effects. See the various links from
+  // https://github.com/llvm/llvm-project/issues/57589 for more information.
+  bool Default 

[clang] Implementation of '#pragma STDC FENV_ROUND' (PR #89617)

2024-04-24 Thread Joshua Cranmer via cfe-commits


@@ -0,0 +1,160 @@
+// RUN: %clang_cc1 -S -triple x86_64-linux-gnu -emit-llvm %s -o - | \
+// RUN:   FileCheck %s --implicit-check-not "call void @llvm.set.rounding" --implicit-check-not "call i32 @llvm.get.rounding"
+
+float func_rz_ru(float w, float x, float y, float z) {
+  #pragma STDC FENV_ROUND FE_TOWARDZERO
+  float result = x * y;
+  {
+    #pragma STDC FENV_ROUND FE_UPWARD
+    result += z;
+  }
+  return result - w;
+}
+
+// CHECK-LABEL: @func_rz_ru
+// CHECK:  call void @llvm.set.rounding(i32 0)
+// CHECK:  call float @llvm.experimental.constrained.fmul.f32({{.*}}, metadata !"round.towardzero", metadata !"fpexcept.ignore")
+// CHECK:  call void @llvm.set.rounding(i32 2)
+// CHECK:  call float @llvm.experimental.constrained.fadd.f32({{.*}}, metadata !"round.upward", metadata !"fpexcept.ignore")
+// CHECK:  call void @llvm.set.rounding(i32 0)
+// CHECK:  call float @llvm.experimental.constrained.fsub.f32({{.*}}, metadata !"round.towardzero", metadata !"fpexcept.ignore")
+// CHECK:  call void @llvm.set.rounding(i32 1)
+
+
+float func_rz_rz(float w, float x, float y, float z) {
+  #pragma STDC FENV_ROUND FE_TOWARDZERO
+  float result = x * y;
+  {
+    #pragma STDC FENV_ROUND FE_TOWARDZERO
+    result += z;
+  }
+  return result - w;
+}
+
+// CHECK-LABEL: @func_rz_rz
+// CHECK:  call void @llvm.set.rounding(i32 0)
+// CHECK:  call float @llvm.experimental.constrained.fmul.f32({{.*}}, metadata !"round.towardzero", metadata !"fpexcept.ignore")
+// CHECK:  call float @llvm.experimental.constrained.fadd.f32({{.*}}, metadata !"round.towardzero", metadata !"fpexcept.ignore")
+// CHECK:  call float @llvm.experimental.constrained.fsub.f32({{.*}}, metadata !"round.towardzero", metadata !"fpexcept.ignore")
+// CHECK:  call void @llvm.set.rounding(i32 1)
+
+float func_rne_rne(float w, float x, float y, float z) {
+  #pragma STDC FENV_ROUND FE_TONEAREST
+  float result = x * y;
+  {
+    #pragma STDC FENV_ROUND FE_TONEAREST
+    result += z;
+  }
+  return result - w;
+}
+
+// CHECK-LABEL: @func_rne_rne
+// CHECK:  fmul
+// CHECK:  fadd
+// CHECK:  fsub
+
+float func_rz_dyn_noacc(float w, float x, float y, float z) {
+  #pragma STDC FENV_ROUND FE_TOWARDZERO
+  float result = x * y;
+  {
+    #pragma STDC FENV_ROUND FE_DYNAMIC
+    result += z;
+  }
+  return result - w;
+}
+
+// CHECK-LABEL: @func_rz_dyn_noacc
+// CHECK:  call void @llvm.set.rounding(i32 0)
+// CHECK:  call float @llvm.experimental.constrained.fmul.f32({{.*}}, metadata !"round.towardzero", metadata !"fpexcept.ignore")
+// CHECK:  call void @llvm.set.rounding(i32 1)
+// CHECK:  call float @llvm.experimental.constrained.fadd.f32({{.*}}, metadata !"round.tonearest", metadata !"fpexcept.ignore")

jcranmer-intel wrote:

7.6.2p3:
> If the FE_DYNAMIC mode is specified and FENV_ACCESS is "off", the translator 
> may assume that the default rounding mode is in effect.

https://github.com/llvm/llvm-project/pull/89617


[clang] Disable FTZ/DAZ when compiling shared libraries by default. (PR #80475)

2024-04-24 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel updated 
https://github.com/llvm/llvm-project/pull/80475

>From 971cc613e994a308f939f68247257b65e04c74fa Mon Sep 17 00:00:00 2001
From: Joshua Cranmer 
Date: Fri, 2 Feb 2024 10:35:29 -0800
Subject: [PATCH 1/4] Disable FTZ/DAZ when compiling shared libraries by
 default.

This fixes https://github.com/llvm/llvm-project/issues/57589, and aligns
Clang with the behavior of current versions of gcc. There is a new option,
-mdaz-ftz, to control the linking of the file that sets FTZ/DAZ on startup, and
this flag is on by default if -ffast-math is present and -shared isn't.
---
 clang/docs/ReleaseNotes.rst   |  6 +
 clang/docs/UsersManual.rst| 15 
 clang/include/clang/Driver/Options.td |  5 
 clang/lib/Driver/ToolChain.cpp| 15 ++--
 clang/test/Driver/linux-ld.c  | 34 +++
 5 files changed, 68 insertions(+), 7 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index d1f7293a842bb6..8fd41d07c57262 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -207,6 +207,12 @@ Non-comprehensive list of changes in this release
 - ``__typeof_unqual__`` is available in all C modes as an extension, which 
behaves
   like ``typeof_unqual`` from C23, similar to ``__typeof__`` and ``typeof``.
 
+
+* Code compiled with ``-shared`` and ``-ffast-math`` will no longer enable
+  flush-to-zero floating-point mode by default. This decision can be overridden
+  with use of ``-mdaz-ftz``. This behavior now matches GCC's behavior.
+  (`#57589 <https://github.com/llvm/llvm-project/issues/57589>`_)
+
 New Compiler Flags
 --
 - ``-fsanitize=implicit-bitfield-conversion`` checks implicit truncation and
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index 8df40566fcba3d..d0326f01d251e0 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -1506,7 +1506,8 @@ floating point semantic models: precise (the default), 
strict, and fast.
 
* ``-ffp-contract=fast``
 
-   Note: ``-ffast-math`` causes ``crtfastmath.o`` to be linked with code. See
+   Note: ``-ffast-math`` causes ``crtfastmath.o`` to be linked with code unless
+   ``-shared`` or ``-mno-daz-ftz`` is present. See
:ref:`crtfastmath.o` for more details.
 
 .. option:: -fno-fast-math
@@ -1560,7 +1561,8 @@ floating point semantic models: precise (the default), 
strict, and fast.
  ``-ffp-contract``.
 
Note: ``-fno-fast-math`` implies ``-fdenormal-fp-math=ieee``.
-   ``-fno-fast-math`` causes ``crtfastmath.o`` to not be linked with code.
+   ``-fno-fast-math`` causes ``crtfastmath.o`` to not be linked with code
+   unless ``-mdaz-ftz`` is present.
 
 .. option:: -fdenormal-fp-math=
 
@@ -1938,10 +1940,13 @@ by using ``#pragma STDC FENV_ROUND`` with a value other 
than ``FE_DYNAMIC``.
 
 A note about ``crtfastmath.o``
 ^^
-``-ffast-math`` and ``-funsafe-math-optimizations`` cause ``crtfastmath.o`` to 
be
-automatically linked,  which adds a static constructor that sets the FTZ/DAZ
+``-ffast-math`` and ``-funsafe-math-optimizations`` without the ``-shared``
+option cause ``crtfastmath.o`` to be
+automatically linked, which adds a static constructor that sets the FTZ/DAZ
 bits in MXCSR, affecting not only the current compilation unit but all static
-and shared libraries included in the program.
+and shared libraries included in the program. This decision can be overridden
+by using either the flag ``-mdaz-ftz`` or ``-mno-daz-ftz`` to respectively
+link or not link ``crtfastmath.o``.
 
 .. _FLT_EVAL_METHOD:
 
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 922bda721dc780..f59b6962daf261 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -2615,6 +2615,11 @@ defm protect_parens : BoolFOption<"protect-parens",
   "floating-point expressions are evaluated">,
   NegFlag>;
 
+defm daz_ftz : SimpleMFlag<"daz-ftz",
+  "Globally set", "Do not globally set",
+  " the denormals-are-zero (DAZ) and flush-to-zero (FTZ) bits in the "
+  "floating-point control register on program startup.">;
+
 def ffor_scope : Flag<["-"], "ffor-scope">, Group;
 def fno_for_scope : Flag<["-"], "fno-for-scope">, Group;
 
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 237092ed07e5dc..a3979e91589e88 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -1307,9 +1307,14 @@ void ToolChain::AddCCKextLibArgs(const ArgList ,
 
 bool ToolChain::isFastMathRuntimeAvailable(const ArgList ,
std::string ) const {
+  // Don't implicitly link in mode-changing libraries in a shared library, 
since
+  // this can have very deleterious effects. See the various links from
+  // https://github.com/llvm/llvm-project/issues/57589 for more information.
+  bool Default 

[clang] Disable FTZ/DAZ when compiling shared libraries by default. (PR #80475)

2024-04-24 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel edited 
https://github.com/llvm/llvm-project/pull/80475


[clang] Disable FTZ/DAZ when compiling shared libraries by default. (PR #80475)

2024-04-24 Thread Joshua Cranmer via cfe-commits


@@ -842,25 +842,6 @@ void Linux::addProfileRTLibs(const llvm::opt::ArgList 
,
   ToolChain::addProfileRTLibs(Args, CmdArgs);
 }
 
-llvm::DenormalMode
-Linux::getDefaultDenormalModeForType(const llvm::opt::ArgList ,
- const JobAction ,
- const llvm::fltSemantics *FPType) const {
-  switch (getTriple().getArch()) {

jcranmer-intel wrote:

Probably. But we can't reliably tell from the individual file compile commands 
if we're going to be linked as a shared library or not, which is why I'm 
dropping this attempt to set it at all in the first place.

https://github.com/llvm/llvm-project/pull/80475


[clang] Align -ffp-model=fast denormal handling with -ffast-math (PR #89477)

2024-04-23 Thread Joshua Cranmer via cfe-commits

jcranmer-intel wrote:

> I'm not sure what the correct behavior is across all platforms. My 
> perspective on this is heavily X86-biased, so I'd really like some guidance 
> from people who work on other targets about their expectations. I think I saw 
> at least one target that sets denormal-fp-math=preservesign as their default 
> even without any fast-math options enabled. It wouldn't surprise me if there 
> are targets that don't want to set denormal-fp-math=preservesign even with 
> fast-math.

Some of the GPU targets, IIRC, want daz/ftz by default. Not all targets have 
DAZ/FTZ bits that can be set; I think RISC-V is in this category, although to 
be honest, trying to track down all the ISA extensions to make sure is a bit 
beyond my ken.

I kinda do long for the day when daz/ftz can be consigned to the same sort of 
dustbin of history as, say, EBCDIC.

> For some(?) Linux targets, we've got the mess I'm touching here where the 
> driver is looking for crtfastmath.o and trying to inspect the command-line 
> for fast-math settings outside the other FP option rendering, and then the FP 
> option rendering, which is meant to be target-independent, is kind of making 
> a mess of it, especially with inconsistent handling of "denormal-fp-math" and 
> "denormal-fp32-math".

The issue here I think is that of the Linux targets, only the x86 people cared 
enough to get the denormal-fp-math set correctly, but the logic actually 
applies across targets. It's a real mess since some targets flush to positive 
zero, some flush to signed zero, and for some I couldn't make heads or tails 
of what the manuals say they actually do.
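
To make the two flushing conventions concrete, here is a minimal C sketch that models them in software. The helper names are hypothetical, and this is not how any target or `crtfastmath.o` implements flushing; it only illustrates what "flush to signed zero" and "flush to positive zero" mean for a negative subnormal input, assuming IEEE 754 `float` and a build without fast-math flags. The names `preserve-sign` and `positive-zero` in the comments are the `-fdenormal-fp-math` values that describe these behaviors.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Hypothetical model of "preserve-sign": subnormals become a zero that
   keeps the sign of the input. */
float flush_preserve_sign(float x) {
  if (fpclassify(x) == FP_SUBNORMAL)
    return signbit(x) ? -0.0f : 0.0f;
  return x;
}

/* Hypothetical model of "positive-zero": subnormals become +0.0
   regardless of the sign of the input. */
float flush_positive_zero(float x) {
  if (fpclassify(x) == FP_SUBNORMAL)
    return 0.0f;
  return x;
}

/* Self-check on a negative subnormal and on a normal value. */
void check_flush_models(void) {
  assert(fpclassify(-FLT_MIN / 4.0f) == FP_SUBNORMAL);
  assert(signbit(flush_preserve_sign(-FLT_MIN / 4.0f)));
  assert(!signbit(flush_positive_zero(-FLT_MIN / 4.0f)));
  assert(flush_preserve_sign(1.5f) == 1.5f);
}
```

The two models only disagree on negative subnormals, which is exactly the case a single target-independent default cannot describe.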

https://github.com/llvm/llvm-project/pull/89477


[clang] Disable FTZ/DAZ when compiling shared libraries by default. (PR #80475)

2024-04-23 Thread Joshua Cranmer via cfe-commits

jcranmer-intel wrote:

Ping for review

https://github.com/llvm/llvm-project/pull/80475


[clang] Disable FTZ/DAZ when compiling shared libraries by default. (PR #80475)

2024-04-23 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel updated 
https://github.com/llvm/llvm-project/pull/80475

>From 971cc613e994a308f939f68247257b65e04c74fa Mon Sep 17 00:00:00 2001
From: Joshua Cranmer 
Date: Fri, 2 Feb 2024 10:35:29 -0800
Subject: [PATCH 1/3] Disable FTZ/DAZ when compiling shared libraries by
 default.

This fixes https://github.com/llvm/llvm-project/issues/57589, and aligns
Clang with the behavior of current versions of gcc. There is a new option,
-mdaz-ftz, to control the linking of the file that sets FTZ/DAZ on startup, and
this flag is on by default if -ffast-math is present and -shared isn't.
---
 clang/docs/ReleaseNotes.rst   |  6 +
 clang/docs/UsersManual.rst| 15 
 clang/include/clang/Driver/Options.td |  5 
 clang/lib/Driver/ToolChain.cpp| 15 ++--
 clang/test/Driver/linux-ld.c  | 34 +++
 5 files changed, 68 insertions(+), 7 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index d1f7293a842bb6..8fd41d07c57262 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -207,6 +207,12 @@ Non-comprehensive list of changes in this release
 - ``__typeof_unqual__`` is available in all C modes as an extension, which 
behaves
   like ``typeof_unqual`` from C23, similar to ``__typeof__`` and ``typeof``.
 
+
+* Code compiled with ``-shared`` and ``-ffast-math`` will no longer enable
+  flush-to-zero floating-point mode by default. This decision can be overridden
+  with use of ``-mdaz-ftz``. This behavior now matches GCC's behavior.
+  (`#57589 <https://github.com/llvm/llvm-project/issues/57589>`_)
+
 New Compiler Flags
 --
 - ``-fsanitize=implicit-bitfield-conversion`` checks implicit truncation and
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index 8df40566fcba3d..d0326f01d251e0 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -1506,7 +1506,8 @@ floating point semantic models: precise (the default), 
strict, and fast.
 
* ``-ffp-contract=fast``
 
-   Note: ``-ffast-math`` causes ``crtfastmath.o`` to be linked with code. See
+   Note: ``-ffast-math`` causes ``crtfastmath.o`` to be linked with code unless
+   ``-shared`` or ``-mno-daz-ftz`` is present. See
:ref:`crtfastmath.o` for more details.
 
 .. option:: -fno-fast-math
@@ -1560,7 +1561,8 @@ floating point semantic models: precise (the default), 
strict, and fast.
  ``-ffp-contract``.
 
Note: ``-fno-fast-math`` implies ``-fdenormal-fp-math=ieee``.
-   ``-fno-fast-math`` causes ``crtfastmath.o`` to not be linked with code.
+   ``-fno-fast-math`` causes ``crtfastmath.o`` to not be linked with code
+   unless ``-mdaz-ftz`` is present.
 
 .. option:: -fdenormal-fp-math=
 
@@ -1938,10 +1940,13 @@ by using ``#pragma STDC FENV_ROUND`` with a value other 
than ``FE_DYNAMIC``.
 
 A note about ``crtfastmath.o``
 ^^
-``-ffast-math`` and ``-funsafe-math-optimizations`` cause ``crtfastmath.o`` to 
be
-automatically linked,  which adds a static constructor that sets the FTZ/DAZ
+``-ffast-math`` and ``-funsafe-math-optimizations`` without the ``-shared``
+option cause ``crtfastmath.o`` to be
+automatically linked, which adds a static constructor that sets the FTZ/DAZ
 bits in MXCSR, affecting not only the current compilation unit but all static
-and shared libraries included in the program.
+and shared libraries included in the program. This decision can be overridden
+by using either the flag ``-mdaz-ftz`` or ``-mno-daz-ftz`` to respectively
+link or not link ``crtfastmath.o``.
 
 .. _FLT_EVAL_METHOD:
 
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 922bda721dc780..f59b6962daf261 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -2615,6 +2615,11 @@ defm protect_parens : BoolFOption<"protect-parens",
   "floating-point expressions are evaluated">,
   NegFlag>;
 
+defm daz_ftz : SimpleMFlag<"daz-ftz",
+  "Globally set", "Do not globally set",
+  " the denormals-are-zero (DAZ) and flush-to-zero (FTZ) bits in the "
+  "floating-point control register on program startup.">;
+
 def ffor_scope : Flag<["-"], "ffor-scope">, Group;
 def fno_for_scope : Flag<["-"], "fno-for-scope">, Group;
 
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 237092ed07e5dc..a3979e91589e88 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -1307,9 +1307,14 @@ void ToolChain::AddCCKextLibArgs(const ArgList ,
 
 bool ToolChain::isFastMathRuntimeAvailable(const ArgList ,
std::string ) const {
+  // Don't implicitly link in mode-changing libraries in a shared library, 
since
+  // this can have very deleterious effects. See the various links from
+  // https://github.com/llvm/llvm-project/issues/57589 for more information.
+  bool Default 

[clang] [C++23] [CLANG] Adding C++23 constexpr math functions: fmin and frexp. (PR #88978)

2024-04-23 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel commented:

Other side notes:

fmin and frexp can signal exceptions if the input is an sNaN, which causes 
[library.c]p3 to kick in. (That's the only time these operations can signal an 
exception.)

https://github.com/llvm/llvm-project/pull/88978


[clang] [C++23] [CLANG] Adding C++23 constexpr math functions: fmin and frexp. (PR #88978)

2024-04-23 Thread Joshua Cranmer via cfe-commits


@@ -2922,7 +2922,7 @@ static bool handleFloatFloatBinOp(EvalInfo , const 
BinaryOperator *E,
   //   If during the evaluation of an expression, the result is not
   //   mathematically defined [...], the behavior is undefined.
   // FIXME: C++ rules require us to not conform to IEEE 754 here.
-  if (LHS.isNaN()) {
+  if (!Info.getLangOpts().CPlusPlus23 && LHS.isNaN()) {

jcranmer-intel wrote:

https://eel.is/c++draft/library.c#3 says that

> A call to a C standard library function is a non-constant library call 
> ([defns.nonconst.libcall]) if it raises a floating-point exception other than 
> FE_INEXACT.

which suggests that floating-point exceptions should generally cause things to 
fall out of constant expressions. C's Annex F suggests that warnings be emitted 
for constant expressions that cause floating-point exceptions other than 
FE_INEXACT (see F.8.2p2).

As for wording, [expr.pre]p4 requires that things be "mathematically defined", 
except IEEE 754 specifically says (in section 3.2)

> The mathematical structure underpinning the arithmetic in this standard is 
> the extended reals, that is, the set of real numbers together with positive 
> and negative infinity.

The C++ specification actually does define "mathematically defined" somewhere, 
in a footnote of https://eel.is/c++draft/sf.cmath.general:

> A mathematical function is mathematically defined for a given set of argument 
> values (a) if it is explicitly defined for that set of argument values, or 
> (b) if its limiting value exists and does not depend on the direction of 
> approach.

(which, as pedants will note, only defines it for functions, not the base 
operations of C++, and sidesteps the question of what the set of argument 
values actually is).

It does seem pretty clear from all the standards that a result of infinity is 
clearly part of the representable types, and it seems a reasonable inference 
that the result being infinity is "mathematically defined" (division by 0 is 
permissible in the extended real numbers).

The role of NaNs is... clear as mud. You have to go into the definition of 
 to find out that NaN is a representable value. IEEE 754 gives multiple 
specification levels for floating-point, where NaN is a part of some of them, 
and proceeds to ignore the fact that it did so in the rest of the 
specification; the definition of operations involving NaN is deferred into an 
entire separate section. Section 7.2 goes so far as to say that

> The invalid operation exception is signaled if and only if there is no 
> usefully definable result. In these cases the operands are invalid for the 
> operation to be performed.

Is `0.0 / 0.0` "mathematically defined" per IEEE 754? It's definitely 
*defined*, but I don't think I can entirely endorse saying that it's 
"mathematically defined." The rule of thumb that FE_INVALID is signaled when 
the inputs are non-qNaN but the output is NaN, combined with the above text, 
makes it suspect that it should be considered a "mathematical" definition. But 
this also doesn't help with `NaN / NaN`, which is `NaN` but doesn't signal 
FE_INVALID.
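
For reference, the cases being distinguished can be observed at run time with a small C sketch. It assumes IEEE 754 `double` and a build without fast-math flags; `runtime_div` is a hypothetical helper, and the sketch only inspects the resulting values, not the floating-point exception flags.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: divide at run time. The volatile locals keep the
   compiler from folding the division at compile time. */
double runtime_div(double num, double den) {
  volatile double n = num, d = den;
  return n / d;
}

/* Self-check of the cases discussed above (values only, no flag queries). */
void check_division_cases(void) {
  assert(isinf(runtime_div(1.0, 0.0))); /* finite / 0: an infinity */
  assert(isnan(runtime_div(0.0, 0.0))); /* 0 / 0: no usable result */
  assert(isnan(runtime_div(NAN, NAN))); /* quiet NaN in, NaN out */
}
```

The first case lands in the extended reals; the other two both produce NaN even though only `0.0 / 0.0` is an invalid operation on non-NaN inputs.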

As noted in the CWG issue Aaron linked to, someone needs to write a paper to 
clarify this, and... that's going to be me isn't it?

https://github.com/llvm/llvm-project/pull/88978


[clang] [C99] Claim conformance for _Complex support (PR #88161)

2024-04-12 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel approved this pull request.


https://github.com/llvm/llvm-project/pull/88161


[clang] [C99] Claim conformance for _Complex support (PR #88161)

2024-04-10 Thread Joshua Cranmer via cfe-commits


@@ -373,6 +355,10 @@ C99 implementation status
   Yes
 
 
+(2): Clang supports _Complex type specifiers 
but
+does not support _Imaginary type specifiers. Support for
+_Imaginary is optional in C99 which is why Clang is fully 
conforming.

jcranmer-intel wrote:

Maybe mention here something about not claiming Annex G conformance?

https://github.com/llvm/llvm-project/pull/88161


[clang] [C99] Claim conformance for _Complex support (PR #88161)

2024-04-10 Thread Joshua Cranmer via cfe-commits


@@ -0,0 +1,97 @@
+// RUN: %clang_cc1 -verify -std=c99 %s
+
+/* WG14 N620, N638, N657, N694, N809: Yes*
+ * Complex and imaginary support in 
+ *
+ * NB: Clang supports _Complex but not _Imaginary. In C99, _Complex support is
+ * required outside of freestanding, but _Imaginary support is fully optional.
+ * In C11, both are made fully optional. We claim full conformance because we
+ * are actually conforming, but this gets an asterisk because it's also only
+ * partially implemented in a way and users should know about that.
+ *
+ * Because the functionality is so intertwined between the various papers,
+ * we're testing all of the functionality in one file.
+ */
+
+// Demonstrate that we support spelling complex floating-point objects.
+float _Complex f1;
+_Complex float f2;
+
+double _Complex d1;
+_Complex double d2;
+
+long double _Complex ld1;
+_Complex long double ld2;
+
+// Show that we don't support spelling imaginary types.
+float _Imaginary fi1; // expected-error {{imaginary types are not supported}}
+_Imaginary float fi2; // expected-error {{imaginary types are not supported}}
+
+double _Imaginary di1; // expected-error {{imaginary types are not supported}}
+_Imaginary double di2; // expected-error {{imaginary types are not supported}}
+
+long double _Imaginary ldi1; // expected-error {{imaginary types are not 
supported}}
+_Imaginary long double ldi2; // expected-error {{imaginary types are not 
supported}}
+
+// Each complex type has the same representation and alignment as an array
+// containing two elements of the corresponding real type.
+_Static_assert(sizeof(float _Complex) == sizeof(struct { float mem[2]; }), "");
+_Static_assert(_Alignof(float _Complex) == _Alignof(struct { float mem[2]; }), 
"");

jcranmer-intel wrote:

I'm not entirely certain that there's a mandatory rule in C99 (or any other 
version of C) that requires that
`struct { float mem[2]; }` and `float mem[2];` are the same in layout 
(particularly alignment, see C99 6.7.2.1p12 and read it somewhat maliciously). 
Though I will concede that all non-malicious compilers will treat them as the 
same.
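
One way to sidestep the question, sketched here as a suggestion rather than as what the patch does, is to compare against the array type directly instead of a struct wrapping it. `_Static_assert` and `_Alignof` are C11 spellings, so in a `-std=c99` test this relies on the compiler accepting them as extensions; `complex_float_size` is a hypothetical helper added only so the property can also be queried at run time.

```c
#include <assert.h>
#include <stddef.h>

/* Compare each complex type with the two-element array type itself. */
_Static_assert(sizeof(float _Complex) == sizeof(float[2]), "");
_Static_assert(_Alignof(float _Complex) == _Alignof(float[2]), "");

_Static_assert(sizeof(double _Complex) == sizeof(double[2]), "");
_Static_assert(_Alignof(double _Complex) == _Alignof(double[2]), "");

_Static_assert(sizeof(long double _Complex) == sizeof(long double[2]), "");
_Static_assert(_Alignof(long double _Complex) == _Alignof(long double[2]), "");

/* Hypothetical helper exposing one of the sizes to a test driver. */
size_t complex_float_size(void) { return sizeof(float _Complex); }
```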

https://github.com/llvm/llvm-project/pull/88161


[clang] [C99] Claim conformance for _Complex support (PR #88161)

2024-04-10 Thread Joshua Cranmer via cfe-commits


@@ -0,0 +1,97 @@
+// RUN: %clang_cc1 -verify -std=c99 %s
+
+/* WG14 N620, N638, N657, N694, N809: Yes*
+ * Complex and imaginary support in 
+ *
+ * NB: Clang supports _Complex but not _Imaginary. In C99, _Complex support is
+ * required outside of freestanding, but _Imaginary support is fully optional.
+ * In C11, both are made fully optional. We claim full conformance because we
+ * are actually conforming, but this gets an asterisk because it's also only
+ * partially implemented in a way and users should know about that.
+ *
+ * Because the functionality is so intertwined between the various papers,
+ * we're testing all of the functionality in one file.
+ */
+
+// Demonstrate that we support spelling complex floating-point objects.
+float _Complex f1;
+_Complex float f2;
+
+double _Complex d1;
+_Complex double d2;
+
+long double _Complex ld1;
+_Complex long double ld2;
+
+// Show that we don't support spelling imaginary types.
+float _Imaginary fi1; // expected-error {{imaginary types are not supported}}
+_Imaginary float fi2; // expected-error {{imaginary types are not supported}}
+
+double _Imaginary di1; // expected-error {{imaginary types are not supported}}
+_Imaginary double di2; // expected-error {{imaginary types are not supported}}
+
+long double _Imaginary ldi1; // expected-error {{imaginary types are not 
supported}}
+_Imaginary long double ldi2; // expected-error {{imaginary types are not 
supported}}
+
+// Each complex type has the same representation and alignment as an array
+// containing two elements of the corresponding real type.
+_Static_assert(sizeof(float _Complex) == sizeof(struct { float mem[2]; }), "");
+_Static_assert(_Alignof(float _Complex) == _Alignof(struct { float mem[2]; }), 
"");
+
+_Static_assert(sizeof(double _Complex) == sizeof(struct { double mem[2]; }), 
"");
+_Static_assert(_Alignof(double _Complex) == _Alignof(struct { double mem[2]; 
}), "");
+
+_Static_assert(sizeof(long double _Complex) == sizeof(struct { long double 
mem[2]; }), "");
+_Static_assert(_Alignof(long double _Complex) == _Alignof(struct { long double 
mem[2]; }), "");
+
+// The first element corresponds to the real part and the second element
+// corresponds to the imaginary part.
+_Static_assert(__real((float _Complex){ 1.0f, 2.0f }) == 1.0f, "");
+_Static_assert(__imag((float _Complex){ 1.0f, 2.0f }) == 2.0f, "");

jcranmer-intel wrote:

This is using an extension for complex initializers, not a standard feature, so 
I'm a little bit uncomfortable that it's actually testing the assertion that 
the first element of the array corresponds to the real.

That said, I'm not sure there's any non-extension way to test that as a 
constant expression anyways, so... :shrug:
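
For comparison, a run-time check needs no extension at all: copy the object representation into a two-element array and look at each slot. This is only a sketch with hypothetical helper names, and it does not help a `-verify` test that needs a constant expression.

```c
#include <assert.h>
#include <complex.h>
#include <string.h>

/* Hypothetical helper: copy the object representation of z into out. */
void complex_to_pair(float _Complex z, float out[2]) {
  memcpy(out, &z, sizeof(float[2]));
}

/* Build re + im*i using only standard syntax and return slot 0. */
float real_slot(float re, float im) {
  float parts[2];
  complex_to_pair(re + im * I, parts);
  return parts[0];
}

/* Same construction, returning slot 1. */
float imag_slot(float re, float im) {
  float parts[2];
  complex_to_pair(re + im * I, parts);
  return parts[1];
}

/* Self-check: slot 0 holds the real part, slot 1 the imaginary part. */
void check_slot_order(void) {
  assert(real_slot(1.0f, 2.0f) == 1.0f);
  assert(imag_slot(1.0f, 2.0f) == 2.0f);
}
```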

https://github.com/llvm/llvm-project/pull/88161


[clang] [C99] Claim conformance for _Complex support (PR #88161)

2024-04-10 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel edited 
https://github.com/llvm/llvm-project/pull/88161


[clang] [C99] Claim conformance for _Complex support (PR #88161)

2024-04-10 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel commented:

You're missing checks for type domain rules, so things like:

- converting between `float _Complex` and `double _Complex`
- common type of `float _Complex` and `double`
- result of `int` and `float _Complex`
- complex types not allowed in increment/decrement operators
- complex types permitted in unary +, -

Not sure how to test this, but _Complex float not promoting to _Complex double 
like float does to double is also worth slotting under complex conformance. 
Also 6.7.8p10 (static-storage duration arithmetic types (which include complex 
types) are initialized to 0).
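
A rough sketch of how some of these could be checked at compile time, assuming `_Generic` and `_Static_assert` are available (both are C11, so a `-std=c99` test would be relying on extensions). `SAME_TYPE` is a hypothetical macro, not something in the patch, and the increment/decrement rule is not covered here because it needs an expected-error check instead.

```c
#include <assert.h>

/* Hypothetical macro: 1 if expr has exactly type T, 0 otherwise. */
#define SAME_TYPE(expr, T) _Generic((expr), T: 1, default: 0)

/* Common type of float _Complex and double is double _Complex. */
_Static_assert(SAME_TYPE((float _Complex)0 + (double)0, double _Complex), "");

/* int combined with float _Complex yields float _Complex. */
_Static_assert(SAME_TYPE((int)0 + (float _Complex)0, float _Complex), "");

/* Unary + and - are permitted and keep the type. */
_Static_assert(SAME_TYPE(+(float _Complex)0, float _Complex), "");
_Static_assert(SAME_TYPE(-(float _Complex)0, float _Complex), "");

/* Explicit conversion between complex types. */
_Static_assert(SAME_TYPE((double _Complex)(float _Complex)0, double _Complex),
               "");
```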

https://github.com/llvm/llvm-project/pull/88161


[clang] [CLANG] Full support of complex multiplication and division. (PR #81514)

2024-03-20 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel approved this pull request.


https://github.com/llvm/llvm-project/pull/81514


[clang] [CLANG] Full support of complex multiplication and division. (PR #81514)

2024-03-15 Thread Joshua Cranmer via cfe-commits


@@ -310,6 +310,13 @@ class ComplexExprEmitter
 CGF.getContext().getFloatTypeSemantics(ElementType);
 const llvm::fltSemantics  =
 CGF.getContext().getFloatTypeSemantics(HigherElementType);
+// Check that LongDouble Size > Double Size.
+// This can be interpreted as:
+// SmallerType.LargestFiniteVal * SmallerType.LargestFiniteVal <=
+// LargerType.LargestFiniteVal.

jcranmer-intel wrote:

Not entirely accurate, it's `(SmallerType.LargestFiniteVal * 
SmallerType.LargestFiniteVal) * 2`
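
As a sketch of what the exponent comparison in the patch amounts to, here it is restated in plain C. The function name is hypothetical, and the arguments are assumed to be maximum unbiased exponents in the sense that 127 is the value for IEEE binary32, 1023 for binary64 and 16383 for the x87 80-bit format. One reading of the formula: doubling the exponent covers the product of two values near the largest finite value, and the `+ 1` covers the factor of 2 from adding two such products.

```c
#include <assert.h>

/* Hypothetical restatement of the check quoted from the patch. */
int promoted_type_has_enough_range(int small_max_exp, int large_max_exp) {
  return small_max_exp * 2 + 1 <= large_max_exp;
}

/* Self-check with the exponent values assumed above. */
void check_promotion_range(void) {
  assert(promoted_type_has_enough_range(127, 1023));    /* float -> double */
  assert(promoted_type_has_enough_range(1023, 16383));  /* double -> x87 */
  assert(!promoted_type_has_enough_range(1023, 1023));  /* long double == double */
}
```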

https://github.com/llvm/llvm-project/pull/81514


[clang] [CLANG] Full support of complex multiplication and division. (PR #81514)

2024-03-15 Thread Joshua Cranmer via cfe-commits


@@ -310,6 +310,13 @@ class ComplexExprEmitter
 CGF.getContext().getFloatTypeSemantics(ElementType);
 const llvm::fltSemantics  =
 CGF.getContext().getFloatTypeSemantics(HigherElementType);
+// Check that LongDouble Size > Double Size.

jcranmer-intel wrote:

I'd word this as "Check that the promoted type can handle the intermediate 
values without overflowing."

https://github.com/llvm/llvm-project/pull/81514


[clang] [CLANG] Full support of complex multiplication and division. (PR #81514)

2024-03-14 Thread Joshua Cranmer via cfe-commits


@@ -287,9 +288,47 @@ class ComplexExprEmitter
   ComplexPairTy EmitComplexBinOpLibCall(StringRef LibCallName,
 const BinOpInfo );
 
-  QualType getPromotionType(QualType Ty) {
+  QualType GetHigherPrecisionFPType(QualType ElementType) {
+const auto *CurrentBT = dyn_cast(ElementType);
+switch (CurrentBT->getKind()) {
+case BuiltinType::Kind::Float16:
+  return CGF.getContext().FloatTy;
+case BuiltinType::Kind::Float:
+case BuiltinType::Kind::BFloat16:
+  return CGF.getContext().DoubleTy;
+case BuiltinType::Kind::Double:
+  return CGF.getContext().LongDoubleTy;
+default:
+  return ElementType;
+}
+  }
+
+  QualType HigherPrecisionTypeForComplexArithmetic(QualType ElementType,
+   bool IsDivOpCode) {
+QualType HigherElementType = GetHigherPrecisionFPType(ElementType);
+const llvm::fltSemantics  =
+CGF.getContext().getFloatTypeSemantics(ElementType);
+const llvm::fltSemantics  =
+CGF.getContext().getFloatTypeSemantics(HigherElementType);
+if (llvm::APFloat::semanticsMaxExponent(ElementTypeSemantics) * 2 + 1 <=

jcranmer-intel wrote:

I'd appreciate a comment here explaining why this is the correct check, so that 
future people reading this code can understand why this is being done.
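
To illustrate the kind of overflow the check guards against, here is a minimal C sketch of the denominator of a complex division, `c*c + d*d`, computed once at source precision and once after promotion. It assumes IEEE binary32 `float` and binary64 `double`; the function names are hypothetical and this is not the code Clang emits.

```c
#include <assert.h>
#include <math.h>

/* c*c + d*d at source precision. The volatile stores force each
   intermediate to be rounded to float. */
float norm_sq_float(float c, float d) {
  volatile float cc = c * c;
  volatile float dd = d * d;
  volatile float sum = cc + dd;
  return sum;
}

/* The same quantity with the operands promoted to double first. */
double norm_sq_promoted(float c, float d) {
  double pc = c, pd = d;
  return pc * pc + pd * pd;
}

/* Self-check: 3e19 squared exceeds the float range but not the double range. */
void check_norm_sq(void) {
  assert(isinf(norm_sq_float(3e19f, 3e19f)));
  assert(!isinf(norm_sq_promoted(3e19f, 3e19f)));
}
```

Both inputs are ordinary finite floats, yet the source-precision intermediate is already infinite, so the promoted type has to cover the squared range plus the addition.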

https://github.com/llvm/llvm-project/pull/81514


[clang] [CLANG] Full support of complex multiplication and division. (PR #81514)

2024-03-14 Thread Joshua Cranmer via cfe-commits


@@ -287,9 +288,47 @@ class ComplexExprEmitter
   ComplexPairTy EmitComplexBinOpLibCall(StringRef LibCallName,
 const BinOpInfo );
 
-  QualType getPromotionType(QualType Ty) {
+  QualType GetHigherPrecisionFPType(QualType ElementType) {
+const auto *CurrentBT = dyn_cast(ElementType);
+switch (CurrentBT->getKind()) {
+case BuiltinType::Kind::Float16:
+  return CGF.getContext().FloatTy;
+case BuiltinType::Kind::Float:
+case BuiltinType::Kind::BFloat16:
+  return CGF.getContext().DoubleTy;
+case BuiltinType::Kind::Double:
+  return CGF.getContext().LongDoubleTy;
+default:
+  return ElementType;
+}
+  }
+
+  QualType HigherPrecisionTypeForComplexArithmetic(QualType ElementType,
+   bool IsDivOpCode) {
+QualType HigherElementType = GetHigherPrecisionFPType(ElementType);
+const llvm::fltSemantics  =
+CGF.getContext().getFloatTypeSemantics(ElementType);
+const llvm::fltSemantics  =
+CGF.getContext().getFloatTypeSemantics(HigherElementType);
+if (llvm::APFloat::semanticsMaxExponent(ElementTypeSemantics) * 2 + 1 <=
+llvm::APFloat::semanticsMaxExponent(HigherElementTypeSemantics)) {
+  return CGF.getContext().getComplexType(HigherElementType);
+} else {
+  FPHasBeenPromoted = LangOptions::ComplexRangeKind::CX_Improved;

jcranmer-intel wrote:

If I'm following correctly, this means that the behavior wouldn't be right for 
this code:

```c
_Complex float func(_Complex float a, _Complex long double b, _Complex float c) {
  return (_Complex float)(b / c) / a;
}
```

The code would first process the long double complex division, which isn't 
promotable, and then the float division, using the same `ComplexExprEmitter` I 
think, so the resetting to improved would cause the second float division to be 
emitted as improved instead of promoted.

https://github.com/llvm/llvm-project/pull/81514


[clang] [CLANG] Full support of complex multiplication and division. (PR #81514)

2024-03-08 Thread Joshua Cranmer via cfe-commits


@@ -1847,19 +1847,50 @@ floating point semantic models: precise (the default), 
strict, and fast.
* ``16`` - Forces ``_Float16`` operations to be emitted without using excess
  precision arithmetic.
 
+.. option:: -fcomplex-arithmetic=:
+
+   This option specifies the implementation for complex multiplication and 
division.
+
+   Valid values are: ``basic``, ``improved``, ``full`` and ``promoted``.
+
+   * ``basic`` Implementation of complex division and multiplication using
+ algebraic formulas at source precision. No special handling to avoid
+ overflow. NaN and infinite values are not handled.
+   * ``improved`` Implementation of complex division using the Smith algorithm
+ at source precision. Smith's algorithm for complex division.
+ See SMITH, R. L. Algorithm 116: Complex division. Commun. ACM 5, 8 (1962).
+ This value offers improved handling for overflow in intermediate
+ calculations, but overflow may occur. NaN and infinite values are not
+ handled in some cases.
+   * ``full`` Implementation of complex division and multiplication using a
+ call to runtime library functions (generally the case, but the BE might
+ sometimes replace the library call if it knows enough about the potential
+ range of the inputs). Overflow and non-finite values are handled by the
+ library implementation. For the case of multiplication overflow will occur in
+ accordance with normal floating-point rules. This is the default value.
+   * ``promoted`` Implementation of complex division using algebraic formulas at
+ higher precision. Overflow is handled. Non-finite values are handled in some
+ cases. If the target does not have native support for a higher precision
+ data type, an implementation for the complex operation will be used to provide
+ improved guards against intermediate overflow, but overflow and underflow may
+ still occur in some cases. NaN and infinite values are not handled.

jcranmer-intel wrote:

> Windows on x86-64 is the really ugly case here, because the target hardware 
> supports an 80-bit floating-point type, but by default the operating system 
> configures the x87 layer to perform calculations as if it were a 64-bit type.

My understanding of the x87 precision control field is that it only affects the 
number of used bits in the significand, but still retains the full exponent 
range, so even with PC set to 53-bit precision, you would still get sufficient 
range to avoid overflow in a complex division, albeit the result might be 
rounded slightly differently.

https://github.com/llvm/llvm-project/pull/81514


[clang] [CLANG] Full support of complex multiplication and division. (PR #81514)

2024-03-08 Thread Joshua Cranmer via cfe-commits


@@ -283,9 +283,48 @@ class ComplexExprEmitter
   ComplexPairTy EmitComplexBinOpLibCall(StringRef LibCallName,
 const BinOpInfo &Op);
 
-  QualType getPromotionType(QualType Ty) {
+  QualType GetHigherPrecisionFPType(QualType ElementType) {
+const auto *CurrentBT = dyn_cast<BuiltinType>(ElementType);
+switch (CurrentBT->getKind()) {
+case BuiltinType::Kind::Float16:
+  return CGF.getContext().FloatTy;
+case BuiltinType::Kind::Float:
+case BuiltinType::Kind::BFloat16:
+  return CGF.getContext().DoubleTy;
+case BuiltinType::Kind::Double:
+  return CGF.getContext().LongDoubleTy;
+default:
+  return ElementType;
+}
+  }
+
+  QualType HigherPrecisionTypeForComplexArithmetic(QualType ElementType,
+   bool IsDivOpCode) {
+QualType HigherElementType = GetHigherPrecisionFPType(ElementType);
+const llvm::fltSemantics &ElementTypeSemantics =
+CGF.getContext().getFloatTypeSemantics(ElementType);
+const llvm::fltSemantics &HigherElementTypeSemantics =
+CGF.getContext().getFloatTypeSemantics(HigherElementType);
+const llvm::Triple TI = CGF.getTarget().getTriple();
+if ((llvm::APFloat::getSizeInBits(HigherElementTypeSemantics) >
+ llvm::APFloat::getSizeInBits(ElementTypeSemantics)) &&

jcranmer-intel wrote:

To avoid intermediate overflow, we need Smaller::largest() * Smaller::largest() 
+ Smaller::largest() * Smaller::largest() <= Larger::largest(). That means the 
correct check should be, I think:

```c++
if (APFloatBase::semanticsMaxExponent(ElementTypeSemantics) * 2 + 1 <=
    APFloatBase::semanticsMaxExponent(HigherElementTypeSemantics))
```
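
As a rough illustration of the range requirement above, the sketch below applies the same style of exponent comparison to ordinary C++ types through `std::numeric_limits`. The helper name `promotionAvoidsOverflow` is hypothetical and not part of Clang. Note that `max_exponent` in the standard library is one larger than the value `semanticsMaxExponent` returns, so the constant is stated in the standard library's convention here rather than copied from the suggestion above. It also assumes the larger type has at least as many significand bits as the smaller one.

```c++
#include <limits>

// Hypothetical helper (not part of Clang): returns true when a*a + b*b,
// evaluated in Large, cannot overflow for any finite a and b of type Small.
// std::numeric_limits<T>::max_exponent is defined so that every finite value
// of T is below 2^max_exponent, hence a*a + b*b is below 2^(2*SmallExp + 1).
template <typename Small, typename Large>
constexpr bool promotionAvoidsOverflow() {
  constexpr int SmallExp = std::numeric_limits<Small>::max_exponent;
  constexpr int LargeExp = std::numeric_limits<Large>::max_exponent;
  return 2 * SmallExp + 1 <= LargeExp;
}
```

With IEEE binary32 `float` and binary64 `double`, the check passes for `float` promoted to `double` and fails when the "larger" type is the same as the source type.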

https://github.com/llvm/llvm-project/pull/81514


[clang] [CLANG] Full support of complex multiplication and division. (PR #81514)

2024-03-08 Thread Joshua Cranmer via cfe-commits


@@ -982,13 +1024,18 @@ ComplexPairTy ComplexExprEmitter::EmitBinDiv(const BinOpInfo &Op) {
 llvm::Value *OrigLHSi = LHSi;
 if (!LHSi)
   LHSi = llvm::Constant::getNullValue(RHSi->getType());
-if (Op.FPFeatures.getComplexRange() == LangOptions::CX_Fortran)
+QualType ComplexElementTy = Op.Ty->castAs<ComplexType>()->getElementType();
+const BuiltinType *BT = ComplexElementTy->getAs<BuiltinType>();
+if (Op.FPFeatures.getComplexRange() == LangOptions::CX_Improved ||
+(Op.FPFeatures.getComplexRange() == LangOptions::CX_Promoted &&
+ BT->getKind() == BuiltinType::Kind::Double))

jcranmer-intel wrote:

I still don't think this is the right way to do this logic. It's probably 
better to have a local effective complex range kind variable, and if promoted 
is requested but can't be done, have it swap it to improved.

(This will definitely break down for decimal floats/hexfloats!)
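
A minimal sketch of the "effective range kind" idea, written as standalone C++ rather than against the real Clang types. The enum `ComplexRangeKind`, the function `effectiveComplexRange`, and the `CanPromote` parameter are all hypothetical stand-ins for `LangOptions::CX_*` and whatever type query the patch ends up using:

```c++
// Hypothetical stand-in for the complex range kinds named in the patch.
enum class ComplexRangeKind { Basic, Improved, Full, Promoted };

// Resolve the requested strategy once, up front. Code that emits the division
// then only inspects the returned value and never re-checks the element type.
constexpr ComplexRangeKind effectiveComplexRange(ComplexRangeKind Requested,
                                                 bool CanPromote) {
  if (Requested == ComplexRangeKind::Promoted && !CanPromote)
    return ComplexRangeKind::Improved;
  return Requested;
}
```

The point of the shape is that the fallback decision lives in one place, so it does not depend on enumerating specific builtin kinds at each use site.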

https://github.com/llvm/llvm-project/pull/81514


[clang] [Clang] [Sema] Fix bug in `_Complex float`+`int` arithmetic (PR #83063)

2024-03-04 Thread Joshua Cranmer via cfe-commits


@@ -0,0 +1,143 @@
+// RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown -o - | FileCheck %s --check-prefix=X86
+
+// Check that for 'F _Complex + int' (F = real floating-point type), we emit an
+// implicit cast from 'int' to 'F', but NOT to 'F _Complex' (i.e. that we do
+// 'F _Complex + F', NOT 'F _Complex + F _Complex'), and likewise for -/*.
+
+float _Complex add_float_ci(float _Complex a, int b) {
+  // X86-LABEL: @add_float_ci
+  // X86: [[I:%.*]] = sitofp i32 {{%.*}} to float
+  // X86: fadd float {{.*}}, [[I]]
+  // X86-NOT: fadd
+  return a + b;
+}
+
+float _Complex add_float_ic(int a, float _Complex b) {
+  // X86-LABEL: @add_float_ic
+  // X86: [[I:%.*]] = sitofp i32 {{%.*}} to float
+  // X86: fadd float [[I]]
+  // X86-NOT: fadd
+  return a + b;
+}
+
+float _Complex sub_float_ci(float _Complex a, int b) {
+  // X86-LABEL: @sub_float_ci
+  // X86: [[I:%.*]] = sitofp i32 {{%.*}} to float
+  // X86: fsub float {{.*}}, [[I]]
+  // X86-NOT: fsub
+  return a - b;
+}
+
+float _Complex sub_float_ic(int a, float _Complex b) {
+  // X86-LABEL: @sub_float_ic
+  // X86: [[I:%.*]] = sitofp i32 {{%.*}} to float
+  // X86: fsub float [[I]]
+  // X86: fneg
+  // X86-NOT: fsub
+  return a - b;
+}
+
+float _Complex mul_float_ci(float _Complex a, int b) {
+  // X86-LABEL: @mul_float_ci
+  // X86: [[I:%.*]] = sitofp i32 {{%.*}} to float
+  // X86: fmul float {{.*}}, [[I]]
+  // X86: fmul float {{.*}}, [[I]]
+  // X86-NOT: fmul
+  return a * b;
+}
+
+float _Complex mul_float_ic(int a, float _Complex b) {
+  // X86-LABEL: @mul_float_ic
+  // X86: [[I:%.*]] = sitofp i32 {{%.*}} to float
+  // X86: fmul float [[I]]
+  // X86: fmul float [[I]]
+  // X86-NOT: fmul
+  return a * b;
+}
+
+float _Complex div_float_ci(float _Complex a, int b) {
+  // X86-LABEL: @div_float_ci
+  // X86: [[I:%.*]] = sitofp i32 {{%.*}} to float
+  // X86: fdiv float {{.*}}, [[I]]
+  // X86: fdiv float {{.*}}, [[I]]
+  // X86-NOT: @__divsc3
+  return a / b;
+}
+
+// There is no good way of doing this w/o converting the 'int' to a complex
+// number, so we expect complex division here.
+float _Complex div_float_ic(int a, float _Complex b) {
+  // X86-LABEL: @div_float_ic
+  // X86: [[I:%.*]] = sitofp i32 {{%.*}} to float
+  // X86: call {{.*}} @__divsc3(float {{.*}} [[I]], float noundef 0.{{0+}}e+00, float {{.*}}, float {{.*}})
+  return a / b;
+}
+
+double _Complex add_double_ci(double _Complex a, int b) {
+  // X86-LABEL: @add_double_ci
+  // X86: [[I:%.*]] = sitofp i32 {{%.*}} to double
+  // X86: fadd double {{.*}}, [[I]]
+  // X86-NOT: fadd
+  return a + b;
+}
+
+double _Complex add_double_ic(int a, double _Complex b) {
+  // X86-LABEL: @add_double_ic
+  // X86: [[I:%.*]] = sitofp i32 {{%.*}} to double
+  // X86: fadd double [[I]]
+  // X86-NOT: fadd
+  return a + b;
+}
+
+double _Complex sub_double_ci(double _Complex a, int b) {
+  // X86-LABEL: @sub_double_ci
+  // X86: [[I:%.*]] = sitofp i32 {{%.*}} to double
+  // X86: fsub double {{.*}}, [[I]]
+  // X86-NOT: fsub
+  return a - b;
+}
+
+double _Complex sub_double_ic(int a, double _Complex b) {
+  // X86-LABEL: @sub_double_ic
+  // X86: [[I:%.*]] = sitofp i32 {{%.*}} to double
+  // X86: fsub double [[I]]
+  // X86: fneg
+  // X86-NOT: fsub
+  return a - b;
+}
+
+double _Complex mul_double_ci(double _Complex a, int b) {
+  // X86-LABEL: @mul_double_ci
+  // X86: [[I:%.*]] = sitofp i32 {{%.*}} to double
+  // X86: fmul double {{.*}}, [[I]]
+  // X86: fmul double {{.*}}, [[I]]
+  // X86-NOT: fmul
+  return a * b;
+}
+
+double _Complex mul_double_ic(int a, double _Complex b) {
+  // X86-LABEL: @mul_double_ic
+  // X86: [[I:%.*]] = sitofp i32 {{%.*}} to double
+  // X86: fmul double [[I]]
+  // X86: fmul double [[I]]
+  // X86-NOT: fmul
+  return a * b;
+}
+
+double _Complex div_double_ci(double _Complex a, int b) {
+  // X86-LABEL: @div_double_ci
+  // X86: [[I:%.*]] = sitofp i32 {{%.*}} to double
+  // X86: fdiv double {{.*}}, [[I]]
+  // X86: fdiv double {{.*}}, [[I]]
+  // X86-NOT: @__divdc3

jcranmer-intel wrote:

This check makes me a little nervous, since it's dependent on the default 
strategy for complex division not being changed. OTOH, the check later on is 
expecting a call to __divdc3, so if the default strategy does change, then this 
test should at least fail so that someone changing the default strategy should 
know to update the -NOT check here.

(If I had a great idea for how to do a codegen check for complex / real domain 
that was independent of complex / complex domain implementation strategy, I 
would give it here, but I don't).

https://github.com/llvm/llvm-project/pull/83063


[clang] [Clang] [Sema] Fix bug in `_Complex float`+`int` arithmetic (PR #83063)

2024-03-04 Thread Joshua Cranmer via cfe-commits


@@ -1114,8 +1114,6 @@ static bool handleIntegerToComplexFloatConversion(Sema 
, ExprResult ,
   if (IntTy->isIntegerType()) {
 QualType fpTy = ComplexTy->castAs<ComplexType>()->getElementType();
 IntExpr = S.ImpCastExprToType(IntExpr.get(), fpTy, CK_IntegralToFloating);
-IntExpr = S.ImpCastExprToType(IntExpr.get(), ComplexTy,
-  CK_FloatingRealToComplex);

jcranmer-intel wrote:

This change makes the comment on this method and the comment in 
`handleComplexConversion` inaccurate; it would be a good idea to update those 
comments to indicate what is actually happening here.

https://github.com/llvm/llvm-project/pull/83063


[clang] [CLANG] Full support of complex multiplication and division. (PR #81514)

2024-02-29 Thread Joshua Cranmer via cfe-commits


@@ -1847,19 +1847,33 @@ floating point semantic models: precise (the default), strict, and fast.
* ``16`` - Forces ``_Float16`` operations to be emitted without using excess
  precision arithmetic.
 
-.. option:: -fcx-limited-range:
-
-   This option enables the naive mathematical formulas for complex division and
-   multiplication with no NaN checking of results. The default is
-   ``-fno-cx-limited-range``, but this option is enabled by the ``-ffast-math``
-   option.
-
-.. option:: -fcx-fortran-rules:
-
-   This option enables the naive mathematical formulas for complex
-   multiplication and enables application of Smith's algorithm for complex
-   division. See SMITH, R. L. Algorithm 116: Complex division. Commun.
-   ACM 5, 8 (1962). The default is ``-fno-cx-fortran-rules``.
+.. option:: -fcomplex-arithmetic=:
+
+   This option specifies the implementation for complex multiplication and division.
+
+   Valid values are: ``basic``, ``improved``, ``full`` and ``promoted``.
+
+   * ``basic`` Implementation of complex division and multiplication using
+ algebraic formulas at source precision. No special handling to avoid
+ overflow. NaN and infinite and  values are not handled.
+   * ``improved`` Implementation of complex division using the Smith algorithm at
+ source precision. Smith's algorithm for complex division.
+ See SMITH, R. L. Algorithm 116: Complex division. Commun. ACM 5, 8 (1962).
+ This value offers improved handling for overflow in intermediate calculations,
+ but overflow may occur. NaN and infinite and  values are not handled in some
+ cases.
+   * ``full``  Implementation of complex division and multiplication using a
+ call to runtime library functions (generally the case, but the BE might
+ sometimes replace the library call if it knows enough about the potential
+ range of the inputs). Overflow and non-finite values are handled by the
+ library implementation.
+   * ``promoted`` Implementation of complex division using algebraic formulas at
+ higher precision. Overflow is handled. Non-finite values are handled in some
+ cases. If the target hardware does not have native support for a higher precision
+ data type, an implementation for the complex operation will be used to provide
+ improved guards against intermediate overflow, but overflow and underflow may
+ still occur in some cases. NaN and infinite and  values are not handled.
+ This is the default value.

jcranmer-intel wrote:

C89 doesn't have complex types; do we even support `_Complex` in `-std=gnu89` 
mode?

https://github.com/llvm/llvm-project/pull/81514


[clang] [CLANG] Full support of complex multiplication and division. (PR #81514)

2024-02-29 Thread Joshua Cranmer via cfe-commits


@@ -982,13 +1022,18 @@ ComplexPairTy ComplexExprEmitter::EmitBinDiv(const BinOpInfo &Op) {
 llvm::Value *OrigLHSi = LHSi;
 if (!LHSi)
   LHSi = llvm::Constant::getNullValue(RHSi->getType());
-if (Op.FPFeatures.getComplexRange() == LangOptions::CX_Fortran)
+QualType ComplexElementTy = Op.Ty->castAs<ComplexType>()->getElementType();
+const BuiltinType *BT = ComplexElementTy->getAs<BuiltinType>();
+if (Op.FPFeatures.getComplexRange() == LangOptions::CX_Improved ||
+(Op.FPFeatures.getComplexRange() == LangOptions::CX_Promoted &&
+ BT->getKind() == BuiltinType::Kind::LongDouble))

jcranmer-intel wrote:

I believe this isn't going to do the right thing for `_Complex __float128`. 
I'm pretty sure the same holds for `_Complex double` in the case where 
`long double` is binary64, since that would promote it to itself and then do 
basic.

https://github.com/llvm/llvm-project/pull/81514


[clang] [CLANG] Full support of complex multiplication and division. (PR #81514)

2024-02-29 Thread Joshua Cranmer via cfe-commits


@@ -283,9 +283,46 @@ class ComplexExprEmitter
   ComplexPairTy EmitComplexBinOpLibCall(StringRef LibCallName,
 const BinOpInfo &Op);
 
-  QualType getPromotionType(QualType Ty) {
+  QualType HigherPrecisionTypeForComplexArithmetic(QualType ElementType,
+   bool IsDivOpCode) {
+const TargetInfo &TI = CGF.getContext().getTargetInfo();
+const LangOptions Opts = CGF.getLangOpts();
+if (const auto *BT = dyn_cast<BuiltinType>(ElementType)) {
+  switch (BT->getKind()) {
+  case BuiltinType::Kind::Float16: {
+if (TI.hasFloat16Type() && !TI.hasLegalHalfType())
+  return CGF.getContext().getComplexType(CGF.getContext().FloatTy);
+break;
+  }
+  case BuiltinType::Kind::BFloat16: {
+if (TI.hasBFloat16Type() && !TI.hasFullBFloat16Type())
+  return CGF.getContext().getComplexType(CGF.getContext().FloatTy);
+break;
+  }
+  case BuiltinType::Kind::Float:
+return CGF.getContext().getComplexType(CGF.getContext().DoubleTy);
+break;
+  case BuiltinType::Kind::Double: {
+if (TI.hasLongDoubleType())
+  return CGF.getContext().getComplexType(CGF.getContext().LongDoubleTy);
+return CGF.getContext().getComplexType(CGF.getContext().DoubleTy);

jcranmer-intel wrote:

I'm not sure it's a good idea to return a specific type here if it's not known 
to actually be higher precision? `long double` isn't guaranteed to be a 
different LLVM type from `double`...
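
To make the concern concrete outside of Clang, here is a small standalone sketch that only treats a candidate type as a promotion target when it is strictly wider in both significand and exponent range. The name `isStrictlyWider` is hypothetical; inside the compiler the analogous query would presumably go through the types' `fltSemantics` rather than `std::numeric_limits`:

```c++
#include <limits>

// Hypothetical check: To counts as "higher precision" than From only if it has
// more significand digits and a larger exponent range. On targets where
// long double is the same format as double, this is false for that pair.
template <typename From, typename To>
constexpr bool isStrictlyWider() {
  return std::numeric_limits<To>::digits > std::numeric_limits<From>::digits &&
         std::numeric_limits<To>::max_exponent >
             std::numeric_limits<From>::max_exponent;
}
```

Whether `isStrictlyWider<double, long double>()` holds depends on the target, which is exactly why returning `LongDoubleTy` unconditionally is questionable.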

https://github.com/llvm/llvm-project/pull/81514


[clang] [CLANG] Full support of complex multiplication and division. (PR #81514)

2024-02-29 Thread Joshua Cranmer via cfe-commits


@@ -1847,19 +1847,50 @@ floating point semantic models: precise (the default), strict, and fast.
* ``16`` - Forces ``_Float16`` operations to be emitted without using excess
  precision arithmetic.
 
+.. option:: -fcomplex-arithmetic=:
+
+   This option specifies the implementation for complex multiplication and division.
+
+   Valid values are: ``basic``, ``improved``, ``full`` and ``promoted``.
+
+   * ``basic`` Implementation of complex division and multiplication using
+ algebraic formulas at source precision. No special handling to avoid
+ overflow. NaN and infinite values are not handled.
+   * ``improved`` Implementation of complex division using the Smith algorithm
+ at source precision. Smith's algorithm for complex division.
+ See SMITH, R. L. Algorithm 116: Complex division. Commun. ACM 5, 8 (1962).
+ This value offers improved handling for overflow in intermediate
+ calculations, but overflow may occur. NaN and infinite values are not
+ handled in some cases.
+   * ``full`` Implementation of complex division and multiplication using a
+ call to runtime library functions (generally the case, but the BE might
+ sometimes replace the library call if it knows enough about the potential
+ range of the inputs). Overflow and non-finite values are handled by the
+ library implementation. For the case of multiplication overflow will occur in
+ accordance with normal floating-point rules. This is the default value.
+   * ``promoted`` Implementation of complex division using algebraic formulas at
+ higher precision. Overflow is handled. Non-finite values are handled in some
+ cases. If the target does not have native support for a higher precision
+ data type, an implementation for the complex operation will be used to provide
+ improved guards against intermediate overflow, but overflow and underflow may
+ still occur in some cases. NaN and infinite values are not handled.

jcranmer-intel wrote:

This documentation doesn't make it clear what happens if there is no 
higher-precision datatype available and you use `promoted` format. And I'm 
somewhat uncomfortable with the idea that using `promoted` keeps you from being 
able to choose what happens in that case.

(In general, `promoted` is scary to me for anything larger than a `float` 
because `long double` is just such a cursed type and I'm not sure it's a good 
idea to convert `double` computations to `long double`).
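
For readers less familiar with the strategies named in the hunk, the following standalone sketch contrasts the textbook formula (`basic`) with Smith's algorithm (`improved`) on `double`. It is an illustration only, not the code Clang emits; `Complex`, `divideBasic`, and `divideSmith` are hypothetical names, and neither function attempts any NaN or infinity recovery:

```c++
#include <cmath>
#include <utility>

using Complex = std::pair<double, double>; // (real, imaginary)

// "basic"-style division: textbook formula, no protection against overflow
// in C*C + D*D or in the numerator products.
Complex divideBasic(Complex X, Complex Y) {
  double A = X.first, B = X.second, C = Y.first, D = Y.second;
  double Denom = C * C + D * D;
  return {(A * C + B * D) / Denom, (B * C - A * D) / Denom};
}

// "improved"-style division: Smith's algorithm divides through by the larger
// component of the divisor, so C*C + D*D is never formed directly.
Complex divideSmith(Complex X, Complex Y) {
  double A = X.first, B = X.second, C = Y.first, D = Y.second;
  if (std::fabs(C) >= std::fabs(D)) {
    double R = D / C;
    double Denom = C + D * R;
    return {(A + B * R) / Denom, (B - A * R) / Denom};
  }
  double R = C / D;
  double Denom = C * R + D;
  return {(A * R + B) / Denom, (B * R - A) / Denom};
}
```

Dividing `(1e200, 1e200)` by itself shows the difference: the intermediate products in `divideBasic` overflow, while `divideSmith` stays in range. A `promoted` strategy would instead run the textbook formula in a wider type, which only helps when such a type actually exists.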

https://github.com/llvm/llvm-project/pull/81514


[clang] Disable FTZ/DAZ when compiling shared libraries by default. (PR #80475)

2024-02-28 Thread Joshua Cranmer via cfe-commits

jcranmer-intel wrote:

> It matters more for AMDGPU, where we need to care because some instructions 
> just don't respect denormals. We legalize some operations differently 
> depending on the mode

But the shared library stuff isn't an issue for AMDGPU, right?

https://github.com/llvm/llvm-project/pull/80475


[clang] Disable FTZ/DAZ when compiling shared libraries by default. (PR #80475)

2024-02-22 Thread Joshua Cranmer via cfe-commits


@@ -1569,6 +1569,40 @@
 // RUN:--gcc-toolchain="" \
 // RUN:--sysroot=%S/Inputs/basic_linux_tree 2>&1 \
 // RUN:   | FileCheck --check-prefix=CHECK-NOCRTFASTMATH %s
+// Don't link crtfastmath.o with -shared
+// RUN: %clang --target=x86_64-unknown-linux -no-pie -### %s -ffast-math -shared \
+// RUN:--gcc-toolchain="" \

jcranmer-intel wrote:

The tests use a custom sysroot that provides a `crtfastmath.o` file, so it 
doesn't matter whether or not it's provided by the host toolchain.

https://github.com/llvm/llvm-project/pull/80475


[clang] Disable FTZ/DAZ when compiling shared libraries by default. (PR #80475)

2024-02-22 Thread Joshua Cranmer via cfe-commits

jcranmer-intel wrote:

I did a thorough investigation into how the 
`denormal-fp-math=preserve-sign,preserve-sign` attribute affects the resulting 
IR for all of the SPEC benchmarks (which actually do run into subnormals). In 
summary, the only differences I found were that the FTZ/DAZ mode prevented a 
few inferences that are still true under denormal flushing: things like 
`x >= 0` implies `x >= -1`.

So overall, the practical effect of the `denormal-fp-math` attribute being set 
incorrectly doesn't appear to matter.

https://github.com/llvm/llvm-project/pull/80475


[clang] Disable FTZ/DAZ when compiling shared libraries by default. (PR #80475)

2024-02-22 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel updated 
https://github.com/llvm/llvm-project/pull/80475

>From fc0507f013f556cc7c49a38f22d14578311f1f42 Mon Sep 17 00:00:00 2001
From: Joshua Cranmer 
Date: Fri, 2 Feb 2024 10:35:29 -0800
Subject: [PATCH 1/3] Disable FTZ/DAZ when compiling shared libraries by
 default.

This fixes https://github.com/llvm/llvm-project/issues/57589, and aligns
Clang with the behavior of current versions of gcc. There is a new option,
-mdaz-ftz, to control the linking of the file that sets FTZ/DAZ on startup, and
this flag is on by default if -ffast-math is present and -shared isn't.
---
 clang/docs/ReleaseNotes.rst   |  5 
 clang/docs/UsersManual.rst| 15 
 clang/include/clang/Driver/Options.td |  5 
 clang/lib/Driver/ToolChain.cpp| 15 ++--
 clang/test/Driver/linux-ld.c  | 34 +++
 5 files changed, 67 insertions(+), 7 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 74bb9a07f0b13f..723a2bc859219b 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -145,6 +145,11 @@ Non-comprehensive list of changes in this release
 - ``__builtin_addc``, ``__builtin_subc``, and the other sizes of those
   builtins are now constexpr and may be used in constant expressions.
 
+* Code compiled with ``-shared`` and ``-ffast-math`` will no longer enable
+  flush-to-zero floating-point mode by default. This decision can be overridden
+  with use of ``-mdaz-ftz``. This behavior now matches GCC's behavior.
+  (`#57589 <https://github.com/llvm/llvm-project/issues/57589>`_)
+
 New Compiler Flags
 --
 
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index 7391e4cf3a9aeb..257af3c8d4d39a 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -1506,7 +1506,8 @@ floating point semantic models: precise (the default), strict, and fast.
 
* ``-ffp-contract=fast``
 
-   Note: ``-ffast-math`` causes ``crtfastmath.o`` to be linked with code. See
+   Note: ``-ffast-math`` causes ``crtfastmath.o`` to be linked with code unless
+   ``-shared`` or ``-mno-daz-ftz`` is present. See
:ref:`crtfastmath.o` for more details.
 
 .. option:: -fno-fast-math
@@ -1560,7 +1561,8 @@ floating point semantic models: precise (the default), strict, and fast.
  ``-ffp-contract``.
 
Note: ``-fno-fast-math`` implies ``-fdenormal-fp-math=ieee``.
-   ``-fno-fast-math`` causes ``crtfastmath.o`` to not be linked with code.
+   ``-fno-fast-math`` causes ``crtfastmath.o`` to not be linked with code
+   unless ``-mdaz-ftz`` is present.
 
 .. option:: -fdenormal-fp-math=
 
@@ -1907,10 +1909,13 @@ by using ``#pragma STDC FENV_ROUND`` with a value other than ``FE_DYNAMIC``.
 
 A note about ``crtfastmath.o``
 ^^
-``-ffast-math`` and ``-funsafe-math-optimizations`` cause ``crtfastmath.o`` to be
-automatically linked,  which adds a static constructor that sets the FTZ/DAZ
+``-ffast-math`` and ``-funsafe-math-optimizations`` without the ``-shared``
+option cause ``crtfastmath.o`` to be
+automatically linked, which adds a static constructor that sets the FTZ/DAZ
 bits in MXCSR, affecting not only the current compilation unit but all static
-and shared libraries included in the program.
+and shared libraries included in the program. This decision can be overridden
+by using either the flag ``-mdaz-ftz`` or ``-mno-daz-ftz`` to respectively
+link or not link ``crtfastmath.o``.
 
 .. _FLT_EVAL_METHOD:
 
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 3a028fadb25b18..c607e2e5c6ca5f 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -2581,6 +2581,11 @@ defm protect_parens : BoolFOption<"protect-parens",
   "floating-point expressions are evaluated">,
   NegFlag>;
 
+defm daz_ftz : SimpleMFlag<"daz-ftz",
+  "Globally set", "Do not globally set",
+  " the denormals-are-zero (DAZ) and flush-to-zero (FTZ) bits in the "
+  "floating-point control register on program startup.">;
+
 def ffor_scope : Flag<["-"], "ffor-scope">, Group;
 def fno_for_scope : Flag<["-"], "fno-for-scope">, Group;
 
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index f8c13c86daf9b0..08301b89ebb33e 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -1292,9 +1292,14 @@ void ToolChain::AddCCKextLibArgs(const ArgList ,
 
 bool ToolChain::isFastMathRuntimeAvailable(const ArgList ,
std::string ) const {
+  // Don't implicitly link in mode-changing libraries in a shared library, since
+  // this can have very deleterious effects. See the various links from
+  // https://github.com/llvm/llvm-project/issues/57589 for more information.
+  bool Default = !Args.hasArg(options::OPT_shared);
+
   // Do not check for -fno-fast-math or -fno-unsafe-math 

[clang] [llvm] InstCombine: Enable SimplifyDemandedUseFPClass and remove flag (PR #81108)

2024-02-13 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel approved this pull request.


https://github.com/llvm/llvm-project/pull/81108


[clang] Disable FTZ/DAZ when compiling shared libraries by default. (PR #80475)

2024-02-09 Thread Joshua Cranmer via cfe-commits

jcranmer-intel wrote:

> > > I'd like to see this change land, but with the "-mdaz-ftz" option removed 
> > > (because I don't think it's useful). That would fix the critical problem 
> > > of fast-math infecting shared libraries with the ftz setting, and we 
> > > could straighten out the other problems, which are relatively minor, 
> > > afterwards.
> > 
> > 
> > There is, without this change, no way to control whether or not 
> > `crtfastmath.o` is linked independent of all of the other fast-math 
> > options. The `-mdaz-ftz` option would at least add a flag to explicitly 
> > control the parameter (for the people who care), and we can then have 
> > discussions about different ways to effect setting DAZ/FTZ bits or what 
> > options imply `-mdaz-ftz` in future PRs. That alone makes it a worthy 
> > addition IMHO; the compatibility with gcc is another nice feature.
> 
> You can always link crtfastmath.o directly, of course. Ultimately, I don't 
> think the compiler should ever be adding the crtfastmath.o file. I would 
> prefer to insert code directly into the entry function as @arsenm indicated 
> the AMDGPU backend does for kernels. That would then be controlled by the 
> -fdenormal-fp-math option or something more explicitly linked to the entry 
> function.
> 
> I don't want to add -mdaz-ftz because once we do we're kind of stuck with it. 
> If you don't add it here, we'd continue the current behavior of linking with 
> crtfastmath.o normally but we'd stop infecting shared libraries with it.

I don't think it's unreasonable to switch the logic of `-mdaz-ftz` from linking 
a file with a global initializer that sets the flags to making it emit the 
entry-point call to such a function instead; it still largely follows the same 
logic to me. And having a command-line flag makes this easier for users to 
access than manually linking in a file located who-knows-where in the toolchain 
(although I suspect anyone who cares hard enough would rather just write the 
calls to set FTZ/DAZ than track it down from the toolchain).
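
For reference, "writing the calls" on x86-64 could look roughly like the sketch below, which sets the FTZ (bit 15) and DAZ (bit 6) bits of MXCSR through the `_mm_getcsr`/`_mm_setcsr` intrinsics from `<xmmintrin.h>`. The function name `enableFtzDaz` and the constant `HaveMXCSR` are hypothetical, the non-x86 branch is a placeholder, and the setting only applies to the calling thread (and threads that inherit its floating-point environment):

```c++
#if defined(__x86_64__) || defined(_M_X64)
#include <xmmintrin.h>
constexpr bool HaveMXCSR = true;
#else
constexpr bool HaveMXCSR = false;
#endif

// Sets FTZ (bit 15) and DAZ (bit 6) in MXCSR for the calling thread.
// Returns true if both bits read back as set; always false off x86-64.
bool enableFtzDaz() {
#if defined(__x86_64__) || defined(_M_X64)
  constexpr unsigned FTZ = 0x8000u;
  constexpr unsigned DAZ = 0x0040u;
  _mm_setcsr(_mm_getcsr() | FTZ | DAZ);
  return (_mm_getcsr() & (FTZ | DAZ)) == (FTZ | DAZ);
#else
  return false;
#endif
}
```

Calling this at the start of `main` approximates what a startup object like `crtfastmath.o` does, but under the program author's explicit control.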

https://github.com/llvm/llvm-project/pull/80475


[clang] Disable FTZ/DAZ when compiling shared libraries by default. (PR #80475)

2024-02-08 Thread Joshua Cranmer via cfe-commits

jcranmer-intel wrote:

> I'd like to see this change land, but with the "-mdaz-ftz" option removed 
> (because I don't think it's useful). That would fix the critical problem of 
> fast-math infecting shared libraries with the ftz setting, and we could 
> straighten out the other problems, which are relatively minor, afterwards.

There is, without this change, no way to control whether or not `crtfastmath.o` 
is linked independent of all of the other fast-math options. The `-mdaz-ftz` 
option would at least add a flag to explicitly control the parameter (for the 
people who care), and we can then have discussions about different ways to 
effect setting DAZ/FTZ bits or what options imply `-mdaz-ftz` in future PRs. 
That alone makes it a worthy addition IMHO; the compatibility with gcc is 
another nice feature.

> I'm suggesting that we modify Clang so that -ffast-math doesn't affect 
> denormal-fp-math, by (as I mentioned before) removing the overload 
> [Linux::getDefaultDenormalModeForType](https://github.com/llvm/llvm-project/blob/d4c5acac99e83ffa12d2d720c9e502a181cbd7ea/clang/lib/Driver/ToolChains/Linux.cpp#L838).
>  This makes Clang for Linux X86 and X86-64 work the same as every other 
> OS/CPU combo.

That sounds to me ultimately like it should be a separate PR, although you're 
suggesting that should happen first?

https://github.com/llvm/llvm-project/pull/80475


[clang] Disable FTZ/DAZ when compiling shared libraries by default. (PR #80475)

2024-02-05 Thread Joshua Cranmer via cfe-commits

jcranmer-intel wrote:

> (Sidenote: "dynamic" isn't even 
> [documented](https://clang.llvm.org/docs/UsersManual.html#cmdoption-fdenormal-fp-math)).

It's not a selectable enum of the Clang `-fdenormal-fp-math` flag, but it is 
one for the LLVM function attribute `denormal-fp-math`.

https://github.com/llvm/llvm-project/pull/80475


[clang] Disable FTZ/DAZ when compiling shared libraries by default. (PR #80475)

2024-02-05 Thread Joshua Cranmer via cfe-commits

jcranmer-intel wrote:

> I think there is a bit of a problematic interaction with 
> getDenormalModeForType 
> [here](https://github.com/llvm/llvm-project/blob/7a94acb2da5b20d12f13f3c5f4eb0f3f46e78e73/clang/lib/Driver/ToolChains/Linux.cpp#L838C8-L838C37).
>  "-shared" is (should be) a flag used only for linking, but that function is 
> calling isFastMathRuntimeAvailable to affect the default denormal math mode 
> for _compilation_. That's not going to work.

The logic is already somewhat broken: if `-ffast-math` isn't being 
added into the link line, or we're a shared library linked into a `-ffast-math` 
executable, we're not going to get the right value of denormal math mode. So 
it's a matter of which behavior is going to be the least likely to be incorrect 
in practice.

> I wonder if, instead, we should just have `-ffast-math` always downgrade 
> `-fdenormal-fp-math=ieee` to `-fdenormal-fp-math=preserve-sign`, under the 
> rationale of "you asked for fast math, and preserve-sign mode might let the 
> compiler generate faster code"?

One of the issues is that we have optimizations that are going to make 
assumptions about how denormals work in operations (particularly around being 
able to lower `llvm.is.fpclass` to `fcmp`). So you risk misoptimization if the 
value is wrong. Arguably, the only safe value is dynamic (compiler doesn't know 
what it's set to).
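
As a plain C++ analogy for the kind of rewrite at stake (this is not LLVM's transform, just an illustration), compare a class-based zero test with a compare-based one. The helper names are hypothetical, and the claim about `std::fpclassify` assumes the common implementation that inspects the bit pattern:

```c++
#include <cmath>
#include <limits>

// Class-based test: asks whether the value's encoding is a zero.
bool isZeroByClass(double X) { return std::fpclassify(X) == FP_ZERO; }

// Compare-based test: asks whether the value compares equal to 0.0.
// Under IEEE denormal handling this agrees with the class test. With
// denormals-are-zero enabled, a subnormal input would compare equal to 0.0
// while its encoding is still not a zero, so the two tests could disagree.
bool isZeroByCompare(double X) { return X == 0.0; }
```

Replacing one form with the other is only sound if the compiler's belief about the denormal mode matches the mode in effect at run time, which is why an incorrect attribute value risks misoptimization.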

(We're wrong in another direction, which is that we don't attempt to set the 
flag to non-ieee for anything other than x86. I tried looking up how the 
equivalents of FTZ/DAZ worked on other architectures and came to the conclusion 
that the whole thing is a bundle of sadness that should have never been 
invented.)

https://github.com/llvm/llvm-project/pull/80475


[clang] Disable FTZ/DAZ when compiling shared libraries by default. (PR #80475)

2024-02-02 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel created 
https://github.com/llvm/llvm-project/pull/80475

This fixes https://github.com/llvm/llvm-project/issues/57589, and aligns Clang 
with the behavior of current versions of gcc. There is a new option, -mdaz-ftz, 
to control the linking of the file that sets FTZ/DAZ on startup, and this flag 
is on by default if -ffast-math is present and -shared isn't.
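
As a rough model of the rule described above (a sketch only, not the actual `ToolChain::isFastMathRuntimeAvailable` code; the `LinkFlags` struct and `shouldLinkCrtFastMath` function are hypothetical names), an explicit `-mdaz-ftz`/`-mno-daz-ftz` wins, and otherwise the file is linked only for a non-shared `-ffast-math` link. The real driver also has to check that `crtfastmath.o` exists for the toolchain, which this sketch ignores.

```cpp
#include <optional>

// Hypothetical summary of the relevant driver arguments.
struct LinkFlags {
  bool FastMath = false;       // -ffast-math or -funsafe-math-optimizations
  bool Shared = false;         // -shared
  std::optional<bool> DazFtz;  // -mdaz-ftz => true, -mno-daz-ftz => false
};

// Decide whether the startup file that sets FTZ/DAZ should be linked.
bool shouldLinkCrtFastMath(const LinkFlags &Flags) {
  if (Flags.DazFtz)            // an explicit flag overrides the default
    return *Flags.DazFtz;
  return Flags.FastMath && !Flags.Shared;
}
```

With this model, `-ffast-math -shared` yields `false`, while `-ffast-math -shared -mdaz-ftz` yields `true`.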

From 14e683e37854816b34105f8ff9d80dafb1ba4001 Mon Sep 17 00:00:00 2001
From: Joshua Cranmer 
Date: Fri, 2 Feb 2024 10:35:29 -0800
Subject: [PATCH] Disable FTZ/DAZ when compiling shared libraries by default.

This fixes https://github.com/llvm/llvm-project/issues/57589, and aligns
Clang with the behavior of current versions of gcc. There is a new option,
-mdaz-ftz, to control the linking of the file that sets FTZ/DAZ on startup, and
this flag is on by default if -ffast-math is present and -shared isn't.
---
 clang/docs/ReleaseNotes.rst   |  5 
 clang/docs/UsersManual.rst| 15 
 clang/include/clang/Driver/Options.td |  5 
 clang/lib/Driver/ToolChain.cpp| 15 ++--
 clang/test/Driver/linux-ld.c  | 34 +++
 5 files changed, 67 insertions(+), 7 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index cd8a82f281f52..904e292a2eb4b 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -117,6 +117,11 @@ C23 Feature Support
 Non-comprehensive list of changes in this release
 -
 
+* Code compiled with ``-shared`` and ``-ffast-math`` will no longer enable
+  flush-to-zero floating-point mode by default. This decision can be overridden
+  with use of ``-mdaz-ftz``. This behavior now matches GCC's behavior.
+  (`#57589 <https://github.com/llvm/llvm-project/issues/57589>`_)
+
 New Compiler Flags
 --
 
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index 7391e4cf3a9ae..257af3c8d4d39 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -1506,7 +1506,8 @@ floating point semantic models: precise (the default), strict, and fast.
 
* ``-ffp-contract=fast``
 
-   Note: ``-ffast-math`` causes ``crtfastmath.o`` to be linked with code. See
+   Note: ``-ffast-math`` causes ``crtfastmath.o`` to be linked with code unless
+   ``-shared`` or ``-mno-daz-ftz`` is present. See
:ref:`crtfastmath.o` for more details.
 
 .. option:: -fno-fast-math
@@ -1560,7 +1561,8 @@ floating point semantic models: precise (the default), strict, and fast.
  ``-ffp-contract``.
 
Note: ``-fno-fast-math`` implies ``-fdenormal-fp-math=ieee``.
-   ``-fno-fast-math`` causes ``crtfastmath.o`` to not be linked with code.
+   ``-fno-fast-math`` causes ``crtfastmath.o`` to not be linked with code
+   unless ``-mdaz-ftz`` is present.
 
 .. option:: -fdenormal-fp-math=
 
@@ -1907,10 +1909,13 @@ by using ``#pragma STDC FENV_ROUND`` with a value other than ``FE_DYNAMIC``.
 
 A note about ``crtfastmath.o``
 ^^
-``-ffast-math`` and ``-funsafe-math-optimizations`` cause ``crtfastmath.o`` to be
-automatically linked,  which adds a static constructor that sets the FTZ/DAZ
+``-ffast-math`` and ``-funsafe-math-optimizations`` without the ``-shared``
+option cause ``crtfastmath.o`` to be
+automatically linked, which adds a static constructor that sets the FTZ/DAZ
 bits in MXCSR, affecting not only the current compilation unit but all static
-and shared libraries included in the program.
+and shared libraries included in the program. This decision can be overridden
+by using either the flag ``-mdaz-ftz`` or ``-mno-daz-ftz`` to respectively
+link or not link ``crtfastmath.o``.
 
 .. _FLT_EVAL_METHOD:
 
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 73071a6648541..68f806d46c9e0 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -2554,6 +2554,11 @@ defm protect_parens : BoolFOption<"protect-parens",
   "floating-point expressions are evaluated">,
   NegFlag>;
 
+defm daz_ftz : SimpleMFlag<"daz-ftz",
+  "Globally set", "Do not globally set",
+  " the denormals-are-zero (DAZ) and flush-to-zero (FTZ) bits in the "
+  "floating-point control register on program startup.">;
+
 def ffor_scope : Flag<["-"], "ffor-scope">, Group;
 def fno_for_scope : Flag<["-"], "fno-for-scope">, Group;
 
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 388030592b483..5c003e153fa52 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -1271,9 +1271,14 @@ void ToolChain::AddCCKextLibArgs(const ArgList ,
 
 bool ToolChain::isFastMathRuntimeAvailable(const ArgList ,
std::string ) const {
+  // Don't implicitly link in mode-changing libraries in a shared library, since
+  // this can have very deleterious effects. 

[clang] [Clang][C++23] Core language changes from P1467R9 extended floating-point types and standard names. (PR #78503)

2024-01-30 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel commented:

I haven't attempted to make my way through the sema changes yet, but some 
comments already:

https://github.com/llvm/llvm-project/pull/78503


[clang] [Clang][C++23] Core language changes from P1467R9 extended floating-point types and standard names. (PR #78503)

2024-01-30 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel edited 
https://github.com/llvm/llvm-project/pull/78503


[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2023-12-20 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel commented:

There's probably some useful discussion to be had about how aggressive 
`-ffinite-math-only` at the clang level should be wrt lowering to `nnan`/`ninf` 
in the IR.

It may be worth deferring the ReturnInst changes (but landing everything else) 
until that discussion is had.

https://github.com/llvm/llvm-project/pull/74056


[clang] [Sema] Warning for _Float16 passed to format specifier '%f' (PR #74439)

2023-12-20 Thread Joshua Cranmer via cfe-commits

jcranmer-intel wrote:

> [N2844](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2844.pdf), as 
> linked by OP, references the removed language that you are looking for.

Doing some more spelunking, no released version of the TS ever had default 
argument promotion. The change to the TS was done shortly before integration in 
C, in response to the C++ proposal for extended floating-point types, but WG14 
objected to the change, so it was dropped in actual integration. See 
https://mailman.oakapple.net/pipermail/cfp-interest/2020-September/001782.html 
for the summary.

https://github.com/llvm/llvm-project/pull/74439


[clang] [Sema] Warning for _Float16 passed to format specifier '%f' (PR #74439)

2023-12-20 Thread Joshua Cranmer via cfe-commits

jcranmer-intel wrote:

> Should this only apply in C23 mode? Standard behavior until C23 is that 
> `_Float16` promotes to `double`. What about C++?

I can't find any reference in older versions of C or TS 18661-3 that suggests 
that `_Float16` is promoted to `double`. The wording of 6.5.2.2 used to say

> If the expression that denotes the called function has a type that does not 
> include a prototype, the integer promotions are performed on each argument, 
> and arguments that have type `float` are promoted to `double`. These are 
> called the default argument promotions.

Which suggests that *only* `float` is promoted and that other floating point 
types are not promoted, and nothing in TS 18661-3 can be construed to suggest 
promotion either (notably, `_Float32` *doesn't* promote to `double`, even 
though it's likely the same representation as `float`!).

https://github.com/llvm/llvm-project/pull/74439


[clang] [clang] Add support for -fcx-limited-range and #pragma CX_LIMITED_RANGE. (PR #68820)

2023-10-25 Thread Joshua Cranmer via cfe-commits


@@ -846,6 +865,105 @@ ComplexPairTy ComplexExprEmitter::EmitBinMul(const 
BinOpInfo ) {
   return ComplexPairTy(ResR, ResI);
 }
 
+ComplexPairTy ComplexExprEmitter::EmitAlgebraicDiv(llvm::Value *LHSr,
+   llvm::Value *LHSi,
+   llvm::Value *RHSr,
+   llvm::Value *RHSi) {
+  // (a+ib) / (c+id) = ((ac+bd)/(cc+dd)) + i((bc-ad)/(cc+dd))
+  llvm::Value *DSTr, *DSTi;
+
+  llvm::Value *AC = Builder.CreateFMul(LHSr, RHSr); // a*c
+  llvm::Value *BD = Builder.CreateFMul(LHSi, RHSi); // b*d
+  llvm::Value *ACpBD = Builder.CreateFAdd(AC, BD);  // ac+bd
+
+  llvm::Value *CC = Builder.CreateFMul(RHSr, RHSr); // c*c
+  llvm::Value *DD = Builder.CreateFMul(RHSi, RHSi); // d*d
+  llvm::Value *CCpDD = Builder.CreateFAdd(CC, DD);  // cc+dd
+
+  llvm::Value *BC = Builder.CreateFMul(LHSi, RHSr); // b*c
+  llvm::Value *AD = Builder.CreateFMul(LHSr, RHSi); // a*d
+  llvm::Value *BCmAD = Builder.CreateFSub(BC, AD);  // bc-ad
+
+  DSTr = Builder.CreateFDiv(ACpBD, CCpDD);
+  DSTi = Builder.CreateFDiv(BCmAD, CCpDD);
+  return ComplexPairTy(DSTr, DSTi);
+}
+
+/// EmitFAbs - Emit a call to @llvm.fabs.
+static llvm::Value *EmitllvmFAbs(CodeGenFunction &CGF, llvm::Value *Value) {
+  llvm::Function *Func =
+  CGF.CGM.getIntrinsic(llvm::Intrinsic::fabs, Value->getType());
+  llvm::Value *Call = CGF.Builder.CreateCall(Func, Value);
+  return Call;
+}
+
+ComplexPairTy ComplexExprEmitter::EmitRangeReductionDiv(llvm::Value *LHSr,
+llvm::Value *LHSi,
+llvm::Value *RHSr,
+llvm::Value *RHSi) {
+  // (a + ib) / (c + id) = (e + if)
+  llvm::Value *FAbsRHSr = EmitllvmFAbs(CGF, RHSr); // |c|
+  llvm::Value *FAbsRHSi = EmitllvmFAbs(CGF, RHSi); // |d|
+  // |c| >= |d|
+  llvm::Value *IsR = Builder.CreateFCmpUGT(FAbsRHSr, FAbsRHSi, "abs_cmp");
+
+  llvm::BasicBlock *TrueBB = CGF.createBasicBlock("true_bb_name");
+  llvm::BasicBlock *FalseBB = CGF.createBasicBlock("false_bb_name");
+  llvm::BasicBlock *ContBB = CGF.createBasicBlock("cont_bb");
+  Builder.CreateCondBr(IsR, TrueBB, FalseBB);
+
+  CGF.EmitBlock(TrueBB);
+  // abs(c) >= abs(d)
+  // r = d/c
+  // tmp = c + rd
+  // e = (a + br)/tmp
+  // f = (b - ar)/tmp
+  llvm::Value *DdC = Builder.CreateFDiv(RHSi, RHSr); // d/c
+
+  llvm::Value *RD = Builder.CreateFMul(DdC, RHSi);   // (d/c)d
+  llvm::Value *CpRD = Builder.CreateFAdd(RHSr, RD);  // c+((d/c)d)
+
+  llvm::Value *T3 = Builder.CreateFMul(LHSi, DdC);   // b(d/c)
+  llvm::Value *T4 = Builder.CreateFAdd(LHSr, T3);// a+b(d/c)
+  llvm::Value *DSTTr = Builder.CreateFDiv(T4, CpRD); // (a+b(d/c))/(c+(d/c)d)
+
+  llvm::Value *T5 = Builder.CreateFMul(LHSr, DdC);   // ar
+  llvm::Value *T6 = Builder.CreateFSub(LHSi, T5);// b-ar
+  llvm::Value *DSTTi = Builder.CreateFDiv(T6, CpRD); // (b-a(d/c))/(c+(d/c)d)
+  Builder.CreateBr(ContBB);
+
+  CGF.EmitBlock(FalseBB);
+  // abs(c) < abs(d)
+  // r = c/d
+  // tmp = d + rc
+  // e = (ar + b)/tmp
+  // f = (br - a)/tmp
+  llvm::Value *CdD = Builder.CreateFDiv(RHSr, RHSi);  // c/d
+
+  llvm::Value *RC = Builder.CreateFMul(CdD, RHSr);// (c/d)c
+  llvm::Value *DpRC = Builder.CreateFAdd(RHSi, RC);   // d+(c/d)c
+
+  llvm::Value *T7 = Builder.CreateFAdd(CdD, RHSi);// (c/d)+b
+  llvm::Value *T8 = Builder.CreateFMul(LHSr, T7); // a((c/d)+b)

jcranmer-intel wrote:

This code is wrong, I think; it should be mul then add.

https://github.com/llvm/llvm-project/pull/68820


[clang] [clang] Add support for -fcx-limited-range and #pragma CX_LIMITED_RANGE. (PR #68820)

2023-10-25 Thread Joshua Cranmer via cfe-commits


@@ -846,6 +865,105 @@ ComplexPairTy ComplexExprEmitter::EmitBinMul(const 
BinOpInfo ) {
   return ComplexPairTy(ResR, ResI);
 }
 
+ComplexPairTy ComplexExprEmitter::EmitAlgebraicDiv(llvm::Value *LHSr,
+   llvm::Value *LHSi,
+   llvm::Value *RHSr,
+   llvm::Value *RHSi) {
+  // (a+ib) / (c+id) = ((ac+bd)/(cc+dd)) + i((bc-ad)/(cc+dd))
+  llvm::Value *DSTr, *DSTi;
+
+  llvm::Value *AC = Builder.CreateFMul(LHSr, RHSr); // a*c
+  llvm::Value *BD = Builder.CreateFMul(LHSi, RHSi); // b*d
+  llvm::Value *ACpBD = Builder.CreateFAdd(AC, BD);  // ac+bd
+
+  llvm::Value *CC = Builder.CreateFMul(RHSr, RHSr); // c*c
+  llvm::Value *DD = Builder.CreateFMul(RHSi, RHSi); // d*d
+  llvm::Value *CCpDD = Builder.CreateFAdd(CC, DD);  // cc+dd
+
+  llvm::Value *BC = Builder.CreateFMul(LHSi, RHSr); // b*c
+  llvm::Value *AD = Builder.CreateFMul(LHSr, RHSi); // a*d
+  llvm::Value *BCmAD = Builder.CreateFSub(BC, AD);  // bc-ad
+
+  DSTr = Builder.CreateFDiv(ACpBD, CCpDD);
+  DSTi = Builder.CreateFDiv(BCmAD, CCpDD);
+  return ComplexPairTy(DSTr, DSTi);
+}
+
+/// EmitFAbs - Emit a call to @llvm.fabs.
+static llvm::Value *EmitllvmFAbs(CodeGenFunction &CGF, llvm::Value *Value) {
+  llvm::Function *Func =
+  CGF.CGM.getIntrinsic(llvm::Intrinsic::fabs, Value->getType());
+  llvm::Value *Call = CGF.Builder.CreateCall(Func, Value);
+  return Call;
+}
+
+ComplexPairTy ComplexExprEmitter::EmitRangeReductionDiv(llvm::Value *LHSr,

jcranmer-intel wrote:

Should probably identify this as Smith's algorithm for complex division.
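
For reference, a minimal sketch of Smith's algorithm on plain `double` values rather than IR values (the function name `smithDivide` is made up here). It scales by the ratio of the smaller to the larger denominator component, which avoids forming `c*c + d*d` directly and so reduces the risk of intermediate overflow or underflow.

```cpp
#include <cmath>
#include <complex>

// Computes (A + iB) / (C + iD) using Smith's algorithm.
std::complex<double> smithDivide(double A, double B, double C, double D) {
  if (std::fabs(C) >= std::fabs(D)) {
    // |c| >= |d|: r = d/c, tmp = c + r*d
    double R = D / C;
    double Tmp = C + R * D;
    return {(A + B * R) / Tmp, (B - A * R) / Tmp};
  }
  // |c| < |d|: r = c/d, tmp = d + r*c
  double R = C / D;
  double Tmp = D + R * C;
  return {(A * R + B) / Tmp, (B * R - A) / Tmp};
}
```

As a sanity check, `smithDivide(1, 2, 3, 4)` takes the second branch and should agree with the textbook result `(11 + 2i) / 25` up to rounding.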

https://github.com/llvm/llvm-project/pull/68820


[clang] [clang] Add support for -fcx-limited-range and #pragma CX_LIMITED_RANGE. (PR #68820)

2023-10-25 Thread Joshua Cranmer via cfe-commits


@@ -28,4 +28,6 @@ OPTION(FPEvalMethod, LangOptions::FPEvalMethodKind, 2, 
AllowApproxFunc)
 OPTION(Float16ExcessPrecision, LangOptions::ExcessPrecisionKind, 2, 
FPEvalMethod)
 OPTION(BFloat16ExcessPrecision, LangOptions::ExcessPrecisionKind, 2, 
Float16ExcessPrecision)
 OPTION(MathErrno, bool, 1, BFloat16ExcessPrecision)
+OPTION(CxLimitedRange, bool, 1, MathErrno)
+OPTION(CxFortranRules, bool, 1, CxLimitedRange)

jcranmer-intel wrote:

I'd recommend this be seen as a single, ternary enum (ComplexRange, which can 
be full, fortran, or limited) rather than a pair of boolean options.
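
Something along these lines, as a sketch only: the names `ComplexRangeKind` and `resolveComplexRange` are placeholders, and the "last flag on the command line wins" policy is an assumption rather than something the patch specifies. A single enum makes the "both booleans set" state unrepresentable.

```cpp
#include <string_view>
#include <vector>

// Placeholder name; one value instead of two independent booleans.
enum class ComplexRangeKind { Full, Fortran, Limited };

// Folds the command line into one range kind. Assumption: the last
// range-related flag wins, and Full is the default when none is given.
ComplexRangeKind resolveComplexRange(const std::vector<std::string_view> &Args) {
  ComplexRangeKind Range = ComplexRangeKind::Full;
  for (std::string_view Arg : Args) {
    if (Arg == "-fcx-limited-range")
      Range = ComplexRangeKind::Limited;
    else if (Arg == "-fcx-fortran-rules")
      Range = ComplexRangeKind::Fortran;
  }
  return Range;
}
```

Code generation can then `switch` over the three cases, and the compiler will warn if one of them is left unhandled.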

https://github.com/llvm/llvm-project/pull/68820


[clang] [clang] Add support for -fcx-limited-range and #pragma CX_LIMITED_RANGE. (PR #68820)

2023-10-25 Thread Joshua Cranmer via cfe-commits


@@ -846,6 +865,105 @@ ComplexPairTy ComplexExprEmitter::EmitBinMul(const 
BinOpInfo ) {
   return ComplexPairTy(ResR, ResI);
 }
 
+ComplexPairTy ComplexExprEmitter::EmitAlgebraicDiv(llvm::Value *LHSr,
+   llvm::Value *LHSi,
+   llvm::Value *RHSr,
+   llvm::Value *RHSi) {
+  // (a+ib) / (c+id) = ((ac+bd)/(cc+dd)) + i((bc-ad)/(cc+dd))
+  llvm::Value *DSTr, *DSTi;
+
+  llvm::Value *AC = Builder.CreateFMul(LHSr, RHSr); // a*c
+  llvm::Value *BD = Builder.CreateFMul(LHSi, RHSi); // b*d
+  llvm::Value *ACpBD = Builder.CreateFAdd(AC, BD);  // ac+bd
+
+  llvm::Value *CC = Builder.CreateFMul(RHSr, RHSr); // c*c
+  llvm::Value *DD = Builder.CreateFMul(RHSi, RHSi); // d*d
+  llvm::Value *CCpDD = Builder.CreateFAdd(CC, DD);  // cc+dd
+
+  llvm::Value *BC = Builder.CreateFMul(LHSi, RHSr); // b*c
+  llvm::Value *AD = Builder.CreateFMul(LHSr, RHSi); // a*d
+  llvm::Value *BCmAD = Builder.CreateFSub(BC, AD);  // bc-ad
+
+  DSTr = Builder.CreateFDiv(ACpBD, CCpDD);
+  DSTi = Builder.CreateFDiv(BCmAD, CCpDD);
+  return ComplexPairTy(DSTr, DSTi);
+}
+
+/// EmitFAbs - Emit a call to @llvm.fabs.
+static llvm::Value *EmitllvmFAbs(CodeGenFunction &CGF, llvm::Value *Value) {
+  llvm::Function *Func =
+  CGF.CGM.getIntrinsic(llvm::Intrinsic::fabs, Value->getType());
+  llvm::Value *Call = CGF.Builder.CreateCall(Func, Value);
+  return Call;
+}
+
+ComplexPairTy ComplexExprEmitter::EmitRangeReductionDiv(llvm::Value *LHSr,
+llvm::Value *LHSi,
+llvm::Value *RHSr,
+llvm::Value *RHSi) {
+  // (a + ib) / (c + id) = (e + if)
+  llvm::Value *FAbsRHSr = EmitllvmFAbs(CGF, RHSr); // |c|
+  llvm::Value *FAbsRHSi = EmitllvmFAbs(CGF, RHSi); // |d|
+  // |c| >= |d|
+  llvm::Value *IsR = Builder.CreateFCmpUGT(FAbsRHSr, FAbsRHSi, "abs_cmp");
+
+  llvm::BasicBlock *TrueBB = CGF.createBasicBlock("true_bb_name");
+  llvm::BasicBlock *FalseBB = CGF.createBasicBlock("false_bb_name");
+  llvm::BasicBlock *ContBB = CGF.createBasicBlock("cont_bb");

jcranmer-intel wrote:

These could use better basic block names.

https://github.com/llvm/llvm-project/pull/68820


[clang] [clang] Add support for -fcx-limited-range and #pragma CX_LIMITED_RANGE. (PR #68820)

2023-10-25 Thread Joshua Cranmer via cfe-commits


@@ -846,6 +859,31 @@ void Parser::HandlePragmaFEnvRound() {
   Actions.ActOnPragmaFEnvRound(PragmaLoc, RM);
 }
 
+void Parser::HandlePragmaCXLimitedRange() {
+  assert(Tok.is(tok::annot_pragma_cx_limited_range));
+  tok::OnOffSwitch OOS = static_cast<tok::OnOffSwitch>(
+  reinterpret_cast<uintptr_t>(Tok.getAnnotationValue()));
+
+  bool IsEnabled;
+  switch (OOS) {
+  case tok::OOS_ON:
+IsEnabled = true;
+break;
+  case tok::OOS_OFF:
+IsEnabled = false;
+break;
+  case tok::OOS_DEFAULT:
+// According to ISO C99 standard chapter 7.3.4, the default value
+// for the pragma is ``off'. In GCC, the option -fcx-limited-range
+// controls the default setting of the pragma.

jcranmer-intel wrote:

GCC doesn't support these pragmas at all, but I believe it's reasonable to say 
that `-fcx-limited-range` and `-fcx-fortran-rules` control the default value 
of these pragmas.

https://github.com/llvm/llvm-project/pull/68820


[clang] [clang] Add support for -fcx-limited-range and #pragma CX_LIMITED_RANGE. (PR #68820)

2023-10-25 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel commented:

I'm sure Aaron would appreciate me also saying that these changes would require 
an addition to the clang release notes.

https://github.com/llvm/llvm-project/pull/68820


[clang] [clang] Add support for -fcx-limited-range and #pragma CX_LIMITED_RANGE. (PR #68820)

2023-10-25 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel edited 
https://github.com/llvm/llvm-project/pull/68820


[clang] [clang] Add support for -fcx-limited-range and #pragma CX_LIMITED_RANGE. (PR #68820)

2023-10-25 Thread Joshua Cranmer via cfe-commits


@@ -2807,6 +2807,9 @@ static void RenderFloatingPointOptions(const ToolChain 
, const Driver ,
   bool StrictFPModel = false;
   StringRef Float16ExcessPrecision = "";
   StringRef BFloat16ExcessPrecision = "";
+  StringRef CxLimitedRange = "NoCxLimiteRange";

jcranmer-intel wrote:

This is spelled incorrectly, which is yet another reason you should define a 
three-valued enum to represent range.

https://github.com/llvm/llvm-project/pull/68820


[clang] Fix math-errno issue (PR #66381)

2023-09-18 Thread Joshua Cranmer via cfe-commits

https://github.com/jcranmer-intel approved this pull request.


https://github.com/llvm/llvm-project/pull/66381


[clang] Fix math-errno issue (PR #66381)

2023-09-15 Thread Joshua Cranmer via cfe-commits


@@ -2313,14 +2313,20 @@ RValue CodeGenFunction::EmitBuiltinExpr(const 
GlobalDecl GD, unsigned BuiltinID,
   FD->hasAttr() ? 0 : BuiltinID;
 
   bool ErrnoOverriden = false;
-  // True if math-errno is overriden via the
+  bool ErrnoOverrideValue = false;

jcranmer-intel wrote:

This might be easier to understand if you replaced these two variables with a 
single `std::optional<bool> ErrnoOverride`.
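
Roughly like this (a sketch; `mathSetsErrno` and its parameters are hypothetical and not taken from the patch): an empty optional means "no override, use the global setting", and an engaged one carries the overriding value, so there is no way for the flag and the value to get out of sync.

```cpp
#include <optional>

// Returns whether math functions should be treated as setting errno.
// ErrnoOverride is empty when nothing overrides the global setting.
bool mathSetsErrno(bool GlobalMathErrno, std::optional<bool> ErrnoOverride) {
  return ErrnoOverride.value_or(GlobalMathErrno);
}
```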

https://github.com/llvm/llvm-project/pull/66381


[clang] bf49237 - [Clang] Enable -print-pipeline-passes in clang.

2023-09-13 Thread Joshua Cranmer via cfe-commits

Author: Joshua Cranmer
Date: 2023-09-13T08:57:10-07:00
New Revision: bf492371033378cec4c0443a3f87db0f818bbade

URL: 
https://github.com/llvm/llvm-project/commit/bf492371033378cec4c0443a3f87db0f818bbade
DIFF: 
https://github.com/llvm/llvm-project/commit/bf492371033378cec4c0443a3f87db0f818bbade.diff

LOG: [Clang] Enable -print-pipeline-passes in clang.

Reviewed By: arsenm, aeubanks

Differential Revision: https://reviews.llvm.org/D127221

Added: 
clang/test/CodeGen/print-pipeline-passes.c

Modified: 
clang/lib/CodeGen/BackendUtil.cpp

Removed: 




diff  --git a/clang/lib/CodeGen/BackendUtil.cpp 
b/clang/lib/CodeGen/BackendUtil.cpp
index 2f649678b4f20df..04cb9064dc789a3 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -91,6 +91,7 @@ using namespace llvm;
 
 namespace llvm {
 extern cl::opt<bool> DebugInfoCorrelate;
+extern cl::opt<bool> PrintPipelinePasses;
 
 // Experiment to move sanitizers earlier.
 static cl::opt ClSanitizeOnOptimizerEarlyEP(
@@ -1090,6 +1091,17 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
   TheModule->addModuleFlag(Module::Error, "UnifiedLTO", uint32_t(1));
   }
 
+  // Print a textual, '-passes=' compatible, representation of pipeline if
+  // requested.
+  if (PrintPipelinePasses) {
+MPM.printPipeline(outs(), [](StringRef ClassName) {
+  auto PassName = PIC.getPassNameForClassName(ClassName);
+  return PassName.empty() ? ClassName : PassName;
+});
+outs() << "\n";
+return;
+  }
+
   // Now that we have all of the passes ready, run them.
   {
 PrettyStackTraceString CrashInfo("Optimizer");
@@ -1127,6 +1139,13 @@ void EmitAssemblyHelper::RunCodegenPipeline(
 return;
   }
 
+  // If -print-pipeline-passes is requested, don't run the legacy pass manager.
+  // FIXME: when codegen is switched to use the new pass manager, it should also
+  // emit pass names here.
+  if (PrintPipelinePasses) {
+return;
+  }
+
   {
 PrettyStackTraceString CrashInfo("Code generation");
 llvm::TimeTraceScope TimeScope("CodeGenPasses");

diff  --git a/clang/test/CodeGen/print-pipeline-passes.c 
b/clang/test/CodeGen/print-pipeline-passes.c
new file mode 100644
index 000..904d2416bc9e19f
--- /dev/null
+++ b/clang/test/CodeGen/print-pipeline-passes.c
@@ -0,0 +1,9 @@
+// Test that -print-pipeline-passes works in Clang
+
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm-bc -o /dev/null -mllvm -print-pipeline-passes -O0 %s 2>&1 | FileCheck %s
+
+// Don't try to check all passes, just a few to make sure that something is
+// actually printed.
+// CHECK: always-inline
+// CHECK-SAME: annotation-remarks
+void Foo(void) {}





[clang] bcad161 - [Clang][SPIR-V] Emit target extension types for OpenCL types on SPIR-V.

2023-03-13 Thread Joshua Cranmer via cfe-commits

Author: Joshua Cranmer
Date: 2023-03-13T14:20:24-04:00
New Revision: bcad161db3e69e27736c975ef5eeac60c96dcc97

URL: 
https://github.com/llvm/llvm-project/commit/bcad161db3e69e27736c975ef5eeac60c96dcc97
DIFF: 
https://github.com/llvm/llvm-project/commit/bcad161db3e69e27736c975ef5eeac60c96dcc97.diff

LOG: [Clang][SPIR-V] Emit target extension types for OpenCL types on SPIR-V.

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D141008

Added: 


Modified: 
clang/include/clang-c/Index.h
clang/include/clang/Basic/OpenCLExtensionTypes.def
clang/lib/CodeGen/CGOpenCLRuntime.cpp
clang/lib/CodeGen/CGOpenCLRuntime.h
clang/lib/CodeGen/TargetInfo.cpp
clang/lib/CodeGen/TargetInfo.h
clang/test/CodeGenOpenCL/cast_image.cl
clang/test/CodeGenOpenCL/cl20-device-side-enqueue-attributes.cl
clang/test/CodeGenOpenCL/cl20-device-side-enqueue.cl
clang/test/CodeGenOpenCL/intel-subgroups-avc-ext-types.cl
clang/test/CodeGenOpenCL/opencl_types.cl
clang/test/CodeGenOpenCL/sampler.cl
clang/test/Index/pipe-size.cl
llvm/docs/SPIRVUsage.rst

Removed: 




diff  --git a/clang/include/clang-c/Index.h b/clang/include/clang-c/Index.h
index 88d719d32c21f..5f71b86b20b48 100644
--- a/clang/include/clang-c/Index.h
+++ b/clang/include/clang-c/Index.h
@@ -2923,10 +2923,15 @@ enum CXTypeKind {
   CXType_OCLIntelSubgroupAVCImeResult = 169,
   CXType_OCLIntelSubgroupAVCRefResult = 170,
   CXType_OCLIntelSubgroupAVCSicResult = 171,
+  CXType_OCLIntelSubgroupAVCImeResultSingleReferenceStreamout = 172,
+  CXType_OCLIntelSubgroupAVCImeResultDualReferenceStreamout = 173,
+  CXType_OCLIntelSubgroupAVCImeSingleReferenceStreamin = 174,
+  CXType_OCLIntelSubgroupAVCImeDualReferenceStreamin = 175,
+
+  /* Old aliases for AVC OpenCL extension types. */
   CXType_OCLIntelSubgroupAVCImeResultSingleRefStreamout = 172,
   CXType_OCLIntelSubgroupAVCImeResultDualRefStreamout = 173,
   CXType_OCLIntelSubgroupAVCImeSingleRefStreamin = 174,
-
   CXType_OCLIntelSubgroupAVCImeDualRefStreamin = 175,
 
   CXType_ExtVector = 176,

diff  --git a/clang/include/clang/Basic/OpenCLExtensionTypes.def 
b/clang/include/clang/Basic/OpenCLExtensionTypes.def
index 84ffbe936b77d..17c72d69a0206 100644
--- a/clang/include/clang/Basic/OpenCLExtensionTypes.def
+++ b/clang/include/clang/Basic/OpenCLExtensionTypes.def
@@ -28,10 +28,10 @@ INTEL_SUBGROUP_AVC_TYPE(mce_result_t, MceResult)
 INTEL_SUBGROUP_AVC_TYPE(ime_result_t, ImeResult)
 INTEL_SUBGROUP_AVC_TYPE(ref_result_t, RefResult)
 INTEL_SUBGROUP_AVC_TYPE(sic_result_t, SicResult)
-INTEL_SUBGROUP_AVC_TYPE(ime_result_single_reference_streamout_t, 
ImeResultSingleRefStreamout)
-INTEL_SUBGROUP_AVC_TYPE(ime_result_dual_reference_streamout_t, 
ImeResultDualRefStreamout)
-INTEL_SUBGROUP_AVC_TYPE(ime_single_reference_streamin_t, ImeSingleRefStreamin)
-INTEL_SUBGROUP_AVC_TYPE(ime_dual_reference_streamin_t, ImeDualRefStreamin)
+INTEL_SUBGROUP_AVC_TYPE(ime_result_single_reference_streamout_t, 
ImeResultSingleReferenceStreamout)
+INTEL_SUBGROUP_AVC_TYPE(ime_result_dual_reference_streamout_t, 
ImeResultDualReferenceStreamout)
+INTEL_SUBGROUP_AVC_TYPE(ime_single_reference_streamin_t, 
ImeSingleReferenceStreamin)
+INTEL_SUBGROUP_AVC_TYPE(ime_dual_reference_streamin_t, 
ImeDualReferenceStreamin)
 
 #undef INTEL_SUBGROUP_AVC_TYPE
 #endif // INTEL_SUBGROUP_AVC_TYPE

diff  --git a/clang/lib/CodeGen/CGOpenCLRuntime.cpp 
b/clang/lib/CodeGen/CGOpenCLRuntime.cpp
index 5a5d7a4d18c57..395ba5e7dc144 100644
--- a/clang/lib/CodeGen/CGOpenCLRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenCLRuntime.cpp
@@ -31,8 +31,11 @@ void 
CGOpenCLRuntime::EmitWorkGroupLocalVarDecl(CodeGenFunction ,
 }
 
 llvm::Type *CGOpenCLRuntime::convertOpenCLSpecificType(const Type *T) {
-  assert(T->isOpenCLSpecificType() &&
- "Not an OpenCL specific type!");
+  assert(T->isOpenCLSpecificType() && "Not an OpenCL specific type!");
+
+  // Check if the target has a specific translation for this type first.
+  if (llvm::Type *TransTy = CGM.getTargetCodeGenInfo().getOpenCLType(CGM, T))
+return TransTy;
 
   switch (cast(T)->getKind()) {
   default:
@@ -75,6 +78,9 @@ llvm::PointerType *CGOpenCLRuntime::getPointerType(const Type 
*T,
 }
 
 llvm::Type *CGOpenCLRuntime::getPipeType(const PipeType *T) {
+  if (llvm::Type *PipeTy = CGM.getTargetCodeGenInfo().getOpenCLType(CGM, T))
+return PipeTy;
+
   if (T->isReadOnly())
 return getPipeType(T, "opencl.pipe_ro_t", PipeROTy);
   else
@@ -91,12 +97,18 @@ llvm::Type *CGOpenCLRuntime::getPipeType(const PipeType *T, 
StringRef Name,
   return PipeTy;
 }
 
-llvm::PointerType *CGOpenCLRuntime::getSamplerType(const Type *T) {
-  if (!SamplerTy)
-SamplerTy = llvm::PointerType::get(llvm::StructType::create(
-  CGM.getLLVMContext(), "opencl.sampler_t"),
-  CGM.getContext().getTargetAddressSpace(
-  CGM.getContext().getOpenCLTypeAddrSpace(T)));
+llvm::Type