[PATCH] D80416: [RFC][OpenCL] Set fp contract flag on -cl-mad-enable

2022-12-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm requested changes to this revision.
arsenm added a comment.
This revision now requires changes to proceed.

Please rebase if still relevant


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80416/new/

https://reviews.llvm.org/D80416

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80416: [RFC][OpenCL] Set fp contract flag on -cl-mad-enable

2022-11-16 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.
Herald added a subscriber: Naghasan.
Herald added a project: All.

Was this addressed already? Today it looks like we have:

  if (Opts.FastRelaxedMath || Opts.CLUnsafeMath)
Opts.setDefaultFPContractMode(LangOptions::FPM_Fast);


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80416/new/

https://reviews.llvm.org/D80416

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80416: [RFC][OpenCL] Set fp contract flag on -cl-mad-enable

2020-08-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

Is this still necessary?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80416/new/

https://reviews.llvm.org/D80416

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80416: [RFC][OpenCL] Set fp contract flag on -cl-mad-enable

2020-05-22 Thread Anastasia Stulova via Phabricator via cfe-commits
Anastasia marked an inline comment as done.
Anastasia added inline comments.



Comment at: clang/test/CodeGenOpenCL/relaxed-fpmath.cl:21
+float fused_mad(float a, float b, float c) {
+  // NORMAL: @llvm.fmuladd.f32
+  // FAST: fmul fast float

Anastasia wrote:
> Anastasia wrote:
> > Anastasia wrote:
> > > I don't find this behavior "NORMAL". I don't believe we should contract 
> > > expressions by default in OpenCL...
> > I just found in table 38 of OpenCL C spec, last entry says:
> > 
> > 
> > > x * y + z
> > > Implemented either as a correctly rounded fma or as a multiply and an add 
> > > both of which are correctly rounded.
> > 
> > In table 8 for `fma` it states:
> > 
> > > Returns the correctly rounded floating-point representation of the sum of 
> > > c with the infinitely precise product of a and b. Rounding of 
> > > intermediate products shall not occur. 
> > 
> > 
> > When I check LLVM doecumentation for fmuladd it says:
> > 
> > > is equivalent to the expression a * b + c, except that it is unspecified 
> > > whether rounding will be performed between the multiplication and 
> > > addition steps. Fusion is not guaranteed, even if the target platform 
> > > supports it. If a fused multiply-add is required, the corresponding 
> > > llvm.fma intrinsic function should be used instead. This never sets 
> > > errno, just as ‘llvm.fma.*’.
> > 
> > Does this mean that rounding of an intermediate product may occur and 
> > therefore it is not safe to use it for OpenCL mode by default?
> > 
> After more digging I conclude that the use of `fmuladd` for `x * y + z` 
> should be ok as per table 38 of OpenCL C because it allows either fused or 
> non-fused operation i.e. it is ok if either intermediate value is rounded or 
> not.
I changed my mind again. This is now covered by https://reviews.llvm.org/D80440.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80416/new/

https://reviews.llvm.org/D80416



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80416: [RFC][OpenCL] Set fp contract flag on -cl-mad-enable

2020-05-22 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

I think the contract flag needs clarification. I would interpret an instruction 
with only a contract flag as meaning allow precision increasing FMA formation, 
and contract+afn to mean combining while reducing precision


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80416/new/

https://reviews.llvm.org/D80416



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80416: [RFC][OpenCL] Set fp contract flag on -cl-mad-enable

2020-05-22 Thread Anastasia Stulova via Phabricator via cfe-commits
Anastasia marked an inline comment as done.
Anastasia added inline comments.



Comment at: clang/test/CodeGenOpenCL/relaxed-fpmath.cl:21
+float fused_mad(float a, float b, float c) {
+  // NORMAL: @llvm.fmuladd.f32
+  // FAST: fmul fast float

Anastasia wrote:
> Anastasia wrote:
> > I don't find this behavior "NORMAL". I don't believe we should contract 
> > expressions by default in OpenCL...
> I just found in table 38 of OpenCL C spec, last entry says:
> 
> 
> > x * y + z
> > Implemented either as a correctly rounded fma or as a multiply and an add 
> > both of which are correctly rounded.
> 
> In table 8 for `fma` it states:
> 
> > Returns the correctly rounded floating-point representation of the sum of c 
> > with the infinitely precise product of a and b. Rounding of intermediate 
> > products shall not occur. 
> 
> 
> When I check LLVM doecumentation for fmuladd it says:
> 
> > is equivalent to the expression a * b + c, except that it is unspecified 
> > whether rounding will be performed between the multiplication and addition 
> > steps. Fusion is not guaranteed, even if the target platform supports it. 
> > If a fused multiply-add is required, the corresponding llvm.fma intrinsic 
> > function should be used instead. This never sets errno, just as 
> > ‘llvm.fma.*’.
> 
> Does this mean that rounding of an intermediate product may occur and 
> therefore it is not safe to use it for OpenCL mode by default?
> 
After more digging I conclude that the use of `fmuladd` for `x * y + z` should 
be ok as per table 38 of OpenCL C because it allows either fused or non-fused 
operation i.e. it is ok if either intermediate value is rounded or not.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80416/new/

https://reviews.llvm.org/D80416



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80416: [RFC][OpenCL] Set fp contract flag on -cl-mad-enable

2020-05-22 Thread Anastasia Stulova via Phabricator via cfe-commits
Anastasia marked an inline comment as done.
Anastasia added inline comments.



Comment at: clang/test/CodeGenOpenCL/relaxed-fpmath.cl:21
+float fused_mad(float a, float b, float c) {
+  // NORMAL: @llvm.fmuladd.f32
+  // FAST: fmul fast float

Anastasia wrote:
> I don't find this behavior "NORMAL". I don't believe we should contract 
> expressions by default in OpenCL...
I just found in table 38 of OpenCL C spec, last entry says:


> x * y + z
> Implemented either as a correctly rounded fma or as a multiply and an add 
> both of which are correctly rounded.

In table 8 for `fma` it states:

> Returns the correctly rounded floating-point representation of the sum of c 
> with the infinitely precise product of a and b. Rounding of intermediate 
> products shall not occur. 


When I check LLVM doecumentation for fmuladd it says:

> is equivalent to the expression a * b + c, except that it is unspecified 
> whether rounding will be performed between the multiplication and addition 
> steps. Fusion is not guaranteed, even if the target platform supports it. If 
> a fused multiply-add is required, the corresponding llvm.fma intrinsic 
> function should be used instead. This never sets errno, just as ‘llvm.fma.*’.

Does this mean that rounding of an intermediate product may occur and therefore 
it is not safe to use it for OpenCL mode by default?



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80416/new/

https://reviews.llvm.org/D80416



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80416: [RFC][OpenCL] Set fp contract flag on -cl-mad-enable

2020-05-22 Thread Anastasia Stulova via Phabricator via cfe-commits
Anastasia added a comment.

In D80416#2050250 , @arsenm wrote:

> The langref wording makes me think this isn't quite right. This depends on 
> your definition of floating point contraction. I've always assumed it meant 
> allow FMA, potentially increasing precision. Is contracting into something 
> less precise allowed?


I don't see anywhere it says that contraction is for higher precision only. If 
I check the LLVM language manual `fast` flag implies `contract` which is what 
we are setting with `-cl-fast-relaxed-math` known to result in lower accuracy.

> If not, that's stricter / the opposite of what -cl-mad-enable implies. My 
> interpretation of the CL spec description would be to use fmuladd with an afn 
> flag (although that still can allow for increasing precision)

Currently `fmuladd` is produced with `LangOptions::FPM_On` that is used with 
FP_CONTRACT pragma. If I look at the documentation `LangOptions::FPM_Fast` is 
serving the same  purpose just allowing more cases to be contracted (i.e. 
across statements). Hence due to this it sets contract flag rather than 
emitting intrinsic directly providing more freedom to the backend to optimise 
and combine fused computations.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80416/new/

https://reviews.llvm.org/D80416



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80416: [RFC][OpenCL] Set fp contract flag on -cl-mad-enable

2020-05-21 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

The langref wording makes me think this isn't quite right. This depends on your 
definition of floating point contraction. I've always assumed it meant allow 
FMA, potentially increasing precision. Is contracting into something less 
precise allowed? If not, that's stricter / the opposite of what -cl-mad-enable 
implies. My interpretation of the CL spec description would be to use fmuladd 
with an afn flag (although that still can allow for increasing precision)

For AMDGPU I've thought about interpreting less-precise-fpmad as allowing 
denormal flushing that would otherwise be illegal. Currently it doesn't do 
anything, but somehow interpreting the flags for this would be better.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80416/new/

https://reviews.llvm.org/D80416



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80416: [RFC][OpenCL] Set fp contract flag on -cl-mad-enable

2020-05-21 Thread Anastasia Stulova via Phabricator via cfe-commits
Anastasia marked an inline comment as done.
Anastasia added inline comments.



Comment at: clang/test/CodeGenOpenCL/relaxed-fpmath.cl:21
+float fused_mad(float a, float b, float c) {
+  // NORMAL: @llvm.fmuladd.f32
+  // FAST: fmul fast float

I don't find this behavior "NORMAL". I don't believe we should contract 
expressions by default in OpenCL...


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80416/new/

https://reviews.llvm.org/D80416



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80416: [RFC][OpenCL] Set fp contract flag on -cl-mad-enable

2020-05-21 Thread Anastasia Stulova via Phabricator via cfe-commits
Anastasia created this revision.
Anastasia added reviewers: rjmccall, arsenm.
Herald added subscribers: ebevhan, yaxunl, wdng.
Anastasia marked an inline comment as done.
Anastasia added inline comments.
Anastasia retitled this revision from "[RCF][OpenCL] Set fp contract flag on 
-cl-mad-enable" to "[RFC][OpenCL] Set fp contract flag on -cl-mad-enable".



Comment at: clang/test/CodeGenOpenCL/relaxed-fpmath.cl:21
+float fused_mad(float a, float b, float c) {
+  // NORMAL: @llvm.fmuladd.f32
+  // FAST: fmul fast float

I don't find this behavior "NORMAL". I don't believe we should contract 
expressions by default in OpenCL...


I think setting `contract` flag on fp instructions with `-cl-mad-enable` should 
be sensible considering spec wording:

> -cl-mad-enable Allow a * b + c to be replaced by a mad. The mad computes a * 
> b + c with reduced accuracy. For example, some OpenCL devices implement mad 
> as truncate the result of a * b before adding it to c

However, I am unclear how it impacts `fdiv` instructions and etc, as I am not 
sure how exactly it is optimized. LLVM reference manual says:

> contract
> 
>   Allow floating-point contraction (e.g. fusing a multiply followed by an 
> addition into a fused multiply-and-add).

Presumably `contract` makes no effect in `fdiv`?

Now another question is whether we could remove `LessPreciseFPMAD` from CodeGen 
options as I don't feel it has actual uses. Although I might be 
misunderstanding this.

TODO: the same applies to `-cl-unsafe-math-optimizations`


https://reviews.llvm.org/D80416

Files:
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CodeGenOpenCL/relaxed-fpmath.cl


Index: clang/test/CodeGenOpenCL/relaxed-fpmath.cl
===
--- clang/test/CodeGenOpenCL/relaxed-fpmath.cl
+++ clang/test/CodeGenOpenCL/relaxed-fpmath.cl
@@ -12,10 +12,32 @@
   // FAST: fdiv fast float
   // FINITE: fdiv nnan ninf float
   // UNSAFE: fdiv nnan nsz float
-  // MAD: fdiv float
+  // MAD: fdiv contract float
   // NOSIGNED: fdiv nsz float
   return a / b;
 }
+
+float fused_mad(float a, float b, float c) {
+  // NORMAL: @llvm.fmuladd.f32
+  // FAST: fmul fast float
+  // FAST: fadd fast float
+  // MAD: fmul contract float
+  // MAD: fadd contract float
+  return a*b+c;
+}
+
+float fused_mad_accross_stmts(float a, float b, float c) {
+  // NORMAL: fmul float
+  // NORMAL: fadd float
+  // FAST: fmul fast float
+  // FAST: fadd fast float
+  // MAD: fmul contract float
+  // MAD: fadd contract float
+  float t = a*b;
+  return t+c;
+}
+
+
 // CHECK: attributes
 
 // NORMAL: "less-precise-fpmad"="false"
Index: clang/lib/Frontend/CompilerInvocation.cpp
===
--- clang/lib/Frontend/CompilerInvocation.cpp
+++ clang/lib/Frontend/CompilerInvocation.cpp
@@ -2944,7 +2944,7 @@
   Opts.NoBitFieldTypeAlign = Args.hasArg(OPT_fno_bitfield_type_align);
   Opts.SinglePrecisionConstants = 
Args.hasArg(OPT_cl_single_precision_constant);
   Opts.FastRelaxedMath = Args.hasArg(OPT_cl_fast_relaxed_math);
-  if (Opts.FastRelaxedMath)
+  if (Opts.FastRelaxedMath || Args.hasArg(OPT_cl_mad_enable))
 Opts.setDefaultFPContractMode(LangOptions::FPM_Fast);
   Opts.HexagonQdsp6Compat = Args.hasArg(OPT_mqdsp6_compat);
   Opts.FakeAddressSpaceMap = Args.hasArg(OPT_ffake_address_space_map);


Index: clang/test/CodeGenOpenCL/relaxed-fpmath.cl
===
--- clang/test/CodeGenOpenCL/relaxed-fpmath.cl
+++ clang/test/CodeGenOpenCL/relaxed-fpmath.cl
@@ -12,10 +12,32 @@
   // FAST: fdiv fast float
   // FINITE: fdiv nnan ninf float
   // UNSAFE: fdiv nnan nsz float
-  // MAD: fdiv float
+  // MAD: fdiv contract float
   // NOSIGNED: fdiv nsz float
   return a / b;
 }
+
+float fused_mad(float a, float b, float c) {
+  // NORMAL: @llvm.fmuladd.f32
+  // FAST: fmul fast float
+  // FAST: fadd fast float
+  // MAD: fmul contract float
+  // MAD: fadd contract float
+  return a*b+c;
+}
+
+float fused_mad_accross_stmts(float a, float b, float c) {
+  // NORMAL: fmul float
+  // NORMAL: fadd float
+  // FAST: fmul fast float
+  // FAST: fadd fast float
+  // MAD: fmul contract float
+  // MAD: fadd contract float
+  float t = a*b;
+  return t+c;
+}
+
+
 // CHECK: attributes
 
 // NORMAL: "less-precise-fpmad"="false"
Index: clang/lib/Frontend/CompilerInvocation.cpp
===
--- clang/lib/Frontend/CompilerInvocation.cpp
+++ clang/lib/Frontend/CompilerInvocation.cpp
@@ -2944,7 +2944,7 @@
   Opts.NoBitFieldTypeAlign = Args.hasArg(OPT_fno_bitfield_type_align);
   Opts.SinglePrecisionConstants = Args.hasArg(OPT_cl_single_precision_constant);
   Opts.FastRelaxedMath = Args.hasArg(OPT_cl_fast_relaxed_math);
-  if (Opts.FastRelaxedMath)
+  if (Opts.FastRelaxedMath || Args.hasArg(OPT_cl_mad_enable))