[llvm-branch-commits] [libclc] libclc: Force assuming fast float fma for AMDGPU (PR #188245)

2026-03-24 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/188245

>From b201ddf5efd33e8ef6712718c2d170ae94fe Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Tue, 24 Mar 2026 14:25:55 +0100
Subject: [PATCH] libclc: Force assuming fast float fma for AMDGPU

Currently the build uses the default dummy target, which assumes
FMA is slow. Force this to assume fast fma, which is the case on
any remotely new hardware. In the future if we want better support
for older targets, there should be a separate build of the math
functions for the slow fma case.
---
 libclc/clc/include/clc/math/math.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/libclc/clc/include/clc/math/math.h b/libclc/clc/include/clc/math/math.h
index 22ed3f9defcbe..950e8055c98c9 100644
--- a/libclc/clc/include/clc/math/math.h
+++ b/libclc/clc/include/clc/math/math.h
@@ -30,7 +30,9 @@
 #define __CLC_FAST_FMA_F16 0
 #endif
 
-#ifdef FP_FAST_FMAF
+// TODO: Stop forcing this for AMDGPU, and use a separate build for slow-fma
+// case.
+#if defined(FP_FAST_FMAF) || defined(__AMDGPU__)
 #define __CLC_FAST_FMA_F32 1
 #else
 #define __CLC_FAST_FMA_F32 0
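
To illustrate what a flag like `__CLC_FAST_FMA_F32` is for, here is a minimal sketch of a consumer that picks a fused or unfused multiply-add at compile time. It is not taken from libclc: `DEMO_FAST_FMA_F32`, `demo_mad`, and `demo_selftest` are hypothetical names, and the real math functions may use the flag quite differently.

```c
#include <assert.h>
#include <math.h>

/* Same condition as the patched header, under a hypothetical name. */
#if defined(FP_FAST_FMAF) || defined(__AMDGPU__)
#define DEMO_FAST_FMA_F32 1
#else
#define DEMO_FAST_FMA_F32 0
#endif

/* Computes a * b + c. The fused form is used only when the target
   reports (or is forced to assume) that fmaf is fast. */
static float demo_mad(float a, float b, float c) {
#if DEMO_FAST_FMA_F32
  return fmaf(a, b, c);
#else
  return a * b + c;
#endif
}

/* Small-integer inputs are exactly representable, so both paths agree. */
static void demo_selftest(void) {
  assert(demo_mad(2.0f, 3.0f, 4.0f) == 10.0f);
}
```

On a host build without `FP_FAST_FMAF` the unfused branch is compiled; when compiling for AMDGPU, the `defined(__AMDGPU__)` clause selects the `fmaf` branch regardless.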

___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libclc] libclc: Force assuming fast float fma for AMDGPU (PR #188245)

2026-03-24 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/188245

>From 404e1bf6aae5df6a0796e057c2867e5cf165233d Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Tue, 24 Mar 2026 14:25:55 +0100
Subject: [PATCH] libclc: Force assuming fast float fma for AMDGPU

Currently the build uses the default dummy target, which assumes
FMA is slow. Force this to assume fast fma, which is the case on
any remotely new hardware. In the future if we want better support
for older targets, there should be a separate build of the math
functions for the slow fma case.
---
 libclc/clc/include/clc/math/math.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/libclc/clc/include/clc/math/math.h b/libclc/clc/include/clc/math/math.h
index 22ed3f9defcbe..950e8055c98c9 100644
--- a/libclc/clc/include/clc/math/math.h
+++ b/libclc/clc/include/clc/math/math.h
@@ -30,7 +30,9 @@
 #define __CLC_FAST_FMA_F16 0
 #endif
 
-#ifdef FP_FAST_FMAF
+// TODO: Stop forcing this for AMDGPU, and use a separate build for slow-fma
+// case.
+#if defined(FP_FAST_FMAF) || defined(__AMDGPU__)
 #define __CLC_FAST_FMA_F32 1
 #else
 #define __CLC_FAST_FMA_F32 0



[llvm-branch-commits] [libclc] libclc: Force assuming fast float fma for AMDGPU (PR #188245)

2026-03-24 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm ready_for_review 
https://github.com/llvm/llvm-project/pull/188245


[llvm-branch-commits] [libclc] libclc: Force assuming fast float fma for AMDGPU (PR #188245)

2026-03-24 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on
> Graphite
> (https://app.graphite.com/github/pr/llvm/llvm-project/188245?utm_source=stack-comment-downstack-mergeability-warning).
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#188245** 👈 (View in Graphite: https://app.graphite.com/github/pr/llvm/llvm-project/188245?utm_source=stack-comment-view-in-graphite)
* **#188244** (https://app.graphite.com/github/pr/llvm/llvm-project/188244?utm_source=stack-comment-icon)
* `main`

This stack of pull requests is managed by Graphite
(https://graphite.dev?utm-source=stack-comment). Learn more about stacking:
https://stacking.dev/?utm_source=stack-comment


https://github.com/llvm/llvm-project/pull/188245


[llvm-branch-commits] [libclc] libclc: Force assuming fast float fma for AMDGPU (PR #188245)

2026-03-24 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm created 
https://github.com/llvm/llvm-project/pull/188245

Currently the build uses the default dummy target, which assumes
FMA is slow. Force this to assume fast fma, which is the case on
any remotely new hardware. In the future if we want better support
for older targets, there should be a separate build of the math
functions for the slow fma case.

>From f31527f816ad796bff2f3a0b57fc5f659946d9ab Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Tue, 24 Mar 2026 14:25:55 +0100
Subject: [PATCH] libclc: Force assuming fast float fma for AMDGPU

Currently the build uses the default dummy target, which assumes
FMA is slow. Force this to assume fast fma, which is the case on
any remotely new hardware. In the future if we want better support
for older targets, there should be a separate build of the math
functions for the slow fma case.
---
 libclc/clc/include/clc/math/math.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/libclc/clc/include/clc/math/math.h b/libclc/clc/include/clc/math/math.h
index 22ed3f9defcbe..284097c1883dc 100644
--- a/libclc/clc/include/clc/math/math.h
+++ b/libclc/clc/include/clc/math/math.h
@@ -30,7 +30,9 @@
 #define __CLC_FAST_FMA_F16 0
 #endif
 
-#ifdef FP_FAST_FMAF
+// TODO: Stop forcing this for AMDGPU, and use a separate build for slow-fma
+// case.
+#ifdef FP_FAST_FMAF || defined(__AMDGPU__)
 #define __CLC_FAST_FMA_F32 1
 #else
 #define __CLC_FAST_FMA_F32 0
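
Note that this initial revision writes `#ifdef FP_FAST_FMAF || defined(__AMDGPU__)`, while the updated revisions of the same patch use `#if defined(FP_FAST_FMAF) || defined(__AMDGPU__)`. `#ifdef` accepts a single identifier. As far as I know, GCC and Clang diagnose the trailing tokens and otherwise ignore them, so the `__AMDGPU__` clause would have no effect. The sketch below shows the difference using hypothetical stand-in macros (`DEMO_FEATURE`, `DEMO_TARGET`) rather than the real ones; the `#ifdef` branch is written without the extra tokens to model that assumed behavior.

```c
#include <assert.h>

/* Hypothetical stand-ins: the "target" macro is defined,
   the "feature" macro is intentionally left undefined. */
#define DEMO_TARGET 1

/* Models the first revision, assuming the compiler drops the tokens
   after the identifier: only DEMO_FEATURE is tested. */
#ifdef DEMO_FEATURE
#define DEMO_FROM_IFDEF 1
#else
#define DEMO_FROM_IFDEF 0
#endif

/* The form used by the updated revisions: either macro enables it. */
#if defined(DEMO_FEATURE) || defined(DEMO_TARGET)
#define DEMO_FROM_IF_DEFINED 1
#else
#define DEMO_FROM_IF_DEFINED 0
#endif

static int demo_from_ifdef(void) { return DEMO_FROM_IFDEF; }
static int demo_from_if_defined(void) { return DEMO_FROM_IF_DEFINED; }

/* With only DEMO_TARGET defined, the two forms disagree. */
static void demo_selftest(void) {
  assert(demo_from_ifdef() == 0);
  assert(demo_from_if_defined() == 1);
}
```

With only the target macro defined, the `#ifdef` form leaves the flag at 0 and the `#if defined(...) || defined(...)` form sets it to 1, which is the behavior the commit message asks for.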
