[llvm-branch-commits] [libclc] libclc: Force assuming fast float fma for AMDGPU (PR #188245)
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/188245

>From b201ddf5efd33e8ef6712718c2d170ae94fe Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date: Tue, 24 Mar 2026 14:25:55 +0100
Subject: [PATCH] libclc: Force assuming fast float fma for AMDGPU

Currently the build uses the default dummy target, which assumes FMA is
slow. Force this to assume fast fma, which is the case on any remotely
new hardware. In the future if we want better support for older targets,
there should be a separate build of the math functions for the slow fma
case.
---
 libclc/clc/include/clc/math/math.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/libclc/clc/include/clc/math/math.h b/libclc/clc/include/clc/math/math.h
index 22ed3f9defcbe..950e8055c98c9 100644
--- a/libclc/clc/include/clc/math/math.h
+++ b/libclc/clc/include/clc/math/math.h
@@ -30,7 +30,9 @@
 #define __CLC_FAST_FMA_F16 0
 #endif
 
-#ifdef FP_FAST_FMAF
+// TODO: Stop forcing this for AMDGPU, and use a separate build for slow-fma
+// case.
+#if defined(FP_FAST_FMAF) || defined(__AMDGPU__)
 #define __CLC_FAST_FMA_F32 1
 #else
 #define __CLC_FAST_FMA_F32 0

___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
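As a rough illustration of what the macro controls, the sketch below mirrors the patched condition and uses the result to pick between a fused and an unfused multiply-add. `example_mul_add` is a hypothetical helper invented for this note, not a libclc function; how the real math routines consume `__CLC_FAST_FMA_F32` is not shown in this thread.

```c
#include <math.h>

/* Mirrors the condition from the patch: treat float FMA as fast when the
 * compiler advertises FP_FAST_FMAF or when building for AMDGPU. */
#if defined(FP_FAST_FMAF) || defined(__AMDGPU__)
#define __CLC_FAST_FMA_F32 1
#else
#define __CLC_FAST_FMA_F32 0
#endif

/* Hypothetical helper (an assumption for illustration only): computes
 * a * b + c, using the fused operation only when FMA is assumed fast. */
float example_mul_add(float a, float b, float c) {
#if __CLC_FAST_FMA_F32
  return fmaf(a, b, c); /* single rounding step */
#else
  return a * b + c;     /* separate multiply and add */
#endif
}
```

On a host build where neither macro is defined the fallback branch is compiled; for small exactly representable inputs both branches return the same value.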
[llvm-branch-commits] [libclc] libclc: Force assuming fast float fma for AMDGPU (PR #188245)
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/188245

>From 404e1bf6aae5df6a0796e057c2867e5cf165233d Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date: Tue, 24 Mar 2026 14:25:55 +0100
Subject: [PATCH] libclc: Force assuming fast float fma for AMDGPU

Currently the build uses the default dummy target, which assumes FMA is
slow. Force this to assume fast fma, which is the case on any remotely
new hardware. In the future if we want better support for older targets,
there should be a separate build of the math functions for the slow fma
case.
---
 libclc/clc/include/clc/math/math.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/libclc/clc/include/clc/math/math.h b/libclc/clc/include/clc/math/math.h
index 22ed3f9defcbe..950e8055c98c9 100644
--- a/libclc/clc/include/clc/math/math.h
+++ b/libclc/clc/include/clc/math/math.h
@@ -30,7 +30,9 @@
 #define __CLC_FAST_FMA_F16 0
 #endif
 
-#ifdef FP_FAST_FMAF
+// TODO: Stop forcing this for AMDGPU, and use a separate build for slow-fma
+// case.
+#if defined(FP_FAST_FMAF) || defined(__AMDGPU__)
 #define __CLC_FAST_FMA_F32 1
 #else
 #define __CLC_FAST_FMA_F32 0
[llvm-branch-commits] [libclc] libclc: Force assuming fast float fma for AMDGPU (PR #188245)
https://github.com/arsenm ready_for_review https://github.com/llvm/llvm-project/pull/188245
[llvm-branch-commits] [libclc] libclc: Force assuming fast float fma for AMDGPU (PR #188245)
arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on
> Graphite.

* **#188245** 👈 (View in Graphite)
* **#188244**
* `main`

This stack of pull requests is managed by Graphite.

https://github.com/llvm/llvm-project/pull/188245
[llvm-branch-commits] [libclc] libclc: Force assuming fast float fma for AMDGPU (PR #188245)
https://github.com/arsenm created https://github.com/llvm/llvm-project/pull/188245

Currently the build uses the default dummy target, which assumes FMA is
slow. Force this to assume fast fma, which is the case on any remotely
new hardware. In the future if we want better support for older targets,
there should be a separate build of the math functions for the slow fma
case.

>From f31527f816ad796bff2f3a0b57fc5f659946d9ab Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date: Tue, 24 Mar 2026 14:25:55 +0100
Subject: [PATCH] libclc: Force assuming fast float fma for AMDGPU

Currently the build uses the default dummy target, which assumes FMA is
slow. Force this to assume fast fma, which is the case on any remotely
new hardware. In the future if we want better support for older targets,
there should be a separate build of the math functions for the slow fma
case.
---
 libclc/clc/include/clc/math/math.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/libclc/clc/include/clc/math/math.h b/libclc/clc/include/clc/math/math.h
index 22ed3f9defcbe..284097c1883dc 100644
--- a/libclc/clc/include/clc/math/math.h
+++ b/libclc/clc/include/clc/math/math.h
@@ -30,7 +30,9 @@
 #define __CLC_FAST_FMA_F16 0
 #endif
 
-#ifdef FP_FAST_FMAF
+// TODO: Stop forcing this for AMDGPU, and use a separate build for slow-fma
+// case.
+#ifdef FP_FAST_FMAF || defined(__AMDGPU__)
 #define __CLC_FAST_FMA_F32 1
 #else
 #define __CLC_FAST_FMA_F32 0
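The initial revision of this patch wrote the condition as `#ifdef FP_FAST_FMAF || defined(__AMDGPU__)`, while the updated revisions use `#if defined(FP_FAST_FMAF) || defined(__AMDGPU__)`. The difference matters because `#ifdef` accepts exactly one identifier; GCC and Clang diagnose the trailing tokens as extra and test only the first name. The sketch below shows the two outcomes side by side. `EXAMPLE_FAST` and `EXAMPLE_TARGET` are hypothetical stand-ins for `FP_FAST_FMAF` and `__AMDGPU__`, and the `#ifdef` form is written without its trailing tokens so the file compiles without diagnostics.

```c
/* Stand-in for __AMDGPU__ (assumption for illustration). EXAMPLE_FAST,
 * the stand-in for FP_FAST_FMAF, is deliberately left undefined. */
#define EXAMPLE_TARGET 1

/* What `#ifdef EXAMPLE_FAST || defined(EXAMPLE_TARGET)` effectively tests:
 * only the first identifier is considered. */
#ifdef EXAMPLE_FAST
#define FIRST_REVISION_RESULT 1
#else
#define FIRST_REVISION_RESULT 0
#endif

/* The form used by the updated revisions: a full #if expression. */
#if defined(EXAMPLE_FAST) || defined(EXAMPLE_TARGET)
#define LATER_REVISION_RESULT 1
#else
#define LATER_REVISION_RESULT 0
#endif

int first_revision_result(void) { return FIRST_REVISION_RESULT; }
int later_revision_result(void) { return LATER_REVISION_RESULT; }
```

With only the target macro defined, the `#ifdef` form selects the slow-FMA branch and the `#if defined(...)` form selects the fast-FMA branch, which is the behavior the commit message describes.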
