[libclc] [libclc] Move several integer functions to CLC library (PR #116786)
frasercrmck wrote: ping https://github.com/llvm/llvm-project/pull/116786 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libclc] [libclc] Move several integer functions to CLC library (PR #116786)
frasercrmck wrote: @arsenm are you happy for this to be merged? https://github.com/llvm/llvm-project/pull/116786 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libclc] [libclc] Move several integer functions to CLC library (PR #116786)
https://github.com/frasercrmck updated https://github.com/llvm/llvm-project/pull/116786 >From 65713ca581a83261e888feb7a96c76d1525d223b Mon Sep 17 00:00:00 2001 From: Fraser Cormack Date: Tue, 17 Dec 2024 16:52:51 + Subject: [PATCH] [libclc] Move several integer functions to CLC library This commit moves over the OpenCL clz, hadd, mad24, mad_hi, mul24, mul_hi, popcount, rhadd, and upsample builtins to the CLC library. There are no changes to any target's CLC libraries. The OpenCL mad_hi builtin wasn't previously publicly available from the CLC libraries, as it was hash-defined to mul_hi in the header files. That issue has been fixed, and mad_hi is now exposed. The custom AMD implementation/workaround for popcount has been removed as it was only valid for clang < 7. There are still two integer functions which haven't been moved over. The OpenCL add_sat, sub_sat, and mad_sat builtins require saturating conversion builtins which haven't yet been ported. --- libclc/amdgcn/lib/SOURCES | 1 - libclc/amdgcn/lib/integer/popcount.cl | 6 - libclc/amdgcn/lib/integer/popcount.inc| 17 --- libclc/clc/include/clc/integer/binary_decl.h | 2 + libclc/clc/include/clc/integer/clc_clz.h | 21 +++ libclc/clc/include/clc/integer/clc_hadd.h | 21 +++ libclc/clc/include/clc/integer/clc_mad24.h| 21 +++ libclc/clc/include/clc/integer/clc_mad_hi.h | 8 ++ libclc/clc/include/clc/integer/clc_mul24.h| 21 +++ libclc/clc/include/clc/integer/clc_mul_hi.h | 21 +++ libclc/clc/include/clc/integer/clc_popcount.h | 19 +++ libclc/clc/include/clc/integer/clc_rhadd.h| 21 +++ libclc/clc/include/clc/integer/clc_upsample.h | 38 + .../include/clc/integer/definitions.h | 7 +- libclc/clc/include/clc/integer/gentype24.inc | 134 ++ libclc/clc/include/clc/integer/ternary_decl.h | 2 + libclc/clc/include/clc/integer/unary_decl.h | 1 + .../clc/include/clc/integer/unary_intrin.inc | 26 libclc/clc/lib/generic/SOURCES| 7 + libclc/clc/lib/generic/integer/clc_clz.cl | 44 ++ libclc/clc/lib/generic/integer/clc_hadd.cl| 4 + libclc/clc/lib/generic/integer/clc_hadd.inc | 8 ++ libclc/clc/lib/generic/integer/clc_mad24.cl | 5 + libclc/clc/lib/generic/integer/clc_mad24.inc | 5 + libclc/clc/lib/generic/integer/clc_mul24.cl | 4 + .../lib/generic/integer/clc_mul24.inc}| 4 +- libclc/clc/lib/generic/integer/clc_mul_hi.cl | 113 +++ libclc/clc/lib/generic/integer/clc_rhadd.cl | 4 + libclc/clc/lib/generic/integer/clc_rhadd.inc | 8 ++ .../clc/lib/generic/integer/clc_upsample.cl | 45 ++ libclc/generic/include/clc/integer/clz.h | 6 +- libclc/generic/include/clc/integer/clz.inc| 1 - libclc/generic/include/clc/integer/hadd.h | 6 +- libclc/generic/include/clc/integer/hadd.inc | 1 - libclc/generic/include/clc/integer/mad24.h| 9 +- libclc/generic/include/clc/integer/mad24.inc | 1 - libclc/generic/include/clc/integer/mad_hi.h | 7 +- libclc/generic/include/clc/integer/mul24.h| 9 +- libclc/generic/include/clc/integer/mul24.inc | 1 - libclc/generic/include/clc/integer/mul_hi.h | 6 +- libclc/generic/include/clc/integer/mul_hi.inc | 1 - libclc/generic/include/clc/integer/popcount.h | 9 +- libclc/generic/include/clc/integer/rhadd.h| 6 +- libclc/generic/include/clc/integer/rhadd.inc | 1 - libclc/generic/include/clc/integer/upsample.h | 33 +++-- libclc/generic/include/integer/popcount.h | 3 - .../generic/include/integer/unary_intrin.inc | 20 --- libclc/generic/lib/SOURCES| 1 + libclc/generic/lib/integer/binary_def.inc | 8 ++ libclc/generic/lib/integer/clz.cl | 44 +- libclc/generic/lib/integer/hadd.cl| 5 +- libclc/generic/lib/integer/hadd.inc | 6 - libclc/generic/lib/integer/mad24.cl | 7 +- libclc/generic/lib/integer/mad24.inc | 3 - libclc/generic/lib/integer/mad_hi.cl | 7 + libclc/generic/lib/integer/mul24.cl | 7 +- libclc/generic/lib/integer/mul_hi.cl | 110 +- libclc/generic/lib/integer/popcount.cl| 7 +- libclc/generic/lib/integer/rhadd.cl | 5 +- libclc/generic/lib/integer/rhadd.inc | 6 - libclc/generic/lib/integer/ternary_def.inc| 8 ++ libclc/generic/lib/integer/unary_def.inc | 7 + libclc/generic/lib/integer/upsample.cl| 54 +++ libclc/generic/lib/math/clc_fma.cl| 3 +- libclc/generic/lib/math/clc_fmod.cl | 5 +- libclc/generic/lib/math/clc_remainder.cl | 5 +- libclc/generic/lib/math/clc_remquo.cl | 5 +- libclc/generic/lib/math/sincos_helpers.cl | 20 +-- 68 files changed, 780 insertions(+), 301 deletions(-) delete mode 100644 libclc/amdgcn/lib/integer/popcount.cl delete mode 100644 libclc/amdgcn/lib/integer/
[libclc] [libclc] Move several integer functions to CLC library (PR #116786)
@@ -0,0 +1,21 @@ +#ifndef __CLC_INTEGER_CLC_CLZ_H__ +#define __CLC_INTEGER_CLC_CLZ_H__ + +#if defined(CLC_CLSPV) || defined(CLC_SPIRV) +// clspv and spir-v targets provide their own OpenCL-compatible clz +#define __clc_clz clz frasercrmck wrote: > > In this context of this PR, what are you proposing? That no such > > redirections should take place, or just that ones that map to intrinsics > > specifically should not be redirected? > > I mean __clc_ functions should not be implemented by directly calling the > opencl public name, that's backwards. Targets should be implementing __clc > override implementations, not directly providing their own opencl public > entrypoints. It sounds like that's the end goal, not sure how this step gets > you there though. I won't hold this up if it's in service of reaching that end Yep, I think we agree on the direction of this aspect of libclc. Regardless, this is how it's currently done for the SPIR-V targets. I would prefer if SPIR-V targets built a fully complete OpenCL library, and that they would selectively link which OpenCL builtins they want from libclc and which they want to override or optimize themselves. Personally, I'm hesitant to make too many changes that affect the downstream SPIR-V targets at the same time as refactoring the library in such an intrusive way. This macro header redirection is a convenient way of keeping things the same for them, whilst clearing away the noise that might obscure real codegen issues. As mentioned above, it sounds like we can improve the SPIR-V situation for most `__clc` functions. In general I'd prefer to loop back to that sort of thing once the refactoring is complete. Then we can test (and commit, revert, whatever) those changes in isolation. We're not backing ourselves into a corner with the macro redirections. https://github.com/llvm/llvm-project/pull/116786 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libclc] [libclc] Move several integer functions to CLC library (PR #116786)
@@ -0,0 +1,21 @@ +#ifndef __CLC_INTEGER_CLC_CLZ_H__ +#define __CLC_INTEGER_CLC_CLZ_H__ + +#if defined(CLC_CLSPV) || defined(CLC_SPIRV) +// clspv and spir-v targets provide their own OpenCL-compatible clz +#define __clc_clz clz arsenm wrote: > In this context of this PR, what are you proposing? That no such redirections > should take place, or just that ones that map to intrinsics specifically > should not be redirected? I mean __clc_ functions should not be implemented by directly calling the opencl public name, that's backwards. Targets should be implementing __clc override implementations, not directly providing their own opencl public entrypoints. It sounds like that's the end goal, not sure how this step gets you there though. I won't hold this up if it's in service of reaching that end https://github.com/llvm/llvm-project/pull/116786 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libclc] [libclc] Move several integer functions to CLC library (PR #116786)
https://github.com/karolherbst edited https://github.com/llvm/llvm-project/pull/116786 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libclc] [libclc] Move several integer functions to CLC library (PR #116786)
@@ -0,0 +1,21 @@ +#ifndef __CLC_INTEGER_CLC_CLZ_H__ +#define __CLC_INTEGER_CLC_CLZ_H__ + +#if defined(CLC_CLSPV) || defined(CLC_SPIRV) +// clspv and spir-v targets provide their own OpenCL-compatible clz +#define __clc_clz clz karolherbst wrote: similar here https://github.com/llvm/llvm-project/pull/116786 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libclc] [libclc] Move several integer functions to CLC library (PR #116786)
@@ -0,0 +1,21 @@ +#ifndef __CLC_INTEGER_CLC_CLZ_H__ +#define __CLC_INTEGER_CLC_CLZ_H__ + +#if defined(CLC_CLSPV) || defined(CLC_SPIRV) +// clspv and spir-v targets provide their own OpenCL-compatible clz +#define __clc_clz clz rjodinchr wrote: I don't think the size of the library matters too much. For sure if we see better performance numbers with libclc implementation we would want to switch to them. But to be honest, when it comes to performance, the first thing we do is to forget about the accuracy and use native Vulkan operators, thus we use as libclc as little as possible in that case. https://github.com/llvm/llvm-project/pull/116786 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libclc] [libclc] Move several integer functions to CLC library (PR #116786)
@@ -0,0 +1,21 @@ +#ifndef __CLC_INTEGER_CLC_CLZ_H__ +#define __CLC_INTEGER_CLC_CLZ_H__ + +#if defined(CLC_CLSPV) || defined(CLC_SPIRV) +// clspv and spir-v targets provide their own OpenCL-compatible clz +#define __clc_clz clz frasercrmck wrote: > On the `clspv` side, we only care about functions that we are not capable of > translating with the correct precision using a few Vulkan SPIR-V operators. > > * rhadd: > https://github.com/google/clspv/blob/fea65392ea282dec9a43c3ab86fb63b890f6354e/lib/ReplaceOpenCLBuiltinPass.cpp#L3691 > > * popcount: > https://github.com/google/clspv/blob/fea65392ea282dec9a43c3ab86fb63b890f6354e/lib/SPIRVProducerPass.cpp#L4332 > > > So what `clspv` relies on is math functions that are easy to translate with a > maintainable number of Vulkan operators. Thanks! So for clspv this is (almost) entirely a performance thing, because you can ship a smaller library and have more performant implementations of some of the builtins? What if libclc was able to provide equivalent or better implementations? Would you switch to the libclc implementations? Is the size of the libclc library at all a concern, or is it just a matter of the right codegen? https://github.com/llvm/llvm-project/pull/116786 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libclc] [libclc] Move several integer functions to CLC library (PR #116786)
@@ -0,0 +1,21 @@ +#ifndef __CLC_INTEGER_CLC_CLZ_H__ +#define __CLC_INTEGER_CLC_CLZ_H__ + +#if defined(CLC_CLSPV) || defined(CLC_SPIRV) +// clspv and spir-v targets provide their own OpenCL-compatible clz +#define __clc_clz clz rjodinchr wrote: On the `clspv` side, we only care about functions that we are not capable of translating with the correct precision using a few Vulkan SPIR-V operators. - rhadd: https://github.com/google/clspv/blob/fea65392ea282dec9a43c3ab86fb63b890f6354e/lib/ReplaceOpenCLBuiltinPass.cpp#L3691 - popcount: https://github.com/google/clspv/blob/fea65392ea282dec9a43c3ab86fb63b890f6354e/lib/SPIRVProducerPass.cpp#L4332 So what `clspv` relies on is math functions that are easy to translate with a maintainable number of Vulkan operators. https://github.com/llvm/llvm-project/pull/116786 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libclc] [libclc] Move several integer functions to CLC library (PR #116786)
@@ -0,0 +1,21 @@ +#ifndef __CLC_INTEGER_CLC_CLZ_H__ +#define __CLC_INTEGER_CLC_CLZ_H__ + +#if defined(CLC_CLSPV) || defined(CLC_SPIRV) +// clspv and spir-v targets provide their own OpenCL-compatible clz +#define __clc_clz clz frasercrmck wrote: > > I'm not sure why wrapping the intrinsic isn't a suitable implementation for > > SPIR-V targets. > > I assert it must be and SPIRV must handle the intrinsic. It seems like the SPIR-V translator is able to translate the `ctlz` intrinsic to the `OpExtInst clz` SPIR-V instruction, which seems to me to be equivalent to redirecting `__clc_clz` back to the OpenCL `clz` entry point. In this context of this PR, what are you proposing? That no such redirections should take place, or just that ones that map to intrinsics specifically should not be redirected? I don't see how providing proper implementations for the CLC functions in this PR would be a problem for SPIR-V targets, except perhaps for adding extra code they don't end up using. What I still don't fully understand is why the SPIR-V targets currently don't provide implementations for certain OpenCL entry points like `rhadd` or `popcount`. @karolherbst or @rjodinchr, would you be able to clarify? https://github.com/llvm/llvm-project/pull/116786 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libclc] [libclc] Move several integer functions to CLC library (PR #116786)
@@ -0,0 +1,21 @@ +#ifndef __CLC_INTEGER_CLC_CLZ_H__ +#define __CLC_INTEGER_CLC_CLZ_H__ + +#if defined(CLC_CLSPV) || defined(CLC_SPIRV) +// clspv and spir-v targets provide their own OpenCL-compatible clz +#define __clc_clz clz arsenm wrote: > I'm not sure why wrapping the intrinsic isn't a suitable implementation for > SPIR-V targets. I assert it must be and SPIRV must handle the intrinsic. > I keep going back and forth between whether libclc should be judgemental > about that, or whether it should allow targets to opt out of any definition > they want. The intent of libclc is not a free form user library. It is compiler data for clang to implement opencl. Allowing targets to define the implementation function as a redirection to the standard entrypoint just doesn't make sense. Plus every target should be dragged to use the generic implementation as much as possible, or else there's little point in using shared infrastructure. In cases where there's a direct llvm intrinsic there's no sensible reason to emit something else. https://github.com/llvm/llvm-project/pull/116786 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libclc] [libclc] Move several integer functions to CLC library (PR #116786)
https://github.com/frasercrmck edited https://github.com/llvm/llvm-project/pull/116786 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libclc] [libclc] Move several integer functions to CLC library (PR #116786)
@@ -0,0 +1,21 @@ +#ifndef __CLC_INTEGER_CLC_CLZ_H__ +#define __CLC_INTEGER_CLC_CLZ_H__ + +#if defined(CLC_CLSPV) || defined(CLC_SPIRV) +// clspv and spir-v targets provide their own OpenCL-compatible clz +#define __clc_clz clz frasercrmck wrote: > > The clspv/spirv targets don't provide implementations of all OpenCL builtins > > What do you mean by this exactly? I think the system today is backwards and > it results in confusing terminology. There are 2 possible interpretation of > "target provided" library functions. The backend notion and the libclc notion. > Apologies; it is confusing. What I meant was that the libclc libraries built for those targets intentionally don't supply implementation of all OpenCL functions. If you disassemble `clspv64--.bc` you'll see it make calls to `_Z3clzi` but `_Z3clzi` is left as a declaration. By "provide" I mean "expose an implementation for". And, just to be clear, it's not just `clz`. You can see all of the `declare`s if you disassemble the libclc libraries. I can't claim to fully understand what they're doing with these functions once they've linked with their libclc libraries. > > we need to replace that with a redirection to clz somehow > > By `clz` do you mean OpenCL `clz` (Itanium mangled function defined in this > bitcode lib) or system/target provided (libc) `clz`? Either way, it shouldn't > matter. `__clc_clz` should not be defined in terms of either. The goal here > is to just get a call to `llvm.ctlz` (which should be the universal > implementation) in the IR. __clc_clz should always just be a wrapper around > the intrinsic call. What you have in the impl file should already produce > this after instcombine in the scalar case. By `clz` I meant OpenCL `clz` (or `_Z3clc*`) but as I explained earlier, it's not *defined* in this bitcode lib (for those targets). I don't want to get too hung up specifically on `clz` (not all are as simple as being intrinsic wrappers), but I agree with you in principle. I'm not sure why wrapping the intrinsic _isn't_ a suitable implementation for SPIR-V targets. I keep going back and forth between whether libclc _should_ be judgemental about that, or whether it should allow targets to opt out of any definition they want. The AMD/NVIDIA (and our own downstream) targets are certainly a lot simpler in this regard, in that they desire an implementation of all OpenCL builtins. This macro business is a way of avoiding us having to customize the "implementations" of `clz` for SPIR-V targets, by having them compile an implementation of `__clc_clz` that just calls back out to OpenCL `clz`, which is left as a declaration in the final libclc library. This is all working from the assumptions that: 1. I want to make as few changes to the bitcode codegen for the SPIR-V targets as possible, in the interests of throughput of the CLC-related patches. 2. Whichever OpenCL builtins they're currently opting out of providing must be for a reason and I don't want to change that 3. The people using SPIR-V targets are harder to reach than the other users of libclc Perhaps those assumptions are invalid. > I see some of the target overrides are defined directly on the opencl > entrypoints. These should be overriding the __clc wrappers, rather than > directly overriding the entries. rocm device libs never tries to use the > opencl entrypoints internally, and (libclc is originally a fork of the same > lib from ~2014). I think the targets should only provide overrides of __clc > wrappers, and not directly override the public entrypoints. Those wrappers > may end up being implemented with the libm/libc system names (preferably by > codegen) I totally agree - this is the direction I want to take it in. OpenCL builtins call `__clc` functions, and `__clc` functions call each other internally. Targets override `__clc` functions if they wish, and the OpenCL builtins are left 100% generic. At that point I'd also like to re-organize the library so we have `libclc/clc` and `libclc/opencl` to better differentiate the code. I just haven't round gotten to any `__clc` versions of functions that are currently being overridden by any target. Once I get `sign`/`nextafter` moved over, we can move over the conversion builtins, and then the more complicated ones will follow, like maths builtins. https://github.com/llvm/llvm-project/pull/116786 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libclc] [libclc] Move several integer functions to CLC library (PR #116786)
@@ -0,0 +1,21 @@ +#ifndef __CLC_INTEGER_CLC_CLZ_H__ +#define __CLC_INTEGER_CLC_CLZ_H__ + +#if defined(CLC_CLSPV) || defined(CLC_SPIRV) +// clspv and spir-v targets provide their own OpenCL-compatible clz +#define __clc_clz clz arsenm wrote: > The clspv/spirv targets don't provide implementations of all OpenCL builtins What do you mean by this exactly? I think the system today is backwards and it results in confusing terminology. There are 2 possible interpretation of "target provided" library functions. The backend notion and the libclc notion. > we need to replace that with a redirection to clz somehow By `clz` do you mean OpenCL `clz` (Itanium mangled function defined in this bitcode lib) or system/target provided (libc) `clz`? Either way, it shouldn't matter. `__clc_clz` should not be defined in terms of either. The goal here is to just get a call to `llvm.ctlz` (which should be the universal implementation) in the IR. __clc_clz should always just be a wrapper around the intrinsic call. What you have in the impl file should already produce this after instcombine in the scalar case. I see some of the target overrides are defined directly on the opencl entrypoints. These should be overriding the __clc wrappers, rather than directly overriding the entries. rocm device libs never tries to use the opencl entrypoints internally, and (libclc is originally a fork of the same lib from ~2014). I think the targets should only provide overrides of __clc wrappers, and not directly override the public entrypoints. Those wrappers may end up being implemented with the libm/libc system names (preferably by codegen) https://github.com/llvm/llvm-project/pull/116786 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libclc] [libclc] Move several integer functions to CLC library (PR #116786)
@@ -0,0 +1,44 @@ +#include +#include +#include + +_CLC_OVERLOAD _CLC_DEF char __clc_clz(char x) { + return __clc_clz((ushort)(uchar)x) - 8; +} + +_CLC_OVERLOAD _CLC_DEF uchar __clc_clz(uchar x) { + return __clc_clz((ushort)x) - 8; +} + +_CLC_OVERLOAD _CLC_DEF short __clc_clz(short x) { + return x ? __builtin_clzs(x) : 16; +} + +_CLC_OVERLOAD _CLC_DEF ushort __clc_clz(ushort x) { + return x ? __builtin_clzs(x) : 16; +} + +_CLC_OVERLOAD _CLC_DEF int __clc_clz(int x) { + return x ? __builtin_clz(x) : 32; +} + +_CLC_OVERLOAD _CLC_DEF uint __clc_clz(uint x) { + return x ? __builtin_clz(x) : 32; +} + +_CLC_OVERLOAD _CLC_DEF long __clc_clz(long x) { + return x ? __builtin_clzl(x) : 64; +} + +_CLC_OVERLOAD _CLC_DEF ulong __clc_clz(ulong x) { + return x ? __builtin_clzl(x) : 64; +} + +_CLC_UNARY_VECTORIZE(_CLC_OVERLOAD _CLC_DEF, char, __clc_clz, char) frasercrmck wrote: That might be better and less ambiguous as a new `__builtin_elementwise_clz` builtin, come to think of it. https://github.com/llvm/llvm-project/pull/116786 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libclc] [libclc] Move several integer functions to CLC library (PR #116786)
@@ -0,0 +1,21 @@ +#ifndef __CLC_INTEGER_CLC_CLZ_H__ +#define __CLC_INTEGER_CLC_CLZ_H__ + +#if defined(CLC_CLSPV) || defined(CLC_SPIRV) +// clspv and spir-v targets provide their own OpenCL-compatible clz +#define __clc_clz clz frasercrmck wrote: Oh right, I thought we were okay with this pattern given previous PRs were using it. It's a bit awkard. The clspv/spirv targets don't provide implementations of all OpenCL builtins. They don't want to provide `clz`, in this instance. But we're going to see, for example, `__clc_remquo` call out to `__clc_clz`. We can't leave a dangling reference to `__clc_clz` - we need to replace that with a redirection to `clz` somehow. This macro solution seemed like the simplest way, and one that would introduce no - or minimal - changes to the bytecode for those targets. Since they're both primarily used by downstream users I favoured the lightest possible touch. The other solution would be to have those targets define `__clz_clz` to call out to `clz`. That would be more code, and introduce more indirection, but would be functionally equivalent. I'd have to go back and make the change for the other instances of this pattern already merged. Do you have another idea of how we could accomplish this? https://github.com/llvm/llvm-project/pull/116786 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libclc] [libclc] Move several integer functions to CLC library (PR #116786)
@@ -0,0 +1,21 @@ +#ifndef __CLC_INTEGER_CLC_CLZ_H__ +#define __CLC_INTEGER_CLC_CLZ_H__ + +#if defined(CLC_CLSPV) || defined(CLC_SPIRV) +// clspv and spir-v targets provide their own OpenCL-compatible clz +#define __clc_clz clz arsenm wrote: I don't really understand what's going on here, or how the comment is relevant (here and all the other cases). Macros should be used for approximately nothing, but the generic implementation shouldn't depend on any reasoning about what spirv wants to do https://github.com/llvm/llvm-project/pull/116786 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libclc] [libclc] Move several integer functions to CLC library (PR #116786)
@@ -0,0 +1,44 @@ +#include +#include +#include + +_CLC_OVERLOAD _CLC_DEF char __clc_clz(char x) { + return __clc_clz((ushort)(uchar)x) - 8; +} + +_CLC_OVERLOAD _CLC_DEF uchar __clc_clz(uchar x) { + return __clc_clz((ushort)x) - 8; +} + +_CLC_OVERLOAD _CLC_DEF short __clc_clz(short x) { + return x ? __builtin_clzs(x) : 16; +} + +_CLC_OVERLOAD _CLC_DEF ushort __clc_clz(ushort x) { + return x ? __builtin_clzs(x) : 16; +} + +_CLC_OVERLOAD _CLC_DEF int __clc_clz(int x) { + return x ? __builtin_clz(x) : 32; +} + +_CLC_OVERLOAD _CLC_DEF uint __clc_clz(uint x) { + return x ? __builtin_clz(x) : 32; +} + +_CLC_OVERLOAD _CLC_DEF long __clc_clz(long x) { + return x ? __builtin_clzl(x) : 64; +} + +_CLC_OVERLOAD _CLC_DEF ulong __clc_clz(ulong x) { + return x ? __builtin_clzl(x) : 64; +} + +_CLC_UNARY_VECTORIZE(_CLC_OVERLOAD _CLC_DEF, char, __clc_clz, char) frasercrmck wrote: It would be nice if `__builtin_clz*` builtins could operate on vector types. I'll make a note of this improvement. https://github.com/llvm/llvm-project/pull/116786 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libclc] [libclc] Move several integer functions to CLC library (PR #116786)
https://github.com/frasercrmck edited https://github.com/llvm/llvm-project/pull/116786 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libclc] [libclc] Move several integer functions to CLC library (PR #116786)
https://github.com/frasercrmck created https://github.com/llvm/llvm-project/pull/116786 This commit moves over the OpenCL clz, hadd, mad24, mad_hi, mul24, mul_hi, popcount, rhadd, and upsample builtins to the CLC library. There are no changes to any target's CLC libraries. The OpenCL mad_hi builtin wasn't previously publicly available from the CLC libraries, as it was hash-defined to mul_hi in the header files. That issue has been fixed, and mad_hi is now exposed. The custom AMD implementation/workaround for popcount has been removed as it was only valid for clang < 7. There are still three integer functions which haven't been moved over. The OpenCL add_sat, sub_sat, and mad_sat builtins require saturating conversion builtins which haven't yet been ported. >From 3f05aee5651a8364d4f3ba45bfc8024ff5beec8c Mon Sep 17 00:00:00 2001 From: Fraser Cormack Date: Mon, 18 Nov 2024 17:29:42 + Subject: [PATCH] [libclc] Move several integer functions to CLC library This commit moves over the OpenCL clz, hadd, mad24, mad_hi, mul24, mul_hi, popcount, rhadd, and upsample builtins to the CLC library. There are no changes to any target's CLC libraries. The OpenCL mad_hi builtin wasn't previously publicly available from the CLC libraries, as it was hash-defined to mul_hi in the header files. That issue has been fixed, and mad_hi is now exposed. The custom AMD implementation/workaround for popcount has been removed as it was only valid for clang < 7. There are still two integer functions which haven't been moved over. The OpenCL add_sat, sub_sat, and mad_sat builtins require saturating conversion builtins which haven't yet been ported. --- libclc/amdgcn/lib/SOURCES | 1 - libclc/amdgcn/lib/integer/popcount.cl | 6 - libclc/amdgcn/lib/integer/popcount.inc| 17 --- libclc/clc/include/clc/integer/binary_decl.h | 2 + libclc/clc/include/clc/integer/clc_clz.h | 21 +++ libclc/clc/include/clc/integer/clc_hadd.h | 21 +++ libclc/clc/include/clc/integer/clc_mad24.h| 21 +++ libclc/clc/include/clc/integer/clc_mad_hi.h | 8 ++ libclc/clc/include/clc/integer/clc_mul24.h| 21 +++ libclc/clc/include/clc/integer/clc_mul_hi.h | 21 +++ libclc/clc/include/clc/integer/clc_popcount.h | 19 +++ libclc/clc/include/clc/integer/clc_rhadd.h| 21 +++ libclc/clc/include/clc/integer/clc_upsample.h | 38 + .../include/clc/integer/definitions.h | 7 +- libclc/clc/include/clc/integer/gentype24.inc | 134 ++ libclc/clc/include/clc/integer/ternary_decl.h | 2 + libclc/clc/include/clc/integer/unary_decl.h | 1 + .../clc/include/clc/integer/unary_intrin.inc | 26 libclc/clc/lib/generic/SOURCES| 7 + libclc/clc/lib/generic/integer/clc_clz.cl | 44 ++ libclc/clc/lib/generic/integer/clc_hadd.cl| 4 + libclc/clc/lib/generic/integer/clc_hadd.inc | 8 ++ libclc/clc/lib/generic/integer/clc_mad24.cl | 5 + libclc/clc/lib/generic/integer/clc_mad24.inc | 5 + libclc/clc/lib/generic/integer/clc_mul24.cl | 4 + .../lib/generic/integer/clc_mul24.inc}| 4 +- libclc/clc/lib/generic/integer/clc_mul_hi.cl | 113 +++ libclc/clc/lib/generic/integer/clc_rhadd.cl | 4 + libclc/clc/lib/generic/integer/clc_rhadd.inc | 8 ++ .../clc/lib/generic/integer/clc_upsample.cl | 45 ++ libclc/generic/include/clc/integer/clz.h | 6 +- libclc/generic/include/clc/integer/clz.inc| 1 - libclc/generic/include/clc/integer/hadd.h | 6 +- libclc/generic/include/clc/integer/hadd.inc | 1 - libclc/generic/include/clc/integer/mad24.h| 9 +- libclc/generic/include/clc/integer/mad24.inc | 1 - libclc/generic/include/clc/integer/mad_hi.h | 7 +- libclc/generic/include/clc/integer/mul24.h| 9 +- libclc/generic/include/clc/integer/mul24.inc | 1 - libclc/generic/include/clc/integer/mul_hi.h | 6 +- libclc/generic/include/clc/integer/mul_hi.inc | 1 - libclc/generic/include/clc/integer/popcount.h | 9 +- libclc/generic/include/clc/integer/rhadd.h| 6 +- libclc/generic/include/clc/integer/rhadd.inc | 1 - libclc/generic/include/clc/integer/upsample.h | 33 +++-- libclc/generic/include/integer/popcount.h | 3 - .../generic/include/integer/unary_intrin.inc | 20 --- libclc/generic/lib/SOURCES| 1 + libclc/generic/lib/integer/binary_def.inc | 8 ++ libclc/generic/lib/integer/clz.cl | 44 +- libclc/generic/lib/integer/hadd.cl| 5 +- libclc/generic/lib/integer/hadd.inc | 6 - libclc/generic/lib/integer/mad24.cl | 7 +- libclc/generic/lib/integer/mad24.inc | 3 - libclc/generic/lib/integer/mad_hi.cl | 7 + libclc/generic/lib/integer/mul24.cl | 7 +- libclc/generic/lib/integer/mul_hi.cl | 110 +- libclc/generic/lib/integer/popcount.cl| 7 +- libclc/generic/lib/integer/rhadd