[PATCH] D37231: Add half load and store builtins
This revision was automatically updated to reflect the committed changes. Closed by commit rL312742: [OpenCL] Add half load and store builtins (authored by jvesely). Changed prior to commit: https://reviews.llvm.org/D37231?vs=113624=114240#toc Repository: rL LLVM https://reviews.llvm.org/D37231 Files: cfe/trunk/include/clang/Basic/Builtins.def cfe/trunk/include/clang/Basic/Builtins.h cfe/trunk/lib/Basic/Builtins.cpp cfe/trunk/lib/CodeGen/CGBuiltin.cpp cfe/trunk/test/CodeGenOpenCL/no-half.cl Index: cfe/trunk/lib/CodeGen/CGBuiltin.cpp === --- cfe/trunk/lib/CodeGen/CGBuiltin.cpp +++ cfe/trunk/lib/CodeGen/CGBuiltin.cpp @@ -2768,6 +2768,24 @@ Name), {NDRange, Block})); } + + case Builtin::BI__builtin_store_half: + case Builtin::BI__builtin_store_halff: { +Value *Val = EmitScalarExpr(E->getArg(0)); +Address Address = EmitPointerWithAlignment(E->getArg(1)); +Value *HalfVal = Builder.CreateFPTrunc(Val, Builder.getHalfTy()); +return RValue::get(Builder.CreateStore(HalfVal, Address)); + } + case Builtin::BI__builtin_load_half: { +Address Address = EmitPointerWithAlignment(E->getArg(0)); +Value *HalfVal = Builder.CreateLoad(Address); +return RValue::get(Builder.CreateFPExt(HalfVal, Builder.getDoubleTy())); + } + case Builtin::BI__builtin_load_halff: { +Address Address = EmitPointerWithAlignment(E->getArg(0)); +Value *HalfVal = Builder.CreateLoad(Address); +return RValue::get(Builder.CreateFPExt(HalfVal, Builder.getFloatTy())); + } case Builtin::BIprintf: if (getTarget().getTriple().isNVPTX()) return EmitNVPTXDevicePrintfCallExpr(E, ReturnValue); Index: cfe/trunk/lib/Basic/Builtins.cpp === --- cfe/trunk/lib/Basic/Builtins.cpp +++ cfe/trunk/lib/Basic/Builtins.cpp @@ -69,9 +69,14 @@ bool MSModeUnsupported = !LangOpts.MicrosoftExt && (BuiltinInfo.Langs & MS_LANG); bool ObjCUnsupported = !LangOpts.ObjC1 && BuiltinInfo.Langs == OBJC_LANG; - bool OclCUnsupported = LangOpts.OpenCLVersion != 200 && - BuiltinInfo.Langs == OCLC20_LANG; + bool OclC1Unsupported = (LangOpts.OpenCLVersion / 100) != 1 && + (BuiltinInfo.Langs & ALL_OCLC_LANGUAGES ) == OCLC1X_LANG; + bool OclC2Unsupported = LangOpts.OpenCLVersion != 200 && + (BuiltinInfo.Langs & ALL_OCLC_LANGUAGES) == OCLC20_LANG; + bool OclCUnsupported = !LangOpts.OpenCL && + (BuiltinInfo.Langs & ALL_OCLC_LANGUAGES); return !BuiltinsUnsupported && !MathBuiltinsUnsupported && !OclCUnsupported && + !OclC1Unsupported && !OclC2Unsupported && !GnuModeUnsupported && !MSModeUnsupported && !ObjCUnsupported; } Index: cfe/trunk/include/clang/Basic/Builtins.h === --- cfe/trunk/include/clang/Basic/Builtins.h +++ cfe/trunk/include/clang/Basic/Builtins.h @@ -36,10 +36,12 @@ CXX_LANG = 0x4, // builtin for cplusplus only. OBJC_LANG = 0x8,// builtin for objective-c and objective-c++ MS_LANG = 0x10, // builtin requires MS mode. - OCLC20_LANG = 0x20, // builtin for OpenCL C only. + OCLC20_LANG = 0x20, // builtin for OpenCL C 2.0 only. + OCLC1X_LANG = 0x40, // builtin for OpenCL C 1.x only. ALL_LANGUAGES = C_LANG | CXX_LANG | OBJC_LANG, // builtin for all languages. ALL_GNU_LANGUAGES = ALL_LANGUAGES | GNU_LANG, // builtin requires GNU mode. - ALL_MS_LANGUAGES = ALL_LANGUAGES | MS_LANG // builtin requires MS mode. + ALL_MS_LANGUAGES = ALL_LANGUAGES | MS_LANG,// builtin requires MS mode. + ALL_OCLC_LANGUAGES = OCLC1X_LANG | OCLC20_LANG // builtin for OCLC languages. }; namespace Builtin { Index: cfe/trunk/include/clang/Basic/Builtins.def === --- cfe/trunk/include/clang/Basic/Builtins.def +++ cfe/trunk/include/clang/Basic/Builtins.def @@ -1424,6 +1424,12 @@ LANGBUILTIN(to_local, "v*v*", "tn", OCLC20_LANG) LANGBUILTIN(to_private, "v*v*", "tn", OCLC20_LANG) +// OpenCL half load/store builtin +LANGBUILTIN(__builtin_store_half, "vdh*", "n", ALL_OCLC_LANGUAGES) +LANGBUILTIN(__builtin_store_halff, "vfh*", "n", ALL_OCLC_LANGUAGES) +LANGBUILTIN(__builtin_load_half, "dhC*", "nc", ALL_OCLC_LANGUAGES) +LANGBUILTIN(__builtin_load_halff, "fhC*", "nc", ALL_OCLC_LANGUAGES) + // Builtins for os_log/os_trace BUILTIN(__builtin_os_log_format_buffer_size, "zcC*.", "p:0:nut") BUILTIN(__builtin_os_log_format, "v*v*cC*.", "p:0:nt") Index: cfe/trunk/test/CodeGenOpenCL/no-half.cl === --- cfe/trunk/test/CodeGenOpenCL/no-half.cl +++ cfe/trunk/test/CodeGenOpenCL/no-half.cl @@ -0,0 +1,39 @@ +// RUN: %clang_cc1 %s -cl-std=cl2.0 -emit-llvm -o - -triple spir-unknown-unknown | FileCheck %s +// RUN: %clang_cc1 %s -cl-std=cl1.2
[PATCH] D37231: Add half load and store builtins
Anastasia accepted this revision. Anastasia added a comment. This revision is now accepted and ready to land. Yes, sorry overlooked that. :) LGTM! Thanks! Repository: rL LLVM https://reviews.llvm.org/D37231 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37231: Add half load and store builtins
jvesely marked 2 inline comments as done. jvesely added inline comments. Comment at: test/CodeGenOpenCL/no-half.cl:27 + foo[0] = __builtin_load_halff(bar); +// CHECK: [[HALF_VAL:%.*]] = load +// CHECK: [[FULL_VAL:%.*]] = fpext half [[HALF_VAL]] to float Anastasia wrote: > jvesely wrote: > > Anastasia wrote: > > > Minor thing: any reason you are not checking the load fully? > > just my laziness, I've added full check. > Could we do the same for the above examples too? I don't understand. if you mean test_store_*, those functions do not generate any load instructions. the full generated code is: test_store_double: ``` entry: %0 = fptrunc double %foo to half store half %0, half addrspace(1)* %bar, align 2 ret void ``` Repository: rL LLVM https://reviews.llvm.org/D37231 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37231: Add half load and store builtins
Anastasia added inline comments. Comment at: test/CodeGenOpenCL/no-half.cl:27 + foo[0] = __builtin_load_halff(bar); +// CHECK: [[HALF_VAL:%.*]] = load +// CHECK: [[FULL_VAL:%.*]] = fpext half [[HALF_VAL]] to float jvesely wrote: > Anastasia wrote: > > Minor thing: any reason you are not checking the load fully? > just my laziness, I've added full check. Could we do the same for the above examples too? Repository: rL LLVM https://reviews.llvm.org/D37231 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37231: Add half load and store builtins
jvesely requested review of this revision. jvesely added a comment. please let me know if your accept still stands for the modified version. Repository: rL LLVM https://reviews.llvm.org/D37231 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37231: Add half load and store builtins
jvesely updated this revision to Diff 113624. jvesely added a comment. mark load pointers const Repository: rL LLVM https://reviews.llvm.org/D37231 Files: include/clang/Basic/Builtins.def include/clang/Basic/Builtins.h lib/Basic/Builtins.cpp lib/CodeGen/CGBuiltin.cpp test/CodeGenOpenCL/no-half.cl Index: test/CodeGenOpenCL/no-half.cl === --- /dev/null +++ test/CodeGenOpenCL/no-half.cl @@ -0,0 +1,39 @@ +// RUN: %clang_cc1 %s -cl-std=cl2.0 -emit-llvm -o - -triple spir-unknown-unknown | FileCheck %s +// RUN: %clang_cc1 %s -cl-std=cl1.2 -emit-llvm -o - -triple spir-unknown-unknown | FileCheck %s +// RUN: %clang_cc1 %s -cl-std=cl1.1 -emit-llvm -o - -triple spir-unknown-unknown | FileCheck %s + +#pragma OPENCL EXTENSION cl_khr_fp64:enable + +// CHECK-LABEL: @test_store_float(float %foo, half addrspace({{.}}){{.*}} %bar) +__kernel void test_store_float(float foo, __global half* bar) +{ + __builtin_store_halff(foo, bar); +// CHECK: [[HALF_VAL:%.*]] = fptrunc float %foo to half +// CHECK: store half [[HALF_VAL]], half addrspace({{.}})* %bar, align 2 +} + +// CHECK-LABEL: @test_store_double(double %foo, half addrspace({{.}}){{.*}} %bar) +__kernel void test_store_double(double foo, __global half* bar) +{ + __builtin_store_half(foo, bar); +// CHECK: [[HALF_VAL:%.*]] = fptrunc double %foo to half +// CHECK: store half [[HALF_VAL]], half addrspace({{.}})* %bar, align 2 +} + +// CHECK-LABEL: @test_load_float(float addrspace({{.}}){{.*}} %foo, half addrspace({{.}}){{.*}} %bar) +__kernel void test_load_float(__global float* foo, __global half* bar) +{ + foo[0] = __builtin_load_halff(bar); +// CHECK: [[HALF_VAL:%.*]] = load half, half addrspace({{.}})* %bar +// CHECK: [[FULL_VAL:%.*]] = fpext half [[HALF_VAL]] to float +// CHECK: store float [[FULL_VAL]], float addrspace({{.}})* %foo +} + +// CHECK-LABEL: @test_load_double(double addrspace({{.}}){{.*}} %foo, half addrspace({{.}}){{.*}} %bar) +__kernel void test_load_double(__global double* foo, __global half* bar) +{ + foo[0] = __builtin_load_half(bar); +// CHECK: [[HALF_VAL:%.*]] = load half, half addrspace({{.}})* %bar +// CHECK: [[FULL_VAL:%.*]] = fpext half [[HALF_VAL]] to double +// CHECK: store double [[FULL_VAL]], double addrspace({{.}})* %foo +} Index: lib/CodeGen/CGBuiltin.cpp === --- lib/CodeGen/CGBuiltin.cpp +++ lib/CodeGen/CGBuiltin.cpp @@ -2724,6 +2724,24 @@ Name), {NDRange, Block})); } + + case Builtin::BI__builtin_store_half: + case Builtin::BI__builtin_store_halff: { +Value *Val = EmitScalarExpr(E->getArg(0)); +Address Address = EmitPointerWithAlignment(E->getArg(1)); +Value *HalfVal = Builder.CreateFPTrunc(Val, Builder.getHalfTy()); +return RValue::get(Builder.CreateStore(HalfVal, Address)); + } + case Builtin::BI__builtin_load_half: { +Address Address = EmitPointerWithAlignment(E->getArg(0)); +Value *HalfVal = Builder.CreateLoad(Address); +return RValue::get(Builder.CreateFPExt(HalfVal, Builder.getDoubleTy())); + } + case Builtin::BI__builtin_load_halff: { +Address Address = EmitPointerWithAlignment(E->getArg(0)); +Value *HalfVal = Builder.CreateLoad(Address); +return RValue::get(Builder.CreateFPExt(HalfVal, Builder.getFloatTy())); + } case Builtin::BIprintf: if (getTarget().getTriple().isNVPTX()) return EmitNVPTXDevicePrintfCallExpr(E, ReturnValue); Index: lib/Basic/Builtins.cpp === --- lib/Basic/Builtins.cpp +++ lib/Basic/Builtins.cpp @@ -69,9 +69,14 @@ bool MSModeUnsupported = !LangOpts.MicrosoftExt && (BuiltinInfo.Langs & MS_LANG); bool ObjCUnsupported = !LangOpts.ObjC1 && BuiltinInfo.Langs == OBJC_LANG; - bool OclCUnsupported = LangOpts.OpenCLVersion != 200 && - BuiltinInfo.Langs == OCLC20_LANG; + bool OclC1Unsupported = (LangOpts.OpenCLVersion / 100) != 1 && + (BuiltinInfo.Langs & ALL_OCLC_LANGUAGES ) == OCLC1X_LANG; + bool OclC2Unsupported = LangOpts.OpenCLVersion != 200 && + (BuiltinInfo.Langs & ALL_OCLC_LANGUAGES) == OCLC20_LANG; + bool OclCUnsupported = !LangOpts.OpenCL && + (BuiltinInfo.Langs & ALL_OCLC_LANGUAGES); return !BuiltinsUnsupported && !MathBuiltinsUnsupported && !OclCUnsupported && + !OclC1Unsupported && !OclC2Unsupported && !GnuModeUnsupported && !MSModeUnsupported && !ObjCUnsupported; } Index: include/clang/Basic/Builtins.h === --- include/clang/Basic/Builtins.h +++ include/clang/Basic/Builtins.h @@ -36,10 +36,12 @@ CXX_LANG = 0x4, // builtin for cplusplus only. OBJC_LANG = 0x8,// builtin for objective-c and objective-c++ MS_LANG = 0x10, // builtin requires MS mode. - OCLC20_LANG = 0x20, // builtin for OpenCL C only. +
[PATCH] D37231: Add half load and store builtins
jvesely added inline comments. Comment at: include/clang/Basic/Builtins.def:1427 +// OpenCL half load/store builtin +BUILTIN(__builtin_store_half, "vdh*", "n") +BUILTIN(__builtin_store_halff, "vfh*", "n") Anastasia wrote: > jvesely wrote: > > Anastasia wrote: > > > I think this should be a language builtin (see above) but perhaps we > > > might need to extend the language version here. Because I believe we only > > > have OpenCL v2.0 currently. > > > > > > Also this should only be available if `cl_khr_fp16` is supported and > > > enabled? I think we are doing similar with some subgroups functions (e.g. > > > `get_kernel_sub_group_count_for_ndrange`) that are only supported by > > > `cl_khr_subgroup` but those have custom diagnostic though. May be we > > > could leave this check out since `half` is not available if `cl_khr_fp16` > > > is not enabled anyways. > > This is specifically meant to be used when `cl_khr_fp16` is **not** > > available. > > CLC allows using half as storage format and half pointers without the > > extension, > > vstore_half/vload_half are used to load/store half values. (CL1.2 CH > > 6.1.1.1) > > > > These builtins are not necessary if `cl_khr_fp16` is available (we can use > > regular loads/stores). > > > > I'll take stab at making these CLC only, but similarly to device specific > > builtins it looked useful beyond that, since these builtins provide access > > to half type storage. > Strange. This is not how I would interpret from the extension spec though: > https://www.khronos.org/registry/OpenCL/sdk/1.2/docs/man/xhtml/cl_khr_fp16.html > > But I think for this change is probably fine indeed because this doesn't > affect half type itself. I'm not sure I see the conflict here. `cl_khr_fp16` adds support for `half` scalar and `halfn` vector types. without the extension the specs say (`OCL 1.2 Ch. 6.1.1.1`): > The half data type can only be used to declare a pointer to a buffer that > contains half values. `vload_half` and `vstore_half` used to access those buffers without needing `half` type (or the `cl_khr_fp16` extension). > But I think for this change is probably fine indeed because this doesn't > affect half type itself. exactly. this is needed outside of `cl_khr_fp16`, or the `half` type Comment at: test/CodeGenOpenCL/no-half.cl:19 + __builtin_store_half(foo, bar); +// CHECK: [[HALF_VAL:%.*]] = fptrunc double %foo to half +// CHECK: store half [[HALF_VAL]], half addrspace({{.}})* %bar, align 2 Anastasia wrote: > Would it make sense to add a check for `load` similarly to `store` in the > test_load_float/test_load_double tests? there is no load. `fptrunc double %foo to half` uses the function parameter directly Comment at: test/CodeGenOpenCL/no-half.cl:27 + foo[0] = __builtin_load_halff(bar); +// CHECK: [[HALF_VAL:%.*]] = load +// CHECK: [[FULL_VAL:%.*]] = fpext half [[HALF_VAL]] to float Anastasia wrote: > Minor thing: any reason you are not checking the load fully? just my laziness, I've added full check. Repository: rL LLVM https://reviews.llvm.org/D37231 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37231: Add half load and store builtins
jvesely updated this revision to Diff 113588. jvesely marked 6 inline comments as done. jvesely edited the summary of this revision. jvesely added a comment. fully check loads in tests Repository: rL LLVM https://reviews.llvm.org/D37231 Files: include/clang/Basic/Builtins.def include/clang/Basic/Builtins.h lib/Basic/Builtins.cpp lib/CodeGen/CGBuiltin.cpp test/CodeGenOpenCL/no-half.cl Index: test/CodeGenOpenCL/no-half.cl === --- /dev/null +++ test/CodeGenOpenCL/no-half.cl @@ -0,0 +1,39 @@ +// RUN: %clang_cc1 %s -cl-std=cl2.0 -emit-llvm -o - -triple spir-unknown-unknown | FileCheck %s +// RUN: %clang_cc1 %s -cl-std=cl1.2 -emit-llvm -o - -triple spir-unknown-unknown | FileCheck %s +// RUN: %clang_cc1 %s -cl-std=cl1.1 -emit-llvm -o - -triple spir-unknown-unknown | FileCheck %s + +#pragma OPENCL EXTENSION cl_khr_fp64:enable + +// CHECK-LABEL: @test_store_float(float %foo, half addrspace({{.}}){{.*}} %bar) +__kernel void test_store_float(float foo, __global half* bar) +{ + __builtin_store_halff(foo, bar); +// CHECK: [[HALF_VAL:%.*]] = fptrunc float %foo to half +// CHECK: store half [[HALF_VAL]], half addrspace({{.}})* %bar, align 2 +} + +// CHECK-LABEL: @test_store_double(double %foo, half addrspace({{.}}){{.*}} %bar) +__kernel void test_store_double(double foo, __global half* bar) +{ + __builtin_store_half(foo, bar); +// CHECK: [[HALF_VAL:%.*]] = fptrunc double %foo to half +// CHECK: store half [[HALF_VAL]], half addrspace({{.}})* %bar, align 2 +} + +// CHECK-LABEL: @test_load_float(float addrspace({{.}}){{.*}} %foo, half addrspace({{.}}){{.*}} %bar) +__kernel void test_load_float(__global float* foo, __global half* bar) +{ + foo[0] = __builtin_load_halff(bar); +// CHECK: [[HALF_VAL:%.*]] = load half, half addrspace({{.}})* %bar +// CHECK: [[FULL_VAL:%.*]] = fpext half [[HALF_VAL]] to float +// CHECK: store float [[FULL_VAL]], float addrspace({{.}})* %foo +} + +// CHECK-LABEL: @test_load_double(double addrspace({{.}}){{.*}} %foo, half addrspace({{.}}){{.*}} %bar) +__kernel void test_load_double(__global double* foo, __global half* bar) +{ + foo[0] = __builtin_load_half(bar); +// CHECK: [[HALF_VAL:%.*]] = load half, half addrspace({{.}})* %bar +// CHECK: [[FULL_VAL:%.*]] = fpext half [[HALF_VAL]] to double +// CHECK: store double [[FULL_VAL]], double addrspace({{.}})* %foo +} Index: lib/CodeGen/CGBuiltin.cpp === --- lib/CodeGen/CGBuiltin.cpp +++ lib/CodeGen/CGBuiltin.cpp @@ -2724,6 +2724,24 @@ Name), {NDRange, Block})); } + + case Builtin::BI__builtin_store_half: + case Builtin::BI__builtin_store_halff: { +Value *Val = EmitScalarExpr(E->getArg(0)); +Address Address = EmitPointerWithAlignment(E->getArg(1)); +Value *HalfVal = Builder.CreateFPTrunc(Val, Builder.getHalfTy()); +return RValue::get(Builder.CreateStore(HalfVal, Address)); + } + case Builtin::BI__builtin_load_half: { +Address Address = EmitPointerWithAlignment(E->getArg(0)); +Value *HalfVal = Builder.CreateLoad(Address); +return RValue::get(Builder.CreateFPExt(HalfVal, Builder.getDoubleTy())); + } + case Builtin::BI__builtin_load_halff: { +Address Address = EmitPointerWithAlignment(E->getArg(0)); +Value *HalfVal = Builder.CreateLoad(Address); +return RValue::get(Builder.CreateFPExt(HalfVal, Builder.getFloatTy())); + } case Builtin::BIprintf: if (getTarget().getTriple().isNVPTX()) return EmitNVPTXDevicePrintfCallExpr(E, ReturnValue); Index: lib/Basic/Builtins.cpp === --- lib/Basic/Builtins.cpp +++ lib/Basic/Builtins.cpp @@ -69,9 +69,14 @@ bool MSModeUnsupported = !LangOpts.MicrosoftExt && (BuiltinInfo.Langs & MS_LANG); bool ObjCUnsupported = !LangOpts.ObjC1 && BuiltinInfo.Langs == OBJC_LANG; - bool OclCUnsupported = LangOpts.OpenCLVersion != 200 && - BuiltinInfo.Langs == OCLC20_LANG; + bool OclC1Unsupported = (LangOpts.OpenCLVersion / 100) != 1 && + (BuiltinInfo.Langs & ALL_OCLC_LANGUAGES ) == OCLC1X_LANG; + bool OclC2Unsupported = LangOpts.OpenCLVersion != 200 && + (BuiltinInfo.Langs & ALL_OCLC_LANGUAGES) == OCLC20_LANG; + bool OclCUnsupported = !LangOpts.OpenCL && + (BuiltinInfo.Langs & ALL_OCLC_LANGUAGES); return !BuiltinsUnsupported && !MathBuiltinsUnsupported && !OclCUnsupported && + !OclC1Unsupported && !OclC2Unsupported && !GnuModeUnsupported && !MSModeUnsupported && !ObjCUnsupported; } Index: include/clang/Basic/Builtins.h === --- include/clang/Basic/Builtins.h +++ include/clang/Basic/Builtins.h @@ -36,10 +36,12 @@ CXX_LANG = 0x4, // builtin for cplusplus only. OBJC_LANG = 0x8,// builtin for objective-c and objective-c++ MS_LANG = 0x10,
[PATCH] D37231: Add half load and store builtins
Anastasia added inline comments. Comment at: test/CodeGenOpenCL/no-half.cl:19 + __builtin_store_half(foo, bar); +// CHECK: [[HALF_VAL:%.*]] = fptrunc double %foo to half +// CHECK: store half [[HALF_VAL]], half addrspace({{.}})* %bar, align 2 Would it make sense to add a check for `load` similarly to `store` in the test_load_float/test_load_double tests? Comment at: test/CodeGenOpenCL/no-half.cl:27 + foo[0] = __builtin_load_halff(bar); +// CHECK: [[HALF_VAL:%.*]] = load +// CHECK: [[FULL_VAL:%.*]] = fpext half [[HALF_VAL]] to float Minor thing: any reason you are not checking the load fully? Repository: rL LLVM https://reviews.llvm.org/D37231 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37231: Add half load and store builtins
Anastasia added a comment. LGTM! Thanks! Repository: rL LLVM https://reviews.llvm.org/D37231 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37231: Add half load and store builtins
Anastasia accepted this revision. Anastasia added inline comments. This revision is now accepted and ready to land. Comment at: include/clang/Basic/Builtins.def:1427 +// OpenCL half load/store builtin +BUILTIN(__builtin_store_half, "vdh*", "n") +BUILTIN(__builtin_store_halff, "vfh*", "n") jvesely wrote: > Anastasia wrote: > > I think this should be a language builtin (see above) but perhaps we might > > need to extend the language version here. Because I believe we only have > > OpenCL v2.0 currently. > > > > Also this should only be available if `cl_khr_fp16` is supported and > > enabled? I think we are doing similar with some subgroups functions (e.g. > > `get_kernel_sub_group_count_for_ndrange`) that are only supported by > > `cl_khr_subgroup` but those have custom diagnostic though. May be we could > > leave this check out since `half` is not available if `cl_khr_fp16` is not > > enabled anyways. > This is specifically meant to be used when `cl_khr_fp16` is **not** available. > CLC allows using half as storage format and half pointers without the > extension, > vstore_half/vload_half are used to load/store half values. (CL1.2 CH 6.1.1.1) > > These builtins are not necessary if `cl_khr_fp16` is available (we can use > regular loads/stores). > > I'll take stab at making these CLC only, but similarly to device specific > builtins it looked useful beyond that, since these builtins provide access to > half type storage. Strange. This is not how I would interpret from the extension spec though: https://www.khronos.org/registry/OpenCL/sdk/1.2/docs/man/xhtml/cl_khr_fp16.html But I think for this change is probably fine indeed because this doesn't affect half type itself. Repository: rL LLVM https://reviews.llvm.org/D37231 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37231: Add half load and store builtins
jvesely updated this revision to Diff 113190. jvesely added a comment. restrict builtins to OCLC langauges Repository: rL LLVM https://reviews.llvm.org/D37231 Files: include/clang/Basic/Builtins.def include/clang/Basic/Builtins.h lib/Basic/Builtins.cpp lib/CodeGen/CGBuiltin.cpp test/CodeGenOpenCL/no-half.cl Index: test/CodeGenOpenCL/no-half.cl === --- /dev/null +++ test/CodeGenOpenCL/no-half.cl @@ -0,0 +1,39 @@ +// RUN: %clang_cc1 %s -cl-std=cl2.0 -emit-llvm -o - -triple spir-unknown-unknown | FileCheck %s +// RUN: %clang_cc1 %s -cl-std=cl1.2 -emit-llvm -o - -triple spir-unknown-unknown | FileCheck %s +// RUN: %clang_cc1 %s -cl-std=cl1.1 -emit-llvm -o - -triple spir-unknown-unknown | FileCheck %s + +#pragma OPENCL EXTENSION cl_khr_fp64:enable + +// CHECK-LABEL: @test_store_float(float %foo, half addrspace({{.}}){{.*}} %bar) +__kernel void test_store_float(float foo, __global half* bar) +{ + __builtin_store_halff(foo, bar); +// CHECK: [[HALF_VAL:%.*]] = fptrunc float %foo to half +// CHECK: store half [[HALF_VAL]], half addrspace({{.}})* %bar, align 2 +} + +// CHECK-LABEL: @test_store_double(double %foo, half addrspace({{.}}){{.*}} %bar) +__kernel void test_store_double(double foo, __global half* bar) +{ + __builtin_store_half(foo, bar); +// CHECK: [[HALF_VAL:%.*]] = fptrunc double %foo to half +// CHECK: store half [[HALF_VAL]], half addrspace({{.}})* %bar, align 2 +} + +// CHECK-LABEL: @test_load_float(float addrspace({{.}}){{.*}} %foo, half addrspace({{.}}){{.*}} %bar) +__kernel void test_load_float(__global float* foo, __global half* bar) +{ + foo[0] = __builtin_load_halff(bar); +// CHECK: [[HALF_VAL:%.*]] = load +// CHECK: [[FULL_VAL:%.*]] = fpext half [[HALF_VAL]] to float +// CHECK: store float [[FULL_VAL]], float addrspace({{.}})* %foo +} + +// CHECK-LABEL: @test_load_double(double addrspace({{.}}){{.*}} %foo, half addrspace({{.}}){{.*}} %bar) +__kernel void test_load_double(__global double* foo, __global half* bar) +{ + foo[0] = __builtin_load_half(bar); +// CHECK: [[HALF_VAL:%.*]] = load +// CHECK: [[FULL_VAL:%.*]] = fpext half [[HALF_VAL]] to double +// CHECK: store double [[FULL_VAL]], double addrspace({{.}})* %foo +} Index: lib/CodeGen/CGBuiltin.cpp === --- lib/CodeGen/CGBuiltin.cpp +++ lib/CodeGen/CGBuiltin.cpp @@ -2724,6 +2724,24 @@ Name), {NDRange, Block})); } + + case Builtin::BI__builtin_store_half: + case Builtin::BI__builtin_store_halff: { +Value *Val = EmitScalarExpr(E->getArg(0)); +Address Address = EmitPointerWithAlignment(E->getArg(1)); +Value *HalfVal = Builder.CreateFPTrunc(Val, Builder.getHalfTy()); +return RValue::get(Builder.CreateStore(HalfVal, Address)); + } + case Builtin::BI__builtin_load_half: { +Address Address = EmitPointerWithAlignment(E->getArg(0)); +Value *HalfVal = Builder.CreateLoad(Address); +return RValue::get(Builder.CreateFPExt(HalfVal, Builder.getDoubleTy())); + } + case Builtin::BI__builtin_load_halff: { +Address Address = EmitPointerWithAlignment(E->getArg(0)); +Value *HalfVal = Builder.CreateLoad(Address); +return RValue::get(Builder.CreateFPExt(HalfVal, Builder.getFloatTy())); + } case Builtin::BIprintf: if (getTarget().getTriple().isNVPTX()) return EmitNVPTXDevicePrintfCallExpr(E, ReturnValue); Index: lib/Basic/Builtins.cpp === --- lib/Basic/Builtins.cpp +++ lib/Basic/Builtins.cpp @@ -69,9 +69,14 @@ bool MSModeUnsupported = !LangOpts.MicrosoftExt && (BuiltinInfo.Langs & MS_LANG); bool ObjCUnsupported = !LangOpts.ObjC1 && BuiltinInfo.Langs == OBJC_LANG; - bool OclCUnsupported = LangOpts.OpenCLVersion != 200 && - BuiltinInfo.Langs == OCLC20_LANG; + bool OclC1Unsupported = (LangOpts.OpenCLVersion / 100) != 1 && + (BuiltinInfo.Langs & ALL_OCLC_LANGUAGES ) == OCLC1X_LANG; + bool OclC2Unsupported = LangOpts.OpenCLVersion != 200 && + (BuiltinInfo.Langs & ALL_OCLC_LANGUAGES) == OCLC20_LANG; + bool OclCUnsupported = !LangOpts.OpenCL && + (BuiltinInfo.Langs & ALL_OCLC_LANGUAGES); return !BuiltinsUnsupported && !MathBuiltinsUnsupported && !OclCUnsupported && + !OclC1Unsupported && !OclC2Unsupported && !GnuModeUnsupported && !MSModeUnsupported && !ObjCUnsupported; } Index: include/clang/Basic/Builtins.h === --- include/clang/Basic/Builtins.h +++ include/clang/Basic/Builtins.h @@ -36,10 +36,12 @@ CXX_LANG = 0x4, // builtin for cplusplus only. OBJC_LANG = 0x8,// builtin for objective-c and objective-c++ MS_LANG = 0x10, // builtin requires MS mode. - OCLC20_LANG = 0x20, // builtin for OpenCL C only. + OCLC20_LANG = 0x20, // builtin for OpenCL C 2.0 only. +
[PATCH] D37231: Add half load and store builtins
jvesely added inline comments. Comment at: include/clang/Basic/Builtins.def:1427 +// OpenCL half load/store builtin +BUILTIN(__builtin_store_half, "vdh*", "n") +BUILTIN(__builtin_store_halff, "vfh*", "n") Anastasia wrote: > I think this should be a language builtin (see above) but perhaps we might > need to extend the language version here. Because I believe we only have > OpenCL v2.0 currently. > > Also this should only be available if `cl_khr_fp16` is supported and enabled? > I think we are doing similar with some subgroups functions (e.g. > `get_kernel_sub_group_count_for_ndrange`) that are only supported by > `cl_khr_subgroup` but those have custom diagnostic though. May be we could > leave this check out since `half` is not available if `cl_khr_fp16` is not > enabled anyways. This is specifically meant to be used when `cl_khr_fp16` is **not** available. CLC allows using half as storage format and half pointers without the extension, vstore_half/vload_half are used to load/store half values. (CL1.2 CH 6.1.1.1) These builtins are not necessary if `cl_khr_fp16` is available (we can use regular loads/stores). I'll take stab at making these CLC only, but similarly to device specific builtins it looked useful beyond that, since these builtins provide access to half type storage. Repository: rL LLVM https://reviews.llvm.org/D37231 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37231: Add half load and store builtins
Anastasia added inline comments. Comment at: include/clang/Basic/Builtins.def:1427 +// OpenCL half load/store builtin +BUILTIN(__builtin_store_half, "vdh*", "n") +BUILTIN(__builtin_store_halff, "vfh*", "n") I think this should be a language builtin (see above) but perhaps we might need to extend the language version here. Because I believe we only have OpenCL v2.0 currently. Also this should only be available if `cl_khr_fp16` is supported and enabled? I think we are doing similar with some subgroups functions (e.g. `get_kernel_sub_group_count_for_ndrange`) that are only supported by `cl_khr_subgroup` but those have custom diagnostic though. May be we could leave this check out since `half` is not available if `cl_khr_fp16` is not enabled anyways. Comment at: test/CodeGenOpenCL/no-half.cl:3 + +#pragma OPENCL EXTENSION cl_khr_fp64:enable + It seems strange that `cl_khr_fp16` is not enabled too. Repository: rL LLVM https://reviews.llvm.org/D37231 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits