[PATCH] D156743: clang/OpenCL: Add inline implementations of sqrt in builtin header

2023-08-02 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

If we go with this approach, please also remove `sqrt` from 
`clang/lib/Sema/OpenCLBuiltins.td` (and ideally add a comment pointing out that 
`sqrt` is handled in `opencl-c-base.h`).




Comment at: clang/lib/Headers/opencl-c-base.h:832
+
+inline float __ovld __cnfn sqrt(float __x) {
+  return __builtin_elementwise_sqrt(__x);

Anastasia wrote:
> Is this implementation generic enough? Would some targets not need to do 
> something different for this built-in?
> 
> Ideally this header is to be kept light, so I am a bit worried about adding 
> definitions of the functions here. Otherwise we will end up in the same 
> situation as we once were with opencl-c.h. So could these be left there 
> instead? It might be good to check with @svenvh whether the TableGen header 
> already has a way to do this function forwarding or can be extended to do 
> such a thing. Then it would be implementable in both header mechanisms. I 
> don't know if Sven has some other ideas or opinions...
We did already discuss this a bit on the GitHub issue: 
https://github.com/llvm/llvm-project/issues/64264
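
For readers following along, the forwarding approach under discussion can be sketched on the host side in plain C++. This is only an illustration under stated assumptions, not the patch itself: the namespace `sketch` and the macro `SKETCH_SQRT` are hypothetical names, ordinary C++ overloading stands in for `__ovld`, and `__builtin_sqrtf` is swapped in when the compiler does not provide `__builtin_elementwise_sqrt`.

```cpp
// Host-side sketch of forwarding a library function to a compiler builtin.
// Assumption: compiled with Clang or GCC. The names SKETCH_SQRT and
// sketch::sqrt are hypothetical and are not part of opencl-c-base.h.
#if defined(__has_builtin)
#if __has_builtin(__builtin_elementwise_sqrt)
#define SKETCH_SQRT(x) __builtin_elementwise_sqrt(x)
#endif
#endif
#ifndef SKETCH_SQRT
// Fallback for compilers that lack the elementwise builtin.
#define SKETCH_SQRT(x) __builtin_sqrtf(x)
#endif

namespace sketch {
// Plain C++ overloading plays the role of __ovld here; the const attribute
// mirrors what __cnfn expands to.
__attribute__((const)) inline float sqrt(float x) { return SKETCH_SQRT(x); }
} // namespace sketch
```

Whether a single generic definition like this suits every target is exactly the open question raised in the review.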



Comment at: clang/lib/Headers/opencl-c-base.h:856-858
+// We only really want to define the float variants here. However bad things
+// seem to happen with -fdeclare-opencl-builtins and splitting the handling of
+// different overloads.




CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156743/new/

https://reviews.llvm.org/D156743

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D151339: [OpenCL] Add cl_ext_image_raw10_raw12 extension

2023-07-26 Thread Sven van Haastregt via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG5e8b44cc447e: [OpenCL] Add cl_ext_image_raw10_raw12 
extension (authored by svenvh).

Changed prior to commit:
  https://reviews.llvm.org/D151339?vs=525175&id=544292#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151339/new/

https://reviews.llvm.org/D151339

Files:
  clang/lib/Headers/opencl-c-base.h
  clang/test/Headers/opencl-c-header.cl


Index: clang/test/Headers/opencl-c-header.cl
===
--- clang/test/Headers/opencl-c-header.cl
+++ clang/test/Headers/opencl-c-header.cl
@@ -187,6 +187,9 @@
 #if __opencl_c_ext_fp64_local_atomic_min_max != 1
 #error "Incorrectly defined __opencl_c_ext_fp64_local_atomic_min_max"
 #endif
+#if __opencl_c_ext_image_raw10_raw12 != 1
+#error "Incorrectly defined __opencl_c_ext_image_raw10_raw12"
+#endif
 
 #else
 
@@ -271,6 +274,9 @@
 #ifdef __opencl_c_ext_fp64_local_atomic_min_max
 #error "Incorrectly __opencl_c_ext_fp64_local_atomic_min_max defined"
 #endif
+#ifdef __opencl_c_ext_image_raw10_raw12
+#error "Incorrect __opencl_c_ext_image_raw10_raw12 define"
+#endif
 
 #endif //(defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
 
Index: clang/lib/Headers/opencl-c-base.h
===
--- clang/lib/Headers/opencl-c-base.h
+++ clang/lib/Headers/opencl-c-base.h
@@ -45,6 +45,7 @@
 #define __opencl_c_ext_fp32_local_atomic_add 1
 #define __opencl_c_ext_fp32_global_atomic_min_max 1
 #define __opencl_c_ext_fp32_local_atomic_min_max 1
+#define __opencl_c_ext_image_raw10_raw12 1
 
 #endif // defined(__SPIR__) || defined(__SPIRV__)
 #endif // (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
@@ -477,6 +478,10 @@
 #if __OPENCL_C_VERSION__ >= CL_VERSION_3_0
 #define CLK_UNORM_INT_101010_2 0x10E0
 #endif // __OPENCL_C_VERSION__ >= CL_VERSION_3_0
+#ifdef __opencl_c_ext_image_raw10_raw12
+#define CLK_UNSIGNED_INT_RAW10_EXT 0x10E3
+#define CLK_UNSIGNED_INT_RAW12_EXT 0x10E4
+#endif // __opencl_c_ext_image_raw10_raw12
 
 // Channel order, numbering must be aligned with cl_channel_order in cl.h
 //




[PATCH] D151339: [OpenCL] Add cl_ext_image_raw10_raw12 extension

2023-05-24 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh created this revision.
svenvh added a reviewer: Anastasia.
Herald added subscribers: Naghasan, ldrumm, yaxunl.
Herald added a project: All.
svenvh requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Add the defines for the `cl_ext_image_raw10_raw12` extension.

Draft extension specification is at 
https://github.com/KhronosGroup/OpenCL-Docs/pull/919
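
As a rough illustration of how the feature macro gates the new constants, here is a minimal C++ sketch. The two constant values are taken from the patch; the helper `isRawPackedType` is hypothetical and exists only to give the macros something to do. Defining the feature macro by hand is an assumption made so the snippet stands alone; in the header it is defined only for SPIR and SPIR-V targets.

```cpp
// Sketch of a feature macro gating extension-specific constants.
// Assumption: the feature macro is defined by hand here; opencl-c-base.h
// defines it only for SPIR/SPIR-V targets.
#define __opencl_c_ext_image_raw10_raw12 1

#ifdef __opencl_c_ext_image_raw10_raw12
#define CLK_UNSIGNED_INT_RAW10_EXT 0x10E3
#define CLK_UNSIGNED_INT_RAW12_EXT 0x10E4
#endif // __opencl_c_ext_image_raw10_raw12

// Hypothetical helper: returns true when the channel data type is one of the
// raw packed formats added by the extension.
inline bool isRawPackedType(unsigned channelDataType) {
#ifdef __opencl_c_ext_image_raw10_raw12
  return channelDataType == CLK_UNSIGNED_INT_RAW10_EXT ||
         channelDataType == CLK_UNSIGNED_INT_RAW12_EXT;
#else
  (void)channelDataType;
  return false; // extension not available, so no raw formats exist
#endif
}
```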


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D151339

Files:
  clang/lib/Headers/opencl-c-base.h
  clang/test/Headers/opencl-c-header.cl


Index: clang/test/Headers/opencl-c-header.cl
===
--- clang/test/Headers/opencl-c-header.cl
+++ clang/test/Headers/opencl-c-header.cl
@@ -187,6 +187,9 @@
 #if __opencl_c_ext_fp64_local_atomic_min_max != 1
 #error "Incorrectly defined __opencl_c_ext_fp64_local_atomic_min_max"
 #endif
+#if __opencl_c_ext_image_raw10_raw12 != 1
+#error "Incorrectly defined __opencl_c_ext_image_raw10_raw12"
+#endif
 
 #else
 
@@ -271,6 +274,9 @@
 #ifdef __opencl_c_ext_fp64_local_atomic_min_max
 #error "Incorrectly __opencl_c_ext_fp64_local_atomic_min_max defined"
 #endif
+#ifdef __opencl_c_ext_image_raw10_raw12
+#error "Incorrectly __opencl_c_ext_image_raw10_raw12 defined"
+#endif
 
 #endif //(defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
 
Index: clang/lib/Headers/opencl-c-base.h
===
--- clang/lib/Headers/opencl-c-base.h
+++ clang/lib/Headers/opencl-c-base.h
@@ -45,6 +45,7 @@
 #define __opencl_c_ext_fp32_local_atomic_add 1
 #define __opencl_c_ext_fp32_global_atomic_min_max 1
 #define __opencl_c_ext_fp32_local_atomic_min_max 1
+#define __opencl_c_ext_image_raw10_raw12 1
 
 #endif // defined(__SPIR__) || defined(__SPIRV__)
 #endif // (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
@@ -474,6 +475,10 @@
 #define CLK_HALF_FLOAT        0x10DD
 #define CLK_FLOAT             0x10DE
 #define CLK_UNORM_INT24       0x10DF
+#ifdef __opencl_c_ext_image_raw10_raw12
+#define CLK_UNSIGNED_INT_RAW10_EXT 0x10E3
+#define CLK_UNSIGNED_INT_RAW12_EXT 0x10E4
+#endif // __opencl_c_ext_image_raw10_raw12
 
 // Channel order, numbering must be aligned with cl_channel_order in cl.h
 //




[PATCH] D104040: [OpenCL] Add TableGen emitter for OpenCL builtin header

2023-03-09 Thread Sven van Haastregt via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG4cb843d09942: [OpenCL] Add builtin header TableGen emitter 
(authored by svenvh).
Herald added a subscriber: mgrang.

Changed prior to commit:
  https://reviews.llvm.org/D104040?vs=417678&id=503704#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104040/new/

https://reviews.llvm.org/D104040

Files:
  clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
  clang/utils/TableGen/TableGen.cpp
  clang/utils/TableGen/TableGenBackends.h

Index: clang/utils/TableGen/TableGenBackends.h
===
--- clang/utils/TableGen/TableGenBackends.h
+++ clang/utils/TableGen/TableGenBackends.h
@@ -124,6 +124,8 @@
 
 void EmitClangOpenCLBuiltins(llvm::RecordKeeper &Records,
                              llvm::raw_ostream &OS);
+void EmitClangOpenCLBuiltinHeader(llvm::RecordKeeper &Records,
+                                  llvm::raw_ostream &OS);
 void EmitClangOpenCLBuiltinTests(llvm::RecordKeeper &Records,
                                  llvm::raw_ostream &OS);
 
Index: clang/utils/TableGen/TableGen.cpp
===
--- clang/utils/TableGen/TableGen.cpp
+++ clang/utils/TableGen/TableGen.cpp
@@ -65,6 +65,7 @@
   GenClangCommentCommandInfo,
   GenClangCommentCommandList,
   GenClangOpenCLBuiltins,
+  GenClangOpenCLBuiltinHeader,
   GenClangOpenCLBuiltinTests,
   GenArmNeon,
   GenArmFP16,
@@ -200,6 +201,9 @@
                    "documentation comments"),
         clEnumValN(GenClangOpenCLBuiltins, "gen-clang-opencl-builtins",
                    "Generate OpenCL builtin declaration handlers"),
+        clEnumValN(GenClangOpenCLBuiltinHeader,
+                   "gen-clang-opencl-builtin-header",
+                   "Generate OpenCL builtin header"),
         clEnumValN(GenClangOpenCLBuiltinTests, "gen-clang-opencl-builtin-tests",
                    "Generate OpenCL builtin declaration tests"),
         clEnumValN(GenArmNeon, "gen-arm-neon", "Generate arm_neon.h for clang"),
@@ -384,6 +388,9 @@
   case GenClangOpenCLBuiltins:
     EmitClangOpenCLBuiltins(Records, OS);
     break;
+  case GenClangOpenCLBuiltinHeader:
+    EmitClangOpenCLBuiltinHeader(Records, OS);
+    break;
   case GenClangOpenCLBuiltinTests:
     EmitClangOpenCLBuiltinTests(Records, OS);
     break;
Index: clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
===
--- clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
+++ clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
@@ -324,6 +324,18 @@
   void emit() override;
 };
 
+// OpenCL builtin header generator.  This class processes the same TableGen
+// input as BuiltinNameEmitter, but generates a .h file that contains a
+// prototype for each builtin function described in the .td input.
+class OpenCLBuiltinHeaderEmitter : public OpenCLBuiltinFileEmitterBase {
+public:
+  OpenCLBuiltinHeaderEmitter(RecordKeeper &Records, raw_ostream &OS)
+      : OpenCLBuiltinFileEmitterBase(Records, OS) {}
+
+  // Entrypoint to generate the header.
+  void emit() override;
+};
+
 } // namespace
 
 void BuiltinNameEmitter::Emit() {
@@ -1260,11 +1272,76 @@
   }
 }
 
+void OpenCLBuiltinHeaderEmitter::emit() {
+  emitSourceFileHeader("OpenCL Builtin declarations", OS);
+
+  emitExtensionSetup();
+
+  OS << R"(
+#define __ovld __attribute__((overloadable))
+#define __conv __attribute__((convergent))
+#define __purefn __attribute__((pure))
+#define __cnfn __attribute__((const))
+
+)";
+
+  // Iterate over all builtins; sort to follow order of definition in .td file.
+  std::vector<Record *> Builtins = Records.getAllDerivedDefinitions("Builtin");
+  llvm::sort(Builtins, LessRecord());
+
+  for (const auto *B : Builtins) {
+    StringRef Name = B->getValueAsString("Name");
+
+    std::string OptionalExtensionEndif = emitExtensionGuard(B);
+    std::string OptionalVersionEndif = emitVersionGuard(B);
+
+    SmallVector<SmallVector<std::string, 2>, 4> FTypes;
+    expandTypesInSignature(B->getValueAsListOfDefs("Signature"), FTypes);
+
+    for (const auto &Signature : FTypes) {
+      StringRef OptionalTypeExtEndif = emitTypeExtensionGuards(Signature);
+
+      // Emit function declaration.
+      OS << Signature[0] << " __ovld ";
+      if (B->getValueAsBit("IsConst"))
+        OS << "__cnfn ";
+      if (B->getValueAsBit("IsPure"))
+        OS << "__purefn ";
+      if (B->getValueAsBit("IsConv"))
+        OS << "__conv ";
+
+      OS << Name << "(";
+      if (Signature.size() > 1) {
+        for (unsigned I = 1; I < Signature.size(); I++) {
+          if (I != 1)
+            OS << ", ";
+          OS << Signature[I];
+        }
+      }
+      OS << ");\n";
+
+      OS << OptionalTypeExtEndif;
+    }
+
+    OS << OptionalVersionEndif;
+    OS << OptionalExtensionEndif;
+  }
+
+  OS << "\n// Disable any extensions we may have enabled previously.\n"
+        "#pragma OPENCL EXTENSION all : disable";
+}
+
 void 
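
The declaration-printing loop in `OpenCLBuiltinHeaderEmitter::emit()` above can be tried out in isolation with a small standalone sketch. Everything here is an assumption made for illustration: `BuiltinSketch` is a hypothetical stand-in for a TableGen `Builtin` record after type expansion, and `emitDeclaration` reproduces only the prototype formatting, not the extension or version guards.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for one TableGen "Builtin" record after type expansion.
struct BuiltinSketch {
  std::string Name;
  bool IsConst = false;
  bool IsPure = false;
  bool IsConv = false;
  // Signature[0] is the return type; the remaining entries are argument types.
  std::vector<std::string> Signature;
};

// Builds one prototype line, emitting attributes in the same order as the
// loop in the patch.
inline std::string emitDeclaration(const BuiltinSketch &B) {
  assert(!B.Signature.empty() && "a signature needs at least a return type");
  std::ostringstream OS;
  OS << B.Signature[0] << " __ovld ";
  if (B.IsConst)
    OS << "__cnfn ";
  if (B.IsPure)
    OS << "__purefn ";
  if (B.IsConv)
    OS << "__conv ";
  OS << B.Name << "(";
  for (std::size_t I = 1; I < B.Signature.size(); ++I) {
    if (I != 1)
      OS << ", ";
    OS << B.Signature[I];
  }
  OS << ");\n";
  return OS.str();
}
```

For a record named `sqrt` with `IsConst` set and the signature `{float, float}`, the sketch produces `float __ovld __cnfn sqrt(float);`.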

[PATCH] D143348: [Clang][Doc][OpenCL] Release 16 notes

2023-02-13 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh accepted this revision.
svenvh added inline comments.



Comment at: clang/docs/ReleaseNotes.rst:840
+  * Fixed conditional definition of the depth image and read_write image3d builtins.
+  * Added ``nounwind`` attribute to all builtin functions.
+

Anastasia wrote:
> svenvh wrote:
> > It's slightly more than that: clang adds `nounwind` not only for builtin 
> > functions, but for any OpenCL function call.
> Thanks, this makes sense... stack unwind in OpenCL kernel is meaningless atm.
> 
> However has this change been made in a separate commit?
The change of https://reviews.llvm.org/D142033 is not specific to builtins.

Please remember to decrease the indent level of this bullet point, so that it 
is no longer under "Improved builtin functions support".  Otherwise LGTM.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143348/new/

https://reviews.llvm.org/D143348



[PATCH] D143348: [Clang][Doc][OpenCL] Release 16 notes

2023-02-06 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added inline comments.



Comment at: clang/docs/ReleaseNotes.rst:840
+  * Fixed conditional definition of the depth image and read_write image3d builtins.
+  * Added ``nounwind`` attribute to all builtin functions.
+

It's slightly more than that: clang adds `nounwind` not only for builtin 
functions, but for any OpenCL function call.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143348/new/

https://reviews.llvm.org/D143348



[PATCH] D142033: [OpenCL] Always add nounwind attribute for OpenCL

2023-01-20 Thread Sven van Haastregt via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG149521091499: [OpenCL] Always add nounwind attribute for 
OpenCL (authored by svenvh).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142033/new/

https://reviews.llvm.org/D142033

Files:
  clang/lib/CodeGen/CGCall.cpp
  clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl
  clang/test/CodeGenOpenCL/convergent.cl


Index: clang/test/CodeGenOpenCL/convergent.cl
===
--- clang/test/CodeGenOpenCL/convergent.cl
+++ clang/test/CodeGenOpenCL/convergent.cl
@@ -139,4 +139,5 @@
 // CHECK: attributes #3 = { {{[^}]*}}convergent noduplicate{{[^}]*}} }
 // CHECK: attributes #4 = { {{[^}]*}}convergent{{[^}]*}} }
 // CHECK: attributes #5 = { {{[^}]*}}convergent{{[^}]*}} }
-// CHECK: attributes #6 = { {{[^}]*}}convergent noduplicate{{[^}]*}} }
+// CHECK: attributes #6 = { {{[^}]*}}nounwind{{[^}]*}} }
+// CHECK: attributes #7 = { {{[^}]*}}convergent noduplicate nounwind{{[^}]*}} }
Index: clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl
===
--- clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl
+++ clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl
@@ -298,7 +298,7 @@
 // CHECK: attributes #2 = { nocallback nofree nounwind willreturn memory(argmem: readwrite) }
 // CHECK: attributes #3 = { convergent noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="gfx900" "target-features"="+16-bit-insts,+ci-insts,+dpp,+gfx8-insts,+gfx9-insts,+s-memrealtime,+s-memtime-inst,+wavefrontsize64" }
 // CHECK: attributes #4 = { nounwind "enqueued-block" }
-// CHECK: attributes #5 = { convergent }
+// CHECK: attributes #5 = { convergent nounwind }
 //.
 // CHECK: !0 = !{i32 1, !"amdgpu_code_object_version", i32 400}
 // CHECK: !1 = !{i32 1, !"wchar_size", i32 4}
Index: clang/lib/CodeGen/CGCall.cpp
===
--- clang/lib/CodeGen/CGCall.cpp
+++ clang/lib/CodeGen/CGCall.cpp
@@ -1969,11 +1969,11 @@
     FuncAttrs.addAttribute(llvm::Attribute::Convergent);
   }
 
-  // TODO: NoUnwind attribute should be added for other GPU modes OpenCL, HIP,
+  // TODO: NoUnwind attribute should be added for other GPU modes HIP,
   // SYCL, OpenMP offload. AFAIK, none of them support exceptions in device
   // code.
-  if (getLangOpts().CUDA && getLangOpts().CUDAIsDevice) {
-    // Exceptions aren't supported in CUDA device code.
+  if ((getLangOpts().CUDA && getLangOpts().CUDAIsDevice) ||
+      getLangOpts().OpenCL) {
     FuncAttrs.addAttribute(llvm::Attribute::NoUnwind);
   }
 



[PATCH] D142033: [OpenCL] Always add nounwind attribute for OpenCL

2023-01-20 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

In D142033#4062671, @bader wrote:

> Should we generalize and rename `clang/test/CodeGenOpenCL/convergent.cl` to 
> validate function attributes other than `convergent`? It's not obvious that 
> presence of `nounwind` attribute is validated by 
> `clang/test/CodeGenOpenCL/convergent.cl`.

I think the main goal of `clang/test/CodeGenOpenCL/convergent.cl` remains 
testing `convergent`, so I'd rather not generalize this particular test.  
`nounwind` is somewhat coincidentally tested by various tests now, but we could 
add a separate test for generic attributes such as `nounwind` if you think it's 
worth doing so.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142033/new/

https://reviews.llvm.org/D142033



[PATCH] D142033: [OpenCL] Always add nounwind attribute for OpenCL

2023-01-18 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh created this revision.
svenvh added reviewers: Anastasia, yaxunl, bader.
Herald added subscribers: kosarev, Naghasan, ldrumm, kerbowa, jvesely.
Herald added a project: All.
svenvh requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Neither OpenCL nor C++ for OpenCL support exceptions, so add the
`nounwind` attribute unconditionally for those languages.

Unblocks D138958.
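
To make the effect of the changed condition concrete, here is a minimal standalone sketch. `LangOptsSketch` is a hypothetical stand-in for `clang::LangOptions` that models only the three flags the condition reads, and `addsNoUnwind` is an illustrative helper, not a function in Clang. It assumes the `OpenCL` flag is also set when compiling C++ for OpenCL, as the description above implies.

```cpp
// Sketch of the condition in CGCall.cpp after this change.
// LangOptsSketch is a hypothetical stand-in for clang::LangOptions.
struct LangOptsSketch {
  bool CUDA = false;
  bool CUDAIsDevice = false;
  bool OpenCL = false; // assumed to be set for C++ for OpenCL as well
};

// Returns true when the default function attributes would include nounwind.
inline bool addsNoUnwind(const LangOptsSketch &LO) {
  return (LO.CUDA && LO.CUDAIsDevice) || LO.OpenCL;
}
```

With this condition, any OpenCL compilation gets `nounwind`, while CUDA still gets it only for device code.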


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D142033

Files:
  clang/lib/CodeGen/CGCall.cpp
  clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl
  clang/test/CodeGenOpenCL/convergent.cl


Index: clang/test/CodeGenOpenCL/convergent.cl
===
--- clang/test/CodeGenOpenCL/convergent.cl
+++ clang/test/CodeGenOpenCL/convergent.cl
@@ -139,4 +139,5 @@
 // CHECK: attributes #3 = { {{[^}]*}}convergent noduplicate{{[^}]*}} }
 // CHECK: attributes #4 = { {{[^}]*}}convergent{{[^}]*}} }
 // CHECK: attributes #5 = { {{[^}]*}}convergent{{[^}]*}} }
-// CHECK: attributes #6 = { {{[^}]*}}convergent noduplicate{{[^}]*}} }
+// CHECK: attributes #6 = { {{[^}]*}}nounwind{{[^}]*}} }
+// CHECK: attributes #7 = { {{[^}]*}}convergent noduplicate nounwind{{[^}]*}} }
Index: clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl
===
--- clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl
+++ clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl
@@ -298,7 +298,7 @@
 // CHECK: attributes #2 = { nocallback nofree nounwind willreturn memory(argmem: readwrite) }
 // CHECK: attributes #3 = { convergent noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="gfx900" "target-features"="+16-bit-insts,+ci-insts,+dpp,+gfx8-insts,+gfx9-insts,+s-memrealtime,+s-memtime-inst,+wavefrontsize64" }
 // CHECK: attributes #4 = { nounwind "enqueued-block" }
-// CHECK: attributes #5 = { convergent }
+// CHECK: attributes #5 = { convergent nounwind }
 //.
 // CHECK: !0 = !{i32 1, !"amdgpu_code_object_version", i32 400}
 // CHECK: !1 = !{i32 1, !"wchar_size", i32 4}
Index: clang/lib/CodeGen/CGCall.cpp
===
--- clang/lib/CodeGen/CGCall.cpp
+++ clang/lib/CodeGen/CGCall.cpp
@@ -1969,11 +1969,11 @@
     FuncAttrs.addAttribute(llvm::Attribute::Convergent);
   }
 
-  // TODO: NoUnwind attribute should be added for other GPU modes OpenCL, HIP,
+  // TODO: NoUnwind attribute should be added for other GPU modes HIP,
   // SYCL, OpenMP offload. AFAIK, none of them support exceptions in device
   // code.
-  if (getLangOpts().CUDA && getLangOpts().CUDAIsDevice) {
-    // Exceptions aren't supported in CUDA device code.
+  if ((getLangOpts().CUDA && getLangOpts().CUDAIsDevice) ||
+      getLangOpts().OpenCL) {
     FuncAttrs.addAttribute(llvm::Attribute::NoUnwind);
   }
 



[PATCH] D138958: [clang] Better UX for Clang’s unwind-affecting attributes

2023-01-18 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added inline comments.



Comment at: clang/test/CodeGenOpenCL/fdeclare-opencl-builtins.cl:20
 // Test that Attr.Const from OpenCLBuiltins.td is lowered to a readnone attribute.
+// FIXME: we don't, though.
 // CHECK-LABEL: @test_const_attr

lebedev.ri wrote:
> I've looked, and i really don't understand how D64319 works.
> It seems like the AST is then serialized into a header?
> Because just adding a new attribute without spelling does not solve the issue.
We're not setting the `NoUnwind` attribute for OpenCL (yet).  The following 
quick-and-dirty patch appears to fix this test for your patch (but will cause 
other tests to fail).  If you think it's time to add `NoUnwind` now, I can try 
putting up a review for that.

```
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 276d91fa2758..1ea3c11fbe03 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1972,7 +1972,7 @@ void CodeGenModule::getDefaultFunctionAttributes(StringRef Name,
   // TODO: NoUnwind attribute should be added for other GPU modes OpenCL, HIP,
   // SYCL, OpenMP offload. AFAIK, none of them support exceptions in device
   // code.
-  if (getLangOpts().CUDA && getLangOpts().CUDAIsDevice) {
+  if ((getLangOpts().CUDA && getLangOpts().CUDAIsDevice) || getLangOpts().OpenCL) {
     // Exceptions aren't supported in CUDA device code.
     FuncAttrs.addAttribute(llvm::Attribute::NoUnwind);
   }
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138958/new/

https://reviews.llvm.org/D138958



[PATCH] D141297: [OpenCL] Allow undefining header-only features

2023-01-16 Thread Sven van Haastregt via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGa60b8f468119: [OpenCL] Allow undefining header-only features 
(authored by svenvh).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141297/new/

https://reviews.llvm.org/D141297

Files:
  clang/lib/Headers/opencl-c-base.h
  clang/test/SemaOpenCL/features.cl


Index: clang/test/SemaOpenCL/features.cl
===
--- clang/test/SemaOpenCL/features.cl
+++ clang/test/SemaOpenCL/features.cl
@@ -26,6 +26,15 @@
 // RUN: %clang_cc1 -triple spir-unknown-unknown %s -E -dM -o - -x cl -cl-std=clc++1.0 \
 // RUN:   | FileCheck -match-full-lines %s  --check-prefix=NO-FEATURES
 
+// For OpenCL C 3.0, header-only features can be disabled using macros.
+// RUN: %clang_cc1 -triple spir-unknown-unknown %s -E -dM -o - -x cl -cl-std=CL3.0 -fdeclare-opencl-builtins -finclude-default-header \
+// RUN:    -D__undef___opencl_c_work_group_collective_functions=1 \
+// RUN:    -D__undef___opencl_c_atomic_order_seq_cst=1 \
+// RUN:    -D__undef___opencl_c_atomic_scope_device=1 \
+// RUN:    -D__undef___opencl_c_atomic_scope_all_devices=1 \
+// RUN:    -D__undef___opencl_c_read_write_images=1 \
+// RUN:   | FileCheck %s --check-prefix=NO-HEADERONLY-FEATURES
+
 // Note that __opencl_c_int64 is always defined assuming
 // always compiling for FULL OpenCL profile
 
@@ -43,14 +52,20 @@
 // FEATURES: #define __opencl_c_subgroups 1
 
 // NO-FEATURES: #define __opencl_c_int64 1
-// NO-FEATURES-NOT: __opencl_c_3d_image_writes
-// NO-FEATURES-NOT: __opencl_c_atomic_order_acq_rel
-// NO-FEATURES-NOT: __opencl_c_atomic_order_seq_cst
-// NO-FEATURES-NOT: __opencl_c_device_enqueue
-// NO-FEATURES-NOT: __opencl_c_fp64
-// NO-FEATURES-NOT: __opencl_c_generic_address_space
-// NO-FEATURES-NOT: __opencl_c_images
-// NO-FEATURES-NOT: __opencl_c_pipes
-// NO-FEATURES-NOT: __opencl_c_program_scope_global_variables
-// NO-FEATURES-NOT: __opencl_c_read_write_images
-// NO-FEATURES-NOT: __opencl_c_subgroups
+// NO-FEATURES-NOT: #define __opencl_c_3d_image_writes
+// NO-FEATURES-NOT: #define __opencl_c_atomic_order_acq_rel
+// NO-FEATURES-NOT: #define __opencl_c_atomic_order_seq_cst
+// NO-FEATURES-NOT: #define __opencl_c_device_enqueue
+// NO-FEATURES-NOT: #define __opencl_c_fp64
+// NO-FEATURES-NOT: #define __opencl_c_generic_address_space
+// NO-FEATURES-NOT: #define __opencl_c_images
+// NO-FEATURES-NOT: #define __opencl_c_pipes
+// NO-FEATURES-NOT: #define __opencl_c_program_scope_global_variables
+// NO-FEATURES-NOT: #define __opencl_c_read_write_images
+// NO-FEATURES-NOT: #define __opencl_c_subgroups
+
+// NO-HEADERONLY-FEATURES-NOT: #define __opencl_c_work_group_collective_functions
+// NO-HEADERONLY-FEATURES-NOT: #define __opencl_c_atomic_order_seq_cst
+// NO-HEADERONLY-FEATURES-NOT: #define __opencl_c_atomic_scope_device
+// NO-HEADERONLY-FEATURES-NOT: #define __opencl_c_atomic_scope_all_devices
+// NO-HEADERONLY-FEATURES-NOT: #define __opencl_c_read_write_images
Index: clang/lib/Headers/opencl-c-base.h
===
--- clang/lib/Headers/opencl-c-base.h
+++ clang/lib/Headers/opencl-c-base.h
@@ -74,6 +74,25 @@
 #define __opencl_c_atomic_scope_all_devices 1
 #define __opencl_c_read_write_images 1
 #endif // defined(__SPIR__)
+
+// Undefine any feature macros that have been explicitly disabled using
+// an __undef_ macro.
+#ifdef __undef___opencl_c_work_group_collective_functions
+#undef __opencl_c_work_group_collective_functions
+#endif
+#ifdef __undef___opencl_c_atomic_order_seq_cst
+#undef __opencl_c_atomic_order_seq_cst
+#endif
+#ifdef __undef___opencl_c_atomic_scope_device
+#undef __opencl_c_atomic_scope_device
+#endif
+#ifdef __undef___opencl_c_atomic_scope_all_devices
+#undef __opencl_c_atomic_scope_all_devices
+#endif
+#ifdef __undef___opencl_c_read_write_images
+#undef __opencl_c_read_write_images
+#endif
+
 #endif // (__OPENCL_CPP_VERSION__ == 202100 || __OPENCL_C_VERSION__ == 300)
 
 #if !defined(__opencl_c_generic_address_space)



[PATCH] D141297: [OpenCL] Allow undefining header-only features

2023-01-12 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

In D141297#4043122, @Anastasia wrote:

> Btw I wonder if in the future we could add some error or warning in case 
> someone uses the same approach for frontend specific features, i.e.
>
>   #ifdef __undef___opencl_c_generic_address_space
>   #error "Feature __opencl_c_generic_address_space can only be disabled via 
> -cl-ext flag"
>   #endif

Interesting idea, but I'm a bit hesitant to do so:
it increases the size of `opencl-c-base.h`, so it will take longer to parse, 
which will affect every OpenCL compilation. Luckily we can avoid that cost if 
we keep the `__undef_` mechanism an internal solution, which it will be once we 
let `-cl-ext=-feature` generate `__undef_` macros for extensions that are not 
in `OpenCLExtensions.def`. So in the longer term, users will never have to pass 
`__undef_` macros.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141297/new/

https://reviews.llvm.org/D141297

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D141297: [OpenCL] Allow undefining header-only features

2023-01-09 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh created this revision.
svenvh added reviewers: Anastasia, FMarno.
Herald added subscribers: Naghasan, ldrumm, yaxunl.
Herald added a project: All.
svenvh requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

`opencl-c-base.h` always defines 5 particular feature macros for
SPIR-V, making it impossible to disable those features.

To allow disabling any of those features, let the header recognize
`__undef_` macros.  The user can then pass the
`-D__undef_` flag on the command line to disable a specific
feature.  The `__undef` macro could potentially also be set from
`-cl-ext=-feature`, but for now only change the header and only
provide `__undef` macros for the 5 features that are always enabled in
`opencl-c-base.h`.
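
As a rough illustration of the mechanism described above, the following
standalone C++ sketch mimics the define-then-undefine pattern that this patch
adds to `opencl-c-base.h`. It is not the header itself: only two of the five
feature macros are shown, the `*_enabled()` helpers are names made up for this
example, and the `__undef_` macro is defined inline where a real build would
pass it with `-D` on the command line.

```cpp
// Stand-in for -D__undef___opencl_c_read_write_images=1 on the command line.
#define __undef___opencl_c_read_write_images 1

// Stand-in for the unconditional definitions made by opencl-c-base.h.
#define __opencl_c_atomic_scope_device 1
#define __opencl_c_read_write_images 1

// The pattern added by the patch: drop a feature macro when the matching
// __undef_ macro has been defined.
#ifdef __undef___opencl_c_atomic_scope_device
#undef __opencl_c_atomic_scope_device
#endif
#ifdef __undef___opencl_c_read_write_images
#undef __opencl_c_read_write_images
#endif

// Hypothetical helpers so the outcome can be inspected from ordinary code.
constexpr bool atomic_scope_device_enabled() {
#ifdef __opencl_c_atomic_scope_device
  return true;
#else
  return false;
#endif
}

constexpr bool read_write_images_enabled() {
#ifdef __opencl_c_read_write_images
  return true;
#else
  return false;
#endif
}
```

In this sketch only `__opencl_c_read_write_images` has a matching `__undef_`
macro, so it is the only feature that ends up disabled.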

This is an alternative to https://reviews.llvm.org/D137652


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D141297

Files:
  clang/lib/Headers/opencl-c-base.h
  clang/test/SemaOpenCL/features.cl


Index: clang/test/SemaOpenCL/features.cl
===
--- clang/test/SemaOpenCL/features.cl
+++ clang/test/SemaOpenCL/features.cl
@@ -26,6 +26,15 @@
 // RUN: %clang_cc1 -triple spir-unknown-unknown %s -E -dM -o - -x cl -cl-std=clc++1.0 \
 // RUN:   | FileCheck -match-full-lines %s  --check-prefix=NO-FEATURES
 
+// For OpenCL C 3.0, header-only features can be disabled using macros.
+// RUN: %clang_cc1 -triple spir-unknown-unknown %s -E -dM -o - -x cl -cl-std=CL3.0 -fdeclare-opencl-builtins -finclude-default-header \
+// RUN:-D__undef___opencl_c_work_group_collective_functions=1 \
+// RUN:-D__undef___opencl_c_atomic_order_seq_cst=1 \
+// RUN:-D__undef___opencl_c_atomic_scope_device=1 \
+// RUN:-D__undef___opencl_c_atomic_scope_all_devices=1 \
+// RUN:-D__undef___opencl_c_read_write_images=1 \
+// RUN:   | FileCheck %s --check-prefix=NO-HEADERONLY-FEATURES
+
 // Note that __opencl_c_int64 is always defined assuming
 // always compiling for FULL OpenCL profile
 
@@ -43,14 +52,20 @@
 // FEATURES: #define __opencl_c_subgroups 1
 
 // NO-FEATURES: #define __opencl_c_int64 1
-// NO-FEATURES-NOT: __opencl_c_3d_image_writes
-// NO-FEATURES-NOT: __opencl_c_atomic_order_acq_rel
-// NO-FEATURES-NOT: __opencl_c_atomic_order_seq_cst
-// NO-FEATURES-NOT: __opencl_c_device_enqueue
-// NO-FEATURES-NOT: __opencl_c_fp64
-// NO-FEATURES-NOT: __opencl_c_generic_address_space
-// NO-FEATURES-NOT: __opencl_c_images
-// NO-FEATURES-NOT: __opencl_c_pipes
-// NO-FEATURES-NOT: __opencl_c_program_scope_global_variables
-// NO-FEATURES-NOT: __opencl_c_read_write_images
-// NO-FEATURES-NOT: __opencl_c_subgroups
+// NO-FEATURES-NOT: #define __opencl_c_3d_image_writes
+// NO-FEATURES-NOT: #define __opencl_c_atomic_order_acq_rel
+// NO-FEATURES-NOT: #define __opencl_c_atomic_order_seq_cst
+// NO-FEATURES-NOT: #define __opencl_c_device_enqueue
+// NO-FEATURES-NOT: #define __opencl_c_fp64
+// NO-FEATURES-NOT: #define __opencl_c_generic_address_space
+// NO-FEATURES-NOT: #define __opencl_c_images
+// NO-FEATURES-NOT: #define __opencl_c_pipes
+// NO-FEATURES-NOT: #define __opencl_c_program_scope_global_variables
+// NO-FEATURES-NOT: #define __opencl_c_read_write_images
+// NO-FEATURES-NOT: #define __opencl_c_subgroups
+
+// NO-HEADERONLY-FEATURES-NOT: #define __opencl_c_work_group_collective_functions
+// NO-HEADERONLY-FEATURES-NOT: #define __opencl_c_atomic_order_seq_cst
+// NO-HEADERONLY-FEATURES-NOT: #define __opencl_c_atomic_scope_device
+// NO-HEADERONLY-FEATURES-NOT: #define __opencl_c_atomic_scope_all_devices
+// NO-HEADERONLY-FEATURES-NOT: #define __opencl_c_read_write_images
Index: clang/lib/Headers/opencl-c-base.h
===
--- clang/lib/Headers/opencl-c-base.h
+++ clang/lib/Headers/opencl-c-base.h
@@ -74,6 +74,25 @@
 #define __opencl_c_atomic_scope_all_devices 1
 #define __opencl_c_read_write_images 1
 #endif // defined(__SPIR__)
+
+// Undefine any feature macros that have been explicitly disabled using
+// an __undef_ macro.
+#ifdef __undef___opencl_c_work_group_collective_functions
+#undef __opencl_c_work_group_collective_functions
+#endif
+#ifdef __undef___opencl_c_atomic_order_seq_cst
+#undef __opencl_c_atomic_order_seq_cst
+#endif
+#ifdef __undef___opencl_c_atomic_scope_device
+#undef __opencl_c_atomic_scope_device
+#endif
+#ifdef __undef___opencl_c_atomic_scope_all_devices
+#undef __opencl_c_atomic_scope_all_devices
+#endif
+#ifdef __undef___opencl_c_read_write_images
+#undef __opencl_c_read_write_images
+#endif
+
 #endif // (__OPENCL_CPP_VERSION__ == 202100 || __OPENCL_C_VERSION__ == 300)
 
 #if !defined(__opencl_c_generic_address_space)



[PATCH] D141008: [Clang][SPIR-V] Emit target extension types for OpenCL types on SPIR-V.

2023-01-05 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

> it may be more appropriate to make these triggered off of a hidden option 
> defaulted to off for now, or maybe based on whether or not opaque pointers 
> are enabled

There isn't really a meaningful alternative representation for these opaque 
types when opaque pointers are enabled. So it sounds reasonable to gate it on 
whether opaque pointers are enabled.




Comment at: clang/include/clang-c/Index.h:30
  * The version constants for the libclang API.
  * CINDEX_VERSION_MINOR should increase when there are API additions.
  * CINDEX_VERSION_MAJOR is intended for "major" source/ABI breaking changes.

I suppose you need to bump `CINDEX_VERSION_MINOR` for the enum additions?



Comment at: llvm/docs/SPIRVUsage.rst:103
+
+All integer arguments take the same value as they do in the SPIR-V type name.
+For example, the OpenCL type ``image2d_depth_ro_t`` would be represented in




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141008/new/

https://reviews.llvm.org/D141008



[PATCH] D137652: Remove mandatory define of optional features macros for OpenCL C 3.0

2022-11-29 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

> @svenvh I remember that we have also discussed the addition of a vendor 
> specific header where such feature/extension macro definition can be added to 
> avoid the macro pollution but I feel this is somewhat orthogonal i.e. the 
> fine grained control of macro defines is still needed?

Unfortunately I don't remember the details of that discussion, but I agree that 
it's worth looking into a solution for issue #55674, using e.g. `__undef` 
macros as you described above.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137652/new/

https://reviews.llvm.org/D137652



[PATCH] D137652: Remove mandatory define of optional features macros for OpenCL C 3.0

2022-11-24 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added inline comments.



Comment at: clang/include/clang/Basic/OpenCLExtensions.def:123
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_work_group_collective_functions, false, 300, OCL_C_30)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_int64, false, 300, OCL_C_30)
 

I am wondering why those features weren't added together with the other OpenCL 
3.0 features; there wasn't any discussion around that in D95776.  Perhaps it's 
because these don't affect the compiler behaviour directly? (but then neither 
does e.g. `__opencl_c_atomic_order_acq_rel`)  Wondering if @Anastasia has any 
insights here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137652/new/

https://reviews.llvm.org/D137652



[PATCH] D134445: [PR57881][OpenCL] Fix incorrect diagnostics with templated types in kernel arguments

2022-09-22 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

> I have attempted to work around this issue for the reported test cases, 
> however it still doesn't quite work when the type is created from the 
> template parameter (see FIXME test case in the patch). I presume we want to 
> allow this? If so we might need to disable lazy template instantiation in 
> this case. My guess is that the only issue with this is that we will have a 
> performance penalty for the code of this format:

I don't have enough experience with lazy template instantiation to give 
meaningful advice here.  Though I'm not too worried about performance penalties 
for the example you're giving, as I'd expect most realistic use cases will 
require the full instantiation of a templated type sooner or later in the 
source program anyway (e.g. near the calling point).
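
To illustrate why a type named in a parameter may not have a definition yet,
here is a small standalone sketch in plain C++ (not OpenCL; `NeedsNested` and
`is_null` are names invented for this example). Naming `NeedsNested<int>`
through a pointer does not force the class template to be instantiated, so
this compiles even though instantiating it would be an error:

```cpp
// Instantiating this template with a type that has no nested `type`
// member (such as int) would be ill-formed.
template <typename T>
struct NeedsNested {
  typename T::type value;
};

// A pointer parameter only names the specialization. A complete type is
// not required here, so NeedsNested<int> is never instantiated.
bool is_null(const NeedsNested<int> *p) { return p == nullptr; }
```

Declaring a variable of type `NeedsNested<int>` would require the complete
type and would trigger the instantiation and its error.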

> While this matter might need more thought and investigation I wonder 
> whether it makes sense to commit this fix for the time being since it's 
> fixing the reported test case at least?

That sounds reasonable to me; clang currently hard-rejects valid source due 
to an over-conservative diagnostic, which isn't a great user experience.

> So another approach we could take is to change these diagnostics into warnings 
> and then if we can't fully detect the type provide the messaging that we 
> can't detect whether the type is safe...

I think that makes sense, and should be fine to leave as a followup.




Comment at: clang/lib/Sema/SemaDecl.cpp:8856
+ if (CXXRec) {
+   if (!CXXRec->hasDefinition())
+ CXXRec = CXXRec->getTemplateInstantiationPattern();

A comment explaining what we're trying to do here would be nice.



Comment at: clang/test/SemaOpenCLCXX/invalid-kernel.clcpp:99
+struct Outer {
+struct Inner{
+int i;

Indenting would help readability.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134445/new/

https://reviews.llvm.org/D134445



[PATCH] D132473: [Docs][OpenCL][SPIR-V] Release 15 notes for Clang

2022-08-24 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh accepted this revision.
svenvh added a comment.
This revision is now accepted and ready to land.

LGTM, thanks!




Comment at: clang/docs/ReleaseNotes.rst:676
+  LLVM backend when generating SPIR-V binary.
+
 Floating Point Support in Clang

Anastasia wrote:
> svenvh wrote:
> > Should we say anything about opaque pointers?  Something like:
> > 
> > ```
> >  - Although LLVM has switched to opaque pointers with this release, SPIR-V 
> > generation still relies on typed pointers in this release.
> > ```
> Sure, good point! I have added an item to clarify this. Does it look ok to 
> you?
generation -> generator I'd say, which you can address at commit time.  For the 
rest it looks good!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D132473/new/

https://reviews.llvm.org/D132473



[PATCH] D132473: [Docs][OpenCL][SPIR-V] Release 15 notes for Clang

2022-08-23 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

In D132473#3742688, @Anastasia wrote:

> @svenvh as you made quite a lot of changes in the headers feel free to expand 
> the description if you feel we should document some of those in more detail.

I think you have captured it quite well already; they were mostly small fixes 
(although many).




Comment at: clang/docs/ReleaseNotes.rst:676
+  LLVM backend when generating SPIR-V binary.
+
 Floating Point Support in Clang

Should we say anything about opaque pointers?  Something like:

```
 - Although LLVM has switched to opaque pointers with this release, SPIR-V 
generation still relies on typed pointers in this release.
```


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D132473/new/

https://reviews.llvm.org/D132473



[PATCH] D130768: [OpenCL][SPIR-V] Add test for extern functions with a pointer

2022-08-19 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh accepted this revision.
svenvh added a comment.

LGTM.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130768/new/

https://reviews.llvm.org/D130768



[PATCH] D128436: [OpenCL] Remove fast_ half geometric builtins

2022-07-05 Thread Sven van Haastregt via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGf8e658ec9ff5: [OpenCL] Remove fast_ half geometric builtins 
(authored by svenvh).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128436/new/

https://reviews.llvm.org/D128436

Files:
  clang/lib/Headers/opencl-c.h


Index: clang/lib/Headers/opencl-c.h
===
--- clang/lib/Headers/opencl-c.h
+++ clang/lib/Headers/opencl-c.h
@@ -10467,12 +10467,6 @@
 float __ovld __cnfn fast_distance(float2, float2);
 float __ovld __cnfn fast_distance(float3, float3);
 float __ovld __cnfn fast_distance(float4, float4);
-#ifdef cl_khr_fp16
-half __ovld __cnfn fast_distance(half, half);
-half __ovld __cnfn fast_distance(half2, half2);
-half __ovld __cnfn fast_distance(half3, half3);
-half __ovld __cnfn fast_distance(half4, half4);
-#endif //cl_khr_fp16
 
 /**
  * Returns the length of vector p computed as:
@@ -10482,12 +10476,6 @@
 float __ovld __cnfn fast_length(float2);
 float __ovld __cnfn fast_length(float3);
 float __ovld __cnfn fast_length(float4);
-#ifdef cl_khr_fp16
-half __ovld __cnfn fast_length(half);
-half __ovld __cnfn fast_length(half2);
-half __ovld __cnfn fast_length(half3);
-half __ovld __cnfn fast_length(half4);
-#endif //cl_khr_fp16
 
 /**
  * Returns a vector in the same direction as p but with a
@@ -10514,12 +10502,6 @@
 float2 __ovld __cnfn fast_normalize(float2);
 float3 __ovld __cnfn fast_normalize(float3);
 float4 __ovld __cnfn fast_normalize(float4);
-#ifdef cl_khr_fp16
-half __ovld __cnfn fast_normalize(half);
-half2 __ovld __cnfn fast_normalize(half2);
-half3 __ovld __cnfn fast_normalize(half3);
-half4 __ovld __cnfn fast_normalize(half4);
-#endif //cl_khr_fp16
 
 // OpenCL v1.1 s6.11.6, v1.2 s6.12.6, v2.0 s6.13.6 - Relational Functions
 




[PATCH] D128434: [OpenCL] Remove half scalar vload/vstore builtins

2022-06-30 Thread Sven van Haastregt via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG1d421e6e3b78: [OpenCL] Remove half scalar vload/vstore 
builtins (authored by svenvh).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128434/new/

https://reviews.llvm.org/D128434

Files:
  clang/lib/Headers/opencl-c.h


Index: clang/lib/Headers/opencl-c.h
===
--- clang/lib/Headers/opencl-c.h
+++ clang/lib/Headers/opencl-c.h
@@ -11255,7 +11255,6 @@
 #endif //cl_khr_fp64
 
 #ifdef cl_khr_fp16
-half __ovld __purefn vload(size_t, const __constant half *);
 half2 __ovld __purefn vload2(size_t, const __constant half *);
 half3 __ovld __purefn vload3(size_t, const __constant half *);
 half4 __ovld __purefn vload4(size_t, const __constant half *);
@@ -11319,7 +11318,6 @@
 #endif //cl_khr_fp64
 
 #ifdef cl_khr_fp16
-half __ovld __purefn vload(size_t, const half *);
 half2 __ovld __purefn vload2(size_t, const half *);
 half3 __ovld __purefn vload3(size_t, const half *);
 half4 __ovld __purefn vload4(size_t, const half *);
@@ -11484,19 +11482,16 @@
 #endif //cl_khr_fp64
 
 #ifdef cl_khr_fp16
-half __ovld __purefn vload(size_t, const __global half *);
 half2 __ovld __purefn vload2(size_t, const __global half *);
 half3 __ovld __purefn vload3(size_t, const __global half *);
 half4 __ovld __purefn vload4(size_t, const __global half *);
 half8 __ovld __purefn vload8(size_t, const __global half *);
 half16 __ovld __purefn vload16(size_t, const __global half *);
-half __ovld __purefn vload(size_t, const __local half *);
 half2 __ovld __purefn vload2(size_t, const __local half *);
 half3 __ovld __purefn vload3(size_t, const __local half *);
 half4 __ovld __purefn vload4(size_t, const __local half *);
 half8 __ovld __purefn vload8(size_t, const __local half *);
 half16 __ovld __purefn vload16(size_t, const __local half *);
-half __ovld __purefn vload(size_t, const __private half *);
 half2 __ovld __purefn vload2(size_t, const __private half *);
 half3 __ovld __purefn vload3(size_t, const __private half *);
 half4 __ovld __purefn vload4(size_t, const __private half *);
@@ -11559,7 +11554,6 @@
 void __ovld vstore16(double16, size_t, double *);
 #endif //cl_khr_fp64
 #ifdef cl_khr_fp16
-void __ovld vstore(half, size_t, half *);
 void __ovld vstore2(half2, size_t, half *);
 void __ovld vstore3(half3, size_t, half *);
 void __ovld vstore4(half4, size_t, half *);
@@ -11722,19 +11716,16 @@
 void __ovld vstore16(double16, size_t, __private double *);
 #endif //cl_khr_fp64
 #ifdef cl_khr_fp16
-void __ovld vstore(half, size_t, __global half *);
 void __ovld vstore2(half2, size_t, __global half *);
 void __ovld vstore3(half3, size_t, __global half *);
 void __ovld vstore4(half4, size_t, __global half *);
 void __ovld vstore8(half8, size_t, __global half *);
 void __ovld vstore16(half16, size_t, __global half *);
-void __ovld vstore(half, size_t, __local half *);
 void __ovld vstore2(half2, size_t, __local half *);
 void __ovld vstore3(half3, size_t, __local half *);
 void __ovld vstore4(half4, size_t, __local half *);
 void __ovld vstore8(half8, size_t, __local half *);
 void __ovld vstore16(half16, size_t, __local half *);
-void __ovld vstore(half, size_t, __private half *);
 void __ovld vstore2(half2, size_t, __private half *);
 void __ovld vstore3(half3, size_t, __private half *);
 void __ovld vstore4(half4, size_t, __private half *);



[PATCH] D127961: [OpenCL] Reduce emitting candidate notes for builtins

2022-06-27 Thread Sven van Haastregt via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG663e47a50f50: [OpenCL] Reduce emitting candidate notes for 
builtins (authored by svenvh).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127961/new/

https://reviews.llvm.org/D127961

Files:
  clang/lib/Sema/SemaOverload.cpp
  clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl


Index: clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
===
--- clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
+++ clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
@@ -171,14 +171,14 @@
 // extension is disabled.  Test this by counting the number of notes about
 // candidate functions.
 void test_atomic_double_reporting(volatile __generic atomic_int *a) {
-  atomic_init(a);
+  atomic_init(a, a);
   // expected-error@-1{{no matching function for call to 'atomic_init'}}
 #if defined(NO_FP64)
   // Expecting 5 candidates: int, uint, long, ulong, float
-  // expected-note@-4 5 {{candidate function not viable: requires 2 arguments, but 1 was provided}}
+  // expected-note@-4 5 {{candidate function not viable: no known conversion}}
 #else
   // Expecting 6 candidates: int, uint, long, ulong, float, double
-  // expected-note@-7 6 {{candidate function not viable: requires 2 arguments, but 1 was provided}}
+  // expected-note@-7 6 {{candidate function not viable: no known conversion}}
 #endif
 }
 
@@ -198,7 +198,6 @@
 
   atomic_exchange_explicit(a_int, d, memory_order_seq_cst);
   // expected-error@-1{{no matching function for call to 'atomic_exchange_explicit'}}
-  // expected-note@-2 + {{candidate function not viable}}
 
   atomic_exchange_explicit(a_int, d, memory_order_seq_cst, memory_scope_work_group);
 }
@@ -272,9 +271,7 @@
   res = read_imageh(image_read_only_image2d, i2);
 #if __OPENCL_C_VERSION__ < CL_VERSION_1_2 && !defined(__OPENCL_CPP_VERSION__)
   // expected-error@-3{{no matching function for call to 'read_imagef'}}
-  // expected-note@-4 + {{candidate function not viable}}
-  // expected-error@-4{{no matching function for call to 'read_imageh'}}
-  // expected-note@-5 + {{candidate function not viable}}
+  // expected-error@-3{{no matching function for call to 'read_imageh'}}
 #endif
   res = read_imageh(image_read_only_image2d, sampler, i2);
 
@@ -304,7 +301,6 @@
   write_imagef(image3dwo, i4, i, f4);
 #if __OPENCL_C_VERSION__ <= CL_VERSION_1_2 && !defined(__OPENCL_CPP_VERSION__)
   // expected-error@-2{{no matching function for call to 'write_imagef'}}
-  // expected-note@-3 + {{candidate function not viable}}
 #endif
 }
 
Index: clang/lib/Sema/SemaOverload.cpp
===
--- clang/lib/Sema/SemaOverload.cpp
+++ clang/lib/Sema/SemaOverload.cpp
@@ -11266,6 +11266,13 @@
   if (shouldSkipNotingLambdaConversionDecl(Fn))
 return;
 
+  // There is no physical candidate declaration to point to for OpenCL builtins.
+  // Except for failed conversions, the notes are identical for each candidate,
+  // so do not generate such notes.
+  if (S.getLangOpts().OpenCL && Fn->isImplicit() &&
+      Cand->FailureKind != ovl_fail_bad_conversion)
+    return;
+
   // Note deleted candidates, but only if they're viable.
   if (Cand->Viable) {
 if (Fn->isDeleted()) {
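
The early return added to `SemaOverload.cpp` above can be read as a small
predicate. The sketch below is a simplified standalone model of that decision,
not Clang code: the `FailureKind` enum, the `Candidate` struct and
`shouldEmitNote` are stand-ins invented for this example, and only the
condition itself is taken from the patch.

```cpp
// Illustrative stand-in for the failure kinds of an overload candidate.
enum class FailureKind { TooFewArguments, TooManyArguments, BadConversion };

// Illustrative stand-in for the state the real check consults.
struct Candidate {
  bool isOpenCL;    // models S.getLangOpts().OpenCL
  bool isImplicit;  // models Fn->isImplicit(), true for OpenCL builtins
  FailureKind kind; // models Cand->FailureKind
};

// Returns true when a "candidate function not viable" note would still be
// emitted. Notes for implicit OpenCL builtins are kept only when the failure
// is a bad conversion, since only those notes differ per candidate.
bool shouldEmitNote(const Candidate &c) {
  return !(c.isOpenCL && c.isImplicit &&
           c.kind != FailureKind::BadConversion);
}
```

Under this model, the wrong-argument-count notes disappear for builtins while
the "no known conversion" notes remain. That matches the updated expectations
in `fdeclare-opencl-builtins.cl`.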



[PATCH] D128436: [OpenCL] Remove fast_ half geometric builtins

2022-06-23 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh created this revision.
svenvh added reviewers: Anastasia, stuart, azabaznov.
svenvh added a project: clang.
Herald added subscribers: Naghasan, ldrumm, yaxunl.
Herald added a project: All.
svenvh requested review of this revision.
Herald added a subscriber: cfe-commits.

These are not mentioned in the OpenCL C Specification nor in the
OpenCL Extension Specification.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D128436

Files:
  clang/lib/Headers/opencl-c.h


Index: clang/lib/Headers/opencl-c.h
===
--- clang/lib/Headers/opencl-c.h
+++ clang/lib/Headers/opencl-c.h
@@ -10467,12 +10467,6 @@
 float __ovld __cnfn fast_distance(float2, float2);
 float __ovld __cnfn fast_distance(float3, float3);
 float __ovld __cnfn fast_distance(float4, float4);
-#ifdef cl_khr_fp16
-half __ovld __cnfn fast_distance(half, half);
-half __ovld __cnfn fast_distance(half2, half2);
-half __ovld __cnfn fast_distance(half3, half3);
-half __ovld __cnfn fast_distance(half4, half4);
-#endif //cl_khr_fp16
 
 /**
  * Returns the length of vector p computed as:
@@ -10482,12 +10476,6 @@
 float __ovld __cnfn fast_length(float2);
 float __ovld __cnfn fast_length(float3);
 float __ovld __cnfn fast_length(float4);
-#ifdef cl_khr_fp16
-half __ovld __cnfn fast_length(half);
-half __ovld __cnfn fast_length(half2);
-half __ovld __cnfn fast_length(half3);
-half __ovld __cnfn fast_length(half4);
-#endif //cl_khr_fp16
 
 /**
  * Returns a vector in the same direction as p but with a
@@ -10514,12 +10502,6 @@
 float2 __ovld __cnfn fast_normalize(float2);
 float3 __ovld __cnfn fast_normalize(float3);
 float4 __ovld __cnfn fast_normalize(float4);
-#ifdef cl_khr_fp16
-half __ovld __cnfn fast_normalize(half);
-half2 __ovld __cnfn fast_normalize(half2);
-half3 __ovld __cnfn fast_normalize(half3);
-half4 __ovld __cnfn fast_normalize(half4);
-#endif //cl_khr_fp16
 
 // OpenCL v1.1 s6.11.6, v1.2 s6.12.6, v2.0 s6.13.6 - Relational Functions
 




[PATCH] D128434: [OpenCL] Remove half scalar vload/vstore builtins

2022-06-23 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh created this revision.
svenvh added reviewers: Anastasia, stuart, azabaznov.
svenvh added a project: clang.
Herald added subscribers: Naghasan, ldrumm, yaxunl.
Herald added a project: All.
svenvh requested review of this revision.
Herald added a subscriber: cfe-commits.

These are not mentioned in the OpenCL C Specification nor in the
OpenCL Extension Specification.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D128434

Files:
  clang/lib/Headers/opencl-c.h


Index: clang/lib/Headers/opencl-c.h
===
--- clang/lib/Headers/opencl-c.h
+++ clang/lib/Headers/opencl-c.h
@@ -11255,7 +11255,6 @@
 #endif //cl_khr_fp64
 
 #ifdef cl_khr_fp16
-half __ovld __purefn vload(size_t, const __constant half *);
 half2 __ovld __purefn vload2(size_t, const __constant half *);
 half3 __ovld __purefn vload3(size_t, const __constant half *);
 half4 __ovld __purefn vload4(size_t, const __constant half *);
@@ -11319,7 +11318,6 @@
 #endif //cl_khr_fp64
 
 #ifdef cl_khr_fp16
-half __ovld __purefn vload(size_t, const half *);
 half2 __ovld __purefn vload2(size_t, const half *);
 half3 __ovld __purefn vload3(size_t, const half *);
 half4 __ovld __purefn vload4(size_t, const half *);
@@ -11484,19 +11482,16 @@
 #endif //cl_khr_fp64
 
 #ifdef cl_khr_fp16
-half __ovld __purefn vload(size_t, const __global half *);
 half2 __ovld __purefn vload2(size_t, const __global half *);
 half3 __ovld __purefn vload3(size_t, const __global half *);
 half4 __ovld __purefn vload4(size_t, const __global half *);
 half8 __ovld __purefn vload8(size_t, const __global half *);
 half16 __ovld __purefn vload16(size_t, const __global half *);
-half __ovld __purefn vload(size_t, const __local half *);
 half2 __ovld __purefn vload2(size_t, const __local half *);
 half3 __ovld __purefn vload3(size_t, const __local half *);
 half4 __ovld __purefn vload4(size_t, const __local half *);
 half8 __ovld __purefn vload8(size_t, const __local half *);
 half16 __ovld __purefn vload16(size_t, const __local half *);
-half __ovld __purefn vload(size_t, const __private half *);
 half2 __ovld __purefn vload2(size_t, const __private half *);
 half3 __ovld __purefn vload3(size_t, const __private half *);
 half4 __ovld __purefn vload4(size_t, const __private half *);
@@ -11559,7 +11554,6 @@
 void __ovld vstore16(double16, size_t, double *);
 #endif //cl_khr_fp64
 #ifdef cl_khr_fp16
-void __ovld vstore(half, size_t, half *);
 void __ovld vstore2(half2, size_t, half *);
 void __ovld vstore3(half3, size_t, half *);
 void __ovld vstore4(half4, size_t, half *);
@@ -11722,19 +11716,16 @@
 void __ovld vstore16(double16, size_t, __private double *);
 #endif //cl_khr_fp64
 #ifdef cl_khr_fp16
-void __ovld vstore(half, size_t, __global half *);
 void __ovld vstore2(half2, size_t, __global half *);
 void __ovld vstore3(half3, size_t, __global half *);
 void __ovld vstore4(half4, size_t, __global half *);
 void __ovld vstore8(half8, size_t, __global half *);
 void __ovld vstore16(half16, size_t, __global half *);
-void __ovld vstore(half, size_t, __local half *);
 void __ovld vstore2(half2, size_t, __local half *);
 void __ovld vstore3(half3, size_t, __local half *);
 void __ovld vstore4(half4, size_t, __local half *);
 void __ovld vstore8(half8, size_t, __local half *);
 void __ovld vstore16(half16, size_t, __local half *);
-void __ovld vstore(half, size_t, __private half *);
 void __ovld vstore2(half2, size_t, __private half *);
 void __ovld vstore3(half3, size_t, __private half *);
 void __ovld vstore4(half4, size_t, __private half *);



[PATCH] D127961: [OpenCL] Reduce emitting candidate notes for builtins

2022-06-17 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added inline comments.



Comment at: clang/lib/Sema/SemaOverload.cpp:11224
+  // so do not generate such notes.
+  if (S.getLangOpts().OpenCL && Fn->isImplicit() &&
+  Cand->FailureKind != ovl_fail_bad_conversion)

Anastasia wrote:
> It would have been nice to print each of those overloads but my guess is that 
> it's too much work?
It's not trivial to print those overloads because we don't have a real source 
declaration, but even if it were trivial I am not sure there is much value in 
printing all overloads.  Typically there are a lot of overloads for OpenCL 
builtins, partly because of all the vector versions.  I don't think a user will 
get much value out of screens full of overloads that didn't match.

For example, for the following code
```
int i, j, k;
i = max(i, j, k);
```

without this patch clang produces 121 note diagnostics.  If we manage to fit 
the diagnostic and candidate on a single line (which I doubt we can, normally 
they take 3 lines each), a user will still have to scroll through a few screens 
(on a 50-line terminal) of note diagnostics before reaching the actual error 
diagnostic.
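
As a rough check of that estimate, the arithmetic can be sketched in C++.  The 
helper name `screensNeeded` is made up for illustration; the figures (121 
notes, 3 lines per note, a 50-line terminal) are the ones quoted above.

```cpp
// Hypothetical helper: number of full or partial terminal screens needed to
// display `notes` diagnostics that occupy `linesPerNote` lines each.
constexpr int screensNeeded(int notes, int linesPerNote, int terminalLines) {
  int totalLines = notes * linesPerNote;
  // Integer ceiling division: a partially filled screen still counts.
  return (totalLines + terminalLines - 1) / terminalLines;
}
```

With one line per note this gives 3 screens for 121 notes; with the usual 3 
lines per note it gives 8.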


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127961/new/

https://reviews.llvm.org/D127961



[PATCH] D127961: [OpenCL] Reduce emitting candidate notes for builtins

2022-06-16 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh created this revision.
svenvh added a reviewer: Anastasia.
svenvh added a project: clang.
Herald added subscribers: Naghasan, ldrumm, yaxunl.
Herald added a project: All.
svenvh requested review of this revision.
Herald added a subscriber: cfe-commits.

When overload resolution fails, clang emits a note diagnostic for each
candidate.  For OpenCL builtins this often leads to many repeated note
diagnostics with no new information.  Stop emitting such notes.

Update a test that was relying on counting those notes to check how
many builtins are available for certain extension configurations.
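
The filtering rule can be modelled in isolation roughly as follows.  This is 
only a sketch: `FailureKind`, `Candidate` and `shouldNoteCandidate` are 
simplified stand-ins invented for this example, not the clang declarations 
touched by the patch.

```cpp
// Simplified stand-ins for the clang types involved; these are NOT the real
// clang declarations, only enough to model the new early return.
enum class FailureKind { BadConversion, TooManyArguments, TooFewArguments };

struct Candidate {
  bool isImplicit;     // true for builtins that have no source declaration
  FailureKind failure; // why overload resolution rejected this candidate
};

// Returns true when a "candidate function not viable" note would still be
// emitted under the rule described above.
bool shouldNoteCandidate(bool isOpenCL, const Candidate &cand) {
  // Implicit OpenCL builtins only keep their notes for failed conversions,
  // because every other note is identical across candidates.
  if (isOpenCL && cand.isImplicit &&
      cand.failure != FailureKind::BadConversion)
    return false;
  return true;
}
```

Under this model the one-argument `atomic_init(a)` call no longer produces 
notes (wrong argument count), whereas `atomic_init(a, a)` still produces one 
note per candidate (failed conversion), which is why the test switches to the 
latter.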


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D127961

Files:
  clang/lib/Sema/SemaOverload.cpp
  clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl


Index: clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
===
--- clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
+++ clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
@@ -171,14 +171,14 @@
 // extension is disabled.  Test this by counting the number of notes about
 // candidate functions.
 void test_atomic_double_reporting(volatile __generic atomic_int *a) {
-  atomic_init(a);
+  atomic_init(a, a);
   // expected-error@-1{{no matching function for call to 'atomic_init'}}
 #if defined(NO_FP64)
   // Expecting 5 candidates: int, uint, long, ulong, float
-  // expected-note@-4 5 {{candidate function not viable: requires 2 arguments, but 1 was provided}}
+  // expected-note@-4 5 {{candidate function not viable: no known conversion}}
 #else
   // Expecting 6 candidates: int, uint, long, ulong, float, double
-  // expected-note@-7 6 {{candidate function not viable: requires 2 arguments, but 1 was provided}}
+  // expected-note@-7 6 {{candidate function not viable: no known conversion}}
 #endif
 }
 
@@ -198,7 +198,6 @@
 
   atomic_exchange_explicit(a_int, d, memory_order_seq_cst);
   // expected-error@-1{{no matching function for call to 'atomic_exchange_explicit'}}
-  // expected-note@-2 + {{candidate function not viable}}
 
   atomic_exchange_explicit(a_int, d, memory_order_seq_cst, memory_scope_work_group);
 }
@@ -272,9 +271,7 @@
   res = read_imageh(image_read_only_image2d, i2);
 #if __OPENCL_C_VERSION__ < CL_VERSION_1_2 && !defined(__OPENCL_CPP_VERSION__)
   // expected-error@-3{{no matching function for call to 'read_imagef'}}
-  // expected-note@-4 + {{candidate function not viable}}
-  // expected-error@-4{{no matching function for call to 'read_imageh'}}
-  // expected-note@-5 + {{candidate function not viable}}
+  // expected-error@-3{{no matching function for call to 'read_imageh'}}
 #endif
   res = read_imageh(image_read_only_image2d, sampler, i2);
 
@@ -304,7 +301,6 @@
   write_imagef(image3dwo, i4, i, f4);
 #if __OPENCL_C_VERSION__ <= CL_VERSION_1_2 && !defined(__OPENCL_CPP_VERSION__)
   // expected-error@-2{{no matching function for call to 'write_imagef'}}
-  // expected-note@-3 + {{candidate function not viable}}
 #endif
 }
 
Index: clang/lib/Sema/SemaOverload.cpp
===
--- clang/lib/Sema/SemaOverload.cpp
+++ clang/lib/Sema/SemaOverload.cpp
@@ -11218,6 +11218,13 @@
   if (shouldSkipNotingLambdaConversionDecl(Fn))
     return;
 
+  // There is no physical candidate declaration to point to for OpenCL builtins.
+  // Except for failed conversions, the notes are identical for each candidate,
+  // so do not generate such notes.
+  if (S.getLangOpts().OpenCL && Fn->isImplicit() &&
+      Cand->FailureKind != ovl_fail_bad_conversion)
+    return;
+
   // Note deleted candidates, but only if they're viable.
   if (Cand->Viable) {
     if (Fn->isDeleted()) {



[PATCH] D126660: [OpenCL] Reword unknown extension pragma diagnostic

2022-06-15 Thread Sven van Haastregt via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG7acc88be0312: [OpenCL] Reword unknown extension pragma 
diagnostic (authored by svenvh).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D126660/new/

https://reviews.llvm.org/D126660

Files:
  clang/include/clang/Basic/DiagnosticParseKinds.td
  clang/test/Headers/opencl-c-header.cl
  clang/test/Parser/opencl-pragma.cl
  clang/test/SemaOpenCL/extension-begin.cl
  clang/test/SemaOpenCL/extension-version.cl

Index: clang/test/SemaOpenCL/extension-version.cl
===
--- clang/test/SemaOpenCL/extension-version.cl
+++ clang/test/SemaOpenCL/extension-version.cl
@@ -217,51 +217,51 @@
 // Check that pragmas for the OpenCL 3.0 features are rejected.
 
 #pragma OPENCL EXTENSION __opencl_c_int64 : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_int64' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_int64' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_3d_image_writes : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_3d_image_writes' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_3d_image_writes' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_atomic_order_acq_rel : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_atomic_order_acq_rel' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_atomic_order_acq_rel' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_atomic_order_seq_cst : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_atomic_order_seq_cst' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_atomic_order_seq_cst' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_device_enqueue : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_device_enqueue' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_device_enqueue' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_fp64 : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_fp64' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_fp64' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_generic_address_space : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_generic_address_space' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_generic_address_space' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_images : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_images' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_images' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_pipes : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_pipes' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_pipes' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_program_scope_global_variables : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_program_scope_global_variables' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_program_scope_global_variables' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_read_write_images : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_read_write_images' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_read_write_images' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_subgroups : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_subgroups' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_subgroups' unknown or does not require pragma - ignoring}}
 
 #pragma OPENCL EXTENSION __opencl_c_int64 : enable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_int64' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_int64' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_3d_image_writes : enable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_3d_image_writes' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_3d_image_writes' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_atomic_order_acq_rel : enable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_atomic_order_acq_rel' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_atomic_order_acq_rel' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_atomic_order_seq_cst : enable
-//expected-warning@-1{{unknown OpenCL extension 

[PATCH] D126660: [OpenCL] Reword unknown extension pragma diagnostic

2022-06-01 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

In D126660#3550356, @Anastasia wrote:

> Ok, makes sense! Thanks!
>
> Btw I was thinking we should provide some way for developers to know what 
> extensions are being supported either through documentation or by querying 
> clang somehow? I am guessing documentation would be easier to implement but 
> harder to keep in sync?

I agree that would be nice to have.  That would also enable us to distinguish 
valid pragmaless extensions from garbage (e.g. mistyped extensions), and would 
also help `-cl-ext` parsing/diagnosing for example.  But that would require us 
to maintain a list of extensions inside Clang, which I believe is what we 
wanted to move away from (especially for extensions that only add library-like 
functionality)?
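
To make the trade-off concrete, here is a sketch of what such a list-based 
classification could look like.  Everything in it is hypothetical: Clang has 
no `classifyExtension` function, and the two sets hold only sample entries 
chosen for illustration.

```cpp
#include <set>
#include <string>

enum class PragmaStatus { RequiresPragma, Pragmaless, Unknown };

// Hypothetical classification; the sets below are example data only and
// would have to be maintained inside the compiler to be of any use.
PragmaStatus classifyExtension(const std::string &name) {
  static const std::set<std::string> requiresPragma = {"cl_khr_fp16"};
  static const std::set<std::string> pragmaless = {"cl_khr_subgroup_shuffle"};

  if (requiresPragma.count(name))
    return PragmaStatus::RequiresPragma;
  if (pragmaless.count(name))
    return PragmaStatus::Pragmaless;
  // Anything else, including mistyped names, is reported as unknown.
  return PragmaStatus::Unknown;
}
```

The lookup itself is trivial; the cost is in keeping the sets in sync with 
every new extension, which is the maintenance burden mentioned above.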


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D126660/new/

https://reviews.llvm.org/D126660



[PATCH] D126660: [OpenCL] Reword unknown extension pragma diagnostic

2022-05-30 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh created this revision.
svenvh added reviewers: Anastasia, stuart.
svenvh added a project: clang.
Herald added subscribers: Naghasan, ldrumm, yaxunl.
Herald added a project: All.
svenvh requested review of this revision.
Herald added a subscriber: cfe-commits.

For newer OpenCL extensions that do not require a pragma, such as
`cl_khr_subgroup_shuffle`, a user could still accidentally attempt to
use a pragma.  This would result in a warning

  "unknown OpenCL extension 'cl_khr_subgroup_shuffle' - ignoring"

which could be mistakenly interpreted as "Clang does not support this
extension at all" instead of "Clang does not require any pragma for
this extension".
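
For comparison, the reworded text for the same extension can be produced with 
a small helper.  This is only an illustration: `unknownExtensionWarning` is a 
made-up name, and Clang generates the message from the diagnostic definition 
in `DiagnosticParseKinds.td`, not from a function like this.

```cpp
#include <string>

// Hypothetical helper that builds the reworded warning text for a given
// extension name, following the wording used in the updated tests.
std::string unknownExtensionWarning(const std::string &ext) {
  return "OpenCL extension '" + ext +
         "' unknown or does not require pragma - ignoring";
}
```

For `cl_khr_subgroup_shuffle` this yields "OpenCL extension 
'cl_khr_subgroup_shuffle' unknown or does not require pragma - ignoring", 
which no longer suggests that the extension is unsupported.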


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D126660

Files:
  clang/include/clang/Basic/DiagnosticParseKinds.td
  clang/test/Headers/opencl-c-header.cl
  clang/test/Parser/opencl-pragma.cl
  clang/test/SemaOpenCL/extension-begin.cl
  clang/test/SemaOpenCL/extension-version.cl

Index: clang/test/SemaOpenCL/extension-version.cl
===
--- clang/test/SemaOpenCL/extension-version.cl
+++ clang/test/SemaOpenCL/extension-version.cl
@@ -217,51 +217,51 @@
 // Check that pragmas for the OpenCL 3.0 features are rejected.
 
 #pragma OPENCL EXTENSION __opencl_c_int64 : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_int64' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_int64' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_3d_image_writes : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_3d_image_writes' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_3d_image_writes' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_atomic_order_acq_rel : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_atomic_order_acq_rel' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_atomic_order_acq_rel' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_atomic_order_seq_cst : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_atomic_order_seq_cst' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_atomic_order_seq_cst' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_device_enqueue : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_device_enqueue' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_device_enqueue' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_fp64 : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_fp64' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_fp64' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_generic_address_space : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_generic_address_space' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_generic_address_space' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_images : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_images' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_images' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_pipes : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_pipes' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_pipes' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_program_scope_global_variables : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_program_scope_global_variables' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_program_scope_global_variables' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_read_write_images : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_read_write_images' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_read_write_images' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_subgroups : disable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_subgroups' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_subgroups' unknown or does not require pragma - ignoring}}
 
 #pragma OPENCL EXTENSION __opencl_c_int64 : enable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_int64' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_int64' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_3d_image_writes : enable
-//expected-warning@-1{{unknown OpenCL extension '__opencl_c_3d_image_writes' - ignoring}}
+//expected-warning@-1{{OpenCL extension '__opencl_c_3d_image_writes' unknown or does not require pragma - 

[PATCH] D124776: [SPIR-V] Allow setting SPIR-V version via target triple

2022-05-19 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh accepted this revision.
svenvh added a comment.
This revision is now accepted and ready to land.

LGTM


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124776/new/

https://reviews.llvm.org/D124776



[PATCH] D124256: [OpenCL] Add cl_khr_subgroup_rotate builtins

2022-05-18 Thread Sven van Haastregt via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG21c29a8ae053: [OpenCL] Add cl_khr_subgroup_rotate builtins 
(authored by svenvh).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124256/new/

https://reviews.llvm.org/D124256

Files:
  clang/lib/Headers/opencl-c-base.h
  clang/lib/Headers/opencl-c.h
  clang/lib/Sema/OpenCLBuiltins.td
  clang/test/Headers/opencl-c-header.cl


Index: clang/test/Headers/opencl-c-header.cl
===
--- clang/test/Headers/opencl-c-header.cl
+++ clang/test/Headers/opencl-c-header.cl
@@ -127,6 +127,9 @@
 #if cl_khr_subgroup_clustered_reduce != 1
 #error "Incorrectly defined cl_khr_subgroup_clustered_reduce"
 #endif
+#if cl_khr_subgroup_rotate != 1
+#error "Incorrectly defined cl_khr_subgroup_rotate"
+#endif
 #if cl_khr_extended_bit_ops != 1
 #error "Incorrectly defined cl_khr_extended_bit_ops"
 #endif
@@ -208,6 +211,9 @@
 #ifdef cl_khr_subgroup_clustered_reduce
 #error "Incorrect cl_khr_subgroup_clustered_reduce define"
 #endif
+#ifdef cl_khr_subgroup_rotate
+#error "Incorrect cl_khr_subgroup_rotate define"
+#endif
 #ifdef cl_khr_extended_bit_ops
 #error "Incorrect cl_khr_extended_bit_ops define"
 #endif
Index: clang/lib/Sema/OpenCLBuiltins.td
===
--- clang/lib/Sema/OpenCLBuiltins.td
+++ clang/lib/Sema/OpenCLBuiltins.td
@@ -1845,6 +1845,12 @@
   def : Builtin<"dot_acc_sat_4x8packed_su_int", [Int, UInt, UInt, Int], Attr.Const>;
 }
 
+// Section 48.3 - cl_khr_subgroup_rotate
+let Extension = FunctionExtension<"cl_khr_subgroup_rotate"> in {
+  def : Builtin<"sub_group_rotate", [AGenType1, AGenType1, Int], Attr.Convergent>;
+  def : Builtin<"sub_group_clustered_rotate", [AGenType1, AGenType1, Int, UInt], Attr.Convergent>;
+}
+
 //
 // Arm extensions.
 let Extension = ArmIntegerDotProductInt8 in {
Index: clang/lib/Headers/opencl-c.h
===
--- clang/lib/Headers/opencl-c.h
+++ clang/lib/Headers/opencl-c.h
@@ -17275,6 +17275,40 @@
 int __ovld __cnfn dot_acc_sat_4x8packed_su_int(uint, uint, int);
 #endif // __opencl_c_integer_dot_product_input_4x8bit_packed
 
+#if defined(cl_khr_subgroup_rotate)
+char __ovld __conv sub_group_rotate(char, int);
+uchar __ovld __conv sub_group_rotate(uchar, int);
+short __ovld __conv sub_group_rotate(short, int);
+ushort __ovld __conv sub_group_rotate(ushort, int);
+int __ovld __conv sub_group_rotate(int, int);
+uint __ovld __conv sub_group_rotate(uint, int);
+long __ovld __conv sub_group_rotate(long, int);
+ulong __ovld __conv sub_group_rotate(ulong, int);
+float __ovld __conv sub_group_rotate(float, int);
+#if defined(cl_khr_fp64)
+double __ovld __conv sub_group_rotate(double, int);
+#endif // cl_khr_fp64
+#if defined(cl_khr_fp16)
+half __ovld __conv sub_group_rotate(half, int);
+#endif // cl_khr_fp16
+
+char __ovld __conv sub_group_clustered_rotate(char, int, uint);
+uchar __ovld __conv sub_group_clustered_rotate(uchar, int, uint);
+short __ovld __conv sub_group_clustered_rotate(short, int, uint);
+ushort __ovld __conv sub_group_clustered_rotate(ushort, int, uint);
+int __ovld __conv sub_group_clustered_rotate(int, int, uint);
+uint __ovld __conv sub_group_clustered_rotate(uint, int, uint);
+long __ovld __conv sub_group_clustered_rotate(long, int, uint);
+ulong __ovld __conv sub_group_clustered_rotate(ulong, int, uint);
+float __ovld __conv sub_group_clustered_rotate(float, int, uint);
+#if defined(cl_khr_fp64)
+double __ovld __conv sub_group_clustered_rotate(double, int, uint);
+#endif // cl_khr_fp64
+#if defined(cl_khr_fp16)
+half __ovld __conv sub_group_clustered_rotate(half, int, uint);
+#endif // cl_khr_fp16
+#endif // cl_khr_subgroup_rotate
+
 #if defined(cl_intel_subgroups)
 // Intel-Specific Sub Group Functions
 float   __ovld __conv intel_sub_group_shuffle( float , uint );
Index: clang/lib/Headers/opencl-c-base.h
===
--- clang/lib/Headers/opencl-c-base.h
+++ clang/lib/Headers/opencl-c-base.h
@@ -21,6 +21,7 @@
 #define cl_khr_subgroup_shuffle 1
 #define cl_khr_subgroup_shuffle_relative 1
 #define cl_khr_subgroup_clustered_reduce 1
+#define cl_khr_subgroup_rotate 1
 #define cl_khr_extended_bit_ops 1
 #define cl_khr_integer_dot_product 1
 #define __opencl_c_integer_dot_product_input_4x8bit 1


 

[PATCH] D125401: [OpenCL] Do not guard vload/store_half builtins

2022-05-17 Thread Sven van Haastregt via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGb250cca11d59: [OpenCL] Do not guard vload/store_half 
builtins (authored by svenvh).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125401/new/

https://reviews.llvm.org/D125401

Files:
  clang/lib/Headers/opencl-c-base.h
  clang/lib/Sema/OpenCLBuiltins.td
  clang/test/SemaOpenCL/half.cl

Index: clang/test/SemaOpenCL/half.cl
===
--- clang/test/SemaOpenCL/half.cl
+++ clang/test/SemaOpenCL/half.cl
@@ -1,5 +1,5 @@
 // RUN: %clang_cc1 %s -verify -pedantic -fsyntax-only -Wno-unused-value -triple spir-unknown-unknown
-// RUN: %clang_cc1 %s -verify -pedantic -fsyntax-only -Wno-unused-value -triple spir-unknown-unknown -fdeclare-opencl-builtins -finclude-default-header
+// RUN: %clang_cc1 %s -verify -pedantic -fsyntax-only -Wno-unused-value -triple spir-unknown-unknown -fdeclare-opencl-builtins -finclude-default-header -DHAVE_BUILTINS
 
 constant float f = 1.0h; // expected-error{{half precision constant requires cl_khr_fp16}}
 
@@ -22,6 +22,11 @@
   half *allowed2 = &*p;
   half *allowed3 = p + 1;
 
+#ifdef HAVE_BUILTINS
+  (void)ilogb(*p); // expected-error{{loading directly from pointer to type '__private half' requires cl_khr_fp16. Use vector data load builtin functions instead}}
+  vstore_half(42.0f, 0, p);
+#endif
+
   return h;
 }
 
@@ -49,6 +54,11 @@
   half *allowed2 = &*p;
   half *allowed3 = p + 1;
 
+#ifdef HAVE_BUILTINS
+  (void)ilogb(*p);
+  vstore_half(42.0f, 0, p);
+#endif
+
   return h;
 }
 
Index: clang/lib/Sema/OpenCLBuiltins.td
===
--- clang/lib/Sema/OpenCLBuiltins.td
+++ clang/lib/Sema/OpenCLBuiltins.td
@@ -352,9 +352,22 @@
 let Extension = Fp64TypeExt in {
   def Double: Type<"double",QualType<"Context.DoubleTy">>;
 }
+
+// The half type for builtins that require the cl_khr_fp16 extension.
 let Extension = Fp16TypeExt in {
   def Half  : Type<"half",  QualType<"Context.HalfTy">>;
 }
+
+// Without the cl_khr_fp16 extension, the half type can only be used to declare
+// a pointer.  Define const and non-const pointer types in all address spaces.
+// Use the "__half" alias to allow the TableGen emitter to distinguish the
+// (extensionless) pointee type of these pointer-to-half types from the "half"
+// type defined above that already carries the cl_khr_fp16 extension.
+foreach AS = [PrivateAS, GlobalAS, ConstantAS, LocalAS, GenericAS] in {
+  def "HalfPtr" # AS      : PointerType<Type<"__half", QualType<"Context.HalfTy">>, AS>;
+  def "HalfPtrConst" # AS : PointerType<ConstType<Type<"__half", QualType<"Context.HalfTy">>>, AS>;
+}
+
 def Size  : Type<"size_t",QualType<"Context.getSizeType()">>;
 def PtrDiff   : Type<"ptrdiff_t", QualType<"Context.getPointerDiffType()">>;
 def IntPtr: Type<"intptr_t",  QualType<"Context.getIntPtrType()">>;
@@ -877,22 +890,22 @@
 
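 multiclass VloadVstoreHalf<list<AddressSpace> addrspaces, bit defStores> {
   foreach AS = addrspaces in {
-    def : Builtin<"vload_half", [Float, Size, PointerType<ConstType<Half>, AS>], Attr.Pure>;
+    def : Builtin<"vload_half", [Float, Size, !cast<Type>("HalfPtrConst" # AS)], Attr.Pure>;
     foreach VSize = [2, 3, 4, 8, 16] in {
       foreach name = ["vload_half" # VSize, "vloada_half" # VSize] in {
-        def : Builtin<name, [VectorType<Float, VSize>, Size, PointerType<ConstType<Half>, AS>], Attr.Pure>;
+        def : Builtin<name, [VectorType<Float, VSize>, Size, !cast<Type>("HalfPtrConst" # AS)], Attr.Pure>;
       }
     }
     if defStores then {
       foreach rnd = ["", "_rte", "_rtz", "_rtp", "_rtn"] in {
         foreach name = ["vstore_half" # rnd] in {
-          def : Builtin<name, [Void, Float, Size, PointerType<Half, AS>]>;
-          def : Builtin<name, [Void, Double, Size, PointerType<Half, AS>]>;
+          def : Builtin<name, [Void, Float, Size, !cast<Type>("HalfPtr" # AS)]>;
+          def : Builtin<name, [Void, Double, Size, !cast<Type>("HalfPtr" # AS)]>;
         }
         foreach VSize = [2, 3, 4, 8, 16] in {
           foreach name = ["vstore_half" # VSize # rnd, "vstorea_half" # VSize # rnd] in {
-            def : Builtin<name, [Void, VectorType<Float, VSize>, Size, PointerType<Half, AS>]>;
-            def : Builtin<name, [Void, VectorType<Double, VSize>, Size, PointerType<Half, AS>]>;
+            def : Builtin<name, [Void, VectorType<Float, VSize>, Size, !cast<Type>("HalfPtr" # AS)]>;
+            def : Builtin<name, [Void, VectorType<Double, VSize>, Size, !cast<Type>("HalfPtr" # AS)]>;
           }
         }
       }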
Index: clang/lib/Headers/opencl-c-base.h
===
--- clang/lib/Headers/opencl-c-base.h
+++ clang/lib/Headers/opencl-c-base.h
@@ -202,6 +202,9 @@
 typedef double double16 __attribute__((ext_vector_type(16)));
 #endif
 
+// An internal alias for half, for use by OpenCLBuiltins.td.
+#define __half half
+
 #if defined(__OPENCL_CPP_VERSION__)
 #define NULL nullptr
 #elif defined(__OPENCL_C_VERSION__)
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D125401: [OpenCL] Do not guard vload/store_half builtins

2022-05-12 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh updated this revision to Diff 428952.
svenvh added a comment.

Add test case.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125401/new/

https://reviews.llvm.org/D125401

Files:
  clang/lib/Headers/opencl-c-base.h
  clang/lib/Sema/OpenCLBuiltins.td
  clang/test/SemaOpenCL/half.cl

Index: clang/test/SemaOpenCL/half.cl
===
--- clang/test/SemaOpenCL/half.cl
+++ clang/test/SemaOpenCL/half.cl
@@ -1,5 +1,5 @@
 // RUN: %clang_cc1 %s -verify -pedantic -fsyntax-only -Wno-unused-value -triple spir-unknown-unknown
-// RUN: %clang_cc1 %s -verify -pedantic -fsyntax-only -Wno-unused-value -triple spir-unknown-unknown -fdeclare-opencl-builtins -finclude-default-header
+// RUN: %clang_cc1 %s -verify -pedantic -fsyntax-only -Wno-unused-value -triple spir-unknown-unknown -fdeclare-opencl-builtins -finclude-default-header -DHAVE_BUILTINS
 
 constant float f = 1.0h; // expected-error{{half precision constant requires cl_khr_fp16}}
 
@@ -22,6 +22,11 @@
   half *allowed2 = &*p;
   half *allowed3 = p + 1;
 
+#ifdef HAVE_BUILTINS
+  (void)ilogb(*p); // expected-error{{loading directly from pointer to type '__private half' requires cl_khr_fp16. Use vector data load builtin functions instead}}
+  vstore_half(42.0f, 0, p);
+#endif
+
   return h;
 }
 
@@ -49,6 +54,11 @@
   half *allowed2 = &*p;
   half *allowed3 = p + 1;
 
+#ifdef HAVE_BUILTINS
+  (void)ilogb(*p);
+  vstore_half(42.0f, 0, p);
+#endif
+
   return h;
 }
 
Index: clang/lib/Sema/OpenCLBuiltins.td
===
--- clang/lib/Sema/OpenCLBuiltins.td
+++ clang/lib/Sema/OpenCLBuiltins.td
@@ -352,9 +352,22 @@
 let Extension = Fp64TypeExt in {
   def Double: Type<"double",QualType<"Context.DoubleTy">>;
 }
+
+// The half type for builtins that require the cl_khr_fp16 extension.
 let Extension = Fp16TypeExt in {
   def Half  : Type<"half",  QualType<"Context.HalfTy">>;
 }
+
+// Without the cl_khr_fp16 extension, the half type can only be used to declare
+// a pointer.  Define const and non-const pointer types in all address spaces.
+// Use the "__half" alias to allow the TableGen emitter to distinguish the
+// (extensionless) pointee type of these pointer-to-half types from the "half"
+// type defined above that already carries the cl_khr_fp16 extension.
+foreach AS = [PrivateAS, GlobalAS, ConstantAS, LocalAS, GenericAS] in {
+  def "HalfPtr" # AS      : PointerType<Type<"__half", QualType<"Context.HalfTy">>, AS>;
+  def "HalfPtrConst" # AS : PointerType<ConstType<Type<"__half", QualType<"Context.HalfTy">>>, AS>;
+}
+
 def Size  : Type<"size_t",QualType<"Context.getSizeType()">>;
 def PtrDiff   : Type<"ptrdiff_t", QualType<"Context.getPointerDiffType()">>;
 def IntPtr: Type<"intptr_t",  QualType<"Context.getIntPtrType()">>;
@@ -877,22 +890,22 @@
 
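 multiclass VloadVstoreHalf<list<AddressSpace> addrspaces, bit defStores> {
   foreach AS = addrspaces in {
-    def : Builtin<"vload_half", [Float, Size, PointerType<ConstType<Half>, AS>], Attr.Pure>;
+    def : Builtin<"vload_half", [Float, Size, !cast<Type>("HalfPtrConst" # AS)], Attr.Pure>;
     foreach VSize = [2, 3, 4, 8, 16] in {
       foreach name = ["vload_half" # VSize, "vloada_half" # VSize] in {
-        def : Builtin<name, [VectorType<Float, VSize>, Size, PointerType<ConstType<Half>, AS>], Attr.Pure>;
+        def : Builtin<name, [VectorType<Float, VSize>, Size, !cast<Type>("HalfPtrConst" # AS)], Attr.Pure>;
       }
     }
     if defStores then {
       foreach rnd = ["", "_rte", "_rtz", "_rtp", "_rtn"] in {
         foreach name = ["vstore_half" # rnd] in {
-          def : Builtin<name, [Void, Float, Size, PointerType<Half, AS>]>;
-          def : Builtin<name, [Void, Double, Size, PointerType<Half, AS>]>;
+          def : Builtin<name, [Void, Float, Size, !cast<Type>("HalfPtr" # AS)]>;
+          def : Builtin<name, [Void, Double, Size, !cast<Type>("HalfPtr" # AS)]>;
         }
         foreach VSize = [2, 3, 4, 8, 16] in {
           foreach name = ["vstore_half" # VSize # rnd, "vstorea_half" # VSize # rnd] in {
-            def : Builtin<name, [Void, VectorType<Float, VSize>, Size, PointerType<Half, AS>]>;
-            def : Builtin<name, [Void, VectorType<Double, VSize>, Size, PointerType<Half, AS>]>;
+            def : Builtin<name, [Void, VectorType<Float, VSize>, Size, !cast<Type>("HalfPtr" # AS)]>;
+            def : Builtin<name, [Void, VectorType<Double, VSize>, Size, !cast<Type>("HalfPtr" # AS)]>;
           }
         }
       }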
Index: clang/lib/Headers/opencl-c-base.h
===
--- clang/lib/Headers/opencl-c-base.h
+++ clang/lib/Headers/opencl-c-base.h
@@ -202,6 +202,9 @@
 typedef double double16 __attribute__((ext_vector_type(16)));
 #endif
 
+// An internal alias for half, for use by OpenCLBuiltins.td.
+#define __half half
+
 #if defined(__OPENCL_CPP_VERSION__)
 #define NULL nullptr
 #elif defined(__OPENCL_C_VERSION__)


[PATCH] D125401: [OpenCL] Do not guard vload/store_half builtins

2022-05-11 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh created this revision.
svenvh added a reviewer: Anastasia.
svenvh added a project: clang.
Herald added subscribers: Naghasan, ldrumm, yaxunl.
Herald added a project: All.
svenvh requested review of this revision.
Herald added a subscriber: cfe-commits.

The vload*_half* and vstore*_half* builtins do not require the
cl_khr_fp16 extension: pointers to `half` can be declared without the
extension and the _half variants of vload and vstore should be
available without the extension.

This aligns the guards for these builtins for
`-fdeclare-opencl-builtins` with `opencl-c.h`.

Fixes https://github.com/llvm/llvm-project/issues/55275
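
To illustrate why no `half` arithmetic is involved: `vload_half` only widens a stored 16-bit value to `float`. The following is a rough host-side C sketch, assuming the pointee is stored as IEEE 754 binary16. The names `emulated_vload_half` and `scale_by_pow2` are hypothetical stand-ins for illustration, not the Clang builtins or their implementation.

```c
#include <assert.h>
#include <math.h>   /* INFINITY and NAN macros only; no libm calls */
#include <stddef.h>
#include <stdint.h>

/* Multiply x by 2^e using exact float multiplications. */
static float scale_by_pow2(float x, int e) {
  while (e > 0) { x *= 2.0f; --e; }
  while (e < 0) { x *= 0.5f; ++e; }
  return x;
}

/* Hypothetical stand-in for vload_half(offset, p): read the binary16 bit
   pattern at p[offset] and return its value as a float. */
static float emulated_vload_half(size_t offset, const uint16_t *p) {
  assert(p != NULL);
  uint16_t bits = p[offset];
  int exponent = (bits >> 10) & 0x1f;   /* 5 exponent bits, bias 15 */
  int mantissa = bits & 0x3ff;          /* 10 fraction bits */
  float magnitude;
  if (exponent == 0x1f)                 /* infinity or NaN */
    magnitude = mantissa ? NAN : INFINITY;
  else if (exponent == 0)               /* zero or subnormal: m * 2^-24 */
    magnitude = scale_by_pow2((float)mantissa, -24);
  else                                  /* normal: (1024 + m) * 2^(e - 25) */
    magnitude = scale_by_pow2((float)(mantissa | 0x400), exponent - 25);
  return (bits & 0x8000) ? -magnitude : magnitude;
}
```

The caller only ever handles a pointer and a `float` result, which is why a pointer to `half` is enough and the extension is not needed for these builtins.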


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D125401

Files:
  clang/lib/Headers/opencl-c-base.h
  clang/lib/Sema/OpenCLBuiltins.td


Index: clang/lib/Sema/OpenCLBuiltins.td
===
--- clang/lib/Sema/OpenCLBuiltins.td
+++ clang/lib/Sema/OpenCLBuiltins.td
@@ -352,9 +352,22 @@
 let Extension = Fp64TypeExt in {
   def Double: Type<"double",QualType<"Context.DoubleTy">>;
 }
+
+// The half type for builtins that require the cl_khr_fp16 extension.
 let Extension = Fp16TypeExt in {
   def Half  : Type<"half",  QualType<"Context.HalfTy">>;
 }
+
+// Without the cl_khr_fp16 extension, the half type can only be used to declare
+// a pointer.  Define const and non-const pointer types in all address spaces.
+// Use the "__half" alias to allow the TableGen emitter to distinguish the
+// (extensionless) pointee type of these pointer-to-half types from the "half"
+// type defined above that already carries the cl_khr_fp16 extension.
+foreach AS = [PrivateAS, GlobalAS, ConstantAS, LocalAS, GenericAS] in {
+  def "HalfPtr" # AS      : PointerType<Type<"__half", QualType<"Context.HalfTy">>, AS>;
+  def "HalfPtrConst" # AS : PointerType<ConstType<Type<"__half", QualType<"Context.HalfTy">>>, AS>;
+}
+
 def Size  : Type<"size_t",QualType<"Context.getSizeType()">>;
 def PtrDiff   : Type<"ptrdiff_t", QualType<"Context.getPointerDiffType()">>;
 def IntPtr: Type<"intptr_t",  QualType<"Context.getIntPtrType()">>;
@@ -877,22 +890,22 @@
 
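 multiclass VloadVstoreHalf<list<AddressSpace> addrspaces, bit defStores> {
   foreach AS = addrspaces in {
-    def : Builtin<"vload_half", [Float, Size, PointerType<ConstType<Half>, AS>], Attr.Pure>;
+    def : Builtin<"vload_half", [Float, Size, !cast<Type>("HalfPtrConst" # AS)], Attr.Pure>;
     foreach VSize = [2, 3, 4, 8, 16] in {
       foreach name = ["vload_half" # VSize, "vloada_half" # VSize] in {
-        def : Builtin<name, [VectorType<Float, VSize>, Size, PointerType<ConstType<Half>, AS>], Attr.Pure>;
+        def : Builtin<name, [VectorType<Float, VSize>, Size, !cast<Type>("HalfPtrConst" # AS)], Attr.Pure>;
       }
     }
     if defStores then {
       foreach rnd = ["", "_rte", "_rtz", "_rtp", "_rtn"] in {
         foreach name = ["vstore_half" # rnd] in {
-          def : Builtin<name, [Void, Float, Size, PointerType<Half, AS>]>;
-          def : Builtin<name, [Void, Double, Size, PointerType<Half, AS>]>;
+          def : Builtin<name, [Void, Float, Size, !cast<Type>("HalfPtr" # AS)]>;
+          def : Builtin<name, [Void, Double, Size, !cast<Type>("HalfPtr" # AS)]>;
         }
         foreach VSize = [2, 3, 4, 8, 16] in {
           foreach name = ["vstore_half" # VSize # rnd, "vstorea_half" # VSize # rnd] in {
-            def : Builtin<name, [Void, VectorType<Float, VSize>, Size, PointerType<Half, AS>]>;
-            def : Builtin<name, [Void, VectorType<Double, VSize>, Size, PointerType<Half, AS>]>;
+            def : Builtin<name, [Void, VectorType<Float, VSize>, Size, !cast<Type>("HalfPtr" # AS)]>;
+            def : Builtin<name, [Void, VectorType<Double, VSize>, Size, !cast<Type>("HalfPtr" # AS)]>;
           }
         }
       }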
Index: clang/lib/Headers/opencl-c-base.h
===
--- clang/lib/Headers/opencl-c-base.h
+++ clang/lib/Headers/opencl-c-base.h
@@ -202,6 +202,9 @@
 typedef double double16 __attribute__((ext_vector_type(16)));
 #endif
 
+// An internal alias for half, for use by OpenCLBuiltins.td.
+#define __half half
+
 #if defined(__OPENCL_CPP_VERSION__)
 #define NULL nullptr
 #elif defined(__OPENCL_C_VERSION__)



[PATCH] D124256: [OpenCL] Add cl_khr_subgroup_rotate builtins

2022-05-11 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

In D124256#3474043, @Anastasia wrote:

> LGTM! I imagine tablegen side is being tested automatically?

The TableGen definitions are tested by `clang/test/Headers/opencl-builtins.cl`

> Btw do we need to set the feature macro for SPIR/SPIR-V target and then test 
> it too?

Good point!  Added.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124256/new/

https://reviews.llvm.org/D124256



[PATCH] D124256: [OpenCL] Add cl_khr_subgroup_rotate builtins

2022-05-11 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh updated this revision to Diff 428652.
svenvh added a comment.

Added macro and macro test.  Added reference to Extension spec section.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124256/new/

https://reviews.llvm.org/D124256

Files:
  clang/lib/Headers/opencl-c-base.h
  clang/lib/Headers/opencl-c.h
  clang/lib/Sema/OpenCLBuiltins.td
  clang/test/Headers/opencl-c-header.cl


Index: clang/test/Headers/opencl-c-header.cl
===
--- clang/test/Headers/opencl-c-header.cl
+++ clang/test/Headers/opencl-c-header.cl
@@ -127,6 +127,9 @@
 #if cl_khr_subgroup_clustered_reduce != 1
 #error "Incorrectly defined cl_khr_subgroup_clustered_reduce"
 #endif
+#if cl_khr_subgroup_rotate != 1
+#error "Incorrectly defined cl_khr_subgroup_rotate"
+#endif
 #if cl_khr_extended_bit_ops != 1
 #error "Incorrectly defined cl_khr_extended_bit_ops"
 #endif
@@ -208,6 +211,9 @@
 #ifdef cl_khr_subgroup_clustered_reduce
 #error "Incorrect cl_khr_subgroup_clustered_reduce define"
 #endif
+#ifdef cl_khr_subgroup_rotate
+#error "Incorrect cl_khr_subgroup_rotate define"
+#endif
 #ifdef cl_khr_extended_bit_ops
 #error "Incorrect cl_khr_extended_bit_ops define"
 #endif
Index: clang/lib/Sema/OpenCLBuiltins.td
===
--- clang/lib/Sema/OpenCLBuiltins.td
+++ clang/lib/Sema/OpenCLBuiltins.td
@@ -1832,6 +1832,12 @@
   def : Builtin<"dot_acc_sat_4x8packed_su_int", [Int, UInt, UInt, Int], Attr.Const>;
 }
 
+// Section 48.3 - cl_khr_subgroup_rotate
+let Extension = FunctionExtension<"cl_khr_subgroup_rotate"> in {
+  def : Builtin<"sub_group_rotate", [AGenType1, AGenType1, Int], Attr.Convergent>;
+  def : Builtin<"sub_group_clustered_rotate", [AGenType1, AGenType1, Int, UInt], Attr.Convergent>;
+}
+
 //
 // Arm extensions.
 let Extension = ArmIntegerDotProductInt8 in {
Index: clang/lib/Headers/opencl-c.h
===
--- clang/lib/Headers/opencl-c.h
+++ clang/lib/Headers/opencl-c.h
@@ -17275,6 +17275,40 @@
 int __ovld __cnfn dot_acc_sat_4x8packed_su_int(uint, uint, int);
 #endif // __opencl_c_integer_dot_product_input_4x8bit_packed
 
+#if defined(cl_khr_subgroup_rotate)
+char __ovld __conv sub_group_rotate(char, int);
+uchar __ovld __conv sub_group_rotate(uchar, int);
+short __ovld __conv sub_group_rotate(short, int);
+ushort __ovld __conv sub_group_rotate(ushort, int);
+int __ovld __conv sub_group_rotate(int, int);
+uint __ovld __conv sub_group_rotate(uint, int);
+long __ovld __conv sub_group_rotate(long, int);
+ulong __ovld __conv sub_group_rotate(ulong, int);
+float __ovld __conv sub_group_rotate(float, int);
+#if defined(cl_khr_fp64)
+double __ovld __conv sub_group_rotate(double, int);
+#endif // cl_khr_fp64
+#if defined(cl_khr_fp16)
+half __ovld __conv sub_group_rotate(half, int);
+#endif // cl_khr_fp16
+
+char __ovld __conv sub_group_clustered_rotate(char, int, uint);
+uchar __ovld __conv sub_group_clustered_rotate(uchar, int, uint);
+short __ovld __conv sub_group_clustered_rotate(short, int, uint);
+ushort __ovld __conv sub_group_clustered_rotate(ushort, int, uint);
+int __ovld __conv sub_group_clustered_rotate(int, int, uint);
+uint __ovld __conv sub_group_clustered_rotate(uint, int, uint);
+long __ovld __conv sub_group_clustered_rotate(long, int, uint);
+ulong __ovld __conv sub_group_clustered_rotate(ulong, int, uint);
+float __ovld __conv sub_group_clustered_rotate(float, int, uint);
+#if defined(cl_khr_fp64)
+double __ovld __conv sub_group_clustered_rotate(double, int, uint);
+#endif // cl_khr_fp64
+#if defined(cl_khr_fp16)
+half __ovld __conv sub_group_clustered_rotate(half, int, uint);
+#endif // cl_khr_fp16
+#endif // cl_khr_subgroup_rotate
+
 #if defined(cl_intel_subgroups)
 // Intel-Specific Sub Group Functions
 float   __ovld __conv intel_sub_group_shuffle( float , uint );
Index: clang/lib/Headers/opencl-c-base.h
===
--- clang/lib/Headers/opencl-c-base.h
+++ clang/lib/Headers/opencl-c-base.h
@@ -21,6 +21,7 @@
 #define cl_khr_subgroup_shuffle 1
 #define cl_khr_subgroup_shuffle_relative 1
 #define cl_khr_subgroup_clustered_reduce 1
+#define cl_khr_subgroup_rotate 1
 #define cl_khr_extended_bit_ops 1
 #define cl_khr_integer_dot_product 1
 #define __opencl_c_integer_dot_product_input_4x8bit 1



[PATCH] D125243: [OpenCL] Make -cl-ext a driver option

2022-05-11 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh accepted this revision.
svenvh added a comment.
This revision is now accepted and ready to land.

LGTM; just a few minor suggestions that you can address at commit time.




Comment at: clang/docs/UsersManual.rst:3145-3146
+
+Note that some targets e.g. SPIR/SPIR-V enable all extensions/features in clang by
+default.
+

Was this meant to go after the command example?



Comment at: clang/docs/UsersManual.rst:3215
+  All known OpenCL extensions and features are set to supported in the generic targets,
+  however :option:`-cl-ext` flag can be used to alter these settings.
+




CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125243/new/

https://reviews.llvm.org/D125243



[PATCH] D124256: [OpenCL] Add cl_khr_subgroup_rotate builtins

2022-04-22 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh created this revision.
svenvh added a reviewer: Anastasia.
svenvh added a project: clang.
Herald added subscribers: Naghasan, ldrumm, yaxunl.
Herald added a project: All.
svenvh requested review of this revision.
Herald added a subscriber: cfe-commits.

Add the builtins for the new OpenCL extension.  The specification is under 
review here: https://github.com/KhronosGroup/OpenCL-Docs/pull/781
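
For readers unfamiliar with the extension, here is a rough host-side C model of what the two builtins are expected to compute. It assumes the draft semantics are that each work-item reads the value held by the work-item `delta` positions further along, wrapping within the sub-group (or within its cluster for the clustered form). The function `emulated_rotate` and its parameters are hypothetical; it models one sub-group as an array and is not how a device implements the operation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: in[] holds one value per work-item of a sub-group of
   n lanes; out[id] receives the value lane id would get back.
   group_size is the rotation group: pass n for the plain rotate, or the
   cluster size for the clustered form. This sketch assumes group_size is a
   power of two that divides n. */
static void emulated_rotate(const int *in, int *out, size_t n, int delta,
                            size_t group_size) {
  assert(group_size != 0 && (group_size & (group_size - 1)) == 0);
  assert(n % group_size == 0);
  /* Reduce delta into [0, group_size) so negative deltas wrap correctly. */
  long long g = (long long)group_size;
  size_t shift = (size_t)(((delta % g) + g) % g);
  for (size_t id = 0; id < n; ++id) {
    size_t base = id & ~(group_size - 1);              /* start of the group */
    size_t src = base + ((id - base + shift) & (group_size - 1));
    out[id] = in[src];
  }
}
```

With eight lanes holding 10..17, a rotate by 1 gives lane 0 the value 11 and lane 7 the value 10; with a cluster size of 4, lane 3 wraps to lane 0 and lane 7 wraps to lane 4.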


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D124256

Files:
  clang/lib/Headers/opencl-c.h
  clang/lib/Sema/OpenCLBuiltins.td


Index: clang/lib/Sema/OpenCLBuiltins.td
===
--- clang/lib/Sema/OpenCLBuiltins.td
+++ clang/lib/Sema/OpenCLBuiltins.td
@@ -1832,6 +1832,12 @@
   def : Builtin<"dot_acc_sat_4x8packed_su_int", [Int, UInt, UInt, Int], Attr.Const>;
 }
 
+// cl_khr_subgroup_rotate
+let Extension = FunctionExtension<"cl_khr_subgroup_rotate"> in {
+  def : Builtin<"sub_group_rotate", [AGenType1, AGenType1, Int], Attr.Convergent>;
+  def : Builtin<"sub_group_clustered_rotate", [AGenType1, AGenType1, Int, UInt], Attr.Convergent>;
+}
+
 //
 // Arm extensions.
 let Extension = ArmIntegerDotProductInt8 in {
Index: clang/lib/Headers/opencl-c.h
===
--- clang/lib/Headers/opencl-c.h
+++ clang/lib/Headers/opencl-c.h
@@ -17275,6 +17275,40 @@
 int __ovld __cnfn dot_acc_sat_4x8packed_su_int(uint, uint, int);
 #endif // __opencl_c_integer_dot_product_input_4x8bit_packed
 
+#if defined(cl_khr_subgroup_rotate)
+char __ovld __conv sub_group_rotate(char, int);
+uchar __ovld __conv sub_group_rotate(uchar, int);
+short __ovld __conv sub_group_rotate(short, int);
+ushort __ovld __conv sub_group_rotate(ushort, int);
+int __ovld __conv sub_group_rotate(int, int);
+uint __ovld __conv sub_group_rotate(uint, int);
+long __ovld __conv sub_group_rotate(long, int);
+ulong __ovld __conv sub_group_rotate(ulong, int);
+float __ovld __conv sub_group_rotate(float, int);
+#if defined(cl_khr_fp64)
+double __ovld __conv sub_group_rotate(double, int);
+#endif // cl_khr_fp64
+#if defined(cl_khr_fp16)
+half __ovld __conv sub_group_rotate(half, int);
+#endif // cl_khr_fp16
+
+char __ovld __conv sub_group_clustered_rotate(char, int, uint);
+uchar __ovld __conv sub_group_clustered_rotate(uchar, int, uint);
+short __ovld __conv sub_group_clustered_rotate(short, int, uint);
+ushort __ovld __conv sub_group_clustered_rotate(ushort, int, uint);
+int __ovld __conv sub_group_clustered_rotate(int, int, uint);
+uint __ovld __conv sub_group_clustered_rotate(uint, int, uint);
+long __ovld __conv sub_group_clustered_rotate(long, int, uint);
+ulong __ovld __conv sub_group_clustered_rotate(ulong, int, uint);
+float __ovld __conv sub_group_clustered_rotate(float, int, uint);
+#if defined(cl_khr_fp64)
+double __ovld __conv sub_group_clustered_rotate(double, int, uint);
+#endif // cl_khr_fp64
+#if defined(cl_khr_fp16)
+half __ovld __conv sub_group_clustered_rotate(half, int, uint);
+#endif // cl_khr_fp16
+#endif // cl_khr_subgroup_rotate
+
 #if defined(cl_intel_subgroups)
 // Intel-Specific Sub Group Functions
 float   __ovld __conv intel_sub_group_shuffle( float , uint );



[PATCH] D122728: [OpenCL] opencl-c.h: Add const to get_image_num_samples

2022-04-19 Thread Sven van Haastregt via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGf3ee0afc6739: [OpenCL] opencl-c.h: Add const to 
get_image_num_samples (authored by svenvh).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122728/new/

https://reviews.llvm.org/D122728

Files:
  clang/lib/Headers/opencl-c.h


Index: clang/lib/Headers/opencl-c.h
===
--- clang/lib/Headers/opencl-c.h
+++ clang/lib/Headers/opencl-c.h
@@ -16118,21 +16118,21 @@
 * Return the number of samples associated with image
 */
 #if defined(cl_khr_gl_msaa_sharing)
-int __ovld get_image_num_samples(read_only image2d_msaa_t);
-int __ovld get_image_num_samples(read_only image2d_msaa_depth_t);
-int __ovld get_image_num_samples(read_only image2d_array_msaa_t);
-int __ovld get_image_num_samples(read_only image2d_array_msaa_depth_t);
+int __ovld __cnfn get_image_num_samples(read_only image2d_msaa_t);
+int __ovld __cnfn get_image_num_samples(read_only image2d_msaa_depth_t);
+int __ovld __cnfn get_image_num_samples(read_only image2d_array_msaa_t);
+int __ovld __cnfn get_image_num_samples(read_only image2d_array_msaa_depth_t);
 
-int __ovld get_image_num_samples(write_only image2d_msaa_t);
-int __ovld get_image_num_samples(write_only image2d_msaa_depth_t);
-int __ovld get_image_num_samples(write_only image2d_array_msaa_t);
-int __ovld get_image_num_samples(write_only image2d_array_msaa_depth_t);
+int __ovld __cnfn get_image_num_samples(write_only image2d_msaa_t);
+int __ovld __cnfn get_image_num_samples(write_only image2d_msaa_depth_t);
+int __ovld __cnfn get_image_num_samples(write_only image2d_array_msaa_t);
+int __ovld __cnfn get_image_num_samples(write_only image2d_array_msaa_depth_t);
 
 #if defined(__opencl_c_read_write_images)
-int __ovld get_image_num_samples(read_write image2d_msaa_t);
-int __ovld get_image_num_samples(read_write image2d_msaa_depth_t);
-int __ovld get_image_num_samples(read_write image2d_array_msaa_t);
-int __ovld get_image_num_samples(read_write image2d_array_msaa_depth_t);
+int __ovld __cnfn get_image_num_samples(read_write image2d_msaa_t);
+int __ovld __cnfn get_image_num_samples(read_write image2d_msaa_depth_t);
+int __ovld __cnfn get_image_num_samples(read_write image2d_array_msaa_t);
+int __ovld __cnfn get_image_num_samples(read_write image2d_array_msaa_depth_t);
 #endif //defined(__opencl_c_read_write_images)
 #endif
 




[PATCH] D120254: [OpenCL] Align subgroup builtin guards

2022-03-31 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

I've submitted the fix in `4dfec37037f5`.

As for testing, unfortunately I couldn't just extend the 
`SemaOpenCL/fdeclare-opencl-builtins.cl` test: the OpenCL 1.2 RUN lines 
explicitly disable the `cl_intel_subgroups` extension (which is the extension 
that brings in `sub_group_barrier` with CL1.2 for the generic `spir` triple).  
Adding a new `RUN` line or new test file for just this case seems a bit of an 
overkill, so I'm tempted to defer this until we have parity between 
`OpenCLBuiltins.td` and `opencl-c.h`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120254/new/

https://reviews.llvm.org/D120254



[PATCH] D120254: [OpenCL] Align subgroup builtin guards

2022-03-31 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

In D120254#3419221, @hvdijk wrote:

> This was worked around by modifying tests, but I believe this is a 
> fundamental problem in this change and was able to reproduce the error with 
> plain old clang:
>
>   $ cat test.cl
>   void sub_group_barrier();
>   
>   $ bin/clang -cl-std=CL1.2 -S -o - test.cl
>   error: enum type memory_scope not found; include the base header with -finclude-default-header
>   1 error generated.
>   
>   $ bin/clang --version
>   clang version 15.0.0 (g...@github.com:llvm/llvm-project 
> c204cee642ee794901d2e8a9819b52ac12f92bc9)
>   Target: x86_64-unknown-linux-gnu
>   Thread model: posix
>   InstalledDir: /home/harald/llvm-project/build/bin
>
> The problem is that this change enables certain built-ins in OpenCL 1.2 that 
> take a memory_scope argument, but the memory_scope type is not defined in 
> OpenCL 1.2 mode. When we then process the function, sub_group_barrier in my 
> example, things break when checking whether the declaration matches the 
> built-in. I am not sure what the right fix here is. Can we just define the 
> type if any extension is enabled that requires the type, or is that not 
> allowed?

Thanks for digging further and providing a reproducer!  I think the fix is to 
only make the `sub_group_barrier(cl_mem_fence_flags flags, memory_scope)` 
overload available for OpenCL 2.0 or above.  That would also match `opencl-c.h`.

The following patch seems to fix the issue that you described:

  diff --git a/clang/lib/Sema/OpenCLBuiltins.td b/clang/lib/Sema/OpenCLBuiltins.td
  index f6de59223347..52740bacac33 100644
  --- a/clang/lib/Sema/OpenCLBuiltins.td
  +++ b/clang/lib/Sema/OpenCLBuiltins.td
  @@ -1692,7 +1692,9 @@ let Extension = FuncExtKhrSubgroups in {
   // --- Table 28.2.2 ---
   let Extension = FuncExtKhrSubgroups in {
     def : Builtin<"sub_group_barrier", [Void, MemFenceFlags], Attr.Convergent>;
  -  def : Builtin<"sub_group_barrier", [Void, MemFenceFlags, MemoryScope], Attr.Convergent>;
  +  let MinVersion = CL20 in {
  +    def : Builtin<"sub_group_barrier", [Void, MemFenceFlags, MemoryScope], Attr.Convergent>;
  +  }
   }
   
   // --- Table 28.2.4 ---
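
The intended effect of the `MinVersion` guard can be modelled with a small C sketch. This is only an illustration of the idea, not the generated lookup code: the `builtin_overload` struct, the `overloads` table, `count_visible`, and the use of 100/120/200 as version numbers (mirroring `__OPENCL_C_VERSION__` values) are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record for one builtin overload. */
struct builtin_overload {
  const char *name;
  int num_params;
  int min_version;   /* lowest OpenCL C version that exposes the overload */
};

/* The one-argument overload is always available with the extension; the
   overload taking a memory_scope argument is limited to OpenCL 2.0 and up. */
static const struct builtin_overload overloads[] = {
    {"sub_group_barrier", 1, 100},
    {"sub_group_barrier", 2, 200},
};

/* Count the overloads of `name` visible when compiling for cl_version. */
static int count_visible(const char *name, int cl_version) {
  int visible = 0;
  assert(name != NULL);
  for (size_t i = 0; i < sizeof overloads / sizeof overloads[0]; ++i)
    if (strcmp(overloads[i].name, name) == 0 &&
        overloads[i].min_version <= cl_version)
      ++visible;
  return visible;
}
```

Under this model a CL1.2 compilation sees only the one-argument overload, so the `memory_scope` type is never needed there.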


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120254/new/

https://reviews.llvm.org/D120254



[PATCH] D122728: [OpenCL] opencl-c.h: Add const to get_image_num_samples

2022-03-30 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh created this revision.
svenvh added a reviewer: azabaznov.
svenvh added a project: clang.
Herald added subscribers: Naghasan, ldrumm, yaxunl.
Herald added a project: All.
svenvh requested review of this revision.
Herald added a subscriber: cfe-commits.

Align with the `-fdeclare-opencl-builtins` option and other
get_image_* builtins which have the const attribute.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D122728

Files:
  clang/lib/Headers/opencl-c.h


Index: clang/lib/Headers/opencl-c.h
===
--- clang/lib/Headers/opencl-c.h
+++ clang/lib/Headers/opencl-c.h
@@ -16118,21 +16118,21 @@
 * Return the number of samples associated with image
 */
 #if defined(cl_khr_gl_msaa_sharing)
-int __ovld get_image_num_samples(read_only image2d_msaa_t);
-int __ovld get_image_num_samples(read_only image2d_msaa_depth_t);
-int __ovld get_image_num_samples(read_only image2d_array_msaa_t);
-int __ovld get_image_num_samples(read_only image2d_array_msaa_depth_t);
+int __ovld __cnfn get_image_num_samples(read_only image2d_msaa_t);
+int __ovld __cnfn get_image_num_samples(read_only image2d_msaa_depth_t);
+int __ovld __cnfn get_image_num_samples(read_only image2d_array_msaa_t);
+int __ovld __cnfn get_image_num_samples(read_only image2d_array_msaa_depth_t);
 
-int __ovld get_image_num_samples(write_only image2d_msaa_t);
-int __ovld get_image_num_samples(write_only image2d_msaa_depth_t);
-int __ovld get_image_num_samples(write_only image2d_array_msaa_t);
-int __ovld get_image_num_samples(write_only image2d_array_msaa_depth_t);
+int __ovld __cnfn get_image_num_samples(write_only image2d_msaa_t);
+int __ovld __cnfn get_image_num_samples(write_only image2d_msaa_depth_t);
+int __ovld __cnfn get_image_num_samples(write_only image2d_array_msaa_t);
+int __ovld __cnfn get_image_num_samples(write_only image2d_array_msaa_depth_t);
 
 #if defined(__opencl_c_read_write_images)
-int __ovld get_image_num_samples(read_write image2d_msaa_t);
-int __ovld get_image_num_samples(read_write image2d_msaa_depth_t);
-int __ovld get_image_num_samples(read_write image2d_array_msaa_t);
-int __ovld get_image_num_samples(read_write image2d_array_msaa_depth_t);
+int __ovld __cnfn get_image_num_samples(read_write image2d_msaa_t);
+int __ovld __cnfn get_image_num_samples(read_write image2d_msaa_depth_t);
+int __ovld __cnfn get_image_num_samples(read_write image2d_array_msaa_t);
+int __ovld __cnfn get_image_num_samples(read_write image2d_array_msaa_depth_t);
 #endif //defined(__opencl_c_read_write_images)
 #endif
 


Index: clang/lib/Headers/opencl-c.h
===
--- clang/lib/Headers/opencl-c.h
+++ clang/lib/Headers/opencl-c.h
@@ -16118,21 +16118,21 @@
 * Return the number of samples associated with image
 */
 #if defined(cl_khr_gl_msaa_sharing)
-int __ovld get_image_num_samples(read_only image2d_msaa_t);
-int __ovld get_image_num_samples(read_only image2d_msaa_depth_t);
-int __ovld get_image_num_samples(read_only image2d_array_msaa_t);
-int __ovld get_image_num_samples(read_only image2d_array_msaa_depth_t);
+int __ovld __cnfn get_image_num_samples(read_only image2d_msaa_t);
+int __ovld __cnfn get_image_num_samples(read_only image2d_msaa_depth_t);
+int __ovld __cnfn get_image_num_samples(read_only image2d_array_msaa_t);
+int __ovld __cnfn get_image_num_samples(read_only image2d_array_msaa_depth_t);
 
-int __ovld get_image_num_samples(write_only image2d_msaa_t);
-int __ovld get_image_num_samples(write_only image2d_msaa_depth_t);
-int __ovld get_image_num_samples(write_only image2d_array_msaa_t);
-int __ovld get_image_num_samples(write_only image2d_array_msaa_depth_t);
+int __ovld __cnfn get_image_num_samples(write_only image2d_msaa_t);
+int __ovld __cnfn get_image_num_samples(write_only image2d_msaa_depth_t);
+int __ovld __cnfn get_image_num_samples(write_only image2d_array_msaa_t);
+int __ovld __cnfn get_image_num_samples(write_only image2d_array_msaa_depth_t);
 
 #if defined(__opencl_c_read_write_images)
-int __ovld get_image_num_samples(read_write image2d_msaa_t);
-int __ovld get_image_num_samples(read_write image2d_msaa_depth_t);
-int __ovld get_image_num_samples(read_write image2d_array_msaa_t);
-int __ovld get_image_num_samples(read_write image2d_array_msaa_depth_t);
+int __ovld __cnfn get_image_num_samples(read_write image2d_msaa_t);
+int __ovld __cnfn get_image_num_samples(read_write image2d_msaa_depth_t);
+int __ovld __cnfn get_image_num_samples(read_write image2d_array_msaa_t);
+int __ovld __cnfn get_image_num_samples(read_write image2d_array_msaa_depth_t);
 #endif //defined(__opencl_c_read_write_images)
 #endif
 


[PATCH] D104040: [OpenCL] Add TableGen emitter for OpenCL builtin header

2022-03-23 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh updated this revision to Diff 417678.
svenvh added a comment.
Herald added a subscriber: Naghasan.
Herald added a project: All.

Rebased on latest `main`.

Also takes TypeExtensions into account now.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104040/new/

https://reviews.llvm.org/D104040

Files:
  clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
  clang/utils/TableGen/TableGen.cpp
  clang/utils/TableGen/TableGenBackends.h

Index: clang/utils/TableGen/TableGenBackends.h
===
--- clang/utils/TableGen/TableGenBackends.h
+++ clang/utils/TableGen/TableGenBackends.h
@@ -123,6 +123,8 @@
 
 void EmitClangOpenCLBuiltins(llvm::RecordKeeper &Records,
                              llvm::raw_ostream &OS);
+void EmitClangOpenCLBuiltinHeader(llvm::RecordKeeper &Records,
+                                  llvm::raw_ostream &OS);
 void EmitClangOpenCLBuiltinTests(llvm::RecordKeeper &Records,
                                  llvm::raw_ostream &OS);
 
Index: clang/utils/TableGen/TableGen.cpp
===
--- clang/utils/TableGen/TableGen.cpp
+++ clang/utils/TableGen/TableGen.cpp
@@ -64,6 +64,7 @@
   GenClangCommentCommandInfo,
   GenClangCommentCommandList,
   GenClangOpenCLBuiltins,
+  GenClangOpenCLBuiltinHeader,
   GenClangOpenCLBuiltinTests,
   GenArmNeon,
   GenArmFP16,
@@ -198,6 +199,9 @@
                    "documentation comments"),
         clEnumValN(GenClangOpenCLBuiltins, "gen-clang-opencl-builtins",
                    "Generate OpenCL builtin declaration handlers"),
+        clEnumValN(GenClangOpenCLBuiltinHeader,
+                   "gen-clang-opencl-builtin-header",
+                   "Generate OpenCL builtin header"),
         clEnumValN(GenClangOpenCLBuiltinTests, "gen-clang-opencl-builtin-tests",
                    "Generate OpenCL builtin declaration tests"),
         clEnumValN(GenArmNeon, "gen-arm-neon", "Generate arm_neon.h for clang"),
@@ -380,6 +384,9 @@
   case GenClangOpenCLBuiltins:
     EmitClangOpenCLBuiltins(Records, OS);
     break;
+  case GenClangOpenCLBuiltinHeader:
+    EmitClangOpenCLBuiltinHeader(Records, OS);
+    break;
   case GenClangOpenCLBuiltinTests:
     EmitClangOpenCLBuiltinTests(Records, OS);
     break;
Index: clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
===
--- clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
+++ clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
@@ -324,6 +324,18 @@
   void emit() override;
 };
 
+// OpenCL builtin header generator.  This class processes the same TableGen
+// input as BuiltinNameEmitter, but generates a .h file that contains a
+// prototype for each builtin function described in the .td input.
+class OpenCLBuiltinHeaderEmitter : public OpenCLBuiltinFileEmitterBase {
+public:
+  OpenCLBuiltinHeaderEmitter(RecordKeeper &Records, raw_ostream &OS)
+      : OpenCLBuiltinFileEmitterBase(Records, OS) {}
+
+  // Entrypoint to generate the header.
+  void emit() override;
+};
+
 } // namespace
 
 void BuiltinNameEmitter::Emit() {
@@ -1260,11 +1272,74 @@
   }
 }
 
+void OpenCLBuiltinHeaderEmitter::emit() {
+  emitSourceFileHeader("OpenCL Builtin declarations", OS);
+
+  emitExtensionSetup();
+
+  OS << R"(
+#define __ovld __attribute__((overloadable))
+#define __conv __attribute__((convergent))
+#define __purefn __attribute__((pure))
+#define __cnfn __attribute__((const))
+
+)";
+
+  // Iterate over all builtins.
+  std::vector<Record *> Builtins = Records.getAllDerivedDefinitions("Builtin");
+  for (const auto *B : Builtins) {
+    StringRef Name = B->getValueAsString("Name");
+
+    std::string OptionalExtensionEndif = emitExtensionGuard(B);
+    std::string OptionalVersionEndif = emitVersionGuard(B);
+
+    SmallVector<SmallVector<std::string, 2>, 4> FTypes;
+    expandTypesInSignature(B->getValueAsListOfDefs("Signature"), FTypes);
+
+    for (const auto &Signature : FTypes) {
+      StringRef OptionalTypeExtEndif = emitTypeExtensionGuards(Signature);
+
+      // Emit function declaration.
+      OS << Signature[0] << " __ovld ";
+      if (B->getValueAsBit("IsConst"))
+        OS << "__cnfn ";
+      if (B->getValueAsBit("IsPure"))
+        OS << "__purefn ";
+      if (B->getValueAsBit("IsConv"))
+        OS << "__conv ";
+
+      OS << Name << "(";
+      if (Signature.size() > 1) {
+        for (unsigned I = 1; I < Signature.size(); I++) {
+          if (I != 1)
+            OS << ", ";
+          OS << Signature[I];
+        }
+      }
+      OS << ");\n";
+
+      OS << OptionalTypeExtEndif;
+    }
+
+    OS << OptionalVersionEndif;
+    OS << OptionalExtensionEndif;
+  }
+
+  OS << "\n// Disable any extensions we may have enabled previously.\n"
+        "#pragma OPENCL EXTENSION all : disable";
+}
+
 void clang::EmitClangOpenCLBuiltins(RecordKeeper &Records, raw_ostream &OS) {
   BuiltinNameEmitter NameChecker(Records, OS);
   NameChecker.Emit();
 }
 
+void clang::EmitClangOpenCLBuiltinHeader(RecordKeeper &Records,
+  

[PATCH] D120470: [clang-tidy] Update tests to include opencl-c-base.h

2022-02-24 Thread Sven van Haastregt via Phabricator via cfe-commits
This revision was not accepted when it landed; it landed in state "Needs 
Review".
This revision was automatically updated to reflect the committed changes.
Closed by commit rGba18c360b2f3: [clang-tidy] Remove opencl-c.h inclusion from 
tests (authored by svenvh).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120470/new/

https://reviews.llvm.org/D120470

Files:
  clang-tools-extra/test/clang-tidy/checkers/altera-id-dependent-backward-branch.cpp
  clang-tools-extra/test/clang-tidy/checkers/altera-single-work-item-barrier.cpp


Index: clang-tools-extra/test/clang-tidy/checkers/altera-single-work-item-barrier.cpp
===
--- clang-tools-extra/test/clang-tidy/checkers/altera-single-work-item-barrier.cpp
+++ clang-tools-extra/test/clang-tidy/checkers/altera-single-work-item-barrier.cpp
@@ -1,7 +1,7 @@
-// RUN: %check_clang_tidy -check-suffix=OLDCLOLDAOC %s altera-single-work-item-barrier %t -- -header-filter=.* "--" -cl-std=CL1.2 -c --include opencl-c.h -DOLDCLOLDAOC
-// RUN: %check_clang_tidy -check-suffix=NEWCLOLDAOC %s altera-single-work-item-barrier %t -- -header-filter=.* "--" -cl-std=CL2.0 -c --include opencl-c.h -DNEWCLOLDAOC
-// RUN: %check_clang_tidy -check-suffix=OLDCLNEWAOC %s altera-single-work-item-barrier %t -- -config='{CheckOptions: [{key: altera-single-work-item-barrier.AOCVersion, value: 1701}]}' -header-filter=.* "--" -cl-std=CL1.2 -c --include opencl-c.h -DOLDCLNEWAOC
-// RUN: %check_clang_tidy -check-suffix=NEWCLNEWAOC %s altera-single-work-item-barrier %t -- -config='{CheckOptions: [{key: altera-single-work-item-barrier.AOCVersion, value: 1701}]}' -header-filter=.* "--" -cl-std=CL2.0 -c --include opencl-c.h -DNEWCLNEWAOC
+// RUN: %check_clang_tidy -check-suffix=OLDCLOLDAOC %s altera-single-work-item-barrier %t -- -header-filter=.* "--" -cl-std=CL1.2 -c -DOLDCLOLDAOC
+// RUN: %check_clang_tidy -check-suffix=NEWCLOLDAOC %s altera-single-work-item-barrier %t -- -header-filter=.* "--" -cl-std=CL2.0 -c -DNEWCLOLDAOC
+// RUN: %check_clang_tidy -check-suffix=OLDCLNEWAOC %s altera-single-work-item-barrier %t -- -config='{CheckOptions: [{key: altera-single-work-item-barrier.AOCVersion, value: 1701}]}' -header-filter=.* "--" -cl-std=CL1.2 -c -DOLDCLNEWAOC
+// RUN: %check_clang_tidy -check-suffix=NEWCLNEWAOC %s altera-single-work-item-barrier %t -- -config='{CheckOptions: [{key: altera-single-work-item-barrier.AOCVersion, value: 1701}]}' -header-filter=.* "--" -cl-std=CL2.0 -c -DNEWCLNEWAOC
 
 #ifdef OLDCLOLDAOC  // OpenCL 1.2 Altera Offline Compiler < 17.1
 void __kernel error_barrier_no_id(__global int * foo, int size) {
Index: clang-tools-extra/test/clang-tidy/checkers/altera-id-dependent-backward-branch.cpp
===
--- clang-tools-extra/test/clang-tidy/checkers/altera-id-dependent-backward-branch.cpp
+++ clang-tools-extra/test/clang-tidy/checkers/altera-id-dependent-backward-branch.cpp
@@ -1,4 +1,4 @@
-// RUN: %check_clang_tidy %s altera-id-dependent-backward-branch %t -- -header-filter=.* "--" -cl-std=CL1.2 -c --include opencl-c.h
+// RUN: %check_clang_tidy %s altera-id-dependent-backward-branch %t -- -header-filter=.* "--" -cl-std=CL1.2 -c
 
 typedef struct ExampleStruct {
   int IDDepField;



[PATCH] D120470: [clang-tidy] Update tests to include opencl-c-base.h

2022-02-24 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

Since this is a simple test update I'll commit this now (before code review), 
to get affected CI back to green. Please let me know if there are any 
post-commit concerns.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120470/new/

https://reviews.llvm.org/D120470



[PATCH] D120262: [OpenCL] Handle TypeExtensions in OpenCLBuiltinFileEmitter

2022-02-24 Thread Sven van Haastregt via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG28cdcf8e3c8e: [OpenCL] Handle TypeExtensions in 
OpenCLBuiltinFileEmitter (authored by svenvh).

Changed prior to commit:
  https://reviews.llvm.org/D120262?vs=410822&id=411121#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120262/new/

https://reviews.llvm.org/D120262

Files:
  clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp

Index: clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
===
--- clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
+++ clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
@@ -17,6 +17,7 @@
 #include "TableGenBackends.h"
 #include "llvm/ADT/MapVector.h"
 #include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/SmallString.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/StringMap.h"
@@ -293,6 +294,15 @@
   // was emitted.
   std::string emitVersionGuard(const Record *Builtin);
 
+  // Emit an #if guard for all type extensions required for the given type
+  // strings.  Return the corresponding closing #endif, or an empty string
+  // if no extension #if guard was emitted.
+  StringRef
+  emitTypeExtensionGuards(const SmallVectorImpl<std::string> &Signature);
+
+  // Map type strings to type extensions (e.g. "half2" -> "cl_khr_fp16").
+  StringMap<StringRef> TypeExtMap;
+
   // Contains OpenCL builtin functions and related information, stored as
   // Record instances. They are coming from the associated TableGen file.
   RecordKeeper &Records;
@@ -1057,7 +1067,16 @@
       // Insert the Cartesian product of the types and vector sizes.
       for (const auto &Vector : VectorList) {
         for (const auto &Type : TypeList) {
-          ExpandedArg.push_back(getTypeString(Type, Flags, Vector));
+          std::string FullType = getTypeString(Type, Flags, Vector);
+          ExpandedArg.push_back(FullType);
+
+          // If the type requires an extension, add a TypeExtMap entry mapping
+          // the full type name to the extension.
+          StringRef Ext =
+              Arg->getValueAsDef("Extension")->getValueAsString("ExtName");
+          if (!Ext.empty() && TypeExtMap.find(FullType) == TypeExtMap.end()) {
+            TypeExtMap.insert({FullType, Ext});
+          }
         }
       }
       NumSignatures = std::max<unsigned>(NumSignatures, ExpandedArg.size());
@@ -1141,6 +1160,39 @@
   return OptionalEndif;
 }
 
+StringRef OpenCLBuiltinFileEmitterBase::emitTypeExtensionGuards(
+    const SmallVectorImpl<std::string> &Signature) {
+  SmallSet<StringRef, 2> ExtSet;
+
+  // Iterate over all types to gather the set of required TypeExtensions.
+  for (const auto &Ty : Signature) {
+    StringRef TypeExt = TypeExtMap.lookup(Ty);
+    if (!TypeExt.empty()) {
+      // The TypeExtensions are space-separated in the .td file.
+      SmallVector<StringRef, 2> ExtVec;
+      TypeExt.split(ExtVec, " ");
+      for (const auto Ext : ExtVec) {
+        ExtSet.insert(Ext);
+      }
+    }
+  }
+
+  // Emit the #if only when at least one extension is required.
+  if (ExtSet.empty())
+    return "";
+
+  OS << "#if ";
+  bool isFirst = true;
+  for (const auto Ext : ExtSet) {
+    if (!isFirst)
+      OS << " && ";
+    OS << "defined(" << Ext << ")";
+    isFirst = false;
+  }
+  OS << "\n";
+  return "#endif // TypeExtension\n";
+}
+
 void OpenCLBuiltinTestEmitter::emit() {
   emitSourceFileHeader("OpenCL Builtin exhaustive testing", OS);
 
@@ -1163,6 +1215,8 @@
     std::string OptionalVersionEndif = emitVersionGuard(B);
 
     for (const auto &Signature : FTypes) {
+      StringRef OptionalTypeExtEndif = emitTypeExtensionGuards(Signature);
+
       // Emit function declaration.
       OS << Signature[0] << " test" << TestID++ << "_" << Name << "(";
       if (Signature.size() > 1) {
@@ -1189,6 +1243,7 @@
 
       // End of function body.
       OS << "}\n";
+      OS << OptionalTypeExtEndif;
     }
 
     OS << OptionalVersionEndif;


[PATCH] D120470: [clang-tidy] Update tests to include opencl-c-base.h

2022-02-24 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh updated this revision to Diff 411106.
svenvh edited the summary of this revision.
svenvh added reviewers: Anastasia, ffrankies.
svenvh added a comment.

After a bit of digging I realized we don't need the explicit include at all 
anymore.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120470/new/

https://reviews.llvm.org/D120470

Files:
  clang-tools-extra/test/clang-tidy/checkers/altera-id-dependent-backward-branch.cpp
  clang-tools-extra/test/clang-tidy/checkers/altera-single-work-item-barrier.cpp


Index: clang-tools-extra/test/clang-tidy/checkers/altera-single-work-item-barrier.cpp
===
--- clang-tools-extra/test/clang-tidy/checkers/altera-single-work-item-barrier.cpp
+++ clang-tools-extra/test/clang-tidy/checkers/altera-single-work-item-barrier.cpp
@@ -1,7 +1,7 @@
-// RUN: %check_clang_tidy -check-suffix=OLDCLOLDAOC %s altera-single-work-item-barrier %t -- -header-filter=.* "--" -cl-std=CL1.2 -c --include opencl-c.h -DOLDCLOLDAOC
-// RUN: %check_clang_tidy -check-suffix=NEWCLOLDAOC %s altera-single-work-item-barrier %t -- -header-filter=.* "--" -cl-std=CL2.0 -c --include opencl-c.h -DNEWCLOLDAOC
-// RUN: %check_clang_tidy -check-suffix=OLDCLNEWAOC %s altera-single-work-item-barrier %t -- -config='{CheckOptions: [{key: altera-single-work-item-barrier.AOCVersion, value: 1701}]}' -header-filter=.* "--" -cl-std=CL1.2 -c --include opencl-c.h -DOLDCLNEWAOC
-// RUN: %check_clang_tidy -check-suffix=NEWCLNEWAOC %s altera-single-work-item-barrier %t -- -config='{CheckOptions: [{key: altera-single-work-item-barrier.AOCVersion, value: 1701}]}' -header-filter=.* "--" -cl-std=CL2.0 -c --include opencl-c.h -DNEWCLNEWAOC
+// RUN: %check_clang_tidy -check-suffix=OLDCLOLDAOC %s altera-single-work-item-barrier %t -- -header-filter=.* "--" -cl-std=CL1.2 -c -DOLDCLOLDAOC
+// RUN: %check_clang_tidy -check-suffix=NEWCLOLDAOC %s altera-single-work-item-barrier %t -- -header-filter=.* "--" -cl-std=CL2.0 -c -DNEWCLOLDAOC
+// RUN: %check_clang_tidy -check-suffix=OLDCLNEWAOC %s altera-single-work-item-barrier %t -- -config='{CheckOptions: [{key: altera-single-work-item-barrier.AOCVersion, value: 1701}]}' -header-filter=.* "--" -cl-std=CL1.2 -c -DOLDCLNEWAOC
+// RUN: %check_clang_tidy -check-suffix=NEWCLNEWAOC %s altera-single-work-item-barrier %t -- -config='{CheckOptions: [{key: altera-single-work-item-barrier.AOCVersion, value: 1701}]}' -header-filter=.* "--" -cl-std=CL2.0 -c -DNEWCLNEWAOC
 
 #ifdef OLDCLOLDAOC  // OpenCL 1.2 Altera Offline Compiler < 17.1
 void __kernel error_barrier_no_id(__global int * foo, int size) {
Index: clang-tools-extra/test/clang-tidy/checkers/altera-id-dependent-backward-branch.cpp
===
--- clang-tools-extra/test/clang-tidy/checkers/altera-id-dependent-backward-branch.cpp
+++ clang-tools-extra/test/clang-tidy/checkers/altera-id-dependent-backward-branch.cpp
@@ -1,4 +1,4 @@
-// RUN: %check_clang_tidy %s altera-id-dependent-backward-branch %t -- -header-filter=.* "--" -cl-std=CL1.2 -c --include opencl-c.h
+// RUN: %check_clang_tidy %s altera-id-dependent-backward-branch %t -- -header-filter=.* "--" -cl-std=CL1.2 -c
 
 typedef struct ExampleStruct {
   int IDDepField;



[PATCH] D120254: [OpenCL] Align subgroup builtin guards

2022-02-24 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

Thanks, I could reproduce the problem with your cmake line.  I have uploaded a 
fix for review in https://reviews.llvm.org/D120470


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120254/new/

https://reviews.llvm.org/D120254



[PATCH] D120470: [clang-tidy] Update tests to include opencl-c-base.h

2022-02-24 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh created this revision.
svenvh added a reviewer: dyung.
Herald added subscribers: Naghasan, ldrumm, xazax.hun, Anastasia, yaxunl.
svenvh requested review of this revision.
Herald added a project: clang-tools-extra.
Herald added a subscriber: cfe-commits.

After D120254, some clang-tidy tests started 
failing on release builds.
clang-tidy appears to be using the `-fdeclare-opencl-builtins`
functionality, so there is no need to include the full `opencl-c.h`
header.  Instead, only include the base header, which contains
definitions required by `-fdeclare-opencl-builtins`.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D120470

Files:
  clang-tools-extra/test/clang-tidy/checkers/altera-id-dependent-backward-branch.cpp
  clang-tools-extra/test/clang-tidy/checkers/altera-single-work-item-barrier.cpp


Index: clang-tools-extra/test/clang-tidy/checkers/altera-single-work-item-barrier.cpp
===
--- clang-tools-extra/test/clang-tidy/checkers/altera-single-work-item-barrier.cpp
+++ clang-tools-extra/test/clang-tidy/checkers/altera-single-work-item-barrier.cpp
@@ -1,7 +1,7 @@
-// RUN: %check_clang_tidy -check-suffix=OLDCLOLDAOC %s altera-single-work-item-barrier %t -- -header-filter=.* "--" -cl-std=CL1.2 -c --include opencl-c.h -DOLDCLOLDAOC
-// RUN: %check_clang_tidy -check-suffix=NEWCLOLDAOC %s altera-single-work-item-barrier %t -- -header-filter=.* "--" -cl-std=CL2.0 -c --include opencl-c.h -DNEWCLOLDAOC
-// RUN: %check_clang_tidy -check-suffix=OLDCLNEWAOC %s altera-single-work-item-barrier %t -- -config='{CheckOptions: [{key: altera-single-work-item-barrier.AOCVersion, value: 1701}]}' -header-filter=.* "--" -cl-std=CL1.2 -c --include opencl-c.h -DOLDCLNEWAOC
-// RUN: %check_clang_tidy -check-suffix=NEWCLNEWAOC %s altera-single-work-item-barrier %t -- -config='{CheckOptions: [{key: altera-single-work-item-barrier.AOCVersion, value: 1701}]}' -header-filter=.* "--" -cl-std=CL2.0 -c --include opencl-c.h -DNEWCLNEWAOC
+// RUN: %check_clang_tidy -check-suffix=OLDCLOLDAOC %s altera-single-work-item-barrier %t -- -header-filter=.* "--" -cl-std=CL1.2 -c --include opencl-c-base.h -DOLDCLOLDAOC
+// RUN: %check_clang_tidy -check-suffix=NEWCLOLDAOC %s altera-single-work-item-barrier %t -- -header-filter=.* "--" -cl-std=CL2.0 -c --include opencl-c-base.h -DNEWCLOLDAOC
+// RUN: %check_clang_tidy -check-suffix=OLDCLNEWAOC %s altera-single-work-item-barrier %t -- -config='{CheckOptions: [{key: altera-single-work-item-barrier.AOCVersion, value: 1701}]}' -header-filter=.* "--" -cl-std=CL1.2 -c --include opencl-c-base.h -DOLDCLNEWAOC
+// RUN: %check_clang_tidy -check-suffix=NEWCLNEWAOC %s altera-single-work-item-barrier %t -- -config='{CheckOptions: [{key: altera-single-work-item-barrier.AOCVersion, value: 1701}]}' -header-filter=.* "--" -cl-std=CL2.0 -c --include opencl-c-base.h -DNEWCLNEWAOC
 
 #ifdef OLDCLOLDAOC  // OpenCL 1.2 Altera Offline Compiler < 17.1
 void __kernel error_barrier_no_id(__global int * foo, int size) {
Index: clang-tools-extra/test/clang-tidy/checkers/altera-id-dependent-backward-branch.cpp
===
--- clang-tools-extra/test/clang-tidy/checkers/altera-id-dependent-backward-branch.cpp
+++ clang-tools-extra/test/clang-tidy/checkers/altera-id-dependent-backward-branch.cpp
@@ -1,4 +1,4 @@
-// RUN: %check_clang_tidy %s altera-id-dependent-backward-branch %t -- -header-filter=.* "--" -cl-std=CL1.2 -c --include opencl-c.h
+// RUN: %check_clang_tidy %s altera-id-dependent-backward-branch %t -- -header-filter=.* "--" -cl-std=CL1.2 -c --include opencl-c-base.h
 
 typedef struct ExampleStruct {
   int IDDepField;



[PATCH] D120254: [OpenCL] Align subgroup builtin guards

2022-02-24 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

In D120254#3342551, @dyung wrote:

> Hi, our internal release build bots are showing failures in two clang-tidy 
> tests that I bisected back to your commit, 
> clang-tidy/checkers/altera-id-dependent-backward-branch.cpp and 
> clang-tidy/checkers/altera-single-work-item-barrier.cpp. After this change, 
> both are exhibiting this error:
>
>   Error while processing 
> /home/dyung/src/upstream/aa9c2d19d9b73589d72114d6e0a4fb4ce42b922b-linux/tools/clang/tools/extra/test/clang-tidy/checkers/Output/altera-single-work-item-barrier.cpp.tmp.cpp.
>   error: enum type memory_scope not found; include the base header with 
> -finclude-default-header [clang-diagnostic-error]
>
> Oddly, this only fails in a release configuration. Can you take a look?

I'll try to reproduce the failure locally. In the meantime, could you check 
whether the following change fixes one of the tests?  If so, the other test 
will likely need a similar fix.

  diff --git a/clang-tools-extra/test/clang-tidy/checkers/altera-id-dependent-backward-branch.cpp b/clang-tools-extra/test/clang-tidy/checkers/altera-id-dependent-backward-branch.cpp
  index a6dbab7b72fc..9bc1bbf173cc 100644
  --- a/clang-tools-extra/test/clang-tidy/checkers/altera-id-dependent-backward-branch.cpp
  +++ b/clang-tools-extra/test/clang-tidy/checkers/altera-id-dependent-backward-branch.cpp
  @@ -1,4 +1,4 @@
  -// RUN: %check_clang_tidy %s altera-id-dependent-backward-branch %t -- -header-filter=.* "--" -cl-std=CL1.2 -c --include opencl-c.h
  +// RUN: %check_clang_tidy %s altera-id-dependent-backward-branch %t -- -header-filter=.* "--" -cl-std=CL1.2 -c --include opencl-c.h --include opencl-c-base.h
   
   typedef struct ExampleStruct {
     int IDDepField;


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120254/new/

https://reviews.llvm.org/D120254



[PATCH] D120262: [OpenCL] Handle TypeExtensions in OpenCLBuiltinFileEmitter

2022-02-23 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh marked 2 inline comments as done.
svenvh added inline comments.



Comment at: clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp:1183
+      SmallVector<StringRef, 2> ExtVec;
+      TypeExt.split(ExtVec, " ");
+      for (const auto Ext : ExtVec) {

arkangath wrote:
> Just in case it is relevant: `KeepEmpty` defaults to true here.
> I don't know whether this can happen (I lack the context), but could the .td 
> file contain "Extension0  Extension1" (two spaces), which would produce an 
> empty StringRef here?
That would lead to generation of `#if defined(Extension0) && defined()`, which 
would cause a compilation error.  I think that's not unreasonable behavior to 
keep, to enforce that extensions are separated by exactly one space in the .td 
file for the sake of consistency.
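
To make the `KeepEmpty` point concrete, here is a minimal standalone sketch. It is not the TableGen emitter code: it uses `std::string` instead of `llvm::StringRef`, and `splitKeepEmpty` and `guardLine` are hypothetical helper names. It assumes, as the comment above states, that splitting keeps empty pieces, so a doubled space yields an empty token and therefore a `defined()` term.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split `text` on every occurrence of `sep`, keeping empty pieces.
// (Hypothetical helper; mimics a split where empty tokens are kept.)
std::vector<std::string> splitKeepEmpty(const std::string &text, char sep) {
  std::vector<std::string> parts;
  std::string::size_type start = 0;
  while (true) {
    std::string::size_type pos = text.find(sep, start);
    if (pos == std::string::npos) {
      // No further separator: the remainder is the last piece.
      parts.push_back(text.substr(start));
      return parts;
    }
    parts.push_back(text.substr(start, pos - start));
    start = pos + 1;
  }
}

// Join extension names into an "#if defined(A) && defined(B)" line.
// (Hypothetical helper; keeps the input order and does not deduplicate.)
std::string guardLine(const std::vector<std::string> &exts) {
  std::string line = "#if ";
  for (std::size_t i = 0; i < exts.size(); ++i) {
    if (i != 0)
      line += " && ";
    line += "defined(" + exts[i] + ")";
  }
  return line;
}
```

With exactly one space between names the guard is well-formed. With two spaces, `guardLine(splitKeepEmpty("Extension0  Extension1", ' '))` contains an empty `defined()`, which a preprocessor would reject, so the malformed .td entry is caught at compile time.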



Comment at: clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp:1191
+  // Emit the #if when at least one extension is required.
+  if (!ExtSet.empty()) {
+    OS << "#if ";

arkangath wrote:
> Arguably it _may_ be better to return early here instead of increasing the 
> indentation, but I'm not asking for a change; just suggesting.
> The value assigned to OptionalEndif could be returned directly rather than 
> stored in a local variable, unless you expect further changes to this 
> function to benefit from the temporary.
You are right, that would align better with the style of `emitExtensionGuard` 
too; I'll apply your suggestion prior to committing.
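
The early-return shape agreed on above can be sketched as follows. This is an illustration under assumptions, not the committed code: it writes to a `std::ostream` rather than the emitter's `OS` member, takes the extension set as a `std::set<std::string>` (which iterates in sorted order, possibly unlike the original container), and `emitGuard` and `guardedDeclaration` are hypothetical names.

```cpp
#include <ostream>
#include <set>
#include <sstream>
#include <string>

// Write an "#if defined(...)" line for `exts` to `os` and return the matching
// closing line. When `exts` is empty, write nothing and return an empty
// string, so callers can append the result unconditionally.
std::string emitGuard(std::ostream &os, const std::set<std::string> &exts) {
  // Early return keeps the main path unindented.
  if (exts.empty())
    return "";

  os << "#if ";
  bool isFirst = true;
  for (const std::string &ext : exts) {
    if (!isFirst)
      os << " && ";
    os << "defined(" << ext << ")";
    isFirst = false;
  }
  os << "\n";
  return "#endif // TypeExtension\n";
}

// Hypothetical convenience wrapper: wrap one declaration in its guard.
std::string guardedDeclaration(const std::string &decl,
                               const std::set<std::string> &exts) {
  std::ostringstream os;
  std::string closing = emitGuard(os, exts);
  os << decl << "\n" << closing;
  return os.str();
}
```

Returning the closing line, rather than storing it in a local first, mirrors the reviewer's suggestion: the caller emits the declaration and then appends whatever was returned, which is empty when no guard was opened.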


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120262/new/

https://reviews.llvm.org/D120262



[PATCH] D120262: [OpenCL] Handle TypeExtensions in OpenCLBuiltinFileEmitter

2022-02-23 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh marked 2 inline comments as done.
svenvh added inline comments.



Comment at: clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp:296-298
 
+  // Emit an #if guard for all type extensions required for the given type
+  // strings.

arkangath wrote:
> Shouldn't the meaning of the return value be documented here?
Good catch, added.



Comment at: clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp:1190-1191
+    }
+    OS << "\n";
+    OptionalEndif = "#endif // TypeExtension\n";
+  }

arkangath wrote:
> This seems to be the only assignment to the OptionalEndif variable. In that 
> case, can't it be a StringRef, with the function returning a StringRef too?
Indeed; updated.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120262/new/

https://reviews.llvm.org/D120262



[PATCH] D120262: [OpenCL] Handle TypeExtensions in OpenCLBuiltinFileEmitter

2022-02-23 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh updated this revision to Diff 410822.
svenvh added a comment.

Use StringRef and extend comment.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120262/new/

https://reviews.llvm.org/D120262

Files:
  clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp

Index: clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
===
--- clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
+++ clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
@@ -17,6 +17,7 @@
 #include "TableGenBackends.h"
 #include "llvm/ADT/MapVector.h"
 #include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/SmallString.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/StringMap.h"
@@ -293,6 +294,15 @@
   // was emitted.
   std::string emitVersionGuard(const Record *Builtin);
 
+  // Emit an #if guard for all type extensions required for the given type
+  // strings.  Return the corresponding closing #endif, or an empty string
+  // if no extension #if guard was emitted.
+  StringRef
+  emitTypeExtensionGuards(const SmallVectorImpl );
+
+  // Map type strings to type extensions (e.g. "half2" -> "cl_khr_fp16").
+  StringMap TypeExtMap;
+
   // Contains OpenCL builtin functions and related information, stored as
   // Record instances. They are coming from the associated TableGen file.
   RecordKeeper 
@@ -1066,7 +1076,16 @@
 // Insert the Cartesian product of the types and vector sizes.
 for (const auto &Vector : VectorList) {
   for (const auto &Type : TypeList) {
-ExpandedArg.push_back(getTypeString(Type, Flags, Vector));
+std::string FullType = getTypeString(Type, Flags, Vector);
+ExpandedArg.push_back(FullType);
+
+// If the type requires an extension, add a TypeExtMap entry mapping
+// the full type name to the extension.
+StringRef Ext =
+Arg->getValueAsDef("Extension")->getValueAsString("ExtName");
+if (!Ext.empty() && TypeExtMap.find(FullType) == TypeExtMap.end()) {
+  TypeExtMap.insert({FullType, Ext});
+}
   }
 }
 NumSignatures = std::max(NumSignatures, ExpandedArg.size());
@@ -1150,6 +1169,41 @@
   return OptionalEndif;
 }
 
+StringRef OpenCLBuiltinFileEmitterBase::emitTypeExtensionGuards(
+const SmallVectorImpl &Signature) {
+  StringRef OptionalEndif;
+  SmallSet ExtSet;
+
+  // Iterate over all types to gather the set of required TypeExtensions.
+  for (const auto &Ty : Signature) {
+StringRef TypeExt = TypeExtMap.lookup(Ty);
+if (!TypeExt.empty()) {
+  // The TypeExtensions are space-separated in the .td file.
+  SmallVector ExtVec;
+  TypeExt.split(ExtVec, " ");
+  for (const auto Ext : ExtVec) {
+ExtSet.insert(Ext);
+  }
+}
+  }
+
+  // Emit the #if when at least one extension is required.
+  if (!ExtSet.empty()) {
+OS << "#if ";
+bool isFirst = true;
+for (const auto Ext : ExtSet) {
+  if (!isFirst)
+OS << " && ";
+  OS << "defined(" << Ext << ")";
+  isFirst = false;
+}
+OS << "\n";
+OptionalEndif = "#endif // TypeExtension\n";
+  }
+
+  return OptionalEndif;
+}
+
 void OpenCLBuiltinTestEmitter::emit() {
   emitSourceFileHeader("OpenCL Builtin exhaustive testing", OS);
 
@@ -1172,6 +1226,8 @@
 std::string OptionalVersionEndif = emitVersionGuard(B);
 
 for (const auto &Signature : FTypes) {
+  StringRef OptionalTypeExtEndif = emitTypeExtensionGuards(Signature);
+
   // Emit function declaration.
   OS << Signature[0] << " test" << TestID++ << "_" << Name << "(";
   if (Signature.size() > 1) {
@@ -1198,6 +1254,7 @@
 
   // End of function body.
   OS << "}\n";
+  OS << OptionalTypeExtEndif;
 }
 
 OS << OptionalVersionEndif;


[PATCH] D120254: [OpenCL] Align subgroup builtin guards

2022-02-23 Thread Sven van Haastregt via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
svenvh marked an inline comment as done.
Closed by commit rGaa9c2d19d9b7: [OpenCL] Align subgroup builtin guards 
(authored by svenvh).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120254/new/

https://reviews.llvm.org/D120254

Files:
  clang/lib/Headers/opencl-c-base.h
  clang/lib/Headers/opencl-c.h
  clang/lib/Sema/OpenCLBuiltins.td
  clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl


Index: clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
===
--- clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
+++ clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
@@ -1,7 +1,7 @@
 // RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CL -fdeclare-opencl-builtins -DNO_HEADER
 // RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CL -fdeclare-opencl-builtins -finclude-default-header
-// RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CL1.2 -fdeclare-opencl-builtins -DNO_HEADER
-// RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CL1.2 -fdeclare-opencl-builtins -finclude-default-header
+// RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CL1.2 -fdeclare-opencl-builtins -DNO_HEADER -cl-ext=-cl_intel_subgroups
+// RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CL1.2 -fdeclare-opencl-builtins -finclude-default-header -cl-ext=-cl_intel_subgroups
 // RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CL2.0 -fdeclare-opencl-builtins -DNO_HEADER
 // RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CL2.0 -fdeclare-opencl-builtins -finclude-default-header
 // RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CL3.0 -fdeclare-opencl-builtins -finclude-default-header
@@ -79,6 +79,7 @@
 #define cl_khr_subgroup_non_uniform_arithmetic 1
 #define cl_khr_subgroup_clustered_reduce 1
 #define __opencl_c_read_write_images 1
+#define __opencl_subgroup_builtins 1
 #endif
 
 #if (__OPENCL_CPP_VERSION__ == 100 || __OPENCL_C_VERSION__ == 200)
Index: clang/lib/Sema/OpenCLBuiltins.td
===
--- clang/lib/Sema/OpenCLBuiltins.td
+++ clang/lib/Sema/OpenCLBuiltins.td
@@ -83,7 +83,7 @@
 
 // FunctionExtension definitions.
 def FuncExtNone  : FunctionExtension<"">;
-def FuncExtKhrSubgroups  : FunctionExtension<"cl_khr_subgroups">;
+def FuncExtKhrSubgroups  : FunctionExtension<"__opencl_subgroup_builtins">;
 def FuncExtKhrSubgroupExtendedTypes  : FunctionExtension<"cl_khr_subgroup_extended_types">;
 def FuncExtKhrSubgroupNonUniformVote : FunctionExtension<"cl_khr_subgroup_non_uniform_vote">;
 def FuncExtKhrSubgroupBallot : FunctionExtension<"cl_khr_subgroup_ballot">;
Index: clang/lib/Headers/opencl-c.h
===
--- clang/lib/Headers/opencl-c.h
+++ clang/lib/Headers/opencl-c.h
@@ -16282,7 +16282,7 @@
 
 // OpenCL Extension v2.0 s9.17 - Sub-groups
 
-#if defined(cl_intel_subgroups) || defined(cl_khr_subgroups) || defined(__opencl_c_subgroups)
+#if defined(__opencl_subgroup_builtins)
 // Shared Sub Group Functions
 uint __ovld get_sub_group_size(void);
 uint __ovld get_max_sub_group_size(void);
@@ -16381,7 +16381,7 @@
 double  __ovld __conv sub_group_scan_inclusive_max(double x);
 #endif //cl_khr_fp64
 
-#endif //cl_khr_subgroups cl_intel_subgroups __opencl_c_subgroups
+#endif // __opencl_subgroup_builtins
 
 #if defined(cl_khr_subgroup_extended_types)
 char __ovld __conv sub_group_broadcast( char value, uint index );
Index: clang/lib/Headers/opencl-c-base.h
===
--- clang/lib/Headers/opencl-c-base.h
+++ clang/lib/Headers/opencl-c-base.h
@@ -80,6 +80,11 @@
 #define __opencl_c_named_address_space_builtins 1
 #endif // !defined(__opencl_c_generic_address_space)
 
+#if defined(cl_intel_subgroups) || defined(cl_khr_subgroups) || defined(__opencl_c_subgroups)
+// Internal feature macro to provide subgroup builtins.
+#define __opencl_subgroup_builtins 1
+#endif
+
 // built-in scalar data types:
 
 /**



[PATCH] D120254: [OpenCL] Align subgroup builtin guards

2022-02-23 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh marked an inline comment as done.
svenvh added inline comments.



Comment at: clang/lib/Headers/opencl-c-base.h:85
+// Internal feature macro to provide subgroup builtins.
+#define __opencl_subgroup_builtins 1
+#endif

azabaznov wrote:
> svenvh wrote:
> > I'm in doubt whether we could just reuse `__opencl_c_subgroups` for this?
> I think we couldn't. Those subgroup features/extensions are different, as an 
> implementation may support the extension but not the feature. The difference 
> is in subgroup independent forward progress: for example, it's required by 
> the extension, but optional in the OpenCL C 3.0 feature.
Thanks for confirming, I will keep the separate internal feature macro then.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120254/new/

https://reviews.llvm.org/D120254



[PATCH] D120032: [OpenCL] opencl-c.h: use uint/ulong consistently

2022-02-22 Thread Sven van Haastregt via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGe7e17b30d02d: [OpenCL] opencl-c.h: use uint/ulong 
consistently (authored by svenvh).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120032/new/

https://reviews.llvm.org/D120032

Files:
  clang/lib/Headers/opencl-c.h

Index: clang/lib/Headers/opencl-c.h
===
--- clang/lib/Headers/opencl-c.h
+++ clang/lib/Headers/opencl-c.h
@@ -12919,28 +12919,28 @@
  * pointed by p. The function returns old.
  */
 int __ovld atomic_add(volatile __global int *p, int val);
-unsigned int __ovld atomic_add(volatile __global unsigned int *p, unsigned int val);
+uint __ovld atomic_add(volatile __global uint *p, uint val);
 int __ovld atomic_add(volatile __local int *p, int val);
-unsigned int __ovld atomic_add(volatile __local unsigned int *p, unsigned int val);
+uint __ovld atomic_add(volatile __local uint *p, uint val);
 #ifdef __OPENCL_CPP_VERSION__
 int __ovld atomic_add(volatile int *p, int val);
-unsigned int __ovld atomic_add(volatile unsigned int *p, unsigned int val);
+uint __ovld atomic_add(volatile uint *p, uint val);
 #endif
 
 #if defined(cl_khr_global_int32_base_atomics)
 int __ovld atom_add(volatile __global int *p, int val);
-unsigned int __ovld atom_add(volatile __global unsigned int *p, unsigned int val);
+uint __ovld atom_add(volatile __global uint *p, uint val);
 #endif
 #if defined(cl_khr_local_int32_base_atomics)
 int __ovld atom_add(volatile __local int *p, int val);
-unsigned int __ovld atom_add(volatile __local unsigned int *p, unsigned int val);
+uint __ovld atom_add(volatile __local uint *p, uint val);
 #endif
 
 #if defined(cl_khr_int64_base_atomics)
 long __ovld atom_add(volatile __global long *p, long val);
-unsigned long __ovld atom_add(volatile __global unsigned long *p, unsigned long val);
+ulong __ovld atom_add(volatile __global ulong *p, ulong val);
 long __ovld atom_add(volatile __local long *p, long val);
-unsigned long __ovld atom_add(volatile __local unsigned long *p, unsigned long val);
+ulong __ovld atom_add(volatile __local ulong *p, ulong val);
 #endif
 
 /**
@@ -12949,28 +12949,28 @@
  * returns old.
  */
 int __ovld atomic_sub(volatile __global int *p, int val);
-unsigned int __ovld atomic_sub(volatile __global unsigned int *p, unsigned int val);
+uint __ovld atomic_sub(volatile __global uint *p, uint val);
 int __ovld atomic_sub(volatile __local int *p, int val);
-unsigned int __ovld atomic_sub(volatile __local unsigned int *p, unsigned int val);
+uint __ovld atomic_sub(volatile __local uint *p, uint val);
 #ifdef __OPENCL_CPP_VERSION__
 int __ovld atomic_sub(volatile int *p, int val);
-unsigned int __ovld atomic_sub(volatile unsigned int *p, unsigned int val);
+uint __ovld atomic_sub(volatile uint *p, uint val);
 #endif
 
 #if defined(cl_khr_global_int32_base_atomics)
 int __ovld atom_sub(volatile __global int *p, int val);
-unsigned int __ovld atom_sub(volatile __global unsigned int *p, unsigned int val);
+uint __ovld atom_sub(volatile __global uint *p, uint val);
 #endif
 #if defined(cl_khr_local_int32_base_atomics)
 int __ovld atom_sub(volatile __local int *p, int val);
-unsigned int __ovld atom_sub(volatile __local unsigned int *p, unsigned int val);
+uint __ovld atom_sub(volatile __local uint *p, uint val);
 #endif
 
 #if defined(cl_khr_int64_base_atomics)
 long __ovld atom_sub(volatile __global long *p, long val);
-unsigned long __ovld atom_sub(volatile __global unsigned long *p, unsigned long val);
+ulong __ovld atom_sub(volatile __global ulong *p, ulong val);
 long __ovld atom_sub(volatile __local long *p, long val);
-unsigned long __ovld atom_sub(volatile __local unsigned long *p, unsigned long val);
+ulong __ovld atom_sub(volatile __local ulong *p, ulong val);
 #endif
 
 /**
@@ -12979,31 +12979,31 @@
  * value.
  */
 int __ovld atomic_xchg(volatile __global int *p, int val);
-unsigned int __ovld atomic_xchg(volatile __global unsigned int *p, unsigned int val);
+uint __ovld atomic_xchg(volatile __global uint *p, uint val);
 int __ovld atomic_xchg(volatile __local int *p, int val);
-unsigned int __ovld atomic_xchg(volatile __local unsigned int *p, unsigned int val);
+uint __ovld atomic_xchg(volatile __local uint *p, uint val);
 float __ovld atomic_xchg(volatile __global float *p, float val);
 float __ovld atomic_xchg(volatile __local float *p, float val);
 #ifdef __OPENCL_CPP_VERSION__
 int __ovld atomic_xchg(volatile int *p, int val);
-unsigned int __ovld atomic_xchg(volatile unsigned int *p, unsigned int val);
+uint __ovld atomic_xchg(volatile uint *p, uint val);
 float __ovld atomic_xchg(volatile float *p, float val);
 #endif
 
 #if defined(cl_khr_global_int32_base_atomics)
 int __ovld atom_xchg(volatile __global int *p, int val);
-unsigned int __ovld atom_xchg(volatile __global unsigned int *p, unsigned int val);

[PATCH] D120262: [OpenCL] Handle TypeExtensions in OpenCLBuiltinFileEmitter

2022-02-21 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh created this revision.
svenvh added a reviewer: Anastasia.
svenvh added a project: clang.
Herald added subscribers: Naghasan, ldrumm, yaxunl.
svenvh requested review of this revision.
Herald added a subscriber: cfe-commits.

Until now, any types that had TypeExtensions attached to them were not
guarded with those extensions.  Extend the OpenCLBuiltinFileEmitter
such that all required extensions are emitted for the types of a
builtin function.

The `clang-tblgen -gen-clang-opencl-builtin-tests` emitter will now
produce e.g.:

  #if defined(cl_khr_fp16) && defined(cl_khr_fp64)
  half8 test11802_convert_half8_rtp(double8 arg1) {
    return convert_half8_rtp(arg1);
  }
  #endif // TypeExtension
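
For readers without the emitter source at hand, the guard construction can be sketched roughly as follows. This is a simplified approximation under stated assumptions, not the code from the patch: it uses `std::map` and `std::set` in place of the LLVM containers, it returns the `#if` line as a string instead of writing to an output stream, and `buildTypeExtensionGuard` and the `TypeExtMap` alias are hypothetical names chosen for illustration.

```cpp
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the emitter's type-to-extension map. Each value
// is a space-separated list of extension macro names.
using TypeExtMap = std::map<std::string, std::string>;

// Returns the "#if ..." line guarding a signature, or an empty string when
// none of the signature's types requires an extension.
std::string buildTypeExtensionGuard(const std::vector<std::string> &Signature,
                                    const TypeExtMap &Map) {
  // Gather the required extensions; std::set deduplicates and sorts them.
  std::set<std::string> Exts;
  for (const std::string &Ty : Signature) {
    auto It = Map.find(Ty);
    if (It == Map.end())
      continue;
    std::istringstream Words(It->second);
    std::string Ext;
    while (Words >> Ext)
      Exts.insert(Ext);
  }
  if (Exts.empty())
    return "";

  // Join the extensions as defined(A) && defined(B) && ...
  std::string Guard = "#if ";
  bool IsFirst = true;
  for (const std::string &Ext : Exts) {
    if (!IsFirst)
      Guard += " && ";
    Guard += "defined(" + Ext + ")";
    IsFirst = false;
  }
  return Guard + "\n";
}
```

With a map containing `half8 -> cl_khr_fp16` and `double8 -> cl_khr_fp64`, the signature `{half8, double8}` yields the `#if` line shown in the example above, while a signature of plain `int` and `float` yields no guard.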


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D120262

Files:
  clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp

Index: clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
===
--- clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
+++ clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
@@ -17,6 +17,7 @@
 #include "TableGenBackends.h"
 #include "llvm/ADT/MapVector.h"
 #include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/SmallString.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/StringMap.h"
@@ -293,6 +294,14 @@
   // was emitted.
   std::string emitVersionGuard(const Record *Builtin);
 
+  // Emit an #if guard for all type extensions required for the given type
+  // strings.
+  std::string
+  emitTypeExtensionGuards(const SmallVectorImpl );
+
+  // Map type strings to type extensions (e.g. "half2" -> "cl_khr_fp16").
+  StringMap TypeExtMap;
+
   // Contains OpenCL builtin functions and related information, stored as
   // Record instances. They are coming from the associated TableGen file.
   RecordKeeper 
@@ -1057,7 +1066,16 @@
 // Insert the Cartesian product of the types and vector sizes.
 for (const auto &Vector : VectorList) {
   for (const auto &Type : TypeList) {
-ExpandedArg.push_back(getTypeString(Type, Flags, Vector));
+std::string FullType = getTypeString(Type, Flags, Vector);
+ExpandedArg.push_back(FullType);
+
+// If the type requires an extension, add a TypeExtMap entry mapping
+// the full type name to the extension.
+StringRef Ext =
+Arg->getValueAsDef("Extension")->getValueAsString("ExtName");
+if (!Ext.empty() && TypeExtMap.find(FullType) == TypeExtMap.end()) {
+  TypeExtMap.insert({FullType, Ext});
+}
   }
 }
 NumSignatures = std::max(NumSignatures, ExpandedArg.size());
@@ -1141,6 +1159,41 @@
   return OptionalEndif;
 }
 
+std::string OpenCLBuiltinFileEmitterBase::emitTypeExtensionGuards(
+const SmallVectorImpl &Signature) {
+  std::string OptionalEndif;
+  SmallSet ExtSet;
+
+  // Iterate over all types to gather the set of required TypeExtensions.
+  for (const auto &Ty : Signature) {
+StringRef TypeExt = TypeExtMap.lookup(Ty);
+if (!TypeExt.empty()) {
+  // The TypeExtensions are space-separated in the .td file.
+  SmallVector ExtVec;
+  TypeExt.split(ExtVec, " ");
+  for (const auto Ext : ExtVec) {
+ExtSet.insert(Ext);
+  }
+}
+  }
+
+  // Emit the #if when at least one extension is required.
+  if (!ExtSet.empty()) {
+OS << "#if ";
+bool isFirst = true;
+for (const auto Ext : ExtSet) {
+  if (!isFirst)
+OS << " && ";
+  OS << "defined(" << Ext << ")";
+  isFirst = false;
+}
+OS << "\n";
+OptionalEndif = "#endif // TypeExtension\n";
+  }
+
+  return OptionalEndif;
+}
+
 void OpenCLBuiltinTestEmitter::emit() {
   emitSourceFileHeader("OpenCL Builtin exhaustive testing", OS);
 
@@ -1163,6 +1216,8 @@
 std::string OptionalVersionEndif = emitVersionGuard(B);
 
 for (const auto &Signature : FTypes) {
+  std::string OptionalTypeExtEndif = emitTypeExtensionGuards(Signature);
+
   // Emit function declaration.
   OS << Signature[0] << " test" << TestID++ << "_" << Name << "(";
   if (Signature.size() > 1) {
@@ -1189,6 +1244,7 @@
 
   // End of function body.
   OS << "}\n";
+  OS << OptionalTypeExtEndif;
 }
 
 OS << OptionalVersionEndif;


[PATCH] D120254: [OpenCL] Align subgroup builtin guards

2022-02-21 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added inline comments.



Comment at: clang/lib/Headers/opencl-c-base.h:85
+// Internal feature macro to provide subgroup builtins.
+#define __opencl_subgroup_builtins 1
+#endif

I'm in doubt whether we could just reuse `__opencl_c_subgroups` for this?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120254/new/

https://reviews.llvm.org/D120254



[PATCH] D120254: [OpenCL] Align subgroup builtin guards

2022-02-21 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh created this revision.
svenvh added a reviewer: azabaznov.
Herald added subscribers: Naghasan, ldrumm, yaxunl.
svenvh requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Until now, subgroup builtins have been available with `opencl-c.h` when at
least one of `cl_intel_subgroups`, `cl_khr_subgroups`, or
`__opencl_c_subgroups` is defined.  With `-fdeclare-opencl-builtins`,
subgroup builtins have been conditionalized on `cl_khr_subgroups` only.

Align `-fdeclare-opencl-builtins` to `opencl-c.h` by introducing the
internal `__opencl_subgroup_builtins` macro.
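
The effect of the internal macro can be approximated on the host with plain preprocessor logic. The sketch below is only a simulation under stated assumptions: it is ordinary C++ rather than OpenCL C, it defines `cl_intel_subgroups` by hand to mimic a target that advertises only the Intel extension, and `subgroupBuiltinsAvailable` is a hypothetical helper that is not part of the patch.

```cpp
// Simulate a target that advertises only the Intel subgroup extension
// (assumption for this sketch; a real compiler would predefine the macro).
#define cl_intel_subgroups 1

#if defined(cl_intel_subgroups) || defined(cl_khr_subgroups) ||                \
    defined(__opencl_c_subgroups)
// Internal feature macro to provide subgroup builtins.
#define __opencl_subgroup_builtins 1
#endif

// Hypothetical helper: reports whether declarations guarded by the internal
// macro would be visible.
constexpr bool subgroupBuiltinsAvailable() {
#if defined(__opencl_subgroup_builtins)
  return true;
#else
  return false;
#endif
}

static_assert(subgroupBuiltinsAvailable(),
              "any one of the three macros enables the subgroup builtins");
```

Because both the header and the TableGen-driven declarations test the same single macro, the two mechanisms cannot drift apart on which of the three conditions they honor.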


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D120254

Files:
  clang/lib/Headers/opencl-c-base.h
  clang/lib/Headers/opencl-c.h
  clang/lib/Sema/OpenCLBuiltins.td
  clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl


Index: clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
===
--- clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
+++ clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
@@ -1,7 +1,7 @@
 // RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CL -fdeclare-opencl-builtins -DNO_HEADER
 // RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CL -fdeclare-opencl-builtins -finclude-default-header
-// RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CL1.2 -fdeclare-opencl-builtins -DNO_HEADER
-// RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CL1.2 -fdeclare-opencl-builtins -finclude-default-header
+// RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CL1.2 -fdeclare-opencl-builtins -DNO_HEADER -cl-ext=-cl_intel_subgroups
+// RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CL1.2 -fdeclare-opencl-builtins -finclude-default-header -cl-ext=-cl_intel_subgroups
 // RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CL2.0 -fdeclare-opencl-builtins -DNO_HEADER
 // RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CL2.0 -fdeclare-opencl-builtins -finclude-default-header
 // RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CL3.0 -fdeclare-opencl-builtins -finclude-default-header
@@ -79,6 +79,7 @@
 #define cl_khr_subgroup_non_uniform_arithmetic 1
 #define cl_khr_subgroup_clustered_reduce 1
 #define __opencl_c_read_write_images 1
+#define __opencl_subgroup_builtins 1
 #endif
 
 #if (__OPENCL_CPP_VERSION__ == 100 || __OPENCL_C_VERSION__ == 200)
Index: clang/lib/Sema/OpenCLBuiltins.td
===
--- clang/lib/Sema/OpenCLBuiltins.td
+++ clang/lib/Sema/OpenCLBuiltins.td
@@ -83,7 +83,7 @@
 
 // FunctionExtension definitions.
 def FuncExtNone  : FunctionExtension<"">;
-def FuncExtKhrSubgroups  : FunctionExtension<"cl_khr_subgroups">;
+def FuncExtKhrSubgroups  : FunctionExtension<"__opencl_subgroup_builtins">;
 def FuncExtKhrSubgroupExtendedTypes  : FunctionExtension<"cl_khr_subgroup_extended_types">;
 def FuncExtKhrSubgroupNonUniformVote : FunctionExtension<"cl_khr_subgroup_non_uniform_vote">;
 def FuncExtKhrSubgroupBallot : FunctionExtension<"cl_khr_subgroup_ballot">;
Index: clang/lib/Headers/opencl-c.h
===
--- clang/lib/Headers/opencl-c.h
+++ clang/lib/Headers/opencl-c.h
@@ -16282,7 +16282,7 @@
 
 // OpenCL Extension v2.0 s9.17 - Sub-groups
 
-#if defined(cl_intel_subgroups) || defined(cl_khr_subgroups) || defined(__opencl_c_subgroups)
+#if defined(__opencl_subgroup_builtins)
 // Shared Sub Group Functions
 uint __ovld get_sub_group_size(void);
 uint __ovld get_max_sub_group_size(void);
@@ -16381,7 +16381,7 @@
 double  __ovld __conv sub_group_scan_inclusive_max(double x);
 #endif //cl_khr_fp64
 
-#endif //cl_khr_subgroups cl_intel_subgroups __opencl_c_subgroups
+#endif // __opencl_subgroup_builtins
 
 #if defined(cl_khr_subgroup_extended_types)
 char __ovld __conv sub_group_broadcast( char value, uint index );
Index: clang/lib/Headers/opencl-c-base.h
===
--- clang/lib/Headers/opencl-c-base.h
+++ clang/lib/Headers/opencl-c-base.h
@@ -80,6 +80,11 @@
 #define __opencl_c_named_address_space_builtins 1
 #endif // !defined(__opencl_c_generic_address_space)
 
+#if defined(cl_intel_subgroups) || defined(cl_khr_subgroups) || defined(__opencl_c_subgroups)
+// Internal feature macro to provide subgroup builtins.
+#define __opencl_subgroup_builtins 1
+#endif
+
 // built-in scalar data types:
 
 /**



[PATCH] D120032: [OpenCL] opencl-c.h: use uint/ulong consistently

2022-02-17 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh created this revision.
svenvh added a reviewer: Anastasia.
svenvh added a project: clang.
Herald added subscribers: Naghasan, ldrumm, yaxunl.
svenvh requested review of this revision.
Herald added a subscriber: cfe-commits.

Most places already seem to use the short spelling instead of
'unsigned int/long', so perform the following substitutions:

  s/unsigned int /uint /g
  s/unsigned long /ulong /g

This simplifies completeness comparisons against OpenCLBuiltins.td.
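
The two substitutions can also be expressed programmatically. The sketch below is merely an illustration using `std::regex_replace` from the C++ standard library; `useShortUnsignedSpellings` is a hypothetical name, and there is no claim that the change was produced this way. Like the `s///g` commands above, each pattern keeps its trailing space so that only the type spelling followed by a space is rewritten.

```cpp
#include <regex>
#include <string>

// Rewrites "unsigned int " to "uint " and "unsigned long " to "ulong " in one
// line of text, mirroring the two global substitutions listed above.
std::string useShortUnsignedSpellings(const std::string &Line) {
  static const std::regex UnsignedInt("unsigned int ");
  static const std::regex UnsignedLong("unsigned long ");
  std::string Result = std::regex_replace(Line, UnsignedInt, "uint ");
  return std::regex_replace(Result, UnsignedLong, "ulong ");
}
```

Applied to a declaration such as `unsigned int __ovld atomic_add(volatile __global unsigned int *p, unsigned int val);`, all three occurrences are shortened, which matches the shape of the changed lines in the diff below.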


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D120032

Files:
  clang/lib/Headers/opencl-c.h

Index: clang/lib/Headers/opencl-c.h
===
--- clang/lib/Headers/opencl-c.h
+++ clang/lib/Headers/opencl-c.h
@@ -12919,28 +12919,28 @@
  * pointed by p. The function returns old.
  */
 int __ovld atomic_add(volatile __global int *p, int val);
-unsigned int __ovld atomic_add(volatile __global unsigned int *p, unsigned int val);
+uint __ovld atomic_add(volatile __global uint *p, uint val);
 int __ovld atomic_add(volatile __local int *p, int val);
-unsigned int __ovld atomic_add(volatile __local unsigned int *p, unsigned int val);
+uint __ovld atomic_add(volatile __local uint *p, uint val);
 #ifdef __OPENCL_CPP_VERSION__
 int __ovld atomic_add(volatile int *p, int val);
-unsigned int __ovld atomic_add(volatile unsigned int *p, unsigned int val);
+uint __ovld atomic_add(volatile uint *p, uint val);
 #endif
 
 #if defined(cl_khr_global_int32_base_atomics)
 int __ovld atom_add(volatile __global int *p, int val);
-unsigned int __ovld atom_add(volatile __global unsigned int *p, unsigned int val);
+uint __ovld atom_add(volatile __global uint *p, uint val);
 #endif
 #if defined(cl_khr_local_int32_base_atomics)
 int __ovld atom_add(volatile __local int *p, int val);
-unsigned int __ovld atom_add(volatile __local unsigned int *p, unsigned int val);
+uint __ovld atom_add(volatile __local uint *p, uint val);
 #endif
 
 #if defined(cl_khr_int64_base_atomics)
 long __ovld atom_add(volatile __global long *p, long val);
-unsigned long __ovld atom_add(volatile __global unsigned long *p, unsigned long val);
+ulong __ovld atom_add(volatile __global ulong *p, ulong val);
 long __ovld atom_add(volatile __local long *p, long val);
-unsigned long __ovld atom_add(volatile __local unsigned long *p, unsigned long val);
+ulong __ovld atom_add(volatile __local ulong *p, ulong val);
 #endif
 
 /**
@@ -12949,28 +12949,28 @@
  * returns old.
  */
 int __ovld atomic_sub(volatile __global int *p, int val);
-unsigned int __ovld atomic_sub(volatile __global unsigned int *p, unsigned int val);
+uint __ovld atomic_sub(volatile __global uint *p, uint val);
 int __ovld atomic_sub(volatile __local int *p, int val);
-unsigned int __ovld atomic_sub(volatile __local unsigned int *p, unsigned int val);
+uint __ovld atomic_sub(volatile __local uint *p, uint val);
 #ifdef __OPENCL_CPP_VERSION__
 int __ovld atomic_sub(volatile int *p, int val);
-unsigned int __ovld atomic_sub(volatile unsigned int *p, unsigned int val);
+uint __ovld atomic_sub(volatile uint *p, uint val);
 #endif
 
 #if defined(cl_khr_global_int32_base_atomics)
 int __ovld atom_sub(volatile __global int *p, int val);
-unsigned int __ovld atom_sub(volatile __global unsigned int *p, unsigned int val);
+uint __ovld atom_sub(volatile __global uint *p, uint val);
 #endif
 #if defined(cl_khr_local_int32_base_atomics)
 int __ovld atom_sub(volatile __local int *p, int val);
-unsigned int __ovld atom_sub(volatile __local unsigned int *p, unsigned int val);
+uint __ovld atom_sub(volatile __local uint *p, uint val);
 #endif
 
 #if defined(cl_khr_int64_base_atomics)
 long __ovld atom_sub(volatile __global long *p, long val);
-unsigned long __ovld atom_sub(volatile __global unsigned long *p, unsigned long val);
+ulong __ovld atom_sub(volatile __global ulong *p, ulong val);
 long __ovld atom_sub(volatile __local long *p, long val);
-unsigned long __ovld atom_sub(volatile __local unsigned long *p, unsigned long val);
+ulong __ovld atom_sub(volatile __local ulong *p, ulong val);
 #endif
 
 /**
@@ -12979,31 +12979,31 @@
  * value.
  */
 int __ovld atomic_xchg(volatile __global int *p, int val);
-unsigned int __ovld atomic_xchg(volatile __global unsigned int *p, unsigned int val);
+uint __ovld atomic_xchg(volatile __global uint *p, uint val);
 int __ovld atomic_xchg(volatile __local int *p, int val);
-unsigned int __ovld atomic_xchg(volatile __local unsigned int *p, unsigned int val);
+uint __ovld atomic_xchg(volatile __local uint *p, uint val);
 float __ovld atomic_xchg(volatile __global float *p, float val);
 float __ovld atomic_xchg(volatile __local float *p, float val);
 #ifdef __OPENCL_CPP_VERSION__
 int __ovld atomic_xchg(volatile int *p, int val);
-unsigned int __ovld atomic_xchg(volatile unsigned int *p, unsigned int val);
+uint __ovld atomic_xchg(volatile uint *p, uint val);
 float __ovld atomic_xchg(volatile float *p, float val);
 #endif
 
 #if 

[PATCH] D119858: [OpenCL] Guard 64-bit atomic types

2022-02-17 Thread Sven van Haastregt via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG9798b33d1dc1: [OpenCL] Guard 64-bit atomic types (authored 
by svenvh).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119858/new/

https://reviews.llvm.org/D119858

Files:
  clang/lib/Sema/OpenCLBuiltins.td
  clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
  clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp

Index: clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
===
--- clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
+++ clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
@@ -733,6 +733,20 @@
   OS << "} // isOpenCLBuiltin\n";
 }
 
+// Emit an if-statement with an isMacroDefined call for each extension in
+// the space-separated list of extensions.
+static void EmitMacroChecks(raw_ostream &OS, StringRef Extensions) {
+  SmallVector ExtVec;
+  Extensions.split(ExtVec, " ");
+  OS << "  if (";
+  for (StringRef Ext : ExtVec) {
+if (Ext != ExtVec.front())
+  OS << " && ";
+OS << "S.getPreprocessor().isMacroDefined(\"" << Ext << "\")";
+  }
+  OS << ") {\n  ";
+}
+
 void BuiltinNameEmitter::EmitQualTypeFinder() {
   OS << R"(
 
@@ -825,15 +839,14 @@
 // Collect all QualTypes for a single vector size into TypeList.
 OS << "  SmallVector TypeList;\n";
 for (const auto *T : BaseTypes) {
-  StringRef Ext =
+  StringRef Exts =
   T->getValueAsDef("Extension")->getValueAsString("ExtName");
-  if (!Ext.empty()) {
-OS << "  if (S.getPreprocessor().isMacroDefined(\"" << Ext
-   << "\")) {\n  ";
+  if (!Exts.empty()) {
+EmitMacroChecks(OS, Exts);
   }
   OS << "  TypeList.push_back("
  << T->getValueAsDef("QTExpr")->getValueAsString("TypeExpr") << ");\n";
-  if (!Ext.empty()) {
+  if (!Exts.empty()) {
 OS << "  }\n";
   }
 }
@@ -877,15 +890,14 @@
 // Emit the cases for non generic, non image types.
 OS << "case OCLT_" << T->getValueAsString("Name") << ":\n";
 
-StringRef Ext = T->getValueAsDef("Extension")->getValueAsString("ExtName");
-// If this type depends on an extension, ensure the extension macro is
+StringRef Exts = T->getValueAsDef("Extension")->getValueAsString("ExtName");
+// If this type depends on an extension, ensure the extension macros are
 // defined.
-if (!Ext.empty()) {
-  OS << "  if (S.getPreprocessor().isMacroDefined(\"" << Ext
- << "\")) {\n  ";
+if (!Exts.empty()) {
+  EmitMacroChecks(OS, Exts);
 }
 OS << "  QT.push_back(" << QT->getValueAsString("TypeExpr") << ");\n";
-if (!Ext.empty()) {
+if (!Exts.empty()) {
   OS << "  }\n";
 }
 OS << "  break;\n";
Index: clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
===
--- clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
+++ clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
@@ -163,6 +163,25 @@
 }
 #endif // !defined(NO_HEADER) && __OPENCL_C_VERSION__ >= 200
 
+#if !defined(NO_HEADER) && __OPENCL_C_VERSION__ == 200 && defined(__opencl_c_generic_address_space)
+
+// Test that overloads that use atomic_double are not available when the fp64
+// extension is disabled.  Test this by counting the number of notes about
+// candidate functions.
+void test_atomic_double_reporting(volatile __generic atomic_int *a) {
+  atomic_init(a);
+  // expected-error@-1{{no matching function for call to 'atomic_init'}}
+#if defined(NO_FP64)
+  // Expecting 5 candidates: int, uint, long, ulong, float
+  // expected-note@-4 5 {{candidate function not viable: requires 2 arguments, but 1 was provided}}
+#else
+  // Expecting 6 candidates: int, uint, long, ulong, float, double
+  // expected-note@-7 6 {{candidate function not viable: requires 2 arguments, but 1 was provided}}
+#endif
+}
+
+#endif
+
 #if defined(NO_ATOMSCOPE) && __OPENCL_C_VERSION__ >= 300
 // Disable the feature by undefining the feature macro.
 #undef __opencl_c_atomic_scope_device
Index: clang/lib/Sema/OpenCLBuiltins.td
===
--- clang/lib/Sema/OpenCLBuiltins.td
+++ clang/lib/Sema/OpenCLBuiltins.td
@@ -78,6 +78,8 @@
 def NoTypeExt   : TypeExtension<"">;
 def Fp16TypeExt : TypeExtension<"cl_khr_fp16">;
 def Fp64TypeExt : TypeExtension<"cl_khr_fp64">;
+def Atomic64TypeExt : TypeExtension<"cl_khr_int64_base_atomics cl_khr_int64_extended_atomics">;
+def AtomicFp64TypeExt : TypeExtension<"cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp64">;
 
 // FunctionExtension definitions.
 def FuncExtNone  : FunctionExtension<"">;
@@ -389,10 +391,14 @@
 // OpenCL v2.0 s6.13.11: Atomic integer and floating-point types.
 def AtomicInt : Type<"atomic_int", 

[PATCH] D119398: [OpenCL] Guard atomic_double with cl_khr_int64_base_atomics and cl_khr_int64_extended_atomics

2022-02-16 Thread Sven van Haastregt via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG477bc8e8b931: [OpenCL] Guard atomic_double with 
cl_khr_int64_* (authored by svenvh).

Changed prior to commit:
  https://reviews.llvm.org/D119398?vs=407506&id=409179#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119398/new/

https://reviews.llvm.org/D119398

Files:
  clang/lib/Headers/opencl-c.h


Index: clang/lib/Headers/opencl-c.h
===
--- clang/lib/Headers/opencl-c.h
+++ clang/lib/Headers/opencl-c.h
@@ -13832,6 +13832,7 @@
 #endif // defined(__opencl_c_ext_fp32_global_atomic_min_max) &&\
 defined(__opencl_c_ext_fp32_local_atomic_min_max)
 
+#if defined(cl_khr_int64_base_atomics) && defined(cl_khr_int64_extended_atomics)
 #if defined(__opencl_c_ext_fp64_global_atomic_min_max)
 double __ovld atomic_fetch_min(volatile __global atomic_double *object,
double operand);
@@ -13882,6 +13883,8 @@
 memory_scope scope);
 #endif // defined(__opencl_c_ext_fp64_global_atomic_min_max) &&\
 defined(__opencl_c_ext_fp64_local_atomic_min_max)
+#endif // defined(cl_khr_int64_base_atomics) &&\
+defined(cl_khr_int64_extended_atomics)
 
 #if defined(__opencl_c_ext_fp16_global_atomic_add)
 half __ovld atomic_fetch_add(volatile __global atomic_half *object,
@@ -13985,6 +13988,7 @@
 #endif // defined(__opencl_c_ext_fp32_global_atomic_add) &&\
 defined(__opencl_c_ext_fp32_local_atomic_add)
 
+#if defined(cl_khr_int64_base_atomics) && defined(cl_khr_int64_extended_atomics)
 #if defined(__opencl_c_ext_fp64_global_atomic_add)
 double __ovld atomic_fetch_add(volatile __global atomic_double *object,
double operand);
@@ -14035,6 +14039,8 @@
 memory_scope scope);
 #endif // defined(__opencl_c_ext_fp64_global_atomic_add) &&\
 defined(__opencl_c_ext_fp64_local_atomic_add)
+#endif // defined(cl_khr_int64_base_atomics) &&\
+defined(cl_khr_int64_extended_atomics)
 
 #endif // cl_ext_float_atomics
 


[PATCH] D119858: [OpenCL] Guard 64-bit atomic types

2022-02-15 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

In D119858#3323565, @Anastasia wrote:

> LGTM! Thanks!
>
> I imagine this is another change to align with `opencl-c.h`?

Yes. This addresses the issue of D119398 for 
tablegen (although the problem for the tablegen case was less severe, since it 
only affects diagnostics).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119858/new/

https://reviews.llvm.org/D119858



[PATCH] D119858: [OpenCL] Guard 64-bit atomic types

2022-02-15 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh created this revision.
svenvh added reviewers: Anastasia, haonanya.
svenvh added a project: clang.
Herald added subscribers: Naghasan, ldrumm, yaxunl.
svenvh requested review of this revision.
Herald added a subscriber: cfe-commits.

Until now, overloads with a 64-bit atomic type argument were always
made available with `-fdeclare-opencl-builtins`.  Ensure these
overloads are only available when both the `cl_khr_int64_base_atomics`
and `cl_khr_int64_extended_atomics` extensions have been enabled, as
required by the OpenCL specification.
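
As a rough illustration of the check this patch generates for a space-separated
extension list, here is a standard-library sketch. It is not the emitter code:
`buildMacroCheck` is a hypothetical name, and `std::string` stands in for the
`StringRef`/`raw_ostream` types used in `ClangOpenCLBuiltinEmitter.cpp`.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Build the opening of an if-statement that requires every extension macro in
// the space-separated list `extensions` to be defined. Hypothetical helper;
// it only mirrors the shape of the text emitted by the patch.
std::string buildMacroCheck(const std::string &extensions) {
  std::istringstream in(extensions);
  std::string ext;
  std::string cond;
  while (in >> ext) {
    // Join the individual checks with a logical AND.
    if (!cond.empty())
      cond += " && ";
    cond += "S.getPreprocessor().isMacroDefined(\"" + ext + "\")";
  }
  return "if (" + cond + ") {";
}
```

For the `Atomic64TypeExt` string this yields one condition with two
`isMacroDefined` calls joined by `&&`. As in the patch, callers are expected to
skip the check entirely when the extension string is empty.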


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D119858

Files:
  clang/lib/Sema/OpenCLBuiltins.td
  clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
  clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp

Index: clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
===
--- clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
+++ clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
@@ -733,6 +733,20 @@
   OS << "} // isOpenCLBuiltin\n";
 }
 
+// Emit an if-statement with an isMacroDefined call for each extension in
+// the space-separated list of extensions.
+static void EmitMacroChecks(raw_ostream &OS, StringRef Extensions) {
+  SmallVector<StringRef, 2> ExtVec;
+  Extensions.split(ExtVec, " ");
+  OS << "  if (";
+  for (StringRef Ext : ExtVec) {
+if (Ext != ExtVec.front())
+  OS << " && ";
+OS << "S.getPreprocessor().isMacroDefined(\"" << Ext << "\")";
+  }
+  OS << ") {\n  ";
+}
+
 void BuiltinNameEmitter::EmitQualTypeFinder() {
   OS << R"(
 
@@ -825,15 +839,14 @@
 // Collect all QualTypes for a single vector size into TypeList.
 OS << "  SmallVector TypeList;\n";
 for (const auto *T : BaseTypes) {
-  StringRef Ext =
+  StringRef Exts =
   T->getValueAsDef("Extension")->getValueAsString("ExtName");
-  if (!Ext.empty()) {
-OS << "  if (S.getPreprocessor().isMacroDefined(\"" << Ext
-   << "\")) {\n  ";
+  if (!Exts.empty()) {
+EmitMacroChecks(OS, Exts);
   }
   OS << "  TypeList.push_back("
  << T->getValueAsDef("QTExpr")->getValueAsString("TypeExpr") << ");\n";
-  if (!Ext.empty()) {
+  if (!Exts.empty()) {
 OS << "  }\n";
   }
 }
@@ -877,15 +890,14 @@
 // Emit the cases for non generic, non image types.
 OS << "case OCLT_" << T->getValueAsString("Name") << ":\n";
 
-StringRef Ext = T->getValueAsDef("Extension")->getValueAsString("ExtName");
-// If this type depends on an extension, ensure the extension macro is
+StringRef Exts = T->getValueAsDef("Extension")->getValueAsString("ExtName");
+// If this type depends on an extension, ensure the extension macros are
 // defined.
-if (!Ext.empty()) {
-  OS << "  if (S.getPreprocessor().isMacroDefined(\"" << Ext
- << "\")) {\n  ";
+if (!Exts.empty()) {
+  EmitMacroChecks(OS, Exts);
 }
 OS << "  QT.push_back(" << QT->getValueAsString("TypeExpr") << ");\n";
-if (!Ext.empty()) {
+if (!Exts.empty()) {
   OS << "  }\n";
 }
 OS << "  break;\n";
Index: clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
===
--- clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
+++ clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
@@ -163,6 +163,25 @@
 }
 #endif // !defined(NO_HEADER) && __OPENCL_C_VERSION__ >= 200
 
+#if !defined(NO_HEADER) && __OPENCL_C_VERSION__ == 200 && defined(__opencl_c_generic_address_space)
+
+// Test that overloads that use atomic_double are not available when the fp64
+// extension is disabled.  Test this by counting the number of notes about
+// candidate functions.
+void test_atomic_double_reporting(volatile __generic atomic_int *a) {
+  atomic_init(a);
+  // expected-error@-1{{no matching function for call to 'atomic_init'}}
+#if defined(NO_FP64)
+  // Expecting 5 candidates: int, uint, long, ulong, float
+  // expected-note@-4 5 {{candidate function not viable: requires 2 arguments, but 1 was provided}}
+#else
+  // Expecting 6 candidates: int, uint, long, ulong, float, double
+  // expected-note@-7 6 {{candidate function not viable: requires 2 arguments, but 1 was provided}}
+#endif
+}
+
+#endif
+
 #if defined(NO_ATOMSCOPE) && __OPENCL_C_VERSION__ >= 300
 // Disable the feature by undefining the feature macro.
 #undef __opencl_c_atomic_scope_device
Index: clang/lib/Sema/OpenCLBuiltins.td
===
--- clang/lib/Sema/OpenCLBuiltins.td
+++ clang/lib/Sema/OpenCLBuiltins.td
@@ -78,6 +78,8 @@
 def NoTypeExt   : TypeExtension<"">;
 def Fp16TypeExt : TypeExtension<"cl_khr_fp16">;
 def Fp64TypeExt : TypeExtension<"cl_khr_fp64">;
+def Atomic64TypeExt : TypeExtension<"cl_khr_int64_base_atomics cl_khr_int64_extended_atomics">;
+def AtomicFp64TypeExt : TypeExtension<"cl_khr_int64_base_atomics cl_khr_int64_extended_atomics 

[PATCH] D119719: [Docs][OpenCL] Update OpenCL 3.0 status

2022-02-15 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh accepted this revision.
svenvh added a comment.
This revision is now accepted and ready to land.

LGTM, just suggesting a minor textual improvement that can be made at commit 
time.




Comment at: clang/docs/UsersManual.rst:3063
 
-There is ongoing support for OpenCL v3.0 that is documented along with other
-experimental functionality and features in development on :doc:`OpenCLSupport`
-page.
+OpenCL v3.0 support is complete but it remains in experimental state, see more
+details about the experimental features in :doc:`OpenCLSupport` page.

```
OpenCL v3.0 support is complete but it remains in experimental state.
More details about the experimental features are described in 
:doc:`OpenCLSupport`.
```


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119719/new/

https://reviews.llvm.org/D119719



[PATCH] D119713: [Docs] Release 14 notes for SPIR-V in clang

2022-02-15 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh accepted this revision.
svenvh added a comment.
This revision is now accepted and ready to land.

LGTM


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119713/new/

https://reviews.llvm.org/D119713



[PATCH] D119560: [OpenCL] opencl-c.h: remove arg names from atomics

2022-02-15 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

In D119560#3322531, @Anastasia wrote:

>> also makes the header no longer "claim" the identifiers "success",
>> "failure", "desired", "value" (such that you can compile with -Dvalue=...
>> when including the header for example, which currently breaks parsing
>> of the header).
>
> I don't get what you mean by this. :)

Compiling a CL source file with e.g. `clang -cl-std=CL2.0 -Xclang 
-finclude-default-header -cl-no-stdinc -Dvalue=1 
clang/test/CodeGenOpenCL/as_type.cl` gives lots of errors such as the 
following, because defining `value` as a macro (which is not a reserved 
identifier as far as I'm aware) collides with the argument names in the header:

  In file included from <built-in>:1:
  lib/clang/15.0.0/include/opencl-c.h:13277:58: error: expected ')'
  void __ovld atomic_init(volatile atomic_int *object, int value);
   ^
  <command line>:1:15: note: expanded from here
  #define value 1
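
The same collision can be reproduced outside the OpenCL header. The sketch
below is plain C++ with a hypothetical function name (`init_unnamed`); it only
shows why a declaration without parameter names is unaffected by a
user-defined `value` macro.

```cpp
#include <cassert>

// A user-defined macro, as if the source were compiled with -Dvalue=1.
#define value 1

// With a named parameter the macro would expand inside the declaration:
//   void init_named(volatile int *object, int value);  // becomes `int 1)`
// Leaving the parameters unnamed avoids the clash.
void init_unnamed(volatile int *, int);

// The definition is free to pick names that the user cannot collide with
// by accident; `v` is just an illustrative choice.
void init_unnamed(volatile int *object, int v) { *object = v; }
```

Calling `init_unnamed(&x, value)` still works, because at the call site the
macro expands to an ordinary argument.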



>> This is a big patch and it only touches the OpenCL 2 atomics for now. I
>> wonder if we should remove argument names from the other builtins too?
>> I think it would help unifying the header and tablegen approaches: if we
>> gradually move the header into some canonical form, we might be able
>> to eventually replace it with a tablegen-ed header, while being able to
>> easily confirm equivalence.
>
> The only drawback I see is that we will lose the history a bit in "git blame" 
> but:

Slight nuance: we will not lose any history, but I understand your concern: 
someone needs to look through this commit to see the previous commit that 
touched the code.

If there are no objections to removing all argument names from the header, I'll 
try to prepare patches for doing so.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119560/new/

https://reviews.llvm.org/D119560



[PATCH] D119710: [Docs][OpenCL] Release 14 notes

2022-02-14 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added inline comments.



Comment at: clang/docs/ReleaseNotes.rst:262
+- Added parsing support for optionality of device side enqueue and blocks (not
+  fully incomplete yet!).
+- Added missing support for optionality of various builtin functions:

incomplete -> complete



Comment at: clang/docs/ReleaseNotes.rst:277-279
+- Fix address space for temporaries (to be ``__private``).
+- Disallows static kernel functions.
+- Fix implicit definition of ``__cpp_threadsafe_static_init`` macro.

Nit: it would be better to use the same tense everywhere (fixed, added, ...).


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119710/new/

https://reviews.llvm.org/D119710



[PATCH] D118605: [OpenCL] Add support of language builtins for OpenCL C 3.0

2022-02-11 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

In D118605#3313990, @Anastasia wrote:

> In D118605#3313859, @azabaznov wrote:
>
>>> There are tests checking for this (e.g. clang/test/Frontend/opencl.cl), so 
>>> we need this check to preserve the existing behavior indeed.
>>
>> Thanks. The other test is `SemaOpenCL/clang-builtin-version.cl`.
>>
>>> But it might be worth asking someone outside of the OpenCL community 
>>> whether it's desirable to use the LanguageID enum in this way.
>>
>> I personally think this looks good now, for OpenCL in particular, as it 
>> became version-agnostic (except for DSE). But we still are querying language 
>> options only, and we expect language options for generic AS, pipes and DSE 
>> to be immutable at this point.
>
> Note that this `LanguageID` is intended for Builtins use because there are 
> other `LanguageID`s used elsewhere.

Fair enough.  Just in case someone disagrees, there's always the option of 
giving the builtins a reserved name and providing a macro in `opencl-c-base.h` 
that maps the real name to the builtin, conditionalized on feature macros.  But 
macros have the drawback of less beautiful diagnostics, so if nobody objects we 
should keep what we have done currently I think.
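
To make the alternative concrete, here is a minimal plain C++ sketch of the
macro-forwarding idea. All names are hypothetical (`HYP_FEATURE_GENERIC_AS`,
`hyp_reserved_to_global`, `hyp_to_global`); an ordinary inline function stands
in for the compiler builtin, and nothing here is taken from `opencl-c-base.h`.

```cpp
#include <cassert>

// Hypothetical feature macro, playing the role of an OpenCL C 3.0 feature
// macro such as __opencl_c_generic_address_space.
#define HYP_FEATURE_GENERIC_AS 1

// Stand-in for a builtin that the compiler registers under a reserved name.
inline int hyp_reserved_to_global(int x) { return x; }

// Expose the user-facing name only when the feature is available.
#if defined(HYP_FEATURE_GENERIC_AS)
#define hyp_to_global(x) hyp_reserved_to_global(x)
#endif
```

With this scheme an error at a call site would typically be reported against
the expansion `hyp_reserved_to_global` rather than the name the user wrote,
which is the diagnostics drawback mentioned above.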


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118605/new/

https://reviews.llvm.org/D118605



[PATCH] D119420: [OpenCL] Add OpenCL 3.0 atomics to -fdeclare-opencl-builtins

2022-02-11 Thread Sven van Haastregt via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG50f8abb9f40a: [OpenCL] Add OpenCL 3.0 atomics to 
-fdeclare-opencl-builtins (authored by svenvh).

Changed prior to commit:
  https://reviews.llvm.org/D119420?vs=407432&id=407813#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119420/new/

https://reviews.llvm.org/D119420

Files:
  clang/lib/Headers/opencl-c-base.h
  clang/lib/Sema/OpenCLBuiltins.td
  clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl

Index: clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
===
--- clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
+++ clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
@@ -9,6 +9,7 @@
 // RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CLC++ -fdeclare-opencl-builtins -finclude-default-header
 // RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CLC++2021 -fdeclare-opencl-builtins -finclude-default-header
 // RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CL2.0 -fdeclare-opencl-builtins -finclude-default-header -cl-ext=-cl_khr_fp64 -DNO_FP64
+// RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CL3.0 -fdeclare-opencl-builtins -finclude-default-header -DNO_ATOMSCOPE
 
 // Test the -fdeclare-opencl-builtins option.  This is not a completeness
 // test, so it should not test for all builtins defined by OpenCL.  Instead
@@ -80,6 +81,11 @@
 #define __opencl_c_read_write_images 1
 #endif
 
+#if (__OPENCL_CPP_VERSION__ == 100 || __OPENCL_C_VERSION__ == 200)
+#define __opencl_c_atomic_order_seq_cst 1
+#define __opencl_c_atomic_scope_device 1
+#endif
+
 #define __opencl_c_named_address_space_builtins 1
 #endif
 
@@ -98,6 +104,7 @@
 #if !defined(NO_HEADER) && (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
 kernel void test_enum_args(volatile global atomic_int *global_p, global int *expected) {
   int desired;
+  atomic_work_item_fence(CLK_GLOBAL_MEM_FENCE, memory_order_acq_rel, memory_scope_device);
   atomic_compare_exchange_strong_explicit(global_p, expected, desired,
   memory_order_acq_rel,
   memory_order_relaxed,
@@ -156,6 +163,27 @@
 }
 #endif // !defined(NO_HEADER) && __OPENCL_C_VERSION__ >= 200
 
+#if defined(NO_ATOMSCOPE) && __OPENCL_C_VERSION__ >= 300
+// Disable the feature by undefining the feature macro.
+#undef __opencl_c_atomic_scope_device
+
+// Test that only the overload with explicit order and scope arguments is
+// available when the __opencl_c_atomic_scope_device feature is disabled.
+void test_atomics_without_scope_device(volatile __generic atomic_int *a_int) {
+  int d;
+
+  atomic_exchange(a_int, d);
+  // expected-error@-1{{implicit declaration of function 'atomic_exchange' is invalid in OpenCL}}
+
+  atomic_exchange_explicit(a_int, d, memory_order_seq_cst);
+  // expected-error@-1{{no matching function for call to 'atomic_exchange_explicit'}}
+  // expected-note@-2 + {{candidate function not viable}}
+
+  atomic_exchange_explicit(a_int, d, memory_order_seq_cst, memory_scope_work_group);
+}
+
+#endif
+
 // Test old atomic overloaded with generic address space in C++ for OpenCL.
 #if __OPENCL_C_VERSION__ >= 200
 void test_legacy_atomics_cpp(__generic volatile unsigned int *a) {
Index: clang/lib/Sema/OpenCLBuiltins.td
===
--- clang/lib/Sema/OpenCLBuiltins.td
+++ clang/lib/Sema/OpenCLBuiltins.td
@@ -57,6 +57,23 @@
 // disabled.
 class TypeExtension<string _Ext> : AbstractExtension<_Ext>;
 
+// Concatenate zero or more space-separated extensions in NewExts to Base and
+// return the resulting FunctionExtension in ret.
+class concatExtension<FunctionExtension Base, string NewExts> {
+  FunctionExtension ret = FunctionExtension<
+!cond(
+  // Return Base extension if NewExts is empty,
+  !empty(NewExts) : Base.ExtName,
+
+  // otherwise, return NewExts if Base extension is empty,
+  !empty(Base.ExtName) : NewExts,
+
+  // otherwise, concatenate NewExts to Base.
+  true : Base.ExtName # " " # NewExts
+)
+  >;
+}
+
 // TypeExtension definitions.
 def NoTypeExt   : TypeExtension<"">;
 def Fp16TypeExt : TypeExtension<"cl_khr_fp16">;
@@ -1043,40 +1060,57 @@
 // OpenCL v2.0 s6.13.11 - Atomic Functions.
 
 // An atomic builtin with 2 additional _explicit variants.
-multiclass BuiltinAtomicExplicit Types> {
+multiclass BuiltinAtomicExplicit Types, FunctionExtension BaseExt> {
   // Without explicit MemoryOrder or MemoryScope.
-  def : Builtin;
+  let Extension = concatExtension.ret in {
+def : Builtin;
+  }
 
   // With an explicit MemoryOrder argument.
-  def : Builtin;
+  let Extension = concatExtension.ret in {
+def : Builtin;
+  }
 
   // With explicit MemoryOrder and MemoryScope 

[PATCH] D119398: [OpenCL] Guard atomic_double with cl_khr_int64_base_atomics and cl_khr_int64_extended_atomics

2022-02-10 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh accepted this revision.
svenvh added a comment.

Thanks, LGTM!  I'll try to follow up with the .td changes soon.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119398/new/

https://reviews.llvm.org/D119398



[PATCH] D119398: [OpenCL] Guard atomic_double with cl_khr_int64_base_atomics and cl_khr_int64_extended_atomics

2022-02-10 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

In D119398#3310746, @Anastasia wrote:

> This might interfere with https://reviews.llvm.org/D119420

Yes it will conflict.

Atomic doubles are not guarded properly for other builtins (outside of the 
`cl_ext_float_atomics` extensions) either, and I have a different solution in 
mind to solve it for all uses and extensions.  So we could leave out the .td 
changes from this patch and only merge the `opencl-c.h` changes.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119398/new/

https://reviews.llvm.org/D119398



[PATCH] D119420: [OpenCL] Add OpenCL 3.0 atomics to -fdeclare-opencl-builtins

2022-02-10 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh created this revision.
svenvh added reviewers: Anastasia, haonanya.
svenvh added a project: clang.
Herald added subscribers: Naghasan, ldrumm, yaxunl.
svenvh requested review of this revision.
Herald added a subscriber: cfe-commits.

Add the atomic overloads for the `global` and `local` address spaces,
which are new in OpenCL 3.0.  Ensure the preexisting `generic`
overloads are guarded by the generic address space feature macro.

Ensure a subset of the atomic builtins are guarded by the
`__opencl_c_atomic_order_seq_cst` and `__opencl_c_atomic_scope_device`
feature macros, and enable those macros for SPIR/SPIR-V targets in
`opencl-c-base.h`.

Also guard the `cl_ext_float_atomics` builtins with the atomic order
and scope feature macros.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D119420

Files:
  clang/lib/Headers/opencl-c-base.h
  clang/lib/Sema/OpenCLBuiltins.td
  clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl

Index: clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
===
--- clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
+++ clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
@@ -9,6 +9,7 @@
 // RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CLC++ -fdeclare-opencl-builtins -finclude-default-header
 // RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CLC++2021 -fdeclare-opencl-builtins -finclude-default-header
 // RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CL2.0 -fdeclare-opencl-builtins -finclude-default-header -cl-ext=-cl_khr_fp64 -DNO_FP64
+// RUN: %clang_cc1 %s -triple spir -verify -pedantic -Wconversion -Werror -fsyntax-only -cl-std=CL3.0 -fdeclare-opencl-builtins -finclude-default-header -DNO_ATOMSCOPE
 
 // Test the -fdeclare-opencl-builtins option.  This is not a completeness
 // test, so it should not test for all builtins defined by OpenCL.  Instead
@@ -80,6 +81,11 @@
 #define __opencl_c_read_write_images 1
 #endif
 
+#if (__OPENCL_CPP_VERSION__ == 100 || __OPENCL_C_VERSION__ == 200)
+#define __opencl_c_atomic_order_seq_cst 1
+#define __opencl_c_atomic_scope_device 1
+#endif
+
 #define __opencl_c_named_address_space_builtins 1
 #endif
 
@@ -98,6 +104,7 @@
 #if !defined(NO_HEADER) && (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
 kernel void test_enum_args(volatile global atomic_int *global_p, global int *expected) {
   int desired;
+  atomic_work_item_fence(CLK_GLOBAL_MEM_FENCE, memory_order_acq_rel, memory_scope_device);
   atomic_compare_exchange_strong_explicit(global_p, expected, desired,
   memory_order_acq_rel,
   memory_order_relaxed,
@@ -156,6 +163,27 @@
 }
 #endif // !defined(NO_HEADER) && __OPENCL_C_VERSION__ >= 200
 
+#if defined(NO_ATOMSCOPE) && __OPENCL_C_VERSION__ >= 300
+// Disable the feature by undefining the feature macro.
+#undef __opencl_c_atomic_scope_device
+
+// Test that only the overload with explicit order and scope arguments is
+// available when the __opencl_c_atomic_scope_device feature is disabled.
+void test_atomics_without_scope_device(volatile __generic atomic_int *a_int) {
+  int d;
+
+  atomic_exchange(a_int, d);
+  // expected-error@-1{{implicit declaration of function 'atomic_exchange' is invalid in OpenCL}}
+
+  atomic_exchange_explicit(a_int, d, memory_order_seq_cst);
+  // expected-error@-1{{no matching function for call to 'atomic_exchange_explicit'}}
+  // expected-note@-2 + {{candidate function not viable}}
+
+  atomic_exchange_explicit(a_int, d, memory_order_seq_cst, memory_scope_work_group);
+}
+
+#endif
+
 // Test old atomic overloaded with generic address space in C++ for OpenCL.
 #if __OPENCL_C_VERSION__ >= 200
 void test_legacy_atomics_cpp(__generic volatile unsigned int *a) {
Index: clang/lib/Sema/OpenCLBuiltins.td
===
--- clang/lib/Sema/OpenCLBuiltins.td
+++ clang/lib/Sema/OpenCLBuiltins.td
@@ -57,6 +57,23 @@
 // disabled.
 class TypeExtension<string _Ext> : AbstractExtension<_Ext>;
 
+// Concatenate zero or more space-separated extensions in NewExts to Base and
+// return the resulting FunctionExtension in ret.
+class concatExtension<FunctionExtension Base, string NewExts> {
+  FunctionExtension ret = FunctionExtension<
+!cond(
+  // Return Base extension if NewExts is empty,
+  !empty(NewExts) : Base.ExtName,
+
+  // otherwise, return NewExts if Base extension is empty,
+  !empty(Base.ExtName) : NewExts,
+
+  // otherwise, concatenate NewExts to Base.
+  true : Base.ExtName # " " # NewExts
+)
+  >;
+}
+
 // TypeExtension definitions.
 def NoTypeExt   : TypeExtension<"">;
 def Fp16TypeExt : TypeExtension<"cl_khr_fp16">;
@@ -1043,40 +1060,57 @@
 // OpenCL v2.0 s6.13.11 - Atomic Functions.
 
 // An atomic builtin with 2 additional _explicit variants.
-multiclass 

[PATCH] D118605: [OpenCL] Add support of language builtins for OpenCL C 3.0

2022-02-09 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added inline comments.



Comment at: clang/include/clang/Basic/Builtins.def:88
 //   '__builtin_' prefix. It will be implemented in compiler-rt or libgcc.
+//  G -> this function uses generic address space (OpenCL).
+//  P -> this function uses pipes (OpenCL).

azabaznov wrote:
> Anastasia wrote:
> > azabaznov wrote:
> > > Anastasia wrote:
> > > > Anastasia wrote:
> > > > > It might be better to avoid adding such limited language-specific 
> > > > > functionality into generic representation of Builtins. Do you think 
> > > > > we could just introduce specific language modes, say:
> > > > > 
> > > > > `OCLC_PIPES`
> > > > > `OCLC_DSE`
> > > > > `OCLC_GAS`
> > > > > 
> > > > > and then check against those in `builtinIsSupported`?
> > > > Btw another approach could be to do something similar to 
> > > > `TARGET_BUILTIN` i.e. list features in the last parameter as strings. 
> > > > We could add a separate macro for such builtins and just reuse target 
> > > > Builtins flow. This might be a bit more scalable in case we would need 
> > > > to add more of such builtins later on?
> > > > 
> > > > It might be better to avoid adding such limited language-specific 
> > > > functionality into generic representation of Builtins.
> > > 
> > > Nice idea! Though I think LanguageID is not designed to be used this way, 
> > > it's used only to check against specific language version. So it seems 
> > > pretty invasive. Also, function attributes seem more natural to me to 
> > > specify the type of the function. I don't know for sure which way is 
> > > better...
> > > 
> > > > Btw another approach could be to do something similar to TARGET_BUILTIN 
> > > > i.e. list features in the last parameter as strings.
> > > 
> > > I'd prefer to not use TARGET_BUILTIN as it operates on target feature, 
> > > but OpenCL feature is some other concept in clang...
> > Builtins handling is pretty vital for clang so if we extend common 
> > functionality for just a few built-in functions it might not justify the 
> > overhead in terms of complexity, parsing time or space... so we would need 
> > to dive in those aspects more before finalizing the design... if we can 
> > avoid it we should try... and I feel in this case there might be some good 
> > ways to avoid it.
> > 
> > > Nice idea! Though I think LanguageID is not designed to be used this way, 
> > > it's used only to check against specific language version. So it seems 
> > > pretty invasive. Also, function attributes seem more natural to me to 
> > > specify the type of the function. I don't know for sure which way is 
> > > better...
> > 
> > I think this `LanguageID` is only used for the purposes of Builtins, so 
> > there should be no issue in evolving it differently. With the documentation 
> > and adequate naming we can resolve the confusions, if any.
> > 
> > The whole idea of language options in clang is that it is not dependent on 
> > the target. But we have violated this design already. The whole concept of 
> > OpenCL 3.0 language features that are target-specific is misaligned with 
> > the original design in clang.
> > 
> > > 
> > > Btw another approach could be to do something similar to 
> > > TARGET_BUILTIN i.e. list features in the last parameter as strings.
> > > 
> > > I'd prefer to not use TARGET_BUILTIN as it operates on target feature, 
> > > but OpenCL feature is some other concept in clang...
> > 
> > But we also have target features mirroring these, right? So I see no reason 
> > not to reuse what we already have... instead of adding another way to do 
> > the same or very similar thing...
> > 
> > We could also consider extending the functionality slightly to use language 
> > features instead however I can't see the immediate benefit at this point... 
> > other than it might be useful in the future... but we can't know for sure.
> > Builtins handling is pretty vital for clang so if we extend common 
> > functionality for just a few built-in functions it might not justify the 
> > overhead in terms of complexity, parsing time or space... so we would need 
> > to dive in those aspects more before finalizing the design... if we can 
> > avoid it we should try... and I feel in this case there might be some good 
> > ways to avoid it.
> > 
> > > Nice idea! Though I think LanguageID is not designed to be used this way, 
> > > it's used only to check against specific language version. So it seems 
> > > pretty invasive. Also, function attributes seem more natural to me to 
> > > specify the type of the function. I don't know for sure which way is 
> > > better...
> > 
> > I think this LanguageID is only used for the purposes of Builtins, so there 
> > should be no issue in evolving it differently. With the documentation and 
> > adequate naming we can resolve the confusions, if any.
> 
> So yeah, I think reusing LanguageID is pretty doable and sounds like a good 
> idea.
> 
> 
> > The whole idea of language options 

[PATCH] D119011: [clang] Cache OpenCL types

2022-02-07 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh accepted this revision.
svenvh added a comment.
This revision is now accepted and ready to land.

LGTM, thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119011/new/

https://reviews.llvm.org/D119011



[PATCH] D107769: [OpenCL] Make generic addrspace optional for -fdeclare-opencl-builtins

2022-01-31 Thread Sven van Haastregt via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG8e6099291dcb: [OpenCL] Make generic addrspace optional for 
-fdeclare-opencl-builtins (authored by svenvh).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107769/new/

https://reviews.llvm.org/D107769

Files:
  clang/lib/Sema/OpenCLBuiltins.td
  clang/test/CodeGenOpenCL/fdeclare-opencl-builtins.cl
  clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl

Index: clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
===
--- clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
+++ clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
@@ -70,12 +70,15 @@
 
 // Enable extensions that are enabled in opencl-c-base.h.
 #if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#define __opencl_c_generic_address_space 1
 #define cl_khr_subgroup_extended_types 1
 #define cl_khr_subgroup_ballot 1
 #define cl_khr_subgroup_non_uniform_arithmetic 1
 #define cl_khr_subgroup_clustered_reduce 1
 #define __opencl_c_read_write_images 1
 #endif
+
+#define __opencl_c_named_address_space_builtins 1
 #endif
 
 kernel void test_pointers(volatile global void *global_p, global const int4 *a) {
Index: clang/test/CodeGenOpenCL/fdeclare-opencl-builtins.cl
===
--- clang/test/CodeGenOpenCL/fdeclare-opencl-builtins.cl
+++ clang/test/CodeGenOpenCL/fdeclare-opencl-builtins.cl
@@ -1,5 +1,12 @@
-// RUN: %clang_cc1 -emit-llvm -o - -O0 -triple spir-unknown-unknown -cl-std=CL1.2 -finclude-default-header %s | FileCheck %s
-// RUN: %clang_cc1 -emit-llvm -o - -O0 -triple spir-unknown-unknown -cl-std=CL1.2 -fdeclare-opencl-builtins -finclude-default-header %s | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm -o - -O0 -triple spir-unknown-unknown -cl-std=CL1.2 -finclude-default-header %s \
+// RUN: | FileCheck %s --check-prefixes CHECK,CHECK-NOGAS
+// RUN: %clang_cc1 -emit-llvm -o - -O0 -triple spir-unknown-unknown -cl-std=CL1.2 -fdeclare-opencl-builtins -finclude-default-header %s \
+// RUN: | FileCheck %s --check-prefixes CHECK,CHECK-NOGAS
+// RUN: %clang_cc1 -emit-llvm -o - -O0 -triple spir-unknown-unknown -cl-std=CL3.0 -fdeclare-opencl-builtins -finclude-default-header %s \
+// RUN: | FileCheck %s --check-prefixes CHECK,CHECK-GAS
+// RUN: %clang_cc1 -emit-llvm -o - -O0 -triple spir-unknown-unknown -cl-std=CL3.0 -fdeclare-opencl-builtins -finclude-default-header \
+// RUN: -cl-ext=-__opencl_c_generic_address_space,-__opencl_c_pipes,-__opencl_c_device_enqueue %s \
+// RUN: | FileCheck %s --check-prefixes CHECK,CHECK-NOGAS
 
 // Test that mix is correctly defined.
 // CHECK-LABEL: @test_float
@@ -32,6 +39,15 @@
   size_t lid = get_local_id(0);
 }
 
+// Test that the correct builtin is called depending on the generic address
+// space feature availability.
+// CHECK-LABEL: @test_generic_optionality
+// CHECK-GAS: call spir_func float @_Z5fractfPU3AS4f
+// CHECK-NOGAS: call spir_func float @_Z5fractfPf
+void test_generic_optionality(float a, float *b) {
+  float res = fract(a, b);
+}
+
 // CHECK: attributes [[ATTR_CONST]] =
 // CHECK-SAME: readnone
 // CHECK: attributes [[ATTR_PURE]] =
Index: clang/lib/Sema/OpenCLBuiltins.td
===
--- clang/lib/Sema/OpenCLBuiltins.td
+++ clang/lib/Sema/OpenCLBuiltins.td
@@ -85,6 +85,8 @@
 def FuncExtKhrGlMsaaSharing  : FunctionExtension<"cl_khr_gl_msaa_sharing">;
 def FuncExtKhrGlMsaaSharingReadWrite : FunctionExtension<"cl_khr_gl_msaa_sharing __opencl_c_read_write_images">;
 
+def FuncExtOpenCLCGenericAddressSpace: FunctionExtension<"__opencl_c_generic_address_space">;
+def FuncExtOpenCLCNamedAddressSpaceBuiltins : FunctionExtension<"__opencl_c_named_address_space_builtins">;
 def FuncExtOpenCLCPipes  : FunctionExtension<"__opencl_c_pipes">;
 def FuncExtOpenCLCWGCollectiveFunctions  : FunctionExtension<"__opencl_c_work_group_collective_functions">;
 def FuncExtOpenCLCReadWriteImages: FunctionExtension<"__opencl_c_read_write_images">;
@@ -591,10 +593,10 @@
   }
 }
 
-let MaxVersion = CL20 in {
+let Extension = FuncExtOpenCLCNamedAddressSpaceBuiltins in {
   defm : MathWithPointer<[GlobalAS, LocalAS, PrivateAS]>;
 }
-let MinVersion = CL20 in {
+let Extension = FuncExtOpenCLCGenericAddressSpace in {
   defm : MathWithPointer<[GenericAS]>;
 }
 
@@ -840,10 +842,10 @@
   }
 }
 
-let MaxVersion = CL20 in {
+let Extension = FuncExtOpenCLCNamedAddressSpaceBuiltins in {
   defm : VloadVstore<[GlobalAS, LocalAS, PrivateAS], 1>;
 }
-let MinVersion = CL20 in {
+let Extension = FuncExtOpenCLCGenericAddressSpace in {
   defm : VloadVstore<[GenericAS], 1>;
 }
 // vload with constant address space is available regardless of version.
@@ -874,10 +876,10 @@
   }
 }
 
-let MaxVersion = CL20 in {
+let Extension = 

[PATCH] D107769: [OpenCL] Make generic addrspace optional for -fdeclare-opencl-builtins

2022-01-31 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added inline comments.



Comment at: clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl:73
 #if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#define __opencl_c_generic_address_space 1
 #define cl_khr_subgroup_extended_types 1

Anastasia wrote:
> btw this is not correct for C++ for OpenCL 2021 but we are not testing this 
> with C++ for OpenCL 2021 which we should.
> 
> However it doesn't belong to this patch, but would you be able to add a FIXME 
> here to indicate the issue?
I'll add C++ for OpenCL 2021 (and CL3.0) testing in a separate commit.

As discussed offline, we are likely to drop headerless testing from here 
soonish, since we're relying more and more on opencl-c-base.h for 
`-fdeclare-opencl-builtins`.  We essentially have to redefine all the feature 
macros in the test too, when they're already in the header that we're 
deliberately excluding.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107769/new/

https://reviews.llvm.org/D107769



[PATCH] D118158: [OpenCL] opencl-c.h: refactor named addrspace builtins

2022-01-28 Thread Sven van Haastregt via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGbfd8210f6f47: [OpenCL] opencl-c.h: refactor named addrspace 
builtins (authored by svenvh).

Changed prior to commit:
  https://reviews.llvm.org/D118158?vs=402939&id=403937#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118158/new/

https://reviews.llvm.org/D118158

Files:
  clang/lib/Headers/opencl-c-base.h
  clang/lib/Headers/opencl-c.h

Index: clang/lib/Headers/opencl-c.h
===
--- clang/lib/Headers/opencl-c.h
+++ clang/lib/Headers/opencl-c.h
@@ -7285,7 +7285,9 @@
 half8 __ovld fract(half8 x, half8 *iptr);
 half16 __ovld fract(half16 x, half16 *iptr);
 #endif //cl_khr_fp16
-#else
+#endif //defined(__opencl_c_generic_address_space)
+
+#if defined(__opencl_c_named_address_space_builtins)
 float __ovld fract(float x, __global float *iptr);
 float2 __ovld fract(float2 x, __global float2 *iptr);
 float3 __ovld fract(float3 x, __global float3 *iptr);
@@ -7344,7 +7346,7 @@
 half8 __ovld fract(half8 x, __private half8 *iptr);
 half16 __ovld fract(half16 x, __private half16 *iptr);
 #endif //cl_khr_fp16
-#endif //defined(__opencl_c_generic_address_space)
+#endif //defined(__opencl_c_named_address_space_builtins)
 
 /**
  * Extract mantissa and exponent from x. For each
@@ -7375,7 +7377,9 @@
 half8 __ovld frexp(half8 x, int8 *exp);
 half16 __ovld frexp(half16 x, int16 *exp);
 #endif //cl_khr_fp16
-#else
+#endif //defined(__opencl_c_generic_address_space)
+
+#if defined(__opencl_c_named_address_space_builtins)
 float __ovld frexp(float x, __global int *exp);
 float2 __ovld frexp(float2 x, __global int2 *exp);
 float3 __ovld frexp(float3 x, __global int3 *exp);
@@ -7434,7 +7438,7 @@
 half8 __ovld frexp(half8 x, __private int8 *exp);
 half16 __ovld frexp(half16 x, __private int16 *exp);
 #endif //cl_khr_fp16
-#endif //defined(__opencl_c_generic_address_space)
+#endif //defined(__opencl_c_named_address_space_builtins)
 
 /**
  * Compute the value of the square root of x^2 + y^2
@@ -7582,7 +7586,9 @@
 half8 __ovld lgamma_r(half8 x, int8 *signp);
 half16 __ovld lgamma_r(half16 x, int16 *signp);
 #endif //cl_khr_fp16
-#else
+#endif //defined(__opencl_c_generic_address_space)
+
+#if defined(__opencl_c_named_address_space_builtins)
 float __ovld lgamma_r(float x, __global int *signp);
 float2 __ovld lgamma_r(float2 x, __global int2 *signp);
 float3 __ovld lgamma_r(float3 x, __global int3 *signp);
@@ -7641,7 +7647,7 @@
 half8 __ovld lgamma_r(half8 x, __private int8 *signp);
 half16 __ovld lgamma_r(half16 x, __private int16 *signp);
 #endif //cl_khr_fp16
-#endif //defined(__opencl_c_generic_address_space)
+#endif //defined(__opencl_c_named_address_space_builtins)
 
 /**
  * Compute natural logarithm.
@@ -7888,7 +7894,9 @@
 half8 __ovld modf(half8 x, half8 *iptr);
 half16 __ovld modf(half16 x, half16 *iptr);
 #endif //cl_khr_fp16
-#else
+#endif //defined(__opencl_c_generic_address_space)
+
+#if defined(__opencl_c_named_address_space_builtins)
 float __ovld modf(float x, __global float *iptr);
 float2 __ovld modf(float2 x, __global float2 *iptr);
 float3 __ovld modf(float3 x, __global float3 *iptr);
@@ -7947,7 +7955,7 @@
 half8 __ovld modf(half8 x, __private half8 *iptr);
 half16 __ovld modf(half16 x, __private half16 *iptr);
 #endif //cl_khr_fp16
-#endif //defined(__opencl_c_generic_address_space)
+#endif //defined(__opencl_c_named_address_space_builtins)
 
 /**
  * Returns a quiet NaN. The nancode may be placed
@@ -8147,9 +8155,10 @@
 half4 __ovld remquo(half4 x, half4 y, int4 *quo);
 half8 __ovld remquo(half8 x, half8 y, int8 *quo);
 half16 __ovld remquo(half16 x, half16 y, int16 *quo);
-
 #endif //cl_khr_fp16
-#else
+#endif //defined(__opencl_c_generic_address_space)
+
+#if defined(__opencl_c_named_address_space_builtins)
 float __ovld remquo(float x, float y, __global int *quo);
 float2 __ovld remquo(float2 x, float2 y, __global int2 *quo);
 float3 __ovld remquo(float3 x, float3 y, __global int3 *quo);
@@ -8208,7 +8217,7 @@
 half8 __ovld remquo(half8 x, half8 y, __private int8 *quo);
 half16 __ovld remquo(half16 x, half16 y, __private int16 *quo);
 #endif //cl_khr_fp16
-#endif //defined(__opencl_c_generic_address_space)
+#endif //defined(__opencl_c_named_address_space_builtins)
 /**
  * Round to integral value (using round to nearest
  * even rounding mode) in floating-point format.
@@ -8372,7 +8381,9 @@
 half8 __ovld sincos(half8 x, half8 *cosval);
 half16 __ovld sincos(half16 x, half16 *cosval);
 #endif //cl_khr_fp16
-#else
+#endif //defined(__opencl_c_generic_address_space)
+
+#if defined(__opencl_c_named_address_space_builtins)
 float __ovld sincos(float x, __global float *cosval);
 float2 __ovld sincos(float2 x, __global float2 *cosval);
 float3 __ovld sincos(float3 x, __global float3 *cosval);
@@ -8431,7 +8442,7 @@
 half8 __ovld sincos(half8 x, __private half8 

[PATCH] D107769: [OpenCL] Make generic addrspace optional for -fdeclare-opencl-builtins

2022-01-27 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh updated this revision to Diff 403705.
svenvh edited the summary of this revision.
svenvh added a comment.

Make use of the `__opencl_c_named_address_space_builtins` internal feature 
added by D118158.  This should avoid affecting OpenCL 2.0.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107769/new/

https://reviews.llvm.org/D107769

Files:
  clang/lib/Sema/OpenCLBuiltins.td
  clang/test/CodeGenOpenCL/fdeclare-opencl-builtins.cl
  clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl

Index: clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
===
--- clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
+++ clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
@@ -70,12 +70,15 @@
 
 // Enable extensions that are enabled in opencl-c-base.h.
 #if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#define __opencl_c_generic_address_space 1
 #define cl_khr_subgroup_extended_types 1
 #define cl_khr_subgroup_ballot 1
 #define cl_khr_subgroup_non_uniform_arithmetic 1
 #define cl_khr_subgroup_clustered_reduce 1
 #define __opencl_c_read_write_images 1
 #endif
+
+#define __opencl_c_named_address_space_builtins 1
 #endif
 
 kernel void test_pointers(volatile global void *global_p, global const int4 *a) {
Index: clang/test/CodeGenOpenCL/fdeclare-opencl-builtins.cl
===
--- clang/test/CodeGenOpenCL/fdeclare-opencl-builtins.cl
+++ clang/test/CodeGenOpenCL/fdeclare-opencl-builtins.cl
@@ -1,5 +1,12 @@
-// RUN: %clang_cc1 -emit-llvm -o - -O0 -triple spir-unknown-unknown -cl-std=CL1.2 -finclude-default-header %s | FileCheck %s
-// RUN: %clang_cc1 -emit-llvm -o - -O0 -triple spir-unknown-unknown -cl-std=CL1.2 -fdeclare-opencl-builtins -finclude-default-header %s | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm -o - -O0 -triple spir-unknown-unknown -cl-std=CL1.2 -finclude-default-header %s \
+// RUN: | FileCheck %s --check-prefixes CHECK,CHECK-NOGAS
+// RUN: %clang_cc1 -emit-llvm -o - -O0 -triple spir-unknown-unknown -cl-std=CL1.2 -fdeclare-opencl-builtins -finclude-default-header %s \
+// RUN: | FileCheck %s --check-prefixes CHECK,CHECK-NOGAS
+// RUN: %clang_cc1 -emit-llvm -o - -O0 -triple spir-unknown-unknown -cl-std=CL3.0 -fdeclare-opencl-builtins -finclude-default-header %s \
+// RUN: | FileCheck %s --check-prefixes CHECK,CHECK-GAS
+// RUN: %clang_cc1 -emit-llvm -o - -O0 -triple spir-unknown-unknown -cl-std=CL3.0 -fdeclare-opencl-builtins -finclude-default-header \
+// RUN: -cl-ext=-__opencl_c_generic_address_space,-__opencl_c_pipes,-__opencl_c_device_enqueue %s \
+// RUN: | FileCheck %s --check-prefixes CHECK,CHECK-NOGAS
 
 // Test that mix is correctly defined.
 // CHECK-LABEL: @test_float
@@ -32,6 +39,15 @@
   size_t lid = get_local_id(0);
 }
 
+// Test that the correct builtin is called depending on the generic address
+// space feature availability.
+// CHECK-LABEL: @test_generic_optionality
+// CHECK-GAS: call spir_func float @_Z5fractfPU3AS4f
+// CHECK-NOGAS: call spir_func float @_Z5fractfPf
+void test_generic_optionality(float a, float *b) {
+  float res = fract(a, b);
+}
+
 // CHECK: attributes [[ATTR_CONST]] =
 // CHECK-SAME: readnone
 // CHECK: attributes [[ATTR_PURE]] =
Index: clang/lib/Sema/OpenCLBuiltins.td
===
--- clang/lib/Sema/OpenCLBuiltins.td
+++ clang/lib/Sema/OpenCLBuiltins.td
@@ -85,6 +85,8 @@
 def FuncExtKhrGlMsaaSharing  : FunctionExtension<"cl_khr_gl_msaa_sharing">;
 def FuncExtKhrGlMsaaSharingReadWrite : FunctionExtension<"cl_khr_gl_msaa_sharing __opencl_c_read_write_images">;
 
+def FuncExtOpenCLCGenericAddressSpace: FunctionExtension<"__opencl_c_generic_address_space">;
+def FuncExtOpenCLCNamedAddressSpaceBuiltins : FunctionExtension<"__opencl_c_named_address_space_builtins">;
 def FuncExtOpenCLCPipes  : FunctionExtension<"__opencl_c_pipes">;
 def FuncExtOpenCLCWGCollectiveFunctions  : FunctionExtension<"__opencl_c_work_group_collective_functions">;
 def FuncExtOpenCLCReadWriteImages: FunctionExtension<"__opencl_c_read_write_images">;
@@ -591,10 +593,10 @@
   }
 }
 
-let MaxVersion = CL20 in {
+let Extension = FuncExtOpenCLCNamedAddressSpaceBuiltins in {
   defm : MathWithPointer<[GlobalAS, LocalAS, PrivateAS]>;
 }
-let MinVersion = CL20 in {
+let Extension = FuncExtOpenCLCGenericAddressSpace in {
   defm : MathWithPointer<[GenericAS]>;
 }
 
@@ -840,10 +842,10 @@
   }
 }
 
-let MaxVersion = CL20 in {
+let Extension = FuncExtOpenCLCNamedAddressSpaceBuiltins in {
   defm : VloadVstore<[GlobalAS, LocalAS, PrivateAS], 1>;
 }
-let MinVersion = CL20 in {
+let Extension = FuncExtOpenCLCGenericAddressSpace in {
   defm : VloadVstore<[GenericAS], 1>;
 }
 // vload with constant address space is available regardless of version.
@@ -874,10 +876,10 @@
   }
 }
 
-let MaxVersion = CL20 in {
+let Extension = 

[PATCH] D118158: [OpenCL] opencl-c.h: refactor named addrspace builtins

2022-01-25 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh created this revision.
svenvh added a reviewer: clang.
Herald added subscribers: Naghasan, ldrumm, Anastasia, yaxunl.
svenvh requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

The named address space overloads of builtins that take a pointer
argument are conditionalized on the `__opencl_c_generic_address_space`
feature macro (in a `#else` body).  Introduce an internal feature
macro instead, such that their availability can be controlled in a
single place and independently of the generic address space feature
macro.

This commit does not change the available builtins.
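
As a rough illustration of the guard restructuring described above, here is a minimal sketch in plain C. The `DEMO_*` macros and the three functions are hypothetical stand-ins and are not names from the patch; the sketch also assumes that the internal macro is defined exactly when the generic address space macro is absent, which is what the old `#else` arrangement implied.

```c
#include <assert.h>

/* Hypothetical stand-in for __opencl_c_generic_address_space; comment this
 * line out to model a target without generic address space support. */
#define DEMO_GENERIC_ADDRESS_SPACE 1

/* Single place that decides whether the named overloads are provided
 * (assumed here to mirror the behaviour of the old #else body). */
#if !defined(DEMO_GENERIC_ADDRESS_SPACE)
#define DEMO_NAMED_ADDRESS_SPACE_BUILTINS 1
#endif

/* Each overload group sits behind its own guard instead of #if/#else. */
static int generic_overloads_declared(void) {
#if defined(DEMO_GENERIC_ADDRESS_SPACE)
  return 1;
#else
  return 0;
#endif
}

static int named_overloads_declared(void) {
#if defined(DEMO_NAMED_ADDRESS_SPACE_BUILTINS)
  return 1;
#else
  return 0;
#endif
}

/* With the definition above exactly one group is visible, so the set of
 * available builtins is the same as with the #if/#else arrangement. */
static void check_exactly_one_group(void) {
  assert(generic_overloads_declared() + named_overloads_declared() == 1);
}
```

Because the two guards are now independent, a later change could enable both groups at once by altering only the central definition, without touching each builtin's declarations.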


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D118158

Files:
  clang/lib/Headers/opencl-c-base.h
  clang/lib/Headers/opencl-c.h

Index: clang/lib/Headers/opencl-c.h
===
--- clang/lib/Headers/opencl-c.h
+++ clang/lib/Headers/opencl-c.h
@@ -7285,7 +7285,9 @@
 half8 __ovld fract(half8 x, half8 *iptr);
 half16 __ovld fract(half16 x, half16 *iptr);
 #endif //cl_khr_fp16
-#else
+#endif //defined(__opencl_c_generic_address_space)
+
+#if defined(__opencl_c_named_addrsp_builtins)
 float __ovld fract(float x, __global float *iptr);
 float2 __ovld fract(float2 x, __global float2 *iptr);
 float3 __ovld fract(float3 x, __global float3 *iptr);
@@ -7344,7 +7346,7 @@
 half8 __ovld fract(half8 x, __private half8 *iptr);
 half16 __ovld fract(half16 x, __private half16 *iptr);
 #endif //cl_khr_fp16
-#endif //defined(__opencl_c_generic_address_space)
+#endif //defined(__opencl_c_named_addrsp_builtins)
 
 /**
  * Extract mantissa and exponent from x. For each
@@ -7375,7 +7377,9 @@
 half8 __ovld frexp(half8 x, int8 *exp);
 half16 __ovld frexp(half16 x, int16 *exp);
 #endif //cl_khr_fp16
-#else
+#endif //defined(__opencl_c_generic_address_space)
+
+#if defined(__opencl_c_named_addrsp_builtins)
 float __ovld frexp(float x, __global int *exp);
 float2 __ovld frexp(float2 x, __global int2 *exp);
 float3 __ovld frexp(float3 x, __global int3 *exp);
@@ -7434,7 +7438,7 @@
 half8 __ovld frexp(half8 x, __private int8 *exp);
 half16 __ovld frexp(half16 x, __private int16 *exp);
 #endif //cl_khr_fp16
-#endif //defined(__opencl_c_generic_address_space)
+#endif //defined(__opencl_c_named_addrsp_builtins)
 
 /**
  * Compute the value of the square root of x^2 + y^2
@@ -7582,7 +7586,9 @@
 half8 __ovld lgamma_r(half8 x, int8 *signp);
 half16 __ovld lgamma_r(half16 x, int16 *signp);
 #endif //cl_khr_fp16
-#else
+#endif //defined(__opencl_c_generic_address_space)
+
+#if defined(__opencl_c_named_addrsp_builtins)
 float __ovld lgamma_r(float x, __global int *signp);
 float2 __ovld lgamma_r(float2 x, __global int2 *signp);
 float3 __ovld lgamma_r(float3 x, __global int3 *signp);
@@ -7641,7 +7647,7 @@
 half8 __ovld lgamma_r(half8 x, __private int8 *signp);
 half16 __ovld lgamma_r(half16 x, __private int16 *signp);
 #endif //cl_khr_fp16
-#endif //defined(__opencl_c_generic_address_space)
+#endif //defined(__opencl_c_named_addrsp_builtins)
 
 /**
  * Compute natural logarithm.
@@ -7888,7 +7894,9 @@
 half8 __ovld modf(half8 x, half8 *iptr);
 half16 __ovld modf(half16 x, half16 *iptr);
 #endif //cl_khr_fp16
-#else
+#endif //defined(__opencl_c_generic_address_space)
+
+#if defined(__opencl_c_named_addrsp_builtins)
 float __ovld modf(float x, __global float *iptr);
 float2 __ovld modf(float2 x, __global float2 *iptr);
 float3 __ovld modf(float3 x, __global float3 *iptr);
@@ -7947,7 +7955,7 @@
 half8 __ovld modf(half8 x, __private half8 *iptr);
 half16 __ovld modf(half16 x, __private half16 *iptr);
 #endif //cl_khr_fp16
-#endif //defined(__opencl_c_generic_address_space)
+#endif //defined(__opencl_c_named_addrsp_builtins)
 
 /**
  * Returns a quiet NaN. The nancode may be placed
@@ -8147,9 +8155,10 @@
 half4 __ovld remquo(half4 x, half4 y, int4 *quo);
 half8 __ovld remquo(half8 x, half8 y, int8 *quo);
 half16 __ovld remquo(half16 x, half16 y, int16 *quo);
-
 #endif //cl_khr_fp16
-#else
+#endif //defined(__opencl_c_generic_address_space)
+
+#if defined(__opencl_c_named_addrsp_builtins)
 float __ovld remquo(float x, float y, __global int *quo);
 float2 __ovld remquo(float2 x, float2 y, __global int2 *quo);
 float3 __ovld remquo(float3 x, float3 y, __global int3 *quo);
@@ -8208,7 +8217,7 @@
 half8 __ovld remquo(half8 x, half8 y, __private int8 *quo);
 half16 __ovld remquo(half16 x, half16 y, __private int16 *quo);
 #endif //cl_khr_fp16
-#endif //defined(__opencl_c_generic_address_space)
+#endif //defined(__opencl_c_named_addrsp_builtins)
 /**
  * Round to integral value (using round to nearest
  * even rounding mode) in floating-point format.
@@ -8372,7 +8381,9 @@
 half8 __ovld sincos(half8 x, half8 *cosval);
 half16 __ovld sincos(half16 x, half16 *cosval);
 #endif //cl_khr_fp16
-#else
+#endif //defined(__opencl_c_generic_address_space)
+
+#if defined(__opencl_c_named_addrsp_builtins)
 float __ovld sincos(float x, __global float *cosval);
 float2 

[PATCH] D117899: [OpenCL] Make read_write images optional for -fdeclare-opencl-builtins

2022-01-25 Thread Sven van Haastregt via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG91a0b464a853: [OpenCL] Make read_write images optional for 
-fdeclare-opencl-builtins (authored by svenvh).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117899/new/

https://reviews.llvm.org/D117899

Files:
  clang/lib/Headers/opencl-c-base.h
  clang/lib/Sema/OpenCLBuiltins.td
  clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl

Index: clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
===
--- clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
+++ clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
@@ -74,6 +74,7 @@
 #define cl_khr_subgroup_ballot 1
 #define cl_khr_subgroup_non_uniform_arithmetic 1
 #define cl_khr_subgroup_clustered_reduce 1
+#define __opencl_c_read_write_images 1
 #endif
 #endif
 
Index: clang/lib/Sema/OpenCLBuiltins.td
===
--- clang/lib/Sema/OpenCLBuiltins.td
+++ clang/lib/Sema/OpenCLBuiltins.td
@@ -80,11 +80,14 @@
 def FuncExtKhrInt64BaseAtomics   : FunctionExtension<"cl_khr_int64_base_atomics">;
 def FuncExtKhrInt64ExtendedAtomics   : FunctionExtension<"cl_khr_int64_extended_atomics">;
 def FuncExtKhrMipmapImage: FunctionExtension<"cl_khr_mipmap_image">;
+def FuncExtKhrMipmapImageReadWrite   : FunctionExtension<"cl_khr_mipmap_image __opencl_c_read_write_images">;
 def FuncExtKhrMipmapImageWrites  : FunctionExtension<"cl_khr_mipmap_image_writes">;
 def FuncExtKhrGlMsaaSharing  : FunctionExtension<"cl_khr_gl_msaa_sharing">;
+def FuncExtKhrGlMsaaSharingReadWrite : FunctionExtension<"cl_khr_gl_msaa_sharing __opencl_c_read_write_images">;
 
 def FuncExtOpenCLCPipes  : FunctionExtension<"__opencl_c_pipes">;
 def FuncExtOpenCLCWGCollectiveFunctions  : FunctionExtension<"__opencl_c_work_group_collective_functions">;
+def FuncExtOpenCLCReadWriteImages: FunctionExtension<"__opencl_c_read_write_images">;
 def FuncExtFloatAtomicsFp16GlobalLoadStore  : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp16_global_atomic_load_store">;
 def FuncExtFloatAtomicsFp16LocalLoadStore   : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp16_local_atomic_load_store">;
 def FuncExtFloatAtomicsFp16GenericLoadStore : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp16_global_atomic_load_store __opencl_c_ext_fp16_local_atomic_load_store">;
@@ -1390,30 +1393,35 @@
 }
 
 // --- Table 23: Sampler-less Read Functions ---
+multiclass ImageReadSamplerless<string aQual> {
+  foreach imgTy = [Image2d, Image1dArray] in {
+def : Builtin<"read_imagef", [VectorType, ImageType, VectorType], Attr.Pure>;
+def : Builtin<"read_imagei", [VectorType, ImageType, VectorType], Attr.Pure>;
+def : Builtin<"read_imageui", [VectorType, ImageType, VectorType], Attr.Pure>;
+  }
+  foreach imgTy = [Image3d, Image2dArray] in {
+def : Builtin<"read_imagef", [VectorType, ImageType, VectorType], Attr.Pure>;
+def : Builtin<"read_imagei", [VectorType, ImageType, VectorType], Attr.Pure>;
+def : Builtin<"read_imageui", [VectorType, ImageType, VectorType], Attr.Pure>;
+  }
+  foreach imgTy = [Image1d, Image1dBuffer] in {
+def : Builtin<"read_imagef", [VectorType, ImageType, Int], Attr.Pure>;
+def : Builtin<"read_imagei", [VectorType, ImageType, Int], Attr.Pure>;
+def : Builtin<"read_imageui", [VectorType, ImageType, Int], Attr.Pure>;
+  }
+  def : Builtin<"read_imagef", [Float, ImageType, VectorType], Attr.Pure>;
+  def : Builtin<"read_imagef", [Float, ImageType, VectorType], Attr.Pure>;
+}
+
 let MinVersion = CL12 in {
-  foreach aQual = ["RO", "RW"] in {
-foreach imgTy = [Image2d, Image1dArray] in {
-  def : Builtin<"read_imagef", [VectorType, ImageType, VectorType], Attr.Pure>;
-  def : Builtin<"read_imagei", [VectorType, ImageType, VectorType], Attr.Pure>;
-  def : Builtin<"read_imageui", [VectorType, ImageType, VectorType], Attr.Pure>;
-}
-foreach imgTy = [Image3d, Image2dArray] in {
-  def : Builtin<"read_imagef", [VectorType, ImageType, VectorType], Attr.Pure>;
-  def : Builtin<"read_imagei", [VectorType, ImageType, VectorType], Attr.Pure>;
-  def : Builtin<"read_imageui", [VectorType, ImageType, VectorType], Attr.Pure>;
-}
-foreach imgTy = [Image1d, Image1dBuffer] in {
-  def : Builtin<"read_imagef", [VectorType, ImageType, Int], Attr.Pure>;
-  def : Builtin<"read_imagei", [VectorType, ImageType, Int], Attr.Pure>;
-  def : Builtin<"read_imageui", [VectorType, ImageType, Int], Attr.Pure>;
-}
-def : Builtin<"read_imagef", [Float, ImageType, VectorType], Attr.Pure>;
-def : Builtin<"read_imagef", [Float, ImageType, VectorType], Attr.Pure>;
+  defm : ImageReadSamplerless<"RO">;
+  let Extension = FuncExtOpenCLCReadWriteImages in {
+defm : ImageReadSamplerless<"RW">;
   }
 }
 
 // --- Table 24: 

[PATCH] D107769: [OpenCL] Make generic addrspace optional for -fdeclare-opencl-builtins

2022-01-24 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

In D107769#3265665, @Anastasia wrote:

> The way I understand the spec for OpenCL C 2.0 is that whenever the address 
> space of the pointer is not listed it means generic has to be used, here is 
> one example:
> https://www.khronos.org/registry/OpenCL/specs/2.2/html/OpenCL_C.html#vector-data-load-and-store-functions
>
>   gentypen vloadn(size_t offset, const gentype *p)
>   gentypen vloadn(size_t offset, const __constant gentype *p)
>
> that has no address space (i.e. `generic`) and `constant` explicitly. So I 
> think if address spaces are not listed explicitly they are not supposed to be 
> available.

The unified specification (which "specifies all versions of OpenCL C") seems to 
be making all overloads available as I understand; it is perhaps subtly 
different from the previous specification?

https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#vector-data-load-and-store-functions

  gentypen vloadn(size_t offset, const __global gentype *p)
  gentypen vloadn(size_t offset, const __local gentype *p)
  gentypen vloadn(size_t offset, const __constant gentype *p)
  gentypen vloadn(size_t offset, const __private gentype *p)
  
  For OpenCL C 2.0, or OpenCL C 3.0 or newer with the
  __opencl_c_generic_address_space feature:
  
  gentypen vloadn(size_t offset, const gentype *p)

Since the `__constant` overload should always be available, I think a reader 
can assume that the overloads directly above and below `__constant` are also 
always available?  So that the generic overload is an optional addition to the 
list of overloads.  If not, I'd expect the spec to specify a condition before 
listing the specific overloads.

> One implication of adding all address space overloads is that it makes 
> library size larger, but my feeling is that we don't have that many functions 
> with pointers to significantly impact the library size?

This patch should be touching all of them.  Not that many indeed, but it might 
still have a non-negligible impact on OpenCL libraries due to the combination 
of #vector-sizes * #types * #addrspaces.
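
To make the `#vector-sizes * #types * #addrspaces` product concrete, here is a small sketch in C. The helper `overload_count` is hypothetical, and the counts used in the note below (6 vector widths, 3 floating-point types, 3 named address spaces versus 1 generic) are assumptions for illustration, not figures taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of declarations one pointer-taking builtin
 * needs when every combination of vector size, element type and address
 * space gets its own overload. */
static size_t overload_count(size_t vector_sizes, size_t types,
                             size_t address_spaces) {
  /* A builtin with no variants along one dimension would not exist. */
  assert(vector_sizes > 0 && types > 0 && address_spaces > 0);
  return vector_sizes * types * address_spaces;
}
```

Under those assumed counts, `overload_count(6, 3, 3)` is 54 declarations for the named address spaces and `overload_count(6, 3, 1)` is 18 for the generic one, so exposing both sets would mean 72 declarations per builtin.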


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107769/new/

https://reviews.llvm.org/D107769



[PATCH] D107539: [OpenCL] opencl-c.h: add __opencl_c_images and __opencl_c_read_write_images

2022-01-21 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

Thanks for committing this!  The corresponding TableGen changes are in D117899.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107539/new/

https://reviews.llvm.org/D107539



[PATCH] D117899: [OpenCL] Make read_write images optional for -fdeclare-opencl-builtins

2022-01-21 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh created this revision.
svenvh added a reviewer: Anastasia.
svenvh added a project: clang.
Herald added subscribers: Naghasan, ldrumm, yaxunl.
svenvh requested review of this revision.
Herald added a subscriber: cfe-commits.

Ensure any use of a `read_write` image is guarded behind the 
`__opencl_c_read_write_images` feature macro.
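
The combined entries in this patch, such as `FuncExtKhrMipmapImageReadWrite`, list two names in one string. The sketch below models that kind of gating in plain C, under the assumption that a space-separated list means every listed macro must be defined for the builtin to be visible. `is_defined` and `builtin_available` are hypothetical helpers written for this illustration and are not clang functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the name of length `len` starting at `name` equals one of
 * the `n` entries in `defined`. */
static int is_defined(const char *name, size_t len,
                      const char *const *defined, size_t n) {
  for (size_t i = 0; i < n; ++i) {
    if (strlen(defined[i]) == len && strncmp(defined[i], name, len) == 0)
      return 1;
  }
  return 0;
}

/* Returns 1 only if every space-separated name in `ext_list` is defined.
 * An empty list imposes no requirement. */
static int builtin_available(const char *ext_list,
                             const char *const *defined, size_t n) {
  assert(ext_list != NULL);
  const char *p = ext_list;
  while (*p != '\0') {
    while (*p == ' ')
      ++p; /* skip separators */
    if (*p == '\0')
      break;
    size_t len = strcspn(p, " "); /* length of the current name */
    if (!is_defined(p, len, defined, n))
      return 0;
    p += len;
  }
  return 1;
}
```

With only `cl_khr_mipmap_image` in the defined set, a builtin gated on `"cl_khr_mipmap_image __opencl_c_read_write_images"` would be hidden in this model, and it becomes visible once both names are present.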


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D117899

Files:
  clang/lib/Headers/opencl-c-base.h
  clang/lib/Sema/OpenCLBuiltins.td
  clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl

Index: clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
===
--- clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
+++ clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
@@ -74,6 +74,7 @@
 #define cl_khr_subgroup_ballot 1
 #define cl_khr_subgroup_non_uniform_arithmetic 1
 #define cl_khr_subgroup_clustered_reduce 1
+#define __opencl_c_read_write_images 1
 #endif
 #endif
 
Index: clang/lib/Sema/OpenCLBuiltins.td
===
--- clang/lib/Sema/OpenCLBuiltins.td
+++ clang/lib/Sema/OpenCLBuiltins.td
@@ -80,11 +80,14 @@
 def FuncExtKhrInt64BaseAtomics   : FunctionExtension<"cl_khr_int64_base_atomics">;
 def FuncExtKhrInt64ExtendedAtomics   : FunctionExtension<"cl_khr_int64_extended_atomics">;
 def FuncExtKhrMipmapImage: FunctionExtension<"cl_khr_mipmap_image">;
+def FuncExtKhrMipmapImageReadWrite   : FunctionExtension<"cl_khr_mipmap_image __opencl_c_read_write_images">;
 def FuncExtKhrMipmapImageWrites  : FunctionExtension<"cl_khr_mipmap_image_writes">;
 def FuncExtKhrGlMsaaSharing  : FunctionExtension<"cl_khr_gl_msaa_sharing">;
+def FuncExtKhrGlMsaaSharingReadWrite : FunctionExtension<"cl_khr_gl_msaa_sharing __opencl_c_read_write_images">;
 
 def FuncExtOpenCLCPipes  : FunctionExtension<"__opencl_c_pipes">;
 def FuncExtOpenCLCWGCollectiveFunctions  : FunctionExtension<"__opencl_c_work_group_collective_functions">;
+def FuncExtOpenCLCReadWriteImages: FunctionExtension<"__opencl_c_read_write_images">;
 def FuncExtFloatAtomicsFp16GlobalLoadStore  : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp16_global_atomic_load_store">;
 def FuncExtFloatAtomicsFp16LocalLoadStore   : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp16_local_atomic_load_store">;
 def FuncExtFloatAtomicsFp16GenericLoadStore : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp16_global_atomic_load_store __opencl_c_ext_fp16_local_atomic_load_store">;
@@ -1390,30 +1393,35 @@
 }
 
 // --- Table 23: Sampler-less Read Functions ---
+multiclass ImageReadSamplerless {
+  foreach imgTy = [Image2d, Image1dArray] in {
+def : Builtin<"read_imagef", [VectorType, ImageType, VectorType], Attr.Pure>;
+def : Builtin<"read_imagei", [VectorType, ImageType, VectorType], Attr.Pure>;
+def : Builtin<"read_imageui", [VectorType, ImageType, VectorType], Attr.Pure>;
+  }
+  foreach imgTy = [Image3d, Image2dArray] in {
+def : Builtin<"read_imagef", [VectorType, ImageType, VectorType], Attr.Pure>;
+def : Builtin<"read_imagei", [VectorType, ImageType, VectorType], Attr.Pure>;
+def : Builtin<"read_imageui", [VectorType, ImageType, VectorType], Attr.Pure>;
+  }
+  foreach imgTy = [Image1d, Image1dBuffer] in {
+def : Builtin<"read_imagef", [VectorType, ImageType, Int], Attr.Pure>;
+def : Builtin<"read_imagei", [VectorType, ImageType, Int], Attr.Pure>;
+def : Builtin<"read_imageui", [VectorType, ImageType, Int], Attr.Pure>;
+  }
+  def : Builtin<"read_imagef", [Float, ImageType, VectorType], Attr.Pure>;
+  def : Builtin<"read_imagef", [Float, ImageType, VectorType], Attr.Pure>;
+}
+
 let MinVersion = CL12 in {
-  foreach aQual = ["RO", "RW"] in {
-foreach imgTy = [Image2d, Image1dArray] in {
-  def : Builtin<"read_imagef", [VectorType, ImageType, VectorType], Attr.Pure>;
-  def : Builtin<"read_imagei", [VectorType, ImageType, VectorType], Attr.Pure>;
-  def : Builtin<"read_imageui", [VectorType, ImageType, VectorType], Attr.Pure>;
-}
-foreach imgTy = [Image3d, Image2dArray] in {
-  def : Builtin<"read_imagef", [VectorType, ImageType, VectorType], Attr.Pure>;
-  def : Builtin<"read_imagei", [VectorType, ImageType, VectorType], Attr.Pure>;
-  def : Builtin<"read_imageui", [VectorType, ImageType, VectorType], Attr.Pure>;
-}
-foreach imgTy = [Image1d, Image1dBuffer] in {
-  def : Builtin<"read_imagef", [VectorType, ImageType, Int], Attr.Pure>;
-  def : Builtin<"read_imagei", [VectorType, ImageType, Int], Attr.Pure>;
-  def : Builtin<"read_imageui", [VectorType, ImageType, Int], Attr.Pure>;
-}
-def : Builtin<"read_imagef", [Float, ImageType, VectorType], Attr.Pure>;
-def : Builtin<"read_imagef", [Float, ImageType, VectorType], Attr.Pure>;
+  defm : ImageReadSamplerless<"RO">;
+  let Extension = FuncExtOpenCLCReadWriteImages 

[PATCH] D107539: [OpenCL] opencl-c.h: add __opencl_c_images and __opencl_c_read_write_images

2022-01-17 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

@airlied are you still planning to land this?  I started looking at the 
corresponding .td changes when I realized we don't use 
`__opencl_c_read_write_images` in `opencl-c.h` either yet. :-)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107539/new/

https://reviews.llvm.org/D107539



[PATCH] D115523: [OpenCL] Set external linkage for block enqueue kernels

2022-01-12 Thread Sven van Haastregt via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG4b85800bfd6c: [OpenCL] Set external linkage for block 
enqueue kernels (authored by svenvh).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115523/new/

https://reviews.llvm.org/D115523

Files:
  clang/lib/CodeGen/TargetInfo.cpp
  clang/test/CodeGenOpenCL/cl20-device-side-enqueue.cl


Index: clang/test/CodeGenOpenCL/cl20-device-side-enqueue.cl
===
--- clang/test/CodeGenOpenCL/cl20-device-side-enqueue.cl
+++ clang/test/CodeGenOpenCL/cl20-device-side-enqueue.cl
@@ -402,28 +402,28 @@
   size = get_kernel_sub_group_count_for_ndrange(ndrange, ^(){});
 }
 
-// COMMON: define internal spir_kernel void [[INVLK1]](i8 addrspace(4)* %0) #{{[0-9]+}} {
+// COMMON: define spir_kernel void [[INVLK1]](i8 addrspace(4)* %0) #{{[0-9]+}} {
 // COMMON: entry:
 // COMMON:  call spir_func void @__device_side_enqueue_block_invoke(i8 addrspace(4)* %0)
 // COMMON:  ret void
 // COMMON: }
-// COMMON: define internal spir_kernel void [[INVLK2]](i8 addrspace(4)*{{.*}})
-// COMMON: define internal spir_kernel void [[INVGK1]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
-// COMMON: define internal spir_kernel void [[INVGK2]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
-// COMMON: define internal spir_kernel void [[INVGK3]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
-// COMMON: define internal spir_kernel void [[INVGK4]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
-// COMMON: define internal spir_kernel void [[INVGK5]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
-// COMMON: define internal spir_kernel void [[INVGK6]](i8 addrspace(4)* %0, i8 addrspace(3)* %1, i8 addrspace(3)* %2, i8 addrspace(3)* %3) #{{[0-9]+}} {
+// COMMON: define spir_kernel void [[INVLK2]](i8 addrspace(4)*{{.*}})
+// COMMON: define spir_kernel void [[INVGK1]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
+// COMMON: define spir_kernel void [[INVGK2]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
+// COMMON: define spir_kernel void [[INVGK3]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
+// COMMON: define spir_kernel void [[INVGK4]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
+// COMMON: define spir_kernel void [[INVGK5]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
+// COMMON: define spir_kernel void [[INVGK6]](i8 addrspace(4)* %0, i8 addrspace(3)* %1, i8 addrspace(3)* %2, i8 addrspace(3)* %3) #{{[0-9]+}} {
 // COMMON: entry:
 // COMMON:  call spir_func void @__device_side_enqueue_block_invoke_9(i8 addrspace(4)* %0, i8 addrspace(3)* %1, i8 addrspace(3)* %2, i8 addrspace(3)* %3)
 // COMMON:  ret void
 // COMMON: }
-// COMMON: define internal spir_kernel void [[INVGK7]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
+// COMMON: define spir_kernel void [[INVGK7]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
 // COMMON: define internal spir_func void [[INVG8]](i8 addrspace(4)*{{.*}})
 // COMMON: define internal spir_func void [[INVG9]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)* %{{.*}})
-// COMMON: define internal spir_kernel void [[INVGK8]](i8 addrspace(4)*{{.*}})
-// COMMON: define internal spir_kernel void [[INV_G_K]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
-// COMMON: define internal spir_kernel void [[INVLK3]](i8 addrspace(4)*{{.*}})
-// COMMON: define internal spir_kernel void [[INVGK9]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
-// COMMON: define internal spir_kernel void [[INVGK10]](i8 addrspace(4)*{{.*}})
-// COMMON: define internal spir_kernel void [[INVGK11]](i8 addrspace(4)*{{.*}})
+// COMMON: define spir_kernel void [[INVGK8]](i8 addrspace(4)*{{.*}})
+// COMMON: define spir_kernel void [[INV_G_K]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
+// COMMON: define spir_kernel void [[INVLK3]](i8 addrspace(4)*{{.*}})
+// COMMON: define spir_kernel void [[INVGK9]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
+// COMMON: define spir_kernel void [[INVGK10]](i8 addrspace(4)*{{.*}})
+// COMMON: define spir_kernel void [[INVGK11]](i8 addrspace(4)*{{.*}})
Index: clang/lib/CodeGen/TargetInfo.cpp
===
--- clang/lib/CodeGen/TargetInfo.cpp
+++ clang/lib/CodeGen/TargetInfo.cpp
@@ -11417,7 +11417,7 @@
   auto &C = CGF.getLLVMContext();
   std::string Name = Invoke->getName().str() + "_kernel";
   auto *FT = llvm::FunctionType::get(llvm::Type::getVoidTy(C), ArgTys, false);
-  auto *F = llvm::Function::Create(FT, llvm::GlobalValue::InternalLinkage, Name,
+  auto *F = llvm::Function::Create(FT, llvm::GlobalValue::ExternalLinkage, Name,
());
   auto IP = CGF.Builder.saveIP();
   auto *BB = llvm::BasicBlock::Create(C, "entry", F);



[PATCH] D116266: [SPIR-V] Add linking of separate translation units using spirv-link

2022-01-11 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added inline comments.



Comment at: clang/test/Driver/spirv-toolchain.cl:71
+// SPLINK: {{llvm-spirv.*"}} [[BC]] "-o" [[SPV2:".*o"]]
+// SPLINK: {{"spirv-link.*"}} [[SPV1]] [[SPV2]] "-o" "a.out"

aganea wrote:
> Hello @Anastasia, this line fails on my machine. It works with `// SPLINK: 
> spirv-link{{.*}} [[SPV1]] [[SPV2]] "-o" "a.out"
> `
> 
> See output with the error:
> ```
> FAIL: Clang :: Driver/spirv-toolchain.cl (579 of 81682)
>  TEST 'Clang :: Driver/spirv-toolchain.cl' FAILED 
> 
> Script:
> --
> : 'RUN: at line 2';   d:\git\llvm-project\release\bin\clang.exe -### 
> --target=spirv64 -x cl -c 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl 2>&1 | 
> d:\git\llvm-project\release\bin\filecheck.exe --check-prefix=SPV64 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl
> : 'RUN: at line 3';   d:\git\llvm-project\release\bin\clang.exe -### 
> --target=spirv64 D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl 
> 2>&1 | d:\git\llvm-project\release\bin\filecheck.exe --check-prefix=SPV64 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl
> : 'RUN: at line 4';   d:\git\llvm-project\release\bin\clang.exe -### 
> --target=spirv64 -x ir -c 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl 2>&1 | 
> d:\git\llvm-project\release\bin\filecheck.exe --check-prefix=SPV64 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl
> : 'RUN: at line 5';   d:\git\llvm-project\release\bin\clang.exe -### 
> --target=spirv64 -x clcpp -c 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl 2>&1 | 
> d:\git\llvm-project\release\bin\filecheck.exe --check-prefix=SPV64 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl
> : 'RUN: at line 6';   d:\git\llvm-project\release\bin\clang.exe -### 
> --target=spirv64 -x c -c 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl 2>&1 | 
> d:\git\llvm-project\release\bin\filecheck.exe --check-prefix=SPV64 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl
> : 'RUN: at line 12';   d:\git\llvm-project\release\bin\clang.exe -### 
> --target=spirv32 -x cl -c 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl 2>&1 | 
> d:\git\llvm-project\release\bin\filecheck.exe --check-prefix=SPV32 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl
> : 'RUN: at line 13';   d:\git\llvm-project\release\bin\clang.exe -### 
> --target=spirv32 D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl 
> 2>&1 | d:\git\llvm-project\release\bin\filecheck.exe --check-prefix=SPV32 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl
> : 'RUN: at line 14';   d:\git\llvm-project\release\bin\clang.exe -### 
> --target=spirv32 -x ir -c 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl 2>&1 | 
> d:\git\llvm-project\release\bin\filecheck.exe --check-prefix=SPV32 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl
> : 'RUN: at line 15';   d:\git\llvm-project\release\bin\clang.exe -### 
> --target=spirv32 -x clcpp -c 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl 2>&1 | 
> d:\git\llvm-project\release\bin\filecheck.exe --check-prefix=SPV32 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl
> : 'RUN: at line 16';   d:\git\llvm-project\release\bin\clang.exe -### 
> --target=spirv32 -x c -c 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl 2>&1 | 
> d:\git\llvm-project\release\bin\filecheck.exe --check-prefix=SPV32 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl
> : 'RUN: at line 24';   d:\git\llvm-project\release\bin\clang.exe -### 
> --target=spirv64 -x cl -S 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl 2>&1 | 
> d:\git\llvm-project\release\bin\filecheck.exe --check-prefix=SPT64 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl
> : 'RUN: at line 25';   d:\git\llvm-project\release\bin\clang.exe -### 
> --target=spirv64 -x ir -S 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl 2>&1 | 
> d:\git\llvm-project\release\bin\filecheck.exe --check-prefix=SPT64 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl
> : 'RUN: at line 26';   d:\git\llvm-project\release\bin\clang.exe -### 
> --target=spirv64 -x clcpp -c 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl 2>&1 | 
> d:\git\llvm-project\release\bin\filecheck.exe --check-prefix=SPV64 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl
> : 'RUN: at line 27';   d:\git\llvm-project\release\bin\clang.exe -### 
> --target=spirv64 -x c -S 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl 2>&1 | 
> d:\git\llvm-project\release\bin\filecheck.exe --check-prefix=SPT64 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl
> : 'RUN: at line 33';   d:\git\llvm-project\release\bin\clang.exe -### 
> --target=spirv32 -x cl -S 
> D:\git\llvm-project\clang\test\Driver\spirv-toolchain.cl 2>&1 | 
> d:\git\llvm-project\release\bin\filecheck.exe --check-prefix=SPT32 
> 

[PATCH] D116266: [SPIR-V] Add linking of separate translation units using spirv-link

2022-01-10 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh accepted this revision.
svenvh added a comment.
This revision is now accepted and ready to land.

LGTM.  I made some minor comments that can be fixed before committing.




Comment at: clang/docs/UsersManual.rst:3567
 
+Linking is done using `spirv-link` linker from `the SPIRV-Tools project
+`_. Similar to other

(or: "+the spirv-link linker")



Comment at: clang/docs/UsersManual.rst:3569
+`_. Similar to other
+linkers Clang will expect `spirv-link` to be installed separately and to be
+present in the ``PATH`` environment variable. Please refer to `the build and





Comment at: clang/lib/Driver/ToolChains/SPIRV.cpp:78
+void SPIRV::Linker::ConstructJob(Compilation , const JobAction ,
+  const InputInfo ,
+  const InputInfoList ,

Indentation seems to be slightly off?  Can be fixed on commit.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116266/new/

https://reviews.llvm.org/D116266



[PATCH] D112410: [SPIR-V] Add a toolchain for SPIR-V in clang

2021-12-16 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh accepted this revision.
svenvh added a comment.
This revision is now accepted and ready to land.

Mostly some minor comments that you can address at commit time.

It would be good to get approval from another reviewer.




Comment at: clang/docs/UsersManual.rst:3538
+`_
+for more details. Clang will expects the ``llvm-spirv`` executable to
+be present in the ``PATH`` environment variable. Clang uses ``llvm-spirv``





Comment at: clang/test/Driver/spirv-toolchain.cl:10
+// SPV64-SAME: "-o" [[BC:".*bc"]]
+// SPV64: {{".*llvm-spirv.*"}} [[BC]] "-o" {{".*o"}}
+

svenvh wrote:
> Anastasia wrote:
> > svenvh wrote:
> > > Any reason to not just check for `llvm-spirv{{.*}}`, for consistency with 
> > > the clang check above?
> > Good question, apparently some tools get some target prefixes like if you 
> > look at `Driver::generatePrefixedToolNames`:
> > https://clang.llvm.org/doxygen/Driver_8cpp_source.html#l05169
> > 
> > But perhaps it doesn't happen for `llvm-spirv` and we can safely omit the 
> > prefix?
> Having a target prefix for llvm-spirv seems a bit redundant indeed.  
> `Driver::generatePrefixedToolNames` seems to add both a prefixed and 
> non-prefixed tool name.
I see you have taken out the checking of the prefix; my suggestion was to write 
`llvm-spirv{{.*}}`, to align with the clang check above.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112410/new/

https://reviews.llvm.org/D112410



[PATCH] D110742: [OpenCL] Add pure attributes to vload builtins

2021-12-16 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh accepted this revision.
svenvh added a comment.
This revision is now accepted and ready to land.

In D110742#3197183, @stuart wrote:

> Thanks!  I have updated the review to use `__attribute__((pure))` only (i.e. 
> it no longer uses `__attribute__((const))`).

LGTM!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110742/new/

https://reviews.llvm.org/D110742



[PATCH] D110742: [OpenCL] Add pure and const attributes to vload builtins

2021-12-15 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

> Does the langref need to be amended, first, or is it okay to interpret the 
> `readnone` attribute as it was clearly intended, without going through the 
> process of updating the langref first?
>
> I can update this review to use `__attribute__((pure))` for all address 
> spaces, for the time being, but it seems a shame that the poor wording in the 
> langref might (necessarily) prevent us from making the optimal change.

Apologies for the late reply...  I'd prefer to get the langref updated first, 
for the sake of consistency and to ensure other stakeholders agree with the 
interpretation.  You can still go ahead with the `__attribute__((pure))` 
changes of course.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110742/new/

https://reviews.llvm.org/D110742



[PATCH] D112410: [SPIR-V] Add a toolchain for SPIR-V in clang

2021-12-13 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added inline comments.



Comment at: clang/test/Driver/spirv-toolchain.cl:10
+// SPV64-SAME: "-o" [[BC:".*bc"]]
+// SPV64: {{".*llvm-spirv.*"}} [[BC]] "-o" {{".*o"}}
+

Anastasia wrote:
> svenvh wrote:
> > Any reason to not just check for `llvm-spirv{{.*}}`, for consistency with 
> > the clang check above?
> Good question, apparently some tools get some target prefixes like if you 
> look at `Driver::generatePrefixedToolNames`:
> https://clang.llvm.org/doxygen/Driver_8cpp_source.html#l05169
> 
> But perhaps it doesn't happen for `llvm-spirv` and we can safely omit the 
> prefix?
Having a target prefix for llvm-spirv seems a bit redundant indeed.  
`Driver::generatePrefixedToolNames` seems to add both a prefixed and 
non-prefixed tool name.
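
For illustration, the behaviour described above can be modelled as follows. This is a simplified, hypothetical stand-in and not the real `Driver::generatePrefixedToolNames` signature; the function name `candidateToolNames` and the ordering of the two candidates are assumptions made for the sketch.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified model: produce the target-prefixed tool name
// followed by the plain tool name, so that either spelling can be found.
std::vector<std::string> candidateToolNames(const std::string &tool,
                                            const std::string &targetTriple) {
  std::vector<std::string> names;
  names.push_back(targetTriple + "-" + tool); // e.g. "spirv64-llvm-spirv"
  names.push_back(tool);                      // e.g. "llvm-spirv"
  return names;
}
```

Since both candidates end in the plain tool name, a check of the form `llvm-spirv{{.*}}` would match either one.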


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112410/new/

https://reviews.llvm.org/D112410



[PATCH] D115523: [OpenCL] Set external linkage for block enqueue kernels

2021-12-10 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh created this revision.
svenvh added reviewers: Anastasia, yaxunl.
svenvh added a project: clang.
Herald added a subscriber: ldrumm.
svenvh requested review of this revision.
Herald added a subscriber: cfe-commits.

All kernels can be called from the host as per the SPIR_KERNEL calling
convention.  As such, all kernels should have external linkage, but
block enqueue kernels were created with internal linkage.

Reported-by: Pedro Olsen Ferreira


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D115523

Files:
  clang/lib/CodeGen/TargetInfo.cpp
  clang/test/CodeGenOpenCL/cl20-device-side-enqueue.cl


Index: clang/test/CodeGenOpenCL/cl20-device-side-enqueue.cl
===
--- clang/test/CodeGenOpenCL/cl20-device-side-enqueue.cl
+++ clang/test/CodeGenOpenCL/cl20-device-side-enqueue.cl
@@ -402,28 +402,28 @@
   size = get_kernel_sub_group_count_for_ndrange(ndrange, ^(){});
 }
 
-// COMMON: define internal spir_kernel void [[INVLK1]](i8 addrspace(4)* %0) #{{[0-9]+}} {
+// COMMON: define spir_kernel void [[INVLK1]](i8 addrspace(4)* %0) #{{[0-9]+}} {
 // COMMON: entry:
 // COMMON:  call spir_func void @__device_side_enqueue_block_invoke(i8 addrspace(4)* %0)
 // COMMON:  ret void
 // COMMON: }
-// COMMON: define internal spir_kernel void [[INVLK2]](i8 addrspace(4)*{{.*}})
-// COMMON: define internal spir_kernel void [[INVGK1]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
-// COMMON: define internal spir_kernel void [[INVGK2]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
-// COMMON: define internal spir_kernel void [[INVGK3]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
-// COMMON: define internal spir_kernel void [[INVGK4]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
-// COMMON: define internal spir_kernel void [[INVGK5]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
-// COMMON: define internal spir_kernel void [[INVGK6]](i8 addrspace(4)* %0, i8 addrspace(3)* %1, i8 addrspace(3)* %2, i8 addrspace(3)* %3) #{{[0-9]+}} {
+// COMMON: define spir_kernel void [[INVLK2]](i8 addrspace(4)*{{.*}})
+// COMMON: define spir_kernel void [[INVGK1]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
+// COMMON: define spir_kernel void [[INVGK2]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
+// COMMON: define spir_kernel void [[INVGK3]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
+// COMMON: define spir_kernel void [[INVGK4]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
+// COMMON: define spir_kernel void [[INVGK5]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
+// COMMON: define spir_kernel void [[INVGK6]](i8 addrspace(4)* %0, i8 addrspace(3)* %1, i8 addrspace(3)* %2, i8 addrspace(3)* %3) #{{[0-9]+}} {
 // COMMON: entry:
 // COMMON:  call spir_func void @__device_side_enqueue_block_invoke_9(i8 addrspace(4)* %0, i8 addrspace(3)* %1, i8 addrspace(3)* %2, i8 addrspace(3)* %3)
 // COMMON:  ret void
 // COMMON: }
-// COMMON: define internal spir_kernel void [[INVGK7]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
+// COMMON: define spir_kernel void [[INVGK7]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
 // COMMON: define internal spir_func void [[INVG8]](i8 addrspace(4)*{{.*}})
 // COMMON: define internal spir_func void [[INVG9]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)* %{{.*}})
-// COMMON: define internal spir_kernel void [[INVGK8]](i8 addrspace(4)*{{.*}})
-// COMMON: define internal spir_kernel void [[INV_G_K]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
-// COMMON: define internal spir_kernel void [[INVLK3]](i8 addrspace(4)*{{.*}})
-// COMMON: define internal spir_kernel void [[INVGK9]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
-// COMMON: define internal spir_kernel void [[INVGK10]](i8 addrspace(4)*{{.*}})
-// COMMON: define internal spir_kernel void [[INVGK11]](i8 addrspace(4)*{{.*}})
+// COMMON: define spir_kernel void [[INVGK8]](i8 addrspace(4)*{{.*}})
+// COMMON: define spir_kernel void [[INV_G_K]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
+// COMMON: define spir_kernel void [[INVLK3]](i8 addrspace(4)*{{.*}})
+// COMMON: define spir_kernel void [[INVGK9]](i8 addrspace(4)*{{.*}}, i8 addrspace(3)*{{.*}})
+// COMMON: define spir_kernel void [[INVGK10]](i8 addrspace(4)*{{.*}})
+// COMMON: define spir_kernel void [[INVGK11]](i8 addrspace(4)*{{.*}})
Index: clang/lib/CodeGen/TargetInfo.cpp
===
--- clang/lib/CodeGen/TargetInfo.cpp
+++ clang/lib/CodeGen/TargetInfo.cpp
@@ -11351,7 +11351,7 @@
   auto &C = CGF.getLLVMContext();
   std::string Name = Invoke->getName().str() + "_kernel";
   auto *FT = llvm::FunctionType::get(llvm::Type::getVoidTy(C), ArgTys, false);
-  auto *F = llvm::Function::Create(FT, llvm::GlobalValue::InternalLinkage, Name,
+  auto *F = llvm::Function::Create(FT, llvm::GlobalValue::ExternalLinkage, Name,
());
   auto IP = CGF.Builder.saveIP();
   auto *BB = 

[PATCH] D112410: [SPIR-V] Add a toolchain for SPIR-V in clang

2021-12-02 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh requested changes to this revision.
svenvh added inline comments.
This revision now requires changes to proceed.



Comment at: clang/lib/Driver/Driver.cpp:3728
 
+  // Linking separate translation units for SPIR-V is not supported yet.
+  // It can be done either by LLVM IR linking before conversion of the final

FIXME? (assuming this is something we want to address eventually)



Comment at: clang/test/Driver/spirv-toolchain.cl:10
+// SPV64-SAME: "-o" [[BC:".*bc"]]
+// SPV64: {{".*llvm-spirv.*"}} [[BC]] "-o" {{".*o"}}
+

Any reason to not just check for `llvm-spirv{{.*}}`, for consistency with the 
clang check above?



Comment at: clang/test/Misc/warning-flags.c:21
 
-CHECK: Warnings without flags (67):
+CHECK: Warnings without flags (68):
 

The comment above says: "The list of warnings below should NEVER grow.", and 
the current patch violates that.  You'll need to add a warning group to the new 
warning.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112410/new/

https://reviews.llvm.org/D112410



[PATCH] D107769: [OpenCL] Make generic addrspace optional for -fdeclare-opencl-builtins

2021-10-14 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

In D107769#2967565, @Anastasia wrote:

> In D107769#2960441, @svenvh wrote:
>
>> I have done an alternative spin of this patch: instead of introducing 
>> `RequireDisabledExtension`, simply always make the `private`, `global`, and 
>> `local` overloads available.  This makes tablegen diverge from `opencl-c.h` 
>> though.  Perhaps we should also always make the `private`, `global`, and 
>> `local` overloads available in `opencl-c.h`?
>
> Yes, we could do this indeed as a clang extension. I don't think this will 
> impact the user code. However, this means vendors will have to add 
> definitions for extra overloads in OpenCL 2.0 mode?

I wonder if making the `private`, `global`, and `local` overloads always 
available would even be considered an extension.  Unless I overlooked 
something, I cannot find anything in the spec saying that the `private`, 
`global`, and `local` overloads should **not** be available when a `generic` 
overload is available (even though this is what Clang has always done)?

Making all overloads always available in e.g. CL2.0 mode will indeed affect the 
generated calls, which means the definitions should be available when consuming 
the generated IR.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107769/new/

https://reviews.llvm.org/D107769



[PATCH] D109740: [OpenCL] Add atomic_half type builtins

2021-10-12 Thread Sven van Haastregt via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG544d89e847d4: [OpenCL] Add atomic_half type builtins 
(authored by svenvh).

Changed prior to commit:
  https://reviews.llvm.org/D109740?vs=376467&id=378938#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109740/new/

https://reviews.llvm.org/D109740

Files:
  clang/lib/Headers/opencl-c.h
  clang/lib/Sema/OpenCLBuiltins.td
  clang/lib/Sema/Sema.cpp
  clang/test/SemaOpenCL/atomic-ops.cl

Index: clang/test/SemaOpenCL/atomic-ops.cl
===
--- clang/test/SemaOpenCL/atomic-ops.cl
+++ clang/test/SemaOpenCL/atomic-ops.cl
@@ -19,7 +19,7 @@
 
 atomic_int gn;
 void f(atomic_int *i, const atomic_int *ci,
-   atomic_intptr_t *p, atomic_float *f, atomic_double *d, atomic_half *h, // expected-error {{unknown type name 'atomic_half'}}
+   atomic_intptr_t *p, atomic_float *f, atomic_double *d, atomic_half *h,
int *I, const int *CI,
intptr_t *P, float *D, struct S *s1, struct S *s2,
global atomic_int *i_g, local atomic_int *i_l, private atomic_int *i_p,
Index: clang/lib/Sema/Sema.cpp
===
--- clang/lib/Sema/Sema.cpp
+++ clang/lib/Sema/Sema.cpp
@@ -367,6 +367,11 @@
 AddPointerSizeDependentTypes();
   }
 
+  if (getOpenCLOptions().isSupported("cl_khr_fp16", getLangOpts())) {
+auto AtomicHalfT = Context.getAtomicType(Context.HalfTy);
+addImplicitTypedef("atomic_half", AtomicHalfT);
+  }
+
   std::vector<QualType> Atomic64BitTypes;
   if (getOpenCLOptions().isSupported("cl_khr_int64_base_atomics",
  getLangOpts()) &&
Index: clang/lib/Sema/OpenCLBuiltins.td
===
--- clang/lib/Sema/OpenCLBuiltins.td
+++ clang/lib/Sema/OpenCLBuiltins.td
@@ -85,16 +85,25 @@
 
 def FuncExtOpenCLCPipes  : FunctionExtension<"__opencl_c_pipes">;
 def FuncExtOpenCLCWGCollectiveFunctions  : FunctionExtension<"__opencl_c_work_group_collective_functions">;
+def FuncExtFloatAtomicsFp16GlobalLoadStore  : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp16_global_atomic_load_store">;
+def FuncExtFloatAtomicsFp16LocalLoadStore   : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp16_local_atomic_load_store">;
+def FuncExtFloatAtomicsFp16GenericLoadStore : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp16_global_atomic_load_store __opencl_c_ext_fp16_local_atomic_load_store">;
+def FuncExtFloatAtomicsFp16GlobalAdd : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp16_global_atomic_add">;
 def FuncExtFloatAtomicsFp32GlobalAdd : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp32_global_atomic_add">;
 def FuncExtFloatAtomicsFp64GlobalAdd : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp64_global_atomic_add">;
+def FuncExtFloatAtomicsFp16LocalAdd  : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp16_local_atomic_add">;
 def FuncExtFloatAtomicsFp32LocalAdd  : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp32_local_atomic_add">;
 def FuncExtFloatAtomicsFp64LocalAdd  : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp64_local_atomic_add">;
+def FuncExtFloatAtomicsFp16GenericAdd: FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp16_local_atomic_add __opencl_c_ext_fp16_global_atomic_add">;
 def FuncExtFloatAtomicsFp32GenericAdd: FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp32_local_atomic_add __opencl_c_ext_fp32_global_atomic_add">;
 def FuncExtFloatAtomicsFp64GenericAdd: FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp64_local_atomic_add __opencl_c_ext_fp64_global_atomic_add">;
+def FuncExtFloatAtomicsFp16GlobalMinMax  : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp16_global_atomic_min_max">;
 def FuncExtFloatAtomicsFp32GlobalMinMax  : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp32_global_atomic_min_max">;
 def FuncExtFloatAtomicsFp64GlobalMinMax  : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp64_global_atomic_min_max">;
+def FuncExtFloatAtomicsFp16LocalMinMax   : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp16_local_atomic_min_max">;
 def FuncExtFloatAtomicsFp32LocalMinMax   : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp32_local_atomic_min_max">;
 def FuncExtFloatAtomicsFp64LocalMinMax   : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp64_local_atomic_min_max">;
+def FuncExtFloatAtomicsFp16GenericMinMax : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp16_local_atomic_min_max __opencl_c_ext_fp16_global_atomic_min_max">;
 def FuncExtFloatAtomicsFp32GenericMinMax : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp32_local_atomic_min_max __opencl_c_ext_fp32_global_atomic_min_max">;
 def 

[PATCH] D110742: [OpenCL] Add pure and const attributes to vload builtins

2021-10-07 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

> For the constant address space, the const attribute (or readnone) can be 
> used. As memory in the constant address space is immutable, the statement in 
> the langref that: "if a readnone function reads or writes memory visible to 
> the program, or has other side-effects, the behavior is undefined" does not 
> apply. The reading of immutable memory does not have side-effects, nor can it 
> be affected by side-effects.

I think `readnone` might be too strong, because the pointer argument will still 
be dereferenced (while `readnone` implies that "the function computes its 
result [...] based strictly on its arguments, without dereferencing any pointer 
arguments").


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110742/new/

https://reviews.llvm.org/D110742

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109740: [OpenCL] Add atomic_half type builtins

2021-10-07 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh accepted this revision.
svenvh added a comment.
This revision is now accepted and ready to land.

LGTM


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109740/new/

https://reviews.llvm.org/D109740



[PATCH] D109740: [OpenCL] Add atomic_half type builtins

2021-09-29 Thread Sven van Haastregt via Phabricator via cfe-commits
svenvh added a comment.

Looks good at first glance, but please fix the formatting errors reported on 
the lines that you are adding.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109740/new/

https://reviews.llvm.org/D109740


