[compiler-rt] [lldb] [libcxx] [llvm] [clang-tools-extra] [lld] [libc] [mlir] [openmp] [flang] [clang] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-12-15 Thread Sang Ik Lee via cfe-commits

silee2 wrote:

Closing as all sub-components has been merged.

https://github.com/llvm/llvm-project/pull/65539
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-09-14 Thread Sang Ik Lee via cfe-commits


@@ -71,7 +71,8 @@ void GPUToSPIRVPass::runOnOperation() {
   std::unique_ptr target =
   spirv::getMemorySpaceToStorageClassTarget(*context);
   spirv::MemorySpaceToStorageClassMap memorySpaceMap =
-  spirv::mapMemorySpaceToVulkanStorageClass;
+  this->useOpenCL ? spirv::mapMemorySpaceToOpenCLStorageClass :
+  spirv::mapMemorySpaceToVulkanStorageClass;

silee2 wrote:

Created #66445 which includes the code change and unit-test for covering.

https://github.com/llvm/llvm-project/pull/65539
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-09-12 Thread Guray Ozen via cfe-commits


@@ -811,8 +812,13 @@ LogicalResult 
ConvertAllocOpToGpuRuntimeCallPattern::matchAndRewrite(
   // descriptor.
   Type elementPtrType = this->getElementPtrType(memRefType);
   auto stream = adaptor.getAsyncDependencies().front();
+
+  auto isHostShared = rewriter.create(
+  loc, llvmInt64Type, rewriter.getI64IntegerAttr(isShared));
+
   Value allocatedPtr =
-  allocCallBuilder.create(loc, rewriter, {sizeBytes, stream}).getResult();
+  allocCallBuilder.create(loc, rewriter, {sizeBytes, stream, isHostShared})
+  .getResult();

grypp wrote:

> the upstream GPUToLLVMConversion lowering does not support lowering of 
> gpu.alloc which is not async.

Would that work if omit that check when `host_shared` is present? 




https://github.com/llvm/llvm-project/pull/65539
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-09-12 Thread Ronan Keryell via cfe-commits


@@ -811,8 +812,13 @@ LogicalResult 
ConvertAllocOpToGpuRuntimeCallPattern::matchAndRewrite(
   // descriptor.
   Type elementPtrType = this->getElementPtrType(memRefType);
   auto stream = adaptor.getAsyncDependencies().front();
+
+  auto isHostShared = rewriter.create(
+  loc, llvmInt64Type, rewriter.getI64IntegerAttr(isShared));
+
   Value allocatedPtr =
-  allocCallBuilder.create(loc, rewriter, {sizeBytes, stream}).getResult();
+  allocCallBuilder.create(loc, rewriter, {sizeBytes, stream, isHostShared})
+  .getResult();

keryell wrote:

I guess that if the runtime uses actually synchronous allocation behind the 
scene and produces an always-ready async token, it works, even if non optimal.

https://github.com/llvm/llvm-project/pull/65539
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-09-12 Thread Nishant Patel via cfe-commits

https://github.com/nbpatel edited 
https://github.com/llvm/llvm-project/pull/65539
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-09-12 Thread Nishant Patel via cfe-commits


@@ -811,8 +812,13 @@ LogicalResult 
ConvertAllocOpToGpuRuntimeCallPattern::matchAndRewrite(
   // descriptor.
   Type elementPtrType = this->getElementPtrType(memRefType);
   auto stream = adaptor.getAsyncDependencies().front();
+
+  auto isHostShared = rewriter.create(
+  loc, llvmInt64Type, rewriter.getI64IntegerAttr(isShared));
+
   Value allocatedPtr =
-  allocCallBuilder.create(loc, rewriter, {sizeBytes, stream}).getResult();
+  allocCallBuilder.create(loc, rewriter, {sizeBytes, stream, isHostShared})
+  .getResult();

nbpatel wrote:

the upstream GPUToLLVMConversion lowering does not support lowering of 
gpu.alloc which is not async? 
https://github.com/llvm/llvm-project/blob/main/mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp#L797
 


https://github.com/llvm/llvm-project/pull/65539
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-09-09 Thread Guray Ozen via cfe-commits


@@ -811,8 +812,13 @@ LogicalResult 
ConvertAllocOpToGpuRuntimeCallPattern::matchAndRewrite(
   // descriptor.
   Type elementPtrType = this->getElementPtrType(memRefType);
   auto stream = adaptor.getAsyncDependencies().front();
+
+  auto isHostShared = rewriter.create(
+  loc, llvmInt64Type, rewriter.getI64IntegerAttr(isShared));
+
   Value allocatedPtr =
-  allocCallBuilder.create(loc, rewriter, {sizeBytes, stream}).getResult();
+  allocCallBuilder.create(loc, rewriter, {sizeBytes, stream, isHostShared})
+  .getResult();

grypp wrote:

> Technically, SYCL provides a more abstract memory management with 
> sycl::buffer and sycl::accessor defining an implicit asynchronous task graph. 
> The allocation details are left to the implementation, asynchronous or 
> synchronous allocation is left to the implementers.

I haven't touched SYCL much, thanks for the explanation. Creating a task graph 
implicitly sounds interesting. In this case, SYCL users are ware of asynchrony 
while writing their program. In CUDA (or HIP), users choose sync or async 
execution. 

> Here the lower-level synchronous USM memory management API of SYCL is used 
> instead, similar to CUDA/HIP memory management.

Yes that's correct. I don't think there is an USM that can do allocation 
asynchronously. 

> So, should the async allocation in the example be synchronous instead?

Yes, I think this is the correct behaviour. We can disallow `host_shared` and 
`async` on the Op. 

Here are the possible IRs: 

```
// Valid
%memref = gpu.alloc host_shared (): memref<3x3xi64>

// Valid
%memref = gpu.alloc (): memref<3x3xi64>

// Invalid, USM managers don't allocate async  
%memref, %asyncToken = gpu.alloc async [%0] host_shared (): memref<3x3xi64>

// Valid, only for CUDA. Afaik, SYCL or HIP cannot do that
%memref, %asyncToken = gpu.alloc async [%0] (): memref<3x3xi64>
```

https://github.com/llvm/llvm-project/pull/65539
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-09-08 Thread Jianhui Li via cfe-commits

Jianhui-Li wrote:

> > We really just need a "mlir-runner" with target-platform as command 
> > parameters.
> 
> What kind of "target-platform" command parameters do you have in mind? (other 
> than what we do now)

The current way of mlir-cpu-runner using the share library name to indicate 
target-platform looks good to me: Cuda, Rocm, and SYCL with this PR.  Vulkan 
could be added same way.  mlir-cpu-spirv-runner could be refactored to be 
mlir-opt passes generating spirv binary and feed to mlir-cpu-runner.  

If we reach that state, the name "mlir-cpu-runner" could be promoted to 
"mlir-runner". That would clear up a lot of misunderstanding.  The 
"mlir-runner" is really running MLIR programs on a target platform, and has 
little to do with enabling specific language or runtime features of a target 
platform.  For example, we say "SYCL runner", people think it is about running 
SYCL program on top of MLIR. 

https://github.com/llvm/llvm-project/pull/65539
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-09-07 Thread Ronan Keryell via cfe-commits


@@ -811,8 +812,13 @@ LogicalResult 
ConvertAllocOpToGpuRuntimeCallPattern::matchAndRewrite(
   // descriptor.
   Type elementPtrType = this->getElementPtrType(memRefType);
   auto stream = adaptor.getAsyncDependencies().front();
+
+  auto isHostShared = rewriter.create(
+  loc, llvmInt64Type, rewriter.getI64IntegerAttr(isShared));
+
   Value allocatedPtr =
-  allocCallBuilder.create(loc, rewriter, {sizeBytes, stream}).getResult();
+  allocCallBuilder.create(loc, rewriter, {sizeBytes, stream, isHostShared})
+  .getResult();

keryell wrote:

Technically, SYCL provides a more abstract memory management with 
`sycl::buffer` and `sycl::accessor` defining an implicit asynchronous task 
graph. The allocation details are left to the implementation, asynchronous or 
synchronous allocation is left to the implementers.
Here the lower-level synchronous USM memory management API of SYCL is used 
instead, similar to CUDA/HIP memory management.
So, should the `async` allocation in the example be synchronous instead?

https://github.com/llvm/llvm-project/pull/65539
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-09-07 Thread Fabian Mora via cfe-commits


@@ -0,0 +1,70 @@
+//===- SerializeToSPIRV.cpp - Convert GPU kernel to SPIRV blob 
-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+///
+/// \file
+/// This pass iterates all the SPIR-V modules in the top module and serializes
+/// each SPIR-V module to SPIR-V binary and then attachs the binary blob as a
+/// string attribute to the corresponding gpu module.
+///
+//===--===//
+
+#include "mlir/Dialect/GPU/Transforms/Passes.h"
+
+#include "mlir/Dialect/GPU/IR/GPUDialect.h"
+#include "mlir/Dialect/GPU/Transforms/Passes.h"
+#include "mlir/Dialect/SPIRV/IR/SPIRVDialect.h"
+#include "mlir/Dialect/SPIRV/IR/SPIRVOps.h"
+#include "mlir/Target/SPIRV/Serialization.h"
+
+namespace mlir {
+#define GEN_PASS_DEF_GPUSERIALIZETOSPIRVPASS
+#include "mlir/Dialect/GPU/Transforms/Passes.h.inc"
+} // namespace mlir
+
+using namespace mlir;
+
+struct GpuSerializeToSPIRVPass : public 
mlir::impl::GpuSerializeToSPIRVPassBase {
+public:
+  void runOnOperation() override {
+auto mod = getOperation();
+llvm::SmallVector spvBinary;
+for (mlir::gpu::GPUModuleOp gpuMod : mod.getOps()) {
+  auto name = gpuMod.getName();
+  // check that the spv module has the same name with gpu module except the
+  // prefix "__spv__"
+  auto isSameMod = [&](spirv::ModuleOp spvMod) -> bool {
+auto spvModName = spvMod.getName();
+return spvModName->consume_front("__spv__") && spvModName == name;
+  };
+  auto spvMods = mod.getOps();
+  auto it = llvm::find_if(spvMods, isSameMod);
+  if (it == spvMods.end()) {
+gpuMod.emitError() << "Unable to find corresponding SPIR-V module";
+signalPassFailure();
+return;
+  }
+  auto spvMod = *it;
+
+  spvBinary.clear();
+  // serialize the spv module to spv binary
+  if (mlir::failed(spirv::serialize(spvMod, spvBinary))) {
+spvMod.emitError() << "Failed to serialize SPIR-V module";
+signalPassFailure();
+return;
+  }
+
+  // attach the spv binary to the gpu module
+  auto spvData =
+  llvm::StringRef(reinterpret_cast(spvBinary.data()),
+  spvBinary.size() * sizeof(uint32_t));
+  auto spvAttr = mlir::StringAttr::get((), spvData);
+  gpuMod->setAttr(gpu::getDefaultGpuBinaryAnnotation(), spvAttr);
+  spvMod->erase();
+}
+  }
+};

fabianmcg wrote:

@silee2 here are the steps:
1. Implement a target attribute, see for example: 
[NVVMTargetAttr](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td#L1679-L1741).
 The idea of this attribute is to hold properties intrinsic to the target, like 
triple, chip, flags, etc.
2.  Add a pass to attach the target to a module, see: 
[GpuNVVMAttachTarget](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Dialect/GPU/Transforms/Passes.td#L85-L128)
 and 
[Dialect/GPU/Transforms/NVVMAttachTarget.cpp](https://github.com/llvm/llvm-project/blob/main/mlir/lib/Dialect/GPU/Transforms/NVVMAttachTarget.cpp).
 The idea of this pass is to attach the SPIRV target to GPU modules, so it must 
know how to create them.
3. We're currently implementing `TargetAttrs` as external models to keep 
libraries separated, see 
[NVVM/Target.cpp](https://github.com/llvm/llvm-project/blob/main/mlir/lib/Target/LLVM/NVVM/Target.cpp#L44-L50),
 so `GpuSerializeToSPIRVPass::run` would be there.
4. Modify `getModuleLoadFn` & `createKernelLaunch` appropriately in 
[SelectObjectAttr.cpp#L125-L15](https://github.com/llvm/llvm-project/blob/main/mlir/lib/Target/LLVMIR/Dialect/GPU/SelectObjectAttr.cpp#L125-L152)
 instead of adding the changes in `GPUToLLVMCommon`.
5. Then the compilation workflow should look something similar to this: [GPU: 
Compilation 
Overview](https://mlir.llvm.org/docs/Dialects/GPU/#compilation-overview)

I'll take care of adding a pointer to the top module symbol table so it can be 
used be the `SPIRVTarget`.

If you have any questions just ping me in discord or discourse `@fabianmc`.

https://github.com/llvm/llvm-project/pull/65539
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-09-07 Thread Mehdi Amini via cfe-commits


@@ -0,0 +1,223 @@
+//===- SyclRuntimeWrappers.cpp - MLIR SYCL wrapper library ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// Implements C wrappers around the sycl runtime library.
+//
+//===--===//
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#ifdef _WIN32
+#define SYCL_RUNTIME_EXPORT __declspec(dllexport)
+#else
+#define SYCL_RUNTIME_EXPORT
+#endif // _WIN32
+
+namespace {
+
+template 
+auto catchAll(F &) {
+  try {
+return func();
+  } catch (const std::exception ) {
+fprintf(stdout, "An exception was thrown: %s\n", e.what());
+fflush(stdout);
+abort();
+  } catch (...) {
+fprintf(stdout, "An unknown exception was thrown\n");
+fflush(stdout);
+abort();
+  }
+}
+
+#define L0_SAFE_CALL(call) 
\
+  {
\
+ze_result_t status = (call);   
\
+if (status != ZE_RESULT_SUCCESS) { 
\
+  fprintf(stdout, "L0 error %d\n", status);
\
+  fflush(stdout);  
\
+  abort(); 
\
+}  
\
+  }
+
+} // namespace
+
+static sycl::device getDefaultDevice() {
+  auto platformList = sycl::platform::get_platforms();
+  for (const auto  : platformList) {
+auto platformName = platform.get_info();
+bool isLevelZero = platformName.find("Level-Zero") != std::string::npos;
+if (!isLevelZero)
+  continue;
+
+return platform.get_devices()[0];
+  }
+  throw std::runtime_error("getDefaultDevice failed");
+}
+
+#pragma clang diagnostic push
+#pragma clang diagnostic ignored "-Wglobal-constructors"

joker-eph wrote:

I don't think we have this anywhere in the codebase right now, can you remove 
it and use lazy initialization instead?

https://github.com/llvm/llvm-project/pull/65539
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-09-07 Thread Jianhui Li via cfe-commits

Jianhui-Li wrote:

> > At some point it would be nice to have some design document or 
> > documentation somewhere explaining how all these MLIR runners works, 
> > including this one.
> 
> The idea is to eventually consolidate all runners into one. This PR is just 
> another piece of the puzzle.
> 
> Once we're all happy with how the runners work, we should common them up 
> using command line options to select the "type" and CMake options to enable 
> particular runner types (depending on the runtimes and hardware available).
> 
> > Globally this PR add a SYCL runner, but it is very specific for Intel Level 
> > 0. It would be nice to have in the future some generalization, like SYCL 
> > using OpenCL interoperability interface to run the SPIR-V kernels or even 
> > native kernels.
> 
> Agreed! The SYCL runtime here is just being used to abstract the LevelZero 
> calls, but this work will be helpful when adding a full SYCL runner (actual 
> language extensions and libraries) to other CPUs/GPUs later.

Agree. The key point of this PR is not to create yet another runner, but just 
enable the existing mlir-cpu-runner to runs llvm + device kernel on SYCL 
platform. The current implementation uses L0 but could be extended. 

I won't call this as adding a "SYCL runner", it is add SYCL backend to 
mlir-cpu-runner.  The term "xxx runner" in MLIR is a bit overused. 
mlir-cpu-runner doesn't mean it runs only on CPU, instead it runs both the host 
and device kernel.   mlir-cpu-runner currently does that but the name is a bit 
misleading.  We really just need a "mlir-runner" with target-platform as 
command parameters.

https://github.com/llvm/llvm-project/pull/65539
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-09-07 Thread Ronan Keryell via cfe-commits

https://github.com/keryell edited 
https://github.com/llvm/llvm-project/pull/65539
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-09-07 Thread Renato Golin via cfe-commits

rengolin wrote:

CI failure looks like Buildkite issue?
```
$ /etc/buildkite-agent/hooks/pre-checkout
--
  | BUILDKITE_REPO: https://github.com/llvm/llvm-project.git
  | fatal: not a git repository (or any parent up to mount point /var/lib)
  | Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
  |  Error: The global pre-checkout hook exited with status 128
```

https://github.com/llvm/llvm-project/pull/65539
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-09-07 Thread Renato Golin via cfe-commits

rengolin wrote:

> At some point it would be nice to have some design document or documentation 
> somewhere explaining how all these MLIR runners works, including this one.

The idea is to eventually consolidate all runners into one. This PR is just 
another piece of the puzzle.

Once we're all happy with how the runners work, we should common them up using 
command line options to select the "type" and CMake options to enable 
particular runner types (depending on the runtimes and hardware available).

> Globally this PR add a SYCL runner, but it is very specific for Intel Level 
> 0. It would be nice to have in the future some generalization, like SYCL 
> using OpenCL interoperability interface to run the SPIR-V kernels or even 
> native kernels.

Agreed! The SYCL runtime here is just being used to abstract the LevelZero 
calls, but this work will be helpful when adding a full SYCL runner (actual 
language extensions and libraries) to other CPUs/GPUs later. 

https://github.com/llvm/llvm-project/pull/65539
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-09-06 Thread Ronan Keryell via cfe-commits


@@ -0,0 +1,223 @@
+//===- SyclRuntimeWrappers.cpp - MLIR SYCL wrapper library ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// Implements C wrappers around the sycl runtime library.
+//
+//===--===//
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#ifdef _WIN32
+#define SYCL_RUNTIME_EXPORT __declspec(dllexport)
+#else
+#define SYCL_RUNTIME_EXPORT
+#endif // _WIN32
+
+namespace {
+
+template 
+auto catchAll(F &) {
+  try {
+return func();
+  } catch (const std::exception ) {
+fprintf(stdout, "An exception was thrown: %s\n", e.what());
+fflush(stdout);
+abort();
+  } catch (...) {
+fprintf(stdout, "An unknown exception was thrown\n");
+fflush(stdout);
+abort();
+  }
+}
+
+#define L0_SAFE_CALL(call) 
\
+  {
\
+ze_result_t status = (call);   
\
+if (status != ZE_RESULT_SUCCESS) { 
\
+  fprintf(stdout, "L0 error %d\n", status);
\
+  fflush(stdout);  
\
+  abort(); 
\
+}  
\
+  }
+
+} // namespace
+
+static sycl::device getDefaultDevice() {
+  auto platformList = sycl::platform::get_platforms();
+  for (const auto  : platformList) {
+auto platformName = platform.get_info();
+bool isLevelZero = platformName.find("Level-Zero") != std::string::npos;
+if (!isLevelZero)
+  continue;
+
+return platform.get_devices()[0];
+  }
+  throw std::runtime_error("getDefaultDevice failed");
+}
+
+#pragma clang diagnostic push
+#pragma clang diagnostic ignored "-Wglobal-constructors"
+
+// Create global device and context
+sycl::device syclDevice = getDefaultDevice();
+sycl::context syclContext = sycl::context(syclDevice);

keryell wrote:

```suggestion
sycl::context syclContext { syclDevice };
```

https://github.com/llvm/llvm-project/pull/65539
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-09-06 Thread Ronan Keryell via cfe-commits

https://github.com/keryell edited 
https://github.com/llvm/llvm-project/pull/65539
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-09-06 Thread Ronan Keryell via cfe-commits


@@ -0,0 +1,223 @@
+//===- SyclRuntimeWrappers.cpp - MLIR SYCL wrapper library ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// Implements C wrappers around the sycl runtime library.
+//
+//===--===//
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#ifdef _WIN32
+#define SYCL_RUNTIME_EXPORT __declspec(dllexport)
+#else
+#define SYCL_RUNTIME_EXPORT
+#endif // _WIN32
+
+namespace {
+
+template 
+auto catchAll(F &) {
+  try {
+return func();
+  } catch (const std::exception ) {
+fprintf(stdout, "An exception was thrown: %s\n", e.what());
+fflush(stdout);
+abort();
+  } catch (...) {
+fprintf(stdout, "An unknown exception was thrown\n");
+fflush(stdout);
+abort();
+  }
+}
+
+#define L0_SAFE_CALL(call) 
\
+  {
\
+ze_result_t status = (call);   
\
+if (status != ZE_RESULT_SUCCESS) { 
\
+  fprintf(stdout, "L0 error %d\n", status);
\
+  fflush(stdout);  
\
+  abort(); 
\
+}  
\
+  }
+
+} // namespace
+
+static sycl::device getDefaultDevice() {
+  auto platformList = sycl::platform::get_platforms();
+  for (const auto  : platformList) {
+auto platformName = platform.get_info();
+bool isLevelZero = platformName.find("Level-Zero") != std::string::npos;
+if (!isLevelZero)
+  continue;
+
+return platform.get_devices()[0];
+  }
+  throw std::runtime_error("getDefaultDevice failed");
+}
+
+#pragma clang diagnostic push
+#pragma clang diagnostic ignored "-Wglobal-constructors"
+
+// Create global device and context
+sycl::device syclDevice = getDefaultDevice();
+sycl::context syclContext = sycl::context(syclDevice);
+
+#pragma clang diagnostic pop
+
+struct QUEUE {

keryell wrote:

Why this spelling? Coding standard?
Why do you need this object? At the end this looks like a 
`std::optional`.

https://github.com/llvm/llvm-project/pull/65539
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-09-06 Thread Ronan Keryell via cfe-commits


@@ -0,0 +1,223 @@
+//===- SyclRuntimeWrappers.cpp - MLIR SYCL wrapper library ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// Implements C wrappers around the sycl runtime library.
+//
+//===--===//
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#ifdef _WIN32
+#define SYCL_RUNTIME_EXPORT __declspec(dllexport)
+#else
+#define SYCL_RUNTIME_EXPORT
+#endif // _WIN32
+
+namespace {
+
+template 
+auto catchAll(F &) {
+  try {
+return func();
+  } catch (const std::exception ) {
+fprintf(stdout, "An exception was thrown: %s\n", e.what());
+fflush(stdout);
+abort();
+  } catch (...) {
+fprintf(stdout, "An unknown exception was thrown\n");
+fflush(stdout);
+abort();
+  }
+}
+
+#define L0_SAFE_CALL(call) 
\
+  {
\
+ze_result_t status = (call);   
\
+if (status != ZE_RESULT_SUCCESS) { 
\
+  fprintf(stdout, "L0 error %d\n", status);
\
+  fflush(stdout);  
\
+  abort(); 
\
+}  
\
+  }
+
+} // namespace
+
+static sycl::device getDefaultDevice() {
+  auto platformList = sycl::platform::get_platforms();
+  for (const auto  : platformList) {
+auto platformName = platform.get_info();
+bool isLevelZero = platformName.find("Level-Zero") != std::string::npos;
+if (!isLevelZero)
+  continue;
+
+return platform.get_devices()[0];
+  }
+  throw std::runtime_error("getDefaultDevice failed");
+}
+
+#pragma clang diagnostic push
+#pragma clang diagnostic ignored "-Wglobal-constructors"
+
+// Create global device and context
+sycl::device syclDevice = getDefaultDevice();
+sycl::context syclContext = sycl::context(syclDevice);
+
+#pragma clang diagnostic pop
+
+struct QUEUE {
+  sycl::queue syclQueue_;
+
+  QUEUE() { syclQueue_ = sycl::queue(syclContext, syclDevice); }
+};
+
+static void *allocDeviceMemory(QUEUE *queue, size_t size, bool isShared) {
+  void *memPtr = nullptr;
+  if (isShared) {
+memPtr = sycl::aligned_alloc_shared(64, size, syclDevice, syclContext);
+  } else {
+memPtr = sycl::aligned_alloc_device(64, size, syclDevice, syclContext);
+  }
+  if (memPtr == nullptr) {
+throw std::runtime_error("mem allocation failed!");
+  }
+  return memPtr;
+}
+
+static void deallocDeviceMemory(QUEUE *queue, void *ptr) {
+  sycl::free(ptr, queue->syclQueue_);
+}
+
+static ze_module_handle_t loadModule(const void *data, size_t dataSize) {
+  assert(data);
+  ze_module_handle_t zeModule;
+  ze_module_desc_t desc = {ZE_STRUCTURE_TYPE_MODULE_DESC,
+   nullptr,
+   ZE_MODULE_FORMAT_IL_SPIRV,
+   dataSize,
+   (const uint8_t *)data,
+   nullptr,
+   nullptr};
+  auto zeDevice =
+  sycl::get_native(syclDevice);
+  auto zeContext =
+  sycl::get_native(syclContext);
+  L0_SAFE_CALL(zeModuleCreate(zeContext, zeDevice, , , nullptr));
+  return zeModule;
+}
+
+static sycl::kernel *getKernel(ze_module_handle_t zeModule, const char *name) {
+  assert(zeModule);
+  assert(name);
+  ze_kernel_handle_t zeKernel;
+  sycl::kernel *syclKernel;
+  ze_kernel_desc_t desc = {};
+  desc.pKernelName = name;
+
+  L0_SAFE_CALL(zeKernelCreate(zeModule, , ));
+  sycl::kernel_bundle kernelBundle =
+  sycl::make_kernel_bundle({zeModule},
+   syclContext);
+
+  auto kernel = sycl::make_kernel(
+  {kernelBundle, zeKernel}, syclContext);
+  syclKernel = new sycl::kernel(kernel);
+  return syclKernel;
+}
+
+static void launchKernel(QUEUE *queue, sycl::kernel *kernel, size_t gridX,

keryell wrote:

```suggestion
static void launchKernel(QUEUE queue, sycl::kernel kernel, size_t gridX,
```
or even use `&` if you are afraid of using the reference semantics of SYCL 
behind the scene.

https://github.com/llvm/llvm-project/pull/65539
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-09-06 Thread Ronan Keryell via cfe-commits


@@ -116,6 +116,7 @@ 
add_definitions(-DMLIR_ROCM_CONVERSIONS_ENABLED=${MLIR_ENABLE_ROCM_CONVERSIONS})
 
 set(MLIR_ENABLE_CUDA_RUNNER 0 CACHE BOOL "Enable building the mlir CUDA 
runner")
 set(MLIR_ENABLE_ROCM_RUNNER 0 CACHE BOOL "Enable building the mlir ROCm 
runner")
+set(MLIR_ENABLE_SYCL_RUNNER 0 CACHE BOOL "Enable building the mlir Sycl 
runner")

keryell wrote:

Please spell SYCL correctly.
```suggestion
set(MLIR_ENABLE_SYCL_RUNNER 0 CACHE BOOL "Enable building the mlir SYCL runner")
```
One could argue that `mlir` should be spelled `MLIR` but the train seems to 
have left long time ago. :-)

https://github.com/llvm/llvm-project/pull/65539
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-09-06 Thread Sang Ik Lee via cfe-commits

https://github.com/silee2 edited https://github.com/llvm/llvm-project/pull/65539
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits