[PATCH] D113359: [Libomptarget][WIP] Introduce VGPU Plugin
dreachem added subscribers: dhruvachak, dreachem. dreachem added a comment. Herald added a subscriber: MaskRay. Herald added a project: All. @jdoerfert @tianshilei1992 @atmnpatel @dhruvachak Is the target to get this merged in for LLVM 16? Does the VGPU implementation provide a way to support OMPT callbacks for various constructs (parallel, worksharing, barriers, etc.)? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D113359/new/ https://reviews.llvm.org/D113359 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D113359: [Libomptarget][WIP] Introduce VGPU Plugin
jdoerfert added a comment.

We can merge the runtime first and build it in isolation, then the libomptarget host runtime, then clang. Also make sure to adjust the commit messages.
[PATCH] D113359: [Libomptarget][WIP] Introduce VGPU Plugin
tianshilei1992 added a comment.

Not sure if it's good to merge such a large patch. We could potentially split the patch into three independent patches: tool chain, device runtime, and the OpenMPOpt pass that supports expansion of shared variables (which for some reason is not included in this patch; it is actually a very important component, since otherwise the backend will complain about it).
[PATCH] D113359: [Libomptarget][WIP] Introduce VGPU Plugin
jdoerfert accepted this revision.
jdoerfert added a comment.
This revision is now accepted and ready to land.

LG, with some things to address before the merge though.
Didn't we have a pass to expand shared memory (and such)?

Comment at: clang/lib/Basic/TargetInfo.cpp:155
+
+  if (Triple.getVendor() == llvm::Triple::OpenMP_VGPU)
+    AddrSpaceMap = ::omp::OpenMPVGPUAddrSpaceMap;

Use isOpenMPVGPU.

Comment at: clang/lib/Basic/Targets/X86.h:395
+    return llvm::omp::VirtualGpuGridValues;
+  }
 };

Do we need the changes in this file at all? I couldn't see why.

Comment at: clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp:1125
+    llvm::GlobalValue::LinkageTypes Linkage) {
+  if (CGM.getTarget().getTriple().getVendor() == llvm::Triple::OpenMP_VGPU)
+    return CGOpenMPRuntime::createOffloadEntry(ID, Addr, Size, Flags, Linkage);

Use isOpenMPVGPU.

Comment at: clang/lib/CodeGen/CodeGenModule.cpp:252
   default:
-    if (LangOpts.OpenMPSimd)
+    if (getTriple().getVendor() == llvm::Triple::OpenMP_VGPU)
+      OpenMPRuntime.reset(new CGOpenMPRuntimeGPU(*this));

Use isOpenMPVGPU.

Comment at: clang/lib/Driver/ToolChains/Gnu.cpp:3076
+
+  if (getTriple().getVendor() == llvm::Triple::OpenMP_VGPU) {
+    std::string BitcodeSuffix = getTripleString() + "-openmp_vgpu";

Use isOpenMPVGPU.

Comment at: openmp/libomptarget/DeviceRTL/src/Synchronization.cpp:323
+constexpr uint32_t UNSET = 0;
+constexpr uint32_t SET = 1;
+

Remove these. Also the TODO below (copied from somewhere).

Comment at: openmp/libomptarget/plugins/vgpu/src/ThreadEnvironmentImpl.cpp:85
+    CTAEnvironmentTy *CTAE)
+    : ThreadIdInWarp(Idx++ % WE->getNumThreads()),
+      ThreadIdInBlock(WE->getWarpId() * WE->getNumThreads() + ThreadIdInWarp),

This is racy, I think. Can we use atomic_add for all these Idx updates or pass the Id from the outside?

Comment at: openmp/libomptarget/plugins/vgpu/src/ThreadEnvironmentImpl.h:118
+
+  // FIXME: This is wrong
+  LaneMaskTy getActiveMask() const;

At least add more information about what the problem and potential solutions are.

Comment at: openmp/libomptarget/plugins/vgpu/src/rtl.cpp:271
+             ThreadIdx++) {
+          Threads.emplace_back([this, GlobalThreadIdx, CTAEnv, WarpEnv]() {
+            ThreadEnvironment = new ThreadEnvironmentTy(WarpEnv, CTAEnv);

Move the lambda into a helper function. An indentation of 12 is too much.

Comment at: openmp/libomptarget/plugins/vgpu/src/rtl.cpp:313
+          });
+          GlobalThreadIdx = (GlobalThreadIdx + 1) % NumThreads;
+        }

When do we have more threads than NumThreads?

Comment at: openmp/libomptarget/plugins/vgpu/src/rtl.cpp:554
+
+int32_t __tgt_rtl_data_delete(int32_t device_id, void *tgt_ptr) {
+  free(tgt_ptr);

If we need to wait for submit/retrieve, I'd assume we should wait here too.
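
The race flagged on `Idx++` in ThreadEnvironmentImpl.cpp:85 could be closed in either of the two ways the comment names. The following is only a sketch under stated assumptions: `LaneCounterTy`, `claimLane`, and `ThreadIdsTy` are hypothetical names invented for illustration, and the plugin's real `WarpEnvironmentTy` and `ThreadEnvironmentTy` are not reproduced.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the shared lane counter (not the plugin's type).
struct LaneCounterTy {
  std::atomic<uint32_t> Idx{0};

  // Option 1: fetch_add returns the value before the increment as a single
  // indivisible step, so two threads can never read the same Idx.
  uint32_t claimLane(uint32_t NumThreads) {
    assert(NumThreads > 0 && "need at least one thread per warp");
    return Idx.fetch_add(1, std::memory_order_relaxed) % NumThreads;
  }
};

// Option 2: the spawning code computes the id and passes it in, so the
// constructor touches no shared state at all (hypothetical type).
struct ThreadIdsTy {
  uint32_t ThreadIdInWarp;
  uint32_t ThreadIdInBlock;

  ThreadIdsTy(uint32_t LaneId, uint32_t WarpId, uint32_t NumThreads)
      : ThreadIdInWarp(LaneId), ThreadIdInBlock(WarpId * NumThreads + LaneId) {}
};
```

Passing the id from the outside is probably the simpler of the two, since the loop that spawns the threads already knows each thread's index.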
[PATCH] D113359: [Libomptarget][WIP] Introduce VGPU Plugin
atmnpatel added inline comments.

Comment at: openmp/libomptarget/test/CMakeLists.txt:23
+    continue()
+  ENDIF()
   string(STRIP "${CURRENT_TARGET}" CURRENT_TARGET)

jdoerfert wrote:
> This is to disable the tests? Not sure this is a good way though. For one,
> can we check against -vgpu not x86, also openmp-vgpu or something, right?
Yep
[PATCH] D113359: [Libomptarget][WIP] Introduce VGPU Plugin
atmnpatel updated this revision to Diff 405407.
atmnpatel marked 7 inline comments as done.
atmnpatel added a comment.

updates

Files:
  clang/lib/Basic/TargetInfo.cpp
  clang/lib/Basic/Targets/X86.h
  clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/Driver/ToolChains/Gnu.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  llvm/include/llvm/ADT/Triple.h
  llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h
  llvm/lib/Support/Triple.cpp
  openmp/CMakeLists.txt
  openmp/libomptarget/DeviceRTL/CMakeLists.txt
  openmp/libomptarget/DeviceRTL/include/ThreadEnvironment.h
  openmp/libomptarget/DeviceRTL/src/Debug.cpp
  openmp/libomptarget/DeviceRTL/src/Mapping.cpp
  openmp/libomptarget/DeviceRTL/src/Misc.cpp
  openmp/libomptarget/DeviceRTL/src/Synchronization.cpp
  openmp/libomptarget/DeviceRTL/src/Utils.cpp
  openmp/libomptarget/plugins/CMakeLists.txt
  openmp/libomptarget/plugins/vgpu/CMakeLists.txt
  openmp/libomptarget/plugins/vgpu/src/ThreadEnvironment.cpp
  openmp/libomptarget/plugins/vgpu/src/ThreadEnvironment.h
  openmp/libomptarget/plugins/vgpu/src/ThreadEnvironmentImpl.cpp
  openmp/libomptarget/plugins/vgpu/src/ThreadEnvironmentImpl.h
  openmp/libomptarget/plugins/vgpu/src/rtl.cpp
  openmp/libomptarget/src/rtl.cpp
  openmp/libomptarget/test/CMakeLists.txt

Index: openmp/libomptarget/test/CMakeLists.txt
===
--- openmp/libomptarget/test/CMakeLists.txt
+++ openmp/libomptarget/test/CMakeLists.txt
@@ -18,6 +18,9 @@
 string(REGEX MATCHALL "([^\ ]+\ |[^\ ]+$)" SYSTEM_TARGETS "${LIBOMPTARGET_SYSTEM_TARGETS}")
 foreach(CURRENT_TARGET IN LISTS SYSTEM_TARGETS)
+  IF ("${CURRENT_TARGET}" MATCHES "-vgpu")
+    continue()
+  ENDIF()
   string(STRIP "${CURRENT_TARGET}" CURRENT_TARGET)
   add_openmp_testsuite(check-libomptarget-${CURRENT_TARGET}
     "Running libomptarget tests"

Index: openmp/libomptarget/src/rtl.cpp
===
--- openmp/libomptarget/src/rtl.cpp
+++ openmp/libomptarget/src/rtl.cpp
@@ -21,17 +21,22 @@
 #include
 #include
 
-// List of all plugins that can support offloading.
-static const char *RTLNames[] = {
-    /* PowerPC target       */ "libomptarget.rtl.ppc64.so",
-    /* x86_64 target        */ "libomptarget.rtl.x86_64.so",
-    /* CUDA target          */ "libomptarget.rtl.cuda.so",
-    /* AArch64 target       */ "libomptarget.rtl.aarch64.so",
-    /* SX-Aurora VE target  */ "libomptarget.rtl.ve.so",
-    /* AMDGPU target        */ "libomptarget.rtl.amdgpu.so",
-    /* Remote target        */ "libomptarget.rtl.rpc.so",
+struct PluginInfoTy {
+  std::string Name;
+  bool IsHost;
 };
 
+// List of all plugins that can support offloading.
+static const PluginInfoTy Plugins[] = {
+    /* PowerPC target       */ {"libomptarget.rtl.ppc64.so", true},
+    /* x86_64 target        */ {"libomptarget.rtl.x86_64.so", true},
+    /* CUDA target          */ {"libomptarget.rtl.cuda.so", false},
+    /* AArch64 target       */ {"libomptarget.rtl.aarch64.so", true},
+    /* SX-Aurora VE target  */ {"libomptarget.rtl.ve.so", false},
+    /* AMDGPU target        */ {"libomptarget.rtl.amdgpu.so", false},
+    /* Remote target        */ {"libomptarget.rtl.rpc.so", false},
+    /* Virtual GPU target   */ {"libomptarget.rtl.vgpu.so", false}};
+
 PluginManager *PM;
 
 #if OMPTARGET_PROFILE_ENABLED
@@ -86,21 +91,37 @@
     return;
   }
 
+  // TODO: add ability to inspect image and decide automatically
+  bool UseVGPU = false;
+  if (auto *EnvFlag = std::getenv("LIBOMPTARGET_USE_VGPU"))
+    UseVGPU = true;
+
   DP("Loading RTLs...\n");
 
   // Attempt to open all the plugins and, if they exist, check if the interface
   // is correct and if they are supporting any devices.
-  for (auto *Name : RTLNames) {
-    DP("Loading library '%s'...\n", Name);
-    void *dynlib_handle = dlopen(Name, RTLD_NOW);
+  for (auto &[Name, IsHost] : Plugins) {
+    DP("Loading library '%s'...\n", Name.c_str());
+
+    int Flags = RTLD_NOW;
+
+    if (Name.compare("libomptarget.rtl.vgpu.so") == 0)
+      Flags |= RTLD_GLOBAL;
+
+    if (UseVGPU && IsHost) {
+      DP("Skipping library '%s': VGPU was requested.\n", Name.c_str());
+      continue;
+    }
+
+    void *dynlib_handle = dlopen(Name.c_str(), Flags);
     if (!dynlib_handle) {
       // Library does not exist or cannot be found.
-      DP("Unable to load library '%s': %s!\n", Name, dlerror());
+      DP("Unable to load library '%s': %s!\n", Name.c_str(), dlerror());
       continue;
     }
 
-    DP("Successfully loaded library '%s'!\n", Name);
+    DP("Successfully loaded library '%s'!\n", Name.c_str());
     AllRTLs.emplace_back();
 
Index: openmp/libomptarget/plugins/vgpu/src/rtl.cpp
===
--- /dev/null
+++
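
The skip logic in this revision (a per-plugin `IsHost` flag plus the `LIBOMPTARGET_USE_VGPU` environment variable) can be looked at in isolation from `dlopen`. The sketch below is not the patch's code: `vgpuRequested` and `selectPlugins` are hypothetical helpers introduced here so the filtering decision can be exercised on its own, and only `PluginInfoTy` and the variable name come from the diff.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Same shape as the struct introduced in the diff.
struct PluginInfoTy {
  std::string Name;
  bool IsHost;
};

// Hypothetical helper: true when LIBOMPTARGET_USE_VGPU is set to any value,
// mirroring the diff's "is the variable present" check.
bool vgpuRequested() {
  return std::getenv("LIBOMPTARGET_USE_VGPU") != nullptr;
}

// Hypothetical helper: returns the plugin names the loader would go on to
// dlopen. Host plugins are dropped when the VGPU was requested.
std::vector<std::string> selectPlugins(const std::vector<PluginInfoTy> &Plugins,
                                       bool UseVGPU) {
  std::vector<std::string> Selected;
  for (const PluginInfoTy &P : Plugins) {
    if (UseVGPU && P.IsHost)
      continue;
    Selected.push_back(P.Name);
  }
  return Selected;
}
```

A caller would write `selectPlugins(Plugins, vgpuRequested())`. Keeping the environment lookup separate from the filter makes the filter testable without touching process state.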
[PATCH] D113359: [Libomptarget][WIP] Introduce VGPU Plugin
jdoerfert added inline comments.

Comment at: clang/lib/Driver/ToolChains/Gnu.cpp:3082
+  if (getTriple().getVendor() == llvm::Triple::OpenMP_VGPU) {
+    std::string BitcodeSuffix = "x86_64-vgpu";
+    clang::driver::tools::addOpenMPDeviceRTL(getDriver(), DriverArgs, CC1Args,

tianshilei1992 wrote:
> Maybe `"x86_64-openmp_vpu"` now?
Not x86, right? The triple contains the proper arch.

Comment at: openmp/libomptarget/DeviceRTL/src/Mapping.cpp:29
+
+#include "ThreadEnvironment.h"
+

Move up to the beginning.

Comment at: openmp/libomptarget/DeviceRTL/src/Synchronization.cpp:291
+
+#include "ThreadEnvironment.h"
+namespace impl {

Move up.

Comment at: openmp/libomptarget/DeviceRTL/src/Synchronization.cpp:342
+  VGPUImpl::setLock((uint32_t *)Lock, UNSET, SET, OMP_SPIN,
+                    mapping::getBlockId(), atomicCAS);
+}

We should simply use omp locks. Either here, or maybe better, in VGPUImpl. So redirect all calls to there and use a proper lock. No OMP_SPIN and stuff.

Comment at: openmp/libomptarget/DeviceRTL/src/Utils.cpp:118
+
+#include "ThreadEnvironment.h"
+namespace impl {

Move up.

Comment at: openmp/libomptarget/DeviceRTL/src/Utils.cpp:127
+  return getThreadEnvironment()->shuffleDown(Var, Delta);
+}
+

Pass the mask, both times.

Comment at: openmp/libomptarget/plugins/vgpu/src/ThreadEnvironment.cpp:49
+  } // wait for 0 to be the read value
+}
+

See above.

Comment at: openmp/libomptarget/src/rtl.cpp:97
+      continue;
+    }
+

Not only x86, also let's not do strcmp. Extend RTLNames to be an array of structs with more elaborate information, e.g., an is-host flag. That said, I'm unsure if not loading the plugin is the right way to not grab the image. Good enough for now.

Comment at: openmp/libomptarget/test/CMakeLists.txt:23
+    continue()
+  ENDIF()
   string(STRIP "${CURRENT_TARGET}" CURRENT_TARGET)

This is to disable the tests? Not sure this is a good way though. For one, can we check against -vgpu not x86, also openmp-vgpu or something, right?
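
The comment on Synchronization.cpp:342 asks for a proper lock in place of the CAS spin loop with `UNSET`, `SET`, and `OMP_SPIN`. A minimal sketch of that shape follows, with loud assumptions: `std::mutex` is swapped in for the OpenMP lock the review mentions so that the example builds without OpenMP flags, and `VGPULockTy` and the three free functions are hypothetical names, not the `VGPUImpl` interface.

```cpp
#include <mutex>

// Hypothetical host-side lock wrapper. std::mutex is a stand-in for the
// omp lock suggested in the review; it is not what the patch uses.
struct VGPULockTy {
  std::mutex M;
};

// Blocks until the lock is acquired; no hand-written spin/backoff loop.
void setLock(VGPULockTy &L) { L.M.lock(); }

// Releases a lock held by the calling thread.
void unsetLock(VGPULockTy &L) { L.M.unlock(); }

// Tries to acquire without blocking; returns true if the lock was taken.
// Must not be called by a thread that already holds the lock.
bool testLock(VGPULockTy &L) { return L.M.try_lock(); }
```

With the device runtime forwarding to functions like these, the spin constants would no longer be needed in Synchronization.cpp, which is also what the earlier "Remove these" comment asks for.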
[PATCH] D113359: [Libomptarget][WIP] Introduce VGPU Plugin
atmnpatel added inline comments.

Comment at: openmp/libomptarget/DeviceRTL/CMakeLists.txt:231
+
+compileDeviceRTLLibrary(x86_64 vgpu -target x86_64-vgpu -std=c++20 -stdlib=libc++ -I${devicertl_base_directory}/../plugins/vgpu/src)

tianshilei1992 wrote:
> It's not a good practice to specify include directories in CMake in this way.
> Use `include_directories` instead.
I don't think we can do that here. AFAIK both `include_directories` and `target_include_directories` require that CMake builds the target, but we specify custom targets/build commands, so they don't get pulled in.
[PATCH] D113359: [Libomptarget][WIP] Introduce VGPU Plugin
atmnpatel updated this revision to Diff 401112.
atmnpatel marked 9 inline comments as done.
atmnpatel added a comment.

Addressed comments

Files:
  clang/lib/Basic/TargetInfo.cpp
  clang/lib/Basic/Targets/X86.h
  clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/Driver/ToolChains/Gnu.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  llvm/include/llvm/ADT/Triple.h
  llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h
  llvm/lib/Support/Triple.cpp
  openmp/CMakeLists.txt
  openmp/libomptarget/DeviceRTL/CMakeLists.txt
  openmp/libomptarget/DeviceRTL/include/ThreadEnvironment.h
  openmp/libomptarget/DeviceRTL/src/Debug.cpp
  openmp/libomptarget/DeviceRTL/src/Mapping.cpp
  openmp/libomptarget/DeviceRTL/src/Misc.cpp
  openmp/libomptarget/DeviceRTL/src/Synchronization.cpp
  openmp/libomptarget/DeviceRTL/src/Utils.cpp
  openmp/libomptarget/plugins/CMakeLists.txt
  openmp/libomptarget/plugins/vgpu/CMakeLists.txt
  openmp/libomptarget/plugins/vgpu/src/ThreadEnvironment.cpp
  openmp/libomptarget/plugins/vgpu/src/ThreadEnvironment.h
  openmp/libomptarget/plugins/vgpu/src/ThreadEnvironmentImpl.cpp
  openmp/libomptarget/plugins/vgpu/src/ThreadEnvironmentImpl.h
  openmp/libomptarget/plugins/vgpu/src/rtl.cpp
  openmp/libomptarget/src/rtl.cpp
  openmp/libomptarget/test/CMakeLists.txt

Index: openmp/libomptarget/test/CMakeLists.txt
===
--- openmp/libomptarget/test/CMakeLists.txt
+++ openmp/libomptarget/test/CMakeLists.txt
@@ -18,6 +18,9 @@
 string(REGEX MATCHALL "([^\ ]+\ |[^\ ]+$)" SYSTEM_TARGETS "${LIBOMPTARGET_SYSTEM_TARGETS}")
 foreach(CURRENT_TARGET IN LISTS SYSTEM_TARGETS)
+  IF ("${CURRENT_TARGET}" MATCHES "x86_64-vgpu")
+    continue()
+  ENDIF()
   string(STRIP "${CURRENT_TARGET}" CURRENT_TARGET)
   add_openmp_testsuite(check-libomptarget-${CURRENT_TARGET}
     "Running libomptarget tests"

Index: openmp/libomptarget/src/rtl.cpp
===
--- openmp/libomptarget/src/rtl.cpp
+++ openmp/libomptarget/src/rtl.cpp
@@ -30,6 +30,7 @@
     /* SX-Aurora VE target  */ "libomptarget.rtl.ve.so",
     /* AMDGPU target        */ "libomptarget.rtl.amdgpu.so",
     /* Remote target        */ "libomptarget.rtl.rpc.so",
+    /* Virtual GPU target   */ "libomptarget.rtl.vgpu.so",
 };
 
 PluginManager *PM;
@@ -73,13 +74,29 @@
     return;
   }
 
+  // TODO: add ability to inspect image and decide automatically
+  bool UseVGPU = false;
+  if (auto *EnvFlag = std::getenv("LIBOMPTARGET_USE_VGPU"))
+    UseVGPU = true;
+
   DP("Loading RTLs...\n");
 
   // Attempt to open all the plugins and, if they exist, check if the interface
   // is correct and if they are supporting any devices.
   for (auto *Name : RTLNames) {
     DP("Loading library '%s'...\n", Name);
-    void *dynlib_handle = dlopen(Name, RTLD_NOW);
+
+    int Flags = RTLD_NOW;
+
+    if (strcmp(Name, "libomptarget.rtl.vgpu.so") == 0)
+      Flags |= RTLD_GLOBAL;
+
+    if (UseVGPU && (strcmp(Name, "libomptarget.rtl.x86_64.so") == 0)) {
+      DP("Skipping library '%s': VGPU was requested.\n", Name);
+      continue;
+    }
+
+    void *dynlib_handle = dlopen(Name, Flags);
     if (!dynlib_handle) {
       // Library does not exist or cannot be found.

Index: openmp/libomptarget/plugins/vgpu/src/rtl.cpp
===
--- /dev/null
+++ openmp/libomptarget/plugins/vgpu/src/rtl.cpp
@@ -0,0 +1,615 @@
+//===--RTLs/vgpu/src/rtl.cpp - Target RTLs Implementation - C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// RTL for virtual (x86) GPU
+//
+//===--===//
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "Debug.h"
+#include "ThreadEnvironment.h"
+#include "ThreadEnvironmentImpl.h"
+#include "omptarget.h"
+#include "omptargetplugin.h"
+
+#ifndef TARGET_NAME
+#define TARGET_NAME Generic ELF - 64bit
+#endif
+#define DEBUG_PREFIX "TARGET " GETNAME(TARGET_NAME) " RTL"
+
+#ifndef TARGET_ELF_ID
+#define TARGET_ELF_ID 0
+#endif
+
+#include "elf_common.h"
+
+#define OFFLOADSECTIONNAME "omp_offloading_entries"
+
+#define DEBUG false
+
+struct FFICallTy {
+  ffi_cif CIF;
+  std::vector ArgsTypes;
+  std::vector Args;
+  std::vector Ptrs;
+  void (*Entry)(void);
+
+  FFICallTy(int32_t ArgNum, void **TgtArgs, ptrdiff_t *TgtOffsets,
+            void
[PATCH] D113359: [Libomptarget][WIP] Introduce VGPU Plugin
jdoerfert added inline comments.

Comment at: openmp/libomptarget/DeviceRTL/src/Kernel.cpp:127
+#pragma omp begin declare variant match( \
+    device = {kind(cpu)}, implementation = {extension(match_any)})

tianshilei1992 wrote:
> Is this code here unintentional? We don't need to specialize this function
> for vgpu IIRC.
We might be able to avoid it if we move the synchronize::threads "effect" into the VGPU instead.
[PATCH] D113359: [Libomptarget][WIP] Introduce VGPU Plugin
tianshilei1992 added inline comments.

Comment at: openmp/libomptarget/DeviceRTL/CMakeLists.txt:231
+
+compileDeviceRTLLibrary(x86_64 vgpu -target x86_64-vgpu -std=c++20 -stdlib=libc++ -I${devicertl_base_directory}/../plugins/vgpu/src)

It's not a good practice to specify include directories in CMake in this way. Use `include_directories` instead.

Comment at: openmp/libomptarget/DeviceRTL/src/Kernel.cpp:127
+#pragma omp begin declare variant match( \
+    device = {kind(cpu)}, implementation = {extension(match_any)})

Is this code here unintentional? We don't need to specialize this function for vgpu IIRC.
[PATCH] D113359: [Libomptarget][WIP] Introduce VGPU Plugin
jdoerfert added inline comments.

Comment at: llvm/lib/Support/Triple.cpp:512
+    .Case("oe", Triple::OpenEmbedded)
+    .Case("vgpu", Triple::OpenMP_VGPU)
+    .Default(Triple::UnknownVendor);

Comment at: openmp/libomptarget/DeviceRTL/src/Debug.cpp:53
+#pragma omp begin declare variant match( \
+    device = {kind(cpu)}, implementation = {extension(match_any)})
+int32_t vprintf(const char *, void *);

Comment at: openmp/libomptarget/DeviceRTL/src/Kernel.cpp:128
+#pragma omp begin declare variant match( \
+    device = {kind(cpu)}, implementation = {extension(match_any)})
+void __kmpc_target_deinit(IdentTy *Ident, int8_t Mode, bool) {

Comment at: openmp/libomptarget/DeviceRTL/src/Mapping.cpp:28
+#pragma omp begin declare variant match( \
+    device = {kind(cpu)}, implementation = {extension(match_any)})
+

Comment at: openmp/libomptarget/DeviceRTL/src/Synchronization.cpp:290
+#pragma omp begin declare variant match( \
+    device = {kind(cpu)}, implementation = {extension(match_any)})
+

Comment at: openmp/libomptarget/DeviceRTL/src/Synchronization.cpp:314
+// Simply call fenceKernel because there is no need to sync with host
+void fenceSystem(int) { fenceKernel(0); }
+

Pass the memory order, and also rename the arguments to match the coding convention.

Comment at: openmp/libomptarget/DeviceRTL/src/Synchronization.cpp:317
+void syncWarp(__kmpc_impl_lanemask_t Mask) {
+  getThreadEnvironment()->syncWarp();
+}

Pass the mask.

Comment at: openmp/libomptarget/DeviceRTL/src/Utils.cpp:56
+#pragma omp begin declare variant match( \
+    device = {kind(cpu)}, implementation = {extension(match_any)})
+

Comment at: openmp/libomptarget/DeviceRTL/src/Utils.cpp:68
+
+#pragma omp end declare variant
+

Can't we merge this with AMDGPU?

Comment at: openmp/libomptarget/DeviceRTL/src/Utils.cpp:138
+#pragma omp begin declare variant match( \
+    device = {kind(cpu)}, implementation = {extension(match_any)})
+

Comment at: openmp/libomptarget/plugins/vgpu/src/rtl.cpp:303
+        TeamIdx += NumCTAs;
+      }
+

Can we split this up and create some helper functions maybe?

Comment at: openmp/libomptarget/src/rtl.cpp:34
+    /* Virtual GPU target   */ "libomptarget.rtl.vgpu.so",
 };

Introduce an environment variable; if it is set, the X86 target should skip the image. Also, add a TODO so that we later look into the image and inspect it to decide automatically.

Comment at: openmp/libomptarget/test/lit.cfg:189
         config.substitutions.append(("%libomptarget-compile-and-run-" + \
             libomptarget_target, \
             "echo ignored-command"))

Leftovers.
[PATCH] D113359: [Libomptarget][WIP] Introduce VGPU Plugin
atmnpatel updated this revision to Diff 398370.
atmnpatel added a comment.

- Fixed lifetime issue around ffi_call
- Addressed comments

The existing x86 plugin uses ffi, so this does as well, no explicit benefit in doing so. Is it worth keeping?

Files:
  clang/lib/Basic/TargetInfo.cpp
  clang/lib/Basic/Targets/X86.h
  clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/Driver/ToolChains/Gnu.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  llvm/include/llvm/ADT/Triple.h
  llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h
  llvm/lib/Support/Triple.cpp
  openmp/CMakeLists.txt
  openmp/libomptarget/DeviceRTL/CMakeLists.txt
  openmp/libomptarget/DeviceRTL/include/ThreadEnvironment.h
  openmp/libomptarget/DeviceRTL/src/Debug.cpp
  openmp/libomptarget/DeviceRTL/src/Kernel.cpp
  openmp/libomptarget/DeviceRTL/src/Mapping.cpp
  openmp/libomptarget/DeviceRTL/src/Misc.cpp
  openmp/libomptarget/DeviceRTL/src/Synchronization.cpp
  openmp/libomptarget/DeviceRTL/src/Utils.cpp
  openmp/libomptarget/plugins/CMakeLists.txt
  openmp/libomptarget/plugins/vgpu/CMakeLists.txt
  openmp/libomptarget/plugins/vgpu/src/ThreadEnvironment.cpp
  openmp/libomptarget/plugins/vgpu/src/ThreadEnvironment.h
  openmp/libomptarget/plugins/vgpu/src/ThreadEnvironmentImpl.h
  openmp/libomptarget/plugins/vgpu/src/rtl.cpp
  openmp/libomptarget/src/rtl.cpp
  openmp/libomptarget/test/lit.cfg

Index: openmp/libomptarget/test/lit.cfg
===
--- openmp/libomptarget/test/lit.cfg
+++ openmp/libomptarget/test/lit.cfg
@@ -114,9 +114,11 @@
 
 # Scan all the valid targets.
 for libomptarget_target in config.libomptarget_all_targets:
+    print("Checking {}".format(libomptarget_target))
     # Is this target in the current system? If so create a compile, run and test
     # command. Otherwise create command that return false.
     if libomptarget_target == config.libomptarget_current_target:
+        print("First")
         config.substitutions.append(("%libomptarget-compilexx-run-and-check-generic",
             "%libomptarget-compilexx-run-and-check-" + libomptarget_target))
         config.substitutions.append(("%libomptarget-compile-run-and-check-generic",
@@ -176,6 +178,7 @@
         config.substitutions.append(("%fcheck-" + libomptarget_target, \
             config.libomptarget_filecheck + " %s"))
     else:
+        print("Second")
         config.substitutions.append(("%libomptarget-compile-run-and-check-" + \
             libomptarget_target, \
             "echo ignored-command"))

Index: openmp/libomptarget/src/rtl.cpp
===
--- openmp/libomptarget/src/rtl.cpp
+++ openmp/libomptarget/src/rtl.cpp
@@ -24,12 +24,13 @@
 // List of all plugins that can support offloading.
 static const char *RTLNames[] = {
     /* PowerPC target       */ "libomptarget.rtl.ppc64.so",
-    /* x86_64 target        */ "libomptarget.rtl.x86_64.so",
+    /* x86_64 target        "libomptarget.rtl.x86_64.so", */
     /* CUDA target          */ "libomptarget.rtl.cuda.so",
     /* AArch64 target       */ "libomptarget.rtl.aarch64.so",
     /* SX-Aurora VE target  */ "libomptarget.rtl.ve.so",
     /* AMDGPU target        */ "libomptarget.rtl.amdgpu.so",
     /* Remote target        */ "libomptarget.rtl.rpc.so",
+    /* Virtual GPU target   */ "libomptarget.rtl.vgpu.so",
 };
 
 PluginManager *PM;
@@ -79,7 +80,13 @@
   // is correct and if they are supporting any devices.
   for (auto *Name : RTLNames) {
     DP("Loading library '%s'...\n", Name);
-    void *dynlib_handle = dlopen(Name, RTLD_NOW);
+
+    int Flags = RTLD_NOW;
+
+    if (strcmp(Name, "libomptarget.rtl.vgpu.so") == 0)
+      Flags |= RTLD_GLOBAL;
+
+    void *dynlib_handle = dlopen(Name, Flags);
     if (!dynlib_handle) {
       // Library does not exist or cannot be found.

Index: openmp/libomptarget/plugins/vgpu/src/rtl.cpp
===
--- /dev/null
+++ openmp/libomptarget/plugins/vgpu/src/rtl.cpp
@@ -0,0 +1,609 @@
+//===--RTLs/vgpu/src/rtl.cpp - Target RTLs Implementation - C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// RTL for virtual (x86) GPU
+//
+//===--===//
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "Debug.h"
+#include "ThreadEnvironment.h"
[PATCH] D113359: [Libomptarget][WIP] Introduce VGPU Plugin
JonChesterfield added a comment.

I can't see it in the diff - does the cmake somewhere enable the existing tests on this new target?

A bit surprised to see ffi involved, are we thinking of spawning a separate process for the target?

Comment at: clang/lib/Basic/Targets/X86.h:49
+static const unsigned X86VGPUAddrSpaceMap[] = {
+    0, // Default

It's not clear to me why this is x86 specific. Being able to run our tests on power / arm etc seems like an advantage. It would also mean we avoid adding openmp stuff to the x86 specific files. Maybe OpenMPVGPUAddrSpaceMap, and put it in one of the openmp source files?

Comment at: clang/lib/Frontend/CompilerInvocation.cpp:3988
+      (T.isNVPTX() || T.isAMDGCN() ||
+       T.getVendor() == llvm::Triple::OpenMP_VGPU) &&
       Args.hasArg(options::OPT_fopenmp_cuda_mode);

Add an isOpenmpVGPU function?

Comment at: openmp/libomptarget/DeviceRTL/CMakeLists.txt:135
   -I${devicertl_base_directory}/../include
+  -I${devicertl_base_directory}/../plugins/vgpu/src
   ${LIBOMPTARGET_LLVM_INCLUDE_DIRS_DEVICERTL}

Should only add this include to the vgpu, not all the plugins. May be able to use relative include paths to drop it entirely.
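
The predicate requested on CompilerInvocation.cpp:3988 (and later asked for as `isOpenMPVGPU` at several call sites) would collect the repeated vendor comparison in one place. The sketch below shows the idea on a toy class, under stated assumptions: `TripleSketch` is a hypothetical, heavily reduced stand-in for `llvm::Triple`, and `usesGPUCodegen` is an invented call site that only imitates the shape of the condition quoted above.

```cpp
// Hypothetical minimal stand-in for llvm::Triple, reduced to the vendor
// field. The real class has many more members; this is not its definition.
class TripleSketch {
public:
  enum VendorType { UnknownVendor, OpenMP_VGPU };

  explicit TripleSketch(VendorType V) : Vendor(V) {}

  VendorType getVendor() const { return Vendor; }

  // The requested predicate: one place that knows how a VGPU triple is
  // recognized, so call sites stop comparing vendors by hand.
  bool isOpenMPVGPU() const { return getVendor() == OpenMP_VGPU; }

private:
  VendorType Vendor;
};

// Invented call site in the style of the CompilerInvocation.cpp condition;
// IsNVPTX and IsAMDGCN stand in for T.isNVPTX() and T.isAMDGCN().
bool usesGPUCodegen(const TripleSketch &T, bool IsNVPTX, bool IsAMDGCN) {
  return IsNVPTX || IsAMDGCN || T.isOpenMPVGPU();
}
```

If the way a VGPU triple is encoded changes later (for example from the vendor field to another component), only the predicate body would need to change.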
[PATCH] D113359: [Libomptarget][WIP] Introduce VGPU Plugin
tianshilei1992 added inline comments.

Comment at: clang/lib/Driver/ToolChains/Gnu.cpp:3082
+  if (getTriple().getVendor() == llvm::Triple::OpenMP_VGPU) {
+    std::string BitcodeSuffix = "x86_64-vgpu";
+    clang::driver::tools::addOpenMPDeviceRTL(getDriver(), DriverArgs, CC1Args,

Maybe `"x86_64-openmp_vpu"` now?

Comment at: llvm/lib/Support/Triple.cpp:189
+  case OpenMP_VGPU:
+    return "vgpu";
   }

`"openmp_vpu"`?
[PATCH] D113359: [Libomptarget][WIP] Introduce VGPU Plugin
atmnpatel updated this revision to Diff 386426. atmnpatel added a comment. small nit fix Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D113359/new/ https://reviews.llvm.org/D113359 Files: clang/lib/Basic/Targets/X86.h clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp clang/lib/CodeGen/CodeGenModule.cpp clang/lib/Driver/ToolChains/Gnu.cpp clang/lib/Frontend/CompilerInvocation.cpp llvm/include/llvm/ADT/Triple.h llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h llvm/lib/Support/Triple.cpp openmp/CMakeLists.txt openmp/libomptarget/DeviceRTL/CMakeLists.txt openmp/libomptarget/DeviceRTL/src/Debug.cpp openmp/libomptarget/DeviceRTL/src/Kernel.cpp openmp/libomptarget/DeviceRTL/src/Mapping.cpp openmp/libomptarget/DeviceRTL/src/Misc.cpp openmp/libomptarget/DeviceRTL/src/Synchronization.cpp openmp/libomptarget/DeviceRTL/src/Utils.cpp openmp/libomptarget/plugins/CMakeLists.txt openmp/libomptarget/plugins/vgpu/CMakeLists.txt openmp/libomptarget/plugins/vgpu/src/ThreadEnvironment.cpp openmp/libomptarget/plugins/vgpu/src/ThreadEnvironment.h openmp/libomptarget/plugins/vgpu/src/ThreadEnvironmentImpl.h openmp/libomptarget/plugins/vgpu/src/rtl.cpp openmp/libomptarget/src/rtl.cpp Index: openmp/libomptarget/src/rtl.cpp === --- openmp/libomptarget/src/rtl.cpp +++ openmp/libomptarget/src/rtl.cpp @@ -30,6 +30,7 @@ /* SX-Aurora VE target */ "libomptarget.rtl.ve.so", /* AMDGPU target*/ "libomptarget.rtl.amdgpu.so", /* Remote target*/ "libomptarget.rtl.rpc.so", +/* Virtual GPU target */ "libomptarget.rtl.vgpu.so", }; PluginManager *PM; @@ -79,7 +80,13 @@ // is correct and if they are supporting any devices. for (auto *Name : RTLNames) { DP("Loading library '%s'...\n", Name); -void *dynlib_handle = dlopen(Name, RTLD_NOW); + +int Flags = RTLD_NOW; + +if (strcmp(Name, "libomptarget.rtl.vgpu.so") == 0) + Flags |= RTLD_GLOBAL; + +void *dynlib_handle = dlopen(Name, Flags); if (!dynlib_handle) { // Library does not exist or cannot be found. 
Index: openmp/libomptarget/plugins/vgpu/src/rtl.cpp === --- /dev/null +++ openmp/libomptarget/plugins/vgpu/src/rtl.cpp @@ -0,0 +1,623 @@ +//===--RTLs/vgpu/src/rtl.cpp - Target RTLs Implementation - C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// RTL for virtual (x86) GPU +// +//===--===// + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "Debug.h" +#include "ThreadEnvironment.h" +#include "ThreadEnvironmentImpl.h" +#include "omptarget.h" +#include "omptargetplugin.h" + +#ifndef TARGET_NAME +#define TARGET_NAME Generic ELF - 64bit +#endif +#define DEBUG_PREFIX "TARGET " GETNAME(TARGET_NAME) " RTL" + +#ifndef TARGET_ELF_ID +#define TARGET_ELF_ID 0 +#endif + +#include "elf_common.h" + +#define NUMBER_OF_DEVICES 1 +#define OFFLOADSECTIONNAME "omp_offloading_entries" + +#define DEBUG false + +/// Array of Dynamic libraries loaded for this target. +struct DynLibTy { + char *FileName; + void *Handle; +}; + +/// Keep entries table per device. +struct FuncOrGblEntryTy { + __tgt_target_table Table; +}; + +thread_local ThreadEnvironmentTy *ThreadEnvironment; + +/// Class containing all the device information. +class RTLDeviceInfoTy { + std::vector> FuncGblEntries; + +public: + std::list DynLibs; + + // Record entry point associated with device. 
+  void createOffloadTable(int32_t device_id, __tgt_offload_entry *begin,
+                          __tgt_offload_entry *end) {
+    assert(device_id < (int32_t)FuncGblEntries.size() &&
+           "Unexpected device id!");
+    FuncGblEntries[device_id].emplace_back();
+    FuncOrGblEntryTy &E = FuncGblEntries[device_id].back();
+
+    E.Table.EntriesBegin = begin;
+    E.Table.EntriesEnd = end;
+  }
+
+  // Return true if the entry is associated with device.
+  bool findOffloadEntry(int32_t device_id, void *addr) {
+    assert(device_id < (int32_t)FuncGblEntries.size() &&
+           "Unexpected device id!");
+    FuncOrGblEntryTy &E = FuncGblEntries[device_id].back();
+
+    for (__tgt_offload_entry *i = E.Table.EntriesBegin, *e = E.Table.EntriesEnd;
+         i < e; ++i) {
+      if (i->addr == addr)
+        return true;
+    }
+
+    return false;
+  }
+
+  // Return the pointer to the target entries table.
+  __tgt_target_table *getOffloadEntriesTable(int32_t
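
The loader hunk in Diff 386426 passes RTLD_GLOBAL only when opening the VGPU plugin and leaves every other plugin on plain RTLD_NOW. A minimal sketch of that flag selection follows. The helper name `pluginOpenFlags` is hypothetical and not part of the patch, and the patch does not state why the flag is needed. One plausible reading is that libraries loaded later have to resolve symbols the plugin exports, which is what RTLD_GLOBAL permits.

```cpp
#include <cassert>
#include <cstring>
#include <dlfcn.h>

// Hypothetical helper (not in the patch): choose the dlopen flags for a plugin.
// Every plugin is bound eagerly with RTLD_NOW. Only the VGPU plugin also gets
// RTLD_GLOBAL, which makes its symbols available for symbol resolution of
// libraries that are loaded afterwards.
int pluginOpenFlags(const char *Name) {
  int Flags = RTLD_NOW;
  if (std::strcmp(Name, "libomptarget.rtl.vgpu.so") == 0)
    Flags |= RTLD_GLOBAL;
  return Flags;
}

// Open a plugin with the flags selected above. Returns nullptr when the
// library does not exist or cannot be loaded, mirroring the loader's check.
void *openPlugin(const char *Name) {
  return dlopen(Name, pluginOpenFlags(Name));
}
```

Keeping the choice in one small function limits the wider symbol visibility to the single plugin that needs it, whereas the earlier revision (Diff 386425) applied RTLD_GLOBAL to every plugin.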
[PATCH] D113359: [Libomptarget][WIP] Introduce VGPU Plugin
atmnpatel updated this revision to Diff 386425.
atmnpatel added a comment.

I removed the shared var opt - might be best to keep this in a separate patch @tianshilei1992. Also addressed comments.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113359/new/

https://reviews.llvm.org/D113359

Files:
  clang/lib/Basic/Targets/X86.h
  clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/Driver/ToolChains/Gnu.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  llvm/include/llvm/ADT/Triple.h
  llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h
  llvm/lib/Support/Triple.cpp
  openmp/CMakeLists.txt
  openmp/libomptarget/DeviceRTL/CMakeLists.txt
  openmp/libomptarget/DeviceRTL/src/Debug.cpp
  openmp/libomptarget/DeviceRTL/src/Kernel.cpp
  openmp/libomptarget/DeviceRTL/src/Mapping.cpp
  openmp/libomptarget/DeviceRTL/src/Misc.cpp
  openmp/libomptarget/DeviceRTL/src/Synchronization.cpp
  openmp/libomptarget/DeviceRTL/src/Utils.cpp
  openmp/libomptarget/plugins/CMakeLists.txt
  openmp/libomptarget/plugins/vgpu/CMakeLists.txt
  openmp/libomptarget/plugins/vgpu/src/ThreadEnvironment.cpp
  openmp/libomptarget/plugins/vgpu/src/ThreadEnvironment.h
  openmp/libomptarget/plugins/vgpu/src/ThreadEnvironmentImpl.h
  openmp/libomptarget/plugins/vgpu/src/rtl.cpp
  openmp/libomptarget/src/rtl.cpp

Index: openmp/libomptarget/src/rtl.cpp
===================================================================
--- openmp/libomptarget/src/rtl.cpp
+++ openmp/libomptarget/src/rtl.cpp
@@ -30,6 +30,7 @@
     /* SX-Aurora VE target  */ "libomptarget.rtl.ve.so",
     /* AMDGPU target        */ "libomptarget.rtl.amdgpu.so",
     /* Remote target        */ "libomptarget.rtl.rpc.so",
+    /* Virtual GPU target   */ "libomptarget.rtl.vgpu.so",
 };
 
 PluginManager *PM;
@@ -79,7 +80,7 @@
   // is correct and if they are supporting any devices.
   for (auto *Name : RTLNames) {
     DP("Loading library '%s'...\n", Name);
-    void *dynlib_handle = dlopen(Name, RTLD_NOW);
+    void *dynlib_handle = dlopen(Name, RTLD_NOW | RTLD_GLOBAL);
     if (!dynlib_handle) {
       // Library does not exist or cannot be found.
Index: openmp/libomptarget/plugins/vgpu/src/rtl.cpp
===================================================================
--- /dev/null
+++ openmp/libomptarget/plugins/vgpu/src/rtl.cpp
@@ -0,0 +1,623 @@
+//===----RTLs/vgpu/src/rtl.cpp - Target RTLs Implementation ------- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// RTL for virtual (x86) GPU
+//
+//===----------------------------------------------------------------------===//
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "Debug.h"
+#include "ThreadEnvironment.h"
+#include "ThreadEnvironmentImpl.h"
+#include "omptarget.h"
+#include "omptargetplugin.h"
+
+#ifndef TARGET_NAME
+#define TARGET_NAME Generic ELF - 64bit
+#endif
+#define DEBUG_PREFIX "TARGET " GETNAME(TARGET_NAME) " RTL"
+
+#ifndef TARGET_ELF_ID
+#define TARGET_ELF_ID 0
+#endif
+
+#include "elf_common.h"
+
+#define NUMBER_OF_DEVICES 1
+#define OFFLOADSECTIONNAME "omp_offloading_entries"
+
+#define DEBUG false
+
+/// Array of Dynamic libraries loaded for this target.
+struct DynLibTy {
+  char *FileName;
+  void *Handle;
+};
+
+/// Keep entries table per device.
+struct FuncOrGblEntryTy {
+  __tgt_target_table Table;
+};
+
+thread_local ThreadEnvironmentTy *ThreadEnvironment;
+
+/// Class containing all the device information.
+class RTLDeviceInfoTy {
+  std::vector<std::list<FuncOrGblEntryTy>> FuncGblEntries;
+
+public:
+  std::list<DynLibTy> DynLibs;
+
+  // Record entry point associated with device.
+  void createOffloadTable(int32_t device_id, __tgt_offload_entry *begin,
+                          __tgt_offload_entry *end) {
+    assert(device_id < (int32_t)FuncGblEntries.size() &&
+           "Unexpected device id!");
+    FuncGblEntries[device_id].emplace_back();
+    FuncOrGblEntryTy &E = FuncGblEntries[device_id].back();
+
+    E.Table.EntriesBegin = begin;
+    E.Table.EntriesEnd = end;
+  }
+
+  // Return true if the entry is associated with device.
+  bool findOffloadEntry(int32_t device_id, void *addr) {
+    assert(device_id < (int32_t)FuncGblEntries.size() &&
+           "Unexpected device id!");
+    FuncOrGblEntryTy &E = FuncGblEntries[device_id].back();
+
+    for (__tgt_offload_entry *i = E.Table.EntriesBegin, *e = E.Table.EntriesEnd;
+         i < e; ++i) {
+      if (i->addr == addr)
+        return true;
+    }
+
+    return false;
+  }
+
+  // Return the pointer to the target entries table.
+  __tgt_target_table *getOffloadEntriesTable(int32_t
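
The per-device bookkeeping in `createOffloadTable` and `findOffloadEntry` can be tried in isolation. The sketch below is a reduced stand-in, not the plugin's code: `OffloadEntry`, `TargetTable`, and `DeviceTables` are hypothetical simplifications of `__tgt_offload_entry`, `__tgt_target_table`, and `RTLDeviceInfoTy` that model only the fields used here. As in the diff, a lookup searches only the most recently registered table for the device. Unlike the diff, the sketch also returns false when no table has been registered yet, since calling `back()` on an empty list is undefined.

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <vector>

// Simplified stand-ins for the libomptarget types (assumptions for this sketch).
struct OffloadEntry {
  void *addr;
};

struct TargetTable {
  OffloadEntry *EntriesBegin = nullptr;
  OffloadEntry *EntriesEnd = nullptr;
};

// One list of entry tables per device, mirroring the vector-of-lists layout.
class DeviceTables {
  std::vector<std::list<TargetTable>> Tables;

public:
  explicit DeviceTables(int32_t NumDevices) : Tables(NumDevices) {}

  // Record the half-open entry range [Begin, End) for a device.
  void createOffloadTable(int32_t DeviceId, OffloadEntry *Begin,
                          OffloadEntry *End) {
    assert(DeviceId >= 0 && DeviceId < (int32_t)Tables.size() &&
           "Unexpected device id!");
    Tables[DeviceId].emplace_back();
    TargetTable &T = Tables[DeviceId].back();
    T.EntriesBegin = Begin;
    T.EntriesEnd = End;
  }

  // Return true if Addr belongs to the latest table registered for the device.
  bool findOffloadEntry(int32_t DeviceId, void *Addr) const {
    assert(DeviceId >= 0 && DeviceId < (int32_t)Tables.size() &&
           "Unexpected device id!");
    if (Tables[DeviceId].empty())
      return false; // Guard added in this sketch; the diff has no such check.
    const TargetTable &T = Tables[DeviceId].back();
    for (OffloadEntry *I = T.EntriesBegin; I < T.EntriesEnd; ++I)
      if (I->addr == Addr)
        return true;
    return false;
  }
};
```

A linear scan is adequate for a table with a handful of entries. Whether the plugin relies on the lookup seeing only the latest table is not stated in the patch.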
[PATCH] D113359: [Libomptarget][WIP] Introduce VGPU Plugin
jdoerfert added inline comments.

Comment at: clang/lib/CodeGen/CGOpenMPRuntimeVirtualGPU.cpp:54
+  CGOpenMPRuntime::createOffloadEntry(ID, Addr, Size, Flags, Linkage);
+}

We should be able to get rid of this file (and the cuda/hip) version. Might be the right time now as a precommit.

Comment at: llvm/include/llvm/ADT/Triple.h:166
+    VGPU,
+    LastVendorType = VGPU
   };

Let's call it OpenMP_VGPU or something like that to make it clear.

Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:2177
+}
+
 /// Abstract Attribute for tracking ICV values.

@tianshilei1992 This needs a test.

Comment at: openmp/libomptarget/DeviceRTL/src/Kernel.cpp:107
+  synchronize::threads();
   // Signal the workers to exit the state machine and exit the kernel.

I don't think we should do this. Instead, the plugin should signal as threads finish the kernel.

Comment at: openmp/libomptarget/DeviceRTL/src/Mapping.cpp:171
+#pragma omp begin declare variant match(                                       \
+    device = {arch(x86, x86_64)}, implementation = {extension(match_any)})
+

We probably should use kind(CPU) or something instead. Nothing x86 specific about it I think.

Comment at: openmp/libomptarget/include/DeviceEnvironment.h:83
+
+ThreadEnvironmentTy *getThreadEnvironment(void);
+

This should go into a new file (ThreadEnvironment)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113359/new/

https://reviews.llvm.org/D113359
[PATCH] D113359: [Libomptarget][WIP] Introduce VGPU Plugin
atmnpatel created this revision.
atmnpatel added reviewers: jdoerfert, tianshilei1992, JonChesterfield.
Herald added subscribers: ormris, dexonsmith, pengfei, hiraditya, mgorny.
atmnpatel requested review of this revision.
Herald added subscribers: llvm-commits, openmp-commits, cfe-commits, sstefan1.
Herald added projects: clang, OpenMP, LLVM.

This patch introduces a virtual GPU (x86) plugin, which emulates the GPU environment on the host. It reuses the same execution model, compilation paths, and runtimes as a physical GPU. The number of threads, warps, and CTAs are set through the environment variables `VGPU_{NUM_THREADS,NUM_WARPS,WARPS_PER_CTA}` respectively.

Known Bugs (hence WIP):
- In the rebase from LLVM 12, larger applications started segfaulting. Small programs still work with this patch.
- The virtual GPU should be able to execute kernels asynchronously using streams, but an unknown lifetime issue around the `ffi_call` prevents the removal of the await after the `scheduleAsync` call.
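
The three environment variables named in the description can be read with the standard C library. The sketch below is an assumption about how such a configuration could be parsed, not the plugin's actual code: `readPositiveEnv`, `VGPUConfig`, `readVGPUConfig`, the fallback value of 1, and the upper bound are all hypothetical. Only the variable names come from the description.

```cpp
#include <cassert>
#include <cstdlib>
#include <stdlib.h> // setenv/unsetenv (POSIX), used when experimenting

// Hypothetical helper: parse a positive integer from the environment.
// Returns Default when the variable is unset, empty, not a number,
// non-positive, or larger than an arbitrary sanity bound.
int readPositiveEnv(const char *Name, int Default) {
  const char *Value = std::getenv(Name);
  if (!Value || *Value == '\0')
    return Default;
  char *End = nullptr;
  long Parsed = std::strtol(Value, &End, 10);
  if (*End != '\0' || Parsed <= 0 || Parsed > (1L << 20))
    return Default;
  return static_cast<int>(Parsed);
}

// Hypothetical container for the three settings from the description.
struct VGPUConfig {
  int NumThreads;
  int NumWarps;
  int WarpsPerCTA;
};

// Read all three variables; the default of 1 is an assumption of this sketch.
VGPUConfig readVGPUConfig() {
  return {readPositiveEnv("VGPU_NUM_THREADS", 1),
          readPositiveEnv("VGPU_NUM_WARPS", 1),
          readPositiveEnv("VGPU_WARPS_PER_CTA", 1)};
}
```

Falling back to a default on malformed input keeps a typo in the environment from aborting the run. A real plugin might prefer to report the bad value instead.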
Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D113359

Files:
  clang/lib/Basic/Targets/X86.h
  clang/lib/CodeGen/CGOpenMPRuntimeVirtualGPU.cpp
  clang/lib/CodeGen/CGOpenMPRuntimeVirtualGPU.h
  clang/lib/CodeGen/CMakeLists.txt
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/Driver/ToolChains/Gnu.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  llvm/include/llvm/ADT/Triple.h
  llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h
  llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
  llvm/lib/Support/Triple.cpp
  llvm/lib/Transforms/IPO/OpenMPOpt.cpp
  llvm/utils/gn/secondary/clang/lib/CodeGen/BUILD.gn
  openmp/CMakeLists.txt
  openmp/libomptarget/DeviceRTL/CMakeLists.txt
  openmp/libomptarget/DeviceRTL/include/Interface.h
  openmp/libomptarget/DeviceRTL/src/Kernel.cpp
  openmp/libomptarget/DeviceRTL/src/Mapping.cpp
  openmp/libomptarget/DeviceRTL/src/Misc.cpp
  openmp/libomptarget/DeviceRTL/src/Synchronization.cpp
  openmp/libomptarget/DeviceRTL/src/Utils.cpp
  openmp/libomptarget/include/DeviceEnvironment.h
  openmp/libomptarget/plugins/CMakeLists.txt
  openmp/libomptarget/plugins/vgpu/CMakeLists.txt
  openmp/libomptarget/plugins/vgpu/src/DeviceEnvironment.cpp
  openmp/libomptarget/plugins/vgpu/src/DeviceEnvironmentImpl.h
  openmp/libomptarget/plugins/vgpu/src/rtl.cpp
  openmp/libomptarget/src/rtl.cpp

Index: openmp/libomptarget/src/rtl.cpp
===================================================================
--- openmp/libomptarget/src/rtl.cpp
+++ openmp/libomptarget/src/rtl.cpp
@@ -34,6 +34,7 @@
     /* SX-Aurora VE target  */ "libomptarget.rtl.ve.so",
     /* AMDGPU target        */ "libomptarget.rtl.amdgpu.so",
     /* Remote target        */ "libomptarget.rtl.rpc.so",
+    /* Virtual GPU target   */ "libomptarget.rtl.vgpu.so",
 };
 
 PluginManager *PM;
@@ -83,7 +84,7 @@
   // is correct and if they are supporting any devices.
   for (auto *Name : RTLNames) {
     DP("Loading library '%s'...\n", Name);
-    void *dynlib_handle = dlopen(Name, RTLD_NOW);
+    void *dynlib_handle = dlopen(Name, RTLD_NOW | RTLD_GLOBAL);
     if (!dynlib_handle) {
       // Library does not exist or cannot be found.
Index: openmp/libomptarget/plugins/vgpu/src/rtl.cpp
===================================================================
--- /dev/null
+++ openmp/libomptarget/plugins/vgpu/src/rtl.cpp
@@ -0,0 +1,623 @@
+//===----RTLs/vgpu/src/rtl.cpp - Target RTLs Implementation ------- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// RTL for virtual (x86) GPU
+//
+//===----------------------------------------------------------------------===//
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "Debug.h"
+#include "DeviceEnvironment.h"
+#include "DeviceEnvironmentImpl.h"
+#include "omptarget.h"
+#include "omptargetplugin.h"
+
+#ifndef TARGET_NAME
+#define TARGET_NAME Generic ELF - 64bit
+#endif
+#define DEBUG_PREFIX "TARGET " GETNAME(TARGET_NAME) " RTL"
+
+#ifndef TARGET_ELF_ID
+#define TARGET_ELF_ID 0
+#endif
+
+#include "elf_common.h"
+
+#define NUMBER_OF_DEVICES 1
+#define OFFLOADSECTIONNAME "omp_offloading_entries"
+
+#define DEBUG false
+
+/// Array of Dynamic libraries loaded for this target.
+struct DynLibTy {
+  char *FileName;
+  void *Handle;
+};
+
+/// Keep entries table per device.
+struct FuncOrGblEntryTy {
+  __tgt_target_table Table;
+};
+
+thread_local ThreadEnvironmentTy *ThreadEnvironment;
+
+/// Class containing all the device information.
+class RTLDeviceInfoTy {
+  std::vector<std::list<FuncOrGblEntryTy>> FuncGblEntries;
+
+public:
+  std::list<DynLibTy> DynLibs;
+
+  // Record