Re: r315996 - [CMake][OpenMP] Customize default offloading arch
On 2017-12-07 20:34, Jonas Hahnfeld via cfe-commits wrote:
> Hi Ahmed,
>
> On 2017-12-07 19:57, Ahmed Bougacha wrote:
>> Hi Jonas,
>>
>> On Tue, Oct 17, 2017 at 6:37 AM, Jonas Hahnfeld via cfe-commits wrote:
>>> Author: hahnfeld
>>> Date: Tue Oct 17 06:37:36 2017
>>> New Revision: 315996
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=315996=rev
>>> Log:
>>> [CMake][OpenMP] Customize default offloading arch
>>>
>>> For the shuffle instructions in reductions we need at least sm_30
>>> but the user may want to customize the default architecture.
>>>
>>> Differential Revision: https://reviews.llvm.org/D38883
>>>
>>> Modified:
>>>     cfe/trunk/CMakeLists.txt
>>>     cfe/trunk/include/clang/Config/config.h.cmake
>>>     cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
>>>     cfe/trunk/lib/Driver/ToolChains/Cuda.h
>>>
>>> Modified: cfe/trunk/CMakeLists.txt
>>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/CMakeLists.txt?rev=315996=315995=315996=diff
>>> ==
>>> --- cfe/trunk/CMakeLists.txt (original)
>>> +++ cfe/trunk/CMakeLists.txt Tue Oct 17 06:37:36 2017
>>> @@ -235,6 +235,17 @@ endif()
>>>  set(CLANG_DEFAULT_OPENMP_RUNTIME "libomp" CACHE STRING
>>>    "Default OpenMP runtime used by -fopenmp.")
>>>
>>> +# OpenMP offloading requires at least sm_30 because we use shuffle instructions
>>> +# to generate efficient code for reductions.
>>> +set(CLANG_OPENMP_NVPTX_DEFAULT_ARCH "sm_30" CACHE STRING
>>> +  "Default architecture for OpenMP offloading to Nvidia GPUs.")
>>> +string(REGEX MATCH "^sm_([0-9]+)$" MATCHED_ARCH "${CLANG_OPENMP_NVPTX_DEFAULT_ARCH}")
>>> +if (NOT DEFINED MATCHED_ARCH OR "${CMAKE_MATCH_1}" LESS 30)
>>> +  message(WARNING "Resetting default architecture for OpenMP offloading to Nvidia GPUs to sm_30")
>>
>> This warning is pretty noisy and doesn't affect most people: I don't
>> know what it means but I get it in every cmake run. Can we somehow
>> restrict or disable it?
>
> So the next line used to say
>
>>> +  set(CLANG_OPENMP_NVPTX_DEFAULT_ARCH "sm_30" CACHE STRING
>>> +    "Default architecture for OpenMP offloading to Nvidia GPUs." FORCE)
>
> which should make sure that the cache is updated to a "correct" value
> and you only see the warning once.
>
> That said, we have raised the default to "sm_35" today, so maybe
> something has gone wrong here. Let me check that and come back to you!

Works "correctly" (at least as intended) for me: I get a warning if the
cache has an incorrect value or the user specifies one on the command
line. At that point the cache is updated (set with FORCE) and the warning
isn't printed in future CMake invocations.

I'm using CMake 3.5.2, maybe a newer version behaves differently? In
that case I agree that we should fix this; the warning wasn't meant to
annoy everyone on each reconfiguration!

Jonas
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
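
The intended "warn once, then stay quiet" behaviour described above can be modelled outside CMake. The following is only a sketch under stated assumptions, not code from the commit: `Cache`, `isSupportedArch` and `configure` are hypothetical names, the map stands in for the CMake cache, and the check treats any string that is not `sm_NN` with NN >= 30 as invalid, which is the intent stated in the thread rather than a claim about how a particular CMake version evaluates the `if()`.

```cpp
#include <cassert>
#include <iostream>
#include <map>
#include <regex>
#include <string>

// Hypothetical stand-in for the CMake cache: variable name -> value.
using Cache = std::map<std::string, std::string>;

// Returns true if Arch looks like "sm_NN" with NN >= 30.
bool isSupportedArch(const std::string &Arch) {
  static const std::regex Pattern("^sm_([0-9]+)$");
  std::smatch Match;
  if (!std::regex_match(Arch, Match, Pattern))
    return false;
  const std::string Digits = Match[1].str();
  // Reject absurdly long digit strings before converting to int.
  return Digits.size() <= 6 && std::stoi(Digits) >= 30;
}

// Models one configure run. Returns true if a warning was emitted.
bool configure(Cache &C) {
  const std::string Key = "CLANG_OPENMP_NVPTX_DEFAULT_ARCH";
  // Like set(... CACHE STRING ...) without FORCE: only fills a missing entry.
  C.emplace(Key, "sm_30");
  if (isSupportedArch(C[Key]))
    return false;
  std::cerr << "warning: resetting default architecture to sm_30\n";
  // Like set(... CACHE STRING ... FORCE): overwrites the cached value.
  C[Key] = "sm_30";
  assert(isSupportedArch(C[Key]));
  return true;
}
```

In this model, a cache that starts with `sm_20` produces a warning on the first call to `configure` and none on the second, because the first call rewrites the cached value. A warning on every run would mean that the overwrite is not taking effect or that the check rejects the value it has just written.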
Re: r315996 - [CMake][OpenMP] Customize default offloading arch
Hi Ahmed,

On 2017-12-07 19:57, Ahmed Bougacha wrote:
> Hi Jonas,
>
> On Tue, Oct 17, 2017 at 6:37 AM, Jonas Hahnfeld via cfe-commits wrote:
>> Author: hahnfeld
>> Date: Tue Oct 17 06:37:36 2017
>> New Revision: 315996
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=315996=rev
>> Log:
>> [CMake][OpenMP] Customize default offloading arch
>>
>> For the shuffle instructions in reductions we need at least sm_30
>> but the user may want to customize the default architecture.
>>
>> Differential Revision: https://reviews.llvm.org/D38883
>>
>> Modified:
>>     cfe/trunk/CMakeLists.txt
>>     cfe/trunk/include/clang/Config/config.h.cmake
>>     cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
>>     cfe/trunk/lib/Driver/ToolChains/Cuda.h
>>
>> Modified: cfe/trunk/CMakeLists.txt
>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/CMakeLists.txt?rev=315996=315995=315996=diff
>> ==
>> --- cfe/trunk/CMakeLists.txt (original)
>> +++ cfe/trunk/CMakeLists.txt Tue Oct 17 06:37:36 2017
>> @@ -235,6 +235,17 @@ endif()
>>  set(CLANG_DEFAULT_OPENMP_RUNTIME "libomp" CACHE STRING
>>    "Default OpenMP runtime used by -fopenmp.")
>>
>> +# OpenMP offloading requires at least sm_30 because we use shuffle instructions
>> +# to generate efficient code for reductions.
>> +set(CLANG_OPENMP_NVPTX_DEFAULT_ARCH "sm_30" CACHE STRING
>> +  "Default architecture for OpenMP offloading to Nvidia GPUs.")
>> +string(REGEX MATCH "^sm_([0-9]+)$" MATCHED_ARCH "${CLANG_OPENMP_NVPTX_DEFAULT_ARCH}")
>> +if (NOT DEFINED MATCHED_ARCH OR "${CMAKE_MATCH_1}" LESS 30)
>> +  message(WARNING "Resetting default architecture for OpenMP offloading to Nvidia GPUs to sm_30")
>
> This warning is pretty noisy and doesn't affect most people: I don't
> know what it means but I get it in every cmake run. Can we somehow
> restrict or disable it?

So the next line used to say

>> +  set(CLANG_OPENMP_NVPTX_DEFAULT_ARCH "sm_30" CACHE STRING
>> +    "Default architecture for OpenMP offloading to Nvidia GPUs." FORCE)

which should make sure that the cache is updated to a "correct" value
and you only see the warning once.

That said, we have raised the default to "sm_35" today, so maybe
something has gone wrong here. Let me check that and come back to you!

Cheers,
Jonas
Re: r315996 - [CMake][OpenMP] Customize default offloading arch
Hi Jonas,

On Tue, Oct 17, 2017 at 6:37 AM, Jonas Hahnfeld via cfe-commits wrote:
> Author: hahnfeld
> Date: Tue Oct 17 06:37:36 2017
> New Revision: 315996
>
> URL: http://llvm.org/viewvc/llvm-project?rev=315996=rev
> Log:
> [CMake][OpenMP] Customize default offloading arch
>
> For the shuffle instructions in reductions we need at least sm_30
> but the user may want to customize the default architecture.
>
> Differential Revision: https://reviews.llvm.org/D38883
>
> Modified:
>     cfe/trunk/CMakeLists.txt
>     cfe/trunk/include/clang/Config/config.h.cmake
>     cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
>     cfe/trunk/lib/Driver/ToolChains/Cuda.h
>
> Modified: cfe/trunk/CMakeLists.txt
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/CMakeLists.txt?rev=315996=315995=315996=diff
> ==
> --- cfe/trunk/CMakeLists.txt (original)
> +++ cfe/trunk/CMakeLists.txt Tue Oct 17 06:37:36 2017
> @@ -235,6 +235,17 @@ endif()
>  set(CLANG_DEFAULT_OPENMP_RUNTIME "libomp" CACHE STRING
>    "Default OpenMP runtime used by -fopenmp.")
>
> +# OpenMP offloading requires at least sm_30 because we use shuffle instructions
> +# to generate efficient code for reductions.
> +set(CLANG_OPENMP_NVPTX_DEFAULT_ARCH "sm_30" CACHE STRING
> +  "Default architecture for OpenMP offloading to Nvidia GPUs.")
> +string(REGEX MATCH "^sm_([0-9]+)$" MATCHED_ARCH "${CLANG_OPENMP_NVPTX_DEFAULT_ARCH}")
> +if (NOT DEFINED MATCHED_ARCH OR "${CMAKE_MATCH_1}" LESS 30)
> +  message(WARNING "Resetting default architecture for OpenMP offloading to Nvidia GPUs to sm_30")

This warning is pretty noisy and doesn't affect most people: I don't
know what it means but I get it in every cmake run. Can we somehow
restrict or disable it?

Thanks!
-Ahmed

> +  set(CLANG_OPENMP_NVPTX_DEFAULT_ARCH "sm_30" CACHE STRING
> +    "Default architecture for OpenMP offloading to Nvidia GPUs." FORCE)
> +endif()
> +
>  set(CLANG_VENDOR ${PACKAGE_VENDOR} CACHE STRING
>    "Vendor-specific text for showing with version information.")
>
>
> Modified: cfe/trunk/include/clang/Config/config.h.cmake
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Config/config.h.cmake?rev=315996=315995=315996=diff
> ==
> --- cfe/trunk/include/clang/Config/config.h.cmake (original)
> +++ cfe/trunk/include/clang/Config/config.h.cmake Tue Oct 17 06:37:36 2017
> @@ -20,6 +20,9 @@
>  /* Default OpenMP runtime used by -fopenmp. */
>  #define CLANG_DEFAULT_OPENMP_RUNTIME "${CLANG_DEFAULT_OPENMP_RUNTIME}"
>
> +/* Default architecture for OpenMP offloading to Nvidia GPUs. */
> +#define CLANG_OPENMP_NVPTX_DEFAULT_ARCH "${CLANG_OPENMP_NVPTX_DEFAULT_ARCH}"
> +
>  /* Multilib suffix for libdir. */
>  #define CLANG_LIBDIR_SUFFIX "${CLANG_LIBDIR_SUFFIX}"
>
>
> Modified: cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Cuda.cpp?rev=315996=315995=315996=diff
> ==
> --- cfe/trunk/lib/Driver/ToolChains/Cuda.cpp (original)
> +++ cfe/trunk/lib/Driver/ToolChains/Cuda.cpp Tue Oct 17 06:37:36 2017
> @@ -542,9 +542,9 @@ CudaToolChain::TranslateArgs(const llvm:
>    // flags are not duplicated.
>    // Also append the compute capability.
>    if (DeviceOffloadKind == Action::OFK_OpenMP) {
> -    for (Arg *A : Args){
> +    for (Arg *A : Args) {
>        bool IsDuplicate = false;
> -      for (Arg *DALArg : *DAL){
> +      for (Arg *DALArg : *DAL) {
>          if (A == DALArg) {
>            IsDuplicate = true;
>            break;
> @@ -555,14 +555,9 @@ CudaToolChain::TranslateArgs(const llvm:
>      }
>
>      StringRef Arch = DAL->getLastArgValue(options::OPT_march_EQ);
> -    if (Arch.empty()) {
> -      // Default compute capability for CUDA toolchain is the
> -      // lowest compute capability supported by the installed
> -      // CUDA version.
> -      DAL->AddJoinedArg(nullptr,
> -          Opts.getOption(options::OPT_march_EQ),
> -          CudaInstallation.getLowestExistingArch());
> -    }
> +    if (Arch.empty())
> +      DAL->AddJoinedArg(nullptr, Opts.getOption(options::OPT_march_EQ),
> +                        CLANG_OPENMP_NVPTX_DEFAULT_ARCH);
>
>      return DAL;
>    }
>
> Modified: cfe/trunk/lib/Driver/ToolChains/Cuda.h
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Cuda.h?rev=315996=315995=315996=diff
> ==
> --- cfe/trunk/lib/Driver/ToolChains/Cuda.h (original)
> +++ cfe/trunk/lib/Driver/ToolChains/Cuda.h Tue Oct 17 06:37:36 2017
> @@ -76,17 +76,6 @@ public:
>    std::string getLibDeviceFile(StringRef Gpu) const {
>      return LibDeviceMap.lookup(Gpu);
>    }
> -  /// \brief Get lowest available compute capability
> -  /// for which a libdevice
r315996 - [CMake][OpenMP] Customize default offloading arch
Author: hahnfeld
Date: Tue Oct 17 06:37:36 2017
New Revision: 315996

URL: http://llvm.org/viewvc/llvm-project?rev=315996=rev
Log:
[CMake][OpenMP] Customize default offloading arch

For the shuffle instructions in reductions we need at least sm_30
but the user may want to customize the default architecture.

Differential Revision: https://reviews.llvm.org/D38883

Modified:
    cfe/trunk/CMakeLists.txt
    cfe/trunk/include/clang/Config/config.h.cmake
    cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
    cfe/trunk/lib/Driver/ToolChains/Cuda.h

Modified: cfe/trunk/CMakeLists.txt
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/CMakeLists.txt?rev=315996=315995=315996=diff
==
--- cfe/trunk/CMakeLists.txt (original)
+++ cfe/trunk/CMakeLists.txt Tue Oct 17 06:37:36 2017
@@ -235,6 +235,17 @@ endif()
 set(CLANG_DEFAULT_OPENMP_RUNTIME "libomp" CACHE STRING
   "Default OpenMP runtime used by -fopenmp.")
 
+# OpenMP offloading requires at least sm_30 because we use shuffle instructions
+# to generate efficient code for reductions.
+set(CLANG_OPENMP_NVPTX_DEFAULT_ARCH "sm_30" CACHE STRING
+  "Default architecture for OpenMP offloading to Nvidia GPUs.")
+string(REGEX MATCH "^sm_([0-9]+)$" MATCHED_ARCH "${CLANG_OPENMP_NVPTX_DEFAULT_ARCH}")
+if (NOT DEFINED MATCHED_ARCH OR "${CMAKE_MATCH_1}" LESS 30)
+  message(WARNING "Resetting default architecture for OpenMP offloading to Nvidia GPUs to sm_30")
+  set(CLANG_OPENMP_NVPTX_DEFAULT_ARCH "sm_30" CACHE STRING
+    "Default architecture for OpenMP offloading to Nvidia GPUs." FORCE)
+endif()
+
 set(CLANG_VENDOR ${PACKAGE_VENDOR} CACHE STRING
   "Vendor-specific text for showing with version information.")
 

Modified: cfe/trunk/include/clang/Config/config.h.cmake
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Config/config.h.cmake?rev=315996=315995=315996=diff
==
--- cfe/trunk/include/clang/Config/config.h.cmake (original)
+++ cfe/trunk/include/clang/Config/config.h.cmake Tue Oct 17 06:37:36 2017
@@ -20,6 +20,9 @@
 /* Default OpenMP runtime used by -fopenmp. */
 #define CLANG_DEFAULT_OPENMP_RUNTIME "${CLANG_DEFAULT_OPENMP_RUNTIME}"
 
+/* Default architecture for OpenMP offloading to Nvidia GPUs. */
+#define CLANG_OPENMP_NVPTX_DEFAULT_ARCH "${CLANG_OPENMP_NVPTX_DEFAULT_ARCH}"
+
 /* Multilib suffix for libdir. */
 #define CLANG_LIBDIR_SUFFIX "${CLANG_LIBDIR_SUFFIX}"
 

Modified: cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Cuda.cpp?rev=315996=315995=315996=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Cuda.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/Cuda.cpp Tue Oct 17 06:37:36 2017
@@ -542,9 +542,9 @@ CudaToolChain::TranslateArgs(const llvm:
   // flags are not duplicated.
   // Also append the compute capability.
   if (DeviceOffloadKind == Action::OFK_OpenMP) {
-    for (Arg *A : Args){
+    for (Arg *A : Args) {
       bool IsDuplicate = false;
-      for (Arg *DALArg : *DAL){
+      for (Arg *DALArg : *DAL) {
         if (A == DALArg) {
           IsDuplicate = true;
           break;
@@ -555,14 +555,9 @@ CudaToolChain::TranslateArgs(const llvm:
     }
 
     StringRef Arch = DAL->getLastArgValue(options::OPT_march_EQ);
-    if (Arch.empty()) {
-      // Default compute capability for CUDA toolchain is the
-      // lowest compute capability supported by the installed
-      // CUDA version.
-      DAL->AddJoinedArg(nullptr,
-          Opts.getOption(options::OPT_march_EQ),
-          CudaInstallation.getLowestExistingArch());
-    }
+    if (Arch.empty())
+      DAL->AddJoinedArg(nullptr, Opts.getOption(options::OPT_march_EQ),
+                        CLANG_OPENMP_NVPTX_DEFAULT_ARCH);
 
     return DAL;
   }

Modified: cfe/trunk/lib/Driver/ToolChains/Cuda.h
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Cuda.h?rev=315996=315995=315996=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Cuda.h (original)
+++ cfe/trunk/lib/Driver/ToolChains/Cuda.h Tue Oct 17 06:37:36 2017
@@ -76,17 +76,6 @@ public:
   std::string getLibDeviceFile(StringRef Gpu) const {
     return LibDeviceMap.lookup(Gpu);
   }
-  /// \brief Get lowest available compute capability
-  /// for which a libdevice library exists.
-  std::string getLowestExistingArch() const {
-    std::string LibDeviceFile;
-    for (auto key : LibDeviceMap.keys()) {
-      LibDeviceFile = LibDeviceMap.lookup(key);
-      if (!LibDeviceFile.empty())
-        return key;
-    }
-    return "sm_20";
-  }
 };
 
 namespace tools {
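
For readers without the Clang driver sources at hand, the effect of the Cuda.cpp change can be illustrated with a standalone sketch. This is not the driver code: `lastMarchValue` and `withDefaultArch` are hypothetical helpers operating on plain strings, and the macro is defined locally as a stand-in for the value that the generated config.h would provide. It only shows the rule the patch implements: if the user gave no `-march=` value, append the configured default instead of probing the CUDA installation.

```cpp
#include <string>
#include <vector>

// Stand-in for the value the generated config.h would provide.
#ifndef CLANG_OPENMP_NVPTX_DEFAULT_ARCH
#define CLANG_OPENMP_NVPTX_DEFAULT_ARCH "sm_30"
#endif

// Hypothetical helper: value of the last "-march=" argument, or an empty
// string if there is none (loosely modelled on getLastArgValue).
std::string lastMarchValue(const std::vector<std::string> &Args) {
  const std::string Prefix = "-march=";
  std::string Value;
  for (const std::string &A : Args)
    if (A.compare(0, Prefix.size(), Prefix) == 0)
      Value = A.substr(Prefix.size());
  return Value;
}

// Append the configured default when no architecture was requested.
std::vector<std::string> withDefaultArch(std::vector<std::string> Args) {
  if (lastMarchValue(Args).empty())
    Args.push_back(std::string("-march=") + CLANG_OPENMP_NVPTX_DEFAULT_ARCH);
  return Args;
}
```

With the stand-in default, `withDefaultArch({"-O2"})` yields `-O2 -march=sm_30`, while an argument list that already contains `-march=sm_60` is returned unchanged.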