[clang] 1b7db40 - [HLSL][SPIR-V] Target `directx` is required
Author: Michael Liao Date: 2024-04-26T15:07:13-04:00 New Revision: 1b7db405b97cc350e2de243683273e9946fc0bd0 URL: https://github.com/llvm/llvm-project/commit/1b7db405b97cc350e2de243683273e9946fc0bd0 DIFF: https://github.com/llvm/llvm-project/commit/1b7db405b97cc350e2de243683273e9946fc0bd0.diff LOG: [HLSL][SPIR-V] Target `directx` is required - One of tests needs target directx Added: Modified: clang/test/Driver/hlsl-lang-targets-spirv.hlsl Removed: diff --git a/clang/test/Driver/hlsl-lang-targets-spirv.hlsl b/clang/test/Driver/hlsl-lang-targets-spirv.hlsl index b86c2e01f8d80e..61b10e1648c52b 100644 --- a/clang/test/Driver/hlsl-lang-targets-spirv.hlsl +++ b/clang/test/Driver/hlsl-lang-targets-spirv.hlsl @@ -1,4 +1,5 @@ // REQUIRES: spirv-registered-target +// REQUIRES: directx-registered-target // Supported targets // ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 4bb5d48 - [clang][NFC] Fix CUDA clang-cl tests
Author: Michael Liao Date: 2024-04-09T11:55:31-04:00 New Revision: 4bb5d48584818646a31a1ba4bfbbd658b7dfbe67 URL: https://github.com/llvm/llvm-project/commit/4bb5d48584818646a31a1ba4bfbbd658b7dfbe67 DIFF: https://github.com/llvm/llvm-project/commit/4bb5d48584818646a31a1ba4bfbbd658b7dfbe67.diff LOG: [clang][NFC] Fix CUDA clang-cl tests - Add '--' argument to prevent interpreting intput files as options starting with '/'. Fix test failure after 2921a0928c71f4ee652a2478283e47ab5ffebf58. Added: Modified: clang/test/Driver/cuda-external-tools.cu Removed: diff --git a/clang/test/Driver/cuda-external-tools.cu b/clang/test/Driver/cuda-external-tools.cu index d9564d026b4faa..9ada0cf8595dc6 100644 --- a/clang/test/Driver/cuda-external-tools.cu +++ b/clang/test/Driver/cuda-external-tools.cu @@ -89,7 +89,7 @@ // Check -Xcuda-ptxas with clang-cl // RUN: %clang_cl -### -c -Xcuda-ptxas -foo1 \ // RUN: --offload-arch=sm_35 --cuda-path=%S/Inputs/CUDA/usr/local/cuda \ -// RUN: -Xcuda-ptxas -foo2 %s 2>&1 \ +// RUN: -Xcuda-ptxas -foo2 -- %s 2>&1 \ // RUN: | FileCheck -check-prefixes=CHECK,SM35,PTXAS-EXTRA %s // MacOS spot-checks ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 81ae2a8 - [clang] Fix '-Wunused-variable' warnings. NFC
Author: Michael Liao Date: 2023-12-24T22:00:57-05:00 New Revision: 81ae2a8bb01d38162e0269fc6819584af6d60b03 URL: https://github.com/llvm/llvm-project/commit/81ae2a8bb01d38162e0269fc6819584af6d60b03 DIFF: https://github.com/llvm/llvm-project/commit/81ae2a8bb01d38162e0269fc6819584af6d60b03.diff LOG: [clang] Fix '-Wunused-variable' warnings. NFC Added: Modified: clang/lib/Driver/ToolChains/Clang.cpp Removed: diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 4783affd3220bc..70dc7e54aca125 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -3203,13 +3203,13 @@ static void RenderFloatingPointOptions(const ToolChain , const Driver , options::OPT_fstrict_float_cast_overflow, false)) CmdArgs.push_back("-fno-strict-float-cast-overflow"); - if (const Arg *A = Args.getLastArg(options::OPT_fcx_limited_range)) + if (Args.hasArg(options::OPT_fcx_limited_range)) CmdArgs.push_back("-fcx-limited-range"); - if (const Arg *A = Args.getLastArg(options::OPT_fcx_fortran_rules)) + if (Args.hasArg(options::OPT_fcx_fortran_rules)) CmdArgs.push_back("-fcx-fortran-rules"); - if (const Arg *A = Args.getLastArg(options::OPT_fno_cx_limited_range)) + if (Args.hasArg(options::OPT_fno_cx_limited_range)) CmdArgs.push_back("-fno-cx-limited-range"); - if (const Arg *A = Args.getLastArg(options::OPT_fno_cx_fortran_rules)) + if (Args.hasArg(options::OPT_fno_cx_fortran_rules)) CmdArgs.push_back("-fno-cx-fortran-rules"); } ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 7b12d8b - [clang][Tests] Fix shared build. NFC
Author: Michael Liao Date: 2023-10-12T12:24:18-04:00 New Revision: 7b12d8bf8a1ff1540e32345b045f813644708a71 URL: https://github.com/llvm/llvm-project/commit/7b12d8bf8a1ff1540e32345b045f813644708a71 DIFF: https://github.com/llvm/llvm-project/commit/7b12d8bf8a1ff1540e32345b045f813644708a71.diff LOG: [clang][Tests] Fix shared build. NFC Added: Modified: clang/unittests/AST/Interp/CMakeLists.txt Removed: diff --git a/clang/unittests/AST/Interp/CMakeLists.txt b/clang/unittests/AST/Interp/CMakeLists.txt index e8d41091af40cda..8fa5c85064dbce5 100644 --- a/clang/unittests/AST/Interp/CMakeLists.txt +++ b/clang/unittests/AST/Interp/CMakeLists.txt @@ -5,7 +5,10 @@ add_clang_unittest(InterpTests clang_target_link_libraries(InterpTests PRIVATE clangAST + clangASTMatchers clangBasic + clangFrontend + clangSerialization clangTooling ) ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 4edde41 - [clang][APINotes] Fix build error due to `-fpermissive` on GCC. NFC
Author: Michael Liao Date: 2023-08-17T15:00:39-04:00 New Revision: 4edde41daed5e5e0b9aab2322215ddc2535f4cfd URL: https://github.com/llvm/llvm-project/commit/4edde41daed5e5e0b9aab2322215ddc2535f4cfd DIFF: https://github.com/llvm/llvm-project/commit/4edde41daed5e5e0b9aab2322215ddc2535f4cfd.diff LOG: [clang][APINotes] Fix build error due to `-fpermissive` on GCC. NFC Added: Modified: clang/lib/APINotes/APINotesWriter.cpp Removed: diff --git a/clang/lib/APINotes/APINotesWriter.cpp b/clang/lib/APINotes/APINotesWriter.cpp index 1a3d66a547f6ad..aad4c886bdd66d 100644 --- a/clang/lib/APINotes/APINotesWriter.cpp +++ b/clang/lib/APINotes/APINotesWriter.cpp @@ -891,7 +891,7 @@ unsigned getFunctionInfoSize(const FunctionInfo ) { } /// Emit a serialized representation of the function information. -static void emitFunctionInfo(raw_ostream , const FunctionInfo ) { +void emitFunctionInfo(raw_ostream , const FunctionInfo ) { emitCommonEntityInfo(OS, FI); uint8_t flags = 0; ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 2daf91d - Fix shared library build again from 1c9a800. NFC
Author: Michael Liao Date: 2023-05-24T14:09:38-04:00 New Revision: 2daf91dae3bc25d2ffb869d9781d9f4496a27d02 URL: https://github.com/llvm/llvm-project/commit/2daf91dae3bc25d2ffb869d9781d9f4496a27d02 DIFF: https://github.com/llvm/llvm-project/commit/2daf91dae3bc25d2ffb869d9781d9f4496a27d02.diff LOG: Fix shared library build again from 1c9a800. NFC Added: Modified: clang/unittests/Serialization/CMakeLists.txt Removed: diff --git a/clang/unittests/Serialization/CMakeLists.txt b/clang/unittests/Serialization/CMakeLists.txt index 6b82ad91e5ec..44e4ecb31436 100644 --- a/clang/unittests/Serialization/CMakeLists.txt +++ b/clang/unittests/Serialization/CMakeLists.txt @@ -1,6 +1,7 @@ set(LLVM_LINK_COMPONENTS BitReader BitstreamReader + FrontendOpenMP Support ) ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 058f04e - [clang] Fix another case where CPlusPlus2b is still used.
Author: Michael Liao Date: 2023-05-04T14:27:33-04:00 New Revision: 058f04ea7dcbafbeed271fa75ee65e41409b4479 URL: https://github.com/llvm/llvm-project/commit/058f04ea7dcbafbeed271fa75ee65e41409b4479 DIFF: https://github.com/llvm/llvm-project/commit/058f04ea7dcbafbeed271fa75ee65e41409b4479.diff LOG: [clang] Fix another case where CPlusPlus2b is still used. Added: Modified: clang/lib/Sema/SemaDeclCXX.cpp Removed: diff --git a/clang/lib/Sema/SemaDeclCXX.cpp b/clang/lib/Sema/SemaDeclCXX.cpp index 5b7ee09ac4c7..208e34a40e48 100644 --- a/clang/lib/Sema/SemaDeclCXX.cpp +++ b/clang/lib/Sema/SemaDeclCXX.cpp @@ -8829,7 +8829,7 @@ bool Sema::CheckExplicitlyDefaultedComparison(Scope *S, FunctionDecl *FD, CheckConstexprParameterTypes(*this, FD, CheckConstexprKind::Diagnose) && !Info.Constexpr) { Diag(FD->getBeginLoc(), - getLangOpts().CPlusPlus2b + getLangOpts().CPlusPlus23 ? diag::warn_cxx2b_compat_defaulted_comparison_constexpr_mismatch : diag::ext_defaulted_comparison_constexpr_mismatch) << FD->isImplicit() << (int)DCK << FD->isConsteval(); ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] b323b40 - [clang] Fix build after https://reviews.llvm.org/D149553
Author: Michael Liao Date: 2023-05-04T14:23:46-04:00 New Revision: b323b407f76d22bfc08b1430f7952c03eb504288 URL: https://github.com/llvm/llvm-project/commit/b323b407f76d22bfc08b1430f7952c03eb504288 DIFF: https://github.com/llvm/llvm-project/commit/b323b407f76d22bfc08b1430f7952c03eb504288.diff LOG: [clang] Fix build after https://reviews.llvm.org/D149553 - `CXXPre2bCompat` is referenced somewhere after being removed. - More warning messages on c++2b need refining Added: Modified: clang/include/clang/Basic/DiagnosticSemaKinds.td Removed: diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td index f0ff7d0f3169..7df963f7ba05 100644 --- a/clang/include/clang/Basic/DiagnosticSemaKinds.td +++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td @@ -9440,7 +9440,7 @@ def warn_cxx2b_compat_defaulted_comparison_constexpr_mismatch : Warning< "%select{|for which the corresponding implicit 'operator==' }0 " "invokes a non-constexpr comparison function is incompatible with C++ " "standards before C++2b">, - InGroup, DefaultIgnore; + InGroup, DefaultIgnore; def note_defaulted_comparison_not_constexpr : Note< "non-constexpr comparison function would be used to compare " "%select{|member %1|base class %1}0">; ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 3c83480 - [clang][AST] Fix `-Wuninitialized`. NFC
Author: Michael Liao Date: 2023-04-09T15:58:10-04:00 New Revision: 3c83480ae95dde9b5d45b6fd7cdb1c64332531d7 URL: https://github.com/llvm/llvm-project/commit/3c83480ae95dde9b5d45b6fd7cdb1c64332531d7 DIFF: https://github.com/llvm/llvm-project/commit/3c83480ae95dde9b5d45b6fd7cdb1c64332531d7.diff LOG: [clang][AST] Fix `-Wuninitialized`. NFC - Adjust the declaration order as non-static member are initialized in order of declaration in the class definition. Added: Modified: clang/lib/AST/MicrosoftMangle.cpp Removed: diff --git a/clang/lib/AST/MicrosoftMangle.cpp b/clang/lib/AST/MicrosoftMangle.cpp index d8c837bcade02..e0fd8abe5e3b8 100644 --- a/clang/lib/AST/MicrosoftMangle.cpp +++ b/clang/lib/AST/MicrosoftMangle.cpp @@ -327,8 +327,8 @@ class MicrosoftCXXNameMangler { typedef llvm::DenseMap TemplateArgStringMap; TemplateArgStringMap TemplateArgStrings; - llvm::StringSaver TemplateArgStringStorage; llvm::BumpPtrAllocator TemplateArgStringStorageAlloc; + llvm::StringSaver TemplateArgStringStorage; typedef std::set> PassObjectSizeArgsSet; PassObjectSizeArgsSet PassObjectSizeArgs; ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] cd61d2a - [clang][CodeGen][NFC] Fix `llvm-else-after-return`
Author: Michael Liao Date: 2023-01-25T13:53:35-05:00 New Revision: cd61d2abe0fdfcee52e16998f7f3fda82572cd6f URL: https://github.com/llvm/llvm-project/commit/cd61d2abe0fdfcee52e16998f7f3fda82572cd6f DIFF: https://github.com/llvm/llvm-project/commit/cd61d2abe0fdfcee52e16998f7f3fda82572cd6f.diff LOG: [clang][CodeGen][NFC] Fix `llvm-else-after-return` Added: Modified: clang/lib/CodeGen/CodeGenModule.cpp Removed: diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp index 12d602fed6932..38e6ec6634e88 100644 --- a/clang/lib/CodeGen/CodeGenModule.cpp +++ b/clang/lib/CodeGen/CodeGenModule.cpp @@ -3318,7 +3318,8 @@ void CodeGenModule::EmitGlobal(GlobalDecl GD) { if (MustBeEmitted(Global)) EmitOMPDeclareReduction(DRD); return; -} else if (auto *DMD = dyn_cast(Global)) { +} +if (auto *DMD = dyn_cast(Global)) { if (MustBeEmitted(Global)) EmitOMPDeclareMapper(DMD); return; @@ -4687,16 +4688,17 @@ LangAS CodeGenModule::GetGlobalVarAddressSpace(const VarDecl *D) { return LangAS::sycl_global; if (LangOpts.CUDA && LangOpts.CUDAIsDevice) { -if (D && D->hasAttr()) - return LangAS::cuda_constant; -else if (D && D->hasAttr()) - return LangAS::cuda_shared; -else if (D && D->hasAttr()) - return LangAS::cuda_device; -else if (D && D->getType().isConstQualified()) - return LangAS::cuda_constant; -else - return LangAS::cuda_device; +if (D) { + if (D->hasAttr()) +return LangAS::cuda_constant; + if (D->hasAttr()) +return LangAS::cuda_shared; + if (D->hasAttr()) +return LangAS::cuda_device; + if (D->getType().isConstQualified()) +return LangAS::cuda_constant; +} +return LangAS::cuda_device; } if (LangOpts.OpenMP) { ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 036aeac - [Testing] Fix the shared build. NFC.
Author: Michael Liao Date: 2022-04-22T02:46:54-04:00 New Revision: 036aeac36c00f4390e861118f536150b366beaaf URL: https://github.com/llvm/llvm-project/commit/036aeac36c00f4390e861118f536150b366beaaf DIFF: https://github.com/llvm/llvm-project/commit/036aeac36c00f4390e861118f536150b366beaaf.diff LOG: [Testing] Fix the shared build. NFC. Added: Modified: clang/lib/Testing/CMakeLists.txt Removed: diff --git a/clang/lib/Testing/CMakeLists.txt b/clang/lib/Testing/CMakeLists.txt index 68ed32ba85b1e..49b6787959bc1 100644 --- a/clang/lib/Testing/CMakeLists.txt +++ b/clang/lib/Testing/CMakeLists.txt @@ -16,8 +16,11 @@ add_llvm_library(clangTesting clang_target_link_libraries(clangTesting PRIVATE + clangAST clangBasic clangFrontend + clangLex + clangSerialization ) target_link_libraries(clangTesting ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 73ab5fd - [clang] Fix shared build. NFC.
Author: Michael Liao Date: 2022-03-30T14:05:14-04:00 New Revision: 73ab5fd17b5726543554621410124ebae953dc6b URL: https://github.com/llvm/llvm-project/commit/73ab5fd17b5726543554621410124ebae953dc6b DIFF: https://github.com/llvm/llvm-project/commit/73ab5fd17b5726543554621410124ebae953dc6b.diff LOG: [clang] Fix shared build. NFC. Added: Modified: clang/lib/ExtractAPI/CMakeLists.txt Removed: diff --git a/clang/lib/ExtractAPI/CMakeLists.txt b/clang/lib/ExtractAPI/CMakeLists.txt index 044caa0922483..f194ae342c20f 100644 --- a/clang/lib/ExtractAPI/CMakeLists.txt +++ b/clang/lib/ExtractAPI/CMakeLists.txt @@ -14,4 +14,5 @@ add_clang_library(clangExtractAPI clangBasic clangFrontend clangIndex + clangLex ) ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 7505aee - [clang] Pacify GCC warning. NFC.
Author: Michael Liao Date: 2022-01-03T11:05:36-05:00 New Revision: 7505aeefc4e615520e2c822b9647dad4a48276b9 URL: https://github.com/llvm/llvm-project/commit/7505aeefc4e615520e2c822b9647dad4a48276b9 DIFF: https://github.com/llvm/llvm-project/commit/7505aeefc4e615520e2c822b9647dad4a48276b9.diff LOG: [clang] Pacify GCC warning. NFC. - This partially reverts d677a7cb056b17145a50ec8ca2ab6d5f4c494749 to pacify GCC warnings like ``` base class should be explicitly initialized in the copy constructor ``` - Shall we keep turning on option `IgnoreBaseInCopyConstructors` when enabling `readability-redundant-member-init` check? Added: Modified: clang/include/clang/Basic/Diagnostic.h clang/include/clang/Basic/PartialDiagnostic.h Removed: diff --git a/clang/include/clang/Basic/Diagnostic.h b/clang/include/clang/Basic/Diagnostic.h index 6a80823d12422..e5577e74fa639 100644 --- a/clang/include/clang/Basic/Diagnostic.h +++ b/clang/include/clang/Basic/Diagnostic.h @@ -1326,7 +1326,7 @@ class DiagnosticBuilder : public StreamingDiagnostic { public: /// Copy constructor. When copied, this "takes" the diagnostic info from the /// input and neuters it. - DiagnosticBuilder(const DiagnosticBuilder ) { + DiagnosticBuilder(const DiagnosticBuilder ) : StreamingDiagnostic() { DiagObj = D.DiagObj; DiagStorage = D.DiagStorage; IsActive = D.IsActive; diff --git a/clang/include/clang/Basic/PartialDiagnostic.h b/clang/include/clang/Basic/PartialDiagnostic.h index 217441979869b..9fb70bff7fee1 100644 --- a/clang/include/clang/Basic/PartialDiagnostic.h +++ b/clang/include/clang/Basic/PartialDiagnostic.h @@ -49,7 +49,8 @@ class PartialDiagnostic : public StreamingDiagnostic { PartialDiagnostic(unsigned DiagID, DiagStorageAllocator _) : StreamingDiagnostic(Allocator_), DiagID(DiagID) {} - PartialDiagnostic(const PartialDiagnostic ) : DiagID(Other.DiagID) { + PartialDiagnostic(const PartialDiagnostic ) + : StreamingDiagnostic(), DiagID(Other.DiagID) { Allocator = Other.Allocator; if (Other.DiagStorage) { DiagStorage = getStorage(); ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 17414b6 - Fix shared build of unittests.
Author: Michael Liao Date: 2021-12-10T15:33:56-05:00 New Revision: 17414b61245dcd3ae96c447762e8f776856a733c URL: https://github.com/llvm/llvm-project/commit/17414b61245dcd3ae96c447762e8f776856a733c DIFF: https://github.com/llvm/llvm-project/commit/17414b61245dcd3ae96c447762e8f776856a733c.diff LOG: Fix shared build of unittests. Added: Modified: clang/unittests/Analysis/FlowSensitive/CMakeLists.txt Removed: diff --git a/clang/unittests/Analysis/FlowSensitive/CMakeLists.txt b/clang/unittests/Analysis/FlowSensitive/CMakeLists.txt index d6f38c9404abc..f651cdeff1b55 100644 --- a/clang/unittests/Analysis/FlowSensitive/CMakeLists.txt +++ b/clang/unittests/Analysis/FlowSensitive/CMakeLists.txt @@ -1,4 +1,5 @@ set(LLVM_LINK_COMPONENTS + FrontendOpenMP Support ) ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] bf22593 - [InferAddressSpaces] Support assumed addrspaces from addrspace predicates.
Author: Michael Liao Date: 2021-11-08T16:51:57-05:00 New Revision: bf225939bc3acf936c962f24423d3bb5ddd4c93f URL: https://github.com/llvm/llvm-project/commit/bf225939bc3acf936c962f24423d3bb5ddd4c93f DIFF: https://github.com/llvm/llvm-project/commit/bf225939bc3acf936c962f24423d3bb5ddd4c93f.diff LOG: [InferAddressSpaces] Support assumed addrspaces from addrspace predicates. - CUDA cannot associate memory space with pointer types. Even though Clang could add extra attributes to specify the address space explicitly on a pointer type, it breaks the portability between Clang and NVCC. - This change proposes to assume the address space from a pointer from the assumption built upon target-specific address space predicates, such as `__isGlobal` from CUDA. E.g., ``` foo(float *p) { __builtin_assume(__isGlobal(p)); // From there, we could assume p is a global pointer instead of a // generic one. } ``` This makes the code portable without introducing the implementation-specific features. Note that NVCC starts to support __builtin_assume from version 11. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D112041 Added: llvm/test/Transforms/InferAddressSpaces/AMDGPU/builtin-assumed-addrspace.ll llvm/test/Transforms/InferAddressSpaces/NVPTX/builtin-assumed-addrspace.ll Modified: clang/test/CodeGen/thinlto-distributed-newpm.ll llvm/include/llvm/Analysis/AssumptionCache.h llvm/include/llvm/Analysis/TargetTransformInfo.h llvm/include/llvm/Analysis/TargetTransformInfoImpl.h llvm/include/llvm/CodeGen/BasicTTIImpl.h llvm/include/llvm/Target/TargetMachine.h llvm/lib/Analysis/AssumptionCache.cpp llvm/lib/Analysis/TargetTransformInfo.cpp llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp llvm/lib/Target/NVPTX/NVPTXTargetMachine.h llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp llvm/test/Other/loop-pm-invalidation.ll llvm/test/Other/new-pass-manager.ll llvm/test/Other/new-pm-lto-defaults.ll llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll llvm/test/Transforms/LoopRotate/pr35210.ll llvm/unittests/Analysis/AssumeBundleQueriesTest.cpp Removed: diff --git a/clang/test/CodeGen/thinlto-distributed-newpm.ll b/clang/test/CodeGen/thinlto-distributed-newpm.ll index 8f7fc5e9b8411..87dc19f29e1ba 100644 --- a/clang/test/CodeGen/thinlto-distributed-newpm.ll +++ b/clang/test/CodeGen/thinlto-distributed-newpm.ll @@ -47,11 +47,11 @@ ; CHECK-O: Running pass: PromotePass ; CHECK-O: Running analysis: DominatorTreeAnalysis on main ; CHECK-O: Running analysis: AssumptionAnalysis on main +; CHECK-O: Running analysis: TargetIRAnalysis on main ; CHECK-O: Running pass: DeadArgumentEliminationPass ; CHECK-O: Running pass: InstCombinePass on main ; CHECK-O: Running analysis: TargetLibraryAnalysis on main ; CHECK-O: Running analysis: OptimizationRemarkEmitterAnalysis on main -; CHECK-O: Running analysis: TargetIRAnalysis on main ; CHECK-O: Running analysis: AAManager on main ; CHECK-O: Running analysis: BasicAA on main ; CHECK-O: Running analysis: ScopedNoAliasAA on main diff --git a/llvm/include/llvm/Analysis/AssumptionCache.h b/llvm/include/llvm/Analysis/AssumptionCache.h index 51d04bd8cf022..12dd9b04c9323 100644 --- a/llvm/include/llvm/Analysis/AssumptionCache.h +++ b/llvm/include/llvm/Analysis/AssumptionCache.h @@ -29,6 +29,7 @@ namespace llvm { class AssumeInst; class Function; class raw_ostream; +class TargetTransformInfo; class Value; /// A cache of \@llvm.assume calls within a function. @@ -59,6 +60,8 @@ class AssumptionCache { /// We track this to lazily populate our assumptions. Function + TargetTransformInfo *TTI; + /// Vector of weak value handles to calls of the \@llvm.assume /// intrinsic. SmallVector AssumeHandles; @@ -103,7 +106,8 @@ class AssumptionCache { public: /// Construct an AssumptionCache from a function by scanning all of /// its instructions. - AssumptionCache(Function ) : F(F) {} + AssumptionCache(Function , TargetTransformInfo *TTI = nullptr) + : F(F), TTI(TTI) {} /// This cache is designed to be self-updating and so it should never be /// invalidated. @@ -174,9 +178,7 @@ class AssumptionAnalysis : public AnalysisInfoMixin { public: using Result = AssumptionCache; - AssumptionCache run(Function , FunctionAnalysisManager &) { -return AssumptionCache(F); - } + AssumptionCache run(Function , FunctionAnalysisManager &); }; /// Printer pass for the \c AssumptionAnalysis results. diff --git a/llvm/include/llvm/Analysis/TargetTransformInfo.h b/llvm/include/llvm/Analysis/TargetTransformInfo.h index e3cf87612e9c3..4312c2ae0de63 100644 --- a/llvm/include/llvm/Analysis/TargetTransformInfo.h +++ b/llvm/include/llvm/Analysis/TargetTransformInfo.h
[clang] 6fe902d - [cuda] Add address space predicate funuctions.
Author: Michael Liao Date: 2021-10-19T16:20:14-04:00 New Revision: 6fe902daf931dedf6e958b43c043cb57bb612daf URL: https://github.com/llvm/llvm-project/commit/6fe902daf931dedf6e958b43c043cb57bb612daf DIFF: https://github.com/llvm/llvm-project/commit/6fe902daf931dedf6e958b43c043cb57bb612daf.diff LOG: [cuda] Add address space predicate funuctions. - Add the missing NVVM predicate builtins on address space checking - Redefine them as pure functions so that they could be used in __builtin_assume. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D112053 Added: Modified: clang/include/clang/Basic/BuiltinsNVPTX.def clang/lib/Headers/__clang_cuda_runtime_wrapper.h Removed: diff --git a/clang/include/clang/Basic/BuiltinsNVPTX.def b/clang/include/clang/Basic/BuiltinsNVPTX.def index 907a99af532c3..7afee4dbc80bc 100644 --- a/clang/include/clang/Basic/BuiltinsNVPTX.def +++ b/clang/include/clang/Basic/BuiltinsNVPTX.def @@ -687,6 +687,12 @@ BUILTIN(__nvvm_ldg_f2, "E2fE2fC*", "") BUILTIN(__nvvm_ldg_f4, "E4fE4fC*", "") BUILTIN(__nvvm_ldg_d2, "E2dE2dC*", "") +// Address space predicates. +BUILTIN(__nvvm_isspacep_const, "bvC*", "nc") +BUILTIN(__nvvm_isspacep_global, "bvC*", "nc") +BUILTIN(__nvvm_isspacep_local, "bvC*", "nc") +BUILTIN(__nvvm_isspacep_shared, "bvC*", "nc") + // Builtins to support WMMA instructions on sm_70 TARGET_BUILTIN(__hmma_m16n16k16_ld_a, "vi*iC*UiIi", "", AND(SM_70,PTX60)) TARGET_BUILTIN(__hmma_m16n16k16_ld_b, "vi*iC*UiIi", "", AND(SM_70,PTX60)) diff --git a/clang/lib/Headers/__clang_cuda_runtime_wrapper.h b/clang/lib/Headers/__clang_cuda_runtime_wrapper.h index 33aa25fb2d73c..512fc300fc344 100644 --- a/clang/lib/Headers/__clang_cuda_runtime_wrapper.h +++ b/clang/lib/Headers/__clang_cuda_runtime_wrapper.h @@ -271,7 +271,38 @@ static inline __device__ void __brkpt(int __c) { __brkpt(); } #undef __CUDABE__ #endif #include "sm_20_atomic_functions.hpp" +// Predicate functions used in `__builtin_assume` need to have no side effect. +// However, sm_20_intrinsics.hpp doesn't define them with neither pure nor +// const attribute. Rename definitions from sm_20_intrinsics.hpp and re-define +// them as pure ones. +#pragma push_macro("__isGlobal") +#pragma push_macro("__isShared") +#pragma push_macro("__isConstant") +#pragma push_macro("__isLocal") +#define __isGlobal __ignored_cuda___isGlobal +#define __isShared __ignored_cuda___isShared +#define __isConstant __ignored_cuda___isConstant +#define __isLocal __ignored_cuda___isLocal #include "sm_20_intrinsics.hpp" +#pragma pop_macro("__isGlobal") +#pragma pop_macro("__isShared") +#pragma pop_macro("__isConstant") +#pragma pop_macro("__isLocal") +#pragma push_macro("__DEVICE__") +#define __DEVICE__ static __device__ __forceinline__ __attribute__((const)) +__DEVICE__ unsigned int __isGlobal(const void *p) { + return __nvvm_isspacep_global(p); +} +__DEVICE__ unsigned int __isShared(const void *p) { + return __nvvm_isspacep_shared(p); +} +__DEVICE__ unsigned int __isConstant(const void *p) { + return __nvvm_isspacep_const(p); +} +__DEVICE__ unsigned int __isLocal(const void *p) { + return __nvvm_isspacep_local(p); +} +#pragma pop_macro("__DEVICE__") #include "sm_32_atomic_functions.hpp" // Don't include sm_30_intrinsics.h and sm_32_intrinsics.h. These define the ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 6ec36d1 - [cuda] Mark builtin texture/surface reference variable as 'externally_initialized'.
Author: Michael Liao Date: 2021-08-09T13:27:40-04:00 New Revision: 6ec36d18ec7b29b471bbe50502beb5b35de3975c URL: https://github.com/llvm/llvm-project/commit/6ec36d18ec7b29b471bbe50502beb5b35de3975c DIFF: https://github.com/llvm/llvm-project/commit/6ec36d18ec7b29b471bbe50502beb5b35de3975c.diff LOG: [cuda] Mark builtin texture/surface reference variable as 'externally_initialized'. - They need to be preserved even if there's no reference within the device code as the host code may need to initialize them based on the application logic. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D107718 Added: Modified: clang/lib/CodeGen/CodeGenModule.cpp clang/test/CodeGenCUDA/surface.cu clang/test/CodeGenCUDA/texture.cu Removed: diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp index 49a1396b58e3a..13520861fe9b6 100644 --- a/clang/lib/CodeGen/CodeGenModule.cpp +++ b/clang/lib/CodeGen/CodeGenModule.cpp @@ -4438,7 +4438,9 @@ void CodeGenModule::EmitGlobalVarDefinition(const VarDecl *D, if (GV && LangOpts.CUDA) { if (LangOpts.CUDAIsDevice) { if (Linkage != llvm::GlobalValue::InternalLinkage && - (D->hasAttr() || D->hasAttr())) + (D->hasAttr() || D->hasAttr() || + D->getType()->isCUDADeviceBuiltinSurfaceType() || + D->getType()->isCUDADeviceBuiltinTextureType())) GV->setExternallyInitialized(true); } else { getCUDARuntime().internalizeDeviceSideVar(D, Linkage); diff --git a/clang/test/CodeGenCUDA/surface.cu b/clang/test/CodeGenCUDA/surface.cu index 0bf17091081b1..eedae5473fcfc 100644 --- a/clang/test/CodeGenCUDA/surface.cu +++ b/clang/test/CodeGenCUDA/surface.cu @@ -19,7 +19,7 @@ struct __attribute__((device_builtin_surface_type)) surface : public }; // On the device side, surface references are represented as `i64` handles. -// DEVICE: @surf ={{.*}} addrspace(1) global i64 undef, align 4 +// DEVICE: @surf ={{.*}} addrspace(1) externally_initialized global i64 undef, align 4 // On the host side, they remain in the original type. // HOST: @surf = internal global %struct.surface // HOST: @0 = private unnamed_addr constant [5 x i8] c"surf\00" diff --git a/clang/test/CodeGenCUDA/texture.cu b/clang/test/CodeGenCUDA/texture.cu index 8a966194340aa..0bb8cd48dcaa7 100644 --- a/clang/test/CodeGenCUDA/texture.cu +++ b/clang/test/CodeGenCUDA/texture.cu @@ -19,8 +19,8 @@ struct __attribute__((device_builtin_texture_type)) texture : public textureRefe }; // On the device side, texture references are represented as `i64` handles. -// DEVICE: @tex ={{.*}} addrspace(1) global i64 undef, align 4 -// DEVICE: @norm ={{.*}} addrspace(1) global i64 undef, align 4 +// DEVICE: @tex ={{.*}} addrspace(1) externally_initialized global i64 undef, align 4 +// DEVICE: @norm ={{.*}} addrspace(1) externally_initialized global i64 undef, align 4 // On the host side, they remain in the original type. // HOST: @tex = internal global %struct.texture // HOST: @norm = internal global %struct.texture ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 4e5d9c8 - [Internalize] Preserve variables externally initialized.
Author: Michael Liao Date: 2021-07-08T10:48:19-04:00 New Revision: 4e5d9c88033f1fc5d5206a02d8303bc6de43cf2b URL: https://github.com/llvm/llvm-project/commit/4e5d9c88033f1fc5d5206a02d8303bc6de43cf2b DIFF: https://github.com/llvm/llvm-project/commit/4e5d9c88033f1fc5d5206a02d8303bc6de43cf2b.diff LOG: [Internalize] Preserve variables externally initialized. - ``externally_initialized`` variables would be initialized or modified elsewhere. Particularly, CUDA or HIP may have host code to initialize or modify ``externally_initialized`` device variables, which may not be explicitly referenced on the device side but may still be used through the host side interfaces. Not preserving them triggers the elimination of them in the GlobalDCE and breaks the user code. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D105135 Added: llvm/test/Transforms/Internalize/externally-initialized.ll Modified: clang/test/CodeGenCUDA/host-used-device-var.cu clang/test/CodeGenCUDA/unused-global-var.cu llvm/lib/Transforms/IPO/Internalize.cpp Removed: diff --git a/clang/test/CodeGenCUDA/host-used-device-var.cu b/clang/test/CodeGenCUDA/host-used-device-var.cu index b94ef689b3162..6bb5757052946 100644 --- a/clang/test/CodeGenCUDA/host-used-device-var.cu +++ b/clang/test/CodeGenCUDA/host-used-device-var.cu @@ -15,14 +15,14 @@ #include "Inputs/cuda.h" -// Check device variables used by neither host nor device functioins are not kept. - -// DEV-NEG-NOT: @v1 +// DEV-DAG: @v1 __device__ int v1; -// DEV-NEG-NOT: @v2 +// DEV-DAG: @v2 __constant__ int v2; +// Check device variables used by neither host nor device functioins are not kept. + // DEV-NEG-NOT: @_ZL2v3 static __device__ int v3; diff --git a/clang/test/CodeGenCUDA/unused-global-var.cu b/clang/test/CodeGenCUDA/unused-global-var.cu index 1dbb3a22563c8..c091e83eda70a 100644 --- a/clang/test/CodeGenCUDA/unused-global-var.cu +++ b/clang/test/CodeGenCUDA/unused-global-var.cu @@ -15,14 +15,14 @@ // DCE before internalization. This test makes sure unused global variables // are eliminated. -// Check unused device/constant variables are eliminated. - -// NEGCHK-NOT: @v1 +// CHECK-DAG: @v1 __device__ int v1; -// NEGCHK-NOT: @v2 +// CHECK-DAG: @v2 __constant__ int v2; +// Check unused device/constant variables are eliminated. + // NEGCHK-NOT: @_ZL2v3 constexpr int v3 = 1; diff --git a/llvm/lib/Transforms/IPO/Internalize.cpp b/llvm/lib/Transforms/IPO/Internalize.cpp index 008712c87988b..cf8da0baebe41 100644 --- a/llvm/lib/Transforms/IPO/Internalize.cpp +++ b/llvm/lib/Transforms/IPO/Internalize.cpp @@ -101,6 +101,12 @@ bool InternalizePass::shouldPreserveGV(const GlobalValue ) { if (GV.hasDLLExportStorageClass()) return true; + // As the name suggests, externally initialized variables need preserving as + // they would be initialized elsewhere externally. + if (const auto *G = dyn_cast()) +if (G->isExternallyInitialized()) + return true; + // Already local, has nothing to do. if (GV.hasLocalLinkage()) return false; diff --git a/llvm/test/Transforms/Internalize/externally-initialized.ll b/llvm/test/Transforms/Internalize/externally-initialized.ll new file mode 100644 index 0..4c24e53543db9 --- /dev/null +++ b/llvm/test/Transforms/Internalize/externally-initialized.ll @@ -0,0 +1,7 @@ +; RUN: opt < %s -internalize -S | FileCheck %s +; RUN: opt < %s -passes=internalize -S | FileCheck %s + +; CHECK: @G0 +; CHECK-NOT: internal +; CHECK-SAME: global i32 +@G0 = protected externally_initialized global i32 0, align 4 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 948308e - Fix `-Wunused-variable` warning. NFC.
Author: Michael Liao Date: 2021-06-28T22:50:36-04:00 New Revision: 948308ef34dc7da8bb741a85eb9941cc2b05d227 URL: https://github.com/llvm/llvm-project/commit/948308ef34dc7da8bb741a85eb9941cc2b05d227 DIFF: https://github.com/llvm/llvm-project/commit/948308ef34dc7da8bb741a85eb9941cc2b05d227.diff LOG: Fix `-Wunused-variable` warning. NFC. Added: Modified: clang/lib/CodeGen/CGCall.cpp Removed: diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp index 35b34179cc23..4ff6c632b61d 100644 --- a/clang/lib/CodeGen/CGCall.cpp +++ b/clang/lib/CodeGen/CGCall.cpp @@ -2173,7 +2173,7 @@ void CodeGenModule::ConstructAttributeList(StringRef Name, // Add "sample-profile-suffix-elision-policy" attribute for internal linkage // functions with -funique-internal-linkage-names. if (TargetDecl && CodeGenOpts.UniqueInternalLinkageNames) { -if (auto *Fn = dyn_cast(TargetDecl)) { +if (isa(TargetDecl)) { if (this->getFunctionLinkage(CalleeInfo.getCalleeDecl()) == llvm::GlobalValue::InternalLinkage) FuncAttrs.addAttribute("sample-profile-suffix-elision-policy", ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 01bf529 - Recommit of a2fdf9d4d734732a6fa9288f1ffdf12bf8618123.
Author: Michael Liao Date: 2021-02-05T11:27:30-05:00 New Revision: 01bf529db2cf465b029e29e537807576bfcbc452 URL: https://github.com/llvm/llvm-project/commit/01bf529db2cf465b029e29e537807576bfcbc452 DIFF: https://github.com/llvm/llvm-project/commit/01bf529db2cf465b029e29e537807576bfcbc452.diff LOG: Recommit of a2fdf9d4d734732a6fa9288f1ffdf12bf8618123. - The failures are all cc1-based tests due to the missing `-aux-triple` options, which is always prepared by the driver in CUDA/HIP compilation. - Add extra check on the missing aux-targetinfo to prevent crashing. [hip][cuda] Enable extended lambda support on Windows. - On Windows, extended lambda has extra issues due to the numbering schemes are different between the host compilation (Microsoft C++ ABI) and the device compilation (Itanium C++ ABI. Additional device side lambda number is required per lambda for the host compilation to correctly mangle the device-side lambda name. - A hybrid numbering context `MSHIPNumberingContext` is introduced to number a lambda for both host- and device-compilations. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D69322 This reverts commit 4874ff02417916cc9ff994b34abcb5e563056546. Added: Modified: clang/include/clang/AST/ASTContext.h clang/include/clang/AST/DeclCXX.h clang/include/clang/AST/Mangle.h clang/include/clang/AST/MangleNumberingContext.h clang/include/clang/Sema/Sema.h clang/lib/AST/ASTImporter.cpp clang/lib/AST/CXXABI.h clang/lib/AST/DeclCXX.cpp clang/lib/AST/ItaniumCXXABI.cpp clang/lib/AST/ItaniumMangle.cpp clang/lib/AST/MicrosoftCXXABI.cpp clang/lib/CodeGen/CGCUDANV.cpp clang/lib/Sema/SemaLambda.cpp clang/lib/Sema/TreeTransform.h clang/lib/Serialization/ASTReaderDecl.cpp clang/lib/Serialization/ASTWriter.cpp clang/test/CodeGenCUDA/unnamed-types.cu Removed: diff --git a/clang/include/clang/AST/ASTContext.h b/clang/include/clang/AST/ASTContext.h index ce47d54e44b0..ae69a68608b7 100644 --- a/clang/include/clang/AST/ASTContext.h +++ b/clang/include/clang/AST/ASTContext.h @@ -538,6 +538,9 @@ class ASTContext : public RefCountedBase { /// need them (like static local vars). llvm::MapVector MangleNumbers; llvm::MapVector StaticLocalNumbers; + /// Mapping the associated device lambda mangling number if present. + mutable llvm::DenseMap + DeviceLambdaManglingNumbers; /// Mapping that stores parameterIndex values for ParmVarDecls when /// that value exceeds the bitfield size of ParmVarDeclBits.ParameterIndex. diff --git a/clang/include/clang/AST/DeclCXX.h b/clang/include/clang/AST/DeclCXX.h index e32101bb2276..89006b1cfa7f 100644 --- a/clang/include/clang/AST/DeclCXX.h +++ b/clang/include/clang/AST/DeclCXX.h @@ -1735,6 +1735,12 @@ class CXXRecordDecl : public RecordDecl { getLambdaData().HasKnownInternalLinkage = HasKnownInternalLinkage; } + /// Set the device side mangling number. + void setDeviceLambdaManglingNumber(unsigned Num) const; + + /// Retrieve the device side mangling number. + unsigned getDeviceLambdaManglingNumber() const; + /// Returns the inheritance model used for this record. MSInheritanceModel getMSInheritanceModel() const; diff --git a/clang/include/clang/AST/Mangle.h b/clang/include/clang/AST/Mangle.h index 6506ad542cc3..13b436cdca3e 100644 --- a/clang/include/clang/AST/Mangle.h +++ b/clang/include/clang/AST/Mangle.h @@ -107,6 +107,9 @@ class MangleContext { virtual bool shouldMangleCXXName(const NamedDecl *D) = 0; virtual bool shouldMangleStringLiteral(const StringLiteral *SL) = 0; + virtual bool isDeviceMangleContext() const { return false; } + virtual void setDeviceMangleContext(bool) {} + // FIXME: consider replacing raw_ostream & with something like SmallString &. void mangleName(GlobalDecl GD, raw_ostream &); virtual void mangleCXXName(GlobalDecl GD, raw_ostream &) = 0; diff --git a/clang/include/clang/AST/MangleNumberingContext.h b/clang/include/clang/AST/MangleNumberingContext.h index f1ca6a05dbaf..eb33759682d6 100644 --- a/clang/include/clang/AST/MangleNumberingContext.h +++ b/clang/include/clang/AST/MangleNumberingContext.h @@ -52,6 +52,11 @@ class MangleNumberingContext { /// this context. virtual unsigned getManglingNumber(const TagDecl *TD, unsigned MSLocalManglingNumber) = 0; + + /// Retrieve the mangling number of a new lambda expression with the + /// given call operator within the device context. No device number is + /// assigned if there's no device numbering context is associated. + virtual unsigned getDeviceManglingNumber(const CXXMethodDecl *) { return 0; } }; } // end namespace clang diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h index ea20ada56abc..68420fcbb85f 100644 --- a/clang/include/clang/Sema/Sema.h +++
[clang] a2fdf9d - [hip][cuda] Enable extended lambda support on Windows.
Author: Michael Liao Date: 2021-02-04T01:38:29-05:00 New Revision: a2fdf9d4d734732a6fa9288f1ffdf12bf8618123 URL: https://github.com/llvm/llvm-project/commit/a2fdf9d4d734732a6fa9288f1ffdf12bf8618123 DIFF: https://github.com/llvm/llvm-project/commit/a2fdf9d4d734732a6fa9288f1ffdf12bf8618123.diff LOG: [hip][cuda] Enable extended lambda support on Windows. - On Windows, extended lambda has extra issues due to the numbering schemes are different between the host compilation (Microsoft C++ ABI) and the device compilation (Itanium C++ ABI. Additional device side lambda number is required per lambda for the host compilation to correctly mangle the device-side lambda name. - A hybrid numbering context `MSHIPNumberingContext` is introduced to number a lambda for both host- and device-compilations. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D69322 Added: Modified: clang/include/clang/AST/ASTContext.h clang/include/clang/AST/DeclCXX.h clang/include/clang/AST/Mangle.h clang/include/clang/AST/MangleNumberingContext.h clang/include/clang/Sema/Sema.h clang/lib/AST/ASTImporter.cpp clang/lib/AST/CXXABI.h clang/lib/AST/DeclCXX.cpp clang/lib/AST/ItaniumCXXABI.cpp clang/lib/AST/ItaniumMangle.cpp clang/lib/AST/MicrosoftCXXABI.cpp clang/lib/CodeGen/CGCUDANV.cpp clang/lib/Sema/SemaLambda.cpp clang/lib/Sema/TreeTransform.h clang/lib/Serialization/ASTReaderDecl.cpp clang/lib/Serialization/ASTWriter.cpp clang/test/CodeGenCUDA/ms-linker-options.cu clang/test/CodeGenCUDA/unnamed-types.cu Removed: diff --git a/clang/include/clang/AST/ASTContext.h b/clang/include/clang/AST/ASTContext.h index ce47d54e44b0..ae69a68608b7 100644 --- a/clang/include/clang/AST/ASTContext.h +++ b/clang/include/clang/AST/ASTContext.h @@ -538,6 +538,9 @@ class ASTContext : public RefCountedBase { /// need them (like static local vars). llvm::MapVector MangleNumbers; llvm::MapVector StaticLocalNumbers; + /// Mapping the associated device lambda mangling number if present. + mutable llvm::DenseMap + DeviceLambdaManglingNumbers; /// Mapping that stores parameterIndex values for ParmVarDecls when /// that value exceeds the bitfield size of ParmVarDeclBits.ParameterIndex. diff --git a/clang/include/clang/AST/DeclCXX.h b/clang/include/clang/AST/DeclCXX.h index e32101bb2276..89006b1cfa7f 100644 --- a/clang/include/clang/AST/DeclCXX.h +++ b/clang/include/clang/AST/DeclCXX.h @@ -1735,6 +1735,12 @@ class CXXRecordDecl : public RecordDecl { getLambdaData().HasKnownInternalLinkage = HasKnownInternalLinkage; } + /// Set the device side mangling number. + void setDeviceLambdaManglingNumber(unsigned Num) const; + + /// Retrieve the device side mangling number. + unsigned getDeviceLambdaManglingNumber() const; + /// Returns the inheritance model used for this record. MSInheritanceModel getMSInheritanceModel() const; diff --git a/clang/include/clang/AST/Mangle.h b/clang/include/clang/AST/Mangle.h index 6506ad542cc3..13b436cdca3e 100644 --- a/clang/include/clang/AST/Mangle.h +++ b/clang/include/clang/AST/Mangle.h @@ -107,6 +107,9 @@ class MangleContext { virtual bool shouldMangleCXXName(const NamedDecl *D) = 0; virtual bool shouldMangleStringLiteral(const StringLiteral *SL) = 0; + virtual bool isDeviceMangleContext() const { return false; } + virtual void setDeviceMangleContext(bool) {} + // FIXME: consider replacing raw_ostream & with something like SmallString &. void mangleName(GlobalDecl GD, raw_ostream &); virtual void mangleCXXName(GlobalDecl GD, raw_ostream &) = 0; diff --git a/clang/include/clang/AST/MangleNumberingContext.h b/clang/include/clang/AST/MangleNumberingContext.h index f1ca6a05dbaf..eb33759682d6 100644 --- a/clang/include/clang/AST/MangleNumberingContext.h +++ b/clang/include/clang/AST/MangleNumberingContext.h @@ -52,6 +52,11 @@ class MangleNumberingContext { /// this context. virtual unsigned getManglingNumber(const TagDecl *TD, unsigned MSLocalManglingNumber) = 0; + + /// Retrieve the mangling number of a new lambda expression with the + /// given call operator within the device context. No device number is + /// assigned if there's no device numbering context is associated. + virtual unsigned getDeviceManglingNumber(const CXXMethodDecl *) { return 0; } }; } // end namespace clang diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h index 2fca81d25345..1c4942a37112 100644 --- a/clang/include/clang/Sema/Sema.h +++ b/clang/include/clang/Sema/Sema.h @@ -6558,7 +6558,7 @@ class Sema final { /// Number lambda for linkage purposes if necessary. void handleLambdaNumbering( CXXRecordDecl *Class, CXXMethodDecl *Method, - Optional> Mangling = None); + Optional> Mangling = None);
[clang] 7b5d7c7 - [hip] Fix `` compilation on Windows with VS2019.
Author: Michael Liao Date: 2021-01-20T16:43:44-05:00 New Revision: 7b5d7c7b0a2479de007ad18b947459b71667 URL: https://github.com/llvm/llvm-project/commit/7b5d7c7b0a2479de007ad18b947459b71667 DIFF: https://github.com/llvm/llvm-project/commit/7b5d7c7b0a2479de007ad18b947459b71667.diff LOG: [hip] Fix `` compilation on Windows with VS2019. Differential Revision: https://reviews.llvm.org/D95075 Added: Modified: clang/lib/Headers/__clang_hip_cmath.h Removed: diff --git a/clang/lib/Headers/__clang_hip_cmath.h b/clang/lib/Headers/__clang_hip_cmath.h index 128d64e271b8..cd22a2df954b 100644 --- a/clang/lib/Headers/__clang_hip_cmath.h +++ b/clang/lib/Headers/__clang_hip_cmath.h @@ -626,6 +626,13 @@ _GLIBCXX_END_NAMESPACE_VERSION // Define device-side math functions from on MSVC. #if defined(_MSC_VER) + +// Before VS2019, `` is also included in `` and other headers. +// But, from VS2019, it's only included in ``. Need to include +// `` here to ensure C functions declared there won't be markded as +// `__host__` and `__device__` through `` wrapper. +#include + #if defined(__cplusplus) extern "C" { #endif // defined(__cplusplus) ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] f78d6af - [hip] Enable HIP compilation with ` on MSVC.
Author: Michael Liao Date: 2021-01-07T17:41:28-05:00 New Revision: f78d6af7319aa676a0f9f6cbb982f21c96e9aac5 URL: https://github.com/llvm/llvm-project/commit/f78d6af7319aa676a0f9f6cbb982f21c96e9aac5 DIFF: https://github.com/llvm/llvm-project/commit/f78d6af7319aa676a0f9f6cbb982f21c96e9aac5.diff LOG: [hip] Enable HIP compilation with ` on MSVC. - MSVC has different `` implementation which calls into functions declared in ``. Provide their device-side implementation to enable `` compilation on HIP Windows. Differential Revision: https://reviews.llvm.org/D93638 Added: Modified: clang/lib/Headers/__clang_hip_cmath.h Removed: diff --git a/clang/lib/Headers/__clang_hip_cmath.h b/clang/lib/Headers/__clang_hip_cmath.h index 3a702587ee17..128d64e271b8 100644 --- a/clang/lib/Headers/__clang_hip_cmath.h +++ b/clang/lib/Headers/__clang_hip_cmath.h @@ -624,6 +624,34 @@ _GLIBCXX_END_NAMESPACE_VERSION } // namespace std #endif +// Define device-side math functions from on MSVC. +#if defined(_MSC_VER) +#if defined(__cplusplus) +extern "C" { +#endif // defined(__cplusplus) +__DEVICE__ __attribute__((overloadable)) double _Cosh(double x, double y) { + return cosh(x) * y; +} +__DEVICE__ __attribute__((overloadable)) float _FCosh(float x, float y) { + return coshf(x) * y; +} +__DEVICE__ __attribute__((overloadable)) short _Dtest(double *p) { + return fpclassify(*p); +} +__DEVICE__ __attribute__((overloadable)) short _FDtest(float *p) { + return fpclassify(*p); +} +__DEVICE__ __attribute__((overloadable)) double _Sinh(double x, double y) { + return sinh(x) * y; +} +__DEVICE__ __attribute__((overloadable)) float _FSinh(float x, float y) { + return sinhf(x) * y; +} +#if defined(__cplusplus) +} +#endif // defined(__cplusplus) +#endif // defined(_MSC_VER) + #pragma pop_macro("__DEVICE__") #endif // __CLANG_HIP_CMATH_H__ ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 2a29ce3 - [hip] Fix HIP version parsing.
Author: Michael Liao Date: 2021-01-06T17:00:14-05:00 New Revision: 2a29ce303451375bbf1de7c971296553ef5d9beb URL: https://github.com/llvm/llvm-project/commit/2a29ce303451375bbf1de7c971296553ef5d9beb DIFF: https://github.com/llvm/llvm-project/commit/2a29ce303451375bbf1de7c971296553ef5d9beb.diff LOG: [hip] Fix HIP version parsing. - Need trimming before parsing major or minor version numbers. This's required due to the different line ending on Windows. - In addition, the integer conversion may fail due to invalid char. Return that parsing function return `true` when the parsing fails. Differential Revision: https://reviews.llvm.org/D93587 Added: Modified: clang/lib/Driver/ToolChains/AMDGPU.cpp clang/lib/Driver/ToolChains/ROCm.h clang/test/Driver/Inputs/rocm/bin/.hipVersion Removed: diff --git a/clang/lib/Driver/ToolChains/AMDGPU.cpp b/clang/lib/Driver/ToolChains/AMDGPU.cpp index 565a77e07fd8..0971a2da62a3 100644 --- a/clang/lib/Driver/ToolChains/AMDGPU.cpp +++ b/clang/lib/Driver/ToolChains/AMDGPU.cpp @@ -88,23 +88,30 @@ void RocmInstallationDetector::scanLibDevicePath(llvm::StringRef Path) { } } -void RocmInstallationDetector::ParseHIPVersionFile(llvm::StringRef V) { +// Parse and extract version numbers from `.hipVersion`. Return `true` if +// the parsing fails. +bool RocmInstallationDetector::parseHIPVersionFile(llvm::StringRef V) { SmallVector VersionParts; V.split(VersionParts, '\n'); - unsigned Major; - unsigned Minor; + unsigned Major = ~0U; + unsigned Minor = ~0U; for (auto Part : VersionParts) { -auto Splits = Part.split('='); -if (Splits.first == "HIP_VERSION_MAJOR") - Splits.second.getAsInteger(0, Major); -else if (Splits.first == "HIP_VERSION_MINOR") - Splits.second.getAsInteger(0, Minor); -else if (Splits.first == "HIP_VERSION_PATCH") +auto Splits = Part.rtrim().split('='); +if (Splits.first == "HIP_VERSION_MAJOR") { + if (Splits.second.getAsInteger(0, Major)) +return true; +} else if (Splits.first == "HIP_VERSION_MINOR") { + if (Splits.second.getAsInteger(0, Minor)) +return true; +} else if (Splits.first == "HIP_VERSION_PATCH") VersionPatch = Splits.second.str(); } + if (Major == ~0U || Minor == ~0U) +return true; VersionMajorMinor = llvm::VersionTuple(Major, Minor); DetectedVersion = (Twine(Major) + "." + Twine(Minor) + "." + VersionPatch).str(); + return false; } // For candidate specified by --rocm-path we do not do strict check. @@ -290,7 +297,8 @@ void RocmInstallationDetector::detectHIPRuntime() { continue; if (HIPVersionArg.empty() && VersionFile) - ParseHIPVersionFile((*VersionFile)->getBuffer()); + if (parseHIPVersionFile((*VersionFile)->getBuffer())) +continue; HasHIPRuntime = true; return; diff --git a/clang/lib/Driver/ToolChains/ROCm.h b/clang/lib/Driver/ToolChains/ROCm.h index 27c7d8b0ee54..21e62a465d7b 100644 --- a/clang/lib/Driver/ToolChains/ROCm.h +++ b/clang/lib/Driver/ToolChains/ROCm.h @@ -103,7 +103,7 @@ class RocmInstallationDetector { } void scanLibDevicePath(llvm::StringRef Path); - void ParseHIPVersionFile(llvm::StringRef V); + bool parseHIPVersionFile(llvm::StringRef V); SmallVector getInstallationPathCandidates(); public: diff --git a/clang/test/Driver/Inputs/rocm/bin/.hipVersion b/clang/test/Driver/Inputs/rocm/bin/.hipVersion index 48ee6f10c3e4..677293c09139 100644 --- a/clang/test/Driver/Inputs/rocm/bin/.hipVersion +++ b/clang/test/Driver/Inputs/rocm/bin/.hipVersion @@ -1,4 +1,6 @@ # Auto-generated by cmake -HIP_VERSION_MAJOR=3 +# NOTE: The trailing whitespace is added on purpose to verify that these +# whitespaces are trimmed before paring. +HIP_VERSION_MAJOR=3 HIP_VERSION_MINOR=6 HIP_VERSION_PATCH=20214-a2917cd ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] bb8d20d - [cuda][hip] Fix typoes in header wrappers.
Author: Michael Liao Date: 2020-12-21T13:02:47-05:00 New Revision: bb8d20d9f3bb955ae6f6143d24749faf61d573a9 URL: https://github.com/llvm/llvm-project/commit/bb8d20d9f3bb955ae6f6143d24749faf61d573a9 DIFF: https://github.com/llvm/llvm-project/commit/bb8d20d9f3bb955ae6f6143d24749faf61d573a9.diff LOG: [cuda][hip] Fix typoes in header wrappers. Added: Modified: clang/lib/Headers/cuda_wrappers/algorithm clang/lib/Headers/cuda_wrappers/new Removed: diff --git a/clang/lib/Headers/cuda_wrappers/algorithm b/clang/lib/Headers/cuda_wrappers/algorithm index 01af18360d8d..f14a0b00bb04 100644 --- a/clang/lib/Headers/cuda_wrappers/algorithm +++ b/clang/lib/Headers/cuda_wrappers/algorithm @@ -1,4 +1,4 @@ -/*=== complex - CUDA wrapper for === +/*=== algorithm - CUDA wrapper for -=== * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal diff --git a/clang/lib/Headers/cuda_wrappers/new b/clang/lib/Headers/cuda_wrappers/new index 7f255314056a..d5fb3b7011de 100644 --- a/clang/lib/Headers/cuda_wrappers/new +++ b/clang/lib/Headers/cuda_wrappers/new @@ -1,4 +1,4 @@ -/*=== complex - CUDA wrapper for --=== +/*=== new - CUDA wrapper for -=== * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] d8949a8 - [hip] Fix host object creation from fatbin
Author: Michael Liao Date: 2020-12-02T10:36:01-05:00 New Revision: d8949a8ad3ca2a39ffe69df76e2c3f5fd73efec0 URL: https://github.com/llvm/llvm-project/commit/d8949a8ad3ca2a39ffe69df76e2c3f5fd73efec0 DIFF: https://github.com/llvm/llvm-project/commit/d8949a8ad3ca2a39ffe69df76e2c3f5fd73efec0.diff LOG: [hip] Fix host object creation from fatbin - `__hip_fatbin` should a symbol in `.hip_fatbin` section. Differential Revision: https://reviews.llvm.org/D92418 Added: Modified: clang/lib/Driver/ToolChains/HIP.cpp Removed: diff --git a/clang/lib/Driver/ToolChains/HIP.cpp b/clang/lib/Driver/ToolChains/HIP.cpp index a06835eee024..fc1103b48a99 100644 --- a/clang/lib/Driver/ToolChains/HIP.cpp +++ b/clang/lib/Driver/ToolChains/HIP.cpp @@ -178,8 +178,7 @@ void AMDGCN::Linker::constructGenerateObjFileFromHIPFatBinary( ObjStream << "# HIP Object Generator\n"; ObjStream << "# *** Automatically generated by Clang ***\n"; ObjStream << " .type __hip_fatbin,@object\n"; - ObjStream << " .section .hip_fatbin,\"aMS\",@progbits,1\n"; - ObjStream << " .data\n"; + ObjStream << " .section .hip_fatbin,\"a\",@progbits\n"; ObjStream << " .globl __hip_fatbin\n"; ObjStream << " .p2align " << llvm::Log2(llvm::Align(HIPCodeObjectAlign)) << "\n"; ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] f375885 - [InferAddrSpace] Teach to handle assumed address space.
Author: Michael Liao Date: 2020-11-16T17:06:33-05:00 New Revision: f375885ab86d1b3e82269725c8e9aa49f347b4a7 URL: https://github.com/llvm/llvm-project/commit/f375885ab86d1b3e82269725c8e9aa49f347b4a7 DIFF: https://github.com/llvm/llvm-project/commit/f375885ab86d1b3e82269725c8e9aa49f347b4a7.diff LOG: [InferAddrSpace] Teach to handle assumed address space. - In certain cases, a generic pointer could be assumed as a pointer to the global memory space or other spaces. With a dedicated target hook to query that address space from a given value, infer-address-space pass could infer and propagate that to all its users. Differential Revision: https://reviews.llvm.org/D91121 Added: llvm/test/Transforms/InferAddressSpaces/AMDGPU/assumed-addrspace.ll Modified: clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu llvm/docs/AMDGPUUsage.rst llvm/include/llvm/Analysis/TargetTransformInfo.h llvm/include/llvm/Analysis/TargetTransformInfoImpl.h llvm/include/llvm/CodeGen/BasicTTIImpl.h llvm/include/llvm/Target/TargetMachine.h llvm/lib/Analysis/TargetTransformInfo.cpp llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp llvm/test/CodeGen/AMDGPU/GlobalISel/divergent-control-flow.ll Removed: diff --git a/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu b/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu index dc4659856026..da1f4b65f719 100644 --- a/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu +++ b/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu @@ -56,20 +56,24 @@ struct S { int *x; float *y; }; -// `by-val` struct will be coerced into a similar struct with all generic -// pointers lowerd into global ones. +// `by-val` struct is passed by-indirect-alias (a mix of by-ref and indirect +// by-val). However, the enhanced address inferring pass should be able to +// assume they are global pointers. +// // HOST: define void @_Z22__device_stub__kernel41S(i32* %s.coerce0, float* %s.coerce1) // COMMON-LABEL: define amdgpu_kernel void @_Z7kernel41S(%struct.S addrspace(4)*{{.*}} byref(%struct.S) align 8 %0) // OPT: [[R0:%.*]] = getelementptr inbounds %struct.S, %struct.S addrspace(4)* %0, i64 0, i32 0 // OPT: [[P0:%.*]] = load i32*, i32* addrspace(4)* [[R0]], align 8 +// OPT: [[G0:%.*]] = addrspacecast i32* [[P0]] to i32 addrspace(1)* // OPT: [[R1:%.*]] = getelementptr inbounds %struct.S, %struct.S addrspace(4)* %0, i64 0, i32 1 // OPT: [[P1:%.*]] = load float*, float* addrspace(4)* [[R1]], align 8 -// OPT: [[V0:%.*]] = load i32, i32* [[P0]], align 4 +// OPT: [[G1:%.*]] = addrspacecast float* [[P1]] to float addrspace(1)* +// OPT: [[V0:%.*]] = load i32, i32 addrspace(1)* [[G0]], align 4 // OPT: [[INC:%.*]] = add nsw i32 [[V0]], 1 -// OPT: store i32 [[INC]], i32* [[P0]], align 4 -// OPT: [[V1:%.*]] = load float, float* [[P1]], align 4 +// OPT: store i32 [[INC]], i32 addrspace(1)* [[G0]], align 4 +// OPT: [[V1:%.*]] = load float, float addrspace(1)* [[G1]], align 4 // OPT: [[ADD:%.*]] = fadd contract float [[V1]], 1.00e+00 -// OPT: store float [[ADD]], float* [[P1]], align 4 +// OPT: store float [[ADD]], float addrspace(1)* [[G1]], align 4 // OPT: ret void __global__ void kernel4(struct S s) { s.x[0]++; @@ -87,19 +91,24 @@ __global__ void kernel5(struct S *s) { struct T { float *x[2]; }; -// `by-val` array is also coerced. +// `by-val` array is passed by-indirect-alias (a mix of by-ref and indirect +// by-val). However, the enhanced address inferring pass should be able to +// assume they are global pointers. +// // HOST: define void @_Z22__device_stub__kernel61T(float* %t.coerce0, float* %t.coerce1) // COMMON-LABEL: define amdgpu_kernel void @_Z7kernel61T(%struct.T addrspace(4)*{{.*}} byref(%struct.T) align 8 %0) // OPT: [[R0:%.*]] = getelementptr inbounds %struct.T, %struct.T addrspace(4)* %0, i64 0, i32 0, i64 0 // OPT: [[P0:%.*]] = load float*, float* addrspace(4)* [[R0]], align 8 +// OPT: [[G0:%.*]] = addrspacecast float* [[P0]] to float addrspace(1)* // OPT: [[R1:%.*]] = getelementptr inbounds %struct.T, %struct.T addrspace(4)* %0, i64 0, i32 0, i64 1 // OPT: [[P1:%.*]] = load float*, float* addrspace(4)* [[R1]], align 8 -// OPT: [[V0:%.*]] = load float, float* [[P0]], align 4 +// OPT: [[G1:%.*]] = addrspacecast float* [[P1]] to float addrspace(1)* +// OPT: [[V0:%.*]] = load float, float addrspace(1)* [[G0]], align 4 // OPT: [[ADD0:%.*]] = fadd contract float [[V0]], 1.00e+00 -// OPT: store float [[ADD0]], float* [[P0]], align 4 -// OPT: [[V1:%.*]] = load float, float* [[P1]], align 4 +// OPT: store float [[ADD0]], float addrspace(1)* [[G0]], align 4 +// OPT: [[V1:%.*]] = load float, float addrspace(1)* [[G1]], align 4 // OPT: [[ADD1:%.*]] = fadd contract float [[V1]], 2.00e+00 -// OPT: store float
[clang] 8920ef0 - [hip] Remove the coercion on aggregate kernel arguments.
Author: Michael Liao Date: 2020-11-12T21:19:30-05:00 New Revision: 8920ef06a138c46b208fb6471d500261c4b9bacc URL: https://github.com/llvm/llvm-project/commit/8920ef06a138c46b208fb6471d500261c4b9bacc DIFF: https://github.com/llvm/llvm-project/commit/8920ef06a138c46b208fb6471d500261c4b9bacc.diff LOG: [hip] Remove the coercion on aggregate kernel arguments. - If an aggregate argument is indirectly accessed within kernels, direct passing results in unpromotable `alloca`, which degrade performance significantly. InferAddrSpace pass is enhanced in [D91121](https://reviews.llvm.org/D91121) to take the assumption that generic pointers loaded from the constant memory could be regarded global ones. The need for the coercion on aggregate arguments is mitigated. Differential Revision: https://reviews.llvm.org/D89980 Added: Modified: clang/lib/CodeGen/TargetInfo.cpp clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu clang/test/CodeGenCUDA/kernel-args.cu Removed: diff --git a/clang/lib/CodeGen/TargetInfo.cpp b/clang/lib/CodeGen/TargetInfo.cpp index 63502ccf7a38..1e5920322ecd 100644 --- a/clang/lib/CodeGen/TargetInfo.cpp +++ b/clang/lib/CodeGen/TargetInfo.cpp @@ -8712,35 +8712,9 @@ class AMDGPUABIInfo final : public DefaultABIInfo { bool isHomogeneousAggregateSmallEnough(const Type *Base, uint64_t Members) const override; - // Coerce HIP pointer arguments from generic pointers to global ones. + // Coerce HIP scalar pointer arguments from generic pointers to global ones. llvm::Type *coerceKernelArgumentType(llvm::Type *Ty, unsigned FromAS, unsigned ToAS) const { -// Structure types. -if (auto STy = dyn_cast(Ty)) { - SmallVector EltTys; - bool Changed = false; - for (auto T : STy->elements()) { -auto NT = coerceKernelArgumentType(T, FromAS, ToAS); -EltTys.push_back(NT); -Changed |= (NT != T); - } - // Skip if there is no change in element types. - if (!Changed) -return STy; - if (STy->hasName()) -return llvm::StructType::create( -EltTys, (STy->getName() + ".coerce").str(), STy->isPacked()); - return llvm::StructType::get(getVMContext(), EltTys, STy->isPacked()); -} -// Array types. -if (auto ATy = dyn_cast(Ty)) { - auto T = ATy->getElementType(); - auto NT = coerceKernelArgumentType(T, FromAS, ToAS); - // Skip if there is no change in that element type. - if (NT == T) -return ATy; - return llvm::ArrayType::get(NT, ATy->getNumElements()); -} // Single value types. if (Ty->isPointerTy() && Ty->getPointerAddressSpace() == FromAS) return llvm::PointerType::get( diff --git a/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu b/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu index 2660a5f14f90..dc4659856026 100644 --- a/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu +++ b/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu @@ -9,8 +9,6 @@ // Coerced struct from `struct S` without all generic pointers lowered into // global ones. -// COMMON: %struct.S.coerce = type { i32 addrspace(1)*, float addrspace(1)* } -// COMMON: %struct.T.coerce = type { [2 x float addrspace(1)*] } // On the host-side compilation, generic pointer won't be coerced. // HOST-NOT: %struct.S.coerce @@ -61,15 +59,17 @@ struct S { // `by-val` struct will be coerced into a similar struct with all generic // pointers lowerd into global ones. // HOST: define void @_Z22__device_stub__kernel41S(i32* %s.coerce0, float* %s.coerce1) -// COMMON-LABEL: define amdgpu_kernel void @_Z7kernel41S(%struct.S.coerce %s.coerce) -// OPT: [[P0:%.*]] = extractvalue %struct.S.coerce %s.coerce, 0 -// OPT: [[P1:%.*]] = extractvalue %struct.S.coerce %s.coerce, 1 -// OPT: [[V0:%.*]] = load i32, i32 addrspace(1)* [[P0]], align 4 +// COMMON-LABEL: define amdgpu_kernel void @_Z7kernel41S(%struct.S addrspace(4)*{{.*}} byref(%struct.S) align 8 %0) +// OPT: [[R0:%.*]] = getelementptr inbounds %struct.S, %struct.S addrspace(4)* %0, i64 0, i32 0 +// OPT: [[P0:%.*]] = load i32*, i32* addrspace(4)* [[R0]], align 8 +// OPT: [[R1:%.*]] = getelementptr inbounds %struct.S, %struct.S addrspace(4)* %0, i64 0, i32 1 +// OPT: [[P1:%.*]] = load float*, float* addrspace(4)* [[R1]], align 8 +// OPT: [[V0:%.*]] = load i32, i32* [[P0]], align 4 // OPT: [[INC:%.*]] = add nsw i32 [[V0]], 1 -// OPT: store i32 [[INC]], i32 addrspace(1)* [[P0]], align 4 -// OPT: [[V1:%.*]] = load float, float addrspace(1)* [[P1]], align 4 +// OPT: store i32 [[INC]], i32* [[P0]], align 4 +// OPT: [[V1:%.*]] = load float, float* [[P1]], align 4 // OPT: [[ADD:%.*]] = fadd contract float [[V1]], 1.00e+00 -// OPT: store float [[ADD]], float addrspace(1)* [[P1]], align 4 +// OPT: store float
[clang] 23c6d15 - [amdgpu] Add `llvm.amdgcn.endpgm` support.
Author: Michael Liao Date: 2020-11-05T19:06:50-05:00 New Revision: 23c6d1501d80073784cab367d30d50419ffa5706 URL: https://github.com/llvm/llvm-project/commit/23c6d1501d80073784cab367d30d50419ffa5706 DIFF: https://github.com/llvm/llvm-project/commit/23c6d1501d80073784cab367d30d50419ffa5706.diff LOG: [amdgpu] Add `llvm.amdgcn.endpgm` support. - `llvm.amdgcn.endpgm` is added to enable "abort" support. Differential Revision: https://reviews.llvm.org/D90809 Added: llvm/test/CodeGen/AMDGPU/amd.endpgm.ll Modified: clang/include/clang/Basic/BuiltinsAMDGPU.def clang/test/CodeGenCUDA/builtins-amdgcn.cu llvm/include/llvm/IR/IntrinsicsAMDGPU.td llvm/lib/Target/AMDGPU/SOPInstructions.td Removed: diff --git a/clang/include/clang/Basic/BuiltinsAMDGPU.def b/clang/include/clang/Basic/BuiltinsAMDGPU.def index f5901e6f8f3b..123a7ad212da 100644 --- a/clang/include/clang/Basic/BuiltinsAMDGPU.def +++ b/clang/include/clang/Basic/BuiltinsAMDGPU.def @@ -214,6 +214,8 @@ BUILTIN(__builtin_amdgcn_read_exec, "LUi", "nc") BUILTIN(__builtin_amdgcn_read_exec_lo, "Ui", "nc") BUILTIN(__builtin_amdgcn_read_exec_hi, "Ui", "nc") +BUILTIN(__builtin_amdgcn_endpgm, "v", "nr") + //===--===// // R600-NI only builtins. //===--===// diff --git a/clang/test/CodeGenCUDA/builtins-amdgcn.cu b/clang/test/CodeGenCUDA/builtins-amdgcn.cu index 1c3a79064595..8f0d0d0801bd 100644 --- a/clang/test/CodeGenCUDA/builtins-amdgcn.cu +++ b/clang/test/CodeGenCUDA/builtins-amdgcn.cu @@ -16,3 +16,9 @@ void test_ds_fmax(float src) { __shared__ float shared; volatile float x = __builtin_amdgcn_ds_fmaxf(, src, 0, 0, false); } + +// CHECK-LABEL: @_Z6endpgmv( +// CHECK: call void @llvm.amdgcn.endpgm() +__global__ void endpgm() { + __builtin_amdgcn_endpgm(); +} diff --git a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td index 304377ce28ab..bc04fa40f2a8 100644 --- a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td +++ b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td @@ -1577,6 +1577,10 @@ def int_amdgcn_wqm_vote : Intrinsic<[llvm_i1_ty], // FIXME: Should this be IntrNoMem, IntrHasSideEffects, or IntrWillReturn? def int_amdgcn_kill : Intrinsic<[], [llvm_i1_ty], []>; +def int_amdgcn_endpgm : GCCBuiltin<"__builtin_amdgcn_endpgm">, + Intrinsic<[], [], [IntrNoReturn, IntrCold, IntrNoMem, IntrHasSideEffects] +>; + // Copies the active channels of the source value to the destination value, // with the guarantee that the source value is computed as if the entire // program were executed in Whole Wavefront Mode, i.e. with all channels diff --git a/llvm/lib/Target/AMDGPU/SOPInstructions.td b/llvm/lib/Target/AMDGPU/SOPInstructions.td index 08966d7d62eb..00527171ff11 100644 --- a/llvm/lib/Target/AMDGPU/SOPInstructions.td +++ b/llvm/lib/Target/AMDGPU/SOPInstructions.td @@ -1118,6 +1118,7 @@ let isTerminator = 1 in { def S_ENDPGM : SOPP_Pseudo<"s_endpgm", (ins EndpgmImm:$simm16), "$simm16"> { let isBarrier = 1; let isReturn = 1; + let hasSideEffects = 1; } def S_ENDPGM_SAVED : SOPP_Pseudo<"s_endpgm_saved", (ins)> { @@ -1328,6 +1329,11 @@ def : GCNPat < (S_ENDPGM (i16 0)) >; +def : GCNPat < + (int_amdgcn_endpgm), +(S_ENDPGM (i16 0)) +>; + def : GCNPat < (i64 (ctpop i64:$src)), (i64 (REG_SEQUENCE SReg_64, diff --git a/llvm/test/CodeGen/AMDGPU/amd.endpgm.ll b/llvm/test/CodeGen/AMDGPU/amd.endpgm.ll new file mode 100644 index ..ac9cd0699118 --- /dev/null +++ b/llvm/test/CodeGen/AMDGPU/amd.endpgm.ll @@ -0,0 +1,50 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +; RUN: llc -march=amdgcn -mcpu=gfx900 -verify-machineinstrs < %s | FileCheck %s + +define amdgpu_kernel void @test0() { +; CHECK-LABEL: test0: +; CHECK: ; %bb.0: +; CHECK-NEXT:s_endpgm + tail call void @llvm.amdgcn.endpgm() + unreachable +} + +define void @test1() { +; CHECK-LABEL: test1: +; CHECK: ; %bb.0: +; CHECK-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) +; CHECK-NEXT:s_endpgm + tail call void @llvm.amdgcn.endpgm() + unreachable +} + +define amdgpu_kernel void @test2(i32* %p, i32 %x) { +; CHECK-LABEL: test2: +; CHECK: ; %bb.0: +; CHECK-NEXT:s_load_dword s2, s[0:1], 0x2c +; CHECK-NEXT:s_waitcnt lgkmcnt(0) +; CHECK-NEXT:s_cmp_lt_i32 s2, 1 +; CHECK-NEXT:s_cbranch_scc0 BB2_2 +; CHECK-NEXT: ; %bb.1: ; %else +; CHECK-NEXT:s_load_dwordx2 s[0:1], s[0:1], 0x24 +; CHECK-NEXT:v_mov_b32_e32 v2, s2 +; CHECK-NEXT:s_waitcnt lgkmcnt(0) +; CHECK-NEXT:v_mov_b32_e32 v0, s0 +; CHECK-NEXT:v_mov_b32_e32 v1, s1 +; CHECK-NEXT:flat_store_dword v[0:1], v2 +; CHECK-NEXT:s_endpgm +; CHECK-NEXT: BB2_2: ; %then +; CHECK-NEXT:s_endpgm + %cond = icmp sgt i32 %x, 0 + br i1 %cond, label
[clang] 1bcec29 - Only run when `arm` is registered. NFC.
Author: Michael Liao Date: 2020-10-21T09:30:07-04:00 New Revision: 1bcec29afb321976cdcaa632ee6a47567dd651a7 URL: https://github.com/llvm/llvm-project/commit/1bcec29afb321976cdcaa632ee6a47567dd651a7 DIFF: https://github.com/llvm/llvm-project/commit/1bcec29afb321976cdcaa632ee6a47567dd651a7.diff LOG: Only run when `arm` is registered. NFC. Added: Modified: clang/test/Driver/arm-float-abi.c Removed: diff --git a/clang/test/Driver/arm-float-abi.c b/clang/test/Driver/arm-float-abi.c index 74ba3fd3bc57..294f02444769 100644 --- a/clang/test/Driver/arm-float-abi.c +++ b/clang/test/Driver/arm-float-abi.c @@ -1,3 +1,4 @@ +// REQUIRES: arm-registered-target // RUN: not %clang %s -target armv7-apple-ios -mfloat-abi=hard 2>&1 | FileCheck -check-prefix=ARMV7-ERROR %s // RUN: %clang %s -target armv7-apple-ios -mfloat-abi=softfp -### 2>&1 | FileCheck -check-prefix=NOERROR %s // RUN: %clang %s -arch armv7 -target thumbv7-apple-darwin-eabi -mfloat-abi=hard -### 2>&1 | FileCheck -check-prefix=NOERROR %s ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] e7a6915 - Revert "[clang] Fix warnings on the missing of explicitly copy constructor on the base class. NFC."
Author: Michael Liao Date: 2020-10-20T10:25:20-04:00 New Revision: e7a69158635a30cb673e443a3b95ece359c72cc1 URL: https://github.com/llvm/llvm-project/commit/e7a69158635a30cb673e443a3b95ece359c72cc1 DIFF: https://github.com/llvm/llvm-project/commit/e7a69158635a30cb673e443a3b95ece359c72cc1.diff LOG: Revert "[clang] Fix warnings on the missing of explicitly copy constructor on the base class. NFC." This reverts commit 1ed506deaddb41870d22f5b48d52ba710e8d6c00. Added: Modified: clang/include/clang/Basic/Diagnostic.h clang/include/clang/Basic/PartialDiagnostic.h Removed: diff --git a/clang/include/clang/Basic/Diagnostic.h b/clang/include/clang/Basic/Diagnostic.h index 3895e1f458948..f17b98f740385 100644 --- a/clang/include/clang/Basic/Diagnostic.h +++ b/clang/include/clang/Basic/Diagnostic.h @@ -1284,7 +1284,7 @@ class DiagnosticBuilder : public StreamingDiagnostic { public: /// Copy constructor. When copied, this "takes" the diagnostic info from the /// input and neuters it. - DiagnosticBuilder(const DiagnosticBuilder ) : StreamingDiagnostic(D) { + DiagnosticBuilder(const DiagnosticBuilder ) { DiagObj = D.DiagObj; DiagStorage = D.DiagStorage; IsActive = D.IsActive; diff --git a/clang/include/clang/Basic/PartialDiagnostic.h b/clang/include/clang/Basic/PartialDiagnostic.h index 9ddf64d2de2c5..9e017902b1205 100644 --- a/clang/include/clang/Basic/PartialDiagnostic.h +++ b/clang/include/clang/Basic/PartialDiagnostic.h @@ -49,8 +49,7 @@ class PartialDiagnostic : public StreamingDiagnostic { PartialDiagnostic(unsigned DiagID, DiagStorageAllocator _) : StreamingDiagnostic(Allocator_), DiagID(DiagID) {} - PartialDiagnostic(const PartialDiagnostic ) - : StreamingDiagnostic(Other), DiagID(Other.DiagID) { + PartialDiagnostic(const PartialDiagnostic ) : DiagID(Other.DiagID) { Allocator = Other.Allocator; if (Other.DiagStorage) { DiagStorage = getStorage(); ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 1ed506d - [clang] Fix warnings on the missing of explicitly copy constructor on the base class. NFC.
Author: Michael Liao Date: 2020-10-20T10:06:24-04:00 New Revision: 1ed506deaddb41870d22f5b48d52ba710e8d6c00 URL: https://github.com/llvm/llvm-project/commit/1ed506deaddb41870d22f5b48d52ba710e8d6c00 DIFF: https://github.com/llvm/llvm-project/commit/1ed506deaddb41870d22f5b48d52ba710e8d6c00.diff LOG: [clang] Fix warnings on the missing of explicitly copy constructor on the base class. NFC. Added: Modified: clang/include/clang/Basic/Diagnostic.h clang/include/clang/Basic/PartialDiagnostic.h Removed: diff --git a/clang/include/clang/Basic/Diagnostic.h b/clang/include/clang/Basic/Diagnostic.h index f17b98f74038..3895e1f45894 100644 --- a/clang/include/clang/Basic/Diagnostic.h +++ b/clang/include/clang/Basic/Diagnostic.h @@ -1284,7 +1284,7 @@ class DiagnosticBuilder : public StreamingDiagnostic { public: /// Copy constructor. When copied, this "takes" the diagnostic info from the /// input and neuters it. - DiagnosticBuilder(const DiagnosticBuilder ) { + DiagnosticBuilder(const DiagnosticBuilder ) : StreamingDiagnostic(D) { DiagObj = D.DiagObj; DiagStorage = D.DiagStorage; IsActive = D.IsActive; diff --git a/clang/include/clang/Basic/PartialDiagnostic.h b/clang/include/clang/Basic/PartialDiagnostic.h index 9e017902b120..9ddf64d2de2c 100644 --- a/clang/include/clang/Basic/PartialDiagnostic.h +++ b/clang/include/clang/Basic/PartialDiagnostic.h @@ -49,7 +49,8 @@ class PartialDiagnostic : public StreamingDiagnostic { PartialDiagnostic(unsigned DiagID, DiagStorageAllocator _) : StreamingDiagnostic(Allocator_), DiagID(DiagID) {} - PartialDiagnostic(const PartialDiagnostic ) : DiagID(Other.DiagID) { + PartialDiagnostic(const PartialDiagnostic ) + : StreamingDiagnostic(Other), DiagID(Other.DiagID) { Allocator = Other.Allocator; if (Other.DiagStorage) { DiagStorage = getStorage(); ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] b21ad3b - Fix `-Wparentheses` warnings. NFC.
Author: Michael Liao Date: 2020-10-14T10:11:19-04:00 New Revision: b21ad3b66bce942ee6e0f5b1fcfdea31928005a7 URL: https://github.com/llvm/llvm-project/commit/b21ad3b66bce942ee6e0f5b1fcfdea31928005a7 DIFF: https://github.com/llvm/llvm-project/commit/b21ad3b66bce942ee6e0f5b1fcfdea31928005a7.diff LOG: Fix `-Wparentheses` warnings. NFC. Added: Modified: clang/lib/Sema/SemaExpr.cpp Removed: diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp index a02db2293bcc..0407d5bb7f6c 100644 --- a/clang/lib/Sema/SemaExpr.cpp +++ b/clang/lib/Sema/SemaExpr.cpp @@ -6384,10 +6384,10 @@ ExprResult Sema::BuildCallExpr(Scope *Scope, Expr *Fn, SourceLocation LParenLoc, if (Context.isDependenceAllowed() && (Fn->isTypeDependent() || Expr::hasAnyTypeDependentArguments(ArgExprs))) { assert(!getLangOpts().CPlusPlus); -assert(Fn->containsErrors() || - llvm::any_of(ArgExprs, -[](clang::Expr *E) { return E->containsErrors(); }) && - "should only occur in error-recovery path."); +assert((Fn->containsErrors() || +llvm::any_of(ArgExprs, + [](clang::Expr *E) { return E->containsErrors(); })) && + "should only occur in error-recovery path."); QualType ReturnType = llvm::isa_and_nonnull(NDecl) ? dyn_cast(NDecl)->getCallResultType() ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 8c36eaf - [clang][opencl][codegen] Remove the insertion of `correctly-rounded-divide-sqrt-fp-math` fn-attr.
Author: Michael Liao Date: 2020-10-01T11:07:39-04:00 New Revision: 8c36eaf0377285acb89c319582d9666e60f42007 URL: https://github.com/llvm/llvm-project/commit/8c36eaf0377285acb89c319582d9666e60f42007 DIFF: https://github.com/llvm/llvm-project/commit/8c36eaf0377285acb89c319582d9666e60f42007.diff LOG: [clang][opencl][codegen] Remove the insertion of `correctly-rounded-divide-sqrt-fp-math` fn-attr. - `-cl-fp32-correctly-rounded-divide-sqrt` is already handled in a per-instruction manner by annotating the accuracy required. There's no need to add that fn-attr. So far, there's no in-tree backend handling that attr and that OpenCL specific option. - In case that out-of-tree backends are broken, this change could be reverted if those backends could not be fixed. Differential Revision: https://reviews.llvm.org/D88424 Added: Modified: clang/lib/CodeGen/CGCall.cpp clang/test/CodeGenOpenCL/amdgpu-attrs.cl clang/test/CodeGenOpenCL/fpmath.cl Removed: diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp index ec7ddf8b5d9e..cb03e025e19e 100644 --- a/clang/lib/CodeGen/CGCall.cpp +++ b/clang/lib/CodeGen/CGCall.cpp @@ -1794,11 +1794,6 @@ void CodeGenModule::getDefaultFunctionAttributes(StringRef Name, llvm::utostr(CodeGenOpts.SSPBufferSize)); FuncAttrs.addAttribute("no-signed-zeros-fp-math", llvm::toStringRef(LangOpts.NoSignedZero)); -if (getLangOpts().OpenCL) { - FuncAttrs.addAttribute( - "correctly-rounded-divide-sqrt-fp-math", - llvm::toStringRef(CodeGenOpts.CorrectlyRoundedDivSqrt)); -} // TODO: Reciprocal estimate codegen options should apply to instructions? const std::vector = CodeGenOpts.Reciprocals; diff --git a/clang/test/CodeGenOpenCL/amdgpu-attrs.cl b/clang/test/CodeGenOpenCL/amdgpu-attrs.cl index 13f8b1191c2b..9156c45f4939 100644 --- a/clang/test/CodeGenOpenCL/amdgpu-attrs.cl +++ b/clang/test/CodeGenOpenCL/amdgpu-attrs.cl @@ -190,5 +190,5 @@ kernel void default_kernel() { // CHECK-DAG: attributes [[FLAT_WORK_GROUP_SIZE_32_64_WAVES_PER_EU_2_NUM_SGPR_32_NUM_VGPR_64]] = {{.*}} "amdgpu-flat-work-group-size"="32,64" "amdgpu-implicitarg-num-bytes"="56" "amdgpu-num-sgpr"="32" "amdgpu-num-vgpr"="64" "amdgpu-waves-per-eu"="2" // CHECK-DAG: attributes [[FLAT_WORK_GROUP_SIZE_32_64_WAVES_PER_EU_2_4_NUM_SGPR_32_NUM_VGPR_64]] = {{.*}} "amdgpu-flat-work-group-size"="32,64" "amdgpu-implicitarg-num-bytes"="56" "amdgpu-num-sgpr"="32" "amdgpu-num-vgpr"="64" "amdgpu-waves-per-eu"="2,4" -// CHECK-DAG: attributes [[A_FUNCTION]] = {{.*}} "correctly-rounded-divide-sqrt-fp-math"="false" +// CHECK-DAG: attributes [[A_FUNCTION]] = {{.*}} // CHECK-DAG: attributes [[DEFAULT_KERNEL_ATTRS]] = {{.*}} "amdgpu-flat-work-group-size"="1,256" "amdgpu-implicitarg-num-bytes"="56" diff --git a/clang/test/CodeGenOpenCL/fpmath.cl b/clang/test/CodeGenOpenCL/fpmath.cl index 0108d909c94e..36cb8e68ea7c 100644 --- a/clang/test/CodeGenOpenCL/fpmath.cl +++ b/clang/test/CodeGenOpenCL/fpmath.cl @@ -7,7 +7,6 @@ typedef __attribute__(( ext_vector_type(4) )) float float4; float spscalardiv(float a, float b) { // CHECK: @spscalardiv - // CHECK: #[[ATTR:[0-9]+]] // CHECK: fdiv{{.*}}, // NODIVOPT: !fpmath ![[MD:[0-9]+]] // DIVOPT-NOT: !fpmath ![[MD:[0-9]+]] @@ -16,7 +15,6 @@ float spscalardiv(float a, float b) { float4 spvectordiv(float4 a, float4 b) { // CHECK: @spvectordiv - // CHECK: #[[ATTR2:[0-9]+]] // CHECK: fdiv{{.*}}, // NODIVOPT: !fpmath ![[MD]] // DIVOPT-NOT: !fpmath ![[MD]] @@ -38,18 +36,9 @@ void testdbllit(long *val) { #pragma OPENCL EXTENSION cl_khr_fp64 : enable double dpscalardiv(double a, double b) { // CHECK: @dpscalardiv - // CHECK: #[[ATTR]] // CHECK-NOT: !fpmath return a / b; } #endif -// CHECK: attributes #[[ATTR]] = { -// NODIVOPT-SAME: "correctly-rounded-divide-sqrt-fp-math"="false" -// DIVOPT-SAME: "correctly-rounded-divide-sqrt-fp-math"="true" -// CHECK-SAME: } -// CHECK: attributes #[[ATTR2]] = { -// NODIVOPT-SAME: "correctly-rounded-divide-sqrt-fp-math"="false" -// DIVOPT-SAME: "correctly-rounded-divide-sqrt-fp-math"="true" -// CHECK-SAME: } // NODIVOPT: ![[MD]] = !{float 2.50e+00} ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 5dbf80c - [clang][codegen] Annotate `correctly-rounded-divide-sqrt-fp-math` fn-attr for OpenCL only.
Author: Michael Liao Date: 2020-09-28T11:40:32-04:00 New Revision: 5dbf80cad9556e222c4383960007fc0b27ea9541 URL: https://github.com/llvm/llvm-project/commit/5dbf80cad9556e222c4383960007fc0b27ea9541 DIFF: https://github.com/llvm/llvm-project/commit/5dbf80cad9556e222c4383960007fc0b27ea9541.diff LOG: [clang][codegen] Annotate `correctly-rounded-divide-sqrt-fp-math` fn-attr for OpenCL only. - `-cl-fp32-correctly-rounded-divide-sqrt` is an OpenCL-specific option and `correctly-rounded-divide-sqrt-fp-math` should be added for OpenCL at most. Differential revision: https://reviews.llvm.org/D88303 Added: Modified: clang/lib/CodeGen/CGCall.cpp clang/test/CodeGen/complex-builtins.c clang/test/CodeGen/complex-libcalls.c clang/test/CodeGen/math-builtins.c clang/test/CodeGen/math-libcalls.c Removed: diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp index ff41764a4a4d..9ccbe87fab66 100644 --- a/clang/lib/CodeGen/CGCall.cpp +++ b/clang/lib/CodeGen/CGCall.cpp @@ -1794,9 +1794,11 @@ void CodeGenModule::getDefaultFunctionAttributes(StringRef Name, llvm::utostr(CodeGenOpts.SSPBufferSize)); FuncAttrs.addAttribute("no-signed-zeros-fp-math", llvm::toStringRef(LangOpts.NoSignedZero)); -FuncAttrs.addAttribute( -"correctly-rounded-divide-sqrt-fp-math", -llvm::toStringRef(CodeGenOpts.CorrectlyRoundedDivSqrt)); +if (getLangOpts().OpenCL) { + FuncAttrs.addAttribute( + "correctly-rounded-divide-sqrt-fp-math", + llvm::toStringRef(CodeGenOpts.CorrectlyRoundedDivSqrt)); +} // TODO: Reciprocal estimate codegen options should apply to instructions? const std::vector = CodeGenOpts.Reciprocals; diff --git a/clang/test/CodeGen/complex-builtins.c b/clang/test/CodeGen/complex-builtins.c index 7ee2d6d84857..96c0e7117016 100644 --- a/clang/test/CodeGen/complex-builtins.c +++ b/clang/test/CodeGen/complex-builtins.c @@ -197,10 +197,8 @@ void foo(float f) { // HAS_ERRNO: declare { x86_fp80, x86_fp80 } @ctanhl({ x86_fp80, x86_fp80 }* byval({ x86_fp80, x86_fp80 }) align 16) [[NOT_READNONE]] }; - // NO__ERRNO: attributes [[READNONE]] = { {{.*}}readnone{{.*}} } -// NO__ERRNO: attributes [[NOT_READNONE]] = { nounwind "correctly{{.*}} } +// NO__ERRNO: attributes [[NOT_READNONE]] = { nounwind {{.*}} } -// HAS_ERRNO: attributes [[NOT_READNONE]] = { nounwind "correctly{{.*}} } +// HAS_ERRNO: attributes [[NOT_READNONE]] = { nounwind {{.*}} } // HAS_ERRNO: attributes [[READNONE]] = { {{.*}}readnone{{.*}} } - diff --git a/clang/test/CodeGen/complex-libcalls.c b/clang/test/CodeGen/complex-libcalls.c index 248041788293..9bd419a83821 100644 --- a/clang/test/CodeGen/complex-libcalls.c +++ b/clang/test/CodeGen/complex-libcalls.c @@ -197,10 +197,8 @@ void foo(float f) { // HAS_ERRNO: declare { x86_fp80, x86_fp80 } @ctanhl({ x86_fp80, x86_fp80 }* byval({ x86_fp80, x86_fp80 }) align 16) [[NOT_READNONE]] }; - // NO__ERRNO: attributes [[READNONE]] = { {{.*}}readnone{{.*}} } -// NO__ERRNO: attributes [[NOT_READNONE]] = { nounwind "correctly{{.*}} } +// NO__ERRNO: attributes [[NOT_READNONE]] = { nounwind {{.*}} } -// HAS_ERRNO: attributes [[NOT_READNONE]] = { nounwind "correctly{{.*}} } +// HAS_ERRNO: attributes [[NOT_READNONE]] = { nounwind {{.*}} } // HAS_ERRNO: attributes [[READNONE]] = { {{.*}}readnone{{.*}} } - diff --git a/clang/test/CodeGen/math-builtins.c b/clang/test/CodeGen/math-builtins.c index 13e9c13096f2..8aadd050ee89 100644 --- a/clang/test/CodeGen/math-builtins.c +++ b/clang/test/CodeGen/math-builtins.c @@ -577,13 +577,12 @@ void foo(double *d, float f, float *fp, long double *l, int *i, const char *c) { // HAS_ERRNO: declare x86_fp80 @llvm.trunc.f80(x86_fp80) [[READNONE_INTRINSIC]] }; - // NO__ERRNO: attributes [[READNONE]] = { {{.*}}readnone{{.*}} } // NO__ERRNO: attributes [[READNONE_INTRINSIC]] = { {{.*}}readnone{{.*}} } -// NO__ERRNO: attributes [[NOT_READNONE]] = { nounwind "correctly{{.*}} } +// NO__ERRNO: attributes [[NOT_READNONE]] = { nounwind {{.*}} } // NO__ERRNO: attributes [[PURE]] = { {{.*}}readonly{{.*}} } -// HAS_ERRNO: attributes [[NOT_READNONE]] = { nounwind "correctly{{.*}} } +// HAS_ERRNO: attributes [[NOT_READNONE]] = { nounwind {{.*}} } // HAS_ERRNO: attributes [[READNONE_INTRINSIC]] = { {{.*}}readnone{{.*}} } // HAS_ERRNO: attributes [[PURE]] = { {{.*}}readonly{{.*}} } // HAS_ERRNO: attributes [[READNONE]] = { {{.*}}readnone{{.*}} } diff --git a/clang/test/CodeGen/math-libcalls.c b/clang/test/CodeGen/math-libcalls.c index 97a87beb12ec..51bdc5218fde 100644 --- a/clang/test/CodeGen/math-libcalls.c +++ b/clang/test/CodeGen/math-libcalls.c @@ -532,13 +532,12 @@ void foo(double *d, float f, float *fp, long double *l, int *i, const char *c) { // HAS_ERRNO: declare x86_fp80 @llvm.trunc.f80(x86_fp80)
[clang] 4d4f092 - [clang][codegen] Skip adding default function attributes on intrinsics.
Author: Michael Liao Date: 2020-09-16T14:10:05-04:00 New Revision: 4d4f0922837de3f1aa9862ae8a8d941b3b6e5f78 URL: https://github.com/llvm/llvm-project/commit/4d4f0922837de3f1aa9862ae8a8d941b3b6e5f78 DIFF: https://github.com/llvm/llvm-project/commit/4d4f0922837de3f1aa9862ae8a8d941b3b6e5f78.diff LOG: [clang][codegen] Skip adding default function attributes on intrinsics. - After loading builtin bitcode for linking, skip adding default function attributes on LLVM intrinsics as their attributes are well-defined and retrieved directly from internal definitions. Adding extra attributes on intrinsics results in inconsistent result when `-save-temps` is present. Also, that makes few optimizations conservative. Differential Revision: https://reviews.llvm.org/D87761 Added: clang/test/CodeGenCUDA/Inputs/device-lib-code.ll clang/test/CodeGenCUDA/dft-func-attr-skip-intrinsic.hip Modified: clang/lib/CodeGen/CodeGenAction.cpp Removed: diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp index 5a6ce0f5dbd5..eda4beff78b7 100644 --- a/clang/lib/CodeGen/CodeGenAction.cpp +++ b/clang/lib/CodeGen/CodeGenAction.cpp @@ -245,8 +245,13 @@ namespace clang { bool LinkInModules() { for (auto : LinkModules) { if (LM.PropagateAttrs) - for (Function : *LM.Module) + for (Function : *LM.Module) { +// Skip intrinsics. Keep consistent with how intrinsics are created +// in LLVM IR. +if (F.isIntrinsic()) + continue; Gen->CGM().addDefaultFunctionDefinitionAttributes(F); + } CurLinkModule = LM.Module.get(); diff --git a/clang/test/CodeGenCUDA/Inputs/device-lib-code.ll b/clang/test/CodeGenCUDA/Inputs/device-lib-code.ll new file mode 100644 index ..43ec911fb02c --- /dev/null +++ b/clang/test/CodeGenCUDA/Inputs/device-lib-code.ll @@ -0,0 +1,5 @@ +define linkonce_odr protected float @__ocml_fma_f32(float %0, float %1, float %2) local_unnamed_addr { + %4 = tail call float @llvm.fma.f32(float %0, float %1, float %2) + ret float %4 +} +declare float @llvm.fma.f32(float, float, float) diff --git a/clang/test/CodeGenCUDA/dft-func-attr-skip-intrinsic.hip b/clang/test/CodeGenCUDA/dft-func-attr-skip-intrinsic.hip new file mode 100644 index ..9e3e436200fc --- /dev/null +++ b/clang/test/CodeGenCUDA/dft-func-attr-skip-intrinsic.hip @@ -0,0 +1,18 @@ +// REQUIRES: amdgpu-registered-target +// RUN: %clang_cc1 -x ir -fcuda-is-device -triple amdgcn-amd-amdhsa -emit-llvm-bc -disable-llvm-passes -o %t.bc %S/Inputs/device-lib-code.ll +// RUN: %clang_cc1 -x hip -fcuda-is-device -triple amdgcn-amd-amdhsa -mlink-builtin-bitcode %t.bc -emit-llvm -disable-llvm-passes -o - %s | FileCheck %s + +#include "Inputs/cuda.h" + +extern "C" __device__ float __ocml_fma_f32(float x, float y, float z); + +__device__ float foo(float x) { + return __ocml_fma_f32(x, x, x); +} + +// CHECK: {{^}}define{{.*}} @__ocml_fma_f32{{.*}} [[ATTR1:#[0-9]+]] +// CHECK: {{^}}declare{{.*}} @llvm.fma.f32{{.*}} [[ATTR2:#[0-9]+]] +// CHECK: attributes [[ATTR1]] = { convergent +// CHECK: attributes [[ATTR2]] = { +// CHECK-NOT: convergent +// CHECK: } ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
Re: r364428 - Make CodeGen depend on ASTMatchers
b22d4504968 was committed last night. On Fri, Sep 11, 2020 at 9:30 AM Vassil Vassilev wrote: > > On 9/11/20 5:13 AM, Michael LIAO wrote: > > That change was added long ago to fix the shared library build. > > Possibly, there are changes removing that dependency then. Just > > verified that removing that dependency is just fine. > > >That's great! Would you commit that change or should I? > > > > > > On Thu, Sep 10, 2020 at 6:48 AM Vassil Vassilev > > wrote: > >> Hello, > >> > >> IIUC, clang's CodeGen does not immediately depend on ASTMatchers. I > >> was wondering what is the reason for inserting such a dependency to fix > >> the shared library builds? > >> > >> Can you give more details about the failure you are fixing? > >> > >> Sorry for the late question. > >> > >> Best, Vassil > >> On 6/26/19 5:13 PM, Michael Liao via cfe-commits wrote: > >>> Author: hliao > >>> Date: Wed Jun 26 07:13:43 2019 > >>> New Revision: 364428 > >>> > >>> URL: http://llvm.org/viewvc/llvm-project?rev=364428=rev > >>> Log: > >>> Make CodeGen depend on ASTMatchers > >>> > >>> - Shared library builds are broken due to the missing dependency. > >>> > >>> Modified: > >>> cfe/trunk/lib/CodeGen/CMakeLists.txt > >>> > >>> Modified: cfe/trunk/lib/CodeGen/CMakeLists.txt > >>> URL: > >>> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CMakeLists.txt?rev=364428=364427=364428=diff > >>> == > >>> --- cfe/trunk/lib/CodeGen/CMakeLists.txt (original) > >>> +++ cfe/trunk/lib/CodeGen/CMakeLists.txt Wed Jun 26 07:13:43 2019 > >>> @@ -101,6 +101,7 @@ add_clang_library(clangCodeGen > >>> LINK_LIBS > >>> clangAnalysis > >>> clangAST > >>> + clangASTMatchers > >>> clangBasic > >>> clangFrontend > >>> clangLex > >>> > >>> > >>> ___ > >>> cfe-commits mailing list > >>> cfe-commits@lists.llvm.org > >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits > >> > ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] b22d450 - Remove dependency on clangASTMatchers.
Author: Michael Liao Date: 2020-09-10T22:17:48-04:00 New Revision: b22d45049682d1461b6b786f159681e2e5c2ce24 URL: https://github.com/llvm/llvm-project/commit/b22d45049682d1461b6b786f159681e2e5c2ce24 DIFF: https://github.com/llvm/llvm-project/commit/b22d45049682d1461b6b786f159681e2e5c2ce24.diff LOG: Remove dependency on clangASTMatchers. - It seems no long required for shared library builds. Added: Modified: clang/lib/CodeGen/CMakeLists.txt Removed: diff --git a/clang/lib/CodeGen/CMakeLists.txt b/clang/lib/CodeGen/CMakeLists.txt index f47ecd9bf846..4039277707c5 100644 --- a/clang/lib/CodeGen/CMakeLists.txt +++ b/clang/lib/CodeGen/CMakeLists.txt @@ -92,7 +92,6 @@ add_clang_library(clangCodeGen LINK_LIBS clangAnalysis clangAST - clangASTMatchers clangBasic clangFrontend clangLex ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
Re: r364428 - Make CodeGen depend on ASTMatchers
That change was added long ago to fix the shared library build. Possibly, there are changes removing that dependency then. Just verified that removing that dependency is just fine. On Thu, Sep 10, 2020 at 6:48 AM Vassil Vassilev wrote: > > Hello, > >IIUC, clang's CodeGen does not immediately depend on ASTMatchers. I > was wondering what is the reason for inserting such a dependency to fix > the shared library builds? > >Can you give more details about the failure you are fixing? > >Sorry for the late question. > > Best, Vassil > On 6/26/19 5:13 PM, Michael Liao via cfe-commits wrote: > > Author: hliao > > Date: Wed Jun 26 07:13:43 2019 > > New Revision: 364428 > > > > URL: http://llvm.org/viewvc/llvm-project?rev=364428=rev > > Log: > > Make CodeGen depend on ASTMatchers > > > > - Shared library builds are broken due to the missing dependency. > > > > Modified: > > cfe/trunk/lib/CodeGen/CMakeLists.txt > > > > Modified: cfe/trunk/lib/CodeGen/CMakeLists.txt > > URL: > > http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CMakeLists.txt?rev=364428=364427=364428=diff > > == > > --- cfe/trunk/lib/CodeGen/CMakeLists.txt (original) > > +++ cfe/trunk/lib/CodeGen/CMakeLists.txt Wed Jun 26 07:13:43 2019 > > @@ -101,6 +101,7 @@ add_clang_library(clangCodeGen > > LINK_LIBS > > clangAnalysis > > clangAST > > + clangASTMatchers > > clangBasic > > clangFrontend > > clangLex > > > > > > ___ > > cfe-commits mailing list > > cfe-commits@lists.llvm.org > > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits > > ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] c7b683c - [PGO][CUDA][HIP] Skip generating profile on the device stub and wrong-side functions.
Author: Michael Liao Date: 2020-08-10T11:01:46-04:00 New Revision: c7b683c126b849dab5c81e7deecfc1e61f8563a0 URL: https://github.com/llvm/llvm-project/commit/c7b683c126b849dab5c81e7deecfc1e61f8563a0 DIFF: https://github.com/llvm/llvm-project/commit/c7b683c126b849dab5c81e7deecfc1e61f8563a0.diff LOG: [PGO][CUDA][HIP] Skip generating profile on the device stub and wrong-side functions. - Skip generating profile data on `__global__` function in the host compilation. It's a host-side stub function only and don't have profile instrumentation generated on the real function body. The extra profile data results in the malformed instrumentation profile data. - Skip generating region mapping on functions in the wrong-side, i.e., + For the device compilation, skip host-only functions; and, + For the host compilation, skip device-only functions (including `__global__` functions.) - As the device-side profiling is not ready yet, only host-side profile code generation is checked. Differential Revision: https://reviews.llvm.org/D85276 Added: clang/test/CodeGenCUDA/profile-coverage-mapping.cu Modified: clang/lib/CodeGen/CodeGenPGO.cpp Removed: diff --git a/clang/lib/CodeGen/CodeGenPGO.cpp b/clang/lib/CodeGen/CodeGenPGO.cpp index e810f608ab78..be3c50b99f30 100644 --- a/clang/lib/CodeGen/CodeGenPGO.cpp +++ b/clang/lib/CodeGen/CodeGenPGO.cpp @@ -773,6 +773,11 @@ void CodeGenPGO::assignRegionCounters(GlobalDecl GD, llvm::Function *Fn) { if (!D->hasBody()) return; + // Skip CUDA/HIP kernel launch stub functions. + if (CGM.getLangOpts().CUDA && !CGM.getLangOpts().CUDAIsDevice && + D->hasAttr()) +return; + bool InstrumentRegions = CGM.getCodeGenOpts().hasProfileClangInstr(); llvm::IndexedInstrProfReader *PGOReader = CGM.getPGOReader(); if (!InstrumentRegions && !PGOReader) @@ -831,6 +836,18 @@ bool CodeGenPGO::skipRegionMappingForDecl(const Decl *D) { if (!D->getBody()) return true; + // Skip host-only functions in the CUDA device compilation and device-only + // functions in the host compilation. Just roughly filter them out based on + // the function attributes. If there are effectively host-only or device-only + // ones, their coverage mapping may still be generated. + if (CGM.getLangOpts().CUDA && + ((CGM.getLangOpts().CUDAIsDevice && !D->hasAttr() && +!D->hasAttr()) || + (!CGM.getLangOpts().CUDAIsDevice && +(D->hasAttr() || + (!D->hasAttr() && D->hasAttr()) +return true; + // Don't map the functions in system headers. const auto = CGM.getContext().getSourceManager(); auto Loc = D->getBody()->getBeginLoc(); diff --git a/clang/test/CodeGenCUDA/profile-coverage-mapping.cu b/clang/test/CodeGenCUDA/profile-coverage-mapping.cu new file mode 100644 index ..5eae6f10e0ea --- /dev/null +++ b/clang/test/CodeGenCUDA/profile-coverage-mapping.cu @@ -0,0 +1,20 @@ +// RUN: echo "GPU binary would be here" > %t +// RUN: %clang_cc1 -fprofile-instrument=clang -triple x86_64-linux-gnu -target-sdk-version=8.0 -fcuda-include-gpubinary %t -emit-llvm -o - %s | FileCheck --check-prefix=PGOGEN %s +// RUN: %clang_cc1 -fprofile-instrument=clang -fcoverage-mapping -triple x86_64-linux-gnu -target-sdk-version=8.0 -fcuda-include-gpubinary %t -emit-llvm -o - %s | FileCheck --check-prefix=COVMAP %s +// RUN: %clang_cc1 -fprofile-instrument=clang -fcoverage-mapping -dump-coverage-mapping -triple x86_64-linux-gnu -target-sdk-version=8.0 -fcuda-include-gpubinary %t -emit-llvm-only -o - %s | FileCheck --check-prefix=MAPPING %s + +#include "Inputs/cuda.h" + +// PGOGEN-NOT: @__profn_{{.*kernel.*}} = +// COVMAP-COUNT-2: section "__llvm_covfun", comdat +// COVMAP-NOT: section "__llvm_covfun", comdat +// MAPPING-NOT: {{.*dfn.*}}: +// MAPPING-NOT: {{.*kernel.*}}: + +__device__ void dfn(int i) {} + +__global__ void kernel(int i) { dfn(i); } + +void host(void) { + kernel<<<1, 1>>>(1); +} ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] b8409c0 - Fix `-Wreturn-type` warning. NFC.
Author: Michael Liao Date: 2020-07-11T16:20:41-04:00 New Revision: b8409c03ed90807f3d49c7d98dceea98cf461f7a URL: https://github.com/llvm/llvm-project/commit/b8409c03ed90807f3d49c7d98dceea98cf461f7a DIFF: https://github.com/llvm/llvm-project/commit/b8409c03ed90807f3d49c7d98dceea98cf461f7a.diff LOG: Fix `-Wreturn-type` warning. NFC. Added: Modified: clang/lib/Tooling/Syntax/BuildTree.cpp Removed: diff --git a/clang/lib/Tooling/Syntax/BuildTree.cpp b/clang/lib/Tooling/Syntax/BuildTree.cpp index 6d13f1ace83b..1f192180ec45 100644 --- a/clang/lib/Tooling/Syntax/BuildTree.cpp +++ b/clang/lib/Tooling/Syntax/BuildTree.cpp @@ -750,6 +750,7 @@ class BuildTreeVisitor : public RecursiveASTVisitor { return new (allocator()) syntax::FloatUserDefinedLiteralExpression; } } +llvm_unreachable("Unknown literal operator kind."); } bool WalkUpFromUserDefinedLiteral(UserDefinedLiteral *S) { ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 471c806 - [hip] Refine `clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu`
Author: Michael Liao Date: 2020-06-25T23:57:08-04:00 New Revision: 471c806a45bbac2f0f4274d8bea383d06d397a84 URL: https://github.com/llvm/llvm-project/commit/471c806a45bbac2f0f4274d8bea383d06d397a84 DIFF: https://github.com/llvm/llvm-project/commit/471c806a45bbac2f0f4274d8bea383d06d397a84.diff LOG: [hip] Refine `clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu` - Require target x86 being enabled as well. Added: Modified: clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu Removed: diff --git a/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu b/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu index 99284c04e5cc..2660a5f14f90 100644 --- a/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu +++ b/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu @@ -1,4 +1,6 @@ +// REQUIRES: x86-registered-target // REQUIRES: amdgpu-registered-target + // RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fcuda-is-device -emit-llvm -x hip %s -o - | FileCheck --check-prefixes=COMMON,CHECK %s // RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fcuda-is-device -emit-llvm -x hip %s -disable-O0-optnone -o - | opt -S -O2 | FileCheck %s --check-prefixes=COMMON,OPT // RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm -x hip %s -o - | FileCheck -check-prefix=HOST %s ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 0723b18 - [hip] Re-enable `clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu`
Author: Michael Liao Date: 2020-06-25T22:29:27-04:00 New Revision: 0723b1891fac8f79f92549e3bcac9112be4ebd43 URL: https://github.com/llvm/llvm-project/commit/0723b1891fac8f79f92549e3bcac9112be4ebd43 DIFF: https://github.com/llvm/llvm-project/commit/0723b1891fac8f79f92549e3bcac9112be4ebd43.diff LOG: [hip] Re-enable `clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu` - Require amdgpu target being enabled. Added: Modified: clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu Removed: diff --git a/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu b/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu index 3021e73780f4..99284c04e5cc 100644 --- a/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu +++ b/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu @@ -1,4 +1,4 @@ -// XFAIL: * +// REQUIRES: amdgpu-registered-target // RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fcuda-is-device -emit-llvm -x hip %s -o - | FileCheck --check-prefixes=COMMON,CHECK %s // RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fcuda-is-device -emit-llvm -x hip %s -disable-O0-optnone -o - | opt -S -O2 | FileCheck %s --check-prefixes=COMMON,OPT // RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm -x hip %s -o - | FileCheck -check-prefix=HOST %s ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] d3f437d - [hip] Disable test temporarily due to failures on build servers.
Author: Michael Liao Date: 2020-06-25T22:04:20-04:00 New Revision: d3f437d35189f7567294daf3e60e08326e64994a URL: https://github.com/llvm/llvm-project/commit/d3f437d35189f7567294daf3e60e08326e64994a DIFF: https://github.com/llvm/llvm-project/commit/d3f437d35189f7567294daf3e60e08326e64994a.diff LOG: [hip] Disable test temporarily due to failures on build servers. Added: Modified: clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu Removed: diff --git a/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu b/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu index 8c102d339863..3021e73780f4 100644 --- a/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu +++ b/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu @@ -1,3 +1,4 @@ +// XFAIL: * // RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fcuda-is-device -emit-llvm -x hip %s -o - | FileCheck --check-prefixes=COMMON,CHECK %s // RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fcuda-is-device -emit-llvm -x hip %s -disable-O0-optnone -o - | opt -S -O2 | FileCheck %s --check-prefixes=COMMON,OPT // RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm -x hip %s -o - | FileCheck -check-prefix=HOST %s ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] dccfaac - [InferAddressSpaces] Handle the pair of `ptrtoint`/`inttoptr`.
Author: Michael Liao Date: 2020-06-25T20:46:56-04:00 New Revision: dccfaacf93e1c4801cbcc4686f64eb8a35564ff7 URL: https://github.com/llvm/llvm-project/commit/dccfaacf93e1c4801cbcc4686f64eb8a35564ff7 DIFF: https://github.com/llvm/llvm-project/commit/dccfaacf93e1c4801cbcc4686f64eb8a35564ff7.diff LOG: [InferAddressSpaces] Handle the pair of `ptrtoint`/`inttoptr`. Summary: - `ptrtoint` and `inttoptr` are defined as no-op casts if the integer value as the same size as the pointer value. The pair of `ptrtoint`/`inttoptr` is in fact a no-op cast sequence between different address spaces. Teach `infer-address-spaces` to handle them like a `bitcast`. Reviewers: arsenm, chandlerc Subscribers: jvesely, wdng, nhaehnle, hiraditya, kerbowa, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D81938 Added: llvm/test/Transforms/InferAddressSpaces/AMDGPU/noop-ptrint-pair.ll Modified: clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp Removed: diff --git a/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu b/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu index 73ab9edf318e..8c102d339863 100644 --- a/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu +++ b/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu @@ -1,37 +1,52 @@ -// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fcuda-is-device -emit-llvm -x hip %s -o - | FileCheck %s +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fcuda-is-device -emit-llvm -x hip %s -o - | FileCheck --check-prefixes=COMMON,CHECK %s +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fcuda-is-device -emit-llvm -x hip %s -disable-O0-optnone -o - | opt -S -O2 | FileCheck %s --check-prefixes=COMMON,OPT // RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm -x hip %s -o - | FileCheck -check-prefix=HOST %s #include "Inputs/cuda.h" // Coerced struct from `struct S` without all generic pointers lowered into // global ones. -// CHECK: %struct.S.coerce = type { i32 addrspace(1)*, float addrspace(1)* } -// CHECK: %struct.T.coerce = type { [2 x float addrspace(1)*] } +// COMMON: %struct.S.coerce = type { i32 addrspace(1)*, float addrspace(1)* } +// COMMON: %struct.T.coerce = type { [2 x float addrspace(1)*] } // On the host-side compilation, generic pointer won't be coerced. // HOST-NOT: %struct.S.coerce // HOST-NOT: %struct.T.coerce -// CHECK: define amdgpu_kernel void @_Z7kernel1Pi(i32 addrspace(1)* %x.coerce) // HOST: define void @_Z22__device_stub__kernel1Pi(i32* %x) +// COMMON-LABEL: define amdgpu_kernel void @_Z7kernel1Pi(i32 addrspace(1)*{{.*}} %x.coerce) +// CHECK: = addrspacecast [[TYPE:.*]] addrspace(1)* %{{.*}} to [[TYPE]]* +// CHECK-NOT: = addrspacecast [[TYPE:.*]] addrspace(1)* %{{.*}} to [[TYPE]]* +// OPT: [[VAL:%.*]] = load i32, i32 addrspace(1)* %x.coerce, align 4 +// OPT: [[INC:%.*]] = add nsw i32 [[VAL]], 1 +// OPT: store i32 [[INC]], i32 addrspace(1)* %x.coerce, align 4 +// OPT: ret void __global__ void kernel1(int *x) { x[0]++; } -// CHECK: define amdgpu_kernel void @_Z7kernel2Ri(i32 addrspace(1)* nonnull align 4 dereferenceable(4) %x.coerce) // HOST: define void @_Z22__device_stub__kernel2Ri(i32* nonnull align 4 dereferenceable(4) %x) +// COMMON-LABEL: define amdgpu_kernel void @_Z7kernel2Ri(i32 addrspace(1)*{{.*}} nonnull align 4 dereferenceable(4) %x.coerce) +// CHECK: = addrspacecast [[TYPE:.*]] addrspace(1)* %{{.*}} to [[TYPE]]* +// CHECK-NOT: = addrspacecast [[TYPE:.*]] addrspace(1)* %{{.*}} to [[TYPE]]* +// OPT: [[VAL:%.*]] = load i32, i32 addrspace(1)* %x.coerce, align 4 +// OPT: [[INC:%.*]] = add nsw i32 [[VAL]], 1 +// OPT: store i32 [[INC]], i32 addrspace(1)* %x.coerce, align 4 +// OPT: ret void __global__ void kernel2(int ) { x++; } -// CHECK: define amdgpu_kernel void @_Z7kernel3PU3AS2iPU3AS1i(i32 addrspace(2)* %x, i32 addrspace(1)* %y) // HOST: define void @_Z22__device_stub__kernel3PU3AS2iPU3AS1i(i32 addrspace(2)* %x, i32 addrspace(1)* %y) +// CHECK-LABEL: define amdgpu_kernel void @_Z7kernel3PU3AS2iPU3AS1i(i32 addrspace(2)*{{.*}} %x, i32 addrspace(1)*{{.*}} %y) +// CHECK-NOT: = addrspacecast [[TYPE:.*]] addrspace(1)* %{{.*}} to [[TYPE]]* __global__ void kernel3(__attribute__((address_space(2))) int *x, __attribute__((address_space(1))) int *y) { y[0] = x[0]; } -// CHECK: define void @_Z4funcPi(i32* %x) +// COMMON-LABEL: define void @_Z4funcPi(i32*{{.*}} %x) +// CHECK-NOT: = addrspacecast [[TYPE:.*]] addrspace(1)* %{{.*}} to [[TYPE]]* __device__ void func(int *x) { x[0]++; } @@ -42,16 +57,25 @@ struct S { }; // `by-val` struct will be coerced into a similar struct with all generic // pointers lowerd into global ones. -// CHECK: define amdgpu_kernel void @_Z7kernel41S(%struct.S.coerce %s.coerce) // HOST: define void
[clang] ebc9e0f - Fix coding style. NFC.
Author: Michael Liao Date: 2020-06-24T13:13:42-04:00 New Revision: ebc9e0f1f0786b892b4a6eaf50013a18aed31aa5 URL: https://github.com/llvm/llvm-project/commit/ebc9e0f1f0786b892b4a6eaf50013a18aed31aa5 DIFF: https://github.com/llvm/llvm-project/commit/ebc9e0f1f0786b892b4a6eaf50013a18aed31aa5.diff LOG: Fix coding style. NFC. - Remove `else` after `return`. Added: Modified: clang/lib/CodeGen/CodeGenModule.cpp Removed: diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp index f0ab5165584c..5e902aa15bcf 100644 --- a/clang/lib/CodeGen/CodeGenModule.cpp +++ b/clang/lib/CodeGen/CodeGenModule.cpp @@ -3656,26 +3656,29 @@ CodeGenModule::GetOrCreateLLVMGlobal(StringRef MangledName, } llvm::Constant * -CodeGenModule::GetAddrOfGlobal(GlobalDecl GD, - ForDefinition_t IsForDefinition) { +CodeGenModule::GetAddrOfGlobal(GlobalDecl GD, ForDefinition_t IsForDefinition) { const Decl *D = GD.getDecl(); + if (isa(D) || isa(D)) return getAddrOfCXXStructor(GD, /*FnInfo=*/nullptr, /*FnType=*/nullptr, /*DontDefer=*/false, IsForDefinition); - else if (isa(D)) { -auto FInfo = ().arrangeCXXMethodDeclaration( -cast(D)); + + if (isa(D)) { +auto FInfo = +().arrangeCXXMethodDeclaration(cast(D)); auto Ty = getTypes().GetFunctionType(*FInfo); return GetAddrOfFunction(GD, Ty, /*ForVTable=*/false, /*DontDefer=*/false, IsForDefinition); - } else if (isa(D)) { + } + + if (isa(D)) { const CGFunctionInfo = getTypes().arrangeGlobalDeclaration(GD); llvm::FunctionType *Ty = getTypes().GetFunctionType(FI); return GetAddrOfFunction(GD, Ty, /*ForVTable=*/false, /*DontDefer=*/false, IsForDefinition); - } else -return GetAddrOfGlobalVar(cast(D), /*Ty=*/nullptr, - IsForDefinition); + } + + return GetAddrOfGlobalVar(cast(D), /*Ty=*/nullptr, IsForDefinition); } llvm::GlobalVariable *CodeGenModule::CreateOrReplaceCXXRuntimeVariable( ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] e830fa2 - [clang][amdgpu] Prefer not using `fp16` conversion intrinsics.
Author: Michael Liao Date: 2020-06-16T10:21:56-04:00 New Revision: e830fa260da9d3bbe99c4176b4ddb6aa5e6229dd URL: https://github.com/llvm/llvm-project/commit/e830fa260da9d3bbe99c4176b4ddb6aa5e6229dd DIFF: https://github.com/llvm/llvm-project/commit/e830fa260da9d3bbe99c4176b4ddb6aa5e6229dd.diff LOG: [clang][amdgpu] Prefer not using `fp16` conversion intrinsics. Reviewers: yaxunl, arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, dstuttard, tpr, t-tye, kerbowa, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D81849 Added: clang/test/CodeGenHIP/half.hip Modified: clang/lib/Basic/Targets/AMDGPU.h Removed: diff --git a/clang/lib/Basic/Targets/AMDGPU.h b/clang/lib/Basic/Targets/AMDGPU.h index e4194a881e3f..387b91abb537 100644 --- a/clang/lib/Basic/Targets/AMDGPU.h +++ b/clang/lib/Basic/Targets/AMDGPU.h @@ -219,6 +219,8 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUTargetInfo final : public TargetInfo { ArrayRef getTargetBuiltins() const override; + bool useFP16ConversionIntrinsics() const override { return false; } + void getTargetDefines(const LangOptions , MacroBuilder ) const override; diff --git a/clang/test/CodeGenHIP/half.hip b/clang/test/CodeGenHIP/half.hip new file mode 100644 index ..d5dd43d51fdf --- /dev/null +++ b/clang/test/CodeGenHIP/half.hip @@ -0,0 +1,16 @@ +// REQUIRES: amdgpu-registered-target +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -x hip -emit-llvm -fcuda-is-device -o - %s | FileCheck %s + +#define __device__ __attribute__((device)) + +// CHECK-LABEL: @_Z2d0DF16_ +// CHECK: fpext +__device__ float d0(_Float16 x) { + return x; +} + +// CHECK-LABEL: @_Z2d1f +// CHECK: fptrunc +__device__ _Float16 d1(float x) { + return x; +} ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 6dd0580 - [hip] Fix the failed test case due to the additional backend phase.
Author: Michael Liao Date: 2020-06-10T15:06:06-04:00 New Revision: 6dd058083208d58c6a7005bfb092cee085fd9a48 URL: https://github.com/llvm/llvm-project/commit/6dd058083208d58c6a7005bfb092cee085fd9a48 DIFF: https://github.com/llvm/llvm-project/commit/6dd058083208d58c6a7005bfb092cee085fd9a48.diff LOG: [hip] Fix the failed test case due to the additional backend phase. Added: Modified: clang/test/Driver/cuda-phases.cu Removed: diff --git a/clang/test/Driver/cuda-phases.cu b/clang/test/Driver/cuda-phases.cu index 58be50ae2e12..acbf345f85c6 100644 --- a/clang/test/Driver/cuda-phases.cu +++ b/clang/test/Driver/cuda-phases.cu @@ -49,9 +49,10 @@ // BIN_AMD_NRDC-DAG: [[P12:[0-9]+]]: backend, {[[P11]]}, assembler, (host-[[T]]) // BIN-DAG: [[P13:[0-9]+]]: assembler, {[[P12]]}, object, (host-[[T]]) // BIN-DAG: [[P14:[0-9]+]]: linker, {[[P13]]}, image, (host-[[T]]) -// BIN_AMD_RDC-DAG: [[P15:[0-9]+]]: linker, {[[P5]]}, image, (device-[[T]], [[ARCH]]) -// BIN_AMD_RDC-DAG: [[P16:[0-9]+]]: offload, "host-[[T]] (powerpc64le-ibm-linux-gnu)" {[[P14]]}, -// BIN_AMD_RDC-DAG-SAME: "device-[[T]] ([[TRIPLE:amdgcn-amd-amdhsa]]:[[ARCH]])" {[[P15]]}, object +// BIN_AMD_RDC-DAG: [[P15:[0-9]+]]: backend, {[[P5]]}, ir, (device-[[T]], [[ARCH]]) +// BIN_AMD_RDC-DAG: [[P16:[0-9]+]]: linker, {[[P15]]}, image, (device-[[T]], [[ARCH]]) +// BIN_AMD_RDC-DAG: [[P17:[0-9]+]]: offload, "host-[[T]] (powerpc64le-ibm-linux-gnu)" {[[P14]]}, +// BIN_AMD_RDC-DAG-SAME: "device-[[T]] ([[TRIPLE:amdgcn-amd-amdhsa]]:[[ARCH]])" {[[P16]]}, object // // Test single gpu architecture up to the assemble phase. @@ -109,11 +110,13 @@ // BIN2_AMD-DAG: [[P19:[0-9]+]]: backend, {[[P2]]}, assembler, (host-[[T]]) // BIN2-DAG: [[P20:[0-9]+]]: assembler, {[[P19]]}, object, (host-[[T]]) // BIN2-DAG: [[P21:[0-9]+]]: linker, {[[P20]]}, image, (host-[[T]]) -// BIN2_AMD-DAG: [[P22:[0-9]+]]: linker, {[[P5]]}, image, (device-[[T]], [[ARCH1]]) -// BIN2_AMD-DAG: [[P23:[0-9]+]]: linker, {[[P12]]}, image, (device-[[T]], [[ARCH2]]) -// BIN2_AMD-DAG: [[P24:[0-9]+]]: offload, "host-[[T]] (powerpc64le-ibm-linux-gnu)" {[[P21]]}, -// BIN2_AMD-DAG-SAME: "device-[[T]] ([[TRIPLE:amdgcn-amd-amdhsa]]:[[ARCH1]])" {[[P22]]}, -// BIN2_AMD-DAG-SAME: "device-[[T]] ([[TRIPLE:amdgcn-amd-amdhsa]]:[[ARCH2]])" {[[P23]]}, object +// BIN2_AMD-DAG: [[P22:[0-9]+]]: backend, {[[P5]]}, ir, (device-[[T]], [[ARCH1]]) +// BIN2_AMD-DAG: [[P23:[0-9]+]]: backend, {[[P12]]}, ir, (device-[[T]], [[ARCH2]]) +// BIN2_AMD-DAG: [[P24:[0-9]+]]: linker, {[[P22]]}, image, (device-[[T]], [[ARCH1]]) +// BIN2_AMD-DAG: [[P25:[0-9]+]]: linker, {[[P23]]}, image, (device-[[T]], [[ARCH2]]) +// BIN2_AMD-DAG: [[P26:[0-9]+]]: offload, "host-[[T]] (powerpc64le-ibm-linux-gnu)" {[[P21]]}, +// BIN2_AMD-DAG-SAME: "device-[[T]] ([[TRIPLE:amdgcn-amd-amdhsa]]:[[ARCH1]])" {[[P24]]}, +// BIN2_AMD-DAG-SAME: "device-[[T]] ([[TRIPLE:amdgcn-amd-amdhsa]]:[[ARCH2]])" {[[P25]]}, object // // Test two gpu architecturess up to the assemble phase. ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 8b6821a - [hip] Fix device-only relocatable code compilation.
Author: Michael Liao Date: 2020-06-10T14:10:41-04:00 New Revision: 8b6821a5843bb321b3738e2519beae7142e62928 URL: https://github.com/llvm/llvm-project/commit/8b6821a5843bb321b3738e2519beae7142e62928 DIFF: https://github.com/llvm/llvm-project/commit/8b6821a5843bb321b3738e2519beae7142e62928.diff LOG: [hip] Fix device-only relocatable code compilation. Summary: - In HIP, just as the regular device-only compilation, the device-only relocatable code compilation should not involve offload bundle. - In addition, that device-only relocatable code compilation should have the similar 3 steps, namely preprocessor, compile, and backend, to the regular code generation with `-emit-llvm`. Reviewers: yaxunl, tra Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D81427 Added: clang/test/Driver/hip-rdc-device-only.hip Modified: clang/lib/Driver/Driver.cpp Removed: diff --git a/clang/lib/Driver/Driver.cpp b/clang/lib/Driver/Driver.cpp index 8cc5eceaa512..19faf0968e08 100644 --- a/clang/lib/Driver/Driver.cpp +++ b/clang/lib/Driver/Driver.cpp @@ -2705,9 +2705,7 @@ class OffloadingActionBuilder final { // backend and assemble phases to output LLVM IR. Except for generating // non-relocatable device coee, where we generate fat binary for device // code and pass to host in Backend phase. - if (CudaDeviceActions.empty() || - (CurPhase == phases::Backend && Relocatable) || - CurPhase == phases::Assemble) + if (CudaDeviceActions.empty()) return ABRT_Success; assert(((CurPhase == phases::Link && Relocatable) || @@ -2781,9 +2779,11 @@ class OffloadingActionBuilder final { } // By default, we produce an action for each device arch. - for (Action * : CudaDeviceActions) -A = C.getDriver().ConstructPhaseAction(C, Args, CurPhase, A, - AssociatedOffloadKind); + if (!Relocatable || CurPhase <= phases::Backend) { +for (Action * : CudaDeviceActions) + A = C.getDriver().ConstructPhaseAction(C, Args, CurPhase, A, + AssociatedOffloadKind); + } return (CompileDeviceOnly && CurPhase == FinalPhase) ? ABRT_Ignore_Host : ABRT_Success; @@ -3668,7 +3668,10 @@ Action *Driver::ConstructPhaseAction( Args.hasArg(options::OPT_S) ? types::TY_LTO_IR : types::TY_LTO_BC; return C.MakeAction(Input, Output); } -if (Args.hasArg(options::OPT_emit_llvm)) { +if (Args.hasArg(options::OPT_emit_llvm) || +(TargetDeviceOffloadKind == Action::OFK_HIP && + Args.hasFlag(options::OPT_fgpu_rdc, options::OPT_fno_gpu_rdc, + false))) { types::ID Output = Args.hasArg(options::OPT_S) ? types::TY_LLVM_IR : types::TY_LLVM_BC; return C.MakeAction(Input, Output); @@ -4588,8 +4591,19 @@ const char *Driver::GetNamedOutputPath(Compilation , const JobAction , // When using both -save-temps and -emit-llvm, use a ".tmp.bc" suffix for // the unoptimized bitcode so that it does not get overwritten by the ".bc" // optimized bitcode output. -if (!AtTopLevel && C.getArgs().hasArg(options::OPT_emit_llvm) && -JA.getType() == types::TY_LLVM_BC) +auto IsHIPRDCInCompilePhase = [](const JobAction , + const llvm::opt::DerivedArgList ) { + // The relocatable compilation in HIP implies -emit-llvm. Similarly, use a + // ".tmp.bc" suffix for the unoptimized bitcode (generated in the compile + // phase.) + return isa(JA) && + JA.getOffloadingDeviceKind() == Action::OFK_HIP && + Args.hasFlag(options::OPT_fgpu_rdc, options::OPT_fno_gpu_rdc, + false); +}; +if (!AtTopLevel && JA.getType() == types::TY_LLVM_BC && +(C.getArgs().hasArg(options::OPT_emit_llvm) || + IsHIPRDCInCompilePhase(JA, C.getArgs( Suffixed += ".tmp"; Suffixed += '.'; Suffixed += Suffix; diff --git a/clang/test/Driver/hip-rdc-device-only.hip b/clang/test/Driver/hip-rdc-device-only.hip new file mode 100644 index ..4bdff628466a --- /dev/null +++ b/clang/test/Driver/hip-rdc-device-only.hip @@ -0,0 +1,144 @@ +// REQUIRES: clang-driver +// REQUIRES: x86-registered-target +// REQUIRES: amdgpu-registered-target + +// RUN: %clang -### -target x86_64-linux-gnu \ +// RUN: -x hip --cuda-gpu-arch=gfx803 --cuda-gpu-arch=gfx900 \ +// RUN: -c -nogpuinc -nogpulib --cuda-device-only -fgpu-rdc \ +// RUN: %S/Inputs/hip_multiple_inputs/a.cu \ +// RUN: %S/Inputs/hip_multiple_inputs/b.hip \ +// RUN: 2>&1 | FileCheck -check-prefixes=COMMON,EMITBC %s + +// With `-emit-llvm`, the output should be the same as the aforementioned line +// as
[clang] 276c8dd - [clang][codegen] Refactor argument loading in function prolog. NFC.
Author: Michael Liao Date: 2020-05-05T15:31:51-04:00 New Revision: 276c8dde0b58cfe29035448a27e16eff9fcf2a5a URL: https://github.com/llvm/llvm-project/commit/276c8dde0b58cfe29035448a27e16eff9fcf2a5a DIFF: https://github.com/llvm/llvm-project/commit/276c8dde0b58cfe29035448a27e16eff9fcf2a5a.diff LOG: [clang][codegen] Refactor argument loading in function prolog. NFC. Summary: - Skip copying function arguments and unnecessary casting by using them directly. Reviewers: rjmccall, kerbowa, yaxunl Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D79394 Added: Modified: clang/lib/CodeGen/CGCall.cpp clang/lib/CodeGen/CodeGenFunction.h Removed: diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp index 55f106e7c300..44f298892ecf 100644 --- a/clang/lib/CodeGen/CGCall.cpp +++ b/clang/lib/CodeGen/CGCall.cpp @@ -1016,8 +1016,8 @@ static void forConstantArrayExpansion(CodeGenFunction , } } -void CodeGenFunction::ExpandTypeFromArgs( -QualType Ty, LValue LV, SmallVectorImpl::iterator ) { +void CodeGenFunction::ExpandTypeFromArgs(QualType Ty, LValue LV, + llvm::Function::arg_iterator ) { assert(LV.isSimple() && "Unexpected non-simple lvalue during struct expansion."); @@ -1046,17 +1046,17 @@ void CodeGenFunction::ExpandTypeFromArgs( ExpandTypeFromArgs(FD->getType(), SubLV, AI); } } else if (isa(Exp.get())) { -auto realValue = *AI++; -auto imagValue = *AI++; +auto realValue = &*AI++; +auto imagValue = &*AI++; EmitStoreOfComplex(ComplexPairTy(realValue, imagValue), LV, /*init*/ true); } else { // Call EmitStoreOfScalar except when the lvalue is a bitfield to emit a // primitive store. assert(isa(Exp.get())); if (LV.isBitField()) - EmitStoreThroughLValue(RValue::get(*AI++), LV); + EmitStoreThroughLValue(RValue::get(&*AI++), LV); else - EmitStoreOfScalar(*AI++, LV); + EmitStoreOfScalar(&*AI++, LV); } } @@ -2323,19 +2323,13 @@ void CodeGenFunction::EmitFunctionProlog(const CGFunctionInfo , // simplify. ClangToLLVMArgMapping IRFunctionArgs(CGM.getContext(), FI); - // Flattened function arguments. - SmallVector FnArgs; - FnArgs.reserve(IRFunctionArgs.totalIRArgs()); - for (auto : Fn->args()) { -FnArgs.push_back(); - } - assert(FnArgs.size() == IRFunctionArgs.totalIRArgs()); + assert(Fn->arg_size() == IRFunctionArgs.totalIRArgs()); // If we're using inalloca, all the memory arguments are GEPs off of the last // parameter, which is a pointer to the complete memory area. Address ArgStruct = Address::invalid(); if (IRFunctionArgs.hasInallocaArg()) { -ArgStruct = Address(FnArgs[IRFunctionArgs.getInallocaArgNo()], +ArgStruct = Address(Fn->getArg(IRFunctionArgs.getInallocaArgNo()), FI.getArgStructAlignment()); assert(ArgStruct.getType() == FI.getArgStruct()->getPointerTo()); @@ -2343,7 +2337,7 @@ void CodeGenFunction::EmitFunctionProlog(const CGFunctionInfo , // Name the struct return parameter. if (IRFunctionArgs.hasSRetArg()) { -auto AI = cast(FnArgs[IRFunctionArgs.getSRetArgNo()]); +auto AI = Fn->getArg(IRFunctionArgs.getSRetArgNo()); AI->setName("agg.result"); AI->addAttr(llvm::Attribute::NoAlias); } @@ -2394,7 +2388,8 @@ void CodeGenFunction::EmitFunctionProlog(const CGFunctionInfo , case ABIArgInfo::Indirect: { assert(NumIRArgs == 1); - Address ParamAddr = Address(FnArgs[FirstIRArg], ArgI.getIndirectAlign()); + Address ParamAddr = + Address(Fn->getArg(FirstIRArg), ArgI.getIndirectAlign()); if (!hasScalarEvaluationKind(Ty)) { // Aggregates and complex variables are accessed by reference. All we @@ -2436,8 +2431,7 @@ void CodeGenFunction::EmitFunctionProlog(const CGFunctionInfo , ArgI.getCoerceToType() == ConvertType(Ty) && ArgI.getDirectOffset() == 0) { assert(NumIRArgs == 1); -llvm::Value *V = FnArgs[FirstIRArg]; -auto AI = cast(V); +auto AI = Fn->getArg(FirstIRArg); if (const ParmVarDecl *PVD = dyn_cast(Arg)) { if (getNonNullAttr(CurCodeDecl, PVD, PVD->getType(), @@ -2499,6 +2493,7 @@ void CodeGenFunction::EmitFunctionProlog(const CGFunctionInfo , // LLVM expects swifterror parameters to be used in very restricted // ways. Copy the value into a less-restricted temporary. +llvm::Value *V = AI; if (FI.getExtParameterInfo(ArgNo).getABI() == ParameterABI::SwiftErrorResult) { QualType pointeeTy = Ty->getPointeeType(); @@ -2560,7 +2555,7 @@ void CodeGenFunction::EmitFunctionProlog(const CGFunctionInfo , assert(STy->getNumElements() == NumIRArgs); for (unsigned i = 0, e = STy->getNumElements();
[clang] 9142c0b - [clang][codegen] Hoist parameter attribute setting in function prolog.
Author: Michael Liao Date: 2020-05-05T15:31:51-04:00 New Revision: 9142c0b46bfea13d9348ab3d1d706a10ad9e5c8e URL: https://github.com/llvm/llvm-project/commit/9142c0b46bfea13d9348ab3d1d706a10ad9e5c8e DIFF: https://github.com/llvm/llvm-project/commit/9142c0b46bfea13d9348ab3d1d706a10ad9e5c8e.diff LOG: [clang][codegen] Hoist parameter attribute setting in function prolog. Summary: - If the coerced type is still a pointer, it should be set with proper parameter attributes, such as `noalias`, `nonnull`, and etc. Hoist that (pointer) parameter attribute setting so that the coerced pointer parameter could be marked properly. Depends on D79394 Reviewers: rjmccall, kerbowa, yaxunl Subscribers: jvesely, nhaehnle, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D79395 Added: Modified: clang/lib/CodeGen/CGCall.cpp clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu Removed: diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp index 44f298892ecf..e336741d9111 100644 --- a/clang/lib/CodeGen/CGCall.cpp +++ b/clang/lib/CodeGen/CGCall.cpp @@ -2425,15 +2425,18 @@ void CodeGenFunction::EmitFunctionProlog(const CGFunctionInfo , case ABIArgInfo::Extend: case ABIArgInfo::Direct: { - - // If we have the trivial case, handle it with no muss and fuss. - if (!isa(ArgI.getCoerceToType()) && - ArgI.getCoerceToType() == ConvertType(Ty) && - ArgI.getDirectOffset() == 0) { + auto AI = Fn->getArg(FirstIRArg); + llvm::Type *LTy = ConvertType(Arg->getType()); + + // Prepare parameter attributes. So far, only attributes for pointer + // parameters are prepared. See + // http://llvm.org/docs/LangRef.html#paramattrs. + if (ArgI.getDirectOffset() == 0 && LTy->isPointerTy() && + ArgI.getCoerceToType()->isPointerTy()) { assert(NumIRArgs == 1); -auto AI = Fn->getArg(FirstIRArg); if (const ParmVarDecl *PVD = dyn_cast(Arg)) { + // Set `nonnull` attribute if any. if (getNonNullAttr(CurCodeDecl, PVD, PVD->getType(), PVD->getFunctionScopeIndex()) && !CGM.getCodeGenOpts().NullPointerIsValid) @@ -2471,6 +2474,7 @@ void CodeGenFunction::EmitFunctionProlog(const CGFunctionInfo , AI->addAttr(llvm::Attribute::NonNull); } + // Set `align` attribute if any. const auto *AVAttr = PVD->getAttr(); if (!AVAttr) if (const auto *TOTy = dyn_cast(OTy)) @@ -2488,8 +2492,17 @@ void CodeGenFunction::EmitFunctionProlog(const CGFunctionInfo , } } +// Set 'noalias' if an argument type has the `restrict` qualifier. if (Arg->getType().isRestrictQualified()) AI->addAttr(llvm::Attribute::NoAlias); + } + + // Prepare the argument value. If we have the trivial case, handle it + // with no muss and fuss. + if (!isa(ArgI.getCoerceToType()) && + ArgI.getCoerceToType() == ConvertType(Ty) && + ArgI.getDirectOffset() == 0) { +assert(NumIRArgs == 1); // LLVM expects swifterror parameters to be used in very restricted // ways. Copy the value into a less-restricted temporary. diff --git a/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu b/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu index 6e4de1f0f5c3..8aeb0f759e6c 100644 --- a/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu +++ b/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu @@ -67,3 +67,10 @@ __global__ void kernel6(struct T t) { t.x[0][0] += 1.f; t.x[1][0] += 2.f; } + +// Check that coerced pointers retain the noalias attribute when qualified with __restrict. +// CHECK: define amdgpu_kernel void @_Z7kernel7Pi(i32 addrspace(1)* noalias %x.coerce) +// HOST: define void @_Z22__device_stub__kernel7Pi(i32* noalias %x) +__global__ void kernel7(int *__restrict x) { + x[0]++; +} ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] f3a3db8 - Add the missing '='. NFC.
Author: Michael Liao Date: 2020-05-02T01:07:44-04:00 New Revision: f3a3db8627e9fa673fa413a2e41fe5443db7c6c3 URL: https://github.com/llvm/llvm-project/commit/f3a3db8627e9fa673fa413a2e41fe5443db7c6c3 DIFF: https://github.com/llvm/llvm-project/commit/f3a3db8627e9fa673fa413a2e41fe5443db7c6c3.diff LOG: Add the missing '='. NFC. Added: Modified: clang/lib/Format/UnwrappedLineParser.cpp Removed: diff --git a/clang/lib/Format/UnwrappedLineParser.cpp b/clang/lib/Format/UnwrappedLineParser.cpp index 7456e6d5fb48..96e0bd2276fa 100644 --- a/clang/lib/Format/UnwrappedLineParser.cpp +++ b/clang/lib/Format/UnwrappedLineParser.cpp @@ -1471,7 +1471,7 @@ void UnwrappedLineParser::parseStructuralElement() { } else if (Style.Language == FormatStyle::LK_Proto && FormatTok->Tok.is(tok::less)) { nextToken(); -parseBracedList(/*ContinueOnSemicolons=*/false, /*IsEnum*/false, +parseBracedList(/*ContinueOnSemicolons=*/false, /*IsEnum=*/false, /*ClosingBraceKind=*/tok::greater); } break; @@ -1824,7 +1824,7 @@ bool UnwrappedLineParser::parseBracedList(bool ContinueOnSemicolons, case tok::less: if (Style.Language == FormatStyle::LK_Proto) { nextToken(); -parseBracedList(/*ContinueOnSemicolons=*/false, /*IsEnum*/false, +parseBracedList(/*ContinueOnSemicolons=*/false, /*IsEnum=*/false, /*ClosingBraceKind=*/tok::greater); } else { nextToken(); ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] d1c4361 - [clang-format] Add the missing default argument.
Author: Michael Liao Date: 2020-04-30T17:36:43-04:00 New Revision: d1c43615ed068f2f915ccdd959ef583cd5177929 URL: https://github.com/llvm/llvm-project/commit/d1c43615ed068f2f915ccdd959ef583cd5177929 DIFF: https://github.com/llvm/llvm-project/commit/d1c43615ed068f2f915ccdd959ef583cd5177929.diff LOG: [clang-format] Add the missing default argument. Added: Modified: clang/lib/Format/UnwrappedLineParser.cpp Removed: diff --git a/clang/lib/Format/UnwrappedLineParser.cpp b/clang/lib/Format/UnwrappedLineParser.cpp index c9528188c61c..7456e6d5fb48 100644 --- a/clang/lib/Format/UnwrappedLineParser.cpp +++ b/clang/lib/Format/UnwrappedLineParser.cpp @@ -1471,7 +1471,7 @@ void UnwrappedLineParser::parseStructuralElement() { } else if (Style.Language == FormatStyle::LK_Proto && FormatTok->Tok.is(tok::less)) { nextToken(); -parseBracedList(/*ContinueOnSemicolons=*/false, +parseBracedList(/*ContinueOnSemicolons=*/false, /*IsEnum*/false, /*ClosingBraceKind=*/tok::greater); } break; @@ -1824,7 +1824,7 @@ bool UnwrappedLineParser::parseBracedList(bool ContinueOnSemicolons, case tok::less: if (Style.Language == FormatStyle::LK_Proto) { nextToken(); -parseBracedList(/*ContinueOnSemicolons=*/false, +parseBracedList(/*ContinueOnSemicolons=*/false, /*IsEnum*/false, /*ClosingBraceKind=*/tok::greater); } else { nextToken(); ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 612720d - [hip] Remove test using `hip_pinned_shadow` attribute. NFC.
Author: Michael Liao Date: 2020-04-27T16:44:59-04:00 New Revision: 612720db874d06a50b793c301e5b3b857e3e7c70 URL: https://github.com/llvm/llvm-project/commit/612720db874d06a50b793c301e5b3b857e3e7c70 DIFF: https://github.com/llvm/llvm-project/commit/612720db874d06a50b793c301e5b3b857e3e7c70.diff LOG: [hip] Remove test using `hip_pinned_shadow` attribute. NFC. Added: Modified: Removed: clang/test/CodeGenCUDA/hip-pinned-shadow.hip diff --git a/clang/test/CodeGenCUDA/hip-pinned-shadow.hip b/clang/test/CodeGenCUDA/hip-pinned-shadow.hip deleted file mode 100644 index 7f0e7544d828.. --- a/clang/test/CodeGenCUDA/hip-pinned-shadow.hip +++ /dev/null @@ -1,27 +0,0 @@ -// REQUIRES: amdgpu-registered-target - -// RUN: %clang_cc1 -triple amdgcn -fcuda-is-device -std=c++11 -fvisibility hidden -fapply-global-visibility-to-externs \ -// RUN: -emit-llvm -o - %s | FileCheck -check-prefixes=HIPDEV %s -// RUN: %clang_cc1 -triple x86_64 -std=c++11 \ -// RUN: -emit-llvm -o - %s | FileCheck -check-prefixes=HIPHOST %s -// RUN: %clang_cc1 -triple amdgcn -fcuda-is-device -std=c++11 -fvisibility hidden -fapply-global-visibility-to-externs \ -// RUN: -O3 -emit-llvm -o - %s | FileCheck -check-prefixes=HIPDEVUNSED %s - -struct textureReference { - int a; -}; - -template -struct texture : public textureReference { -texture() { a = 1; } -}; - -__attribute__((hip_pinned_shadow)) texture tex; -// CUDADEV-NOT: @tex -// CUDAHOST-NOT: call i32 @__hipRegisterVar{{.*}}@tex -// HIPDEV: @tex = external addrspace(1) global %struct.texture -// HIPDEV-NOT: declare{{.*}}void @_ZN7textureIfLi2ELi1EEC1Ev -// HIPHOST: define{{.*}}@_ZN7textureIfLi2ELi1EEC1Ev -// HIPHOST: call i32 @__hipRegisterVar{{.*}}@tex{{.*}}i32 0, i32 4, i32 0, i32 0) -// HIPDEVUNSED: @tex = external addrspace(1) global %struct.texture -// HIPDEVUNSED-NOT: declare{{.*}}void @_ZN7textureIfLi2ELi1EEC1Ev ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 86e3b73 - [hip] Claim builtin type `__float128` supported if the host target supports it.
Author: Michael Liao Date: 2020-04-21T15:56:40-04:00 New Revision: 86e3b735cd803cc22c9eae15d99ce9df5956aae6 URL: https://github.com/llvm/llvm-project/commit/86e3b735cd803cc22c9eae15d99ce9df5956aae6 DIFF: https://github.com/llvm/llvm-project/commit/86e3b735cd803cc22c9eae15d99ce9df5956aae6.diff LOG: [hip] Claim builtin type `__float128` supported if the host target supports it. Reviewers: tra, yaxunl Subscribers: jvesely, nhaehnle, kerbowa, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D78513 Added: clang/test/SemaCUDA/amdgpu-f128.cu Modified: clang/lib/Basic/Targets/AMDGPU.cpp Removed: diff --git a/clang/lib/Basic/Targets/AMDGPU.cpp b/clang/lib/Basic/Targets/AMDGPU.cpp index 3fd9008e4660..b9d7640a10b8 100644 --- a/clang/lib/Basic/Targets/AMDGPU.cpp +++ b/clang/lib/Basic/Targets/AMDGPU.cpp @@ -363,4 +363,17 @@ void AMDGPUTargetInfo::setAuxTarget(const TargetInfo *Aux) { copyAuxTarget(Aux); LongDoubleFormat = SaveLongDoubleFormat; Float128Format = SaveFloat128Format; + // For certain builtin types support on the host target, claim they are + // support to pass the compilation of the host code during the device-side + // compilation. + // FIXME: As the side effect, we also accept `__float128` uses in the device + // code. To rejct these builtin types supported in the host target but not in + // the device target, one approach would support `device_builtin` attribute + // so that we could tell the device builtin types from the host ones. The + // also solves the diff erent representations of the same builtin type, such + // as `size_t` in the MSVC environment. + if (Aux->hasFloat128Type()) { +HasFloat128 = true; +Float128Format = DoubleFormat; + } } diff --git a/clang/test/SemaCUDA/amdgpu-f128.cu b/clang/test/SemaCUDA/amdgpu-f128.cu new file mode 100644 index ..9a0212cdb93c --- /dev/null +++ b/clang/test/SemaCUDA/amdgpu-f128.cu @@ -0,0 +1,4 @@ +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -aux-triple x86_64-unknown-linux-gnu -fcuda-is-device -fsyntax-only -verify %s + +// expected-no-diagnostics +typedef __float128 f128_t; ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 50472c4 - Remove extra ‘;’. NFC.
Author: Michael Liao Date: 2020-04-15T17:22:03-04:00 New Revision: 50472c422cc6d27a4532a4025c4339fb6920 URL: https://github.com/llvm/llvm-project/commit/50472c422cc6d27a4532a4025c4339fb6920 DIFF: https://github.com/llvm/llvm-project/commit/50472c422cc6d27a4532a4025c4339fb6920.diff LOG: Remove extra ‘;’. NFC. Added: Modified: clang/include/clang/AST/RecursiveASTVisitor.h Removed: diff --git a/clang/include/clang/AST/RecursiveASTVisitor.h b/clang/include/clang/AST/RecursiveASTVisitor.h index 10ea91ea9cfa..85eb6259a419 100644 --- a/clang/include/clang/AST/RecursiveASTVisitor.h +++ b/clang/include/clang/AST/RecursiveASTVisitor.h @@ -1993,7 +1993,7 @@ DEF_TRAVERSE_DECL(BindingDecl, { DEF_TRAVERSE_DECL(MSPropertyDecl, { TRY_TO(TraverseDeclaratorHelper(D)); }) -DEF_TRAVERSE_DECL(MSGuidDecl, {}); +DEF_TRAVERSE_DECL(MSGuidDecl, {}) DEF_TRAVERSE_DECL(FieldDecl, { TRY_TO(TraverseDeclaratorHelper(D)); ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 96c4ec8 - Remove extra whitespace. NFC.
Author: Michael Liao Date: 2020-04-10T03:22:01-04:00 New Revision: 96c4ec8fdbd95048114cf058679bd8fc08ab76b3 URL: https://github.com/llvm/llvm-project/commit/96c4ec8fdbd95048114cf058679bd8fc08ab76b3 DIFF: https://github.com/llvm/llvm-project/commit/96c4ec8fdbd95048114cf058679bd8fc08ab76b3.diff LOG: Remove extra whitespace. NFC. Added: Modified: clang/include/clang/Basic/DiagnosticDriverKinds.td Removed: diff --git a/clang/include/clang/Basic/DiagnosticDriverKinds.td b/clang/include/clang/Basic/DiagnosticDriverKinds.td index e35ca843ff56..cba59cb3b66d 100644 --- a/clang/include/clang/Basic/DiagnosticDriverKinds.td +++ b/clang/include/clang/Basic/DiagnosticDriverKinds.td @@ -50,14 +50,14 @@ def warn_drv_avr_stdlib_not_linked: Warning< InGroup; def err_drv_cuda_bad_gpu_arch : Error<"Unsupported CUDA gpu architecture: %0">; def err_drv_no_cuda_installation : Error< - "cannot find CUDA installation. Provide its path via --cuda-path, or pass " + "cannot find CUDA installation. Provide its path via --cuda-path, or pass " "-nocudainc to build without CUDA includes.">; def err_drv_no_cuda_libdevice : Error< "cannot find libdevice for %0. Provide path to diff erent CUDA installation " "via --cuda-path, or pass -nocudalib to build without linking with libdevice.">; def err_drv_cuda_version_unsupported : Error< "GPU arch %0 is supported by CUDA versions between %1 and %2 (inclusive), " - "but installation at %3 is %4. Use --cuda-path to specify a diff erent CUDA " + "but installation at %3 is %4. Use --cuda-path to specify a diff erent CUDA " "install, pass a diff erent GPU arch with --cuda-gpu-arch, or pass " "--no-cuda-version-check.">; def warn_drv_unknown_cuda_version: Warning< ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] c97be2c - [hip] Remove `hip_pinned_shadow`.
Author: Michael Liao Date: 2020-04-07T09:51:49-04:00 New Revision: c97be2c377852fad7eb38409aae5692fa417e49b URL: https://github.com/llvm/llvm-project/commit/c97be2c377852fad7eb38409aae5692fa417e49b DIFF: https://github.com/llvm/llvm-project/commit/c97be2c377852fad7eb38409aae5692fa417e49b.diff LOG: [hip] Remove `hip_pinned_shadow`. Summary: - Use `device_builtin_surface` and `device_builtin_texture` for surface/texture reference support. So far, both the host and device use the same reference type, which could be revised later when interface/implementation is stablized. Reviewers: yaxunl Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77583 Added: Modified: clang/include/clang/Basic/Attr.td clang/include/clang/Basic/AttrDocs.td clang/lib/CodeGen/CodeGenModule.cpp clang/lib/CodeGen/CodeGenModule.h clang/lib/CodeGen/TargetInfo.cpp clang/lib/Driver/ToolChains/HIP.cpp clang/lib/Sema/SemaDeclAttr.cpp clang/test/Driver/hip-toolchain-no-rdc.hip clang/test/Driver/hip-toolchain-rdc.hip clang/test/Misc/pragma-attribute-supported-attributes-list.test Removed: clang/test/AST/ast-dump-hip-pinned-shadow.cu clang/test/SemaCUDA/hip-pinned-shadow.cu diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td index f55ce2cc84dd..c586f9b9466a 100644 --- a/clang/include/clang/Basic/Attr.td +++ b/clang/include/clang/Basic/Attr.td @@ -322,7 +322,6 @@ class LangOpt { def MicrosoftExt : LangOpt<"MicrosoftExt">; def Borland : LangOpt<"Borland">; def CUDA : LangOpt<"CUDA">; -def HIP : LangOpt<"HIP">; def SYCL : LangOpt<"SYCLIsDevice">; def COnly : LangOpt<"", "!LangOpts.CPlusPlus">; def CPlusPlus : LangOpt<"CPlusPlus">; @@ -1052,13 +1051,6 @@ def CUDADevice : InheritableAttr { let Documentation = [Undocumented]; } -def HIPPinnedShadow : InheritableAttr { - let Spellings = [GNU<"hip_pinned_shadow">, Declspec<"__hip_pinned_shadow__">]; - let Subjects = SubjectList<[Var]>; - let LangOpts = [HIP]; - let Documentation = [HIPPinnedShadowDocs]; -} - def CUDADeviceBuiltin : IgnoredAttr { let Spellings = [GNU<"device_builtin">, Declspec<"__device_builtin__">]; let LangOpts = [CUDA]; diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td index fb1c82a80115..36561c04d395 100644 --- a/clang/include/clang/Basic/AttrDocs.td +++ b/clang/include/clang/Basic/AttrDocs.td @@ -4613,18 +4613,6 @@ only call one function. }]; } -def HIPPinnedShadowDocs : Documentation { - let Category = DocCatType; - let Content = [{ -The GNU style attribute __attribute__((hip_pinned_shadow)) or MSVC style attribute -__declspec(hip_pinned_shadow) can be added to the definition of a global variable -to indicate it is a HIP pinned shadow variable. A HIP pinned shadow variable can -be accessed on both device side and host side. It has external linkage and is -not initialized on device side. It has internal linkage and is initialized by -the initializer on host side. - }]; -} - def CUDADeviceBuiltinSurfaceTypeDocs : Documentation { let Category = DocCatType; let Content = [{ diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp index 1645a9eb17de..8b7d52b88146 100644 --- a/clang/lib/CodeGen/CodeGenModule.cpp +++ b/clang/lib/CodeGen/CodeGenModule.cpp @@ -1955,9 +1955,9 @@ void CodeGenModule::SetFunctionAttributes(GlobalDecl GD, llvm::Function *F, } } -void CodeGenModule::addUsedGlobal(llvm::GlobalValue *GV, bool SkipCheck) { - assert(SkipCheck || (!GV->isDeclaration() && - "Only globals with definition can force usage.")); +void CodeGenModule::addUsedGlobal(llvm::GlobalValue *GV) { + assert(!GV->isDeclaration() && + "Only globals with definition can force usage."); LLVMUsed.emplace_back(GV); } @@ -2520,7 +2520,6 @@ void CodeGenModule::EmitGlobal(GlobalDecl GD) { !Global->hasAttr() && !Global->hasAttr() && !Global->hasAttr() && - !(LangOpts.HIP && Global->hasAttr()) && !Global->getType()->isCUDADeviceBuiltinSurfaceType() && !Global->getType()->isCUDADeviceBuiltinTextureType()) return; @@ -3928,10 +3927,8 @@ void CodeGenModule::EmitGlobalVarDefinition(const VarDecl *D, D->getType()->isCUDADeviceBuiltinTextureType()); // HIP pinned shadow of initialized host-side global variables are also // left undefined. - bool IsHIPPinnedShadowVar = - getLangOpts().CUDAIsDevice && D->hasAttr(); - if (getLangOpts().CUDA && (IsCUDASharedVar || IsCUDAShadowVar || - IsCUDADeviceShadowVar || IsHIPPinnedShadowVar)) + if (getLangOpts().CUDA && + (IsCUDASharedVar || IsCUDAShadowVar || IsCUDADeviceShadowVar)) Init = llvm::UndefValue::get(getTypes().ConvertType(ASTTy)); else if
[clang] b952d79 - [cuda][hip] Fix `RegisterVar` function prototype.
Author: Michael Liao Date: 2020-04-03T12:57:09-04:00 New Revision: b952d799cacdb7efd44c1c9468bb11471cc16874 URL: https://github.com/llvm/llvm-project/commit/b952d799cacdb7efd44c1c9468bb11471cc16874 DIFF: https://github.com/llvm/llvm-project/commit/b952d799cacdb7efd44c1c9468bb11471cc16874.diff LOG: [cuda][hip] Fix `RegisterVar` function prototype. Summary: - `RegisterVar` has `void` return type and `size_t` in its variable size parameter in HIP or CUDA 9.0+. Reviewers: tra, yaxunl Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77398 Added: Modified: clang/include/clang/Basic/Cuda.h clang/lib/Basic/Cuda.cpp clang/lib/CodeGen/CGCUDANV.cpp clang/test/CodeGenCUDA/device-stub.cu Removed: diff --git a/clang/include/clang/Basic/Cuda.h b/clang/include/clang/Basic/Cuda.h index da572957d10d..c2ebf8734304 100644 --- a/clang/include/clang/Basic/Cuda.h +++ b/clang/include/clang/Basic/Cuda.h @@ -117,6 +117,7 @@ enum class CudaFeature { CUDA_USES_FATBIN_REGISTER_END, }; +CudaVersion ToCudaVersion(llvm::VersionTuple); bool CudaFeatureEnabled(llvm::VersionTuple, CudaFeature); bool CudaFeatureEnabled(CudaVersion, CudaFeature); diff --git a/clang/lib/Basic/Cuda.cpp b/clang/lib/Basic/Cuda.cpp index e06d120c58bf..74eb5473b71d 100644 --- a/clang/lib/Basic/Cuda.cpp +++ b/clang/lib/Basic/Cuda.cpp @@ -362,7 +362,7 @@ CudaVersion MaxVersionForCudaArch(CudaArch A) { } } -static CudaVersion ToCudaVersion(llvm::VersionTuple Version) { +CudaVersion ToCudaVersion(llvm::VersionTuple Version) { int IVer = Version.getMajor() * 10 + Version.getMinor().getValueOr(0); switch(IVer) { diff --git a/clang/lib/CodeGen/CGCUDANV.cpp b/clang/lib/CodeGen/CGCUDANV.cpp index 6d92ef33b885..351c5058aa4c 100644 --- a/clang/lib/CodeGen/CGCUDANV.cpp +++ b/clang/lib/CodeGen/CGCUDANV.cpp @@ -440,13 +440,19 @@ llvm::Function *CGNVCUDARuntime::makeRegisterGlobalsFn() { Builder.CreateCall(RegisterFunc, Args); } + llvm::Type *VarSizeTy = IntTy; + // For HIP or CUDA 9.0+, device variable size is type of `size_t`. + if (CGM.getLangOpts().HIP || + ToCudaVersion(CGM.getTarget().getSDKVersion()) >= CudaVersion::CUDA_90) +VarSizeTy = SizeTy; + // void __cudaRegisterVar(void **, char *, char *, const char *, //int, int, int, int) llvm::Type *RegisterVarParams[] = {VoidPtrPtrTy, CharPtrTy, CharPtrTy, - CharPtrTy,IntTy, IntTy, + CharPtrTy,IntTy, VarSizeTy, IntTy,IntTy}; llvm::FunctionCallee RegisterVar = CGM.CreateRuntimeFunction( - llvm::FunctionType::get(IntTy, RegisterVarParams, false), + llvm::FunctionType::get(VoidTy, RegisterVarParams, false), addUnderscoredPrefixToName("RegisterVar")); // void __cudaRegisterSurface(void **, const struct surfaceReference *, //const void **, const char *, int, int); @@ -476,7 +482,7 @@ llvm::Function *CGNVCUDARuntime::makeRegisterGlobalsFn() { VarName, VarName, llvm::ConstantInt::get(IntTy, Info.Flags.isExtern()), - llvm::ConstantInt::get(IntTy, VarSize), + llvm::ConstantInt::get(VarSizeTy, VarSize), llvm::ConstantInt::get(IntTy, Info.Flags.isConstant()), llvm::ConstantInt::get(IntTy, 0)}; Builder.CreateCall(RegisterVar, Args); diff --git a/clang/test/CodeGenCUDA/device-stub.cu b/clang/test/CodeGenCUDA/device-stub.cu index 9db5738cdede..0f4a5644fd48 100644 --- a/clang/test/CodeGenCUDA/device-stub.cu +++ b/clang/test/CodeGenCUDA/device-stub.cu @@ -181,10 +181,10 @@ void hostfunc(void) { kernelfunc<<<1, 1>>>(1, 1, 1); } // Test that we've built a function to register kernels and global vars. // ALL: define internal void @__[[PREFIX]]_register_globals // ALL: call{{.*}}[[PREFIX]]RegisterFunction(i8** %0, {{.*}}kernelfunc{{[^,]*}}, {{[^@]*}}@0 -// ALL-DAG: call{{.*}}[[PREFIX]]RegisterVar(i8** %0, {{.*}}device_var{{[^,]*}}, {{[^@]*}}@1, {{.*}}i32 0, i32 4, i32 0, i32 0 -// ALL-DAG: call{{.*}}[[PREFIX]]RegisterVar(i8** %0, {{.*}}constant_var{{[^,]*}}, {{[^@]*}}@2, {{.*}}i32 0, i32 4, i32 1, i32 0 -// ALL-DAG: call{{.*}}[[PREFIX]]RegisterVar(i8** %0, {{.*}}ext_device_var_def{{[^,]*}}, {{[^@]*}}@3, {{.*}}i32 0, i32 4, i32 0, i32 0 -// ALL-DAG: call{{.*}}[[PREFIX]]RegisterVar(i8** %0, {{.*}}ext_constant_var_def{{[^,]*}}, {{[^@]*}}@4, {{.*}}i32 0, i32 4, i32 1, i32 0 +// ALL-DAG: call void {{.*}}[[PREFIX]]RegisterVar(i8** %0, {{.*}}device_var{{[^,]*}}, {{[^@]*}}@1, {{.*}}i32 0, {{i32|i64}} 4, i32 0, i32 0 +// ALL-DAG: call void {{.*}}[[PREFIX]]RegisterVar(i8** %0, {{.*}}constant_var{{[^,]*}}, {{[^@]*}}@2, {{.*}}i32 0, {{i32|i64}} 4, i32 1, i32 0 +// ALL-DAG: call void {{.*}}[[PREFIX]]RegisterVar(i8** %0,
[clang] cb63893 - Fix GCC warning on enum class bitfield. NFC.
Author: Michael Liao Date: 2020-03-28T10:20:34-04:00 New Revision: cb6389360b05e8f89d09ff133a4ba1fd011866c5 URL: https://github.com/llvm/llvm-project/commit/cb6389360b05e8f89d09ff133a4ba1fd011866c5 DIFF: https://github.com/llvm/llvm-project/commit/cb6389360b05e8f89d09ff133a4ba1fd011866c5.diff LOG: Fix GCC warning on enum class bitfield. NFC. Added: Modified: clang/lib/CodeGen/CGCUDANV.cpp clang/lib/CodeGen/CGCUDARuntime.h Removed: diff --git a/clang/lib/CodeGen/CGCUDANV.cpp b/clang/lib/CodeGen/CGCUDANV.cpp index ed02a7dc9173..6d92ef33b885 100644 --- a/clang/lib/CodeGen/CGCUDANV.cpp +++ b/clang/lib/CodeGen/CGCUDANV.cpp @@ -466,18 +466,19 @@ llvm::Function *CGNVCUDARuntime::makeRegisterGlobalsFn() { for (auto & : DeviceVars) { llvm::GlobalVariable *Var = Info.Var; llvm::Constant *VarName = makeConstantString(getDeviceSideName(Info.D)); -switch (Info.Flags.Kind) { +switch (Info.Flags.getKind()) { case DeviceVarFlags::Variable: { uint64_t VarSize = CGM.getDataLayout().getTypeAllocSize(Var->getValueType()); - llvm::Value *Args[] = {, - Builder.CreateBitCast(Var, VoidPtrTy), - VarName, - VarName, - llvm::ConstantInt::get(IntTy, Info.Flags.Extern), - llvm::ConstantInt::get(IntTy, VarSize), - llvm::ConstantInt::get(IntTy, Info.Flags.Constant), - llvm::ConstantInt::get(IntTy, 0)}; + llvm::Value *Args[] = { + , + Builder.CreateBitCast(Var, VoidPtrTy), + VarName, + VarName, + llvm::ConstantInt::get(IntTy, Info.Flags.isExtern()), + llvm::ConstantInt::get(IntTy, VarSize), + llvm::ConstantInt::get(IntTy, Info.Flags.isConstant()), + llvm::ConstantInt::get(IntTy, 0)}; Builder.CreateCall(RegisterVar, Args); break; } @@ -485,16 +486,16 @@ llvm::Function *CGNVCUDARuntime::makeRegisterGlobalsFn() { Builder.CreateCall( RegisterSurf, {, Builder.CreateBitCast(Var, VoidPtrTy), VarName, - VarName, llvm::ConstantInt::get(IntTy, Info.Flags.SurfTexType), - llvm::ConstantInt::get(IntTy, Info.Flags.Extern)}); + VarName, llvm::ConstantInt::get(IntTy, Info.Flags.getSurfTexType()), + llvm::ConstantInt::get(IntTy, Info.Flags.isExtern())}); break; case DeviceVarFlags::Texture: Builder.CreateCall( RegisterTex, {, Builder.CreateBitCast(Var, VoidPtrTy), VarName, - VarName, llvm::ConstantInt::get(IntTy, Info.Flags.SurfTexType), - llvm::ConstantInt::get(IntTy, Info.Flags.Normalized), - llvm::ConstantInt::get(IntTy, Info.Flags.Extern)}); + VarName, llvm::ConstantInt::get(IntTy, Info.Flags.getSurfTexType()), + llvm::ConstantInt::get(IntTy, Info.Flags.isNormalized()), + llvm::ConstantInt::get(IntTy, Info.Flags.isExtern())}); break; } } diff --git a/clang/lib/CodeGen/CGCUDARuntime.h b/clang/lib/CodeGen/CGCUDARuntime.h index b26132420d65..19e70a2022a5 100644 --- a/clang/lib/CodeGen/CGCUDARuntime.h +++ b/clang/lib/CodeGen/CGCUDARuntime.h @@ -42,17 +42,30 @@ class CGCUDARuntime { public: // Global variable properties that must be passed to CUDA runtime. - struct DeviceVarFlags { -enum DeviceVarKind : unsigned { + class DeviceVarFlags { + public: +enum DeviceVarKind { Variable, // Variable Surface, // Builtin surface Texture, // Builtin texture }; -DeviceVarKind Kind : 2; + + private: +unsigned Kind : 2; unsigned Extern : 1; unsigned Constant : 1; // Constant variable. unsigned Normalized : 1; // Normalized texture. int SurfTexType; // Type of surface/texutre. + + public: +DeviceVarFlags(DeviceVarKind K, bool E, bool C, bool N, int T) +: Kind(K), Extern(E), Constant(C), Normalized(N), SurfTexType(T) {} + +DeviceVarKind getKind() const { return static_cast(Kind); } +bool isExtern() const { return Extern; } +bool isConstant() const { return Constant; } +bool isNormalized() const { return Normalized; } +int getSurfTexType() const { return SurfTexType; } }; CGCUDARuntime(CodeGenModule ) : CGM(CGM) {} ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 5be9b8c - [cuda][hip] Add CUDA builtin surface/texture reference support.
Author: Michael Liao Date: 2020-03-27T17:18:49-04:00 New Revision: 5be9b8cbe2b2253f78a09a863bef18e574737465 URL: https://github.com/llvm/llvm-project/commit/5be9b8cbe2b2253f78a09a863bef18e574737465 DIFF: https://github.com/llvm/llvm-project/commit/5be9b8cbe2b2253f78a09a863bef18e574737465.diff LOG: [cuda][hip] Add CUDA builtin surface/texture reference support. Summary: - Re-commit after fix Sema checks on partial template specialization. Reviewers: tra, rjmccall, yaxunl, a.sidorin Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76365 Added: clang/test/CodeGenCUDA/surface.cu clang/test/CodeGenCUDA/texture.cu Modified: clang/include/clang/AST/Type.h clang/include/clang/Basic/Attr.td clang/include/clang/Basic/AttrDocs.td clang/include/clang/Basic/DiagnosticSemaKinds.td clang/lib/AST/Type.cpp clang/lib/CodeGen/CGCUDANV.cpp clang/lib/CodeGen/CGCUDARuntime.h clang/lib/CodeGen/CGExprAgg.cpp clang/lib/CodeGen/CodeGenModule.cpp clang/lib/CodeGen/CodeGenTypes.cpp clang/lib/CodeGen/TargetInfo.cpp clang/lib/CodeGen/TargetInfo.h clang/lib/Headers/__clang_cuda_runtime_wrapper.h clang/lib/Sema/SemaDeclAttr.cpp clang/lib/Sema/SemaDeclCXX.cpp clang/test/Misc/pragma-attribute-supported-attributes-list.test clang/test/SemaCUDA/attr-declspec.cu clang/test/SemaCUDA/attributes-on-non-cuda.cu clang/test/SemaCUDA/bad-attributes.cu llvm/include/llvm/IR/Operator.h Removed: diff --git a/clang/include/clang/AST/Type.h b/clang/include/clang/AST/Type.h index 3a2411b4ed29..6b46fc5ad312 100644 --- a/clang/include/clang/AST/Type.h +++ b/clang/include/clang/AST/Type.h @@ -2111,6 +2111,11 @@ class alignas(8) Type : public ExtQualsTypeCommonBase { /// than implicitly __strong. bool isObjCARCImplicitlyUnretainedType() const; + /// Check if the type is the CUDA device builtin surface type. + bool isCUDADeviceBuiltinSurfaceType() const; + /// Check if the type is the CUDA device builtin texture type. + bool isCUDADeviceBuiltinTextureType() const; + /// Return the implicit lifetime for this type, which must not be dependent. Qualifiers::ObjCLifetime getObjCARCImplicitLifetime() const; diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td index 5a90b2be2cbf..96bfdd313f47 100644 --- a/clang/include/clang/Basic/Attr.td +++ b/clang/include/clang/Basic/Attr.td @@ -1064,16 +1064,20 @@ def CUDADeviceBuiltin : IgnoredAttr { let LangOpts = [CUDA]; } -def CUDADeviceBuiltinSurfaceType : IgnoredAttr { +def CUDADeviceBuiltinSurfaceType : InheritableAttr { let Spellings = [GNU<"device_builtin_surface_type">, Declspec<"__device_builtin_surface_type__">]; let LangOpts = [CUDA]; + let Subjects = SubjectList<[CXXRecord]>; + let Documentation = [CUDADeviceBuiltinSurfaceTypeDocs]; } -def CUDADeviceBuiltinTextureType : IgnoredAttr { +def CUDADeviceBuiltinTextureType : InheritableAttr { let Spellings = [GNU<"device_builtin_texture_type">, Declspec<"__device_builtin_texture_type__">]; let LangOpts = [CUDA]; + let Subjects = SubjectList<[CXXRecord]>; + let Documentation = [CUDADeviceBuiltinTextureTypeDocs]; } def CUDAGlobal : InheritableAttr { diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td index a1cf25ed3929..2c89dc6f4952 100644 --- a/clang/include/clang/Basic/AttrDocs.td +++ b/clang/include/clang/Basic/AttrDocs.td @@ -4624,6 +4624,28 @@ the initializer on host side. }]; } +def CUDADeviceBuiltinSurfaceTypeDocs : Documentation { + let Category = DocCatType; + let Content = [{ +The ``device_builtin_surface_type`` attribute can be applied to a class +template when declaring the surface reference. A surface reference variable +could be accessed on the host side and, on the device side, might be translated +into an internal surface object, which is established through surface bind and +unbind runtime APIs. + }]; +} + +def CUDADeviceBuiltinTextureTypeDocs : Documentation { + let Category = DocCatType; + let Content = [{ +The ``device_builtin_texture_type`` attribute can be applied to a class +template when declaring the texture reference. A texture reference variable +could be accessed on the host side and, on the device side, might be translated +into an internal texture object, which is established through texture bind and +unbind runtime APIs. + }]; +} + def LifetimeOwnerDocs : Documentation { let Category = DocCatDecl; let Content = [{ diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td index 762dd1469236..c642c2ba36c8 100644 --- a/clang/include/clang/Basic/DiagnosticSemaKinds.td +++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td @@ -7967,6 +7967,22 @@ def err_cuda_ovl_target : Error< def
Re: [clang] d264f02 - Fix `-Wreturn-type` warning. NFC.
Thanks for catching that. On Thu, Mar 26, 2020 at 11:14 PM David Blaikie wrote: > > Usually this sort of thing is addressed with llvm_unreachable, rather than a > return statement that's not expected to be reached by any valid execution of > LLVM (it'd require a carefully hand-crafted CPU kind to reach that return > (since all the actual enumerators result in returns earlier, in the switch > statement above), which probably isn't an intended code path?) > > I've made that change in 819e540208d5d62e7841d0dbdef3580eecc2c2b6 > > On Wed, Mar 25, 2020 at 9:59 PM Michael Liao via cfe-commits > wrote: >> >> >> Author: Michael Liao >> Date: 2020-03-26T00:53:24-04:00 >> New Revision: d264f02c6f502960e2bcdd332f250efc702d09f2 >> >> URL: >> https://github.com/llvm/llvm-project/commit/d264f02c6f502960e2bcdd332f250efc702d09f2 >> DIFF: >> https://github.com/llvm/llvm-project/commit/d264f02c6f502960e2bcdd332f250efc702d09f2.diff >> >> LOG: Fix `-Wreturn-type` warning. NFC. >> >> Added: >> >> >> Modified: >> clang/lib/Basic/Targets/X86.cpp >> >> Removed: >> >> >> >> >> diff --git a/clang/lib/Basic/Targets/X86.cpp >> b/clang/lib/Basic/Targets/X86.cpp >> index f35b520de657..8a7d0f17760e 100644 >> --- a/clang/lib/Basic/Targets/X86.cpp >> +++ b/clang/lib/Basic/Targets/X86.cpp >> @@ -1842,6 +1842,7 @@ Optional >> X86TargetInfo::getCPUCacheLineSize() const { >> case CK_Generic: >>return None; >>} >> + return None; >> } >> >> bool X86TargetInfo::validateOutputSize(const llvm::StringMap >> , >> >> >> >> ___ >> cfe-commits mailing list >> cfe-commits@lists.llvm.org >> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 6a9ad5f - [cuda][hip] Add CUDA builtin surface/texture reference support.
Author: Michael Liao Date: 2020-03-26T14:44:52-04:00 New Revision: 6a9ad5f3f4ac66f0cae592e911f4baeb6ee5eca6 URL: https://github.com/llvm/llvm-project/commit/6a9ad5f3f4ac66f0cae592e911f4baeb6ee5eca6 DIFF: https://github.com/llvm/llvm-project/commit/6a9ad5f3f4ac66f0cae592e911f4baeb6ee5eca6.diff LOG: [cuda][hip] Add CUDA builtin surface/texture reference support. Summary: - Even though the bindless surface/texture interfaces are promoted, there are still code using surface/texture references. For example, [PR#26400](https://bugs.llvm.org/show_bug.cgi?id=26400) reports the compilation issue for code using `tex2D` with texture references. For better compatibility, this patch proposes the support of surface/texture references. - Due to the absent documentation and magic headers, it's believed that `nvcc` does use builtins for texture support. From the limited NVVM documentation[^nvvm] and NVPTX backend texture/surface related tests[^test], it's believed that surface/texture references are supported by replacing their reference types, which are annotated with `device_builtin_surface_type`/`device_builtin_texture_type`, with the corresponding handle-like object types, `cudaSurfaceObject_t` or `cudaTextureObject_t`, in the device-side compilation. On the host side, that global handle variables are registered and will be established and updated later when corresponding binding/unbinding APIs are called[^bind]. Surface/texture references are most like device global variables but represented in different types on the host and device sides. - In this patch, the following changes are proposed to support that behavior: + Refine `device_builtin_surface_type` and `device_builtin_texture_type` attributes to be applied on `Type` decl only to check whether a variable is of the surface/texture reference type. + Add hooks in code generation to replace that reference types with the correponding object types as well as all accesses to them. In particular, `nvvm.texsurf.handle.internal` should be used to load object handles from global reference variables[^texsurf] as well as metadata annotations. + Generate host-side registration with proper template argument parsing. --- [^nvvm]: https://docs.nvidia.com/cuda/pdf/NVVM_IR_Specification.pdf [^test]: https://raw.githubusercontent.com/llvm/llvm-project/master/llvm/test/CodeGen/NVPTX/tex-read-cuda.ll [^bind]: See section 3.2.11.1.2 ``Texture reference API` in [CUDA C Programming Guide](https://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf). [^texsurf]: According to NVVM IR, `nvvm.texsurf.handle` should be used. But, the current backend doesn't have that supported. We may revise that later. Reviewers: tra, rjmccall, yaxunl, a.sidorin Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76365 Added: clang/test/CodeGenCUDA/surface.cu clang/test/CodeGenCUDA/texture.cu Modified: clang/include/clang/AST/Type.h clang/include/clang/Basic/Attr.td clang/include/clang/Basic/AttrDocs.td clang/include/clang/Basic/DiagnosticSemaKinds.td clang/lib/AST/Type.cpp clang/lib/CodeGen/CGCUDANV.cpp clang/lib/CodeGen/CGCUDARuntime.h clang/lib/CodeGen/CGExprAgg.cpp clang/lib/CodeGen/CodeGenModule.cpp clang/lib/CodeGen/CodeGenTypes.cpp clang/lib/CodeGen/TargetInfo.cpp clang/lib/CodeGen/TargetInfo.h clang/lib/Headers/__clang_cuda_runtime_wrapper.h clang/lib/Sema/SemaDeclAttr.cpp clang/lib/Sema/SemaDeclCXX.cpp clang/test/Misc/pragma-attribute-supported-attributes-list.test clang/test/SemaCUDA/attr-declspec.cu clang/test/SemaCUDA/attributes-on-non-cuda.cu clang/test/SemaCUDA/bad-attributes.cu llvm/include/llvm/IR/Operator.h Removed: diff --git a/clang/include/clang/AST/Type.h b/clang/include/clang/AST/Type.h index b8f49127bbd0..673d37109eb6 100644 --- a/clang/include/clang/AST/Type.h +++ b/clang/include/clang/AST/Type.h @@ -2111,6 +2111,11 @@ class alignas(8) Type : public ExtQualsTypeCommonBase { /// than implicitly __strong. bool isObjCARCImplicitlyUnretainedType() const; + /// Check if the type is the CUDA device builtin surface type. + bool isCUDADeviceBuiltinSurfaceType() const; + /// Check if the type is the CUDA device builtin texture type. + bool isCUDADeviceBuiltinTextureType() const; + /// Return the implicit lifetime for this type, which must not be dependent. Qualifiers::ObjCLifetime getObjCARCImplicitLifetime() const; diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td index 5a90b2be2cbf..96bfdd313f47 100644 --- a/clang/include/clang/Basic/Attr.td +++ b/clang/include/clang/Basic/Attr.td @@ -1064,16 +1064,20 @@ def CUDADeviceBuiltin : IgnoredAttr { let LangOpts = [CUDA]; } -def CUDADeviceBuiltinSurfaceType : IgnoredAttr { +def
[clang] d264f02 - Fix `-Wreturn-type` warning. NFC.
Author: Michael Liao Date: 2020-03-26T00:53:24-04:00 New Revision: d264f02c6f502960e2bcdd332f250efc702d09f2 URL: https://github.com/llvm/llvm-project/commit/d264f02c6f502960e2bcdd332f250efc702d09f2 DIFF: https://github.com/llvm/llvm-project/commit/d264f02c6f502960e2bcdd332f250efc702d09f2.diff LOG: Fix `-Wreturn-type` warning. NFC. Added: Modified: clang/lib/Basic/Targets/X86.cpp Removed: diff --git a/clang/lib/Basic/Targets/X86.cpp b/clang/lib/Basic/Targets/X86.cpp index f35b520de657..8a7d0f17760e 100644 --- a/clang/lib/Basic/Targets/X86.cpp +++ b/clang/lib/Basic/Targets/X86.cpp @@ -1842,6 +1842,7 @@ Optional X86TargetInfo::getCPUCacheLineSize() const { case CK_Generic: return None; } + return None; } bool X86TargetInfo::validateOutputSize(const llvm::StringMap , ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 4f4e687 - [test][clang][driver] Add required features.
Author: Michael Liao Date: 2020-03-24T17:08:21-04:00 New Revision: 4f4e68799fd55c7023e685161de6f6bb1ada16d5 URL: https://github.com/llvm/llvm-project/commit/4f4e68799fd55c7023e685161de6f6bb1ada16d5 DIFF: https://github.com/llvm/llvm-project/commit/4f4e68799fd55c7023e685161de6f6bb1ada16d5.diff LOG: [test][clang][driver] Add required features. - to avoid false alarms on builds without that features. Added: Modified: clang/test/Driver/save-temps.c Removed: diff --git a/clang/test/Driver/save-temps.c b/clang/test/Driver/save-temps.c index b0cfa4fd814a..a26ba9f4ec0d 100644 --- a/clang/test/Driver/save-temps.c +++ b/clang/test/Driver/save-temps.c @@ -1,3 +1,6 @@ +// REQUIRES: x86-registered-target +// REQUIRES: arm-registered-target + // RUN: %clang -target x86_64-apple-darwin -save-temps -arch x86_64 %s -### 2>&1 \ // RUN: | FileCheck %s // CHECK: "-o" "save-temps.i" ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] a4edea2 - Fix `-Wunused-variable` warning. NFC.
Author: Michael Liao Date: 2020-03-20T09:31:58-04:00 New Revision: a4edea29be2a77a8c8c237d75563a09a61791442 URL: https://github.com/llvm/llvm-project/commit/a4edea29be2a77a8c8c237d75563a09a61791442 DIFF: https://github.com/llvm/llvm-project/commit/a4edea29be2a77a8c8c237d75563a09a61791442.diff LOG: Fix `-Wunused-variable` warning. NFC. Added: Modified: clang/lib/Sema/SemaExpr.cpp Removed: diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp index eaded8e92d7c..137aae883aa6 100644 --- a/clang/lib/Sema/SemaExpr.cpp +++ b/clang/lib/Sema/SemaExpr.cpp @@ -15490,8 +15490,7 @@ static void RemoveNestedImmediateInvocation( /// nowhere in the expression being transformed therefore will not be rebuilt. /// Setting AllowSkippingFirstCXXConstructExpr to false will prevent from /// skipping the first CXXConstructExpr. - if (auto *OldExpr = - dyn_cast(It->getPointer()->IgnoreImplicit())) + if (isa(It->getPointer()->IgnoreImplicit())) Transformer.AllowSkippingFirstCXXConstructExpr = false; ExprResult Res = Transformer.TransformExpr(It->getPointer()->getSubExpr()); ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 4cf01ed - [hip] Revise `GlobalDecl` constructors. NFC.
Author: Michael Liao Date: 2020-03-18T09:33:39-04:00 New Revision: 4cf01ed75e35e7bd3ef8ef1a2192c7f4656ab545 URL: https://github.com/llvm/llvm-project/commit/4cf01ed75e35e7bd3ef8ef1a2192c7f4656ab545 DIFF: https://github.com/llvm/llvm-project/commit/4cf01ed75e35e7bd3ef8ef1a2192c7f4656ab545.diff LOG: [hip] Revise `GlobalDecl` constructors. NFC. Summary: - https://reviews.llvm.org/D68578 revises the `GlobalDecl` constructors to ensure all GPU kernels have `ReferenceKenelKind` initialized properly with an explicit constructor and static one. But, there are lots of places using the implicit constructor triggering the assertion on non-GPU kernels. That's found in compilation of many tests and workloads. - Fixing all of them may change more code and, more importantly, all of them assumes the default kernel reference kind. This patch changes that constructor to tell `CUDAGlobalAttr` and construct `GlobalDecl` properly. Reviewers: yaxunl Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76344 Added: Modified: clang/include/clang/AST/GlobalDecl.h clang/lib/AST/Expr.cpp clang/lib/AST/ItaniumMangle.cpp clang/lib/AST/Mangle.cpp clang/lib/CodeGen/CGDebugInfo.cpp clang/lib/CodeGen/CGDecl.cpp clang/lib/CodeGen/CGExpr.cpp clang/lib/CodeGen/CodeGenModule.cpp clang/lib/CodeGen/CodeGenModule.h Removed: diff --git a/clang/include/clang/AST/GlobalDecl.h b/clang/include/clang/AST/GlobalDecl.h index d2b5566a4cfa..bf30d9b94235 100644 --- a/clang/include/clang/AST/GlobalDecl.h +++ b/clang/include/clang/AST/GlobalDecl.h @@ -68,7 +68,15 @@ class GlobalDecl { GlobalDecl(const VarDecl *D) { Init(D);} GlobalDecl(const FunctionDecl *D, unsigned MVIndex = 0) : MultiVersionIndex(MVIndex) { -Init(D); +if (!D->hasAttr()) { + Init(D); + return; +} +Value.setPointerAndInt(D, unsigned(getDefaultKernelReference(D))); + } + GlobalDecl(const FunctionDecl *D, KernelReferenceKind Kind) + : Value(D, unsigned(Kind)) { +assert(D->hasAttr() && "Decl is not a GPU kernel!"); } GlobalDecl(const NamedDecl *D) { Init(D); } GlobalDecl(const BlockDecl *D) { Init(D); } @@ -80,10 +88,6 @@ class GlobalDecl { GlobalDecl(const CXXDestructorDecl *D, CXXDtorType Type) : Value(D, Type) {} GlobalDecl(const VarDecl *D, DynamicInitKind StubKind) : Value(D, unsigned(StubKind)) {} - GlobalDecl(const FunctionDecl *D, KernelReferenceKind Kind) - : Value(D, unsigned(Kind)) { -assert(D->hasAttr() && "Decl is not a GPU kernel!"); - } GlobalDecl getCanonicalDecl() const { GlobalDecl CanonGD; @@ -145,10 +149,10 @@ class GlobalDecl { return GD; } - static GlobalDecl getDefaultKernelReference(const FunctionDecl *D) { -return GlobalDecl(D, D->getASTContext().getLangOpts().CUDAIsDevice - ? KernelReferenceKind::Kernel - : KernelReferenceKind::Stub); + static KernelReferenceKind getDefaultKernelReference(const FunctionDecl *D) { +return D->getASTContext().getLangOpts().CUDAIsDevice + ? KernelReferenceKind::Kernel + : KernelReferenceKind::Stub; } GlobalDecl getWithDecl(const Decl *D) { diff --git a/clang/lib/AST/Expr.cpp b/clang/lib/AST/Expr.cpp index 1eb56c30283c..6591b0481d4b 100644 --- a/clang/lib/AST/Expr.cpp +++ b/clang/lib/AST/Expr.cpp @@ -567,7 +567,7 @@ std::string PredefinedExpr::ComputeName(IdentKind IK, const Decl *CurrentDecl) { else if (const CXXDestructorDecl *DD = dyn_cast(ND)) GD = GlobalDecl(DD, Dtor_Base); else if (ND->hasAttr()) - GD = GlobalDecl::getDefaultKernelReference(cast(ND)); + GD = GlobalDecl(cast(ND)); else GD = GlobalDecl(ND); MC->mangleName(GD, Out); diff --git a/clang/lib/AST/ItaniumMangle.cpp b/clang/lib/AST/ItaniumMangle.cpp index 6deab2a95494..12e1fc589fdc 100644 --- a/clang/lib/AST/ItaniumMangle.cpp +++ b/clang/lib/AST/ItaniumMangle.cpp @@ -1575,14 +1575,8 @@ static GlobalDecl getParentOfLocalEntity(const DeclContext *DC) { GD = GlobalDecl(CD, Ctor_Complete); else if (auto *DD = dyn_cast(DC)) GD = GlobalDecl(DD, Dtor_Complete); - else { -auto *FD = cast(DC); -// Local variables can only exist in real kernels. -if (FD->hasAttr()) - GD = GlobalDecl(FD, KernelReferenceKind::Kernel); -else - GD = GlobalDecl(FD); - } + else +GD = GlobalDecl(cast(DC)); return GD; } diff --git a/clang/lib/AST/Mangle.cpp b/clang/lib/AST/Mangle.cpp index cc46994c1003..30078fcb243d 100644 --- a/clang/lib/AST/Mangle.cpp +++ b/clang/lib/AST/Mangle.cpp @@ -444,7 +444,7 @@ class ASTNameGenerator::Implementation { else if (const auto *DtorD = dyn_cast(D)) GD = GlobalDecl(DtorD, Dtor_Complete); else if (D->hasAttr()) -GD =
[clang] a2920c4 - [codegen] Fix one more case where `getGlobalDecl` should be used. NFC.
Author: Michael Liao Date: 2020-03-17T17:56:47-04:00 New Revision: a2920c4ea9971cc38cbca3d6e10ccb10ab83a462 URL: https://github.com/llvm/llvm-project/commit/a2920c4ea9971cc38cbca3d6e10ccb10ab83a462 DIFF: https://github.com/llvm/llvm-project/commit/a2920c4ea9971cc38cbca3d6e10ccb10ab83a462.diff LOG: [codegen] Fix one more case where `getGlobalDecl` should be used. NFC. - After https://reviews.llvm.org/D68578, the implicit conversion from `FunctionDecl` to `GlobalDecl` needs replacing with `getGlobalDecl`; otherwise, assertion is triggered. Added: Modified: clang/lib/CodeGen/CGDebugInfo.cpp Removed: diff --git a/clang/lib/CodeGen/CGDebugInfo.cpp b/clang/lib/CodeGen/CGDebugInfo.cpp index 94dab4c85614..a1ecddd3da7f 100644 --- a/clang/lib/CodeGen/CGDebugInfo.cpp +++ b/clang/lib/CodeGen/CGDebugInfo.cpp @@ -3833,7 +3833,8 @@ void CGDebugInfo::EmitFuncDeclForCallSite(llvm::CallBase *CallOrInvoke, // create the one describing the function in order to have complete // call site debug info. if (!CalleeDecl->isStatic() && !CalleeDecl->isInlined()) -EmitFunctionDecl(CalleeDecl, CalleeDecl->getLocation(), CalleeType, Func); +EmitFunctionDecl(CGM.getGlobalDecl(CalleeDecl), CalleeDecl->getLocation(), + CalleeType, Func); } void CGDebugInfo::EmitInlineFunctionStart(CGBuilderTy , GlobalDecl GD) { ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 073dbaa - Fix GCC warnings. NFC.
Author: Michael Liao Date: 2020-03-08T13:00:36-04:00 New Revision: 073dbaae39724ea860b5957fe47ecc1c2a84b197 URL: https://github.com/llvm/llvm-project/commit/073dbaae39724ea860b5957fe47ecc1c2a84b197 DIFF: https://github.com/llvm/llvm-project/commit/073dbaae39724ea860b5957fe47ecc1c2a84b197.diff LOG: Fix GCC warnings. NFC. Added: Modified: clang/lib/AST/ItaniumMangle.cpp clang/lib/Index/USRGeneration.cpp Removed: diff --git a/clang/lib/AST/ItaniumMangle.cpp b/clang/lib/AST/ItaniumMangle.cpp index 5cc66a0a5778..63e34653637e 100644 --- a/clang/lib/AST/ItaniumMangle.cpp +++ b/clang/lib/AST/ItaniumMangle.cpp @@ -641,7 +641,7 @@ void CXXNameMangler::mangle(GlobalDecl GD) { //::= //::= Out << "_Z"; - if (const FunctionDecl *FD = dyn_cast(GD.getDecl())) + if (isa(GD.getDecl())) mangleFunctionEncoding(GD); else if (const VarDecl *VD = dyn_cast(GD.getDecl())) mangleName(VD); diff --git a/clang/lib/Index/USRGeneration.cpp b/clang/lib/Index/USRGeneration.cpp index f3eb653f10fa..7972d0a047c2 100644 --- a/clang/lib/Index/USRGeneration.cpp +++ b/clang/lib/Index/USRGeneration.cpp @@ -388,7 +388,7 @@ static const ObjCCategoryDecl *getCategoryContext(const NamedDecl *D) { if (auto *ICD = dyn_cast(D->getDeclContext())) return ICD->getCategoryDecl(); return nullptr; -}; +} void USRGenerator::VisitObjCMethodDecl(const ObjCMethodDecl *D) { const DeclContext *container = D->getDeclContext(); ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] f6a3ac1 - Fix `-Wunused-variable` warning. NFC.
Author: Michael Liao Date: 2020-02-12T12:45:14-05:00 New Revision: f6a3ac150b8d9f3458f526cf76ebcd545bfc1898 URL: https://github.com/llvm/llvm-project/commit/f6a3ac150b8d9f3458f526cf76ebcd545bfc1898 DIFF: https://github.com/llvm/llvm-project/commit/f6a3ac150b8d9f3458f526cf76ebcd545bfc1898.diff LOG: Fix `-Wunused-variable` warning. NFC. Added: Modified: clang/lib/CodeGen/TargetInfo.cpp Removed: diff --git a/clang/lib/CodeGen/TargetInfo.cpp b/clang/lib/CodeGen/TargetInfo.cpp index 905440febb51..e40f24d0ca4d 100644 --- a/clang/lib/CodeGen/TargetInfo.cpp +++ b/clang/lib/CodeGen/TargetInfo.cpp @@ -7578,7 +7578,7 @@ ABIArgInfo HexagonABIInfo::classifyReturnType(QualType RetTy) const { const TargetInfo = CGT.getTarget(); uint64_t Size = getContext().getTypeSize(RetTy); - if (const auto *VecTy = RetTy->getAs()) { + if (RetTy->getAs()) { // HVX vectors are returned in vector registers or register pairs. if (T.hasFeature("hvx")) { assert(T.hasFeature("hvx-length64b") || T.hasFeature("hvx-length128b")); ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] a067891 - [clang][codegen] Fix another lifetime emission on alloca on non-default address space.
Author: Michael Liao Date: 2020-02-10T00:15:56-05:00 New Revision: a06789138987d1f64bb2f97d3a5c0f39eaf94715 URL: https://github.com/llvm/llvm-project/commit/a06789138987d1f64bb2f97d3a5c0f39eaf94715 DIFF: https://github.com/llvm/llvm-project/commit/a06789138987d1f64bb2f97d3a5c0f39eaf94715.diff LOG: [clang][codegen] Fix another lifetime emission on alloca on non-default address space. - Lifetime intrinsics expect the pointer directly from alloca. Need extra handling for targets with alloca on non-default (or non-zero) address space. Added: clang/test/CodeGenCXX/amdgcn-call-with-aggarg.cpp Modified: clang/lib/CodeGen/CGCall.cpp clang/lib/CodeGen/CodeGenFunction.h Removed: diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp index 3edcfb21ef34..9ef2a3b3d099 100644 --- a/clang/lib/CodeGen/CGCall.cpp +++ b/clang/lib/CodeGen/CGCall.cpp @@ -3690,8 +3690,9 @@ void CodeGenFunction::EmitCallArg(CallArgList , const Expr *E, } AggValueSlot ArgSlot = AggValueSlot::ignored(); + Address ArgSlotAlloca = Address::invalid(); if (hasAggregateEvaluationKind(E->getType())) { -ArgSlot = CreateAggTemp(E->getType(), "agg.tmp"); +ArgSlot = CreateAggTemp(E->getType(), "agg.tmp", ); // Emit a lifetime start/end for this temporary. If the type has a // destructor, then we need to keep it alive. FIXME: We should still be able @@ -3699,8 +3700,9 @@ void CodeGenFunction::EmitCallArg(CallArgList , const Expr *E, if (!E->getType().isDestructedType()) { uint64_t size = CGM.getDataLayout().getTypeAllocSize(ConvertTypeForMem(E->getType())); - if (auto *lifetimeSize = EmitLifetimeStart(size, ArgSlot.getPointer())) -args.addLifetimeCleanup({ArgSlot.getPointer(), lifetimeSize}); + if (auto *lifetimeSize = + EmitLifetimeStart(size, ArgSlotAlloca.getPointer())) +args.addLifetimeCleanup({ArgSlotAlloca.getPointer(), lifetimeSize}); } } diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h index f48d8a4cc366..7ddd38c7b262 100644 --- a/clang/lib/CodeGen/CodeGenFunction.h +++ b/clang/lib/CodeGen/CodeGenFunction.h @@ -2264,8 +2264,9 @@ class CodeGenFunction : public CodeGenTypeCache { /// CreateAggTemp - Create a temporary memory object for the given /// aggregate type. - AggValueSlot CreateAggTemp(QualType T, const Twine = "tmp") { -return AggValueSlot::forAddr(CreateMemTemp(T, Name), + AggValueSlot CreateAggTemp(QualType T, const Twine = "tmp", + Address *Alloca = nullptr) { +return AggValueSlot::forAddr(CreateMemTemp(T, Name, Alloca), T.getQualifiers(), AggValueSlot::IsNotDestructed, AggValueSlot::DoesNotNeedGCBarriers, diff --git a/clang/test/CodeGenCXX/amdgcn-call-with-aggarg.cpp b/clang/test/CodeGenCXX/amdgcn-call-with-aggarg.cpp new file mode 100644 index ..e9d3683cfaa2 --- /dev/null +++ b/clang/test/CodeGenCXX/amdgcn-call-with-aggarg.cpp @@ -0,0 +1,19 @@ +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -emit-llvm -O3 -disable-llvm-passes -o - %s | FileCheck %s + +struct A { + float x, y, z, w; +}; + +void foo(A a); + +// CHECK-LABEL: @_Z4testv +// CHECK: %[[lvar:.*]] = alloca %struct.A, align 4, addrspace(5) +// CHECK: %[[atmp:.*]] = alloca %struct.A, align 4, addrspace(5) +// CHECK: %[[lcst:.*]] = bitcast %struct.A addrspace(5)* %[[lvar]] to i8 addrspace(5)* +// CHECK: call void @llvm.lifetime.start.p5i8(i64 16, i8 addrspace(5)* %[[lcst]] +// CHECK: %[[acst:.*]] = bitcast %struct.A addrspace(5)* %[[atmp]] to i8 addrspace(5)* +// CHECK: call void @llvm.lifetime.start.p5i8(i64 16, i8 addrspace(5)* %[[acst]] +void test() { + A a; + foo(a); +} ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 2926917 - [clang] Fix linkage of nested lambdas.
Author: Michael Liao Date: 2020-02-07T13:24:21-05:00 New Revision: 2926917f430d705f084813b63a40fafc61872524 URL: https://github.com/llvm/llvm-project/commit/2926917f430d705f084813b63a40fafc61872524 DIFF: https://github.com/llvm/llvm-project/commit/2926917f430d705f084813b63a40fafc61872524.diff LOG: [clang] Fix linkage of nested lambdas. patch from Philippe Daouadi This is an attempt to fix [PR#44368](https://bugs.llvm.org/show_bug.cgi?id=44368) This effectively reverts [D1783](https://reviews.llvm.org/D1783). It doesn't break the current tests and fixes the test that this commit adds. We now decide of a lambda linkage only depending on the visibility of its parent context. Differential Revision: https://reviews.llvm.org/D73701 Added: Modified: clang/lib/AST/Decl.cpp clang/test/CodeGenCXX/lambda-expressions-nested-linkage.cpp Removed: diff --git a/clang/lib/AST/Decl.cpp b/clang/lib/AST/Decl.cpp index 0d30f64b992e..216137bf74f9 100644 --- a/clang/lib/AST/Decl.cpp +++ b/clang/lib/AST/Decl.cpp @@ -1318,19 +1318,6 @@ LinkageInfo LinkageComputer::getLVForLocalDecl(const NamedDecl *D, LV.isVisibilityExplicit()); } -static inline const CXXRecordDecl* -getOutermostEnclosingLambda(const CXXRecordDecl *Record) { - const CXXRecordDecl *Ret = Record; - while (Record && Record->isLambda()) { -Ret = Record; -if (!Record->getParent()) break; -// Get the Containing Class of this Lambda Class -Record = dyn_cast_or_null( - Record->getParent()->getParent()); - } - return Ret; -} - LinkageInfo LinkageComputer::computeLVForDecl(const NamedDecl *D, LVComputationKind computation, bool IgnoreVarTypeLinkage) { @@ -1396,25 +1383,9 @@ LinkageInfo LinkageComputer::computeLVForDecl(const NamedDecl *D, return getInternalLinkageFor(D); } -// This lambda has its linkage/visibility determined: -// - either by the outermost lambda if that lambda has no mangling -//number. -// - or by the parent of the outer most lambda -// This prevents infinite recursion in settings such as nested lambdas -// used in NSDMI's, for e.g. -// struct L { -//int t{}; -//int t2 = ([](int a) { return [](int b) { return b; };})(t)(t); -// }; -const CXXRecordDecl *OuterMostLambda = -getOutermostEnclosingLambda(Record); -if (OuterMostLambda->hasKnownLambdaInternalLinkage() || -!OuterMostLambda->getLambdaManglingNumber()) - return getInternalLinkageFor(D); - return getLVForClosure( - OuterMostLambda->getDeclContext()->getRedeclContext(), - OuterMostLambda->getLambdaContextDecl(), computation); + Record->getDeclContext()->getRedeclContext(), + Record->getLambdaContextDecl(), computation); } break; diff --git a/clang/test/CodeGenCXX/lambda-expressions-nested-linkage.cpp b/clang/test/CodeGenCXX/lambda-expressions-nested-linkage.cpp index 9a449874cb85..6b45645c9e2d 100644 --- a/clang/test/CodeGenCXX/lambda-expressions-nested-linkage.cpp +++ b/clang/test/CodeGenCXX/lambda-expressions-nested-linkage.cpp @@ -1,4 +1,5 @@ // RUN: %clang_cc1 -triple x86_64-apple-darwin10.0.0 -fblocks -emit-llvm -o - %s -fexceptions -std=c++11 | FileCheck %s +// RUN: %clang_cc1 -triple x86_64-apple-darwin10.0.0 -fblocks -emit-llvm -o - %s -fexceptions -std=c++14 | FileCheck --check-prefixes=CHECK,CXX14 %s // CHECK-LABEL: define void @_ZN19non_inline_function3fooEv() // CHECK-LABEL: define internal void @"_ZZN19non_inline_function3fooEvENK3$_0clEi"(%class.anon @@ -51,3 +52,18 @@ inline int foo() { } int use = foo(); } + +#if __cplusplus >= 201402L +// CXX14-LABEL: define internal void @"_ZZZN32lambda_capture_in_generic_lambda3fooIiEEDavENKUlT_E_clIZNS_L1fEvE3$_1EEDaS1_ENKUlvE_clEv" +namespace lambda_capture_in_generic_lambda { +template auto foo() { + return [](auto func) { +[func] { func(); }(); + }; +} +static void f() { + foo()([] { }); +} +void f1() { f(); } +} +#endif ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 6f5a159 - [clang][driver] Clean up unnecessary reference to TC. NFC.
Author: Michael Liao Date: 2020-02-06T15:14:21-05:00 New Revision: 6f5a159eab8d3fecdbbc741a38c970c0149b3c96 URL: https://github.com/llvm/llvm-project/commit/6f5a159eab8d3fecdbbc741a38c970c0149b3c96 DIFF: https://github.com/llvm/llvm-project/commit/6f5a159eab8d3fecdbbc741a38c970c0149b3c96.diff LOG: [clang][driver] Clean up unnecessary reference to TC. NFC. Added: Modified: clang/lib/Driver/ToolChains/Clang.cpp Removed: diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 37adad152c56..65039ac64b5a 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -307,10 +307,9 @@ static void getWebAssemblyTargetFeatures(const ArgList , handleTargetFeaturesGroup(Args, Features, options::OPT_m_wasm_Features_Group); } -static void getTargetFeatures(const ToolChain , const llvm::Triple , +static void getTargetFeatures(const Driver , const llvm::Triple , const ArgList , ArgStringList , bool ForAS, bool IsAux = false) { - const Driver = TC.getDriver(); std::vector Features; switch (Triple.getArch()) { default: @@ -1594,7 +1593,7 @@ void Clang::RenderTargetOptions(const llvm::Triple , const ToolChain = getToolChain(); // Add the target features - getTargetFeatures(TC, EffectiveTriple, Args, CmdArgs, false); + getTargetFeatures(TC.getDriver(), EffectiveTriple, Args, CmdArgs, false); // Add target specific flags. switch (TC.getArch()) { @@ -4643,7 +4642,7 @@ void Clang::ConstructJob(Compilation , const JobAction , CmdArgs.push_back("-aux-target-cpu"); CmdArgs.push_back(Args.MakeArgString(HostCPU)); } -getTargetFeatures(TC, *TC.getAuxTriple(), HostArgs, CmdArgs, +getTargetFeatures(D, *TC.getAuxTriple(), HostArgs, CmdArgs, /*ForAS*/ false, /*IsAux*/ true); } @@ -6679,7 +6678,7 @@ void ClangAs::ConstructJob(Compilation , const JobAction , } // Add the target features - getTargetFeatures(getToolChain(), Triple, Args, CmdArgs, true); + getTargetFeatures(D, Triple, Args, CmdArgs, true); // Ignore explicit -force_cpusubtype_ALL option. (void)Args.hasArg(options::OPT_force__cpusubtype__ALL); ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 318d0ed - Fix warning on unused variables. NFC.
Author: Michael Liao Date: 2020-02-06T12:21:20-05:00 New Revision: 318d0ede572080f18d0106dbc354e11c88329a84 URL: https://github.com/llvm/llvm-project/commit/318d0ede572080f18d0106dbc354e11c88329a84 DIFF: https://github.com/llvm/llvm-project/commit/318d0ede572080f18d0106dbc354e11c88329a84.diff LOG: Fix warning on unused variables. NFC. Added: Modified: clang/lib/CodeGen/CGDebugInfo.cpp Removed: diff --git a/clang/lib/CodeGen/CGDebugInfo.cpp b/clang/lib/CodeGen/CGDebugInfo.cpp index 0e54e9419356..e171082942f6 100644 --- a/clang/lib/CodeGen/CGDebugInfo.cpp +++ b/clang/lib/CodeGen/CGDebugInfo.cpp @@ -3667,7 +3667,7 @@ void CGDebugInfo::EmitFunctionStart(GlobalDecl GD, SourceLocation Loc, } else { Name = Fn->getName(); -if (const auto *BD = dyn_cast(D)) +if (isa(D)) LinkageName = Name; Flags |= llvm::DINode::FlagPrototyped; ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 09a8812 - [clang][driver][ARM] Clean up ARM target & feature checking in clang driver.
Author: Michael Liao Date: 2020-02-06T08:57:52-05:00 New Revision: 09a88120c9269a9af0d80bc59afb2cb5806140ff URL: https://github.com/llvm/llvm-project/commit/09a88120c9269a9af0d80bc59afb2cb5806140ff DIFF: https://github.com/llvm/llvm-project/commit/09a88120c9269a9af0d80bc59afb2cb5806140ff.diff LOG: [clang][driver][ARM] Clean up ARM target & feature checking in clang driver. Summary: - Similar to other targets, instead of passing a toolchain, a driver argument should be passed into `arm::getARMTargetFeatures`. Aslo, that routine should honor the specified triple. Refactor `arm::getARMFloatABI` with 2 separate interfaces. One has the original parameters and the other uses the driver and the specified triple. - That fixes an issue when target & features are queried during the offload compilation, where the specified triple should be checked instead of a effective triple. A previously failed test is re-enabled. Subscribers: kristof.beyls, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D74020 Added: Modified: clang/lib/Driver/ToolChains/Arch/ARM.cpp clang/lib/Driver/ToolChains/Arch/ARM.h clang/lib/Driver/ToolChains/Clang.cpp clang/test/Driver/cuda-simple.cu Removed: diff --git a/clang/lib/Driver/ToolChains/Arch/ARM.cpp b/clang/lib/Driver/ToolChains/Arch/ARM.cpp index ce3990038a4b..18bd1317fbc2 100644 --- a/clang/lib/Driver/ToolChains/Arch/ARM.cpp +++ b/clang/lib/Driver/ToolChains/Arch/ARM.cpp @@ -137,9 +137,8 @@ bool arm::useAAPCSForMachO(const llvm::Triple ) { } // Select mode for reading thread pointer (-mtp=soft/cp15). -arm::ReadTPMode arm::getReadTPMode(const ToolChain , const ArgList ) { +arm::ReadTPMode arm::getReadTPMode(const Driver , const ArgList ) { if (Arg *A = Args.getLastArg(options::OPT_mtp_mode_EQ)) { -const Driver = TC.getDriver(); arm::ReadTPMode ThreadPointer = llvm::StringSwitch(A->getValue()) .Case("cp15", ReadTPMode::Cp15) @@ -156,11 +155,14 @@ arm::ReadTPMode arm::getReadTPMode(const ToolChain , const ArgList ) { return ReadTPMode::Soft; } +arm::FloatABI arm::getARMFloatABI(const ToolChain , const ArgList ) { + return arm::getARMFloatABI(TC.getDriver(), TC.getEffectiveTriple(), Args); +} + // Select the float ABI as determined by -msoft-float, -mhard-float, and // -mfloat-abi=. -arm::FloatABI arm::getARMFloatABI(const ToolChain , const ArgList ) { - const Driver = TC.getDriver(); - const llvm::Triple = TC.getEffectiveTriple(); +arm::FloatABI arm::getARMFloatABI(const Driver , const llvm::Triple , + const ArgList ) { auto SubArch = getARMSubArchVersionNumber(Triple); arm::FloatABI ABI = FloatABI::Invalid; if (Arg *A = @@ -276,18 +278,13 @@ arm::FloatABI arm::getARMFloatABI(const ToolChain , const ArgList ) { return ABI; } -void arm::getARMTargetFeatures(const ToolChain , - const llvm::Triple , - const ArgList , - ArgStringList , - std::vector , - bool ForAS) { - const Driver = TC.getDriver(); - +void arm::getARMTargetFeatures(const Driver , const llvm::Triple , + const ArgList , ArgStringList , + std::vector , bool ForAS) { bool KernelOrKext = Args.hasArg(options::OPT_mkernel, options::OPT_fapple_kext); - arm::FloatABI ABI = arm::getARMFloatABI(TC, Args); - arm::ReadTPMode ThreadPointer = arm::getReadTPMode(TC, Args); + arm::FloatABI ABI = arm::getARMFloatABI(D, Triple, Args); + arm::ReadTPMode ThreadPointer = arm::getReadTPMode(D, Args); const Arg *WaCPU = nullptr, *WaFPU = nullptr; const Arg *WaHDiv = nullptr, *WaArch = nullptr; diff --git a/clang/lib/Driver/ToolChains/Arch/ARM.h b/clang/lib/Driver/ToolChains/Arch/ARM.h index 5640f8371262..0ba1a59852aa 100644 --- a/clang/lib/Driver/ToolChains/Arch/ARM.h +++ b/clang/lib/Driver/ToolChains/Arch/ARM.h @@ -48,13 +48,15 @@ enum class FloatABI { }; FloatABI getARMFloatABI(const ToolChain , const llvm::opt::ArgList ); -ReadTPMode getReadTPMode(const ToolChain , const llvm::opt::ArgList ); +FloatABI getARMFloatABI(const Driver , const llvm::Triple , +const llvm::opt::ArgList ); +ReadTPMode getReadTPMode(const Driver , const llvm::opt::ArgList ); bool useAAPCSForMachO(const llvm::Triple ); void getARMArchCPUFromArgs(const llvm::opt::ArgList , llvm::StringRef , llvm::StringRef , bool FromAs = false); -void getARMTargetFeatures(const ToolChain , const llvm::Triple , +void getARMTargetFeatures(const Driver , const llvm::Triple , const llvm::opt::ArgList , llvm::opt::ArgStringList ,
[clang] b642e03 - [cuda][hip] Temporarily XFAIL on arm
Author: Michael Liao Date: 2020-02-04T20:25:12-05:00 New Revision: b642e0348512a83505900ae00844f5f60ebeac45 URL: https://github.com/llvm/llvm-project/commit/b642e0348512a83505900ae00844f5f60ebeac45 DIFF: https://github.com/llvm/llvm-project/commit/b642e0348512a83505900ae00844f5f60ebeac45.diff LOG: [cuda][hip] Temporarily XFAIL on arm Added: Modified: clang/test/Driver/cuda-simple.cu Removed: diff --git a/clang/test/Driver/cuda-simple.cu b/clang/test/Driver/cuda-simple.cu index 54e18403108b..d2daf88a68de 100644 --- a/clang/test/Driver/cuda-simple.cu +++ b/clang/test/Driver/cuda-simple.cu @@ -1,6 +1,7 @@ // Verify that we can parse a simple CUDA file with or without -save-temps // http://llvm.org/PR22936 // RUN: %clang -nocudainc -nocudalib -Werror -fsyntax-only -c %s +// XFAIL: arm // // Verify that we pass -x cuda-cpp-output to compiler after // preprocessing a CUDA file ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] ccac6b2 - [hip] Properly populate macros based on host processor.
Author: Michael Liao Date: 2020-02-04T15:36:14-05:00 New Revision: ccac6b2bf877337a883c3763e41a529d8f9cc1ff URL: https://github.com/llvm/llvm-project/commit/ccac6b2bf877337a883c3763e41a529d8f9cc1ff DIFF: https://github.com/llvm/llvm-project/commit/ccac6b2bf877337a883c3763e41a529d8f9cc1ff.diff LOG: [hip] Properly populate macros based on host processor. Summary: - The device compilation needs to have a consistent source code compared to the corresponding host compilation. If macros based on the host-specific target processor is not properly populated, the device compilation may fail due to the inconsistent source after the preprocessor. So far, only the host triple is used to build the macros. If a detailed host CPU target or certain features are specified, macros derived from them won't be populated properly, e.g. `__SSE3__` won't be added unless `+sse3` feature is present. On Windows compilation compatible with MSVC, that missing macros result in that intrinsics are not included and cause device compilation failure on the host-side source. - This patch addresses this issue by introducing two `cc1` options, i.e., `-aux-target-cpu` and `-aux-target-feature`. If a specific host CPU target or certain features are specified, the compiler driver will append them during the construction of the offline compilation actions. Then, the toolchain in `cc1` phase will populate macros accordingly. - An internal option `--gpu-use-aux-triple-only` is added to fall back the original behavior to help diagnosing potential issues from the new behavior. Reviewers: tra, yaxunl Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D73942 Added: clang/test/Driver/hip-host-cpu-features.hip clang/test/Preprocessor/hip-host-cpu-macros.cu Modified: clang/include/clang/Driver/CC1Options.td clang/include/clang/Driver/Options.td clang/include/clang/Frontend/FrontendOptions.h clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Frontend/CompilerInstance.cpp clang/lib/Frontend/CompilerInvocation.cpp Removed: diff --git a/clang/include/clang/Driver/CC1Options.td b/clang/include/clang/Driver/CC1Options.td index f535d86d9b5e..0d0b05f8961c 100644 --- a/clang/include/clang/Driver/CC1Options.td +++ b/clang/include/clang/Driver/CC1Options.td @@ -482,6 +482,10 @@ def cc1as : Flag<["-"], "cc1as">; def ast_merge : Separate<["-"], "ast-merge">, MetaVarName<"">, HelpText<"Merge the given AST file into the translation unit being compiled.">; +def aux_target_cpu : Separate<["-"], "aux-target-cpu">, + HelpText<"Target a specific auxiliary cpu type">; +def aux_target_feature : Separate<["-"], "aux-target-feature">, + HelpText<"Target specific auxiliary attributes">; def aux_triple : Separate<["-"], "aux-triple">, HelpText<"Auxiliary target triple.">; def code_completion_at : Separate<["-"], "code-completion-at">, diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 388ff094ae44..2c925d018da7 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -549,6 +549,9 @@ def c : Flag<["-"], "c">, Flags<[DriverOption]>, Group, def fconvergent_functions : Flag<["-"], "fconvergent-functions">, Group, Flags<[CC1Option]>, HelpText<"Assume functions may be convergent">; +def gpu_use_aux_triple_only : Flag<["--"], "gpu-use-aux-triple-only">, + InternalDriverOpt, HelpText<"Prepare '-aux-triple' only without populating " + "'-aux-target-cpu' and '-aux-target-feature'.">; def cuda_device_only : Flag<["--"], "cuda-device-only">, HelpText<"Compile CUDA code for device only">; def cuda_host_only : Flag<["--"], "cuda-host-only">, diff --git a/clang/include/clang/Frontend/FrontendOptions.h b/clang/include/clang/Frontend/FrontendOptions.h index 09969b596d63..2adc6319810c 100644 --- a/clang/include/clang/Frontend/FrontendOptions.h +++ b/clang/include/clang/Frontend/FrontendOptions.h @@ -426,9 +426,15 @@ class FrontendOptions { /// (in the format produced by -fdump-record-layouts). std::string OverrideRecordLayoutsFile; - /// Auxiliary triple for CUDA compilation. + /// Auxiliary triple for CUDA/HIP compilation. std::string AuxTriple; + /// Auxiliary target CPU for CUDA/HIP compilation. + Optional AuxTargetCPU; + + /// Auxiliary target features for CUDA/HIP compilation. + Optional> AuxTargetFeatures; + /// Filename to write statistics to. std::string StatsFile; diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 6f092ca274c0..ccdfbe8c604f 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -309,7 +309,7 @@ static void getWebAssemblyTargetFeatures(const ArgList , static void getTargetFeatures(const ToolChain ,
[clang] 268e57b - [clang][driver] Remove an unused parameter. NFC.
Author: Michael Liao Date: 2020-02-01T16:18:05-05:00 New Revision: 268e57bd35d7e05928820ad90f325e19e7a809d0 URL: https://github.com/llvm/llvm-project/commit/268e57bd35d7e05928820ad90f325e19e7a809d0 DIFF: https://github.com/llvm/llvm-project/commit/268e57bd35d7e05928820ad90f325e19e7a809d0.diff LOG: [clang][driver] Remove an unused parameter. NFC. - Group relevant code together. Added: Modified: clang/lib/Driver/ToolChains/Clang.cpp Removed: diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 510dc19f7a90..c8195f0f4ccf 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -3607,8 +3607,7 @@ static DwarfFissionKind getDebugFissionKind(const Driver , static void RenderDebugOptions(const ToolChain , const Driver , const llvm::Triple , const ArgList , - bool EmitCodeView, bool IsWindowsMSVC, - ArgStringList , + bool EmitCodeView, ArgStringList , codegenoptions::DebugInfoKind , DwarfFissionKind ) { if (Args.hasFlag(options::OPT_fdebug_info_for_profiling, @@ -4651,8 +4650,8 @@ void Clang::ConstructJob(Compilation , const JobAction , AddClangCLArgs(Args, InputType, CmdArgs, , ); DwarfFissionKind DwarfFission; - RenderDebugOptions(TC, D, RawTriple, Args, EmitCodeView, IsWindowsMSVC, - CmdArgs, DebugInfoKind, DwarfFission); + RenderDebugOptions(TC, D, RawTriple, Args, EmitCodeView, CmdArgs, + DebugInfoKind, DwarfFission); // Add the split debug info name to the command lines here so we // can propagate it to the backend. @@ -5352,16 +5351,16 @@ void Clang::ConstructJob(Compilation , const JobAction , RawTriple.isOSDarwin() && !KernelOrKext)) CmdArgs.push_back("-fregister-global-dtors-with-atexit"); - // -fms-extensions=0 is default. - if (Args.hasFlag(options::OPT_fms_extensions, options::OPT_fno_ms_extensions, - IsWindowsMSVC)) -CmdArgs.push_back("-fms-extensions"); - // -fno-use-line-directives is default. if (Args.hasFlag(options::OPT_fuse_line_directives, options::OPT_fno_use_line_directives, false)) CmdArgs.push_back("-fuse-line-directives"); + // -fms-extensions=0 is default. + if (Args.hasFlag(options::OPT_fms_extensions, options::OPT_fno_ms_extensions, + IsWindowsMSVC)) +CmdArgs.push_back("-fms-extensions"); + // -fms-compatibility=0 is default. bool IsMSVCCompat = Args.hasFlag( options::OPT_fms_compatibility, options::OPT_fno_ms_compatibility, ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 49f7bc9 - [hip] Remove `-Werror=format-nonliteral`
Author: Michael Liao Date: 2020-01-23T11:02:11-05:00 New Revision: 49f7bc9e1e50eb8f6e065f97585b3bf0bcc23d5c URL: https://github.com/llvm/llvm-project/commit/49f7bc9e1e50eb8f6e065f97585b3bf0bcc23d5c DIFF: https://github.com/llvm/llvm-project/commit/49f7bc9e1e50eb8f6e065f97585b3bf0bcc23d5c.diff LOG: [hip] Remove `-Werror=format-nonliteral` Summary: - It won't distinguish host and device code and trigger compilation failure on irrelevant code. Reviewers: sameerds, yaxunl Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D73224 Added: Modified: clang/lib/Driver/ToolChains/HIP.cpp clang/test/Driver/hip-printf.hip Removed: diff --git a/clang/lib/Driver/ToolChains/HIP.cpp b/clang/lib/Driver/ToolChains/HIP.cpp index 4772a6fb6b10..7039ddeabd57 100644 --- a/clang/lib/Driver/ToolChains/HIP.cpp +++ b/clang/lib/Driver/ToolChains/HIP.cpp @@ -425,7 +425,6 @@ Tool *HIPToolChain::buildLinker() const { void HIPToolChain::addClangWarningOptions(ArgStringList ) const { HostTC.addClangWarningOptions(CC1Args); - CC1Args.push_back("-Werror=format-nonliteral"); } ToolChain::CXXStdlibType diff --git a/clang/test/Driver/hip-printf.hip b/clang/test/Driver/hip-printf.hip index 2df344f8fb2e..ada6c651ddb7 100644 --- a/clang/test/Driver/hip-printf.hip +++ b/clang/test/Driver/hip-printf.hip @@ -6,4 +6,4 @@ // RUN: %s 2>&1 | FileCheck %s // CHECK: [[CLANG:".*clang.*"]] "-cc1" -// CHECK-SAME: "-Werror=format-nonliteral" +// CHECK-NOT: "-Werror=format-nonliteral" ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 70b53a3 - Fix gcc `-Wunused-variable` warning. NFC.
Author: Michael Liao Date: 2020-01-19T12:24:21-05:00 New Revision: 70b53a301888fe2be36996b41a7dd5aa7c256dc9 URL: https://github.com/llvm/llvm-project/commit/70b53a301888fe2be36996b41a7dd5aa7c256dc9 DIFF: https://github.com/llvm/llvm-project/commit/70b53a301888fe2be36996b41a7dd5aa7c256dc9.diff LOG: Fix gcc `-Wunused-variable` warning. NFC. Added: Modified: clang/lib/Serialization/ASTReaderStmt.cpp Removed: diff --git a/clang/lib/Serialization/ASTReaderStmt.cpp b/clang/lib/Serialization/ASTReaderStmt.cpp index 5dd0ef9d43c3..99dc1e9172c4 100644 --- a/clang/lib/Serialization/ASTReaderStmt.cpp +++ b/clang/lib/Serialization/ASTReaderStmt.cpp @@ -732,7 +732,7 @@ readConstraintSatisfaction(ASTRecordReader ) { unsigned NumDetailRecords = Record.readInt(); for (unsigned i = 0; i != NumDetailRecords; ++i) { Expr *ConstraintExpr = Record.readExpr(); - if (bool IsDiagnostic = Record.readInt()) { + if (/* IsDiagnostic */Record.readInt()) { SourceLocation DiagLocation = Record.readSourceLocation(); std::string DiagMessage = Record.readString(); Satisfaction.Details.emplace_back( @@ -822,7 +822,7 @@ void ASTStmtReader::VisitRequiresExpr(RequiresExpr *E) { Req.emplace(); } else { NoexceptLoc = Record.readSourceLocation(); - switch (auto returnTypeRequirementKind = Record.readInt()) { + switch (/* returnTypeRequirementKind */Record.readInt()) { case 0: // No return type requirement. Req.emplace(); @@ -853,7 +853,7 @@ void ASTStmtReader::VisitRequiresExpr(RequiresExpr *E) { std::move(*Req)); } break; case concepts::Requirement::RK_Nested: { -if (bool IsSubstitutionDiagnostic = Record.readInt()) { +if (/* IsSubstitutionDiagnostic */Record.readInt()) { R = new (Record.getContext()) concepts::NestedRequirement( readSubstitutionDiagnostic(Record)); break; ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang-tools-extra] 40514a7 - [clangd] Add workaround for GCC5 host compilers. NFC.
Author: Michael Liao Date: 2020-01-16T16:05:22-05:00 New Revision: 40514a7d7a3b745ba43c2d014e54a0d78d65d957 URL: https://github.com/llvm/llvm-project/commit/40514a7d7a3b745ba43c2d014e54a0d78d65d957 DIFF: https://github.com/llvm/llvm-project/commit/40514a7d7a3b745ba43c2d014e54a0d78d65d957.diff LOG: [clangd] Add workaround for GCC5 host compilers. NFC. Added: Modified: clang-tools-extra/clangd/Hover.cpp Removed: diff --git a/clang-tools-extra/clangd/Hover.cpp b/clang-tools-extra/clangd/Hover.cpp index cfa5e3bf93fb..ad715db4d5eb 100644 --- a/clang-tools-extra/clangd/Hover.cpp +++ b/clang-tools-extra/clangd/Hover.cpp @@ -439,7 +439,13 @@ bool isLiteral(const Expr *E) { llvm::StringLiteral getNameForExpr(const Expr *E) { // FIXME: Come up with names for `special` expressions. - return "expression"; + // + // It's an known issue for GCC5, https://godbolt.org/z/Z_tbgi. Work around + // that by using explicit conversion constructor. + // + // TODO: Once GCC5 is fully retired and not the minimal requirement as stated + // in `GettingStarted`, please remove the explicit conversion constructor. + return llvm::StringLiteral("expression"); } // Generates hover info for evaluatable expressions. ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] a3490e3 - Remove trailing `;`. NFC.
Author: Michael Liao Date: 2020-01-14T16:52:20-05:00 New Revision: a3490e3e3d38d502179329f76138d96c5b2bab88 URL: https://github.com/llvm/llvm-project/commit/a3490e3e3d38d502179329f76138d96c5b2bab88 DIFF: https://github.com/llvm/llvm-project/commit/a3490e3e3d38d502179329f76138d96c5b2bab88.diff LOG: Remove trailing `;`. NFC. Added: Modified: clang/lib/Tooling/Syntax/Tree.cpp Removed: diff --git a/clang/lib/Tooling/Syntax/Tree.cpp b/clang/lib/Tooling/Syntax/Tree.cpp index 9f028c0be3b4..9a6270ec4cce 100644 --- a/clang/lib/Tooling/Syntax/Tree.cpp +++ b/clang/lib/Tooling/Syntax/Tree.cpp @@ -29,7 +29,7 @@ static void traverse(syntax::Node *N, traverse(static_cast(N), [&](const syntax::Node *N) { Visit(const_cast(N)); }); -}; +} } // namespace syntax::Arena::Arena(SourceManager , const LangOptions , ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 7cee288 - Fix `-Wunused-variable` warning. NFC.
Author: Michael Liao Date: 2019-12-21T11:10:35-05:00 New Revision: 7cee28858674d233903e92b7a0c49b07b05ed3d3 URL: https://github.com/llvm/llvm-project/commit/7cee28858674d233903e92b7a0c49b07b05ed3d3 DIFF: https://github.com/llvm/llvm-project/commit/7cee28858674d233903e92b7a0c49b07b05ed3d3.diff LOG: Fix `-Wunused-variable` warning. NFC. Added: Modified: clang/lib/Sema/SemaDeclObjC.cpp Removed: diff --git a/clang/lib/Sema/SemaDeclObjC.cpp b/clang/lib/Sema/SemaDeclObjC.cpp index 4fdddfbb7a7e..5fdf6aeed5b4 100644 --- a/clang/lib/Sema/SemaDeclObjC.cpp +++ b/clang/lib/Sema/SemaDeclObjC.cpp @@ -4776,7 +4776,7 @@ Decl *Sema::ActOnMethodDeclaration( if (auto *Cat = dyn_cast(IMD->getDeclContext())) decl = Cat->IsClassExtension() ? 1 : 2; -if (auto *Cat = dyn_cast(ImpDecl)) +if (isa(ImpDecl)) impl = 1 + (decl != 0); Diag(ObjCMethod->getLocation(), ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 6626e5a - Fix compilation warning from GCC7. NFC.
Author: Michael Liao Date: 2019-12-09T10:11:27-05:00 New Revision: 6626e5a06a99b29b388f2dffde2c16f8eb5ded46 URL: https://github.com/llvm/llvm-project/commit/6626e5a06a99b29b388f2dffde2c16f8eb5ded46 DIFF: https://github.com/llvm/llvm-project/commit/6626e5a06a99b29b388f2dffde2c16f8eb5ded46.diff LOG: Fix compilation warning from GCC7. NFC. Added: Modified: clang/lib/Sema/SemaDeclCXX.cpp Removed: diff --git a/clang/lib/Sema/SemaDeclCXX.cpp b/clang/lib/Sema/SemaDeclCXX.cpp index c8b95983f03c..d0857a5de817 100644 --- a/clang/lib/Sema/SemaDeclCXX.cpp +++ b/clang/lib/Sema/SemaDeclCXX.cpp @@ -7103,6 +7103,7 @@ class DefaultedComparisonVisitor { ResultList Results; switch (DCK) { +default: case DefaultedComparisonKind::None: llvm_unreachable("not a defaulted comparison"); @@ -7592,6 +7593,7 @@ class DefaultedComparisonSynthesizer return StmtError(); switch (DCK) { +default: case DefaultedComparisonKind::None: llvm_unreachable("not a defaulted comparison"); ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] f2ace9d - Add `QualType::hasAddressSpace`. NFC.
Author: Michael Liao Date: 2019-12-06T13:08:55-05:00 New Revision: f2ace9d6005b4ffc6f6fc068c1aac897d871df7a URL: https://github.com/llvm/llvm-project/commit/f2ace9d6005b4ffc6f6fc068c1aac897d871df7a DIFF: https://github.com/llvm/llvm-project/commit/f2ace9d6005b4ffc6f6fc068c1aac897d871df7a.diff LOG: Add `QualType::hasAddressSpace`. NFC. - Add that as a shorthand of .getQualifiers().hasAddressSpace(). - Simplify related code. Added: Modified: clang/include/clang/AST/Type.h clang/lib/Sema/SemaDecl.cpp clang/lib/Sema/SemaExpr.cpp clang/lib/Sema/SemaInit.cpp clang/lib/Sema/SemaOverload.cpp clang/lib/Sema/SemaType.cpp clang/lib/StaticAnalyzer/Checkers/DereferenceChecker.cpp Removed: diff --git a/clang/include/clang/AST/Type.h b/clang/include/clang/AST/Type.h index 02c9aa403b5a..caf2a3dd79a3 100644 --- a/clang/include/clang/AST/Type.h +++ b/clang/include/clang/AST/Type.h @@ -1046,6 +1046,9 @@ class QualType { ID.AddPointer(getAsOpaquePtr()); } + /// Check if this type has any address space qualifier. + inline bool hasAddressSpace() const; + /// Return the address space of this type. inline LangAS getAddressSpace() const; @@ -6276,6 +6279,11 @@ inline void QualType::removeLocalCVRQualifiers(unsigned Mask) { removeLocalFastQualifiers(Mask); } +/// Check if this type has any address space qualifier. +inline bool QualType::hasAddressSpace() const { + return getQualifiers().hasAddressSpace(); +} + /// Return the address space of this type. inline LangAS QualType::getAddressSpace() const { return getQualifiers().getAddressSpace(); diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp index 660be458a698..0e38d6bfaf93 100644 --- a/clang/lib/Sema/SemaDecl.cpp +++ b/clang/lib/Sema/SemaDecl.cpp @@ -6118,7 +6118,7 @@ bool Sema::inferObjCARCLifetime(ValueDecl *decl) { } void Sema::deduceOpenCLAddressSpace(ValueDecl *Decl) { - if (Decl->getType().getQualifiers().hasAddressSpace()) + if (Decl->getType().hasAddressSpace()) return; if (VarDecl *Var = dyn_cast(Decl)) { QualType Type = Var->getType(); @@ -6132,7 +6132,7 @@ void Sema::deduceOpenCLAddressSpace(ValueDecl *Decl) { // type has no address space yet, deduce it now. if (auto DT = dyn_cast(Type)) { auto OrigTy = DT->getOriginalType(); - if (!OrigTy.getQualifiers().hasAddressSpace() && OrigTy->isArrayType()) { + if (!OrigTy.hasAddressSpace() && OrigTy->isArrayType()) { // Add the address space to the original array type and then propagate // that to the element type through `getAsArrayType`. OrigTy = Context.getAddrSpaceQualType(OrigTy, ImplAS); @@ -16094,7 +16094,7 @@ FieldDecl *Sema::CheckFieldDecl(DeclarationName Name, QualType T, } // TR 18037 does not allow fields to be declared with address space - if (T.getQualifiers().hasAddressSpace() || T->isDependentAddressSpaceType() || + if (T.hasAddressSpace() || T->isDependentAddressSpaceType() || T->getBaseElementTypeUnsafe()->isDependentAddressSpaceType()) { Diag(Loc, diag::err_field_with_address_space); Record->setInvalidDecl(); diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp index c53a4b789bed..e2c37f8f5238 100644 --- a/clang/lib/Sema/SemaExpr.cpp +++ b/clang/lib/Sema/SemaExpr.cpp @@ -5445,15 +5445,15 @@ static FunctionDecl *rewriteBuiltinFunctionDecl(Sema *Sema, ASTContext , Expr *Arg = ArgRes.get(); QualType ArgType = Arg->getType(); if (!ParamType->isPointerType() || -ParamType.getQualifiers().hasAddressSpace() || +ParamType.hasAddressSpace() || !ArgType->isPointerType() || -!ArgType->getPointeeType().getQualifiers().hasAddressSpace()) { +!ArgType->getPointeeType().hasAddressSpace()) { OverloadParams.push_back(ParamType); continue; } QualType PointeeType = ParamType->getPointeeType(); -if (PointeeType.getQualifiers().hasAddressSpace()) +if (PointeeType.hasAddressSpace()) continue; NeedsNewDecl = true; diff --git a/clang/lib/Sema/SemaInit.cpp b/clang/lib/Sema/SemaInit.cpp index 7421754d95ca..cc9d1a4f6256 100644 --- a/clang/lib/Sema/SemaInit.cpp +++ b/clang/lib/Sema/SemaInit.cpp @@ -7853,9 +7853,8 @@ ExprResult InitializationSequence::Perform(Sema , // OpenCL v2.0 s6.13.11.1. atomic variables can be initialized in global scope QualType ETy = Entity.getType(); - Qualifiers TyQualifiers = ETy.getQualifiers(); - bool HasGlobalAS = TyQualifiers.hasAddressSpace() && - TyQualifiers.getAddressSpace() == LangAS::opencl_global; + bool HasGlobalAS = ETy.hasAddressSpace() && + ETy.getAddressSpace() == LangAS::opencl_global; if (S.getLangOpts().OpenCLVersion >= 200 && ETy->isAtomicType() && !HasGlobalAS && diff --git a/clang/lib/Sema/SemaOverload.cpp
[clang] fa9dd41 - [opencl] Fix address space deduction on array variables.
Author: Michael Liao Date: 2019-12-04T09:37:50-05:00 New Revision: fa9dd410a9a9aa65ce6731cbe1ee12c5941eb3e8 URL: https://github.com/llvm/llvm-project/commit/fa9dd410a9a9aa65ce6731cbe1ee12c5941eb3e8 DIFF: https://github.com/llvm/llvm-project/commit/fa9dd410a9a9aa65ce6731cbe1ee12c5941eb3e8.diff LOG: [opencl] Fix address space deduction on array variables. Summary: - The deduced address space needs applying to its element type as well. Reviewers: Anastasia Subscribers: yaxunl, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D70981 Added: Modified: clang/lib/Sema/SemaDecl.cpp clang/test/SemaOpenCL/address-spaces.cl Removed: diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp index d35037273106..660be458a698 100644 --- a/clang/lib/Sema/SemaDecl.cpp +++ b/clang/lib/Sema/SemaDecl.cpp @@ -6128,7 +6128,26 @@ void Sema::deduceOpenCLAddressSpace(ValueDecl *Decl) { if ((getLangOpts().OpenCLCPlusPlus || getLangOpts().OpenCLVersion >= 200) && Var->hasGlobalStorage()) ImplAS = LangAS::opencl_global; +// If the original type from a decayed type is an array type and that array +// type has no address space yet, deduce it now. +if (auto DT = dyn_cast(Type)) { + auto OrigTy = DT->getOriginalType(); + if (!OrigTy.getQualifiers().hasAddressSpace() && OrigTy->isArrayType()) { +// Add the address space to the original array type and then propagate +// that to the element type through `getAsArrayType`. +OrigTy = Context.getAddrSpaceQualType(OrigTy, ImplAS); +OrigTy = QualType(Context.getAsArrayType(OrigTy), 0); +// Re-generate the decayed type. +Type = Context.getDecayedType(OrigTy); + } +} Type = Context.getAddrSpaceQualType(Type, ImplAS); +// Apply any qualifiers (including address space) from the array type to +// the element type. This implements C99 6.7.3p8: "If the specification of +// an array type includes any type qualifiers, the element type is so +// qualified, not the array type." +if (Type->isArrayType()) + Type = QualType(Context.getAsArrayType(Type), 0); Decl->setType(Type); } } diff --git a/clang/test/SemaOpenCL/address-spaces.cl b/clang/test/SemaOpenCL/address-spaces.cl index 55a55dc75050..09a6dd0ba53f 100644 --- a/clang/test/SemaOpenCL/address-spaces.cl +++ b/clang/test/SemaOpenCL/address-spaces.cl @@ -241,3 +241,10 @@ void func_multiple_addr(void) { __private private_int_t var5; // expected-warning {{multiple identical address spaces specified for type}} __private private_int_t *var6;// expected-warning {{multiple identical address spaces specified for type}} } + +void func_with_array_param(const unsigned data[16]); + +__kernel void k() { + unsigned data[16]; + func_with_array_param(data); +} ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 59312cb - Fix warning on unused variable. NFC.
Author: Michael Liao Date: 2019-12-03T21:16:10-05:00 New Revision: 59312cb0b81ca13f0674dde66b8e87a8d51d4dda URL: https://github.com/llvm/llvm-project/commit/59312cb0b81ca13f0674dde66b8e87a8d51d4dda DIFF: https://github.com/llvm/llvm-project/commit/59312cb0b81ca13f0674dde66b8e87a8d51d4dda.diff LOG: Fix warning on unused variable. NFC. Added: Modified: clang/lib/AST/Expr.cpp Removed: diff --git a/clang/lib/AST/Expr.cpp b/clang/lib/AST/Expr.cpp index a73531ad5fad..3bc2ea60aa14 100644 --- a/clang/lib/AST/Expr.cpp +++ b/clang/lib/AST/Expr.cpp @@ -1678,7 +1678,7 @@ MemberExpr *MemberExpr::Create( MemberExpr *E = new (Mem) MemberExpr(Base, IsArrow, OperatorLoc, MemberDecl, NameInfo, T, VK, OK, NOUR); - if (FieldDecl *Field = dyn_cast(MemberDecl)) { + if (isa(MemberDecl)) { DeclContext *DC = MemberDecl->getDeclContext(); // dyn_cast_or_null is used to handle objC variables which do not // have a declaration context. ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 59e69fe - Fix warning on extra ';'. NFC.
Author: Michael Liao Date: 2019-12-03T16:02:55-05:00 New Revision: 59e69fefab883984e81c77aef58ba587060e87f2 URL: https://github.com/llvm/llvm-project/commit/59e69fefab883984e81c77aef58ba587060e87f2 DIFF: https://github.com/llvm/llvm-project/commit/59e69fefab883984e81c77aef58ba587060e87f2.diff LOG: Fix warning on extra ';'. NFC. Added: Modified: clang/lib/Format/TokenAnnotator.cpp Removed: diff --git a/clang/lib/Format/TokenAnnotator.cpp b/clang/lib/Format/TokenAnnotator.cpp index 93cb36961ee5..d5d394e61926 100644 --- a/clang/lib/Format/TokenAnnotator.cpp +++ b/clang/lib/Format/TokenAnnotator.cpp @@ -2597,7 +2597,7 @@ bool TokenAnnotator::spaceRequiredBeforeParens(const FormatToken ) const { static bool isKeywordWithCondition(const FormatToken ) { return Tok.isOneOf(tok::kw_if, tok::kw_for, tok::kw_while, tok::kw_switch, tok::kw_constexpr); -}; +} bool TokenAnnotator::spaceRequiredBetween(const AnnotatedLine , const FormatToken , ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] c4afc65 - Fix compilation warning. NFC.
Author: Michael Liao Date: 2019-11-21T12:07:13-05:00 New Revision: c4afc6566a64e6be3f77271781a147bb5ff98b0c URL: https://github.com/llvm/llvm-project/commit/c4afc6566a64e6be3f77271781a147bb5ff98b0c DIFF: https://github.com/llvm/llvm-project/commit/c4afc6566a64e6be3f77271781a147bb5ff98b0c.diff LOG: Fix compilation warning. NFC. Added: Modified: clang/lib/Driver/Driver.cpp Removed: diff --git a/clang/lib/Driver/Driver.cpp b/clang/lib/Driver/Driver.cpp index 90f3cea5b2af..c1173e3ddbf0 100644 --- a/clang/lib/Driver/Driver.cpp +++ b/clang/lib/Driver/Driver.cpp @@ -3498,7 +3498,7 @@ void Driver::BuildActions(Compilation , DerivedArgList , Actions.push_back( C.MakeAction(MergerInputs, types::TY_Image)); - if (Arg *A = Args.getLastArg(options::OPT_emit_interface_stubs)) { + if (Args.hasArg(options::OPT_emit_interface_stubs)) { llvm::SmallVector PhaseList; if (Args.hasArg(options::OPT_c)) { llvm::SmallVector CompilePhaseList; ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 0a220de - [HIP] Fix visibility for 'extern' device variables.
Author: Michael Liao Date: 2019-11-05T14:19:32-05:00 New Revision: 0a220de9e9ca3e6786df6c03fd37668815805c62 URL: https://github.com/llvm/llvm-project/commit/0a220de9e9ca3e6786df6c03fd37668815805c62 DIFF: https://github.com/llvm/llvm-project/commit/0a220de9e9ca3e6786df6c03fd37668815805c62.diff LOG: [HIP] Fix visibility for 'extern' device variables. Summary: - Fix a bug which misses the change for a variable to be set with target-specific attributes. Reviewers: yaxunl Subscribers: jvesely, nhaehnle, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D63020 Added: Modified: clang/lib/CodeGen/CodeGenModule.cpp clang/test/CodeGenCUDA/amdgpu-visibility.cu Removed: diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp index 49df82cea42b..be8f389e1809 100644 --- a/clang/lib/CodeGen/CodeGenModule.cpp +++ b/clang/lib/CodeGen/CodeGenModule.cpp @@ -3575,6 +3575,9 @@ CodeGenModule::GetOrCreateLLVMGlobal(StringRef MangledName, } } + if (GV->isDeclaration()) +getTargetCodeGenInfo().setTargetAttributes(D, GV, *this); + LangAS ExpectedAS = D ? D->getType().getAddressSpace() : (LangOpts.OpenCL ? LangAS::opencl_global : LangAS::Default); @@ -3584,9 +3587,6 @@ CodeGenModule::GetOrCreateLLVMGlobal(StringRef MangledName, return getTargetCodeGenInfo().performAddrSpaceCast(*this, GV, AddrSpace, ExpectedAS, Ty); - if (GV->isDeclaration()) -getTargetCodeGenInfo().setTargetAttributes(D, GV, *this); - return GV; } diff --git a/clang/test/CodeGenCUDA/amdgpu-visibility.cu b/clang/test/CodeGenCUDA/amdgpu-visibility.cu index 9f44eb047f82..f23e562a4f29 100644 --- a/clang/test/CodeGenCUDA/amdgpu-visibility.cu +++ b/clang/test/CodeGenCUDA/amdgpu-visibility.cu @@ -13,6 +13,16 @@ __constant__ int c; __device__ int g; +// CHECK-DEFAULT: @e = external addrspace(1) global +// CHECK-PROTECTED: @e = external protected addrspace(1) global +// CHECK-HIDDEN: @e = external protected addrspace(1) global +extern __device__ int e; + +// dummy one to hold reference to `e`. +__device__ int f() { + return e; +} + // CHECK-DEFAULT: define amdgpu_kernel void @_Z3foov() // CHECK-PROTECTED: define protected amdgpu_kernel void @_Z3foov() // CHECK-HIDDEN: define protected amdgpu_kernel void @_Z3foov() ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 15140e4 - [hip] Enable pointer argument lowering through coercing type.
Author: Michael Liao Date: 2019-11-05T13:05:05-05:00 New Revision: 15140e4bacf94fbc509e5a139909aefcd1cc3363 URL: https://github.com/llvm/llvm-project/commit/15140e4bacf94fbc509e5a139909aefcd1cc3363 DIFF: https://github.com/llvm/llvm-project/commit/15140e4bacf94fbc509e5a139909aefcd1cc3363.diff LOG: [hip] Enable pointer argument lowering through coercing type. Reviewers: tra, rjmccall, yaxunl Subscribers: jvesely, nhaehnle, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D69826 Added: clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu Modified: clang/lib/CodeGen/CGCall.cpp clang/lib/CodeGen/TargetInfo.cpp Removed: diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp index 62e8fa037013..e832e4c28334 100644 --- a/clang/lib/CodeGen/CGCall.cpp +++ b/clang/lib/CodeGen/CGCall.cpp @@ -1305,6 +1305,15 @@ static void CreateCoercedStore(llvm::Value *Src, DstTy = Dst.getType()->getElementType(); } + llvm::PointerType *SrcPtrTy = llvm::dyn_cast(SrcTy); + llvm::PointerType *DstPtrTy = llvm::dyn_cast(DstTy); + if (SrcPtrTy && DstPtrTy && + SrcPtrTy->getAddressSpace() != DstPtrTy->getAddressSpace()) { +Src = CGF.Builder.CreatePointerBitCastOrAddrSpaceCast(Src, DstTy); +CGF.Builder.CreateStore(Src, Dst, DstIsVolatile); +return; + } + // If the source and destination are integer or pointer types, just do an // extension or truncation to the desired type. if ((isa(SrcTy) || isa(SrcTy)) && diff --git a/clang/lib/CodeGen/TargetInfo.cpp b/clang/lib/CodeGen/TargetInfo.cpp index e33d69c86b3c..26c527d7c983 100644 --- a/clang/lib/CodeGen/TargetInfo.cpp +++ b/clang/lib/CodeGen/TargetInfo.cpp @@ -7685,6 +7685,42 @@ class AMDGPUABIInfo final : public DefaultABIInfo { bool isHomogeneousAggregateSmallEnough(const Type *Base, uint64_t Members) const override; + // Coerce HIP pointer arguments from generic pointers to global ones. + llvm::Type *coerceKernelArgumentType(llvm::Type *Ty, unsigned FromAS, + unsigned ToAS) const { +// Structure types. +if (auto STy = dyn_cast(Ty)) { + SmallVector EltTys; + bool Changed = false; + for (auto T : STy->elements()) { +auto NT = coerceKernelArgumentType(T, FromAS, ToAS); +EltTys.push_back(NT); +Changed |= (NT != T); + } + // Skip if there is no change in element types. + if (!Changed) +return STy; + if (STy->hasName()) +return llvm::StructType::create( +EltTys, (STy->getName() + ".coerce").str(), STy->isPacked()); + return llvm::StructType::get(getVMContext(), EltTys, STy->isPacked()); +} +// Arrary types. +if (auto ATy = dyn_cast(Ty)) { + auto T = ATy->getElementType(); + auto NT = coerceKernelArgumentType(T, FromAS, ToAS); + // Skip if there is no change in that element type. + if (NT == T) +return ATy; + return llvm::ArrayType::get(NT, ATy->getNumElements()); +} +// Single value types. +if (Ty->isPointerTy() && Ty->getPointerAddressSpace() == FromAS) + return llvm::PointerType::get( + cast(Ty)->getElementType(), ToAS); +return Ty; + } + public: explicit AMDGPUABIInfo(CodeGen::CodeGenTypes ) : DefaultABIInfo(CGT) {} @@ -7812,14 +7848,22 @@ ABIArgInfo AMDGPUABIInfo::classifyKernelArgumentType(QualType Ty) const { // TODO: Can we omit empty structs? - // Coerce single element structs to its element. + llvm::Type *LTy = nullptr; if (const Type *SeltTy = isSingleElementStruct(Ty, getContext())) -return ABIArgInfo::getDirect(CGT.ConvertType(QualType(SeltTy, 0))); +LTy = CGT.ConvertType(QualType(SeltTy, 0)); + + if (getContext().getLangOpts().HIP) { +if (!LTy) + LTy = CGT.ConvertType(Ty); +LTy = coerceKernelArgumentType( +LTy, /*FromAS=*/getContext().getTargetAddressSpace(LangAS::Default), +/*ToAS=*/getContext().getTargetAddressSpace(LangAS::cuda_device)); + } // If we set CanBeFlattened to true, CodeGen will expand the struct to its // individual elements, which confuses the Clover OpenCL backend; therefore we // have to set it to false here. Other args of getDirect() are just defaults. - return ABIArgInfo::getDirect(nullptr, 0, nullptr, false); + return ABIArgInfo::getDirect(LTy, 0, nullptr, false); } ABIArgInfo AMDGPUABIInfo::classifyArgumentType(QualType Ty, diff --git a/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu b/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu new file mode 100644 index ..cb8a75882d4d --- /dev/null +++ b/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu @@ -0,0 +1,69 @@ +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fcuda-is-device -emit-llvm -x hip %s -o - | FileCheck %s +// RUN: %clang_cc1
[clang] d142ec6 - Fix compilation warning. NFC.
Author: Michael Liao Date: 2019-11-04T10:01:50-05:00 New Revision: d142ec6fef9a053c9fd9edb5a388203cdb121e65 URL: https://github.com/llvm/llvm-project/commit/d142ec6fef9a053c9fd9edb5a388203cdb121e65 DIFF: https://github.com/llvm/llvm-project/commit/d142ec6fef9a053c9fd9edb5a388203cdb121e65.diff LOG: Fix compilation warning. NFC. Added: Modified: clang/lib/Driver/ToolChains/Darwin.cpp Removed: diff --git a/clang/lib/Driver/ToolChains/Darwin.cpp b/clang/lib/Driver/ToolChains/Darwin.cpp index d8c18effd62c..d550eea94670 100644 --- a/clang/lib/Driver/ToolChains/Darwin.cpp +++ b/clang/lib/Driver/ToolChains/Darwin.cpp @@ -1527,8 +1527,8 @@ getDeploymentTargetFromEnvironmentVariables(const Driver , Targets[Darwin::TvOS] = ""; } else { // Don't allow conflicts in any other platform. -int FirstTarget = llvm::array_lengthof(Targets); -for (int I = 0; I != llvm::array_lengthof(Targets); ++I) { +unsigned FirstTarget = llvm::array_lengthof(Targets); +for (unsigned I = 0; I != llvm::array_lengthof(Targets); ++I) { if (Targets[I].empty()) continue; if (FirstTarget == llvm::array_lengthof(Targets)) ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 45787e5 - Fix compilation warning. NFC.
Author: Michael Liao Date: 2019-10-25T01:06:52-04:00 New Revision: 45787e56829f47e45d127882b1cd1821e7022e68 URL: https://github.com/llvm/llvm-project/commit/45787e56829f47e45d127882b1cd1821e7022e68 DIFF: https://github.com/llvm/llvm-project/commit/45787e56829f47e45d127882b1cd1821e7022e68.diff LOG: Fix compilation warning. NFC. Added: Modified: clang/lib/CodeGen/CGBuiltin.cpp Removed: diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index 2caa8509ea06..9a56116173ec 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -6874,6 +6874,7 @@ Value *CodeGenFunction::EmitARMMVEBuiltinExpr(unsigned BuiltinID, return ToReturn; } } + llvm_unreachable("unknown custom codegen type."); } static Value *EmitAArch64TblBuiltinExpr(CodeGenFunction , unsigned BuiltinID, ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 5a48678 - [hip] Allow the declaration of functions with variadic arguments in HIP.
Author: Michael Liao Date: 2019-10-25T00:39:24-04:00 New Revision: 5a48678a6a1619fada23641a68c2d95ee57806b1 URL: https://github.com/llvm/llvm-project/commit/5a48678a6a1619fada23641a68c2d95ee57806b1 DIFF: https://github.com/llvm/llvm-project/commit/5a48678a6a1619fada23641a68c2d95ee57806b1.diff LOG: [hip] Allow the declaration of functions with variadic arguments in HIP. Summary: - As variadic parameters have the lowest rank in overload resolution, without real usage of `va_arg`, they are commonly used as the catch-all fallbacks in SFINAE. As the front-end still reports errors on calls to `va_arg`, the declaration of functions with variadic arguments should be allowed in general. Reviewers: jlebar, tra, yaxunl Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D69389 Added: Modified: clang/lib/CodeGen/TargetInfo.cpp clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Driver/ToolChains/HIP.cpp clang/test/Driver/hip-toolchain-no-rdc.hip clang/test/Driver/hip-toolchain-rdc.hip Removed: diff --git a/clang/lib/CodeGen/TargetInfo.cpp b/clang/lib/CodeGen/TargetInfo.cpp index c2c7b8bf653b..e33d69c86b3c 100644 --- a/clang/lib/CodeGen/TargetInfo.cpp +++ b/clang/lib/CodeGen/TargetInfo.cpp @@ -7694,6 +7694,8 @@ class AMDGPUABIInfo final : public DefaultABIInfo { ABIArgInfo classifyArgumentType(QualType Ty, unsigned ) const; void computeInfo(CGFunctionInfo ) const override; + Address EmitVAArg(CodeGenFunction , Address VAListAddr, +QualType Ty) const override; }; bool AMDGPUABIInfo::isHomogeneousAggregateBaseType(QualType Ty) const { @@ -7757,6 +7759,11 @@ void AMDGPUABIInfo::computeInfo(CGFunctionInfo ) const { } } +Address AMDGPUABIInfo::EmitVAArg(CodeGenFunction , Address VAListAddr, + QualType Ty) const { + llvm_unreachable("AMDGPU does not support varargs"); +} + ABIArgInfo AMDGPUABIInfo::classifyReturnType(QualType RetTy) const { if (isAggregateTypeForABI(RetTy)) { // Records with non-trivial destructors/copy-constructors should not be diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 70c70dcdbd4d..ae12465d3f8b 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -5334,6 +5334,9 @@ void Clang::ConstructJob(Compilation , const JobAction , CmdArgs.push_back("-fcuda-short-ptr"); } + if (IsHIP) +CmdArgs.push_back("-fcuda-allow-variadic-functions"); + // OpenMP offloading device jobs take the argument -fopenmp-host-ir-file-path // to specify the result of the compile phase on the host, so the meaningful // device declarations can be identified. Also, -fopenmp-is-device is passed diff --git a/clang/lib/Driver/ToolChains/HIP.cpp b/clang/lib/Driver/ToolChains/HIP.cpp index d84a454359ad..1053a1a60978 100644 --- a/clang/lib/Driver/ToolChains/HIP.cpp +++ b/clang/lib/Driver/ToolChains/HIP.cpp @@ -296,6 +296,8 @@ void HIPToolChain::addClangTargetOptions( options::OPT_fno_gpu_allow_device_init, false)) CC1Args.push_back("-fgpu-allow-device-init"); + CC1Args.push_back("-fcuda-allow-variadic-functions"); + // Default to "hidden" visibility, as object level linking will not be // supported for the foreseeable future. if (!DriverArgs.hasArg(options::OPT_fvisibility_EQ, diff --git a/clang/test/Driver/hip-toolchain-no-rdc.hip b/clang/test/Driver/hip-toolchain-no-rdc.hip index 540b93286053..74b53cfb 100644 --- a/clang/test/Driver/hip-toolchain-no-rdc.hip +++ b/clang/test/Driver/hip-toolchain-no-rdc.hip @@ -20,7 +20,7 @@ // CHECK-SAME: "-aux-triple" "x86_64-unknown-linux-gnu" // CHECK-SAME: "-emit-llvm-bc" // CHECK-SAME: {{.*}} "-main-file-name" "a.cu" {{.*}} "-target-cpu" "gfx803" -// CHECK-SAME: "-fcuda-is-device" "-fvisibility" "hidden" +// CHECK-SAME: "-fcuda-is-device" "-fcuda-allow-variadic-functions" "-fvisibility" "hidden" // CHECK-SAME: "-fapply-global-visibility-to-externs" // CHECK-SAME: "{{.*}}lib1.bc" "{{.*}}lib2.bc" // CHECK-SAME: {{.*}} "-o" [[A_BC_803:".*bc"]] "-x" "hip" @@ -48,7 +48,7 @@ // CHECK-SAME: "-aux-triple" "x86_64-unknown-linux-gnu" // CHECK-SAME: "-emit-llvm-bc" // CHECK-SAME: {{.*}} "-main-file-name" "a.cu" {{.*}} "-target-cpu" "gfx900" -// CHECK-SAME: "-fcuda-is-device" "-fvisibility" "hidden" +// CHECK-SAME: "-fcuda-is-device" "-fcuda-allow-variadic-functions" "-fvisibility" "hidden" // CHECK-SAME: "-fapply-global-visibility-to-externs" // CHECK-SAME: "{{.*}}lib1.bc" "{{.*}}lib2.bc" // CHECK-SAME: {{.*}} "-o" [[A_BC_900:".*bc"]] "-x" "hip" @@ -92,7 +92,7 @@ // CHECK-SAME: "-aux-triple" "x86_64-unknown-linux-gnu" // CHECK-SAME: "-emit-llvm-bc" // CHECK-SAME: {{.*}} "-main-file-name" "b.hip" {{.*}} "-target-cpu" "gfx803" -// CHECK-SAME: "-fcuda-is-device" "-fvisibility"
[clang] 114de1e - Minor coding style fix. NFC.
Author: Michael Liao Date: 2019-10-22T04:32:30Z New Revision: 114de1eab29c06ac097c0e97feb713d616798f7a URL: https://github.com/llvm/llvm-project/commit/114de1eab29c06ac097c0e97feb713d616798f7a DIFF: https://github.com/llvm/llvm-project/commit/114de1eab29c06ac097c0e97feb713d616798f7a.diff LOG: Minor coding style fix. NFC. llvm-svn: 375478 Added: Modified: clang/lib/Sema/SemaLambda.cpp Removed: diff --git a/clang/lib/Sema/SemaLambda.cpp b/clang/lib/Sema/SemaLambda.cpp index 749b0f2caaa0..c6b19a0b195c 100644 --- a/clang/lib/Sema/SemaLambda.cpp +++ b/clang/lib/Sema/SemaLambda.cpp @@ -444,7 +444,8 @@ void Sema::handleLambdaNumbering( } auto getMangleNumberingContext = - [this](CXXRecordDecl *Class, Decl *ManglingContextDecl) -> MangleNumberingContext * { + [this](CXXRecordDecl *Class, + Decl *ManglingContextDecl) -> MangleNumberingContext * { // Get mangle numbering context if there's any extra decl context. if (ManglingContextDecl) return ( ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
r375310 - [clang][driver] Print compilation phases with indentation.
Author: hliao Date: Fri Oct 18 17:17:00 2019 New Revision: 375310 URL: http://llvm.org/viewvc/llvm-project?rev=375310=rev Log: [clang][driver] Print compilation phases with indentation. Reviewers: tra, sfantao, echristo Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D69124 Modified: cfe/trunk/lib/Driver/Driver.cpp Modified: cfe/trunk/lib/Driver/Driver.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/Driver.cpp?rev=375310=375309=375310=diff == --- cfe/trunk/lib/Driver/Driver.cpp (original) +++ cfe/trunk/lib/Driver/Driver.cpp Fri Oct 18 17:17:00 2019 @@ -1802,23 +1802,36 @@ bool Driver::HandleImmediateArgs(const C return true; } +enum { + TopLevelAction = 0, + HeadSibAction = 1, + OtherSibAction = 2, +}; + // Display an action graph human-readably. Action A is the "sink" node // and latest-occuring action. Traversal is in pre-order, visiting the // inputs to each action before printing the action itself. static unsigned PrintActions1(const Compilation , Action *A, - std::map ) { + std::map , + Twine Indent = {}, int Kind = TopLevelAction) { if (Ids.count(A)) // A was already visited. return Ids[A]; std::string str; llvm::raw_string_ostream os(str); + auto getSibIndent = [](int K) -> Twine { +return (K == HeadSibAction) ? " " : (K == OtherSibAction) ? "| " : ""; + }; + + Twine SibIndent = Indent + getSibIndent(Kind); + int SibKind = HeadSibAction; os << Action::getClassName(A->getKind()) << ", "; if (InputAction *IA = dyn_cast(A)) { os << "\"" << IA->getInputArg().getValue() << "\""; } else if (BindArchAction *BIA = dyn_cast(A)) { os << '"' << BIA->getArchName() << '"' << ", {" - << PrintActions1(C, *BIA->input_begin(), Ids) << "}"; + << PrintActions1(C, *BIA->input_begin(), Ids, SibIndent, SibKind) << "}"; } else if (OffloadAction *OA = dyn_cast(A)) { bool IsFirst = true; OA->doOnEachDependence( @@ -1841,8 +1854,9 @@ static unsigned PrintActions1(const Comp os << ":" << BoundArch; os << ")"; os << '"'; - os << " {" << PrintActions1(C, A, Ids) << "}"; + os << " {" << PrintActions1(C, A, Ids, SibIndent, SibKind) << "}"; IsFirst = false; + SibKind = OtherSibAction; }); } else { const ActionList *AL = >getInputs(); @@ -1850,8 +1864,9 @@ static unsigned PrintActions1(const Comp if (AL->size()) { const char *Prefix = "{"; for (Action *PreRequisite : *AL) { -os << Prefix << PrintActions1(C, PreRequisite, Ids); +os << Prefix << PrintActions1(C, PreRequisite, Ids, SibIndent, SibKind); Prefix = ", "; +SibKind = OtherSibAction; } os << "}"; } else @@ -1872,9 +1887,13 @@ static unsigned PrintActions1(const Comp } } + auto getSelfIndent = [](int K) -> Twine { +return (K == HeadSibAction) ? "+- " : (K == OtherSibAction) ? "|- " : ""; + }; + unsigned Id = Ids.size(); Ids[A] = Id; - llvm::errs() << Id << ": " << os.str() << ", " + llvm::errs() << Indent + getSelfIndent(Kind) << Id << ": " << os.str() << ", " << types::getTypeName(A->getType()) << offload_os.str() << "\n"; return Id; ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
r375309 - [hip][cuda] Fix the extended lambda name mangling issue.
Author: hliao Date: Fri Oct 18 17:15:19 2019 New Revision: 375309 URL: http://llvm.org/viewvc/llvm-project?rev=375309=rev Log: [hip][cuda] Fix the extended lambda name mangling issue. Summary: - HIP/CUDA host side needs to use device kernel symbol name to match the device side binaries. Without a consistent naming between host- and device-side compilations, it's risky that wrong device binaries are executed. Consistent naming is usually not an issue until unnamed types are used, especially the lambda. In this patch, the consistent name mangling is addressed for the extended lambdas, i.e. the lambdas annotated with `__device__`. - In [Itanium C++ ABI][1], the mangling of the lambda is generally unspecified unless, in certain cases, ODR rule is required to ensure consisent naming cross TUs. The extended lambda is such a case as its name may be part of a device kernel function, e.g., the extended lambda is used as a template argument and etc. Thus, we need to force ODR for extended lambdas as they are referenced in both device- and host-side TUs. Furthermore, if a extended lambda is nested in other (extended or not) lambdas, those lambdas are required to follow ODR naming as well. This patch revises the current lambda mangle numbering to force ODR from an extended lambda to all its parent lambdas. - On the other side, the aforementioned ODR naming should not change those lambdas' original linkages, i.e., we cannot replace the original `internal` with `linkonce_odr`; otherwise, we may violate ODR in general. This patch introduces a new field `HasKnownInternalLinkage` in lambda data to decouple the current linkage calculation based on mangling number assigned. [1]: https://itanium-cxx-abi.github.io/cxx-abi/abi.html Reviewers: tra, rsmith, yaxunl, martong, shafik Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D68818 Added: cfe/trunk/test/CodeGenCUDA/unnamed-types.cu Modified: cfe/trunk/include/clang/AST/DeclCXX.h cfe/trunk/include/clang/Sema/Sema.h cfe/trunk/lib/AST/ASTImporter.cpp cfe/trunk/lib/AST/Decl.cpp cfe/trunk/lib/Sema/SemaLambda.cpp cfe/trunk/lib/Sema/TreeTransform.h cfe/trunk/lib/Serialization/ASTReaderDecl.cpp cfe/trunk/lib/Serialization/ASTWriter.cpp Modified: cfe/trunk/include/clang/AST/DeclCXX.h URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/AST/DeclCXX.h?rev=375309=375308=375309=diff == --- cfe/trunk/include/clang/AST/DeclCXX.h (original) +++ cfe/trunk/include/clang/AST/DeclCXX.h Fri Oct 18 17:15:19 2019 @@ -389,9 +389,12 @@ class CXXRecordDecl : public RecordDecl /// The number of explicit captures in this lambda. unsigned NumExplicitCaptures : 13; +/// Has known `internal` linkage. +unsigned HasKnownInternalLinkage : 1; + /// The number used to indicate this lambda expression for name /// mangling in the Itanium C++ ABI. -unsigned ManglingNumber = 0; +unsigned ManglingNumber : 31; /// The declaration that provides context for this lambda, if the /// actual DeclContext does not suffice. This is used for lambdas that @@ -406,12 +409,12 @@ class CXXRecordDecl : public RecordDecl /// The type of the call method. TypeSourceInfo *MethodTyInfo; -LambdaDefinitionData(CXXRecordDecl *D, TypeSourceInfo *Info, - bool Dependent, bool IsGeneric, - LambdaCaptureDefault CaptureDefault) - : DefinitionData(D), Dependent(Dependent), IsGenericLambda(IsGeneric), -CaptureDefault(CaptureDefault), NumCaptures(0), NumExplicitCaptures(0), -MethodTyInfo(Info) { +LambdaDefinitionData(CXXRecordDecl *D, TypeSourceInfo *Info, bool Dependent, + bool IsGeneric, LambdaCaptureDefault CaptureDefault) +: DefinitionData(D), Dependent(Dependent), IsGenericLambda(IsGeneric), + CaptureDefault(CaptureDefault), NumCaptures(0), + NumExplicitCaptures(0), HasKnownInternalLinkage(0), ManglingNumber(0), + MethodTyInfo(Info) { IsLambda = true; // C++1z [expr.prim.lambda]p4: @@ -1705,6 +1708,13 @@ public: return getLambdaData().ManglingNumber; } + /// The lambda is known to has internal linkage no matter whether it has name + /// mangling number. + bool hasKnownLambdaInternalLinkage() const { +assert(isLambda() && "Not a lambda closure type!"); +return getLambdaData().HasKnownInternalLinkage; + } + /// Retrieve the declaration that provides additional context for a /// lambda, when the normal declaration context is not specific enough. /// @@ -1718,9 +1728,12 @@ public: /// Set the mangling number and context declaration for a lambda /// class. - void setLambdaMangling(unsigned ManglingNumber, Decl *ContextDecl) { + void setLambdaMangling(unsigned ManglingNumber, Decl
r375245 - [tooling] Relax an assert when multiple GPU targets are specified.
Author: hliao Date: Fri Oct 18 08:03:34 2019 New Revision: 375245 URL: http://llvm.org/viewvc/llvm-project?rev=375245=rev Log: [tooling] Relax an assert when multiple GPU targets are specified. Modified: cfe/trunk/lib/Tooling/Tooling.cpp Modified: cfe/trunk/lib/Tooling/Tooling.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Tooling/Tooling.cpp?rev=375245=375244=375245=diff == --- cfe/trunk/lib/Tooling/Tooling.cpp (original) +++ cfe/trunk/lib/Tooling/Tooling.cpp Fri Oct 18 08:03:34 2019 @@ -105,7 +105,7 @@ static const llvm::opt::ArgStringList *g // tooling will consider host-compilation only. For tooling on device // compilation, device compilation only option, such as // `--cuda-device-only`, needs specifying. -assert(Actions.size() == 2); +assert(Actions.size() > 1); assert( isa(Actions.front()) || // On MacOSX real actions may end up being wrapped in ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang-tools-extra] r375039 - [clangd] Add the missing dependency on `clangLex`.
Author: hliao Date: Wed Oct 16 13:22:54 2019 New Revision: 375039 URL: http://llvm.org/viewvc/llvm-project?rev=375039=rev Log: [clangd] Add the missing dependency on `clangLex`. Modified: clang-tools-extra/trunk/clangd/refactor/tweaks/CMakeLists.txt clang-tools-extra/trunk/clangd/tool/CMakeLists.txt Modified: clang-tools-extra/trunk/clangd/refactor/tweaks/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/clang-tools-extra/trunk/clangd/refactor/tweaks/CMakeLists.txt?rev=375039=375038=375039=diff == --- clang-tools-extra/trunk/clangd/refactor/tweaks/CMakeLists.txt (original) +++ clang-tools-extra/trunk/clangd/refactor/tweaks/CMakeLists.txt Wed Oct 16 13:22:54 2019 @@ -26,6 +26,7 @@ add_clang_library(clangDaemonTweaks OBJE clangAST clangBasic clangDaemon + clangLex clangToolingCore clangToolingRefactoring clangToolingSyntax Modified: clang-tools-extra/trunk/clangd/tool/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/clang-tools-extra/trunk/clangd/tool/CMakeLists.txt?rev=375039=375038=375039=diff == --- clang-tools-extra/trunk/clangd/tool/CMakeLists.txt (original) +++ clang-tools-extra/trunk/clangd/tool/CMakeLists.txt Wed Oct 16 13:22:54 2019 @@ -21,6 +21,7 @@ clang_target_link_libraries(clangd clangBasic clangFormat clangFrontend + clangLex clangSema clangTooling clangToolingCore ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
r374478 - [tooling] Fix assertion on MacOSX.
Author: hliao Date: Thu Oct 10 16:45:20 2019 New Revision: 374478 URL: http://llvm.org/viewvc/llvm-project?rev=374478=rev Log: [tooling] Fix assertion on MacOSX. Modified: cfe/trunk/lib/Tooling/Tooling.cpp Modified: cfe/trunk/lib/Tooling/Tooling.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Tooling/Tooling.cpp?rev=374478=374477=374478=diff == --- cfe/trunk/lib/Tooling/Tooling.cpp (original) +++ cfe/trunk/lib/Tooling/Tooling.cpp Thu Oct 10 16:45:20 2019 @@ -106,7 +106,12 @@ static const llvm::opt::ArgStringList *g // compilation, device compilation only option, such as // `--cuda-device-only`, needs specifying. assert(Actions.size() == 2); -assert(isa(Actions.front())); +assert( +isa(Actions.front()) || +// On MacOSX real actions may end up being wrapped in +// BindArchAction. +(isa(Actions.front()) && + isa(*Actions.front()->input_begin(; OffloadCompilation = true; break; } ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits