[PATCH] D121302: [HIP] Fix -fno-gpu-sanitize

2022-03-09 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGda9a70313d60: [HIP] Fix -fno-gpu-sanitize (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Github Monorepo CHANGES SINCE

[PATCH] D120132: [HIP] Fix HIP include path

2022-03-09 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG6730b44480fc: [HIP] Fix HIP include path (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D120132/new/ http

[PATCH] D121302: [HIP] Fix -fno-gpu-sanitize

2022-03-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. Herald added a project: All. yaxunl requested review of this revision. Fix a typo about -fno-gpu-sanitize handling and disable warnings when -fno-gpu-sanitize is specified. https://reviews.llvm.org/D121302 Files: clang/lib/Driver/Too

[PATCH] D120132: [HIP] Fix HIP include path

2022-03-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. I found a simple fix. Use -idirafter instead of -internal-isystem. It is still a system include path but will be added after all other system include paths. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D120132/new/ https://reviews.llvm.org/D120132

[PATCH] D120132: [HIP] Fix HIP include path

2022-03-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 414100. yaxunl added a comment. use -idirafter to include HIP include path CHANGES SINCE LAST ACTION https://reviews.llvm.org/D120132/new/ https://reviews.llvm.org/D120132 Files: clang/lib/Driver/ToolChains/AMDGPU.cpp clang/test/Driver/hip-include-pat

[PATCH] D120911: [CUDA][HIP] Fix offloading kind for linking C++ programs

2022-03-04 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGe5eb365069cc: [CUDA][HIP] Fix offloading kind for linking C++ programs (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Gi

[PATCH] D120910: [HIP] Fix job action offloading kind for mixed HIP/C++ compilation

2022-03-04 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGbde13a8102ba: [HIP] Fix job action offloading kind for mixed HIP/C++ compilation (authored by yaxunl). Herald added a project: clang. Repository:

[PATCH] D120911: [CUDA][HIP] Fix offloading kind for linking C++ programs

2022-03-04 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 412989. yaxunl added a comment. add more tests CHANGES SINCE LAST ACTION https://reviews.llvm.org/D120911/new/ https://reviews.llvm.org/D120911 Files: clang/include/clang/Driver/Action.h clang/lib/Driver/Driver.cpp clang/test/Driver/hip-phases.hip

[PATCH] D120272: [CUDA] Add driver support for compiling CUDA with the new driver

2022-03-03 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/Driver/Driver.cpp:4099-4102 + for (auto &Arg : Args.getAllArgValues(options::OPT_offload_arch_EQ)) +Archs.insert(getCanonicalArchString(C, Args, Arg, Kind)); + for (auto &Arg : Args.getAllArgValues(options::OPT_no_offload_

[PATCH] D120132: [HIP] Fix HIP include path

2022-03-03 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. Herald added a project: All. In D120132#3352255, @tra wrote: > In D120132#3351999, @yaxunl wrote: > >> In D120132#3351853, @tra wrote: >> >>> In

[PATCH] D120911: [CUDA][HIP] Fix offloading kind for linking C++ programs

2022-03-03 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. Herald added a subscriber: carlosgalvezp. Herald added a project: All. yaxunl requested review of this revision. When both CUDA or HIP programs and C++ programs are passed to clang driver without `-c`, C++ programs are treated as CUDA or

[PATCH] D120910: [HIP] Fix job action offloading kind for mixed HIP/C++ compilation

2022-03-03 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. Herald added a project: All. yaxunl requested review of this revision. When both HIP and C++ programs are input files to clang with -c, clang treats C++ programs as HIP programs, which is incorrect. This is due to action builder does not

[PATCH] D120132: [HIP] Fix HIP include path

2022-03-01 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D120132#3351853, @tra wrote: > In D120132#3351391, @yaxunl wrote: > >> If any input file is HIP program, clang driver will use HIP offload kind for >> all inputs. This behav

[PATCH] D120132: [HIP] Fix HIP include path

2022-03-01 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D120132#3350020, @tra wrote: > In D120132#3349936, @yaxunl wrote: > >> Users may use clang driver to compile HIP program and C++ program with one >> clang driver invocation, e.g. >> >

[PATCH] D120697: [clang-offload-bundler] HIP and OpenMP compatibility for linking heterogeneous archive library

2022-03-01 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D120697/new/ https://reviews.llvm.org/D120697

[PATCH] D120132: [HIP] Fix HIP include path

2022-02-28 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D120132#3349538, @tra wrote: > In D120132#3345534, @yaxunl wrote: > >> I just found one issue with the current patch. It adds HIP include path for >> non-HIP programs. >> >> We should

[PATCH] D120662: [clang-offload-bundler] add -input/-output options

2022-02-28 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. There seem to be regressions on Windows: https://buildkite.com/llvm-project/premerge-checks/builds/81394#c0e8e7e4-1f70-47d3-852a-e65469d70c41 Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D120662/new/ https://reviews.llvm.or

[PATCH] D120557: [HIP] File device library ABI version file name

2022-02-28 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG092f15ac40ce: [HIP] File device library ABI version file name (authored by yaxunl). Herald added a project: clang. Changed prior to commit: https:

[PATCH] D120563: [HIP] Fix test hip-link-bundled-archive.hip

2022-02-25 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rGdf0c98364322: [HIP] Fix test hip-link-bundled-archive.hip (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D

[PATCH] D120566: [OpenCL][AMDGPU]: Do not allow a call to kernel

2022-02-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. One of my concerns is that all kernels are duplicated, which may double the code object size. Do we need to make the clone always_inline and let the kernel call its clone to avoid duplicate function bodies? Or does LLVM have a pass that does that? Another concern is that the

[PATCH] D120132: [HIP] Fix HIP include path

2022-02-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl planned changes to this revision. yaxunl added a comment. I just found one issue with the current patch. It adds the HIP include path for non-HIP programs. We should only add the HIP include path for a JobAction with HIP offloading kind. However, AddClangSystemIncludeArgs is not per job action. I

[PATCH] D120529: Disable broken hip test on Windows

2022-02-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. https://reviews.llvm.org/D120563 Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D120529/new/ https://reviews.llvm.org/D120529

[PATCH] D120563: [HIP] Fix test hip-link-bundled-archive.hip

2022-02-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: tra, shangwuyao. yaxunl requested review of this revision. The match pattern should match lld.exe on Windows. https://reviews.llvm.org/D120563 Files: clang/test/Driver/hip-link-bundle-archive.hip Index: clang/test/Driver/hip-link-bundle-archi

[PATCH] D120529: Disable broken hip test on Windows

2022-02-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. Sorry I missed this failure. Thanks for disabling it. I will come up with a fix. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D120529/new/ https://reviews.llvm.org/D120529

[PATCH] D120366: [CUDA][SPIRV] Assign global address space to CUDA kernel arguments

2022-02-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D120366#3345401, @yaxunl wrote: > In D120366#3344428, @shangwuyao wrote: > >> @yaxunl I saw that you added the test recently, could you provide some >> context? I think this test i

[PATCH] D120366: [CUDA][SPIRV] Assign global address space to CUDA kernel arguments

2022-02-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D120366#3344428, @shangwuyao wrote: > @yaxunl I saw that you added the test recently, could you provide some > context? I think this test is broken at HEAD as I saw it is broken for other > patches (see this build >

[PATCH] D120557: [HIP] File device library ABI version file name

2022-02-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. Herald added subscribers: kerbowa, jvesely. yaxunl requested review of this revision. It should be oclc_abi_version* instead of abi_version*. https://reviews.llvm.org/D120557 Files: clang/lib/Driver/ToolChains/AMDGPU.cpp clang/test

[PATCH] D120132: [HIP] Fix HIP include path

2022-02-24 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 411290. yaxunl added a comment. Revised per Artem's comments. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D120132/new/ https://reviews.llvm.org/D120132 Files: clang/lib/Driver/ToolChains/AMDGPU.cpp clang/lib/Driver/ToolChains/CrossWindows.cpp c

[PATCH] D120132: [HIP] Fix HIP include path

2022-02-24 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 411150. yaxunl edited the summary of this revision. yaxunl added a reviewer: linjamaki. yaxunl added a comment. Herald added subscribers: mstorsjo, emaste. add HIP include path after adding other system include paths CHANGES SINCE LAST ACTION https://review

[PATCH] D120298: [HIP] Support `-fgpu-default-stream`

2022-02-23 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG9d899d8f0187: [HIP] Support `-fgpu-default-stream` (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Github Monorepo CHANG

[PATCH] D120132: [HIP] Fix HIP include path

2022-02-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:531-532 + DriverArgs.hasArg(options::OPT_nostdlibinc)) { +CC1Args.push_back("-internal-isystem"); +CC1Args.push_back(HipIncludePath); + } tra wrote: > yaxunl wrote: > >

[PATCH] D120132: [HIP] Fix HIP include path

2022-02-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:531-532 + DriverArgs.hasArg(options::OPT_nostdlibinc)) { +CC1Args.push_back("-internal-isystem"); +CC1Args.push_back(HipIncludePath); + } tra wrote: > yaxunl wrote: > >

[PATCH] D120298: [HIP] Support `-fgpu-default-stream`

2022-02-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D120298#3341250, @tra wrote: > LGTM with a minor nit. > >> Also -DHIP_API_PER_THREAD_DEFAULT_STREAM is passed to clang -cc1 to enable >> other per-thread stream > > You may want to rephrase the patch description a bit to match

[PATCH] D120132: [HIP] Fix HIP include path

2022-02-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:531-532 + DriverArgs.hasArg(options::OPT_nostdlibinc)) { +CC1Args.push_back("-internal-isystem"); +CC1Args.push_back(HipIncludePath); + } tra wrote: > My impression, af

[PATCH] D120298: [HIP] Support `-fgpu-default-stream`

2022-02-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 410691. yaxunl marked an inline comment as done. yaxunl retitled this revision from "[HIP] Support `--default-stream`" to "[HIP] Support `-fgpu-default-stream`". yaxunl edited the summary of this revision. yaxunl added a comment. rename the option and use prep

[PATCH] D88425: Skip -fPIE for AMDGPU and HIP toolchain

2022-02-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D88425#3337003, @MaskRay wrote: > I plan to default CMake `CLANG_DEFAULT_PIE_ON_LINUX` to on in D120305 > and hip-fpie-option.hip will fail. Do you > mind investigating the issue? You need t

[PATCH] D120298: [HIP] Support `--default-stream`

2022-02-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: clang/include/clang/Driver/Options.td:962 NegFlag>; +def default_stream_EQ : Joined<["--"], "default-stream=">, + HelpText<"Specify default stream. Valid values are 'legacy' and 'per-thread'.

[PATCH] D120298: [HIP] Support `--default-stream`

2022-02-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. Herald added subscribers: dexonsmith, dang. yaxunl requested review of this revision. Introduce `--default-stream={legacy|per-thread}` option to support per-thread default stream for HIP runtime. When `--default-stream=per-thread`, HIP k

[PATCH] D120070: [HIP] Support linking archive of bundled bitcode

2022-02-19 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGfa0f90bc55ed: [HIP] Support linking archive of bundled bitcode (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Github Mon

[PATCH] D120132: [HIP] Fix HIP include path

2022-02-18 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. Herald added subscribers: kerbowa, jvesely. yaxunl requested review of this revision. The clang compiler prepends the HIP header include paths to the search list using -internal-isystem when building for the HIP language. This prevents wa

[PATCH] D120070: [HIP] Support linking archive of bundled bitcode

2022-02-18 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: clang/test/Driver/hip-link-bundle-archive.hip:3 + +// RUN: touch %T/libhipBundled.a + yaxunl wrote: > tra wrote: > > Is this file necessary? `clang -###` should not need the file t

[PATCH] D120070: [HIP] Support linking archive of bundled bitcode

2022-02-17 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: clang/test/Driver/hip-link-bundle-archive.hip:3 + +// RUN: touch %T/libhipBundled.a + tra wrote: > Is this file necessary? `clang -###` should not need the file to be present > in

[PATCH] D120070: [HIP] Support linking archive of bundled bitcode

2022-02-17 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. yaxunl requested review of this revision. HIP programs compiled with -c -fgpu-rdc generate clang-offload-bundler bundles which contain bitcode for different GPUs. Such files can be archived to an archive file which can be linked with HI

[PATCH] D119207: [CUDA][SPIRV] Assign global address space to CUDA kernel arguments

2022-02-15 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:10322 ABIArgInfo SPIRVABIInfo::classifyKernelArgumentType(QualType Ty) const { - if (getContext().getLangOpts().HIP) { + if (getContext().getLangOpts().CUDAIsDevice) { // Coerce pointer arguments w

[PATCH] D119615: [CUDA][HIP] Do not promote constexpr var with non-constant initializer

2022-02-15 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. yaxunl marked an inline comment as done. Closed by commit rG73b22935a7a8: [CUDA][HIP] Do not promote constexpr var with non-constant initializer (authored by yaxunl). H

[PATCH] D119615: [CUDA][HIP] Do not promote constexpr var with non-constant initializer

2022-02-15 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: clang/lib/Sema/SemaCUDA.cpp:148-150 + if ((Var->isConstexpr() || Var->getType().isConstQualified()) && + Var->hasAttr() && !hasExplicitAttr(Var)) tra wrote: > So the i

[PATCH] D119615: [CUDA][HIP] Do not promote constexpr var with non-constant initializer

2022-02-11 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. yaxunl requested review of this revision. A constexpr var may be initialized with the address of a non-const variable. In this case the initializer is not constant in device compilation. This has been handled for const vars but not for constexpr

[PATCH] D119026: [HIP] Emit amdgpu_code_object_version module flag

2022-02-08 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG1d97cb1f6e44: [HIP] Emit amdgpu_code_object_version module flag (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm

[PATCH] D119026: [HIP] Emit amdgpu_code_object_version module flag

2022-02-07 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 406605. yaxunl marked 2 inline comments as done. yaxunl added a comment. fix comments and add a driver test for -cc1as CHANGES SINCE LAST ACTION https://reviews.llvm.org/D119026/new/ https://reviews.llvm.org/D119026 Files: clang/include/clang/Basic/Targ

[PATCH] D119026: [HIP] Emit amdgpu_code_object_version module flag

2022-02-07 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 2 inline comments as done. yaxunl added inline comments. Comment at: clang/lib/Driver/ToolChains/Clang.cpp:1166 CmdArgs.insert(CmdArgs.begin() + 1, "-mllvm"); +// -cc1as does not need -mcode-object-version option. +if (!IsCC1As) tra wro

[PATCH] D119026: [HIP] Emit amdgpu_code_object_version module flag

2022-02-07 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 406536. yaxunl marked 2 inline comments as done. yaxunl added a comment. add a test for -cc1as CHANGES SINCE LAST ACTION https://reviews.llvm.org/D119026/new/ https://reviews.llvm.org/D119026 Files: clang/include/clang/Basic/TargetOptions.h clang/incl

[PATCH] D119026: [HIP] Emit amdgpu_code_object_version module flag

2022-02-07 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 2 inline comments as done. yaxunl added inline comments. Comment at: clang/include/clang/Driver/Options.td:3445 def mcode_object_version_EQ : Joined<["-"], "mcode-object-version=">, Group, - HelpText<"Specify code object ABI version. Defaults to 4. (AMDGPU only)"

[PATCH] D118876: [HIPSPV] Fix literals are mapped to Generic address space

2022-02-07 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. It has been cherry-picked to 14.x by 02d5b112 Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D118876/new/ https://reviews.llvm.org/D118876

[PATCH] D119026: [HIP] Emit amdgpu_code_object_version module flag

2022-02-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 406221. yaxunl marked an inline comment as done. yaxunl added a comment. marshalling the arg as enum. fix test failures for -cc1as. temporarily disable it except v5. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D119026/new/ https://reviews.llvm.org/

[PATCH] D119026: [HIP] Emit amdgpu_code_object_version module flag

2022-02-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: clang/lib/CodeGen/CodeGenModule.cpp:575 +// times 100. +if (getTarget().getTargetOpts().CodeObjectVersion != "none") { + unsigned CodeObjVer; tra wrote: > yaxunl wrote

[PATCH] D118876: [HIPSPV] Fix literals are mapped to Generic address space

2022-02-05 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG171da443d598: [HIPSPV] Fix literals are mapped to Generic address space (authored by yaxunl). Repository: rG LLVM Github Monorepo CHANGES SINCE L

[PATCH] D118876: [HIPSPV] Fix literals are mapped to Generic address space

2022-02-04 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a subscriber: tstellar. yaxunl added a comment. In D118876#3295958, @linjamaki wrote: > Thanks for the review, @yaxunl. Could you push this to the LLVM? And to the > LLVM 14 release branch too, if possible? Sure. @tstellar What is the curr

[PATCH] D119026: [HIP] Emit amdgpu_code_object_version module flag

2022-02-04 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/CodeGen/CodeGenModule.cpp:575 +// times 100. +if (getTarget().getTargetOpts().CodeObjectVersion != "none") { + unsigned CodeObjVer; tra wrote: > When will it ever be set to `none`? Does the new opti

[PATCH] D119026: [HIP] Emit amdgpu_code_object_version module flag

2022-02-04 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: tra, b-sumner. Herald added subscribers: dang, kerbowa, t-tye, tpr, dstuttard, jvesely, kzhuravl. yaxunl requested review of this revision. Herald added a subscriber: wdng. The code object version determines the ABI and therefore should not be mixed. Th

[PATCH] D118949: [HIP] Support code object v5

2022-02-04 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGd4e4ef2e81e0: [HIP] Support code object v5 (authored by yaxunl). Herald added a project: clang. Changed prior to commit: https://reviews.llvm.org/

[PATCH] D118949: [HIP] Support code object v5

2022-02-03 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 2 inline comments as done. yaxunl added a comment. Will revise as recommended when committing. Thanks. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D118949/new/ https://reviews.llvm.org/D118949

[PATCH] D118949: [HIP] Support code object v5

2022-02-03 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: tra, b-sumner. Herald added subscribers: dang, kerbowa, jvesely. yaxunl requested review of this revision. Herald added a reviewer: jdoerfert. Herald added a subscriber: sstefan1. New device library supporting v4 and v5 has abi_version_400.bc a

[PATCH] D118876: [HIPSPV] Fix literals are mapped to Generic address space

2022-02-03 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D118876/new/ https://reviews.llvm.org/D118876

[PATCH] D115523: [OpenCL] Set external linkage for block enqueue kernels

2022-01-31 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D115523#3266584, @Anastasia wrote: > In D115523#3240870, @yaxunl wrote: > >> In D115523#3237857, @Anastasia wrote: >> >>> In D115523#3237

[PATCH] D118153: [CUDA][HIP] Do not treat host var address as constant in device compilation

2022-01-28 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG8428c75da1ab: [CUDA][HIP] Do not treat host var address as constant in device compilation (authored by yaxunl). Herald added a project: clang. Repos

[PATCH] D117137: [Driver] Add CUDA support for --offload param

2022-01-28 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D117137#3274035, @dcastagna wrote: > @yaxunl Are you OK landing this change as it is, without the check for OS and > environment in getHIPOffloadTargetTriple? > We can follow up with a patch that adds checks for OS and enviro

[PATCH] D118153: [CUDA][HIP] Do not treat host var address as constant in device compilation

2022-01-28 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 404010. yaxunl marked 3 inline comments as done. yaxunl added a comment. Revised by Artem's comments. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D118153/new/ https://reviews.llvm.org/D118153 Files: clang/include/clang/AST/ASTContext.h clang/li

[PATCH] D118153: [CUDA][HIP] Do not treat host var address as constant in device compilation

2022-01-28 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 4 inline comments as done. yaxunl added inline comments. Comment at: clang/lib/AST/ExprConstant.cpp:2227 +!Var->hasAttr() && +!Var->hasAttr() && +!Var->getType()->isCUDADeviceBuiltinSurfaceType() && tra wrote: > D

[PATCH] D118153: [CUDA][HIP] Do not treat host var address as constant in device compilation

2022-01-27 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 403706. yaxunl added a comment. Fix the regression in lit tests. Basically in device compilation we still evaluate constant expression for host functions or host template instantiation. If we just disallow host variable in any constant expressions we will ge

[PATCH] D117137: [Driver] Add CUDA support for --offload param

2022-01-26 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D117137#3273330, @tra wrote: > In D117137#3269365, @yaxunl wrote: > >> Does that mean only "spirv{64}-unknown-unknown" is acceptable, or >> "spirv{64}-amd-unknown-unknown" is also acc

[PATCH] D118153: [CUDA][HIP] Do not treat host var address as constant in device compilation

2022-01-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D118153#3270122, @tra wrote: > LGTM. > > Do we need to do anything special about `__managed__` vars? Right, a `__managed__` var is special. Its address is set by the runtime, therefore it is not a constant. nvcc does not treat it as

[PATCH] D118153: [CUDA][HIP] Do not treat host var address as constant in device compilation

2022-01-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 402924. yaxunl added a comment. fix test CHANGES SINCE LAST ACTION https://reviews.llvm.org/D118153/new/ https://reviews.llvm.org/D118153 Files: clang/lib/AST/ExprConstant.cpp clang/test/CodeGenCUDA/const-var.cu clang/test/SemaCUDA/const-var.cu Ind

[PATCH] D118153: [CUDA][HIP] Do not treat host var address as constant in device compilation

2022-01-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. yaxunl requested review of this revision. Currently clang treats a host var address as constant in device compilation, which causes const vars initialized with a host var address to be incorrectly promoted to device variables and results in undefin
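The host-side pattern that the summary above describes can be sketched in plain C++. This is only an illustration with no CUDA/HIP attributes; `host_var`, `kPtr`, `kConstPtr`, and `read_through_constexpr_pointer` are hypothetical names that do not come from the patch. In ordinary host compilation the address of a namespace-scope variable is a constant expression, so both initializers below are accepted. The sketch does not model device compilation, where (per the summary) such an address should not be treated as constant.

```cpp
// Hypothetical host variable with static storage duration.
int host_var = 42;

// In host C++, the address of a static-storage variable is a constant
// expression, so it may initialize a constexpr pointer.
constexpr int *kPtr = &host_var;

// A const pointer initialized with the same address.
int *const kConstPtr = &host_var;

// The address comparison is itself a constant expression on the host.
static_assert(kPtr == &host_var, "address of host_var is constant on the host");

// Reads the current value of host_var through the constexpr pointer.
int read_through_constexpr_pointer() { return *kPtr; }
```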

[PATCH] D117137: [Driver] Add CUDA support for --offload param

2022-01-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D117137#3268548, @linjamaki wrote: > SPIR-V target requires that the OS and the environment type is unknown (see > TargetInfo::AllocateTarget and BaseSPIRTargetInfo). The clang would fail to > create a SPIR-V target if there

[PATCH] D117137: [Driver] Add CUDA support for --offline param

2022-01-20 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. The title says `--offline` option, which should be `--offload`. Comment at: clang/include/clang/Driver/Options.td:1142 def offload_EQ : CommaJoined<["--"], "offload=">, Flags<[NoXarchOption]>, - HelpText<"Specify comma-separated list of offloading targ

[PATCH] D116216: Prevent adding module flag - amdgpu_hostcall multiple times.

2022-01-19 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was not accepted when it landed; it landed in state "Needs Review". This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG85c2bd2a0e0e: Prevent adding module flag amdgpu_hostcall multiple

[PATCH] D116216: Prevent adding module flag - amdgpu_hostcall multiple times.

2022-01-18 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. LGTM. Thanks. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D116216/new/ https://reviews.llvm.org/D116216

[PATCH] D115523: [OpenCL] Set external linkage for block enqueue kernels

2022-01-13 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D115523#3237857, @Anastasia wrote: > In D115523#3237410, @yaxunl wrote: > >> It is possible that block kernels are defined and invoked in static >> functions, therefore two block kern

[PATCH] D115523: [OpenCL] Set external linkage for block enqueue kernels

2022-01-12 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. It is possible that block kernels are defined and invoked in static functions, therefore two block kernels in different TUs may have the same name. Making such kernels external may cause duplicate symbols. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTIO

[PATCH] D116967: [HIP] Fix device malloc/free

2022-01-11 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG694fd10659eb: [HIP] Fix device malloc/free (authored by yaxunl). Herald added a project: clang. Changed prior to commit: https://reviews.llvm.org/

[PATCH] D116967: [HIP] Fix device malloc/free

2022-01-11 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/Headers/__clang_hip_runtime_wrapper.h:80 +#if HIP_VERSION_MAJOR > 4 || (HIP_VERSION_MAJOR == 4 && HIP_VERSION_MINOR >= 5) +extern "C" __device__ unsigned long long __ockl_dm_alloc(unsigned long long __size);
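The version gate quoted in the inline comment above can be checked in isolation. The following is a minimal sketch, assuming only the preprocessor condition shown in the snippet; `hip_at_least_4_5` is a hypothetical helper that mirrors that condition as a `constexpr` function, and is not part of the header.

```cpp
// Hypothetical helper mirroring the preprocessor condition
//   HIP_VERSION_MAJOR > 4 || (HIP_VERSION_MAJOR == 4 && HIP_VERSION_MINOR >= 5)
// as a function of the two version components.
constexpr bool hip_at_least_4_5(int major, int minor) {
  return major > 4 || (major == 4 && minor >= 5);
}

// Boundary cases: 4.5 and anything newer pass, anything older fails.
static_assert(hip_at_least_4_5(4, 5), "4.5 passes the gate");
static_assert(hip_at_least_4_5(5, 0), "5.0 passes the gate");
static_assert(!hip_at_least_4_5(4, 4), "4.4 fails the gate");
static_assert(!hip_at_least_4_5(3, 9), "3.9 fails the gate");
```

Writing the condition as major-then-minor keeps a newer major version with a small minor (for example 5.0) on the passing side, which a check on the minor version alone would get wrong.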

[PATCH] D116840: [HIP] Fix device only linking for -fgpu-rdc

2022-01-10 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG98ab43a1d209: [HIP] Fix device only linking for -fgpu-rdc (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Github Monorepo

[PATCH] D116967: [HIP] Fix device malloc/free

2022-01-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: tra, b-sumner. yaxunl requested review of this revision. The ROCm 4.5 device library introduced `__ockl_dm_alloc` and `__ockl_dm_dealloc` to support device-side malloc/free. This patch redefines device malloc/free to use these functions. It a

[PATCH] D116840: [HIP] Fix device only linking for -fgpu-rdc

2022-01-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 398662. yaxunl added a comment. avoid clearing AL CHANGES SINCE LAST ACTION https://reviews.llvm.org/D116840/new/ https://reviews.llvm.org/D116840 Files: clang/lib/Driver/Driver.cpp clang/test/Driver/hip-phases.hip clang/test/Driver/hip-toolchain-rd

[PATCH] D116840: [HIP] Fix device only linking for -fgpu-rdc

2022-01-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/Driver/Driver.cpp:3173 + AssociatedOffloadKind); +AL.clear(); +// Offload the host object to the host linker. tra wrote: > Doing `clear()` in a function intended to append looks

[PATCH] D116216: Prevent adding module flag - amdgpu_hostcall multiple times.

2022-01-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. I think it will be cleaner to keep the original amdgpu-asan.cu unchanged and add amdgpu-asan-printf.cu, which tests asan with printf. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D116216/new/ https://reviews.llvm.org/D116216 _

[PATCH] D116840: [HIP] Fix device only linking for -fgpu-rdc

2022-01-07 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. yaxunl requested review of this revision. Currently, when -fgpu-rdc is specified, the HIP toolchain always does host linking even if --cuda-device-only is specified. This patch fixes that. Only device linking is performed when --cuda-device-

[PATCH] D116216: Prevent adding module flag - amdgpu_hostcall multiple times.

2022-01-04 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D116216#3209335 , @pvellien wrote: > @yaxunl It would be very much helpful to know how to write test coverage for > this particular patch? thanks There is a lit test, amdgpu-asan.cu. You can add a call to print to that test an

[PATCH] D116216: Prevent adding module flag - amdgpu_hostcall multiple times.

2021-12-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D116216/new/ https://reviews.llvm.org/D116216 __

[PATCH] D110622: [HIPSPV][3/4] Enable SPIR-V emission for HIP

2021-12-20 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGa6786cdd5757: [HIPSPV][3/4] Enable SPIR-V emission for HIP (authored by yaxunl). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION

[PATCH] D110622: [HIPSPV][3/4] Enable SPIR-V emission for HIP

2021-12-17 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D110622#3199233 , @linjamaki wrote: > Assuming that this patch is ready to land. @tra or @yaxunl, could you please > commit this patch to the LLVM for us? Thanks. I can help commit this patch. Repository: rG LLVM Github Mo

[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-14 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D115661#3193413 , @yaxunl wrote: > In D115661#3193157 , @arsenm wrote: > >> In D115661#3193152 , @yaxunl wrote: >> >>> In D115661#3192983

[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-14 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D115661#3193157 , @arsenm wrote: > In D115661#3193152 , @yaxunl wrote: > >> In D115661#3192983 , @estewart08 >> wrote: >> >>> In D115661#319047

[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-14 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D115661#3192983 , @estewart08 wrote: > In D115661#3190477 , @yaxunl wrote: > >> This may cause perf regressions for HIP. > > Do you have a test that would show such a regression? Emitti

[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-13 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl requested changes to this revision. yaxunl added a comment. This revision now requires changes to proceed. This may cause perf regressions for HIP. Comment at: clang/test/CodeGenCXX/cxx11-extern-constexpr.cpp:10 // X86: @_ZN1A3FooE ={{.*}} constant i32 123, align 4 -//

[PATCH] D110549: [HIPSPV][1/4] Refactor HIP tool chain

2021-12-13 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG78b0f3701d44: [HIPSPV][1/4] Refactor HIP tool chain (authored by yaxunl). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https:/

[PATCH] D110549: [HIPSPV][1/4] Refactor HIP tool chain

2021-12-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D110549#3185409 , @linjamaki wrote: > Assuming this patch is ready to land. @yaxunl, Could you please commit this > patch to the LLVM for us. Thanks. I will test it with our internal CI, then commit it. Repository: rG LLVM

[PATCH] D115283: [AMDGPU] Set "amdgpu_hostcall" module flag if an AMDGPU function has calls to device lib functions that use hostcalls.

2021-12-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D115283#3183034 , @JonChesterfield wrote: > In D115283#3182879 , @yaxunl wrote: > >> In D115283#3181128 , >> @JonChesterfield wrote: >> >>> No

[PATCH] D115283: [AMDGPU] Set "amdgpu_hostcall" module flag if an AMDGPU function has calls to device lib functions that use hostcalls.

2021-12-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D115283#3181109 , @kpyzhov wrote: > In D115283#3180836 , @yaxunl wrote: > >> If we only need to check whether `__ockl_hostcall_internal` exists in the >> final module in LLVM codegen to

[PATCH] D115283: [AMDGPU] Set "amdgpu_hostcall" module flag if an AMDGPU function has calls to device lib functions that use hostcalls.

2021-12-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D115283#3181128 , @JonChesterfield wrote: > Not exactly that. The weak symbol isn't the function name, as that gets > renamed or inlined. We discussed this before. As code object ABI use runtime metadata to represent hostcal
