[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-05-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 19 inline comments as done.
yaxunl added a comment.

In https://reviews.llvm.org/D45212#1105177, @tra wrote:

> Hi,
>
> Sorry about the long silence. I'm back to continue the reviews. I'll handle 
> what I can today and will continue with the rest on Tuesday.
>
> It looks like patch description needs to be updated:
>
> > Use clang-offload-bindler to create binary for device ISA.
>
> I don't see anything related to offload-bundler in this patch any more.


You are right. Using clang-offload-bundler to create binary for device ISA has 
been moved to another patch. Will update the description of this patch.

In https://reviews.llvm.org/D45212#1105282, @tra wrote:

> One more thing -- it would be really good to add some tests to make sure your 
> commands are constructed the way you want.


will do




Comment at: include/clang/Driver/Options.td:582
 def fno_cuda_rdc : Flag<["-"], "fno-cuda-rdc">;
+def hip_device_lib_path_EQ : Joined<["--"], "hip-device-lib-path=">, 
Group,
+  HelpText<"HIP device library path">;

tra wrote:
> I'm not sure about `i_Group`? This will cause this option to be passed to all 
> preprocessor jobs. It will also be passed to host and device side 
> compilations, while you probably only want/need it on device side only.
will change to Link_Group



Comment at: lib/Driver/ToolChains/Cuda.cpp:323
+C.getDriver().Diag(diag::err_drv_no_such_file) << BCName;
+  CmdArgs.push_back(Args.MakeArgString(FullName));
+  return FoundLibDevice;

tra wrote:
> FullName is already result of Args.MakeArgString. You only need to do it once.
will fix



Comment at: lib/Driver/ToolChains/Cuda.cpp:329
+// object file. It calls llvm-link, opt, llc, then lld steps.
+void AMDGCN::Linker::ConstructJob(Compilation , const JobAction ,
+  const InputInfo ,

tra wrote:
> This function is too large to easily see that we're actually constructing 
> sequence of commands.
> I'd probably split construction of individual tool's command line into its 
> own function.
will do



Comment at: lib/Driver/ToolChains/Cuda.cpp:336
+  assert(StringRef(JA.getOffloadingArch()).startswith("gfx") &&
+ " unless gfx processor, backend should be clang");
+  const auto  =

tra wrote:
> No need for the leading space in the message.
will fix.



Comment at: lib/Driver/ToolChains/Cuda.cpp:344-345
+  // Add the input bc's created by compile step.
+  for (InputInfoList::const_iterator it = Inputs.begin(), ie = Inputs.end();
+   it != ie; ++it) {
+const InputInfo  = *it;

tra wrote:
> `for (const InputInfo  : Inputs)` ?
will fix



Comment at: lib/Driver/ToolChains/Cuda.cpp:350
+
+  std::string GFXNAME = JA.getOffloadingArch();
+

tra wrote:
> All-caps name looks like a macro. Rename to `GfxName` ?
will fix



Comment at: lib/Driver/ToolChains/Cuda.cpp:354-359
+  // Find in --hip-device-lib-path and HIP_LIBRARY_PATH.
+  for (auto Arg : Args) {
+if (Arg->getSpelling() == "--hip-device-lib-path=") {
+  LibraryPaths.push_back(Args.MakeArgString(Arg->getValue()));
+}
+  }

tra wrote:
> ```
> for (path : Args.getAllArgValues(...)) {
>LibraryPaths.push_back(Args.MakeArgString(path));
> }
> 
> ```
will fix



Comment at: lib/Driver/ToolChains/Cuda.cpp:375-378
+  addBCLib(C, Args, CmdArgs, LibraryPaths,
+   (Twine("oclc_isa_version_") + StringRef(GFXNAME).drop_front(3) +
+".amdgcn.bc")
+   .str());

tra wrote:
> This is somewhat unreadable. Perhaps you could construct the name in a temp 
> variable.
will do



Comment at: lib/Driver/ToolChains/Cuda.cpp:384
+  const char *ResultingBitcodeF =
+  C.addTempFile(C.getArgs().MakeArgString(TmpName.c_str()));
+  CmdArgs.push_back(ResultingBitcodeF);

tra wrote:
> You don't need to use c_str() for MakeArgString. It will happily accept 
> std::string.
will fix



Comment at: lib/Driver/ToolChains/Cuda.cpp:394
+  // The input to opt is the output from llvm-link.
+  OptArgs.push_back(ResultingBitcodeF);
+  // Pass optimization arg to opt.

tra wrote:
> `BitcodeOutputFile`?
will change



Comment at: lib/Driver/ToolChains/Cuda.cpp:417
+  const char *mcpustr = Args.MakeArgString("-mcpu=" + GFXNAME);
+  OptArgs.push_back(mcpustr);
+  OptArgs.push_back("-o");

tra wrote:
> I think you can get rid of the temp var here without hurting readability.
will do



Comment at: lib/Driver/ToolChains/Cuda.cpp:420
+  std::string OptOutputFileName =
+  C.getDriver().GetTemporaryPath("OPT_OUTPUT", "bc");
+  const char *OptOutputFile =

tra wrote:
> I wonder if we could derive temp file name from the input's name. This may 
> make it 

[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-05-18 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment.

One more thing -- it would be really good to add some tests to make sure your 
commands are constructed the way you want.


https://reviews.llvm.org/D45212



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-05-18 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment.

Hi,

Sorry about the long silence. I'm back to continue the reviews. I'll handle 
what I can today and will continue with the rest on Tuesday.

It looks like patch description needs to be updated:

> Use clang-offload-bindler to create binary for device ISA.

I don't see anything related to offload-bundler in this patch any more.




Comment at: include/clang/Driver/Options.td:582
 def fno_cuda_rdc : Flag<["-"], "fno-cuda-rdc">;
+def hip_device_lib_path_EQ : Joined<["--"], "hip-device-lib-path=">, 
Group,
+  HelpText<"HIP device library path">;

I'm not sure about `i_Group`? This will cause this option to be passed to all 
preprocessor jobs. It will also be passed to host and device side compilations, 
while you probably only want/need it on device side only.



Comment at: lib/Driver/ToolChains/Cuda.cpp:323
+C.getDriver().Diag(diag::err_drv_no_such_file) << BCName;
+  CmdArgs.push_back(Args.MakeArgString(FullName));
+  return FoundLibDevice;

FullName is already result of Args.MakeArgString. You only need to do it once.



Comment at: lib/Driver/ToolChains/Cuda.cpp:329
+// object file. It calls llvm-link, opt, llc, then lld steps.
+void AMDGCN::Linker::ConstructJob(Compilation , const JobAction ,
+  const InputInfo ,

This function is too large to easily see that we're actually constructing 
sequence of commands.
I'd probably split construction of individual tool's command line into its own 
function.



Comment at: lib/Driver/ToolChains/Cuda.cpp:336
+  assert(StringRef(JA.getOffloadingArch()).startswith("gfx") &&
+ " unless gfx processor, backend should be clang");
+  const auto  =

No need for the leading space in the message.



Comment at: lib/Driver/ToolChains/Cuda.cpp:344-345
+  // Add the input bc's created by compile step.
+  for (InputInfoList::const_iterator it = Inputs.begin(), ie = Inputs.end();
+   it != ie; ++it) {
+const InputInfo  = *it;

`for (const InputInfo  : Inputs)` ?



Comment at: lib/Driver/ToolChains/Cuda.cpp:350
+
+  std::string GFXNAME = JA.getOffloadingArch();
+

All-caps name looks like a macro. Rename to `GfxName` ?



Comment at: lib/Driver/ToolChains/Cuda.cpp:354-359
+  // Find in --hip-device-lib-path and HIP_LIBRARY_PATH.
+  for (auto Arg : Args) {
+if (Arg->getSpelling() == "--hip-device-lib-path=") {
+  LibraryPaths.push_back(Args.MakeArgString(Arg->getValue()));
+}
+  }

```
for (path : Args.getAllArgValues(...)) {
   LibraryPaths.push_back(Args.MakeArgString(path));
}

```



Comment at: lib/Driver/ToolChains/Cuda.cpp:375-378
+  addBCLib(C, Args, CmdArgs, LibraryPaths,
+   (Twine("oclc_isa_version_") + StringRef(GFXNAME).drop_front(3) +
+".amdgcn.bc")
+   .str());

This is somewhat unreadable. Perhaps you could construct the name in a temp 
variable.



Comment at: lib/Driver/ToolChains/Cuda.cpp:384
+  const char *ResultingBitcodeF =
+  C.addTempFile(C.getArgs().MakeArgString(TmpName.c_str()));
+  CmdArgs.push_back(ResultingBitcodeF);

You don't need to use c_str() for MakeArgString. It will happily accept 
std::string.



Comment at: lib/Driver/ToolChains/Cuda.cpp:394
+  // The input to opt is the output from llvm-link.
+  OptArgs.push_back(ResultingBitcodeF);
+  // Pass optimization arg to opt.

`BitcodeOutputFile`?



Comment at: lib/Driver/ToolChains/Cuda.cpp:417
+  const char *mcpustr = Args.MakeArgString("-mcpu=" + GFXNAME);
+  OptArgs.push_back(mcpustr);
+  OptArgs.push_back("-o");

I think you can get rid of the temp var here without hurting readability.



Comment at: lib/Driver/ToolChains/Cuda.cpp:420
+  std::string OptOutputFileName =
+  C.getDriver().GetTemporaryPath("OPT_OUTPUT", "bc");
+  const char *OptOutputFile =

I wonder if we could derive temp file name from the input's name. This may make 
it easier to find relevant temp files if/when we need to debug the compilation 
process.



Comment at: lib/Driver/ToolChains/Cuda.cpp:422
+  const char *OptOutputFile =
+  C.addTempFile(C.getArgs().MakeArgString(OptOutputFileName.c_str()));
+  OptArgs.push_back(OptOutputFile);

No need for c_str() here.



Comment at: lib/Driver/ToolChains/Cuda.cpp:439
+  const char *LlcOutputFile =
+  C.addTempFile(C.getArgs().MakeArgString(LlcOutputFileName.c_str()));
+  LlcArgs.push_back(LlcOutputFile);

c_str(), again.



Comment at: lib/Driver/ToolChains/Cuda.cpp:764
+  if (DriverArgs.hasArg(options::OPT_nocudalib) ||
+  DeviceOffloadingKind == Action::OFK_HIP)
 return;

[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-05-18 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

Hi Artem,

I've addressed your comments. Any further changes are needed? Thanks.


https://reviews.llvm.org/D45212



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-05-17 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

ping


https://reviews.llvm.org/D45212



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-05-04 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 145297.
yaxunl added a comment.

clean up code and separate action builder to another review.


https://reviews.llvm.org/D45212

Files:
  include/clang/Driver/Options.td
  lib/Driver/ToolChains/Cuda.cpp
  lib/Driver/ToolChains/Cuda.h

Index: lib/Driver/ToolChains/Cuda.h
===
--- lib/Driver/ToolChains/Cuda.h
+++ lib/Driver/ToolChains/Cuda.h
@@ -127,6 +127,25 @@
 };
 
 } // end namespace NVPTX
+
+namespace AMDGCN {
+// Runs llvm-link/opt/llc/lld, which links multiple LLVM bitcode, together with
+// device library, then compiles it to ISA in a shared object.
+class LLVM_LIBRARY_VISIBILITY Linker : public Tool {
+public:
+  Linker(const ToolChain )
+  : Tool("AMDGCN::Linker", "amdgcn-link", TC, RF_Full,
+ llvm::sys::WEM_UTF8, "--options-file") {}
+
+  bool hasIntegratedCPP() const override { return false; }
+
+  void ConstructJob(Compilation , const JobAction ,
+const InputInfo , const InputInfoList ,
+const llvm::opt::ArgList ,
+const char *LinkingOutput) const override;
+};
+
+} // end namespace AMDGCN
 } // end namespace tools
 
 namespace toolchains {
@@ -150,9 +169,7 @@
  llvm::opt::ArgStringList ,
  Action::OffloadKind DeviceOffloadKind) const override;
 
-  // Never try to use the integrated assembler with CUDA; always fork out to
-  // ptxas.
-  bool useIntegratedAs() const override { return false; }
+  bool useIntegratedAs() const override;
   bool isCrossCompiling() const override { return true; }
   bool isPICDefault() const override { return false; }
   bool isPIEDefault() const override { return false; }
Index: lib/Driver/ToolChains/Cuda.cpp
===
--- lib/Driver/ToolChains/Cuda.cpp
+++ lib/Driver/ToolChains/Cuda.cpp
@@ -304,6 +304,157 @@
   return NoDebug;
 }
 
+static bool addBCLib(Compilation , const ArgList ,
+ ArgStringList , ArgStringList LibraryPaths,
+ StringRef BCName) {
+  std::string FullName;
+  bool FoundLibDevice = false;
+  for (std::string LibraryPath : LibraryPaths) {
+SmallString<128> Path(LibraryPath);
+llvm::sys::path::append(Path, BCName);
+FullName = Args.MakeArgString(Path);
+if (llvm::sys::fs::exists(FullName.c_str())) {
+  FoundLibDevice = true;
+  break;
+}
+  }
+  if (!FoundLibDevice)
+C.getDriver().Diag(diag::err_drv_no_such_file) << BCName;
+  CmdArgs.push_back(Args.MakeArgString(FullName));
+  return FoundLibDevice;
+}
+
+// For amdgcn the inputs of the linker job are device bitcode and output is
+// object file. It calls llvm-link, opt, llc, then lld steps.
+void AMDGCN::Linker::ConstructJob(Compilation , const JobAction ,
+  const InputInfo ,
+  const InputInfoList ,
+  const ArgList ,
+  const char *LinkingOutput) const {
+
+  assert(StringRef(JA.getOffloadingArch()).startswith("gfx") &&
+ " unless gfx processor, backend should be clang");
+  const auto  =
+  static_cast(getToolChain());
+  assert(TC.getTriple().getArch() == llvm::Triple::amdgcn && "Wrong platform");
+
+  // Construct llvm-link command.
+  ArgStringList CmdArgs;
+  // Add the input bc's created by compile step.
+  for (InputInfoList::const_iterator it = Inputs.begin(), ie = Inputs.end();
+   it != ie; ++it) {
+const InputInfo  = *it;
+CmdArgs.push_back(II.getFilename());
+  }
+
+  std::string GFXNAME = JA.getOffloadingArch();
+
+  ArgStringList LibraryPaths;
+
+  // Find in --hip-device-lib-path and HIP_LIBRARY_PATH.
+  for (auto Arg : Args) {
+if (Arg->getSpelling() == "--hip-device-lib-path=") {
+  LibraryPaths.push_back(Args.MakeArgString(Arg->getValue()));
+}
+  }
+
+  addDirectoryList(Args, LibraryPaths, "-L", "HIP_DEVICE_LIB_PATH");
+
+  addBCLib(C, Args, CmdArgs, LibraryPaths, "libhiprt.bc");
+  addBCLib(C, Args, CmdArgs, LibraryPaths, "opencl.amdgcn.bc");
+  addBCLib(C, Args, CmdArgs, LibraryPaths, "ockl.amdgcn.bc");
+  addBCLib(C, Args, CmdArgs, LibraryPaths, "irif.amdgcn.bc");
+  addBCLib(C, Args, CmdArgs, LibraryPaths, "ocml.amdgcn.bc");
+  addBCLib(C, Args, CmdArgs, LibraryPaths, "oclc_finite_only_off.amdgcn.bc");
+  addBCLib(C, Args, CmdArgs, LibraryPaths, "oclc_daz_opt_off.amdgcn.bc");
+  addBCLib(C, Args, CmdArgs, LibraryPaths,
+   "oclc_correctly_rounded_sqrt_on.amdgcn.bc");
+  addBCLib(C, Args, CmdArgs, LibraryPaths, "oclc_unsafe_math_off.amdgcn.bc");
+  addBCLib(C, Args, CmdArgs, LibraryPaths, "hc.amdgcn.bc");
+  // Drop gfx in GFXNAME.
+  addBCLib(C, Args, CmdArgs, LibraryPaths,
+   (Twine("oclc_isa_version_") + StringRef(GFXNAME).drop_front(3) +
+".amdgcn.bc")
+   .str());
+
+  // Add an intermediate output file which 

[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-04-26 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 14 inline comments as done.
yaxunl added a comment.

ping


https://reviews.llvm.org/D45212



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-04-20 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 143320.
yaxunl marked an inline comment as done.
yaxunl added a comment.

Remove manually added passes from opt and add -mtriple.


https://reviews.llvm.org/D45212

Files:
  include/clang/Basic/DiagnosticDriverKinds.td
  include/clang/Driver/Action.h
  include/clang/Driver/Options.td
  include/clang/Driver/ToolChain.h
  lib/Driver/Action.cpp
  lib/Driver/Compilation.cpp
  lib/Driver/Driver.cpp
  lib/Driver/ToolChain.cpp
  lib/Driver/ToolChains/Clang.cpp
  lib/Driver/ToolChains/Cuda.cpp
  lib/Driver/ToolChains/Cuda.h
  test/Driver/cuda-bad-arch.cu
  test/Driver/cuda-phases.cu

Index: test/Driver/cuda-phases.cu
===
--- test/Driver/cuda-phases.cu
+++ test/Driver/cuda-phases.cu
@@ -7,195 +7,233 @@
 // REQUIRES: clang-driver
 // REQUIRES: powerpc-registered-target
 // REQUIRES: nvptx-registered-target
-
+// REQUIRES: amdgpu-registered-target
 //
 // Test single gpu architecture with complete compilation.
 //
 // RUN: %clang -target powerpc64le-ibm-linux-gnu -ccc-print-phases --cuda-gpu-arch=sm_30 %s 2>&1 \
-// RUN: | FileCheck -check-prefix=BIN %s
-// BIN-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (host-cuda)
-// BIN-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, cuda-cpp-output, (host-cuda)
-// BIN-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (host-cuda)
-// BIN-DAG: [[P3:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (device-cuda, sm_30)
-// BIN-DAG: [[P4:[0-9]+]]: preprocessor, {[[P3]]}, cuda-cpp-output, (device-cuda, sm_30)
-// BIN-DAG: [[P5:[0-9]+]]: compiler, {[[P4]]}, ir, (device-cuda, sm_30)
-// BIN-DAG: [[P6:[0-9]+]]: backend, {[[P5]]}, assembler, (device-cuda, sm_30)
-// BIN-DAG: [[P7:[0-9]+]]: assembler, {[[P6]]}, object, (device-cuda, sm_30)
-// BIN-DAG: [[P8:[0-9]+]]: offload, "device-cuda (nvptx64-nvidia-cuda:sm_30)" {[[P7]]}, object
-// BIN-DAG: [[P9:[0-9]+]]: offload, "device-cuda (nvptx64-nvidia-cuda:sm_30)" {[[P6]]}, assembler
-// BIN-DAG: [[P10:[0-9]+]]: linker, {[[P8]], [[P9]]}, cuda-fatbin, (device-cuda)
-// BIN-DAG: [[P11:[0-9]+]]: offload, "host-cuda (powerpc64le-ibm-linux-gnu)" {[[P2]]}, "device-cuda (nvptx64-nvidia-cuda)" {[[P10]]}, ir
-// BIN-DAG: [[P12:[0-9]+]]: backend, {[[P11]]}, assembler, (host-cuda)
-// BIN-DAG: [[P13:[0-9]+]]: assembler, {[[P12]]}, object, (host-cuda)
-// BIN-DAG: [[P14:[0-9]+]]: linker, {[[P13]]}, image, (host-cuda)
+// RUN: | FileCheck -check-prefixes=BIN,BIN_NV %s
+// RUN: %clang -x hip -target powerpc64le-ibm-linux-gnu -ccc-print-phases --cuda-gpu-arch=gfx803 %s 2>&1 \
+// RUN: | FileCheck -check-prefixes=BIN,BIN_AMD %s
+// BIN_NV-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T:cuda]], (host-[[T]])
+// BIN_AMD-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T:hip]], (host-[[T]])
+// BIN-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, (host-[[T]])
+// BIN-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (host-[[T]])
+// BIN_NV-DAG: [[P3:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T]], (device-[[T]], [[ARCH:sm_30]])
+// BIN_AMD-DAG: [[P3:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T]], (device-[[T]], [[ARCH:gfx803]])
+// BIN-DAG: [[P4:[0-9]+]]: preprocessor, {[[P3]]}, [[T]]-cpp-output, (device-[[T]], [[ARCH]])
+// BIN-DAG: [[P5:[0-9]+]]: compiler, {[[P4]]}, ir, (device-[[T]], [[ARCH]])
+// BIN-DAG: [[P6:[0-9]+]]: backend, {[[P5]]}, assembler, (device-[[T]], [[ARCH]])
+// BIN-DAG: [[P7:[0-9]+]]: assembler, {[[P6]]}, object, (device-[[T]], [[ARCH]])
+// BIN_NV-DAG: [[P8:[0-9]+]]: offload, "device-[[T]] ([[TRIPLE:nvptx64-nvidia-cuda]]:[[ARCH]])" {[[P7]]}, object
+// BIN_AMD-DAG: [[P8:[0-9]+]]: offload, "device-[[T]] ([[TRIPLE:amdgcn-amd-amdhsa]]:[[ARCH]])" {[[P7]]}, object
+// BIN-DAG: [[P9:[0-9]+]]: offload, "device-[[T]] ([[TRIPLE]]:[[ARCH]])" {[[P6]]}, assembler
+// BIN-DAG: [[P10:[0-9]+]]: linker, {[[P8]], [[P9]]}, cuda-fatbin, (device-[[T]])
+// BIN-DAG: [[P11:[0-9]+]]: offload, "host-[[T]] (powerpc64le-ibm-linux-gnu)" {[[P2]]}, "device-[[T]] ([[TRIPLE]])" {[[P10]]}, ir
+// BIN-DAG: [[P12:[0-9]+]]: backend, {[[P11]]}, assembler, (host-[[T]])
+// BIN-DAG: [[P13:[0-9]+]]: assembler, {[[P12]]}, object, (host-[[T]])
+// BIN-DAG: [[P14:[0-9]+]]: linker, {[[P13]]}, image, (host-[[T]])
 
 //
 // Test single gpu architecture up to the assemble phase.
 //
 // RUN: %clang -target powerpc64le-ibm-linux-gnu -ccc-print-phases --cuda-gpu-arch=sm_30 %s -S 2>&1 \
-// RUN: | FileCheck -check-prefix=ASM %s
-// ASM-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (device-cuda, sm_30)
-// ASM-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, cuda-cpp-output, (device-cuda, sm_30)
-// ASM-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-cuda, sm_30)
-// ASM-DAG: [[P3:[0-9]+]]: backend, {[[P2]]}, assembler, (device-cuda, sm_30)
-// ASM-DAG: [[P4:[0-9]+]]: offload, "device-cuda (nvptx64-nvidia-cuda:sm_30)" {[[P3]]}, assembler
-// ASM-DAG: [[P5:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (host-cuda)
-// ASM-DAG: [[P6:[0-9]+]]: preprocessor, 

[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-04-20 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done.
yaxunl added inline comments.



Comment at: lib/Driver/ToolChains/Cuda.cpp:498-501
+OptArgs.push_back(mcpustr);
+OptArgs.push_back("-dce");
+OptArgs.push_back("-sroa");
+OptArgs.push_back("-globaldce");

yaxunl wrote:
> tra wrote:
> > This does not look like the right place to disable particular passes. 
> > Shouldn't it be done somewhere in LLVM?
> These are not disabling the passes but adding these passes. They are some 
> optimizations which are usually improving performance for amdgcn.
Since opt is now able to adjust passes based on -mtriple and -mcpu, will remove 
these manually added passes and add -mtriple and -mcpu instead.


https://reviews.llvm.org/D45212



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-04-20 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 143307.
yaxunl added a comment.

minor bug fix: do not add CUDA specific link options for HIP.


https://reviews.llvm.org/D45212

Files:
  include/clang/Basic/DiagnosticDriverKinds.td
  include/clang/Driver/Action.h
  include/clang/Driver/Options.td
  include/clang/Driver/ToolChain.h
  lib/Driver/Action.cpp
  lib/Driver/Compilation.cpp
  lib/Driver/Driver.cpp
  lib/Driver/ToolChain.cpp
  lib/Driver/ToolChains/Clang.cpp
  lib/Driver/ToolChains/Cuda.cpp
  lib/Driver/ToolChains/Cuda.h
  test/Driver/cuda-bad-arch.cu
  test/Driver/cuda-phases.cu

Index: test/Driver/cuda-phases.cu
===
--- test/Driver/cuda-phases.cu
+++ test/Driver/cuda-phases.cu
@@ -7,195 +7,233 @@
 // REQUIRES: clang-driver
 // REQUIRES: powerpc-registered-target
 // REQUIRES: nvptx-registered-target
-
+// REQUIRES: amdgpu-registered-target
 //
 // Test single gpu architecture with complete compilation.
 //
 // RUN: %clang -target powerpc64le-ibm-linux-gnu -ccc-print-phases --cuda-gpu-arch=sm_30 %s 2>&1 \
-// RUN: | FileCheck -check-prefix=BIN %s
-// BIN-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (host-cuda)
-// BIN-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, cuda-cpp-output, (host-cuda)
-// BIN-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (host-cuda)
-// BIN-DAG: [[P3:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (device-cuda, sm_30)
-// BIN-DAG: [[P4:[0-9]+]]: preprocessor, {[[P3]]}, cuda-cpp-output, (device-cuda, sm_30)
-// BIN-DAG: [[P5:[0-9]+]]: compiler, {[[P4]]}, ir, (device-cuda, sm_30)
-// BIN-DAG: [[P6:[0-9]+]]: backend, {[[P5]]}, assembler, (device-cuda, sm_30)
-// BIN-DAG: [[P7:[0-9]+]]: assembler, {[[P6]]}, object, (device-cuda, sm_30)
-// BIN-DAG: [[P8:[0-9]+]]: offload, "device-cuda (nvptx64-nvidia-cuda:sm_30)" {[[P7]]}, object
-// BIN-DAG: [[P9:[0-9]+]]: offload, "device-cuda (nvptx64-nvidia-cuda:sm_30)" {[[P6]]}, assembler
-// BIN-DAG: [[P10:[0-9]+]]: linker, {[[P8]], [[P9]]}, cuda-fatbin, (device-cuda)
-// BIN-DAG: [[P11:[0-9]+]]: offload, "host-cuda (powerpc64le-ibm-linux-gnu)" {[[P2]]}, "device-cuda (nvptx64-nvidia-cuda)" {[[P10]]}, ir
-// BIN-DAG: [[P12:[0-9]+]]: backend, {[[P11]]}, assembler, (host-cuda)
-// BIN-DAG: [[P13:[0-9]+]]: assembler, {[[P12]]}, object, (host-cuda)
-// BIN-DAG: [[P14:[0-9]+]]: linker, {[[P13]]}, image, (host-cuda)
+// RUN: | FileCheck -check-prefixes=BIN,BIN_NV %s
+// RUN: %clang -x hip -target powerpc64le-ibm-linux-gnu -ccc-print-phases --cuda-gpu-arch=gfx803 %s 2>&1 \
+// RUN: | FileCheck -check-prefixes=BIN,BIN_AMD %s
+// BIN_NV-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T:cuda]], (host-[[T]])
+// BIN_AMD-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T:hip]], (host-[[T]])
+// BIN-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, (host-[[T]])
+// BIN-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (host-[[T]])
+// BIN_NV-DAG: [[P3:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T]], (device-[[T]], [[ARCH:sm_30]])
+// BIN_AMD-DAG: [[P3:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T]], (device-[[T]], [[ARCH:gfx803]])
+// BIN-DAG: [[P4:[0-9]+]]: preprocessor, {[[P3]]}, [[T]]-cpp-output, (device-[[T]], [[ARCH]])
+// BIN-DAG: [[P5:[0-9]+]]: compiler, {[[P4]]}, ir, (device-[[T]], [[ARCH]])
+// BIN-DAG: [[P6:[0-9]+]]: backend, {[[P5]]}, assembler, (device-[[T]], [[ARCH]])
+// BIN-DAG: [[P7:[0-9]+]]: assembler, {[[P6]]}, object, (device-[[T]], [[ARCH]])
+// BIN_NV-DAG: [[P8:[0-9]+]]: offload, "device-[[T]] ([[TRIPLE:nvptx64-nvidia-cuda]]:[[ARCH]])" {[[P7]]}, object
+// BIN_AMD-DAG: [[P8:[0-9]+]]: offload, "device-[[T]] ([[TRIPLE:amdgcn-amd-amdhsa]]:[[ARCH]])" {[[P7]]}, object
+// BIN-DAG: [[P9:[0-9]+]]: offload, "device-[[T]] ([[TRIPLE]]:[[ARCH]])" {[[P6]]}, assembler
+// BIN-DAG: [[P10:[0-9]+]]: linker, {[[P8]], [[P9]]}, cuda-fatbin, (device-[[T]])
+// BIN-DAG: [[P11:[0-9]+]]: offload, "host-[[T]] (powerpc64le-ibm-linux-gnu)" {[[P2]]}, "device-[[T]] ([[TRIPLE]])" {[[P10]]}, ir
+// BIN-DAG: [[P12:[0-9]+]]: backend, {[[P11]]}, assembler, (host-[[T]])
+// BIN-DAG: [[P13:[0-9]+]]: assembler, {[[P12]]}, object, (host-[[T]])
+// BIN-DAG: [[P14:[0-9]+]]: linker, {[[P13]]}, image, (host-[[T]])
 
 //
 // Test single gpu architecture up to the assemble phase.
 //
 // RUN: %clang -target powerpc64le-ibm-linux-gnu -ccc-print-phases --cuda-gpu-arch=sm_30 %s -S 2>&1 \
-// RUN: | FileCheck -check-prefix=ASM %s
-// ASM-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (device-cuda, sm_30)
-// ASM-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, cuda-cpp-output, (device-cuda, sm_30)
-// ASM-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-cuda, sm_30)
-// ASM-DAG: [[P3:[0-9]+]]: backend, {[[P2]]}, assembler, (device-cuda, sm_30)
-// ASM-DAG: [[P4:[0-9]+]]: offload, "device-cuda (nvptx64-nvidia-cuda:sm_30)" {[[P3]]}, assembler
-// ASM-DAG: [[P5:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (host-cuda)
-// ASM-DAG: [[P6:[0-9]+]]: preprocessor, {[[P5]]}, cuda-cpp-output, (host-cuda)

[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-04-20 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 143298.
yaxunl edited the summary of this revision.
yaxunl added a comment.

sync to ToT.


https://reviews.llvm.org/D45212

Files:
  include/clang/Basic/DiagnosticDriverKinds.td
  include/clang/Driver/Action.h
  include/clang/Driver/Options.td
  include/clang/Driver/ToolChain.h
  lib/Driver/Action.cpp
  lib/Driver/Compilation.cpp
  lib/Driver/Driver.cpp
  lib/Driver/ToolChain.cpp
  lib/Driver/ToolChains/Clang.cpp
  lib/Driver/ToolChains/Cuda.cpp
  lib/Driver/ToolChains/Cuda.h
  test/Driver/cuda-bad-arch.cu
  test/Driver/cuda-phases.cu

Index: test/Driver/cuda-phases.cu
===
--- test/Driver/cuda-phases.cu
+++ test/Driver/cuda-phases.cu
@@ -7,195 +7,233 @@
 // REQUIRES: clang-driver
 // REQUIRES: powerpc-registered-target
 // REQUIRES: nvptx-registered-target
-
+// REQUIRES: amdgpu-registered-target
 //
 // Test single gpu architecture with complete compilation.
 //
 // RUN: %clang -target powerpc64le-ibm-linux-gnu -ccc-print-phases --cuda-gpu-arch=sm_30 %s 2>&1 \
-// RUN: | FileCheck -check-prefix=BIN %s
-// BIN-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (host-cuda)
-// BIN-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, cuda-cpp-output, (host-cuda)
-// BIN-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (host-cuda)
-// BIN-DAG: [[P3:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (device-cuda, sm_30)
-// BIN-DAG: [[P4:[0-9]+]]: preprocessor, {[[P3]]}, cuda-cpp-output, (device-cuda, sm_30)
-// BIN-DAG: [[P5:[0-9]+]]: compiler, {[[P4]]}, ir, (device-cuda, sm_30)
-// BIN-DAG: [[P6:[0-9]+]]: backend, {[[P5]]}, assembler, (device-cuda, sm_30)
-// BIN-DAG: [[P7:[0-9]+]]: assembler, {[[P6]]}, object, (device-cuda, sm_30)
-// BIN-DAG: [[P8:[0-9]+]]: offload, "device-cuda (nvptx64-nvidia-cuda:sm_30)" {[[P7]]}, object
-// BIN-DAG: [[P9:[0-9]+]]: offload, "device-cuda (nvptx64-nvidia-cuda:sm_30)" {[[P6]]}, assembler
-// BIN-DAG: [[P10:[0-9]+]]: linker, {[[P8]], [[P9]]}, cuda-fatbin, (device-cuda)
-// BIN-DAG: [[P11:[0-9]+]]: offload, "host-cuda (powerpc64le-ibm-linux-gnu)" {[[P2]]}, "device-cuda (nvptx64-nvidia-cuda)" {[[P10]]}, ir
-// BIN-DAG: [[P12:[0-9]+]]: backend, {[[P11]]}, assembler, (host-cuda)
-// BIN-DAG: [[P13:[0-9]+]]: assembler, {[[P12]]}, object, (host-cuda)
-// BIN-DAG: [[P14:[0-9]+]]: linker, {[[P13]]}, image, (host-cuda)
+// RUN: | FileCheck -check-prefixes=BIN,BIN_NV %s
+// RUN: %clang -x hip -target powerpc64le-ibm-linux-gnu -ccc-print-phases --cuda-gpu-arch=gfx803 %s 2>&1 \
+// RUN: | FileCheck -check-prefixes=BIN,BIN_AMD %s
+// BIN_NV-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T:cuda]], (host-[[T]])
+// BIN_AMD-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T:hip]], (host-[[T]])
+// BIN-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, (host-[[T]])
+// BIN-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (host-[[T]])
+// BIN_NV-DAG: [[P3:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T]], (device-[[T]], [[ARCH:sm_30]])
+// BIN_AMD-DAG: [[P3:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T]], (device-[[T]], [[ARCH:gfx803]])
+// BIN-DAG: [[P4:[0-9]+]]: preprocessor, {[[P3]]}, [[T]]-cpp-output, (device-[[T]], [[ARCH]])
+// BIN-DAG: [[P5:[0-9]+]]: compiler, {[[P4]]}, ir, (device-[[T]], [[ARCH]])
+// BIN-DAG: [[P6:[0-9]+]]: backend, {[[P5]]}, assembler, (device-[[T]], [[ARCH]])
+// BIN-DAG: [[P7:[0-9]+]]: assembler, {[[P6]]}, object, (device-[[T]], [[ARCH]])
+// BIN_NV-DAG: [[P8:[0-9]+]]: offload, "device-[[T]] ([[TRIPLE:nvptx64-nvidia-cuda]]:[[ARCH]])" {[[P7]]}, object
+// BIN_AMD-DAG: [[P8:[0-9]+]]: offload, "device-[[T]] ([[TRIPLE:amdgcn-amd-amdhsa]]:[[ARCH]])" {[[P7]]}, object
+// BIN-DAG: [[P9:[0-9]+]]: offload, "device-[[T]] ([[TRIPLE]]:[[ARCH]])" {[[P6]]}, assembler
+// BIN-DAG: [[P10:[0-9]+]]: linker, {[[P8]], [[P9]]}, cuda-fatbin, (device-[[T]])
+// BIN-DAG: [[P11:[0-9]+]]: offload, "host-[[T]] (powerpc64le-ibm-linux-gnu)" {[[P2]]}, "device-[[T]] ([[TRIPLE]])" {[[P10]]}, ir
+// BIN-DAG: [[P12:[0-9]+]]: backend, {[[P11]]}, assembler, (host-[[T]])
+// BIN-DAG: [[P13:[0-9]+]]: assembler, {[[P12]]}, object, (host-[[T]])
+// BIN-DAG: [[P14:[0-9]+]]: linker, {[[P13]]}, image, (host-[[T]])
 
 //
 // Test single gpu architecture up to the assemble phase.
 //
 // RUN: %clang -target powerpc64le-ibm-linux-gnu -ccc-print-phases --cuda-gpu-arch=sm_30 %s -S 2>&1 \
-// RUN: | FileCheck -check-prefix=ASM %s
-// ASM-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (device-cuda, sm_30)
-// ASM-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, cuda-cpp-output, (device-cuda, sm_30)
-// ASM-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-cuda, sm_30)
-// ASM-DAG: [[P3:[0-9]+]]: backend, {[[P2]]}, assembler, (device-cuda, sm_30)
-// ASM-DAG: [[P4:[0-9]+]]: offload, "device-cuda (nvptx64-nvidia-cuda:sm_30)" {[[P3]]}, assembler
-// ASM-DAG: [[P5:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (host-cuda)
-// ASM-DAG: [[P6:[0-9]+]]: preprocessor, {[[P5]]}, cuda-cpp-output, (host-cuda)
-// 

[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-04-19 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 143218.
yaxunl marked 14 inline comments as done.
yaxunl added a comment.

Revised by Artem's comments.


https://reviews.llvm.org/D45212

Files:
  include/clang/Basic/DiagnosticDriverKinds.td
  include/clang/Driver/Action.h
  include/clang/Driver/Options.td
  include/clang/Driver/ToolChain.h
  lib/Driver/Action.cpp
  lib/Driver/Compilation.cpp
  lib/Driver/Driver.cpp
  lib/Driver/ToolChain.cpp
  lib/Driver/ToolChains/Clang.cpp
  lib/Driver/ToolChains/Cuda.cpp
  lib/Driver/ToolChains/Cuda.h
  test/Driver/cuda-bad-arch.cu
  test/Driver/cuda-phases.cu

Index: test/Driver/cuda-phases.cu
===
--- test/Driver/cuda-phases.cu
+++ test/Driver/cuda-phases.cu
@@ -7,195 +7,233 @@
 // REQUIRES: clang-driver
 // REQUIRES: powerpc-registered-target
 // REQUIRES: nvptx-registered-target
-
+// REQUIRES: amdgpu-registered-target
 //
 // Test single gpu architecture with complete compilation.
 //
 // RUN: %clang -target powerpc64le-ibm-linux-gnu -ccc-print-phases --cuda-gpu-arch=sm_30 %s 2>&1 \
-// RUN: | FileCheck -check-prefix=BIN %s
-// BIN-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (host-cuda)
-// BIN-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, cuda-cpp-output, (host-cuda)
-// BIN-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (host-cuda)
-// BIN-DAG: [[P3:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (device-cuda, sm_30)
-// BIN-DAG: [[P4:[0-9]+]]: preprocessor, {[[P3]]}, cuda-cpp-output, (device-cuda, sm_30)
-// BIN-DAG: [[P5:[0-9]+]]: compiler, {[[P4]]}, ir, (device-cuda, sm_30)
-// BIN-DAG: [[P6:[0-9]+]]: backend, {[[P5]]}, assembler, (device-cuda, sm_30)
-// BIN-DAG: [[P7:[0-9]+]]: assembler, {[[P6]]}, object, (device-cuda, sm_30)
-// BIN-DAG: [[P8:[0-9]+]]: offload, "device-cuda (nvptx64-nvidia-cuda:sm_30)" {[[P7]]}, object
-// BIN-DAG: [[P9:[0-9]+]]: offload, "device-cuda (nvptx64-nvidia-cuda:sm_30)" {[[P6]]}, assembler
-// BIN-DAG: [[P10:[0-9]+]]: linker, {[[P8]], [[P9]]}, cuda-fatbin, (device-cuda)
-// BIN-DAG: [[P11:[0-9]+]]: offload, "host-cuda (powerpc64le-ibm-linux-gnu)" {[[P2]]}, "device-cuda (nvptx64-nvidia-cuda)" {[[P10]]}, ir
-// BIN-DAG: [[P12:[0-9]+]]: backend, {[[P11]]}, assembler, (host-cuda)
-// BIN-DAG: [[P13:[0-9]+]]: assembler, {[[P12]]}, object, (host-cuda)
-// BIN-DAG: [[P14:[0-9]+]]: linker, {[[P13]]}, image, (host-cuda)
+// RUN: | FileCheck -check-prefixes=BIN,BIN_NV %s
+// RUN: %clang -x hip -target powerpc64le-ibm-linux-gnu -ccc-print-phases --cuda-gpu-arch=gfx803 %s 2>&1 \
+// RUN: | FileCheck -check-prefixes=BIN,BIN_AMD %s
+// BIN_NV-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T:cuda]], (host-[[T]])
+// BIN_AMD-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T:hip]], (host-[[T]])
+// BIN-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, (host-[[T]])
+// BIN-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (host-[[T]])
+// BIN_NV-DAG: [[P3:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T]], (device-[[T]], [[ARCH:sm_30]])
+// BIN_AMD-DAG: [[P3:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T]], (device-[[T]], [[ARCH:gfx803]])
+// BIN-DAG: [[P4:[0-9]+]]: preprocessor, {[[P3]]}, [[T]]-cpp-output, (device-[[T]], [[ARCH]])
+// BIN-DAG: [[P5:[0-9]+]]: compiler, {[[P4]]}, ir, (device-[[T]], [[ARCH]])
+// BIN-DAG: [[P6:[0-9]+]]: backend, {[[P5]]}, assembler, (device-[[T]], [[ARCH]])
+// BIN-DAG: [[P7:[0-9]+]]: assembler, {[[P6]]}, object, (device-[[T]], [[ARCH]])
+// BIN_NV-DAG: [[P8:[0-9]+]]: offload, "device-[[T]] ([[TRIPLE:nvptx64-nvidia-cuda]]:[[ARCH]])" {[[P7]]}, object
+// BIN_AMD-DAG: [[P8:[0-9]+]]: offload, "device-[[T]] ([[TRIPLE:amdgcn-amd-amdhsa]]:[[ARCH]])" {[[P7]]}, object
+// BIN-DAG: [[P9:[0-9]+]]: offload, "device-[[T]] ([[TRIPLE]]:[[ARCH]])" {[[P6]]}, assembler
+// BIN-DAG: [[P10:[0-9]+]]: linker, {[[P8]], [[P9]]}, cuda-fatbin, (device-[[T]])
+// BIN-DAG: [[P11:[0-9]+]]: offload, "host-[[T]] (powerpc64le-ibm-linux-gnu)" {[[P2]]}, "device-[[T]] ([[TRIPLE]])" {[[P10]]}, ir
+// BIN-DAG: [[P12:[0-9]+]]: backend, {[[P11]]}, assembler, (host-[[T]])
+// BIN-DAG: [[P13:[0-9]+]]: assembler, {[[P12]]}, object, (host-[[T]])
+// BIN-DAG: [[P14:[0-9]+]]: linker, {[[P13]]}, image, (host-[[T]])
 
 //
 // Test single gpu architecture up to the assemble phase.
 //
 // RUN: %clang -target powerpc64le-ibm-linux-gnu -ccc-print-phases --cuda-gpu-arch=sm_30 %s -S 2>&1 \
-// RUN: | FileCheck -check-prefix=ASM %s
-// ASM-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (device-cuda, sm_30)
-// ASM-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, cuda-cpp-output, (device-cuda, sm_30)
-// ASM-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-cuda, sm_30)
-// ASM-DAG: [[P3:[0-9]+]]: backend, {[[P2]]}, assembler, (device-cuda, sm_30)
-// ASM-DAG: [[P4:[0-9]+]]: offload, "device-cuda (nvptx64-nvidia-cuda:sm_30)" {[[P3]]}, assembler
-// ASM-DAG: [[P5:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (host-cuda)
-// ASM-DAG: [[P6:[0-9]+]]: preprocessor, {[[P5]]}, cuda-cpp-output, 

[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-04-19 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 21 inline comments as done.
yaxunl added inline comments.



Comment at: include/clang/Driver/Options.td:552-553
+def : Joined<["--"], "offload-arch=">, Alias;
+def offload_archs : Joined<["--"], "offload-archs=">, Flags<[DriverOption]>,
+  HelpText<"List of offload architectures for CUDA/HIP/OpenMP (e.g. 
sm_35,gfx803).">;
 def no_cuda_gpu_arch_EQ : Joined<["--"], "no-cuda-gpu-arch=">, 
Flags<[DriverOption]>,

tra wrote:
> The discussion you've mentioned appears to be unfinished. @jlebar has raised 
> a valid point regarding -no-offload-arch[s] and his email went unanswered 
> AFAICT.
> 
> If you propose to generalize a way to specify offload targets, it should 
> probably be done in a separate patch.
> 
> I'd just stick with `--cuda-gpu-arch` for the time being until we figure out 
> how we're going to handle target specification in general. It works well 
> enough for the moment and the new options are not required for the 
> functionality implemented by this patch.
Will use `--cuda-gpu-arch` for this patch.



Comment at: lib/Driver/Driver.cpp:2333-2339
+  if (HostTC->getTriple().isNVPTX() ||
+  HostTC->getTriple().getArch() == llvm::Triple::amdgcn) {
 // We do not support targeting NVPTX for host compilation. Throw
 // an error and abort pipeline construction early so we don't trip
 // asserts that assume device-side compilation.
 C.getDriver().Diag(diag::err_drv_cuda_nvptx_host);
 return true;

tra wrote:
> You really want to either have your own error message or change existing one 
> to something more generic. `unsupported use of NVPTX for host compilation.` 
> will sound very confusing tof someone who tries to compile for AMD GPU.
Will generalise this error message.



Comment at: lib/Driver/Driver.cpp:3319
+  static bool isAMDGPUCUDAOffloading(const Action *A, llvm::Triple T) {
+return A->isOffloading(Action::OFK_Cuda) &&
+   (StringRef(A->getOffloadingArch()).startswith("gfx") ||

tra wrote:
> Would it make sense for HIP to have its own offloading kind? You seem to be 
> adding similar checks in few other places.
Yes it makes sense to let HIP have its own offloading kind. Will do.



Comment at: lib/Driver/Driver.cpp:3472-3474
+const char *Value = A->getValue();
+if (const char *Ext = strrchr(Value, '.'))
+  InputType = TC.LookupTypeForExtension(Ext + 1);

tra wrote:
> `llvm::sys::path::extension` ?
will do



Comment at: lib/Driver/Driver.cpp:3905-3912
+SmallString<128> fname(Split.first.str().c_str());
+if (!BoundArch.empty()) {
+  fname += "-";
+  fname.append(BoundArch);
+}
 std::string TmpName = GetTemporaryPath(
+fname, types::getTypeTempSuffix(JA.getType(), IsCLMode()));

tra wrote:
> I'm not sure why we do this here. Nor does it seem relevant to HIP. 
will remove it



Comment at: lib/Driver/Driver.cpp:3982-3985
+if (((StringRef)BaseInput).endswith(".a"))
+  Suffixed += "a";
+else
+  Suffixed += Suffix;

tra wrote:
> Same here -- I'm not sure why we need this nor how it's related to HIP.
will remove



Comment at: lib/Driver/ToolChains/Cuda.cpp:352
+  CmdArgs.push_back("-o");
+  std::string TmpName = C.getDriver().GetTemporaryPath("LC_OUTPUT", "o");
+  const char *llcOutputFile =

tra wrote:
> `L*L*C_OUTPUT` ?
will change to LLC_OUTPUT



Comment at: lib/Driver/ToolChains/Cuda.cpp:356
+  CmdArgs.push_back(llcOutputFile);
+  const char *Exec = Args.MakeArgString(C.getDriver().Dir + "/llc");
+  C.addCommand(llvm::make_unique(JA, *this, Exec, CmdArgs, Inputs));

tra wrote:
> Please use routines in`llvm::sys::path` here and in other places where you 
> manipulate paths.
will do



Comment at: lib/Driver/ToolChains/Cuda.cpp:360-363
+  CmdArgs2.push_back("-flavor");
+  CmdArgs2.push_back("gnu");
+  CmdArgs2.push_back("--no-undefined");
+  CmdArgs2.push_back("-shared");

tra wrote:
> You could use .append({...})
> ```
> CmdArgs2.append({"foo","bar","buz"});
> ```
will do



Comment at: lib/Driver/ToolChains/Cuda.cpp:436-439
+if (Arg->getSpelling() == "-L") {
+  LibraryPaths.push_back(Args.MakeArgString(
+  std::string(Arg->getValue()) + "/libdevice/" + 
std::string(GFXNAME)));
+  LibraryPaths.push_back(Args.MakeArgString(Arg->getValue()));

tra wrote:
> This is rather awkward -- you're looking for /libdevice under all paths 
> specified by -L, but there's no way to explicitly point to the directory with 
> the bitcode library. If device library may be in a non-canonical location, 
> I'd rather there was an explicit option to specify it. Furthermore, my 
> understanding is that you will need to find these bitcode 

[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-04-17 Thread Artem Belevich via Phabricator via cfe-commits
tra requested changes to this revision.
tra added a subscriber: jlebar.
tra added a comment.
This revision now requires changes to proceed.

Sorry about the delay. I've been out for few days.




Comment at: include/clang/Driver/Options.td:552-553
+def : Joined<["--"], "offload-arch=">, Alias;
+def offload_archs : Joined<["--"], "offload-archs=">, Flags<[DriverOption]>,
+  HelpText<"List of offload architectures for CUDA/HIP/OpenMP (e.g. 
sm_35,gfx803).">;
 def no_cuda_gpu_arch_EQ : Joined<["--"], "no-cuda-gpu-arch=">, 
Flags<[DriverOption]>,

The discussion you've mentioned appears to be unfinished. @jlebar has raised a 
valid point regarding -no-offload-arch[s] and his email went unanswered AFAICT.

If you propose to generalize a way to specify offload targets, it should 
probably be done in a separate patch.

I'd just stick with `--cuda-gpu-arch` for the time being until we figure out 
how we're going to handle target specification in general. It works well enough 
for the moment and the new options are not required for the functionality 
implemented by this patch.



Comment at: lib/Driver/Driver.cpp:2333-2339
+  if (HostTC->getTriple().isNVPTX() ||
+  HostTC->getTriple().getArch() == llvm::Triple::amdgcn) {
 // We do not support targeting NVPTX for host compilation. Throw
 // an error and abort pipeline construction early so we don't trip
 // asserts that assume device-side compilation.
 C.getDriver().Diag(diag::err_drv_cuda_nvptx_host);
 return true;

You really want to either have your own error message or change existing one to 
something more generic. `unsupported use of NVPTX for host compilation.` will 
sound very confusing tof someone who tries to compile for AMD GPU.



Comment at: lib/Driver/Driver.cpp:3319
+  static bool isAMDGPUCUDAOffloading(const Action *A, llvm::Triple T) {
+return A->isOffloading(Action::OFK_Cuda) &&
+   (StringRef(A->getOffloadingArch()).startswith("gfx") ||

Would it make sense for HIP to have its own offloading kind? You seem to be 
adding similar checks in few other places.



Comment at: lib/Driver/Driver.cpp:3472-3474
+const char *Value = A->getValue();
+if (const char *Ext = strrchr(Value, '.'))
+  InputType = TC.LookupTypeForExtension(Ext + 1);

`llvm::sys::path::extension` ?



Comment at: lib/Driver/Driver.cpp:3905-3912
+SmallString<128> fname(Split.first.str().c_str());
+if (!BoundArch.empty()) {
+  fname += "-";
+  fname.append(BoundArch);
+}
 std::string TmpName = GetTemporaryPath(
+fname, types::getTypeTempSuffix(JA.getType(), IsCLMode()));

I'm not sure why we do this here. Nor does it seem relevant to HIP. 



Comment at: lib/Driver/Driver.cpp:3982-3985
+if (((StringRef)BaseInput).endswith(".a"))
+  Suffixed += "a";
+else
+  Suffixed += Suffix;

Same here -- I'm not sure why we need this nor how it's related to HIP.



Comment at: lib/Driver/ToolChains/Cuda.cpp:352
+  CmdArgs.push_back("-o");
+  std::string TmpName = C.getDriver().GetTemporaryPath("LC_OUTPUT", "o");
+  const char *llcOutputFile =

`L*L*C_OUTPUT` ?



Comment at: lib/Driver/ToolChains/Cuda.cpp:356
+  CmdArgs.push_back(llcOutputFile);
+  const char *Exec = Args.MakeArgString(C.getDriver().Dir + "/llc");
+  C.addCommand(llvm::make_unique(JA, *this, Exec, CmdArgs, Inputs));

Please use routines in`llvm::sys::path` here and in other places where you 
manipulate paths.



Comment at: lib/Driver/ToolChains/Cuda.cpp:360-363
+  CmdArgs2.push_back("-flavor");
+  CmdArgs2.push_back("gnu");
+  CmdArgs2.push_back("--no-undefined");
+  CmdArgs2.push_back("-shared");

You could use .append({...})
```
CmdArgs2.append({"foo","bar","buz"});
```



Comment at: lib/Driver/ToolChains/Cuda.cpp:436-439
+if (Arg->getSpelling() == "-L") {
+  LibraryPaths.push_back(Args.MakeArgString(
+  std::string(Arg->getValue()) + "/libdevice/" + 
std::string(GFXNAME)));
+  LibraryPaths.push_back(Args.MakeArgString(Arg->getValue()));

This is rather awkward -- you're looking for /libdevice under all paths 
specified by -L, but there's no way to explicitly point to the directory with 
the bitcode library. If device library may be in a non-canonical location, I'd 
rather there was an explicit option to specify it. Furthermore, my 
understanding is that you will need to find these bitcode libraries during `-c` 
compilation. Using `-L` to derive bitcode search path during compilation looks 
wrong to me.



Comment at: lib/Driver/ToolChains/Cuda.cpp:445
+  LibraryPaths.push_back(Args.MakeArgString(
+  C.getDriver().Dir + "/../lib/libdevice/" + 

[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-04-13 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment.

In https://reviews.llvm.org/D45212#1066842, @rjmccall wrote:

> I'd still prefer if someone with more driver-design expertise weighed in, but 
> we might not have any specialists there.


I think you should at least give @tra the possibility to take a look. Last time 
we essentially ended up reverting a patch.


https://reviews.llvm.org/D45212



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-04-13 Thread John McCall via Phabricator via cfe-commits
rjmccall accepted this revision.
rjmccall added a comment.
This revision is now accepted and ready to land.

I'd still prefer if someone with more driver-design expertise weighed in, but 
we might not have any specialists there.

LGTM, although you might consider changing your tests a bit: FileCheck recently 
added support for a -D argument where you can predefine variables in the 
command line.  But that's just a suggestion, not something I'm asking you to do 
as part of review.


https://reviews.llvm.org/D45212



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-04-10 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment.

In https://reviews.llvm.org/D45212#1063186, @yaxunl wrote:

> In https://reviews.llvm.org/D45212#1060628, @Hahnfeld wrote:
>
> > Can this revision be split further? The summary mentions many things that 
> > might make up multiple independent changes...
>
>
> I separated file type changes to https://reviews.llvm.org/D45489 and deferred 
> some other changes for future.
>
> It is kind of difficult to split this patch further since they depend on each 
> other.


You can add `Related Revisions` to make this easy to see.


https://reviews.llvm.org/D45212



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-04-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

In https://reviews.llvm.org/D45212#1060628, @Hahnfeld wrote:

> Can this revision be split further? The summary mentions many things that 
> might make up multiple independent changes...


I separated file type changes to https://reviews.llvm.org/D45489 and deferred 
some other changes for future.

It is kind of difficult to split this patch further since they depend on each 
other.


https://reviews.llvm.org/D45212



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-04-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 141863.
yaxunl marked 2 inline comments as done.
yaxunl edited the summary of this revision.
yaxunl added a comment.

Separate file type changes to another patch.


https://reviews.llvm.org/D45212

Files:
  include/clang/Driver/Options.td
  include/clang/Driver/ToolChain.h
  lib/Driver/Driver.cpp
  lib/Driver/ToolChain.cpp
  lib/Driver/ToolChains/Cuda.cpp
  lib/Driver/ToolChains/Cuda.h
  test/Driver/cuda-phases.cu

Index: test/Driver/cuda-phases.cu
===
--- test/Driver/cuda-phases.cu
+++ test/Driver/cuda-phases.cu
@@ -7,24 +7,29 @@
 // REQUIRES: clang-driver
 // REQUIRES: powerpc-registered-target
 // REQUIRES: nvptx-registered-target
-
+// REQUIRES: amdgpu-registered-target
 //
 // Test single gpu architecture with complete compilation.
 //
 // RUN: %clang -target powerpc64le-ibm-linux-gnu -ccc-print-phases --cuda-gpu-arch=sm_30 %s 2>&1 \
-// RUN: | FileCheck -check-prefix=BIN %s
-// BIN-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (host-cuda)
-// BIN-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, cuda-cpp-output, (host-cuda)
+// RUN: | FileCheck -check-prefixes=BIN,BIN_NV %s
+// RUN: %clang -x hip -target powerpc64le-ibm-linux-gnu -ccc-print-phases --cuda-gpu-arch=gfx803 %s 2>&1 \
+// RUN: | FileCheck -check-prefixes=BIN,BIN_AMD %s
+// BIN_NV-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T:cuda]], (host-cuda)
+// BIN_AMD-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T:hip]], (host-cuda)
+// BIN-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, (host-cuda)
 // BIN-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (host-cuda)
-// BIN-DAG: [[P3:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (device-cuda, sm_30)
-// BIN-DAG: [[P4:[0-9]+]]: preprocessor, {[[P3]]}, cuda-cpp-output, (device-cuda, sm_30)
-// BIN-DAG: [[P5:[0-9]+]]: compiler, {[[P4]]}, ir, (device-cuda, sm_30)
-// BIN-DAG: [[P6:[0-9]+]]: backend, {[[P5]]}, assembler, (device-cuda, sm_30)
-// BIN-DAG: [[P7:[0-9]+]]: assembler, {[[P6]]}, object, (device-cuda, sm_30)
-// BIN-DAG: [[P8:[0-9]+]]: offload, "device-cuda (nvptx64-nvidia-cuda:sm_30)" {[[P7]]}, object
-// BIN-DAG: [[P9:[0-9]+]]: offload, "device-cuda (nvptx64-nvidia-cuda:sm_30)" {[[P6]]}, assembler
+// BIN_NV-DAG: [[P3:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T]], (device-cuda, [[ARCH:sm_30]])
+// BIN_AMD-DAG: [[P3:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T]], (device-cuda, [[ARCH:gfx803]])
+// BIN-DAG: [[P4:[0-9]+]]: preprocessor, {[[P3]]}, [[T]]-cpp-output, (device-cuda, [[ARCH]])
+// BIN-DAG: [[P5:[0-9]+]]: compiler, {[[P4]]}, ir, (device-cuda, [[ARCH]])
+// BIN-DAG: [[P6:[0-9]+]]: backend, {[[P5]]}, assembler, (device-cuda, [[ARCH]])
+// BIN-DAG: [[P7:[0-9]+]]: assembler, {[[P6]]}, object, (device-cuda, [[ARCH]])
+// BIN_NV-DAG: [[P8:[0-9]+]]: offload, "device-cuda ([[TRIPLE:nvptx64-nvidia-cuda]]:[[ARCH]])" {[[P7]]}, object
+// BIN_AMD-DAG: [[P8:[0-9]+]]: offload, "device-cuda ([[TRIPLE:amdgcn-amd-amdhsa]]:[[ARCH]])" {[[P7]]}, object
+// BIN-DAG: [[P9:[0-9]+]]: offload, "device-cuda ([[TRIPLE]]:[[ARCH]])" {[[P6]]}, assembler
 // BIN-DAG: [[P10:[0-9]+]]: linker, {[[P8]], [[P9]]}, cuda-fatbin, (device-cuda)
-// BIN-DAG: [[P11:[0-9]+]]: offload, "host-cuda (powerpc64le-ibm-linux-gnu)" {[[P2]]}, "device-cuda (nvptx64-nvidia-cuda)" {[[P10]]}, ir
+// BIN-DAG: [[P11:[0-9]+]]: offload, "host-cuda (powerpc64le-ibm-linux-gnu)" {[[P2]]}, "device-cuda ([[TRIPLE]])" {[[P10]]}, ir
 // BIN-DAG: [[P12:[0-9]+]]: backend, {[[P11]]}, assembler, (host-cuda)
 // BIN-DAG: [[P13:[0-9]+]]: assembler, {[[P12]]}, object, (host-cuda)
 // BIN-DAG: [[P14:[0-9]+]]: linker, {[[P13]]}, image, (host-cuda)
@@ -34,40 +39,44 @@
 //
 // RUN: %clang -target powerpc64le-ibm-linux-gnu -ccc-print-phases --cuda-gpu-arch=sm_30 %s -S 2>&1 \
 // RUN: | FileCheck -check-prefix=ASM %s
-// ASM-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (device-cuda, sm_30)
-// ASM-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, cuda-cpp-output, (device-cuda, sm_30)
-// ASM-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-cuda, sm_30)
-// ASM-DAG: [[P3:[0-9]+]]: backend, {[[P2]]}, assembler, (device-cuda, sm_30)
-// ASM-DAG: [[P4:[0-9]+]]: offload, "device-cuda (nvptx64-nvidia-cuda:sm_30)" {[[P3]]}, assembler
-// ASM-DAG: [[P5:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (host-cuda)
-// ASM-DAG: [[P6:[0-9]+]]: preprocessor, {[[P5]]}, cuda-cpp-output, (host-cuda)
+// RUN: %clang -x hip -target powerpc64le-ibm-linux-gnu -ccc-print-phases --cuda-gpu-arch=gfx803 %s -S 2>&1 \
+// RUN: | FileCheck -check-prefix=ASM %s
+// ASM-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T:cuda|hip]], (device-cuda, [[ARCH:sm_30|gfx803]])
+// ASM-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, (device-cuda, [[ARCH]])
+// ASM-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-cuda, [[ARCH]])
+// ASM-DAG: [[P3:[0-9]+]]: backend, {[[P2]]}, assembler, (device-cuda, [[ARCH]])
+// ASM-DAG: [[P4:[0-9]+]]: 

[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-04-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 2 inline comments as done.
yaxunl added inline comments.



Comment at: lib/Driver/Driver.cpp:2267
+if ((IA->getType() != types::TY_CUDA) &&
+IA->getType() != types::TY_HIP) {
   // The builder will ignore this input.

rjmccall wrote:
> The extra parens are weird here.
will remove it.



Comment at: lib/Driver/ToolChains/Cuda.cpp:263
+// HIP needs c++11.
+CC1Args.push_back("-std=c++11");
+// Skip CUDA includes for HIP.

Hahnfeld wrote:
> Will this override the user's value, e.g. `-std=c++14`?
No. We will add diagnotics in a separate patch.


https://reviews.llvm.org/D45212



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-04-07 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment.

Can this revision be split further? The summary mentions many things that might 
make up multiple independent changes...




Comment at: lib/Driver/ToolChains/Cuda.cpp:263
+// HIP needs c++11.
+CC1Args.push_back("-std=c++11");
+// Skip CUDA includes for HIP.

Will this override the user's value, e.g. `-std=c++14`?


https://reviews.llvm.org/D45212



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-04-06 Thread John McCall via Phabricator via cfe-commits
rjmccall added a comment.

This looks okay to me, but I think it would better if someone with more 
expertise in the design of the driver and frontend code could review this.




Comment at: lib/Driver/Driver.cpp:2267
+if ((IA->getType() != types::TY_CUDA) &&
+IA->getType() != types::TY_HIP) {
   // The builder will ignore this input.

The extra parens are weird here.


https://reviews.llvm.org/D45212



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-04-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 141221.
yaxunl marked an inline comment as done.
yaxunl edited the summary of this revision.
yaxunl added a comment.

Revised by reviewers' comments, including comments from previous review.


https://reviews.llvm.org/D45212

Files:
  include/clang/Basic/DiagnosticDriverKinds.td
  include/clang/Driver/Options.td
  include/clang/Driver/ToolChain.h
  include/clang/Driver/Types.def
  lib/Driver/Driver.cpp
  lib/Driver/SanitizerArgs.cpp
  lib/Driver/ToolChain.cpp
  lib/Driver/ToolChains/Clang.cpp
  lib/Driver/ToolChains/Cuda.cpp
  lib/Driver/ToolChains/Cuda.h
  lib/Driver/ToolChains/Gnu.cpp
  lib/Driver/Types.cpp
  lib/Frontend/CompilerInstance.cpp
  test/Driver/cuda-phases.cu

Index: test/Driver/cuda-phases.cu
===
--- test/Driver/cuda-phases.cu
+++ test/Driver/cuda-phases.cu
@@ -7,24 +7,29 @@
 // REQUIRES: clang-driver
 // REQUIRES: powerpc-registered-target
 // REQUIRES: nvptx-registered-target
-
+// REQUIRES: amdgpu-registered-target
 //
 // Test single gpu architecture with complete compilation.
 //
 // RUN: %clang -target powerpc64le-ibm-linux-gnu -ccc-print-phases --cuda-gpu-arch=sm_30 %s 2>&1 \
-// RUN: | FileCheck -check-prefix=BIN %s
-// BIN-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (host-cuda)
-// BIN-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, cuda-cpp-output, (host-cuda)
+// RUN: | FileCheck -check-prefixes=BIN,BIN_NV %s
+// RUN: %clang -x hip -target powerpc64le-ibm-linux-gnu -ccc-print-phases --cuda-gpu-arch=gfx803 %s 2>&1 \
+// RUN: | FileCheck -check-prefixes=BIN,BIN_AMD %s
+// BIN_NV-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T:cuda]], (host-cuda)
+// BIN_AMD-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T:hip]], (host-cuda)
+// BIN-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, (host-cuda)
 // BIN-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (host-cuda)
-// BIN-DAG: [[P3:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (device-cuda, sm_30)
-// BIN-DAG: [[P4:[0-9]+]]: preprocessor, {[[P3]]}, cuda-cpp-output, (device-cuda, sm_30)
-// BIN-DAG: [[P5:[0-9]+]]: compiler, {[[P4]]}, ir, (device-cuda, sm_30)
-// BIN-DAG: [[P6:[0-9]+]]: backend, {[[P5]]}, assembler, (device-cuda, sm_30)
-// BIN-DAG: [[P7:[0-9]+]]: assembler, {[[P6]]}, object, (device-cuda, sm_30)
-// BIN-DAG: [[P8:[0-9]+]]: offload, "device-cuda (nvptx64-nvidia-cuda:sm_30)" {[[P7]]}, object
-// BIN-DAG: [[P9:[0-9]+]]: offload, "device-cuda (nvptx64-nvidia-cuda:sm_30)" {[[P6]]}, assembler
+// BIN_NV-DAG: [[P3:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T]], (device-cuda, [[ARCH:sm_30]])
+// BIN_AMD-DAG: [[P3:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T]], (device-cuda, [[ARCH:gfx803]])
+// BIN-DAG: [[P4:[0-9]+]]: preprocessor, {[[P3]]}, [[T]]-cpp-output, (device-cuda, [[ARCH]])
+// BIN-DAG: [[P5:[0-9]+]]: compiler, {[[P4]]}, ir, (device-cuda, [[ARCH]])
+// BIN-DAG: [[P6:[0-9]+]]: backend, {[[P5]]}, assembler, (device-cuda, [[ARCH]])
+// BIN-DAG: [[P7:[0-9]+]]: assembler, {[[P6]]}, object, (device-cuda, [[ARCH]])
+// BIN_NV-DAG: [[P8:[0-9]+]]: offload, "device-cuda ([[TRIPLE:nvptx64-nvidia-cuda]]:[[ARCH]])" {[[P7]]}, object
+// BIN_AMD-DAG: [[P8:[0-9]+]]: offload, "device-cuda ([[TRIPLE:amdgcn-amd-amdhsa]]:[[ARCH]])" {[[P7]]}, object
+// BIN-DAG: [[P9:[0-9]+]]: offload, "device-cuda ([[TRIPLE]]:[[ARCH]])" {[[P6]]}, assembler
 // BIN-DAG: [[P10:[0-9]+]]: linker, {[[P8]], [[P9]]}, cuda-fatbin, (device-cuda)
-// BIN-DAG: [[P11:[0-9]+]]: offload, "host-cuda (powerpc64le-ibm-linux-gnu)" {[[P2]]}, "device-cuda (nvptx64-nvidia-cuda)" {[[P10]]}, ir
+// BIN-DAG: [[P11:[0-9]+]]: offload, "host-cuda (powerpc64le-ibm-linux-gnu)" {[[P2]]}, "device-cuda ([[TRIPLE]])" {[[P10]]}, ir
 // BIN-DAG: [[P12:[0-9]+]]: backend, {[[P11]]}, assembler, (host-cuda)
 // BIN-DAG: [[P13:[0-9]+]]: assembler, {[[P12]]}, object, (host-cuda)
 // BIN-DAG: [[P14:[0-9]+]]: linker, {[[P13]]}, image, (host-cuda)
@@ -34,40 +39,44 @@
 //
 // RUN: %clang -target powerpc64le-ibm-linux-gnu -ccc-print-phases --cuda-gpu-arch=sm_30 %s -S 2>&1 \
 // RUN: | FileCheck -check-prefix=ASM %s
-// ASM-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (device-cuda, sm_30)
-// ASM-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, cuda-cpp-output, (device-cuda, sm_30)
-// ASM-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-cuda, sm_30)
-// ASM-DAG: [[P3:[0-9]+]]: backend, {[[P2]]}, assembler, (device-cuda, sm_30)
-// ASM-DAG: [[P4:[0-9]+]]: offload, "device-cuda (nvptx64-nvidia-cuda:sm_30)" {[[P3]]}, assembler
-// ASM-DAG: [[P5:[0-9]+]]: input, "{{.*}}cuda-phases.cu", cuda, (host-cuda)
-// ASM-DAG: [[P6:[0-9]+]]: preprocessor, {[[P5]]}, cuda-cpp-output, (host-cuda)
+// RUN: %clang -x hip -target powerpc64le-ibm-linux-gnu -ccc-print-phases --cuda-gpu-arch=gfx803 %s -S 2>&1 \
+// RUN: | FileCheck -check-prefix=ASM %s
+// ASM-DAG: [[P0:[0-9]+]]: input, "{{.*}}cuda-phases.cu", [[T:cuda|hip]], (device-cuda, [[ARCH:sm_30|gfx803]])
+// ASM-DAG: 

[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-04-04 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 4 inline comments as done.
yaxunl added inline comments.



Comment at: include/clang/Basic/Cuda.h:61
+  GFX900,
+  GFX902,
   LAST,

Hahnfeld wrote:
> rjmccall wrote:
> > yaxunl wrote:
> > > rjmccall wrote:
> > > > Does this actually have anything to do with HIP?  You have a lot of 
> > > > changes in this patch which seem to just be about supporting more GPU 
> > > > revisions.
> > > This patch not only adds support of HIP language mode but also adds 
> > > support of amdgpu to CUDA toolchain.
> > > 
> > > Currently HIP extension is only supported by amdgpu although in the 
> > > future it may be supported by other targets.
> > I understand that, but I think you can separate those two patches without 
> > too much difficulty.
> There already is D42800 with lots of unanswered comments, so those need to be 
> addressed first.
Will split the changes for supporting amdgcn to another review.



Comment at: include/clang/Basic/Cuda.h:61
+  GFX900,
+  GFX902,
   LAST,

yaxunl wrote:
> Hahnfeld wrote:
> > rjmccall wrote:
> > > yaxunl wrote:
> > > > rjmccall wrote:
> > > > > Does this actually have anything to do with HIP?  You have a lot of 
> > > > > changes in this patch which seem to just be about supporting more GPU 
> > > > > revisions.
> > > > This patch not only adds support of HIP language mode but also adds 
> > > > support of amdgpu to CUDA toolchain.
> > > > 
> > > > Currently HIP extension is only supported by amdgpu although in the 
> > > > future it may be supported by other targets.
> > > I understand that, but I think you can separate those two patches without 
> > > too much difficulty.
> > There already is D42800 with lots of unanswered comments, so those need to 
> > be addressed first.
> Will split the changes for supporting amdgcn to another review.
Will address those comments in the split review.



Comment at: include/clang/Driver/Options.td:556
+def offload_archs : Joined<["--"], "offload-archs=">, Flags<[DriverOption]>,
+  HelpText<"List of offload architectures for CUDA/HIP/OpenMP (e.g. 
sm_35,gfx803).">;
 def no_cuda_gpu_arch_EQ : Joined<["--"], "no-cuda-gpu-arch=">, 
Flags<[DriverOption]>,

rjmccall wrote:
> yaxunl wrote:
> > rjmccall wrote:
> > > Do we absolutely need the non-CUDA-related aliases here?  We generally 
> > > try to be good about namespacing extension-specific language options.
> > > 
> > > I understand that you're probably trying to maintain command-line 
> > > compatibility with some existing toolchain, but if it's possible to avoid 
> > > this, I would be much happier.
> > There were discussions about a uniform clang option for offloading sub-arcs 
> > 
> > http://lists.llvm.org/pipermail/llvm-dev/2017-February/109896.html 
> > 
> > the consensus seem to be -offload-arch or -offload-archs.
> > 
> > This patch attempts to make that transition to use this new option.
> > 
> > We can separate this change to a different patch though.
> I don't mind the `-offload-*` options; I'm more concerned about `--host-only` 
> and so on.
Will drop those options for now. Since HIP is just a language mode of CUDA, it 
is OK to use CUDA options e.g. 'cuda-device-only' for HIP.



Comment at: include/clang/Driver/ToolChain.h:124
   mutable std::unique_ptr Clang;
+  mutable std::unique_ptr Backend;
   mutable std::unique_ptr Assemble;

rjmccall wrote:
> yaxunl wrote:
> > rjmccall wrote:
> > > "Backend" is a really generic name for this thing that is probably 
> > > hyper-specific to the CUDA translation model.
> > Agree. This tool actually links bunch of bitcode libraries (so called 
> > device libraries). For non-GPU targets, this is usually unnecessary since 
> > they support ISA-level linking. However most GPU targets do not support 
> > that, therefore they need this stage.
> > 
> > How about renaming it as BitcodeLink?
> DeviceLibraryLink, maybe?  I wouldn't want someone to think this was related 
> to ordinary LTO.
That sounds good. Will change to that.


https://reviews.llvm.org/D45212



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-04-04 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added inline comments.



Comment at: include/clang/Basic/Cuda.h:61
+  GFX900,
+  GFX902,
   LAST,

rjmccall wrote:
> yaxunl wrote:
> > rjmccall wrote:
> > > Does this actually have anything to do with HIP?  You have a lot of 
> > > changes in this patch which seem to just be about supporting more GPU 
> > > revisions.
> > This patch not only adds support of HIP language mode but also adds support 
> > of amdgpu to CUDA toolchain.
> > 
> > Currently HIP extension is only supported by amdgpu although in the future 
> > it may be supported by other targets.
> I understand that, but I think you can separate those two patches without too 
> much difficulty.
There already is D42800 with lots of unanswered comments, so those need to be 
addressed first.


https://reviews.llvm.org/D45212



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-04-04 Thread John McCall via Phabricator via cfe-commits
rjmccall added inline comments.



Comment at: include/clang/Basic/Cuda.h:61
+  GFX900,
+  GFX902,
   LAST,

yaxunl wrote:
> rjmccall wrote:
> > Does this actually have anything to do with HIP?  You have a lot of changes 
> > in this patch which seem to just be about supporting more GPU revisions.
> This patch not only adds support of HIP language mode but also adds support 
> of amdgpu to CUDA toolchain.
> 
> Currently HIP extension is only supported by amdgpu although in the future it 
> may be supported by other targets.
I understand that, but I think you can separate those two patches without too 
much difficulty.



Comment at: include/clang/Driver/Options.td:556
+def offload_archs : Joined<["--"], "offload-archs=">, Flags<[DriverOption]>,
+  HelpText<"List of offload architectures for CUDA/HIP/OpenMP (e.g. 
sm_35,gfx803).">;
 def no_cuda_gpu_arch_EQ : Joined<["--"], "no-cuda-gpu-arch=">, 
Flags<[DriverOption]>,

yaxunl wrote:
> rjmccall wrote:
> > Do we absolutely need the non-CUDA-related aliases here?  We generally try 
> > to be good about namespacing extension-specific language options.
> > 
> > I understand that you're probably trying to maintain command-line 
> > compatibility with some existing toolchain, but if it's possible to avoid 
> > this, I would be much happier.
> There were discussions about a uniform clang option for offloading sub-arcs 
> 
> http://lists.llvm.org/pipermail/llvm-dev/2017-February/109896.html 
> 
> the consensus seem to be -offload-arch or -offload-archs.
> 
> This patch attempts to make that transition to use this new option.
> 
> We can separate this change to a different patch though.
I don't mind the `-offload-*` options; I'm more concerned about `--host-only` 
and so on.



Comment at: include/clang/Driver/ToolChain.h:124
   mutable std::unique_ptr Clang;
+  mutable std::unique_ptr Backend;
   mutable std::unique_ptr Assemble;

yaxunl wrote:
> rjmccall wrote:
> > "Backend" is a really generic name for this thing that is probably 
> > hyper-specific to the CUDA translation model.
> Agree. This tool actually links bunch of bitcode libraries (so called device 
> libraries). For non-GPU targets, this is usually unnecessary since they 
> support ISA-level linking. However most GPU targets do not support that, 
> therefore they need this stage.
> 
> How about renaming it as BitcodeLink?
DeviceLibraryLink, maybe?  I wouldn't want someone to think this was related to 
ordinary LTO.


https://reviews.llvm.org/D45212



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits