[PATCH] D156426: [HIP] link HIP runtime library without --hip-link

2023-07-27 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan accepted this revision.
scchan added a comment.

LGTM thanks


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156426/new/

https://reviews.llvm.org/D156426

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D155213: [Driver] Add `-f[no-]offload-uniform-block`

2023-07-27 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan accepted this revision.
scchan added a comment.
This revision is now accepted and ready to land.

LGTM thanks


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155213/new/

https://reviews.llvm.org/D155213

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D156426: [HIP] link HIP runtime library without --hip-link

2023-07-27 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan added inline comments.



Comment at: clang/docs/HIPSupport.rst:71
+In the above command, the ``--hip-link`` flag instructs Clang to link the HIP 
runtime library. However,
+the use of this flag is unnecessary if a HIP input file is already present in 
your program.
+

clang++ -xhip  --offload-arch=gfx906 sample.o -o sample 
would also work?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156426/new/

https://reviews.llvm.org/D156426

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D155213: [HIP] Add `-fno-offload-uniform-block`

2023-07-27 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan added inline comments.



Comment at: clang/lib/CodeGen/CGCall.cpp:2391
 
 if (TargetDecl->hasAttr()) {
   if (getLangOpts().OpenCLVersion <= 120) {

The block here needs to be aware of this new flag.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155213/new/

https://reviews.llvm.org/D155213

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154123: [HIP] Start document HIP support by clang

2023-07-24 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan accepted this revision.
scchan added a comment.
This revision is now accepted and ready to land.

LGTM thanks


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154123/new/

https://reviews.llvm.org/D154123

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154123: [HIP] Start document HIP support by clang

2023-07-24 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan added inline comments.



Comment at: clang/docs/HIPSupport.rst:65
+
+   clang++ --offload-arch=gfx906 -xhip sample.cpp -o sample
+

missing --hip-link



Comment at: clang/docs/HIPSupport.rst:77
+
+   clang++ --offload-arch=native -xhip sample.cpp -o sample
+

missing --hip-link


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154123/new/

https://reviews.llvm.org/D154123

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D155539: [CUDA][HIP] Use the same default language std as C++

2023-07-18 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan accepted this revision.
scchan added a comment.
This revision is now accepted and ready to land.

LGTM  thanks


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155539/new/

https://reviews.llvm.org/D155539

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154881: [HIP] Ignore host linker flags for device-only

2023-07-17 Thread Siu Chi Chan via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGcbc4bbb85c72: [HIP] Ignore host linker flags for device-only 
(authored by scchan).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154881/new/

https://reviews.llvm.org/D154881

Files:
  clang/lib/Driver/Driver.cpp
  clang/test/Driver/hip-phases.hip

Index: clang/test/Driver/hip-phases.hip
===
--- clang/test/Driver/hip-phases.hip
+++ clang/test/Driver/hip-phases.hip
@@ -219,6 +219,14 @@
 // RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
 // RUN: --cuda-gpu-arch=gfx803 %s --cuda-device-only 2>&1 \
 // RUN: | FileCheck -check-prefixes=DBIN %s
+//
+// Test single gpu architecture with complete compilation in device-only
+// compilation mode with an unused host linker flag.
+//
+// RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
+// RUN: --cuda-gpu-arch=gfx803 %s --cuda-device-only -Wl,--disable-new-dtags 2>&1 \
+// RUN: | FileCheck -check-prefixes=DBIN %s
+
 // DBIN-DAG: [[P0:[0-9]+]]: input, "{{.*}}hip-phases.hip", [[T:hip]], (device-[[T]], [[ARCH:gfx803]])
 // DBIN-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, (device-[[T]], [[ARCH]])
 // DBIN-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-[[T]], [[ARCH]])
@@ -229,6 +237,7 @@
 // DBIN-DAG: [[P7:[0-9]+]]: linker, {[[P6]]}, hip-fatbin, (device-hip, )
 // DBIN-DAG: [[P8:[0-9]+]]: offload, "device-[[T]] (amdgcn-amd-amdhsa:)" {[[P7]]}, hip-fatbin
 // DBIN-NOT: host
+
 //
 // Test single gpu architecture up to the assemble phase in device-only
 // compilation mode.
@@ -251,6 +260,11 @@
 // RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
 // RUN: --cuda-gpu-arch=gfx803 %s --cuda-device-only -fhip-emit-relocatable 2>&1 \
 // RUN: | FileCheck -check-prefixes=RELOC %s
+//
+// RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
+// RUN: --cuda-gpu-arch=gfx803 %s --cuda-device-only -fhip-emit-relocatable -Wl,--disable-new-dtags \
+// RUN: 2>&1 | FileCheck -check-prefixes=RELOC %s
+//
 // RELOC-DAG: [[P0:[0-9]+]]: input, "{{.*}}hip-phases.hip", [[T:hip]], (device-[[T]], [[ARCH:gfx803]])
 // RELOC-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, (device-[[T]], [[ARCH]])
 // RELOC-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-[[T]], [[ARCH]])
@@ -258,6 +272,7 @@
 // RELOC-DAG: [[P4:[0-9]+]]: assembler, {[[P3]]}, object, (device-[[T]], [[ARCH]])
 // RELOC-NOT: linker
 // RELOC-DAG: [[P5:[0-9]+]]: offload, "device-[[T]] (amdgcn-amd-amdhsa:[[ARCH]])" {[[P4]]}, object
+// RELOC-NOT: host
 
 //
 // Test two gpu architectures with compile to relocatable in device-only
@@ -266,6 +281,11 @@
 // RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
 // RUN: --cuda-gpu-arch=gfx803 --cuda-gpu-arch=gfx900 %s --cuda-device-only -fhip-emit-relocatable 2>&1 \
 // RUN: | FileCheck -check-prefixes=RELOC2 %s
+//
+// RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
+// RUN: --cuda-gpu-arch=gfx803 --cuda-gpu-arch=gfx900 %s --cuda-device-only -fhip-emit-relocatable \
+// RUN: -Wl,--disable-new-dtags 2>&1 | FileCheck -check-prefixes=RELOC2 %s
+//
 // RELOC2-DAG: [[P0:[0-9]+]]: input, "{{.*}}hip-phases.hip", [[T:hip]], (device-[[T]], [[ARCH:gfx803]])
 // RELOC2-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, (device-[[T]], [[ARCH]])
 // RELOC2-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-[[T]], [[ARCH]])
@@ -280,6 +300,7 @@
 // RELOC2-DAG: [[P10:[0-9]+]]: assembler, {[[P9]]}, object, (device-[[T]], [[ARCH2]])
 // RELOC2-NOT: linker
 // RELOC2-DAG: [[P11:[0-9]+]]: offload, "device-[[T]] (amdgcn-amd-amdhsa:[[ARCH2]])" {[[P10]]}, object
+// RELOC2-NOT: host
 
 //
 // Test two gpu architectures with complete compilation in device-only
@@ -288,6 +309,14 @@
 // RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
 // RUN: --cuda-gpu-arch=gfx803 --cuda-gpu-arch=gfx900 %s --cuda-device-only \
 // RUN: 2>&1 | FileCheck -check-prefixes=DBIN2 %s
+//
+// Test two gpu architectures with complete compilation in device-only
+// compilation mode with an unused host linker flag.
+//
+// RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
+// RUN: --cuda-gpu-arch=gfx803 --cuda-gpu-arch=gfx900 %s --cuda-device-only \
+// RUN: -Wl,--disable-new-dtags 2>&1 | FileCheck -check-prefixes=DBIN2 %s
+
 // DBIN2-DAG: [[P0:[0-9]+]]: input, "{{.*}}hip-phases.hip", [[T:hip]], (device-[[T]], [[ARCH:gfx803]])
 // DBIN2-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, (device-[[T]], [[ARCH]])
 // DBIN2-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-[[T]], [[ARCH]])
@@ -305,6 +334,7 @@
 // DBIN2-DAG: [[P14:[0-9]+]]: linker, {[[P6]], [[P13]]}, hip-fatbin, (device-hip, )
 // DBIN2-DAG: [[P15:[0-9]+]]: offload, "device-[[T]] (amdgcn-amd-amdhsa:)" {[[P14]]}, hip-fatbin
 // 

[PATCH] D154881: [HIP] Ignore host linker flags for device-only

2023-07-17 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan updated this revision to Diff 541060.
scchan added a comment.

minor formatting fixes, rebased


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154881/new/

https://reviews.llvm.org/D154881

Files:
  clang/lib/Driver/Driver.cpp
  clang/test/Driver/hip-phases.hip

Index: clang/test/Driver/hip-phases.hip
===
--- clang/test/Driver/hip-phases.hip
+++ clang/test/Driver/hip-phases.hip
@@ -219,6 +219,14 @@
 // RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
 // RUN: --cuda-gpu-arch=gfx803 %s --cuda-device-only 2>&1 \
 // RUN: | FileCheck -check-prefixes=DBIN %s
+//
+// Test single gpu architecture with complete compilation in device-only
+// compilation mode with an unused host linker flag.
+//
+// RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
+// RUN: --cuda-gpu-arch=gfx803 %s --cuda-device-only -Wl,--disable-new-dtags 2>&1 \
+// RUN: | FileCheck -check-prefixes=DBIN %s
+
 // DBIN-DAG: [[P0:[0-9]+]]: input, "{{.*}}hip-phases.hip", [[T:hip]], (device-[[T]], [[ARCH:gfx803]])
 // DBIN-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, (device-[[T]], [[ARCH]])
 // DBIN-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-[[T]], [[ARCH]])
@@ -229,6 +237,7 @@
 // DBIN-DAG: [[P7:[0-9]+]]: linker, {[[P6]]}, hip-fatbin, (device-hip, )
 // DBIN-DAG: [[P8:[0-9]+]]: offload, "device-[[T]] (amdgcn-amd-amdhsa:)" {[[P7]]}, hip-fatbin
 // DBIN-NOT: host
+
 //
 // Test single gpu architecture up to the assemble phase in device-only
 // compilation mode.
@@ -251,6 +260,11 @@
 // RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
 // RUN: --cuda-gpu-arch=gfx803 %s --cuda-device-only -fhip-emit-relocatable 2>&1 \
 // RUN: | FileCheck -check-prefixes=RELOC %s
+//
+// RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
+// RUN: --cuda-gpu-arch=gfx803 %s --cuda-device-only -fhip-emit-relocatable -Wl,--disable-new-dtags \
+// RUN: 2>&1 | FileCheck -check-prefixes=RELOC %s
+//
 // RELOC-DAG: [[P0:[0-9]+]]: input, "{{.*}}hip-phases.hip", [[T:hip]], (device-[[T]], [[ARCH:gfx803]])
 // RELOC-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, (device-[[T]], [[ARCH]])
 // RELOC-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-[[T]], [[ARCH]])
@@ -258,6 +272,7 @@
 // RELOC-DAG: [[P4:[0-9]+]]: assembler, {[[P3]]}, object, (device-[[T]], [[ARCH]])
 // RELOC-NOT: linker
 // RELOC-DAG: [[P5:[0-9]+]]: offload, "device-[[T]] (amdgcn-amd-amdhsa:[[ARCH]])" {[[P4]]}, object
+// RELOC-NOT: host
 
 //
 // Test two gpu architectures with compile to relocatable in device-only
@@ -266,6 +281,11 @@
 // RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
 // RUN: --cuda-gpu-arch=gfx803 --cuda-gpu-arch=gfx900 %s --cuda-device-only -fhip-emit-relocatable 2>&1 \
 // RUN: | FileCheck -check-prefixes=RELOC2 %s
+//
+// RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
+// RUN: --cuda-gpu-arch=gfx803 --cuda-gpu-arch=gfx900 %s --cuda-device-only -fhip-emit-relocatable \
+// RUN: -Wl,--disable-new-dtags 2>&1 | FileCheck -check-prefixes=RELOC2 %s
+//
 // RELOC2-DAG: [[P0:[0-9]+]]: input, "{{.*}}hip-phases.hip", [[T:hip]], (device-[[T]], [[ARCH:gfx803]])
 // RELOC2-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, (device-[[T]], [[ARCH]])
 // RELOC2-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-[[T]], [[ARCH]])
@@ -280,6 +300,7 @@
 // RELOC2-DAG: [[P10:[0-9]+]]: assembler, {[[P9]]}, object, (device-[[T]], [[ARCH2]])
 // RELOC2-NOT: linker
 // RELOC2-DAG: [[P11:[0-9]+]]: offload, "device-[[T]] (amdgcn-amd-amdhsa:[[ARCH2]])" {[[P10]]}, object
+// RELOC2-NOT: host
 
 //
 // Test two gpu architectures with complete compilation in device-only
@@ -288,6 +309,14 @@
 // RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
 // RUN: --cuda-gpu-arch=gfx803 --cuda-gpu-arch=gfx900 %s --cuda-device-only \
 // RUN: 2>&1 | FileCheck -check-prefixes=DBIN2 %s
+//
+// Test two gpu architectures with complete compilation in device-only
+// compilation mode with an unused host linker flag.
+//
+// RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
+// RUN: --cuda-gpu-arch=gfx803 --cuda-gpu-arch=gfx900 %s --cuda-device-only \
+// RUN: -Wl,--disable-new-dtags 2>&1 | FileCheck -check-prefixes=DBIN2 %s
+
 // DBIN2-DAG: [[P0:[0-9]+]]: input, "{{.*}}hip-phases.hip", [[T:hip]], (device-[[T]], [[ARCH:gfx803]])
 // DBIN2-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, (device-[[T]], [[ARCH]])
 // DBIN2-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-[[T]], [[ARCH]])
@@ -305,6 +334,7 @@
 // DBIN2-DAG: [[P14:[0-9]+]]: linker, {[[P6]], [[P13]]}, hip-fatbin, (device-hip, )
 // DBIN2-DAG: [[P15:[0-9]+]]: offload, "device-[[T]] (amdgcn-amd-amdhsa:)" {[[P14]]}, hip-fatbin
 // DBIN2-NOT: host
+
 //
 // Test two gpu architectures up to the assemble phase 

[PATCH] D155213: [HIP] Add `-fno-hip-uniform-block`

2023-07-14 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan added inline comments.



Comment at: clang/include/clang/Driver/Options.td:1092
   ShouldParseIf;
+defm hip_uniform_block : BoolFOption<"hip-uniform-block",
+  LangOpts<"HIPUniformBlock">, DefaultTrue,

arsenm wrote:
> Can we avoid adding yet another language flag for something that's reusable 
> for everything? Is there an --offload- ?
Don't we need a different default value for some languages like OpenCL?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155213/new/

https://reviews.llvm.org/D155213

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154881: [HIP] Ignore host linker flags for device-only

2023-07-14 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan updated this revision to Diff 540481.
scchan added a comment.

added extra checks for not host, rebased


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154881/new/

https://reviews.llvm.org/D154881

Files:
  clang/lib/Driver/Driver.cpp
  clang/test/Driver/hip-phases.hip

Index: clang/test/Driver/hip-phases.hip
===
--- clang/test/Driver/hip-phases.hip
+++ clang/test/Driver/hip-phases.hip
@@ -219,6 +219,14 @@
 // RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
 // RUN: --cuda-gpu-arch=gfx803 %s --cuda-device-only 2>&1 \
 // RUN: | FileCheck -check-prefixes=DBIN %s
+//
+// Test single gpu architecture with complete compilation in device-only
+// compilation mode with an unused host linker flag.
+//
+// RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
+// RUN: --cuda-gpu-arch=gfx803 %s --cuda-device-only -Wl,--disable-new-dtags 2>&1 \
+// RUN: | FileCheck -check-prefixes=DBIN %s
+
 // DBIN-DAG: [[P0:[0-9]+]]: input, "{{.*}}hip-phases.hip", [[T:hip]], (device-[[T]], [[ARCH:gfx803]])
 // DBIN-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, (device-[[T]], [[ARCH]])
 // DBIN-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-[[T]], [[ARCH]])
@@ -229,6 +237,7 @@
 // DBIN-DAG: [[P7:[0-9]+]]: linker, {[[P6]]}, hip-fatbin, (device-hip, )
 // DBIN-DAG: [[P8:[0-9]+]]: offload, "device-[[T]] (amdgcn-amd-amdhsa:)" {[[P7]]}, hip-fatbin
 // DBIN-NOT: host
+
 //
 // Test single gpu architecture up to the assemble phase in device-only
 // compilation mode.
@@ -251,6 +260,11 @@
 // RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
 // RUN: --cuda-gpu-arch=gfx803 %s --cuda-device-only -fhip-emit-relocatable 2>&1 \
 // RUN: | FileCheck -check-prefixes=RELOC %s
+//
+// RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
+// RUN: --cuda-gpu-arch=gfx803 %s --cuda-device-only -fhip-emit-relocatable -Wl,--disable-new-dtags \
+// RUN: 2>&1 | FileCheck -check-prefixes=RELOC %s
+//
 // RELOC-DAG: [[P0:[0-9]+]]: input, "{{.*}}hip-phases.hip", [[T:hip]], (device-[[T]], [[ARCH:gfx803]])
 // RELOC-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, (device-[[T]], [[ARCH]])
 // RELOC-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-[[T]], [[ARCH]])
@@ -258,6 +272,7 @@
 // RELOC-DAG: [[P4:[0-9]+]]: assembler, {[[P3]]}, object, (device-[[T]], [[ARCH]])
 // RELOC-NOT: linker
 // RELOC-DAG: [[P5:[0-9]+]]: offload, "device-[[T]] (amdgcn-amd-amdhsa:[[ARCH]])" {[[P4]]}, object
+// RELOC-NOT: host
 
 //
 // Test two gpu architectures with compile to relocatable in device-only
@@ -266,6 +281,11 @@
 // RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
 // RUN: --cuda-gpu-arch=gfx803 --cuda-gpu-arch=gfx900 %s --cuda-device-only -fhip-emit-relocatable 2>&1 \
 // RUN: | FileCheck -check-prefixes=RELOC2 %s
+//
+// RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
+// RUN: --cuda-gpu-arch=gfx803 --cuda-gpu-arch=gfx900 %s --cuda-device-only -fhip-emit-relocatable \
+// RUN: -Wl,--disable-new-dtags 2>&1 | FileCheck -check-prefixes=RELOC2 %s
+//
 // RELOC2-DAG: [[P0:[0-9]+]]: input, "{{.*}}hip-phases.hip", [[T:hip]], (device-[[T]], [[ARCH:gfx803]])
 // RELOC2-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, (device-[[T]], [[ARCH]])
 // RELOC2-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-[[T]], [[ARCH]])
@@ -280,6 +300,7 @@
 // RELOC2-DAG: [[P10:[0-9]+]]: assembler, {[[P9]]}, object, (device-[[T]], [[ARCH2]])
 // RELOC2-NOT: linker
 // RELOC2-DAG: [[P11:[0-9]+]]: offload, "device-[[T]] (amdgcn-amd-amdhsa:[[ARCH2]])" {[[P10]]}, object
+// RELOC2-NOT: host
 
 //
 // Test two gpu architectures with complete compilation in device-only
@@ -288,6 +309,14 @@
 // RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
 // RUN: --cuda-gpu-arch=gfx803 --cuda-gpu-arch=gfx900 %s --cuda-device-only \
 // RUN: 2>&1 | FileCheck -check-prefixes=DBIN2 %s
+//
+// Test two gpu architectures with complete compilation in device-only
+// compilation mode with an unused host linker flag.
+//
+// RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
+// RUN: --cuda-gpu-arch=gfx803 --cuda-gpu-arch=gfx900 %s --cuda-device-only \
+// RUN: -Wl,--disable-new-dtags 2>&1 | FileCheck -check-prefixes=DBIN2 %s
+
 // DBIN2-DAG: [[P0:[0-9]+]]: input, "{{.*}}hip-phases.hip", [[T:hip]], (device-[[T]], [[ARCH:gfx803]])
 // DBIN2-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, (device-[[T]], [[ARCH]])
 // DBIN2-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-[[T]], [[ARCH]])
@@ -305,6 +334,7 @@
 // DBIN2-DAG: [[P14:[0-9]+]]: linker, {[[P6]], [[P13]]}, hip-fatbin, (device-hip, )
 // DBIN2-DAG: [[P15:[0-9]+]]: offload, "device-[[T]] (amdgcn-amd-amdhsa:)" {[[P14]]}, hip-fatbin
 // DBIN2-NOT: host
+
 //
 // Test two gpu architectures up to the 

[PATCH] D154881: [HIP] Ignore host linker flags for device-only

2023-07-11 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan updated this revision to Diff 539173.
scchan added a comment.

addressed comments from yaxunl, rebased


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154881/new/

https://reviews.llvm.org/D154881

Files:
  clang/lib/Driver/Driver.cpp
  clang/test/Driver/hip-phases.hip

Index: clang/test/Driver/hip-phases.hip
===
--- clang/test/Driver/hip-phases.hip
+++ clang/test/Driver/hip-phases.hip
@@ -219,6 +219,14 @@
 // RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
 // RUN: --cuda-gpu-arch=gfx803 %s --cuda-device-only 2>&1 \
 // RUN: | FileCheck -check-prefixes=DBIN %s
+//
+// Test single gpu architecture with complete compilation in device-only
+// compilation mode with an unused host linker flag.
+//
+// RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
+// RUN: --cuda-gpu-arch=gfx803 %s --cuda-device-only -Wl,--disable-new-dtags 2>&1 \
+// RUN: | FileCheck -check-prefixes=DBIN %s
+
 // DBIN-DAG: [[P0:[0-9]+]]: input, "{{.*}}hip-phases.hip", [[T:hip]], (device-[[T]], [[ARCH:gfx803]])
 // DBIN-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, (device-[[T]], [[ARCH]])
 // DBIN-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-[[T]], [[ARCH]])
@@ -229,6 +237,7 @@
 // DBIN-DAG: [[P7:[0-9]+]]: linker, {[[P6]]}, hip-fatbin, (device-hip, )
 // DBIN-DAG: [[P8:[0-9]+]]: offload, "device-[[T]] (amdgcn-amd-amdhsa:)" {[[P7]]}, hip-fatbin
 // DBIN-NOT: host
+
 //
 // Test single gpu architecture up to the assemble phase in device-only
 // compilation mode.
@@ -251,6 +260,11 @@
 // RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
 // RUN: --cuda-gpu-arch=gfx803 %s --cuda-device-only -fhip-emit-relocatable 2>&1 \
 // RUN: | FileCheck -check-prefixes=RELOC %s
+//
+// RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
+// RUN: --cuda-gpu-arch=gfx803 %s --cuda-device-only -fhip-emit-relocatable -Wl,--disable-new-dtags \
+// RUN: 2>&1 | FileCheck -check-prefixes=RELOC %s
+//
 // RELOC-DAG: [[P0:[0-9]+]]: input, "{{.*}}hip-phases.hip", [[T:hip]], (device-[[T]], [[ARCH:gfx803]])
 // RELOC-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, (device-[[T]], [[ARCH]])
 // RELOC-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-[[T]], [[ARCH]])
@@ -266,6 +280,11 @@
 // RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
 // RUN: --cuda-gpu-arch=gfx803 --cuda-gpu-arch=gfx900 %s --cuda-device-only -fhip-emit-relocatable 2>&1 \
 // RUN: | FileCheck -check-prefixes=RELOC2 %s
+//
+// RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
+// RUN: --cuda-gpu-arch=gfx803 --cuda-gpu-arch=gfx900 %s --cuda-device-only -fhip-emit-relocatable \
+// RUN: -Wl,--disable-new-dtags 2>&1 | FileCheck -check-prefixes=RELOC2 %s
+//
 // RELOC2-DAG: [[P0:[0-9]+]]: input, "{{.*}}hip-phases.hip", [[T:hip]], (device-[[T]], [[ARCH:gfx803]])
 // RELOC2-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, (device-[[T]], [[ARCH]])
 // RELOC2-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-[[T]], [[ARCH]])
@@ -288,6 +307,14 @@
 // RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
 // RUN: --cuda-gpu-arch=gfx803 --cuda-gpu-arch=gfx900 %s --cuda-device-only \
 // RUN: 2>&1 | FileCheck -check-prefixes=DBIN2 %s
+//
+// Test two gpu architectures with complete compilation in device-only
+// compilation mode with an unused host linker flag.
+//
+// RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
+// RUN: --cuda-gpu-arch=gfx803 --cuda-gpu-arch=gfx900 %s --cuda-device-only \
+// RUN: -Wl,--disable-new-dtags 2>&1 | FileCheck -check-prefixes=DBIN2 %s
+
 // DBIN2-DAG: [[P0:[0-9]+]]: input, "{{.*}}hip-phases.hip", [[T:hip]], (device-[[T]], [[ARCH:gfx803]])
 // DBIN2-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, (device-[[T]], [[ARCH]])
 // DBIN2-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-[[T]], [[ARCH]])
@@ -305,6 +332,7 @@
 // DBIN2-DAG: [[P14:[0-9]+]]: linker, {[[P6]], [[P13]]}, hip-fatbin, (device-hip, )
 // DBIN2-DAG: [[P15:[0-9]+]]: offload, "device-[[T]] (amdgcn-amd-amdhsa:)" {[[P14]]}, hip-fatbin
 // DBIN2-NOT: host
+
 //
 // Test two gpu architectures up to the assemble phase in device-only
 // compilation mode.
@@ -357,11 +385,21 @@
 // RUN: --cuda-gpu-arch=gfx803 --cuda-gpu-arch=gfx900 %t/obj1.o %t/obj2.o \
 // RUN: -fgpu-rdc --cuda-device-only 2>&1 | FileCheck -check-prefixes=L2,RL2,RL2-DEV %s
 
+// RUN: %clang --target=x86_64-unknown-linux-gnu -ccc-print-phases --hip-link \
+// RUN: --cuda-gpu-arch=gfx803 --cuda-gpu-arch=gfx900 %t/obj1.o %t/obj2.o \
+// RUN: -fgpu-rdc --cuda-device-only -Wl,--disable-new-dtags 2>&1 \
+// RUN: | FileCheck -check-prefixes=L2,RL2,RL2-DEV %s
+
 // RUN: %clang --target=x86_64-unknown-linux-gnu -ccc-print-phases --hip-link \
 // RUN: --cuda-gpu-arch=gfx803 --cuda-gpu-arch=gfx900 

[PATCH] D154881: [HIP] Ignore host linker flags for device-only

2023-07-10 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan created this revision.
scchan added a reviewer: yaxunl.
Herald added a project: All.
scchan requested review of this revision.
Herald added subscribers: cfe-commits, MaskRay.
Herald added a project: clang.

When compiling in device only mode (e.g. --offload-device-only), the
host linker phase would not happen and therefore, the driver should
ignore all the host linker flags.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D154881

Files:
  clang/lib/Driver/Driver.cpp
  clang/test/Driver/hip-phases.hip


Index: clang/test/Driver/hip-phases.hip
===
--- clang/test/Driver/hip-phases.hip
+++ clang/test/Driver/hip-phases.hip
@@ -229,6 +229,25 @@
 // DBIN-DAG: [[P7:[0-9]+]]: linker, {[[P6]]}, hip-fatbin, (device-hip, )
 // DBIN-DAG: [[P8:[0-9]+]]: offload, "device-[[T]] (amdgcn-amd-amdhsa:)" 
{[[P7]]}, hip-fatbin
 // DBIN-NOT: host
+
+//
+// Test single gpu architecture with complete compilation in device-only
+// compilation mode with an unused host linker flag.
+//
+// RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
+// RUN: --cuda-gpu-arch=gfx803 %s --cuda-device-only -Wl,--disable-new-dtags 
2>&1 \
+// RUN: | FileCheck -check-prefixes=HLFDBIN %s
+// HLFDBIN-DAG: [[P0:[0-9]+]]: input, "{{.*}}hip-phases.hip", [[T:hip]], 
(device-[[T]], [[ARCH:gfx803]])
+// HLFDBIN-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, 
(device-[[T]], [[ARCH]])
+// HLFDBIN-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-[[T]], [[ARCH]])
+// HLFDBIN-DAG: [[P3:[0-9]+]]: backend, {[[P2]]}, assembler, (device-[[T]], 
[[ARCH]])
+// HLFDBIN-DAG: [[P4:[0-9]+]]: assembler, {[[P3]]}, object, (device-[[T]], 
[[ARCH]])
+// HLFDBIN-DAG: [[P5:[0-9]+]]: linker, {[[P4]]}, image, (device-[[T]], 
[[ARCH]])
+// HLFDBIN-DAG: [[P6:[0-9]+]]: offload, "device-[[T]] 
(amdgcn-amd-amdhsa:[[ARCH]])" {[[P5]]}, image
+// HLFDBIN-DAG: [[P7:[0-9]+]]: linker, {[[P6]]}, hip-fatbin, (device-hip, )
+// HLFDBIN-DAG: [[P8:[0-9]+]]: offload, "device-[[T]] (amdgcn-amd-amdhsa:)" 
{[[P7]]}, hip-fatbin
+// HLFDBIN-NOT: host
+
 //
 // Test single gpu architecture up to the assemble phase in device-only
 // compilation mode.
@@ -305,6 +324,32 @@
 // DBIN2-DAG: [[P14:[0-9]+]]: linker, {[[P6]], [[P13]]}, hip-fatbin, 
(device-hip, )
 // DBIN2-DAG: [[P15:[0-9]+]]: offload, "device-[[T]] (amdgcn-amd-amdhsa:)" 
{[[P14]]}, hip-fatbin
 // DBIN2-NOT: host
+
+//
+// Test two gpu architectures with complete compilation in device-only
+// compilation mode with an unused host linker flag.
+//
+// RUN: %clang -x hip --target=x86_64-unknown-linux-gnu -ccc-print-phases \
+// RUN: --cuda-gpu-arch=gfx803 --cuda-gpu-arch=gfx900 %s --cuda-device-only 
-Wl,--disable-new-dtags \
+// RUN: 2>&1 | FileCheck -check-prefixes=HLFDBIN2 %s
+// HLFDBIN2-DAG: [[P0:[0-9]+]]: input, "{{.*}}hip-phases.hip", [[T:hip]], 
(device-[[T]], [[ARCH:gfx803]])
+// HLFDBIN2-DAG: [[P1:[0-9]+]]: preprocessor, {[[P0]]}, [[T]]-cpp-output, 
(device-[[T]], [[ARCH]])
+// HLFDBIN2-DAG: [[P2:[0-9]+]]: compiler, {[[P1]]}, ir, (device-[[T]], 
[[ARCH]])
+// HLFDBIN2-DAG: [[P3:[0-9]+]]: backend, {[[P2]]}, assembler, (device-[[T]], 
[[ARCH]])
+// HLFDBIN2-DAG: [[P4:[0-9]+]]: assembler, {[[P3]]}, object, (device-[[T]], 
[[ARCH]])
+// HLFDBIN2-DAG: [[P5:[0-9]+]]: linker, {[[P4]]}, image, (device-[[T]], 
[[ARCH]])
+// HLFDBIN2-DAG: [[P6:[0-9]+]]: offload, "device-[[T]] 
(amdgcn-amd-amdhsa:[[ARCH]])" {[[P5]]}, image
+// HLFDBIN2-DAG: [[P7:[0-9]+]]: input, "{{.*}}hip-phases.hip", [[T]], 
(device-[[T]], [[ARCH2:gfx900]])
+// HLFDBIN2-DAG: [[P8:[0-9]+]]: preprocessor, {[[P7]]}, [[T]]-cpp-output, 
(device-[[T]], [[ARCH2]])
+// HLFDBIN2-DAG: [[P9:[0-9]+]]: compiler, {[[P8]]}, ir, (device-[[T]], 
[[ARCH2]])
+// HLFDBIN2-DAG: [[P10:[0-9]+]]: backend, {[[P9]]}, assembler, (device-[[T]], 
[[ARCH2]])
+// HLFDBIN2-DAG: [[P11:[0-9]+]]: assembler, {[[P10]]}, object, (device-[[T]], 
[[ARCH2]])
+// HLFDBIN2-DAG: [[P12:[0-9]+]]: linker, {[[P11]]}, image, (device-[[T]], 
[[ARCH2]])
+// HLFDBIN2-DAG: [[P13:[0-9]+]]: offload, "device-[[T]] 
(amdgcn-amd-amdhsa:[[ARCH2]])" {[[P12]]}, image
+// HLFDBIN2-DAG: [[P14:[0-9]+]]: linker, {[[P6]], [[P13]]}, hip-fatbin, 
(device-hip, )
+// HLFDBIN2-DAG: [[P15:[0-9]+]]: offload, "device-[[T]] (amdgcn-amd-amdhsa:)" 
{[[P14]]}, hip-fatbin
+// HLFDBIN2-NOT: host
+
 //
 // Test two gpu architectures up to the assemble phase in device-only
 // compilation mode.
Index: clang/lib/Driver/Driver.cpp
===
--- clang/lib/Driver/Driver.cpp
+++ clang/lib/Driver/Driver.cpp
@@ -4144,8 +4144,10 @@
   if (Phase == phases::Link) {
 assert(Phase == PL.back() && "linking must be final compilation 
step.");
 // We don't need to generate additional link commands if emitting AMD 
bitcode
+// or compiling only for the offload device
 if (!(C.getInputArgs().hasArg(options::OPT_hip_link) &&
- 

[PATCH] D154133: [amdgpu] start documenting amdgpu support by clang

2023-07-07 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan added inline comments.



Comment at: clang/docs/AMDGPUSupport.rst:47
+   * - ``__amdgcn_feature___``
+ - Defined for each supported target feature. The value is 1 if the 
feature is enabled and 0 if it is disabled. Allowed feature names are sramecc 
and xnack.
+   * - ``__AMDGCN_CUMODE__``

yaxunl wrote:
> scchan wrote:
> > This set of feature macros is tricky.  A feature macro is defined only when 
> > the corresponding target feature has been explicitly specified during 
> > compilation; otherwise it's undefined.  Users will have to refer to the 
> > target feature table to get the semantic of each state.  For example, for 
> > xnack, undefined=="any", 1 for xnack+, 0 for xnack-.
> > 
> > Is there a reason we don't support features other than sramecc and xnack?
> This macro is designed to support Target ID only, therefore it is only 
> emitted when target ID containing the feature is used.
> 
> The target ID features are stable since they are used to select code objects 
> from fat binaries.
> 
> My understanding is that other features used by amdgpu backend do not have 
> stable feature names, therefore are not suitable to be used as predefined 
> macros for users to condition their code.
The current wording seems to imply that a feature macro is always defined if 
that target feature is supported but that's not the case.  For example, the 
macro __amdgcn_feature_xnack__ is undefined if a user compiles for xnack="any".


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154133/new/

https://reviews.llvm.org/D154133

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154123: [HIP] Start document HIP support by clang

2023-07-07 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan added inline comments.



Comment at: clang/docs/HIPSupport.rst:82
+===
+
+1. **Compiler Options**: If you specify paths using the compiler options, 
these will override the correponding paths set via environment variables. 

I think it would be easier to understand by explicitly calling out the order of 
precedence:

- For HIP path:
  - --hip-path option
  - HIP_PATH env var
  - --rocm-path option
 - ROCM_PATH env var
 - default automatic detection


-- For device lib:
  - --hip-device-lib-path option
  - HIP_DEVICE_LIB_PATH env var
  - --rocm-path option
  -  ROCM_PATH env var
  - default automatic detection



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154123/new/

https://reviews.llvm.org/D154123

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154133: [amdgpu] start documenting amdgpu support by clang

2023-07-07 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan added inline comments.



Comment at: clang/docs/AMDGPUSupport.rst:47
+   * - ``__amdgcn_feature___``
+ - Defined for each supported target feature. The value is 1 if the 
feature is enabled and 0 if it is disabled. Allowed feature names are sramecc 
and xnack.
+   * - ``__AMDGCN_CUMODE__``

This set of feature macros is tricky.  A feature macro is defined only when the 
corresponding target feature has been explicitly specified during compilation; 
otherwise it's undefined.  Users will have to refer to the target feature table 
to get the semantic of each state.  For example, for xnack, undefined=="any", 1 
for xnack+, 0 for xnack-.

Is there a reason we don't support features other than sramecc and xnack?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154133/new/

https://reviews.llvm.org/D154133

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154123: [HIP] Start document HIP support by clang

2023-06-29 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan added inline comments.



Comment at: clang/docs/HIPSupport.rst:33
+
+   clang -c --offload-arch=gfx906 test.hip -o test.o
+

can we add -xhip to enable HIP for files without the .hip file ext?



Comment at: clang/docs/HIPSupport.rst:56
+Environment Variables
+=
+

I'd suggest adding a section to cover compiler options to pass various hip/rocm 
related paths and that using the compiler options would be a much more 
preferable alternative.



Comment at: clang/docs/HIPSupport.rst:87
+   * - ``__HIPCC__``
+ - This macro indicates that the code is being compiled using the HIP 
compiler.
+   * - ``__HIP_MEMORY_SCOPE_SINGLETHREAD``

I think this macro is essentially an alias to __HIP__?



Comment at: clang/docs/HIPSupport.rst:99
+   * - ``__HIP_DEVICE_COMPILE__``
+ - This macro is defined when the code is being compiled for a HIP device.
+   * - ``__HIP_NO_IMAGE_SUPPORT``

It might be helpful to provide more context here as to how a HIP source file 
gets compiled (once for the host and then once for each device arch).  Also I'd 
suggest moving this up closer to __HIP__ given they are related.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154123/new/

https://reviews.llvm.org/D154123

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D152207: [HIP] Instruct lld to go through all archives

2023-06-09 Thread Siu Chi Chan via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGf1aee32f1c85: [HIP] Instruct lld to go through all archives 
(authored by scchan).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152207/new/

https://reviews.llvm.org/D152207

Files:
  clang/lib/Driver/ToolChains/HIPAMD.cpp
  clang/test/Driver/hip-toolchain-rdc-separate.hip
  clang/test/Driver/hip-toolchain-rdc-static-lib.hip


Index: clang/test/Driver/hip-toolchain-rdc-static-lib.hip
===
--- clang/test/Driver/hip-toolchain-rdc-static-lib.hip
+++ clang/test/Driver/hip-toolchain-rdc-static-lib.hip
@@ -80,7 +80,9 @@
 // CHECK-NOT: ".*llc"
 // CHECK: [[LLD]] {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // CHECK-SAME: "-plugin-opt=mcpu=gfx900"
+// CHECK-SAME: "--whole-archive"
 // CHECK-SAME: "-o" "[[IMG_DEV2:.*out]]" [[A_BC2]] [[B_BC2]]
+// CHECK-SAME: "--no-whole-archive"
 
 // combine images generated into hip fat binary object
 // CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
Index: clang/test/Driver/hip-toolchain-rdc-separate.hip
===
--- clang/test/Driver/hip-toolchain-rdc-separate.hip
+++ clang/test/Driver/hip-toolchain-rdc-separate.hip
@@ -126,18 +126,22 @@
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx803"
+// LINK-SAME: "--whole-archive"
 // LLD-TMP-SAME: "-o" "[[IMG_DEV1:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx803]]"
 // LINK-SAME "[[A_BC1]]" "[[B_BC1]]"
+// LINK-SAME: "--no-whole-archive"
 
 // LINK-NOT: "*.llvm-link"
 // LINK-NOT: ".*opt"
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx900"
+// LINK-SAME: "--whole-archive"
 // LLD-TMP-SAME: "-o" "[[IMG_DEV2:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx900]]"
 // LINK-SAME "[[A_BC2]]" "[[B_BC2]]"
+// LINK-SAME: "--no-whole-archive"
 
 // LINK-BUNDLE: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
 // LINK-BUNDLE-SAME: 
"-targets={{.*}},hipv4-amdgcn-amd-amdhsa--gfx803,hipv4-amdgcn-amd-amdhsa--gfx900"
Index: clang/lib/Driver/ToolChains/HIPAMD.cpp
===
--- clang/lib/Driver/ToolChains/HIPAMD.cpp
+++ clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -152,6 +152,18 @@
 
   addLinkerCompressDebugSectionsOption(TC, Args, LldArgs);
 
+  // Given that host and device linking happen in separate processes, the 
device
+  // linker doesn't always have the visibility as to which device symbols are
+  // needed by a program, especially for the device symbol dependencies that 
are
+  // introduced through the host symbol resolution.
+  // For example: host_A() (A.obj) --> host_B(B.obj) --> device_kernel_B()
+  // (B.obj) In this case, the device linker doesn't know that A.obj actually
+  // depends on the kernel functions in B.obj.  When linking to static device
+  // library, the device linker may drop some of the device global symbols if
+  // they aren't referenced.  As a workaround, we are adding to the
+  // --whole-archive flag such that all global symbols would be linked in.
+  LldArgs.push_back("--whole-archive");
+
   for (auto *Arg : Args.filtered(options::OPT_Xoffload_linker)) {
 LldArgs.push_back(Arg->getValue(1));
 Arg->claim();
@@ -169,6 +181,8 @@
  /*IsBitCodeSDL=*/true,
  /*PostClangLink=*/false);
 
+  LldArgs.push_back("--no-whole-archive");
+
   const char *Lld = Args.MakeArgString(getToolChain().GetProgramPath("lld"));
   C.addCommand(std::make_unique(JA, *this, 
ResponseFileSupport::None(),
  Lld, LldArgs, Inputs, Output));


Index: clang/test/Driver/hip-toolchain-rdc-static-lib.hip
===
--- clang/test/Driver/hip-toolchain-rdc-static-lib.hip
+++ clang/test/Driver/hip-toolchain-rdc-static-lib.hip
@@ -80,7 +80,9 @@
 // CHECK-NOT: ".*llc"
 // CHECK: [[LLD]] {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // CHECK-SAME: "-plugin-opt=mcpu=gfx900"
+// CHECK-SAME: "--whole-archive"
 // CHECK-SAME: "-o" "[[IMG_DEV2:.*out]]" [[A_BC2]] [[B_BC2]]
+// CHECK-SAME: "--no-whole-archive"
 
 // combine images generated into hip fat binary object
 // CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
Index: clang/test/Driver/hip-toolchain-rdc-separate.hip
===
--- clang/test/Driver/hip-toolchain-rdc-separate.hip
+++ clang/test/Driver/hip-toolchain-rdc-separate.hip
@@ -126,18 +126,22 @@
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx803"
+// LINK-SAME: "--whole-archive"
 

[PATCH] D152207: [HIP] Instruct lld to go through all archives

2023-06-08 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan updated this revision to Diff 529667.
scchan added a comment.

rebased


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152207/new/

https://reviews.llvm.org/D152207

Files:
  clang/lib/Driver/ToolChains/HIPAMD.cpp
  clang/test/Driver/hip-toolchain-rdc-separate.hip
  clang/test/Driver/hip-toolchain-rdc-static-lib.hip


Index: clang/test/Driver/hip-toolchain-rdc-static-lib.hip
===
--- clang/test/Driver/hip-toolchain-rdc-static-lib.hip
+++ clang/test/Driver/hip-toolchain-rdc-static-lib.hip
@@ -80,7 +80,9 @@
 // CHECK-NOT: ".*llc"
 // CHECK: [[LLD]] {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // CHECK-SAME: "-plugin-opt=mcpu=gfx900"
+// CHECK-SAME: "--whole-archive"
 // CHECK-SAME: "-o" "[[IMG_DEV2:.*out]]" [[A_BC2]] [[B_BC2]]
+// CHECK-SAME: "--no-whole-archive"
 
 // combine images generated into hip fat binary object
 // CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
Index: clang/test/Driver/hip-toolchain-rdc-separate.hip
===
--- clang/test/Driver/hip-toolchain-rdc-separate.hip
+++ clang/test/Driver/hip-toolchain-rdc-separate.hip
@@ -126,18 +126,22 @@
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx803"
+// LINK-SAME: "--whole-archive"
 // LLD-TMP-SAME: "-o" "[[IMG_DEV1:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx803]]"
 // LINK-SAME "[[A_BC1]]" "[[B_BC1]]"
+// LINK-SAME: "--no-whole-archive"
 
 // LINK-NOT: "*.llvm-link"
 // LINK-NOT: ".*opt"
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx900"
+// LINK-SAME: "--whole-archive"
 // LLD-TMP-SAME: "-o" "[[IMG_DEV2:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx900]]"
 // LINK-SAME "[[A_BC2]]" "[[B_BC2]]"
+// LINK-SAME: "--no-whole-archive"
 
 // LINK-BUNDLE: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
 // LINK-BUNDLE-SAME: 
"-targets={{.*}},hipv4-amdgcn-amd-amdhsa--gfx803,hipv4-amdgcn-amd-amdhsa--gfx900"
Index: clang/lib/Driver/ToolChains/HIPAMD.cpp
===
--- clang/lib/Driver/ToolChains/HIPAMD.cpp
+++ clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -152,6 +152,18 @@
 
   addLinkerCompressDebugSectionsOption(TC, Args, LldArgs);
 
+  // Given that host and device linking happen in separate processes, the 
device
+  // linker doesn't always have the visibility as to which device symbols are
+  // needed by a program, especially for the device symbol dependencies that 
are
+  // introduced through the host symbol resolution.
+  // For example: host_A() (A.obj) --> host_B(B.obj) --> device_kernel_B()
+  // (B.obj) In this case, the device linker doesn't know that A.obj actually
+  // depends on the kernel functions in B.obj.  When linking to static device
+  // library, the device linker may drop some of the device global symbols if
+  // they aren't referenced.  As a workaround, we are adding to the
+  // --whole-archive flag such that all global symbols would be linked in.
+  LldArgs.push_back("--whole-archive");
+
   for (auto *Arg : Args.filtered(options::OPT_Xoffload_linker)) {
 LldArgs.push_back(Arg->getValue(1));
 Arg->claim();
@@ -169,6 +181,8 @@
  /*IsBitCodeSDL=*/true,
  /*PostClangLink=*/false);
 
+  LldArgs.push_back("--no-whole-archive");
+
   const char *Lld = Args.MakeArgString(getToolChain().GetProgramPath("lld"));
   C.addCommand(std::make_unique(JA, *this, 
ResponseFileSupport::None(),
  Lld, LldArgs, Inputs, Output));


Index: clang/test/Driver/hip-toolchain-rdc-static-lib.hip
===
--- clang/test/Driver/hip-toolchain-rdc-static-lib.hip
+++ clang/test/Driver/hip-toolchain-rdc-static-lib.hip
@@ -80,7 +80,9 @@
 // CHECK-NOT: ".*llc"
 // CHECK: [[LLD]] {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // CHECK-SAME: "-plugin-opt=mcpu=gfx900"
+// CHECK-SAME: "--whole-archive"
 // CHECK-SAME: "-o" "[[IMG_DEV2:.*out]]" [[A_BC2]] [[B_BC2]]
+// CHECK-SAME: "--no-whole-archive"
 
 // combine images generated into hip fat binary object
 // CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
Index: clang/test/Driver/hip-toolchain-rdc-separate.hip
===
--- clang/test/Driver/hip-toolchain-rdc-separate.hip
+++ clang/test/Driver/hip-toolchain-rdc-separate.hip
@@ -126,18 +126,22 @@
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx803"
+// LINK-SAME: "--whole-archive"
 // LLD-TMP-SAME: "-o" "[[IMG_DEV1:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx803]]"
 // LINK-SAME "[[A_BC1]]" "[[B_BC1]]"
+// LINK-SAME: 

[PATCH] D152207: [HIP] Instruct lld to go through all archives

2023-06-07 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan updated this revision to Diff 529400.
scchan added a comment.

Removed redundant comment.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152207/new/

https://reviews.llvm.org/D152207

Files:
  clang/lib/Driver/ToolChains/HIPAMD.cpp
  clang/test/Driver/hip-toolchain-rdc-separate.hip
  clang/test/Driver/hip-toolchain-rdc-static-lib.hip


Index: clang/test/Driver/hip-toolchain-rdc-static-lib.hip
===
--- clang/test/Driver/hip-toolchain-rdc-static-lib.hip
+++ clang/test/Driver/hip-toolchain-rdc-static-lib.hip
@@ -80,7 +80,9 @@
 // CHECK-NOT: ".*llc"
 // CHECK: [[LLD]] {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // CHECK-SAME: "-plugin-opt=mcpu=gfx900"
+// CHECK-SAME: "--whole-archive"
 // CHECK-SAME: "-o" "[[IMG_DEV2:.*out]]" [[A_BC2]] [[B_BC2]]
+// CHECK-SAME: "--no-whole-archive"
 
 // combine images generated into hip fat binary object
 // CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
Index: clang/test/Driver/hip-toolchain-rdc-separate.hip
===
--- clang/test/Driver/hip-toolchain-rdc-separate.hip
+++ clang/test/Driver/hip-toolchain-rdc-separate.hip
@@ -126,18 +126,22 @@
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx803"
+// LINK-SAME: "--whole-archive"
 // LLD-TMP-SAME: "-o" "[[IMG_DEV1:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx803]]"
 // LINK-SAME "[[A_BC1]]" "[[B_BC1]]"
+// LINK-SAME: "--no-whole-archive"
 
 // LINK-NOT: "*.llvm-link"
 // LINK-NOT: ".*opt"
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx900"
+// LINK-SAME: "--whole-archive"
 // LLD-TMP-SAME: "-o" "[[IMG_DEV2:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx900]]"
 // LINK-SAME "[[A_BC2]]" "[[B_BC2]]"
+// LINK-SAME: "--no-whole-archive"
 
 // LINK-BUNDLE: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
 // LINK-BUNDLE-SAME: 
"-targets={{.*}},hipv4-amdgcn-amd-amdhsa--gfx803,hipv4-amdgcn-amd-amdhsa--gfx900"
Index: clang/lib/Driver/ToolChains/HIPAMD.cpp
===
--- clang/lib/Driver/ToolChains/HIPAMD.cpp
+++ clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -152,6 +152,18 @@
 
   addLinkerCompressDebugSectionsOption(TC, Args, LldArgs);
 
+  // Given that host and device linking happen in separate processes, the 
device
+  // linker doesn't always have the visibility as to which device symbols are
+  // needed by a program, especially for the device symbol dependencies that 
are
+  // introduced through the host symbol resolution.
+  // For example: host_A() (A.obj) --> host_B(B.obj) --> device_kernel_B()
+  // (B.obj) In this case, the device linker doesn't know that A.obj actually
+  // depends on the kernel functions in B.obj.  When linking to static device
+  // library, the device linker may drop some of the device global symbols if
+  // they aren't referenced.  As a workaround, we are adding to the
+  // --whole-archive flag such that all global symbols would be linked in.
+  LldArgs.push_back("--whole-archive");
+
   for (auto *Arg : Args.filtered(options::OPT_Xoffload_linker)) {
 LldArgs.push_back(Arg->getValue(1));
 Arg->claim();
@@ -169,6 +181,8 @@
  /*IsBitCodeSDL=*/true,
  /*PostClangLink=*/false);
 
+  LldArgs.push_back("--no-whole-archive");
+
   const char *Lld = Args.MakeArgString(getToolChain().GetProgramPath("lld"));
   C.addCommand(std::make_unique(JA, *this, 
ResponseFileSupport::None(),
  Lld, LldArgs, Inputs, Output));


Index: clang/test/Driver/hip-toolchain-rdc-static-lib.hip
===
--- clang/test/Driver/hip-toolchain-rdc-static-lib.hip
+++ clang/test/Driver/hip-toolchain-rdc-static-lib.hip
@@ -80,7 +80,9 @@
 // CHECK-NOT: ".*llc"
 // CHECK: [[LLD]] {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // CHECK-SAME: "-plugin-opt=mcpu=gfx900"
+// CHECK-SAME: "--whole-archive"
 // CHECK-SAME: "-o" "[[IMG_DEV2:.*out]]" [[A_BC2]] [[B_BC2]]
+// CHECK-SAME: "--no-whole-archive"
 
 // combine images generated into hip fat binary object
 // CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
Index: clang/test/Driver/hip-toolchain-rdc-separate.hip
===
--- clang/test/Driver/hip-toolchain-rdc-separate.hip
+++ clang/test/Driver/hip-toolchain-rdc-separate.hip
@@ -126,18 +126,22 @@
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx803"
+// LINK-SAME: "--whole-archive"
 // LLD-TMP-SAME: "-o" "[[IMG_DEV1:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx803]]"
 // LINK-SAME "[[A_BC1]]" 

[PATCH] D152207: [HIP] Instruct lld to go through all archives

2023-06-07 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan updated this revision to Diff 529366.
scchan added a comment.

added a matching --no-whole-archive


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152207/new/

https://reviews.llvm.org/D152207

Files:
  clang/lib/Driver/ToolChains/HIPAMD.cpp
  clang/test/Driver/hip-toolchain-rdc-separate.hip
  clang/test/Driver/hip-toolchain-rdc-static-lib.hip


Index: clang/test/Driver/hip-toolchain-rdc-static-lib.hip
===
--- clang/test/Driver/hip-toolchain-rdc-static-lib.hip
+++ clang/test/Driver/hip-toolchain-rdc-static-lib.hip
@@ -80,7 +80,9 @@
 // CHECK-NOT: ".*llc"
 // CHECK: [[LLD]] {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // CHECK-SAME: "-plugin-opt=mcpu=gfx900"
+// CHECK-SAME: "--whole-archive"
 // CHECK-SAME: "-o" "[[IMG_DEV2:.*out]]" [[A_BC2]] [[B_BC2]]
+// CHECK-SAME: "--no-whole-archive"
 
 // combine images generated into hip fat binary object
 // CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
Index: clang/test/Driver/hip-toolchain-rdc-separate.hip
===
--- clang/test/Driver/hip-toolchain-rdc-separate.hip
+++ clang/test/Driver/hip-toolchain-rdc-separate.hip
@@ -126,18 +126,22 @@
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx803"
+// LINK-SAME: "--whole-archive"
 // LLD-TMP-SAME: "-o" "[[IMG_DEV1:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx803]]"
 // LINK-SAME "[[A_BC1]]" "[[B_BC1]]"
+// LINK-SAME: "--no-whole-archive"
 
 // LINK-NOT: "*.llvm-link"
 // LINK-NOT: ".*opt"
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx900"
+// LINK-SAME: "--whole-archive"
 // LLD-TMP-SAME: "-o" "[[IMG_DEV2:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx900]]"
 // LINK-SAME "[[A_BC2]]" "[[B_BC2]]"
+// LINK-SAME: "--no-whole-archive"
 
 // LINK-BUNDLE: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
 // LINK-BUNDLE-SAME: 
"-targets={{.*}},hipv4-amdgcn-amd-amdhsa--gfx803,hipv4-amdgcn-amd-amdhsa--gfx900"
Index: clang/lib/Driver/ToolChains/HIPAMD.cpp
===
--- clang/lib/Driver/ToolChains/HIPAMD.cpp
+++ clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -152,6 +152,18 @@
 
   addLinkerCompressDebugSectionsOption(TC, Args, LldArgs);
 
+  // Given that host and device linking happen in separate processes, the 
device
+  // linker doesn't always have the visibility as to which device symbols are
+  // needed by a program, especially for the device symbol dependencies that 
are
+  // introduced through the host symbol resolution.
+  // For example: host_A() (A.obj) --> host_B(B.obj) --> device_kernel_B()
+  // (B.obj) In this case, the device linker doesn't know that A.obj actually
+  // depends on the kernel functions in B.obj.  When linking to static device
+  // library, the device linker may drop some of the device global symbols if
+  // they aren't referenced.  As a workaround, we are adding to the
+  // --whole-archive flag such that all global symbols would be linked in.
+  LldArgs.push_back("--whole-archive");
+
   for (auto *Arg : Args.filtered(options::OPT_Xoffload_linker)) {
 LldArgs.push_back(Arg->getValue(1));
 Arg->claim();
@@ -169,6 +181,9 @@
  /*IsBitCodeSDL=*/true,
  /*PostClangLink=*/false);
 
+  // pair with the --whole-archive being added previously
+  LldArgs.push_back("--no-whole-archive");
+
   const char *Lld = Args.MakeArgString(getToolChain().GetProgramPath("lld"));
   C.addCommand(std::make_unique(JA, *this, 
ResponseFileSupport::None(),
  Lld, LldArgs, Inputs, Output));


Index: clang/test/Driver/hip-toolchain-rdc-static-lib.hip
===
--- clang/test/Driver/hip-toolchain-rdc-static-lib.hip
+++ clang/test/Driver/hip-toolchain-rdc-static-lib.hip
@@ -80,7 +80,9 @@
 // CHECK-NOT: ".*llc"
 // CHECK: [[LLD]] {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // CHECK-SAME: "-plugin-opt=mcpu=gfx900"
+// CHECK-SAME: "--whole-archive"
 // CHECK-SAME: "-o" "[[IMG_DEV2:.*out]]" [[A_BC2]] [[B_BC2]]
+// CHECK-SAME: "--no-whole-archive"
 
 // combine images generated into hip fat binary object
 // CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
Index: clang/test/Driver/hip-toolchain-rdc-separate.hip
===
--- clang/test/Driver/hip-toolchain-rdc-separate.hip
+++ clang/test/Driver/hip-toolchain-rdc-separate.hip
@@ -126,18 +126,22 @@
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx803"
+// LINK-SAME: "--whole-archive"
 // LLD-TMP-SAME: "-o" "[[IMG_DEV1:.*.out]]"
 // LLD-FIN-SAME: 

[PATCH] D152207: [HIP] Instruct lld to go through all archives

2023-06-07 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan updated this revision to Diff 529311.
scchan added a comment.

fix some formatting issue


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152207/new/

https://reviews.llvm.org/D152207

Files:
  clang/lib/Driver/ToolChains/HIPAMD.cpp
  clang/test/Driver/hip-toolchain-rdc-separate.hip
  clang/test/Driver/hip-toolchain-rdc-static-lib.hip


Index: clang/test/Driver/hip-toolchain-rdc-static-lib.hip
===
--- clang/test/Driver/hip-toolchain-rdc-static-lib.hip
+++ clang/test/Driver/hip-toolchain-rdc-static-lib.hip
@@ -80,6 +80,7 @@
 // CHECK-NOT: ".*llc"
 // CHECK: [[LLD]] {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // CHECK-SAME: "-plugin-opt=mcpu=gfx900"
+// CHECK-SAME: "--whole-archive"
 // CHECK-SAME: "-o" "[[IMG_DEV2:.*out]]" [[A_BC2]] [[B_BC2]]
 
 // combine images generated into hip fat binary object
Index: clang/test/Driver/hip-toolchain-rdc-separate.hip
===
--- clang/test/Driver/hip-toolchain-rdc-separate.hip
+++ clang/test/Driver/hip-toolchain-rdc-separate.hip
@@ -126,6 +126,7 @@
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx803"
+// LINK-SAME: "--whole-archive"
 // LLD-TMP-SAME: "-o" "[[IMG_DEV1:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx803]]"
 // LINK-SAME "[[A_BC1]]" "[[B_BC1]]"
@@ -135,6 +136,7 @@
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx900"
+// LINK-SAME: "--whole-archive"
 // LLD-TMP-SAME: "-o" "[[IMG_DEV2:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx900]]"
 // LINK-SAME "[[A_BC2]]" "[[B_BC2]]"
Index: clang/lib/Driver/ToolChains/HIPAMD.cpp
===
--- clang/lib/Driver/ToolChains/HIPAMD.cpp
+++ clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -152,6 +152,18 @@
 
   addLinkerCompressDebugSectionsOption(TC, Args, LldArgs);
 
+  // Given that host and device linking happen in separate processes, the 
device
+  // linker doesn't always have the visibility as to which device symbols are
+  // needed by a program, especially for the device symbol dependencies that 
are
+  // introduced through the host symbol resolution.
+  // For example: host_A() (A.obj) --> host_B(B.obj) --> device_kernel_B()
+  // (B.obj) In this case, the device linker doesn't know that A.obj actually
+  // depends on the kernel functions in B.obj.  When linking to static device
+  // library, the device linker may drop some of the device global symbols if
+  // they aren't referenced.  As a workaround, we are adding to the
+  // --whole-archive flag such that all global symbols would be linked in.
+  LldArgs.push_back("--whole-archive");
+
   for (auto *Arg : Args.filtered(options::OPT_Xoffload_linker)) {
 LldArgs.push_back(Arg->getValue(1));
 Arg->claim();


Index: clang/test/Driver/hip-toolchain-rdc-static-lib.hip
===
--- clang/test/Driver/hip-toolchain-rdc-static-lib.hip
+++ clang/test/Driver/hip-toolchain-rdc-static-lib.hip
@@ -80,6 +80,7 @@
 // CHECK-NOT: ".*llc"
 // CHECK: [[LLD]] {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // CHECK-SAME: "-plugin-opt=mcpu=gfx900"
+// CHECK-SAME: "--whole-archive"
 // CHECK-SAME: "-o" "[[IMG_DEV2:.*out]]" [[A_BC2]] [[B_BC2]]
 
 // combine images generated into hip fat binary object
Index: clang/test/Driver/hip-toolchain-rdc-separate.hip
===
--- clang/test/Driver/hip-toolchain-rdc-separate.hip
+++ clang/test/Driver/hip-toolchain-rdc-separate.hip
@@ -126,6 +126,7 @@
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx803"
+// LINK-SAME: "--whole-archive"
 // LLD-TMP-SAME: "-o" "[[IMG_DEV1:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx803]]"
 // LINK-SAME "[[A_BC1]]" "[[B_BC1]]"
@@ -135,6 +136,7 @@
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx900"
+// LINK-SAME: "--whole-archive"
 // LLD-TMP-SAME: "-o" "[[IMG_DEV2:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx900]]"
 // LINK-SAME "[[A_BC2]]" "[[B_BC2]]"
Index: clang/lib/Driver/ToolChains/HIPAMD.cpp
===
--- clang/lib/Driver/ToolChains/HIPAMD.cpp
+++ clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -152,6 +152,18 @@
 
   addLinkerCompressDebugSectionsOption(TC, Args, LldArgs);
 
+  // Given that host and device linking happen in separate processes, the device
+  // linker doesn't always have the visibility as to which device symbols are
+  // needed by a program, especially for the device symbol dependencies that are
+  // introduced 

[PATCH] D152207: [HIP] Instruct lld to go through all archives

2023-06-06 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan updated this revision to Diff 529061.
scchan added a comment.

remove ws


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152207/new/

https://reviews.llvm.org/D152207

Files:
  clang/lib/Driver/ToolChains/HIPAMD.cpp
  clang/test/Driver/hip-toolchain-rdc-separate.hip
  clang/test/Driver/hip-toolchain-rdc-static-lib.hip


Index: clang/test/Driver/hip-toolchain-rdc-static-lib.hip
===
--- clang/test/Driver/hip-toolchain-rdc-static-lib.hip
+++ clang/test/Driver/hip-toolchain-rdc-static-lib.hip
@@ -80,6 +80,7 @@
 // CHECK-NOT: ".*llc"
 // CHECK: [[LLD]] {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // CHECK-SAME: "-plugin-opt=mcpu=gfx900"
+// CHECK-SAME: "--whole-archive"
 // CHECK-SAME: "-o" "[[IMG_DEV2:.*out]]" [[A_BC2]] [[B_BC2]]
 
 // combine images generated into hip fat binary object
Index: clang/test/Driver/hip-toolchain-rdc-separate.hip
===
--- clang/test/Driver/hip-toolchain-rdc-separate.hip
+++ clang/test/Driver/hip-toolchain-rdc-separate.hip
@@ -126,6 +126,7 @@
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx803"
+// LINK-SAME: "--whole-archive"
 // LLD-TMP-SAME: "-o" "[[IMG_DEV1:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx803]]"
 // LINK-SAME "[[A_BC1]]" "[[B_BC1]]"
@@ -135,6 +136,7 @@
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx900"
+// LINK-SAME: "--whole-archive"
 // LLD-TMP-SAME: "-o" "[[IMG_DEV2:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx900]]"
 // LINK-SAME "[[A_BC2]]" "[[B_BC2]]"
Index: clang/lib/Driver/ToolChains/HIPAMD.cpp
===
--- clang/lib/Driver/ToolChains/HIPAMD.cpp
+++ clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -152,6 +152,18 @@
 
   addLinkerCompressDebugSectionsOption(TC, Args, LldArgs);
 
+  // Given that host and device linking happen in separate processes, the 
device
+  // linker doesn't always have the visibility as to which device symbols are
+  // needed by a program, especially for the device symbol dependencies that 
are
+  // introduced through the host symbol resolution.
+  // For example: host_A() (A.obj) --> host_B(B.obj) --> device_kernel_B() 
(B.obj)
+  // In this case, the device linker doesn't know that A.obj actually depends 
on 
+  // the kernel functions in B.obj.  When linking to static device library, 
the 
+  // device linker may drop some of the device global symbols if they aren't
+  // referenced.  As a workaround, we are adding to the --whole-archive flag 
such
+  // that all global symbols would be linked in.
+  LldArgs.push_back("--whole-archive");
+
   for (auto *Arg : Args.filtered(options::OPT_Xoffload_linker)) {
 LldArgs.push_back(Arg->getValue(1));
 Arg->claim();


Index: clang/test/Driver/hip-toolchain-rdc-static-lib.hip
===
--- clang/test/Driver/hip-toolchain-rdc-static-lib.hip
+++ clang/test/Driver/hip-toolchain-rdc-static-lib.hip
@@ -80,6 +80,7 @@
 // CHECK-NOT: ".*llc"
 // CHECK: [[LLD]] {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // CHECK-SAME: "-plugin-opt=mcpu=gfx900"
+// CHECK-SAME: "--whole-archive"
 // CHECK-SAME: "-o" "[[IMG_DEV2:.*out]]" [[A_BC2]] [[B_BC2]]
 
 // combine images generated into hip fat binary object
Index: clang/test/Driver/hip-toolchain-rdc-separate.hip
===
--- clang/test/Driver/hip-toolchain-rdc-separate.hip
+++ clang/test/Driver/hip-toolchain-rdc-separate.hip
@@ -126,6 +126,7 @@
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx803"
+// LINK-SAME: "--whole-archive"
 // LLD-TMP-SAME: "-o" "[[IMG_DEV1:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx803]]"
 // LINK-SAME "[[A_BC1]]" "[[B_BC1]]"
@@ -135,6 +136,7 @@
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx900"
+// LINK-SAME: "--whole-archive"
 // LLD-TMP-SAME: "-o" "[[IMG_DEV2:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx900]]"
 // LINK-SAME "[[A_BC2]]" "[[B_BC2]]"
Index: clang/lib/Driver/ToolChains/HIPAMD.cpp
===
--- clang/lib/Driver/ToolChains/HIPAMD.cpp
+++ clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -152,6 +152,18 @@
 
   addLinkerCompressDebugSectionsOption(TC, Args, LldArgs);
 
+  // Given that host and device linking happen in separate processes, the device
+  // linker doesn't always have the visibility as to which device symbols are
+  // needed by a program, especially for the device symbol dependencies that are
+  // introduced through the 

[PATCH] D152207: [HIP] Instruct lld to go through all archives

2023-06-06 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan updated this revision to Diff 529060.
scchan added a comment.

Updated patch to address review feedback


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152207/new/

https://reviews.llvm.org/D152207

Files:
  clang/lib/Driver/ToolChains/HIPAMD.cpp
  clang/test/Driver/hip-toolchain-rdc-separate.hip
  clang/test/Driver/hip-toolchain-rdc-static-lib.hip


Index: clang/test/Driver/hip-toolchain-rdc-static-lib.hip
===
--- clang/test/Driver/hip-toolchain-rdc-static-lib.hip
+++ clang/test/Driver/hip-toolchain-rdc-static-lib.hip
@@ -80,6 +80,7 @@
 // CHECK-NOT: ".*llc"
 // CHECK: [[LLD]] {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // CHECK-SAME: "-plugin-opt=mcpu=gfx900"
+// CHECK-SAME: "--whole-archive"
 // CHECK-SAME: "-o" "[[IMG_DEV2:.*out]]" [[A_BC2]] [[B_BC2]]
 
 // combine images generated into hip fat binary object
Index: clang/test/Driver/hip-toolchain-rdc-separate.hip
===
--- clang/test/Driver/hip-toolchain-rdc-separate.hip
+++ clang/test/Driver/hip-toolchain-rdc-separate.hip
@@ -126,6 +126,7 @@
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx803"
+// LINK-SAME: "--whole-archive"
 // LLD-TMP-SAME: "-o" "[[IMG_DEV1:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx803]]"
 // LINK-SAME "[[A_BC1]]" "[[B_BC1]]"
@@ -135,10 +136,12 @@
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx900"
+// LINK-SAME: "--whole-archive"
 // LLD-TMP-SAME: "-o" "[[IMG_DEV2:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx900]]"
 // LINK-SAME "[[A_BC2]]" "[[B_BC2]]"
 
+
 // LINK-BUNDLE: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
 // LINK-BUNDLE-SAME: 
"-targets={{.*}},hipv4-amdgcn-amd-amdhsa--gfx803,hipv4-amdgcn-amd-amdhsa--gfx900"
 // LINK-BUNDLE-SAME: "-input={{.*}}" "-input=[[IMG_DEV1]]" 
"-input=[[IMG_DEV2]]" "-output=[[BUNDLE:.*]]"
Index: clang/lib/Driver/ToolChains/HIPAMD.cpp
===
--- clang/lib/Driver/ToolChains/HIPAMD.cpp
+++ clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -152,6 +152,18 @@
 
   addLinkerCompressDebugSectionsOption(TC, Args, LldArgs);
 
+  // Given that host and device linking happen in separate processes, the 
device
+  // linker doesn't always have the visibility as to which device symbols are
+  // needed by a program, especially for the device symbol dependencies that 
are
+  // introduced through the host symbol resolution.
+  // For example: host_A() (A.obj) --> host_B(B.obj) --> device_kernel_B() 
(B.obj)
+  // In this case, the device linker doesn't know that A.obj actually depends 
on 
+  // the kernel functions in B.obj.  When linking to static device library, 
the 
+  // device linker may drop some of the device global symbols if they aren't
+  // referenced.  As a workaround, we are adding to the --whole-archive flag 
such
+  // that all global symbols would be linked in.
+  LldArgs.push_back("--whole-archive");
+
   for (auto *Arg : Args.filtered(options::OPT_Xoffload_linker)) {
 LldArgs.push_back(Arg->getValue(1));
 Arg->claim();


Index: clang/test/Driver/hip-toolchain-rdc-static-lib.hip
===
--- clang/test/Driver/hip-toolchain-rdc-static-lib.hip
+++ clang/test/Driver/hip-toolchain-rdc-static-lib.hip
@@ -80,6 +80,7 @@
 // CHECK-NOT: ".*llc"
 // CHECK: [[LLD]] {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // CHECK-SAME: "-plugin-opt=mcpu=gfx900"
+// CHECK-SAME: "--whole-archive"
 // CHECK-SAME: "-o" "[[IMG_DEV2:.*out]]" [[A_BC2]] [[B_BC2]]
 
 // combine images generated into hip fat binary object
Index: clang/test/Driver/hip-toolchain-rdc-separate.hip
===
--- clang/test/Driver/hip-toolchain-rdc-separate.hip
+++ clang/test/Driver/hip-toolchain-rdc-separate.hip
@@ -126,6 +126,7 @@
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx803"
+// LINK-SAME: "--whole-archive"
 // LLD-TMP-SAME: "-o" "[[IMG_DEV1:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx803]]"
 // LINK-SAME "[[A_BC1]]" "[[B_BC1]]"
@@ -135,10 +136,12 @@
 // LINK-NOT: ".*llc"
 // LINK: {{".*lld.*"}} {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // LINK-SAME: "-plugin-opt=mcpu=gfx900"
+// LINK-SAME: "--whole-archive"
 // LLD-TMP-SAME: "-o" "[[IMG_DEV2:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx900]]"
 // LINK-SAME "[[A_BC2]]" "[[B_BC2]]"
 
+
 // LINK-BUNDLE: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
 // LINK-BUNDLE-SAME: "-targets={{.*}},hipv4-amdgcn-amd-amdhsa--gfx803,hipv4-amdgcn-amd-amdhsa--gfx900"
 // LINK-BUNDLE-SAME: "-input={{.*}}" "-input=[[IMG_DEV1]]" 

[PATCH] D152207: [HIP] Instruct lld to go through all archives

2023-06-05 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan created this revision.
Herald added a subscriber: yaxunl.
Herald added a project: All.
scchan requested review of this revision.
Herald added subscribers: cfe-commits, MaskRay.
Herald added a project: clang.

Add the --whole-archive flag when linking HIP programs to instruct lld
to go through every archive library to link in all the kernel functions
(entry pointers to the GPU program); otherwise, lld may skip some
library files if there are no more symbols that need to be resolved.

Change-Id: Ic7ea61ec3e6925f80a63d04654ac72177ffb27c8


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D152207

Files:
  clang/lib/Driver/ToolChains/HIPAMD.cpp
  clang/test/Driver/hip-toolchain-rdc-separate.hip
  clang/test/Driver/hip-toolchain-rdc-static-lib.hip


Index: clang/test/Driver/hip-toolchain-rdc-static-lib.hip
===
--- clang/test/Driver/hip-toolchain-rdc-static-lib.hip
+++ clang/test/Driver/hip-toolchain-rdc-static-lib.hip
@@ -81,6 +81,7 @@
 // CHECK: [[LLD]] {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // CHECK-SAME: "-plugin-opt=mcpu=gfx900"
 // CHECK-SAME: "-o" "[[IMG_DEV2:.*out]]" [[A_BC2]] [[B_BC2]]
+// CHECK-SAME: "--whole-archive"
 
 // combine images generated into hip fat binary object
 // CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
Index: clang/test/Driver/hip-toolchain-rdc-separate.hip
===
--- clang/test/Driver/hip-toolchain-rdc-separate.hip
+++ clang/test/Driver/hip-toolchain-rdc-separate.hip
@@ -129,6 +129,7 @@
 // LLD-TMP-SAME: "-o" "[[IMG_DEV1:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx803]]"
 // LINK-SAME "[[A_BC1]]" "[[B_BC1]]"
+// LINK-SAME: "--whole-archive"
 
 // LINK-NOT: "*.llvm-link"
 // LINK-NOT: ".*opt"
@@ -138,6 +139,7 @@
 // LLD-TMP-SAME: "-o" "[[IMG_DEV2:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx900]]"
 // LINK-SAME "[[A_BC2]]" "[[B_BC2]]"
+// LINK-SAME: "--whole-archive"
 
 // LINK-BUNDLE: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
 // LINK-BUNDLE-SAME: 
"-targets={{.*}},hipv4-amdgcn-amd-amdhsa--gfx803,hipv4-amdgcn-amd-amdhsa--gfx900"
Index: clang/lib/Driver/ToolChains/HIPAMD.cpp
===
--- clang/lib/Driver/ToolChains/HIPAMD.cpp
+++ clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -163,6 +163,9 @@
 
   // Look for archive of bundled bitcode in arguments, and add temporary files
   // for the extracted archive of bitcode to inputs.
+  // The archive libraries have to be linked with --whole-archive to instruct
+  // the linker to go through every library to look for kernel functions
+  LldArgs.push_back("--whole-archive");
   auto TargetID = Args.getLastArgValue(options::OPT_mcpu_EQ);
   AddStaticDeviceLibsLinking(C, *this, JA, Inputs, Args, LldArgs, "amdgcn",
  TargetID,


Index: clang/test/Driver/hip-toolchain-rdc-static-lib.hip
===
--- clang/test/Driver/hip-toolchain-rdc-static-lib.hip
+++ clang/test/Driver/hip-toolchain-rdc-static-lib.hip
@@ -81,6 +81,7 @@
 // CHECK: [[LLD]] {{.*}} "-plugin-opt=-amdgpu-internalize-symbols"
 // CHECK-SAME: "-plugin-opt=mcpu=gfx900"
 // CHECK-SAME: "-o" "[[IMG_DEV2:.*out]]" [[A_BC2]] [[B_BC2]]
+// CHECK-SAME: "--whole-archive"
 
 // combine images generated into hip fat binary object
 // CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
Index: clang/test/Driver/hip-toolchain-rdc-separate.hip
===
--- clang/test/Driver/hip-toolchain-rdc-separate.hip
+++ clang/test/Driver/hip-toolchain-rdc-separate.hip
@@ -129,6 +129,7 @@
 // LLD-TMP-SAME: "-o" "[[IMG_DEV1:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx803]]"
 // LINK-SAME "[[A_BC1]]" "[[B_BC1]]"
+// LINK-SAME: "--whole-archive"
 
 // LINK-NOT: "*.llvm-link"
 // LINK-NOT: ".*opt"
@@ -138,6 +139,7 @@
 // LLD-TMP-SAME: "-o" "[[IMG_DEV2:.*.out]]"
 // LLD-FIN-SAME: "-o" "[[IMG_DEV1:a.out-.*gfx900]]"
 // LINK-SAME "[[A_BC2]]" "[[B_BC2]]"
+// LINK-SAME: "--whole-archive"
 
 // LINK-BUNDLE: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
 // LINK-BUNDLE-SAME: "-targets={{.*}},hipv4-amdgcn-amd-amdhsa--gfx803,hipv4-amdgcn-amd-amdhsa--gfx900"
Index: clang/lib/Driver/ToolChains/HIPAMD.cpp
===
--- clang/lib/Driver/ToolChains/HIPAMD.cpp
+++ clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -163,6 +163,9 @@
 
   // Look for archive of bundled bitcode in arguments, and add temporary files
   // for the extracted archive of bitcode to inputs.
+  // The archive libraries have to be linked with --whole-archive to instruct
+  // the linker to go through every library to look for kernel functions
+  LldArgs.push_back("--whole-archive");
   auto TargetID = Args.getLastArgValue(options::OPT_mcpu_EQ);
   AddStaticDeviceLibsLinking(C, *this, JA, Inputs, Args, 

[PATCH] D145591: [clang][HIP][OpenMP] Add warning if mixed HIP / OpenMP offloading

2023-03-08 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan added a comment.

LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145591/new/

https://reviews.llvm.org/D145591

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D142506: [AMDGCN] Fix device lib test to work with lib64

2023-01-25 Thread Siu Chi Chan via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG56184bb3ad3f: [AMDGCN] Fix device lib test to work with 
lib64 (authored by scchan).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142506/new/

https://reviews.llvm.org/D142506

Files:
  clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/asanrtl.bc
  clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/hip.bc
  clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/ockl.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_abi_version_400.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_abi_version_500.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_correctly_rounded_sqrt_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_correctly_rounded_sqrt_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_daz_opt_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_daz_opt_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_finite_only_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_finite_only_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_isa_version_1010.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_isa_version_1011.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_isa_version_1012.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_isa_version_803.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_isa_version_900.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_isa_version_908.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_unsafe_math_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_unsafe_math_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_wavefrontsize64_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_wavefrontsize64_on.bc
  clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/ocml.bc
  clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/opencl.bc
  clang/test/Driver/hip-device-libs.hip


Index: clang/test/Driver/hip-device-libs.hip
===
--- clang/test/Driver/hip-device-libs.hip
+++ clang/test/Driver/hip-device-libs.hip
@@ -203,7 +203,7 @@
 // ALL-NOT: error:
 // ALL: {{"[^"]*clang[^"]*"}}
 
-// RESDIR-SAME: "-mlink-builtin-bitcode" 
"[[DEVICELIB_DIR:[^"]+(/|)rocm_resource_dir(/|)lib(/|)amdgcn(/|).*]]hip.bc"
+// RESDIR-SAME: "-mlink-builtin-bitcode" 
"[[DEVICELIB_DIR:[^"]+(/|)rocm_resource_dir(/|)lib(64)?(/|)amdgcn(/|).*]]hip.bc"
 // ROCMDIR-SAME: "-mlink-builtin-bitcode" 
"[[DEVICELIB_DIR:[^"]+(/|)rocm(/|)amdgcn(/|).*]]hip.bc"
 
 // ALL-SAME: "-mlink-builtin-bitcode" "[[DEVICELIB_DIR]]ocml.bc"


Index: clang/test/Driver/hip-device-libs.hip
===
--- clang/test/Driver/hip-device-libs.hip
+++ clang/test/Driver/hip-device-libs.hip
@@ -203,7 +203,7 @@
 // ALL-NOT: error:
 // ALL: {{"[^"]*clang[^"]*"}}
 
-// RESDIR-SAME: "-mlink-builtin-bitcode" "[[DEVICELIB_DIR:[^"]+(/|)rocm_resource_dir(/|)lib(/|)amdgcn(/|).*]]hip.bc"
+// RESDIR-SAME: "-mlink-builtin-bitcode" "[[DEVICELIB_DIR:[^"]+(/|)rocm_resource_dir(/|)lib(64)?(/|)amdgcn(/|).*]]hip.bc"
 // ROCMDIR-SAME: "-mlink-builtin-bitcode" "[[DEVICELIB_DIR:[^"]+(/|)rocm(/|)amdgcn(/|).*]]hip.bc"
 
 // ALL-SAME: "-mlink-builtin-bitcode" "[[DEVICELIB_DIR]]ocml.bc"
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D142506: [AMDGCN] Fix device lib test to work with lib64

2023-01-24 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan added a comment.

In D142506#4078402 , @arsenm wrote:

> The resource directory should be strictly controlled. why would lib64 ever be 
> used here?

The intention is to eventually move the device libs into the library directory 
inside clang's resource directory, which its name is affected 
LLVM_LIBDIR_SUFFIX.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142506/new/

https://reviews.llvm.org/D142506

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D142506: [AMDGCN] Fix device lib test to work with lib64

2023-01-24 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan created this revision.
scchan added reviewers: yaxunl, mgorny.
Herald added subscribers: kosarev, jvesely.
Herald added a project: All.
scchan requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

This change fixes an issue introduced by
https://reviews.llvm.org/D140315.  A new unit test was added to validate
the search for the device libraries being placed in Clang's resource
directory; however, the test didn't take into account that certain
platforms use lib64 for 64-bit library directory.

Change-Id: I8b2ae01899214aef63f4e593c3a509dbf54582ef


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D142506

Files:
  clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/asanrtl.bc
  clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/hip.bc
  clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/ockl.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_abi_version_400.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_abi_version_500.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_correctly_rounded_sqrt_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_correctly_rounded_sqrt_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_daz_opt_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_daz_opt_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_finite_only_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_finite_only_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_isa_version_1010.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_isa_version_1011.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_isa_version_1012.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_isa_version_803.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_isa_version_900.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_isa_version_908.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_unsafe_math_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_unsafe_math_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_wavefrontsize64_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/oclc_wavefrontsize64_on.bc
  clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/ocml.bc
  clang/test/Driver/Inputs/rocm_resource_dir/lib64/amdgcn/bitcode/opencl.bc
  clang/test/Driver/hip-device-libs.hip


Index: clang/test/Driver/hip-device-libs.hip
===
--- clang/test/Driver/hip-device-libs.hip
+++ clang/test/Driver/hip-device-libs.hip
@@ -203,7 +203,7 @@
 // ALL-NOT: error:
 // ALL: {{"[^"]*clang[^"]*"}}
 
-// RESDIR-SAME: "-mlink-builtin-bitcode" 
"[[DEVICELIB_DIR:[^"]+(/|)rocm_resource_dir(/|)lib(/|)amdgcn(/|).*]]hip.bc"
+// RESDIR-SAME: "-mlink-builtin-bitcode" 
"[[DEVICELIB_DIR:[^"]+(/|)rocm_resource_dir(/|)lib(64)?(/|)amdgcn(/|).*]]hip.bc"
 // ROCMDIR-SAME: "-mlink-builtin-bitcode" 
"[[DEVICELIB_DIR:[^"]+(/|)rocm(/|)amdgcn(/|).*]]hip.bc"
 
 // ALL-SAME: "-mlink-builtin-bitcode" "[[DEVICELIB_DIR]]ocml.bc"


Index: clang/test/Driver/hip-device-libs.hip
===
--- clang/test/Driver/hip-device-libs.hip
+++ clang/test/Driver/hip-device-libs.hip
@@ -203,7 +203,7 @@
 // ALL-NOT: error:
 // ALL: {{"[^"]*clang[^"]*"}}
 
-// RESDIR-SAME: "-mlink-builtin-bitcode" "[[DEVICELIB_DIR:[^"]+(/|)rocm_resource_dir(/|)lib(/|)amdgcn(/|).*]]hip.bc"
+// RESDIR-SAME: "-mlink-builtin-bitcode" "[[DEVICELIB_DIR:[^"]+(/|)rocm_resource_dir(/|)lib(64)?(/|)amdgcn(/|).*]]hip.bc"
 // ROCMDIR-SAME: "-mlink-builtin-bitcode" "[[DEVICELIB_DIR:[^"]+(/|)rocm(/|)amdgcn(/|).*]]hip.bc"
 
 // ALL-SAME: "-mlink-builtin-bitcode" "[[DEVICELIB_DIR]]ocml.bc"
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D140315: [AMDGCN] Update search path for device libraries

2023-01-24 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan added a comment.

In D140315#4076614 , @mgorny wrote:

> I'm sorry for reporting it this late (clang's been broken for over two weeks, 
> so we couldn't periodically test it) but this change is causing test 
> regressions on Gentoo/amd64:

Thanks for reporting the issue.  I'm working on a patch to fix the test.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140315/new/

https://reviews.llvm.org/D140315

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D140315: [AMDGCN] Update search path for device libraries

2023-01-11 Thread Siu Chi Chan via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGa18fe67b9fcf: [AMDGCN] Update search path for device 
libraries (authored by scchan).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140315/new/

https://reviews.llvm.org/D140315

Files:
  clang/lib/Driver/ToolChains/AMDGPU.cpp
  clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/asanrtl.bc
  clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/hip.bc
  clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/ockl.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_abi_version_400.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_abi_version_500.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_correctly_rounded_sqrt_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_correctly_rounded_sqrt_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_daz_opt_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_daz_opt_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_finite_only_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_finite_only_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_isa_version_1010.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_isa_version_1011.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_isa_version_1012.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_isa_version_803.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_isa_version_900.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_isa_version_908.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_unsafe_math_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_unsafe_math_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_wavefrontsize64_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_wavefrontsize64_on.bc
  clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/ocml.bc
  clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/opencl.bc
  clang/test/Driver/hip-device-libs.hip

Index: clang/test/Driver/hip-device-libs.hip
===
--- clang/test/Driver/hip-device-libs.hip
+++ clang/test/Driver/hip-device-libs.hip
@@ -9,7 +9,7 @@
 // RUN:  --cuda-gpu-arch=gfx803 \
 // RUN:  --rocm-path=%S/Inputs/rocm   \
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
-// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,FLUSHD
+// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,FLUSHD,ROCMDIR
 
 
 // Test subtarget with flushing off by ddefault.
@@ -17,7 +17,7 @@
 // RUN:  --cuda-gpu-arch=gfx900 \
 // RUN:  --rocm-path=%S/Inputs/rocm \
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
-// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,NOFLUSHD
+// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,NOFLUSHD,ROCMDIR
 
 
 // Test explicit flag, opposite of target default.
@@ -26,7 +26,7 @@
 // RUN:   -fgpu-flush-denormals-to-zero \
 // RUN:   --rocm-path=%S/Inputs/rocm \
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
-// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,FLUSHD
+// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,FLUSHD,ROCMDIR
 
 
 // Test explicit flag, opposite of target default.
@@ -35,7 +35,7 @@
 // RUN:   -fno-gpu-flush-denormals-to-zero \
 // RUN:   --rocm-path=%S/Inputs/rocm \
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
-// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,NOFLUSHD
+// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,NOFLUSHD,ROCMDIR
 
 
 // Test explicit flag, same as target default.
@@ -44,7 +44,7 @@
 // RUN:   -fno-gpu-flush-denormals-to-zero \
 // RUN:   --rocm-path=%S/Inputs/rocm \
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
-// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,NOFLUSHD
+// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,NOFLUSHD,ROCMDIR
 
 
 // Test explicit flag, same as target default.
@@ -53,7 +53,7 @@
 // RUN:   -fgpu-flush-denormals-to-zero \
 // RUN:   --rocm-path=%S/Inputs/rocm \
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
-// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,FLUSHD
+// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,FLUSHD,ROCMDIR
 
 
 // Test last flag wins, not flushing
@@ -62,7 +62,7 @@
 // RUN:   -fgpu-flush-denormals-to-zero -fno-gpu-flush-denormals-to-zero \
 // RUN:   --rocm-path=%S/Inputs/rocm \
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
-// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,NOFLUSHD
+// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,NOFLUSHD,ROCMDIR
 
 
 // RUN: %clang -### --target=x86_64-linux-gnu \
@@ -70,7 +70,7 @@
 // RUN:   -fgpu-flush-denormals-to-zero -fno-gpu-flush-denormals-to-zero \
 // 

[PATCH] D140315: [AMDGCN] Update search path for device libraries

2022-12-22 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan added a comment.

In D140315#4011440 , @yaxunl wrote:

> files under 
> clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode-no-abi-ver seem 
> not used.

removed


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140315/new/

https://reviews.llvm.org/D140315

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D140315: [AMDGCN] Update search path for device libraries

2022-12-22 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan updated this revision to Diff 484945.
scchan added a comment.

Remove unused bitcode files


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140315/new/

https://reviews.llvm.org/D140315

Files:
  clang/lib/Driver/ToolChains/AMDGPU.cpp
  clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/asanrtl.bc
  clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/hip.bc
  clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/ockl.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_abi_version_400.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_abi_version_500.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_correctly_rounded_sqrt_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_correctly_rounded_sqrt_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_daz_opt_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_daz_opt_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_finite_only_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_finite_only_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_isa_version_1010.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_isa_version_1011.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_isa_version_1012.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_isa_version_803.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_isa_version_900.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_isa_version_908.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_unsafe_math_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_unsafe_math_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_wavefrontsize64_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_wavefrontsize64_on.bc
  clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/ocml.bc
  clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/opencl.bc
  clang/test/Driver/hip-device-libs.hip

Index: clang/test/Driver/hip-device-libs.hip
===
--- clang/test/Driver/hip-device-libs.hip
+++ clang/test/Driver/hip-device-libs.hip
@@ -9,7 +9,7 @@
 // RUN:  --cuda-gpu-arch=gfx803 \
 // RUN:  --rocm-path=%S/Inputs/rocm   \
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
-// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,FLUSHD
+// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,FLUSHD,ROCMDIR
 
 
 // Test subtarget with flushing off by ddefault.
@@ -17,7 +17,7 @@
 // RUN:  --cuda-gpu-arch=gfx900 \
 // RUN:  --rocm-path=%S/Inputs/rocm \
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
-// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,NOFLUSHD
+// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,NOFLUSHD,ROCMDIR
 
 
 // Test explicit flag, opposite of target default.
@@ -26,7 +26,7 @@
 // RUN:   -fgpu-flush-denormals-to-zero \
 // RUN:   --rocm-path=%S/Inputs/rocm \
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
-// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,FLUSHD
+// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,FLUSHD,ROCMDIR
 
 
 // Test explicit flag, opposite of target default.
@@ -35,7 +35,7 @@
 // RUN:   -fno-gpu-flush-denormals-to-zero \
 // RUN:   --rocm-path=%S/Inputs/rocm \
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
-// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,NOFLUSHD
+// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,NOFLUSHD,ROCMDIR
 
 
 // Test explicit flag, same as target default.
@@ -44,7 +44,7 @@
 // RUN:   -fno-gpu-flush-denormals-to-zero \
 // RUN:   --rocm-path=%S/Inputs/rocm \
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
-// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,NOFLUSHD
+// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,NOFLUSHD,ROCMDIR
 
 
 // Test explicit flag, same as target default.
@@ -53,7 +53,7 @@
 // RUN:   -fgpu-flush-denormals-to-zero \
 // RUN:   --rocm-path=%S/Inputs/rocm \
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
-// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,FLUSHD
+// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,FLUSHD,ROCMDIR
 
 
 // Test last flag wins, not flushing
@@ -62,7 +62,7 @@
 // RUN:   -fgpu-flush-denormals-to-zero -fno-gpu-flush-denormals-to-zero \
 // RUN:   --rocm-path=%S/Inputs/rocm \
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
-// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,NOFLUSHD
+// RUN: 2>&1 | FileCheck %s --check-prefixes=ALL,NOFLUSHD,ROCMDIR
 
 
 // RUN: %clang -### --target=x86_64-linux-gnu \
@@ -70,7 +70,7 @@
 // RUN:   -fgpu-flush-denormals-to-zero -fno-gpu-flush-denormals-to-zero \
 // RUN:   --rocm-path=%S/Inputs/rocm   \
 // RUN:   

[PATCH] D140315: [AMDGCN] Update search path for device libraries

2022-12-19 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan created this revision.
scchan added a reviewer: yaxunl.
Herald added subscribers: kosarev, kerbowa, jvesely.
Herald added a project: All.
scchan requested review of this revision.
Herald added subscribers: cfe-commits, MaskRay.
Herald added a project: clang.

- Add support for finding device libraries in new ROCm directory

structure

- Simplify and remove the handling of legacy ROCm directory structure

Change-Id: I04da3bc9da85ced4b56b0225efb6b94448b8c5a1


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D140315

Files:
  clang/lib/Driver/ToolChains/AMDGPU.cpp
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode-no-abi-ver/asanrtl.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode-no-abi-ver/hip.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode-no-abi-ver/ockl.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode-no-abi-ver/oclc_correctly_rounded_sqrt_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode-no-abi-ver/oclc_correctly_rounded_sqrt_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode-no-abi-ver/oclc_daz_opt_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode-no-abi-ver/oclc_daz_opt_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode-no-abi-ver/oclc_finite_only_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode-no-abi-ver/oclc_finite_only_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode-no-abi-ver/oclc_isa_version_1010.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode-no-abi-ver/oclc_isa_version_1011.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode-no-abi-ver/oclc_isa_version_1012.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode-no-abi-ver/oclc_isa_version_803.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode-no-abi-ver/oclc_isa_version_900.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode-no-abi-ver/oclc_isa_version_908.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode-no-abi-ver/oclc_unsafe_math_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode-no-abi-ver/oclc_unsafe_math_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode-no-abi-ver/oclc_wavefrontsize64_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode-no-abi-ver/oclc_wavefrontsize64_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode-no-abi-ver/ocml.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode-no-abi-ver/opencl.bc
  clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/asanrtl.bc
  clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/hip.bc
  clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/ockl.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_abi_version_400.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_abi_version_500.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_correctly_rounded_sqrt_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_correctly_rounded_sqrt_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_daz_opt_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_daz_opt_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_finite_only_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_finite_only_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_isa_version_1010.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_isa_version_1011.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_isa_version_1012.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_isa_version_803.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_isa_version_900.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_isa_version_908.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_unsafe_math_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_unsafe_math_on.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_wavefrontsize64_off.bc
  
clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/oclc_wavefrontsize64_on.bc
  clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/ocml.bc
  clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode/opencl.bc
  clang/test/Driver/hip-device-libs.hip

Index: clang/test/Driver/hip-device-libs.hip
===
--- clang/test/Driver/hip-device-libs.hip
+++ clang/test/Driver/hip-device-libs.hip
@@ -9,7 +9,7 @@
 // RUN:  --cuda-gpu-arch=gfx803 \
 // RUN:  --rocm-path=%S/Inputs/rocm   \
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip 

[PATCH] D134314: [HIP] stop forcing the lang std in the driver

2022-09-29 Thread Siu Chi Chan via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGcecb0e98d4b1: [HIP] stop forcing the lang std in the driver 
(authored by scchan).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134314/new/

https://reviews.llvm.org/D134314

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/Driver/hip-std.hip


Index: clang/test/Driver/hip-std.hip
===
--- clang/test/Driver/hip-std.hip
+++ clang/test/Driver/hip-std.hip
@@ -3,8 +3,10 @@
 
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -offload-arch=gfx906 %s \
 // RUN:   2>&1 | FileCheck -check-prefixes=DEFAULT %s
-// DEFAULT: "-cc1"{{.*}}"-fcuda-is-device"{{.*}}"-std=c++11"
-// DEFAULT: "-cc1"{{.*}}"-std=c++11"
+// DEFAULT: "-cc1"{{.*}}"-fcuda-is-device"{{.*}}
+// DEFAULT-NOT: "-std="{{.*}}
+// DEFAULT: "-cc1"{{.*}}
+// DEFAULT-NOT: "-std="{{.*}}
 
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -offload-arch=gfx906 %s \
 // RUN:   -std=c++17 %s 2>&1 | FileCheck -check-prefixes=SPECIFIED %s
@@ -13,8 +15,10 @@
 
 // RUN: %clang -### --target=x86_64-pc-windows-msvc -offload-arch=gfx906 %s \
 // RUN:   2>&1 | FileCheck -check-prefixes=MSVC-DEF %s
-// MSVC-DEF: "-cc1"{{.*}}"-fcuda-is-device"{{.*}}"-std=c++14"
-// MSVC-DEF: "-cc1"{{.*}}"-std=c++14"
+// MSVC-DEF: "-cc1"{{.*}}"-fcuda-is-device"{{.*}}
+// MSVC-DEF-NOT: "-std="{{.*}}
+// MSVC-DEF: "-cc1"{{.*}}
+// MSVC-DEF-NOT: "-std="{{.*}}
 
 // RUN: %clang -### --target=x86_64-pc-windows-msvc -offload-arch=gfx906 %s \
 // RUN:   -std=c++17 %s 2>&1 | FileCheck -check-prefixes=MSVC-SPEC %s
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5882,11 +5882,6 @@
 
 Args.AddLastArg(CmdArgs, options::OPT_ftrigraphs,
 options::OPT_fno_trigraphs);
-
-// HIP headers has minimum C++ standard requirements. Therefore set the
-// default language standard.
-if (IsHIP)
-  CmdArgs.push_back(IsWindowsMSVC ? "-std=c++14" : "-std=c++11");
   }
 
   // GCC's behavior for -Wwrite-strings is a bit strange:


Index: clang/test/Driver/hip-std.hip
===
--- clang/test/Driver/hip-std.hip
+++ clang/test/Driver/hip-std.hip
@@ -3,8 +3,10 @@
 
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -offload-arch=gfx906 %s \
 // RUN:   2>&1 | FileCheck -check-prefixes=DEFAULT %s
-// DEFAULT: "-cc1"{{.*}}"-fcuda-is-device"{{.*}}"-std=c++11"
-// DEFAULT: "-cc1"{{.*}}"-std=c++11"
+// DEFAULT: "-cc1"{{.*}}"-fcuda-is-device"{{.*}}
+// DEFAULT-NOT: "-std="{{.*}}
+// DEFAULT: "-cc1"{{.*}}
+// DEFAULT-NOT: "-std="{{.*}}
 
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -offload-arch=gfx906 %s \
 // RUN:   -std=c++17 %s 2>&1 | FileCheck -check-prefixes=SPECIFIED %s
@@ -13,8 +15,10 @@
 
 // RUN: %clang -### --target=x86_64-pc-windows-msvc -offload-arch=gfx906 %s \
 // RUN:   2>&1 | FileCheck -check-prefixes=MSVC-DEF %s
-// MSVC-DEF: "-cc1"{{.*}}"-fcuda-is-device"{{.*}}"-std=c++14"
-// MSVC-DEF: "-cc1"{{.*}}"-std=c++14"
+// MSVC-DEF: "-cc1"{{.*}}"-fcuda-is-device"{{.*}}
+// MSVC-DEF-NOT: "-std="{{.*}}
+// MSVC-DEF: "-cc1"{{.*}}
+// MSVC-DEF-NOT: "-std="{{.*}}
 
 // RUN: %clang -### --target=x86_64-pc-windows-msvc -offload-arch=gfx906 %s \
 // RUN:   -std=c++17 %s 2>&1 | FileCheck -check-prefixes=MSVC-SPEC %s
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5882,11 +5882,6 @@
 
 Args.AddLastArg(CmdArgs, options::OPT_ftrigraphs,
 options::OPT_fno_trigraphs);
-
-// HIP headers has minimum C++ standard requirements. Therefore set the
-// default language standard.
-if (IsHIP)
-  CmdArgs.push_back(IsWindowsMSVC ? "-std=c++14" : "-std=c++11");
   }
 
   // GCC's behavior for -Wwrite-strings is a bit strange:
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D134314: [HIP] stop forcing the lang std in the driver

2022-09-20 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan created this revision.
scchan added a reviewer: yaxunl.
Herald added a project: All.
scchan requested review of this revision.
Herald added subscribers: cfe-commits, MaskRay.
Herald added a project: clang.

D103221  changed HIP's default to C++14, 
removing the driver logic to
force it into a different std.

Change-Id: I9f5220a7456687039b0bd3b3574f3124d3cc7665


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D134314

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/Driver/hip-std.hip


Index: clang/test/Driver/hip-std.hip
===
--- clang/test/Driver/hip-std.hip
+++ clang/test/Driver/hip-std.hip
@@ -3,8 +3,10 @@
 
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -offload-arch=gfx906 %s \
 // RUN:   2>&1 | FileCheck -check-prefixes=DEFAULT %s
-// DEFAULT: "-cc1"{{.*}}"-fcuda-is-device"{{.*}}"-std=c++11"
-// DEFAULT: "-cc1"{{.*}}"-std=c++11"
+// DEFAULT: "-cc1"{{.*}}"-fcuda-is-device"{{.*}}
+// DEFAULT-NOT: "-std="{{.*}}
+// DEFAULT: "-cc1"{{.*}}
+// DEFAULT-NOT: "-std="{{.*}}
 
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -offload-arch=gfx906 %s \
 // RUN:   -std=c++17 %s 2>&1 | FileCheck -check-prefixes=SPECIFIED %s
@@ -13,8 +15,10 @@
 
 // RUN: %clang -### --target=x86_64-pc-windows-msvc -offload-arch=gfx906 %s \
 // RUN:   2>&1 | FileCheck -check-prefixes=MSVC-DEF %s
-// MSVC-DEF: "-cc1"{{.*}}"-fcuda-is-device"{{.*}}"-std=c++14"
-// MSVC-DEF: "-cc1"{{.*}}"-std=c++14"
+// MSVC-DEF: "-cc1"{{.*}}"-fcuda-is-device"{{.*}}
+// MSVC-DEF-NOT: "-std="{{.*}}
+// MSVC-DEF: "-cc1"{{.*}}
+// MSVC-DEF-NOT: "-std="{{.*}}
 
 // RUN: %clang -### --target=x86_64-pc-windows-msvc -offload-arch=gfx906 %s \
 // RUN:   -std=c++17 %s 2>&1 | FileCheck -check-prefixes=MSVC-SPEC %s
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5881,11 +5881,6 @@
 
 Args.AddLastArg(CmdArgs, options::OPT_ftrigraphs,
 options::OPT_fno_trigraphs);
-
-// HIP headers has minimum C++ standard requirements. Therefore set the
-// default language standard.
-if (IsHIP)
-  CmdArgs.push_back(IsWindowsMSVC ? "-std=c++14" : "-std=c++11");
   }
 
   // GCC's behavior for -Wwrite-strings is a bit strange:


Index: clang/test/Driver/hip-std.hip
===
--- clang/test/Driver/hip-std.hip
+++ clang/test/Driver/hip-std.hip
@@ -3,8 +3,10 @@
 
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -offload-arch=gfx906 %s \
 // RUN:   2>&1 | FileCheck -check-prefixes=DEFAULT %s
-// DEFAULT: "-cc1"{{.*}}"-fcuda-is-device"{{.*}}"-std=c++11"
-// DEFAULT: "-cc1"{{.*}}"-std=c++11"
+// DEFAULT: "-cc1"{{.*}}"-fcuda-is-device"{{.*}}
+// DEFAULT-NOT: "-std="{{.*}}
+// DEFAULT: "-cc1"{{.*}}
+// DEFAULT-NOT: "-std="{{.*}}
 
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -offload-arch=gfx906 %s \
 // RUN:   -std=c++17 %s 2>&1 | FileCheck -check-prefixes=SPECIFIED %s
@@ -13,8 +15,10 @@
 
 // RUN: %clang -### --target=x86_64-pc-windows-msvc -offload-arch=gfx906 %s \
 // RUN:   2>&1 | FileCheck -check-prefixes=MSVC-DEF %s
-// MSVC-DEF: "-cc1"{{.*}}"-fcuda-is-device"{{.*}}"-std=c++14"
-// MSVC-DEF: "-cc1"{{.*}}"-std=c++14"
+// MSVC-DEF: "-cc1"{{.*}}"-fcuda-is-device"{{.*}}
+// MSVC-DEF-NOT: "-std="{{.*}}
+// MSVC-DEF: "-cc1"{{.*}}
+// MSVC-DEF-NOT: "-std="{{.*}}
 
 // RUN: %clang -### --target=x86_64-pc-windows-msvc -offload-arch=gfx906 %s \
 // RUN:   -std=c++17 %s 2>&1 | FileCheck -check-prefixes=MSVC-SPEC %s
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5881,11 +5881,6 @@
 
 Args.AddLastArg(CmdArgs, options::OPT_ftrigraphs,
 options::OPT_fno_trigraphs);
-
-// HIP headers has minimum C++ standard requirements. Therefore set the
-// default language standard.
-if (IsHIP)
-  CmdArgs.push_back(IsWindowsMSVC ? "-std=c++14" : "-std=c++11");
   }
 
   // GCC's behavior for -Wwrite-strings is a bit strange:
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D133705: [HIP] Fix unbundling archive

2022-09-12 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan accepted this revision.
scchan added a comment.
This revision is now accepted and ready to land.

LGTM


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133705/new/

https://reviews.llvm.org/D133705

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D133705: [HIP] Fix unbundling archive

2022-09-12 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan requested changes to this revision.
scchan added inline comments.
This revision now requires changes to proceed.



Comment at: clang/lib/Driver/ToolChains/CommonArgs.cpp:1959
+  if (FoundAOB)
+break;
 }

The AOBFileNames small vector needs to be cleared if !FoundAOB or just move its 
declaration into the LibraryPaths loop.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133705/new/

https://reviews.llvm.org/D133705

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D123387: [clang-offload-bundler] fix "no output file" issue with -outputs

2022-04-08 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan created this revision.
scchan added reviewers: yaxunl, saiislam, ronlieb.
Herald added a project: All.
scchan requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Fix backward compatibility issue due to D120662 
.

Change-Id: I7cd0f704aabbaac7dcf59fd4b73b4f0e0cdfa69f


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D123387

Files:
  clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp


Index: clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
===
--- clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
+++ clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
@@ -1450,7 +1450,7 @@
 return 0;
   }
 
-  if (OutputFileNames.getNumOccurrences() == 0) {
+  if (OutputFileNames.size() == 0) {
 reportError(
 createStringError(errc::invalid_argument, "no output file 
specified!"));
   }


Index: clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
===
--- clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
+++ clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
@@ -1450,7 +1450,7 @@
 return 0;
   }
 
-  if (OutputFileNames.getNumOccurrences() == 0) {
+  if (OutputFileNames.size() == 0) {
 reportError(
 createStringError(errc::invalid_argument, "no output file specified!"));
   }
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D120662: [clang-offload-bundler] add -input/-output options

2022-04-04 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan updated this revision to Diff 420218.
scchan added a comment.

- fixed the hip-link-bundle-archive.hip test issue on Windows
- fixed a couple of formatting issues


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120662/new/

https://reviews.llvm.org/D120662

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Driver/ToolChains/CommonArgs.cpp
  clang/lib/Driver/ToolChains/HIPUtility.cpp
  clang/test/Driver/clang-offload-bundler-asserts-on.c
  clang/test/Driver/clang-offload-bundler.c
  clang/test/Driver/fat-archive-unbundle-ext.c
  clang/test/Driver/fat_archive_amdgpu.cpp
  clang/test/Driver/fat_archive_nvptx.cpp
  clang/test/Driver/hip-device-compile.hip
  clang/test/Driver/hip-link-bundle-archive.hip
  clang/test/Driver/hip-link-save-temps.hip
  clang/test/Driver/hip-output-file-name.hip
  clang/test/Driver/hip-rdc-device-only.hip
  clang/test/Driver/hip-save-temps.hip
  clang/test/Driver/hip-toolchain-device-only.hip
  clang/test/Driver/hip-toolchain-no-rdc.hip
  clang/test/Driver/hip-toolchain-rdc-separate.hip
  clang/test/Driver/hip-toolchain-rdc-static-lib.hip
  clang/test/Driver/hip-toolchain-rdc.hip
  clang/test/Driver/hip-unbundle-preproc.hip
  clang/test/Driver/hipspv-toolchain-rdc.hip
  clang/test/Driver/hipspv-toolchain.hip
  clang/test/Driver/openmp-offload-gpu.c
  clang/test/Driver/openmp-offload.c
  clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp

Index: clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
===
--- clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
+++ clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
@@ -62,15 +62,26 @@
 // and -help) will be hidden.
 static cl::OptionCategory
 ClangOffloadBundlerCategory("clang-offload-bundler options");
-
 static cl::list
-InputFileNames("inputs", cl::CommaSeparated, cl::OneOrMore,
-   cl::desc("[,...]"),
+InputFileNames("input", cl::ZeroOrMore,
+   cl::desc("Input file."
+" Can be specified multiple times "
+"for multiple input files."),
cl::cat(ClangOffloadBundlerCategory));
 static cl::list
-OutputFileNames("outputs", cl::CommaSeparated,
-cl::desc("[,...]"),
+InputFileNamesDeprecatedOpt("inputs", cl::CommaSeparated, cl::ZeroOrMore,
+cl::desc("[,...] (deprecated)"),
+cl::cat(ClangOffloadBundlerCategory));
+static cl::list
+OutputFileNames("output", cl::ZeroOrMore,
+cl::desc("Output file."
+ " Can be specified multiple times "
+ "for multiple output files."),
 cl::cat(ClangOffloadBundlerCategory));
+static cl::list
+OutputFileNamesDeprecatedOpt("outputs", cl::CommaSeparated, cl::ZeroOrMore,
+ cl::desc("[,...] (deprecated)"),
+ cl::cat(ClangOffloadBundlerCategory));
 static cl::list
 TargetNames("targets", cl::CommaSeparated,
 cl::desc("[-,...]"),
@@ -1384,6 +1395,38 @@
 }
   };
 
+  auto warningOS = [argv]() -> raw_ostream & {
+return WithColor::warning(errs(), StringRef(argv[0]));
+  };
+
+  if (InputFileNames.getNumOccurrences() != 0 &&
+  InputFileNamesDeprecatedOpt.getNumOccurrences() != 0) {
+reportError(createStringError(
+errc::invalid_argument,
+"-inputs and -input cannot be used together, use only -input instead"));
+  }
+  if (InputFileNamesDeprecatedOpt.size()) {
+warningOS() << "-inputs is deprecated, use -input instead\n";
+// temporary hack to support -inputs
+std::vector  = InputFileNames;
+s.insert(s.end(), InputFileNamesDeprecatedOpt.begin(),
+ InputFileNamesDeprecatedOpt.end());
+  }
+
+  if (OutputFileNames.getNumOccurrences() != 0 &&
+  OutputFileNamesDeprecatedOpt.getNumOccurrences() != 0) {
+reportError(createStringError(errc::invalid_argument,
+  "-outputs and -output cannot be used "
+  "together, use only -output instead"));
+  }
+  if (OutputFileNamesDeprecatedOpt.size()) {
+warningOS() << "-outputs is deprecated, use -output instead\n";
+// temporary hack to support -outputs
+std::vector  = OutputFileNames;
+s.insert(s.end(), OutputFileNamesDeprecatedOpt.begin(),
+ OutputFileNamesDeprecatedOpt.end());
+  }
+
   if (ListBundleIDs) {
 if (Unbundle) {
   reportError(
@@ -1408,9 +1451,8 @@
   }
 
   if (OutputFileNames.getNumOccurrences() == 0) {
-reportError(createStringError(
-errc::invalid_argument,
-"for the --outputs option: must be specified at least once!"));
+reportError(
+createStringError(errc::invalid_argument, "no output file specified!"));
   }
   if 

[PATCH] D120662: [clang-offload-bundler] add -input/-output options

2022-04-01 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan updated this revision to Diff 419847.
scchan added a comment.

update the failing lit tests to recognize the new switches


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120662/new/

https://reviews.llvm.org/D120662

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Driver/ToolChains/CommonArgs.cpp
  clang/lib/Driver/ToolChains/HIPUtility.cpp
  clang/test/Driver/clang-offload-bundler-asserts-on.c
  clang/test/Driver/clang-offload-bundler.c
  clang/test/Driver/fat-archive-unbundle-ext.c
  clang/test/Driver/fat_archive_amdgpu.cpp
  clang/test/Driver/fat_archive_nvptx.cpp
  clang/test/Driver/hip-device-compile.hip
  clang/test/Driver/hip-link-bundle-archive.hip
  clang/test/Driver/hip-link-save-temps.hip
  clang/test/Driver/hip-output-file-name.hip
  clang/test/Driver/hip-rdc-device-only.hip
  clang/test/Driver/hip-save-temps.hip
  clang/test/Driver/hip-toolchain-device-only.hip
  clang/test/Driver/hip-toolchain-no-rdc.hip
  clang/test/Driver/hip-toolchain-rdc-separate.hip
  clang/test/Driver/hip-toolchain-rdc-static-lib.hip
  clang/test/Driver/hip-toolchain-rdc.hip
  clang/test/Driver/hip-unbundle-preproc.hip
  clang/test/Driver/hipspv-toolchain-rdc.hip
  clang/test/Driver/hipspv-toolchain.hip
  clang/test/Driver/openmp-offload-gpu.c
  clang/test/Driver/openmp-offload.c
  clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp

Index: clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
===
--- clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
+++ clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
@@ -62,15 +62,26 @@
 // and -help) will be hidden.
 static cl::OptionCategory
 ClangOffloadBundlerCategory("clang-offload-bundler options");
-
 static cl::list
-InputFileNames("inputs", cl::CommaSeparated, cl::OneOrMore,
-   cl::desc("[,...]"),
+InputFileNames("input", cl::ZeroOrMore,
+   cl::desc("Input file."
+" Can be specified multiple times "
+"for multiple input files."),
+   cl::cat(ClangOffloadBundlerCategory));
+static cl::list
+InputFileNamesDeprecatedOpt("inputs", cl::CommaSeparated, cl::ZeroOrMore,
+   cl::desc("[,...] (deprecated)"),
cl::cat(ClangOffloadBundlerCategory));
 static cl::list
-OutputFileNames("outputs", cl::CommaSeparated,
-cl::desc("[,...]"),
+OutputFileNames("output", cl::ZeroOrMore,
+cl::desc("Output file."
+" Can be specified multiple times "
+"for multiple output files."),
 cl::cat(ClangOffloadBundlerCategory));
+static cl::list
+OutputFileNamesDeprecatedOpt("outputs", cl::CommaSeparated, cl::ZeroOrMore,
+cl::desc("[,...] (deprecated)"),
+cl::cat(ClangOffloadBundlerCategory));
 static cl::list
 TargetNames("targets", cl::CommaSeparated,
 cl::desc("[-,...]"),
@@ -1384,6 +1395,36 @@
 }
   };
 
+  auto warningOS = [argv]() -> raw_ostream & {
+return WithColor::warning(errs(), StringRef(argv[0]));
+  };
+
+  if (InputFileNames.getNumOccurrences() != 0 &&
+  InputFileNamesDeprecatedOpt.getNumOccurrences() != 0) {
+  reportError(createStringError(
+errc::invalid_argument,
+"-inputs and -input cannot be used together, use only -input instead"));
+  }
+  if (InputFileNamesDeprecatedOpt.size()) {
+warningOS() << "-inputs is deprecated, use -input instead\n";
+// temporary hack to support -inputs
+std::vector& s = InputFileNames;
+s.insert(s.end(), InputFileNamesDeprecatedOpt.begin(), InputFileNamesDeprecatedOpt.end());
+  }
+
+  if (OutputFileNames.getNumOccurrences() != 0 &&
+  OutputFileNamesDeprecatedOpt.getNumOccurrences() != 0) {
+  reportError(createStringError(
+errc::invalid_argument,
+"-outputs and -output cannot be used together, use only -output instead"));
+  }
+  if (OutputFileNamesDeprecatedOpt.size()) {
+warningOS() << "-outputs is deprecated, use -output instead\n";
+// temporary hack to support -outputs
+std::vector& s = OutputFileNames;
+s.insert(s.end(), OutputFileNamesDeprecatedOpt.begin(), OutputFileNamesDeprecatedOpt.end());
+  }
+
   if (ListBundleIDs) {
 if (Unbundle) {
   reportError(
@@ -1410,7 +1451,7 @@
   if (OutputFileNames.getNumOccurrences() == 0) {
 reportError(createStringError(
 errc::invalid_argument,
-"for the --outputs option: must be specified at least once!"));
+"no output file specified!"));
   }
   if (TargetNames.getNumOccurrences() == 0) {
 reportError(createStringError(
Index: clang/test/Driver/openmp-offload.c
===
--- clang/test/Driver/openmp-offload.c
+++ clang/test/Driver/openmp-offload.c
@@ -464,14 +464,14 @@
 // Create host object and 

[PATCH] D120662: [clang-offload-bundler] add -input/-output options

2022-03-28 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan updated this revision to Diff 418711.
scchan added a comment.

update the clang-offload-bundler-asserts-on test


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120662/new/

https://reviews.llvm.org/D120662

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Driver/ToolChains/CommonArgs.cpp
  clang/lib/Driver/ToolChains/HIPUtility.cpp
  clang/test/Driver/clang-offload-bundler-asserts-on.c
  clang/test/Driver/clang-offload-bundler.c
  clang/test/Driver/fat-archive-unbundle-ext.c
  clang/test/Driver/fat_archive_amdgpu.cpp
  clang/test/Driver/fat_archive_nvptx.cpp
  clang/test/Driver/hip-device-compile.hip
  clang/test/Driver/hip-link-bundle-archive.hip
  clang/test/Driver/hip-link-save-temps.hip
  clang/test/Driver/hip-output-file-name.hip
  clang/test/Driver/hip-rdc-device-only.hip
  clang/test/Driver/hip-save-temps.hip
  clang/test/Driver/hip-toolchain-device-only.hip
  clang/test/Driver/hip-toolchain-no-rdc.hip
  clang/test/Driver/hip-toolchain-rdc-separate.hip
  clang/test/Driver/hip-toolchain-rdc-static-lib.hip
  clang/test/Driver/hip-toolchain-rdc.hip
  clang/test/Driver/hip-unbundle-preproc.hip
  clang/test/Driver/hipspv-toolchain-rdc.hip
  clang/test/Driver/hipspv-toolchain.hip
  clang/test/Driver/openmp-offload-gpu.c
  clang/test/Driver/openmp-offload.c
  clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp

Index: clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
===
--- clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
+++ clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
@@ -62,15 +62,26 @@
 // and -help) will be hidden.
 static cl::OptionCategory
 ClangOffloadBundlerCategory("clang-offload-bundler options");
-
 static cl::list
-InputFileNames("inputs", cl::CommaSeparated, cl::OneOrMore,
-   cl::desc("[,...]"),
+InputFileNames("input", cl::ZeroOrMore,
+   cl::desc("Input file."
+" Can be specified multiple times "
+"for multiple input files."),
+   cl::cat(ClangOffloadBundlerCategory));
+static cl::list
+InputFileNamesDeprecatedOpt("inputs", cl::CommaSeparated, cl::ZeroOrMore,
+   cl::desc("[,...] (deprecated)"),
cl::cat(ClangOffloadBundlerCategory));
 static cl::list
-OutputFileNames("outputs", cl::CommaSeparated,
-cl::desc("[,...]"),
+OutputFileNames("output", cl::ZeroOrMore,
+cl::desc("Output file."
+" Can be specified multiple times "
+"for multiple output files."),
 cl::cat(ClangOffloadBundlerCategory));
+static cl::list
+OutputFileNamesDeprecatedOpt("outputs", cl::CommaSeparated, cl::ZeroOrMore,
+cl::desc("[,...] (deprecated)"),
+cl::cat(ClangOffloadBundlerCategory));
 static cl::list
 TargetNames("targets", cl::CommaSeparated,
 cl::desc("[-,...]"),
@@ -1384,6 +1395,36 @@
 }
   };
 
+  auto warningOS = [argv]() -> raw_ostream & {
+return WithColor::warning(errs(), StringRef(argv[0]));
+  };
+
+  if (InputFileNames.getNumOccurrences() != 0 &&
+  InputFileNamesDeprecatedOpt.getNumOccurrences() != 0) {
+  reportError(createStringError(
+errc::invalid_argument,
+"-inputs and -input cannot be used together, use only -input instead"));
+  }
+  if (InputFileNamesDeprecatedOpt.size()) {
+warningOS() << "-inputs is deprecated, use -input instead\n";
+// temporary hack to support -inputs
+std::vector& s = InputFileNames;
+s.insert(s.end(), InputFileNamesDeprecatedOpt.begin(), InputFileNamesDeprecatedOpt.end());
+  }
+
+  if (OutputFileNames.getNumOccurrences() != 0 &&
+  OutputFileNamesDeprecatedOpt.getNumOccurrences() != 0) {
+  reportError(createStringError(
+errc::invalid_argument,
+"-outputs and -output cannot be used together, use only -output instead"));
+  }
+  if (OutputFileNamesDeprecatedOpt.size()) {
+warningOS() << "-outputs is deprecated, use -output instead\n";
+// temporary hack to support -outputs
+std::vector& s = OutputFileNames;
+s.insert(s.end(), OutputFileNamesDeprecatedOpt.begin(), OutputFileNamesDeprecatedOpt.end());
+  }
+
   if (ListBundleIDs) {
 if (Unbundle) {
   reportError(
@@ -1410,7 +1451,7 @@
   if (OutputFileNames.getNumOccurrences() == 0) {
 reportError(createStringError(
 errc::invalid_argument,
-"for the --outputs option: must be specified at least once!"));
+"no output file specified!"));
   }
   if (TargetNames.getNumOccurrences() == 0) {
 reportError(createStringError(
Index: clang/test/Driver/openmp-offload.c
===
--- clang/test/Driver/openmp-offload.c
+++ clang/test/Driver/openmp-offload.c
@@ -464,14 +464,14 @@
 // Create host object and bundle.
 

[PATCH] D120662: [clang-offload-bundler] add -input/-output options

2022-03-28 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan updated this revision to Diff 418710.
scchan added a comment.
Herald added a subscriber: MaskRay.
Herald added a project: All.

fix lit test failures on Windows, rebased


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120662/new/

https://reviews.llvm.org/D120662

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Driver/ToolChains/CommonArgs.cpp
  clang/lib/Driver/ToolChains/HIPUtility.cpp
  clang/test/Driver/clang-offload-bundler-asserts-on.c
  clang/test/Driver/clang-offload-bundler.c
  clang/test/Driver/fat-archive-unbundle-ext.c
  clang/test/Driver/fat_archive_amdgpu.cpp
  clang/test/Driver/fat_archive_nvptx.cpp
  clang/test/Driver/hip-device-compile.hip
  clang/test/Driver/hip-link-bundle-archive.hip
  clang/test/Driver/hip-link-save-temps.hip
  clang/test/Driver/hip-output-file-name.hip
  clang/test/Driver/hip-rdc-device-only.hip
  clang/test/Driver/hip-save-temps.hip
  clang/test/Driver/hip-toolchain-device-only.hip
  clang/test/Driver/hip-toolchain-no-rdc.hip
  clang/test/Driver/hip-toolchain-rdc-separate.hip
  clang/test/Driver/hip-toolchain-rdc-static-lib.hip
  clang/test/Driver/hip-toolchain-rdc.hip
  clang/test/Driver/hip-unbundle-preproc.hip
  clang/test/Driver/hipspv-toolchain-rdc.hip
  clang/test/Driver/hipspv-toolchain.hip
  clang/test/Driver/openmp-offload-gpu.c
  clang/test/Driver/openmp-offload.c
  clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp

Index: clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
===
--- clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
+++ clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
@@ -62,15 +62,26 @@
 // and -help) will be hidden.
 static cl::OptionCategory
 ClangOffloadBundlerCategory("clang-offload-bundler options");
-
 static cl::list
-InputFileNames("inputs", cl::CommaSeparated, cl::OneOrMore,
-   cl::desc("[,...]"),
+InputFileNames("input", cl::ZeroOrMore,
+   cl::desc("Input file."
+" Can be specified multiple times "
+"for multiple input files."),
+   cl::cat(ClangOffloadBundlerCategory));
+static cl::list
+InputFileNamesDeprecatedOpt("inputs", cl::CommaSeparated, cl::ZeroOrMore,
+   cl::desc("[,...] (deprecated)"),
cl::cat(ClangOffloadBundlerCategory));
 static cl::list
-OutputFileNames("outputs", cl::CommaSeparated,
-cl::desc("[,...]"),
+OutputFileNames("output", cl::ZeroOrMore,
+cl::desc("Output file."
+" Can be specified multiple times "
+"for multiple output files."),
 cl::cat(ClangOffloadBundlerCategory));
+static cl::list
+OutputFileNamesDeprecatedOpt("outputs", cl::CommaSeparated, cl::ZeroOrMore,
+cl::desc("[,...] (deprecated)"),
+cl::cat(ClangOffloadBundlerCategory));
 static cl::list
 TargetNames("targets", cl::CommaSeparated,
 cl::desc("[-,...]"),
@@ -1384,6 +1395,36 @@
 }
   };
 
+  auto warningOS = [argv]() -> raw_ostream & {
+return WithColor::warning(errs(), StringRef(argv[0]));
+  };
+
+  if (InputFileNames.getNumOccurrences() != 0 &&
+  InputFileNamesDeprecatedOpt.getNumOccurrences() != 0) {
+  reportError(createStringError(
+errc::invalid_argument,
+"-inputs and -input cannot be used together, use only -input instead"));
+  }
+  if (InputFileNamesDeprecatedOpt.size()) {
+warningOS() << "-inputs is deprecated, use -input instead\n";
+// temporary hack to support -inputs
+std::vector& s = InputFileNames;
+s.insert(s.end(), InputFileNamesDeprecatedOpt.begin(), InputFileNamesDeprecatedOpt.end());
+  }
+
+  if (OutputFileNames.getNumOccurrences() != 0 &&
+  OutputFileNamesDeprecatedOpt.getNumOccurrences() != 0) {
+  reportError(createStringError(
+errc::invalid_argument,
+"-outputs and -output cannot be used together, use only -output instead"));
+  }
+  if (OutputFileNamesDeprecatedOpt.size()) {
+warningOS() << "-outputs is deprecated, use -output instead\n";
+// temporary hack to support -outputs
+std::vector& s = OutputFileNames;
+s.insert(s.end(), OutputFileNamesDeprecatedOpt.begin(), OutputFileNamesDeprecatedOpt.end());
+  }
+
   if (ListBundleIDs) {
 if (Unbundle) {
   reportError(
@@ -1410,7 +1451,7 @@
   if (OutputFileNames.getNumOccurrences() == 0) {
 reportError(createStringError(
 errc::invalid_argument,
-"for the --outputs option: must be specified at least once!"));
+"no output file specified!"));
   }
   if (TargetNames.getNumOccurrences() == 0) {
 reportError(createStringError(
Index: clang/test/Driver/openmp-offload.c
===
--- clang/test/Driver/openmp-offload.c
+++ clang/test/Driver/openmp-offload.c

[PATCH] D120662: [clang-offload-bundler] add -input/-output options

2022-02-28 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan created this revision.
scchan added a reviewer: yaxunl.
Herald added subscribers: asavonic, kerbowa, jvesely.
Herald added a reviewer: alexander-shaposhnikov.
scchan requested review of this revision.
Herald added a reviewer: jdoerfert.
Herald added subscribers: cfe-commits, sstefan1.
Herald added a project: clang.

Currently, clang-offload-bundler has -inputs and -outputs options that accept 
values with comma as the delimiter. This causes issues with file paths 
containing commas, which are valid file paths on Linux.

This add two new options -input and -output, which accept one single file, and 
allow multiple instances. This allows arbitrary file paths. The old -inputs and 
-outputs options will be kept for backward compatibility, but are not allowed 
to be used with -input and -output options for simplicity. In the future, 
-inputs and -outputs options will be phasing out.

RFC: 
https://discourse.llvm.org/t/rfc-adding-input-and-output-options-to-clang-offload-bundler/60049


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D120662

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Driver/ToolChains/CommonArgs.cpp
  clang/lib/Driver/ToolChains/HIPUtility.cpp
  clang/test/Driver/clang-offload-bundler-asserts-on.c
  clang/test/Driver/clang-offload-bundler.c
  clang/test/Driver/fat-archive-unbundle-ext.c
  clang/test/Driver/fat_archive_amdgpu.cpp
  clang/test/Driver/fat_archive_nvptx.cpp
  clang/test/Driver/hip-device-compile.hip
  clang/test/Driver/hip-link-bundle-archive.hip
  clang/test/Driver/hip-link-save-temps.hip
  clang/test/Driver/hip-output-file-name.hip
  clang/test/Driver/hip-rdc-device-only.hip
  clang/test/Driver/hip-save-temps.hip
  clang/test/Driver/hip-toolchain-device-only.hip
  clang/test/Driver/hip-toolchain-no-rdc.hip
  clang/test/Driver/hip-toolchain-rdc-separate.hip
  clang/test/Driver/hip-toolchain-rdc-static-lib.hip
  clang/test/Driver/hip-toolchain-rdc.hip
  clang/test/Driver/hip-unbundle-preproc.hip
  clang/test/Driver/hipspv-toolchain-rdc.hip
  clang/test/Driver/hipspv-toolchain.hip
  clang/test/Driver/openmp-offload-gpu.c
  clang/test/Driver/openmp-offload.c
  clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp

Index: clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
===
--- clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
+++ clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
@@ -62,15 +62,26 @@
 // and -help) will be hidden.
 static cl::OptionCategory
 ClangOffloadBundlerCategory("clang-offload-bundler options");
-
 static cl::list
-InputFileNames("inputs", cl::CommaSeparated, cl::OneOrMore,
-   cl::desc("[,...]"),
+InputFileNames("input", cl::ZeroOrMore,
+   cl::desc("Input file."  
+" Can be specified multiple times "
+"for multiple input files."),
+   cl::cat(ClangOffloadBundlerCategory));
+static cl::list
+InputFileNamesDeprecatedOpt("inputs", cl::CommaSeparated, cl::ZeroOrMore,
+   cl::desc("[,...] (deprecated)"),
cl::cat(ClangOffloadBundlerCategory));
 static cl::list
-OutputFileNames("outputs", cl::CommaSeparated,
-cl::desc("[,...]"),
+OutputFileNames("output", cl::ZeroOrMore,
+cl::desc("Output file."  
+" Can be specified multiple times "
+"for multiple output files."),
 cl::cat(ClangOffloadBundlerCategory));
+static cl::list
+OutputFileNamesDeprecatedOpt("outputs", cl::CommaSeparated, cl::ZeroOrMore,
+cl::desc("[,...] (deprecated)"),
+cl::cat(ClangOffloadBundlerCategory));
 static cl::list
 TargetNames("targets", cl::CommaSeparated,
 cl::desc("[-,...]"),
@@ -1363,6 +1374,36 @@
 }
   };
 
+  auto warningOS = [argv]() -> raw_ostream & {
+return WithColor::warning(errs(), StringRef(argv[0]));
+  };
+
+  if (InputFileNames.getNumOccurrences() != 0 &&
+  InputFileNamesDeprecatedOpt.getNumOccurrences() != 0) {
+  reportError(createStringError(
+errc::invalid_argument,
+"-inputs and -input cannot be used together, use only -input instead"));   
+  }
+  if (InputFileNamesDeprecatedOpt.size()) {
+warningOS() << "-inputs is deprecated, use -input instead\n";
+// temporary hack to support -inputs
+std::vector& s = InputFileNames;
+s.insert(s.end(), InputFileNamesDeprecatedOpt.begin(), InputFileNamesDeprecatedOpt.end());
+  }
+
+  if (OutputFileNames.getNumOccurrences() != 0 &&
+  OutputFileNamesDeprecatedOpt.getNumOccurrences() != 0) {
+  reportError(createStringError(
+errc::invalid_argument,
+"-outputs and -output cannot be used together, use only -output instead"));   
+  }
+  if (OutputFileNamesDeprecatedOpt.size()) {
+warningOS() << "-outputs is deprecated, use -output instead\n";
+// 

[PATCH] D106571: [HIP] Fix visibility of __hip_fatbin

2021-07-22 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan accepted this revision.
scchan added a comment.
This revision is now accepted and ready to land.

LGTM


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106571/new/

https://reviews.llvm.org/D106571

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-06-25 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan added inline comments.



Comment at: clang/lib/Headers/__clang_hip_cmath.h:29
+#ifdef __OPENMP_AMDGCN__
+#define __DEVICE__ static constexpr __attribute__((always_inline, nothrow))
+#define __constant__ __attribute__((constant))

`__DEVICE__` should not imply constexpr.  It should be added to each function 
separately.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104904/new/

https://reviews.llvm.org/D104904

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D104392: [HIP] Add support functions for C++ polymorphic types

2021-06-18 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan updated this revision to Diff 353041.
scchan added a comment.

Minor clean up in the hip-header.hip test


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104392/new/

https://reviews.llvm.org/D104392

Files:
  clang/lib/Headers/__clang_hip_runtime_wrapper.h
  clang/test/Headers/hip-header.hip


Index: clang/test/Headers/hip-header.hip
===
--- clang/test/Headers/hip-header.hip
+++ clang/test/Headers/hip-header.hip
@@ -14,6 +14,29 @@
 
 // expected-no-diagnostics
 
+// Check support for pure and deleted virtual functions
+struct base {
+  __host__
+  __device__
+  virtual void pv() = 0;
+  __host__
+  __device__
+  virtual void dv() = delete;
+};
+struct derived:base {
+  __host__
+  __device__
+  virtual void pv() override {};
+};
+__device__ void test_vf() {
+derived d;
+}
+// CHECK: @_ZTV7derived = linkonce_odr unnamed_addr addrspace(1) constant { [4 
x i8*] } { [4 x i8*] [i8* null, i8* null, i8* bitcast (void (%struct.derived*)* 
@_ZN7derived2pvEv to i8*), i8* bitcast (void ()* @__cxa_deleted_virtual to 
i8*)] }, comdat, align 8
+// CHECK: @_ZTV4base = linkonce_odr unnamed_addr addrspace(1) constant { [4 x 
i8*] } { [4 x i8*] [i8* null, i8* null, i8* bitcast (void ()* 
@__cxa_pure_virtual to i8*), i8* bitcast (void ()* @__cxa_deleted_virtual to 
i8*)] }, comdat, align 8
+
+// CHECK: define{{.*}}void @__cxa_pure_virtual()
+// CHECK: define{{.*}}void @__cxa_deleted_virtual()
+
 struct Number {
   __device__ Number(float _x) : x(_x) {}
   float x;
Index: clang/lib/Headers/__clang_hip_runtime_wrapper.h
===
--- clang/lib/Headers/__clang_hip_runtime_wrapper.h
+++ clang/lib/Headers/__clang_hip_runtime_wrapper.h
@@ -51,6 +51,23 @@
   #define nullptr NULL;
 #endif
 
+#ifdef __cplusplus
+extern "C" {
+  __attribute__((__visibility__("default")))
+  __attribute__((weak))
+  __attribute__((noreturn))
+  __device__ void __cxa_pure_virtual(void) {
+__builtin_trap();
+  }
+  __attribute__((__visibility__("default")))
+  __attribute__((weak))
+  __attribute__((noreturn))
+  __device__ void __cxa_deleted_virtual(void) {
+__builtin_trap();
+  } 
+}
+#endif //__cplusplus
+
 #if __HIP_ENABLE_DEVICE_MALLOC__
 extern "C" __device__ void *__hip_malloc(size_t __size);
 extern "C" __device__ void *__hip_free(void *__ptr);


Index: clang/test/Headers/hip-header.hip
===
--- clang/test/Headers/hip-header.hip
+++ clang/test/Headers/hip-header.hip
@@ -14,6 +14,29 @@
 
 // expected-no-diagnostics
 
+// Check support for pure and deleted virtual functions
+struct base {
+  __host__
+  __device__
+  virtual void pv() = 0;
+  __host__
+  __device__
+  virtual void dv() = delete;
+};
+struct derived:base {
+  __host__
+  __device__
+  virtual void pv() override {};
+};
+__device__ void test_vf() {
+derived d;
+}
+// CHECK: @_ZTV7derived = linkonce_odr unnamed_addr addrspace(1) constant { [4 x i8*] } { [4 x i8*] [i8* null, i8* null, i8* bitcast (void (%struct.derived*)* @_ZN7derived2pvEv to i8*), i8* bitcast (void ()* @__cxa_deleted_virtual to i8*)] }, comdat, align 8
+// CHECK: @_ZTV4base = linkonce_odr unnamed_addr addrspace(1) constant { [4 x i8*] } { [4 x i8*] [i8* null, i8* null, i8* bitcast (void ()* @__cxa_pure_virtual to i8*), i8* bitcast (void ()* @__cxa_deleted_virtual to i8*)] }, comdat, align 8
+
+// CHECK: define{{.*}}void @__cxa_pure_virtual()
+// CHECK: define{{.*}}void @__cxa_deleted_virtual()
+
 struct Number {
   __device__ Number(float _x) : x(_x) {}
   float x;
Index: clang/lib/Headers/__clang_hip_runtime_wrapper.h
===
--- clang/lib/Headers/__clang_hip_runtime_wrapper.h
+++ clang/lib/Headers/__clang_hip_runtime_wrapper.h
@@ -51,6 +51,23 @@
   #define nullptr NULL;
 #endif
 
+#ifdef __cplusplus
+extern "C" {
+  __attribute__((__visibility__("default")))
+  __attribute__((weak))
+  __attribute__((noreturn))
+  __device__ void __cxa_pure_virtual(void) {
+__builtin_trap();
+  }
+  __attribute__((__visibility__("default")))
+  __attribute__((weak))
+  __attribute__((noreturn))
+  __device__ void __cxa_deleted_virtual(void) {
+__builtin_trap();
+  } 
+}
+#endif //__cplusplus
+
 #if __HIP_ENABLE_DEVICE_MALLOC__
 extern "C" __device__ void *__hip_malloc(size_t __size);
 extern "C" __device__ void *__hip_free(void *__ptr);
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D104392: [HIP] Add support functions for C++ polymorphic types

2021-06-18 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan updated this revision to Diff 353035.
scchan added a comment.

Adding test


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104392/new/

https://reviews.llvm.org/D104392

Files:
  clang/lib/Headers/__clang_hip_runtime_wrapper.h
  clang/test/Headers/hip-header.hip


Index: clang/test/Headers/hip-header.hip
===
--- clang/test/Headers/hip-header.hip
+++ clang/test/Headers/hip-header.hip
@@ -14,6 +14,30 @@
 
 // expected-no-diagnostics
 
+// Check support for pure and deleted virtual functions
+struct base {
+  __host__
+  __device__
+  virtual void pv() = 0;
+  __host__
+  __device__
+  virtual void dv() = delete;
+};
+struct derived:base {
+  __host__
+  __device__
+  virtual void pv() override {};
+};
+__device__ void foo(derived* b) {
+derived d;
+}
+// CHECK: @_ZTV7derived = linkonce_odr unnamed_addr addrspace(1) constant { [4 
x i8*] } { [4 x i8*] [i8* null, i8* null, i8* bitcast (void (%struct.derived*)* 
@_ZN7derived2pvEv to i8*), i8* bitcast (void ()* @__cxa_deleted_virtual to 
i8*)] }, comdat, align 8
+// CHECK: @_ZTV4base = linkonce_odr unnamed_addr addrspace(1) constant { [4 x 
i8*] } { [4 x i8*] [i8* null, i8* null, i8* bitcast (void ()* 
@__cxa_pure_virtual to i8*), i8* bitcast (void ()* @__cxa_deleted_virtual to 
i8*)] }, comdat, align 8
+
+// CHECK: define{{.*}}void @__cxa_pure_virtual()
+// CHECK: define{{.*}}void @__cxa_deleted_virtual()
+
+
 struct Number {
   __device__ Number(float _x) : x(_x) {}
   float x;
@@ -61,3 +85,4 @@
 __device__ float test_max() {
   return max(5, 6.0);
 }
+
Index: clang/lib/Headers/__clang_hip_runtime_wrapper.h
===
--- clang/lib/Headers/__clang_hip_runtime_wrapper.h
+++ clang/lib/Headers/__clang_hip_runtime_wrapper.h
@@ -51,6 +51,23 @@
   #define nullptr NULL;
 #endif
 
+#ifdef __cplusplus
+extern "C" {
+  __attribute__((__visibility__("default")))
+  __attribute__((weak))
+  __attribute__((noreturn))
+  __device__ void __cxa_pure_virtual(void) {
+__builtin_trap();
+  }
+  __attribute__((__visibility__("default")))
+  __attribute__((weak))
+  __attribute__((noreturn))
+  __device__ void __cxa_deleted_virtual(void) {
+__builtin_trap();
+  } 
+}
+#endif //__cplusplus
+
 #if __HIP_ENABLE_DEVICE_MALLOC__
 extern "C" __device__ void *__hip_malloc(size_t __size);
 extern "C" __device__ void *__hip_free(void *__ptr);


Index: clang/test/Headers/hip-header.hip
===
--- clang/test/Headers/hip-header.hip
+++ clang/test/Headers/hip-header.hip
@@ -14,6 +14,30 @@
 
 // expected-no-diagnostics
 
+// Check support for pure and deleted virtual functions
+struct base {
+  __host__
+  __device__
+  virtual void pv() = 0;
+  __host__
+  __device__
+  virtual void dv() = delete;
+};
+struct derived:base {
+  __host__
+  __device__
+  virtual void pv() override {};
+};
+__device__ void foo(derived* b) {
+derived d;
+}
+// CHECK: @_ZTV7derived = linkonce_odr unnamed_addr addrspace(1) constant { [4 x i8*] } { [4 x i8*] [i8* null, i8* null, i8* bitcast (void (%struct.derived*)* @_ZN7derived2pvEv to i8*), i8* bitcast (void ()* @__cxa_deleted_virtual to i8*)] }, comdat, align 8
+// CHECK: @_ZTV4base = linkonce_odr unnamed_addr addrspace(1) constant { [4 x i8*] } { [4 x i8*] [i8* null, i8* null, i8* bitcast (void ()* @__cxa_pure_virtual to i8*), i8* bitcast (void ()* @__cxa_deleted_virtual to i8*)] }, comdat, align 8
+
+// CHECK: define{{.*}}void @__cxa_pure_virtual()
+// CHECK: define{{.*}}void @__cxa_deleted_virtual()
+
+
 struct Number {
   __device__ Number(float _x) : x(_x) {}
   float x;
@@ -61,3 +85,4 @@
 __device__ float test_max() {
   return max(5, 6.0);
 }
+
Index: clang/lib/Headers/__clang_hip_runtime_wrapper.h
===
--- clang/lib/Headers/__clang_hip_runtime_wrapper.h
+++ clang/lib/Headers/__clang_hip_runtime_wrapper.h
@@ -51,6 +51,23 @@
   #define nullptr NULL;
 #endif
 
+#ifdef __cplusplus
+extern "C" {
+  __attribute__((__visibility__("default")))
+  __attribute__((weak))
+  __attribute__((noreturn))
+  __device__ void __cxa_pure_virtual(void) {
+__builtin_trap();
+  }
+  __attribute__((__visibility__("default")))
+  __attribute__((weak))
+  __attribute__((noreturn))
+  __device__ void __cxa_deleted_virtual(void) {
+__builtin_trap();
+  } 
+}
+#endif //__cplusplus
+
 #if __HIP_ENABLE_DEVICE_MALLOC__
 extern "C" __device__ void *__hip_malloc(size_t __size);
 extern "C" __device__ void *__hip_free(void *__ptr);
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D104392: [HIP] Add support functions for C++ polymorphic types

2021-06-16 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan created this revision.
scchan added a reviewer: yaxunl.
scchan requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Add runtime functions to detect invalid calls to pure or deleted virtual
functions


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D104392

Files:
  clang/lib/Headers/__clang_hip_runtime_wrapper.h


Index: clang/lib/Headers/__clang_hip_runtime_wrapper.h
===
--- clang/lib/Headers/__clang_hip_runtime_wrapper.h
+++ clang/lib/Headers/__clang_hip_runtime_wrapper.h
@@ -51,6 +51,23 @@
   #define nullptr NULL;
 #endif
 
+#ifdef __cplusplus
+extern "C" {
+  __attribute__((__visibility__("default")))
+  __attribute__((weak))
+  __attribute__((noreturn))
+  __device__ void __cxa_pure_virtual(void) {
+__builtin_trap();
+  }
+  __attribute__((__visibility__("default")))
+  __attribute__((weak))
+  __attribute__((noreturn))
+  __device__ void __cxa_deleted_virtual(void) {
+__builtin_trap();
+  } 
+}
+#endif //__cplusplus
+
 #if __HIP_ENABLE_DEVICE_MALLOC__
 extern "C" __device__ void *__hip_malloc(size_t __size);
 extern "C" __device__ void *__hip_free(void *__ptr);


Index: clang/lib/Headers/__clang_hip_runtime_wrapper.h
===
--- clang/lib/Headers/__clang_hip_runtime_wrapper.h
+++ clang/lib/Headers/__clang_hip_runtime_wrapper.h
@@ -51,6 +51,23 @@
   #define nullptr NULL;
 #endif
 
+#ifdef __cplusplus
+extern "C" {
+  __attribute__((__visibility__("default")))
+  __attribute__((weak))
+  __attribute__((noreturn))
+  __device__ void __cxa_pure_virtual(void) {
+__builtin_trap();
+  }
+  __attribute__((__visibility__("default")))
+  __attribute__((weak))
+  __attribute__((noreturn))
+  __device__ void __cxa_deleted_virtual(void) {
+__builtin_trap();
+  } 
+}
+#endif //__cplusplus
+
 #if __HIP_ENABLE_DEVICE_MALLOC__
 extern "C" __device__ void *__hip_malloc(size_t __size);
 extern "C" __device__ void *__hip_free(void *__ptr);
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D85471: Make clang HIP headers compatible with C++98

2020-08-06 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan created this revision.
Herald added subscribers: cfe-commits, yaxunl.
Herald added a project: clang.
scchan requested review of this revision.

Automation to detect compiler features, such as CMake's 
target_compile_features, 
would attempt to detect compiler features by explicitly using langugage flags.  
This change ensures that the HIP headers would still work with C++98.

Change-Id: I381f76fdaf42196d9272efeaa5a3a159edca0ab1


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D85471

Files:
  clang/lib/Headers/__clang_hip_libdevice_declares.h
  clang/lib/Headers/__clang_hip_math.h
  clang/lib/Headers/__clang_hip_runtime_wrapper.h

Index: clang/lib/Headers/__clang_hip_runtime_wrapper.h
===
--- clang/lib/Headers/__clang_hip_runtime_wrapper.h
+++ clang/lib/Headers/__clang_hip_runtime_wrapper.h
@@ -28,6 +28,10 @@
 #define __shared__ __attribute__((shared))
 #define __constant__ __attribute__((constant))
 
+#if !defined(__cplusplus) || __cplusplus < 201103L
+  #define nullptr NULL;
+#endif
+
 #if __HIP_ENABLE_DEVICE_MALLOC__
 extern "C" __device__ void *__hip_malloc(size_t __size);
 extern "C" __device__ void *__hip_free(void *__ptr);
Index: clang/lib/Headers/__clang_hip_math.h
===
--- clang/lib/Headers/__clang_hip_math.h
+++ clang/lib/Headers/__clang_hip_math.h
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #pragma push_macro("__DEVICE__")
 #pragma push_macro("__RETURN_TYPE")
@@ -22,6 +23,34 @@
 #define __DEVICE__ static __device__
 #define __RETURN_TYPE bool
 
+#if defined (__cplusplus) && __cplusplus < 201103L
+//emulate static_assert on type sizes
+template
+struct __compare_result{};
+template<>
+struct __compare_result {
+  static const bool valid;
+};
+
+__DEVICE__
+inline void __suppress_unused_warning(bool b) {};
+template
+__DEVICE__
+inline void __static_assert_equal_size() {
+__suppress_unused_warning(__compare_result::valid);
+}
+
+#define __static_assert_type_size_equal(A, B) \
+  __static_assert_equal_size()
+
+#else
+
+#define __static_assert_type_size_equal(A,B) \
+  static_assert((A) == (B), "")
+
+#endif
+
+
 __DEVICE__
 inline uint64_t __make_mantissa_base8(const char *__tagp) {
   uint64_t __r = 0;
@@ -252,9 +281,8 @@
   uint32_t exponent : 8;
   uint32_t sign : 1;
 } bits;
-
-static_assert(sizeof(float) == sizeof(struct ieee_float), "");
   } __tmp;
+  __static_assert_type_size_equal(sizeof(__tmp.val), sizeof(__tmp.bits));
 
   __tmp.bits.sign = 0u;
   __tmp.bits.exponent = ~0u;
@@ -716,8 +744,8 @@
   uint32_t exponent : 11;
   uint32_t sign : 1;
 } bits;
-static_assert(sizeof(double) == sizeof(struct ieee_double), "");
   } __tmp;
+  __static_assert_type_size_equal(sizeof(__tmp.val), sizeof(__tmp.bits));
 
   __tmp.bits.sign = 0u;
   __tmp.bits.exponent = ~0u;
@@ -726,7 +754,7 @@
 
   return __tmp.val;
 #else
-  static_assert(sizeof(uint64_t) == sizeof(double));
+  __static_assert_type_size_equal(sizeof(uint64_t), sizeof(double));
   uint64_t val = __make_mantissa(__tagp);
   val |= 0xFFF << 51;
   return *reinterpret_cast();
Index: clang/lib/Headers/__clang_hip_libdevice_declares.h
===
--- clang/lib/Headers/__clang_hip_libdevice_declares.h
+++ clang/lib/Headers/__clang_hip_libdevice_declares.h
@@ -318,7 +318,7 @@
 __device__ inline __2f16
 __llvm_amdgcn_rcp_2f16(__2f16 __x) // Not currently exposed by ROCDL.
 {
-  return (__2f16){__llvm_amdgcn_rcp_f16(__x.x), __llvm_amdgcn_rcp_f16(__x.y)};
+  return (__2f16)(__llvm_amdgcn_rcp_f16(__x.x), __llvm_amdgcn_rcp_f16(__x.y));
 }
 __device__ __attribute__((const)) __2f16 __ocml_rint_2f16(__2f16);
 __device__ __attribute__((const)) __2f16 __ocml_rsqrt_2f16(__2f16);
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D48493: [HIP] Support flush denorms bitcode

2018-06-22 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan added a comment.

LGTM


Repository:
  rC Clang

https://reviews.llvm.org/D48493



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31210: [AMDGPU] Add new address space mapping

2017-03-21 Thread Siu Chi Chan via Phabricator via cfe-commits
scchan added a comment.

In https://reviews.llvm.org/D31210#706955, @yaxunl wrote:

> In https://reviews.llvm.org/D31210#706890, @kzhuravl wrote:
>
> > In https://reviews.llvm.org/D31210#706880, @rampitec wrote:
> >
> > > I'm concerned about the default address space to be 64 bit. It would move 
> > > alloca into generic address space effectively making private address to 
> > > be 64 bit.
> > >  This may have very undesirable performance implications, like address 
> > > arithmetic can become expensive 64 bit and only be truncated at load or 
> > > store.
> > >  I realize you will use addrspacecast on an alloca's value, though I'm 
> > > not sure that is sufficient to mitigate performance hit.
> > >  I believe such change shall not be made without a good performance 
> > > comparison with the feature enabled, provided the very likely performance 
> > > issues.
> >
> >
> > Did not we want to use this: 
> > http://lists.llvm.org/pipermail/llvm-dev/2017-March/99.html and use 
> > non-0 for our allocas?
>
>
> Our final goal is to let alloca return private pointer. The Clang changes are 
> mostly common whether alloca returns generic pointer or private pointer. 
> Actually to be able to test the above patch we need to get the changes in 
> Clang done first.


The goal for this is mainly to bring the mapping closer to nvptx?


https://reviews.llvm.org/D31210



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits