================
@@ -0,0 +1,16 @@
+#include "../Inputs/cuda.h"
+
+// RUN: %clang_cc1 -triple=amdgcn-amd-amdhsa -x hip -fclangir \
----------------
RiverDave wrote:

I can probably help out a bit:

`cc1` defaults to host compilation unless we explicitly pass 
`-fcuda-is-device` (which, obviously, switches it to device compilation). In 
this specific test case we're passing a GPU triple while performing host 
compilation, which makes zero sense in a real-life scenario, but `-cc1` 
seamlessly allows it.

The "two-pass compilation" you highlight is usually done directly through the 
regular clang driver, where two "pipelines" are run:
- a GPU triple with `-fcuda-is-device`
- a CPU triple without the aforementioned flag

The test itself is not wrong, but for host compilation it would make sense to 
use any suitable CPU triple (x86, aarch64, ...).
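
To illustrate, here is a rough sketch of how the two passes could look as separate `cc1` RUN lines. This is not what the PR does: the `x86_64-unknown-linux-gnu` triple, the `-emit-cir` action, and the `%t.*.cir` output names are my assumptions, picked only to make the host/device split visible.

```cpp
// Hypothetical RUN lines. -emit-cir, the output paths, and the x86_64
// triple are assumptions for illustration, not taken from the PR.

// Host pass: CPU triple, no -fcuda-is-device.
// RUN: %clang_cc1 -triple=x86_64-unknown-linux-gnu -x hip -fclangir \
// RUN:   -emit-cir %s -o %t.host.cir

// Device pass: GPU triple plus -fcuda-is-device.
// RUN: %clang_cc1 -triple=amdgcn-amd-amdhsa -x hip -fclangir \
// RUN:   -fcuda-is-device -emit-cir %s -o %t.device.cir
```

Either line alone is a valid test; the point is just that the triple and the presence of `-fcuda-is-device` should agree on which side is being compiled.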

https://github.com/llvm/llvm-project/pull/177698
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
