[PATCH] D76948: [cuda][hip] Add CUDA builtin surface/texture reference support.

2020-03-27 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment.

In D76948#1946878 , @hliao wrote:

> I tried that before submitting this one. But, as it's in the closed state, I 
> cannot submit that anymore. I will attach the difference against the previous 
> change somewhere.


I've reopened it. Let's move the patch and discussion there.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76948/new/

https://reviews.llvm.org/D76948



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76948: [cuda][hip] Add CUDA builtin surface/texture reference support.

2020-03-27 Thread Michael Liao via Phabricator via cfe-commits
hliao marked an inline comment as done.
hliao added inline comments.



Comment at: clang/test/SemaCUDA/bad-attributes.cu:74-75
+
+typedef __attribute__((device_builtin_surface_type)) unsigned long long s0_ty; 
// expected-warning {{'device_builtin_surface_type' attribute only applies to 
classes}}
+typedef __attribute__((device_builtin_texture_type)) unsigned long long t0_ty; 
// expected-warning {{'device_builtin_texture_type' attribute only applies to 
classes}}
+

tra wrote:
> Please add few test cases replicating use of these attributes in CUDA headers.
the replication from CUDA headers is added on those codegen tests. These tests 
are illegal ones which sema checks should identify.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76948/new/

https://reviews.llvm.org/D76948



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76948: [cuda][hip] Add CUDA builtin surface/texture reference support.

2020-03-27 Thread Michael Liao via Phabricator via cfe-commits
hliao added a comment.

In D76948#1946861 , @tra wrote:

> Would it be possible to update the old review with the new diff? It would 
> make it easier to see the incremental changes you've made. If the old review 
> can be reopened that would be great as it would keep all relevant info in one 
> place, but I'm fine doing the review here, too, if phabricator does not let 
> you do it.


Check this for the new change.

https://gist.github.com/darkbuck/836dbb3112ca2e5fab769cf3cdaecd09


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76948/new/

https://reviews.llvm.org/D76948



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76948: [cuda][hip] Add CUDA builtin surface/texture reference support.

2020-03-27 Thread Michael Liao via Phabricator via cfe-commits
hliao added a comment.

In D76948#1946861 , @tra wrote:

> Would it be possible to update the old review with the new diff? It would 
> make it easier to see the incremental changes you've made. If the old review 
> can be reopened that would be great as it would keep all relevant info in one 
> place, but I'm fine doing the review here, too, if phabricator does not let 
> you do it.


I tried that before submitting this one. But, as it's in the closed state, I 
cannot submit that anymore. I will attach the difference against the previous 
change somewhere.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76948/new/

https://reviews.llvm.org/D76948



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76948: [cuda][hip] Add CUDA builtin surface/texture reference support.

2020-03-27 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment.

Would it be possible to update the old review with the new diff? It would make 
it easier to see the incremental changes you've made. If the old review can be 
reopened that would be great as it would keep all relevant info in one place, 
but I'm fine doing the review here, too, if phabricator does not let you do it.




Comment at: clang/test/SemaCUDA/bad-attributes.cu:74-75
+
+typedef __attribute__((device_builtin_surface_type)) unsigned long long s0_ty; 
// expected-warning {{'device_builtin_surface_type' attribute only applies to 
classes}}
+typedef __attribute__((device_builtin_texture_type)) unsigned long long t0_ty; 
// expected-warning {{'device_builtin_texture_type' attribute only applies to 
classes}}
+

Please add few test cases replicating use of these attributes in CUDA headers.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76948/new/

https://reviews.llvm.org/D76948



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76948: [cuda][hip] Add CUDA builtin surface/texture reference support.

2020-03-27 Thread Michael Liao via Phabricator via cfe-commits
hliao added a comment.

This's revised change from https://reviews.llvm.org/D76365 after fixing Sema 
checks on the template partial specialization. With this change, I could 
compile the following sample code using surface reference.

kernel.cu

  #include 
  
  surface surf;
  
  #if defined(__clang__)
  __device__ int
  suld_2d_trap(surface, int, int) 
asm("llvm.nvvm.suld.2d.i32.trap");
  
  template 
  static inline __device__ T
  surf2Dread(surface s, int x, int y) {
// By default, `surf2Dread` uses trap mode.
return suld_2d_trap(s, x, y);
  }
  #endif
  
  __device__ int foo(int x, int y) { return surf2Dread(surf, x, y); }

With NVCC, it generates

`kernel.ptx` after `nvcc --ptx -rdc=true kernel.cu`

  //
  // Generated by NVIDIA NVVM Compiler
  //
  // Compiler Build ID: CL-27506705
  // Cuda compilation tools, release 10.2, V10.2.89
  // Based on LLVM 3.4svn
  //
  
  .version 6.5
  .target sm_30
  .address_size 64
  
  // .globl   _Z3fooii
  .visible .global .surfref surf;
  
  .visible .func  (.param .b32 func_retval0) _Z3fooii(
  .param .b32 _Z3fooii_param_0,
  .param .b32 _Z3fooii_param_1
  )
  {
  .reg .b32   %r<4>;
  .reg .b64   %rd<2>;
  
  
  ld.param.u32%r1, [_Z3fooii_param_0];
  ld.param.u32%r2, [_Z3fooii_param_1];
  suld.b.2d.b32.trap {%r3}, [surf, {%r1, %r2}];
  st.param.b32[func_retval0+0], %r3;
  ret;
  }

With Clang, it generates

`kernel-cuda-nvptx64-nvidia-cuda-sm_30.s` after `clang --cuda-device-only 
--cuda-gpu-arch=sm_30 -O2 -S kernel.cu`

  //
  // Generated by LLVM NVPTX Back-End
  //
  
  .version 6.4
  .target sm_30
  .address_size 64
  
  // .globl   _Z3fooii
  .visible .global .surfref surf;
  
  .visible .func  (.param .b32 func_retval0) _Z3fooii(
  .param .b32 _Z3fooii_param_0,
  .param .b32 _Z3fooii_param_1
  )
  {
  .reg .b32   %r<4>;
  .reg .b64   %rd<2>;
  
  ld.param.u32%r1, [_Z3fooii_param_0];
  ld.param.u32%r2, [_Z3fooii_param_1];
  mov.u64 %rd1, surf;
  suld.b.2d.b32.trap {%r3}, [%rd1, {%r1, %r2}];
  st.param.b32[func_retval0+0], %r3;
  ret;
  
  }


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76948/new/

https://reviews.llvm.org/D76948



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76948: [cuda][hip] Add CUDA builtin surface/texture reference support.

2020-03-27 Thread Michael Liao via Phabricator via cfe-commits
hliao created this revision.
hliao added reviewers: tra, rjmccall, yaxunl.
Herald added a reviewer: a.sidorin.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.
hliao added a comment.

This's revised change from https://reviews.llvm.org/D76365 after fixing Sema 
checks on the template partial specialization. With this change, I could 
compile the following sample code using surface reference.

kernel.cu

  #include 
  
  surface surf;
  
  #if defined(__clang__)
  __device__ int
  suld_2d_trap(surface, int, int) 
asm("llvm.nvvm.suld.2d.i32.trap");
  
  template 
  static inline __device__ T
  surf2Dread(surface s, int x, int y) {
// By default, `surf2Dread` uses trap mode.
return suld_2d_trap(s, x, y);
  }
  #endif
  
  __device__ int foo(int x, int y) { return surf2Dread(surf, x, y); }

With NVCC, it generates

`kernel.ptx` after `nvcc --ptx -rdc=true kernel.cu`

  //
  // Generated by NVIDIA NVVM Compiler
  //
  // Compiler Build ID: CL-27506705
  // Cuda compilation tools, release 10.2, V10.2.89
  // Based on LLVM 3.4svn
  //
  
  .version 6.5
  .target sm_30
  .address_size 64
  
  // .globl   _Z3fooii
  .visible .global .surfref surf;
  
  .visible .func  (.param .b32 func_retval0) _Z3fooii(
  .param .b32 _Z3fooii_param_0,
  .param .b32 _Z3fooii_param_1
  )
  {
  .reg .b32   %r<4>;
  .reg .b64   %rd<2>;
  
  
  ld.param.u32%r1, [_Z3fooii_param_0];
  ld.param.u32%r2, [_Z3fooii_param_1];
  suld.b.2d.b32.trap {%r3}, [surf, {%r1, %r2}];
  st.param.b32[func_retval0+0], %r3;
  ret;
  }

With Clang, it generates

`kernel-cuda-nvptx64-nvidia-cuda-sm_30.s` after `clang --cuda-device-only 
--cuda-gpu-arch=sm_30 -O2 -S kernel.cu`

  //
  // Generated by LLVM NVPTX Back-End
  //
  
  .version 6.4
  .target sm_30
  .address_size 64
  
  // .globl   _Z3fooii
  .visible .global .surfref surf;
  
  .visible .func  (.param .b32 func_retval0) _Z3fooii(
  .param .b32 _Z3fooii_param_0,
  .param .b32 _Z3fooii_param_1
  )
  {
  .reg .b32   %r<4>;
  .reg .b64   %rd<2>;
  
  ld.param.u32%r1, [_Z3fooii_param_0];
  ld.param.u32%r2, [_Z3fooii_param_1];
  mov.u64 %rd1, surf;
  suld.b.2d.b32.trap {%r3}, [%rd1, {%r1, %r2}];
  st.param.b32[func_retval0+0], %r3;
  ret;
  
  }


- Re-commit after fix Sema checks on partial template specialization.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D76948

Files:
  clang/include/clang/AST/Type.h
  clang/include/clang/Basic/Attr.td
  clang/include/clang/Basic/AttrDocs.td
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/lib/AST/Type.cpp
  clang/lib/CodeGen/CGCUDANV.cpp
  clang/lib/CodeGen/CGCUDARuntime.h
  clang/lib/CodeGen/CGExprAgg.cpp
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenTypes.cpp
  clang/lib/CodeGen/TargetInfo.cpp
  clang/lib/CodeGen/TargetInfo.h
  clang/lib/Headers/__clang_cuda_runtime_wrapper.h
  clang/lib/Sema/SemaDeclAttr.cpp
  clang/lib/Sema/SemaDeclCXX.cpp
  clang/test/CodeGenCUDA/surface.cu
  clang/test/CodeGenCUDA/texture.cu
  clang/test/Misc/pragma-attribute-supported-attributes-list.test
  clang/test/SemaCUDA/attr-declspec.cu
  clang/test/SemaCUDA/attributes-on-non-cuda.cu
  clang/test/SemaCUDA/bad-attributes.cu
  llvm/include/llvm/IR/Operator.h

Index: llvm/include/llvm/IR/Operator.h
===
--- llvm/include/llvm/IR/Operator.h
+++ llvm/include/llvm/IR/Operator.h
@@ -599,6 +599,25 @@
   }
 };
 
+class AddrSpaceCastOperator
+: public ConcreteOperator {
+  friend class AddrSpaceCastInst;
+  friend class ConstantExpr;
+
+public:
+  Value *getPointerOperand() { return getOperand(0); }
+
+  const Value *getPointerOperand() const { return getOperand(0); }
+
+  unsigned getSrcAddressSpace() const {
+return getPointerOperand()->getType()->getPointerAddressSpace();
+  }
+
+  unsigned getDestAddressSpace() const {
+return getType()->getPointerAddressSpace();
+  }
+};
+
 } // end namespace llvm
 
 #endif // LLVM_IR_OPERATOR_H
Index: clang/test/SemaCUDA/bad-attributes.cu
===
--- clang/test/SemaCUDA/bad-attributes.cu
+++ clang/test/SemaCUDA/bad-attributes.cu
@@ -70,3 +70,27 @@
 __device__ void device_fn() {
   __constant__ int c; // expected-error {{__constant__ variables must be global}}
 }
+
+typedef __attribute__((device_builtin_surface_type)) unsigned long long s0_ty; // expected-warning {{'device_builtin_surface_type' attribute only applies to classes}}
+typedef __attribute__((device_builtin_texture_type)) unsigned long long t0_ty; // expected-warning {{'device_builtin_texture_type' attribute only applies to classes}}
+
+struct __attribute__((device_builtin_surface_type)) s1_ref {}; // expected-error {{illegal device builtin surface reference