[PATCH] D112718: Add intrinsics and builtins for PTX atomics with semantic orders

2022-01-12 Thread Tadej Ciglarič via Phabricator via cfe-commits
t4c1 added inline comments.



Comment at: clang/include/clang/Basic/BuiltinsNVPTX.def:1057
+
+BUILTIN(__nvvm_atom_xchg_global_i, "iiD*i", "n")
+TARGET_BUILTIN(__nvvm_atom_cta_xchg_global_i, "iiD*i", "n", SM_60)

tra wrote:
> Naghasan wrote:
> > tra wrote:
> > > t4c1 wrote:
> > > > tra wrote:
> > > > > t4c1 wrote:
> > > > > > tra wrote:
> > > > > > > We need to figure out how address-space-specific builtins are 
> > > > > > > supposed to work.
> > > > > > > Right now two competing approaches.
> > > > > > > 
> > > > > > > This patch declares builtins with generic pointer as an argument, 
> > > > > > > but, according to the test, expects to be used with the 
> > > > > > > AS-specific pointer.
> > > > > > > It probably does not catch a wrong-AS pointer passed as an 
> > > > > > > argument, either.
> > > > > > > It does happen to work, but I think it's mostly due to the fact 
> > > > > > > that LLVM intrinsics are overloaded and we happen to end up 
> > > > > > > addrspacecast'ing  things correctly if the builtin gets the right 
> > > > > > > kind of pointer.
> > > > > > > 
> > > > > > > The other approach is to declare the pointer with the expected 
> > > > > > > AS. E.g:
> > > > > > > > TARGET_BUILTIN(__nvvm_mbarrier_init_shared, "vWi*3i", "", 
> > > > > > > > AND(SM_80,PTX70))
> > > > > > > 
> > > > > > > IMO, this is the correct way to do it, but it is also rather 
> > > > > > > inconvenient to use from CUDA code as it operates on generic 
> > > > > > > pointers.
> > > > > > > 
> > > > > > > @jdoerfert - WDYT?
> > > > > > >TARGET_BUILTIN(__nvvm_mbarrier_init_shared, "vWi*3i", "", 
> > > > > > >AND(SM_80,PTX70))
> > > > > > >IMO, this is the correct way to do it, but it is also rather 
> > > > > > >inconvenient to use from CUDA code as it operates on generic 
> > > > > > >pointers.
> > > > > > 
> > > > > > I tried doing this, however it is also completely unusable from 
> > > > > > OpenCL code (which is our use case). Trying to use it gives you 
> > > > > > errors about how casting pointers to different address spaces  - 
> > > > > > for example from local to AS3 is not allowed.
> > > > > Hmm. It should've worked. It would need the same explicit cast to a 
> > > > > pointer in AS(3) as in your tests, but it would safeguard against 
> > > > > attempts to pass it a generic pointer. E.g. 
> > > > > https://godbolt.org/z/qE6oxzheM
> > > > > 
> > > > > Explicit casting to AS(X) for AS-specific variants is annoying, but 
> > > > > it's probably unavoidable at the moment regardless of whether we 
> > > > > declare the pointer argument to be AS-specific or not.  LLVM will not 
> > > > > always be able to infer that a pointer is in particular AS.
> > > > > Using specific AS in the declaration has a minor benefit of 
> > > > > safeguarding at compile time against unintentional use of generic 
> > > > > pointers.
> > > > > 
> > > > > Ideally we may want to convert generic variant of the builtin to 
> > > > > AS-specific one, if LLVM does know the AS. We currently do this for 
> > > > > loads/stores, but not for other instructions.
> > > > > 
> > > > Well, it does not work. See: https://godbolt.org/z/vM6bKncc4. Declaring 
> > > > the pointer to be in generic AS is the only way I got it to work. If 
> > > > you know a way to call a builtin declared with AS numbers in OpenCL, I 
> > > > am happy to do that instead.
> > > Could you help me understand how different address spaces are handled by 
> > > OpenCL in clang and LLVM?
> > > What exactly are `__local` or `__private` in OpenCL? Do they map to 
> > > LLVM's address spaces? 
> > > Clang does complain that we're trying to change AS, but I do not see AS 
> > > used in the IR: https://godbolt.org/z/soajoE991
> > > 
> > > AFAICT what happens is:
> > > * OpenCL does use explicit AS for pointers (`__local`, `__private` seem 
> > > to pop up in the error messages)
> > > * `__attribute__((address_space(3)))` does not match the AS of `__local` 
> > > and that leads to an error.
> > > * generic pointer argument works because we are allowed to cast any 
> > > specific AS to generic one.
> > > 
> > > In other words, having specific-AS arguemt works as intended, we just 
> > > have a mismatch between AS number used by OpenCL and AS number used by 
> > > NVPTX and we're not allowed to cast between two specific AS.
> > > 
> > > If that's indeed the case, then what we appear to be missing is the 
> > > mapping from OpenCL AS to the target-specific AS, which should, ideally, 
> > > be done by clang itself. So, if NVPTX-specific builtin operating on 
> > > shared pointer is called with a pointer in OpenCL's equivalent of CUDA's 
> > > `__shared__`, it should be mapped to AS(3).
> > > 
> > > Could you help me understand how different address spaces are handled by 
> > > OpenCL in clang and LLVM?
> > 
> > clang makes a strong distinction between language and target address spaces 
> > since ~3.6 (was more loose before). Each target in clang defines a map 

[PATCH] D112718: Add intrinsics and builtins for PTX atomics with semantic orders

2022-01-09 Thread Tadej Ciglarič via Phabricator via cfe-commits
t4c1 added inline comments.



Comment at: clang/include/clang/Basic/BuiltinsNVPTX.def:1057
+
+BUILTIN(__nvvm_atom_xchg_global_i, "iiD*i", "n")
+TARGET_BUILTIN(__nvvm_atom_cta_xchg_global_i, "iiD*i", "n", SM_60)

tra wrote:
> t4c1 wrote:
> > tra wrote:
> > > We need to figure out how address-space-specific builtins are supposed to 
> > > work.
> > > Right now two competing approaches.
> > > 
> > > This patch declares builtins with generic pointer as an argument, but, 
> > > according to the test, expects to be used with the AS-specific pointer.
> > > It probably does not catch a wrong-AS pointer passed as an argument, 
> > > either.
> > > It does happen to work, but I think it's mostly due to the fact that LLVM 
> > > intrinsics are overloaded and we happen to end up addrspacecast'ing  
> > > things correctly if the builtin gets the right kind of pointer.
> > > 
> > > The other approach is to declare the pointer with the expected AS. E.g:
> > > > TARGET_BUILTIN(__nvvm_mbarrier_init_shared, "vWi*3i", "", 
> > > > AND(SM_80,PTX70))
> > > 
> > > IMO, this is the correct way to do it, but it is also rather inconvenient 
> > > to use from CUDA code as it operates on generic pointers.
> > > 
> > > @jdoerfert - WDYT?
> > >TARGET_BUILTIN(__nvvm_mbarrier_init_shared, "vWi*3i", "", AND(SM_80,PTX70))
> > >IMO, this is the correct way to do it, but it is also rather inconvenient 
> > >to use from CUDA code as it operates on generic pointers.
> > 
> > I tried doing this, however it is also completely unusable from OpenCL code 
> > (which is our use case). Trying to use it gives you errors about how 
> > casting pointers to different address spaces  - for example from local to 
> > AS3 is not allowed.
> Hmm. It should've worked. It would need the same explicit cast to a pointer 
> in AS(3) as in your tests, but it would safeguard against attempts to pass it 
> a generic pointer. E.g. https://godbolt.org/z/qE6oxzheM
> 
> Explicit casting to AS(X) for AS-specific variants is annoying, but it's 
> probably unavoidable at the moment regardless of whether we declare the 
> pointer argument to be AS-specific or not.  LLVM will not always be able to 
> infer that a pointer is in particular AS.
> Using specific AS in the declaration has a minor benefit of safeguarding at 
> compile time against unintentional use of generic pointers.
> 
> Ideally we may want to convert generic variant of the builtin to AS-specific 
> one, if LLVM does know the AS. We currently do this for loads/stores, but not 
> for other instructions.
> 
Well, it does not work. See: https://godbolt.org/z/vM6bKncc4. Declaring the 
pointer to be in generic AS is the only way I got it to work. If you know a way 
to call a builtin declared with AS numbers in OpenCL, I am happy to do that 
instead.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112718/new/

https://reviews.llvm.org/D112718

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D112718: Add intrinsics and builtins for PTX atomics with semantic orders

2022-01-06 Thread Tadej Ciglarič via Phabricator via cfe-commits
t4c1 added inline comments.



Comment at: clang/include/clang/Basic/BuiltinsNVPTX.def:1057
+
+BUILTIN(__nvvm_atom_xchg_global_i, "iiD*i", "n")
+TARGET_BUILTIN(__nvvm_atom_cta_xchg_global_i, "iiD*i", "n", SM_60)

tra wrote:
> We need to figure out how address-space-specific builtins are supposed to 
> work.
> Right now two competing approaches.
> 
> This patch declares builtins with generic pointer as an argument, but, 
> according to the test, expects to be used with the AS-specific pointer.
> It probably does not catch a wrong-AS pointer passed as an argument, either.
> It does happen to work, but I think it's mostly due to the fact that LLVM 
> intrinsics are overloaded and we happen to end up addrspacecast'ing  things 
> correctly if the builtin gets the right kind of pointer.
> 
> The other approach is to declare the pointer with the expected AS. E.g:
> > TARGET_BUILTIN(__nvvm_mbarrier_init_shared, "vWi*3i", "", AND(SM_80,PTX70))
> 
> IMO, this is the correct way to do it, but it is also rather inconvenient to 
> use from CUDA code as it operates on generic pointers.
> 
> @jdoerfert - WDYT?
>TARGET_BUILTIN(__nvvm_mbarrier_init_shared, "vWi*3i", "", AND(SM_80,PTX70))
>IMO, this is the correct way to do it, but it is also rather inconvenient to 
>use from CUDA code as it operates on generic pointers.

I tried doing this, however it is also completely unusable from OpenCL code 
(which is our use case). Trying to use it gives you errors about how casting 
pointers to different address spaces  - for example from local to AS3 is not 
allowed.



Comment at: clang/test/CodeGen/builtins-nvptx.c:557
+  // expected-error@+1 {{'__nvvm_atom_acquire_add_global_i' needs target 
feature sm_70}}
+  __nvvm_atom_acquire_add_global_i((__attribute__((address_space(1))) int 
*)ip, i);
+

tra wrote:
> What happens if I pass a wrong pointer kind? E.g. a generic or shared pointer?
It will silently accept it. I can look into how to output appropriate error 
message.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112718/new/

https://reviews.llvm.org/D112718

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D112718: Add intrinsics and builtins for PTX atomics with semantic orders

2022-01-03 Thread Tadej Ciglarič via Phabricator via cfe-commits
t4c1 updated this revision to Diff 397047.
t4c1 added a comment.

Updated to the latest version that was merged into intel/llvm repo.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112718/new/

https://reviews.llvm.org/D112718

Files:
  clang/include/clang/Basic/BuiltinsNVPTX.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/builtins-nvptx.c
  llvm/include/llvm/IR/IntrinsicsNVVM.td
  llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
  llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
  llvm/lib/Target/NVPTX/NVPTXSubtarget.h
  llvm/test/CodeGen/NVPTX/atomics-with-semantics.ll

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D112718: Add intrinsics and builtins for PTX atomics with semantic orders

2021-10-28 Thread Tadej Ciglarič via Phabricator via cfe-commits
t4c1 created this revision.
t4c1 added reviewers: jholewinski, bader.
Herald added subscribers: asavonic, hiraditya.
t4c1 requested review of this revision.
Herald added subscribers: llvm-commits, cfe-commits, jdoerfert.
Herald added projects: clang, LLVM.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D112718

Files:
  clang/include/clang/Basic/BuiltinsNVPTX.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/builtins-nvptx.c
  llvm/include/llvm/IR/IntrinsicsNVVM.td
  llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
  llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
  llvm/lib/Target/NVPTX/NVPTXSubtarget.h
  llvm/test/CodeGen/NVPTX/atomics-with-semantics.ll

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits