Artem-B wrote:

> This patch, which simply makes it legal on all architectures but does nothing
> if it's older than sm_70.

I do not think this is the right thing to do. "do nothing" is not what one 
would expect from a `nanosleep`.

Let's unpack your problem a bit.

`__nvvm_reflect()` is probably the closest to what you need. However, IIUC, if
you use it to provide a nanosleep-based variant plus an alternative for the
older GPUs, the `nanosleep` code will still hang off the dead branch of
`if (__nvvm_reflect())`. If DCE does not eliminate that branch (and it will not
when optimizations are off), the resulting PTX will be invalid for the older
GPUs.
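
To make the concern concrete, here is a minimal sketch of that pattern. It
assumes clang's CUDA mode with `__nvvm_reflect` usable as a builtin and
`__nvvm_reflect("__CUDA_ARCH")` returning the SM version times 10. The helper
name `sleep_or_spin` and the busy-wait fallback are hypothetical, not something
from the patch.

```cuda
// Sketch only; assumes clang CUDA with __nvvm_reflect available as a builtin.
// sleep_or_spin is a hypothetical helper name used for illustration.
__device__ inline void sleep_or_spin(unsigned ns) {
  if (__nvvm_reflect("__CUDA_ARCH") >= 700) {
    // sm_70 and newer: nanosleep is a valid PTX instruction.
    asm volatile("nanosleep.u32 %0;" ::"r"(ns));
  } else {
    // Older GPUs: hypothetical fallback that spins on the cycle counter.
    // Assumption: ns is treated as a rough cycle budget, not exact time.
    long long start = clock64();
    while (clock64() - start < static_cast<long long>(ns)) {
    }
  }
}
```

When targeting an older GPU, NVVMReflect turns the condition into a constant
false, but the `nanosleep` in the first branch is still emitted unless
something removes the dead branch, and that is what makes the PTX invalid.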

In other words, pushing the nanosleep implementation into an intrinsic makes
things compile everywhere at the expense of doing the wrong thing on the older
GPUs. I do not think it's a good trade-off.

Perhaps a better approach would be to incorporate dead branch elimination into
the NVVMReflect pass itself. We do know that this is the explicit intent of
`__nvvm_reflect()`. If NVVMReflect explicitly guarantees that the dead branch
will be gone, you can use approach `#1` without worrying about whether
optimizations are enabled, and you can provide whatever alternative
implementation you need (even a null one) without affecting the correctness of
LLVM itself.



https://github.com/llvm/llvm-project/pull/81033