junrushao1994 opened a new pull request, #11581:
URL: https://github.com/apache/tvm/pull/11581

   Following #11575, this PR allows CUDA thread binding for TIR programs
   like
   
   ```python
   from tvm.script import tir as T

   @T.prim_func
   def zero_dim_add(
       A: T.Buffer[(), "float32"],
       B: T.Buffer[(), "float32"],
       C: T.Buffer[(), "float32"],
   ) -> None:
       with T.block("C"):
           vi = T.axis.spatial(1, 0)
           C[()] = A[()] + B[()]
   ```
   
   Here the buffers are zero-dimensional, so the block has no surrounding loop that could be bound to threadIdx/blockIdx.
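
To illustrate the idea, here is a minimal pure-Python sketch of one way a scheduler could handle a loop-free block: wrap it in unit-extent loops and bind those. This is a toy model, not TVM's API and not necessarily how this PR implements it; `Loop`, `Block`, and `bind_for_cuda` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Loop:
    var: str
    extent: int
    thread_tag: Optional[str] = None  # e.g. "blockIdx.x" or "threadIdx.x"


@dataclass
class Block:
    name: str
    loops: List[Loop] = field(default_factory=list)  # outermost loop first


def bind_for_cuda(block: Block) -> Block:
    """Hypothetical helper: make sure two loops exist, then bind them."""
    # A zero-dimensional workload has no loops, so add unit-extent loops
    # around the block until there is one for blockIdx and one for threadIdx.
    while len(block.loops) < 2:
        block.loops.insert(0, Loop(var=f"u{len(block.loops)}", extent=1))
    block.loops[0].thread_tag = "blockIdx.x"
    block.loops[1].thread_tag = "threadIdx.x"
    return block


# Block "C" from the example above: no loops at all before binding.
zero_dim = bind_for_cuda(Block("C"))
launch = {loop.thread_tag: loop.extent for loop in zero_dim.loops}
print(launch)
```

In this model the zero-dimensional block ends up with a launch configuration of one block containing one thread, which is the smallest valid CUDA launch.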


