vinx13 opened a new pull request, #11677:
URL: https://github.com/apache/tvm/pull/11677

   This PR registers CUDA wmma intrinsics. Currently, only the 16x16x16 
shape is supported. Inputs must be `float16`; the accumulator can be 
`float16` or `float32`. Fragment B can optionally be transposed (column major).
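
   For context, here is a minimal hand-written sketch of the CUDA-level `nvcuda::wmma` calls that correspond to one of the supported configurations: a 16x16x16 tile, `float16` inputs, a `float32` accumulator, and a column-major fragment B. It is not the code TVM generates and is not taken from this PR. The kernel name, the host driver, and the test data (all ones) are illustrative assumptions.

```cuda
#include <cstdio>
#include <cuda_fp16.h>
#include <mma.h>

using namespace nvcuda;

// Hypothetical kernel: one warp computes D = A * B + C for a single 16x16x16 tile.
// a: 16x16 row-major half, b: 16x16 column-major half (the "transposed B" case),
// c and d: 16x16 row-major float.
__global__ void wmma_16x16x16_tile(const half* a, const half* b,
                                   const float* c, float* d) {
  wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
  wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
  wmma::fragment<wmma::accumulator, 16, 16, 16, float> acc_frag;

  // The leading dimension is 16 for every operand in this single-tile example.
  wmma::load_matrix_sync(a_frag, a, 16);
  wmma::load_matrix_sync(b_frag, b, 16);
  wmma::load_matrix_sync(acc_frag, c, 16, wmma::mem_row_major);

  // acc = a * b + acc, executed cooperatively by all 32 threads of the warp.
  wmma::mma_sync(acc_frag, a_frag, b_frag, acc_frag);

  wmma::store_matrix_sync(d, acc_frag, 16, wmma::mem_row_major);
}

int main() {
  const int n = 16 * 16;
  half *a, *b;
  float *c, *d;
  cudaMallocManaged(&a, n * sizeof(half));
  cudaMallocManaged(&b, n * sizeof(half));
  cudaMallocManaged(&c, n * sizeof(float));
  cudaMallocManaged(&d, n * sizeof(float));

  // Illustrative data: A and B are all ones, C is zero, so every D entry should be 16.
  for (int i = 0; i < n; ++i) {
    a[i] = __float2half(1.0f);
    b[i] = __float2half(1.0f);
    c[i] = 0.0f;
    d[i] = 0.0f;
  }

  // wmma operations are warp-wide, so launch exactly one full warp.
  wmma_16x16x16_tile<<<1, 32>>>(a, b, c, d);
  cudaError_t err = cudaDeviceSynchronize();
  if (err != cudaSuccess) {
    std::printf("CUDA error: %s\n", cudaGetErrorString(err));
    return 1;
  }

  int mismatches = 0;
  for (int i = 0; i < n; ++i) {
    if (d[i] != 16.0f) ++mismatches;
  }
  std::printf("mismatches: %d\n", mismatches);

  cudaFree(a);
  cudaFree(b);
  cudaFree(c);
  cudaFree(d);
  return mismatches == 0 ? 0 : 1;
}
```

   The sketch needs a GPU with tensor cores and a matching build flag, for example `nvcc -arch=sm_70`. Changing the accumulator fragment type to `half` gives the `float16` accumulator variant, and changing `wmma::col_major` to `wmma::row_major` on `b_frag` gives the non-transposed B variant.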
   
   cc @masahi @junrushao1994 

