andy-yang-1 commented on PR #14329:
URL: https://github.com/apache/tvm/pull/14329#issuecomment-1475160180

   This is an enhanced version of the ptx_ldg32 pass. I didn’t consider 
asynchronous copies when I wrote ptx_ldg32, but cp.async depends on the GPU 
architecture, and some architectures do not support it. Can we distinguish 
between GPU architectures before emitting it?
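
One possible way to make that distinction is sketched below. It assumes the target's architecture is available as a string such as `sm_80`. The helper name `supports_cp_async` and the constant `CP_ASYNC_MIN_SM` are hypothetical and not part of TVM. The `sm_80` threshold reflects the PTX ISA requirement for `cp.async`.

```python
import re

# Minimum compute capability for cp.async per the PTX ISA (sm_80, Ampere).
CP_ASYNC_MIN_SM = 80


def supports_cp_async(arch: str) -> bool:
    """Hypothetical helper: True if an arch string like 'sm_86' can use cp.async."""
    match = re.fullmatch(r"sm_(\d+)[a-z]?", arch.strip())
    if match is None:
        # Unrecognized format: stay conservative and keep the synchronous path.
        return False
    return int(match.group(1)) >= CP_ASYNC_MIN_SM


if __name__ == "__main__":
    for arch in ("sm_75", "sm_80", "sm_90a"):
        print(arch, supports_cp_async(arch))
```

A pass could consult a check like this and fall back to the existing synchronous load path when it returns `False`. How the architecture string is obtained from the build target is left open here.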

