merrymercy opened a new pull request #6041:
URL: https://github.com/apache/incubator-tvm/pull/6041


   - Fix a bug when generating unrolled and vectorized CUDA code
       This is the same bug as the one fixed in 
https://github.com/apache/incubator-tvm/pull/711. In that PR, I only added a 
new SSA scope for the "else" branch, but the "then" branch has the same 
problem. So I moved the addition of the new SSA scope to the top level.
   - Fix a bug when generating CUDA code for `tir.reinterpret`
      If we call `tir.reinterpret` on an rvalue, the existing strategy 
generates wrong code, because we cannot take the address of an rvalue. To fix 
this, we store the rvalue in a temporary variable and take the address of 
that temporary variable.
   - Improve the VerifyGPUCode pass
     Besides checking the LoadNode, we should also check the StoreNode.
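
The `tir.reinterpret` fix can be illustrated with a small host-side C++ sketch. This is not the code TVM emits: the function name `reinterpret_sum` is hypothetical, and `std::memcpy` is used here instead of a pointer cast to stay within standard C++ aliasing rules. It only shows why the rvalue has to be stored in a temporary before its address is taken.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper for illustration only, not part of TVM.
// Reinterprets the bits of the rvalue `a + b` as a 32-bit float.
float reinterpret_sum(std::uint32_t a, std::uint32_t b) {
  static_assert(sizeof(float) == sizeof(std::uint32_t), "size mismatch");

  // Taking `&(a + b)` directly would not compile: `a + b` is an rvalue
  // and has no address. Store it in a temporary variable first.
  std::uint32_t tmp = a + b;

  // Now the address of the temporary can be used for the bit-level copy.
  float out;
  std::memcpy(&out, &tmp, sizeof(out));
  return out;
}
```

For example, `reinterpret_sum(0x3F000000u, 0x00800000u)` reinterprets the bit pattern `0x3F800000`, which is the IEEE 754 encoding of `1.0f`.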
     



