MasterJH5574 opened a new pull request, #12819:
URL: https://github.com/apache/tvm/pull/12819

   Prior to this PR, the LCA (lowest common ancestor) detector for buffers in 
TIR did not take buffer memory scopes or the GPU thread hierarchy into account. 
As a consequence, when an intermediate buffer lives in global memory, TIR's 
lowering passes do not necessarily allocate it outside all `blockIdx` loops. 
The global intermediate buffer can then end up allocated inside a GPU thread 
block, which is illegal.
   
   This PR fixes the issue by making the LCA detector aware of buffer memory 
scopes and the GPU hierarchy. With this fix, all global intermediate buffers 
are allocated outside `blockIdx`.
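
   To illustrate the idea, here is a minimal toy sketch in plain Python. It 
is not the code in this PR and does not use any TVM API: `Node`, `lca`, and 
`alloc_site` are hypothetical names, and the loop nest is modeled as a simple 
parent-linked tree. It assumes that a scope-aware detector, for a buffer in 
`global` scope, lifts the plain LCA to the parent of the outermost enclosing 
`blockIdx` loop.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """Hypothetical stand-in for a loop or block in a TIR-like tree."""
    name: str
    parent: Optional["Node"] = None
    thread_tag: Optional[str] = None  # e.g. "blockIdx.x" for a thread-bound loop


def ancestors(node: Optional[Node]) -> List[Node]:
    """Return the path from the root down to `node`, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = node.parent
    return path[::-1]


def lca(nodes: List[Node]) -> Optional[Node]:
    """Plain lowest common ancestor of all access sites."""
    common = None
    for level in zip(*(ancestors(n) for n in nodes)):
        if all(n is level[0] for n in level):
            common = level[0]
        else:
            break
    return common


def alloc_site(access_sites: List[Node], scope: str) -> Optional[Node]:
    """Scope-aware allocation site (assumed rule, see lead-in).

    Returns None when the buffer must be allocated at the function top level.
    """
    site = lca(access_sites)
    if scope == "global":
        # Walk from the root; the first blockIdx loop found is the outermost.
        for n in ancestors(site):
            if n.thread_tag is not None and n.thread_tag.startswith("blockIdx"):
                return n.parent
    return site


# Toy loop nest: root -> blockIdx.x loop -> threadIdx.x loop -> two blocks.
root = Node("root")
bx = Node("bx", parent=root, thread_tag="blockIdx.x")
tx = Node("tx", parent=bx, thread_tag="threadIdx.x")
producer = Node("producer", parent=tx)
consumer = Node("consumer", parent=tx)

plain_site = lca([producer, consumer])
global_site = alloc_site([producer, consumer], "global")
shared_site = alloc_site([producer, consumer], "shared")
```

   In this toy nest the plain LCA is the `threadIdx.x` loop, which sits 
inside the thread block; the scope-aware variant moves a `global` buffer up to 
`root`, while a `shared` buffer keeps the plain LCA.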
   
   cc @junrushao @tqchen @Hzfengsy @spectrometerHBH 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
