MasterJH5574 opened a new pull request, #15192:
URL: https://github.com/apache/tvm/pull/15192

   This PR enhances the LowerCrossThreadReduction pass with a rewrite for 
thread-broadcasting blocks.
   
   Previously, whenever a TIR block had thread-broadcast behavior 
(i.e., some thread var is free in the block), the pass never 
inserted a predicate for the block, so the final generated code had a 
race condition that sometimes led to wrong computation results.
   
   This PR enhances the pass by collecting thread var information during 
the transformation and rewriting each thread-broadcast TIR block with additional 
predicate clauses that bound the thread vars, effectively stating "only 
execute the block when `thread_var == 0`". This resolves the race condition 
in such blocks.

