[GitHub] [tvm] zhuwenxi commented on issue #7246: [BUG][Tensorize] race condition when using "tvm.tir.call_packed()" in a parallel schedule.

GitBox Tue, 23 Feb 2021 18:58:12 -0800


zhuwenxi commented on issue #7246:
URL: https://github.com/apache/tvm/issues/7246#issuecomment-784712542



   The function looks much like the fix I proposed. In my proposal, 
"reallocate the stack in the parallel for loop", the function looks like this:
   <pre>
   fn myfunc() {
      stack_tcode = @tir.tvm_stack_alloca("arg_tcode", 8)
      stack_value = @tir.tvm_stack_alloca("arg_value", 8)
      for i in range(10):
          stack_tcode = @tir.tvm_stack_alloca("arg_tcode", 8)   // Do reallocation if current loop is parallel
          stack_value = @tir.tvm_stack_alloca("arg_value", 8)   // Do reallocation if current loop is parallel
          tir.tvm_call_packed_lowered("tvm.contrib.cblas.matmul", stack_1)
   }
   </pre>
   
   So, from this point of view, is the only difference between 
"packed_arg_alloca" and "tvm_stack_alloca" that the former uses thread-local 
storage allocation?
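
   To make the race concrete, here is a minimal conceptual sketch in plain Python. It is not TVM code and does not call any TVM API: `run_workers`, `shared_stack`, and the one-element list standing in for the argument stack are all hypothetical names chosen for illustration. The barrier only exists to make the interleaving deterministic, so every worker packs its argument before any worker reads it back.

```python
import threading


def run_workers(num_workers, shared_stack):
    """Simulate packing an argument into a stack slot and reading it back.

    Hypothetical model: `stack` plays the role of a buffer allocated once
    outside the loop; each worker is one iteration of the parallel loop.
    """
    barrier = threading.Barrier(num_workers)
    stack = [None]                 # allocated once, outside the parallel loop
    seen = [None] * num_workers    # what each "packed call" actually received

    def worker(i):
        # Either reuse the outer buffer (racy) or reallocate per iteration.
        slot = stack if shared_stack else [None]
        slot[0] = i                # pack the argument
        barrier.wait()             # all writes land before any read
        seen[i] = slot[0]          # the call reads its argument

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


# Per-iteration allocation: every worker sees its own argument.
private = run_workers(4, shared_stack=False)

# Shared allocation: every worker sees whichever write landed last.
shared = run_workers(4, shared_stack=True)
```

   With the shared buffer, all workers read the same value, so at most one of them receives the argument it packed. With a buffer allocated inside the loop body, each worker reads back its own index.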


