Lunderberg commented on PR #16361:
URL: https://github.com/apache/tvm/pull/16361#issuecomment-1881661035

   Ah, I see.  It looks like this originates from calling `sch.bind` multiple 
times for the same thread index.  With that usage, I'm less sure whether this 
should be considered correct behavior.  Binding a loop to a thread index that 
isn't already bound leaves the total number of loop iterations unchanged.  
Binding a second loop to a thread index that is already bound in the enclosing 
scope constrains the two loop variables to always be equal, which reduces the 
number of iterations.
   
   ```python
   # Before
   for i in range(16):
       for j in range(16):
           B[i,j] = A[i,j]
   
   # First thread binding, still have 16*16 = 256 total iterations.  B is
   # set at all indices.
   for i in T.thread_binding('threadIdx.x', 16):
       for j in range(16):
           B[i,j] = A[i,j]
   
   # Second thread binding, now only have 16 iterations.  B is only set
   # for indices where i==j, along the diagonal.
   for i in T.thread_binding('threadIdx.x', 16):
       for j in T.thread_binding('threadIdx.x', 16):
           B[i,j] = A[i,j]
   ```
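
To make the iteration counts concrete, here is a rough plain-Python model of the three cases above. This is not TVM code: `simulate`, `bind_i`, and `bind_j` are names made up for illustration. The model assumes that a loop bound to `threadIdx.x` takes only the value of the executing thread's index, and that every thread runs the whole loop nest.

```python
def simulate(bind_i, bind_j, extent=16):
    """Model a 2-D copy loop nest where i and/or j may be bound to threadIdx.x.

    Returns (number of loop-body executions, set of (i, j) indices written).
    Hypothetical helper for illustration only; it does not call into TVM.
    """
    # Assumption: binding any loop launches `extent` threads; otherwise
    # the nest runs once, serially.
    num_threads = extent if (bind_i or bind_j) else 1

    iterations = 0
    written = set()
    for tx in range(num_threads):
        # A bound loop variable is pinned to the thread index; an unbound
        # loop still iterates over its full range.
        i_values = [tx] if bind_i else range(extent)
        j_values = [tx] if bind_j else range(extent)
        for i in i_values:
            for j in j_values:
                written.add((i, j))  # stands in for B[i, j] = A[i, j]
                iterations += 1
    return iterations, written


if __name__ == "__main__":
    for label, flags in [("before", (False, False)),
                         ("first binding", (True, False)),
                         ("second binding", (True, True))]:
        count, written = simulate(*flags)
        print(f"{label}: {count} iterations, {len(written)} elements written")
```

Under those assumptions, the first two cases execute the body 256 times and cover all of `B`, while the doubly-bound case executes it 16 times and writes only the diagonal.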


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
