Lunderberg commented on PR #16361:
URL: https://github.com/apache/tvm/pull/16361#issuecomment-1881661035
Ah, I see. It looks like this originates from calling `sch.bind` multiple
times for the same thread index. With that usage, I'm less sure whether this
should be considered correct behavior. When binding a loop to a thread index
that isn't currently bound, the total number of loop iterations remains the
same. When binding a loop to a thread index in a context where that index is
already bound, it constrains the two loop indices to always be equal.
```python
# Before
for i in range(16):
    for j in range(16):
        B[i, j] = A[i, j]

# First thread binding, still have 16*16 = 256 total iterations. B is
# set at all indices.
for i in T.thread_binding(16, thread="threadIdx.x"):
    for j in range(16):
        B[i, j] = A[i, j]

# Second thread binding, now only have 16 iterations. B is only set
# for indices where i==j, along the diagonal.
for i in T.thread_binding(16, thread="threadIdx.x"):
    for j in T.thread_binding(16, thread="threadIdx.x"):
        B[i, j] = A[i, j]
```
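
To make the iteration counts concrete, here is a rough plain-Python model of the three cases. It does not use TVM at all; `executed_indices`, `bind_i`, and `bind_j` are hypothetical names for this sketch, and it assumes a loop bound to `threadIdx.x` simply takes the value of the thread index rather than iterating.

```python
from itertools import product

N = 16  # loop extent used in the example above


def executed_indices(bind_i, bind_j, n=N):
    """Return the list of (i, j) pairs at which the loop body executes.

    Simplified model (an assumption of this sketch, not TVM's lowering):
    a loop bound to threadIdx.x takes the thread's index as its value,
    while an unbound loop iterates serially inside each thread.
    """
    # With no binding there is a single serial execution; otherwise
    # there is one execution per value of threadIdx.x.
    threads = range(n) if (bind_i or bind_j) else range(1)
    executed = []
    for tx in threads:
        i_values = [tx] if bind_i else range(n)
        j_values = [tx] if bind_j else range(n)
        executed.extend(product(i_values, j_values))
    return executed


before = executed_indices(bind_i=False, bind_j=False)
first = executed_indices(bind_i=True, bind_j=False)
second = executed_indices(bind_i=True, bind_j=True)

print(len(before), len(first), len(second))
print(all(i == j for i, j in second))
```

Under this model the unbound and singly bound versions both touch every element of `B`, while the doubly bound version only touches the diagonal.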