roastduck opened a new pull request #5382:

   Suppose the extent of `threadIdx.x` equals the warp size. Then only 
`threadIdx.x` should be an index into warp memory, while `threadIdx.y` and 
`threadIdx.z` should not, since a warp shuffle happens inside a warp, rather than 
   The old code made every index an index into warp memory, which led 
to a bug in my application. Unfortunately, I failed to construct a simple 
enough counter-example for this, because things are complicated by numerous 
transformation passes. The bug in my application is actually only a 
bounds-checking problem, and the memory access indices themselves are somehow 
correct. In any case, this PR does fix the problem.
   In fact, things are more complicated because the extent of `threadIdx.x` 
can be less than the warp size. To determine whether an index is inside or 
outside a warp, we need to know the extent of that index, but this information 
is not available in `MakeLoopNest`. For now I assume that only `threadIdx.x` is 
inside a warp, and I left a warning message for the case where we detect 
`threadIdx.y` or `threadIdx.z`.
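
   To illustrate why the extent matters, here is a minimal sketch in plain Python (not TVM code). It assumes a warp size of 32 and the usual CUDA linearization of thread indices, `x + y * Dx + z * Dx * Dy`. The helper names `lane_and_warp` and `indices_inside_warp` are hypothetical and exist only for this example.

```python
WARP_SIZE = 32  # assumption: warp size on current NVIDIA GPUs


def lane_and_warp(tx, ty, tz, dim_x, dim_y):
    """Return (lane id, warp id) of a thread, using CUDA's thread linearization."""
    linear = tx + ty * dim_x + tz * dim_x * dim_y
    return linear % WARP_SIZE, linear // WARP_SIZE


def indices_inside_warp(dim_x, dim_y, dim_z):
    """List the thread indices that take more than one value within warp 0."""
    varying = set()
    for tz in range(dim_z):
        for ty in range(dim_y):
            for tx in range(dim_x):
                _, warp = lane_and_warp(tx, ty, tz, dim_x, dim_y)
                if warp != 0:
                    continue
                # A nonzero value inside warp 0 means this index varies
                # between lanes of the same warp.
                if tx:
                    varying.add("threadIdx.x")
                if ty:
                    varying.add("threadIdx.y")
                if tz:
                    varying.add("threadIdx.z")
    return sorted(varying)


# Extent of threadIdx.x equals the warp size: only threadIdx.x is inside a warp.
print(indices_inside_warp(32, 4, 1))
# Extent of threadIdx.x is smaller: threadIdx.y also varies inside a warp.
print(indices_inside_warp(16, 4, 1))
```

   With an extent of 32 for `threadIdx.x`, `threadIdx.y` and `threadIdx.z` only select which warp a thread belongs to. With a smaller extent, they start to contribute to the lane id, which is the case the warning message is meant to flag.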
   Requesting a review from @tqchen or the author of the related PR (#5190), 
