roastduck commented on a change in pull request #5382:
URL: https://github.com/apache/incubator-tvm/pull/5382#discussion_r420490285
##########
File path: src/te/operation/op_util.cc
##########
@@ -164,9 +164,21 @@ MakeLoopNest(const Stage& stage,
value_map[iv] = dom->min;
} else {
runtime::ThreadScope ts =
runtime::ThreadScope::make(bind_iv->thread_tag);
- if (stage->scope == "" || stage->scope == "warp" ||
+ if (stage->scope == "" ||
static_cast<int>(runtime::StorageScope::make(stage->scope).rank)
<= ts.rank) {
value_map[iv] = var;
+ } else if (stage->scope == "warp" && ts.rank == 1) {
+ // To determine whether a thread index is inside or outside a warp, we need
+ // to know the thread extent. We leave a warning for now.
+ if (ts.dim_index == 0) {
+ value_map[iv] = var;
+ } else {
Review comment:
Here is an example of the 2nd situation. Suppose the extent of
`threadIdx.x` is 32 and the extent of `threadIdx.y` is 4. Then there are 4
warps, each consisting of 32 threads.
> In the 2nd situation, is cross "warp" access allowed?
No. All 128 threads access a warp storage `a`, but threads in different
warps are actually accessing different groups of registers, although each
group is called `a`. Only threads in the same warp access the same group of
registers. This is what I mean by inside/outside. It is just like shared
memory: different blocks access different pieces of SRAM, even though the
variable name is the same.
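To make the inside/outside distinction concrete, here is a minimal sketch of how a thread maps to a warp. It assumes a warp size of 32 and CUDA's x-fastest linearization of thread indices within a block. The helper names (`LinearThreadId`, `WarpId`, `SameWarp`) are hypothetical and are not part of TVM.

```cpp
#include <cassert>

// Assumption: 32 threads per warp, as on current NVIDIA GPUs.
constexpr int kWarpSize = 32;

// Linear index of thread (tx, ty) inside a block, x varying fastest.
// extent_x is the extent of threadIdx.x.
inline int LinearThreadId(int tx, int ty, int extent_x) {
  assert(tx >= 0 && tx < extent_x && ty >= 0);
  return ty * extent_x + tx;
}

// Index of the warp that thread (tx, ty) belongs to.
inline int WarpId(int tx, int ty, int extent_x) {
  return LinearThreadId(tx, ty, extent_x) / kWarpSize;
}

// True when both threads see the same group of warp-scope registers.
inline bool SameWarp(int tx0, int ty0, int tx1, int ty1, int extent_x) {
  return WarpId(tx0, ty0, extent_x) == WarpId(tx1, ty1, extent_x);
}
```

With `extent_x == 32`, `WarpId(tx, ty, 32)` equals `ty`, so `threadIdx.y` selects the warp (outside) and `threadIdx.x` selects the lane within it (inside). For other extents of `threadIdx.x` the split no longer falls cleanly on one index, which is why the thread extent has to be known.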
> In the 1st situation, does threadIdx.x's extent matter here or is it
handled by warp memory lowering?
It does not matter. The `lower_warp_memory` pass will handle different extents.