LeiWang1999 opened a new issue, #17375:
URL: https://github.com/apache/tvm/issues/17375
Lead to Suboptimal Shared Memory Reuse.
PR #9341 introduced liveness analysis to merge shared memory allocations.
However, the current implementation places touched-buffer records at the
outermost scope (e.g., outer loops) rather than at the innermost possible
scope (e.g., inner loops or conditional branches), which can lead to incorrect
liveness analysis. Although this approach happens to work well for some cases,
such as the GEMM kernel, it fails in more complex scenarios, such as batched
GEMM or more complex algorithms: because the outermost scope is always the
single for-loop node, every buffer ends up with an incorrect gen/kill point.
One solution I am applying is to replace:
```cpp
if (it != alloc_info_.end() && it->second.alloc) {
  ICHECK_LT(it->second.level, scope_.size());
  if (IsAppropriateSharedMemory(GetRef<Var>(buf))) {
    scope_[it->second.level].touched.push_back(buf);
  }
}
```
with
```cpp
if (IsAppropriateSharedMemory(GetRef<Var>(buf))) {
  scope_[scope_.size() - 1].touched.push_back(buf);
}
```
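
To illustrate why the recording scope matters, here is a minimal, self-contained sketch. It is not TVM code: `ScopeEntry`, `Access`, `ComputeLiveRanges`, and `CanShareMemory` are hypothetical names, and the model assumes scopes are simply numbered in program order, with a buffer's gen point being the first entry that touches it and its kill point the last.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for one scope entry in the linearized
// statement sequence; it only records which buffers were touched there.
struct ScopeEntry {
  std::vector<std::string> touched;
};

// One buffer access. alloc_scope is the scope in which the buffer was
// allocated (the outer loop here); innermost_scope is where the access occurs.
struct Access {
  std::string buffer;
  std::size_t alloc_scope;
  std::size_t innermost_scope;
};

// (gen, kill) indices into the scope sequence.
using LiveRange = std::pair<std::size_t, std::size_t>;

// Record every access in either its allocation scope or its innermost scope,
// then derive gen/kill as the first/last scope entry touching each buffer.
std::map<std::string, LiveRange> ComputeLiveRanges(
    const std::vector<Access>& accesses, std::size_t num_scopes,
    bool record_at_innermost) {
  std::vector<ScopeEntry> scopes(num_scopes);
  for (const Access& a : accesses) {
    std::size_t idx = record_at_innermost ? a.innermost_scope : a.alloc_scope;
    scopes.at(idx).touched.push_back(a.buffer);
  }
  std::map<std::string, LiveRange> ranges;
  for (std::size_t i = 0; i < scopes.size(); ++i) {
    for (const std::string& buf : scopes[i].touched) {
      auto it = ranges.find(buf);
      if (it == ranges.end()) {
        ranges.emplace(buf, LiveRange{i, i});  // first touch: gen point
      } else {
        it->second.second = i;  // later touch: extend the kill point
      }
    }
  }
  return ranges;
}

// Two buffers can reuse the same memory only if their live ranges are disjoint.
bool CanShareMemory(const LiveRange& a, const LiveRange& b) {
  return a.second < b.first || b.second < a.first;
}
```

In this toy model, take two buffers allocated in outer scope 0, one used only in inner scope 1 and the other only in inner scope 2. Recorded at the allocation scope, both get the live range (0, 0) and cannot share memory. Recorded at the innermost scope, their ranges are (1, 1) and (2, 2), so reuse becomes possible.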
A more detailed analysis can be found at [TVM Shared Memory Reuse
Analysis](https://leiblog.wang/TVM%E4%B8%AD%E7%9A%84Shared-Memory-Reuse-Pass-%E5%88%86%E6%9E%90/).
If you think this analysis is correct, I can submit a PR then. :)