masahi commented on a change in pull request #10689:
URL: https://github.com/apache/tvm/pull/10689#discussion_r835996197
##########
File path: src/tir/schedule/primitive/sampling.cc
##########
@@ -299,22 +299,12 @@ std::vector<int64_t> SamplePerfectTile(support::LinearCongruentialEngine::TRandS
     return SamplePerfectTile(rand_state, extent, n_splits);
   }
   CHECK_GE(n_splits, 2) << "ValueError: Cannot tile a loop into " << n_splits << " splits";
-  std::vector<int32_t> innermost_candidates;
-  innermost_candidates.reserve(max_innermost_factor);
-  for (int32_t i = 1; i <= max_innermost_factor; ++i) {
-    if (extent % i == 0) {
-      innermost_candidates.push_back(i);
+  while (true) {
+    std::vector<int64_t> result = SamplePerfectTile(rand_state, extent, n_splits);
+    if (result.back() <= max_innermost_factor) {
+      return result;
Review comment:
I found an interesting issue with this change. After this PR, my VNNI
tuning test hangs at
https://gist.github.com/masahi/3579be9b2b9506106270eb2217746e74#file-vnni_dense_tir-py-L164,
i.e. the second call in
```
def schedule_rule_batch_matmul_vnni(sch: tir.Schedule, bmm_block):
    sch_copy = sch.copy()
    schedule_batch_matmul(bmm_block, None, True, sch, layout_trans_compute_root=False)
    schedule_batch_matmul(bmm_block, None, True, sch_copy, layout_trans_compute_root=True)  # <- Hangs at sample_perfect_tile
    return [sch, sch_copy]
```
If I remove the second `schedule_batch_matmul`, everything is ok.
Apparently, during the second call to `sample_perfect_tile`,
`SamplePerfectTile` above keeps returning 1024 when `max_innermost_factor` is
128 and `extent` is 1024, so the `while (true)` loop never exits. Does this
ring a bell?
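
To make the failure mode concrete, here is a minimal Python sketch of the same retry-until-accepted pattern. It is not TVM's implementation: `sample_perfect_tile`, `stuck_sampler`, and `sample_with_rejection` are hypothetical stand-ins, and the `max_attempts` bound is an assumption added only so the demo terminates. It only illustrates that the loop relies on the underlying sampler eventually producing a result whose last factor fits under `max_innermost_factor`; it does not explain why the sampler gets stuck in the case above.

```python
import random
from typing import Callable, List

Sampler = Callable[[random.Random, int, int], List[int]]


def sample_perfect_tile(rng: random.Random, extent: int, n_splits: int) -> List[int]:
    """Hypothetical stand-in: pick n_splits factors whose product is extent."""
    factors = []
    remaining = extent
    for _ in range(n_splits - 1):
        divisors = [d for d in range(1, remaining + 1) if remaining % d == 0]
        f = rng.choice(divisors)
        factors.append(f)
        remaining //= f
    factors.append(remaining)  # the last entry plays the role of the innermost factor
    return factors


def stuck_sampler(rng: random.Random, extent: int, n_splits: int) -> List[int]:
    """Degenerate sampler that always puts the whole extent in the innermost factor."""
    return [1] * (n_splits - 1) + [extent]


def sample_with_rejection(
    sampler: Sampler,
    rng: random.Random,
    extent: int,
    n_splits: int,
    max_innermost_factor: int,
    max_attempts: int = 1000,  # assumption: the loop in the diff has no such bound
) -> List[int]:
    """Retry until the innermost factor is small enough, or give up."""
    for _ in range(max_attempts):
        result = sampler(rng, extent, n_splits)
        if result[-1] <= max_innermost_factor:
            return result
    raise RuntimeError(
        f"no tiling with innermost factor <= {max_innermost_factor} "
        f"after {max_attempts} attempts"
    )


if __name__ == "__main__":
    rng = random.Random(0)
    # A sampler that explores the space finds an acceptable tiling.
    print(sample_with_rejection(sample_perfect_tile, rng, 1024, 4, 128))
    # A sampler that keeps returning 1024 would spin forever without the bound.
    try:
        sample_with_rejection(stuck_sampler, rng, 1024, 4, 128, max_attempts=10)
    except RuntimeError as err:
        print("gave up:", err)
```

With an unbounded `while (true)`, the second case would hang in the same way as the test above, so a retry bound or a fallback to enumerating valid innermost candidates might be worth considering.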