masahi commented on a change in pull request #5727:
URL: https://github.com/apache/incubator-tvm/pull/5727#discussion_r435576994
##########
File path: src/tir/transforms/lower_thread_allreduce.cc
##########
@@ -478,9 +478,20 @@ class ThreadAllreduceBuilder final : public StmtExprMutator {
// the warp size.
//
// TODO(tvm-team) reduction with a sub-warp of 8 or 16 threads.
+ // Note: The ROCm backend will only have warp reductions for now.
+ // Also, the warp/wavefron size differs (64 on rocm, 32 on cuda).
bool is_warp_reduction(const std::vector<DataType>& types) const {
// Only cuda target supports warp reductions.
- if (target_->target_name != "cuda") return false;
+ if ((target_->target_name != "cuda") && (target_->target_name != "rocm")) return false;
+
+ // rocm only supports 32 bit operands afor shuffeling at the moment
Review comment:
afor -> for
shuffeling -> shuffling
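
For readers following the hunk above, here is a minimal, self-contained sketch of the target check it describes. It is not the code from the PR: `SimpleType`, `WarpSizeFor`, and `IsWarpReductionCandidate` are hypothetical names introduced for illustration, and reading "32 bit operands" as "bit width exactly 32" is an assumption. The real `is_warp_reduction` presumably checks further conditions that are not visible in this hunk.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a data type; only the bit width matters here.
struct SimpleType {
  int bits;
};

// Warp (CUDA) / wavefront (ROCm) size as stated in the diff comment.
// Returns 0 for targets that have no warp reductions in this sketch.
int WarpSizeFor(const std::string& target_name) {
  if (target_name == "cuda") return 32;
  if (target_name == "rocm") return 64;
  return 0;
}

// Sketch of the target/type gate only, not the full predicate.
bool IsWarpReductionCandidate(const std::string& target_name,
                              const std::vector<SimpleType>& types) {
  // Only cuda and rocm are considered.
  if (WarpSizeFor(target_name) == 0) return false;

  // Assumption: on rocm, every operand must be exactly 32 bits wide
  // to be shuffled.
  if (target_name == "rocm") {
    for (const SimpleType& t : types) {
      if (t.bits != 32) return false;
    }
  }
  return true;
}
```

Keeping the warp size in one helper means the 32 vs. 64 difference is stated once rather than repeated at each use.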