merrymercy commented on a change in pull request #7038:
URL: https://github.com/apache/tvm/pull/7038#discussion_r536513571



##########
File path: src/auto_scheduler/feature.cc
##########
@@ -1296,7 +1296,8 @@ void GetPerStoreFeaturesWorkerFunc(const SearchTask& task, const State& state, i
     }
     auto mod = IRModule(Map<GlobalVar, BaseFunc>({{global_var, f}}));
 
-    if (task->target->kind->device_type == kDLGPU) {
+    auto device_type = task->target->kind->device_type;
+    if (device_type == kDLGPU || device_type == kDLROCM) {

Review comment:
       To align with the search policy, we could reuse the condition from this function:
   
https://github.com/apache/tvm/blob/fd5ce645941153972ecee404c90479b2b391df15/src/auto_scheduler/search_policy/utils.h#L55-L62
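
A minimal sketch of the idea, not the code from the linked file: keep the "is this a GPU-like target" test in one helper so that feature extraction and the search policy cannot drift apart. The enum, its values, and the name `IsGpuLikeDevice` are stand-ins invented for illustration, and the set of device types listed is an assumption. In the real code the check would use the DLPack constants (`kDLGPU`, `kDLROCM`, ...) and whatever helper the search policy already exposes.

```cpp
// Stand-in for the DLPack device-type enum (hypothetical names, illustrative values).
enum DeviceType : int {
  kCPU = 1,
  kGPU = 2,  // CUDA
  kOpenCL = 4,
  kMetal = 8,
  kROCM = 10,
};

// Hypothetical shared predicate: one place that decides which device types
// should take the GPU code path. The list of types here is an assumption.
constexpr bool IsGpuLikeDevice(int device_type) {
  return device_type == kGPU || device_type == kOpenCL ||
         device_type == kMetal || device_type == kROCM;
}

// Compile-time checks that the predicate behaves as described.
static_assert(IsGpuLikeDevice(kROCM), "ROCm should take the GPU path");
static_assert(!IsGpuLikeDevice(kCPU), "CPU should not take the GPU path");
```

With a helper like this, the condition in the hunk above would reduce to a single call, and adding a new GPU-like backend later would only touch one place.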

##########
File path: src/auto_scheduler/search_task.cc
##########
@@ -66,6 +69,13 @@ HardwareParams HardwareParamsNode::GetDefaultHardwareParams(const Target& target
 
     device_api->GetAttr(ctx, tvm::runtime::DeviceAttrKind::kMaxRegistersPerBlock, &ret);
     int max_registers_per_block = ret;

Review comment:
       I think this is a bug.
   I will send another PR to rename `max_registers_per_block` to `max_local_memory_per_block` so that it aligns with the `VerifyGPUCode` pass.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

