jcf94 commented on a change in pull request #7635:
URL: https://github.com/apache/tvm/pull/7635#discussion_r593026285



##########
File path: python/tvm/relay/op/strategy/arm_cpu.py
##########
@@ -135,6 +136,14 @@ def conv2d_strategy_arm_cpu(attrs, inputs, out_type, target):
                     name="conv2d_direct_simd.micro_dev",
                 )
             elif kernel_layout == "HWIO":
+                if is_auto_scheduler_enabled():

Review comment:
   In the x86 strategy, we have:
   ```python
                   judge_winograd_auto_scheduler = (
                       "float" in data.dtype
                       and "float" in kernel.dtype
                       and kernel_h == 3
                       and kernel_w == 3
                       and stride_h == 1
                       and stride_w == 1
                       and dilation_h == 1
                       and dilation_w == 1
                       and 64 < co < 512
                       # The last condition of co is based on our profiling of resnet workloads
                       # on skylake avx512 machines. We found winograd is faster than direct
                       # only when co is within this range
                   )
   ```
   Do you have any suggestions on how to handle the `co` range on ARM?
   
   As for depthwise conv2d, I found that we do not have auto-scheduler support for NHWC even on x86, only for NCHWc.
   
   These problems may be better solved in a separate PR.
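
To make the open question about `co` concrete, here is a minimal sketch of the x86 check factored into a helper with a configurable channel range. The function name `should_use_winograd` and the `co_range` parameter are hypothetical and do not exist in TVM. The default `(64, 512)` bounds are the x86 values quoted above, not a profiled ARM range.

```python
def should_use_winograd(data_dtype, kernel_dtype, kernel_hw, strides,
                        dilation, co, co_range=(64, 512)):
    """Return True if a conv2d workload passes the winograd conditions.

    Hypothetical helper that mirrors the x86 condition quoted above.
    `co_range` is an exclusive (low, high) bound on the output channels;
    pass None to skip the channel check, e.g. until ARM has been profiled.
    """
    kernel_h, kernel_w = kernel_hw
    stride_h, stride_w = strides
    dilation_h, dilation_w = dilation

    # Shape and dtype conditions, identical to the x86 snippet.
    basic = (
        "float" in data_dtype
        and "float" in kernel_dtype
        and kernel_h == 3
        and kernel_w == 3
        and stride_h == 1
        and stride_w == 1
        and dilation_h == 1
        and dilation_w == 1
    )
    if not basic:
        return False
    if co_range is None:
        return True

    # Exclusive bounds, matching `64 < co < 512` in the x86 strategy.
    co_low, co_high = co_range
    return co_low < co < co_high
```

With this shape, the ARM strategy could pass its own bounds once they have been measured, for example `should_use_winograd("float32", "float32", (3, 3), (1, 1), (1, 1), co, co_range=None)` to apply only the shape and dtype conditions in the meantime.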




