jcf94 commented on a change in pull request #7635:
URL: https://github.com/apache/tvm/pull/7635#discussion_r593026285
##########
File path: python/tvm/relay/op/strategy/arm_cpu.py
##########
@@ -135,6 +136,14 @@ def conv2d_strategy_arm_cpu(attrs, inputs, out_type, target):
name="conv2d_direct_simd.micro_dev",
)
elif kernel_layout == "HWIO":
+ if is_auto_scheduler_enabled():
Review comment:
In x86, we have:
```python
judge_winograd_auto_scheduler = (
    "float" in data.dtype
    and "float" in kernel.dtype
    and kernel_h == 3
    and kernel_w == 3
    and stride_h == 1
    and stride_w == 1
    and dilation_h == 1
    and dilation_w == 1
    and 64 < co < 512
    # The last condition of co is based on our profiling of resnet workloads
    # on skylake avx512 machines. We found winograd is faster than direct
    # only when co is within this range
)
```
Do you have any suggestions on how to handle the `co` condition on ARM?
As for depthwise, I found that we do not have auto-scheduler support
for NHWC even on x86, only NCHWc support.
These problems may be better solved in a separate PR.
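One possible way to handle the `co` question, as a minimal sketch only: keep the
x86 shape conditions and make the `co` bound a per-target parameter.
`use_winograd` and `co_range` are hypothetical names, not TVM APIs, and any ARM
range would have to come from profiling; passing `None` skips the `co` check.

```python
def use_winograd(data_dtype, kernel_dtype, kernel_hw, strides, dilation, co, co_range=None):
    """Decide whether a conv2d workload qualifies for the winograd path.

    Hypothetical helper for illustration; not part of TVM.
    co_range is an exclusive (low, high) bound on output channels,
    or None to skip the output-channel check entirely.
    """
    # Same structural conditions as the x86 check: float 3x3 kernel,
    # unit stride, unit dilation.
    shape_ok = (
        "float" in data_dtype
        and "float" in kernel_dtype
        and tuple(kernel_hw) == (3, 3)
        and tuple(strides) == (1, 1)
        and tuple(dilation) == (1, 1)
    )
    if not shape_ok:
        return False
    if co_range is None:
        return True
    low, high = co_range
    return low < co < high


# x86 would pass the profiled range; ARM could pass None until it is profiled.
x86_choice = use_winograd("float32", "float32", (3, 3), (1, 1), (1, 1), 128, co_range=(64, 512))
arm_choice = use_winograd("float32", "float32", (3, 3), (1, 1), (1, 1), 1024, co_range=None)
```

With this shape, the x86 strategy keeps its `64 < co < 512` range unchanged, and the
ARM strategy can start without a bound and add one later if profiling shows it helps.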