FrozenGene commented on a change in pull request #7132:
URL: https://github.com/apache/tvm/pull/7132#discussion_r546664255
##########
File path: python/tvm/relay/op/strategy/mali.py
##########
@@ -69,6 +71,36 @@ def conv2d_strategy_mali(attrs, inputs, out_type, target):
                 raise RuntimeError(
                     "Unsupported weight layout {} for conv2d NCHW".format(kernel_layout)
                 )
+        elif layout == "NHWC":
+            assert kernel_layout == "HWIO"
+            if not is_auto_scheduler_enabled():
+                logger.error("conv2d NHWC layout is not enabled for mali with autotvm.")
+            strategy.add_implementation(
+                wrap_compute_conv2d(topi.nn.conv2d_nhwc, need_auto_scheduler_layout=True),
+                naive_schedule,
+                name="conv2d_nhwc.mali",
+            )
+            is_winograd_applicable = False
+            if len(kernel.shape) == 4:
+                kernel_h, kernel_w, _, _ = get_const_tuple(kernel.shape)
+                is_winograd_applicable = (
+                    "float" in data.dtype
+                    and "float" in kernel.dtype
+                    and kernel_h == 3
+                    and kernel_w == 3
+                    and stride_h == 1
+                    and stride_w == 1
+                    and dilation_h == 1
+                    and dilation_w == 1
+                )
Review comment:
I thought about this for a while, and I think the current approach is
acceptable. Winograd on CUDA is encapsulated because it needs complex logic to
distinguish the TensorCore case, but the Mali target doesn't need this.
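
For reference, the inline check in the hunk above is small enough to read as a
single predicate. A minimal sketch of the same conditions as a standalone
function follows. The name `winograd_applicable_nhwc` and its plain-tuple
arguments are hypothetical and only for illustration; they are not part of TVM
or of this PR, which keeps the check inline.

```python
def winograd_applicable_nhwc(data_dtype, kernel_dtype, kernel_shape, strides, dilation):
    """Hypothetical helper mirroring the inline NHWC/HWIO check in the diff.

    kernel_shape is expected in HWIO order: (kernel_h, kernel_w, in_c, out_c).
    """
    # The diff only evaluates the conditions for a 4-D kernel.
    if len(kernel_shape) != 4:
        return False
    kernel_h, kernel_w, _, _ = kernel_shape
    stride_h, stride_w = strides
    dilation_h, dilation_w = dilation
    # Same conditions as the diff: float dtypes, 3x3 kernel, unit stride and dilation.
    return (
        "float" in data_dtype
        and "float" in kernel_dtype
        and kernel_h == 3
        and kernel_w == 3
        and stride_h == 1
        and stride_w == 1
        and dilation_h == 1
        and dilation_w == 1
    )
```

With no TensorCore branch to select, the predicate has only one outcome to
report, which is why keeping it inline for Mali seems fine.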