Anndrey24 commented on code in PR #16899:
URL: https://github.com/apache/tvm/pull/16899#discussion_r1572475838
##########
python/tvm/topi/arm_cpu/conv2d_gemm.py:
##########
@@ -478,23 +498,21 @@ def schedule_conv2d_gemm_native(cfg, s, out, final_out):
s[C].tensorize(y_inner, gemm_acc)
s[C].parallel(x_outer)
else:
- k_outer, k_inner = s[C].split(k, 4)
- x_outer, y_outer, x_inner, y_inner = s[C].tile(x, y, x_factor=4,
y_factor=y_tile_size)
- y_inner_outer, y_inner_inner = s[C].split(y_inner, nparts=4)
+ k_outer, k_inner = s[C].split(k, factor=4)
+ x_outer, x_inner = s[C].split(x, factor=4)
+ y_outer, y_inner = s[C].split(y, factor=y_tile_size,
disable_predication=True)
Review Comment:
In theory it should be, as the two possible compute definitions of C in the
hybrid implementation have the shape `(batches, M_padded, N_padded)` in the
[scalable](https://github.com/apache/tvm/pull/16899/files#diff-42b1313a1be464c7f2c94f75d656be725f1ccb54b9391cd5b27c33009ac0e2d5R332)
and
[non-scalable](https://github.com/apache/tvm/pull/16899/files#diff-42b1313a1be464c7f2c94f75d656be725f1ccb54b9391cd5b27c33009ac0e2d5R348)
cases.
However, you are right that in practice the padding might get removed by one
of the tir passes. That is prevented for the scalable cases by this dummy
padding
[here](https://github.com/apache/tvm/pull/16899/files#diff-42b1313a1be464c7f2c94f75d656be725f1ccb54b9391cd5b27c33009ac0e2d5R341)
and in the non-scalable fp16 case by the one
[here](https://github.com/apache/tvm/pull/16899/files#diff-42b1313a1be464c7f2c94f75d656be725f1ccb54b9391cd5b27c33009ac0e2d5L346).
I believe I didn't use the same trick for the non-scalable fp32 case because I
noticed performance being better when allowing the padding to be removed in
some cases.
What do you think about changing it to
`disable_predication=use_scalable_vectors` in order to leave the non-scalable
case as it was before?
Should we have any more guard rails than this? It might be overkill, but
something like this also comes to mind: `disable_predication =
analyzer.can_prove(y.dom.extent % y_tile_size == 0)`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]