ekalda commented on code in PR #14483:
URL: https://github.com/apache/tvm/pull/14483#discussion_r1158615483


##########
python/tvm/topi/arm_cpu/conv2d_spatial_pack.py:
##########
@@ -316,12 +317,23 @@ def _tile_size(axis, candidates):
                     return candidate
             return 1
 
-        # Tile size 8 results in efficient vectorization for these schedules.
-        # If the axis is not divisible by 8, try 4
+        # For data tensors with unity height and width we can leave it to the
+        # backend to vectorize the inner loop. This has been observed to be more
+        # performant on SVE targets with a vector width > 128bits.
+        target = Target.current(allow_none=False)
+        if target.features.has_sve and OW == OH and OW == 1:

Review Comment:
   Is there no required minimum length for the output channels for this to be beneficial?
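
   To make the question concrete, here is a minimal sketch of what such a guard might look like. It is not taken from the PR: `pick_co_tile`, `min_co` and the value 16 are hypothetical, and any real threshold would have to come from benchmarking on SVE hardware. The fallback mirrors the existing "try 8, then 4, else 1" heuristic from the hunk above.

```python
from typing import Optional

# Hypothetical threshold: the PR does not state one, so this value is
# only a placeholder that would need to be tuned by benchmarking.
MIN_CO_FOR_BACKEND_VECTORIZATION = 16


def pick_co_tile(
    has_sve: bool,
    oh: int,
    ow: int,
    co: int,
    min_co: int = MIN_CO_FOR_BACKEND_VECTORIZATION,
) -> Optional[int]:
    """Return None to leave vectorization to the backend, else a tile size."""
    # Only skip explicit tiling when the output is 1x1 *and* there are
    # enough output channels for the wider vectors to pay off.
    if has_sve and oh == 1 and ow == 1 and co >= min_co:
        return None
    # Otherwise fall back to the existing heuristic: prefer 8, then 4, else 1.
    for candidate in (8, 4):
        if co % candidate == 0:
            return candidate
    return 1
```

   With a guard like this, a 1x1 output with only a handful of output channels would still take the fixed-tile path.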



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]