vielmetti commented on issue #18561:
URL: https://github.com/apache/tvm/issues/18561#issuecomment-3793544990

   > Input shape: (14, 23, 67, 99) ≈ 1.7M elements
   
   Have you tried this with a range of input shapes, to see whether the 
slowdown (or speedup) depends on the input dimensions?  In particular, if 
you make the element count 2x/4x/8x larger, does the cost of the 
computation also grow 2x/4x/8x?
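
   One way to check the scaling question above is a small timing harness. 
This is only a minimal sketch: `placeholder_workload` is a hypothetical 
stand-in (not a TVM API) that would need to be replaced with a call into the 
compiled operator, and scaling only the leading dimension is an assumption 
about how to grow the input.

```python
import time


def element_count(shape):
    """Return the number of elements in a tensor of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count


def scaled_shapes(base_shape, factors=(1, 2, 4, 8)):
    """Scale only the leading dimension, so the element count grows by each factor."""
    return [(base_shape[0] * f,) + tuple(base_shape[1:]) for f in factors]


def best_time(run, shape, repeats=5):
    """Return the best wall-clock time (seconds) over several runs of run(shape)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        run(shape)
        best = min(best, time.perf_counter() - start)
    return best


def scaling_report(run, base_shape, factors=(1, 2, 4, 8), repeats=5):
    """Return (shape, elements, size_ratio, time_ratio) rows relative to the first shape."""
    rows = []
    base_elems = base_time = None
    for shape in scaled_shapes(base_shape, factors):
        elems = element_count(shape)
        elapsed = best_time(run, shape, repeats)
        if base_elems is None:
            base_elems, base_time = elems, elapsed
        # Guard against a zero baseline on timers with coarse resolution.
        time_ratio = elapsed / base_time if base_time > 0 else float("nan")
        rows.append((shape, elems, elems / base_elems, time_ratio))
    return rows


def placeholder_workload(shape):
    """Hypothetical stand-in: replace with a call into the compiled operator under test."""
    return sum(range(element_count(shape)))


if __name__ == "__main__":
    report = scaling_report(placeholder_workload, (14, 23, 67, 99))
    for shape, elems, size_ratio, time_ratio in report:
        print(f"{shape}: {elems} elements, size x{size_ratio:.1f}, time x{time_ratio:.2f}")
```

   If the time ratios track the size ratios, the cost is roughly linear in the 
element count; time growing noticeably faster than size would instead point 
at something size-dependent, such as cache or memory-bandwidth effects.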
   
   Does this same slowdown happen on other RVV hardware or in emulation?
   
   I remember old arm64 server hardware where the NEON units were nominally a 
speedup but were in practice so slow that scalar operations turned out to be 
faster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

