giuseros commented on pull request #6095:
URL: https://github.com/apache/incubator-tvm/pull/6095#issuecomment-661738954


   Hi @FrozenGene , 
   Before introducing tuning knobs, I wanted to first run an analysis to find 
the minimum set of tuning parameters that gives the best performance. 
   
   The aim is to reduce the tuning time. We are sometimes constrained by the 
number of registers available on AArch64, so trying out different splits 
might only increase the tuning time without giving any benefit. 
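
   To illustrate the register argument, here is a minimal sketch (not code from this PR or from ACL). It assumes a hypothetical cost model for a 3x3 depthwise tile: one vector register per output accumulator, per input element of the receptive field, and per kernel weight. The names `estimated_registers` and `feasible_tiles` are made up for the example. The only hard fact used is that AArch64 has 32 SIMD registers (v0-v31).

```python
from itertools import product

NUM_VECTOR_REGS = 32  # AArch64 provides 32 SIMD registers (v0-v31)


def estimated_registers(tile_h, tile_w, kernel=3, stride=1):
    """Rough, hypothetical register estimate for one depthwise output tile.

    Assumption: one vector register per output accumulator, per input
    element of the receptive field, and per kernel weight.
    """
    in_h = (tile_h - 1) * stride + kernel
    in_w = (tile_w - 1) * stride + kernel
    accumulators = tile_h * tile_w
    inputs = in_h * in_w
    weights = kernel * kernel
    return accumulators + inputs + weights


def feasible_tiles(candidates, budget=NUM_VECTOR_REGS):
    """Keep only the tile shapes whose estimate fits in the register budget."""
    return [(h, w) for h, w in candidates if estimated_registers(h, w) <= budget]


# Candidate output tile shapes a tuner might otherwise try exhaustively.
candidates = list(product([1, 2, 4, 8], repeat=2))
pruned = feasible_tiles(candidates)
print(f"{len(pruned)} of {len(candidates)} candidate tiles fit "
      f"in {NUM_VECTOR_REGS} registers")
```

   Under this simplified model most of the candidate splits would spill, so tuning over them would only add time. A real analysis would have to measure this on hardware rather than rely on such an estimate.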
   
   So the idea was to have a "default" schedule which mimics the [ACL 
implementation](https://github.com/ARM-software/ComputeLibrary/blob/master/src/core/NEON/kernels/convolution/depthwise/impl_qa8_qa8.hpp#L292-L314)
 and then introduce (the minimal set of) tuning knobs + tensorization to speed 
things up. 
   
   What do you think? If you want to add tuning knobs in this PR, I will try to 
do the tuning analysis today.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

