ekalda opened a new pull request, #14003:
URL: https://github.com/apache/tvm/pull/14003

   No changes to the compute; this PR makes various bug fixes and improvements 
to the corresponding NHWC schedule:
   
   * There is currently a block that is not run as part of the tuning trials 
but does run during compilation with tuning logs. A lot of unrolling and 
vectorization happens there, and for some conv2d operators it results in about 
an 18x increase in assembly size and can take around 10 minutes per operator 
to compile. That essentially makes whole networks uncompilable, so remove that 
block.
   * There are no fallback config and no NHWC logs in TopHub, so add a fallback 
config. This significantly reduces compile time when no tuning logs are used, 
e.g. by about 10x for mobilenet.
   * The order of axes we passed to reorder_config was different from the order 
that was used to define the reorder. Judging by the compute definition and the 
tuning results of whole networks, this appears to be a bug, so make the two 
orders match.
   * Constrain potential unrolling to the OWI and OCI axes, as unrolling across 
OHI results in uncompilably huge code size. This change reduces the share of 
unsuccessful tuning trials from about 50% to about 20%.
   * Other minor tweaks.
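
To illustrate the axis-order point above, here is a minimal pure-Python sketch of why the mismatch matters. It assumes the reorder knob stores its choice as a permutation of indices into the axis list seen at definition time. The helper `apply_perm` and the axis names are hypothetical and are not taken from the schedule itself.

```python
def apply_perm(perm, axes):
    """Rearrange `axes` by an index permutation (hypothetical helper)."""
    return [axes[i] for i in perm]


# Hypothetical axis names, in the order used when the knob was defined.
defined_order = ["n", "oho", "owo", "oco", "ohi", "owi", "oci"]

# A loop nest the tuner selected, stored as indices into defined_order.
wanted = ["n", "oho", "owo", "ohi", "oco", "owi", "oci"]
perm = [defined_order.index(axis) for axis in wanted]

# Applying the permutation to the same axis order gives the intended nest.
consistent = apply_perm(perm, defined_order)

# Applying it to a differently ordered list silently gives another nest.
passed_order = ["n", "oho", "owo", "ohi", "oco", "owi", "oci"]
mismatched = apply_perm(perm, passed_order)

print("intended:  ", wanted)
print("consistent:", consistent)
print("mismatched:", mismatched)
```

In this model nothing fails when the orders differ: the result is still a valid loop nest, only not the one the tuner measured, which is why such a mismatch can go unnoticed.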
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.