ekalda opened a new pull request, #14003: URL: https://github.com/apache/tvm/pull/14003
No changes to the compute; various bugfixes and improvements to the corresponding NHWC schedule:

* There is currently a block that is not run as part of tuning trials but does run during compilation with tuning logs. Because a lot of unrolling and vectorization happens there, for some conv2d operators it results in about an 18x increase in asm size and can take around 10 minutes per operator to compile. That essentially makes whole networks uncompilable, so remove that block.
* There is no fallback config, and there are no NHWC logs in TopHub, so add a fallback config. This significantly reduces compile time without tuning, e.g. by about 10x for mobilenet.
* The order of axes passed to reorder_config was different from the order used to define the reorder. Judging by the compute definition and the tuning results of whole networks, this appears to be a bug.
* Constrain potential unrolling to the OWI and OCI axes, as unrolling across OHI results in uncompilably huge code size. This change reduces the share of unsuccessful tuning trials from about 50% to about 20%.
* Other minor tweaks.
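
To illustrate why the axis-order mismatch matters, here is a minimal plain-Python sketch. It is not TVM code and does not reproduce the actual schedule. It assumes a reorder knob behaves like a permutation of indices into an axis list. The helper `apply_permutation` and the outer axis names `oho`, `owo` and `oco` are hypothetical (the description above only names OHI, OWI and OCI), and the two axis orders are made up for the example.

```python
def apply_permutation(axes, perm):
    """Return a new list where position i holds axes[perm[i]]."""
    return [axes[i] for i in perm]


# Hypothetical axis order used when the reorder knob is defined.
define_order = ["oho", "owo", "oco", "ohi", "owi", "oci"]

# Hypothetical (different) axis order used when the knob is applied.
apply_order = ["oho", "owo", "ohi", "oco", "owi", "oci"]

# One candidate permutation, expressed as indices into the axis list.
perm = (0, 1, 2, 4, 3, 5)

# Loop order the tuner believes it is evaluating.
intended = apply_permutation(define_order, perm)

# Loop order actually produced when the same indices are applied
# to a list that is ordered differently.
actual = apply_permutation(apply_order, perm)

print("intended:", intended)
print("actual:  ", actual)
```

Under these assumptions the same index permutation yields two different loop nests, so the tuned configuration would describe a loop order other than the one the schedule actually builds. Passing the same axis list in both places removes the discrepancy.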
