kevinthesun commented on pull request #7137: URL: https://github.com/apache/tvm/pull/7137#issuecomment-751297202
@masahi Thanks for the investigation and improvement. Indeed, this change won't affect CPU perf much even when MKL is enabled, and large dynamic shape dense becomes faster. One interesting thing about the output: with PyTorch 1.7, the TVM and PyTorch results match exactly with this change, but with PyTorch 1.4 there is still a mismatch, though it won't affect final accuracy. I'm fine with this change now. BTW, enabling MKL on my Intel Xeon Platinum machine with 18 cores reduces the latency of PyTorch maskrcnn from 1000 ms to 600 ms. Those large dynamic shape dense layers do contribute a lot to the latency.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
