zxybazh opened a new pull request, #12648: URL: https://github.com/apache/tvm/pull/12648
This PR is a follow-up to #12127. It updates a critical local read cache (`d`) in the `data_pack` block and the scheduling of the kernel parts, if available. This change should bring MetaSchedule (MS) performance in line with AutoTVM for NCHW Conv2d on CUDA. Benchmarking results to follow. The dispatch priority change will follow in a separate PR. CC @vinx13 @junrushao
