malixian opened a new pull request, #15967: URL: https://github.com/apache/tvm/pull/15967
This pull request adds support for AMD Matrix Core in MetaScheduler.

## Changes Made

- **Add code generation for HIP**. Additional support for HIP code generation is necessary. The advantages of generating HIP code are:
  1. The generated code is highly readable.
  2. It avoids some problems caused by generating LLVM IR.

  This part of the code was mainly contributed by @LeiWang1999.
- **Add a multi-level-tiling schedule rule for Matrix Core**, similar to the implementation for CUDA Tensor Core.
- **Add WMMA tensor intrinsics**.

## Test Result

Compared with template-based scheduling, MetaScheduler achieves better performance when `max_trials_global` is set to 32.

| data type | M, N, K | template-based | meta scheduler |
| ---- | ---- | ---- | ---- |
| f16f16f32 | 128, 128, 128 | 692 GFLOPS | 1251 GFLOPS |
| f16f16f32 | 256, 256, 256 | 1166 GFLOPS | 5310 GFLOPS |
| f16f16f32 | 1024, 1024, 1024 | 24126 GFLOPS | 28856 GFLOPS |
| f16f16f32 | 4096, 4096, 4096 | 44041 GFLOPS | 61763 GFLOPS |
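To make the comparison in the results table easier to read, the following is a minimal sketch that derives the speedup of the meta scheduler over the template-based schedule from the reported throughput. The helper names (`gemm_flops`, `implied_time_us`, `speedups`) are hypothetical and are not part of this PR or of TVM. The sketch assumes the usual convention of counting `2 * M * N * K` floating-point operations per matmul, which the PR does not state explicitly.

```python
# Throughput copied from the results table, in GFLOPS:
# (M, N, K) -> (template-based, meta scheduler)
RESULTS = {
    (128, 128, 128): (692, 1251),
    (256, 256, 256): (1166, 5310),
    (1024, 1024, 1024): (24126, 28856),
    (4096, 4096, 4096): (44041, 61763),
}


def gemm_flops(m, n, k):
    """FLOPs in an M x N x K matmul, counting one multiply and one add per MAC.

    Assumption: this is the counting convention behind the reported GFLOPS.
    """
    return 2 * m * n * k


def implied_time_us(m, n, k, gflops):
    """Kernel time in microseconds implied by a throughput figure."""
    return gemm_flops(m, n, k) / (gflops * 1e9) * 1e6


def speedups(table):
    """Ratio of meta scheduler throughput to template-based throughput."""
    return {shape: tuned / template for shape, (template, tuned) in table.items()}


if __name__ == "__main__":
    for shape, ratio in speedups(RESULTS).items():
        template, tuned = RESULTS[shape]
        print(
            f"{shape}: {ratio:.2f}x "
            f"({implied_time_us(*shape, template):.1f} us -> "
            f"{implied_time_us(*shape, tuned):.1f} us)"
        )
```

Under these assumptions, the ratio is largest for the 256 case and smallest for the 1024 case.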
