MasterJH5574 opened a new pull request, #17618: URL: https://github.com/apache/tvm/pull/17618
This PR introduces the MLA attention kernels written in TIR and implements the MLA computation logic in the KV cache. A new unit test file is added to verify the correctness of the TIR kernels. This PR also fixes the tile size initialization of a few TIR prefill kernels.
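For context, MLA (multi-head latent attention) caches a low-rank latent projection of the hidden states instead of full per-head keys and values, and up-projects from that latent at attention time. The sketch below is a minimal single-head NumPy illustration of that computation, not the TIR kernels in this PR; all weight names and dimensions are illustrative.

```python
import numpy as np

def mla_attention(x, W_dkv, W_uk, W_uv, W_q):
    """Single-head MLA-style attention sketch (no causal mask, no RoPE).

    x:     (seq, d_model) hidden states
    W_dkv: (d_model, d_latent) down-projection; c_kv is what the KV cache stores
    W_uk:  (d_latent, d_head) key up-projection
    W_uv:  (d_latent, d_head) value up-projection
    W_q:   (d_model, d_head) query projection
    """
    c_kv = x @ W_dkv            # compressed latent KV, cached per token
    k = c_kv @ W_uk             # up-project keys from the latent
    v = c_kv @ W_uv             # up-project values from the latent
    q = x @ W_q
    scale = 1.0 / np.sqrt(q.shape[-1])
    scores = (q @ k.T) * scale
    scores -= scores.max(axis=-1, keepdims=True)   # numerically stable softmax
    p = np.exp(scores)
    p /= p.sum(axis=-1, keepdims=True)
    return p @ v
```

Because the up-projections are linear, this is mathematically equivalent to standard attention with merged key/value weights `W_dkv @ W_uk` and `W_dkv @ W_uv`, while the cache only needs to hold the smaller `c_kv`.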
