MasterJH5574 opened a new pull request, #17618:
URL: https://github.com/apache/tvm/pull/17618

   This PR introduces the MLA (Multi-head Latent Attention) kernels written in TIR. It also implements the MLA computation logic for the KV cache.
   
   A new unit test file is added to ensure the correctness of the TIR kernels.
   
   This PR also fixes the tile-size initialization of a few TIR prefill kernels.
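   For readers unfamiliar with MLA: instead of caching full per-head K/V tensors, MLA caches one compressed latent vector per token and up-projects it into keys and values at attention time. The sketch below illustrates that computation with NumPy; all dimension names and weight tensors here are illustrative assumptions, not the PR's actual TIR kernel code or the TVM KV-cache API.

   ```python
   import numpy as np

   # Hypothetical sizes for illustration only (not taken from the PR).
   num_heads, head_dim, latent_dim, seq_len = 4, 16, 8, 6
   rng = np.random.default_rng(0)

   q = rng.standard_normal((num_heads, 1, head_dim))              # current-token queries
   c_kv = rng.standard_normal((seq_len, latent_dim))              # cached latent KV states (one per token)
   w_uk = rng.standard_normal((num_heads, latent_dim, head_dim))  # K up-projection weights
   w_uv = rng.standard_normal((num_heads, latent_dim, head_dim))  # V up-projection weights

   # Reconstruct per-head keys/values from the shared latent cache.
   k = np.einsum("sl,hld->hsd", c_kv, w_uk)
   v = np.einsum("sl,hld->hsd", c_kv, w_uv)

   # Standard scaled-dot-product attention over the reconstructed K/V.
   scores = np.einsum("hqd,hsd->hqs", q, k) / np.sqrt(head_dim)
   probs = np.exp(scores - scores.max(axis=-1, keepdims=True))
   probs /= probs.sum(axis=-1, keepdims=True)
   out = np.einsum("hqs,hsd->hqd", probs, v)  # shape: (num_heads, 1, head_dim)
   ```

   The point of the design is the cache footprint: the cache stores `seq_len * latent_dim` floats instead of `seq_len * num_heads * 2 * head_dim`, at the cost of the up-projection work that kernels like the ones in this PR fuse into the attention computation.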


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
