hamzaq5 opened a new issue, #18196: URL: https://github.com/apache/tvm/issues/18196
Grouped-query attention (GQA) is critical for modern LLM compilation workflows and is currently not available in Relax. Adding native GQA support to Relax would enable better performance and compatibility with transformer-based models exported from PyTorch, HuggingFace, or ONNX. Reference: https://arxiv.org/abs/2305.13245
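For reference, here is a minimal NumPy sketch of the semantics a GQA operator would need to reproduce. It is not Relax code and not a proposed API: the function name `grouped_query_attention`, the `(heads, seq, head_dim)` layout, and the lack of masking or batching are all assumptions made for illustration only.

```python
import numpy as np


def grouped_query_attention(q, k, v):
    """Illustrative GQA reference (hypothetical helper, not a TVM API).

    q:    (num_q_heads, seq_len, head_dim)
    k, v: (num_kv_heads, seq_len, head_dim), num_q_heads % num_kv_heads == 0
    """
    num_q_heads, _, head_dim = q.shape
    num_kv_heads = k.shape[0]
    if num_q_heads % num_kv_heads != 0:
        raise ValueError("num_q_heads must be a multiple of num_kv_heads")
    group_size = num_q_heads // num_kv_heads

    # Each KV head is shared by `group_size` consecutive query heads.
    k = np.repeat(k, group_size, axis=0)
    v = np.repeat(v, group_size, axis=0)

    # Scaled dot-product attention per head (no mask in this sketch).
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

With `num_kv_heads == num_q_heads` this reduces to standard multi-head attention, and with `num_kv_heads == 1` it reduces to multi-query attention. A native operator would presumably avoid materializing the repeated K/V tensors, which is where the performance benefit would come from.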
