Re: [PR] [runtime][vm] Add LoRA adapter metadata to paged KV cache [tvm]

via GitHub Wed, 11 Mar 2026 11:43:01 -0700


MagellaX commented on PR #18890:
URL: https://github.com/apache/tvm/pull/18890#issuecomment-4041341928


   > @MagellaX Thank you so much for the contributions! My overall read is that 
we probably need to first establish end-to-end LoRA serving flow, with runnable 
tests and real commands, before upstreaming parts. The main reason is that we 
don't want to iterate over the implementations for too many times in the 
mainline repo without seeing end-to-end effects.
   
   Yeah, totally agree with that! I think the main useful outcome of this PR 
was clarifying the runtime boundary a bit, but I agree the next step should be 
downstream first, not upstream first. I’ll focus on getting a minimal 
end-to-end LoRA serving path working in MLC with real runnable commands, tests, 
and a clear single-adapter flow, and then upstream only the smallest TVM pieces 
that are actually required by that working path. That should make the design 
much easier to evaluate and avoid churning upstream abstractions too early.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


