MagellaX commented on PR #18890: URL: https://github.com/apache/tvm/pull/18890#issuecomment-4041341928
> @MagellaX Thank you so much for the contributions! My overall read is that we probably need to first establish end-to-end LoRA serving flow, with runnable tests and real commands, before upstreaming parts. The main reason is that we don't want to iterate over the implementations for too many times in the mainline repo without seeing end-to-end effects.

Yeah, totally agree with that! I think the main useful outcome of this PR was clarifying the runtime boundary a bit, but I agree the next step should be downstream first, not upstream first. I'll focus on getting a minimal end-to-end LoRA serving path working in MLC, with real runnable commands, tests, and a clear single-adapter flow, and then upstream only the smallest TVM pieces that the working path actually requires. That should make the design much easier to evaluate and avoid churning upstream abstractions too early.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
