Re: [PR] [RFC] Add NNEF frontend [tvm-rfcs]
gyenesvi commented on PR #108: URL: https://github.com/apache/tvm-rfcs/pull/108#issuecomment-2102803235 Great, thank you! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [RFC] Add NNEF frontend [tvm-rfcs]
tqchen commented on PR #108: URL: https://github.com/apache/tvm-rfcs/pull/108#issuecomment-2102706022 Leaving it open for another week in case others want to chime in, otherwise LGTM
Re: [PR] [RFC] Add NNEF frontend [tvm-rfcs]
gyenesvi commented on PR #108: URL: https://github.com/apache/tvm-rfcs/pull/108#issuecomment-2102668139 Thanks for the info about the schedules and differences, it makes sense. As for moving on, what would be the next step now? Do you need any other info from us for reviewing?
Re: [PR] [RFC] Add NNEF frontend [tvm-rfcs]
tqchen commented on PR #108: URL: https://github.com/apache/tvm-rfcs/pull/108#issuecomment-2102587457 I think the main reason is that Relay incorporates autotuning by default, while Relax does not. The main rationale as of now is that we would like to decouple MetaSchedule tuning from the flow. That does not mean MetaSchedule cannot be applied; we do encourage users to apply MetaSchedule for traditional applications. In the build flow, MetaSchedule can then be applied by composing it with the default flow. The zero pipeline as of now mainly focuses on some extra out-of-the-box improvements for the latest LLM models and can expand to more in the future.
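The idea of keeping tuning out of the default flow and composing it in on request can be pictured with a small plain-Python sketch. Everything here is hypothetical and for illustration only: `Module`, `compose`, `legalize`, and `tune` are stand-ins invented for this example, not TVM types or TVM passes.

```python
from functools import reduce
from typing import Callable, Dict

# Stand-in for an IR module: function name -> textual body. Not a TVM type.
Module = Dict[str, str]
Pass = Callable[[Module], Module]


def compose(*passes: Pass) -> Pass:
    """Chain passes left to right into a single pass."""
    def pipeline(mod: Module) -> Module:
        return reduce(lambda m, p: p(m), passes, mod)
    return pipeline


def legalize(mod: Module) -> Module:
    # Hypothetical default step: lower each op to a loop-level form.
    return {name: f"lowered({body})" for name, body in mod.items()}


def tune(mod: Module) -> Module:
    # Hypothetical opt-in step: stands in for a schedule search.
    return {name: f"tuned({body})" for name, body in mod.items()}


# The default flow carries no tuning; tuning is layered on top when wanted.
default_flow = compose(legalize)
tuned_flow = compose(default_flow, tune)

mod = {"conv2d": "x*w"}
default_result = default_flow(mod)
tuned_result = tuned_flow(mod)
```

Because each flow is itself a pass, the tuned flow reuses the default flow unchanged, which is the decoupling described in the comment.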
Re: [PR] [RFC] Add NNEF frontend [tvm-rfcs]
agoston-mc commented on PR #108: URL: https://github.com/apache/tvm-rfcs/pull/108#issuecomment-2100973263

We have updated the PR with a Relax frontend, but we have also kept the Relay frontend as an option. We think it could be useful to have both, because we noticed performance differences during testing.

We observed that Relax with the default build pipeline is significantly slower than Relay. On CPU the runs were two orders of magnitude slower, and on GPU 3-5x slower. The models we tested were mobilenet and resnet variants, all static models. We observed the same with the ONNX Relax frontend, so we suspect the issue is with the compilation, not with the frontends. Is this a normal situation with the current state of development of Relax?

By using Meta Schedule with a custom pipeline (with only a ValidateOps transformation), we managed to match or surpass the speed of Relay, but in many cases using the 'zero' or 'default_build' pipelines did not improve the performance. What is the recommended workflow to reach the performance of Relay reliably (at least on static models)?

In any case, this does not affect the frontend code in the PR, so we could move on with that, as the frontend is ready to be submitted to TVM. We are just curious for debugging and measurement reasons, as we were surprised by the results.
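A slowdown figure like the ones quoted above can be obtained with a small timing harness along these lines. This is a minimal sketch using only the Python standard library; `median_latency_ms` and `slowdown` are hypothetical helper names, and the two lambdas are placeholder workloads standing in for compiled Relay and Relax inference calls, which are not shown here.

```python
import statistics
import time
from typing import Callable


def median_latency_ms(run: Callable[[], object], warmup: int = 3, repeats: int = 20) -> float:
    """Median wall-clock time of run() in milliseconds, after warm-up calls."""
    for _ in range(warmup):
        run()  # warm-up iterations are not timed
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.median(samples)


def slowdown(candidate_ms: float, baseline_ms: float) -> float:
    """How many times slower the candidate is than the baseline."""
    return candidate_ms / baseline_ms


# Placeholder workloads; in practice these would call the two compiled models.
baseline_ms = median_latency_ms(lambda: sum(range(1_000)))
candidate_ms = median_latency_ms(lambda: sum(range(100_000)))
ratio = slowdown(candidate_ms, baseline_ms)
```

On a GPU the timed callable would also need to wait for the device to finish before returning, otherwise only the kernel launch is measured.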
Re: [PR] [RFC] Add NNEF frontend [tvm-rfcs]
tqchen commented on PR #108: URL: https://github.com/apache/tvm-rfcs/pull/108#issuecomment-2058952581 Thanks for the note. We are in the process of revamping the docs. The latest set of emerging model optimizations, such as those for LLMs, will be based on Relax. https://github.com/apache/tvm/tree/main/python/tvm/relax/frontend/onnx is likely a good reference there.
Re: [PR] [RFC] Add NNEF frontend [tvm-rfcs]
gyenesvi commented on PR #108: URL: https://github.com/apache/tvm-rfcs/pull/108#issuecomment-2058579469

Hi,

> as a community we recently moved towards the Relax IR for the latest genAI workloads

Thanks for directing us towards Relax. I guess that means that new frontends should convert their representations into Relax IR instead of Relay? The documentation on tvm.apache.org refers to Relay, but not Relax. Is that documentation obsolete in this area? Is Relay going to be superseded by Relax? We only see frontend examples in tvm.relax that we can use as reference. Is there further documentation on tvm.relax?

It is interesting to hear that there is more focus on dynamic graphs and shape inference, as one of the key goals of the next version of NNEF, currently under development, is support for dynamic graphs and shape inference.

> it is unclear how much adoption NNEF has as of now versus ONNX and other formats

One of the goals of integration into compiler stacks like TVM would be exactly to drive more adoption, as adoption requires public tooling that can demonstrate the capabilities and usage of NNEF in end-to-end workflows. As the next version of NNEF will focus on dynamic graphs, custom operations and lowering to the tensor IR level, TVM seems like a good option to demonstrate its potential in compilation-based inference engines. But first we would like to start with integrating the currently publicly available version of NNEF.

Also, TVM has backends for multiple Khronos formats, such as SPIR-V (Vulkan) and OpenCL, which is why TVM could provide us with an end-to-end workflow starting from a Khronos-defined input format and resulting in Khronos-defined outputs. Furthermore, some Khronos members may be interested in implementing their own (proprietary) hardware backends for TVM, for which an NNEF frontend could also provide an end-to-end workflow.
Re: [PR] [RFC] Add NNEF frontend [tvm-rfcs]
tqchen commented on PR #108: URL: https://github.com/apache/tvm-rfcs/pull/108#issuecomment-2053701317 Thanks for the proposal. As a community we recently moved towards the Relax IR for the latest genAI workloads. Additionally, it is unclear how much adoption NNEF has as of now versus ONNX and other formats.