ibsidorenko opened a new pull request, #11228: URL: https://github.com/apache/tvm/pull/11228
This PR is an attempt to revive [PR#9164](https://github.com/apache/tvm/pull/9164). It enables constant folding for QNN operations.

The motivation for this feature is BYOC use cases: for some BYOC backends, folding QNN constants avoids converting weights at runtime and thus improves performance.

One important caveat: if `FoldConstant` is called before the `FakeQuantizationToInteger` (`FQ2I`) pass, the folding can prevent `FQ2I` from converting some ops to their QNN equivalents. To avoid this, folding of QNN constants is disabled by default. To enable it, pass the `fold_qnn=True` flag to the `FoldConstant` pass.

Co-authored-by: Alexander Peskov [email protected]
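The opt-in behavior described above can be illustrated with a small, self-contained toy model. This is only a sketch and not TVM's implementation: the `Const` and `Call` classes, the `quantize` helper, and the two-op "IR" are hypothetical names invented for illustration. Only the flag name `fold_qnn` and its default of off come from the PR description.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Const:
    """Hypothetical constant node holding a flat tuple of numbers."""
    values: Tuple


@dataclass(frozen=True)
class Call:
    """Hypothetical call node; op is "add" or "qnn.quantize" in this toy IR."""
    op: str
    args: Tuple["Expr", ...]
    scale: float = 1.0   # only meaningful for "qnn.quantize"
    zero_point: int = 0  # only meaningful for "qnn.quantize"


Expr = Union[Const, Call]


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Affine int8 quantization: clamp(round(v / scale) + zero_point)."""
    return tuple(
        max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values
    )


def fold_constant(expr, fold_qnn=False):
    """Fold constant subexpressions; QNN ops are folded only when fold_qnn is set."""
    if isinstance(expr, Const):
        return expr
    # Fold the arguments first (bottom-up traversal).
    args = tuple(fold_constant(a, fold_qnn) for a in expr.args)
    rebuilt = Call(expr.op, args, expr.scale, expr.zero_point)
    if not all(isinstance(a, Const) for a in args):
        return rebuilt
    if expr.op == "add":
        lhs, rhs = args
        return Const(tuple(x + y for x, y in zip(lhs.values, rhs.values)))
    if expr.op == "qnn.quantize" and fold_qnn:
        return Const(quantize(args[0].values, expr.scale, expr.zero_point))
    # QNN op with folding disabled: keep the call so later passes still see it.
    return rebuilt


# Float weights computed from constants, then quantized with scale 0.5.
weights = Call("add", (Const((0.5, -2.0)), Const((0.5, 0.0))))
expr = Call("qnn.quantize", (weights,), scale=0.5, zero_point=0)

default_result = fold_constant(expr)             # quantize call is preserved
qnn_result = fold_constant(expr, fold_qnn=True)  # quantize call is evaluated
```

With the default setting, only the regular `add` is folded and the quantize call survives, which is the property a later pass in this toy model would rely on. With `fold_qnn=True`, the whole expression collapses to pre-quantized integer weights, so no conversion is left for runtime.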
