ibsidorenko opened a new pull request, #11228:
URL: https://github.com/apache/tvm/pull/11228

   This PR is an attempt to revive
[PR#9164](https://github.com/apache/tvm/pull/9164). It enables constant
folding for QNN operations. The motivation for this feature is BYOC use
cases: for some BYOC backends it can avoid converting weights at runtime and
thus improve performance.
   
   One important caveat: if `FoldConstant` is called before the
`FakeQuantizationToInteger` (`FQ2I`) pass, it can prevent `FQ2I` from
converting some ops to their QNN equivalents. To avoid this, folding of QNN
constants is disabled by default. To enable it, pass `fold_qnn=True` to the
`FoldConstant` pass.
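
A minimal sketch of opting in to QNN folding from Python, assuming a TVM build that already includes this change. The helper name `fold_with_qnn` is hypothetical and only used for illustration; the `fold_qnn` keyword is the flag described in this PR.

```python
def fold_with_qnn(mod):
    """Run FoldConstant with QNN folding enabled on a Relay IRModule.

    `fold_with_qnn` is a hypothetical helper name, not part of TVM.
    """
    # Imported lazily so the helper can be defined without TVM installed.
    from tvm import relay

    # fold_qnn defaults to False; passing True opts in to folding QNN ops too.
    # Assumption: the installed TVM includes the change from this PR.
    return relay.transform.FoldConstant(fold_qnn=True)(mod)
```

If `FQ2I` is part of the same pipeline, running it before this helper should sidestep the ordering issue, since `FoldConstant` with QNN folding would then no longer run ahead of it.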
   
   Co-authored-by: Alexander Peskov [email protected]
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
