masahi commented on pull request #9164: URL: https://github.com/apache/tvm/pull/9164#issuecomment-937591028
@apeskov Please see this PR: https://github.com/apache/tvm/pull/9135. I understand why you want to do this, namely, to constant-fold `quantize(weight_fp32)` in a QNN graph. Returning float32 weights from the PyTorch frontend and relying on Relay constant folding to recover the quantized weights was my design mistake. Now you can obtain quantized weights directly from the frontend (we quantize at the numpy level).
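For illustration, "quantizing at the numpy level" refers to computing integer weights eagerly with numpy instead of emitting a `qnn.quantize` op for constant folding to simplify later. A minimal sketch of per-tensor affine quantization (the function name, example values, and int8 range here are illustrative assumptions, not the actual frontend code):

```python
import numpy as np

def quantize_weight(weight_fp32, scale, zero_point):
    # Affine quantization: q = round(w / scale) + zero_point,
    # clamped to the int8 representable range [-128, 127].
    q = np.round(weight_fp32 / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

# Hypothetical example weights and quantization parameters.
w = np.array([[0.5, -1.0], [0.25, 2.0]], dtype=np.float32)
qw = quantize_weight(w, scale=0.01, zero_point=0)
# qw now holds int8 values; 2.0 saturates to 127.
```

Doing this eagerly in the frontend means the Relay graph carries an int8 constant from the start, so no folding pass is needed to recover it.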
