mbrookhart opened a new pull request #8126:
URL: https://github.com/apache/tvm/pull/8126


   Recently, we discovered that tf2onnx exports some int8 graphs as fake-quantized/QAT models in ONNX, i.e., int8 ops are exported as dequantize->op->quantize.
   
   This PR introduces a pass to convert those graphs into direct int8 ops inside Relay. I've tested the correctness of the resulting models on Inception v1 and ssd-mobilenet-v1 from the TensorFlow Lite model zoo, imported via ONNX. Follow-up work will analyze further models to find more operations to include in this pass.
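   To illustrate the idea (this is a standalone NumPy sketch, not the actual TVM pass; the names, scale, and zero point are hypothetical): a fake-quantized ReLU of the form dequantize->relu->quantize is numerically equivalent to an integer-only ReLU that clamps at the zero point, which is the kind of rewrite the pass performs.

```python
import numpy as np

# Hypothetical quantization parameters for illustration only.
scale, zero_point = 0.05, 10

def dequantize(q):
    # Affine dequantization: int8 -> float32.
    return scale * (q.astype(np.float32) - zero_point)

def quantize(x):
    # Affine quantization: float32 -> int8, with clipping.
    q = np.round(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def fake_quant_relu(q):
    # The pattern tf2onnx emits: a float ReLU sandwiched
    # between dequantize and quantize.
    return quantize(np.maximum(dequantize(q), 0.0))

def int8_relu(q):
    # The rewritten direct int8 op: clamp at the zero point,
    # no floating-point math involved.
    return np.maximum(q, zero_point).astype(np.int8)

q = np.array([-20, 0, 10, 50, 127], dtype=np.int8)
assert np.array_equal(fake_quant_relu(q), int8_relu(q))
```

   The same folding generalizes to other ops, with the caveat that ops which change scale or zero point (e.g. conv, add) need a requantize step rather than a simple clamp.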
   
   cc @AndrewZhaoLuo @masahi @jwfromm 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
