Yes, without hardware support for int8, you shouldn't expect int8 to be any faster than fp32. For AVX2, PyTorch is much faster than TVM for int8. For AVX-512, where int8 instructions do make a real difference, TVM is much faster.

I have a script at
https://github.com/Edgecortix-Inc/pytorch_quantization/tree/master/tvm_qnn_evaluation
which can also be used for perf benchmarking. Set this flag to True:
https://github.com/Edgecortix-Inc/pytorch_quantization/blob/master/tvm_qnn_evaluation/imagenet_test.py#L82
and pick your target here:
https://github.com/Edgecortix-Inc/pytorch_quantization/blob/master/tvm_qnn_evaluation/test_util.py#L63

* For Skylake with AVX-512 support, the target should be "llvm -mcpu=skylake-avx512"
* For Cascade Lake, "llvm -mcpu=cascadelake"
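
For reference, here is a minimal sketch of where that target string goes when compiling a quantized PyTorch model with TVM. This is illustrative, not the code from the linked script; the model choice and the input name are my own assumptions.

```python
import torch
import torchvision
import tvm
from tvm import relay

# Grab a pre-quantized torchvision model and trace it to TorchScript
# so the Relay frontend can import it.
model = torchvision.models.quantization.resnet18(
    pretrained=True, quantize=True).eval()
inp = torch.randn(1, 3, 224, 224)
scripted = torch.jit.trace(model, inp)

# Import into Relay; "input" is an arbitrary name for the graph input.
mod, params = relay.frontend.from_pytorch(scripted, [("input", inp.shape)])

# Pick the target that matches your CPU:
#   Skylake with AVX-512: "llvm -mcpu=skylake-avx512"
#   Cascade Lake:         "llvm -mcpu=cascadelake"
target = "llvm -mcpu=skylake-avx512"

with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)
```

Without the right `-mcpu` flag, LLVM won't emit the AVX-512 int8 instructions, and the quantized model will run at roughly fp32 speed.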

Maybe @anijain2305 can comment further.




