zhaoyang-star commented on PR #11700:
URL: https://github.com/apache/tvm/pull/11700#issuecomment-1171024890
> Hmm, do you have a profiler report?
>
> I am curious since I would expect runtimes to be better vs dq - fp32 - q.
> Do you have a repo to reproduce?
I replaced every relu with hswish in resnet50 and benchmarked under these conditions:
- a single CPU core
- the int8 quantized model
Quantized Model | Inference Time (ms)
---|---
resnet50 (relu) | 1149
resnet50 (hswish, w/o LUT) | 1210
resnet50 (hswish, LUT) | 1171
Using the LUT gives about a 3% speedup for hswish (1210 ms to 1171 ms). I also
tried a yolov5 model with hswish, where the LUT gives about a 9% speedup.
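
For readers unfamiliar with the idea, here is a minimal NumPy sketch of how an int8 hswish can be served from a 256-entry lookup table instead of a dequantize - fp32 - quantize round trip. It is not the code from this PR; the function names and the scale / zero-point values are hypothetical and chosen only for illustration.

```python
import numpy as np

def hswish(x):
    # Hard-swish in floating point: x * relu6(x + 3) / 6
    return x * np.clip(x + 3.0, 0.0, 6.0) / 6.0

def build_hswish_lut(in_scale, in_zp, out_scale, out_zp):
    # Enumerate every possible int8 input once and precompute its int8 output.
    q_in = np.arange(-128, 128, dtype=np.int32)
    x = (q_in - in_zp) * in_scale                 # dequantize
    q_out = np.round(hswish(x) / out_scale) + out_zp  # requantize
    return np.clip(q_out, -128, 127).astype(np.int8)

def hswish_lut(q, lut):
    # Shift int8 values [-128, 127] to table indices [0, 255] and gather.
    return lut[q.astype(np.int32) + 128]

def hswish_dq_q(q, in_scale, in_zp, out_scale, out_zp):
    # Reference path: dequantize, compute in floating point, requantize.
    x = (q.astype(np.int32) - in_zp) * in_scale
    q_out = np.round(hswish(x) / out_scale) + out_zp
    return np.clip(q_out, -128, 127).astype(np.int8)

# Hypothetical quantization parameters, for illustration only.
params = dict(in_scale=0.05, in_zp=0, out_scale=0.05, out_zp=0)
lut = build_hswish_lut(**params)

rng = np.random.default_rng(0)
q = rng.integers(-128, 128, size=(1, 64, 56, 56)).astype(np.int8)

# Both paths should agree exactly, since the table was built with the same math.
assert np.array_equal(hswish_lut(q, lut), hswish_dq_q(q, **params))
```

The table costs 256 bytes per distinct set of quantization parameters, and the per-element work drops to a single indexed load, which is the kind of saving the numbers above are measuring.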