zhaoyang-star commented on PR #11700:
URL: https://github.com/apache/tvm/pull/11700#issuecomment-1177277418

   > > Hmm, do you have a profiler report?
   > > I am curious since I would expect runtimes to be better vs dq - fp32 - 
q. Do you have a repo to reproduce?
   > 
   > Based on `tests/python/frontend/pytorch/test_fx_quant.py`, I replaced all 
relu with hswish in resnet50.
   > 
   > * only use one cpu core
   > * benchmark int8 model
   > 
   > Quantized Model | Inference Time(msec)
   > ---|---
   > resnet50(relu) | 1149
   > resnet50(hswish) w/o LUT | 1210
   > resnet50(hswish, LUT) | 1171
   > 
   > Using the LUT gives about a 3% speedup. I also tried a yolov5 model with 
hswish, which gets about a 9% speedup from the LUT.
   
   Maybe something went wrong when I created the resnet50 with hswish.
   I then tried a quantized YOLOv5s, which uses hswish. Throughput improved by 50.2% (latency dropped from 18.88 ms to 12.57 ms) ^_^ 
   
    Quantized Model  | Inference Time(msec)
   ---|---
   YOLOv5s(hswish) w/o LUT | 18.88
   YOLOv5s(hswish, LUT) | 12.57
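
For readers unfamiliar with the LUT idea being benchmarked, here is a minimal NumPy sketch of it. It is not the code from this PR: the function names, the int8 layout, and the scale/zero-point values are all assumptions made for illustration. Because an int8 input can take only 256 values, the whole dequantize, hswish, requantize chain can be precomputed once into a 256-entry table and replaced by a single gather at inference time.

```python
import numpy as np

def hswish(x):
    # Hard-swish in float: x * relu6(x + 3) / 6
    return x * np.clip(x + 3.0, 0.0, 6.0) / 6.0

def build_hswish_lut(in_scale, in_zp, out_scale, out_zp):
    # Hypothetical helper: evaluate dq -> hswish -> q for every int8 input once.
    q_in = np.arange(-128, 128, dtype=np.int32)      # all 256 int8 values
    x = (q_in - in_zp) * in_scale                    # dequantize to float
    q_out = np.round(hswish(x) / out_scale) + out_zp # requantize
    return np.clip(q_out, -128, 127).astype(np.int8)

def hswish_lut(q, lut):
    # Shift int8 values from [-128, 127] to table indices [0, 255] and gather.
    return lut[q.astype(np.int32) + 128]

def hswish_dq_q(q, in_scale, in_zp, out_scale, out_zp):
    # Reference path without a LUT: dequantize, compute in float, requantize.
    x = (q.astype(np.int32) - in_zp) * in_scale
    q_out = np.round(hswish(x) / out_scale) + out_zp
    return np.clip(q_out, -128, 127).astype(np.int8)

# Example quantization parameters (assumed values, not taken from the PR).
lut = build_hswish_lut(in_scale=0.05, in_zp=3, out_scale=0.05, out_zp=-10)
q = np.array([-128, -60, 0, 3, 60, 127], dtype=np.int8)
same = np.array_equal(hswish_lut(q, lut), hswish_dq_q(q, 0.05, 3, 0.05, -10))
```

In this sketch the two paths give identical int8 outputs by construction, so any runtime difference comes only from replacing per-element float arithmetic with a table lookup.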
   

