masahi edited a comment on pull request #10332:
URL: https://github.com/apache/tvm/pull/10332#issuecomment-1047304225


   OK, here is a GOPS comparison between the new VNNI implementation and the 
existing generic code. Note that the VNNI numbers were obtained after only 1 or 
2 minutes of tuning, whereas the generic schedule has a very large tuning space 
and took more than 12 hours to reach these numbers under the same tuning 
options. The script is at https://github.com/masahi/int8_experiment/blob/main/relay_bench.py
   
   This is on a Rocket Lake `i5-11400 @ 2.60GHz`, using 6 threads.
   
   B|M|N|K|TVM VNNI (new), GOPS|TVM existing generic (old), GOPS
   --|--|--|--|--|--
   8|64|800|320|1862.9816985699251|471.93086647752153
   8|64|768|512|1957.1780318372826|254.2322265717467
   8|16|256|512|481.7846564891195|249.41214520865546
   8|128|128|128|1940.7730023523345|372.7504095880382
   8|256|512|256|2380.99163061598|496.7852808609268
   8|1024|1024|1024|2275.097320545042|219.50257992579049
   8|128|768|3072|1449.8759165025203|219.86756788442386
   8|128|768|768|1883.3963380647226|234.35976664468328
   8|128|3072|768|1595.616577196681|196.09770614852056
   16|384|384|64|2487.792996038378|418.875373840064
   16|384|64|384|2441.74586017639|301.37582872146345
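
To make the table easier to read, here is a minimal sketch that computes the per-workload speedup of the VNNI numbers over the generic ones. The measured values are copied from the table above; the helper names (`speedup`, `batch_matmul_gops`) are hypothetical and not part of the linked script. The GOPS formula assumes that B, M, N, K are batch-matmul dimensions and that each multiply-accumulate counts as 2 operations, which may differ from how `relay_bench.py` actually computes it.

```python
# (B, M, N, K, VNNI GOPS, generic GOPS), copied from the table above.
ROWS = [
    (8, 64, 800, 320, 1862.9816985699251, 471.93086647752153),
    (8, 64, 768, 512, 1957.1780318372826, 254.2322265717467),
    (8, 16, 256, 512, 481.7846564891195, 249.41214520865546),
    (8, 128, 128, 128, 1940.7730023523345, 372.7504095880382),
    (8, 256, 512, 256, 2380.99163061598, 496.7852808609268),
    (8, 1024, 1024, 1024, 2275.097320545042, 219.50257992579049),
    (8, 128, 768, 3072, 1449.8759165025203, 219.86756788442386),
    (8, 128, 768, 768, 1883.3963380647226, 234.35976664468328),
    (8, 128, 3072, 768, 1595.616577196681, 196.09770614852056),
    (16, 384, 384, 64, 2487.792996038378, 418.875373840064),
    (16, 384, 64, 384, 2441.74586017639, 301.37582872146345),
]


def speedup(vnni_gops, generic_gops):
    """Ratio of VNNI throughput to generic throughput."""
    return vnni_gops / generic_gops


def batch_matmul_gops(b, m, n, k, seconds):
    """GOPS for a (B, M, K) x (B, K, N) batch matmul.

    Assumption: 2 operations (multiply + add) per MAC; this is a common
    convention, not necessarily the one used by the benchmark script.
    """
    return 2 * b * m * n * k / seconds / 1e9


SPEEDUPS = [speedup(vnni, generic) for *_, vnni, generic in ROWS]

if __name__ == "__main__":
    for (b, m, n, k, _, _), s in zip(ROWS, SPEEDUPS):
        print(f"B={b} M={m} N={n} K={k}: {s:.2f}x")
```

On these numbers the VNNI version is faster on every workload, with the smallest gain on the `8|16|256|512` shape and the largest on `8|1024|1024|1024`.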
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
