djaym7 opened a new issue #17053: Save quantized params in int8 
URL: https://github.com/apache/incubator-mxnet/issues/17053
 
 
   ## Description
   With the current quantize_net solution, the params are saved in fp32 format 
and converted to int8 during the first forward pass, then cached in memory. It would be 
better for edge devices, and memory-constrained devices in general, to load int8 
models instead of fp32 ones (which are 4x the size of int8). Additionally, this would remove the 
extra processing time of converting the params every time a model is loaded.
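   
   For illustration, a minimal sketch of what saving quantized params offline could look like. This uses plain NumPy with a symmetric per-tensor scheme; the array names, the `.npz` file layout, and the quantization scheme are all assumptions for demonstration, not MXNet's actual API or format:
   
   ```python
   import numpy as np
   
   # A stand-in fp32 weight tensor (hypothetical example data).
   w_fp32 = np.random.randn(256, 256).astype(np.float32)
   
   # Symmetric per-tensor quantization: map [-max|w|, +max|w|] to [-127, 127].
   scale = np.abs(w_fp32).max() / 127.0
   w_int8 = np.clip(np.round(w_fp32 / scale), -127, 127).astype(np.int8)
   
   # Storing int8 values plus one fp32 scale is ~4x smaller than the fp32 tensor.
   np.savez("weight_int8.npz", weight=w_int8, scale=np.float32(scale))
   print(w_fp32.nbytes / (w_int8.nbytes + 4))  # ~4.0
   
   # At load time the weights are already int8, so the per-load
   # fp32 -> int8 conversion pass described above is no longer needed.
   loaded = np.load("weight_int8.npz")
   w_ready = loaded["weight"]  # int8, usable directly by an int8 kernel
   ```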
   
   This is planned for the future, as noted by @xinyu-intel in 
   https://github.com/dmlc/gluon-cv/issues/1046.
   
   Thanks
   
