djaym7 opened a new issue #17053: Save quantized params in int8
URL: https://github.com/apache/incubator-mxnet/issues/17053

## Description

With the current quantize_net solution, the params are saved in fp32 format, converted to int8 during the first forward pass, and cached in memory. It would be better for edge devices, and for devices with limited memory in general, to load int8 models rather than fp32 ones (fp32 is 4x the size of int8, at 4 bytes per parameter vs. 1). Additionally, the extra processing time spent converting params every time a model is loaded would be eliminated.

This is planned for the future, as noted by @xinyu-intel (https://github.com/dmlc/gluon-cv/issues/1046).

Thanks
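For context, a minimal sketch of the workflow this issue refers to, assuming MXNet >= 1.6 built with MKL-DNN; the model choice, calibration data, and file names are hypothetical:

```python
import mxnet as mx
from mxnet.gluon.model_zoo import vision
from mxnet.contrib.quantization import quantize_net

# Hypothetical fp32 model to quantize.
net = vision.resnet18_v1(pretrained=True)
net.hybridize()

# Hypothetical calibration set: random images wrapped in a DataLoader.
calib_data = mx.gluon.data.DataLoader(
    mx.gluon.data.ArrayDataset(
        mx.nd.random.uniform(shape=(32, 3, 224, 224))),
    batch_size=8)

qnet = quantize_net(net, quantized_dtype='auto',
                    calib_data=calib_data, calib_mode='naive',
                    num_calib_examples=32, ctx=mx.cpu())

# export() still writes the weights in fp32; the fp32->int8 conversion
# is redone on every load, which is the overhead this issue asks to remove.
qnet.export('resnet18-quantized', epoch=0)
```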
