ThomasDelteil commented on a change in pull request #13145: Add documentation on GPU performance on Quantization example
URL: https://github.com/apache/incubator-mxnet/pull/13145#discussion_r231351811
 
 

 ##########
 File path: example/quantization/README.md
 ##########
 @@ -320,4 +320,6 @@ the console to run model quantization for a specific configuration.
 - `launch_inference.sh` This is a shell script that calculate the accuracies of all the quantized models generated
 by invoking `launch_quantize.sh`.
 
-**NOTE**: This example has only been tested on Linux systems.
\ No newline at end of file
+**NOTE**: 
+- This example has only been tested on Linux systems.
 +- Performance is expected to decrease with GPU as the params. The purpose of the quantization implementation is to minimize accuracy loss when converting FP32 models to INT8. MXNet community is working on improving the performance. 
 
 Review comment:
  "Performance is expected to decrease with GPU as the params" -> this sentence is incomplete.
  
  Could you also add that, although it is slower, it has a smaller memory footprint?
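
  The memory-footprint point is easy to quantify: INT8 weights take one byte per parameter versus four for FP32. A minimal sketch of symmetric INT8 quantization (an illustration only, not MXNet's actual quantization pass) shows the 4x reduction:

  ```python
  import numpy as np

  def quantize_int8(x):
      # Symmetric quantization: map [-max|x|, +max|x|] onto [-127, 127].
      scale = np.max(np.abs(x)) / 127.0
      q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
      return q, scale

  # Hypothetical weight tensor standing in for a model layer.
  weights = np.random.randn(1024, 1024).astype(np.float32)
  q, scale = quantize_int8(weights)

  print(weights.nbytes)  # 4194304 bytes (FP32)
  print(q.nbytes)        # 1048576 bytes (INT8): 4x smaller
  # Dequantize to recover an approximation of the original weights.
  approx = q.astype(np.float32) * scale
  ```

  The accuracy cost the README mentions comes from the rounding in `quantize_int8`: every weight is reconstructed to within half a quantization step (`scale / 2`), which is the error calibration tries to keep small.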

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
