Kh4L opened a new pull request #19011:
URL: https://github.com/apache/incubator-mxnet/pull/19011


   ## Description ##
   This PR adds INT8-with-calibration support to MXNet-TensorRT.
   It enables TensorRT's internal optimizer to build an INT8 engine (which will contain INT8 kernels wherever they are faster than the FP16 or FP32 ones).
   In this first version, the quantization and de-quantization values are computed during a calibration phase. During this phase, which runs for the number of iterations set by `calibration_iters`, the user is expected to provide samples representative of the inference data; these are used to calibrate the engine. Inference is slower during this phase.
   Once the calibration is done, the MXNet-TensorRT inference model is ready 
for fast inference with INT8. 
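   To illustrate what the calibration phase computes, here is a simplified, self-contained NumPy sketch of max-abs calibration. This is only a conceptual illustration, not the actual TensorRT calibrator (TensorRT uses a more sophisticated entropy-based algorithm), and none of these function names exist in MXNet or TensorRT:
   
   ```python
   import numpy as np
   
   def compute_int8_scale(calibration_batches):
       # Track the largest absolute activation value observed across all
       # calibration samples (simplified max-abs calibration).
       max_abs = 0.0
       for batch in calibration_batches:
           max_abs = max(max_abs, float(np.abs(batch).max()))
       # Map the observed range [-max_abs, max_abs] onto the signed
       # INT8 range [-127, 127].
       return max_abs / 127.0
   
   def quantize(x, scale):
       # FP32 -> INT8 using the calibrated scale.
       return np.clip(np.round(x / scale), -127, 127).astype(np.int8)
   
   def dequantize(q, scale):
       # INT8 -> FP32; values outside the calibrated range lose precision.
       return q.astype(np.float32) * scale
   ```
   
   After enough representative samples have been seen, the scale is fixed and subsequent inference runs entirely with the fast INT8 path, which is why providing data that matches the real inference distribution matters.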
   
   Saving and loading of the calibration tables will be added in a later PR.
   
   ## Usage ##
   WIP
   

