Kh4L opened a new pull request #19011: URL: https://github.com/apache/incubator-mxnet/pull/19011
## Description ##

This PR adds INT8-with-calibration support to MXNet-TensorRT. It enables TensorRT's internal optimization to build an INT8 engine (which will contain INT8 kernels where they are faster than the FP16 or FP32 ones). In this first version, the quantization and dequantization values are computed during a calibration phase. During this phase (whose number of iterations is set by `calibration_iters`), the user is expected to provide samples representative of the inference data, which are used to calibrate the engine. Inference is slower during this phase. Once calibration is done, the MXNet-TensorRT model is ready for fast INT8 inference. Saving and loading of the calibration tables will be added in a later PR.

## Usage ##

WIP
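Since the usage section is still WIP, here is a minimal, self-contained sketch of what the calibration phase conceptually computes. This is *not* the MXNet-TensorRT API — the class and function names below are hypothetical illustrations. It shows how feeding representative samples over a fixed number of iterations (mirroring the PR's `calibration_iters` idea) can produce a quantization scale mapping observed activation ranges onto the signed INT8 interval.

```python
# Hypothetical illustration, not the MXNet-TensorRT API: a max-abs
# calibrator derives an INT8 quantization scale from sample batches.

class MaxAbsCalibrator:
    def __init__(self):
        self.max_abs = 0.0

    def collect(self, batch):
        # batch: iterable of float activation values from one inference pass
        for v in batch:
            self.max_abs = max(self.max_abs, abs(v))

    def scale(self):
        # Scale maps the observed range onto [-127, 127]
        return self.max_abs / 127.0 if self.max_abs else 1.0

def quantize(x, scale):
    # Round to the nearest INT8 step and clamp to the valid range
    q = int(round(x / scale))
    return max(-127, min(127, q))

def dequantize(q, scale):
    return q * scale

calibration_iters = 3  # analogous to the PR's `calibration_iters`
samples = [[0.5, -2.0, 1.5], [3.0, -1.0], [-2.5, 0.25]]

cal = MaxAbsCalibrator()
for batch in samples[:calibration_iters]:
    cal.collect(batch)

s = cal.scale()       # 3.0 / 127, since 3.0 is the largest |value| seen
q = quantize(1.5, s)
x = dequantize(q, s)  # close to 1.5, within one quantization step
```

The real calibrator in TensorRT uses histogram/entropy-based methods rather than plain max-abs, but the workflow is the same: run representative data through the network for a fixed number of iterations, then freeze the resulting scales into the INT8 engine.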
