jcf94 commented on a change in pull request #8808:
URL: https://github.com/apache/tvm/pull/8808#discussion_r694444768



##########
File path: src/runtime/contrib/tensorrt/tensorrt_runtime.cc
##########
@@ -66,7 +78,16 @@ class TensorRTRuntime : public JSONRuntimeBase {
         use_implicit_batch_(true),
         max_workspace_size_(size_t(1) << 30),
         max_batch_size_(-1),
-        multi_engine_mode_(false) {}
+        multi_engine_mode_(false) {
+          const bool use_int8 = dmlc::GetEnv("TVM_TENSORRT_USE_INT8", false);

Review comment:
       Suggest moving these options to `LoadGlobalAttributes` and removing the 
use of environment variables (also for FP16).

##########
File path: src/runtime/contrib/tensorrt/tensorrt_builder.cc
##########
@@ -40,30 +40,30 @@ namespace contrib {
 TensorRTBuilder::TensorRTBuilder(TensorRTLogger* logger,
                                  const std::vector<const DLTensor*>& 
data_entry,
                                  size_t max_workspace_size, bool 
use_implicit_batch, bool use_fp16,
-                                 int batch_size)
+                                 int batch_size, nvinfer1::IInt8Calibrator* 
calibrator)
     : data_entry_(data_entry),
       max_workspace_size_(max_workspace_size),
       use_implicit_batch_(use_implicit_batch),
       use_fp16_(use_fp16),
       batch_size_(batch_size) {
   // Create TRT builder and network.
   builder_ = nvinfer1::createInferBuilder(*logger);
-#if TRT_VERSION_GE(6, 0, 1)

Review comment:
       Why do these version macros need to be removed?
   
   In my experience, on higher versions of TRT, calling APIs like 
`builder_->setInt8Mode` produces warnings (in TRT 8 some of these APIs have 
even been deprecated). Better to test these on a higher version.

##########
File path: src/runtime/contrib/tensorrt/tensorrt_runtime.cc
##########
@@ -125,17 +147,26 @@ class TensorRTRuntime : public JSONRuntimeBase {
 
   /*! \brief Run inference using built engine. */
   void Run() override {
+    

Review comment:
       Suggest creating the calibrator and processing the input information 
here; if there are still remaining calibration batches, just return.
   
   This avoids building an engine, destroying it, and then building one again.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

