freedenS opened a new issue #10506: URL: https://github.com/apache/tvm/issues/10506
I tried to quantize ResNet-50 (https://github.com/onnx/models/blob/main/vision/classification/resnet/model/resnet50-v1-7.onnx) following the tutorial [Deploy a Quantized Model on CUDA](https://tvm.apache.org/docs/how_to/deploy_models/deploy_quantized.html?highlight=quantize). The quantize pass succeeds, but I hit a type error at inference time. It looks like `fused_fixed_point_multiply` does not support int64, so why is there a cast to int64 right before it?

### Expected behavior

The model runs and produces the correct result.

### Actual behavior

```
======= Execute fused_nn_contrib_conv2d_NCHWc_41 ...
Output#0, CPU [1,8,56,56,8], Int8 at<0x6f07000>: 127 0 127 127 0 127 ... 127 127 127 127 127 127
Output#1, CPU [8,8,1,1,8,8], Int8 at<0x4f35080>: -4 32 -10 -9 0 -1 ... -4 -74 6 0 10 0
Output#2, CPU [1,8,56,56,8], Int32 at<0x8faf480>: -19050 -40132 5969 -17526 -25908 -6985 ... -9144 -32512 -28194 -43688 3048 -29972
======= Execute fused_cast_1065 ...
Output#0, CPU [1,8,56,56,8], Int32 at<0x8faf480>: -19050 -40132 5969 -17526 -25908 -6985 ... -9144 -32512 -28194 -43688 3048 -29972
Output#1, CPU [1,8,56,56,8], Int64 at<0x7527080>: -19050 -40132 5969 -17526 -25908 -6985 ... -9144 -32512 -28194 -43688 3048 -29972
======= Execute fused_fixed_point_multiply_143 ...
Traceback (most recent call last):
  File "/tvm/python/tvm/contrib/graph_executor.py", line 248, in run
    self._run()
  File "/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 237, in __call__
    raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
  12: TVMFuncCall
        at /tvm/src/runtime/c_runtime_api.cc:486
  11: tvm::runtime::PackedFunc::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
        at /tvm/include/tvm/runtime/packed_func.h:1150
  10: std::function<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)>::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
        at /usr/include/c++/5/functional:2267
  9: _M_invoke
        at /usr/include/c++/5/functional:1871
  8: operator()
        at /tvm/src/runtime/graph_executor/graph_executor.cc:575
  7: tvm::runtime::GraphExecutor::Run()
        at /tvm/src/runtime/graph_executor/graph_executor.cc:68
  6: std::function<void ()>::operator()() const
        at /usr/include/c++/5/functional:2267
  5: _M_invoke
        at /usr/include/c++/5/functional:1871
  4: operator()
        at /tvm/src/runtime/graph_executor/graph_executor.cc:508
  3: tvm::runtime::PackedFunc::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
        at /tvm/include/tvm/runtime/packed_func.h:1150
  2: std::function<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)>::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
        at /usr/include/c++/5/functional:2267
  1: _M_invoke
        at /usr/include/c++/5/functional:1871
  0: operator()
        at /tvm/src/runtime/library_module.cc:78
  File "/tvm/src/runtime/library_module.cc", line 78
TVMError:
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
  Check failed: ret == 0 (-1 vs. 0) : Assert fail: (((tir.tvm_struct_get(arg1, 0, 5) == (uint8)0) && (tir.tvm_struct_get(arg1, 0, 6) == (uint8)32)) && (tir.tvm_struct_get(arg1, 0, 7) == (uint16)1)), arg1.dtype is expected to be int32
```

### Environment

Ubuntu 18.04 x86 / TVM 0.8 / Python 3.6

### Steps to reproduce

```python
import onnx
import tvm
from tvm import relay
from tvm.contrib import graph_executor


def quantize(mod, params, data_aware):
    if data_aware:
        with relay.quantize.qconfig(calibrate_mode="kl_divergence", weight_scale="max"):
            mod = relay.quantize.quantize(mod, params, dataset=calibrate_dataset())
    else:
        with relay.quantize.qconfig(calibrate_mode="global_scale", global_scale=8.0):
            mod = relay.quantize.quantize(mod, params)
    return mod


# onnx_path, calibrate_dataset(), and data are defined as in the tutorial
onnx_model = onnx.load(onnx_path)
shape_dict = {'data': (1, 3, 224, 224)}
mod, params = relay.frontend.from_onnx(onnx_model, shape_dict)
mod = quantize(mod, params, True)

target_host = 'llvm'
ctx_host = tvm.cpu(0)
with relay.build_config(opt_level=3):
    graph, lib, params = relay.build(mod, target=target_host, target_host=None, params=params)

m = graph_executor.create(graph, lib, ctx_host)
m.set_input(**params)
m.set_input("data", tvm.nd.array(data))
m.run()
```
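For context on why the compiled kernel asserts an int32 argument: a fixed-point multiply rescales an integer tensor by a real-valued scale encoded as an integer multiplier plus a shift, so the declared input/output dtype is part of the kernel's signature even though the intermediate product needs 64 bits. The sketch below is a generic pure-Python illustration of that arithmetic under the assumption that the scale is encoded as `multiplier / 2**31`; it is not TVM's actual implementation, whose rounding details may differ.

```python
def fixed_point_multiply(x, multiplier, shift):
    """Approximate x * scale, where scale = multiplier * 2**(shift - 31).

    Inputs and outputs are conceptually int32; only the intermediate
    product requires 64-bit range. Assumes shift < 31 for simplicity.
    (Hypothetical simplification, not TVM's exact kernel.)
    """
    prod = x * multiplier                 # up to ~63 bits wide
    total_shift = 31 - shift
    round_const = 1 << (total_shift - 1)  # round to nearest
    return (prod + round_const) >> total_shift

# Encode scale 0.5 as multiplier / 2**31 with shift 0:
# multiplier = round(0.5 * 2**31) = 1073741824
print(fixed_point_multiply(1000, 1073741824, 0))  # → 500
```

The point is only that such a kernel is compiled for a fixed input dtype, so feeding it an int64 tensor (as the preceding `fused_cast_1065` produces) trips the dtype assertion shown in the log.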
