haojin2 opened a new pull request #7843:
URL: https://github.com/apache/tvm/pull/7843
This PR fixes a small bug in the PyTorch converter: when `nn.Linear` is applied to a 3-D input and built with `-libs=cublas`, the emitted `batch_matmul` carries a weight operand with batch dimension 1 (see the workload shapes in the logs below), so the cuBLAS batched GEMM check `BatchCount3D(B) == batch_size` fails at runtime.
Bug reproduction script:
```python
import torch
from torch import nn
from torch.nn import Linear


class SimpleModel(nn.Module):
    def __init__(self, input_size, output_size):
        super(SimpleModel, self).__init__()
        self.fc = Linear(input_size, output_size)

    def forward(self, x):
        return self.fc(x)


batch_size = 128
dim = 64
T = 50

# Trace a simple Linear model on a 3-D input (batch, T, dim)
x = torch.randn((batch_size, T, dim))
model = SimpleModel(dim, 1)
model.eval()
scripted_model = torch.jit.trace(model, x).eval()

import tvm
from tvm import relay
from tvm.contrib import graph_executor

mod, params = relay.frontend.from_pytorch(
    scripted_model, [("data", [batch_size, T, dim])]
)
target = tvm.target.Target("cuda -libs=cublas")
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target, params=params)

tvm_ctx = tvm.gpu(0)
rt = graph_executor.GraphModule(lib["default"](tvm_ctx))
ndarray_inputs = {
    "data": x.numpy()
}
rt.set_input(**ndarray_inputs)
rt.run()
print(rt.get_output(0).asnumpy())
```
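For context (this is my reading of the workload shapes in the logs below, not part of the patch itself): the 3-D `nn.Linear` is lowered to a `batch_matmul`, and cuBLAS batched GEMM requires both operands to have the same batch count, so the weight needs to go in as `(128, 1, 64)` rather than `(1, 1, 64)`. A minimal NumPy sketch of that shape relationship (all names here are illustrative only):
```python
import numpy as np

batch_size, T, dim, out_features = 128, 50, 64, 1

x = np.random.randn(batch_size, T, dim).astype("float32")        # activations, (128, 50, 64)
w = np.random.randn(out_features, dim).astype("float32")          # nn.Linear weight, (1, 64)

# batch_matmul(A, B) computes A @ B^T per batch:
#   A: (b, m, k), B: (b, n, k)  ->  (b, m, n)
w_bad = w[np.newaxis, :, :]                                        # (1, 1, 64)  -- batch count 1 != 128
w_good = np.broadcast_to(w_bad, (batch_size, out_features, dim))   # (128, 1, 64) -- batch counts match

ref = x @ w.T                                                      # what the Linear layer computes, (128, 50, 1)
out = np.einsum("bmk,bnk->bmn", x, w_good)                         # batch_matmul with the broadcast weight
assert np.allclose(ref, out, atol=1e-4)
```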
Without this fix (current `main`):
```
python repro.py
Cannot find config for target=cuda -keys=cuda,gpu -libs=cublas
-max_num_threads=1024 -thread_warp_size=32,
workload=('batch_matmul_cublas.cuda', ('TENSOR', (128, 50, 64), 'float32'),
('TENSOR', (1, 1, 64), 'float32'), (128, 50, 1)). A fallback configuration is
used, which may bring great performance regression.
Traceback (most recent call last):
File "repro.py", line 41, in <module>
rt.run()
File
"/home/ubuntu/.local/lib/python3.6/site-packages/tvm-0.8.dev846+g81afb14c4-py3.6-linux-x86_64.egg/tvm/contrib/graph_executor.py",
line 206, in run
self._run()
File
"/home/ubuntu/.local/lib/python3.6/site-packages/tvm-0.8.dev846+g81afb14c4-py3.6-linux-x86_64.egg/tvm/_ffi/_ctypes/packed_func.py",
line 237, in __call__
raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
3: TVMFuncCall
2: tvm::runtime::GraphExecutor::Run()
1: std::_Function_handler<void (),
tvm::runtime::GraphExecutor::CreateTVMOp(tvm::runtime::TVMOpParam const&,
std::vector<DLTensor, std::allocator<DLTensor> > const&, unsigned
long)::{lambda()#3}>::_M_invoke(std::_Any_data const&)
0: std::_Function_handler<void (tvm::runtime::TVMArgs,
tvm::runtime::TVMRetValue*), tvm::runtime::WrapPackedFunc(int (*)(TVMValue*,
int*, int, TVMValue*, int*, void*),
tvm::runtime::ObjectPtr<tvm::runtime::Object>
const&)::{lambda(tvm::runtime::TVMArgs,
tvm::runtime::TVMRetValue*)#1}>::_M_invoke(std::_Any_data const&,
tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)
2: TVMFuncCall
1: std::_Function_handler<void (tvm::runtime::TVMArgs,
tvm::runtime::TVMRetValue*), tvm::contrib::{lambda(tvm::runtime::TVMArgs,
tvm::runtime::TVMRetValue*)#3}>::_M_invoke(std::_Any_data const&,
tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)
0: void
tvm::contrib::CallBatchGemm<tvm::contrib::CublasSgemmBatchOp>(tvm::runtime::TVMArgs,
tvm::runtime::TVMRetValue*, tvm::contrib::CublasSgemmBatchOp)
File "/home/ubuntu/tvm/src/runtime/contrib/cublas/../cblas/gemm_common.h",
line 189
File "/home/ubuntu/tvm/src/runtime/library_module.cc", line 78
TVMError:
---------------------------------------------------------------
An internal invariant was violated during the execution of TVM.
Please read TVM's error reporting guidelines.
More details can be found here:
https://discuss.tvm.ai/t/error-reporting/7793.
---------------------------------------------------------------
Check failed: ret == 0 (-1 vs. 0) : TVMError:
---------------------------------------------------------------
An internal invariant was violated during the execution of TVM.
Please read TVM's error reporting guidelines.
More details can be found here:
https://discuss.tvm.ai/t/error-reporting/7793.
---------------------------------------------------------------
Check failed: BatchCount3D(B) == batch_size (1 vs. 128) :
terminate called after throwing an instance of 'tvm::runtime::InternalError'
what(): [23:18:59] /home/ubuntu/tvm/src/runtime/workspace_pool.cc:118:
---------------------------------------------------------------
An internal invariant was violated during the execution of TVM.
Please read TVM's error reporting guidelines.
More details can be found here:
https://discuss.tvm.ai/t/error-reporting/7793.
---------------------------------------------------------------
Check failed: allocated_.size() == 1 (2 vs. 1) :
Stack trace:
0: tvm::runtime::WorkspacePool::~WorkspacePool()
1: __call_tls_dtors
2: 0x00007fdd60f44236
3: exit
4: __libc_start_main
5: _start
6: 0xffffffffffffffff
Aborted (core dumped)
```
With this fix:
```
python repro.py
Cannot find config for target=cuda -keys=cuda,gpu -libs=cublas
-max_num_threads=1024 -thread_warp_size=32,
workload=('batch_matmul_cublas.cuda', ('TENSOR', (128, 50, 64), 'float32'),
('TENSOR', (128, 1, 64), 'float32'), (128, 50, 1)). A fallback configuration is
used, which may bring great performance regression.
[[[-0.21468109]
[ 0.3858583 ]
[ 0.16572809]
...
[-0.03322682]
[ 0.33868816]
[ 0.3021463 ]]
[[ 1.052577 ]
[ 0.26492748]
[ 0.37078723]
...
[-0.0752994 ]
[-0.66205776]
[-0.19348428]]
[[ 0.6743065 ]
[ 0.02969196]
[-0.03708391]
...
[ 0.16056934]
[ 0.41362724]
[ 0.629748 ]]
...
[[-0.05230951]
[-0.3116043 ]
[-0.07618818]
...
[-0.7429178 ]
[ 0.34146884]
[-0.46452078]]
[[ 0.6838716 ]
[-0.0820943 ]
[ 0.01337433]
...
[ 0.6866671 ]
[-0.4317361 ]
[ 0.16978306]]
[[ 0.7288995 ]
[ 0.57882047]
[ 0.40440276]
...
[ 0.36602104]
[ 0.6143365 ]
[ 0.5057366 ]]]
```
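As an extra sanity check (not part of the PR, just something one might append to the repro above), the graph executor's output can be compared against PyTorch's own forward pass:
```python
import numpy as np

# run the traced model in PyTorch and compare with the TVM result
with torch.no_grad():
    torch_out = model(x).numpy()

tvm_out = rt.get_output(0).asnumpy()
np.testing.assert_allclose(tvm_out, torch_out, rtol=1e-4, atol=1e-4)
```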