gbxu commented on issue #8991:
URL: https://github.com/apache/tvm/issues/8991#issuecomment-962448719
Weird. When I use auto_scheduler to search for a conv2d schedule on a V100, it
reports "No valid state found in this search round. Check if it has traversed
all of the search space." along with many messages like the following:
```
No: 3 GFLOPS: 0.00 / 0.00 results:
MeasureResult(error_type:RuntimeDeviceError, error_msg:Traceback (most recent
call last):
File "/home/test/tvm/python/tvm/auto_scheduler/measure.py", line 1124, in
_rpc_run
random_fill(empty_array)
File "/home/test/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 237, in
__call__
rai
...
----------
An error occurred during the execution of TVM.
```
My workload is as follows.
```
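import tvm
from tvm import auto_scheduler, topi

@auto_scheduler.register_workload
def my_workload():
    # NCHW input and OIHW weights (float32 is the placeholder default)
    A = tvm.te.placeholder((128, 64, 224, 224), name='input0')
    W = tvm.te.placeholder((64, 64, 3, 3), name='input1')
    # strides, padding, and dilation are all (1, 1)
    C = topi.nn.conv2d(A, W, (1, 1), (1, 1), (1, 1), layout='NCHW',
                       out_dtype=A.dtype)  # (128, 64, 224, 224)
    return [A, W, C]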
```
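For reference, a rough size estimate for this workload, as plain arithmetic on the shapes above. It is only a sketch and assumes float32 tensors (the `te.placeholder` default) and the common convention of counting 2 FLOPs per multiply-accumulate; the variable names are illustrative and not part of TVM.

```python
# Rough size estimate for the conv2d workload above.
# Assumptions: float32 tensors and 2 FLOPs per multiply-accumulate.
N, C, H, W = 128, 64, 224, 224   # input0 shape (NCHW)
K, R, S = 64, 3, 3               # input1 shape is (K, C, R, S)
BYTES_PER_ELEM = 4               # float32

# Stride 1, padding 1, dilation 1 with a 3x3 kernel keeps H and W unchanged.
in_elems = N * C * H * W
weight_elems = K * C * R * S
out_elems = N * K * H * W

# Each output element needs C * R * S multiply-accumulates.
flops = 2 * out_elems * C * R * S

GIB = 1024 ** 3
print(f"input  : {in_elems * BYTES_PER_ELEM / GIB:.2f} GiB")
print(f"weights: {weight_elems * BYTES_PER_ELEM / 1024:.0f} KiB")
print(f"output : {out_elems * BYTES_PER_ELEM / GIB:.2f} GiB")
print(f"work   : {flops / 1e9:.1f} GFLOP per run")
```

Under these assumptions the input and output buffers together take roughly 3 GiB of device memory, before any workspace the measurement itself needs.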
After setting MAX_TRACEBACK_INFO_LEN=8192, I can get more details:
```
Get 64 programs to measure:
........................................................
*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E*E==================================================
No: 1 GFLOPS: 0.00 / 0.00 results:
MeasureResult(error_type:RuntimeDeviceError, error_msg:Traceback (most recent
call last):
File "/home/test/tvm/python/tvm/auto_scheduler/measure.py", line 1124, in
_rpc_run
random_fill(empty_array)
File "/home/test/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 237, in
__call__
raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
4: TVMFuncCall
3: _ZNSt17_Function_handlerIFvN3tvm7runtime7TVMArgsEPNS1_11
2: tvm::runtime::RPCWrappedFunc::operator()(tvm::runtime::TVMArgs,
tvm::runtime::TVMRetValue*) const
1: tvm::runtime::RPCClientSession::CallFunc(void*, TVMValue const*, int
const*, int, std::function<void (tvm::runtime::TVMArgs)> const&)
0: tvm::runtime::RPCEndpoint::CallFunc(void*, TVMValue const*, int const*,
int, std::function<void (tvm::runtime::TVMArgs)>)
File "/home/test/tvm/src/runtime/rpc/rpc_endpoint.cc", line 801
TVMError:
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
Check failed: (code == RPCCode::kReturn) is false: code=kShutdown
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/test/tvm/python/tvm/auto_scheduler/measure.py", line 1185, in
_rpc_run_worker
res = _rpc_run(*args)
File "/home/test/tvm/python/tvm/auto_scheduler/measure.py", line 1143, in
_rpc_run
dev.free_raw_stream(stream)
File "/home/test/tvm/python/tvm/_ffi/runtime_ctypes.py", line 456, in
free_raw_stream
check_call(_LIB.TVMStreamFree(self.device_type, self.device_id, stream))
File "/home/test/tvm/python/tvm/_ffi/base.py", line 348, in check_call
raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
42: 0xffffffffffffffff
41: _start
40: __libc_start_main
39: main
38: Py_Main
37: 0x000000000063886a
36: PyObject_Call
35: 0x000000000058945c
34: 0x0000000000507cd3
33: _PyEval_EvalFrameDefault
32: 0x000000000050a3fc
31: 0x00000000005099ff
30: 0x0000000000507cd3
29: _PyEval_EvalFrameDefault
28: 0x000000000050a22e
27: 0x0000000000516284
26: 0x0000000000507cd3
25: _PyEval_EvalFrameDefault
24: 0x000000000050a3fc
23: 0x00000000005099ff
22: 0x0000000000507cd3
21: _PyEval_EvalFrameDefault
20: PyObject_Call
19: 0x00000000005893d9
18: 0x0000000000507cd3
17: _PyEval_EvalFrameDefault
16: PyObject_Call
15: 0x000000000058931a
14: 0x0000000000507cd3
13: _PyEval_EvalFrameDefault
12: 0x000000000050a3fc
11: 0x00000000005096c7
10: _PyEval_EvalFrameDefault
9: 0x000000000050a532
8: _PyObject_FastCallKeywords
7: 0x00007f2b9b1f5c12
6: _ctypes_callproc
5: ffi_call
4: ffi_call_unix64
3: TVMStreamFree
2: tvm::runtime::RPCDeviceAPI::FreeStream(DLDevice, void*)
1: non-virtual thunk to
tvm::runtime::RPCClientSession::FreeStream(DLDevice, void*)
0: std::_Function_handler<void (tvm::runtime::TVMArgs,
tvm::runtime::TVMRetValue*),
tvm::runtime::RPCEndpoint::Init()::{lambda(tvm::runtime::TVMArgs,
tvm::runtime::TVMRetValue*)#2}>::_M_invoke(std::_Any_data const&,
tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)
File "/home/test/tvm/src/runtime/rpc/rpc_endpoint.cc", line 681
TVMError:
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
Check failed: (code == RPCCode::kReturn) is false: code=1
, all_cost:11.04, Tstamp:1636202883.20)
==================================================
Placeholder: input0, input1
blockIdx.x nn.0@ff.0@yy.0@xx.0@ (0,16384)
vthread nn.1@ff.1@yy.1@xx.1@ (0,4)
threadIdx.x nn.2@ff.2@yy.2@xx.2@ (0,64)
for rc.0 (0,64)
for rx.0 (0,3)
threadIdx.x ax0@ax1@ax2@[email protected] (0,64)
input1.shared = ...
for ax0@ax1@ax2@[email protected] (0,126)
threadIdx.x ax0@ax1@ax2@[email protected] (0,64)
pad_temp.shared = ...
for ry.1 (0,3)
for nn_c.3 (0,2)
for yy_c.4 (0,7)
for xx_c.4 (0,7)
compute.local = ...
for nn.3 (0,2)
for yy.3 (0,7)
for xx.3 (0,7)
compute = ...
==================================================
```
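As a sanity check on the printed sketch, the loop extents around `compute = ...` can be multiplied out and compared with the number of output elements. This is only arithmetic on the extents shown in the log; it assumes the three bound axes and the three innermost loops together tile the whole output, and it says nothing about why the RPC session was shut down.

```python
from math import prod

# Loop extents copied from the printed sketch above.
bound_extents = {"blockIdx.x": 16384, "vthread": 4, "threadIdx.x": 64}
inner_extents = {"nn.3": 2, "yy.3": 7, "xx.3": 7}  # loops around "compute = ..."

# Total number of output elements written by the schedule.
covered = prod(bound_extents.values()) * prod(inner_extents.values())

# Output shape of the conv2d is (128, 64, 224, 224).
output_elems = 128 * 64 * 224 * 224

print(f"covered by schedule: {covered}")
print(f"output elements    : {output_elems}")
print(f"match              : {covered == output_elems}")
```

The two counts agree, so the tiling covers the output exactly once. On that basis the sketch looks like an ordinary tiling rather than an obviously malformed one.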
Hi @comaniac, could this be caused by the large problem size of this conv2d?
I think it is a common setting for CNN models.