TaylorZowtuk opened a new pull request #6909:
URL: https://github.com/apache/incubator-tvm/pull/6909
While running scripts using both AutoScheduler and AutoTvm to consecutively
search for schedules for a number of operators/shapes, I observed different
behaviors during measurement following the output “Too many errors happened
during tuning.”
After looking into the code, I determined that the difference arises because
AutoScheduler and AutoTvm handle one case differently: the number of errors
accumulated during measurement exceeding some threshold.
I observed that while using AutoTvm, the program would switch to debug-level
logging and continue the search.
```
Too many errors happen in the tuning. Now is in debug mode
No: 217 GFLOPS: 0.00/0.00 result:
MeasureResult(costs=(RuntimeError('Traceback (most recent call last):\n [bt]
(5) /home/tanvir/tvm/build/libtvm.so(TVMFuncCall+0x63) [0x7fd9b685ee13]\n [bt]
(4) /home/tanvir/tvm/build/libtvm.so(+0x1309037) [0x7fd9b68c8037]\n [bt] (3)
/home/tanvir/tvm/build/libtvm.so(tvm::runtime::RPCWrappedFunc::operator()(tvm::runtime::TVMArgs,
tvm::runtime::TVMRetValue*) const+0x3fa) [0x7fd9b68cc86a]\n [bt] (2)
/home/tanvir/tvm/build/libtvm.so(tvm::runtime::RPCClientSession::CallFunc(void*,
TVMValue const*, int const*, int, std::function<void (tvm::runtime::TVMArgs)>
const&)+0x57) [0x7fd9b68c0217]\n [bt] (1)
/home/tanvir/tvm/build/libtvm.so(tvm::runtime::RPCEndpoint::CallFunc(void*,
TVMValue const*, int const*, int, std::function<void
(tvm::runtime::TVMArgs)>)+0x6bd) [0x7fd9b68b546d]\n [bt] (0)
/home/tanvir/tvm/build/libtvm.so(+0x12f3668) [0x7fd9b68b2668]\n File
"/home/tanvir/tvm/src/runtime/rpc/rpc_endpoint.cc", line 807\nTVMError: Check
failed: code == RPCCode::kReturn: code=1'),), error_no=4,
all_cost=10.765872716903687,
timestamp=1604092331.4940712) [('tile_f', [-1, 16]), ('tile_y', [-1, 2]),
('tile_x', [-1, 2]), ('tile_z', [-1, 16])],None,1719
…
<continues>
```
While using AutoScheduler, the program would crash after throwing an
uncaught error.
```
Traceback (most recent call last):
…
File "runner.py", line 124, in fig_6
m = run_operator(
File "runner.py", line 58, in run_operator
sch, args = auto_scheduler.auto_schedule(task,
tuning_options=tune_option)
File "/home/taylor/tvm/python/tvm/auto_scheduler/auto_schedule.py", line
213, in auto_schedule
sch, tensors = _ffi_api.AutoSchedule(search_policy, tuning_options)
File "/home/taylor/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 237,
in __call__
raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
[bt] (5) /home/taylor/tvm/build/libtvm.so(TVMFuncCall+0x63)
[0x7f11e187d7b3]
[bt] (4) /home/taylor/tvm/build/libtvm.so(+0x6965ab) [0x7f11e0c755ab]
[bt] (3)
/home/taylor/tvm/build/libtvm.so(tvm::auto_scheduler::AutoSchedule(tvm::auto_scheduler::SearchPolicy,
tvm::auto_scheduler::TuningOptions)+0x11a) [0x7f11e0c74cca]
[bt] (2)
/home/taylor/tvm/build/libtvm.so(tvm::auto_scheduler::SketchPolicyNode::Search(int,
int, int, tvm::auto_scheduler::ProgramMeasurer)+0x760) [0x7f11e0cfb3d0]
[bt] (1)
/home/taylor/tvm/build/libtvm.so(tvm::auto_scheduler::ProgramMeasurerNode::Measure(tvm::auto_scheduler::SearchTask
const&, tvm::auto_scheduler::SearchPolicy const&,
tvm::runtime::Array<tvm::auto_scheduler::MeasureInput, void> const&,
tvm::runtime::Array<tvm::auto_scheduler::MeasureResult, void>*, int)+0x11ed)
[0x7f11e0cd7b2d]
[bt] (0) /home/taylor/tvm/build/libtvm.so(+0x6f4af8) [0x7f11e0cd3af8]
File "/home/taylor/tvm/src/auto_scheduler/measure.cc", line 268
TVMError: Too many errors happened during tuning
```
In my particular case, AutoScheduler crashing instead of continuing the search
meant that my script terminated prematurely, even though it might have
recovered from whatever was causing the errors.
In addition, it was unclear to me why this behavior occurred only in
AutoScheduler and not in AutoTvm. This discrepancy can confuse new users who
want to explore both methods of schedule search. This PR proposes bringing
AutoScheduler's handling of measurement errors in line with AutoTvm's.
This PR removes the LOG(FATAL) and changes AutoScheduler's verbosity when the
error threshold is exceeded, in the same way that AutoTvm changes its logging
level, so that both programs behave the same.
In addition, I changed the default verbosity of AutoScheduler to 0 (silent) in
order to match the default logging level of AutoTvm.
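To illustrate the two policies described above, here is a minimal Python sketch. It is not the TVM implementation: `ErrorTracker`, its `fatal` flag, the logger name, and the threshold value are all hypothetical, chosen only to contrast "abort on too many errors" with "switch to debug logging and continue". It assumes, as in the logs above, that a nonzero `error_no` marks a failed measurement.

```python
import logging

logger = logging.getLogger("measure_sketch")  # hypothetical logger name

# Hypothetical threshold; the actual values used by the tuners are not stated here.
MAX_ERRORS = 3


class ErrorTracker:
    """Accumulates measurement errors and reacts once a threshold is exceeded."""

    def __init__(self, threshold=MAX_ERRORS, fatal=False):
        self.threshold = threshold
        self.fatal = fatal          # True: abort; False: debug logging, keep going
        self.error_count = 0
        self.debug_mode = False

    def record(self, error_no):
        """Record one measurement result; error_no == 0 is treated as success."""
        if error_no == 0:
            return
        self.error_count += 1
        if self.error_count > self.threshold and not self.debug_mode:
            if self.fatal:
                # Behavior comparable to the crash reported for AutoScheduler.
                raise RuntimeError("Too many errors happened during tuning")
            # Behavior comparable to what was observed with AutoTvm.
            logger.warning("Too many errors happened during tuning. "
                           "Switching to debug mode.")
            logger.setLevel(logging.DEBUG)
            self.debug_mode = True


def run_measurements(error_codes, tracker):
    """Feed a sequence of error codes to the tracker; return how many were processed."""
    processed = 0
    for code in error_codes:
        tracker.record(code)
        processed += 1
    return processed
```

With `fatal=False`, `run_measurements` processes every result even after the threshold is passed; with `fatal=True`, the same input raises partway through, which is the premature termination this PR aims to avoid.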