thaisacs commented on PR #17627:
URL: https://github.com/apache/tvm/pull/17627#issuecomment-2643448525
> > @cbalint13
> > Hello, I'm conducting experiments with the following models:
> >
> > * resnet_18
> > * resnet_50
> > {...}
> > * googlenet
>
> > Imported directly from PyTorch to Relay. This issue occurred more frequently with the GoogLeNet, MobileNetV2, MobileNetV3, ResNet-152, and InceptionV3 models.
>
> Thank you, I see now. But from your description the extent of the issue is still not clear:
>
> Does Metaschedule (a) bail out/abort on the whole program, or (b) only skip some specific layers?
>
> * In case of (a), yes: if the whole program is aborted, it is clear we should fix this issue.
> * In case of (b), some iterated variations fail (this is normal; possibly all variations fail, so the layer is skipped). Relaxing this can yield "working schedules", but with no guarantee of correctness. If Ansor is unable to find a single valid schedule within a layer, we must check the cause of the inflexibility; just relaxing some rules to get some results is not the best idea.
>
> Let's see others' opinions on this again.
>
> Could you attach outputs (as text.gz) here with the complete schedule process from your side?
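For reference, one way to capture and compress the complete build output as the requested text.gz (a minimal sketch assuming a Unix shell; the `echo` stands in for the real `python your_script.py` invocation, which is not shown in this thread):

```shell
# Run the compile step, capturing stdout and stderr, then compress the
# log so it can be attached as a .gz file. Replace the echo with the
# actual "python your_script.py" command.
echo "example schedule output" 2>&1 | tee schedule_output.txt
gzip -f schedule_output.txt   # leaves schedule_output.txt.gz
```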
When I execute the following function with the GoogLeNet network, for example, with `log_file` being the log file produced during tuning:
```python
import numpy as np

import tvm
from tvm import auto_scheduler, relay
from tvm.contrib import graph_executor


def model_run(network_arg, dtype, target, log_file):
    # get_network_with_key is our own helper that loads the model.
    mod, params, inputs = get_network_with_key(network_arg, dtype)
    print("Compile...")
    input_shape = inputs[0][1]

    # Build with the tuned schedules from the log file.
    with auto_scheduler.ApplyHistoryBest(log_file):
        with tvm.transform.PassContext(
            opt_level=3, config={"relay.backend.use_auto_scheduler": True}
        ):
            lib = relay.build(mod, target=target, params=params)

    # Reference build at opt_level=0 for the correctness check.
    with auto_scheduler.ApplyHistoryBest(log_file):
        with tvm.transform.PassContext(
            opt_level=0, config={"relay.backend.use_auto_scheduler": True}
        ):
            ref_lib = relay.build(mod, target=target, params=params)

    # Check the correctness
    def get_output(input_name, data, lib):
        dev = tvm.device(str(target), 0)
        module = graph_executor.GraphModule(lib["default"](dev))
        module.set_input(input_name, data)
        module.run()
        return module.get_output(0).numpy()

    def run_bench(input_name, data, lib):
        dev = tvm.device(str(target), 0)
        # Create graph executor
        module = graph_executor.GraphModule(lib["default"](dev))
        module.set_input(input_name, data)
        # Evaluate
        print("Evaluate inference time cost...")
        print(
            module.benchmark(
                dev, repeat=10, number=10, min_repeat_ms=500, end_to_end=True
            )
        )

    np.random.seed(0)
    data_tvm = tvm.nd.array(np.random.uniform(size=input_shape).astype(dtype))
    run_bench(inputs[0][0], data_tvm, lib)

    np.random.seed(0)
    data_tvm = tvm.nd.array(np.random.uniform(size=input_shape).astype(dtype))
    actual_output = get_output(inputs[0][0], data_tvm, lib)
    expected_output = get_output(inputs[0][0], data_tvm, ref_lib)
    tvm.testing.assert_allclose(
        actual_output, expected_output, rtol=1e-4, atol=1e-4
    )
```
The program is aborted during compilation, specifically at these lines:
```python
with auto_scheduler.ApplyHistoryBest(log_file):
    with tvm.transform.PassContext(
        opt_level=3, config={"relay.backend.use_auto_scheduler": True}
    ):
        lib = relay.build(mod, target=target, params=params)
```
We are then unable to generate code and run model inference. Note that there were no issues during tuning when generating the `log_file`: the .json file with the schedules for the tuned layers is produced without errors.