Dear community,
I'm currently trying to **reduce overall Auto-TVM runtimes** by selectively
tuning only the kernels that are actual hotspots in the application.
**Hotspot detection** can be done fairly easily, e.g. by using the **debug
runtime**, which produces a detailed call-graph profile.
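As a sketch of that selection step, here is a minimal, self-contained example of ranking kernels from a profile and keeping only the hotspots. The tab-separated `(name, milliseconds)` table and the kernel names are simplified assumptions, not the exact debug runtime output format:

```python
# Hypothetical profile data in a simplified "name\tmilliseconds" layout;
# the real debug runtime table has more columns (Ops, Time(%), etc.).
profile = """\
fused_nn_conv2d_1\t12.40
fused_nn_conv2d_2\t0.35
fused_nn_dense\t3.10
"""

rows = [line.split("\t") for line in profile.strip().splitlines()]
kernels = sorted(((name, float(ms)) for name, ms in rows),
                 key=lambda kv: kv[1], reverse=True)

# Keep the kernels that together account for 90% of the measured time;
# only these would then be passed on to AutoTVM for tuning.
total = sum(ms for _, ms in kernels)
covered, hotspots = 0.0, []
for name, ms in kernels:
    hotspots.append(name)
    covered += ms
    if covered / total >= 0.9:
        break

print(hotspots)
```

With the sample numbers above, the two dominant kernels cover more than 90% of the runtime, so the cheap second convolution is skipped.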
I see! That's a pity if convolutions coexist with the same shapes... Maybe they
can still be associated somehow. Anyhow, thank you very much for your answer :)
---
[Visit Topic](https://discuss.tvm.ai/t/autotvm-selective-tuning-of-hotspots/6083/3) to respond.
Damn, you are right! Hm... matching the debug runtime output to the LLVM IR is
fairly easy. I don't know whether the shapes are somehow encoded in the LLVM
IR. My guess: no...
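One thing that does survive into the profile is the kernel name itself. A hedged sketch (the node names below are illustrative, not taken from a real run) of normalizing debug-runtime-style names, so that repeated kernels that share an implementation but differ only in their trailing disambiguating index can at least be grouped by operator:

```python
import re

# Hypothetical node names in the style the debug runtime prints; a trailing
# "_<number>" distinguishes multiple instances of the same fused operator.
node_names = [
    "fused_nn_contrib_conv2d_NCHWc",
    "fused_nn_contrib_conv2d_NCHWc_1",
    "fused_nn_dense_nn_relu",
]

def base_op(name: str) -> str:
    """Strip a trailing "_<number>" instance index, if present."""
    return re.sub(r"_\d+$", "", name)

grouped = [base_op(n) for n in node_names]
print(grouped)
```

Note that this only recovers the operator, not the input shapes, which is consistent with the concern above that shapes are probably not encoded in the name or the IR.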
---
[Visit Topic](https://discuss.tvm.ai/t/autotvm-selective-tuning-of-hotspots/6083/5) to respond.
Hi,
I'm not fully sure whether I understood your question correctly. If you are
looking for the default schedule for a selected target, TVM fetches it from
[this](https://github.com/uwsampl/tvm-distro/tree/master/tophub) repository at
execution time and places a copy under /home/yourus
I see! That's a very good question. To be honest, I'm not sure. I wanted to
look into it myself to measure a performance baseline for each kernel before
autotuning. I'll look into it over the next few days when I have spare time.
Please let me know if you have found something in the meantime.
Alright! I performed some profiling. I've taken
[this](https://github.com/apache/incubator-tvm/blob/master/tutorials/autotvm/tune_relay_x86.py)
tutorial as a basis, using only VGG-16.
This is the **output** of the **debug runtime** :
```
Node Name Ops
```
Hi!
Does this appear because of this previous issue:
https://discuss.tvm.ai/t/bug-arm-significant-performance-degradation-of-execution-times-between-tvm-revisions/6029/
I was experiencing a similar slowdown on ARM CPUs, which was traced back to the
limited Winograd algorithm...
Cheers
Robert