[TVM Discuss] [Questions] [AutoTVM] Selective tuning of hotspots
Alright! I performed some profiling. I've taken [this](https://github.com/apache/incubator-tvm/blob/master/tutorials/autotvm/tune_relay_x86.py) tutorial as a basis, using only VGG-16. This is the **output** of the **debug runtime**:

```
Node Name                                     Ops                                           Time(us)  Time(%)  Shape                 Inputs  Outputs
fused_nn_contrib_conv2d_NCHWc_add_nn_relu_5   fused_nn_contrib_conv2d_NCHWc_add_nn_relu_5   107147.0  12.758   (1, 2, 112, 112, 64)  3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu_7   fused_nn_contrib_conv2d_NCHWc_add_nn_relu_7   99490.1   11.847   (1, 2, 224, 224, 32)  3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu_1   fused_nn_contrib_conv2d_NCHWc_add_nn_relu_1   96205.3   11.456   (1, 32, 28, 28, 16)   3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu_11  fused_nn_contrib_conv2d_NCHWc_add_nn_relu_1   96064.8   11.439   (1, 32, 28, 28, 16)   3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu_31  fused_nn_contrib_conv2d_NCHWc_add_nn_relu_3   94304.2   11.229   (1, 8, 56, 56, 32)    3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu_3   fused_nn_contrib_conv2d_NCHWc_add_nn_relu_3   94182.4   11.215   (1, 8, 56, 56, 32)    3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu_6   fused_nn_contrib_conv2d_NCHWc_add_nn_relu_6   52424.5   6.242    (1, 2, 112, 112, 64)  3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu_4   fused_nn_contrib_conv2d_NCHWc_add_nn_relu_4   48494.9   5.775    (1, 8, 56, 56, 32)    3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu_2   fused_nn_contrib_conv2d_NCHWc_add_nn_relu_2   45365.3   5.402    (1, 16, 28, 28, 32)   3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu1    fused_nn_contrib_conv2d_NCHWc_add_nn_relu     23580.5   2.808    (1, 32, 14, 14, 16)   3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu     fused_nn_contrib_conv2d_NCHWc_add_nn_relu     23557.3   2.805    (1, 32, 14, 14, 16)   3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu2    fused_nn_contrib_conv2d_NCHWc_add_nn_relu     23549.4   2.804    (1, 32, 14, 14, 16)   3       1
fused_nn_dense_add_nn_relu_1                  fused_nn_dense_add_nn_relu_1                  19578.8   2.331    (1, 4096)             3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu_8   fused_nn_contrib_conv2d_NCHWc_add_nn_relu_8   5776.86   0.688    (1, 2, 224, 224, 32)  3       1
fused_nn_dense_add_nn_relu                    fused_nn_dense_add_nn_relu                    3206.76   0.382    (1, 4096)             3       1
fused_layout_transform_19                     fused_layout_transform_19                     2011.35   0.24     (1, 4, 224, 224, 16)  1       1
fused_nn_max_pool2d_4                         fused_nn_max_pool2d_4                         964.965   0.115    (1, 2, 112, 112, 32)  1       1
fused_nn_dense_add                            fused_nn_dense_add                            784.874   0.093    (1, 1000)             3       1
fused_layout_transform_171                    fused_layout_transform_17                     561.36    0.067    (1, 1, 56, 56, 256)   1       1
fused_layout_transform_17                     fused_layout_transform_17                     559.474   0.067    (1, 1, 56, 56, 256)   1       1
fused_nn_max_pool2d_3                         fused_nn_max_pool2d_3                         495.34    0.059    (1, 2, 56, 56, 64)    1       1
fused_layout_transform_15                     fused_layout_transform_15                     289.293   0.034    (1, 1, 28, 28, 512)   1       1
fused_layout_transform_14                     fused_layout_transform_14                     232.466   0.028    (1, 1, 28, 28, 512)   1       1
fused_nn_max_pool2d_2                         fused_nn_max_pool2d_2                         221.099   0.026    (1, 8, 28, 28, 32)    1       1
fused_layout_transform_nn_batch_flatten       fused_layout_transform_nn_batch_flatten       179.002   0.021    (1, 25088)            1       1
fused_layout_transform_16                     fused_layout_transform_16                     135.138   0.016    (1, 1, 28, 28, 256)   1       1
fused_nn_max_pool2d_1                         fused_nn_max_pool2d_1                         109.964   0.013    (1, 32, 14, 14, 16)   1       1
fused_layout_transform_18                     fused_layout_transform_18                     101.463   0.012    (1, 4, 56, 56, 32)    1       1
fused_layout_transform_20                     fused_layout_transform_20                     66.265    0.008    (1, 1, 224, 224, 3)   1       1
```
Damn, you are right! Hmm... matching the debug runtime output to the LLVM IR is fairly easy, but I don't know whether the shapes are somehow encoded in the LLVM IR. My guess: no... --- [Visit Topic](https://discuss.tvm.ai/t/autotvm-selective-tuning-of-hotspots/6083/5) to respond.
Well... if two or more convs have the same shapes (both input and weight), they will map to the same tuning task. The tricky part is that the weight shape is not straightforward to see from the debug runtime log. --- [Visit Topic](https://discuss.tvm.ai/t/autotvm-selective-tuning-of-hotspots/6083/4) to respond.
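The weight shape is, however, explicit on the task side: each workload tuple carries every `TENSOR` argument. A small sketch, operating purely on a workload tuple shaped like the `Task` repr shown elsewhere in this thread (no TVM import needed):

```python
# The debug-runtime log shows only the (layout-transformed) output shape,
# but an AutoTVM workload tuple lists both input shapes explicitly.
workload = ("dense_nopack.x86",
            ("TENSOR", (1, 16), "float32"),
            ("TENSOR", (64, 16), "float32"),
            None, "float32")

def tensor_shapes(workload):
    """Return the shapes of all TENSOR arguments (data first, then weight)."""
    return [arg[1] for arg in workload[1:]
            if isinstance(arg, tuple) and arg and arg[0] == "TENSOR"]

data_shape, weight_shape = tensor_shapes(workload)
print(data_shape, weight_shape)  # -> (1, 16) (64, 16)
```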
I see! It's a pity if convolutions with the same shapes coexist... Maybe they can still be associated somehow. Anyhow, thank you very much for your answer :) --- [Visit Topic](https://discuss.tvm.ai/t/autotvm-selective-tuning-of-hotspots/6083/3) to respond.
It's a bit tricky. For now you can only match op type and shape. --- [Visit Topic](https://discuss.tvm.ai/t/autotvm-selective-tuning-of-hotspots/6083/2) to respond.
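For illustration only, here is what matching by op type and shape might look like. All names here are hypothetical (this is not a TVM API), and the workload tuples are modeled on the `Task` repr quoted elsewhere in this thread; the sketch matches a node's op keyword against the task name and a known input shape against the first `TENSOR` argument:

```python
# Hypothetical matching by (op type, data input shape). The workload tuples
# mimic the structure of AutoTVM Task reprs; they are made-up example data.
tasks = [
    ("dense_nopack.x86",
     (("TENSOR", (1, 16), "float32"), ("TENSOR", (64, 16), "float32"))),
    ("conv2d_NCHWc.x86",
     (("TENSOR", (1, 3, 224, 224), "float32"),
      ("TENSOR", (64, 3, 3, 3), "float32"))),
]

def find_tasks(op_keyword, data_shape):
    """Return names of tasks whose op name contains op_keyword and whose
    first TENSOR argument (the data input) has the given shape."""
    return [name for name, args in tasks
            if op_keyword in name and args[0][1] == data_shape]

print(find_tasks("conv2d", (1, 3, 224, 224)))  # -> ['conv2d_NCHWc.x86']
```

As noted above, this breaks down as soon as two convolutions share the same data shape, since the match is then ambiguous.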
Dear community, I'm currently trying to **reduce overall AutoTVM runtimes** by selectively tuning only the kernels that are actual hotspots in the application. **Hotspot detection** can be performed fairly easily, e.g. by using the **debug runtime**, which gives a detailed per-node profile when executing `run()`. My question is **how to match** these identified operations to the AutoTVM-selected kernels. On the one hand, the profile information looks like [this example](https://docs.tvm.ai/dev/debugger.html) shows: a prioritized list of nodes mostly identified by their LLVM IR name. On the other hand, when selecting the tasks to be tuned,

```
kernels = autotvm.task.extract_from_program(ir["main"], target=target, params=params, ops=None)
```

gives you a list of [Task](https://docs.tvm.ai/api/python/autotvm.html#tvm.autotvm.task.task.Task) objects, e.g.:

```
Task(func_name=dense_nopack.x86, args=(('TENSOR', (1, 16), 'float32'), ('TENSOR', (64, 16), 'float32'), None, 'float32'), kwargs={}, workload=('dense_nopack.x86', ('TENSOR', (1, 16), 'float32'), ('TENSOR', (64, 16), 'float32'), None, 'float32'))
```

How can such Tasks be matched to their IR counterparts? Any help, ideas, or suggestions are much appreciated! Thank you & best regards --- [Visit Topic](https://discuss.tvm.ai/t/autotvm-selective-tuning-of-hotspots/6083/1) to respond.
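Once some mapping from hotspots to workloads exists, acting on it is the easy part: filter the extracted task list before tuning. A minimal sketch under stated assumptions: `Task` below is a stand-in namedtuple (not the real `tvm.autotvm.task.Task`), and `hotspot_workloads` is a hypothetical set that would in practice be derived from the debug-runtime profile.

```python
# Selective-tuning sketch: keep only tasks whose workload matches a
# previously identified hotspot. Task is a stand-in, and all workload
# tuples are made-up example data mimicking AutoTVM's workload format.
from collections import namedtuple

Task = namedtuple("Task", ["func_name", "workload"])

# Hypothetical: workloads identified as hotspots from the profile.
hotspot_workloads = {
    ("dense_nopack.x86", ("TENSOR", (1, 16), "float32")),
}

# Stand-in for the full list returned by extract_from_program.
all_tasks = [
    Task("dense_nopack.x86",
         ("dense_nopack.x86", ("TENSOR", (1, 16), "float32"))),
    Task("dense_nopack.x86",
         ("dense_nopack.x86", ("TENSOR", (1, 4096), "float32"))),
]

# Tune only the matched subset instead of every extracted task.
selected = [t for t in all_tasks if t.workload in hotspot_workloads]
print([t.workload[1][1] for t in selected])  # -> [(1, 16)]
```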