[TVM Discuss] [Questions] [AutoTVM] Selective tuning of hotspots

2020-04-04 Thread Robert Bücs via TVM Discuss


Alright! I performed some profiling. I've taken 
[this](https://github.com/apache/incubator-tvm/blob/master/tutorials/autotvm/tune_relay_x86.py)
 tutorial as a basis, using only VGG-16. 

This is the **output** of the **debug runtime**:
```
Node Name                                     Ops                                          Time(us)   Time(%)  Shape                 Inputs  Outputs
---------                                     ---                                          --------   -------  -----                 ------  -------
fused_nn_contrib_conv2d_NCHWc_add_nn_relu_5   fused_nn_contrib_conv2d_NCHWc_add_nn_relu_5  107147.0   12.758   (1, 2, 112, 112, 64)  3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu_7   fused_nn_contrib_conv2d_NCHWc_add_nn_relu_7  99490.1    11.847   (1, 2, 224, 224, 32)  3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu_1   fused_nn_contrib_conv2d_NCHWc_add_nn_relu_1  96205.3    11.456   (1, 32, 28, 28, 16)   3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu_11  fused_nn_contrib_conv2d_NCHWc_add_nn_relu_1  96064.8    11.439   (1, 32, 28, 28, 16)   3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu_31  fused_nn_contrib_conv2d_NCHWc_add_nn_relu_3  94304.2    11.229   (1, 8, 56, 56, 32)    3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu_3   fused_nn_contrib_conv2d_NCHWc_add_nn_relu_3  94182.4    11.215   (1, 8, 56, 56, 32)    3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu_6   fused_nn_contrib_conv2d_NCHWc_add_nn_relu_6  52424.5    6.242    (1, 2, 112, 112, 64)  3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu_4   fused_nn_contrib_conv2d_NCHWc_add_nn_relu_4  48494.9    5.775    (1, 8, 56, 56, 32)    3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu_2   fused_nn_contrib_conv2d_NCHWc_add_nn_relu_2  45365.3    5.402    (1, 16, 28, 28, 32)   3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu1    fused_nn_contrib_conv2d_NCHWc_add_nn_relu    23580.5    2.808    (1, 32, 14, 14, 16)   3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu     fused_nn_contrib_conv2d_NCHWc_add_nn_relu    23557.3    2.805    (1, 32, 14, 14, 16)   3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu2    fused_nn_contrib_conv2d_NCHWc_add_nn_relu    23549.4    2.804    (1, 32, 14, 14, 16)   3       1
fused_nn_dense_add_nn_relu_1                  fused_nn_dense_add_nn_relu_1                 19578.8    2.331    (1, 4096)             3       1
fused_nn_contrib_conv2d_NCHWc_add_nn_relu_8   fused_nn_contrib_conv2d_NCHWc_add_nn_relu_8  5776.86    0.688    (1, 2, 224, 224, 32)  3       1
fused_nn_dense_add_nn_relu                    fused_nn_dense_add_nn_relu                   3206.76    0.382    (1, 4096)             3       1
fused_layout_transform_19                     fused_layout_transform_19                    2011.35    0.24     (1, 4, 224, 224, 16)  1       1
fused_nn_max_pool2d_4                         fused_nn_max_pool2d_4                        964.965    0.115    (1, 2, 112, 112, 32)  1       1
fused_nn_dense_add                            fused_nn_dense_add                           784.874    0.093    (1, 1000)             3       1
fused_layout_transform_171                    fused_layout_transform_17                    561.36     0.067    (1, 1, 56, 56, 256)   1       1
fused_layout_transform_17                     fused_layout_transform_17                    559.474    0.067    (1, 1, 56, 56, 256)   1       1
fused_nn_max_pool2d_3                         fused_nn_max_pool2d_3                        495.34     0.059    (1, 2, 56, 56, 64)    1       1
fused_layout_transform_15                     fused_layout_transform_15                    289.293    0.034    (1, 1, 28, 28, 512)   1       1
fused_layout_transform_14                     fused_layout_transform_14                    232.466    0.028    (1, 1, 28, 28, 512)   1       1
fused_nn_max_pool2d_2                         fused_nn_max_pool2d_2                        221.099    0.026    (1, 8, 28, 28, 32)    1       1
fused_layout_transform_nn_batch_flatten       fused_layout_transform_nn_batch_flatten      179.002    0.021    (1, 25088)            1       1
fused_layout_transform_16                     fused_layout_transform_16                    135.138    0.016    (1, 1, 28, 28, 256)   1       1
fused_nn_max_pool2d_1                         fused_nn_max_pool2d_1                        109.964    0.013    (1, 32, 14, 14, 16)   1       1
fused_layout_transform_18                     fused_layout_transform_18                    101.463    0.012    (1, 4, 56, 56, 32)    1       1
fused_layout_transform_20                     fused_layout_transform_20                    66.265     0.008    (1, 1, 224, 224, 3)   1       1
```
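
From this profile, the hotspot set for selective tuning is basically the handful of conv2d nodes above a few percent of total runtime. As a minimal illustration (not an existing TVM API), one could transcribe the per-node rows into Python and filter them by a time threshold; the `profile` list below is just a hand-copied excerpt of the table above:

```
profile = [
    # (node name, time in us, time in %, output shape)
    ("fused_nn_contrib_conv2d_NCHWc_add_nn_relu_5", 107147.0, 12.758, (1, 2, 112, 112, 64)),
    ("fused_nn_contrib_conv2d_NCHWc_add_nn_relu_7", 99490.1, 11.847, (1, 2, 224, 224, 32)),
    ("fused_nn_contrib_conv2d_NCHWc_add_nn_relu_1", 96205.3, 11.456, (1, 32, 28, 28, 16)),
    # ... remaining rows of the table ...
    ("fused_layout_transform_20", 66.265, 0.008, (1, 1, 224, 224, 3)),
]

# Keep only nodes that account for at least 5% of total runtime.
THRESHOLD_PCT = 5.0
hotspots = [(name, shape) for name, _, pct, shape in profile if pct >= THRESHOLD_PCT]

for name, shape in hotspots:
    print("hotspot:", name, "output shape:", shape)
```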

[TVM Discuss] [Questions] [AutoTVM] Selective tuning of hotspots

2020-03-25 Thread Robert Bücs via TVM Discuss


Damn, you are right! Hmm... matching the debug runtime output to the LLVM IR is fairly easy, but I don't know whether the shapes are somehow encoded in the LLVM IR. My guess: no...







[TVM Discuss] [Questions] [AutoTVM] Selective tuning of hotspots

2020-03-25 Thread Cody H. Yu via TVM Discuss


Well... if two or more convs have the same shape (both input and weight), then they will map to the same tuning task. The tricky part is that it's not straightforward to see the weight shape from the debug runtime log.
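
One way to make that association easier is to dump the workload of every extracted task, since the workload records both the data and the weight shapes that the debug runtime log doesn't show. A rough sketch, assuming `mod` and `params` come from the usual Relay frontend import as in the tuning tutorial (variable names are illustrative, not from this thread):

```
from tvm import autotvm

# Assumed: `mod` (Relay module) and `params` come from a Relay frontend import.
target = "llvm"
tasks = autotvm.task.extract_from_program(mod["main"], target=target,
                                          params=params, ops=None)

# Each workload carries the op name plus the data and weight tensor shapes,
# so it can be lined up against the shapes in the debug profile.
for i, task in enumerate(tasks):
    print(i, task.name, task.workload)
```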







[TVM Discuss] [Questions] [AutoTVM] Selective tuning of hotspots

2020-03-25 Thread Robert Bücs via TVM Discuss


I see! That's a pity if convolutions coexist with the same shapes... Maybe they can still be associated somehow. Anyhow, thank you very much for your answer :)







[TVM Discuss] [Questions] [AutoTVM] Selective tuning of hotspots

2020-03-25 Thread Cody H. Yu via TVM Discuss


It's a bit tricky. For now you can only match op type and shape.







[TVM Discuss] [Questions] [AutoTVM] Selective tuning of hotspots

2020-03-24 Thread Robert Bücs via TVM Discuss


Dear community,

I'm currently trying to **reduce overall Auto-TVM runtimes** by selectively 
tuning only the kernels that are actual hotspots in the application. 

**Hotspot detection** can be performed fairly easily, e.g. by using the **debug runtime**, which gives a detailed per-node profile when executing `run()`.
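
For reference, a minimal sketch of the debug runtime usage from the linked debugger docs, assuming `graph`, `lib`, and `params` came out of a prior `relay.build(...)` call and that the model's input tensor is named `data` (both are assumptions for illustration):

```
import numpy as np
import tvm
from tvm.contrib.debugger import debug_runtime as graph_runtime

# Assumed to exist from a prior relay.build(...): graph, lib, params.
ctx = tvm.cpu(0)
m = graph_runtime.create(graph, lib, ctx, dump_root="/tmp/tvmdbg")

data = np.random.uniform(size=(1, 3, 224, 224)).astype("float32")
m.set_input("data", tvm.nd.array(data))
m.set_input(**params)

# run() prints the per-node time table and also dumps the profile
# under dump_root for later inspection.
m.run()
```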

My question is **how to match** these identified operations to the kernels that AutoTVM selects for tuning.

On the one hand, the profile information looks like [this 
example](https://docs.tvm.ai/dev/debugger.html) shows: a prioritized list of 
nodes, mostly identified by their LLVM IR names.

On the other hand, selecting the tasks to be tuned via

```
kernels = autotvm.task.extract_from_program(ir["main"], target=target, params=params, ops=None)
```

gives you a list of 
[Task](https://docs.tvm.ai/api/python/autotvm.html#tvm.autotvm.task.task.Task) 
objects, e.g.:

```
Task(func_name=dense_nopack.x86,
     args=(('TENSOR', (1, 16), 'float32'), ('TENSOR', (64, 16), 'float32'), None, 'float32'),
     kwargs={},
     workload=('dense_nopack.x86', ('TENSOR', (1, 16), 'float32'), ('TENSOR', (64, 16), 'float32'), None, 'float32'))
```

So, how can such Task objects be matched to their IR counterparts?

Any help, ideas, or suggestions are much appreciated!
Thank you & best regards




