shingjan commented on PR #12141:
URL: https://github.com/apache/tvm/pull/12141#issuecomment-1198300892
resnet18 cuda:
```
ID |                                                                        Name |      FLOP | Weight | Speed (GFLOPS) | Latency (us) | Weighted Latency (us) | Trials | Terminated
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 0 |                                                         fused_nn_conv2d_add |  12870144 |      1 |       965.5146 |      13.3298 |               13.3298 |     32 |          Y
 1 |                                                       fused_nn_conv2d_add_1 |  12895232 |      1 |      1330.3102 |       9.6934 |                9.6934 |     32 |          Y
 2 |                                                       fused_nn_conv2d_add_2 |  12945408 |      1 |      2103.2869 |       6.1548 |                6.1548 |     32 |          Y
 3 |                                                      fused_layout_transform |         1 |      1 |         0.0002 |       5.0254 |                5.0254 |      6 |          Y
 4 |                                                 fused_nn_conv2d_add_nn_relu | 237633536 |      1 |      6085.8811 |      39.0467 |               39.0467 |     32 |          Y
 5 |                                                         fused_nn_max_pool2d |   1806336 |      1 |       328.9316 |       5.4915 |                5.4915 |     30 |          Y
 6 |       fused_nn_contrib_conv2d_winograd_without_weight_transform_add_nn_relu | 128651264 |      2 |      2512.3525 |      51.2075 |              102.4150 |     32 |          Y
 7 |   fused_nn_contrib_conv2d_winograd_without_weight_transform_add_add_nn_relu | 128851968 |      2 |      1360.9069 |      94.6810 |              189.3619 |     32 |          Y
 8 |                                               fused_nn_conv2d_add_nn_relu_1 | 115806208 |      1 |      2482.7300 |      46.6447 |               46.6447 |     32 |          Y
 9 |     fused_nn_contrib_conv2d_winograd_without_weight_transform_add_nn_relu_1 | 127045632 |      1 |      3352.8175 |      37.8922 |               37.8922 |     32 |          Y
10 | fused_nn_contrib_conv2d_winograd_without_weight_transform_add_add_nn_relu_1 | 127145984 |      2 |      1854.8033 |      68.5496 |              137.0992 |     32 |          Y
11 |                                               fused_nn_conv2d_add_nn_relu_2 | 115705856 |      1 |      3359.4190 |      34.4422 |               34.4422 |     32 |          Y
12 |     fused_nn_contrib_conv2d_winograd_without_weight_transform_add_nn_relu_2 | 114903040 |      1 |      2106.9193 |      54.5360 |               54.5360 |     32 |          Y
13 | fused_nn_contrib_conv2d_winograd_without_weight_transform_add_add_nn_relu_2 | 114953216 |      2 |      1723.1163 |      66.7124 |              133.4248 |     32 |          Y
14 |                                               fused_nn_conv2d_add_nn_relu_3 | 115655680 |      1 |      1007.9003 |     114.7491 |              114.7491 |     32 |          Y
15 |     fused_nn_contrib_conv2d_winograd_without_weight_transform_add_nn_relu_3 | 142132224 |      1 |      1615.3274 |      87.9897 |               87.9897 |     32 |          Y
16 | fused_nn_contrib_conv2d_winograd_without_weight_transform_add_add_nn_relu_3 | 142157312 |      2 |      1053.2288 |     134.9729 |              269.9457 |     32 |          Y
17 |                                                fused_nn_adaptive_avg_pool2d |     25600 |      1 |         5.8995 |       4.3393 |                4.3393 |     32 |          Y
18 |                                      fused_layout_transform_reshape_squeeze |         1 |      1 |         0.0003 |       3.2615 |                3.2615 |      5 |
19 |                                                          fused_nn_dense_add |   1025000 |      1 |        68.4000 |      14.9854 |               14.9854 |     32 |          Y
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
```
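For anyone cross-checking the table, the derived columns appear to follow from simple arithmetic: Speed (GFLOPS) is FLOP divided by latency in microseconds, scaled by 1e-3, and Weighted Latency is latency times weight. This relationship is inferred from the reported numbers, not taken from the TVM source, and the helper names below are hypothetical. A minimal sketch:

```python
def speed_gflops(flop: int, latency_us: float) -> float:
    """Throughput in GFLOP/s. 1 FLOP/us equals 1e-3 GFLOP/s."""
    return flop / latency_us / 1e3


def weighted_latency_us(latency_us: float, weight: int) -> float:
    """Latency multiplied by how many times the task occurs in the model."""
    return latency_us * weight


# (name, FLOP, weight, latency in us), copied from rows 0 and 6 of the table.
rows = [
    ("fused_nn_conv2d_add", 12870144, 1, 13.3298),
    ("fused_nn_contrib_conv2d_winograd_without_weight_transform_add_nn_relu",
     128651264, 2, 51.2075),
]

for name, flop, weight, latency in rows:
    print(f"{name}: {speed_gflops(flop, latency):.4f} GFLOPS, "
          f"{weighted_latency_us(latency, weight):.4f} us weighted")
```

Small differences in the last digits against the table are expected, since the printed latencies are already rounded to four decimals.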
profiler table:
```
ID | Name | Time (min) | Percentage
----------------------------------------------------------------------------
| Total | 83.8914 | 100.0000
1 | EvoSearch/Evolve/Mutation | 48.0123 | 57.2314
2 | SendToBuilder | 13.7197 | 16.3541
3 | EvoSearch/SampleInitPopulation | 7.9949 | 9.5300
4 | EvoSearch/Evolve/PredictNormalizedScore | 4.0848 | 4.8691
5 | SendToRunner | 3.6898 | 4.3983
6 | EvoSearch/Evolve/Misc | 2.7077 | 3.2277
7 | MeasureCallback/UpdateCostModel | 1.8705 | 2.2297
8 | ApplyHistoryBest | 0.7764 | 0.9254
9 | TaskExtraction | 0.5058 | 0.6030
10 | InitializeTask | 0.0267 | 0.0318
11 | MeasureCallback/AddToDatabase | 0.0142 | 0.0170
12 | EvoSearch/PickBestFromDatabase | 0.0131 | 0.0157
13 | EvoSearch/PickWithEpsGreedy | 0.0100 | 0.0119
14 | MeasureCallback/EchoStatistics | 0.0037 | 0.0044
15 | MeasureCallback/RemoveBuildArtifact | 0.0019 | 0.0023
16 | JoinRunnerFutures | 0.0005 | 0.0006
17 | EvoSearch/Evolve/Misc/CopyMeasuredWorkloads | 0.0000 | 0.0000
----------------------------------------------------------------------------
```
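The Percentage column appears to be each phase's time divided by the Total row, times 100. This is inferred from the numbers above rather than from the profiler's source, and the names below are hypothetical. A small sketch that recomputes the share of the three largest phases:

```python
TOTAL_MIN = 83.8914  # "Total" row of the profiler table, in minutes

# (phase, time in minutes), copied from rows 1 to 3 of the profiler table.
phases = [
    ("EvoSearch/Evolve/Mutation", 48.0123),
    ("SendToBuilder", 13.7197),
    ("EvoSearch/SampleInitPopulation", 7.9949),
]


def percentage(time_min: float, total_min: float = TOTAL_MIN) -> float:
    """Share of the total tuning time, in percent."""
    return time_min / total_min * 100.0


for phase, time_min in phases:
    print(f"{phase}: {percentage(time_min):.4f}%")

# Combined share of the three largest phases.
top3_share = sum(percentage(t) for _, t in phases)
print(f"top 3 combined: {top3_share:.2f}%")
```

Last-digit differences from the table are expected because the printed times are rounded.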